Re: Question about Org syntax

2021-05-22 Thread Ihor Radchenko
Nicolas Goaziou  writes:

Thanks for your detailed explanations!

>> 2. Some of the element parsers honour LIMIT argument partially. Part of
>>the parsing is typically done using looking-at (ignoring the LIMIT)
>>and part is honouring it. This can backfire when LIMIT is before
>>first characteristic line of the element. For example take headline
>>parser:
>> ...
>
> LIMIT is not a random position in the buffer. It is supposed to be the
> end of the parent element, or (point-max).
>
> It is a bug (in the parser or in the cache) if it ends up being anywhere
> else.

Makes sense, though it is not mentioned in the docstrings, not even in
the docstring of `org-element--current-element'. Moreover, the section
comment about parsers states that most parsers accept no arguments:

> ;; A parser returns the element or object as the list described above.
> ;; Most of them accepts no argument. 

And the comment about adding things to `org-element--current-element'
does not help either.

> ;; Beside implementing a parser and an interpreter, adding a new
> ;; greater element requires tweaking `org-element--current-element'.
> ;; Moreover, the newly defined type must be added to both
> ;; `org-element-all-elements' and `org-element-greater-elements'.

Perhaps the docstring of `org-element--current-element' could mention
the expected values of the LIMIT argument?

--

Also, I have some more questions as I am trying to understand the
org-element-cache code.

I tried to add some additional comments to the existing code to clarify
parts I had difficulties understanding. See the attached patch. Let me
know if any of the comments are incorrect so that I can update my
understanding.

For now I am stuck on a few places in the code. Either they are bugs
or I have misunderstood something:

1. In org-element--cache-process-request:

> (while t
>   ...
>   (let ((beg (aref request 0))
>         (end (aref request 2))
>         (node (org-element--cache-root))
>         data data-key last-container)
>
>     ...
>     (and last-container
>     ...
>     (progn (when (and (not last-container)
>                       (> (org-element-property :end data)
>                          end))
>              (setq last-container data))

I do not understand the use of last-container here. The code is run in a
while loop whose body let-binds the last-container variable (the let
binding initializes it to nil).
(setq last-container data) will not affect further iterations of the
while loop, as last-container is re-bound to nil each time the let form
is entered.
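
A minimal sketch of the scoping point, using a toy loop rather than the
actual cache code (`demo-let-in-while' is a made-up name). It records the
value of the let-bound variable on entry to each iteration:

```emacs-lisp
;; Toy loop, not the actual cache code: `last-container' is re-bound
;; to nil by the `let' at the start of every iteration.
(defun demo-let-in-while ()
  "Return the values `last-container' had on entry to each iteration."
  (let ((i 0)
        (seen '()))
    (while (< i 3)
      (let ((last-container nil))
        ;; Record the value seen on entry to this iteration.
        (push last-container seen)
        ;; This assignment is lost when the `let' form exits.
        (setq last-container i))
      (setq i (1+ i)))
    (nreverse seen)))

(demo-let-in-while) ; => (nil nil nil)
```

This only illustrates the re-binding between iterations. A value set
inside the let is still visible to later forms within the same let body,
so the excerpt alone does not show whether the assignment is dead code.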

2. In org-element--cache-submit-request:

> (let ((first (org-element--cache-for-removal beg end offset)))
>   ...
>   (push (let ((beg (org-element-property :begin first))
>               (key (org-element--cache-key first)))
>           ...
>           ((let ((first-end (org-element-property :end first)))
>              (and (>= first-end end)
>                   (vector key beg first-end offset first 0
>           ...
>         org-element--cache-sync-requests)

According to the docstring of org-element--cache-sync-requests,
(aref request 4) during phase 0 should be PARENT: "PARENT, when non-nil,
is the parent of the first element to be removed". Yet KEY is the key of
FIRST, and FIRST itself is passed as PARENT.
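
For reference, a small sketch of how the slots of such a request line up
with the (vector key beg first-end offset first 0) call quoted above. The
slot names follow the [NEXT BEG END OFFSET PARENT PHASE] pattern as I
read it in the docstring; `demo-request-fields' and the sample values are
made up for illustration:

```emacs-lisp
;; Hypothetical helper: label the slots of a sync request, assuming
;; the layout [NEXT BEG END OFFSET PARENT PHASE].
(defun demo-request-fields (request)
  "Return the slots of REQUEST as a plist."
  (list :next   (aref request 0)
        :beg    (aref request 1)
        :end    (aref request 2)
        :offset (aref request 3)
        :parent (aref request 4)   ; `aref' takes the array first, then the index
        :phase  (aref request 5)))

;; Toy request mirroring (vector key beg first-end offset first 0).
(demo-request-fields (vector 'key 10 50 3 'first 0))
```

With this toy request, slot 4 holds the symbol `first', which is the
situation described above: FIRST ends up in the PARENT slot.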

> (push (let ((beg (org-element-property :begin first))
>             (key (org-element--cache-key first)))
>         ...
>         ;; Otherwise, we find the first non robust
>         ;; element containing END.  All elements between
>         ;; FIRST and this one are to be removed.
>         ...
>         (t
>          (let* ((element (org-element--cache-find end))
>                 (end (org-element-property :end element))
>                 (up element))
>            (while (and (setq up (org-element-property :parent up))
>                        (>= (org-element-property :begin up) beg))
>              (setq end (org-element-property :end up)
>                    element up))
>            (vector key beg end offset element 0)

Despite what the comment states, the following code simply searches for
the parent of FIRST and extends the region of the phase 0 request all
the way to the end of that parent (note that BEG is re-bound to the
beginning of FIRST).
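
To make the loop easier to follow, here is a self-contained sketch of the
same climbing logic on toy plists instead of real cache elements. The
function name `demo-climb-parents' and the sample positions are
assumptions for illustration only:

```emacs-lisp
;; Toy version of the quoted `while' loop: plists stand in for
;; elements, `plist-get' stands in for `org-element-property'.
(defun demo-climb-parents (element beg)
  "Climb from ELEMENT through parents that begin at or after BEG.
Return a list (ELEMENT END) for the outermost such ancestor, or
for ELEMENT itself when no parent qualifies."
  (let ((end (plist-get element :end))
        (up element))
    (while (and (setq up (plist-get up :parent))
                (>= (plist-get up :begin) beg))
      (setq end (plist-get up :end)
            element up))
    (list element end)))

;; section [1,100] > drawer [10,60] > paragraph [20,40], with BEG = 10.
(let* ((section   (list :type 'section :begin 1 :end 100 :parent nil))
       (drawer    (list :type 'drawer :begin 10 :end 60 :parent section))
       (paragraph (list :type 'paragraph :begin 20 :end 40 :parent drawer)))
  (demo-climb-parents paragraph 10))
```

In this toy case the loop stops at the drawer, because the section
begins before BEG, and the returned END is the end of the drawer (60).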

Now, consider the following branch of org-element--cache-submit-request.
There, (aref next 4) may contain the parent of FIRST from the above code.
That parent is robust (otherwise it would have been returned by
org-element--cache-for-removal) and its :end/:contents-end have already
been shifted in the earlier call to org-element--cache-for-removal
(which is where the value of FIRST was found):

> ;; If last change happened within area to be removed, 

Re: Question about Org syntax

2021-05-16 Thread Nicolas Goaziou
Ihor Radchenko  writes:

> Nicolas Goaziou  writes:
>> It should be a paragraph. I'll fix it soon.
>>
>> Note the problem can be reproduced with only
>>
>>   * test
>>   :end:
>
> Thanks!

Fixed. Thank you.

> Also, I have a few more questions (or maybe bug reports) about
> syntax/parsing:
>
> 1. Is org-element--current-element supposed to return (paragraph ...)
>on an empty buffer?

It is undefined. `org-element--current-element' is an internal function
being called at the beginning of "something".

However, `org-element-at-point' is expected to return nil in an empty
buffer.

> 2. Some of the element parsers honour LIMIT argument partially. Part of
>the parsing is typically done using looking-at (ignoring the LIMIT)
>and part is honouring it. This can backfire when LIMIT is before
>first characteristic line of the element. For example take headline
>parser:
>
>* Example headline
>
>:contents-begin of the parsed headline will be _after_ :end
>
>Or even
>* example headline
>
>:contents-begin is equal to :begin, sometimes leading to infinite
>loops in org-element--parse-to called by org-element-cache (hence
>the known bug where Emacs hangs when org-element-use-cache is
>non-nil)
>
>Some of the parsers potentially causing similar issues are:
>
>org-element-footnote-definition-parser,
>org-element-headline-parser, org-element-inlinetask-parser,
>org-element-plain-list-parser, org-element-property-drawer-parser,
>org-element-babel-call-parser, org-element-clock-parser,
>org-element-comment-parser, org-element-diary-sexp-parser,
>org-element-fixed-width-parser, org-element-horizontal-rule-parser,
>org-element-keyword-parser, org-element-node-property-parser,
>org-element-paragraph-parser, ...

LIMIT is not a random position in the buffer. It is supposed to be the
end of the parent element, or (point-max).

It is a bug (in the parser or in the cache) if it ends up being anywhere
else.

>  3. Some of the element parsers ignore LIMIT altogether:
> org-element-item-parser, org-element-section-parser...

`org-element-section-parser' actually recomputes LIMIT since it calls
`outline-next-heading'. This is sub-optimal and could probably be
removed.

`org-element-item-parser' is called in `item' mode, i.e., right after
`org-element-plain-list-parser', which already takes care of LIMIT. No
need to handle it twice.

> Is there any reason behind this? I thought that parsing a narrowed
> buffer is supposed to honour narrowing. Also, ignoring LIMIT might
> cause issues when trying to parse only visible elements.

No, parsing ignores any narrowing, hence the copious use of
`org-with-wide-buffer' or `org-with-point-at'.

Narrowing is here to help the user focus on a part of the document, not
to cheat on the surrounding syntax. As an example

  Here is an example: what do you think about it?

Narrowing the buffer to ": what do you think about it?", for whatever
reason, should not trick the parser into thinking you're in a fixed
width area.
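
A rough way to check this behaviour, assuming an Emacs with Org loaded
(`demo-element-type-under-narrowing' is a made-up name, and the exact
result may depend on the Org version):

```emacs-lisp
(require 'org)
(require 'org-element)

;; Hypothetical check: narrow to the part of the line that starts
;; with ": what do you ..." and ask the parser what is at point.
(defun demo-element-type-under-narrowing ()
  "Return the element type at point in a buffer narrowed mid-line."
  (with-temp-buffer
    (insert "Here is an example: what do you think about it?\n")
    (org-mode)
    (goto-char (point-min))
    (search-forward "example")
    ;; The accessible text now begins with a colon followed by a space.
    (narrow-to-region (point) (line-end-position))
    (goto-char (point-min))
    (org-element-type (org-element-at-point))))

(demo-element-type-under-narrowing)
```

If parsing ignores narrowing as described, this should report a
paragraph rather than a fixed-width area.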

Regards,
-- 
Nicolas Goaziou



Re: Question about Org syntax

2021-05-16 Thread Ihor Radchenko
Nicolas Goaziou  writes:
> It should be a paragraph. I'll fix it soon.
>
> Note the problem can be reproduced with only
>
>   * test
>   :end:

Thanks!

Also, I have a few more questions (or maybe bug reports) about
syntax/parsing:

1. Is org-element--current-element supposed to return (paragraph ...)
   on an empty buffer?

2. Some of the element parsers honour LIMIT argument partially. Part of
   the parsing is typically done using looking-at (ignoring the LIMIT)
   and part is honouring it. This can backfire when LIMIT is before
   first characteristic line of the element. For example take headline
   parser:

   * Example headline

   :contents-begin of the parsed headline will be _after_ :end

   Or even
   * example headline

   :contents-begin is equal to :begin, sometimes leading to infinite
   loops in org-element--parse-to called by org-element-cache (hence
   the known bug where Emacs hangs when org-element-use-cache is
   non-nil)

   Some of the parsers potentially causing similar issues are:

   org-element-footnote-definition-parser,
   org-element-headline-parser, org-element-inlinetask-parser,
   org-element-plain-list-parser, org-element-property-drawer-parser,
   org-element-babel-call-parser, org-element-clock-parser,
   org-element-comment-parser, org-element-diary-sexp-parser,
   org-element-fixed-width-parser, org-element-horizontal-rule-parser,
   org-element-keyword-parser, org-element-node-property-parser,
   org-element-paragraph-parser, ...

3. Some of the element parsers ignore LIMIT altogether:
   org-element-item-parser, org-element-section-parser...

Is there any reason behind this? I thought that parsing a narrowed
buffer is supposed to honour narrowing. Also, ignoring LIMIT might
cause issues when trying to parse only visible elements.

Best,
Ihor




Re: Question about Org syntax

2021-05-16 Thread Nicolas Goaziou
Hello,

Ihor Radchenko  writes:

> I am wondering about the element structure of the following Org buffer:
>
> * test
> :drawer:
> Paragraph
> * test
> :end:
>
> Should the ":end:" line belong to drawer or should it be a separate
> paragraph? Running org-element-at-point at the beginning of ":end:" line
> yields (drawer ...).

It should be a paragraph. I'll fix it soon.

Note the problem can be reproduced with only

  * test
  :end:

Regards,
-- 
Nicolas Goaziou