Re: Org lint and named source blocks

2021-10-04 Thread Tom Gillespie
Thanks for the pointer! The actual point of contact seems to be
https://github.com/milisims/tree-sitter-org. Good to find another
group that is working on this. Best,
Tom



Re: Org lint and named source blocks

2021-10-04 Thread Timothy
Hi Tom,

> The issue for me is that I don’t have the bandwidth to get started
> with a full tree sitter implementation, especially because it is going
> to need a custom scanner, and because you’re effectively on your
> own when it comes to reconstructing the output of the AST into the
> actual internal representation of an Org file. I also have no idea how
> to deal with nested parsers in tree sitter. I have some ideas about
> how it might be done, but nothing concrete (see the linked issue
> for more on that).

orgmode.nvim is developing a tree-sitter parser; perhaps a dialogue with them
could be productive?

All the best,
Timothy


Re: Org lint and named source blocks

2021-10-04 Thread Tom Gillespie
> By the way, wouldn't it be better to use tree-sitter rather than
> something else for the format grammar?

Not really since we are going to need more than one implementation
using a parser generator to avoid baking implementation specific
details into the spec by accident. This is true for more than just
the grammar as well. The complexity of tokenization, parsing,
expanding, etc, for Org means that we are going to need multiple
implementations to nail the behavior for any formal spec.

That said, we definitely want a TS implementation at some point.
See https://github.com/tgbugs/laundry/issues/1 for a recent
discussion about ways forward.

The implementation I'm working on should translate to TS without
too much work since both brag and tree sitter describe LR variants.
There may be some subtle differences, but nothing fundamental.

The issue for me is that I don't have the bandwidth to get started
with a full tree sitter implementation, especially because it is going
to need a custom scanner, and because you're effectively on your
own when it comes to reconstructing the output of the AST into the
actual internal representation of an Org file. I also have no idea how
to deal with nested parsers in tree sitter. I have some ideas about
how it might be done, but nothing concrete (see the linked issue
for more on that).

Best,
Tom



Re: Org lint and named source blocks

2021-10-04 Thread Ihor Radchenko
Tom Gillespie writes:

>> Should we allow syntax like #+KEYWORD:value to be correct or do we
>> require a whitespace/space after colon all the time?
>
> The spec as written is ambiguous/silent on this issue. In my work on
> laundry tokenizer and grammar I have found keyword syntax to be a
> thorny issue, and I strongly suggest that for the time being we either
> make no ruling on this or we state that the colon that ends the
> keyword should be followed by a space as a precautionary measure.
> The safe thing to do is to always require whitespace after the colon
> because it guarantees correct interpretation.

By the way, wouldn't it be better to use tree-sitter rather than
something else for the format grammar? At least, there is some work on
integrating tree-sitter into Emacs core [1,2].

[1] https://lists.gnu.org/archive/html/emacs-devel/2021-08/msg00268.html
[2] https://archive.casouri.cat/note/2021/emacs-tree-sitter/#Feedback

Best,
Ihor



Re: Org lint and named source blocks

2021-10-04 Thread Ihor Radchenko
Ihor Radchenko writes:

> This one is tricky. The linter (org-lint-duplicate-name) expects that
> NAME keyword must have space before value. However, the actual Org
> parser (org-element--collect-affiliated-keywords) does not care about
> space. My intuition says that the parser behaviour is
> unintentional. However, not requiring a whitespace may also be a valid
> syntax.

For the time being, let's prefer what org-element does over the linter.
I have pushed the fix to bugfix as bd0493eda.

Best,
Ihor



Re: Org lint and named source blocks

2021-09-21 Thread Tom Gillespie
> Should we allow syntax like #+KEYWORD:value to be correct or do we
> require a whitespace/space after colon all the time?

The spec as written is ambiguous/silent on this issue. In my work on
laundry tokenizer and grammar I have found keyword syntax to be a
thorny issue, and I strongly suggest that for the time being we either
make no ruling on this or we state that the colon that ends the
keyword should be followed by a space as a precautionary measure.
The safe thing to do is to always require whitespace after the colon
because it guarantees correct interpretation.

Requiring whitespace after the colon simplifies the grammar; however,
it means that you can't compact keyword lines, and it induces an
annoying failure mode where lines with a missing space are no longer
recognized as keywords.

However, it is technically possible to make keywords work without the
whitespace, so long as there is at least one whitespace prior to the
next colon (but not contained in square brackets, e.g. #+key:lol[ a b
c ]:value is a well-formed keyword under a slightly generalized
grammar). The problem is that we would like to make keyword syntax
fully closed, and I need a bit more time to get that worked out before
any definitive conclusions are drawn.

The complexity of the generalized keyword syntax can be seen here:
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L107-L249

Best,
Tom



Re: Org lint and named source blocks

2021-09-21 Thread Ihor Radchenko
Dominik Schrempf writes:

> Running =org-lint= on an Org file containing
>
> #+NAME:Hello
> #+BEGIN_SRC emacs-lisp :exports code
> #+END_SRC
>
> I get the following error:
> #+begin_quote
> Debugger entered--Lisp error: (search-failed "^[ \11]*#\\+[A-Za-z]+: +Hello *$")
> #+end_quote

Confirmed.

This one is tricky. The linter (org-lint-duplicate-name) expects that
NAME keyword must have space before value. However, the actual Org
parser (org-element--collect-affiliated-keywords) does not care about
space. My intuition says that the parser behaviour is
unintentional. However, not requiring a whitespace may also be a valid
syntax.
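
To make the mismatch concrete, here is a minimal Python sketch. The
strict pattern is a direct translation of the regexp shown in the
search-failed error; the lenient pattern and the helper name `matches`
are assumptions made for illustration and are not the regexp that
org-element actually uses.

```python
import re

# Python translation of the pattern from the search-failed error;
# "\11" in the Emacs string is an octal escape for a tab character.
LINTER_PATTERN = re.compile(r"^[ \t]*#\+[A-Za-z]+: +Hello *$")

# Hypothetical lenient pattern: whitespace after the colon is optional.
# An illustration only, not the regexp org-element actually uses.
LENIENT_PATTERN = re.compile(r"^[ \t]*#\+[A-Za-z]+:[ \t]*Hello[ \t]*$")


def matches(pattern, line):
    """Return True when the pattern matches the keyword line."""
    return pattern.search(line) is not None


for line in ("#+NAME: Hello", "#+NAME:Hello"):
    print(repr(line), matches(LINTER_PATTERN, line), matches(LENIENT_PATTERN, line))
```

Under these assumptions, a lenient parser accepts both spellings, while
a search with the strict pattern finds nothing for the compact form,
which is consistent with the search-failed error in the report.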

Dear Orgers,

Should we allow syntax like #+KEYWORD:value to be correct or do we
require a whitespace/space after colon all the time?

Best,
Ihor



Org lint and named source blocks

2021-09-21 Thread Dominik Schrempf
Thank you for the Haskell fix! I found another issue (not a bug but could be
handled better):

Running =org-lint= on an Org file containing

#+NAME:Hello
#+BEGIN_SRC emacs-lisp :exports code
#+END_SRC

I get the following error:
#+begin_quote
Debugger entered--Lisp error: (search-failed "^[ \11]*#\\+[A-Za-z]+: +Hello *$")
#+end_quote

The code is faulty because there should be a space between #+NAME: and Hello,
like so:

#+NAME: Hello
#+BEGIN_SRC emacs-lisp :exports code
#+END_SRC

However, this should probably be reported by =org-lint= as an Org syntax error,
and not lead to an error when executing =org-lint=.

What do you think?

Thank you,
Dominik