Re: [racket-users] Towards an Incremental Racket Parser for better IDE experience?

2020-12-09 Thread nicobao
Hi all,

I've read with great attention your messages, especially Sam's very 
comprehensive answer.
I now clearly understand that this is research-level work, and I was 
definitely too ambitious in trying to dig into it, as I have limited 
time after my day job (and probably too little knowledge as well, but that's a 
non-problem since it wouldn't be a one-person task anyway).
Nevertheless, I really appreciate the exchange.

Kind regards,
Nicolas

On Wednesday, December 2, 2020 at 7:56:58 PM UTC+1 Sam Tobin-Hochstadt 
wrote:

> A few thoughts on these topics, which I've been thinking about for a while.
>
> First, let's distinguish two things. One is an _incremental_ system,
> such as a parser, which does less work in response to a
> small change than it would need to do from scratch. The other is a
> system with _error recovery_, which is one where in the presence of
> one error, the system can still provide a useful answer and/or
> continue on to discover other errors. tree-sitter, for example, aims
> to do both of these, but they're quite different.
>
> With that in mind, several points:
>
> 1. It would be relatively straightforward to build an incremental
> _reader_ -- going from text to s-expressions. You could start from the
> grammar here: 
> https://github.com/racket/parser-tools/blob/master/parser-tools-lib/parser-tools/examples/read.rkt
> which is just for Scheme, and the lexer here:
> https://github.com/racket/syntax-color/blob/master/syntax-color-lib/syntax-color/racket-lexer.rkt
> which is for full Racket, which as Robby says is already
> error-tolerant. The read syntax (in the absence of reader extensions)
> is definitely context-free and probably LR(1). The code for the reader
> is here: 
> https://github.com/racket/racket/tree/master/racket/src/expander/read
>
> However, just calling `read` from scratch every time isn't a big
> bottleneck -- the biggest Racket-syntax file I have around is about
> 86000 lines and takes 700ms to `read`.
>
> 2. As Robby points out, the big challenge is the macro expander, which
> is (a) not a grammar, (b) large and complicated (the code is here:
> https://github.com/racket/racket/tree/master/racket/src/expander and
> it's about 35k lines) and (c) it runs arbitrary Racket code in the
> form of macros. I'm definitely interested in thinking about what an
> incremental expander would look like, but that's a big research
> project and probably would require a different model of macros than
> Racket has right now. It would not work to use some existing parsing
> toolkit like tree-sitter. You could perhaps write a new macro expander
> using an incremental computation framework such as Adapton
> [https://docs.rs/adapton/0.3.31/adapton/] or write something like
> Adapton for Racket. How well that would work is an interesting
> question. You could also rewrite the macro expander to be incremental
> more directly.
>
> 3. An error-tolerant macro expander is more plausible, but would again
> require substantial changes to the expander. One possible idea is to
> use the information the macro stepper already uses to reconstruct the
> partial program right before it went wrong, and supply that to the IDE
> to use for completion/etc. Another idea would be to replace pieces of
> erroneous syntax with something that allows the expander to continue
> (this is how error-tolerant parsers work). There are probably lots
> more ideas that we could come up with.
>
> 4. Compiling to one of the OCaml intermediate languages is an
> interesting idea -- I've thought about their flambda language as a
> possible target before. The place to start is the `schemify` layer:
> https://github.com/racket/racket/tree/master/racket/src/schemify that
> turns fully-expanded Racket code into Scheme code for Chez Scheme.
> Changing that to produce flambda would be plausible, although there
> are a lot of mismatches between the languages that would be tricky to
> overcome. Another possibility would be to directly produce JavaScript
> from that layer. You might be interested in the RacketScript project:
> https://github.com/vishesh/racketscript
>
> If you're interested in thinking more about these topics, or working
> on them, I'm happy to offer more advice.
>
> Sam
>
> On Wed, Dec 2, 2020 at 9:53 AM nicobao wrote:
> >
> > Hi!
> >
> > The Racket Reader and the Racket Expander always return "Error : blabla"
> > when you send it a bad Racket source code.
> > As a consequence, when there is a source code error, DrRacket and the
> > Racket LSP cannot provide IDE functionalities like "find references", "info
> > on hover", "find definition"...etc.
> > This is an issue, because 99% of the time one write code, the c

[racket-users] Towards an Incremental Racket Parser for better IDE experience?

2020-12-02 Thread nicobao
Hi!

The Racket Reader and the Racket Expander always return "Error : blabla" 
when you send them invalid Racket source code.
As a consequence, when there is a source code error, DrRacket and the 
Racket LSP cannot provide IDE functionality like "find references", "info 
on hover", "find definition", etc.
This is an issue, because 99% of the time one writes code, the code is 
incorrect. Other languages (Rust, TypeScript/JS, Java, OCaml, etc.) rely on 
an incremental parser that can provide a tree even if the source code is 
wrong. Basically, it adds an "ERROR" node to the tree and goes on, instead of 
stopping everything and returning at the first error.
Currently, this compiler issue prevents Racket IDEs from providing a better 
user experience.
For my practical use of Racket, this is important.
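
To make the "ERROR node" idea concrete, here is a minimal sketch in Python 
(chosen only for brevity; it is not Racket's reader, and not how tree-sitter 
is implemented). It assumes a toy s-expression syntax with only parentheses 
and whitespace-separated atoms, and the names Node, tokenize and parse are 
hypothetical. Instead of raising at the first unbalanced parenthesis, it 
records an ERROR node and keeps going, so later forms remain available to a 
tool. It shows error recovery only; nothing here is incremental.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "list", "atom", or "ERROR"
    text: str = ""                 # atom text or error message
    children: list = field(default_factory=list)


def tokenize(src):
    # Toy lexer: parentheses are their own tokens, whitespace separates atoms.
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(src):
    """Parse every top-level form in src without raising on bad parentheses."""
    tokens = tokenize(src)
    pos = 0

    def parse_form():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            # Stray closing paren: record it and let the caller continue.
            return Node("ERROR", "unexpected )")
        if tok != "(":
            return Node("atom", tok)
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            children.append(parse_form())
        if pos == len(tokens):
            # Input ended before ")": keep what was read, mark the gap.
            children.append(Node("ERROR", "missing )"))
        else:
            pos += 1  # consume the matching ")"
        return Node("list", children=children)

    forms = []
    while pos < len(tokens):
        forms.append(parse_form())
    return forms
```

For example, parse("(define x 1) (f 2") returns two list nodes: the first is 
the complete definition, and the second holds the atoms f and 2 followed by 
an ERROR node for the missing parenthesis.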

I would like to help work in that direction.
I see two possible solutions:
1) improve the recursive descent parser of the Reader, as well as the 
Expander, to make them incremental and fault-tolerant
2) rewrite the parser in something like tree-sitter or Menhir, at the 
cost of having to rewrite the Reader/Expander logic (!!!)

Both solutions are daunting tasks.

For solution 1), could you point me to the source code of Racket's recursive 
descent parser? What about the Expander?

For solution 2), I was thinking of writing a tree-sitter grammar for 
Racket. However, I can't find a formal description of the grammar, like 
the one for Scheme here:
https://www.scheme.com/tspl4/grammar.html#APPENDIXFORMALSYNTAX
Of course, the Racket documentation is still quite comprehensive, but it 
would be nice if anyone could tell me whether such a formal document exists 
somewhere.
Besides, I wonder whether Racket/Scheme could even be described using an 
LR(1) or a GLR grammar?

Finally, has any work been started in this direction?

Totally off-topic, but has anyone ever thought of compiling Racket down to 
OCaml, in order to reuse js_of_ocaml and produce optimized JS code from 
Racket?
I was wondering whether it would be feasible.

Final note: I know all of that is _very_ ambitious!

Kind regards,
Nicolas
