Hi Daniel,

Many thanks for working on this issue. In my case, I'm waiting for the "tolerant" parser feature to continue my work on the Freemarker Language Server https://github.com/angelozerr/freemarker-languageserver/ which currently uses a custom tolerant parser (which basically parses XML).

If you could also provide the ability to update an existing FreeMarker DOM from a content change (e.g. the user types a space, or types some FM content in the editor), that would be fantastic. It would avoid reparsing the full content of the editor to rebuild the FreeMarker DOM (incremental parsing).

In other words, to support IDEs, we need:

* tolerant parser (required)
* incremental parser (optional)

The Java JDT ICompilationUnit of Eclipse provides this feature. It's one reason why Java editor completion etc. is so fast.

Regards,
Angelo

2018-08-07 1:59 GMT+02:00 Daniel Dekany <[email protected]>:

> Sunday, August 5, 2018, 6:58:11 PM, Stephan Müller wrote:
>
> > On 04.07.2018 at 19:28, Daniel Dekany wrote:
> >> I wonder what parser libraries could help us, in FM3, to separate the
> >> expression language parsing from the top-level language (like
> >> `<#foo>`, `${...}`, etc.) parsing. Or if a hand-written parser is an
> >> acceptable compromise. It would be good if we can change the top-level
> >> syntax and still reuse the expression syntax. (Or, replace the
> >> expression syntax, and reuse the top-level one.) Like, somebody wants
> >> a syntax like `#foo(exp)` instead of `<#foo exp>`, but still reuse the
> >> expression syntax. (For me it was always part of the FM3 agenda,
> >> though it might prove to be too much...)
> >> [..]
> >
> > During the last days I had a high-level look at different parser
> > generators, and as one might imagine, there are a lot of parser
> > generators, with different licenses, different maturities, different
> > states of maintenance and so on.
> >
> > Due to https://www.apache.org/legal/resolved.html I ignored all parser
> > generators which may not be included in Apache projects because of their
> > license, especially GNU GPL etc.
> >
> > IMHO this leaves us with:
> >
> > * LL(k) parsers: ANTLR, JavaCC and Grammatica
> > * LALR parsers: CookCC
> > * PEG parsers: Mouse
> > * parser combinators: jparsec, parboiled and PetitParser
> >
> > This list is not exhaustive, so I probably forgot some interesting
> > projects. If so, please share, I'd like to have a look into these, too.
> >
> > My idea for the next step: define a really small subset of FTL and try
> > to implement PoCs for this subset with the candidates which I mentioned
> > above.
> >
> > The subset might be something like
> >
> > * interpolations: ${..}
> > * directives: if, assign
>
> Just to be on the safe side, I will note that you shouldn't try to
> hard-code parser logic that's specific to a directive (like "if").
> Instead, you should try to parse a unified/generic directive call
> syntax, and then invoke the Dialect to find out the further rules. And
> that's tricky, as then the parser definition doesn't specify which
> tags have an end-tag pair, and what can be nested between them; only
> the Dialect knows that. Like, if you look at the current parser, it
> basically says that "if" is like
>
>   "<#" "if" Expression ">" MixedContent "</#" "if" ">"
>
> which is expressive and all, but sadly it won't be possible in FM3 to
> do it like that.
>
> > * expressions: numbers, variables, +
> > * variants of the parsers with different delimiters
> > * split into two parsers (interpolations/directives vs. expression
> >   language)
> >
> > What do you think?
>
> I haven't used any parser library but JavaCC, so I have no tips
> there. Otherwise the plan sounds good.
>
> Anyway, I kind of repeat myself here, but the expectations that may
> filter down the candidates quickly:
>
> - Splitting into two parsers, of course
>
> - Maintainability of custom syntax variations (like new FreeMarker
>   versions won't break them, or at least they need no manual work to
>   regenerate them)
>
> - How parsing partially driven by the Dialect looks... it won't fit
>   JavaCC well for example. (But, probably it won't be very nice with
>   any of them.)
>
> In case multiple of the libraries stay alive, some further extras that
> can decide:
>
> - More understandable/helpful error messages are a big plus.
>
> - It would be interesting to see how hard it is to write a parser that
>   continues parsing after the first error, to catch more errors. This
>   is mostly for IDE-s.
>
> > Stephan.
> >
> > P.S.: my more detailed list of parser generators can be found here:
> > https://gist.github.com/chaquotay/8041096bad36f6f3f0d4166d6f8623b5
>
> --
> Thanks,
>  Daniel Dekany
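
To make two of the points quoted above concrete (a generic directive syntax whose rules come from the Dialect, and a parser that keeps going after the first error), here is a minimal sketch. Everything in it is an assumption for illustration: `DirectiveSketch`, `Dialect`, `hasEndTag`, `DEMO_DIALECT` and `Result` are hypothetical names, not FM3 API, and a regular expression stands in for a real lexer, so expressions containing `>` are not handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DirectiveSketch {
    /** Hypothetical stand-in for the Dialect: only it knows which directives have an end tag. */
    interface Dialect {
        /** TRUE if the directive has a nested body and end tag, FALSE if not, null if unknown. */
        Boolean hasEndTag(String directiveName);
    }

    /** Demo rules (assumed): "if" needs "</#if>", "assign" stands alone. */
    static final Dialect DEMO_DIALECT = name -> Map.of("if", true, "assign", false).get(name);

    static final class Result {
        final List<String> events = new ArrayList<>();
        final List<String> errors = new ArrayList<>();
    }

    // One generic shape for every tag: "<#name args>" or "</#name>".
    // No directive name is hard-coded here; group 3 (the argument text)
    // is where a separate expression parser would take over.
    private static final Pattern TAG = Pattern.compile("<(/?)#(\\w+)([^>]*)>");

    static Result parse(String src, Dialect dialect) {
        Result r = new Result();
        Deque<String> open = new ArrayDeque<>(); // directives still waiting for an end tag
        Matcher m = TAG.matcher(src);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            String name = m.group(2);
            Boolean hasEnd = dialect.hasEndTag(name); // ask the Dialect for the rules
            if (hasEnd == null) {
                // Tolerant behaviour: record the problem and continue with the next tag.
                r.errors.add("Unknown directive '" + name + "' at offset " + m.start());
            } else if (!closing) {
                r.events.add("start:" + name);
                if (hasEnd) {
                    open.push(name);
                }
            } else if (hasEnd && name.equals(open.peek())) {
                open.pop();
                r.events.add("end:" + name);
            } else {
                r.errors.add("Unexpected </#" + name + "> at offset " + m.start());
            }
        }
        for (String name : open) {
            r.errors.add("Unclosed <#" + name + ">");
        }
        return r;
    }

    public static void main(String[] args) {
        Result r = parse("<#if x><#assign y = 1><#foo></#if></#if>", DEMO_DIALECT);
        System.out.println(r.events);
        System.out.println(r.errors);
    }
}
```

On the input used in `main`, the sketch records three events (`start:if`, `start:assign`, `end:if`) and two errors (the unknown `foo` directive and the stray second `</#if>`) instead of stopping at the first problem. A language server would need more than this, for example source ranges on every node and recovery inside expressions.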
