On Saturday, 10 March 2018 at 11:47:48 UTC, snowCat wrote:
Recently I took on the task of implementing the FALSE programming language in D, and I ran into the problem of splitting a FALSE expression into a list of strings, one per typical element: comment blocks, numbers, strings and operators.
Please tell me how to implement such a split properly.
Or at least tell me how to describe its grammar in Pegged.
A mention of your name in my blog article about the implementation is guaranteed.

I implemented a small concatenative language in 2016 and used Pegged for the lexing. I uploaded the source to GitHub: https://github.com/remy-j-a-moueza/stacky
The Pegged grammar is at the beginning of `stacky.d`.
Pegged is also pretty well documented in its wiki and tutorial (https://github.com/PhilippeSigaud/Pegged/wiki/Pegged-Tutorial).

As for a handwritten lexer, the last time I clearly remember working on one was in 2007, in D1, trying to lex C++. I copied part of that code here: https://gist.github.com/remy-j-a-moueza/8909819cbf972430bfbb16dff768b97d

The algorithm is mainly:
- try to match a regular expression
-- on a match: create a token for what has been matched, move forward in your string input, loop to the next iteration,
-- on error: try again with another regular expression.
By the end you should have accumulated your tokens.
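
To make the loop above concrete, here is a minimal sketch in D using `std.regex`. It is not the code from the gist: the `Token` and `Rule` structs, the `tokenize` function and the token classes (comment, number, string, single-character operator) are my own assumptions about a simplified FALSE-like syntax. Real FALSE also has character literals and a few other forms that are left out here.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

// Hypothetical token type: what kind of element it is and the text it covers.
struct Token { string kind; string text; }

// One lexer rule: a token kind plus the regular expression that recognises it.
struct Rule { string kind; Regex!char re; }

Token[] tokenize(string input)
{
    // Assumed, simplified token classes. Every pattern starts with ^ so that
    // it can only match at the current position in the input.
    auto rules = [
        Rule("space",    regex(`^\s+`)),
        Rule("comment",  regex(`^\{[^}]*\}`)),
        Rule("number",   regex(`^[0-9]+`)),
        Rule("string",   regex(`^"[^"]*"`)),
        Rule("operator", regex(`^[^\s{}"0-9]`)),
    ];

    Token[] tokens;
    while (input.length > 0)
    {
        bool matched = false;
        foreach (rule; rules)
        {
            auto m = matchFirst(input, rule.re);
            if (m.empty)
                continue;               // no match: try the next expression

            if (rule.kind != "space")   // whitespace is skipped, not kept
                tokens ~= Token(rule.kind, m.hit);
            input = m.post;             // move forward in the input
            matched = true;
            break;                      // restart with the first rule
        }
        if (!matched)
            throw new Exception("no rule matches at: " ~ input);
    }
    return tokens;
}

void main()
{
    foreach (tok; tokenize(`{add} 1 2+ "hi"`))
        writeln(tok.kind, ": ", tok.text);
}
```

The order of the rules matters: the first expression that matches wins, so the more specific patterns come before the catch-all operator rule. An unterminated comment or string matches no rule and ends up in the exception branch.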

There are a lot of resources online about parsing, for various programming languages and libraries, that are far better than what I have provided. Whatever the tools, the principles stay the same: what you learn elsewhere will still be usable in D, just expressed a bit differently.

I hope this helps.
Have fun.
