> On 28 Oct 2017, at 17:37, Steffen Märcker <merk...@web.de> wrote:
>
> Does that mean the sets/bdd would be constructed mainly at compile time?
> Anyway, Andrew, feel free to contact me, I might help you with this.
>
Thanks for the offer, Steffen! The problem is that I need to use SmaCC for my current project, and really do not have a month to take off and re-design the way that it builds its scanner. I’ve talked to Thierry Goubier about it, and he doesn’t have time either!

It would be a fun project, though, and it ought to be fairly separate from other parts of SmaCC. I’ve spent a fair bit of time thinking about how to do it, but I don’t think that I will be able to actually focus on it.

An alternative approach, which Thierry has suggested, is to make SmaCC work on the UTF-8 representation of Unicode. Then we could represent character sets as prefix trees. But the core problem would still exist: you can’t run an algorithm that repeatedly executes "for all characters in the alphabet do:" when there are 2^21 characters in the alphabet!

Andrew
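
For readers who want to picture the prefix-tree idea, here is a minimal sketch in Python. It is not SmaCC code and not necessarily what Thierry has in mind: the class name `Utf8CharSet` and the `END` marker are made up for illustration. It assumes a character set is stored as a tree keyed by the successive bytes of each character's UTF-8 encoding, so a membership test walks at most four bytes.

```python
END = -1  # hypothetical marker key; real byte keys are 0..255, so no clash


class Utf8CharSet:
    """Illustrative character set stored as a prefix tree over UTF-8 bytes."""

    def __init__(self, chars=""):
        # Each node is a dict mapping a byte value to a child node.
        self.root = {}
        for ch in chars:
            self.add(ch)

    def add(self, ch):
        """Insert one character by walking/creating a node per UTF-8 byte."""
        node = self.root
        for byte in ch.encode("utf-8"):
            node = node.setdefault(byte, {})
        node[END] = True  # the bytes consumed so far form a member

    def __contains__(self, ch):
        """Membership test: follow the character's UTF-8 bytes down the tree."""
        node = self.root
        for byte in ch.encode("utf-8"):
            node = node.get(byte)
            if node is None:
                return False
        return END in node


letters = Utf8CharSet("aé€")
print("é" in letters, "è" in letters)
```

In this sketch "é" and "è" share the lead byte 0xC3, so they would share a node at the first level and differ only at the second. For scale, 2^21 is 2,097,152, which is the size of the alphabet that a per-character loop would have to cover.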