Hi Andrew, Steffen,

2017-11-07 13:10 GMT+01:00 Prof. Andrew P. Black <bl...@cs.pdx.edu>:
> > On 28 Oct 2017, at 17:37, Steffen Märcker <merk...@web.de> wrote:
> >
> > Does that mean the sets/bdd would be constructed mainly at compile time?
> > Anyway, Andrew, feel free to contact me, I might help you with this.
>
> Thanks for the offer, Steffen! The problem is that I need to use SmaCC
> for my current project, and really do not have a month to take off and
> re-design the way that it builds its scanner. I’ve talked to Thierry
> Goubier about it, and he doesn’t have time either! It would be a fun
> project, though, and it ought to be fairly separate from other parts of
> SmaCC. I’ve spent a fair bit of time thinking about how to do it, but
> don’t think that I will be able to actually focus on it.

Yes, this is the essence of the issue. There are a few alternatives, but
none that we have the time to pursue.

> An alternative approach, which Thierry has suggested, is to make SmaCC
> work on the UTF-8 representation of the Unicode. Then we could represent
> character sets as prefix trees. But the core problem would still exist:
> you can’t run an algorithm that repeatedly executes
>
>     for all characters in the alphabet do:
>
> when there are 2^21 characters in the alphabet!

The main issue is that `for all characters`... All the literature on
scanner building uses 'for all characters do'.

Thierry

> Andrew
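
To make the `for all characters` problem concrete, here is a minimal Python sketch of one well-known workaround: loop over intervals of code points instead of individual characters. This is not how SmaCC works today, and the names (`partition_alphabet`, `members`, `MAX_CODE_POINT`) are hypothetical, chosen only for illustration. It assumes each character set is given as a list of inclusive code-point ranges.

```python
MAX_CODE_POINT = 0x10FFFF  # last Unicode code point

def partition_alphabet(char_sets):
    """Split [0, MAX_CODE_POINT] into intervals that no input range cuts through.

    char_sets is a list of character sets, each a list of inclusive
    (lo, hi) code-point ranges. Hypothetical helper, for illustration only.
    """
    cuts = {0, MAX_CODE_POINT + 1}
    for ranges in char_sets:
        for lo, hi in ranges:
            cuts.add(lo)      # an interval starts where a range starts
            cuts.add(hi + 1)  # and another starts right after it ends
    points = sorted(cuts)
    return [(a, b - 1) for a, b in zip(points, points[1:])]

def members(interval, ranges):
    """True if the interval lies inside the set described by ranges.

    Checking the lower bound is enough, because partition_alphabet
    guarantees that no range boundary falls inside an interval.
    """
    lo, _ = interval
    return any(a <= lo <= b for a, b in ranges)

digits = [(ord("0"), ord("9"))]
letters = [(ord("A"), ord("Z")), (ord("a"), ord("z"))]

intervals = partition_alphabet([digits, letters])

# "for all characters" becomes "for all intervals": a handful of
# iterations here, instead of one per code point.
for interval in intervals:
    print(interval, members(interval, digits), members(interval, letters))
```

The number of intervals grows with the number of range boundaries in the grammar rather than with the size of the alphabet, which is the property the scanner-building loop would need.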