Ooooh! That looks quite... exciting! Now I'm wondering whether there could be a little more propagation of token names from the grammar into the bytecode source. I.e., could A, B and I appear in there as labels on the lines, with annotations or something similar on the destinations in split?
-- Graham

At 4/30/2010 04:41 PM, Terence Parr wrote:
>On Apr 30, 2010, at 4:27 PM, Graham Wideman wrote:
>> This prompts me to wonder how debuggable these lexers will be? Currently a
>> certain amount of troubleshooting of lexing/parsing can be done by
>> inspecting the generated lexer source, single-stepping it and so on.
>>
>> If you move to encoding the lexer logic in bytecodes, does the generated
>> lexer source become an inscrutable black box? Or is there still meaningful
>> source code to examine, trace etc?
>
>Yup. The bytecode is actually easier to read than the java ;)
>
>lexer grammar L2;
>A : 'ab';
>B : 'a'..'z'+ ;
>I : '0'..'9'+ ;
>
>yields:
>
>0000: split 9, 16, 29  // says 3 paths are possible
>0009: match8 'a'
>0011: match8 'b'
>0013: accept 4
>0016: range8 'a', 'z'
>0019: split 16, 26
>0026: accept 5
>0029: range8 '0', '9'
>0032: split 29, 39     // go back or fall out of loop into accept state
>0039: accept 6
>
>is that what you mean? It's 1-to-1 with the grammar. taken almost verbatim
>from Russ Cox's description of VM-based NFA simulation.
>
>ANTLR v4 uses 42 bytes to encode entire L2 grammar. ANTLR v3 generates 246
>lines of Java and 2709 bytes of java .class file:
>
>/tmp $ wc -l L2.java
>     246 L2.java
>/tmp $ ls -l L2.class
>-rw-r--r--  1 parrt  wheel  2709 Apr 30 16:39 L2.class
>
>Ter
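
For anyone who wants to single-step the listing above by hand, here is a minimal sketch in Python of a thread-list NFA simulation over those ten instructions. It is not ANTLR's implementation. The tuple encoding, the names PROGRAM, NEXT, RULE_NAMES, add_thread and match_token, and the mapping of accept values 4, 5, 6 to rules A, B, I are all assumptions made for illustration. Ties are broken in favour of the rule listed first, which is also an assumption.

```python
# Hypothetical encoding of the listing: address -> instruction tuple.
PROGRAM = {
    0:  ("split", 9, 16, 29),
    9:  ("match", "a"),
    11: ("match", "b"),
    13: ("accept", 4),
    16: ("range", "a", "z"),
    19: ("split", 16, 26),
    26: ("accept", 5),
    29: ("range", "0", "9"),
    32: ("split", 29, 39),
    39: ("accept", 6),
}

# Assumed mapping from accept values to rule names (not stated in the listing).
RULE_NAMES = {4: "A", 5: "B", 6: "I"}

# Address of the instruction that follows each instruction in the listing.
_ADDRS = sorted(PROGRAM)
NEXT = dict(zip(_ADDRS, _ADDRS[1:]))


def add_thread(threads, seen, pc):
    """Add pc to the thread list, following split targets in listed order."""
    if pc in seen:
        return
    seen.add(pc)
    op = PROGRAM[pc]
    if op[0] == "split":
        for target in op[1:]:
            add_thread(threads, seen, target)
    else:
        threads.append(pc)


def match_token(text, start=0):
    """Return (accept_value, end_index) for the longest match, or None."""
    threads, seen = [], set()
    add_thread(threads, seen, 0)
    best = None
    pos = start
    while threads:
        # The first accept in thread order wins at this position; a match
        # found at a later position replaces it, giving longest-match.
        for pc in threads:
            if PROGRAM[pc][0] == "accept":
                best = (PROGRAM[pc][1], pos)
                break
        if pos >= len(text):
            break
        ch = text[pos]
        next_threads, seen = [], set()
        for pc in threads:
            op = PROGRAM[pc]
            if op[0] == "match":
                hit = ch == op[1]
            elif op[0] == "range":
                hit = op[1] <= ch <= op[2]
            else:
                hit = False  # accept instructions consume no input
            if hit:
                add_thread(next_threads, seen, NEXT[pc])
        threads = next_threads
        pos += 1
    return best


if __name__ == "__main__":
    for sample in ("ab", "abc", "42x", "?"):
        result = match_token(sample)
        name = RULE_NAMES[result[0]] if result else None
        print(repr(sample), result, name)
```

Printing the thread list inside the loop shows which addresses are live after each character, which is roughly the kind of trace the debuggability question is about.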
