Hi List, I've been revisiting the work I did with the DNP3 parser, aiming to produce a proper context-free grammar for the application layer messages. Two questions have popped up and I would appreciate pointers to any previous work in these areas.
1. Parsing binary (i.e. bitwise) structures below the byte boundary, e.g. flags and 4-bit numbers. Has there been any work on adapting, say, LR parsing to efficiently work on a (theoretical) alphabet of {0,1,$}? I'm assuming that simply using the standard algorithms with that alphabet would ruin performance. I've spent some thought on optimizations or ways of side-stepping this issue but my literature searches have come up empty so far. 2. Classes of finite languages. A lot of practical languages and structures are actually finite (DNP3 messages have a maximum size, for instance), but enumerating them for the purpose of parsing or even pure recognition is impractical/inefficient/undesirable. Are there any formal categorizations here, maybe along the lines of "admits a grammar with the number of rules linear/logarithmic/... in the maximum word length/size of language/..."? Regards, Sven _______________________________________________ langsec-discuss mailing list langsec-discuss@mail.langsec.org https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss