[langsec-discuss] two parsing questions

Sven M. Hallberg Tue, 22 May 2018 06:06:07 -0700

Hi List,

I've been revisiting the work I did with the DNP3 parser, aiming to
produce a proper context-free grammar for the application layer
messages. Two questions have popped up and I would appreciate pointers
to any previous work in these areas.


1. Parsing binary (i.e. bitwise) structures below the byte boundary,
   e.g. flags and 4-bit numbers. Has there been any work on adapting,
   say, LR parsing to efficiently work on a (theoretical) alphabet of
   {0,1,$}? I'm assuming that simply using the standard algorithms with
   that alphabet would ruin performance. I've given some thought to
   optimizations and ways of side-stepping this issue, but my literature
   searches have come up empty so far.

2. Classes of finite languages. A lot of practical languages and
   structures are actually finite (DNP3 messages have a maximum size,
   for instance), but enumerating them for the purpose of parsing or
   even pure recognition is impractical/inefficient/undesirable.
   Are there any formal categorizations here, maybe along the lines of
   "admits a grammar with the number of rules linear/logarithmic/... in
   the maximum word length/size of language/..."?
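
To make the concern in question 1 concrete, here is a minimal Python
sketch of the naive approach: the input is turned into a stream of single
bits and sub-byte fields are read from it one symbol at a time. The field
layout (four 1-bit flags followed by a 4-bit number) and all names are
hypothetical and chosen only for illustration; this is not the DNP3
parser's actual code.

```python
from typing import Dict, Iterator, Union


def bits_msb_first(data: bytes) -> Iterator[int]:
    """Yield every bit of data, most significant bit of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def take_uint(bits: Iterator[int], width: int) -> int:
    """Consume `width` bits and return them as an unsigned integer."""
    value = 0
    for _ in range(width):
        value = (value << 1) | next(bits)
    return value


def parse_header(data: bytes) -> Dict[str, Union[bool, int]]:
    # Hypothetical layout: four 1-bit flags, then one 4-bit number.
    bits = bits_msb_first(data)
    fields: Dict[str, Union[bool, int]] = {}
    for name in ("flag_a", "flag_b", "flag_c", "flag_d"):
        fields[name] = bool(take_uint(bits, 1))
    fields["number"] = take_uint(bits, 4)
    return fields


header = parse_header(b"\xa5")  # 0xA5 is 1010 0101 in binary
```

Every input byte becomes eight symbols here, so a parser driven by this
stream does roughly eight times as many steps as one working on bytes.
That overhead is what I would like to avoid.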

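As an illustration of the kind of measure meant in question 2, the
following Python sketch builds a grammar for one particular finite
language, all binary words of length exactly 2**k, using a doubling
construction, and compares the number of rules with the number of words.
The nonterminal names A0..Ak and the helper functions are assumptions
made for this example only.

```python
from typing import Dict, List, Set

Grammar = Dict[str, List[List[str]]]


def doubling_grammar(k: int) -> Grammar:
    """Rules for the language of all binary words of length 2**k."""
    rules: Grammar = {"A0": [["0"], ["1"]]}
    for i in range(1, k + 1):
        # Ai derives two copies of A(i-1), doubling the word length.
        rules[f"A{i}"] = [[f"A{i - 1}", f"A{i - 1}"]]
    return rules


def language(rules: Grammar, symbol: str) -> Set[str]:
    """Enumerate all words derivable from symbol (acyclic grammars only)."""
    words: Set[str] = set()
    for rhs in rules[symbol]:
        partial = {""}
        for s in rhs:
            expansions = language(rules, s) if s in rules else {s}
            partial = {p + e for p in partial for e in expansions}
        words |= partial
    return words


k = 3
rules = doubling_grammar(k)
rule_count = sum(len(alternatives) for alternatives in rules.values())
words = language(rules, f"A{k}")
print(rule_count, len(words))
```

For k = 3 this gives 5 rules for a language of 256 words of length 8, so
the rule count grows with the logarithm of the word length while the
language itself is doubly exponential in k. I am looking for a systematic
treatment of such relationships.
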
Regards,
Sven