Hi Andi,

Yes, it can be done with ANTLR, but ANTLR is not the right tool for parsing binary files.
The closest example for a binary file is the Java .class file grammar
<http://www.antlr.org/grammar/1147639104266/classfile.tar.gz>

Basically you will be using semantic predicates for everything, which is like calling assembly from C. On the surface it may appear to be ANTLR, but in reality it is abusing ANTLR to do something it was not primarily designed to do.

Also, these will be one-off programs. You will have to create a new one for each file layout.

Something you might want to do, although it is reinventing the wheel, is to create your own grammar that defines binary layouts, and then use that as input to a driver that reads the binary file.

I have done both of these, and the latter is the better option.

Eric

On Thu, Sep 15, 2011 at 12:27 PM, <[email protected]> wrote:
> Hi,
> I searched through the archives and through the ANTLR reference, but I got
> the feeling that building a parser for binary files is a bit hard.
>
> Are there efforts to allow something like the following:
>
>
> Interpretation of size
>
> E.g. in binary formats you often have things like the following:
>
> ---------------------------------------------
> | header | size of next block | block | ... |
> ---------------------------------------------
>
> If I got everything correct I could handle this by reading the size in a
> size rule, storing it in a variable and pass/use it in a block rule. I think
> it's not very elegant, but should work.
>
>
> Byte alignment
>
> Often you have some sort of byte alignment in binary files. E.g. in a four
> byte alignment you end up with 0 to 3 empty bytes. I think it would also be
> possible to do this using a variable and then calling a rule from within an
> action -- but I find this also not very elegant.
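For comparison, a hand-written reader deals with the size field and the alignment padding in a few lines. The following is only a sketch in Python, not ANTLR. The layout is hypothetical: a 4-byte header, a big-endian 32-bit size, the block itself, then zero padding up to the next 4-byte boundary. The names read_blocks, read_exact and padding_for are made up for illustration.

```python
import io
import struct

ALIGNMENT = 4  # assumed alignment in bytes (hypothetical layout)


def padding_for(length: int, alignment: int = ALIGNMENT) -> int:
    """Number of pad bytes needed to round length up to the next boundary."""
    return -length % alignment


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes or fail loudly on a truncated file."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError(f"wanted {count} bytes, got {len(data)}")
    return data


def read_blocks(stream):
    """Yield (header, block) pairs from the hypothetical layout."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) != 4:
            raise EOFError("truncated header")
        # The size read here decides how many bytes the next read consumes.
        (size,) = struct.unpack(">I", read_exact(stream, 4))
        block = read_exact(stream, size)
        # Header and size field take 8 bytes, so only the block length
        # determines how much padding follows.
        read_exact(stream, padding_for(size))
        yield header, block
```

With this layout a 5-byte block is followed by 3 pad bytes, and a block whose length is already a multiple of 4 is followed by none.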
>
>
> Ranges for repetitive rule execution
>
> ANTLR already supports executing a rule
> * exactly one time
> * zero or one time
> * zero or unlimited times
> So I think it shouldn't be a problem to say "execute it at least 3 but not
> more than 89 times", e.g. This would also be nice, because binary formats
> often have especially upper limits in lists.
>
>
> Specifying Hexadecimal values in rules
>
> If I got everything correctly, in current ANTLR versions it's not possible
> to specify hexadecimal (or octal or ...) in rules. Because binary files most
> of the time do not use UTF or ASCII but hexadecimal values etc. for
> specifying magic numbers etc. this would be quite nice.
>
>
> Bit handling
>
> In binary files you often have to extract bits or bit ranges.
>
>
> Perhaps I just didn't find or understand something correctly and some
> things mentioned above are already possible -- then just point me to the
> place where to look at.
>
>
> Bye,
> Andi
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
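To give a rough idea of the "layout description plus driver" approach, here is a small Python sketch. It is not the code referred to above, and it uses a plain Python list in place of a parsed layout grammar. The record format is invented for illustration: only the magic value 0xCAFEBABE is borrowed from Java .class files, while the flags byte, the count limits of 3 to 89, and the 16-bit items are assumptions. LAYOUT, bits and read_record are hypothetical names.

```python
import io
import struct

# Hypothetical layout description: (field name, struct format, optional check).
# A real tool would build this table by parsing a layout grammar.
LAYOUT = [
    ("magic", ">I", lambda v: v == 0xCAFEBABE),  # magic number given in hex
    ("flags", ">B", None),                       # packed bit fields
    ("count", ">B", lambda v: 3 <= v <= 89),     # bounded repetition count
]


def bits(value: int, low: int, width: int) -> int:
    """Extract width bits of value, starting at bit low (bit 0 is the LSB)."""
    return (value >> low) & ((1 << width) - 1)


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes or fail on a truncated file."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError(f"wanted {count} bytes, got {len(data)}")
    return data


def read_record(stream, layout=LAYOUT):
    """Driver: walk the layout table, then read 'count' 16-bit items."""
    record = {}
    for name, fmt, check in layout:
        raw = read_exact(stream, struct.calcsize(fmt))
        (value,) = struct.unpack(fmt, raw)
        if check is not None and not check(value):
            raise ValueError(f"field {name!r} has unexpected value {value:#x}")
        record[name] = value
    n = record["count"]
    # The repetition count comes from the data, not from the layout table.
    record["items"] = list(struct.unpack(f">{n}H", read_exact(stream, 2 * n)))
    return record
```

The driver stays the same for every file format; only the table changes. Bit ranges are pulled out after reading, for example bits(record["flags"], 4, 4) for the upper nibble of the flags byte.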
