Hi there,

resuming work on PDFBOX-1000, I came across the question of how to maintain some state within the base components PDFLexer and SimpleParser (the latter is yet to come).

E.g. in order to differentiate a number from an indirect object, I potentially have to read three tokens, {num} {gen} obj, to check whether {num} is an individual number or the start of an indirect object. There are two ways to recover if I've read too many tokens and {num} was in fact an individual number:

a) depend on the file position, e.g. filePointer and seek
b) maintain some internal state

I currently tend to go for b), as this would remove the dependency on filePointer() and seek() or similar methods. But it means that if parsing has to start from a new point within the file, object etc., there needs to be some reset() call to reset the state. Also the caller, e.g. ConformingParser, has to make sure that there is some way to reposition the cursor. On the other hand, not being dependent on a specific position would enable the PDFLexer and SimpleParser to be extended to work on byte[] and similar.

WDYT?

Kind regards
Maruan Sahyoun
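
P.S. To make option b) a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration under simplifying assumptions: tokens are plain strings, an Iterator stands in for the real lexer, and all names (BufferedTokenSource, unread, readNumberOrObjectHeader) are made up and are not existing PDFBox API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical sketch: a token source that recovers from over-reading
 *  via an internal pushback buffer instead of filePointer()/seek(). */
class BufferedTokenSource {
    private final Iterator<String> tokens;               // stand-in for the real lexer
    private final Deque<String> pushedBack = new ArrayDeque<>();

    BufferedTokenSource(Iterator<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns the next token, or null at end of input. */
    String next() {
        if (!pushedBack.isEmpty()) {
            return pushedBack.pop();
        }
        return tokens.hasNext() ? tokens.next() : null;
    }

    /** Puts a token back; the most recently unread token is returned first. */
    void unread(String token) {
        if (token != null) {
            pushedBack.push(token);
        }
    }

    /** Drops the buffered state; to be called when parsing restarts elsewhere. */
    void reset() {
        pushedBack.clear();
    }

    private static boolean isInteger(String s) {
        return s != null && s.matches("\\d+");
    }

    /** Reads "{num} {gen} obj" if present, otherwise a single number. */
    String readNumberOrObjectHeader() {
        String num = next();
        if (num == null) {
            return null;
        }
        String gen = next();
        String obj = next();
        if (isInteger(num) && isInteger(gen) && "obj".equals(obj)) {
            return "indirect object " + num + " " + gen;
        }
        // Read too far: push back in reverse order so gen is returned before obj.
        unread(obj);
        unread(gen);
        return "number " + num;
    }
}
```

The point is that recovery never touches the underlying position, so the same code could sit on top of a file, a stream or a byte[]. The price is the reset() call: whoever repositions the input has to clear the buffer as well.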