Hi,

there are currently a number of different options to use as a base for a potential new parser/lexer. The ones currently in use are:

BaseParser:
    import org.apache.pdfbox.io.PushBackInputStream;
    import org.apache.pdfbox.io.RandomAccess;

PDFParser (additional):
    import org.apache.pdfbox.io.RandomAccess;

NonSequentialParser:
    import org.apache.pdfbox.io.PushBackInputStream;
    import org.apache.pdfbox.io.RandomAccess;
    import org.apache.pdfbox.io.RandomAccessBuffer;
    import org.apache.pdfbox.io.RandomAccessBufferedFileInputStream;

There are some additional classes/interfaces in the io package, e.g. RandomAccessBufferedFileInputStream, which implements RandomAccessRead.

Any preferences or ideas for consolidating this?

Currently I’m using RandomAccessBufferedFileInputStream, together with some additional implementations of RandomAccessRead that support reading from a byte array for testing purposes.

BR
Maruan Sahyoun
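
P.S. To make the byte-array idea concrete, here is a rough sketch of what such a test helper could look like. It deliberately does not use the real RandomAccessRead interface: SeekableSource and ByteArraySource are hypothetical names made up for this example, and the method set is only an assumption about what a parser would minimally need (read, seek, position, length).

```java
import java.io.IOException;

/**
 * Hypothetical minimal read-only random-access interface.
 * This is an illustration only, not the PDFBox RandomAccessRead API.
 */
interface SeekableSource {
    /** Returns the next byte as 0..255, or -1 at end of data. */
    int read() throws IOException;

    /** Reads up to len bytes into b starting at off; returns the count or -1 at end of data. */
    int read(byte[] b, int off, int len) throws IOException;

    /** Current offset from the start of the data. */
    long getPosition() throws IOException;

    /** Moves the read position to an absolute offset. */
    void seek(long position) throws IOException;

    /** Total number of bytes available. */
    long length() throws IOException;
}

/** In-memory implementation, intended for unit tests (hypothetical helper). */
final class ByteArraySource implements SeekableSource {
    private final byte[] data;
    private int pos;

    ByteArraySource(byte[] data) {
        // Copy so that later changes to the caller's array do not affect the source.
        this.data = data.clone();
    }

    @Override
    public int read() {
        // Mask to return an unsigned value, matching InputStream.read() conventions.
        return pos < data.length ? data[pos++] & 0xFF : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (pos >= data.length) {
            return -1;
        }
        int n = Math.min(len, data.length - pos);
        System.arraycopy(data, pos, b, off, n);
        pos += n;
        return n;
    }

    @Override
    public long getPosition() {
        return pos;
    }

    @Override
    public void seek(long position) {
        // Seeking to data.length is allowed; the next read then reports end of data.
        if (position < 0 || position > data.length) {
            throw new IllegalArgumentException("Position out of range: " + position);
        }
        pos = (int) position;
    }

    @Override
    public long length() {
        return data.length;
    }
}
```

With something along these lines, a parser written against a single interface could be tested on small in-memory fragments and run against a file-backed implementation in production, which is one possible direction for the consolidation.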
