I want to determine where the lines are in the document. As I understand it, the pieces are connected like this:

PDFStreamEngine --extended by--> PageDrawer --loads--> PageDrawer.properties --defines--> operator processing

The properties files define mappings between PDF operators and classes that extend OperatorProcessor and implement the method public void process(PDFOperator operator, List<COSBase> arguments). Those methods somehow all refer to the same PageDrawer during one run. I don't really understand how all of that fits together.

What I was planning to do is extract the positions of line start and end points by writing a new class that extends OperatorProcessor.

Can somebody explain how the pdfbox library really works: how the parts fit together and how it is used? I am looking for something of a bird's-eye view of the various parts of pdfbox and the connections between them.

Thank you,
Ilija
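
P.S. To make my current mental model concrete, here is a minimal, self-contained sketch of the dispatch pattern as I picture it. It deliberately does not use any pdfbox classes: MiniEngine and MiniProcessor are hypothetical stand-ins for PDFStreamEngine/PageDrawer and OperatorProcessor, and the hard-coded operator table stands in for what I assume the properties file provides. It only handles the "m" (move to) and "l" (line to) path operators and ignores everything else, including any coordinate transformations, so it may well be wrong about how the real library behaves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for OperatorProcessor: one handler per operator name.
interface MiniProcessor {
    void process(MiniEngine engine, String operator, List<Double> args);
}

// Hypothetical stand-in for PDFStreamEngine/PageDrawer: it owns the
// operator table and the state that all handlers share during one run.
class MiniEngine {
    private final Map<String, MiniProcessor> processors = new HashMap<>();

    double curX, curY;                                   // current point
    final List<double[]> segments = new ArrayList<>();   // {x1, y1, x2, y2}

    MiniEngine() {
        // Assumption: in pdfbox this mapping is read from a properties file.
        // "m" starts a new subpath at the given point.
        processors.put("m", (e, op, a) -> {
            e.curX = a.get(0);
            e.curY = a.get(1);
        });
        // "l" records a segment from the current point to the given point.
        processors.put("l", (e, op, a) -> {
            e.segments.add(new double[] { e.curX, e.curY, a.get(0), a.get(1) });
            e.curX = a.get(0);
            e.curY = a.get(1);
        });
    }

    // Look up the handler for an operator and hand it the shared engine.
    // Operators without a registered handler are silently skipped.
    void processOperator(String operator, List<Double> args) {
        MiniProcessor p = processors.get(operator);
        if (p != null) {
            p.process(this, operator, args);
        }
    }
}
```

Feeding this engine "m" with (10, 20) followed by "l" with (30, 40) leaves one entry in segments holding the start and end point of that line, which is the kind of output I am after. What I cannot tell is whether the real classes are meant to be extended in this way.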

