Text sequence of ExtractText utility

Robert Rodini Fri, 19 May 2023 06:17:43 -0700

Hi,
I have successfully used the PDFBox ExtractText utility to process PDFs produced 
by a third party.  The text of a multicolumn PDF comes out column by column, 
from left to right, with each column read from top to bottom.
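
To make the order I expect concrete, here is a minimal sketch in plain Java. 
It does not use PDFBox at all: the Fragment class, the readingOrder method, and 
the fixed columnWidth are made up for illustration, and a real PDF would need a 
more careful way of detecting columns.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ColumnOrder {

    // Hypothetical stand-in for a positioned piece of text; not a PDFBox type.
    static final class Fragment {
        final String text;
        final double x; // left edge, increasing to the right
        final double y; // distance from the top of the page, increasing downward

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Orders fragments column by column (left to right), and top to bottom
    // within each column. columnWidth is an assumed fixed column width that
    // is used only to assign each fragment a column index.
    static List<String> readingOrder(List<Fragment> fragments, double columnWidth) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator
                .comparingInt((Fragment f) -> (int) Math.floor(f.x / columnWidth))
                .thenComparingDouble(f -> f.y));
        List<String> result = new ArrayList<>();
        for (Fragment f : sorted) {
            result.add(f.text);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two columns, each 300 units wide, listed in a scrambled order.
        List<Fragment> page = List.of(
                new Fragment("right-top", 320, 50),
                new Fragment("left-bottom", 40, 400),
                new Fragment("left-top", 40, 50),
                new Fragment("right-bottom", 320, 400));
        System.out.println(readingOrder(page, 300));
    }
}
```

Running main prints [left-top, left-bottom, right-top, right-bottom], which is 
the kind of sequence I get from the first set of PDFs.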


I now have to process PDFs produced by another third party, which also produces 
multicolumn PDFs.  This time the text comes out in an unpredictable order.

I've read the FAQ https://pdfbox.apache.org/2.0/faq.html regarding "Why does 
the extracted text appear in the wrong sequence?"

I'd like to know if there is a command-line switch (or some other option) that 
I can use to get the text extracted in the right order.  If not, can I request 
that a CLI switch be added to the ExtractText utility, and how would I do that?

Thanks,
Bob Rodini
