Hello,

I'm trying to extract text from a PDF (http://www.oca.state.pa.us/Industry/Electric/elecomp/wpp.pdf), but I'm having trouble with the way the document is formatted. With the default settings (SortByPosition false), the last column is not read along with the rest of its line. I'm having more luck with SortByPosition set to true, but that garbles some of the text (see below).
Is there a way to tweak the settings to fix the text when SortByPosition is true? Otherwise, is there a way to troubleshoot this further?

Thanks so much for any advice!

Michael

For example, on page 4:

With SortByPosition true:

TriEWaegslte PEennenr gPyower
1-87P7r-i9c3e EtoA GCLomE p(9a3r3e -2453)
www.trieagletehnrerogyu.cgohm
FixedA purigcue:s t 6 3 m1o, n2t0h1 t3erm 7.29 ¢ $36.45 $72.90 $145.80
$20 per month
for each month
remaining in the
contract term

With SortByPosition false:

TriEagle Energy
1-877-93EAGLE (933-2453)
www.trieagleenergy.com
Fixed price: 6 month term 7.29 ¢ $36.45 $72.90 $145.80
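In case it helps with troubleshooting, here is a small sketch in plain Python (not part of the extraction library) that checks whether a garbled line contains the correctly extracted line as an in-order subsequence and returns the characters left over. The function name `leftover_after_subsequence` is made up for illustration. On the first garbled line, the leftover characters read like a second piece of text, which might mean that two strings sharing the same position on the page are being interleaved, though that is only a guess on my part.

```python
def leftover_after_subsequence(garbled, clean):
    """Greedily match `clean` inside `garbled`, preserving order.

    Returns the unmatched characters of `garbled` as a string,
    or None if `clean` does not occur as a subsequence at all.
    """
    i = 0            # index of the next character of `clean` to match
    leftover = []    # characters of `garbled` that did not match
    for ch in garbled:
        if i < len(clean) and ch == clean[i]:
            i += 1
        else:
            leftover.append(ch)
    if i < len(clean):
        return None  # ran out of garbled text before matching all of `clean`
    return "".join(leftover)


# Lines copied from the page 4 output of the two runs.
garbled = "TriEWaegslte PEennenr gPyower"   # SortByPosition true
clean = "TriEagle Energy"                   # SortByPosition false

print(repr(leftover_after_subsequence(garbled, clean)))
```

The match is greedy, so whitespace may be attributed to either string; the leftover is only a hint about what else is drawn at that spot, not an exact reconstruction.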

