Thanks a ton! As it turns out, that was the problem. The weird thing is that I wasn't using any additional parameters; for some reason the strange page formatting was the default. In AutoIt I am now using this to launch Tesseract:

ShellExecuteWait(@ProgramFilesDir & "\Tesseract-OCR\tesseract.exe", '"' & $in_image & '" "' & $out_file & '" -l eng -psm 6')

In case anyone needs it. Thanks again!
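For anyone scripting this outside AutoIt, here is a rough Python sketch of the same call. The helper names (build_tesseract_args, run_tesseract) and the file names in the usage note are made up for illustration, and it assumes a Tesseract 3.x build that accepts the single-dash -psm option. Newer releases spell it --psm, so adjust to whatever your build prints when run without arguments.

```python
import subprocess


def build_tesseract_args(exe, in_image, out_base, lang="eng", psm=6):
    """Build the argument list: tesseract <image> <outputbase> -l <lang> -psm <mode>.

    Hypothetical helper. Each option and its value is a separate list item,
    so no manual quoting is needed even if the paths contain spaces.
    """
    # Assumption: Tesseract 3.x syntax ("-psm"); newer versions use "--psm".
    return [str(exe), str(in_image), str(out_base), "-l", lang, "-psm", str(psm)]


def run_tesseract(exe, in_image, out_base, lang="eng", psm=6):
    """Run Tesseract and wait for it to finish; raises if it exits non-zero."""
    args = build_tesseract_args(exe, in_image, out_base, lang, psm)
    return subprocess.run(args, check=True)
```

Something like run_tesseract(r"C:\Program Files\Tesseract-OCR\tesseract.exe", "scan.png", "scan_out") should then write the text to scan_out.txt, since Tesseract appends the extension to the output base itself.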
On Saturday, February 23, 2013 1:28:46 PM UTC-8, Nick White wrote:
> Hi Ben,
>
> On Fri, Feb 22, 2013 at 01:58:53PM -0800, Ben Richard wrote:
> > The problem is the output file seems to see columns as line carriages.
> > I used a version from 3 years ago and it worked fine. It seems like this
> > is an intentional formatting change, does anyone know how to keep a line
> > a line. Effectively if you are in the same y plane turn a large gap into
> > a [space] not a [line carriage]
>
> I'd guess this is due to Tesseract's page segmentation. Run it without
> any arguments to see the different segmentation options; it sounds like
> you want 4 or 6. That output tells you the appropriate syntax for
> setting the segmentation mode.
>
> Nick

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

