Thanks a ton!  As it turns out, that was the problem. The weird thing is that I 
wasn't using any additional parameters; for some reason the strange page 
formatting was the default.  In AutoIt I am now using this to launch Tesseract:
 
 ; -l eng selects English; -psm 6 treats the page as a single uniform block of text
 ShellExecuteWait(@ProgramFilesDir & "\Tesseract-OCR\tesseract.exe", _
     '"' & $in_image & '" "' & $out_file & '" -l eng -psm 6')
In case anyone needs it.
 
Thanks again!
 
 

On Saturday, February 23, 2013 1:28:46 PM UTC-8, Nick White wrote:

> Hi Ben, 
>
> On Fri, Feb 22, 2013 at 01:58:53PM -0800, Ben Richard wrote: 
> > The problem is the output file seems to see columns as line carriages.
> > I used a version from 3 years ago and it worked fine.  It seems like
> > this is an intentional formatting change, does anyone know how to keep
> > a line a line.  Effectively if you are in the same y plane turn a large
> > gap into a [space] not a [line carriage] 
>
> I'd guess this is due to Tesseract's page segmentation. Run it 
> without any arguments to see the different segmentation options; it 
> sounds like you want 4 or 6. That output tells you the appropriate 
> syntax for setting the segmentation mode. 
>
> Nick 
>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
