Hello,

I'm using Tesseract 3 as a simple command-line tool to generate OCR text.
It's doing a fairly good job, but I have one unmet need -- I need to
be able to separate paragraphs with blank lines.  It would be great if
Tesseract could do this for me, but I'd even be happy if it could
include indentation whitespace in the text so I could perform the
splitting using my own software.
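
To make that second option concrete, here is a rough sketch of the kind
of post-processing I have in mind. It assumes the OCR text keeps leading
whitespace on the first line of each paragraph, which is exactly what I
can't get today; the function name split_paragraphs is just something I
made up for illustration, not anything from Tesseract.

```python
def split_paragraphs(text):
    """Insert a blank line before each indented line (hypothetical helper).

    Assumption: an indented, non-empty line marks the start of a paragraph.
    """
    out = []
    for line in text.splitlines():
        # Indented and not whitespace-only: treat as a paragraph start.
        starts_paragraph = line[:1] in (" ", "\t") and bool(line.strip())
        # Avoid a leading blank line and avoid doubling existing blanks.
        if starts_paragraph and out and out[-1] != "":
            out.append("")
        # Drop the indentation once it has served as a paragraph marker.
        out.append(line.lstrip() if starts_paragraph else line)
    return "\n".join(out)
```

Centered headings or indented block quotes would probably confuse this,
so it is only a starting point.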

Is there any way to achieve this effect?  On a somewhat related note,
is there any way to control Tesseract's command line behavior at all?
I see that it accepts a config file as a command-line option, but I'm
having no luck finding documentation on what options are available or
what they mean -- the provided examples don't actually seem to work,
and even searching the code hasn't given me anything resembling a list
of valid options.

Any help or pointers in the right direction would be greatly
appreciated!

Thanks,
Demian

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
