Re: Customising Tesseract for character recognition

patrickq Sun, 13 Mar 2011 13:18:25 -0700

You expect way too much from Tesseract: it's not Tesseract's job to
slice and dice the text according to the various organizational
requirements of applications - that's for the application to handle.
You can get the coordinates of all characters and easily determine
which ones are in what you consider the first column and which are in
the 2nd column. In ScanBizCards' case, considering our target material,
we treat each line as a single number formed of two sequences - but if
we wanted to treat the input as columns, it would take us a mere 20
minutes of coding to organize the results that way. We actually don't
even pay attention to where Tesseract thinks lines start and end; we
figure that out ourselves based on coordinates. It's not hard.
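
To give a rough idea of what I mean, here is a minimal sketch in Python.
It is not our actual code, and it makes no Tesseract calls: it assumes you
have already collected one bounding box per recognized character. The
CharBox tuple, the function names, the fixed column boundary and the line
tolerance are all made up for illustration, and the coordinates are
assumed to be pixels with the origin at the top-left.

```python
from collections import namedtuple

# Hypothetical container for one recognized character and its bounding box.
# Assumption: pixel coordinates with the origin at the top-left of the image.
CharBox = namedtuple("CharBox", "char left top right bottom")


def group_into_lines(boxes, line_tolerance):
    """Cluster boxes into text lines by vertical centre, top to bottom."""
    lines = []
    for box in sorted(boxes, key=lambda b: (b.top + b.bottom) / 2):
        centre = (box.top + box.bottom) / 2
        # A box joins the current line if its centre is close to the centre
        # of the box that started that line; otherwise it starts a new line.
        if lines and abs(centre - lines[-1]["centre"]) <= line_tolerance:
            lines[-1]["boxes"].append(box)
        else:
            lines.append({"centre": centre, "boxes": [box]})
    # Within each line, order the characters left to right.
    return [sorted(line["boxes"], key=lambda b: b.left) for line in lines]


def split_into_columns(line, column_boundary_x):
    """Split one line at a fixed x position into (first, second) column text."""
    first = "".join(
        b.char for b in line if (b.left + b.right) / 2 < column_boundary_x
    )
    second = "".join(
        b.char for b in line if (b.left + b.right) / 2 >= column_boundary_x
    )
    return first, second


def boxes_to_table(boxes, column_boundary_x, line_tolerance=5):
    """Return one (first_column, second_column) pair per detected line."""
    return [
        split_into_columns(line, column_boundary_x)
        for line in group_into_lines(boxes, line_tolerance)
    ]
```

You would call boxes_to_table(boxes, column_boundary_x=...) with a
boundary that suits your layout. A fixed boundary is the simplest choice;
if your columns move around from page to page, you could look for the
widest horizontal gap in each line instead.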


Patrick

On Mar 13, 4:10 pm, Jose <[email protected]> wrote:
> Hi Patrick,
>
> yes, the results are correct, but the format of the results is not! That's
> my trouble.

