You expect way too much from Tesseract: it's not Tesseract's job to slice and dice the text according to the organizational requirements of each application - that's for the application to handle. You can get the coordinates of every character and easily determine which ones are in what you consider the first column and which are in the second column. In ScanBizCards' case, given our target material, we treat each line as a single number formed of two sequences - but if we wanted to treat the input as columns, it would take us a mere 20 minutes of coding to organize the results that way. We actually don't even pay attention to where Tesseract thinks lines start and end; we figure that out ourselves based on coordinates. It's not hard.
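To give a rough idea of what I mean, here is a minimal sketch in Python. It is not our actual ScanBizCards code, and every name in it (Box, split_columns, group_lines, columns_as_text, split_x, tolerance) is made up for illustration. It assumes you have already pulled word or character bounding boxes out of Tesseract and that you know roughly where the gap between the two columns is.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """One recognized word or character with its bounding box (hypothetical format)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def split_columns(boxes, split_x):
    """Put each box in column 0 or 1 depending on which side of split_x its center is."""
    columns = ([], [])
    for b in boxes:
        center_x = b.left + b.width / 2
        columns[0 if center_x < split_x else 1].append(b)
    return columns


def group_lines(boxes, tolerance=0.5):
    """Group boxes into text lines using only their vertical centers.

    A box joins the current line when its center is within
    tolerance * height of the center of the box that started the line.
    """
    lines = []
    for b in sorted(boxes, key=lambda b: b.top + b.height / 2):
        center_y = b.top + b.height / 2
        if lines and abs(center_y - lines[-1]["center_y"]) <= tolerance * b.height:
            lines[-1]["boxes"].append(b)
        else:
            lines.append({"center_y": center_y, "boxes": [b]})
    # Within each line, read the boxes left to right.
    return [
        " ".join(b.text for b in sorted(line["boxes"], key=lambda b: b.left))
        for line in lines
    ]


def columns_as_text(boxes, split_x):
    """Return one list of text lines per column."""
    return [group_lines(column) for column in split_columns(boxes, split_x)]


# Made-up sample data: two rows, two columns, with slightly uneven baselines.
sample = [
    Box("555", 10, 10, 30, 12), Box("1234", 45, 11, 40, 12),
    Box("555", 210, 10, 30, 12), Box("9876", 245, 9, 40, 12),
    Box("555", 10, 30, 30, 12), Box("2222", 45, 31, 40, 12),
    Box("555", 210, 30, 30, 12), Box("3333", 245, 30, 40, 12),
]
result = columns_as_text(sample, split_x=150)
for index, column in enumerate(result):
    print("column", index, column)
```

A fixed split_x is the simplest possible choice; if your pages vary, you would probably want to find the gap yourself, for example by looking for the widest horizontal stretch with no boxes in it.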
Patrick

On Mar 13, 4:10 pm, Jose <[email protected]> wrote:
> Hi Patrick,
>
> yes the results are correct! but the format of the results it is not! that's
> my trouble

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.

