Sven, I apologize for my delayed response. I just saw your post. Thank you for your response. As I said in my post to Andrew, I am still working on this issue.
I investigated the PSM modes before posting my question here on the forum and found this page useful for describing the PSM options:
https://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html

Have you used any of these options to produce delimited output (CSV or other)?

Cheers,
Maureen

On Wednesday, October 8, 2014 6:16:02 AM UTC-6, sventech wrote:
> You should look at the different tesseract page segmentation (PSM) modes.
> The data you have is in a table and you'll need to process it differently.
> hOCR format is HTML, so it will not work as CSV format, though it does
> supply accuracy info, so if you want to evaluate that and produce CSV you
> can.
> --Sven
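For what it's worth, Sven's suggestion of going from hOCR to CSV could be sketched roughly as below. This is only a minimal sketch, not something from the tesseract docs: it assumes the hOCR output marks each word as a span with class `ocrx_word` whose `title` attribute carries `bbox` and, when available, `x_wconf` (the word confidence). The names `HocrWords` and `hocr_to_csv` and the sample markup are made up for illustration, and it writes one row per word rather than rebuilding the table columns.

```python
import csv
import io
from html.parser import HTMLParser


class HocrWords(HTMLParser):
    """Collect one row per ocrx_word span: text, confidence, bbox."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._props = None  # title properties of the word span being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._props = {}
            self._text = []
            # title looks like "bbox 10 10 60 30; x_wconf 91" (assumed layout)
            for prop in (attrs.get("title") or "").split(";"):
                parts = prop.split()
                if parts:
                    self._props[parts[0]] = parts[1:]

    def handle_data(self, data):
        if self._props is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._props is not None:
            conf = (self._props.get("x_wconf") or [""])[0]
            bbox = (self._props.get("bbox", []) + [""] * 4)[:4]
            self.rows.append(["".join(self._text).strip(), conf] + bbox)
            self._props = None


def hocr_to_csv(hocr_text):
    """Return CSV text with one row per recognized word."""
    parser = HocrWords()
    parser.feed(hocr_text)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["text", "confidence", "x0", "y0", "x1", "y1"])
    writer.writerows(parser.rows)
    return out.getvalue()


# Hypothetical hOCR fragment, not real tesseract output.
SAMPLE = (
    "<div class='ocr_page'>"
    "<span class='ocr_line' title='bbox 10 10 200 30'>"
    "<span class='ocrx_word' title='bbox 10 10 60 30; x_wconf 91'>Total</span> "
    "<span class='ocrx_word' title='bbox 70 10 120 30; x_wconf 84'>42</span>"
    "</span></div>"
)

if __name__ == "__main__":
    print(hocr_to_csv(SAMPLE))
```

Grouping the words back into table columns would still need the bounding boxes (for example, bucketing by `x0`), which is where the choice of PSM mode is likely to matter.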

