Sven,

I apologize for the delayed response; I only just saw your post. Thank you 
for replying. As I said in my post to Andrew, I am still working on this 
issue. 

I looked into the PSM modes before posting my question here on the forum 
and found this page useful for its description of the PSM options:
https://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html

Have you used any of these options to produce delimited output, CSV or 
otherwise? 

Cheers,
Maureen
 

On Wednesday, October 8, 2014 6:16:02 AM UTC-6, sventech wrote:
>
> You should look at the different tesseract page segmentation (PSM) modes. 
> The data you have is in a table and you'll need to process it differently. 
> hOCR format is HTML, so it will not work as CSV format, though it does 
> supply accuracy info, so if you want to evaluate that and produce CSV you 
> can.
> --Sven
>
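
For reference, here is a rough sketch of what Sven's suggestion could look like: read the hOCR output, use the per-word confidence, and write CSV. It assumes the hOCR contains `ocr_line` and `ocrx_word` spans whose `title` attribute carries `bbox` and `x_wconf`, and it guesses column breaks from the horizontal gap between words. The names `HocrWords`, `hocr_to_csv`, `gap`, and `min_conf` are made up for illustration, and the gap threshold would need tuning for a real table.

```python
import csv
import io
import re
from html.parser import HTMLParser


class HocrWords(HTMLParser):
    """Collect (line_index, x0, x1, confidence, text) for each ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.line = -1          # index of the ocr_line currently being read
        self._current = None    # [x0, x1, conf, text] of the open word span

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        classes = (a.get("class") or "").split()
        if "ocr_line" in classes:
            self.line += 1
        elif "ocrx_word" in classes:
            title = a.get("title") or ""
            bbox = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", title)
            conf = re.search(r"x_wconf (\d+)", title)
            if bbox:
                self._current = [int(bbox.group(1)), int(bbox.group(3)),
                                 int(conf.group(1)) if conf else None, ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[3] += data

    def handle_endtag(self, tag):
        # Only a closing </span> ends a word; nested <strong>/<em> are ignored.
        if tag == "span" and self._current is not None:
            x0, x1, conf, text = self._current
            if text.strip():
                self.words.append((self.line, x0, x1, conf, text.strip()))
            self._current = None


def hocr_to_csv(hocr, gap=40, min_conf=0):
    """Return CSV text: one row per ocr_line, one cell per run of nearby words.

    gap      -- assumed pixel distance above which two words are separate cells
    min_conf -- words with x_wconf below this value are dropped
    """
    parser = HocrWords()
    parser.feed(hocr)
    parser.close()

    rows = {}
    for line, x0, x1, conf, text in parser.words:
        if conf is not None and conf < min_conf:
            continue
        rows.setdefault(line, []).append((x0, x1, text))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in sorted(rows):
        cells, prev_x1 = [], None
        for x0, x1, text in sorted(rows[line]):
            if prev_x1 is not None and x0 - prev_x1 <= gap:
                cells[-1] += " " + text   # close to the previous word: same cell
            else:
                cells.append(text)        # large gap: start a new cell
            prev_x1 = x1
        writer.writerow(cells)
    return out.getvalue()
```

Calling `hocr_to_csv(open("page.hocr").read(), gap=40, min_conf=60)` on a file written by tesseract's hocr output would return the CSV text (`page.hocr` is a placeholder file name). Splitting on a fixed gap is only a heuristic; a table with ragged columns would probably need the cell boundaries derived from the x positions across all rows instead.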
