[tesseract-ocr] Re: Reading Device labels to get model number

2014-11-13 Thread Allistair C
I think the table lines are not helping. I upsized your image to 1000px wide, ran it through Tesseract with PSM=6, and got mostly rubbish. Then I removed the table lines manually in Photoshop, upsized the image to 1000px wide again, and ran it through Tesseract with PSM=6: RFZBHMEDBSR R 134a/
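
For anyone wanting to script the resize-then-OCR step described above, here is a rough sketch. The helper names (upscaled_size, tesseract_command, run_ocr) are made up for illustration and are not from the original message. It assumes the tesseract binary is on the PATH and that the image has already been resized with whatever tool you prefer. The --psm spelling is the one used by Tesseract 4 and later; 3.x releases used -psm.

```python
import subprocess


def upscaled_size(width, height, target_width=1000):
    """Return (w, h) scaled so that w == target_width, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = target_width / width
    return target_width, max(1, round(height * scale))


def tesseract_command(image_path, psm=6):
    """Build a Tesseract CLI call that prints the recognized text to stdout.

    PSM 6 tells Tesseract to assume a single uniform block of text.
    """
    return ["tesseract", image_path, "stdout", "--psm", str(psm)]


def run_ocr(image_path, psm=6):
    """Run Tesseract on an already resized image and return its text output."""
    result = subprocess.run(
        tesseract_command(image_path, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A 500x200 label would be resized to 1000x400 before calling run_ocr("label.png"). Removing the table lines still has to happen before this step, by hand or otherwise.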

[tesseract-ocr] Re: Reading Device labels to get model number

2014-11-13 Thread Allistair C
Do you have higher-resolution images to work with? That's one issue here: the edges of your text are very fuzzy, and at that resolution it's pretty hard for Tesseract. You can also play with thresholding and opening (erosion/dilation) to thicken some of your lines up (using e.g.
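
To make the thresholding and erosion/dilation idea concrete, here is a minimal pure-Python sketch on a grayscale image stored as nested lists. It is only an illustration: the function names, the cutoff of 128, and the 3x3 neighbourhood are assumptions, and a real pipeline would normally use an image library instead. Dark pixels are treated as ink (1). Dilating the ink mask thickens strokes, while opening (erosion followed by dilation) removes small specks.

```python
def threshold(gray, cutoff=128):
    """Binarize: pixels darker than cutoff become ink (1), the rest background (0)."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


def _morph(mask, combine):
    """Apply combine (min or max) over each pixel's 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border.
    """
    h, w = len(mask), len(mask[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                mask[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(combine(window))
        out.append(row)
    return out


def dilate(mask):
    """Grow ink regions by one pixel in every direction (thickens strokes)."""
    return _morph(mask, max)


def erode(mask):
    """Shrink ink regions by one pixel in every direction (thins strokes)."""
    return _morph(mask, min)


def opening(mask):
    """Erosion followed by dilation: drops specks smaller than the 3x3 window."""
    return dilate(erode(mask))
```

For example, a one-pixel-wide vertical stroke becomes three pixels wide after dilate(threshold(image)), whereas a single isolated dark pixel disappears under opening.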

[tesseract-ocr] Re: Reading Device labels to get model number

2014-11-13 Thread shree
Also take a look at the pre-processing method described at https://github.com/tleyden/open-ocr/wiki/Stroke-Width-Transform-In-Action

On Thursday, November 13, 2014 3:30:03 AM UTC+5:30, Bill Garrison wrote:
So if someone sends in labels like the attached ones, I need to grab the model number.