Thanks for your reply.

I can't really use predefined patterns, since the code pattern and font can 
change over time.
I like the idea of segmenting the characters myself and passing them to 
Tesseract one by one, but it looks time-consuming (to code, I mean).
Is there any other suitable method? In particular for the 3rd 
issue, which I think should be easy to solve.

On Wednesday, May 20, 2015 at 12:29:08 PM UTC+2, Dmitri Silaev wrote:
>
> One no-brainer method to try out would be turning off all dictionaries and 
> using your own custom "user-patterns" file. Since you mentioned "your 
> application" I suppose you can program. So you can take a look at the 
> comment preceding the read_pattern_list() declaration in "dict/trie.h" for 
> more details.
>
> It seems all your strings are of the same format:
> \A\A\d\d\d\d\d\d\d\d\d\d
> (Tess understands very limited pattern syntax).
>
> But if accuracy is critical in your app, in the long run I would 
> absolutely avoid using any parts of Tesseract except the char classifier. 
> I.e. crop every single char out of your source image and run Tess in the 
> single char PSM. I think it should be easy as long as the location of every 
> character is quite stable among your source images. ImageMagick/shell 
> scripts would suffice.
>
> Best regards,
> Dmitri Silaev
> www.CustomOCR.com
>
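
A minimal sketch of the per-character approach quoted above, assuming the 12 characters sit on one line at a fixed pitch. The function names, the file name code.png and all pixel values are hypothetical placeholders, not measurements from any real image. The script only prints the ImageMagick and Tesseract commands; it does not run them. Page segmentation mode 10 is Tesseract's single-character mode, spelled "-psm 10" in the 3.x releases and "--psm 10" in later ones.

```python
import shlex


def char_boxes(x0, y0, char_w, char_h, n_chars, pitch=None):
    """Return (width, height, x, y) crop boxes for n_chars evenly spaced glyphs.

    pitch is the horizontal distance between the left edges of neighbouring
    characters; it defaults to char_w (no gap between characters).
    """
    pitch = char_w if pitch is None else pitch
    return [(char_w, char_h, x0 + i * pitch, y0) for i in range(n_chars)]


def crop_command(src, box, dst):
    """Build an ImageMagick command that crops one character out of src."""
    w, h, x, y = box
    return ["convert", src, "-crop", f"{w}x{h}+{x}+{y}", "+repage", dst]


def ocr_command(img, psm_flag="-psm"):
    """Build a Tesseract command for single-character mode (PSM 10).

    Pass psm_flag="--psm" for Tesseract 4 and later.
    """
    return ["tesseract", img, "stdout", psm_flag, "10"]


if __name__ == "__main__":
    # Hypothetical layout: 12 characters, each 20x30 px, starting at (10, 5).
    boxes = char_boxes(x0=10, y0=5, char_w=20, char_h=30, n_chars=12)
    for i, box in enumerate(boxes):
        dst = f"char_{i:02d}.png"
        print(shlex.join(crop_command("code.png", box, dst)))
        print(shlex.join(ocr_command(dst)))
```

If the font or layout changes, only the arguments to char_boxes() would need updating, which is the main reason to keep the geometry separate from the command construction.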
