[tesseract-ocr] Training tesseract on non-letter symbols

Piotr Gryta Tue, 10 May 2016 00:13:51 -0700

Hi everyone,
I am developing a Java application to vectorize a raster image. One of the
steps is symbol recognition, and I was hoping to train Tesseract to find the
symbols and return their pixel coordinates.
My questions are:
1) Is it possible to make a dictionary of symbols, so that letters from the
English dictionary are not detected?
2) What steps should I perform?
I managed to make a box file for my training image, but later I get an
"Empty page!" error.
I would be glad for any suggestions,
Piotrek


Here is a sample image of tree symbols that I would like to train on to check
whether it works:
https://gyazo.com/85a1db80f92f2df44625875bcf20d37d
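
For context on the box file and pixel coordinates mentioned above: in the
Tesseract 3.x box file format each line describes one symbol as
"symbol left bottom right top page", with coordinates measured from the
bottom-left corner of the image. Below is a minimal Java sketch, assuming that
layout, of reading one such line and converting it to top-left pixel
coordinates as used by most Java imaging code. The class and method names
(BoxLine, parse, toTopLeftRect) are hypothetical and not part of Tesseract.

```java
// Hypothetical helper, not part of Tesseract: holds one line of a box file.
final class BoxLine {
    final String symbol;
    final int left, bottom, right, top, page;

    BoxLine(String symbol, int left, int bottom, int right, int top, int page) {
        this.symbol = symbol;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
        this.page = page;
    }

    // Parses a line of the assumed form "symbol left bottom right top page".
    static BoxLine parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 6) {
            throw new IllegalArgumentException("Expected 6 fields, got " + parts.length);
        }
        return new BoxLine(
                parts[0],
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]),
                Integer.parseInt(parts[3]),
                Integer.parseInt(parts[4]),
                Integer.parseInt(parts[5]));
    }

    // Converts the bottom-left-origin box to {x, y, width, height}
    // with the origin at the top-left corner of the image.
    int[] toTopLeftRect(int imageHeight) {
        int x = left;
        int y = imageHeight - top;
        int width = right - left;
        int height = top - bottom;
        return new int[] {x, y, width, height};
    }
}
```

For example, on an image 100 pixels high, the line "# 10 20 30 50 0" would map
to x=10, y=50, width=20, height=30.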

