[tesseract-ocr] I tried Tesseract training for handwritten mathematical expression recognition, but the trained data has a 100% error rate

Haris Sheikh Wed, 18 Dec 2019 20:39:57 -0800

Hi, I'm using Linux (Ubuntu).
I tried training Tesseract by following
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 and
used a data set organized like this:
'=' folder -> 26,000 .jpg image files in which = is written in different
forms
'+' folder -> 30,000 .jpg image files in which + is written in different
forms
and so on.


I took all the images from each folder, copied them into the ground-truth
folder, converted them to .tif format, and created their labels in
.gt.txt format.
Then I executed the command: "make training"
It worked fine and took 5-6 hours to train on the data set. After that I
copied the data/foo.traineddata file into the
/usr/local/share/tessdata/ directory and
ran the command: "tesseract --list-langs", which showed that my file is
listed.
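
For reference, the ground-truth preparation described above could be scripted roughly as follows. This is only a minimal sketch, not the exact procedure that was used: it assumes each source folder is named after the symbol it contains ('=', '+', ...), that the images have already been converted to .tif, and that each image needs a label file with the same base name and a .gt.txt extension. The function name build_ground_truth and the sym_ file prefix are made up for illustration.

```python
import shutil
from pathlib import Path


def build_ground_truth(src_root: Path, gt_dir: Path, suffix: str = ".tif") -> int:
    """Copy images from per-symbol folders into gt_dir and write one label each.

    Assumption: every subfolder of src_root is named after the symbol its
    images show (e.g. '=' or '+'), and images already have the given suffix.
    Returns the number of image/label pairs written.
    """
    gt_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for symbol_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        label = symbol_dir.name  # the folder name doubles as the transcription
        for image in sorted(symbol_dir.glob(f"*{suffix}")):
            # Hypothetical naming scheme; it only has to be unique per image.
            stem = f"sym_{count:06d}"
            shutil.copyfile(image, gt_dir / f"{stem}{suffix}")
            # The label file shares the image's base name and ends in .gt.txt.
            (gt_dir / f"{stem}.gt.txt").write_text(label + "\n", encoding="utf-8")
            count += 1
    return count
```

Calling build_ground_truth(Path("symbols"), Path("data/foo-ground-truth")) would then fill the ground-truth folder; both paths here are placeholders, not the ones actually used.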

*The issue is this:*

When I use a sample image with "x+y=0" written on it and run tesseract with my
language, it gives me the output "xxxx". *Why?*
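
One way to narrow this down might be to run the trained model both on the "x+y=0" image and on a single-symbol image like the ones used for training, with an explicit page segmentation mode each time. The sketch below is an assumption about how such a check could look, not a confirmed fix: it uses the tesseract command-line options -l and --psm (7 treats the image as a single text line, 10 as a single character), takes the language name foo from the foo.traineddata file mentioned above, and the helper names and image file names are hypothetical.

```python
import subprocess
from typing import List


def build_tesseract_cmd(image_path: str, lang: str = "foo", psm: int = 7) -> List[str]:
    """Build a tesseract command that prints the recognized text to stdout."""
    # "stdout" as the output base makes tesseract write the text to stdout.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def run_ocr(image_path: str, lang: str = "foo", psm: int = 7) -> str:
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical file names: a full expression and one training-style symbol.
    print("line image:  ", run_ocr("x_plus_y.tif", psm=7))
    print("single char: ", run_ocr("single_plus.tif", psm=10))
```

If the single-symbol image is recognized correctly but the full expression is not, that would suggest the problem lies in the difference between the training images and the test image rather than in the installation of the traineddata file.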

*Please tell me where I went wrong!*

