Environment

Windows Setup: tesseract-ocr-setup-4.0.0-alpha.20170804.exe
Spanish Trained Data: 
https://github.com/tesseract-ocr/tessdata/raw/4.00/spa.traineddata
Command Used to OCR:
tesseract.exe ImageDoc.png output --oem 1 -l spa
where ImageDoc.png is a scanned Spanish document and
output is the base name of the text file that receives the OCRed text (output.txt).

   - Tesseract Version: 4.0.0-alpha (from the installer named above)
   - Platform: Windows, 64-bit
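
To make the report easier to reproduce, here is a minimal Python sketch that rebuilds the command above and counts round look-alike symbols in the resulting text. The set of look-alike characters is an assumption, since the report does not say which symbol appears in place of ‘o’. The helper names (build_command, count_lookalikes, run_ocr) are hypothetical, and the sketch assumes a tesseract executable is on PATH.

```python
import shutil
import subprocess
from collections import Counter

# Assumed set of round symbols that could stand in for 'o'; the report
# does not name the actual symbol. Note that "º" is also legitimate
# Spanish (ordinal indicator), so hits need a manual look.
ROUND_LOOKALIKES = {"°", "º", "○", "◦", "●", "•"}


def build_command(image, output_base, lang="spa", oem=1):
    """Return the argument list for the command used in the report."""
    return ["tesseract", image, output_base, "--oem", str(oem), "-l", lang]


def count_lookalikes(text):
    """Count each assumed look-alike symbol found in the OCR text."""
    return Counter(ch for ch in text if ch in ROUND_LOOKALIKES)


def run_ocr(image, output_base):
    """Run tesseract and return the text it wrote to <output_base>.txt."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    subprocess.run(build_command(image, output_base), check=True)
    with open(output_base + ".txt", encoding="utf-8") as fh:
        return fh.read()


# Example use (requires tesseract and the image from the report):
#   text = run_ocr("ImageDoc.png", "output")
#   print(count_lookalikes(text))
```

A non-empty count only flags lines worth checking by eye; it does not prove a misrecognition.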

Current Behavior:

In Spanish text, the character ‘o’ is incorrectly recognized as a round symbol.
The input file (ImageDoc.png) and a screenshot of the error are attached:

[image: spanish] 
<https://user-images.githubusercontent.com/12831051/30733359-45541566-9f94-11e7-8bb1-e8027c2efc0e.png>
[image: imagedoc] 
<https://user-images.githubusercontent.com/12831051/30733369-4d785ab8-9f94-11e7-9ff4-7f594f72a8dc.png>

Expected Behavior:

The character ‘o’ should be recognized correctly.
