I have a problem training Tesseract with a font I created. After the whole 
process of generating the Tesseract training files and combining them, 
Tesseract reads every "7" as "?". The font has both characters.

I created a unicharambigs file containing:


v1
1   ?   1   7   1



It was saved in Vi with the unix fileformat and ends with a newline character 
after the last line. It should replace every '?' with '7'.
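
To rule out a formatting problem in the file itself, a small check like the 
one below could be run on it. This is only a sketch: it assumes the v1 format 
as I understand it from the training documentation (a "v1" header line, then 
five tab-separated fields per line, with a type indicator of 0 or 1). The 
function name check_unicharambigs is made up for illustration and is not part 
of Tesseract.

```python
def check_unicharambigs(data: bytes) -> list[str]:
    """Return a list of problems found in a v1 unicharambigs file.

    Assumption: v1 lines have five TAB-separated fields:
    source count, source chars, target count, target chars, type (0 or 1).
    """
    problems = []
    if b"\r" in data:
        problems.append("file contains CR characters (not unix line endings)")
    if not data.endswith(b"\n"):
        problems.append("file does not end with a newline")

    lines = data.decode("utf-8").splitlines()
    if not lines or lines[0].strip() != "v1":
        problems.append("first line is not the version identifier 'v1'")

    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 5:
            problems.append(
                f"line {number}: expected 5 tab-separated fields, got {len(fields)}"
            )
            continue
        src_count, src, dst_count, dst, kind = fields
        if not (src_count.isdigit() and dst_count.isdigit()):
            problems.append(f"line {number}: counts must be integers")
            continue
        if int(src_count) != len(src.split()):
            problems.append(f"line {number}: source count does not match source chars")
        if int(dst_count) != len(dst.split()):
            problems.append(f"line {number}: target count does not match target chars")
        if kind not in ("0", "1"):
            problems.append(f"line {number}: type indicator must be 0 or 1")
    return problems


if __name__ == "__main__":
    # The rule from this post, written with real tab characters.
    print(check_unicharambigs(b"v1\n1\t?\t1\t7\t1\n"))
```

If the fields in the real file are separated by spaces rather than tabs, this 
check reports the line as having the wrong number of fields.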

Combining gives me this result:

 
    Combining tessdata files
    TessdataManager combined tesseract data files.
    Offset for type  0 (SmAftersale.config                ) is -1
    Offset for type  1 (SmAftersale.unicharset            ) is 140
    Offset for type  2 (SmAftersale.unicharambigs         ) is 3047
    Offset for type  3 (SmAftersale.inttemp               ) is 3061
    Offset for type  4 (SmAftersale.pffmtable             ) is 350802
    Offset for type  5 (SmAftersale.normproto             ) is 351219
    Offset for type  6 (SmAftersale.punc-dawg             ) is -1
    Offset for type  7 (SmAftersale.word-dawg             ) is -1
    Offset for type  8 (SmAftersale.number-dawg           ) is -1
    Offset for type  9 (SmAftersale.freq-dawg             ) is -1
    Offset for type 10 (SmAftersale.fixed-length-dawgs    ) is -1
    Offset for type 11 (SmAftersale.cube-unicharset       ) is -1
    Offset for type 12 (SmAftersale.cube-word-dawg        ) is -1
    Offset for type 13 (SmAftersale.shapetable            ) is 357761
    Offset for type 14 (SmAftersale.bigram-dawg           ) is -1
    Offset for type 15 (SmAftersale.unambig-dawg          ) is -1
    Offset for type 16 (SmAftersale.params-model          ) is -1
    Output SmAftersale.traineddata created successfully.


The offset for the "SmAftersale.unicharambigs" file is not -1, so I assume 
the file was read. But still, after all that, Tesseract keeps reading every 
'7' as '?'.
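
For reference, the list of components that actually ended up in the combined 
file can be pulled out of that output with a short script. This is a minimal 
sketch that only parses the "Offset for type" lines in the layout shown above; 
the function name included_components is hypothetical. As far as I can tell, 
an offset other than -1 only shows that the component was packed into the 
traineddata, not that its contents were interpreted the way I intended.

```python
import re

# Matches lines such as:
#   Offset for type  2 (SmAftersale.unicharambigs         ) is 3047
OFFSET_LINE = re.compile(r"Offset for type\s+(\d+)\s+\((\S+)\s*\) is (-?\d+)")


def included_components(log: str) -> list[str]:
    """Return the component names whose reported offset is not -1."""
    return [
        match.group(2)
        for match in OFFSET_LINE.finditer(log)
        if int(match.group(3)) != -1
    ]


if __name__ == "__main__":
    sample = (
        "Offset for type  0 (SmAftersale.config                ) is -1\n"
        "Offset for type  1 (SmAftersale.unicharset            ) is 140\n"
        "Offset for type  2 (SmAftersale.unicharambigs         ) is 3047\n"
    )
    for name in included_components(sample):
        print(name)
```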

What did I do wrong, or what did I miss?
