On 29/11/2015 12:18, Marco Atzeri wrote:
On 27/11/2015 16:28, Sriranga(83yrsold) wrote:
In continuation of my previous post, I would like to report that I also
succeeded in generating the kan.traineddata file in tesseract-3.05.0Dev using
tesstrain.sh.
I am thankful to all concerned who helped me to solve the problem.
Good Luck.

On Fri, Nov 27, 2015 at 6:45 PM, Sriranga(83yrsold)
<withblessing.sriranga.1...@gmail.com
<mailto:withblessing.sriranga.1...@gmail.com>> wrote:

    Hi,
    After several attempts over more than two days, I have now
    successfully generated the kan.traineddata file on Ubuntu 15.10 using
    tesstrain.sh from tesseract-3.04.
    I have attached a terminal extract for the benefit of users. Since
    kan.traineddata exceeds 25 MB, I could not attach it here. Please
    note that the fonts listed in language-specific.sh did not work for kan,
    which resulted in failures. I don't know why they do not work.
    With best of luck,
    sriranga(83)


Nice to hear you solved it.

I am testing the Cygwin version using the data you provided,
and clearly something goes wrong when the font directive is passed
from the script to the utilities.

Moreover, I see some segfaults in text2image, which should never
happen anyway.

As soon as I find out more, I will post an update here.

Regards
Marco


Using the latest git version of the scripts, with a typo correction,
I was able to process Sriranga's data with the 3.04 Cygwin version.

All the logs and data are here:
http://matzeri.altervista.org/works/tesseract/

directory contents:
 input      = Sriranga's data
 log        = script and run logs
 scripts    = git version and patch for the typo
 tessdata   = output file

Additional notes:
- For this case the suggested Cygwin font is "Lohit Kannada".
- There was a mismatch when passing temporary data to text2image:
  one step put it in "/tmp" and the next step expected it
  in "/tmp/leptonica".
  Workaround: link /tmp/leptonica -> /tmp
- The final step expected "font_properties" in the kan
  directory.
  Workaround: link
  font_properties -> /usr/share/tessdata/font_properties
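
For anyone who wants to script the two linking workarounds, here is a minimal sketch in Python. It is only an illustration, not part of the tesseract scripts: the helper names `ensure_symlink` and `apply_workarounds` and the three directory parameters are my own assumptions, and the demo runs against a scratch directory so nothing under the real /tmp is touched. For a real run you would pass "/tmp", your kan directory, and "/usr/share/tessdata".

```python
import os
import tempfile


def ensure_symlink(link_path, target):
    """Create link_path -> target unless something already exists at link_path.

    Returns True if a new link was created, False if the path was left alone.
    """
    if os.path.lexists(link_path):
        return False  # never overwrite an existing file, directory, or link
    os.symlink(target, link_path)
    return True


def apply_workarounds(tmp_dir, kan_dir, tessdata_dir):
    """Apply both workarounds (hypothetical helper, all paths are parameters)."""
    # Workaround 1: <tmp_dir>/leptonica -> <tmp_dir>
    made_tmp_link = ensure_symlink(os.path.join(tmp_dir, "leptonica"), tmp_dir)
    # Workaround 2: <kan_dir>/font_properties -> <tessdata_dir>/font_properties
    made_font_link = ensure_symlink(
        os.path.join(kan_dir, "font_properties"),
        os.path.join(tessdata_dir, "font_properties"),
    )
    return made_tmp_link, made_font_link


if __name__ == "__main__":
    # Demo in a scratch directory that mimics the layout described above.
    root = tempfile.mkdtemp()
    tmp_dir = os.path.join(root, "tmp")
    kan_dir = os.path.join(root, "kan")
    tessdata_dir = os.path.join(root, "tessdata")
    for d in (tmp_dir, kan_dir, tessdata_dir):
        os.makedirs(d)
    # Stand-in for the installed font_properties file.
    with open(os.path.join(tessdata_dir, "font_properties"), "w") as f:
        f.write("")
    print(apply_workarounds(tmp_dir, kan_dir, tessdata_dir))
```

The helper deliberately does nothing when the link path already exists, so running it twice is harmless.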

Regards
Marco


