Marco, what is the latest status of your research? Please send me the command line you used, so that I can test it on my machine. I could not understand the note "Workaround linking font_properties -> /usr/share/tessdata/font_properties", nor how to compile from source code, for which I would need a step-by-step procedure. I presume the procedure followed on Linux is the same for Cygwin, the only difference being that I have to download the source code into a folder under c:\cygwin? Also, I find that Microsoft's Kannada font "Tunga" is not supported; is that right?

With regards,
Sriranga (83+)

On Mon, Nov 30, 2015 at 2:40 PM, Marco Atzeri <[email protected]> wrote:

> On 29/11/2015 12:18, Marco Atzeri wrote:
>
>> On 27/11/2015 16:28, Sriranga(83yrsold) wrote:
>>
>>> In continuation of my previous post, I would like to report that I also
>>> succeeded in generating the kan.traineddata file in tesseract-3.05.0Dev
>>> using tesstrain.sh.
>>> I am thankful to all concerned who helped me to solve the problem.
>>> Good Luck.
>>>
>>> On Fri, Nov 27, 2015 at 6:45 PM, Sriranga(83yrsold)
>>> <[email protected]> wrote:
>>>
>>>     Hi,
>>>     After several attempts over more than two days, I have now
>>>     successfully generated the kan.traineddata file in Ubuntu 15.10 using
>>>     tesstrain.sh of tesseract-3.04.
>>>     Attached is a terminal extract for the benefit of users. Since
>>>     kan.traineddata exceeds 25 MB, I could not attach it here. Please
>>>     note that all the fonts listed in language-specific.sh did not work
>>>     for kan, resulting in failures. I don't know why they do not work.
>>>     With best of luck,
>>>     sriranga(83)
>>>
>>
>> Nice to hear you solved it.
>>
>> I am testing the Cygwin version using the data you provided me,
>> and clearly there is something wrong in passing the font directive
>> from the script to the utilities.
>>
>> Moreover, I see some segfaults in text2image, which should never
>> happen anyway.
>>
>> As soon as I find more, I will update here.
>>
>> Regards
>> Marco
>>
>
> Using the latest git version of the scripts, with a typo correction,
> I was able to process Sriranga's data with the 3.04 Cygwin version.
>
> All the logs and data are here:
> http://matzeri.altervista.org/works/tesseract/
>
> Directory contents:
> input    = Sriranga's data
> log      = script and run logs
> scripts  = git version and patch for the typo
> tessdata = output file
>
> Additional notes:
> - For this case the suggested Cygwin font is "Lohit Kannada".
> - There was a misalignment in passing temporary data to text2image:
>   one step put it in "/tmp" and the next step expected it
>   in "/tmp/leptonica".
>   Workaround: linking /tmp/leptonica -> /tmp
> - The final step was expecting "font_properties" in the kan
>   directory.
>   Workaround: linking
>   font_properties -> /usr/share/tessdata/font_properties
>
> Regards
> Marco
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/565C1274.5040608%40gmail.com.
> For more options, visit https://groups.google.com/d/optout.
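
The two "Workaround: linking" notes quoted above most likely refer to symbolic links, where "A -> B" means that the name A points to the existing location B. The sketch below is one possible reading of those notes, not the exact commands Marco ran. The variable names TMP_ROOT, TESSDATA_DIR and KAN_DIR are hypothetical and do not come from tesstrain.sh, and the default "./kan" is only a guess at what "the kan directory" means, so adjust it to wherever your kan training files live.

```shell
#!/bin/sh
# Sketch of the two symlink workarounds, under the assumptions stated above.
# TMP_ROOT, TESSDATA_DIR and KAN_DIR are illustrative names (assumptions),
# not variables used by tesstrain.sh.
set -eu

TMP_ROOT="${TMP_ROOT:-/tmp}"
TESSDATA_DIR="${TESSDATA_DIR:-/usr/share/tessdata}"
KAN_DIR="${KAN_DIR:-./kan}"   # assumed location of "the kan directory"

# 1. One step writes temporary data to /tmp while the next step looks in
#    /tmp/leptonica, so make /tmp/leptonica a link that points back to /tmp.
#    Skip the step if something with that name already exists.
if [ ! -e "$TMP_ROOT/leptonica" ] && [ ! -L "$TMP_ROOT/leptonica" ]; then
    ln -s "$TMP_ROOT" "$TMP_ROOT/leptonica"
fi

# 2. The final step looks for "font_properties" inside the kan directory,
#    so link that name to the copy shipped under the tessdata directory.
mkdir -p "$KAN_DIR"
if [ ! -e "$KAN_DIR/font_properties" ] && [ ! -L "$KAN_DIR/font_properties" ]; then
    ln -s "$TESSDATA_DIR/font_properties" "$KAN_DIR/font_properties"
fi
```

Under Cygwin these commands are typed in the Cygwin terminal, where /tmp and /usr/share are paths inside the Cygwin installation rather than Windows paths. Running `ls -l "$KAN_DIR/font_properties"` afterwards shows where the link points.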

