Thanks! In the end, the issue was this: https://groups.google.com/forum/#!msg/tesseract-ocr/FSURCa9m7Ko/
So I suppose the files from the download page caused the error, but the newer files on Git work well when building Tesseract on Cygwin.

Greetings:
Kazi

On Wednesday, December 30, 2015 at 9:42:25 UTC+1, shree wrote:
> On Cygwin, Marco Atzeri has packaged Tesseract as well as the training utilities for 3.04.00, along with some training data. Instructions for Cygwin installation are here: https://cygwin.com/cygwin-ug-net/setup-net.html
>
> Tesseract-specific packages to be installed:
>
> tesseract-ocr 3.04.00-2
> tesseract-ocr-eng 3.04-1
> tesseract-training-core 3.04-1
> tesseract-training-eng 3.04-1
> tesseract-training-util 3.04.00-2
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Wed, Dec 30, 2015 at 5:11 AM, bácsi Kazi <[email protected]> wrote:
>
>> Dear Zdenko!
>>
>> Thank you for your reply! Even though the original file was in Italian, your output is quite impressive!
>> I found a guide on how to compile with Cygwin: http://vorba.ch/2014/tesseract-cygwin.html
>> So I installed Cygwin64 with the necessary packages. Everything went fine with Leptonica, but I screwed up with Tesseract. During make, when processing ccutil/ambigs.cpp, it cannot find the strtok_r.h file, although it is in the vs2010/port folder (if I place it there it finds it ambiguous). I used: CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib" ./configure because of my Leptonica installation.
>> So I can't even get a "normal" installation, not to mention the one described here: https://github.com/tesseract-ocr/tesseract/wiki/Compiling
>> I'm not familiar with this stuff - that's why I was asking for an installer (I couldn't find the one you were referring to).
>> I also couldn't work out what exactly you suggested in your last line.
>> Greetings:
>>
>> Kazi
>>
>> On Monday, December 28, 2015 at 20:23:35 UTC+1, zdenop wrote:
>>
>>> First of all - there is no such policy as not providing Windows installers.
>>> There is no installer because there is nobody who would maintain it and provide a solution (e.g. NSIS destroys my PATH variable on Windows ;-) ). Everybody is busy with programming :-) (something else).
>>>
>>> Next: there is a Windows build based on Cygwin, so if you need a portable Windows version, you can get it (search this forum).
>>>
>>> Next, in the attachment you can find output created with the current tesseract code using:
>>> tesseract example.png example -l spa
>>> (I renamed your file and I hope I chose the correct language for OCR). It seems that the result is better than yours, including capitalization.
>>>
>>> IMO the tesseract executable is a nice example of how to use the tesseract library. Maybe you should try to use the tesseract library directly.
>>>
>>> Zdenko
>>>
>>> On Mon, Dec 28, 2015 at 7:00 PM, bácsi Kazi <[email protected]> wrote:
>>>
>>>> Dear Zdenko,
>>>>
>>>> I provide an example file in the attachment. You can see Enrico, Antonio, Roberto in the output with this mistake, even though all these names are present in the dictionary in all caps.
>>>> I haven't tried later versions, because you have a policy of not providing Windows installers, and I was busy with other programming. But if you say it's worth it, I'll try. Is there any guide on how to create a portable version for Windows?
>>>> Thanks again!
>>>>
>>>> Kazi
>>>>
>>>> On Monday, December 28, 2015 at 10:08:35 UTC+1, zdenop wrote:
>>>>
>>>>> When you ask for support please provide example files.
>>>>> Did you try the latest version of tesseract?
>>>>>
>>>>> Zdenko
>>>>>
>>>>> On Sun, Dec 27, 2015 at 9:43 PM, bácsi Kazi <[email protected]> wrote:
>>>>>
>>>>>> Could you help? Have I missed something blatantly trivial?
>>>>>> Any help would be highly appreciated!
>>>>>>
>>>>>> Kazi
>>>>>>
>>>>>> On Tuesday, December 15, 2015 at 8:33:27 UTC+1, bácsi Kazi wrote:
>>>>>>
>>>>>>> Hi there!
>>>>>>>
>>>>>>> I'm playing with Tesseract 3.02, and I need precise recognition of capital letters. Unfortunately my files are full of all caps and small caps. During training, if I included such words in the sample, I got random capitals in the rest of the text. I thought I would try to put them into a new font: same result. I included them in the dictionary files: somewhat better, but still problematic at the letters o, u, v etc., i.e. HELLo WoRLD & friends, despite having HELLO WORLD in the dictionary.
>>>>>>> It's quite similar to this: https://code.google.com/p/tesseract-ocr/issues/detail?id=691
>>>>>>> What is your experience? How to train Tesseract for caps? Is it better in later versions? Is there a configuration parameter to set?
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Kazi
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/c72f7b1a-9d63-46ba-b6f2-34a82b16416a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
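For reference, the two command lines quoted in this thread (the configure call that points at a locally installed Leptonica, and the OCR run with an explicit language) can be collected into a small dry-run sketch. It only prints the commands and runs nothing. The names `LEPT_PREFIX`, `configure_cmd`, and `ocr_cmd` are hypothetical helpers, not part of Tesseract, and `/usr/local` is assumed only because the include and library paths in the thread point there.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands discussed in the thread, runs nothing.
# LEPT_PREFIX is a hypothetical variable name; /usr/local is an assumption
# taken from the -I/-L paths quoted in the thread.
LEPT_PREFIX="${LEPT_PREFIX:-/usr/local}"

# Print the configure invocation that adds the Leptonica include and
# library directories to the preprocessor and linker search paths.
configure_cmd() {
  printf 'CPPFLAGS="-I%s/include" LDFLAGS="-L%s/lib" ./configure\n' \
    "$LEPT_PREFIX" "$LEPT_PREFIX"
}

# Print the OCR invocation: tesseract <image> <output base name> -l <language>.
ocr_cmd() {
  printf 'tesseract %s %s -l %s\n' "$1" "$2" "$3"
}

configure_cmd
ocr_cmd example.png example spa
```

With the default prefix this prints the same two command lines that appear in the thread. Setting `LEPT_PREFIX` to another directory before calling `configure_cmd` changes only the two search paths.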

