Ok. Thank you very much for your help! I'll get it to work somehow! Cheers, Dustin
On Wednesday, October 2, 2019 at 16:46:25 UTC+2, shree wrote:
>
> Sorry, I don't know how to add those fonts on a Mac.
>
> The tutorial uses the following set of fonts:
>
> https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh#L42
>
> You could use a similar set of fonts available on the Mac and assign them
> via fontlist.
>
> On Wed, Oct 2, 2019 at 7:38 PM Dustin Theobald <d.th...@gmail.com> wrote:
>
>> Hey shree,
>>
>> Do you know how to manually install the missing fonts on a Mac, as in
>> your documentation for Linux:
>>
>> sudo apt update
>> sudo apt install ttf-mscorefonts-installer
>> sudo apt install fonts-dejavu
>> fc-cache -vf
>>
>> Thank you in advance!
>>
>> Best regards,
>> Dustin
>>
>> On Wednesday, October 2, 2019 at 11:26:28 UTC+2, shree wrote:
>>>
>>> > This doesn't work on my Mac. I can't find some of the fonts, so I only
>>> > try to create training data for Arial. When I use
>>> > 5-makedata-plusminus.sh, it only renders 2 pages, which seems odd.
>>>
>>> 2 pages should be OK, because it uses the training_text from the langdata
>>> repo, which is around 80 lines, plus the extra lines added with plusminus.
>>>
>>> On Wed, Oct 2, 2019 at 2:53 PM Shree Devi Kumar <shree...@gmail.com>
>>> wrote:
>>>
>>>> 1. You could install on Linux using the appropriate package from
>>>> https://github.com/tesseract-ocr/tesseract/wiki#tesseract-4-packages-with-lstm-engine-and-related-traineddata
>>>>
>>>> OR
>>>>
>>>> 2. When building tesseract from git source, follow
>>>> https://github.com/tesseract-ocr/tesseract/wiki/Compiling-%E2%80%93-GitInstallation#build-with-training-tools
>>>>
>>>> You seem to be missing some steps there.
>>>>
>>>> On Wed, Oct 2, 2019 at 2:32 PM Dustin Theobald <d.th...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hey Shree,
>>>>>
>>>>> Thank you for your help!
>>>>>
>>>>> This doesn't work on my Mac. I can't find some of the fonts, so I only
>>>>> try to create training data for Arial. When I use
>>>>> 5-makedata-plusminus.sh, it only renders 2 pages, which seems odd.
>>>>>
>>>>> I'm switching to my Linux machine now, but I have problems installing
>>>>> tesseract.
>>>>>
>>>>> I'm following the documentation:
>>>>>
>>>>> sudo apt install tesseract-ocr
>>>>>
>>>>> After that, I try to find the folder in which to run
>>>>>
>>>>> make
>>>>> make training
>>>>> make training-install
>>>>>
>>>>> But I cannot find the folder (on Ubuntu).
>>>>>
>>>>> So I cloned the GitHub repository
>>>>> https://github.com/tesseract-ocr/tesseract
>>>>> to my Desktop and ran ./autogen.sh, ./configure, make, make training,
>>>>> sudo make training-install.
>>>>>
>>>>> But then I get the following error when running
>>>>> 5-makedata-plusminus.sh:
>>>>>
>>>>> /usr/local/bin/text2image: error while loading shared libraries:
>>>>> libtesseract.so.5: cannot open shared object file: No such file or
>>>>> directory
>>>>> ERROR: Program text2image failed. Abort.
>>>>>
>>>>> Thank you very much for your help!
>>>>>
>>>>> On Tuesday, October 1, 2019 at 17:41:36 UTC+2, shree wrote:
>>>>>>
>>>>>> specifically
>>>>>> https://github.com/Shreeshrii/tess4training/blob/master/6-plusminus.log#L429
>>>>>>
>>>>>> On Tue, Oct 1, 2019 at 9:09 PM Shree Devi Kumar <shree...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> See https://github.com/Shreeshrii/tess4training
>>>>>>>
>>>>>>> On Tue, Oct 1, 2019 at 7:53 PM Dustin Theobald <d.th...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I changed my evaluation to:
>>>>>>>>
>>>>>>>> ~/../../usr/local/bin/lstmeval \
>>>>>>>>   --model ~/Desktop/tesstutorial/trainplusminus/plusminus_checkpoint \
>>>>>>>>   --traineddata ~/Desktop/tesstutorial/trainplusminus/eng/eng.traineddata \
>>>>>>>>   --eval_listfile ~/Desktop/tesstutorial/trainplusminus/eng.training_files.txt 2>&1 |
>>>>>>>>   grep ±
>>>>>>>>
>>>>>>>> It still doesn't work.
>>>>>>>>
>>>>>>>> On Tuesday, October 1, 2019 at 14:39:48 UTC+2, Dustin Theobald wrote:
>>>>>>>>>
>>>>>>>>> Hey guys,
>>>>>>>>>
>>>>>>>>> I have a problem when fine-tuning for characters (trying the ±
>>>>>>>>> approach from
>>>>>>>>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
>>>>>>>>> ).
>>>>>>>>>
>>>>>>>>> (I'm working on a Mac.)
>>>>>>>>>
>>>>>>>>> My tesseract version:
>>>>>>>>>
>>>>>>>>> tesseract 5.0.0-alpha-457-gb3b74
>>>>>>>>>  leptonica-1.78.0
>>>>>>>>>   libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.0.10 :
>>>>>>>>>   zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
>>>>>>>>>  Found AVX2
>>>>>>>>>  Found AVX
>>>>>>>>>  Found FMA
>>>>>>>>>  Found SSE
>>>>>>>>>  Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6
>>>>>>>>>
>>>>>>>>> My bash script looks as follows: https://pastebin.com/XK4CkuM2
>>>>>>>>>
>>>>>>>>> When I evaluate via:
>>>>>>>>>
>>>>>>>>> ~/../../usr/local/bin/lstmeval \
>>>>>>>>>   --model ~/Desktop/tesstutorial/trainplusminus/eng.traineddata \
>>>>>>>>>   --traineddata ~/Desktop/tesstutorial/trainplusminus/eng/eng.traineddata \
>>>>>>>>>   --eval_listfile ~/Desktop/tesstutorial/trainplusminus/eng.training_files.txt 2>&1 |
>>>>>>>>>   grep ±
>>>>>>>>>
>>>>>>>>> not a single line is recognized correctly.
>>>>>>>>>
>>>>>>>>> Does someone see a mistake in my code?
>>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to tesser...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/e9ba2635-6308-41a8-8150-e5d4da520269%40googlegroups.com
>>>>>>>> .
>>>>>>>
>>>>>>> --
>>>>>>> ____________________________________________________________
>>>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ca6dd8f3-27d1-4ab5-bfe1-45011e63223e%40googlegroups.com.
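
A note on the `libtesseract.so.5` error quoted above: that message usually means the dynamic linker cannot find the freshly built library, either because `sudo make install` was skipped or because the linker cache was not refreshed with `sudo ldconfig` afterwards. Both steps are part of the build instructions linked earlier in the thread, but whether they are what was missing here is an assumption. The helper below is a minimal sketch for confirming the diagnosis; `missing_libs` is a hypothetical name, not a Tesseract tool.

```shell
#!/bin/sh
# Print the names of shared libraries that `ldd` reports as "not found".
# Reads ldd-style output on stdin, so it works on saved output as well.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage on the Linux machine (binary path taken from the error message):
#   ldd /usr/local/bin/text2image | missing_libs
#
# If libtesseract.so.5 is listed, a likely remedy (assumption, see above) is
# to run, from the tesseract source directory:
#   sudo make install
#   sudo ldconfig
```

If the function prints nothing, every library resolves and the cause of the failure lies elsewhere.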