Sorry, I don't know how to add those fonts on a Mac. The tutorial uses the following set of fonts: https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh#L42
You could use a similar set of fonts that is available on the Mac and assign them via fontlist.

On Wed, Oct 2, 2019 at 7:38 PM Dustin Theobald <d.theo1...@gmail.com> wrote:

> Hey shree,
>
> Do you know how to manually install the missing fonts on a Mac, like in
> your documentation for Linux:
>
> sudo apt update
> sudo apt install ttf-mscorefonts-installer
> sudo apt install fonts-dejavu
> fc-cache -vf
>
> Thank you in advance!
>
> Best regards,
> Dustin
>
> On Wednesday, October 2, 2019 at 11:26:28 UTC+2, shree wrote:
>>
>> > This doesn't work on my Mac. I can't find some of the fonts, so I only
>> > tried to create training data for Arial. If I use 5-makedata-plusminus.sh,
>> > it only renders 2 pages, which seems odd.
>>
>> 2 pages should be OK, because it uses the training_text from the langdata
>> repo, which is around 80 lines, plus the extra lines added with plusminus.
>>
>> On Wed, Oct 2, 2019 at 2:53 PM Shree Devi Kumar <shree...@gmail.com> wrote:
>>
>>> 1. You could install on Linux using the appropriate package from
>>> https://github.com/tesseract-ocr/tesseract/wiki#tesseract-4-packages-with-lstm-engine-and-related-traineddata
>>>
>>> OR
>>>
>>> 2. When building tesseract from git source, follow
>>> https://github.com/tesseract-ocr/tesseract/wiki/Compiling-%E2%80%93-GitInstallation#build-with-training-tools
>>>
>>> You seem to be missing some steps there.
>>>
>>> On Wed, Oct 2, 2019 at 2:32 PM Dustin Theobald <d.th...@gmail.com> wrote:
>>>
>>>> Hey Shree,
>>>>
>>>> Thank you for your help!
>>>>
>>>> This doesn't work on my Mac. I can't find some of the fonts, so I only
>>>> tried to create training data for Arial. If I use 5-makedata-plusminus.sh,
>>>> it only renders 2 pages, which seems odd.
>>>>
>>>> I'm switching to my Linux machine now, but I have problems installing
>>>> tesseract.
>>>>
>>>> I'm following the documentation:
>>>>
>>>> sudo apt install tesseract-ocr
>>>>
>>>> Afterwards, I tried to find the folder in which to run
>>>>
>>>> make
>>>> make training
>>>> make training-install
>>>>
>>>> but I cannot find that folder (on Ubuntu).
>>>>
>>>> So I cloned the GitHub repository
>>>> https://github.com/tesseract-ocr/tesseract
>>>> to my Desktop and ran ./autogen.sh, ./configure, make, make training,
>>>> sudo make training-install.
>>>>
>>>> But then I get the following error when running
>>>> 5-makedata-plusminus.sh:
>>>>
>>>> /usr/local/bin/text2image: error while loading shared libraries: libtesseract.so.5: cannot open shared object file: No such file or directory
>>>> ERROR: Program text2image failed. Abort.
>>>>
>>>> Thank you very much for your help!
>>>>
>>>> On Tuesday, October 1, 2019 at 17:41:36 UTC+2, shree wrote:
>>>>>
>>>>> specifically
>>>>> https://github.com/Shreeshrii/tess4training/blob/master/6-plusminus.log#L429
>>>>>
>>>>> On Tue, Oct 1, 2019 at 9:09 PM Shree Devi Kumar <shree...@gmail.com> wrote:
>>>>>
>>>>>> See https://github.com/Shreeshrii/tess4training
>>>>>>
>>>>>> On Tue, Oct 1, 2019 at 7:53 PM Dustin Theobald <d.th...@gmail.com> wrote:
>>>>>>
>>>>>>> I changed my evaluation to:
>>>>>>>
>>>>>>> ~/../../usr/local/bin/lstmeval \
>>>>>>> --model ~/Desktop/tesstutorial/trainplusminus/*plusminus_checkpoint* \
>>>>>>> --traineddata ~/Desktop/tesstutorial/trainplusminus/eng/eng.traineddata \
>>>>>>> --eval_listfile ~/Desktop/tesstutorial/trainplusminus/eng.training_files.txt 2>&1 | grep ±
>>>>>>>
>>>>>>> It still doesn't work.
>>>>>>>
>>>>>>> On Tuesday, October 1, 2019 at 14:39:48 UTC+2, Dustin Theobald wrote:
>>>>>>>>
>>>>>>>> Hey guys,
>>>>>>>>
>>>>>>>> I have a problem when fine-tuning for characters (trying the ± approach
>>>>>>>> from https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00).
>>>>>>>>
>>>>>>>> (I'm working on a Mac.)
>>>>>>>>
>>>>>>>> My tesseract version:
>>>>>>>>
>>>>>>>> tesseract 5.0.0-alpha-457-gb3b74
>>>>>>>> leptonica-1.78.0
>>>>>>>> libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
>>>>>>>> Found AVX2
>>>>>>>> Found AVX
>>>>>>>> Found FMA
>>>>>>>> Found SSE
>>>>>>>> Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6
>>>>>>>>
>>>>>>>> My bash script looks as follows: https://pastebin.com/XK4CkuM2
>>>>>>>>
>>>>>>>> When I evaluate via:
>>>>>>>>
>>>>>>>> ~/../../usr/local/bin/lstmeval \
>>>>>>>> --model ~/Desktop/tesstutorial/trainplusminus/eng.traineddata \
>>>>>>>> --traineddata ~/Desktop/tesstutorial/trainplusminus/eng/eng.traineddata \
>>>>>>>> --eval_listfile ~/Desktop/tesstutorial/trainplusminus/eng.training_files.txt 2>&1 | grep ±
>>>>>>>>
>>>>>>>> I don't get a single line recognized correctly.
>>>>>>>>
>>>>>>>> Does someone see a mistake in my code?
>>>>>>>>
>>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to tesser...@googlegroups.com.
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/e9ba2635-6308-41a8-8150-e5d4da520269%40googlegroups.com
--
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWqhwDp44dRuH2d6Fnz6msHPQTioHyWUyByAsmdWnkwcA%40mail.gmail.com.
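
For the fontlist suggestion at the top of this message, here is a minimal sketch of how one might work out which of the wanted fonts are actually usable on a given machine. It is not from the tutorial. The helper name filter_fonts and the example font names are hypothetical, and it assumes that the font listing has one font per line in the form "N: Font Name" (which is what text2image --list_available_fonts appears to print; check the output on your system first). The helper reads a listing on stdin and prints only the wanted fonts that occur in it.

```shell
#!/bin/sh
# Hypothetical helper: print each wanted font (given as arguments) that
# appears in a font listing read from stdin.
# Assumption: every listing line looks like "  0: Font Name".
filter_fonts() {
  listing=$(cat)
  for font in "$@"; do
    printf '%s\n' "$listing" | while IFS= read -r line; do
      case "$line" in
        # Quoted "$font" is matched literally, so "Arial" does not
        # match a line that ends in "Arial Bold".
        *": $font") printf '%s\n' "$font"; break ;;
      esac
    done
  done
}

# Possible use on a Mac (assumes text2image is on the PATH and that the
# fonts live in /Library/Fonts; adjust the directory as needed):
#   text2image --list_available_fonts --fonts_dir /Library/Fonts 2>&1 \
#     | filter_fonts "Arial" "Courier New" "Verdana"
```

The fonts it prints are the ones that could then be assigned via fontlist, with the same fonts directory passed to the training script.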