I mean, are you working with fonts only, or with images?

On Wed, 1 Sep 2021 at 6:09 PM, Samruddhi Dhake <[email protected]> wrote:
> Yes, I am working with the English language.
> I am using tessdata (C:\Program Files\Tesseract-OCR\tessdata).
>
> On Wednesday, September 1, 2021 at 5:57:24 PM UTC+5:30 P007 wrote:
>
>> Okay,
>>
>> Wait, you are working with the English language, right?
>> What kind of dataset did you use here?
>>
>> On Wed, 1 Sep 2021 at 5:53 PM, Samruddhi Dhake <[email protected]> wrote:
>>
>>> No, tesstrain.sh didn't work. I am running tesstrain.sh on Cygwin.
>>> Command:
>>> $ ./src/training/tesstrain.sh --fonts_dir %WINDIR%/Fonts/ --lang eng --linedata_only --noextract_font_properties --langdata_dir 'C:/Program Files/Tesseract-OCR/langdata' --tessdata_dir 'C:/Program Files/Tesseract-OCR/tessdata' --output_dir D:/Test/trainneddata --fontlist 'Arial'
>>>
>>> After I hit Enter, tesstrain.sh runs text2image and
>>> gives the following error:
>>> === Starting training for language 'eng'
>>> [Tue Aug 31 19:19:05 IST 2021] /cygdrive/c/Program Files/Tesseract-OCR/text2image --fonts_dir=%WINDIR%/Fonts/ --ptsize 12 --font=Arial --outputbase=/tmp/font_tmp.0doGBqWc3I/sample_text.txt --text=/tmp/font_tmp.0doGBqWc3I/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.0doGBqWc3I
>>> Unable to open '/tmp/font_tmp.0doGBqWc3I/fonts.conf' for writing
>>> Fontconfig error: Cannot load default config file
>>> Could not find font named 'Arial'.
>>> Please correct --font arg.
>>> ERROR: Program Program failed. Abort.
>>>
>>> As per the previous suggestion, I ran the text2image.exe command in cmd;
>>> it works and lists all available fonts.
>>>
>>> So why does the text2image command fail when it is run from tesstrain.sh?
>>> It does not create the temp folder under /tmp/, and I get the fonts.conf
>>> error.
>>> I expected the fonts.conf file to be written to the temp folder
>>> (font_tmp.0doGBqWc3I in my case) and to include the font 'Arial', so that
>>> the Arial font can be found.
>>> I don't know why it is not being created.
>>>
>>> Regards,
>>> Samruddhi
>>>
>>> On Wednesday, September 1, 2021 at 5:31:10 PM UTC+5:30 P007 wrote:
>>>
>>>> Did tesstrain.sh work for you?
>>>>
>>>> On Wed, 1 Sep 2021 at 5:09 PM, Samruddhi Dhake <[email protected]> wrote:
>>>>
>>>>> text2image has an argument, --fontconfig_tmpdir, which
>>>>> creates a temp folder where fonts.conf gets written.
>>>>>
>>>>> I checked /tmp/; the temp folder (font_tmp.0doGBqWc3I) has not been created.
>>>>>
>>>>> Has anybody else had this issue?
>>>>>
>>>>> Regards,
>>>>> Samruddhi
>>>>>
>>>>> On Tuesday, August 31, 2021 at 7:24:46 PM UTC+5:30 Samruddhi Dhake wrote:
>>>>>
>>>>>> >"C:\Program Files\Tesseract-OCR\text2image.exe" --fonts_dir=%WINDIR%/Fonts --fontconfig_tmpdir=/tmp --list_available_fonts
>>>>>> This worked. I got the list of available fonts, which contains Arial and
>>>>>> Arial Bold too.
>>>>>>
>>>>>> This time, in Cygwin Bash, I tried passing --fontlist 'Arial' to
>>>>>> tesstrain.sh.
>>>>>> Command:
>>>>>> $ ./src/training/tesstrain.sh --fonts_dir %WINDIR%/Fonts/ --lang eng --linedata_only --noextract_font_properties --langdata_dir 'C:/Program Files/Tesseract-OCR/langdata' --tessdata_dir 'C:/Program Files/Tesseract-OCR/tessdata' --output_dir D:/Test/trainneddata --fontlist 'Arial'
>>>>>>
>>>>>> === Starting training for language 'eng'
>>>>>> [Tue Aug 31 19:19:05 IST 2021] /cygdrive/c/Program Files/Tesseract-OCR/text2image --fonts_dir=%WINDIR%/Fonts/ --ptsize 12 --font=Arial --outputbase=/tmp/font_tmp.0doGBqWc3I/sample_text.txt --text=/tmp/font_tmp.0doGBqWc3I/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.0doGBqWc3I
>>>>>> Unable to open '/tmp/font_tmp.0doGBqWc3I/fonts.conf' for writing
>>>>>> Fontconfig error: Cannot load default config file
>>>>>> Could not find font named 'Arial'.
>>>>>> Please correct --font arg.
>>>>>> ERROR: Program Program failed. Abort.
>>>>>>
>>>>>> I am still getting this fonts.conf error. Any idea how to resolve
>>>>>> it?
>>>>>>
>>>>>> Regards,
>>>>>> Samruddhi
>>>>>>
>>>>>> On Tuesday, August 31, 2021 at 4:50:14 PM UTC+5:30 zdenop wrote:
>>>>>>
>>>>>>> Try running this:
>>>>>>> "C:\Program Files\Tesseract-OCR\text2image.exe" --fonts_dir=%WINDIR%/Fonts --fontconfig_tmpdir=/tmp --list_available_fonts
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>> On Mon, 30 Aug 2021 at 16:45, Samruddhi Dhake <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am running this command:
>>>>>>>>
>>>>>>>> ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir "C:/Program Files/Tesseract-OCR/langdata" --tessdata_dir "C:/Program Files/Tesseract-OCR/tessdata" --output_dir D:\Test\trainneddata
>>>>>>>>
>>>>>>>> After hitting Enter (processing):
>>>>>>>> === Starting training for language 'eng'
>>>>>>>> [Mon Aug 30 16:51:10 IST 2021] /cygdrive/c/Program Files/Tesseract-OCR/text2image --fonts_dir=C:/Windows/Fonts/ --ptsize 12 --font=Arial Bold --outputbase=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt --text=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.s9cdSHrzKS
>>>>>>>> Unable to open '/tmp/font_tmp.s9cdSHrzKS/fonts.conf' for writing
>>>>>>>> Fontconfig error: Cannot load default config file
>>>>>>>> Could not find font named 'Arial Bold'.
>>>>>>>> Please correct --font arg.
>>>>>>>> ERROR: Program Program failed. Abort.
>>>>>>>>
>>>>>>>> I will break it down to ask a few questions.
>>>>>>>>
>>>>>>>> [Mon Aug 30 16:51:10 IST 2021] /cygdrive/c/Program Files/Tesseract-OCR/text2image --fonts_dir=C:/Windows/Fonts/ --ptsize 12 --font=Arial Bold --outputbase=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt --text=/tmp/font_tmp.s9cdSHrzKS/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.s9cdSHrzKS
>>>>>>>> Unable to open '/tmp/font_tmp.s9cdSHrzKS/fonts.conf' for writing
>>>>>>>> ----> Here, I am not passing Arial Bold as input. --outputbase
>>>>>>>> should create the temp folder 'font_tmp.s9cdSHrzKS', but it is not being created.
>>>>>>>> The same goes for --fontconfig_tmpdir. So it gives a write error.
>>>>>>>>
>>>>>>>> Fontconfig error: Cannot load default config file
>>>>>>>> ----> To resolve this error, I added
>>>>>>>> FONTCONFIG_FILE=%WINDIR%\fonts.conf to the environment variables (referring to
>>>>>>>> https://forums.wesnoth.org/viewtopic.php?t=22821),
>>>>>>>> but it is still not resolved.
>>>>>>>>
>>>>>>>> I also tried: text2image.exe --list_available_fonts
>>>>>>>> After hitting Enter, I got:
>>>>>>>> Fontconfig warning: "/tmp\fonts.conf", line 4: empty font directory name ignored
>>>>>>>>
>>>>>>>> The contents of the fonts.conf file that gets created are:
>>>>>>>> <?xml version="1.0"?>
>>>>>>>> <!DOCTYPE fontconfig SYSTEM "fonts.dtd">
>>>>>>>> <fontconfig>
>>>>>>>> <dir></dir>
>>>>>>>> <cachedir>/tmp</cachedir>
>>>>>>>> <config></config>
>>>>>>>> </fontconfig>
>>>>>>>>
>>>>>>>> Can you please help me resolve this? Or am I giving the
>>>>>>>> correct tesstrain.sh command and arguments?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Samruddhi
>>>>>>>> On Monday, August 30, 2021 at 5:12:21 PM UTC+5:30 zdenop wrote:
>>>>>>>>
>>>>>>>>> First of all: use quotes for multi-word names, or escape
>>>>>>>>> spaces/special symbols (e.g. --font="Arial Bold")
>>>>>>>>> Next: fix the error message: "Unable to open
>>>>>>>>> '/tmp/font_tmp.hbC9F3LEQX/fonts.conf' for writing"
>>>>>>>>> Next: check the available fonts for text2image with the option
>>>>>>>>> --list_available_fonts
>>>>>>>>> etc...
>>>>>>>>>
>>>>>>>>> PS: I would suggest using Linux for training instead of Windows
>>>>>>>>> (e.g. in WSL[1])
>>>>>>>>> [1] https://docs.microsoft.com/en-us/windows/wsl/install-win10
>>>>>>>>>
>>>>>>>>> Zdenko
>>>>>>>>>
>>>>>>>>> On Mon, 30 Aug 2021 at 12:12, Samruddhi Dhake <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> The text2image error is gone. I am now getting a fontconfig error.
>>>>>>>>>>
>>>>>>>>>> SDE26@DTP-SDE26-IND /cygdrive/c/Program Files/Tesseract-OCR
>>>>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir "C:/Program Files/Tesseract-OCR/langdata" --tessdata_dir "C:/Program Files/Tesseract-OCR/tessdata" --output_dir D:\Test\trainneddata
>>>>>>>>>> Creating new directory D:Testtrainneddata
>>>>>>>>>>
>>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>>> [Mon Aug 30 15:34:53 IST 2021] /cygdrive/c/Program Files/Tesseract-OCR/text2image --fonts_dir=C:/Windows/Fonts --ptsize 12 --font=Arial Bold --outputbase=/tmp/font_tmp.hbC9F3LEQX/sample_text.txt --text=/tmp/font_tmp.hbC9F3LEQX/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.hbC9F3LEQX
>>>>>>>>>> Unable to open '/tmp/font_tmp.hbC9F3LEQX/fonts.conf' for writing
>>>>>>>>>> Fontconfig error: Cannot load default config file
>>>>>>>>>> Could not find font named 'Arial Bold'.
>>>>>>>>>> Please correct --font arg.
>>>>>>>>>> ERROR: Program Program failed. Abort.
>>>>>>>>>>
>>>>>>>>>> I have the Arial Bold font on my machine. I don't know why it cannot be
>>>>>>>>>> found.
>>>>>>>>>> And in the /tmp/ folder there is no font_tmp.hbC9F3LEQX, the folder
>>>>>>>>>> in which fonts.conf cannot be opened for writing.
>>>>>>>>>> How can I resolve this?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Samruddhi
>>>>>>>>>>
>>>>>>>>>> On Wednesday, August 25, 2021 at 8:18:47 PM UTC+5:30 zdenop wrote:
>>>>>>>>>>
>>>>>>>>>>> Honestly, I have no clue what you are doing: text2image is at
>>>>>>>>>>> the same location as the tesseract executable. So if you have tesseract in
>>>>>>>>>>> the path, text2image must work too.
>>>>>>>>>>>
>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>
>>>>>>>>>>> Zdenko
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 25 Aug 2021 at 16:26, Samruddhi Dhake <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> As you suggested, I installed Tesseract v5.0.0 on my Windows
>>>>>>>>>>>> machine (Index of /tesseract (uni-mannheim.de)
>>>>>>>>>>>> <https://digi.bib.uni-mannheim.de/tesseract/>). This included the
>>>>>>>>>>>> training tools too.
>>>>>>>>>>>> I performed all the previous steps (box file, lstmf file, unicharset).
>>>>>>>>>>>>
>>>>>>>>>>>> But after running the tesstrain.sh command in Cygwin, I still
>>>>>>>>>>>> get the following error:
>>>>>>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir "C:/Program Files/Tesseract-OCR/langdata" --tessdata_dir "C:/Program Files/Tesseract-OCR/tessdata" --output_dir D:/Bugs/1206806/folder/trainneddata
>>>>>>>>>>>> Creating new directory D:/Bugs/1206806/folder/trainneddata
>>>>>>>>>>>>
>>>>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>>>>> which: no text2image in (/usr/local/bin:/usr/bin:/cygdrive/c/Program Files/Microsoft MPI/Bin:/cygdrive/c/buildtools:/cygdrive/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/cygdrive/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Program Files/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Python25:/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/Perl/site/bin:/cygdrive/c/Perl/bin:/cygdrive/c/Oracle12C_64bCli/client_1/bin:/cygdrive/c/Oracle12C_32bCli/client_1/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/cygdrive/c/windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/windows/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/DTS/Binn:/cygdrive/c/Program Files/Microsoft/Web Platform Installer:/cygdrive/c/Program Files (x86)/Microsoft ASP.NET/ASP.NET Web Pages/v1.0:/cygdrive/c/Program Files/Microsoft SQL Server/110/Tools/Binn:/cygdrive/c/windows/system32/config/systemprofile/.dnx/bin:/cygdrive/c/Program Files/Microsoft DNX/Dnvm:/cygdrive/c/Program Files (x86)/Windows Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program Files/Microsoft SQL Server/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/cygdrive/c/Program Files (x86)/Oracle/Berkeley DB 12cR1 6.0.20/bin:/cygdrive/c/Program Files/dotnet:/cygdrive/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/170/Tools/Binn:/cygdrive/c/Program Files (x86)/IncrediBuild:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/WINDOWS/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/ManagementStudio:/cygdrive/d/Git/cmd:/cygdrive/c/Users/sde26/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/sde26/.dotnet/tools)
>>>>>>>>>>>> which: no text2image in (./api)
>>>>>>>>>>>> which: no text2image in (./training)
>>>>>>>>>>>> ERROR: 'text2image' not found
>>>>>>>>>>>>
>>>>>>>>>>>> Am I missing something? Can you please guide me?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>> On Tuesday, August 24, 2021 at 5:59:49 PM UTC+5:30 Samruddhi Dhake wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Can you please provide a link to the steps for installing Tesseract and
>>>>>>>>>>>>> the training tools on Windows?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 3:42:48 PM UTC+5:30 Samruddhi Dhake wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> How do I install Tesseract and the training tools on Windows?
>>>>>>>>>>>>>> Do I have to install the Tesseract Windows exe?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 3:20:37 PM UTC+5:30 zdenop wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So there are only 2 possibilities:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Install tesseract and training tools
>>>>>>>>>>>>>>> 2. Learn how to handle & use not installed sw. This
>>>>>>>>>>>>>>> option is not related to tesseract.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, 24 Aug 2021 at 9:17, Samruddhi Dhake <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I haven't installed Tesseract. I have kept it in a folder, and
>>>>>>>>>>>>>>>> I run the exe by giving its path. I built the training tools
>>>>>>>>>>>>>>>> from source code.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To create the box file (I gave the absolute path of tesseract.exe):
>>>>>>>>>>>>>>>> ..\tesseract.exe Dim4.tif Dim4 lstmbox
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To create the .lstmf file:
>>>>>>>>>>>>>>>> tesseract.exe Dim4.tif Dim4 lstm.train
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To create the unicharset:
>>>>>>>>>>>>>>>> unicharset_extractor.exe --output_unicharset ..\own.unicharset ..\langdata\eng\eng.training_text
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And to create the trained data with tesstrain.sh:
>>>>>>>>>>>>>>>> .\src\training\tesstrain.sh --fonts_dir C:\Windows\Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir langdata --tessdata_dir tessdata --output_dir trainneddata
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>>> On Tuesday, August 24, 2021 at 12:24:29 PM UTC+5:30 Samruddhi Dhake wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have generated the training tools from source code.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Monday, August 23, 2021 at 7:09:02 PM UTC+5:30 zdenop wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> How did you install tesseract? Did you also install
>>>>>>>>>>>>>>>>>> training tools?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, 23 Aug 2021 at 15:34, Samruddhi Dhake <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am creating my own traineddata using Tesseract v4.1.1
>>>>>>>>>>>>>>>>>>> on Windows 10.
>>>>>>>>>>>>>>>>>>> I am following the documentation at
>>>>>>>>>>>>>>>>>>> https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have successfully created the .box file and the .lstmf file
>>>>>>>>>>>>>>>>>>> using lstmbox and lstm.train respectively.
>>>>>>>>>>>>>>>>>>> As the next step, I installed Cygwin to run the tesstrain.sh
>>>>>>>>>>>>>>>>>>> command to create the training data,
>>>>>>>>>>>>>>>>>>> but I am getting the error below.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> $ ./src/training/tesstrain.sh --fonts_dir C:/Windows/Fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir ./langdata --tessdata_dir ./tessdata --output_dir ./trainneddata
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> === Starting training for language 'eng'
>>>>>>>>>>>>>>>>>>> which: no text2image in (/usr/local/bin:/usr/bin:/cygdrive/c/Program Files/Microsoft MPI/Bin:/cygdrive/c/buildtools:/cygdrive/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/cygdrive/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Program Files/Intel/Intel(R) Management Engine Components/iCLS:/cygdrive/c/Python25:/cygdrive/c/ProgramData/Oracle/Java/javapath:/cygdrive/c/Perl/site/bin:/cygdrive/c/Perl/bin:/cygdrive/c/Oracle12C_64bCli/client_1/bin:/cygdrive/c/Oracle12C_32bCli/client_1/bin:/cygdrive/c/windows/system32:/cygdrive/c/windows:/cygdrive/c/windows/System32/Wbem:/cygdrive/c/windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/windows/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/100/DTS/Binn:/cygdrive/c/Program Files/Microsoft/Web Platform Installer:/cygdrive/c/Program Files (x86)/Microsoft ASP.NET/ASP.NET Web Pages/v1.0:/cygdrive/c/Program Files/Microsoft SQL Server/110/Tools/Binn:/cygdrive/c/windows/system32/config/systemprofile/.dnx/bin:/cygdrive/c/Program Files/Microsoft DNX/Dnvm:/cygdrive/c/Program Files (x86)/Windows Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program Files/Microsoft SQL Server/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit:/cygdrive/c/Program Files (x86)/Oracle/Berkeley DB 12cR1 6.0.20/bin:/cygdrive/c/Program Files/dotnet:/cygdrive/c/Program Files/Microsoft SQL Server/Client SDK/ODBC/170/Tools/Binn:/cygdrive/c/Program Files (x86)/IncrediBuild:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/WINDOWS/System32/WindowsPowerShell/v1.0:/cygdrive/c/WINDOWS/System32/OpenSSH:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files/Microsoft SQL Server/150/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/Client SDK/ODBC/130/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/DTS/Binn:/cygdrive/c/Program Files (x86)/Microsoft SQL Server/140/Tools/Binn/ManagementStudio:/cygdrive/d/Git/cmd:/cygdrive/c/Users/sde26/AppData/Local/Microsoft/WindowsApps:/cygdrive/c/Users/sde26/.dotnet/tools)
>>>>>>>>>>>>>>>>>>> which: no text2image in (./api)
>>>>>>>>>>>>>>>>>>> which: no text2image in (./training)
>>>>>>>>>>>>>>>>>>> ERROR: 'text2image' not found
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I found that text2image is produced by running the command 'make
>>>>>>>>>>>>>>>>>>> training'.
>>>>>>>>>>>>>>>>>>> Can you please help me with how this can be done on Windows 10?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Samruddhi
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> You received this message because you are subscribed to
>>>>>>>>>>>>>>>>>>> the Google Groups "tesseract-ocr" group.
>>>>>>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails
>>>>>>>>>>>>>>>>>>> from it, send an email to
>>>>>>>>>>>>>>>>>>> [email protected].
>>>>>>>>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/5adf563d-117b-4bd8-a283-dd21e53575f4n%40googlegroups.com
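
A note on the recurring "Unable to open '/tmp/font_tmp.hbC9F3LEQX/fonts.conf' for writing" error quoted in this thread. One possible cause, which nobody in the thread confirms, is a path mismatch: tesstrain.sh runs under Cygwin bash and passes a Cygwin-style /tmp path to text2image, while the text2image.exe from the Windows installer is presumably a native Windows program that does not know where Cygwin's /tmp lives. The sketch below shows a hypothetical helper, to_mixed_path (it is not part of tesstrain.sh), that converts a Cygwin path into a Windows path with forward slashes using Cygwin's cygpath -m. The fallback branch is only an illustration for shells without cygpath and handles just the /cygdrive/<letter>/ form.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of tesstrain.sh: convert a Cygwin-style path
# into a "mixed" Windows path (drive letter plus forward slashes) that a
# native Windows program can open.
to_mixed_path() {
  local p="$1"
  local re='^/cygdrive/([a-zA-Z])(/.*)?$'
  if command -v cygpath >/dev/null 2>&1; then
    # Under Cygwin, cygpath knows where /tmp, /usr, etc. really live on disk.
    cygpath -m "$p"
  elif [[ "$p" =~ $re ]]; then
    # Illustration-only fallback: /cygdrive/<letter>/... can be mapped
    # without knowing the Cygwin installation directory.
    local rest="${BASH_REMATCH[2]:-/}"
    local drive
    drive=$(printf '%s' "${BASH_REMATCH[1]}" | tr '[:lower:]' '[:upper:]')
    printf '%s:%s\n' "$drive" "$rest"
  else
    # Other paths (for example /tmp/...) cannot be mapped here; return as-is.
    printf '%s\n' "$p"
  fi
}

to_mixed_path "/cygdrive/c/Windows/Fonts"   # prints C:/Windows/Fonts
```

Under Cygwin, the result for a /tmp path depends on where Cygwin is installed. Whether handing such a converted path to --fontconfig_tmpdir actually removes the error has not been verified here; running the training under Linux or WSL, as Zdenko suggested, avoids the question entirely.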

