[tesseract-ocr] Similar pictures, different results

2018-05-15 Thread yang3781590
There are two similar pictures; the only difference between them is the size of the white border. The result for one (3.png) is right, but the result for the other (4.png) is wrong. I don't know why. Can you help me? I used jTessBoxEditor to view the boxes, and it shows that Tesseract has boxed the right part.
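
The thread does not establish why the border size changes the result. One way to test whether the margin is the cause is to give both images the same white border before running OCR. The following is a minimal sketch of that idea, assuming a grayscale image held as a list of pixel rows; `pad_with_white` is a hypothetical helper, not part of Tesseract or jTessBoxEditor, and a real workflow would use an imaging library on the actual files.

```python
def pad_with_white(pixels, border, white=255):
    """Return a copy of a grayscale image (a list of rows) with a white margin.

    `pixels` is assumed to be a rectangular list of lists of 0-255 values.
    `border` is the margin width in pixels, added on all four sides.
    """
    if border < 0:
        raise ValueError("border must be non-negative")
    width = len(pixels[0]) if pixels else 0
    new_width = width + 2 * border
    # Rows of pure white above the original image.
    padded = [[white] * new_width for _ in range(border)]
    # Original rows, each extended with white on the left and right.
    for row in pixels:
        padded.append([white] * border + list(row) + [white] * border)
    # Rows of pure white below the original image.
    padded.extend([white] * new_width for _ in range(border))
    return padded


# Usage: a 2x2 black square gains a 1-pixel white margin and becomes 4x4.
square = [[0, 0], [0, 0]]
padded_square = pad_with_white(square, 1)
```

Applying the same margin to both 3.png and 4.png and comparing the output would show whether the border alone accounts for the difference.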

[tesseract-ocr] Error in executing new .traineddata file

2018-05-15 Thread Eman Sawalha
Hello. Recently I worked on training Tesseract to detect Old South Arabian script, and I produced the .traineddata file. To test it, I copied the file into the tessdata folder inside the Tesseract installation. My problem is that whenever I try to execute it in cmd.exe, it gives me this

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread reza
Hi again, thanks for your reply. I need more fonts, for example: B Koodak, B Lotus, B Titr, B Zar, B Yekan, Iran Nastaliq. If needed, shall I send the .ttf files of those fonts? Thanks.
On Tuesday, May 15, 2018 at 5:35:10 PM UTC+4:30, shree wrote:
> I will try to put together complete steps.
> I am

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread ShreeDevi Kumar
I will try to put together complete steps. I am doing a test run for training Persian. Are the following fonts OK for it?
'55_Sarchia_Kurdish' \
'56_Sarchia_Kurdish_Bold Bold' \
'Amiri' \
'Arabic Typesetting' \
'Arial' \
'Arial Unicode MS' \
'B Nazanin' \
'B Nazanin Bold' \

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread reza
I tested it on Ubuntu, and it raised an error there too. Could you help me and send me a new bash file for fine-tuning with new fonts? I put the "eng.traineddata" file in the tessdata_best folder, and "eng.training_text" and "eng.traineddata" in langdata\eng. Is that correct and sufficient, or are more files needed? Thanks.
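
Whether that set of files is sufficient is not answered in this snippet. As a first check, though, the layout described above can be verified before running the training script. This is a minimal sketch, assuming the folders sit under one common root directory; `missing_files` and `EXPECTED` are hypothetical names, and the paths are only the ones named in the message, not an authoritative list of what training needs.

```python
from pathlib import Path

# Files named in the message, relative to an assumed common root directory.
EXPECTED = [
    "tessdata_best/eng.traineddata",
    "langdata/eng/eng.training_text",
    "langdata/eng/eng.traineddata",
]


def missing_files(root, relative_paths):
    """Return the entries of `relative_paths` that are not regular files under `root`."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


if __name__ == "__main__":
    # Report anything absent under the current directory.
    for rel in missing_files(".", EXPECTED):
        print(f"missing: {rel}")
```

An empty result only confirms that the listed files exist; it says nothing about whether other files are also required.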

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread ShreeDevi Kumar
Please use the latest Windows binaries from https://github.com/UB-Mannheim/tesseract/wiki provided by @stweil. How do you run a bash script on Windows 10? @stweil, I have not tried training on Windows. Do you have feedback from others who have tried it? ShreeDevi

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread reza
Thanks for the reply. Tesseract 4 beta, Windows 10.
On Tuesday, May 15, 2018 at 1:12:20 PM UTC+4:30, shree wrote:
> What o/s are you running it on?
> Which version of tesseract?
>> ICU ERROR: U_FILE_ACCESS_ERROR
>> ERROR: /tmp/tmp.6m4B2TUln1/eng/eng.unicharset does not exist or is not readable

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread reza
Windows 10, Tesseract 4 alpha.
On Tuesday, May 15, 2018 at 1:12:20 PM UTC+4:30, shree wrote:
> What o/s are you running it on?
> Which version of tesseract?
>> ICU ERROR: U_FILE_ACCESS_ERROR
>> ERROR: /tmp/tmp.6m4B2TUln1/eng/eng.unicharset does not exist or is not readable
> which version

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread ShreeDevi Kumar
What o/s are you running it on? Which version of tesseract?
> ICU ERROR: U_FILE_ACCESS_ERROR
> ERROR: /tmp/tmp.6m4B2TUln1/eng/eng.unicharset does not exist or is not readable
Which version of the ICU library?
ShreeDevi भजन - कीर्तन - आरती @

Re: [tesseract-ocr] train more fonts on trained model fas in tesseract

2018-05-15 Thread reza
I used the attached finetune.sh file, but it raised an error. Could you help me? Thanks.
> ## MAKING TRAINING DATA ##
>> === Starting training for language 'eng'
>> [Tue, May 15, 2018 11:42:36 AM] /c/Program Files (x86)/Tesseract-OCR/text2image --fonts_dir=C:WindowsFonts