Re: [tesseract-ocr] Spanish text better processed in eng than in spa

2017-08-28 Thread ShreeDevi Kumar
> Btw, is there any way to tell tesseract that values are in a table, so that it will not make a mistake identifying lines with charts?
I don't think tesseract has that ability. You will need to preprocess the image to remove lines. Leptonica has functions to do that, as well as a table detector.
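To illustrate the kind of preprocessing meant by "remove lines", here is a minimal sketch in plain Python. It is not Leptonica's method and not part of tesseract. It assumes the image has already been binarized into a list of rows of 0/1 values, and the function name remove_long_runs and the min_len threshold are hypothetical. It clears any horizontal or vertical run of foreground pixels that is at least min_len long, on the assumption that table rules are much longer than character strokes.

```python
def remove_long_runs(image, min_len):
    """Return a copy of a binary image (list of rows of 0/1) in which every
    horizontal or vertical run of 1s at least min_len pixels long is cleared.

    Assumption: table rules are long straight runs; glyph strokes are shorter.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    to_clear = set()

    # Scan each row for long horizontal runs of foreground pixels.
    for y in range(height):
        x = 0
        while x < width:
            if image[y][x]:
                start = x
                while x < width and image[y][x]:
                    x += 1
                if x - start >= min_len:
                    to_clear.update((y, c) for c in range(start, x))
            else:
                x += 1

    # Scan each column for long vertical runs of foreground pixels.
    for x in range(width):
        y = 0
        while y < height:
            if image[y][x]:
                start = y
                while y < height and image[y][x]:
                    y += 1
                if y - start >= min_len:
                    to_clear.update((r, x) for r in range(start, y))
            else:
                y += 1

    # Build the cleaned copy; the input image is left unmodified.
    return [
        [0 if (y, x) in to_clear else image[y][x] for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    page = [
        [1, 1, 1, 1, 1, 1, 1],  # a table rule
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0, 0, 0],  # part of a small glyph
        [0, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ]
    for row in remove_long_runs(page, min_len=5):
        print("".join("#" if p else "." for p in row))
```

Note that a rule crossing a character also removes the stroke pixels that lie on the rule, so real scans usually need a more careful approach, such as the Leptonica functions mentioned above.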

Re: [tesseract-ocr] Spanish text better processed in eng than in spa

2017-08-28 Thread ShreeDevi Kumar
I had not checked the list. It should actually be Latin.traineddata for all languages written in Latin script, not Spanish, as I had written.
On 29-Aug-2017 3:54 AM, wrote:
> So... I have installed the default tessdata used by the installer, which seems to be this

Re: [tesseract-ocr] Spanish text better processed in eng than in spa

2017-08-28 Thread valentin . depablo
So... I have installed the default tessdata used by the installer, which seems to be this one: https://github.com/tesseract-ocr/tessdata/blob/master/spa.traineddata Following your comment, I have installed this package: https://github.com/tesseract-ocr/tessdata/blob/master/best/spa.traineddata

[tesseract-ocr] tesseract is not working for straightforward image

2017-08-28 Thread Lada Tylich
Hi, I am confused that for the attached image, with the parameter *-psm 7*, it gives the result *88C*. It should be able to recognize such a picture, I guess. Am I missing something? Thanks for any response. P.S.: Sorry if this is a duplicate (this is the 2nd post, because I have lost the first one.. :/. - if

[tesseract-ocr] wrong results to straightforward image

2017-08-28 Thread Lada Tylich
Hi to all! I am wondering why tesseract in version 3.0.3 (leptonica is in version 1.74.4) cannot clearly recognize even pictures like the one attached, without additional training. With the parameter *-psm 7* I got *88C* as the result. Am I missing something? I guess such images should be pretty ok

Re: [tesseract-ocr] Error:Assert failed:in file ../lstm/lstmtrainer.h, line 110

2017-08-28 Thread ShreeDevi Kumar
Please see https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#fine-tuning-for-impact
The following command extracts the .lstm file from the .traineddata file.
  training/combine_tessdata -e tessdata/best/eng.traineddata \
    ~/tesstutorial/impact_from_full/eng.lstm
ShreeDevi
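For anyone scripting this step, the extraction command quoted above can be wrapped as in the sketch below. The -e flag and the paths come from the message itself. The function names extract_lstm_command and extract_lstm are hypothetical, and the sketch assumes combine_tessdata is available at the given relative path.

```python
import os
import subprocess


def extract_lstm_command(traineddata, lstm_out,
                         tool="training/combine_tessdata"):
    """Build the argument list for `combine_tessdata -e`.

    A leading "~" is expanded here because no shell is involved when the
    command is run through subprocess.
    """
    return [
        tool,
        "-e",
        os.path.expanduser(traineddata),
        os.path.expanduser(lstm_out),
    ]


def extract_lstm(traineddata, lstm_out, tool="training/combine_tessdata"):
    """Run the extraction; raises CalledProcessError if the tool fails."""
    subprocess.run(extract_lstm_command(traineddata, lstm_out, tool),
                   check=True)


if __name__ == "__main__":
    # Only prints the command; call extract_lstm(...) to actually run it.
    print(extract_lstm_command(
        "tessdata/best/eng.traineddata",
        "~/tesstutorial/impact_from_full/eng.lstm",
    ))
```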

Re: [tesseract-ocr] Error:Assert failed:in file ../lstm/lstmtrainer.h, line 110

2017-08-28 Thread Ava Nimaee
Hi Shree, I read the instructions on the training wiki page, but I don't have eng.lstm, and none of the commands create eng.lstm. How can I create it? I even checked the langdata that I downloaded from git, and it isn't there. I have spent a lot of time on this, but I don't know how to create it. Can you tell me? On

Re: [tesseract-ocr] Calling Resource sha1 is disabled! Use Resource sha256 instead Error while installing tesseract in mac

2017-08-28 Thread ShreeDevi Kumar
Try
  $ brew update
  $ brew install tesseract --HEAD
ShreeDevi भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Mon, Aug 28, 2017 at 12:33 PM, Mahesh Mesta wrote:
> Hello,

[tesseract-ocr] Calling Resource sha1 is disabled! Use Resource sha256 instead Error while installing tesseract in mac

2017-08-28 Thread Mahesh Mesta
Hello, I am trying to install tesseract on a Mac to work with the Ruby gem tesseract-ocr. However, it seems like the

Re: [tesseract-ocr] Spanish text better processed in eng than in spa

2017-08-28 Thread ShreeDevi Kumar
Have you tried with the 'best' traineddata files? What about results using best/Spanish vs best/spa? I have opened this as an issue at https://github.com/tesseract-ocr/tessdata/issues/77 You can provide additional feedback there. ShreeDevi