Re: [tesseract-ocr] ERROR: Program text2image failed. Abort.

2019-09-06 Thread Zdenko Podobny
Do you understand the error message ("/usr/local/bin/text2image: error while loading shared libraries: libtesseract.so.5: cannot open shared object file: No such file or directory")? IMO it is clear. Zdenko On Fri, Sep 6, 2019 at 17:07, Jundong Qiao wrote: > Hi all, > > I am generating training
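One way to act on the error quoted above, as a sketch only. It assumes a Linux system with `ldd`, and that libtesseract.so.5 was installed under /usr/local/lib; the thread does not say where the library actually lives. The helper name `missing_libs` is made up for this illustration.

```shell
# Print the names of shared libraries that the dynamic loader cannot resolve.
# Reads `ldd` output on stdin; missing_libs is a hypothetical helper name.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage (commented out so that sourcing this file runs nothing):
#   ldd /usr/local/bin/text2image | missing_libs
#
# If libtesseract.so.5 is listed and the library sits in /usr/local/lib
# (an assumption), refreshing the loader cache often resolves it:
#   sudo ldconfig
# or, for the current shell only:
#   export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
```

If the library was installed somewhere else, the same check applies, but the directory in the last two commands has to be adjusted.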

[tesseract-ocr] ERROR: Program text2image failed. Abort.

2019-09-06 Thread Jundong Qiao
Hi all, I am generating training files for Tesseract after installing all the necessary packages. My code: tesstrain.sh --fonts_dir fonts \ --fontlist "OCR-A Medium" \ --lang eng \ --linedata_only \ --langdata_dir langdata_lstm \ --tessdata_dir tesseract/tessdata

Re: [tesseract-ocr] summarizing LSTM

2019-09-06 Thread Timothy Snyder
The link for my second sentence ^ https://githubharald.github.io/ On Fri, Sep 6, 2019 at 9:24 AM Timothy Snyder wrote: > This page goes into a little more detail than the VGSL spec page in the > Tesseract repo: > https://github.com/mldbai/tensorflow-models/blob/master/street/g3doc/vgslspecs.md

Re: [tesseract-ocr] summarizing LSTM

2019-09-06 Thread Timothy Snyder
This page goes into a little more detail than the VGSL spec page in the Tesseract repo: https://github.com/mldbai/tensorflow-models/blob/master/street/g3doc/vgslspecs.md Not specific to Tesseract, but this guy's articles have good info on lower-level neural net mechanics in relation to OCR. On

Re: [tesseract-ocr] summarizing LSTM

2019-09-06 Thread Purushotham Rao Eravalli
It would be great if you could point to any source with detailed information about the architecture used in Tesseract and its loss functions. Thanks On Fri, Sep 6, 2019, 6:39 PM Timothy Snyder wrote: > Do you want to learn more about neural networks or specifically, a >

Re: [tesseract-ocr] How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Purushotham Rao Eravalli
Thank you, will try 13 and see. On Fri, Sep 6, 2019, 6:37 PM Timothy Snyder wrote: > If you're doing recognition on a single line of text, use --psm 13 or > --psm 7. > > They're both for single-line images, but I've had higher accuracy with 13 > than with 7. > > On Fri, Sep 6, 2019 at 6:18 AM

Re: [tesseract-ocr] summarizing LSTM

2019-09-06 Thread Timothy Snyder
Do you want to learn more about neural networks or specifically, a "summarizing LSTM" in a neural network? On Fri, Sep 6, 2019 at 5:05 AM Youcef wrote: > Hi, > > On the page https://github.com/tesseract-ocr/tesseract/wiki/VGSLSpecs from > the official GitHub repo, it talks about "summarizing LSTM"

Re: [tesseract-ocr] How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Timothy Snyder
If you're doing recognition on a single line of text, use --psm 13 or --psm 7. They're both for single-line images, but I've had higher accuracy with 13 than with 7. On Fri, Sep 6, 2019 at 6:18 AM Purushotham Rao Eravalli < purushot...@sukshi.com> wrote: > Will it still do detection for that passed
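The suggestion above can be tried with a small wrapper. This is a sketch, not something taken from the thread: the function name `ocr_line`, the file name `line.png`, and the `TESSERACT` override variable are all made up for illustration. It assumes the `tesseract` command-line tool is on the PATH and uses the special output base `stdout` to print the recognized text.

```shell
# Run Tesseract on an image that contains one line of text.
# Usage: ocr_line IMAGE [PSM]   (PSM defaults to 13, "raw line")
# TESSERACT is a hypothetical override for the binary name, handy for dry runs.
ocr_line() {
    "${TESSERACT:-tesseract}" "$1" stdout --psm "${2:-13}"
}

# Examples (commented out so that sourcing this file runs nothing):
#   ocr_line line.png        # --psm 13: raw line
#   ocr_line line.png 7      # --psm 7: single text line
```

Running both modes on the same cropped image is a quick way to see whether 13 or 7 behaves better on a given kind of input.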

Re: [tesseract-ocr] How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Purushotham Rao Eravalli
Will it still do detection for that passed segmented image, or does it run recognition directly? On Fri, Sep 6, 2019, 3:13 PM Ravi Annaswamy wrote: > If I understand correctly, you want to do the segmentation into blocks or > lines and then send only those panels to Tesseract? > > If that is the

Re: [tesseract-ocr] How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Ravi Annaswamy
If I understand correctly, you want to do the segmentation into blocks or lines and then send only those panels to Tesseract? If that is the case, you can export the panels into separate images and run tesseract on them. Ravi > On Sep 6, 2019, at 2:25 AM, Purushotham Rao
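A rough sketch of the "export panels and run tesseract on them" idea. The thread does not say which tool to use for cropping; ImageMagick's `convert` is an assumption here, and the function names, file name, and coordinates are hypothetical.

```shell
# Build an ImageMagick crop geometry "WxH+X+Y" from left, top, width, height.
crop_geometry() {
    printf '%sx%s+%s+%s\n' "$3" "$4" "$1" "$2"
}

# Crop one panel out of a page image and recognize only that panel.
# Usage: ocr_panel PAGE LEFT TOP WIDTH HEIGHT [PSM]
# PSM defaults to 6 (single uniform block); pass 7 or 13 for a single line.
ocr_panel() {
    panel=$(mktemp "${TMPDIR:-/tmp}/panel.XXXXXX") || return 1
    # The png: prefix forces the output format, since the temp file has no extension.
    convert "$1" -crop "$(crop_geometry "$2" "$3" "$4" "$5")" +repage "png:$panel" &&
        tesseract "$panel" stdout --psm "${6:-6}"
    status=$?
    rm -f "$panel"
    return "$status"
}

# Example (commented out so that sourcing this file runs nothing):
#   ocr_panel page.png 120 300 400 60 7
```

The coordinates come from whatever external segmentation step produced the panels; the sketch only handles the cropping and the Tesseract call.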

Re: [tesseract-ocr] Fine tuning existing model

2019-09-06 Thread Lorenzo Bolzani
Hi Ayush, PSM 6 and 7 do some extra pre-processing of the image; 13 does much less. Unless your image contains text like this: I would not expect much difference between PSM 6/7 and 13. While PSM 13 solves some problems, I got more "ghost letters" errors (letters that are repeated

[tesseract-ocr] Re: How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Purushotham Rao Eravalli
Yes, basically I will have a single line of text like "Eravalli Purushotham Rao" in the cropped image. Can you provide a link to a solution? On Friday, September 6, 2019 at 11:55:16 AM UTC+5:30, Purushotham Rao Eravalli wrote: > > How do we pass coordinate to tesseract so that we escape

[tesseract-ocr] Re: How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Youcef
Hi, If you want to skip detection, I suppose it is because you have a single word in your image, right? In that case, just run tesseract with the page segmentation mode (the psm parameter) set to "single word" mode, which corresponds to "--psm 8" (see "tesseract --help-extra" for more details) On

[tesseract-ocr] summarizing LSTM

2019-09-06 Thread Youcef
Hi, On the page https://github.com/tesseract-ocr/tesseract/wiki/VGSLSpecs from the official GitHub repo, it talks about "summarizing LSTM" and how we can manage to recognize images of both variable height and variable width. Is there a paper, a blog or anything else that provides a more explicit

Re: [tesseract-ocr] Fine tuning existing model

2019-09-06 Thread Ayush Pandey
Hi Lorenzo. The empty output was because I was using 7 as the PSM parameter. Using 13 as the PSM parameter completely eliminated the problem. On Friday, September 6, 2019 at 12:34:22 PM UTC+5:30, Lorenzo Blz wrote: > > Can you please share an example? > > An empty output usually means that

Re: [tesseract-ocr] Fine tuning existing model

2019-09-06 Thread Lorenzo Bolzani
Can you please share an example? An empty output usually means that it failed to recognize the black parts as text; this could be because the text is too big or too small, or because of a wrong DPI setting. Or the image is not reasonably clean. To better understand the problem you can try to downscale the

[tesseract-ocr] How do we pass coordinate to tesseract so that we escape detection process and run only recognition using tesseract

2019-09-06 Thread Purushotham Rao Eravalli
How do we pass coordinates to tesseract so that we skip the detection process and run only recognition?