[tesseract-ocr] Re: Improve recognize russian chars

2014-09-19 Thread bulkinvk
Did not help... -- You received this message because you are subscribed to the Google Groups tesseract-ocr group. To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com. To post to this group, send email to

[tesseract-ocr] Re: Improve recognize russian chars

2014-09-19 Thread bulkinvk
Should I enlarge the picture (x3)? Or increase the DPI on the scanner?

Re: [tesseract-ocr] Re: Improve recognize russian chars

2014-09-19 Thread Shree Devi Kumar
Increase the scanner resolution to at least 300 dpi and pre-process the image; see the tips given at https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality. As a test, I saved a screenshot of a Russian Wikipedia page. Attached is the image and its output, and also from a blurred version of same
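
The "x3" enlargement asked about above and the 300 dpi advice are related by simple arithmetic. A minimal sketch, assuming the source resolution is known; the function names and the 96 dpi figure for a screenshot are illustrative assumptions, not something stated in the thread or part of Tesseract:

```python
def upscale_factor(current_dpi: float, target_dpi: float = 300.0) -> float:
    """Return the factor by which to enlarge an image so that its
    effective resolution reaches target_dpi. Never shrinks (minimum 1.0)."""
    if current_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    return max(1.0, target_dpi / current_dpi)


def scaled_size(width: int, height: int, current_dpi: float,
                target_dpi: float = 300.0) -> tuple[int, int]:
    """Return the (width, height) in pixels after upscaling to target_dpi."""
    factor = upscale_factor(current_dpi, target_dpi)
    return (round(width * factor), round(height * factor))


if __name__ == "__main__":
    # Assumption: a screenshot rendered at roughly 96 dpi.
    print(upscale_factor(96))         # about 3x, close to the "x3" in the question
    print(scaled_size(800, 600, 96))  # pixel size to resize to before OCR
```

Rescanning at a higher resolution captures real detail, whereas enlarging an existing image only interpolates, so the factor above is a fallback for when rescanning is not possible.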

[tesseract-ocr] Re: Need help reg pre-processing of image before ocr

2014-09-19 Thread Shree Devi Kumar
Do you still need a copy of the Sanskrit traineddata? Shree Devi Kumar भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com On Fri, Aug 23, 2013 at 10:21 PM, mns_rao mns...@gmail.com wrote: Hi, The result output of OCR also depends on

[tesseract-ocr] Re: How to get paragraph wise text in Tesseract ?

2014-09-19 Thread Satya Swaroop
Yes, but it did not solve my issue. On Thursday, September 18, 2014 10:21:39 PM UTC+5:30, Albrecht Hilker wrote: Did you try SetPageSegMode(PSM_AUTO)

[tesseract-ocr] Re: Tesseract recognizes the characters irrespective of the lines

2014-09-19 Thread Satya Swaroop
I am also facing the same problem. Please post your answer once you find it. Thanks in advance. On Tuesday, September 9, 2014 6:58:15 PM UTC+5:30, Dineshkumar wrote: What steps will reproduce the problem? 1. Run the Tesseract OCR in Java for the attached image 2. Save the OCR result in a

Re: [tesseract-ocr] Modification of background image allowed in PDF output?

2014-09-19 Thread zdenko podobny
This is a known issue - try the current code from the git repository. It should be fixed. Zdenko On Fri, Sep 19, 2014 at 2:38 PM, Frank Siegert frank.sieg...@googlemail.com wrote: Dear all, I have been testing tesseract to embed OCR in scanned PDF documents, and it works phenomenally well in

Re: [tesseract-ocr] Modification of background image allowed in PDF output?

2014-09-19 Thread Frank Siegert
Dear Zdenko, Thanks for the quick reply! Does that mean in general, i.e. except for this bug, that I can by construction assume the image will remain unmodified and only a text layer added? Cheers, Frank On Friday, September 19, 2014 2:54:52 PM UTC+2, zdenop wrote: This is known issue -

Re: [tesseract-ocr] Modification of background image allowed in PDF output?

2014-09-19 Thread zdenko podobny
Well, yes and no ;-) Yes - there should be no change to the image, but no - you need to expect that (re)compression of the input image by the pdf renderer could take place. See the comments on issue 1285[1] for more details. [1] https://code.google.com/p/tesseract-ocr/issues/detail?id=1285 Zdenko On Fri, Sep

[tesseract-ocr] version 3.04

2014-09-19 Thread Rick Leir
Ubuntu 14.04 has tess 3.03 and lept 1.70. I compiled tess 3.04 and lept 1.71, and installed them (and ran ldconfig so the new libraries would get used). Is it ok to use the old tessdata from 3.03 that was installed from the Ubuntu package? I start tess with $

Re: [tesseract-ocr] version 3.04

2014-09-19 Thread zdenko podobny
There is no tesseract 3.04 - so you cannot install it. Your question indicates that you do not understand the consequences of your action, so I strongly suggest you revert to the last stable release, which is 3.02.02. Zdenko On Fri, Sep 19, 2014 at 8:31 PM, Rick Leir rich...@c7a.ca wrote: Ubuntu