Providing a sample image might help us suggest something.
On Sat, Sep 2, 2017 at 12:05 PM, George Erfesoglou wrote:
> I have a small image 300x300 and there are some smaller fonts like the
> size of this text here that it picks up but larger fonts like *JUST THIS
> BIG *it
Can anyone please shed some light on the major differences between tesseract
3.04 and 4.0?
For the last 4 months, I have been working on a framework using tesseract
3.04.
Is it worthwhile moving to 4.0 now? Will it improve OCR efficiency?
Any suggestions would be highly appreciated.
Regards,
Ashish
Can anyone please give me suggestions on this? How can I tackle inter-word
spaces?
On Wednesday, December 7, 2016 at 5:57:21 PM UTC+5:30, Ashish Goel wrote:
>
> I have an image that should read Luz 4 l. Image quality is good.
>
> Tesseract reads it as Luz4l. (It fails to dete
The following options may help you:
1. Image processing (resize, filter, etc.)
2. tessedit_char_whitelist
On Friday, December 9, 2016 at 2:23:38 PM UTC+5:30, dinh van Chinh wrote:
>
>
>
estion.
>
> The challenge is how to automate this process, any thoughts?
>
>
> On Wednesday, December 7, 2016 at 1:00:59 AM UTC-8, Ashish Goel wrote:
>
>> Crop image into sub images and then OCR. Crop it in different segments.
>>
>> On Saturday, December 3
The image is attached, in case someone wants to look at it. It is in Spanish.
On Wednesday, December 7, 2016 at 5:57:21 PM UTC+5:30, Ashish Goel wrote:
>
> I have an image that should read Luz 4 l. Image quality is good.
>
> Tesseract reads it as Luz4l. (It fails to determine spaces).
>
I have an image that should read Luz 4 l. Image quality is good.
Tesseract reads it as Luz4l. (It fails to determine spaces).
I have tried resizing the image, passing tessedit_char_whitelist, etc., but
it is not helping me.
Can anyone please help me with how I can tell tesseract with number of
Crop image into sub images and then OCR. Crop it in different segments.
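One way the cropping step could be automated is to compute the strip rectangles up front and pass each one to ImageMagick. The sketch below is only an illustration: the helper name `strip_geometries` and the optional overlap are assumptions, not something from this thread. It returns geometry strings in ImageMagick's `WxH+X+Y` form for equal horizontal strips.

```python
def strip_geometries(width, height, strips, overlap=0):
    """Return ImageMagick-style geometry strings (WxH+X+Y) that split an
    image of the given size into horizontal strips.

    `overlap` (hypothetical parameter) extends each strip by that many
    pixels into its neighbours so a text line cut at a boundary still
    appears whole in one of the strips.
    """
    if strips < 1:
        raise ValueError("strips must be at least 1")
    base = height // strips
    geometries = []
    for i in range(strips):
        top = max(0, i * base - overlap)
        # The last strip absorbs any remainder from the integer division.
        if i == strips - 1:
            bottom = height
        else:
            bottom = min(height, (i + 1) * base + overlap)
        geometries.append(f"{width}x{bottom - top}+0+{top}")
    return geometries
```

Each string can then be used as `convert in.png -crop <geometry> +repage part.png` before running tesseract on the part. How many strips work best will depend on the receipt layout.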
On Saturday, December 3, 2016 at 5:54:51 PM UTC+5:30, Marie wrote:
>
> Hi,
>
> We are trying to recognize receipt using Tesseract (v3.02 on
> Windows). Tried to process the images but the words accuracy (comparing
> with
Size is the problem. I reduced its size (using imagemagick) and the error
went away.
convert 1.tif -resize 70% 2.tif
On Thursday, September 8, 2016 at 4:08:47 PM UTC+5:30, George Papadopoulos
wrote:
>
> Hello,
>
> I am using the following version of tesseract on Windows 7. I have also
>
Did you try increasing the size of the image?
On Friday, September 2, 2016 at 12:03:51 PM UTC+5:30, ahs...@gmail.com
wrote:
>
> So i'm trying to ocr the following images but looks we its not doing it
> 100%. six is written as five. nine is written as 3. Any suggestions?
>
>
>
Can you be specific about what kind of image processing you did using
ImageMagick?
Is this your original image? What image goes to tesseract?
If this is your original image, then I would have to at least rotate, crop
and resize this image to localize it to the meter reading area of the image.
If
Tesseract generally needs an image of at least 300 dpi for good
results.
I would suggest resizing the image and applying a filter to improve
detection.
I generally use ImageMagick for this purpose.
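As a rough sketch of the resize step, the percentage to pass to ImageMagick can be derived from the image's current resolution. The helper names below are hypothetical, and the choice never to shrink below 100% is an assumption of this sketch (other posts in this list report cases where shrinking helped).

```python
import math


def resize_percent(current_dpi, target_dpi=300):
    """Percentage by which to scale an image so it reaches target_dpi.

    Returns at least 100, i.e. this sketch never shrinks the image.
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return max(100, math.ceil(100 * target_dpi / current_dpi))


def convert_resize_args(src, dst, current_dpi, target_dpi=300):
    """Build the argument list for an ImageMagick `convert ... -resize N%` call."""
    percent = resize_percent(current_dpi, target_dpi)
    return ["convert", src, "-resize", f"{percent}%", dst]
```

The resulting list can be passed to `subprocess.run`. The current dpi has to come from the image metadata or from knowledge of how the image was captured.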
On Thursday, August 25, 2016 at 3:28:33 PM UTC+5:30, Mikey wrote:
>
> I write java
Instead of retraining the font, you should focus on pre-processing the image. One
option that worked in this particular case was resizing the image.
This is what I did (after it, tesseract was able to read the image):
$ convert a.png -resize 170% b.png
$ tesseract b.png stdout -l eng --tessdata-dir
Thanks for the reply. That narrows down my options.
On Tuesday, July 19, 2016 at 8:07:09 PM UTC+5:30, zdenop wrote:
>
> No. Tesseract needs for correct OCR result specification of language of
> input image
>
>
> Zdenko
>
> On Tue, Jul 19, 2016 at 8:47 AM, Ashish
Thanks for the reply, but I am looking for a solution that I can integrate
into my custom application.
I have no idea whether I can make use of the Google Drive application for this
purpose.
On Tuesday, July 19, 2016 at 12:17:20 PM UTC+5:30, Ashish Goel wrote:
>
> I have 100s of images in dif
I have 100s of images in different languages that I need to OCR. Presently,
I need to know the language of each image in advance and pass the language
parameter (e.g. -l deu or -l dan).
Is there a way to figure out the language of the image
automatically?
It is weird but
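One partial workaround, short of real language detection, is that the tesseract command line accepts several languages joined with `+` in the `-l` option. The sketch below only builds such a command; the function name is hypothetical, and loading several languages together may be slower and less accurate than passing the single correct one.

```python
def tesseract_multilang_cmd(image, langs):
    """Build a tesseract command that loads several languages at once.

    `langs` is a list of traineddata codes such as ["deu", "dan"]; they
    are joined with '+' as the -l option expects.
    """
    if not langs:
        raise ValueError("at least one language code is required")
    return ["tesseract", image, "stdout", "-l", "+".join(langs)]
```

For example, `tesseract_multilang_cmd("page.png", ["deu", "dan"])` yields a list suitable for `subprocess.run`, provided both traineddata files are installed.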
Zdenko,
Thanks for your reply. I will try with the standard distro and let you know if it
works.
Ashish
On Monday, June 6, 2016 at 4:38:11 PM UTC+5:30, Ashish Goel wrote:
>
> Hello All,
>
> I am trying to do OCR on a bunch of images. Getting some failures, and I
> want to analyse th
Ubuntu 12.04
On Monday, June 6, 2016 at 4:38:11 PM UTC+5:30, Ashish Goel wrote:
>
> Hello All,
>
> I am trying to do OCR on a bunch of images. Getting some failures, and I
> want to analyse them.
> So, to do that, I am trying to get the tessinput.tif file so that I can
>
(libjpeg-turbo 1.2.0) : libpng 1.2.46 : libtiff 3.9.5 : zlib
1.2.3.4
but still tessinput.tif is blank.
Is there anything else that I can try so that I can get tessinput.tif?
Thanks
Ashish
On Monday, June 6, 2016 at 4:38:11 PM UTC+5:30, Ashish Goel wrote:
>
> Hello All,
>
> I am tryi
I had the same problem with Swedish, and a temporary workaround helped
me. I zoomed (re-scaled) the image to 400% and it recognized the letter
(though it introduced other problems). I am not sure, but it could improve results
for you.
Ashish
On Mon, Jun 6, 2016 at 8:53 PM, Tom Morris
>
> On Mon, Jun 6, 2016 at 1:08 PM, Ashish Goel <goelk...@gmail.com> wrote:
>
>> Hello All,
>>
>> I am trying to do OCR on a bunch of images. Getting some failures, and I
>> want to analyse them.
>> So, to do that, I am trying to get the tessin
Hello All,
I am trying to do OCR on a bunch of images. Getting some failures, and I
want to analyse them.
So, to do that, I am trying to get the tessinput.tif file so that I can
find out what input actually goes to tesseract.
I am passing "-c tessedit_write_images 1" along with my tesseract to
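For reference, the tesseract command line documents `-c` as taking a single `VAR=VALUE` argument, so it may be worth checking that form. The sketch below just assembles such a command from a dictionary of config variables; the function name and the output base `out` are hypothetical, and it makes no claim about whether this resolves the blank tessinput.tif.

```python
def tesseract_config_cmd(image, outbase, config_vars):
    """Build a tesseract command with each config variable passed as
    a separate `-c name=value` pair."""
    cmd = ["tesseract", image, outbase]
    for name, value in config_vars.items():
        cmd += ["-c", f"{name}={value}"]
    return cmd
```

For example, `tesseract_config_cmd("in.png", "out", {"tessedit_write_images": 1})` produces a list that can be handed to `subprocess.run`.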
If you can elaborate on what kind of failures you are experiencing, people
might be able to help.
On Monday, June 6, 2016 at 12:47:29 PM UTC+5:30, Doron Saar wrote:
>
> Hi,
>
> I'm trying to train Tesseract to work with a large library of Hebrew
> language documents.
> They are all in good
I would also like to find a way to avoid such cases. I am also facing some cases
where I get extra white spaces, lower/upper case mismatches and wrong
detection of characters...
On Tuesday, May 31, 2016 at 11:40:28 PM UTC+5:30, Diederik Hattingh wrote:
>
> I have a case where my tesseract isn't