Update: on a batch of 60 meters, I was able to get 46 meters
recognized.

First I ran a batch script that runs tesseract on every .tif and names the
output <picture name>.txt.
Then, I simply wrote a batch script to compare a text file of known
meter numbers against every tesseract output file using findstr.
The results show up as <picture name>.tif:<picture name>.txt.
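
For anyone who would rather not use findstr, here is a rough Python sketch of
the same comparison step. It is only an illustration, not the batch script
described above: the function names are made up, and it assumes the known
meter numbers are stored one per line as plain digits (e.g. 76207799) in a
file kept outside the OCR output folder. Whitespace is stripped from each OCR
line first, so a number printed as "76 207 799" still matches.

```python
import re
from pathlib import Path


def match_meters(ocr_text, known_numbers):
    """Return the known meter numbers found on any single line of OCR text.

    Whitespace is removed per line (not across lines), so digits from
    unrelated lines are never glued together into a false match.
    """
    lines = ["".join(line.split()) for line in ocr_text.splitlines()]
    return [n for n in known_numbers if any(n in line for line in lines)]


def scan_folder(folder, known_file):
    """Map each OCR output file (by picture name) to the meter numbers found.

    Assumption: `folder` holds the <picture name>.txt files written by
    tesseract, and `known_file` lists one meter number per line.
    """
    known_path = Path(known_file)
    known = [n.strip() for n in known_path.read_text().splitlines() if n.strip()]
    results = {}
    for txt in sorted(Path(folder).glob("*.txt")):
        if txt.resolve() == known_path.resolve():
            continue  # skip the list itself if it happens to sit in the folder
        results[txt.stem] = match_meters(txt.read_text(errors="ignore"), known)
    return results
```

Pictures that come back with an empty list are the ones that still need a
manual look.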

Is there any way to optimize the pictures to make the text easier to
read before processing?  I tried converting to grayscale last night,
but it actually hurt the results.  The meters that don't come across
all seem to have minimal glare problems.

At any rate, in the trials, I have already saved myself a ton of time,
and for that I am happy.  Where's the donate button?

On Jun 28, 1:30 pm, "[email protected]" <[email protected]> wrote:
> Scenario:  We have 7000+ electric meters being changed out, and while
> changing them out we are taking a picture of the new meter beside the
> old meter to capture the previous reading.  We are looking for a way
> to extract the meter number from all 7000 pictures programmatically.
> I have gotten as far as creating a batch script to run tesseract for
> all files in a folder, and create output txt files for all of the
> images.  Within these images I see a bunch of garbled text, and
> eventually I find the meter number.  My question: can I extract just
> that meter number out of the images programmatically?  I have a list
> of all 7000 meter numbers, and considered maybe making a dictionary
> file of just these.  Would that possibly work?  Can tesseract be set
> to ignore anything that isn't a dictionary match?
>
> Sample meter file: http://deangrell.com/CIMG0005.tif
>
> The meter number we are trying to read is on the left, 76 207 799.
> Everything pulls across, even the "SANAGAMO" on the bottom of the
> right meter.  This software is truly impressive; I just need to find a
> way to focus it on the meter numbers.
>
> Any help at all would be appreciated!
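
On the question of using the list of known meter numbers as a kind of
dictionary: one option that does not depend on any tesseract setting is to
filter the OCR text afterwards. The Python sketch below is only a guess at how
that could look. It pulls digit groups out of each line and picks the closest
entry from the known list using difflib from the standard library. The names,
the minimum length of 6 digits, and the 0.85 similarity cutoff are all
assumptions to be tuned against real output.

```python
import re
from difflib import get_close_matches


def digit_groups(ocr_text, min_len=6):
    """Collect runs of digits (spaces allowed inside) from each OCR line."""
    groups = []
    for line in ocr_text.splitlines():
        for run in re.findall(r"\d[\d ]*\d", line):
            digits = run.replace(" ", "")
            if len(digits) >= min_len:  # assumed minimum; drops short noise
                groups.append(digits)
    return groups


def best_matches(ocr_text, known_numbers, cutoff=0.85):
    """Map each digit group to its closest known meter number, if any.

    `cutoff` is an assumed similarity threshold: lower values catch more
    misread digits but also raise the risk of picking the wrong meter.
    """
    found = {}
    for group in digit_groups(ocr_text):
        close = get_close_matches(group, known_numbers, n=1, cutoff=cutoff)
        if close:
            found[group] = close[0]
    return found
```

Because two real meter numbers can differ by a single digit, anything that is
not an exact match is probably worth checking by hand.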

