This SetRectangle() method is intriguing. Could you give me an example of how to use it? About 95% of the new meters are in the left half of the picture.
Thanks!

On Jun 29, 1:53 pm, 8flm6 <[email protected]> wrote:
> Hello,
>
> The Tesseract API provides a SetRectangle() method to limit the
> character recognition to a certain area.
> If all of your images look nearly the same (new electric meter on the
> lower left side and the old on the right),
> you could define a static region of interest which generously covers
> the number you'd like to read on every image.
> If every image looks different, you will likely need a more elaborate
> algorithm which finds the ROIs first
> and then passes the coordinates to Tesseract. Then in the end you
> could apply a regular expression to your reading
> results to filter the number you're searching for, something like
> '/[0-9]{2} [0-9]{3} [0-9]{3}/' if the number always has the
> format like the one in the picture you uploaded. Hope you'll find a
> solution!
>
> 8flm6
>
> On 29 Jun., 13:32, "[email protected]" <[email protected]> wrote:
>
> > Update: on a batch of 60 meters, I was able to get 46 meters
> > recognized.
>
> > First I ran a batch that runs tesseract on every .tif, and names the
> > output <picture name>.txt.
> > Then, I simply wrote a batch script to compare a text file of known
> > meter numbers against every tesseract output file using findstr.
> > The results show up as <picture name>.tif:<picture name>.txt.
>
> > Is there any way to optimize the pictures to make the text easier to
> > read before processing? I tried converting to grayscale last night,
> > but it actually hurt the results. The meters that don't come across
> > all seem to have minimal glare problems.
>
> > At any rate, in the trials, I have already saved myself a ton of time,
> > and for that I am happy. Where's the donate button?
>
> > On Jun 28, 1:30 pm, "[email protected]" <[email protected]> wrote:
>
> > > Scenario: We have 7000+ electric meters being changed out, and while
> > > changing them out we are taking a picture of the new meter beside the
> > > old meter to capture the previous reading. We are looking for a way
> > > to extract the meter number from all 7000 pictures programmatically.
> > > I have gotten as far as creating a batch script to run tesseract for
> > > all files in a folder, and create output txt files for all of the
> > > images. Within these files I see a bunch of garbled text, and
> > > eventually I find the meter number. My question: can I extract just
> > > that meter number out of the images programmatically? I have a list
> > > of all 7000 meter numbers, and considered maybe making a dictionary
> > > file of just these. Would that possibly work? Can tesseract be set
> > > to ignore anything that isn't a dictionary match?
>
> > > Sample meter file: http://deangrell.com/CIMG0005.tif
>
> > > The meter number we are trying to read is on the left, 76 207 799.
> > > Everything pulls across, even the "SANAGAMO" on the bottom of the
> > > right meter. This software is truly impressive, I just need to find a
> > > way to focus it on the meter numbers.
>
> > > Any help at all would be appreciated!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
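
For reference, here is a minimal sketch of the SetRectangle() and regular-expression ideas discussed in this thread. It is not taken from any of the posts. It assumes the third-party Python packages tesserocr (a wrapper around the Tesseract API) and Pillow are installed, and that the new meter really does sit in the left half of every picture. The helper names left_half_rectangle, find_meter_numbers and read_meter_number are made up for illustration, and the pattern assumes the "76 207 799" format seen in the sample image.

```python
import re

# Format suggested in the thread: 2 digits, 3 digits, 3 digits ("76 207 799").
METER_PATTERN = re.compile(r"\b[0-9]{2} [0-9]{3} [0-9]{3}\b")


def left_half_rectangle(image_width, image_height):
    """Return (left, top, width, height) covering the left half of an image."""
    return (0, 0, image_width // 2, image_height)


def find_meter_numbers(ocr_text):
    """Return every substring of the OCR output that looks like a meter number."""
    return METER_PATTERN.findall(ocr_text)


def read_meter_number(image_path):
    """OCR only the left half of one picture and return candidate meter numbers.

    Assumption: tesserocr and Pillow are installed; they are imported here so
    the two helpers above can be used without them.
    """
    from PIL import Image
    from tesserocr import PyTessBaseAPI

    with Image.open(image_path) as image, PyTessBaseAPI() as api:
        api.SetImage(image)
        # SetRectangle is called after SetImage and takes left, top, width, height.
        api.SetRectangle(*left_half_rectangle(image.width, image.height))
        return find_meter_numbers(api.GetUTF8Text())
```

Calling read_meter_number("CIMG0005.tif") would then return a list of candidate strings, which could be checked against the list of known meter numbers instead of running findstr over whole output files. If the meters are not always in the same place, the fixed rectangle would need to be replaced by something that locates the region first, as 8flm6 describes.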

