Thanks for the reply, Aleksander. These steps improved the accuracy of my scans, but they do not provide a solution for detecting paragraphs and blocks of text. Any idea how to do this?
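For what it's worth, the grouping step I have in mind could be sketched roughly as below. This is only a sketch under assumptions, not something taken from Tesseract or Leptonica: it assumes word bounding boxes are already available from some earlier step, and the `Box` struct, `groupIntoBlocks`, and the two gap thresholds are hypothetical names of my own. It merges boxes whose horizontal and vertical gaps are both within the thresholds (using union-find), so each resulting rectangle should cover one block of text.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical axis-aligned bounding box of one word, in pixels.
struct Box {
    int x, y, w, h;
};

// Distance between intervals [a0, a1) and [b0, b1); 0 if they overlap.
static int intervalGap(int a0, int a1, int b0, int b1) {
    return std::max(0, std::max(a0, b0) - std::min(a1, b1));
}

// Two boxes are neighbours when both their horizontal and vertical gaps
// are within the given thresholds.
static bool isNear(const Box& a, const Box& b, int maxGapX, int maxGapY) {
    return intervalGap(a.x, a.x + a.w, b.x, b.x + b.w) <= maxGapX &&
           intervalGap(a.y, a.y + a.h, b.y, b.y + b.h) <= maxGapY;
}

// Union-find root lookup with path halving.
static int findRoot(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Groups word boxes into blocks and returns one enclosing rectangle per
// block, ordered by the first word that belongs to each block.
std::vector<Box> groupIntoBlocks(const std::vector<Box>& words,
                                 int maxGapX, int maxGapY) {
    const int n = static_cast<int>(words.size());
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);

    // Link every pair of neighbouring boxes into the same set.
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            if (isNear(words[i], words[j], maxGapX, maxGapY)) {
                const int ri = findRoot(parent, i);
                const int rj = findRoot(parent, j);
                parent[ri] = rj;
            }
        }
    }

    // Grow one enclosing rectangle per set.
    std::vector<Box> blocks;
    std::vector<int> blockOfRoot(n, -1);
    for (int i = 0; i < n; ++i) {
        const int r = findRoot(parent, i);
        const Box& word = words[i];
        if (blockOfRoot[r] < 0) {
            blockOfRoot[r] = static_cast<int>(blocks.size());
            blocks.push_back(word);
        } else {
            Box& b = blocks[blockOfRoot[r]];
            const int right = std::max(b.x + b.w, word.x + word.w);
            const int bottom = std::max(b.y + b.h, word.y + word.h);
            b.x = std::min(b.x, word.x);
            b.y = std::min(b.y, word.y);
            b.w = right - b.x;
            b.h = bottom - b.y;
        }
    }
    return blocks;
}
```

For example, with three word boxes on the left and two on the right, separated by a gap wider than `maxGapX`, the function returns two rectangles. Each rectangle could then be cropped out of the image and sent to Tesseract on its own, which is the "regions individually" idea from my original question. The thresholds would need tuning per image size.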
On Monday, February 3, 2014 2:16:15 AM UTC-7, Aleksander Grzyb wrote:
>
> To improve results you should try to:
>
> 1. Convert the image to a binary image.
> 2. Crop the image to get rid of the surroundings.
> 3. Detect the skew of the image and apply a perspective transform.
>
> I recommend using OpenCV for these operations. There is a pod for OpenCV:
>
> https://github.com/Fl0p/OpenCV-iOS
>
> Here are some links that should help you with the image processing part:
>
> http://stackoverflow.com/questions/8667818/opencv-c-obj-c-detecting-a-sheet-of-paper-square-detection
> http://stackoverflow.com/questions/6555629/algorithm-to-detect-corners-of-paper-sheet-in-photo
> http://stackoverflow.com/questions/8637867/skew-detection-and-reduction-in-opencv
> http://stackoverflow.com/questions/7838487/executing-cvwarpperspective-for-a-fake-deskewing-on-a-set-of-cvpoint
>
> On Friday, January 31, 2014 8:44:55 PM UTC+1, Nick Porter wrote:
>>
>> I am trying to scan a business card using Tesseract OCR. All I am doing is sending the image in with no pre-processing. Here's the code I am using:
>>
>> Tesseract* tesseract = [[Tesseract alloc] initWithLanguage:@"eng+ita"];
>> tesseract.delegate = self;
>> [tesseract setVariableValue:@"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@.-()" forKey:@"tessedit_char_whitelist"];
>> [tesseract setImage:[UIImage imageNamed:@"card.jpg"]]; //image to check
>> [tesseract recognize];
>> NSLog(@"Here is the text %@", [tesseract recognizedText]);
>>
>> Picture of card <http://imgur.com/nQPG6iq>
>>
>> This is the output <http://imgur.com/poikzBn>
>>
>> As you can see, the accuracy is not 100%, which is not what I am concerned about; I figure I can fix that with some simple pre-processing. However, notice that it mixes the two text blocks at the bottom, which splits up the address, and possibly other information on other cards.
>>
>> How can I possibly use Leptonica (or something else) to group the text somehow?
>> Possibly send regions of text on the image individually to Tesseract to scan? I've been stuck on this problem for a while; any possible solutions are welcome!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

