Hi, my project at http://RecordAGrave.com records headstones from graves and posts the text and images online so that people can research their family history. I would appreciate some advice on how to pre-process these headstone images to get the best results from Tesseract OCR. I have thousands of 1-2 MB JPEG images of headstones to process.

Example images:
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa%20-%20Raw/IMG_28215.jpg
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa%20-%20Raw/IMG_28216.jpg
http://freepages.genealogy.rootsweb.ancestry.com/~janderse/cemeteries/Star%20of%20David%20Memorial%20Gardens/Garden%20of%20Haifa%20-%20Raw/IMG_28217.jpg

I am a software developer, so I can script pre-processing steps to prepare the input for Tesseract. Any advice on improving OCR accuracy through pre-processing?

Thanks so much,
-Jon

