2009/3/1 alleykat <[email protected]>
>
> Hello, this sounds interesting. I don't know much, almost nothing,
> about programming. But would it be possible to create a font of
> someone's handwriting (my mother's), then use that font with your
> script to train the machine to recognise her handwriting? I have 50
> years' worth of letters to be OCR'ed. The letters are all in
> Norwegian, so do we need to teach the machine Norwegian as well?
> Thanks
> alleykat

Handwriting recognition is a fairly complex process and is not currently covered by tesseract-ocr. Unlike machine print, cursive handwriting has no gaps between successive characters in a word, which fools Tesseract into treating the entire word as a single character.

There may be other software available for handwriting recognition. It is an actively researched topic, and here in India many agencies developed such software 10-12 years ago but have not released any code. I hope someone else can name software that can help you out.

PS: The Tesseract character segmenter can be hacked (or the image itself can be preprocessed) to allow Tesseract to OCR cursive handwriting. This is an area I have worked on in the past, but that work does not cover Latin languages.
Link: http://debayanin.googlepages.com/hackingtesseract

--
Be Intelligent, Use GNU/Linux.
http://debayan.wordpress.com
http://lug.nitdgp.ac.in
http://planet-india.randomink.org
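
To illustrate the segmentation problem mentioned above, here is a toy sketch in Python. It is not Tesseract's actual segmenter, and the function name `segment_columns` and the tiny bitmaps are made up for illustration. It assumes a naive approach that cuts a word wherever a column contains no ink, which is enough to show why connected cursive strokes collapse into one "character":

```python
def segment_columns(image):
    """Split a binary image (list of rows, 1 = ink) into column ranges.

    Returns a list of (start, end) pairs, end exclusive, one per run of
    adjacent columns that contain ink. Hypothetical helper, not a
    Tesseract API.
    """
    width = len(image[0])
    # A column "has ink" if any row has a 1 in that column.
    has_ink = [any(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new segment begins
        elif not ink and start is not None:
            segments.append((start, x))    # blank column ends the segment
            start = None
    if start is not None:
        segments.append((start, width))    # segment runs to the right edge
    return segments


# Machine print: three glyphs separated by blank columns.
printed = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
]

# Cursive: the same glyphs, joined by a connecting stroke on the baseline.
cursive = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

print(len(segment_columns(printed)))  # three separate segments
print(len(segment_columns(cursive)))  # one segment spanning the whole word
```

Preprocessing the image or changing the segmenter, as suggested in the PS, amounts to finding some way to cut that single segment back into individual letters before classification.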

