Costa,

I was surprised to learn that OneNote creates an image for each page, and that you were able to save a page as a JPEG, extract the text from it (right-click and select "copy text from image"), and save that text in OneNote as well.

I would like to know whether OneNote is available in MS Office 2003.

If I understand you correctly, one can type text in OneNote, save the typed text as a JPEG, convert that JPEG image back to text, and then save the text in ordinary Notepad. Is that right? If so, can one expect 100% accuracy in the output?

Further, does it support UTF-8 (i.e., languages other than English)?

Can a *scanned* image saved as a JPEG in OneNote also be extracted/converted to text?

Once I hear from you, I would like to test this myself.

With regards,
sriranga (76 years old)

On Thu, Jun 18, 2009 at 10:10 AM, costa <[email protected]> wrote:
>
> Finally, after managing to compile ocropus 0.4 (with tesseract), I
> gave it a try with a printed document I scanned and emailed myself as
> a pdf document. I printed the document to MS OneNote 2007. OneNotes
> create a set of pictures, one for each page. I selected one page, I
> saved it as a jpeg, then I extracted the text (right click and select
> copy text from image) and also saved in OneNote.
>
> I was impressed by the ms ocr engine which retrieved the text quite
> accurately (the fonts of the pasted text also matched the fonts on the
> paper).
>
> I could not say the same about ocropus. ocropus recognized a lot less
> text and it took considerably longer to process the image (I used
> ocropus page <jpeg image name>).
>
> Now the question is what can I do - as user not programmer - to
> improve ocropus at recognizing the text on printed documents, to get
> to the same levels of recognition as the ms office ocr engine? In my
> naive world I thought that ocropus would be capable of recognizing
> printed text out of the box with an accuracy of at least 95%.
>
> Thanks

You received this message because you are subscribed to the Google Groups "ocropus" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/ocropus?hl=en
