A few days ago my machine stopped working and displayed a screen full of obscure error messages. I took a picture of the screen and rebooted the machine. Because I was too lazy to spend an hour copying out the contents of the screen by hand, I decided to try some OCR engines. I tested gocr 0.49, OCRopus 0.5.4, and Tesseract 3.01.
After four days of intensive work I have learned a bit about OCR, and I now know that none of these programs can properly process strings of numbers and letters such as “[226158.728554] [<c1430000>] ? cs5520_init_one+0x14e/0x35f”. Personally, I doubt that any other OCR engine could process such text from a photo of moderate quality. The only solution is to copy out these messages manually. It is an instructive example of what is called the irony of fate.

I studied the “Report on the comparison of Tesseract and ABBYY FineReader OCR engines” by Heliński, Kmieciak, and Parkoła (http://lib.psnc.pl/dlibra/docmetadata?id=358&from=publication&showContent=true). It is very interesting, at least for users of these two programs, though other people interested in OCR engines should find it worth reading as well. The report is very reliable and informative. Thank you, professor, for that valuable link.
