Hi Nikola, I suggest you don't try training it. Training is mostly for adding new languages, or at least significantly different fonts. As your input is English, in a common font, I doubt it would help much over the standard English training file.
The results I got from running Tesseract 3 on your sample were pretty good, though. I'll attach them here. Using -psm 6 made a big improvement, as it meant the table cells ended up on the correct row. So I ran:

    tesseract ocr1.png outtest2 -psm 6

The remaining problems in the output are 7 being consistently recognised as ?, and m regularly being misrecognised as r'n or r‘n. I have a suggestion for this. If your input data will never contain a ?, create an ambig rule which always changes a ? to a 7 (and similar for the r'n issues). The best way to do this would be:

1) unpack the English training data:
    combine_tessdata -u eng.traineddata eng.
2) add the following lines to the end of eng.unicharambigs:
    1 ? 1 7 1
    3 r ' n 1 m 1
    3 r ‘ n 1 m 1
3) recombine the training data:
    combine_tessdata eng.

The eng.traineddata file will then contain the extra ambig rules.

Hope this helps, and let us know how you get on.

Nick
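P.S. In case it helps with step 2, here is a rough Python sketch of writing those three rules out. It assumes the unicharambigs fields are tab-separated, in the order source count, source characters, target count, target characters, type (with 1 meaning "always replace"). The names ambig_line and append_rules are my own, not part of Tesseract, so treat this as an illustration rather than an official tool.

```python
# Sketch: format and append unicharambigs rules.
# Assumption: one rule per line, tab-separated fields:
#   <source count> <source chars> <target count> <target chars> <type>
# where the chars within a field are space-separated and type 1 = always replace.

def ambig_line(source, target, mandatory=True):
    """Format one rule; source and target are lists of single characters."""
    fields = [
        str(len(source)),
        " ".join(source),
        str(len(target)),
        " ".join(target),
        "1" if mandatory else "0",
    ]
    return "\t".join(fields)

# The three rules from the message: ? -> 7, r'n -> m, r‘n -> m.
RULES = [
    (["?"], ["7"]),
    (["r", "'", "n"], ["m"]),
    (["r", "\u2018", "n"], ["m"]),
]

def append_rules(path, rules=RULES):
    """Append the rules to an unpacked unicharambigs file at path."""
    with open(path, "r", encoding="utf-8") as f:
        existing = f.read()
    with open(path, "a", encoding="utf-8") as f:
        # Avoid gluing the first new rule onto an unterminated last line.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        for source, target in rules:
            f.write(ambig_line(source, target) + "\n")
```

You would call append_rules("eng.unicharambigs") after the unpack step and before recombining. Keep a backup of the original file, since this edits it in place.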
04-Jan-2012 00:22
Ward:
Physician:
Operator:
Total r'nAs 1088? Total DLP 1206 r'nGycr‘n
Scan l<\-f r'nAs I ref. CTDlvol DLP Tl :SL
r'nGy r'nGycr‘n s mm
PatientPosition F-SP
Topograrn 1 120 36 mA 5.3 0.6
Thorax 2 120 50 3.3? 140 0.5 0.6
Topograrn 3 120 36 mA 5.3 0.6
F|_CaSc 4D 120 66 I S0 1.00 24 0.20 0.6
Premonitoring 5 100 42 1.2? 1 0.20 10.0
Premonitoring 6 100 42 1.2? 1 0.20 10.0
Premonitoring T 100 42 1.2? 1 0.20 10.0
Contrast Monitoring S 100 42 12.?3 13 0.20 10.0
DS_CorCTA 10D 100 320 50.64 1010 0.20 0.6
Medium Type Iodine Conc. Volume Flow CM Ratio
mgfml ml mlfs
Contrast Ultravist 3?0 S0 5.5 100%
Saline 40 5.5
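To give a feel for what the three suggested rules are meant to achieve on output like this, here is a small Python stand-in that applies the same substitutions as plain text replacement. This is only a post-processing approximation: it is not how Tesseract applies unicharambigs internally, and the name apply_rules is made up for the example.

```python
# Plain-text approximation of the three ambig rules.
# Assumption: the input never legitimately contains "?", "r'n" or "r‘n".
REPLACEMENTS = [
    ("r'n", "m"),
    ("r\u2018n", "m"),
    ("?", "7"),
]

def apply_rules(text):
    """Apply each substitution in turn to the OCR text."""
    for wrong, right in REPLACEMENTS:
        text = text.replace(wrong, right)
    return text

sample = "Total r'nAs 1088? Total DLP 1206 r'nGycr\u2018n"
print(apply_rules(sample))  # Total mAs 10887 Total DLP 1206 mGycm
```

Errors the rules don't cover, such as "Topograrn" (no apostrophe) or "S0", would be left as they are.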

