I have a scanned bank statement, printed in a sans-serif font. Using
gocr, the only problem I have is '1' being recognized as 'I'. ocrad is
a lot worse, but still useful. With tesseract, my results for the same
file are complete gibberish.
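
As a stopgap for the gocr output, the '1'/'I' confusion could probably be
patched up after the fact. The following is only a rough sketch: it assumes
the amounts on the statement consist of digits, commas, periods, '$' and '-',
and fix_ones is a hypothetical helper name, not part of gocr or any other tool.

```python
import re

# Assumption: a "numeric" token is made only of digits, 'I', and the
# punctuation . , $ - and contains at least one real digit.
_NUMERIC_TOKEN = re.compile(r"^[0-9I.,$-]*[0-9][0-9I.,$-]*$")


def fix_ones(line):
    """Replace 'I' with '1' only inside tokens that otherwise look numeric."""
    fixed = []
    for token in line.split(" "):
        if _NUMERIC_TOKEN.match(token):
            token = token.replace("I", "1")
        fixed.append(token)
    return " ".join(fixed)


if __name__ == "__main__":
    # Made-up sample line, not taken from the real statement.
    print(fix_ones("DEPOSITS I,234.5I"))
```

A token with no real digit at all (a lone 'I', or 'II' standing for 11) is
left untouched, so this would not catch every case.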

The file is very high resolution, very high contrast. I can't show it
as it contains my bank statements.

Is there some kind of guide for tuning the tool? At this point I'm
trying it to see if it recognizes the '1's better, as the numbers are
what matter. But at this stage, the output is useless. The text is in
English, by the way.

Here's an excerpt of the output. I think it's safe to paste, as it
seems to contain nothing intelligible.

-----------------------------------------------------
F’I?IE`\!I()L.IS ST4¤n.TIEI**‘IEI\IT .
6 I)IEF’()SITS 4¤n.I\II) ()TI—IE|
51 (ZI—IE(ZI(S 4¤n.I\II) ()TI—IEI2 I
IINTEIQEST F’4¤n.II) TI—IIS F’|
SEI?\!I(ZE (ZI—I4¤n.I2(5E 4¤n.I**‘I()L.II\I`
(ZLJIQIQEINT ]B4¤n.I.4¤n.I\I(.TIE 4¤n.S (III
I\IL.II**‘IZBIEI2 (III: I)4¤n.‘¤’S II\I ST.
4¤n.I\II\IL.I4¤n.I. F’IEI?(.TEI\IT4¤¤.l
4¤n.\!EI24¤n.(5E I)4¤n.II.‘¤’ IB.
IINTEIQEST F’4¤n.II) ‘¤’|
]D4¤n.TE 4¤n.I**‘I()L.II\I
-----------------------------------------------------