Hello All,

I am trying to parse a reasonable-looking text image with tesseract, but the output is poor. After a bit of searching and reading, I suspect the problem is due to difficulty binarizing the image.

I'm playing around with black/white threshold levels using GIMP, but I get very different results with even small changes in threshold (none of them very accurate). If anyone has some insight or experience with this issue, I'd really appreciate a nudge in the right direction. I've attached the original "cruciate.png" with different black/white thresholds and their corresponding tesseract results.

Thanks,
Dave
mun cmd.|lempIun—.|e: purl-nu :.ppnteld.|I.ux
R1_:|nz:uru(e|upluxe—5eE pxE\mu:m;pm\ed:I:I1Ix|$
Kghl Lruuale rupnne , See pYlHUllXlppYD\Ed(1I1mS
Iugm mm‘. rupture 7 5.. pnunux lyprmed (him:
Right crutiakz nlplnre 7 ye. przrimu Ipprmeddaimx
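For anyone wanting to reproduce the threshold sweep outside GIMP, here is a minimal sketch of global thresholding in plain Python. It assumes the image has already been loaded as rows of 8-bit grayscale values (loading "cruciate.png" is not shown). The names `binarize` and `otsu_threshold` are hypothetical helpers for illustration, and Otsu's method is a standard automatic threshold picker offered as a suggestion, not something used in the runs above.

```python
def binarize(pixels, threshold):
    """Map each grayscale value (0-255) to 0 (black) or 255 (white)."""
    return [[255 if v > threshold else 0 for v in row] for row in pixels]


def otsu_threshold(pixels):
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    # Build a 256-bin histogram of the grayscale values.
    hist = [0] * 256
    for row in pixels:
        for v in row:
            hist[v] += 1

    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of those pixel values
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


# Hypothetical 2x3 image: dark text pixels near 10, light background near 200.
sample = [[10, 12, 200], [11, 205, 210]]
t = otsu_threshold(sample)
bw = binarize(sample, t)
```

A single global threshold like this is sensitive to uneven lighting and anti-aliased strokes, which may be part of why small changes in the GIMP threshold produce such different OCR output.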

