You have to experiment. I got better results after some image processing and VietOCR:
that it has bcln dooi transfer of a portzon which has been leased an. M- nan-ant.‘ 0n Mu

[image: Inline image 1]

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Fri, Oct 17, 2014 at 9:08 PM, Rick Leir <[email protected]> wrote:

> Thanks, ShreeDevi
>
> I opened the jpg in Gimp, and you can see that it is about 100 pixels per
> text line:
>
> <https://lh5.googleusercontent.com/-jAAkrAFL_wE/VEE3pA5LMbI/AAAAAAAAADs/1kExQh_pdiA/s1600/gimpOriginal.png>
>
> On Friday, October 17, 2014 11:23:37 AM UTC-4, shree wrote:
>>
>> https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality
>>
>> try with image at 300dpi or higher. resize 300%
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Fri, Oct 17, 2014 at 8:35 PM, Rick Leir <[email protected]> wrote:
>>
>>> I have been getting great results from Tesseract when the images are
>>> clear. However, many of my images are crummy.
>>> How would you get the best results for this? Maybe improved training,
>>> maybe image pre-processing?
>>>
>>> The original is like this:
>>>
>>> <https://lh5.googleusercontent.com/-Jz-VqLejc-U/VEEau_7k3oI/AAAAAAAAADU/bXOopmkgaSA/s1600/tessOriginal.png>
>>>
>>> I have done some GraphicsMagick work to get this:
>>>
>>> <https://lh3.googleusercontent.com/-wAD7kwouFUQ/VEEbBvCH4mI/AAAAAAAAADc/wEQagCMHyxk/s1600/tessAfterIM.png>
>>>
>>> I am using Ubuntu 14.04, and see this in the terminal:
>>> Tesseract Open Source OCR Engine v3.03 with Leptonica
>>>
>>> The Tesseract output text is, as expected, poor:
>>>
>>> uh it .0222? 1mm: (lenuimi 7.1: ft. tin“. Six'ori;v;sioxi ".nn’
>>>
>>> ;(rt. n of an; 103 an Lam tmtns;hn
>>>
>>> RG 84;
>>>
>>> 37.14121 :13}: oven 1w ,4? 1 {2.2" “"D'ud<~‘1ii,xl,;l w ‘;’ tires LU.
>>>
>>> not, I,» ana'asienbnd in 1.3 arm apartment. .. em; mummy no film
>>> rzltnow‘n tau” 1m. and.
>>> ruijoiv‘zim: (mean in thin bleak. :an {"311 .25
>>> 33:03:) in Line :‘djomum Monks 1m :1: r a; w 231:9C101l3nt13‘ 1 are u.»
>>> rerun «S :2, vngfiunxulw CHUHLVlfiafln. ~.1. was tun iHEHHELgN MC
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/8cf6020e-1dc3-499f-8de9-a09c8865939a%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
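
For anyone who wants to script the "300dpi or higher, resize 300%" suggestion from the thread, here is a rough sketch. It only builds the command lines and does not run them. The helper names, the 100 dpi starting resolution, and the output file name tessResized.png are assumptions made up for illustration; the thread does not say what resolution the scan actually has, so treat the numbers as placeholders and experiment.

```python
import shlex


def resize_percent_for_dpi(current_dpi, target_dpi=300.0):
    """Percentage to scale an image so its effective resolution reaches target_dpi."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    # Never shrink: an image already at or above the target stays at 100%.
    return max(100, round(100 * target_dpi / current_dpi))


def gm_resize_command(src, dst, percent):
    """Argument list for a GraphicsMagick resize (built here, not executed)."""
    return ["gm", "convert", src, "-resize", "{}%".format(percent), dst]


def tesseract_command(image, output_base):
    """Argument list for a basic Tesseract run; text is written to output_base.txt."""
    return ["tesseract", image, output_base]


if __name__ == "__main__":
    # Assumption: the scan is roughly 100 dpi, which gives the 300% from the thread.
    percent = resize_percent_for_dpi(100)
    steps = [
        gm_resize_command("tessOriginal.png", "tessResized.png", percent),
        tesseract_command("tessResized.png", "out"),
    ]
    for cmd in steps:
        # Print shell-safe command lines so they can be reviewed before running.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The lists can be passed to subprocess.run once the file names and scale look right. Other pre-processing that may help (thresholding, despeckling) is left out here because the thread does not say which steps were used.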

