Do you still need a copy of the Sanskrit traineddata?

Shree Devi Kumar
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Fri, Aug 23, 2013 at 10:21 PM, mns_rao <mns...@gmail.com> wrote:
> Hi,
> The OCR output also depends on the traineddata file for the language
> of the input image. If you have a good traineddata file for Sanskrit, you
> can use FreeOCR 4.2 (http://www.paperfile.net/) by going to
> settings-->open language folder and pasting the file there. FreeOCR 4.2
> processes an entire PDF book (loaded via 'open PDF') in one click:
> OCR-->ocr all pages.
> Try the original book first and, if the result is not satisfactory,
> convert the cleaned images into a PDF book again.
> I also need a Sanskrit traineddata file, if you can spare it.
> Wishing success,
> MNS Rao
>
> On Friday, 23 August 2013 18:38:44 UTC+5:30, shree wrote:
>>
>> I want to OCR a Sanskrit book available as a PDF.
>>
>> I used GSview to save all pages as PNG and then used Scan Tailor to
>> deskew the images, which saved them as TIFFs.
>> Then I used IrfanView to apply blur and median filters, as the text is
>> very grainy in the original, and also resized the pages to a smaller size.
>>
>> The image pre-processed as above gives a better result than the original.
>>
>> I would like to know if there is a simpler/better method to pre-process
>> the images. The PDF is 500+ pages.
>>
>> I am attaching a single page from the PDF and the processed image file.
>>
>> Thanks,
>> Shree
>>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to tesseract-ocr@googlegroups.com
> To unsubscribe from this group, send email to
> tesseract-ocr+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduVy3xw6fi8K2%3DcDVyWSHwUnksRGgdU2a9HEVXRuoCT5aQ%40mail.gmail.com.
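
For readers following the pre-processing steps described in the quoted message (median filter against grain, then resizing to a smaller page), here is a rough sketch of what those two steps do to the pixels. It is not the workflow used in the thread, which relied on IrfanView; the function names `median_filter` and `downscale_half` are made up for this illustration, the image is a toy grayscale grid held in plain Python lists, and a real 500+ page job would more likely use an image library or a batch tool.

```python
from statistics import median_low


def median_filter(img, radius=1):
    """Return a copy of a grayscale image (list of rows, values 0-255) in which
    each pixel is replaced by the median of its (2*radius+1)-wide neighbourhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window, clipping it at the image borders.
            window = [
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            # median_low always returns one of the window values (no averaging).
            out[y][x] = median_low(window)
    return out


def downscale_half(img):
    """Halve width and height by averaging each 2x2 block (an odd last row or
    column is dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) // 4
            for x in range(w)
        ]
        for y in range(h)
    ]


# Toy 4x4 "page": white background (255) with a single black speck of grain (0).
page = [
    [255, 255, 255, 255],
    [255,   0, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
]
cleaned = median_filter(page)    # the isolated speck is removed
small = downscale_half(cleaned)  # 4x4 -> 2x2
```

The median filter removes isolated specks because a single dark pixel never forms the majority of its window, whereas a plain blur would smear it into its neighbours. The same property can also erase thin strokes and small diacritic marks, so the radius is worth checking on a sample page before processing the whole book.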