On Mon, Aug 17, 2015 at 6:07 AM, ShreeDevi Kumar <[email protected]> wrote:
> Ray was looking for comparative feedback regarding the new traineddata for
> RTL languages, so this will be useful.
>
> As far as I know, Google Docs does not use tesseract OCR engine for
> recognizing the text.
>

Interesting. Can you please clarify the source of your knowledge?

> Its OCR accuracy is better than Tesseract for some Indian languages also.
> However, it doesn't seem to handle tifs, and processes only first 10 pages
> of a pdf.
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Sun, Aug 16, 2015 at 7:14 PM, Hossein Razizadeh <[email protected]>
> wrote:
>
>> It seems 'fas' is for Persian, but there are no cube files, resulting in
>> poor results. Arabic language files work much better for Persian images.
>> There is another 'per' folder for Persian, but there isn't even
>> '.traineddata' file for it. Does anyone know if 'Google Doc' has used
>> 'Tesseract' for its OCR engine? Google Docs performs OCR for Persian images
>> with good accuracy!
>>
>> On Saturday, July 18, 2015 at 8:14:07 AM UTC+4:30, Jeff Breidenbach wrote:
>>>
>>> I think 'fas' is the language code for Persian.
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/edd64e28-9e52-4b44-80cc-0aaa442caa85%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/edd64e28-9e52-4b44-80cc-0aaa442caa85%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.

