>
>
> On Mon, Aug 17, 2015 at 6:07 AM, ShreeDevi Kumar <[email protected]>
> wrote:
>
>> Ray was looking for comparative feedback regarding the new traineddata
>> for RTL languages, so this will be useful.
>>
>
>>>> Ray -
https://groups.google.com/forum/#!msg/tesseract-dev/qcFtWCAAlT8/SZ4xBS5DHwwJ

Another caveat worth noting is that I only tested a small fraction of these
languages - maybe 25?
I suspect, for instance, that all the Arabic-based languages except ara
don't work very well.
I would be interested in any more feedback on how bad it is in any of them,
and will take suggestions into account for the next version after 3.04.


>> As far as I know, Google Docs does not use the Tesseract OCR engine for
>> recognizing text.
>>
>
> Interesting. Can you please clarify the source of your knowledge?
>

>
>> Its OCR accuracy is also better than Tesseract's for some Indian languages.
>> However, it doesn't seem to handle TIFFs, and it processes only the first
>> 10 pages of a PDF.
>>
>

https://support.google.com/drive/answer/176692?hl=en

>
>>
>> On Sun, Aug 16, 2015 at 7:14 PM, Hossein Razizadeh <[email protected]>
>> wrote:
>>
>>> It seems 'fas' is for Persian, but there are no cube files, resulting in
>>> poor results. The Arabic language files work much better for Persian images.
>>> There is another 'per' folder for Persian, but there isn't even a
>>> '.traineddata' file for it. Does anyone know whether Google Docs uses
>>> Tesseract as its OCR engine? Google Docs performs OCR on Persian images
>>> with good accuracy!
>>>
>>> On Saturday, July 18, 2015 at 8:14:07 AM UTC+4:30, Jeff Breidenbach
>>> wrote:
>>>>
>>>> I think 'fas' is the language code for Persian.
>>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/edd64e28-9e52-4b44-80cc-0aaa442caa85%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/edd64e28-9e52-4b44-80cc-0aaa442caa85%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
