The list of languages that may be supported in the 3.0 release is currently:

bul -- Bulgarian
cat -- Catalan / Valencian
ces -- Czech
dan -- Danish
deu -- German
ell -- Greek
eng -- English
fin -- Finnish
fra -- French
hun -- Hungarian
ind -- Indonesian / Bahasa Indonesia / Malay?
ita -- Italian
lav -- Latvian
lit -- Lithuanian
nld -- Dutch
nor -- Norwegian
pol -- Polish
por -- Portuguese
ron -- Romanian
rus -- Russian
slk -- Slovak
slv -- Slovenian
spa -- Spanish
srp -- Serbian / Croatian
swe -- Swedish
tgl -- Tagalog
tur -- Turkish
ukr -- Ukrainian
vie -- Vietnamese
In addition, some community projects provide support for various Indian languages with Indic scripts, and I believe someone was working on Chinese. A few of us are interested in seeing Arabic and Hebrew support, but a de-italicizing algorithm (mentioned in the FAQ, I believe) would need to be implemented first, along with some other clever stuff...

--Sven

On Fri, Dec 4, 2009 at 11:58 AM, nguyenq <[email protected]> wrote:
> If you look in the tessdata folder at
> http://tesseract-ocr.googlecode.com/svn/trunk/
> , there are currently more than two dozen.
>
> On Dec 2, 7:44 am, andmor <[email protected]> wrote:
>> Hi,
>>
>> Which languages will be supported with the 3.0 release?
>> On the TesseractProjects page, Ray says that Google are working on
>> many for the next release.
>> Any idea when these extra languages will be available?
>>
>> Thanks in advance,
>> Andrew
>
> --
>
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

