ShreeDevi, where did this training text come from? It includes two different Georgian scripts (mkhedruli and asomtavruli). Only mkhedruli is in common use today, so it seems to me that it would be best to remove the asomtavruli to increase accuracy on modern texts. If complete historical coverage is desired, then the third Georgian script (nuskhuri) should probably be included as well.
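For anyone who wants to check a training text for mixed scripts, here is a rough Python sketch. It is not part of tesseract or its training tools. The function names are made up for illustration, and the script boundaries are taken from the Unicode Georgian and Georgian Supplement blocks. Dropping whole words that contain asomtavruli is only one possible policy; it assumes the asomtavruli appears as separate words rather than mixed into running mkhedruli text.

```python
from collections import Counter

# Code point sets per script, taken from the Unicode Georgian (U+10A0-U+10FF)
# and Georgian Supplement (U+2D00-U+2D2F) blocks.
ASOMTAVRULI = set(range(0x10A0, 0x10C6)) | {0x10C7, 0x10CD}
MKHEDRULI = set(range(0x10D0, 0x10FB)) | set(range(0x10FD, 0x1100))
NUSKHURI = set(range(0x2D00, 0x2D26)) | {0x2D27, 0x2D2D}


def script_of(ch):
    """Return the Georgian script a character belongs to, or 'other'."""
    cp = ord(ch)
    if cp in ASOMTAVRULI:
        return "asomtavruli"
    if cp in MKHEDRULI:
        return "mkhedruli"
    if cp in NUSKHURI:
        return "nuskhuri"
    return "other"


def count_scripts(text):
    """Count non-whitespace characters per script (hypothetical helper)."""
    return Counter(script_of(ch) for ch in text if not ch.isspace())


def drop_asomtavruli_words(text):
    """Remove every whitespace-delimited word containing asomtavruli.

    Assumption: whole words are dropped rather than single characters,
    so no half-words are left behind in the training text.
    """
    kept_lines = []
    for line in text.splitlines():
        words = [
            w for w in line.split()
            if not any(ord(c) in ASOMTAVRULI for c in w)
        ]
        kept_lines.append(" ".join(words))
    return "\n".join(kept_lines)


if __name__ == "__main__":
    sample = "\u10e5\u10d0 \u10a0\u10a1 \u10d4"
    print(count_scripts(sample))
    print(drop_asomtavruli_words(sample))
```

Running count_scripts on the training text first shows how much asomtavruli (and nuskhuri) it actually contains, which helps decide whether filtering is worth doing.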
Giorgi, there is some further information about training tesseract with Georgian here (I have trained tesseract to read Georgian and got decent results, but using the old training methods, not the new ones):

https://groups.google.com/forum/#!searchin/tesseract-ocr/Georgian/tesseract-ocr/_ytk3bU592A/lHhwYd67xHsJ

In addition, you might try contacting Levan Gelashvili (CCed), who has created a tesseract-based OCR program for Georgian; I haven't had very good results with SunnyPage, but he may have improved it since the last time I tried it.

On Friday, November 7, 2014 4:57:51 AM UTC-5, shree wrote:
>
> Please see
> https://code.google.com/p/tesseract-ocr/source/browse/?repo=langdata#git%2Fkat
>
> Language codes
> ISO 639-1 <http://en.wikipedia.org/wiki/ISO_639-1>: ka
> ISO 639-2 <http://en.wikipedia.org/wiki/ISO_639-2>: geo <http://www.sil.org/iso639-3/documentation.asp?id=geo> (B), kat <http://www.sil.org/iso639-3/documentation.asp?id=kat> (T)
> ISO 639-3 <http://en.wikipedia.org/wiki/ISO_639-3>: kat <http://www.sil.org/iso639-3/documentation.asp?id=kat>, inclusive code <http://en.wikipedia.org/wiki/ISO_639_macrolanguage>
>
> Possible that it will be included in 3.04.
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Thu, Nov 6, 2014 at 8:10 PM, Giorgi Gognadze <[email protected]> wrote:
>
>> Hi, I'm George. I want to support Georgian language but don't know where
>> to starts and what to do. Can anyone give me a advice?

