For Korean, please check whether adding the following lines to the config file improves your results further.

# Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Mon, Apr 9, 2018 at 1:45 PM, ShreeDevi Kumar <[email protected]> wrote:

> Leftover from 3.04, my guess.
>
> On Mon 9 Apr, 2018, 12:52 PM Fanatico, <[email protected]> wrote:
>
>> It worked, thanks.
>>
>> Any reason for this chi_tra there?
>>
>>
>> On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
>>>
>>> Please remove the sub-language line from the config file, and use
>>> combine_tessdata to overwrite it.
>>>
>>> Right now it seems to be using chi_tra also.
>>>
>>> On Mon 9 Apr, 2018, 11:48 AM Fanatico, <[email protected]> wrote:
>>>
>>>> I used a traineddata that I created by removing the top layer from
>>>> the kor.traineddata from "tessdata_best". After this I replaced this
>>>> traineddata with the one from "tessdata_best" and got the same problem.
>>>>
>>>> Yes, it includes chi_tra as a sublanguage:
>>>> tessedit_load_sublangs chi_tra
>>>>
>>>> lstm-unicharset only has Korean characters
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/0d50ee2b-b5d4-4c73-a45b-d5245403ad04%40googlegroups.com
>>>> For more options, visit https://groups.google.com/d/optout.
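
The two config changes discussed in this thread (dropping the `tessedit_load_sublangs chi_tra` line and adding `preserve_interword_spaces 1`) can be sketched as a small text edit. This is only an illustration, not the method used by anyone in the thread: the function name `update_config` and the sample input are hypothetical, and it assumes the config component has already been extracted from the traineddata as plain text.

```python
def update_config(text: str) -> str:
    """Return config text without the sub-language line and with
    preserve_interword_spaces set to 1."""
    kept = []
    has_spaces = False
    for line in text.splitlines():
        tokens = line.split()
        name = tokens[0] if tokens else ""
        if name == "tessedit_load_sublangs":
            # Drop the sub-language line so chi_tra is no longer loaded.
            continue
        if name == "preserve_interword_spaces":
            # Normalise an existing setting instead of adding a duplicate.
            line = "preserve_interword_spaces 1"
            has_spaces = True
        kept.append(line)
    if not has_spaces:
        kept.append("# Fixes https://github.com/tesseract-ocr/tesseract/issues/1009")
        kept.append("preserve_interword_spaces 1")
    return "\n".join(kept) + "\n"


# Hypothetical config contents, for illustration only.
sample = "# example config\ntessedit_load_sublangs chi_tra\n"
print(update_config(sample))
```

As far as I know, the config component can be pulled out with `combine_tessdata -e kor.traineddata kor.config` and written back with `combine_tessdata -o kor.traineddata kor.config`; please check these against the combine_tessdata documentation for your Tesseract version before relying on them.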

