After some research on Korean, I found that it does use Chinese characters,
so it is correct to set Chinese as a sublanguage. The problem is that
kor.training_text doesn't have Chinese letters, so the code is only training
Korean and ignoring the Chinese, so if I tesseract
The config from kor already had it:
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
For Korean, please check whether adding the following lines to the config
improves your results further.
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
ShreeDevi
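One way to try the variable without repacking the traineddata is to pass it on the command line with `-c`. The sketch below only builds the argument list; `sample.png` and `out` are placeholder names, and `tesseract_command` is a hypothetical helper, not part of Tesseract.

```python
def tesseract_command(image, outputbase, lang="kor", variables=None):
    """Build a tesseract argv that sets config variables with -c NAME=VALUE."""
    cmd = ["tesseract", image, outputbase, "-l", lang]
    for name, value in (variables or {}).items():
        cmd += ["-c", f"{name}={value}"]
    return cmd


# Placeholder file names; run the printed command in a shell to compare results.
cmd = tesseract_command("sample.png", "out",
                        variables={"preserve_interword_spaces": 1})
print(" ".join(cmd))
```

If the output improves, the same line can then be added to the config inside the traineddata.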
Leftover from 3.04, my guess.
On Mon 9 Apr, 2018, 12:52 PM Fanatico, wrote:
It worked, thanks.
Any reason for this chi_tra there?
On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
Please remove the sublanguage line from the config file, and use
combine_tessdata to overwrite it.
Right now it seems to be using chi_tra also.
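A minimal sketch of that edit, assuming the config has already been extracted to a file such as `kor.config`. The helper names are hypothetical; `combine_tessdata -o` is the overwrite mode of the real tool, and the file names are placeholders.

```python
def strip_sublangs(config_text):
    """Drop every tessedit_load_sublangs line and keep the rest unchanged."""
    kept = [line for line in config_text.splitlines()
            if not line.strip().startswith("tessedit_load_sublangs")]
    return "\n".join(kept) + "\n"


def overwrite_command(traineddata, config_path):
    """argv that writes the edited config back into the traineddata file."""
    return ["combine_tessdata", "-o", traineddata, config_path]


# Example config content, modelled on the lines quoted in this thread.
sample = "tessedit_load_sublangs chi_tra\npreserve_interword_spaces 1\n"
print(strip_sublangs(sample), end="")
print(" ".join(overwrite_command("kor.traineddata", "kor.config")))
```

Working on a copy of the traineddata is safer, since `-o` modifies the file in place.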
On Mon 9 Apr, 2018, 11:48 AM Fanatico, wrote:
I used a traineddata that I created by removing the top layer from the
kor.traineddata from "tessdata_best". After that I replaced it with the
original one from "tessdata_best" and got the same problem.
Yes, it includes chi_tra as a sublanguage:
tessedit_load_sublangs chi_tra
Which traineddata are you using?
Use combine_tessdata and extract the config file to see if Chinese is
included as a sublanguage.
Also look at the lstm-unicharset to see if the Chinese characters are
included in it.
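Both checks can be sketched roughly as follows. `combine_tessdata -e` is the tool's extract mode; the helper names, the `+` separator for multiple sublanguages, and the reading of the unicharset as "count on the first line, then one entry per line with the character in the first field" are assumptions made for illustration.

```python
def extract_command(traineddata, *components):
    """argv that extracts components, e.g. kor.config and kor.lstm-unicharset."""
    return ["combine_tessdata", "-e", traineddata, *components]


def sublanguages(config_text):
    """List languages named on tessedit_load_sublangs lines (assumes '+' separators)."""
    langs = []
    for line in config_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "tessedit_load_sublangs":
            for token in parts[1:]:
                langs.extend(t for t in token.split("+") if t)
    return langs


def is_han(ch):
    """True for CJK Unified Ideographs and Extension A."""
    return "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf"


def han_entries(unicharset_text):
    """Entries whose first field contains a Han ideograph (first line is the count)."""
    entries = []
    for line in unicharset_text.splitlines()[1:]:
        first = line.split(" ")[0]
        if any(is_han(c) for c in first):
            entries.append(first)
    return entries


print(" ".join(extract_command("kor.traineddata", "kor.config", "kor.lstm-unicharset")))
print(sublanguages("tessedit_load_sublangs chi_tra\npreserve_interword_spaces 1\n"))
```

After running the printed command, the contents of `kor.config` and `kor.lstm-unicharset` can be read from disk and passed to `sublanguages` and `han_entries`.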
On Mon 9 Apr, 2018, 11:09 AM Fanatico, wrote:
I'm running tesseract with the "-l kor" param but it is detecting some
Chinese characters. The image really does have 3 Chinese characters, but none
of them is returned correctly (and I'm not expecting them to be returned
correctly), while the other Korean characters are being recognized as Chinese
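To put a number on how much of the output comes back as Chinese, the recognized text can be tallied by script. This is a rough sketch: `script_of` and `script_counts` are hypothetical helpers, and the ranges cover only the main Hangul and CJK ideograph blocks.

```python
def script_of(ch):
    """Classify one character as 'hangul', 'han', or 'other' by code point."""
    cp = ord(ch)
    # Hangul syllables, Hangul Jamo, Hangul Compatibility Jamo
    if 0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF or 0x3130 <= cp <= 0x318F:
        return "hangul"
    # CJK Unified Ideographs and Extension A
    if 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF:
        return "han"
    return "other"


def script_counts(text):
    """Count non-whitespace characters per script."""
    counts = {"hangul": 0, "han": 0, "other": 0}
    for ch in text:
        if not ch.isspace():
            counts[script_of(ch)] += 1
    return counts


print(script_counts("한국어 漢字 ok"))  # {'hangul': 3, 'han': 2, 'other': 2}
```

Comparing the "han" count against the 3 Chinese characters actually in the image shows how many Korean characters were misread.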