The config from kor already had it:
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it,
Is there a way to extract the header and footer content on a document page
separately using Tesseract OCR? I tried the hOCR output but it doesn't seem
to have any such tags associated with the output.
Regards,
Mohit
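
One possible heuristic for the question above, offered as a sketch rather than a Tesseract feature: read the line bounding boxes from the hOCR output and treat lines in the top or bottom band of the page as header or footer candidates. The function name, the 10% margin, and the assumption that each line is an `ocr_line` element whose `title` starts with `bbox x0 y0 x1 y1` are all illustrative choices, not something the thread confirms.

```python
import re

# Assumes hOCR lines look like:
#   <span class='ocr_line' id='line_1_1' title="bbox 10 20 500 60; ...">
LINE_RE = re.compile(
    r"class=['\"]ocr_line['\"][^>]*?title=['\"]bbox (\d+) (\d+) (\d+) (\d+)"
)

def split_by_position(hocr, page_height, margin=0.1):
    """Group line boxes into (header, body, footer) by vertical position.

    `margin` is a hypothetical fraction of the page height; tune it per layout.
    """
    header, body, footer = [], [], []
    for match in LINE_RE.finditer(hocr):
        x0, y0, x1, y1 = map(int, match.groups())
        box = (x0, y0, x1, y1)
        if y1 <= page_height * margin:
            header.append(box)          # line ends inside the top band
        elif y0 >= page_height * (1 - margin):
            footer.append(box)          # line starts inside the bottom band
        else:
            body.append(box)
    return header, body, footer
```

The page height would have to come from the image itself or from the `ocr_page` bounding box; a purely positional split like this will misclassify pages whose body text runs into the margins.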
Thanks Shree, but if Tesseract is open source, why can't the developers
answer questions? If I train my model on random text, how can I arrive at a
reliable accuracy figure? My model's accuracy will then also be random.
I want to know the reason for the conditions imposed on the training text, how much
For Tesseract 3.05,
random text will work; it is suggested to use combos similar to the English
training text.
It is unlikely you will get answers to your questions from the developers.
You can search past issues and questions in the forum and on GitHub.
Training for 3.05 does not take long, so run a few experiments.
For Korean, please check whether adding the following lines to the config
improves your results further.
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
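
To try this setting without editing the config inside the traineddata, the same variable can be passed on the command line with `-c`. A minimal sketch that only builds the command; the function name and the file names are hypothetical.

```python
import subprocess

def build_tesseract_command(image_path, output_base, lang="kor",
                            preserve_spaces=True):
    """Return the tesseract argument list; nothing is executed here."""
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if preserve_spaces:
        # Same variable as the config line above, set per run via -c.
        cmd += ["-c", "preserve_interword_spaces=1"]
    return cmd

# Example (requires tesseract and the kor traineddata to be installed):
# subprocess.run(build_tesseract_command("page.png", "out"), check=True)
```

Running it once with `preserve_spaces=False` and once with `True` on the same image gives a quick comparison of the two outputs.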
ShreeDevi
भजन - कीर्तन - आरती @
Hi Shree, thanks for replying.
For Tesseract *3.05.00*:
I had already checked that link; there they mention:
*"Make sure there are a minimum number of samples of each character. 10 is
good, but 5 is OK for rare characters.*
*There should be more samples of the more frequent characters - at least
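
The per-character minimum quoted above can be checked against a training text before training. A rough sketch, assuming the text is already loaded as a string; the function name and the default threshold of 5 (the figure quoted for rare characters) are illustrative.

```python
from collections import Counter

def undersampled_chars(text, minimum=5):
    """Return {character: count} for non-space characters seen fewer
    than `minimum` times in the training text."""
    counts = Counter(ch for ch in text if not ch.isspace())
    return {ch: n for ch, n in counts.items() if n < minimum}
```

For example, `undersampled_chars(open("training_text.txt", encoding="utf-8").read(), minimum=10)` would list every character below the "10 is good" mark; the file name here is hypothetical.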
Leftover from 3.04, my guess.
On Mon 9 Apr, 2018, 12:52 PM Fanatico, wrote:
It worked, thanks.
Any reason for this chi_tra there?
On Monday, 9 April 2018 03:24:44 UTC-3, shree wrote:
Please remove the sublanguage line from the config file, and use combine_tessdata
to overwrite it.
Right now it seems to be using chi_tra also.
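
The steps described here can be sketched roughly as follows: unpack the traineddata with `combine_tessdata -u`, drop the `tessedit_load_sublangs` line from the extracted config, then write the config back with `combine_tessdata -o`. The helper names and the `kor.` file prefix are assumptions for illustration; work on a copy of the traineddata.

```python
def strip_sublangs(config_text):
    """Remove any tessedit_load_sublangs line from a config file's text."""
    kept = [line for line in config_text.splitlines()
            if not line.strip().startswith("tessedit_load_sublangs")]
    return "\n".join(kept) + "\n"

def repack_commands(traineddata="kor.traineddata", prefix="kor."):
    """Commands to run before and after editing <prefix>config.

    Only builds the argument lists; pass each to subprocess.run yourself.
    """
    return [
        ["combine_tessdata", "-u", traineddata, prefix],            # unpack
        ["combine_tessdata", "-o", traineddata, prefix + "config"],  # overwrite
    ]
```

The edit with `strip_sublangs` belongs between the two commands: unpack, rewrite `kor.config`, then overwrite.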
On Mon 9 Apr, 2018, 11:48 AM Fanatico, wrote:
I used a traineddata file that I created by removing the top layer from the
kor.traineddata in "tessdata_best". After that, I replaced this
traineddata with the one from "tessdata_best" and got the same problem.
Yes, it includes chi_tra as a sublanguage:
tessedit_load_sublangs chi_tra