After some research on Korean I found that it does use Chinese characters,
so it is correct to set Chinese as a sublanguage. The problem is that
kor.training_text doesn't contain any Chinese characters, so the training
covers only Korean and ignores the Chinese, so if I tesseract
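One quick way to confirm whether a training text contains any Chinese characters is to count code points per script. This is only a rough sketch: `count_scripts` is a hypothetical helper, and it assumes the main Unicode blocks are enough (Hangul Syllables U+AC00 to U+D7A3 and CJK Unified Ideographs U+4E00 to U+9FFF); extension blocks and Hangul Jamo are not counted.

```python
def count_scripts(text):
    """Count Hangul syllables, Han ideographs and other non-space characters."""
    counts = {"hangul": 0, "han": 0, "other": 0}
    for ch in text:
        cp = ord(ch)
        if 0xAC00 <= cp <= 0xD7A3:      # Hangul Syllables block
            counts["hangul"] += 1
        elif 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs block
            counts["han"] += 1
        elif not ch.isspace():          # everything else except whitespace
            counts["other"] += 1
    return counts


# Small in-memory sample; a real check would read the training text instead.
sample = "한국어 漢字 test"
print(count_scripts(sample))
```

To check the actual file, the same function could be fed the contents of kor.training_text read as UTF-8. A "han" count of zero would match what is described above.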
Thanks, I was going to do this; I just wanted to be sure there wasn't a way to
train 2 traineddata files like the current one.
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to
Thank you again. I think I'll stay with plain txt; pdf looks too
difficult to achieve.
Now, the next problem: everything worked fine with my 1-page test pdf. I then
tried to do the same with a 30 MB, 500-page pdf. After running convert
-density 300 test.pdf -depth 8 -strip -background white
Try looking at the Leptonica sample programs on column splitting to see if
you can preprocess the image better before giving it to Tesseract.
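To illustrate the general idea of splitting columns before OCR, here is a rough sketch in plain Python. It is not the Leptonica API and not taken from its sample programs; it swaps in a simple vertical projection profile. `find_gutter` and `split_columns` are hypothetical helpers, and the image is assumed to be already binarized into rows of 0 (background) and 1 (ink).

```python
def find_gutter(image, max_ink=0):
    """Return (start, end) of the widest interior run of columns whose ink
    count is <= max_ink, with end exclusive, or None if there is no such run."""
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    ink = [sum(row[x] for row in image) for x in range(width)]
    best = None
    x = 0
    while x < width:
        if ink[x] <= max_ink:
            start = x
            while x < width and ink[x] <= max_ink:
                x += 1
            end = x
            # Ignore page margins: the run must not touch either edge.
            if start > 0 and end < width:
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
        else:
            x += 1
    return best


def split_columns(image, gutter):
    """Cut the image into left and right parts, dropping the gutter itself."""
    start, end = gutter
    left = [row[:start] for row in image]
    right = [row[end:] for row in image]
    return left, right


# Tiny synthetic page: two "columns" of ink separated by three blank columns.
page = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
]
gutter = find_gutter(page)
left, right = split_columns(page, gutter)
print(gutter, left, right)
```

Raising `max_ink` lets a faint line inside the gutter be tolerated, at the cost of possibly swallowing sparse text columns, so the threshold would need tuning per scan. Each half could then be passed to Tesseract separately.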
On Wed 11 Apr, 2018, 11:46 AM Ewan Mellor, wrote:
> Hi,
>
>
> I am using Tesseract 4 (git 10f4998a) to process a file with two
https://github.com/tesseract-ocr/tesseract/issues/660
Regarding pdf
On Wed 11 Apr, 2018, 1:28 PM ShreeDevi Kumar, wrote:
> 1. Check the output tif and adjust convert command if needed
>
> 2. Depending on your tesseract version you could try -l frk also.
>
> 3. Yes, you
1. Check the output tif and adjust convert command if needed
2. Depending on your tesseract version you could try -l frk also.
3. Yes, you can get a pdf as output.
Search the GitHub issues; there is a long discussion thread about the best
ways to create pdf output.
Look for "pdf" and "invisible".
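As a small illustration of points 2 and 3, the sketch below assembles a Tesseract command line that selects a language with -l and requests pdf output through the `pdf` config name. `build_tesseract_cmd` and `run_ocr` are hypothetical helpers, the file names are placeholders, and whether `frk` is available depends on the installed version and traineddata, as noted above.

```python
import subprocess


def build_tesseract_cmd(image, outputbase, lang=None, pdf=False):
    """Build an argument list of the form: tesseract IMAGE OUTPUTBASE [-l LANG] [pdf]."""
    cmd = ["tesseract", image, outputbase]
    if lang:
        cmd += ["-l", lang]     # language selection, e.g. "frk"
    if pdf:
        cmd.append("pdf")       # config name that requests pdf output
    return cmd


def run_ocr(image, outputbase, lang=None, pdf=False):
    """Run the command; requires tesseract to be installed and on PATH."""
    subprocess.run(build_tesseract_cmd(image, outputbase, lang, pdf), check=True)


# Only print the command here, so the sketch runs without tesseract installed.
print(" ".join(build_tesseract_cmd("page.tif", "page", lang="frk", pdf=True)))
```

With these placeholder names the resulting pdf would be written next to the given output base, as page.pdf.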
I have been using Tesseract 3.04, which I could use just by adding the include
file to my project, but when I downloaded the new version, Tesseract 4.00,
there was no include file. Can anyone please help me with this? Thank you.
Hi,
I am using Tesseract 4 (git 10f4998a) to process a file with two columns.
A snippet of the image is shown below. The problem is that there is a
fuzzy line between the two columns, and the column detector has got
confused. I've ended up with one block covering the first column up to