I have this problem too.
I used jTessBoxEditor to train Tesseract.
My TIFF file had 34,000 words, and I built it as a 50-page TIFF file.

But the output traineddata file was 1.5 MB and didn't detect any words!

Is there a problem with jTessBoxEditor?

On 2/25/14, Bernard Polarski <[email protected]> wrote:
> How do you produce your traineddata?
>
>
>
> On Tuesday, 25 February 2014 17:51:39 UTC+1, Frederico Ferro Schuh wrote:
>>
>> Hello all,
>>
>> I'm training Tesseract to recognize handwritten digits, and I have
>> provided it with about 6000 samples of each digit, in 10 different box
>> files, one for each digit. Each box file is a 2152x2152 TIF file. However,
>> the resulting traineddata file I get after completing the training
>> procedure is only 137 KB.
>> I went through the process again, providing smaller sample files (1000
>> samples of each digit), and ended up with the same traineddata size of
>> 137 KB.
>> Is this size reasonable or am I doing something wrong?
>> I assume something is wrong because my results are pretty bad so far.
>>
>> I've attached the sample image I am using for the digit 0.
>>
>> Thanks in advance,
>> Fred
>>
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
