Good morning everyone, 

First of all, I found a similar problem in this post, although the solutions 
there didn't seem to help me:
https://groups.google.com/forum/#!msg/tesseract-ocr/O8EEFSSj7_I/aRCIzGbvAgAJ

So the question is: after many iterations over hundreds of pages, shouldn't 
the size of the output traineddata be different from the input? Mine is always 
the same. I'm training with my own set of images. Here's what I'm doing:

1 - Create box files
2 - Create lstm models

3 - Start LSTM training using:

    lstmtraining \
        --model_output output/por \
        --continue_from por.lstm \
        --traineddata tesseract/tessdata/por.traineddata \
        --max_iterations 400 \
        --train_listfile train/por.training_files.txt


4 - After training is complete:


    lstmtraining \
        --stop_training \
        --continue_from output/por_checkpoint \
        --traineddata tesseract/tessdata/por.traineddata \
        --model_output por_NEW.traineddata


Am I doing something wrong? Or should the traineddata files (input and result) 
really have EXACTLY the same size?
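
To be precise about what "same size" means here: an identical byte count does 
not by itself show that the contents are identical. A rough sketch of a check 
that separates the two cases is below. The function name `compare_files` is 
made up for illustration, and it only uses the standard `wc` and `cmp` tools, 
nothing from Tesseract itself.

```shell
# compare_files A B
# Hypothetical helper (not part of Tesseract): reports whether two files
# differ in size, are byte-for-byte identical, or have the same size but
# different content.
compare_files() {
    a="$1"
    b="$2"

    # Byte counts; tr strips the padding some wc implementations add.
    size_a=$(wc -c < "$a" | tr -d ' ')
    size_b=$(wc -c < "$b" | tr -d ' ')

    if [ "$size_a" -ne "$size_b" ]; then
        echo "sizes differ"
    elif cmp -s "$a" "$b"; then
        # cmp -s exits 0 only when every byte matches.
        echo "identical"
    else
        echo "same size, different content"
    fi
}
```

Calling it with the input and output traineddata paths as the two arguments 
would show whether the output is merely the same length as the input or 
actually the same file.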

Thanks in advance 
