[tesseract-ocr] Caching in TrainLineRecognizer?

Jens Weibler Sat, 04 Mar 2017 22:33:06 -0800

Hi,

I'm new to tesseract and wondered why the LSTM dataset creation for the 
training process has to write the file again and again in 
TrainLineRecognizer. I've seen 200 MB/s of disk IO while creating the 
training data set.
As far as I can see, for the training case it would be sufficient to load 
it once and write it at the end. The same applies to the box and tif files, 
but these are only read and not written...
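
To illustrate the idea, here is a rough sketch in Python. It is not
tesseract's actual code: the function names and the in-memory "store" are
hypothetical, and it only counts how many bytes would be written if the
whole data set were re-serialized after every added sample, compared with
writing it once at the end.

```python
import io


def rewrite_every_time(samples):
    """Hypothetical pattern: re-serialize the whole store after each sample."""
    store = []
    bytes_written = 0
    for sample in samples:
        store.append(sample)
        buf = io.BytesIO()  # stands in for the output file
        for item in store:
            buf.write(item)
        bytes_written += buf.tell()
    return bytes_written


def write_once(samples):
    """Hypothetical alternative: collect everything, serialize once at the end."""
    store = list(samples)
    buf = io.BytesIO()  # stands in for the output file
    for item in store:
        buf.write(item)
    return buf.tell()


# Ten dummy "samples" of 100 bytes each.
samples = [b"x" * 100 for _ in range(10)]
print(rewrite_every_time(samples), write_once(samples))
```

In this toy model the first variant writes an amount that grows
quadratically with the number of samples, while the second grows linearly.
Whether that matches what TrainLineRecognizer really does is an assumption
on my side.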



Thanks,
Jens Weibler

