Dear Tom,

In my practical experience, ocropus cannot train on a large number of image
files because it runs into memory errors. It is better to train on a smaller
number (not size) of image files to avoid any memory error. In this context,
I suggest training on folder1, folder2, folder3, etc., each containing a
small number of image files (tif or png) of the language being trained.
Training each folder will generate the following data files:

1. boxdata.h5
2. boxdata.cmodel
3. boxdata.split
4. page.bin.png
5. page.pseg.png
6. page.
7. book-xxxx

If the script ./run-box-training is run for each of folder1, folder2, and
folder3, it will generate these 7 data files for each folder separately.
My question is whether it is possible to copy and paste the contents of the
7 data files so generated into the main 7 data files already created.
The main 7 data files would then contain the combined contents of all the
data files generated from folder1, folder2, folder3, etc.
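
To make the batching step described above concrete, here is a minimal Python
sketch that copies a large set of images into folder1, folder2, ... with a
fixed number of files in each. The function name split_into_batches, the
batch size of 50, and the directory names in the usage note are my own
assumptions for illustration; they are not part of ocropus.

```python
import shutil
from pathlib import Path

# Image types mentioned above (tif or png); the exact set is an assumption.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png"}


def split_into_batches(src_dir, dest_root, batch_size=50):
    """Copy images from src_dir into dest_root/folder1, folder2, ...

    Each folder receives at most batch_size images. Returns the list of
    folders created, in order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Sort so that the assignment of images to folders is reproducible.
    images = sorted(
        p for p in Path(src_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

    folders = []
    for start in range(0, len(images), batch_size):
        folder = Path(dest_root) / f"folder{start // batch_size + 1}"
        folder.mkdir(parents=True, exist_ok=True)
        for image in images[start:start + batch_size]:
            shutil.copy2(image, folder / image.name)
        folders.append(folder)
    return folders
```

For example, split_into_batches("all_images", "batches", batch_size=50)
would create batches/folder1, batches/folder2, and so on. This sketch only
prepares the folders; running ./run-box-training on each one, and combining
the resulting data files, is not covered here.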

Awaiting your comments.
With warmest regards,
