Re: [tesseract-ocr] Tesseract 4: Shuffling training instances and unicharset compression at the same time?

2017-05-12 Thread 'kolomiyets' via tesseract-ocr
Thanks!

On Friday, May 12, 2017 at 10:24:21 AM UTC+2, shree wrote:
> Please see
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
>
> 80 is the default. I think it means both 64 and 16 are applied.
>
> train_mode int 80 Flags from TrainingFlags in lstmrecognizer.h

Re: [tesseract-ocr] Tesseract 4: Shuffling training instances and unicharset compression at the same time?

2017-05-12 Thread ShreeDevi Kumar
Please see
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00

80 is the default. I think it means both 64 and 16 are applied.

train_mode int 80 Flags from TrainingFlags in lstmrecognizer.h
Possible values = 64 for Compress unicharset, 16 for round-robin training.

ShreeDevi
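If train_mode is a bit field, as the reply suggests, the default of 80 is simply 64 + 16 with both flags set. The following is a minimal sketch of that arithmetic only. The constant and function names are made up for illustration and are not taken from lstmrecognizer.h; only the values 64, 16 and 80 come from the wiki text quoted above.

```python
# Flag values as quoted from the wiki; the names are illustrative assumptions.
COMPRESS_UNICHARSET = 64
ROUND_ROBIN_TRAINING = 16


def decode_train_mode(value):
    """Return the names of the flags whose bits are set in `value`."""
    flags = {
        "compress_unicharset": COMPRESS_UNICHARSET,
        "round_robin_training": ROUND_ROBIN_TRAINING,
    }
    return [name for name, bit in flags.items() if value & bit]


if __name__ == "__main__":
    # 64 | 16 == 80, so the default would enable both behaviours.
    print(decode_train_mode(80))
    # A value of 64 alone would leave round-robin training off.
    print(decode_train_mode(64))
```

Under this reading, passing 64 on its own would switch round-robin training off, which would match the sequential behaviour described in the original question.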

[tesseract-ocr] Tesseract 4: Shuffling training instances and unicharset compression at the same time?

2017-05-12 Thread 'kolomiyets' via tesseract-ocr
Hi, I noticed that when training with unicharset compression (train_mode), training instances are used sequentially from one lstmf training file. This causes the model to converge locally on the current training font, while the other fonts (training instances) are not used for training at all.
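To make the difference concrete, here is a toy sketch that contrasts reading samples sequentially, file by file, with taking them in round-robin order across files. It is not Tesseract's implementation; the file names and sample labels are hypothetical and only illustrate the ordering problem described above.

```python
from itertools import zip_longest


def sequential(files):
    """Exhaust each file in turn: every sample of one font, then the next."""
    return [sample for samples in files.values() for sample in samples]


def round_robin(files):
    """Take one sample from each file per pass, skipping exhausted files."""
    skip = object()  # sentinel marking a file that has run out of samples
    order = []
    for group in zip_longest(*files.values(), fillvalue=skip):
        order.extend(sample for sample in group if sample is not skip)
    return order


if __name__ == "__main__":
    # Hypothetical training files, one per font.
    files = {
        "fontA.lstmf": ["A1", "A2", "A3"],
        "fontB.lstmf": ["B1", "B2"],
    }
    print(sequential(files))   # all of font A before any of font B
    print(round_robin(files))  # fonts interleaved from the first pass
```

With sequential order, a trainer that stops early or converges quickly sees only the first font; with round-robin order every font contributes from the start.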