Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-08 Thread ShreeDevi Kumar
Tom, Please see https://github.com/tesseract-ocr/tesseract/pull/466 I think the developers may want to focus on the merge of Google's private new LSTM codebase with the public github repo. ShreeDevi भजन - कीर्तन - आरती @

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-08 Thread Tom De Costere
It seems my topic is not suitable for the DEV forum (topic creation was refused). I would sincerely appreciate it if anyone could bring this topic to the attention of the devs. Thanks in advance! Tom On Friday, 4 November 2016 at 13:21:56 UTC+1, shree wrote: > > Probably better to post on

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-04 Thread ShreeDevi Kumar
Probably better to post on tesseract-dev, though there is no guarantee that the developers will reply. On 4 Nov 2016 3:07 p.m., "Tom De Costere" wrote: > Just to be sure, are the developers watching this Google Group or should I > make a topic under the

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-04 Thread Tom De Costere
Just to be sure, are the developers watching this Google Group, or should I make a topic under the "tesseract-dev" group? FYI: we passed the 5k mark for the number of fonts this morning. I'm thinking of notifying the users that they should only create box files for documents containing terrible

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-03 Thread ShreeDevi Kumar
Please see https://github.com/tesseract-ocr/tesseract/blob/master/training/language-specific.sh The maximum number of fonts for each language is not very large. I am not even sure whether increasing the number of fonts beyond a certain point will improve the recognition. I think it is unlikely that tesseract

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-03 Thread Tom De Costere
Hello, Thank you for your responses! Let me clarify the situation in which the training is performed, so you understand why we have 130+ tr files. We have fill-in forms for our customers, which they have to hand over to our workers, in order to specify when and what our workers have

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-02 Thread RKVS Raman
But why would you need 130 tr files? Are you using 130 fonts? I believe there is a limit of 64 fonts in tesseract. If it is just 1 font (or 1 kind of handwriting in your case), then you can put it in 1 multi-page TIFF file that does not exceed 120 pages. Best Regards -Raman
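
A quick way to see how far a training set is from the font limit guessed above is to count the .tr files before running mftraining. The following is only a minimal Python sketch: the value 64 is the figure mentioned in this message and has not been checked against the tesseract source, and the helper names (list_tr_files, check_tr_count) and the directory layout are hypothetical.

```python
from pathlib import Path

# Assumption: 64 is the font limit guessed in the thread, not a verified
# tesseract constant. Pass a different limit if you know the real value.
ASSUMED_FONT_LIMIT = 64


def list_tr_files(training_dir):
    """Return the sorted names of all .tr files directly inside training_dir."""
    return sorted(p.name for p in Path(training_dir).glob("*.tr"))


def check_tr_count(training_dir, limit=ASSUMED_FONT_LIMIT):
    """Return (number of .tr files, True if that number exceeds limit)."""
    count = len(list_tr_files(training_dir))
    return count, count > limit


if __name__ == "__main__":
    # Hypothetical usage: check the current directory.
    count, over = check_tr_count(".")
    print(f"{count} .tr files found; over assumed limit: {over}")
```

If the count is well above the limit, merging the samples of one handwriting into a single multi-page TIFF, as suggested above, would reduce the number of .tr files.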

Re: [tesseract-ocr] mftraining Segmentation fault error

2016-11-02 Thread ShreeDevi Kumar
Please see https://groups.google.com/forum/#!msg/tesseract-dev/u5CSn3B3mYc/U39zS6MeCQAJ There seems to be a limit --- ShreeDevi भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com On Wed, Nov 2, 2016 at 5:44 PM, Tom De Costere

[tesseract-ocr] mftraining Segmentation fault error

2016-11-02 Thread Tom De Costere
Hello, We are trying to train tesseract with a new font consisting of multiple handwritings from our customers. The training itself works nicely and the OCR results are very good (85-90% correct detection). However today something strange started to happen during the training process (which