Re: [tesseract-ocr] tesseract data files

2018-03-04 Thread Simon Eigeldinger
Hm. I guess I'll just ship all 3 of them. *lol* And add the text of the wiki to the readme. Greetings, Simon On 04.03.2018 at 18:43, ShreeDevi Kumar wrote: The traineddata files in tessdata_best are larger in size and OCR takes more time. They are supposedly slightly more accurate, but there

Re: [tesseract-ocr] tesseract data files

2018-03-04 Thread Simon Eigeldinger
Hi ShreeDevi, I have scrapped the Cygwin builds. I am now using the builds I get from AppVeyor, which only require me to repackage the resulting files. So tessdata_best isn't for better accuracy, as the wiki says? Greetings, Simon On 03.03.2018 at 05:12, ShreeDevi Kumar wrote: Hi

Re: [tesseract-ocr] tesseract data files

2018-03-02 Thread ShreeDevi Kumar
Hi Simon, If you are planning to package using 4.00alpha from the master branch, please use the traineddata files from tessdata_fast. These are the files that have been shipped for Ubuntu 18.04 and included in Debian. See https://github.com/tesseract-ocr/tesseract/wiki for some links. You can update the

[tesseract-ocr] tesseract data files

2018-03-02 Thread Simon Eigeldinger
Hi all, I just looked at the git commits for tesseract and read that there have been changes to the OCR modes. Are the 3 tessdata sets still valid? tessdata_fast and tessdata_best have been updated, so I guess those reflect the latest developments, but tessdata hasn't had an update since September.