[tesseract-ocr] Re: Tesseract training has an upper limit on the use of cpu?Is the more cpu, the faster the training?

2018-12-09 Thread bruce
Hi Junye, I now have a workstation with 36 cores (Intel(R) Xeon(R) E7-4820 v2, 2.00GHz), 32 GB of memory, and RHEL 7.3. My training text is about *29MB* and contains *9470568* characters. The .tif

[tesseract-ocr] Re: Tesseract training has an upper limit on the use of cpu?Is the more cpu, the faster the training?

2018-11-27 Thread Junye Li
I don't think that would be the case unless your training text is a few hundred megabytes in size... I am running Tesseract on Ubuntu 18.04 and, based on a very quick test, Tesseract on Ubuntu performed better than on Windows in terms of agreement accuracy (I'm training it for

[tesseract-ocr] Re: Tesseract training has an upper limit on the use of cpu?Is the more cpu, the faster the training?

2018-11-27 Thread bruce
Hi Junye Li, I have a workstation with 36 cores (2.0 GHz) and 24 GB of memory, running RHEL. I'm now running text2image to generate the tif/box files; I guess it still needs about a week to finish. Next, I will run tesseract to generate the .lstm files, which I guess will take about two weeks.

[tesseract-ocr] Re: Tesseract training has an upper limit on the use of cpu?Is the more cpu, the faster the training?

2018-11-25 Thread Junye Li
Hi bruce, Hardware requirements can be found here: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#hardware-software-requirements. Tesseract uses at most 4 cores/threads (if your CPU supports hyper-threading). I had the training running on a 40-core workstation and it