Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-16 Thread Ravil R
OK, got it: no need to pay too much attention to the libraries other than tesseract itself. On Wednesday, 15 April 2020, 21:45:39 UTC+3, user zdenop wrote: > > Just for future reference: for AVX (and ...) support, only tesseract needs to be > rebuilt - it depends on compiler and HW. > Of

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-15 Thread Zdenko Podobny
Just for future reference: for AVX (and ...) support, only tesseract needs to be rebuilt - it depends on compiler and HW. Of course it makes sense to use the latest versions of tesseract's dependencies (because of security, bugfixes etc.), but they have (AFAIK) minimal effect on tesseract speed

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-15 Thread Ravil R
Yes, exactly. I updated the libraries (without turbojpeg and libarchive) and added AVX2 support; now it works at least 10 times faster than before. Problem solved. Thank you very much! Ravil On Tuesday, 14 April 2020, 13:25:03 UTC+3, user zdenop wrote: > > Without AVX support tesseract 4/5

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-14 Thread Zdenko Podobny
Without AVX support tesseract 4/5 will be slow(er). So try to focus on this. Using more than one language will slow OCR down too... Zdenko On Tue, 14 Apr 2020 at 5:56, Ravil R wrote: > Oh you gave so much info, thanks! > My test exe file shows this version information: > tesseract 5.0.0 >
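One way to check whether a given build reports AVX support is to look at the output of `tesseract --version`. The sketch below assumes that the build prints lines of the form `Found AVX2`, `Found AVX`, `Found SSE` after the library versions; that format, the function name `simd_features` and the sample text are assumptions for illustration, and a particular build may print something different.

```python
import re

def simd_features(version_output):
    """Collect feature names from lines that look like ' Found AVX2'.

    The 'Found <name>' line format is an assumption about what
    `tesseract --version` prints; adjust the pattern if a build differs.
    """
    pattern = re.compile(r"^\s*Found\s+(\S+)\s*$", re.MULTILINE)
    return {match.group(1) for match in pattern.finditer(version_output)}

# Hypothetical sample output, not taken from the thread.
SAMPLE_OUTPUT = """tesseract 5.0.0
 leptonica-1.79.0
 Found AVX2
 Found AVX
 Found SSE
"""

features = simd_features(SAMPLE_OUTPUT)
has_avx = "AVX" in features or "AVX2" in features
```

To use it on a real installation, the text could be captured with `subprocess.run(["tesseract", "--version"], capture_output=True, text=True)` and the `stdout` and `stderr` strings concatenated before parsing, since it may depend on the version which stream is used. An empty result would suggest a build without AVX, or simply a different output format.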

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-13 Thread Ravil R
Oh, you gave so much info, thanks! My test exe file shows this version information: tesseract 5.0.0 leptonica-1.79.0 (Apr 14 2020, 06:42:43) [MSC v.1900 LIB Debug x86] libjpeg 9b : libpng 1.6.32 : libtiff 4.0.7 : zlib 1.2.11 Looks like I need to add (upgrade) the whole package. Monday, 13

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-13 Thread Zdenko Podobny
OS Name: Microsoft Windows 10 Pro OS Version:10.0.18362 N/A Build 18362 System Model: Latitude E5570 System Type: x64-based PC Processor(s): 1 Processor(s) Installed. [01]: Intel64 Family 6 Model

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-13 Thread Ravil R
1) It has the standard fax resolution of 204x196 dpi. Of course I can convert it to 300x300 and then recognize it. Does that make sense? 2) The font could be of any type and in a different language (eng or rus), so no fine-tuning is possible. On Monday, 13 April 2020, 17:56:24 UTC+3, user shree wrote: > > >
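The conversion from 204x196 dpi to 300x300 dpi mentioned above means resampling each axis by a different factor, because fax pixels are not square. A minimal sketch of the arithmetic follows; the function name and the 1728x2156 page size are hypothetical values chosen for illustration, not taken from the thread.

```python
def resampled_size(width_px, height_px, src_dpi=(204, 196), dst_dpi=(300, 300)):
    """Return the pixel size that keeps the physical page size unchanged.

    Each axis is scaled by dst_dpi / src_dpi and rounded to the nearest
    pixel using integer arithmetic.
    """
    src_x, src_y = src_dpi
    dst_x, dst_y = dst_dpi
    new_width = (width_px * dst_x + src_x // 2) // src_x
    new_height = (height_px * dst_y + src_y // 2) // src_y
    return new_width, new_height

# Hypothetical fax page of 1728 x 2156 pixels at 204 x 196 dpi.
new_size = resampled_size(1728, 2156)
```

The resulting size can be passed to whatever image library does the actual resampling. The horizontal factor is 300/204 (about 1.47) and the vertical one is 300/196 (about 1.53), so scaling both axes by a single factor would leave the text slightly distorted.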

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-13 Thread Shree Devi Kumar
> if a tested app is compiled using Release build it is 30% faster, but still very slow. Debug builds are going to be slower. I tested with the command line on Linux. The TIFF file does take long to recognize. Changing the file to 300 dpi and a smaller size sped up recognition somewhat. If all your images

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-13 Thread Ravil R
Sorry, I have only just seen your full answer with the questions; yesterday I just got an email with the advice to go to the forum, which I did. Now the answers: 1) I tested the latest 5.0.0-alpha build using all types of data files, modern: best, fast, normal and old: for 3.0 version 2)

Re: [tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-13 Thread Zdenko Podobny
Why did you decide to ignore the instructions in comment https://github.com/tesseract-ocr/tesseract/issues/2946#issuecomment-612613461 ? Why should we care about your problems if you do not care? Zdenko On Sun, 12 Apr 2020 at 16:00, Ravil R wrote: > I have my own simple Windows dll based on

[tesseract-ocr] 2 min on 1 page TIFF using Fast trained data

2020-04-12 Thread Ravil R
I have my own simple Windows dll based on the tesseractmain.cpp code. It has worked fine since Tesseract 3.x (I have now moved it to the latest 5 build) and the only issue that still persists is its low speed - a 1-page TIFF takes around 2 minutes even with the Fast version of tessdata ('eng+rus'). Is this how it
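To see how much of the time comes from using two languages, the same page can be timed through the command-line program with 'eng', 'rus' and 'eng+rus'. This is only a rough sketch: it assumes a `tesseract` executable is on the PATH, and the helper names and file names are hypothetical.

```python
import subprocess
import time

def build_command(image_path, output_base, langs):
    """Build a tesseract command line; languages are joined with '+'."""
    return ["tesseract", image_path, output_base, "-l", "+".join(langs)]

def time_ocr(image_path, output_base, langs):
    """Run tesseract once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(build_command(image_path, output_base, langs), check=True)
    return time.perf_counter() - start
```

For example, calling `time_ocr("page.tif", "out", ["eng"])` and then `time_ocr("page.tif", "out", ["eng", "rus"])` on the same file gives two figures that can be compared directly. A single run is noisy, so repeating each measurement a few times is advisable.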