> if the tested app is compiled as a Release build it is 30% faster, but
> still very slow.
Debug builds are going to be slower.

I tested from the command line on Linux. The TIFF file does take a long
time to recognize. Converting the file to 300 dpi and a smaller size sped
up recognition somewhat.
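
As a rough illustration of the resizing step, here is a minimal sketch of
how the target pixel dimensions for a 300 dpi copy can be worked out from
the source resolution. The names `Size` and `ScaleToDpi` are hypothetical
helpers made up for this example; they are not part of Tesseract or
Leptonica, and the actual resampling would still have to be done with an
image tool of your choice.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: pixel dimensions of a page image.
struct Size {
    int width;
    int height;
};

// Hypothetical helper: pixel dimensions the page would have after
// resampling from source_dpi to target_dpi while keeping the same
// physical page size. Defaults to a 300 dpi target.
Size ScaleToDpi(Size src, double source_dpi, double target_dpi = 300.0) {
    assert(source_dpi > 0.0);  // a zero or negative dpi makes no sense
    const double factor = target_dpi / source_dpi;
    return Size{
        static_cast<int>(std::lround(src.width * factor)),
        static_cast<int>(std::lround(src.height * factor)),
    };
}
```

With illustrative numbers only (not measured from the attached file), a
4960 x 7016 pixel page scanned at 600 dpi gives
`ScaleToDpi(Size{4960, 7016}, 600.0)` = 2480 x 3508 pixels, a quarter of
the original pixel count.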

If all your images use the same font, you can try some fine-tuning to see
if it helps.

On Mon, Apr 13, 2020 at 1:20 PM Ravil R <[email protected]> wrote:

> Sorry, I have only just seen your full answer with the questions. Yesterday
> I only got an email advising me to go to the forum, which I did.
> Now the answers:
> 1) I tested the latest 5.0.0-alpha build using all types of data files,
> modern (best, fast, normal) and old (for version 3.0).
> 2) Yesterday I also tested versions 3.05 (with old tessdata files) and 4.0
> (with both old data files and modern "Fast" data files).
> 3) My PC is a notebook: i7-7700HQ, 32 GB, Windows 10, MS VC 2015. During
> recognition, one core is fully loaded.
> 4) I read the issues regarding performance but didn't find them useful. When
> someone complains that 2 seconds is too slow, it just makes me laugh.
> 5) 2 minutes per page with "Fast" data is an approximate value. If the
> tested app is compiled as a Release build it is 30% faster, but still very
> slow. Recognition with "Best" data files takes around 5 minutes.
> 6) The Tesseract version doesn't significantly affect the results.
> 7) The old data files are about the same size as the "best" data files and
> work a little faster than the "fast" data files, but produce worse output
> than "fast". So recognition quality is improving.
>
> On Monday, April 13, 2020 at 10:08:08 UTC+3, zdenop wrote:
>>
>> Why did you decide to ignore the instructions in comment
>>
>> https://github.com/tesseract-ocr/tesseract/issues/2946#issuecomment-612613461
>> ?
>> Why should we care about your problems if you do not care?
>>
>> Zdenko
>>
>>
>> On Sun, Apr 12, 2020 at 16:00, Ravil R <[email protected]> wrote:
>>
>>> I have my own simple Windows DLL based on the tesseractmain.cpp code. It
>>> has worked fine since Tesseract 3.x (I have now moved it to the latest 5
>>> build), and the only issue that still persists is its low speed: a 1-page
>>> TIFF takes around 2 minutes even with the Fast version of tessdata
>>> ('eng+rus'). Is this how it actually works, or is there something I don't
>>> understand?
>>> Almost all of the time is spent in this line:
>>> api.ProcessPages("c:\\1.tif", NULL, 0, NULL);
>>> Sample file is attached
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/759d47df-da5f-4683-ab13-0f8ffb08b159%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/759d47df-da5f-4683-ab13-0f8ffb08b159%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>



