Sorry, I have only just seen your full answer with the questions. Yesterday I only got an email with the advice to go to the forum, which I did. Now the answers:
1) I tested the latest 5.0.0-alpha build using all types of data files: the modern ones (best, fast, normal) and the old ones for version 3.0.
2) Yesterday I also tested versions 3.05 (with the old tessdata files) and 4.0 (with both the old data files and the modern "fast" data files).
3) My PC is a notebook: i7-7700HQ, 32 GB, Windows 10, MS VC 2015. During recognition, one core is fully loaded.
4) I read the issues regarding performance but didn't find them useful. When someone complains that 2 seconds is too slow, it just makes me laugh.
5) 2 minutes per page with the "fast" data is an approximate value. If the test app is compiled as a Release build it is 30% faster, but still very slow. Recognition with the "best" data files takes around 5 minutes.
6) The Tesseract version doesn't significantly affect the results.
7) The old data files are about the same size as the "best" data files and work a little faster than the "fast" data files, but produce worse output than "fast". So the quality of recognition is improving.
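To replace the approximate per-page timings with measured ones, a minimal sketch of a wall-clock timer is given here. `time_seconds` is a hypothetical helper introduced only for illustration and is not part of Tesseract; it uses only the C++ standard library. The commented usage line assumes an already initialised `api` object like the one in the original post.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of Tesseract): runs `work` once and returns
// the elapsed wall-clock time in seconds. steady_clock is monotonic, so the
// result is not affected by system clock adjustments.
double time_seconds(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Assumed usage, wrapping the call quoted in the original post:
//   double s = time_seconds([&] { api.ProcessPages("c:\\1.tif", NULL, 0, NULL); });
```

Running the same measurement for Debug and Release builds and for each tessdata set would give directly comparable numbers.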
On Monday, April 13, 2020 at 10:08:08 UTC+3, user zdenop wrote:
>
> Why did you decide to ignore the instructions in the comment
> https://github.com/tesseract-ocr/tesseract/issues/2946#issuecomment-612613461 ?
> Why should we care about your problems if you do not care?
>
> Zdenko
>
> On Sun, 12 Apr 2020 at 16:00, Ravil R <[email protected]> wrote:
>
>> I have my own simple Windows DLL based on the tesseractmain.cpp code. It
>> has worked fine since Tesseract 3.x (I have now moved it to the latest 5
>> build), and the only issue that still persists is its low speed: a 1-page
>> TIFF takes around 2 minutes even with the Fast version of tessdata
>> ('eng+rus'). Is this how it actually works, or is there something I don't
>> understand?
>> Almost all of the time is spent in this line:
>> api.ProcessPages("c:\\1.tif", NULL, 0, NULL);
>> Sample file is attached.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/759d47df-da5f-4683-ab13-0f8ffb08b159%40googlegroups.com
>>

