Thank you for the detailed info.

My suggestion is to try recognition with eng.traineddata from the
tessdata_fast repository, run with --oem 1 (the LSTM-only engine).
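
To check whether the switch helps, you could time the same page against both
model directories. Below is a rough sketch in Python, not a tested benchmark.
It assumes tesseract is on your PATH and that each repository is checked out
into its own directory. The helper names (build_command, time_ocr,
compare_models) and all paths are made up for illustration.

```python
import subprocess
import time


def build_command(image, out_base, tessdata_dir, lang="eng", oem=1):
    """Return the tesseract argv for one page as a list of strings."""
    return [
        "tesseract",
        str(image),
        str(out_base),
        "--tessdata-dir", str(tessdata_dir),  # directory holding <lang>.traineddata
        "-l", lang,
        "--oem", str(oem),                    # 1 = LSTM-only engine
    ]


def time_ocr(cmd):
    """Run one tesseract command and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare_models(image, model_dirs):
    """Time the same page once per model directory.

    model_dirs maps a label to a tessdata directory (hypothetical paths).
    Returns a dict of label -> elapsed seconds.
    """
    results = {}
    for label, tessdata_dir in model_dirs.items():
        cmd = build_command(image, f"out_{label}", tessdata_dir)
        results[label] = time_ocr(cmd)
    return results
```

For example, compare_models("page.png", {"best_guess": "/path/to/tessdata",
"fast": "/path/to/tessdata_fast"}) would run the page once per directory. A
single run is noisy, so repeat it over several pages before drawing conclusions.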


On Tue 3 Apr, 2018, 3:13 AM Patrick Ramsey, <[email protected]>
wrote:

> Answers below inline. And thank you very much for your help :)
>
> |PTR
>
> On Friday, March 30, 2018 at 2:00:18 AM UTC-7, shree wrote:
>>
>> Please check GitHub/issues for similar reports and suggestions.
>>
>> Also specify,
>>
>> Which version/commit of tesseract 4
>>
>
> commit hash: 40f43111e05b3dd2f2f8aeae3aba33016523c881
> tag: 4.0.0-beta.1
>
>> Which traineddata file, from which repo
>>
>
> eng.traineddata from https://github.com/tesseract-ocr/tessdata at commit
> 9b2e3f6642285b3e9a7a5852e5b10259e42d5510
>
>
>> Which o/s
>>
>
> Ubuntu 17.10 on amd64
>
>>
>> tesseract -v
>>
>
> tesseract 4.0.0-beta.1
>  leptonica-1.74.4
>   libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.1) : libpng 1.6.34 :
> libtiff 4.0.8 : zlib 1.2.11 : libwebp 0.6.0 : libopenjp2 2.2.0
>
>  Found AVX2
>  Found AVX
>  Found SSE
>
>> On Fri 30 Mar, 2018, 2:19 PM Patrick Ramsey, <[email protected]>
>> wrote:
>>
>>> Hi!
>>>
>>> I am running Tesseract 4 on clean, 1-bit images of rasterized text
>>> (not printed and scanned). The output is very accurate, as expected,
>>> but Tesseract takes about 1 second to process a single page on a Core
>>> i7 CPU, which seems much longer than I would have expected.
>>>
>>> I've been trying to enable debug output so that I can see what's taking
>>> the most time, to see if there is anything that I could get away with
>>> turning off to speed it up (since I don't need to account for e.g. dirt on
>>> the lens), but thus far I'm feeling pretty stupid.  So:
>>>
>>> A) Is there any straightforward way to get more information on what
>>> tesseract is actually doing? (I've built with --enable-debug and it doesn't
>>> seem to have changed the output on the command line.)
>>> B) Are there any control parameters you folks would suggest setting to
>>> speed up image processing or turn off unnecessary work, given the inputs
>>> I've described?
>>>
>>> Many thanks,
>>>
>>> PTR
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/893cf5f7-8f64-428e-b1fe-5e6214215059%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/893cf5f7-8f64-428e-b1fe-5e6214215059%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>

