Hello Willus,

Can you also test Tesseract 5? Could you share your test input data, or the 
script you use to run the evaluation and generate the output charts?
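
To make the request concrete, here is a minimal sketch of the kind of evaluation 
script I mean. It assumes the error rate is the character-level edit distance 
divided by the length of the reference text, averaged per capital-letter height. 
That definition, the input layout, and all names (edit_distance, 
char_error_rate, mean_error_by_height) are assumptions for illustration only, 
not necessarily how your charts were produced.

```python
from typing import Dict, Iterable, List, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between the reference text and the OCR output."""
    prev = list(range(len(hyp) + 1))
    for i, ref_ch in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_ch in enumerate(hyp, start=1):
            cost = 0 if ref_ch == hyp_ch else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalised by reference length (assumed metric)."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One record per OCR run: (font name, capital-letter height in px,
# reference text, OCR output). This layout is hypothetical.
Record = Tuple[str, int, str, str]


def mean_error_by_height(records: Iterable[Record],
                         exclude: Iterable[str] = ()) -> Dict[int, float]:
    """Average the error rate over fonts for each capital-letter height."""
    skipped = set(exclude)
    buckets: Dict[int, List[float]] = {}
    for font, height, ref, hyp in records:
        if font in skipped:
            continue
        buckets.setdefault(height, []).append(char_error_rate(ref, hyp))
    return {h: sum(v) / len(v) for h, v in sorted(buckets.items())}
```

Calling mean_error_by_height once with no exclusions and once with 
exclude={"Bookman-Demi", "Helvetica-Narrow"} would give the two series 
corresponding to the 6-font and 4-font charts.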

Zdenko
On Monday, December 31, 2018 at 23:23:39 UTC+1, [email protected] wrote:

> So I did some more experimenting and convinced myself that the "xres" and 
> "yres" values in the PIX structure passed to Tesseract have virtually no 
> impact on the results unless the resolution is so poor as to make the error 
> rate very high.  Using that information, I re-ran my tests in a more 
> systematic way on both Tesseract 4 (with the "TessBest" English training 
> data file--14.7 MiB) and Tesseract 3.05 (with CUBE).  The results below 
> show the average error rate for all six fonts, and then the average 
> excluding Bookman-Demi and Helvetica-Narrow, since they're a little out of 
> the ordinary.  The error rate is plotted against the height of a capital 
> letter in pixels, as before.  A couple of things to note:
> 1. Tess v4.0.0 does far better at the lower resolutions (fewer pixels in a 
> capital letter).
> 2. Tess v4.0.0 is more consistent across a broader font selection than 
> Tess v3.05.  This is very good to see.
> 3. However, if I exclude Bookman-Demi and Helvetica-Narrow, Tess v3.05 
> does better for the higher resolutions (40-140 pixel heights).  Tess v4.0.0 
> definitely has a consistent issue with high-res fonts which should be 
> addressed, as I stated in my earlier posts.
>
> 6-font average:
> [image: tess_accuracy_6fonts.png]
>
> Without Bookman-Demi and Helvetica-Narrow:
> [image: tess_accuracy_4fonts.png]
