Hello Willus,

Can you also test Tesseract 5? Can you share your test input data or your evaluation script, including how you generate the output charts?
Zdenko

On Monday, 31 December 2018 at 23:23:39 UTC+1, [email protected] wrote:

> So I did some more experimenting and convinced myself that the "xres" and "yres" values in the PIX structure passed to Tesseract have virtually no impact on the results unless the resolution is so poor as to make the error rate very high. Using that information, I re-ran my tests in a more systematic way on both Tesseract 4 (with the "TessBest" English training data file, 14.7 MiB) and Tesseract 3.05 (with CUBE). The results below show the average error rate for the six fonts, and then the average excluding Bookman-Demi and Helvetica-Narrow, since they're a little out of the ordinary. The error rate is plotted against the height of a capital letter in pixels, as before. A couple of things to note:
>
> 1. Tess v4.0.0 does far better at the lower resolutions (fewer pixels in a capital letter).
> 2. Tess v4.0.0 is more consistent across a broader font selection than Tess v3.05. This is very good to see.
> 3. However, if I exclude Bookman-Demi and Helvetica-Narrow, Tess v3.05 does better at the higher resolutions (40-140 pixel heights). Tess v4.0.0 definitely has a consistent issue with high-res fonts which should be addressed, as I stated in my earlier posts.
>
> 6-font average:
> [image: tess_accuracy_6fonts.png]
>
> Without Bookman-Demi and Helvetica-Narrow:
> [image: tess_accuracy_4fonts.png]
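For anyone who wants to reproduce this kind of comparison before the original data or script is shared, here is a minimal sketch of one way to compute a per-height average error rate. It is not the script used for the charts above. It assumes "error rate" means character-level edit distance divided by the length of the reference text, which the thread does not state, and the function names and the `(font, cap_height_px, reference, ocr_output)` record layout are hypothetical.

```python
from statistics import mean


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference char
                            curr[j - 1] + 1,      # insert a hypothesis char
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Assumed metric: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def average_by_height(results, exclude=frozenset()):
    """Average the error rate over fonts for each capital-letter height.

    `results` is an iterable of (font, cap_height_px, reference, ocr_output)
    tuples (hypothetical layout). Fonts named in `exclude` are skipped.
    """
    by_height = {}
    for font, height, ref, hyp in results:
        if font in exclude:
            continue
        by_height.setdefault(height, []).append(char_error_rate(ref, hyp))
    return {h: mean(rates) for h, rates in sorted(by_height.items())}


if __name__ == "__main__":
    # Made-up sample records, only to show the call shape.
    sample = [
        ("Times", 10, "abcd", "abxd"),
        ("Courier", 10, "abcd", "abcd"),
        ("Bookman-Demi", 10, "abcd", "xxxx"),
    ]
    print(average_by_height(sample))
    print(average_by_height(sample, exclude={"Bookman-Demi", "Helvetica-Narrow"}))
```

Calling `average_by_height` once with no exclusions and once with `exclude={"Bookman-Demi", "Helvetica-Narrow"}` would give the two series (six-font and four-font averages) to plot against capital-letter height.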

