https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8026

            Bug ID: 8026
           Summary: t/extracttext.t tesseract test fails on some
                    installations
           Product: Spamassassin
           Version: 4.0.0
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Regression Tests
          Assignee: dev@spamassassin.apache.org
          Reporter: sid...@sidney.com
  Target Milestone: Undefined

On my FreeBSD 13.1-RELEASE installation, running in a VirtualBox VM with
tesseract 5.1.0 installed from FreeBSD's pkg repository, t/extracttext.t
consistently fails because tesseract reads the "XJ" characters in the test JPEG
file as "X]J".
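
For anyone who wants to check this outside the test suite, here is a minimal
sketch of the kind of comparison that trips on the misread. It is not the code
in t/extracttext.t: the helper names are hypothetical, and it assumes the check
is a plain substring match on the text that `tesseract <image> stdout` prints.

```python
import subprocess


def tesseract_command(image_path):
    """Build a command that asks tesseract to print recognized text to stdout."""
    return ["tesseract", image_path, "stdout"]


def ocr_text(image_path):
    """Run tesseract on image_path and return the text it recognized."""
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout


def contains_expected(ocr_output, expected):
    """Substring check that ignores whitespace differences only.

    Assumption: the real test may compare differently; this only illustrates
    why a single stray character makes an exact match fail.
    """
    squeeze = lambda s: "".join(s.split())
    return squeeze(expected) in squeeze(ocr_output)
```

Calling `contains_expected(ocr_text(path), "XJ")` with the path to the test
image would then show whether a given tesseract build reads the characters
cleanly; output containing "X]J" in place of "XJ" fails the check.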

Recreating the test file using a font that is more tesseract-friendly seems to
help. Since the test is not intended to test the limits of tesseract's OCR
capabilities, this seems like a proper fix. I've redone the test data using the
TeX Gyre Bonum font, based on the results in https://superuser.com/a/1543382

-- 
You are receiving this mail because:
You are the assignee for the bug.
