On Sun, Oct 19, 2025 at 10:54:45AM -0500, Steven Robbins wrote:
> The autopkgtest fails on x86 because "You are running OCRmyPDF in a 32-bit
> (x86) Python interpreter. This is not supported."

No, that's not really the right diagnosis.  The full error is:

=================================== FAILURES ===================================
_____________ test_compression_changed[baiona_color.jpg-lossless] ______________

ocrmypdf_exec = ['/usr/bin/python3', '-m', 'ocrmypdf']
resources = PosixPath('/tmp/autopkgtest-lxc.p758jhb3/downtmp/build.XrT/src/tests/resources')
image = 'baiona_color.jpg', compression = 'lossless'
outpdf = PosixPath('/tmp/pytest-of-debci/pytest-0/test_compression_changed_baion2/out.pdf')

    @pytest.mark.parametrize(
        'image,compression',
        [
            ('baiona.png', 'jpeg'),
            ('baiona_gray.png', 'lossless'),
            ('baiona_color.jpg', 'lossless'),
        ],
    )
    def test_compression_changed(ocrmypdf_exec, resources, image, compression, outpdf):
        input_file = str(resources / image)
        output_file = str(outpdf)

        im = Image.open(input_file)

        # Runs: ocrmypdf - output.pdf < testfile
        with open(input_file, 'rb') as input_stream:
            p_args = ocrmypdf_exec + [
                '--image-dpi',
                '150',
                '--output-type',
                'pdfa',
                '--optimize',
                '0',
                '--pdfa-image-compression',
                compression,
                '--plugin',
                'tests/plugins/tesseract_noop.py',
                '-',
                output_file,
            ]
            p = run(
                p_args,
                capture_output=True,
                stdin=input_stream,
                text=True,
                check=False,
            )
          assert p.returncode == ExitCode.ok, p.stderr
E           AssertionError: You are running OCRmyPDF in a 32-bit (x86) Python interpreter. This is not supported. 32-bit does not have enough address space to process large files. Please use a 64-bit (x86-64) version of Python.
E             reading file from standard input
E             Input file is not a PDF, checking if it is an image...
E             Input file is an image
E             Input image has no ICC profile, assuming sRGB
E             Image seems valid. Try converting to PDF...
E             Successfully converted to PDF, processing...
E
E
E             Postprocessing...
E
E             Image optimization ratio: 1.00 savings: 0.0%
E             Total file size ratio: 0.97 savings: -2.8%
E             Output file is a PDF/A-2b (as expected)
E             WARNING: /tmp/pytest-of-debci/pytest-0/test_compression_changed_baion2/out.pdf (offset 4807): error decoding stream data for object 11 0: invalid jpeg data reading from buffer
E             WARNING: /tmp/pytest-of-debci/pytest-0/test_compression_changed_baion2/out.pdf (offset 4807): stream will be re-processed without filtering to avoid data loss
E             Output file: The generated PDF is INVALID
E
E           assert 4 == <ExitCode.ok: 0>
E            +  where 4 = CompletedProcess(args=['/usr/bin/python3', '-m', 'ocrmypdf', '--image-dpi', '150', '--output-type', 'pdfa', '--optimiz... 4807): stream will be re-processed without filtering to avoid data loss\nOutput file: The generated PDF is INVALID\n').returncode
E            +  and   <ExitCode.ok: 0> = ExitCode.ok

tests/test_main.py:731: AssertionError

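The bit about a 32-bit Python interpreter is only a warning emitted at the start of the stderr output; it doesn't cause this failure on its own. The actual failure is the exit status 4 further down: the output PDF fails validation ("invalid jpeg data" while decoding object 11 0, followed by "The generated PDF is INVALID").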
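
To make that distinction concrete, here is a minimal sketch of how one might triage such a stderr dump. It is not part of ocrmypdf or its test suite: the helper name `triage` and the two marker tuples are hypothetical, the marker strings are simply copied from the log above, and the sample input shortens the output path to `out.pdf`.

```python
# Hypothetical triage helper, not part of ocrmypdf or its tests.
# The marker strings below are copied from the failure log quoted above.
BITNESS_WARNING = "32-bit (x86) Python"
FAILURE_MARKERS = (
    "error decoding stream data",
    "The generated PDF is INVALID",
)


def triage(stderr: str) -> dict:
    """Separate the harmless bitness warning from lines that explain the failure."""
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    return {
        # True if the 32-bit warning appears anywhere in the output.
        "bitness_warning": any(BITNESS_WARNING in line for line in lines),
        # Lines that point at the real problem (the invalid output PDF).
        "failures": [
            line for line in lines
            if any(marker in line for marker in FAILURE_MARKERS)
        ],
    }


# Abbreviated sample based on the log above (path shortened to out.pdf).
sample = (
    "You are running OCRmyPDF in a 32-bit (x86) Python interpreter. "
    "This is not supported.\n"
    "Output file is a PDF/A-2b (as expected)\n"
    "WARNING: out.pdf (offset 4807): error decoding stream data for "
    "object 11 0: invalid jpeg data reading from buffer\n"
    "Output file: The generated PDF is INVALID\n"
)

result = triage(sample)
print("32-bit warning present:", result["bitness_warning"])
for line in result["failures"]:
    print("failure:", line)
```

The warning and the failure lines are reported separately, which matches the reading above: the warning is present, but the lines that account for the non-zero exit status are the JPEG decoding error and the INVALID verdict.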

Looking at https://ci.debian.net/packages/o/ocrmypdf/testing/i386/ and https://ci.debian.net/packages/o/ocrmypdf/testing/amd64/, I see that a number of these tests are flaky even on amd64, and that they passed with previous ghostscript versions. Maybe this one should be added to the tests that Anton recently marked as flaky in https://salsa.debian.org/debian/ocrmypdf/-/blob/debian/debian/patches/20_mark_some_tests_flaky.patch?

--
Colin Watson (he/him)                              [[email protected]]
