On Sun, Oct 19, 2025 at 10:54:45AM -0500, Steven Robbins wrote:
> The autopkgtest fails on x86 because "You are running OCRmyPDF in a 32-bit
> (x86) Python interpreter. This is not supported."

No, that's not really the right diagnosis. The full error is:

=================================== FAILURES ===================================
_____________ test_compression_changed[baiona_color.jpg-lossless] ______________
ocrmypdf_exec = ['/usr/bin/python3', '-m', 'ocrmypdf']
resources = PosixPath('/tmp/autopkgtest-lxc.p758jhb3/downtmp/build.XrT/src/tests/resources')
image = 'baiona_color.jpg', compression = 'lossless'
outpdf = PosixPath('/tmp/pytest-of-debci/pytest-0/test_compression_changed_baion2/out.pdf')

    @pytest.mark.parametrize(
        'image,compression',
        [
            ('baiona.png', 'jpeg'),
            ('baiona_gray.png', 'lossless'),
            ('baiona_color.jpg', 'lossless'),
        ],
    )
    def test_compression_changed(ocrmypdf_exec, resources, image, compression, outpdf):
        input_file = str(resources / image)
        output_file = str(outpdf)
        im = Image.open(input_file)
        # Runs: ocrmypdf - output.pdf < testfile
        with open(input_file, 'rb') as input_stream:
            p_args = ocrmypdf_exec + [
                '--image-dpi',
                '150',
                '--output-type',
                'pdfa',
                '--optimize',
                '0',
                '--pdfa-image-compression',
                compression,
                '--plugin',
                'tests/plugins/tesseract_noop.py',
                '-',
                output_file,
            ]
            p = run(
                p_args,
                capture_output=True,
                stdin=input_stream,
                text=True,
                check=False,
            )
            assert p.returncode == ExitCode.ok, p.stderr
E AssertionError: You are running OCRmyPDF in a 32-bit (x86) Python interpreter. This is not supported. 32-bit does not have enough address space to process large files. Please use a 64-bit (x86-64) version of Python.
E reading file from standard input
E Input file is not a PDF, checking if it is an image...
E Input file is an image
E Input image has no ICC profile, assuming sRGB
E Image seems valid. Try converting to PDF...
E Successfully converted to PDF, processing...
E
E
E Postprocessing...
E
E Image optimization ratio: 1.00 savings: 0.0%
E Total file size ratio: 0.97 savings: -2.8%
E Output file is a PDF/A-2b (as expected)
E WARNING: /tmp/pytest-of-debci/pytest-0/test_compression_changed_baion2/out.pdf (offset 4807): error decoding stream data for object 11 0: invalid jpeg data reading from buffer
E WARNING: /tmp/pytest-of-debci/pytest-0/test_compression_changed_baion2/out.pdf (offset 4807): stream will be re-processed without filtering to avoid data loss
E Output file: The generated PDF is INVALID
E
E assert 4 == <ExitCode.ok: 0>
E + where 4 = CompletedProcess(args=['/usr/bin/python3', '-m', 'ocrmypdf', '--image-dpi', '150', '--output-type', 'pdfa', '--optimiz... 4807): stream will be re-processed without filtering to avoid data loss\nOutput file: The generated PDF is INVALID\n').returncode
E + and <ExitCode.ok: 0> = ExitCode.ok

tests/test_main.py:731: AssertionError

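The bit about a 32-bit Python interpreter is a warning emitted at the
start of the stderr output, but it doesn't cause this failure on its
own. The actual failure is further down: the output PDF contains
invalid JPEG data for object 11 0, so it is reported as INVALID and
ocrmypdf exits with status 4.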
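
As a minimal sketch of why the first line is misleading here: the test
passes the whole of p.stderr as the assertion message, so whatever was
printed first lands right next to "AssertionError", even if it was only
a warning. The child script below is a hypothetical stand-in for
ocrmypdf (not its real code), and last_stderr_line is a helper made up
for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for the ocrmypdf run: warn first, report the real
# problem last, then exit non-zero. This is NOT ocrmypdf's actual code.
CHILD = """
import sys
print("warning: 32-bit interpreter, not supported", file=sys.stderr)
print("Output file: The generated PDF is INVALID", file=sys.stderr)
sys.exit(4)
"""


def last_stderr_line(stderr: str) -> str:
    """Return the last non-blank line of captured stderr (illustrative helper)."""
    lines = [line for line in stderr.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Same run() options as the test uses: capture output, text mode, no raise.
p = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
    check=False,
)

first_line = p.stderr.splitlines()[0]
print("exit status:", p.returncode)
print("first stderr line:", first_line)
print("last stderr line: ", last_stderr_line(p.stderr))
```

The exit status and the tail of stderr are the parts worth reading; the
first line only shows what happened to be printed first.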

Looking at https://ci.debian.net/packages/o/ocrmypdf/testing/i386/ and
https://ci.debian.net/packages/o/ocrmypdf/testing/amd64/, I see that a
number of these tests are flaky even on amd64, and that they passed
with previous ghostscript versions. Maybe this test should be added to
the ones that Anton recently marked as flaky in
https://salsa.debian.org/debian/ocrmypdf/-/blob/debian/debian/patches/20_mark_some_tests_flaky.patch
?
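
For illustration, one way a single parametrized case could be made
non-fatal is pytest.param with a non-strict xfail mark. This is only a
sketch: I'm assuming xfail here for the sake of the example, the
existing patch may well use a different marker, and the reason string
and the placeholder test body are made up.

```python
import pytest

# Assumption: xfail(strict=False) is used purely for illustration; the
# existing Debian patch may mark tests differently.
CASES = [
    ("baiona.png", "jpeg"),
    ("baiona_gray.png", "lossless"),
    pytest.param(
        "baiona_color.jpg",
        "lossless",
        marks=pytest.mark.xfail(
            reason="intermittently produces an invalid PDF (hypothetical wording)",
            strict=False,
        ),
    ),
]


@pytest.mark.parametrize("image,compression", CASES)
def test_compression_changed(image, compression):
    # Placeholder body: the real test runs ocrmypdf and checks the exit status.
    assert image and compression
```

With strict=False, a failure of that one case is reported as xfail and
a pass as xpass, and neither fails the run; the other two cases are
unaffected.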
--
Colin Watson (he/him)