Package: src:beautifulsoup4
Version: 4.12.3-2
Severity: serious
Tags: sid trixie

beautifulsoup4's autopkgtests fail with lxml 5.3.0:
[...]
40s =================================== FAILURES ===================================
40s _________________ TestLXMLTreeBuilder.test_real_xhtml_document _________________
40s
40s self = <bs4.tests.test_lxml.TestLXMLTreeBuilder object at 0x7f8c34ffa960>
40s
40s def test_real_xhtml_document(self):
40s """A real XHTML document should come out more or less the same as it went in."""
40s markup = b"""<?xml version="1.0" encoding="utf-8"?>
40s <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
40s <html xmlns="http://www.w3.org/1999/xhtml">
40s <head><title>Hello.</title></head>
40s <body>Goodbye.</body>
40s </html>"""
40s with warnings.catch_warnings(record=True) as w:
40s soup = self.soup(markup)
40s assert soup.encode("utf-8").replace(b"\n", b"") == markup.replace(b"\n", b"")
40s
40s # No warning was issued about parsing an XML document as HTML,
40s # because XHTML is both.
40s > assert w == []
40s E AssertionError
40s
40s /usr/lib/python3/dist-packages/bs4/tests/__init__.py:432: AssertionError
40s ___________________ TestLXMLTreeBuilder.test_namespaced_html ___________________
40s
40s self = <bs4.tests.test_lxml.TestLXMLTreeBuilder object at 0x7f8c34ffaae0>
40s
40s def test_namespaced_html(self):
40s # When a namespaced XML document is parsed as HTML it should
40s # be treated as HTML with weird tag names.
40s markup = b"""<ns1:foo>content</ns1:foo><ns1:foo/><ns2:foo/>"""
40s with warnings.catch_warnings(record=True) as w:
40s soup = self.soup(markup)
40s
40s assert 2 == len(soup.find_all("ns1:foo"))
40s
40s # n.b. no "you're parsing XML as HTML" warning was given
40s # because there was no XML declaration.
40s > assert [] == w
40s E AssertionError
40s
40s /usr/lib/python3/dist-packages/bs4/tests/__init__.py:446: AssertionError
40s ______________ TestLXMLTreeBuilder.test_detect_xml_parsed_as_html ______________
40s
40s self = <bs4.tests.test_lxml.TestLXMLTreeBuilder object at 0x7f8c34ffac60>
40s
40s def test_detect_xml_parsed_as_html(self):
40s # A warning is issued when parsing an XML document as HTML,
40s # but basic stuff should still work.
40s markup = b"""<?xml version="1.0" encoding="utf-8"?><tag>string</tag>"""
40s with warnings.catch_warnings(record=True) as w:
40s soup = self.soup(markup)
40s assert soup.tag.string == 'string'
40s > [warning] = w
40s E ValueError: too many values to unpack (expected 1)
40s
40s /usr/lib/python3/dist-packages/bs4/tests/__init__.py:455: ValueError
40s =============================== warnings summary ===============================
40s test_fuzz.py: 1 warning
40s test_lxml.py: 91 warnings
40s test_tree.py: 1 warning
40s /usr/lib/python3/dist-packages/bs4/builder/_lxml.py:124: DeprecationWarning: The 'strip_cdata' option of HTMLParser() has never done anything and will eventually be removed.
40s parser = parser(
40s
40s -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
40s =========================== short test summary info ============================
40s FAILED test_lxml.py::TestLXMLTreeBuilder::test_real_xhtml_document - Assertio...
40s FAILED test_lxml.py::TestLXMLTreeBuilder::test_namespaced_html - AssertionError
40s FAILED test_lxml.py::TestLXMLTreeBuilder::test_detect_xml_parsed_as_html - Va...
40s ============ 3 failed, 654 passed, 7 skipped, 93 warnings in 2.36s =============
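
A possible reading of the log, not confirmed by it: all three failing tests inspect the list recorded by warnings.catch_warnings(record=True), and the 'strip_cdata' DeprecationWarning shown in the warnings summary may now be landing in that list as an extra entry. The sketch below reproduces that pattern with the standard library only. The names fake_parse, XMLAsHTMLWarning and relevant are hypothetical stand-ins, not bs4 or lxml APIs, and filtering out DeprecationWarning is just one conceivable way to adapt the assertions, not necessarily the fix upstream would choose.

```python
import warnings


class XMLAsHTMLWarning(UserWarning):
    """Hypothetical stand-in for the warning the tests actually care about."""


def fake_parse(markup: bytes) -> None:
    """Hypothetical stand-in for building a parser and parsing markup."""
    # Unrelated deprecation notice, mirroring the 'strip_cdata' message in the log.
    warnings.warn("the 'strip_cdata' option is deprecated", DeprecationWarning)
    # The warning the test is really about: an XML declaration in "HTML" input.
    if markup.startswith(b"<?xml"):
        warnings.warn("XML document parsed as HTML", XMLAsHTMLWarning)


def relevant(recorded):
    """Keep only recorded warnings that are not DeprecationWarnings."""
    return [w for w in recorded if not issubclass(w.category, DeprecationWarning)]


with warnings.catch_warnings(record=True) as w:
    # Record every warning, regardless of the interpreter's default filters.
    warnings.simplefilter("always")
    fake_parse(b'<?xml version="1.0" encoding="utf-8"?><tag>string</tag>')

# Two warnings were recorded, so a bare `[warning] = w` would raise
# "ValueError: too many values to unpack (expected 1)".
assert len(w) == 2

# Unpacking the filtered list succeeds and yields the warning of interest.
[warning] = relevant(w)
assert warning.category is XMLAsHTMLWarning
```

If this reading is right, the same filtering would also let the `assert w == []` style checks in the first two tests pass, since the only recorded entry there would be the deprecation notice.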