Package: src:beautifulsoup4
Version: 4.14.3-1
Severity: serious
Tags: ftbfs forky sid

Dear maintainer:

During a rebuild of all packages in unstable, this package failed to build.

Below you will find the last part of the build log (probably the most
relevant part, but not necessarily). If required, the full build log
is available here:

https://people.debian.org/~sanvila/build-logs/202512/

About the archive rebuild: The build was made on virtual machines from AWS,
using sbuild and a reduced chroot with only build-essential packages.

If you cannot reproduce the bug, please contact me privately: I am
willing to provide SSH access to a virtual machine where the bug is
fully reproducible.

If this is really a bug in one of the build-depends, please use
reassign and add an affects on src:beautifulsoup4, so that this is still
visible in the BTS web page for this package.

Thanks.

--------------------------------------------------------------------------------
[...]
 debian/rules clean
dh clean --buildsystem=pybuild
   debian/rules override_dh_auto_clean
make[1]: Entering directory '/<<PKGBUILDDIR>>'
dh_auto_clean
rm -rf build
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
   dh_autoreconf_clean -O--buildsystem=pybuild
   dh_clean -O--buildsystem=pybuild
 debian/rules binary
dh binary --buildsystem=pybuild
   dh_update_autotools_config -O--buildsystem=pybuild
   dh_autoreconf -O--buildsystem=pybuild
   dh_auto_configure -O--buildsystem=pybuild
   dh_auto_build -O--buildsystem=pybuild
I: pybuild plugin_pyproject:131: Building wheel for python3.14 with "build" 
module
I: pybuild base:317: python3.14 -m build --skip-dependency-check --no-isolation 
--wheel --outdir /<<PKGBUILDDIR>>/.pybuild/cpython3_3.14_bs4  
* Building wheel...
Successfully built beautifulsoup4-4.14.3-py3-none-any.whl
I: pybuild plugin_pyproject:155: Unpacking wheel built for python3.14 with 
"installer" module
I: pybuild plugin_pyproject:131: Building wheel for python3.13 with "build" 
module
I: pybuild base:317: python3.13 -m build --skip-dependency-check --no-isolation 
--wheel --outdir /<<PKGBUILDDIR>>/.pybuild/cpython3_3.13_bs4  
* Building wheel...
Successfully built beautifulsoup4-4.14.3-py3-none-any.whl
I: pybuild plugin_pyproject:155: Unpacking wheel built for python3.13 with 
"installer" module
   debian/rules execute_after_dh_auto_build
make[1]: Entering directory '/<<PKGBUILDDIR>>'
python3 -m sphinx -aEN -b html -d build/doctrees doc build/html
Running Sphinx v8.2.3
loading translations [en,ja,ko,ru,pt,zh]... not available for built-in messages
making output directory... done
loading intersphinx inventory 'python' from 
https://docs.python.org/3/objects.inv ...
WARNING: failed to reach any of the inventories with the following issues:
intersphinx inventory 'https://docs.python.org/3/objects.inv' not fetchable due 
to <class 'requests.exceptions.ConnectionError'>: 
HTTPSConnectionPool(host='docs.python.org', port=443): Max retries exceeded 
with url: /3/objects.inv (Caused by 
NameResolutionError("<urllib3.connection.HTTPSConnection object at 
0x7fa6990b5940>: Failed to resolve 'docs.python.org' ([Errno -3] Temporary 
failure in name resolution)"))
building [mo]: all of 0 po files
writing output... 
building [html]: all source files
updating environment: [new config] 1 added, 0 changed, 0 removed
reading sources... [100%] index

/<<PKGBUILDDIR>>/doc/index.rst:195: ERROR: Duplicate target name, cannot be 
used as a unique reference: "beautiful soup 3". [docutils]
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying static files... 
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html/_static/basic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html/_static/language_data.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html/_static/documentation_options.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html/_static/alabaster.css
copying static files: done
copying extra files... 
copying extra files: done
copying assets: done
writing output... [100%] index

/<<PKGBUILDDIR>>/doc/index.rst:55: WARNING: unknown document: 'api/modules' 
[ref.doc]
/<<PKGBUILDDIR>>/doc/index.rst:3326: WARNING: 'any' reference target not found: 
class [ref.any]
generating indices... genindex done
writing additional pages... search done
copying images... [100%] 6.1.jpg

dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded, 4 warnings.

The HTML pages are in build/html.
python3 -m sphinx -aEN -b html -D language=es_ES -d build/doctrees.es doc.es 
build/html.es
Running Sphinx v8.2.3
loading translations [es_ES]... done
making output directory... done
loading intersphinx inventory 'python' from 
https://docs.python.org/3/objects.inv ...
WARNING: failed to reach any of the inventories with the following issues:
intersphinx inventory 'https://docs.python.org/3/objects.inv' not fetchable due 
to <class 'requests.exceptions.ConnectionError'>: 
HTTPSConnectionPool(host='docs.python.org', port=443): Max retries exceeded 
with url: /3/objects.inv (Caused by 
NameResolutionError("<urllib3.connection.HTTPSConnection object at 
0x7f3243d95a90>: Failed to resolve 'docs.python.org' ([Errno -3] Temporary 
failure in name resolution)"))
building [mo]: all of 0 po files
writing output... 
building [html]: all source files
updating environment: [new config] 1 added, 0 changed, 0 removed
reading sources... [100%] index

/<<PKGBUILDDIR>>/doc.es/index.rst:198: ERROR: Duplicate target name, cannot be 
used as a unique reference: "beautiful soup 3". [docutils]
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying static files... 
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.es/_static/basic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.es/_static/language_data.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.es/_static/documentation_options.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.es/_static/alabaster.css
copying static files: done
copying extra files... 
copying extra files: done
copying assets: done
writing output... [100%] index

/<<PKGBUILDDIR>>/doc.es/index.rst:56: WARNING: unknown document: 'api/modules' 
[ref.doc]
/<<PKGBUILDDIR>>/doc.es/index.rst:3080: WARNING: 'any' reference target not 
found: is [ref.any]
generating indices... genindex done
writing additional pages... search done
copying images... [100%] 6.1.jpg

dumping search index in Spanish (code: es)... done
dumping object inventory... done
build succeeded, 4 warnings.

The HTML pages are in build/html.es.
python3 -m sphinx -aEN -b html -D language=pt_BR -d build/doctrees.ptbr 
doc.ptbr/source build/html.ptbr
Running Sphinx v8.2.3
loading translations [pt_BR]... done
making output directory... done
WARNING: html_static_path entry '_static' does not exist
Converting `source_suffix = '.rst'` to `source_suffix = {'.rst': 
'restructuredtext'}`.
building [mo]: all of 0 po files
writing output... 
building [html]: all source files
updating environment: [new config] 1 added, 0 changed, 0 removed
reading sources... [100%] index

looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying static files... 
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ptbr/_static/basic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ptbr/_static/language_data.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ptbr/_static/documentation_options.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ptbr/_static/classic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ptbr/_static/sidebar.js
copying static files: done
copying extra files... 
copying extra files: done
copying assets: done
writing output... [100%] index

generating indices... genindex done
writing additional pages... search done
copying images... [100%] 6.1.jpg

dumping search index in Portuguese (code: pt)... done
dumping object inventory... done
build succeeded, 1 warning.

The HTML pages are in build/html.ptbr.
python3 -m sphinx -aEN -b html -D language=ru_RU -d build/doctrees.ru 
doc.ru/source build/html.ru
Running Sphinx v8.2.3
loading translations [ru_RU]... done
making output directory... done
WARNING: html_static_path entry '_static' does not exist
Converting `source_suffix = '.rst'` to `source_suffix = {'.rst': 
'restructuredtext'}`.
building [mo]: all of 0 po files
writing output... 
building [html]: all source files
updating environment: [new config] 2 added, 0 changed, 0 removed
reading sources... [ 50%] bs4ru
reading sources... [100%] index

/<<PKGBUILDDIR>>/doc.ru/source/bs4ru.rst:185: ERROR: Duplicate target name, 
cannot be used as a unique reference: "beautiful soup 3". [docutils]
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying static files... 
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ru/_static/basic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ru/_static/language_data.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ru/_static/documentation_options.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ru/_static/classic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.ru/_static/sidebar.js
copying static files: done
copying extra files... 
copying extra files: done
copying assets: done
writing output... [ 50%] bs4ru
writing output... [100%] index

generating indices... genindex done
writing additional pages... search done
copying images... [100%] 6.1.jpg

dumping search index in Russian (code: ru)... done
dumping object inventory... done
build succeeded, 2 warnings.

The HTML pages are in build/html.ru.
python3 -m sphinx -aEN -b html -D language=zh_CN -d build/doctrees.zh 
doc.zh/source build/html.zh
Running Sphinx v8.2.3
loading translations [zh_CN]... done
making output directory... done
WARNING: html_static_path entry '_static' does not exist
Converting `source_suffix = '.rst'` to `source_suffix = {'.rst': 
'restructuredtext'}`.
building [mo]: all of 0 po files
writing output... 
building [html]: all source files
updating environment: [new config] 1 added, 0 changed, 0 removed
reading sources... [100%] index

looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
copying assets... 
copying static files... 
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.zh/_static/basic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.zh/_static/language_data.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.zh/_static/documentation_options.js
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.zh/_static/classic.css
Writing evaluated template result to 
/<<PKGBUILDDIR>>/build/html.zh/_static/sidebar.js
copying static files: done
copying extra files... 
copying extra files: done
copying assets: done
writing output... [100%] index

generating indices... genindex py-modindex done
writing additional pages... search done
dumping search index in Chinese (code: zh)... done
dumping object inventory... done
build succeeded, 1 warning.

The HTML pages are in build/html.zh.
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
   dh_auto_test -O--buildsystem=pybuild
I: pybuild base:317: cd /<<PKGBUILDDIR>>/.pybuild/cpython3_3.14_bs4/build; 
python3.14 -m pytest 
============================= test session starts ==============================
platform linux -- Python 3.14.2, pytest-9.0.2, pluggy-1.6.0
rootdir: /<<PKGBUILDDIR>>
configfile: pyproject.toml
collected 895 items

tests/test_builder.py .....                                              [  0%]
tests/test_builder_registry.py ...........                               [  1%]
tests/test_css.py ...................................................... [  7%]
......                                                                   [  8%]
tests/test_dammit.py ................................................... [ 14%]
...........................                                              [ 17%]
tests/test_element.py ..............                                     [ 18%]
tests/test_filter.py ................................................... [ 24%]
.........................                                                [ 27%]
tests/test_formatter.py ............................                     [ 30%]
tests/test_fuzz.py F........sssssss                                      [ 32%]
tests/test_html5lib.py ................................................. [ 37%]
...........................................                              [ 42%]
tests/test_htmlparser.py ............................................... [ 47%]
...................................                                      [ 51%]
tests/test_lxml.py ..................................................... [ 57%]
.............................................................            [ 64%]
tests/test_navigablestring.py .........                                  [ 65%]
tests/test_pageelement.py ......................................         [ 69%]
tests/test_soup.py ..................................................... [ 75%]
..............................                                           [ 78%]
tests/test_tag.py ........................                               [ 81%]
tests/test_tree.py ..................................................... [ 87%]
........................................................................ [ 95%]
........................................                                 [100%]

=================================== FAILURES ===================================
_ 
TestFuzz.test_deeply_nested_document_without_css[clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440]
 _

self = <tests.test_fuzz.TestFuzz object at 0x7f7f4fe5ee90>
filename = 'clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440'

    @pytest.mark.parametrize(
        "filename",
        [
            "clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440",
            "clusterfuzz-testcase-minimized-bs4_fuzzer-5167584867909632",
            "clusterfuzz-testcase-minimized-bs4_fuzzer-6124268085182464",
            "clusterfuzz-testcase-minimized-bs4_fuzzer-6450958476902400",
        ],
    )
    def test_deeply_nested_document_without_css(self, filename):
        # Parsing the document and encoding it back to a string is
        # sufficient to demonstrate that the overflow problem has
        # been fixed.
        markup = self.__markup(filename)
>       BeautifulSoup(markup, "html.parser").encode()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/test_fuzz.py:90: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
bs4/__init__.py:476: in __init__
    self._feed()
bs4/__init__.py:661: in _feed
    self.builder.feed(self.markup)
bs4/builder/_htmlparser.py:456: in feed
    parser.close()
/usr/lib/python3.14/html/parser.py:173: in close
    self.goahead(1)
/usr/lib/python3.14/html/parser.py:311: in goahead
    self.handle_charref(rawdata[i+2:])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <bs4.builder._htmlparser.BeautifulSoupHTMLParser object at 
0x7f7f4fa40ad0>
name = 
'2fontipt><!--</script><scriÑpt><ÙÙÙÙÈÙÙÙÙÙÙÙpt><script<>!--</script6<ÿ<head/ÂXpe"
 charse/Â\x00ript><sÿþcript><!-pt><s...\t<meta http-equiv="Content-Type" 
content="text/html; charset=utf-8" 
/<R=2O130B\r\r\r\r\r\r\r\r\r\r\r\r\r\r\r\r\r>\r\r'

    def handle_charref(self, name: str) -> None:
        """Handle a numeric character reference by converting it to the
        corresponding Unicode character and treating it as textual
        data.
    
        :param name: Character number, possibly in hexadecimal.
        """
        # TODO: This was originally a workaround for a bug in
        # HTMLParser. (http://bugs.python.org/issue13633) The bug has
        # been fixed, but removing this code still makes some
        # Beautiful Soup tests fail. This needs investigation.
        real_name:int
        if name.startswith("x"):
            real_name = int(name.lstrip("x"), 16)
        elif name.startswith("X"):
            real_name = int(name.lstrip("X"), 16)
        else:
>           real_name = int(name)
                        ^^^^^^^^^
E           ValueError: invalid literal for int() with base 10: 
'2fontipt><!--</script><scriÑpt><ÙÙÙÙÈÙÙÙÙÙÙÙpt><script<>!--</script6<ÿ<head/ÂXpe"
 charse/Â\x00ript><sÿþcript><!-pt><script<>!--</script6<ÿ<head/ÂXpe" 
charse/Â\x00ript><sÿþcript><!--</scrixt><sÿ<!--</

bs4/builder/_htmlparser.py:243: ValueError
------------------------------ Captured log call -------------------------------
WARNING  bs4.dammit:dammit.py:825 Some characters could not be decoded, and 
were replaced with REPLACEMENT CHARACTER.
=========================== short test summary info ============================
FAILED 
tests/test_fuzz.py::TestFuzz::test_deeply_nested_document_without_css[clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440]
=================== 1 failed, 887 passed, 7 skipped in 1.55s ===================
E: pybuild pybuild:389: test: plugin pyproject failed with: exit code=1: cd 
/<<PKGBUILDDIR>>/.pybuild/cpython3_3.14_bs4/build; python3.14 -m pytest 
I: pybuild base:317: cd /<<PKGBUILDDIR>>/.pybuild/cpython3_3.13_bs4/build; 
python3.13 -m pytest 
============================= test session starts ==============================
platform linux -- Python 3.13.11, pytest-9.0.2, pluggy-1.6.0
rootdir: /<<PKGBUILDDIR>>
configfile: pyproject.toml
collected 895 items

tests/test_builder.py .....                                              [  0%]
tests/test_builder_registry.py ...........                               [  1%]
tests/test_css.py ...................................................... [  7%]
......                                                                   [  8%]
tests/test_dammit.py ................................................... [ 14%]
...........................                                              [ 17%]
tests/test_element.py ..............                                     [ 18%]
tests/test_filter.py ................................................... [ 24%]
.........................                                                [ 27%]
tests/test_formatter.py ............................                     [ 30%]
tests/test_fuzz.py F........sssssss                                      [ 32%]
tests/test_html5lib.py ................................................. [ 37%]
...........................................                              [ 42%]
tests/test_htmlparser.py ............................................... [ 47%]
...................................                                      [ 51%]
tests/test_lxml.py ..................................................... [ 57%]
.............................................................            [ 64%]
tests/test_navigablestring.py .........                                  [ 65%]
tests/test_pageelement.py ......................................         [ 69%]
tests/test_soup.py ..................................................... [ 75%]
..............................                                           [ 78%]
tests/test_tag.py ........................                               [ 81%]
tests/test_tree.py ..................................................... [ 87%]
........................................................................ [ 95%]
........................................                                 [100%]

=================================== FAILURES ===================================
_ 
TestFuzz.test_deeply_nested_document_without_css[clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440]
 _

self = <tests.test_fuzz.TestFuzz object at 0x7fecdafea0d0>
filename = 'clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440'

    @pytest.mark.parametrize(
        "filename",
        [
            "clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440",
            "clusterfuzz-testcase-minimized-bs4_fuzzer-5167584867909632",
            "clusterfuzz-testcase-minimized-bs4_fuzzer-6124268085182464",
            "clusterfuzz-testcase-minimized-bs4_fuzzer-6450958476902400",
        ],
    )
    def test_deeply_nested_document_without_css(self, filename):
        # Parsing the document and encoding it back to a string is
        # sufficient to demonstrate that the overflow problem has
        # been fixed.
        markup = self.__markup(filename)
>       BeautifulSoup(markup, "html.parser").encode()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/test_fuzz.py:90: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
bs4/__init__.py:476: in __init__
    self._feed()
bs4/__init__.py:661: in _feed
    self.builder.feed(self.markup)
bs4/builder/_htmlparser.py:456: in feed
    parser.close()
/usr/lib/python3.13/html/parser.py:173: in close
    self.goahead(1)
/usr/lib/python3.13/html/parser.py:311: in goahead
    self.handle_charref(rawdata[i+2:])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <bs4.builder._htmlparser.BeautifulSoupHTMLParser object at 
0x7fecdab67a10>
name = 
'2fontipt><!--</script><scriÑpt><ÙÙÙÙÈÙÙÙÙÙÙÙpt><script<>!--</script6<ÿ<head/ÂXpe"
 charse/Â\x00ript><sÿþcript><!-pt><s...\t<meta http-equiv="Content-Type" 
content="text/html; charset=utf-8" 
/<R=2O130B\r\r\r\r\r\r\r\r\r\r\r\r\r\r\r\r\r>\r\r'

    def handle_charref(self, name: str) -> None:
        """Handle a numeric character reference by converting it to the
        corresponding Unicode character and treating it as textual
        data.
    
        :param name: Character number, possibly in hexadecimal.
        """
        # TODO: This was originally a workaround for a bug in
        # HTMLParser. (http://bugs.python.org/issue13633) The bug has
        # been fixed, but removing this code still makes some
        # Beautiful Soup tests fail. This needs investigation.
        real_name:int
        if name.startswith("x"):
            real_name = int(name.lstrip("x"), 16)
        elif name.startswith("X"):
            real_name = int(name.lstrip("X"), 16)
        else:
>           real_name = int(name)
                        ^^^^^^^^^
E           ValueError: invalid literal for int() with base 10: 
'2fontipt><!--</script><scriÑpt><ÙÙÙÙÈÙÙÙÙÙÙÙpt><script<>!--</script6<ÿ<head/ÂXpe"
 charse/Â\x00ript><sÿþcript><!-pt><script<>!--</script6<ÿ<head/ÂXpe" 
charse/Â\x00ript><sÿþcript><!--</scrixt><sÿ<!--</

bs4/builder/_htmlparser.py:243: ValueError
------------------------------ Captured log call -------------------------------
WARNING  bs4.dammit:dammit.py:825 Some characters could not be decoded, and 
were replaced with REPLACEMENT CHARACTER.
=========================== short test summary info ============================
FAILED 
tests/test_fuzz.py::TestFuzz::test_deeply_nested_document_without_css[clusterfuzz-testcase-minimized-bs4_fuzzer-5984173902397440]
=================== 1 failed, 887 passed, 7 skipped in 1.62s ===================
E: pybuild pybuild:389: test: plugin pyproject failed with: exit code=1: cd 
/<<PKGBUILDDIR>>/.pybuild/cpython3_3.13_bs4/build; python3.13 -m pytest 
dh_auto_test: error: pybuild --test --test-pytest -i python{version} -p "3.14 
3.13" returned exit code 13
make: *** [debian/rules:6: binary] Error 25
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
--------------------------------------------------------------------------------
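
For reference, a minimal sketch of the failing step, not a proposed fix. The
traceback shows handle_charref() receiving the remainder of the document
instead of a number, so int(name) raises ValueError. The function names
parse_charref and parse_charref_or_none below are hypothetical and exist only
for illustration; the first copies the branch logic printed in the log, the
second is one assumed way a caller could tolerate such input. Whether the
right fix belongs in bs4 or elsewhere is not established by this log.

```python
from typing import Optional


def parse_charref(name: str) -> int:
    """Same branch logic as the handle_charref body shown in the log."""
    if name.startswith("x"):
        return int(name.lstrip("x"), 16)
    elif name.startswith("X"):
        return int(name.lstrip("X"), 16)
    # Non-numeric input reaches this call and raises ValueError.
    return int(name)


def parse_charref_or_none(name: str) -> Optional[int]:
    """Hypothetical defensive variant: report failure instead of raising."""
    try:
        return parse_charref(name)
    except ValueError:
        return None


if __name__ == "__main__":
    # Well-formed decimal and hexadecimal references parse normally.
    print(parse_charref("65"))
    print(parse_charref("x41"))
    # A name like the one in the log (shortened here) cannot be parsed.
    print(parse_charref_or_none("2fontipt><!--</script>"))
```

With this sketch, a caller would have to decide what to do when None comes
back, for example by emitting the raw text as character data.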
