https://github.com/python/cpython/commit/e2d440ebc6d2457d23c80da75a5b1b40b17f0b4e
commit: e2d440ebc6d2457d23c80da75a5b1b40b17f0b4e
branch: 3.14
author: Brian Schubert <brianm.schub...@gmail.com>
committer: serhiy-storchaka <storch...@gmail.com>
date: 2025-05-07T22:03:30Z
summary:
[3.14] gh-131535: Fix stale example in html.parser docs, make examples doctests (GH-131551) (GH-133589)

(cherry picked from commit ee76e36d76a0e6916c0afc41228b043ab5174685)

files:
M Doc/library/html.parser.rst

diff --git a/Doc/library/html.parser.rst b/Doc/library/html.parser.rst
index 6d433b5a04fc4a..dd67fc34e856f1 100644
--- a/Doc/library/html.parser.rst
+++ b/Doc/library/html.parser.rst
@@ -43,7 +43,9 @@ Example HTML Parser Application

 As a basic example, below is a simple HTML parser that uses the
 :class:`HTMLParser` class to print out start tags, end tags, and data
-as they are encountered::
+as they are encountered:
+
+.. testcode::

    from html.parser import HTMLParser

@@ -63,7 +65,7 @@ as they are encountered::

 The output will then be:

-.. code-block:: none
+.. testoutput::

    Encountered a start tag: html
    Encountered a start tag: head
@@ -230,7 +232,9 @@ Examples
 --------

 The following class implements a parser that will be used to illustrate more
-examples::
+examples:
+
+.. testcode::

    from html.parser import HTMLParser
    from html.entities import name2codepoint
@@ -266,13 +270,17 @@ examples::

    parser = MyHTMLParser()

-Parsing a doctype::
+Parsing a doctype:
+
+.. doctest::

    >>> parser.feed('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    ...             '"http://www.w3.org/TR/html4/strict.dtd">')
    Decl     : DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"

-Parsing an element with a few attributes and a title::
+Parsing an element with a few attributes and a title:
+
+.. doctest::

    >>> parser.feed('<img src="python-logo.png" alt="The Python logo">')
    Start tag: img
@@ -285,7 +293,9 @@ Parsing an element with a few attributes and a title::
    End tag  : h1

 The content of ``script`` and ``style`` elements is returned as is, without
-further parsing::
+further parsing:
+
+.. doctest::

    >>> parser.feed('<style type="text/css">#python { color: green }</style>')
    Start tag: style
@@ -300,16 +310,25 @@ further parsing::
    Data     : alert("<strong>hello!</strong>");
    End tag  : script

-Parsing comments::
+Parsing comments:
+
+.. doctest::

-   >>> parser.feed('<!-- a comment -->'
+   >>> parser.feed('<!--a comment-->'
    ...             '<!--[if IE 9]>IE-specific content<![endif]-->')
-   Comment  :  a comment
+   Comment  : a comment
    Comment  : [if IE 9]>IE-specific content<![endif]

 Parsing named and numeric character references and converting them to the
-correct char (note: these 3 references are all equivalent to ``'>'``)::
+correct char (note: these 3 references are all equivalent to ``'>'``):

+.. doctest::
+
+   >>> parser = MyHTMLParser()
+   >>> parser.feed('&gt;&#62;&#x3E;')
+   Data     : >>>
+
+   >>> parser = MyHTMLParser(convert_charrefs=False)
    >>> parser.feed('&gt;&#62;&#x3E;')
    Named ent: >
    Num ent  : >
@@ -317,18 +336,22 @@ correct char (note: these 3 references are all equivalent to ``'>'``)::

 Feeding incomplete chunks to :meth:`~HTMLParser.feed` works, but
 :meth:`~HTMLParser.handle_data` might be called more than once
-(unless *convert_charrefs* is set to ``True``)::
+(unless *convert_charrefs* is set to ``True``):

-   >>> for chunk in ['<sp', 'an>buff', 'ered ', 'text</s', 'pan>']:
+.. doctest::
+
+   >>> for chunk in ['<sp', 'an>buff', 'ered', ' text</s', 'pan>']:
    ...     parser.feed(chunk)
    ...
    Start tag: span
    Data     : buff
    Data     : ered
-   Data     : text
+   Data     :  text
    End tag  : span

-Parsing invalid HTML (e.g. unquoted attributes) also works::
+Parsing invalid HTML (e.g. unquoted attributes) also works:
+
+.. doctest::

    >>> parser.feed('<p><a class=link href=#main>tag soup</p ></a>')
    Start tag: p
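Note on the character-reference hunk: the old example appears to have gone stale because a parser created without arguments converts character references into data, so the entity handlers are only reached when *convert_charrefs* is false. The following is a minimal sketch of that difference, not part of the commit. The names `CharrefRecorder` and `record` are hypothetical helpers introduced here for illustration; only `HTMLParser` and its handler methods come from `html.parser`.

```python
from html.parser import HTMLParser


class CharrefRecorder(HTMLParser):
    """Hypothetical helper: records handler calls instead of printing them."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        # Receives already-converted text when convert_charrefs is true.
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Named references such as &gt; arrive here as "gt".
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Numeric references such as &#62; or &#x3E; arrive here as "62" or "x3E".
        self.events.append(("charref", name))


def record(markup, *, convert_charrefs):
    """Hypothetical helper: feed markup to a fresh parser and return its events."""
    parser = CharrefRecorder(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()
    return parser.events
```

With the input from the doctest, `record('&gt;&#62;&#x3E;', convert_charrefs=False)` should report one named and two numeric references, while `convert_charrefs=True` should report the three characters as plain data.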