[issue40873] Something wrong with html.unescape()

2020-06-19 Thread Ezio Melotti


Change by Ezio Melotti :


--
nosy: +ezio.melotti

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40873] Something wrong with html.unescape()

2020-06-19 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Concur with Christian. It works as designed, in accordance with the standard.

--
nosy: +serhiy.storchaka
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue40873] Something wrong with html.unescape()

2020-06-19 Thread Christian Heimes


Christian Heimes  added the comment:

According to 
https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#cite_ref-semicolon_1-64
 the trailing semicolon can be omitted for the named entity "reg". That means 
"&reg" and "&reg;" are equivalent.
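
A minimal sketch of that equivalence, assuming only the standard-library html module and its html.entities.html5 table. The sample strings and variable names are illustrative and not taken from the report:

```python
import html
from html.entities import html5

# The table lists "reg" both with and without the trailing semicolon.
reg_has_both_forms = "reg" in html5 and "reg;" in html5
print(reg_has_both_forms)

# Both spellings therefore decode to the same character.
with_semicolon = html.unescape("&reg;hard")
without_semicolon = html.unescape("&reghard")
print(with_semicolon == without_semicolon)

# A name listed only with the semicolon, such as "trade;", is left
# alone when the semicolon is missing.
trade_with = html.unescape("&trade;hard")
trade_without = html.unescape("&tradehard")
print(trade_with, trade_without)
```

Only a fixed set of legacy names is accepted without the semicolon, so most entity names are not decoded when it is missing.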

saxutils.unescape() only handles '&lt;', '&gt;', and '&amp;' by default. You have to pass 
in a dictionary to unescape other entities.
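
A minimal sketch of passing such a dictionary, assuming the entities argument of xml.sax.saxutils.unescape(). The name extra_entities and the sample strings are hypothetical:

```python
import xml.sax.saxutils as saxutils

# By default only &lt;, &gt; and &amp; are replaced; &reg; is left as-is.
default_result = saxutils.unescape("&lt;b&gt; &amp; &reg;")
print(default_result)

# Each key is replaced literally, so the mapping must spell out the
# exact text to match, including "&" and ";".
extra_entities = {"&reg;": "\u00ae"}
custom_result = saxutils.unescape("&reg;hard &reghard", extra_entities)
print(custom_result)
```

Because the replacement is a plain string substitution, "&reghard" without the semicolon stays untouched unless a matching key is added.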

--
nosy: +christian.heimes




[issue40873] Something wrong with html.unescape()

2020-06-18 Thread hongweipeng


Change by hongweipeng :


--
nosy: +hongweipeng




[issue40873] Something wrong with html.unescape()

2020-06-09 Thread Dong-hee Na


Change by Dong-hee Na :


--
stage: needs patch -> 
versions:  -Python 3.10, Python 3.9




[issue40873] Something wrong with html.unescape()

2020-06-09 Thread Dong-hee Na


Change by Dong-hee Na :


--
versions: +Python 3.10, Python 3.9




[issue40873] Something wrong with html.unescape()

2020-06-09 Thread Dong-hee Na


Change by Dong-hee Na :


--
stage:  -> needs patch




[issue40873] Something wrong with html.unescape()

2020-06-09 Thread Dong-hee Na


Change by Dong-hee Na :


--
nosy: +corona10




[issue40873] Something wrong with html.unescape()

2020-06-05 Thread Валентин Dreyk

New submission from Валентин Dreyk :

import html
import xml.sax.saxutils as saxutils

print(saxutils.unescape("&reghard"))  # &reghard
print(html.unescape("&reghard"))  # ®hard



html.unescape() replaces "&reg" with "®" even without ";" at the end.

--
components: Library (Lib)
messages: 370765
nosy: Валентин Dreyk
priority: normal
severity: normal
status: open
title: Something wrong with html.unescape()
type: behavior
versions: Python 3.8
