[issue42821] HTMLParser: subsequent duplicate attributes should be ignored

2021-01-05 Thread Ezio Melotti


Ezio Melotti  added the comment:

If we follow the behavior of the browser, we will have to pick one of the two 
values and discard the other, making the discarded value inaccessible.  If we 
provide both, scripts and libraries that use HTMLParser will have access to 
both and can decide what to do.
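
A minimal sketch of how calling code could apply the browser's "first value 
wins" rule itself, given that HTMLParser hands over every attribute.  The 
FirstWinsParser class and the sample markup are assumptions made up for 
illustration, not part of the stdlib or of the original report:

```python
from html.parser import HTMLParser


class FirstWinsParser(HTMLParser):
    """Hypothetical subclass: keep only the first value of each attribute."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs in source order,
        # duplicates included.
        seen = {}
        for name, value in attrs:
            # setdefault keeps the first value and ignores later duplicates.
            seen.setdefault(name, value)
        self.events.append(("starttag", tag, list(seen.items())))


parser = FirstWinsParser()
# Assumed sample markup with a duplicated attribute.
parser.feed('<div class="bar" class="foo">text</div>')
parser.close()
print(parser.events)
```

A caller that prefers the last value, or wants to report the duplicate as an 
error, can change the policy inside handle_starttag without any change to 
HTMLParser.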

For example BeautifulSoup already does the right thing:
>>> bs4.BeautifulSoup('text')

text

Changing this might also break code that relies on this behavior.  I'm 
therefore going to close this as "not a bug".

--
assignee:  -> ezio.melotti
nosy: +ezio.melotti
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed
type:  -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42821] HTMLParser: subsequent duplicate attributes should be ignored

2021-01-04 Thread karl


New submission from karl :

This came up while working on issue 41748.


browser input 
data:text/html,text

browser output
text

Actual HTMLParser output

see https://github.com/python/cpython/pull/24072#discussion_r551158342
('starttag', 'div', [('class', 'bar'), ('class', 'foo')])

Expected HTMLParser output
('starttag', 'div', [('class', 'bar')])
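
The actual output above can probably be reproduced with a small event 
collector like the one below.  This is only a sketch: the EventCollector 
class is a hypothetical helper, and the input markup is an assumption 
inferred from the reported attribute list, since the original markup is not 
shown here.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper that records start tags as HTMLParser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Store the attribute list untouched, duplicates included.
        self.events.append(("starttag", tag, attrs))


collector = EventCollector()
# Assumed input: the same attribute given twice with different values.
collector.feed('<div class="bar" class="foo">text</div>')
collector.close()
starttags = collector.events
print(starttags)
```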

--
components: Library (Lib)
messages: 384308
nosy: karlcow
priority: normal
severity: normal
status: open
title: HTMLParser: subsequent duplicate attributes should be ignored
versions: Python 3.10
