Patches item #912410, was opened at 2004-03-09 02:20
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=912410&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Aaron Swartz (aaronsw)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: HTMLParser should support entities in attributes
Initial Comment:
HTMLParser doesn't currently support entities in attributes,
like this:
<span title="&#8221; is a nice character">foo</span>
This patch fixes that. Simply replace the unescape in
HTMLParser.py with:
import htmlentitydefs

def unescape(self, s):
    def replaceEntities(s):
        s = s.groups()[0]
        if s[0] == "#":
            s = s[1:]
            if s[0] in ['x','X']:
                c = int(s[1:], 16)
            else:
                c = int(s)
            return unichr(c)
        else:
            return unichr(htmlentitydefs.name2codepoint[c])
    return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));",
                  replaceEntities, s)
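
For readers on a current interpreter, the same substitution technique can be sketched as a standalone Python 3 function. This is an illustration under stated assumptions, not the code from the patch: chr and html.entities stand in for the Python 2 unichr and htmlentitydefs, the named-entity lookup uses the matched name as its key, and the names ENTITY_RE and replace_entity are made up for this example.

```python
import re
from html.entities import name2codepoint

# Same pattern as in the patch: matches numeric references such as
# &#8221; or &#x201D; and named references such as &amp;.
ENTITY_RE = re.compile(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));")

def unescape(text):
    """Replace character and entity references in text (illustrative sketch)."""
    def replace_entity(match):
        ref = match.group(1)
        if ref[0] == "#":
            ref = ref[1:]
            if ref[0] in ("x", "X"):
                # Hexadecimal character reference, e.g. &#x201D;
                return chr(int(ref[1:], 16))
            # Decimal character reference, e.g. &#8221;
            return chr(int(ref))
        # Named entity; an unknown name raises KeyError in this sketch.
        return chr(name2codepoint[ref])
    return ENTITY_RE.sub(replace_entity, text)
```

Applied to the attribute value from the example above, unescape("&#8221; is a nice character") yields the string beginning with U+201D (right double quotation mark).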
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2007-03-06 15:46
Message:
Logged In: YES
user_id=21627
Originator: NO
Thanks for the patch. Committed as r54165, with the following changes:
- added documentation changes
- added testsuite changes
- fixed incorrect usage of c in name2codepoint[c] (should be [s])
- included &apos; in the list of supported entities, for compatibility
with older versions of HTMLParser
- fall back to replacing an unsupported entity reference with &name;
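
The last two changes can be sketched roughly as follows in Python 3. This is not the code committed in r54165; it is a minimal illustration that assumes a lookup table built from html.entities.name2codepoint with an extra apos entry, and the names ENTITY_TABLE and unescape_lenient are hypothetical.

```python
import re
from html.entities import name2codepoint

ENTITY_RE = re.compile(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));")

# Hypothetical lookup table: the HTML 4 named entities plus "apos",
# which is not part of HTML 4 but was accepted by older HTMLParser versions.
ENTITY_TABLE = {name: chr(cp) for name, cp in name2codepoint.items()}
ENTITY_TABLE["apos"] = "'"

def unescape_lenient(text):
    """Replace known references; leave unknown named ones as &name;."""
    if "&" not in text:
        return text  # nothing to replace

    def replace_entity(match):
        ref = match.group(1)
        if ref.startswith("#"):
            digits = ref[1:]
            if digits[:1] in ("x", "X"):
                return chr(int(digits[1:], 16))  # hexadecimal reference
            return chr(int(digits))              # decimal reference
        # Fall back to the original text for an unsupported entity name.
        return ENTITY_TABLE.get(ref, "&" + ref + ";")

    return ENTITY_RE.sub(replace_entity, text)
```

With this sketch, "it&apos;s" becomes "it's", while an unsupported reference such as "&bogus;" passes through unchanged instead of raising KeyError.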
----------------------------------------------------------------------
Comment By: Aaron Swartz (aaronsw)
Date: 2004-03-09 02:21
Message:
Logged In: YES
user_id=122141
Argh. Hopefully now.
----------------------------------------------------------------------
Comment By: Aaron Swartz (aaronsw)
Date: 2004-03-09 02:21
Message:
Logged In: YES
user_id=122141
Oops. The replacement function is attached.
----------------------------------------------------------------------
_______________________________________________
Patches mailing list
http://mail.python.org/mailman/listinfo/patches