Patches item #912410, was opened at 2004-03-09 02:20
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=912410&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Aaron Swartz (aaronsw)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: HTMLParser should support entities in attributes

Initial Comment:
HTMLParser doesn't currently support entities in attributes, like this:

<span title="&#8221; is a nice character">foo</span>

This patch fixes that. Simply replace the unescape method in
HTMLParser.py with:


import htmlentitydefs

def unescape(self, s):
    def replaceEntities(s):
        s = s.groups()[0]
        if s[0] == "#":
            s = s[1:]
            if s[0] in ['x','X']:
                c = int(s[1:], 16)
            else:
                c = int(s)
            return unichr(c)
        else:
            return unichr(htmlentitydefs.name2codepoint[c])

    return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)



----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2007-03-06 15:46

Message:
Logged In: YES 
user_id=21627
Originator: NO

Thanks for the patch. Committed as r54165, with the following changes:

- added documentation changes
- added testsuite changes
- fixed incorrect usage of c in name2codepoint[c] (should be [s])
- included &apos; in the list of supported entities, for compatibility
with older versions of HTMLParser
- fall back to replacing an unsupported entity reference with &name;
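
For illustration, the code-level changes listed above (the [s] lookup, the
&apos; case, and the &name; fallback) might look roughly like the sketch
below. This is not the code committed in r54165. It is a standalone
reconstruction that assumes Python 3, so it uses html.entities.name2codepoint
and chr() where the submitted patch uses htmlentitydefs and unichr(). The
names ENTITY_RE and replace_entities are made up for the example, and leaving
malformed numeric references untouched is an extra assumption of the sketch.

```python
import re
from html.entities import name2codepoint  # Python 3 location of htmlentitydefs.name2codepoint

# Same pattern as in the submitted patch: &#NNN;, &#xHHH; and &name; references.
ENTITY_RE = re.compile(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));")


def unescape(s):
    """Replace character and entity references in s (illustrative sketch)."""

    def replace_entities(match):
        ref = match.group(1)
        try:
            if ref[0] == "#":
                ref = ref[1:]
                if ref[0] in ("x", "X"):
                    return chr(int(ref[1:], 16))  # hexadecimal: &#x201D;
                return chr(int(ref))              # decimal: &#8221;
            if ref == "apos":
                # Not an HTML 4 entity, handled explicitly for compatibility.
                return "'"
            # Look up the entity name itself (the [s] rather than [c] fix).
            return chr(name2codepoint[ref])
        except (KeyError, ValueError, OverflowError):
            # Unknown name or malformed number: keep the reference as written.
            return match.group(0)

    if "&" not in s:
        return s
    return ENTITY_RE.sub(replace_entities, s)
```

With this sketch, an attribute value such as "&#8221; is a nice character"
comes back with the reference replaced by U+201D, while an unknown reference
such as "&bogus;" is returned unchanged.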

----------------------------------------------------------------------

Comment By: Aaron Swartz (aaronsw)
Date: 2004-03-09 02:21

Message:
Logged In: YES 
user_id=122141

Argh. Hopefully now.

----------------------------------------------------------------------

Comment By: Aaron Swartz (aaronsw)
Date: 2004-03-09 02:21

Message:
Logged In: YES 
user_id=122141

Oops. The replacement function is attached.

----------------------------------------------------------------------

_______________________________________________
Patches mailing list
[email protected]
http://mail.python.org/mailman/listinfo/patches
