Status: New
Owner: ----

New issue 198 by leonard....@gmail.com: Empty or unclosed <a> tags are duplicated
http://code.google.com/p/html5lib/issues/detail?id=198

---
import html5lib
markup = """<div id="1">
 <a href="foo" />
</div>
<div id="2">
 <div id="3">
   <a href="bar"></a>
  </div>
</div>"""

print html5lib.parse(markup).toxml()
---

I'd expect the empty tag '<a href="foo" />' to show up only once in the resulting markup, but it shows up four times:

<html>
 <head/>
 <body>
  <div id="1">
   <a href="foo"></a>
  </div>
  <a href="foo"></a>
  <div id="2">
   <a href="foo"> </a>
  <div id="3">
   <a href="foo"></a>
   <a href="bar"/>
  </div>
  </div>
 </body>
</html>
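
To check the count above mechanically, here is a minimal sketch using only the standard library (it does not call html5lib). It counts the <a> tags for each href in the serialized output quoted in this report. The helper name count_anchors is hypothetical and the regex assumes the simple attribute layout seen in that output; it is not a general HTML matcher.

```python
import re

# Serialized output exactly as quoted in this report.
output = """<html>
 <head/>
 <body>
  <div id="1">
   <a href="foo"></a>
  </div>
  <a href="foo"></a>
  <div id="2">
   <a href="foo"> </a>
  <div id="3">
   <a href="foo"></a>
   <a href="bar"/>
  </div>
  </div>
 </body>
</html>"""


def count_anchors(serialized, href):
    """Count <a> start tags (or self-closed tags) whose only attribute is the given href.

    Hypothetical helper for illustration; assumes the form <a href="...">
    or <a href="..."/> as it appears in the output above.
    """
    pattern = r'<a\s+href="%s"\s*/?>' % re.escape(href)
    return len(re.findall(pattern, serialized))


print(count_anchors(output, "foo"))  # 4
print(count_anchors(output, "bar"))  # 1
```

The input markup contains each href exactly once, so a count other than 1 for "foo" is the duplication being reported.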

I get similar results with the unclosed '<a href="foo">' instead of '<a href="foo" />'. With an explicitly closed '<a href="foo"></a>', the first <a> tag shows up only once, as I'd expect.

I discovered this through a bug in Beautiful Soup 4, which can use html5lib as its parser:

https://bugs.launchpad.net/beautifulsoup/+bug/838800

