Hello everyone,
I am using BeautifulSoup to parse some HTML and I came across something
strange.
Here is an illustration:
>>> soup = BeautifulSoup(u'<div class="text">hello ça boume<br /></div>')
>>> soup
<div class="text">hello ça boume<br /></div>
>>> soup.find("div", "text")
<div class="text">hello ça boume<br /></div>
>>> soup.find("div", "text").string
>>> soup.find("div", "text").next
u'hello \xe7a boume'
Why does soup.find("div", "text").string not give me the string? Is it
because of the <br/>? Is there a way to make it ignore the <br/> tag,
or am I doing something wrong?
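For what it's worth, here is a minimal sketch of the only workaround I can think of. It assumes the newer bs4 package and its get_text() method rather than the BeautifulSoup class used in the session above, and it names the "html.parser" builder explicitly; both are my assumptions. As far as I understand, .string is only set when a tag has exactly one child, and here the div has two (the text and the <br/>), so the sketch collects the text nodes instead:

```python
from bs4 import BeautifulSoup  # assumption: bs4, not the older BeautifulSoup module

html = '<div class="text">hello ça boume<br /></div>'
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", "text")

# The div has two children (a text node and <br/>), so .string is None.
print(div.string)

# get_text() concatenates every text node under the tag and skips the tags.
text = div.get_text()
print(text)
```

I would still like to know whether there is a more direct way to do this.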
Thank you,
Gabriel
PS
Sorry if this appears twice on the list; I sent it 4 hours ago and it
never showed up, so I re-sent it.