I'm attempting to parse some basic tagged markup. The TinyMCE editor outputs a string that looks something like this:
<p>This is a paragraph with <b>bold</b> and <i>italic</i> elements in it</p><p>It can be made up of multiple lines separated by paragraph tags.</p>

I'm trying to render the paragraphs into a bitmapped image, so I need to split the string into its paragraphs and, within each paragraph, into plain, bold, and italic pieces. I'm not sure of the best way to approach it. ElementTree and lxml seem to want a complete, well-formed document, not a small fragment like this one. When I fed a string similar to the one above to lxml, I got an error: "XMLSyntaxError: Extra content at the end of the document".
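One possible direction, offered as a sketch rather than a definitive answer: the "Extra content" error is consistent with the fragment having two top-level <p> elements, where an XML parser expects exactly one root. Wrapping the fragment in a single made-up root element before parsing may be enough. The example below uses the standard library's xml.etree.ElementTree. The names parse_fragment, _collect, and the <root> wrapper are hypothetical, and it assumes the markup is well-formed XML using only <p>, <b>, and <i> (TinyMCE may be configured to emit <strong>/<em> or entities such as &nbsp;, which this sketch does not handle).

```python
import xml.etree.ElementTree as ET


def _collect(elem, styles, runs):
    """Append (text, styles) runs for elem and its descendants, in document order."""
    # Assumption: only <b> and <i> carry styling in this markup.
    if elem.tag in ("b", "i"):
        styles = styles | {elem.tag}
    if elem.text:
        runs.append((elem.text, styles))
    for child in elem:
        _collect(child, styles, runs)
        # Text following a child's closing tag belongs to the enclosing element,
        # so it takes the enclosing element's styles.
        if child.tail:
            runs.append((child.tail, styles))


def parse_fragment(fragment):
    """Return one list of (text, frozenset_of_styles) runs per <p> element."""
    # Wrap the fragment in a hypothetical <root> so the parser sees a single document.
    root = ET.fromstring("<root>" + fragment + "</root>")
    paragraphs = []
    for p in root.findall("p"):
        runs = []
        _collect(p, frozenset(), runs)
        paragraphs.append(runs)
    return paragraphs


if __name__ == "__main__":
    sample = (
        "<p>This is a paragraph with <b>bold</b> and <i>italic</i> "
        "elements in it</p><p>Second paragraph.</p>"
    )
    for number, runs in enumerate(parse_fragment(sample), start=1):
        for text, styles in runs:
            print(number, sorted(styles), repr(text))
```

Each run pairs a piece of text with the set of styles active for it, which a rendering loop could use to choose a font before drawing. If the editor's output turns out not to be well-formed XML, a lenient HTML parser would probably be a better fit than this approach.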