"Fred L. Drake, Jr." <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> On Sunday 11 June 2006 16:26, Sam Ruby wrote:
> > Planet is a feed aggregator written in Python.  It depends heavily on
> > SGMLLib.  A recent bug report turned out to be a deficiency in sgmllib,
> > and I've submitted a test case and a patch[1] (use or discard the patch,
> > it is the test that I care about).
...
> > and which are original.  (Note: feeds often contain such abominations as
> > &amp;copy; which the new code will treat indistinguishably from &copy;)

> It really sounds like sgmllib is the wrong foundation for this.
...
> Have you looked at HTMLParser as an alternate to sgmllib?
> It has better support for XHTML constructs.

Have you (the OP) checked how related Python projects, such as Mark
Pilgrim's feed parser,
http://www.feedparser.org/
handle the same sort of input?  (I have only looked at the docs and tests,
not the code.)
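
For anyone comparing parsers on this point, here is a rough sketch of the
&amp;copy; versus &copy; distinction from the quoted note.  It is not taken
from Planet, sgmllib, or the feed parser; it assumes Python 3, where the
HTMLParser module lives at html.parser, and the EntityLogger class and
parse() helper are hypothetical names made up for the illustration.

```python
import html
from html.parser import HTMLParser  # Python 3 home of the old HTMLParser module


class EntityLogger(HTMLParser):
    """Hypothetical helper: record text and entity references as separate events."""

    def __init__(self):
        # convert_charrefs=False makes the parser report entity references
        # through handle_entityref() instead of folding them into the text.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))


def parse(text):
    """Feed text to a fresh EntityLogger and return the recorded events."""
    parser = EntityLogger()
    parser.feed(text)
    parser.close()
    return parser.events


# A correctly escaped feed and a double-escaped feed produce different events:
# the first starts with an entity reference named "copy", the second with "amp".
single = parse("&copy; 2006")
double = parse("&amp;copy; 2006")

# Decoding once keeps the two inputs distinct ...
once_single = html.unescape("&copy;")      # the copyright sign
once_double = html.unescape("&amp;copy;")  # the literal text "&copy;"

# ... while decoding a second time collapses them, which is the
# "indistinguishable" outcome described in the note.
twice_double = html.unescape(once_double)
```

Whether a given parser keeps the two cases apart depends on how many times
it unescapes the content, so a test that feeds it both inputs and compares
the results seems like the quickest way to find out.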

tjr



