"Fred L. Drake, Jr." <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Sunday 11 June 2006 16:26, Sam Ruby wrote: > > Planet is a feed aggregator written in Python. It depends heavily on > > SGMLLib. A recent bug report turned out to be a deficiency in sgmllib, > > and I've submitted a test case and a patch[1] (use or discard the > > patch, > > it is the test that I care about). ... > > and which are original. (Note: feeds often contain such abominations > > as > > &copy; which the new code will treat indistinguishably from ©)
> It really sounds like sgmllib is the wrong foundation for this. ...
> Have you looked at HTMLParser as an alternate to sgmllib?
> It has better support for XHTML constructs.

Have you (the OP) checked how related Python projects, such as Mark Pilgrim's
feed parser, http://www.feedparser.org/, handle the same sort of input? (I have
only looked at the docs and tests, not the code.)

tjr
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
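
For anyone who wants to try the HTMLParser suggestion on the &copy; versus ©
case quoted above, here is a minimal sketch. It is written against the current
standard library, where the HTMLParser module is named html.parser, and it
assumes convert_charrefs=False so that references are reported separately from
text. The names EntityTrackingParser and parse_events are made up for this
illustration; this is not code from Planet, from the sgmllib patch, or from
feedparser.

```python
from html.parser import HTMLParser


class EntityTrackingParser(HTMLParser):
    """Hypothetical helper: records text and references as separate events."""

    def __init__(self):
        # With convert_charrefs=False, entity and character references are
        # passed to handle_entityref()/handle_charref() instead of being
        # decoded and merged into the text given to handle_data().
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_data(self, data):
        # Literal text, including a literal (c) sign typed as a character.
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Named reference such as &copy; (name is "copy").
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Numeric reference such as &#169; (name is "169").
        self.events.append(("charref", name))


def parse_events(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EntityTrackingParser()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    for event in parse_events("&copy; and \u00a9"):
        print(event)
```

Because the reference and the literal character arrive through different
callbacks, the caller can still tell which one the feed actually contained.
Whether that is enough for the input Planet sees is a separate question.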