I've been using the "feedparser" module, and it turns out that some RSS feeds don't quite do RSS right.
For the Reuters RSS feed, about once every fifteen minutes, the "Etag" changes, even if there are no new stories. I've been logging this in a program of mine:

WARNING: Feed "http://feeds.reuters.com/reuters/topNews?format=xml": Etag changed from "YH2PzNGiblDEe3z0hw2T2PLelCs" to "uGI/GLFvX9zQ+o4cdU2pFAetbEE" but no new content.

Etags are just an optimization, so that's not too serious. But there are worse problems.

Sometimes the item ID for a story changes even though the story text didn't. When a story stays on the Reuters feed for more than a day, it gets a new ID each day.

Then, sometimes a higher priority story pushes an old story out of the ten stories returned in the feed. But the higher priority story may disappear from a later feed cycle, and the old story may come back.

So you can't actually trust the Etag or the item ID, and you have to back them up with checks of your own if you want exactly one copy of each item. It's something that "feedparser" should perhaps do.

John Nagle
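
A rough sketch of the kind of check described above, not what my program or "feedparser" actually does. It assumes that a story's title plus summary is enough to identify it, and the names content_key and SeenStories are made up for illustration. The idea is to key each item on a hash of its text rather than its ID, and to remember every story ever seen rather than only the previous fetch, so a story that drops out of the feed and later returns is not counted twice.

```python
import hashlib


def content_key(entry):
    """Fingerprint an entry by its text, ignoring the feed-supplied ID."""
    # Assumption: title plus summary identifies a story well enough.
    title = (entry.get("title") or "").strip()
    summary = (entry.get("summary") or "").strip()
    text = title + "\n" + summary
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


class SeenStories:
    """Remembers every story seen so far, not just the previous fetch."""

    def __init__(self):
        self._seen = set()

    def new_entries(self, entries):
        """Return only the entries whose content has not been seen before."""
        fresh = []
        for entry in entries:
            key = content_key(entry)
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(entry)
        return fresh
```

The entries only need a dict-style get(), so the list in feedparser.parse(url).entries should work as input. A story whose text is edited will show up as new under this scheme, which may or may not be what you want.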