aslkoi fdsda <pythonz...@gmail.com> wrote:
> I would like to read just the headers out of a newsgroup.
> Being a Python newbie, I was wondering if this is possible and how difficult
> it would be for a novice Python programmer.
> Thanks for any reply!

It's not hard at all. I've pulled some bits and pieces out of the
minimalist newsreader I wrote (and am using to reply to your post), and
added some example usage code. It should head you in the right direction,
and there's no advanced Python involved here:

--------------------------------------------------------------
from email.parser import FeedParser
from nntplib import NNTP
from rfc822 import mktime_tz, parsedate_tz

class Article:

    def __init__(self):
        self.num = None
        self.subject = None
        self.poster = None
        self.date = None
        self.id = None
        self.references = []
        self.size = 0
        self.lines = 0
        self.newsgroups = []

    def loadFromOverview(self, overview):
        # An XOVER entry is (article number, subject, poster, date,
        # message id, references, size, lines).
        (self.subject, self.poster, self.date, self.id, self.references,
            self.size, self.lines) = overview[1:]
        try:
            self.date = mktime_tz(parsedate_tz(self.date))
        except (TypeError, ValueError):
            # parsedate_tz returns None for a date it can't parse, which
            # makes mktime_tz raise TypeError rather than ValueError.
            print "ERROR in date parsing (%s)" % self.date
            self.date = None
        return overview[0]

    def loadMessage(self, server):
        # Fetch the headers and the body separately and feed both to the
        # parser; the blank line marks the end of the headers.
        msgparser = FeedParser()
        resp, num, id, lines = server.head(self.id)
        msgparser.feed('\n'.join(lines)+'\n\n')
        resp, num, id, lines = server.body(self.id)
        msgparser.feed('\n'.join(lines)+'\n')
        self.message = msgparser.close()


server = NNTP('news.gmane.org')
resp, count, first, last, name = server.group('gmane.comp.python.ideas')
# Overview data (headers only) for roughly the last 100 articles.
resp, headersets = server.xover(str(int(last)-100), last)
articles = []
for h in headersets:
    a = Article()
    artnum = a.loadFromOverview(h)
    articles.append(a)

# Load one full message and print all of its headers.
anarticle = articles[0]
anarticle.loadMessage(server)
print dir(anarticle.message)
for header in anarticle.message.keys():
    print "%s: %s" % (header, anarticle.message[header])
--------------------------------------------------------------

Heh, looking at this I remember it is several-years-old code and really
needs to be revisited and updated...so I'm not going to claim that this is
the best code that could be written for this task :)

Oh, and there's more involved in actually printing the headers if you need
to deal with non-ASCII characters ("encoded words") in the headers. (That's
in the docs for the email module, though it took me a bit to figure out how
to do it right.)

--
R. David Murray           http://www.bitdance.com

--
http://mail.python.org/mailman/listinfo/python-list
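
P.S. On the "encoded words" remark: a minimal sketch of the decoding step, assuming the standard library's email.header.decode_header is what you want to use. The helper name decode_header_value is made up for illustration, and it makes no attempt to cope with charset names Python doesn't recognise.

```python
from email.header import decode_header

def decode_header_value(raw):
    """Return a header value with RFC 2047 encoded words decoded to text.

    Hypothetical helper, not part of the newsreader code in this post.
    """
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Decoded encoded words come back as bytes plus a charset
            # name; fall back to ASCII when no charset is given.
            chunk = chunk.decode(charset or 'ascii', 'replace')
        parts.append(chunk)
    return ''.join(parts)
```

In the printing loop you would then pass anarticle.message[header] through decode_header_value before printing it.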