Excerpts from Joe Abbate's message of Tue May 31 10:43:07 -0400 2011:
> I have a web crawler for a website I maintain that I could modify to
> crawl through the archives of -bugs, say from 5 Dec 2003 where the first
> bug with the new format appears, and capture the structured data
> (reference, logged by, email address, PG version, OS, description, and
> message URL) into a table, for every message whose subject starts with
> "BUG #", and capture each message URL for any message that has "BUG #"
> somewhere in the subject, in a second table.
>
> I presume the tables could be used even if it's decided to go with
> something like RT or BZ, but before I spend a couple of hours on this
> I'd like to see some ayes or nays. Useful or not?
I think this would be easier if you crawled the monthly mboxen instead
of the web archives. It'd be preferable to use message-ids to identify
messages rather than year-and-month based URLs.

-- 
Álvaro Herrera <alvhe...@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
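[As a sketch of the approach being discussed: the suggestion above, parsing a downloaded monthly mbox and keying on Message-ID rather than archive URLs, could look roughly like the following. The mbox filename, the "BUG #" subject pattern, and the fields extracted are assumptions for illustration, not a committed design.]

```python
import mailbox
import re

# Assumed local copy of one monthly -bugs mbox; the filename is hypothetical.
MBOX_PATH = "pgsql-bugs.2003-12.mbox"

# Matches subjects that start with "BUG #<n>", per Joe's first table.
BUG_SUBJECT = re.compile(r"^BUG #(\d+)")

def scan_mbox(path):
    """Yield (bug_number, message_id, subject) for each message in the
    mbox whose Subject header starts with "BUG #"."""
    for msg in mailbox.mbox(path):
        subject = msg.get("Subject", "") or ""
        m = BUG_SUBJECT.match(subject)
        if m:
            # Message-IDs identify messages stably, independent of the
            # year-and-month archive URL layout.
            msgid = (msg.get("Message-ID", "") or "").strip("<> \n")
            yield int(m.group(1)), msgid, subject
```

A second pass using `BUG_SUBJECT.search` instead of `.match` would pick up follow-ups that mention "BUG #" anywhere in the subject, for the second table.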