On 6/1/11 12:44 AM, William Ashworth wrote:
I've searched for a bit and perhaps nothing is available (or I'm typing the
wrong searches).
A client of mine wants to integrate more tightly with their Mailman list.
There's an archive page, but we're trying to reformat it for inclusion on
their website so that it matches the rest of the site when members view it.
I can see two possibilities right now...
1. We write a custom PHP application to scrape and store the archive pages
into a database to call later, however we want.
2. We create an email address and subscribe it to the list. Any new messages
will be checked via a PHP script we build and stored in the database, then
we'll pull from our own archive format however we choose.
The only problem with #2 is that we lose 8 years of legacy emails that are
already present in the archives. My best bet is to find some way to hook into
the archives with PHP so that we can roll it into the rest of their complicated
website. Moving to a development language other than PHP at this time
would conflict with the rest of their website applications.
I may be dreaming, but if there's some way to export the archive data to XML
nightly, and then import that XML into a MySQL database (also nightly), then
the sky's the limit... I simply don't know if there's a standardized way to
access the archived information, and scraping is very messy.
I'm completely new to Mailman. Any assistance you can offer to help get my
bearings straight would be greatly appreciated.
I have done a bit of both for some applications. For #1, though, I read the
list's mbox file instead of scraping the pages. The parsed messages are put
into a database, and new messages are added as they arrive. Since mbox is
very close to the raw mail format, both paths can share a lot of common code.
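
As a rough illustration of the mbox approach, here is a minimal sketch. It is
not the code I actually run. It assumes Python's standard mailbox module (any
language with an mbox parser would do) and uses SQLite as a stand-in for
MySQL. The function names, table name, and columns are made up for the
example. On a typical Mailman 2.1 install the cumulative mbox is usually found
under archives/private/LISTNAME.mbox/LISTNAME.mbox, but check your own setup.

```python
import mailbox
import sqlite3
from email.header import decode_header, make_header


def header_text(msg, name):
    """Return a header as readable text, decoding RFC 2047 encoded words."""
    raw = msg.get(name)
    if raw is None:
        return ""
    try:
        return str(make_header(decode_header(str(raw))))
    except Exception:
        # Fall back to the raw value if the header is malformed.
        return str(raw)


def body_text(msg):
    """Return the first text/plain part of a message, or an empty string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is None:
                continue
            charset = part.get_content_charset() or "utf-8"
            try:
                return payload.decode(charset, errors="replace")
            except LookupError:
                # Unknown charset name in the message; decode leniently.
                return payload.decode("utf-8", errors="replace")
    return ""


def import_mbox(mbox_path, db_path):
    """Load messages from an mbox file into the database; return rows added."""
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: Message-ID as the key makes re-imports harmless.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               message_id TEXT PRIMARY KEY,
               sender TEXT, subject TEXT, date TEXT, body TEXT)"""
    )
    added = 0
    for msg in mailbox.mbox(mbox_path, create=False):
        message_id = header_text(msg, "Message-ID")
        if not message_id:
            continue  # skip messages that cannot be de-duplicated
        cur = conn.execute(
            "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?, ?)",
            (
                message_id,
                header_text(msg, "From"),
                header_text(msg, "Subject"),
                header_text(msg, "Date"),
                body_text(msg),
            ),
        )
        added += cur.rowcount  # 1 if inserted, 0 if already present
    conn.commit()
    conn.close()
    return added
```

Because rows are keyed on Message-ID, the same function can be run nightly
against the growing mbox file: the 8 years of history load on the first run,
and later runs only add what is new. The PHP side then just queries the table.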
--
Richard Damon
------------------------------------------------------
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users