The Museum Computer Network (http://mcn.edu) is having their 50th
anniversary this year and I'm doing a volunteer project where I'm
looking at the history of computery museum jobs. Fortunately, a couple
years ago I put every mcn-l posting since 1996 up on The Mail Archive
and there are ~1000 job postings in there. (Other people get to look at
old print publications for the earlier history.)
It's easy enough on mail-archive to search for "job" or "position" and
get the results. The expand button gets me the content of each message
right on the search results page. Hence,
https://www.mail-archive.com/search?l=mcn-l%40mcn.edu&q=%28%2Bjob+OR+%2Bposition%29&f=1
And the HTML of that page is nicely structured. I'd love to get the
postings from there into a spreadsheet and figure out things like "most job
postings go out at the end of the month" or "we stopped saying
'webmaster' in 2001" or other kinds of data-informed insights. Has
anybody tried something like this before?
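
To make the kind of question concrete, here is a rough Python sketch of the two examples above: counting postings by day of the month, and finding the last year a term such as "webmaster" shows up in a subject line. The sample rows and the function names (day_of_month_counts, last_year_mentioning) are made up for illustration; they are not real mcn-l data.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows: (posting date, subject line).
# These are invented for illustration, not taken from the archive.
postings = [
    (date(1998, 3, 30), "Job posting: Webmaster"),
    (date(2000, 6, 29), "Position: Webmaster, history museum"),
    (date(2004, 1, 12), "Job: Web Developer"),
    (date(2010, 9, 28), "Position open: Digital Media Manager"),
]


def day_of_month_counts(rows):
    """Count how many postings fall on each day of the month (1-31)."""
    return Counter(posted.day for posted, _subject in rows)


def last_year_mentioning(rows, term):
    """Return the latest year whose subject contains term, or None."""
    needle = term.lower()
    years = [posted.year for posted, subject in rows if needle in subject.lower()]
    return max(years) if years else None


if __name__ == "__main__":
    print(day_of_month_counts(postings).most_common(3))
    print(last_year_mentioning(postings, "webmaster"))
```

With ~1000 real rows, the same two functions would hint at whether the end-of-month pattern holds and when a job title dropped out of use.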
I guess what I'm asking is: is there an easy path from mail-archive.com
search results into a spreadsheet (I guess MySQL or Postgres would be OK
too) or some other kind of analysis tool?
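
One possible path, as a minimal sketch only: parse a saved copy of the results page with Python's standard html.parser and write the rows out as CSV, which any spreadsheet can open. The class names "subject", "date" and "body" and the SAMPLE_HTML snippet are assumptions for illustration. I have not checked them against the real mail-archive.com markup, so they would need to be swapped for whatever the page actually uses.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup, NOT the real mail-archive.com page structure.
SAMPLE_HTML = (
    '<div class="result">'
    '<span class="subject">Job: Webmaster</span>'
    '<span class="date">1998-03-30</span>'
    '<p class="body">Apply by mail.</p>'
    '</div>'
)


class ResultParser(HTMLParser):
    """Collect the text of elements whose class is listed in FIELDS.

    Assumes each field is a flat element (no nested tags) and that the
    last field in FIELDS closes out one search result.
    """

    FIELDS = ("subject", "date", "body")  # assumed class names

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None
        self._current = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.FIELDS:
            self._field = cls
            self._current[cls] = ""

    def handle_data(self, data):
        if self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if self._field:
            finished, self._field = self._field, None
            if finished == self.FIELDS[-1]:
                self.rows.append({k: v.strip() for k, v in self._current.items()})
                self._current = {}


def parse_results(html):
    """Return a list of dicts, one per search result found in html."""
    parser = ResultParser()
    parser.feed(html)
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=ResultParser.FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(parse_results(SAMPLE_HTML)))
```

The same list of rows could just as well be loaded into MySQL or Postgres instead of CSV. The parsing step is the part that depends on the page structure.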
Thanks!
Matt
_______________________________________________
Gossip mailing list
https://www.mail-archive.com/gossip@mail-archive.com
https://www.mail-archive.com/cgi-bin/mailman/options/gossip