Python's standard library has a module (mailbox) that will parse mbox files.
It worked on the test file I downloaded from Gmail. If all you want are the
message bodies, it looks like you can do that in seven lines. Obviously, this
doesn't guarantee much of anything for the jobs mbox files.
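
For anyone who wants to try it, here is a rough sketch of that approach using the standard library's mailbox module. It is not the exact seven lines mentioned above; the function name message_bodies and the file name jobs.mbox are made up for illustration, and it only keeps the first text/plain part of each message.

```python
import mailbox


def message_bodies(mbox_path):
    """Yield the first text/plain body of each message in an mbox file.

    Hypothetical helper: messages with no text/plain part are skipped.
    """
    for message in mailbox.mbox(mbox_path):
        # walk() visits the message itself and, for multipart mail, each part.
        for part in message.walk():
            if part.get_content_type() != "text/plain":
                continue
            payload = part.get_payload(decode=True)  # bytes, or None
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                yield payload.decode(charset, errors="replace")
            break  # only the first text/plain part per message


if __name__ == "__main__":
    # "jobs.mbox" is an assumed file name, not a real export.
    for body in message_bodies("jobs.mbox"):
        print(body)
        print("-" * 40)
```

HTML-only postings would be skipped by this, so real listserv exports may need extra handling.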
Looking at some of the posts on
Hi all,
I've done a couple projects mining the data from the code4lib listserv
(e.g. https://ejournals.bc.edu/index.php/ital/article/view/5893 ). Both
times the fastest route was finding helpful folks involved in it to provide
me with a data dump vs. spending time on a scraper.
The most recent
I would see if you can just get an SQL or CSV dump of the tables; maybe it’s
not super-normalized and you can get most of what you need in a table or two,
or perhaps the provider would be so kind as to write a join for the data you
need, and write a dump to a CSV file which you can then import
The initial commit in https://github.com/code4lib/shortimer/ was November
2011, which is ten years for some values of ten. Taking a quick and
noncomprehensive glance around, I see postings as old as 2005. I don't see
an obvious API, but maybe a maintainer could weigh in about data dump
On Jan 22, 2021, at 11:11 AM, Jill Ellern wrote:
> I'm doing some research into systems librarian duties and wondering if there
> is an easy way to get a dump of the code4lib jobs from the last 10 years? In
> excel format?
Easy? I'd be surprised.
There are two or three sources of the
Hey folks,
I'm doing some research into systems librarian duties and wondering if there is
an easy way to get a dump of the code4lib jobs from the last 10 years? In
excel format?
Jill Ellern
4LIB@LISTS.CLIR.ORG
> Subject: Re: [Code4Lib] Code4Lib Jobs ..?
>
> If you look at the sample template for configuring "short timer", the
> Python code that powers the jobs site, it has the old listserv address in
> it.
>
> https://na01.safelinks.protection.outlook.com/?url