[CODE4LIB] Reminder: THATCamp for Computational Archaeology registration deadline is TODAY

2012-06-10 Thread Ethan Gruber
Today, June 10, is the final day to register for THATCamp CAA-NA, an unconference for computer applications in archaeology. The free event will be held Friday, August 10 in the Harrison-Small Special Collections Library of the University of Virginia, Charlottesville. It is sponsored by the

Re: [CODE4LIB] Best way to process large XML files

2012-06-10 Thread Ross Singer
Steve, I'm not sure if you were hoping for a Ruby-related answer to your question (since you mentioned Nokogiri), but if you are, take a look at ruby-marc's GenericPullParser [1] as an example of using a SAX parser for this sort of thing. It doesn't quite answer your question, but I think it might
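
For readers following this thread, here is a rough sketch of the SAX idea mentioned above: the parser streams through the file and fires events per element, so the whole document never has to sit in memory. This is not ruby-marc's GenericPullParser or Nokogiri; it uses Python's standard xml.sax module instead, and the element names (record, title), the RecordCounter class, and the summarize helper are all hypothetical, chosen only for illustration.

```python
import io
import xml.sax


class RecordCounter(xml.sax.ContentHandler):
    """Counts <record> elements and collects the text of each <title>.

    The element names are assumptions made for this sketch, not taken
    from any particular schema discussed in the thread.
    """

    def __init__(self):
        super().__init__()
        self.count = 0
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "record":
            self.count += 1
        elif name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def summarize(stream):
    """Stream-parse an XML file object and return (record count, titles)."""
    handler = RecordCounter()
    xml.sax.parse(stream, handler)
    return handler.count, handler.titles


# Tiny in-memory stand-in for a large file on disk.
sample = io.BytesIO(
    b"<collection>"
    b"<record><title>First</title></record>"
    b"<record><title>Second</title></record>"
    b"</collection>"
)
count, titles = summarize(sample)
print(count, titles)
```

For a real file, passing an open binary file object (or a filename) to summarize in place of the in-memory sample should work the same way, since xml.sax.parse accepts either.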

Re: [CODE4LIB] The history of Code4Lib and MediaWiki development.

2012-06-10 Thread stuart yeates
On 09/06/12 06:18, Klein,Max wrote: I was just wondering if there have been any efforts from Code4Lib into MediaWiki development? I know that there have been some Wikipedia templates and bots designed to interface with library services. Yet what about cold hard MediaWiki extensions? Has there

[CODE4LIB] Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works

2012-06-10 Thread Charles W. Bailey, Jr.
Digital Scholarship has released the Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works: http://digital-scholarship.org/dcpb/dcb.htm In a rapidly changing technological environment, the difficult task of ensuring long-term access to digital information is increasingly

Re: [CODE4LIB] Best way to process large XML files

2012-06-10 Thread stuart yeates
On 09/06/12 06:36, Kyle Banerjee wrote: How do you guys deal with large XML files? There have been a number of excellent suggestions from other people, but it's worth pointing out that sometimes low tech is all you need. I frequently use sed to do things such as replace one domain name

Re: [CODE4LIB] Best way to process large XML files

2012-06-10 Thread Edward M Corrado
FWIW: I use sed all the time to edit XML files. I wouldn't say I have any really large files (which is why I didn't respond earlier) but it works great for me. Regular expressions are your friend. -- Edward M. Corrado On Jun 10, 2012, at 19:25, stuart yeates stuart.yea...@vuw.ac.nz wrote:

Re: [CODE4LIB] The history of Code4Lib and MediaWiki development.

2012-06-10 Thread Tomas Saorin
I think Wikipedia needs MediaWiki to offer some kind of tool for marking up text so that article content can be repurposed for different reading levels. That is, parse the article's text and generate a link to a Kids version or Teen version, mixing automatic text processing and human tags. But perhaps these are the