Thank you to all who responded with software suggestions. https://github.com/ubleipzig/marctools is looking like the most promising candidate so far. The more I read through the recommendations, the more it dawned on me that I don't want to have to configure yet another Java toolchain (yes, I know, that may be personal bias).
Thank you to all who responded about the challenges of authority control in such collections. I'm aware of these issues. The current project is about marshalling resources that editors can use to make informed decisions, rather than automating the creation of articles. Because there is human judgement involved in the last step, I can afford to take a few authority control 'risks'.

cheers
stuart
--
I have a new phone number: 04 463 5692
________________________________________
From: Code for Libraries <CODE4LIB@LISTSERV.ND.EDU> on behalf of raffaele messuti <raffaele.mess...@gmail.com>
Sent: Monday, 3 November 2014 11:39 p.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] MARC reporting engine

Stuart Yeates wrote:
> Do any of these have built-in indexing? 800k records isn't going to fit in
> memory and if building my own MARC indexer is 'relatively straightforward'
> then you're a better coder than I am.

you could try marcdb[1] from marctools[2]

[1] https://github.com/ubleipzig/marctools#marcdb
[2] https://github.com/ubleipzig/marctools

--
raffaele