It looks like the dataset is available in XML format. Perhaps you can
import it into an XML database (eXist, exist-db.org, comes to mind) and
then generate a report via its query capabilities.
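As a rough illustration only (assuming the MARCXML has been loaded into a
collection such as /db/innz on a local eXist instance at its default port,
and treating 100 $a as the personal-name heading purely for the example), a
ranked list could be pulled out through eXist's REST interface with a few
lines of Python:

    import urllib.parse
    import urllib.request

    # XQuery evaluated server-side by eXist: rank 100 $a personal-name
    # headings by how many records reference them (field choice is
    # illustrative, not a claim about how the INNZ data is coded)
    XQUERY = """
    declare namespace marc = "http://www.loc.gov/MARC21/slim";
    for $name in distinct-values(
            //marc:datafield[@tag = '100']/marc:subfield[@code = 'a'])
    let $hits := count(//marc:datafield[@tag = '100']
                         /marc:subfield[@code = 'a'][. = $name])
    order by $hits descending
    return concat($hits, '|', $name)
    """

    # eXist's REST server takes the query and a result limit as URL parameters
    url = ("http://localhost:8080/exist/rest/db/innz?"
           + urllib.parse.urlencode({"_query": XQUERY, "_howmany": "50"}))
    with urllib.request.urlopen(url) as response:
        print(response.read().decode("utf-8"))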
Miles Fidelman
Jonathan Rochkind wrote:
If you are, can become, or know a programmer, that would be relatively
straightforward in any programming language using the open source MARC
processing library for that language (ruby marc, pymarc, perl marc, whatever).
You might find more trouble than you expect around authorities, though, since
they are less standardized in your corpus than you might like.
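For list (a), a minimal pymarc sketch might look like the following
(assuming the MARCXML dump is saved locally as innz.xml; using 100/600/700
$a as the personal-name headings is just an illustration, not a statement
about how the INNZ records are actually coded):

    from collections import Counter
    from pymarc import map_xml

    counts = Counter()

    def tally(record):
        # count every personal-name heading (100/600/700 $a) in each record
        for field in record.get_fields("100", "600", "700"):
            for name in field.get_subfields("a"):
                counts[name.strip(" ,.")] += 1

    # map_xml streams the MARCXML file record by record, so ~800,000
    # records never have to fit in memory at once
    map_xml(tally, "innz.xml")

    # emit the ranked list in MediaWiki syntax, most-referenced first
    for name, n in counts.most_common():
        print("* [[%s]] (%d records)" % (name, n))

The same loop could accumulate titles, dates and so on per heading, which
would give you the raw material for the stub biographies in (b).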
________________________________________
From: Code for Libraries [[email protected]] on behalf of Stuart Yeates
[[email protected]]
Sent: Sunday, November 02, 2014 5:48 PM
To: [email protected]
Subject: [CODE4LIB] MARC reporting engine
I have ~800,000 MARC records from an indexing service
(http://natlib.govt.nz/about-us/open-data/innz-metadata CC-BY). I am trying to
generate:
(a) a list of person authorities (and sundry metadata), sorted by how many
times they're referenced, in wikimedia syntax
(b) a view of a person authority, with all the records by which they're
referenced, processed into a wikipedia stub biography
I have established that this is too much data to process in XSLT or with
multi-line regexps in vi. What other MARC engines are out there?
The two options I'm aware of are learning multi-line processing in sed or
learning enough Koha to write reports in whatever its reporting engine is.
Any advice?
cheers
stuart
--
I have a new phone number: 04 463 5692
--
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra