> You also could consider to grok Jason Thomale's "Interpreting MARC:
> Where's the Bibliographic Data?" < http://journal.code4lib.org/articles/3832 >
That's a very good article, as it highlights how the prescribed punctuation
both gets in the way of extracting parts of the data and provides extra
context to the subfields.
> It is not a MOM (MARC Object Model) or rather an object model for
> any format derived from ISO 2709 and its concepts of files, records,
> (flavors of) fields and subfields and therefore no abstract API
> can be specified (prescribing that some operation X is defined on
> record objects and yields field objects).
If we are talking about ISO 2709 in general, that is, the whole family of MARC
formats, then you have to remember that UNIMARC and obsolete formats like
UKMARC have very different requirements. UKMARC and UNIMARC are actually much
easier to work with than MARC21 because the ISBD punctuation is not carried in
the record but is generated from the subfield tags. So you don't have to say
"give me the 245 $a and $b but strip / off the end if present" because the
slash is not there. And there is a different subfield tag to introduce a
parallel title, so you don't need to distinguish :$b from =$b.
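
To make the difference concrete, here is a rough Python sketch of the two approaches. It does not use any real MARC library: subfields are modelled as plain (code, value) tuples, the function names are invented for illustration, and the sample titles are made up. The UNIMARC 200 mapping ($d parallel title, $e other title information, $f and $g statements of responsibility) is my assumption of the usual ISBD marks and should be checked against the format documentation.

```python
# Illustrative only: subfields are (code, value) tuples, not objects
# from any real MARC library. All names and sample data are hypothetical.

# ISBD marks that MARC21 cataloguing leaves at the end of a subfield
# to introduce the *next* subfield.
ISBD_TRAILING = (" /", " :", " =", " ;")


def strip_isbd(value):
    """MARC21 style: remove one trailing ISBD mark, if present."""
    value = value.rstrip()
    for mark in ISBD_TRAILING:
        if value.endswith(mark):
            return value[: -len(mark)].rstrip()
    return value


# Assumed mapping from UNIMARC 200 subfield codes to the ISBD mark
# that precedes each element when the title is displayed.
UNIMARC_200_PREFIX = {"a": "", "d": " = ", "e": " : ", "f": " / ", "g": " ; "}


def render_isbd(subfields):
    """UNIMARC style: generate the punctuation from the subfield codes."""
    return "".join(
        UNIMARC_200_PREFIX.get(code, " ") + value for code, value in subfields
    )


# MARC21-like 245: punctuation is stored in the data and must be stripped.
marc21_245 = [("a", "Main title :"), ("b", "a subtitle /"), ("c", "A. Author.")]
clean = [strip_isbd(value) for _, value in marc21_245]

# UNIMARC-like 200: the data is clean and punctuation is added for display.
unimarc_200 = [("a", "Main title"), ("d", "Parallel title"),
               ("e", "a subtitle"), ("f", "A. Author")]
display = render_isbd(unimarc_200)
```

Note that the stripping function deliberately leaves the final full stop of "A. Author." alone, because from the data alone you cannot tell whether it is prescribed punctuation or part of the transcribed text.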
In the UK most libraries have been using MARC21 for a decade or more now. I don't
know how much use is still made of UNIMARC, or the other national formats, nor
how good they were. It seems as though in the last twenty years many countries
have made moves towards MARC21 because of the sheer numbers of records
available in that format. It's just a pity that it's possibly the worst of the
ISO 2709 formats to work with if you want to repurpose the data!
I hope that BIBFRAME is not going to make the same mistakes. I have not been
following that initiative in detail, but I've seen a few examples of data with
punctuation hanging about at the end. It is hard to tell whether it's prescribed
punctuation or copied from the book.
The title field, in particular, is much more akin to HTML markup than data
fields in a database. In antiquarian cataloguing rules like DCRM, the emphasis
is on exact transcription from the title page, where the presence or absence of
punctuation can make a difference in identifying variant editions. In MARC21
we get the crazy situation where the cataloguers transcribe the exact
punctuation from the title page and *add* the ISBD punctuation to the MARC21
record. This makes it very hard to present the lay-person with anything
Head of Digital and Bibliographic Services,
Durham University Library, Stockton Road, Durham, DH1 3LY
+44 (0)191 334 2941