Re: [CODE4LIB] ISBD punctuation was [CODE4LIB] Getting data from Voyager into XML?

2007-01-19 Thread David J. Fiander

Walter Lewis wrote:


Perhaps what Erik's put his finger on here is as good an excuse as any
to raise the Death To ISBD Punctuation banner one more time.  Some
60s/70s field termination punctuation rules are at the heart of most of
the crud you're trying to scrape off these records.  If ever there was a
set of encoding rules that were more misguided, I've been fortunate not
to encounter them.


The problem is not ISBD punctuation (which is, after all, just semantic
markup for humans), but ISBD punctuation _embedded in_ MARC markup, which
means we've got two layers of markup intermingled.  There's no reason to
store "semantic" punctuation when the semantic punctuation is clearly
implied by field or subfield delimiters.
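
As a rough illustration of that point, here is a minimal Python sketch that stores the title subfields without any ISBD punctuation and regenerates it at display time from the subfield codes alone. The ISBD_PREFIX table and the render_title function are hypothetical names made up for this example, and the mapping is deliberately simplified: it assumes 245 $b is always other title information and $c is always the statement of responsibility, which real records do not guarantee.

```python
# Hypothetical mapping from 245 subfield code to the ISBD punctuation that
# precedes it. Simplified assumption: $b is other title information (" : ")
# and $c is the statement of responsibility (" / ").
ISBD_PREFIX = {"b": " : ", "c": " / "}


def render_title(subfields):
    """Build a display string from (code, value) pairs stored without punctuation."""
    parts = []
    for code, value in subfields:
        # The first subfield gets no prefix; later ones take theirs from the
        # subfield code, falling back to a plain space for unknown codes.
        prefix = ISBD_PREFIX.get(code, " ") if parts else ""
        parts.append(prefix + value.strip())
    return "".join(parts)


title = render_title([
    ("a", "Moby Dick"),
    ("b", "or, The whale"),
    ("c", "Herman Melville"),
])
print(title)
```

The stored values stay clean enough to index or facet on directly, and the punctuation only appears where a human is going to read it.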

But ISBD punctuation is really cool... especially if you've ever looked at
an Asian or Cyrillic catalogue card and been able to identify the series
statement just from the punctuation.

- David

--
David J. Fiander
Digital Services Librarian


[CODE4LIB] ISBD punctuation was [CODE4LIB] Getting data from Voyager into XML?

2007-01-19 Thread Walter Lewis

Erik Hatcher wrote:

[snip]
I am, however, skeptical of a purely MARC -> XSLT -> Solr solution.
The MARC data I've seen requires some basic cleanup (removing dots at
the end of subjects, normalizing dates, etc) in order to be useful as
facets.  While XSLT is powerful, this type of data manipulation is
better (IMO) done with scripting languages that allow for easy
tweaking in a succinct way.
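
For a sense of what that kind of cleanup might look like in a scripting language, here is a small Python sketch. The function names clean_subject and normalize_year are invented for this example, and the rules are assumptions rather than anything from an actual Voyager export: a trailing period is treated as field-termination punctuation, and the first four-digit run in a date string is treated as the year.

```python
import re


def clean_subject(value):
    """Strip surrounding whitespace and one terminal period from a subject heading."""
    value = value.strip()
    # Assumption: a final period is field-termination punctuation. Headings
    # that end in an abbreviation (e.g. "U.S.") would need an exception list.
    if value.endswith("."):
        value = value[:-1].rstrip()
    return value


def normalize_year(value):
    """Return the first four-digit year found in a date string, or None."""
    # Assumption: the first run of four digits is the publication year.
    match = re.search(r"\d{4}", value)
    return match.group(0) if match else None


print(clean_subject("Whaling -- Fiction."))
print(normalize_year("c1998."))
print(normalize_year("[2001?]"))
```

Each rule is a line or two, so adjusting it for one collection's quirks is a quick edit.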

Perhaps what Erik's put his finger on here is as good an excuse as any
to raise the Death To ISBD Punctuation banner one more time.  Some
60s/70s field termination punctuation rules are at the heart of most of
the crud you're trying to scrape off these records.  If ever there was a
set of encoding rules that were more misguided, I've been fortunate not
to encounter them.

Walter