Re: [CODE4LIB] ISBD punctuation was [CODE4LIB] Getting data from Voyager into XML?
Walter Lewis wrote:

> Perhaps what Erik's put his finger on here is as good an excuse as any to raise the Death To ISBD Punctuation banner one more time. Some 60s/70s field termination punctuation rules are at the heart of most of the crud you're trying to scrape off these records. If ever there was a set of encoding rules that were more misguided, I've been fortunate not to encounter them.

The problem is not ISBD punctuation (which is, after all, just semantic markup for humans), but ISBD punctuation _embedded in_ MARC markup, which means we've got two layers of markup intermingled. There's no reason to store "semantic" punctuation when the semantic punctuation is clearly implied by field or subfield delimiters.

But ISBD punctuation is really cool... especially if you've ever looked at an Asian or Cyrillic catalogue card and been able to identify the series statement just from the punctuation.

- David

-- 
David J. Fiander
Digital Services Librarian
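To make the "two layers of markup" point concrete, here is a minimal Python sketch, assuming a title field (MARC 245) held as a plain list of (subfield code, value) pairs. The names `PREFIX_245`, `clean_subfields`, and `display_245` are hypothetical, and the punctuation table is deliberately simplified (for instance, it ignores parallel titles introduced with " = "). It only illustrates that the separators can be dropped from storage and regenerated from the subfield codes at display time.

```python
import re

# Assumed, simplified mapping for a 245 field: the ISBD punctuation that
# introduces each subfield when the field is displayed.
PREFIX_245 = {"b": " : ", "c": " / "}

# Trailing separator punctuation that was stored at the end of a subfield.
# The final full stop of the field is left alone in this sketch.
TRAILING = re.compile(r"\s*[:/;=,]\s*$")


def clean_subfields(subfields):
    """Return (code, value) pairs with trailing separator punctuation removed."""
    return [(code, TRAILING.sub("", value.strip())) for code, value in subfields]


def display_245(subfields):
    """Rebuild a display string, deriving punctuation from the subfield codes."""
    parts = []
    for code, value in clean_subfields(subfields):
        if parts:
            # Unknown codes fall back to a plain space.
            parts.append(PREFIX_245.get(code, " "))
        parts.append(value)
    return "".join(parts)


stored = [
    ("a", "Moby Dick :"),
    ("b", "or, The whale /"),
    ("c", "Herman Melville."),
]
cleaned = clean_subfields(stored)
shown = display_245(stored)
```

Here `cleaned` holds the values without the embedded separators, and `shown` is the card-style string rebuilt from the subfield codes alone.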
[CODE4LIB] ISBD punctuation was [CODE4LIB] Getting data from Voyager into XML?
Erik Hatcher wrote:

> [snip]
> I am, however, skeptical of a purely MARC -> XSLT -> Solr solution. The MARC data I've seen requires some basic cleanup (removing dots at the end of subjects, normalizing dates, etc.) in order to be useful as facets. While XSLT is powerful, this type of data manipulation is better (IMO) done with scripting languages that allow for easy tweaking in a succinct way.

Perhaps what Erik's put his finger on here is as good an excuse as any to raise the Death To ISBD Punctuation banner one more time. Some 60s/70s field termination punctuation rules are at the heart of most of the crud you're trying to scrape off these records. If ever there was a set of encoding rules that were more misguided, I've been fortunate not to encounter them.

Walter
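As an illustration of the kind of cleanup Erik mentions, here is a rough Python sketch, assuming the subject headings and date strings have already been pulled out of the MARC records as plain strings. The function names `clean_subject` and `normalize_year` are hypothetical and the heuristics are deliberately simple. This is not anyone's production code from this thread.

```python
import re


def clean_subject(value):
    """Strip a trailing full stop from a subject heading for use as a facet.

    Heuristic assumption: keep the stop when it follows a single capital
    letter that looks like an initial or abbreviation ("U.S.", "John J.").
    """
    value = value.strip()
    looks_like_initial = re.search(r"(?:^|[\s.])[A-Z]\.$", value)
    if value.endswith(".") and not looks_like_initial:
        value = value[:-1].rstrip()
    return value


def normalize_year(value):
    """Return the first plausible four-digit year in a date string, or None."""
    match = re.search(r"(?<!\d)(1[0-9]{3}|20[0-9]{2})(?!\d)", value)
    return int(match.group(1)) if match else None


subjects = [clean_subject(s) for s in ["Whaling.", "United States.", "U.S."]]
years = [normalize_year(d) for d in ["c1999.", "[2001?]", "1987-1990", "19--"]]
```

The small regular expressions are easy to tweak in a script as new oddities turn up in the data, which is the point being made about scripting languages versus XSLT.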