2008/7/3 Dan Scott <[EMAIL PROTECTED]>:
> 2008/7/3 Frances Dean McNamara <[EMAIL PROTECTED]>:
>> We are using yaz to convert (we already have a setup using that for our
>> AquaBrowser weekly dumps of the db, so they used that). So this happened
>> when it was running with the xml parameter on a yaz file, then I reproduced
>> the problem with a straight marc file using the perl.
>>
>> I'll ask Dale to look at your yaz command line as opposed to the one we have
>> been using. Thanks.
>>
>> I guess what we have discovered is that we may have to spend some time on a
>> custom conversion bib program if we went with this as all sorts of
>> interesting issues may show up in such a big file. Turns out the process
>> would skip that record and go on but I don't think it writes an error which
>> we would need.
>>
>> That was LC cataloging, so apparently sometimes they do add a 500 with no
>> subfield code. The problem looks like it happens when the subfield
>> delimiter and code are missing AND the text starts with a quotation mark. We
>> won't try to fix right now, just note it as an issue.
>>
>
> Ah, it's actually very helpful to provide the exact toolset /
> processing chain you're using when looking for help debugging a
> problem. I retract any aspersions that may have been cast on
> MARC::Record / MARC::File::XML!
>
> And embarrassingly for me, if you look at the XML record I sent, it
> has <subfield code=""">August 1993"</subfield> for the offending
> subfield rather than <subfield code="a">August 1993"</subfield>. So
> yaz 2.1.56 doesn't resolve that problem. I wouldn't be surprised if a
> newer version resolves that, though.

Well, I just tried 3.0.34 (released just a few weeks ago) and it shows
exactly the same problem. And reading the MARC21 specs for subfield
codes, it's not a bug: the " symbol is one of the characters reserved
for local definition as a data element identifier, so yaz is entitled
to read the leading quotation mark as the subfield code:

http://www.loc.gov/marc/specifications/specrecstruc.html#varifields

So yes, you'll have to either preprocess your MARC21 data, or
post-process the MARC21XML data, if you really want a subfield 'a'
where you currently have a subfield '"'. I've used the latter approach
for this problem in the past; it's pretty straightforward to parse
through the XML and globally change:

<subfield code=""">

to:

<subfield code="a">"

-- 
Dan Scott
Laurentian University
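
P.S. For anyone who wants a starting point, here is a minimal sketch of that global change, written in Python rather than Perl. It is a plain text substitution, not a real XML parse, and it rests on a few assumptions. The function name fix_quote_subfield_codes and its new_code parameter are made up for illustration. The start tag may not appear in your file exactly as quoted in this thread: a serializer that escapes attribute values would write code="&quot;" instead, so the sketch looks for that form and for a single-quoted attribute as well. Check what your own converter emits before relying on it.

```python
import re

# Ways the offending start tag might be serialized. This list is an
# assumption: the raw form quoted in this thread, an entity-escaped form,
# and a single-quoted attribute. Adjust it to match your converter's output.
_BAD_START_TAGS = [
    '<subfield code=""">',
    '<subfield code="&quot;">',
    "<subfield code='\"'>",
]
_BAD_START_TAG_RE = re.compile("|".join(re.escape(t) for t in _BAD_START_TAGS))


def fix_quote_subfield_codes(xml_text, new_code="a"):
    """Return (fixed_text, count), recoding every subfield '"' to new_code.

    The quotation mark that was consumed as the subfield code is put back
    at the start of the subfield data, as in the change described above.
    """
    replacement = '<subfield code="%s">"' % new_code
    # A callable replacement avoids backslash processing in the template.
    return _BAD_START_TAG_RE.subn(lambda match: replacement, xml_text)
```

Read the converted file into a string, pass it to the function, and write the first element of the result back out. The second element is the number of start tags changed, which is worth logging since the conversion itself apparently doesn't report these records. Note that this recodes every subfield '"' in the file, so it is only appropriate if you have no locally defined subfield '"' that you want to keep.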
