2008/7/3 Dan Scott <[EMAIL PROTECTED]>:
> 2008/7/3 Frances Dean McNamara <[EMAIL PROTECTED]>:
>> We are using yaz to convert (we already have a setup using that for our 
>> AquaBrowser weekly dumps of the db, so they used that)  So this happened 
>> when it was running with the xml parameter on a yaz file, then I reproduced 
>> the problem with a straight marc file using the perl.
>>
>> I'll ask Dale to look at your yaz command line as opposed to the one we have 
>> been using.  Thanks.
>>
>> I guess what we have discovered is that we may have to spend some time on a 
>> custom conversion bib program if we went with this as all sorts of 
>> interesting issues may show up in such a big file.  Turns out the process 
>> would skip that record and go on but I don't think it writes an error which 
>> we would need.
>>
>> That was LC cataloging, so apparently sometimes they do add a 500 with no 
>> subfield code.  The problem looks like it happens when the subfield 
>> delimiter and code are missing AND the text starts with a quotation mark.  We 
>> won't try to fix right now, just note it as an issue
>>
>
> Ah, it's actually very helpful to provide the exact toolset /
> processing chain you're using when looking for help debugging a
> problem. I retract any aspersions that may have been cast on
> MARC::Record / MARC::File::XML!
>
> And embarrassingly for me, if you look at the XML record I sent, it
> has <subfield code="&quot;">August 1993"</subfield> for the offending
> subfield rather than <subfield code="a">August 1993"</subfield>. So
> yaz 2.1.56 doesn't resolve that problem. I wouldn't be surprised if a
> newer version resolves that, though.

Well, I just tried 3.0.34 (released just a few weeks ago) and it shows
exactly the same problem. And reading the MARC21 specs for subfield
codes, it's not a bug: the " symbol is one of the characters reserved
for local definition as a data element identifier:

http://www.loc.gov/marc/specifications/specrecstruc.html#varifields

So yes, you'll have to either preprocess your MARC21 data, or
post-process the MARC21XML data, if you really want a subfield 'a'
where you currently have a subfield '"'. I've used the latter approach
for this problem in the past; it's pretty straightforward to parse
through the XML and globally change:

<subfield code="&quot;">

to:

<subfield code="a">"
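
For what it's worth, here is a minimal sketch of that post-processing
step in Python. It assumes the converter writes the stray code exactly
as shown above (code="&quot;"), and that 'a' is the subfield you want in
every affected field; the function name is just made up for illustration.

```python
# Sketch only: rewrite subfields whose code is a quotation mark so that
# they become subfield 'a' with the quotation mark restored in the text.

# Assumption: the MARCXML serializes the stray code exactly like this.
BAD_OPEN = '<subfield code="&quot;">'
# Replacement: subfield 'a', with the quotation mark put back in the data.
GOOD_OPEN = '<subfield code="a">"'


def fix_quote_subfields(marcxml):
    """Return (fixed_text, number_of_subfields_changed)."""
    changed = marcxml.count(BAD_OPEN)
    return marcxml.replace(BAD_OPEN, GOOD_OPEN), changed


sample = (
    '<datafield tag="500" ind1=" " ind2=" ">'
    '<subfield code="&quot;">August 1993"</subfield>'
    '</datafield>'
)
fixed, changed = fix_quote_subfields(sample)
print(changed)
print(fixed)
```

Returning the count gives you something to log, since a silent global
replace on a big file makes it hard to tell which records were touched.
If your output spells the attribute differently (single quotes, say),
the BAD_OPEN string would need adjusting.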

-- 
Dan Scott
Laurentian University
