Hi Karen,

Sorry for the delay on this. Here are links to both the original MARC
extract and the PyMARC/MARCbreaker formatted text version:

MARC - https://dl.dropboxusercontent.com/u/33663928/dnb_sample.mrc
Text - https://dl.dropboxusercontent.com/u/33663928/dnb_sample.txt
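
In case it helps with the question further down about whether the non-sort
markers are stored as bare 0x98/0x9C bytes or as the UTF-8 pairs 0xC2 0x98 /
0xC2 0x9C, here is a minimal sketch that counts both forms in raw bytes. It
uses only the Python standard library. The function name and the inline
sample bytes are made up for illustration; they are not taken from the file.

```python
import re

# A 0x98 or 0x9C byte at the start of the data or right after an ASCII byte
# cannot be part of a valid UTF-8 sequence, so it is a "bare" marker byte.
BARE_MARKER = re.compile(rb"(?<![\x80-\xff])[\x98\x9c]")


def count_nonsort_markers(data: bytes) -> dict:
    """Count non-sort markers in raw bytes (hypothetical helper)."""
    return {
        "utf8_begin": data.count(b"\xc2\x98"),  # U+0098 encoded as UTF-8
        "utf8_end": data.count(b"\xc2\x9c"),    # U+009C encoded as UTF-8
        "bare": len(BARE_MARKER.findall(data)),
    }


# Made-up field contents, not taken from dnb_sample.mrc.
utf8_sample = "\u0098Der \u009cZauberberg".encode("utf-8")
bare_sample = b"a\x98Der \x9cZauberberg"

print(count_nonsort_markers(utf8_sample))
print(count_nonsort_markers(bare_sample))
```

To try it on the real extract, the bytes could be read with
open("dnb_sample.mrc", "rb").read() and passed to the same function.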

I also came across this DNB MARC documentation, which may be useful:
http://www.dnb.de/SharedDocs/Downloads/DE/DNB/standardisierung/marc21FieldsDnbZdbRecords2009En.pdf?__blob=publicationFile
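
On stripping the non-sort markers for OL (discussed in the quoted thread
below): a minimal sketch, assuming the field values have already been decoded
to Unicode so that the markers appear as U+0098 and U+009C. The helper names
and the sample title are mine, for illustration only.

```python
import re

NSB = "\u0098"  # NON-SORT BEGIN / START OF STRING
NSE = "\u009c"  # NON-SORT END / STRING TERMINATOR


def strip_markers(text: str) -> str:
    """Drop only the two control characters, keeping the enclosed text."""
    return text.replace(NSB, "").replace(NSE, "")


def sort_form(text: str) -> str:
    """Drop the markers and the text between them (e.g. a leading article)."""
    return re.sub(NSB + ".*?" + NSE, "", text).strip()


title = NSB + "Der " + NSE + "Zauberberg"  # made-up example value
print(strip_markers(title))  # Der Zauberberg
print(sort_form(title))      # Zauberberg
```

strip_markers is probably what OL wants for display; sort_form is only there
to show what the markers were meant to enable.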

I originally tried to include them as attachments, since I figured it would
be useful to have them in the archive rather than at an ephemeral Dropbox
location, but the mailer is set up with a very low 40KB maximum message
size (this would have been well under 200KB with both attachments).

Tom



On Thu, Sep 12, 2013 at 12:18 PM, Karen Coyle <[email protected]> wrote:

>
>
> On 9/12/13 8:37 AM, Tom Morris wrote:
>
>
>>
>> It looks like the 1998 proposal was approved according to these
>> guidelines from June:
>> http://www.loc.gov/marc/nonsorting.html
>>
>
>
> Yes, it was approved, but never implemented in the US. It was added to aid
> the transition of the German libraries to MARC - they already had this
> capability in their format (MAB). I've never seen it in "live" records
> before, so it's still got only limited use.
>
>
>
>
>>
>
>> OK, after a maze of documents all pointing at each other, I found a place
>> that defines this in a useful fashion:
>> http://lcweb2.loc.gov/diglib/codetables/45.html
>>
>> MARC-8  MARC-8
>>         as C1   UCS     UTF-8   CHAR    C?      NAME                              ALT     ALT UTF-8
>>         88      0098    C298    ˜               NON-SORT BEGIN / START OF STRING
>>         89      009C    C29C    œ               NON-SORT END / STRING TERMINATOR
>>
>> which explains the oe ligature in your data, although the graphic
>> representation doesn't mean it's the same as the real tilde and oe
>> ligature.  The real tilde has UTF-8 representation of 0x7E instead of
>> 0xC298.
>>
>
>
> Great, thanks. I'd forgotten about those code tables. So the 88, 89 were
> 8-bit ASCII, as defined in MARC-8, not in "normal" ASCII. MARC doesn't use
> the extended Latin combined characters, but has separate codes for
> character and diacritic. (And redefines all of the values of 8-bit ASCII in
> a proprietary way!)
>
>
>
>
>> The weird thing is that your data seems to have the raw 0x98 and 0x9C
>> without the 0xC2 byte introducing them.  That doesn't seem correct on
>> the surface, but I'm not sure where you cut & pasted your data from.
>>
>
> You can find them as 0098 and 009C in this code page:
> http://www.unicode.org/charts/PDF/U0080.pdf
>
> I did a hex display of the data in a text editor (TextMate) and can't
> attest to its accuracy. I also don't know whether the creation of the
> MARC file or the display of it altered something - that's the real
> bugaboo with trying to "look" at character sets. I no longer have a hex
> dump or binary dump program around. (At one point I could read both with
> ease... glad that's behind me!)
>
>
>
>
>>     For OL (which doesn't really need non-filing characters, I believe) we
>>     could just strip these characters out. If someone could strip them out
>>     of the current set I could run marcedit again. I'm just trying to get a
>>     good look at the records to see if they'll translate well to OL fields.
>>
>>
>> Rather than futzing around with closed source marcedit, could I just use
>> PyMarc to make a formatted dump of a few records for you?
>>
>
>
> That would be great, thanks. Actually, the whole set of 100 that Johannes
> provided would be ideal:
>
> https://dl.dropboxusercontent.com/u/38124925/dnb_sample.mrc
>
> kc
>
>
>
>
>> Tom
>>
>>     I'm heading off for 10 days to the Dublin Core conference in Lisbon. If
>>     anyone else has time to do analysis on this, please feel free:
>>
>>     http://archive.org/details/marc21_records_german_national_library
>>
>>     kc
>>
>>     [1] http://www.loc.gov/marc/marbi/1998/98-16r.html
>>
>>
>>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
Archives: http://www.mail-archive.com/[email protected]/
To unsubscribe from this mailing list, send email to 
[email protected]
