MARC::Charset question

moconnor59 Fri, 18 May 2007 03:56:45 -0700

Hi,

I'm using marc8_to_utf8() on Library of Congress data, and the output
text occasionally contains null characters. I'm wondering what this
means.


An example is the author (personal name) of the book that can be found
at http://catalog.loc.gov/ by searching for ISBN 5040039875. The
website itself appears to display a corrupted name, which I'm guessing
may be part of the problem here.

This name is 'Dontsova, Daria' (approximately). The input bytes, in hex,
are 446f6eeb74ec736f76612c20446172a7eb69ec612e. When transcoded by
marc8_to_utf8() the result is
446f6e74cda173006f76612c20446172cab969cda161002e, which contains two
null (00) bytes.

Is it safe to ignore these null characters (i.e. strip them out of the
result, which otherwise seems good)?
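
To make the question concrete, here is a minimal sketch of the check I
have in mind. It is written in Python rather than Perl purely to inspect
the bytes, and it only works on the output hex string quoted above. The
variable names are my own, and it does not call MARC::Charset at all. It
locates the null bytes, strips them, and confirms that what remains still
decodes as UTF-8. It does not show whether stripping is the right fix.

```python
# Output of marc8_to_utf8() as quoted above, written as hex.
utf8_hex = "446f6e74cda173006f76612c20446172cab969cda161002e"
raw = bytes.fromhex(utf8_hex)

# Byte offsets of every null byte in the transcoded output.
null_offsets = [i for i, b in enumerate(raw) if b == 0]

# Strip the nulls, then check that the remainder is still valid UTF-8.
cleaned = raw.replace(b"\x00", b"")
name = cleaned.decode("utf-8")  # raises UnicodeDecodeError if invalid

print("null byte offsets:", null_offsets)
print("cleaned hex:      ", cleaned.hex())
print("cleaned text:     ", ascii(name))  # non-ASCII shown as \u escapes
```

In this sample each null sits directly after a plain ASCII letter, so
removing it does not split any multi-byte UTF-8 sequence.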

Thanks,

Michael
