Re: [CODE4LIB] MarcXML and char encodings

Sheila M. Morrissey Tue, 17 Apr 2012 13:23:52 -0700

From the XML standard:

        It is RECOMMENDED that character encodings registered (as charsets)
with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those
just listed, be referred to using their registered names; other encodings
SHOULD use names starting with an "x-" prefix. XML processors SHOULD match
character encoding names in a case-insensitive way and SHOULD either
interpret an IANA-registered name as the encoding registered at IANA for that
name or treat it as unknown (processors are, of course, not required to support
all IANA-registered encodings).

As I suggested -- since MARC8 isn't (so far as I know) registered -- you won't 
get far with most standard tools, in whatever language -- you'll have to extend 
them to first recognize the encoding name, and second, decode the content.
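
To illustrate with one stock tool, here is a minimal sketch using Python's 
standard-library ElementTree parser. The encoding label "MARC-8" is only a 
made-up name for this test (it is not a registered charset name), and the 
helper function parses() is hypothetical. I would expect other parsers to 
behave similarly, but I have only sketched this one.

```python
import xml.etree.ElementTree as ET


def parses(encoding_name):
    """Return True if ElementTree accepts a document declaring this encoding."""
    # Build a tiny ASCII-only document whose XML declaration names the encoding.
    doc = ('<?xml version="1.0" encoding="%s"?><record/>' % encoding_name).encode("ascii")
    try:
        ET.fromstring(doc)
    except (ET.ParseError, LookupError):
        # The parser does not know how to decode the declared encoding.
        return False
    return True


# "MARC-8" is an illustrative, unregistered label, not an official name.
for name in ("UTF-8", "iso-8859-1", "MARC-8"):
    print(name, parses(name))
```

The first two names are accepted (note the lowercase one, matched 
case-insensitively); the unregistered one is treated as unknown and the parse 
fails before any content is read.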

smm

-----Original Message-----
From: Jonathan Rochkind [mailto:[email protected]] 
Sent: Tuesday, April 17, 2012 4:19 PM
To: Code for Libraries
Cc: Sheila M. Morrissey
Subject: Re: [CODE4LIB] MarcXML and char encodings

On 4/17/2012 3:01 PM, Sheila M. Morrissey wrote:
> No -- it is perfectly legal -- but you MUST declare the encoding to BE Marc8 
> in the XML prolog,

Wait, how can you declare a Marc8 encoding in an XML 
declaration/prolog/whatever it's called?

The things that appear there need to be from a specific list, and I 
didn't think Marc8 was on that list?

Can you give me an example?  And, if you happen to have it, a link to the 
part of the XML standard that says this is legal?
