I don't understand what I am doing wrong, but I must be misusing either MARC::Record or Perl's UTF-8 support: I fetch an XML record from OCLC WorldCat, turn it into a MARC::Record, then output it as
Environment: perl 5.10.0 -- Linux / Ubuntu 2.6.31-19-generic-pae #56-Ubuntu SMP

I'm using the WorldCat APIs to fetch an XML record. When I just run that from PHP, the data is just fine: UTF-8 properly encoded, so the code point A9 (copyright symbol) is C2 A9.

However, when I use LWP and 'get', like this:

    $URL = "http://www.worldcat.org/webservices/catalog/content/$onum?servicelevel=full&wskey=$WSKEY";
    $xml_text = get $URL;

and I print $xml_text, the copyright symbol is now just 0xA9. So when I do a new

    MARC::Record->new_from_xml($xml_text);

I find myself with an XML record with an ISO Latin-1 copyright symbol.

I think the string is being treated as an ISO Latin-1 string and transcoded, and then I end up with an invalid MARC record.

rick

--
Enrico Silterra
Software Engineer
501 Olin Library
Cornell University
Ithaca NY 14853
Voice: 607-255-6851
Fax: 607-255-6110
E-mail: es...@cornell.edu
http://www.library.cornell.edu/dlit
"Out of the crooked timber of humanity no straight thing was ever made"