Enrico,

If you suspect that $xml_text is in latin1, maybe you could confirm this by 
running it through Encode::Guess and then encoding it to utf8: 
http://search.cpan.org/~dankogai/Encode-2.42/lib/Encode/Guess.pm
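
A minimal sketch of that check, using only the core Encode and Encode::Guess modules. The sample string and the helper name to_utf8_bytes are made up for illustration; in your script the input would be whatever bytes end up in $xml_text. One caveat: latin1 accepts any byte sequence, so input that is already valid UTF-8 makes the guess ambiguous. The sketch handles that case by passing such input through unchanged, which is an assumption on my part about what you would want.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK);
use Encode::Guess;

# Hypothetical helper: return UTF-8 encoded bytes for input that is
# either latin1 or already UTF-8.
sub to_utf8_bytes {
    my ($octets) = @_;

    # Default suspects are ascii and utf8; add latin1 as a candidate.
    my $enc = guess_encoding($octets, 'latin1');

    if (!ref $enc) {
        # A plain string means the guess failed or was ambiguous.
        # Valid UTF-8 is also valid latin1, so test strictly for UTF-8.
        my $copy = $octets;    # FB_CROAK may modify its argument
        return $octets if eval { decode('UTF-8', $copy, FB_CROAK); 1 };
        die "Cannot guess encoding: $enc\n";
    }

    # Decode with the guessed encoding, then re-encode as UTF-8.
    return encode('UTF-8', decode($enc->name, $octets));
}

# Stand-in for the fetched data: copyright sign as the single byte 0xA9.
my $xml_text = "\xA9 2010";
my $fixed    = to_utf8_bytes($xml_text);

# Print the result as hex bytes so the encoding is visible.
print join(' ', map { sprintf '%02X', ord } split //, $fixed), "\n";
```

If the hex dump shows C2 A9 where the input had a bare A9, the data was being treated as latin1 and the re-encoded string should be the one handed to MARC::Record.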

Mark

----- Original Message -----
> I don't understand what I am doing wrong, but I must be misusing either
> MARC::Record or Perl's UTF-8 support:
> I fetch an XML record from OCLC WorldCat, and turn it into a
> MARC::Record,
> then output it as
> 
> perl 5.10.0 --
> linux/ ubuntu 2.6.31-19-generic-pae #56-Ubuntu SMP
> 
> I'm using the WorldCat APIs to fetch an XML record. When I just run that
> from PHP, the data is just fine: properly encoded UTF-8, so the code
> point A9 (copyright symbol) is C2 A9.
> 
> However, when I use LWP and 'get', like this:
> $URL = "http://www.worldcat.org/webservices/catalog/content/$onum?servicelevel=full&wskey=$WSKEY";
> $xml_text = get $URL;
> and I print $xml_text, the copyright symbol is now just 0xA9, so when I
> do a
> 
> MARC::Record->new_from_xml($xml_text);
> 
> I find myself with an XML record with an ISO Latin-1 copyright symbol.
> 
> I think the string is being treated as an ISO Latin-1 string and
> transcoded, and then I end up with an invalid MARC record.
> 
> rick
> 
> --
> Enrico Silterra Software Engineer
> 501 Olin Library Cornell University Ithaca NY 14853
> Voice: 607-255-6851 Fax: 607-255-6110 E-mail: es...@cornell.edu
> http://www.library.cornell.edu/dlit
> "Out of the crooked timber of humanity no straight thing was ever
> made"
