Enrico,

If you suspect that $xml_text is in Latin-1, you could confirm this by running it through Encode::Guess, then decoding it and re-encoding it as UTF-8:

http://search.cpan.org/~dankogai/Encode-2.42/lib/Encode/Guess.pm
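A minimal sketch of that check, assuming $xml_text holds raw bytes rather than an already-decoded Perl character string. The helper name guess_and_reencode and the sample string are made up for illustration; guess_encoding, decode and encode are the documented Encode / Encode::Guess calls.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Encode::Guess;

# Hypothetical helper: guess the encoding of raw bytes and return
# the guessed encoding name plus the same text re-encoded as UTF-8 bytes.
sub guess_and_reencode {
    my ($bytes) = @_;

    # ASCII and UTF-8 are suspects by default; latin1 is added as an extra one.
    my $enc = guess_encoding($bytes, 'latin1');

    # On failure or ambiguity guess_encoding returns an error string, not an object.
    die "Could not guess encoding: $enc\n" unless ref $enc;

    my $chars = $enc->decode($bytes);               # bytes -> Perl characters
    return ($enc->name, encode('UTF-8', $chars));   # characters -> UTF-8 bytes
}

# Stand-in for $xml_text: a copyright sign stored as the single byte 0xA9.
my $xml_text = "\xA9 2010";
my ($name, $utf8) = guess_and_reencode($xml_text);

# Print the guessed encoding and a hex dump of the re-encoded bytes.
printf "%s: %s\n", $name, join ' ', map { sprintf '%02X', ord } split //, $utf8;
```

For this sample the guess should come back as iso-8859-1 and the copyright sign should become C2 A9. One caveat: any byte sequence is valid Latin-1, so if the data is already valid UTF-8 with non-ASCII characters, the guess is ambiguous and guess_encoding returns an error string instead of an encoding object.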
Mark

----- Original Message -----
> I don't understand what I am doing wrong, but I must be misusing either
> MARC::Record or Perl's UTF-8 support.
>
> I fetch an XML record from OCLC WorldCat, turn it into a MARC::Record,
> then output it as
>
> perl 5.10.0 --
> linux/ ubuntu 2.6.31-19-generic-pae #56-Ubuntu SMP
>
> I'm using the WorldCat APIs to fetch an XML record. When I just run that
> from PHP, the data is just fine: UTF-8 properly encoded, so the code point
> A9 (copyright symbol) is C2 A9.
>
> However, when I use LWP and 'get', like this:
>
> $URL = "http://www.worldcat.org/webservices/catalog/content/$onum?servicelevel=full&wskey=$WSKEY";
> $xml_text = get $URL;
>
> and I print $xml_text, the copyright symbol is now just 0xA9, so when I do a
>
> new MARC::Record->new_from_xml($xml_text);
>
> I find myself with an XML record with an ISO Latin-1 copyright symbol.
>
> I think the string is being treated as an ISO Latin-1 string and
> transcoded, and then I end up with an invalid MARC record.
>
> rick
>
> --
> Enrico Silterra Software Engineer
> 501 Olin Library Cornell University Ithaca NY 14853
> Voice: 607-255-6851 Fax: 607-255-6110 E-mail: es...@cornell.edu
> http://www.library.cornell.edu/dlit
> "Out of the crooked timber of humanity no straight thing was ever made"