Whoops, sorry to Rick for the double post, but I realized I didn't send my answers to the list directly...

> perl 5.10.0 --
> linux/ ubuntu 2.6.31-19-generic-pae #56-Ubuntu SMP

Ok, that should have some nice Unicode options and should be using Unicode internally.

> however, when I use: LWP, and 'get',
> like this:
> $URL = "http://www.worldcat.org/webservices/catalog/content/$onum?servicelevel=full&wskey=$WSKEY";
> $xml_text = get $URL;
> and I print xml_text, the copyright symbol is now just 0xA9, so when I do a
>
> new MARC::Record->new_from_xml($xml_text);

MARC::Record wouldn't be the first place I'd look in this case. I'd first look at what LWP does. I know it has quite a few encoding options, and Perl now has more built-in UTF-8 support as well. What happens if you just dump the LWP result to a file?

Also, look at the decoded_content method in http://search.cpan.org/~gaas/HTTP-Message-6.02/lib/HTTP/Response.pm, i.e. $response->decoded_content. Note that it is a method on the HTTP::Response object that full LWP hands back, not on the plain string that 'get' returns. Also look at what the actual headers say; you may have to override them.

Then make sure the record's leader is set up correctly.

You might need to use Encode, but I don't think so. (I think LWP would put it into UTF-8 by default, but that might change based on system settings.) If you have problems, look at the encoding stuff: decode on the way in from LWP, then encode into UTF-8 before handing to MARC::Record.

(Btw, are you using LWP, or are you actually using LWP::Simple? Simple doesn't have as many Unicode options; I'd go with full-scale LWP.)

I know I've used http://juerd.nl/site.plp/perluniadvice in the past. It's got some useful info.

Jon Gorman
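
A rough sketch of the "decode on the way in, encode on the way out" idea, in case it helps. It is untested against WorldCat. The sub names are made up for illustration, and the ISO-8859-1 guess is only an assumption based on the lone 0xA9 byte, so check the real Content-Type header before relying on it. The first sub needs only the core Encode module; the second assumes full LWP (LWP::UserAgent) is installed and loads it only when called.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the body arrived as ISO-8859-1 bytes (a lone 0xA9 for the
# copyright sign points that way). Check the Content-Type header to be sure.
sub latin1_to_utf8_bytes {
    my ($bytes) = @_;
    my $chars = decode('ISO-8859-1', $bytes);   # bytes in -> Perl characters
    return encode('UTF-8', $chars);             # characters -> UTF-8 bytes out
}

# Hypothetical helper using full LWP: decoded_content decodes the body
# according to the charset the response declares.
sub fetch_utf8_bytes {
    my ($url) = @_;
    require LWP::UserAgent;                      # loaded only if this is called
    my $response = LWP::UserAgent->new->get($url);
    die 'Request failed: ', $response->status_line, "\n"
        unless $response->is_success;
    my $type = $response->header('Content-Type');
    warn 'Content-Type: ', (defined $type ? $type : '(none)'), "\n";
    my $chars = $response->decoded_content;
    die "Could not decode the response body\n" unless defined $chars;
    return encode('UTF-8', $chars);
}

# Offline check of the conversion step on the byte from the original report.
my $utf8 = latin1_to_utf8_bytes("\xA9");
print join(' ', map { sprintf '%02X', ord($_) } split //, $utf8), "\n";   # prints "C2 A9"
```

Whether new_from_xml wants UTF-8 bytes or decoded characters is worth checking against the MARC::File::XML docs before wiring this in.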