At 00:27 +0100 18/6/10, I wrote:
>If I save the file and undo the second decoding I get the proper output.
In this case, all talk of iso-8859-1 and cp1252 is a red herring. I read several Italian websites where the same problem shows up in external material such as ads. The news page itself is correctly encoded and declared as utf-8, but I imagine the web designers have reckoned that the material they receive from advertisers is most likely windows-1252 and convert it accordingly rather than bother to verify the encoding. As a result, material that actually arrives as utf-8 undergoes a superfluous second conversion and comes out garbled.
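To make the failure mode concrete, here is a small sketch of that superfluous conversion and how to reverse it. The string "caffè" is my own example, not from any of the sites in question; the point is only the round trip through Encode:

use strict;
use warnings;
use Encode qw(decode encode);

# A character string with one non-ASCII character (U+00E8).
my $text  = "caff\x{e8}";
my $bytes = encode("UTF-8", $text);          # correct UTF-8 bytes

# The designers' mistake: assume the bytes are windows-1252
# and convert them to utf-8 a second time.
my $mangled = encode("UTF-8", decode("cp1252", $bytes));

# The cure: undo the superfluous step, then decode once.
my $repaired = decode("UTF-8",
                      encode("cp1252", decode("UTF-8", $mangled)));

print $repaired eq $text ? "round trip ok\n" : "still mangled\n";

Run as-is, this prints "round trip ok": the mangled bytes spell out the familiar "caffÃ¨" mojibake, and reversing the cp1252 step recovers the original text.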
Here's a way to get the file in question properly encoded:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);
use Encode qw(find_encoding);

# Encode the output stream so printing decoded text does not
# trigger "wide character" warnings (instead of suppressing them
# with "no warnings").
binmode STDOUT, ':encoding(UTF-8)';

my $tempdir  = "/tmp";
my $tempfile = "tempfile";
my $f        = "$tempdir/$tempfile";

my $uri = "http://pipes.yahoo.com/pipes/pipe.run"
        . "?Size=Medium&_id=f53b7bed8b88412fab9715a995629722"
        . "&_render=rss&max=50&nsid=1025993%40N22";

# getstore returns an HTTP status code, so test it with
# is_success rather than treating the code itself as a boolean.
if (is_success(getstore($uri, $f))) {
    my $encoding = find_encoding("utf-8");
    open my $fh, '<', $f or die $!;
    while (<$fh>) {
        print $encoding->decode($_);
    }
    close $fh;
}
unlink $f;

JD