Hi,
I am trying to use WWW::Mechanize against a site that is UTF-8 encoded.
It seems to work OK, but I get warnings about UTF-8 on STDERR. If
I use LWP::Simple to download the page, the same message appears, so I
assume that the problem is in LWP.
A simple test program that shows the behaviour:
#!/usr/bin/perl -w
use strict;
use LWP::Simple qw/get/;
my $data = get( "https://www.trangselskatt.vv.se/cts/open/loginPrompt.do" );
This program prints the following message to STDERR:
Parsing of undecoded UTF-8 will give garbage when decoding entities at
/usr/share/perl5/LWP/Protocol.pm line 114.
Using WWW::Mechanize to access the site gives the same message,
reported from several different source files.
The site works just fine in Firefox and it includes a
charset-specification in the Content-Type:
$ wget -S "https://www.trangselskatt.vv.se/cts/open/loginPrompt.do"
--19:29:39-- https://www.trangselskatt.vv.se/cts/open/loginPrompt.do
=> `loginPrompt.do.1'
Resolving www.trangselskatt.vv.se... 129.35.37.5
Connecting to www.trangselskatt.vv.se|129.35.37.5|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Tue, 03 Jan 2006 18:29:26 GMT
Server: IBM_HTTP_Server
Connection: close
Content-Type: text/html; charset=UTF-8
Content-Language: sv-SE
Length: unspecified [text/html]
[ <=> ] 5,302 --.--K/s
19:29:40 (82.63 KB/s) - `loginPrompt.do.1' saved [5302]
How can I avoid this message, both in LWP and in WWW::Mechanize? I
am running Perl 5.8.7, LWP 5.803 and WWW::Mechanize 1.12 on Debian testing.
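As a stopgap, I could probably filter this one message with a __WARN__
handler. This is only a minimal sketch: it hides the message rather than
fixing whatever causes it, it assumes the message always starts with the
text quoted above, and both helper names (is_utf8_entity_warning and
without_utf8_warning) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $msg is the UTF-8 message quoted above.
# Assumption: the message always begins with exactly this text.
sub is_utf8_entity_warning {
    my ($msg) = @_;
    return $msg =~ /^Parsing of undecoded UTF-8 will give garbage/ ? 1 : 0;
}

# Hypothetical helper: run $code with that one message suppressed.
# All other warnings are passed on to STDERR unchanged.
sub without_utf8_warning {
    my ($code) = @_;
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        # The __WARN__ hook is disabled while the handler runs, so this
        # warn goes straight to STDERR instead of recursing.
        warn $msg unless is_utf8_entity_warning($msg);
    };
    return $code->();
}
```

With LWP::Simple loaded as in the test program, the call would then look
like: my $data = without_utf8_warning( sub { get( $url ) } );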
/Mattias