On Thu, 16 Aug 2012, Pawel Krol wrote:

Some websites, for example http://www.pbb-planungsbüro-bartsch.de,
contain special German characters (known as "umlauts") in their
URLs.

Your problem is not with libwww, but with perl and its requirement to
say

    use utf8;

if you're using strings like

    my $url = q{http://www.pbb-planungsbüro-bartsch.de};

that contain utf8-encoded umlauts.

Try it.
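
A minimal sketch of the effect, without touching the network. It
assumes a URI release that supports internationalized domain names
(the one that provides the ihost method); LWP builds a URI object from
the string passed to get(), so the same conversion should apply there.
The exact "xn--" label is not shown because it depends on the
conversion, and the commented-out head() call is only a suggestion for
the validity check.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # string literals in this file are UTF-8 text, not raw bytes
use URI;

binmode STDOUT, ':encoding(UTF-8)';

my $url = q{http://www.pbb-planungsbüro-bartsch.de};

# With "use utf8" the umlaut is a single character, so URI can convert
# the host name to its ASCII ("xn--...") form.
my $uri = URI->new($url);

print "ASCII host:   ", $uri->host,  "\n";   # the name that goes to DNS
print "Unicode host: ", $uri->ihost, "\n";   # the name as typed

# With network access, the same string can be handed to LWP
# (assumption: a HEAD request is enough to check that the URL works):
#   my $response = LWP::UserAgent->new->head($url);
#   print $response->status_line, "\n";
```

Without "use utf8" the ü in the literal is two separate bytes, so the
host name that gets looked up is not the one you typed.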

--
-- Mike

Mike Schilli
m...@perlmeister.com


I have been trying, without success, to retrieve the contents of such
websites using Perl (the purpose is basically to check whether the URL
is valid or invalid; maybe there's a simpler way to do it?).

Here's a code snippet that you can try out immediately:

#!/opt/local/bin/perl

use Data::Dumper;
use LWP::UserAgent;

my $url = q{http://www.pbb-planungsbüro-bartsch.de};
# my $url = q{http://www.pbb-planungsb%C3%BCro-bartsch.de/};

my $ua = LWP::UserAgent->new;
my $response = $ua->get($url);

warn Dumper $response;

__END__

Well, it doesn't work. It gives me a "500 Bad hostname" response
regardless of whether the URL is escaped or not.

Question is... Is it possible to retrieve it at all? Are there any
limitations? Workarounds?

If you could kindly help me resolve this problem, it would be
very much appreciated.

Many thanks for your help!

Kind regards,
Paweł Król.
