As you've discovered, the IRI support doesn't change anything about how
filenames are saved; it only translates between IRIs and URIs (which,
since there are no IRIs involved here, doesn't affect anything).
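
To illustrate what that translation amounts to, here is a rough Python
sketch. It is not wget's code: `iri_to_uri` is a hypothetical helper, and
it assumes an ASCII host name so that no IDNA handling is needed. It
percent-encodes non-ASCII path characters as UTF-8 and leaves existing
escapes alone, so a URL that is already a plain URI passes through
unchanged.

```python
from urllib.parse import quote, urlsplit, urlunsplit

URI = ("http://lurkmore.ru/images/8/89/"
       "%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg")


def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: percent-encode non-ASCII path characters as UTF-8.

    Assumes the host name is already ASCII (no IDNA conversion is attempted).
    """
    parts = urlsplit(iri)
    # Keep '/' and '%' as-is so existing %XX escapes are not encoded twice.
    path = quote(parts.path, safe="/%", encoding="utf-8")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# An IRI with literal Cyrillic characters maps onto the escaped URI ...
from_iri = iri_to_uri("http://lurkmore.ru/images/8/89/Этьен_Дюмон.jpeg")
# ... while an already-escaped URI comes back untouched.
from_uri = iri_to_uri(URI)
```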

As a workaround until filename transcoding is supported in wget, you may
find that --restrict-file-names=nocontrol does what you need it to -
provided the encoding of the characters in the URL and the encoding for
your system match.
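
For what it's worth, the mangled name in your log looks consistent with
bytes in the control ranges 0-31 and 128-159 being escaped by default when
the local file name is built. A rough Python sketch of that behaviour,
under that assumption (`saved_name` is a hypothetical helper, not wget's
actual code):

```python
from urllib.parse import unquote_to_bytes

URL_NAME = "%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg"


def saved_name(escaped: str, nocontrol: bool) -> bytes:
    """Hypothetical model of how the local file name bytes are chosen."""
    raw = unquote_to_bytes(escaped)  # undo the %XX escapes from the URL
    if nocontrol:
        return raw  # keep every byte as it arrived
    out = bytearray()
    for b in raw:
        # Assumed default: re-escape bytes in the ranges 0-31 and 128-159.
        if b < 0x20 or 0x80 <= b <= 0x9F:
            out += b"%%%02X" % b
        else:
            out.append(b)
    return bytes(out)


default_name = saved_name(URL_NAME, nocontrol=False)
nocontrol_name = saved_name(URL_NAME, nocontrol=True)
```

With nocontrol the bytes written are the UTF-8 encoding of
Этьен_Дюмон.jpeg, which will only display correctly if your system's file
name encoding is also UTF-8.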

-mjc

(05/24/2011 01:23 AM), kns wrote:
> Hello.
> 
> We have:
> 
> utf-8 urlencoded link: 
> http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> 
> wget on cygwin:
> $ wget --version
> GNU Wget 1.12 built on cygwin.
> 
> +digest +ipv6 +nls +ntlm +opie +md5/openssl +https -gnutls +openssl
> +iri
> 
> ---------
> 
> $ wget -o ./w.log --local-encoding=utf-8 --remote-encoding=utf-8 
> http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> 
> $ cat w.log
> --2011-05-24 12:19:39--  http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> Resolving lurkmore.ru (lurkmore.ru)... 174.122.234.203
> Connecting to lurkmore.ru (lurkmore.ru)|174.122.234.203|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 39532 (39K) [image/jpeg]
> Saving to: `Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg'
> 
>      0K .......... .......... .......... ........             100% 45.1K=0.9s
> 
> 2011-05-24 12:19:41 (45.1 KB/s) - `Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg' saved [39532/39532]
> 
> --------
> Wget writes "Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg" 
> (Э%82%8Cен_%94%8Eмон.jpeg) instead of "Этьен_Дюмон.jpeg"
> 
> 
> Debian version without iri support does the same.


-- 
Micah J. Cowan
http://micah.cowan.name/
