Re: [Bug-wget] bad filenames (again)

Andries E. Brouwer Tue, 18 Aug 2015 10:52:38 -0700

On Tue, Aug 18, 2015 at 07:43:05PM +0300, Eli Zaretskii wrote:

> > > If we convert the file names using iconv, Windows users will also be
> > > happier, at least when the remote URL can be encoded in their system
> > > codepage.
> > 
> > Windows does not differ from Unix - since the remote character set
> > is unknown and not necessarily constant, a conversion is impossible.
> 
> Windows does differ from Unix, in that arbitrary byte sequences cannot
> be used in file names.


Of course. The code already tries to take care of that.

> See
>
>   https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx
>
> for the gory details.

Thanks for the reference!

> > I already indicated the 1-line change that fixes the Windows problems.
> 
> It doesn't, unfortunately.

You are too brief. What is wrong with the change that changes
    /* insert some test for Windows */
into
    return true;
?

That change only affects what wget does with bytes in the 128-159 range,
and reading the gory details I fail to see any problem. Almost the opposite:
"Use any character in the current code page for a name, including Unicode
characters and characters in the extended character set (128–255)"
At first sight, if there were a problem it would be because of the clause
"Any other character that the target file system does not allow".

Thanks to your reference I now feel confident to make that 1-line change
so that also Windows users are happy.

Andries


(There are restrictions involving filenames that wget perhaps does not enforce:
no LPT3, no final space or period, ... It might be useful to teach wget about
such details.)
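
To make the parenthetical above concrete, here is a rough sketch of what
such a check could look like. It is not wget code: the function names
windows_name_is_problematic and ascii_prefix_equal are made up for this
illustration, and the rules are only my reading of the MSDN page (reserved
device names CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9, with or without an
extension, and no final space or period). It looks at a single path
component and ignores the forbidden punctuation characters, which the
existing code already tries to handle.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare the first N bytes of S, ignoring ASCII
   case, against the upper-case string UPPER.  The caller guarantees
   that S has at least N bytes.  */
static bool
ascii_prefix_equal (const char *s, const char *upper, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (toupper ((unsigned char) s[i]) != upper[i])
      return false;
  return true;
}

/* Hypothetical check, not part of wget: return true if NAME (one path
   component) runs into the Windows naming restrictions mentioned
   above.  */
static bool
windows_name_is_problematic (const char *name)
{
  static const char *const reserved[] = { "CON", "PRN", "AUX", "NUL" };
  size_t len = strlen (name);
  size_t base_len, i;

  if (len == 0)
    return true;

  /* No final space or period.  */
  if (name[len - 1] == ' ' || name[len - 1] == '.')
    return true;

  /* Device names are reserved with or without an extension, so only
     the part before the first period is compared.  */
  base_len = strcspn (name, ".");

  if (base_len == 3)
    for (i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
      if (ascii_prefix_equal (name, reserved[i], 3))
        return true;

  /* COM1..COM9 and LPT1..LPT9.  */
  if (base_len == 4 && name[3] >= '1' && name[3] <= '9'
      && (ascii_prefix_equal (name, "COM", 3)
          || ascii_prefix_equal (name, "LPT", 3)))
    return true;

  return false;
}
```

With this sketch, "LPT3", "lpt3.txt" and "file." would be flagged, while
"index.html" and "LPT30" would pass. What wget should then do with a
flagged name (escape it, append something, refuse) is a separate question.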
