On Friday 07 August 2015 16:38:01 Andries E. Brouwer wrote:
> On Fri, Aug 07, 2015 at 04:14:45PM +0200, Tim Ruehsen wrote:
> > Hi Andries,
> > 
> > as I already mentioned, changing the default behavior of wget is not a
> > good idea.
> > 
> > But I started a wget2 branch that produces wget and wget2 executables.
> > wget2's default behavior is to keep filenames as they are.
> > 
> > I am not sure how it compiles and works on Windows (Cygwin could work).
> > If you dare to check it out: any feedback is highly welcome.
> > 
> > Regards, Tim
> 
> Hi Tim,
> 
> I disagree. This is just a bug.
> Nobody wants illegal filenames.
> Even removing them is not entirely trivial since the filenames
> produced by wget are not legal character sequences, so cannot be typed.

Hi Andries,

obviously I got it wrong.

If it's a bug, let's just fix it (without breaking compatibility).

I don't have the time to read *all* the old emails right now.
But as far as I understand, escaping occurs within legal UTF-8 sequences, and
you are right that this is a bug when we have a UTF-8 locale.

The solution would be something like

if locale is UTF-8
  do not escape valid UTF-8 sequences
else
  keep wget's current behavior

If URLs (and thus filenames) are not in UTF-8, Wget will convert them to UTF-8
before the above procedure (I guess that is what wget does anyway, though I am
not 100% sure).
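
To make the idea concrete, here is a minimal sketch in Python (not wget's C
code, and not a patch). All function names are made up for illustration.
needs_escape() is only a simplified stand-in for wget's current rule, modelled
on the documented --restrict-file-names=unix behavior ('/' and control bytes in
the ranges 0-31 and 128-159). Falling back to the byte-wise rule when the name
is not valid UTF-8 is my assumption.

```python
import codecs


def is_utf8_charset(charset: str) -> bool:
    """Return True if the locale charset name denotes UTF-8."""
    try:
        return codecs.lookup(charset).name == "utf-8"
    except LookupError:
        return False


def needs_escape(byte: int) -> bool:
    """Simplified stand-in for the current rule: '/' and control bytes."""
    return byte == 0x2F or byte < 0x20 or 0x80 <= byte <= 0x9F


def to_utf8(name: bytes, source_charset: str) -> bytes:
    """Convert a filename from its source charset to UTF-8."""
    return name.decode(source_charset).encode("utf-8")


def escape_filename(name: bytes, locale_charset: str) -> bytes:
    """Percent-escape unsafe bytes in a filename.

    In a UTF-8 locale, valid multi-byte UTF-8 sequences are left intact.
    Otherwise (or if the name is not valid UTF-8) every byte is checked
    on its own, which mirrors the current behavior.
    """
    if is_utf8_charset(locale_charset):
        try:
            text = name.decode("utf-8")
        except UnicodeDecodeError:
            text = None  # not valid UTF-8: use the byte-wise rule below
        if text is not None:
            out = bytearray()
            for ch in text:
                encoded = ch.encode("utf-8")
                # Only single-byte (ASCII) characters can be unsafe here.
                if len(encoded) == 1 and needs_escape(encoded[0]):
                    out += b"%%%02X" % encoded[0]
                else:
                    out += encoded
            return bytes(out)
    # Non-UTF-8 locale or invalid input: keep the byte-wise behavior.
    return b"".join(
        b"%%%02X" % b if needs_escape(b) else bytes([b]) for b in name
    )
```

With this sketch, a name containing U+00D6 (bytes C3 96 in UTF-8) stays
untouched in a UTF-8 locale, while the byte-wise rule escapes the 0x96
continuation byte and so breaks the sequence.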

Would you agree?

If you provide a patch for this, we would appreciate it.

> I am a Linux man, no Windows computers here. So, I am happy to do
> stuff on Linux, but cannot test on Windows.

Sorry, won't bother you again regarding Windows ;-)

Tim

