Re: [Bug-wget] bad filename

2014-04-26 Thread Andries E. Brouwer
On Fri, Apr 25, 2014 at 09:39:55PM +0300, Bykov Aleksey wrote: Greetings, Andries E. Brouwer - the patch is inside #ifdef WINDOWS ... #endif while the problem occurs on all systems, also on Unix. Yes, it is. - Presently, 0-31 and 127-159 are considered control. Sorry, I prefer

Re: [Bug-wget] bad filename

2014-04-25 Thread Bykov Aleksey
- only Windows-related stuff. Best regards, Bykov Aleksey. From: andries.brou...@cwi.nl To: gnfa...@rambler.ru Date: 17:28:10, 04.23.2014 Subject: Re: [Bug-wget] bad filename. On Wed, Apr 23, 2014 at 04:57:11PM +0300, Bykov

Re: [Bug-wget] bad filename

2014-04-24 Thread Tim Ruehsen
On Thursday 24 April 2014 12:21:54 Andries E. Brouwer wrote: I couldn't read that in your post before (I still can't). If Wget puts illegal characters into filenames, that is a bug and has to be fixed. Then let me clarify this point. Sorry for the length. Andries, first off, thanks for your

Re: [Bug-wget] bad filename

2014-04-24 Thread Tim Rühsen
On Thursday, 24 April 2014, 20:00:18, Andries E. Brouwer wrote: On Thu, Apr 24, 2014 at 03:43:40PM +0200, Tim Ruehsen wrote: 1. How do you know what filesystem you are writing to? I just think of these FAT32 USB sticks flying around everywhere. UTF-8 might be a problem (see

Re: [Bug-wget] bad filename

2014-04-24 Thread Andries E. Brouwer
On Thu, Apr 24, 2014 at 03:43:40PM +0200, Tim Ruehsen wrote: 1. How do you know what filesystem you are writing to? I just think of these FAT32 USB sticks flying around everywhere. UTF-8 might be a problem (see http://en.wikipedia.org/wiki/Comparison_of_file_systems). I just mention

Re: [Bug-wget] bad filename

2014-04-24 Thread Tim Ruehsen
On Wednesday 23 April 2014 15:32:47 Andries E. Brouwer wrote: On Wed, Apr 23, 2014 at 02:43:21PM +0200, Tim Ruehsen wrote: Wget has a serious problem. It creates by default illegal filenames. I couldn't read that in your post before (I still can't). If Wget puts illegal characters into

Re: [Bug-wget] bad filename

2014-04-24 Thread Andries E. Brouwer
On Thu, Apr 24, 2014 at 09:56:15AM +0200, Tim Ruehsen wrote: On Wednesday 23 April 2014 15:32:47 Andries E. Brouwer wrote: On Wed, Apr 23, 2014 at 02:43:21PM +0200, Tim Ruehsen wrote: Wget has a serious problem. It creates by default illegal filenames. I couldn't read that in your post

Re: [Bug-wget] bad filename

2014-04-23 Thread Darshit Shah
On Tue, Apr 22, 2014 at 10:57 PM, Andries E. Brouwer andries.brou...@cwi.nl wrote: If I ask wget to download the wikipedia page http://he.wikipedia.org/wiki/ש._שפרה then I hope for a resulting file ש._שפרה. Instead, wget gives me ש._שפר\327%94, where the \327 is an unpronounceable byte that

Re: [Bug-wget] bad filename

2014-04-23 Thread Andries E. Brouwer
On Wed, Apr 23, 2014 at 12:59:43PM +0200, Darshit Shah wrote: On Tue, Apr 22, 2014 at 10:57 PM, Andries E. Brouwer wrote: If I ask wget to download the wikipedia page http://he.wikipedia.org/wiki/ש._שפרה then I hope for a resulting file ש._שפרה. Instead, wget gives me ש._שפר\327%94, where

Re: [Bug-wget] bad filename

2014-04-23 Thread Tim Ruehsen
On Wednesday 23 April 2014 13:57:15 Andries E. Brouwer wrote: On Wed, Apr 23, 2014 at 12:59:43PM +0200, Darshit Shah wrote: On Tue, Apr 22, 2014 at 10:57 PM, Andries E. Brouwer wrote: If I ask wget to download the wikipedia page http://he.wikipedia.org/wiki/ש._שפרה then I hope for a

Re: [Bug-wget] bad filename

2014-04-23 Thread Andries E. Brouwer
On Wed, Apr 23, 2014 at 02:43:21PM +0200, Tim Ruehsen wrote: Can you live with that answer? No. Wget has a serious problem. It creates by default illegal filenames. Wget should be fixed. Of course I fixed my private source, but as UTF-8 filenames are getting more and more common, this problem

Re: [Bug-wget] bad filename

2014-04-23 Thread Bykov Aleksey
On Tue, Apr 22, 2014 at 10:57 PM, Andries E. Brouwer andries.brou...@cwi.nl wrote: If I ask wget to download the wikipedia page http://he.wikipedia.org/wiki/ש._שפרה then I hope for a resulting file ש._שפרה. Instead, wget gives me ש

Re: [Bug-wget] bad filename

2014-04-23 Thread Andries E. Brouwer
On Wed, Apr 23, 2014 at 04:57:11PM +0300, Bykov Aleksey wrote: Greetings, Darshit Shah This was discussed some (or long) time ago. Possible logic: If locale isn't UTF-8 then process as before else 1. Convert string to WideCharString with mbstowcs(). 2. For each WideChar check its size with

Re: [Bug-wget] bad filename

2014-04-23 Thread Ángel González
On 23/04/14 15:57, Bykov Aleksey wrote: Greetings, Darshit Shah This was discussed some (or long) time ago. Possible logic: If locale isn't UTF-8 then process as before else 1. Convert string to WideCharString with mbstowcs(). 2. For each WideChar check its size with wctomb(). If size is 1 then
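
The proposed logic quoted above can be sketched roughly as follows. This is a Python analogue, not wget's C code: bytes.decode() stands in for mbstowcs(), and the length of each character's UTF-8 encoding stands in for the return value of wctomb(). The snippet is cut off after "If size is 1 then", so what happens next is an assumption made here for illustration: single-byte characters go through the old byte-wise control check, and multi-byte characters are kept unchanged. The function names and the locale_is_utf8 parameter are hypothetical.

```python
def is_control(byte: int) -> bool:
    # Range quoted in this thread: 0-31 and 127-159 count as control.
    return byte <= 31 or 127 <= byte <= 159


def escape_bytewise(raw: bytes) -> bytes:
    # "Process as before": percent-escape every byte in the control range.
    out = bytearray()
    for b in raw:
        out += b"%%%02X" % b if is_control(b) else bytes([b])
    return bytes(out)


def escape_utf8_aware(raw: bytes, locale_is_utf8: bool) -> bytes:
    if not locale_is_utf8:
        return escape_bytewise(raw)
    try:
        text = raw.decode("utf-8")  # analogue of mbstowcs()
    except UnicodeDecodeError:
        # Assumption: fall back to the old behaviour on invalid input.
        return escape_bytewise(raw)
    out = bytearray()
    for ch in text:
        encoded = ch.encode("utf-8")  # its length is the analogue of wctomb()
        if len(encoded) == 1:
            # Assumed continuation of the truncated step 2.
            out += escape_bytewise(encoded)
        else:
            out += encoded  # multi-byte character: keep as-is
    return bytes(out)
```

Under these assumptions the Hebrew name from the original report passes through unchanged in a UTF-8 locale, while ASCII control characters are still escaped.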

[Bug-wget] bad filename

2014-04-22 Thread Andries E. Brouwer
If I ask wget to download the wikipedia page http://he.wikipedia.org/wiki/ש._שפרה then I hope for a resulting file ש._שפרה. Instead, wget gives me ש._שפר\327%94, where the \327 is an unpronounceable byte that cannot be typed (This is a UTF-8 system and the filename that wget produces is not
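
One plausible account of the reported name, as a minimal sketch rather than wget's actual code: assume the file name is built by percent-decoding the URL path and then re-escaping every byte in the control range quoted elsewhere in this thread (0-31 and 127-159). The last letter, ה, is the UTF-8 byte pair D7 94. Byte 0x94 falls in that range and gets escaped, while 0xD7 (octal \327) is left alone, which splits the character. The function names below are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def is_control(byte: int) -> bool:
    # Range quoted in this thread: 0-31 and 127-159 count as control.
    return byte <= 31 or 127 <= byte <= 159


def escape_controls(raw: bytes) -> bytes:
    # Percent-escape each "control" byte, with no regard for UTF-8 sequences.
    out = bytearray()
    for b in raw:
        out += b"%%%02X" % b if is_control(b) else bytes([b])
    return bytes(out)


def is_valid_utf8(raw: bytes) -> bool:
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Percent-encoded form of the path component from the report.
url_path = "%D7%A9._%D7%A9%D7%A4%D7%A8%D7%94"
decoded = unquote_to_bytes(url_path)
name = escape_controls(decoded)

print(name)                 # ends with the lone byte 0xD7 followed by "%94"
print(is_valid_utf8(name))  # the split sequence is no longer valid UTF-8
```

The other letters in the name have second bytes A9, A4 and A8, which lie above 159, so only the final character is damaged.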