Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-10-02 Thread Ángel González
On 24/09/13 10:38, Tim Ruehsen wrote: Just for completeness: these guessing steps, called the "encoding sniffing algorithm", are described in 12.2.2.2. But only "In some cases, it might be impractical to unambiguously determine the encoding before parsing the document." Yes, it allows to start parsing

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-24 Thread Tim Ruehsen
On Monday 23 September 2013 23:32:39 Ángel González wrote: On 17/09/13 09:49, Tim Ruehsen wrote: On Tuesday 17 September 2013 00:17:21 Ángel González wrote: [1] http://nikitathespider.com/articles/EncodingDivination.html Note that these steps are outdated now (that was written at most at

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-23 Thread Ángel González
On 17/09/13 09:49, Tim Ruehsen wrote: On Tuesday 17 September 2013 00:17:21 Ángel González wrote: [1] http://nikitathespider.com/articles/EncodingDivination.html Note that these steps are outdated now (that was written at most at 2008). Outdated by exactly what? RFC 3986 is from 2005 and does

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-17 Thread Tim Ruehsen
On Tuesday 17 September 2013 00:17:21 Ángel González wrote: On 16/09/13 12:50, Tim Ruehsen wrote: Just to have it mentioned: Your download (wget -r http://bmit.se/wget) succeeds, but it shouldn't! IMHO, Wget has a bug here and just because of this bug your test case succeeds. Why?

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-16 Thread Tony Lewis
Tim Ruehsen wrote: Wget should have taken the URL 'teståäöÅÄÖ' as ISO-8859-1 and converted it into UTF-8, which would fail to download. Neither Firefox nor Internet Explorer can navigate that link. Both fail trying to retrieve teståäöÅÄÖ. I concur with Tim that this behavior of wget is
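
The conversion described above can be illustrated with a minimal Python sketch. It is not wget's code; the variable names are made up for illustration, and it only assumes that the page stores the link as UTF-8 bytes while the client treats those bytes as ISO-8859-1 before converting to UTF-8 and percent-encoding.

```python
from urllib.parse import quote

name = "teståäöÅÄÖ"

# The bytes as they would appear in a UTF-8 encoded index.html.
utf8_bytes = name.encode("utf-8")

# Case 1: the bytes are recognised as UTF-8 and percent-encoded as they are.
correct = quote(utf8_bytes)

# Case 2: the same bytes are misread as ISO-8859-1, then converted to UTF-8.
# Every non-ASCII byte becomes a character of its own, so it is encoded twice.
misread = utf8_bytes.decode("iso-8859-1")
double_encoded = quote(misread.encode("utf-8"))

print("as UTF-8:     ", correct)
print("as ISO-8859-1:", double_encoded)
```

The second request path names a different resource from the first, which is why a download following that conversion would be expected to fail.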

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-16 Thread Ángel González
On 16/09/13 12:50, Tim Ruehsen wrote: Just to have it mentioned: Your download (wget -r http://bmit.se/wget) succeeds, but it shouldn't! IMHO, Wget has a bug here and just because of this bug your test case succeeds. Why? Your wget/index.html holds the UTF-8 encoded URL 'teståäöÅÄÖ', but

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-15 Thread Bykov Aleksey
Greetings. Thanks for the correction. Sorry for the unclean code and the trouble. - Make wget recognise utf-8 urls and accept them without nocontrol when the filesystem encoding is utf-8. Are you sure? A UTF-8 name can contain a colon (I remember seeing such files). And at least in Windows colon

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-14 Thread Bykov Aleksey
Greetings. Many thanks for pushing me in the right direction. With the attached patch, Wget on Windows can work with UTF-8 names, but again only with --restrict-file-names=nocontrol... Windows needs a conversion for all work with wide chars. MultiByteToWideChar() was chosen because it allows to force set

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-14 Thread Ángel González
On 15/09/13 00:59, Bykov Aleksey wrote: Greetings. Many thanks for pushing me in the right direction. With the attached patch, Wget on Windows can work with UTF-8 names, but again only with --restrict-file-names=nocontrol... I think there are two issues: - Make wget recognise utf-8 urls and accept

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-13 Thread Tim Ruehsen
Wasn't that problem always there? Looks like bug 37564 [1], you can work around it with --restrict-file-names=nocontrol You may find some more information in the list archives. 1- https://savannah.gnu.org/bugs/index.php?37564 Please excuse me for my confusion. In my first tests I didn't
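
A possible reading of why the nocontrol workaround matters for capital ÅÄÖ but not for åäö is sketched below in Python. It assumes, which the messages here do not state outright, that the default file-name restriction treats bytes in the range 0x80-0x9F as control characters and escapes them. The helper name c1_bytes is hypothetical.

```python
def c1_bytes(ch):
    """Return the UTF-8 bytes of ch that fall in the range 0x80-0x9F."""
    return [b for b in ch.encode("utf-8") if 0x80 <= b <= 0x9F]

for ch in "åäöÅÄÖ":
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02X}" for b in encoded)
    flagged = " ".join(f"{b:02X}" for b in c1_bytes(ch)) or "none"
    print(f"{ch}: UTF-8 bytes {hex_bytes}, bytes in 0x80-0x9F: {flagged}")
```

Under that assumption, only the capital letters contain a byte that would be escaped (85, 84 and 96), which matches the reported symptom.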

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-13 Thread Björn Mattsson
On 2013-09-13 09:42, Tim Ruehsen wrote: On Thursday 12 September 2013 21:34:01 Björn Mattsson wrote: On 2013-09-12 21:21, Tim Rühsen wrote: On Thursday, 12 September 2013, 12:59:00, Björn Mattsson wrote: I ran into a bug in wget last week. I have done some digging but can't solve it by myself.

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-13 Thread Bykov Aleksey
Greetings. Yes, you show the correct Cyrillic filename. Sorry, I don't agree that this bug is ready to be closed. Your method is mentioned in it. This bug is about filenames in non-UTF-8 locales. Main quote: If you are using a unix-like OS where the filesystem interface uses utf-8, there is a workaround of

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-13 Thread Tim Ruehsen
On Friday 13 September 2013 12:43:53 Bykov Aleksey wrote: Greetings. Yes, you show the correct Cyrillic filename. Sorry, I don't agree that this bug is ready to be closed. Your method is mentioned in it. This bug is about filenames in non-UTF-8 locales. Main quote: If you are using a unix-like OS

[Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Björn Mattsson
I ran into a bug in wget last week. I have done some digging but can't solve it by myself. If I try to wget a file containing capital ÅÄÖ, they get converted wrongly, while åäö works fine. I use wget -m to back up one of my websites to another machine. It has worked like a charm for the last 4-5

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Tim Rühsen
On Thursday, 12 September 2013, 12:59:00, Björn Mattsson wrote: I ran into a bug in wget last week. I have done some digging but can't solve it by myself. If I try to wget a file containing capital ÅÄÖ, they get converted wrongly, while åäö works fine. I use wget -m to back up one of my

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Björn Mattsson
On 2013-09-12 17:37, Tim Ruehsen wrote: On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote: I ran into a bug in wget last week. I have done some digging but can't solve it by myself. If I try to wget a file containing capital ÅÄÖ, they get converted wrongly, while åäö works fine. I use wget

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Tim Rühsen
On Thursday, 12 September 2013, 17:37:17, Tim Ruehsen wrote: On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote: I ran into a bug in wget last week. I have done some digging but can't solve it by myself. If I try to wget a file containing capital ÅÄÖ, they get converted wrongly,

Re: [Bug-wget] Problem with ÅÄÖ and wget

2013-09-12 Thread Ángel González
Tim Rühsen wrote: On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote: I ran into a bug in wget last week. I have done some digging but can't solve it by myself. If I try to wget a file containing capital ÅÄÖ, they get converted wrongly, while åäö works fine. I use wget -m to back up one of