On 24/09/13 10:38, Tim Ruehsen wrote:
Just for completeness: these guessing steps, called the encoding sniffing
algorithm, are described in 12.2.2.2.
But only "In some cases, it might be impractical to unambiguously determine
the encoding before parsing the document."
Yes, it allows to start parsing
On Monday 23 September 2013 23:32:39 Ángel González wrote:
On 17/09/13 09:49, Tim Ruehsen wrote:
On Tuesday 17 September 2013 00:17:21 Ángel González wrote:
[1] http://nikitathespider.com/articles/EncodingDivination.html
Note that these steps are outdated now (that was written at most at
On 17/09/13 09:49, Tim Ruehsen wrote:
On Tuesday 17 September 2013 00:17:21 Ángel González wrote:
[1] http://nikitathespider.com/articles/EncodingDivination.html
Note that these steps are outdated now (that was written in 2008 at the latest).
Outdated by exactly what? RFC 3986 is from 2005 and does
On Tuesday 17 September 2013 00:17:21 Ángel González wrote:
On 16/09/13 12:50, Tim Ruehsen wrote:
Just to have it mentioned:
Your download (wget -r http://bmit.se/wget) succeeds, but it shouldn't!
IMHO, Wget has a bug here, and your test case succeeds only because of
this bug.
Why?
Tim Ruehsen wrote:
Wget should have taken the URL 'teståäöÅÄÖ' as ISO-8859-1 and converted it
into UTF-8, which would fail to download.
Neither Firefox nor Internet Explorer can navigate that link. Both fail
trying to retrieve teståäöÃÃÃ.
I concur with Tim that this behavior of wget is
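The conversion discussed above can be illustrated with a minimal Python sketch. This is not Wget's code; it only assumes that the link bytes in index.html are UTF-8, get interpreted as ISO-8859-1, and are then converted to UTF-8 again before the request. The helper names are hypothetical.

```python
from urllib.parse import quote


def percent_encode_utf8(name: str) -> str:
    """Percent-encode the UTF-8 bytes of a URL path segment."""
    return quote(name.encode("utf-8"))


def misread_as_latin1(name: str) -> str:
    """Take the UTF-8 bytes of `name` and decode them as ISO-8859-1.

    This models a client that assumes the wrong document encoding.
    """
    return name.encode("utf-8").decode("iso-8859-1")


link = "teståäöÅÄÖ"

# What the server expects when the file name on disk is UTF-8.
correct = percent_encode_utf8(link)

# What a client requests after treating the UTF-8 bytes as ISO-8859-1
# and converting the result to UTF-8 once more (double encoding).
double = percent_encode_utf8(misread_as_latin1(link))

print(correct)
print(double)
```

The second request path is twice as long in its non-ASCII part and names a file that does not exist on the server, which is consistent with the download failing.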
On 16/09/13 12:50, Tim Ruehsen wrote:
Just to have it mentioned:
Your download (wget -r http://bmit.se/wget) succeeds, but it shouldn't!
IMHO, Wget has a bug here, and your test case succeeds only because of
this bug.
Why?
Your wget/index.html holds the UTF-8 encoded URL 'teståäöÅÄÖ', but
Greetings
Thanks for the correction.
Sorry for the unclean code and the trouble.
- Make wget recognise UTF-8 URLs and accept them without nocontrol when
the filesystem encoding is UTF-8.
Are you sure? A UTF-8 name can contain a colon (I remember seeing such
files). And at
least in Windows, a colon
Greetings
Many thanks for pushing me in the right direction.
With the attached patch, Wget on Windows can work with UTF-8 names. But again
only with --restrict-file-names=nocontrol...
Windows needs a conversion for all work with wide chars.
MultiByteToWideChar() was chosen because it allows one to force-set
On 15/09/13 00:59, Bykov Aleksey wrote:
Greetings
Many thanks for pushing me in the right direction.
With the attached patch, Wget on Windows can work with UTF-8 names. But
again only with --restrict-file-names=nocontrol...
I think there are two issues:
- Make wget recognise utf-8 urls and accept
Wasn't that problem always there?
Looks like bug 37564 [1], you can work around it with
--restrict-file-names=nocontrol
You may find some more information in the list archives.
[1] https://savannah.gnu.org/bugs/index.php?37564
Please excuse me for my confusion. In my first tests I didn't
On 2013-09-13 09:42, Tim Ruehsen wrote:
On Thursday 12 September 2013 21:34:01 Björn Mattsson wrote:
On 2013-09-12 21:21, Tim Rühsen wrote:
On Thursday, 12 September 2013, 12:59:00, Björn Mattsson wrote:
I ran into a bug in wget last week.
I did some digging but can't solve it by myself.
Greetings
Yes, you show the correct Cyrillic filename.
Sorry, I don't agree that this bug is ready to be closed.
Your method is mentioned in it.
This bug is about filenames in non-UTF-8 locales.
Main quote:
If you are using a unix-like OS where the filesystem interface uses
utf-8, there is a workaround of
On Friday 13 September 2013 12:43:53 Bykov Aleksey wrote:
Greetings
Yes, you show the correct Cyrillic filename.
Sorry, I don't agree that this bug is ready to be closed.
Your method is mentioned in it.
This bug is about filenames in non-UTF-8 locales.
Main quote:
If you are using a unix-like OS
I ran into a bug in wget last week.
I did some digging but can't solve it by myself.
If I try to wget a file whose name contains capital ÅÄÖ, they get converted
wrongly, while åäö works fine.
I use wget -m to back up one of my websites to another machine. It has
worked like a charm for the last 4-5
On Thursday, 12 September 2013, 12:59:00, Björn Mattsson wrote:
I ran into a bug in wget last week.
I did some digging but can't solve it by myself.
If I try to wget a file whose name contains capital ÅÄÖ, they get converted
wrongly, while åäö works fine.
I use wget -m to back up one of my
On 2013-09-12 17:37, Tim Ruehsen wrote:
On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote:
I ran into a bug in wget last week.
I did some digging but can't solve it by myself.
If I try to wget a file whose name contains capital ÅÄÖ, they get converted
wrongly, while åäö works fine.
I use wget
On Thursday, 12 September 2013, 17:37:17, Tim Ruehsen wrote:
On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote:
I ran into a bug in wget last week.
I did some digging but can't solve it by myself.
If I try to wget a file whose name contains capital ÅÄÖ, they get converted
wrongly,
Tim Rühsen wrote:
On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote:
I ran into a bug in wget last week.
I did some digging but can't solve it by myself.
If I try to wget a file whose name contains capital ÅÄÖ, they get converted
wrongly, while åäö works fine.
I use wget -m to back up one of
18 matches