On Thursday, 12 September 2013, 17:37:17, Tim Ruehsen wrote:
> On Thursday 12 September 2013 12:59:00 Björn Mattsson wrote:
> > I ran into a bug in wget last week.
> > I have done some digging but can't solve it by myself.
> >
> > If I try to wget a file containing capital ÅÄÖ, they get converted
> > wrongly, while åäö works fine.
> >
> > I use wget -m to back up one of my web sites to another machine. It has
> > worked like a charm for the last 4-5 years, but a couple of weeks ago one
> > of the files came down wrong. I thought a colleague had uploaded
> > something wrong, but after some digging it's wget that converts wrongly.
> >
> > I have UTF-8 as charset on my machine.
> >
> > If you want to test/see the problem
> >
> > wget -m http://bmit.se/wget
>
> A request to http://bmit.se/wget/ returns a text/html document without
> specifying the charset (AFAIR, the default is iso-8859-1).
> Either your server has to tag the response as utf-8 (Content-Type:
> text/html; charset=utf-8) or you have to specify utf-8 in your document
> header.
>
> Or you specify --remote-encoding=utf-8 when calling wget.
>
> Could you give it a try, maybe with -d to see what is going on?
Sorry, forget my answer.

Meanwhile I was able to run some tests in a UTF-8 environment, and yes, Wget 1.14 (the Debian package as well as current git) has the problem you described.

I am not sure whether we can change it without breaking backward compatibility.

Tim
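
One observation that may bear on why ÅÄÖ and åäö behave differently: in UTF-8 all six letters are two bytes long and start with 0xC3, but the second byte of each capital falls in 0x80-0x9F, which are C1 control codes when read as ISO-8859-1, while the second byte of each lower-case letter falls in the printable range. Whether this is what actually triggers the wrong conversion in Wget is an assumption and is not confirmed anywhere in this thread. A minimal Python sketch, with the helper name trailing_byte_is_c1_control made up for illustration, shows the byte values:

```python
# Illustration only: inspect the UTF-8 bytes of the letters from the report.
# The helper name is hypothetical and not part of Wget or any library.

def trailing_byte_is_c1_control(ch: str) -> bool:
    """Return True if the last UTF-8 byte of ch lies in 0x80-0x9F,
    the range ISO-8859-1 assigns to C1 control codes."""
    encoded = ch.encode("utf-8")
    return 0x80 <= encoded[-1] <= 0x9F


def describe(ch: str) -> str:
    """Format a character, its UTF-8 bytes in hex, and the C1 check result."""
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    return f"{ch}: {hex_bytes} c1_control={trailing_byte_is_c1_control(ch)}"


# Escapes are used so the result does not depend on the source file encoding.
CAPITALS = "\u00c5\u00c4\u00d6"    # ÅÄÖ
LOWERCASE = "\u00e5\u00e4\u00f6"   # åäö

if __name__ == "__main__":
    for letter in CAPITALS + LOWERCASE:
        print(describe(letter))
```

The three capitals report True and the three lower-case letters report False. If the assumption above holds and the affected letters appear in file names, Wget's --restrict-file-names=nocontrol option might be worth trying, but that has not been tested here.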
