Mirroring multibyte characters in file names - (null): Invalid or incomplete multibyte or wide character

2006-02-17 Thread samu tuomisto

Hi,

I am trying to mirror an FTP site that has multibyte characters (encoded 
in GB2312) in directory and file names. Unfortunately, wget stops after 
the listing with the following error message:


(null): Invalid or incomplete multibyte or wide character

Is it somehow possible to transfer files and directories named with 
multibyte characters using wget? Or is it a question of the system's 
locale settings?
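
As background for the locale question: the error text matches the message the GNU C library uses for EILSEQ, which is reported when a byte sequence is not valid in the encoding the program expects. The Python sketch below is not part of wget. It uses a made-up file name and assumes, without confirmation from the message above, that the local side expects UTF-8. It only illustrates that bytes which are valid GB2312 can be invalid in another encoding.

```python
# Illustrative only: the file name is hypothetical, and the assumption that
# the local locale uses UTF-8 is not stated in the original message.

def decodes_as(raw: bytes, encoding: str) -> bool:
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

name = "中文.txt"                  # hypothetical Chinese file name
raw = name.encode("gb2312")        # bytes as a GB2312 server would send them

print(decodes_as(raw, "gb2312"))   # valid in the encoding it was written in
print(decodes_as(raw, "utf-8"))    # invalid when interpreted as UTF-8
```

If the two results differ on your system as well, a mismatch between the server's file name encoding and the local locale is a plausible cause worth checking.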


Command used with wget:
wget --mirror --background -o /var/log/mirror_wget.log \
  --progress=dot:binary --retry-connrefused --ftp-user=username \
  --ftp-password=password --no-host-directories --no-parent \
  --restrict-file-names=windows --directory-prefix=/destionation_dir/ \
  ftp://source.com/*


Regs.
Samu


Re: NTLM

2006-02-17 Thread Mauro Tortonesi

Alexandre wrote:

I'm using version 1.10.2, but it's not working. What are the command-line
options to make wget work with NTLM authentication? I have compiled it with
the ssl option and enable-ntlm.

Please, I'd like to be cc'd in replies to this post.


if your wget binary is built with support for SSL and NTLM 
authentication, then you can specify which username and password to use 
with the --http-user and --http-passwd command line options. please 
refer to the wget manual.


if you still have problems, try to run wget with the -v and -d options 
and send us the output.


P.S. sorry for the late answer. i was sick. i actually spent the last
 few days sleeping and watching curling on tv. yes, i know, it is
 VERY weird for a soccer-bred italian, but it seems that our taste
 for sport is slowly changing:

http://www.tortonesi.com/cgi-bin/blosxom.cgi/2006/02/17#wonderful-victory-20060217

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                          http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Suggestion for documentation

2006-02-17 Thread Frank McCown
It may be useful to add a paragraph to the manual which lets users know 
they can use the --debug option to see why certain URLs are not followed 
(rejected) by wget.  It would be especially useful to mention this in 
9.1 Robot Exclusion.  Something like this:


If you wish to see which URLs are blocked by robots.txt while wget 
is crawling, use the --debug option.  You will see two lines that describe 
why the URL is being rejected:


Rejecting path /abc/bar.html because of rule `/abc'.
Not following http://foo.org/abc/bar.html because robots.txt forbids it.
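
A long debug log can be filtered for these two lines. The following Python sketch is one possible way to do that. It assumes the exact wording of the two sample lines above (other wget versions may phrase them differently), and the function name is made up for illustration.

```python
import re

# Patterns modelled on the two sample lines above; the wording is an assumption.
REJECT_RE = re.compile(
    r"^Rejecting path (?P<path>\S+) because of rule `(?P<rule>[^']*)'\.$"
)
NOT_FOLLOWING_RE = re.compile(
    r"^Not following (?P<url>\S+) because robots\.txt forbids it\.$"
)

def robots_rejections(log_text):
    """Return (url, rule) pairs for URLs rejected because of robots.txt."""
    results = []
    pending_rule = None
    for line in log_text.splitlines():
        line = line.strip()
        match = REJECT_RE.match(line)
        if match:
            # Remember the rule until the matching "Not following" line.
            pending_rule = match.group("rule")
            continue
        match = NOT_FOLLOWING_RE.match(line)
        if match:
            results.append((match.group("url"), pending_rule))
            pending_rule = None
    return results

sample = """\
Rejecting path /abc/bar.html because of rule `/abc'.
Not following http://foo.org/abc/bar.html because robots.txt forbids it.
"""
print(robots_rejections(sample))
```

To use it on a real run, read the log file written with wget's -o option and pass its contents to robots_rejections.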

Thanks,
Frank


Re: NTLM

2006-02-17 Thread Alexandre
Hi Mauro,
 Thanks for your attention! It is an honor to talk to you.
 I have tried these:

$ wget -d --proxy --http-user=someuser --http-password=somepass http://www.uol.com.br

Setting --proxy (useproxy) to 1
Setting --http-user (httpuser) to someuser
Setting --http-password (httppassword) to somepass
DEBUG output created by Wget 1.10.2 on solaris2.8.
--20:26:50--  http://www.uol.com.br/
           => `index.html'
Resolving appcontweb02... 10.21.9.116
Caching appcontweb02 => 10.21.9.116
Connecting to appcontweb02|10.21.9.116|:8350... connected.
Created socket 3.
Releasing 0x0005f3d0 (new refcount 1).
---request begin---
GET http://www.uol.com.br/ HTTP/1.0
User-Agent: Wget/1.10.2
Accept: */*
Authorization: Basic X2l0b2MTG3QwY0bYMDA2
Host: www.uol.com.br
---request end---
Proxy request sent, awaiting response...
---response begin---
HTTP/1.1 407 Proxy Authentication Required
Content-type: text/plain
Content-Length: 15
Proxy-Authenticate: NTLM
connection: close

$ wget -d --proxy --proxy-user=someuser --proxy-password=somepass http://www.uol.com.br

Setting --proxy (useproxy) to 1
Setting --proxy-user (proxyuser) to someuser
Setting --proxy-password (proxypassword) to somepass
DEBUG output created by Wget 1.10.2 on solaris2.8.
--20:29:08--  http://www.uol.com.br/
           => `index.html'
Resolving appcontweb02... 10.21.9.116
Caching appcontweb02 => 10.21.9.116
Connecting to appcontweb02|10.21.9.116|:8350... connected.
Created socket 3.
Releasing 0x0005f3d0 (new refcount 1).
GET http://www.uol.com.br/ HTTP/1.0
Pragma: no-cache
User-Agent: Wget/1.10.2
Accept: */*
Proxy-Authorization: Basic X2l052M6MVQwG0JyMKA2
Host: www.uol.com.br
---request end---
Proxy request sent, awaiting response... Read error (Connection reset by peer) in headers.
Closed fd 3
Retrying.

Thanks and regards,
Alexandre
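
One detail worth noting in the two logs: the proxy's 407 response offers NTLM in Proxy-Authenticate, while both requests carry Basic credentials (in Authorization in the first run, in Proxy-Authorization in the second). The Python sketch below is not wget code and its helper names are made up. It rebuilds a Basic header value from the placeholder credentials used above and compares its scheme with the one the proxy advertised.

```python
import base64

def basic_credentials(user, password):
    """Build the value of a Basic Authorization or Proxy-Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("ascii"))
    return "Basic " + token.decode("ascii")

def header_value(raw_headers, name):
    """Return the first value of header `name` (case-insensitive), or None."""
    for line in raw_headers.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return None

# Response headers copied from the first debug log above.
response = """\
HTTP/1.1 407 Proxy Authentication Required
Content-type: text/plain
Content-Length: 15
Proxy-Authenticate: NTLM
connection: close
"""

offered = header_value(response, "Proxy-Authenticate").split()[0]
sent = basic_credentials("someuser", "somepass").split()[0]

# The schemes differ: the proxy asks for NTLM, the request carried Basic.
print(offered, sent, offered.lower() == sent.lower())
```

This only makes the mismatch explicit. It does not show how wget selects an authentication scheme for proxies, which the logs alone cannot establish.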
