wget should only use one FTP connection

2001-02-26 Thread Jamie Zawinski

If I specify a whole bunch of ftp: URLs to wget that are on the same
host, it opens a new connection to the server for each one.  It should
reuse the same connection if they're on the same host.
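
A minimal sketch of the grouping this would need, in Python rather than wget's own C. The function name and the (host, port, user) key are assumptions for illustration, not anything in wget: URLs that share a key could be served over one control connection.

```python
from urllib.parse import urlsplit


def group_ftp_urls_by_server(urls):
    """Group ftp: URLs by (host, port, user), keeping the original order.

    Hypothetical helper: each group could share a single FTP connection.
    """
    groups = {}
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme != "ftp":
            raise ValueError("not an ftp: URL: %r" % url)
        # hostname is lower-cased by urlsplit; 21 is the default FTP port.
        key = (parts.hostname, parts.port or 21, parts.username or "anonymous")
        groups.setdefault(key, []).append(parts.path)
    return groups
```

A client would then open one connection per key and fetch every path in that group before closing it.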

-- 
Jamie Zawinski
[EMAIL PROTECTED] http://www.jwz.org/
[EMAIL PROTECTED]   http://www.dnalounge.com/



wget ftp url syntax is wrong

2001-02-26 Thread Jamie Zawinski

Netscape can retrieve this URL:

  ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm

wget cannot.   wget wants it to be:

  ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm

I believe the Netscape behavior is right and the wget behavior is wrong.




Patch: new option --ignore-size

2001-02-26 Thread Andre Majorel

I'm mirroring a very large tree locally. As the tree is larger
than the local filesystem, I periodically stop wget, save what
I've downloaded on CD-ROM, truncate the saved files to 0 and
then start wget -N -r again to get more files.

Unfortunately, wget checks not only the mtime but also the size
of the local files and starts downloading them again.

This patch adds the --ignore-size option which prevents this.
When this option is present, wget will not retrieve the remote
file again as long as the local file exists and is more recent,
even if its size is not the same as the remote file.
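
The decision described above can be sketched roughly as follows. This is a Python illustration, not the patch itself; the function name, the (mtime, size) tuples and the exact comparison are assumptions.

```python
def should_download(local, remote, ignore_size=False):
    """Decide whether to fetch the remote file again.

    local  -- (mtime, size) of the local file, or None if it does not exist
    remote -- (mtime, size) reported by the server
    Assumed rule: fetch if the local file is missing or older; otherwise
    fetch on a size mismatch unless ignore_size is set.
    """
    if local is None:
        return True
    local_mtime, local_size = local
    remote_mtime, remote_size = remote
    if remote_mtime > local_mtime:
        return True
    if ignore_size:
        return False
    return local_size != remote_size
```

With a local file truncated to 0 bytes but newer than the remote copy, this returns True by default and False when ignore_size is set.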

The patch has been posted to wget-patches. It's also available
at <URL:http://www.teaser.fr/~amajorel/wget/>.

I will write a documentation patch if you think the patch worth
including in the distribution.

-- 
André Majorel
Work: [EMAIL PROTECTED]
Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/



Re: wget ftp url syntax is wrong

2001-02-26 Thread Hanno Foest

On Mon, Feb 26, 2001 at 12:46:51AM -0800, Jamie Zawinski wrote:

> Netscape can retrieve this URL:
>
>   ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>
> wget cannot.   wget wants it to be:
>
>   ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>
> I believe the Netscape behavior is right and the wget behavior is wrong.

I don't think so. The double slash in front of the path part of the URL
starts the path in the ftp server's root, while the single slash starts
it in the default directory you log into when doing anonymous ftp. The
default directory isn't the server's root in this case, but "pub".

So

wget ftp://ftp.redhat.com/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm

works as intended, starting the path relative to the default directory.
Netscape can't retrieve this URL, though... which I believe is wrong.

Hanno



Re: wget ftp url syntax is wrong

2001-02-26 Thread Jan Prikryl

Quoting Hanno Foest ([EMAIL PROTECTED]):

> On Mon, Feb 26, 2001 at 12:46:51AM -0800, Jamie Zawinski wrote:
>
> > Netscape can retrieve this URL:
> >
> >   ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
> >
> > wget cannot.   wget wants it to be:
> >
> >   ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
> >
> > I believe the Netscape behavior is right and the wget behavior is wrong.
>
> I don't think so. The double slash in front of the path part of the URL
> starts the path in the ftp server's root, while the single slash starts
> it in the default directory you log into when doing anonymous ftp. The
> default directory isn't the server's root in this case, but "pub".

Right. On the other hand, wget should probably be able to handle the
missing slash at the beginning (as Netscape does).

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 [EMAIL PROTECTED] | http://www.vrvis.at
+--



Re: wget ftp url syntax is wrong

2001-02-26 Thread Jamie Zawinski

Hanno Foest wrote:
 
>   ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>   ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
...
> I don't think so. The double slash in front of the path part of the URL
> starts the path in the ftp server's root, while the single slash starts
> it in the default directory you log into when doing anonymous ftp. The
> default directory isn't the server's root in this case, but "pub".

Ok, I read the RFC, and we're both wrong:

   http://www.faqs.org/rfcs/rfc1738.html

   For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is
   interpreted by FTP-ing to "host.dom", logging in as "myname"
   (prompting for a password if it is asked for), and then executing
   "CWD /etc" and then "RETR motd". This has a different meaning from
   <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
   "RETR motd"; the initial "CWD" might be executed relative to the
   default directory for "myname". On the other hand,
   <URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null
   argument, then "CWD etc", and then "RETR motd".

So according to the RFC, to use an absolute path, you have to begin
the path component with "/%2F", not with "//" -- the latter means
"cd to the current directory first", thus, it's a no-op.  (Actually
it's not clear whether "CWD " means "home directory" or "current 
directory": it's unspecified by RFC 765.)
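
To make the RFC 1738 reading concrete, here is a small Python sketch that turns an ftp: URL into the CWD/RETR sequence the quoted passage describes. The function name is made up for illustration, and ";type=" suffixes and login handling are left out; it only shows how "/", "//" and "/%2F" differ.

```python
from urllib.parse import urlsplit, unquote


def ftp_commands(url):
    """Return the CWD/RETR commands implied by an ftp: URL (RFC 1738 reading)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp: URL: %r" % url)
    # The first "/" only separates the host from the url-path; drop it.
    raw = parts.path[1:] if parts.path.startswith("/") else parts.path
    # Split on "/" BEFORE decoding, so an encoded %2F stays inside its segment.
    *dirs, name = raw.split("/")
    commands = ["CWD " + unquote(d) for d in dirs]
    if name:
        commands.append("RETR " + unquote(name))
    return commands
```

For a path of "/%2Fetc/motd" this yields "CWD /etc" then "RETR motd"; for "//etc/motd" it yields "CWD " with an empty argument, then "CWD etc", then "RETR motd".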

However, that said, I still think wget should do what Netscape does,
because that's what everyone expects.  The concept of a "default 
directory" in a URL is silly.

I'll bet MSIE does the same thing as Netscape.  That makes it
the standard.




Re: wget ftp url syntax is wrong

2001-02-26 Thread Jan Prikryl

Quoting Jamie Zawinski ([EMAIL PROTECTED]):

> However, that said, I still think wget should do what Netscape does,
> because that's what everyone expects.  The concept of a "default
> directory" in a URL is silly.

The correct approach would be to try "CWD url/dir/path/" (the correct
meaning) and if this does not work, try "CWD /url/dir/path/".
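
A rough sketch of that fallback in Python with the standard ftplib module (wget itself is C, so this is only an illustration). The function name is hypothetical; `ftp` is assumed to be a logged-in ftplib.FTP object, or anything with a compatible cwd() method.

```python
import ftplib


def cwd_with_fallback(ftp, path):
    """Try the path relative to the login directory, then as an absolute path.

    Returns the directory string that the server accepted.
    """
    relative = path.lstrip("/")
    try:
        ftp.cwd(relative)  # the RFC 1738 meaning: relative to the default dir
        return relative
    except ftplib.error_perm:
        # Permanent (5xx) refusal, e.g. no such directory: retry from the root.
        absolute = "/" + relative
        ftp.cwd(absolute)
        return absolute
```

If the second attempt fails as well, the ftplib.error_perm from that call propagates to the caller.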

-- jan




Re: FTP retrieval not functioning

2001-02-26 Thread Jan Prikryl

Quoting Chunks ([EMAIL PROTECTED]):

> I did RTFM, and the links to any mailing list archives I could find
> were broken. Please accept my apologies in advance if this is
> something covered elsewhere. Perhaps ignoring permissions will take
> care of it?

Could you tell us which links were actually broken? 

> I am running GNU Wget 1.5.3.1, win32 compilation and have also tried
> wget 1.5.3 linux compilation with identical results.

As Hack already suggested, try using the latest CVS version - it may
solve your problems. If not, please send us a complete debug output so
that we can try to fix what is broken.

-- jan
