Re: wget ftp url syntax is wrong
> > By the way, neither "//" nor "/%2F" works in 1.7-dev. Perhaps we
> > broke that when we fixed the problem where recursive FTP 'wget's
> > assumed that logging in always put you in '/'?
>
> I believe some of Jan's changes broke it. Also, the standard idiom:
>
> wget -r ftp://username:password@host//path/to/home/something
>
> no longer works.

Aargh. I will have a look at it.

-- jan
Re: wget ftp url syntax is wrong
"Dan Harkless" <[EMAIL PROTECTED]> writes: > By the way, neither "//" nor "/%2F" works in 1.7-dev. Perhaps we > broke that when we fixed the problem where recursive FTP 'wget's > assumed that logging in always put you in '/'? I believe some of Jan's changes broke it. Also, the standard idiom: wget -r ftp://username:password@host//path/to/home/something no longer works.
Re: wget ftp url syntax is wrong
Dan Harkless wrote:
>
> It's my experience that very few anonymous FTP servers put you in a
> directory other than '/' (it certainly may be a chroot()ed '/'),

ftp.redhat.com puts you in /pub by default (as user "anonymous".)
I haven't checked, but I'd say it's a safe bet that this is what the
ftpd that comes with Red Hat Linux does by default.

--
Jamie Zawinski
[EMAIL PROTECTED]  http://www.jwz.org/
[EMAIL PROTECTED]  http://www.dnalounge.com/
Re: wget ftp url syntax is wrong
Hrvoje Niksic <[EMAIL PROTECTED]> writes:
> Jamie Zawinski <[EMAIL PROTECTED]> writes:
> > Netscape can retrieve this URL:
> >
> > ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
> >
> > wget cannot. wget wants it to be:
> >
> > ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
> >
> > I believe the Netscape behavior is right and the wget behavior is
> > wrong.
>
> Wget behavior is based on what was specified in rfc1738 at the time I
> was writing the code. rfc1738 does require %2F to be used instead of
> the slash immediately preceding "pub", but I considered the
> distinction to be purely academic and made Wget accept both. (I have
> yet to see a purpose for CWD-ing into an empty directory.)

By the way, neither "//" nor "/%2F" works in 1.7-dev. Perhaps we broke
that when we fixed the problem where recursive FTP 'wget's assumed that
logging in always put you in '/'?

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
Re: wget ftp url syntax is wrong
Jamie Zawinski <[EMAIL PROTECTED]> writes:
> Dan Harkless wrote:
> > Well, silly or not, the concept is already there, so I don't think it makes
> > sense to remove the ability to access RFC-valid URLs in order to imitate
> > Netscape or Internet Explorer.
>
> I guess that depends on whether you think it's more important to
> do the most useful thing, and what people expect; or do what the
> RFC says, despite the fact that nobody else has actually implemented
> that.

Well, I don't think we've shown that _nobody_ else has implemented that. I
guess it mightn't be a terrible idea to make the common, non-compliant
behavior the default, though, as long as the RFC-correct behavior is
optionally available.

It's my experience that very few anonymous FTP servers put you in a
directory other than '/' (it certainly may be a chroot()ed '/'), though, and
FTP files that require a login and password to get at tend not to be
published as URLs, so in reality I don't think we're talking about that
large a body of common practice.

> (But if you're going to slavishly follow the RFC, you have to do one CWD
> for each directory component, or it won't work on, e.g., VMS and TWENEX
> file servers.)

I certainly wouldn't be opposed to putting in such behavior, if people using
VMS or TWENEX servers complained.

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
Re: wget ftp url syntax is wrong
Jamie Zawinski <[EMAIL PROTECTED]> writes:
> Netscape can retrieve this URL:
>
> ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>
> wget cannot. wget wants it to be:
>
> ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>
> I believe the Netscape behavior is right and the wget behavior is
> wrong.

Wget behavior is based on what was specified in rfc1738 at the time I
was writing the code. rfc1738 does require %2F to be used instead of
the slash immediately preceding "pub", but I considered the
distinction to be purely academic and made Wget accept both. (I have
yet to see a purpose for CWD-ing into an empty directory.)
Re: wget ftp url syntax is wrong
Dan Harkless wrote:
>
> Well, silly or not, the concept is already there, so I don't think it makes
> sense to remove the ability to access RFC-valid URLs in order to imitate
> Netscape or Internet Explorer.

I guess that depends on whether you think it's more important to
do the most useful thing, and what people expect; or do what the
RFC says, despite the fact that nobody else has actually implemented
that.

I guess you know what my opinion is: de facto standards are the only
ones that matter.

> > The correct approach would be to try "CWD url/dir/path/" (the correct
> > meaning) and if this does not work, try "CWD /url/dir/path/".
>
> I agree this would seem to be the best approach. I'll add this to the TODO.

That works too, I suppose.

(But if you're going to slavishly follow the RFC, you have to do one CWD
for each directory component, or it won't work on, e.g., VMS and TWENEX
file servers.)

--
Jamie Zawinski
[EMAIL PROTECTED]  http://www.jwz.org/
[EMAIL PROTECTED]  http://www.dnalounge.com/
Re: wget ftp url syntax is wrong
Jan Prikryl <[EMAIL PROTECTED]> writes:
> Quoting Jamie Zawinski ([EMAIL PROTECTED]):
> > However, that said, I still think wget should do what Netscape does,
> > because that's what everyone expects. The concept of a "default
> > directory" in a URL is silly.

Well, silly or not, the concept is already there, so I don't think it makes
sense to remove the ability to access RFC-valid URLs in order to imitate
Netscape or Internet Explorer.

> The correct approach would be to try "CWD url/dir/path/" (the correct
> meaning) and if this does not work, try "CWD /url/dir/path/".

I agree this would seem to be the best approach. I'll add this to the TODO.

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
Re: wget ftp url syntax is wrong
Quoting Jamie Zawinski ([EMAIL PROTECTED]):
> However, that said, I still think wget should do what Netscape does,
> because that's what everyone expects. The concept of a "default
> directory" in a URL is silly.

The correct approach would be to try "CWD url/dir/path/" (the correct
meaning) and if this does not work, try "CWD /url/dir/path/".

-- jan

+--
Jan Prikryl          | vr|vis center for virtual reality and visualisation
<[EMAIL PROTECTED]>  | http://www.vrvis.at
+--
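[Editor's sketch] The fallback proposed above (try the path relative to the login directory, then retry it as an absolute path) could look roughly like the following in Python. This is only an illustration under stated assumptions, not wget's code: cwd_with_fallback and ScriptedServer are hypothetical names, and the only real-library detail relied on is that ftplib signals a refused CWD by raising ftplib.error_perm.

```python
from ftplib import error_perm


def cwd_with_fallback(ftp, path):
    """Change directory, preferring the RFC 1738 (relative) meaning.

    `ftp` is any object with a cwd(path) method that raises
    ftplib.error_perm on refusal, such as an ftplib.FTP instance.
    Returns the path that the server finally accepted.
    """
    if path.startswith("/"):
        # Already absolute: there is nothing to fall back to.
        ftp.cwd(path)
        return path
    try:
        # First attempt: relative to wherever the login put us.
        ftp.cwd(path)
        return path
    except error_perm:
        # Second attempt: the same path, anchored at the server root.
        absolute = "/" + path
        ftp.cwd(absolute)
        return absolute


class ScriptedServer:
    """Hypothetical stand-in for a server; accepts only the listed paths."""

    def __init__(self, accepted):
        self.accepted = set(accepted)
        self.calls = []

    def cwd(self, path):
        self.calls.append(path)
        if path not in self.accepted:
            raise error_perm("550 No such directory")
```

With a stand-in that only accepts "/pub/redhat/", a call such as cwd_with_fallback(ScriptedServer(["/pub/redhat/"]), "pub/redhat/") first issues the relative CWD, sees it refused, and then succeeds with the absolute form. A real client would also have to decide what to do when both attempts fail; here the second error simply propagates.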
Re: wget ftp url syntax is wrong
Hanno Foest wrote:
>
>> ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>> ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
...
> I don't think so. The double slash in front of the path part of the URL
> starts the path in the ftp server's root, while the single slash starts
> it in the default directory you log into when doing anonymous ftp. The
> default directory isn't the server's root in this case, but "pub".

Ok, I read the RFC, and we're both wrong:

  http://www.faqs.org/rfcs/rfc1738.html

  For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is
  interpreted by FTP-ing to "host.dom", logging in as "myname"
  (prompting for a password if it is asked for), and then executing
  "CWD /etc" and then "RETR motd". This has a different meaning from
  <URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
  "RETR motd"; the initial "CWD" might be executed relative to the
  default directory for "myname". On the other hand,
  <URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null
  argument, then "CWD etc", and then "RETR motd".

So according to the RFC, to use an absolute path, you have to begin the
path component with "/%2F", not with "//" -- the latter means "cd to
the current directory first", thus, it's a no-op. (Actually it's not
clear whether "CWD " means "home directory" or "current directory":
it's unspecified by RFC 765.)

However, that said, I still think wget should do what Netscape does,
because that's what everyone expects. The concept of a "default
directory" in a URL is silly.

I'll bet MSIE does the same thing as Netscape. That makes it the
standard.

--
Jamie Zawinski
[EMAIL PROTECTED]  http://www.jwz.org/
[EMAIL PROTECTED]  http://www.dnalounge.com/
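[Editor's sketch] The RFC 1738 reading quoted above can be made concrete with a few lines of Python that turn an ftp:// URL into the command sequence the RFC describes. This is a minimal illustration, not wget's implementation: rfc1738_ftp_commands is a hypothetical name, and the sketch ignores login, ports, and any ";type=" suffix. It also issues one CWD per path segment, which is the per-component behavior mentioned elsewhere in the thread.

```python
from urllib.parse import urlsplit, unquote


def rfc1738_ftp_commands(url):
    """Return the CWD/RETR commands RFC 1738 implies for an ftp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: %r" % url)

    # The first "/" after the host only separates host from url-path;
    # it is not part of the path itself.
    raw_path = parts.path[1:] if parts.path.startswith("/") else parts.path

    # Split on literal "/" BEFORE decoding, so that an encoded slash
    # (%2F) stays inside its segment instead of creating a new one.
    segments = [unquote(segment) for segment in raw_path.split("/")]

    # Every segment but the last is a directory: one CWD each.
    *directories, filename = segments
    commands = ["CWD " + directory for directory in directories]
    commands.append("RETR " + filename)
    return commands


if __name__ == "__main__":
    for example in (
        "ftp://myname@host.dom/%2Fetc/motd",
        "ftp://myname@host.dom/etc/motd",
        "ftp://myname@host.dom//etc/motd",
    ):
        print(example, "->", rfc1738_ftp_commands(example))
```

For the three RFC examples this yields "CWD /etc" then "RETR motd"; "CWD etc" then "RETR motd"; and "CWD " (null argument), "CWD etc", "RETR motd" respectively. Under the same rules, the single-slash Red Hat URL from the original report starts with "CWD pub" relative to the login directory, which on ftp.redhat.com is reported in this thread to be /pub already.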
Re: wget ftp url syntax is wrong
Quoting Hanno Foest ([EMAIL PROTECTED]):
> On Mon, Feb 26, 2001 at 12:46:51AM -0800, Jamie Zawinski wrote:
>
> > Netscape can retrieve this URL:
> >
> > ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
> >
> > wget cannot. wget wants it to be:
> >
> > ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
> >
> > I believe the Netscape behavior is right and the wget behavior is wrong.
>
> I don't think so. The double slash in front of the path part of the URL
> starts the path in the ftp server's root, while the single slash starts
> it in the default directory you log into when doing anonymous ftp. The
> default directory isn't the server's root in this case, but "pub".

Right. On the other hand, wget should probably be able to handle the
missing slash at the beginning (as Netscape does).

-- jan

+--
Jan Prikryl          | vr|vis center for virtual reality and visualisation
<[EMAIL PROTECTED]>  | http://www.vrvis.at
+--
Re: wget ftp url syntax is wrong
On Mon, Feb 26, 2001 at 12:46:51AM -0800, Jamie Zawinski wrote:
> Netscape can retrieve this URL:
>
> ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>
> wget cannot. wget wants it to be:
>
> ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm
>
> I believe the Netscape behavior is right and the wget behavior is wrong.

I don't think so. The double slash in front of the path part of the URL
starts the path in the ftp server's root, while the single slash starts
it in the default directory you log into when doing anonymous ftp. The
default directory isn't the server's root in this case, but "pub".

So

  wget ftp://ftp.redhat.com/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm

works as intended, starting the path relative to the default directory.
Netscape can't retrieve this URL, though... which I believe is wrong.

Hanno
wget ftp url syntax is wrong
Netscape can retrieve this URL:

  ftp://ftp.redhat.com/pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm

wget cannot. wget wants it to be:

  ftp://ftp.redhat.com//pub/redhat/updates/7.0/i386/apache-devel-1.3.14-3.i386.rpm

I believe the Netscape behavior is right and the wget behavior is wrong.

--
Jamie Zawinski
[EMAIL PROTECTED]  http://www.jwz.org/
[EMAIL PROTECTED]  http://www.dnalounge.com/