Re: Does HTTP allow this?
On Sun, 9 Nov 2003, Hrvoje Niksic wrote:

> One thing that might break (but that Wget doesn't yet support anyway) is
> NTLM, which seems to authorize *connections* rather than individual
> requests.

Yes it does. It certainly makes things more complicated, as you would have to
exclude such a connection from the checks (at least I think you want that; I
don't think you'll be forced to do so). And you also need to exclude HTTPS
connections from this logic (since name-based virtual hosting over SSL isn't
really possible).

curl doesn't do such advanced IP-checking to detect existing connections to
re-use; it only uses host-name based checking when re-using persistent
connections.

> Does curl handle NTLM?

Yes it does, since a while back. I am willing to donate NTLM code to the wget
project, if you want it. I'm not very familiar with the wget internals, so it
wouldn't be a fully working patch, but a set of (proven working) functions to
be integrated by someone with more wget insight. (It depends on crypto
functions provided by OpenSSL.)

Otherwise, I can recommend Eric Glass' superb web page for all the bits and
details of the NTLM protocol: http://davenport.sourceforge.net/ntlm.html

--
 -=- Daniel Stenberg -=- http://daniel.haxx.se -=-
 ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
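As a rough illustration of the protocol documented on Eric Glass' page, the opening message of the NTLM handshake can be sketched as below. This is a hypothetical minimal sketch (the field layout is simplified and the function name is invented); it is not the curl code being offered:

```python
import base64
import struct

def ntlm_type1_message(flags=0x00000207):
    # Sketch of an NTLM Type 1 (negotiate) message: an 8-byte
    # "NTLMSSP\0" signature, a little-endian message type of 1, and
    # the negotiate flags.  Real clients also append domain and
    # workstation security buffers; they are omitted here.
    msg = b"NTLMSSP\x00"             # protocol signature
    msg += struct.pack("<I", 1)      # message type: 1 (negotiate)
    msg += struct.pack("<I", flags)  # negotiate flags
    # The message travels base64-encoded in an Authorization header.
    return "NTLM " + base64.b64encode(msg).decode("ascii")

header = ntlm_type1_message()
```

The later Type 3 (authenticate) message is where the OpenSSL dependency mentioned above would come in, since the challenge responses are derived with crypto operations on the password hash.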
Re: Does HTTP allow this?
Daniel Stenberg [EMAIL PROTECTED] writes:

> Yes it does. It certainly makes things more complicated, as you would have
> to exclude such a connection from the checks (at least I think you want
> that; I don't think you'll be forced to do so). And you also need to
> exclude HTTPS connections from this logic (since name-based virtual
> hosting over SSL isn't really possible).

I'm already treating SSL and non-SSL connections as incompatible. But I'm
curious why you say name-based virtual hosting isn't possible over SSL.

>> Does curl handle NTLM?
>
> Yes it does, since a while back. I am willing to donate NTLM code to the
> wget project, if you want it. I'm not very familiar with the wget
> internals, so it wouldn't be a fully working patch, but a set of (proven
> working) functions to be integrated by someone with more wget insight.
> (It depends on crypto functions provided by OpenSSL.)

That's very generous, thanks! I planned to improve Wget's HTTP internals
(which are very raw right now) anyway, so don't worry about that. A set of
callable functions would be perfect.
Re: Does HTTP allow this?
On Mon, 10 Nov 2003, Hrvoje Niksic wrote:

> I'm already treating SSL and non-SSL connections as incompatible. But I'm
> curious why you say name-based virtual hosting isn't possible over SSL.

To quote the Apache docs: "Name-based virtual hosting cannot be used with SSL
secure servers because of the nature of the SSL protocol." Since you connect
to the site in a secure manner, you can't select which host to get data from
after a successful connection, as the connection will not be successful
unless you have all the proper credentials already.

> That's very generous, thanks!

I'll prepare a C file and header and post them in a separate mail. They will
need a little attention, but not much: mainly setting up pointers to the user
name, password, etc.
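The exclusion Daniel describes can be captured in a small helper. A hypothetical sketch (the function name is invented; note this thread predates TLS server-name indication, so an SSL connection really was bound to a single host):

```python
from urllib.parse import urlsplit

def eligible_for_ip_reuse(url):
    # Only plain-HTTP connections may be shared between different host
    # names that resolve to the same address.  An SSL connection is
    # tied to one host: the certificate is verified during the
    # handshake, before any Host: header can be sent.
    return urlsplit(url).scheme == "http"

ok = eligible_for_ip_reuse("http://www.apache.org/")       # True
no = eligible_for_ip_reuse("https://secure.example.org/")  # False
```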
Re: Does HTTP allow this?
Hrvoje Niksic wrote:

> Assume that Wget has retrieved a document from the host A, which hasn't
> closed the connection in accordance with Wget's keep-alive request. Then
> Wget needs to connect to host B, which is really the same as A because the
> provider uses DNS-based virtual hosts. Is it OK to reuse the connection to
> A to talk to B?
[snip]
> FWIW, it works fine with Apache.

There is a fairly high probability that it will work with most hosts
(regardless of the server software). If an IP address has been registered
with multiple hosts, then the address alone is not sufficient to retrieve a
resource, so you have to add a Host header.

It's possible that the server responding to the IP address forwards
connections to multiple backend servers. These backend servers may or may not
know about all the resources that the gateway server knows about.

Since it will work most of the time, I think it's a reasonable optimization
to use; however, you might want to add a --one-host-per-connection flag for
the rare cases where the current behavior won't work.

Tony
Re: Does HTTP allow this?
Tony Lewis [EMAIL PROTECTED] writes:

> It's possible that the server responding to the IP address forwards
> connections to multiple backend servers. These backend servers may or may
> not know about all the resources that the gateway server knows about.

That is precisely the case I'm worried about. I can't point to anything in
RFC 2616 that would forbid this kind of server-side optimization.

> Since it will work most of the time, I think it's a reasonable
> optimization to use; however, you might want to add a
> --one-host-per-connection flag for the rare cases where the current
> behavior won't work.

The thing is, I don't want to bloat Wget with obscure options to turn off
even more obscure (and *very* rarely needed) optimizations. Wget has enough
command-line options as it is. If there are cases where the optimization
doesn't work, I'd rather omit it completely.
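The optimization under discussion amounts to a compatibility check on every would-be reuse. A minimal sketch, assuming a connection is shared only for plain HTTP on the same port when the two host names share a resolved IP address (the function name is invented for illustration):

```python
import socket
from urllib.parse import urlsplit

def same_endpoint(url_a, url_b):
    # A live connection to url_a may be reused for url_b only when both
    # are plain HTTP, use the same port, and the host names resolve to
    # at least one common IP address.  Each request still carries its
    # own Host: header, so the server can pick the right virtual host.
    a, b = urlsplit(url_a), urlsplit(url_b)
    if a.scheme != "http" or b.scheme != "http":
        return False
    if (a.port or 80) != (b.port or 80):
        return False
    ips = lambda host: {ai[4][0] for ai in socket.getaddrinfo(host, None)}
    return bool(ips(a.hostname) & ips(b.hostname))
```

This is exactly the check that fails for Tony's gateway scenario: two names can share an address yet be served by different backends, which is why the thread leans toward dropping the optimization rather than adding a flag.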
Re: Does HTTP allow this?
Hrvoje Niksic wrote:

> The thing is, I don't want to bloat Wget with obscure options to turn off
> even more obscure (and *very* rarely needed) optimizations. Wget has
> enough command-line options as it is. If there are cases where the
> optimization doesn't work, I'd rather omit it completely.

It's probably safest to turn off that optimization, even if it does eliminate
a few opens now and then.

Tony
Re: Does HTTP allow this?
Tony Lewis [EMAIL PROTECTED] writes:

> Hrvoje Niksic wrote:
>
>> The thing is, I don't want to bloat Wget with obscure options to turn
>> off even more obscure (and *very* rarely needed) optimizations. Wget has
>> enough command-line options as it is. If there are cases where the
>> optimization doesn't work, I'd rather omit it completely.
>
> It's probably safest to turn off that optimization, even if it does
> eliminate a few opens now and then.

Yup. If we get a report of a case where it doesn't work, it goes away. (NB:
the optimization has already been there since at least 1.8.x, and no one has
reported a problem. For example, try `wget www.apache.org httpd.apache.org'
with 1.8.2.)
Re: Does HTTP allow this?
On Sat, 8 Nov 2003, Hrvoje Niksic wrote:

> So if I have the connection to the endpoint, I should be able to reuse it.
> But on the other hand, a server might decide to connect a file descriptor
> to a handler for a specific virtual host, which would be unable to serve
> anything else.

FWIW, it works fine with Apache. I would say that your described approach
would work nicely, and it would not contradict anything in the HTTP
standards. Each single request is stand-alone and may indeed have its own
Host: header, even when the connection is kept alive. At least this is how I
interpret these things.
Re: Does HTTP allow this?
Daniel Stenberg [EMAIL PROTECTED] writes:

> On Sat, 8 Nov 2003, Hrvoje Niksic wrote:
>
>> So if I have the connection to the endpoint, I should be able to reuse
>> it. But on the other hand, a server might decide to connect a file
>> descriptor to a handler for a specific virtual host, which would be
>> unable to serve anything else.
>
> FWIW, it works fine with Apache. I would say that your described approach
> would work nicely, and it would not contradict anything in the HTTP
> standards. Each single request is stand-alone and may indeed have its own
> Host: header, even when the connection is kept alive.

Hmm, OK. I guess I needed an independent confirmation, thanks. I would feel
safer if section 19.6.1.1 of RFC 2616 were explicit about persistent
connections, but I guess it can be inferred.

One thing that might break (but that Wget doesn't yet support anyway) is
NTLM, which seems to authorize *connections* rather than individual requests.
Does curl handle NTLM?
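Daniel's reading above, that each request on a kept-alive connection is self-contained, means consecutive requests may name different virtual hosts. A sketch of what the two requests would look like on the wire (a hypothetical helper, not Wget code):

```python
def build_request(path, host):
    # Each HTTP/1.1 request on a persistent connection is
    # self-contained: it names its own virtual host in the Host:
    # header, so the same TCP connection can serve different hosts
    # that share an IP address.
    return ("GET %s HTTP/1.1\r\n"
            "Host: %s\r\n"
            "Connection: keep-alive\r\n\r\n" % (path, host)).encode("ascii")

# Two requests that could travel over one kept-alive connection:
first = build_request("/", "www.apache.org")
second = build_request("/", "httpd.apache.org")
```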