Hi Bradley, I'm thinking that we should instead document the {proxy} attribute. It can be useful to have it available in $response->request to be able to tell where the request was sent.
The reason why _need_proxy() does not override {proxy} if set is that it allows other handlers to set it up. You might for instance want to set up a handler that proxies everything not going to your intranett, or just set it explicitly for some requests. --Gisle On Apr 9, 2011, at 0:44 , Bradley Dean wrote: > Greetings, > > I've been looking at what feels to me like a bug with the handling of proxy > configuration. When a HTTP::Request is made through an LWP::UserAgent with a > proxy configured the HTTP::Request is changed by LWP::UserAgent::__need_proxy: > > sub _need_proxy { > ... > $req->{proxy} = $HTTP::URI_CLASS->new($proxy); > } > > "proxy" is not a documented attribute of the HTTP::Request class, but when it > is set _need_proxy ignores the current LWP::UserAgent proxy settings: > > sub _need_proxy { > my($req, $ua) = @_; > return if exists $req->{proxy}; > ... > } > > This becomes a problem when trying to use a single HTTP::Request instance > across multiple proxies - for example: > > my $request = HTTP::Request->new(GET => 'http://www.example.com/'); > my $ua = LWP::UserAgent->new(); > $ua->proxy('http', 'http://proxy_one.example.org:12345/'); > > # This first request goes through proxy_one > $ua->request($request); > > # Now change the proxy on the LWP::UserAgent > $ua->proxy('http', 'http://proxy_two.example.org:12345/'); > > # This second request goes through proxy_one still because > # $request->{proxy} == "http://proxy_one.example.org:12345/" > $ua->request($request); > > This can be worked around in the calling software by deleting > $request->{proxy} between requests but it's a hack (and one which could fail > at any time given that this is a non-documented attribute). > > I've thought of a few possible patches: > > PATCH 1: > > Remove the "return if exists $req->{proxy};" line from _need_proxy. I think > this is the cleanest approach. > > Thinking of reasons not to do this: that it incurs some extra processing cost > per request (but only in the case that a single request is used over and over > again); it might break existing code that relies on the undocumented "proxy" > attribute (where people are building requests with proxies rather than > configuring via LWP::UserAgent). If this is a concern see 2. > > PATCH 1 GIT: > > diff --git a/lib/LWP/UserAgent.pm b/lib/LWP/UserAgent.pm > index 1ee9ffb..945cd8a 100644 > --- a/lib/LWP/UserAgent.pm > +++ b/lib/LWP/UserAgent.pm > @@ -934,7 +934,6 @@ sub mirror > > sub _need_proxy { > my($req, $ua) = @_; > - return if exists $req->{proxy}; > my $proxy = $ua->{proxy}{$req->uri->scheme} || return; > if ($ua->{no_proxy}) { > if (my $host = eval { $req->uri->host }) { > > > PATCH 2: > > A more expensive option would be to check if the LWP::UserAgent has a proxy > configured and if it is different from the proxy attribute on the > HTTP::Request. > > If it is different I would argue that LWP::UserAgent value should override > the requestm, so if the values were different the LWP::UserAgent could change > the value of the proxy atttribute. > > I've not included a git patch for this option but could supply one if it was > of interest. > > PATCH 3: > > At the moment $request->{proxy} has to be set because the request is passed > out of the LWP::UserAgent to a protocol handler. As I recall at this stage > the LWP::UserAgent is not visible to the protocol handler, so $request is > used to pass on the proxy configuration. > > A much bigger change to the code would have the protocol handler given access > to the LWP::UserAgent so that the proxy did not have to be attached to the > HTTP::Request. The same decision still needs to be made as in PATCH 2 (ie > what happens if someone has set the proxy attribute). > > As with patch 2, I've not included a git patch for this option but could > supply one if it was of interest. > > Cheerio, > > Brad