Hello,

One of the users of my Checkbot tool <http://degraaff.org/checkbot/> had
some trouble with the proxy functionality in LWP. In Checkbot I use the
proxy and noproxy functionality of LWP without changes, so I guess that
these issues should be addressed in LWP. The first issue is a problem
with the noproxy feature in relation to domain-less hostnames. The
report mentions FQDNs, but I think this should be read as canonical
URIs. The second issue is a feature request.

Kind regards,

Hans

- checkbot will ask the proxy if the URL contains a non-FQDN hostname
  and --noproxy contains the local domain. E.g. one intranet
  server is foo.de.marconicomms.com, and I run
    checkbot --proxy bar --noproxy de.marconicomms.com
  which, contrary to what I want, asks the proxy bar for
  http://foo/index.html. A direct connection is used for
  http://foo.de.marconicomms.com/index.html, as expected.

  This is probably because the noproxy args are matched against the
  hostname as found in the URL, and not against the FQDN. Thus,
  "foo" does not match "de.marconicomms.com" and the proxy is used.
  This could be fixed if the matching followed the same mechanism
  as the resolver, e.g. looking at the "search" line in /etc/resolv.conf
  for possible domains. Alternatively, a non-FQDN could be canonicalized
  by a name service lookup before being matched against the noproxy list.
  What do you think?
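
  To make the second alternative concrete, here is a minimal sketch in
  Python (not LWP code, and not how LWP currently behaves). It assumes
  the noproxy list is matched as a plain domain suffix, and the function
  names are made up for illustration. A dot-less host name is
  canonicalized by a name service lookup before it is matched again.

```python
import socket


def matches_no_proxy(host, no_proxy_domains):
    """Return True if host equals, or is a subdomain of, a listed domain.

    Assumption: the noproxy list is matched as a plain domain suffix.
    """
    host = host.lower().rstrip(".")
    for domain in no_proxy_domains:
        domain = domain.lower().strip(".")
        if host == domain or host.endswith("." + domain):
            return True
    return False


def should_bypass_proxy(host, no_proxy_domains, resolve=socket.getfqdn):
    """Match the host as written first; if it has no dot, canonicalize it
    with a name service lookup and match the result as well."""
    if matches_no_proxy(host, no_proxy_domains):
        return True
    if "." not in host:
        # resolve() is expected to return the fully qualified name,
        # or the name unchanged if the lookup finds nothing better.
        return matches_no_proxy(resolve(host), no_proxy_domains)
    return False
```

  With this, "foo" would go direct whenever the resolver expands it to
  foo.de.marconicomms.com, at the cost of one extra lookup per dot-less
  host name.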

- The common web browsers (IE, Mozilla, et al.) configure their
  proxy/noproxy settings via a proxy.pac file. This file is normally
  centrally maintained. It is described in RFC 3040; to quote:

        6.2 Proxy Auto Configuration (PAC)

           Best known reference:
              "Navigator Proxy Auto-Config File Format" [12]

           Description:
              A JavaScript script retrieved from a web server is executed for
              each URL accessed to determine the appropriate proxy (if any) to
              be used to access the resource.  User agents must be configured to
              request this script upon startup.  There is no bootstrap
              mechanism, manual configuration is necessary.

              Despite manual configuration, the process of proxy configuration
              is simplified by centralizing it within a script at a single
              location.

           Security:
              Common policy per organization possible but still requires initial
              manual configuration.  PAC is better than "manual proxy
              configuration" since PAC administrators may update the proxy
              configuration without further user intervention.

              Interoperability of PAC files is not high, since different
              browsers have slightly different interpretations of the same
              script, possibly leading to undesired effects.

           Deployment:
              Implemented in Netscape Navigator and Microsoft Internet Explorer.

           Submitter:
              Document editors.

[12] refers to
http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html

  Now, in an ideal world, checkbot would just use the same mechanism for
  proxy configuration that the web browsers use. In fact, our proxy.pac
  is almost a hundred lines long, and turning that into --[no]proxy
  args is repetitive and error-prone.

  I realize that parsing a proxy.pac (= extremely restricted JavaScript)
  may not be a one-liner in perl. Anyway, maybe you would consider the
  idea of using a standardized and flexible [no]proxy determination a
  Good Thing(TM). And if you don't get right to it, adding it to the
  shadow todo list is a good idea. :-)
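
  For illustration only, here is roughly what such a script decides,
  sketched in Python rather than JavaScript. isPlainHostName() and
  dnsDomainIs() are helper functions from the Netscape PAC document; the
  Python versions below are simplified stand-ins. The rules and the
  port number 8080 are hypothetical and are not taken from our real
  proxy.pac.

```python
def is_plain_host_name(host):
    """Stand-in for PAC's isPlainHostName(): True if host has no dots."""
    return "." not in host


def dns_domain_is(host, domain):
    """Stand-in for PAC's dnsDomainIs(): True if host ends with domain."""
    return host.lower().endswith(domain.lower())


def find_proxy_for_url(url, host):
    """Counterpart of a PAC file's FindProxyForURL(url, host).

    Returns a PAC-style result string: "DIRECT", or "PROXY host:port"
    with further fallbacks separated by semicolons.
    """
    # Hypothetical rules: local names and the intranet domain go direct.
    if is_plain_host_name(host) or dns_domain_is(host, ".de.marconicomms.com"):
        return "DIRECT"
    # Hypothetical proxy address; fall back to a direct connection.
    return "PROXY bar:8080; DIRECT"
```

  Note that a rule like the first one also covers the dot-less host name
  case from the first issue, which plain --noproxy arguments cannot
  express.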
