To add to what Lewis said: PAC files are mostly used by browsers, not so much
by server-side software like Nutch. Your IT department may have another proxy
configuration that you can use on a server.
Keep in mind that a PAC file is just a JavaScript function that maps a URL to
proxy information. If the logic is simple and the file is static, it may be
enough to look at the contents of the file and extract a static proxy
definition that works for all URLs.
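
As a rough illustration of that last step: the function in a PAC file returns
a string such as "PROXY host:port; DIRECT". The Python sketch below pulls the
first PROXY entry out of such a string and prints it as the http.proxy.host and
http.proxy.port properties Patricia already found. The host, port and helper
names are made up for the example, and I am assuming the PAC file returns the
same proxy for every URL; overrides normally go in conf/nutch-site.xml rather
than nutch-default.xml.

```python
import re

# Example return value copied by hand from a simple, static PAC file.
# The host and port are placeholders, not a real proxy.
PAC_RESULT = "PROXY proxy.example.com:8080; DIRECT"


def first_http_proxy(pac_result):
    """Return (host, port) of the first PROXY entry, or None if there is none."""
    for entry in pac_result.split(";"):
        match = re.fullmatch(
            r"\s*PROXY\s+([^\s:]+):(\d+)\s*", entry, re.IGNORECASE
        )
        if match:
            return match.group(1), int(match.group(2))
    return None


def nutch_properties(host, port):
    """Render the two proxy settings as <property> snippets (hypothetical helper)."""
    template = "<property>\n  <name>{}</name>\n  <value>{}</value>\n</property>"
    return "\n".join([
        template.format("http.proxy.host", host),
        template.format("http.proxy.port", port),
    ])


proxy = first_http_proxy(PAC_RESULT)
if proxy is not None:
    print(nutch_properties(*proxy))
```

If the PAC file branches on the URL or host, a single static definition like
this will not cover every case, so check the logic in the file first.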

> -----Original Message-----
> From: lewis john mcgibbney <lewi...@apache.org>
> Sent: 23 April 2018 18:04
> To: user@nutch.apache.org
> Subject: Re: No internet connection in Nutch crawler: Proxy configuration -PAC
> file
> 
> Hi Patricia,
> I've never used a proxy auto-config (PAC) method for proxying anything before.
> The PAC is defined as "...Proxy auto-configuration (PAC): Specify the URL for
> a PAC file with a JavaScript function that determines the appropriate proxy
> for each URL. This method is more suitable for laptop users who need several
> different proxy configurations, or complex corporate setups with many
> different proxies."
> Right now, the public guidance for using Nutch with a proxy goes as far as the
> following tutorial https://wiki.apache.org/nutch/SetupProxyForNutch
> Right now, Nutch does not support the reading of PAC files... I think you
> would need to add this functionality.
> Lewis
> 
> On Sun, Apr 22, 2018 at 10:31 AM, <user-digest-h...@nutch.apache.org> wrote:
> 
> >
> > From: Patricia Helmich <patriciahelm...@hotmail.com>
> > To: "user@nutch.apache.org" <user@nutch.apache.org>
> > Cc:
> > Bcc:
> > Date: Fri, 20 Apr 2018 10:31:42 +0000
> > Subject: No internet connection in Nutch crawler: Proxy configuration
> > -PAC file
> >
> > Hi,
> >
> > I am using Nutch and it used to work fine. Now, some internet
> > configurations changed and I have to use a proxy. In my browser, I
> > specify the proxy by providing a PAC file to the option "Automatic
> > proxy configuration URL". I was searching for a similar option in
> > Nutch in the conf/nutch-default.xml file. I do find some proxy options
> > (http.proxy.host, http.proxy.port, http.proxy.username,
> > http.proxy.password,
> > http.proxy.realm) but none seems to be the one I am searching for.
> >
> > So, my question is: where can I specify the PAC file in the Nutch
> > configurations for the proxy?
> >
> > Thanks for your help,
> >
> > Patricia
> >
> >
> 
> 
> --
> http://home.apache.org/~lewismc/
> http://people.apache.org/keys/committer/lewismc
