Hi Patricia,

I've never used a proxy auto-config (PAC) file to proxy anything before. PAC is defined as:

"...Proxy auto-configuration (PAC): Specify the URL for a PAC file with a JavaScript function that determines the appropriate proxy for each URL. This method is more suitable for laptop users who need several different proxy configurations, or complex corporate setups with many different proxies."

The public guidance for using Nutch with a proxy currently goes only as far as this tutorial:
https://wiki.apache.org/nutch/SetupProxyForNutch

Nutch does not currently support reading PAC files, so I think you would need to add this functionality yourself.

Lewis
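One possible stopgap, offered as a sketch under assumptions: if the PAC file effectively returns a single fixed proxy for the sites being crawled, its host and port could be copied by hand into the static proxy properties mentioned in the original question. Overrides normally go in conf/nutch-site.xml rather than conf/nutch-default.xml. The values proxy.example.com and 8080 are placeholders, not values from this thread.

```xml
<!-- conf/nutch-site.xml: overrides for defaults in conf/nutch-default.xml -->
<configuration>
  <property>
    <name>http.proxy.host</name>
    <!-- placeholder: the host from the PAC file's "PROXY host:port" return value -->
    <value>proxy.example.com</value>
  </property>
  <property>
    <name>http.proxy.port</name>
    <!-- placeholder: the port from the same return value -->
    <value>8080</value>
  </property>
</configuration>
```

The http.proxy.username and http.proxy.password properties would only be needed if the proxy requires authentication. This does not reproduce any per-URL logic in the PAC file; it only covers the case where one proxy serves every crawled host.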
On Sun, Apr 22, 2018 at 10:31 AM, <[email protected]> wrote:

>
> From: Patricia Helmich <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Bcc:
> Date: Fri, 20 Apr 2018 10:31:42 +0000
> Subject: No internet connection in Nutch crawler: Proxy configuration -PAC file
>
> Hi,
>
> I am using Nutch and it used to work fine. Now, some internet
> configurations changed and I have to use a proxy. In my browser, I specify
> the proxy by providing a PAC file to the option "Automatic proxy
> configuration URL". I was searching for a similar option in Nutch in the
> conf/nutch-default.xml file. I do find some proxy options (http.proxy.host,
> http.proxy.port, http.proxy.username, http.proxy.password,
> http.proxy.realm) but none seems to be the one I am searching for.
>
> So, my question is: where can I specify the PAC file in the Nutch
> configurations for the proxy?
>
> Thanks for your help,
>
> Patricia
>

--
http://home.apache.org/~lewismc/
http://people.apache.org/keys/committer/lewismc

