Hi - stick to protocol-http if you can, and configure the plugins so that protocol-httpclient is used only for HTTPS connections. Unlike protocol-http, the protocol-httpclient plugin has trouble with URLs that are not properly encoded, but that is easily resolved by adapting the URL normalizers.
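To illustrate the kind of clean-up I mean by "properly encoded", here is a rough sketch in Python. It is not Nutch code and not how the Nutch normalizers are implemented; in Nutch you would do the equivalent in the URL normalizer configuration or in a custom normalizer. The function name encode_url and the two SAFE constants are made up for this example.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Hypothetical helper, for illustration only (not part of Nutch).
# Characters that are legal in each component and are left untouched.
# '%' is kept so that existing escapes such as %20 are not encoded twice.
PATH_SAFE = "/%:@!$&'()*+,;=-._~"
QUERY_SAFE = "/?%:@!$&'()*+,;=-._~"


def encode_url(url: str) -> str:
    """Percent-encode illegal characters in the path and query of a URL."""
    parts = urlsplit(url)
    path = quote(parts.path, safe=PATH_SAFE)      # spaces become %20
    query = quote(parts.query, safe=QUERY_SAFE)   # non-ASCII becomes UTF-8 escapes
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(encode_url("https://example.org/some path/file name.html?q=a b"))
```

Note that this sketch leaves a stray '%' alone even when it is not followed by two hex digits, so it does not repair every malformed URL.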
We operate such a configuration and handle both protocols without significant issues, so it can be done. Do make sure you use the latest Nutch: we committed some improvements recently, including SSL support for protocol-http. That is based on work by Aloisius on GitHub which we adapted for Nutch and which Julien kindly committed. So Nutch 1.9, which is about to be released, supports SSL in both protocol plugins.

https://issues.apache.org/jira/browse/NUTCH-1676

On Wednesday 13 August 2014 22:07:20 Mattmann, Chris A wrote:
> CC'ing dev@nutch.
>
> Nutch'ers - question from JPL peeps below on protocol-httpclient.
> FYI. if you could chime in would be great!
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: <Kieu>, "Phu (172C-Affiliate)" <[email protected]>
> Date: Wednesday, August 13, 2014 3:00 PM
> To: Chris Mattmann <[email protected]>, "Mcgibbney, Lewis J
> (398J)" <[email protected]>, "Liu, Jeff Y (172B)"
> <[email protected]>
> Subject: Re: Nutch 1.9
>
> >Https support is a must for us.
> >
> >On 08/13/2014, 14:27, "Mattmann, Chris A (3980)"
> ><[email protected]> wrote:
> >
> >>hrm, any reason to switch to httpclient? Why not just keep http?
> >>
> >>-----Original Message-----
> >>From: <Kieu>, "Phu (172C-Affiliate)" <[email protected]>
> >>Date: Wednesday, August 13, 2014 2:20 PM
> >>To: "Mcgibbney, Lewis J (398J)" <[email protected]>, "Liu,
> >>Jeff Y (172B)" <[email protected]>
> >>Cc: Chris Mattmann <[email protected]>
> >>Subject: Re: Nutch 1.9
> >>
> >>>Thanks for the heads up.
> >>>
> >>>We switched over to trunk last week, and it looks good.
> >>>
> >>>A question about protocol-http however:
> >>>
> >>>What does it support and not support?
> >>>
> >>>We've been using protocol-httpclient from the beginning, and it's been
> >>>working fine. Switching over to protocol-http is currently giving us an
> >>>SSL exception.
> >>>
> >>>On 08/12/2014, 22:59, "Mcgibbney, Lewis J (398J)"
> >>><[email protected]> wrote:
> >>>
> >>>>Hi Folks,
> >>>>
> >>>>Just a quick note to say that there is a release candidate out for
> >>>>Nutch 1.9.
> >>>>The proposed release report can be found here[0], you can check it for
> >>>>some issues which may help with your deployment for JPLSpace.
> >>>>Would be great to hear if you have any further problems using the
> >>>>software.
> >>>>Best
> >>>>Lewis
> >>>>
> >>>>[0]
> >>>>https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=10680&version=12324611

