Hey Roannel,

This is what I needed. I kept trying to Google things related to certificates and Nutch, but I guess I should have just searched for Java and certificates instead. Works like a charm now.

Thanks,
Sid

-----Original Message-----
From: Roannel Fernández Hernández [mailto:[email protected]]
Sent: November-28-17 3:31 PM
To: [email protected]
Subject: Re: [MASSMAIL]Certificates

Hi Sadiki:

You must add your Solr certificate to the cacerts file (the default keystore) of your Java distribution. Under Linux, you can find out where your cacerts file is with:

    echo $(readlink -f /usr/bin/java | sed "s:bin/java::")lib/security/cacerts

as described at https://stackoverflow.com/questions/11936685/how-to-obtain-the-location-of-cacerts-of-the-default-java-installation

Regards.

----- Original message -----
> From: "Sadiki Latty" <[email protected]>
> To: [email protected]
> Sent: Tuesday, November 28, 2017 14:03:27
> Subject: RE: [MASSMAIL]Certificates
>
> Hey Eyeris,
>
> Thanks for the response. My issue isn't with the http/https crawling
> but rather with the indexing to Solr. My Solr instances use self-signed
> certificates, and when Nutch tries to index what it found, it fails
> because it doesn't respect the cert that Solr made. I had the same
> issue with Solr talking to other Solr instances, and the solution was
> to manually add the cert and point Solr to the keystore file. I was
> hoping I could find a similar solution for Nutch, where I could add the
> Solr cert to the Nutch keystore, but:
>
> 1. I don't know if Nutch can do that.
> 2. If Nutch has this feature, I don't know where the keystore file is.
> 3. Your suggestion of using Portecle may be suitable for what I need,
>    but I still need to know where Nutch keeps this keystore file
>    and/or how to tell Nutch to use this keystore file.
>
> I am also willing to use protocol-httpclient, but it still (without
> extra configuration) doesn't work for me. I'm fairly new to Nutch, so
> forgive me if I'm missing something obvious.
>
> Thanks
>
> Sid
>
>
> -----Original Message-----
> From: Eyeris Rodriguez Rueda [mailto:[email protected]]
> Sent: November-28-17 12:07 PM
> To: [email protected]
> Subject: Re: [MASSMAIL]Certificates
>
> Hello Sid.
> I am using protocol-httpclient because, in my modest opinion, it
> handles https websites better than protocol-http.
> Since Java 1.7, using protocol-httpclient and Nutch 1.12, my problems
> with self-signed certificates have gone away.
> But if you have problems with websites that have self-signed
> certificates, maybe you need to insert the certificates into the Java
> keystore using the Portecle tool, which you can download here:
> https://sourceforge.net/projects/portecle/
>
> Best regards.
>
>
>
> ----- Original message -----
> From: "Sadiki Latty" <[email protected]>
> To: [email protected]
> Sent: Tuesday, November 28, 2017 11:08:28
> Subject: [MASSMAIL]Certificates
>
> Hey all,
>
> I have a question regarding self-signed certs. I will be using Nutch
> to crawl http and https sites, as well as using it to index to
> self-signed https Solr servers. I managed to add certificates to Solr
> and it fixed their inter-node communication, but I have yet to find
> where in Nutch I can do a similar configuration. I have seen articles
> saying that the protocol-httpclient plugin should be able to do it with
> some code modifications, but the caveat is that httpclient may have
> underlying bugs, so protocol-http is recommended.
> These articles were also almost 3 years old, so options may have
> evolved by now.
> Can someone provide some insight into what my next steps should be?
> Essentially, here are my questions:
>
> 1. Should I use protocol-http, protocol-httpclient or other?
>
> 2. Is there somewhere in a config file that I can tell Nutch to use a
>    Java keystore file, similar to Solr?
>
> Thanks
>
> Sid
>
> **********************
> Text below is autogenerated by my email supplier.
> The @universidad_uci is Fidel: 15 years connected to the future...
> connected to the Revolution
> 2002-2017
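
A minimal sketch of the fix Roannel describes in this thread: locate the cacerts file of the Java installation that Nutch runs under, then import the Solr certificate into it with keytool. The function names, the certificate file solr.crt and the alias solr are hypothetical and only for illustration. The password changeit is the stock cacerts password and may differ on your system, and writing to the system-wide cacerts file usually requires root.

```shell
# Sketch only: function names, "solr.crt" and the alias "solr" are hypothetical.

# Print the cacerts path for a java executable (default: the java found on PATH).
# Same idea as the one-liner quoted in the thread: resolve symlinks, then
# strip the trailing "bin/java" to get the Java home directory.
cacerts_path() (
  java_bin="${1:-$(command -v java)}"
  java_home="$(readlink -f "$java_bin" | sed 's:bin/java$::')"
  printf '%s\n' "${java_home}lib/security/cacerts"
)

# Import a certificate file into a keystore under the given alias.
# Arguments: certificate file, alias, keystore path, optional store password.
# "changeit" is assumed to be the keystore password unless one is given.
import_cert() (
  cert_file="$1"
  alias_name="$2"
  keystore="$3"
  storepass="${4:-changeit}"
  keytool -importcert -noprompt -trustcacerts \
    -alias "$alias_name" -file "$cert_file" \
    -keystore "$keystore" -storepass "$storepass"
)

# Example call (not executed here; run with sufficient privileges):
#   import_cert solr.crt solr "$(cacerts_path)"
```

If Nutch runs on the same Java installation, it should pick up the imported certificate without further configuration, since the JVM consults cacerts by default.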

