Hi, protocol-http also supports https with Nutch 1.9 (with some limitations, see NUTCH-1676). Can you try it without httpclient?
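For reference, here is a minimal sketch of what switching back could look like in conf/nutch-site.xml. The plugin list below is what I believe the 1.9 default to be, so treat it as an assumption and adapt it to your own setup. The only point is that it names protocol-http and not protocol-httpclient:

```xml
<!-- conf/nutch-site.xml (sketch; the plugin list is assumed to match the 1.9 defaults) -->
<property>
  <name>plugin.includes</name>
  <!-- protocol-http instead of protocol-httpclient; keep your other plugins as they are -->
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

After changing it, rerun the same parsechecker command and see whether the handshake error is still there.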
Thanks,
Sebastian

2014-11-11 20:42 GMT+01:00 Eyeris RodrIguez Rueda <[email protected]>:

> Hello all.
>
> A few days ago I started using Nutch 1.9, but I have a problem trying to use
> the parsechecker tool with some websites that use the https protocol.
> I think the problem is with the https protocol, but I'm not sure. I have
> activated the httpclient protocol in the nutch default file, but my problem persists.
> This is my command and the error:
>
> bin/nutch parsechecker https://facultad5.uci.cu/
> fetching: https://facultad5.uci.cu/
> Fetch failed with protocol status: exception(16), lastModified=0: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
>
> When I run a crawl I also get a similar error:
>
> ******************************************************************************************************************
> INFO api.HttpRobotRulesParser - Couldn't get robots.txt for https://dragones.uci.cu/: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
> 2014-11-11 13:41:52,620 ERROR httpclient.Http - Failed to get protocol output
> javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
>     at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
>     at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1884)
>     at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:276)
>     at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:270)
>     at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1439)
>     at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:209)
>     at sun.security.ssl.Handshaker.processLoop(Handshaker.java:878)
>     at sun.security.ssl.Handshaker.process_record(Handshaker.java:814)
>     at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1016)
>     at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312)
>     at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:702)
>     at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>     at org.apache.commons.httpclient.HttpConnection.flushRequestOutputStream(HttpConnection.java:828)
>     at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.flushRequestOutputStream(MultiThreadedHttpConnectionManager.java:1565)
>     at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2116)
>     at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
>     at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
>     at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>     at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:94)
>     at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154)
>     at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:183)
>     at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:715)
> Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
>     at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:385)
>     at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292)
>     at sun.security.validator.Validator.validate(Validator.java:260)
>     at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:326)
>     at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:231)
>     at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:126)
>     at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1421)
>     ... 21 more
> Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
>     at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:196)
>     at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:268)
>     at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:380)
>     ... 27 more
> 2014-11-11 13:41:52,623 INFO fetcher.Fetcher - fetch of https://dragones.uci.cu/ failed with: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
> 2014-11-11 13:41:52,623 INFO fetcher.Fetcher - Thread FetcherThread has no more work available
> 2014-11-11 13:41:52,623 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=2
> 2014-11-11 13:41:53,119 INFO fetcher.Fetcher - Thread FetcherThread has no more work available
> 2014-11-11 13:41:53,119 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
> 2014-11-11 13:41:53,274 INFO fetcher.Fetcher - -activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0, fetchQueues.getQueueCount=1
>
> Any advice or help will be appreciated.
>

