hanbing created NUTCH-3106:
------------------------------

             Summary: Issue with SSLHandshakeException in v1.20 using 
protocol-http plugin
                 Key: NUTCH-3106
                 URL: https://issues.apache.org/jira/browse/NUTCH-3106
             Project: Nutch
          Issue Type: Bug
          Components: plugin
    Affects Versions: 1.20
         Environment: OS: Windows 11 Home Edition, version 23H2.
The same behavior occurs on Ubuntu 20.04.

java -version
java version "11.0.23" 2024-04-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.23+7-LTS-222)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.23+7-LTS-222, mixed mode)
            Reporter: hanbing


When requests are routed through a proxy rather than sent directly, the default 
{{protocol-http}} plugin in Nutch v1.20 fails with an {{SSLHandshakeException}}. 
The issue does not occur with the {{protocol-okhttp}} plugin.

I encountered this while using a ClashVerge client 
([GitHub|https://github.com/clash-verge-rev/clash-verge-rev]) on localhost 
(port 7890) to bypass Cloudflare Bot Protection, which was returning a 403 
response.

Upon investigation, I found that the {{SSLHandshakeException}} originates in the 
{{HttpResponse}} constructor, between lines 121 and 136. Debugging revealed that 
the SSL handshake is performed with the proxy at {{localhost:7890}} instead of 
with the target website.
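
For illustration, the usual way to reach an HTTPS site through an HTTP proxy is to send a plaintext CONNECT request to the proxy first and to start the TLS handshake only after the proxy confirms the tunnel. The Java sketch below shows that sequence. It is not the Nutch code and not a proposed patch: the class name ProxyTunnelSketch and all method names are hypothetical, and it assumes a proxy that requires no authentication.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper, for illustration only (not part of Nutch).
class ProxyTunnelSketch {

  /** Builds the CONNECT request that asks the proxy to open a raw tunnel. */
  static String buildConnectRequest(String host, int port) {
    return "CONNECT " + host + ":" + port + " HTTP/1.1\r\n"
        + "Host: " + host + ":" + port + "\r\n\r\n";
  }

  /** Returns true if the proxy's status line carries a 2xx status code. */
  static boolean isTunnelEstablished(String statusLine) {
    String[] parts = statusLine.trim().split("\\s+");
    return parts.length >= 2 && parts[1].length() == 3 && parts[1].startsWith("2");
  }

  /**
   * Reads the proxy's reply byte by byte up to the blank line that ends the
   * headers, so that no bytes belonging to the later TLS handshake are
   * consumed, and returns the status line.
   */
  static String readStatusLine(InputStream in) throws IOException {
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = in.read()) != -1) {
      sb.append((char) c);
      int n = sb.length();
      if (n >= 4 && sb.substring(n - 4).equals("\r\n\r\n")) {
        break;
      }
    }
    int eol = sb.indexOf("\r\n");
    return eol >= 0 ? sb.substring(0, eol) : sb.toString();
  }

  /** Opens a tunnel through the proxy, then handshakes with the target host. */
  static SSLSocket openTunnel(String proxyHost, int proxyPort,
      String targetHost, int targetPort, int timeoutMs) throws IOException {
    Socket plain = new Socket();
    plain.setSoTimeout(timeoutMs);
    plain.connect(new InetSocketAddress(proxyHost, proxyPort), timeoutMs);

    // Step 1: plaintext CONNECT to the proxy.
    OutputStream out = plain.getOutputStream();
    out.write(buildConnectRequest(targetHost, targetPort)
        .getBytes(StandardCharsets.ISO_8859_1));
    out.flush();

    // Step 2: the proxy must confirm the tunnel before any TLS bytes are sent.
    String status = readStatusLine(plain.getInputStream());
    if (!isTunnelEstablished(status)) {
      plain.close();
      throw new IOException("Proxy refused CONNECT: " + status);
    }

    // Step 3: layer TLS over the tunnel, naming the target host (not the proxy).
    SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
    SSLSocket ssl = (SSLSocket) factory.createSocket(plain, targetHost, targetPort, true);
    ssl.startHandshake();
    return ssl;
  }
}
```

With the setup from this report, a call such as openTunnel("localhost", 7890, "weworkremotely.com", 443, 10000) would be expected to negotiate TLS with weworkremotely.com through the tunnel rather than with the proxy itself.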

 
*Detailed error stack:*
2025-01-10 15:12:32,399 ERROR o.a.n.p.h.Http [main] Failed to get protocol output
org.apache.nutch.protocol.http.api.HttpException: SSL connect to https://weworkremotely.com failed with: Remote host terminated the handshake
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:156) ~[classes/:?]
    at org.apache.nutch.protocol.http.Http.getResponse(Http.java:65) ~[classes/:?]
    at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:354) [classes/:?]
    at org.apache.nutch.protocol.http.api.HttpBase.main(HttpBase.java:697) [classes/:?]
    at org.apache.nutch.protocol.http.Http.main(Http.java:59) [classes/:?]
Caused by: javax.net.ssl.SSLHandshakeException: *Remote host terminated the handshake*
    at java.base/sun.security.ssl.SSLSocketImpl.handleEOF(SSLSocketImpl.java:1715) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1514) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) ~[?:?]
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) ~[classes/:?]
    ... 4 more
Caused by: java.io.EOFException: *SSL peer shut down incorrectly*
    at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:489) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:478) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:160) ~[?:?]
    at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1506) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) ~[?:?]
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) ~[classes/:?]
    ... 4 more
Status: exception(16), lastModified=0: org.apache.nutch.protocol.http.api.HttpException: SSL connect to https://weworkremotely.com failed with: Remote host terminated the handshake



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
