[ 
https://issues.apache.org/jira/browse/NUTCH-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923247#comment-17923247
 ] 

ASF GitHub Bot commented on NUTCH-3106:
---------------------------------------

tatecn opened a new pull request, #847:
URL: https://github.com/apache/nutch/pull/847

   
[NUTCH-3106](https://issues.apache.org/jira/projects/NUTCH/issues/NUTCH-3106) 
Fix SSLHandshakeException with proxy in protocol-http plugin.
   
   I also added a unit test, TestProtocolHttpByProxy, which uses 
[LittleProxy](https://github.com/adamfisk/LittleProxy). The error it 
produces differs from the one reported in NUTCH-3106:
   
   2025-02-02 22:35:43,575 ERROR o.a.n.p.h.Http [main] Failed to get protocol 
output
   org.apache.nutch.protocol.http.api.HttpException: SSL connect to 
https://www.baidu.com failed with: **Unsupported or unrecognized SSL message**
        at 
org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:156) 
~[classes/:?]
        at org.apache.nutch.protocol.http.Http.getResponse(Http.java:65) 
~[classes/:?]
        at 
org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:361)
 [classes/:?]
        at 
org.apache.nutch.protocol.http.TestProtocolHttpByProxy.testRequestByProxy(TestProtocolHttpByProxy.java:88)
 [classes/:?]
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:?]
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
 ~[?:?]
        at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:?]
        at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 [junit-4.13.2.jar:4.13.2]
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 [junit-4.13.2.jar:4.13.2]
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 [junit-4.13.2.jar:4.13.2]
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 [junit-4.13.2.jar:4.13.2]
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
[junit-4.13.2.jar:4.13.2]
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
[junit-4.13.2.jar:4.13.2]
        at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
 [junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
[junit-4.13.2.jar:4.13.2]
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
 [junit-4.13.2.jar:4.13.2]
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
 [junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 
[junit-4.13.2.jar:4.13.2]
        at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 
[junit-4.13.2.jar:4.13.2]
        at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
 [junit-rt.jar:?]
        at 
com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38) 
[junit-rt.jar:?]
        at 
com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11) 
[idea_rt.jar:?]
        at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
 [junit-rt.jar:?]
        at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
 [junit-rt.jar:?]
        at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55) 
[junit-rt.jar:?]
   Caused by: javax.net.ssl.SSLException: Unsupported or unrecognized SSL 
message
        at 
java.base/sun.security.ssl.SSLSocketInputRecord.handleUnknownRecord(SSLSocketInputRecord.java:457)
 ~[?:?]
        at 
java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:175)
 ~[?:?]
        at 
java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
        at 
java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1506) ~[?:?]
        at 
java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421)
 ~[?:?]
        at 
java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) 
~[?:?]
        at 
java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) 
~[?:?]
        at 
org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) 
~[classes/:?]
        ... 32 more
   Status: exception(16), lastModified=0: 
org.apache.nutch.protocol.http.api.HttpException: SSL connect to 
https://www.baidu.com failed with: Unsupported or unrecognized SSL message
   




> Issue with SSLHandshakeException in v1.20 using protocol-http plugin
> --------------------------------------------------------------------
>
>                 Key: NUTCH-3106
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3106
>             Project: Nutch
>          Issue Type: Bug
>          Components: plugin
>    Affects Versions: 1.20
>         Environment: OS:
> Windows 11 Home Edition, Version: 23H2.
> Ubuntu 20.04 is the same.
>  
> java -version
> java version "11.0.23" 2024-04-16 LTS
> Java(TM) SE Runtime Environment 18.9 (build 11.0.23+7-LTS-222)
> Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.23+7-LTS-222, mixed mode)
>            Reporter: hanbing
>            Priority: Major
>              Labels: features
>
> When using a proxy instead of a direct request in Nutch v1.20, the default 
> {{protocol-http}} plugin causes an {{SSLHandshakeException}}. However, 
> this issue does not occur with the {{protocol-okhttp}} plugin.
> I encountered this while using a ClashVerge client 
> ([GitHub|https://github.com/clash-verge-rev/clash-verge-rev]) on localhost 
> (port 7890) to bypass Cloudflare Bot Protection, which was returning a 403 
> response.
> Upon investigation, I found that the {{SSLHandshakeException}} is linked to 
> the {{HttpResponse}} class, particularly between lines 121 and 136. Debugging 
> revealed that the SSL handshake is performed with {{localhost:7890}} instead 
> of the target website.
>  
> *Detailed error stack:*
> 2025-01-10 15:12:32,399 ERROR o.a.n.p.h.Http [main] Failed to get protocol 
> output
> org.apache.nutch.protocol.http.api.HttpException: SSL connect to 
> https://weworkremotely.com failed with: Remote 
> host terminated the handshake
> at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:156) 
> ~[classes/:?]
> at org.apache.nutch.protocol.http.Http.getResponse(Http.java:65) ~[classes/:?]
> at 
> org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:354)
>  [classes/:?]
> at org.apache.nutch.protocol.http.api.HttpBase.main(HttpBase.java:697) 
> [classes/:?]
> at org.apache.nutch.protocol.http.Http.main(Http.java:59) [classes/:?]
> Caused by: javax.net.ssl.SSLHandshakeException: *Remote host terminated the 
> handshake*
> at 
> java.base/sun.security.ssl.SSLSocketImpl.handleEOF(SSLSocketImpl.java:1715)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1514)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426)
>  ~[?:?]
> at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) 
> ~[classes/:?]
> ... 4 more
> Caused by: java.io.EOFException: *SSL peer shut down incorrectly*
> at 
> java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:489)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:478)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:160)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1506)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455)
>  ~[?:?]
> at 
> java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426)
>  ~[?:?]
> at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) 
> ~[classes/:?]
> ... 4 more
> Status: exception(16), lastModified=0: 
> org.apache.nutch.protocol.http.api.HttpException: SSL connect to 
> https://weworkremotely.com failed with: Remote 
> host terminated the handshake
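
The quoted report says the SSL handshake is performed with {{localhost:7890}} rather than with the target website. A common way for a client to reach an HTTPS site through an HTTP proxy is to send the proxy a plain-text CONNECT request first and start the TLS handshake only after the proxy answers with a 2xx status. The sketch below illustrates that order of operations using only the JDK. It is not the code from the pull request: the class name ProxyTunnelSketch and all of its method names are hypothetical, and details such as proxy authentication and timeouts are left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical sketch; class and method names are illustrative and are not
// taken from Nutch or from the pull request.
final class ProxyTunnelSketch {

  /** Builds the plain-text CONNECT request that is sent to the proxy. */
  static String buildConnectRequest(String targetHost, int targetPort) {
    String authority = targetHost + ":" + targetPort;
    return "CONNECT " + authority + " HTTP/1.1\r\n"
        + "Host: " + authority + "\r\n\r\n";
  }

  /** Returns true if the proxy's status line carries a 2xx status code. */
  static boolean isTunnelEstablished(String statusLine) {
    if (statusLine == null) {
      return false;
    }
    String[] parts = statusLine.trim().split("\\s+");
    return parts.length >= 2
        && parts[0].startsWith("HTTP/")
        && parts[1].matches("2\\d\\d");
  }

  /** Reads bytes up to and including the blank line that ends the response head. */
  static String readResponseHead(InputStream in) throws IOException {
    StringBuilder head = new StringBuilder();
    int b;
    while ((b = in.read()) != -1) {
      head.append((char) b);
      if (head.length() >= 4 && head.indexOf("\r\n\r\n", head.length() - 4) >= 0) {
        break;
      }
    }
    return head.toString();
  }

  /** Opens a tunnel through the proxy, then runs the TLS handshake with the target. */
  static SSLSocket connectViaProxy(String proxyHost, int proxyPort,
      String targetHost, int targetPort) throws IOException {
    Socket raw = new Socket(proxyHost, proxyPort);
    try {
      OutputStream out = raw.getOutputStream();
      out.write(buildConnectRequest(targetHost, targetPort)
          .getBytes(StandardCharsets.US_ASCII));
      out.flush();

      String head = readResponseHead(raw.getInputStream());
      String statusLine = head.split("\r\n", 2)[0];
      if (!isTunnelEstablished(statusLine)) {
        throw new IOException("Proxy refused CONNECT: " + statusLine);
      }

      // Layer TLS over the tunnelled socket; host and port name the target
      // site, not the proxy.
      SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
      SSLSocket ssl = (SSLSocket) factory.createSocket(raw, targetHost, targetPort, true);
      ssl.startHandshake();
      return ssl;
    } catch (IOException e) {
      raw.close();
      throw e;
    }
  }

  public static void main(String[] args) {
    // No network access here: only print the request that would be sent.
    System.out.print(buildConnectRequest("example.org", 443));
  }
}
```

With a proxy such as the one in the report, a call would look like connectViaProxy("localhost", 7890, "weworkremotely.com", 443). The response head is read byte by byte so that nothing beyond the blank line is consumed before the TLS layer takes over the stream.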



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
