Sebastiano Vigna created HTTPCLIENT-1829:
--------------------------------------------

             Summary: Apparently normal SSL site generates a handshake failure
                 Key: HTTPCLIENT-1829
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1829
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpClient (classic)
    Affects Versions: 4.5.3
         Environment: Linux localhost.localdomain 4.9.10-200.fc25.x86_64 #1 SMP 
Wed Feb 15 23:28:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

            Reporter: Sebastiano Vigna
            Priority: Minor


When our crawler (BUbiNG) tries to fetch this robots.txt file

http://isizulu.net/robots.txt

HTTP client finds a redirect to

https://isizulu.net/robots.txt

But then it dies with the exception below. Chrome and other browsers, as well
as wget, fetch the URL without problems. We configure the client as

                robotsRequestConfig = RequestConfig.custom()
                                .setRedirectsEnabled( true )
                                .setMaxRedirects( 5 ) 
                                .build();


For your amusement, this is the "bug report" we got (the sender is omitted for 
mercy):

-----------
Subject: Bubing borken by design?

Hi,

lately I've been seeing your Bubing crawler trying to retrieve
http://isizulu.net/robots.txt but it doesn't seem to be capable of
handling the redirect to https://isizulu.net/robots.txt so I am
wondering what you have actually been doing during "the last ten years
of research" as it says on your site (a minimum finding could have been
that Java is just crap).

-----------------------
javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
        at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2023)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1125)
        at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:355)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
        at org.apache.http.impl.conn.BasicHttpClientConnectionManager.connect(BasicHttpClientConnectionManager.java:323)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:191)
        at it.unimi.di.law.bubing.util.FetchData.fetch(FetchData.java:322)
        at it.unimi.di.law.bubing.frontier.FetchingThread.run(FetchingThread.java:239)
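
The trace shows the failure inside the JDK's own SSLSocketImpl.startHandshake, below the HttpClient layer, so it may be reproducible without HttpClient at all. A handshake_failure alert is sent by the server, and one common cause is that client and server share no protocol version or cipher suite; nothing in this report establishes that this is the cause here. The following is only a diagnostic sketch using plain JSSE: the class name TlsDefaults and its methods are hypothetical helpers, not part of HttpClient or BUbiNG. It lists what the default SSLContext offers and can optionally attempt a bare handshake against a host.

```java
import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSession;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical diagnostic helper; not part of HttpClient or BUbiNG.
public class TlsDefaults {

    // Protocol versions the JVM's default SSLContext enables for new connections.
    static String[] enabledProtocols() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // Cipher suites the JVM's default SSLContext enables for new connections.
    static String[] enabledCipherSuites() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getCipherSuites();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // Attempts a bare TLS handshake with the default socket factory and
    // returns the negotiated protocol and cipher suite. Throws
    // SSLHandshakeException (an IOException) if the handshake is rejected.
    static String probe(String host, int port) throws IOException {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.startHandshake();
            SSLSession session = socket.getSession();
            return session.getProtocol() + " / " + session.getCipherSuite();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Enabled protocols:");
        for (String p : enabledProtocols()) {
            System.out.println("  " + p);
        }
        System.out.println("Enabled cipher suites:");
        for (String c : enabledCipherSuites()) {
            System.out.println("  " + c);
        }
        // Optional: java TlsDefaults <host> <port> performs a real handshake.
        if (args.length == 2) {
            System.out.println("Negotiated: " + probe(args[0], Integer.parseInt(args[1])));
        }
    }
}
```

Running it with the host from this report and port 443 as arguments would show whether the plain JDK handshake fails in the same way. Adding -Djavax.net.debug=ssl:handshake to the java command line makes JSSE log the handshake messages, which can then be compared against what the server accepts.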




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
