[ 
https://issues.apache.org/jira/browse/HADOOP-15609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547059#comment-16547059
 ] 

Xiao Chen commented on HADOOP-15609:
------------------------------------

Thanks for the patch, Kitti.

I would prefer that we confine this retry to KMSClientProvider. The retry 
policy in hadoop-common is widely used, and SSLHandshakeException can also 
occur with invalid setups (e.g. handshake failures caused by certificate or 
cipher suite problems). I think we should keep the change specific to KMS to 
reduce its impact.
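
To illustrate the idea, here is a minimal sketch of a retry that is scoped to 
the KMS client instead of the shared retry policy. It is not the patch and 
does not use the real KMSClientProvider code: the class KmsSslRetrySketch, 
the KmsOp interface and callWithSslRetry are hypothetical names made up for 
this example. Only SSLHandshakeException is retried, and only a bounded 
number of times, so a genuinely broken setup still fails quickly.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import javax.net.ssl.SSLHandshakeException;

public class KmsSslRetrySketch {

  /** Hypothetical stand-in for a single KMS HTTP call. */
  public interface KmsOp<T> {
    T call() throws IOException;
  }

  /**
   * Runs op, retrying only when it throws SSLHandshakeException.
   * Any other IOException propagates immediately. After maxRetries
   * failed retries the last SSLHandshakeException is rethrown.
   */
  public static <T> T callWithSslRetry(KmsOp<T> op, int maxRetries)
      throws IOException {
    int attempt = 0;
    while (true) {
      try {
        return op.call();
      } catch (SSLHandshakeException e) {
        if (attempt >= maxRetries) {
          throw e; // give up: likely a persistent setup problem
        }
        attempt++;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    AtomicInteger calls = new AtomicInteger();
    // Simulated operation: the handshake fails twice, then succeeds.
    String result = callWithSslRetry(() -> {
      if (calls.incrementAndGet() < 3) {
        throw new SSLHandshakeException(
            "Remote host closed connection during handshake");
      }
      return "decrypted-key";
    }, 3);
    System.out.println(result + " after " + calls.get() + " calls");
    // prints "decrypted-key after 3 calls"
  }
}
```

Keeping the catch clause this narrow leaves every other caller of the 
hadoop-common retry policy unaffected, which is the point of the suggestion 
above.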

Also, could you add a unit test for this? There are some similar tests in 
TestLoadBalancingKMSClientProvider.

> Retry KMS calls when SSLHandshakeException occurs
> -------------------------------------------------
>
>                 Key: HADOOP-15609
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15609
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, kms
>    Affects Versions: 3.1.0
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Major
>         Attachments: HADOOP-15609.001.patch
>
>
> KMS calls should be retried when javax.net.ssl.SSLHandshakeException occurs 
> and the FailoverOnNetworkExceptionRetry policy is used.
> For example, in the following stack trace we can see that the KMS provider's 
> connection is lost, an SSLHandshakeException is thrown, and the operation is 
> not retried:
> {code}
> W0711 18:19:50.213472  1508 LoadBalancingKMSClientProvider.java:132] KMS provider at [https://example.com:16000/kms/v1/] threw an IOException:
> Java exception follows:
> javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
>         at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
>         at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
>         at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
>         at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
>         at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
>         at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
>         at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
>         at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
>         at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:512)
>         at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:502)
>         at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:791)
>         at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
>         at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
>         at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
>         at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
>         at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
>         at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
>         at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
> Caused by: java.io.EOFException: SSL peer shut down incorrectly
>         at sun.security.ssl.InputRecord.read(InputRecord.java:505)
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
>         ... 22 more
> W0711 18:19:50.239328  1508 LoadBalancingKMSClientProvider.java:149] Aborting since the Request has failed with all KMS providers(depending on hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1) in the group OR the exception is not recoverable
> {code}


