[
https://issues.apache.org/jira/browse/HADOOP-15609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547814#comment-16547814
]
Kitti Nanasi commented on HADOOP-15609:
---------------------------------------
Thanks for the comments [~xiaochen]! I modified the code to only retry in case
of KMSClientProvider and I also added some tests in my latest patch.
> Retry KMS calls when SSLHandshakeException occurs
> -------------------------------------------------
>
> Key: HADOOP-15609
> URL: https://issues.apache.org/jira/browse/HADOOP-15609
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common, kms
> Affects Versions: 3.1.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Major
> Attachments: HADOOP-15609.001.patch, HADOOP-15609.002.patch
>
>
> KMS call should retry when javax.net.ssl.SSLHandshakeException occurs and
> FailoverOnNetworkExceptionRetry policy is used.
> For example in the following stack trace, we can see that the KMS Provider's
> connection is lost, an SSLHandshakeException is thrown and the operation is
> not retried:
> {code}
> W0711 18:19:50.213472 1508 LoadBalancingKMSClientProvider.java:132] KMS
> provider at [https://example.com:16000/kms/v1/] threw an IOException:
> Java exception follows:
> javax.net.ssl.SSLHandshakeException: Remote host closed connection during
> handshake
> at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
> at
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
> at
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
> at
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
> at
> sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
> at
> sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
> at
> sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
> at
> sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
> at
> sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
> at
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:512)
> at
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:502)
> at
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:791)
> at
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
> at
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
> at
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
> at
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
> at
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
> at
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
> at
> org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
> Caused by: java.io.EOFException: SSL peer shut down incorrectly
> at sun.security.ssl.InputRecord.read(InputRecord.java:505)
> at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
> ... 22 more
> W0711 18:19:50.239328 1508 LoadBalancingKMSClientProvider.java:149] Aborting
> since the Request has failed with all KMS providers(depending on
> hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1)
> in the group OR the exception is not recoverable
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]