[
https://issues.apache.org/jira/browse/HADOOP-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576162#comment-16576162
]
Kitti Nanasi commented on HADOOP-15655:
---------------------------------------
Thanks [~gabor.bota] for the comment! I agree that other possible invocations
should be tested as well, but in patch v003 I changed the code so that it only
affects LoadBalancingKMSClientProvider. With the newest patch it therefore
makes sense to test it only in TestLoadBalancingKMSClientProvider. Retrying
upon SocketTimeoutException outside of LoadBalancingKMSClientProvider might
cause unexpected behaviour.
I also added an "isIdempotent" flag to the KMS operations, so that it can be
passed down to the FailoverOnNetworkExceptionRetry policy, which retries on
IOExceptions depending on whether the operation is idempotent. So the new
implementation will also retry on SocketTimeoutException if the operation is
idempotent.
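To illustrate the decision described above, here is a rough, simplified sketch.
It is not the code from the patch or from Hadoop's RetryPolicies; the class
RetryDecisionSketch, the Decision enum and the shouldRetry method are made-up
names, and the handling of ConnectException is my assumption about how a
request that never reached the server would be treated.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

// Hypothetical, simplified model of an idempotency-aware failover decision.
public class RetryDecisionSketch {

  public enum Decision { FAIL, FAILOVER_AND_RETRY }

  // failovers: how many failovers have already been attempted
  // maxFailovers: upper bound, comparable in spirit to
  //   hadoop.security.kms.client.failover.max.retries
  // isIdempotent: whether repeating the operation is safe
  public static Decision shouldRetry(Exception e, int failovers,
      int maxFailovers, boolean isIdempotent) {
    if (failovers >= maxFailovers) {
      return Decision.FAIL;
    }
    if (e instanceof ConnectException) {
      // Assumption: the request never reached the server, so retrying
      // is safe even for non-idempotent operations.
      return Decision.FAILOVER_AND_RETRY;
    }
    if (e instanceof IOException) {
      // SocketTimeoutException is an IOException: the server may or may
      // not have processed the request, so retry only if idempotent.
      return isIdempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
    }
    return Decision.FAIL;
  }

  public static void main(String[] args) {
    Exception timeout = new SocketTimeoutException("Read timed out");
    System.out.println(shouldRetry(timeout, 0, 3, true));
    System.out.println(shouldRetry(timeout, 0, 3, false));
  }
}
```

With this shape, a read-only call such as decryptEncryptedKey could be marked
idempotent and retried after a timeout, while an operation that changes state
would fail fast instead of risking a duplicate request.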
> KMS should retry upon SocketTimeoutException
> --------------------------------------------
>
> Key: HADOOP-15655
> URL: https://issues.apache.org/jira/browse/HADOOP-15655
> Project: Hadoop Common
> Issue Type: Bug
> Components: kms
> Affects Versions: 3.1.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Critical
> Attachments: HADOOP-15655.001.patch, HADOOP-15655.002.patch,
> HADOOP-15655.003.patch
>
>
> KMS doesn't retry upon SocketTimeoutException (the SSL connection was
> established, but the handshake timed out).
> {noformat}
> 6:08:55.315 PM WARN KMSClientProvider
> Failed to connect to example.com:16000
> 6:08:55.317 PM WARN LoadBalancingKMSClientProvider
> KMS provider at [https://example.com:16000/kms/v1/] threw an IOException:
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
> at sun.security.ssl.InputRecord.read(InputRecord.java:503)
> at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
> at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
> at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
> at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
> at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
> at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
> at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
> at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186)
> at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:140)
> at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
> at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:333)
> at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:478)
> at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:473)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:472)
> at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:788)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
> at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
> at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
> at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:949)
> at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:338)
> at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:423)
> at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:260)
> at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:151)
> at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:122)
> at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:795)
> at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2036)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> at java.lang.Thread.run(Thread.java:748)
> 6:08:55.346 PM WARN LoadBalancingKMSClientProvider
> Aborting since the Request has failed with all KMS providers(depending on
> hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1)
> in the group OR the exception is not recoverable
> {noformat}