[ 
https://issues.apache.org/jira/browse/HADOOP-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576162#comment-16576162
 ] 

Kitti Nanasi commented on HADOOP-15655:
---------------------------------------

Thanks [~gabor.bota] for the comment! I agree that other possible invocations 
should be tested as well, but in patch v003 I changed the code to affect only 
LoadBalancingKMSClientProvider, so with the newest patch it makes sense to test 
it only in TestLoadBalancingKMSClientProvider. Retrying upon 
SocketTimeoutException in cases other than LoadBalancingKMSClientProvider might 
cause unexpected behaviour.

I also added an "isIdempotent" flag to KMS operations, so it can be passed down 
to the FailoverOnNetworkExceptionRetry policy, which retries on IOExceptions 
only if the operation is idempotent. So the new implementation will retry on 
SocketTimeoutException as well, provided the operation is idempotent.
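
To illustrate the idea, here is a minimal sketch of how an idempotency flag can 
gate the retry decision. It is not the code from the patch and not Hadoop's 
actual FailoverOnNetworkExceptionRetry class: the class name 
RetryDecisionSketch, the Decision enum, and the decide() signature are all 
hypothetical, and the rule that a ConnectException is always safe to retry is a 
simplifying assumption made for the example.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

/**
 * Hypothetical, simplified stand-in for a failover retry decision.
 * This is an illustration only, not Hadoop's retry policy implementation.
 */
final class RetryDecisionSketch {

  enum Decision { FAIL, FAILOVER_AND_RETRY }

  /**
   * @param e            the exception thrown by the failed call
   * @param isIdempotent whether repeating the operation is safe
   * @param failovers    how many failovers have already been attempted
   * @param maxFailovers upper bound on failovers (assumed configuration value)
   */
  static Decision decide(Exception e, boolean isIdempotent,
                         int failovers, int maxFailovers) {
    if (failovers >= maxFailovers) {
      return Decision.FAIL; // retry budget exhausted
    }
    if (e instanceof ConnectException) {
      // Assumption: the connection was never made, so the request cannot
      // have been executed and a retry is safe even if not idempotent.
      return Decision.FAILOVER_AND_RETRY;
    }
    if (e instanceof IOException) {
      // SocketTimeoutException is an IOException. The request may or may not
      // have been processed, so retry only when the operation is idempotent.
      return isIdempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
    }
    return Decision.FAIL; // anything else is treated as not recoverable
  }

  public static void main(String[] args) {
    Exception timeout = new SocketTimeoutException("Read timed out");
    System.out.println("idempotent op:     " + decide(timeout, true, 0, 1));
    System.out.println("non-idempotent op: " + decide(timeout, false, 0, 1));
  }
}
```

With a rule of this shape, a read-like call such as decryptEncryptedKey would 
be retried on a read timeout only if it is marked idempotent, while an 
operation that is not marked idempotent would fail fast.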

> KMS should retry upon SocketTimeoutException
> --------------------------------------------
>
>                 Key: HADOOP-15655
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15655
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 3.1.0
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Critical
>         Attachments: HADOOP-15655.001.patch, HADOOP-15655.002.patch, HADOOP-15655.003.patch
>
>
> KMS doesn't retry upon SocketTimeoutException (the SSL connection was 
> established, but the handshake timed out).
> {noformat}
> 6:08:55.315 PM        WARN    KMSClientProvider       Failed to connect to example.com:16000
> 6:08:55.317 PM        WARN    LoadBalancingKMSClientProvider  KMS provider at [https://example.com:16000/kms/v1/] threw an IOException: 
> java.net.SocketTimeoutException: Read timed out
>       at java.net.SocketInputStream.socketRead0(Native Method)
>       at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>       at java.net.SocketInputStream.read(SocketInputStream.java:171)
>       at java.net.SocketInputStream.read(SocketInputStream.java:141)
>       at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
>       at sun.security.ssl.InputRecord.read(InputRecord.java:503)
>       at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
>       at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
>       at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
>       at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
>       at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
>       at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
>       at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
>       at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186)
>       at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:140)
>       at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
>       at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:333)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:478)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:473)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:472)
>       at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:788)
>       at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
>       at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
>       at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
>       at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
>       at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
>       at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
>       at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
>       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
>       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:949)
>       at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:338)
>       at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:423)
>       at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:260)
>       at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:151)
>       at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:122)
>       at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:795)
>       at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2036)
>       at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
>       at java.lang.Thread.run(Thread.java:748)
> 6:08:55.346 PM        WARN    LoadBalancingKMSClientProvider  Aborting since the Request has failed with all KMS providers(depending on hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1) in the group OR the exception is not recoverable
> {noformat}


