nkeywal created HBASE-6751:
------------------------------

             Summary: Too many retries, leading to a delay reading the HLog 
after a datanode failure
                 Key: HBASE-6751
                 URL: https://issues.apache.org/jira/browse/HBASE-6751
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
    Affects Versions: 0.94.0, 0.96.0
            Reporter: nkeywal


When reading an HLog, we need to go to the last block to get the file size.
In HDFS 1.0.3, this leads to HDFS-3701 / HBASE-6401.

In HDFS branch-2, this bug is fixed, but we have two other issues:

1) For a simple case such as a single dead datanode, we don't see the effect of 
HDFS-3703, and the default location order leads us to try to connect to the dead 
datanode when we should not. This is not analysed yet; a specific JIRA will be 
created later.
2) If we are redirected to the wrong node, we experience a huge delay:

The pseudo code in DFSInputStream#readBlockLength is:

{noformat}
    for (DatanodeInfo datanode : locatedblock.getLocations()) {
      try {
        ClientDatanodeProtocol cdp = DFSUtil.createClientDatanodeProtocolProxy(
            datanode, dfsClient.conf, dfsClient.getConf().socketTimeout,
            dfsClient.getConf().connectToDnViaHostname, locatedblock);

        return cdp.getReplicaVisibleLength(locatedblock.getBlock());
      } catch (IOException ioe) {
        // retry with the next datanode in the list
      }
    }
{noformat}

However, with this code, the connection is created with a null RetryPolicy, 
which then defaults to 10 retries, per:

{noformat}
  public static final String IPC_CLIENT_CONNECT_MAX_RETRIES_KEY =
      "ipc.client.connect.max.retries";
  public static final int IPC_CLIENT_CONNECT_MAX_RETRIES_DEFAULT = 10;
{noformat}

So if the first datanode is bad, we will try it 10 times before trying the 
second. In the context of HBASE-6738, the split task is cancelled before we have 
opened the file to split.

By nature, it's likely a pure HDFS issue, but maybe it can be worked around in 
HBase with the right setting for "ipc.client.connect.max.retries".
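A possible workaround sketch, assuming the setting is actually picked up by the 
ClientDatanodeProtocol proxy (which is exactly what would need to be verified): 
lowering the IPC connect retries so a dead datanode is abandoned after one 
attempt.

```xml
<!-- Hypothetical workaround (e.g. in hbase-site.xml); whether this value
     reaches the datanode proxy connection created by readBlockLength is
     an assumption to be verified, not a confirmed fix. -->
<property>
  <name>ipc.client.connect.max.retries</name>
  <value>0</value>
</property>
```

The trade-off is that 0 retries also applies to transient connection failures, 
so a short non-zero value may be safer.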

The ideal fix (in HDFS) would be to try each datanode once, and only then loop 
up to 10 times over the whole list.
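The proposed ordering can be sketched as below. This is not the actual HDFS 
code; the {{LengthFetcher}} interface and the node names are stand-ins for the 
per-datanode {{getReplicaVisibleLength}} RPC, used here only to show that a 
dead first datanode costs one attempt per round instead of 10 consecutive 
retries.

```java
import java.util.Arrays;
import java.util.List;

public class RoundRobinRetry {

    // Stand-in for the per-datanode RPC (a real fix would call
    // cdp.getReplicaVisibleLength() here); throws for dead nodes.
    interface LengthFetcher {
        long fetch(String datanode) throws Exception;
    }

    static long readBlockLength(List<String> datanodes, LengthFetcher fetcher,
                                int maxRounds) throws Exception {
        Exception last = null;
        for (int round = 0; round < maxRounds; round++) {
            for (String dn : datanodes) {   // one attempt per node per round
                try {
                    return fetcher.fetch(dn);
                } catch (Exception e) {
                    last = e;               // remember and move to the next node
                }
            }
        }
        throw last;                         // every node failed in every round
    }

    public static void main(String[] args) throws Exception {
        // The first datanode is "dead": with this ordering the second node
        // is reached in the first round rather than after 10 failed retries.
        List<String> nodes = Arrays.asList("dead-dn", "live-dn");
        long len = readBlockLength(nodes, dn -> {
            if (dn.equals("dead-dn")) throw new Exception("connect timeout");
            return 1024L;
        }, 10);
        System.out.println("visible length = " + len);
    }
}
```

With the current HDFS code, the inner retry happens inside the IPC layer on a 
single node; the sketch simply moves the loop over nodes inside the loop over 
attempts.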



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
