If read error while lease is being recovered, client reverts to stale view on
block info
----------------------------------------------------------------------------------------
Key: HDFS-2296
URL: https://issues.apache.org/jira/browse/HDFS-2296
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Affects Versions: 0.20-append, 0.22.0, 0.23.0
Reporter: stack
Priority: Critical
We are seeing the following issue around recoverLease over in hbaselandia.
DFSClient calls recoverLease to assume ownership of a file. The recoverLease
call returns to the client, but it can take time for the new state to
propagate. In the meantime, an incoming read fails even though it is using
updated block info. Thereafter, all read retries fail because on exception we
revert to the stale block view and never recover.
Laxman first reported this issue in the following mailing-list thread:
http://search-hadoop.com/m/S1mOHFRmgk2/%2527FW%253A+Handling+read+failures+during+recovery%2527&subj=FW+Handling+read+failures+during+recovery
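
The failure sequence described above can be sketched as a toy model. This is
not DFSClient code: the Cluster class, the readWithRetries method, and the use
of a single generation stamp to stand in for "block info" are all assumptions
made for illustration. The refetchOnFailure flag shows one possible remedy
(asking for fresh block info instead of reverting), which is not something this
report itself proposes.

```java
// Toy model only: none of these classes or methods exist in Hadoop.
final class StaleBlockViewSketch {

    /** Hypothetical stand-in for the NameNode/DataNode state of one block. */
    static final class Cluster {
        private long generation = 1;        // assumed stand-in for "block info"
        private boolean recovering = false;

        /** Returns immediately, but the new state is not readable yet. */
        void recoverLease() {
            generation++;
            recovering = true;
        }

        /** The new state has propagated; reads with the new stamp now work. */
        void finishRecovery() {
            recovering = false;
        }

        /** What a fresh block-info lookup would return. */
        long fetchGeneration() {
            return generation;
        }

        /** A read succeeds only after recovery, and only with the current stamp. */
        boolean read(long clientGeneration) {
            return !recovering && clientGeneration == generation;
        }
    }

    /** Returns true if any attempt succeeds within maxAttempts. */
    static boolean readWithRetries(boolean refetchOnFailure, int maxAttempts) {
        Cluster cluster = new Cluster();
        long staleView = cluster.fetchGeneration(); // cached when the file was opened
        cluster.recoverLease();
        long view = cluster.fetchGeneration();      // updated block info

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (cluster.read(view)) {
                return true;
            }
            // Failure handling: reverting to the stale view is the reported bug.
            view = refetchOnFailure ? cluster.fetchGeneration() : staleView;
            if (attempt == 1) {
                // Simulate recovery completing right after the first failed read.
                cluster.finishRecovery();
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("revert to stale view: " + readWithRetries(false, 5));
        System.out.println("refetch block info:   " + readWithRetries(true, 5));
    }
}
```

In this model the first read fails in both cases because recovery is still in
progress. The reverting client then retries with the old stamp forever, while
the refetching client succeeds on its second attempt.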
Chatting w/ Hairong offline, she suggests this is a general issue around lease
recovery no matter how it is triggered (new recoverLease or not).
I marked this critical. At least over in hbase it is, since we get stuck
here recovering a crashed server.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira