[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2012-08-23 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440578#comment-13440578
 ]

Lars Hofhansl commented on HBASE-4177:
--

That is superseded by all of N's work, correct?

 Handling read failures during recovery - when HMaster calls Namenode
 recovery, recovery may be a failure leading to read failure while splitting
 logs
 --

 Key: HBASE-4177
 URL: https://issues.apache.org/jira/browse/HBASE-4177
 Project: HBase
 Issue Type: Bug
 Components: master
 Reporter: ramkrishna.s.vasudevan
 Assignee: ramkrishna.s.vasudevan
 Priority: Critical

 As discussed in the mailing list thread titled
 'Handling read failures during recovery', we found the following problem.
 As part of log splitting, the HMaster calls Namenode recovery. The recovery is
 an asynchronous process.
 In HDFS
 ===
 Even though the client gets the updated block info from the Namenode on the
 first read failure, it discards the new info and keeps using the old info
 to retrieve the data from the datanode. So all the read
 retries fail. [Method parameter reassignment - not reflected in the
 caller.]
 In HBASE
 ===
 In the HMaster code we wait for 1 sec. But if the recovery fails,
 the log split may not happen, which may lead to data loss.
 So we may need to decide on the actual delay to introduce
 once the HMaster calls NN recovery.
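
 The bracketed note in the HDFS part refers to a general Java pitfall: reassigning a method parameter only rebinds the callee's local copy of the reference, so the caller keeps the old object. The following is a minimal sketch of that pattern, not the actual HDFS client code; `BlockInfo`, `refreshBuggy` and `refreshFixed` are hypothetical names chosen for illustration.

```java
public class ParameterReassignmentSketch {
    /** Hypothetical stand-in for block location info; not an HDFS class. */
    public static final class BlockInfo {
        public final long generationStamp;

        public BlockInfo(long generationStamp) {
            this.generationStamp = generationStamp;
        }
    }

    /** Buggy pattern: the reassignment only changes the local reference. */
    public static void refreshBuggy(BlockInfo block) {
        // The caller's variable still points at the old object after this returns.
        block = new BlockInfo(block.generationStamp + 1);
    }

    /** One possible fix: return the fresh info so the caller can adopt it. */
    public static BlockInfo refreshFixed(BlockInfo block) {
        return new BlockInfo(block.generationStamp + 1);
    }

    public static void main(String[] args) {
        BlockInfo stale = new BlockInfo(1);
        refreshBuggy(stale);
        System.out.println("after buggy refresh: " + stale.generationStamp); // prints 1

        BlockInfo fresh = refreshFixed(stale);
        System.out.println("after fixed refresh: " + fresh.generationStamp); // prints 2
    }
}
```

 With the buggy variant every retry would keep reading with the stale info, which matches the symptom described above.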

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2012-08-23 Thread nkeywal (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440652#comment-13440652
 ]

nkeywal commented on HBASE-4177:


Hmm, it's really close to what I've done, but this problem may still be there.
Ram, what do you think? If you don't have the time, I can give it a try.





[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2012-08-23 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440919#comment-13440919
 ]

ramkrishna.s.vasudevan commented on HBASE-4177:
---

@N
I too think the problem is still there, but we have not started working on it
internally yet. When we discussed it earlier, we concluded that the HDFS side
also needs some changes, and Stack has already raised that in an HDFS JIRA.
Sure, you can take a stab at it, N.





[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2012-02-02 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199033#comment-13199033
 ]

ramkrishna.s.vasudevan commented on HBASE-4177:
---

Any suggestions on this? We run into this problem every now and then.





[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2011-08-29 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092981#comment-13092981
 ]

stack commented on HBASE-4177:
--

I created HDFS-2296 at Hairong's suggestion.





[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2011-08-29 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093003#comment-13093003
 ]

ramkrishna.s.vasudevan commented on HBASE-4177:
---

@Stack
Thanks for tracking this and raising an issue for the same in HDFS.





[jira] [Commented] (HBASE-4177) Handling read failures during recovery - when HMaster calls Namenode recovery, recovery may be a failure leading to read failure while splitting logs

2011-08-08 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081098#comment-13081098
 ]

Ted Yu commented on HBASE-4177:
---

Looking at FSUtils.recoverFileLease(), we check the type of fs inside the while
loop. This is unnecessary.

w.r.t. soft limit for the lease, we have:
{code}
  if (waitedFor > FSConstants.LEASE_SOFTLIMIT_PERIOD) {
    LOG.warn("Waited " + waitedFor + "ms for lease recovery on " + p +
      ":" + e.getMessage());
  }
{code}
I think we should wait for the remainder of the soft limit (which is 60 seconds).
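
One way to read this suggestion is a retry loop whose total wait is bounded by the soft limit rather than by a fixed 1 second pause. The sketch below is only an illustration under that assumption and is not the FSUtils code: `waitForRecovery` and the `tryRecover` callback are hypothetical names, and the 60 second constant simply mirrors the figure quoted above.

```java
import java.util.function.BooleanSupplier;

public class LeaseRecoveryWaitSketch {
    /** Soft limit mentioned in the comment above (60 seconds), in milliseconds. */
    public static final long LEASE_SOFTLIMIT_MS = 60_000L;

    /**
     * Retries a recovery attempt until it succeeds or the soft limit has elapsed.
     * tryRecover is a hypothetical callback standing in for the real recovery call.
     */
    public static boolean waitForRecovery(BooleanSupplier tryRecover,
                                          long pauseMs, long softLimitMs) {
        long start = System.nanoTime();
        while (true) {
            if (tryRecover.getAsBoolean()) {
                return true;
            }
            long waitedMs = (System.nanoTime() - start) / 1_000_000L;
            long remainingMs = softLimitMs - waitedMs;
            if (remainingMs <= 0) {
                return false; // soft limit used up without a successful recovery
            }
            try {
                // Never sleep past the remainder of the soft limit.
                Thread.sleep(Math.min(pauseMs, remainingMs));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated recovery that succeeds on the third attempt.
        boolean ok = waitForRecovery(() -> ++calls[0] >= 3, 10L, LEASE_SOFTLIMIT_MS);
        System.out.println("recovered=" + ok + " after " + calls[0] + " attempts");
    }
}
```

The caller would then only start splitting once the method returns true, and treat false as a recovery failure instead of proceeding.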

