[jira] [Commented] (HBASE-7242) Use Runtime.exit() instead of Runtime.halt() upon HLog Sync failures

2012-12-01 Thread Amitanand Aiyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508131#comment-13508131
 ] 

Amitanand Aiyer commented on HBASE-7242:


Yes, that's true.

But we only try to do that if we are not requesting an abort and the fs is OK.
(Although this state is maintained by HRegionServer, we can update it in
this code path if we want to skip immediately.)
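
The guard described above can be sketched roughly as follows. This is only an
illustration of the idea, not HBase code: the class ShutdownSketch, the fields
abortRequested and fsOk, and the method names are all hypothetical stand-ins
for the state that HRegionServer maintains.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from HBase itself.
class ShutdownSketch {
    // Stand-ins for the abort and filesystem state kept by the region server.
    volatile boolean abortRequested = false;
    volatile boolean fsOk = true;

    // Records what the shutdown sequence did, so the behavior is observable.
    final List<String> actions = new ArrayList<>();

    // Called on the HLog sync failure path: update the state first so that
    // the shutdown sequence skips the flush immediately.
    void onHLogSyncFailure() {
        abortRequested = true;
        fsOk = false;
    }

    // What a shutdown hook might do: flush/close regions only when the
    // server is not aborting and the filesystem is usable, then always
    // close the ZK connection.
    void runShutdownSequence() {
        if (!abortRequested && fsOk) {
            actions.add("flush-and-close-regions");
        }
        actions.add("close-zk-connection");
    }
}
```

With this shape, a sync failure followed by the shutdown sequence goes straight
to closing the ZK connection, while a clean shutdown still flushes first.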

> Use Runtime.exit() instead of Runtime.halt() upon HLog Sync failures
> 
>
> Key: HBASE-7242
> URL: https://issues.apache.org/jira/browse/HBASE-7242
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Amitanand Aiyer
> Priority: Minor
>
> Hey Guys,
>   Should we use Runtime.exit() instead of Runtime.halt() when we fail an HLog
> sync?
>  The key difference is that Runtime.exit() invokes the shutdown
> hooks, while Runtime.halt() does not.
>  Why we might need this:
>    We had an HDFS name node reboot today on one of our cells, and this caused
> multiple region servers to abort because they could not sync the HLog.
>    However, since multiple RS died simultaneously, this looked like a
> correlated failure to the master. The master waits for the
> Znode to expire, but this could take up to a few minutes after RS death (this
> setting is in place so that we can withstand rack switch reboots, lasting a
> couple of minutes, without region movement).
>   If the shutdown hooks are called, the RS will close the ZK connection, causing
> an immediate Znode expiry. This might help cut down the unavailability, as
> regions can begin to get assigned faster.
>  While we do want to abort on HLog failure, I do not think it would hurt
> to give the JVM a few seconds to shut down gracefully. Please let me know
> if I am missing something.
> Thanks,
> -Amit
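
For reference, the difference between the two calls can be seen with a small
standalone program. This is a minimal sketch, not taken from HBase: the class
ExitVsHaltDemo, the zkClosed flag, and the hook body are hypothetical, and the
hook only pretends to close a ZK session. Runtime.exit(), Runtime.halt(), and
Runtime.addShutdownHook() are the standard JDK methods.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; run with the argument "exit" or "halt".
class ExitVsHaltDemo {
    // Stand-in for "the ZK connection has been closed".
    static final AtomicBoolean zkClosed = new AtomicBoolean(false);

    // Builds the shutdown hook; a real hook would close the ZK session here.
    static Thread newCleanupHook() {
        return new Thread(() -> {
            zkClosed.set(true);
            System.out.println("shutdown hook: ZK session closed");
        }, "cleanup-hook");
    }

    // exit() starts the JVM shutdown sequence, which runs registered hooks;
    // halt() terminates the JVM immediately without running them.
    static void terminate(boolean runHooks, int status) {
        Runtime rt = Runtime.getRuntime();
        if (runHooks) {
            rt.exit(status);
        } else {
            rt.halt(status);
        }
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(newCleanupHook());
        boolean runHooks = args.length > 0 && args[0].equals("exit");
        terminate(runHooks, 1);
    }
}
```

Running it with "exit" should print the hook message before the process ends,
while running it with "halt" should end the process without that message.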

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7242) Use Runtime.exit() instead of Runtime.halt() upon HLog Sync failures

2012-11-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507089#comment-13507089
 ] 

stack commented on HBASE-7242:
--

What Kannan said (though if the abort flag is set, they might skip doing this)?



[jira] [Commented] (HBASE-7242) Use Runtime.exit() instead of Runtime.halt() upon HLog Sync failures

2012-11-29 Thread Kannan Muthukkaruppan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506993#comment-13506993
 ] 

Kannan Muthukkaruppan commented on HBASE-7242:
--

Currently, don't the shutdown hooks also try to flush/close the regions before 
closing the ZK connection?
