[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0

2010-10-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921789#action_12921789
 ] 

stack commented on HBASE-2669:
--

Let me fix up this patch and make it apply to trunk.  I think its general drift 
is fine.  What's missing now is a bunch of explanation of how Connections work 
and are shared -- of how the sharing is keyed by Configuration and of how, if 
you want a clean shutdown of your tables, you will need to do the ugly 
HConnectionManager.deleteConnection stuff for now, in 0.90 at least.
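
A minimal sketch of what "sharing keyed by Configuration" implies, using only 
the JDK.  The names here ({{ConnectionCache}}, {{SharedConnection}}, the string 
key) are hypothetical stand-ins for illustration and are not the HBase API; 
the point is only that every caller using the same key gets the same 
connection, so deleting it affects all of them.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache that hands out one shared connection per configuration key. */
final class ConnectionCache {

    /** Hypothetical stand-in for a shared client connection; not an HBase class. */
    static final class SharedConnection {
        private boolean closed = false;

        synchronized void close() {
            closed = true;
        }

        synchronized boolean isClosed() {
            return closed;
        }
    }

    private static final Map<String, SharedConnection> CACHE = new HashMap<>();

    private ConnectionCache() {
    }

    /** Returns the cached connection for this key, creating it on first use. */
    static synchronized SharedConnection getConnection(String confKey) {
        return CACHE.computeIfAbsent(confKey, k -> new SharedConnection());
    }

    /**
     * Closes and forgets the connection for this key.
     * Returns false if no connection was cached under the key.
     */
    static synchronized boolean deleteConnection(String confKey) {
        SharedConnection connection = CACHE.remove(confKey);
        if (connection == null) {
            return false;
        }
        connection.close();
        return true;
    }
}
```

In this sketch a clean shutdown is the caller's job: flush whatever is 
buffered first, then call {{deleteConnection}} with the same key that was used 
to obtain the connection.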

Running tests.

 HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
 -

 Key: HBASE-2669
 URL: https://issues.apache.org/jira/browse/HBASE-2669
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
 Fix For: 0.90.0

 Attachments: 2669.txt


 In my application I set {{hbase.client.write.buffer}} to a reasonably small 
 value (roughly 64 edits) in order to try to batch a few {{Put}} together 
 before talking to HBase.  When my application does a graceful shutdown, I 
 call {{HTable#flushCommits}} in order to flush any pending change to HBase.  
 I want to do the same thing when I get a {{SIGTERM}} by using 
 {{Runtime#addShutdownHook}} but this is impossible since 
 {{HConnectionManager}} already registers a shutdown hook that invokes 
 {{HConnectionManager#deleteAllConnections}}.  This static method closes all 
 the connections to HBase and then all connections to ZooKeeper.  Because all 
 shutdown hooks run in parallel, my hook will attempt to flush edits while 
 connections are getting closed.
 There is no way to guarantee the order in which the hooks will execute, so I 
 propose that we remove the hook in the HCM altogether and provide some 
 user-visible API they call in their own hook after they're done flushing 
 their stuff, if they really want to do a graceful shutdown.  I expect that a 
 lot of users won't use a hook though, otherwise this issue would have cropped 
 up already.  For those users, connections won't get gracefully terminated, 
 but I don't think that would be a problem since the underlying TCP socket 
 will get closed by the OS anyway, so things like ZooKeeper and such should 
 realize that the connection has been terminated and assume the client is 
 gone, and do the necessary clean-up on their side.
 An alternate fix would be to leave the hook in place by default but keep a 
 reference to it and add a user-visible API to be able to un-register the 
 hook.  I find this ugly.
 Thoughts?
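
 A rough sketch of the proposed approach, assuming the library no longer 
 registers its own hook: the application owns a single hook that flushes and 
 then closes on one thread, so no ordering between hooks is needed.  
 {{BufferedClient}} is a hypothetical stand-in for a client with a write 
 buffer and is not an HBase class; only {{Runtime#addShutdownHook}} is a real 
 API here.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderedShutdown {

    /** Hypothetical stand-in for a client-side write buffer; not an HBase class. */
    static final class BufferedClient {
        private final List<String> pending = new ArrayList<>();
        private final List<String> delivered = new ArrayList<>();
        private boolean open = true;

        /** Buffers an edit locally without sending it. */
        synchronized void put(String edit) {
            if (!open) {
                throw new IllegalStateException("connection closed");
            }
            pending.add(edit);
        }

        /** Sends buffered edits; fails if the connection is already closed. */
        synchronized void flush() {
            if (!open) {
                throw new IllegalStateException("connection closed");
            }
            delivered.addAll(pending);
            pending.clear();
        }

        synchronized void close() {
            open = false;
        }

        synchronized List<String> delivered() {
            return new ArrayList<>(delivered);
        }
    }

    private OrderedShutdown() {
    }

    /** Builds one hook that flushes first and only then closes, on a single thread. */
    static Thread newShutdownHook(BufferedClient client) {
        return new Thread(() -> {
            try {
                client.flush();
            } finally {
                client.close();
            }
        }, "flush-then-close");
    }

    /** Registers the hook with the JVM and returns it so it can be removed later. */
    static Thread install(BufferedClient client) {
        Thread hook = newShutdownHook(client);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

 Because flush and close happen sequentially inside one hook, the flush can 
 no longer race with a separate hook that tears the connection down.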

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0

2010-06-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875915#action_12875915
 ] 

stack commented on HBASE-2669:
--

In the old days, we had a similar prob w/ hdfs.  We wanted to run an hbase 
shutdown cleanup hook but hdfs would be running its cleanup at the same time 
and we couldn't guarantee order.

Using reflection, we looked for the hdfs hook; if present, we unregistered it 
but kept a reference, and then in our shutdown hook, after ours was done, we'd 
call the hdfs one.  Let's fix this, benoit.  Mind if I move it out of 0.20.5 
though?  It's a prob. but not the end of the world and I'd like to get a 0.20.5 
rolled today.  Thanks.
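
The unregister-and-chain trick described above can be sketched with the JDK 
alone.  This sketch assumes we already hold a reference to the other hook's 
{{Thread}}; the reflection step that dug it out of hdfs is not shown, and 
{{ChainedShutdown}} and {{chainAfter}} are hypothetical names.

```java
final class ChainedShutdown {

    private ChainedShutdown() {
    }

    /**
     * Deregisters {@code other} and registers a replacement hook that runs
     * {@code ours} first and then the body of {@code other}.
     * Returns the new hook, or null if {@code other} was not registered.
     */
    static Thread chainAfter(Runnable ours, Thread other) {
        // removeShutdownHook returns false if the thread was never registered.
        if (!Runtime.getRuntime().removeShutdownHook(other)) {
            return null;
        }
        Thread chained = new Thread(() -> {
            try {
                ours.run();
            } finally {
                // Invoke the displaced hook's body on this thread, after ours.
                other.run();
            }
        }, "chained-shutdown");
        Runtime.getRuntime().addShutdownHook(chained);
        return chained;
    }
}
```

This only works when the other hook's {{Thread}} is reachable, which is the 
fragile part of the approach.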

 HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
 -

 Key: HBASE-2669
 URL: https://issues.apache.org/jira/browse/HBASE-2669
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
 Fix For: 0.20.5





[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0

2010-06-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875916#action_12875916
 ] 

stack commented on HBASE-2669:
--

Hmm... well yeah, it's a pretty critical prob.  Want to work on this today then?

 HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
 -

 Key: HBASE-2669
 URL: https://issues.apache.org/jira/browse/HBASE-2669
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
 Fix For: 0.20.5





[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0

2010-06-05 Thread Benoit Sigoure (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875963#action_12875963
 ] 

Benoit Sigoure commented on HBASE-2669:
---

As you wrote, Stack, there's no easy way to get to the HCM's hook.  First of 
all, the HCM itself doesn't retain a reference to it, so right now it's just 
impossible to reach.  Even if a reference were retained, the 
{{suppressHdfsShutdownHook}} hack in {{HRegionServer}} seems too ugly and 
fragile to me.  We don't need to expose this internal implementation detail of 
the HCM to the user.  Instead we can just have a static method in {{HTable}} 
that the user can call to perform a graceful shutdown.

But I also doubt this hook is useful at all.  As I said, I'm not sure we need 
to properly close the connections ourselves.  The OS will take care of them 
anyway, and whatever the client was talking to will be notified that the socket 
was closed on the other side.  Maybe we can just remove the hook altogether and 
get away with it.

BTW, sorry I didn't mean to close 0.20.5 with this issue, I'm working with 
{{trunk}} so this has nothing to do with 0.20.5 anyway.  I also noticed that 
{{suppressHdfsShutdownHook}} has gone away in trunk.  Not sure why, but that's 
a good thing.

 HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
 -

 Key: HBASE-2669
 URL: https://issues.apache.org/jira/browse/HBASE-2669
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
 Fix For: 0.21.0

