[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
[ https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921789#action_12921789 ]

stack commented on HBASE-2669:
------------------------------

Let me fix up this patch and make it apply to trunk. I think its general drift is fine. What's missing now is a bunch of explanation of how Connections work and are shared: how the sharing is keyed by Configuration, and how, if you want a clean shutdown of your tables, you will need to do the ugly HConnectionManager.deleteConnection stuff for now, in 0.90 at least. Running tests.

HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
---------------------------------------------------------------------

Key: HBASE-2669
URL: https://issues.apache.org/jira/browse/HBASE-2669
Project: HBase
Issue Type: Bug
Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
Fix For: 0.90.0
Attachments: 2669.txt

In my application I set {{hbase.client.write.buffer}} to a reasonably small value (roughly 64 edits) in order to batch a few {{Put}} together before talking to HBase. When my application does a graceful shutdown, I call {{HTable#flushCommits}} in order to flush any pending change to HBase.

I want to do the same thing when I get a {{SIGTERM}} by using {{Runtime#addShutdownHook}}, but this is impossible since {{HConnectionManager}} already registers a shutdown hook that invokes {{HConnectionManager#deleteAllConnections}}. This static method closes all the connections to HBase and then all connections to ZooKeeper. Because all shutdown hooks run in parallel, my hook will attempt to flush edits while connections are getting closed. There is no way to guarantee the order in which the hooks will execute, so I propose that we remove the hook in the HCM altogether and provide a user-visible API that users call from their own hook once they're done flushing their stuff, if they really want to do a graceful shutdown.

I expect that a lot of users won't use a hook though, otherwise this issue would have cropped up already. For those users, connections won't get gracefully terminated, but I don't think that would be a problem since the underlying TCP socket will get closed by the OS anyway, so things like ZooKeeper should realize that the connection has been terminated, assume the client is gone, and do the necessary clean-up on their side.

An alternate fix would be to leave the hook in place by default but keep a reference to it and add a user-visible API to un-register the hook. I find this ugly.

Thoughts?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
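To illustrate the proposal in the description (no hook inside the library, and the application does its flush and its close from one hook of its own), here is a minimal Java sketch. It uses only the JDK; the class name {{OrderedShutdown}} and its methods are hypothetical and are not part of HBase. The point is only that steps placed in a single hook run sequentially, whereas separate hooks run concurrently in no guaranteed order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an HBase class): runs cleanup steps one after
// another from a single JVM shutdown hook, so their order is deterministic.
final class OrderedShutdown {
    private final List<Runnable> steps = new ArrayList<>();

    // Append a step; steps run in the order they were added.
    synchronized OrderedShutdown then(Runnable step) {
        steps.add(step);
        return this;
    }

    // Run every step sequentially. A failing step is logged and does not
    // prevent the later steps from running.
    synchronized void runAll() {
        for (Runnable step : steps) {
            try {
                step.run();
            } catch (RuntimeException e) {
                System.err.println("shutdown step failed: " + e);
            }
        }
    }

    // Register one hook that runs all steps; the Thread is returned so the
    // caller can later pass it to Runtime#removeShutdownHook if needed.
    Thread register() {
        Thread hook = new Thread(this::runAll, "ordered-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

In a real client the first step would presumably be the {{HTable#flushCommits}} call and the second whatever connection clean-up the client exposes, for instance {{HConnectionManager#deleteAllConnections}} as named above; both are left out here so the sketch stays self-contained.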
[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
[ https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875915#action_12875915 ]

stack commented on HBASE-2669:
------------------------------

In the old days, we had a similar problem with HDFS. We wanted to run an HBase cleanup shutdown hook, but HDFS would be running its own cleanup at the same time and we couldn't guarantee the order. Using reflection, we looked for the HDFS hook; if present, we unregistered it but kept a reference, and then in our shutdown hook, after ours was done, we'd call the HDFS one.

Let's fix this, Benoit. Mind if I move it out of 0.20.5 though? It's a problem, but not the end of the world, and I'd like to get a 0.20.5 rolled today. Thanks.

HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
---------------------------------------------------------------------

Key: HBASE-2669
URL: https://issues.apache.org/jira/browse/HBASE-2669
Project: HBase
Issue Type: Bug
Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
Fix For: 0.20.5
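The HDFS workaround stack describes (unregister the other party's hook, keep a reference to it, and run it only after your own cleanup has finished) can be sketched with plain JDK calls. This is an illustration under assumptions, not the code HBase used: {{DeferredHook}} and its method names are made up, and the reflection step that located the HDFS hook is omitted, so the caller is assumed to already hold the other hook's {{Thread}}.

```java
// Hypothetical helper (not HBase code): defer someone else's shutdown hook
// until our own cleanup has completed.
final class DeferredHook {
    private DeferredHook() {}

    // Build a task that runs `first`, then starts the deferred hook thread
    // and waits for it, so the two never overlap.
    static Runnable runThen(Runnable first, Thread deferred) {
        return () -> {
            try {
                first.run();
            } finally {
                deferred.start(); // the deferred hook must not have been started yet
                try {
                    deferred.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
    }

    // Unregister `existingHook` and register a combined hook in its place.
    // Throws if `existingHook` was not a registered shutdown hook.
    static Thread takeOver(Thread existingHook, Runnable first) {
        Runtime rt = Runtime.getRuntime();
        if (!rt.removeShutdownHook(existingHook)) {
            throw new IllegalStateException("hook was not registered");
        }
        Thread combined = new Thread(runThen(first, existingHook), "combined-shutdown");
        rt.addShutdownHook(combined);
        return combined;
    }
}
```

The fragile part in practice is obtaining the reference to the other hook at all, which is why this approach needed reflection when the owning library did not expose it.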
[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
[ https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875916#action_12875916 ]

stack commented on HBASE-2669:
------------------------------

Hmm... well yeah, it's a pretty critical problem. Want to work on this today then?

HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
---------------------------------------------------------------------

Key: HBASE-2669
URL: https://issues.apache.org/jira/browse/HBASE-2669
Project: HBase
Issue Type: Bug
Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
Fix For: 0.20.5
[jira] Commented: (HBASE-2669) HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
[ https://issues.apache.org/jira/browse/HBASE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875963#action_12875963 ]

Benoit Sigoure commented on HBASE-2669:
---------------------------------------

As you wrote, Stack, there's no easy way to get to the HCM's hook. First of all, the HCM itself doesn't retain a reference to it, so right now it's just impossible to reach. Even if a reference were retained, the {{suppressHdfsShutdownHook}} hack in {{HRegionServer}} seems too ugly and fragile to me. We don't need to expose this internal implementation detail of the HCM to the user. Instead we can just have a static method in {{HTable}} that the user can call to perform a graceful shutdown.

But I also doubt this hook is useful at all. As I said, I'm not sure we need to properly close the connections ourselves. The OS will take care of them anyway, and whatever the client was talking to will be notified that the socket was closed on the other side. Maybe we can just remove the hook altogether and get away with it.

BTW, sorry, I didn't mean to close 0.20.5 with this issue. I'm working with {{trunk}}, so this has nothing to do with 0.20.5 anyway. I also noticed that {{suppressHdfsShutdownHook}} has gone away in trunk. Not sure why, but that's a good thing.

HCM.shutdownHook causes data loss with hbase.client.write.buffer != 0
---------------------------------------------------------------------

Key: HBASE-2669
URL: https://issues.apache.org/jira/browse/HBASE-2669
Project: HBase
Issue Type: Bug
Components: client
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Critical
Fix For: 0.21.0
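For comparison, the library-side options discussed in this thread (a static method the user calls for a graceful shutdown, or a default hook whose reference is retained so it can be un-registered) might look roughly like the sketch below. {{ConnectionRegistry}} and every method on it are hypothetical names chosen for illustration; nothing here reflects the actual {{HTable}} or {{HConnectionManager}} API or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a connection manager (not the HBase API).
final class ConnectionRegistry {
    private static final List<Runnable> closers = new ArrayList<>();
    private static Thread hook; // retained so that it can be un-registered

    private ConnectionRegistry() {}

    // Remember how to close one connection.
    static synchronized void track(Runnable closer) {
        closers.add(closer);
    }

    // Install the default hook once, keeping a reference to it.
    static synchronized void installDefaultHook() {
        if (hook == null) {
            hook = new Thread(ConnectionRegistry::closeAll, "registry-shutdown");
            Runtime.getRuntime().addShutdownHook(hook);
        }
    }

    // Un-register the default hook; returns true if a hook was removed.
    static synchronized boolean removeDefaultHook() {
        if (hook == null) {
            return false;
        }
        boolean removed = Runtime.getRuntime().removeShutdownHook(hook);
        hook = null;
        return removed;
    }

    // Explicit graceful shutdown: close everything tracked so far and
    // return how many connections were closed.
    static synchronized int closeAll() {
        int closed = closers.size();
        for (Runnable closer : closers) {
            closer.run();
        }
        closers.clear();
        return closed;
    }
}
```

A user who wants ordering would call {{removeDefaultHook()}} at start-up and then invoke {{closeAll()}} from their own hook after flushing; with the first option alone, the library would never install a hook and only {{closeAll()}} would exist.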