On Thu, Feb 3, 2011 at 2:57 PM, Ryan Rawson <[email protected]> wrote: > The result was the client never realized that it's session was > actually timed out, and the HBase processes continued to run. Kill -9 > and a restart fixed it.
Hi Ryan, there are two issues at play here, session timeout and session expiration. Correct me if I'm wrong but I think you meant to say "the client never realized that it's session was actually _expired_". Which is correct behavior. Clients can only determine if a session is expired once they reconnect to the cluster. Session timeout on the other hand happens when the server heartbeat is not received by the client w/in the session timeout period. Clients who are disconnected from the cluster will attempt to reconnect back to the cluster until they are successful. When a client is disconnected the client's watchers will be notified about the disconnect. (same for expiration). See questions 1 & 2 here in the faq, specifically "Example state transitions" in question 2: https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ Your clients were stuck btw steps 4 and 5 (which they will never reach in your scenario). Does that help? Patrick
