ZooKeeper: use native threads to avoid GC stalls (JNI integration)
------------------------------------------------------------------

                 Key: HBASE-1316
                 URL: https://issues.apache.org/jira/browse/HBASE-1316
             Project: Hadoop HBase
          Issue Type: Improvement
    Affects Versions: 0.20.0
            Reporter: Andrew Purtell


From Joey Echeverria up on hbase-users@:

We've used ZooKeeper in a write-heavy project we've been working on and
experienced issues similar to what you described. After several days of
debugging, we discovered that our issue was garbage collection. There was no
way to guarantee we wouldn't have long pauses, especially since our environment
was the worst case for garbage collection: millions of tiny, short-lived
objects. I suspect HBase sees similar workloads frequently, if not
constantly. With anything shorter than a 30-second session timeout, we got
session expiration events extremely frequently. We needed to use 60 seconds for
any real confidence that an ephemeral node disappearing meant something was
unavailable.
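
As a rough illustration of the timeout trade-off described above, here is a
minimal Java sketch. It assumes a deliberately simplified model: a session
expires when a single stop-the-world pause lasts at least as long as the
session timeout. The class name SessionModel, the method expires, and the
pause values are hypothetical, and the model ignores the time since the last
heartbeat before a pause began, so it understates the real risk.

```java
// Simplified model (an assumption, not ZooKeeper code): while the JVM is
// paused, a Java client cannot send heartbeats, so a pause that lasts at
// least the session timeout is treated as a session expiration.
final class SessionModel {
    private SessionModel() {}

    /** Returns true if any single pause is at least the session timeout. */
    static boolean expires(long sessionTimeoutMs, long[] pauseMs) {
        for (long pause : pauseMs) {
            if (pause >= sessionTimeoutMs) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical GC pause lengths in milliseconds, for illustration only.
        long[] pauses = {2_000, 12_000, 45_000};
        System.out.println(expires(30_000, pauses)); // prints "true"
        System.out.println(expires(60_000, pauses)); // prints "false"
    }
}
```

Under this model, raising the timeout only moves the threshold: any pause
longer than the new value still expires the session, and a real failure takes
that much longer to detect.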

We really wanted quick recovery, so we ended up writing a lightweight wrapper
around the C API and used SWIG to auto-generate a JNI interface. It's not
perfect, but since we switched to this method we've never seen a session
expiration event, and ephemeral nodes only disappear when there are network
issues or a machine/process goes down.

I don't know if it's worth doing the same kind of thing for HBase as it adds 
some "unnecessary" native code, but it's a solution that I found works.

