[ 
https://issues.apache.org/jira/browse/HBASE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726758#action_12726758
 ] 

Nitay Joffe commented on HBASE-1316:
------------------------------------

I've started looking into this. I have a minimal JNI binding to ZooKeeper based 
off of Joey's work. It doesn't use swig, as I think that adds an unnecessary 
dependency.

The question, as Joey mentions, is where do we want to put the gap between JNI 
ZooKeeper and Java ZooKeeper?
On one hand, we can just have the JNI binding handle ephemeral nodes to reduce 
session expired events from Java GC starving heartbeats. On the other hand we 
can try making the JNI binding handle everything and not have a Java ZooKeeper 
handle at all, but that might be ugly with the watcher events going back and 
forth between C <=> Java.
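
For illustration only, the first option (JNI owning just the session and the 
ephemeral nodes) could look roughly like this on the Java side. Every name 
below (EphemeralNodeKeeper, NativeKeeper, InMemoryKeeper, the example path) is 
hypothetical and is not taken from the attached zk_wrapper or any existing 
binding. The native methods are declarations only, and the in-memory class is 
just a stand-in so the boundary can be exercised without native code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class EphemeralBoundarySketch {

    /** Hypothetical narrow boundary: the native side owns the session and ephemeral nodes only. */
    interface EphemeralNodeKeeper {
        void createEphemeral(String path, byte[] data);
        void deleteEphemeral(String path);
        boolean isRegistered(String path);
    }

    /**
     * Shape a JNI-backed implementation might take. These are declarations
     * only: no native library is loaded here, so calling them would throw
     * UnsatisfiedLinkError.
     */
    static final class NativeKeeper implements EphemeralNodeKeeper {
        @Override public native void createEphemeral(String path, byte[] data);
        @Override public native void deleteEphemeral(String path);
        @Override public native boolean isRegistered(String path);
    }

    /** Pure-Java stand-in that only tracks which paths were registered. */
    static final class InMemoryKeeper implements EphemeralNodeKeeper {
        private final Set<String> paths = ConcurrentHashMap.newKeySet();

        @Override public void createEphemeral(String path, byte[] data) {
            paths.add(path);
        }

        @Override public void deleteEphemeral(String path) {
            paths.remove(path);
        }

        @Override public boolean isRegistered(String path) {
            return paths.contains(path);
        }
    }

    public static void main(String[] args) {
        EphemeralNodeKeeper keeper = new InMemoryKeeper();
        String path = "/hbase/rs/example-host"; // hypothetical znode path
        keeper.createEphemeral(path, new byte[0]);
        System.out.println(keeper.isRegistered(path)); // prints true
        keeper.deleteEphemeral(path);
        System.out.println(keeper.isRegistered(path)); // prints false
    }
}
```

With a boundary this narrow, no watcher events would need to cross between C 
and Java; everything else would stay on the existing Java ZooKeeper handle.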

What do you guys think?

> ZooKeeper: use native threads to avoid GC stalls (JNI integration)
> ------------------------------------------------------------------
>
>                 Key: HBASE-1316
>                 URL: https://issues.apache.org/jira/browse/HBASE-1316
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Andrew Purtell
>            Assignee: Nitay Joffe
>         Attachments: zk_wrapper.tar.gz
>
>
> From Joey Echeverria up on hbase-users@:
> We've used zookeeper in a write-heavy project we've been working on and 
> experienced issues similar to what you described. After several days of 
> debugging, we discovered that our issue was garbage collection. There was no 
> way to guarantee we wouldn't have long pauses especially since our 
> environment was the worst case for garbage collection, millions of tiny, 
> short lived objects. I suspect HBase sees similar work loads frequently, if 
> it's not constantly. With anything shorter than a 30 second session time out, 
> we got session expiration events extremely frequently. We needed to use 60 
> seconds for any real confidence that an ephemeral node disappearing meant 
> something was unavailable.
> We really wanted quick recovery so we ended up writing a light-weight wrapper 
> around the C API and used swig to auto-generate a JNI interface. It's not 
> perfect, but since we switched to this method we've never seen a session 
> expiration event and ephemeral nodes only disappear when there are network 
> issues or a machine/process goes down.
> I don't know if it's worth doing the same kind of thing for HBase as it adds 
> some "unnecessary" native code, but it's a solution that I found works.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
