On Sat, Jun 30, 2012 at 10:27 AM, Ryan Rawson <[email protected]> wrote:
> On Fri, Jun 29, 2012 at 5:04 PM, Todd Lipcon <[email protected]> wrote:
>>> I just posted a pretty early skeleton(
>>> https://issues.apache.org/jira/browse/HBASE-2182) on what I think a netty
>>> based hbase client/server could look like.
>>> Pros:
>>> - Faster
>>> - Giraph got a 3x perf improvement by dropping hadoop rpc
>> What's the reference for this? The 3x perf I heard about from Giraph was
>> from switching to using LMAX's Disruptor instead of queues, internally. We
>> could do the same, but I'm not certain the model works well for our use
>> cases where the RPC processing can end up blocked on disk access, etc.
>>> - Asynchbase trounces our client when JD benchmarked them
>>
>> I'm still convinced that the majority of this has to do with the way our
>> batching happens to the server, not async vs sync. (in the current sync
>> client, once we fill up the buffer, we "flush" from the same thread, and
>> block the flush until all buffered edits have made it, vs doing it in the
>> background). We could fix this without going to a fully async model.
>
> I also agree here, if you do the a priori code analysis, it becomes
> obvious that the issue is that slower regionservers can hold up entire
> batches even if 90%+ of the Puts were already acked...

fwiw, I had something roughly similar in mind (work in the background instead
of waiting for the result of the first part). I created HBASE-6295 to detail
what I was thinking about.

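To make the "flush in the background" idea concrete, here is a rough sketch. It is not the HBase client code and not what HBASE-6295 proposes. The class name BackgroundFlushBuffer and the sender callback are made up for illustration, with the callback standing in for whatever RPC actually ships a batch to the regionservers. The only point is that the thread calling add() swaps in a fresh buffer and keeps going, instead of blocking until every buffered edit is acked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical write buffer for illustration only; not an HBase API. */
class BackgroundFlushBuffer<T> implements AutoCloseable {
    private final int capacity;
    // Stand-in for the RPC that ships one batch to the server (assumption).
    private final Consumer<List<T>> sender;
    // A single flusher thread keeps batches in submission order.
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();
    private List<T> pending = new ArrayList<>();

    BackgroundFlushBuffer(int capacity, Consumer<List<T>> sender) {
        this.capacity = capacity;
        this.sender = sender;
    }

    /** Buffers an edit; a full buffer is handed off without blocking the caller. */
    public synchronized void add(T edit) {
        pending.add(edit);
        if (pending.size() >= capacity) {
            handOff();
        }
    }

    // Caller must hold the lock. Swaps in a fresh buffer and queues the old one.
    private void handOff() {
        if (pending.isEmpty()) {
            return;
        }
        final List<T> batch = pending;
        pending = new ArrayList<>();
        flusher.submit(() -> sender.accept(batch));
    }

    /** Sends whatever is left and waits for the queued batches to finish. */
    @Override
    public void close() {
        synchronized (this) {
            handOff();
        }
        flusher.shutdown();
        try {
            flusher.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a capacity of 2, adding "a", "b", "c" and then calling close() results in two sender calls, [a, b] followed by [c]. The sketch leaves out everything a real client would need: retries, reporting failed edits back to the caller, and a bound on the number of queued batches so a slow server cannot make memory grow without limit.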