Sorry, I only alluded to it in the bullet point about the filter model. I would imagine that as a filter (or two) in the channel stack. It's honestly something that I haven't gotten to looking at in depth yet.
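
A minimal sketch of the "security as a filter in the stack" idea, in plain Java rather than actual Netty code. Every name here (Request, RpcFilter, FilterStack, SecurityFilter, LoggingFilter, StatsFilter) is hypothetical and does not come from the HBASE-2182 patch, and the security check is only a stand-in for real SASL/Kerberos negotiation:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical request type; a real server would carry a protobuf message. */
final class Request {
    final String user;   // null stands for "not authenticated" in this sketch
    final String method;
    Request(String user, String method) { this.user = user; this.method = method; }
}

/** One stage in the stack. Returning false stops processing of the request. */
interface RpcFilter {
    boolean handle(Request req);
}

/** Stand-in for a SASL/Kerberos check: rejects requests that carry no user. */
final class SecurityFilter implements RpcFilter {
    public boolean handle(Request req) { return req.user != null; }
}

/** Records one line per request that reaches it. */
final class LoggingFilter implements RpcFilter {
    final List<String> lines = new ArrayList<>();
    public boolean handle(Request req) {
        lines.add(req.user + " -> " + req.method);
        return true;
    }
}

/** Counts the requests that reach it. */
final class StatsFilter implements RpcFilter {
    int calls = 0;
    public boolean handle(Request req) { calls++; return true; }
}

/** Runs filters in insertion order and stops at the first rejection. */
final class FilterStack {
    private final List<RpcFilter> filters = new ArrayList<>();
    FilterStack add(RpcFilter f) { filters.add(f); return this; }
    boolean process(Request req) {
        for (RpcFilter f : filters) {
            if (!f.handle(req)) return false;
        }
        return true;
    }
}

class FilterStackDemo {
    public static void main(String[] args) {
        StatsFilter stats = new StatsFilter();
        FilterStack stack = new FilterStack()
                .add(new SecurityFilter())   // security first, so later filters only see accepted calls
                .add(new LoggingFilter())
                .add(stats);
        System.out.println(stack.process(new Request("alice", "get")));
        System.out.println(stack.process(new Request(null, "put")));
        System.out.println(stats.calls);
    }
}
```

Because the security filter sits first, a rejected request never reaches the logging or stats filters. Whether rejected calls should still be logged or counted is an ordering decision, not something the filter model settles by itself.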

On Fri, Jun 29, 2012 at 5:34 PM, Andrew Purtell <[email protected]> wrote:

> Without SASL/krb/security integration with the rest of Hadoop this would
> be a nonstarter for us. I didn't see that mentioned?
>
> On Jun 29, 2012, at 5:04 PM, Todd Lipcon <[email protected]> wrote:
>
> > A few inline notes below:
> >
> > On Fri, Jun 29, 2012 at 4:42 PM, Elliott Clark <[email protected]> wrote:
> >
> >> I just posted a pretty early skeleton
> >> (https://issues.apache.org/jira/browse/HBASE-2182) on what I think a
> >> netty based hbase client/server could look like.
> >>
> >> Pros:
> >>
> >> - Faster
> >>   - Giraph got a 3x perf improvement by dropping hadoop rpc
> >>
> >
> > What's the reference for this? The 3x perf I heard about from Giraph was
> > from switching to using LMAX's Disruptor instead of queues, internally.
> > We could do the same, but I'm not certain the model works well for our
> > use cases where the RPC processing can end up blocked on disk access, etc.
> >
> >>   - Asynchbase trounces our client when JD benchmarked them
> >>
> >
> > I'm still convinced that the majority of this has to do with the way our
> > batching happens to the server, not async vs sync. (In the current sync
> > client, once we fill up the buffer, we "flush" from the same thread, and
> > block the flush until all buffered edits have made it, vs doing it in the
> > background.) We could fix this without going to a fully async model.
> >
> >> - Could encourage things to be a little more modular if everything
> >>   isn't hanging directly off of HRegionServer
> >>
> >
> > Sure, but not sure I see why this is Netty vs not-Netty.
> >
> >> - Netty is better about thread usage than hadoop rpc server.
> >>
> >
> > Can you explain further?
> >
> >> - Pretty easy to define an rpc protocol after all of the work on
> >>   protobuf (Thanks everyone)
> >> - Decoupling the rpc server library from the hadoop library could allow
> >>   us to rev the server code easier.
> >> - The filter model is very easy to work with.
> >>   - Security can be just a single filter.
> >>   - Logging can be another.
> >>   - Stats can be another.
> >>
> >> Cons:
> >>
> >> - Netty and non apache rpc servers don't play well together. They
> >>   might be able to but I haven't gotten there yet.
> >>
> >
> > What do you mean "non apache rpc servers"?
> >
> >> - Complexity
> >>   - Two different servers in the src
> >>   - Confusing users who don't know which to pick
> >> - Non-blocking could make the client harder to write.
> >>
> >>
> >> I'm really just trying to gauge what people think of the direction and
> >> if it's still something that is wanted. The code is a loooooong way from
> >> even being a tech demo, and I'm not a netty expert, so suggestions would
> >> be welcomed.
> >>
> >> Thoughts? Are people interested in this? Should I push this to my
> >> github so others can help?
> >>
> >
> > IMO, I'd want to see a noticeable perf difference from the change -
> > unfortunately it would take a fair amount of work to get to the point
> > where you could benchmark it. But if you're willing to spend the time to
> > get to that point, seems worth investigating.
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
