Thanks, tsuna, for the detailed response. I totally agree with your points.

> No it doesn't make sense. HBase's jar actually contains a
> copy-pasted-hacked version of the Hadoop RPC code. Most of the RPC
> stuff happens inside the HBase jar, it only uses some helper functions
> from the Hadoop jar.

In light of the above, I think it would be a worthwhile improvement to decouple the client from the Hadoop jar. If these helper functions are also moved into the HBase jar (perhaps repackaged as well), the layering becomes clean: client -> HBase (& ZK) -> HDFS. (Should we open a JIRA for this?) Of course, this becomes really powerful once the client starts adapting to different versions of HBase. I see that we are already making incremental changes towards this (https://issues.apache.org/jira/browse/HBASE-3581), which is great.

> Yeah that's idea, change the on-wire protocol, just keep the logic in
> the thick client. Personally I just hope whichever RPC protocol ends
> up replacing the horrible Hadoop RPC in HBase will have support for
> asynchronous / non-blocking operations both on the client side and on
> the server side, so we can finally move away from the inefficient
> model with thread pools containing hundreds of threads.

Given that an Avro-based HBase now looks further away (based on the discussions on this thread), I think that incrementally making the HBase Java client version-aware would be something to follow / focus on. I emphasize _incrementally_ because a full-scale redesign may be hard to do. I don't know whether there are JIRAs already out there that cover this ...

Thanks,
--Suraj
