Chad Walters wrote:
> 3. More performance work: Michael did some performance measurements a while back that seemed to indicate a lot of time spent back-and-forth in RPC. We're exploring Thrift as a lighter-weight RPC mechanism, but there are probably other things to be done to reduce this cost. More analysis and measurement would be helpful.
Hmmm. Hadoop's RPC has its shortcomings, but I wouldn't call it heavy-weight or low-performance. And Thrift has advantages over Hadoop RPC, but I have not heard that performance is a primary one. So I would not assume that replacing Hadoop RPC with Thrift would improve performance.
> 5. Memory caching: Instead of pinning a whole HBase table in RAM, I'd recommend the use of memcached in front of HBase to provide cached read access.
Memcached is useful when many nodes need to access the same data. It pools and shares memory across a cluster. In HBase, each node caches a different portion of a table, no? So I don't see how memcached would help there.
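To make the distinction concrete, here is a minimal sketch in Python. It is a toy model, not HBase or memcached code: the LRUCache class, the modulo-based owner() function, and the fake_load() helper are all hypothetical, and it assumes each key is served by exactly one node in the partitioned case. It compares how many distinct rows end up cached across the cluster when each node caches only its own portion versus when every node reads, and caches, the same rows.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key, load):
        if key in self.items:
            self.items.move_to_end(key)      # mark as most recently used
            return self.items[key]
        value = load(key)                    # cache miss: fetch the row
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
        return value


def fake_load(key):
    """Stand-in for reading a row from disk."""
    return f"row-{key}"


def owner(key, num_nodes):
    """Assumed partitioning: each key belongs to exactly one node."""
    return key % num_nodes


def summarize(caches):
    """Return (total cached entries, distinct cached keys) across all nodes."""
    total = sum(len(c.items) for c in caches)
    distinct = len({k for c in caches for k in c.items})
    return total, distinct


def partitioned_access(keys, num_nodes, capacity):
    """Each key is read only through the node that owns it."""
    caches = [LRUCache(capacity) for _ in range(num_nodes)]
    for key in keys:
        caches[owner(key, num_nodes)].get(key, fake_load)
    return summarize(caches)


def shared_access(keys, num_nodes, capacity):
    """Every node reads every key, so every node caches the same rows."""
    caches = [LRUCache(capacity) for _ in range(num_nodes)]
    for key in keys:
        for cache in caches:
            cache.get(key, fake_load)
    return summarize(caches)


if __name__ == "__main__":
    keys = list(range(8)) * 4
    print("partitioned:", partitioned_access(keys, num_nodes=4, capacity=2))
    print("shared:     ", shared_access(keys, num_nodes=4, capacity=2))
```

In this toy model the partitioned case already holds no duplicates, so the cluster's combined cache memory is fully used without a shared pool. Only in the second case, where many nodes read the same rows, would pooling memory across nodes recover wasted space.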
Doug
