We had Hadoop Summit Europe last week, where we held an HBase Meetup. First,
Enis talked about HBase Architecture, and then Lars talked about some
interesting HBase Use Cases. Finally, we opened it up to the public, where we had
a frank discussion on the uptake of HBase vs. other NoSQL DBs such as Mongo
and Cassandra. This wasn't about bashing other DBs, just understanding how the
spectrum of NoSQL DBs was leading to evaluation and production use of HBase. It
was also partly based on the report from InfoWorld:
http://podcasts.infoworld.com/d/big-data/big-data-showdown-cassandra-vs-hbase-239592

Anyway,
these were the major points we discussed (Lars and Jon Hsieh from Cloudera,
Enis and Devaraj from Hortonworks contributed, with input from about 12 other
users from the community):

Documentation - Cassandra has a better web page than
HBase does. Even though HBase's documentation is complete, finding the 
documentation is a bit hard.

Installation - HBase is hard to install for the
newbie. I think there has been some effort to make this more friendly by 
wrapping the master in RegionServers.

Vendor Pushes - Cassandra has DataStax,
Pentaho pushes Mongo, Cloudera pushes Impala, MapR is pushing their proprietary 
FS, IBM their own DBs. Even though HBase is part of the Hadoop ecosystem,
no single vendor is exclusively pushing HBase for uptake by the wider
community, or even by the Hadoop community.

Messaging - HBase has been on the receiving end of a number of negative
marketing claims by various vendors over things that were
possibly true in the past. For example, Lars mentioned that a certain vendor was
incorrectly stating that HBase has a SPOF issue, even though this hasn't been
true for quite some time. Similarly, Jon mentioned that a slide in which
he was talking about the complexity of HBase was taken out of context and
presented as a negative about HBase.

SQL based solutions - Even though there
are a number of efforts to showcase that HBase has some SQL based interfaces
available, like Phoenix, Impala and Hive (albeit with some issues), there is still
a misconception that HBase can only be accessed via Java.

Security in HBase - Even though 0.98 has security, it needs to be road tested.

Some recommendations:

Push
messaging out and make it more clear - Apache blogs, Hortonworks Blogs, 
Cloudera blogs.

Documentation - David Worms, who is a consultant out of France,
has volunteered to help make the website better. You may want to reach out to 
him - fr.linkedin.com/pub/david-worms/7/626/630

Cost Calculator - Lars made a great point about having a cost calculator to
estimate the cost of various operations. This would make it more likely that
bigger organizations pick HBase, since they could see how those operations
affect the bottom line.



Update from Andrew - 

"HBase has had strong security since 0.94 if not 0.92 - secure RPC and ACLs at 
the table and column family level. We had these features before Cassandra and 
even Accumulo.

Why stuff like that gets lost is we are a bunch of engineers not
marketers. The trouble with messaging is someone has to write it. Since it's a
joyless job for most engineers, someone must be paid to do it."


Thanks
Subash
                                          
