IMO, they are very similar. Lots of smart people on both sides making really good changes.

I think HBase has a lot more instrumentation around, and a better understanding of, the resources a cluster will use. For example, it's much clearer which resources and threads the RPC server will use. In Accumulo this is much more opaque (and it grows/shrinks on its own). I think this carries over into HBase API usage too -- I have a much better understanding of what needs to be managed with HBase. This is a little more opaque in Accumulo for new users.

I also think the read path in HBase is much better understood and tuned. I would trust HBase much more than Accumulo for workloads that must perform consistently (SLA-bound), just because there hasn't been comparable (public) work, past or ongoing, in Accumulo.

On the other side, it's been years since I've seen data loss or assignment bugs in Accumulo. Around HBase 1.1.0, the bugs that Enis and Stack fixed shocked me. I was rather surprised to see these kinds of bugs crop up, and they really worried me once I had spent quite a lot of time trying to understand them. Personally, I would trust Accumulo to be thrown off a cliff and still keep chugging (again, because I've done this myself). I don't have this confidence with HBase (yet).

Specifically WRT security, since you brought it up: the last time I tried to play with the cell-level security APIs in HBase, they seemed very opaque to me. Perhaps I was just dense and didn't find the right sort of instructions. Where security is critical, I would trust Accumulo more because it has been fleshed out over many years and has been part of the core model since the start. I felt that HBase is still in a shake-down phase. (Again, I don't want to be argumentative -- it's just my personal experience to date using the code and watching JIRA issues.)

The difference between HBase coprocessors and Accumulo iterators will still stand (they are not equivalent features and solve different problems, IMO). Coprocessors enable quite a few interesting things (notably, Phoenix). At the same time, I like the functional conciseness with which I can represent some problems using Accumulo iterators.

Ultimately, consider the use cases, evaluate the solutions, and base your decision on empirical evidence. That's the only way to really make a decision :)

Jerry He wrote:
Hi, folks

We have people that are evaluating HBase vs Accumulo.
Security is an important factor.

But I think after the Cell security was added in HBase, there is no more
real gap compared to Accumulo.

I know we have both HBase and Accumulo experts on this list.
Could someone shed more light?
I am looking for real gaps comparing HBase to Accumulo, if there are any, so
that I can be prepared to address them. This is not limited to the security
area.

There are differences in some features and implementations. But they don't
seem like real 'gaps'.

Any comments and feedback are welcome.

Thanks,

Jerry
