I am building a prototype REST server for the D4M Schema at
https://github.com/medined/D4M_Schema/tree/master/rest. I've started
to document the endpoints via the homepage of the REST server. I
intend to get the REST server working to the point that you can use
/grep to find records through a
Er... basically I need to explain to my manager why we are choosing Accumulo
instead of HBase.
So what are the pros and cons of Accumulo vs. HBase? (btw HBase 0.98 also
got cell-level security, modeled after Accumulo)
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github Blog:
Jianshi!
The choice of Accumulo vs HBase often comes down to the specifics of a use
case and the deployment environment.
Could you describe your use case some?
Does your organization have an existing investment (e.g. cluster, ops
personnel, dev personnel) in either Accumulo or HBase?
FWIW, the
Another way you could word this is that Accumulo has a very mature
security implementation, whereas, like you pointed out, HBase has only
recently added this in 0.98.
The note about visibility being in the Key as opposed to the Value
also has an impact when writing Iterators. Because the
In my opinion, one of our main goals for Accumulo is “it just works.”
Specifically, Accumulo’s development focuses on fault tolerance, ingest
performance, and ease of administration. It is likely that its design
scales to larger clusters than HBase's does because it splits its metadata
table,
A few observations I can make from watching both communities (although
only really participating in Accumulo's).
- HBase undeniably has a much larger public community of both users and
developers; however, we are seeing broader adoption across different
vertical markets with Accumulo. IMO, I
Sent too quickly..
- The BatchScanner communicates with tservers in *parallel*, which is
where it really shows its strength.
- A default locality group. You don't have to define the locality
groups for a table at creation time in Accumulo (or have to modify the
table if you want to insert
This needs to be documented on the official blog.
Just to play devil's advocate, I suspect you'll be asking the same of
user@hbase?
Performance is probably the largest difference between Accumulo and HBase.
Accumulo can ingest/scan at a rate of 800K entries/sec/node.
This performance scales well into the hundreds of nodes to deliver
100M+ entries/sec.
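As a rough sanity check on those two figures, here is a minimal sketch of the arithmetic. It assumes perfectly linear scaling by default; the `efficiency` parameter and the function name are hypothetical and only there to show how a less-than-ideal cluster changes the node count.

```python
import math

# Figures quoted in the message above.
PER_NODE_RATE = 800_000      # entries/sec/node
TARGET_RATE = 100_000_000    # entries/sec across the cluster


def nodes_needed(target_rate, per_node_rate, efficiency=1.0):
    """Return the number of nodes needed to reach target_rate.

    efficiency is a hypothetical scaling factor (1.0 = perfectly linear);
    it is an assumption for illustration, not a measured value.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(target_rate / (per_node_rate * efficiency))


if __name__ == "__main__":
    # With linear scaling, 125 nodes at 800K entries/sec reach 100M entries/sec.
    print(nodes_needed(TARGET_RATE, PER_NODE_RATE))
```

Under linear scaling this comes to 125 nodes, which is consistent with "hundreds of nodes" once real-world overhead is taken into account.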
There are no recent HBase benchmarks and none in the peer-reviewed
Is there a date when the VirtualBox test drive VM will be available from the
Sqrrl website? This would be extremely useful for our developers to quickly
get an isolated Accumulo to start coming up to speed.
Is it possible to get an early version of it?
Thanks in advance.
Matt
Matt,
You're probably better off trying to contact Sqrrl directly for
information on their products, but, since you brought it up here, I
would also point you to some other vendor-provided VM solutions that
have Accumulo or have Accumulo easily installed/configured.
Cloudera -
Mike gave a pretty good presentation comparing the performance of
Accumulo and HBase. Again, not official IMO, but it is pretty detailed in
the approach taken and is an apples-to-apples comparison:
http://www.slideshare.net/AccumuloSummit/10-30-drob
From: Jeremy Kepner kep...@ll.mit.edu
To:
If v1.5.1 is acceptable then
https://github.com/medined/Accumulo_1_5_1_By_Vagrant is a better place
to start.
On Mon, Jun 23, 2014 at 10:09 PM, Sean Busbey bus...@cloudera.com wrote:
There's also David Medinets's work with Vagrant:
https://github.com/medined/Accumulo_Snapshot_By_Vagrant
Also,
I think first and foremost, how has writing your application been? Is it
something you can easily onboard other people for? Does it seem stable
enough? If you can answer those questions positively, I think you have a
winning situation.
The big three Hadoop vendors (Cloudera, Hortonworks and MapR)