I'm on the side of benchmarking for the use case and with an expert. There are so many ways to cheat a benchmark, and the benchmark may not be anything like your use case.

On Aug 19, 2015 5:43 PM, "Andrew Purtell" <apurt...@apache.org> wrote:
> I think someone who uses third party benchmarks to assess a system like
> HBase or Accumulo (or Cassandra...) is taking a foolish shortcut, so
> perhaps we must agree to disagree.
>
> On Wed, Aug 19, 2015 at 2:34 PM, Jeremy Kepner <kep...@ll.mit.edu> wrote:
>
> > I agree that performance on real apps is the most important for
> > any particular organization, but as technologists how do we measure
> > ourselves?
> > Hence imperfect benchmarking remains our only recourse.
> >
> > On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
> > > I can't speak for anyone other than myself in the HBase community, but I'm
> > > much more interested and focused on performance analysis and
> > > developing/deploying for the use cases of my employer than participating in
> > > generic bench-marketing to make weapons for happy OSS warriors. Perhaps
> > > this does a disservice to the HBase project overall and if so then I
> > > apologize to others on the project for that.
> > >
> > > That said, from long and bitter experience let me state the only benchmarks
> > > that ever really matter are the comparative benchmarks you make for your
> > > own use cases in your own environments, preferably exercising those
> > > candidates with real data and operating conditions. See:
> > > https://pbs.twimg.com/media/CMnTyKVUEAA1tOm.jpg (smile)
> > >
> > > On Wed, Aug 19, 2015 at 12:27 PM, Josh Elser <josh.el...@gmail.com> wrote:
> > >
> > > > Alright, I have to ask... are you referring to the paper that cites
> > > > Accumulo performance without write-ahead logs enabled? I have some serious
> > > > reservations about the relevance of that paper to this conversation and
> > > > just want to make sure people aren't led astray by what the actual takeaway
> > > > should be.
> > > >
> > > > Jeremy Kepner wrote:
> > > >
> > > >> A big difference between Accumulo and HBase is the published performance
> > > >> numbers.
> > > >> The Accumulo community has done a good job of continuing to publish
> > > >> up-to-date performance numbers in peer-reviewed venues which allow
> > > >> Accumulo to claim best in the world performance.
> > > >>
> > > >> The HBase community hasn't been doing that so much. It would be great if
> > > >> they did because the HBase points on the graphs are old and it would be
> > > >> good to get new ones.
> > > >>
> > > >> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
> > > >>
> > > >>> Like I've said many times now, it's relative to your actual problem.
> > > >>> If you don't have that much data (or intend to grow into that much
> > > >>> data), it's not an issue. Obviously, this is the case for you.
> > > >>>
> > > >>> However, it is an architectural difference between the two projects
> > > >>> with known limitations for a single metadata region. It's a
> > > >>> difference, which is what Jerry asked for.
> > > >>>
> > > >>> Ted Malaska wrote:
> > > >>>
> > > >>>> I've been doing HBase for a long time and never had an issue with region
> > > >>>> count limits and I have clusters with 10s of billions of records. Maybe
> > > >>>> there would be issues around a couple trillion records, but never got
> > > >>>> that high yet.
> > > >>>>
> > > >>>> Ted Malaska
> > > >>>>
> > > >>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.el...@gmail.com> wrote:
> > > >>>>
> > > >>>>> Oh, one other thing that I should mention (was prompted off-list).
> > > >>>>>
> > > >>>>> (definition time since cross-list now: HBase regions == Accumulo tablets)
> > > >>>>>
> > > >>>>> Accumulo will handle many more regions than HBase does now due to a
> > > >>>>> splittable metadata table.
> > > >>>>> While I was told this was a very long and
> > > >>>>> arduous journey to implement correctly (WRT splitting, merges and bulk
> > > >>>>> loading), users with "too many regions" problems are extremely few and
> > > >>>>> far between for Accumulo.
> > > >>>>>
> > > >>>>> I was very happy to see effort/design being put into this in HBase.
> > > >>>>> And, just to be fair in criticism/praises, HBase does appear to me to do
> > > >>>>> assignments of regions much faster than Accumulo does on a small
> > > >>>>> cluster (~5-10 nodes). Accumulo may take a few seconds to notice and
> > > >>>>> reassign tablets. I have yet to notice this with HBase (which also could
> > > >>>>> be due to lack of personal testing).
> > > >>>>>
> > > >>>>> Jerry He wrote:
> > > >>>>>
> > > >>>>>> Hi, folks
> > > >>>>>>
> > > >>>>>> We have people that are evaluating HBase vs Accumulo.
> > > >>>>>> Security is an important factor.
> > > >>>>>>
> > > >>>>>> But I think after the Cell security was added in HBase, there is no
> > > >>>>>> more real gap compared to Accumulo.
> > > >>>>>>
> > > >>>>>> I know we have both HBase and Accumulo experts on this list.
> > > >>>>>> Could someone shed more light?
> > > >>>>>> I am looking for real gaps comparing HBase to Accumulo, if there are any,
> > > >>>>>> so that I can be prepared to address them. This is not limited to the
> > > >>>>>> security area.
> > > >>>>>>
> > > >>>>>> There are differences in some features and implementations. But they
> > > >>>>>> don't seem like real 'gaps'.
> > > >>>>>>
> > > >>>>>> Any comments and feedback are welcome.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>
> > > >>>>>> Jerry
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back.
> > > - Piet Hein (via Tom White)
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)