Perhaps there is a happy medium, though: rather than defining the example configurations by the size of your memory footprint, could we define them by performance profile instead? Snappy could be the default for those who want a faster but less space-conscious configuration. That would allay Christopher's concerns, and those who try Accumulo might get better performance by using Snappy.

On Sat, Aug 13, 2016 at 11:19 PM, Christopher <[email protected]> wrote:
> Native libraries for snappy are also not typically installed by default on
> Linux distros. Even if the hadoop native libraries are installed, the user
> is likely going to end up using the Java implementation by default, I
> *think*, unless they take additional actions.
>
> On Sat, Aug 13, 2016 at 11:18 PM Adam Fuchs <[email protected]> wrote:
>
> > In my experience gz gets roughly 1.5x to 2x better compression than
> > snappy. Snappy is definitely not a pareto improvement (although we tend
> > to use snappy by default). Since it's not always better I think you
> > would need a more solid argument to change the default.
> >
> > Adam
> >
> > On Aug 13, 2016 8:06 PM, "Josh Elser" <[email protected]> wrote:
> >
> > > Same motivation of using it as for making it the default. I am not
> > > aware of any downside to it. It's become pretty standard across all
> > > installations I've worked with for years.
> > >
> > > Asking because I am no oracle on the matter. I could just be ignorant
> > > of some issue, but, given my current understanding, there is no
> > > downside for the average case.
> > >
> > > Christopher wrote:
> > >
> > >> Sorry. I wasn't clear. I understand the motivation for using it... I'm
> > >> asking about the motivation for making it the default.
> > >>
> > >> Since both are available, I'm not sure the default matters *that*
> > >> much, but it could be an unexpected change for those preferring GZ.
> > >>
> > >> Also, are there any risks regarding library availability of snappy?
> > >> GZ is pretty ubiquitous.
> > >>
> > >> On Sat, Aug 13, 2016 at 10:59 PM Josh Elser <[email protected]> wrote:
> > >>
> > >>> Uhh, besides what I already mentioned? (close in compressed size but
> > >>> "much" faster)
> > >>>
> > >>> Christopher wrote:
> > >>>
> > >>>> What's the motivation for changing it?
> > >>>>
> > >>>> On Sat, Aug 13, 2016 at 10:47 PM Josh Elser <[email protected]> wrote:
> > >>>>
> > >>>>> Any reason we don't want to do this? Last rule-of-thumb I heard was
> > >>>>> that snappy is often close enough in compression to GZ but quite a
> > >>>>> bit faster (I don't remember exactly how much).
> > >>>>>
> > >>>>> - Josh
