That's a fair point. I'm off in nebulous vendor land and tend to be removed from pure Apache Hadoop artifacts. I believe there's a snappy package (at least on CentOS), which should be enough, but it would be good to understand this properly.
Is there a nonnative snappy impl?

On Aug 13, 2016 11:19 PM, "Christopher" <[email protected]> wrote:

> Native libraries for snappy are also not typically installed by default on
> Linux distros. Even if the hadoop native libraries are installed, the user
> is likely going to end up using the Java implementation by default, I
> *think*, unless they take additional actions.
>
> On Sat, Aug 13, 2016 at 11:18 PM Adam Fuchs <[email protected]> wrote:
>
> > In my experience gz gets roughly 1.5x to 2x better compression than snappy.
> > Snappy is definitely not a pareto improvement (although we tend to use
> > snappy by default). Since it's not always better I think you would need a
> > more solid argument to change the default.
> >
> > Adam
> >
> > On Aug 13, 2016 8:06 PM, "Josh Elser" <[email protected]> wrote:
> >
> > > Same motivation of using it as for making it the default. I am not aware
> > > of any downside to it. It's become pretty standard across all installations
> > > I've worked with for years.
> > >
> > > Asking because I am no oracle on the matter. I could just be ignorant of
> > > some issue, but, given my current understanding, there is no downside for
> > > the average case.
> > >
> > > Christopher wrote:
> > >
> > >> Sorry. I wasn't clear. I understand the motivation for using it... I'm
> > >> asking about the motivation for making it the default.
> > >>
> > >> Since both are available, I'm not sure the default matters *that* much, but
> > >> it could be an unexpected change for those preferring GZ.
> > >>
> > >> Also, are there any risks regarding library availability of snappy? GZ is
> > >> pretty ubiquitous.
> > >>
> > >> On Sat, Aug 13, 2016 at 10:59 PM Josh Elser <[email protected]> wrote:
> > >>
> > >>> Uhh, besides what I already mentioned? (close in compressed size but
> > >>> "much" faster)
> > >>>
> > >>> Christopher wrote:
> > >>>
> > >>>> What's the motivation for changing it?
> > >>>>
> > >>>> On Sat, Aug 13, 2016 at 10:47 PM Josh Elser <[email protected]> wrote:
> > >>>>
> > >>>>> Any reason we don't want to do this? Last rule-of-thumb I heard was that
> > >>>>> snappy is often close enough in compression to GZ but quite a bit faster
> > >>>>> (I don't remember exactly how much).
> > >>>>>
> > >>>>> - Josh
