Per above. I noticed I do ask for compression of results and intermediate data (more of a programming reflex, really, than any motivated decision).
But for data such as vectors, assuming sparse vectors are used where appropriate, compression is not going to win much. On the other hand, if native libraries are enabled, the default GZIP codec does not cost much compared to the computation either.

A third option: maybe we shouldn't put in any defaults at all and leave it to -D options. I see that as somewhat of a problem, though, since Hadoop tries to encapsulate those properties in static methods of classes such as FileOutputFormat, which may imply that the property names are not meant to be part of any user contract, just implementation details of a concrete file format. I am leaning towards enforcing no compression by default.
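To make the trade-off concrete, here is a sketch of the two routes being weighed. The property names and the driver invocation are illustrative (they match Hadoop 2.x's mapreduce.* naming; older releases used mapred.output.compress), and SomeDriver/myjob.jar are hypothetical placeholders:

```shell
# Route 1: no default in code; users who want compression opt in via -D.
# This leans on raw property names, which Hadoop arguably treats as
# implementation details rather than a user contract.
hadoop jar myjob.jar SomeDriver \
  -D mapreduce.output.fileoutputformat.compress=true \
  -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
  <other args>
```

Route 2 is the encapsulated API, which hides the property names behind static methods:

```java
// Inside the (hypothetical) job driver, using the supported API instead
// of raw property names:
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
```

The fact that Hadoop provides the second route is what suggests the property names themselves aren't meant for end users, which argues for the code picking an explicit default (e.g. no compression) rather than silently deferring to -D.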
