I have about 5K input files, so running a Hive job creates as many (small) output files. Small-file merging seems to be enabled by default (hive.merge.mapfiles=true), but it doesn't seem to work unless output compression is disabled (hive.exec.compress.output=false). If I do that, I get only 30 (uncompressed) output files, which is much more manageable.
Is there a way to enable both compression and small-file merging at the same time? If not, I am thinking about saving into an uncompressed temp table first, then enabling compression and saving from there into the output table. Is there an easier way? Thanks.
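
For concreteness, the two-step workaround I have in mind would look roughly like the sketch below. The table names (source_table, tmp_uncompressed, final_output) are placeholders, not my real tables, and I am assuming final_output already exists with the right schema. I have not verified that this is the best approach.

```sql
-- Step 1: write uncompressed so that the small-file merge actually runs.
SET hive.exec.compress.output=false;
SET hive.merge.mapfiles=true;

-- tmp_uncompressed is a hypothetical staging table name.
CREATE TABLE tmp_uncompressed AS
SELECT * FROM source_table;

-- Step 2: turn compression back on and copy the merged data
-- into the real output table (assumed to exist already).
SET hive.exec.compress.output=true;

INSERT OVERWRITE TABLE final_output
SELECT * FROM tmp_uncompressed;

-- Clean up the staging table once the copy has succeeded.
DROP TABLE tmp_uncompressed;
```

Since the second job would read only the ~30 merged files, I would expect roughly that many compressed output files, at the cost of writing the data twice.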
