Hive version: Hive 0.8 (last commit SHA b581a6192b8d4c544092679d05f45b2e50d42b45)
Hadoop version: cdh3u0

I am trying to use the Hive merge-small-files feature by setting all the necessary parameters. I am disabling CombineHiveInputFormat since my input is compressed text.

hive> set mapred.min.split.size.per.node=1000000000;
hive> set mapred.min.split.size.per.rack=1000000000;
hive> set mapred.max.split.size=1000000000;
hive> set hive.merge.size.per.task=1000000000;
hive> set hive.merge.smallfiles.avgsize=1000000000;
hive> set hive.merge.size.smallfiles.avgsize=1000000000;
hive> set hive.merge.mapfiles=false;
hive> set hive.merge.mapredfiles=true;

The plan launches two MR jobs, but after the first job succeeds I get this runtime error:

java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but reduce operator specified

I think the problem can be fixed with this patch I came up with: https://gist.github.com/2025303

Of course, my understanding, and hence this patch, may be totally wrong. Please provide feedback.
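
For reference, here is a rough Python sketch of when I expect the extra merge job to be triggered with the settings above. It is only a simplified model of the documented meaning of hive.merge.mapfiles, hive.merge.mapredfiles and hive.merge.smallfiles.avgsize, not Hive's actual code; the function name and the output file sizes are made up for illustration.

```python
def should_launch_merge_job(output_file_sizes, avgsize_threshold, map_only,
                            merge_mapfiles, merge_mapredfiles):
    """Simplified model (an assumption, not Hive source code).

    A merge job is expected when merging is enabled for the job type and the
    average output file size is below hive.merge.smallfiles.avgsize.
    """
    # hive.merge.mapfiles applies to map-only jobs,
    # hive.merge.mapredfiles applies to map-reduce jobs.
    enabled = merge_mapfiles if map_only else merge_mapredfiles
    if not enabled or not output_file_sizes:
        return False
    average = sum(output_file_sizes) / len(output_file_sizes)
    return average < avgsize_threshold


# Values taken from the session above.
AVGSIZE = 1_000_000_000    # hive.merge.smallfiles.avgsize
MERGE_MAPFILES = False     # hive.merge.mapfiles
MERGE_MAPREDFILES = True   # hive.merge.mapredfiles

# Hypothetical output of the first (map-reduce) job: twenty 64 MB files.
sizes = [64_000_000] * 20

print(should_launch_merge_job(sizes, AVGSIZE, map_only=False,
                              merge_mapfiles=MERGE_MAPFILES,
                              merge_mapredfiles=MERGE_MAPREDFILES))
```

Under this model a map-reduce job producing small files should be followed by a merge job, which matches the two-job plan I see; the failure happens when that second job is launched.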