[ https://issues.apache.org/jira/browse/HIVE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901638#action_12901638 ]

Namit Jain commented on HIVE-1585:
----------------------------------

<property>
  <name>hive.merge.size.per.task</name>
  <value>256000000</value>
  <description>Size of merged files at the end of the job</description>
</property>

<property>
  <name>hive.merge.size.smallfiles.avgsize</name>
  <value>16000000</value>
  <description>When the average output file size of a job is less than this number,
    Hive will start an additional map-reduce job to merge the output files into bigger
    files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for
    map-reduce jobs if hive.merge.mapredfiles is true.</description>
</property>

Don't the above parameters meet your criteria?

> Customizable merge output size
> ------------------------------
>
>                 Key: HIVE-1585
>                 URL: https://issues.apache.org/jira/browse/HIVE-1585
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>
> Currently, if hive.merge.[mapfiles|mapredfiles] is true, the merged output
> file size is determined by the input split size, which is determined by
> mapred.min.split.size, mapred.min.split.size.per.[node|rack] and
> mapred.max.split.size. Sometimes it is desirable to have a different output
> file size than the input split size.
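
For illustration, here is a minimal sketch of how the parameters quoted in the comment might be combined in hive-site.xml to request a merged file size that is independent of the input split settings. The property names are copied from the comment and the issue description. The 128000000 value is only an illustrative assumption, not a recommendation. Later Hive releases document the average-size threshold as hive.merge.smallfiles.avgsize, so the exact name should be checked against the Hive version in use.

```xml
<!-- Sketch only: values are illustrative assumptions, not taken from the issue. -->
<configuration>
  <!-- Merge small output files of map-only jobs. -->
  <property>
    <name>hive.merge.mapfiles</name>
    <value>true</value>
  </property>
  <!-- Merge small output files of map-reduce jobs as well. -->
  <property>
    <name>hive.merge.mapredfiles</name>
    <value>true</value>
  </property>
  <!-- Target size of merged files (about 128 MB here, chosen for illustration). -->
  <property>
    <name>hive.merge.size.per.task</name>
    <value>128000000</value>
  </property>
  <!-- Run the merge job only when the average output file is smaller than this.
       Name as written in the comment; verify it against your Hive version. -->
  <property>
    <name>hive.merge.size.smallfiles.avgsize</name>
    <value>16000000</value>
  </property>
</configuration>
```

Whether these settings fully decouple the merged file size from mapred.min.split.size and mapred.max.split.size is exactly the question the issue raises, so the result should be checked on a real job.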