[ https://issues.apache.org/jira/browse/HIVE-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joydeep Sen Sarma updated HIVE-1463:
------------------------------------

    Attachment: 1463.3.patch

This fixes all the issues:

1) The regex is expanded to cover both Hadoop 0.17 and later releases. In 0.17, tasks are indeed named _map_ and _reduce_ in local mode.
2) Leading zeros in the taskid are not stripped, so this diff does not change the ordering of files. The filename component being removed is constant per map-reduce job (jobid + jobtracker_id etc.).
3) A one-line env setting in the build file allows us to control test execution logging from hive-log4j.

This passes all the tests. The problem with load_dyn_part2.q was due to incorrect regex application: the taskid matching has to be applied to the last component of the path name only.

As an aside, replaceTaskIdFromFilename would be simpler and easier to understand if it did just this: cut the last component, replace the taskid, concatenate it back, and return.

> hive output file names are unnecessarily large
> ----------------------------------------------
>
>                 Key: HIVE-1463
>                 URL: https://issues.apache.org/jira/browse/HIVE-1463
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Joydeep Sen Sarma
>         Attachments: 1463.2.patch, 1463.3.patch, hive-1463.1.patch
>
>
> Hive's output files are named like this:
> attempt_201006221843_431854_r_000000_0
> Out of all of this goop, only one character ('0') would have sufficed. We
> should fix this. This would help environments with namenode memory
> constraints.
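
For illustration, the approach suggested in the aside (cut the last component, replace the taskid, concatenate back) might look roughly like the sketch below. This is not the code from 1463.3.patch: the class name, the method name, and the regex are hypothetical, and the sketch assumes the shortened file name has the form <taskid>_<attempt>, such as 000000_0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not Hive's actual replaceTaskIdFromFilename.
final class TaskIdSketch {

    // Assumed shape of the last path component: "<taskid>_<attempt>", e.g. "000000_0".
    // Illustrative pattern only; the regex in the patch may differ.
    private static final Pattern LAST_COMPONENT = Pattern.compile("^(\\d+)(_\\d+)$");

    static String replaceTaskIdInLastComponent(String path, String newTaskId) {
        // Cut: split the path into its directory prefix and last component.
        int slash = path.lastIndexOf('/');
        String dir = path.substring(0, slash + 1); // empty when there is no slash
        String name = path.substring(slash + 1);

        // Replace: match the taskid against the last component only, so digits
        // and underscores in directory names are never touched.
        Matcher m = LAST_COMPONENT.matcher(name);
        if (!m.matches()) {
            return path; // unrecognized file name: return the path unchanged
        }

        // Concat back: directory prefix + new taskid + original attempt suffix.
        return dir + newTaskId + m.group(2);
    }
}
```

Under these assumptions, replaceTaskIdInLastComponent("/a/12_3/000000_0", "000005") returns "/a/12_3/000005_0": the directory 12_3 is left alone even though it would also match the pattern if the regex were applied to the whole path.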