[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068794#comment-13068794 ]
jirapos...@reviews.apache.org commented on HIVE-2296:
-----------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1155/
-----------------------------------------------------------

Review request for hive and Siying Dong.


Summary
-------

Fixes the problem of bad compressed file names by stripping off the file extension (e.g. ".gz") and reappending it to the path later.


This addresses bug HIVE-2296.
    https://issues.apache.org/jira/browse/HIVE-2296


Diffs
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1148973
  trunk/ql/src/test/queries/clientpositive/insert_compressed.q PRE-CREATION
  trunk/ql/src/test/results/clientpositive/insert_compressed.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/1155/diff


Testing
-------

Unit tests pass


Thanks,

Franklin


> bad compressed file names from insert into
> ------------------------------------------
>
>                 Key: HIVE-2296
>                 URL: https://issues.apache.org/jira/browse/HIVE-2296
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Franklin Hu
>            Assignee: Franklin Hu
>         Attachments: hive-2296.1.patch, hive-2296.2.patch
>
>
> When INSERT INTO is run on a table with compressed output
> (hive.exec.compress.output=true) and existing files in the table, it may copy
> the new files under bad file names:
> Before INSERT INTO:
>   000000_0.gz
> After INSERT INTO:
>   000000_0.gz
>   000000_0.gz_copy_1
> This causes corrupted output when doing a SELECT * on the table.
> The correct behavior would be to pick a valid file name such as:
>   000000_0_copy_1.gz

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
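
For readers who want to see the renaming idea from the Summary in isolation, here is a minimal sketch in Java. It is not the code from the patch to Hive.java: the class CopyNameSketch and the methods copyName and firstFreeName are hypothetical names made up for illustration, and an in-memory set of names stands in for the table directory. It assumes the extension is whatever follows the last dot, so a name like "part.tar.gz" would only have ".gz" moved to the end.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: hypothetical helper names, not the actual Hive.java change.
public class CopyNameSketch {

    // Strip the extension, append the copy suffix, then reappend the extension.
    static String copyName(String fileName, int copyIndex) {
        int dot = fileName.lastIndexOf('.');
        // No dot, or a dot in first position, is treated as "no extension".
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        String ext = dot > 0 ? fileName.substring(dot) : "";
        return base + "_copy_" + copyIndex + ext;
    }

    // Return the first name that does not collide with an existing file name.
    static String firstFreeName(String fileName, Set<String> existing) {
        if (!existing.contains(fileName)) {
            return fileName;
        }
        int i = 1;
        String candidate = copyName(fileName, i);
        while (existing.contains(candidate)) {
            i++;
            candidate = copyName(fileName, i);
        }
        return candidate;
    }

    public static void main(String[] args) {
        // Assumption: the table directory already holds one compressed file.
        Set<String> existing = new HashSet<>();
        existing.add("000000_0.gz");
        System.out.println(firstFreeName("000000_0.gz", existing)); // 000000_0_copy_1.gz
    }
}
```

Keeping the extension last matters because readers of the table presumably choose a decompression codec from the file suffix, which would explain why "000000_0.gz_copy_1" yields corrupted output on SELECT *.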