[ 
https://issues.apache.org/jira/browse/HIVE-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995112#comment-12995112
 ] 

Ning Zhang commented on HIVE-1950:
----------------------------------

2nd Round:
=========
QTestUtil: 
  same as my previous comment: revert the change if it belongs to another JIRA

StatsTask: 
  line 274, 310: you may not need the updateOnly variable in StatsWork. Instead 
you can just check HiveConf.ConfVars.HIVE_STATS_ATOMIC. 
 
CombineHiveKey.java: 
  missing Apache license header

RCFileMergeMapper.java:
  the jobClose() function should handle the exceptional cases: when abort is 
true (similar to what FileSinkOperator does), or when an exception was thrown 
from the Hadoop layer but close(abort=true) was never called. 
  also in jobClose(), the partition's old directory is first moved to a backup 
directory and then the intermediate directory is moved to the partition's 
destination directory. All of this happens while the partition is online (other 
queries can read the partition's directory). You may want to create a follow-up 
JIRA to take the partition offline during the move. 
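
  To make the two points above concrete, here is a rough sketch of the shape I 
have in mind for jobClose(). It is an illustration only: it uses java.nio.file 
on a local filesystem instead of Hive's FileSystem calls, and the names 
JobCloseSketch, jobClose, deleteTree and the success flag are hypothetical, not 
taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JobCloseSketch {

    /** Recursively deletes a directory tree; a no-op if it does not exist. */
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order so that children are deleted before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /**
     * Hypothetical jobClose(): on failure, discard the intermediate output and
     * leave the destination untouched; on success, swap it into place.
     */
    static void jobClose(Path dest, Path tmp, Path backup, boolean success)
            throws IOException {
        if (!success) {
            // Abort case (or a failure where close(abort=true) never ran):
            // clean up the intermediate directory only.
            deleteTree(tmp);
            return;
        }
        deleteTree(backup);                 // clear any stale backup
        boolean hadOld = Files.exists(dest);
        if (hadOld) {
            Files.move(dest, backup);       // step 1: old data -> backup
        }
        try {
            Files.move(tmp, dest);          // step 2: merged data -> destination
        } catch (IOException e) {
            if (hadOld) {
                Files.move(backup, dest);   // roll back so old data stays readable
            }
            throw e;
        }
        deleteTree(backup);                 // success: drop the backup
    }
}
```

  Note that between step 1 and step 2 the destination directory does not exist, 
so a concurrent reader can still hit that window; the sketch does not address 
the online-partition concern.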



> Block merge for RCFile
> ----------------------
>
>                 Key: HIVE-1950
>                 URL: https://issues.apache.org/jira/browse/HIVE-1950
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-1950.1.patch, HIVE-1950.2.patch, HIVE-1950.3.patch, 
> HIVE-1950.4.patch
>
>
> In our env, there are a lot of small files inside one partition/table. In 
> order to reduce the namenode load, we have one dedicated housekeeping job 
> running to merge these files. Right now the merge is an 'insert overwrite' in 
> Hive, which requires decompressing the data and compressing it again. This 
> JIRA is to add a command in Hive that does the merge without decompressing 
> and recompressing the data, something like "alter table tbl_name 
> [partition ()] merge files". In this JIRA the new command will only support 
> RCFile, since it needs some new APIs in the file format.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
