[ 
https://issues.apache.org/jira/browse/HIVE-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990434#comment-12990434
 ] 

Namit Jain commented on HIVE-1950:
----------------------------------

I will take a look - 1 minor comment.
Can you add some negative tests:

1. merge_files should fail if there is an index on the table/partition.
2. merge_files should fail if the table is partitioned, but the user did not 
specify the partition.
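
The checks these negative tests would exercise can be sketched roughly as
below. This is only an illustration, not code from the patch: the class name,
method name, parameters, and error strings are all hypothetical. It also
rejects a partially specified partition, which is the interim behavior
requested further down.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from HIVE-1950.1.patch.
final class MergeFilesChecks {

    /**
     * Returns null if the merge may proceed, otherwise the reason it must fail.
     *
     * @param partitionColumns partition columns of the table (empty if unpartitioned)
     * @param partitionSpec    partition spec given by the user (may be null or empty)
     * @param hasIndex         whether an index exists on the table/partition
     */
    static String rejectReason(List<String> partitionColumns,
                               Map<String, String> partitionSpec,
                               boolean hasIndex) {
        // Negative test 1: an index on the table/partition blocks the merge.
        if (hasIndex) {
            return "merge_files is not supported when an index exists";
        }
        if (!partitionColumns.isEmpty()) {
            // Negative test 2: partitioned table, but no partition given.
            if (partitionSpec == null || partitionSpec.isEmpty()) {
                return "table is partitioned but no partition was specified";
            }
            // Interim behavior: every partition column must be specified.
            for (String col : partitionColumns) {
                if (!partitionSpec.containsKey(col)) {
                    return "partition spec is missing column " + col;
                }
            }
        }
        return null;
    }
}
```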


Going forward, we should support merge even if the partition is not fully 
specified.

alter table srcpart partition (ds='1') merge_files;

should merge ds=1/hr=1 and ds=1/hr=2 as a follow-up.
But for now, such a command should throw an error.
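
For the follow-up, resolving a partial spec to the partitions it covers could
look something like the sketch below. Again, this is an assumption-laden
illustration: the class and method names are made up, and partitions are
modeled as plain column-to-value maps rather than Hive's own partition
objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the follow-up behavior; not part of the current patch.
final class PartialSpecMatcher {

    /**
     * Returns every partition whose column values agree with the partial spec.
     * With partitions ds=1/hr=1, ds=1/hr=2 and ds=2/hr=1, the spec (ds='1')
     * selects the first two.
     */
    static List<Map<String, String>> matching(List<Map<String, String>> allPartitions,
                                              Map<String, String> partialSpec) {
        List<Map<String, String>> selected = new ArrayList<>();
        for (Map<String, String> partition : allPartitions) {
            // A partition matches when it contains every (column, value) pair of the spec.
            if (partition.entrySet().containsAll(partialSpec.entrySet())) {
                selected.add(partition);
            }
        }
        return selected;
    }
}
```

Each selected partition would then be merged on its own, just as if the user
had named it in full.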

> Block merge for RCFile
> ----------------------
>
>                 Key: HIVE-1950
>                 URL: https://issues.apache.org/jira/browse/HIVE-1950
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-1950.1.patch
>
>
> In our env, there are a lot of small files inside one partition/table. In 
> order to reduce the namenode load, we have one dedicated housekeeping job 
> running to merge these files. Right now the merge is an 'insert overwrite' in 
> Hive, which requires decompressing the data and recompressing it. This jira 
> is to add a command in Hive that does the merge without decompressing and 
> recompressing the data.
> Something like "alter table tbl_name [partition ()] merge files". In this 
> jira the new command will only support RCFile, since it needs some new APIs 
> in the file format.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
