[ 
https://issues.apache.org/jira/browse/HIVE-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393045#comment-14393045
 ] 

Michelle Ufford commented on HIVE-6897:
---------------------------------------

I also have both use cases that Dip Kharod describes in the issue. In 
particular, the ability to append to an existing partition is helpful for 
processing late-arriving data. In our environment, data may arrive late for a 
variety of reasons, such as a short-term spike in data volumes or a failed 
upstream process. An optional flag that allows data to be appended to an 
existing partition would enable more robust ETL processing and reduce the 
need for manual intervention when these events occur. 

> Allow overwrite/append to external Hive table (with partitions) via HCatStorer
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-6897
>                 URL: https://issues.apache.org/jira/browse/HIVE-6897
>             Project: Hive
>          Issue Type: Improvement
>          Components: HCatalog, HiveServer2
>    Affects Versions: 0.12.0
>            Reporter: Dip Kharod
>
> I'm using HCatStorer to write to a partitioned external Hive table from Pig 
> and have the following use cases:
> 1) Need to overwrite (i.e., refresh) data in the table: Currently I end up 
> doing this outside of Pig (dropping the partition and deleting the HDFS 
> folder), which is very painful and error-prone.
> 2) Need to append (i.e., add a new file) data to the Hive external 
> table/partition: Again, I end up doing this outside of Pig by copying the 
> file into the appropriate folder.
> It would be very productive for developers to have both options in 
> HCatStorer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
