[ 
https://issues.apache.org/jira/browse/HIVE-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747660#action_12747660
 ] 

Zheng Shao commented on HIVE-788:
---------------------------------

Edward, yes: by A I mean that the user manages and runs a script himself, and 
the script calls a Hive command that waits for a new partition to appear.
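
For illustration, option A could look roughly like the sketch below. It is only 
a sketch: the function names and the polling interval are made up for this 
example, and it assumes the `hive` CLI is on the PATH so that 
`hive -S -e "SHOW PARTITIONS <table>"` can be used to list partitions.

```python
import subprocess
import time


def list_partitions(table):
    """Return the set of partition names the Hive CLI reports for `table`."""
    # Assumption: `hive` is on PATH; -S is silent mode, -e runs one statement.
    out = subprocess.run(
        ["hive", "-S", "-e", f"SHOW PARTITIONS {table}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return {line.strip() for line in out.splitlines() if line.strip()}


def wait_for_new_partition(table, known, interval_s=60,
                           lister=list_partitions, sleep=time.sleep):
    """Poll until a partition not in `known` appears; return the new names.

    `lister` and `sleep` are injectable (hypothetical hooks) so the loop can
    be exercised without a Hive installation.
    """
    known = set(known)
    while True:
        new = lister(table) - known
        if new:
            return new
        # The probing interval is the extra delay listed as the con of A.
        sleep(interval_s)
```

The user's script would call wait_for_new_partition and then run whatever 
command it wants, so the worst-case delay is one probing interval.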

I talked offline with Ashish about the three questions for option B:
1. We will support shell commands for now. The reason is that users can hook 
the trigger up to an existing job/process management tool to monitor the 
status of the triggered command.
2. The MoveTask (which calls db.loadTable and db.loadPartition) will run the 
shell command on the same machine that loads the table/partition, since there 
may not be a HiveServer available.
3. If the shell command fails, the move task will return failure, even though 
the new table/partition has already been created (or the data already updated, 
in the case of overwrite). This is also the simple choice because we don't 
have the concept of transactions/rollback yet.

So it seems B will be a better way to go.
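
To make points 2 and 3 concrete, here is a minimal sketch of the intended 
behavior. It is not the MoveTask code: `load_then_trigger`, `load` and 
`trigger_command` are hypothetical names used only for this illustration.

```python
import subprocess


def load_then_trigger(load, trigger_command):
    """Run `load`, then the registered shell command; return a task exit code.

    `load` stands in for the call that moves data into the table/partition.
    `trigger_command` is the registered shell command, or None if no trigger
    is registered.
    """
    load()  # the data is moved first; nothing below undoes it
    if trigger_command is None:
        return 0
    # The command runs locally, on the machine that performed the load.
    result = subprocess.run(trigger_command, shell=True)
    # A failing command fails the task, but there is no rollback of the load.
    return 0 if result.returncode == 0 else 1
```

A caller that sees a non-zero return therefore cannot assume the load did not 
happen; it only knows that the triggered command did not succeed.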


The next question is which types of trigger we want to support now:
1. On new partition creation in a specified table
2. On data change (overwrite/append) in a specified table (or any partitions of 
a specified table)

There might be more, but these two seem to be the most wanted.
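
One possible way to model the two trigger types is sketched below. The names 
(`TriggerEvent`, `TriggerRegistry`) are hypothetical and say nothing about how 
the metastore would actually store registrations.

```python
from collections import defaultdict
from enum import Enum


class TriggerEvent(Enum):
    NEW_PARTITION = "new_partition"  # type 1: new partition created in a table
    DATA_CHANGE = "data_change"      # type 2: overwrite/append in a table or
                                     # in any of its partitions


class TriggerRegistry:
    """In-memory map from (table, event) to registered shell commands."""

    def __init__(self):
        self._commands = defaultdict(list)

    def register(self, table, event, command):
        self._commands[(table, event)].append(command)

    def commands_for(self, table, event):
        # Return a copy so callers cannot mutate the registry by accident.
        return list(self._commands.get((table, event), []))
```

Keeping the event type in the key means further trigger types can be added 
later without changing how existing registrations are looked up.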


> Triggers when a new partition is created for a table
> ----------------------------------------------------
>
>                 Key: HIVE-788
>                 URL: https://issues.apache.org/jira/browse/HIVE-788
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Zheng Shao
>
> One requirement for HIVE-787 is that users would like to run a command 
> whenever a new partition of a Hive table gets created.
> There are several ways to achieve this functionality:
> A. Probe and wait: We can have the scripts running in a loop checking if a 
> new partition is created.
>   Pros: easy to write, easy to control
>   Cons: will introduce another delay based on the probing interval.
> B. Triggered: The command is registered inside the hive metastore. Whenever a 
> partition gets created, we run the registered command. 
> Several questions around option B are:
> 1. whether to support registration of HiveQL or shell command;
> 2. which machine/environment to run the command;
> 3. what to do if the registered command failed.
