[ https://issues.apache.org/jira/browse/OOZIE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570838#comment-14570838 ]

Jaydeep Vishwakarma commented on OOZIE-2179:
--------------------------------------------

[~rkanter], I am not seeing any assignee for this issue. Are you working on it?

> Use HDFS INotify to track HDFS data dependencies instead of polling
> -------------------------------------------------------------------
>
>                 Key: OOZIE-2179
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2179
>             Project: Oozie
>          Issue Type: New Feature
>          Components: coordinator
>            Reporter: Robert Kanter
>
> Instead of polling the NN every minute for Coordinators, we should look into 
> using the new INotify feature in HDFS-6634.  It allows you to get a stream of 
> events from HDFS.  Internally, it still uses a polling mechanism for now, but 
> even so, it would likely be more efficient and less heavy-handed than what 
> we're doing.
> We'd probably still have to check if the directory exists when a coordinator 
> action starts in case we missed the event, but while waiting for an HDFS 
> dependency to be available, we can use INotify.
> For HCat dependencies we still have a backup poll every 10 minutes in case a 
> JMS message is missed or lost.  I don't think we'll need to do this for 
> INotify because you can view past events as long as you keep track of the 
> event ID.  For example, if you restart Oozie and we kept track of the last ID 
> Oozie looked at, we could resume from there without losing anything.
> The INotify stream is asynchronous, so we won't receive a notification 
> immediately.  We should look into the guarantees of how long it can take for 
> the notification to show up.  
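
A rough sketch of the resume-from-last-event-ID idea in the description above, in plain Java with no Hadoop dependency. Everything here (Event, EventLog, DependencyTracker, the paths) is hypothetical and only models the bookkeeping; it is not the HDFS API. In a real implementation the event source would presumably be HdfsAdmin.getInotifyEventStream(long lastReadTxid), and the checkpoint would have to be persisted somewhere Oozie can read it back after a restart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InotifyResumeSketch {

    /** Hypothetical stand-in for an HDFS event: an increasing ID plus the created path. */
    static final class Event {
        final long id;
        final String path;
        Event(long id, String path) { this.id = id; this.path = path; }
    }

    /** Hypothetical in-memory event log; stands in for the NameNode's event stream. */
    static final class EventLog {
        private final List<Event> events = new ArrayList<>();
        private long nextId = 1;

        void recordCreate(String path) { events.add(new Event(nextId++, path)); }

        /** Returns every event with an ID strictly greater than lastSeenId, in order. */
        List<Event> eventsAfter(long lastSeenId) {
            List<Event> out = new ArrayList<>();
            for (Event e : events) {
                if (e.id > lastSeenId) out.add(e);
            }
            return out;
        }
    }

    /** Tracks which awaited paths are still missing and the last event ID processed. */
    static final class DependencyTracker {
        private final Set<String> waiting = new HashSet<>();
        private long lastSeenId;

        DependencyTracker(Set<String> awaitedPaths, long checkpointedId) {
            waiting.addAll(awaitedPaths);
            lastSeenId = checkpointedId;
        }

        /** Consumes all events after the checkpoint and advances the checkpoint. */
        void drain(EventLog log) {
            for (Event e : log.eventsAfter(lastSeenId)) {
                waiting.remove(e.path);
                lastSeenId = e.id;
            }
        }

        long checkpoint() { return lastSeenId; }
        boolean satisfied() { return waiting.isEmpty(); }
        Set<String> remaining() { return new HashSet<>(waiting); }
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.recordCreate("/data/2015/06/01");

        Set<String> awaited = new HashSet<>(Arrays.asList("/data/2015/06/01", "/data/2015/06/02"));
        DependencyTracker before = new DependencyTracker(awaited, 0L);
        before.drain(log);

        // Simulated restart: an event arrives while the tracker is down.
        log.recordCreate("/data/2015/06/02");

        // Resume from the saved checkpoint instead of re-reading the whole log.
        DependencyTracker after = new DependencyTracker(before.remaining(), before.checkpoint());
        after.drain(log);
        System.out.println("satisfied after restart: " + after.satisfied());
    }
}
```

The point of the sketch is only that nothing is lost across the restart as long as the last processed ID survives it. It says nothing about how long the real event log is retained or how late a notification can arrive, which are the open questions in the description.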



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
