[ 
https://issues.apache.org/jira/browse/NIFI-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169624#comment-16169624
 ] 

Minmin DU commented on NIFI-190:
--------------------------------

Hi there, 

I have recently been working on a project that uses NiFi heavily. As far as I 
understand, the current Notify / Wait processors (in NiFi 1.3) rely on a signal 
count. That handles a 1:N dependency, but I don't think it works well for an 
M:N dependency. 

For example, FlowFile D depends on FlowFile A and FlowFile B, and FlowFile E 
depends on FlowFile B and FlowFile C. FlowFile D cannot start until both 
FlowFile A and FlowFile B have arrived. After FlowFile D is released, the 
entry for FlowFile B cannot be removed from the distributed map cache, because 
FlowFile E also relies on it. 
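
To make the scenario concrete, here is a minimal sketch in Python rather than 
NiFi code. A plain dict stands in for the distributed map cache, and the names 
DEPENDENCIES, notify and try_release are hypothetical. It assumes each signal 
records how many dependents still need it and that each FlowFile is released 
at most once, so an entry is removed only after its last dependent has gone 
through.

```python
# Hypothetical dependency table: D needs A and B; E needs B and C.
DEPENDENCIES = {"D": {"A", "B"}, "E": {"B", "C"}}

# Stand-in for the distributed map cache: signal id -> dependents still to release.
cache = {}


def notify(signal_id):
    """Record that a signal arrived, along with how many dependents need it."""
    pending = sum(1 for needs in DEPENDENCIES.values() if signal_id in needs)
    cache[signal_id] = pending


def try_release(flowfile_id):
    """Release the FlowFile only if every signal it depends on is in the cache."""
    needs = DEPENDENCIES[flowfile_id]
    if not all(signal in cache for signal in needs):
        return False  # at least one dependency is missing, keep waiting
    for signal in needs:
        cache[signal] -= 1
        if cache[signal] == 0:
            del cache[signal]  # last dependent released, safe to remove
    return True
```

With this bookkeeping, releasing D after A and B arrive removes A but keeps B 
in the cache until E has also been released.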

In our project, we had to create a custom Wait processor to handle these 
multiple dependencies. I suspect this scenario is common in enterprise 
workflow scheduling. 

Could you please let me know if the existing Wait/Notify processors can deal 
with this scenario?

Thanks a lot. 


Cheers,

> Wait/Notify processors
> ----------------------
>
>                 Key: NIFI-190
>                 URL: https://issues.apache.org/jira/browse/NIFI-190
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Joseph Gresock
>            Assignee: Joseph Gresock
>            Priority: Minor
>             Fix For: 1.2.0
>
>         Attachments: Wait_Notify_template.xml
>
>
> Our team has developed a set of processors for the following use case:
> * Format A needs to be sent to Endpoint A
> * Format B needs to be sent to Endpoint B, but should not proceed until A has 
> reached Endpoint A.  We most commonly have this restriction when Endpoint B 
> requires some output of Endpoint A.
> The proposed Wait/Notify processors enable this functionality:
> * Wait: routes files to the 'wait' relationship until a matching Release 
> Signal Identifier is found in the distributed map cache.  Then routes them to 
> 'success' (unless they have expired)
> * Notify: stores a Release Signal Identifier in the distributed map cache, 
> optionally with attributes to copy to the outgoing matching Wait flow files.
> An example:
> Wait is configured with Release Signal Attribute = "${myId}". Its 'wait' 
> relationship routes back onto itself.
>     flowFile 1 { myId : "123" }
>     comes in to Wait processor
>     Wait checks the distributed map cache for "123", doesn't find it, and 
> routes flowFile 1 to the 'wait' relationship
> Notify is configured with Release Signal Attribute = "${myId}"
>     flowFile 2 { myId : "123" }
>     comes in to Notify processor
>     Notify puts an entry in the map for "123" with any other attributes from 
> flowFile 2
> Next time flowFile 1 is processed by Wait...
>     Finds an entry for "123"
>     Removes that entry from the map
>     Copies attributes to flowFile 1
>     Sends flowFile 1 out the success relationship
> Notify will optionally cache attributes in the distributed map, as determined 
> by a regex property.  This is what allows the output of Endpoint A to pass to 
> Endpoint B, above.  Wait also allows conflicting attributes from the cache to 
> either be replaced or kept, depending on property configuration.
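
The example quoted above can be sketched roughly as follows. This is plain 
Python, not the processors' actual code: a dict stands in for the distributed 
map cache, the function names and the attr_regex / replace_conflicts 
parameters are hypothetical, and expiration of waiting flow files is left out.

```python
import re

# Stand-in for the distributed map cache: signal id -> cached attributes.
cache = {}


def notify(flowfile, attr_regex=None):
    """Store a release signal for myId, optionally caching matching attributes."""
    cached = {}
    if attr_regex is not None:
        cached = {k: v for k, v in flowfile.items() if re.fullmatch(attr_regex, k)}
    cache[flowfile["myId"]] = cached


def wait(flowfile, replace_conflicts=False):
    """Return the relationship ('wait' or 'success') the flow file is routed to."""
    signal = cache.pop(flowfile["myId"], None)  # a found entry is removed from the map
    if signal is None:
        return "wait"
    for key, value in signal.items():
        # Conflicting attributes are either replaced or kept, depending on the flag.
        if replace_conflicts or key not in flowfile:
            flowfile[key] = value
    return "success"
```

Because wait() removes the entry as soon as one flow file matches it, a second 
flow file waiting on the same identifier would not find it.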



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
