Mark, 
Changing the concept of "Run on Primary Node" to "Run on Only One Node" will 
not solve the problem. Named grouping constructs would be a better option. 

Nijel, 
Our use case is similar. We have many tasks that must run on only one node, and 
we want to distribute the load. If we could have a list of primary nodes to 
distribute the load across, it would solve our problem. 
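
To make the "list of primary nodes" idea concrete, here is a minimal sketch of 
one possible distribution scheme (rendezvous hashing). It is not a NiFi API or 
an existing NiFi feature; the node names, task names, and the functions 
assign_node and assign_all are all hypothetical, chosen only for illustration.

```python
import hashlib


def assign_node(task_name, nodes):
    """Pick one node for a task using rendezvous (highest-random-weight) hashing.

    Hypothetical helper: 'nodes' stands in for a list of primary-capable nodes.
    """
    if not nodes:
        raise ValueError("at least one candidate node is required")

    def weight(node):
        # Hash the (node, task) pair; the node with the largest hash wins.
        digest = hashlib.sha256(f"{node}|{task_name}".encode("utf-8")).hexdigest()
        return int(digest, 16)

    return max(nodes, key=weight)


def assign_all(task_names, nodes):
    """Map every task name to exactly one node."""
    return {task: assign_node(task, nodes) for task in task_names}
```

With this scheme each task still runs on only one node, and if a node drops 
out of the list, only the tasks that were assigned to that node move to 
another one. Keeping the node list current (for example through ZooKeeper, as 
Nijel suggests) is outside the scope of the sketch.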
 
Tijo

    On Wednesday, 21 September 2016 6:01 PM, "marka...@hotmail.com" 
<marka...@hotmail.com> wrote:
 

 Nijel,

I'd like to hear more about your use case. From the description given, I'm 
not sure that all of this would need to run on a primary node. Generally, you 
want only "source processors" to run on the primary node.

One thing that I've been thinking about, though, is changing the concept of 
"Run on Primary Node" to a "Run on Only One Node." The concern there is that we 
will have cases where a few processors have to run on the same node. So we 
would need a mechanism for supporting that. Perhaps some sort of named grouping 
construct. 

Thoughts?

Sent from my iPhone

> On Sep 21, 2016, at 5:07 AM, Nijel s f <nijel...@huawei.com> wrote:
> 
> Hi all
> 
> In support of Tijo's thought, we have one scenario.
> 
> We are trying to use NiFi for a data pipeline solution. The scenario is to 
> coordinate between various services and provide a solution for big data 
> analysis.
> 
> In our scenario, many of the activities are "run on primary node" style 
> processors. They are implemented on top of various components such as YARN, 
> HBase, Spark, databases, etc.
> 
> One issue we are seeing is that all of these processors (Spark execution, 
> YARN/MR job execution, etc.) have to run on the primary node, and there is 
> only one primary node.
> 
> We are thinking of having multiple primary nodes and assigning the 
> activities to them using some distribution algorithm. The idea is to handle 
> the coordination and failover mechanism using ZooKeeper.
> 
> Any thoughts on this?
> 
> Regards
> Nijel
> 
> From: Jeff [mailto:jtsw...@gmail.com]
> Sent: Monday, September 19, 2016 11:17 PM
> To: Tijo Thomas; us...@nifi.apache.org
> Subject: Re: enforce run only in promary node $ multiple primary node
> 
> Tijo,
> 
> To give you some information on your second question, you can design your 
> flow to redistribute the flowfiles coming out of your processors to other 
> nodes in the cluster for processing. There are several examples of how to do 
> this on various blogs, email lists, etc. I just grabbed one for reference, 
> written by Apache NiFi's own Bryan Bende: 
> http://apache-nifi.1125220.n5.nabble.com/How-to-configure-site-to-site-communication-between-nodes-in-one-cluster-td8528.html
> 
> Please review that thread and let us know if you have further questions!
> 
> On Mon, Sep 19, 2016 at 1:19 PM Tijo Thomas 
> <tijopara...@yahoo.in> wrote:
> 
> Hi ,
> 
> 1. While writing a processor, is it possible to enforce that it runs only on 
> the primary node? I saw a JIRA for this, but it appears to be unresolved.
> 
> [NIFI-543] Provide extensions a way to indicate that they can run only on 
> primary node, if clustered - ASF JIRA: 
> https://issues.apache.org/jira/browse/NIFI-543
> 
> 2. Currently my primary node is heavily loaded, as I have many processors 
> that run only on the primary node. Is it possible to define multiple 
> primary nodes, or to configure processors not to run on the primary node?
> 
> Tijo


   
