Hi Mark - thanks for your prompt response. A few thoughts:

a) Currently, when Processor A is configured to run on the Primary Node, and 
absent any special configuration (e.g. the rest of the flow being configured 
as a Process Group), the downstream Processors in the flow seem to run 
automatically on the Primary Node too. So in a sense, we already have affinity, 
or grouping, of processors to a given node, except that this is limited to the 
Primary Node. 
Could we not allow the scheduling of the Isolated Processor to occur on ANY 
single node, rather than just the Primary node? That would suffice for our 
current use case - i.e. we would be perfectly load balanced on initial ingest 
ACROSS the entire cluster, even though the entire downstream flow would run on 
whichever node the isolated processor was (randomly) scheduled on.
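
To make the idea in (a) concrete, here is a minimal sketch of how one node per 
isolated processor might be chosen. It is purely illustrative: pick_node, the 
processor ids and the node names are all hypothetical and not part of NiFi. It 
also swaps the random choice for a hash of the processor id, on the assumption 
that every node should reach the same answer without coordinating.

```python
import hashlib


def pick_node(processor_id: str, nodes: list[str]) -> str:
    """Pick ONE node for an isolated processor (hypothetical helper, not NiFi API)."""
    if not nodes:
        raise ValueError("cluster has no connected nodes")
    # Sort so that every node evaluates the same ordering and agrees on the result.
    ordered = sorted(nodes)
    # Hash the processor id so different isolated processors tend to spread out.
    digest = hashlib.sha256(processor_id.encode("utf-8")).digest()
    return ordered[int.from_bytes(digest[:8], "big") % len(ordered)]


if __name__ == "__main__":
    cluster = ["node-1", "node-2", "node-3"]  # assumed node names
    for proc in ("ingest-A", "ingest-B", "ingest-C"):  # assumed processor ids
        print(proc, "->", pick_node(proc, cluster))
```

The downstream flow would then simply follow the isolated processor to 
whichever node was picked. Note that a plain modulo like this reshuffles 
assignments whenever a node joins or leaves the cluster.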

b) That said, the "Scheduling Group" paradigm sounds very promising, if it 
includes the ability to group Processors/Flows as well as to restrict where 
they run to groups of nodes. It is even more interesting if the concept can be 
coupled with multi-tenancy, so that cluster resources (viz. the nodes) can be 
partitioned among, or isolated to, particular tenants.

Regards, Manoj 

Manoj Seshan - Senior Architect
Platform Content Technology, Bangalore

Voice: +91-9686578756  +91-80-67492572

-----Original Message-----
From: Mark Payne [mailto:[email protected]] 
Sent: Tuesday, April 05, 2016 6:01 PM
To: [email protected]
Subject: Re: Feature Request: Isolated Processors on ANY ONE node rather than 
on Primary node alone

Manoj,

That is a very good point, and it is something that we are working toward.
However, it does get a little bit more complicated than this. If you have some 
Processor, say Processor A running on some arbitrary node, there will often be 
times that you will also need another Processor, Processor B, running on that 
same node.

Using a Primary Node means that we are able to accomplish this easily, but as 
you are noting here, it is quite limiting. In version 1.0.0 of NiFi, one of the 
big changes is a Zero-Master clustering design, whereby the Primary Node is 
automatically elected and fails over to a different node whenever the Primary 
Node leaves the cluster. This improves the overall functionality of Primary 
Node but does not address the issue here, of avoiding scheduling all 
"singleton" processors on the same node.

I think the path that we'd like to take moving forward, post-1.0.0, is to 
provide a mechanism that allows the user to schedule a Processor to run in some 
sort of named "Scheduling Group". So, for instance, you could say Processor A 
and B should both run in "Group A" but Processor C should run in "Group C". 
This way, we can ensure that Processors that need to run together can do so 
while at the same time avoiding the need for all such processors to run on the 
same node.
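
As a rough sketch of what such a "Scheduling Group" could look like (in Python 
purely for illustration; nothing here exists in NiFi, and the class, method and 
node names are all made up), each group is pinned to one node, and a new group 
goes to the node currently hosting the fewest groups:

```python
from dataclasses import dataclass, field


@dataclass
class SchedulingGroups:
    """Hypothetical registry mapping processors -> named groups -> nodes."""
    nodes: list[str]
    processor_group: dict[str, str] = field(default_factory=dict)
    group_node: dict[str, str] = field(default_factory=dict)

    def assign(self, processor: str, group: str) -> None:
        """Put a processor in a group; place a new group on the least-loaded node."""
        self.processor_group[processor] = group
        if group not in self.group_node:
            # Count how many groups each node already hosts.
            load = {node: 0 for node in self.nodes}
            for node in self.group_node.values():
                load[node] += 1
            # Ties are broken by the order of self.nodes.
            self.group_node[group] = min(self.nodes, key=lambda n: load[n])

    def node_for(self, processor: str) -> str:
        """Return the node a processor runs on, via its group."""
        return self.group_node[self.processor_group[processor]]


if __name__ == "__main__":
    groups = SchedulingGroups(nodes=["node-1", "node-2"])  # assumed node names
    groups.assign("Processor A", "Group A")
    groups.assign("Processor B", "Group A")
    groups.assign("Processor C", "Group C")
    for name in ("Processor A", "Processor B", "Processor C"):
        print(name, "->", groups.node_for(name))
    # Processor A and B share node-1; Processor C lands on node-2.
```

The point is only that A and B are guaranteed to land together while C is free 
to go elsewhere; failover and rebalancing are left out entirely.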

Does this sound like a reasonable approach for your use case?

Thanks
-Mark

> On Apr 5, 2016, at 3:08 AM, <[email protected]> 
> <[email protected]> wrote:
> 
> For the purposes of symmetry of the NiFi Cluster, and so that the initial 
> ingest of content is not limited to just one primary node in the NiFi 
> cluster, would it not be beneficial for the framework to have the ability to 
> schedule an Isolated Processor on ANY ONE of the available nodes in the NiFi 
> Cluster?
>  
> Regards, Manoj
>  
> Manoj Seshan - Senior Architect
> Platform Content Technology, Bangalore
> 
> Voice: +91-9686578756  +91-80-67492572
