Hi Mark,

Thanks for your prompt response. A few thoughts:

a) Currently, when Processor A is configured to run on the Primary Node, and in the absence of special configuration (e.g. the rest of the flow being configured as a Process Group), the downstream Processors in the flow seem to run on the Primary Node automatically too. So in a sense we already have affinity, or grouping, of processors to a given node, except that it is limited to the Primary Node. Could we not allow the Isolated Processor to be scheduled on ANY single node, rather than just the Primary Node? That would suffice for our current use case, i.e. we would be perfectly load balanced on initial ingest ACROSS the entire cluster, even though the entire downstream flow would run on whichever node the isolated processor was (randomly) scheduled on.

b) That said, the "Scheduling Group" paradigm sounds very promising if it includes the ability to group Processors/Flows as well as to restrict where they run to groups of nodes. It would be even more interesting if the concept could be coupled with multi-tenancy, so that cluster resources (viz. the nodes) can be partitioned among, and isolated to, particular tenants.

Regards,
Manoj

Manoj Seshan - Senior Architect
Platform Content Technology, Bangalore
Voice: +91-9686578756 +91-80-67492572

-----Original Message-----
From: Mark Payne [mailto:[email protected]]
Sent: Tuesday, April 05, 2016 6:01 PM
To: [email protected]
Subject: Re: Feature Request: Isolated Processors on ANY ONE node rather than on Primary node alone

Manoj,

That is a very good point, and it is something that we are working toward. However, it does get a little bit more complicated than this. If you have some Processor, say Processor A, running on some arbitrary node, there will often be times when you also need another Processor, Processor B, running on that same node. Using a Primary Node means that we are able to accomplish this easily but, as you are noting here, it is quite limiting.

In version 1.0.0 of NiFi, one of the big changes is a Zero-Master clustering design, whereby the Primary Node is automatically elected and fails over to a different node whenever the Primary Node leaves the cluster. This improves the overall functionality of the Primary Node but does not address the issue here, which is avoiding scheduling all "singleton" processors on the same node.

I think the path that we'd like to take moving forward, post-1.0.0, is to provide a mechanism that allows the user to schedule a Processor to run in some sort of named "Scheduling Group". So, for instance, you could say Processor A and B should both run in "Group A" but Processor C should run in "Group C".
This way, we can ensure that Processors that need to run together can do so while at the same time avoiding the need for all such processors to run on the same node.

Does this sound like a reasonable approach for your use case?

Thanks
-Mark

> On Apr 5, 2016, at 3:08 AM, <[email protected]> <[email protected]> wrote:
>
> For the purposes of symmetry of the NiFi Cluster, and so that the initial
> ingest of content is not limited to just one primary node in the NiFi
> cluster, would it not be beneficial for the framework to have the ability to
> schedule an Isolated Processor on ANY ONE of available nodes in the NiFi
> Cluster?
>
> Regards, Manoj
>
> Manoj Seshan - Senior Architect
> Platform Content Technology, Bangalore
>
> Voice: +91-9686578756 +91-80-67492572
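
For illustration only, the named "Scheduling Group" idea discussed in this thread could be sketched roughly as follows. This is not NiFi code and does not reflect any NiFi API: the function name `assign_scheduling_groups`, the node names, and the round-robin placement rule are all assumptions made for the example. The sketch only shows the intended property, namely that processors sharing a group land on one node while different groups can land on different nodes.

```python
def assign_scheduling_groups(processor_groups, nodes):
    """Hypothetical sketch: place each named scheduling group on one node.

    processor_groups maps a processor name to its scheduling group name.
    nodes is the list of node identifiers currently in the cluster.
    Returns a mapping of processor name to the node it would run on.
    """
    if not nodes:
        raise ValueError("at least one node is required")

    # Sort both sides so that every caller computes the same placement.
    ordered_nodes = sorted(nodes)
    groups = sorted(set(processor_groups.values()))

    # Assumed policy: round-robin groups over nodes. Groups only share a
    # node when there are more groups than nodes.
    group_to_node = {
        group: ordered_nodes[i % len(ordered_nodes)]
        for i, group in enumerate(groups)
    }

    # Every processor follows its group, so group members are co-located.
    return {proc: group_to_node[group] for proc, group in processor_groups.items()}


# Example from the thread: A and B share "Group A", C runs in "Group C".
placement = assign_scheduling_groups(
    {"Processor A": "Group A", "Processor B": "Group A", "Processor C": "Group C"},
    ["node-1", "node-2", "node-3"],
)
```

With three nodes and two groups, Processor A and Processor B end up on the same node and Processor C on a different one. A real implementation would also have to handle failover when a node leaves the cluster, which this sketch ignores.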
