[ 
https://issues.apache.org/jira/browse/NIP-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18063035#comment-18063035
 ] 

Sean Hunter commented on NIP-22:
--------------------------------

Copying my email in here for reference:

I don’t have a strong opinion either way (and I’m not on the PMC, so take my 
thoughts with a grain of salt), but I do see tradeoffs with both approaches.

The existing approach has the obvious downside of not balancing the load across 
the cluster. I can understand the appeal of alleviating that, as it’s something 
I’ve occasionally considered in my own environment.

However, switching to a load-balanced approach has a downside as well: 
troubleshooting becomes more difficult. Today, I can log into a cluster, check 
which node is primary, and know where to expect those processors to be 
executing. This has been useful on rare occasions.

As I said, I don’t have a particularly strong opinion, but I’d be curious 
whether others have also benefited from knowing where a given processor was 
executing. Perhaps there could be a way to show users where the cluster thinks 
a given single-node processor is supposed to be scheduled?

> "Single Node Only" Execution Node
> ---------------------------------
>
>                 Key: NIP-22
>                 URL: https://issues.apache.org/jira/browse/NIP-22
>             Project: NiFi Improvement Proposal
>          Issue Type: New Feature
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>
> h3. Motivation
> NiFi supports running a Processor on "All Nodes" or "Primary Node Only." This 
> works well when you have a few processors that don't require a large amount 
> of overhead. However, some flows have many different processors that have to 
> be run as Primary Node Only, and it can cause a single node to be heavily 
> overloaded.
> There are times when we want a Primary Node because we want multiple flows 
> running on the same node. But the vast majority of the time, we only care 
> that the processor runs on exactly one node, not which node that is.
> To this end, we should introduce the ability to schedule a Processor to run 
> on a "Single Node Only" without caring which node. If we have 100 Processors 
> configured for Single Node Only, each node in the cluster should run some 
> number of those Processors but not all. It is worth noting that we already 
> have the necessary prerequisites in place for Leader Election, as we already 
> use these for Primary Node and Cluster Coordinator - this simply allows for 
> more leases to be used.
> h3. NiFi API Changes
> Introduce a new ExecutionNode value of `SINGLE_NODE_ONLY`.
> Deprecate the `@PrimaryNodeOnly` annotation in favor of a new 
> `@SingleNodeOnly` annotation.
> Introduce a new `@OnSingleNodeElectionChange` annotation for processors, 
> analogous to the existing `@OnPrimaryNodeStateChange`.
> h3. REST API Changes
> There should be no significant changes to the REST API, except to allow for 
> the new Execution Node value.
> h3. UI Changes
> We will need a new icon similar to the 'P' badge that shows on Processors 
> when the Processor is run on Primary Node Only. This also appears in the 
> Summary table. We also need the ability to choose the new value for the 
> "Execution Node".
> h3. Open Questions
> Do we truly have a need for Primary Node still? Or should this replace it 
> entirely?
>  
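
For reference, here is a rough sketch of how the API changes in the quoted
description might look in Java. Everything in it is an assumption made for
illustration: the nested types only mimic the names from the proposal
(`SINGLE_NODE_ONLY`, `@SingleNodeOnly`, `@OnSingleNodeElectionChange`), the
callback signature taking a single boolean is invented, and
`SingleNodeSketch`, `DemoProcessor`, `defaultExecutionNode`, and
`notifyElectionChange` are hypothetical helpers, not NiFi classes or methods.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch only: none of these types are the real NiFi API.
final class SingleNodeSketch {

    // Mirrors the options named in the proposal; SINGLE_NODE_ONLY is the new value.
    enum ExecutionNode { ALL, PRIMARY, SINGLE_NODE_ONLY }

    // Proposed class-level marker for components that must run on exactly one node.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface SingleNodeOnly { }

    // Proposed method-level marker for election-change callbacks.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface OnSingleNodeElectionChange { }

    // Hypothetical processor that remembers whether this node currently holds its lease.
    @SingleNodeOnly
    static class DemoProcessor {
        private boolean elected;

        // Assumed signature: true when this node was elected to run the processor.
        @OnSingleNodeElectionChange
        public void onElectionChange(boolean electedHere) {
            this.elected = electedHere;
        }

        boolean isElected() {
            return elected;
        }
    }

    // Pick the execution node a component would default to, based on its marker annotation.
    static ExecutionNode defaultExecutionNode(Class<?> type) {
        return type.isAnnotationPresent(SingleNodeOnly.class)
                ? ExecutionNode.SINGLE_NODE_ONLY
                : ExecutionNode.ALL;
    }

    // Invoke every annotated callback that accepts one boolean; return how many ran.
    static int notifyElectionChange(Object component, boolean electedHere) {
        int invoked = 0;
        for (Method m : component.getClass().getDeclaredMethods()) {
            if (!m.isAnnotationPresent(OnSingleNodeElectionChange.class)) {
                continue;
            }
            Class<?>[] params = m.getParameterTypes();
            if (params.length != 1 || params[0] != boolean.class) {
                continue;
            }
            try {
                m.setAccessible(true);
                m.invoke(component, electedHere);
                invoked++;
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new IllegalStateException("Callback failed: " + m.getName(), e);
            }
        }
        return invoked;
    }
}
```

In this sketch a framework would call `notifyElectionChange(processor, true)`
on the node that wins the election for that processor and
`notifyElectionChange(processor, false)` on a node that loses it. How the real
framework would discover and invoke such callbacks is not specified in the
proposal.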



--
This message was sent by Atlassian Jira
(v8.20.10#820010)