Re: Unable to modify flow when one of the nodes in a cluster is disconnected

Purushotham Pushpavanthar Mon, 01 Jul 2019 11:18:09 -0700

Mark, thanks for the clarification.

On Mon, Jul 1, 2019, 9:05 PM Mark Payne <[email protected]> wrote:


> My apologies, I wasn't very clear. If a node is in a disconnected state,
> you cannot make any changes
> to the cluster. You would first have to go to the Cluster menu and choose
> to remove the node from the cluster.
> Then you would be free to make changes to the flow. If the now-removed node
> is then restarted, it will attempt
> to re-join the cluster. At this point, if there are components that have
> been stopped/started/moved around, then
> the node will inherit these changes and join the cluster. But if you have
> changed a processor's properties, for
> instance, this will result in the node failing to join the cluster and
> indicating that the local flow differs from the cluster's flow.
>
>
> On Jun 29, 2019, at 2:53 PM, Purushotham Pushpavanthar <[email protected]> wrote:
>
> Hi Mark,
>
> I thank you for your time and descriptive insights. However, the concern I
> raised was regarding the allowable changes like changing the run status of
> the processors. I couldn't stop or start a processor in the cluster when
> one of the nodes was disconnected. The warning panel displayed is attached
> to the initial mail in this thread.
>
>
> *Now, there are some changes that we do allow, and the node will still
> re-join. For instance, if the positions of elements change, elements are
> started or stopped, etc. In these cases, the new node will just inherit the
> flow from the cluster and take on those changes.*
>
> Regarding the kinds of changes you mentioned in your previous mail, could
> you please shed some light on which release this has been supported from?
>
>
> Regards,
> Purushotham Pushpavanth
>
>
>
> On Thu, 27 Jun 2019 at 19:34, Mark Payne <[email protected]> wrote:
>
> Purushotham,
>
> If the node is disconnected and then attempts to reconnect, flow election
> does not occur. Rather, the node obtains a copy of the flow
> from the cluster, determines whether or not it matches, and if so rejoins.
> If the flow does not match, it disconnects and stops trying to
> reconnect.
>
> There are a few reasons that the node doesn't just inherit the cluster's
> flow blindly. Firstly, if a user were to delete a connection, and the
> re-joining node had data in that connection, it would lose the data. This
> is probably the most important reason - we never want to
> design for data loss.
>
> Secondly, when a node is disconnected from the cluster, the user is able
> to make changes. There are times when users will disconnect a
> particular node from the cluster and make some changes to the dataflow for
> diagnostic purposes. For example, they may want to temporarily
> send data to a new endpoint for sampling. When this happens, we don't want
> to just blindly lose those changes, because the user may not
> have wanted those changes lost. And if an admin is managing several
> systems, it's possible that they could accidentally configure the node
> to point to the wrong cluster, in which case it could potentially lose the
> entire dataflow. Perhaps not a problem if the dataflow exists on other
> nodes, but if this is a standalone node being converted into a cluster, it
> could be devastating for the user.
>
> Now, there are some changes that we do allow, and the node will still
> re-join. For instance, if the positions of elements change, elements are
> started or stopped, etc. In these cases, the new node will just inherit the
> flow from the cluster and take on those changes.
>
> I think it would probably be advantageous to allow the node to back up its
> own flow before inheriting from the cluster, and then apply any changes from
> the cluster that do not result in data loss (i.e., if any connection is
> removed and the node has data in that connection, then fail, else inherit).
> The big downside there, honestly, is that it's just a huge amount of effort
> that would be required in order to make that work properly.
>
> So to make a long story short: there are reasons that we don't just
> inherit the flow, but we could work around those problems. There are
> definitely areas where we could improve, but it's just not been taken on yet
> by anyone in the community.
>
> Thanks
> -Mark
>
>
> On Jun 27, 2019, at 3:37 AM, Purushotham Pushpavanthar <[email protected]> wrote:
>
> Hi,
>
> I have a 3-node cluster (ver 1.9.2) running in production. As our infra is
> unreliable due to various factors, our nodes go down often. We don't have a
> distinction between dev and prod clusters. We modify, deploy, and test in
> the same cluster. However, when one of the nodes goes down, NiFi prevents us
> from modifying the state of the flow and shows the warning window in the
> attachment.
>
> I read at
> <https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#flow-election>
> that if a node in the cluster is disconnected and comes back again, flow
> election happens. I would like to understand the motivation for not
> allowing changes to the flow in the above scenario.
> I was wondering why the latest node joining the cluster can't pull the
> elected flow.xml.gz from the cluster and apply it to itself?
>
> Regards,
> Purushotham Pushpavanth
>
>

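Editorial sketch of the re-join rule described in this thread: a returning node compares its local flow with the cluster's flow, inherits the cluster's copy when the two differ only in component positions or run state, and refuses to join when a component's properties differ. This is a minimal illustration in Python, not NiFi's actual implementation. The `Component` class and the `can_rejoin` and `rejoin` functions are hypothetical names, and treating added or removed components as a mismatch is an assumption based on the data-loss concern raised above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical, simplified view of one component in a flow."""
    id: str
    properties: tuple = ()         # configuration, e.g. (("Remote URL", "..."),)
    position: tuple = (0.0, 0.0)   # canvas position; ignored when comparing
    run_state: str = "STOPPED"     # RUNNING or STOPPED; ignored when comparing


def can_rejoin(local, cluster):
    """Return True if the flows differ only in position or run state.

    Both arguments are dicts mapping component id to Component.
    Assumption: added or removed components count as a mismatch.
    """
    if local.keys() != cluster.keys():
        return False
    return all(local[cid].properties == cluster[cid].properties for cid in local)


def rejoin(local, cluster):
    """Inherit the cluster's flow, or fail if the flows do not match."""
    if not can_rejoin(local, cluster):
        raise ValueError("local flow differs from the cluster's flow")
    return dict(cluster)
```

For example, a node whose only difference from the cluster is that a processor was moved and stopped would pass `can_rejoin` and take on the cluster's copy, while a node whose processor has a different property value would be rejected.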