Mark,

Ultimately we have a versioned state machine, and we want to change
its definition while it still holds state.

In this case, some of that state no longer has a home in the new
version of the machine.

As you note, we detect this case and block the version change until
the user resolves that gap, since we cannot know automatically what
should happen to the orphaned state.
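
To make the detection step concrete, here is a minimal sketch in
Python. It is not NiFi code: the Connection class, the connection ids,
and orphaned_with_data are hypothetical names for illustration. The
idea is simply that a connection blocks the version change when it is
missing from the new version and its queue is not empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for a connection and its queue depth."""
    conn_id: str
    queued: int  # flowfiles currently sitting in the queue


def orphaned_with_data(current, new_version_ids):
    """Return connections absent from the new version that still hold flowfiles."""
    return [
        c for c in current
        if c.conn_id not in new_version_ids and c.queued > 0
    ]


current = [
    Connection("a->b", 0),
    Connection("b->c", 12),  # removed in the new version, still has data
    Connection("c->d", 3),   # kept in the new version
    Connection("b->x", 0),   # removed, but already empty
]
new_version_ids = {"a->b", "c->d"}

# Only connections in this list need attention before the update.
blockers = orphaned_with_data(current, new_version_ids)
```

A removed connection that is already empty does not block anything,
which is why stopping components and letting queues drain is a valid
fix.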

Solutions fall into two camps:
1. Things a user/process could do.
 - Manual steps. Stop components to let the state bleed out, then do
the version change. Or delete the state, then do the version change.
 - Make sure flow version changes reuse existing connections
meaningfully where possible. Not perfect, but helpful. Gaps remain.

2. Things we can do from an app/framework point of view.
 - Stop flows to squeeze the toothpaste out, so to speak. The timing
is not reliable, but it would probably work reasonably fast in most
cases.
 - Give an option to auto-delete state which no longer has a home.
 - Give an option to 'move' state (flowfiles) from a now-orphaned
connection to a connection that exists in the new version.
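
As a rough sketch of how the framework-side options could fit
together (again hypothetical Python, not an existing NiFi API): wait
for the orphaned queues to drain under one shared deadline, then fall
back to a chosen policy. The names Policy, drain, resolve, and the
move_to mapping are assumptions made for illustration. Queues are
modeled as a plain dict of connection id to flowfile count.

```python
import time
from enum import Enum


class Policy(Enum):
    FAIL = "fail"      # refuse the version change (today's behavior)
    DELETE = "delete"  # drop state that no longer has a home
    MOVE = "move"      # re-home state onto a connection in the new version


def drain(queues, orphaned, timeout_s, poll_s=0.1, tick=None):
    """Wait until every orphaned queue is empty or a shared deadline passes.

    `tick` is an optional hook called once per poll; it exists only so the
    sketch can simulate downstream components consuming flowfiles.
    """
    deadline = time.monotonic() + timeout_s
    while any(queues[c] > 0 for c in orphaned):
        if time.monotonic() >= deadline:
            return False
        if tick is not None:
            tick()
        time.sleep(poll_s)
    return True


def resolve(queues, orphaned, policy, move_to=None):
    """Apply the fallback policy to orphaned queues that still hold data."""
    for conn in orphaned:
        if queues[conn] == 0:
            continue
        if policy is Policy.FAIL:
            raise RuntimeError(f"{conn} still holds {queues[conn]} flowfiles")
        if policy is Policy.MOVE:
            queues[move_to[conn]] += queues[conn]
        queues[conn] = 0  # emptied by either DELETE or MOVE


queues = {"b->c": 2, "b->x": 5, "c->d": 1}
orphaned = ["b->c", "b->x"]

# Nothing consumes the queues here, so the drain times out and we fall back.
if not drain(queues, orphaned, timeout_s=0.05, poll_s=0.01):
    resolve(queues, orphaned, Policy.MOVE,
            move_to={"b->c": "c->d", "b->x": "c->d"})
```

Using one deadline for all orphaned queues, rather than a fixed delay
per connection, keeps the worst-case wait bounded no matter how many
connections are affected. Whether move or delete is acceptable is
still a decision only the flow owner can make.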

This is the blessing and curse of operating a durable state machine
and supporting version control changes as if the flow definition
exists at a point in time independent of data/state flowing through
it.

Joe

On Thu, Aug 19, 2021 at 11:05 AM Mark Bean <[email protected]> wrote:
>
> Scenario:
> Using NiFi Registry to version control changes to the graph. A development
> system is used to make all changes. Once the changes are "vetted", the
> production system will pull down the latest version to get the changes.
> Further, the goal is to automate the version update, and not require an
> operator to manually perform the version update through the UI.
>
> Now, consider the case where the new version removes a connection. When the
> production system attempts to apply the new version, it will fail if there
> are flowfiles in the queue of that removed connection. That's good in that
> it prevents data loss. However, it prevents the versioned process group
> from updating.
>
> Are there suggested solutions to this - besides the obvious of manually
> stopping the upstream flow and waiting for queue(s) to empty?
>
> One option we discussed was to allow components affected by the version
> change to be stopped "smartly". They would be ordered such that upstream
> components are stopped first, then downstream, and lastly controller
> services. (Is that being done currently?) Additionally, if a processor has
> flowfiles in an upstream queue (that is being removed by the version
> change), it would delay stopping the processor by some period of time thus
> giving the opportunity for that queue to empty. Granted, this could become
> problematic if there are many components that fall into this category.
> Also, what is an appropriate "period of time"? For example, a delay of even
> 2 seconds could result in the overall process taking over a minute if there
> are 30+ such connections.
>
> Comments are welcome.
>
> Thanks,
> Mark
