[ 
https://issues.apache.org/jira/browse/NIFI-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326110#comment-15326110
 ] 

Andrew Grande commented on NIFI-2001:
-------------------------------------

*nifi.flowcontroller.autoResumeState* - good to know there is something today. 
Can you clarify what's going on with the 1.0 line and clustering though? 
Shouldn't we have a way to stand up a cluster in safe mode? Some node somewhere 
should be the originator of properties to propagate, shouldn't there?

We may consider a short alias for the property, though, in the spirit of 
keeping things simple and intuitive: -safe is very easy to explain and 
remember, and a wrapper script can take care of any special aliasing and 
translate it to a runtime property before JVM launch (see below).

*-P<prop_name>* - that's a great idea, and I implemented exactly that in other 
projects, where it became a very frequently used feature. The design we took, 
however, was not to touch the original properties config (it often comes from 
source control, Puppet/Chef recipes, etc., and editing it in production is 
generally a no-no). Instead, on every launch the bootstrap code looked for a 
nifi-local.properties (or -override.properties, or any name of your liking). 
It was an overlay for that particular run: every launch wiped this file and the 
bootstrap process regenerated it from the passed-in command line switches.
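
A minimal sketch of that overlay idea, to make the moving parts concrete. It is 
not NiFi's bootstrap code: the class and method names are hypothetical, and the 
*-Pname=value* switch syntax is an assumption made for illustration. The base 
properties file is only ever read; the overlay file is deleted and rewritten on 
every launch, and its values win when the two are merged.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Sketch of a per-launch properties overlay; all names here are hypothetical. */
final class PropertyOverlay {

    /** Collects -Pname=value switches (assumed syntax); other arguments are ignored. */
    static Properties overridesFromArgs(String[] args) {
        Properties overrides = new Properties();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            // eq > 2 guarantees a non-empty property name after the "-P" prefix.
            if (arg.startsWith("-P") && eq > 2) {
                overrides.setProperty(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return overrides;
    }

    /** Regenerates the overlay from scratch, so nothing survives from a previous launch. */
    static void writeOverlay(Path overlayFile, Properties overrides) throws IOException {
        Files.deleteIfExists(overlayFile);
        try (Writer out = Files.newBufferedWriter(overlayFile, StandardCharsets.UTF_8)) {
            overrides.store(out, "Generated at launch; do not edit");
        }
    }

    /** Loads the base file, then the overlay on top; the base file is never written. */
    static Properties effective(Path baseFile, Path overlayFile) throws IOException {
        Properties merged = new Properties();
        try (Reader in = Files.newBufferedReader(baseFile, StandardCharsets.UTF_8)) {
            merged.load(in);
        }
        if (Files.exists(overlayFile)) {
            try (Reader in = Files.newBufferedReader(overlayFile, StandardCharsets.UTF_8)) {
                merged.load(in); // later load() calls overwrite earlier keys
            }
        }
        return merged;
    }
}
```

With this shape, a launcher would call overridesFromArgs and writeOverlay once 
before starting the JVM, and the runtime would read its configuration through 
effective. A launch with no -P switches produces an empty overlay, so the 
values from the deployment system apply unchanged.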

In any case, as long as there is a cluster-wide, intuitive property name we can 
add for the current launch, and it doesn't change the core properties file 
which came from a deployment system, we should be all set. You choose the 
implementation :)

> Introduce a 'safe' startup mode
> -------------------------------
>
>                 Key: NIFI-2001
>                 URL: https://issues.apache.org/jira/browse/NIFI-2001
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 0.6.1
>            Reporter: Andrew Grande
>
> *Driver*: after NiFi crashes (e.g. with OOME), one needs to go in and 
> manually flush queues or partially enable/disable flows to let them run and 
> drain data. The same applies if, for some reason, NiFi was sent into a CPU or 
> I/O death spiral by the flow (e.g. a design mistake, or an unexpected spike 
> in the data without backpressure controls configured).
> Today it's not possible without directly hacking the flow.xml.gz. This 
> becomes even more troublesome in a clustered environment. Another drastic 
> alternative is to wipe out all repos, but this means data loss.
> *Proposal*: add a 'safe' switch/mode to NiFi, which will let it start up with 
> full UI and infrastructure controls, but with all flows and CS explicitly 
> stopped.
> E.g. invoking *nifi.sh start -safe* on command line.
> Internally, the following should happen:
> * NiFi starts up as normal, with all libraries, plugins, etc.
> * However, *none* of the processors, CS & groups are started/enabled



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
