[ https://issues.apache.org/jira/browse/NIFI-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Bende updated NIFI-842:
-----------------------------
Attachment: NIFI-842.patch
The attached patch modifies RunNiFi so that it stores the timestamp of the
first failed start-up and, on subsequent restarts, continues only if 60 seconds
or less have elapsed since that failure.
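The idea can be sketched roughly as follows. This is only an illustration of
the time-window check, not code from the attached patch; the class name
RestartWindow and the method shouldRestart are hypothetical, and the 60-second
window is passed in by the caller.

```java
/**
 * Hypothetical helper illustrating the time-window check; not taken from
 * the attached patch or from RunNiFi itself.
 */
class RestartWindow {
    private final long windowMillis;
    // -1 means no failure has been recorded yet.
    private long firstFailureMillis = -1L;

    RestartWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /**
     * Called each time the child process is found to have failed.
     * Records the first failure, then allows further restarts only
     * while the window measured from that first failure is still open.
     */
    boolean shouldRestart(long nowMillis) {
        if (firstFailureMillis < 0) {
            firstFailureMillis = nowMillis;
            return true;
        }
        return nowMillis - firstFailureMillis <= windowMillis;
    }
}
```

A caller would construct it with something like new RestartWindow(60_000L) and
pass System.currentTimeMillis() on each detected failure.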
The restart loop is somewhat tricky in that it can't really tell what is going
on in the other process that is starting. Even when the other process is going
to fail, RunNiFi thinks it started the process successfully and can sometimes
hit the "if (alive)" block on line 790 or the "if (started)" block on line 834,
even though the process may have failed by the next iteration.
This creates a problem: there is no good way to reset the last-failed
timestamp, because RunNiFi doesn't know what really counts as a successful
start-up. Without resetting the timestamp, a problem scenario would be: NiFi is
running and hits its first failure, so RunNiFi stores the timestamp and
restarts it. The restart works and NiFi continues running for hours or days.
Then another failure occurs, and this time RunNiFi sees that more than 60
seconds have passed since the stored failure, so it just stops when it could
have tried to restart.
A completely different approach to the issue of a bad service loader
configuration would be to catch ServiceConfigurationError around lines 120-121
of NiFi and then remove the pid file, which would signal to RunNiFi not to
restart. This would not address the overall issue the ticket describes, though.
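A minimal sketch of that alternative, assuming the start-up work can be wrapped
in a Runnable and the pid file location is known. StartupGuard and
startOrRemovePidFile are hypothetical names for illustration; the real code in
NiFi would look different.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceConfigurationError;

/** Hypothetical wrapper; the names here are not from the NiFi code base. */
class StartupGuard {
    /**
     * Runs the start-up action. If service loading is misconfigured,
     * deletes the pid file (the signal not to restart) and returns false.
     */
    static boolean startOrRemovePidFile(Runnable startup, Path pidFile)
            throws IOException {
        try {
            startup.run();
            return true;
        } catch (ServiceConfigurationError e) {
            // A bad service loader configuration will not fix itself on
            // restart, so remove the pid file instead of retrying.
            Files.deleteIfExists(pidFile);
            return false;
        }
    }
}
```

Any other exception or error still propagates to the caller, so only the
service loader case suppresses the restart.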
> If Bootstrap is unable to start NiFi, the process just stays around,
> listening for connections
> ----------------------------------------------------------------------------------------------
>
> Key: NIFI-842
> URL: https://issues.apache.org/jira/browse/NIFI-842
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Bryan Bende
> Fix For: 0.3.0
>
> Attachments: NIFI-842.patch
>
>
> When the bootstrap launches NiFi, it should allow some amount of time
> (perhaps a minute?) waiting to hear from NiFi. If it never hears, then it
> should assume that the app was unable to start at all. In this case,
> attempting to start it again is not going to be very helpful, so the
> bootstrap should just exit