[
https://issues.apache.org/jira/browse/NIFI-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Handermann resolved NIFI-10070.
-------------------------------------
Fix Version/s: 1.17.0
Resolution: Fixed
> NiFi fails to delete/update component because it's still running, immediately
> after confirming that the component is stopped.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: NIFI-10070
> URL: https://issues.apache.org/jira/browse/NIFI-10070
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Nathan Gough
> Priority: Major
> Labels: clustering, entity, merging, response
> Fix For: 1.17.0
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> This issue has been identified by analyzing the logs, code, etc., of the
> system tests. Many of the system tests indicate that after each test (or
> after a set of tests), the flow must be torn down. This will stop all
> processors/reporting tasks and disable all controller services. It will then
> wait for them to fully stop/disable, according to the REST API. It will then
> purge any queues and delete all components.
> However, occasionally we see a failure in the step that deletes the
> components. One node will indicate that the component cannot be deleted
> because it's still running, so the REST API will send back a 409. However,
> before making this request, we've already made a request to get all
> components and verified that their state is STOPPED/DISABLED and that they
> have no active threads.
> If we look at the code that is used to determine whether or not they are
> STOPPED/DISABLED, it is using the "status" field in the Entity objects
> ({{reportingTaskEntity.getStatus().getRunStatus()}}, for example).
> However, the DTO also has a state field: {{ReportingTaskDTO.getState()}}.
> We have the same situation with Processors, Reporting Tasks, and Controller
> Services: each has both the Entity status and the DTO state.
> In order to maintain backward compatibility, we need to leave both of these
> fields. However, the issue we have appears to be in the
> ReportingTaskEntityMerger, ProcessorEntityMerger, and
> ControllerServiceEntityMerger.
> These mergers do not merge the status field in the Entity. They take into
> account only the fields in the DTO. As a result, we can have one node
> indicating that the status is STOPPED with 0 threads while another node
> indicates STOPPED with 1 thread. The merging logic may choose the STOPPED
> with 0 threads, which makes the component appear fully stopped. At this
> point, a delete or update will fail because the component is not in the
> desired state on all nodes.
> We need to update the 3 Entity Mergers to ensure that they properly merge the
> status in the Entity objects as well.
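
The merging behavior described above can be illustrated with a minimal Java
sketch. It is not the actual NiFi merger code: NodeStatus, merge() and
isFullyStopped() are hypothetical names, and the rule shown (sum the active
thread counts, report STOPPED only when every node reports STOPPED) is an
assumption about what "properly merge" could mean, not the fix that was
committed.

```java
import java.util.List;

// Illustrative only: these are hypothetical types, not NiFi classes.
final class StatusMergeSketch {

    static final String STOPPED = "STOPPED";

    /** Hypothetical per-node view of a component's Entity status. */
    static final class NodeStatus {
        final String runStatus;      // e.g. "STOPPED" or "RUNNING"
        final int activeThreadCount; // threads still active on that node

        NodeStatus(String runStatus, int activeThreadCount) {
            this.runStatus = runStatus;
            this.activeThreadCount = activeThreadCount;
        }
    }

    /** Conservative merge: sum thread counts; STOPPED only if all nodes agree. */
    static NodeStatus merge(List<NodeStatus> perNode) {
        if (perNode.isEmpty()) {
            throw new IllegalArgumentException("at least one node status is required");
        }
        String mergedRunStatus = STOPPED;
        int totalThreads = 0;
        for (NodeStatus status : perNode) {
            totalThreads += status.activeThreadCount;
            // Keep the first non-STOPPED status reported by any node.
            if (STOPPED.equals(mergedRunStatus) && !STOPPED.equals(status.runStatus)) {
                mergedRunStatus = status.runStatus;
            }
        }
        return new NodeStatus(mergedRunStatus, totalThreads);
    }

    /** A component is safe to delete/update only if no node has active threads. */
    static boolean isFullyStopped(NodeStatus merged) {
        return STOPPED.equals(merged.runStatus) && merged.activeThreadCount == 0;
    }

    public static void main(String[] args) {
        // The scenario from the description: both nodes STOPPED, one still has a thread.
        NodeStatus merged = merge(List.of(
                new NodeStatus(STOPPED, 0),
                new NodeStatus(STOPPED, 1)));
        System.out.println(merged.runStatus + " threads=" + merged.activeThreadCount
                + " fullyStopped=" + isFullyStopped(merged));
        // prints: STOPPED threads=1 fullyStopped=false
    }
}
```

With a merge of this kind, a client polling the merged status would keep
waiting until the second node's thread finishes, instead of proceeding to a
delete that one node then rejects with a 409.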
--
This message was sent by Atlassian Jira
(v8.20.10#820010)