[ 
https://issues.apache.org/jira/browse/NIFI-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774938#comment-17774938
 ] 

Mark Payne commented on NIFI-12228:
-----------------------------------

Because this is a concurrency bug, it is difficult to replicate and verify. 
However, I was able to replicate the issue locally by changing the code 
slightly. In {{LocalPort}}, I modified the {{triggerOutputPort}} method to 
sleep before closing flow out of the group:
{code:java}
try {
    transferUnboundedConcurrency(context, session);
} finally {
    try {
        // Artificial delay to widen the window between transferring data and closing the valve
        Thread.sleep(1500L);
    } catch (final InterruptedException ignored) {
    }

    dataValve.closeFlowOutOfGroup(getProcessGroup());
}{code}
With this in place, after rebuilding, the system tests failed in the same 
manner. This is because the Output Port has a chance to run and transfer data 
out of the group. But before it closes the valve, the Input Port now has a 
chance to run and open the valve that allows data to flow into the group. It 
then pulls a FlowFile in before the output valve is closed.

As a result, the call to {{dataValve.closeFlowOutOfGroup}} returns {{false}} 
because the group is not empty. Group B's output valve is therefore left open, 
so the FlowFiles in the connection between Groups A and B cannot be accessed, 
and the valve won't be closed until the next group of data is pushed out. Now 
there are 10 FlowFiles in the connection between Groups A and B, not 5. 
Meanwhile, Group A keeps bringing new data in until eventually there are 25 
FlowFiles in that connection. Since the test is expecting exactly 5, it times 
out waiting for that to occur.
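
To make the interleaving above concrete, here is a minimal sketch that replays the steps sequentially against a toy valve. Everything in it ({{ValveRaceSketch}}, {{ToyValve}}, {{replay}} and the method names) is hypothetical and heavily simplified. It is not the actual {{dataValve}} implementation, and it only models the rule that closing flow out of a group fails while the group is not empty.

```java
// Hypothetical, simplified model of the race; none of these names are NiFi APIs.
final class ValveRaceSketch {

    /** Toy valve for one group: tracks the queued count and whether flow out is open. */
    static final class ToyValve {
        private int queuedInGroup = 0;
        private boolean flowOutOpen = false;

        /** Input side: admit one FlowFile only if the group is currently empty. */
        synchronized boolean tryPullIn() {
            if (queuedInGroup > 0) {
                return false;
            }
            queuedInGroup++;
            return true;
        }

        /** Output side: open the valve and move everything queued out of the group. */
        synchronized int openAndTransferOut() {
            flowOutOpen = true;
            final int transferred = queuedInGroup;
            queuedInGroup = 0;
            return transferred;
        }

        /** Output side: closing succeeds only if the group is empty. */
        synchronized boolean closeFlowOut() {
            if (queuedInGroup > 0) {
                return false;
            }
            flowOutOpen = false;
            return true;
        }

        synchronized boolean isFlowOutOpen() {
            return flowOutOpen;
        }
    }

    /** Outcome of one replay. */
    static final class Result {
        final boolean closed;
        final boolean flowOutStillOpen;

        Result(final boolean closed, final boolean flowOutStillOpen) {
            this.closed = closed;
            this.flowOutStillOpen = flowOutStillOpen;
        }
    }

    /** Replays the steps in order; the flag decides whether the input port runs before the close. */
    static Result replay(final boolean inputRunsInWindow) {
        final ToyValve valve = new ToyValve();
        valve.tryPullIn();            // first FlowFile enters the group
        valve.openAndTransferOut();   // output port pushes it out of the group
        if (inputRunsInWindow) {
            valve.tryPullIn();        // window widened by the sleep: a second FlowFile enters
        }
        final boolean closed = valve.closeFlowOut();
        return new Result(closed, valve.isFlowOutOpen());
    }

    public static void main(final String[] args) {
        final Result racy = replay(true);
        final Result normal = replay(false);
        System.out.println("racy:   closed=" + racy.closed + ", flowOutStillOpen=" + racy.flowOutStillOpen);
        System.out.println("normal: closed=" + normal.closed + ", flowOutStillOpen=" + normal.flowOutStillOpen);
    }
}
```

In this toy model, the replay in which the input side runs inside the window ends with the close attempt failing and the output valve still open, which mirrors the state described above. The replay without that extra step closes the valve normally.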

> Concurrency bug can occasionally lead to allowing a group with Single 
> FlowFile per Node input pulling in multiple FlowFiles
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-12228
>                 URL: https://issues.apache.org/jira/browse/NIFI-12228
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.latest, 2.latest
>
>
> Every now and then we see a failure in the system tests:
> {code:java}
> BatchFlowBetweenGroupsIT.testSingleConcurrencyAndBatchOutputToBatchInputOutput
>  » Timeout testSingleConcurrencyAndBatchOutputToBatchInputOutput() timed out 
> after 5 minutes {code}
> Looking at the logs shows that this is happening because data enters Group A, 
> and then before the Output Port has a chance to push the data out of Group A, 
> a second FlowFile enters. As a result, the test fails because the queue 
> between Groups A and B, or the connection inside of Group B, never reaches the 
> expected size.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
