[jira] [Commented] (NIFI-14710) Stateless Process Groups can lead to data loss when stopped

ASF subversion and git services (Jira) Wed, 09 Jul 2025 13:45:15 -0700


    [ 
https://issues.apache.org/jira/browse/NIFI-14710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004261#comment-18004261
 ]


ASF subversion and git services commented on NIFI-14710:
--------------------------------------------------------

Commit 82ee8a204b05363aa80d1db5233584dc54cbd7cd in nifi's branch 
refs/heads/main from Pierre Villard
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=82ee8a204b ]

NIFI-14710 - Stateless should trigger failure callback when queues are not 
empty after process group is stopped (#10074)

> Stateless Process Groups can lead to data loss when stopped
> -----------------------------------------------------------
>
>                 Key: NIFI-14710
>                 URL: https://issues.apache.org/jira/browse/NIFI-14710
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework, NiFi Stateless
>    Affects Versions: 2.4.0
>            Reporter: Joe Witt
>            Assignee: Pierre Villard
>            Priority: Critical
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When you configure a process group in 'stateless' execution mode, it changes 
> how the flow works considerably. One of these changes is the idea that 
> either a full chain of processes runs correctly and is committed, or it is 
> all rolled back. If data is routed to the 'failure' output port, for instance, 
> this signals that we have a failure and things should roll back.
> This is really important for cases like 
> - ConsumeKafka
> - do stuff
> - send to someplace like a database
> And you want to have at-least-once or even exactly-once semantics.
> Today, though, stateless has a defect. Consider the case where you have only 
> two processors in the stateless group:
> - ConsumeKafka
> - SendToFancyDB
> If the SendToFancyDB processor is failing to send somewhere but chooses to 
> roll back itself rather than routing to failure, then the data is technically 
> sitting in the queue between ConsumeKafka and SendToFancyDB.
> If a user then chooses to stop the process group, we have data sitting in that 
> queue, but it isn't failed. Stateless currently treats that as a 
> successful condition and commits the Kafka session, which acks the 
> data even though it never actually got to SendToFancyDB. (I'm not sure what 
> happens to the data in NiFi in that case, but it is definitely 'lost' in the 
> sense that we acked the offset from Kafka, so subsequent runs won't get it 
> either.)
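
The decision described above, together with the commit summary ("trigger failure callback when queues are not empty after process group is stopped"), can be illustrated with a minimal sketch. This is not the NiFi implementation: the class, enum, and method names below are hypothetical, and the queue state is modeled as a plain map from connection name to queued FlowFile count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names come from the NiFi codebase.
final class StatelessStopSketch {

    enum Outcome { COMMIT_AND_ACK, FAILURE_CALLBACK }

    /**
     * Decides what to do with the source session when the stateless group stops.
     *
     * @param routedToFailurePort whether any data reached the 'failure' output port
     * @param queuedCounts        assumed model: connection name -> FlowFiles still queued
     */
    static Outcome outcomeOnStop(boolean routedToFailurePort, Map<String, Integer> queuedCounts) {
        // Explicit failure: the whole chain should roll back.
        if (routedToFailurePort) {
            return Outcome.FAILURE_CALLBACK;
        }
        // Data left in any internal queue never reached its destination,
        // so acknowledging the source (e.g. committing Kafka offsets) would lose it.
        for (int queued : queuedCounts.values()) {
            if (queued > 0) {
                return Outcome.FAILURE_CALLBACK;
            }
        }
        // Nothing failed and nothing is stranded: safe to commit and ack.
        return Outcome.COMMIT_AND_ACK;
    }

    public static void main(String[] args) {
        Map<String, Integer> queues = new LinkedHashMap<>();

        // SendToFancyDB rolled back, leaving 3 FlowFiles in the queue at stop time.
        queues.put("ConsumeKafka -> SendToFancyDB", 3);
        System.out.println(outcomeOnStop(false, queues)); // prints FAILURE_CALLBACK

        // All queues drained: the session can be committed.
        queues.put("ConsumeKafka -> SendToFancyDB", 0);
        System.out.println(outcomeOnStop(false, queues)); // prints COMMIT_AND_ACK
    }
}
```

The defect in the report corresponds to skipping the queue check, so that a stop with stranded data falls through to the commit branch.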



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
