[jira] [Commented] (NIFI-5997) If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
[ https://issues.apache.org/jira/browse/NIFI-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807789#comment-16807789 ]

Joseph Percivall commented on NIFI-5997:

Sounds good, thanks [~markap14]. That aligns with the testing we've done. We saw the issue with 1.8 and 1.7.1, and we were able to manually backport the fix. I'll go ahead and mark affects version for the versions that we were able to reproduce and observe the fix on (1.8 and 1.7.1).

> If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
>
>                 Key: NIFI-5997
>                 URL: https://issues.apache.org/jira/browse/NIFI-5997
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Blocker
>             Fix For: 1.9.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> If a queue writes out a Swap File but then the FlowFile Repository throws an
> Exception when attempting to update, we end up with a scenario where the size
> of the queue increases by 10,000 FlowFiles (the number of FlowFiles to be
> written to the swap file) as well as the corresponding size of the FlowFiles.
> We also have a Swap File that is written out to disk but the FlowFile Repo
> didn't get updated so on restart we have those FlowFiles in the FlowFile Repo
> as well as in the Swap File, so we end up with two of the same FlowFile. This
> can then cause some odd behavior because two FlowFiles exist with the same ID
> and the counts on the queues are very wrong, which also causes a lot of
> confusion.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (NIFI-5997) If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
[ https://issues.apache.org/jira/browse/NIFI-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807775#comment-16807775 ]

Mark Payne commented on NIFI-5997:

[~JPercivall] - looking through Git history, I am not seeing any version prior to 1.9.0 that called the #getRecoveredSwapLocations method, so I suppose it was this way for all prior releases.
[jira] [Commented] (NIFI-5997) If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
[ https://issues.apache.org/jira/browse/NIFI-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801183#comment-16801183 ]

Joseph Percivall commented on NIFI-5997:

[~markap14] do you know which versions this affects?
[jira] [Commented] (NIFI-5997) If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
[ https://issues.apache.org/jira/browse/NIFI-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761174#comment-16761174 ]

ASF subversion and git services commented on NIFI-5997:

Commit 412c4908e2c5d79d958b09403c816db57c828179 in nifi's branch refs/heads/master from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=412c490 ]

NIFI-5997: Recover FlowFile Repository before swap files; then, when recovering swap files, ignore any that are unknown to the flowfile repo. This prevents us from incrementing the size of the flowfile queue for unknown swap files.

This closes #3292.

Signed-off-by: Bryan Bende
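The recovery order described in this commit message can be illustrated with a minimal sketch. This is not NiFi's actual code: the class name, method name, and the use of plain strings for swap locations are all assumptions made for illustration. It only shows the idea that the FlowFile Repository is recovered first, and that a swap file found on disk is restored only if the repository knows about it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the NiFi implementation.
class SwapRecoverySketch {

    /**
     * Selects the swap files that may be restored on restart.
     *
     * @param swapFilesOnDisk      swap locations found on disk (assumed to be plain strings)
     * @param locationsKnownToRepo swap locations reported by the already-recovered
     *                             FlowFile Repository (hypothetical representation)
     * @return the on-disk swap locations that the repository knows about, in disk order
     */
    static List<String> selectRestorableSwapFiles(List<String> swapFilesOnDisk,
                                                  Set<String> locationsKnownToRepo) {
        List<String> restorable = new ArrayList<>();
        for (String location : swapFilesOnDisk) {
            if (locationsKnownToRepo.contains(location)) {
                restorable.add(location);
            }
            // Otherwise the swap file was written but the repository update failed,
            // so its FlowFiles are still tracked by the repository itself. Restoring
            // the file would duplicate them and inflate the queue count, so skip it.
        }
        return restorable;
    }
}
```

Called with two swap files on disk of which only one is known to the repository, the method returns just that one, so the queue size is never incremented for the orphaned file.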
[jira] [Commented] (NIFI-5997) If swap file written but FlowFile Repository fails to update, connection queue counts wrong and flowfiles are duplicated upon restart
[ https://issues.apache.org/jira/browse/NIFI-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760136#comment-16760136 ]

ASF subversion and git services commented on NIFI-5997:

Commit 83ac191736e8036f82da467ceb1940b50d9886f0 in nifi's branch refs/heads/master from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=83ac191 ]

NIFI-5997: If we swap out data, ensure that we do not increment the size of the queue by the size of the data that we failed to swap out. Also, if the FlowFile Repo does not know about a given swap file, do not restore it on restart.

This closes #3290.

Signed-off-by: Bryan Bende
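The first half of this commit message, keeping the queue size correct when a swap-out fails, could be sketched roughly as follows. Everything here is hypothetical: `SwapOutSketch`, `RepoUpdater`, and the counters are illustrative stand-ins rather than NiFi classes, and writing the swap file to disk is elided. The point is only that the swapped-out count changes after the repository update succeeds, and that a failure returns the batch to the active queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch, not the NiFi implementation.
class SwapOutSketch {

    /** Hypothetical stand-in for the FlowFile Repository update, which may fail. */
    interface RepoUpdater {
        void recordSwapOut(String swapLocation, List<String> flowFileIds) throws Exception;
    }

    private final Deque<String> active = new ArrayDeque<>();
    private int swappedCount = 0;

    void add(String flowFileId) {
        active.addLast(flowFileId);
    }

    /** Total queue size: FlowFiles held in memory plus FlowFiles swapped out. */
    int size() {
        return active.size() + swappedCount;
    }

    int swappedCount() {
        return swappedCount;
    }

    /** Tries to swap out up to batchSize FlowFiles from the tail of the queue. */
    boolean swapOut(int batchSize, String swapLocation, RepoUpdater repo) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < batchSize && !active.isEmpty()) {
            batch.add(active.pollLast());
        }
        try {
            // Writing the swap file itself is elided in this sketch.
            repo.recordSwapOut(swapLocation, batch);
        } catch (Exception e) {
            // The repository update failed: return the batch to the active queue in
            // its original order and leave the swapped count untouched.
            for (int i = batch.size() - 1; i >= 0; i--) {
                active.addLast(batch.get(i));
            }
            return false;
        }
        // Count the batch as swapped out only once the repository has recorded it.
        swappedCount += batch.size();
        return true;
    }
}
```

With four FlowFiles queued, a failed `swapOut` leaves `size()` at 4 and `swappedCount()` at 0, and a later successful `swapOut` of two FlowFiles still leaves `size()` at 4, with two counted as swapped.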