[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774055#comment-17774055 ]

Matt Burgess commented on NIFI-11789:
-------------------------------------

That is correct. Although the PR fixes your use case, I believe it would take a significant refactor to keep track of the FlowFile "groups" so that we know how many FlowFiles are in a group by the time the session is committed. FlowFiles are transferred at different points in the code within the session, but they are not sent until the session is committed. For your use case it was more straightforward, but depending on the configuration of the processor there are cases where different FlowFiles are transferred at different points and the session is committed later. After a FlowFile has been transferred it cannot be altered (i.e. the fragment.count attribute cannot be set).

If someone wants to start with my PR and has a better approach, or wants to start from scratch, I encourage them to do so. Also, if I feel like revisiting it with fresh eyes down the road, I may do that.

As a workaround / solution, if the outgoing FlowFiles have the correct number of records in them (meaning the record count is accurate for what the fragment.count attribute SHOULD be and the fragment.index attribute values are correct), you can pass the FlowFiles through a CalculateRecordStats processor and then an UpdateAttribute processor that sets "fragment.count" to the value of "record.count", and the merge should work downstream.

> ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
> -----------------------------------------------------------------------------
>
> Key: NIFI-11789
> URL: https://issues.apache.org/jira/browse/NIFI-11789
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Tamas Neumer
> Assignee: Matt Burgess
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hi,
> I am working with the ExecuteSQL processor and discovered an unexpected behavior.
> If I specify the "Output Batch Size" property, I get the fragment.index attribute on the outgoing FlowFiles, but the fragment.count attribute is not set (as the documentation states).
> The behavior I would expect (in line with how merge processors work) is that the fragment.count attribute is set only on the last FlowFile of the batch.
> This would make it possible to merge all the batches together afterward.
> So my proposal, in short, is that fragment.count should be set on the last FlowFile of a batch.
> BR Florian

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
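The workaround in the comment above (CalculateRecordStats followed by UpdateAttribute) can be pictured with a small model. This is only a sketch: FlowFiles are reduced to plain attribute dictionaries, the function name is hypothetical, and nothing here is NiFi code. It assumes, as the comment does, that "record.count" already holds the value "fragment.count" should have. In the flow itself, the UpdateAttribute step would presumably be a property named fragment.count with the Expression Language value ${record.count}.

```python
from typing import Dict, List

FlowFile = Dict[str, str]  # a FlowFile is modelled as its attribute map only


def copy_record_count_to_fragment_count(flowfiles: List[FlowFile]) -> List[FlowFile]:
    """Model of the UpdateAttribute step: fragment.count := record.count.

    Hypothetical helper for illustration; it returns new attribute maps and
    leaves the input untouched.
    """
    updated: List[FlowFile] = []
    for attributes in flowfiles:
        if "record.count" not in attributes:
            # In the described flow, CalculateRecordStats runs first and is
            # expected to have written this attribute.
            raise KeyError("record.count is missing; run the stats step first")
        new_attributes = dict(attributes)
        new_attributes["fragment.count"] = attributes["record.count"]
        updated.append(new_attributes)
    return updated


# Three fragments whose record count happens to equal the fragment total,
# which is the precondition the comment states for the workaround.
fragments = [
    {"fragment.index": "0", "record.count": "3"},
    {"fragment.index": "1", "record.count": "3"},
    {"fragment.index": "2", "record.count": "3"},
]
merge_ready = copy_record_count_to_fragment_count(fragments)
```

If the record count of a FlowFile does not match the number of fragments, the copied value would be wrong for a downstream merge, so the precondition matters.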
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773239#comment-17773239 ]

Tamas Neumer commented on NIFI-11789:
-------------------------------------

Hi [~mattyb149],
may I ask why the PR was closed? Is the solution in the PR not feature complete, and would finishing it take too much time?
Thanks!
KR, Tamas

> ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
> -----------------------------------------------------------------------------
>
> Key: NIFI-11789
> URL: https://issues.apache.org/jira/browse/NIFI-11789
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Tamas Neumer
> Assignee: Matt Burgess
> Priority: Minor
> Fix For: 1.latest, 2.latest
> Time Spent: 0.5h
> Remaining Estimate: 0h
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766879#comment-17766879 ]

Matt Burgess commented on NIFI-11789:
-------------------------------------

Doubtful. Even though the PR may work for your use case, it is not a complete solution, so I'm not sure we should offer a partial one. Please see my comments on the PR (I changed the PR to "draft" status).
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766384#comment-17766384 ]

Tamas Neumer commented on NIFI-11789:
-------------------------------------

Hi!
Can we expect this fix to be in the next release?
Thank you!
KR, Tamas
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757244#comment-17757244 ]

Tamas Neumer commented on NIFI-11789:
-------------------------------------

Hi!
I have built an image from your feature branch - it works and it solved our problem!
Thank you so much!
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754335#comment-17754335 ]

Matt Burgess commented on NIFI-11789:
-------------------------------------

Check the PR for my comments on whether this is a good idea or not. Given the complexity, it doesn't seem that "the juice is worth the squeeze", so to speak.
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754132#comment-17754132 ]

Matt Burgess commented on NIFI-11789:
-------------------------------------

I hope it will be; I'm still working on it. I think we'll have to keep track of all FlowFiles transferred in the session so we can "go back" and add the attribute before committing the session.
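The idea in the comment above, tracking every FlowFile so the count can be added before the session is committed, can be sketched with a toy model. Everything here is an assumption made for illustration: ToySession and emit_fragments are hypothetical names, FlowFiles are plain attribute dictionaries, and this is neither the NiFi session API nor the approach taken in the PR. The sketch holds the FlowFiles back until the total is known, stamps fragment.count on each, and only then transfers and commits.

```python
from typing import Dict, List

FlowFile = Dict[str, str]  # a FlowFile is modelled as its attribute map only


class ToySession:
    """Hypothetical stand-in for a processor session (not the NiFi API)."""

    def __init__(self) -> None:
        self._transferred: List[FlowFile] = []
        self.committed: List[FlowFile] = []

    def transfer(self, flowfile: FlowFile) -> None:
        # Store a copy: later changes to the caller's dict no longer reach
        # the transferred FlowFile, modelling "cannot be altered".
        self._transferred.append(dict(flowfile))

    def commit(self) -> None:
        # Only at commit do the FlowFiles become visible downstream.
        self.committed.extend(self._transferred)
        self._transferred = []


def emit_fragments(session: ToySession, rows_per_flowfile: List[int]) -> None:
    """Track every FlowFile first, then set fragment.count and transfer."""
    tracked: List[FlowFile] = [
        {"fragment.index": str(index), "record.count": str(rows)}
        for index, rows in enumerate(rows_per_flowfile)
    ]
    total = str(len(tracked))
    for flowfile in tracked:
        flowfile["fragment.count"] = total  # set before transfer, not after
        session.transfer(flowfile)
    session.commit()


session = ToySession()
emit_fragments(session, [3, 3, 2])
```

The cost of this ordering is that nothing is released until the total is known, which is at odds with committing after every batch. That tension is presumably part of why the thread describes the change as a larger refactor.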
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17753977#comment-17753977 ]

Tamas Neumer commented on NIFI-11789:
-------------------------------------

Hi,
Do you think this issue will be fixed in the next release?
BR Tamas
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750756#comment-17750756 ]

Tamas Neumer commented on NIFI-11789:
-------------------------------------

Hi Matt,
Thank you!
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750754#comment-17750754 ]

Matt Burgess commented on NIFI-11789:
-------------------------------------

Makes sense. I will look at how/if we can set fragment.count on the final FlowFile. An initial glance at the code suggests it might be tricky, as the "last flow file" can be processed at different points in the code. But if we keep a faithful count of all the FlowFiles processed, we should be able to make that work.
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750645#comment-17750645 ]

Tamas Neumer commented on NIFI-11789:
-------------------------------------

Hi Matt,
"merge all batches together" for me means the following:
Imagine that one FlowFile arrives at the ExecuteSQL processor. Let's assume there are 8 records to fetch and the batch size is set to 3. This would result in 3 batches of 3, 3 and 2 FlowFiles.
What I want to achieve is to merge all 8 FlowFiles back into one. So I would number them 1-8 and set fragment.count to 8.
BR
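The numbers in the example above can be checked with a short sketch. It assumes one record per FlowFile (so 8 records give 8 FlowFiles), follows the comment's 1-8 numbering, and puts fragment.count only on the last FlowFile, as the issue proposes. The function name and the dictionary model of a FlowFile are hypothetical and not part of NiFi.

```python
from typing import Dict, List

FlowFile = Dict[str, str]  # a FlowFile is modelled as its attribute map only


def split_into_batches(total_flowfiles: int, output_batch_size: int) -> List[List[FlowFile]]:
    """Number FlowFiles 1..N, mark the last one with fragment.count,
    and group them into batches of at most output_batch_size."""
    if total_flowfiles < 0 or output_batch_size <= 0:
        raise ValueError("total must be >= 0 and batch size must be > 0")
    flowfiles: List[FlowFile] = [
        {"fragment.index": str(number)}
        for number in range(1, total_flowfiles + 1)
    ]
    if flowfiles:
        # Only the final FlowFile carries the total, per the proposal.
        flowfiles[-1]["fragment.count"] = str(total_flowfiles)
    return [
        flowfiles[start:start + output_batch_size]
        for start in range(0, total_flowfiles, output_batch_size)
    ]


batches = split_into_batches(8, 3)
print([len(batch) for batch in batches])  # prints [3, 3, 2]
print(batches[-1][-1])  # the last FlowFile, carrying fragment.count
```

A merge step that reassembles fragments would then need to wait for the one FlowFile that carries the count before it knows that all 8 have arrived.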
[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
[ https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746735#comment-17746735 ]

Matt Burgess commented on NIFI-11789:
-------------------------------------

I'm assuming that by "merge all batches together" you mean merging all FlowFiles from a single batch together? Or are you looking to set the final count on the final FlowFile even if Output Batch Size is set?