[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-10-11 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774055#comment-17774055
 ] 

Matt Burgess commented on NIFI-11789:
-

That is correct. Although the PR fixes your use case, I believe it would take a 
significant refactor to keep track of the FlowFile "groups" so that we know how 
many FlowFiles are in a group by the time the session is committed. FlowFiles 
are transferred within the session at different points in the code, but they 
are not sent until the session is committed. Your use case was more 
straightforward, but depending on the configuration of the processor there are 
cases where different FlowFiles are transferred at different points and the 
session is committed later. After a FlowFile has been transferred it cannot be 
altered (i.e. the fragment.count attribute cannot be set).

If someone wants to start with my PR and has a better approach, or wants to 
start from scratch, I encourage them to do so. Also if I feel like revisiting 
it with fresh eyes down the road, I may do that.

As a workaround / solution, if the outgoing FlowFiles have the correct number 
of records in them (meaning the record count is accurate for what the 
fragment.count attribute SHOULD be and the fragment.index attribute values are 
correct), you can pass the FlowFiles through a CalculateRecordStats processor 
and then an UpdateAttribute processor that sets "fragment.count" to the value 
of "record.count", and the merge should work downstream.
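
The attribute copy in that workaround can be modeled in a few lines. This is only a toy sketch in Python, not NiFi code: attributes are represented as a plain dict of strings, and the function name apply_workaround is hypothetical. In UpdateAttribute itself the equivalent would presumably be a property named fragment.count with the Expression Language value ${record.count}.

```python
def apply_workaround(attributes):
    """Return a copy of the attributes with fragment.count set from record.count.

    Toy model only: attributes are a plain dict of strings, and FlowFiles
    without a record.count attribute are returned unchanged.
    """
    updated = dict(attributes)
    if "record.count" in updated:
        updated["fragment.count"] = updated["record.count"]
    return updated


# Hypothetical attribute values, chosen only for illustration.
before = {"fragment.index": "0", "record.count": "8"}
after = apply_workaround(before)
```

The copy is only useful under the condition stated above, namely that record.count really matches what fragment.count should be.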

> ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
> -
>
> Key: NIFI-11789
> URL: https://issues.apache.org/jira/browse/NIFI-11789
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Tamas Neumer
> Assignee: Matt Burgess
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hi,
> I am working with the ExecuteSQL processor and discovered an unexpected 
> behavior. If I set the "Output Batch Size" property, I get the 
> fragment.index attribute on the outgoing FlowFiles, but the fragment.count 
> attribute is not set (as the documentation states).
> The behavior I would expect (in line with how merge processors work) is that 
> the fragment.count attribute is set only on the last FlowFile of the batch. 
> This would make it possible to merge all the batches together afterward.
> So my proposal, in short, is that fragment.count should be set on the 
> last FlowFile of a batch.
> BR Florian



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-10-09 Thread Tamas Neumer (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773239#comment-17773239
 ] 

Tamas Neumer commented on NIFI-11789:
-

Hi [~mattyb149],

May I ask why the MR was closed? Is the solution in the PR not feature 
complete, and would completing it take too much time?

Thanks!
KR,

Tamas



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-09-19 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766879#comment-17766879
 ] 

Matt Burgess commented on NIFI-11789:
-

Doubtful. Even though the PR may work for your use case, it is not a complete 
solution, and I'm not sure we should offer a partial one. Please see my 
comments on the PR (I changed the PR to "draft" status).



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-09-18 Thread Tamas Neumer (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766384#comment-17766384
 ] 

Tamas Neumer commented on NIFI-11789:
-

Hi!

Can we expect this fix to be in the next release?

Thank you!
KR,
Tamas



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-22 Thread Tamas Neumer (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757244#comment-17757244
 ] 

Tamas Neumer commented on NIFI-11789:
-

Hi!

I have built an image from your feature branch - it works and it solved our 
problem!

Thank you so much!



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-14 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754335#comment-17754335
 ] 

Matt Burgess commented on NIFI-11789:
-

Check the PR for my comments on whether this is a good idea or not. Given the 
complexity, it doesn't seem that "the juice is worth the squeeze", so to speak.



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-14 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754132#comment-17754132
 ] 

Matt Burgess commented on NIFI-11789:
-

I hope it will be; I'm still working on it. I think we'll have to keep track of 
all FlowFiles transferred in the session so we can "go back" and add the 
attribute before committing the session.
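
The bookkeeping idea in this comment can be illustrated with a toy model. The sketch below is plain Python, not NiFi's ProcessSession API: the FlowFile and ToySession classes are hypothetical stand-ins, and the only thing they demonstrate is holding FlowFiles back until commit so that the total is known when the attribute is written.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a FlowFile: just a bag of string attributes."""
    attributes: dict = field(default_factory=dict)


class ToySession:
    """Hypothetical session that holds FlowFiles back until commit()."""

    def __init__(self):
        self.pending = []    # tracked, but not yet sent downstream
        self.committed = []  # sent downstream; treated as read-only here

    def transfer(self, flowfile):
        # Remember the FlowFile instead of sending it immediately.
        self.pending.append(flowfile)

    def commit(self):
        # "Go back" over every tracked FlowFile and stamp the final count
        # before anything leaves the session.
        total = str(len(self.pending))
        for flowfile in self.pending:
            flowfile.attributes["fragment.count"] = total
        self.committed.extend(self.pending)
        self.pending = []


session = ToySession()
for index in range(8):
    session.transfer(FlowFile({"fragment.index": str(index)}))
session.commit()
```

In this model nothing reaches the downstream side until commit() runs, which is why the count is available in time. A processor that commits after every batch gives up exactly that property, so the sketch shows the idea rather than a drop-in fix.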



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-14 Thread Tamas Neumer (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17753977#comment-17753977
 ] 

Tamas Neumer commented on NIFI-11789:
-

Hi,

Do you think this issue will be fixed in the next release?

BR
Tamas



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-03 Thread Tamas Neumer (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750756#comment-17750756
 ] 

Tamas Neumer commented on NIFI-11789:
-

Hi Matt,

Thank you!



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-03 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750754#comment-17750754
 ] 

Matt Burgess commented on NIFI-11789:
-

Makes sense. I will look at how/if we can set fragment.count on the final 
FlowFile. An initial glance at the code seems to indicate it might be tricky, 
as the "last FlowFile" can be processed at different points in the code. But 
if we keep a faithful count of all the FlowFiles processed, we should be able 
to make that work.



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-08-03 Thread Tamas Neumer (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750645#comment-17750645
 ] 

Tamas Neumer commented on NIFI-11789:
-

Hi Matt,

"merge all batches together" for me means the following:
Imagine that one FlowFile arrives at the ExecuteSQL processor. 
Let's assume there are 8 records to fetch, one record per FlowFile, and the 
Output Batch Size is set to 3.
This would result in 3 batches of 3, 3 and 2 FlowFiles.

What I want to achieve is to merge all 8 FlowFiles back into one.
So I would number them 1-8 and set fragment.count to 8.

BR
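
The merge described in this comment (eight FlowFiles numbered 1-8, reassembled once the count of 8 is known) can be sketched as a toy model. This is plain Python, not the behavior of any NiFi merge processor: the function name, the "content" key, and the "query-1" identifier are hypothetical, and grouping by a fragment.identifier attribute is an assumption not stated in the thread.

```python
from collections import defaultdict


def merge_complete_groups(flowfiles):
    """Merge each group whose size matches the fragment.count it carries.

    Toy model: every FlowFile is a dict of string attributes plus a
    hypothetical "content" key holding its payload.
    """
    groups = defaultdict(list)
    for attrs in flowfiles:
        groups[attrs["fragment.identifier"]].append(attrs)

    merged = {}
    for identifier, members in groups.items():
        # Any member may carry the count; take the first one that has it.
        counts = [m["fragment.count"] for m in members if "fragment.count" in m]
        if not counts or len(members) != int(counts[0]):
            continue  # count unknown or fragments missing: leave unmerged
        ordered = sorted(members, key=lambda m: int(m["fragment.index"]))
        merged[identifier] = "".join(m["content"] for m in ordered)
    return merged


# Eight single-record FlowFiles numbered 1-8; only the last carries the count.
fragments = [
    {"fragment.identifier": "query-1", "fragment.index": str(i),
     "content": f"row{i};"}
    for i in range(1, 9)
]
fragments[-1]["fragment.count"] = "8"
result = merge_complete_groups(fragments)
```

Without a fragment.count on at least one member, this model never considers the group complete, which mirrors the problem the issue reports.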



[jira] [Commented] (NIFI-11789) ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set

2023-07-24 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746735#comment-17746735
 ] 

Matt Burgess commented on NIFI-11789:
-

I'm assuming that by "merge all batches together" you mean merging all 
FlowFiles from a single batch together? Or are you looking to set the final 
count on the final FlowFile even if Output Batch Size is set?
