[
https://issues.apache.org/jira/browse/NIFI-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774055#comment-17774055
]
Matt Burgess commented on NIFI-11789:
-------------------------------------
That is correct. Although the PR fixes your use case, I believe it would take a
significant refactor to keep track of the FlowFile "groups" so that we know how
many FlowFiles are in a group by the time the session is committed. FlowFiles
are transferred at different points in the code, but they are not actually sent
until the session is committed. Your use case was more straightforward, but
depending on the processor's configuration there are cases where different
FlowFiles are transferred at different points and the session is committed
later. Once a FlowFile has been transferred it cannot be altered (i.e. the
fragment.count attribute can no longer be set).
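
To illustrate the constraint, here is a minimal Python sketch. It is a toy
model, not the NiFi API: ToySession, FlowFile, emit_in_batches and
emit_all_at_once are hypothetical names, and the zero-based fragment.index is
an assumption made for illustration. It only shows that when FlowFiles are
transferred and committed in batches, the total is not yet known when each
one is transferred, and it is too late to set it afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    attributes: dict = field(default_factory=dict)
    transferred: bool = False


class ToySession:
    """Hypothetical stand-in for a process session; not the NiFi API."""

    def __init__(self):
        self.pending = []    # transferred, waiting for the next commit
        self.committed = []  # already sent downstream

    def put_attribute(self, flowfile, key, value):
        # Mirrors the rule above: no changes after transfer.
        if flowfile.transferred:
            raise RuntimeError("FlowFile already transferred")
        flowfile.attributes[key] = value

    def transfer(self, flowfile):
        flowfile.transferred = True
        self.pending.append(flowfile)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()


def emit_in_batches(session, total, batch_size):
    """Transfer `total` fragments, committing every `batch_size` of them."""
    for index in range(total):
        ff = FlowFile()
        session.put_attribute(ff, "fragment.index", str(index))
        session.transfer(ff)  # the final count is still unknown here
        if (index + 1) % batch_size == 0:
            session.commit()
    session.commit()  # flush any remainder
    return session.committed


def emit_all_at_once(session, total):
    """Contrast case: hold everything back, so the count can be set first."""
    held = [FlowFile() for _ in range(total)]
    for index, ff in enumerate(held):
        session.put_attribute(ff, "fragment.index", str(index))
        session.put_attribute(ff, "fragment.count", str(total))
        session.transfer(ff)
    session.commit()
    return session.committed
```

In this toy model, calling put_attribute on any FlowFile returned by
emit_in_batches raises RuntimeError, which is the situation described above.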

If someone wants to start with my PR and has a better approach, or wants to
start from scratch, I encourage them to do so. Also if I feel like revisiting
it with fresh eyes down the road, I may do that.

As a workaround, if the outgoing FlowFiles contain the correct number of
records (meaning the record count matches what the fragment.count attribute
SHOULD be, and the fragment.index attribute values are correct), you can pass
the FlowFiles through a CalculateRecordStats processor and then an
UpdateAttribute processor that sets "fragment.count" to the value of
"record.count". The merge should then work downstream.
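
A minimal sketch of the UpdateAttribute step described above, written as a
property name and value. It assumes the upstream CalculateRecordStats
processor has already written a record.count attribute to each FlowFile; the
value is a NiFi Expression Language reference to that attribute:

```
fragment.count = ${record.count}
```

This is only the attribute assignment. The downstream merge processor and its
settings are not specified in this comment and are left to the flow designer.
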
> ExecuteSQL doesn't set fragment.count attribute when Output Batch Size is set
> -----------------------------------------------------------------------------
>
> Key: NIFI-11789
> URL: https://issues.apache.org/jira/browse/NIFI-11789
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Tamas Neumer
> Assignee: Matt Burgess
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hi,
> I am working with the ExecuteSQL processor and discovered an unexpected
> behavior. If I set the "Output Batch Size" property, the outgoing FlowFiles
> carry the fragment.index attribute, but the fragment.count attribute is not
> set (as the documentation states).
> The behavior I would expect (in line with how merge processors work) is that
> the fragment.count attribute is set only on the last FlowFile of the batch.
> This would make it possible to merge all the batches together afterward.
> So my proposal, in short, is that fragment.count should be set on the last
> FlowFile of a batch.
> BR Florian