[
https://issues.apache.org/jira/browse/NIFI-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Stieglitz updated NIFI-13288:
------------------------------------
Description:
Per [~markap14] in the following
[post|https://lists.apache.org/thread/7zo2px31r3377c7vhby4h6nrngdf3llf], one
should avoid calling session.putAttribute many times. To maintain object
immutability, every call to session.putAttribute has to create a new FlowFile
object (and a new HashMap of all attributes!), which can generate a huge
amount of garbage. The split processors SplitJson, SplitXml, and SplitAvro
each have a loop that creates a new flow file for every split and calls
putAttribute more than once per flow file in order to populate the split
attributes (FRAGMENT_ID, FRAGMENT_INDEX, etc.). These should be fixed to
populate the attributes in a Map and then make a single call to
session.putAllAttributes.
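To illustrate the cost difference, here is a minimal sketch. It does not use NiFi's real classes: SketchFlowFile is a hypothetical stand-in for an immutable FlowFile, and the mapCopies counter, the helper method names, and the attribute keys ("fragment.identifier", "fragment.index", "fragment.count") are assumptions made for the example. In NiFi itself the bulk call on ProcessSession is putAllAttributes(FlowFile, Map<String, String>).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for an immutable FlowFile. NOT NiFi's real class.
final class SketchFlowFile {
    // Counts attribute-map copies made by updates (illustration only).
    static int mapCopies = 0;

    private final Map<String, String> attributes;

    SketchFlowFile(Map<String, String> attributes) {
        this.attributes = Collections.unmodifiableMap(attributes);
    }

    Map<String, String> getAttributes() {
        return attributes;
    }

    // Models a single-attribute update: copies the whole map and returns a new object.
    SketchFlowFile putAttribute(String key, String value) {
        Map<String, String> copy = new HashMap<>(attributes);
        mapCopies++;
        copy.put(key, value);
        return new SketchFlowFile(copy);
    }

    // Models a bulk update: one copy regardless of how many attributes are added.
    SketchFlowFile putAllAttributes(Map<String, String> updates) {
        Map<String, String> copy = new HashMap<>(attributes);
        mapCopies++;
        copy.putAll(updates);
        return new SketchFlowFile(copy);
    }
}

final class AttributeBatchingSketch {
    // Pattern the issue wants to avoid: one update call per split attribute.
    static SketchFlowFile oneCallPerAttribute(SketchFlowFile split, String fragmentId, int index, int count) {
        split = split.putAttribute("fragment.identifier", fragmentId);
        split = split.putAttribute("fragment.index", Integer.toString(index));
        split = split.putAttribute("fragment.count", Integer.toString(count));
        return split;
    }

    // Pattern the issue proposes: collect the attributes in a Map, then update once.
    static SketchFlowFile singleBulkCall(SketchFlowFile split, String fragmentId, int index, int count) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("fragment.identifier", fragmentId);
        attrs.put("fragment.index", Integer.toString(index));
        attrs.put("fragment.count", Integer.toString(count));
        return split.putAllAttributes(attrs);
    }

    public static void main(String[] args) {
        SketchFlowFile base = new SketchFlowFile(new HashMap<>());

        int start = SketchFlowFile.mapCopies;
        oneCallPerAttribute(base, "example-id", 0, 2);
        System.out.println("map copies, one call per attribute: " + (SketchFlowFile.mapCopies - start));

        start = SketchFlowFile.mapCopies;
        singleBulkCall(base, "example-id", 0, 2);
        System.out.println("map copies, single bulk call: " + (SketchFlowFile.mapCopies - start));
    }
}
```

Both helpers produce the same attributes on the resulting object. The only difference in this model is how many intermediate maps and objects are allocated per split, and that difference is multiplied by the number of splits in the loop.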
was:
Per [~markap14] in the following
[post|https://lists.apache.org/thread/7zo2px31r3377c7vhby4h6nrngdf3llf] one
should avoid calling session.putAttribute many times since in order to
maintain object immutability it has to create a new FlowFile object (and a new
HashMap of all attributes!)
for every call to putAttribute which leads to potentially a huge amount of
garbage getting created. Per this advice some of the split processors
SplitJson, SplitXml, and SplitAvro all have loops to create a new flow file for
each split and it calls putAttribute more than once (in order to populate the
split attributes FRAGMENT_ID, FRAGMENT_INDEX etc) for each flow file created.
These should be fixed to to populate the attributes in a Map and then make one
call to session.putAttributes.
> Fix SplitJson, SplitXml, and SplitAvro processor not to call
> session.putAttribute multiple times
> ------------------------------------------------------------------------------------------------
>
> Key: NIFI-13288
> URL: https://issues.apache.org/jira/browse/NIFI-13288
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Daniel Stieglitz
> Assignee: Daniel Stieglitz
> Priority: Major