[
https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379263#comment-16379263
]
ASF GitHub Bot commented on NIFI-4872:
--------------------------------------
Github user markap14 commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2475#discussion_r171065023
--- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitText.java ---
@@ -87,6 +89,7 @@
@WritesAttribute(attribute = "fragment.count", description = "The number of split FlowFiles generated from the parent FlowFile"),
@WritesAttribute(attribute = "segment.original.filename ", description = "The filename of the parent FlowFile")})
@SeeAlso(MergeContent.class)
+@SystemResourceConsideration(resource = SystemResource.MEMORY)
--- End diff --
I would again add a description here indicating that the processor does not
buffer the content in memory but only holds each FlowFile with its attributes
in memory, and that a two-phase approach may be necessary if too many splits
are generated.
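The suggested description could look roughly like the following. This is a minimal sketch, not the merged change: the enum and annotation are local stand-ins declared so the snippet compiles without nifi-api on the classpath, the `description` element name is an assumption based on the issue text, and `SplitTextLike` and the wording of the description are illustrative only.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class SplitTextDescriptionSketch {

    // Local stand-in for the resource list named in the issue (not the NiFi type).
    public enum SystemResource { CPU, DISK, MEMORY, NETWORK }

    // Local stand-in for the annotation under review; the "description"
    // element is assumed from the issue text.
    @Retention(RetentionPolicy.RUNTIME)
    public @interface SystemResourceConsideration {
        SystemResource resource();
        String description() default "";
    }

    // Hypothetical processor class carrying the kind of description the
    // review comment asks for.
    @SystemResourceConsideration(
        resource = SystemResource.MEMORY,
        description = "The FlowFile and its attributes are held in memory, not the FlowFile content. "
            + "If a very large number of splits is generated, a two-phase approach may be necessary.")
    public static class SplitTextLike { }

    // Reads the description back through reflection.
    public static String memoryNote() {
        return SplitTextLike.class
            .getAnnotation(SystemResourceConsideration.class)
            .description();
    }

    public static void main(String[] args) {
        System.out.println(memoryNote());
    }
}
```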
> NIFI component high resource usage annotation
> ---------------------------------------------
>
> Key: NIFI-4872
> URL: https://issues.apache.org/jira/browse/NIFI-4872
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Core Framework, Core UI
> Affects Versions: 1.5.0
> Reporter: Jeff Storck
> Assignee: Jeff Storck
> Priority: Critical
>
> NiFi Processors currently have no means to relay whether they may be
> resource intensive. The idea here is to introduce an Annotation that can be
> added to Processors to indicate that they may cause high memory, disk, CPU,
> or network usage. For instance, any Processor that reads the FlowFile
> contents into memory (like many XML Processors) may cause high memory usage.
> Whether there actually is high memory/disk/CPU/network usage ultimately
> depends on the FlowFiles being processed. Having many of these components in
> a dataflow increases the risk of OutOfMemoryErrors and performance
> degradation.
> The annotation should support one value from a fixed list of: CPU, Disk,
> Memory, Network. It should also allow the developer to provide a custom
> description of the scenario in which the component would fall under the high
> usage category. The annotation should be repeatable, so it can be specified
> once for each resource for which the component may be high usage.
> By marking components with this new Annotation, we can update the generated
> Processor documentation to include this fact.
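The annotation described in the issue could be sketched as a repeatable Java annotation with an enum for the fixed resource list. This is only an illustration under stated assumptions: `ResourceHint`, `ResourceHints`, `Resource`, `ExampleProcessor`, and `describe` are hypothetical names chosen for this sketch, not the NiFi API, and the fallback text used when no description is given is invented here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class ResourceHintSketch {

    // Fixed list of resources named in the issue description.
    public enum Resource { CPU, DISK, MEMORY, NETWORK }

    // Hypothetical annotation: one resource per use, plus an optional
    // free-text description of the high-usage scenario.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(ResourceHints.class)
    public @interface ResourceHint {
        Resource resource();
        String description() default "";
    }

    // Container annotation required by @Repeatable.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface ResourceHints {
        ResourceHint[] value();
    }

    // Hypothetical component marked once per resource it may stress.
    @ResourceHint(resource = Resource.MEMORY,
                  description = "Reads the entire FlowFile content into memory.")
    @ResourceHint(resource = Resource.CPU)
    public static class ExampleProcessor { }

    // Builds one documentation line per annotation found on the class,
    // which is roughly what a documentation generator would need.
    public static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (ResourceHint hint : type.getAnnotationsByType(ResourceHint.class)) {
            String text = hint.description().isEmpty()
                ? "may use this resource heavily"
                : hint.description();
            lines.add(hint.resource() + ": " + text);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(ExampleProcessor.class).forEach(System.out::println);
    }
}
```

Using `getAnnotationsByType` lets the caller ignore whether the annotation was applied once or several times, since it looks through the container annotation as well.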
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)