[
https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379269#comment-16379269
]
ASF GitHub Bot commented on NIFI-4872:
--------------------------------------
Github user markap14 commented on the issue:
https://github.com/apache/nifi/pull/2475
@jtstorck I should probably have read through all the comments before
adding my own :) Sorry about that. I did notice, though, that you have resources
for "DISK" and "NETWORK" but they are not used anywhere. I would imagine that
any processor that changes the content of the FlowFile would get a "DISK" one,
which is a very large number of them. And perhaps even processors that read the
content? I wonder if that's actually necessary. Since the Processor shows how
much data is being read/written in the 5 minute stats, I wonder if we could
just drop that? Similarly, I think that NETWORK utilization can be
inferred in most cases: any processor that interacts with an external service
is likely to have high network utilization, but I'm not sure it makes sense to
label every single one of those. I would recommend that we either remove those or
add javadocs explaining exactly when we recommend using those annotations, if we
are not going to use them for each processor that touches flowfile content /
network.
> NIFI component high resource usage annotation
> ---------------------------------------------
>
> Key: NIFI-4872
> URL: https://issues.apache.org/jira/browse/NIFI-4872
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Core Framework, Core UI
> Affects Versions: 1.5.0
> Reporter: Jeff Storck
> Assignee: Jeff Storck
> Priority: Critical
>
> NiFi Processors currently have no means to relay whether they may be
> resource intensive. The idea here would be to introduce an
> Annotation that can be added to Processors to indicate that they may cause
> high memory, disk, CPU, or network usage. For instance, any Processor that
> reads the FlowFile contents into memory (like many XML Processors) may
> cause high memory usage. Whether there is actually high
> memory/disk/CPU/network usage will depend on the FlowFiles being processed.
> With many of these components in the dataflow, the risk of
> OutOfMemoryErrors and performance degradation increases.
> The annotation should support one value from a fixed list of: CPU, Disk,
> Memory, Network. It should also allow the developer to provide a custom
> description of the scenario in which the component would fall under the high
> usage category. The annotation should be able to be specified multiple
> times, once for each resource for which the component has the potential for
> high usage.
> By marking components with this new Annotation, we can update the generated
> Processor documentation to include this fact.
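The annotation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `HighResourceUsage`, `HighResourceUsages`, `Resource`, `XmlSketchProcessor`, `PlainProcessor` and `describe` are hypothetical, chosen for this example, and are not taken from the issue or from NiFi's API. It shows a repeatable annotation carrying one resource from a fixed list plus a free-text description, and one way a documentation generator might read it back via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

class ResourceAnnotationSketch {

    // Fixed list of resources named in the issue description.
    enum Resource { CPU, DISK, MEMORY, NETWORK }

    // Hypothetical annotation: exactly one resource plus an optional description.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(HighResourceUsages.class)
    @interface HighResourceUsage {
        Resource resource();
        String description() default "";
    }

    // Container type that Java requires for any @Repeatable annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface HighResourceUsages {
        HighResourceUsage[] value();
    }

    // Hypothetical component marked once per resource it may use heavily.
    @HighResourceUsage(resource = Resource.MEMORY,
            description = "Reads the entire FlowFile content into memory.")
    @HighResourceUsage(resource = Resource.CPU,
            description = "Parses the whole document on every invocation.")
    static class XmlSketchProcessor { }

    // Hypothetical component without any annotation.
    static class PlainProcessor { }

    // Collects one "RESOURCE: description" line per annotation on the class,
    // as a stand-in for what generated documentation might include.
    static List<String> describe(Class<?> componentClass) {
        List<String> lines = new ArrayList<>();
        for (HighResourceUsage usage
                : componentClass.getAnnotationsByType(HighResourceUsage.class)) {
            lines.add(usage.resource() + ": " + usage.description());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(XmlSketchProcessor.class).forEach(System.out::println);
    }
}
```

Because the annotation is retained at runtime, the check is a plain reflection lookup; a class with no annotation simply yields an empty list.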