[ 
https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379269#comment-16379269
 ] 

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2475
  
    @jtstorck I should probably have read through all the comments before 
adding my own :) Sorry about that. I did notice, though, that you have resources 
for "DISK" and "NETWORK" but they are not used anywhere. I would imagine that 
any processor that changes the content of the FlowFile would get a "DISK" one, 
which is a very large number of them. And perhaps even processors that read the 
content? I wonder if that's actually necessary. Since the Processor already 
shows how much data is being read/written in the 5-minute stats, I wonder if we 
could just drop that? Similarly, I think that NETWORK utilization can be 
inferred in most cases: any processor that interacts with an external service 
is likely to have high network utilization, but I'm not sure it makes sense to 
label every single one of those. I would recommend that we either remove those 
two resources or add javadocs explaining exactly when we recommend using those 
annotations, if we are not going to use them for each processor that touches 
FlowFile content or the network.


> NIFI component high resource usage annotation
> ---------------------------------------------
>
>                 Key: NIFI-4872
>                 URL: https://issues.apache.org/jira/browse/NIFI-4872
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework, Core UI
>    Affects Versions: 1.5.0
>            Reporter: Jeff Storck
>            Assignee: Jeff Storck
>            Priority: Critical
>
> NiFi Processors currently have no means to relay whether or not they may be 
> resource intensive. The idea here would be to introduce an Annotation that 
> can be added to Processors to indicate that they may cause high memory, disk, 
> CPU, or network usage. For instance, any Processor that reads the FlowFile 
> contents into memory (like many XML Processors) may cause high memory usage. 
> Whether memory/disk/CPU/network usage is actually high will ultimately depend 
> on the FlowFiles being processed. Having many of these components in a 
> dataflow increases the risk of OutOfMemoryErrors and performance degradation.
> The annotation should support one value from a fixed list of: CPU, Disk, 
> Memory, Network. It should also allow the developer to provide a custom 
> description of the scenario in which the component would fall under the high 
> usage category. The annotation should be able to be specified multiple 
> times, once for each resource for which the component has the potential to 
> be high usage.
> By marking components with this new Annotation, we can update the generated 
> Processor documentation to include this fact.
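
For illustration, the annotation described above (one value from a fixed
list, an optional custom description, repeatable per component) could be
sketched roughly as follows. This is only a sketch under assumptions: the
names HighResourceUsage, HighResourceUsages, Resource, ExampleXmlProcessor
and describe are hypothetical and are not taken from the pull request, and
the describe helper merely stands in for whatever the documentation
generator would do with the annotations.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class ResourceAnnotationSketch {

    // Fixed list of resources from the issue description (hypothetical enum name).
    public enum Resource { CPU, DISK, MEMORY, NETWORK }

    // Hypothetical annotation: one resource plus an optional custom description.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(HighResourceUsages.class)
    public @interface HighResourceUsage {
        Resource resource();
        String description() default "";
    }

    // Container annotation required by @Repeatable so the annotation
    // can be specified multiple times on the same component.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface HighResourceUsages {
        HighResourceUsage[] value();
    }

    // Hypothetical component marked once per resource it may use heavily.
    @HighResourceUsage(resource = Resource.MEMORY,
            description = "Reads the entire FlowFile content into memory.")
    @HighResourceUsage(resource = Resource.CPU,
            description = "Parsing large documents may be CPU intensive.")
    public static class ExampleXmlProcessor { }

    // Collects one line per annotation, as a documentation generator might.
    public static List<String> describe(Class<?> componentType) {
        List<String> lines = new ArrayList<>();
        for (HighResourceUsage usage
                : componentType.getAnnotationsByType(HighResourceUsage.class)) {
            lines.add(usage.resource() + ": " + usage.description());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(ExampleXmlProcessor.class).forEach(System.out::println);
    }
}
```

Using a repeatable annotation with a single-valued resource member, rather
than one annotation taking an array of resources, lets each resource carry
its own description, which matches the requirement in the description above.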



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
