[ 
https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370393#comment-16370393
 ] 

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user joewitt commented on the issue:

    https://github.com/apache/nifi/pull/2475
  
    I think my only concern is that as-is we're labeling a bunch of things as 
"CPU" or "MEMORY" but not giving descriptions.  As a user I'd see that and 
think 'well, how does this use memory?'  For instance, does that mean each 
flowfile's content is fully loaded in memory?  Or does it mean part of one is?  
Or all of a batch of them?  Or if we say CPU usage for compression, how should I 
think about number of threads?  Or in the case of compress content it might be 
worth adding "MEMORY" and explaining that it is actually really efficient and 
can handle large objects without ever loading much in memory.  So in that case 
the resource consideration is to alleviate concerns.  We're not qualifying the 
usage consideration as good or bad in this approach.  We're merely saying "Hey, 
here is a resource usage consideration you should or might have in mind and here 
is how this component works in that regard".  Does this make sense?  So, in that 
sense I'd like to see us add descriptions to all these things we're tagging.  Not 
saying it is a must for the PR but adding "MEMORY" without explaining might 
just be alarming.


> NIFI component high resource usage annotation
> ---------------------------------------------
>
>                 Key: NIFI-4872
>                 URL: https://issues.apache.org/jira/browse/NIFI-4872
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework, Core UI
>    Affects Versions: 1.5.0
>            Reporter: Jeff Storck
>            Assignee: Jeff Storck
>            Priority: Critical
>
> NiFi Processors currently have no means to relay whether or not they may 
> be resource intensive. The idea here would be to introduce an 
> Annotation that can be added to Processors to indicate that they may cause 
> high memory, disk, CPU, or network usage. For instance, any Processor that 
> reads the FlowFile contents into memory (like many XML Processors) may 
> cause high memory usage. Whether there is actually high 
> memory/disk/CPU/network usage will depend on the FlowFiles being processed. 
> With many of these components in the dataflow, the risk of 
> OutOfMemoryErrors and performance degradation increases.
> The annotation should support one value from a fixed list of: CPU, Disk, 
> Memory, Network.  It should also allow the developer to provide a custom 
> description of the scenario in which the component would fall under the high 
> usage category.  The annotation should be able to be specified multiple 
> times, once for each resource for which the component may be high usage.
> By marking components with this new Annotation, we can update the generated 
> Processor documentation to include this fact.
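
As a rough illustration of the annotation described above, here is a minimal Java sketch. Every name in it (HighResourceUsage, HighResourceUsages, SystemResource, ExampleXmlProcessor, describe) is hypothetical and chosen only for this example; it is not the API from the PR. It assumes a repeatable class-level annotation that takes one resource from a fixed list plus an optional description, and a small helper that a documentation generator might use to list the considerations.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these types are assumptions, not NiFi classes.
final class ResourceUsageSketch {

    /** The fixed list of resources named in the issue description. */
    enum SystemResource { CPU, DISK, MEMORY, NETWORK }

    /** One resource per annotation, with an optional free-text description. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(HighResourceUsages.class)
    @interface HighResourceUsage {
        SystemResource resource();
        String description() default "";
    }

    /** Container type that Java requires for a repeatable annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface HighResourceUsages {
        HighResourceUsage[] value();
    }

    /** Stand-in for a processor; a real one would extend a NiFi base class. */
    @HighResourceUsage(resource = SystemResource.MEMORY,
            description = "Reads the full FlowFile content into memory.")
    @HighResourceUsage(resource = SystemResource.CPU)
    static class ExampleXmlProcessor { }

    /** Collects one line per declared consideration, in declaration order. */
    static List<String> describe(Class<?> componentClass) {
        List<String> lines = new ArrayList<>();
        for (HighResourceUsage usage
                : componentClass.getAnnotationsByType(HighResourceUsage.class)) {
            String text = usage.description().isEmpty()
                    ? "(no description provided)"
                    : usage.description();
            lines.add(usage.resource() + ": " + text);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(ExampleXmlProcessor.class).forEach(System.out::println);
    }
}
```

In this sketch the CPU entry has no description, so the helper prints a placeholder for it. That is the case the review comment warns about: a bare tag tells the user nothing about how the component actually uses the resource.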



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
