[ 
https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379290#comment-16379290
 ] 

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2475
  
    Thinking about this a little more, I think that the DISK resource makes a 
lot of sense to have, but I think we should document when to use it: namely, 
it should be used if the Processor would use the disk in a way that may not be 
intuitive. For example, ConvertRecord perhaps does not need it, given that it 
reads the records once and writes them once, which is what would be expected 
when converting from one format to another.
    
    However, QueryRecord is a great example of where this annotation would make 
sense. This is because QueryRecord will read the data up to N times, 
where N is the number of SQL statements supplied. DetectMimeType is also an 
interesting example, because I would expect it to read through all of the 
FlowFile content, but in some cases it is able to read only a few bytes, I 
believe, to determine the content's mime type.
    
    Perhaps we should treat the NETWORK one the same way? Or potentially drop 
it? I don't know of any cases off the top of my head that would use the network 
in any unexpected way.


> NIFI component high resource usage annotation
> ---------------------------------------------
>
>                 Key: NIFI-4872
>                 URL: https://issues.apache.org/jira/browse/NIFI-4872
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework, Core UI
>    Affects Versions: 1.5.0
>            Reporter: Jeff Storck
>            Assignee: Jeff Storck
>            Priority: Critical
>
> NiFi Processors currently have no means to relay whether or not they may 
> be resource intensive. The idea here would be to introduce an 
> Annotation that can be added to Processors to indicate that they may cause 
> high memory, disk, CPU, or network usage. For instance, any Processor that 
> reads the FlowFile contents into memory (like many XML Processors) may 
> cause high memory usage. What ultimately determines if there is high 
> memory/disk/cpu/network usage will depend on the FlowFiles being processed. 
> Having many of these components in the dataflow increases the risk of 
> OutOfMemoryErrors and performance degradation.
> The annotation should support one value from a fixed list of: CPU, Disk, 
> Memory, Network.  It should also allow the developer to provide a custom 
> description of the scenario in which the component would fall under the high 
> usage category.  The annotation should be able to be specified multiple 
> times, once for each resource for which the component has the potential for 
> high usage.
> By marking components with this new Annotation, we can update the generated 
> Processor documentation to include this fact.
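
The requirements in the description above (one resource from a fixed list, a
custom description, and the ability to repeat the annotation) can be sketched
in plain Java roughly as follows. This is only an illustration: the names
HighResourceUsage, HighResourceUsages, Resource, ExampleProcessor and describe
are hypothetical and are not taken from the NiFi codebase or from the pull
request.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ResourceAnnotationSketch {

    // Fixed list of resources named in the issue description.
    public enum Resource { CPU, DISK, MEMORY, NETWORK }

    // Container type that Java requires for a repeatable annotation (hypothetical name).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface HighResourceUsages {
        HighResourceUsage[] value();
    }

    // Hypothetical annotation: one resource plus a free-form description of the scenario.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(HighResourceUsages.class)
    public @interface HighResourceUsage {
        Resource resource();
        String description() default "";
    }

    // Stand-in for a Processor; a real one would extend a NiFi base class.
    @HighResourceUsage(resource = Resource.DISK,
            description = "Content may be read once per configured SQL statement.")
    @HighResourceUsage(resource = Resource.MEMORY,
            description = "FlowFile content may be loaded into memory.")
    public static class ExampleProcessor { }

    // Collects the annotations reflectively, as a documentation generator might.
    public static String describe(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        for (HighResourceUsage usage : type.getAnnotationsByType(HighResourceUsage.class)) {
            sb.append(usage.resource()).append(": ")
              .append(usage.description()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(ExampleProcessor.class));
    }
}
```

Runtime retention is assumed here so that a documentation generator could read
the annotations reflectively; the issue itself does not specify how the
generated Processor documentation would obtain them.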



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
