[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386725#comment-16386725 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2475

    @jtstorck this looks good to me. There was a simple merge conflict in the import statements, but I was able to address that. Otherwise, I think this is all a great step forward. I do agree that we will likely need more PRs later to further enrich the existing processors, but this lays the groundwork for it all, so it makes sense to merge it in as-is. So +1, merged to master. Thanks for getting this knocked out!

> NIFI component high resource usage annotation
> ---------------------------------------------
>
>                 Key: NIFI-4872
>                 URL: https://issues.apache.org/jira/browse/NIFI-4872
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework, Core UI
>    Affects Versions: 1.5.0
>            Reporter: Jeff Storck
>            Assignee: Jeff Storck
>            Priority: Critical
>             Fix For: 1.6.0
>
>
> NiFi Processors currently have no means to relay whether or not they may
> be resource intensive. The idea here would be to introduce an Annotation
> that can be added to Processors to indicate that they may cause high
> memory, disk, CPU, or network usage. For instance, any Processor that reads
> the FlowFile contents into memory (like many XML Processors) may cause high
> memory usage. Whether there actually is high memory/disk/CPU/network usage
> ultimately depends on the FlowFiles being processed. With many of these
> components in the dataflow, the risk of OutOfMemoryErrors and performance
> degradation increases.
> The annotation should support one value from a fixed list of: CPU, Disk,
> Memory, Network. It should also allow the developer to provide a custom
> description of the scenario in which the component would fall under the
> high usage category. The annotation should be able to be specified multiple
> times, once for each resource the component has the potential to use
> heavily.
> By marking components with this new Annotation, we can update the generated
> Processor documentation to include this fact.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
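The requirements in the issue description (one value from a fixed list, an optional free-text description, and the ability to repeat the annotation) map fairly directly onto a Java repeatable annotation. What follows is only a minimal sketch under stated assumptions, not the code from the pull request: the names SystemResourceConsideration, SystemResource and resource() appear in the diffs quoted in this thread, while the description() element with an empty default, the container annotation SystemResourceConsiderations, the ExampleProcessor class and the describe() helper are assumptions made for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ResourceAnnotationSketch {

    /** The fixed list of resources named in the issue description. */
    enum SystemResource { CPU, DISK, MEMORY, NETWORK }

    /** One consideration per resource; @Repeatable lets it appear several times on a class. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(SystemResourceConsiderations.class)
    @interface SystemResourceConsideration {
        SystemResource resource();
        // Assumption: the description is optional and defaults to an empty string.
        String description() default "";
    }

    /** Container annotation required by @Repeatable (name is an assumption). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface SystemResourceConsiderations {
        SystemResourceConsideration[] value();
    }

    /** Hypothetical component used only to exercise the annotation. */
    @SystemResourceConsideration(resource = SystemResource.MEMORY,
            description = "Buffers the whole FlowFile content in heap while parsing.")
    @SystemResourceConsideration(resource = SystemResource.CPU)
    static class ExampleProcessor { }

    /** Returns one "RESOURCE: description" line per annotation found on the class. */
    static String describe(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        for (SystemResourceConsideration c : type.getAnnotationsByType(SystemResourceConsideration.class)) {
            String text = c.description().isEmpty() ? "(no description)" : c.description();
            sb.append(c.resource().name()).append(": ").append(text).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(ExampleProcessor.class));
    }
}
```

Class.getAnnotationsByType looks through the container annotation, so the same lookup works whether a class carries one consideration or several.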
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386721#comment-16386721 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2475
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379290#comment-16379290 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2475

    Thinking about this a little more, I think that the DISK resource makes a lot of sense to have, but we should document when to use it: it should be used if the Processor would use the disk in a way that may not be intuitive. For example, ConvertRecord perhaps does not need it, given that it reads the records once and writes them once, which is what would be expected for converting from one format to another. However, QueryRecord is a great example of where this annotation would make sense, because QueryRecord will read the data up to N times, where N is the number of SQL statements supplied. DetectMimeType is also an interesting example, because I would expect it to read through all of the FlowFile content, but in some cases it is able to read only a few bytes, I believe, to determine the content's MIME type.

    Perhaps we should treat the NETWORK one the same way? Or potentially drop it? I don't know of any cases off the top of my head that would use the network in any unexpected way.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379269#comment-16379269 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2475

    @jtstorck I should probably have read through all the comments before adding my own :) Sorry about that. I did notice, though, that you have resources for "DISK" and "NETWORK" but they are not used anywhere. I would imagine that any processor that changes the content of the FlowFile would get a "DISK" one, which is a very large number of them. And perhaps even processors that read the content? I wonder if that's actually necessary. Since the Processor shows how much data is being read/written in the 5-minute stats, I wonder if we could just drop that? Similarly, I think that NETWORK utilization can be inferred in most cases: any processor that interacts with an external service is likely to have high network utilization. But I'm not sure it makes sense to label every single one of those. I would recommend that we either remove those or add javadocs explaining exactly when we recommend using those annotations, if we are not going to use them for each processor that touches FlowFile content / the network.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379266#comment-16379266 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2475#discussion_r171065265

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitXml.java ---
    @@ -82,6 +84,7 @@
                     description = "The number of split FlowFiles generated from the parent FlowFile"),
             @WritesAttribute(attribute = "segment.original.filename ", description = "The filename of the parent FlowFile")
     })
    +@SystemResourceConsideration(resource = SystemResource.MEMORY)
    --- End diff --

    In this particular context, we are buffering the entirety of the FlowFile's content (as a Document object, which can take approximately 10 times as much heap as the size of the XML - i.e., a 1 MB XML document may take 10 MB of heap), in addition to all of the generated FlowFile objects. A two-stage approach may well be necessary for lots of splits, but even then if the XML is large you could potentially run out of heap space.
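The rule of thumb in the SplitXml comment (a parsed Document taking roughly 10 times the size of the XML) can be turned into a quick back-of-the-envelope check. This is a rough sketch only: the factor of 10 is the reviewer's approximation rather than a measured constant, and the class and method names are hypothetical.

```java
public class DomHeapEstimate {

    // Approximate expansion factor quoted in the review comment; an assumption, not a measurement.
    static final long DOM_EXPANSION_FACTOR = 10L;

    /** Estimated heap needed to hold the parsed Document for an XML payload of the given size. */
    static long estimateHeapBytes(long xmlBytes) {
        return Math.multiplyExact(xmlBytes, DOM_EXPANSION_FACTOR);
    }

    /** True if the estimate fits within the given heap budget. */
    static boolean likelyFits(long xmlBytes, long heapBudgetBytes) {
        return estimateHeapBytes(xmlBytes) <= heapBudgetBytes;
    }

    public static void main(String[] args) {
        long oneMegabyte = 1024L * 1024L;
        System.out.println("1 MB of XML -> about " + estimateHeapBytes(oneMegabyte) / oneMegabyte + " MB of heap");
        // Compare a 200 MB document against the JVM's configured maximum heap.
        System.out.println("200 MB document likely fits: "
                + likelyFits(200 * oneMegabyte, Runtime.getRuntime().maxMemory()));
    }
}
```

The estimate covers only the parsed document; the generated FlowFile objects mentioned in the comment come on top of it.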
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379263#comment-16379263 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2475#discussion_r171065023

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitText.java ---
    @@ -87,6 +89,7 @@
         @WritesAttribute(attribute = "fragment.count", description = "The number of split FlowFiles generated from the parent FlowFile"),
         @WritesAttribute(attribute = "segment.original.filename ", description = "The filename of the parent FlowFile")})
     @SeeAlso(MergeContent.class)
    +@SystemResourceConsideration(resource = SystemResource.MEMORY)
    --- End diff --

    I would again add a description here that indicates that it's not buffering the content in memory but rather just storing the FlowFile w/ its attributes in memory and that if generating too many splits, a two-phase approach may be necessary.
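The two-phase approach mentioned in the SplitText comment can be illustrated with a toy model. It assumes that a single split step keeps one FlowFile object (with its attributes) in heap for every output it produces until that step finishes; that is a simplification for illustration, and nothing here uses the NiFi API. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseSplitSketch {

    /** Splits items into consecutive chunks of at most chunkSize elements. */
    static <T> List<List<T>> split(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    /** One step, one line per output: every output is held at the same time. */
    static int peakSinglePhase(List<String> lines) {
        return split(lines, 1).size();
    }

    /** Two steps: coarse chunks first, then each chunk is split on its own. */
    static int peakTwoPhase(List<String> lines, int firstPhaseChunkSize) {
        List<List<String>> coarse = split(lines, firstPhaseChunkSize);
        int peak = coarse.size();                           // outputs held by the first step
        for (List<String> chunk : coarse) {
            peak = Math.max(peak, split(chunk, 1).size());  // outputs held by one second step
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            lines.add("line " + i);
        }
        System.out.println("single phase peak: " + peakSinglePhase(lines));   // prints 10000
        System.out.println("two phase peak:    " + peakTwoPhase(lines, 100)); // prints 100
    }
}
```

In this model the peak drops from the total number of splits to roughly the larger of the chunk count and the chunk size.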
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379261#comment-16379261 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2475#discussion_r171064833

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitJson.java ---
    @@ -80,7 +82,7 @@
                     description = "The number of split FlowFiles generated from the parent FlowFile"),
             @WritesAttribute(attribute = "segment.original.filename ", description = "The filename of the parent FlowFile")
     })
    -
    +@SystemResourceConsideration(resource = SystemResource.MEMORY)
    --- End diff --

    In this particular context, we are buffering the entirety of the FlowFile's content (as a JsonNode object), in addition to all of the generated FlowFile objects. A two-stage approach may well be necessary for lots of splits, but even then if the JSON is extremely large you could potentially run out of heap space.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379259#comment-16379259 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2475#discussion_r171064496

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitContent.java ---
    @@ -75,6 +77,7 @@
         @WritesAttribute(attribute = "fragment.count", description = "The number of split FlowFiles generated from the parent FlowFile"),
         @WritesAttribute(attribute = "segment.original.filename ", description = "The filename of the parent FlowFile")})
     @SeeAlso(MergeContent.class)
    +@SystemResourceConsideration(resource = SystemResource.MEMORY)
    --- End diff --

    I would again add a description here that indicates that it's not buffering the content in memory but rather just storing the FlowFile w/ its attributes in memory and that if generating too many splits, a two-phase approach may be necessary.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379256#comment-16379256 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2475#discussion_r171064049

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/MergeContent.java ---
    @@ -131,6 +133,7 @@
         @WritesAttribute(attribute = "merge.bin.age", description = "The age of the bin, in milliseconds, when it was merged and output. Effectively "
             + "this is the greatest amount of time that any FlowFile in this bundle remained waiting in this processor before it was output") })
     @SeeAlso({SegmentContent.class, MergeRecord.class})
    +@SystemResourceConsideration(resource = SystemResource.MEMORY)
    --- End diff --

    It would probably be helpful here to add a description that explains that the content itself is not stored in memory but rather the FlowFiles' attributes and that the configuration for max bin size, etc. will influence how much heap is used. Would also call out that if merging together many small FlowFiles, a two-stage approach may be necessary in order to avoid running out of memory.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370566#comment-16370566 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user joewitt commented on the issue:

    https://github.com/apache/nifi/pull/2475

    @jtstorck I'm supportive of it as is. Not in a position right now to test/build. Can try later/tomorrow if someone else hasn't. I noticed the travis-ci issues, but they appear unrelated.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370407#comment-16370407 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user jtstorck commented on the issue:

    https://github.com/apache/nifi/pull/2475

    I agree, @joewitt. I wanted to get the annotation and its integration in place, tag the components that need the annotation, and sort through any issues or changes to the annotation itself before diving too deeply into writing specific descriptions. The annotation supports a description, but a wall of text might not be the best way to convey a system resource consideration. It might be a good time to look into supporting some formatting of the content in the annotation's description (including Reads/WritesAttribute).
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370393#comment-16370393 ]

ASF GitHub Bot commented on NIFI-4872:
--------------------------------------

Github user joewitt commented on the issue:

    https://github.com/apache/nifi/pull/2475

    I think my only concern is that as-is we're labeling a bunch of things as "CPU" or "MEMORY" but not giving descriptions. As a user I'd see that and think, "well, how does this use memory?" For instance, does that mean each FlowFile's content is fully loaded in memory? Or does it mean part of one is? Or all of a batch of them? Or if we say CPU usage for compression, how should I think about the number of threads?

    Or in the case of CompressContent it might be worth adding "MEMORY" and explaining that it is actually really efficient and can handle large objects without ever loading much in memory. So in that case the resource consideration is there to alleviate concerns. We're not qualifying the usage consideration as good or bad in this approach, but merely saying "Hey, here is a resource usage consideration you should or might have in mind, and here is how this component works in that regard". Does this make sense?

    So, in that sense I'd like to see us add descriptions to all these things we're tagging. Not saying it is a must for the PR, but adding "MEMORY" without explaining might just be alarming.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370388#comment-16370388 ] ASF GitHub Bot commented on NIFI-4872: -- Github user jtstorck commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r169409286 --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-documentation/src/main/java/org/apache/nifi/documentation/html/HtmlDocumentationWriter.java --- @@ -727,6 +729,41 @@ protected void writeLink(final XMLStreamWriter xmlStreamWriter, final String tex xmlStreamWriter.writeEndElement(); } +/** + * Writes all the system resource considerations for this component + * + * @param configurableComponent the component to describe + * @param xmlStreamWriter the xml stream writer to use + * @throws XMLStreamException thrown if there was a problem writing the XML + */ +private void writeSystemResourceConsiderationInfo(ConfigurableComponent configurableComponent, XMLStreamWriter xmlStreamWriter) +throws XMLStreamException { + +SystemResourceConsideration[] systemResourceConsiderations = configurableComponent.getClass().getAnnotationsByType(SystemResourceConsideration.class); + +writeSimpleElement(xmlStreamWriter, "h3", "System Resource Considerations:"); +if (systemResourceConsiderations.length > 0) { +xmlStreamWriter.writeStartElement("table"); +xmlStreamWriter.writeAttribute("id", "system-resource-considerations"); +xmlStreamWriter.writeStartElement("tr"); +writeSimpleElement(xmlStreamWriter, "th", "Resource"); +writeSimpleElement(xmlStreamWriter, "th", "Description"); +xmlStreamWriter.writeEndElement(); +for (SystemResourceConsideration systemResourceConsideration : systemResourceConsiderations) { +xmlStreamWriter.writeStartElement("tr"); +writeSimpleElement(xmlStreamWriter, "td", systemResourceConsideration.resource().name()); +// TODO allow for HTML characters here. 
--- End diff -- That TODO is also present on the reads/writes attributes code in HtmlProcessorDocumentationWriter. Since the functionality is similar, I added the TODO there as well. Will have to talk to @mcgilman about the intention there.
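For context on the TODO discussed in this message: the StAX `XMLStreamWriter.writeCharacters` method is required to escape `&`, `<`, and `>`, so any markup placed in a description is emitted as literal text rather than rendered as HTML. The following is a minimal sketch of that behavior, assuming the documentation helper writes cell text with `writeCharacters`. The class name `ConsiderationTableSketch`, the method `renderTable`, and the sample description are hypothetical and not part of the PR; only the JDK's StAX API is used.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical stand-in for the table-writing part of the documentation writer.
public class ConsiderationTableSketch {

    // Renders rows of {resource, description} as an HTML table.
    public static String renderTable(String[][] rows) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("table");
            writer.writeAttribute("id", "system-resource-considerations");
            writeRow(writer, "th", "Resource", "Description");
            for (String[] row : rows) {
                writeRow(writer, "td", row[0], row[1]);
            }
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write table", e);
        }
        return out.toString();
    }

    // Writes one <tr> whose cells are all of the given element type (th or td).
    private static void writeRow(XMLStreamWriter writer, String cell, String... values)
            throws XMLStreamException {
        writer.writeStartElement("tr");
        for (String value : values) {
            writer.writeStartElement(cell);
            // writeCharacters escapes &, < and >, so markup in the value stays literal text.
            writer.writeCharacters(value);
            writer.writeEndElement();
        }
        writer.writeEndElement();
    }

    public static void main(String[] args) {
        String[][] rows = {{"MEMORY", "Loads <b>whole</b> FlowFile into memory"}};
        System.out.println(renderTable(rows));
    }
}
```

Allowing HTML in descriptions would therefore need a deliberate choice, such as parsing the description into elements, rather than a change to the escaping itself.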
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370385#comment-16370385 ] ASF GitHub Bot commented on NIFI-4872: -- Github user joewitt commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r169408354 --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-documentation/src/main/java/org/apache/nifi/documentation/html/HtmlDocumentationWriter.java --- @@ -727,6 +729,41 @@ protected void writeLink(final XMLStreamWriter xmlStreamWriter, final String tex xmlStreamWriter.writeEndElement(); } +/** + * Writes all the system resource considerations for this component + * + * @param configurableComponent the component to describe + * @param xmlStreamWriter the xml stream writer to use + * @throws XMLStreamException thrown if there was a problem writing the XML + */ +private void writeSystemResourceConsiderationInfo(ConfigurableComponent configurableComponent, XMLStreamWriter xmlStreamWriter) +throws XMLStreamException { + +SystemResourceConsideration[] systemResourceConsiderations = configurableComponent.getClass().getAnnotationsByType(SystemResourceConsideration.class); + +writeSimpleElement(xmlStreamWriter, "h3", "System Resource Considerations:"); +if (systemResourceConsiderations.length > 0) { +xmlStreamWriter.writeStartElement("table"); +xmlStreamWriter.writeAttribute("id", "system-resource-considerations"); +xmlStreamWriter.writeStartElement("tr"); +writeSimpleElement(xmlStreamWriter, "th", "Resource"); +writeSimpleElement(xmlStreamWriter, "th", "Description"); +xmlStreamWriter.writeEndElement(); +for (SystemResourceConsideration systemResourceConsideration : systemResourceConsiderations) { +xmlStreamWriter.writeStartElement("tr"); +writeSimpleElement(xmlStreamWriter, "td", systemResourceConsideration.resource().name()); +// TODO allow for HTML characters here. --- End diff -- probably need/want to sort out this todo? 
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370379#comment-16370379 ] ASF GitHub Bot commented on NIFI-4872: -- Github user jtstorck commented on the issue: https://github.com/apache/nifi/pull/2475 @joewitt PR has been rebased against current master, and I've implemented some of the changes you requested.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368244#comment-16368244 ] ASF GitHub Bot commented on NIFI-4872: -- Github user pvillard31 commented on the issue: https://github.com/apache/nifi/pull/2475 A few suggestions regarding existing processors: ExtractText and ReplaceText can also be CPU intensive when using some tricky regular expressions. The same goes for the grok processors as well as TransformXML (depending on the XSLT). This is not true in most cases, but it can be in some situations. I will try to continue the review early next week.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368025#comment-16368025 ] ASF GitHub Bot commented on NIFI-4872: -- Github user jtstorck commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r168896206 --- Diff: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/PublishAMQP.java --- @@ -62,6 +64,7 @@ + "and Queue is not set up, the message will have no final destination and will return (i.e., the data will not make it to the queue). If " + "that happens you will see a log in both app-log and bulletin stating to that effect. Fixing the binding " + "(normally done by AMQP administrator) will resolve the issue.") +@HighResourceUsageScenario(resource = SystemResource.MEMORY) --- End diff -- The developer can provide a description using the "scenario" argument on the annotation. This first pass was to identify most of the processors that have the annotation. As we look through the list of components, specific descriptions can be added to override the default scenario from the annotation itself.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368020#comment-16368020 ] ASF GitHub Bot commented on NIFI-4872: -- Github user jtstorck commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r168895945 --- Diff: nifi-api/src/main/java/org/apache/nifi/annotation/behavior/HighResourceUsageScenario.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.nifi.annotation.behavior; + +import java.lang.annotation.Documented; +import java.lang.annotation.ElementType; +import java.lang.annotation.Inherited; +import java.lang.annotation.Repeatable; +import java.lang.annotation.Retention; +import java.lang.annotation.RetentionPolicy; +import java.lang.annotation.Target; + +/** + * Annotation that may be placed on a + * {@link org.apache.nifi.components.ConfigurableComponent Component} indicating that this + * component may cause high usage of a resource. 
+ */ +@Documented +@Target({ElementType.TYPE}) +@Retention(RetentionPolicy.RUNTIME) +@Inherited +@Repeatable(HighResourceUsageScenarios.class) +public @interface HighResourceUsageScenario { --- End diff -- I'm not tied to any of the names of classes that I've used so far. SystemResourceConsideration sounds good to me, especially since it has a wider range of meaning than just specifying higher resource usage.
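To illustrate how a repeatable annotation of this shape can be declared and read back, here is a minimal, self-contained sketch. It uses the names proposed in the review (`SystemResourceConsideration`, `SystemResourceConsiderations`, `SystemResource`) as local stand-ins declared inside a hypothetical class `ResourceAnnotationSketch`; it is not the code from the PR. The `description` element, its default text, and `ExampleProcessor` are assumptions made for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ResourceAnnotationSketch {

    // Fixed list of resources, as described in the issue.
    public enum SystemResource { CPU, DISK, MEMORY, NETWORK }

    // Repeatable annotation: one instance per resource concern.
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @Inherited
    @Repeatable(SystemResourceConsiderations.class)
    public @interface SystemResourceConsideration {
        SystemResource resource();
        // Assumed element name and default text, for illustration only.
        String description() default "May use a large amount of this resource.";
    }

    // Container annotation required by @Repeatable; it must also be @Inherited.
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @Inherited
    public @interface SystemResourceConsiderations {
        SystemResourceConsideration[] value();
    }

    // Hypothetical component carrying two considerations.
    @SystemResourceConsideration(resource = SystemResource.MEMORY,
            description = "Each message is loaded fully into memory.")
    @SystemResourceConsideration(resource = SystemResource.CPU)
    public static class ExampleProcessor { }

    // Collects every consideration declared on the class, one per line.
    public static String describe(Class<?> componentClass) {
        StringBuilder sb = new StringBuilder();
        for (SystemResourceConsideration c
                : componentClass.getAnnotationsByType(SystemResourceConsideration.class)) {
            sb.append(c.resource().name()).append(": ").append(c.description()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(ExampleProcessor.class));
    }
}
```

`getAnnotationsByType` looks through the container annotation, so callers do not need to handle the single and repeated cases separately.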
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367513#comment-16367513 ] ASF GitHub Bot commented on NIFI-4872: -- Github user joewitt commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r168796980 --- Diff: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/PublishAMQP.java --- @@ -62,6 +64,7 @@ + "and Queue is not set up, the message will have no final destination and will return (i.e., the data will not make it to the queue). If " + "that happens you will see a log in both app-log and bulletin stating to that effect. Fixing the binding " + "(normally done by AMQP administrator) will resolve the issue.") +@HighResourceUsageScenario(resource = SystemResource.MEMORY) --- End diff -- We need to be able to articulate the memory usage. Is it that every message published is fully loaded into memory in a byte[] therefore large messages will consume large amounts of heap? Same for a lot of items below. We need to be able to let the developer explain. In some cases we have processors that operate on batches of things and people will worry it is the batch that is the problem. But in reality it is that if any single event/record is large within a batch that single event will be in mem/etc...
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367508#comment-16367508 ] ASF GitHub Bot commented on NIFI-4872: -- Github user joewitt commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r168796396 --- Diff: nifi-api/src/main/java/org/apache/nifi/annotation/behavior/HighResourceUsageScenarios.java --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.nifi.annotation.behavior; + +import java.lang.annotation.Documented; +import java.lang.annotation.ElementType; +import java.lang.annotation.Inherited; +import java.lang.annotation.Retention; +import java.lang.annotation.RetentionPolicy; +import java.lang.annotation.Target; + +/** + * Annotation that may be placed on a + * {@link org.apache.nifi.components.ConfigurableComponent Component} indicating that this + * component may cause high usage of resources. + * + */ +@Documented +@Target({ElementType.TYPE}) +@Retention(RetentionPolicy.RUNTIME) +@Inherited +public @interface HighResourceUsageScenarios { --- End diff -- SystemResourceConsiderations instead? 
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367507#comment-16367507 ] ASF GitHub Bot commented on NIFI-4872: -- Github user joewitt commented on a diff in the pull request: https://github.com/apache/nifi/pull/2475#discussion_r168796071 --- Diff: nifi-api/src/main/java/org/apache/nifi/annotation/behavior/HighResourceUsageScenario.java --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.nifi.annotation.behavior; + +import java.lang.annotation.Documented; +import java.lang.annotation.ElementType; +import java.lang.annotation.Inherited; +import java.lang.annotation.Repeatable; +import java.lang.annotation.Retention; +import java.lang.annotation.RetentionPolicy; +import java.lang.annotation.Target; + +/** + * Annotation that may be placed on a + * {@link org.apache.nifi.components.ConfigurableComponent Component} indicating that this + * component may cause high usage of a resource. 
+ */ +@Documented +@Target({ElementType.TYPE}) +@Retention(RetentionPolicy.RUNTIME) +@Inherited +@Repeatable(HighResourceUsageScenarios.class) +public @interface HighResourceUsageScenario { --- End diff -- What do you think about calling this 'SystemResourceConsideration' instead of HighResourceUsageScenario? It takes 'SystemResource' types which make sense to me and this isn't just about 'high usage' it is also about helping provide the developer a way to articulate these concerns to a user. We get questions all the time about 'Can you compress large objects' - and the answer is yes because it is done in a streaming/small buffer fashion regardless of whether something is 10KB or 10GB.
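The "streaming/small buffer" point in this comment can be sketched as follows: content is copied through a fixed-size buffer into a compressing stream, so the heap needed for the copy depends on the buffer size rather than on the size of the content. This is a generic illustration using the JDK's `GZIPOutputStream`, not NiFi's compression processor; the class name `StreamingCompressSketch`, the method `compress`, and the 8 KB buffer size are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

public class StreamingCompressSketch {

    // Compresses everything readable from 'in' into 'out' and returns the number
    // of uncompressed bytes consumed. Only one 8 KB buffer is held at a time.
    public static long compress(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                gzip.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // In-memory streams keep the example self-contained; a real flow would
        // read from and write to streams backed by files or the network.
        byte[] input = new byte[1_000_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long consumed = compress(new ByteArrayInputStream(input), sink);
        System.out.println("uncompressed bytes read: " + consumed);
        System.out.println("compressed bytes written: " + sink.size());
    }
}
```

By contrast, a component that first reads the whole content into a `byte[]` needs heap proportional to the content size, which is the kind of distinction a description on the annotation could spell out for users.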
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366275#comment-16366275 ] ASF GitHub Bot commented on NIFI-4872: -- GitHub user jtstorck opened a pull request: https://github.com/apache/nifi/pull/2475 NIFI-4872 Added annotation for specifying scenarios in which components can cause high usage of system resources. Thank you for submitting a contribution to Apache NiFi. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [x] Does your PR title start with NIFI- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [x] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder? - [x] Have you written or updated unit tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly? - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly? - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? 
### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jtstorck/nifi NIFI-4872 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2475.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2475 commit 4906f2d1545f94c1ec264fc0a65615412b81cbe9 Author: Jeff Storck Date: 2018-02-15T18:12:49Z NIFI-4872 Added annotation for specifying scenarios in which components can cause high usage of system resources. commit ec85dadc21c9081297d6dcb1ae0424c33ed6f42b Author: Jeff Storck Date: 2018-02-15T20:03:39Z NIFI-4872 Initial set of components marked with the HighResourceUsageScenario annotation.
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363069#comment-16363069 ] Pierre Villard commented on NIFI-4872: -- [~jtstorck] - cool! Sounds good to me!
[jira] [Commented] (NIFI-4872) NIFI component high resource usage annotation
[ https://issues.apache.org/jira/browse/NIFI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361970#comment-16361970 ] Pierre Villard commented on NIFI-4872: -- Would it also make sense to add a description field in this annotation? I'm thinking about Merge and Split processors: we often remind users to process in two steps when using such a processor with a huge file. We could also update the capability description though. > NIFI component high resource usage annotation > - > > Key: NIFI-4872 > URL: https://issues.apache.org/jira/browse/NIFI-4872 > Project: Apache NiFi > Issue Type: New Feature > Components: Core Framework, Core UI >Affects Versions: 1.5.0 >Reporter: Jeff Storck >Assignee: Jeff Storck >Priority: Critical > > NiFi Processors currently have no means to relay whether or not they have may > be resource intensive or not. The idea here would be to introduce an > Annotation that can be added to Processors that indicate they may cause high > memory, disk, CPU, or network usage. For instance, any Processor that reads > the FlowFile contents into memory (like many XML Processors for instance) may > cause high memory usage. What ultimately determines if there is high > memory/disk/cpu/network usage will depend on the FlowFiles being processed. > With many of these components in the dataflow, it increases the risk of > OutOfMemoryErrors and performance degradation. > The annotation should support one or more values from a fixed list of: CPU, > Disk, Memory, Network. > By marking components with this new Annotation, we can update the generated > Processor documentation to include this fact. -- This message was sent by Atlassian JIRA (v7.6.3#76005)