[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2475


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-27 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r171065265
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitXml.java
 ---
@@ -82,6 +84,7 @@
 description = "The number of split FlowFiles generated 
from the parent FlowFile"),
 @WritesAttribute(attribute = "segment.original.filename ", 
description = "The filename of the parent FlowFile")
 })
+@SystemResourceConsideration(resource = SystemResource.MEMORY)
--- End diff --

In this particular context, we are buffering the entirety of the FlowFile's 
content (as a Document object, which can take approximately 10 times as much 
heap as the size of the XML - i.e., a 1 MB XML document may take 10 MB of 
heap), in addition to all of the generated FlowFile objects. A two-stage 
approach may well be necessary for lots of splits, but even then if the XML is 
large you could potentially run out of heap space.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-27 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r171065023
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitText.java
 ---
@@ -87,6 +89,7 @@
 @WritesAttribute(attribute = "fragment.count", description = "The 
number of split FlowFiles generated from the parent FlowFile"),
 @WritesAttribute(attribute = "segment.original.filename ", description 
= "The filename of the parent FlowFile")})
 @SeeAlso(MergeContent.class)
+@SystemResourceConsideration(resource = SystemResource.MEMORY)
--- End diff --

I would again add a description here that indicates that it's not buffering 
the content in memory but rather just storing the FlowFile w/ its attributes in 
memory and that if generating too many splits, a two-phase approach may be 
necessary.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-27 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r171064833
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitJson.java
 ---
@@ -80,7 +82,7 @@
 description = "The number of split FlowFiles generated 
from the parent FlowFile"),
 @WritesAttribute(attribute = "segment.original.filename ", 
description = "The filename of the parent FlowFile")
 })
-
+@SystemResourceConsideration(resource = SystemResource.MEMORY)
--- End diff --

In this particular context, we are buffering the entirety of the FlowFile's 
content (as a JsonNode object), in addition to all of the generated FlowFile 
objects. A two-stage approach may well be necessary for lots of splits, but 
even then if the JSON is extremely large you could potentially run out of heap 
space.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-27 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r171064496
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitContent.java
 ---
@@ -75,6 +77,7 @@
 @WritesAttribute(attribute = "fragment.count", description = "The 
number of split FlowFiles generated from the parent FlowFile"),
 @WritesAttribute(attribute = "segment.original.filename ", description 
= "The filename of the parent FlowFile")})
 @SeeAlso(MergeContent.class)
+@SystemResourceConsideration(resource = SystemResource.MEMORY)
--- End diff --

I would again add a description here that indicates that it's not buffering 
the content in memory but rather just storing the FlowFile w/ its attributes in 
memory and that if generating too many splits, a two-phase approach may be 
necessary.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-27 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r171064049
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/MergeContent.java
 ---
@@ -131,6 +133,7 @@
 @WritesAttribute(attribute = "merge.bin.age", description = "The age 
of the bin, in milliseconds, when it was merged and output. Effectively "
 + "this is the greatest amount of time that any FlowFile in this 
bundle remained waiting in this processor before it was output") })
 @SeeAlso({SegmentContent.class, MergeRecord.class})
+@SystemResourceConsideration(resource = SystemResource.MEMORY)
--- End diff --

It would probably be helpful here to add a description that explains that 
the content itself is not stored in memory but rather the FlowFiles' attributes 
and that the configuration for max bin size, etc. will influence how much heap 
is used. Would also call out that if merging together many small FlowFiles, a 
two-stage approach may be necessary in order to avoid running out of memory.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-20 Thread jtstorck
Github user jtstorck commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r169409286
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-documentation/src/main/java/org/apache/nifi/documentation/html/HtmlDocumentationWriter.java
 ---
@@ -727,6 +729,41 @@ protected void writeLink(final XMLStreamWriter 
xmlStreamWriter, final String tex
 xmlStreamWriter.writeEndElement();
 }
 
+/**
+ * Writes all the system resource considerations for this component
+ *
+ * @param configurableComponent the component to describe
+ * @param xmlStreamWriter the xml stream writer to use
+ * @throws XMLStreamException thrown if there was a problem writing 
the XML
+ */
+private void 
writeSystemResourceConsiderationInfo(ConfigurableComponent 
configurableComponent, XMLStreamWriter xmlStreamWriter)
+throws XMLStreamException {
+
+SystemResourceConsideration[] systemResourceConsiderations = 
configurableComponent.getClass().getAnnotationsByType(SystemResourceConsideration.class);
+
+writeSimpleElement(xmlStreamWriter, "h3", "System Resource 
Considerations:");
+if (systemResourceConsiderations.length > 0) {
+xmlStreamWriter.writeStartElement("table");
+xmlStreamWriter.writeAttribute("id", 
"system-resource-considerations");
+xmlStreamWriter.writeStartElement("tr");
+writeSimpleElement(xmlStreamWriter, "th", "Resource");
+writeSimpleElement(xmlStreamWriter, "th", "Description");
+xmlStreamWriter.writeEndElement();
+for (SystemResourceConsideration systemResourceConsideration : 
systemResourceConsiderations) {
+xmlStreamWriter.writeStartElement("tr");
+writeSimpleElement(xmlStreamWriter, "td", 
systemResourceConsideration.resource().name());
+// TODO allow for HTML characters here.
--- End diff --

That TODO is also present on the reads/writes attributes code in 
HtmlProcessorDocumentationWriter.  Since the functionality is similar, I added 
the TODO there as well.  Will have to talk to @mcgilman about the intention 
there.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-20 Thread joewitt
Github user joewitt commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r169408354
  
--- Diff: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-documentation/src/main/java/org/apache/nifi/documentation/html/HtmlDocumentationWriter.java
 ---
@@ -727,6 +729,41 @@ protected void writeLink(final XMLStreamWriter 
xmlStreamWriter, final String tex
 xmlStreamWriter.writeEndElement();
 }
 
+/**
+ * Writes all the system resource considerations for this component
+ *
+ * @param configurableComponent the component to describe
+ * @param xmlStreamWriter the xml stream writer to use
+ * @throws XMLStreamException thrown if there was a problem writing 
the XML
+ */
+private void 
writeSystemResourceConsiderationInfo(ConfigurableComponent 
configurableComponent, XMLStreamWriter xmlStreamWriter)
+throws XMLStreamException {
+
+SystemResourceConsideration[] systemResourceConsiderations = 
configurableComponent.getClass().getAnnotationsByType(SystemResourceConsideration.class);
+
+writeSimpleElement(xmlStreamWriter, "h3", "System Resource 
Considerations:");
+if (systemResourceConsiderations.length > 0) {
+xmlStreamWriter.writeStartElement("table");
+xmlStreamWriter.writeAttribute("id", 
"system-resource-considerations");
+xmlStreamWriter.writeStartElement("tr");
+writeSimpleElement(xmlStreamWriter, "th", "Resource");
+writeSimpleElement(xmlStreamWriter, "th", "Description");
+xmlStreamWriter.writeEndElement();
+for (SystemResourceConsideration systemResourceConsideration : 
systemResourceConsiderations) {
+xmlStreamWriter.writeStartElement("tr");
+writeSimpleElement(xmlStreamWriter, "td", 
systemResourceConsideration.resource().name());
+// TODO allow for HTML characters here.
--- End diff --

probably need/want to sort out this todo?


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-16 Thread jtstorck
Github user jtstorck commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r168896206
  
--- Diff: 
nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/PublishAMQP.java
 ---
@@ -62,6 +64,7 @@
 + "and Queue is not set up, the message will have no final 
destination and will return (i.e., the data will not make it to the queue). If "
 + "that happens you will see a log in both app-log and bulletin 
stating to that effect. Fixing the binding "
 + "(normally done by AMQP administrator) will resolve the issue.")
+@HighResourceUsageScenario(resource = SystemResource.MEMORY)
--- End diff --

The developer can provide a description using the "scenario" argument on 
the annotation.  This first pass was to identify most of the processors that 
have the annotation.  As we look through the list of components, specific 
descriptions can be added to override the default scenario from the annotation 
itself.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-16 Thread jtstorck
Github user jtstorck commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r168895945
  
--- Diff: 
nifi-api/src/main/java/org/apache/nifi/annotation/behavior/HighResourceUsageScenario.java
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.annotation.behavior;
+
+import java.lang.annotation.Documented;
+import java.lang.annotation.ElementType;
+import java.lang.annotation.Inherited;
+import java.lang.annotation.Repeatable;
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+import java.lang.annotation.Target;
+
+/**
+ * Annotation that may be placed on a
+ * {@link org.apache.nifi.components.ConfigurableComponent Component} 
indicating that this
+ * component may cause high usage of a resource.
+ */
+@Documented
+@Target({ElementType.TYPE})
+@Retention(RetentionPolicy.RUNTIME)
+@Inherited
+@Repeatable(HighResourceUsageScenarios.class)
+public @interface HighResourceUsageScenario {
--- End diff --

I'm not tied to any of the names of classes that I've used so far.  
SystemResourceConsideration sounds good to me, especially since it has a wider 
range of meaning than just specifying higher resource usage.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-16 Thread joewitt
Github user joewitt commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r168796980
  
--- Diff: 
nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/PublishAMQP.java
 ---
@@ -62,6 +64,7 @@
 + "and Queue is not set up, the message will have no final 
destination and will return (i.e., the data will not make it to the queue). If "
 + "that happens you will see a log in both app-log and bulletin 
stating to that effect. Fixing the binding "
 + "(normally done by AMQP administrator) will resolve the issue.")
+@HighResourceUsageScenario(resource = SystemResource.MEMORY)
--- End diff --

We need to be able to articulate the memory usage.  Is it that every 
message published is fully loaded into memory in a byte[] therefore large 
messages will consume large amounts of heap?  Same for a lot of items below.  
We need to be able to let the developer explain.  In some cases we have 
processors that operate on batches of things and people will worry it is the 
batch that is the problem.  But in reality it is that if any single 
event/record is large within a batch that single event will be in mem/etc...


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-16 Thread joewitt
Github user joewitt commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r168796396
  
--- Diff: 
nifi-api/src/main/java/org/apache/nifi/annotation/behavior/HighResourceUsageScenarios.java
 ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.annotation.behavior;
+
+import java.lang.annotation.Documented;
+import java.lang.annotation.ElementType;
+import java.lang.annotation.Inherited;
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+import java.lang.annotation.Target;
+
+/**
+ * Annotation that may be placed on a
+ * {@link org.apache.nifi.components.ConfigurableComponent Component} 
indicating that this
+ * component may cause high usage of resources.
+ *
+ */
+@Documented
+@Target({ElementType.TYPE})
+@Retention(RetentionPolicy.RUNTIME)
+@Inherited
+public @interface HighResourceUsageScenarios {
--- End diff --

SystemResourceConsiderations instead?


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-16 Thread joewitt
Github user joewitt commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2475#discussion_r168796071
  
--- Diff: 
nifi-api/src/main/java/org/apache/nifi/annotation/behavior/HighResourceUsageScenario.java
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.annotation.behavior;
+
+import java.lang.annotation.Documented;
+import java.lang.annotation.ElementType;
+import java.lang.annotation.Inherited;
+import java.lang.annotation.Repeatable;
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+import java.lang.annotation.Target;
+
+/**
+ * Annotation that may be placed on a
+ * {@link org.apache.nifi.components.ConfigurableComponent Component} 
indicating that this
+ * component may cause high usage of a resource.
+ */
+@Documented
+@Target({ElementType.TYPE})
+@Retention(RetentionPolicy.RUNTIME)
+@Inherited
+@Repeatable(HighResourceUsageScenarios.class)
+public @interface HighResourceUsageScenario {
--- End diff --

What do you think about calling this 'SystemResourceConsideration' instead 
of HighResourceUsageScenario?  It takes 'SystemResource' types which make sense 
to me and this isn't just about 'high usage' it is also about helping provide 
the developer a way to articulate these concerns to a user.  We get questions 
all the time about 'Can you compress large objects' - and the answer is yes 
because it is done in a streaming/small buffer fashion regardless of whether 
something is 10KB or 10GB.


---


[GitHub] nifi pull request #2475: NIFI-4872 Added annotation for specifying scenarios...

2018-02-15 Thread jtstorck
GitHub user jtstorck opened a pull request:

https://github.com/apache/nifi/pull/2475

NIFI-4872 Added annotation for specifying scenarios in which components can 
cause high usage of system resources.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [x] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jtstorck/nifi NIFI-4872

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2475.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2475


commit 4906f2d1545f94c1ec264fc0a65615412b81cbe9
Author: Jeff Storck 
Date:   2018-02-15T18:12:49Z

NIFI-4872 Added annotation for specifying scenarios in which components can 
cause high usage of system resources.

commit ec85dadc21c9081297d6dcb1ae0424c33ed6f42b
Author: Jeff Storck 
Date:   2018-02-15T20:03:39Z

NIFI-4872 Initial set of components marked with the 
HighResourceUsageScenario annotation.




---