[nifi] branch main updated: NIFI-9319 Make edits and corrections to latest additions to User Guide

pvillard Fri, 22 Oct 2021 00:51:26 -0700

This is an automated email from the ASF dual-hosted git repository.

pvillard pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/nifi.git



The following commit(s) were added to refs/heads/main by this push:
     new 77c6f0a  NIFI-9319 Make edits and corrections to latest additions to User Guide
77c6f0a is described below

commit 77c6f0a819d2b52d986042ab7dd5bed6ca500ae5
Author: Andrew Lim <[email protected]>
AuthorDate: Thu Oct 21 12:37:39 2021 -0400

    NIFI-9319 Make edits and corrections to latest additions to User Guide
    
    Signed-off-by: Pierre Villard <[email protected]>
    
    This closes #5474.
---
 .../asciidoc/images/configure-process-group.png    | Bin 65302 -> 0 bytes
 .../images/process-group-configuration-window.png  | Bin 102300 -> 118585 bytes
 nifi-docs/src/main/asciidoc/user-guide.adoc        |  52 ++++++++++-----------
 3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/nifi-docs/src/main/asciidoc/images/configure-process-group.png b/nifi-docs/src/main/asciidoc/images/configure-process-group.png
deleted file mode 100644
index 2b1076d..0000000
Binary files a/nifi-docs/src/main/asciidoc/images/configure-process-group.png and /dev/null differ
diff --git a/nifi-docs/src/main/asciidoc/images/process-group-configuration-window.png b/nifi-docs/src/main/asciidoc/images/process-group-configuration-window.png
index 8921129..58b9dd6 100644
Binary files a/nifi-docs/src/main/asciidoc/images/process-group-configuration-window.png and b/nifi-docs/src/main/asciidoc/images/process-group-configuration-window.png differ
diff --git a/nifi-docs/src/main/asciidoc/user-guide.adoc b/nifi-docs/src/main/asciidoc/user-guide.adoc
index b8ff7ca..7583759 100644
--- a/nifi-docs/src/main/asciidoc/user-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/user-guide.adoc
@@ -210,8 +210,8 @@ The available component-level access policies are:
 |view the component   |Allows users to view component configuration details
 |modify the component  |Allows users to modify component configuration details
 |view provenance   |Allows users to view provenance events generated by this 
component
-|view the data     |Allows users to view metadata and content for this 
component in flowfile queues in outbound connections and through provenance 
events
-|modify the data   |Allows users to empty flowfile queues in outbound 
connections and submit replays through provenance events
+|view the data     |Allows users to view metadata and content for this 
component in FlowFile queues in outbound connections and through provenance 
events
+|modify the data   |Allows users to empty FlowFile queues in outbound 
connections and submit replays through provenance events
 |view the policies |Allows users to view the list of users who can view and 
modify a component
 |modify the policies  |Allows users to modify the list of users who can view 
and modify a component
 |retrieve data via site-to-site  |Allows a port to receive data from NiFi 
instances
@@ -301,7 +301,7 @@ While the options available from the context menu vary, the following options ar
 NOTE: For Processors, Ports, Remote Process Groups, Connections and Labels, it 
is possible to open the configuration dialog by double-clicking on the desired 
component.
 
 - *Start* or *Stop*: This option allows the user to start or stop a Processor; 
the option will be either Start or Stop, depending on the current state of the 
Processor.
-- *Run Once*: This option allows the user to run a selected Processor exactly 
once. If the Processor is prevented from executing (e.g. there are no incoming 
FlowFiles or the outgoing connection has back pressure applied) the Processor 
won't get triggered. *Execution* settings apply - i.e. *Primary Node* and *All 
Nodes* setting will result in running the Processor only once on the Primary 
Node or one time on each of the nodes, respectively. Works only with *Timer 
Driven* and *CRON driven* [...]
+- *Run Once*: This option allows the user to run a selected Processor exactly 
once. If the Processor is prevented from executing (e.g., there are no incoming 
FlowFiles or the outgoing connection has back pressure applied) the Processor 
won't get triggered. *Execution* settings apply (i.e., *Primary Node* and *All 
Nodes* settings will result in running the Processor only once on the Primary 
Node or one time on each of the nodes, respectively). Works only with *Timer 
driven* and *CRON driv [...]
 - *Enable* or *Disable*: This option allows the user to enable or disable a 
Processor; the option will be either Enable or Disable, depending on the 
current state of the Processor.
 - *View data provenance*: This option displays the NiFi Data Provenance table, 
with information about data provenance events for the FlowFiles routed through 
that Processor (see <<data_provenance>>).
 - *View status history*: This option opens a graphical representation of the 
Processor's statistical information over time.
@@ -653,7 +653,7 @@ The 'Run Schedule' dictates how often the Processor should be scheduled to run.
 Scheduling Strategy (see above). If using the Event driven Scheduling 
Strategy, this field is not available. When using the Timer driven
 Scheduling Strategy, this value is a time duration specified by a number 
followed by a time unit. For example, `1 second` or `5 mins`.
 The default value of `0 sec` means that the Processor should run as often as 
possible as long as it has data to process. This is true
-for any time duration of 0, regardless of the time unit (i.e., `0 sec`, `0 
mins`, `0 days`). For an explanation of values that are
+for any time duration of 0, regardless of the time unit (e.g., `0 sec`, `0 
mins`, `0 days`). For an explanation of values that are
 applicable for the CRON driven Scheduling Strategy, see the description of the 
CRON driven Scheduling Strategy itself.
 
 ===== Execution
@@ -731,7 +731,7 @@ You can access additional documentation about each Processor's usage by right-cl
 === Configuring a Process Group
 To configure a Process Group, right-click on the Process Group and select the 
`Configure` option from the context menu. The configuration dialog is opened 
with two tabs: General and Controller Services.
 
-image::configure-process-group.png["Configure Process Group"]
+image::process-group-configuration-window.png["Configure Process Group"]
 
 
 [[General_tab_ProcessGroup]]
@@ -740,7 +740,7 @@ This tab contains several different configuration items. First is the Process Gr
 
 The next configuration element is the Process Group Parameter Context, which 
is used to provide parameters to components of the flow. From this drop-down, 
the user is able to choose which Parameter Context should be bound to this 
Process Group and can optionally create a new one to bind to the Process Group. 
For more information refer to <<Parameters>> and <<parameter-contexts,Parameter 
Contexts>>.
 
-The third element in the configuration dialog is the Process Group Comments. 
This provides a mechanism for providing any useful information or context about 
the Process Group.
+The third element in the configuration dialog is the Process Group Comments. 
This provides a mechanism to add any useful information about the Process Group.
 
 The next two elements, Process Group FlowFile Concurrency and Process Group 
Outbound Policy, are covered in the following sections.
 
@@ -784,14 +784,14 @@ data that arrives at an Output Port is immediately transferred out of the Proces
 When the Outbound Policy is configured to "Batch Output", the Output Ports 
will not transfer data out of the Process Group until
 all data that is in the Process Group is queued up at an Output Port (i.e., no 
data leaves the Process Group until all of the data has finished processing).
 It doesn't matter whether the data is all queued up for the same Output Port, 
or if some data is queued up for Output Port A while other data is queued up
-for Output Port B. These conditions are both considered the same in terms of 
the completion of the FlowFile Processing.
+for Output Port B. These conditions are both considered the same in terms of 
the completion of the FlowFile processing.
 
 Using an Outbound Policy of "Batch Output" along with a FlowFile Concurrency 
of "Single FlowFile Per Node" allows a user to easily ingest a single FlowFile
 (which in and of itself may represent a batch of data) and then wait until all 
processing of that FlowFile has completed before continuing on to the next step
 in the dataflow (i.e., the next component outside of the Process Group). 
Additionally, when using this mode, each FlowFile that is transferred out of 
the Process Group
 will be given a series of attributes named "batch.output.<Port Name>" for each 
Output Port in the Process Group. The value will be equal to the number of 
FlowFiles
-that were routed to that Output Port for this batch of data. For example, 
consider a case where a single FlowFile is split into 5 FlowFiles, and two 
FlowFiles go to Output Port A, one goes
-to Output Port B, and two go to Output Port C, and no FlowFiles go to Output 
Port D. In this case, each FlowFile will have attributes `batch.output.A = 2`,
+that were routed to that Output Port for this batch of data. For example, 
consider a case where a single FlowFile is split into 5 FlowFiles: two 
FlowFiles go to Output Port A, one goes
+to Output Port B, two go to Output Port C, and no FlowFiles go to Output Port 
D. In this case, each FlowFile will have attributes `batch.output.A = 2`,
 `batch.output.B = 1`, `batch.output.C = 2`, `batch.output.D = 0`.
 
 The Outbound Policy of "Batch Output" doesn't provide any benefits when used 
in conjunction with a FlowFile Concurrency of "Unbounded".
@@ -801,7 +801,7 @@ As a result, the Outbound Policy is ignored if the FlowFile Concurrency is set t
 [[Connecting_Batch_Oriented_Groups]]
 ===== Connecting Batch-Oriented Process Groups
 
-A common use case in NiFi is to perform some batch-oriented process and only 
after that process completes perform another process on that same batch of data.
+A common use case in NiFi is to perform some batch-oriented process and only 
after that process completes, perform another process on that same batch of 
data.
 
 NiFi makes this possible by encapsulating each of these processes in its own 
Process Group. The Outbound Policy of the first Process Group should be 
configured as "Batch Output"
 while the FlowFile Concurrency should be either "Single FlowFile Per Node" or 
"Single Batch Per Node". With this configuration, the first Process Group
@@ -809,7 +809,7 @@ will process an entire batch of data (which will either be a single FlowFile or
 When processing has completed for that batch of data, the data will be held 
until all FlowFiles are finished processing and ready to leave the Process 
Group. At that point, the data can be transferred out of the Process Group as a 
batch. This configuration - when a Process Group is configured with an Outbound 
Policy of "Batch Output"
 and an Output Port is connected directly to the Input Port of a Process Group 
with a FlowFile Concurrency of "Single Batch Per Node" - is treated as a 
slightly special case.
 The receiving Process Group will ingest data not only until its input queues 
are empty but until they are empty AND the source Process Group has transferred 
all of the data from that
-batch out of the Process Group. This allows a collection of FlowFiles to be 
transferred as a single batch of data between Process Groups - even if those 
FlowFiles
+batch out of the Process Group. This allows a collection of FlowFiles to be 
transferred as a single batch of data between Process Groups, even if those 
FlowFiles
 are spread across multiple ports.
 
 
@@ -837,10 +837,10 @@ See <<Backpressure>> for more information.
 ===== Default Settings for Connections
 The final three elements in the Process Group configuration dialog are for 
Default FlowFile Expiration, Default Back Pressure Object Threshold, and
 Default Back Pressure Data Size Threshold. These settings configure the 
default values when creating a new Connection. Each Connection represents a 
queue,
-and every queue has settings for flowfile expiration, back pressure object 
count, and back pressure data size. The settings specified here will effect the
-default values for all new Connections created within the Process Group; it 
will not effect existing Connections. Child Process Groups created within the
-configured Process Group will inherit the default settings. Again, existing 
Process Groups will not be effected. If not overridden with these options, the
-root Process Group obtains its default back pressure settings from 
nifi.properties, and has a default FlowFile expiration of "0 sec", i.e. do not 
expire.
+and every queue has settings for FlowFile expiration, back pressure object 
count, and back pressure data size. The settings specified here will affect the
+default values for all new Connections created within the Process Group; it 
will not affect existing Connections. Child Process Groups created within the
+configured Process Group will inherit the default settings. Again, existing 
Process Groups will not be affected. If not overridden with these options, the
+root Process Group obtains its default back pressure settings from 
`nifi.properties`, and has a default FlowFile expiration of "0 sec" (i.e., do 
not expire).
 
 NOTE: Setting the Default FlowFile Expiration to a non-zero value may lead to 
data loss due to a FlowFile expiring as its time limit is reached.
 
@@ -918,7 +918,7 @@ The Referencing Components section now lists an aggregation of all the component
 ==== Parameters and Expression Language
 
 When adding a Parameter that makes use of the Expression Language, it is 
important to understand the context in which the Expression Language will be 
evaluated. The expression is always evaluated
-in the context of the Process or Controller Service that references the 
Parameter. Take, for example, a scenario where Parameter with the name `Time` 
is added with a value of `${now()}`. The
+in the context of the Processor or Controller Service that references the 
Parameter. Take, for example, a scenario where a Parameter with the name `Time` 
is added with a value of `${now()}`. The
 Expression Language results in a call to determine the system time when it is 
evaluated. When added as a Parameter, the system time is not evaluated when the 
Parameter is added, but rather when a
 Processor or Controller Service evaluates the Expression. That is, if a 
Processor has a Property whose value is set to `#{Time}` it will function in 
exactly the same manner as if the Property's
 value were set to `${now()}`. Each time that the property is referenced, it 
will produce a different timestamp.
@@ -1138,7 +1138,7 @@ image::variable-putfile-property.png["Processor Property Using Variable"]
 
 ===== Variable Scope
 
-Variables are scoped by the Process Group they are defined in and are 
available to any Processor defined at that level and below (i.e. any descendant 
Processors).
+Variables are scoped by the Process Group they are defined in and are 
available to any Processor defined at that level and below (i.e., any 
descendant Processors).
 
 Variables in a descendant group override the value in a parent group.  More 
specifically, if a variable `x` is declared at the root group and also declared 
inside a process group, components inside the process group will use the value 
of `x` defined in the process group.
 
@@ -1456,7 +1456,7 @@ The following prioritizers are available:
 ** Note that an UpdateAttribute processor should be used to add the "priority" 
attribute to the FlowFiles before they reach a connection that has this 
prioritizer set.
 ** If only one has that attribute it will go first.
 ** Values for the "priority" attribute can be alphanumeric, where "a" will 
come before "z" and "1" before "9"
-** If "priority" attribute cannot be parsed as a long, unicode string ordering 
will be used. For example: "99" and "100" will be ordered so the flowfile with 
"99" comes first, but "A-99" and "A-100" will sort so the flowfile with "A-100" 
comes first.
+** If "priority" attribute cannot be parsed as a long, unicode string ordering 
will be used. For example: "99" and "100" will be ordered so the FlowFile with 
"99" comes first, but "A-99" and "A-100" will sort so the FlowFile with "A-100" 
comes first.
 
 NOTE: With a <<load_balance_strategy>> configured, the connection has a queue 
per node in addition to the local queue. The prioritizer will sort the data in 
each queue independently.
 
@@ -1694,17 +1694,17 @@ be performed. The number of active tasks is shown in the top-right corner of the
 for more information). See <<terminating_tasks>> for how to terminate the 
running tasks.
 
 [[terminating_tasks]]
-=== Terminating a Component's tasks
+=== Terminating a Component's Tasks
 
 When a component is stopped, it does not interrupt the currently running 
tasks. This allows for the current execution to complete while no new
-tasks are scheduled, which is the desired behaviour in many cases. In some 
cases, it is desirable to terminate the running tasks, particularly
+tasks are scheduled, which is the desired behavior in many cases. In some 
cases, it is desirable to terminate the running tasks, particularly
 in cases where a task has hung and is no longer responsive, or while 
developing new flows.
 
 To be able to terminate the running task(s), the component must first be 
stopped (see <<stopping_components>>). Once the component is in the
-Stopped state, the Terminate option will become available only if there are 
tasks still running (See <<processor_anatomy>>). The Terminate option
-(image:iconTerminate.png["Terminate"]) can be accessed either via the context 
menu or the Operations Palette while the component is selected.
+Stopped state, the Terminate option will become available only if there are 
tasks still running (see <<processor_anatomy>>). The Terminate option
+(image:iconTerminate.png["Terminate"]) can be accessed via the context menu or 
the Operate Palette while the component is selected.
 
-The number of tasks that are actively being terminated will be displayed in 
parentheses next to the number of active tasks e.g. 
image:terminated-thread.png["Terminated-Threads"]. For example, if there is one 
active task at the time that Terminate is selected, this will display "0 (1)" - 
meaning
+The number of tasks that are actively being terminated will be displayed in 
parentheses next to the number of active tasks 
(image:terminated-thread.png["Terminated-Threads"]). For example, if there is 
one active task at the time that Terminate is selected, this will display "0 
(1)" - meaning
 0 active tasks and 1 task being terminated.
 
 A task may not terminate immediately, as different components may respond to 
the Terminate command differently. However, the components can be
@@ -2160,7 +2160,7 @@ The FlowFiles enqueued in a Connection can be viewed when necessary. The Queue l
 a Connection's context menu. The listing will return the top 100 FlowFiles in 
the active queue according to the
 configured priority. The listing can be performed even if the source and 
destination are actively running.
 
-Additionally, details for a Flowfile in the listing can be viewed by clicking 
the "Details" button (image:iconDetails.png["Details"]) in the left most 
column. From here, the FlowFile details and attributes are available as well as 
buttons for
+Additionally, details for a FlowFile in the listing can be viewed by clicking 
the "Details" button (image:iconDetails.png["Details"]) in the left most 
column. From here, the FlowFile details and attributes are available as well as 
buttons for
 downloading or viewing the content. Viewing the content is only available if 
the `nifi.content.viewer.url` has been configured.
 If the source or destination of the Connection are actively running, there is 
a chance that the desired FlowFile will
 no longer be in the active queue.
@@ -2761,7 +2761,7 @@ The provenance event types are:
 |FORK                    |Indicates that one or more FlowFiles were derived 
from a parent FlowFile
 |JOIN                    |Indicates that a single FlowFile is derived from 
joining together multiple parent FlowFiles
 |RECEIVE                 |Indicates a provenance event for receiving data from 
an external process
-|REMOTE_INVOCATION       |Indicates that a remote invocation was requested to 
an external endpoint (e.g. deleting a remote resource)
+|REMOTE_INVOCATION       |Indicates that a remote invocation was requested to 
an external endpoint (e.g., deleting a remote resource)
 |REPLAY                  |Indicates a provenance event for replaying a FlowFile
 |ROUTE                   |Indicates that a FlowFile was routed to a specified 
relationship and provides information about why the FlowFile was routed to this 
relationship
 |SEND                    |Indicates a provenance event for sending data to an 
external process
@@ -2868,7 +2868,7 @@ java.arg.13=-XX:+UseG1GC
 Many of the same system properties are supported by both the Persistent and 
Write Ahead configurations, however the default values have been chosen for a 
Persistent Provenance configuration. The following exceptions and 
recommendations should be noted when changing to a Write Ahead configuration:
 
 * `nifi.provenance.repository.journal.count` is not relevant to a Write Ahead 
configuration
-* `nifi.provenance.repository.concurrent.merge.threads` and 
`nifi.provenance.repository.warm.cache.frequency` are new properties.  The 
default values of `2` for threads and blank for frequency (i.e. disabled) 
should remain for most installations.
+* `nifi.provenance.repository.concurrent.merge.threads` and 
`nifi.provenance.repository.warm.cache.frequency` are new properties.  The 
default values of `2` for threads and blank for frequency (i.e., disabled) 
should remain for most installations.
 * Change the settings for `nifi.provenance.repository.max.storage.time` 
(default value of `24 hours`) and `nifi.provenance.repository.max.storage.size` 
(default value of `1 GB`) to values more suitable for your production 
environment
 * Change `nifi.provenance.repository.index.shard.size` from the default value 
of `500 MB` to `4 GB`
 * Change `nifi.provenance.repository.index.threads` from the default value of 
`2` to either `4` or `8` as the Write Ahead repository enables this to scale 
better
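
Reader's note on the "Batch Output" hunk above: the `batch.output.<Port Name>` counting rule from the user guide text can be illustrated with a small sketch. This is not NiFi code and does not use any NiFi API; the function name `batch_output_attributes` and its two parameters are hypothetical, chosen only for illustration. It assumes the full list of Output Port names is known and that each FlowFile in the batch is represented by the name of the port it was routed to.

```python
from collections import Counter


def batch_output_attributes(port_names, routed_ports):
    """Build the batch.output.<Port Name> attributes for one batch.

    port_names:   every Output Port in the Process Group (hypothetical input).
    routed_ports: for each FlowFile in the batch, the port it was routed to.
    """
    counts = Counter(routed_ports)
    # Every port gets an attribute, including ports that received nothing.
    # Counts are kept as integers here for readability.
    return {f"batch.output.{name}": counts[name] for name in port_names}


# The example from the guide: one FlowFile split into 5,
# two to Port A, one to Port B, two to Port C, none to Port D.
attrs = batch_output_attributes(
    ["A", "B", "C", "D"],
    ["A", "A", "B", "C", "C"],
)
for key, value in attrs.items():
    print(f"{key} = {value}")
```

Per the guide, each FlowFile leaving the Process Group would carry the same set of attributes, so the sketch returns a single mapping for the whole batch.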
