[
https://issues.apache.org/jira/browse/NIFI-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560274#comment-16560274
]
ASF GitHub Bot commented on NIFI-5469:
--------------------------------------
Github user andrewmlim commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2922#discussion_r205887314
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -2944,95 +3048,95 @@ at this time.
|====
|*Property*|*Description*
-|nifi.provenance.repository.directory.default*|The location of the
Provenance Repository. The default value is `./provenance_repository`. +
+|`nifi.provenance.repository.directory.default`*|The location of the
Provenance Repository. The default value is `./provenance_repository`. +
+
*NOTE*: Multiple provenance repositories can be specified by using the
*_nifi.provenance.repository.directory._* prefix with unique suffixes and
separate paths as values. +
+
For example, to provide two additional locations to act as part of the
provenance repository, a user could also specify additional properties with
keys of: +
+
-nifi.provenance.repository.directory.provenance1=/repos/provenance1 +
-nifi.provenance.repository.directory.provenance2=/repos/provenance2 +
+`nifi.provenance.repository.directory.provenance1=/repos/provenance1` +
+`nifi.provenance.repository.directory.provenance2=/repos/provenance2` +
+
Providing three total locations, including
`nifi.provenance.repository.directory.default`.
-|nifi.provenance.repository.max.storage.time|The maximum amount of time to
keep data provenance information. The default value is `24 hours`.
-|nifi.provenance.repository.max.storage.size|The maximum amount of data
provenance information to store at a time. The default value is `1 GB`.
-|nifi.provenance.repository.rollover.time|The amount of time to wait
before rolling over the latest data provenance information so that it is
available in the User Interface. The default value is `30 secs`.
-|nifi.provenance.repository.rollover.size|The amount of information to
roll over at a time. The default value is `100 MB`.
-|nifi.provenance.repository.query.threads|The number of threads to use for
Provenance Repository queries. The default value is `2`.
-|nifi.provenance.repository.index.threads|The number of threads to use for
indexing Provenance events so that they are searchable. The default value is
`2`.
+|`nifi.provenance.repository.max.storage.time`|The maximum amount of time
to keep data provenance information. The default value is `24 hours`.
+|`nifi.provenance.repository.max.storage.size`|The maximum amount of data
provenance information to store at a time. The default value is `1 GB`.
+|`nifi.provenance.repository.rollover.time`|The amount of time to wait
before rolling over the latest data provenance information so that it is
available in the User Interface. The default value is `30 secs`.
+|`nifi.provenance.repository.rollover.size`|The amount of information to
roll over at a time. The default value is `100 MB`.
+|`nifi.provenance.repository.query.threads`|The number of threads to use
for Provenance Repository queries. The default value is `2`.
+|`nifi.provenance.repository.index.threads`|The number of threads to use
for indexing Provenance events so that they are searchable. The default value
is `2`.
For flows that operate on a very high number of FlowFiles, the indexing
of Provenance events could become a bottleneck. If this is the case, a bulletin
will appear, indicating that
"The rate of the dataflow is exceeding the provenance recording rate.
Slowing down flow to accommodate." If this happens, increasing the value of
this property
may increase the rate at which the Provenance Repository is able to
process these records, resulting in better overall throughput.
-|nifi.provenance.repository.compress.on.rollover|Indicates whether to
compress the provenance information when rolling it over. The default value is
`true`.
-|nifi.provenance.repository.always.sync|If set to `true`, any change to
the repository will be synchronized to the disk, meaning that NiFi will ask the
operating system not to cache the information. This is very expensive and can
significantly reduce NiFi performance. However, if it is `false`, there could
be the potential for data loss if either there is a sudden power loss or the
operating system crashes. The default value is `false`.
-|nifi.provenance.repository.journal.count|The number of journal files that
should be used to serialize Provenance Event data. Increasing this value will
allow more tasks to simultaneously update the repository but will result in
more expensive merging of the journal files later. This value should ideally be
equal to the number of threads that are expected to update the repository
simultaneously, but 16 tends to work well in must environments. The default
value is `16`.
-|nifi.provenance.repository.indexed.fields|This is a comma-separated list
of the fields that should be indexed and made searchable. Fields that are not
indexed will not be searchable. Valid fields are: `EventType, FlowFileUUID,
Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship,
Details`. The default value is: `EventType, FlowFileUUID, Filename,
ProcessorID`.
-|nifi.provenance.repository.indexed.attributes|This is a comma-separated
list of FlowFile Attributes that should be indexed and made searchable. It is
blank by default. But some good examples to consider are 'filename', 'uuid',
and 'mime.type' as well as any custom attritubes you might use which are
valuable for your use case.
-|nifi.provenance.repository.index.shard.size|Large values for the shard
size will result in more Java heap usage when searching the Provenance
Repository but should provide better performance. The default value is `500 MB`.
-|nifi.provenance.repository.max.attribute.length|Indicates the maximum
length that a FlowFile attribute can be when retrieving a Provenance Event from
the repository. If the length of any attribute exceeds this value, it will be
truncated when the event is retrieved. The default value is `65536`.
+|`nifi.provenance.repository.compress.on.rollover`|Indicates whether to
compress the provenance information when rolling it over. The default value is
`true`.
+|`nifi.provenance.repository.always.sync`|If set to `true`, any change to
the repository will be synchronized to the disk, meaning that NiFi will ask the
operating system not to cache the information. This is very expensive and can
significantly reduce NiFi performance. However, if it is `false`, there could
be the potential for data loss if either there is a sudden power loss or the
operating system crashes. The default value is `false`.
+|`nifi.provenance.repository.journal.count`|The number of journal files
that should be used to serialize Provenance Event data. Increasing this value
will allow more tasks to simultaneously update the repository but will result
in more expensive merging of the journal files later. This value should ideally
be equal to the number of threads that are expected to update the repository
simultaneously, but 16 tends to work well in must environments. The default
value is `16`.
+|`nifi.provenance.repository.indexed.fields`|This is a comma-separated
list of the fields that should be indexed and made searchable. Fields that are
not indexed will not be searchable. Valid fields are: `EventType`,
`FlowFileUUID`, `Filename`, `TransitURI`, `ProcessorID`,
`AlternateIdentifierURI`, `Relationship`, `Details`. The default value is:
`EventType, FlowFileUUID, Filename, ProcessorID`.
+|`nifi.provenance.repository.indexed.attributes`|This is a comma-separated
list of FlowFile Attributes that should be indexed and made searchable. It is
blank by default. But some good examples to consider are `filename`, `uuid`,
and `mime.type` as well as any custom attritubes you might use which are
valuable for your use case.
+|`nifi.provenance.repository.index.shard.size`|Large values for the shard
size will result in more Java heap usage when searching the Provenance
Repository but should provide better performance. The default value is `500 MB`.
+|`nifi.provenance.repository.max.attribute.length`|Indicates the maximum
length that a FlowFile attribute can be when retrieving a Provenance Event from
the repository. If the length of any attribute exceeds this value, it will be
truncated when the event is retrieved. The default value is `65536`.
|====
=== Volatile Provenance Repository Properties
|====
|*Property*|*Description*
-|nifi.provenance.repository.buffer.size|The Provenance Repository buffer
size. The default value is `100000`.
+|`nifi.provenance.repository.buffer.size`|The Provenance Repository buffer
size. The default value is `100000`.
|====
=== Write Ahead Provenance Repository Properties
|====
|*Property*|*Description*
-|nifi.provenance.repository.directory.default*|The location of the
Provenance Repository. The default value is `./provenance_repository`. +
+|`nifi.provenance.repository.directory.default`*|The location of the
Provenance Repository. The default value is `./provenance_repository`. +
+
*NOTE*: Multiple provenance repositories can be specified by using the
*_nifi.provenance.repository.directory._* prefix with unique suffixes and
separate paths as values. +
+
For example, to provide two additional locations to act as part of the
provenance repository, a user could also specify additional properties with
keys of: +
+
- nifi.provenance.repository.directory.provenance1=/repos/provenance1 +
- nifi.provenance.repository.directory.provenance2=/repos/provenance2 +
+ `nifi.provenance.repository.directory.provenance1`=/repos/provenance1 +
+ `nifi.provenance.repository.directory.provenance2`=/repos/provenance2 +
+
Providing three total locations, including
`nifi.provenance.repository.directory.default`.
-|nifi.provenance.repository.max.storage.time|The maximum amount of time to
keep data provenance information. The default value is `24 hours`.
-|nifi.provenance.repository.max.storage.size|The maximum amount of data
provenance information to store at a time.
+|`nifi.provenance.repository.max.storage.time`|The maximum amount of time
to keep data provenance information. The default value is `24 hours`.
+|`nifi.provenance.repository.max.storage.size`|The maximum amount of data
provenance information to store at a time.
The default value is `1 GB`. The Data Provenance capability can consume
a great deal of storage space because so much data is kept.
For production environments, values of 1-2 TB or more is not uncommon.
The repository will write to a single "event file" (or set of
"event files" if multiple storage locations are defined, as described
above) for some period of time (defined by the
- nifi.provenance.repository.rollover.time and
nifi.provenance.repository.rollover.size properties). Data is always aged off
one file at a time,
+ `nifi.provenance.repository.rollover.time` and
`nifi.provenance.repository.rollover.size` properties). Data is always aged off
one file at a time,
so it is not advisable to write to a single "event file" for a
tremendous amount of time, as it will prevent old data from aging off as
smoothly.
-|nifi.provenance.repository.rollover.time|The amount of time to wait
before rolling over the "event file" that the repository is writing to.
-|nifi.provenance.repository.rollover.size|The amount of data to write to a
single "event file." The default value is `100 MB`. For production
+|`nifi.provenance.repository.rollover.time`|The amount of time to wait
before rolling over the "event file" that the repository is writing to.
+|`nifi.provenance.repository.rollover.size`|The amount of data to write to
a single "event file." The default value is `100 MB`. For production
environments where a very large amount of Data Provenance is generated,
a value of 1 GB is also very reasonable.
--- End diff --
Sounds good.
> Edits needed for LDAP and Kerberos login identity provider sections in Admin
> Guide
> ----------------------------------------------------------------------------------
>
> Key: NIFI-5469
> URL: https://issues.apache.org/jira/browse/NIFI-5469
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Documentation & Website
> Reporter: Andrew Lim
> Assignee: Andrew Lim
> Priority: Minor
>
> Going through the Authentication and Authorization sections of the Admin
> Guide, I noticed the following improvements could be made:
> * Removed “Kerberos Config File” property from kerberos-provider login
> identity provider (this was done because the same property exists in
> nifi.properties)
> * Corrected the “LDAP-based Users/Groups Referencing User Attribute” login
> identity provider example to refer to “member uid”
> * Added titles to login identity provider examples for improved
> readability/search
> * Changed UserGroupProvider property examples from bulleted lists to tables
> Also, text formatting for references to config files, directories, etc. needs
> to be made consistent. For example, config files like _nifi.properties_,
> _authorizers.xml_ should be italicized. Directories, properties and default
> values for properties should be monospaced.
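
The monospacing convention described above (property names wrapped in backticks in the first table cell) lends itself to a mechanical check. The following is only a minimal sketch, not part of the pull request: the function name `unformatted_properties`, the regular expression, and the sample rows are assumptions made for illustration, and it only handles rows whose first cell begins with a `nifi.` property name.

```python
import re

# Hypothetical pattern (an assumption, not taken from the NiFi build):
# the first cell of an AsciiDoc table row holding a NiFi property name,
# optionally wrapped in backticks and optionally followed by "*".
ROW = re.compile(r"^\|(`?)(nifi\.[A-Za-z0-9_.]+)(`?)(\*?)\|")


def unformatted_properties(lines):
    """Return property names in table rows that are not wrapped in backticks."""
    missing = []
    for line in lines:
        m = ROW.match(line)
        # A row is flagged unless both the opening and closing backtick are present.
        if m and not (m.group(1) and m.group(3)):
            missing.append(m.group(2))
    return missing


# Sample rows modelled on the table rows in the diff above.
sample = [
    "|nifi.provenance.repository.buffer.size|The Provenance Repository buffer size.",
    "|`nifi.provenance.repository.rollover.time`|The amount of time to wait.",
    "|`nifi.provenance.repository.directory.default`*|The location of the Provenance Repository.",
]
print(unformatted_properties(sample))
```

Run against the full administration-guide.adoc, a check along these lines would list rows still using the old unformatted style; rows wrapped across several lines would need extra handling that this sketch does not attempt.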