fgerlits commented on code in PR #1483:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1483#discussion_r1135682353
##########
PROCESSORS.md:
##########
@@ -249,165 +305,176 @@ In the list below, the names of required properties appear in bold. Any other pr
### Description
Compresses or decompresses the contents of FlowFiles using a user-specified compression algorithm and updates the mime.type attribute as appropriate
+
### Properties
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.
-| Name               | Default Value           | Allowable Values | Description |
-|--------------------|-------------------------|------------------|--------------------------------------------------------------------------------|
-| Compression Format | use mime.type attribute |                  | The compression format to use. |
-| Compression Level  | 1                       |                  | The compression level to use; this is valid only when using GZIP compression. |
-| Mode               | compress                |                  | Indicates whether the processor should compress content or decompress content. |
-| Update Filename    | false                   |                  | Determines if filename extension need to be updated |
+| Name               | Default Value           | Allowable Values                                                 | Description |
+|--------------------|-------------------------|------------------------------------------------------------------|--------------------------------------------------------------------------------|
+| Mode               | compress                | compress<br/>decompress                                          | Indicates whether the processor should compress content or decompress content. |
+| Compression Level  | 1                       |                                                                  | The compression level to use; this is valid only when using GZIP compression. |
+| Compression Format | use mime.type attribute | bzip2<br/>gzip<br/>lzma<br/>use mime.type attribute<br/>xz-lzma2 | The compression format to use. |
+| Update Filename    | false                   |                                                                  | Determines if filename extension need to be updated |
+| Encapsulate in TAR | true                    |                                                                  | If true, on compression the FlowFile is added to a TAR archive and then compressed, and on decompression a compressed, TAR-encapsulated FlowFile is expected.<br/>If false, on compression the content of the FlowFile simply gets compressed, and on decompression a simple compressed content is expected.<br/>true is the behaviour compatible with older MiNiFi C++ versions, false is the behaviour compatible with NiFi. |
+| Batch Size         | 1                       |                                                                  | Maximum number of FlowFiles processed in a single session |
+
### Relationships
| Name    | Description |
|---------|---------------------------------------------------------------------------------------------------------------|
-| failure | FlowFiles will be transferred to the failure relationship if they fail to compress/decompress |
| success | FlowFiles will be transferred to the success relationship after successfully being compressed or decompressed |
+| failure | FlowFiles will be transferred to the failure relationship if they fail to compress/decompress |
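
The properties documented above map onto keys of a processor entry in a MiNiFi C++ flow configuration; a minimal sketch, assuming the usual config.yml layout (the processor name, class spelling and scheduling values below are illustrative only, the property names and values come from the table):

```yaml
# Hypothetical CompressContent entry in config.yml; layout assumed.
Processors:
  - name: CompressAsGzip            # illustrative name
    class: CompressContent          # class spelling may differ per packaging
    scheduling strategy: EVENT_DRIVEN
    Properties:
      Mode: compress
      Compression Format: gzip
      Compression Level: 1
      Update Filename: false
      Encapsulate in TAR: false     # NiFi-compatible behaviour, per the table
      Batch Size: 1
```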
## ConsumeJournald
+
### Description
-Consume systemd-journald journal messages. Available on Linux only.
+
+Consume systemd-journald journal messages. Creates one flow file per message. Fields are mapped to attributes. Realtime timestamp is mapped to the 'timestamp' attribute. Available on Linux only.
### Properties
-All properties are required with a default value, making them effectively optional. None of the properties support the NiFi Expression Language.
-| Name                 | Default Value | Allowable Values | Description |
-|----------------------|---------------|------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Batch Size           | 1000          | Positive numbers | The maximum number of entries processed in a single execution. |
-| Payload Format       | Syslog        | Raw<br>Syslog    | Configures flow file content formatting.<br>Raw: only the message.<br>Syslog: similar to syslog or journalctl output. |
-| Include Timestamp    | true          | true<br>false    | Include message timestamp in the 'timestamp' attribute. |
-| Journal Type         | System        | User<br>System<br>Both | Type of journal to consume. |
-| Process Old Messages | false         | true<br>false    | Process events created before the first usage (schedule) of the processor instance. |
-| Timestamp Format     | %x %X %Z      | [date format](https://howardhinnant.github.io/date/date.html#to_stream_formatting) | Format string to use when creating the timestamp attribute or writing messages in the syslog format. ISO/ISO 8601/ISO8601 are equivalent to "%FT%T%Ez". |
+In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.
+
+| Name                     | Default Value | Allowable Values         | Description |
+|--------------------------|---------------|--------------------------|-----------------------------------------------------------------------------------------------------------------|
+| **Batch Size**           | 1000          |                          | The maximum number of entries processed in a single execution. |
+| **Payload Format**       | Syslog        | Raw<br/>Syslog           | Configures flow file content formatting. Raw: only the message. Syslog: similar to syslog or journalctl output. |
+| **Include Timestamp**    | true          |                          | Include message timestamp in the 'timestamp' attribute. |
+| **Journal Type**         | System        | Both<br/>System<br/>User | Type of journal to consume. |
+| **Process Old Messages** | false         |                          | Process events created before the first usage (schedule) of the processor instance. |
+| **Timestamp Format**     | %x %X %Z      |                          | Format string to use when creating the timestamp attribute or writing messages in the syslog format. |
### Relationships
-| Name | Description |
-|---------|--------------------------------|
-| success | Journal messages as flow files |
+| Name | Description |
+|---------|-----------------------------------------|
+| success | Successfully consumed journal messages. |
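
As above, a minimal sketch of how the ConsumeJournald properties might be set in config.yml (the entry name, class spelling and scheduling values are assumptions; the property names and defaults are taken from the table):

```yaml
# Hypothetical ConsumeJournald entry in config.yml; layout assumed.
Processors:
  - name: ReadSystemJournal         # illustrative name
    class: ConsumeJournald
    scheduling strategy: TIMER_DRIVEN
    scheduling period: 1 sec        # illustrative scheduling
    Properties:
      Batch Size: 1000
      Payload Format: Syslog
      Include Timestamp: true
      Journal Type: System
      Process Old Messages: false
      Timestamp Format: "%x %X %Z"
```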
## ConsumeKafka
### Description
Consumes messages from Apache Kafka and transform them into MiNiFi FlowFiles. The application should make sure that the processor is triggered at regular intervals, even if no messages are expected, to serve any queued callbacks waiting to be called. Rebalancing can also only happen on trigger.
+
### Properties
In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.
-| Name                         | Default Value  | Allowable Values | Description |
-|------------------------------|----------------|--------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Duplicate Header Handling    | Keep Latest    | Comma-separated Merge<br>Keep First<br>Keep Latest<br> | For headers to be added as attributes, this option specifies how to handle cases where multiple headers are present with the same key. For example in case of receiving these two headers: "Accept: text/html" and "Accept: application/xml" and we want to attach the value of "Accept" as a FlowFile attribute:<br/> - "Keep First" attaches: "Accept -> text/html"<br/> - "Keep Latest" attaches: "Accept -> application/xml"<br/> - "Comma-separated Merge" attaches: "Accept -> text/html, application/xml" |
-| **Group ID**                 |                | | A Group ID is used to identify consumers that are within the same consumer group. Corresponds to Kafka's 'group.id' property.<br/>**Supports Expression Language: true** |
-| Headers To Add As Attributes |                | | A comma separated list to match against all message headers. Any message header whose name matches an item from the list will be added to the FlowFile as an Attribute. If not specified, no Header values will be added as FlowFile attributes. The behaviour on when multiple headers of the same name are present is set using the DuplicateHeaderHandling attribute. |
-| **Honor Transactions**       | true           | | Specifies whether or not MiNiFi should honor transactional guarantees when communicating with Kafka. If false, the Processor will use an "isolation level" of read_uncomitted. This means that messages will be received as soon as they are written to Kafka but will be pulled, even if the producer cancels the transactions. If this value is true, MiNiFi will not receive any messages for which the producer's transaction was canceled, but this can result in some latency since the consumer must wait for the producer to finish its entire transaction instead of pulling as the messages become available. |
-| **Kafka Brokers**            | localhost:9092 | | A comma-separated list of known Kafka Brokers in the format <host>:<port>.<br/>**Supports Expression Language: true** |
-| Kerberos Keytab Path         |                | | The path to the location on the local filesystem where the kerberos keytab is located. Read permission on the file is required. |
-| Kerberos Principal           |                | | Keberos Principal |
-| Kerberos Service Name        |                | | Kerberos Service Name |
-| **Key Attribute Encoding**   | UTF-8          | Hex<br>UTF-8<br> | FlowFiles that are emitted have an attribute named 'kafka.key'. This property dictates how the value of the attribute should be encoded. |
-| Max Poll Records             | 10000          | | Specifies the maximum number of records Kafka should return when polling each time the processor is triggered. |
-| **Max Poll Time**            | 4 seconds      | | Specifies the maximum amount of time the consumer can use for polling data from the brokers. Polling is a blocking operation, so the upper limit of this value is specified in 4 seconds. |
-| Message Demarcator           |                | | Since KafkaConsumer receives messages in batches, you have an option to output FlowFiles which contains all Kafka messages in a single batch for a given topic and partition and this property allows you to provide a string (interpreted as UTF-8) to use for demarcating apart multiple Kafka messages. This is an optional property and if not provided each Kafka message received will result in a single FlowFile which time it is triggered. <br/>**Supports Expression Language: true** |
-| Message Header Encoding      | UTF-8          | Hex<br>UTF-8<br> | Any message header that is found on a Kafka message will be added to the outbound FlowFile as an attribute. This property indicates the Character Encoding to use for deserializing the headers. |
-| **Offset Reset**             | latest         | earliest<br>latest<br>none<br> | Allows you to manage the condition when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted). Corresponds to Kafka's 'auto.offset.reset' property. |
-| Password                     |                | | The password for the given username when the SASL Mechanism is sasl_plaintext |
-| SASL Mechanism               | GSSAPI         | GSSAPI<br/>PLAIN | The SASL mechanism to use for authentication. Corresponds to Kafka's 'sasl.mechanism' property. |
-| **Security Protocol**        | plaintext      | plaintext<br/>ssl<br/>sasl_plaintext<br/>sasl_ssl | Protocol used to communicate with brokers. Corresponds to Kafka's 'security.protocol' property. |
-| Session Timeout              | 60 seconds     | | Client group session and failure detection timeout. The consumer sends periodic heartbeats to indicate its liveness to the broker. If no hearts are received by the broker for a group member within the session timeout, the broker will remove the consumer from the group and trigger a rebalance. The allowed range is configured with the broker configuration properties group.min.session.timeout.ms and group.max.session.timeout.ms. |
-| SSL Context Service          |                | | SSL Context Service Name |
-| **Topic Name Format**        | Names          | Names<br>Patterns<br> | Specifies whether the Topic(s) provided are a comma separated list of names or a single regular expression. Using regular expressions does not automatically discover Kafka topics created after the processor started. |
-| **Topic Names**              |                | | The name of the Kafka Topic(s) to pull from. Multiple topic names are supported as a comma separated list.<br/>**Supports Expression Language: true** |
-| Username                     |                | | The username when the SASL Mechanism is sasl_plaintext |
-### Properties
+| Name                         | Default Value  | Allowable Values | Description |
+|------------------------------|----------------|------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| SSL Context Service          |                | | SSL Context Service Name |
+| **Security Protocol**        | plaintext      | plaintext<br/>sasl_plaintext<br/>sasl_ssl<br/>ssl | Protocol used to communicate with brokers. Corresponds to Kafka's 'security.protocol' property. |
+| Kerberos Service Name        |                | | Kerberos Service Name |
+| Kerberos Principal           |                | | Keberos Principal |
+| Kerberos Keytab Path         |                | | The path to the location on the local filesystem where the kerberos keytab is located. Read permission on the file is required. |
+| **SASL Mechanism**           | GSSAPI         | GSSAPI<br/>PLAIN | The SASL mechanism to use for authentication. Corresponds to Kafka's 'sasl.mechanism' property. |
+| Username                     |                | | The username when the SASL Mechanism is sasl_plaintext |
+| Password                     |                | | The password for the given username when the SASL Mechanism is sasl_plaintext |
+| **Kafka Brokers**            | localhost:9092 | | A comma-separated list of known Kafka Brokers in the format <host>:<port>.<br/>**Supports Expression Language: true** |
+| **Topic Names**              |                | | The name of the Kafka Topic(s) to pull from. Multiple topic names are supported as a comma separated list.<br/>**Supports Expression Language: true** |
+| **Topic Name Format**        | Names          | Names<br/>Patterns | Specifies whether the Topic(s) provided are a comma separated list of names or a single regular expression. Using regular expressions does not automatically discover Kafka topics created after the processor started. |
+| **Honor Transactions**       | true           | | Specifies whether or not MiNiFi should honor transactional guarantees when communicating with Kafka. If false, the Processor will use an "isolation level" of read_uncomitted. This means that messages will be received as soon as they are written to Kafka but will be pulled, even if the producer cancels the transactions. If this value is true, MiNiFi will not receive any messages for which the producer's transaction was canceled, but this can result in some latency since the consumer must wait for the producer to finish its entire transaction instead of pulling as the messages become available. |
+| **Group ID**                 |                | | A Group ID is used to identify consumers that are within the same consumer group. Corresponds to Kafka's 'group.id' property.<br/>**Supports Expression Language: true** |
+| **Offset Reset**             | latest         | earliest<br/>latest<br/>none | Allows you to manage the condition when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted). Corresponds to Kafka's 'auto.offset.reset' property. |
+| **Key Attribute Encoding**   | UTF-8          | Hex<br/>UTF-8 | FlowFiles that are emitted have an attribute named 'kafka.key'. This property dictates how the value of the attribute should be encoded. |
+| Message Demarcator           |                | | Since KafkaConsumer receives messages in batches, you have an option to output FlowFiles which contains all Kafka messages in a single batch for a given topic and partition and this property allows you to provide a string (interpreted as UTF-8) to use for demarcating apart multiple Kafka messages. This is an optional property and if not provided each Kafka message received will result in a single FlowFile which time it is triggered. <br/>**Supports Expression Language: true** |
+| Message Header Encoding      | UTF-8          | Hex<br/>UTF-8 | Any message header that is found on a Kafka message will be added to the outbound FlowFile as an attribute. This property indicates the Character Encoding to use for deserializing the headers. |
+| Headers To Add As Attributes |                | | A comma separated list to match against all message headers. Any message header whose name matches an item from the list will be added to the FlowFile as an Attribute. If not specified, no Header values will be added as FlowFile attributes. The behaviour on when multiple headers of the same name are present is set using the DuplicateHeaderHandling attribute. |
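
A corresponding sketch for ConsumeKafka, again assuming the usual config.yml layout; the broker, topic and group values are placeholders, while the property names come from the table above:

```yaml
# Hypothetical ConsumeKafka entry in config.yml; regular triggering keeps
# queued callbacks served, as the processor description recommends.
Processors:
  - name: ReadFromKafka             # illustrative name
    class: ConsumeKafka
    scheduling strategy: TIMER_DRIVEN
    scheduling period: 1 sec
    Properties:
      Kafka Brokers: localhost:9092
      Topic Names: example-topic    # placeholder topic
      Topic Name Format: Names
      Group ID: minifi-consumers    # placeholder group.id
      Offset Reset: latest
      Security Protocol: plaintext
      Honor Transactions: true
      Key Attribute Encoding: UTF-8
      Message Header Encoding: UTF-8
```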
Review Comment:
fixed in f48606ccce5a4ce7bcd3ddb73ed5eb013d03b5c8
