Adam Turley created NIFI-15985:
----------------------------------
Summary: Add per-document Index Field and Timestamp Field
extraction to PutElasticsearchJson
Key: NIFI-15985
URL: https://issues.apache.org/jira/browse/NIFI-15985
Project: Apache NiFi
Issue Type: Improvement
Affects Versions: 2.9.0
Reporter: Adam Turley
Assignee: Adam Turley
The Elasticsearch index name can only be set via the Index property using
Expression Language over FlowFile attributes. For NDJSON and JSON Array
workloads where each document carries its own routing metadata (e.g. an _index
or data_stream field), users must add upstream processors to extract and
promote those values into FlowFile attributes before reaching
PutElasticsearchJson. Similarly, there is no mechanism to map a document field
to Elasticsearch's @timestamp field. Additionally, when the existing Identifier
Field is used to set the document _id, the source field remains in the document
body with no option to remove it.
Desired Behavior:
An Index Field property should allow users to specify a field within each
document whose value is used as the Elasticsearch index name, falling back to
the configured Index property when absent or blank. This should work across all
three input formats (NDJSON, JSON Array, Single JSON).
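The fallback rule above can be sketched as follows. This is only an illustration of the requested semantics, not NiFi code: the function name resolve_index and its parameters are hypothetical, and treating null or whitespace-only values as "blank" is an assumption.

```python
from typing import Any, Mapping, Optional


def resolve_index(document: Mapping[str, Any],
                  index_field: Optional[str],
                  default_index: str) -> str:
    """Pick the index for one document (hypothetical helper, not a NiFi API).

    default_index stands in for the value of the existing Index property
    after Expression Language evaluation.
    """
    # No Index Field configured: behave exactly as today.
    if not index_field:
        return default_index

    value = document.get(index_field)

    # Assumption: missing, null, and whitespace-only values all count as
    # "absent or blank" and fall back to the configured Index property.
    if value is None or not str(value).strip():
        return default_index

    return str(value)
```

For example, with Index Field set to "_index" and Index set to "fallback", a document {"_index": "logs-a"} would be sent to "logs-a", while {"_index": ""} or a document without that field would be sent to "fallback".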
A Timestamp Field property should allow users to specify a field within each
document whose value is written to Elasticsearch as @timestamp, across all
three input formats.
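A minimal sketch of the timestamp mapping, assuming the value is copied into the document body under "@timestamp" and that a missing or blank source value leaves the document unchanged. The function name apply_timestamp is hypothetical, and whether the source field is kept afterwards is a separate concern not handled here.

```python
from typing import Any, Dict, Mapping, Optional


def apply_timestamp(document: Mapping[str, Any],
                    timestamp_field: Optional[str]) -> Dict[str, Any]:
    """Return a copy of the document with @timestamp set (hypothetical helper)."""
    result = dict(document)  # shallow copy; the caller's document is untouched

    # No Timestamp Field configured: nothing to map.
    if not timestamp_field:
        return result

    value = result.get(timestamp_field)

    # Assumption: a missing, null, or blank value means no @timestamp is written.
    if value is not None and str(value).strip():
        result["@timestamp"] = value

    return result
```

With Timestamp Field set to "event_time", the document {"event_time": "2024-01-01T00:00:00Z", "msg": "hi"} would gain "@timestamp": "2024-01-01T00:00:00Z".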
Three Boolean properties, "Retain Identifier Field", "Retain Index Field", and
"Retain Timestamp Field", should be added: one for each of the fields above,
plus one for the existing Identifier Field. Each controls whether the source
field is kept in the document body after extraction. The default should be
false (remove the field), since these fields are typically routing or metadata
values rather than document content.
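The retain/remove behavior could look roughly like this. It is a sketch under stated assumptions, not the processor's implementation: extract_field and its parameters are hypothetical names, and only top-level fields are considered.

```python
from typing import Any, Dict, Mapping, Optional, Tuple


def extract_field(document: Mapping[str, Any],
                  field_name: Optional[str],
                  retain: bool = False) -> Tuple[Any, Dict[str, Any]]:
    """Return (extracted value, document body to index). Hypothetical helper.

    retain mirrors a "Retain ... Field" property: False (the proposed
    default) drops the source field from the body, True leaves it in place.
    """
    body = dict(document)  # shallow copy so the input is not mutated

    # Property not set, or the field is not present in this document.
    if not field_name or field_name not in body:
        return None, body

    value = body[field_name] if retain else body.pop(field_name)
    return value, body
```

With Identifier Field set to "id" and the default retain=False, the document {"id": "42", "msg": "hi"} yields the identifier "42" and the body {"msg": "hi"}. With retain=True the body is indexed unchanged.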