[
https://issues.apache.org/jira/browse/NIFI-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Sampson updated NIFI-7990:
--------------------------------
Description:
PUT Elasticsearch should support the new [Elasticsearch Data
Streams|https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#add-documents-to-a-data-stream]
(new in Elasticsearch 7.9).
NIFI-7474 will allow these processors to submit {{create}} operations via the
_bulk API (which is a large part of the requirement).
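
For illustration only, here is a minimal sketch of what a {{_bulk}} request body made up solely of {{create}} operations could look like when it targets a data stream. The helper name {{build_bulk_create_body}} and the stream name in the usage note are hypothetical and are not taken from the NiFi code base or from NIFI-7474.

```python
import json


def build_bulk_create_body(data_stream, documents):
    """Build an NDJSON _bulk body that uses the create action for each document.

    Hypothetical helper for illustration; not part of any NiFi processor.
    """
    lines = []
    for doc in documents:
        # Action line: data streams only accept the "create" operation type.
        lines.append(json.dumps({"create": {"_index": data_stream}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Calling {{build_bulk_create_body("logs-demo", docs)}} yields one action line and one source line per document, which would then be POSTed to the {{_bulk}} endpoint.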
However, Data Streams require an {{@timestamp}} field to be provided in each
document, but this field name is illegal in [Avro
schemas|http://avro.apache.org/docs/1.8.2/spec.html#names] due to the leading
{{@}}. The Record-based processors should therefore allow for the injection of
this field into the JSON being sent to Elasticsearch - this could be based upon
an existing field within the FlowFile and be identified by a property on the
processor (e.g. like the {{_id}} field can be specified using Record Path).
Optionally, the processor could allow for the field used as the {{@timestamp}}
field to be removed from the data being sent to Elasticsearch (i.e. rename the
existing field *or* duplicate it, depending upon property settings). Such field
transformation should also take the timestamp format settings into account (e.g.
if a {{Long}} epoch millisecond value is to be converted to a formatted
date/time {{String}}).
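
As a rough sketch of the behaviour requested above (not a proposed implementation), the function below copies or renames an existing field to {{@timestamp}} and can convert an epoch millisecond value into a formatted UTC string. The function name and the parameters {{source_field}}, {{keep_source}} and {{output_format}} are assumptions made for this example; they do not correspond to existing processor properties.

```python
from datetime import datetime, timezone


def inject_timestamp(record, source_field, keep_source=True, output_format=None):
    """Return a copy of record with an @timestamp field taken from source_field.

    Hypothetical sketch; parameter names are illustrative, not NiFi properties.
    """
    doc = dict(record)  # shallow copy so the caller's record is left untouched
    # Duplicate the source field, or remove it so the result is a rename.
    value = doc[source_field] if keep_source else doc.pop(source_field)
    # Optionally format an epoch-millisecond integer as a UTC date/time string.
    if output_format is not None and isinstance(value, int):
        value = datetime.fromtimestamp(value / 1000, tz=timezone.utc).strftime(output_format)
    doc["@timestamp"] = value
    return doc
```

With {{keep_source=False}} the source field is effectively renamed; with the default it is duplicated. The injection happens on the outgoing document rather than in the record schema, which sidesteps the Avro naming restriction.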
was:
PutElasticsearchHttp and PutElasticsearchRecordHttp (and possibly other ES
related processors) should support the new [Elasticsearch Data
Streams|https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#add-documents-to-a-data-stream].
As these processors use the {{_bulk}} endpoint to PUT one or more documents in
one request, the processors need to be updated to support the "create"
operation type. This change is likely related to: NIFI-7474.
Also, Data Streams require an {{@timestamp}} field to be provided in each
document; however, such a field name is illegal in [Avro
schemas|http://avro.apache.org/docs/1.8.2/spec.html#names] due to the leading
{{@}}. The processors should therefore allow for the injection of this field
into the JSON being sent to Elasticsearch - this could be based upon an
existing field within the FlowFile and be identified by a property on the
processor (e.g. like the {{_id}} field can be specified using Record Path).
Optionally, the processor could allow for the field used as the {{@timestamp}}
field to be removed from the data being sent to Elasticsearch (i.e. rename the
existing field *or* duplicate it, depending upon property settings). Such field
transformation should also take the timestamp format settings into account (e.g.
if a {{Long}} epoch millisecond value is to be converted to a formatted
date/time {{String}}).
> PutElasticsearch/RecordHttp processors should support Elasticsearch Data
> Streams
> --------------------------------------------------------------------------------
>
> Key: NIFI-7990
> URL: https://issues.apache.org/jira/browse/NIFI-7990
> Project: Apache NiFi
> Issue Type: Improvement
> Affects Versions: 1.11.4, 1.12.1
> Reporter: Chris Sampson
> Priority: Minor