[
https://issues.apache.org/jira/browse/BEAM-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732187#comment-16732187
]
Etienne Chauchot commented on BEAM-6168:
----------------------------------------
No, adding the behavior you want is entirely possible and will not break
anything. The only downside I see is that it requires some parsing and JSON
generation, so it will slow down Beam's ElasticsearchIO.
The point I wanted to make about other connectors is that the Beam connector's
current behavior matches that of other existing Elasticsearch connectors. For
example, the Spark Elasticsearch connector is used this way:
{code:java}
JavaEsSpark.saveToEs(rdd, ES_INDEX + "/" + ES_DOCTYPE,
    ImmutableMap.of("es.mapping.id", "id"));{code}
This writes Spark RDD objects to Elasticsearch using the "id" field in those
objects as the doc id, but it also writes the id in the doc source.
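As a workaround today, the document can be preprocessed before it reaches ElasticsearchIO so the id field is dropped from the source while still being available for {{withIdFn()}}. This is a minimal sketch of that stripping step; a real pipeline would use a proper JSON library such as Jackson inside a {{MapElements}} transform, whereas this standalone version only handles flat JSON objects with a string-valued field (the field name "key" and the sample document are illustrative assumptions, not from the issue):
{code:java}
import java.util.regex.Pattern;

public class StripIdField {
    // Remove a top-level string field (e.g. "key") from a flat JSON document.
    // Regex-based sketch for illustration only: it does not handle nested
    // objects, escaped quotes, or non-string values.
    static String stripField(String json, String field) {
        String pattern = "\"" + Pattern.quote(field) + "\"\\s*:\\s*\"[^\"]*\"\\s*,?\\s*";
        String out = json.replaceAll(pattern, "");
        // Clean up a trailing comma left before the closing brace, if any.
        return out.replaceAll(",\\s*}", "}");
    }

    public static void main(String[] args) {
        // Hypothetical Kafka message whose "key" should not land in _source.
        String doc = "{\"key\":\"doc-1\",\"value\":42}";
        System.out.println(stripField(doc, "key"));
    }
}
{code}
The cost the comment above refers to is exactly this extra parse-and-regenerate pass over every document, which is why it is not free to build into the connector.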
> Allow modification of JSON value before writing to ElasticSearch
> ----------------------------------------------------------------
>
> Key: BEAM-6168
> URL: https://issues.apache.org/jira/browse/BEAM-6168
> Project: Beam
> Issue Type: Improvement
> Components: io-java-elasticsearch
> Reporter: Mark Norkin
> Assignee: Etienne Chauchot
> Priority: Major
>
> I have an Apache Beam streaming job which reads data from Kafka and writes to
> ElasticSearch using ElasticSearchIO.
> The issue I'm having is that messages in Kafka already have a _{{key}}_ field,
> and using {{ElasticSearchIO.Write.withIdFn()}} I'm mapping this field to the
> document _{{_id}}_ field in ElasticSearch.
> Given the big volume of data, I don't want the _{{key}}_ field to also be
> written to ElasticSearch as part of _{{_source}}_.
> Is there an option/workaround that would allow doing that?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)