Derek Abdine created CAMEL-8149:
-----------------------------------
Summary: Support application-generated document identifiers in
bulk index requests
Key: CAMEL-8149
URL: https://issues.apache.org/jira/browse/CAMEL-8149
Project: Camel
Issue Type: Improvement
Components: camel-elasticsearch
Affects Versions: 2.14.0
Reporter: Derek Abdine
Fix For: 2.15.0
Elasticsearch (via the elasticsearch-java transport client) provides two
categories of APIs to write and read data: individual requests (index, get,
delete) and bulk requests.
When performing bulk updates, one creates individual index requests and adds
them to the bulk request. When creating an index request, one can set the
source document, the id, and so on.
The current design of the camel-elasticsearch component controls the
transformation and assembly of an input body (JSON string, byte[],
xcontentfactory, map) into an index request. As a result, there is no way to
set the id on an index request that goes into a bulk action. The id is instead
set by the default behavior of the underlying elasticsearch-java client, which
generates a random identifier. This is problematic in situations where control
over the id is needed, e.g. for de-duplication purposes.
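To make the de-duplication concern concrete, here is a minimal plain-Java
sketch. It does not use Elasticsearch at all: FakeIndex, randomId and contentId
are hypothetical stand-ins invented for this illustration, and deriving the id
from the document content is only one possible application-side scheme. It
models an index as a map from id to source and shows what happens when the same
document is delivered twice.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class IdDedupSketch {
    // Hypothetical stand-in for an index: document id -> source.
    // This is NOT an Elasticsearch type.
    static final class FakeIndex {
        final Map<String, String> docs = new HashMap<>();

        // Indexing under an existing id overwrites the stored document.
        void index(String id, String source) {
            docs.put(id, source);
        }
    }

    // Mimics the behavior described above: a fresh random id per document.
    static String randomId() {
        return UUID.randomUUID().toString();
    }

    // Hypothetical application-generated id, derived from the content so
    // that the same document always maps to the same id.
    static String contentId(String source) {
        return UUID.nameUUIDFromBytes(source.getBytes(StandardCharsets.UTF_8)).toString();
    }

    public static void main(String[] args) {
        String doc = "{\"user\":\"alice\",\"event\":\"login\"}";

        FakeIndex withRandomIds = new FakeIndex();
        withRandomIds.index(randomId(), doc);
        withRandomIds.index(randomId(), doc); // redelivery of the same message

        FakeIndex withAppIds = new FakeIndex();
        withAppIds.index(contentId(doc), doc);
        withAppIds.index(contentId(doc), doc); // same id, so it overwrites

        System.out.println("random ids: " + withRandomIds.docs.size() + " document(s)");
        System.out.println("application ids: " + withAppIds.docs.size() + " document(s)");
    }
}
```

With random ids a redelivered message is stored as a second document, whereas
an id chosen by the application lets the second write replace the first.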
I propose changing the producer to accept elasticsearch-java ActionRequest
subclasses in the message body, so that upstream message processors can
control how those requests are created.
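The dispatch this proposal implies could look roughly like the sketch below. It
is not the attached patch and not the component's actual code: to stay
self-contained it uses plain-Java stand-ins (SketchIndexRequest,
SketchBulkRequest, toIndexRequest, toBulkRequest are all hypothetical names)
rather than the real elasticsearch-java classes. The idea shown is that a
pre-built request in the body passes through untouched, while a raw body is
still converted the way it is today, with a generated id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class ProducerDispatchSketch {
    // Hypothetical stand-in for the client's index request (id + source).
    static final class SketchIndexRequest {
        final String id;
        final String source;

        SketchIndexRequest(String id, String source) {
            this.id = id;
            this.source = source;
        }
    }

    // Hypothetical stand-in for the client's bulk request.
    static final class SketchBulkRequest {
        final List<SketchIndexRequest> requests = new ArrayList<>();

        SketchBulkRequest add(SketchIndexRequest request) {
            requests.add(request);
            return this;
        }
    }

    // A pre-built request is passed through as-is, so the id chosen
    // upstream survives. A raw JSON string gets a generated id.
    static SketchIndexRequest toIndexRequest(Object item) {
        if (item instanceof SketchIndexRequest) {
            return (SketchIndexRequest) item;
        }
        if (item instanceof String) {
            return new SketchIndexRequest(UUID.randomUUID().toString(), (String) item);
        }
        throw new IllegalArgumentException("Unsupported body item: " + item);
    }

    // Accepts a ready-made bulk request, or a list mixing raw documents
    // and pre-built index requests.
    static SketchBulkRequest toBulkRequest(Object body) {
        if (body instanceof SketchBulkRequest) {
            return (SketchBulkRequest) body;
        }
        if (body instanceof List) {
            SketchBulkRequest bulk = new SketchBulkRequest();
            for (Object item : (List<?>) body) {
                bulk.add(toIndexRequest(item));
            }
            return bulk;
        }
        throw new IllegalArgumentException("Unsupported body: " + body);
    }

    public static void main(String[] args) {
        List<Object> body = new ArrayList<>();
        body.add("{\"status\":\"raw\"}");                                  // id generated
        body.add(new SketchIndexRequest("order-42", "{\"status\":\"built\"}")); // id kept

        for (SketchIndexRequest request : toBulkRequest(body).requests) {
            System.out.println(request.id + " -> " + request.source);
        }
    }
}
```

Checking the body type first keeps existing routes working unchanged, since
only routes that opt in by building their own requests see different behavior.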
I've attached a patch and sent a pull request on GitHub.
Thank you!
Derek Abdine
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)