Derek Abdine created CAMEL-8149:
-----------------------------------

             Summary: Support application-generated document identifiers in 
bulk index requests
                 Key: CAMEL-8149
                 URL: https://issues.apache.org/jira/browse/CAMEL-8149
             Project: Camel
          Issue Type: Improvement
          Components: camel-elasticsearch
    Affects Versions: 2.14.0
            Reporter: Derek Abdine
             Fix For: 2.15.0


Elasticsearch (via the elasticsearch-java transport client) provides two
categories of APIs to write and read data: individual requests (index, get,
delete) and bulk requests.

When performing bulk updates, one creates individual index requests and adds
them to the bulk request. When creating an index request, one can set the
source document, the id, and so on.

The current design of the camel-elasticsearch component controls the
transformation and assembly of an input body (JSON string, byte[],
xcontentfactory, map) into an index request. Because the component builds the
index request itself, there is no way to set the id on an index request that
goes into a bulk action. As a result, the id is left to the default behavior of
the underlying elasticsearch-java client, which generates a random identifier.
This is problematic in situations where control over the id is needed, e.g. for
de-duplication.
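
To illustrate the de-duplication case, here is a minimal sketch of one way an
application might derive a stable id from a document, so that indexing the same
document twice targets the same id. It uses only the JDK. The class name
DocumentIds and the method fromSource are hypothetical and belong to neither
Camel nor Elasticsearch, and hashing the raw source with SHA-256 is only one
possible scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of Camel or Elasticsearch.
final class DocumentIds {
    private DocumentIds() {
    }

    /** Returns the lowercase hex SHA-256 digest of the UTF-8 encoded source. */
    static String fromSource(String jsonSource) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(jsonSource.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String first = fromSource("{\"user\":\"derek\"}");
        String second = fromSource("{\"user\":\"derek\"}");
        // Identical sources yield identical ids.
        System.out.println(first.equals(second));
    }
}
```

Note that this hashes the source text byte for byte, so two JSON documents that
differ only in key order or whitespace would still get different ids. An
application that needs semantic de-duplication would have to normalize the
document first or derive the id from a business key.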

My proposal is to improve the design of the producer to allow
elasticsearch-java ActionRequest subclasses in the message body, so that
upstream message processors can control the creation of those requests.
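
The idea can be sketched roughly as follows. This is not the attached patch and
it does not use the real Elasticsearch classes: IndexRequestStub is a
hypothetical stand-in for an index request, and BulkBodySketch, toIndexRequest
and toBulk are invented names. The sketch only shows the dispatch: elements
that are already requests pass through with their id intact, while raw
documents are wrapped without an id, as they are today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical.
final class BulkBodySketch {
    private BulkBodySketch() {
    }

    /** Stand-in for an index request. A null id means "let the client choose". */
    static final class IndexRequestStub {
        final String id;
        final Object source;

        IndexRequestStub(String id, Object source) {
            this.id = id;
            this.source = source;
        }
    }

    /** Converts one body element to a request, passing pre-built requests through. */
    static IndexRequestStub toIndexRequest(Object element) {
        if (element instanceof IndexRequestStub) {
            // Built upstream: keep it as-is, including any id it carries.
            return (IndexRequestStub) element;
        }
        if (element instanceof String || element instanceof byte[] || element instanceof Map) {
            // Raw document: wrap it without an id, as the component does today.
            return new IndexRequestStub(null, element);
        }
        throw new IllegalArgumentException("Unsupported body element: " + element);
    }

    /** Assembles the list of requests that would make up one bulk request. */
    static List<IndexRequestStub> toBulk(List<?> body) {
        List<IndexRequestStub> requests = new ArrayList<>();
        for (Object element : body) {
            requests.add(toIndexRequest(element));
        }
        return requests;
    }
}
```

With this shape, an upstream processor that needs a specific id builds the
request itself and puts it in the message body, and existing routes that send
plain documents keep working unchanged.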

I've attached a patch and sent a pull request on GitHub.

Thank you!
Derek Abdine



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
