[jira] [Commented] (BEAM-3201) ElasticsearchIO should deal with documents id

Chet Aldrich (JIRA) Tue, 28 Nov 2017 11:24:47 -0800

    [ 
https://issues.apache.org/jira/browse/BEAM-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269318#comment-16269318
 ]


Chet Aldrich commented on BEAM-3201:
------------------------------------

[~echauchot] [~nerdynick] sounds like we have reached a rough agreement on the 
design, at least enough for me to start coding something up and show you guys 
the PR. To summarize: 

We will keep the API of PCollection<String>. 

Three optional methods will be added, one for each of the following metadata 
fields: _id, _index, _type. Each will require a function that takes in a JSON 
object and returns a String, which is what will be placed in the corresponding 
metadata field.

If any of these methods is called, parse each input string into JSON once and 
pass the same parsed object to every configured function, so that a document 
is deserialized only once no matter how many of the three are set.

Run each configured function on every element in the PCollection. 
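
To make the steps above concrete, here is a minimal sketch in plain Java, with 
no Beam or JSON library dependency. Everything named here is hypothetical and 
is not the ElasticsearchIO API: the class name, the with*Fn methods, and the 
injected parser (a stand-in for whatever JSON deserializer the IO ends up 
using, with the parsed document modeled as a Map). The only thing taken from 
Elasticsearch itself is the shape of the bulk "index" action line.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the proposed design; not the ElasticsearchIO API. */
final class BulkMetadataSketch {
  // Stand-in for a real JSON deserializer (assumption for illustration).
  private final Function<String, Map<String, Object>> parser;

  // Optional extractors; null means the user did not configure that field.
  private Function<Map<String, Object>, String> idFn;
  private Function<Map<String, Object>, String> indexFn;
  private Function<Map<String, Object>, String> typeFn;

  BulkMetadataSketch(Function<String, Map<String, Object>> parser) {
    this.parser = parser;
  }

  BulkMetadataSketch withIdFn(Function<Map<String, Object>, String> fn) {
    this.idFn = fn;
    return this;
  }

  BulkMetadataSketch withIndexFn(Function<Map<String, Object>, String> fn) {
    this.indexFn = fn;
    return this;
  }

  BulkMetadataSketch withTypeFn(Function<Map<String, Object>, String> fn) {
    this.typeFn = fn;
    return this;
  }

  /** Builds the bulk action line that precedes the document in a bulk request. */
  String actionLine(String document) {
    Map<String, String> meta = new LinkedHashMap<>();
    if (idFn != null || indexFn != null || typeFn != null) {
      // Parse once and share the result across all configured extractors.
      Map<String, Object> parsed = parser.apply(document);
      if (indexFn != null) meta.put("_index", indexFn.apply(parsed));
      if (typeFn != null) meta.put("_type", typeFn.apply(parsed));
      if (idFn != null) meta.put("_id", idFn.apply(parsed));
    }
    StringBuilder sb = new StringBuilder("{\"index\":{");
    String sep = "";
    for (Map.Entry<String, String> e : meta.entrySet()) {
      sb.append(sep).append('"').append(e.getKey()).append("\":\"")
          .append(escape(e.getValue())).append('"');
      sep = ",";
    }
    return sb.append("}}").toString();
  }

  // Minimal escaping of backslashes and quotes; a real implementation
  // would rely on a JSON library instead.
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

When none of the three functions is set, the sketch never calls the parser and 
emits an empty action, which matches today's behavior of letting Elasticsearch 
generate the id.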

I'm going to start coding this up based on the summary above; a PR will follow 
soon. Let me know if I'm missing something important, and I'll edit the PR 
accordingly.


> ElasticsearchIO should deal with documents id
> ---------------------------------------------
>
>                 Key: BEAM-3201
>                 URL: https://issues.apache.org/jira/browse/BEAM-3201
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Etienne Chauchot
>            Assignee: Chet Aldrich
>
> Today the ESIO only inserts the payload of the ES documents. Elasticsearch 
> generates a document id for each record inserted. So each new insertion is 
> considered as a new document. Users want to be able to update documents using 
> the IO. So, for the write part of the IO, users should be able to provide a 
> document id so that they could update already stored documents. Providing an 
> id for the documents could also help the user on idempotency.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
