[ 
https://issues.apache.org/jira/browse/BEAM-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16414640#comment-16414640
 ] 

Tim Robertson edited comment on BEAM-3201 at 3/27/18 9:00 AM:
--------------------------------------------------------------

[~chet.aldrich] I really don't want to step on your toes, but I needed this 
functionality so I have written an implementation. I couldn't have done this so easily 
without your work (thanks!).

 (Edited link) [https://github.com/timrobertson100/beam/tree/BEAM-3201]  

I made the following changes from the approach [~chet.aldrich] had started:
 # Used a single interface instead of 3, with a different signature 
 # Used Jackson for JSON serde
 ## Removes the need to bring in another dependency
 ## I _suspect_ it is in wider use, so it might be better as it forms part of the 
public API
 # Added the ability to route to different types, which was not yet implemented 
in your branch 
 # Added sanity checking to the index field (ES requires lower case)
 # Opted for a different test strategy to avoid using the deprecated DoFnTester
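
To illustrate the single-interface idea and the lower-case index check, here is a minimal, dependency-free sketch. It is not the code from the branch: {{DocumentMetadataFn}}, {{IndexRouting}}, {{checkIndex}} and {{resolveIndex}} are hypothetical names, and documents are modelled as a plain {{Map<String, String>}} rather than a Jackson {{JsonNode}} so that the example compiles without extra libraries.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical single callback that derives one piece of write metadata from a document. */
interface DocumentMetadataFn {
  /** Returns the value to use (for example the index), or null to fall back to the default. */
  String apply(Map<String, String> document);
}

final class IndexRouting {
  private IndexRouting() {}

  /** Rejects empty index names and names containing upper-case characters. */
  static String checkIndex(String index) {
    if (index == null || index.isEmpty()) {
      throw new IllegalArgumentException("index must not be empty");
    }
    if (!index.equals(index.toLowerCase(Locale.ROOT))) {
      throw new IllegalArgumentException("index must be lower case: " + index);
    }
    return index;
  }

  /** Picks the per-document index if the callback supplies one, else the default, then validates it. */
  static String resolveIndex(
      Map<String, String> document, DocumentMetadataFn indexFn, String defaultIndex) {
    String candidate = indexFn.apply(document);
    return checkIndex(candidate != null ? candidate : defaultIndex);
  }
}
```

With this shape, a caller could pass {{doc -> doc.get("project")}} as the callback; a document without that field falls back to the default index, and a value such as {{"Beam"}} is rejected before anything is sent to Elasticsearch.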

Would it be OK with you [~chet.aldrich] / [~echauchot] if I put this up as a 
PR tomorrow, once I have tidied up my reformatting error? Other than the formatting, I 
think it is a complete solution to this issue, with test coverage.

 


was (Author: timrobertson100):
[~chet.aldrich] I really don't want to step on your toes, but I needed this 
functionality so have an implementation. I couldn't have done this so easily 
without your work (thanks!).

 

  
[https://github.com/timrobertson100/beam/commit/a6002f1a4b8388e955e512281d38001ae828cdcf]

The commit above needs a little bit of tidying as I have accidentally 
reformatted the whole SolrIO incorrectly - but it is late here and I'll do it 
tomorrow.

 

> ElasticsearchIO should allow the user to optionally pass id, type and index 
> per document
> ----------------------------------------------------------------------------------------
>
>                 Key: BEAM-3201
>                 URL: https://issues.apache.org/jira/browse/BEAM-3201
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-elasticsearch
>            Reporter: Etienne Chauchot
>            Assignee: Chet Aldrich
>            Priority: Major
>
> *Dynamic document id*: Today the ESIO only inserts the payload of the ES 
> documents. Elasticsearch generates a document id for each record inserted, so 
> each new insertion is treated as a new document. Users want to be able to 
> update documents using the IO. So, for the write part of the IO, users should 
> be able to provide a document id so that they can update already stored 
> documents. Providing an id for the documents could also help the user with 
> idempotency.
> *Dynamic ES type and ES index*: In some cases (streaming pipelines with high 
> throughput), partitioning the PCollection so that each part feeds a different 
> ESIO instance (pointing to a different index/type) is not very practical; users 
> would like to be able to set the ES index/type per document.
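
For context, the Elasticsearch bulk API takes an action metadata line before each document, and that line can carry {{_index}}, {{_type}} and {{_id}}. The sketch below shows how such a line could be built per document. {{BulkActionLine}}, {{quote}} and {{indexAction}} are hypothetical helper names, not part of ElasticsearchIO, and the escaping is deliberately minimal.

```java
final class BulkActionLine {
  private BulkActionLine() {}

  /** Minimal JSON string quoting: escapes backslashes and double quotes only (not control characters). */
  static String quote(String value) {
    return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /** Builds the bulk "index" action line; a null id leaves id generation to Elasticsearch. */
  static String indexAction(String index, String type, String id) {
    StringBuilder sb = new StringBuilder("{\"index\":{\"_index\":")
        .append(quote(index))
        .append(",\"_type\":")
        .append(quote(type));
    if (id != null) {
      sb.append(",\"_id\":").append(quote(id));
    }
    return sb.append("}}").toString();
  }
}
```

Sending the same {{_id}} twice makes the second write replace the first document instead of creating a new one, which is the update and idempotency behaviour the description asks for.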



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
