[
https://issues.apache.org/jira/browse/BEAM-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290550#comment-16290550
]
Etienne Chauchot commented on BEAM-3333:
----------------------------------------
Hi [[email protected]], even though this ticket is closed as a duplicate, I'll
comment here to keep the context.
Regarding your comment on the ES pipeline API: support could have been added
to ESIO v5, but the first aim of that version of the IO was to support the
same feature set as v2. That said, adding support for ES pipelines would be
awesome!
Regarding splitting, ESIO v5 already splits according to {{desiredBundleSize}}.
It uses the ES slice API, which allows ES shards to be split, so the runner's
desired bundle size is respected. The slice API was not available in ES v2, so
shards could not be split in that version, which is why ESIO v2 simply ignores
the desired bundle size.
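To illustrate the idea, here is a minimal sketch of how a desired bundle size could be turned into slice queries. It is not the actual ESIO code: the class name {{SliceSketch}}, both method names, and the {{match_all}} query are hypothetical, and the cap of 1024 assumes the default value of the Elasticsearch setting {{index.max_slices_per_scroll}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the ESIO implementation.
public class SliceSketch {

  // Assumption: Elasticsearch's default index.max_slices_per_scroll is 1024.
  static final int MAX_SLICES = 1024;

  // Number of slices needed so that each slice holds roughly
  // desiredBundleSizeBytes of the index (ceiling division, clamped to [1, 1024]).
  static int sliceCount(long estimatedIndexSizeBytes, long desiredBundleSizeBytes) {
    if (desiredBundleSizeBytes <= 0) {
      throw new IllegalArgumentException("desiredBundleSizeBytes must be positive");
    }
    long n = (estimatedIndexSizeBytes + desiredBundleSizeBytes - 1) / desiredBundleSizeBytes;
    return (int) Math.max(1, Math.min(n, MAX_SLICES));
  }

  // One search body per slice. The "slice" clause is only added when there is
  // more than one slice; a single slice is just a plain query.
  static List<String> sliceQueries(int slices) {
    List<String> queries = new ArrayList<>();
    for (int id = 0; id < slices; id++) {
      if (slices > 1) {
        queries.add(String.format(
            "{\"slice\":{\"id\":%d,\"max\":%d},\"query\":{\"match_all\":{}}}", id, slices));
      } else {
        queries.add("{\"query\":{\"match_all\":{}}}");
      }
    }
    return queries;
  }

  public static void main(String[] args) {
    // Example: a 100 MB index estimate with a 30 MB desired bundle size.
    int slices = sliceCount(100L * 1024 * 1024, 30L * 1024 * 1024);
    for (String q : sliceQueries(slices)) {
      System.out.println(q);
    }
  }
}
```

Each generated body would be sent as its own scroll search, so each bundle reads one slice of the index. With ES v2 there is no slice clause to add, so the number of bundles cannot follow the desired size in this way.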
> Create Elasticsearch IO compatible with ES 6.x
> ----------------------------------------------
>
> Key: BEAM-3333
> URL: https://issues.apache.org/jira/browse/BEAM-3333
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-extensions
> Reporter: Fokko van der Wal
> Assignee: Etienne Chauchot
> Priority: Minor
> Fix For: 2.2.0
>
>
> The current Elasticsearch IO is only compatible with Elasticsearch v2.x and
> v5.x. The aim is to have an IO compatible with ES v6.x. Beyond being able
> to address v6.x Elasticsearch instances, we could also leverage the
> Elasticsearch pipeline API and split the dataset better (as close as
> possible to desiredBundleSize) thanks to the new ES split API that allows
> ES shards to be split.