[jira] [Work logged] (BEAM-12601) Support append-only indices in ES output

ASF GitHub Bot (Jira) Wed, 04 Aug 2021 13:25:04 -0700


     [ 
https://issues.apache.org/jira/browse/BEAM-12601?focusedWorklogId=633804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-633804
 ]


ASF GitHub Bot logged work on BEAM-12601:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Aug/21 20:24
            Start Date: 04/Aug/21 20:24
    Worklog Time Spent: 10m 
      Work Description: echauchot commented on pull request #15257:
URL: https://github.com/apache/beam/pull/15257#issuecomment-892949942


   > I wonder if we should move towards (please don't hate me for suggesting it, 
Etienne) pre-commit unit tests that don't make use of ES itself but analyze 
the resulting in-memory PCollection contents to ensure that what's been 
produced is as expected. We could still employ post-commit/regression tests 
that use a real ES instance, but this could de-flake the pre-commit unit tests?
   
   Don't worry Evan, I do agree. ES tests have been flaky for years because 
embedded ES is sensitive to load. We tried to lower the flakiness with test 
containers (thanks for your work on that), but some flakiness remains. Flaky 
UTests are painful for the build, so they are painful for the whole dev 
process. So now is the time to draw the line up to which we trust UTests to 
spot all misbehavior, and to leave the rest to ITests. In that case, however, 
these IO ITests need to run as part of each PR, which is not done right now: 
e.g. CassandraIOIT and ESIOIT are run on demand, mainly for load tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 633804)
    Time Spent: 3h 10m  (was: 3h)

> Support append-only indices in ES output 
> -----------------------------------------
>
>                 Key: BEAM-12601
>                 URL: https://issues.apache.org/jira/browse/BEAM-12601
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-elasticsearch
>            Reporter: Andres Rodriguez
>            Priority: P2
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently, the Apache Beam Elasticsearch sink is 
> [using|https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L1532]
>  the 
> [index|https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html#bulk-api-request-body]
>  bulk API operation to add data to the target index. When using append-only 
> indices it is better to use the 
> [create|https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html#bulk-api-request-body]
>  operation. This also applies to new append-only indexes, like [data 
> streams|https://www.elastic.co/guide/en/elasticsearch/reference/7.x/use-a-data-stream.html#add-documents-to-a-data-stream].
> The scope of this improvement is to add a new boolean configuration option, 
> {{append-only}}, to the Elasticsearch sink, with a default value of {{false}} 
> (to keep the current behaviour) that when enabled makes it use the {{create}} 
> operation instead of the {{index}} one when sending data.
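
To make the proposed switch concrete, here is a minimal sketch of how a bulk
request body could differ between the two operations. It is not the
ElasticsearchIO implementation: the class BulkActionSketch, the method
bulkPayload, and the appendOnly parameter are hypothetical names used only for
illustration, and the sketch assumes documents are sent without an explicit
_id. The only thing taken from the Elasticsearch bulk API is the
newline-delimited layout of one action line followed by one source line.

```java
import java.util.Arrays;
import java.util.List;

public class BulkActionSketch {

    // Hypothetical helper, not part of ElasticsearchIO.
    // Builds a newline-delimited bulk body: one action line, then one
    // document line, for every input document.
    public static String bulkPayload(List<String> jsonDocs, boolean appendOnly) {
        // Assumption: the append-only option only changes the action name.
        String action = appendOnly ? "create" : "index";
        StringBuilder body = new StringBuilder();
        for (String doc : jsonDocs) {
            // Action line with empty metadata (no explicit _id in this sketch).
            body.append("{\"").append(action).append("\":{}}\n");
            // Source line: the document itself, already serialized as JSON.
            body.append(doc).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("{\"msg\":\"a\"}", "{\"msg\":\"b\"}");
        System.out.print(bulkPayload(docs, false)); // default: index
        System.out.print(bulkPayload(docs, true));  // append-only: create
    }
}
```

Keeping the default at false leaves the generated body unchanged for existing
pipelines, which matches the backward-compatibility goal stated in the issue.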



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
