[
https://issues.apache.org/jira/browse/FLINK-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aljoscha Krettek updated FLINK-18359:
-------------------------------------
Summary: Log failures in handler instead of ElasticsearchSinkBase (was:
Improve error-log strategy for Elasticsearch sink for large data documentId
conflict when using create mode for `insert ignore` semantics)
> Log failures in handler instead of ElasticsearchSinkBase
> --------------------------------------------------------
>
> Key: FLINK-18359
> URL: https://issues.apache.org/jira/browse/FLINK-18359
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / ElasticSearch
> Reporter: rinkako
> Priority: Major
> Labels: pull-request-available, usability
> Fix For: 1.12.0
>
>
> The story is: when a Flink job that ingests a large number of records from
> data sources, processes them, and indexes them with the Elasticsearch sink
> fails, we may restart it from a specific data set that contains lots of
> records already sunk into ES.
> In this case, `INSERT IGNORE` semantics are necessary, and we use `public
> IndexRequest create(boolean create)` with `true` and ignore the 409
> restStatusCode in a customized ActionRequestFailureHandler to make it work.
> But the `BulkProcessorListener` always logs an error event before it calls
> the `failureHandler` in its `afterBulk` method, producing tons of error
> logs for document-id conflicts that we already know about and handle in the
> customized ActionRequestFailureHandler.
> Therefore, it seems more flexible to perform the error logging in the
> ActionRequestFailureHandler (either the default IgnoringFailureHandler or a
> custom handler) instead of in ElasticsearchSinkBase.
>
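For illustration, the workaround described above can be sketched as a custom failure handler. This is a hedged sketch, not the patch attached to this issue: it assumes the Flink 1.x Elasticsearch connector's `ActionRequestFailureHandler` interface, and the class name `IgnoreConflictFailureHandler` is a hypothetical example.

```java
import org.apache.flink.streaming.connectors.elasticsearch.ActionRequestFailureHandler;
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.rest.RestStatus;

// Sketch of a handler that swallows 409 conflicts produced by
// create-mode ("INSERT IGNORE"-style) index requests on replay,
// and rethrows everything else so the sink still fails on real errors.
public class IgnoreConflictFailureHandler implements ActionRequestFailureHandler {

    @Override
    public void onFailure(
            ActionRequest action,
            Throwable failure,
            int restStatusCode,
            RequestIndexer indexer) throws Throwable {

        if (restStatusCode == RestStatus.CONFLICT.getStatus()) {
            // Document id already exists: expected when replaying data
            // that was already sunk into ES, so ignore it silently.
            return;
        }

        // Any other failure is unexpected; this is where per-failure
        // logging could live if it were moved out of ElasticsearchSinkBase.
        throw failure;
    }
}
```

With the change proposed in this issue, the unconditional error log in `BulkProcessorListener#afterBulk` would no longer fire for conflicts handled this way.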
--
This message was sent by Atlassian Jira
(v8.3.4#803005)