[ 
https://issues.apache.org/jira/browse/KAFKA-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-2479:
--------------------------------
    Fix Version/s: 0.10.0.0

> Add CopycatExceptions to indicate transient and permanent errors in a 
> connector/task
> ------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2479
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: KafkaConnect
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Liquan Pei
>             Fix For: 0.10.1.0, 0.10.0.0
>
>
> Sometimes the connector will need to indicate to the framework that an error 
> occurred, but the framework could respond to that error in more than one way.
> For source connectors, there's not much they need to indicate since they can 
> block indefinitely. They probably only need to indicate permanent errors for 
> correctness, though we may want them to indicate transient errors so we can 
> report health of the task in a metric.
> For sink connectors, there are at least three scenarios:
> 1. A task encounters some error while processing a {{put(records)}} call and 
> was unable to fully process it, but thinks it could be resolved in the 
> future. The task doesn't want to see any new records until the issue is 
> resolved, but will need to see the same set of records again. (It would be 
> nice if the task doesn't have to deal with saving these to a buffer itself.)
> 2. A task encounters some error while processing data, but it has 
> enqueued/handled the data passed into the {{put(records)}} call. For example, 
> it may have passed it to some library which buffers it, but then the library 
> indicated that it is having some connection issues. The connector might be 
> able to accept more data, but the task is not in a healthy state.
> 3. The task encounters some error that it decides is unrecoverable. This 
> might just be transient errors that repeat for long enough that the task 
> thinks it's time to give up. Unclear what to do here, but one option is 
> relocating the task to another worker, hoping that the issue is specific to 
> the worker.
> Note that it is generally not safe for sink tasks to do their own backoff, 
> because that could starve the consumer, which needs to poll() in order to 
> heartbeat. So we need to make sure whatever mechanism we implement encourages 
> the user to throw an exception and pass control back to us instead.
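
The first and third sink scenarios above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the API this issue adopted: the names TransientTaskException, PermanentTaskException, SinkTask and deliver are hypothetical, and records are reduced to plain strings. The idea is that the task throws instead of backing off itself, and the framework keeps hold of the batch and redelivers it.

```java
import java.util.List;

public class SinkErrorSketch {
    /** Hypothetical: the batch could not be processed now but may succeed later. */
    public static class TransientTaskException extends RuntimeException {
        public TransientTaskException(String message) { super(message); }
    }

    /** Hypothetical: the task considers the error unrecoverable. */
    public static class PermanentTaskException extends RuntimeException {
        public PermanentTaskException(String message) { super(message); }
    }

    /** Hypothetical stand-in for a sink task's put(records) contract. */
    public interface SinkTask {
        void put(List<String> records);
    }

    /**
     * Hypothetical framework-side delivery. The framework keeps the batch and
     * hands the same records back after a transient error, so the task does
     * not need its own buffer. Returns the number of put() attempts made.
     */
    public static int deliver(SinkTask task, List<String> batch, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                task.put(batch);
                return attempt;
            } catch (TransientTaskException e) {
                // Control is back with the framework. A real worker would
                // keep calling poll() here so the consumer can heartbeat,
                // and would wait before redelivering the same batch.
            }
            // A PermanentTaskException is not caught: it propagates to the
            // caller, which can then fail or relocate the task.
        }
        throw new PermanentTaskException(
                "still failing after " + maxAttempts + " attempts");
    }
}
```

In this sketch, repeated transient failures are eventually escalated to a permanent error by the framework rather than by the task. That is one possible policy among several, and the attempt limit is an arbitrary parameter.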



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
