Bumping the thread

On Sun, Dec 1, 2024 at 8:33 AM Anton Liauchuk <anton93...@gmail.com> wrote:
> Hi,
>
> Thank you for your feedback. I have numbered the questions to simplify
> communication.
>
> 1.
>> What sort of implementation do you have in mind for this interface?
>> What use-case does this interface enable that is not possible with log
>> scraping, or implementing a source-connector DLQ to Kafka?
>
> I have a use case where source connectors need to send metrics and logs to
> a custom Kafka topic. Although it is possible to use a log reporter to
> extract the required information from logs, there are several limitations
> to consider:
> - It depends on the log format used in `kafka-runtime`.
> - A pluggable interface provides greater flexibility for defining custom
> behavior.
> - The API will have better support in future releases of `kafka-connect`.
>
> 2.
>> Could you add the ErrorContext class to your public API description? I
>> don't think that is an existing interface. Also please specify the
>> package/fully qualified names for these classes.
>
> Added, thank you!
>
> 3.
>> How do you expect this will interact with the existing log and DLQ
>> reporters? Will users specifying a custom error reporter be able to turn
>> off the other reporters?
>
> In the current implementation, custom reporters are an independent
> addition to the runtime reporters.
>
> 4.
>> Are error reporters expected to be source/sink agnostic (like the Log
>> reporter) or are they permitted to function for just one type (like the
>> DLQ reporter)?
>
> Error reporters are expected to be source/sink agnostic.
>
> 5.
>> Should reporters be asynchronous/fire-and-forget, or should they have a
>> mechanism for propagating errors that kill the task?
>
> I believe that adding a mechanism for propagating errors to the error
> handler interface is preferable.
>
> 6.
>> Would it make sense for error reporting to also involve error handling:
>> i.e. let the plugin decide how to handle errors (drop record, trigger
>> retries, fail the task, etc.)?
>
> I believe this approach makes sense.
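To make the use case in (1) concrete, here is a rough, self-contained sketch of what a pluggable error reporter could look like. All names and signatures (`ErrorRecordReporter`, `ErrorContext`, `report`) are assumptions for illustration, not the KIP's final API; a real implementation would produce to a custom Kafka topic via a `KafkaProducer` rather than collect strings in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ReporterSketch {
    // Hypothetical context handed to the reporter; names are assumptions.
    interface ErrorContext {
        Object record();
        Throwable error();
    }

    // Hypothetical pluggable reporter SPI, roughly following the KIP draft.
    interface ErrorRecordReporter {
        void configure(Map<String, ?> configs);
        void report(ErrorContext context);
    }

    // Toy implementation: collects reports in memory. A real reporter would
    // serialize the failed record plus error metadata and send it to a
    // custom Kafka topic.
    static class CollectingReporter implements ErrorRecordReporter {
        final List<String> reported = new ArrayList<>();

        @Override
        public void configure(Map<String, ?> configs) {
            // e.g. read the target topic name from configs
        }

        @Override
        public void report(ErrorContext ctx) {
            reported.add(ctx.record() + " failed: " + ctx.error().getMessage());
        }
    }

    public static void main(String[] args) {
        CollectingReporter reporter = new CollectingReporter();
        reporter.report(new ErrorContext() {
            @Override public Object record() { return "record-1"; }
            @Override public Throwable error() { return new RuntimeException("bad value"); }
        });
        System.out.println(reporter.reported.get(0));
    }
}
```

Because the reporter only observes failures (it cannot change how the runtime reacts to them), it stays independent of the built-in log and DLQ reporters, matching the answer to (3) above.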
> I have added the new changes to a separate branch and created a PR:
> https://github.com/anton-liauchuk/kafka/pull/1/files. I haven't extended
> the KIP at this stage, as I would like to discuss some items first. In this
> PR, I haven't prepared all the changes needed to support a new mode yet;
> it is just a POC.
>
> It seems we don't need to add this functionality to the reporter, as it
> would be better for the reporter interface to focus solely on reporting.
> I have created a new interface called `ErrorHandler`, which provides a way
> to handle error responses. I designed this interface to be similar to
> `org.apache.kafka.streams.errors.ProcessingExceptionHandler` from the
> `kafka-streams` project.
>
> I'm considering extending the tolerance configuration to enable this
> handler with the `errors.tolerance=custom` setting. When custom tolerance
> is selected, the client can specify the class name of the error handler.
> Handling an error may result in one of three outcomes:
> - DROP: skips the record.
> - FAIL: fails the task.
> - ACK: skips the record and acknowledges it; applicable to source
> connectors.
>
> Error handling might be used at the following stages (these stages are
> part of
> `org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator#TOLERABLE_EXCEPTIONS`):
> - TRANSFORMATION
> - KEY_CONVERTER
> - VALUE_CONVERTER
> - HEADER_CONVERTER
>
> I would like some advice on the following items:
>
> 6.1. Do we still need to define an error reporter interface if we have the
> option to create an error handler? I believe that all necessary reporting
> can be managed within the error handler, which makes the reporter
> interface seem unnecessary.
>
> 6.2. Does it make sense to expand the list of stages where the error
> handler can be used? The current list is based on the existing error
> handling logic. For instance, it could be beneficial to handle errors from
> the `TASK_POLL` stage.
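The `ErrorHandler` idea described above might be sketched as follows. The enum values mirror the DROP/FAIL/ACK outcomes from the proposal; the method names, the string-typed stage parameter, and the sample policy are my assumptions for illustration only, loosely modeled on `ProcessingExceptionHandler`, and are not the final KIP API.

```java
import java.util.Map;

public class ErrorHandlerSketch {
    // Hypothetical response type mirroring the three outcomes in the
    // proposal: skip the record, fail the task, or skip-and-acknowledge.
    enum Response { DROP, FAIL, ACK }

    // Hypothetical handler SPI, loosely modeled on
    // org.apache.kafka.streams.errors.ProcessingExceptionHandler.
    interface ErrorHandler {
        void configure(Map<String, ?> configs);
        Response handle(String stage, Object record, Throwable error);
    }

    // Sample policy: tolerate converter failures by dropping the record,
    // fail the task for everything else (e.g. TRANSFORMATION errors).
    static class DropOnConvertHandler implements ErrorHandler {
        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public Response handle(String stage, Object record, Throwable error) {
            if (stage.endsWith("_CONVERTER")) {
                return Response.DROP;
            }
            return Response.FAIL;
        }
    }

    public static void main(String[] args) {
        ErrorHandler handler = new DropOnConvertHandler();
        System.out.println(handler.handle("VALUE_CONVERTER", "r1", new RuntimeException("boom")));
        System.out.println(handler.handle("TRANSFORMATION", "r2", new RuntimeException("boom")));
    }
}
```

Under the proposed `errors.tolerance=custom` setting, the runtime would instantiate the configured handler class and dispatch each tolerable failure to it, acting on the returned response instead of the fixed `none`/`all` behavior.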
> The current implementation does not support error handling for errors that
> are not assigned to any record, but we could consider how to extend it if
> needed. Additionally, we might review the `KAFKA_PRODUCE` and `TASK_PUT`
> stages.
>
> 6.3. If we begin improvements to error handling, should we also explore
> the possibility of supporting error handling for connector or task
> failures?
>
> On Fri, Oct 25, 2024 at 2:30 AM Greg Harris <greg.har...@aiven.io.invalid>
> wrote:
>
>> Hi Anton,
>>
>> Thanks for the KIP! I think that looking at internal APIs as inspiration
>> for new external APIs is a good idea, and I'm glad that you found an
>> interface close to the problem you're trying to solve.
>>
>> What sort of implementation do you have in mind for this interface? What
>> use-case does this interface enable that is not possible with log
>> scraping, or implementing a source-connector DLQ to Kafka?
>> Before we make something pluggable, we should consider whether the
>> existing framework implementations could be improved directly.
>>
>> Could you add the ErrorContext class to your public API description? I
>> don't think that is an existing interface. Also please specify the
>> package/fully qualified names for these classes.
>>
>> How do you expect this will interact with the existing log and DLQ
>> reporters? Will users specifying a custom error reporter be able to turn
>> off the other reporters?
>>
>> Are error reporters expected to be source/sink agnostic (like the Log
>> reporter) or are they permitted to function for just one type (like the
>> DLQ reporter)?
>>
>> The runtime interface returns a Future<RecordMetadata>, which is an
>> abstraction specific to the DLQ reporter and ignored by the Log reporter,
>> and I see that you've omitted it from the new API.
>> Should reporters be asynchronous/fire-and-forget, or should they have a
>> mechanism for propagating errors that kill the task?
>>
>> Would it make sense for error reporting to also involve error handling:
>> i.e. let the plugin decide how to handle errors (drop record, trigger
>> retries, fail the task, etc.)?
>> In Connect there's been a longstanding pattern where every connector
>> reimplements error handling individually, often hardcoding response
>> behaviors to various errors, because the existing errors.tolerance
>> configuration is too limiting.
>> Maybe making this pluggable leads us towards a solution where there could
>> be a pluggable "error handler" that can implement reporting for many
>> different errors, but also allow for simple reconfiguration of
>> error-handling behavior.
>>
>> Thanks,
>> Greg
>>
>> On Thu, Oct 24, 2024 at 3:57 PM Anton Liauchuk <anton93...@gmail.com>
>> wrote:
>>
>> > Bumping the thread. Please review this KIP. Thanks!
>> >
>> > On Sun, Oct 13, 2024 at 11:44 PM Anton Liauchuk <anton93...@gmail.com>
>> > wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I have opened
>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1097+error+record+reporter
>> > >
>> > > POC: https://github.com/apache/kafka/pull/17493
>> > >
>> > > Please review the KIP and PR; feedback and suggestions are welcome.