rhauch commented on a change in pull request #8858:
URL: https://github.com/apache/kafka/pull/8858#discussion_r448023368
##########
File path: docs/connect.html
##########
@@ -258,6 +258,48 @@ <h4><a id="connect_rest" href="#connect_rest">REST API</a></h4>
         <li><code>GET /</code>- return basic information about the Kafka Connect cluster such as the version of the Connect worker that serves the REST request (including git commit ID of the source code) and the Kafka cluster ID that is connected to.
     </ul>
+
+    <h4><a id="connect_errorreporting" href="#connect_errorreporting">Error Reporting in Connect</a></h4>
+
+    <p>Kafka Connect provides error reporting to handle errors encountered along various stages of processing. By default, any error encountered during conversion or within transformations will cause the connector to fail. Each connector configuration can also enable tolerating such errors by skipping them, optionally writing each error and the details of the failed operation and problematic record (with various levels of detail) to the Connect application log. These mechanisms also capture errors when a sink connector is processing the messages consumed from its Kafka topics, and all of the errors can be written to a configurable "dead letter queue" (DLQ) Kafka topic.</p>
+
+    <p>To report errors within a connector's converter, transforms, or in specifically the sink connector itself to the log, set <code>errors.log.enable=true</code> in the connector configuration to log details of each error and problem record's topic, partition, and offset. For additional debugging purposes, set <code>errors.log.include.messages=true</code> to also log the problem record key, value, and headers to the log (note this may log sensitive information).
+
+    <p>To report errors within a connector's converter, transforms, or in specifically the sink connector itself to a dead letter queue topic, set <code>errors.deadletterqueue.topic.name</code>, and optionally <code>errors.deadletterqueue.context.headers.enable=true</code>.</p>

Review comment:
```suggestion
    <p>To report errors within a connector's converter, transforms, or within the sink connector itself to a dead letter queue topic, set <code>errors.deadletterqueue.topic.name</code>, and optionally <code>errors.deadletterqueue.context.headers.enable=true</code>.</p>
```
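As a minimal sketch (not part of the pull request itself), the two dead letter queue properties named in the suggested paragraph might appear in a sink connector configuration like this; the topic name `my-connector-errors` is only an illustrative placeholder:

```properties
# write records that fail in conversion, transformation, or the sink task to a DLQ topic
errors.deadletterqueue.topic.name=my-connector-errors

# optionally add headers describing the error context to each record written to the DLQ topic
errors.deadletterqueue.context.headers.enable=true
```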
##########
File path: docs/connect.html
##########
@@ -258,6 +258,48 @@ <h4><a id="connect_rest" href="#connect_rest">REST API</a></h4>
+    <pre class="brush: text;">
+    # disable retries on failure
+    errors.retry.timeout=0
+
+    # do not log the error and their contexts
+    errors.log.enable=false
+
+    # do not record errors in a dead letter queue topic
+    errors.deadletterqueue.topic.name=

Review comment:
```suggestion
    # errors.deadletterqueue.topic.name=
```
##########
File path: docs/connect.html
##########
@@ -258,6 +258,48 @@ <h4><a id="connect_rest" href="#connect_rest">REST API</a></h4>
+    # Fail on first error
+    errors.tolerance=none
+    </pre>
+
+    <p>The following configuration shows how to setup error handling with multiple retries, logging both to the application logs and a Kafka topic with infinite tolerance:</p>

Review comment:
```suggestion
    <p>The following configuration properties can be added to a connector configuration to setup error handling with multiple retries, logging to the application logs and the <code>my-connector-errors</code> Kafka topic, and tolerating all errors rather than failing the connector or task:</p>
```
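A sketch of the kind of connector configuration the suggested wording describes. The property names are the standard Connect error-handling options and `my-connector-errors` comes from the suggestion above, but the retry timeout and delay values here are purely illustrative:

```properties
# retry failed operations for up to 10 minutes, waiting at most 30 seconds between attempts
# (these two values are illustrative, not recommendations)
errors.retry.timeout=600000
errors.retry.delay.max.ms=30000

# log the error context to the Connect application log, without the record contents
errors.log.enable=true
errors.log.include.messages=false

# also write failed records to the dead letter queue topic named in the suggestion
errors.deadletterqueue.topic.name=my-connector-errors

# tolerate all errors instead of failing the connector or task
errors.tolerance=all
```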
##########
File path: docs/connect.html
##########
@@ -258,6 +258,48 @@ <h4><a id="connect_rest" href="#connect_rest">REST API</a></h4>
+    <p>For example, below shows a configuration that will cause a connector will fail immediately upon an error or exception. Although it is not necessary to add extra configuration properties for this behavior, adding the following properties to a sink connector configuration would achieve this "fail fast" behavior:</p>

Review comment:
```suggestion
    <p>For example, the following configuration properties can be added to a connector configuration to cause the connector to fail immediately upon an error or exception. Although it is not necessary to add extra configuration properties for this behavior, adding the following properties to a sink connector configuration would achieve this "fail fast" behavior:</p>
```

##########
File path: docs/connect.html
##########
@@ -429,6 +471,42 @@ <h5><a id="connect_sinktasks" href="#connect_sinktasks">Sink Tasks</a></h5>
     <p>The <code>flush()</code> method is used during the offset commit process, which allows tasks to recover from failures and resume from a safe point such that no events will be missed. The method should push any outstanding data to the destination system and then block until the write has been acknowledged. The <code>offsets</code> parameter can often be ignored, but is useful in some cases where implementations want to store offset information in the destination store to provide exactly-once delivery. For example, an HDFS connector could do this and use atomic move operations to make sure the <code>flush()</code> operation atomically commits the data and offsets to a final location in HDFS.</p>
+
+    <h5><a id="connect_errantrecordreporter" href="connect_errantrecordreporter">Errant Record Reporter</a></h5>
+
+    <p>The <code>ErrantRecordReporter</code> can be used to report errors encountered after records have been sent to a sink connector. The following is an example implementation and use case of the <code>ErrantRecordReporter</code> in the <code>SinkTask</code> class:</p>

Review comment:
```suggestion
    <p>When <a href="#connect_errorreporting">error reporting</a> is enabled for a connector, the connector can use an <code>ErrantRecordReporter</code> to report problems with individual records sent to a sink connector. The following example shows how to obtain and use the <code>ErrantRecordReporter</code> in a <code>SinkTask</code> subclass, while safely handling the case when the connector is installed in an older Connect runtime that doesn't have this reporter feature:</p>
```
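A sketch of the kind of `SinkTask` code the suggested wording refers to, not the exact example from the pull request. The class name and the `process()` helper are hypothetical; the `try`/`catch` in `start()` assumes that an older Connect runtime without the reporter API surfaces the missing method as `NoSuchMethodError` or `NoClassDefFoundError`:

```java
import java.util.Collection;
import java.util.Map;

import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.sink.ErrantRecordReporter;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

// Hypothetical sink task showing one way to use the errant record reporter.
public class MySinkTask extends SinkTask {

    private ErrantRecordReporter reporter;

    @Override
    public String version() {
        return "1.0.0";
    }

    @Override
    public void start(Map<String, String> props) {
        try {
            // May be null if error reporting is not enabled for this connector.
            reporter = context.errantRecordReporter();
        } catch (NoSuchMethodError | NoClassDefFoundError e) {
            // Running in an older Connect runtime that does not have this API.
            reporter = null;
        }
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        for (SinkRecord record : records) {
            try {
                process(record); // hypothetical helper that writes the record to the sink
            } catch (Exception e) {
                if (reporter != null) {
                    // Report the bad record and keep processing; the runtime logs it
                    // and/or sends it to the DLQ according to the connector configuration.
                    reporter.report(record, e);
                } else {
                    // No reporter available, so fail the task as before.
                    throw new ConnectException("Failed to process record", e);
                }
            }
        }
    }

    private void process(SinkRecord record) {
        // Placeholder for the connector-specific write logic.
    }

    @Override
    public void stop() {
        // Nothing to clean up in this sketch.
    }
}
```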