[ https://issues.apache.org/jira/browse/KAFKA-15968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Arthur updated KAFKA-15968:
---------------------------------
Description:
When QuorumController encounters a CorruptRecordException, it does not include
the exception in the log message. Since CorruptRecordException extends
ApiException, it gets caught by the first condition in
EventHandlerExceptionInfo#fromInternal.
The controller treats ApiException as an expected case (for things like authz
errors or creating a topic that already exists) so it does not cause a
failover. If the active controller sees a corrupt record, it should be a fatal
error.
While we are fixing this, we should audit the subclasses of ApiException and
make sure we are handling the fatal ones correctly.
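The ordering problem described above can be illustrated with a small, self-contained sketch. The classes below are hypothetical stand-ins rather than the real Kafka classes, the method names are invented for illustration, and the "proposed" ordering is only one possible shape for a fix, not the actual patch. Treating every non-ApiException as fatal is also a simplification.

```java
public class FaultClassifierSketch {
    // Hypothetical stand-ins for the exception hierarchy; not the real Kafka classes.
    static class ApiException extends RuntimeException {
        ApiException(String message) { super(message); }
    }

    static class CorruptRecordException extends ApiException {
        CorruptRecordException(String message) { super(message); }
    }

    static class TopicExistsException extends ApiException {
        TopicExistsException(String message) { super(message); }
    }

    // Behavior as described in the report: the ApiException check comes first,
    // so every subclass, including CorruptRecordException, is treated as expected.
    static boolean isFatalCurrent(Throwable t) {
        if (t instanceof ApiException) {
            return false;
        }
        return true;
    }

    // One possible ordering (an assumption): test the fatal subclass before
    // falling through to the general ApiException case.
    static boolean isFatalProposed(Throwable t) {
        if (t instanceof CorruptRecordException) {
            return true;
        }
        if (t instanceof ApiException) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Throwable corrupt = new CorruptRecordException("corrupt record");
        Throwable exists = new TopicExistsException("topic already exists");
        System.out.println("current,  corrupt record: " + isFatalCurrent(corrupt));
        System.out.println("proposed, corrupt record: " + isFatalProposed(corrupt));
        System.out.println("proposed, topic exists:   " + isFatalProposed(exists));
    }
}
```

With the first ordering the corrupt record is classified the same way as a duplicate topic. With the second it is singled out before the general case is reached.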
-----
This was found while tracing the origin of the following log4j log:
{code:java}
INFO [ControllerServer id=9990] handleCommit[baseOffset=192554233]: event
failed with CorruptRecordException in 234 microseconds.
{code}
> QuorumController does not treat CorruptRecordException as fatal
> ---------------------------------------------------------------
>
> Key: KAFKA-15968
> URL: https://issues.apache.org/jira/browse/KAFKA-15968
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 3.6.0, 3.7.0
> Reporter: David Arthur
> Priority: Critical