Bruno Cadonna created KAFKA-17489:
-------------------------------------

             Summary: IllegalStateException if failed task is removed from 
state updater
                 Key: KAFKA-17489
                 URL: https://issues.apache.org/jira/browse/KAFKA-17489
             Project: Kafka
          Issue Type: Task
          Components: streams
            Reporter: Bruno Cadonna
            Assignee: Bruno Cadonna
             Fix For: 3.9.0


If a task that is managed by the state updater fails (e.g., with an 
{{OffsetOutOfRangeException}}) and this same task is removed from the state 
updater, the task is regarded as corrupted and put into the task registry, 
where it waits for handling.

This can lead to an {{IllegalStateException}} in multiple ways:

1. In {{handleAssignment()}}, the tasks in the state updater are handled before 
the tasks in the task registry. A failed standby task may be removed from the 
state updater and put into the task registry. When the tasks in the task 
registry are then handled, the standby task is found there. However, when the 
state updater is used, it is illegal to have standby tasks in the task 
registry. The following {{IllegalStateException}} is thrown:

{code:java}
java.lang.IllegalStateException: Standby tasks should only be managed by the 
state updater, but standby task 1_0 is managed by the stream thread
{code}

2. If a failed active task is removed from the state updater while a revocation 
is handled ({{onPartitionRevoked()}} call in the {{ConsumerCoordinator}}), the 
exception of the failed task is not immediately thrown by the 
{{ConsumerCoordinator#onJoinComplete()}} method. Instead, the exception is 
stored and {{onAssignment}} is called. Additionally, the failed task is put 
into the task registry for later handling. Method {{onAssignment}} calls 
{{handleAssignment()}}, which, as above, handles the tasks in the task 
registry. Here, two {{IllegalStateException}}s are thrown:

{code:java}
java.lang.IllegalStateException: Illegal state RESTORING while recycling active 
task 2_1
{code} 
(This exception may differ depending on how the task is handled, e.g., 
recycling or re-assigning.)
and

{code:java}
java.lang.IllegalStateException: Task unknown: 2_1
{code} 
The latter occurs because the failed task is handled and removed from the task 
registry in {{handleAssignment()}}, although it should stay there until the 
original exception is handled.
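
The two failure paths above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual Streams code: {{ToyTask}}, {{ToyRegistry}}, and {{handleRegistryTasks()}} are hypothetical stand-ins for the real task, task registry, and the registry-handling part of {{handleAssignment()}}. In particular, the lookup by task ID that produces the second exception is an assumption about how the stored exception is handled later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FailedTaskHandlingSketch {

    /** Hypothetical stand-in for a Streams task; not the real Task interface. */
    static final class ToyTask {
        final String id;
        final boolean standby;

        ToyTask(String id, boolean standby) {
            this.id = id;
            this.standby = standby;
        }
    }

    /** Hypothetical task registry keyed by task ID. */
    static final class ToyRegistry {
        private final Map<String, ToyTask> tasks = new HashMap<>();

        void add(ToyTask task) { tasks.put(task.id, task); }

        void remove(String taskId) { tasks.remove(taskId); }

        List<ToyTask> all() { return new ArrayList<>(tasks.values()); }

        ToyTask task(String taskId) {
            ToyTask task = tasks.get(taskId);
            if (task == null) {
                throw new IllegalStateException("Task unknown: " + taskId);
            }
            return task;
        }
    }

    /** Toy version of the registry step: rejects standby tasks, removes all others. */
    static void handleRegistryTasks(ToyRegistry registry) {
        for (ToyTask task : registry.all()) {
            if (task.standby) {
                throw new IllegalStateException("Standby tasks should only be managed by the "
                    + "state updater, but standby task " + task.id
                    + " is managed by the stream thread");
            }
            registry.remove(task.id);
        }
    }

    /** Case 1: a failed standby task was moved to the registry before the registry is handled. */
    static String standbyScenario() {
        ToyRegistry registry = new ToyRegistry();
        registry.add(new ToyTask("1_0", true));
        try {
            handleRegistryTasks(registry);
            return "no exception";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    /** Case 2: a failed active task is removed too early, so a later lookup by ID fails. */
    static String unknownTaskScenario() {
        ToyRegistry registry = new ToyRegistry();
        registry.add(new ToyTask("2_1", false));
        handleRegistryTasks(registry); // removes 2_1 although its exception is still pending
        try {
            registry.task("2_1"); // assumed lookup while handling the stored exception
            return "no exception";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(standbyScenario());
        System.out.println(unknownTaskScenario());
    }
}
```

In this model, keeping the failed task in the registry until its stored exception has been handled would avoid the second lookup failure, which matches the expectation stated above.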
 




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
