mjsax commented on code in PR #20767:
URL: https://github.com/apache/kafka/pull/20767#discussion_r2535397977


##########
streams/src/main/java/org/apache/kafka/streams/processor/internals/DefaultStateUpdater.java:
##########
@@ -606,14 +609,19 @@ private boolean removeFailedTask(final TaskId taskId, final CompletableFuture<Re
         private void pauseTask(final Task task) {
             final TaskId taskId = task.id();
             // do not need to unregister changelog partitions for paused tasks
-            measureCheckpointLatency(() -> task.maybeCheckpoint(true));
-            pausedTasks.put(taskId, task);
-            updatingTasks.remove(taskId);
-            if (task.isActive()) {
-                transitToUpdateStandbysIfOnlyStandbysLeft();
+            try {
+                measureCheckpointLatency(() -> task.maybeCheckpoint(true));
+                pausedTasks.put(taskId, task);
+                updatingTasks.remove(taskId);
+                if (task.isActive()) {
+                    transitToUpdateStandbysIfOnlyStandbysLeft();
+                }
+                log.info((task.isActive() ? "Active" : "Standby")
+                    + " task " + task.id() + " was paused from the updating tasks and added to the paused tasks.");
+
+            } catch (final StreamsException streamsException) {

Review Comment:
   Thanks for reading the JavaDoc comments, Lucas. This makes it pretty clear.
`sending changelog records failed` is fatal, as it would mean data loss, and we
cannot proceed. It makes sense after the fact: before we can write the local
checkpoint file, we need to flush the producer (I missed this point originally).
   
   > Why does the comment indicate that the thread should die on an I/O error?
   
   For such critical errors, we deliberately do not have any built-in recovery
mechanism (design decision); instead, we let the user opt into auto-recovery via
the uncaught exception handler.
   
   As we flush pending producer writes, a `TaskMigratedException` can originate 
from this flush. We should still handle this one gracefully, and not let the 
thread die.
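
To make the distinction concrete, here is a minimal sketch of the control flow described above. It is not the code from this PR. `PauseTaskSketch`, `Outcome`, and `pause` are hypothetical names, and the two exception classes are local stand-ins that only mirror the Kafka Streams hierarchy (`TaskMigratedException` extends `StreamsException`). The `checkpoint` argument stands in for `task.maybeCheckpoint(true)`, assumed to flush the producer before writing the checkpoint file.

```java
// Hypothetical, self-contained sketch; the exception classes are stand-ins
// for the Kafka Streams types of the same name, not the real ones.
final class PauseTaskSketch {

    static class StreamsException extends RuntimeException {
        StreamsException(final String message) {
            super(message);
        }
    }

    // Mirrors Kafka Streams, where TaskMigratedException is a StreamsException.
    static class TaskMigratedException extends StreamsException {
        TaskMigratedException(final String message) {
            super(message);
        }
    }

    enum Outcome { PAUSED, MIGRATED }

    // checkpoint stands in for task.maybeCheckpoint(true).
    static Outcome pause(final Runnable checkpoint) {
        try {
            checkpoint.run();
            return Outcome.PAUSED;
        } catch (final TaskMigratedException migrated) {
            // Recoverable: the task moved elsewhere, so report it instead of dying.
            return Outcome.MIGRATED;
        }
        // Any other StreamsException (e.g. a failed changelog send) is not
        // caught here and propagates to the caller as a fatal error.
    }

    public static void main(final String[] args) {
        System.out.println(pause(() -> { }));
        System.out.println(pause(() -> {
            throw new TaskMigratedException("producer fenced during flush");
        }));
        try {
            pause(() -> {
                throw new StreamsException("sending changelog records failed");
            });
        } catch (final StreamsException fatal) {
            System.out.println("fatal: " + fatal.getMessage());
        }
    }
}
```

Catching only the subclass keeps the fatal path untouched: the thread still dies on a failed changelog send, and the user-supplied uncaught exception handler decides whether to recover.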



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
