vvcephei commented on a change in pull request #8896:
URL: https://github.com/apache/kafka/pull/8896#discussion_r442397523



##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StoreChangelogReader.java
##########
@@ -415,19 +418,20 @@ public void restore() {
                 // for restoring active and updating standby we may prefer different poll time
                 // in order to make sure we call the main consumer#poll in time.
                 // TODO: once we move ChangelogReader to a separate thread this may no longer be a concern
-                polledRecords = restoreConsumer.poll(state.equals(ChangelogReaderState.STANDBY_UPDATING) ? Duration.ZERO : pollTime);
+                polledRecords = restoreConsumer.poll(state == ChangelogReaderState.STANDBY_UPDATING ? Duration.ZERO : pollTime);

Review comment:
       trivial cleanup

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StoreChangelogReader.java
##########
@@ -415,19 +418,20 @@ public void restore() {
                 // for restoring active and updating standby we may prefer different poll time
                 // in order to make sure we call the main consumer#poll in time.
                 // TODO: once we move ChangelogReader to a separate thread this may no longer be a concern
-                polledRecords = restoreConsumer.poll(state.equals(ChangelogReaderState.STANDBY_UPDATING) ? Duration.ZERO : pollTime);
+                polledRecords = restoreConsumer.poll(state == ChangelogReaderState.STANDBY_UPDATING ? Duration.ZERO : pollTime);
             } catch (final InvalidOffsetException e) {
-                log.warn("Encountered {} fetching records from restore consumer for partitions {}, it is likely that " +
+                log.warn("Encountered " + e.getClass().getName() +
+                    " fetching records from restore consumer for partitions " + e.partitions() + ", it is likely that " +
                     "the consumer's position has fallen out of the topic partition offset range because the topic was " +
                     "truncated or compacted on the broker, marking the corresponding tasks as corrupted and re-initializing" +
-                    " it later.", e.getClass().getName(), e.partitions());
+                    " it later.", e);

Review comment:
       Added the exception itself as the "cause" of the warning. The message of 
the InvalidOffsetException (IOE) is pretty good at explaining the root cause.
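
A minimal sketch of the idea in this comment: pass the exception to the logger as the thrown cause instead of only formatting its class name into the message. The real code uses the SLF4J logger, where a trailing Throwable argument with no matching placeholder is logged as the exception. To stay runnable without extra dependencies, this sketch swaps in java.util.logging; `WarnWithCauseSketch`, `CapturingHandler`, and `warnWithCause` are hypothetical names, not part of Kafka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical illustration; not the Kafka Streams implementation.
final class WarnWithCauseSketch {

    // Collects log records in memory so the attached Throwable can be inspected.
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(final LogRecord record) {
            records.add(record);
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Logs a warning whose message names the exception class and partitions,
    // and attaches the exception itself so its own message and stack trace survive.
    static LogRecord warnWithCause(final RuntimeException e, final String partitions) {
        final Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep the demo output out of the console
        final CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        log.log(Level.WARNING,
            "Encountered " + e.getClass().getName()
                + " fetching records from restore consumer for partitions " + partitions,
            e);

        return handler.records.get(0);
    }
}
```

With the exception attached, whoever reads the log gets the original exception message and stack trace alongside the summary line.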

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StoreChangelogReader.java
##########
@@ -446,6 +450,38 @@ public void restore() {
             }
 
             maybeUpdateLimitOffsetsForStandbyChangelogs();
+
+            maybeLogRestorationProgress();

Review comment:
       This is the main change. Once every ten seconds, we will log the 
progress for each active restoring changelog.
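
A rough sketch of what interval-gated progress logging might look like, assuming the caller passes in the current time and a map of remaining records per changelog. The class name, the `Map<String, Long>` shape, and the message format are all assumptions made for illustration; the PR's actual `maybeLogRestorationProgress()` is not shown in this message and may differ.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration; not the method added in the PR.
final class RestorationProgressSketch {
    // Ten seconds, matching the interval named in the review comment.
    private static final long INTERVAL_MS = 10_000L;

    private long lastLogMs;

    RestorationProgressSketch(final long nowMs) {
        this.lastLogMs = nowMs;
    }

    // Returns a progress line if at least INTERVAL_MS has elapsed since the
    // last one, otherwise an empty Optional. A real implementation would log
    // the line instead of returning it.
    Optional<String> maybeLogRestorationProgress(final long nowMs,
                                                 final Map<String, Long> remainingByChangelog) {
        if (nowMs - lastLogMs < INTERVAL_MS) {
            return Optional.empty();
        }
        lastLogMs = nowMs;

        final StringBuilder sb = new StringBuilder("Restoration in progress for ")
            .append(remainingByChangelog.size())
            .append(" partitions.");
        // Sort by changelog name so the output is stable between calls.
        for (final Map.Entry<String, Long> entry : new TreeMap<>(remainingByChangelog).entrySet()) {
            sb.append(" {").append(entry.getKey())
              .append(": remaining ").append(entry.getValue()).append("}");
        }
        return Optional.of(sb.toString());
    }
}
```

Taking the timestamp as a parameter keeps the gate easy to test, and calling the method on every pass of the restore loop costs only a comparison when the interval has not yet elapsed.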

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StoreChangelogReader.java
##########
@@ -415,19 +418,20 @@ public void restore() {
                 // for restoring active and updating standby we may prefer different poll time
                 // in order to make sure we call the main consumer#poll in time.
                 // TODO: once we move ChangelogReader to a separate thread this may no longer be a concern
-                polledRecords = restoreConsumer.poll(state.equals(ChangelogReaderState.STANDBY_UPDATING) ? Duration.ZERO : pollTime);
+                polledRecords = restoreConsumer.poll(state == ChangelogReaderState.STANDBY_UPDATING ? Duration.ZERO : pollTime);
             } catch (final InvalidOffsetException e) {
-                log.warn("Encountered {} fetching records from restore consumer for partitions {}, it is likely that " +
+                log.warn("Encountered " + e.getClass().getName() +
+                    " fetching records from restore consumer for partitions " + e.partitions() + ", it is likely that " +
                     "the consumer's position has fallen out of the topic partition offset range because the topic was " +
                     "truncated or compacted on the broker, marking the corresponding tasks as corrupted and re-initializing" +
-                    " it later.", e.getClass().getName(), e.partitions());
+                    " it later.", e);
 
                 final Map<TaskId, Collection<TopicPartition>> taskWithCorruptedChangelogs = new HashMap<>();
                 for (final TopicPartition partition : e.partitions()) {
                     final TaskId taskId = changelogs.get(partition).stateManager.taskId();
                     taskWithCorruptedChangelogs.computeIfAbsent(taskId, k -> new HashSet<>()).add(partition);
                 }
-                throw new TaskCorruptedException(taskWithCorruptedChangelogs);
+                throw new TaskCorruptedException(taskWithCorruptedChangelogs, e);

Review comment:
       Also added the cause to the thrown exception.
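
For readers less familiar with exception chaining, here is a small sketch of the pattern: the wrapper exception takes the original as its cause, so `getCause()` and the "Caused by:" section of the stack trace lead back to the root problem. `CorruptedTaskSketch` and `wrap` are hypothetical stand-ins, and the message text is made up; the actual `TaskCorruptedException` constructor is only known from the diff above.

```java
// Hypothetical illustration; a stand-in for TaskCorruptedException.
final class CorruptedTaskSketch extends RuntimeException {
    private static final long serialVersionUID = 1L;

    CorruptedTaskSketch(final String message, final Throwable cause) {
        // Passing the cause to the superclass preserves the original stack trace.
        super(message, cause);
    }

    // Wraps a lower-level failure while keeping it reachable through getCause().
    static CorruptedTaskSketch wrap(final RuntimeException original) {
        return new CorruptedTaskSketch(
            "Tasks with corrupted changelogs must be re-initialized", original);
    }
}
```

Without the cause, a handler further up the stack would see only the wrapper and lose the detail about which offsets were out of range.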




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

