[GitHub] [kafka] cadonna commented on a diff in pull request #13925: KAFKA-10199: Consider tasks in state updater when computing offset sums

2023-07-03 Thread via GitHub


cadonna commented on code in PR #13925:
URL: https://github.com/apache/kafka/pull/13925#discussion_r1250497598


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1141,25 +1141,30 @@ public Map<TaskId, Long> getTaskOffsetSums() {
         // Not all tasks will create directories, and there may be directories for tasks we don't currently own,
         // so we consider all tasks that are either owned or on disk. This includes stateless tasks, which should
         // just have an empty changelogOffsets map.
-        for (final TaskId id : union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())) {

Review Comment:
   > It seems with the state updater enabled, tasks is actually only containing 
"running tasks". It seems appropriate to rename this variable to runningTasks 
(can also happen in a follow up PR).
   
   The old code path with the state updater disabled still exists, and we can 
disable the state updater if we encounter a major bug after releasing. So, I 
would postpone such renamings until the old code path is removed.
   
   > I am actually also wondering if we still need this Tasks container any 
longer to begin with?
   
   I would keep it, because it allows us to cleanly set a specific state of the 
task manager in unit tests. In any case, I would wait for the upcoming thread 
refactoring to make such changes.
   
   > would it still be useful for the state-updater thread to use the Tasks 
container, given that it also owns active tasks as long as they are restoring?
   
   I do not think so, since access by the state updater would imply that the 
tasks registry (aka tasks container) needs to be accessed concurrently. For 
this reason, we defined an invariant that a task can be owned either by the 
stream thread or by the state updater, but not by both. Sharing the tasks 
registry between stream thread and state updater would break that invariant. If 
you meant to use a separate instance of the tasks registry for the state 
updater, that would not be useful IMO.
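
The ownership invariant described above can be illustrated with a small sketch. It assumes two separate, non-shared maps (one per owner) that are merged into a single view on demand. The class name `OwnershipSketch`, the method signature, and the use of plain strings for task IDs and task states are hypothetical and are not the actual `TaskManager` code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Kafka Streams APIs.
final class OwnershipSketch {

    // Merges the tasks of both owners into one map (task id -> state).
    // Fails fast if a task id shows up under both owners, because that
    // would violate the "exactly one owner" invariant.
    static Map<String, String> allTasks(final Map<String, String> ownedByStreamThread,
                                        final Map<String, String> ownedByStateUpdater) {
        final Map<String, String> all = new HashMap<>(ownedByStreamThread);
        for (final Map.Entry<String, String> entry : ownedByStateUpdater.entrySet()) {
            // putIfAbsent returns the existing value if the key is already mapped.
            if (all.putIfAbsent(entry.getKey(), entry.getValue()) != null) {
                throw new IllegalStateException(
                    "Task " + entry.getKey() + " is owned by both the stream thread and the state updater");
            }
        }
        return all;
    }
}
```

Because each map in this sketch is only ever touched by its own owner, neither needs to be a concurrent collection; only the hand-over of a task between owners would need synchronization.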



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] cadonna commented on a diff in pull request #13925: KAFKA-10199: Consider tasks in state updater when computing offset sums

2023-06-29 Thread via GitHub


cadonna commented on code in PR #13925:
URL: https://github.com/apache/kafka/pull/13925#discussion_r1246365489


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1138,28 +1138,33 @@ public void signalResume() {
     public Map<TaskId, Long> getTaskOffsetSums() {
         final Map<TaskId, Long> taskOffsetSums = new HashMap<>();
 
-        // Not all tasks will create directories, and there may be directories for tasks we don't currently own,
-        // so we consider all tasks that are either owned or on disk. This includes stateless tasks, which should
-        // just have an empty changelogOffsets map.
-        for (final TaskId id : union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())) {
-            final Task task = tasks.contains(id) ? tasks.task(id) : null;
-            // Closed and uninitialized tasks don't have any offsets so we should read directly from the checkpoint
-            if (task != null && task.state() != State.CREATED && task.state() != State.CLOSED) {
+        final Map<TaskId, Task> tasks = allTasks();
+        final Set<TaskId> lockedTaskDirectoriesOfNonOwnedTasksAndClosedAndCreatedTasks =

Review Comment:
   Let's be defensive then!
   






[GitHub] [kafka] cadonna commented on a diff in pull request #13925: KAFKA-10199: Consider tasks in state updater when computing offset sums

2023-06-29 Thread via GitHub


cadonna commented on code in PR #13925:
URL: https://github.com/apache/kafka/pull/13925#discussion_r1246336689


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1177,14 +1181,15 @@ private void tryToLockAllNonEmptyTaskDirectories() {
         // current set of actually-locked tasks.
         lockedTaskDirectories.clear();
 
+        final Map<TaskId, Task> allTasks = allTasks();
         for (final TaskDirectory taskDir : stateDirectory.listNonEmptyTaskDirectories()) {
             final File dir = taskDir.file();
             final String namedTopology = taskDir.namedTopology();
             try {
                 final TaskId id = parseTaskDirectoryName(dir.getName(), namedTopology);
                 if (stateDirectory.lock(id)) {
                     lockedTaskDirectories.add(id);
-                    if (!tasks.contains(id)) {
+                    if (!allTasks.containsKey(id)) {

Review Comment:
   Previously, this debug log only considered tasks owned by the stream thread.






[GitHub] [kafka] cadonna commented on a diff in pull request #13925: KAFKA-10199: Consider tasks in state updater when computing offset sums

2023-06-29 Thread via GitHub


cadonna commented on code in PR #13925:
URL: https://github.com/apache/kafka/pull/13925#discussion_r1246331625


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1138,28 +1138,33 @@ public void signalResume() {
     public Map<TaskId, Long> getTaskOffsetSums() {
         final Map<TaskId, Long> taskOffsetSums = new HashMap<>();
 
-        // Not all tasks will create directories, and there may be directories for tasks we don't currently own,
-        // so we consider all tasks that are either owned or on disk. This includes stateless tasks, which should
-        // just have an empty changelogOffsets map.
-        for (final TaskId id : union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())) {
-            final Task task = tasks.contains(id) ? tasks.task(id) : null;
-            // Closed and uninitialized tasks don't have any offsets so we should read directly from the checkpoint
-            if (task != null && task.state() != State.CREATED && task.state() != State.CLOSED) {
+        final Map<TaskId, Task> tasks = allTasks();
+        final Set<TaskId> lockedTaskDirectoriesOfNonOwnedTasksAndClosedAndCreatedTasks =

Review Comment:
   OK, I agree with you!






[GitHub] [kafka] cadonna commented on a diff in pull request #13925: KAFKA-10199: Consider tasks in state updater when computing offset sums

2023-06-28 Thread via GitHub


cadonna commented on code in PR #13925:
URL: https://github.com/apache/kafka/pull/13925#discussion_r1245477150


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1138,28 +1138,33 @@ public void signalResume() {
     public Map<TaskId, Long> getTaskOffsetSums() {
         final Map<TaskId, Long> taskOffsetSums = new HashMap<>();
 
-        // Not all tasks will create directories, and there may be directories for tasks we don't currently own,
-        // so we consider all tasks that are either owned or on disk. This includes stateless tasks, which should
-        // just have an empty changelogOffsets map.
-        for (final TaskId id : union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())) {
-            final Task task = tasks.contains(id) ? tasks.task(id) : null;
-            // Closed and uninitialized tasks don't have any offsets so we should read directly from the checkpoint
-            if (task != null && task.state() != State.CREATED && task.state() != State.CLOSED) {
+        final Map<TaskId, Task> tasks = allTasks();
+        final Set<TaskId> lockedTaskDirectoriesOfNonOwnedTasksAndClosedAndCreatedTasks =

Review Comment:
   I do not think there is a guarantee that `lockedTaskDirectories` contains any 
tasks the client owns. `lockedTaskDirectories` are just the non-empty task 
directories in the state directory when a rebalance starts. However, a task 
directory is created when a task is created, i.e., when it is in state 
`CREATED`. A task directory is not deleted when a task is closed, i.e., when it 
is in state `CLOSED`. This might be a correlation and not a thought-out 
invariant. At least, the original code did not rely on this since it used 
`union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())`.
   I am also somewhat reluctant to rely on such an (IMO) brittle invariant. 
   As an example, in the future we could decide to move the creation of the task 
directory to other parts of the code, such as when the task is initialized, 
which would mean that there is an interval in which the task is in state 
`CREATED` but does not have a task directory.
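
The defensive approach argued for above can be sketched as follows. This is a minimal illustration, assuming string task IDs and a simplified state enum; `OffsetSourceSketch`, `offsetSources`, and the `Source` enum are hypothetical names and do not reflect the actual `getTaskOffsetSums()` implementation. The point is that the candidate set is the union of locked directories and owned tasks, so an owned task is still considered even if it has no task directory.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names are Kafka Streams APIs.
final class OffsetSourceSketch {
    enum State { CREATED, RESTORING, RUNNING, SUSPENDED, CLOSED }
    enum Source { TASK, CHECKPOINT }

    // Decides, per task id, where offsets would be read from.
    static Map<String, Source> offsetSources(final Set<String> lockedTaskDirectories,
                                             final Map<String, State> ownedTasks) {
        // Union of both sets: do not assume that every owned task has a locked directory.
        final Set<String> candidates = new HashSet<>(lockedTaskDirectories);
        candidates.addAll(ownedTasks.keySet());

        final Map<String, Source> sources = new TreeMap<>();
        for (final String id : candidates) {
            final State state = ownedTasks.get(id); // null if the task is not owned
            // Closed and uninitialized tasks have no in-memory offsets,
            // so they fall back to the checkpoint (which may not exist).
            final boolean hasLiveOffsets =
                state != null && state != State.CREATED && state != State.CLOSED;
            sources.put(id, hasLiveOffsets ? Source.TASK : Source.CHECKPOINT);
        }
        return sources;
    }
}
```

With locked directories `{0_0, 0_1}` and owned tasks `{0_1: RUNNING, 0_2: CREATED}`, task `0_2` is still part of the result even though it has no directory, which is exactly the case that would be missed if only `lockedTaskDirectories` were iterated.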






[GitHub] [kafka] cadonna commented on a diff in pull request #13925: KAFKA-10199: Consider tasks in state updater when computing offset sums

2023-06-28 Thread via GitHub


cadonna commented on code in PR #13925:
URL: https://github.com/apache/kafka/pull/13925#discussion_r1245477150


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1138,28 +1138,33 @@ public void signalResume() {
     public Map<TaskId, Long> getTaskOffsetSums() {
         final Map<TaskId, Long> taskOffsetSums = new HashMap<>();
 
-        // Not all tasks will create directories, and there may be directories for tasks we don't currently own,
-        // so we consider all tasks that are either owned or on disk. This includes stateless tasks, which should
-        // just have an empty changelogOffsets map.
-        for (final TaskId id : union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())) {
-            final Task task = tasks.contains(id) ? tasks.task(id) : null;
-            // Closed and uninitialized tasks don't have any offsets so we should read directly from the checkpoint
-            if (task != null && task.state() != State.CREATED && task.state() != State.CLOSED) {
+        final Map<TaskId, Task> tasks = allTasks();
+        final Set<TaskId> lockedTaskDirectoriesOfNonOwnedTasksAndClosedAndCreatedTasks =

Review Comment:
   I do not think there is a guarantee that `lockedTaskDirectories` contains any 
tasks the client owns. `lockedTaskDirectories` are just the non-empty task 
directories in the state directory when a rebalance starts. However, a task 
directory is created when a task is created, i.e., when it is in state 
`CREATED`. A task directory is not deleted when a task is closed, i.e., when it 
is in state `CLOSED`. This might be a correlation and not a thought-out 
invariant. At least, the original code did not rely on this since it used 
`union(HashSet::new, lockedTaskDirectories, tasks.allTaskIds())`.
   I am also somewhat reluctant to rely on such an (IMO) brittle invariant. 
   The creation of the task directory can probably be moved to other parts of 
the code, such as when the task is initialized, which would mean that there is 
an interval in which the task is in state `CREATED` but does not have a task 
directory.


