This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new ac30e9312a8 [SPARK-42565][SS] Error log improvement for the lock acquisition of RocksDB state store instance
ac30e9312a8 is described below

commit ac30e9312a89ca16c16ef27c9276b2bb21cd7431
Author: Huanli Wang <huanli.w...@databricks.com>
AuthorDate: Sat Feb 25 07:54:18 2023 +0900

    [SPARK-42565][SS] Error log improvement for the lock acquisition of RocksDB state store instance

    ```
    "23/02/23 23:57:44 INFO Executor: Running task 2.0 in stage 57.1 (TID 363)
    "23/02/23 23:58:44 ERROR RocksDB StateStoreId(opId=0,partId=3,name=default): RocksDB instance could not be acquired by [ThreadId: Some(49), task: 3.0 in stage 57, TID 363] as it was not released by [ThreadId: Some(51), task: 3.1 in stage 57, TID 342] after 60002 ms.
    ```

    We are seeing these error messages for a testing query. `taskId != partitionId`, but the error log does not make this clear.

    The logs are confusing: the second entry seems to talk about `task 3.0` (it is actually partition 3 and retry attempt 0), but `TID 363` is already occupied by `task 2.0 in stage 57.1`.

    It is also unclear at which stage retry attempt the lock is acquired (or fails to be acquired).

    ### What changes were proposed in this pull request?

    * add `partition ` after `task: ` in the log message for clarification
    * add the stage attempt to distinguish different stage retries

    ### Why are the changes needed?

    Improve the log message for better debuggability.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Not tested separately; this is a log message change only.

    Closes #40161 from huanliwang-db/rocksdb.
    Authored-by: Huanli Wang <huanli.w...@databricks.com>
    Signed-off-by: Jungtaek Lim <kabhwan.opensou...@gmail.com>
---
 .../scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
index cab2fe9b90d..32caf8a1bc8 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
@@ -710,7 +710,8 @@ case class AcquiredThreadInfo() {
   override def toString(): String = {
     val taskStr = if (tc != null) {
       val taskDetails =
-        s"${tc.partitionId}.${tc.attemptNumber} in stage ${tc.stageId}, TID ${tc.taskAttemptId}"
+        s"partition ${tc.partitionId}.${tc.attemptNumber} in stage " +
+          s"${tc.stageId}.${tc.stageAttemptNumber()}, TID ${tc.taskAttemptId}"
       s", task: $taskDetails"
     } else ""
 
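
For illustration only, here is a minimal standalone sketch of how the task detail string changes with this patch. `TaskInfo`, `oldDetails`, and `newDetails` are hypothetical names introduced here; `TaskInfo` is a stand-in for the `TaskContext` fields the patch reads, not Spark's class. The sample values come from the log above, and stage attempt 1 is an assumption inferred from the `stage 57.1 (TID 363)` Executor line.

```scala
// Hypothetical stand-in for the TaskContext fields used by the patch.
// This is NOT Spark's TaskContext; it only mirrors the field names.
object LockLogFormat {
  final case class TaskInfo(
      partitionId: Int,
      attemptNumber: Int,
      stageId: Int,
      stageAttemptNumber: Int,
      taskAttemptId: Long)

  // Format before the patch: "<partition>.<attempt>" is easy to misread
  // as a task index, and the stage attempt is not shown.
  def oldDetails(t: TaskInfo): String =
    s"${t.partitionId}.${t.attemptNumber} in stage ${t.stageId}, TID ${t.taskAttemptId}"

  // Format after the patch: labels the partition and appends the stage attempt.
  def newDetails(t: TaskInfo): String =
    s"partition ${t.partitionId}.${t.attemptNumber} in stage " +
      s"${t.stageId}.${t.stageAttemptNumber}, TID ${t.taskAttemptId}"

  def main(args: Array[String]): Unit = {
    // Acquiring thread from the log above; stage attempt 1 is assumed.
    val acquiring = TaskInfo(3, 0, 57, 1, 363L)
    println(s"task: ${oldDetails(acquiring)}") // task: 3.0 in stage 57, TID 363
    println(s"task: ${newDetails(acquiring)}") // task: partition 3.0 in stage 57.1, TID 363
  }
}
```

With the new format, `TID 363` reads as partition 3, attempt 0, in stage 57 attempt 1, which lines up with the `Running task 2.0 in stage 57.1 (TID 363)` Executor entry.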