[
https://issues.apache.org/jira/browse/SPARK-42565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-42565:
------------------------------------
Assignee: (was: Apache Spark)
> Error log improvement for the lock acquisition of RocksDB state store
> instance
> -------------------------------------------------------------------------------
>
> Key: SPARK-42565
> URL: https://issues.apache.org/jira/browse/SPARK-42565
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 3.5.0
> Reporter: Huanli Wang
> Priority: Minor
>
>
> {code:java}
> 23/02/23 23:57:44 INFO Executor: Running task 2.0 in stage 57.1 (TID 363)
> 23/02/23 23:58:44 ERROR RocksDB StateStoreId(opId=0,partId=3,name=default):
> RocksDB instance could not be acquired by [ThreadId: Some(49), task: 3.0 in
> stage 57, TID 363] as it was not released by [ThreadId: Some(51), task: 3.1
> in stage 57, TID 342] after 60002 ms.{code}
>
> We see these error messages in a test query. The task ID is not the same as
> the partition ID (*taskId != partitionId*), but the error log does not make
> that distinction clear.
> The logs are confusing: the second entry appears to refer to
> `{*}task 3.0{*}` (it is actually partition 3, retry attempt 0), but
> `{*}TID 363{*}` already belongs to `{*}task 2.0 in stage 57.1{*}`.
>
> It is also unclear in which stage retry attempt the lock is acquired (or
> fails to be acquired).
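
To make the ambiguity concrete, here is a minimal sketch of a log message that
names each identifier separately. It is not the actual Spark change: the class
LockLogSketch, the holder TaskRef, and the message layout are hypothetical. In
Spark the values would presumably come from TaskContext (partitionId(),
attemptNumber(), stageId(), stageAttemptNumber(), taskAttemptId()). The owner's
stage attempt (0) is an assumed value, since the original log does not show it.

```java
public class LockLogSketch {

    /** Hypothetical holder for the identifiers the log line should spell out. */
    public static final class TaskRef {
        final long threadId;
        final int partitionId;        // partition of the state store, not the task index
        final int attemptNumber;      // retry attempt of the task for this partition
        final int stageId;
        final int stageAttemptNumber; // which retry of the stage the task belongs to
        final long taskAttemptId;     // the globally unique TID

        public TaskRef(long threadId, int partitionId, int attemptNumber,
                       int stageId, int stageAttemptNumber, long taskAttemptId) {
            this.threadId = threadId;
            this.partitionId = partitionId;
            this.attemptNumber = attemptNumber;
            this.stageId = stageId;
            this.stageAttemptNumber = stageAttemptNumber;
            this.taskAttemptId = taskAttemptId;
        }

        @Override
        public String toString() {
            // Label every field so "3.0" can no longer be read as a task index.
            return "[ThreadId: " + threadId
                + ", partition: " + partitionId
                + ", attempt: " + attemptNumber
                + ", stage: " + stageId + "." + stageAttemptNumber
                + ", TID: " + taskAttemptId + "]";
        }
    }

    /** Builds the lock-timeout message from the requester and the current owner. */
    public static String lockTimeoutMessage(TaskRef requester, TaskRef owner, long waitedMs) {
        return "RocksDB instance could not be acquired by " + requester
            + " as it was not released by " + owner
            + " after " + waitedMs + " ms.";
    }

    public static void main(String[] args) {
        // Requester values follow the log above (stage attempt 1 from "stage 57.1").
        TaskRef requester = new TaskRef(49, 3, 0, 57, 1, 363);
        // Stage attempt 0 for the owner is an assumption, not taken from the log.
        TaskRef owner = new TaskRef(51, 3, 1, 57, 0, 342);
        System.out.println(lockTimeoutMessage(requester, owner, 60002));
    }
}
```

With labeled fields, a reader can tell that both entries refer to partition 3
and can see which stage attempt each task belongs to.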
--
This message was sent by Atlassian Jira
(v8.20.10#820010)