zecookiez commented on code in PR #50123:
URL: https://github.com/apache/spark/pull/50123#discussion_r2035566019
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala:
##########
@@ -593,7 +593,10 @@ trait StateStoreProvider {
def supportedInstanceMetrics: Seq[StateStoreInstanceMetric] = Seq.empty
}
-object StateStoreProvider {
+object StateStoreProvider extends Logging {
+
+ @GuardedBy("this")
Review Comment:
Yeah, I'll add more context to this. In some situations the `coordinatorRef`
call from the uploadSnapshot method would freeze. I think the issue is lock
contention on the `loadedProviders` lock, so the RPC calls to obtain the
coordinator were getting stuck. Since these upload RPC calls seemed logically
separate from what the `StateStore` object uses, I created a separate endpoint
in `StateStoreProvider`.
Maybe we can move this somewhere else that makes more sense.
This was the exception and stack trace reported:
```
org.apache.spark.SparkException:
[CANNOT_LOAD_STATE_STORE.UNRELEASED_THREAD_ERROR] An error occurred during
loading state. StateStoreId(opId=0,partId=4,name=default): RocksDB instance
could not be acquired by [ThreadId: Some(16)] for operationType=close_store as
it was not released by [ThreadId: Some(314791), task: partition 4.0 in stage
372.0, TID 1145] after 120007 ms.
[info] Thread holding the lock has trace:
app//org.apache.spark.sql.execution.streaming.state.StateStore$.coordinatorRef(StateStore.scala:1157)
[info] app//org.apache.spark.sql.execution.streaming.state.StateStore$.reportSnapshotUploaded(StateStore.scala:1154)
[info] app//org.apache.spark.sql.execution.streaming.state.RocksDBEventListener.reportSnapshotUploaded(RocksDBStateStoreProvider.scala:981)
...
```
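To make the contention above concrete, here is a rough sketch of the idea, not
the actual Spark code. All names (`FakeCoordinatorRef`, `StoreSketch`,
`ProviderSketch`, `LockDemo`) are made up for illustration. It assumes the
freeze happens because the upload path needs a monitor that another thread is
holding, and shows that guarding the coordinator reference with its own monitor
lets the upload path proceed regardless:

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an RPC endpoint reference; not a Spark type.
final class FakeCoordinatorRef {
  val uploadsReported = new AtomicInteger(0)
  def reportSnapshotUploaded(version: Long): Unit = {
    uploadsReported.incrementAndGet()
  }
}

// Hypothetical stand-in for the object that owns the `loadedProviders` lock.
object StoreSketch {
  private val loadedProviders = new java.util.HashMap[String, String]()

  // Runs `body` while holding the providers lock, as maintenance work might.
  def withProvidersLock[T](body: => T): T = loadedProviders.synchronized(body)
}

// Hypothetical separate holder: the reference is guarded by this object's
// own monitor, so obtaining it never waits on the providers lock.
object ProviderSketch {
  private var coordinator: Option[FakeCoordinatorRef] = None

  def coordinatorRef: FakeCoordinatorRef = this.synchronized {
    coordinator.getOrElse {
      val created = new FakeCoordinatorRef
      coordinator = Some(created)
      created
    }
  }
}

object LockDemo {
  def main(args: Array[String]): Unit = {
    val holding = new CountDownLatch(1)
    val release = new CountDownLatch(1)

    // One thread grabs the providers lock and keeps it until told to release.
    val holder = new Thread(() => StoreSketch.withProvidersLock {
      holding.countDown()
      release.await()
    })
    holder.start()
    holding.await()

    // The upload path still obtains the coordinator while that lock is held.
    ProviderSketch.coordinatorRef.reportSnapshotUploaded(1L)

    release.countDown()
    holder.join()
  }
}
```

If `coordinatorRef` were instead synchronized on the same monitor as
`withProvidersLock`, the call in `main` would block until the holder thread
released it, which is the kind of stall I suspect is behind the trace above.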
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]