anishshri-db commented on code in PR #44542:
URL: https://github.com/apache/spark/pull/44542#discussion_r1438619998
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala:
##########
@@ -434,22 +434,26 @@ case class StateStoreRestoreExec(
numColsPrefixKey = 0,
session.sessionState,
Some(session.streams.stateStoreCoordinator)) { case (store, iter) =>
- val hasInput = iter.hasNext
- if (!hasInput && keyExpressions.isEmpty) {
- // If our `keyExpressions` are empty, we're getting a global aggregation. In that case
- // the `HashAggregateExec` will output a 0 value for the partial merge. We need to
- // restore the value, so that we don't overwrite our state with a 0 value, but rather
- // merge the 0 with existing state.
- store.iterator().map(_.value)
- } else {
- iter.flatMap { row =>
- val key = stateManager.getKey(row.asInstanceOf[UnsafeRow])
- val restoredRow = stateManager.get(store, key)
- val outputRows = Option(restoredRow).toSeq :+ row
- numOutputRows += outputRows.size
- outputRows
- }
+ val hasInput = iter.hasNext
+ val result = if (!hasInput && keyExpressions.isEmpty) {
+ // If our `keyExpressions` are empty, we're getting a global aggregation. In that case
+ // the `HashAggregateExec` will output a 0 value for the partial merge. We need to
+ // restore the value, so that we don't overwrite our state with a 0 value, but rather
+ // merge the 0 with existing state.
+ store.iterator().map(_.value)
+ } else {
+ iter.flatMap { row =>
+ val key = stateManager.getKey(row.asInstanceOf[UnsafeRow])
+ val restoredRow = stateManager.get(store, key)
+ val outputRows = Option(restoredRow).toSeq :+ row
+ numOutputRows += outputRows.size
+ outputRows
}
+ }
+ // SPARK-46547 - Release any locks/resources if required, to prevent
+ // deadlocks with the maintenance thread.
+ store.abort()
Review Comment:
@HeartSaVioR - we could probably make this more lightweight by releasing only
the instance lock and keeping the loaded version intact. That would likely
require either a new API on the store or an extra argument to `abort`.
Thoughts?
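
To make the two options concrete, here is a rough, self-contained sketch using a toy store. `ToyStore`, `load`, and the `releaseOnly` flag are made-up names for illustration only, not the existing `StateStore` API, and the assumption that a full abort resets the loaded version is taken from the wording of the comment above rather than from the provider code.

```scala
import java.util.concurrent.locks.ReentrantLock

// Toy stand-in for a state store instance. NOT Spark's StateStore API;
// every name here is hypothetical and only illustrates the idea.
final class ToyStore {
  private val instanceLock = new ReentrantLock()
  private var loadedVersion: Long = -1L

  // Acquire the instance lock (if not already held) and record the version.
  def load(version: Long): Unit = {
    if (!instanceLock.isHeldByCurrentThread) instanceLock.lock()
    loadedVersion = version
  }

  // Hypothetical extra argument: with releaseOnly = true the lock is
  // released but the loaded version is kept; otherwise the loaded
  // version is discarded as well (assumed behavior of a full abort).
  def abort(releaseOnly: Boolean = false): Unit = {
    if (!releaseOnly) loadedVersion = -1L
    if (instanceLock.isHeldByCurrentThread) instanceLock.unlock()
  }

  def version: Long = loadedVersion
  def isLocked: Boolean = instanceLock.isLocked
}
```

With this shape, the restore path would call `store.abort(releaseOnly = true)`. A separate method (for example a hypothetical `release()`) would behave the same way and would avoid changing the signature of `abort`.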
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]