lucyyao-db commented on code in PR #41705:
URL: https://github.com/apache/spark/pull/41705#discussion_r1261509634


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##########
@@ -2855,6 +2863,50 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
         "enumString" -> enumString))
   }
 
+  def unreleasedThreadError(loggingId: String, newAcquiredThreadInfo: String,
+                            AcquiredThreadInfo: String, timeWaitedMs: Long,
+                            stackTraceOutput: String): Throwable = {
+    new SparkException (
+      errorClass = "CANNOT_LOAD_STATE_STORE.UNRELEASED_THREAD_ERROR",
+      messageParameters = Map(
+        "loggingId" -> loggingId,
+        "newAcquiredThreadInfo" -> newAcquiredThreadInfo,
+        "acquiredThreadInfo" -> AcquiredThreadInfo,
+        "timeWaitedMs" -> timeWaitedMs.toString,
+        "stackTraceOutput" -> stackTraceOutput))
+  }
Review Comment:
  I believe `cannotLoadStore` is a user-facing error, but I'm not sure whether `unreleasedThreadError` should be one as well.
   
   This is from a previous conversation:
   > These are not about inputs and outputs from the point of view of the end-to-end query. You can imagine an operator which has to retain accumulators over the streaming query's lifetime (I'm over-simplifying, but I guess this is more SQL friendly), and we are checkpointing the accumulators for that microbatch to the durable (mostly remote) file system.
   > 
   > RocksDB is what users pick as the local storage for retaining accumulators. Picking an in-memory map is also feasible. It's just that when Spark runs a microbatch, it needs to load the state store for that specific microbatch (to continue accumulating), which involves downloading the files from the remote file system, deserializing them, and loading the accumulators into local storage. Various interactions happen in there, and we want to capture these cases with different sub-categories.
   > 
   > So it's not directly related to the data users read or write, but to the internal data maintained by the streaming query.
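   
   For context, here is a minimal sketch of how a caller might hit this error path. All names below (`LockSketch`, `acquireOrFail`, `timeoutMs`, `ThreadInfo`) are hypothetical and not taken from the PR; the real helper would raise `QueryExecutionErrors.unreleasedThreadError(...)` instead of returning a `Left`:
   
   ```scala
   // Illustrative stand-in for the state-store lock acquisition check:
   // if a previous owner never releases the lock within the wait limit,
   // surface a diagnostic error instead of blocking forever.
   object LockSketch {
     final case class ThreadInfo(name: String, id: Long) {
       override def toString: String = s"$name (id=$id)"
     }
   
     def acquireOrFail(
         loggingId: String,
         currentOwner: Option[ThreadInfo],
         timeWaitedMs: Long,
         timeoutMs: Long): Either[String, Unit] = {
       currentOwner match {
         case Some(owner) if timeWaitedMs >= timeoutMs =>
           // In the real code this would be something like:
           // throw QueryExecutionErrors.unreleasedThreadError(
           //   loggingId, newAcquiredThreadInfo, acquiredThreadInfo,
           //   timeWaitedMs, stackTraceOutput)
           Left(s"[$loggingId] thread $owner held the lock for " +
             s"${timeWaitedMs}ms (limit ${timeoutMs}ms)")
         case _ =>
           Right(())
       }
     }
   }
   ```
   
   Whether that condition is "user-facing" seems to hinge on whether the user can do anything about the stuck thread, or whether it only indicates an internal bug.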



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

