WweiL commented on code in PR #40937:
URL: https://github.com/apache/spark/pull/40937#discussion_r1177124740


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -2092,6 +2097,11 @@ class SparkConnectPlanner(val session: SparkSession) {
       case path => writer.start(path)
     }
 
+    // Register the new query so that the session and query references are cached.
+    SparkConnectService.streamingSessionManager.registerNewStreamingQuery(

Review Comment:
   If the query throws before this line, what would happen? That is basically a failure in query creation, which I believe is common: for example, when a wrong config is set, an `IllegalArgumentException` is thrown ([example](https://github.com/apache/spark/blob/b26844ce879ac0097d6e1a95da18a5c3ef3c9284/connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L227-L268)).

   My guess is that the user would still see the error, since it is handled by the existing error-handling framework in `SparkConnectService`.
   
   But `StreamingQueryManager` unregisters the query in that case:
   
   
https://github.com/apache/spark/blob/b26844ce879ac0097d6e1a95da18a5c3ef3c9284/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala#L411
   
   When the user then tries to access the query, the `orElse` below would be triggered:
   ```
   val query = SparkConnectService.streamingSessionManager
         .findCachedQuery(id, session) // Common case: query is cached in connect session manager.
         .orElse { // Else try to find it in active streams. Mostly will not be found here.
           Option(session.streams.get(id))
         }
   ```
   And wouldn't that still result in a "query ID not found" exception?
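
To make the concern concrete, here is a minimal, self-contained sketch of the failure path described above. It does not use Spark at all: `Query`, `ActiveStreams`, `ConnectQueryCache`, and `Demo` are hypothetical stand-ins for the streaming query, `session.streams`, and the connect session manager cache, and the `subscribe` check only imitates a source rejecting a bad config. The model assumes, as the comment says, that a failed start leaves nothing registered in the active streams.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical stand-in for a streaming query; not a Spark class.
final case class Query(id: String)

// Simplified model of the active-streams registry (assumption, not the real
// StreamingQueryManager): a query whose start fails is unregistered again.
class ActiveStreams {
  private val active = mutable.Map.empty[String, Query]

  def start(id: String, conf: Map[String, String]): Query = {
    val query = Query(id)
    active.update(id, query)
    try {
      // Imitates config validation; `require` throws IllegalArgumentException.
      require(conf.contains("subscribe"), "missing required option 'subscribe'")
      query
    } catch {
      case e: IllegalArgumentException =>
        active.remove(id) // unregister the failed query
        throw e
    }
  }

  // Returns null when the id is unknown, so callers wrap it in Option(...).
  def get(id: String): Query = active.get(id).orNull
}

// Hypothetical model of the connect-side query cache.
class ConnectQueryCache {
  private val cached = mutable.Map.empty[String, Query]
  def register(query: Query): Unit = cached.update(query.id, query)
  def findCachedQuery(id: String): Option[Query] = cached.get(id)
}

object Demo {
  // Registration happens only after start returns, as in the diff above.
  def startAndRegister(
      streams: ActiveStreams,
      cache: ConnectQueryCache,
      id: String,
      conf: Map[String, String]): Query = {
    val query = streams.start(id, conf) // may throw before the next line
    cache.register(query)
    query
  }

  // Same lookup order as the quoted snippet: cache first, then active streams.
  def lookup(streams: ActiveStreams, cache: ConnectQueryCache, id: String): Option[Query] =
    cache.findCachedQuery(id).orElse(Option(streams.get(id)))

  def main(args: Array[String]): Unit = {
    val streams = new ActiveStreams
    val cache = new ConnectQueryCache
    val failed = Try(startAndRegister(streams, cache, "q1", Map.empty[String, String]))
    println(s"start failed: ${failed.isFailure}")
    println(s"lookup after failure: ${lookup(streams, cache, "q1")}")
  }
}
```

In this model, a query that fails during start is in neither the cache nor the active streams, so both branches of the lookup come back empty and the caller can only report the id as not found.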



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

