WweiL commented on code in PR #42686:
URL: https://github.com/apache/spark/pull/42686#discussion_r1306049532


##########
python/pyspark/sql/streaming/listener.py:
##########
@@ -477,7 +477,7 @@ def fromJson(cls, j: Dict[str, Any]) -> "StreamingQueryProgress":
             name=j["name"],
             timestamp=j["timestamp"],
             batchId=j["batchId"],
-            batchDuration=j["batchDuration"],
+            batchDuration=j["batchDuration"] if "batchDuration" in j else None,

Review Comment:
   This is how the method is currently used. In Spark Connect, the listener works as follows:
   1. The user's listener code is serialized and sent to the Spark server.
   2. The server starts a Scala listener, [which in turn starts a Python process](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingQueryListenerHelper.scala) (essentially another Connect client) that runs the user's code.
   3. Each time a new event comes in, the [event is serialized to JSON on the JVM side](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingQueryListenerHelper.scala#L52) and passed to the server-side Python process, which calls this `fromJson` [method to convert it back into an actual `StreamingQueryProgress` object](https://github.com/apache/spark/blob/master/python/pyspark/sql/connect/streaming/worker/listener_worker.py#L76-L77).
   
   But before https://github.com/apache/spark/pull/42077 in 3.5, the JVM `json` method of `StreamingQueryProgress` did not include the `batchDuration` field. The code here expects that field to always be present, hence the error.
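
   To illustrate the failure mode and the fix, here is a minimal sketch. `ProgressSketch` is a hypothetical stand-in that carries only a few fields, not the real `StreamingQueryProgress`, and the payload values are made up. It parses a payload without `batchDuration` (as an older JVM `json` method would produce) alongside one that includes it.

```python
import json
from typing import Any, Dict, Optional


class ProgressSketch:
    """Hypothetical stand-in for StreamingQueryProgress with only four fields."""

    def __init__(
        self,
        name: Optional[str],
        timestamp: str,
        batchId: int,
        batchDuration: Optional[int],
    ) -> None:
        self.name = name
        self.timestamp = timestamp
        self.batchId = batchId
        self.batchDuration = batchDuration

    @classmethod
    def fromJson(cls, j: Dict[str, Any]) -> "ProgressSketch":
        return cls(
            name=j["name"],
            timestamp=j["timestamp"],
            batchId=j["batchId"],
            # dict.get returns None when the key is absent, which matches
            # `j["batchDuration"] if "batchDuration" in j else None`.
            batchDuration=j.get("batchDuration"),
        )


# Made-up payload without batchDuration, as an older JVM side would send it.
old_payload = json.loads(
    '{"name": "q", "timestamp": "2023-08-25T00:00:00.000Z", "batchId": 3}'
)
# The same payload with the field included.
new_payload = dict(old_payload, batchDuration=120)

old_progress = ProgressSketch.fromJson(old_payload)
new_progress = ProgressSketch.fromJson(new_payload)
```

   With the original `j["batchDuration"]` lookup, the first payload would raise a `KeyError`; with the tolerant lookup, `batchDuration` is simply `None`.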



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

