WweiL commented on code in PR #42664:
URL: https://github.com/apache/spark/pull/42664#discussion_r1305937549


##########
python/pyspark/sql/streaming/listener.py:
##########
@@ -477,7 +477,8 @@ def fromJson(cls, j: Dict[str, Any]) -> "StreamingQueryProgress":
             name=j["name"],
             timestamp=j["timestamp"],
             batchId=j["batchId"],
-            batchDuration=j["batchDuration"],
+            # before spark 4.0, batchDuration is not in the json method of jvm side StreamingQueryProgress
+            batchDuration=j["batchDuration"] if "batchDuration" in j else None,

Review Comment:
   @dongjoon-hyun Hi Dongjoon, sorry for the back and forth. On second thought,
I realized that the newly added tests and the test failure in
https://github.com/apache/spark/pull/42521#issuecomment-1691547730 actually
uncovered a bug. I had assumed that `batchDuration` is always present in the
JSON passed in, but before 4.0 it is not there.
   
   Since we are not adding the change from
https://github.com/apache/spark/pull/42077, this check is needed. I reverted
that commit and added this check.
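
To illustrate the check, here is a minimal, self-contained sketch of the same pattern. `ProgressSketch` is a hypothetical stand-in with only three fields, not the real `StreamingQueryProgress`, and the sample payloads are made up for illustration.

```python
from typing import Any, Dict, Optional


class ProgressSketch:
    """Hypothetical stand-in for StreamingQueryProgress (illustration only)."""

    def __init__(
        self, name: Optional[str], batchId: int, batchDuration: Optional[int]
    ) -> None:
        self.name = name
        self.batchId = batchId
        self.batchDuration = batchDuration

    @classmethod
    def fromJson(cls, j: Dict[str, Any]) -> "ProgressSketch":
        return cls(
            name=j["name"],
            batchId=j["batchId"],
            # The key may be absent in JSON produced by an older JVM side,
            # so fall back to None instead of raising KeyError.
            batchDuration=j["batchDuration"] if "batchDuration" in j else None,
        )


# Made-up payloads: one without the key (older format), one with it.
old_style = ProgressSketch.fromJson({"name": "q", "batchId": 3})
new_style = ProgressSketch.fromJson(
    {"name": "q", "batchId": 3, "batchDuration": 120}
)
```

With the original `j["batchDuration"]` lookup, the first call would raise `KeyError`; with the membership check it yields `None`. `j.get("batchDuration")` would behave the same way here.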



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

