WweiL commented on code in PR #42664:
URL: https://github.com/apache/spark/pull/42664#discussion_r1305937549
##########
python/pyspark/sql/streaming/listener.py:
##########
@@ -477,7 +477,8 @@ def fromJson(cls, j: Dict[str, Any]) -> "StreamingQueryProgress":
name=j["name"],
timestamp=j["timestamp"],
batchId=j["batchId"],
- batchDuration=j["batchDuration"],
+ # before spark 4.0, batchDuration is not in the json method of jvm side StreamingQueryProgress
+ batchDuration=j["batchDuration"] if "batchDuration" in j else None,
Review Comment:
@dongjoon-hyun Hi Dongjoon, sorry for the back and forth. On second thought,
I found that the newly added tests and the test failure in
https://github.com/apache/spark/pull/42521#issuecomment-1691547730 actually
uncovered a bug. Previously I assumed `batchDuration` is always present in the
passed-in JSON, but before 4.0 it is not there.
Given that the change in https://github.com/apache/spark/pull/42077 is not
included, this check is needed. I reverted that commit and added this check.
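
For illustration only, here is a minimal standalone sketch of the presence check, using plain dicts as stand-ins for the JSON passed to `fromJson`. The helper name `parse_batch_duration` and the sample payloads are hypothetical and are not part of the PR.

```python
from typing import Any, Dict, Optional


def parse_batch_duration(j: Dict[str, Any]) -> Optional[int]:
    """Return batchDuration when the key is present, otherwise None."""
    # Same conditional as in the diff above; for a plain dict,
    # j.get("batchDuration") would behave identically.
    return j["batchDuration"] if "batchDuration" in j else None


# Hypothetical payloads, not real StreamingQueryProgress JSON.
new_style = {"name": "q", "batchId": 1, "batchDuration": 120}
old_style = {"name": "q", "batchId": 1}  # key absent, as described for pre-4.0

with_duration = parse_batch_duration(new_style)
without_duration = parse_batch_duration(old_style)
```

With the unguarded `j["batchDuration"]` lookup, the second payload would raise `KeyError`, which is the failure mode this check avoids.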
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.