jingz-db commented on code in PR #49560:
URL: https://github.com/apache/spark/pull/49560#discussion_r1953457708


##########
python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py:
##########
@@ -499,28 +488,14 @@ def check_results(batch_df, batch_id):
                     Row(id="0", countAsString="-1"),
                     Row(id="1", countAsString="3"),
                 }
-                self.first_expired_timestamp = batch_df.filter(
-                    batch_df["countAsString"] == -1
-                ).first()["timeValues"]
                 check_timestamp(batch_df)
 
-            elif batch_id == 2:
+            else:
                 assert set(batch_df.sort("id").select("id", "countAsString").collect()) == {
                     Row(id="0", countAsString="3"),
                     Row(id="0", countAsString="-1"),
                     Row(id="1", countAsString="5"),
                 }
-                # The expired timestamp in current batch is larger than expiry timestamp in batch 1

Review Comment:
   This check was removed when the Connect suites were added. The additional
   check is good to have but not strictly necessary: it requires accessing
   `self.first_expired_timestamp` across batches, which is not doable when the
   suite runs on Spark Connect. What the check mostly verified is that the
   expired timer timestamp in batch 1 is smaller than the expired timer
   timestamp in batch 2.
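   
   For reference, a minimal sketch of what the removed assertion did, assuming
   `check_results` is defined inside the test method so that `self` is captured
   from the enclosing scope and its state survives between micro-batches (the
   cross-batch state that, per the above, cannot be relied on in the Connect
   suite):
   
   ```python
   def check_results(batch_df, batch_id):
       if batch_id == 1:
           # Remember the timestamp of the first expired timer; expired
           # rows are the ones emitted with countAsString == -1.
           self.first_expired_timestamp = batch_df.filter(
               batch_df["countAsString"] == -1
           ).first()["timeValues"]
       elif batch_id == 2:
           current_expired_timestamp = batch_df.filter(
               batch_df["countAsString"] == -1
           ).first()["timeValues"]
           # The timer expired in batch 2 must have a later expiry
           # timestamp than the one recorded in batch 1.
           assert current_expired_timestamp > self.first_expired_timestamp
   ```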



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

