gaogaotiantian commented on code in PR #56309:
URL: https://github.com/apache/spark/pull/56309#discussion_r3377289690


##########
python/pyspark/sql/tests/connect/streaming/test_parity_listener.py:
##########
@@ -257,7 +258,13 @@ def test_listener_events_spark_command(self):
 
                 @eventually(timeout=60, catch_assertions=True)
                 def load_event(event_name, table_name):
-                    table = self.spark.read.table(table_name).collect()
+                    try:
+                        table = self.spark.read.table(table_name).collect()
+                    except AnalysisException as e:
+                        # It's possible that the table has not been created yet

Review Comment:
   Yes, removing the sleep caused this flakiness, but we need to remove that 
crazy sleep.
   
   The test confirms that the events exist after the query. It used to 
be `query - sleep - check`. Now we are doing `query - polling`. The purpose is 
the same. The difference is that we poll for the result instead of waiting 
for a fixed amount of time.
   
   This PR fills a gap where the table might not even exist when we poll. 
In that case we should just wait for the table to be created and poll again. 
The failure case for this test should be that the event is never in the table 
after 60s, which we are able to catch with both this method and the previous 
waiting method. We can, however, save a lot of time if we see the event earlier 
than 60s.
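
   To illustrate the `query - polling` idea, here is a minimal standalone 
sketch. It is not the PySpark `eventually` helper and it does not touch Spark: 
`poll_until`, `TableNotFound`, and the simulated `load_event` are hypothetical 
names, with `TableNotFound` standing in for the `AnalysisException` raised 
while the table has not been created yet.

```python
import time


class TableNotFound(Exception):
    """Hypothetical stand-in for the error raised while the table is missing."""


def poll_until(check, timeout=60.0, interval=0.1):
    """Retry check() until it returns without raising, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return check()
        except (AssertionError, TableNotFound):
            # Both "table missing" and "event not there yet" mean: try again.
            if time.monotonic() >= deadline:
                raise  # the real failure: nothing showed up within the timeout
            time.sleep(interval)


# Simulated table: missing on the first two reads, empty on the next two,
# and holding the event from the fifth read onward.
attempts = {"count": 0}


def load_event():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TableNotFound("table has not been created yet")
    rows = [] if attempts["count"] < 5 else ["QueryStartedEvent"]
    assert len(rows) == 1, "event not written yet"
    return rows[0]


event = poll_until(load_event, timeout=5.0, interval=0.01)
```

   In this sketch the loop returns as soon as the event is visible, so the 
timeout only bounds the failing case rather than the passing one.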



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

