This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 4ee63ac930d8 [SPARK-55020][PYTHON][FOLLOW-UP] Use `iter()` to get an iterator from ExecutePlan
4ee63ac930d8 is described below
commit 4ee63ac930d87eb1ff2bc80d83f1f30b3741abb5
Author: Tian Gao <[email protected]>
AuthorDate: Thu Feb 19 10:26:27 2026 +0900
[SPARK-55020][PYTHON][FOLLOW-UP] Use `iter()` to get an iterator from ExecutePlan
### What changes were proposed in this pull request?
Get an iterator from the `ExecutePlan` return value with `iter()`, instead of using the return value as an iterator directly.
### Why are the changes needed?
In #54248 we replaced `for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata())` with `gen = self._stub.ExecutePlan(req, metadata=self._builder.metadata())` followed by explicit `next()` calls. These are not strictly equivalent: a `for` loop calls `iter()` on its target first, whereas `next()` requires the object itself to be an iterator, so the change treated the iterable as if it were an iterator. For the actual `ExecutePlan` this works because the returned object happens to be both an iterable and an iterator. However, the mocks in our test suite only need to return an iterable, and we should not have to constrain our mocks to satisfy this stronger assumption. I fixed the test in the previous [...]
I also reverted the change I made to the mock test. It still works with `iter()`, but I don't want to mislead people into believing that wrapping is necessary.
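The distinction above can be sketched with a minimal standalone example (not Spark code): an object that is iterable but not an iterator, like a mock that returns a plain list, works in a `for` loop but fails with a bare `next()`, and `iter()` bridges the gap.

```python
class FakeStream:
    """Iterable that is NOT an iterator, similar to a mock returning a list."""

    def __init__(self, items):
        self._items = items

    def __iter__(self):
        # Iterable protocol: return a fresh iterator over the items.
        return iter(self._items)


stream = FakeStream([1, 2, 3])

# A for loop works: it implicitly calls iter(stream) first.
assert [b for b in stream] == [1, 2, 3]

# A bare next() does not: FakeStream has no __next__.
try:
    next(stream)
except TypeError:
    pass  # 'FakeStream' object is not an iterator

# The fix from this patch: obtain an explicit iterator, then next() works.
it = iter(FakeStream([1, 2, 3]))
assert next(it) == 1
assert next(it) == 2
```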
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
The changed test passed locally.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54351 from gaogaotiantian/fix-iter-problem.
Authored-by: Tian Gao <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/sql/connect/client/core.py | 6 ++++--
python/pyspark/sql/tests/connect/client/test_client.py | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/python/pyspark/sql/connect/client/core.py b/python/pyspark/sql/connect/client/core.py
index b906aee1b7d0..0a9e8ae3c61c 100644
--- a/python/pyspark/sql/connect/client/core.py
+++ b/python/pyspark/sql/connect/client/core.py
@@ -1701,11 +1701,13 @@ class SparkConnectClient(object):
for attempt in self._retrying():
with attempt:
with disable_gc():
- gen = self._stub.ExecutePlan(req, metadata=self._builder.metadata())
+ it = iter(
+ self._stub.ExecutePlan(req, metadata=self._builder.metadata())
+ )
while True:
try:
with disable_gc():
- b = next(gen)
+ b = next(it)
yield from handle_response(b)
except StopIteration:
break
diff --git a/python/pyspark/sql/tests/connect/client/test_client.py b/python/pyspark/sql/tests/connect/client/test_client.py
index 6c58a8a98d87..8daf82ad3f37 100644
--- a/python/pyspark/sql/tests/connect/client/test_client.py
+++ b/python/pyspark/sql/tests/connect/client/test_client.py
@@ -158,7 +158,7 @@ if should_test_connect:
buf = sink.getvalue()
resp.arrow_batch.data = buf.to_pybytes()
resp.arrow_batch.row_count = 2
- return iter([resp])
+ return [resp]
def Interrupt(self, req: proto.InterruptRequest, metadata):
self.req = req
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]