juliuszsompolski commented on PR #43955:
URL: https://github.com/apache/spark/pull/43955#issuecomment-1822605607

   > In order to preserve the previous behaviour at gRPC 1.56.0. After 
upgrading gRPC from 1.56.0 to 1.59.3, it makes the first request when the 
stream is created, but our code here assumes that no request is made before 
that.
   
   Whoa, that's a rather big change to drop in that small of an gRPC upgrade.
   
   FWIW, I really disliked the previous behaviour, and, on the other hand, many 
tests have to do
   ```
   iter.hasNext() // open the iterator
   ```
   because of it. I think it would be more logical if just calling `execute` 
sent the execution to start on the server.
   
   FWIW, I would be in favor of allowing this behaviour change, and just 
removing that piece of test case as not needed anymore.
   The only public API affected by it would be `Dataset.toLocalIterator`, 
because others open and consume the iterator immediately anyway.
   For `Dataset.toLocalIterator`, I believe that the behaviour change is a good 
one. Before, the query wouldn't even be submitted to the server until someone 
calls `next` or `hasNext` on the resulting iterator. I think that it is better 
that the query is submitted for execution immediately.
   
   WDYT @HyukjinKwon @LuciferYang @dongjoon-hyun ?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to