hvanhovell commented on PR #50873: URL: https://github.com/apache/spark/pull/50873#issuecomment-2959316695
@xi-db the Connect API is supposed to be lazy. That we did this in Python is a mistake. Concretely, I can see two problems with this:

- It can create quite a few extra RPCs.
- It is misleading. By the time you submit something for execution, your underlying data might have changed. You will see a failure anyway.

This works for classic because we have eager analysis, and the Dataset is bound at definition time instead of execution time.
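
To make the two problems concrete, here is a minimal toy sketch in plain Python. It is not Spark code: `Catalog`, `EagerFrame`, `LazyFrame`, and `AnalysisError` are hypothetical stand-ins invented for illustration, and each `Catalog.resolve` call stands in for one analysis RPC.

```python
# Toy model only: none of these classes exist in Spark or Spark Connect.

class AnalysisError(Exception):
    """Raised when a column cannot be resolved against the catalog."""


class Catalog:
    """Stand-in for server-side metadata; counts lookups as 'RPCs'."""

    def __init__(self, tables):
        self.tables = dict(tables)
        self.rpc_count = 0

    def resolve(self, table, column):
        self.rpc_count += 1
        if column not in self.tables.get(table, ()):
            raise AnalysisError(f"cannot resolve {column!r} in {table!r}")


class EagerFrame:
    """Classic-style: analysed, and therefore bound, at definition time."""

    def __init__(self, catalog, table, column):
        catalog.resolve(table, column)  # analysis happens here
        self.bound = (table, column)

    def collect(self):
        return self.bound


class LazyFrame:
    """Connect-style: definition only records a plan; analysis runs on execution."""

    def __init__(self, catalog, table, column):
        self.catalog = catalog
        self.plan = (table, column)

    def collect(self):
        self.catalog.resolve(*self.plan)  # analysis happens here
        return self.plan


catalog = Catalog({"t": {"id"}})

lazy = LazyFrame(catalog, "t", "id")
rpcs_after_lazy_def = catalog.rpc_count    # defining a lazy frame costs no lookup

eager = EagerFrame(catalog, "t", "id")
rpcs_after_eager_def = catalog.rpc_count   # eager definition costs one lookup

# The underlying data changes between definition and execution.
catalog.tables["t"] = {"key"}

try:
    lazy.collect()
    lazy_failed = False
except AnalysisError:
    lazy_failed = True
```

In this sketch, a definition-time check on the lazy frame would have passed and cost an extra lookup, yet execution still fails once the table has changed, which is the sense in which such a check is misleading.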