[GitHub] [spark] zhengruifeng commented on pull request #41986: [SPARK-44406][CONNECT] Make `SparkSession.sql` work properly with dropped temp view

via GitHub Thu, 13 Jul 2023 17:13:22 -0700


zhengruifeng commented on PR #41986:
URL: https://github.com/apache/spark/pull/41986#issuecomment-1635078602


   > This is a behavioral change. The first version of SQL() used a similar 
cached DF approach, but we decided against it because it creates a lot of 
state on the server side for no good reason.
   > 
   > Now, simply by using the API, the DF is stored on the server, and it's 
not easily retriable.
   
   The problem, I think, is that `spark.sql` becomes nondeterministic. After 
a user gets a DF from `spark.sql`, if the user drops or replaces the view, the 
query output can change.
   
   Spark still has to store the plans in the catalog for temp views, which 
cannot be dropped and are exposed to end users.
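
To make the trade-off above concrete, here is a minimal toy model in plain Python. It is not Spark code and does not reflect how Spark Connect is implemented: `ToyCatalog`, `LazyDF`, `SnapshotDF`, and `demo` are hypothetical names invented for this sketch. `LazyDF` keeps only the view name and resolves it on every `collect()`, so its output follows later drop/replace operations. `SnapshotDF` resolves the view once at creation, which keeps the output stable at the cost of holding extra state.

```python
class ToyCatalog:
    """Hypothetical stand-in for a session's temp view catalog."""

    def __init__(self):
        self._views = {}

    def create_or_replace(self, name, rows):
        self._views[name] = list(rows)

    def drop(self, name):
        self._views.pop(name, None)

    def lookup(self, name):
        if name not in self._views:
            raise KeyError(f"view not found: {name}")
        return self._views[name]


class LazyDF:
    """Keeps only the view name and resolves it on every collect()."""

    def __init__(self, catalog, view):
        self._catalog = catalog
        self._view = view

    def collect(self):
        return list(self._catalog.lookup(self._view))


class SnapshotDF:
    """Resolves the view once, at creation, and holds on to that state."""

    def __init__(self, catalog, view):
        self._rows = list(catalog.lookup(view))

    def collect(self):
        return list(self._rows)


def demo():
    catalog = ToyCatalog()
    catalog.create_or_replace("v", [1, 2])
    lazy = LazyDF(catalog, "v")
    snap = SnapshotDF(catalog, "v")
    before = (lazy.collect(), snap.collect())

    # Replacing the view changes what the lazy handle returns.
    catalog.create_or_replace("v", [9])
    after_replace = (lazy.collect(), snap.collect())

    # Dropping the view makes the lazy handle fail outright.
    catalog.drop("v")
    try:
        lazy.collect()
        lazy_failed = False
    except KeyError:
        lazy_failed = True

    return before, after_replace, lazy_failed, snap.collect()
```

In this sketch the snapshot handle is deterministic but must keep its resolved rows alive for as long as the handle exists, which mirrors the server-side state concern raised in the quoted comment.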


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

