[GitHub] [iceberg] rdblue commented on pull request #5392: Spark: Fix a separate table cache being created for each rewriteFiles

GitBox Sat, 30 Jul 2022 11:43:48 -0700


rdblue commented on PR #5392:
URL: https://github.com/apache/iceberg/pull/5392#issuecomment-1200275092


   I'm interested to hear what @szehon-ho and @RussellSpitzer think about this.
   
   My initial reaction is that this is not something that we should change. We 
don't want to disable AQE for other Spark work, which is a side-effect of this 
change. I also don't like that we need to create a new Spark session for each 
rewrite, but I don't think there is much we can do to avoid it if we want to 
disable AQE. We could also fail if AQE is on or just accept the AQE results.
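
   The three alternatives above (run the rewrite under an isolated conf with AQE disabled, fail if AQE is on, or accept the AQE results) can be sketched roughly as follows. This is only an illustration, not the code in this PR: the session conf is modeled as a plain `Map`, and the class, enum, and method names are hypothetical. The only real Spark name is the `spark.sql.adaptive.enabled` key, and the `"true"` fallback assumes a Spark version where AQE is on by default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: models a session conf as a Map to compare the options.
class AqePolicySketch {
    // Real Spark SQL config key that toggles adaptive query execution.
    static final String AQE_KEY = "spark.sql.adaptive.enabled";

    // Hypothetical names for the three alternatives discussed in the comment.
    enum Policy { ISOLATE_AND_DISABLE, FAIL_IF_ENABLED, ACCEPT }

    // Returns the conf the rewrite would run with. The caller's conf is never
    // mutated, so other work in the same session keeps its AQE setting.
    static Map<String, String> confForRewrite(Map<String, String> sessionConf, Policy policy) {
        // Assumption: AQE is treated as enabled when the key is unset.
        boolean aqeOn = Boolean.parseBoolean(sessionConf.getOrDefault(AQE_KEY, "true"));
        switch (policy) {
            case ISOLATE_AND_DISABLE: {
                // Stand-in for a separate session: copy the conf, then override.
                Map<String, String> isolated = new HashMap<>(sessionConf);
                isolated.put(AQE_KEY, "false");
                return isolated;
            }
            case FAIL_IF_ENABLED: {
                if (aqeOn) {
                    throw new IllegalStateException(
                        "Rewrite requires " + AQE_KEY + "=false in this session");
                }
                return sessionConf;
            }
            default:
                // ACCEPT: run with whatever the session already has.
                return sessionConf;
        }
    }
}
```

   In this model only `ISOLATE_AND_DISABLE` allocates anything per rewrite, which corresponds to the per-rewrite session cost mentioned above; the other two policies reuse the caller's conf unchanged.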
   
   Also, is a separate table cache a bug? Since it is only used once, what is 
the problem with doing it this way? Sure, this won't cache the rewritten table, 
but is there a behavior problem or are loads just slightly slower?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


