lhofhansl commented on PR #5195:
URL: https://github.com/apache/iceberg/pull/5195#issuecomment-1181469377

   > could you elaborate on the cases where caching delete filters might be 
detrimental to Spark? Do you mean in terms of retaining the filters in memory?
   
   Yep. Caching means giving up the option of *not* materializing the complete
filter as a set in memory, which Spark currently benefits from.
   (see: 
https://github.com/apache/iceberg/blob/6cc4a198c56c05ff103e6ecdf75fe50004af19da/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L231)
   
   An alternative is to add the same API I added to TrinoDeleteFilter (which
extends DeleteFilter<TrinoRow>) in the Trino PR: a `filterCached` method (it
could also be called `cacheAndFilter`). That way the Spark code stays in place
and Trino can opt into the caching method.
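   To make the distinction concrete, here is a minimal, self-contained sketch of the two code paths. The class and method names (`SketchDeleteFilter`, `filterCached`) are illustrative only, not the actual Iceberg `DeleteFilter` or Trino implementation; the streaming path stands in for the sorted-position merge the linked code uses, and the caching path shows the trade-off of retaining a materialized delete set for reuse.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; not the real Iceberg/Trino DeleteFilter code.
class SketchDeleteFilter {
  private final List<Long> sortedDeletes; // delete positions, ascending
  private Set<Long> cachedDeleteSet;      // built lazily, only on the caching path

  SketchDeleteFilter(List<Long> sortedDeletes) {
    this.sortedDeletes = sortedDeletes;
  }

  // Streaming path (what Spark keeps): walk the sorted delete positions
  // alongside the sorted row positions, so no full Set is built or retained.
  List<Long> filter(List<Long> sortedRows) {
    List<Long> kept = new ArrayList<>();
    Iterator<Long> deletes = sortedDeletes.iterator();
    Long nextDelete = deletes.hasNext() ? deletes.next() : null;
    for (Long row : sortedRows) {
      while (nextDelete != null && nextDelete < row) {
        nextDelete = deletes.hasNext() ? deletes.next() : null;
      }
      if (nextDelete == null || !nextDelete.equals(row)) {
        kept.add(row);
      }
    }
    return kept;
  }

  // Caching path (what Trino would call): materialize the delete positions
  // into a Set once and retain it, trading memory for reuse across scans.
  List<Long> filterCached(List<Long> rows) {
    if (cachedDeleteSet == null) {
      cachedDeleteSet = new HashSet<>(sortedDeletes);
    }
    List<Long> kept = new ArrayList<>();
    for (Long row : rows) {
      if (!cachedDeleteSet.contains(row)) {
        kept.add(row);
      }
    }
    return kept;
  }
}
```

   Both paths return the same rows; the difference is purely whether the delete set is pinned in memory for later calls, which is why keeping the streaming path for Spark while exposing a caching entry point for Trino avoids regressing either engine.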


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
