lhofhansl commented on PR #5195: URL: https://github.com/apache/iceberg/pull/5195#issuecomment-1181469377
> could you elaborate on the cases where caching delete filters might be detrimental to Spark? Do you mean in terms of retaining the filters in memory?

Yep. With caching we'd lose the option of *not* materializing the complete delete filter as a set in memory (see: https://github.com/apache/iceberg/blob/6cc4a198c56c05ff103e6ecdf75fe50004af19da/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L231).

An alternative is to add the same API I added to `TrinoDeleteFilter` (which extends `DeleteFilter<TrinoRow>`) in the Trino PR, namely a `filterCached` method (it could be called `cacheAndFilter` as well). That way we can leave the Spark code in place and Trino can use the caching method.
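For concreteness, here is a minimal sketch of the shape that `filterCached` method could take. `DeleteFilter`, `CloseableIterable`, and `StructLikeSet` are real Iceberg types (and `asStructLike` is assumed to be the `DeleteFilter` hook); `TrinoRow`, the cache map, `cacheKey()`, and `loadDeleteSet()` are stand-ins, and the constructor plus the abstract-method overrides from the Trino PR are elided, so this is illustrative rather than the actual PR code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.iceberg.data.DeleteFilter;
import org.apache.iceberg.io.CloseableIterable;
import org.apache.iceberg.util.StructLikeSet;

// Declared abstract only because the constructor and the DeleteFilter
// abstract-method overrides from the Trino PR are elided in this sketch.
public abstract class TrinoDeleteFilter extends DeleteFilter<TrinoRow> {

  // Hypothetical process-wide cache of materialized delete sets; the real
  // PR would decide the key and the cache lifecycle.
  private static final Map<String, StructLikeSet> DELETE_SET_CACHE =
      new ConcurrentHashMap<>();

  // The inherited filter() stays untouched, so Spark keeps the option of
  // never materializing the complete delete set in memory.

  // Proposed new entry point: materialize the delete set once, cache it,
  // and apply it to each incoming row. (Projecting rows down to the
  // equality-delete fields is simplified away here.)
  public CloseableIterable<TrinoRow> filterCached(CloseableIterable<TrinoRow> records) {
    StructLikeSet deletes =
        DELETE_SET_CACHE.computeIfAbsent(cacheKey(), k -> loadDeleteSet());
    return CloseableIterable.filter(records, row -> !deletes.contains(asStructLike(row)));
  }

  private String cacheKey() {
    // hypothetical: e.g. the data file path, so splits over the same
    // file share one materialized set
    throw new UnsupportedOperationException("sketch only");
  }

  private StructLikeSet loadDeleteSet() {
    // hypothetical: read the task's delete files into a StructLikeSet;
    // this is exactly the materialization step that filter() can avoid
    throw new UnsupportedOperationException("sketch only");
  }
}
```

The point of this shape is that the caching cost is opt-in: Spark keeps calling the inherited `filter()` and never pays for building the set, while Trino calls `filterCached()` and amortizes the materialization across splits.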
