flyrain commented on issue #5245: URL: https://github.com/apache/iceberg/issues/5245#issuecomment-1180706273
Hi @shidayang, how many delete files were there in your test? I benchmarked reads with multiple delete files; you can see the results here: https://github.com/apache/iceberg/pull/3287#issuecomment-960433304.

```
with 25% rows are deleted and distribute these deletes to 1, 2, 5, 10 delete files
```

The perf doesn’t degrade much with more delete files. Please be aware that the non-vectorized read uses the path that does not cache the filter. I am guessing Trino could be different from Spark in terms of read pattern.
