psantus commented on issue #2159: URL: https://github.com/apache/iceberg-python/issues/2159#issuecomment-3893362402
I'm also very disappointed in `upsert()` performance: it seems it doesn't use partition_filter at all! I have a large table (100M+ rows) partitioned by `company_id, day(created_at)`, and each row has a unique `id`. I would expect upserting with `join_cols=company_id,day(created_at),id` to use partition filtering so that it doesn't scan all files, and therefore to be fast. But it doesn't (am I missing something here?), and I believe @EnyMan's #2943 doesn't address that?
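
To make the expectation concrete, here is a minimal pure-Python sketch of the pruning being asked for. It is not pyiceberg's implementation and does not call the pyiceberg API; the helper names (`touched_partitions`, `files_to_scan`) and the dict layout for rows and data files are hypothetical, chosen only for illustration. It also assumes naive timestamps, so `.date()` stands in for the `day(created_at)` transform.

```python
from datetime import date, datetime


def touched_partitions(rows):
    """Collect the distinct (company_id, day) keys present in the incoming batch."""
    return {(r["company_id"], r["created_at"].date()) for r in rows}


def files_to_scan(data_files, partitions):
    """Keep only the data files whose partition key is touched by the batch."""
    return [f for f in data_files if f["partition"] in partitions]


# Hypothetical incoming batch: two rows, both in the same partition.
incoming = [
    {"company_id": 1, "created_at": datetime(2024, 5, 1, 9, 30), "id": 10},
    {"company_id": 1, "created_at": datetime(2024, 5, 1, 17, 0), "id": 11},
]

# Hypothetical file listing: one file per partition.
data_files = [
    {"path": "a.parquet", "partition": (1, date(2024, 5, 1))},
    {"path": "b.parquet", "partition": (1, date(2024, 5, 2))},
    {"path": "c.parquet", "partition": (2, date(2024, 5, 1))},
]

pruned = files_to_scan(data_files, touched_partitions(incoming))
```

With the partition columns included in the join key, only the files in the touched partitions should need to be read to find matching `id` values; the remaining files cannot contain a match.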
