GrigorievNick edited a comment on issue #1202: URL: https://github.com/apache/iceberg/issues/1202#issuecomment-659308492
> Both Spark 2.4 and Spark 3.0 support dynamic partition overwrite. Spark 3.0 also supports overwrite by expression, although the expression must match all rows in a data file or no rows of a data file, or else it will cause an exception because the granularity of delete is a whole data file.

@rdblue But an overwrite implemented via delete is much smarter than overwriting all data in the partition: it changes only the files that contain changes, while a simple overwrite rewrites the whole partition. Of course I can read all data from the partition -> manipulate -> overwrite, but I can do that with any code. What I am looking for is a way to update only the files that match the changes. As I understand it, there is no such solution right now, correct?

I can implement it manually using the low-level (java-core) API. But in that case I have one more question, which I can't find answered in the docs: is it possible to run concurrent [Table Operation](https://iceberg.apache.org/api/#table-metadata) -> `newRewrite` calls?

A short explanation: I will have different Spark partitions, each of which will overwrite one or a few data files. Each partition's work is idempotent, and the partitions run in parallel.
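
To make the scenario concrete, here is a minimal sketch in plain Java with no Iceberg dependency. `ToyTable`, `rewrite`, and `runDemo` are hypothetical names, not Iceberg API: the "table" is just a set of data file names, and a commit is assumed to be an atomic compare-and-swap of that set with a retry on conflict. Whether `newRewrite` behaves like this under concurrent writers is exactly what I am asking.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ConcurrentRewriteSketch {

    /** Hypothetical stand-in for a table: the current snapshot is an immutable set of data file names. */
    static final class ToyTable {
        private final AtomicReference<Set<String>> current;

        ToyTable(Set<String> files) {
            this.current = new AtomicReference<>(Set.copyOf(files));
        }

        Set<String> files() {
            return current.get();
        }

        /** Replace oldFile with newFile, retrying if another writer committed first. */
        void rewrite(String oldFile, String newFile) {
            while (true) {
                Set<String> base = current.get();
                // Assumed validation: the file being replaced must still be part of the table.
                if (!base.contains(oldFile)) {
                    throw new IllegalStateException("file no longer in table: " + oldFile);
                }
                Set<String> next = new HashSet<>(base);
                next.remove(oldFile);
                next.add(newFile);
                // Commit only if nobody else swapped the snapshot since we read it.
                if (current.compareAndSet(base, Set.copyOf(next))) {
                    return;
                }
            }
        }
    }

    /** Each task plays the role of one Spark partition rewriting its own data file. */
    static Set<String> runDemo() throws InterruptedException {
        ToyTable table = new ToyTable(Set.of("a.parquet", "b.parquet", "c.parquet", "d.parquet"));
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (String f : List.of("a.parquet", "b.parquet", "c.parquet")) {
            pool.submit(() -> table.rewrite(f, f.replace(".parquet", "-v2.parquet")));
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("rewrites did not finish in time");
        }
        return table.files();
    }

    public static void main(String[] args) throws InterruptedException {
        // Sorted only to make the printed file list stable.
        System.out.println(new TreeSet<>(runDemo()));
    }
}
```

In this toy model the three rewrites touch disjoint files, so a writer that loses the compare-and-swap simply re-reads the snapshot and retries, and the untouched file `d.parquet` is never rewritten. That is the behavior I would like to get from the real API.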
