rdblue commented on issue #1410: URL: https://github.com/apache/iceberg/issues/1410#issuecomment-694436287
@skambha, the reason is that some of the table properties control behavior that should be preserved.

I think that this makes sense with the `REPLACE ... AS SELECT` use case. The purpose is to atomically overwrite an entire table, possibly with a different schema. Users asked for a way to maintain a report table that is replaced daily. Because all of the data is replaced, it made little sense to validate the schema of a query against the table schema and force users to run `ALTER TABLE` commands in addition to changing the `SELECT` that produces the table data. Replacing, rather than overwriting, the table makes the most sense for this use case.

But there are concerns that are orthogonal to replacing the table contents and schema. For example, the target split size for the table would change much more rarely than the data. Likewise, how many old metadata versions to keep, when to age off snapshots, and other settings make more sense outside of the daily replacement lifecycle. Otherwise, each `REPLACE` query would need to carry _all_ of the table settings in a `TBLPROPERTIES` clause.

That's why we want some metadata to persist across `REPLACE` operations. Exactly _what_ metadata should persist is a good thing to discuss further -- should partitioning be preserved? And we may want to have a property or flag to discard old properties and completely replace a table if you prefer a different behavior. (Would that be a Spark property or a table property? If it's a table property, do we keep it when we run the next replace?)
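To make the semantics described above concrete, here is a minimal Python sketch of a replace that swaps the schema but carries properties forward. It is only an illustration, not Iceberg's implementation: `TableMetadata`, `replace_table`, and the `discard_properties` flag are hypothetical names, and the property key and value are example settings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TableMetadata:
    """Hypothetical, simplified stand-in for a table's metadata."""
    schema: List[str]
    properties: Dict[str, str] = field(default_factory=dict)


def replace_table(
    current: TableMetadata,
    new_schema: List[str],
    new_properties: Optional[Dict[str, str]] = None,
    discard_properties: bool = False,
) -> TableMetadata:
    """Model a REPLACE: the schema is always replaced, properties are merged.

    discard_properties is a hypothetical flag for the "completely replace"
    behavior; where such a flag would live is still an open question.
    """
    new_properties = new_properties or {}
    if discard_properties:
        # Start from a clean slate: only what the query supplies survives.
        merged = dict(new_properties)
    else:
        # Existing settings persist; properties given in the query win.
        merged = {**current.properties, **new_properties}
    return TableMetadata(schema=list(new_schema), properties=merged)


# A daily report table whose split size was configured once (example value).
report = TableMetadata(
    schema=["day", "clicks"],
    properties={"read.split.target-size": "134217728"},
)

# The daily replace changes the schema but does not restate any settings.
replaced = replace_table(report, ["day", "clicks", "revenue"])

# The hypothetical opt-out drops everything that was set before.
fresh = replace_table(report, ["day", "clicks", "revenue"], discard_properties=True)
```

In this sketch `replaced` keeps the split-size setting even though the query never mentioned it, while `fresh` ends up with no properties. Whether partitioning should follow the same merge rule is left out on purpose, since that is the part still under discussion.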
