rdblue commented on issue #1410:
URL: https://github.com/apache/iceberg/issues/1410#issuecomment-694436287


   @skambha, the reason is that some table properties control behavior that 
should be preserved across a replace. I think preserving them makes sense for 
the `REPLACE ... AS SELECT` use case, whose purpose is to atomically overwrite 
an entire table, possibly with a different schema.
   
   Users asked for a way to maintain a report table that is replaced daily. 
Because all of the data is replaced, it made little sense to validate the 
schema of a query against the table schema and force users to run `ALTER TABLE` 
commands in addition to changing the `SELECT` that produces the table data. 
Replacing, rather than overwriting, the table makes the most sense for this use 
case.
   
   But there are concerns that are orthogonal to replacing the table contents 
and schema. For example, the target split size for the table would change much 
more rarely than the data. Likewise, how many old metadata versions to keep, 
when to age off snapshots, and other settings make more sense outside of the 
daily replacement lifecycle. Otherwise, each `REPLACE` query would need to 
carry _all_ of the table settings in a `TBLPROPERTIES` clause.
   
   That's why we want some metadata to persist across `REPLACE` operations. 
Exactly _what_ metadata should persist is a good thing to discuss further -- 
should partitioning be preserved? And we may want to have a property or flag to 
discard old properties and completely replace a table if you prefer a different 
behavior. (Would that be a Spark property or a table property? If it's a table 
property, do we keep it when we run the next replace?)
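
   To make the trade-off concrete, here is a minimal sketch of one possible 
reading of "properties persist across `REPLACE`", written in Python rather than 
against any Iceberg API. The function name `replace_properties` and the 
`discard_old` flag are hypothetical, and the property keys and values are only 
illustrative of the kind of settings mentioned above.

```python
def replace_properties(existing, requested, discard_old=False):
    """Return the properties a table would have after a REPLACE.

    existing:    properties on the table before the REPLACE
    requested:   properties given in the REPLACE's TBLPROPERTIES clause
    discard_old: hypothetical flag; when True, nothing is carried over
    """
    if discard_old:
        # Full replacement: only what the query states survives.
        return dict(requested)
    merged = dict(existing)   # keep settings that outlive the daily data
    merged.update(requested)  # settings stated in the query take precedence
    return merged


# Illustrative settings that change far less often than the data.
existing = {
    "read.split.target-size": "268435456",
    "write.metadata.previous-versions-max": "10",
}
# The daily REPLACE only mentions what it wants to change.
requested = {"read.split.target-size": "134217728"}

preserved = replace_properties(existing, requested)
replaced = replace_properties(existing, requested, discard_old=True)
```

   With the default, `preserved` keeps the metadata-version setting without the 
query repeating it. With `discard_old=True`, that setting is dropped, so the 
query would have to carry every setting itself.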
