[GitHub] [iceberg] RussellSpitzer commented on pull request #2496: [#2039] Support default value semantics - API changes

GitBox Thu, 10 Feb 2022 21:34:59 -0800


RussellSpitzer commented on pull request #2496:
URL: https://github.com/apache/iceberg/pull/2496#issuecomment-1035897006



   
   > Rewriting datafiles produces a new snapshot, which derives from the latest 
snapshot, with its same schema, so it is sort of orthogonal in the sense that 
the reading behavior is the same.
   
   I think this may be an issue, since all current implementations of rewrite 
start by reading the current state of the data and then writing that output to 
new files. Consider two files that are both missing column A, for which I have 
set a default value of 1. Say my optimize rewrite command touches one of these 
files and rewrites it. On read it will see that A has a value of 1 and return 
rows with A=1, so the replacement data file is now filled in with A=1. Now if 
I change the default to 2, rows in the unoptimized file will return 2 (the new 
default) for A, while those in the optimized file will return 1. I think this 
would be pretty strange behavior.
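
   To make the scenario concrete, here is a rough toy model of it. This is 
not the Iceberg API: the "files" are plain lists of row dicts, and `read_file` 
and `rewrite_file` are hypothetical helpers that only assume a read-then-write 
rewrite and read-time filling of missing columns from the current default.

```python
# Toy model, not the Iceberg API: a "file" is a list of row dicts, and a
# column absent from a row is filled from the table-level default on read.

def read_file(rows, defaults):
    """Return rows with missing columns filled from the current defaults."""
    return [{**defaults, **row} for row in rows]

def rewrite_file(rows, defaults):
    """Read-then-write rewrite: the default gets materialized in the new file."""
    return read_file(rows, defaults)

defaults = {"a": 1}
file_1 = [{"id": 1}]  # written before column a existed
file_2 = [{"id": 2}]  # written before column a existed

# The optimize command touches only file_1.
file_1 = rewrite_file(file_1, defaults)

# The default is changed after the rewrite.
defaults = {"a": 2}

result = read_file(file_1, defaults) + read_file(file_2, defaults)
print(result)  # [{'a': 1, 'id': 1}, {'a': 2, 'id': 2}]
```

   The two rows started out in identical files, yet they now disagree on A 
only because one of them happened to be rewritten before the default changed.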




