dtenedor commented on PR #36091:
URL: https://github.com/apache/spark/pull/36091#issuecomment-1101709874

   >  BTW, we should handle the following case:
   >
   > create table t1(i int) using parquet
   > insert into t1 values (1)
   > alter table t1 add column j int default 3
   > insert into t1 values (2, 4)
   > select * from t1
   > It should return ((1, 3), (2, 4)), but currently it returns ((1, null), (2, 4))
   
   Thanks for pointing this out! This is an example of the "current default" 
versus the "existence default". When we evaluate the `ALTER TABLE ADD COLUMN` 
command, it stores the value `3` in two places: the `CURRENT_DEFAULT` column 
metadata (used by future `INSERT`s when the corresponding value is not present 
or is specified as `DEFAULT`) and the `EXISTS_DEFAULT` column metadata (used by 
future `SELECT`s when the corresponding value is not present in storage). In 
this PR we assign this column metadata correctly, but the data sources 
(including the Parquet data source in this example) do not yet inspect the 
`EXISTS_DEFAULT` metadata and return `3` instead of `NULL` when the value is 
not present in storage. Once that support is implemented, your example will 
return the expected result. This is coming up next!
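
   To make the distinction concrete, here is a minimal toy model in plain 
Python. It is only a sketch of the semantics described above, not Spark's 
implementation: the `Column` and `Table` classes, their methods, and the 
`honor_exists_default` flag are all hypothetical names invented for 
illustration.

```python
# Toy model only: NOT Spark's implementation. All names here are hypothetical.
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Column:
    name: str
    current_default: Optional[Any] = None  # models the CURRENT_DEFAULT metadata
    exists_default: Optional[Any] = None   # models the EXISTS_DEFAULT metadata


class Table:
    def __init__(self, columns):
        self.columns = list(columns)
        # Each stored row is a dict; a missing key means "not present in storage".
        self.rows = []

    def add_column(self, name, default=None):
        # ALTER TABLE ADD COLUMN ... DEFAULT: both metadata entries get the value.
        self.columns.append(Column(name, default, default))

    def insert(self, *values):
        # Values omitted at write time are filled in from CURRENT_DEFAULT.
        row = {}
        for i, col in enumerate(self.columns):
            row[col.name] = values[i] if i < len(values) else col.current_default
        self.rows.append(row)

    def select_all(self, honor_exists_default=True):
        # Values absent from storage are filled in from EXISTS_DEFAULT at read
        # time, but only if the reader supports it.
        result = []
        for row in self.rows:
            record = []
            for col in self.columns:
                if col.name in row:
                    record.append(row[col.name])
                elif honor_exists_default:
                    record.append(col.exists_default)
                else:
                    record.append(None)
            result.append(tuple(record))
        return result


# Replay the statements from the quoted example.
t1 = Table([Column("i")])
t1.insert(1)                    # written before column j exists
t1.add_column("j", default=3)
t1.insert(2, 4)

with_support = t1.select_all()
without_support = t1.select_all(honor_exists_default=False)
```

In this model, `without_support` corresponds to the behavior reported in the 
quoted example, where the reader ignores the existence default, and 
`with_support` corresponds to the result expected once data sources consult it.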


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

