dtenedor opened a new pull request, #37256:
URL: https://github.com/apache/spark/pull/37256

   ### What changes were proposed in this pull request?
   
   Restrict adding DEFAULT columns to existing tables to an allowlist of 
supported data source types.
   
   ### Why are the changes needed?
   
   When `ALTER TABLE ... ADD COLUMNS` commands assign column DEFAULT values, 
Spark constant-folds these values and stores the results in the `EXISTS_DEFAULT` 
column metadata. Each data source is then responsible for substituting this 
value for rows whose corresponding field is not present in storage.
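
   As an illustrative sketch of this mechanism (the table and column names 
here are assumptions, not taken from this PR):

   ```sql
   -- Assuming an existing Parquet table `t` with prior data on disk:
   ALTER TABLE t ADD COLUMNS (status STRING DEFAULT 'active');
   -- Spark constant-folds the expression 'active' and records it in the new
   -- column's EXISTS_DEFAULT metadata; when scanning files written before
   -- this command ran, the data source substitutes 'active' for the
   -- missing field.
   SELECT status FROM t;
   ```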
   
   To grant flexibility to data sources that are not yet ready to support 
this functionality, this PR adds the new SQLConf 
`ADD_DEFAULT_COLUMN_EXISTING_TABLE_BANNED_PROVIDERS` 
(`spark.sql.defaultColumn.addColumnExistingTableBannedProviders`). When a table 
provider appears both in this list and in `DEFAULT_COLUMN_ALLOWED_PROVIDERS`, 
Spark assigns default column values only upon either (1) initial table 
creation or (2) later `ALTER TABLE ... SET DEFAULT ...` commands, but not when 
adding columns to an existing table.
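
   The intended interaction of the two lists might look as follows (a hedged 
sketch; the table name and default values are illustrative, and the exact 
error raised in the banned case is not shown here):

   ```sql
   -- With provider `parquet` listed in DEFAULT_COLUMN_ALLOWED_PROVIDERS and
   -- also in the new banned-providers list, defaults remain assignable at
   -- initial table creation...
   CREATE TABLE t (a INT DEFAULT 42) USING parquet;
   -- ...and via ALTER TABLE ... SET DEFAULT on an existing column...
   ALTER TABLE t ALTER COLUMN a SET DEFAULT 43;
   -- ...but no longer when adding a new column to the existing table:
   ALTER TABLE t ADD COLUMNS (b STRING DEFAULT 'abc');
   ```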
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes: it adds the new SQLConf described above, which restricts which 
providers may accept DEFAULT values in `ALTER TABLE ... ADD COLUMNS` commands.
   
   ### How was this patch tested?
   
   Existing column DEFAULT test coverage + new unit test coverage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

