adrians opened a new pull request, #44622:
URL: https://github.com/apache/spark/pull/44622

   
   ### What changes were proposed in this pull request?
   
   The bug: running `CREATE TABLE IF NOT EXISTS ... AS SELECT ...` without a 
Hive catalog always overwrites the existing table.
   The cause: `DataWritingCommand.assertEmptyRootPath` checks whether the 
storage location already exists only when the SaveMode is `ErrorIfExists` (to 
decide whether a "create table as select" statement should save or throw an 
exception). When the SaveMode is `Ignore` (as it is for "create table 
if-not-exists as select..." statements), the storage location is not checked 
for existing files, so the statement later force-overwrites its contents.
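
The decision described above can be modeled with a small sketch. This is not Spark's code (which is Scala): `ctas_action`, its parameters, and its return strings are hypothetical names used only for illustration, and the post-fix "skip" result is an assumption based on the hive-enabled behavior in the matrix below.

```python
from enum import Enum


class SaveMode(Enum):
    # The two SaveMode values named in the description.
    ERROR_IF_EXISTS = "ErrorIfExists"  # CREATE TABLE ... AS SELECT
    IGNORE = "Ignore"                  # CREATE TABLE IF NOT EXISTS ... AS SELECT


def ctas_action(mode: SaveMode, location_has_files: bool, fixed: bool) -> str:
    """Hypothetical model of what a CTAS statement does with its storage location.

    Returns "error", "skip", or "write".
    """
    if mode is SaveMode.ERROR_IF_EXISTS:
        # The location is checked both before and after the change.
        return "error" if location_has_files else "write"
    # From here on the mode is IGNORE.
    if fixed and location_has_files:
        # Assumed post-fix behavior: leave the existing data untouched.
        return "skip"
    # Pre-fix: the location is never inspected, so the write proceeds
    # and replaces whatever is already there.
    return "write"


if __name__ == "__main__":
    # Print the action for a non-empty location, before and after the fix.
    for fixed in (False, True):
        for mode in SaveMode:
            print(f"fixed={fixed} mode={mode.value}: {ctas_action(mode, True, fixed)}")
```

With `fixed=False`, the `IGNORE` branch returns "write" even when the location holds files, which corresponds to case (4) in the matrix.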
   
   ### Why are the changes needed?
   
   It's a bug: the behavior is inconsistent across the cases of the test 
matrix:
   
   ```
   +----------------------------+-------------------------------------+
   | sql statement              | Behavior when overwriting a table   |
   |                            +---------------+---------------------+
   |                            | hive-enabled  | hive-disabled       |
   +----------------------------+---------------+---------------------+
   | create-table               | exception (1) | exception (2)       |
   | create-table-if-not-exists | skip (3)      | OVERWRITE *BUG* (4) |
   +----------------------------+---------------+---------------------+
   ```
   Explained in more depth in the Jira ticket: 
https://issues.apache.org/jira/browse/SPARK-46617
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. `CREATE TABLE IF NOT EXISTS ... AS SELECT` statements now behave 
consistently whether or not Hive is enabled: the existing table is no longer 
overwritten.
   
   ### How was this patch tested?
   
   Manually re-ran every case in the test matrix and verified that the table 
was no longer overwritten.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No
   



