adrians commented on code in PR #44622:
URL: https://github.com/apache/spark/pull/44622#discussion_r1474552639


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala:
##########
@@ -102,14 +102,17 @@ object DataWritingCommand {
   }
   /**
    * When execute CTAS operators, and the location is not empty, throw 
[[AnalysisException]].
-   * For CTAS, the SaveMode is always [[ErrorIfExists]]
+   * For CTAS, the SaveMode is always [[ErrorIfExists]].
+   * For Create-Table-If-Not-Exists, the SaveMode is [[Ignore]].
    *
    * @param tablePath Table location.
    * @param saveMode  Save mode of the table.
    * @param hadoopConf Configuration.
    */
   def assertEmptyRootPath(tablePath: URI, saveMode: SaveMode, hadoopConf: 
Configuration): Unit = {

Review Comment:
   I don't think a separate flag is needed. The SaveModes are mapped reasonably 
over the SQL syntax variants, so no additional logic would be needed, we can 
base on the SaveMode.
   
   ```
   
+------------------------------------------+---------------+--------------------------------------------------------+
   | SQL Syntax                               | SaveMode      | Semantics       
                                       |
   
+------------------------------------------+---------------+--------------------------------------------------------+
   | CREATE TABLE ... AS SELECT               | ErrorIfExists | If data/table 
exists, show error.                      |
   |                                          |               | (or if 
allowNonEmptyLocationInCTAS is set,             |
   |                                          |               |  ignore error & 
overwrite anyway)                      |
   |                                          |               |                 
                                       |
   | CREATE TABLE IF NOT EXISTS ... AS SELECT | Ignore        | If data/table 
exists, ignore error & stop.             |
   |                                          |               |                 
                                       |
   | CREATE OR REPLACE TABLE ... AS SELECT    | Overwrite     | If data/table 
exists, ignore error & overwrite anyway. |
   |                                          |               | (or shows "The 
feature is not supported" exception)    |
   
+------------------------------------------+---------------+--------------------------------------------------------+
   ```
   
   My issue was that the `Ignore` was following the same code-path as 
`Overwrite` in this situation - the contents of the storage location were not 
checked (as implemented in assertEmptyRootPath) and as a consequence, the 
caller considered safe to overwrite it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to