[GitHub] [spark] AngersZhuuuu opened a new pull request #35770: [SPARK-38449][SQL] Avoid call createTable when ignoreIfExists=true and table exists

GitBox Tue, 08 Mar 2022 05:36:16 -0800


AngersZhuuuu opened a new pull request #35770:
URL: https://github.com/apache/spark/pull/35770



   ### What changes were proposed in this pull request?
   In the current v2 code, when the table exists and `ignoreIfExists` = true, Spark does nothing:
   ```
   case class CreateTableExec(
       catalog: TableCatalog,
       identifier: Identifier,
       tableSchema: StructType,
       partitioning: Seq[Transform],
       tableSpec: TableSpec,
       ignoreIfExists: Boolean) extends LeafV2CommandExec {
     import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
   
     val tableProperties = CatalogV2Util.convertTableProperties(tableSpec)
   
     override protected def run(): Seq[InternalRow] = {
       if (!catalog.tableExists(identifier)) {
         try {
           catalog.createTable(
             identifier, tableSchema, partitioning.toArray, tableProperties.asJava)
         } catch {
           case _: TableAlreadyExistsException if ignoreIfExists =>
             logWarning(s"Table ${identifier.quoted} was created concurrently. Ignoring.")
         }
       } else if (!ignoreIfExists) {
         throw QueryCompilationErrors.tableAlreadyExistsError(identifier)
       }
   
       Seq.empty
     }
   }
   ```
   
   In the current v1 code, however, `externalCatalog.createTable()` is still called in this case.
   
   The current `InMemoryCatalog.createTable()` does not even contain code to handle concurrent create-table requests.
   So v1 can handle this case the same way v2 does: when the table exists and `ignoreIfExists` = true, we can simply do nothing.
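   
   As a rough illustration of the intended behavior, here is a minimal, self-contained sketch. It does not use Spark's actual classes: `ToyCatalog` and `CreateTableSketch` are hypothetical stand-ins, and `IllegalStateException` stands in for `TableAlreadyExistsException`. It only shows the control flow, which mirrors the v2 code quoted above: check for the table first and skip the catalog call when it already exists and `ignoreIfExists` = true.
   
   ```scala
   import scala.collection.mutable
   
   // Hypothetical stand-in for an external catalog; NOT Spark's ExternalCatalog API.
   final class ToyCatalog {
     private val tables = mutable.Set.empty[String]
     var createCalls = 0 // counts how often createTable is actually invoked
   
     def tableExists(name: String): Boolean = tables.contains(name)
   
     def createTable(name: String): Unit = {
       createCalls += 1
       // mutable.Set.add returns false when the element was already present.
       if (!tables.add(name)) {
         throw new IllegalStateException(s"Table $name already exists")
       }
     }
   }
   
   object CreateTableSketch {
     // Check first, and call the catalog only when the table is absent.
     def createTable(catalog: ToyCatalog, name: String, ignoreIfExists: Boolean): Unit = {
       if (!catalog.tableExists(name)) {
         try catalog.createTable(name)
         catch {
           // Another caller created the table between the check and the create.
           case _: IllegalStateException if ignoreIfExists => ()
         }
       } else if (!ignoreIfExists) {
         throw new IllegalStateException(s"Table $name already exists")
       }
       // Table exists and ignoreIfExists is true: nothing to do, no catalog call.
     }
   }
   ```
   
   With this shape, a second `CreateTableSketch.createTable(catalog, "t", ignoreIfExists = true)` leaves `createCalls` unchanged, which is the saving this PR aims for when the catalog is backed by a remote metastore.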
   
   
   ### Why are the changes needed?
   Avoid unnecessary `createTable` calls, especially calls to the Hive metastore.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   WIP
   


