[GitHub] [iceberg] rdblue opened a new pull request #1183: Support atomic CTAS and RTAS with SparkSessionCatalog

GitBox Wed, 08 Jul 2020 10:28:13 -0700


rdblue opened a new pull request #1183:
URL: https://github.com/apache/iceberg/pull/1183



   This adds support for atomic CTAS and RTAS commands when using 
SparkSessionCatalog in Spark 3.
   
   If a TableCatalog in Spark 3 implements StagingTableCatalog, Spark uses the 
staging table methods for all CTAS/RTAS operations, assuming that every table 
in the catalog supports the same capabilities. Iceberg tables support atomic 
operations, but tables loaded by the wrapped session catalog do not. For those 
tables, the workaround is to mimic Spark's non-atomic behavior: create the 
table immediately, use it for the write, and roll back by dropping the table.
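
   The rollback pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `SimpleCatalog`, `Staged`, `RollbackStaged`, `InMemoryCatalog`, and the `stageCreate` helper are hypothetical stand-ins, loosely modeled on the commit/abort hooks of Spark's `StagedTable`, so that the example runs without Spark on the classpath.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-ins for Spark's catalog interfaces (not the real API).
final class StagingSketch {

  /** Minimal non-atomic catalog, standing in for the wrapped session catalog. */
  interface SimpleCatalog {
    void createTable(String name);
    boolean dropTable(String name);
    boolean tableExists(String name);
  }

  /** Staged table with commit and abort hooks. */
  interface Staged {
    void commitStagedChanges();
    void abortStagedChanges();
  }

  /** Wraps a table that has already been created; abort rolls back by dropping it. */
  static final class RollbackStaged implements Staged {
    private final SimpleCatalog catalog;
    private final String name;

    RollbackStaged(SimpleCatalog catalog, String name) {
      this.catalog = catalog;
      this.name = name;
    }

    @Override
    public void commitStagedChanges() {
      // Nothing to do: the table already exists and the write went directly to it.
    }

    @Override
    public void abortStagedChanges() {
      // Non-atomic rollback: remove the table that was created up front.
      catalog.dropTable(name);
    }
  }

  /** Non-atomic "staging": create the table immediately and return a rollback wrapper. */
  static Staged stageCreate(SimpleCatalog catalog, String name) {
    catalog.createTable(name);
    return new RollbackStaged(catalog, name);
  }

  /** Toy catalog that tracks table names in memory, for demonstration only. */
  static final class InMemoryCatalog implements SimpleCatalog {
    private final Set<String> tables = new HashSet<>();

    @Override
    public void createTable(String name) {
      if (!tables.add(name)) {
        throw new IllegalStateException("Table already exists: " + name);
      }
    }

    @Override
    public boolean dropTable(String name) {
      return tables.remove(name);
    }

    @Override
    public boolean tableExists(String name) {
      return tables.contains(name);
    }
  }
}
```

   In this sketch the table is visible to other readers as soon as `stageCreate` returns, which is what makes the approach non-atomic: a failed write is cleaned up afterwards instead of never becoming visible.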
   
   This PR doesn't contain new tests because the session catalog in Spark 3 
does not work with v2 tables. It will always return a 
[`V1Table`](https://github.com/apache/spark/blob/v3.0.0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala#L72).
 Because a v1 table is always returned, there are no code paths that will load 
non-Iceberg tables using the session catalog.


