rdblue commented on a change in pull request #1843:
URL: https://github.com/apache/iceberg/pull/1843#discussion_r533652664



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -165,6 +179,7 @@ public StagedTable stageCreate(Identifier ident, StructType schema, Transform[]
                                  Map<String, String> properties) throws TableAlreadyExistsException {
     Schema icebergSchema = SparkSchemaUtil.convert(schema);
     try {
+      // can't stage a hadoop table

Review comment:
       After thinking about this a little more, I think we will need to support 
some form of staging for Hadoop tables. Because this catalog implements the 
atomic operation mix-in, the staging calls will be used for all CTAS plans. 
Using `SupportsCatalogOptions` would mean that `save()` gets turned into a 
CTAS. So if we don't want the existing creates to fail, we have to support a 
staged table.
   
   We can do that in a couple of ways. First, we could create a table builder 
based on `HadoopCatalogTableBuilder` that supports a location. Second, we could 
reuse the fake staged table from the session catalog (for non-Iceberg tables). 
I'd prefer to create a builder that can construct the transactions for path 
tables. We could add it to `HadoopTables`.
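The preferred option (a builder that constructs create-table transactions for path tables, wrapped as a staged table) could look roughly like the sketch below. This is a minimal, self-contained illustration, not the PR's implementation: the Spark `StagedTable` and Iceberg `Transaction` interfaces are stubbed locally so it compiles on its own, and `StagedHadoopTable` plus the `HadoopTables.newCreateTableTransaction(...)` helper it would pair with are hypothetical names.

```java
// Stand-in for Iceberg's Transaction (only the call this sketch needs).
// In the real change this would be org.apache.iceberg.Transaction, produced
// by a new builder on HadoopTables (name hypothetical).
interface Transaction {
  void commitTransaction();
}

// Stand-in for Spark's org.apache.spark.sql.connector.catalog.StagedTable mix-in.
interface StagedTable {
  void commitStagedChanges();
  void abortStagedChanges();
}

// Wraps an uncommitted create-table transaction so CTAS plans can stage a
// path (Hadoop) table: commit applies the transaction atomically; abort
// simply drops it, since nothing was written to the table location yet.
class StagedHadoopTable implements StagedTable {
  private final Transaction txn;

  StagedHadoopTable(Transaction txn) {
    this.txn = txn;
  }

  @Override
  public void commitStagedChanges() {
    txn.commitTransaction();
  }

  @Override
  public void abortStagedChanges() {
    // no-op: an uncommitted create transaction leaves no table behind
  }
}
```

The point of the wrapper is that `stageCreate` can return it for path tables instead of throwing, so existing `save()` calls that Spark rewrites into CTAS keep working.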




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


