rymurr opened a new pull request #1783:
URL: https://github.com/apache/iceberg/pull/1783
I wanted to start a conversation about how to make `IcebergSource`
compatible with custom Iceberg Catalogs in both Spark 2 and 3.
This first attempt is meant as a straw man for how this could be done.
It borrows code from `LookupCatalog` to resolve a Catalog and Identifier from
a path, and then tries to extract an Iceberg table from it.
This works by specifying the catalog (`iceberg_catalog` below) as part of the
path, or by setting a default catalog in the Spark config:
`df.write.format("iceberg").mode("append").save("iceberg_catalog.testing.foo")`
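To make the lookup step concrete, here is a hedged, self-contained sketch of the kind of resolution `LookupCatalog` performs: split the dotted path, treat the first part as a catalog name if it matches a registered catalog, and otherwise fall back to a default. The class and method names below are hypothetical illustrations, not Iceberg or Spark APIs.

```java
import java.util.Arrays;
import java.util.Set;

public class CatalogPathParser {

    // Hypothetical result holder: a catalog name plus the remaining
    // identifier parts (e.g. namespace and table name).
    static final class CatalogAndIdentifier {
        final String catalog;
        final String[] identifier;

        CatalogAndIdentifier(String catalog, String[] identifier) {
            this.catalog = catalog;
            this.identifier = identifier;
        }
    }

    // Split a dotted path like "iceberg_catalog.testing.foo" into a catalog
    // name and a table identifier. If the first part does not name a known
    // catalog, the whole path is treated as an identifier in the default
    // catalog.
    static CatalogAndIdentifier parse(String path, Set<String> knownCatalogs,
                                      String defaultCatalog) {
        String[] parts = path.split("\\.");
        if (parts.length > 1 && knownCatalogs.contains(parts[0])) {
            return new CatalogAndIdentifier(
                parts[0], Arrays.copyOfRange(parts, 1, parts.length));
        }
        return new CatalogAndIdentifier(defaultCatalog, parts);
    }

    public static void main(String[] args) {
        Set<String> catalogs = Set.of("iceberg_catalog");
        CatalogAndIdentifier r =
            parse("iceberg_catalog.testing.foo", catalogs, "spark_catalog");
        // prints "iceberg_catalog -> testing.foo"
        System.out.println(r.catalog + " -> " + String.join(".", r.identifier));
    }
}
```

This is only an approximation; the real `LookupCatalog` logic in Spark also handles quoting and session-catalog fallback.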
Several problems:
1) We still have the fragile path check to delegate to the Hadoop catalog.
2) It is not strictly backwards compatible: one has to specify a catalog as
part of the path or configure a default catalog.
3) I am not 100% sure this will work for Spark 2.
4) It copies code from `LookupCatalog`; it would be nice to use that (or
something similar) directly, but that would put Scala code in the code path.
5) It is hard to pass parameters to the catalog the way we do now.
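For context on points 2 and 5, in Spark 3 a named catalog and its parameters can be wired up through the `spark.sql.catalog.<name>` config keys; the catalog name `iceberg_catalog` and the `type=hive` option below are illustrative, and Spark 2 has no equivalent mechanism:

```shell
spark-shell \
  --conf spark.sql.catalog.iceberg_catalog=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.iceberg_catalog.type=hive
```

Any extra properties under the same prefix would be passed to the catalog at initialization, which is one possible answer to the parameter-passing problem above.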
I am soliciting opinions on what might be a better way of doing custom
catalogs. Has anyone else been thinking about this?
cc @rdblue @jacques-n @laurentgo