rymurr opened a new pull request #1783:
URL: https://github.com/apache/iceberg/pull/1783
I wanted to start a conversation about how to make `IcebergSource`
compatible with custom Iceberg Catalogs in both Spark 2 and 3.
This first attempt is meant as a straw man for how this could be done.
It borrows code from `LookupCatalog` to resolve a Catalog and Identifier from
a path, and then tries to extract an Iceberg table from it.
This works by specifying the catalog (`iceberg_catalog` below) as part of the
path, or by setting a default catalog in the Spark config:
`df.write.format("iceberg").mode("append").save("iceberg_catalog.testing.foo")`
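To make the lookup step concrete, here is a hedged, self-contained sketch of the kind of resolution `LookupCatalog` performs: split the dotted path, treat the first part as a catalog name if it matches a registered catalog, and otherwise fall back to a default. The class and method names below are hypothetical illustrations, not Iceberg or Spark APIs.

```java
import java.util.Arrays;
import java.util.Set;

public class CatalogPathParser {

    // Hypothetical result holder: a catalog name plus the remaining
    // identifier parts (e.g. namespace and table name).
    static final class CatalogAndIdentifier {
        final String catalog;
        final String[] identifier;

        CatalogAndIdentifier(String catalog, String[] identifier) {
            this.catalog = catalog;
            this.identifier = identifier;
        }
    }

    // Split a dotted path like "iceberg_catalog.testing.foo" into a catalog
    // name and a table identifier. If the first part does not name a known
    // catalog, the whole path is treated as an identifier in the default
    // catalog.
    static CatalogAndIdentifier parse(String path, Set<String> knownCatalogs,
                                      String defaultCatalog) {
        String[] parts = path.split("\\.");
        if (parts.length > 1 && knownCatalogs.contains(parts[0])) {
            return new CatalogAndIdentifier(
                parts[0], Arrays.copyOfRange(parts, 1, parts.length));
        }
        return new CatalogAndIdentifier(defaultCatalog, parts);
    }

    public static void main(String[] args) {
        Set<String> catalogs = Set.of("iceberg_catalog");
        CatalogAndIdentifier r =
            parse("iceberg_catalog.testing.foo", catalogs, "spark_catalog");
        // prints "iceberg_catalog -> testing.foo"
        System.out.println(r.catalog + " -> " + String.join(".", r.identifier));
    }
}
```

This is only an approximation; the real `LookupCatalog` logic in Spark also handles quoting and session-catalog fallback.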
Several problems:
1) We still have the fragile path check to delegate to the Hadoop catalog.
2) It is not strictly backwards compatible: one has to specify a catalog as
part of the path or configure a default catalog.
3) I am not 100% sure this will work for Spark 2.
4) It copies code from `LookupCatalog`; it would be nice to use that (or
something similar) directly, but that would put Scala code in the code path.
5) It is hard to pass parameters to the catalog the way we do now.
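For context on points 2 and 5, in Spark 3 a named catalog and its parameters can be wired up through the `spark.sql.catalog.<name>` config keys; the catalog name `iceberg_catalog` and the `type=hive` option below are illustrative, and Spark 2 has no equivalent mechanism:

```shell
spark-shell \
  --conf spark.sql.catalog.iceberg_catalog=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.iceberg_catalog.type=hive
```

Any extra properties under the same prefix would be passed to the catalog at initialization, which is one possible answer to the parameter-passing problem above.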
I am soliciting opinions on what might be a better way of doing custom
catalogs. Has anyone else been thinking about this?
cc @rdblue @jacques-n @laurentgo