+1

Cheers,
Murtadha
________________________________
From: Ian Maxon <[email protected]>
Sent: Friday, September 12, 2025 3:29:14 AM
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] APE 25: Improved Apache Iceberg support

Agreed. +1 from me as well.

On Mon, Aug 11, 2025 at 3:17 PM Mike Carey <[email protected]> wrote:
>
> Great questions and discussion and clarifications. Good-looking APE at
> this point, IMO!
>
> On 8/11/25 2:05 PM, Hari Kishore Chaparala wrote:
> > That makes sense. Thanks for the clarification, Hussain.
> >
> > On Mon, Aug 11, 2025 at 8:40 AM Hussain Towaileb<[email protected]>
> > wrote:
> >
> >> Hello Hari
> >>
> >> 1. Catalog persistence:
> >> Yes, once a catalog is created, it is persisted. It is a metadata entity,
> >> just like a collection, so it is permanent. Tables will be created on
> >> those catalogs, so unless they are stored, a customer would need to
> >> re-create the catalog each time, which is not practical. As for the
> >> credentials part, it depends on what type of credentials they use. If
> >> they are permanent ones, then it is fine. If they use temporary
> >> credentials that expire, there are 2 types of those:
> >> - Passing keys + session token: yes, if they expire, they will need
> >> updating, which we don't support, so they will have to re-create the
> >> catalog with the new credentials.
> >> - Passing trust account authentication: this mechanism has temporary
> >> credentials that refresh automatically. You can see this APE for more
> >> details:
> >>
> >> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/ASTERIXDB/APE*16*3A*Cross-Account*Trust*Authentication*for*AWS*S3*External*Collections__;KyUrKysrKysrKw!!CzAuKJ42GuquVTTmVmPViYEvSg!LNqPhf1prMmZ41ZjrV2H3GMj25fFDAmeqSSCuS5SDAm6vC1r9e1nG7oUpn6NT7sgPqBndN7pCZS4HnOy0A$
> >>
> >> 2. Table referencing:
> >> This still has room for discussion, but the idea I had in mind is that
> >> the namespace would be in the WITH clause.
> >> This is to avoid breaking/confusing
> >> things, as "tables" are actually "external collections", and if you say
> >> a.b.c, then you are talking about a "collection" in a "database".
> >> The current planned behavior (again, open for discussion):
> >> Say you create your catalog:
> >> "CREATE CATALOG myCatalog .... WITH {"namespace": "my.name.space", ...}"
> >> This would make all collections (tables) created on this catalog default
> >> to the "my.name.space" namespace.
> >>
> >> So if I create a collection:
> >> "CREATE EXTERNAL COLLECTION myTable ON myCatalog .... WITH {"table-name": "users", ...}"
> >>
> >> Then this table resides at "catalog-warehouse-path/my/name/space/users"
> >>
> >> If, however, I would like to create a collection in a namespace other
> >> than the default, we can set that property, which will override the
> >> catalog's default namespace:
> >> "CREATE EXTERNAL COLLECTION myTable ON myCatalog .... WITH {"table-name": "users", "namespace": "my.other.name.space"}"
> >>
> >> And now this collection would be at
> >> "catalog-warehouse-path/my/other/name/space/users"
> >>
> >> 3. Splitting APE into 2:
> >> Currently, I find the two topics tightly coupled with each other, so it
> >> would be more convenient to have them together for context. But if it
> >> gets too large or too confusing, I don't think there is harm in
> >> splitting them.
> >> Also note that creating a table uses the same syntax as creating an
> >> external collection; it just takes an extra property, "table-type":
> >> "iceberg", to differentiate it from a normal external collection.
> >>
> >>
> >> On Mon, Aug 11, 2025 at 4:18 AM Hari Kishore Chaparala<[email protected]>
> >> wrote:
> >>
> >>> Thanks for the improvement. The syntax looks much more intuitive than in
> >>> Spark, where the catalog has to be configured with the "spark.sql" prefix
> >>> (even for DataFrame operations), which can be confusing, especially when
> >>> working with multiple catalogs.
> >>>
> >>> A few questions on the new CATALOG entity:
> >>>
> >>> 1. Catalog persistence: When we run "*CREATE CATALOG myRestCatalog*" with
> >>> configuration options, will the catalog be stored and persisted beyond
> >>> the current session? In Spark and other engines, the catalog
> >>> implementation and configuration usually last only for the active
> >>> session. Since we are querying external tables, I’m not sure if storing
> >>> catalog details is necessary. Also, AWS roles and STS credentials expire
> >>> after some time, which would require catalog updates.
> >>>
> >>> 2. Table referencing: How do we plan to reference tables? Will it be a
> >>> three-part notation, which might be clearer when working across multiple
> >>> catalogs?
> >>> For example:
> >>>
> >>> SELECT * FROM glue_catalog.namespace1.iceberg_table1 A
> >>> INNER JOIN unity_catalog.namespace2.delta_lake_table1 B ON A.id = B.id;
> >>>
> >>> 3. It looks like this APE proposes two features: 1. The new CATALOG
> >>> entity 2. DQL and DDL support for Iceberg tables using various catalog
> >>> implementations. Would it make sense to split these into separate APEs?
> >>>
> >>> Thanks
> >>> Hari Kishore
> >>>
> >>> On Fri, Aug 8, 2025 at 9:39 AM Hussain Towaileb<[email protected]>
> >>> wrote:
> >>>
> >>>> Initiating discussion for adding improved support for Apache Iceberg
> >>>> Feature: *Improved Support for Apache Iceberg*
> >>>> Details: Apache Iceberg would provide support for reading Iceberg
> >>>> tables. This APE discusses improving the current Apache Iceberg
> >>>> support by introducing the Catalog entity to AsterixDB Metadata, adding
> >>>> support for different types of Iceberg catalogs, and introducing other
> >>>> features like time travel.
> >>>>
> >>>> APE:
> >>>>
> >>>> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/ASTERIXDB/APE*25*3A*Apache*Iceberg*Support__;KyUrKys!!CzAuKJ42GuquVTTmVmPViYEvSg!J81u58s9FyWRyedF8qV0TL-QjZrvS9vCviVHuCte1wGJ-y3qgzG087UwfC-ii0LKkEI3c5Iw7CG6yxA62A$
> >>>> --
> >>>> Regards,
> >>>> Hussain Towaileb
> >>>>
> >>
> >> --
> >> Regards,
> >> Hussain Towaileb
> >>
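
For reference, the namespace behavior described under "2. Table referencing" in the thread can be sketched roughly as follows. This is only an illustration of the planned (and still open for discussion) rule that a collection-level "namespace" overrides the catalog's default; the function name resolve_table_path and its parameters are hypothetical and are not part of AsterixDB or the APE.

```python
from typing import Optional


def resolve_table_path(
    warehouse_path: str,
    table_name: str,
    catalog_namespace: str,
    collection_namespace: Optional[str] = None,
) -> str:
    """Hypothetical helper: build the table location from the thread's rule.

    The collection-level namespace (from the collection's WITH clause), if
    given, overrides the catalog's default namespace. Dots in the namespace
    become path separators under the catalog warehouse path.
    """
    namespace = collection_namespace or catalog_namespace
    parts = [warehouse_path.rstrip("/")] + namespace.split(".") + [table_name]
    return "/".join(parts)


# Catalog default namespace only, as in the first CREATE EXTERNAL COLLECTION example.
default_path = resolve_table_path(
    "catalog-warehouse-path", "users", "my.name.space"
)

# Collection-level override, as in the second example.
override_path = resolve_table_path(
    "catalog-warehouse-path", "users", "my.name.space", "my.other.name.space"
)
```

With the values from the thread, default_path comes out as "catalog-warehouse-path/my/name/space/users" and override_path as "catalog-warehouse-path/my/other/name/space/users", matching the two locations quoted above.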
