> I'm a little concerned about using the REST spec as a means to force portability on implementations. I feel that level of requirement could result in a reluctance to provide interoperability which would limit access to data or normalize non-compliance with the spec. Ultimately, I feel user demand will drive the goals of openness and portability, which is a trend we see across the ecosystem and continues to drive interest in open formats and standards.
If we feel strongly about this, we could define the deregister/export operation as an optional endpoint. Catalog implementations could choose whether to support it, allowing users to make informed decisions based on feature availability when selecting a catalog. Once the feature becomes broadly adopted, we could move the endpoint into the set of required table endpoints.

Daniel Weeks <[email protected]> wrote (on Wed, Jan 28, 2026, 20:38):

> I think there's good reason to consider a "deregister" or "export" like functionality given that there isn't a clear path to hand off ownership of a table between catalogs. This is a slightly different motivation for similar functionality, but shares the same underlying goal of improving portability.
>
> Even without this, there are ways to capture the metadata (e.g. persist the json response and use that as the metadata reference for registering), so I don't think the absence of a physical json file is really a blocker. We originally wanted to preserve the physical representation to both adhere to the spec language regarding how commits are effected and to ensure access for older clients that do not support the REST Catalog. At this point, REST support is nearing ubiquity and the metadata representation is still available in some form (though less convenient for direct file reference).
>
> I'm a little concerned about using the REST spec as a means to force portability on implementations. I feel that level of requirement could result in a reluctance to provide interoperability which would limit access to data or normalize non-compliance with the spec. Ultimately, I feel user demand will drive the goals of openness and portability, which is a trend we see across the ecosystem and continues to drive interest in open formats and standards.
>
> -Dan
>
> On Wed, Jan 28, 2026 at 7:55 AM Russell Spitzer <[email protected]> wrote:
>
>>> Prior to the introduction of CATALOG_ONLY tables, reading a table implicitly required that the full table metadata be accessible to readers. This made it possible to migrate a table between catalog implementations by simply pointing */v1/{prefix}/namespaces/{namespace}/register* to the existing metadata.json, assuming the appropriate user privileges were in place.
>>
>> This actually hasn’t been the case for quite a while across several vendors (though not the one I work at; we still expose full metadata). Nothing prevents vendors from shipping Iceberg metadata that does not strictly represent the table, and in fact several already do. Properties, snapshots, or even the table itself can redirect to another representation of the same table, leaving no way to recover a true “ground truth” view via the REST API. I’m also aware of folks shipping different versions of the metadata or exposing what is essentially a read-only metadata.json layered on top of a table in another format. So I think the ship has largely sailed on relying on metadata as a guaranteed canonical view.
>>
>> I do think it’s still important to preserve *portability*, or at least to make it clear to end users whether or not their tables will be portable. With that in mind, I was wondering if we should introduce an explicit catalog export command that is essentially the inverse of register. Unlike loadTable, it would be required to produce the path of a metadata.json that represents the entire Iceberg table without modification.
>>
>> That would give catalogs a clear way to signal whether they support “unregistering” a table in a way that lets it be used in another system. We could also scope permissions for this functionality so that only specific users are allowed to perform an export.
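
For readers less familiar with the register side of this discussion, the following is a minimal sketch of how a client might build a register call against the `/v1/{prefix}/namespaces/{namespace}/register` path mentioned above. The helper name, the example host, and the single-level namespace are assumptions made for illustration; the request body uses the `name` and `metadata-location` fields of the REST catalog's register request. No export endpoint is shown, because none exists in the spec yet and its shape is still under discussion.

```python
import json
from urllib.parse import quote


def build_register_request(base_url, prefix, namespace, table_name, metadata_location):
    """Build the URL and JSON body for registering an existing table.

    Hypothetical helper: it only assembles the request and sends nothing.
    It assumes a single-level namespace, so no multi-part namespace
    encoding is attempted.
    """
    url = (
        f"{base_url.rstrip('/')}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{quote(namespace, safe='')}/register"
    )
    body = json.dumps(
        {
            "name": table_name,
            # Points the new catalog at the existing metadata.json.
            "metadata-location": metadata_location,
        }
    )
    return url, body


# Example with made-up values.
url, body = build_register_request(
    "https://catalog.example.com",
    "prod",
    "analytics",
    "events",
    "s3://bucket/warehouse/analytics/events/metadata/00042-abc.metadata.json",
)
```

The migration concern in the thread is visible here: the call only works if some `metadata-location` that the caller can reference actually exists.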
>>
>> On Wed, Jan 28, 2026 at 5:42 AM Péter Váry <[email protected]> wrote:
>>
>>> > I am not sure about the concern for lock-in. Users are free to adopt any catalog that is spec compliant. Catalog-only tables are not the choices of the catalog vendor/provider, it is the choice of the table owner by users for access control.
>>>
>>> Prior to the introduction of CATALOG_ONLY tables, reading a table implicitly required that the full table metadata be accessible to readers. This made it possible to migrate a table between catalog implementations by simply pointing */v1/{prefix}/namespaces/{namespace}/register* to the existing metadata.json, assuming the appropriate user privileges were in place.
>>>
>>> With CATALOG_ONLY tables, this implicit requirement is removed, and no alternative requirement is introduced. As a result, migrating the complete history of a table may become impossible without performing a manual traversal of the plan(s) and metadata.
>>>
>>> What I am suggesting is that the ability to re-register an Iceberg table with a different catalog should be an explicit requirement for a spec-compliant catalog.
>>>
>>> > Also this proposal doesn't say that the write path shouldn't produce the metadata.json file, which is still required today to be spec compliant.
>>>
>>> The Iceberg table specification describes metadata.json and manifest files, but after this change a catalog could be fully compliant with the Iceberg REST Catalog specification while still not exposing these files in a way that is accessible to users. This would effectively prevent use cases such as migrating tables between catalogs.
>>>
>>> Steven Wu <[email protected]> wrote (on Mon, Jan 26, 2026, 20:33):
>>>
>>>> catching up on this thread.
>>>>
>>>> I am not sure about the concern for lock-in. Users are free to adopt any catalog that is spec compliant.
Catalog-only tables are not the choices >>>> of the catalog vendor/provider, it is the choice of the table owner by >>>> users for access control. >>>> >>>> Also this proposal doesn't say that the write path shouldn't produce >>>> the metadata.json file, which is still required today to be spec compliant. >>>> It is just that clients may not need to load the metadata.json (and >>>> manifest list, manifest files) directly for client-side scan planning. >>>> >>>> I also like Dan's suggestion of not including client preference/config >>>> in the spec. >>>> >>>> > I want to highlight that introducing "CATALOG_ONLY" planners >>>> implicitly creates a new requirement for all compliant engines. Without >>>> support for this, engines would be unable to read these new tables. This >>>> seems like a significant change that we should call out explicitly. >>>> >>>> Agree with Peter that this is a significant new requirement for >>>> engines. Iceberg libraries (Java or other languages) can probably hide it >>>> internally in the scan planning implementation. Some engines may not use >>>> Iceberg libraries. This would be a new requirement. 
>>>> >>>> >>>> >>>> On Tue, Jan 20, 2026 at 4:55 PM Prashant Singh < >>>> [email protected]> wrote: >>>> >>>>> Thank you Peter, I will go ahead and find a slot that works for most >>>>> of the folks interested in the discussion and put it in dev calendar ~ >>>>> >>>>> Regarding Agenda : I would request to keep the discussion contained in >>>>> context of what does this mean to have a mode of planning like >>>>> catalog_only >>>>> its use cases >>>>> and side effects, for example READ only tables is something that can >>>>> be done as of today, infacts folks use this in production, for example: >>>>> tools such as Apache Xtable (incubating) or Uniform where one generates >>>>> iceberg metadata on top of >>>>> existing data files, having CATALOG_ONLY doesn't change much except >>>>> the fact that now that fake metadata doesn't need to be written, but it >>>>> was >>>>> fake in the first place as an iceberg client didn't generate it and >>>>> catalog >>>>> is already fully capable of doing that. >>>>> >>>>> With that being said, I will definitely put all your suggestions on >>>>> the agenda, let's discuss this more in depth, to understand the feedback >>>>> better. I also wanna include the types of mode discussion. Maybe we should >>>>> just keep client_only and catalog_only for now ? since preference is too >>>>> much for the first phase ? >>>>> >>>>> Please let me circle back with concrete time, meeting links etc, i >>>>> will post it here ! >>>>> >>>>> Best, >>>>> Prashant Singh >>>>> >>>>> On Sat, Jan 17, 2026 at 11:28 PM Péter Váry < >>>>> [email protected]> wrote: >>>>> >>>>>> Hi Prashant, >>>>>> >>>>>> I agree that having a dedicated sync makes a lot of sense. I’d >>>>>> suggest the following agenda items: >>>>>> >>>>>> 1. *Read-only tables* >>>>>> During the early discussions around the File Format API, I suggested >>>>>> starting with the read path, as this would allow us to integrate new data >>>>>> sources more quickly. 
At the time, there were strong objections, with the >>>>>> argument that every Iceberg table should be fully readable and writable >>>>>> through Iceberg in order to be considered a “real” Iceberg table. I’m >>>>>> interested to understand whether this position has changed since then. >>>>>> >>>>>> 2. *Table migration* >>>>>> I see clear benefits in generating table metadata on the fly (e.g., >>>>>> easier integration with fast-changing systems, stricter security models, >>>>>> and potential performance gains). My concern is that, if we allow this >>>>>> without constraints, a fully compliant Iceberg catalog could choose not >>>>>> to >>>>>> materialize metadata at all. This would make migration to another >>>>>> compliant >>>>>> Iceberg catalog much harder. Openness and easy migration are major >>>>>> selling >>>>>> points of Iceberg, and I think we should continue to enforce those >>>>>> values. >>>>>> >>>>>> 3. *Engine compatibility* >>>>>> I want to highlight that introducing "CATALOG_ONLY" planners >>>>>> implicitly creates a new requirement for all compliant engines. Without >>>>>> support for this, engines would be unable to read these new tables. This >>>>>> seems like a significant change that we should call out explicitly. >>>>>> >>>>>> 4. *CATALOG_ONLY tables* >>>>>> If we reach agreement on the points above, I think the decision on >>>>>> this topic will naturally follow. >>>>>> >>>>>> My current perspective on these topics: >>>>>> >>>>>> 1. *Read-only tables* >>>>>> I like this idea, as it would allow Iceberg catalogs to more easily >>>>>> expose external databases such as Delta, Lance, and others. My main >>>>>> hesitation is that I’ve proposed this before and it was strongly rejected >>>>>> by the community. >>>>>> >>>>>> 2. 
*Table migration* >>>>>> My concern is that we may be taking incremental steps away from >>>>>> Iceberg’s original position of full compliance, easy migration, and broad >>>>>> compatibility, toward a more closed, catalog-bounded model. I’d like us >>>>>> to >>>>>> step back and clearly define our core values, then enforce them in the >>>>>> specification. This could be as simple as a few sentences in the >>>>>> "LoadTableResponse" description requiring a way (for some users) to >>>>>> obtain >>>>>> the full metadata JSON along with the corresponding manifest and data >>>>>> files, or perhaps a dedicated migration endpoint that allows one catalog >>>>>> to >>>>>> take over a table from another. >>>>>> >>>>>> 3. *Engine compatibility* >>>>>> I have the sense that this “small” enum change actually introduces a >>>>>> fairly large new requirement for engines, and I want to make sure we >>>>>> explicitly highlight that. >>>>>> >>>>>> 4. *CATALOG_ONLY tables* >>>>>> As above, I think our answers to the earlier questions will >>>>>> effectively determine our position here. >>>>>> >>>>>> Overall, I like your proposal, but in a few areas it seems to move us >>>>>> in a different direction from what we previously agreed on. I’d like to >>>>>> understand whether the community is aligned with this new direction. >>>>>> >>>>>> Thanks, >>>>>> Peter >>>>>> >>>>>> >>>>>> On Thu, Jan 15, 2026, 20:34 Prashant Singh <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Thank you for the discussion everyone, >>>>>>> really appreciate all of you taking time ! >>>>>>> >>>>>>> Unfortunately we were not able to discuss this in the catalog sync >>>>>>> this week, since we ran out of time, I was wondering if all the >>>>>>> interested >>>>>>> folks would be open to a discussion. >>>>>>> I can go ahead and request one in the iceberg calendar. 
>>>>>>> >>>>>>> Peter : >>>>>>> >>>>>>> > With the introduction of CATALOG_ONLY tables, storing Iceberg >>>>>>> metadata files is no longer required for any operation >>>>>>> >>>>>>> I am not sure if i fully get the concern here, the client still >>>>>>> writes the manifests and manifest lists to the tables which are given to >>>>>>> the catalog where it creates / tracks the metadata.json, for writes we >>>>>>> need >>>>>>> to have hold of these manifests specially for cases such as validating >>>>>>> no >>>>>>> new data has been inserted to the table (conflict detection) >>>>>>> please ref validateAddedDataFiles [1], this can't be achieved by >>>>>>> scan planning at least not without breaking the existing iceberg >>>>>>> clients as >>>>>>> these validations are client side based on the isolation level, which >>>>>>> would >>>>>>> make these tables unusable with client if we want to write. >>>>>>> >>>>>>> For the tables which are read only, I am not sure if those tables >>>>>>> are sufficient for enforcing vendor lock in, in addition to what can be >>>>>>> achieved today, I believe this would be circumvented though if we >>>>>>> clarify / >>>>>>> tighten the metadata location expectation in the spec, that it should >>>>>>> exactly state the state of the table as committed by clients >>>>>>> i.e it should have precise references to the manifest and manifest >>>>>>> list that the client created ? >>>>>>> >>>>>>> With that being said, I request everyone interested in this thread >>>>>>> please let me know if you all are open for a dedicated community >>>>>>> discussion >>>>>>> for this, happy to brainstorm together and reach consensus. 
>>>>>>> >>>>>>> [1] >>>>>>> https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L377 >>>>>>> >>>>>>> Best, >>>>>>> Prashant Singh >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Jan 14, 2026 at 7:38 AM Péter Váry < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Hi Dan, >>>>>>>> >>>>>>>> > While it is possible and may feel like it would prevent >>>>>>>> interoperability, that would be easily circumvented by just copying the >>>>>>>> entire contents of the table through scan/plan. >>>>>>>> >>>>>>>> This enables the user to recreate a snapshot of the table, but it >>>>>>>> does not provide the full history or complete table metadata. It is >>>>>>>> also >>>>>>>> significantly more involved than simply calling the register table >>>>>>>> operation. >>>>>>>> >>>>>>>> > REST Catalog implementations have always been able to restrict >>>>>>>> access to physical storage regardless of whether a client could load >>>>>>>> the >>>>>>>> table metadata or not. >>>>>>>> >>>>>>>> Previously, this was primarily a matter of gaining access to the >>>>>>>> underlying storage. With the introduction of CATALOG_ONLY tables, >>>>>>>> storing >>>>>>>> Iceberg metadata files is no longer required for any operation. >>>>>>>> >>>>>>>> > there are lots of different ways closed systems can restrict >>>>>>>> access already (e.g. jdbc only or proprietary APIs), so I don't feel >>>>>>>> like >>>>>>>> this is changing that dynamic. >>>>>>>> >>>>>>>> I’m not sure I understand this. Could you please provide more >>>>>>>> details? >>>>>>>> >>>>>>>> The goal, as I understand it, is that if a Catalog implements the >>>>>>>> Iceberg specification, migration to and from this Catalog should be >>>>>>>> possible with any other Catalog that adheres to the same specification. >>>>>>>> Introducing CATALOG_ONLY tables, however, feels like another step away >>>>>>>> from >>>>>>>> interoperability. 
>>>>>>>>
>>>>>>>> > I think the motivation behind catalog only mode is more for cases where the underlying data is either in a different representation or is being adapted on-the-fly. For example, if you wanted to expose a table from a database that can export data to parquet, but doesn't natively support Iceberg as a format, you can hide that behind scan plan interfaces.
>>>>>>>>
>>>>>>>> Using the Scan Planning interface has been optional until now, but with the introduction of CATALOG_ONLY tables, it becomes mandatory. As a result, compliant engines will need to implement it.
>>>>>>>>
>>>>>>>> > There may not be a full representation of the table metadata but using a subset of Iceberg primitives, you can still achieve interoperability (at least for read).
>>>>>>>>
>>>>>>>> In earlier discussions, we agreed that tables should not implement only a subset of the Iceberg specification. This proposal seems to move in a different direction. While I’m not opposed to the feature and recognize the benefits of integrating non-Iceberg tables into Iceberg catalogs and making them queryable by compatible engines, I believe it would be useful to clarify our current understanding of the boundaries and the level of feature parity we aim to maintain. Establishing this would provide a consistent framework for evaluating similar proposals going forward.
>>>>>>>>
>>>>>>>> This seems like a good candidate for today’s catalog sync discussion.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Peter
>>>>>>>>
>>>>>>>> Daniel Weeks <[email protected]> wrote (on Wed, Jan 14, 2026, 0:23):
>>>>>>>>
>>>>>>>>> I don't feel we should be too concerned about catalogs switching to a "catalog only" mode and not providing direct access.
While it is >>>>>>>>> possible and may feel like it would prevent interoperability, that >>>>>>>>> would be >>>>>>>>> easily circumvented by just copying the entire contents of the table >>>>>>>>> through scan/plan. I wouldn't agree there was implied access just by >>>>>>>>> having a metadata-location field either. REST Catalog >>>>>>>>> implementations have >>>>>>>>> always been able to restrict access to physical storage regardless of >>>>>>>>> whether a client could load the table metadata or not. I understand >>>>>>>>> the >>>>>>>>> concern about lock-in, but there are lots of different ways closed >>>>>>>>> systems >>>>>>>>> can restrict access already (e.g. jdbc only or proprietary APIs), so I >>>>>>>>> don't feel like this is changing that dynamic. >>>>>>>>> >>>>>>>>> I think the motivation behind catalog only mode is more for cases >>>>>>>>> where the underlying data is either in a different representation or >>>>>>>>> is >>>>>>>>> being adapted on-the-fly. For example, if you wanted to expose a >>>>>>>>> table >>>>>>>>> from a database that can export data to parquet, but doesn't natively >>>>>>>>> support Iceberg as a format, you can hide that behind scan plan >>>>>>>>> interfaces. There may not be a full representation of the table >>>>>>>>> metadata >>>>>>>>> but using a subset of Iceberg primitives, you can still achieve >>>>>>>>> interoperability (at least for read). >>>>>>>>> >>>>>>>>> Introducing modes just is a way to express the intent/availability >>>>>>>>> for the scan plan and coordinate between the client and server, but I >>>>>>>>> don't >>>>>>>>> think it really affects whether a client could be prevented from >>>>>>>>> reading >>>>>>>>> table data directly (a catalog can do that regardless). >>>>>>>>> >>>>>>>>> I would add that I don't think the spec should include anything >>>>>>>>> about the client modes (I added a comment to the PR on this). 
The >>>>>>>>> spec >>>>>>>>> should only indicate what the server can return and what the >>>>>>>>> expectations >>>>>>>>> should be for a client. What a client implements and what >>>>>>>>> configurations >>>>>>>>> it exposes is more of a client-side implementation detail and should >>>>>>>>> not be >>>>>>>>> part of the spec. >>>>>>>>> >>>>>>>>> >>>>>>>>> -Dan >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Jan 13, 2026 at 11:07 AM Prashant Singh < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> Hello Peter, >>>>>>>>>> Thank you for the feedback. >>>>>>>>>> >>>>>>>>>> IIUC, you mean to say an interpretation, could be a dummy file >>>>>>>>>> which would in worst case simply not exist ? sure i believe we can be >>>>>>>>>> explicit there to avoid this. >>>>>>>>>> Note: this is predating this proposal though and happy to take a >>>>>>>>>> stab in being explicit here. >>>>>>>>>> >>>>>>>>>> > users were required to have direct read access to the metadata >>>>>>>>>> files in order to plan queries on the table. That implied an access >>>>>>>>>> requirement, even though it was never explicitly documented >>>>>>>>>> >>>>>>>>>> while the requirement is true but it's not like every user would >>>>>>>>>> get credentials to do so, it was strictly based on if the user is >>>>>>>>>> authorized to read the table based on the privileges defined in the >>>>>>>>>> catalog, loadTable's credential was optional meaning if a catalog >>>>>>>>>> wants it >>>>>>>>>> could very well not vend any credentials despite the client >>>>>>>>>> sending X-Iceberg-Access-Delegation due to this [1] and hence they >>>>>>>>>> can >>>>>>>>>> cut off any client if they want to. I believe the flexibility >>>>>>>>>> is there because we don't define authorization in IRC spec. 
As i >>>>>>>>>> said the admin is the one who had given the access to storage to the >>>>>>>>>> catalog in the first place so it can very well revoke that access to >>>>>>>>>> storage and migrate if the catalog is misbehaving by calling every >>>>>>>>>> table to >>>>>>>>>> itself to do planning and can move to a different catalog if the >>>>>>>>>> culprit >>>>>>>>>> catalog doesn't fix it. >>>>>>>>>> >>>>>>>>>> > Maybe we add a sentence in the spec to enforce that there >>>>>>>>>> should be some users where the catalog MUST provide access to the >>>>>>>>>> metadata >>>>>>>>>> files. >>>>>>>>>> >>>>>>>>>> Regarding the original feedback, there will always be an ADMIN >>>>>>>>>> user who has configured the catalog in the first place with the >>>>>>>>>> storage >>>>>>>>>> permission (lets say proving the IAM and establishing the trust >>>>>>>>>> relationship) who can get hold of the storage directly and access >>>>>>>>>> those >>>>>>>>>> metadata files directly from storage. So some are implicit in that >>>>>>>>>> sense. >>>>>>>>>> >>>>>>>>>> I believe by introducing CATALOG only mode for planning on >>>>>>>>>> existing assumptions we are not introducing new ways to trap end >>>>>>>>>> users in >>>>>>>>>> getting into vendor lock-in and like always existed a user has a way >>>>>>>>>> to >>>>>>>>>> walk out of it with the constructs. >>>>>>>>>> >>>>>>>>>> Please let me know what WDYT is considering above ? 
>>>>>>>>>>
>>>>>>>>>> [1] https://github.com/apache/iceberg/blob/fc434997fbc63a3f1f47481c0878073b1ccf6359/open-api/rest-catalog-open-api.yaml#L1886-L1887
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Prashant Singh
>>>>>>>>>>
>>>>>>>>>> On Tue, Jan 13, 2026 at 6:11 AM Péter Váry <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Prashant,
>>>>>>>>>>>
>>>>>>>>>>> The specification states:
>>>>>>>>>>>
>>>>>>>>>>> > The corresponding file location of table metadata should be returned in the `metadata-location` field
>>>>>>>>>>>
>>>>>>>>>>> However, it does not specify that this location must be readable by any users. (Perhaps this is something we should revisit and clarify going forward.)
>>>>>>>>>>>
>>>>>>>>>>> Before the introduction of CATALOG_ONLY tables, users were required to have direct read access to the metadata files in order to plan queries on the table. That implied an access requirement, even though it was never explicitly documented. With the introduction of CATALOG_ONLY, this implicit requirement no longer applies, and we currently do not have an explicit requirement defined in the specification either.
>>>>>>>>>>>
>>>>>>>>>>> Prashant Singh <[email protected]> wrote (on Mon, Jan 12, 2026, 23:33):
>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the feedback everyone!
>>>>>>>>>>>> >>>>>>>>>>>> Eduard : I am open to being it named _ENFORCED or even not >>>>>>>>>>>> having _ONLY or _ENFORCED in the first place as Dan suggested >>>>>>>>>>>> here, please >>>>>>>>>>>> let me know if you are ok with that as per [1] >>>>>>>>>>>> >>>>>>>>>>>> Amogh : Thank you for the feedback on the _preference mode, i >>>>>>>>>>>> tried to document some concrete use cases that could benefit with >>>>>>>>>>>> it [2] as >>>>>>>>>>>> I believe it can provide some options for the catalog and client to >>>>>>>>>>>> negotiate when they are open to it please let me know wdyt ? >>>>>>>>>>>> >>>>>>>>>>>> Peter : I believe such kind of vendor locking would not be >>>>>>>>>>>> possible to have since the model we are going after i.e in the >>>>>>>>>>>> loadTable >>>>>>>>>>>> itself we get back the metadata pointer which is self describing >>>>>>>>>>>> and can be >>>>>>>>>>>> used to register this table in the new catalog, also the way the >>>>>>>>>>>> catalog >>>>>>>>>>>> (irc) specially has been laid out it decouple compute from storage >>>>>>>>>>>> so in the end it's the Admin user of the catalog which has >>>>>>>>>>>> given the catalog admin cred which gets scoped down based on the >>>>>>>>>>>> grants it >>>>>>>>>>>> had to the catalog defined and the ADMIN can simply revoke the >>>>>>>>>>>> catalog from >>>>>>>>>>>> doing it or can configure a new catalog with a different admin >>>>>>>>>>>> storage >>>>>>>>>>>> creds. >>>>>>>>>>>> I tried elaborating more on this on the PR feedback too [3] >>>>>>>>>>>> please let me know what wdyt ? >>>>>>>>>>>> >>>>>>>>>>>> I will be on top of both the PR and thread moving forward ! >>>>>>>>>>>> Appreciate all your feedback. 
>>>>>>>>>>>> >>>>>>>>>>>> [1] >>>>>>>>>>>> https://github.com/apache/iceberg/pull/14867#discussion_r2673087002 >>>>>>>>>>>> [2] >>>>>>>>>>>> https://github.com/apache/iceberg/pull/14867#discussion_r2678941794 >>>>>>>>>>>> [3] >>>>>>>>>>>> https://github.com/apache/iceberg/pull/14867#discussion_r2678376025 >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Prashant Singh >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jan 9, 2026 at 10:34 PM Péter Váry < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> I have a concern about some catalogs starting to make every >>>>>>>>>>>>> table `CATALOG_ONLY`, which would essentially lock users to the >>>>>>>>>>>>> catalog >>>>>>>>>>>>> without providing a way to migrate the data to another catalog. >>>>>>>>>>>>> Maybe we add a sentence in the spec to enforce, that there >>>>>>>>>>>>> should be some users where the catalog MUST provide access to the >>>>>>>>>>>>> metadata >>>>>>>>>>>>> files. >>>>>>>>>>>>> >>>>>>>>>>>>> WDYT? >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jan 8, 2026, 18:38 Amogh Jahagirdar <[email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> I did a pass over PR but I guess I'm a little skeptical on >>>>>>>>>>>>>> what notion of "preferences" truly gets us in the protocol. In >>>>>>>>>>>>>> case the >>>>>>>>>>>>>> endpoint is available but not enforced, my mental model is to >>>>>>>>>>>>>> just let the >>>>>>>>>>>>>> client make whatever choice it wants. If a server really thinks >>>>>>>>>>>>>> it's >>>>>>>>>>>>>> advantageous to use the remote planning, I'd think it'd just say >>>>>>>>>>>>>> server >>>>>>>>>>>>>> side planning is enforced. For the "momentary load" case, all a >>>>>>>>>>>>>> client >>>>>>>>>>>>>> would need to do is just handle the server throttling and >>>>>>>>>>>>>> fallback to a >>>>>>>>>>>>>> client side planning (don't think the protocol needs to expand >>>>>>>>>>>>>> just for >>>>>>>>>>>>>> that). 
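
The fallback behavior described in the message above (handle server throttling on the client and fall back to client-side planning, without any protocol change) could look roughly like the following. This is a sketch under stated assumptions: `PlanningThrottled`, `plan_with_fallback`, and the `server_enforced` flag are hypothetical names invented here, not part of any Iceberg client library, and how a real client detects throttling (for example from an HTTP status code) is left out.

```python
class PlanningThrottled(Exception):
    """Hypothetical error raised when the catalog throttles a plan request."""


def plan_with_fallback(remote_plan, local_plan, server_enforced=False):
    """Try server-side planning first; fall back to local planning if throttled.

    remote_plan and local_plan are zero-argument callables returning a plan.
    If the server enforces remote planning there is nothing to fall back to,
    so the throttling error is re-raised to the caller.
    """
    try:
        return remote_plan()
    except PlanningThrottled:
        if server_enforced:
            raise
        return local_plan()


# Example with stand-in planners.
def busy_server():
    raise PlanningThrottled("catalog is under load")


def local_planner():
    return ["data-file-1.parquet", "data-file-2.parquet"]


tasks = plan_with_fallback(busy_server, local_planner)
```

Under this reading, a "preference" mode adds nothing the client could not already decide for itself; only the enforced case changes what the client is allowed to do.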
>>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jan 7, 2026 at 11:28 AM Russell Spitzer < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm in agreement with Prashsant's current plan, I have no >>>>>>>>>>>>>>> preference on naming of Only vs Enforced" >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Jan 7, 2026 at 4:42 AM Eduard Tudenhöfner < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Instead of calling it "ONLY", maybe "ENFORCED" would be a >>>>>>>>>>>>>>>> better term? I think that would more naturally express the >>>>>>>>>>>>>>>> behavior without >>>>>>>>>>>>>>>> having to define what "ONLY" really means. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Dec 24, 2025 at 12:05 AM Prashant Singh < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> *Hi everyone,* >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> *JB:* Mostly yes, but it's more about what the server >>>>>>>>>>>>>>>>> wants the client to do. The server can indicate if it >>>>>>>>>>>>>>>>> supports a mode or >>>>>>>>>>>>>>>>> not via the /v1/config endpoint at this point. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> *Russell:* Thank you for the thorough feedback! I think >>>>>>>>>>>>>>>>> it is a great idea to break the optional mode into *Prefer >>>>>>>>>>>>>>>>> Client | Prefer Catalog*—it really opens up a lot of >>>>>>>>>>>>>>>>> interesting use cases. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> For example, the server might support planning but, due to >>>>>>>>>>>>>>>>> momentary load, wants the client to see if it's open to >>>>>>>>>>>>>>>>> planning on the >>>>>>>>>>>>>>>>> client side. Similarly, an argument can be made that if the >>>>>>>>>>>>>>>>> server has a >>>>>>>>>>>>>>>>> table cached in memory, it would prefer the client comes to >>>>>>>>>>>>>>>>> the server. 
>>>>>>>>>>>>>>>>> Earlier, with just the optional value, we were simply falling back to server or client side planning based on whether the server supported scan planning. Now, the client can express its own overrides via catalog configs as well.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Based on our offline discussion, I have incorporated the feedback into the updated matrix [1] to document what the planning modes would be based on the server response and client overrides:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - *CLIENT_ONLY + CATALOG_ONLY* = FAIL
>>>>>>>>>>>>>>>>> - *One "ONLY" + opposite "PREFERRED"* = ONLY wins
>>>>>>>>>>>>>>>>> - *Both "PREFERRED"* = Client config wins
>>>>>>>>>>>>>>>>> - *Client not configured* = Use server config or default
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I will update the reference implementation soon based on this. I would love to know what other folks think!
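
The four rules in the matrix above can be read as a small resolution function. The sketch below is only one possible reading, not the reference implementation from the PR: the enum names, the `resolve_planning_side` helper, and the choice of client-preferred planning as the default are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class PlanningMode(Enum):
    # Hypothetical names; the thread is still debating ONLY vs ENFORCED.
    CLIENT_ONLY = "client-only"
    CLIENT_PREFERRED = "client-preferred"
    CATALOG_PREFERRED = "catalog-preferred"
    CATALOG_ONLY = "catalog-only"

    @property
    def side(self) -> str:
        client_modes = (PlanningMode.CLIENT_ONLY, PlanningMode.CLIENT_PREFERRED)
        return "client" if self in client_modes else "catalog"

    @property
    def strict(self) -> bool:
        return self in (PlanningMode.CLIENT_ONLY, PlanningMode.CATALOG_ONLY)


def resolve_planning_side(
    server: Optional[PlanningMode],
    client: Optional[PlanningMode],
    default: PlanningMode = PlanningMode.CLIENT_PREFERRED,  # assumed default
) -> str:
    """Return "client" or "catalog" following the matrix in the message."""
    effective_server = server if server is not None else default
    # Client not configured: use server config or default.
    if client is None:
        return effective_server.side
    # CLIENT_ONLY + CATALOG_ONLY: the two sides cannot both be satisfied.
    if client.strict and effective_server.strict and client.side != effective_server.side:
        raise ValueError("conflicting ONLY modes: planning cannot proceed")
    # One ONLY + opposite PREFERRED: ONLY wins.
    if client.strict:
        return client.side
    if effective_server.strict:
        return effective_server.side
    # Both PREFERRED: client config wins.
    return client.side
```

For example, `resolve_planning_side(PlanningMode.CATALOG_ONLY, PlanningMode.CLIENT_PREFERRED)` resolves to catalog-side planning, while two conflicting ONLY modes raise an error so the client fails fast rather than failing later on a file it cannot read.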
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Prashant Singh >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/14867#issuecomment-3683989832 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Sat, Dec 20, 2025 at 1:26 PM Russell Spitzer < >>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I can imagine one more >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> (None - I would rename this) ClientOnly - Client can use >>>>>>>>>>>>>>>>>> Catalog Planning or Local Planning >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PreferClient - Client should use local planning, but the >>>>>>>>>>>>>>>>>> plan api is available for this table — I can only imagine >>>>>>>>>>>>>>>>>> this would be >>>>>>>>>>>>>>>>>> useful for a scenario where most clients are heavy and have >>>>>>>>>>>>>>>>>> the resources >>>>>>>>>>>>>>>>>> to do local planning (or engine distributed planning) but >>>>>>>>>>>>>>>>>> you still want to >>>>>>>>>>>>>>>>>> support lightweight clients which can’t really do planning >>>>>>>>>>>>>>>>>> themselves. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PreferCatalog - Client should use the plan API, but >>>>>>>>>>>>>>>>>> credentials have been provided to enable local planning — >>>>>>>>>>>>>>>>>> This is probably >>>>>>>>>>>>>>>>>> a transitional state as we move from clients that only >>>>>>>>>>>>>>>>>> support local >>>>>>>>>>>>>>>>>> planning to those which can use the plan api. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> CatalogOnly - Clients are not provided with the >>>>>>>>>>>>>>>>>> credentials required to read the table from the >>>>>>>>>>>>>>>>>> Metadata.json alone. 
>>>>>>>>>>>>>>>>>> If they do not implement the scan plan API they should fail fast,
>>>>>>>>>>>>>>>>>> otherwise they will fail when they attempt to load a manifest_list
>>>>>>>>>>>>>>>>>> file — This is used in circumstances where the catalog is giving
>>>>>>>>>>>>>>>>>> either file specific credentials or is protecting the delivered files
>>>>>>>>>>>>>>>>>> in some way such that their contents have been specially redacted or
>>>>>>>>>>>>>>>>>> something like that.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I assume most catalogs will start with “ClientOnly” or “None”
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Then as Catalogs begin to support the planning API we will see most
>>>>>>>>>>>>>>>>>> tables move to PreferCatalog with some perhaps extremely heavy or
>>>>>>>>>>>>>>>>>> large tables staying as PreferClient or Client Only.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Then catalogs with special protections may have some tables return
>>>>>>>>>>>>>>>>>> CatalogOnly so they can either scope credentials more tightly or
>>>>>>>>>>>>>>>>>> manipulate the files that the client actually has access to in some
>>>>>>>>>>>>>>>>>> way.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Dec 20, 2025 at 1:09 AM Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Prashant
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It makes sense to me. I guess we are using Catalog properties to
>>>>>>>>>>>>>>>>>>> indicate what the REST server supports to the client, right?
>>>>>>>>>>>>>>>>>>> I will take a look at the PR, but I like the idea.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>> JB
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Dec 20, 2025 at 12:53 AM Prashant Singh <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hey All,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I wanted to bring up the discussion of introducing a concept of REST
>>>>>>>>>>>>>>>>>>>> scan planning mode, which would help the server instruct the client
>>>>>>>>>>>>>>>>>>>> on how to plan the table, via loadTableResponse or a table-level
>>>>>>>>>>>>>>>>>>>> config override.
>>>>>>>>>>>>>>>>>>>> There are three possible values which one could think of:
>>>>>>>>>>>>>>>>>>>> 1. *None*: i.e. plan it on the client side; this may be because the
>>>>>>>>>>>>>>>>>>>> table is too small and the additional REST request would add more
>>>>>>>>>>>>>>>>>>>> overhead than benefit.
>>>>>>>>>>>>>>>>>>>> 2. *Optional*: the client can choose to plan it either locally or
>>>>>>>>>>>>>>>>>>>> trigger server-side planning.
>>>>>>>>>>>>>>>>>>>> 3. *Required*: the client MUST do server-side planning; the server
>>>>>>>>>>>>>>>>>>>> could suggest this if it has better indexed the Iceberg metadata, or
>>>>>>>>>>>>>>>>>>>> the client is running on low resources, or the table is protected.
>>>>>>>>>>>>>>>>>>>> The server MAY choose whatever way is required to ensure the client
>>>>>>>>>>>>>>>>>>>> can't bypass this; for example, it could skip vending credentials as
>>>>>>>>>>>>>>>>>>>> part of loadTable and only mint them on planning completion, which
>>>>>>>>>>>>>>>>>>>> would mean that a client that doesn't call plan table gets no
>>>>>>>>>>>>>>>>>>>> credentials.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have proactively created a pull request [1]; I would love to know
>>>>>>>>>>>>>>>>>>>> all your feedback either here or in the PR directly!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Wishing you all very happy holidays; it has been great working with
>>>>>>>>>>>>>>>>>>>> you all.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1] https://github.com/apache/iceberg/pull/14867
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> Prashant Singh
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
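
The override matrix in Prashant's follow-up above (CLIENT_ONLY + CATALOG_ONLY fails, an "ONLY" beats an opposite "PREFERRED", the client wins when both are "PREFERRED", and an unconfigured client falls back to the server config or a default) can be sketched as a small resolution function. This is only an illustration, not the reference implementation from the PR. The member names `CLIENT_PREFERRED` and `CATALOG_PREFERRED`, the function name `resolve_planning_mode`, the fallback default, and the handling of same-side combinations (which the matrix does not spell out) are all assumptions.

```python
from enum import Enum
from typing import Optional


class PlanningMode(Enum):
    """Hypothetical planning modes; each value is (side, is_only)."""

    CLIENT_ONLY = ("client", True)
    CLIENT_PREFERRED = ("client", False)
    CATALOG_PREFERRED = ("catalog", False)
    CATALOG_ONLY = ("catalog", True)

    @property
    def side(self) -> str:
        return self.value[0]

    @property
    def is_only(self) -> bool:
        return self.value[1]


def resolve_planning_mode(
    client: Optional[PlanningMode],
    server: Optional[PlanningMode],
    # Assumption: the thread says "server config or default" without
    # naming the default, so this value is a placeholder.
    default: PlanningMode = PlanningMode.CLIENT_PREFERRED,
) -> PlanningMode:
    """Combine the client override and the server response into one mode."""
    # Client not configured: use the server config, or the default.
    if client is None:
        return server if server is not None else default
    # Assumption: a server that says nothing leaves the client's choice intact.
    if server is None:
        return client
    # CLIENT_ONLY + CATALOG_ONLY: the two hard requirements conflict.
    if client.is_only and server.is_only and client.side != server.side:
        raise ValueError(f"incompatible planning modes: {client.name} vs {server.name}")
    # A server-side ONLY beats any client PREFERRED (and matches a same-side ONLY).
    if server.is_only:
        return server
    # Remaining cases: client ONLY vs server PREFERRED, or both PREFERRED.
    # The client wins in both.
    return client
```

For example, `resolve_planning_mode(PlanningMode.CLIENT_PREFERRED, PlanningMode.CATALOG_ONLY)` returns `PlanningMode.CATALOG_ONLY`, while passing `CLIENT_ONLY` together with `CATALOG_ONLY` raises `ValueError`, mirroring the FAIL row of the matrix.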
