Hi community,

Thanks for the input during the catalog sync! I want to summarize the
decisions and direction that were agreed on during the sync.

*Direction*
- We'll introduce a storage-refresh-token concept that integrates with the
existing StorageCredential mechanism rather than being a staging-specific
construct. This keeps the design reusable across different APIs going
forward.
- We agreed not to model this after the planId-based credential vending
used in scan planning. The community is open to refactoring planId
credential refresh to use the storage credential refresh token pattern in
the future.

*Discarded approaches*
1. table-uuid as the identifier - overloads a spec-level identifier for a
purpose it wasn't designed for
2. Server-side state / sessions - adds operational complexity and some
existing catalog implementations assume stateless staged table creation
3. Overloading OAuth scopes - conflates storage credential refresh with the
OAuth layer
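
To make the "no server-side state" point concrete, here is a minimal sketch of how a storage refresh token could be verified without a session store. This is purely illustrative and is not what the design doc or spec PR will specify: the token layout, the `issue_refresh_token` / `verify_refresh_token` names, and the HMAC signing key are all assumptions made up for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical signing key; a real catalog would manage and rotate its own.
SECRET_KEY = b"example-signing-key"


def issue_refresh_token(storage_prefix: str, ttl_seconds: int,
                        now: Optional[float] = None) -> str:
    """Encode the credential scope and an expiry into a signed, opaque token."""
    issued_at = time.time() if now is None else now
    payload = json.dumps(
        {"prefix": storage_prefix, "exp": issued_at + ttl_seconds},
        sort_keys=True,
    ).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_refresh_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the storage prefix if the token is authentic and unexpired, else None."""
    current = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        # Malformed token: wrong number of parts or invalid base64.
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    # Only parse the payload after the signature has been checked.
    claims = json.loads(payload)
    if claims["exp"] < current:
        return None
    return claims["prefix"]
```

In this sketch the server needs nothing beyond the signing key to decide whether a refresh request is valid, which is the property that matters for catalogs that assume stateless staged table creation.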

I will share an updated design doc and spec PR reflecting this direction.

On Tue, Feb 10, 2026 at 11:14 AM Maninder Parmar <
[email protected]> wrote:

> Thanks for reviewing the proposal Huaxin!
>
> *"Since stagingSession is in the URL and may show up in logs, should it be
> treated as a secret token (hard to guess, short expiry)?"*
> No, stagingSession is not a secret; it is just an identifier for the
> session. It is up to the catalog server implementation whether commit on
> the table may be called only by the user who was issued the stagingSession
> or by any user holding it. The server can use existing authentication
> mechanisms to enforce those constraints.
>
> *"If it leaks, can someone else use it, or is it restricted to the same
> user/job that created the staged table?"*
> Since it's not a secret but merely an identifier (just like planId), a
> leak should not pose a risk. It's up to the catalog server implementation
> whether to restrict its use to the same user/job.
>
>
> *"What happens if a CTAS job crashes or is cancelled after staging? Does
> the stagingSession expire automatically, and is there a way to clean
> up/abort the staged create?"*
> The lifecycle of a stagingSession is up to the catalog server. Multiple
> strategies could be used here, such as automatically expiring the session
> after a few hours if no updateTable call was made for it, or expiring
> active sessions once one of them is committed.
> No additional API surface would be exposed to clients to manage the
> session lifecycle; that is the responsibility of the catalog server.
>
> Let me know if you have follow up questions.
>
>
> On Mon, Feb 9, 2026 at 7:07 PM huaxin gao <[email protected]> wrote:
>
>> Hi Maninder,
>>
>> Thanks for the proposal! It sounds like a good direction to me. Returning
>> a stagingSession from stage-create and then reusing it for
>> loadCredentials/loadTable feels consistent with the existing planId
>> pattern, and it fixes a real CTAS problem.
>>
>> A few questions:
>>
>> Since stagingSession is in the URL and may show up in logs, should it be
>> treated as a secret token (hard to guess, short expiry)?
>>
>> If it leaks, can someone else use it, or is it restricted to the same
>> user/job that created the staged table?
>>
>> What happens if a CTAS job crashes or is cancelled after staging? Does
>> the stagingSession expire automatically, and is there a way to clean
>> up/abort the staged create?
>>
>> Would love to hear your thoughts on these.
>>
>> Thanks,
>>
>> Huaxin
>>
>> On Mon, Feb 9, 2026 at 4:30 PM Maninder Parmar <
>> [email protected]> wrote:
>>
>>> Hello iceberg community!
>>>
>>> I wanted to discuss the proposal for refreshing storage credentials for
>>> staged table creation. Iceberg tables can be created either via a
>>> single-step creation flow or a two-step staged creation flow, which is
>>> used to implement CTAS (CREATE TABLE AS SELECT) statements. Currently,
>>> it's not possible to refresh the credentials for staged tables since they
>>> are not committed to the catalog and hence not visible to loadTable or
>>> the credential endpoint.
>>> There has been prior discussion
>>> <https://lists.apache.org/thread/q5n355d89nxbhywtlv3qhq7dchbyb67d> where
>>> the community members have expressed the need for supporting this scenario.
>>>
>>> I have started a proposal
>>> <https://docs.google.com/document/d/1R1K6X7qYqvIFkPG3m1neV5Mvy8rwWJvhSFr8DgJgQ-E/edit?tab=t.0>
>>>  to
>>> flesh out the details to support this scenario, building on the
>>> precedent of credential vending support for scan planning.
>>> The OpenAPI changes can be seen in PR #15280
>>> <https://github.com/apache/iceberg/pull/15280>
>>>
>>> Looking forward to your feedback.
>>>
>>> Thanks,
>>> Maninder
>>>
>>>  Proposal: Credential Refresh for Staged Table Creation
>>> <https://drive.google.com/open?id=1R1K6X7qYqvIFkPG3m1neV5Mvy8rwWJvhSFr8DgJgQ-E>
>>>
>>
