Thanks for reviewing the proposal, Huaxin!

*"Since stagingSession is in the URL and may show up in logs, should it be
treated as a secret token (hard to guess, short expiry)?"*
No, stagingSession is not a secret; it is just an identifier for the
session. It is up to the catalog server implementation whether commit on
the table may be called only by the user who was issued the stagingSession
or by any user who presents it. The server can use its existing
authentication mechanisms to enforce those constraints.
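
To make the two policies concrete, here is a minimal sketch of the check a
catalog server might run before accepting a commit. Everything in it
(StagingSession, may_commit, owner_only) is a hypothetical name made up for
illustration and is not part of the proposal or the REST spec; it also
assumes the server has already authenticated the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagingSession:
    """Hypothetical server-side record kept for each staged create."""
    session_id: str  # the stagingSession identifier returned by stage-create
    table: str       # name of the staged table
    owner: str       # authenticated principal that staged the table


def may_commit(session: StagingSession, principal: str,
               owner_only: bool = True) -> bool:
    """Decide whether an already-authenticated principal may commit."""
    if not owner_only:
        # Policy A: any authenticated caller presenting the identifier.
        return True
    # Policy B: only the principal that was issued the stagingSession.
    return principal == session.owner
```

In both cases the identifier only selects the session; the authorization
decision comes from the caller's authenticated identity.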

*"If it leaks, can someone else use it, or is it restricted to the same
user/job that created the staged table?"*
Since it's not a secret but merely an identifier (just like planId), a leak
should not pose a risk by itself. It's up to the catalog server
implementation whether to restrict its use to the same user/job.


*"What happens if a CTAS job crashes or is cancelled after staging? Does
the stagingSession expire automatically, and is there a way to clean
up/abort the staged create?"*
The lifecycle of a stagingSession is up to the catalog server. Several
strategies are possible, such as automatically expiring a session after a
few hours if no updateTable call was made for it, or expiring the other
active sessions when one of them is committed.
No additional API surface would be exposed to clients for managing the
session lifecycle; that is the responsibility of the catalog server.
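
As a rough illustration of those two strategies, a server-side registry
could look like the sketch below. The class and method names, the 3-hour
default, and the choice to scope "expire the others on commit" to sessions
for the same table are all assumptions made for this example, not something
the proposal prescribes.

```python
import time
from typing import Dict, Optional, Tuple


class StagingSessionRegistry:
    """Hypothetical in-memory tracker for staged-create sessions."""

    def __init__(self, ttl_seconds: float = 3 * 3600) -> None:
        self.ttl_seconds = ttl_seconds
        # session id -> (table name, creation time in seconds)
        self._sessions: Dict[str, Tuple[str, float]] = {}

    def stage(self, session_id: str, table: str,
              now: Optional[float] = None) -> None:
        created_at = time.time() if now is None else now
        self._sessions[session_id] = (table, created_at)

    def is_active(self, session_id: str,
                  now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None:
            return False
        _, created_at = entry
        if now - created_at > self.ttl_seconds:
            # Strategy 1: expire sessions that were never committed in time.
            del self._sessions[session_id]
            return False
        return True

    def commit(self, session_id: str, now: Optional[float] = None) -> bool:
        if not self.is_active(session_id, now):
            return False
        table, _ = self._sessions[session_id]
        # Strategy 2: once one session commits, drop every session staged
        # for the same table, including the one that just committed.
        self._sessions = {
            sid: entry for sid, entry in self._sessions.items()
            if entry[0] != table
        }
        return True
```

A crashed or cancelled CTAS job then needs no client-side cleanup call: its
session simply stops being active once the TTL passes.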

Let me know if you have follow-up questions.


On Mon, Feb 9, 2026 at 7:07 PM huaxin gao <[email protected]> wrote:

> Hi Maninder,
>
> Thanks for the proposal! It sounds like a good direction to me. Returning
> a stagingSession from stage-create and then reusing it for
> loadCredentials/loadTable feels consistent with the existing planId
> pattern, and it fixes a real CTAS problem.
>
> A few questions:
>
> Since stagingSession is in the URL and may show up in logs, should it be
> treated as a secret token (hard to guess, short expiry)?
>
> If it leaks, can someone else use it, or is it restricted to the same
> user/job that created the staged table?
>
> What happens if a CTAS job crashes or is cancelled after staging? Does the
> stagingSession expire automatically, and is there a way to clean up/abort
> the staged create?
>
> Would love to hear your thoughts on these.
>
> Thanks,
>
> Huaxin
>
> On Mon, Feb 9, 2026 at 4:30 PM Maninder Parmar <
> [email protected]> wrote:
>
>> Hello iceberg community!
>>
>> I wanted to discuss the proposal for refreshing storage credentials for
>> staged table creation. Iceberg tables can be created either via a
>> single-step creation flow or a two-step staged creation flow, which is used
>> to implement CTAS (CREATE TABLE AS SELECT) statements. Currently, it's
>> not possible to refresh the credentials for staged tables since they are
>> not committed to the catalog and hence not visible to loadTable or the
>> credential endpoint.
>> There has been prior discussion
>> <https://lists.apache.org/thread/q5n355d89nxbhywtlv3qhq7dchbyb67d> where
>> the community members have expressed the need for supporting this scenario.
>>
>> I have started a proposal
>> <https://docs.google.com/document/d/1R1K6X7qYqvIFkPG3m1neV5Mvy8rwWJvhSFr8DgJgQ-E/edit?tab=t.0>
>>  to
>> flesh out the details to support this scenario, building on the
>> precedent of credential vending support for scan planning.
>> The OpenAPI changes can be seen in PR #15280
>> <https://github.com/apache/iceberg/pull/15280>
>>
>> Looking forward to your feedback.
>>
>> Thanks,
>> Maninder
>>
>>  Proposal: Credential Refresh for Staged Table Creation
>> <https://drive.google.com/open?id=1R1K6X7qYqvIFkPG3m1neV5Mvy8rwWJvhSFr8DgJgQ-E>
>>
>
