Re: Removal of CallContext.copy()

Yufei Gu Thu, 14 Aug 2025 13:33:10 -0700

To summarize a bit:

I agree that the task context should be treated differently from both call
context and realm context. We should design the task context explicitly,
and it will inevitably need to be serialized so that other nodes can
execute the task with enough context.


When designing it, we must examine all execution paths a task may take. For
example:

   - In the persistence path, the metastore manager is required.
   - Is the realm ID alone sufficient to fetch it?
   - What about other dependencies like the entity cache or storage
   credential cache?


Currently, the MetaStoreManagerFactory interface accepts a RealmContext
object for all of its methods. This suggests that simplifying to just a
realmId would not be sufficient. In fact, the whole point of RealmContext
is to let vendors customize it; otherwise, we could have used a realmId.
Once we have a well-defined task context, we can confidently say that task
execution gets all the necessary components, instead of collapsing call
context or realm context prematurely into just a realm ID.
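
A minimal sketch of what such a task context could look like, assuming a
vendor adds fields beyond the realm ID. Everything here is hypothetical:
TaskContext, SketchRealmContext and vendorProperties are illustrative names,
not Polaris APIs, and Java serialization is used only to keep the example
dependency-free.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;

/** Illustration only: these types are hypothetical and are not Polaris APIs. */
final class TaskContextSketch {

  /** Hypothetical immutable, serializable context shipped with a task. */
  record TaskContext(String realmId, Map<String, String> vendorProperties)
      implements Serializable {
    TaskContext {
      // Defensive copy so the value object cannot change after creation.
      vendorProperties = Map.copyOf(vendorProperties);
    }
  }

  /** Hypothetical stand-in for a vendor-customized realm context. */
  interface SketchRealmContext {
    String realmIdentifier();

    Map<String, String> vendorProperties();
  }

  /** Rebuilds the runtime realm context on the node that executes the task. */
  static SketchRealmContext toRealmContext(TaskContext ctx) {
    return new SketchRealmContext() {
      @Override
      public String realmIdentifier() {
        return ctx.realmId();
      }

      @Override
      public Map<String, String> vendorProperties() {
        return ctx.vendorProperties();
      }
    };
  }

  /** Turns the task context into bytes that could be handed to another node. */
  static byte[] serialize(TaskContext ctx) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(ctx);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads the task context back on the receiving side. */
  static TaskContext deserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (TaskContext) in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The vendor-specific properties travel with the task, so the executing node
does not have to guess them from the realm ID alone. Which fields actually
belong in such an object is exactly the design question raised above.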

Yufei


On Thu, Aug 14, 2025 at 12:14 AM Robert Stupp <[email protected]> wrote:

> I think the discussion has deviated quite a bit.
>
> For posterity, the only use case in Polaris for the only call site of
> `.copy()` is "purge table data". All that needs is the information
> about "what shall be purged" and during execution a way to get access
> to the object storage.
>
> Any task execution happens outside of a HTTP/REST request context.
> This is already the case.
>
> The PR here _only_ exchanges the `.copy()` function with a generic way
> to construct a realm-context from a realm-ID, which is the only
> information that's required. It's also the only information that's
> available during HTTP/REST requests. So `RealmContextResolver` is the
> interface that then provides a realm-context by realm-ID, which is
> exactly what's needed to build a "call-context".
>
> The design doc of the "async + reliable tasks" includes a description
> of how parameters (and potentially results) are passed: as immutable
> value objects describing the task to perform. What information a
> particular task behavior exactly needs is up to the task behavior
> itself.
>
> Regarding "audit lineage" - I could not find anything related to that
> for task execution even in 0.9.0-incubating. Did I miss something?
>
> Long story short: the PR does _not_ change the behavior, but it
> enables deferred and distributed execution of tasks.
>
> I don't want to deviate from this discussion about the pros and cons
> of (Polaris)CallContext but instead concentrate on the particular
> change.
>
>
> On Thu, Aug 14, 2025 at 5:54 AM Dennis Huo <[email protected]> wrote:
> >
> > I feel like we've discussed this before but I can't find the
> > previous threads. The purpose of the CallContext is explicitly to
> > encapsulate the set of information needed to represent running internal
> > operations on behalf of an abstract notion of a "request", precisely to
> > *avoid* hiding it all in CDI. This way, any handoff or fan-out of async
> > operations that go outside the boundaries of what Quarkus understands to
> be
> > a single HTTP request can still have the same standardized representation
> > of the original request context.
> >
> > I agree that today the CallContext doesn't do a great job of representing
> > this standard request context, partially because things have been further
> > removed instead of added. And we should definitely reconcile CallContext
> vs
> > PolarisCallContext.
> >
> > I missed the fact that we ripped out AuthenticatedPolarisPrincipal
> already
> > from CallContext. This is unfortunate because for async tasks that are
> > strongly associated with an originating request, we really do want to
> > preserve the audit lineage.
> >
> > Ultimately, with async and background services, Polaris really is a
> > distributed system, and intra-process CDI-based context propagation is
> > insufficient for the requirement, unless we're thinking of using some
> kind
> > of distributed message-passing CDI (does that exist)?
> >
> > Robert - is there a document or similar describing the intended design
> for
> > passing execution context to async tasks? It seems to me we might end up
> > removing CallContext just to add an analogous object back in under a
> > different name.
> >
> > IMO we *should* define the serializable standard "context" that we want
> to
> > propagate beyond the basic RequestScoped execution flow, and the runtime
> > objects that are *used* in CallContext/PolarisCallContext (such as the
> > BasePersistence handle) should be constructible from the serializable
> > fields.
> >
> > On Wed, Aug 13, 2025 at 6:28 PM yun zou <[email protected]>
> wrote:
> >
> > > Hi,
> > >
> > > > While in theory, not all data within CallContext or RealmContext is
> > > > required to execute the task, the current TaskExecutor implementation
> > > > does rely on these objects during execution.
> > > For instance, consider the code in TaskExecutorImpl
> > > <
> > >
> https://github.com/apache/polaris/blob/main/runtime/service/src/main/java/org/apache/polaris/service/task/TaskExecutorImpl.java#L153
> > > >
> > > .
> > >
> > > If we aim to pass only the essential information to the executor, the
> > > CallContext appears to be the necessary data
> > > > needed for the executor code. However, if we believe the full
> > > > CallContext or RealmContext is not needed, a more robust solution
> > > > would be to refactor the implementation to depend only on the specific
> > > > data it requires, instead of relying on these entire context objects.
> > >
> > > The approach—passing selected data as parameters and then
> reconstructing
> > > the context—does not seem
> > > scalable, especially if CallContext evolves or if we need to include
> > > additional information in the future.
> > >
> > > > If we say we want to pass the necessary information to the executor,
> > > > then the CallContext seems to be the necessary information to pass. If
> > > > we think we do not need the whole CallContext or RealmContext, I think
> > > > a more robust way is to refactor the implementation code to not rely on
> > > > CallContext or RealmContext, just the information it needs. The current
> > > > approach, which passes information as parameters and then reconstructs
> > > > the object, doesn't sound very scalable if future refactoring happens
> > > > to CallContext, or if we intend to add more information to the object.
> > >
> > > > A cleaner and more maintainable solution would be to leverage CDI to
> > > > properly propagate CallContext and RealmContext to the background
> > > > thread. This would avoid manual reconstruction and promote a more
> > > > consistent execution environment.
> > >
> > > Best Regards,
> > > Yun
> > >
> > > On Wed, Aug 13, 2025 at 1:01 PM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > > > Hi Yufei,
> > > >
> > > > I would oppose serializing CallContext (especially across server
> nodes).
> > > >
> > > > Execution context is established based on a request. Executing a
> Task is
> > > > one of possible requests. As such it has well defined parameters and
> > > > begin/end boundaries. Any context that a task needs should naturally
> flow
> > > > from its parameters (which define the job to perform).
> > > >
> > > > Ultimately, I believe Polaris should leverage CDI for communicating
> > > context
> > > > information within a particular execution environment.
> > > >
> > > > The CallContext class is merely a runtime object that serves to
> represent
> > > > some set of parameters in some cases at this time. I do not think it
> is
> > > > defined well enough to act as a medium for transferring execution
> context
> > > > information across nodes.
> > > >
> > > > How a task's parameters are communicated to other nodes is probably
> > > beyond
> > > > > the scope of this email thread, but I'd expect that aspect to be defined
> > > > > in the tasks proposal(s). My point is that it is a matter specific to
> tasks
> > > > and does not directly relate to CallContext.
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Wed, Aug 13, 2025 at 2:46 PM Yufei Gu <[email protected]>
> wrote:
> > > >
> > > > > If we want tasks to run on any node, we can’t avoid serializing the
> > > call
> > > > > context, or part of it like the realm context. The refactor, as
> > > written,
> > > > > > could NOT support tasks on any node either. As long as we add a
> field
> > > > > other than the realm ID to the realm context, there is no way to
> > > > construct
> > > > > a realm context from another node just by realm id.
> > > > >
> > > > > Yufei
> > > > >
> > > > >
> > > > > On Wed, Aug 13, 2025 at 10:20 AM Eric Maynard <
> > > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > I agree that we shouldn't need the entire CallContext in each &
> every
> > > > > task
> > > > > > -- tasks should include whatever information is necessary to
> execute
> > > > > them,
> > > > > > and in the end CallContext.copy() shouldn't be needed.
> > > > > >
> > > > > > However, at least in Will's proposal, we would keep the existing
> > > > > framework
> > > > > > around (with its unfortunate CallContext.copy()) in parallel
> with the
> > > > new
> > > > > > framework for some time. I don't see a big impetus to refactor
> this
> > > > just
> > > > > > yet.
> > > > > >
> > > > > > --EM
> > > > > >
> > > > > > On Wed, Aug 13, 2025 at 12:18 AM Robert Stupp <[email protected]>
> > > wrote:
> > > > > >
> > > > > > > Thanks for the reply Yufei, but the intent of all
> tasks-proposals
> > > is
> > > > > > > to be able to execute tasks on _any_ node. Are you suggesting
> > > making
> > > > > > > (Polaris)CallContext serializable?
> > > > > > >
> > > > > > > On Wed, Aug 13, 2025 at 3:12 AM Yufei Gu <[email protected]
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > > To still let TaskExecutorImpl make "safe clones", a
> > > > functionality
> > > > > > to
> > > > > > > > get (fresh) instances of RealmContext is required. To enable
> > > this,
> > > > > the
> > > > > > > > RealmContextResolver has been enhanced with "RealmContext
> > > lookups"
> > > > by
> > > > > > > > realm-ID. That in turn led to splitting the
> > > > > HTTP/REST-to-realm-context
> > > > > > > > resolution into two parts: HTTP/REST-to-realm-ID and
> > > > > > realm-ID-to-context.
> > > > > > > >
> > > > > > > > I'm not sure whether this refactor is beneficial. Avoiding
> the
> > > > > copying
> > > > > > > of a
> > > > > > > > CDI bean would require introducing a global RealmContext map
> to
> > > > > > maintain
> > > > > > > > the mapping between realmId and RealmContext instances. That
> > > feels
> > > > > > > heavier
> > > > > > > > and more complex than simply copying the context whenever
> needed.
> > > > > > > >
> > > > > > > > Yufei
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Aug 12, 2025 at 6:18 AM Robert Stupp <[email protected]
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > quick heads up that there's a PR to remove
> CallContext.copy(),
> > > > > which
> > > > > > > > > is only used from tasks. This change is also part of the
> effort
> > > > to
> > > > > > > > > have async & reliable tasks running "anywhere".
> > > > > > > > >
> > > > > > > > > Robert
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/polaris/pull/2294
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
