Hi Anurag,

Thank you for your interest and for taking the time to review the design doc!

To answer some of your questions:
1. The source of truth for all delegated tasks is within the
Delegation Service's own persistence layer.
2. The current document abstracts away the implementation details of
the Delegation Service. The intent is to first agree on the high-level
architecture and the API contract between the services. For the
synchronous MVP, there is no traditional in-memory or message broker
queue. Instead, the persistence layer itself acts as a durable log; a
task is persisted upon submission and then processed by the API
thread. An example task execution loop has been added to the
appendix outlining this approach.
3. The plan is to provide the Delegation Service as a new, separate
Docker image to be deployed alongside the existing Polaris container.
We envision a one-to-one Polaris to Delegation Service security binding
enforced through the security measures outlined in the document. I
have included a new entry in the appendix discussing the high-level
approach.
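
For illustration only, the flow described in point 2 could be sketched
roughly as below. This is not the loop from the appendix. The table
layout, the status names (SUBMITTED, RUNNING, COMPLETED, FAILED), the
function names, and the choice to record a failure rather than retry are
all assumptions made for the example, and SQLite merely stands in for
whatever persistence layer the Delegation Service ends up using.

```python
import sqlite3
import uuid

# Hypothetical schema for the example; not taken from the design doc.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id      TEXT PRIMARY KEY,
    kind    TEXT NOT NULL,
    payload TEXT NOT NULL,
    status  TEXT NOT NULL,
    error   TEXT
)
"""


def open_store(path=":memory:"):
    """Open the (stand-in) persistence layer and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def _set_status(conn, task_id, status, error=None):
    """Record a status transition for one task."""
    conn.execute(
        "UPDATE tasks SET status = ?, error = ? WHERE id = ?",
        (status, error, task_id),
    )
    conn.commit()


def submit_and_run(conn, kind, payload, handler):
    """Persist the task first, then execute it on the calling (API) thread."""
    task_id = str(uuid.uuid4())
    # Step 1: durable write before any work starts, so the task is on record
    # even if the process dies mid-execution.
    conn.execute(
        "INSERT INTO tasks (id, kind, payload, status) VALUES (?, ?, ?, 'SUBMITTED')",
        (task_id, kind, payload),
    )
    conn.commit()
    # Step 2: synchronous execution; there is no queue and no worker pool.
    _set_status(conn, task_id, "RUNNING")
    try:
        handler(payload)
    except Exception as exc:
        # Assumption: a failure is recorded and surfaced, not retried here.
        _set_status(conn, task_id, "FAILED", str(exc))
    else:
        _set_status(conn, task_id, "COMPLETED")
    return task_id


def get_status(conn, task_id):
    """Return (status, error) from persistence, or None for an unknown task."""
    return conn.execute(
        "SELECT status, error FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
```

Because every status read goes through the store rather than through
in-process state, the same lookup would keep working if the status
request were served by a different thread or node than the one that
accepted the task.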

Thanks again for the valuable questions. Please let me know if these
clarifications address your concerns or if you have any further
thoughts.

Bests,
William

On Tue, Jun 24, 2025 at 5:35 PM Anurag Mantripragada
<amantriprag...@apple.com.invalid> wrote:
>
> Thank you for your proposal, William.
>
> This type of companion service is necessary, as evidenced by the other 
> proposal on asynchronous tasks. Overall, this is a promising start. I 
> understand that the scope for this proposal is limited, so please feel free 
> to indicate that it is not in scope. However, I have a few questions:
>
> 1. Could you clarify in the documentation the source of truth for task 
> status? From your diagram, it appears that it is in the delegation service.
> 2. The implementation details of the service are abstracted away. Are these 
> not in scope for this design? (For instance, do we have a task queue in the 
> delegation service?)
> 3. Could you provide additional details on how this service will be deployed?
>
> It becomes very complicated when we transition from a synchronous model to an 
> asynchronous model. (Handling failures, task executor unavailability, status 
> updates, etc.) We can have a separate discussion for those.
>
> Thank you,
> Anurag Mantripragada
>
>
> > On Jun 24, 2025, at 11:56 AM, William Hyun <will...@apache.org> wrote:
> >
> > Hey Dmitri,
> >
> > Thank you for your comments!
> >
> > I would like to first clarify that while the initial use case is
> > internal, we are not closing the door completely on having Delegation
> > Service be accessible through user-driven clients.
> > We would love this service to eventually be deployed and run
> > independently from the Polaris Catalog to handle scheduled,
> > asynchronous tasks as Eric mentioned above with compaction.
> > We believe the REST API is the foundational building block for that
> > evolution and the initial proposal aims to simply introduce the
> > framework to the Polaris ecosystem with the purge table task as the
> > main focal point.
> >
> > Secondly, in addressing the concern about task failures, I have added
> > a section in the appendix discussing the expected behavior of failed
> > tasks.
> > Please feel free to take a look and let me know what you think!
> > - https://docs.google.com/document/d/1AhR-cZ6WW6M-z8v53txOfcWvkDXvS-0xcMe3zjLMLj8/edit?tab=t.0#heading=h.fr5gi42vvat3
> >
> > Bests,
> > William
> >
> >
> > On Mon, Jun 23, 2025 at 4:42 PM Dmitri Bourlatchkov <di...@apache.org> 
> > wrote:
> >>
> >> Apologies for missing the reference to Robert's doc. I hope it does not
> >> invalidate my comments :)
> >>
> >> This is certainly up for discussion.
> >>
> >> To clarify my concern about the REST API: If we are to have resilient tasks
> >> and the node that serves the initial REST request fails, other nodes will
> >> have to be able to provide responses about the task instead of the failed
> >> node. Ultimately the data will come from persistence (I assume). Also, I
> >> suppose the Tasks Service is meant for internal interactions (not for
> >> user-driven clients). Therefore, it seems to me that the REST API is
> >> somewhat superficial in this case.
> >>
> >> Like I mentioned before, this is just what I thought after a quick review.
> >> I'll certainly have a deeper look later.
> >>
> >> Cheers,
> >> Dmitri.
> >>
> >> On Mon, Jun 23, 2025 at 6:02 PM Eric Maynard <eric.w.mayn...@gmail.com>
> >> wrote:
> >>
> >>> Hey Dmitri,
> >>>
> >>> There's a section in the email above and the linked doc that talks about
> >>> the linked proposal. See "Relationship to the "Asynchronous & Reliable
> >>> Tasks" Proposal".
> >>>
> >>> As for pulling away from a REST API in favor of driving things directly
> >>> from persistence, there's a lot to discuss here. Bear in mind that the
> >>> design goes into detail about one proposed "TaskExecutor" implementation;
> >>> maybe another TaskExecutor could work exactly like you describe. But the
> >>> reason that this implementation proposes to be driven by a REST API is that
> >>> there's a lot of interesting future work -- see the "Future Work" section
> >>> of the doc for some examples -- that can be added on to the REST API. In
> >>> particular, table maintenance actions like compaction.
> >>>
> >>> --EM
> >>>
> >>> On Mon, Jun 23, 2025 at 2:31 PM Dmitri Bourlatchkov <di...@apache.org>
> >>> wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> A previous proposal by Robert [1] from May 9 appears to be related. I think
> >>>> we should consider both at the same time, possibly as alternatives, but
> >>>> perhaps also sharing / reusing their respective ideas.
> >>>>
> >>>> A few notes after a quick review:
> >>>>
> >>>> * Separate scaling for task executors seems reasonable at first glance, but
> >>>> it adds deployment complexity. If we go with this approach, I believe it
> >>>> would be worth making this deployment strategy optional. In other words let
> >>>> admin users decide whether they want to have extra nodes dedicated to
> >>>> specific tasks or whether they are ok with having uniform nodes.
> >>>>
> >>>> * I'm not sure a separate rich REST API for submitting tasks is really
> >>>> necessary. Proper synchronization among multiple nodes will
> >>>> probably require roundtrips to Persistence anyway, so task submission could
> >>>> probably be done via Persistence.
> >>>>
> >>>> [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> >>>>
> >>>> Thanks,
> >>>> Dmitri.
> >>>>
> >>>>
> >>>> On Mon, Jun 23, 2025 at 3:12 PM William Hyun <will...@apache.org> wrote:
> >>>>
> >>>>> Hello Polaris Community,
> >>>>>
> >>>>> I would like to share my proposal for a new service, the Polaris
> >>>>> Delegation Service, and to share the design document for discussion
> >>>>> and feedback. The Delegation Service is intended to optionally be
> >>>>> deployed alongside Polaris to handle the execution of certain
> >>>>> long-running tasks.
> >>>>>
> >>>>> 1. Motivation
> >>>>> The Polaris Catalog is optimized for low-latency metadata operations.
> >>>>> However, certain tasks such as purging data files for dropped tables
> >>>>> are resource-intensive and can impact its core performance. The
> >>>>> motivation for this new service is to decouple these I/O-heavy
> >>>>> background tasks from the main catalog, ensuring it remains highly
> >>>>> responsive while allowing the task execution workload to be managed
> >>>>> and scaled independently.
> >>>>>
> >>>>> 2. Proposal
> >>>>> We propose an optional, independent Delegation Service responsible for
> >>>>> executing these offloaded operations.
> >>>>> The MVP will focus on synchronously handling the data file deletion
> >>>>> process for DROP TABLE WITH PURGE commands.
> >>>>>
> >>>>> 3. Relationship to the "Asynchronous & Reliable Tasks" Proposal
> >>>>> This proposal is designed to be highly synergistic with the existing
> >>>>> "Asynchronous & Reliable Tasks" proposal.
> >>>>>
> >>>>> The Asynchronous Task proposal describes a general internal framework
> >>>>> for reliably scheduling and managing the lifecycle of any task within
> >>>>> Polaris. On the other hand, this proposal defines a specific, external
> >>>>> worker service optimized for executing a particular class of I/O-heavy
> >>>>> tasks.
> >>>>>
> >>>>> The Delegation Service does not alter the core Polaris task schema.
> >>>>> This allows it to seamlessly act as a specialized "backend" worker
> >>>>> that can execute tasks scheduled and managed by the more advanced
> >>>>> Asynchronous Task Framework, which would serve as the reliable
> >>>>> "frontend." This relationship is explored further in section 10.2 of
> >>>>> the document.
> >>>>>
> >>>>> Please find the detailed design document here for review:
> >>>>> - https://docs.google.com/document/d/1AhR-cZ6WW6M-z8v53txOfcWvkDXvS-0xcMe3zjLMLj8/edit?usp=sharing
> >>>>>
> >>>>> Best Regards,
> >>>>> William
> >>>>>
> >>>>
> >>>
>
