Hi Subham, Dmitri, Yufei, JB,

I’m generally aligned with the direction here, and I left more detailed
comments on the PR.

At a high level, my main concern is that the proposal may still be a bit
ahead of the currently defined scope. I’d lean toward keeping the first
step narrower: prefer purpose-based routing over per-realm routing for v1,
and make the supported model, configuration, and migration story explicit
before the broader contract hardens.
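
To make the purpose-based option concrete, here is a rough sketch. It is
not the API from the PR. StoreType and its values follow the names used in
this thread; PurposeRouter, the data source names, and the fallback to a
default data source are assumptions made purely for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of purpose-based routing; not the SPI from the PR.
public class PurposeRouting {

  // Purposes named after the workloads mentioned in this thread.
  enum StoreType { METASTORE, METRICS, EVENTS }

  // Maps each purpose to a named data source. Any purpose without an
  // explicit entry falls back to the default data source (an assumption).
  static final class PurposeRouter {
    private final Map<StoreType, String> byPurpose = new EnumMap<>(StoreType.class);
    private final String defaultName;

    PurposeRouter(String defaultName, Map<StoreType, String> overrides) {
      this.defaultName = defaultName;
      this.byPurpose.putAll(overrides);
    }

    // The realm is deliberately not an input: routing depends on purpose only.
    String resolve(StoreType type) {
      return byPurpose.getOrDefault(type, defaultName);
    }
  }

  public static void main(String[] args) {
    PurposeRouter router =
        new PurposeRouter("default", Map.of(StoreType.EVENTS, "events-ds"));
    for (StoreType type : StoreType.values()) {
      System.out.println(type + " -> " + router.resolve(type));
    }
  }
}
```

With only EVENTS overridden, METASTORE and METRICS resolve to the default
data source, so a deployment can isolate one noisy workload at a time
without a per-realm dimension in the v1 contract.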

-ej

On Wed, Mar 11, 2026 at 12:37 PM Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi Subham,
>
> Thanks for this contribution. It's an interesting feature.
>
> As mentioned in the GitHub issues, I am fine with moving forward with a PR
> as long as it remains a Draft PR to help drive the discussion. I suggest
> linking a GitHub Discussion or a Design Doc within the PR to help build
> consensus.
>
> That said, I have a few initial comments:
>
> 1. I like the SPI approach used in the PR. This should become a standard in
> Polaris to facilitate custom implementations.
> 2. I agree that having a data source "per purpose" is a good idea. The main
> question is how we should handle the split:
> - Very granular (per entity)
> - By table "meaning"
> - By realm (this may not be granular enough)
>
> From a user standpoint, I believe we should keep it simple. That would be
> a great first step forward. For example, in the configuration (e.g.,
> application.properties), it could look like:
> - polaris.datasource.entities=
> - polaris.datasource.events=
> - polaris.datasource.grants=
>
> Regards,
> JB
>
> On Tue, Mar 10, 2026 at 11:19 PM Yufei Gu <[email protected]> wrote:
>
> > Hi Subham,
> >
> > Thanks for working on this. Given the complexity and long-term
> > implications discussed in https://github.com/apache/polaris/issues/3890,
> > I think a short design doc could still be helpful to capture the
> > intended architecture and future evolution. Here are a few questions
> > listed in the issue. I believe these should be answered before jumping
> > to an implementation.
> >
> >
> >    1. Should we split each potentially noisy table into its own
> >    dedicated data source? For example, one data source for events, one
> >    for metrics, and one for idempotency.
> >    2. Should we allow flexible grouping? For example, the events and
> >    idempotency tables could share one data source while metrics uses
> >    another.
> >    3. Should we consider a different data source per realm instead of
> >    table-level splitting?
> >    4. How should schema version information be managed? If tables live
> >    in different data sources, how do we track and coordinate schema
> >    evolution?
> >    5. Should different data sources be allowed to point to different
> >    schemas or databases? This likely aligns with the isolation goal, but
> >    it implies that cross-table joins become difficult or impossible at
> >    the database level, leaving only in-memory joins as an option.
> >    6. Should different data sources be allowed to point to the same
> >    schema? If not, we need validation logic to detect and prevent
> >    misconfiguration.
> >
> >
> > Yufei
> >
> >
> > On Tue, Mar 10, 2026 at 7:33 AM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Subham,
> > >
> > > Thanks again for your contribution!
> > >
> > > I believe PR 3960 moves in the right direction by establishing an SPI
> > > to delegate DataSource resolution logic to the runtime environment.
> > >
> > > It immediately allows custom implementations in downstream projects
> > > (if people wish to do that) and opens a way for supporting multiple
> > > DataSources in Apache Polaris (in follow-up PRs).
> > >
> > > I think the PR is pretty clear in itself and does not require any
> > > extra design docs. Let's review it in GH and merge when we have
> > > consensus.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Tue, Mar 10, 2026 at 8:27 AM Subham Sangwan <
> > > [email protected]>
> > > wrote:
> > >
> > > > Hi Polaris Dev Team,
> > > >
> > > > I have opened PR #3960 [1] to introduce the foundational groundwork
> > > > for multi-datasource support in JDBC persistence, addressing
> > > > Issue #3890 [2]. The goal is to enable physical isolation of
> > > > different persistence workloads (METASTORE, METRICS, EVENTS) into
> > > > dedicated connection pools or databases. This will allow Polaris to
> > > > better handle high-traffic environments by preventing "noisy
> > > > neighbor" effects on the core entity tables.
> > > >
> > > > Key Highlights:
> > > >
> > > >    - DataSourceResolver: A new pluggable interface for routing JDBC
> > > >    connections based on RealmContext and StoreType.
> > > >    - Modular Design: Decoupled the resolution implementation into the
> > > >    runtime-common module.
> > > >    - Consistency: Utilizes a type-safe StoreType enum and aligns with
> > > >    existing RealmContext patterns.
> > > >
> > > > The PR has been refined with feedback from @dimas-b and is now ready
> > > > for community review. I'd appreciate any feedback on the overall
> > > > approach.
> > > >
> > > > Best regards,
> > > >
> > > > Subham Sangwan
> > > > GitHub: Subham-KRLX
> > > >
> > > > [1] https://github.com/apache/polaris/pull/3960
> > > > [2] https://github.com/apache/polaris/issues/3890
> > > >
> > >
> >
>
