Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Jean-Baptiste Onofré
Apache Nexus doesn’t allow Docker images (only Maven artifacts). So the
Docker images (release and nightly) will be published on Docker Hub.

That’s why I prefer to have an apache/polaris-server:nightly tag, overwritten
every day (similar to a Maven SNAPSHOT), to avoid accumulating a bunch of
Docker images with timestamp tags.
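
A nightly overwrite of a single tag could be wired up roughly like the following GitHub Actions sketch (the trigger, secret names, and build commands here are assumptions for illustration, not the contents of the actual PR):

```yaml
# .github/workflows/nightly.yml — illustrative sketch only
name: nightly
on:
  schedule:
    - cron: "0 2 * * *"   # every day at 02:00 UTC
jobs:
  publish-nightly:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build and push nightly image
        run: |
          docker login -u "${{ secrets.DOCKERHUB_USER }}" -p "${{ secrets.DOCKERHUB_TOKEN }}"
          docker build -t apache/polaris-server:nightly .
          # Pushing the same tag each night overwrites yesterday's image,
          # similar to how a Maven SNAPSHOT is overwritten.
          docker push apache/polaris-server:nightly
```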

Regards
JB

On Fri, Apr 18, 2025 at 12:30, Adnan Hemani  wrote:

> I’m indifferent either way - I could see someone potentially reporting
> a code regression using different nightly snapshot builds that persist with
> a timestamp tag. Keeping every night’s build could be helpful in that
> way.
>
> But I’m also not sure if making and persisting a nightly “unstable” image
> makes sense from a storage-consumption standpoint.
>
> One thing I wanted to double-check: these nightly images are being
> currently pushed only to Apache Nexus? No plans to push to Docker Hub for
> the nightly tags? Are we still thinking to push release images to Docker
> Hub? Should we also be pushing release images to Apache Nexus?
>
> Best,
> Adnan Hemani
>
> On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré  wrote:
>
> I can update my PR with docker image pub ;)
>
> On Fri, Apr 18, 2025 at 12:12, Jean-Baptiste Onofré  wrote:
>
> That’s a good idea.
>
> I would prefer a nightly tag that we overwrite every night.
>
> Thoughts ?
>
> Regards
> JB
>
> On Fri, Apr 18, 2025 at 11:25, Alex Dutra  wrote:
>
> Hi JB,
>
> Thanks for driving this!
>
> Would it make sense to also publish nightly docker images, with a tag like
> "unstable" or a timestamp?
>
> Thanks,
>
> Alex
>
> On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
> russell.spit...@gmail.com>
> wrote:
>
> Great news!
>
> On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
> wrote:
>
> Hi folks,
>
> I received several requests to get Polaris SNAPSHOT artifacts
> published (to build external tools, or from polaris-tools repo).
>
> I did a first SNAPSHOT deployment:
>
>
>
> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
>
>
> I also created a PR to have a GH Action for nightly build, publishing
> SNAPSHOT every day:
>
> https://github.com/apache/polaris/pull/1383
>
> Thanks !
> Regards
> JB
>
>
>


Re: [DISCUSS] Polaris Federated Principals and Roles

2025-04-18 Thread Yufei Gu
Thanks Michael for working on this. The spec looks good to me!

Yufei


On Thu, Apr 17, 2025 at 10:44 AM Eric Maynard 
wrote:

> +1 on the spec change
>
> On Wed, Apr 16, 2025 at 3:44 PM Michael Collado 
> wrote:
>
> > Hey folks
> >
> > Some of you already know that I posted an initial PR to get federated
> > principals/roles added. One thing that came out of the feedback was a
> spec
> > change to make it clear when federated identities can be used in the
> APIs.
> > Notably, federated principals cannot be created or updated, but can be
> > returned in get/list calls, whereas federated roles *can* be created by
> the
> > API. The latter is useful/necessary in order to be able to assign
> > privileges to those roles without relying on the JIT creation on login.
> >
> > Please check out the spec change here and let me know what you think -
> >
> >
> https://github.com/apache/polaris/pull/1353/files#diff-52444bc79608edfae86ed0b46d171f7ef63c20090860d877e4e135168311a986
> >
> > Mike
> >
> > On Tue, Dec 17, 2024 at 5:15 PM Dmitri Bourlatchkov
> >  wrote:
> >
> > > Hi Mike,
> > >
> > > I left some comments in the doc, but overall it looks good to me :)
> > >
> > > I still think there are some hidden dependencies on Persistence. For
> > > example, whether and how we can have composite keys for persisted
> > federated
> > > entities... but I guess we can work that out later.
> > >
> > > Also, I think it is important for the Authorizer API to avoid assuming
> > > that all principals are persisted. Specific authorizer implementations
> > > (including the default one) can certainly expect persisted principals,
> > > but the API should not require that, for the sake of flexibility of
> > > possible AuthN/Z extensions. WDYT?
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Thu, Nov 14, 2024 at 7:43 PM Michael Collado <
> collado.m...@gmail.com>
> > > wrote:
> > >
> > > > Hey folks
> > > >
> > > > As discussed during the community sync, I've put together some
> thoughts
> > > on
> > > > how we'd add support for federated identities in Polaris. I copied
> over
> > > > some of what I had in the issue at
> > > > https://github.com/apache/polaris/issues/441 and put it into the doc
> > > here:
> > > >
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/15_3ZiRB6Lhzw0nxij341QUdxEIyFGTrI9_18bFIyJVo/edit?tab=t.0
> > > > .
> > > >
> > > > Please take a look when you get some time and let me know what you
> > think.
> > > > Given that our next community sync is scheduled for the Thanksgiving
> > > > holiday in the US, it might be useful to schedule a meeting
> > specifically
> > > > for this. I can schedule that sync if needed.
> > > >
> > > > Mike
> > > >
> > >
> >
>


Re: [PROPOSAL] Include eclipselink module by default in Polaris distribution and docker image

2025-04-18 Thread Alex Dutra
Hi JB,

In my opinion, it makes sense to include the eclipselink module by default
in all distributions. Without eclipselink, the distributables are pretty
much useless for anything other than prototyping or evaluation.

By the way, for the admin tool, it really doesn't make any sense to ship it
without eclipselink as it needs to connect to a persistent metastore.

For the docker images, if all images have eclipselink included, I would
suggest simply publishing:

   - For the Polaris server: one image named "apache/polaris" with two
   tags: "0.10.0-beta-incubating" and "latest"
   - For the admin tool: one image named "apache/polaris-admin-tool" with
   two tags: "0.10.0-beta-incubating" and "latest".

If that can be of any help, the below command should do exactly that (you
need to be logged into Docker):

./gradlew clean assemble \
  -Dquarkus.container-image.tag=0.10.0-beta-incubating \
  -Dquarkus.container-image.build=true \
  -Dquarkus.container-image.push=true

Thanks,

Alex

On Fri, Apr 18, 2025 at 8:12 AM Jean-Baptiste Onofré 
wrote:

> Hi folks
>
> As discussed yesterday during the community meeting, I propose we include
> eclipselink module by default in Polaris server distribution and docker
> image.
>
> I think it would help our users to easily start with Polaris.
> I would like to include this for 0.10.0-beta-incubating release.
>
> Thoughts ?
>
> Regards
> JB
>


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Alex Dutra
Hi JB,

Thanks for driving this!

Would it make sense to also publish nightly docker images, with a tag like
"unstable" or a timestamp?

Thanks,

Alex

On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer 
wrote:

> Great news!
>
> On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
> wrote:
>
> > Hi folks,
> >
> > I received several requests to get Polaris SNAPSHOT artifacts
> > published (to build external tools, or from polaris-tools repo).
> >
> > I did a first SNAPSHOT deployment:
> >
> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
> >
> > I also created a PR to have a GH Action for nightly build, publishing
> > SNAPSHOT every day:
> > https://github.com/apache/polaris/pull/1383
> >
> > Thanks !
> > Regards
> > JB
> >
>


Re: Switch to Quarkus Security

2025-04-18 Thread Alex Dutra
Hi,

Thanks to all of you who reviewed PR 1! Here is PR 2, introducing support
for external IDPs:

https://github.com/apache/polaris/pull/1397

I included detailed explanations and examples in the PR.

Thanks,

Alex


On Thu, Apr 17, 2025 at 12:42 AM Michael Collado 
wrote:

> Very slick. Thanks for the extra flexibility. Looking forward to the PR
>
> Mike
>
> On Wed, Apr 16, 2025 at 12:54 PM Alex Dutra  >
> wrote:
>
> > Hi again,
> >
> > As a follow-up, I was able today to make it possible for each realm to be
> > dynamically authenticated by either the internal token endpoint, or any
> of
> > the configured OIDC tenants.
> >
> > So, I take back my previous statement about the impossibility to mix
> > internal and external authentication for the same realm.
> >
> > The following kind of realm configurations should therefore be possible:
> >
> >- Realm A: internal auth only, with RSA key pair
> >- Realm B: internal or external auth; internal with shared secret,
> >external with IDP X
> >- Realm C: external auth only, with IDP X or Y
> >
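
A hypothetical Quarkus-style configuration sketch for such a mixed setup. The Polaris-specific property names below are invented for illustration and are not the actual config keys; only the `quarkus.oidc.*` entries follow the real Quarkus OIDC multi-tenancy scheme:

```properties
# Global default: internal token broker (assumed key name)
polaris.authentication.type=internal

# Realm B: internal (shared secret) or external (IDP X) — assumed keys
polaris.authentication.realm-b.type=mixed
polaris.authentication.realm-b.token-broker.type=symmetric-key

# Realm C: external auth only, IDP X or Y — assumed keys
polaris.authentication.realm-c.type=external

# Quarkus OIDC tenants; tenant selection can be based on the token issuer
quarkus.oidc.idp-x.auth-server-url=https://idp-x.example.com/realms/polaris
quarkus.oidc.idp-y.auth-server-url=https://idp-y.example.com/realms/polaris
```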
> > I hope that this extra flexibility will make it easier for you to adapt
> to
> > Quarkus OIDC.
> >
> > Thanks,
> >
> > Alex
> >
> > On Wed, Apr 16, 2025 at 3:26 PM Alex Dutra 
> wrote:
> >
> > > Hi Mike,
> > >
> > > My current work makes it possible to define if the authentication is
> > > *internal* (using the internal token endpoint + custom auth mechanism)
> or
> > > *external* (using an external IDP + Quarkus OIDC extension).
> > >
> > > Furthermore, the authentication can be defined on a global level, *then
> > > overridden on a per-realm basis*. Combined with the fact that Quarkus
> > > OIDC also supports multi-tenancy, my work should make it possible to
> > > handle, for example, the following configuration:
> > >
> > >- Realm A: internal auth, RSA key pair M
> > >- Realm B: internal auth, RSA key pair N
> > >- Realm C: internal auth, Symmetric key, secret S
> > >- Realm D: external auth, IDP X
> > >- Realm E: external auth, IDP X or Y
> > >
> > > However:
> > >
> > > I want to make sure that it would still be
> > >> possible to use different authn mechanisms for different requests in
> the
> > >> same realm.
> > >
> > >
> > > *It won't be possible to mix internal and external authentication for
> the
> > > same realm.* I think this would complicate things a lot and I don't
> see a
> > > good reason to do this.
> > >
> > > That said, for realms that would opt for external auth, it will be
> > > possible to use more than one IDP per realm, since Quarkus OIDC is
> > > multi-tenant and has the ability to select tenants based on various
> > > criteria, such as the token issuer URL. This is what Realm E
> illustrates
> > in
> > > my example above.
> > >
> > > Is that OK for you?
> > >
> > > Thanks,
> > >
> > > Alex
> > >
> > >
> > > On Tue, Apr 15, 2025 at 10:32 PM Michael Collado <
> collado.m...@gmail.com
> > >
> > > wrote:
> > >
> > >> Hi Alex
> > >>
> > >> I'm going through the PR now and I think the Quarkus security approach
> > >> seems fine. I was actually thinking of working on this previously
> > myself.
> > >>
> > >> > This shall be done by  implementing a new
> HttpAuthenticationMechanism
> > >> that will pick the right authentication mechanism (internal token
> broker
> > >> vs
> > >> external IdP) based on the runtime configuration.
> > >>
> > >> Regarding this statement, I want to make sure that it would still be
> > >> possible to use different authn mechanisms for different requests in
> the
> > >> same realm. I also recently started picking up some of the work from
> the
> > >> federated auth proposal and something we need to ensure is that we can
> > >> support both external identity providers as well as the internal token
> > >> broker.
> > >>
> > >> Mike
> > >>
> > >>
> > >> On Tue, Apr 15, 2025 at 6:52 AM Jean-Baptiste Onofré  >
> > >> wrote:
> > >>
> > >> > Hi Alex,
> > >> >
> > >> > It sounds like a good plan :)
> > >> >
> > >> > Thanks !
> > >> > Regards
> > >> > JB
> > >> >
> > >> > On Mon, Apr 14, 2025 at 10:50 PM Alex Dutra
> > >> >  wrote:
> > >> > >
> > >> > > Hi all,
> > >> > >
> > >> > > A recently-reported bug [1] uncovered some serious issues with the
> > >> JAX-RS
> > >> > > authentication filters. Fixing this bug requires replacing the
> > >> > incriminated
> > >> > > filters with proper Quarkus Security mechanisms.
> > >> > >
> > >> > > In parallel to that, support for external identity providers has
> > been
> > >> > > requested many times, see [2], [3] and [4]. We know however that
> > this
> > >> > > feature can only be delivered by implementing similar mechanisms.
> > >> > >
> > >> > > There might be an opportunity here to kill two birds with one
> > stone. I
> > >> > > would like therefore to make the following proposal:
> > >> > >
> > >> > >1. In a first PR, *replace the current authentication filters*
> by
> > >> > >Quarkus Security. This PR should be transparent to users a

Re: [DISCUSS] Rollback REPLACE commit on conflicts

2025-04-18 Thread Dmitri Bourlatchkov
Hi Prashant,

Sorry for the delayed reply and apologies if I missed some relevant discussion.

As I understand it, the catalog could remove snapshots that come in between 
the previous and current snapshots from the perspective of one of the clients.

Can we be sure that the removed snapshots do not contain material data changes 
(e.g. new rows or updated rows) that should have been taken into account by the 
client whose snapshot is forced to become "current"? Could this result in data 
loss?

Thanks,
Dmitri.

On 2025/03/31 22:44:03 Prashant Singh wrote:
> Hey folks,
> 
> I wanted to propose a feature for Apache Polaris: rolling back
> REPLACE-operation snapshots when concurrent writes to an Iceberg table
> (e.g. compaction and other writers trying to commit at the same time)
> produce conflicts. This is a feature Ryan proposed as an alternative when I
> was proposing a Priority Amongst Writers proposal [1] in the Apache Iceberg
> community. It essentially makes compaction a permanently low-priority
> process.
> 
> Earlier, I added this feature as a client-side change in the Apache Iceberg
> repo [2]. It got some traction but didn't make it to the end. Thinking
> about it again, Apache Polaris seems to be the best place to do it, as it
> can benefit writer clients in other languages as well, and Polaris is the
> one that actually applies the commits based on the requirements and updates
> sent by the Iceberg REST client.
> 
> Here is my draft PR [3] showing how I think this can be achieved. Given
> that this is enabled by a table property, I'm happy to discuss other knobs,
> for example maybe checking a snapshot property?
> 
> The logic, essentially: if the base (B) on which the snapshot we want to
> include/commit is based has changed to something like (B'), and the
> snapshots from B' back to B are all of ops type *REPLACE*, then the catalog
> adds further updates within the same update-table request:
> 1. move the snapshot ref back to B
> 2. [optional] remove the snapshots between B' and B, given they are all
> *REPLACE*
> Then it retries the requirements and updates on the updated base and sees
> if they succeed, making all of this part of one update request committed to
> the table.
> Doing it this way preserves schema changes for which no new snapshot has
> been created; just a new metadata.json is created.
> 
> Happy to know your thoughts on the same.
> 
> Links:
> [1]
> https://docs.google.com/document/d/1pSqxf5A59J062j9VFF5rcCpbW9vdTbBKTmjps80D-B0/edit?tab=t.0#heading=h.fn6jmpw6phpn
> [2] https://github.com/apache/iceberg/pull/5888
> [3] https://github.com/apache/polaris/pull/1285
> 
> Best,
> Prashant Singh
> 
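The rollback check described above can be sketched as follows. This is a simplified illustration under assumed data structures, not the draft PR's actual implementation:

```python
# Hedged sketch of the rollback-REPLACE idea: the catalog may move a branch
# ref from the current head (B') back to a client's stale base (B) only if
# every intervening snapshot is a REPLACE (e.g. compaction). The Snapshot
# model and function names are assumptions for illustration.
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    id: int
    parent_id: Optional[int]
    operation: str  # Iceberg snapshot summary op, e.g. "append", "replace"


def snapshots_between(table: Dict[int, Snapshot], head: int, base: int) -> List[Snapshot]:
    """Walk parent links from the current head back to the client's base,
    collecting the intervening snapshots."""
    out: List[Snapshot] = []
    sid: Optional[int] = head
    while sid != base:
        if sid is None:
            raise ValueError("client base is not an ancestor of the current head")
        snap = table[sid]
        out.append(snap)
        sid = snap.parent_id
    return out


def can_rollback_to(table: Dict[int, Snapshot], head: int, base: int) -> bool:
    """Rollback is safe only if nothing was logically added or removed
    between the client's base and the current head."""
    return all(s.operation == "replace" for s in snapshots_between(table, head, base))
```

If the check passes, the catalog would move the ref back to B, optionally drop the REPLACE snapshots, and retry the client's requirements and updates within the same commit.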


Re: Discussion: Re-evaluating Realm Modeling in Polaris

2025-04-18 Thread Dmitri Bourlatchkov
I believe users of Apache Polaris may want to share the database across
many realms in environments that do not need secure separation of realms.
This is hypothetical at this point, of course. However, if option 3 is not
supported by the code, that use case will be impossible (or will require
subsequent changes and releases).

Even with option 1 if multiple realms are mixed in memory, the isolation
guarantees are not much stronger than with option 3. If the main concern is
strong isolation, then Polaris Servers should run with only one realm per
instance (per JVM).

I propose to delegate this decision to the Polaris admin.

I do not think the code will have to be more complex to support both
options 1 and 3 compared to option 1 alone. In fact, as far as I can tell,
supporting option 1 plus multiple realms per JVM is more complex than
option 3 alone.

Cheers,
Dmitri.


On Fri, Apr 18, 2025 at 4:38 PM Yufei Gu  wrote:

> Hi Folks,
>
> As we discussed, option 1 provides the strongest isolation, which should
> work particularly well for dynamically created data sources. Another
> significant benefit is that it's less complicated overall.
>
> I'm not convinced we need both option 1 and option 3. For scenarios
> involving only a single realm, the concept of a realm becomes unnecessary.
> In that case, there's no need for any additional options, including option
> 3.
>
> Yufei
>
>
> On Tue, Apr 15, 2025 at 11:19 AM Dmitri Bourlatchkov 
> wrote:
>
> > Going with options 1 and 3 initially sounds good to me. This should
> > simplify current JDBC PRs too.
> >
> > We can certainly add capabilities later, because having realm ID in the
> PR
> > does not preclude other deployment choices.
> >
> > Cheers,
> > Dmitri.
> >
> > On Tue, Apr 15, 2025 at 1:49 PM Michael Collado 
> > wrote:
> >
> > > My $.02 is that Option 1 is entirely possible using a DataSource that
> > > dynamically creates Connections as needed. Option 1 is nice because, as
> > > Pierre said, it gives admins the ability to dynamically allocate
> > resources
> > > to different clients as needed.
> > >
> > > Personally, I'm less inclined to option 3 just because it means
> > potentially
> > > larger blast radius if database credentials are ever leaked. But if
> most
> > > end users are expecting to only manage a single realm, it's probably
> the
> > > easiest and solves the most common use case.
> > >
> > > I like the option of combining 1 and 3 - by default, a single tenant
> > > deployment writes to a single end database, but admins have the ability
> > to
> > > configure dynamic connections to different database endpoints if
> multiple
> > > realms are supported.
> > >
> > > Mike
> > >
> > > On Tue, Apr 15, 2025 at 9:32 AM Alex Dutra
>  > >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'm in agreement with Pierre, JB and Dmitri's points. I’d like to add
> > > some
> > > > context from the Quarkus configuration angle:
> > > >
> > > > Option 1, which involves distinct datasources, presents a challenge.
> > > > Quarkus requires all datasources to be present and fully configured
> at
> > > > build time. This requirement could be quite cumbersome for end users,
> > > > making this option less user-friendly in practice.
> > > >
> > > > Regarding Option 2, while it's theoretically possible to manage
> > multiple
> > > > schemas with a single datasource, implementing this can be complex.
> To
> > > > effectively work with different schemas in PostgreSQL, you would need
> > to
> > > > either qualify all table identifiers or adjust the `search_path` URL
> > > > parameter. Additionally, other JDBC backends like MySQL don't support
> > > > multiple schemas per database, which would make Option 2 less
> portable
> > > > across different JDBC databases.
> > > >
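
As a concrete illustration of the two PostgreSQL approaches mentioned above (schema and table names are made up):

```sql
-- Option 2 sketch, PostgreSQL-specific. Either qualify every identifier
-- with the realm's schema...
SELECT * FROM realm_a.entities WHERE entity_id = 42;

-- ...or set the search_path per connection (also settable through the
-- pgJDBC URL, e.g. ?currentSchema=realm_a) and use unqualified names:
SET search_path TO realm_a;
SELECT * FROM entities WHERE entity_id = 42;
```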
> > > > That's why I think Option 3 is the most portable one, and the easiest
> > for
> > > > users or administrators to configure. As Pierre noted, it is subject
> to
> > > > noisy neighbor interferences – but to some extent, I think
> > interferences
> > > > could also happen with separate schemas like in option 2.
> > > >
> > > > Just my 2 cents.
> > > >
> > > > Thanks,
> > > >
> > > > Alex
> > > >
> > > >
> > > > On Tue, Apr 15, 2025 at 4:00 PM Dmitri Bourlatchkov <
> di...@apache.org>
> > > > wrote:
> > > >
> > > > > Thanks for your perspective, Pierre! You make good points and I
> agree
> > > > with
> > > > > them.
> > > > >
> > > > > From my POV, I'd add that we probably need to take deployment
> > concerns
> > > > into
> > > > > account too.
> > > > >
> > > > > If the deployment uses the database per realm approach (option 1)
> > then
> > > > > someone has to provide database connection parameters (including
> > > > secrets).
> > > > > If that is the deployment administrator, then the admin necessarily
> > has
> > > > to
> > > > > be aware of all realms and effectively has control of the data in
> all
> > > > > realms. Isolation is achieved only for end users.
> > > > >
> > > > > That said, even with option

Re: Discussion: Re-evaluating Realm Modeling in Polaris

2025-04-18 Thread Yufei Gu
Thanks for the thoughtful input.

While it's true that some environments may not require strict separation
between realms, the risk of incorrect usage or subtle cross-realm
interference is significantly higher if we allow shared databases without
enforcing strong boundaries.

Option 1 gives us strong, predictable isolation with minimal complexity and
fewer edge cases. Yes, if multiple realms are mixed in the same JVM even
with option 1, isolation may still be compromised, but at least the design
makes this explicit and easier to reason about. Running one realm per
Polaris instance is a reasonable solution for environments that value
isolation, and option 1 just works, while option 3 adds unnecessary
complexity.

I believe adding support for both option 1 and option 3 introduces not just
code complexity, but also operational ambiguity and a burden on users to
fully understand the trade-offs. Instead of delegating this to admins, we
should first aim for clarity and safety in the design.

We can always revisit this in the future if a strong real-world use case
arises. For now, I’d prefer we keep the design simple and unambiguous.

Yufei


On Fri, Apr 18, 2025 at 3:17 PM Dmitri Bourlatchkov 
wrote:

> I believe users of Apache Polaris may want to share the database across
> many realms in environments that do not need secure separation of realms.
> This is hypothetical, at this point, of course. However, if option 3 is not
> supported by the code, that use case will be impossible (or require subsequent
> changes and releases).
>
> Even with option 1 if multiple realms are mixed in memory, the isolation
> guarantees are not much stronger than with option 3. If the main concern is
> strong isolation, then Polaris Servers should run with only one realm per
> instance (per JVM).
>
> I propose to delegate this decision to the Polaris admin.
>
> I do not think the code will have to be more complex to support both
> options 1 and 3 compared to option 1 alone. In fact, as far as I can tell,
> supporting option 1 plus multiple realms per JVM is more complex than
> option 3 alone.
>
> Cheers,
> Dmitri.
>
>
> On Fri, Apr 18, 2025 at 4:38 PM Yufei Gu  wrote:
>
> > Hi Folks,
> >
> > As we discussed, option 1 provides the strongest isolation, which should
> > work particularly well for dynamically created data sources. Another
> > significant benefit is that it's less complicated overall.
> >
> > I'm not convinced we need both option 1 and option 3. For scenarios
> > involving only a single realm, the concept of a realm becomes
> unnecessary.
> > In that case, there's no need for any additional options, including
> option
> > 3.
> >
> > Yufei
> >
> >
> > On Tue, Apr 15, 2025 at 11:19 AM Dmitri Bourlatchkov 
> > wrote:
> >
> > > Going with options 1 and 3 initially sounds good to me. This should
> > > simplify current JDBC PRs too.
> > >
> > > We can certainly add capabilities later, because having realm ID in the
> > PR
> > > does not preclude other deployment choices.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Tue, Apr 15, 2025 at 1:49 PM Michael Collado <
> collado.m...@gmail.com>
> > > wrote:
> > >
> > > > My $.02 is that Option 1 is entirely possible using a DataSource that
> > > > dynamically creates Connections as needed. Option 1 is nice because,
> as
> > > > Pierre said, it gives admins the ability to dynamically allocate
> > > resources
> > > > to different clients as needed.
> > > >
> > > > Personally, I'm less inclined to option 3 just because it means
> > > potentially
> > > > larger blast radius if database credentials are ever leaked. But if
> > most
> > > > end users are expecting to only manage a single realm, it's probably
> > the
> > > > easiest and solves the most common use case.
> > > >
> > > > I like the option of combining 1 and 3 - by default, a single tenant
> > > > deployment writes to a single end database, but admins have the
> ability
> > > to
> > > > configure dynamic connections to different database endpoints if
> > multiple
> > > > realms are supported.
> > > >
> > > > Mike
> > > >
> > > > On Tue, Apr 15, 2025 at 9:32 AM Alex Dutra
> >  > > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'm in agreement with Pierre, JB and Dmitri's points. I’d like to
> add
> > > > some
> > > > > context from the Quarkus configuration angle:
> > > > >
> > > > > Option 1, which involves distinct datasources, presents a
> challenge.
> > > > > Quarkus requires all datasources to be present and fully configured
> > at
> > > > > build time. This requirement could be quite cumbersome for end
> users,
> > > > > making this option less user-friendly in practice.
> > > > >
> > > > > Regarding Option 2, while it's theoretically possible to manage
> > > multiple
> > > > > schemas with a single datasource, implementing this can be complex.
> > To
> > > > > effectively work with different schemas in PostgreSQL, you would
> need
> > > to
> > > > > either qualify all table identifiers or adjust the `search

Re: [DISCUSS] Polaris Federated Principals and Roles

2025-04-18 Thread Dmitri Bourlatchkov
Re 2: Thanks for the clarification, Mike! I guess my brain swapped out a
large portion of that doc :)

I'm still not sure how IdentityToPrincipalMapping can help with resolving
changes from API and from IdP integration.

The doc talks about namespaces, but in PR# 1353, it looks like API calls
are free to change anything in federated principal roles.

Should IdentityToPrincipalMapping extend to validating changes coming from
the Admin API?

Alternatively, we could treat that as a new permission and allow all admin
users to edit federated roles via API.

Re 1: Yes, I guess it would be helpful with SCIM, but then we'd also need to
sync role membership.

My understanding so far was that we'd get role membership at authentication
time and not persist it in Polaris.

Is that still the plan?

I think this relates to how we connect authentication and authorization in
Polaris. If Principals are external, we get role membership from the access
token and all is clear. If Principals are local, and we still get roles
from the access token, then the question arises about whether that data is
in sync with local data (and the other way around too).

That could be resolved by Polaris owning access tokens (via token exchange)
and assuming control over role membership based on local data and access
time (I commented on this in the doc too).

All in all, since SCIM is not part of phase 1 here, maybe we could defer
dealing with federated Principal persistence, indeed.

Re 3: I guess it also applies to client secret generation code, but let's
see whether it actually causes too many conditions in code when it comes to
that.

Cheers,
Dmitri.

On Thu, Apr 17, 2025 at 7:42 PM Michael Collado 
wrote:

> Thanks for the response.
>
> 1. TBH, I have no strong feelings about persisting the federated
> principals. I think the biggest advantage I saw was to support SCIM along
> with 3p identity providers. But for this implementation, if we want to skip
> persisting principal entities, I'm ok with it.
>
> 2. In the doc I wrote the following, which I think addresses the concern
> around conflicts.
>
>1.
>
>The IdentityToPrincipalMapping should supply a naming convention so that
>PrincipalRoles and Principals created from that source do not conflict
> with
>entities created by a different IdentityProvider
>
> In the case that a role happens to exist even with the naming convention
> applied, the caller would be unable to authenticate with that role.
>
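The naming-convention idea could be sketched like this. The "fed:" prefix format and helper names are assumptions for illustration, not the convention used by the actual PR:

```python
# Hedged sketch: role names minted for federated identities carry a reserved
# prefix so they cannot collide with roles from a different IdentityProvider
# or with native Polaris roles.
from typing import Optional

FEDERATED_PREFIX = "fed:"


def federated_role_name(idp_id: str, external_role: str) -> str:
    """Derive a Polaris role name that is unique per identity provider."""
    return f"{FEDERATED_PREFIX}{idp_id}:{external_role}"


def can_assume(principal_idp: Optional[str], role_name: str) -> bool:
    """A federated principal (principal_idp set) may only assume roles minted
    for its own IdP; a native principal (principal_idp=None) may only assume
    non-federated roles."""
    if role_name.startswith(FEDERATED_PREFIX):
        _, idp, _ = role_name.split(":", 2)
        return principal_idp == idp
    return principal_idp is None
```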
> 3. I don't think we need special classes for federated roles. In nearly all
> of the Polaris code, they are treated as PolarisRoleEntity without any
> issue. The only code that needs to be aware of federated roles is
>
> a) The Admin API, which prevents granting these roles to principals and
>
> b) The federated roles provider, which verifies that the federated
> principal is assuming federated roles.
>
> The rest of the code (i.e., the resolver and authorizer) treat these roles
> exactly as native Polaris roles.
>
> Mike
>
> On Thu, Apr 17, 2025 at 2:39 PM Dmitri Bourlatchkov 
> wrote:
>
> > Thanks for reviving this discussion, Mike!
> >
> > The API spec change by itself LGTM, but I have related concerns in how
> this
> > feature is meant to work in general.
> >
> > 1) The need to expose Federated Principals in Polaris API.
> >
> > The design doc [1] discusses the possibility to expose Federated
> Principals
> > in Polaris API, but there are pros and cons. I do not think it is a
> trivial
> > decision. I'd like to discuss this in more detail before we commit to
> doing
> > that.
> >
> > From my POV the main benefit of exposing Federated Principals is to be
> able
> > to alter their properties in Polaris (as mentioned in the doc). However,
> > your email indicates that this is actually forbidden.
> >
> > What use cases do you envision for exposing read-only Federated
> Principals?
> >
> > 2) Reconciliation of Federated vs. Non-Federated changes.
> >
> > Let's focus on Principal Roles here (which are meant to be writable via
> > API).
> >
> > If users are able to make and change (I assume) Principal Roles via API,
> > and federation code will also be creating and changing those roles, what
> is
> > our approach to handling conflicts between those two streams of changes?
> >
> > Some examples for context (also mentioned in GH):
> >
> > A) A user creates normal Principal Role "A", and later role "A" is
> > discovered through Federation.
> >
> > B) A user edits Federated Role "B" and later properties of "B" are
> updated
> > through Federation (or the other way around).
> >
> > Do we want to control property edits based on the origin of the property
> > (spec out a "namespace" for Federated properties)?
> >
> > 3) OO types
> >
> > Currently Principals and Principal Roles are represented by one java type
> > each. When federation comes into play, would it make sense to develop a
> > (java) type hierarchy for working with that data in Polaris code? My main
> > concern here is av

Re: Discussion: Re-evaluating Realm Modeling in Polaris

2025-04-18 Thread Yufei Gu
Hi Folks,

As we discussed, option 1 provides the strongest isolation, which should
work particularly well for dynamically created data sources. Another
significant benefit is that it's less complicated overall.

I'm not convinced we need both option 1 and option 3. For scenarios
involving only a single realm, the concept of a realm becomes unnecessary.
In that case, there's no need for any additional options, including option
3.

Yufei


On Tue, Apr 15, 2025 at 11:19 AM Dmitri Bourlatchkov 
wrote:

> Going with options 1 and 3 initially sounds good to me. This should
> simplify current JDBC PRs too.
>
> We can certainly add capabilities later, because having realm ID in the PR
> does not preclude other deployment choices.
>
> Cheers,
> Dmitri.
>
> On Tue, Apr 15, 2025 at 1:49 PM Michael Collado 
> wrote:
>
> > My $.02 is that Option 1 is entirely possible using a DataSource that
> > dynamically creates Connections as needed. Option 1 is nice because, as
> > Pierre said, it gives admins the ability to dynamically allocate
> resources
> > to different clients as needed.
> >
> > Personally, I'm less inclined to option 3 just because it means
> potentially
> > larger blast radius if database credentials are ever leaked. But if most
> > end users are expecting to only manage a single realm, it's probably the
> > easiest and solves the most common use case.
> >
> > I like the option of combining 1 and 3 - by default, a single tenant
> > deployment writes to a single end database, but admins have the ability
> to
> > configure dynamic connections to different database endpoints if multiple
> > realms are supported.
> >
> > Mike
> >
> > On Tue, Apr 15, 2025 at 9:32 AM Alex Dutra  >
> > wrote:
> >
> > > Hi all,
> > >
> > > I'm in agreement with Pierre, JB and Dmitri's points. I’d like to add
> > some
> > > context from the Quarkus configuration angle:
> > >
> > > Option 1, which involves distinct datasources, presents a challenge.
> > > Quarkus requires all datasources to be present and fully configured at
> > > build time. This requirement could be quite cumbersome for end users,
> > > making this option less user-friendly in practice.
> > >
> > > Regarding Option 2, while it's theoretically possible to manage
> multiple
> > > schemas with a single datasource, implementing this can be complex. To
> > > effectively work with different schemas in PostgreSQL, you would need
> to
> > > either qualify all table identifiers or adjust the `search_path` URL
> > > parameter. Additionally, other JDBC backends like MySQL don't support
> > > multiple schemas per database, which would make Option 2 less portable
> > > across different JDBC databases.
> > >
> > > That's why I think Option 3 is the most portable one, and the easiest
> for
> > > users or administrators to configure. As Pierre noted, it is subject to
> > > noisy neighbor interferences – but to some extent, I think
> interferences
> > > could also happen with separate schemas like in option 2.
> > >
> > > Just my 2 cents.
> > >
> > > Thanks,
> > >
> > > Alex
> > >
> > >
> > > On Tue, Apr 15, 2025 at 4:00 PM Dmitri Bourlatchkov 
> > > wrote:
> > >
> > > > Thanks for your perspective, Pierre! You make good points and I agree
> > > with
> > > > them.
> > > >
> > > > From my POV, I'd add that we probably need to take deployment
> concerns
> > > into
> > > > account too.
> > > >
> > > > If the deployment uses the database per realm approach (option 1)
> then
> > > > someone has to provide database connection parameters (including
> > > secrets).
> > > > If that is the deployment administrator, then the admin necessarily
> has
> > > to
> > > > be aware of all realms and effectively has control of the data in all
> > > > realms. Isolation is achieved only for end users.
> > > >
> > > > That said, even with option 3 the deployment owner has control over
> all
> > > > realms and end users are isolated as far as their access to APIs is
> > > > concerned. End users cannot discover each other's data (barring
> coding
> > > > mistakes in Polaris). The same goes for option 2 as it's the middle
> > > ground.
> > > >
> > > > I do not see any material difference between options 1, 2 and 3 from
> > the
> > > > end user's perspective.
> > > >
> > > > If, however, the database connection parameters are not controlled by
> > the
> > > > administrator, but by the end user who wants to define a realm, then
> > > > Polaris needs to expose managing database connections and secrets.
> This
> > > may
> > > > be a valuable feature, but I believe it is far beyond current Polaris
> > > > backend capabilities. I do not think going this way is justified at
> > this
> > > > time.
> > > >
> > > > I'd like to propose a hybrid approach where Polaris provides
> > capabilities
> > > > (and config) for the administrators to choose between options 1, 2, 3
> > > > according to their specific deployment concerns.
> > > >
> > > > This means that the primary key has to include the realm ID, because
> if
> >
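To make the trade-off between the options above concrete, here is a minimal sketch, assuming PostgreSQL: option 2 relies on the pgJDBC `currentSchema` URL parameter (which sets the `search_path`), while option 3 puts the realm ID in every primary key. Table, column, and URL names are purely illustrative, not the actual Polaris schema:

```java
public class RealmDatasourceSketch {

    // Option 2: one database, one schema per realm. The PostgreSQL JDBC
    // driver accepts a currentSchema URL parameter, which sets search_path
    // and avoids qualifying every table identifier. (MySQL has no direct
    // equivalent, which is the portability concern raised above.)
    static String schemaPerRealmUrl(String host, String db, String realmId) {
        return "jdbc:postgresql://" + host + "/" + db
            + "?currentSchema=polaris_" + realmId;
    }

    // Option 3: one shared set of tables; the realm ID becomes the leading
    // column of every primary key, so rows from different realms can never
    // collide. Table and column names here are illustrative only.
    static final String ENTITIES_DDL =
        "CREATE TABLE entities ("
        + " realm_id VARCHAR NOT NULL,"
        + " entity_id BIGINT NOT NULL,"
        + " payload JSONB,"
        + " PRIMARY KEY (realm_id, entity_id))";

    public static void main(String[] args) {
        System.out.println(schemaPerRealmUrl("db.example.com", "polaris", "acme"));
        System.out.println(ENTITIES_DDL);
    }
}
```

The hybrid approach proposed above amounts to supporting both shapes behind one persistence interface, with the admin choosing per deployment.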

Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Alex Dutra
Hi all,

I think a tag like "nightly" or "unstable", that gets overwritten every
night in Docker Hub, is probably a safer choice than publishing timestamped
images that could accumulate over time. Would you support that?

JB: how about we merge your PR for pushing snapshots first, then you (or
someone else) tackles the nightly docker images task? Because I think we
need to have more community feedback before we go ahead and do it.

Alex

On Fri, Apr 18, 2025 at 1:29 PM Jean-Baptiste Onofré 
wrote:

> Apache Nexus doesn’t allow docker image (only maven artifacts). So the
> docker images (release and nightly) will be published on docker hub.
>
> That’s why I prefer to have apache/polaris-server:nightly tag, overwritten
> every day (similar to Maven SNAPSHOT) (to avoid to have a bunch of docker
> images with timestamp tag).
>
> Regards
> JB
>
> Le ven. 18 avr. 2025 à 12:30, Adnan Hemani  a
> écrit :
>
> > I’m indifferent to either way - I could see potentially someone reporting
> > a code regression using different nightly snapshot builds that persist
> with
> > a nightly timestamp. Keeping every night’s build could be helpful in that
> > way.
> >
> > But I’m also not sure if making and persisting a nightly “unstable” image
> > makes sense from a file consumption standpoint.
> >
> > One thing I wanted to double-check: these nightly images are being
> > currently pushed only to Apache Nexus? No plans to push to Docker Hub for
> > the nightly tags? Are we still thinking to push release images to Docker
> > Hub? Should we also be pushing release images to Apache Nexus?
> >
> > Best,
> > Adnan Hemani
> >
> > On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré 
> wrote:
> >
> > I can update my PR with docker image pub ;)
> >
> > Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré  a
> > écrit :
> >
> > That’s a good idea.
> >
> > I would prefer a nightly tag that we overwrite every night.
> >
> > Thoughts ?
> >
> > Regards
> > JB
> >
> > Le ven. 18 avr. 2025 à 11:25, Alex Dutra 
> > a écrit :
> >
> > Hi JB,
> >
> > Thanks for driving this!
> >
> > Would it make sense to also publish nightly docker images, with a tag
> like
> > "unstable" or a timestamp?
> >
> > Thanks,
> >
> > Alex
> >
> > On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
> > russell.spit...@gmail.com>
> > wrote:
> >
> > Great news!
> >
> > On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
> > wrote:
> >
> > Hi folks,
> >
> > I received several requests to get Polaris SNAPSHOT artifacts
> > published (to build external tools, or from polaris-tools repo).
> >
> > I did a first SNAPSHOT deployment:
> >
> >
> >
> >
> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
> >
> >
> > I also created a PR to have a GH Action for nightly build, publishing
> > SNAPSHOT every day:
> >
> >
> https://github.com/apache/polaris/pull/1383
> >
> > Thanks !
> > Regards
> > JB
> >
> >
> >
>


Re: [Polaris Meeting Sync] Structure of the meeting

2025-04-18 Thread Michael Collado
We do the newcomer "Hi, I'm here to learn/talk about..." in the OpenLineage
syncs and I've always found it useful to hear what users are interested in
discussing. Definitely +1 to that idea.

On Tue, Apr 15, 2025 at 12:22 PM Jean-Baptiste Onofré 
wrote:

> Great points, Russell !
>
> Let’s give priority to first time people, I like this ! (We did that last
> time with the Xtable folks).
> Agree about debates, it’s better if it happens on mailing list as it gives
> visibility to everyone.
> And we can always have dedicated meeting when needed.
>
> Thanks !
>
> Regards
> JB
>
> Le mar. 15 avr. 2025 à 19:54, Russell Spitzer 
> a
> écrit :
>
> > I'm strongly in favor of having a timeboxed Agenda. I do want to make
> sure
> > we always poll new attendees first. I'm not sure this would fit into our
> > agenda but the Parquet sync usually does a roll-call and "what are you
> > interested in?" at the top of the hour. We probably have too many people
> > for that but a "hello, I'm new and I'm here for ..." is probably great for
> > first timers.
> >
> > I also want to keep debates out of the meeting. If folks have a lot of
> > questions we should just punt to a sub-meeting or mail thread.
> >
> > On Tue, Apr 15, 2025 at 2:17 AM Jean-Baptiste Onofré 
> > wrote:
> >
> > > Hi folks,
> > >
> > > In order to "structure" on Community Meetings, I propose to have the
> > > following plan for the meetings, organized by themes:
> > >
> > > 1. Highlights (5mn):
> > >all cool things we did during the last two weeks
> > > 2. Podling Governance (5/10mn):
> > >   Update about release, new committers, incubator report, security
> > > reports, ...
> > > 3. UX and Onboarding (5/10mn):
> > >   Update about user experience (UI, CLI, ...) and user onboarding
> > > (documentation, examples, ...)
> > > 4. Persistence (5/10mn):
> > >   Update about persistence API, new backends, ...
> > > 5. Security and Data Governance (5/10mn):
> > >   Update about security and data governance layer (SSO, RBAC, FGAC,
> > > ABAC, ...), integration with IdP, ...
> > >  6. Catalog Integration (5/10mn):
> > >   Update about Catalog Federation, Foreign Tables, Table Formats
> > > Support/Converter, Unity integration, ...
> > >  7. Polaris Tools (5/10mn):
> > >   Update about Polaris tool (catalog migrator; benchmark, ...)
> > >
> > > The idea is to populate each theme before the meeting to focus on key
> > > topics to discuss during the meeting.
> > >
> > > Thoughts ?
> > >
> > > Regards
> > > JB
> > >
> >
>


Re: [DISCUSS] Adding an API to allow service admins to reset any principal's credentials.

2025-04-18 Thread Eric Maynard
From what I understand, there is not a historical reason for this not
having been implemented. It was discussed, but never prioritized.

The doc looks great Mansehaj, thanks for putting this together.

On Thu, Apr 17, 2025 at 3:14 PM Dmitri Bourlatchkov 
wrote:

> Thanks, Mansehaj!
>
> Very nice proposal! I added some comments to the doc.
>
> I think in general it is a valuable feature, but as you mentioned in the
> doc there may be historical reasons why it was not implemented initially. I
> hope people more knowledgeable in this area can comment on that.
>
> Cheers,
> Dmitri.
>
> On Thu, Apr 17, 2025 at 2:59 PM Mansehaj Singh
>  wrote:
>
> > Hi everyone!
> >
> > I've drafted a small proposal here:
> >
> >
> https://docs.google.com/document/d/1uIJUp1BeAGm_mSO8OBjIZmeL5zY8dLzuYRP4Ah1U7X0/edit?usp=sharing
> >
> >
> > In summary, this proposes adding a new resetCredentials functionality to
> > Polaris to allow service_admin to be able to reset any principal's
> > credentials. This is really useful for a number of different credential
> > loss scenarios, eg. when someone leaves the company, an employee goes on
> > temporary leave, forgotten passwords etc. Of course, we hope that users
> > have no single point of failure for their critical workloads, but these
> > kinds of issues often don't become apparent until credential loss
> actually
> > occurs, at which point remediation can become difficult in Polaris.
> Having
> > a root user with the ability to reset credentials is a decently common
> > concept and I think it would add value. Currently, the only workaround is
> > to create a new principal and reassign all of their principal roles.
> >
> > However, addressing the risks is also important. This proposal
> introduces a
> > path for service_admins to basically be able to assume any principal
> within
> > Polaris, which is a major security vulnerability if a principal with
> > service_admin access ever did become compromised. However, this risk is
> > already present because a service_admin can create principals and assign
> > principal roles to assume the same level of privileges desired, it just
> > can't actually impersonate as any other principal.
> >
> > It looks like there is existing appetite to add something like this to
> > Polaris: https://github.com/apache/polaris/issues/624
> >
> > I'm curious to hear what the community thinks we can do to address this
> and
> > whether introducing these risks is worth having functionality like this.
> >
> > Thanks for reading,
> > Sehaj
> >
>
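The server-side core of a credential reset is secret rotation. The following is only a hypothetical sketch of that step — the actual endpoint shape, naming, and return type are exactly what the linked proposal doc is discussing, and none of these names are Polaris APIs:

```java
import java.security.SecureRandom;
import java.util.Base64;

public class CredentialResetSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh, URL-safe client secret with 256 bits of entropy.
    // In a real reset flow the old secret would be invalidated atomically
    // and the new one returned to the service_admin exactly once.
    static String newClientSecret() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        String secret = newClientSecret();
        System.out.println("rotated secret length: " + secret.length());
    }
}
```

The audit-trail and authorization questions (who may call this, and how it is logged) are the substantive part of the proposal; this sketch deliberately covers only the rotation mechanics.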


Re: [PROPOSAL] Include eclipselink module by default in Polaris distribution and docker image

2025-04-18 Thread Dmitri Bourlatchkov
I agree that including the EclipseLink-based Persistence into binary
distributions is a good idea.

I'd even take this one step further and remove all conditional build steps
for EclipseLink and include it in regular builds too.

Cheers,
Dmitri.

On Fri, Apr 18, 2025 at 2:12 AM Jean-Baptiste Onofré 
wrote:

> Hi folks
>
> As discussed yesterday during the community meeting, I propose we include
> eclipselink module by default in Polaris server distribution and docker
> image.
>
> I think it would help our users to easily start with Polaris.
> I would like to include this for 0.10.0-beta-incubating release.
>
> Thoughts ?
>
> Regards
> JB
>


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Jean-Baptiste Onofré
That’s a good idea.

I would prefer a nightly tag that we overwrite every night.

Thoughts ?

Regards
JB

Le ven. 18 avr. 2025 à 11:25, Alex Dutra  a
écrit :

> Hi JB,
>
> Thanks for driving this!
>
> Would it make sense to also publish nightly docker images, with a tag like
> "unstable" or a timestamp?
>
> Thanks,
>
> Alex
>
> On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer  >
> wrote:
>
> > Great news!
> >
> > On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
> > wrote:
> >
> > > Hi folks,
> > >
> > > I received several requests to get Polaris SNAPSHOT artifacts
> > > published (to build external tools, or from polaris-tools repo).
> > >
> > > I did a first SNAPSHOT deployment:
> > >
> >
> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
> > >
> > > I also created a PR to have a GH Action for nightly build, publishing
> > > SNAPSHOT every day:
> > > https://github.com/apache/polaris/pull/1383
> > >
> > > Thanks !
> > > Regards
> > > JB
> > >
> >
>


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Jean-Baptiste Onofré
I can update my PR with docker image pub ;)

Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré  a
écrit :

> That’s a good idea.
>
> I would prefer a nightly tag that we overwrite every night.
>
> Thoughts ?
>
> Regards
> JB
>
> Le ven. 18 avr. 2025 à 11:25, Alex Dutra 
> a écrit :
>
>> Hi JB,
>>
>> Thanks for driving this!
>>
>> Would it make sense to also publish nightly docker images, with a tag like
>> "unstable" or a timestamp?
>>
>> Thanks,
>>
>> Alex
>>
>> On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
>> russell.spit...@gmail.com>
>> wrote:
>>
>> > Great news!
>> >
>> > On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
>> > wrote:
>> >
>> > > Hi folks,
>> > >
>> > > I received several requests to get Polaris SNAPSHOT artifacts
>> > > published (to build external tools, or from polaris-tools repo).
>> > >
>> > > I did a first SNAPSHOT deployment:
>> > >
>> >
>> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
>> > >
>> > > I also created a PR to have a GH Action for nightly build, publishing
>> > > SNAPSHOT every day:
>> > > https://github.com/apache/polaris/pull/1383
>> > >
>> > > Thanks !
>> > > Regards
>> > > JB
>> > >
>> >
>>
>


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Adnan Hemani
I’m indifferent to either way - I could see potentially someone reporting a 
code regression using different nightly snapshot builds that persist with a 
nightly timestamp. Keeping every night’s build could be helpful in that way.

But I’m also not sure if making and persisting a nightly “unstable” image makes 
sense from a file consumption standpoint.

One thing I wanted to double-check: these nightly images are being currently 
pushed only to Apache Nexus? No plans to push to Docker Hub for the nightly 
tags? Are we still thinking to push release images to Docker Hub? Should we 
also be pushing release images to Apache Nexus?

Best,
Adnan Hemani

> On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré  wrote:
> 
> I can update my PR with docker image pub ;)
> 
> Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré  > a
> écrit :
> 
>> That’s a good idea.
>> 
>> I would prefer a nightly tag that we overwrite every night.
>> 
>> Thoughts ?
>> 
>> Regards
>> JB
>> 
>> Le ven. 18 avr. 2025 à 11:25, Alex Dutra 
>> a écrit :
>> 
>>> Hi JB,
>>> 
>>> Thanks for driving this!
>>> 
>>> Would it make sense to also publish nightly docker images, with a tag like
>>> "unstable" or a timestamp?
>>> 
>>> Thanks,
>>> 
>>> Alex
>>> 
>>> On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
>>> russell.spit...@gmail.com>
>>> wrote:
>>> 
 Great news!
 
 On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
 wrote:
 
> Hi folks,
> 
> I received several requests to get Polaris SNAPSHOT artifacts
> published (to build external tools, or from polaris-tools repo).
> 
> I did a first SNAPSHOT deployment:
> 
 
>>> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
> 
> I also created a PR to have a GH Action for nightly build, publishing
> SNAPSHOT every day:
> https://github.com/apache/polaris/pull/1383
> 
> Thanks !
> Regards
> JB



Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Yufei Gu
This is great! Thanks JB!
Yufei


On Fri, Apr 18, 2025 at 11:36 AM Adnan Hemani
 wrote:

> Thanks for the clarification, JB! I agree with everything mentioned on the
> thread then - I’m happy to keep working on the Docker image pushes since I
> made a PR recently for Docker images publishing on new releases. I also
> agree that a “nightly”/“snapshot”/“unstable” tag that gets overridden every
> night sounds good.
>
> Best,
> Adnan Hemani
>
> > On Apr 18, 2025, at 6:40 AM, Dmitri Bourlatchkov 
> wrote:
> >
> > Doing Maven and Docker in two phases sounds good to me.
> >
> > Also, from my POV using one (reassigned) "nightly" or "unstable" tag is
> > reasonable.
> >
> > Regarding tracking specific build information related to bug reports, I
> > suppose we could add a git commit hash to a log message and/or jar
> Manifest
> > attributes (if we do not have that yet).
> >
> > Cheers,
> > Dmitri.
> >
> > On Fri, Apr 18, 2025 at 8:27 AM Alex Dutra  >
> > wrote:
> >
> >> Hi all,
> >>
> >> I think a tag like "nightly" or "unstable", that gets overwritten every
> >> night in Docker Hub, is probably a safer choice than publishing
> timestamped
> >> images that could accumulate over time. Would you support that?
> >>
> >> JB: how about we merge your PR for pushing snapshots first, then you (or
> >> someone else) tackles the nightly docker images task? Because I think we
> >> need to have more community feedback before we go ahead and do it.
> >>
> >> Alex
> >>
> >> On Fri, Apr 18, 2025 at 1:29 PM Jean-Baptiste Onofré 
> >> wrote:
> >>
> >>> Apache Nexus doesn’t allow docker image (only maven artifacts). So the
> >>> docker images (release and nightly) will be published on docker hub.
> >>>
> >>> That’s why I prefer to have apache/polaris-server:nightly tag,
> >> overwritten
> >>> every day (similar to Maven SNAPSHOT) (to avoid to have a bunch of
> docker
> >>> images with timestamp tag).
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> Le ven. 18 avr. 2025 à 12:30, Adnan Hemani  >
> >> a
> >>> écrit :
> >>>
>  I’m indifferent to either way - I could see potentially someone
> >> reporting
>  a code regression using different nightly snapshot builds that persist
> >>> with
>  a nightly timestamp. Keeping every night’s build could be helpful in
> >> that
>  way.
> 
>  But I’m also not sure if making and persisting a nightly “unstable”
> >> image
>  makes sense from a file consumption standpoint.
> 
>  One thing I wanted to double-check: these nightly images are being
>  currently pushed only to Apache Nexus? No plans to push to Docker Hub
> >> for
>  the nightly tags? Are we still thinking to push release images to
> >> Docker
>  Hub? Should we also be pushing release images to Apache Nexus?
> 
>  Best,
>  Adnan Hemani
> 
>  On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré 
> >>> wrote:
> 
>  I can update my PR with docker image pub ;)
> 
>  Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré 
> a
>  écrit :
> 
>  That’s a good idea.
> 
>  I would prefer a nightly tag that we overwrite every night.
> 
>  Thoughts ?
> 
>  Regards
>  JB
> 
>  Le ven. 18 avr. 2025 à 11:25, Alex Dutra
>  >>>
>  a écrit :
> 
>  Hi JB,
> 
>  Thanks for driving this!
> 
>  Would it make sense to also publish nightly docker images, with a tag
> >>> like
>  "unstable" or a timestamp?
> 
>  Thanks,
> 
>  Alex
> 
>  On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
>  russell.spit...@gmail.com>
>  wrote:
> 
>  Great news!
> 
>  On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> >>>
>  wrote:
> 
>  Hi folks,
> 
>  I received several requests to get Polaris SNAPSHOT artifacts
>  published (to build external tools, or from polaris-tools repo).
> 
>  I did a first SNAPSHOT deployment:
> 
> 
> 
> 
> >>>
> >>
> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
> 
> 
>  I also created a PR to have a GH Action for nightly build, publishing
>  SNAPSHOT every day:
> 
> 
> >>>
> >>
> https://github.com/apache/polaris/pull/1383
> 
>  Thanks !
>  Regards
>  JB
>
>
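Dmitri's suggestion above — recording the git commit hash in the jar Manifest so bug reports against the overwritten nightly tag can be traced to a build — can be sketched with `java.util.jar`. The attribute name is an assumption, not an existing Polaris convention:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class BuildInfoManifestSketch {

    // Hypothetical custom main attribute carrying the build's git commit.
    static final String GIT_COMMIT_ATTR = "Polaris-Git-Commit";

    static byte[] writeManifest(String gitCommit) throws IOException {
        Manifest mf = new Manifest();
        Attributes main = mf.getMainAttributes();
        // Manifest-Version must be set before Manifest.write() is called.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue(GIT_COMMIT_ATTR, gitCommit);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        mf.write(out);
        return out.toByteArray();
    }

    static String readGitCommit(byte[] manifestBytes) throws IOException {
        Manifest mf = new Manifest(new ByteArrayInputStream(manifestBytes));
        return mf.getMainAttributes().getValue(GIT_COMMIT_ATTR);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = writeManifest("0123abcd");
        System.out.println(readGitCommit(bytes));
    }
}
```

In the build itself, the Gradle jar task would inject the real hash at package time; at runtime the value can also be echoed into a startup log line, as suggested in the thread.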


Re: [DISCUSS] Rollback REPLACE commit on conflicts

2025-04-18 Thread Dmitri Bourlatchkov
Thanks for the pointer, Prashant!

On Fri, Apr 18, 2025 at 12:11 PM Prashant Singh
 wrote:

> Hey Dmitri,
>
> Yes we just remove the snapshot of data operations of type *REPLACE,* which
> means no data was added or removed in this snapshot. (iceberg [code
> <
> https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/DataOperations.java#L43
> >
> ])
> So we guaranteed that we never touch the snapshot which added / removed /
> updated some rows. So the correctness remains intact and would never result
> in data loss.
> The PR is also ready for review :
> https://github.com/apache/polaris/pull/1285
> It has tests as well demonstrating, with detailed comments on how it is
> gonna work !
>
> Best,
> Prashant Singh
>
>
> On Fri, Apr 18, 2025 at 8:56 AM Dmitri Bourlatchkov 
> wrote:
>
> > Hi Prashant,
> >
> > Sorry for the delayed reply and apologies if I missed some relevant
> > discussion.
> >
> > As I understand the catalog could remove snapshots that come in-between
> > previous and current snapshots from the perspective of one of the
> clients.
> >
> > Can we be sure that the removed snapshot does not have material data
> > changes (e.g. new roes or updated rows) that should have been taken into
> > account by the client whose snapshot is forced to become "current". Could
> > this result in data loss?
> >
> > Thanks,
> > Dmitri.
> >
> > On 2025/03/31 22:44:03 Prashant Singh wrote:
> > > Hey folks,
> > >
> > > I wanted to propose this feature to Apache Polaris Rolling back
> > > replacements operation snapshots in the case during the concurrent
> write
> > > (compaction and other writers trying to commit to the table at the same
> > > time) to Iceberg there are conflicts. This is a feature which Ryan
> > proposed
> > > as an alternative when I was proposing a Priority Amongst Writer
> proposal
> > > [1]  in the Apache Iceberg community. This kind of makes the compaction
> > > always a low priority process.
> > >
> > > Earlier, I went ahead and added this feature as a client side change in
> > the
> > > Apache Iceberg repo [2]. It got some traction but this didn't get to
> > the
> > > end. Now when we think more about it again Apache Polaris seems to be
> the
> > > best place to do it as it can benefit other language writer clients as
> > well
> > > and Polaris is the one to actually apply the commits based on the
> > > requirements and update sent by Iceberg Rest Client.
> > >
> > > Here is my draft PR [3] on how I think this can be achieved, given this
> > is
> > > enabled by a table property, happy to discuss other knobs for ex: maybe
> > > check the snapshot prop ?
> > >
> > > The logic essentially if we see is the base (B) on which the snapshot
> we
> > > want to include/commit is based on is changed to something like (B`)
> and
> > > the given snapshot from B` to B are all of ops type *REPLACE *. It adds
> > > other updates within the same update Table req
> > > 1. moved the snapshot ref to B
> > > 2. [Optional] to remove the snapshot between B` to B given its all of
> > > *REPLACE*.
> > > Then try the requirements and updates again on the updated base and see
> > if
> > > it succeeds. To make all this as part of one updateReq and then commit
> to
> > > the table.
> > > Doing it this way preserves the schema changes for which no new
> snapshot
> > > has been created, just a new metadata.json is created.
> > >
> > > Happy to know your thoughts on the same.
> > >
> > > Links:
> > > [1]
> > >
> >
> https://docs.google.com/document/d/1pSqxf5A59J062j9VFF5rcCpbW9vdTbBKTmjps80D-B0/edit?tab=t.0#heading=h.fn6jmpw6phpn
> > > [2] https://github.com/apache/iceberg/pull/5888
> > > [3] https://github.com/apache/polaris/pull/1285
> > >
> > > Best,
> > > Prashant Singh
> > >
> >
>
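The safety condition discussed in this thread — only discard intermediate snapshots whose operation is REPLACE, since those (e.g. compaction) neither add nor remove rows — can be expressed as a standalone check. This is a sketch; the types and method names are illustrative, not Polaris or Iceberg APIs:

```java
import java.util.List;

public class ReplaceRollbackCheck {

    // Minimal stand-in for an Iceberg snapshot: just its id and operation.
    record Snapshot(long id, String operation) {}

    /**
     * Returns true when every snapshot after the client's base B (up to the
     * current head B`) is a REPLACE operation, meaning no rows were added,
     * deleted or updated in between and the head can be rolled back to B
     * without data loss.
     */
    static boolean canRollBack(List<Snapshot> history, long clientBaseId) {
        boolean pastBase = false;
        for (Snapshot s : history) { // history ordered oldest -> newest
            if (pastBase && !"replace".equals(s.operation())) {
                return false; // a data-changing snapshot would be lost
            }
            if (s.id() == clientBaseId) {
                pastBase = true;
            }
        }
        return pastBase; // the base must exist in the history at all
    }

    public static void main(String[] args) {
        List<Snapshot> safe = List.of(
            new Snapshot(1, "append"),
            new Snapshot(2, "replace"),  // e.g. compaction
            new Snapshot(3, "replace"));
        List<Snapshot> unsafe = List.of(
            new Snapshot(1, "append"),
            new Snapshot(2, "replace"),
            new Snapshot(3, "append")); // real data landed after the base

        System.out.println(canRollBack(safe, 1));   // true
        System.out.println(canRollBack(unsafe, 1)); // false
    }
}
```

Only when this check passes would the catalog move the ref back to the base and retry the client's requirements and updates, as described in the PR.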


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Jean-Baptiste Onofré
Hi Alex,

Yes, let me move forward on Maven SNAPSHOTs, and do another one for Docker.

Thanks !

Regards
JB

On Fri, Apr 18, 2025 at 2:26 PM Alex Dutra  wrote:
>
> Hi all,
>
> I think a tag like "nightly" or "unstable", that gets overwritten every night 
> in Docker Hub, is probably a safer choice than publishing timestamped images 
> that could accumulate over time. Would you support that?
>
> JB: how about we merge your PR for pushing snapshots first, then you (or 
> someone else) tackles the nightly docker images task? Because I think we need 
> to have more community feedback before we go ahead and do it.
>
> Alex
>
> On Fri, Apr 18, 2025 at 1:29 PM Jean-Baptiste Onofré  
> wrote:
>>
>> Apache Nexus doesn’t allow docker image (only maven artifacts). So the
>> docker images (release and nightly) will be published on docker hub.
>>
>> That’s why I prefer to have apache/polaris-server:nightly tag, overwritten
>> every day (similar to Maven SNAPSHOT) (to avoid to have a bunch of docker
>> images with timestamp tag).
>>
>> Regards
>> JB
>>
>> Le ven. 18 avr. 2025 à 12:30, Adnan Hemani  a
>> écrit :
>>
>> > I’m indifferent to either way - I could see potentially someone reporting
>> > a code regression using different nightly snapshot builds that persist with
>> > a nightly timestamp. Keeping every night’s build could be helpful in that
>> > way.
>> >
>> > But I’m also not sure if making and persisting a nightly “unstable” image
>> > makes sense from a file consumption standpoint.
>> >
>> > One thing I wanted to double-check: these nightly images are being
>> > currently pushed only to Apache Nexus? No plans to push to Docker Hub for
>> > the nightly tags? Are we still thinking to push release images to Docker
>> > Hub? Should we also be pushing release images to Apache Nexus?
>> >
>> > Best,
>> > Adnan Hemani
>> >
>> > On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré  
>> > wrote:
>> >
>> > I can update my PR with docker image pub ;)
>> >
>> > Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré  a
>> > écrit :
>> >
>> > That’s a good idea.
>> >
>> > I would prefer a nightly tag that we overwrite every night.
>> >
>> > Thoughts ?
>> >
>> > Regards
>> > JB
>> >
>> > Le ven. 18 avr. 2025 à 11:25, Alex Dutra 
>> > a écrit :
>> >
>> > Hi JB,
>> >
>> > Thanks for driving this!
>> >
>> > Would it make sense to also publish nightly docker images, with a tag like
>> > "unstable" or a timestamp?
>> >
>> > Thanks,
>> >
>> > Alex
>> >
>> > On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
>> > russell.spit...@gmail.com>
>> > wrote:
>> >
>> > Great news!
>> >
>> > On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré 
>> > wrote:
>> >
>> > Hi folks,
>> >
>> > I received several requests to get Polaris SNAPSHOT artifacts
>> > published (to build external tools, or from polaris-tools repo).
>> >
>> > I did a first SNAPSHOT deployment:
>> >
>> >
>> >
>> > https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
>> >
>> >
>> > I also created a PR to have a GH Action for nightly build, publishing
>> > SNAPSHOT every day:
>> >
>> > https://github.com/apache/polaris/pull/1383
>> >
>> > Thanks !
>> > Regards
>> > JB
>> >
>> >
>> >


Re: [PROPOSAL] Include eclipselink module by default in Polaris distribution and docker image

2025-04-18 Thread Jean-Baptiste Onofré
Yes, that's the plan (cleanup the conditional build steps).

If no objections, I will go ahead on a PR.

Let's see what the others are thinking.

Regards
JB

On Fri, Apr 18, 2025 at 11:08 PM Dmitri Bourlatchkov  wrote:
>
> I agree that including the EclipseLink-based Persistence into binary
> distributions is a good idea.
>
> I'd even take this one step further and remove all conditional build steps
> for EclipseLink and include it in regular builds too.
>
> Cheers,
> Dmitri.
>
> On Fri, Apr 18, 2025 at 2:12 AM Jean-Baptiste Onofré 
> wrote:
>
> > Hi folks
> >
> > As discussed yesterday during the community meeting, I propose we include
> > eclipselink module by default in Polaris server distribution and docker
> > image.
> >
> > I think it would help our users to easily start with Polaris.
> > I would like to include this for 0.10.0-beta-incubating release.
> >
> > Thoughts ?
> >
> > Regards
> > JB
> >


Next steps for Polaris benchmarks

2025-04-18 Thread Pierre Laporte
Hello folks

During the community sync, there was an item for benchmarks next steps, but we
could not get to it.  I don't think we need to wait for the next community
sync to start the discussion, so here we go.

I have a couple of tasks on my todo list for Polaris benchmarks.  And I
would like to share those ideas and gather new ones, in case there is
appetite for more benchmarks.  Here is a short description for the tasks
that I am working on.

1 - Remove sequential benchmarks and renew credentials (#6)
Sequential benchmarks (i.e. with only one request at a time) were initially
created because the current Eclipselink runs into issues under concurrent
load.  But now that the benchmarks throughput and concurrency can be
configured, those are not necessary anymore.  Additionally, this PR
contains an improvement for authentication to support long benchmarks (>1h,
the auth token validity).

2 - Remove the bound on the maximum number of updates
The current update-related benchmarks require the user to specify the
maximum number of update operations that should be generated.  The code
will change to generate an infinite stream of update operations.  Coupled
with the ability to control the throughput and the duration of the
simulation, this will simplify the user experience.

3 - Add a simulation that continuously creates table and view commits
This benchmark will continuously send table properties updates.  It will be
a way to quickly create lots of snapshots, which can then be used for
capacity planning, stressing the events subsystem or some metadata
management facility.

4 - ... ?
What else do you think should be added to the Polaris benchmarks?  If you
have ideas for scenarios that could benefit the project, please let me know.

Cheers

--

Pierre


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Adnan Hemani
Thanks for the clarification, JB! I agree with everything mentioned on the 
thread then - I’m happy to keep working on the Docker image pushes, since I 
recently made a PR for publishing Docker images on new releases. I also agree 
that a “nightly”/“snapshot”/“unstable” tag that gets overwritten every night 
sounds good.

Best,
Adnan Hemani

> On Apr 18, 2025, at 6:40 AM, Dmitri Bourlatchkov  wrote:
> 
> Doing Maven and Docker in two phases sounds good to me.
> 
> Also, from my POV using one (reassigned) "nightly" or "unstable" tag is
> reasonable.
> 
> Regarding tracking specific build information related to bug reports, I
> suppose we could add a git commit hash to a log message and/or jar Manifest
> attributes (if we do not have that yet).
> 
> Cheers,
> Dmitri.
> 
> On Fri, Apr 18, 2025 at 8:27 AM Alex Dutra  >
> wrote:
> 
>> Hi all,
>> 
>> I think a tag like "nightly" or "unstable", that gets overwritten every
>> night in Docker Hub, is probably a safer choice than publishing timestamped
>> images that could accumulate over time. Would you support that?
>> 
>> JB: how about we merge your PR for pushing snapshots first, and then you (or
>> someone else) tackles the nightly docker images task? I think we need more
>> community feedback before we go ahead with it.
>> 
>> Alex
>> 
>> On Fri, Apr 18, 2025 at 1:29 PM Jean-Baptiste Onofré 
>> wrote:
>> 
>>> Apache Nexus doesn’t allow docker image (only maven artifacts). So the
>>> docker images (release and nightly) will be published on docker hub.
>>> 
>>> That’s why I prefer to have apache/polaris-server:nightly tag,
>> overwritten
>>> every day (similar to Maven SNAPSHOT) (to avoid to have a bunch of docker
>>> images with timestamp tag).
>>> 
>>> Regards
>>> JB
>>> 
>>> Le ven. 18 avr. 2025 à 12:30, Adnan Hemani 
>> a
>>> écrit :
>>> 
 I’m indifferent to either way - I could see potentially someone
>> reporting
 a code regression using different nightly snapshot builds that persist
>>> with
 a nightly timestamp. Keeping every night’s build could be helpful in
>> that
 way.
 
 But I’m also not sure if making and persisting a nightly “unstable”
>> image
 makes sense from a file consumption standpoint.
 
 One thing I wanted to double-check: these nightly images are being
 currently pushed only to Apache Nexus? No plans to push to Docker Hub
>> for
 the nightly tags? Are we still thinking to push release images to
>> Docker
 Hub? Should we also be pushing release images to Apache Nexus?
 
 Best,
 Adnan Hemani
 
 On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré 
>>> wrote:
 
 I can update my PR with docker image pub ;)
 
 Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré  a
 écrit :
 
 That’s a good idea.
 
 I would prefer a nightly tag that we overwrite every night.
 
 Thoughts ?
 
 Regards
 JB
 
 Le ven. 18 avr. 2025 à 11:25, Alex Dutra >> 
 a écrit :
 
 Hi JB,
 
 Thanks for driving this!
 
 Would it make sense to also publish nightly docker images, with a tag
>>> like
 "unstable" or a timestamp?
 
 Thanks,
 
 Alex
 
 On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
 russell.spit...@gmail.com>
 wrote:
 
 Great news!
 
 On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré >> 
 wrote:
 
 Hi folks,
 
 I received several requests to get Polaris SNAPSHOT artifacts
 published (to build external tools, or from polaris-tools repo).
 
 I did a first SNAPSHOT deployment:
 
 
 
 
>>> 
>> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
 
 
 I also created a PR to have a GH Action for nightly build, publishing
 SNAPSHOT every day:
 
 
>>> 
>> https://github.com/apache/polaris/pull/1383
 
 Thanks !
 Regards
 JB



Re: [DISCUSS] Rollback REPLACE commit on conflicts

2025-04-18 Thread Prashant Singh
Hey Dmitri,

Yes, we only remove snapshots whose data operation is of type *REPLACE*,
which means no data was added or removed in those snapshots (see the Iceberg
code for the operation types).
So we guarantee that we never touch a snapshot that added, removed, or
updated rows. Correctness remains intact, and this can never result in data
loss.
The PR is also ready for review:
https://github.com/apache/polaris/pull/1285
It has tests demonstrating this as well, with detailed comments on how it
works!
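As a self-contained illustration (not the Polaris implementation; the
Snapshot model and helper below are hypothetical), the safety check amounts
to walking the snapshot ancestry from the table's current snapshot back to
the client's base and verifying that every commit in between is a REPLACE:

```python
from dataclasses import dataclass
from typing import Optional

REPLACE = "replace"  # mirrors Iceberg's DataOperations.REPLACE constant

@dataclass
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]
    operation: str  # e.g. "append", "replace", "delete", "overwrite"

def can_rollback(snapshots, current_id, client_base_id):
    """True if every snapshot between the client's base (B) and the table's
    current snapshot (B') is a REPLACE, i.e. no rows were added, removed, or
    updated, so the ref can safely be moved back to B."""
    by_id = {s.snapshot_id: s for s in snapshots}
    cursor = by_id.get(current_id)
    while cursor is not None and cursor.snapshot_id != client_base_id:
        if cursor.operation != REPLACE:
            return False  # rolling back would drop a data-changing commit
        cursor = by_id.get(cursor.parent_id)
    # Only safe if the walk actually reached the client's base snapshot.
    return cursor is not None
```

If any intermediate snapshot is an append/delete/overwrite, the check fails
and the commit conflict is surfaced to the client as usual.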

Best,
Prashant Singh


On Fri, Apr 18, 2025 at 8:56 AM Dmitri Bourlatchkov 
wrote:

> Hi Prashant,
>
> Sorry for the delayed reply and apologies if I missed some relevant
> discussion.
>
> As I understand the catalog could remove snapshots that come in-between
> previous and current snapshots from the perspective of one of the clients.
>
> Can we be sure that the removed snapshot does not have material data
> changes (e.g. new rows or updated rows) that should have been taken into
> account by the client whose snapshot is forced to become "current"? Could
> this result in data loss?
>
> Thanks,
> Dmitri.
>
> On 2025/03/31 22:44:03 Prashant Singh wrote:
> > Hey folks,
> >
> > I wanted to propose a feature for Apache Polaris: rolling back
> > REPLACE-operation snapshots when concurrent writes to an Iceberg table
> > (compaction and other writers trying to commit to the table at the same
> > time) run into conflicts. This is a feature Ryan proposed as an
> > alternative when I brought the Priority Amongst Writers proposal [1] to
> > the Apache Iceberg community. It effectively makes compaction always a
> > low-priority process.
> >
> > Earlier, I went ahead and added this feature as a client-side change in
> > the Apache Iceberg repo [2]. It got some traction but didn't make it to
> > the end. Thinking about it again, Apache Polaris seems to be the best
> > place to do it, as it can benefit writer clients in other languages as
> > well, and Polaris is the one that actually applies the commits based on
> > the requirements and updates sent by the Iceberg REST client.
> >
> > Here is my draft PR [3] showing how I think this can be achieved. It is
> > enabled by a table property; happy to discuss other knobs, for example
> > maybe checking a snapshot property?
> >
> > The logic, essentially: if the base (B) on which the snapshot we want to
> > include/commit was built has changed to something like (B`), and the
> > snapshots from B` back to B are all of operation type *REPLACE*, then we
> > add other updates within the same updateTable request:
> > 1. move the snapshot ref back to B
> > 2. [optional] remove the snapshots between B` and B, given they are all
> > *REPLACE*.
> > Then we retry the requirements and updates on the updated base and see if
> > they succeed, making all of this part of one update request that is then
> > committed to the table.
> > Doing it this way preserves schema changes for which no new snapshot has
> > been created; just a new metadata.json is created.
> >
> > Happy to know your thoughts on the same.
> >
> > Links:
> > [1]
> >
> https://docs.google.com/document/d/1pSqxf5A59J062j9VFF5rcCpbW9vdTbBKTmjps80D-B0/edit?tab=t.0#heading=h.fn6jmpw6phpn
> > [2] https://github.com/apache/iceberg/pull/5888
> > [3] https://github.com/apache/polaris/pull/1285
> >
> > Best,
> > Prashant Singh
> >
>


Re: Polaris SNAPSHOT available and nightly build

2025-04-18 Thread Dmitri Bourlatchkov
Doing Maven and Docker in two phases sounds good to me.

Also, from my POV using one (reassigned) "nightly" or "unstable" tag is
reasonable.

Regarding tracking specific build information related to bug reports, I
suppose we could add a git commit hash to a log message and/or jar Manifest
attributes (if we do not have that yet).
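As a sketch of how a bug reporter could then identify a nightly build, here
is a small self-contained Python helper that reads such an attribute from a
jar's manifest (the "Implementation-Build" attribute name is an assumption
for illustration, not an existing Polaris convention):

```python
import zipfile

def read_build_commit(jar_path: str, attribute: str = "Implementation-Build"):
    """Return the value of a build-commit attribute from a jar's
    META-INF/MANIFEST.MF, or None if the manifest or attribute is missing."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # jar has no manifest entry
    for line in manifest.splitlines():
        if line.startswith(attribute + ":"):
            return line.split(":", 1)[1].strip()
    return None
```

With the git commit hash stamped into the manifest at build time, a report
against the overwritten "nightly" tag can still pinpoint the exact commit.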

Cheers,
Dmitri.

On Fri, Apr 18, 2025 at 8:27 AM Alex Dutra 
wrote:

> Hi all,
>
> I think a tag like "nightly" or "unstable", that gets overwritten every
> night in Docker Hub, is probably a safer choice than publishing timestamped
> images that could accumulate over time. Would you support that?
>
> JB: how about we merge your PR for pushing snapshots first, and then you (or
> someone else) tackles the nightly docker images task? I think we need more
> community feedback before we go ahead with it.
>
> Alex
>
> On Fri, Apr 18, 2025 at 1:29 PM Jean-Baptiste Onofré 
> wrote:
>
> > Apache Nexus doesn’t allow docker image (only maven artifacts). So the
> > docker images (release and nightly) will be published on docker hub.
> >
> > That’s why I prefer to have apache/polaris-server:nightly tag,
> overwritten
> > every day (similar to Maven SNAPSHOT) (to avoid to have a bunch of docker
> > images with timestamp tag).
> >
> > Regards
> > JB
> >
> > Le ven. 18 avr. 2025 à 12:30, Adnan Hemani 
> a
> > écrit :
> >
> > > I’m indifferent to either way - I could see potentially someone
> reporting
> > > a code regression using different nightly snapshot builds that persist
> > with
> > > a nightly timestamp. Keeping every night’s build could be helpful in
> that
> > > way.
> > >
> > > But I’m also not sure if making and persisting a nightly “unstable”
> image
> > > makes sense from a file consumption standpoint.
> > >
> > > One thing I wanted to double-check: these nightly images are being
> > > currently pushed only to Apache Nexus? No plans to push to Docker Hub
> for
> > > the nightly tags? Are we still thinking to push release images to
> Docker
> > > Hub? Should we also be pushing release images to Apache Nexus?
> > >
> > > Best,
> > > Adnan Hemani
> > >
> > > On Apr 18, 2025, at 3:14 AM, Jean-Baptiste Onofré 
> > wrote:
> > >
> > > I can update my PR with docker image pub ;)
> > >
> > > Le ven. 18 avr. 2025 à 12:12, Jean-Baptiste Onofré  a
> > > écrit :
> > >
> > > That’s a good idea.
> > >
> > > I would prefer a nightly tag that we overwrite every night.
> > >
> > > Thoughts ?
> > >
> > > Regards
> > > JB
> > >
> > > Le ven. 18 avr. 2025 à 11:25, Alex Dutra  >
> > > a écrit :
> > >
> > > Hi JB,
> > >
> > > Thanks for driving this!
> > >
> > > Would it make sense to also publish nightly docker images, with a tag
> > like
> > > "unstable" or a timestamp?
> > >
> > > Thanks,
> > >
> > > Alex
> > >
> > > On Wed, Apr 16, 2025 at 5:13 PM Russell Spitzer <
> > > russell.spit...@gmail.com>
> > > wrote:
> > >
> > > Great news!
> > >
> > > On Wed, Apr 16, 2025 at 10:04 AM Jean-Baptiste Onofré  >
> > > wrote:
> > >
> > > Hi folks,
> > >
> > > I received several requests to get Polaris SNAPSHOT artifacts
> > > published (to build external tools, or from polaris-tools repo).
> > >
> > > I did a first SNAPSHOT deployment:
> > >
> > >
> > >
> > >
> >
> https://repository.apache.org/content/groups/snapshots/org/apache/polaris/
> > >
> > >
> > > I also created a PR to have a GH Action for nightly build, publishing
> > > SNAPSHOT every day:
> > >
> > >
> >
> https://github.com/apache/polaris/pull/1383
> > >
> > > Thanks !
> > > Regards
> > > JB
> > >
> > >
> > >
> >
>