> If the same endpoint works for both the engine and the Polaris Server, it
> is only necessary to set one "endpoint" parameter.

That's right, we only need one endpoint in that case. However, if no
single endpoint works for both the engines and the Polaris server, the
extra endpoint makes sense.
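
For illustration only, the fallback discussed here could be sketched as
below. This is not the code from the PR: `StorageConfig`,
`server_endpoint`, and `client_endpoint` are hypothetical names, and the
rule itself (the server uses the internal endpoint when one is set,
otherwise the shared one, while engines always get the regular endpoint)
is an assumption drawn from this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical stand-in for the storage config discussed in this thread."""
    endpoint: Optional[str] = None           # address the engines use
    endpoint_internal: Optional[str] = None  # address the server uses, if different


def server_endpoint(cfg: StorageConfig) -> Optional[str]:
    """Endpoint the catalog server itself would use to reach storage."""
    # Assumed rule: fall back to the shared endpoint when no internal one is set.
    if cfg.endpoint_internal is not None:
        return cfg.endpoint_internal
    return cfg.endpoint


def client_endpoint(cfg: StorageConfig) -> Optional[str]:
    """Endpoint handed to engines, e.g. as "s3.endpoint" in a loadTable response."""
    return cfg.endpoint


# Simple deployment: one endpoint serves both sides.
shared = StorageConfig(endpoint="https://s3.example.com")

# Split deployment: the server reaches storage through an internal address.
split = StorageConfig(
    endpoint="https://s3.example.com",
    endpoint_internal="http://minio.internal:9000",
)
```

With `shared`, both functions return the same address, so a single
setting is enough. With `split`, only the server-side lookup changes.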

Yufei


On Sat, Aug 2, 2025 at 3:43 AM Alexandre Dutra <adu...@apache.org> wrote:

> Hi all,
>
> I agree with Dmitri: having this feature in Polaris will be very
> helpful. We know of many users that deploy engines and catalog in
> different networks, and thus must access the storage layer through
> different addresses.
>
> This feature is easy to implement, enables new use cases, and thus
> increases Polaris adoption – all without hurting any existing users or
> workflows. I don't see a valid reason not to have it in Polaris.
>
> Thanks,
> Alex
>
> On Sat, Aug 2, 2025 at 7:51 AM Eric Maynard <eric.w.mayn...@gmail.com>
> wrote:
> >
> > > It is only relevant for on-prem S3-compatible storage. I imagine,
> > > "endpointInternal" will never be needed for AWS storage.
> >
> > That's not true, is it? If we are imagining scenarios where the client and
> > the server are on totally different networks, the regular endpoint could
> > indeed need to be addressed differently on the two networks.
> >
> > On Sat, Aug 2, 2025 at 8:17 AM Dmitri Bourlatchkov <di...@apache.org>
> > wrote:
> >
> > > Hi Yufei,
> > >
> > > > I think Polaris server will only need the internal endpoint in that case,
> > > > while engines could use the public endpoint. Do we need to configure both
> > > > for the Polaris server
> > >
> > > Polaris puts "s3.endpoint" into loadTable responses when credential vending
> > > is enabled.
> > > So, yes, both settings are needed, but only in complex deployment cases.
> > >
> > > If the same endpoint works for both the engine and the Polaris Server, it
> > > is only necessary to set one "endpoint" parameter.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > >
> > >
> > > On Fri, Aug 1, 2025 at 1:33 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >
> > > > >
> > > > > > 1: I do not really know. This is a question about a specific deployment
> > > > > > environment.
> > > > > >
> > > > > If the endpoint used by engines could also be used by the Polaris server,
> > > > > we should just use it, instead of configuring another one.
> > > > >
> > > > >
> > > > > > 2: I'm not sure I understand your question. Two endpoints are necessary in
> > > > > > cases when the server's view of the network is different from the engine's
> > > > > > view.
> > > > > >
> > > > > I think Polaris server will only need the internal endpoint in that case,
> > > > > while engines could use the public endpoint. Do we need to configure both
> > > > > for the Polaris server?
> > > >
> > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > > >
> > > > >
> > > > > > On Thu, Jul 31, 2025 at 5:59 PM Yufei Gu <flyrain...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > > Thanks for the explanation. Two questions:
> > > > > > > 1. Should the public endpoint used by engines still work with Polaris even
> > > > > > > if it is co-located with the MinIO server?
> > > > > > > 2. Can we set the Polaris endpoint directly to the internal address in that
> > > > > > > case? Put another way: why do we need to keep both endpoints in the
> > > > > > > Polaris server?
> > > > > >
> > > > > > Yufei
> > > > > >
> > > > > >
> > > > > > > On Thu, Jul 31, 2025 at 11:06 AM Dmitri Bourlatchkov <di...@apache.org>
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Yufei,
> > > > > > >
> > > > > > > > The "how" in your question depends on the deployment environment, I
> > > > > > > > guess. There are a lot of variants.
> > > > > > > >
> > > > > > > > If you wonder whether such a situation is possible in practice, I
> > > > > > > > believe it is. An example would be self-hosting non-AWS S3 storage and
> > > > > > > > Polaris in a way that Polaris connections go through a certain internal
> > > > > > > > network, while connections from query engines running outside of that
> > > > > > > > deployment environment go through a different network. This is very
> > > > > > > > high-level, of course, since the deployment choices are largely driven by
> > > > > > > > specific users' needs. The proposed "endpointInternal" config entry merely
> > > > > > > > expands deployment options that users can choose from.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dmitri.
> > > > > > >
> > > > > > > > On Thu, Jul 31, 2025 at 1:07 PM Yufei Gu <flyrain...@gmail.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > > Hi Dmitri,
> > > > > > > >
> > > > > > > > > That generally makes sense to me. For awareness, could you elaborate a
> > > > > > > > > bit on how the Polaris server and query engines (like Spark, Trino, etc.)
> > > > > > > > > might access the same object storage (e.g., MinIO) via different DNS
> > > > > > > > > endpoints?
> > > > > > > >
> > > > > > > > Yufei
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Thu, Jul 31, 2025 at 4:36 AM Alexandre Dutra <adu...@apache.org>
> > > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Dmitri,
> > > > > > > > >
> > > > > > > > > > I think your suggestion makes sense. We added something similar in
> > > > > > > > > > Nessie long ago, and it is definitely useful.
> > > > > > > > >
> > > > > > > > > I left some comments in the PR.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Alex
> > > > > > > > >
> > > > > > > > > > On Thu, Jul 31, 2025 at 4:12 AM Dmitri Bourlatchkov <di...@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > > I propose to add an `endpointInternal` optional parameter to
> > > > > > > > > > > AwsStorageConfigInfo in PR [2213].
> > > > > > > > > > >
> > > > > > > > > > > The main idea is to support deployment edge cases where Polaris Servers may
> > > > > > > > > > > 'see' storage under a different DNS name than query engines. This use case
> > > > > > > > > > > applies mostly to non-AWS S3 storage (e.g. MinIO).
> > > > > > > > > > >
> > > > > > > > > > > This change is backward-compatible with existing clients and deployed
> > > > > > > > > > > catalogs.
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > > >
> > > > > > > > > > [2213] https://github.com/apache/polaris/pull/2213
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Dmitri.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
>
