WRT to the term "generic", I suspect that it can evolve to a more generic catalog API, that could also expose functionality for Iceberg tables + views and isn't limited to "generic tables" (too many different meanings of "generic" there...). So not a fan of using the term "generic" in a REST API (base) path.

Although I'd prefer to keep the existing base paths, even if /api/catalog isn't "prominently telling" that it's actually Iceberg REST, the proposals
* /api/catalog/iceberg and
* /api/catalog/polaris
are probably clearer and easier for users to understand.

So +1 on migrating Iceberg REST from /api/catalog to /api/catalog/iceberg and adding /api/catalog/polaris. The then "legacy" endpoint /api/catalog/v1 however should then be deprecated for removal (Polaris 1.0?).

For that API we need a (new) set of types. I'd really prefer to not reuse types from a spec that the Polaris project doesn't own.

A (controversial) thing is how to represent fully-qualified identifiers (think: table-identifer and namespace-elements) as path-parameters. Iceberg uses 0x1F as the element separator, but that is a control character, which can be problematic. I'd propose to use a different escaping that is in 99.9% of all cases both easy/intuitive to read and write and prevents control characters and ambiguous paths ([1]). Reference: [2]

Robert

[1] Jakarta Servlet Spec 6.0 - https://jakarta.ee/specifications/servlet/6.0/jakarta-servlet-spec-6.0#uri-path-canonicalization [2] https://github.com/projectnessie/nessie/blob/fbbbf8ae7ac0d961997c685afd32d5dceed90efc/api/model/src/main/java/org/projectnessie/model/Elements.java#L80-L121

On 25.02.25 00:48, yun zou wrote:
"If Polaris were to add new endpoints under /catalog/v1 and then Iceberg
releases a v2 REST Catalog API what happens to the Polaris-specific
endpoints?"

-- I think that is a great point, I am actually with Dmitri now -- a new
prefix could be beneficial for Polaris API to evolve independently in the
future.

For the new prefix name part, I think we will definitely need to continue
supporting the /catalog/v1 for the Iceberg V1 APIs to avoid breaking any
client that already works with Polaris today. Maybe when Iceberg evolves to
v2, we can start moving then to  /catalog/iceberg/.

For our new set of APIs includes generic tables and policy, maybe even
volume move on, I actually prefer /catalog/polaris, like Dmitri mentioned,
it indicates clearly that set of API evolves by Polaris, and we can embed
all new object specific APIs under this prefix.

Best Regards,
Yun

On Mon, Feb 24, 2025 at 2:50 PM Dmitri Bourlatchkov <di...@apache.org>
wrote:

The current /catalog/v1 URI prefix encapsulates the Iceberg REST Catalog
API. This API is not owned by Polaris.

If Polaris were to add new endpoints under /catalog/v1 and then Iceberg
releases a v2 REST Catalog API what happens to the Polaris-specific
endpoints? What if Iceberg adds an endpoint that matches the Polaris
endpoint URI (unlikely but possible)?

 From my POV this is the primary reason for putting the new API under a
different prefix.

A secondary concern is that listing methods in the Iceberg REST API will
certainly not include entities created via the proposed new API, so mixing
new create endpoints with old lising endpoints does not make sense to me
from the REST API design perspective.

As for a practical proposal, how about the following?

* /catalog/v1 - remains supported for Iceberg REST API v1, but deprecated
and will not be used for v2.
* /catalog/iceberg/ - encapsulates the Iceberg REST API v1 (and future API
versions).
* /catalog/generic/v1 - the API proposed in the referenced document.
* /management - current Polaris Management API.

Now,  "generic" is, of course, a naming concern. Other opinions are welcome
regarding how to call the new API.

Options that look reasonable to me (in order of preference):

* /catalog/polaris/v1 - this implies that Polaris evolves this API as the
primary API for all things catalogued by Polaris (not including roles,
grants, etc., which fall under /management)
* /catalog/api/v1 - same as above, but different path segment
* /catalog/storage-object/v1
* /catalog/generic/v1 - Works fine for the current proposal, but it will
look awkward if we have to add typed / specialized endpoints later (so
those API extensions may need yet another prefix).
* /catalog/generic-table/v1 - same as above
* /catalog/storage-entity/v1

Cheers,
Dmitri.

On Mon, Feb 24, 2025 at 3:10 PM yun zou <yunzou.colost...@gmail.com>
wrote:

Hi All,

Thanks all for attending the review today and providing your valuable
feedback, I really appreciate it ! And apologize for the late notice for
the review meeting here.

I think we had a good discussion, and agreed to
1) Move on to provide a new Spark Catalog Plugin and new set of APIs with
restricted scope to table managements, but keep the doors open for future
extensions includes new fields or even new APIs.
2) Update the field create_at_ms to catalog_register_at to be more clear
that the timestamp is for the registration time with Polaris.

We will start working on figuring out more implementation details for the
plugins as well as refactoring needed for Apache Polaris. If it turns out
too complicated, we can have a separate review or discussion about the
implementation part also.

As a follow up of the meeting, we would like to open this thread for
discussion for the URI prefix for the new endpoint. As Dmitri brought up
on
the design doc, since this set of APIs are new and not restricted by
Iceberg anymore, shall we introduce a new prefix like
/polaris/catalog/v1/
instead of reuse the current /catalog/v1?

I think my concern about this proposal is that it seems to diverge how
the
APIs path are formalized today, the APIs in Polaris today use a path that
is prefixed by functionality, like /catalog/ or /management/. If we start
using /polaris/catalog/, shall we change the /management also to
/polaris/management ? Shall we move the new Policy Engine related APIs to
/polaris/catalog also?

Another choice we have is to use /polaris-catalog/v1 instead /catalog/v1
as
a prefix, which aligns with the current API path better. Then we have the
same question, what do we do with the Policy Engine APIs? The Policies
can
eventually be applied to both table entities created through the /catalog
APIs and /polaris/catalog APIs.

Overall, I would prefer to stay consistent with the current APIs and
continue to use /catalog/v1, since regardless of Iceberg or Generic
tables,
they are just entities managed by Polaris, and the APIs don't have
'iceberg' in the path also. I would like to hear what people think about
this?

Link to the doc:


https://docs.google.com/document/d/1_R9jBIwoH3CV9G7gSoRJPQVEcOsROp4IEiGoeYeQE8A/edit?pli=1&tab=t.0
Best Regards,
Yun

--
Robert Stupp
@snazy

Reply via email to