Hi all,

  I'd like to start a discussion on simplifying engine access to Iceberg
  REST and Lance REST catalogs managed by Gravitino.

  Background

  Today, when users want Spark to access Iceberg REST and Lance REST
  catalogs managed by Gravitino, they must maintain separate engine-side
  catalog configurations that duplicate what Gravitino already knows.
  When catalog settings change in Gravitino, those changes need to be
  manually propagated to every engine's configuration.

  Proposed Approach

  The proposal introduces a provider-level engine-access-mode setting on
  the engine side:

    spark.sql.gravitino.<provider>.engine-access-mode = auto | gravitino | native

  With this setting, users only configure the Gravitino server address
  and metalake. The engine connector calls listCatalogsInfo() at startup
  and auto-registers the appropriate catalogs by translating existing
  Gravitino catalog properties. No new catalog properties or server-side
  APIs are needed.

  While the first phase focuses on Spark (Iceberg REST and Lance REST),
  the same engine-access-mode semantics are designed to extend naturally
  to Flink, Trino, Doris, Daft, and other engines.

  The full design document is in the PR for review:
  https://github.com/apache/gravitino/pull/11280

  Looking forward to hearing your thoughts.


Best,
Xiaojing
