Re: [PROPOSAL] Polaris Table Source and Common Table API

Jean-Baptiste Onofré Thu, 04 Sep 2025 10:47:08 -0700

Sure, I will be at CoC too :). Otherwise, we can do it the week after CoC,
which may be easier for everyone.


Regards
JB

On Thu, Sep 4, 2025 at 6:37 PM Yufei Gu <[email protected]> wrote:
>
> Thanks JB for scheduling a meeting. CoC starts on the 11th. Multiple Polaris
> talks will happen that morning. Can we shift to another day?
>
> Yufei
>
>
> On Thu, Sep 4, 2025 at 9:18 AM Jean-Baptiste Onofré <[email protected]> wrote:
>>
>> Hi folks,
>>
>> First of all, thanks everyone for your highly valuable comments and
>> questions in the proposal!
>>
>> As quickly discussed during the Polaris Community meeting, I would
>> like to propose a dedicated meeting to talk about the proposal, and
>> especially the "relation" with Generic Table.
>>
>> I'm proposing Thursday, Sep 11 at 9am PST. If there are no objections, I
>> will send an invite for this date.
>>
>> Laurent and I will reply to the comments in the document in the meantime.
>>
>> Thanks !
>> Regards
>> JB
>>
>> On Thu, Aug 28, 2025 at 1:21 PM Jean-Baptiste Onofré <[email protected]> wrote:
>> >
>> > Hi folks,
>> >
>> > Laurent and I worked on a new proposal for Apache Polaris: Polaris Table 
>> > Source.
>> >
>> > The purpose is to have a mechanism to create Iceberg tables in Polaris
>> > corresponding to non-Iceberg data, allowing Polaris to be the "unique"
>> > catalog enforcing governance and gathering data sources in one
>> > catalog.
>> > A user can register a source configuration in Polaris (Polaris will
>> > have a Source Configuration registry). Then source services (not
>> > running in Polaris, they are "external" services) use the
>> > registry to create the corresponding table in Polaris.
>> > We distinguish three kinds of sources:
>> > * structured data on a location (Parquet files, JSON files, CSV files,
>> > XML files, ...): a source service will create the Iceberg tables
>> > "wrapping" this data; the created table uses the schema from the
>> > "original" file.
>> > * unstructured data on a location (image files, video files, PDF
>> > files): a source service will "wrap" the location and metadata on the
>> > files in a table with a "fixed" schema (file location, etags, last
>> > modification date, creation date, etc.)
>> > * table format: here it would be possible to "import" a table in
>> > Polaris using an existing table format. For instance, in the case of
>> > existing Iceberg tables, we can use the metadata.json as an "import"
>> > basis. We can also support other table formats: Delta directly in
>> > Polaris (in addition to using a specific Spark client as we do today)
>> > and Paimon (see this discussion:
>> > https://github.com/apache/polaris/discussions/2453).
>> >
>> > The detailed proposal document is here:
>> > * https://docs.google.com/document/d/1OBDkPbWdf0Bq6Wa_BMKaXn-fqAxfdmepo57ggkkC8mI/edit?usp=sharing
>> >
>> > Any feedback and comments are welcome !
>> >
>> > Thanks !
>> >
>> > Regards
>> > JB
