Hi folks,

Thanks to everyone who joined the community meeting about the Table
Source proposal.

Here are the main takeaways:
1. I think we have consensus on the use case of reading existing
Parquet files to easily create Iceberg metadata and thereby leverage
Polaris features (especially governance).
2. We don't yet have a clear consensus on "existing Iceberg tables"
(which can be addressed with Catalog Federation), and unstructured
data (PDF files, videos, images, ...) needs more discussion.
3. In order to move forward, I propose to focus on the "existing
Parquet files" use case.
4. Then, I'm proposing the following action plan:
4.1. I propose to split the Table Source proposal document, with a
focus on the "Parquet file" use case.
4.2. We discussed leveraging Generic Table and the server-side scan
API for that. I propose to work with Yun, and I will start a PoC to
verify that it's a viable option and to identify any changes that may
be required to Generic Table.
4.3. Depending on the outcome of 4.2, I will update the proposal
document for "existing Parquet files" and open a PR with the changes.

Thoughts?

Here's the recording:
https://drive.google.com/file/d/1x4XjZCop7WaA8L0m81UrepE2nvRsfUU1/view?usp=sharing

I will also submit a PR to update the website, and I will update the
corresponding GitHub Issue and proposal document.

Thanks again!

Regards
JB
