Hi all,

On 03/10/2022 at 17:03, Will Jones wrote:
Hi Rusty,

Note we discussed Iceberg a while ago [1]. I don't think we've discussed
Hudi in any depth.

As I see it, we are waiting on three things:

1. Someone willing to move forward the Iceberg / Hudi integration.
2. The Iceberg and Hudi projects need native libraries that we can use. The
base implementations are all Java, which isn't practical to integrate with
our C++ implementation (and the Python/R/Ruby bindings). But I think these
formats are complex enough that it's best to develop the core
implementation within the respective community, rather than within the
Arrow repo. There was a discussion about starting a C++/Rust implementation
for Iceberg [2], but I haven't seen any progress so far. I haven't been
watching Hudi.
3. We need a model for extending Arrow C++ datasets in separate packages,
or else we contribute to the package size problem you mentioned in your
other thread [3].

There may be other potential ways forward, such as integrating Iceberg/Hudi using a Flight or ADBC endpoint.

Regards

Antoine.
