Hey Xuanwo,

Thanks for raising this.

   - The S3 properties are largely covered on the S3FileIO page:
   https://iceberg.apache.org/docs/nightly/aws/#s3-fileio. Some important
   ones are indeed missing, though, so I've raised an issue here
   <https://github.com/apache/iceberg/issues/10674>.
   - PyIceberg supports only a subset of the functionality, so many
   properties are missing there as well.
   - For the REST Catalog, there is an open PR
   <https://github.com/apache/iceberg/pull/10576> to add the options for GCS
   and ADLS. It would be great to get some more eyes on it.

That being said, I do think there is value in formalizing them. When adding
configuration options to PyIceberg, I'll make sure to check the Java
implementation to ensure that we use the same property names.
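
To make the idea of a formalized list a bit more concrete, here is a rough
sketch of what a shared property table and a simple check against it could
look like. It is only an illustration, not an agreed spec: the `PropertySpec`
type, the `S3_PROPERTIES` table, and the `unknown_s3_keys` helper are
hypothetical names, and the required/optional flags are placeholders. The
keys themselves are a few of the `s3.*` properties from S3FileIOProperties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertySpec:
    """One entry in a hypothetical shared IO property spec."""

    key: str
    required: bool  # placeholder value; the real spec would decide this
    description: str


# Illustrative subset only; not a complete or authoritative list.
S3_PROPERTIES = {
    spec.key: spec
    for spec in (
        PropertySpec("s3.endpoint", False, "Alternative S3 endpoint URL"),
        PropertySpec("s3.access-key-id", False, "Static access key ID"),
        PropertySpec("s3.secret-access-key", False, "Static secret access key"),
        PropertySpec("s3.session-token", False, "Token for temporary credentials"),
    )
}


def unknown_s3_keys(properties: dict[str, str]) -> list[str]:
    """Return the s3.* keys in `properties` that the table does not define."""
    return sorted(
        key
        for key in properties
        if key.startswith("s3.") and key not in S3_PROPERTIES
    )
```

An implementation could run a check like this over user-supplied properties
to flag misspelled or implementation-specific keys, for example
`unknown_s3_keys({"s3.acess-key-id": "x", "s3.endpoint": "y"})` would report
the misspelled key only.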

Kind regards,
Fokko

On Wed, 10 Jul 2024 at 09:22, Xuanwo <xua...@apache.org> wrote:

> Hello everyone
>
> I've been working on the iceberg-rust FileIO recently and have found it
> challenging to identify all the necessary IO properties we need to support.
>
> For instance, consider AWS S3. There are no documents specifying which
> properties are supported by S3.
>
> The only relevant documentation I could find includes:
>
> - Iceberg AWS Integrations[1]: Does not define `s3.access-key-id` or
> `s3.secret-access-key`.
> - Pyiceberg configuration[2]: Missing several S3-related properties.
> - Iceberg REST Catalog[3]: Does not cover all storage services.
>
> To gather this information, we must refer to the S3FileIO Java code[4].
>
> I propose adding a separate section for agreeing upon these properties. We
> could create a specification that outlines all IO properties with
> indications of whether they are required or optional, along with their
> expected behaviors. This would help ensure consistency across different
> implementations without any conflicts.
>
>
> [1]: https://iceberg.apache.org/docs/latest/aws/
> [2]: https://py.iceberg.apache.org/configuration/#s3
> [3]:
> https://github.com/apache/iceberg/blob/eee81c59199a54e749ea58dae070eb066d9a5f9e/open-api/rest-catalog-open-api.yaml#L2737
> [4]:
> https://github.com/apache/iceberg/blob/2b21020aedb63c26295005d150c05f0a5a5f0eb2/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java#L46
>
> Xuanwo
>
> https://xuanwo.io/
>
