XJDKC commented on PR #14465: URL: https://github.com/apache/iceberg/pull/14465#issuecomment-3518841712
> A couple of questions wrt encrypted tables, > > 1. What if the encryption.key-id table property is set (https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/TableProperties.java#L391) but a custom TO implementation ignores it. Do the users expect a table to be encrypted if the encryption.key-id is set? Should the implementors of custom TOs validate them by running the Iceberg unitests (inc TestTableEncryption)? > 2. The standard RESTableOperations class, built in Iceberg, uses a safe approach to getting the metadata object (directly from the REST catalog server, never from the metadata.json file that can be kept in untrusted storage). Can custom TO replacements behave differently in this respect? > > If any of these points is a concern, then I believe it can be addressed just by adding a few lines to the RESTOperationsBuilder javadoc API comments. What do you think? > Should the implementors of custom TOs validate them by running the Iceberg unit tests? For a custom TableOperations (TO) implementation, the responsibility lies entirely with the implementer. They must ensure proper handling of encryption and any other security-sensitive logic. That said, even for a custom implementation, I'd expect most of the core logic to remain unchanged and continue to rely on the unmanaged components provided by Iceberg sdk. We can add some comments or documentation notes to call this out explicitly, so that anyone implementing a custom TableOperations is aware of the encryption keys and understands the need to handle encryption properly (IIRC, in your PR, an additional param will be passed). This should help prevent misuse or accidental security gaps when extending the default implementation. That said, if someone chooses to extend the default TO, they should take full responsibility for doing so safely. The same applies to the ClientBuilder: users may provide their own HttpClient (for example, to support custom logic (shared connection pool, PrivateLink, proxy, or mTLS, ), and it’s their responsibility to ensure it doesn’t break core functionality. > The standard RESTableOperations class, built in Iceberg, uses a safe approach to getting the metadata object (directly from the REST catalog server, never from the metadata.json file that can be kept in untrusted storage). Can custom TO replacements behave differently in this respect? As mentioned earlier, that’s already possible even without this PR. Anyone can build their own library or copy the Iceberg SDK code and modify it as they wish. Iceberg is a specification, and the Apache Iceberg repository serves as a reference implementation, we can't prevent developers from customizing it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
