szlta commented on code in PR #13225:
URL: https://github.com/apache/iceberg/pull/13225#discussion_r2477472541
##########
core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java:
##########
@@ -166,7 +208,67 @@ public void commit(TableMetadata base, TableMetadata metadata) {
@Override
public FileIO io() {
- return io;
+ if (encryptionKeyId == null) {
+ return io;
+ }
+
+ if (encryptingFileIO == null) {
+ encryptingFileIO = EncryptingFileIO.combine(io, encryption());
+ }
+
+ return encryptingFileIO;
+ }
+
+ @Override
+ public EncryptionManager encryption() {
+ if (encryptionManager != null) {
+ return encryptionManager;
+ }
+
+ if (encryptionKeyId != null) {
+ if (kmsClient == null) {
+ throw new RuntimeException(
+ "Cant create encryption manager, because key management client is not set");
+ }
+
+ Map<String, String> tableProperties = Maps.newHashMap();
+ tableProperties.put(TableProperties.ENCRYPTION_TABLE_KEY, encryptionKeyId);
+ tableProperties.put(
+ TableProperties.ENCRYPTION_DEK_LENGTH, String.valueOf(encryptionDekLength));
+ encryptionManager =
+ EncryptionUtil.createEncryptionManager(
+ encryptedKeysFromMetadata, tableProperties, kmsClient);
+ } else {
+ return PlaintextEncryptionManager.instance();
+ }
+
+ return encryptionManager;
+ }
+
+ private void encryptionPropsFromMetadata(TableMetadata metadata) {
Review Comment:
I see two scenarios when thinking about this:
1. metadata.json is something that both the server and the clients can read
(although clients wouldn't need to, given that they get the metadata with the
`LoadTableResponse`).
2. metadata.json can only be accessed on the server side, and clients are not
given FS credentials (vended or otherwise) to reach it.
For case (1) I totally agree: we can't rely on metadata.json alone to store
these encryption properties. The catalog should store them separately too, and
populate them (i.e. apply the override logic referred to by @ggershinsky) in
the `LoadTableResponse` it creates.
For case (2) I'm not 100% sure, but I'm still leaning toward the catalog taking
on this responsibility.
Either way, on the client side there's not much we can do other than
recommend that clients consider only the metadata from the `LoadTableResponse`.
The rest (no pun intended) is for the server side to decide and will be
implementation-specific. For the code snippet above, this is irrelevant IMHO.
Let me know your thoughts.
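To make the override idea in case (1) concrete, here is a rough sketch of what a catalog server might do before building the `LoadTableResponse`. Everything in it is hypothetical: `EncryptionPropsOverride` and `apply` are illustrative names that are not part of Iceberg, the two key strings are placeholders standing in for `TableProperties.ENCRYPTION_TABLE_KEY` and `TableProperties.ENCRYPTION_DEK_LENGTH`, and dropping a property the catalog has no record of is an assumption of this sketch, not something decided in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical server-side helper; not part of Iceberg or of this PR. */
final class EncryptionPropsOverride {
  // Placeholder keys standing in for TableProperties.ENCRYPTION_TABLE_KEY and
  // TableProperties.ENCRYPTION_DEK_LENGTH. The literal strings are assumptions.
  static final String TABLE_KEY = "encryption.key-id";
  static final String DEK_LENGTH = "encryption.data-key-length";

  private static final Set<String> ENCRYPTION_KEYS = Set.of(TABLE_KEY, DEK_LENGTH);

  private EncryptionPropsOverride() {}

  /**
   * Returns a copy of the properties read from metadata.json in which the
   * encryption properties are replaced by the values the catalog stored itself.
   */
  static Map<String, String> apply(
      Map<String, String> fromMetadataJson, Map<String, String> fromCatalogStore) {
    Map<String, String> result = new HashMap<>(fromMetadataJson);
    for (String key : ENCRYPTION_KEYS) {
      String trusted = fromCatalogStore.get(key);
      if (trusted != null) {
        // The catalog's own record wins over whatever metadata.json says.
        result.put(key, trusted);
      } else {
        // Assumption of this sketch: a value the catalog never recorded is dropped.
        result.remove(key);
      }
    }
    return result;
  }
}
```

With something like this on the server, a client that only looks at the metadata in the `LoadTableResponse` would see the catalog's values for these properties, whatever metadata.json contains.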
##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestTableEncryption.java:
##########
@@ -118,20 +128,23 @@ public void testRefresh() {
@TestTemplate
public void testTransaction() {
- catalog.initialize(catalogName, catalogConfig);
-
- Table table = catalog.loadTable(tableIdent);
+ validationCatalog.initialize(catalogName, catalogConfig);
+ Table table = validationCatalog.loadTable(tableIdent);
List<DataFile> dataFiles = currentDataFiles(table);
Transaction transaction = table.newTransaction();
AppendFiles append = transaction.newAppend();
// add an arbitrary datafile
append.appendFile(dataFiles.get(0));
+
+ // append to the table in the meantime. use a separate load to avoid shared operations
+ validationCatalog.loadTable(tableIdent).newFastAppend().appendFile(dataFiles.get(0)).commit();
Review Comment:
Sure, this does look good to me
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]