ggershinsky commented on code in PR #13225:
URL: https://github.com/apache/iceberg/pull/13225#discussion_r2490399490


##########
core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java:
##########
@@ -166,7 +208,67 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
 
   @Override
   public FileIO io() {
-    return io;
+    if (encryptionKeyId == null) {
+      return io;
+    }
+
+    if (encryptingFileIO == null) {
+      encryptingFileIO = EncryptingFileIO.combine(io, encryption());
+    }
+
+    return encryptingFileIO;
+  }
+
+  @Override
+  public EncryptionManager encryption() {
+    if (encryptionManager != null) {
+      return encryptionManager;
+    }
+
+    if (encryptionKeyId != null) {
+      if (kmsClient == null) {
+        throw new RuntimeException(
+            "Cant create encryption manager, because key management client is 
not set");
+      }
+
+      Map<String, String> tableProperties = Maps.newHashMap();
+      tableProperties.put(TableProperties.ENCRYPTION_TABLE_KEY, 
encryptionKeyId);
+      tableProperties.put(
+          TableProperties.ENCRYPTION_DEK_LENGTH, 
String.valueOf(encryptionDekLength));
+      encryptionManager =
+          EncryptionUtil.createEncryptionManager(
+              encryptedKeysFromMetadata, tableProperties, kmsClient);
+    } else {
+      return PlaintextEncryptionManager.instance();
+    }
+
+    return encryptionManager;
+  }
+
+  private void encryptionPropsFromMetadata(TableMetadata metadata) {

Review Comment:
   > for the Hive case it's only the table properties that are stored in HMS, 
right? 
   
   Correct. That's why the Hive catalog provides partial security (data 
confidentiality only; not integrity). REST catalogs can provide full security 
for encrypted tables.
   
   >  Is there a follow-up there or is this sufficient for the threat model?
   
   Afaik, HMS is not optimized for JSON storage. But maybe someone in the 
community will take on storing the full metadata object there, to improve table 
security. Having only the table properties is barely sufficient. I think we 
should recommend REST catalogs for encrypted tables.
   
   > REST catalog checking those table properties are correct before returning 
them
   
   To check this, the REST catalog would need to keep these properties in an 
independent storage / db? If so, maybe the catalog can just keep the whole 
metadata object there? (as a json file in an independent storage or as an 
object in a DB/store)
   
   Ok, here's my current understanding of the advantages of each approach - all 
corrections/additions are welcome.
   
   Storing the encryption properties
   - implementation: it might be easier to store two properties instead of a 
metadata object (in certain databases, backing the catalog)
   
   Storing the whole metadata object
   - better security (table integrity is guaranteed)
   - implementation: it might be easier to store the whole metadata object (in 
object stores /nosql db's /regular file systems, backing the catalog)
   - REST spec addition is concise, simple and easy to understand (see the 
first pass at https://github.com/apache/iceberg/pull/14486). An implementation 
is either "secure" or not. Going deep into threat model levels and encryption 
property details will make the addition much bigger and more complex; might be 
out of place for this spec.
   (also, a simple security spec means a lower risk of wrong implementations / 
security vulnerabilities)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to