amogh-jahagirdar commented on code in PR #15911:
URL: https://github.com/apache/iceberg/pull/15911#discussion_r3267283383


##########
core/src/main/java/org/apache/iceberg/deletes/BaseDVFileWriter.java:
##########
@@ -214,4 +252,47 @@ public StructLike partition() {
       return partition;
     }
   }
+
+  private static class EncryptedOutputFileWrapper implements 
EncryptedOutputFile, OutputFile {

Review Comment:
   Why is this needed? 



##########
core/src/main/java/org/apache/iceberg/deletes/BaseDVFileWriter.java:
##########
@@ -56,11 +63,12 @@ public class BaseDVFileWriter implements DVFileWriter {
   private final Function<String, PositionDeleteIndex> loadPreviousDeletes;
   private final Map<String, Deletes> deletesByPath = Maps.newHashMap();
   private final Map<String, BlobMetadata> blobsByPath = Maps.newHashMap();
+  private EncryptionKeyMetadata keyMetadata = null;
   private DeleteWriteResult result = null;
 
   public BaseDVFileWriter(
       OutputFileFactory fileFactory, Function<String, PositionDeleteIndex> 
loadPreviousDeletes) {
-    this(() -> fileFactory.newOutputFile().encryptingOutputFile(), 
loadPreviousDeletes);
+    this(() -> asOutputFile(fileFactory.newOutputFile()), loadPreviousDeletes);

Review Comment:
   Ok I had a bit of time to go through this a bit more. I didn't realize 
EncryptedOutputFile doesn't extend OutputFile, it's an orthogonal interface 
which just wraps an actual OutputFile with a key. It looks like that kind of 
forced the wrapper below which is a bit much for what we're trying to achieve 
here.
   
   So after looking at this a bit more, I think what makes sense is to do the 
following:
   
   1. Add a package private constructor which takes 
Supplier<EncryptedOutputFile>. I don't think we should change the public 
constructors. I think one benefit with the disjoint types here is that if we're 
given an OutputFile we can safely assume it's unencrypted.
   
   2. The local state keeps track of Supplier<EncryptedOutputFile>. The 
existing constructors can pass to this package-private constructor 
EncryptedFiles.plainAsEncryptedOutput(OutputFile).
   
   3. I would add a static factory helper to Deletes called `DVFileWriter 
writeDVs(EncryptedOutputFile outputFile,  Function<String, PositionDeleteIndex> 
loadPreviousDeletes)` which calls out to the package private constructor.  Over 
time, we probably want to have another static factory for regular OutputFile 
and then just hide the constructors on the writer, but I think that's for a 
follow on and not this PR. I'm largely trying to mimic what we already do in 
ManifestFiles writing APIs here for consistency.
   
   4. Then we can invoke Deletes.writeDVs whereever is required, and remove the 
wrapper below, and the key metadata can be abstracted from the encrypted 
outputFile afterwards.
   
   5. I think at least on any "read" APIs and helper methods we should only 
require a FileIO and not an EncryptingFileIO because if it's an instance of 
EncryptingFileIO it should already handle reading correctly so long as the 
metadata is setup correctly on the FileIO. I think that'll shrink the change a 
bit more too.



##########
core/src/main/java/org/apache/iceberg/deletes/BaseDVFileWriter.java:
##########
@@ -214,4 +252,47 @@ public StructLike partition() {
       return partition;
     }
   }
+
+  private static class EncryptedOutputFileWrapper implements 
EncryptedOutputFile, OutputFile {

Review Comment:
   Ah okay, my bad on my previous review @manuzhang I didn't realize 
EncryptedOutputFile actually doesn't even extend OutputFile. It looks like this 
wrapper solely exists just to make EncryptedOuputFile adapt to 
Supplier<OutputFile> in the constructor, and then later we extract it.



##########
core/src/main/java/org/apache/iceberg/DVUtil.java:
##########
@@ -159,7 +167,7 @@ private static void validateCanMerge(
    * @return map of referenced data file location to the merged position 
delete index
    */
   private static Map<String, PositionDeleteIndex> readAndMergeDVs(
-      DeleteFile[] duplicateDVs, FileIO io, ExecutorService pool) {
+      DeleteFile[] duplicateDVs, EncryptingFileIO io, ExecutorService pool) {

Review Comment:
   Don't think this needs to change



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to