marton-bod commented on a change in pull request #2644:
URL: https://github.com/apache/hive/pull/2644#discussion_r710037137
##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java
##########
@@ -299,4 +311,68 @@ public static void validateDataWithSQL(TestHiveShell shell, String tableName, Li
}
}
}
+
+ /**
+ * @param table The table to create the delete file for
+ * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+ * @param equalityFields List of field names that should play a role in the equality check
+ * @param fileFormat The file format that should be used for writing out the delete file
+ * @param rowsToDelete The rows that should be deleted. It's enough to fill out the fields that are relevant for the
+ *        equality check, as listed in equalityFields; the rest of the fields are ignored
+ * @return The DeleteFile created
+ * @throws IOException If there is an error during DeleteFile write
+ */
+ public static DeleteFile createEqualityDeleteFile(Table table, String deleteFilePath, List<String> equalityFields,
+ FileFormat fileFormat, List<Record> rowsToDelete) throws IOException {
+ List<Integer> equalityFieldIds = equalityFields.stream()
+ .map(id -> table.schema().findField(id).fieldId())
+ .collect(Collectors.toList());
+ Schema eqDeleteRowSchema = table.schema().select(equalityFields.toArray(new String[]{}));
+
+ FileAppenderFactory<Record> appenderFactory = new GenericAppenderFactory(table.schema(), table.spec(),
+ ArrayUtil.toIntArray(equalityFieldIds), eqDeleteRowSchema, null);
+ EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
+     new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));
+
+ PartitionKey part = new PartitionKey(table.spec(), eqDeleteRowSchema);
+ part.partition(rowsToDelete.get(0));
+ EqualityDeleteWriter<Record> eqWriter = appenderFactory.newEqDeleteWriter(outputFile, fileFormat, part);
+ try (EqualityDeleteWriter<Record> writer = eqWriter) {
+ writer.deleteAll(rowsToDelete);
+ }
+ return eqWriter.toDeleteFile();
+ }
+
+ /**
+ * @param table The table to create the delete file for
+ * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+ * @param fileFormat The file format that should be used for writing out the delete file
+ * @param partitionValues A map of partition values (partitionKey=partitionVal, ...) to be used for the delete file
+ * @param deletes The list of position deletes, each containing the data file path, the position of the row in the
+ *        data file and the row itself that should be deleted
+ * @return The DeleteFile created
+ * @throws IOException If there is an error during DeleteFile write
+ */
+ public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath, FileFormat fileFormat,
Review comment:
Added a new test where the rows are not supplied. The NPE was caused by always supplying `table.schema()` as the positionDeleteSchema to the writer, even when the row was not available in the PositionDelete object.
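
For reference, here is roughly what the helper looks like with that change applied. Sketch only: the signature matches the javadoc above, but the `partitionValues` handling is simplified to the unpartitioned case, so this is not the exact patch.

```java
public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath,
    FileFormat fileFormat, Map<String, Object> partitionValues,
    List<PositionDelete<Record>> deletes) throws IOException {
  // The fix: only pass a row schema to the appender factory if the position
  // deletes actually carry rows; passing table.schema() for row-less deletes
  // is what caused the NPE in the writer.
  Schema posDeleteRowSchema = deletes.get(0).row() == null ? null : table.schema();

  FileAppenderFactory<Record> appenderFactory = new GenericAppenderFactory(
      table.schema(), table.spec(), null, null, posDeleteRowSchema);
  EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
      new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));

  // Sketch assumes an unpartitioned table; the real helper would derive the
  // partition from the partitionValues map.
  PositionDeleteWriter<Record> posWriter =
      appenderFactory.newPosDeleteWriter(outputFile, fileFormat, null);
  try (PositionDeleteWriter<Record> writer = posWriter) {
    for (PositionDelete<Record> delete : deletes) {
      writer.delete(delete.path(), delete.pos(), delete.row());
    }
  }
  return posWriter.toDeleteFile();
}
```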
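
For completeness, a quick usage sketch of the `createEqualityDeleteFile` helper further up. The table layout and column name are made up for illustration, and a format-version 2 table is assumed, since row-level deletes require v2.

```java
// Delete every row whose customer_id equals 42; only the equality field needs
// to be populated on the record, the remaining fields are ignored by the helper.
Record toDelete = GenericRecord.create(table.schema());
toDelete.setField("customer_id", 42L);

DeleteFile eqDelete = HiveIcebergTestUtils.createEqualityDeleteFile(
    table, "eq-delete-file", Collections.singletonList("customer_id"),
    FileFormat.PARQUET, Collections.singletonList(toDelete));

// Commit the delete file so subsequent reads filter out the matching rows.
table.newRowDelta().addDeletes(eqDelete).commit();
```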