marton-bod commented on a change in pull request #2644:
URL: https://github.com/apache/hive/pull/2644#discussion_r710037137
##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java
##########
@@ -299,4 +311,68 @@ public static void validateDataWithSQL(TestHiveShell shell, String tableName, Li
}
}
}
+
+ /**
+ * @param table The table to create the delete file for
+ * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+ * @param equalityFields List of field names that should play a role in the equality check
+ * @param fileFormat The file format that should be used for writing out the delete file
+ * @param rowsToDelete The rows that should be deleted. It's enough to fill out the fields that are relevant for the
+ *        equality check, as listed in equalityFields; the rest of the fields are ignored
+ * @return The DeleteFile created
+ * @throws IOException If there is an error during DeleteFile write
+ */
+ public static DeleteFile createEqualityDeleteFile(Table table, String deleteFilePath, List<String> equalityFields,
+ FileFormat fileFormat, List<Record> rowsToDelete) throws IOException {
+ List<Integer> equalityFieldIds = equalityFields.stream()
+ .map(id -> table.schema().findField(id).fieldId())
+ .collect(Collectors.toList());
+ Schema eqDeleteRowSchema = table.schema().select(equalityFields.toArray(new String[]{}));
+
+ FileAppenderFactory<Record> appenderFactory = new GenericAppenderFactory(table.schema(), table.spec(),
+ ArrayUtil.toIntArray(equalityFieldIds), eqDeleteRowSchema, null);
+ EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
+     new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));
+
+ PartitionKey part = new PartitionKey(table.spec(), eqDeleteRowSchema);
+ part.partition(rowsToDelete.get(0));
+ EqualityDeleteWriter<Record> eqWriter = appenderFactory.newEqDeleteWriter(outputFile, fileFormat, part);
+ try (EqualityDeleteWriter<Record> writer = eqWriter) {
+ writer.deleteAll(rowsToDelete);
+ }
+ return eqWriter.toDeleteFile();
+ }
+
+ /**
+ * @param table The table to create the delete file for
+ * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+ * @param fileFormat The file format that should be used for writing out the delete file
+ * @param partitionValues A map of partition values (partitionKey=partitionVal, ...) to be used for the delete file
+ * @param deletes The list of position deletes, each containing the data file path, the position of the row in the
+ *        data file and the row itself that should be deleted
+ * @return The DeleteFile created
+ * @throws IOException If there is an error during DeleteFile write
+ */
+ public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath, FileFormat fileFormat,
Review comment:
Added a new test where the rows are not supplied. The NPE was caused by always supplying `table.schema()` as the positionDeleteSchema to the writer, even when the row was not available in the PositionDelete object.
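
For reference, here is roughly what the helper looks like with that change applied. Sketch only: the signature matches the javadoc above, but the `partitionValues` handling is simplified to the unpartitioned case, so this is not the exact patch.

```java
public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath,
    FileFormat fileFormat, Map<String, Object> partitionValues,
    List<PositionDelete<Record>> deletes) throws IOException {
  // The fix: only pass a row schema to the appender factory if the position
  // deletes actually carry rows; passing table.schema() for row-less deletes
  // is what caused the NPE in the writer.
  Schema posDeleteRowSchema = deletes.get(0).row() == null ? null : table.schema();

  FileAppenderFactory<Record> appenderFactory = new GenericAppenderFactory(
      table.schema(), table.spec(), null, null, posDeleteRowSchema);
  EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
      new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));

  // Sketch assumes an unpartitioned table; the real helper would derive the
  // partition from the partitionValues map.
  PositionDeleteWriter<Record> posWriter =
      appenderFactory.newPosDeleteWriter(outputFile, fileFormat, null);
  try (PositionDeleteWriter<Record> writer = posWriter) {
    for (PositionDelete<Record> delete : deletes) {
      writer.delete(delete.path(), delete.pos(), delete.row());
    }
  }
  return posWriter.toDeleteFile();
}
```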
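
For completeness, a quick usage sketch of the `createEqualityDeleteFile` helper further up. The table layout and column name are made up for illustration, and a format-version 2 table is assumed, since row-level deletes require v2.

```java
// Delete every row whose customer_id equals 42; only the equality field needs
// to be populated on the record, the remaining fields are ignored by the helper.
Record toDelete = GenericRecord.create(table.schema());
toDelete.setField("customer_id", 42L);

DeleteFile eqDelete = HiveIcebergTestUtils.createEqualityDeleteFile(
    table, "eq-delete-file", Collections.singletonList("customer_id"),
    FileFormat.PARQUET, Collections.singletonList(toDelete));

// Commit the delete file so subsequent reads filter out the matching rows.
table.newRowDelta().addDeletes(eqDelete).commit();
```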