pvary commented on a change in pull request #2126:
URL: https://github.com/apache/iceberg/pull/2126#discussion_r561796626
##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -296,6 +299,46 @@ public void testInsert() throws IOException {
HiveIcebergTestUtils.validateData(table, new ArrayList<>(HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS), 0);
}
+ @Test
+ public void testInsertSupportedTypes() throws IOException {
+ Assume.assumeTrue("Tez write is not implemented yet", executionEngine.equals("mr"));
+ for (int i = 0; i < SUPPORTED_TYPES.size(); i++) {
+ Type type = SUPPORTED_TYPES.get(i);
+ // TODO: remove this filter when issue #1881 is resolved
+ if (type == Types.UUIDType.get() && fileFormat == FileFormat.PARQUET) {
+ continue;
+ }
+ // TODO: remove this filter when we figure out how we could test binary types
+ if (type.equals(Types.BinaryType.get()) || type.equals(Types.FixedType.ofLength(5))) {
+ continue;
+ }
+ String tableName = type.typeId().toString().toLowerCase() + "_table_" + i;
+ String columnName = type.typeId().toString().toLowerCase() + "_column";
+
+ Schema schema = new Schema(required(1, "id", Types.LongType.get()), required(2, columnName, type));
+ List<Record> expected = TestHelper.generateRandomRecords(schema, 5, 0L);
+ List<Record> records = new ArrayList<>(expected.size());
+ if (type == Types.TimestampType.withoutZone()) {
+ expected.forEach(r -> records.add(r.copy()));
+ records.forEach(r -> r.set(1, Timestamp.valueOf((LocalDateTime) r.get(1))));
+ } else if (type == Types.TimestampType.withZone()) {
+ expected.forEach(r -> records.add(r.copy()));
+ records.forEach(r -> r.set(1, Timestamp.from(((OffsetDateTime) r.get(1)).toInstant())));
+ } else {
+ records.addAll(expected);
+ }
Review comment:
Do I understand correctly that this conversion exists to create timestamps
whose toString output is what Hive expects? Is it OK to set a Timestamp on a
Record field whose type is Types.TimestampType?
Maybe it would be better to have a `Map<Long, String> forValuesClause`, move
every transformation there (Timestamp, Boolean, and maybe later Fixed/Binary),
and then just concatenate it for the query?
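
To make the suggestion concrete, here is a minimal sketch of the idea, not a proposed implementation. The class `ValuesClauseSketch` and the helpers `toLiteral`, `forValuesClause` and `insertQuery` are hypothetical names, the input is a plain `Map<Long, Object>` rather than Iceberg `Record`s, and the string quoting is deliberately naive. The timestamp conversions mirror the ones in the diff above.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ValuesClauseSketch {

  // Hypothetical helper: render one Java value as a SQL literal for the VALUES clause.
  // All type-specific transformations live here, so the Records stay untouched.
  static String toLiteral(Object value) {
    if (value == null) {
      return "NULL";
    }
    if (value instanceof LocalDateTime) {
      // Same conversion as the diff: rely on Timestamp.toString() for the text form.
      return "'" + Timestamp.valueOf((LocalDateTime) value) + "'";
    }
    if (value instanceof OffsetDateTime) {
      // Timestamp.toString() renders in the JVM default time zone.
      return "'" + Timestamp.from(((OffsetDateTime) value).toInstant()) + "'";
    }
    if (value instanceof Number || value instanceof Boolean) {
      return value.toString();
    }
    // Assumption: backslash-escaping single quotes is enough for test data.
    return "'" + value.toString().replace("'", "\\'") + "'";
  }

  // Build the id -> literal map; LinkedHashMap keeps the insertion order of the rows.
  static Map<Long, String> forValuesClause(Map<Long, Object> valuesById) {
    Map<Long, String> literals = new LinkedHashMap<>();
    valuesById.forEach((id, value) -> literals.put(id, toLiteral(value)));
    return literals;
  }

  // Concatenate the prepared literals into a single INSERT statement.
  static String insertQuery(String tableName, Map<Long, String> literals) {
    StringJoiner rows = new StringJoiner(", ");
    literals.forEach((id, literal) -> rows.add("(" + id + ", " + literal + ")"));
    return "INSERT INTO " + tableName + " VALUES " + rows;
  }

  public static void main(String[] args) {
    Map<Long, Object> values = new LinkedHashMap<>();
    values.put(1L, true);
    values.put(2L, LocalDateTime.of(2021, 1, 21, 10, 15, 30));
    System.out.println(insertQuery("some_table", forValuesClause(values)));
  }
}
```

With something like this, the `expected` records could be passed to the validation unchanged, and only the strings used to build the query would carry the Hive-specific formatting.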