pvary commented on a change in pull request #2126:
URL: https://github.com/apache/iceberg/pull/2126#discussion_r561786578
##########
File path:
hive3/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergDateObjectInspectorHive3.java
##########
@@ -69,4 +69,12 @@ public Object copyObject(Object o) {
}
}
+ @Override
+ public LocalDate convert(Object o) {
+ if (o == null) {
+ return null;
+ }
+ Date date = (Date) o;
Review comment:
Nit of the nit: please add a blank line after the if block 😄
##########
File path:
mr/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergDecimalObjectInspector.java
##########
@@ -80,6 +80,12 @@ public Object copyObject(Object o) {
@Override
public BigDecimal convert(Object o) {
- return o == null ? null : ((HiveDecimal) o).bigDecimalValue();
+ if (o == null) {
+ return null;
+ }
+
+ BigDecimal result = ((HiveDecimal) o).bigDecimalValue();
+ result = result.setScale(scale());
Review comment:
Can we add a comment here?
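
For context on what such a comment might explain, here is a minimal sketch using plain `BigDecimal` values. It assumes the purpose of `setScale(scale())` is to pad the converted value back to the declared scale of the column; `DecimalScaleSketch` and `toDeclaredScale` are hypothetical names, not part of the PR.

```java
import java.math.BigDecimal;

public class DecimalScaleSketch {
  // Hypothetical stand-in for the inspector's convert(): pads the value to the declared scale.
  public static BigDecimal toDeclaredScale(BigDecimal value, int declaredScale) {
    // setScale(int) without a RoundingMode can only add trailing zeros;
    // it throws ArithmeticException if digits would have to be dropped.
    return value.setScale(declaredScale);
  }

  public static void main(String[] args) {
    BigDecimal trimmed = new BigDecimal("1.5");
    BigDecimal padded = toDeclaredScale(trimmed, 3);

    System.out.println(padded);                    // 1.500
    System.out.println(trimmed.equals(padded));    // false: equals() also compares the scale
    System.out.println(trimmed.compareTo(padded)); // 0: numerically equal
  }
}
```

Because `BigDecimal.equals` takes the scale into account, a value with a different scale would fail an equality check against the expected record even though the numbers match.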
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -296,6 +299,46 @@ public void testInsert() throws IOException {
HiveIcebergTestUtils.validateData(table, new ArrayList<>(HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS), 0);
}
+ @Test
+ public void testInsertSupportedTypes() throws IOException {
+ Assume.assumeTrue("Tez write is not implemented yet", executionEngine.equals("mr"));
+ for (int i = 0; i < SUPPORTED_TYPES.size(); i++) {
+ Type type = SUPPORTED_TYPES.get(i);
+ // TODO: remove this filter when issue #1881 is resolved
+ if (type == Types.UUIDType.get() && fileFormat == FileFormat.PARQUET) {
Review comment:
Can we use Assume here too?
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -296,6 +299,46 @@ public void testInsert() throws IOException {
HiveIcebergTestUtils.validateData(table, new ArrayList<>(HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS), 0);
}
+ @Test
+ public void testInsertSupportedTypes() throws IOException {
+ Assume.assumeTrue("Tez write is not implemented yet", executionEngine.equals("mr"));
+ for (int i = 0; i < SUPPORTED_TYPES.size(); i++) {
+ Type type = SUPPORTED_TYPES.get(i);
+ // TODO: remove this filter when issue #1881 is resolved
+ if (type == Types.UUIDType.get() && fileFormat == FileFormat.PARQUET) {
+ continue;
+ }
+ // TODO: remove this filter when we figure out how we could test binary types
+ if (type.equals(Types.BinaryType.get()) || type.equals(Types.FixedType.ofLength(5))) {
+ continue;
+ }
+ String tableName = type.typeId().toString().toLowerCase() + "_table_" + i;
+ String columnName = type.typeId().toString().toLowerCase() + "_column";
+
+ Schema schema = new Schema(required(1, "id", Types.LongType.get()), required(2, columnName, type));
+ List<Record> expected = TestHelper.generateRandomRecords(schema, 5, 0L);
+ List<Record> records = new ArrayList<>(expected.size());
+ if (type == Types.TimestampType.withoutZone()) {
+ expected.forEach(r -> records.add(r.copy()));
+ records.forEach(r -> r.set(1, Timestamp.valueOf((LocalDateTime) r.get(1))));
+ } else if (type == Types.TimestampType.withZone()) {
+ expected.forEach(r -> records.add(r.copy()));
+ records.forEach(r -> r.set(1, Timestamp.from(((OffsetDateTime) r.get(1)).toInstant())));
+ } else {
+ records.addAll(expected);
+ }
Review comment:
Do I understand correctly that this conversion creates timestamps whose
toString output is the format Hive expects? Is it OK to set a Timestamp on a
field of a Record whose type is Types.TimestampType?
It might be better to build a `Map<Long, String> forValuesClause`, move every
transformation there (Timestamp, Boolean, and perhaps later Fixed/Binary), and
then just concatenate its entries into the query.
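
To make the suggestion concrete, here is a rough sketch of the idea using only JDK types. It assumes the map is keyed by the `id` column and holds the already-converted string for the second column; `ValuesClauseSketch`, `toHiveString`, and `valuesClause` are hypothetical names, and only the two timestamp conversions from the diff are covered.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ValuesClauseSketch {
  // Hypothetical helper: converts a generated value into the text Hive should receive.
  public static String toHiveString(Object value) {
    if (value instanceof LocalDateTime) {
      return Timestamp.valueOf((LocalDateTime) value).toString();
    }
    if (value instanceof OffsetDateTime) {
      return Timestamp.from(((OffsetDateTime) value).toInstant()).toString();
    }
    return String.valueOf(value);
  }

  // Concatenates the map entries into "(id,'value'),(id,'value')".
  public static String valuesClause(Map<Long, String> forValuesClause) {
    return forValuesClause.entrySet().stream()
        .map(e -> String.format("(%s,'%s')", e.getKey(), e.getValue()))
        .collect(Collectors.joining(","));
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so the rows appear in a stable order.
    Map<Long, String> forValuesClause = new LinkedHashMap<>();
    forValuesClause.put(1L, toHiveString(LocalDateTime.of(2021, 1, 21, 10, 15, 30)));
    forValuesClause.put(2L, toHiveString("plain"));
    System.out.println("INSERT INTO some_table VALUES" + valuesClause(forValuesClause));
  }
}
```

With this shape the expected records stay untouched, and only the strings sent to Hive are transformed.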
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -296,6 +299,46 @@ public void testInsert() throws IOException {
HiveIcebergTestUtils.validateData(table, new ArrayList<>(HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS), 0);
}
+ @Test
+ public void testInsertSupportedTypes() throws IOException {
+ Assume.assumeTrue("Tez write is not implemented yet", executionEngine.equals("mr"));
+ for (int i = 0; i < SUPPORTED_TYPES.size(); i++) {
+ Type type = SUPPORTED_TYPES.get(i);
+ // TODO: remove this filter when issue #1881 is resolved
+ if (type == Types.UUIDType.get() && fileFormat == FileFormat.PARQUET) {
+ continue;
+ }
+ // TODO: remove this filter when we figure out how we could test binary types
+ if (type.equals(Types.BinaryType.get()) || type.equals(Types.FixedType.ofLength(5))) {
+ continue;
+ }
+ String tableName = type.typeId().toString().toLowerCase() + "_table_" + i;
+ String columnName = type.typeId().toString().toLowerCase() + "_column";
+
+ Schema schema = new Schema(required(1, "id", Types.LongType.get()), required(2, columnName, type));
+ List<Record> expected = TestHelper.generateRandomRecords(schema, 5, 0L);
+ List<Record> records = new ArrayList<>(expected.size());
Review comment:
In my recent reviews I have learned that in Iceberg we should not create
ArrayLists directly but should use the Guava factory methods instead.
nit: Lists.newArrayListWithCapacity(expected.size())
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -296,6 +299,37 @@ public void testInsert() throws IOException {
HiveIcebergTestUtils.validateData(table, new ArrayList<>(HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS), 0);
}
+ @Test
+ public void testInsertSupportedTypes() throws IOException {
+ Assume.assumeTrue("Tez write is not implemented yet", executionEngine.equals("mr"));
+ for (int i = 0; i < SUPPORTED_TYPES.size(); i++) {
+ Type type = SUPPORTED_TYPES.get(i);
+ // TODO: remove this filter when issue #1881 is resolved
+ if (type == Types.UUIDType.get() && fileFormat == FileFormat.PARQUET) {
+ continue;
+ }
+ // TODO: remove this filter when we figure out how we could test binary types
+ if (type.equals(Types.BinaryType.get()) || type.equals(Types.FixedType.ofLength(5))) {
+ continue;
+ }
+ String tableName = type.typeId().toString().toLowerCase() + "_table_" + i;
+ String columnName = type.typeId().toString().toLowerCase() + "_column";
+
+ Schema schema = new Schema(required(1, "id", Types.LongType.get()), required(2, columnName, type));
+ List<Record> expected = TestHelper.generateRandomRecords(schema, 5, 0L);
+
+ Table table = testTables.createTable(shell, tableName, schema, fileFormat, ImmutableList.of());
+ StringBuilder query = new StringBuilder("INSERT INTO ").append(tableName).append(" VALUES")
+ .append(expected.stream()
+ // in hive2 every boolean value in apostrophes is translated to true
+ .map(r -> String.format(type == Types.BooleanType.get() ? "(%s,%s)" : "(%s,'%s')", r.get(0),
Review comment:
Would it make sense to move the quotation marks into the
`getStringValueForInsert` method as well?
That would fully encapsulate the type-specific handling and might make the
test code easier to understand.
What do you think?
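
As an illustration only, something along these lines is what I have in mind. The real signature of `getStringValueForInsert` is not visible in this hunk, so the single-argument version below and the `row` helper are assumptions, and the sketch dispatches on the Java value rather than on the Iceberg type.

```java
public class InsertLiteralSketch {
  // Hypothetical variant of getStringValueForInsert that returns the complete SQL literal.
  public static String getStringValueForInsert(Object value) {
    if (value instanceof Boolean) {
      // In Hive 2 every boolean value in apostrophes is translated to true,
      // so booleans are emitted without quotation marks.
      return value.toString();
    }
    return "'" + value + "'";
  }

  // The caller no longer needs a type-dependent format string.
  public static String row(Object id, Object value) {
    return "(" + id + "," + getStringValueForInsert(value) + ")";
  }

  public static void main(String[] args) {
    System.out.println(row(1L, false)); // (1,false)
    System.out.println(row(2L, "abc")); // (2,'abc')
  }
}
```

The `String.format` call in the test would then use a single pattern for every type.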