rdblue commented on a change in pull request #828:
URL: https://github.com/apache/iceberg/pull/828#discussion_r434840901
##########
File path: spark/src/test/java/org/apache/iceberg/spark/data/TestHelpers.java
##########
@@ -78,6 +81,48 @@ public static void assertEqualsSafe(Types.StructType struct, Record rec, Row row
     }
   }
+  public static void assertEqualsUnsafe(Types.StructType struct, List<Record> expected, ColumnarBatch batch) {
Review comment:
This method isn't used, and the main difference between it and
`assertArrowVectors` is that the other one asserts that each vector is an
`IcebergArrowColumnVector`, which I don't think is actually necessary. Is there
a reason to require a specific type instead of just validating the data in each
row?
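For illustration, a check along these lines would exercise the data through the generic `ColumnVector` interface without caring about the concrete vector class (a minimal sketch, not code from this PR; the column index and getter are placeholders):
```java
// Sketch only: read values through the generic ColumnVector API instead of
// casting each vector to IcebergArrowColumnVector.
ColumnVector vector = batch.column(0);  // placeholder: an int column
for (int r = 0; r < batch.numRows(); r++) {
  if (!vector.isNullAt(r)) {
    int actual = vector.getInt(r);
    // compare `actual` against the expected Record value for this row
  }
}
```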
Also, since `ColumnarBatch` can expose each row as an `InternalRow`, which the existing helpers already support, I was able to call the existing `assertEqualsUnsafe(StructType, Record, InternalRow)` in a loop, like this:
```java
public static void assertEqualsBatch(Types.StructType struct, List<Record> expected, ColumnarBatch batch) {
  for (int r = 0; r < batch.numRows(); r++) {
    assertEqualsUnsafe(struct, expected.get(r), batch.getRow(r));
  }
}
```
That required fixing null handling in `assertEqualsUnsafe`, which wasn't calling `isNullAt`, but once that was updated, everything works without these two fairly large methods. I'd prefer to switch to `assertEqualsBatch` unless you think that introduces a problem.
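For reference, the `isNullAt` fix is roughly the following (a sketch of the idea, not the exact diff; it assumes the existing per-value `assertEqualsUnsafe(Type, Object, Object)` helper and `SparkSchemaUtil.convert`):
```java
// Sketch only: inside assertEqualsUnsafe(StructType, Record, InternalRow),
// check isNullAt before reading, so a null field in the expected Record is
// compared against SQL NULL rather than a default primitive value.
for (int i = 0; i < struct.fields().size(); i += 1) {
  Types.NestedField field = struct.fields().get(i);
  Object expectedValue = rec.getField(field.name());
  Object actualValue = row.isNullAt(i) ? null : row.get(i, SparkSchemaUtil.convert(field.type()));
  assertEqualsUnsafe(field.type(), expectedValue, actualValue);
}
```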