the-other-tim-brown commented on code in PR #518:
URL: https://github.com/apache/incubator-xtable/pull/518#discussion_r1770711518
##########
xtable-core/src/test/java/org/apache/xtable/ITConversionController.java:
##########
@@ -797,13 +865,72 @@ private void checkDatasetEquivalence(
// if count is not known ahead of time, ensure datasets are non-empty
assertFalse(dataset1Rows.isEmpty());
}
+
+ if (containsUUIDFields(dataset1Rows) && containsUUIDFields(dataset2Rows)) {
+ compareDatasetWithUUID(dataset1Rows, dataset2Rows);
+ } else {
+ assertEquals(
+ dataset1Rows,
+ dataset2Rows,
+ String.format(
+ "Datasets are not equivalent when reading from Spark. Source: %s, Target: %s",
+ sourceFormat, format));
+ }
+ });
+ }
+
+ /**
+ * Compares two datasets where dataset1Rows is for Iceberg and dataset2Rows is for other formats
+ * (such as Delta or Hudi). - For the "uuid_field", if present, the UUID from dataset1 (Iceberg)
+ * is compared with the Base64-encoded UUID from dataset2 (other formats), after decoding. - For
+ * all other fields, the values are compared directly. - If neither row contains the "uuid_field",
+ * the rows are compared as plain JSON strings.
+ *
+ * @param dataset1Rows List of JSON rows representing the dataset in Iceberg format (UUID is
+ * stored as a string).
+ * @param dataset2Rows List of JSON rows representing the dataset in other formats (UUID might be
+ * Base64-encoded).
+ */
+ private void compareDatasetWithUUID(List<String> dataset1Rows, List<String> dataset2Rows) {
+ for (int i = 0; i < dataset1Rows.size(); i++) {
+ String row1 = dataset1Rows.get(i);
+ String row2 = dataset2Rows.get(i);
Review Comment:
We're no longer asserting that the other fields are also equivalent. Is
there a way we can do that as well?
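One possible shape for this, as a rough sketch rather than the PR's actual code: set the UUID field aside, assert that the remaining fields match exactly, then compare the UUIDs after decoding. The class and helper names here (`UuidRowComparison`, `rowsEquivalent`, `decodeBase64Uuid`, `encodeBase64Uuid`) are hypothetical, rows are assumed to be already parsed into maps, and the Base64 value is assumed to hold the 16 raw UUID bytes in big-endian order.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class UuidRowComparison {
  // Field name taken from the Javadoc in the diff.
  static final String UUID_FIELD = "uuid_field";

  // Assumption: the non-Iceberg formats surface the UUID as Base64 over its 16 raw bytes.
  static UUID decodeBase64Uuid(String encoded) {
    byte[] bytes = Base64.getDecoder().decode(encoded);
    if (bytes.length != 16) {
      throw new IllegalArgumentException("Expected 16 bytes but got " + bytes.length);
    }
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    return new UUID(buffer.getLong(), buffer.getLong());
  }

  // Inverse of decodeBase64Uuid, handy for building test rows.
  static String encodeBase64Uuid(UUID uuid) {
    ByteBuffer buffer = ByteBuffer.allocate(16);
    buffer.putLong(uuid.getMostSignificantBits());
    buffer.putLong(uuid.getLeastSignificantBits());
    return Base64.getEncoder().encodeToString(buffer.array());
  }

  // Returns true only if every non-UUID field matches AND the UUIDs match after decoding.
  static boolean rowsEquivalent(Map<String, Object> icebergRow, Map<String, Object> otherRow) {
    Map<String, Object> left = new HashMap<>(icebergRow);
    Map<String, Object> right = new HashMap<>(otherRow);
    Object leftUuid = left.remove(UUID_FIELD);
    Object rightUuid = right.remove(UUID_FIELD);
    // All remaining fields must be equal, so no field goes unchecked.
    if (!left.equals(right)) {
      return false;
    }
    // If either side lacks the UUID field, both must lack it.
    if (leftUuid == null || rightUuid == null) {
      return leftUuid == rightUuid;
    }
    UUID expected = UUID.fromString(leftUuid.toString());
    return expected.equals(decodeBase64Uuid(rightUuid.toString()));
  }
}
```

In the test itself, each JSON row could be parsed into a map first and the result wrapped in `assertTrue`, so that a mismatch in any field fails the test, not just a mismatch in the UUID.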
##########
xtable-core/src/test/java/org/apache/xtable/ITConversionController.java:
##########
@@ -103,6 +109,7 @@ public class ITConversionController {
private static JavaSparkContext jsc;
private static SparkSession sparkSession;
+ private static final ObjectMapper objectMapper = new ObjectMapper();
Review Comment:
Use `UPPER_UNDERSCORE` naming when defining constants (e.g. `OBJECT_MAPPER`), and move this up to line 109
to group it with `DATE_FORMAT`.