yittg commented on issue #2575:
URL: https://github.com/apache/iceberg/issues/2575#issuecomment-1047432236
The following test fails easily. IIUC, it is equivalent to the original
test. It fails in one of two ways:
1. A snapshot is committed with more than one data file;
2. Each snapshot is committed with one data file, but the test still fails
on the final assert.
Hope this helps, @openinx.
```
@Test
public void testHashDistributeMode() throws Exception {
  String tableName = "test_hash_distribution_mode";
  Map<String, String> tableProps = ImmutableMap.of(
      "write.format.default", format.name(),
      TableProperties.WRITE_DISTRIBUTION_MODE, DistributionMode.HASH.modeName()
  );

  // Bounded datagen source that emits the ids 1..100000 in sequence.
  sql("CREATE TABLE default_catalog.default_database.src (id INT) WITH %s",
      toWithClause(ImmutableMap.of(
          "connector", "datagen",
          "number-of-rows", "100000",
          "rows-per-second", "100000",
          "fields.id.kind", "sequence",
          "fields.id.start", "1",
          "fields.id.end", "100000")));

  sql("CREATE TABLE %s(id INT, data VARCHAR) PARTITIONED BY (data) WITH %s",
      tableName, toWithClause(tableProps));

  try {
    // Insert data set.
    sql("INSERT INTO %s SELECT id, 'aaa' as data FROM default_catalog.default_database.src",
        tableName);

    Table table = validationCatalog.loadTable(TableIdentifier.of(icebergNamespace, tableName));

    // Sometimes we will have more than one checkpoint if we pass the auto checkpoint interval,
    // thus producing multiple snapshots. Here we assert that each snapshot has only 1 file
    // per partition.
    Map<Long, List<DataFile>> snapshotToDataFiles = SimpleDataUtil.snapshotToDataFiles(table);
    for (List<DataFile> dataFiles : snapshotToDataFiles.values()) {
      Assert.assertEquals("There should be 1 data file in partition 'aaa'", 1,
          SimpleDataUtil.matchingPartitions(dataFiles, table.spec(),
              ImmutableMap.of("data", "aaa")).size());
    }
  } finally {
    sql("DROP TABLE IF EXISTS %s.%s", flinkDatabase, tableName);
  }
}
```
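
To make the final assert easier to reason about without a Flink cluster,
here is a minimal standalone sketch of the invariant it checks: for every
snapshot, exactly one data file in partition 'aaa'. This is not Iceberg
code. `SnapshotFileCheck`, `FileEntry`, `filesPerSnapshot` and `violations`
are hypothetical names that only stand in for what I understand
`SimpleDataUtil.snapshotToDataFiles` and `matchingPartitions` to provide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SnapshotFileCheck {

  // Hypothetical stand-in for a committed data file: the snapshot that
  // added it and the value of its "data" partition column.
  static final class FileEntry {
    final long snapshotId;
    final String partition;

    FileEntry(long snapshotId, String partition) {
      this.snapshotId = snapshotId;
      this.partition = partition;
    }
  }

  // Counts, per snapshot, the files written to the given partition.
  // Snapshots with no file in that partition do not appear in the result.
  static Map<Long, Integer> filesPerSnapshot(List<FileEntry> files, String partition) {
    Map<Long, Integer> counts = new TreeMap<>();
    for (FileEntry f : files) {
      if (f.partition.equals(partition)) {
        counts.merge(f.snapshotId, 1, Integer::sum);
      }
    }
    return counts;
  }

  // Returns the ids of snapshots holding more than one file for the
  // partition, i.e. the snapshots that would trip the assert in the test.
  static List<Long> violations(List<FileEntry> files, String partition) {
    List<Long> bad = new ArrayList<>();
    for (Map.Entry<Long, Integer> e : filesPerSnapshot(files, partition).entrySet()) {
      if (e.getValue() != 1) {
        bad.add(e.getKey());
      }
    }
    return bad;
  }

  public static void main(String[] args) {
    // Snapshot 1 wrote one file to 'aaa'; snapshot 2 wrote two.
    List<FileEntry> files = Arrays.asList(
        new FileEntry(1L, "aaa"),
        new FileEntry(2L, "aaa"),
        new FileEntry(2L, "aaa"));
    System.out.println(violations(files, "aaa")); // prints [2]
  }
}
```

Snapshot 2 in the sample data corresponds to failure kind 1 above: a single
commit that carries more than one file for the same partition.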
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]