xushiyan commented on a change in pull request #3595:
URL: https://github.com/apache/hudi/pull/3595#discussion_r706305915
##########
File path:
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -1108,6 +1119,259 @@ public void testMetdataTableCommitFailure() throws Exception {
     assertTrue(timeline.getRollbackTimeline().countInstants() == 1);
   }
+  /**
+   * Test simple bootstrap of metadata table.
+   * Trigger few write operations and boostrap metadata table. Validate.
+   * Add few more writes to sync and validate.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testBootstrapWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap with few commits
+    testBootstrap(testTable, false);
+  }
+
+  /**
+   * Before bootstrapping, rollback a commit in the original table.
+   * Ensure after bootstrap, sync and validate succeeds.
+   * @throws Exception
+   */
+  @Test
+  public void testBootstrapWithRolledBackCommitTestTable() throws Exception {
+    tableType = HoodieTableType.COPY_ON_WRITE;
+    init(tableType);
Review comment:
Ideally, all test cases should be parameterized with the table type:
```
@ParameterizedTest
@EnumSource(HoodieTableType.class)
```
##########
File path:
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -1108,6 +1119,259 @@ public void testMetdataTableCommitFailure() throws Exception {
     assertTrue(timeline.getRollbackTimeline().countInstants() == 1);
   }
+  /**
+   * Test simple bootstrap of metadata table.
+   * Trigger few write operations and boostrap metadata table. Validate.
+   * Add few more writes to sync and validate.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testBootstrapWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap with few commits
+    testBootstrap(testTable, false);
+  }
+
+  /**
+   * Before bootstrapping, rollback a commit in the original table.
+   * Ensure after bootstrap, sync and validate succeeds.
+   * @throws Exception
+   */
+  @Test
+  public void testBootstrapWithRolledBackCommitTestTable() throws Exception {
+    tableType = HoodieTableType.COPY_ON_WRITE;
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap w/ few commits, but rollback one of the commit before bootstrapping.
+    testBootstrap(testTable, true);
+  }
+
+  private void testBootstrap(HoodieTestTable testTable, boolean addRollback) throws Exception {
Review comment:
I've seen this pattern in many classes: a private method performs all the test steps, with a flag variable controlling the different scenarios, while the individual test methods invoke it with different flag values. We should start avoiding this, for two reasons:
- Control flow is an anti-pattern in test code. Each test case should follow a simple flow: prep -> execute -> verify. Any varying part can be moved to a separate test method to explicitly show a different scenario.
- The control flow mainly exists to reuse some code from the original flow, which is a sign that the flow's code is not concise enough to be repeated. Repeating some code across test cases is acceptable and even preferable: test cases should be isolated, and people want to read the flow as-is without jumping back and forth between methods. Repeating concise test prep and verification logic keeps each scenario readable and manageable in one place. This requires the test utility classes to be properly refactored to do the heavy lifting.
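As a hedged sketch of the suggested shape (all class and method names below are illustrative stand-ins, not the real HoodieTestTable API): each scenario becomes its own flat prep -> execute -> verify method, and the varying step is written inline instead of hiding behind a boolean flag.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the real test table; not actual Hudi API.
class FakeTestTable {
  final List<String> commits = new ArrayList<>();

  void writeCommit(String instant) {
    commits.add(instant);
  }

  void rollbackLastCommit() {
    commits.remove(commits.size() - 1);
  }
}

class BootstrapScenarios {
  // Scenario 1: plain bootstrap. Flat prep -> execute -> verify, no flags.
  static int simpleBootstrap() {
    FakeTestTable table = new FakeTestTable(); // prep
    table.writeCommit("001");                  // execute
    table.writeCommit("002");
    return table.commits.size();               // verify against this
  }

  // Scenario 2: bootstrap after a rollback. The repeated prep lines are
  // acceptable; the differing step is visible inline instead of behind a boolean.
  static int bootstrapWithRollback() {
    FakeTestTable table = new FakeTestTable();
    table.writeCommit("001");
    table.writeCommit("002");
    table.rollbackLastCommit();                // the varying step, explicit
    return table.commits.size();
  }
}
```

Each method can be read top to bottom without jumping to a shared helper, at the cost of a few repeated prep lines.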
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/PartitionFileInfoMap.java
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.common.testutils;
+
+import org.apache.hudi.common.util.collection.Pair;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
+public class PartitionFileInfoMap {
+  Map<String, Map<String, List<Pair<String, Integer>>>> partitionToFileIdMap = new HashMap<>();
+
+  public PartitionFileInfoMap addPartitionAndBasefiles(String commitTime, String partitionPath, List<Integer> lengths) {
+
+    if (!partitionToFileIdMap.containsKey(commitTime)) {
+      partitionToFileIdMap.put(commitTime, new HashMap<>());
+    }
+    if (!this.partitionToFileIdMap.get(commitTime).containsKey(partitionPath)) {
+      this.partitionToFileIdMap.get(commitTime).put(partitionPath, new ArrayList<>());
+    }
+
+    List<Pair<String, Integer>> fileInfos = new ArrayList<>();
+    for (int length : lengths) {
+      fileInfos.add(Pair.of(UUID.randomUUID().toString(), length));
+    }
+    this.partitionToFileIdMap.get(commitTime).get(partitionPath).addAll(fileInfos);
+    return this;
+  }
+
+  public Map<String, List<Pair<String, Integer>>> getPartitionToFileIdMap(String commitTime) {
+    return this.partitionToFileIdMap.get(commitTime);
+  }
+}
Review comment:
We should fix the IDE settings to automatically fix the end-of-line problem (missing newline at end of file).
##########
File path:
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -1108,6 +1119,259 @@ public void testMetdataTableCommitFailure() throws Exception {
     assertTrue(timeline.getRollbackTimeline().countInstants() == 1);
   }
+  /**
+   * Test simple bootstrap of metadata table.
+   * Trigger few write operations and boostrap metadata table. Validate.
+   * Add few more writes to sync and validate.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testBootstrapWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap with few commits
+    testBootstrap(testTable, false);
+  }
+
+  /**
+   * Before bootstrapping, rollback a commit in the original table.
+   * Ensure after bootstrap, sync and validate succeeds.
+   * @throws Exception
+   */
+  @Test
+  public void testBootstrapWithRolledBackCommitTestTable() throws Exception {
+    tableType = HoodieTableType.COPY_ON_WRITE;
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap w/ few commits, but rollback one of the commit before bootstrapping.
+    testBootstrap(testTable, true);
+  }
+
+  private void testBootstrap(HoodieTestTable testTable, boolean addRollback) throws Exception {
+
+    // bootstrap w/ 3 or 5 commits
+    testTable.doWriteOperation(testTable, "001", WriteOperationType.INSERT, Arrays.asList("p1", "p2"), Arrays.asList("p1", "p2"), 2, true);
+    testTable.doWriteOperation(testTable, "002", WriteOperationType.INSERT, Collections.emptyList(), Arrays.asList("p1", "p2"), 2, true);
+    syncAndValidate(testTable);
+
+    if (addRollback) {
+      doRollback(testTable, "003", "004", Collections.singletonList("p3"), Arrays.asList("p1", "p2", "p3"), 2);
+    }
+    testTable.doWriteOperation(testTable, "005", WriteOperationType.INSERT, Collections.emptyList(), Arrays.asList("p1", "p2"), 4);
+    syncAndValidate(testTable);
+
+    // trigger an upsert and validate
+    testTable.doWriteOperation(testTable, "006", WriteOperationType.UPSERT, Collections.singletonList("p3"), Arrays.asList("p1", "p2", "p3"), 4, false);
+    syncAndValidate(testTable);
+  }
+
+  private void doRollback(HoodieTestTable testTable, String commitTimeToRollback, String commitTime,
+                          List<String> newPartitionsToAdd, List<String> partitionsToAddFiles, int numFilesPerPartition) throws Exception {
+    // trigger an UPSERT that will be rolled back
+    Pair<HoodieCommitMetadata, PartitionFileInfoMap> commitMeta = testTable.doWriteOperation(testTable, commitTimeToRollback, WriteOperationType.UPSERT,
+        newPartitionsToAdd, partitionsToAddFiles, numFilesPerPartition, false);
+    syncTableMetadata();
+
+    // rollback last commit
+    Map<String, List<String>> partitionFilesToDelete = getPartitionFilesToDelete(commitMeta.getKey());
+    HoodieRollbackMetadata rollbackMetadata = testTable.getRollbackMetadata(commitTimeToRollback, commitTime, partitionFilesToDelete);
+    testTable.addRollback(commitTime, rollbackMetadata);
+
+    // delete the resp files from test table before validation
+    for (Map.Entry<String, List<String>> entry : partitionFilesToDelete.entrySet()) {
+      testTable.deleteFilesInPartition(entry.getKey(), entry.getValue());
+    }
+    syncAndValidate(testTable);
+  }
+
+  /**
+   * Test few table operations like insert, upsert, compaction, clean.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testTableOperationsWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    testTableOperations(testTable, false);
+  }
+
+  /**
+   * 1. Enable metadata to sync and validate.
+   * 2. Disable metadata and add few writes to table.
+   * 3. Enable back again to sync and validate.
+   * @throws Exception
+   */
Review comment:
If the test logic is encapsulated in well-designed util APIs, we may not need extra javadoc to explain the flow. Some inline comments might still be helpful, but ideally the code itself should explain it pretty well.
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/PartitionDeleteFileList.java
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.common.testutils;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class PartitionDeleteFileList {
Review comment:
As discussed, we can start by creating a `HoodieTestState` and encapsulating this there.
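One hedged sketch of what such encapsulation could look like (the class and method names below are hypothetical, not the final `HoodieTestState` API): the nested commit -> partition -> files map stays private, and callers only see intent-revealing methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not the real HoodieTestState.
class TestStateSketch {
  // The hard-to-grasp nested map becomes an implementation detail.
  private final Map<String, Map<String, List<String>>> filesByCommitAndPartition = new HashMap<>();

  TestStateSketch addFile(String commit, String partition, String fileName) {
    filesByCommitAndPartition
        .computeIfAbsent(commit, c -> new HashMap<>())
        .computeIfAbsent(partition, p -> new ArrayList<>())
        .add(fileName);
    return this; // fluent, matching the builder style used in this PR
  }

  List<String> filesIn(String commit, String partition) {
    return filesByCommitAndPartition
        .getOrDefault(commit, Collections.emptyMap())
        .getOrDefault(partition, Collections.emptyList());
  }
}
```

Callers then write `state.addFile("001", "p1", "f1")` and `state.filesIn("001", "p1")` without ever touching the raw `Map<String, Map<String, List<String>>>`.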
##########
File path:
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -1261,17 +1525,189 @@ private void validateMetadata(SparkRDDWriteClient testClient) throws IOException
     LOG.info("Validation time=" + timer.endTimer());
   }
+  /**
+   * Validate the metadata tables contents to ensure it matches what is on the file system.
+   */
+  private void validateMetadata(HoodieTestTable testTable) throws IOException {
+    validateMetadata(testTable, Collections.emptyList());
+  }
+
+  /**
+   * Validate the metadata tables contents to ensure it matches what is on the file system.
+   */
+  private void validateMetadata(HoodieTestTable testTable, List<String> inflightCommits) throws IOException {
+    HoodieTableMetadata tableMetadata = metadata(writeConfig, context);
+    assertNotNull(tableMetadata, "MetadataReader should have been initialized");
+    if (!writeConfig.isMetadataTableEnabled()) {
+      return;
+    }
+
+    assertEquals(inflightCommits, testTable.inflightCommits());
+
+    HoodieTimer timer = new HoodieTimer().startTimer();
+    HoodieSparkEngineContext engineContext = new HoodieSparkEngineContext(jsc);
+
+    // Partitions should match
+    List<java.nio.file.Path> fsPartitionPaths = testTable.getAllPartitionPaths();
+    List<String> fsPartitions = new ArrayList<>();
+    fsPartitionPaths.forEach(entry -> fsPartitions.add(entry.getFileName().toString()));
+    List<String> metadataPartitions = tableMetadata.getAllPartitionPaths();
+
+    Collections.sort(fsPartitions);
+    Collections.sort(metadataPartitions);
+
+    assertEquals(fsPartitions.size(), metadataPartitions.size(), "Partitions should match");
+    assertTrue(fsPartitions.equals(metadataPartitions), "Partitions should match");
+
+    // Files within each partition should match
+    metaClient = HoodieTableMetaClient.reload(metaClient);
+    HoodieTable table = HoodieSparkTable.create(writeConfig, engineContext);
+    TableFileSystemView tableView = table.getHoodieView();
+    List<String> fullPartitionPaths = fsPartitions.stream().map(partition -> basePath + "/" + partition).collect(Collectors.toList());
+    Map<String, FileStatus[]> partitionToFilesMap = tableMetadata.getAllFilesInPartitions(fullPartitionPaths);
+    assertEquals(fsPartitions.size(), partitionToFilesMap.size());
+
+    fsPartitions.forEach(partition -> {
+      try {
+        Path partitionPath;
+        if (partition.equals("")) {
+          // Should be the non-partitioned case
+          partitionPath = new Path(basePath);
+        } else {
+          partitionPath = new Path(basePath, partition);
+        }
+
+        FileStatus[] fsStatuses = testTable.listAllFilesInPartition(partition);
+        FileStatus[] metaStatuses = tableMetadata.getAllFilesInPartition(partitionPath);
+        List<String> fsFileNames = Arrays.stream(fsStatuses)
+            .map(s -> s.getPath().getName()).collect(Collectors.toList());
+        List<String> metadataFilenames = Arrays.stream(metaStatuses)
+            .map(s -> s.getPath().getName()).collect(Collectors.toList());
+        Collections.sort(fsFileNames);
+        Collections.sort(metadataFilenames);
+
+        assertEquals(fsStatuses.length, partitionToFilesMap.get(basePath + "/" + partition).length);
+
+        // File sizes should be valid
+        Arrays.stream(metaStatuses).forEach(s -> assertTrue(s.getLen() > 0));
Review comment:
We should prefer a for-loop over a lambda in test code when a checked exception is involved, to avoid the try-catch block. Just declare the exception all the way up; we can still capture it when the test fails.
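A hedged illustration of the point (the file-length helper below is made up; it stands in for any checked-exception API such as file-status listing): inside a lambda, `forEach` cannot throw `IOException`, so each call would need its own try-catch, whereas a plain for-loop simply declares `throws` and lets the failure surface in the test report.

```java
import java.io.IOException;
import java.util.List;

class LoopOverLambda {
  // Stand-in for a checked-exception API such as file system access.
  static long sizeOf(String file) throws IOException {
    if (file.isEmpty()) {
      throw new IOException("unreadable file");
    }
    return file.length();
  }

  // With a lambda, this body would not compile inside forEach without an
  // inline try-catch. The for-loop propagates the checked exception cleanly.
  static long totalSize(List<String> files) throws IOException {
    long total = 0;
    for (String file : files) {
      total += sizeOf(file); // no try-catch; a failure fails the test directly
    }
    return total;
  }
}
```

The test method (or this helper's callers) just declares `throws IOException` all the way up, keeping assertion logic free of exception plumbing.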
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/FileCreateUtils.java
##########
@@ -59,6 +64,8 @@
public class FileCreateUtils {
Review comment:
To align with the new design, we should later aim to restrain its use. It can remain useful for testing low-level file-manipulation logic, but HoodieTestTable should leverage more of the source code path.
##########
File path:
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -1108,6 +1119,259 @@ public void testMetdataTableCommitFailure() throws Exception {
     assertTrue(timeline.getRollbackTimeline().countInstants() == 1);
   }
+  /**
+   * Test simple bootstrap of metadata table.
+   * Trigger few write operations and boostrap metadata table. Validate.
+   * Add few more writes to sync and validate.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testBootstrapWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap with few commits
+    testBootstrap(testTable, false);
+  }
+
+  /**
+   * Before bootstrapping, rollback a commit in the original table.
+   * Ensure after bootstrap, sync and validate succeeds.
+   * @throws Exception
+   */
+  @Test
+  public void testBootstrapWithRolledBackCommitTestTable() throws Exception {
+    tableType = HoodieTableType.COPY_ON_WRITE;
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap w/ few commits, but rollback one of the commit before bootstrapping.
+    testBootstrap(testTable, true);
+  }
+
+  private void testBootstrap(HoodieTestTable testTable, boolean addRollback) throws Exception {
+
+    // bootstrap w/ 3 or 5 commits
+    testTable.doWriteOperation(testTable, "001", WriteOperationType.INSERT, Arrays.asList("p1", "p2"), Arrays.asList("p1", "p2"),
Review comment:
Try making use of varargs instead of List for test util APIs. Varargs give more flexibility and do not require the caller to build a list (less code).
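A small hedged sketch of the difference (the method names are illustrative, not a proposed Hudi API): with varargs, callers pass partitions directly, and an empty call replaces `Collections.emptyList()`.

```java
import java.util.Arrays;
import java.util.List;

class PartitionArgs {
  // List-based API: every caller must wrap arguments in a list first.
  static int countWithList(List<String> partitions) {
    return partitions.size();
  }

  // Varargs API: same behavior, less caller-side ceremony.
  static int countWithVarargs(String... partitions) {
    return partitions.length;
  }
}
```

So `countWithVarargs("p1", "p2")` replaces `countWithList(Arrays.asList("p1", "p2"))`, and a bare `countWithVarargs()` replaces `countWithList(Collections.emptyList())`.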
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestTable.java
##########
@@ -421,13 +582,102 @@ public String getBaseFileNameById(String fileId) {
   }
   public FileStatus[] listAllFilesInPartition(String partitionPath) throws IOException {
-    return FileSystemTestUtils.listRecursive(fs, new Path(Paths.get(basePath, partitionPath).toString())).toArray(new FileStatus[0]);
+    return FileSystemTestUtils.listRecursive(fs, new Path(Paths.get(basePath, partitionPath).toString())).stream()
+        .filter(entry -> {
+          boolean toReturn = true;
+          String fileName = entry.getPath().getName();
+          if (fileName.equals(HoodiePartitionMetadata.HOODIE_PARTITION_METAFILE)) {
+            toReturn = false;
+          } else {
+            for (String inflight : inflightCommits) {
+              if (fileName.contains(inflight)) {
+                toReturn = false;
+              }
+            }
+          }
+          return toReturn;
+        }).collect(Collectors.toList()).toArray(new FileStatus[0]);
   }
   public FileStatus[] listAllFilesInTempFolder() throws IOException {
     return FileSystemTestUtils.listRecursive(fs, new Path(Paths.get(basePath, HoodieTableMetaClient.TEMPFOLDER_NAME).toString())).toArray(new FileStatus[0]);
   }
+  public void deleteFilesInPartition(String partitionPath, List<String> filesToDelete) throws IOException {
+    FileStatus[] allFiles = listAllFilesInPartition(partitionPath);
+    Arrays.stream(allFiles).filter(entry -> filesToDelete.contains(entry.getPath().getName())).forEach(entry -> {
+      try {
+        Files.delete(Paths.get(basePath, partitionPath, entry.getPath().getName()));
+      } catch (IOException e) {
+        e.printStackTrace();
+      }
+    });
+  }
+
+  public HoodieCleanMetadata doClean(HoodieTestTable testTable, String commitTime, Map<String, Integer> partitionFileCountsToDelete) throws IOException {
+    Map<String, List<String>> partitionFilesToDelete = new HashMap<>();
+    for (Map.Entry<String, Integer> entry : partitionFileCountsToDelete.entrySet()) {
+      partitionFilesToDelete.put(entry.getKey(), testTable.getEarliestFilesInPartition(entry.getKey(), entry.getValue()));
+    }
+    PartitionDeleteFileList partitionDeleteFileList = new PartitionDeleteFileList();
+    for (Map.Entry<String, List<String>> entry : partitionFilesToDelete.entrySet()) {
+      partitionDeleteFileList = partitionDeleteFileList.addPartitionAndBasefiles(commitTime, entry.getKey(), entry.getValue());
+      testTable.deleteFilesInPartition(entry.getKey(), entry.getValue());
+    }
+    Pair<HoodieCleanerPlan, HoodieCleanMetadata> cleanerMeta = testTable.getHoodieCleanMetadata(commitTime, partitionDeleteFileList.getPartitionToFileIdMap(commitTime));
+    testTable.addClean(commitTime, cleanerMeta.getKey(), cleanerMeta.getValue());
+    return cleanerMeta.getValue();
+  }
+
+  public HoodieTestTable doCompaction(HoodieTestTable testTable, String commitTime, List<String> partitions) throws Exception {
+    this.currentInstantTime = commitTime;
+    PartitionFileInfoMap partitionFileInfoMap = new PartitionFileInfoMap();
+    for (String partition : partitions) {
+      partitionFileInfoMap = partitionFileInfoMap.addPartitionAndBasefiles(commitTime, partition, Arrays.asList(100 + RANDOM.nextInt(500)));
+    }
+    HoodieCommitMetadata commitMetadata = testTable.createCommitMetadata(WriteOperationType.COMPACT, commitTime, partitionFileInfoMap.getPartitionToFileIdMap(commitTime));
+    for (String partition : partitions) {
+      testTable = testTable.withBaseFilesInPartition(partition, partitionFileInfoMap.getPartitionToFileIdMap(commitTime).get(partition));
+    }
+    return testTable.addCompaction(commitTime, commitMetadata);
+  }
+
+  public Pair<HoodieCommitMetadata, PartitionFileInfoMap> doWriteOperation(HoodieTestTable testTable, String commitTime, WriteOperationType operationType,
Review comment:
This is an instance method, so it does not need the user to pass in a testTable. Unless you want this to be static?
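To sketch the distinction (class and method names here are stand-ins, not the real HoodieTestTable): an instance method already has `this`, so passing the object again is redundant; only a static method would need the table as a parameter.

```java
class TableSketch {
  private final String name;

  TableSketch(String name) {
    this.name = name;
  }

  // Instance method: `this` is already the table; no parameter needed.
  String doWrite(String commitTime) {
    return name + "@" + commitTime;
  }

  // Static alternative: only here does the table need to be passed in.
  static String doWrite(TableSketch table, String commitTime) {
    return table.name + "@" + commitTime;
  }
}
```

Callers then write `table.doWrite("001")` rather than `table.doWrite(table, "001")`.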
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/PartitionFileInfoMap.java
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.common.testutils;
+
+import org.apache.hudi.common.util.collection.Pair;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
+public class PartitionFileInfoMap {
Review comment:
ditto
##########
File path:
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -1108,6 +1119,259 @@ public void testMetdataTableCommitFailure() throws Exception {
     assertTrue(timeline.getRollbackTimeline().countInstants() == 1);
   }
+  /**
+   * Test simple bootstrap of metadata table.
+   * Trigger few write operations and boostrap metadata table. Validate.
+   * Add few more writes to sync and validate.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testBootstrapWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap with few commits
+    testBootstrap(testTable, false);
+  }
+
+  /**
+   * Before bootstrapping, rollback a commit in the original table.
+   * Ensure after bootstrap, sync and validate succeeds.
+   * @throws Exception
+   */
+  @Test
+  public void testBootstrapWithRolledBackCommitTestTable() throws Exception {
+    tableType = HoodieTableType.COPY_ON_WRITE;
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    // bootstrap w/ few commits, but rollback one of the commit before bootstrapping.
+    testBootstrap(testTable, true);
+  }
+
+  private void testBootstrap(HoodieTestTable testTable, boolean addRollback) throws Exception {
+
+    // bootstrap w/ 3 or 5 commits
+    testTable.doWriteOperation(testTable, "001", WriteOperationType.INSERT, Arrays.asList("p1", "p2"), Arrays.asList("p1", "p2"), 2, true);
+    testTable.doWriteOperation(testTable, "002", WriteOperationType.INSERT, Collections.emptyList(), Arrays.asList("p1", "p2"), 2, true);
+    syncAndValidate(testTable);
+
+    if (addRollback) {
+      doRollback(testTable, "003", "004", Collections.singletonList("p3"), Arrays.asList("p1", "p2", "p3"), 2);
+    }
+    testTable.doWriteOperation(testTable, "005", WriteOperationType.INSERT, Collections.emptyList(), Arrays.asList("p1", "p2"), 4);
+    syncAndValidate(testTable);
+
+    // trigger an upsert and validate
+    testTable.doWriteOperation(testTable, "006", WriteOperationType.UPSERT, Collections.singletonList("p3"), Arrays.asList("p1", "p2", "p3"), 4, false);
+    syncAndValidate(testTable);
+  }
+
+  private void doRollback(HoodieTestTable testTable, String commitTimeToRollback, String commitTime,
+                          List<String> newPartitionsToAdd, List<String> partitionsToAddFiles, int numFilesPerPartition) throws Exception {
+    // trigger an UPSERT that will be rolled back
+    Pair<HoodieCommitMetadata, PartitionFileInfoMap> commitMeta = testTable.doWriteOperation(testTable, commitTimeToRollback, WriteOperationType.UPSERT,
+        newPartitionsToAdd, partitionsToAddFiles, numFilesPerPartition, false);
+    syncTableMetadata();
+
+    // rollback last commit
+    Map<String, List<String>> partitionFilesToDelete = getPartitionFilesToDelete(commitMeta.getKey());
+    HoodieRollbackMetadata rollbackMetadata = testTable.getRollbackMetadata(commitTimeToRollback, commitTime, partitionFilesToDelete);
+    testTable.addRollback(commitTime, rollbackMetadata);
+
+    // delete the resp files from test table before validation
+    for (Map.Entry<String, List<String>> entry : partitionFilesToDelete.entrySet()) {
+      testTable.deleteFilesInPartition(entry.getKey(), entry.getValue());
+    }
+    syncAndValidate(testTable);
+  }
+
+  /**
+   * Test few table operations like insert, upsert, compaction, clean.
+   * @param tableType
+   * @throws Exception
+   */
+  @ParameterizedTest
+  @EnumSource(HoodieTableType.class)
+  public void testTableOperationsWithTestTable(HoodieTableType tableType) throws Exception {
+    init(tableType);
+    HoodieTestTable testTable = HoodieTestTable.of(metaClient);
+    testTableOperations(testTable, false);
+  }
+
+  /**
+   * 1. Enable metadata to sync and validate.
+   * 2. Disable metadata and add few writes to table.
+   * 3. Enable back again to sync and validate.
+   * @throws Exception
+   */
Review comment:
`@throws Exception` looks redundant here; most of the time we just let the exception propagate and investigate the failure.
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestTable.java
##########
@@ -144,6 +168,33 @@ public HoodieTestTable addCommit(String instantTime) throws Exception {
     return this;
   }
+  public HoodieCommitMetadata createCommitMetadata(WriteOperationType operationType, String commitTime,
+                                                   Map<String, List<Pair<String, Integer>>> partitionToFileIdMap) {
Review comment:
We should try to encapsulate data structures like `partitionToFileIdMap` within `HoodieTestState` and make them invisible to users. It's not easy to grasp, and to keep recalling, what info is kept in the Map, and exposing it adds friction to using the API.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]