[GitHub] [hudi] vinothchandar commented on a change in pull request #2902: [HUDI-1800] Exclude file slices in pending compaction when performing small file sizing

GitBox Sat, 22 May 2021 22:59:24 -0700


vinothchandar commented on a change in pull request #2902:
URL: https://github.com/apache/hudi/pull/2902#discussion_r637492241




##########
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java
##########
@@ -306,6 +314,37 @@ public void testUpsertPartitionerWithSmallInsertHandling() 
throws Exception {
     assertInsertBuckets(weights, cumulativeWeights, insertBuckets);
   }
 
+  @Test
+  public void 
testUpsertPartitionerWithSmallFileHandlingWithInflightCompactionWithCanIndexLogFiles()
 throws Exception {
+    // Note this is used because it is same partition path used in 
CompactionTestUtils.createCompactionPlan()
+    final String testPartitionPath = DEFAULT_PARTITION_PATHS[0];
+
+    HoodieWriteConfig config = makeHoodieClientConfigBuilder()
+            
.withCompactionConfig(HoodieCompactionConfig.newBuilder().compactionSmallFileSize(1024).build())
+            .withIndexConfig(HoodieIndexConfig.newBuilder()
+                    .withIndexType(HoodieIndex.IndexType.HBASE)
+                    
.withHBaseIndexConfig(HoodieHBaseIndexConfig.newBuilder().build())
+                    .build())
+            .build();
+
+    // This will generate initial commits and create a compaction plan which 
includes file groups created as part of this
+    HoodieCompactionPlan plan = 
CompactionTestUtils.createCompactionPlan(metaClient, "001", "002", 1, true, 
false);
+    FileCreateUtils.createRequestedCompactionCommit(basePath, "002", plan);
+    // Simulate one more commit so that inflight compaction is considered when 
building file groups in file system view
+    //
+    FileCreateUtils.createLogFile(basePath, testPartitionPath, "003", "2", 1);
+    FileCreateUtils.createCommit(basePath, "003");
+
+    // Partitioner will attempt to assign inserts to file groups including 
base file created by inflight compaction
+    metaClient = HoodieTableMetaClient.reload(metaClient);
+    HoodieTestDataGenerator dataGenerator = new HoodieTestDataGenerator(new 
String[] {testPartitionPath});
+    List<HoodieRecord> insertRecords = dataGenerator.generateInserts("004", 
100);
+    WorkloadProfile profile = new 
WorkloadProfile(buildProfile(jsc.parallelize(insertRecords)));
+
+    HoodieSparkTable table = HoodieSparkTable.create(config, context, 
metaClient);
+    SparkUpsertDeltaCommitPartitioner partitioner = new 
SparkUpsertDeltaCommitPartitioner(profile, context, table, config);

Review comment:
       What you mention, can be a bonus @nsivabalan ? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [hudi] vinothchandar commented on a change in pull request #2902: [HUDI-1800] Exclude file slices in pending compaction when performing small file sizing

Reply via email to