XComp commented on code in PR #20755:
URL: https://github.com/apache/flink/pull/20755#discussion_r1018768099


##########
flink-formats/flink-hadoop-bulk/src/test/java/org/apache/flink/formats/hadoop/bulk/HadoopPathBasedPartFileWriterTest.java:
##########
@@ -82,13 +84,23 @@ public void testWriteFile() throws Exception {
         DataStream<String> stream =
                 env.addSource(new FiniteTestSource<>(data), TypeInformation.of(String.class));
         Configuration configuration = new Configuration();
-
+        // Elements from source  assign to one bucket , and produce two part after checkpoint.

Review Comment:
   ```suggestion
           // Elements from source are going to be assigned to one bucket and produce two output files after checkpoint
   ```
   Out of curiosity: what causes the output to be split into two files?



##########
flink-formats/flink-hadoop-bulk/src/test/java/org/apache/flink/formats/hadoop/bulk/HadoopPathBasedPartFileWriterTest.java:
##########
@@ -82,13 +84,23 @@ public void testWriteFile() throws Exception {
         DataStream<String> stream =
                 env.addSource(new FiniteTestSource<>(data), TypeInformation.of(String.class));
         Configuration configuration = new Configuration();
-
+        // Elements from source  assign to one bucket , and produce two part after checkpoint.
         HadoopPathBasedBulkFormatBuilder<String, String, ?> builder =
                 new HadoopPathBasedBulkFormatBuilder<>(
                         basePath,
                         new TestHadoopPathBasedBulkWriterFactory(),
                         configuration,
-                        new DateTimeBucketAssigner<>());
+                        new BucketAssigner<String, String>() {

Review Comment:
   Good idea. We could even use `BasePathBucketAssigner`. This would also simplify `HadoopPathBasedPartFileWriterTest#validateResult`, because we would no longer have to verify that exactly one bucket is created. WDYT?
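
   To illustrate why a constant bucket id makes the validation simpler, here is a minimal, self-contained sketch. It does not use Flink's classes: `Assigner`, `BasePathAssigner`, `HourlyAssigner` and `distinctBuckets` are hypothetical stand-ins that only model the bucket-id decision, and the hourly variant is an assumption about how a time-based assigner such as `DateTimeBucketAssigner` could split a run that crosses an hour boundary.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BucketSketch {

    /** Hypothetical stand-in for a bucket assigner; only the bucket-id decision is modeled. */
    public interface Assigner<T> {
        String getBucketId(T element, long processingTimeMillis);
    }

    /** Models the idea behind a base-path assigner: every element maps to the same (empty) bucket id. */
    public static final class BasePathAssigner<T> implements Assigner<T> {
        @Override
        public String getBucketId(T element, long processingTimeMillis) {
            return "";
        }
    }

    /** Models a time-based assigner: the bucket id changes whenever the processing hour changes. */
    public static final class HourlyAssigner<T> implements Assigner<T> {
        @Override
        public String getBucketId(T element, long processingTimeMillis) {
            return "hour-" + (processingTimeMillis / 3_600_000L);
        }
    }

    /** Collects the distinct bucket ids when element i is processed at startMillis + i * stepMillis. */
    public static <T> Set<String> distinctBuckets(
            Assigner<T> assigner, List<T> elements, long startMillis, long stepMillis) {
        Set<String> buckets = new LinkedHashSet<>();
        for (int i = 0; i < elements.size(); i++) {
            buckets.add(assigner.getBucketId(elements.get(i), startMillis + i * stepMillis));
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        // Start one second before an hour boundary so the time-based assigner sees two different hours.
        long start = 3_599_000L;
        long step = 1_000L;
        System.out.println("base path buckets: "
                + distinctBuckets(new BasePathAssigner<String>(), data, start, step).size());
        System.out.println("hourly buckets: "
                + distinctBuckets(new HourlyAssigner<String>(), data, start, step).size());
    }
}
```

   With the constant assigner the test only has to look at the part files directly under the base path. With a time-based one it would first have to locate the bucket directory and assert that there is exactly one.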



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.