xushiyan commented on a change in pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#discussion_r487451803



##########
File path: 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/GenericRecordFullPayloadGenerator.java
##########
@@ -45,14 +45,16 @@
  */
 public class GenericRecordFullPayloadGenerator implements Serializable {
 
+  private static Logger log = 
LoggerFactory.getLogger(GenericRecordFullPayloadGenerator.class);

Review comment:
       ```suggestion
     private static final Logger LOG = 
LoggerFactory.getLogger(GenericRecordFullPayloadGenerator.class);
   ```

##########
File path: 
hudi-integ-test/src/test/java/org/apache/hudi/integ/testsuite/generator/TestGenericRecordPayloadGenerator.java
##########
@@ -25,11 +25,13 @@
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.List;
+import java.util.concurrent.TimeUnit;
 import java.util.stream.IntStream;
 import org.apache.avro.Schema;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.hudi.avro.HoodieAvroUtils;
 import org.apache.hudi.utilities.testutils.UtilitiesTestBase;
+import org.junit.Assert;

Review comment:
       can we change this to junit 5 APIs? also to reduce verbosity, could we 
static import the assertion functions?

##########
File path: 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/DeltaGenerator.java
##########
@@ -93,14 +93,15 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, 
JavaSparkContext jsc, Spark
   }
 
   public JavaRDD<GenericRecord> generateInserts(Config operation) {
-    long recordsPerPartition = operation.getNumRecordsInsert();
     int numPartitions = operation.getNumInsertPartitions();
+    long recordsPerPartition = operation.getNumRecordsInsert();
     int minPayloadSize = operation.getRecordSize();
     JavaRDD<GenericRecord> inputBatch = jsc.parallelize(Collections.EMPTY_LIST)
         .repartition(operation.getNumInsertPartitions()).mapPartitions(p -> {
           return new LazyRecordGeneratorIterator(new 
FlexibleSchemaRecordGenerationIterator(recordsPerPartition,
             minPayloadSize, schemaStr, partitionPathFieldNames, 
numPartitions));
-        });
+
+    });

Review comment:
       could you revert these diffs please? seems unnecessary changes

##########
File path: 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/FlexibleSchemaRecordGenerationIterator.java
##########
@@ -60,11 +60,16 @@ public boolean hasNext() {
   public GenericRecord next() {
     this.counter--;
     if (lastRecord == null) {
-      GenericRecord record = this.generator.getNewPayload();
+      GenericRecord record = this.partitionPathFieldNames != null && 
this.partitionPathFieldNames.size() > 0
+          ? 
this.generator.getNewPayloadWithTimestamp(this.partitionPathFieldNames.get(0))
+          : this.generator.getNewPayload();
       lastRecord = record;
       return record;
     } else {
-      return this.generator.randomize(lastRecord, 
this.partitionPathFieldNames);
+      return this.partitionPathFieldNames != null && 
this.partitionPathFieldNames.size() > 0
+          ? this.generator.getUpdatePayloadWithTimestamp(lastRecord,
+          this.partitionPathFieldNames, this.partitionPathFieldNames.get(0))
+          : this.generator.getUpdatePayload(lastRecord, 
this.partitionPathFieldNames);

Review comment:
       looks like `this.partitionPathFieldNames != null && 
this.partitionPathFieldNames.size() > 0` deserves to be a local boolean 
variable with a good name to improve readability.
   Also can we avoid unnecessary `this.` references?

##########
File path: 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/FlexibleSchemaRecordGenerationIterator.java
##########
@@ -60,11 +60,16 @@ public boolean hasNext() {
   public GenericRecord next() {
     this.counter--;
     if (lastRecord == null) {
-      GenericRecord record = this.generator.getNewPayload();
+      GenericRecord record = this.partitionPathFieldNames != null && 
this.partitionPathFieldNames.size() > 0
+          ? 
this.generator.getNewPayloadWithTimestamp(this.partitionPathFieldNames.get(0))
+          : this.generator.getNewPayload();
       lastRecord = record;
       return record;
     } else {
-      return this.generator.randomize(lastRecord, 
this.partitionPathFieldNames);
+      return this.partitionPathFieldNames != null && 
this.partitionPathFieldNames.size() > 0
+          ? this.generator.getUpdatePayloadWithTimestamp(lastRecord,
+          this.partitionPathFieldNames, this.partitionPathFieldNames.get(0))
+          : this.generator.getUpdatePayload(lastRecord, 
this.partitionPathFieldNames);

Review comment:
       is it possible make sure `partitionPathFieldNames` not null so that we 
don't have to do null check here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to