yanghua commented on a change in pull request #2577:
URL: https://github.com/apache/hudi/pull/2577#discussion_r581889015



##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -965,16 +969,30 @@ public void testDistributedTestDataSource() {
     assertEquals(1000, c);
   }
 
-  private static void prepareParquetDFSFiles(int numRecords) throws 
IOException {
-    String path = PARQUET_SOURCE_ROOT + "/1.parquet";
+  protected static void prepareParquetDFSFiles(int numRecords, String 
baseParquetPath) throws IOException {
+    prepareParquetDFSFiles(numRecords, baseParquetPath, "1.parquet");
+  }
+
+  protected static void prepareParquetDFSFiles(int numRecords, String 
baseParquetPath, String fileName) throws IOException {
+    String path = baseParquetPath + "/" + fileName;

Review comment:
       `Paths.get()` should be better?

##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -227,27 +227,31 @@ private static void 
populateInvalidTableConfigFilePathProps(TypedProperties prop
     
props.setProperty("hoodie.deltastreamer.ingestion.uber_db.dummy_table_uber.configFile",
 dfsBasePath + "/config/invalid_uber_config.properties");
   }
 
-  private static void populateCommonProps(TypedProperties props) {
+  protected static void populateCommonProps(TypedProperties props, boolean 
setKafkaConfig, boolean setHiveConfig) {

Review comment:
       Can we split the settings works for Kafka and Hive into different 
methods so that we can make the method smaller and have more scalability?

##########
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -147,7 +147,7 @@ private void 
populateTableExecutionContextList(TypedProperties properties, Strin
   }
 
   private void populateSchemaProviderProps(HoodieDeltaStreamer.Config cfg, 
TypedProperties typedProperties) {
-    if 
(cfg.schemaProviderClassName.equals(SchemaRegistryProvider.class.getName())) {
+    if (cfg.schemaProviderClassName != null && 
cfg.schemaProviderClassName.equals(SchemaRegistryProvider.class.getName())) {

Review comment:
       It would be better to use `Objects.equals()`?

##########
File path: 
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -965,16 +969,30 @@ public void testDistributedTestDataSource() {
     assertEquals(1000, c);
   }
 
-  private static void prepareParquetDFSFiles(int numRecords) throws 
IOException {
-    String path = PARQUET_SOURCE_ROOT + "/1.parquet";
+  protected static void prepareParquetDFSFiles(int numRecords, String 
baseParquetPath) throws IOException {
+    prepareParquetDFSFiles(numRecords, baseParquetPath, "1.parquet");

Review comment:
       IMO, we should not hard code this parquet file. It will reduce the 
readability. We can extract it to be a constant.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to