yanghua commented on a change in pull request #2577:
URL: https://github.com/apache/hudi/pull/2577#discussion_r581889015
##########
File path:
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -965,16 +969,30 @@ public void testDistributedTestDataSource() {
assertEquals(1000, c);
}
- private static void prepareParquetDFSFiles(int numRecords) throws
IOException {
- String path = PARQUET_SOURCE_ROOT + "/1.parquet";
+ protected static void prepareParquetDFSFiles(int numRecords, String
baseParquetPath) throws IOException {
+ prepareParquetDFSFiles(numRecords, baseParquetPath, "1.parquet");
+ }
+
+ protected static void prepareParquetDFSFiles(int numRecords, String
baseParquetPath, String fileName) throws IOException {
+ String path = baseParquetPath + "/" + fileName;
Review comment:
`Paths.get()` should be better?
##########
File path:
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -227,27 +227,31 @@ private static void
populateInvalidTableConfigFilePathProps(TypedProperties prop
props.setProperty("hoodie.deltastreamer.ingestion.uber_db.dummy_table_uber.configFile",
dfsBasePath + "/config/invalid_uber_config.properties");
}
- private static void populateCommonProps(TypedProperties props) {
+ protected static void populateCommonProps(TypedProperties props, boolean
setKafkaConfig, boolean setHiveConfig) {
Review comment:
Can we split the settings works for Kafka and Hive into different
methods so that we can make the method smaller and have more scalability?
##########
File path:
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -147,7 +147,7 @@ private void
populateTableExecutionContextList(TypedProperties properties, Strin
}
private void populateSchemaProviderProps(HoodieDeltaStreamer.Config cfg,
TypedProperties typedProperties) {
- if
(cfg.schemaProviderClassName.equals(SchemaRegistryProvider.class.getName())) {
+ if (cfg.schemaProviderClassName != null &&
cfg.schemaProviderClassName.equals(SchemaRegistryProvider.class.getName())) {
Review comment:
It would be better to use `Objects.equals()`?
##########
File path:
hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
##########
@@ -965,16 +969,30 @@ public void testDistributedTestDataSource() {
assertEquals(1000, c);
}
- private static void prepareParquetDFSFiles(int numRecords) throws
IOException {
- String path = PARQUET_SOURCE_ROOT + "/1.parquet";
+ protected static void prepareParquetDFSFiles(int numRecords, String
baseParquetPath) throws IOException {
+ prepareParquetDFSFiles(numRecords, baseParquetPath, "1.parquet");
Review comment:
IMO, we should not hard code this parquet file. It will reduce the
readability. We can extract it to be a constant.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]