nsivabalan commented on code in PR #6333:
URL: https://github.com/apache/hudi/pull/6333#discussion_r940758026
##########
hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncConfig.java:
##########
@@ -72,31 +75,38 @@ public class HoodieSyncConfig extends HoodieConfig {
public static final ConfigProperty<String> META_SYNC_TABLE_NAME = ConfigProperty
.key("hoodie.datasource.hive_sync.table")
.defaultValue("unknown")
- .withInferFunction(cfg -> Option.ofNullable(cfg.getString(HOODIE_WRITE_TABLE_NAME_KEY))
- .or(() -> Option.ofNullable(cfg.getString(HOODIE_TABLE_NAME_KEY))))
+ .withInferFunction(cfg -> Option.ofNullable(cfg.getString(HOODIE_TABLE_NAME_KEY))
+ .or(() -> Option.ofNullable(cfg.getString(HOODIE_WRITE_TABLE_NAME_KEY))))
.withDocumentation("The name of the destination table that we should sync the hudi table to.");
public static final ConfigProperty<String> META_SYNC_BASE_FILE_FORMAT = ConfigProperty
.key("hoodie.datasource.hive_sync.base_file_format")
.defaultValue("PARQUET")
- .withInferFunction(cfg -> Option.ofNullable(cfg.getString(HoodieTableConfig.BASE_FILE_FORMAT)))
+ .withInferFunction(cfg -> Option.ofNullable(cfg.getString(BASE_FILE_FORMAT)))
.withDocumentation("Base file format for the sync.");
public static final ConfigProperty<String> META_SYNC_PARTITION_FIELDS = ConfigProperty
.key("hoodie.datasource.hive_sync.partition_fields")
.defaultValue("")
- .withInferFunction(cfg -> Option.ofNullable(cfg.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME)))
+ .withInferFunction(cfg -> Option.ofNullable(cfg.getString(PARTITION_FIELDS))
Review Comment:
Can we prefix these with the class name? That would make it easier to tell
which one is a table config and which one is a datasource config, e.g.:
KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME
HoodieTableConfig.PARTITION_FIELDS
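
To illustrate the suggestion, here is a minimal, self-contained sketch of the same fallback lookup written with class-qualified names. It is not the Hudi implementation: the nested classes are hypothetical stand-ins, the keys are modeled as plain strings rather than `ConfigProperty` objects, the key strings are assumptions, and `java.util.Optional` (Java 9+ for `or`) replaces Hudi's `Option`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins: the real Hudi members are ConfigProperty objects, not Strings.
final class QualifiedConfigKeysSketch {

  // Table-level config; the key string is an assumption for illustration.
  static final class HoodieTableConfig {
    static final String PARTITION_FIELDS = "hoodie.table.partition.fields";
  }

  // Datasource (writer) option; the key string is an assumption for illustration.
  static final class KeyGeneratorOptions {
    static final String PARTITIONPATH_FIELD_NAME = "hoodie.datasource.write.partitionpath.field";
  }

  // The class prefix makes the lookup order visible at the call site:
  // table config first, then the datasource option.
  static Optional<String> inferPartitionFields(Map<String, String> cfg) {
    return Optional.ofNullable(cfg.get(HoodieTableConfig.PARTITION_FIELDS))
        .or(() -> Optional.ofNullable(cfg.get(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME)));
  }

  public static void main(String[] args) {
    Map<String, String> cfg = new HashMap<>();
    cfg.put(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME, "region");
    cfg.put(HoodieTableConfig.PARTITION_FIELDS, "year,month");
    // Both keys are set, so the table config value is the one returned.
    System.out.println(inferPartitionFields(cfg).orElse(""));
  }
}
```

With unqualified static imports, `PARTITION_FIELDS` and `PARTITIONPATH_FIELD_NAME` read almost identically, which is why the prefix helps here.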
##########
hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncConfig.java:
##########
@@ -72,31 +75,38 @@ public class HoodieSyncConfig extends HoodieConfig {
public static final ConfigProperty<String> META_SYNC_TABLE_NAME = ConfigProperty
.key("hoodie.datasource.hive_sync.table")
.defaultValue("unknown")
- .withInferFunction(cfg -> Option.ofNullable(cfg.getString(HOODIE_WRITE_TABLE_NAME_KEY))
- .or(() -> Option.ofNullable(cfg.getString(HOODIE_TABLE_NAME_KEY))))
+ .withInferFunction(cfg -> Option.ofNullable(cfg.getString(HOODIE_TABLE_NAME_KEY))
+ .or(() -> Option.ofNullable(cfg.getString(HOODIE_WRITE_TABLE_NAME_KEY))))
.withDocumentation("The name of the destination table that we should sync the hudi table to.");
public static final ConfigProperty<String> META_SYNC_BASE_FILE_FORMAT = ConfigProperty
.key("hoodie.datasource.hive_sync.base_file_format")
.defaultValue("PARQUET")
- .withInferFunction(cfg -> Option.ofNullable(cfg.getString(HoodieTableConfig.BASE_FILE_FORMAT)))
+ .withInferFunction(cfg -> Option.ofNullable(cfg.getString(BASE_FILE_FORMAT)))
.withDocumentation("Base file format for the sync.");
public static final ConfigProperty<String> META_SYNC_PARTITION_FIELDS = ConfigProperty
.key("hoodie.datasource.hive_sync.partition_fields")
.defaultValue("")
- .withInferFunction(cfg -> Option.ofNullable(cfg.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME)))
+ .withInferFunction(cfg -> Option.ofNullable(cfg.getString(PARTITION_FIELDS))
+ .or(() -> Option.ofNullable(cfg.getString(PARTITIONPATH_FIELD_NAME))))
.withDocumentation("Field in the table to use for determining hive partition columns.");
public static final ConfigProperty<String> META_SYNC_PARTITION_EXTRACTOR_CLASS = ConfigProperty
.key("hoodie.datasource.hive_sync.partition_extractor_class")
.defaultValue("org.apache.hudi.hive.MultiPartKeysValueExtractor")
.withInferFunction(cfg -> {
- if (StringUtils.nonEmpty(cfg.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME))) {
- int numOfPartFields = cfg.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME).split(",").length;
+ Option<String> partitionFieldsOpt = Option.ofNullable(cfg.getString(PARTITION_FIELDS))
Review Comment:
When does this code get invoked in the lifecycle of a write? For example, on
the first write to a new table, PARTITION_FIELDS may not exist yet.
Also, I assume all table properties are populated in HoodieSyncConfig by the
time we hit this code, because some table properties are not set explicitly
by the user; Hudi infers and sets them.
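
To make the first-write concern concrete, here is a rough sketch of how the partition-field count could be resolved depending on which keys are present. It is only an illustration under stated assumptions, not the code in this PR: the key strings are assumed, a plain `Map` stands in for the config object, and `java.util.Optional` (Java 9+) replaces Hudi's `Option`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class FirstWriteInferenceSketch {

  // Assumed key strings, modeled as plain Strings rather than ConfigProperty objects.
  static final String TABLE_PARTITION_FIELDS = "hoodie.table.partition.fields";
  static final String WRITE_PARTITIONPATH_FIELD = "hoodie.datasource.write.partitionpath.field";

  // Counts partition fields, preferring the table config and falling back to
  // the writer option when the table config is absent (e.g. on a first write).
  // In this sketch a present-but-empty table config wins and yields 0.
  static int numPartitionFields(Map<String, String> cfg) {
    return Optional.ofNullable(cfg.get(TABLE_PARTITION_FIELDS))
        .or(() -> Optional.ofNullable(cfg.get(WRITE_PARTITIONPATH_FIELD)))
        // "".split(",") has length 1, so empty values must be filtered out first.
        .filter(s -> !s.trim().isEmpty())
        .map(s -> s.split(",").length)
        .orElse(0);
  }

  public static void main(String[] args) {
    // First write: only the writer option is set, no table config yet.
    Map<String, String> firstWrite = new HashMap<>();
    firstWrite.put(WRITE_PARTITIONPATH_FIELD, "year,month");
    System.out.println(numPartitionFields(firstWrite));

    // Later write: the table config is populated and takes precedence.
    Map<String, String> laterWrite = new HashMap<>(firstWrite);
    laterWrite.put(TABLE_PARTITION_FIELDS, "year,month,day");
    System.out.println(numPartitionFields(laterWrite));

    // Neither key set: nothing to infer from.
    System.out.println(numPartitionFields(new HashMap<>()));
  }
}
```

If the table properties are not yet populated when the infer function runs, only the first scenario applies, so the fallback to the writer option is what keeps the inference working on a new table.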