xushiyan commented on code in PR #5771:
URL: https://github.com/apache/hudi/pull/5771#discussion_r931043291
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java:
##########
@@ -235,6 +235,13 @@ public class HoodieTableConfig extends HoodieConfig {
.withDocumentation("Comma-separated list of metadata partitions that have been completely built and in-sync with data table. "
+ "These partitions are ready for use by the readers");
+ public static final ConfigProperty<Boolean> APPEND_ONLY_TABLE = ConfigProperty
+ .key("hoodie.table.append.only")
+ .defaultValue(false)
+ .sinceVersion("0.12.0")
+ .withDocumentation("When enabled, the writer will simply bulk insert without meta fields assuming that workload is append-only and there are no updates."
+ + " This is enabled only when users have not configured any operation type, record key and precombine field.");
Review Comment:
```suggestion
+ " This will be auto-enabled when users have not configured any operation type, record key, or precombine field.");
```
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -396,6 +398,16 @@ public HoodieArchivedTimeline getArchivedTimeline(String startTs) {
* @param properties Properties from writeConfig.
*/
public void validateTableProperties(Properties properties) {
Review Comment:
it's good to explicitly name this `writeConfigProps` instead of `properties`
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -438,7 +450,7 @@ public static HoodieTableMetaClient initTableAndGetMetaClient(Configuration hado
// if anything other than default archive log folder is specified, create that too
String archiveLogPropVal = new HoodieConfig(props).getStringOrDefault(HoodieTableConfig.ARCHIVELOG_FOLDER);
- if (!StringUtils.isNullOrEmpty(archiveLogPropVal)) {
+ if (!isNullOrEmpty(archiveLogPropVal)) {
Review Comment:
nonEmpty()
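The `nonEmpty()` suggestion is about readability: `nonEmpty(x)` states the intent directly, while `!isNullOrEmpty(x)` makes the reader mentally apply a negation. A minimal sketch of the equivalence — the helpers below are stand-ins with the assumed semantics of Hudi's `StringUtils` methods, not the actual implementation:

```java
// Sketch: the two checks are equivalent; the positive form reads better in a
// condition. isNullOrEmpty/nonEmpty mirror the assumed StringUtils semantics.
public class StringCheckSketch {

  // Assumed semantics: true for null or a zero-length string
  public static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  // Assumed semantics: the direct negation, as the reviewer suggests using
  public static boolean nonEmpty(String s) {
    return !isNullOrEmpty(s);
  }

  public static void main(String[] args) {
    String archiveLogPropVal = "archived";
    // Before: if (!isNullOrEmpty(archiveLogPropVal)) { ... }
    // After:  if (nonEmpty(archiveLogPropVal))       { ... }
    System.out.println(nonEmpty(archiveLogPropVal)); // true
    System.out.println(nonEmpty(""));                // false
    System.out.println(nonEmpty(null));              // false
  }
}
```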
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -1008,7 +1029,7 @@ public Properties build() {
tableConfig.setValue(HoodieTableConfig.CREATE_SCHEMA,
tableCreateSchema);
}
- if (!StringUtils.isNullOrEmpty(archiveLogFolder)) {
+ if (!isNullOrEmpty(archiveLogFolder)) {
Review Comment:
nonEmpty()
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -118,6 +118,19 @@ object HoodieSparkSqlWriter {
operation = WriteOperationType.INSERT
}
+ // If no record key field and precombine field is provided, assume it's an append-only workload and do bulk insert
+ if (!hoodieConfig.contains(RECORDKEY_FIELD)
+ && !hoodieConfig.contains(PRECOMBINE_FIELD)) {
+ if (!optParams.contains(OPERATION.key)
+ || optParams.getOrDefault(OPERATION.key, OPERATION.defaultValue).equalsIgnoreCase(BULK_INSERT_OPERATION_OPT_VAL)) {
+ log.warn(s"$RECORDKEY_FIELD and $PRECOMBINE_FIELD is not specified " +
Review Comment:
```suggestion
log.warn(s"Neither $RECORDKEY_FIELD nor $PRECOMBINE_FIELD is specified; " +
```
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -396,6 +398,16 @@ public HoodieArchivedTimeline getArchivedTimeline(String startTs) {
* @param properties Properties from writeConfig.
*/
public void validateTableProperties(Properties properties) {
+ // Once table is configured to be append-only, it cannot be mutable or allow setting record key or precombine fields for updates
+ if (getTableConfig().isAppendOnlyTable()) {
+ boolean appendOnlyTable = Boolean.parseBoolean(
+ properties.getProperty(HoodieTableConfig.APPEND_ONLY_TABLE.key(), String.valueOf(HoodieTableConfig.APPEND_ONLY_TABLE.defaultValue())));
+ checkArgument(appendOnlyTable, String.format("%s is enabled. Please recreate the table with record key to support mutable tables.", HoodieTableConfig.APPEND_ONLY_TABLE.key()));
+ String recordKeyFields = properties.getProperty(HoodieTableConfig.RECORDKEY_FIELDS.key());
+ checkState(isNullOrEmpty(recordKeyFields), String.format("%s set for an append-only table", HoodieTableConfig.RECORDKEY_FIELDS.key()));
+ String precombineField = properties.getProperty(HoodieTableConfig.PRECOMBINE_FIELD.key());
Review Comment:
there can be other aliases for this key in table config — should the check cover those too?
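The concern above is that a config value may be set under an alternate (alias) key rather than the primary one, so a `getProperty` on only the primary key can miss it. A sketch of alias-aware lookup under that assumption — the key strings below are hypothetical stand-ins, not necessarily the actual Hudi `ConfigProperty` alternatives:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Sketch: resolve a property by its primary key first, then fall back to any
// alias keys. Key names are hypothetical examples, not verified Hudi keys.
public class AliasLookupSketch {

  public static String getWithAliases(Properties props, String key, List<String> aliases) {
    String value = props.getProperty(key);
    if (value != null) {
      return value;
    }
    for (String alias : aliases) {
      value = props.getProperty(alias);
      if (value != null) {
        return value;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Value set only under a (hypothetical) alias of the primary key
    props.setProperty("hoodie.datasource.write.precombine.field", "ts");
    String primary = "hoodie.table.precombine.field";
    List<String> aliases = Arrays.asList("hoodie.datasource.write.precombine.field");
    // A primary-key-only lookup misses the value; alias-aware lookup finds it
    System.out.println(props.getProperty(primary));              // null
    System.out.println(getWithAliases(props, primary, aliases)); // ts
    System.out.println(getWithAliases(props, "missing.key", Collections.<String>emptyList())); // null
  }
}
```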
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -118,6 +118,19 @@ object HoodieSparkSqlWriter {
operation = WriteOperationType.INSERT
}
+ // If no record key field and precombine field is provided, assume it's an append-only workload and do bulk insert
+ if (!hoodieConfig.contains(RECORDKEY_FIELD)
+ && !hoodieConfig.contains(PRECOMBINE_FIELD)) {
+ if (!optParams.contains(OPERATION.key)
+ || optParams.getOrDefault(OPERATION.key, OPERATION.defaultValue).equalsIgnoreCase(BULK_INSERT_OPERATION_OPT_VAL)) {
Review Comment:
you already checked contains(), so getOrDefault is redundant here — a plain get would do. Falling back to the default in this branch also makes the logic less future-proof if the default ever changes.
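The point here: once `contains(key)` is known to be true on the second side of the `||`, `getOrDefault(key, default)` can never return the default, so a plain `get` states the intent more clearly and stays correct if the default changes. A minimal Java sketch of the simplified shape (method and key names are illustrative, not the actual Hudi code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: after a containsKey() check, getOrDefault() is redundant — the
// default can only apply in the "key absent" branch.
public class GetOrDefaultSketch {

  public static String resolveOperation(Map<String, String> optParams, String defaultOp) {
    if (!optParams.containsKey("operation")) {
      return defaultOp;                // the only place the default applies
    }
    return optParams.get("operation"); // containsKey() was true; get() suffices
  }

  public static void main(String[] args) {
    Map<String, String> optParams = new HashMap<>();
    System.out.println(resolveOperation(optParams, "upsert"));   // upsert
    optParams.put("operation", "bulk_insert");
    System.out.println(resolveOperation(optParams, "upsert"));   // bulk_insert
  }
}
```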
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]