yihua commented on code in PR #7881:
URL: https://github.com/apache/hudi/pull/7881#discussion_r1162204969
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -168,23 +168,16 @@ public class HoodieWriteConfig extends HoodieConfig {
public static final ConfigProperty<String> WRITE_EXECUTOR_TYPE =
ConfigProperty
.key("hoodie.write.executor.type")
- .defaultValue(SIMPLE.name())
- .withValidValues(Arrays.stream(ExecutorType.values()).map(Enum::name).toArray(String[]::new))
- .sinceVersion("0.13.0")
- .withDocumentation("Set executor which orchestrates concurrent producers and consumers communicating through a message queue."
- + "BOUNDED_IN_MEMORY: Use LinkedBlockingQueue as a bounded in-memory queue, this queue will use extra lock to balance producers and consumer"
- + "DISRUPTOR: Use disruptor which a lock free message queue as inner message, this queue may gain better writing performance if lock was the bottleneck. "
- + "SIMPLE(default): Executor with no inner message queue and no inner lock. Consuming and writing records from iterator directly. Compared with BIM and DISRUPTOR, "
- + "this queue has no need for additional memory and cpu resources due to lock or multithreading, but also lost some benefits such as speed limit. "
- + "Although DISRUPTOR is still experimental.");
+ .defaultValue(ExecutorType.SIMPLE.name())
+ .withDocumentation(ExecutorType.class)
+ .sinceVersion("0.13.0");
public static final ConfigProperty<String> KEYGENERATOR_TYPE = ConfigProperty
.key("hoodie.datasource.write.keygenerator.type")
.defaultValue(KeyGeneratorType.SIMPLE.name())
- .withDocumentation("Easily configure one the built-in key generators, instead of specifying the key generator class."
- + "Currently supports SIMPLE, COMPLEX, TIMESTAMP, CUSTOM, NON_PARTITION, GLOBAL_DELETE. "
- + "**Note** This is being actively worked on. Please use "
- + "`hoodie.datasource.write.keygenerator.class` instead.");
+ .withDocumentation(KeyGeneratorType.class,
Review Comment:
Still need to fix this with `.withDocumentation(KeyGeneratorType.class)`?
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java:
##########
@@ -732,40 +712,17 @@ public String getValue() {
}
}
+ @EnumDescription("Clustering mode to use.")
public enum ClusteringOperator {
- /**
- * only schedule the clustering plan
- */
- SCHEDULE("schedule"),
-
- /**
- * only execute then pending clustering plans
- */
- EXECUTE("execute"),
-
- /**
- * schedule cluster first, and execute all pending clustering plans
- */
- SCHEDULE_AND_EXECUTE("scheduleandexecute");
+ @EnumFieldDescription("Only schedule the clustering plan.")
+ SCHEDULE,
- private static final Map<String, ClusteringOperator> VALUE_TO_ENUM_MAP =
- TypeUtils.getValueToEnumMap(ClusteringOperator.class, e -> e.value);
+ @EnumFieldDescription("Only execute pending clustering plans.")
+ EXECUTE,
- private final String value;
-
- ClusteringOperator(String value) {
- this.value = value;
- }
-
- @Nonnull
- public static ClusteringOperator fromValue(String value) {
- ClusteringOperator enumValue = VALUE_TO_ENUM_MAP.get(value);
- if (enumValue == null) {
- throw new HoodieException(String.format("Invalid value (%s)", value));
- }
- return enumValue;
- }
+ @EnumFieldDescription("Schedule cluster first, and execute all pending clustering plans.")
+ SCHEDULE_AND_EXECUTE;
Review Comment:
Do we still need to keep the naming here:
`SCHEDULE_AND_EXECUTE("scheduleandexecute")`? `"scheduleandexecute"` cannot be
derived from `SCHEDULE_AND_EXECUTE` by just switching the case, since the
underscores are dropped as well.
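To keep old configs working, the lookup could accept both the legacy string value and the enum name, ignoring case. A minimal sketch of this shape (the `ClusteringOp` enum and `fromValue` helper below are illustrative stand-ins, not the PR's actual code):

```java
import java.util.Locale;

// Illustrative stand-in for ClusteringOperator: keeps the legacy string
// values while also accepting the enum names, ignoring case.
public enum ClusteringOp {
  SCHEDULE("schedule"),
  EXECUTE("execute"),
  SCHEDULE_AND_EXECUTE("scheduleandexecute");

  private final String legacyValue;

  ClusteringOp(String legacyValue) {
    this.legacyValue = legacyValue;
  }

  public static ClusteringOp fromValue(String value) {
    String normalized = value.toLowerCase(Locale.ROOT);
    for (ClusteringOp op : values()) {
      // Accept the legacy value ("scheduleandexecute") as well as the enum
      // name in any case ("SCHEDULE_AND_EXECUTE", "schedule_and_execute").
      if (op.legacyValue.equals(normalized)
          || op.name().toLowerCase(Locale.ROOT).equals(normalized)) {
        return op;
      }
    }
    throw new IllegalArgumentException("Invalid value: " + value);
  }
}
```

With this shape, both `fromValue("scheduleandexecute")` and `fromValue("SCHEDULE_AND_EXECUTE")` resolve to `SCHEDULE_AND_EXECUTE`, so existing configs would not break.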
##########
hudi-common/src/main/java/org/apache/hudi/common/config/ConfigProperty.java:
##########
@@ -139,6 +144,49 @@ public ConfigProperty<T> withDocumentation(String doc) {
return new ConfigProperty<>(key, defaultValue, docOnDefaultValue, doc,
sinceVersion, deprecatedVersion, inferFunction, validValues, advanced,
alternatives);
}
+ public <U extends Enum<U>> ConfigProperty<T> withDocumentation(Class<U> e) {
+ return withDocumentation(e,"");
+ }
+
+ private <U extends Enum<U>> boolean isDefaultField(Class<U> e, Field f) {
+ if (!hasDefaultValue()) {
+ return false;
+ }
+ if (defaultValue() instanceof String) {
+ return f.getName().equals(defaultValue());
+ }
+ return Enum.valueOf(e, f.getName()).equals(defaultValue());
+ }
+
+ public <U extends Enum<U>> ConfigProperty<T> withDocumentation(Class<U> e, String doc) {
Review Comment:
Change this to `public <U extends Enum<U>> ConfigProperty<T>
withDocumentation(Class<U> e)` and remove the following?
```
public <U extends Enum<U>> ConfigProperty<T> withDocumentation(Class<U> e) {
return withDocumentation(e,"");
}
```
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -2343,7 +2307,7 @@ public Long getLockAcquireWaitTimeoutInMs() {
}
public WriteConcurrencyMode getWriteConcurrencyMode() {
- return WriteConcurrencyMode.fromValue(getString(WRITE_CONCURRENCY_MODE));
+ return WriteConcurrencyMode.valueOf(getStringOrDefault(WRITE_CONCURRENCY_MODE));
Review Comment:
This is not case-insensitive. Need to fix this too.
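A common way to restore case-insensitivity is to normalize the input before the `valueOf` call. A hypothetical helper along these lines (the class and enum names are illustrative, not the PR's actual API):

```java
import java.util.Locale;

public class EnumConfigUtils {

  // Example enum mirroring the shape of WriteConcurrencyMode.
  public enum WriteMode {
    SINGLE_WRITER,
    OPTIMISTIC_CONCURRENCY_CONTROL
  }

  // Hypothetical helper: resolves an enum constant from a config string,
  // ignoring case, so "optimistic_concurrency_control" works the same as
  // "OPTIMISTIC_CONCURRENCY_CONTROL".
  public static <T extends Enum<T>> T valueOfIgnoreCase(Class<T> enumType, String value) {
    return Enum.valueOf(enumType, value.toUpperCase(Locale.ROOT));
  }
}
```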
##########
hudi-common/src/main/java/org/apache/hudi/common/config/ConfigProperty.java:
##########
@@ -139,6 +144,49 @@ public ConfigProperty<T> withDocumentation(String doc) {
return new ConfigProperty<>(key, defaultValue, docOnDefaultValue, doc,
sinceVersion, deprecatedVersion, inferFunction, validValues, advanced,
alternatives);
}
+ public <U extends Enum<U>> ConfigProperty<T> withDocumentation(Class<U> e) {
+ return withDocumentation(e,"");
+ }
+
+ private <U extends Enum<U>> boolean isDefaultField(Class<U> e, Field f) {
+ if (!hasDefaultValue()) {
+ return false;
+ }
+ if (defaultValue() instanceof String) {
+ return f.getName().equals(defaultValue());
Review Comment:
Should this also do case-insensitive comparison?
##########
hudi-integ-test/src/test/java/org/apache/hudi/integ/testsuite/job/TestHoodieTestSuiteJob.java:
##########
@@ -268,7 +268,7 @@ public void testSparkDataSourceNodesDagWithLock() throws Exception {
this.cleanDFSDirs();
TypedProperties props = getProperties();
- props.setProperty("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
+ props.setProperty("hoodie.write.concurrency.mode", "OPTIMISTIC_CONCURRENCY_CONTROL");
Review Comment:
Similar here, revert all config value changes and all tests should still
pass.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -194,16 +187,18 @@ public class HoodieWriteConfig extends HoodieConfig {
public static final ConfigProperty<String> TIMELINE_LAYOUT_VERSION_NUM =
ConfigProperty
.key("hoodie.timeline.layout.version")
- .defaultValue(Integer.toString(TimelineLayoutVersion.VERSION_1))
+ .defaultValue(Integer.toString(TimelineLayoutVersion.CURR_VERSION))
+ .withValidValues(Integer.toString(TimelineLayoutVersion.VERSION_0),Integer.toString(TimelineLayoutVersion.VERSION_1))
.sinceVersion("0.5.1")
.withDocumentation("Controls the layout of the timeline. Version 0 relied on renames, Version 1 (default) models "
+ "the timeline as an immutable log relying only on atomic writes for object storage.");
public static final ConfigProperty<HoodieFileFormat> BASE_FILE_FORMAT =
ConfigProperty
.key("hoodie.table.base.file.format")
.defaultValue(HoodieFileFormat.PARQUET)
- .withAlternatives("hoodie.table.ro.file.format")
- .withDocumentation("Base file format to store all the base file data.");
+ .withValidValues(HoodieFileFormat.PARQUET.name(), HoodieFileFormat.ORC.name(), HoodieFileFormat.HFILE.name())
+ .withDocumentation(HoodieFileFormat.class, "File format to store all the base file data.")
Review Comment:
Same here for using `.withDocumentation(HoodieFileFormat.class)`?
##########
hudi-common/src/main/java/org/apache/hudi/common/config/EnumFieldDescription.java:
##########
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+import java.lang.annotation.Documented;
+import java.lang.annotation.ElementType;
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+import java.lang.annotation.Target;
+
+@Documented
+@Retention(RetentionPolicy.RUNTIME)
+@Target(ElementType.FIELD)
+public @interface EnumFieldDescription {
Review Comment:
Add Javadocs on how this annotation can be used.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java:
##########
@@ -74,11 +74,10 @@ public class HoodieIndexConfig extends HoodieConfig {
.noDefaultValue()
Review Comment:
Flink uses INMEMORY index.
##########
hudi-common/src/test/java/org/apache/hudi/common/config/TestConfigProperty.java:
##########
@@ -171,4 +171,28 @@ public void testAdvancedValue() {
assertTrue(FAKE_BOOLEAN_CONFIG.markAdvanced().isAdvanced());
assertTrue(FAKE_BOOLEAN_CONFIG_NO_DEFAULT.markAdvanced().isAdvanced());
}
+
+ @EnumDescription("Test enum description.")
+ public enum TestEnum {
Review Comment:
Could you add tests for case-insensitive configs? E.g., setting
`test_val_a` should resolve to `TEST_VAL_A`; or, if `TEST_VAL_A("val_a")` is
defined, `val_a` should transform to `TEST_VAL_A`.
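Such a test could assert that a lower-case config value resolves to the upper-case enum constant. A sketch under the assumption that a case-insensitive lookup exists (all names here are illustrative, not the identifiers in `TestConfigProperty`):

```java
import java.util.Locale;

public class CaseInsensitiveEnumCheck {

  // Illustrative enum standing in for the test enum in TestConfigProperty.
  public enum TestEnum {
    TEST_VAL_A,
    TEST_VAL_B
  }

  // Assumed case-insensitive resolution: normalize before Enum.valueOf so
  // that "test_val_a" maps to TEST_VAL_A.
  public static TestEnum resolve(String configValue) {
    return Enum.valueOf(TestEnum.class, configValue.toUpperCase(Locale.ROOT));
  }
}
```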
##########
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/procedures/RunClusteringProcedure.scala:
##########
@@ -160,7 +160,7 @@ class RunClusteringProcedure extends BaseProcedure
logInfo("No specific instants")
op match {
case Some(o) =>
- operator = ClusteringOperator.fromValue(o.asInstanceOf[String].toLowerCase(Locale.ROOT))
+ operator = ClusteringOperator.valueOf(o.asInstanceOf[String])
Review Comment:
Similarly, this is not case-insensitive.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java:
##########
@@ -732,40 +712,17 @@ public String getValue() {
}
}
+ @EnumDescription("Clustering mode to use.")
public enum ClusteringOperator {
- /**
- * only schedule the clustering plan
- */
- SCHEDULE("schedule"),
-
- /**
- * only execute then pending clustering plans
- */
- EXECUTE("execute"),
-
- /**
- * schedule cluster first, and execute all pending clustering plans
- */
- SCHEDULE_AND_EXECUTE("scheduleandexecute");
+ @EnumFieldDescription("Only schedule the clustering plan.")
+ SCHEDULE,
- private static final Map<String, ClusteringOperator> VALUE_TO_ENUM_MAP =
- TypeUtils.getValueToEnumMap(ClusteringOperator.class, e -> e.value);
+ @EnumFieldDescription("Only execute pending clustering plans.")
+ EXECUTE,
- private final String value;
-
- ClusteringOperator(String value) {
- this.value = value;
- }
-
- @Nonnull
- public static ClusteringOperator fromValue(String value) {
- ClusteringOperator enumValue = VALUE_TO_ENUM_MAP.get(value);
- if (enumValue == null) {
- throw new HoodieException(String.format("Invalid value (%s)", value));
- }
- return enumValue;
- }
+ @EnumFieldDescription("Schedule cluster first, and execute all pending clustering plans.")
+ SCHEDULE_AND_EXECUTE;
Review Comment:
We need to make sure the refactoring is still backward compatible.
##########
hudi-common/src/main/java/org/apache/hudi/common/config/EnumDescription.java:
##########
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+import java.lang.annotation.Documented;
+import java.lang.annotation.ElementType;
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+import java.lang.annotation.Target;
+
+@Documented
Review Comment:
Add Javadocs on how this annotation can be used.
##########
docker/demo/config/test-suite/multi-writer-1.properties:
##########
@@ -35,7 +35,7 @@ hoodie.datasource.write.recordkey.field=_row_key
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.TimestampBasedKeyGenerator
hoodie.datasource.write.partitionpath.field=timestamp
-hoodie.write.concurrency.mode=optimistic_concurrency_control
+hoodie.write.concurrency.mode=OPTIMISTIC_CONCURRENCY_CONTROL
Review Comment:
Sg. Let's revert such unnecessary changes in the test configs (no case
changes, keep using `optimistic_concurrency_control` instead of
`OPTIMISTIC_CONCURRENCY_CONTROL`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]