the-other-tim-brown commented on code in PR #13292:
URL: https://github.com/apache/hudi/pull/13292#discussion_r2106075462
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/TransactionUtils.java:
##########
@@ -72,7 +72,8 @@ public static Option<HoodieCommitMetadata> resolveWriteConflictIfAny(
       boolean timelineRefreshedWithinTransaction,
       Set<String> pendingInstants) throws HoodieWriteConflictException {
     WriteOperationType operationType = thisCommitMetadata.map(HoodieCommitMetadata::getOperationType).orElse(null);
-    if (config.needResolveWriteConflict(operationType)) {
+    if (config.needResolveWriteConflict(operationType, table.isMetadataTable(),
+        config.isStreamingWritesToMetadataEnabled(table.getMetaClient().getTableConfig().getTableVersion()))) {
Review Comment:
Instead of adding these new arguments, is there a way to just compute or
fetch these values in `needResolveWriteConflict`?
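A rough sketch of what I mean, with stand-in stub types (`TableView`, `streamingWritesEnabled`) rather than the real Hudi classes: pass the table (or a narrow view of it) and let the config derive the two flags itself, instead of threading them through every call site.

```java
// Hypothetical refactor sketch: the config computes the metadata-table and
// streaming-writes flags itself. All names here are illustrative stubs.
public class ConflictCheckSketch {

  // Narrow view of what needResolveWriteConflict actually needs from the table.
  interface TableView {
    boolean isMetadataTable();
    int tableVersion();
  }

  static class WriteConfig {
    // Stand-in for the real version-gated streaming-writes check.
    boolean streamingWritesEnabled(int tableVersion) {
      return tableVersion >= 8;
    }

    // The flags are derived here instead of at the call site.
    boolean needResolveWriteConflict(TableView table) {
      if (table.isMetadataTable() && streamingWritesEnabled(table.tableVersion())) {
        return true;
      }
      return false; // would fall through to the existing concurrency-mode check
    }
  }

  public static void main(String[] args) {
    WriteConfig config = new WriteConfig();
    TableView metadataTableV8 = new TableView() {
      public boolean isMetadataTable() { return true; }
      public int tableVersion() { return 8; }
    };
    System.out.println(config.needResolveWriteConflict(metadataTableV8));
  }
}
```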
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -841,6 +841,14 @@ public class HoodieWriteConfig extends HoodieConfig {
       .sinceVersion("1.0.0")
       .withDocumentation("Whether to enable incremental table service. So far Clustering and Compaction support incremental processing.");
+  public static final ConfigProperty<Boolean> STREAMING_WRITES_TO_METADATA_TABLE = ConfigProperty
+      .key("hoodie.write.streaming.writes.to.metadata")
+      .defaultValue(false)
Review Comment:
The default here is a bit misleading. Below, `getDefaultForStreamingWritesToMetadataTable` suggests the default is true for Spark, so is it possible to remove the default for this config?
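Roughly what I have in mind, as a standalone sketch (the `EngineType` enum and method names here are stand-ins, not the real Hudi API): leave the config without a hard default and have the accessor fall back to an engine-specific default, so the declared default can never contradict `getDefaultForStreamingWritesToMetadataTable`.

```java
import java.util.Optional;

// Hypothetical sketch: no declared default; the accessor resolves an
// engine-specific fallback when the user has not set the config.
public class EngineDefaultSketch {
  enum EngineType { SPARK, FLINK, JAVA }

  static boolean streamingWritesToMetadata(Optional<Boolean> userSetValue, EngineType engine) {
    // An explicit user setting always wins; otherwise fall back per engine.
    return userSetValue.orElse(engine == EngineType.SPARK);
  }

  public static void main(String[] args) {
    System.out.println(streamingWritesToMetadata(Optional.empty(), EngineType.SPARK));
    System.out.println(streamingWritesToMetadata(Optional.empty(), EngineType.FLINK));
    System.out.println(streamingWritesToMetadata(Optional.of(false), EngineType.SPARK));
  }
}
```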
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/MetadataTableNonBlockingWritesConflictResolutionStrategy.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.common.model.HoodieCommitMetadata;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.exception.HoodieMetadataException;
+import org.apache.hudi.exception.HoodieWriteConflictException;
+import org.apache.hudi.table.HoodieTable;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Stream;
+
+/**
+ * Conflict resolution strategy to be used with metadata table when NBCC is enabled.
+ * This resolution will not conflict any operations and let everything succeed the conflict resolution step.
+ */
+public class MetadataTableNonBlockingWritesConflictResolutionStrategy implements ConflictResolutionStrategy {
+
+ private final List<HoodieInstant> emptyHoodieInstants = new ArrayList(0);
+
+ @Override
+  public Stream<HoodieInstant> getCandidateInstants(HoodieTableMetaClient metaClient, HoodieInstant currentInstant, Option<HoodieInstant> lastSuccessfulInstant) {
+    return emptyHoodieInstants.stream();
Review Comment:
Use `Stream.empty()`
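That avoids allocating (and holding) a backing list just to produce an empty stream, e.g.:

```java
import java.util.stream.Stream;

// Stream.empty() produces an empty stream with no backing collection,
// replacing the emptyHoodieInstants field entirely.
public class EmptyStreamExample {
  public static void main(String[] args) {
    Stream<String> candidates = Stream.empty();
    System.out.println(candidates.count());
  }
}
```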
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -2855,8 +2863,12 @@ public Integer getWritesFileIdEncoding() {
     return props.getInteger(WRITES_FILEID_ENCODING, HoodieMetadataPayload.RECORD_INDEX_FIELD_FILEID_ENCODING_UUID);
   }
-  public boolean needResolveWriteConflict(WriteOperationType operationType) {
+  public boolean needResolveWriteConflict(WriteOperationType operationType, boolean isMetadataTable, boolean streamingWritesToMetadataTableEnabled) {
     WriteConcurrencyMode mode = getWriteConcurrencyMode();
+    if (isMetadataTable && streamingWritesToMetadataTableEnabled) {
+      // for metadata table, if streaming writes are enabled, we might need to perform conflict resolution.
+      return true;
Review Comment:
Why does the conflict resolution differ from the data table NBCC? We should
add some notes here.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java:
##########
@@ -84,8 +89,20 @@ public class HoodieMetadataWriteUtils {
    */
   @VisibleForTesting
   public static HoodieWriteConfig createMetadataWriteConfig(
-      HoodieWriteConfig writeConfig, HoodieFailedWritesCleaningPolicy failedWritesCleaningPolicy) {
+      HoodieWriteConfig writeConfig, HoodieFailedWritesCleaningPolicy failedWritesCleaningPolicy,
+      HoodieTableVersion datatableVersion) {
     String tableName = writeConfig.getTableName() + METADATA_TABLE_NAME_SUFFIX;
+    boolean isStreamingWritesToMetadataEnabled = writeConfig.isStreamingWritesToMetadataEnabled(datatableVersion);
+    WriteConcurrencyMode concurrencyMode = isStreamingWritesToMetadataEnabled
+        ? WriteConcurrencyMode.NON_BLOCKING_CONCURRENCY_CONTROL : WriteConcurrencyMode.SINGLE_WRITER;
+    HoodieLockConfig lockConfig = isStreamingWritesToMetadataEnabled
+        ? HoodieLockConfig.newBuilder().withLockProvider(InProcessLockProvider.class)
Review Comment:
Shouldn't the lock config come from the data table? Is it also possible to
share the lock provider used with the data table writer? That way we can
leverage the re-entrant nature of the locks and avoid deadlock bugs.
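To illustrate the deadlock concern with plain `java.util.concurrent` (the real code would go through Hudi's lock provider abstraction, so this is only a model): if the metadata table write happens while the data table writer already holds the lock, a shared re-entrant lock lets the nested acquire succeed, whereas a second independent lock instance guarding the same resource could block forever.

```java
import java.util.concurrent.locks.ReentrantLock;

// Model of sharing one re-entrant lock between the data table writer and the
// nested metadata table write on the same thread/transaction.
public class SharedLockSketch {
  public static void main(String[] args) {
    ReentrantLock sharedLock = new ReentrantLock();

    sharedLock.lock(); // data table writer takes the lock
    try {
      // Metadata table write happens inside the same guarded section;
      // the re-entrant acquire succeeds instead of deadlocking.
      sharedLock.lock();
      try {
        System.out.println("hold count: " + sharedLock.getHoldCount());
      } finally {
        sharedLock.unlock();
      }
    } finally {
      sharedLock.unlock();
    }
    System.out.println("released: " + !sharedLock.isLocked());
  }
}
```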
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]