prashantwason commented on code in PR #18045:
URL: https://github.com/apache/hudi/pull/18045#discussion_r2756396161
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -779,6 +779,18 @@ public class HoodieWriteConfig extends HoodieConfig {
       .withDocumentation("When table is upgraded from pre 0.12 to 0.12, we check for \"default\" partition and fail if found one. "
           + "Users are expected to rewrite the data in those partitions. Enabling this config will bypass this validation");
+
+  /**
+   * Config that determines whether to block writes when Spark speculative execution is enabled.
+   */
+  public static final ConfigProperty<Boolean> BLOCK_WRITES_ON_SPECULATIVE_EXECUTION = ConfigProperty
Review Comment:
The config provides an escape hatch for users who understand the risks and have specific needs. Here's the rationale:

1. **Default is safe** - The guardrail is enabled by default (`true`), so users are protected out of the box.
2. **Cluster-level settings** - Some users have `spark.speculation=true` set at the cluster or session level by their platform team and may not be able to change it easily. The config allows them to explicitly acknowledge the risk for Hudi writes.
3. **Consistent with other guardrails** - This follows the pattern of other bypass configs in Hudi, such as `SKIP_DEFAULT_PARTITION_VALIDATION`, which also default to the safe behavior but allow users to opt out when they understand the implications.
4. **Gradual adoption** - For users who suddenly hit this exception after upgrading, the config provides a way to bypass it temporarily while they work on disabling speculative execution properly.

That said, if the consensus is to make this check unconditional, I'm happy to remove the config and simplify the implementation.
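For reference, here is a minimal sketch of how the truncated `ConfigProperty` definition above might continue, following the builder pattern used by other `HoodieWriteConfig` guardrails. The key name, default, and documentation string are illustrative assumptions, not the actual values from the PR:

```java
  /**
   * Config that determines whether to block writes when Spark speculative execution is enabled.
   */
  public static final ConfigProperty<Boolean> BLOCK_WRITES_ON_SPECULATIVE_EXECUTION = ConfigProperty
      .key("hoodie.write.block.on.speculative.execution")  // hypothetical key name
      .defaultValue(true)  // guardrail on by default: writes fail when spark.speculation=true
      .withDocumentation("When enabled, Hudi blocks writes if Spark speculative execution is on, "
          + "since speculative task attempts can produce duplicate or partially written data files. "
          + "Set to false only if you understand and accept that risk.");
```

Keeping `defaultValue(true)` mirrors the `SKIP_DEFAULT_PARTITION_VALIDATION` approach: safe out of the box, with an explicit opt-out for users who cannot change cluster-level Spark settings.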
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]