Abacn commented on code in PR #35730:
URL: https://github.com/apache/beam/pull/35730#discussion_r2286079969
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java:
##########
@@ -674,16 +675,37 @@ public abstract static class MatchAll
@AutoValue.Builder
abstract static class Builder {
+ protected boolean outputParallelization;
+
abstract Builder setConfiguration(MatchConfiguration configuration);
abstract MatchAll build();
+
+ abstract Builder setOutputParallelization(boolean b);
}
/** Like {@link Match#withConfiguration}. */
public MatchAll withConfiguration(MatchConfiguration configuration) {
return toBuilder().setConfiguration(configuration).build();
}
+ /**
+ * Specifies to avoid the reshuffle operation.
+ *
+ * <p>This is a performance optimization for pipelines that match a small number of
+ * filepatterns.
+ *
+ * <p>By default, a {@link org.apache.beam.sdk.transforms.Reshuffle} is applied to the matched
+ * files to break fusion and improve parallelism.
+ */
+ public MatchAll outputParallelization() {
Review Comment:
Optional: since `matchAll()` already returns a `MatchAll` with
`outputParallelization` set to true by default, we probably do not need to
introduce the `outputParallelization()` shortcut method.
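
A rough sketch of what I mean, in plain Java rather than the real AutoValue-generated code. `MatchAllSketch`, `withOutputParallelization(boolean)` and `getOutputParallelization()` are hypothetical names for illustration only, not what this PR or Beam defines. It assumes `true` means the Reshuffle is kept, so a single boolean setter is enough and callers only touch it to opt out:

```java
// Hypothetical stand-in for FileIO.MatchAll; names are illustrative, not Beam's API.
final class MatchAllSketch {
  private final boolean outputParallelization;

  private MatchAllSketch(boolean outputParallelization) {
    this.outputParallelization = outputParallelization;
  }

  /** Factory mirroring matchAll(): the default keeps the reshuffle (true). */
  static MatchAllSketch matchAll() {
    return new MatchAllSketch(true);
  }

  /** Single setter (assumed name); pass false to skip the reshuffle. */
  MatchAllSketch withOutputParallelization(boolean outputParallelization) {
    return new MatchAllSketch(outputParallelization);
  }

  boolean getOutputParallelization() {
    return outputParallelization;
  }

  public static void main(String[] args) {
    // Default: no call needed to get the reshuffle behaviour.
    System.out.println(matchAll().getOutputParallelization());
    // Opt out explicitly through the one boolean setter.
    System.out.println(
        matchAll().withOutputParallelization(false).getOutputParallelization());
  }
}
```

With this shape a no-argument shortcut would only restate the default, which is why it looks redundant to me.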