kennknowles commented on a change in pull request #13006:
URL: https://github.com/apache/beam/pull/13006#discussion_r499779478



##########
File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/SplittableParDo.java
##########
@@ -638,43 +644,161 @@ public void tearDown() {
   }
 
   /**
-   * Throws an {@link IllegalArgumentException} if the pipeline contains any 
primitive read
-   * transforms that have not been expanded to be executed as {@link DoFn 
splittable DoFns} as long
-   * as the experiment {@code use_deprecated_read} is not specified.
+   * Converts {@link Read} based Splittable DoFn expansions to primitive reads 
implemented by {@link
+   * PrimitiveBoundedRead} and {@link PrimitiveUnboundedRead} if either the 
experiment {@code
+   * use_deprecated_read} or {@code beam_fn_api_use_deprecated_read} are 
specified.
+   *
+   * <p>TODO(BEAM-10670): Remove the primitive Read and make the splittable 
DoFn the only option.
+   */
+  public static void 
convertReadBasedSplittableDoFnsToPrimitiveReadsIfNecessary(Pipeline pipeline) {

Review comment:
       I believe this should be controlled by the runner choosing to invoke the 
method, not by a global flag. It can have the same status as other 
runner-internal overrides, like GBK via GBKO.
   
   If you really believe it should be a flag, it should be the runner that 
reads the flag and decides what to do. This utility library should not change 
its behavior based on pipeline options. Only runners should opt in to 
particular behaviors.

##########
File path: 
runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaRunner.java
##########
@@ -107,7 +107,7 @@ public PortablePipelineResult 
runPortablePipeline(RunnerApi.Pipeline pipeline) {
 
   @Override
   public SamzaPipelineResult run(Pipeline pipeline) {
-    SplittableParDo.validateNoPrimitiveReads(pipeline);
+    
SplittableParDo.convertReadBasedSplittableDoFnsToPrimitiveReadsIfNecessary(pipeline);

Review comment:
       For a runner that previously rejected all primitive reads, isn't it 
better to leave that runner alone and still reject all primitive reads?
   
   (here and elsewhere)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to