I was trying to look for references on how other runners handle @RequiresStableInput for batch cases, however I was not able to find any. In Flink I can see added support for streaming case and in Dataflow I see that support for the feature was turned off https://github.com/apache/beam/pull/8065
It seems to me that @RequiresStableInput is ignored for the batch case and the runner relies on being able to recompute the whole job in the worst case scenario. Is this assumption correct? Could I just change SparkRunner to crash on @RequiresStableInput annotation for streaming mode and ignore it in batch? On Wed, Jul 1, 2020 at 10:27 AM Jozef Vilcek <jozo.vil...@gmail.com> wrote: > We have a component which we use in streaming and batch jobs. Streaming we > run on FlinkRunner and batch on SparkRunner. Recently we needed to > add @RequiresStableInput to taht component because of streaming use-case. > But now batch case crash on SparkRunner with > > Caused by: java.lang.UnsupportedOperationException: Spark runner currently > doesn't support @RequiresStableInput annotation. > at > org.apache.beam.runners.core.construction.UnsupportedOverrideFactory.getReplacementTransform(UnsupportedOverrideFactory.java:58) > at org.apache.beam.sdk.Pipeline.applyReplacement(Pipeline.java:556) > at org.apache.beam.sdk.Pipeline.replace(Pipeline.java:292) > at org.apache.beam.sdk.Pipeline.replaceAll(Pipeline.java:210) > at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:168) > at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:90) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301) > at > com.sizmek.dp.dsp.pipeline.driver.PipelineDriver$$anonfun$1.apply(PipelineDriver.scala:42) > at > com.sizmek.dp.dsp.pipeline.driver.PipelineDriver$$anonfun$1.apply(PipelineDriver.scala:35) > at scala.util.Try$.apply(Try.scala:192) > at > com.dp.pipeline.driver.PipelineDriver$class.main(PipelineDriver.scala:35) > > > We are using Beam 2.19.0. Is the @RequiresStableInput problematic to > support for both streaming and batch use-case? What are the options here? > https://issues.apache.org/jira/browse/BEAM-5358 > >