Yunqing Zhou created BEAM-7983: ---------------------------------- Summary: Template parameters don't work if they are only used in DoFns Key: BEAM-7983 URL: https://issues.apache.org/jira/browse/BEAM-7983 Project: Beam Issue Type: Bug Components: sdk-java-core Reporter: Yunqing Zhou Assignee: Luke Cwik
Template parameters don't work if they are only used in DoFns but not anywhere else in main. Sample pipeline: {code:java} import org.apache.beam.sdk.Pipeline; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.options.PipelineOptionsFactory; import org.apache.beam.sdk.options.ValueProvider; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.DoFn; import org.apache.beam.sdk.transforms.ParDo; public class BugPipeline { public interface Options extends PipelineOptions { ValueProvider<String> getFoo(); void setFoo(ValueProvider<String> foo); } public static void main(String[] args) throws Exception { Options options = PipelineOptionsFactory.fromArgs(args).as(Options.class); Pipeline p = Pipeline.create(options); p.apply(Create.of(1)).apply(ParDo.of(new DoFn<Integer, String>() { @ProcessElement public void processElement(ProcessContext context) { System.out.println(context.getPipelineOptions().as(Options.class).getFoo()); } })); p.run(); } } {code} Option "foo" is not used anywhere else than the DoFn. So to reproduce the problem: {code:bash} $java BugPipeline --project=$PROJECT --stagingLocation=$STAGING --templateLocation=$TEMPLATE --runner=DataflowRunner $gcloud dataflow jobs run $NAME --gcs-location=$TEMPLATE --parameters=foo=bar {code} it will fail w/ this error: {code} ERROR: (gcloud.dataflow.jobs.run) INVALID_ARGUMENT: (2621bec26c2488b7): The workflow could not be created. Causes: (2621bec26c248dba): Found unexpected parameters: ['foo' (perhaps you meant 'zone')] - '@type': type.googleapis.com/google.rpc.DebugInfo detail: "(2621bec26c2488b7): The workflow could not be created. Causes: (2621bec26c248dba):\ \ Found unexpected parameters: ['foo' (perhaps you meant 'zone')]" {code} The underlying problem is that ProxyInvocationHandler.java only populate options which are "invoked" to the pipeline option map in the job object: https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java#L159 One way to solve it is to save all ValueProvider type of params in the pipelineoptions section. Alternatively, some registration mechanism can be introduced. A current workaround is to annotate the parameter with {code}@Validation.Required{code}, which will call invoke() behind the scene. -- This message was sent by Atlassian JIRA (v7.6.14#76016)