Abacn opened a new issue, #30994:
URL: https://github.com/apache/beam/issues/30994

   ### What happened?
   
   Minimum reproduce:
   
   A minimum pipeline:
   
   ```
   PipelineOptions options = PipelineOptionsFactory.fromArgs(argv).create();
   Pipeline p = Pipeline.create(options);
   p.apply(Create.of(1, 2, 3, 4, 5));
   p.run().waitUntilFinish();
   ```
   
   And build the uber jar with command ./gradlew :beamtest:shadowJar. Submit 
the job to Dataflow with
   
   ```
   ./gradlew :beamtest:run --args='--project=<...> --region=us-central1 \
   --tempLocation=gs://<...>/tmp --runner=DataflowRunner \
   --filesToStage=/<...>/build/libs/beamtest-1.0-all.jar'
   ```
   
   The job fails with error
   
   ```
   "org.apache.beam.sdk.coders.CoderException: `UnknownCoderWrapper` was used 
to perform an actual decoding in the Java SDK. Potentially a Java transform is 
being followed by a cross-language transform thatuses a coder that is not 
available in the Java SDK. Please make sure that Python transforms at the 
multi-language boundary use Beam portable coders.
        at 
org.apache.beam.sdk.util.construction.UnknownCoderWrapper.decode(UnknownCoderWrapper.java:55)
        at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.multiplexElements(BeamFnDataInboundObserver.java:158)
        at 
org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:537)
        at 
org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:150)
        at 
org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:115)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
org.apache.beam.sdk.util.UnboundedScheduledExecutorService$ScheduledFutureTask.run(UnboundedScheduledExecutorService.java:163)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
   ```
   
   The same pipeline succeeded in Beam 2.54.0, 2.54.0 under Dataflow runner v2
   
   ### Issue Priority
   
   Priority: 1 (data loss / total loss of function)
   
   ### Issue Components
   
   - [ ] Component: Python SDK
   - [X] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam YAML
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to