At least for Prism this may be the issue being chased down WRT Prism not periodically checkpointing an Unbounded SDF.
The Deadline Exceeded is a bit odd through. When I look at the logs it seems to be from the State channel. This doesn't eliminate my hypothesis though. On Thu, May 1, 2025, 6:22 AM Shunping Huang via dev <dev@beam.apache.org> wrote: > Hi Mohammed, > > Thanks for reaching out. Could you help to create two separate issues in > our repo (https://github.com/apache/beam/issues)? We will have people to > follow up with that. > > Thanks! > > Shunping > > On Wed, Apr 30, 2025 at 4:55 PM B S Mohammed Ashfaq <ash0...@gmail.com> > wrote: > >> Hi Beam Dev Team, >> >> I'm currently testing a Kafka-to-Kafka Beam YAML pipeline with a >> transformation step, and I'm running into issues on both the PrismRunner >> and the Flink Runner. Here's a summary of what I've observed: >> PrismRunner: >> >> - >> >> The job starts successfully and runs fine for a few minutes. >> - >> >> However, it eventually fails with the following error: >> >> ``` >> grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC >> that terminated with: >> status = StatusCode.DEADLINE_EXCEEDED >> details = "Deadline Exceeded" >> debug_error_string = "UNKNOWN:Error received from peer >> {grpc_message:"Deadline Exceeded", grpc_status:4, created_time:"..."} >> ``` >> This behavior is consistent — runs briefly, then fails with the same >> gRPC error. >> >> Flink Runner: >> >> - >> >> I ran the same job on the Flink Runner (Beam version 2.64.0, Flink >> 1.19.2). >> - >> >> Kafka is running via Docker; Flink is run directly from the binary >> distribution. >> - >> >> Unlike simpler Beam YAML jobs (e.g. using static data as input), this >> Kafka pipeline fails with error. >> >> ```` >> INFO:apache_beam.utils.subprocess_server: Suppressed: >> java.io.IOException: Received exit code 1 for command 'docker kill >> 50de066d877f46ce687a7c3a5e865cabfded85833245c9a331fae32eefe7691c'. stderr: >> Error response from daemon: cannot kill container: >> 50de066d877f46ce687a7c3a5e865cabfded85833245c9a331fae32eefe7691c: container >> 50de066d877f46ce687a7c3a5e865cabfded85833245c9a331fae32eefe7691c is not >> running >> ```` >> >> I'm attaching the relevant Beam YAML and logs for both runners. Any >> insights or recommendations/resolution would be appreciated — particularly >> around: >> >> - >> >> Running Beam YAML Jobs with the Flink Runner >> - >> >> Known limitations or configurations when using Kafka with either >> runner >> - >> >> The gRPC error with PrismRunner >> >> Thanks in advance for your help. >> >> Best regards, >> Mohammed Ashfaq >> >> >