SanjayPanda opened a new issue, #34010:
URL: https://github.com/apache/beam/issues/34010

   ### What happened?
   
   Team I tried to email with the email Id shown it didn't work, hence thought 
creating this issue. 
   
   I am using the Apache Beam Python SDK with the Dataflow runner. Our pipeline 
is designed to periodically fetch lookup data and use it to enrich the main 
processing pipeline. The main pipeline is configured as a session window, while 
the lookup side-input is provided using Periodic Impulse with a fixed window of 
2 hours.
   Currently, every time we run the pipeline, it executes for approximately 45 
minutes before stopping at the enrich ParDo step, where the side input is 
passed using AsMultiMap.
   
   I investigated the issue and found two Stack Overflow discussions that 
describe similar behaviour:
   [google cloud dataflow - Apache beam blocked on unbounded side input - Stack 
Overflow](https://stackoverflow.com/questions/72148569/apache-beam-blocked-on-unbounded-side-input)
   [python - Apache Beam Cloud Dataflow Streaming Stuck Side Input - Stack 
Overflow](https://stackoverflow.com/questions/70561769/apache-beam-cloud-dataflow-streaming-stuck-side-input)
   
   Would appreciate any insights or suggestions on resolving this problem.
   
![Image](https://github.com/user-attachments/assets/e6a8da90-8c4d-4531-95fc-973aa5d1cd56)
   
   Here is the data freshness graph 
   
![Image](https://github.com/user-attachments/assets/f9596d12-2467-42ef-99ad-b31ea03c70fe)
   The Enrich Pardo stopped processing until next pulse triggered. 
   
   **Questions:**
   1.   Am I overlooking anything?
   2.   Do Session Window and Fixed Window side inputs work together as 
expected? If not why?
   3.   What alternative approaches can I use for this use case?
   
   Appreciate your help and support. Thanks 😊
   
   
   ### Issue Priority
   
   Priority: 2 (default / most bugs should be filed as P2)
   
   ### Issue Components
   
   - [x] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam YAML
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Infrastructure
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to