anishnag opened a new issue, #22632:
URL: https://github.com/apache/beam/issues/22632

   ### What would you like to happen?
   
   I'm currently running a streaming Apache Beam pipeline on the Dataflow 
runner with an attached GPU to perform real-time inference. We ingest Pub/Sub 
messages that contain the GCS path of a data file, which we download and 
pre-process before batching and dispatching to the GPU for inference.
   
   The issue is that the earlier preprocessing stages are I/O-bound and 
benefit from many harness threads, while the inference step should ideally run 
on a single thread to avoid oversubscribing GPU memory, even though both 
stages run in the same worker process.
   
   It would be very useful to be able to configure the maximum number of 
threads allocated to a specific `ParDo`, so that threads can be assigned to the 
stages that need them most: many for the preprocessing `ParDo`, and a single 
thread for the inference `ParDo`. Today we instead have to tune pipeline 
parameters empirically until they work in the majority of cases.
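   In the absence of per-`ParDo` thread limits, one workaround is to gate the GPU-bound step behind a process-wide semaphore inside the DoFn, so only one harness thread runs inference at a time while the preprocessing threads remain unconstrained. The sketch below is a minimal, hedged illustration of that pattern in plain Python; `run_model` and `gpu_infer` are hypothetical stand-ins, with `gpu_infer` playing the role of the inference DoFn's `process()` body.

   ```python
   import threading

   _GPU_SLOT = threading.Semaphore(1)   # one inference thread per worker process
   _active = 0                          # observed concurrency (demo bookkeeping only)
   _peak = 0
   _stats_lock = threading.Lock()


   def run_model(batch):
       # Stand-in for the real GPU inference call.
       return [x * 2 for x in batch]


   def gpu_infer(batch):
       """What the inference DoFn's process() would do: serialize GPU access."""
       global _active, _peak
       with _GPU_SLOT:                  # at most one thread past this point
           with _stats_lock:
               _active += 1
               _peak = max(_peak, _active)
           try:
               return run_model(batch)
           finally:
               with _stats_lock:
                   _active -= 1


   # Simulate several harness threads hitting the inference step at once.
   threads = [threading.Thread(target=gpu_infer, args=([i],)) for i in range(8)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()

   print(_peak)  # the semaphore keeps peak inference concurrency at 1
   ```

   This keeps GPU memory safe, but the semaphore-blocked threads are still counted against the worker's harness thread pool, which is exactly the inefficiency a per-`ParDo` thread limit would remove.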
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: runner-dataflow

