Hey all, I'm working on extending my pipeline runner so that multiple batches of inputs are streamed into a single bundle (to amortize bundle setup time) rather than just sending a single batch of inputs into the data channel and then closing the data channel.
I'm using the ProcessBundleProgressRequest[1] to poll my workers to see if they're idle and ready for more inputs using the `consuming_received_data` field, but I noticed that `consuming_received_data` doesn't actually imply whether or not the worker is busy or ready for work. Specifically, while the worker is setting up (e.g. running `setup_bundle`), `consuming_received_data` is set to False even though the worker is not actually ready for work. This makes it a bit awkward for runners to know whether or not it's safe to send work to a worker. If `setup_bundle` takes a very long time, then a runner might repeatedly send more and more work if it uses `consuming_received_data` as an indicator of idleness. Am I misusing `consuming_received_data`? Is there another more reliable way to detect worker idleness? cheers, Joey https://github.com/apache/beam/blob/357b7206f82f788b55732247c9096527f92adf4f/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L476
