For N invokers, the number of queues could be 2N (ready and overflow per invoker) or N+1 (ready per invoker plus one global overflow). I used "two queues" to refer to both scenarios - maybe that was confusing?
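To make the two counts concrete, here is a minimal sketch that enumerates the queues for each layout. The queue names and the `queue_names` helper are hypothetical, made up for illustration; they are not taken from the actual implementation.

```python
def queue_names(n_invokers: int, global_overflow: bool) -> list[str]:
    """List hypothetical queue names for the two layouts discussed above."""
    # One ready queue per invoker in both layouts.
    ready = [f"invoker{i}-ready" for i in range(n_invokers)]
    if global_overflow:
        # N+1 layout: a single overflow queue shared by all invokers.
        overflow = ["overflow"]
    else:
        # 2N layout: each invoker gets its own overflow queue.
        overflow = [f"invoker{i}-overflow" for i in range(n_invokers)]
    return ready + overflow


if __name__ == "__main__":
    for shared in (False, True):
        names = queue_names(3, global_overflow=shared)
        print(len(names), names)
```

With 3 invokers, the per-invoker overflow layout gives 6 queues and the global overflow layout gives 4.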
The ready queue should have a very low occupancy time: since the consuming invoker has capacity, requests in that queue will be drained quickly. I’m not sure if you’re implying this shouldn’t be Kafka and should be something else instead. If so, we can break that out into a separate discussion. -r
