cybermaggedon commented on PR #23143: URL: https://github.com/apache/pulsar/pull/23143#issuecomment-4037910570
This is a very important scenario for TrustGraph.ai. We've implemented RPC using 2 different queues, and it seems to work quite well, though maybe not 100% ideal.

It seems like the 2 design options are:

- to have a fixed queue for requests and dynamically create response queues, one per 'client'
- to have a fixed request and response queue and use a request ID to work out which response goes to which client

Dynamic queues: Complex for operations, harder to debug, harder to produce metrics on.

Static queues: Doesn't work well if there are MANY subscribers, each getting a copy of all RPC responses, then working out and discarding millions of messages which are irrelevant. Also not good in scenarios where you don't trust the clients not to snoop on messages which aren't intended for them.

In our system, there are many RPC clients, but in practice the request/response is handled at an API gateway, so the fixed queue model works well.

For agentic flows, it's useful to think beyond basic request/response to multiple consumers. E.g. think of an LLM service. Requests are like "What is 2 + 2?" and responses are the LLM response to that prompt. The response queue not only carries the response to the client but also critically important metadata like token costs and counts, so it's a very useful place to put a metrics consumer to build a picture of LLM costs.

Also, in practice an agent flow looks like request/response from a client POV but is much more complex inside the system: the client puts in a request to the orchestrator. We decided that for incremental steps the orchestrator sends messages to itself for subsequent iterations, and then the final response goes back to the client. So, best not to lock into a 'pure' request/response pattern.

Pulsar lets you put different persistence requirements on different namespaces. Think about that for request/response. You probably don't have the same persistence/QoS requirements for requests and responses. If a request isn't initiated after e.g. 30s, there's no point acting on it because there will be some retry somewhere in the client space. If a request is accepted, you definitely want to get a response back to the client.

Maybe the Pulsar primitives of today are the best feature set for this. I wouldn't want to see a framework which doesn't address the above. Happy to kick ideas around.

> @lhotari @codelipenghui @cybermaggedon @JackColquitt
>
> It seems to be in the current scenario of AI Agent. The function of this request-response message sending model is still very important.
>
> https://streamnative.io/blog/case-study-apache-pulsar-as-the-event-driven-backbone-of-trustgraph#pulsar-in-action-a-day-in-the-life-with-trustgraph
>
> ```
> Request/Response Messaging:
> For interactive services—such as an AI Agent API or the GraphRAG query service—TrustGraph sets up dedicated request and response topics.
> For example, when a user’s query hits the Agent service, it is published to a request topic, the agent processes it, and the answer comes back on a response topic tied to that user’s session or flow.
> This pub/sub request-response pattern feels like a direct call from the client’s perspective, but under the hood it’s decoupled and asynchronous.
> The client can await a response without knowing which specific service instance will handle it. This pattern gives synchronous behavior on top of asynchronous internals, combining interactivity with scalability.
> ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
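
For illustration, the fixed request/response queue option from the comment above can be sketched without any broker at all. This is only a rough in-memory model, not TrustGraph's or Pulsar's implementation: `Message`, `RpcClient`, `serve`, `demo` and `REQUEST_TTL_S` are hypothetical names, a standard-library `queue.Queue` stands in for the request topic, and the 30-second cutoff simply mirrors the "e.g. 30s" figure mentioned in the comment.

```python
import queue
import time
import uuid
from dataclasses import dataclass, field

REQUEST_TTL_S = 30.0  # assumption: requests older than this are not worth serving


@dataclass
class Message:
    request_id: str
    payload: str
    created_at: float = field(default_factory=time.monotonic)


class RpcClient:
    """Hypothetical client sharing one request queue with all other clients."""

    def __init__(self, requests):
        self.requests = requests
        self.pending = set()  # request IDs this client is still waiting on

    def send(self, payload):
        request_id = str(uuid.uuid4())
        self.pending.add(request_id)
        self.requests.put(Message(request_id, payload))
        return request_id

    def accept(self, response):
        """Return the payload if the response is ours, else None (discarded)."""
        if response.request_id not in self.pending:
            return None
        self.pending.discard(response.request_id)
        return response.payload


def serve(requests, handler):
    """Drain the request queue, skipping requests older than REQUEST_TTL_S."""
    responses = []
    while True:
        try:
            req = requests.get_nowait()
        except queue.Empty:
            return responses
        if time.monotonic() - req.created_at > REQUEST_TTL_S:
            continue  # stale: the client is assumed to have retried already
        responses.append(Message(req.request_id, handler(req.payload)))


def demo():
    requests = queue.Queue()
    alice, bob = RpcClient(requests), RpcClient(requests)
    alice.send("hello")
    bob.send("world")
    # A request that sat in the queue for 60s and should be dropped.
    requests.put(Message("stale", "ignored", created_at=time.monotonic() - 60))
    responses = serve(requests, str.upper)
    # Every client sees every response and filters by request ID.
    alice_got = [p for p in map(alice.accept, responses) if p is not None]
    bob_got = [p for p in map(bob.accept, responses) if p is not None]
    return alice_got, bob_got, len(responses)


if __name__ == "__main__":
    print(demo())
```

Each client here inspects every response and throws away the ones it did not ask for, which is exactly the cost the comment warns about when there are many subscribers; handling the correlation at a single gateway, as described above, keeps that filtering in one place.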
