Thank you, Carl, for being patient with all the questions. Appreciate it.
On Tue, Aug 9, 2016 at 6:43 PM, Carl Mastrangelo <[email protected]> wrote:

> When talking at the millisecond level, gRPC is likely not going to show
> up as a significant cost. Adding up the cost of message serialization,
> encryption, headers, etc. will maybe account for a millisecond of time
> inside of your client, and the rest of the delay will come from your
> network (and, of course, your actual application code).
>
> It sounds like gRPC is designed for your use. I don't think your use
> case is unusual at all.
>
> On Tue, Aug 9, 2016 at 6:34 PM, Pradeep Singh <[email protected]> wrote:
>
>> We are planning to write our own message bus for a low-latency ad
>> platform. With very tight SLAs on response times (<80 ms), we want a
>> solution that gives us good throughput while keeping most of the
>> traffic within that latency budget.
>>
>> What complicates the problem is that there are 4 hops involved before
>> a response is sent out. This means we either write our own RPC
>> implementation in C (*shudders*) or use a substitute that gets us
>> there.
>>
>> Since I am responsible for this, I would like to evaluate whether gRPC
>> is the right fit for such a peculiar use case.
>>
>> Thanks,
>>
>> On Tue, Aug 9, 2016 at 6:12 PM, Carl Mastrangelo <[email protected]>
>> wrote:
>>
>>> The latency numbers are a little tricky to interpret with respect to
>>> throughput. Latency and throughput are at odds with each other, and
>>> optimizing one usually comes at the cost of the other. (And,
>>> generally speaking, latency is more important than throughput.)
>>>
>>> When testing for latency, I create a client and server running on
>>> separate machines. The client sends a single message and waits for a
>>> response; upon receiving a response, it sends another. We call this a
>>> closed-loop benchmark. It is effectively single-threaded, so as not
>>> to introduce additional noise into the system. (We also vary whether
>>> or not to use an additional executor when handling responses, which
>>> can change the latency by about 25 µs.) In such a setup, I can do
>>> around 200 µs latency, which ends up being around 5,000 qps for a
>>> single core.
>>>
>>> When running the benchmark to max out QPS, I can get much higher
>>> throughput. The median latency in such tests is around 50 ms (186 ms
>>> at the 99.9th percentile), for an aggregate throughput of about
>>> 300-400 Kqps. This is running a Java server and client, each on a
>>> 32-core machine. We can go much higher, and I have filed a number of
>>> performance issues in the grpc-java GitHub project. (All the code is
>>> available in our benchmarks directory, so you can reproduce the
>>> numbers yourself.)
>>>
>>> gRPC gives you good defaults out of the box. The numbers I am
>>> mentioning here are achieved by looking more thoroughly into the
>>> setup and making the appropriate changes. Our setup is careful to
>>> avoid lock contention, avoid thread context switches, avoid
>>> allocating memory where possible, and obey the flow-control signals.
>>> We prefer the async API to the synchronous one.
>>>
>>> All our numbers are visible on the dashboard, as previously
>>> mentioned. If you describe your use case, we can tell you what
>>> approximate performance to expect and how to achieve it.
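>>>
>>> To make the closed-loop setup concrete: the client is a single loop
>>> that issues one RPC and starts the next only after the previous
>>> response arrives. A rough sketch against the stock helloworld Greeter
>>> service (illustrative only, not our actual benchmark code, which
>>> lives in the benchmarks directory; the host, port, and iteration
>>> counts are placeholders):
>>>
>>>   import java.util.concurrent.TimeUnit;
>>>
>>>   import io.grpc.ManagedChannel;
>>>   import io.grpc.ManagedChannelBuilder;
>>>   import io.grpc.examples.helloworld.GreeterGrpc;
>>>   import io.grpc.examples.helloworld.HelloRequest;
>>>
>>>   public class ClosedLoopClient {
>>>     public static void main(String[] args) throws Exception {
>>>       // One channel, created once and reused for every RPC.
>>>       ManagedChannel channel = ManagedChannelBuilder
>>>           .forAddress("server-host", 50051)  // placeholder endpoint
>>>           .usePlaintext(true)
>>>           .build();
>>>       GreeterGrpc.GreeterBlockingStub stub =
>>>           GreeterGrpc.newBlockingStub(channel);
>>>       HelloRequest req = HelloRequest.newBuilder().setName("ping").build();
>>>
>>>       // Warm up the JIT and the connection before measuring.
>>>       for (int i = 0; i < 10_000; i++) {
>>>         stub.sayHello(req);
>>>       }
>>>
>>>       int n = 100_000;
>>>       long start = System.nanoTime();
>>>       for (int i = 0; i < n; i++) {
>>>         stub.sayHello(req);  // next RPC starts only after this returns
>>>       }
>>>       long elapsedNanos = System.nanoTime() - start;
>>>       System.out.printf("avg latency: %.1f us, ~%.0f qps%n",
>>>           elapsedNanos / 1000.0 / n, n * 1e9 / elapsedNanos);
>>>
>>>       channel.shutdown().awaitTermination(5, TimeUnit.SECONDS);
>>>     }
>>>   }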
>>>
>>> On Tue, Aug 9, 2016 at 5:25 PM, Pradeep Singh <[email protected]>
>>> wrote:
>>>
>>>> Thanks Carl.
>>>>
>>>> And what throughput can you achieve with these latencies?
>>>> Sending one request and receiving one response is fine, but what
>>>> happens to latencies when the request rate reaches 50K requests per
>>>> second? In particular, what are the average latency and throughput
>>>> at the point where the CPU cores are saturated on either the client
>>>> or the server?
>>>>
>>>> I agree that latency and throughput do not go hand in hand, but I
>>>> would love to know your numbers up to the point where latency starts
>>>> crossing into milliseconds.
>>>>
>>>> --Pradeep
>>>>
>>>> On Tue, Aug 9, 2016 at 5:00 PM, Carl Mastrangelo <[email protected]>
>>>> wrote:
>>>>
>>>>> On machines within the same network, you can expect latencies in
>>>>> the low hundreds of microseconds. I have personally measured
>>>>> 100-200 microseconds on nearby machines. I had to tune the server
>>>>> somewhat to achieve this, but it is possible.
>>>>>
>>>>> On Tuesday, August 9, 2016 at 10:33:31 AM UTC-7, Pradeep Singh wrote:
>>>>>>
>>>>>> Oh, I was running the benchmark included in the gRPC source code.
>>>>>> I think it reuses the same connection.
>>>>>>
>>>>>> 300 µs sounds really good.
>>>>>>
>>>>>> What latency do you notice when the client and server are running
>>>>>> on different hosts?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Tue, Aug 9, 2016 at 8:58 AM, Eric Anderson <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Aug 8, 2016 at 12:35 AM, <[email protected]> wrote:
>>>>>>>
>>>>>>>> With a custom ZMQ messaging bus, we get latency on the order of
>>>>>>>> microseconds between two services on the same host (21 µs avg),
>>>>>>>> vs. 2 ms avg for gRPC.
>>>>>>>
>>>>>>> Did you reuse the ClientConn between RPCs?
>>>>>>>
>>>>>>> In our performance tests on GCE (using not-very-special machines,
>>>>>>> where netperf takes ~100 µs), we see ~300 µs latency for unary
>>>>>>> and ~225 µs latency for streaming in Go.
>>>>>>
>>>>>> --
>>>>>> Pradeep Singh
>>>>
>>>> --
>>>> Pradeep Singh
>>
>> --
>> Pradeep Singh

--
Pradeep Singh
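A note on Eric's question about reusing the ClientConn: ClientConn is the
grpc-go type, but the advice carries over to grpc-java, where the object
to build once and share is the ManagedChannel. Creating a channel per RPC
adds a fresh TCP (and possibly TLS) handshake to every call, which alone
could plausibly account for much of a 2 ms average. A minimal sketch of
the reuse pattern (stock helloworld Greeter service assumed; the endpoint
is a placeholder):

  import io.grpc.ManagedChannel;
  import io.grpc.ManagedChannelBuilder;
  import io.grpc.examples.helloworld.GreeterGrpc;
  import io.grpc.examples.helloworld.HelloRequest;

  public class ReuseChannelClient {
    // Channels and stubs are thread-safe; share them across all RPCs
    // instead of constructing them per call.
    private static final ManagedChannel CHANNEL = ManagedChannelBuilder
        .forAddress("server-host", 50051)  // placeholder endpoint
        .usePlaintext(true)
        .build();
    private static final GreeterGrpc.GreeterBlockingStub STUB =
        GreeterGrpc.newBlockingStub(CHANNEL);

    public static void main(String[] args) {
      for (int i = 0; i < 1_000; i++) {
        // Every call rides the same HTTP/2 connection; no per-RPC
        // handshake.
        STUB.sayHello(HelloRequest.newBuilder().setName("req-" + i).build());
      }
      CHANNEL.shutdown();
    }
  }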

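Similarly, Carl's preference for the async API maps in grpc-java to the
generated async stub driven by StreamObserver callbacks rather than the
blocking stub. A sketch of the calling pattern (illustrative only; a real
high-throughput client would obey the flow-control signals instead of
firing a fixed burst):

  import java.util.concurrent.CountDownLatch;
  import java.util.concurrent.TimeUnit;

  import io.grpc.ManagedChannel;
  import io.grpc.ManagedChannelBuilder;
  import io.grpc.examples.helloworld.GreeterGrpc;
  import io.grpc.examples.helloworld.HelloReply;
  import io.grpc.examples.helloworld.HelloRequest;
  import io.grpc.stub.StreamObserver;

  public class AsyncClient {
    public static void main(String[] args) throws Exception {
      ManagedChannel channel = ManagedChannelBuilder
          .forAddress("server-host", 50051)  // placeholder endpoint
          .usePlaintext(true)
          .build();
      GreeterGrpc.GreeterStub stub = GreeterGrpc.newStub(channel);

      int requests = 100;  // illustrative burst size
      CountDownLatch done = new CountDownLatch(requests);
      for (int i = 0; i < requests; i++) {
        stub.sayHello(
            HelloRequest.newBuilder().setName("req-" + i).build(),
            new StreamObserver<HelloReply>() {
              @Override public void onNext(HelloReply reply) {
                // Runs on a channel thread; keep the work here minimal.
              }
              @Override public void onError(Throwable t) {
                done.countDown();
              }
              @Override public void onCompleted() {
                done.countDown();
              }
            });
      }
      done.await(30, TimeUnit.SECONDS);
      channel.shutdown().awaitTermination(5, TimeUnit.SECONDS);
    }
  }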