When talking at the millisecond level, gRPC is likely not going to show up
as a significant cost.  Adding up the cost of message serialization,
encryption, headers, etc. will maybe account for a millisecond of time
inside your client; the rest of the delay will come from your network
(and, of course, your actual application code).

It sounds like gRPC is designed for your use.  I don't think your use case
is unusual at all.

On Tue, Aug 9, 2016 at 6:34 PM, Pradeep Singh <[email protected]> wrote:

> We are planning to write our own Message bus for a low latency Ad platform.
> With very tight SLAs on response times (<80ms), we would want to have a
> solution which can give us a good throughput with most of the traffic
> within this latency.
>
> What complicates the problem is that there are 4 hops involved before a
> response is sent out.
> This means we either write our own RPC implementation in C (*shudders*) or
> use a substitute that can get us close to that.
>
> Since I am responsible for this, I would like to evaluate whether gRPC is
> the right fit for such a peculiar use case.
>
> Thanks,
>
> On Tue, Aug 9, 2016 at 6:12 PM, Carl Mastrangelo <[email protected]>
> wrote:
>
>> The latency numbers are a little tricky to interpret with respect to
>> throughput.  Latency and throughput are at odds with each other, and
>> optimizing one usually comes at the cost of the other.  (And, generally
>> speaking, latency is more important than throughput).
>>
>> When testing for latency, I create a client and server running on
>> separate machines.  The client sends a single message and waits for a
>> response.  Upon receiving a response it sends another.  We call this a
>> closed-loop benchmark.  It is effectively single threaded, in order to not
>> introduce additional noise into the system.  (We also vary whether or not
>> to use an additional executor when handling responses, which can change the
>> latency by about 25us.)  In such a setup, I can achieve around 200us
>> latency, which ends up being around 5000qps for a single core.
>>
>> When running the benchmark trying to max out QPS, I can get much higher
>> throughput.  The median latency in such tests is around 50ms, for
>> an aggregate throughput of about 300-400 Kqps.   (and 186ms at the 99.9th
>> percentile).  This is running a java server and client each with a 32 core
>> machine.  We can go much higher, and I have added a number of performance
>> issues to the grpc-java github project.   (all the code is available in our
>> benchmarks directory, so that you can reproduce them yourself)
>>
>> We give you good defaults out of the box with gRPC.  The numbers I am
>> mentioning here are achieved by looking more thoroughly into the setup, and
>> making the appropriate changes.  Our setup is careful to avoid lock
>> contention, avoid thread context switches, avoid allocating memory where
>> possible, and obey the flow control signals.  We prefer the async API to
>> the synchronous one.
>>
>> All our numbers are visible on the dashboard as previously mentioned.
>> If you describe your use case, we can tell you what approximate
>> performance to expect, and how to achieve it.
>>
>> On Tue, Aug 9, 2016 at 5:25 PM, Pradeep Singh <[email protected]> wrote:
>>
>>> Thanks Carl.
>>>
>>> And what throughput can you achieve with these latencies?
>>> I mean, sending one request and receiving one response is fine, but what
>>> happens to latencies when the request rate reaches 50K requests per second?
>>> In particular, what are the average latency and throughput at the point
>>> when CPU cores are saturated at either the client or the server?
>>>
>>> I agree that latency and throughput do not go hand in hand, but I would
>>> love to know your numbers before latency starts crossing the millisecond
>>> boundary.
>>>
>>>     --Pradeep
>>>
>>> On Tue, Aug 9, 2016 at 5:00 PM, Carl Mastrangelo <[email protected]>
>>> wrote:
>>>
>>>> On machines that are within the same network, you can expect latencies
>>>> in the low hundreds of microseconds.  I have personally measured numbers
>>>> within 100 - 200 microseconds on nearby machines.  I had to tune the server
>>>> somewhat to achieve this, but it is possible.
>>>>
>>>> On Tuesday, August 9, 2016 at 10:33:31 AM UTC-7, Pradeep Singh wrote:
>>>>>
>>>>> Oh I was running the included benchmark in gRPC src code.
>>>>> I think it reuses the same connection.
>>>>>
>>>>> 300us sounds really good.
>>>>>
>>>>> What latency do you guys notice when client and server are running on
>>>>> different hosts?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> On Tue, Aug 9, 2016 at 8:58 AM, Eric Anderson <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Aug 8, 2016 at 12:35 AM, <[email protected]> wrote:
>>>>>>
>>>>>>> With our custom zmq messaging bus we get latency on the order of
>>>>>>> microseconds between 2 services on the same host (21 us avg) vs 2 ms
>>>>>>> avg for gRPC.
>>>>>>>
>>>>>>
>>>>>> Did you reuse the ClientConn between RPCs?
>>>>>>
>>>>>> In our performance tests on GCE (using not very special machines,
>>>>>> where netperf takes ~100µs) we see ~300µs latency for unary and ~225µs
>>>>>> latency for streaming in Go.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Pradeep Singh
>>>>>
>>>>
>>>
>>>
>>> --
>>> Pradeep Singh
>>>
>>
>>
>
>
> --
> Pradeep Singh
>

-- 
You received this message because you are subscribed to the Google Groups 
"grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/grpc-io/CAAcqB%2Bu2sN7v6HnOTh%2BM_X%2BWYGRckWJaaYAe8BX-OaG92N6Btw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
