Thank you, Carl, for being patient with all the questions.

Appreciate it.

On Tue, Aug 9, 2016 at 6:43 PM, Carl Mastrangelo <[email protected]> wrote:

> At the millisecond level, gRPC is likely not going to show up as a
> significant cost.  Message serialization, encryption, headers, etc. will
> account for maybe a millisecond inside your client; the rest of the delay
> will come from your network (and, of course, your actual application code).
>
> It sounds like gRPC is designed for your use.  I don't think your use case
> is unusual at all.
>
> On Tue, Aug 9, 2016 at 6:34 PM, Pradeep Singh <[email protected]> wrote:
>
>> We are planning to write our own message bus for a low-latency ad
>> platform.  With very tight SLAs on response times (<80ms), we want a
>> solution that gives us good throughput while keeping most of the traffic
>> within that latency budget.
>>
>> What complicates the problem is that there are 4 hops involved before a
>> response is sent out.  This means we either write our own RPC
>> implementation in C (*shudders*) or use a substitute that can get us
>> there.
>>
>> Since I am responsible for this, I would like to evaluate whether gRPC
>> is the right fit for such a peculiar use case.
>>
>> Thanks,
>>
>> On Tue, Aug 9, 2016 at 6:12 PM, Carl Mastrangelo <[email protected]>
>> wrote:
>>
>>> The latency numbers are a little tricky to interpret with respect to
>>> throughput.  Latency and throughput are at odds with each other, and
>>> optimizing one usually comes at the cost of the other.  (And, generally
>>> speaking, latency is more important than throughput).
>>>
>>> When testing for latency, I create a client and server running on
>>> separate machines.  The client sends a single message and waits for a
>>> response; upon receiving the response it sends another.  We call this a
>>> closed-loop benchmark.  It is effectively single threaded, in order not
>>> to introduce additional noise into the system.  (We also vary whether or
>>> not to use an additional executor when handling responses, which can
>>> change the latency by about 25us.)  In such a setup, I can get around
>>> 200us latency, which ends up being around 5000 qps for a single core.
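>>>
>>> In sketch form (not the exact benchmark code; EchoGrpc, EchoRequest,
>>> and the address are hypothetical stand-ins for your generated stubs),
>>> the client loop is roughly:
>>>
>>> import io.grpc.ManagedChannel;
>>> import io.grpc.ManagedChannelBuilder;
>>> import java.util.Arrays;
>>>
>>> public final class ClosedLoopBench {
>>>   public static void main(String[] args) throws Exception {
>>>     ManagedChannel channel = ManagedChannelBuilder
>>>         .forAddress("server-host", 50051).usePlaintext().build();
>>>     // Blocking stub: exactly one RPC in flight at any time.
>>>     EchoGrpc.EchoBlockingStub stub = EchoGrpc.newBlockingStub(channel);
>>>     EchoRequest req = EchoRequest.getDefaultInstance();
>>>
>>>     int n = 100_000;
>>>     long[] micros = new long[n];
>>>     for (int i = 0; i < n; i++) {
>>>       long start = System.nanoTime();
>>>       stub.echo(req);  // send one message, wait for the response
>>>       micros[i] = (System.nanoTime() - start) / 1_000;
>>>     }
>>>     Arrays.sort(micros);
>>>     System.out.println("median: " + micros[n / 2] + "us");
>>>     channel.shutdown();
>>>   }
>>> }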
>>>
>>> When running the benchmark to max out QPS, I can get much higher
>>> throughput.  Median latency in such tests is around 50ms (186ms at the
>>> 99.9th percentile), for an aggregate throughput of about 300-400 Kqps.
>>> This is with a Java server and client, each on a 32-core machine.  We
>>> can go much higher, and I have filed a number of performance issues on
>>> the grpc-java GitHub project.  (All the code is available in our
>>> benchmarks directory, so you can reproduce the numbers yourself.)
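>>>
>>> The throughput runs differ mainly in how many RPCs are kept in flight
>>> at once.  Roughly (again a sketch with a hypothetical Echo service;
>>> the permit budget is arbitrary):
>>>
>>> import com.google.common.util.concurrent.ListenableFuture;
>>> import com.google.common.util.concurrent.MoreExecutors;
>>> import io.grpc.ManagedChannel;
>>> import java.util.concurrent.Semaphore;
>>>
>>> static void pump(ManagedChannel channel) throws InterruptedException {
>>>   EchoGrpc.EchoFutureStub stub = EchoGrpc.newFutureStub(channel);
>>>   Semaphore outstanding = new Semaphore(1000);  // in-flight RPC budget
>>>   while (!Thread.currentThread().isInterrupted()) {
>>>     outstanding.acquire();  // wait for a free slot
>>>     ListenableFuture<EchoReply> f =
>>>         stub.echo(EchoRequest.getDefaultInstance());
>>>     // Release the slot when the RPC completes, on the completing thread.
>>>     f.addListener(outstanding::release, MoreExecutors.directExecutor());
>>>   }
>>> }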
>>>
>>> gRPC gives you good defaults out of the box.  The numbers I am quoting
>>> here are achieved by looking more thoroughly into the setup and making
>>> the appropriate changes.  Our setup is careful to avoid lock contention,
>>> thread context switches, and memory allocation where possible, and to
>>> obey the flow-control signals.  We prefer the async API to the
>>> synchronous one.
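>>>
>>> For example (a sketch, not our exact setup; streamEcho is a
>>> hypothetical bidi-streaming method on a hypothetical Echo service):
>>>
>>> import io.grpc.ManagedChannel;
>>> import io.grpc.ManagedChannelBuilder;
>>> import io.grpc.stub.ClientCallStreamObserver;
>>> import io.grpc.stub.ClientResponseObserver;
>>>
>>> // Skip the app-executor hand-off: callbacks run on the transport
>>> // (Netty event loop) thread.  Only safe if they never block.
>>> ManagedChannel channel = ManagedChannelBuilder
>>>     .forAddress("server-host", 50051)
>>>     .directExecutor()
>>>     .usePlaintext()
>>>     .build();
>>>
>>> EchoGrpc.newStub(channel).streamEcho(
>>>     new ClientResponseObserver<EchoRequest, EchoReply>() {
>>>       @Override
>>>       public void beforeStart(ClientCallStreamObserver<EchoRequest> call) {
>>>         // Obey flow control: write only while the transport can absorb
>>>         // messages; the handler fires again once the buffer drains.
>>>         call.setOnReadyHandler(() -> {
>>>           while (call.isReady()) {
>>>             call.onNext(EchoRequest.getDefaultInstance());
>>>           }
>>>         });
>>>       }
>>>       @Override public void onNext(EchoReply reply) { /* consume */ }
>>>       @Override public void onError(Throwable t) { t.printStackTrace(); }
>>>       @Override public void onCompleted() { }
>>>     });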
>>>
>>> All our numbers are visible on the dashboard, as previously mentioned.
>>> If you describe your use case, we can tell you what approximate
>>> performance to expect and how to achieve it.
>>>
>>> On Tue, Aug 9, 2016 at 5:25 PM, Pradeep Singh <[email protected]>
>>> wrote:
>>>
>>>> Thanks Carl.
>>>>
>>>> And what throughput can you achieve at these latencies?
>>>> Sending one request and receiving one response is fine, but what
>>>> happens to latencies when the request rate reaches 50K requests per
>>>> second?  In particular, what are the average latency and throughput at
>>>> the point where CPU cores are saturated on either the client or the
>>>> server?
>>>>
>>>> I agree that latency and throughput do not go hand in hand, but I
>>>> would love to know your numbers up to the point where latency starts
>>>> crossing millisecond boundaries.
>>>>
>>>>     --Pradeep
>>>>
>>>> On Tue, Aug 9, 2016 at 5:00 PM, Carl Mastrangelo <[email protected]>
>>>> wrote:
>>>>
>>>>> On machines within the same network, you can expect latencies in the
>>>>> low hundreds of microseconds.  I have personally measured 100-200
>>>>> microseconds on nearby machines.  I had to tune the server somewhat to
>>>>> achieve this, but it is possible.
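>>>>>
>>>>> One such knob in grpc-java, for example (just a sketch;
>>>>> EchoServiceImpl is a placeholder, and this is only safe when the
>>>>> handlers never block):
>>>>>
>>>>> import io.grpc.Server;
>>>>> import io.grpc.ServerBuilder;
>>>>>
>>>>> Server server = ServerBuilder.forPort(50051)
>>>>>     .addService(new EchoServiceImpl())
>>>>>     .directExecutor()  // handlers run on transport threads, no hand-off
>>>>>     .build()
>>>>>     .start();
>>>>> server.awaitTermination();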
>>>>>
>>>>> On Tuesday, August 9, 2016 at 10:33:31 AM UTC-7, Pradeep Singh wrote:
>>>>>>
>>>>>> Oh, I was running the benchmark included in the gRPC source code.
>>>>>> I think it reuses the same connection.
>>>>>>
>>>>>> 300us sounds really good.
>>>>>>
>>>>>> What latency do you guys notice when client and server are running on
>>>>>> different hosts?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Tue, Aug 9, 2016 at 8:58 AM, Eric Anderson <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Aug 8, 2016 at 12:35 AM, <[email protected]> wrote:
>>>>>>>
>>>>>>>> With a custom zmq messaging bus we get latencies on the order of
>>>>>>>> microseconds between two services on the same host (21us avg), vs.
>>>>>>>> 2ms avg for gRPC.
>>>>>>>>
>>>>>>>
>>>>>>> Did you reuse the ClientConn between RPCs?
>>>>>>>
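>>>>>>> In grpc-java terms, that means building one long-lived channel and
>>>>>>> reusing it for every call, roughly (the Echo classes are hypothetical):
>>>>>>>
>>>>>>> import io.grpc.ManagedChannel;
>>>>>>> import io.grpc.ManagedChannelBuilder;
>>>>>>>
>>>>>>> // One channel, reused: a fresh channel per RPC pays TCP + TLS +
>>>>>>> // HTTP/2 setup every time.
>>>>>>> ManagedChannel shared = ManagedChannelBuilder
>>>>>>>     .forAddress("server-host", 50051).usePlaintext().build();
>>>>>>> EchoGrpc.EchoBlockingStub stub = EchoGrpc.newBlockingStub(shared);
>>>>>>> for (int i = 0; i < 1_000; i++) {
>>>>>>>   stub.echo(EchoRequest.getDefaultInstance());  // same connection
>>>>>>> }
>>>>>>>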
>>>>>>> In our performance tests on GCE (on fairly ordinary machines, where
>>>>>>> netperf shows ~100µs) we see ~300µs latency for unary RPCs and ~225µs
>>>>>>> for streaming in Go.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Pradeep Singh
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Pradeep Singh
>>>>
>>>
>>>
>>
>>
>> --
>> Pradeep Singh
>>
>
>


-- 
Pradeep Singh
