More info is needed to figure out why this is slow. Have you used JProfiler 
or YourKit before? There are several profilers (perf works too) that can 
tell you where the CPU time is going. Also, you should consider turning on 
GC logging to see if memory is being allocated too fast.
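
For example, assuming a Java 8 JVM, the GC logging flags would be something 
like this (adjust the log path for your container):

  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/gc.log

On Java 9+ the unified-logging equivalent is -Xlog:gc*:file=/tmp/gc.log.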

Our tuned benchmarks get about 5000 qps per core, but it took profiling 
before we could get that fast. The general approach is to figure out what's 
slow, and then fix that. Without knowing what's slow for your test, it's 
hard to recommend a fix.

On Tuesday, August 28, 2018 at 2:14:31 PM UTC-7, [email protected] wrote:
>
> Hi Carl,
>
> Thanks for responding! I've tried a couple of different executors and they 
> don't seem to change the behavior: a FixedThreadPool with the number of 
> threads = # of cores * 2, ForkJoinPool.commonPool() as you recommended, and 
> the Scala global ExecutionContext, which is ultimately a ForkJoinPool as 
> well. I've set the executor on the NettyServerBuilder as well as in the 
> call to bind my service.
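>
> For reference, the fixed-pool variant looked roughly like this (a sketch; 
> the sizing is just # of cores * 2):
>
> import java.util.concurrent.Executors
>
> val fixedPool = Executors.newFixedThreadPool(
>   Runtime.getRuntime.availableProcessors() * 2)
>
> NettyServerBuilder
>   .forPort(8086)
>   .executor(fixedPool)
>   // ... rest of the builder unchanged
>   .build()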
>
> For some more information, here are the results from a 10-minute Gatling 
> test run using the commonPool. The server implementation now looks like 
> this:
>
> val realtimeServiceWithMonitoring =
>   ServerInterceptors.intercept(
>     RealtimePublishGrpc.bindService(realtimeService, ExecutionContext.global),
>     serverInterceptor)
> val rppServiceWithMonitoring = ServerInterceptors.intercept(
>   RealtimeProxyGrpc.bindService(realtimePublishProxyService, ExecutionContext.global),
>   serverInterceptor
> )
>
>
> NettyServerBuilder
>   .forPort(8086)
>   .sslContext(serverGrpcSslContexts)
>   .addService(realtimeServiceWithMonitoring)
>   .addService(batchPublishWithMonitoring)
>   .addService(rppServiceWithMonitoring)
>   .executor(ForkJoinPool.commonPool())
>   .build()
>
>
> My service implementation immediately returns Future.successful:
>
> override def publish(request: PublishRequest): Future[PublishResponse] = {
>   logger.debug("Received Publish request: " + request)
>   Future.successful(PublishResponse())
> }
>
>
> Test Results:
>
> ================================================================================
> ---- Global Information --------------------------------------------------------
> > request count                                     208686 (OK=208686  KO=0     )
> > min response time                                    165 (OK=165     KO=-     )
> > max response time                                   2997 (OK=2997    KO=-     )
> > mean response time                                   287 (OK=287     KO=-     )
> > std deviation                                        145 (OK=145     KO=-     )
> > response time 50th percentile                        232 (OK=232     KO=-     )
> > response time 75th percentile                        324 (OK=324     KO=-     )
> > response time 95th percentile                        501 (OK=501     KO=-     )
> > response time 99th percentile                        894 (OK=893     KO=-     )
> > mean requests/sec                                347.231 (OK=347.231 KO=-     )
> ---- Response Time Distribution ------------------------------------------------
> > t < 800 ms                                        206014 ( 99%)
> > 800 ms < t < 1200 ms                                1511 (  1%)
> > t > 1200 ms                                         1161 (  1%)
> > failed                                                 0 (  0%)
> ================================================================================
>
> 347 requests/sec. CPU utilization hovers between 29% and 35%. 
>
>
> Thanks for your help!
>
>
>
> On Tuesday, August 28, 2018 at 12:49:38 PM UTC-7, Carl Mastrangelo wrote:
>>
>> Can you try setting  the executor on both the channel and the server 
>> builder?   I would recommend ForkJoinPool.commonPool().
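>>
>> Roughly like this (a sketch only; your SSL and interceptor setup is 
>> omitted, and "service" stands in for your bound service):
>>
>> import java.util.concurrent.ForkJoinPool
>>
>> val server = NettyServerBuilder
>>   .forPort(8086)
>>   .executor(ForkJoinPool.commonPool())
>>   .addService(service)
>>   .build()
>>
>> val channel = NettyChannelBuilder
>>   .forAddress(address, 8086)
>>   .executor(ForkJoinPool.commonPool())
>>   .build()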
>>
>> On Monday, August 27, 2018 at 11:54:19 PM UTC-7, Kos wrote:
>>>
>>> Hi,
>>>
>>> I'm using gRPC in a new Scala service and I'm seeing unexpectedly high 
>>> CPU utilization. I see this high utilization in our production workload, but 
>>> I'm also able to reproduce it via the performance tests I'll describe below. 
>>>
>>> My setup uses grpc-netty-shaded 1.10 (but I've also repro'd with 1.14). 
>>> My performance test uses mTLS to talk to the service. The service is 
>>> deployed in a container with 6 cores and 2 GB of RAM. I've reduced my 
>>> service to immediately return a response without doing any other work, 
>>> to try to identify whether it's the application or something to do with 
>>> my gRPC configuration.
>>>
>>> My performance test issues about 250 requests a second over one 
>>> ManagedChannel to one instance of my service. The data in each request is 
>>> about 10 bytes. With this workload, my service runs at about 35% CPU, 
>>> which feels far too high for such a small amount of rps.
>>>
>>> Here is how I've constructed my server:
>>>
>>> val serverInterceptor =
>>>   MonitoringServerInterceptor.create(Configuration.allMetrics())
>>>
>>> val realtimeServiceWithMonitoring = ServerInterceptors.intercept(
>>>   RealtimePublishGrpc.bindService(realtimeService, ExecutionContext.global),
>>>   serverInterceptor)
>>> val rppServiceWithMonitoring = ServerInterceptors.intercept(
>>>   RealtimeProxyGrpc.bindService(realtimePublishProxyService, ExecutionContext.global),
>>>   serverInterceptor
>>> )
>>>
>>> val keyManagerFactory = GrpcSSLHelper.getKeyManagerFactory(sslConfig)
>>> val trustManagerFactory = GrpcSSLHelper.getTrustManagerFactory(sslConfig)
>>> val serverGrpcSslContexts =
>>>   GrpcSSLHelper.getServerSslContext(keyManagerFactory, trustManagerFactory)
>>>
>>> NettyServerBuilder
>>>   .forPort(8086)
>>>   .sslContext(serverGrpcSslContexts)
>>>   .addService(realtimeServiceWithMonitoring)
>>>   .addService(rppServiceWithMonitoring)
>>>   .build()
>>>
>>>
>>> The server interceptor is modeled after: 
>>> https://github.com/grpc-ecosystem/java-grpc-prometheus
>>>
>>> The managed channel is constructed as such:
>>>
>>> private val interceptor =
>>>   MonitoringClientInterceptor.create(Configuration.allMetrics())
>>>
>>> val trustManagerFactory = GrpcSSLHelper.getTrustManagerFactory(sslConfig)
>>>
>>> NettyChannelBuilder
>>>   .forAddress(address, 8086)
>>>   .intercept(interceptor)
>>>   .negotiationType(NegotiationType.TLS)
>>>   .sslContext(GrpcSSLHelper.getClientSslContext(keyManagerFactory, trustManagerFactory))
>>>   .build()
>>>
>>>
>>> Finally, I use non-blocking stubs to issue the gRPC request.
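>>>
>>> The call itself is roughly this (assuming the ScalaPB-generated async stub):
>>>
>>> val stub = RealtimePublishGrpc.stub(channel)
>>> val response: Future[PublishResponse] = stub.publish(PublishRequest())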
>>>
>>> Any help would be greatly appreciated. Thanks!
>>> -K
>>>
>>
