mapleFU opened a new issue #1727:
URL: https://github.com/apache/incubator-brpc/issues/1727


   **Describe the bug (描述bug)**
   
   Hi, we use brpc and bthread as our RPC framework and runtime. Our 
tasks are lightweight: a typical request reads some data in memory and 
usually finishes in ~10ms.
   
   On a 96-core CPU, we configured `106` bthread workers. When the workload 
consists of many lightweight requests (about 300K requests per second), pprof 
shows that `do_futex` takes about 25% of CPU time. Meanwhile, 
`bthread_worker_usage` is only 15-20, `bthread_signal_second` is high, and 
`bthread_count` is about 1600. Some of the pprof output is listed below:
   
   ```
   bthread::TaskGroup::end_sched 18.09%
   - steal_task 5.61%
   - sched_to 12.07%
     -  ready_to_run  11.85%
       - do_futex 11.16% (call futex_wake)
         - _raw_spin_unlock_irqrestore 10.42%  
   ```
   
   and:
   
   ```
   bthread::TaskGroup::run_main_task 14.4%
   - TaskGroup::wait_task 4.87%
     - steal_task 4.63%
   - futex_wait 8.42% 
   ```
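   
   A back-of-the-envelope check of the figures above, as a minimal sketch. It 
assumes that `bthread_worker_usage` counts busy workers (roughly, cores in use) 
and that each request maps to about one bthread; neither is confirmed here. The 
helper names are hypothetical and not part of brpc.

```cpp
// Hypothetical helpers for rough arithmetic only; not part of brpc.

// CPU time spent per request, in microseconds, assuming `busy_workers`
// is the average number of workers that are running at any moment.
constexpr double cpu_us_per_request(double busy_workers, double qps) {
    return busy_workers / qps * 1e6;
}

// Little's law: mean bthread lifetime (ms) = live bthreads / arrival rate.
// Assumes one bthread per request.
constexpr double mean_lifetime_ms(double live_bthreads, double qps) {
    return live_bthreads / qps * 1e3;
}

// Fraction of configured workers that are idle (parked or stealing).
constexpr double idle_fraction(double busy_workers, double total_workers) {
    return 1.0 - busy_workers / total_workers;
}
```

   With 300K requests per second, 15-20 busy workers, 1600 live bthreads and 
106 workers, this gives roughly 50-67 µs of CPU per request, a mean bthread 
lifetime of about 5.3 ms, and 81-86% of the workers idle at any moment. Under 
these assumptions most workers are idle most of the time, which fits with 
futex wake and wait showing up so prominently.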
   
   According to 
[this Stack Overflow question](https://stackoverflow.com/questions/14703328/spin-unlock-irqrestore-has-very-high-sampling-rate-in-my-kvm-why),
 samples land in `_raw_spin_unlock_irqrestore` because interrupts are off until 
that point. Even so, scheduling takes much more time than we expected.
   
   Our guess is that we create too many lightweight bthreads, and scheduling 
them wakes up a lot of TaskGroup workers. After changing `bthread_worker` to 60 
and restarting the server, the scheduling cost dropped a lot. But restarting all 
machines is troublesome for us, and we think that setting the worker count to 
`hardware_concurrency` should be suitable for all kinds of workloads.
   
   How can we handle this problem? As far as I can tell, bthread can only add 
workers dynamically (`add_worker`) and cannot remove spare workers, which would 
solve this problem easily. Using a `bthread pool` might reduce the signaling and 
bthread scheduling, but writing a thread pool on top of fibers is really dirty 
work.
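   
   To make the pool idea concrete, here is a minimal sketch using plain 
`std::thread` rather than bthread. `BatchingPool` and all of its members are 
hypothetical names, not brpc APIs. It shows two ways to cut wakeups: signal 
only when a worker is actually parked, and let a woken worker drain the whole 
queue.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical illustration; not part of brpc or bthread.
class BatchingPool {
public:
    explicit BatchingPool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }

    // Drains any remaining tasks, then joins all workers.
    ~BatchingPool() {
        {
            std::lock_guard<std::mutex> lk(mu_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void submit(std::function<void()> task) {
        bool wake = false;
        {
            std::lock_guard<std::mutex> lk(mu_);
            queue_.push_back(std::move(task));
            wake = idle_ > 0;  // signal only if some worker is parked
            if (wake) ++wakeups_;
        }
        if (wake) cv_.notify_one();
    }

    // Number of notify_one() calls issued by submit() so far.
    std::size_t wakeups() const { return wakeups_.load(); }

private:
    void run() {
        std::unique_lock<std::mutex> lk(mu_);
        for (;;) {
            while (queue_.empty() && !stop_) {
                ++idle_;
                cv_.wait(lk);
                --idle_;
            }
            if (queue_.empty()) return;  // stopping and nothing left to run
            std::deque<std::function<void()>> batch;
            batch.swap(queue_);          // one wakeup serves the whole backlog
            lk.unlock();
            for (auto& task : batch) task();
            lk.lock();
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> queue_;
    std::vector<std::thread> threads_;
    std::size_t idle_ = 0;   // workers currently parked; guarded by mu_
    bool stop_ = false;      // guarded by mu_
    std::atomic<std::size_t> wakeups_{0};
};
```

   The trade-off is that a worker which grabs a large batch runs it serially, 
so this favors fewer wakeups over parallelism and latency. Whether the same 
approach is practical on top of bthread is not something this sketch shows.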
   
   **To Reproduce (复现方法)**
   
   
   **Expected behavior (期望行为)**
   
   bthread can reduce the number of workers, or spends less time in `do_futex` 
when there are many lightweight tasks.
   
   
   **Versions (各种版本)**
   OS: Linux 5.4
   Compiler: g++ 830
   brpc: 0.9.6
   protobuf: We use thrift 0.9
   
   **Additional context/screenshots (更多上下文/截图)**
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
