Hi All,

During an httpd performance evaluation on Alibaba Cloud instances, I found that httpd performance improved significantly after using "taskset" to set CPU affinity for the httpd processes/threads, because it reduced the number of CPU migrations. Performance improved by 60% on the Arm instance g8y.2xlarge (8 vCPUs, 32 GiB memory, 40 GB ESSD) and by 20% on the x86 instance g7.2xlarge (8 vCPUs, 32 GiB memory, 40 GB ESSD).

Test case: run httpd with the event MPM on g8y.2xlarge or g7.2xlarge, and run the traffic generator/benchmark 'wrk' on g8y.4xlarge (16 vCPUs, 32 GiB memory, 40 GB ESSD). The wrk command is:

wrk -t 32 -c 1000 -d 30 --latency http://$ServerIP

mpm event parameters:

<IfModule mpm_event_module>
    StartServers 8
    ServerLimit 100
    ThreadLimit 2000
    MinSpareThreads 75
    MaxSpareThreads 2000
    ThreadsPerChild 125
    MaxRequestWorkers 2000
</IfModule>

httpd has no parameters to support CPU affinity, so I used "taskset" for the optimization. After analyzing the source code, I made a prototype of an affinity solution (it calls an affinity-setting function, ap_setaffinity in the patch below, when a worker/listener thread is created). We can observe the same improvement with this solution.

However, this prototype only works for the specific event MPM configuration above on an 8-core server, because it uses process_slot directly as the CPU number. I think the current mechanism also needs to be modified so that the affinity adapts dynamically to the perceived load, and new parameters need to be added for the affinity setting.

I created a ticket on Bugzilla, and Christophe JAILLET suggested discussing it on the dev mailing list. I am not an httpd developer; I hope the experts can evaluate this request and add a CPU affinity function in a future version. If you have any comments, please let me know.

Bugzilla ticket link: https://bz.apache.org/bugzilla/show_bug.cgi?id=66424

Prototype patch (based on version 2.4.37) as below:

diff --git a/server/mpm/event/event.c b/server/mpm/event/event.c
index ffe8a23cbd..d23d115fff 100644
--- a/server/mpm/event/event.c
+++ b/server/mpm/event/event.c
@@ -1586,6 +1586,8 @@ static void * APR_THREAD_FUNC listener_thread(apr_thread_t * thd, void *dummy)
     int have_idle_worker = 0;
     apr_time_t last_log;

+    ap_setaffinity(process_slot);
+
     last_log = apr_time_now();
     free(ti);

@@ -1998,6 +2000,8 @@ static void *APR_THREAD_FUNC worker_thread(apr_thread_t * thd, void *dummy)
     apr_status_t rv;
     int is_idle = 0;

+    ap_setaffinity(process_slot);
+
     free(ti);

     ap_scoreboard_image->servers[process_slot][thread_slot].pid = ap_my_pid;
@@ -2456,6 +2460,8 @@ static void child_main(int child_num_arg, int child_bucket)
     apr_thread_t *start_thread_id;
     int i;

+    ap_setaffinity(process_slot);
+
     /* for benefit of any hooks that run as this child initializes */
     retained->mpm->mpm_state = AP_MPMQ_STARTING;

@@ -3862,6 +3868,17 @@ static const char *set_worker_factor(cmd_parms * cmd, void *dummy,
     return NULL;
 }

+void ap_setaffinity(int cpu_affinity)
+{
+    cpu_set_t mask;
+
+    CPU_ZERO(&mask);
+    CPU_SET(cpu_affinity, &mask);
+
+    sched_setaffinity(0, sizeof(cpu_set_t), &mask);
+
+    printf("set thread_id=%d CPU affinity to Core %d\n", gettid(), cpu_affinity);
+}
 static const command_rec event_cmds[] = {
     LISTEN_COMMANDS,
--
Thanks & Best Regards
Martin Ma
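
P.S. One possible way to remove the 8-core assumption, as a rough sketch only and not a tested httpd change: instead of using process_slot directly as the CPU number, choose the CPU from the set the process is currently allowed to run on, wrapping around when there are more slots than CPUs, and check the return values. The helper names pick_cpu_for_slot and pin_calling_thread are hypothetical (they are not httpd or APR functions), and the code assumes Linux with glibc.

```c
#define _GNU_SOURCE          /* for sched_*affinity() and the CPU_* macros */
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Hypothetical helper (not an httpd API): return the CPU number that a
 * given slot maps to, taking the (slot % N)-th CPU of the N CPUs present
 * in `allowed`. Returns -1 if the slot is negative or the set is empty. */
static int pick_cpu_for_slot(const cpu_set_t *allowed, int slot)
{
    int count = CPU_COUNT(allowed);
    int target, cpu;

    if (slot < 0 || count <= 0)
        return -1;

    target = slot % count;   /* wrap around when slots outnumber CPUs */
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, allowed))
            continue;
        if (target == 0)
            return cpu;
        target--;
    }
    return -1;
}

/* Hypothetical helper: pin the calling thread to the CPU chosen for
 * `slot`. Returns 0 on success, or an errno value on failure. */
static int pin_calling_thread(int slot)
{
    cpu_set_t allowed, mask;
    int cpu;

    /* pid 0 means "the calling thread" for both calls on Linux */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return errno;

    cpu = pick_cpu_for_slot(&allowed, slot);
    if (cpu < 0)
        return EINVAL;
    assert(cpu < CPU_SETSIZE);

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return errno;
    return 0;
}
```

In the prototype this would stand in for ap_setaffinity(process_slot), with a nonzero return value logged rather than ignored. Note that threads inherit the affinity mask of the thread that created them, so once a child process has been pinned, a later call from one of its threads only sees that single CPU and keeps it.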