Le 12/01/2018 à 18:51, Willy Tarreau a écrit :
On Fri, Jan 12, 2018 at 11:06:32AM -0600, Samuel Reed wrote:
On 1.8-git, similar results on the new process:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
93.75 0.265450 15 17805 epoll_wait
4.85 0.013730 49 283 write
1.40 0.003960 15 266 12 recvfrom
0.01 0.000018 0 42 12 read
0.00 0.000000 0 28 close
0.00 0.000000 0 12 socket
0.00 0.000000 0 12 12 connect
0.00 0.000000 0 19 1 sendto
0.00 0.000000 0 12 sendmsg
0.00 0.000000 0 6 shutdown
0.00 0.000000 0 35 setsockopt
0.00 0.000000 0 7 getsockopt
0.00 0.000000 0 12 fcntl
0.00 0.000000 0 13 epoll_ctl
0.00 0.000000 0 2 2 accept4
------ ----------- ----------- --------- --------- ----------------
100.00 0.283158 18554 39 total
Cursory look through the strace output looks the same, with the same
three types as in the last email, including the cascade.
OK thank you for testing. On Monday we'll study this with Christopher.
Hi Samuel,
Here are 2 patches that may solve your problem. Idea is to set the
poller timeout to 0 for a specific thread only when some processing are
expected for this thread. The job was already done for tasks and applets
using bitfields. Now, we do the same for the FDs.
For now, we don't know if it aims your problem, but it should avoid a
thread to loop for nothing. Could you check if it works please ?
--
Christopher Faulet
>From 8a6558841bdfdea8835275f531e0d5a1c84046ca Mon Sep 17 00:00:00 2001
From: Christopher Faulet <[email protected]>
Date: Mon, 15 Jan 2018 11:57:03 +0100
Subject: [PATCH 1/2] MAJOR: fd: Use a bitfield to know if there are FDs for a
thread in the FD cache
A bitfield has been added to know if there are some FDs processable by a
specific thread in the FD cache. When a FD is inserted in the FD cache, the bits
corresponding to its thread_mask are set. On each thread, the bitfield is
updated when the FD cache is processed. If there is no FD processed, the thread
is removed from the bitfield by unsetting its tid_bit.
Note that this bitfield is updated but not checked in
fd_process_cached_events. So, when this function is called, the FDs cache is
always processed.
---
include/proto/fd.h | 2 ++
src/fd.c | 7 +++----
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/proto/fd.h b/include/proto/fd.h
index 8cc191f75..44370e768 100644
--- a/include/proto/fd.h
+++ b/include/proto/fd.h
@@ -35,6 +35,7 @@
extern unsigned int *fd_cache; // FD events cache
extern int fd_cache_num; // number of events in the cache
+extern unsigned long fd_cache_mask; // Mask of threads with events in the cache
extern THREAD_LOCAL int *fd_updt; // FD updates list
extern THREAD_LOCAL int fd_nbupdt; // number of updates in the list
@@ -115,6 +116,7 @@ static inline void fd_alloc_cache_entry(const int fd)
if (fdtab[fd].cache)
goto end;
fd_cache_num++;
+ fd_cache_mask |= fdtab[fd].thread_mask;
fdtab[fd].cache = fd_cache_num;
fd_cache[fd_cache_num-1] = fd;
end:
diff --git a/src/fd.c b/src/fd.c
index 9fb09ab71..3e488a951 100644
--- a/src/fd.c
+++ b/src/fd.c
@@ -170,6 +170,7 @@ int nbpollers = 0;
unsigned int *fd_cache = NULL; // FD events cache
int fd_cache_num = 0; // number of events in the cache
+unsigned long fd_cache_mask = 0; // Mask of threads with events in the cache
THREAD_LOCAL int *fd_updt = NULL; // FD updates list
THREAD_LOCAL int fd_nbupdt = 0; // number of updates in the list
@@ -236,18 +237,16 @@ void fd_process_cached_events()
{
int fd, entry, e;
- if (!fd_cache_num)
- return;
-
HA_RWLOCK_RDLOCK(FDCACHE_LOCK, &fdcache_lock);
+ fd_cache_mask &= ~tid_bit;
for (entry = 0; entry < fd_cache_num; ) {
fd = fd_cache[entry];
if (!(fdtab[fd].thread_mask & tid_bit))
goto next;
+ fd_cache_mask |= tid_bit;
if (HA_SPIN_TRYLOCK(FD_LOCK, &fdtab[fd].lock))
goto next;
-
HA_RWLOCK_RDUNLOCK(FDCACHE_LOCK, &fdcache_lock);
e = fdtab[fd].state;
--
2.13.6
>From 3226285d8b18ea7a11f2a63289f0dcf4e085569e Mon Sep 17 00:00:00 2001
From: Christopher Faulet <[email protected]>
Date: Mon, 15 Jan 2018 12:16:34 +0100
Subject: [PATCH 2/2] MAJOR: polling: Use fd_cache_mask instead of fd_cache_num
fd_cache_num is the number of FDs in the FD cache. It is a global variable. So
it is underoptimized because we may be lead to consider there are waiting FDs
for the current thread in the FD cache while in fact all FDs are assigned to the
other threads. So, in such cases, the polling loop will be evaluated many more
times than necessary.
Instead, we now check if the thread id is set in the bitfield fd_cache_mask.
---
src/haproxy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/haproxy.c b/src/haproxy.c
index 20b18f854..2bbc47574 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2389,7 +2389,7 @@ static void run_poll_loop()
break;
/* expire immediately if events are pending */
- if (fd_cache_num || (active_tasks_mask & tid_bit) || signal_queue_len || (active_applets_mask & tid_bit))
+ if (((fd_cache_mask|active_tasks_mask|active_applets_mask) & tid_bit) || signal_queue_len)
next = now_ms;
/* The poller will ensure it returns around <next> */
--
2.13.6