On 02/12/2017 at 08:23, Максим Куприянов wrote:
Hi!

Tonight all of my haproxy 1.8.0 instances stopped answering. They didn't forward traffic and didn't even answer over the socket. They're compiled with threads, but threads are not enabled in their configs (no nbthread option). All of them are stuck in the same place:
# strace -f -p 831919
Process 831919 attached
write(2, "S", 1
Here's some debug stuff (from 1-threaded instance):
(gdb) bt
#0  0x00007fef9bd2a330 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000558dea62275b in thread_want_sync () at src/hathreads.c:74
#2  0x0000558dea58f548 in srv_register_update (srv=srv@entry=0x558ded691e30) at src/server.c:2596
#3  0x0000558dea5922f7 in server_recalc_eweight (sv=sv@entry=0x558ded691e30) at src/server.c:1151
#4  0x0000558dea5c3028 in server_warmup (t=0x558def513120) at src/checks.c:1448
#5  0x0000558dea619216 in process_runnable_tasks () at src/task.c:229
#6  0x0000558dea5cf237 in run_poll_loop () at src/haproxy.c:2326
#7  run_thread_poll_loop (data=<optimized out>) at src/haproxy.c:2375
#8  0x0000558dea53b6fe in main (argc=<optimized out>, argv=0x7ffe03a880d8) at src/haproxy.c:2910


Hi,

Thanks for your detailed report. There is a bug in the sync-point when the same thread requests a synchronization several times. It is also easier to hit this bug with only one thread.

Could you check the attached patch? It should fix the bug.

--
Christopher Faulet
From b8475f5bf9098b667fabada7b88de33c62b42c35 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Sat, 2 Dec 2017 09:53:24 +0100
Subject: [PATCH] BUG/MAJOR: thread: Be sure to request a sync between threads
 only once at a time

The first thread requesting a synchronization is responsible for writing to the
"sync" pipe to notify all the others. But we must write to the pipe only once
between two synchronizations, so that it holds exactly one character. This is
important because we read only 1 character in return when the last thread exits
the sync-point.

There is a bug here. If two threads request a synchronization, only the first
one writes to the pipe. But if the same thread requests a synchronization
several times before entering the sync-point (because, for instance, it
detects many servers down), it writes that many characters to the pipe, and
only one of them will be read. Repeating this many times blocks HAProxy
on the write because the pipe is full.

To fix the bug, we just check if the current thread has already requested a
synchronization before trying to notify all others.

The patch must be backported to 1.8.
---
 src/hathreads.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/hathreads.c b/src/hathreads.c
index eb9cd3fce..50f3c7701 100644
--- a/src/hathreads.c
+++ b/src/hathreads.c
@@ -70,6 +70,8 @@ void thread_sync_enable(void)
 void thread_want_sync()
 {
 	if (all_threads_mask) {
+		if (threads_want_sync & tid_bit)
+			return;
 		if (HA_ATOMIC_OR(&threads_want_sync, tid_bit) == tid_bit)
 			shut_your_big_mouth_gcc(write(threads_sync_pipe[1], "S", 1));
 	}
-- 
2.13.6
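
For anyone who wants to see the effect outside HAProxy, here is a minimal standalone sketch of the mechanism described in the commit message. It is not HAProxy code: want_sync, sync_pipe, request_sync() and leftover_after_sync() are hypothetical names made up for this illustration, the mask is updated without atomics because the demo is single-threaded, and the pipe is set non-blocking so the demo cannot hang the way the real process did. It issues several sync requests from the same thread, consumes one byte as the sync-point exit would, and counts what is left in the pipe.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-ins for the real globals; illustration only. */
static unsigned long want_sync; /* bitmask of threads that requested a sync */
static int sync_pipe[2];        /* [0] = read end, [1] = write end */

/* Request a sync for the thread identified by `bit`.
 * With `guarded` set, a thread that already requested a sync returns
 * early, which is the idea behind the patch.
 * Returns the number of bytes written to the pipe (0 or 1). */
static int request_sync(unsigned long bit, int guarded)
{
    if (guarded && (want_sync & bit))
        return 0;
    want_sync |= bit;     /* single-threaded demo, so no atomic OR here */
    if (want_sync == bit) /* no other thread has asked yet */
        return write(sync_pipe[1], "S", 1) == 1 ? 1 : 0;
    return 0;
}

/* Issue `n` requests from one thread, leave the sync point once (read a
 * single byte and clear the mask), then count the bytes still queued.
 * Returns that count, or -1 if the pipe could not be created. */
int leftover_after_sync(int n, int guarded)
{
    char c;
    int left = 0;

    want_sync = 0;
    if (pipe(sync_pipe) != 0)
        return -1;
    /* Non-blocking on both ends: an empty read or a full write fails
     * immediately instead of blocking. */
    fcntl(sync_pipe[0], F_SETFL, O_NONBLOCK);
    fcntl(sync_pipe[1], F_SETFL, O_NONBLOCK);

    for (int i = 0; i < n; i++)
        request_sync(1UL, guarded);

    /* Sync-point exit: exactly one byte is consumed. */
    if (read(sync_pipe[0], &c, 1) != 1)
        assert(n == 0);
    want_sync = 0;

    /* Drain and count whatever stayed behind. */
    while (read(sync_pipe[0], &c, 1) == 1)
        left++;

    close(sync_pipe[0]);
    close(sync_pipe[1]);
    return left;
}
```

Calling leftover_after_sync(5, 0) models the unpatched behaviour: each pass through the sync point leaves bytes behind, so a blocking pipe eventually fills up. Calling leftover_after_sync(5, 1) models the early return added by the patch, and the pipe ends up empty.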
