[ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5

2017-03-15 Thread 许雪寒
Hi, sir.

I'm sorry, I made a mistake; the fix that you provided should be the one we
need. Is it safe for us to simply "git cherry-pick" that commit onto our 0.94.5
version?

So sorry for my mistake.
Thank you.

On Wed, Jan 11, 2017 at 3:59 PM, 许雪寒  wrote:
> In our test, when one machine is under heavy packet loss, OSDs on
> other machines can be brought down, and sometimes more than one OSD goes
> down, because they may try to read a message that comes from that
> machine. So we are really concerned; please help us. Thanks
>
>
> -Original Message-
> From: 许雪寒
> Sent: January 11, 2017 15:13
> To: 'Sage Weil'
> Cc: ceph-de...@vger.kernel.org
> Subject: Re: Pipe "deadlock" in Hammer, 0.94.5
>
> Thanks for your reply, sir:-)
>
> Actually, this case is not very rare in our test. When iptables drops IP
> packets with a probability of around 98%~99%, this case occurs roughly once
> in every three runs of our test.
> I checked #14120 as you recommended; however, it doesn't seem to be our
> problem, because, as http://tracker.ceph.com/issues/18184 says, #14120 is
> caused by a commit that changed Pipe::tcp_read_wait() to return -errno
> instead of "-1", and that commit is not applied in the hammer version we
> tested (0.94.5).
> And I truly agree with you that the invocation of the "recv" function passes
> the MSG_DONTWAIT flag, so it shouldn't block. However, every time we
> encounter this problem, it is always the Pipe::reader_thread calling recv
> that holds the lock when the suicide happens, which is really confusing.
>
> Please help us, thank you:-)
>
>
>
> -Original Message-
> From: Sage Weil [mailto:s...@newdream.net]
> Sent: January 10, 2017 21:12
> To: 许雪寒
> Cc: ceph-de...@vger.kernel.org
> Subject: Re: Pipe "deadlock" in Hammer, 0.94.5
>
> On Tue, 10 Jan 2017, 许雪寒 wrote:
>> Hi, everyone.
>>
>> Recently, we did some experiments to test the stability of the Ceph cluster.
>> We used the Hammer version, which is the most widely used version in our
>> online clusters. One of the scenarios that we simulated is poor network
>> connectivity, in which we used iptables to drop TCP/IP packets with some
>> probability. And sometimes, we can see that some OSDs suicide.
>>
>> We used gdb to debug the core dumped by Linux. We found that the thread that
>> hit the suicide time threshold is a peering thread that is trying to send a
>> pg_notify message; the ceph-osd log file and gdb output are as follows:
>>
>> Log file:
>> -3> 2017-01-10 17:02:13.469949 7fd446ff7700  1 heartbeat_map 
>> is_healthy 'OSD::osd_tp thread 0x7fd440bed700' had timed out after 15
>> -2> 2017-01-10 17:02:13.469952 7fd446ff7700  1 heartbeat_map 
>> is_healthy 'OSD::osd_tp thread 0x7fd440bed700' had suicide timed out 
>> after 150
>> -1> 2017-01-10 17:02:13.469954 7fd4451f4700  1 --
>> 10.160.132.157:6818/10014122 <== osd.20 10.160.132.156:0/24908 163 
>>  osd_ping(ping e4030 stamp 2017-01-10 17:02:13.450374) v2 
>> 47+0+0 (3247646131 0 0) 0x7fd418ca8600 con 0x7fd413c89700
>>  0> 2017-01-10 17:02:13.496895 7fd446ff7700 -1 error_msg
>> common/HeartbeatMap.cc: In function 'bool 
>> ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, 
>> time_t)' thread 7fd446ff7700 time 2017-01-10 17:02:13.469969
>> common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")
>>
>> GDB OUTPUT:
>> (gdb) thread 8
>> [Switching to thread 8 (Thread 0x7fd440bed700 (LWP 15302))]#0
>> 0x003c5d80e334 in __lll_lock_wait () from /lib64/libpthread.so.0
>> (gdb) bt
>> #0  0x003c5d80e334 in __lll_lock_wait () from
>> /lib64/libpthread.so.0
>> #1  0x003c5d8095d8 in _L_lock_854 () from /lib64/libpthread.so.0
>> #2  0x003c5d8094a7 in pthread_mutex_lock () from
>> /lib64/libpthread.so.0
>> #3  0x01a54ae4 in Mutex::Lock (this=0x7fd426453598,
>> no_lockdep=false) at common/Mutex.cc:96
>> #4  0x01409285 in Mutex::Locker::Locker (this=0x7fd440beb6c0,
>> m=...) at common/Mutex.h:115
>> #5  0x01c46446 in PipeConnection::try_get_pipe 
>> (this=0x7fd426453580, p=0x7fd440beb908) at
>> msg/simple/PipeConnection.cc:38
>> #6  0x01c05809 in SimpleMessenger::submit_message 
>> (this=0x7fd482029400, m=0x7fd425538d00, con=0x7fd426453580, 
>> dest_addr=..., dest_type=4, already_locked=false) at
>> msg/simple/SimpleMessenger.cc:443
>> #7  0x01c033fa in SimpleMessenger::_send_message 
>> (this=0x7fd482029400, m=0x7fd425538d00, con=0x7fd426453580) at
>> msg/simple/SimpleMessenger.cc:136
>> #8  0x01c467c7 in SimpleMessenger::send_message 
>> (this=0x7fd482029400, m=0x7fd425538d00, con=0x7fd426453580) at
>> msg/simple/SimpleMessenger.h:139
>> #9  0x01c466a1 in PipeConnection::send_message 
>> (this=0x7fd426453580, m=0x7fd425538d00) at
>> msg/simple/PipeConnection.cc:78
>> #10 0x013b3ff2 in OSDService::send_map (this=0x7fd4821e76c8, 
>> m=0x7fd425538d00, con=0x7fd426453580) at osd/OSD.cc:1054
>> #11 0x013b45e7 in OSDService::send_incremental_map 
>> (this=0x7fd4821e76c8, since=4028, 

Re: [ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5

2017-01-13 Thread Gregory Farnum
On Thu, Jan 12, 2017 at 7:58 PM, 许雪寒  wrote:

> Thank you for your continuous help☺.
>
>
>
> We are using the hammer 0.94.5 version, and what I read is that version of the
> source code.
>
> However, on the other hand, if Pipe::do_recv does act as blocked, is it
> reasonable for the Pipe::reader_thread to block threads calling
> SimpleMessenger::submit_message by holding Connection::lock?
>
>
>
> I think maybe a different mutex should be used in Pipe::read_message
> rather than Connection::lock.
>
>
I don't think it does use that lock. Pipe::read_message() is generally
called while the pipe_lock is held, but not Connection::lock. (They are
separate.)
I haven't dug into the relevant OSD code in a while, but I think it's a lot
more likely your OSD is just overloaded and is taking a while to send a lot
of different messages, and that the loop it's in doesn't update the
HeartbeatMap or something. Did you check that?
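
For context, the suicide path works roughly like the sketch below (hypothetical
shape and names, not the OSD source): each worker thread keeps pushing a
per-thread deadline forward while it makes progress, and a separate checker
asserts once the suicide grace has elapsed. A thread stuck in one long send
loop that never refreshes its handle is indistinguishable from a hung thread.

#include <chrono>
#include <cassert>

using Clock = std::chrono::steady_clock;

// One handle per worker thread (cf. the heartbeat_handle_d in the assert above).
struct HeartbeatHandle {
  Clock::time_point suicide_deadline = Clock::time_point::max();
};

// The worker calls this each time it picks up a new unit of work.
void touch(HeartbeatHandle &h, std::chrono::seconds suicide_grace) {
  h.suicide_deadline = Clock::now() + suicide_grace;
}

// The periodic health check (cf. heartbeat_map is_healthy in the log):
// once the deadline has passed, the daemon aborts itself.
void check_suicide(const HeartbeatHandle &h) {
  assert(Clock::now() < h.suicide_deadline && "hit suicide timeout");
}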
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5

2017-01-12 Thread 许雪寒
Thank you for your continuous help☺.

We are using the hammer 0.94.5 version, and what I read is that version of the
source code.
However, on the other hand, if Pipe::do_recv does act as blocked, is it
reasonable for the Pipe::reader_thread to block threads calling
SimpleMessenger::submit_message by holding Connection::lock?

I think maybe a different mutex should be used in Pipe::read_message rather 
than Connection::lock.
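
To make the concern concrete, here is a minimal sketch of the contention being
described, with hypothetical names (this is not the Ceph code): one thread
holds a per-connection mutex across a potentially slow socket read, while
another thread only needs that mutex for a quick pipe lookup before sending.

#include <mutex>

struct Conn {
  std::mutex lock;          // plays the role of Connection::lock
  void *pipe = nullptr;     // plays the role of the Pipe pointer
};

// Reader side: if the mutex is held across a read that stalls on a lossy
// link, every sender below stalls with it.
void reader_side(Conn &c) {
  std::lock_guard<std::mutex> l(c.lock);
  // slow_recv(...);        // hypothetical: the potentially slow part
}

// Sender side (cf. submit_message -> try_get_pipe): all it needs is the
// pointer, but it must wait for the reader to release the mutex first, and
// the calling thread stops making progress (and heartbeating) while it waits.
void sender_side(Conn &c) {
  std::lock_guard<std::mutex> l(c.lock);
  void *p = c.pipe;
  (void)p;
}

Whether splitting the mutex is actually safe depends on what else
Connection::lock protects, which is why the question above matters.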

From: jiajia zhong [mailto:zhong2p...@gmail.com]
Sent: January 13, 2017 11:50
To: 许雪寒
Cc: ceph-users@lists.ceph.com
Subject: Re: Re: [ceph-users] Pipe "deadlock" in Hammer, 0.94.5

Yes, but that depends.

That might have changed on the master branch.

2017-01-13 10:47 GMT+08:00 许雪寒 <xuxue...@360.cn>:
Thanks for your reply☺

Indeed, Pipe::do_recv would act just as blocked when errno is EAGAIN; however,
in the Pipe::read_message method, it first checks whether there is a pending
msg on the socket via "Pipe::tcp_read_wait". So, I think, when Pipe::do_recv is
called, it shouldn't get an EAGAIN, which means it wouldn't act as blocked. Is
this so?
This really confuses me.
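
For reference, a sketch of the pattern being discussed (assumed shape, not the
actual hammer code): a single readability check followed by several
non-blocking reads. The first recv() usually succeeds because of the check, but
the later reads of the same message were never polled for, so they can still
see EAGAIN if the remaining bytes are delayed, which is exactly what heavy
packet loss produces.

#include <sys/socket.h>
#include <poll.h>
#include <cerrno>
#include <cstddef>

bool read_one_message(int sd)
{
  struct pollfd pfd = { sd, POLLIN, 0 };
  if (::poll(&pfd, 1, 1000 /* ms */) <= 0)
    return false;                                   // nothing pending

  char header[64];
  if (::recv(sd, header, sizeof(header), MSG_DONTWAIT) <= 0)
    return false;                                   // readable, usually succeeds

  char payload[4096];
  size_t off = 0;
  while (off < sizeof(payload)) {
    ssize_t got = ::recv(sd, payload + off, sizeof(payload) - off, MSG_DONTWAIT);
    if (got > 0) {
      off += (size_t)got;
    } else if (got < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
      continue;                                     // retrying here looks "blocked"
    } else {
      return false;                                 // error or peer closed
    }
  }
  return true;
}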


From: jiajia zhong [mailto:zhong2p...@gmail.com]
Sent: January 12, 2017 18:22
To: 许雪寒
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Pipe "deadlock" in Hammer, 0.94.5

If errno is EAGAIN for recv, then Pipe::do_recv just acts as blocked.
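
Roughly speaking, the behaviour being described looks like the following sketch
(illustrative only, not the hammer source): the socket read itself is
non-blocking, but retrying until EAGAIN goes away makes the caller wait just as
if the call were blocking.

#include <sys/socket.h>
#include <poll.h>
#include <cerrno>
#include <cstddef>

// Illustrative helper: recv() with MSG_DONTWAIT never blocks in the kernel,
// but looping until EAGAIN goes away means the caller still waits for the
// peer, which on a lossy link can be a very long time.
ssize_t recv_acting_as_blocked(int sd, char *buf, size_t len)
{
  while (true) {
    ssize_t got = ::recv(sd, buf, len, MSG_DONTWAIT);
    if (got >= 0)
      return got;                              // data, or 0 on orderly close
    if (errno != EAGAIN && errno != EWOULDBLOCK)
      return -errno;                           // a real error
    struct pollfd pfd = { sd, POLLIN, 0 };
    ::poll(&pfd, 1, -1);                       // wait for readability, retry
  }
}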

2017-01-12 16:34 GMT+08:00 许雪寒 <xuxue...@360.cn>:
Hi, everyone.

Recently, we did some experiments to test the stability of the Ceph cluster. We
used the Hammer version, which is the most widely used version in our online
clusters. One of the scenarios that we simulated is poor network connectivity,
in which we used iptables to drop TCP/IP packets with some probability. And
sometimes, we can see the following phenomenon: while one machine is running
iptables to drop packets going in and out, OSDs on other machines can be
brought down, and sometimes more than one OSD.

We used gdb to debug the core dumped by Linux. We found that the thread that
hit the suicide time threshold is a peering thread that is trying to send a
pg_notify message; the ceph-osd log file and gdb output are as follows:

Log file:
-3> 2017-01-10 17:02:13.469949 7fd446ff7700  1 heartbeat_map is_healthy 
'OSD::osd_tp thread 0x7fd440bed700' had timed out after 15
-2> 2017-01-10 17:02:13.469952 7fd446ff7700  1 heartbeat_map is_healthy 
'OSD::osd_tp thread 0x7fd440bed700' had suicide timed out after 150
-1> 2017-01-10 17:02:13.469954 7fd4451f4700  1 -- 
10.160.132.157:6818/10014122 <== osd.20 
10.160.132.156:0/24908 163  osd_ping(ping 
e4030 stamp 2017-01-10 17:02:13.450374) v2  47+0+0 (3247646131 0 0) 
0x7fd418ca8600 con 0x7fd413c89700
 0> 2017-01-10 17:02:13.496895 7fd446ff7700 -1 error_msg 
common/HeartbeatMap.cc: In function 'bool 
ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)' 
thread 7fd446ff7700 time 2017-01-10 17:02:13.469969
common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")

GDB OUTPUT:
(gdb) thread 8
[Switching to thread 8 (Thread 0x7fd440bed700 (LWP 15302))]#0  
0x003c5d80e334 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x003c5d80e334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x003c5d8095d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x003c5d8094a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x01a54ae4 in Mutex::Lock (this=0x7fd426453598, no_lockdep=false) 
at common/Mutex.cc:96
#4  0x01409285 in Mutex::Locker::Locker (this=0x7fd440beb6c0, m=...) at 
common/Mutex.h:115
#5  0x01c46446 in PipeConnection::try_get_pipe (this=0x7fd426453580, 
p=0x7fd440beb908) at msg/simple/PipeConnection.cc:38
#6  0x01c05809 in SimpleMessenger::submit_message (this=0x7fd482029400, 
m=0x7fd425538d00, con=0x7fd426453580, dest_addr=..., dest_type=4, 
already_locked=false) at msg/simple/SimpleMessenger.cc:443
#7  0x01c033fa in SimpleMessenger::_send_message (this=0x7fd482029400, 
m=0x7fd425538d00, con=0x7fd426453580) at msg/simple/SimpleMessenger.cc:136
#8  0x01c467c7 in SimpleMessenger::send_message (this=0x7fd482029400, 
m=0x7fd425538d00, con=0x7fd426453580) at msg/simple/SimpleMessenger.h:139
#9  0x01c466a1 in PipeConnection::send_message (this=0x7fd426453580, 
m=0x7fd425538d00) at msg/simple/PipeConnection.cc:78
#10 0x013b3ff2 in OSDService::send_map (this=0x7fd4821e76c8, 
m=0x7fd425538d00, con=0x7fd426453580) at osd/OSD.cc:1054
#11 0x013b45e7 in OSDService::send_incremental_map 
(this=0x7fd4821e76c8, since=4028, con=0x7fd426453580, 
osdmap=std::tr1::shared_ptr (count 49) 0x7fd426c0f480) at osd/OSD.cc:1087
#12 0x013b215f in OSDService::share_map_peer (this=0x7fd4821e76c8, 
peer=9, c