[CCing Greg and ceph-devel]
On Wed, 7 May 2014, Guang Yang wrote:
> Hi Sage,
> Sorry to bother you directly, I am debugging / fixing issue
> http://tracker.ceph.com/issues/8232, during which time I studied the
> messenger component of CEPH. With more understanding of the messenger
> component, I started getting confused by the fix of issue 6992
> (http://tracker.ceph.com/issues/6992) in terms of how it could help to solve
> the problem (though I fully agree we should stop accepter first and then
> clear all PIPEs).
>
> Looking at the logs posted along with the issue 6992, the failure happened
> at pipe::writter::connect side (it should be a brand new connect instead of
> a re-connect as cs = 0, pgs = 0) after a rebind, and the failed one already
> has the updated local address, it is confused to me how the connection could
> be established? As there is connect_seq check at the remote side which is
> likely to fail for this connection attempt (which is a positive value),
> unless there is some race at remote side updating connet_seq and in_seq.
>
> Am I missing something obviously on this?
Honestly I haven't looked closely at that old log; I would focus on the
new log.
Looking at it now (for the first time, sorry), the last line is
-2> 2014-05-04 06:16:07.957897 7f71063ee700 2 --
10.193.207.180:6884/1037605 >> 10.193.207.183:6958/5001307 pipe(0x1cfa1400
sd=132 :60749 s=1 pgs=0 cs=0 l=0 c=0x12b4e160). got newly_acked_seq 10 vs
out_seq 0
If I'm reading it right, that's an outgoing connection with the same
source as the mark_down_all. If it existed before the mark_down_all,
something is really broken because mark_dwon_all should have set it to
STATE_CLOSED and the connect() function checks the state. Which makes me
think that it was initiated after.
My guess is that this is a race in OSD.cc. We do the rebind() stuff, but
only a bit further down do consume_map() which publishes the map to the
OSDService with all kind of complicated handoff. I'm forgetting right now
how this is supposed to work, but my guess is that this is the heart of
the problem: some random PG thread is trying to send to an OSD using the
older map and grabs the older OSDMap ref and opens the connection
*after* we do the rebind() and mark_down_all().
Does this sound plausible?
sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html