Re: [ceph-users] Ceph MDS WRN replayed op client.$id

Eugen Block Wed, 19 Sep 2018 05:08:47 -0700

Yeah, since we haven't knowingly done anything about it, it would be a
(pleasant) surprise if it was accidentally resolved in mimic ;-)


Too bad ;-)
Thanks for your help!

Eugen

Quoting John Spray <[email protected]>:

On Wed, Sep 19, 2018 at 10:37 AM Eugen Block <[email protected]> wrote:


Hi John,

> I'm not 100% sure of that.  It could be that there's a path through
> the code that's healthy, but just wasn't anticipated at the point that
> warning message was added.  I wish I had a more unambiguous response
> to give!

then I guess we'll just keep ignoring these warnings from the replay
mds until we hit a real issue. ;-)

It's probably impossible to predict any improvement on this with mimic, right?


Yeah, since we haven't knowingly done anything about it, it would be a
(pleasant) surprise if it was accidentally resolved in mimic ;-)

John

Regards,
Eugen


Quoting John Spray <[email protected]>:

> On Mon, Sep 17, 2018 at 2:49 PM Eugen Block <[email protected]> wrote:
>>
>> Hi,
>>
>> from your response I understand that these messages are not expected
>> if everything is healthy.
>
> I'm not 100% sure of that.  It could be that there's a path through
> the code that's healthy, but just wasn't anticipated at the point that
> warning message was added.  I wish I had a more unambiguous response
> to give!
>
> John
>
>> We face them every now and then, three or four times a week, but
>> there's no real connection to specific jobs or a high load in our
>> cluster. It's a Luminous cluster (12.2.7) with 1 active, 1
>> standby-replay and 1 standby MDS.
>> Since it's only the replay server reporting this and the failover
>> works fine we didn't really bother. But what can we do to prevent this
>> from happening? The messages appear quite randomly, so I don't really
>> know when to increase the debug log level.
>>
>> Any hint would be highly appreciated!
>>
>> Regards,
>> Eugen
>>
>>
>> Quoting John Spray <[email protected]>:
>>
>> > On Thu, Sep 13, 2018 at 11:01 AM Stefan Kooman <[email protected]> wrote:
>> >>
>> >> Hi John,
>> >>
>> >> Quoting John Spray ([email protected]):
>> >>
>> >> > On Wed, Sep 12, 2018 at 2:59 PM Stefan Kooman <[email protected]> wrote:
>> >> >
>> >> > When replaying a journal (either on MDS startup or on a standby-replay
>> >> > MDS), the replayed file creation operations are being checked for
>> >> > consistency with the state of the replayed client sessions.  Client
>> >> > sessions have a "preallocated_inos" list that contains a set of inode
>> >> > numbers they should be using to create new files.
>> >> >
>> >> > There are two checks being done: a soft check (just log it) that the
>> >> > inode used for a new file is the same one that the session would be
>> >> > expected to use for a new file, and a hard check (assertion) that the
>> >> > inode used is one of the inode numbers that can be used for a new
>> >> > file.  When that soft check fails, it doesn't indicate anything
>> >> > inconsistent in the metadata, just that the inodes are being used in
>> >> > an unexpected order.
>> >> >
>> >> > The WRN severity message mainly benefits our automated testing -- the
>> >> > hope would be that if we're hitting strange scenarios like this in
>> >> > automated tests then it would trigger a test failure (we fail tests
>> >> > if they emit unexpected warnings).
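
For readers who want the two checks spelled out, here is a rough Python
sketch of the logic as described above. It is only an illustration, not the
MDS code: the function name, the exception class, the warning text and the
inode numbers are all made up for the example.

```python
class ReplayAssertion(Exception):
    """Stands in for the hard check (an assertion in the real daemon)."""


def check_replayed_create(prealloc_inos, used_ino, warnings):
    """Check one replayed file creation against a session's inode list.

    prealloc_inos: ordered list of inode numbers the session may use
                   (hypothetical stand-in for "preallocated_inos").
    used_ino:      inode number the journal recorded for the new file.
    warnings:      list collecting soft-check messages (stand-in for WRN).
    """
    # Hard check: the inode must be one the session may use at all.
    if used_ino not in prealloc_inos:
        raise ReplayAssertion(f"inode {used_ino:#x} not preallocated")

    # Soft check: the inode should be the next one in line; if it is not,
    # nothing is inconsistent, the order is just unexpected, so only log.
    expected = prealloc_inos[0]
    if used_ino != expected:
        warnings.append(
            f"replayed op used ino {used_ino:#x}, expected {expected:#x}"
        )

    # Either way the inode is consumed by this creation.
    prealloc_inos.remove(used_ino)


warnings = []
session_inos = [0x1000, 0x1001, 0x1002]
check_replayed_create(session_inos, 0x1001, warnings)  # out of order: warns
check_replayed_create(session_inos, 0x1000, warnings)  # in order: silent
```

In this toy version only the out-of-order creation produces a warning, and
an inode outside the list would raise instead of warn.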
>> >>
>> >> Thanks for the explanation.
>> >>
>> >> > It would be interesting to know more about what's going on on your
>> >> > cluster when this is happening -- do you have standby replay MDSs?
>> >> > Multiple active MDSs? Were any daemons failing over at a similar time
>> >> > to the warnings?  Did you have anything funny going on with clients
>> >> > (like forcing them to reconnect after being evicted)?
>> >>
>> >> Two MDSs in total. One active, one standby-replay. The clients are doing
>> >> "funny" stuff. We are testing "CTDB" [1] in combination with cephfs to
>> >> build a HA setup (to prevent split brain). We have two clients that, in
>> >> case of a failure, need to acquire a lock on a file "ctdb_recovery_lock"
>> >> before doing a recovery. Somehow, while configuring this setup, we
>> >> triggered the "replayed op" warnings. We tried to reproduce that, but no
>> >> matter what we do the "replayed op" warnings do not occur anymore ...
>> >>
>> >> We have seen these warnings before (other clients). Warnings started
>> >> after we had switched from mds1 -> mds2 (upgrade of Ceph cluster
>> >> according to MDS upgrade procedure, reboots afterwards, hence the
>> >> failover).
>> >>
>> >> Something I just realised is that _only_ the standby-replay MDS
>> >> is emitting the warnings, not the active MDS.
>> >>
>> >> Not related to the "replayed op" warning, but related to the CTDB "lock
>> >> issue":
>> >>
>> >> The "surviving" cephfs client tries to acquire a lock on a file, but
>> >> although the other client is dead (but not yet evicted by the MDS) it
>> >> can't. Not until the dead client is evicted by the MDS after ~ 300 sec
>> >> (mds_session_autoclose=300). Turns out ctdb uses fcntl() locking. Does
>> >> cephfs support this kind of locking in the way ctdb expects it to?
>> >
>> > We implement locking, and it's correct that another client can't gain
>> > the lock until the first client is evicted.  Aside from speeding up
>> > eviction by modifying the timeout, if you have another mechanism for
>> > detecting node failure then you could use that to explicitly evict the
>> > client.
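
To make the locking behaviour discussed here concrete, below is a small
Python sketch of a non-blocking fcntl() lock attempt of the kind a recovery
lock relies on. It is an assumption-laden illustration, not ctdb's code: the
helper name and the temporary path are invented, and the demo runs on a
local temp file rather than on CephFS.

```python
import errno
import fcntl
import os
import tempfile


def try_recovery_lock(path):
    """Try to take an exclusive, non-blocking fcntl() lock on a file.

    Returns the open file descriptor on success (the lock is held until
    the descriptor is closed), or None if another process holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() wraps fcntl() record locking; LOCK_NB makes it fail
        # immediately instead of waiting for the current holder.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held by someone else
        raise
    return fd


# Local demonstration; on CephFS the path would point at the shared
# lock file instead (location here is hypothetical).
demo_path = os.path.join(tempfile.mkdtemp(), "ctdb_recovery_lock")
fd = try_recovery_lock(demo_path)
```

On CephFS, per the explanation above, a second client making this call would
keep getting "lock held" until the dead client's session is evicted, whether
by the timeout or explicitly.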
>> >
>> > John
>> >
>> >> In the meantime we will try [2] (rados object) as a recovery lock.
>> >> That would eliminate a layer / dependency as well.
>> >>
>> >> Thanks,
>> >>
>> >> Gr. Stefan
>> >>
>> >> [1]: https://ctdb.samba.org/
>> >> [2]: https://ctdb.samba.org/manpages/ctdb_mutex_ceph_rados_helper.7.html
>> >>
>> >> --
>> >> | BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
>> >> | GPG: 0xD14839C6                   +31 318 648 688 / [email protected]
>> > _______________________________________________
>> > ceph-users mailing list
>> > [email protected]
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
