I wonder if anyone could offer any insight on the issue below, regarding the CentOS 7.6 kernel cephfs client connecting to a Luminous cluster.  I have since tried a much newer 4.19.13 kernel, which did not show the same issue (but unfortunately for various reasons unrelated to ceph, we can't go to such a new kernel).

Am I reading it right that the monitor decides this kernel client only speaks an older (jewel-era) protocol, has to re-encode the OSD maps into an older format for it, that this takes too long, and the kernel then just gives up, or perhaps hits some other communication protocol error?  One of these mon sessions seems to last only about half a second.  The client then reconnects to another mon, gets the same result, and so on.  Is there any way around this?
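
If it helps confirm that theory: the mon debug log below records the session features as 0x27018fb86aa42ada, which the mon tags as jewel.  These two stock Luminous commands show how the cluster classifies its connected clients (the mon name here is just ours; substitute your own):

    # Summarize connected clients/daemons grouped by the release their
    # feature bits map to (jewel, luminous, ...)
    ceph features

    # Per-session detail, including feature bits, on one mon
    # (run on the mon host via its admin socket)
    ceph daemon mon.cephmon00 sessions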

Andras


On 12/26/18 7:55 PM, Andras Pataki wrote:
We've been using ceph-fuse with a pretty good stability record (against the Luminous 12.2.8 back end).  Unfortunately ceph-fuse has extremely poor small-file performance (understandably), so we've been testing the kernel client.  The latest Red Hat kernel, 3.10.0-957.1.3.el7.x86_64, seems to work pretty well as long as the cluster is running in a completely clean state.  However, it seems that as soon as something is happening to the cluster, the kernel client breaks down pretty badly.
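
For reference, the kernel mounts look roughly like the following; the mount point and secret file are placeholders rather than our exact configuration:

    # Hypothetical kernel-client mount; the mon addresses match the logs
    # below, name/secretfile are placeholders
    mount -t ceph 10.128.150.10,10.128.150.11,10.128.150.12:/ /mnt/cephfs \
          -o name=cephfs,secretfile=/etc/ceph/cephfs.secret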

Today's example: I've reweighted some OSDs to balance the disk usage a bit (set nobackfill, reweight the OSDs, check the new hypothetical space usage, then unset nobackfill; the commands are sketched after the status output below).  As soon as the reweighting procedure started, the kernel client went into an infinite loop of unsuccessful attempts to connect to the mons:

Dec 26 19:28:53 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Dec 26 19:28:53 mon5 kernel: libceph: mon0 10.128.150.10:6789 session lost, hunting for new mon
Dec 26 19:28:53 mon5 kernel: libceph: mon2 10.128.150.12:6789 session established
Dec 26 19:28:58 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:28:58 mon5 kernel: libceph: mon2 10.128.150.12:6789 session lost, hunting for new mon
Dec 26 19:28:58 mon5 kernel: libceph: mon1 10.128.150.11:6789 session established
Dec 26 19:28:59 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Dec 26 19:28:59 mon5 kernel: libceph: mon1 10.128.150.11:6789 session lost, hunting for new mon
Dec 26 19:28:59 mon5 kernel: libceph: mon2 10.128.150.12:6789 session established
Dec 26 19:28:59 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:28:59 mon5 kernel: libceph: mon2 10.128.150.12:6789 session lost, hunting for new mon
Dec 26 19:28:59 mon5 kernel: libceph: mon0 10.128.150.10:6789 session established
Dec 26 19:29:00 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Dec 26 19:29:00 mon5 kernel: libceph: mon0 10.128.150.10:6789 session lost, hunting for new mon
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 session established
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 session lost, hunting for new mon
Dec 26 19:29:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 session established
Dec 26 19:29:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Dec 26 19:29:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 session lost, hunting for new mon
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 session established
Dec 26 19:29:01 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:29:01 mon5 kernel: libceph: mon2 10.128.150.12:6789 session lost, hunting for new mon
Dec 26 19:29:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session established
Dec 26 19:29:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Dec 26 19:29:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session lost, hunting for new mon
Dec 26 19:29:01 mon5 kernel: libceph: mon2 10.128.150.12:6789 session established
Dec 26 19:29:02 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:29:02 mon5 kernel: libceph: mon2 10.128.150.12:6789 session lost, hunting for new mon
Dec 26 19:29:02 mon5 kernel: libceph: mon1 10.128.150.11:6789 session established
Dec 26 19:29:02 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Dec 26 19:29:02 mon5 kernel: libceph: mon1 10.128.150.11:6789 session lost, hunting for new mon
... etc ...

seemingly never recovering.  The cluster is healthy, and all other clients are doing I/O successfully:

    [root@cephmon00 ceph]# ceph -s
      cluster:
        id:     d7b33135-0940-4e48-8aa6-1d2026597c2f
        health: HEALTH_WARN
                noout flag(s) set
                1 backfillfull osd(s)
                4 pool(s) backfillfull
                239119058/12419244975 objects misplaced (1.925%)

      services:
        mon: 3 daemons, quorum cephmon00,cephmon01,cephmon02
        mgr: cephmon00(active)
        mds: cephfs-1/1/1 up  {0=cephmds00=up:active}, 1 up:standby
        osd: 3534 osds: 3534 up, 3534 in; 5040 remapped pgs
             flags noout

      data:
        pools:   5 pools, 50688 pgs
        objects: 2.51G objects, 7.77PiB
        usage:   19.1PiB used, 5.90PiB / 25.0PiB avail
        pgs:     239119058/12419244975 objects misplaced (1.925%)
                 45639 active+clean
                 2914  active+remapped+backfilling
                 2126  active+remapped+backfill_wait
                 9     active+clean+scrubbing+deep

      io:
        client:   10.3MiB/s rd, 2.23GiB/s wr, 32op/s rd, 3.77kop/s wr
        recovery: 114GiB/s, 33.12kobjects/s
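
For completeness, the reweighting procedure mentioned above was essentially the following; the OSD id and weight are illustrative placeholders:

    ceph osd set nobackfill        # block data movement while adjusting
    ceph osd reweight 1234 0.95    # lower the weight of an overfull OSD
    ceph osd df                    # check the prospective space usage
    ceph osd unset nobackfill      # let backfill proceed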


The client machine in question is otherwise healthy as well (not out of memory, etc.).  I've checked the OSD blacklist; there's nothing on it.  There's nothing suspicious in the mon logs regarding this client either.  With one of the mons set to debug level 20, here is the relevant section of its log when this client tries to connect:

2018-12-26 19:51:46.203264 7fffeb406700 10 mon.cephmon00@0(leader) e40 ms_verify_authorizer 10.128.36.18:0/3882984371 client protocol 0
2018-12-26 19:51:46.206645 7fffee40c700 10 mon.cephmon00@0(leader) e40 _ms_dispatch new session 0x5555722bb980 MonSession(client.36398604 10.128.36.18:0/3882984371 is open , features 0x27018fb86aa42ada (jewel)) features 0x27018fb86aa42ada
2018-12-26 19:51:46.206651 7fffee40c700 20 mon.cephmon00@0(leader) e40  caps
2018-12-26 19:51:46.206657 7fffee40c700 10 mon.cephmon00@0(leader).auth v57602 preprocess_query auth(proto 0 34 bytes epoch 0) from client.36398604 10.128.36.18:0/3882984371
2018-12-26 19:51:46.206660 7fffee40c700 10 mon.cephmon00@0(leader).auth v57602 prep_auth() blob_size=34
2018-12-26 19:51:46.206692 7fffee40c700  2 mon.cephmon00@0(leader) e40 send_reply 0x555562fee5a0 0x55556f398b40 auth_reply(proto 2 0 (0) Success) v1
2018-12-26 19:51:46.206867 7fffee40c700 20 mon.cephmon00@0(leader) e40 _ms_dispatch existing session 0x5555722bb980 for client.36398604 10.128.36.18:0/3882984371
2018-12-26 19:51:46.206870 7fffee40c700 20 mon.cephmon00@0(leader) e40  caps
2018-12-26 19:51:46.206877 7fffee40c700 10 mon.cephmon00@0(leader).auth v57602 preprocess_query auth(proto 2 128 bytes epoch 0) from client.36398604 10.128.36.18:0/3882984371
2018-12-26 19:51:46.206882 7fffee40c700 10 mon.cephmon00@0(leader).auth v57602 prep_auth() blob_size=128
2018-12-26 19:51:46.206979 7fffee40c700  2 mon.cephmon00@0(leader) e40 send_reply 0x555562fee5a0 0x55571f9c4d80 auth_reply(proto 2 0 (0) Success) v1
2018-12-26 19:51:46.207178 7fffee40c700 20 mon.cephmon00@0(leader) e40 _ms_dispatch existing session 0x5555722bb980 for client.36398604 10.128.36.18:0/3882984371
2018-12-26 19:51:46.207182 7fffee40c700 20 mon.cephmon00@0(leader) e40  caps allow *
2018-12-26 19:51:46.207184 7fffee40c700 20 is_capable service=mon command= read on cap allow *
2018-12-26 19:51:46.207185 7fffee40c700 20  allow so far , doing grant allow *
2018-12-26 19:51:46.207186 7fffee40c700 20  allow all
2018-12-26 19:51:46.207188 7fffee40c700 10 mon.cephmon00@0(leader) e40 handle_subscribe mon_subscribe({mdsmap=72103+,monmap=41+,osdmap=1207891}) v2
2018-12-26 19:51:46.207194 7fffee40c700 10 mon.cephmon00@0(leader) e40 handle_subscribe: MDS sub 'mdsmap'
2018-12-26 19:51:46.207195 7fffee40c700 20 is_capable service=mds command= read on cap allow *
2018-12-26 19:51:46.207197 7fffee40c700 20  allow so far , doing grant allow *
2018-12-26 19:51:46.207198 7fffee40c700 20  allow all
2018-12-26 19:51:46.207200 7fffee40c700 20 mon.cephmon00@0(leader).mds e72102 check_sub: mdsmap
2018-12-26 19:51:46.207204 7fffee40c700 10 mon.cephmon00@0(leader).monmap v40 check_sub monmap next 41 have 40
2018-12-26 19:51:46.207207 7fffee40c700 20 is_capable service=osd command= read on cap allow *
2018-12-26 19:51:46.207208 7fffee40c700 20  allow so far , doing grant allow *
2018-12-26 19:51:46.207209 7fffee40c700 20  allow all
2018-12-26 19:51:46.207211 7fffee40c700 10 mon.cephmon00@0(leader).osd e1208017 check_osdmap_sub 0x55592f513d00 next 1207891 (onetime)
2018-12-26 19:51:46.207213 7fffee40c700  5 mon.cephmon00@0(leader).osd e1208017 send_incremental [1207891..1208017] to client.36398604 10.128.36.18:0/3882984371
2018-12-26 19:51:46.207217 7fffee40c700 10 mon.cephmon00@0(leader).osd e1208017 build_incremental [1207891..1207930] with features 27018fb86aa42ada
2018-12-26 19:51:46.220019 7fffee40c700 20 mon.cephmon00@0(leader).osd e1208017 reencode_incremental_map 1207930 with features 504412504116439552
2018-12-26 19:51:46.230217 7fffee40c700 20 mon.cephmon00@0(leader).osd e1208017 build_incremental    inc 1207930 1146701 bytes
2018-12-26 19:51:46.230349 7fffee40c700 20 mon.cephmon00@0(leader).osd e1208017 reencode_incremental_map 1207929 with features 504412504116439552
2018-12-26 19:51:46.232523 7fffee40c700 20 mon.cephmon00@0(leader).osd e1208017 build_incremental    inc 1207929 175613 bytes
... a lot more of reencode_incremental stuff ...
2018-12-26 19:51:46.745394 7fffee40c700 10 mon.cephmon00@0(leader) e40 ms_handle_reset 0x5555637cf800 10.128.36.18:0/3882984371
2018-12-26 19:51:46.745395 7ffff0c11700 10 mon.cephmon00@0(leader).log v79246823 encode_full log v 79246823
2018-12-26 19:51:46.745469 7ffff0c11700 10 mon.cephmon00@0(leader).log v79246823 encode_pending v79246824
2018-12-26 19:51:46.745763 7fffee40c700 10 mon.cephmon00@0(leader) e40 reset/close on session client.36398604 10.128.36.18:0/3882984371
2018-12-26 19:51:46.745769 7fffee40c700 10 mon.cephmon00@0(leader) e40 remove_session 0x5555722bb980 client.36398604 10.128.36.18:0/3882984371 features 0x27018fb86aa42ada
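
If I'm reading the log right, the client subscribed at osdmap epoch 1207891 while the mon is already at 1208017, so the mon has to re-encode about 127 incremental maps into the older format, 40 per message (the [1207891..1207930] batch), some over 1MB each, and the session resets roughly half a second in.  Purely as an experiment, and assuming a client-side timeout is what kills the session mid-transfer, osd_map_message_max controls how many incrementals the mon packs into each message:

    # Speculative experiment, not a verified fix: send fewer incremental
    # osdmaps per message so each reply to the old-format client is smaller
    ceph tell mon.* injectargs '--osd_map_message_max 20'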

Any pointers on what to do here?

Andras

