I'm adding three OSD nodes(36 osds in total) to existing 3-node cluster(35
osds) using ceph-deploy, after disks prepared and OSDs activated, the
cluster re-balanced and shows all pgs active+clean:

     osdmap e820: 72 osds: 71 up, 71 in
      pgmap v173328: 15920 pgs, 17 pools, 12538 MB data, 3903 objects
            30081 MB used, 39631 GB / 39660 GB avail
               15920 active+clean

However, the object write start having issue since the new OSDs added to
cluster:

2014-06-16 11:36:36.421868 osd.35 [WRN] slow request 30.317529 seconds old,
received at 2014-06-16 11:36:06.104256: osd_op(client.5568.0:1502400
default.5250.4_loadtest/512B_file [getxattrs,stat] 9.552a7900 e820) v4
currently waiting for rw locks

And from existing osd log, it seems it's having problem to authenticate the
new OSDs (10.122.134.204 is the IP of one of new OSD nodes) :

2014-06-16 11:38:25.281270 7f58562ce700  0 cephx: verify_reply couldn't
decrypt with error: error decoding block for decryption
2014-06-16 11:38:25.281288 7f58562ce700  0 -- 172.17.9.218:6811/2047255 >>
10.122.134.204:6831/17571 pipe(0x2891280 sd=90 :48493 s=1 pgs=3091 cs=10
l=0 c=0x62d1840).failed verifying authorize reply


The cephx auth list shows good to me:

exported keyring for osd.45
[osd.45]
        key = AQAoCp5TqBq/MhAANwclbs1nCgefNfxqqPnkZQ==
        caps mon = "allow profile osd"
        caps osd = "allow *"

The key above does not match the keyring on osd.45.

Anybody have any clue what might be the authentication issue here? I'm
running Ceph 0.72.2.

Thanks in advance,
Fred
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Reply via email to