I know this is an old build, but I want to verify that this isn't an unknown bug. For context, the attached log starts at the point when server .15 dropped off the network (at this point we suspect a power failure). OSDs 72, 73, 74, and 75 are on the node that apparently lost power.
Ceph version 0.47.2
Kernel version 3.4.4
Patches applied:
rbd-don-t-hold-spinlock-during-messenger-flush.patch
crush-adjust-local-retry-threshold.patch
crush-be-more-tolerant-of-nonsensical-crush-maps.patch
crush-fix-tree-node-weight-lookup.patch
crush-fix-memory-leak-when-destroying-tree-buckets.patch
ceph-osd_client-fix-endianness-bug-in-osd_req_encode.patch
rbd-protect-read-of-snapshot-sequence-number.patch
rbd-store-snapshot-id-instead-of-index.patch
ceph-don-t-set-WRITE_PENDING-too-early.patch
ceph-messenger-rework-prepare_connect_authorizer.patch
ceph-messenger-check-return-from-get_authorizer.patch
ceph-define-ceph_auth_handshake-type.patch
ceph-messenger-reduce-args-to-create_authorizer.patch
ceph-ensure-auth-ops-are-defined-before-use.patch
ceph-have-get_authorizer-methods-return-pointers.patch
ceph-use-info-returned-by-get_authorizer.patch
ceph-return-pointer-from-prepare_connect_authorizer.patch
ceph-rename-prepare_connect_authorizer.patch
ceph-add-auth-buf-in-prepare_write_connect.patch
libceph-avoid-unregistering-osd-request-when-not-reg.patch
libceph-fix-pg_temp-updates.patch
ceph-check-PG_Private-flag-before-accessing-page-pri.patch
rbd-Fix-ceph_snap_context-size-calculation.patch
rbd-endian-bug-in-rbd_req_cb.patch
libceph-osd_client-don-t-drop-reply-reference-too-ea.patch
libceph-use-con-get-put-ops-from-osd_client.patch
rbd-Clear-ceph_msg-bio_iter-for-retransmitted-messag.patch
libceph-flush-msgr-queue-during-mon_client-shutdown.patch
Build options:
--with-cryptopp --without-nss --with-radosgw --without-fuse \
--with-tcmalloc --without-hadoop --with-libatomic-ops \
--with-system-libs3 --with-libaio --without-gtk2 \
--localstatedir=/var
Attachment: cephoops (binary data)
