Hi,

  I have a test cluster of 4 servers running Luminous. We were running 12.2.2 
under Fedora 17 and have just completed upgrading to 12.2.5 under Fedora 18.

  All seems well: all MONs are up, OSDs are up, I can see objects stored as 
expected with rados -p default.rgw.buckets.data ls. 

  But when I start RGW, the load goes through the roof because radosgw core 
dumps continuously, in rapid succession. 

Log Excerpt:


… 

   -16> 2018-05-21 15:52:48.244579 7fc70eeda700  5 -- 10.19.33.13:0/3446208184 
>> 10.19.33.14:6800/1417 conn(0x55e78a610800 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14567 cs=1 l=1). rx osd.6 seq 
7 0x55e78a67b500 osd_op_reply(47 notify.6 [watch watch cookie 94452947886080] 
v1092'43446 uv43445 ondisk = 0) v8
   -15> 2018-05-21 15:52:48.244619 7fc70eeda700  1 -- 10.19.33.13:0/3446208184 
<== osd.6 10.19.33.14:6800/1417 7 ==== osd_op_reply(47 notify.6 [watch watch 
cookie 94452947886080] v1092'43446 uv43445 ondisk = 0) v8 ==== 152+0+0 
(1199963694 0 0) 0x55e78a67b500 con 0x55e78a610800
   -14> 2018-05-21 15:52:48.244777 7fc723656000  1 -- 10.19.33.13:0/3446208184 
--> 10.19.33.15:6800/1433 -- osd_op(unknown.0.0:48 16.1 
16:93e5b521:::notify.7:head [create] snapc 0=[] 
ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a67bc00 con 0
   -13> 2018-05-21 15:52:48.275650 7fc70eeda700  5 -- 10.19.33.13:0/3446208184 
>> 10.19.33.15:6800/1433 conn(0x55e78a65e000 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14572 cs=1 l=1). rx osd.2 seq 
7 0x55e78a678380 osd_op_reply(48 notify.7 [create] v1092'43453 uv43453 ondisk = 
0) v8
   -12> 2018-05-21 15:52:48.275675 7fc70eeda700  1 -- 10.19.33.13:0/3446208184 
<== osd.2 10.19.33.15:6800/1433 7 ==== osd_op_reply(48 notify.7 [create] 
v1092'43453 uv43453 ondisk = 0) v8 ==== 152+0+0 (2720997170 0 0) 0x55e78a678380 
con 0x55e78a65e000
   -11> 2018-05-21 15:52:48.275849 7fc723656000  1 -- 10.19.33.13:0/3446208184 
--> 10.19.33.15:6800/1433 -- osd_op(unknown.0.0:49 16.1 
16:93e5b521:::notify.7:head [watch watch cookie 94452947887232] snapc 0=[] 
ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a688000 con 0
   -10> 2018-05-21 15:52:48.296799 7fc70eeda700  5 -- 10.19.33.13:0/3446208184 
>> 10.19.33.15:6800/1433 conn(0x55e78a65e000 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14572 cs=1 l=1). rx osd.2 seq 
8 0x55e78a688000 osd_op_reply(49 notify.7 [watch watch cookie 94452947887232] 
v1092'43454 uv43453 ondisk = 0) v8
    -9> 2018-05-21 15:52:48.296824 7fc70eeda700  1 -- 10.19.33.13:0/3446208184 
<== osd.2 10.19.33.15:6800/1433 8 ==== osd_op_reply(49 notify.7 [watch watch 
cookie 94452947887232] v1092'43454 uv43453 ondisk = 0) v8 ==== 152+0+0 
(3812136207 0 0) 0x55e78a688000 con 0x55e78a65e000
    -8> 2018-05-21 15:52:48.296924 7fc723656000  2 all 8 watchers are set, 
enabling cache
    -7> 2018-05-21 15:52:48.297135 7fc57cbb6700  2 garbage collection: start
    -6> 2018-05-21 15:52:48.297185 7fc57c3b5700  2 object expiration: start
    -5> 2018-05-21 15:52:48.297321 7fc57cbb6700  1 -- 10.19.33.13:0/3446208184 
--> 10.19.33.16:6804/1596 -- osd_op(unknown.0.0:50 18.3 
18:d242335b:gc::gc.2:head [call lock.lock] snapc 0=[] 
ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a692000 con 0
    -4> 2018-05-21 15:52:48.297395 7fc57c3b5700  1 -- 10.19.33.13:0/3446208184 
--> 10.19.33.16:6804/1596 -- osd_op(unknown.0.0:51 18.0 
18:1a734c59:::obj_delete_at_hint.0000000000:head [call lock.lock] snapc 0=[] 
ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a692380 con 0
    -3> 2018-05-21 15:52:48.299463 7fc568b8e700  5 schedule life cycle next 
start time: Tue May 22 04:00:00 2018
    -2> 2018-05-21 15:52:48.299528 7fc567b8c700  5 ERROR: sync_all_users() 
returned ret=-2
    -1> 2018-05-21 15:52:48.299698 7fc56738b700  1 -- 10.19.33.13:0/3446208184 
--> 10.19.33.14:6800/1417 -- osd_op(unknown.0.0:52 18.7 
18:e9187ab8:reshard::reshard.0000000000:head [call lock.lock] snapc 0=[] 
ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a54fc00 con 0
     0> 2018-05-21 15:52:48.301978 7fc723656000 -1 *** Caught signal (Aborted) 
**
 in thread 7fc723656000 thread_name:radosgw

 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
(stable)
 1: (()+0x22d82c) [0x55e7861a882c]
 2: (()+0x11fb0) [0x7fc719270fb0]
 3: (gsignal()+0x10b) [0x7fc716603f4b]
 4: (abort()+0x12b) [0x7fc7165ee591]
 5: (parse_rgw_ldap_bindpw[abi:cxx11](CephContext*)+0x68b) [0x55e78647409b]
 6: (rgw::auth::s3::LDAPEngine::init(CephContext*)+0xb9) [0x55e7863a38f9]
 7: (rgw::auth::s3::ExternalAuthStrategy::ExternalAuthStrategy(CephContext*, 
RGWRados*, rgw::auth::s3::AWSEngine::VersionAbstractor*)+0x74) [0x55e786154bc4]
 8: (std::__shared_ptr<rgw::auth::StrategyRegistry, 
(__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<rgw::auth::StrategyRegistry>,
 CephContext* const&, RGWRados* const&>(std::_Sp_make_shared_tag, 
std::allocator<rgw::auth::StrategyRegistry> const&, CephContext* const&, 
RGWRados* const&)+0xf8) [0x55e786158f78]
 9: (main()+0x196b) [0x55e78614463b]
 10: (__libc_start_main()+0xeb) [0x7fc7165f01bb]
 11: (_start()+0x2a) [0x55e78614c3da]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.

 
Marc D. Spencer
Chief Technology Officer
T: 866.808.4937 × 202
E: mspen...@liquidpixels.com
www.liquidpixels.com
