We recommend disabling dynamic resharding until fixes for two known
issues land in master and Luminous, along with a radosgw-admin command to
remove the traces left behind by old dynamic reshard runs. At least the
first of these fixes should land shortly.
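
For reference, a minimal sketch of that recommendation (the config section
name is an assumption taken from Jakub's log below; match it to your own
RGW instance names):

  # Disable dynamic resharding on each RGW host, then restart that RGW:
  printf '[client.rgw.ceph-node-01]\nrgw_dynamic_resharding = false\n' \
      >> /etc/ceph/ceph.conf
  systemctl restart ceph-radosgw@rgw.ceph-node-01

  # Inspect and, if needed, cancel reshard operations already queued:
  radosgw-admin reshard list
  radosgw-admin reshard cancel --bucket=BUCKET

(The cleanup command mentioned above later shipped as `radosgw-admin
reshard stale-instances list` / `... rm`; check the release notes of your
point release before relying on it.)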

Matt

On Wed, Oct 24, 2018 at 6:46 AM, Ch Wan <xmu.wc.2...@gmail.com> wrote:
> Hi. I've been hitting this problem recently.
> Clients get timeout exceptions while RGW is resharding.
> Can we reduce the impact of dynamic resharding by adjusting some
> configuration options, or should we just raise the timeout threshold on
> the client side?
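>
> If we go the client-side route, one hedged sketch (awscli is shown purely
> as an example; the endpoint and the 300-second value are assumptions, and
> most SDKs expose equivalent timeout knobs):
>
>   # Raise the client read timeout above the reshard pause:
>   aws --endpoint-url http://rgw.example.com \
>       --cli-read-timeout 300 --cli-connect-timeout 60 \
>       s3 cp ./bigfile s3://BUCKET/PATH/bigfile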
>
> On Mon, Jul 16, 2018 at 11:25 PM, Jakub Jaszewski
> <jaszewski.ja...@gmail.com> wrote:
>>
>> Hi,
>> We run 5 RADOS Gateways on Luminous 12.2.5 as upstream servers in an
>> nginx active-active setup based on keepalived.
>> The cluster is 12 Ceph nodes (16x 10TB BlueStore OSDs per node, 2x 10Gb
>> network links shared by the access and cluster networks); the RGW pool
>> is EC 9+3.
>>
>> We recently noticed the entries below in the RGW logs:
>>
>> 2018-07-11 06:19:13.726392 7f2eeed46700  1 ====== starting new request
>> req=0x7f2eeed402c0 =====
>> 2018-07-11 06:19:13.871358 7f2eeed46700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:19:58.953816 7f2eeed46700  0 block_while_resharding ERROR:
>> bucket is still resharding, please retry
>> 2018-07-11 06:19:58.959424 7f2eeed46700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:20:44.088045 7f2eeed46700  0 block_while_resharding ERROR:
>> bucket is still resharding, please retry
>> 2018-07-11 06:20:44.090664 7f2eeed46700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:21:29.141182 7f2eeed46700  0 block_while_resharding ERROR:
>> bucket is still resharding, please retry
>> 2018-07-11 06:21:29.146598 7f2eeed46700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:22:14.178369 7f2eeed46700  0 block_while_resharding ERROR:
>> bucket is still resharding, please retry
>> 2018-07-11 06:22:14.181697 7f2eeed46700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:22:34.199763 7f2eeed46700  1 ====== req done
>> req=0x7f2eeed402c0 op status=0 http_status=200 ======
>> 2018-07-11 06:22:34.199851 7f2eeed46700  1 civetweb: 0x5599a1158000:
>> 10.195.17.6 - - [11/Jul/2018:06:10:11 +0000] "PUT
>> /BUCKET/PATH/OBJECT?partNumber=2&uploadId=2~ol_fQw_u7eKRjuP1qVwnj5V12GxDYXu
>> HTTP/1.1" 200 0 - -
>>
>> The request blocks in ~45-second retry loops (over three minutes in the
>> trace above), causing 'upstream timed out (110: Connection timed out)
>> while reading response header from upstream' errors and 504 responses on
>> the nginx side, since our nginx read timeout is 30 seconds.
>>
>> Other recurring log entries look like:
>>
>> 2018-07-11 06:20:47.407632 7f2e97c98700  1 ====== starting new request
>> req=0x7f2e97c922c0 =====
>> 2018-07-11 06:20:47.412455 7f2e97c98700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:21:32.424983 7f2e97c98700  0 block_while_resharding ERROR:
>> bucket is still resharding, please retry
>> 2018-07-11 06:21:32.426597 7f2e97c98700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>> 2018-07-11 06:22:17.444467 7f2e97c98700  0 block_while_resharding ERROR:
>> bucket is still resharding, please retry
>> 2018-07-11 06:22:17.492217 7f2e97c98700  0 NOTICE: resharding operation on
>> bucket index detected, blocking
>>
>> 2018-07-11 06:22:32.495254 7f2e97c98700  0 ERROR: update_bucket_id()
>> new_bucket_id=d644765c-1705-49b2-9609-a8511d3c4fed.151639.105 returned
>> r=-125
>> 2018-07-11 06:22:32.495386 7f2e97c98700  0 WARNING: set_req_state_err
>> err_no=125 resorting to 500
>>
>> 2018-07-11 06:22:32.495509 7f2e97c98700  1 ====== req done
>> req=0x7f2e97c922c0 op status=-125 http_status=500 ======
>> 2018-07-11 06:22:32.495569 7f2e97c98700  1 civetweb: 0x5599a14f4000:
>> 10.195.17.6 - - [11/Jul/2018:06:19:25 +0000] "POST PUT
>> /BUCKET/PATH/OBJECT?uploads HTTP/1.1" 500 0 - -
>>
>>
>> To avoid the 504 and 500 responses (errno 125 is ECANCELED, apparently
>> from requests racing with the reshard), we disabled dynamic resharding
>> via 'rgw_dynamic_resharding = false'. We are not sure whether setting
>> the nginx 'proxy_read_timeout' option to a value higher than the bucket
>> resharding time is a good idea.
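>>
>> If we did try it, a sketch might look like this (the config path and the
>> 300s value are assumptions; the reshard pauses above ran past three
>> minutes):
>>
>>   # Raise nginx's upstream read timeout past the longest reshard pause,
>>   # then validate and reload:
>>   sed -i 's/proxy_read_timeout .*/proxy_read_timeout 300s;/' \
>>       /etc/nginx/conf.d/rgw.conf
>>   nginx -t && systemctl reload nginx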
>>
>> Once done, the 'block_while_resharding ERROR: bucket is still
>> resharding, please retry' messages disappeared from the RGW logs;
>> however, another ERROR is now logged, after which the RGWs catch signal
>> Aborted and get restarted by systemd:
>>
>> 2018-07-13 05:27:31.149618 7f7eb72c7700  1 ====== starting new request
>> req=0x7f7eb72c12c0 =====
>> 2018-07-13 05:27:52.593413 7f7eb72c7700  0 ERROR: flush_read_list():
>> d->client_cb->handle_data() returned -5
>> 2018-07-13 05:27:52.594040 7f7eb72c7700  1 ====== req done
>> req=0x7f7eb72c12c0 op status=-5 http_status=206 ======
>> 2018-07-13 05:27:52.594633 7f7eb72c7700  1 civetweb: 0x55ab3171b000:
>> 10.195.17.6 - - [13/Jul/2018:05:24:28 +0000] "GET /BUCKET/PATH/OBJECT_580MB
>> HTTP/1.1" 206 0 - Hadoop 2.7.3.2.5.3.0-37, aws-sdk-java/1.10.6
>> Linux/4.4.0-97-generic Java_HotSpot(TM)_64-Bit_Server_VM/25.77-b03/1.8.0_77
>>
>> We see ~40 such ERRORs (each a GET request for a ~580 MB object) prior
>> to the RGW crash:
>>
>> 2018-07-13 05:21:43.993778 7fcce6575700  1 ====== starting new request
>> req=0x7fcce656f2c0 =====
>> 2018-07-13 05:22:16.137676 7fcce6575700 -1
>> /build/ceph-12.2.5/src/common/buffer.cc: In function 'void
>> ceph::buffer::list::append(const ceph::buffer::ptr&, unsigned int, unsigned
>> int)' thread 7fcce6575700 time 2018-07-13 05:22:16.135271
>> /build/ceph-12.2.5/src/common/buffer.cc: 1967: FAILED assert(len+off <=
>> bp.length())
>>
>>  ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous
>> (stable)
>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x102) [0x7fcd5b7aab72]
>>  2: (ceph::buffer::list::append(ceph::buffer::ptr const&, unsigned int,
>> unsigned int)+0x118) [0x7fcd64993cf8]
>>  3: (RGWPutObj_ObjStore::get_data(ceph::buffer::list&)+0xde)
>> [0x5597153dd06e]
>>  4: (RGWPutObj_ObjStore_S3::get_data(ceph::buffer::list&)+0xe)
>> [0x55971541f5be]
>>  5: (RGWPutObj::execute()+0xe9b) [0x55971530e3cb]
>>  6: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*,
>> req_state*, bool)+0x165) [0x559715331cb5]
>>  7: (process_request(RGWRados*, RGWREST*, RGWRequest*,
>> std::__cxx11::basic_string<char, std::char_traits<char>,
>> std::allocator<char> > const&, rgw::auth::StrategyRegistry const&,
>> RGWRestfulIO*, OpsLogSocket*, int*)+0x1dbc) [0x559715333e3c]
>>  8: (RGWCivetWebFrontend::process(mg_connection*)+0x38f) [0x5597151cc3ff]
>>  9: (()+0x1e6319) [0x559715206319]
>>  10: (()+0x1e7ce9) [0x559715207ce9]
>>  11: (()+0x76ba) [0x7fcd646c86ba]
>>  12: (clone()+0x6d) [0x7fcd59dd041d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
>> to interpret this.
>>
>>
>> --- begin dump of recent events ---
>> ...
>> ...
>> ...
>>  -9615> 2018-07-13 05:21:43.993778 7fcce6575700  1 ====== starting new
>> request req=0x7fcce656f2c0 =====
>>  -9614> 2018-07-13 05:21:43.993802 7fcce6575700  2 req
>> 448466:0.000024::PUT /BUCKET/PATH/OBJECT_35MB::initializing for trans_id =
>> tx00000000000000006d7d2-005b4836e7-a22ee-default
>>  -9613> 2018-07-13 05:21:43.993858 7fcce6575700  2 req
>> 448466:0.000081:s3:PUT /BUCKET/PATH/OBJECT_35MB::getting op 1
>>  -9612> 2018-07-13 05:21:43.993873 7fcce6575700  2 req
>> 448466:0.000095:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:verifying requester
>>  -9611> 2018-07-13 05:21:43.994148 7fcce6575700  2 req
>> 448466:0.000370:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:normalizing buckets
>> and tenants
>>  -9610> 2018-07-13 05:21:43.994156 7fcce6575700  2 req
>> 448466:0.000378:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:init permissions
>>  -9609> 2018-07-13 05:21:43.994246 7fcce6575700  1 --
>> 10.195.17.12:0/171535574 --> 10.195.17.20:6804/13437 --
>> osd_op(unknown.0.0:14249780 4.3 4:cc87b6a2:root::BUCKET01:head [call
>> version.read,getxattrs,stat] snapc 0=[] ondisk+read+known_if_redirected
>> e2230) v8 -- 0x55971dc5e680 con 0
>>  -9606> 2018-07-13 05:21:43.994934 7fcce6575700  1 --
>> 10.195.17.12:0/171535574 --> 10.195.17.20:6804/13437 --
>> osd_op(unknown.0.0:14249781 4.3 4:cc87b6a2:root::BUCKET01:head [call
>> version.check_conds,call version.read,read 0~524288] snapc 0=[]
>> ondisk+read+known_if_redirected e2230) v8 -- 0x55971774a340 con 0
>>  -9603> 2018-07-13 05:21:43.995333 7fcce6575700  1 --
>> 10.195.17.12:0/171535574 --> 10.195.17.20:6804/13437 --
>> osd_op(unknown.0.0:14249782 4.3
>> 4:ccf42def:root::.bucket.meta.BUCKET01%3ad644765c-1705-49b2-9609-a8511d3c4fed.311938.82:head
>> [call version.read,getxattrs,stat] snapc 0=[]
>> ondisk+read+known_if_redirected e2230) v8 -- 0x55973938f6c0 con 0
>>  -9600> 2018-07-13 05:21:43.995657 7fcce6575700  1 --
>> 10.195.17.12:0/171535574 --> 10.195.17.20:6804/13437 --
>> osd_op(unknown.0.0:14249783 4.3
>> 4:ccf42def:root::.bucket.meta.BUCKET01%3ad644765c-1705-49b2-9609-a8511d3c4fed.311938.82:head
>> [call version.check_conds,call version.read,read 0~524288] snapc 0=[]
>> ondisk+read+known_if_redirected e2230) v8 -- 0x55971811a000 con 0
>>  -9597> 2018-07-13 05:21:43.996145 7fcce6575700  2 req
>> 448466:0.002368:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:recalculating target
>>  -9596> 2018-07-13 05:21:43.996170 7fcce6575700  2 req
>> 448466:0.002372:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:reading permissions
>>  -9595> 2018-07-13 05:21:43.996174 7fcce6575700  2 req
>> 448466:0.002397:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:init op
>>  -9594> 2018-07-13 05:21:43.996176 7fcce6575700  2 req
>> 448466:0.002399:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:verifying op mask
>>  -9593> 2018-07-13 05:21:43.996178 7fcce6575700  2 req
>> 448466:0.002400:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:verifying op
>> permissions
>>  -9592> 2018-07-13 05:21:43.996180 7fcce6575700  5 Searching permissions
>> for identity=rgw::auth::SysReqApplier ->
>> rgw::auth::LocalApplier(acct_user=user01, acct_name=user01, subuser=,
>> perm_mask=15, is_admin=0) mask=50
>>  -9591> 2018-07-13 05:21:43.996226 7fcce6575700  5 Searching permissions
>> for uid=user01
>>  -9590> 2018-07-13 05:21:43.996285 7fcce6575700  5 Found permission: 15
>>  -9589> 2018-07-13 05:21:43.996288 7fcce6575700  5 Searching permissions
>> for group=1 mask=50
>>  -9588> 2018-07-13 05:21:43.996289 7fcce6575700  5 Permissions for group
>> not found
>>  -9587> 2018-07-13 05:21:43.996290 7fcce6575700  5 Searching permissions
>> for group=2 mask=50
>>  -9586> 2018-07-13 05:21:43.996291 7fcce6575700  5 Permissions for group
>> not found
>>  -9585> 2018-07-13 05:21:43.996292 7fcce6575700  5 -- Getting permissions
>> done for identity=rgw::auth::SysReqApplier ->
>> rgw::auth::LocalApplier(acct_user=user01, acct_name=user01, subuser=,
>> perm_mask=15, is_admin=0), owner=user01, perm=2
>>  -9584> 2018-07-13 05:21:43.996294 7fcce6575700  2 req
>> 448466:0.002516:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:verifying op params
>>  -9583> 2018-07-13 05:21:43.996296 7fcce6575700  2 req
>> 448466:0.002518:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:pre-executing
>>  -9582> 2018-07-13 05:21:43.996298 7fcce6575700  2 req
>> 448466:0.002520:s3:PUT /BUCKET/PATH/OBJECT_35MB:put_obj:executing
>>  -9581> 2018-07-13 05:21:43.997115 7fcce6575700  1 --
>> 10.195.17.12:0/171535574 --> 10.195.17.14:6831/27931 --
>> osd_op(unknown.0.0:14249784 9.5
>> 9:b67a1efd:::d644765c-1705-49b2-9609-a8511d3c4fed.165377.2__multipart_OBJECT_35MB.meta:head
>> [getxattrs,stat] snapc 0=[] ondisk+read+known_if_redirected e2230) v8 --
>> 0x55971811d740 con 0
>> ...
>> ...
>> ...
>>      0> 2018-07-13 05:22:16.137676 7fcce6575700 -1
>> /build/ceph-12.2.5/src/common/buffer.cc: In function 'void
>> ceph::buffer::list::append(const ceph::buffer::ptr&, unsigned int, unsigned
>> int)' thread 7fcce6575700 time 2018-07-13 05:22:16.135271
>> /build/ceph-12.2.5/src/common/buffer.cc: 1967: FAILED assert(len+off <=
>> bp.length())
>>
>>  ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous
>> (stable)
>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x102) [0x7fcd5b7aab72]
>>  2: (ceph::buffer::list::append(ceph::buffer::ptr const&, unsigned int,
>> unsigned int)+0x118) [0x7fcd64993cf8]
>>  3: (RGWPutObj_ObjStore::get_data(ceph::buffer::list&)+0xde)
>> [0x5597153dd06e]
>>  4: (RGWPutObj_ObjStore_S3::get_data(ceph::buffer::list&)+0xe)
>> [0x55971541f5be]
>>  5: (RGWPutObj::execute()+0xe9b) [0x55971530e3cb]
>>  6: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*,
>> req_state*, bool)+0x165) [0x559715331cb5]
>>  7: (process_request(RGWRados*, RGWREST*, RGWRequest*,
>> std::__cxx11::basic_string<char, std::char_traits<char>,
>> std::allocator<char> > const&, rgw::auth::StrategyRegistry const&,
>> RGWRestfulIO*, OpsLogSocket*, int*)+0x1dbc) [0x559715333e3c]
>>  8: (RGWCivetWebFrontend::process(mg_connection*)+0x38f) [0x5597151cc3ff]
>>  9: (()+0x1e6319) [0x559715206319]
>>  10: (()+0x1e7ce9) [0x559715207ce9]
>>  11: (()+0x76ba) [0x7fcd646c86ba]
>>  12: (clone()+0x6d) [0x7fcd59dd041d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
>> to interpret this.
>>
>>
>> --- logging levels ---
>>    0/ 5 none
>>    0/ 1 lockdep
>>    0/ 1 context
>>    1/ 1 crush
>>    1/ 5 mds
>>    1/ 5 mds_balancer
>>    1/ 5 mds_locker
>>    1/ 5 mds_log
>>    1/ 5 mds_log_expire
>>    1/ 5 mds_migrator
>>    0/ 1 buffer
>>    0/ 1 timer
>>    0/ 1 filer
>>    0/ 1 striper
>>    0/ 1 objecter
>>    0/ 5 rados
>>    0/ 5 rbd
>>    0/ 5 rbd_mirror
>>    0/ 5 rbd_replay
>>    0/ 5 journaler
>>    0/ 5 objectcacher
>>    0/ 5 client
>>    1/ 5 osd
>>    0/ 5 optracker
>>    0/ 5 objclass
>>    1/ 3 filestore
>>    1/ 3 journal
>>    0/ 5 ms
>>    1/ 5 mon
>>    0/10 monc
>>    1/ 5 paxos
>>    0/ 5 tp
>>    1/ 5 auth
>>    1/ 5 crypto
>>    1/ 1 finisher
>>    1/ 1 reserver
>>    1/ 5 heartbeatmap
>>    1/ 5 perfcounter
>>    1/ 5 rgw
>>    1/10 civetweb
>>    1/ 5 javaclient
>>    1/ 5 asok
>>    1/ 1 throttle
>>    0/ 0 refs
>>    1/ 5 xio
>>    1/ 5 compressor
>>    1/ 5 bluestore
>>    1/ 5 bluefs
>>    1/ 3 bdev
>>    1/ 5 kstore
>>    4/ 5 rocksdb
>>    4/ 5 leveldb
>>    4/ 5 memdb
>>    1/ 5 kinetic
>>    1/ 5 fuse
>>    1/ 5 mgr
>>    1/ 5 mgrc
>>    1/ 5 dpdk
>>    1/ 5 eventtrace
>>   -2/-2 (syslog threshold)
>>   -1/-1 (stderr threshold)
>>   max_recent     10000
>>   max_new         1000
>>   log_file /var/log/ceph/ceph-client.rgw.ceph-node-01.log
>> --- end dump of recent events ---
>> 2018-07-13 05:22:16.176189 7fcce6575700 -1 *** Caught signal (Aborted) **
>>  in thread 7fcce6575700 thread_name:civetweb-worker
>>
>>
>>
>> Is there any trick to enable rgw_dynamic_resharding only during
>> off-peak hours? Is 'rgw_dynamic_resharding' an injectable setting that
>> takes effect without an RGW restart? So far, I have only managed to
>> change it via the configuration file plus an RGW service restart.
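>>
>> A runtime sketch that avoids the restart (the admin socket path is
>> derived from the log_file line above; whether RGW honours the change
>> without a restart is an assumption worth verifying):
>>
>>   # Toggle dynamic resharding at runtime via the admin socket:
>>   ceph daemon /var/run/ceph/ceph-client.rgw.ceph-node-01.asok \
>>       config set rgw_dynamic_resharding true
>>   # Confirm the live value:
>>   ceph daemon /var/run/ceph/ceph-client.rgw.ceph-node-01.asok \
>>       config get rgw_dynamic_resharding
>>
>>   # For an off-peak window, a pair of root cron entries could flip it
>>   # (sketch only):
>>   # 0 1 * * * ceph daemon <asok> config set rgw_dynamic_resharding true
>>   # 0 5 * * * ceph daemon <asok> config set rgw_dynamic_resharding false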
>>
>> Regarding `ERROR: flush_read_list(): d->client_cb->handle_data()
>> returned -5`, I'll try to increase the timeouts on the nginx side; -5 is
>> EIO, which here appears to mean the client or proxy closed the
>> connection before the GET completed.
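>>
>> A quick way to tally those errors ahead of a crash (the log path is
>> taken from the dump above):
>>
>>   grep -c 'handle_data() returned -5' \
>>       /var/log/ceph/ceph-client.rgw.ceph-node-01.log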
>>
>> Any help with this error, `/build/ceph-12.2.5/src/common/buffer.cc:
>> 1967: FAILED assert(len+off <= bp.length())`, which seems to directly
>> impact the RGW service state? Could it be caused by lots of GET requests
>> for big files? It happens only after the flood of `ERROR:
>> flush_read_list()` messages.
>>
>> Thanks
>> Jakub



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
