** Description changed:

- This issue in the Ceph tracker has been encountered repeatedly with 
significant adverse effects on Ceph 12.2.11/12 in Bionic:
- https://tracker.ceph.com/issues/38454
+ [Impact]
+ Cancelling large S3/Swift object puts may result in garbage collection 
entries with zero-length chains. Rados gateway garbage collection does not 
efficiently process and clean up these zero-length chains.
  
- This PR is the likely candidate for backporting to correct the issue:
- https://github.com/ceph/ceph/pull/26601
+ A large number of zero-length chains will result in rgw processes
+ quickly spinning through the garbage collection lists doing very little
+ work. This can result in abnormally high CPU utilization and op
+ workloads.
+ 
+ [Test Case]
+ Disable garbage collection:
+ `juju config ceph-radosgw config-flags='{"rgw": {"rgw enable gc threads": 
"false"}}'`
+ 
+ Repeatedly kill 256MB object put requests for randomized object names:
+ `for i in {0..1000}; do f=$(mktemp); fallocate -l 256M $f; s3cmd put $f s3://test_bucket & pid=$!; sleep $((RANDOM % 3)); kill $pid; rm $f; done`
+ 
+ Capture omap detail. Verify zero-length chains were created:
+ `for i in $(seq 0 $(( ${RGW_GC_MAX_OBJS:-32} - 1 ))); do rados -p default.rgw.log --namespace gc listomapvals gc.$i; done`
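
The shard loop above can also be written with a small helper. This is only a sketch: it assumes the gc shard objects are named gc.0 through gc.(N-1), where N is the rgw_gc_max_objs setting (default 32), and `gc_shard_names` is a hypothetical helper name, not a Ceph command.

```shell
# Sketch: print the gc shard object names gc.0 .. gc.(N-1).
# Assumption: N is the rgw_gc_max_objs setting (default 32).
# gc_shard_names is a hypothetical helper, not part of Ceph.
gc_shard_names() {
    max=${1:-${RGW_GC_MAX_OBJS:-32}}
    i=0
    while [ "$i" -lt "$max" ]; do
        echo "gc.$i"
        i=$((i + 1))
    done
}

# Example use against a live cluster (not run here):
#   for obj in $(gc_shard_names); do
#       rados -p default.rgw.log --namespace gc listomapvals "$obj"
#   done
```

Generating the names separately makes it easy to check the shard range before querying the cluster.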
+ 
+ Raise radosgw debug levels, and enable garbage collection:
+ `juju config ceph-radosgw config-flags='{"rgw": {"rgw enable gc threads": "true"}}' loglevel=20`
+ 
+ Verify zero-length chains are processed correctly by inspecting radosgw
+ logs.
+ 
+ [Regression Potential]
+ {Pending} The back-port still needs to be accepted upstream. The complete
+ fix is needed before regression potential can be assessed.
+ 
+ [Other Information]
+ This issue has been reported upstream [0] and was fixed in Nautilus alongside
+ a number of other garbage collection issues/enhancements in PR #26601 [1]:
+ * adds additional logging to make future debugging easier
+ * resolves bug where the truncated flag was not always set correctly in
+   gc_iterate_entries
+ * resolves bug where marker in RGWGC::process was not advanced
+ * resolves bug in which gc entries with a zero-length chain were not trimmed
+ * resolves bug where same gc entry tag was added to list for deletion
+   multiple times
+ 
+ These fixes were slated for back-port into Luminous and Mimic, but the
+ Luminous work was not completed because of a required dependency: AIO GC
+ [2]. This dependency has been resolved upstream, and is pending SRU
+ verification in Ubuntu packages [3].
+ 
+ [0] https://tracker.ceph.com/issues/38454
+ [1] https://github.com/ceph/ceph/pull/26601
+ [2] https://tracker.ceph.com/issues/23223
+ [3] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1838858

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Summary changed:

- Need backport of 0-length gc chain fixes to Luminous
+ Backport of zero-length gc chain fixes to Luminous

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs