Hi David,

Right, we use this cluster (v12.2.5, fresh installation) for RGW.
However, I don't see a default.rgw.gc pool like the one we have on
another cluster, which was upgraded to Luminous (10.2.9 -> 10.2.10 ->
12.2.2). I believe that cluster's default.rgw.gc pool dates back to when
RGW was set up on Jewel, where it was created automatically.

On the impacted cluster we have the following pools:
pool 1 '.rgw.root' replicated size 3 min_size 2 crush_rule 2 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 1499 flags hashpspool stripe_width
0 application rgw
pool 2 'rbd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins
pg_num 2048 pgp_num 2048 last_change 2230 flags hashpspool stripe_width 0
application rbd
pool 3 'default.rgw.control' replicated size 3 min_size 2 crush_rule 2
object_hash rjenkins pg_num 8 pgp_num 8 last_change 1501 flags hashpspool
stripe_width 0 application rgw
pool 4 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 2
object_hash rjenkins pg_num 8 pgp_num 8 last_change 1491 flags hashpspool
stripe_width 0 application rgw
pool 5 'default.rgw.log' replicated size 3 min_size 2 crush_rule 2
object_hash rjenkins pg_num 8 pgp_num 8 last_change 1486 flags hashpspool
stripe_width 0 application rgw
pool 7 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule
2 object_hash rjenkins pg_num 8 pgp_num 8 last_change 1483 owner
18446744073709551615 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.buckets.data' erasure size 12 min_size 10 crush_rule 3
object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 2228 flags
hashpspool max_bytes 879609302220800 stripe_width 36864 application rgw
pool 9 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule
2 object_hash rjenkins pg_num 8 pgp_num 8 last_change 1506 owner
18446744073709551615 flags hashpspool stripe_width 0 application rgw

The eight problematic OSDs match the primary OSDs of the PGs that make up
the default.rgw.log pool, which suggests the GC objects live in that pool
on this fresh installation.
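
In case it helps, here is a sketch of how the two can be matched up. It
assumes the default rgw_gc_max_objs = 32 (GC shard objects named gc.0 ..
gc.31) and that, as on other fresh Luminous installs, they sit in the gc
namespace of default.rgw.log; since RADOS placement hashes the object
name and not the namespace, the pool-level mapping should be enough:

for i in $(seq 0 31); do
  ceph osd map default.rgw.log gc.$i
done | grep -o 'acting.*' | sort | uniq -c

With pg_num 8 on default.rgw.log, the 32 GC shards can hash onto at most
8 distinct PGs, so at most 8 primary OSDs end up doing all of the GC
listing, which would explain why it is always the same eight.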

Many thanks
Jakub


On Mon, Aug 20, 2018 at 11:54 AM David Turner <[email protected]> wrote:

> I'm assuming you use RGW and that you have a GC pool for RGW. It might
> also be assumed that your GC pool only has 8 PGs. Are any of those
> guesses correct?
>
> On Mon, Aug 20, 2018, 5:13 AM Jakub Jaszewski <[email protected]>
> wrote:
>
>> Issue tracker: http://tracker.ceph.com/issues/23801.
>> I still don't know why only particular OSDs write this information to
>> their log files.
>>
>> Jakub
>>
>> On Wed, Aug 8, 2018 at 12:02 PM Jakub Jaszewski <
>> [email protected]> wrote:
>>
>>> Hi All, it is exactly the same story today: the same 8 OSDs and a lot
>>> of garbage collection objects to process.
>>>
>>> Below is the number of "cls_rgw.cc:3284: gc_iterate_entries end_key="
>>> entries per OSD log file:
>>> hostA:
>>>   /var/log/ceph/ceph-osd.58.log
>>>   1826467
>>> hostB:
>>>   /var/log/ceph/ceph-osd.88.log
>>>   2924241
>>> hostC:
>>>   /var/log/ceph/ceph-osd.153.log
>>>   581002
>>>   /var/log/ceph/ceph-osd.164.log
>>>   3278606
>>> hostD:
>>>   /var/log/ceph/ceph-osd.95.log
>>>   1426963
>>> hostE:
>>>   /var/log/ceph/ceph-osd.4.log
>>>   2716914
>>>   /var/log/ceph/ceph-osd.53.log
>>>   943749
>>> hostF:
>>>   /var/log/ceph/ceph-osd.172.log
>>>   4085334
>>>
>>>
>>> # radosgw-admin gc list --include-all | grep oid | wc -l
>>> 302357
>>> #
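>>>
>>> For reference, the per-OSD counts above can be reproduced with
>>> something like this on each host (the pattern comes straight from the
>>> log entries):
>>>
>>> grep -c 'gc_iterate_entries end_key=' /var/log/ceph/ceph-osd.*.log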
>>>
>>> Can anyone please explain what is going on?
>>>
>>> Thanks!
>>> Jakub
>>>
>>> On Tue, Aug 7, 2018 at 3:03 PM Jakub Jaszewski <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> 8 out of 192 OSDs in our cluster (version 12.2.5) write a large
>>>> number of records like "cls_rgw.cc:3284: gc_iterate_entries end_key="
>>>> to their log files, e.g.:
>>>>
>>>> 2018-08-07 04:34:06.000585 7fdd8f012700  0 <cls>
>>>> /build/ceph-12.2.5/src/cls/rgw/cls_rgw.cc:3284: gc_iterate_entries
>>>> end_key=1_01533616446.000580407
>>>> 2018-08-07 04:34:06.001888 7fdd8f012700  0 <cls>
>>>> /build/ceph-12.2.5/src/cls/rgw/cls_rgw.cc:3284: gc_iterate_entries
>>>> end_key=1_01533616446.001886318
>>>> 2018-08-07 04:34:06.003395 7fdd8f012700  0 <cls>
>>>> /build/ceph-12.2.5/src/cls/rgw/cls_rgw.cc:3284: gc_iterate_entries
>>>> end_key=1_01533616446.003390299
>>>> 2018-08-07 04:34:06.005205 7fdd8f012700  0 <cls>
>>>> /build/ceph-12.2.5/src/cls/rgw/cls_rgw.cc:3284: gc_iterate_entries
>>>> end_key=1_01533616446.005200341
>>>>
>>>> # grep '2018-08-07 04:34:06' /var/log/ceph/ceph-osd.4.log | wc -l
>>>> 712
>>>> #
>>>>
>>>> At the same time there were roughly 500,000 expired garbage
>>>> collection objects.
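>>>>
>>>> A count of just the expired entries can be taken the same way; if I
>>>> understand the tool correctly, gc list without --include-all prints
>>>> only entries whose expiration time has already passed:
>>>>
>>>> # radosgw-admin gc list | grep oid | wc -l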
>>>>
>>>> The log level of the OSD subsystem is set to the default 1/5 across
>>>> all OSDs.
>>>>
>>>> I wonder why only a few OSDs record this information, and whether it
>>>> is something that should be logged at level 1 or only at a higher
>>>> level:
>>>> https://github.com/ceph/ceph/blob/v12.2.5/src/cls/rgw/cls_rgw.cc#L3284
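>>>>
>>>> Judging by the " 0 <cls>" prefix in the entries above, the message is
>>>> emitted at level 0, which would explain why it shows up even with the
>>>> default settings. The levels a running daemon actually uses can be
>>>> checked like this, run on the OSD's host (debug_objclass is my guess
>>>> at the subsystem that gates the <cls> messages):
>>>>
>>>> ceph daemon osd.4 config get debug_osd
>>>> ceph daemon osd.4 config get debug_objclass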
>>>>
>>>> Thanks
>>>> Jakub
>>>>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
