Ceph maintains some of its own metadata in objects. In this case these are
hitsets, which track object accesses so Ceph can evaluate how hot an object
is when deciding what to flush and evict from the cache.
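Those hit-set parameters are queryable per pool. For reference, these are the same `ceph osd pool get` calls that appear further down in this thread:

```shell
# Inspect the hit-set settings that govern these internal metadata objects.
# Pool name "rbd-cache" is the one used in this thread.
ceph osd pool get rbd-cache hit_set_type     # e.g. bloom
ceph osd pool get rbd-cache hit_set_period   # seconds covered by each hitset
ceph osd pool get rbd-cache hit_set_count    # number of archived hitsets kept
```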

On Tuesday, November 3, 2015, Дмитрий Глушенок <[email protected]> wrote:

> Hi,
>
> Thanks Gregory and Robert, now it is a bit clearer.
>
> After cache-flush-evict-all almost all objects were deleted, but 101
> remained in the cache pool. Also 1 pg changed its state to inconsistent,
> with HEALTH_ERR.
> "ceph pg repair" changed the object count to 100, but at least ceph became
> healthy.
>
> Now it looks like:
> POOLS:
>     NAME                      ID     USED      %USED     MAX AVAIL
>  OBJECTS
>     rbd-cache                 36     23185         0          157G
>  100
>     rbd                       37         0         0          279G
>    0
> # rados -p rbd-cache ls -all
> # rados -p rbd ls -all
> #
>
> Is there any way to find out what these objects are?
>
> "ceph pg ls-by-pool rbd-cache" gives me pgs of the objects. Looking into
> these pgs gives me nothing I can understand :)
>
> # ceph pg ls-by-pool rbd-cache | head -4
> pg_stat objects mip     degr    misp    unf     bytes   log     disklog
> state   state_stamp     v       reported        up   up_primary
>  acting  acting_primary  last_scrub      scrub_stamp     last_deep_scrub
> deep_scrub_stamp
> 36.0    1       0       0       0       0       83      926     926
>  active+clean    2015-11-03 22:06:39.193371      798'926       798:640
> [4,0,3] 4       [4,0,3] 4       798'926 2015-11-03 22:06:39.193321
> 798'926 2015-11-03 22:06:39.193321
> 36.1    1       0       0       0       0       193     854     854
>  active+clean    2015-11-03 18:28:51.190819      798'854       798:515
> [1,4,3] 1       [1,4,3] 1       796'628 2015-11-03 18:28:51.190749
> 0'0     2015-11-02 18:28:42.546224
> 36.2    1       0       0       0       0       198     869     869
>  active+clean    2015-11-03 18:28:44.556048      798'869       798:554
> [2,0,1] 2       [2,0,1] 2       796'650 2015-11-03 18:28:44.555980
> 0'0     2015-11-02 18:28:42.546226
> #
>
> # find /var/lib/ceph/osd/ceph-0/current/36.0_head/
> /var/lib/ceph/osd/ceph-0/current/36.0_head/
> /var/lib/ceph/osd/ceph-0/current/36.0_head/__head_00000000__24
> /var/lib/ceph/osd/ceph-0/current/36.0_head/hit\uset\u36.0\uarchive\u2015-11-03
> 11:12:37.962360\u2015-11-03 21:28:58.149662__head_00000000_.ceph-internal_24
> # find /var/lib/ceph/osd/ceph-0/current/36.2_head/
> /var/lib/ceph/osd/ceph-0/current/36.2_head/
> /var/lib/ceph/osd/ceph-0/current/36.2_head/__head_00000002__24
> /var/lib/ceph/osd/ceph-0/current/36.2_head/hit\uset\u36.2\uarchive\u2015-11-02
> 19:50:00.788736\u2015-11-03 21:29:02.460568__head_00000002_.ceph-internal_24
> #
>
> # ls -l
> /var/lib/ceph/osd/ceph-0/current/36.0_head/hit\\uset\\u36.0\\uarchive\\u2015-11-03\
> 11\:12\:37.962360\\u2015-11-03\
> 21\:28\:58.149662__head_00000000_.ceph-internal_24
> -rw-r--r--. 1 root root 83 Nov  3 21:28
> /var/lib/ceph/osd/ceph-0/current/36.0_head/hit\uset\u36.0\uarchive\u2015-11-03
> 11:12:37.962360\u2015-11-03 21:28:58.149662__head_00000000_.ceph-internal_24
> #
> # ls -l
> /var/lib/ceph/osd/ceph-0/current/36.2_head/hit\\uset\\u36.2\\uarchive\\u2015-11-02\
> 19\:50\:00.788736\\u2015-11-03\
> 21\:29\:02.460568__head_00000002_.ceph-internal_24
> -rw-r--r--. 1 root root 198 Nov  3 21:29
> /var/lib/ceph/osd/ceph-0/current/36.2_head/hit\uset\u36.2\uarchive\u2015-11-02
> 19:50:00.788736\u2015-11-03 21:29:02.460568__head_00000002_.ceph-internal_24
> #
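As a side note, the `\u` sequences in those on-disk names appear to be FileStore's escaping of `_` in the logical object name; reversing that mapping makes the names readable. A minimal sketch, using a name from the listing above:

```shell
# FileStore escapes "_" in an object name as "\u" in the on-disk filename.
# Replacing "\u" back with "_" recovers the logical object name.
fname='hit\uset\u36.0\uarchive\u2015-11-03 11:12:37.962360\u2015-11-03 21:28:58.149662'
printf '%s\n' "$fname" | sed 's/\\u/_/g'
# -> hit_set_36.0_archive_2015-11-03 11:12:37.962360_2015-11-03 21:28:58.149662
```

The decoded name shows these are the archived hitset objects for pg 36.0, named after the time window each hitset covers.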
>
> --
> Dmitry Glushenok
> Jet Infosystems
>
>
> > On 3 Nov 2015, at 20:11, Robert LeBlanc <[email protected]> wrote:
> >
> >
> > Try:
> >
> > rados -p {cachepool} cache-flush-evict-all
> >
> > and see if the objects clean up.
> > ----------------
> > Robert LeBlanc
> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> >
> >
> > On Tue, Nov 3, 2015 at 8:02 AM, Gregory Farnum wrote:
> >> When you have a caching pool in writeback mode, updates to objects
> >> (including deletes) are handled by writeback rather than writethrough.
> >> Since there's no other activity against these pools, there is nothing
> >> prompting the cache pool to flush updates out to the backing pool, so
> >> the backing pool hasn't deleted its objects because nothing's told it
> >> to. You'll find that the cache pool has deleted the data for its
> >> objects, but it's keeping around a small "whiteout" and the object
> >> info metadata.
> >> The "rados ls" you're using has never played nicely with cache tiering
> >> and probably never will. :( Listings are expensive operations and
> >> modifying them to do more than the simple info scan would be fairly
> >> expensive in terms of computation and IO.
> >>
> >> I think there are some caching commands you can send to flush updates
> >> which would cause the objects to be entirely deleted, but I don't have
> >> them off-hand. You can probably search the mailing list archives or
> >> the docs for tiering commands. :)
> >> -Greg
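For completeness, the rados CLI does expose per-object cache-tier operations alongside the bulk `cache-flush-evict-all` that Robert suggests; a hedged sketch, reusing an object name from the bench output quoted below:

```shell
# Flush a dirty object back to the base tier, then evict the clean copy
# (and its whiteout) from the cache tier. Object name is from this thread.
obj=benchmark_data_dropbox01.tzk_7641_object10901
rados -p rbd-cache cache-flush "$obj"
rados -p rbd-cache cache-evict "$obj"

# Or flush and evict the whole cache pool in one pass:
rados -p rbd-cache cache-flush-evict-all
```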
> >>
> >> On Tue, Nov 3, 2015 at 12:40 AM, Дмитрий Глушенок wrote:
> >>> Hi,
> >>>
> >>> While benchmarking a tiered pool using rados bench it was noticed that
> objects were not being removed after the test.
> >>>
> >>> Test was performed using "rados -p rbd bench 3600 write". The pool is
> not used by anything else.
> >>>
> >>> Just before end of test:
> >>> POOLS:
> >>>    NAME                      ID     USED       %USED     MAX AVAIL
>  OBJECTS
> >>>    rbd-cache                 36     33110M      3.41          114G
>     8366
> >>>    rbd                       37     43472M      4.47          237G
>    10858
> >>>
> >>> Some time later (a few hundred writes have been flushed, rados automatic
> cleanup has finished):
> >>> POOLS:
> >>>    NAME                      ID     USED       %USED     MAX AVAIL
>  OBJECTS
> >>>    rbd-cache                 36      22998         0          157G
>    16342
> >>>    rbd                       37     46050M      4.74          234G
>    11503
> >>>
> >>> # rados -p rbd-cache ls | wc -l
> >>> 16242
> >>> # rados -p rbd ls | wc -l
> >>> 11503
> >>> #
> >>>
> >>> # rados -p rbd cleanup
> >>> error during cleanup: -2
> >>> error 2: (2) No such file or directory
> >>> #
> >>>
> >>> # rados -p rbd cleanup --run-name "" --prefix prefix ""
> >>> Warning: using slow linear search
> >>> Removed 0 objects
> >>> #
> >>>
> >>> # rados -p rbd ls | head -5
> >>> benchmark_data_dropbox01.tzk_7641_object10901
> >>> benchmark_data_dropbox01.tzk_7641_object9645
> >>> benchmark_data_dropbox01.tzk_7641_object10389
> >>> benchmark_data_dropbox01.tzk_7641_object10090
> >>> benchmark_data_dropbox01.tzk_7641_object11204
> >>> #
> >>>
> >>> #  rados -p rbd-cache ls | head -5
> >>> benchmark_data_dropbox01.tzk_7641_object10901
> >>> benchmark_data_dropbox01.tzk_7641_object9645
> >>> benchmark_data_dropbox01.tzk_7641_object10389
> >>> benchmark_data_dropbox01.tzk_7641_object5391
> >>> benchmark_data_dropbox01.tzk_7641_object10090
> >>> #
> >>>
> >>> So, it looks like the objects are still in place (in both pools?), but
> it is not possible to remove them:
> >>>
> >>> # rados -p rbd rm benchmark_data_dropbox01.tzk_7641_object10901
> >>> error removing rbd>benchmark_data_dropbox01.tzk_7641_object10901: (2)
> No such file or directory
> >>> #
> >>>
> >>> # ceph health
> >>> HEALTH_OK
> >>> #
> >>>
> >>>
> >>> Can somebody explain this behavior? And is it possible to clean up the
> benchmark data without recreating the pools?
> >>>
> >>>
> >>> ceph version 0.94.5
> >>>
> >>> # ceph osd dump | grep rbd
> >>> pool 36 'rbd-cache' replicated size 3 min_size 1 crush_ruleset 1
> object_hash rjenkins pg_num 100 pgp_num 100 last_change 755 flags
> hashpspool,incomplete_clones tier_of 37 cache_mode writeback target_bytes
> 107374182400 hit_set bloom{false_positive_probability: 0.05, target_size:
> 0, seed: 0} 3600s x1 stripe_width 0
> >>> pool 37 'rbd' erasure size 5 min_size 3 crush_ruleset 2 object_hash
> rjenkins pg_num 100 pgp_num 100 last_change 745 lfor 745 flags hashpspool
> tiers 36 read_tier 36 write_tier 36 stripe_width 4128
> >>> #
> >>>
> >>> # ceph osd pool get rbd-cache hit_set_type
> >>> hit_set_type: bloom
> >>> # ceph osd pool get rbd-cache hit_set_period
> >>> hit_set_period: 3600
> >>> # ceph osd pool get rbd-cache hit_set_count
> >>> hit_set_count: 1
> >>> # ceph osd pool get rbd-cache target_max_objects
> >>> target_max_objects: 0
> >>> # ceph osd pool get rbd-cache target_max_bytes
> >>> target_max_bytes: 107374182400
> >>> # ceph osd pool get rbd-cache cache_target_dirty_ratio
> >>> cache_target_dirty_ratio: 0.1
> >>> # ceph osd pool get rbd-cache cache_target_full_ratio
> >>> cache_target_full_ratio: 0.2
> >>> #
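Plugging those numbers together (a sketch only; the OSDs apply the ratios per PG and approximately, not as exact global cutoffs):

```shell
# target_max_bytes is 100 GiB; the two ratios are fractions of that target.
target_max_bytes=107374182400
flush_at=$(( target_max_bytes * 10 / 100 ))  # cache_target_dirty_ratio 0.1
evict_at=$(( target_max_bytes * 20 / 100 ))  # cache_target_full_ratio 0.2
echo "start flushing near ${flush_at} bytes (~10 GiB dirty)"
echo "start evicting near ${evict_at} bytes (~20 GiB used)"
```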
> >>>
> >>> Crush map:
> >>> root cache_tier {
> >>>        id -7           # do not change unnecessarily
> >>>        # weight 0.450
> >>>        alg straw
> >>>        hash 0  # rjenkins1
> >>>        item osd.0 weight 0.090
> >>>        item osd.1 weight 0.090
> >>>        item osd.2 weight 0.090
> >>>        item osd.3 weight 0.090
> >>>        item osd.4 weight 0.090
> >>> }
> >>> root store_tier {
> >>>        id -8           # do not change unnecessarily
> >>>        # weight 0.450
> >>>        alg straw
> >>>        hash 0  # rjenkins1
> >>>        item osd.5 weight 0.090
> >>>        item osd.6 weight 0.090
> >>>        item osd.7 weight 0.090
> >>>        item osd.8 weight 0.090
> >>>        item osd.9 weight 0.090
> >>> }
> >>> rule cache {
> >>>        ruleset 1
> >>>        type replicated
> >>>        min_size 0
> >>>        max_size 5
> >>>        step take cache_tier
> >>>        step chooseleaf firstn 0 type osd
> >>>        step emit
> >>> }
> >>> rule store {
> >>>        ruleset 2
> >>>        type erasure
> >>>        min_size 0
> >>>        max_size 5
> >>>        step take store_tier
> >>>        step chooseleaf firstn 0 type osd
> >>>        step emit
> >>> }
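Rules like these can be sanity-checked offline with crushtool; a hedged sketch, assuming the decompiled map above is saved as crushmap.txt (the filenames are examples):

```shell
# Compile the text map, then simulate placements for rule 1 (the cache rule)
# with 3 replicas, printing the OSD sets CRUSH would choose.
crushtool -c crushmap.txt -o crushmap.bin
crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings
```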
> >>>
> >>> Thanks
> >>>
> >>> --
> >>> Dmitry Glushenok
> >>> Jet Infosystems
> >>>
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> [email protected]
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
