On Sun, Oct 22, 2017 at 8:05 PM, Yan, Zheng <uker...@gmail.com> wrote:

> On Mon, Oct 23, 2017 at 9:35 AM, Eric Eastman
> <eric.east...@keepertech.com> wrote:
> > With help from the list we recently recovered one of our Jewel based
> > clusters that started failing when we got to about 4800 cephfs snapshots.
> > We understand that cephfs snapshots are still marked experimental. We are
> > running a single active MDS with 2 standby MDS. We only have a single
> > file system, we are only taking snapshots from the top level directory,
> > and we are now planning on limiting snapshots to a few hundred. Currently
> > we have removed all snapshots from this system, using rmdir on each
> > snapshot directory, and the system is reporting that it is healthy:
> >
> > ceph -s
> >     cluster ba0c94fc-1168-11e6-aaea-000c290cc2d4
> >      health HEALTH_OK
> >      monmap e1: 3 mons at
> > {mon01=10.16.51.21:6789/0,mon02=10.16.51.22:6789/0,mon03=10.16.51.23:6789/0}
> >             election epoch 202, quorum 0,1,2 mon01,mon02,mon03
> >       fsmap e18283: 1/1/1 up {0=mds01=up:active}, 2 up:standby
> >      osdmap e342543: 93 osds: 93 up, 93 in
> >             flags sortbitwise,require_jewel_osds
> >       pgmap v38759308: 11336 pgs, 9 pools, 23107 GB data, 12086 kobjects
> >             73956 GB used, 209 TB / 281 TB avail
> >                11336 active+clean
> >   client io 509 kB/s rd, 2548 B/s wr, 0 op/s rd, 1 op/s wr
> >
> > The snapshots were removed several days ago, but just as an experiment I
> > decided to query a few PGs in the cephfs data storage pool, and I am
> > seeing they are all listing:
> >
> > “purged_snaps": "[2~12cd,12d0~12c9]",
>
> purged_snaps lists the IDs of snapshots whose data have been completely
> purged. Currently the purged_snaps set is append-only; the OSD never
> removes IDs from it.



Thank you for the quick reply.
So it is normal to have "purged_snaps" listed on a system where all
snapshots have been deleted.
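
In case it is useful to anyone else hitting this, the rough check I settled
on for "is trimming actually done?" is below. It is only a sketch and
assumes jq is installed; the field paths come from the "ceph pg 1.72 query"
output quoted below, and the PG IDs are just examples from pool 1
(cephfs_data):

# Print snap_trimq and purged_snaps for a few PGs in the cephfs data
# pool.  An empty snap_trimq ("[]") on every PG means no snapshot
# trimming is pending; purged_snaps is expected to stay populated.
for pg in 1.0 1.1 1.2 1.72; do
    printf '%s  ' "$pg"
    ceph pg "$pg" query | jq -c '[.snap_trimq, .info.purged_snaps]'
done
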
Eric

>
>
>
>
> > Here is an example:
> >
> > ceph pg 1.72 query
> > {
> >     "state": "active+clean",
> >     "snap_trimq": "[]",
> >     "epoch": 342540,
> >     "up": [
> >         75,
> >         77,
> >         82
> >     ],
> >     "acting": [
> >         75,
> >         77,
> >         82
> >     ],
> >     "actingbackfill": [
> >         "75",
> >         "77",
> >         "82"
> >     ],
> >     "info": {
> >         "pgid": "1.72",
> >         "last_update": "342540'261039",
> >         "last_complete": "342540'261039",
> >         "log_tail": "341080'260697",
> >         "last_user_version": 261039,
> >         "last_backfill": "MAX",
> >         "last_backfill_bitwise": 1,
> >         "purged_snaps": "[2~12cd,12d0~12c9]",
> > …
> >
> > Is this an issue?
> > I am not seeing any recent trim activity.
> > Are there any procedures documented for looking at snapshots to see if
> > there are any issues?
> >
> > Before posting this, I have reread the cephfs and snapshot pages at:
> > http://docs.ceph.com/docs/master/cephfs/
> > http://docs.ceph.com/docs/master/dev/cephfs-snapshots/
> >
> > Looked at the slides:
> > http://events.linuxfoundation.org/sites/events/files/slides/2017-03-23%20Vault%20Snapshots.pdf
> >
> > Watched the video “Ceph Snapshots for Fun and Profit” given at the last
> > OpenStack conference.
> >
> > And I still can’t find much info on debugging snapshots.
> >
> > Here is some additional information on the cluster:
> >
> > ceph df
> > GLOBAL:
> >     SIZE     AVAIL     RAW USED     %RAW USED
> >     281T      209T       73955G         25.62
> > POOLS:
> >     NAME                ID     USED       %USED     MAX AVAIL      OBJECTS
> >     rbd                 0          16         0        56326G            3
> >     cephfs_data         1      22922G     28.92        56326G     12279871
> >     cephfs_metadata     2      89260k         0        56326G        45232
> >     cinder              9        147G      0.26        56326G        41420
> >     glance              10          0         0        56326G            0
> >     cinder-backup       11          0         0        56326G            0
> >     cinder-ssltest      23      1362M         0        56326G          431
> >     IDMT-dfgw02         27      2552M         0        56326G          758
> >     dfbackup            28     33987M      0.06        56326G         8670
> >
> >
> > Recent tickets and posts on problems with this cluster:
> > http://tracker.ceph.com/issues/21761
> > http://tracker.ceph.com/issues/21412
> > https://www.spinics.net/lists/ceph-devel/msg38203.html
> >
> > ceph -v
> > ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
> >
> > Kernel is 4.13.1
> > uname -a
> > Linux ss001 4.13.1-041301-generic #201709100232 SMP Sun Sep 10 06:33:36 UTC
> > 2017 x86_64 x86_64 x86_64 GNU/Linux
> >
> > OS is Ubuntu 16.04
> >
> > Thanks
> > Eric
> >
> >
>
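
Regarding the snapshot removal step quoted above: the only other sanity
check I came up with was listing the snapshot directory from a client
mount. A small sketch, assuming the file system is mounted at /mnt/cephfs
(the path is just an example) and that the default ".snap" snapdir name
has not been changed via the client snapdir option:

# Any snapshots still present show up as entries under .snap at the top
# of the file system; an empty listing means none remain from the
# client's point of view.
ls /mnt/cephfs/.snap

# Or just count them, which is handy when there were thousands:
ls /mnt/cephfs/.snap | wc -l
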
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
