[ceph-users] Re: CephFS ghost usage/inodes
Any other ideas?

> On 15.01.2020 at 15:50, Oskar Malnowicz wrote:
>
> the situation is:
>
>     health: HEALTH_WARN
>             1 pools have many more objects per pg than average
>
> $ ceph health detail
> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>     pool cephfs_data objects per pg (315399) is more than 1227.23 times
>     cluster average (257)
>
> $ ceph df
> RAW STORAGE:
>     CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
>     hdd      7.8 TiB    7.4 TiB    326 GiB    343 GiB     4.30
>     TOTAL    7.8 TiB    7.4 TiB    326 GiB    343 GiB     4.30
>
> POOLS:
>     POOL               ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
>     cephfs_data        6     2.2 TiB    2.52M      2.2 TiB    26.44    3.0 TiB
>     cephfs_metadata    7     9.7 MiB    379        9.7 MiB    0        3.0 TiB
>
> The stored value of the "cephfs_data" pool is 2.2 TiB. This must be wrong.
> When I execute "du -sh" at the CephFS root "/" I get a usage of:
>
> $ du -sh
> 31G    .
>
> "df -h" shows:
>
> $ df -h
> Filesystem       Size  Used  Avail  Use%  Mounted on
> ip1,ip2,ip3:/    5.2T  2.2T  3.0T   43%   /storage/cephfs
>
> It says that "Used" is 2.2T, but "du" shows 31G.
>
> The pg_num of the "cephfs_data" pool is currently 8. The autoscaler suggests
> setting it to 512:
>
> $ ceph osd pool autoscale-status
>     POOL               SIZE     TARGET SIZE    RATE    RAW CAPACITY    RATIO     TARGET RATIO    BIAS    PG_NUM    NEW PG_NUM    AUTOSCALE
>     cephfs_metadata    9994k                   2.0     7959G           0.0000                    1.0     8                       off
>     cephfs_data        2221G                   2.0     7959G           0.5582                    1.0     8         512           off
>
> After setting pg_num to 512 the situation is:
>
> $ ceph health detail
> HEALTH_WARN 1 pools have many more objects per pg than average
> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>     pool cephfs_data objects per pg (4928) is more than 100.571 times cluster
>     average (49)
>
> $ ceph df
> RAW STORAGE:
>     CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
>     hdd      7.8 TiB    7.4 TiB    329 GiB    346 GiB     4.34
>     TOTAL    7.8 TiB    7.4 TiB    329 GiB    346 GiB     4.34
>
> POOLS:
>     POOL               ID    STORED     OBJECTS    USED      %USED    MAX AVAIL
>     cephfs_data        6     30 GiB     2.52M      61 GiB    0.99     3.0 TiB
>     cephfs_metadata    7     9.8 MiB    379        20 MiB    0        3.0 TiB
>
> The "stored" value changed from 2.2 TiB to 30 GiB !!! This should be the
> correct usage/size.
>
> When I execute "du -sh" at the CephFS root "/" I again get:
>
> $ du -sh
> 31G
>
> and "df -h" again shows:
>
> $ df -h
> Filesystem       Size  Used  Avail  Use%  Mounted on
> ip1,ip2,ip3:/    5.2T  2.2T  3.0T   43%   /storage/cephfs
>
> It says that "Used" is 2.2T, but "du" shows 31G.
>
> Can anybody explain to me what the problem is?
>
> On 14.01.20 at 11:15, Florian Pritz wrote:
>> Hi,
>>
>> When we tried putting some load on our test cephfs setup by restoring a
>> backup in artifactory, we eventually ran out of space (around 95% used
>> in `df` = 3.5TB) which caused artifactory to abort the restore and clean
>> up. However, while a simple `find` no longer shows the files, `df` still
>> claims that we have around 2.1TB of data on the cephfs. `df -i` also
>> shows 2.4M used inodes. When using `du -sh` on a top-level mountpoint, I
>> get 31G used, which is data that is still really here and which is
>> expected to be here.
>>
>> Consequently, we also get the following warning:
>>
>>> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>>>     pool cephfs_data objects per pg (38711) is more than 231.802 times
>>>     cluster average (167)
>>
>> We are running ceph 14.2.5.
>>
>> We have snapshots enabled on cephfs, but there are currently no active
>> snapshots listed by `ceph daemon mds.$hostname dump snaps --server` (see
>> below). I can't say for sure if we created snapshots during the backup
>> restore.
>>
>>> {
>>>     "last_snap": 39,
>>>     "last_created": 38,
>>>     "last_destroyed": 39,
>>>     "pending_noop": [],
>>>     "snaps": [],
>>>     "need_to_purge": {},
>>>     "pending_update": [],
>>>     "pending_destroy": []
>>> }
>>
>> We only have a single CephFS.
>>
>> We use the pool_namespace xattr for our various directory trees on the
>> cephfs.
>>
>> `ceph df` shows:
>>
>>> POOL            ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
>>> cephfs_data     6     2.1 TiB    2.48M      2.1 TiB    24.97    3.1 TiB
>>
>> `ceph daemon mds.$hostname perf dump | grep stray` shows:
>>
>>> "num_strays": 0,
>>> "num_strays_delayed": 0,
>>> "num_strays_enqueuing": 0,
>>> "strays_created": 5097138,
>>> "strays_enqueued": 5097138,
>>> "strays_reintegrated": 0,
>>> "strays_migrated": 0,
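For reference, the pg_num bump described above, and letting the autoscaler manage it afterwards, only needs the standard pool commands. A minimal sketch, assuming the pool names from this thread and a Nautilus (14.2.x) cluster, where pgp_num follows pg_num automatically:

# raise pg_num to the value the autoscaler suggests
$ ceph osd pool set cephfs_data pg_num 512

# optionally hand the decision to the autoscaler from now on
$ ceph osd pool set cephfs_data pg_autoscale_mode on
$ ceph osd pool autoscale-status

Note that this only spreads the objects over more PGs; it does not explain or remove the ghost objects themselves.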
[ceph-users] Re: CephFS ghost usage/inodes
I think there is something wrong with the cephfs_data pool. I created a new
pool "cephfs_data2" and copied the data from "cephfs_data" to "cephfs_data2"
using this command:

$ rados cppool cephfs_data cephfs_data2

$ ceph df detail
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd      7.8 TiB    7.4 TiB    390 GiB    407 GiB     5.11
    TOTAL    7.8 TiB    7.4 TiB    390 GiB    407 GiB     5.11

POOLS:
    POOL               ID    STORED     OBJECTS    USED      %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY     USED COMPR    UNDER COMPR
    cephfs_data        6     30 GiB     2.52M      61 GiB    1.02     2.9 TiB      N/A              N/A            2.52M     0 B           0 B
    cephfs_data2       20    30 GiB     11.06k     61 GiB    1.02     2.9 TiB      N/A              N/A            11.06k    0 B           0 B
    cephfs_metadata    7     9.8 MiB    379        20 MiB    0        2.9 TiB      N/A              N/A            379       0 B           0 B

In the new pool the stored amount is also 30 GiB, but the object count and the
dirty count are significantly smaller. I think there are something like
"orphaned" objects in the "cephfs_data" pool. But how can I clean up that pool?

On 14.01.20 at 11:15, Florian Pritz wrote:
> Hi,
>
> When we tried putting some load on our test cephfs setup by restoring a
> backup in artifactory, we eventually ran out of space (around 95% used
> in `df` = 3.5TB) which caused artifactory to abort the restore and clean
> up. However, while a simple `find` no longer shows the files, `df` still
> claims that we have around 2.1TB of data on the cephfs. `df -i` also
> shows 2.4M used inodes. When using `du -sh` on a top-level mountpoint, I
> get 31G used, which is data that is still really here and which is
> expected to be here.
>
> Consequently, we also get the following warning:
>
>> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>>     pool cephfs_data objects per pg (38711) is more than 231.802 times
>>     cluster average (167)
>
> We are running ceph 14.2.5.
>
> We have snapshots enabled on cephfs, but there are currently no active
> snapshots listed by `ceph daemon mds.$hostname dump snaps --server` (see
> below). I can't say for sure if we created snapshots during the backup
> restore.
>
>> {
>>     "last_snap": 39,
>>     "last_created": 38,
>>     "last_destroyed": 39,
>>     "pending_noop": [],
>>     "snaps": [],
>>     "need_to_purge": {},
>>     "pending_update": [],
>>     "pending_destroy": []
>> }
>
> We only have a single CephFS.
>
> We use the pool_namespace xattr for our various directory trees on the
> cephfs.
>
> `ceph df` shows:
>
>> POOL            ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
>> cephfs_data     6     2.1 TiB    2.48M      2.1 TiB    24.97    3.1 TiB
>
> `ceph daemon mds.$hostname perf dump | grep stray` shows:
>
>> "num_strays": 0,
>> "num_strays_delayed": 0,
>> "num_strays_enqueuing": 0,
>> "strays_created": 5097138,
>> "strays_enqueued": 5097138,
>> "strays_reintegrated": 0,
>> "strays_migrated": 0,
>
> `rados -p cephfs_data df` shows:
>
>> POOL_NAME      USED       OBJECTS    CLONES    COPIES     MISSING_ON_PRIMARY    UNFOUND    DEGRADED    RD_OPS      RD         WR_OPS      WR        USED COMPR    UNDER COMPR
>> cephfs_data    2.1 TiB    2477540    0         4955080    0                     0          0           10699626    6.9 TiB    86911076    35 TiB    0 B           0 B
>>
>> total_objects    29718
>> total_used       329 GiB
>> total_avail      7.5 TiB
>> total_space      7.8 TiB
>
> When I combine the usage and the free space shown by `df` we would
> exceed our cluster size. Our test cluster currently has 7.8TB total
> space with a replication size of 2 for all pools. With 2.1TB
> "used" on the cephfs according to `df` + 3.1TB being shown as "free" I
> get 5.2TB total size. This would mean >10TB of data when accounted for
> replication. Clearly this can't fit on a cluster with only 7.8TB of
> capacity.
>
> Do you have any ideas why we see so many objects and so much reported
> usage? Is there any way to fix this without recreating the cephfs?
>
> Florian
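Regarding the "how can I clean up that pool" question: CephFS data objects are named <inode number in hex>.<object index in hex>, so one way to see which inodes the surplus objects belong to is to group the RADOS object names by their inode prefix and then look at the backtrace xattr of a sample object. A rough sketch, assuming the pool name from this thread; the object name below is only a placeholder, and nothing should be deleted based on this listing alone:

# count how many objects each inode prefix owns; orphans from the aborted
# restore should show up as prefixes that no longer match any visible file
$ rados -p cephfs_data ls | cut -d. -f1 | sort | uniq -c | sort -rn | head -20

# the first object of a file (suffix .00000000) carries a "parent" backtrace
# xattr that records which path the inode belonged to
$ rados -p cephfs_data getxattr 10000000000.00000000 parent > parent.bin
$ ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json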
[ceph-users] Re: CephFS ghost usage/inodes
the situation is:

    health: HEALTH_WARN
            1 pools have many more objects per pg than average

$ ceph health detail
MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
    pool cephfs_data objects per pg (315399) is more than 1227.23 times
    cluster average (257)

$ ceph df
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd      7.8 TiB    7.4 TiB    326 GiB    343 GiB     4.30
    TOTAL    7.8 TiB    7.4 TiB    326 GiB    343 GiB     4.30

POOLS:
    POOL               ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
    cephfs_data        6     2.2 TiB    2.52M      2.2 TiB    26.44    3.0 TiB
    cephfs_metadata    7     9.7 MiB    379        9.7 MiB    0        3.0 TiB

The stored value of the "cephfs_data" pool is 2.2 TiB. This must be wrong.
When I execute "du -sh" at the CephFS root "/" I get a usage of:

$ du -sh
31G    .

"df -h" shows:

$ df -h
Filesystem       Size  Used  Avail  Use%  Mounted on
ip1,ip2,ip3:/    5.2T  2.2T  3.0T   43%   /storage/cephfs

It says that "Used" is 2.2T, but "du" shows 31G.

The pg_num of the "cephfs_data" pool is currently 8. The autoscaler suggests
setting it to 512:

$ ceph osd pool autoscale-status
    POOL               SIZE     TARGET SIZE    RATE    RAW CAPACITY    RATIO     TARGET RATIO    BIAS    PG_NUM    NEW PG_NUM    AUTOSCALE
    cephfs_metadata    9994k                   2.0     7959G           0.0000                    1.0     8                       off
    cephfs_data        2221G                   2.0     7959G           0.5582                    1.0     8         512           off

After setting pg_num to 512 the situation is:

$ ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average
MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
    pool cephfs_data objects per pg (4928) is more than 100.571 times cluster
    average (49)

$ ceph df
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd      7.8 TiB    7.4 TiB    329 GiB    346 GiB     4.34
    TOTAL    7.8 TiB    7.4 TiB    329 GiB    346 GiB     4.34

POOLS:
    POOL               ID    STORED     OBJECTS    USED      %USED    MAX AVAIL
    cephfs_data        6     30 GiB     2.52M      61 GiB    0.99     3.0 TiB
    cephfs_metadata    7     9.8 MiB    379        20 MiB    0        3.0 TiB

The "stored" value changed from 2.2 TiB to 30 GiB !!! This should be the
correct usage/size.

When I execute "du -sh" at the CephFS root "/" I again get:

$ du -sh
31G

and "df -h" again shows:

$ df -h
Filesystem       Size  Used  Avail  Use%  Mounted on
ip1,ip2,ip3:/    5.2T  2.2T  3.0T   43%   /storage/cephfs

It says that "Used" is 2.2T, but "du" shows 31G.

Can anybody explain to me what the problem is?

On 14.01.20 at 11:15, Florian Pritz wrote:
> Hi,
>
> When we tried putting some load on our test cephfs setup by restoring a
> backup in artifactory, we eventually ran out of space (around 95% used
> in `df` = 3.5TB) which caused artifactory to abort the restore and clean
> up. However, while a simple `find` no longer shows the files, `df` still
> claims that we have around 2.1TB of data on the cephfs. `df -i` also
> shows 2.4M used inodes. When using `du -sh` on a top-level mountpoint, I
> get 31G used, which is data that is still really here and which is
> expected to be here.
>
> Consequently, we also get the following warning:
>
>> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
>>     pool cephfs_data objects per pg (38711) is more than 231.802 times
>>     cluster average (167)
>
> We are running ceph 14.2.5.
>
> We have snapshots enabled on cephfs, but there are currently no active
> snapshots listed by `ceph daemon mds.$hostname dump snaps --server` (see
> below). I can't say for sure if we created snapshots during the backup
> restore.
>
>> {
>>     "last_snap": 39,
>>     "last_created": 38,
>>     "last_destroyed": 39,
>>     "pending_noop": [],
>>     "snaps": [],
>>     "need_to_purge": {},
>>     "pending_update": [],
>>     "pending_destroy": []
>> }
>
> We only have a single CephFS.
>
> We use the pool_namespace xattr for our various directory trees on the
> cephfs.
>
> `ceph df` shows:
>
>> POOL            ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
>> cephfs_data     6     2.1 TiB    2.48M      2.1 TiB    24.97    3.1 TiB
>
> `ceph daemon mds.$hostname perf dump | grep stray` shows:
>
>> "num_strays": 0,
>> "num_strays_delayed": 0,
>> "num_strays_enqueuing": 0,
>> "strays_created": 5097138,
>> "strays_enqueued": 5097138,
>> "strays_reintegrated": 0,
>> "strays_migrated": 0,
>
> `rados -p cephfs_data df` shows:
>
>> POOL_NAME      USED       OBJECTS    CLONES    COPIES     MISSING_ON_PRIMARY    UNFOUND    DEGRADED    RD_OPS      RD         WR_OPS      WR        USED COMPR    UNDER COMPR
>> cephfs_data    2.1 TiB    2477540    0         4955080    0                     0          0           10699626    6.9 TiB    86911076    35 TiB    0 B           0 B
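The numbers in the MANY_OBJECTS_PER_PG warning above follow directly from having only 8 PGs in the pool; a quick sanity check of the arithmetic:

# 315399 objects per PG across pg_num=8 accounts for the ~2.52M objects in the pool
$ echo $((315399 * 8))
2523192

# and 315399 is roughly 1227x the reported cluster average of 257 objects per PG
$ echo "scale=2; 315399 / 257" | bc
1227.23

So the warning itself is only a symptom of pg_num being far too small for the object count; the real oddity is that the pool holds about 2.5M objects while the surviving data is only about 31G.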
[ceph-users] Re: CephFS ghost usage/inodes
I executed the commands from above again ("Recovery from missing metadata
objects") and now the MDS daemons start. Still the same situation as before :(

On 14.01.20 at 22:36, Oskar Malnowicz wrote:
> I just restarted the MDS daemons and now they crash during boot.
>
>    -36> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening inotable
>    -35> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening sessionmap
>    -34> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening mds log
>    -33> 2020-01-14 22:33:17.880 7fc9bbeaa700  5 mds.0.log open discovering log bounds
>    -32> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening purge queue (async)
>    -31> 2020-01-14 22:33:17.880 7fc9bbeaa700  4 mds.0.purge_queue open: opening
>    -30> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) recover start
>    -29> 2020-01-14 22:33:17.880 7fc9bb6a9700  4 mds.0.journalpointer Reading journal pointer '400.'
>    -28> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) read_head
>    -27> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: loading open file table (async)
>    -26> 2020-01-14 22:33:17.880 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83436d80 auth_method 0
>    -25> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening snap table
>    -24> 2020-01-14 22:33:17.884 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83437680 auth_method 0
>    -23> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83437200 auth_method 0
>    -22> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_read_head loghead(trim 805306368, expire 807199928, write 807199928, stream_format 1). probing for end of log (from 807199928)...
>    -21> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) probing for end of the log
>    -20> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) recover start
>    -19> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) read_head
>    -18> 2020-01-14 22:33:17.884 7fc9bb6a9700  4 mds.0.log Waiting for journal 0x200 to recover...
>    -17> 2020-01-14 22:33:17.884 7fc9c68a7700 10 monclient: get_auth_request con 0x55aa83437f80 auth_method 0
>    -16> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83438400 auth_method 0
>    -15> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_read_head loghead(trim 98280931328, expire 98282151365, write 98282247624, stream_format 1). probing for end of log (from 98282247624)...
>    -14> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) probing for end of the log
>    -13> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_probe_end write_pos = 807199928 (header had 807199928). recovered.
>    -12> 2020-01-14 22:33:17.892 7fc9bceac700  4 mds.0.purge_queue operator(): open complete
>    -11> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) set_writeable
>    -10> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_probe_end write_pos = 98283021535 (header had 98282247624). recovered.
>     -9> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Journal 0x200 recovered.
>     -8> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Recovered journal 0x200 in format 1
>     -7> 2020-01-14 22:33:17.892 7fc9bb6a9700  2 mds.0.13470 Booting: 1: loading/discovering base inodes
>     -6> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x100
>     -5> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x1
>     -4> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: replaying mds log
>     -3> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: waiting for purge queue recovered
>     -2> 2020-01-14 22:33:17.908 7fc9ba6a7700 -1 log_channel(cluster) log [ERR] : ESession.replay sessionmap v 7561128 - 1 > table 0
>     -1> 2020-01-14 22:33:17.912 7fc9ba6a7700 -1 /build/ceph-14.2.5/src/mds/journal.cc: In function 'virtual void ESession::replay(MDSRank*)' thread 7fc9ba6a7700 time 2020-01-14 22:33:17.912135
> /build/ceph-14.2.5/src/mds/journal.cc: 1655: FAILED ceph_assert(g_conf()->mds_wipe_sessions)
>
> On 14.01.20 at 21:19, Oskar Malnowicz wrote:
>> This is the new state. The results are equal to Florian's.
>>
>> $ time cephfs-data-scan scan_extents cephfs_data
>> cephfs-data-scan scan_extents cephfs_data  1.86s user 1.47s system 21% cpu 15.397 total
>>
>> $ time cephfs-data-scan scan_inodes cephfs_data
>> cephfs-data-scan scan_inodes cephfs_data  2.76s user 2.05s system 26% cpu 17.912 total
>>
>> $ time cephfs-data-scan scan_links
>> cephfs-data-scan scan_links  0.13s user 0.11s system 31% cpu 0.747 total
[ceph-users] Re: CephFS ghost usage/inodes
I just restarted the MDS daemons and now they crash during boot.

   -36> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening inotable
   -35> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening sessionmap
   -34> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening mds log
   -33> 2020-01-14 22:33:17.880 7fc9bbeaa700  5 mds.0.log open discovering log bounds
   -32> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening purge queue (async)
   -31> 2020-01-14 22:33:17.880 7fc9bbeaa700  4 mds.0.purge_queue open: opening
   -30> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) recover start
   -29> 2020-01-14 22:33:17.880 7fc9bb6a9700  4 mds.0.journalpointer Reading journal pointer '400.'
   -28> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) read_head
   -27> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: loading open file table (async)
   -26> 2020-01-14 22:33:17.880 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83436d80 auth_method 0
   -25> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening snap table
   -24> 2020-01-14 22:33:17.884 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83437680 auth_method 0
   -23> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83437200 auth_method 0
   -22> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_read_head loghead(trim 805306368, expire 807199928, write 807199928, stream_format 1). probing for end of log (from 807199928)...
   -21> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) probing for end of the log
   -20> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) recover start
   -19> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) read_head
   -18> 2020-01-14 22:33:17.884 7fc9bb6a9700  4 mds.0.log Waiting for journal 0x200 to recover...
   -17> 2020-01-14 22:33:17.884 7fc9c68a7700 10 monclient: get_auth_request con 0x55aa83437f80 auth_method 0
   -16> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83438400 auth_method 0
   -15> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_read_head loghead(trim 98280931328, expire 98282151365, write 98282247624, stream_format 1). probing for end of log (from 98282247624)...
   -14> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) probing for end of the log
   -13> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_probe_end write_pos = 807199928 (header had 807199928). recovered.
   -12> 2020-01-14 22:33:17.892 7fc9bceac700  4 mds.0.purge_queue operator(): open complete
   -11> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) set_writeable
   -10> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_probe_end write_pos = 98283021535 (header had 98282247624). recovered.
    -9> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Journal 0x200 recovered.
    -8> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Recovered journal 0x200 in format 1
    -7> 2020-01-14 22:33:17.892 7fc9bb6a9700  2 mds.0.13470 Booting: 1: loading/discovering base inodes
    -6> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x100
    -5> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x1
    -4> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: replaying mds log
    -3> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: waiting for purge queue recovered
    -2> 2020-01-14 22:33:17.908 7fc9ba6a7700 -1 log_channel(cluster) log [ERR] : ESession.replay sessionmap v 7561128 - 1 > table 0
    -1> 2020-01-14 22:33:17.912 7fc9ba6a7700 -1 /build/ceph-14.2.5/src/mds/journal.cc: In function 'virtual void ESession::replay(MDSRank*)' thread 7fc9ba6a7700 time 2020-01-14 22:33:17.912135
/build/ceph-14.2.5/src/mds/journal.cc: 1655: FAILED ceph_assert(g_conf()->mds_wipe_sessions)

On 14.01.20 at 21:19, Oskar Malnowicz wrote:
> This is the new state. The results are equal to Florian's.
>
> $ time cephfs-data-scan scan_extents cephfs_data
> cephfs-data-scan scan_extents cephfs_data  1.86s user 1.47s system 21% cpu 15.397 total
>
> $ time cephfs-data-scan scan_inodes cephfs_data
> cephfs-data-scan scan_inodes cephfs_data  2.76s user 2.05s system 26% cpu 17.912 total
>
> $ time cephfs-data-scan scan_links
> cephfs-data-scan scan_links  0.13s user 0.11s system 31% cpu 0.747 total
>
> $ time cephfs-data-scan scan_links
> cephfs-data-scan scan_links  0.13s user 0.12s system 33% cpu 0.735 total
>
> $ time cephfs-data-scan cleanup cephfs_data
> cephfs-data-scan cleanup cephfs_data  1.64s user 1.37s system 12% cpu 23.922 total
>
> mds / $ du -sh
> 31G
>
> $ df -h
> ip1,ip2,ip3:/    5.2T  2.1T  3.1T  41%  /storage/cephfs_test1
>
> $ ceph df detail
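For completeness: the failed assert above literally checks the mds_wipe_sessions option, which is what fires when the journal still contains ESession entries for a session table that has been reset. One possible workaround is to let the MDS skip that check temporarily; this is a developer-level option and only a sketch, assuming a Nautilus cluster with the centralized config store (in this thread the boot problem was instead resolved by re-running the recovery steps):

# temporarily let the MDS ignore the stale session table during replay
$ ceph config set mds mds_wipe_sessions true
# restart the MDS daemon on its host (systemd unit name assumed)
$ systemctl restart ceph-mds@$hostname

# once the MDS is active again, turn the option back off
$ ceph config set mds mds_wipe_sessions false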
[ceph-users] Re: CephFS ghost usage/inodes
This is the new state. The results are equal to Florian's.

$ time cephfs-data-scan scan_extents cephfs_data
cephfs-data-scan scan_extents cephfs_data  1.86s user 1.47s system 21% cpu 15.397 total

$ time cephfs-data-scan scan_inodes cephfs_data
cephfs-data-scan scan_inodes cephfs_data  2.76s user 2.05s system 26% cpu 17.912 total

$ time cephfs-data-scan scan_links
cephfs-data-scan scan_links  0.13s user 0.11s system 31% cpu 0.747 total

$ time cephfs-data-scan scan_links
cephfs-data-scan scan_links  0.13s user 0.12s system 33% cpu 0.735 total

$ time cephfs-data-scan cleanup cephfs_data
cephfs-data-scan cleanup cephfs_data  1.64s user 1.37s system 12% cpu 23.922 total

mds / $ du -sh
31G

$ df -h
ip1,ip2,ip3:/    5.2T  2.1T  3.1T  41%  /storage/cephfs_test1

$ ceph df detail
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd      7.8 TiB    7.5 TiB    312 GiB    329 GiB     4.14
    TOTAL    7.8 TiB    7.5 TiB    312 GiB    329 GiB     4.14

POOLS:
    POOL               ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY    USED COMPR    UNDER COMPR
    cephfs_data        6     2.1 TiB    2.48M      2.1 TiB    25.00    3.1 TiB      N/A              N/A            2.48M    0 B           0 B
    cephfs_metadata    7     7.3 MiB    379        7.3 MiB    0        3.1 TiB      N/A              N/A            379      0 B           0 B

On 14.01.20 at 21:06, Patrick Donnelly wrote:
> I'm asking that you get the new state of the file system tree after
> recovering from the data pool. Florian wrote that before I asked you
> to do this...
>
> How long did it take to run the cephfs-data-scan commands?
>
> On Tue, Jan 14, 2020 at 11:58 AM Oskar Malnowicz wrote:
>> As Florian already wrote, `du -hc` shows a total usage of 31G, but
>> `ceph df` shows us a usage of 2.1 TiB.
>>
>> # du -hs
>> 31G
>>
>> # ceph df
>> cephfs_data    6    2.1 TiB    2.48M    2.1 TiB    25.00    3.1 TiB
>>
>> On 14.01.20 at 20:44, Patrick Donnelly wrote:
>>> On Tue, Jan 14, 2020 at 11:40 AM Oskar Malnowicz wrote:
>>>> I ran these commands, but still the same problems.
>>> Which problems?
>>>
>>>> $ cephfs-data-scan scan_extents cephfs_data
>>>>
>>>> $ cephfs-data-scan scan_inodes cephfs_data
>>>>
>>>> $ cephfs-data-scan scan_links
>>>> 2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27
>>>>
>>>> $ cephfs-data-scan cleanup cephfs_data
>>>>
>>>> do you have other ideas ?
>>> After you complete this, you should see the deleted files in your file
>>> system tree (if this is indeed the issue). What's the output of `du
>>> -hc`?
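Once an MDS is active again, a forward scrub with repair can recheck the recursive statistics and backtraces for the whole tree; it will not, however, remove orphaned RADOS objects from the data pool. A minimal sketch, assuming access to the admin socket of the active MDS as used elsewhere in this thread:

# recursively scrub the tree and repair inconsistent rstats/backtraces
$ ceph daemon mds.$hostname scrub_path / recursive repair

# scrub findings are reported in the cluster log
$ ceph -w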
[ceph-users] Re: CephFS ghost usage/inodes
I'm asking that you get the new state of the file system tree after
recovering from the data pool. Florian wrote that before I asked you
to do this...

How long did it take to run the cephfs-data-scan commands?

On Tue, Jan 14, 2020 at 11:58 AM Oskar Malnowicz wrote:
>
> As Florian already wrote, `du -hc` shows a total usage of 31G, but
> `ceph df` shows us a usage of 2.1 TiB.
>
> # du -hs
> 31G
>
> # ceph df
> cephfs_data    6    2.1 TiB    2.48M    2.1 TiB    25.00    3.1 TiB
>
> On 14.01.20 at 20:44, Patrick Donnelly wrote:
> > On Tue, Jan 14, 2020 at 11:40 AM Oskar Malnowicz wrote:
> >> I ran these commands, but still the same problems.
> > Which problems?
> >
> >> $ cephfs-data-scan scan_extents cephfs_data
> >>
> >> $ cephfs-data-scan scan_inodes cephfs_data
> >>
> >> $ cephfs-data-scan scan_links
> >> 2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27
> >>
> >> $ cephfs-data-scan cleanup cephfs_data
> >>
> >> do you have other ideas ?
> > After you complete this, you should see the deleted files in your file
> > system tree (if this is indeed the issue). What's the output of `du
> > -hc`?

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
[ceph-users] Re: CephFS ghost usage/inodes
As Florian already wrote, `du -hc` shows a total usage of 31G, but `ceph df`
shows us a usage of 2.1 TiB.

# du -hs
31G

# ceph df
cephfs_data    6    2.1 TiB    2.48M    2.1 TiB    25.00    3.1 TiB

On 14.01.20 at 20:44, Patrick Donnelly wrote:
> On Tue, Jan 14, 2020 at 11:40 AM Oskar Malnowicz wrote:
>> I ran these commands, but still the same problems.
> Which problems?
>
>> $ cephfs-data-scan scan_extents cephfs_data
>>
>> $ cephfs-data-scan scan_inodes cephfs_data
>>
>> $ cephfs-data-scan scan_links
>> 2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27
>>
>> $ cephfs-data-scan cleanup cephfs_data
>>
>> do you have other ideas ?
> After you complete this, you should see the deleted files in your file
> system tree (if this is indeed the issue). What's the output of `du
> -hc`?
[ceph-users] Re: CephFS ghost usage/inodes
On Tue, Jan 14, 2020 at 11:40 AM Oskar Malnowicz wrote:
>
> I ran these commands, but still the same problems.

Which problems?

> $ cephfs-data-scan scan_extents cephfs_data
>
> $ cephfs-data-scan scan_inodes cephfs_data
>
> $ cephfs-data-scan scan_links
> 2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27
>
> $ cephfs-data-scan cleanup cephfs_data
>
> do you have other ideas ?

After you complete this, you should see the deleted files in your file
system tree (if this is indeed the issue). What's the output of `du
-hc`?

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
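One detail worth knowing when checking the tree afterwards: inodes that cephfs-data-scan cannot link back to their original location are typically injected into a lost+found directory at the root of the file system, so that is where recovered-but-deleted files would tend to reappear. A quick check, assuming the mount point used earlier in this thread:

# recovered orphans, if any, usually end up here
$ ls /storage/cephfs/lost+found | head
$ du -sh /storage/cephfs/lost+found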
[ceph-users] Re: CephFS ghost usage/inodes
I ran these commands, but still the same problems.

$ cephfs-data-scan scan_extents cephfs_data

$ cephfs-data-scan scan_inodes cephfs_data

$ cephfs-data-scan scan_links
2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27

$ cephfs-data-scan cleanup cephfs_data

Do you have other ideas?

On 14.01.20 at 20:32, Patrick Donnelly wrote:
> On Tue, Jan 14, 2020 at 11:24 AM Oskar Malnowicz wrote:
>> $ ceph daemon mds.who flush journal
>> {
>>     "message": "",
>>     "return_code": 0
>> }
>>
>> $ cephfs-table-tool 0 reset session
>> {
>>     "0": {
>>         "data": {},
>>         "result": 0
>>     }
>> }
>>
>> $ cephfs-table-tool 0 reset snap
>> {
>>     "result": 0
>> }
>>
>> $ cephfs-table-tool 0 reset inode
>> {
>>     "0": {
>>         "data": {},
>>         "result": 0
>>     }
>> }
>>
>> $ cephfs-journal-tool --rank=cephfs_test1:0 journal reset
>> old journal was 98282151365~92872
>> new journal start will be 98285125632 (2881395 bytes past old end)
>> writing journal head
>> writing EResetJournal entry
>> done
>>
>> $ cephfs-data-scan init
>> Inode 0x0x1 already exists, skipping create.  Use --force-init to
>> overwrite the existing object.
>> Inode 0x0x100 already exists, skipping create.  Use --force-init to
>> overwrite the existing object.
>>
>> Should I run with the --force-init flag?
> No, that shouldn't be necessary.
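A side note on runtime, since it comes up in this thread: on a large data pool the scan steps can be split across several workers; a sketch of the documented worker pattern, assuming the pool name used above (for a pool this small it makes no practical difference):

# run each scan step with 4 parallel workers; all workers of one step must
# finish before the next step starts
$ for i in 0 1 2 3; do cephfs-data-scan scan_extents --worker_n $i --worker_m 4 cephfs_data & done; wait
$ for i in 0 1 2 3; do cephfs-data-scan scan_inodes --worker_n $i --worker_m 4 cephfs_data & done; wait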
[ceph-users] Re: CephFS ghost usage/inodes
On Tue, Jan 14, 2020 at 11:24 AM Oskar Malnowicz wrote:
>
> $ ceph daemon mds.who flush journal
> {
>     "message": "",
>     "return_code": 0
> }
>
> $ cephfs-table-tool 0 reset session
> {
>     "0": {
>         "data": {},
>         "result": 0
>     }
> }
>
> $ cephfs-table-tool 0 reset snap
> {
>     "result": 0
> }
>
> $ cephfs-table-tool 0 reset inode
> {
>     "0": {
>         "data": {},
>         "result": 0
>     }
> }
>
> $ cephfs-journal-tool --rank=cephfs_test1:0 journal reset
> old journal was 98282151365~92872
> new journal start will be 98285125632 (2881395 bytes past old end)
> writing journal head
> writing EResetJournal entry
> done
>
> $ cephfs-data-scan init
> Inode 0x0x1 already exists, skipping create.  Use --force-init to
> overwrite the existing object.
> Inode 0x0x100 already exists, skipping create.  Use --force-init to
> overwrite the existing object.
>
> Should I run with the --force-init flag?

No, that shouldn't be necessary.

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
[ceph-users] Re: CephFS ghost usage/inodes
$ ceph daemon mds.who flush journal
{
    "message": "",
    "return_code": 0
}

$ cephfs-table-tool 0 reset session
{
    "0": {
        "data": {},
        "result": 0
    }
}

$ cephfs-table-tool 0 reset snap
{
    "result": 0
}

$ cephfs-table-tool 0 reset inode
{
    "0": {
        "data": {},
        "result": 0
    }
}

$ cephfs-journal-tool --rank=cephfs_test1:0 journal reset
old journal was 98282151365~92872
new journal start will be 98285125632 (2881395 bytes past old end)
writing journal head
writing EResetJournal entry
done

$ cephfs-data-scan init
Inode 0x0x1 already exists, skipping create.  Use --force-init to
overwrite the existing object.
Inode 0x0x100 already exists, skipping create.  Use --force-init to
overwrite the existing object.

Should I run with the --force-init flag?

On 14.01.20 at 18:48, Patrick Donnelly wrote:
> Please try flushing the journal:
>
> ceph daemon mds.foo flush journal
>
> The problem may be caused by this bug: https://tracker.ceph.com/issues/43598
>
> As for what to do next, you would likely need to recover the deleted
> inodes from the data pool so you can retry deleting the files:
> https://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects
>
> On Tue, Jan 14, 2020 at 9:30 AM Oskar Malnowicz wrote:
>> Hello Patrick,
>>
>> "purge_queue": {
>>     "pq_executing_ops": 0,
>>     "pq_executing": 0,
>>     "pq_executed": 5097138
>> },
>>
>> We already restarted the MDS daemons, but no change.
>> There are no health warnings other than the one Florian already mentioned.
>>
>> cheers Oskar
>>
>> On 14.01.20 at 17:32, Patrick Donnelly wrote:
>>> On Tue, Jan 14, 2020 at 5:15 AM Florian Pritz wrote:
>>>> `ceph daemon mds.$hostname perf dump | grep stray` shows:
>>>>
>>>>> "num_strays": 0,
>>>>> "num_strays_delayed": 0,
>>>>> "num_strays_enqueuing": 0,
>>>>> "strays_created": 5097138,
>>>>> "strays_enqueued": 5097138,
>>>>> "strays_reintegrated": 0,
>>>>> "strays_migrated": 0,
>>> Can you also paste the purge queue ("pq") perf dump?
>>>
>>> It's possible the MDS has hit an ENOSPC condition that caused the MDS
>>> to go read-only. This would prevent the MDS PurgeQueue from cleaning
>>> up. Do you see a health warning that the MDS is in this state? If so,
>>> please try restarting the MDS.
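One step worth taking before a journal reset of this kind is to keep a copy of the journal in case the reset has to be revisited; a minimal sketch, assuming the same rank name used above:

# export the current MDS journal to a file before resetting it
$ cephfs-journal-tool --rank=cephfs_test1:0 journal export backup.bin

# the journal (or the backup) can also be inspected for damage
$ cephfs-journal-tool --rank=cephfs_test1:0 journal inspect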
[ceph-users] Re: CephFS ghost usage/inodes
Please try flushing the journal:

ceph daemon mds.foo flush journal

The problem may be caused by this bug: https://tracker.ceph.com/issues/43598

As for what to do next, you would likely need to recover the deleted
inodes from the data pool so you can retry deleting the files:
https://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects

On Tue, Jan 14, 2020 at 9:30 AM Oskar Malnowicz wrote:
>
> Hello Patrick,
>
> "purge_queue": {
>     "pq_executing_ops": 0,
>     "pq_executing": 0,
>     "pq_executed": 5097138
> },
>
> We already restarted the MDS daemons, but no change.
> There are no health warnings other than the one Florian already mentioned.
>
> cheers Oskar
>
> On 14.01.20 at 17:32, Patrick Donnelly wrote:
> > On Tue, Jan 14, 2020 at 5:15 AM Florian Pritz wrote:
> >> `ceph daemon mds.$hostname perf dump | grep stray` shows:
> >>
> >>> "num_strays": 0,
> >>> "num_strays_delayed": 0,
> >>> "num_strays_enqueuing": 0,
> >>> "strays_created": 5097138,
> >>> "strays_enqueued": 5097138,
> >>> "strays_reintegrated": 0,
> >>> "strays_migrated": 0,
> > Can you also paste the purge queue ("pq") perf dump?
> >
> > It's possible the MDS has hit an ENOSPC condition that caused the MDS
> > to go read-only. This would prevent the MDS PurgeQueue from cleaning
> > up. Do you see a health warning that the MDS is in this state? If so,
> > please try restarting the MDS.

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
[ceph-users] Re: CephFS ghost usage/inodes
Hello Patrick,

"purge_queue": {
    "pq_executing_ops": 0,
    "pq_executing": 0,
    "pq_executed": 5097138
},

We already restarted the MDS daemons, but no change.
There are no health warnings other than the one Florian already mentioned.

cheers Oskar

On 14.01.20 at 17:32, Patrick Donnelly wrote:
> On Tue, Jan 14, 2020 at 5:15 AM Florian Pritz wrote:
>> `ceph daemon mds.$hostname perf dump | grep stray` shows:
>>
>>> "num_strays": 0,
>>> "num_strays_delayed": 0,
>>> "num_strays_enqueuing": 0,
>>> "strays_created": 5097138,
>>> "strays_enqueued": 5097138,
>>> "strays_reintegrated": 0,
>>> "strays_migrated": 0,
> Can you also paste the purge queue ("pq") perf dump?
>
> It's possible the MDS has hit an ENOSPC condition that caused the MDS
> to go read-only. This would prevent the MDS PurgeQueue from cleaning
> up. Do you see a health warning that the MDS is in this state? If so,
> please try restarting the MDS.
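For the record, pq_executed matching strays_enqueued (5097138 on both sides) is what a healthy purge queue looks like: everything the MDS ever enqueued for purging has also been executed. A small way to pull both counters side by side, assuming jq is available and the counters live under the mds_cache and purge_queue sections of the perf dump as in recent Nautilus builds:

# compare what the MDS enqueued for purging with what the purge queue executed
$ ceph daemon mds.$hostname perf dump | jq '{strays_enqueued: .mds_cache.strays_enqueued, pq_executed: .purge_queue.pq_executed}'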
[ceph-users] Re: CephFS ghost usage/inodes
On Tue, Jan 14, 2020 at 5:15 AM Florian Pritz wrote:
> `ceph daemon mds.$hostname perf dump | grep stray` shows:
>
>> "num_strays": 0,
>> "num_strays_delayed": 0,
>> "num_strays_enqueuing": 0,
>> "strays_created": 5097138,
>> "strays_enqueued": 5097138,
>> "strays_reintegrated": 0,
>> "strays_migrated": 0,

Can you also paste the purge queue ("pq") perf dump?

It's possible the MDS has hit an ENOSPC condition that caused the MDS
to go read-only. This would prevent the MDS PurgeQueue from cleaning
up. Do you see a health warning that the MDS is in this state? If so,
please try restarting the MDS.

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
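A quick way to check for the read-only condition described above, assuming standard systemd unit names on the MDS host (the exact health wording varies a little between releases):

# a read-only MDS shows up in the health output
$ ceph health detail | grep -i "read.only"
$ ceph status

# if it is stuck read-only, restart the daemon on its host
$ systemctl restart ceph-mds@$hostname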