Re: [ceph-users] "stray" objects in empty cephfs data pool
On Fri, Oct 23, 2015 at 7:08 AM, Burkhard Linke wrote:
> Hi,
>
> On 10/14/2015 06:32 AM, Gregory Farnum wrote:
>> On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke wrote:
>>>
>>> *snipsnap*
>>>
>>> Thanks, that did the trick. I was able to locate the host blocking the
>>> file handles and remove the objects from the EC pool.
>>>
>>> Well, all except one:
>>>
>>> # ceph df
>>> ...
>>> ec_ssd_cache       18      4216k     0     2500G     129
>>> cephfs_ec_data     19      4096k     0    31574G       1
>>>
>>> # rados -p ec_ssd_cache ls
>>> 1ef540f.0386
>>> # rados -p cephfs_ec_data ls
>>> 1ef540f.0386
>>> # ceph mds tell cb-dell-pe620r dumpcache cache.file
>>> # grep 1ef540f /cache.file
>>> #
>>>
>>> It does not show up in the dumped cache file, but keeps being promoted
>>> to the cache tier after MDS restarts. I've restarted most of the cephfs
>>> clients by unmounting cephfs and restarting ceph-fuse, but the object
>>> remains active.
>>
>> You can enable MDS debug logging and see if the inode shows up in the
>> log during replay. It's possible it's getting read in (from journal
>> operations) but then getting evicted from cache if nobody's accessing
>> it any more.
>> You can also look at the xattrs on the object to see what the
>> backtrace is and if that file is in cephfs.
>
> After the last MDS restart the stray object was not promoted to the
> cache anymore:
> ec_ssd_cache       18       120k     0     3842G     128
> cephfs_ec_data     19      4096k     0    10392G       1
>
> There are no xattrs available for the stray object, so it's not possible
> to find out which file it belongs/belonged to:
> # rados -p cephfs_ec_data ls
> 1ef540f.0386
> # rados -p cephfs_ec_data listxattr 1ef540f.0386
> #
>
> Is it possible to list pending journal operations to be on the safe side?

Check out the cephfs-journal-tool. I don't remember the exact commands
but I think it has good help text.
-Greg

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
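A sketch of the cephfs-journal-tool invocations Greg is referring to, from memory of the hammer-era tool (the cluster commands are shown as comments, not executed here; the object name is the one quoted above):

```shell
# Commands to run against a live cluster (not executed here):
#
#   cephfs-journal-tool journal inspect      # check journal integrity
#   cephfs-journal-tool event get summary    # count pending events by type
#   cephfs-journal-tool event get list       # list every pending event
#
# Event filters take the inode in decimal; the prefix of the RADOS
# object name (before the dot) is the inode number in hex:
obj="1ef540f.0386"
ino=$(printf '%d' "0x${obj%%.*}")
echo "$ino"    # decimal inode to look for in the event listing
```

The same decimal form is what shows up in JSON-formatted MDS output, so it is worth computing once and reusing.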
Re: [ceph-users] "stray" objects in empty cephfs data pool
Hi,

On 10/14/2015 06:32 AM, Gregory Farnum wrote:
> On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke wrote:
>>
>> *snipsnap*
>>
>> Thanks, that did the trick. I was able to locate the host blocking the
>> file handles and remove the objects from the EC pool.
>>
>> Well, all except one:
>>
>> # ceph df
>> ...
>> ec_ssd_cache       18      4216k     0     2500G     129
>> cephfs_ec_data     19      4096k     0    31574G       1
>>
>> # rados -p ec_ssd_cache ls
>> 1ef540f.0386
>> # rados -p cephfs_ec_data ls
>> 1ef540f.0386
>> # ceph mds tell cb-dell-pe620r dumpcache cache.file
>> # grep 1ef540f /cache.file
>> #
>>
>> It does not show up in the dumped cache file, but keeps being promoted
>> to the cache tier after MDS restarts. I've restarted most of the cephfs
>> clients by unmounting cephfs and restarting ceph-fuse, but the object
>> remains active.
>
> You can enable MDS debug logging and see if the inode shows up in the
> log during replay. It's possible it's getting read in (from journal
> operations) but then getting evicted from cache if nobody's accessing
> it any more.
> You can also look at the xattrs on the object to see what the
> backtrace is and if that file is in cephfs.

After the last MDS restart the stray object was not promoted to the
cache anymore:
ec_ssd_cache       18       120k     0     3842G     128
cephfs_ec_data     19      4096k     0    10392G       1

There are no xattrs available for the stray object, so it's not possible
to find out which file it belongs/belonged to:
# rados -p cephfs_ec_data ls
1ef540f.0386
# rados -p cephfs_ec_data listxattr 1ef540f.0386
#

Is it possible to list pending journal operations to be on the safe side?

Regards,
Burkhard

--
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
Justus-Liebig-University Giessen
35392 Giessen, Germany
Phone: (+49) (0)641 9935810
Re: [ceph-users] "stray" objects in empty cephfs data pool
On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke wrote:
> Hi,
>
> On 10/08/2015 09:14 PM, John Spray wrote:
>>
>> On Thu, Oct 8, 2015 at 7:23 PM, Gregory Farnum wrote:
>>>
>>> On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke wrote:
>>>>
>>>> Hammer 0.94.3 does not support a 'dump cache' mds command.
>>>> 'dump_ops_in_flight' does not list any pending operations. Is there
>>>> any other way to access the cache?
>>>
>>> "dumpcache", it looks like. You can get all the supported commands
>>> with "help" and look for things of interest or alternative phrasings.
>>> :)
>>
>> To head off any confusion for someone trying to just replace dump
>> cache with dumpcache: "dump cache" is the new (post hammer,
>> apparently) admin socket command, dumpcache is the old tell command.
>> So it's "ceph mds tell <id> dumpcache <path>".
>
> Thanks, that did the trick. I was able to locate the host blocking the
> file handles and remove the objects from the EC pool.
>
> Well, all except one:
>
> # ceph df
> ...
> ec_ssd_cache       18      4216k     0     2500G     129
> cephfs_ec_data     19      4096k     0    31574G       1
>
> # rados -p ec_ssd_cache ls
> 1ef540f.0386
> # rados -p cephfs_ec_data ls
> 1ef540f.0386
> # ceph mds tell cb-dell-pe620r dumpcache cache.file
> # grep 1ef540f /cache.file
> #
>
> It does not show up in the dumped cache file, but keeps being promoted
> to the cache tier after MDS restarts. I've restarted most of the cephfs
> clients by unmounting cephfs and restarting ceph-fuse, but the object
> remains active.

You can enable MDS debug logging and see if the inode shows up in the
log during replay. It's possible it's getting read in (from journal
operations) but then getting evicted from cache if nobody's accessing
it any more.
You can also look at the xattrs on the object to see what the
backtrace is and if that file is in cephfs.
-Greg

> Regards,
> Burkhard
Re: [ceph-users] "stray" objects in empty cephfs data pool
Hi,

On 10/08/2015 09:14 PM, John Spray wrote:
> On Thu, Oct 8, 2015 at 7:23 PM, Gregory Farnum wrote:
>> On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke wrote:
>>> Hammer 0.94.3 does not support a 'dump cache' mds command.
>>> 'dump_ops_in_flight' does not list any pending operations. Is there
>>> any other way to access the cache?
>>
>> "dumpcache", it looks like. You can get all the supported commands
>> with "help" and look for things of interest or alternative phrasings.
>> :)
>
> To head off any confusion for someone trying to just replace dump
> cache with dumpcache: "dump cache" is the new (post hammer,
> apparently) admin socket command, dumpcache is the old tell command.
> So it's "ceph mds tell <id> dumpcache <path>".

Thanks, that did the trick. I was able to locate the host blocking the
file handles and remove the objects from the EC pool.

Well, all except one:

# ceph df
...
ec_ssd_cache       18      4216k     0     2500G     129
cephfs_ec_data     19      4096k     0    31574G       1

# rados -p ec_ssd_cache ls
1ef540f.0386
# rados -p cephfs_ec_data ls
1ef540f.0386
# ceph mds tell cb-dell-pe620r dumpcache cache.file
# grep 1ef540f /cache.file
#

It does not show up in the dumped cache file, but keeps being promoted
to the cache tier after MDS restarts. I've restarted most of the cephfs
clients by unmounting cephfs and restarting ceph-fuse, but the object
remains active.

Regards,
Burkhard
Re: [ceph-users] "stray" objects in empty cephfs data pool
On Thu, Oct 8, 2015 at 11:41 AM, Burkhard Linke wrote:
> Hi John,
>
> On 10/08/2015 12:05 PM, John Spray wrote:
>>
>> On Thu, Oct 8, 2015 at 10:21 AM, Burkhard Linke wrote:
>>>
>>> Hi,
>
> *snipsnap*
>>>
>>> I've moved all files from a CephFS data pool (EC pool with frontend
>>> cache tier) in order to remove the pool completely.
>>>
>>> Some objects are left in the pools ('ceph df' output of the affected
>>> pools):
>>>
>>> cephfs_ec_data     19      7565k     0    66288G      13
>>>
>>> Listing the objects and the readable part of their 'parent' attribute:
>>>
>>> # for obj in $(rados -p cephfs_ec_data ls); do echo $obj; rados -p
>>> cephfs_ec_data getxattr $obj parent | strings; done
>>> 1f6119f.
>>> 1f6119f
>>> stray9
>>> 1f63fe5.
>>> 1f6119f
>>> stray9
>>> 1f61196.
>>> 1f6119f
>>> stray9
>>> ...
>
> *snipsnap*
>>
>> Well, they're strays :-)
>>
>> You get stray dentries when you unlink files. They hang around either
>> until the inode is ready to be purged, or if there are hard links then
>> they hang around until something prompts ceph to "reintegrate" the
>> stray into a new path.
>
> Thanks for the fast reply. During the transfer of all files from the EC
> pool to a standard replicated pool I've copied the file to a new file
> name, removed the original one and renamed the copy. There might have
> been some processes with open files at that time, which might explain
> the stray file objects.
>
> I've also been able to locate some processes that might be the reason
> for these leftover files. I've terminated these processes, but the
> objects are still present in the pool. How long does purging an inode
> usually take?

If nothing is holding a file open, it'll start purging within a couple
of journal-latencies of the unlink (i.e. pretty darn quick), and it'll
take as long to purge as there are objects in the file (again, pretty
darn quick for normal-sized files and a non-overloaded cluster).
Chances are if you're noticing strays, they're stuck for some reason.
You're probably on the right track looking for processes holding files
open.

>> You don't say what version you're running, so it's possible you're
>> running an older version (pre hammer, I think) where you're
>> experiencing either a bug holding up deletion (we've had a few) or a
>> bug preventing reintegration (we had one of those too). The bugs
>> holding up deletion can usually be worked around with some client
>> and/or mds restarts.
>
> The cluster is running on hammer. I'm going to restart the mds to try
> to get rid of these objects.

OK, let us know how it goes. You may find the num_strays,
num_strays_purging, num_strays_delayed performance counters
("ceph daemon mds.<id> perf dump") useful.

>> It isn't safe to remove the pool in this state. The MDS is likely to
>> crash if it eventually gets around to trying to purge these files.
>
> That's bad. Does the mds provide a way to get more information about
> these files, e.g. which client is blocking purging? We have about 3
> hosts working on CephFS, and checking every process might be difficult.

If a client has caps on an inode, you can find out about it by dumping
(the whole!) cache from a running MDS. We have tickets for adding a
more surgical version of this[1] but for now it's a bit of a
heavyweight thing.

You can do JSON ("ceph daemon mds.<id> dump cache > foo.json") or plain
text ("ceph daemon mds.<id> dump cache foo.txt"). The latter version is
harder to parse but is less likely to eat all the memory on your MDS
(JSON output builds the whole thing in memory before writing it)!

In the dump output, search for the inode number you're interested in,
and look for client caps. Remember if searching JSON output to look for
the decimal form of the inode, vs. the hex form in plain text output.
Resolve the client session ID in the caps to a meaningful name with
"ceph daemon mds.<id> session ls", assuming the clients are recent
enough to report the hostnames.

You can also look at "ceph daemon mds.<id> dump_ops_in_flight" to check
there are no (stuck) requests touching the inode.

John

1. http://tracker.ceph.com/issues/11171,
http://tracker.ceph.com/issues/11172,
http://tracker.ceph.com/issues/11173
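A minimal sketch of the hex/decimal bookkeeping John describes, using one of the stray objects quoted above (the mds id "a" and the dump paths are placeholders; the cluster commands are comments, not executed here):

```shell
# The object name prefix is the inode number in hex; JSON cache dumps
# record inodes in decimal, plain-text dumps in hex.
ino_hex=1f6119f                      # from object "1f6119f." above
ino_dec=$(printf '%d' "0x$ino_hex")
echo "$ino_dec"
# On the live cluster (placeholders, not executed here):
#   ceph daemon mds.a dump cache > /tmp/cache.json   # JSON, decimal inos
#   grep "$ino_dec" /tmp/cache.json
#   ceph daemon mds.a dump cache /tmp/cache.txt      # plain text, hex inos
#   grep -i "$ino_hex" /tmp/cache.txt
#   ceph daemon mds.a session ls    # map the client id found in the caps
#                                   # back to a hostname
```

Searching for both forms costs nothing and avoids missing the inode because of the wrong radix.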
Re: [ceph-users] "stray" objects in empty cephfs data pool
On Thu, Oct 8, 2015 at 10:21 AM, Burkhard Linke wrote:
> Hi,
>
> I've moved all files from a CephFS data pool (EC pool with frontend
> cache tier) in order to remove the pool completely.
>
> Some objects are left in the pools ('ceph df' output of the affected
> pools):
>
> cephfs_ec_data     19      7565k     0    66288G      13
>
> Listing the objects and the readable part of their 'parent' attribute:
>
> # for obj in $(rados -p cephfs_ec_data ls); do echo $obj; rados -p
> cephfs_ec_data getxattr $obj parent | strings; done
> 1f6119f.
> 1f6119f
> stray9
> 1f63fe5.
> 1f6119f
> stray9
> 1f61196.
> 1f6119f
> stray9
> ...
>
> The names are valid CephFS object names. But the parent attribute does
> not contain the path of the file the object belongs to; instead the
> string 'stray' is the only useful information (without dissecting the
> binary content of the parent attribute).
>
> What are those objects and is it safe to remove the pool in this state?

Well, they're strays :-)

You get stray dentries when you unlink files. They hang around either
until the inode is ready to be purged, or if there are hard links then
they hang around until something prompts ceph to "reintegrate" the
stray into a new path.

You don't say what version you're running, so it's possible you're
running an older version (pre hammer, I think) where you're
experiencing either a bug holding up deletion (we've had a few) or a
bug preventing reintegration (we had one of those too). The bugs
holding up deletion can usually be worked around with some client
and/or mds restarts.

It isn't safe to remove the pool in this state. The MDS is likely to
crash if it eventually gets around to trying to purge these files.

John
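Instead of eyeballing `strings` output, the parent attribute can be decoded properly. A hedged sketch (the ceph-dencoder invocation is from memory and not executed here; the object name placeholder follows the truncated names above):

```shell
# On a live cluster (not executed here), decode one object's backtrace:
#   rados -p cephfs_ec_data getxattr <object> parent > parent.bin
#   ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json
#
# The "stray9" ancestor in the strings output is an MDS stray directory.
# For rank 0 the stray directories are inodes 0x600 + 0..9, so stray9 is:
stray9_ino=$(printf '%d' 0x609)
echo "$stray9_ino"
```

The decoded JSON shows the full ancestor chain, which makes it obvious whether the dentry still hangs off a stray directory or has been reintegrated.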
Re: [ceph-users] "stray" objects in empty cephfs data pool
Hi John,

On 10/08/2015 12:05 PM, John Spray wrote:
> On Thu, Oct 8, 2015 at 10:21 AM, Burkhard Linke wrote:
>> Hi,
>>
>> *snipsnap*
>>
>> I've moved all files from a CephFS data pool (EC pool with frontend
>> cache tier) in order to remove the pool completely.
>>
>> Some objects are left in the pools ('ceph df' output of the affected
>> pools):
>>
>> cephfs_ec_data     19      7565k     0    66288G      13
>>
>> Listing the objects and the readable part of their 'parent' attribute:
>>
>> # for obj in $(rados -p cephfs_ec_data ls); do echo $obj; rados -p
>> cephfs_ec_data getxattr $obj parent | strings; done
>> 1f6119f.
>> 1f6119f
>> stray9
>> 1f63fe5.
>> 1f6119f
>> stray9
>> 1f61196.
>> 1f6119f
>> stray9
>> ...
>
> *snipsnap*
>
> Well, they're strays :-)
>
> You get stray dentries when you unlink files. They hang around either
> until the inode is ready to be purged, or if there are hard links then
> they hang around until something prompts ceph to "reintegrate" the
> stray into a new path.

Thanks for the fast reply. During the transfer of all files from the EC
pool to a standard replicated pool I've copied the file to a new file
name, removed the original one and renamed the copy. There might have
been some processes with open files at that time, which might explain
the stray file objects.

I've also been able to locate some processes that might be the reason
for these leftover files. I've terminated these processes, but the
objects are still present in the pool. How long does purging an inode
usually take?

> You don't say what version you're running, so it's possible you're
> running an older version (pre hammer, I think) where you're
> experiencing either a bug holding up deletion (we've had a few) or a
> bug preventing reintegration (we had one of those too). The bugs
> holding up deletion can usually be worked around with some client
> and/or mds restarts.

The cluster is running on hammer. I'm going to restart the mds to try
to get rid of these objects.

> It isn't safe to remove the pool in this state. The MDS is likely to
> crash if it eventually gets around to trying to purge these files.

That's bad. Does the mds provide a way to get more information about
these files, e.g. which client is blocking purging? We have about 3
hosts working on CephFS, and checking every process might be difficult.

Regards,
Burkhard
Re: [ceph-users] "stray" objects in empty cephfs data pool
Hi John,

On 10/08/2015 01:03 PM, John Spray wrote:
> On Thu, Oct 8, 2015 at 11:41 AM, Burkhard Linke wrote:
>
> *snipsnap*
>
>> Thanks for the fast reply. During the transfer of all files from the
>> EC pool to a standard replicated pool I've copied the file to a new
>> file name, removed the original one and renamed the copy. There might
>> have been some processes with open files at that time, which might
>> explain the stray file objects.
>>
>> I've also been able to locate some processes that might be the reason
>> for these leftover files. I've terminated these processes, but the
>> objects are still present in the pool. How long does purging an inode
>> usually take?
>
> If nothing is holding a file open, it'll start purging within a couple
> of journal-latencies of the unlink (i.e. pretty darn quick), and it'll
> take as long to purge as there are objects in the file (again, pretty
> darn quick for normal-sized files and a non-overloaded cluster).
> Chances are if you're noticing strays, they're stuck for some reason.
> You're probably on the right track looking for processes holding files
> open.
>
>>> You don't say what version you're running, so it's possible you're
>>> running an older version (pre hammer, I think) where you're
>>> experiencing either a bug holding up deletion (we've had a few) or a
>>> bug preventing reintegration (we had one of those too). The bugs
>>> holding up deletion can usually be worked around with some client
>>> and/or mds restarts.
>>
>> The cluster is running on hammer. I'm going to restart the mds to try
>> to get rid of these objects.
>
> OK, let us know how it goes. You may find the num_strays,
> num_strays_purging, num_strays_delayed performance counters
> ("ceph daemon mds.<id> perf dump") useful.

The number of objects dropped to 7 after the mds restart. I was also
able to identify the application the objects belong to (some were perl
modules), but I've been unable to locate a running instance of this
application. The main user of this application is also not aware of any
running instance at the moment.

>>> It isn't safe to remove the pool in this state. The MDS is likely to
>>> crash if it eventually gets around to trying to purge these files.
>>
>> That's bad. Does the mds provide a way to get more information about
>> these files, e.g. which client is blocking purging? We have about 3
>> hosts working on CephFS, and checking every process might be
>> difficult.
>
> If a client has caps on an inode, you can find out about it by dumping
> (the whole!) cache from a running MDS. We have tickets for adding a
> more surgical version of this[1] but for now it's a bit of a
> heavyweight thing.
>
> You can do JSON ("ceph daemon mds.<id> dump cache > foo.json") or
> plain text ("ceph daemon mds.<id> dump cache foo.txt"). The latter
> version is harder to parse but is less likely to eat all the memory on
> your MDS (JSON output builds the whole thing in memory before writing
> it)!

Hammer 0.94.3 does not support a 'dump cache' mds command.
'dump_ops_in_flight' does not list any pending operations. Is there any
other way to access the cache?

'perf dump' stray information (after mds restart):
"num_strays": 2327,
"num_strays_purging": 0,
"num_strays_delayed": 0,
"strays_created": 33,
"strays_purged": 34,

The data pool is a combination of EC pool and cache tier. I've evicted
the cache pool resulting in 128 objects left (one per PG? hitset
information?). After restarting the MDS the number of objects increases
by 7 objects (the ones left in the data pool). So either the MDS rejoin
process promotes them back to the cache, or some ceph-fuse instance
insists on reading them.

Regards,
Burkhard
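The counters quoted above come out of the admin socket as JSON. A hedged sketch of pulling one of them without a jq dependency (the heredoc stands in for `ceph daemon mds.<id> perf dump` on a live system, and the `mds_cache` section name is an assumption from later releases; the sample values are copied from this message):

```shell
# Sample perf-dump fragment; on a live system you would instead run:
#   ceph daemon mds.<id> perf dump > /tmp/perf.json
cat > /tmp/perf.json <<'EOF'
{"mds_cache": {"num_strays": 2327,
 "num_strays_purging": 0,
 "num_strays_delayed": 0,
 "strays_created": 33,
 "strays_purged": 34}}
EOF
# Extract a single counter with sed:
num_strays=$(sed -n 's/.*"num_strays": \([0-9]*\).*/\1/p' /tmp/perf.json)
echo "$num_strays"
```

Watching num_strays fall toward zero after killing suspect clients is a cheap way to confirm purging has resumed.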
Re: [ceph-users] "stray" objects in empty cephfs data pool
On Thu, Oct 8, 2015 at 7:23 PM, Gregory Farnum wrote:
> On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke wrote:
>> Hammer 0.94.3 does not support a 'dump cache' mds command.
>> 'dump_ops_in_flight' does not list any pending operations. Is there
>> any other way to access the cache?
>
> "dumpcache", it looks like. You can get all the supported commands
> with "help" and look for things of interest or alternative phrasings.
> :)

To head off any confusion for someone trying to just replace dump
cache with dumpcache: "dump cache" is the new (post hammer, apparently)
admin socket command, dumpcache is the old tell command. So it's
"ceph mds tell <id> dumpcache <path>".

John
Re: [ceph-users] "stray" objects in empty cephfs data pool
On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke wrote:
> Hammer 0.94.3 does not support a 'dump cache' mds command.
> 'dump_ops_in_flight' does not list any pending operations. Is there
> any other way to access the cache?

"dumpcache", it looks like. You can get all the supported commands
with "help" and look for things of interest or alternative phrasings.
:)

> 'perf dump' stray information (after mds restart):
> "num_strays": 2327,
> "num_strays_purging": 0,
> "num_strays_delayed": 0,
> "strays_created": 33,
> "strays_purged": 34,
>
> The data pool is a combination of EC pool and cache tier. I've evicted
> the cache pool resulting in 128 objects left (one per PG? hitset
> information?).

Yeah, probably. I don't remember the naming scheme, but it does keep
hitset objects. I don't think you should be able to list them via rados
but they probably show up in the aggregate stats.
-Greg

> After restarting the MDS the number of objects increases by 7 objects
> (the ones left in the data pool). So either the MDS rejoin process
> promotes them back to the cache, or some ceph-fuse instance insists on
> reading them.