I saw this go by in the commit log:

    commit cc2200c5e60caecf7931e546f6522b2ba364227f
    Merge: f8d5807 12c083e
    Author: Sage Weil <s...@redhat.com>
    Date:   Thu Feb 11 08:44:35 2016 -0500
        Merge pull request #7537 from ifed01/wip-no-promote-for-delete-fix

        osd: fix unnecessary object promotion when deleting from cache pool

        Reviewed-by: Sage Weil <s...@redhat.com>

Is there any chance that I was basically seeing the same thing from the
filesystem standpoint?

Thanks

Steve

> On Feb 5, 2016, at 8:42 AM, Gregory Farnum <gfar...@redhat.com> wrote:
>
> On Fri, Feb 5, 2016 at 6:39 AM, Stephen Lord <steve.l...@quantum.com> wrote:
>>
>> I looked at this system this morning, and it actually finished what it
>> was doing. The erasure coded pool still contains all the data, and the
>> cache pool has about a million zero-sized objects:
>>
>> GLOBAL:
>>     SIZE       AVAIL     RAW USED     %RAW USED     OBJECTS
>>     15090G     9001G     6080G        40.29         2127k
>> POOLS:
>>     NAME            ID     CATEGORY     USED      %USED     MAX AVAIL     OBJECTS     DIRTY     READ      WRITE
>>     cache-data      21     -            0         0         7962G         1162258     1057k     22969     3220k
>>     cephfs-data     22     -            3964G     26.27     5308G         1014840     991k      891k      1143k
>>
>> This definitely seems like a bug, since I removed all references to
>> these objects from the filesystem which created them.
>>
>> I originally wrote 4.5 Tbytes of data into the file system; the erasure
>> coded pool is set up as 4+2, and the cache has a size limit of 1 Tbyte.
>> It looks like not all the data made it out of the cache tier before I
>> removed the content: it removed the content which was only present in
>> the cache tier and created a zero-sized object in the cache for all of
>> the content. The used capacity is roughly consistent with this.
>>
>> I tried to look at the extended attributes on one of the zero-sized
>> objects with ceph-dencoder, but it failed:
>>
>>     error: buffer::malformed_input: void
>>     object_info_t::decode(ceph::buffer::list::iterator&) unknown
>>     encoding version > 15
>>
>> Same error on one of the objects in the erasure coded pool.
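[For reference, a sketch of one way to pull that attribute out by hand and feed it to ceph-dencoder. This assumes a filestore OSD backend, where RADOS xattrs appear under the `user.ceph.` namespace; the OSD path and object filename below are illustrative placeholders, not paths from this cluster.]

```shell
# Locate the object's backing file on the OSD host. The PG directory and
# file name here are placeholders for illustration only.
OBJ_FILE="/var/lib/ceph/osd/ceph-0/current/21.4_head/someobject__head_XXXXXXXX__15"

# On filestore, the object_info_t blob lives in the "user.ceph._" xattr
# (the "_" attribute that shows up as ".ceph_" at the filesystem level).
getfattr --only-values -n user.ceph._ "$OBJ_FILE" > oi.bin

# Decode with a ceph-dencoder built from the same code as the OSDs. An
# "unknown encoding version" error typically means the dencoder binary
# is older than the daemons that wrote the attribute.
ceph-dencoder type object_info_t import oi.bin decode dump_json
```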
>>
>> Looks like I am a little too bleeding edge for this, or the contents
>> of the .ceph_ attribute are not an object_info_t
>
> ghobject_info_t
>
> You can get the EC stuff actually deleted by getting the cache pool to
> flush everything. That's discussed in the docs and in various mailing
> list archives.
> -Greg
>
>>
>> Steve
>>
>>> On Feb 4, 2016, at 7:10 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>
>>> On Thu, Feb 4, 2016 at 5:07 PM, Stephen Lord <steve.l...@quantum.com> wrote:
>>>>
>>>>> On Feb 4, 2016, at 6:51 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>>>
>>>>> I presume we're doing reads in order to gather some object metadata
>>>>> from the cephfs-data pool; and the (small) newly-created objects in
>>>>> cache-data are definitely whiteout objects indicating the object no
>>>>> longer exists logically.
>>>>>
>>>>> What kinds of reads are you actually seeing? Does it appear to be
>>>>> transferring data, or merely doing a bunch of seeks? I thought we
>>>>> were trying to avoid doing reads-to-delete, but perhaps the way
>>>>> we're handling snapshots or something is invoking behavior that
>>>>> isn't amenable to a full-FS delete.
>>>>>
>>>>> I presume you're trying to characterize the system's behavior, but
>>>>> of course if you just want to empty it out entirely you're better
>>>>> off deleting the pools and the CephFS instance entirely and then
>>>>> starting over again from scratch.
>>>>> -Greg
>>>>
>>>> I believe it is reading all the data, just from the volume of
>>>> traffic, and the CPU load on the OSDs suggests it may be doing more
>>>> than just that.
>>>>
>>>> iostat is showing a lot of data moving; I am seeing about the same
>>>> volume of read and write activity here. Because the OSDs underneath
>>>> both pools are the same ones (I know that's not exactly optimal), it
>>>> is hard to tell which pool is responsible for which I/O.
>>>> Large reads and small writes suggest it is reading up all the data
>>>> from the objects; the write traffic is, I presume, all journal
>>>> activity relating to deleting objects and creating the empty ones.
>>>>
>>>> The 9:1 ratio between things being deleted and created seems odd,
>>>> though.
>>>>
>>>> A previous version of this exercise with just a regular replicated
>>>> data pool did not read anything, just a lot of write activity, and
>>>> eventually the content disappeared. So this is definitely related to
>>>> the pool configuration here and probably not to the filesystem layer.
>>>
>>> Sam, does this make any sense to you in terms of how RADOS handles deletes?
>>> -Greg

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
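[For the archives: the cache-tier flush Greg suggests above boils down to something like the following sketch, using the pool names from this thread; as always, try it against a test cluster first.]

```shell
# Flush every dirty object out of the cache tier and evict the clean
# ones. This processes the whiteout objects so the corresponding copies
# in the backing erasure-coded pool are actually deleted.
rados -p cache-data cache-flush-evict-all

# Watch per-pool client I/O rates while it runs, which also helps
# attribute traffic when several pools share the same OSDs.
ceph osd pool stats cache-data

# The cache pool's object count should fall toward zero as it drains.
ceph df
```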