Re: [ceph-users] why is there heavy read traffic during object delete?

Stephen Lord Thu, 11 Feb 2016 12:12:15 -0800

I saw this go by in the commit log:

commit cc2200c5e60caecf7931e546f6522b2ba364227f
Merge: f8d5807 12c083e
Author: Sage Weil <s...@redhat.com>
Date:   Thu Feb 11 08:44:35 2016 -0500


    Merge pull request #7537 from ifed01/wip-no-promote-for-delete-fix
    
    osd: fix unnecessary object promotion when deleting from cache pool
    
    Reviewed-by: Sage Weil <s...@redhat.com>


Is there any chance that I was basically seeing the same thing from the
filesystem standpoint?

Thanks

  Steve

> On Feb 5, 2016, at 8:42 AM, Gregory Farnum <gfar...@redhat.com> wrote:
> 
> On Fri, Feb 5, 2016 at 6:39 AM, Stephen Lord <steve.l...@quantum.com> wrote:
>> 
>> I looked at this system this morning, and it actually finished what it was
>> doing. The erasure coded pool still contains all the data and the cache
>> pool has about a million zero sized objects:
>> 
>> 
>> GLOBAL:
>>    SIZE       AVAIL     RAW USED     %RAW USED     OBJECTS
>>    15090G     9001G        6080G         40.29       2127k
>> POOLS:
>>    NAME            ID     CATEGORY     USED      %USED     MAX AVAIL     OBJECTS     DIRTY     READ      WRITE
>>    cache-data      21     -               0          0         7962G     1162258     1057k     22969     3220k
>>    cephfs-data     22     -           3964G      26.27         5308G     1014840      991k      891k     1143k
>> 
>> This definitely seems like a bug, since I removed all references to these
>> objects from the filesystem which created them.
>> 
>> I originally wrote 4.5 Tbytes of data into the file system. The erasure coded
>> pool is set up as 4+2, and the cache has a size limit of 1 Tbyte. It looks like
>> not all the data made it out of the cache tier before I removed the content:
>> the delete removed the content which was only present in the cache tier and
>> created a zero sized object in the cache for all the content. The used
>> capacity is somewhat consistent with this.
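
The capacity check described above can be worked through as a rough sketch. It assumes the "G" figures in the pool listing are GiB, that "4.5 Tbytes" means 4.5 * 1024 G, and that a 4+2 erasure coded pool consumes (4+2)/4 times the logical data size in raw space. The function name ec_raw_usage is made up for illustration and is not part of any Ceph tool.

```python
def ec_raw_usage(logical_gib: float, k: int, m: int) -> float:
    """Raw space consumed by logical_gib of data in a k+m erasure coded pool."""
    return logical_gib * (k + m) / k


ec_pool_used_gib = 3964    # cephfs-data USED, from the pool listing
raw_used_gib = 6080        # GLOBAL RAW USED, from the same listing
written_gib = 4.5 * 1024   # assumption: "4.5 Tbytes" means 4.5 * 1024 G
cache_limit_gib = 1024     # assumption: "1 Tbyte" means 1024 G

expected_raw = ec_raw_usage(ec_pool_used_gib, k=4, m=2)
never_flushed = written_gib - ec_pool_used_gib

print(f"expected raw usage for 4+2: {expected_raw:.0f}G (reported: {raw_used_gib}G)")
print(f"data that never reached the EC pool: about {never_flushed:.0f}G "
      f"(cache limit: {cache_limit_gib}G)")
```

Under these assumptions the erasure coded pool accounts for most of the reported raw usage, and the amount of data missing from it is smaller than the cache size limit, which fits the idea that it only ever lived in the cache tier.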
>> 
>> I tried to look at the extended attributes on one of the zero sized objects
>> with ceph-dencoder, but it failed:
>> 
>> error: buffer::malformed_input: void object_info_t::decode(ceph::buffer::list::iterator&) unknown encoding version > 15
>> 
>> Same error on one of the objects in the erasure coded pool.
>> 
>> Looks like I am a little too bleeding edge for this, or the contents of the
>> .ceph_ attribute are not an object_info_t
> 
> ghobject_info_t
> 
> You can get the EC stuff actually deleted by getting the cache pool to
> flush everything. That's discussed in the docs and in various mailing
> list archives.
> -Greg
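
As a minimal sketch of the flush suggested above: the rados CLI has a cache-flush-evict-all subcommand that flushes and evicts every object in a cache pool. The Python wrapper below is only an illustration. The helper names are invented here, the pool name cache-data is taken from the listing earlier in the thread, and whether this clears the zero sized objects on this particular build has not been verified.

```python
import subprocess
from typing import List


def flush_evict_command(cache_pool: str) -> List[str]:
    """Build the rados invocation that flushes and evicts a whole cache pool."""
    return ["rados", "-p", cache_pool, "cache-flush-evict-all"]


def flush_cache_pool(cache_pool: str, dry_run: bool = True) -> List[str]:
    """Return the command; run it only when dry_run is False."""
    cmd = flush_evict_command(cache_pool)
    if not dry_run:
        # Needs a working ceph.conf and keyring on the host running this.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run by default: just show what would be executed.
print(" ".join(flush_cache_pool("cache-data")))
```

The dry run default is deliberate, since flushing a pool with over a million objects is likely to generate a lot of I/O on the shared OSDs.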
> 
>> 
>> 
>> 
>> Steve
>> 
>>> On Feb 4, 2016, at 7:10 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>> 
>>> On Thu, Feb 4, 2016 at 5:07 PM, Stephen Lord <steve.l...@quantum.com> wrote:
>>>> 
>>>>> On Feb 4, 2016, at 6:51 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>>>>> 
>>>>> I presume we're doing reads in order to gather some object metadata
>>>>> from the cephfs-data pool; and the (small) newly-created objects in
>>>>> cache-data are definitely whiteout objects indicating the object no
>>>>> longer exists logically.
>>>>> 
>>>>> What kinds of reads are you actually seeing? Does it appear to be
>>>>> transferring data, or merely doing a bunch of seeks? I thought we were
>>>>> trying to avoid doing reads-to-delete, but perhaps the way we're
>>>>> handling snapshots or something is invoking behavior that isn't
>>>>> amenable to a full-FS delete.
>>>>> 
>>>>> I presume you're trying to characterize the system's behavior, but of
>>>>> course if you just want to empty it out entirely you're better off
>>>>> deleting the pools and the CephFS instance entirely and then starting
>>>>> it over again from scratch.
>>>>> -Greg
>>>> 
>>>> I believe it is reading all the data, judging by the volume of traffic, and
>>>> the CPU load on the OSDs suggests it may be doing more than just that.
>>>> 
>>>> iostat is showing a lot of data moving; I am seeing about the same volume
>>>> of read and write activity here. Because the OSDs underneath both pools
>>>> are the same ones (I know that's not exactly optimal), it is hard to tell
>>>> which pool is responsible for which I/O. Large reads and small writes
>>>> suggest it is reading up all the data from the objects; the write traffic
>>>> is, I presume, all journal activity relating to deleting objects and
>>>> creating the empty ones.
>>>> 
>>>> The 9:1 ratio between things being deleted and created seems odd though.
>>>> 
>>>> A previous version of this exercise with just a regular replicated data
>>>> pool did not read anything; there was just a lot of write activity and
>>>> eventually the content disappeared. So this is definitely related to the
>>>> pool configuration here and probably not to the filesystem layer.
>>> 
>>> Sam, does this make any sense to you in terms of how RADOS handles deletes?
>>> -Greg
>> 
>> 


