On Thu, Apr 21, 2011 at 2:54 PM, Yehuda Sadeh Weinraub
<[email protected]> wrote:
> On Thu, Apr 21, 2011 at 2:44 PM, Colin McCabe <[email protected]> wrote:
>> On Thu, Apr 21, 2011 at 2:23 PM, Yehuda Sadeh Weinraub
>> <[email protected]> wrote:
>>> On Thu, Apr 21, 2011 at 2:09 PM, Colin McCabe <[email protected]> wrote:
>>>> On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum
>>>> <[email protected]> wrote:
>>>>> I really don't see how pushing the naming complexity into the local
>>>>> filesystem, where it adds lots of otherwise-useless inodes and dentries,
>>>>> is going to help us.
>>>>
>>>> Here is a quick summary of how TV's proposal would help us.
>>>> 1. It avoids collisions entirely.
>>>> 2. You don't ever have to do an extra xattr lookup, no matter how short
>>>> or long the object name is.
>>>
>>> Yeah, but you read more directories. Note that btrfs stores the xattrs
>>> on the directories, so reading those xattrs will have a lower IO
>>> impact than traversing directories recursively.
>>
>> It does seem like btrfs' extended attribute implementation is fairly
>> efficient. But Linux's dentry cache (dcache) is also pretty efficient.
>>
> (resending to list)
>
> It needs to be populated first before being efficient. And it'll be
> less efficient now that you populate it with extra entries.

That is a good point. However, xattrs also have a cost. It seems like
btrfs sometimes creates an inode for xattrs, and sometimes just
stashes them in the dentry (presumably if there aren't many and
they're small?).

The xattr scheme always creates an extra xattr per entry. The
directory-based scheme creates extra directories, but not that many,
assuming a lot of objects have names with similar prefixes, an
assumption that is likely to hold nearly all the time.
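
To put a rough number on that, here is a small sketch. It assumes the
directory-based scheme splits each object name into fixed-size path
components, with every component but the last becoming a directory. The
thread does not spell out the actual layout, so CHUNK, the splitting rule,
and the sample "rb.0.1234." names are all made up for illustration.

```python
# Hypothetical layout: CHUNK and the splitting rule are assumptions,
# not values taken from the proposal under discussion.
CHUNK = 8

def name_to_components(name, chunk=CHUNK):
    """Split an object name into pieces of at most `chunk` characters."""
    return [name[i:i + chunk] for i in range(0, len(name), chunk)]

def extra_directories(names, chunk=CHUNK):
    """Count the distinct intermediate directories needed for all names."""
    dirs = set()
    for name in names:
        parts = name_to_components(name, chunk)
        # Every component except the last one would be a directory.
        for depth in range(1, len(parts)):
            dirs.add(tuple(parts[:depth]))
    return len(dirs)

# Made-up object names that share a long common prefix.
names = ["rb.0.1234.%012d" % i for i in range(1000)]
print(extra_directories(names), "extra directories vs",
      len(names), "extra xattrs")
```

With names like these the directory count stays small because the shared
prefix is only created once, while the xattr scheme pays one xattr per
object. Names with few shared prefixes would shift the balance the other way.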

I think both schemes are doable, but I still lean towards the
directory-based one, just because I like fast prefix search.
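
For what I mean by prefix search, here is a minimal in-memory model. Nested
dicts stand in for directories, and names are split into fixed-size
components. CHUNK, the helper names, and the sample object names are
assumptions for illustration only, not the real on-disk layout.

```python
CHUNK = 8  # assumed component size, not taken from the proposal

def split(name):
    """Split a name into components of at most CHUNK characters."""
    return [name[i:i + CHUNK] for i in range(0, len(name), CHUNK)]

def insert(tree, name):
    """Store a name; each nested dict models one directory level."""
    node = tree
    for part in split(name):
        node = node.setdefault(part, {})
    node[""] = name  # the empty key marks a stored object

def walk(node):
    """Yield every object stored at or below this node."""
    for key, child in sorted(node.items()):
        if key == "":
            yield child
        else:
            yield from walk(child)

def prefix_search(tree, prefix):
    """Yield the stored names that start with `prefix`."""
    parts = split(prefix)
    # A trailing component shorter than CHUNK is only a partial name.
    partial = parts.pop() if parts and len(parts[-1]) < CHUNK else ""
    node = tree
    for part in parts:
        if part not in node:
            return  # no such directory, so nothing matches
        node = node[part]
    # Only entries in this one directory need to be compared.
    for key, child in sorted(node.items()):
        if key.startswith(partial):
            if key == "":
                yield child
            else:
                yield from walk(child)

tree = {}
for n in ["rb.0.1234.0001", "rb.0.1234.0002", "rb.0.9999.0001", "other"]:
    insert(tree, n)
print(list(prefix_search(tree, "rb.0.1234")))
```

The search descends straight to the directory that covers the prefix and
only visits the subtree below it, instead of scanning every object name.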

Colin
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
