Re: [Gluster-devel] md-cache improvements

Raghavendra G Tue, 16 Aug 2016 23:13:36 -0700

On Fri, Aug 12, 2016 at 10:29 AM, Raghavendra G <[email protected]>
wrote:


>
>
> On Thu, Aug 11, 2016 at 9:31 AM, Raghavendra G <[email protected]>
> wrote:
>
>> A couple more areas to explore:
>> 1. Purging the kernel dentry cache and/or page-cache too. Because of patch [1],
>> an upcall notification can result in a call to inode_invalidate, which results
>> in an "invalidate" notification to the fuse kernel module. While I am sure
>> that this notification will purge the page-cache from the kernel, I am not sure
>> about dentries. I assume that if an inode is invalidated, it should result in a
>> lookup (from kernel to glusterfs). Nevertheless, we should look into the
>> differences between entry_invalidation and inode_invalidation and harness
>> them appropriately.
>>
>> 2. Granularity of invalidation. For example, we shouldn't be purging the
>> page-cache in the kernel because of a change in an xattr used by an xlator (e.g.,
>> the dht layout xattr). We have to make sure that [1] handles this. We need
>> to add more granularity to invalidation (internal xattr invalidation,
>> user xattr invalidation, entry invalidation in the kernel, page-cache
>> invalidation in the kernel, attribute/stat invalidation in the kernel, etc.) and
>> use them judiciously, while making sure other cached data stays in place.
>>
>
> To stress the importance of this point, note that with tier
> there can be constant migration of files, which can result in spurious
> (from the perspective of the application) invalidations, even though the
> application is not doing any writes on files [2][3][4]. Also, even if the
> application is writing to a file, there is no point in invalidating the dentry
> cache. We should explore more ways to solve [2][3][4].
>
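To make the granularity idea concrete, here is a minimal sketch of how
events could map to the narrowest set of caches to purge. Every name in it
(the Inval flags, the event strings, the mapping itself) is hypothetical and
is not taken from glusterfs or from patch [1]; in particular, treating a tier
migration as an internal-xattr-only change is an assumption for illustration.

```python
from enum import IntFlag


class Inval(IntFlag):
    """Hypothetical invalidation classes; not glusterfs identifiers."""
    NONE = 0
    INTERNAL_XATTR = 1   # xattrs owned by xlators, e.g. the dht layout
    USER_XATTR = 2       # xattrs set by applications
    STAT = 4             # attributes cached by the kernel
    ENTRY = 8            # kernel dentry
    PAGE_CACHE = 16      # kernel page-cache


KERNEL_CACHES = Inval.STAT | Inval.ENTRY | Inval.PAGE_CACHE
ALL = Inval.INTERNAL_XATTR | Inval.USER_XATTR | KERNEL_CACHES

# Hypothetical event names mapped to the caches they should purge.
EVENT_TO_INVAL = {
    "internal_xattr_change": Inval.INTERNAL_XATTR,
    "user_xattr_change": Inval.USER_XATTR,
    "write": Inval.STAT | Inval.PAGE_CACHE,   # no dentry purge on write
    "unlink": Inval.STAT | Inval.ENTRY | Inval.PAGE_CACHE,
    "tier_migration": Inval.INTERNAL_XATTR,   # assumption, see above
}


def caches_to_purge(event):
    # Unknown events fall back to purging everything, the safe default.
    return EVENT_TO_INVAL.get(event, ALL)


def touches_kernel(event):
    # True if the event requires any notification to the fuse kernel module.
    return bool(caches_to_purge(event) & KERNEL_CACHES)
```

With a table like this, a change to an internal xattr never reaches the
kernel, and a write leaves the dentry cache alone.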
> 3. We have a long-standing issue of spurious termination of the fuse
> invalidation thread. Since the thread is not re-spawned after termination,
> we are then unable to purge the kernel entry/attribute/page-cache. This issue
> was touched upon during a discussion [5], though we didn't solve the
> problem then for lack of bandwidth. Csaba has agreed to work on this issue.
>
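One generic way to handle a worker that dies unexpectedly is a small
supervisor that re-spawns it. The sketch below only illustrates that pattern
in Python; it is not how fuse-bridge is structured, and supervise() and its
parameters are made-up names. Finding the root cause of the termination is
still the better fix.

```python
import threading


def supervise(target, stop_event, backoff=0.1, max_restarts=None):
    """Run target in a worker thread and re-spawn it whenever it exits
    while stop_event is unset. Returns the number of restarts performed."""
    restarts = 0
    while True:
        worker = threading.Thread(target=target, daemon=True)
        worker.start()
        worker.join()                 # returns when the worker exits or dies
        if stop_event.is_set():
            return restarts           # orderly shutdown, do not re-spawn
        if max_restarts is not None and restarts >= max_restarts:
            return restarts           # give up instead of looping forever
        restarts += 1
        # Brief pause so a worker that dies immediately cannot spin the CPU.
        if stop_event.wait(backoff):
            return restarts
```

The supervisor itself would run in its own thread, with stop_event set only
at unmount.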

4. Flooding of the network with upcall notifications. Is it a problem? If yes,
does the upcall infra already solve it? Would NFS/SMB leases help here?
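
If flooding does turn out to be a problem, one possible mitigation is to
coalesce notifications per inode over a short window and send them as one
batch. The sketch below is only an illustration; UpcallCoalescer, the window
length and the integer flags are hypothetical, and I have not checked whether
the upcall infra already does something similar. It trades a small delay in
invalidation for fewer messages, which may or may not be acceptable.

```python
class UpcallCoalescer:
    """Merge invalidations per gfid and release them in one batch."""

    def __init__(self, window=0.05):
        self.window = window      # seconds to wait before sending a batch
        self.pending = {}         # gfid -> OR of invalidation flags
        self.deadline = None      # time at which the current batch is due

    def add(self, gfid, flags, now):
        # Repeated events on the same inode collapse into a single entry.
        self.pending[gfid] = self.pending.get(gfid, 0) | flags
        if self.deadline is None:
            self.deadline = now + self.window

    def flush_if_due(self, now):
        # Return the batch as a sorted list of (gfid, flags), or [] if the
        # window has not elapsed yet or nothing is pending.
        if self.deadline is None or now < self.deadline:
            return []
        batch = sorted(self.pending.items())
        self.pending.clear()
        self.deadline = None
        return batch
```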


> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c7
> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c8
> [4] https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c9
> [5] http://review.gluster.org/#/c/13274/1/xlators/mount/fuse/src/fuse-bridge.c
>
>
>>
>> [1] http://review.gluster.org/12951
>>
>>
>> On Wed, Aug 10, 2016 at 10:35 PM, Dan Lambright <[email protected]>
>> wrote:
>>
>>>
>>> There have been recurring discussions within the gluster community to
>>> build on existing support for md-cache and upcalls to help performance for
>>> small file workloads. In certain cases, "lookup amplification" dominates
>>> data transfers, i.e. the cumulative round trip times of multiple LOOKUPs
>>> from the client cancel out the benefits of faster backend storage.
>>>
>>> To tackle this problem, one suggestion is to more aggressively utilize
>>> md-cache to cache inodes on the client than is currently done. The inodes
>>> would be cached until they are invalidated by the server.
>>>
>>> Several gluster development engineers within the DHT, NFS, and Samba
>>> teams have been involved with related efforts, which have been underway for
>>> some time now. At this juncture, comments are requested from gluster
>>> developers.
>>>
>>> (1) .. help call out where additional upcalls would be needed to
>>> invalidate stale client cache entries (in particular, need feedback from
>>> DHT/AFR areas),
>>>
>>> (2) .. identify failure cases, when we cannot trust the contents of
>>> md-cache, e.g. when an upcall may have been dropped by the network
>>>
>>> (3) .. point out additional improvements which md-cache needs. For
>>> example, it cannot be allowed to grow unbounded.
>>>
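On the point about unbounded growth, a plain LRU with an upper limit is the
obvious starting place. The sketch below shows the idea in Python;
BoundedMdCache and max_entries are hypothetical names, and it makes no claim
about how md-cache stores its data today.

```python
from collections import OrderedDict


class BoundedMdCache:
    """Metadata cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.entries = OrderedDict()   # gfid -> cached stat/xattr data

    def get(self, gfid):
        if gfid not in self.entries:
            return None                # miss: caller must send a LOOKUP
        self.entries.move_to_end(gfid) # mark as most recently used
        return self.entries[gfid]

    def put(self, gfid, stat):
        self.entries[gfid] = stat
        self.entries.move_to_end(gfid)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)   # drop the oldest entry

    def invalidate(self, gfid):
        # Called when the server sends an upcall for this inode.
        self.entries.pop(gfid, None)
```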
>>> Dan
>>>
>>> ----- Original Message -----
>>> > From: "Raghavendra Gowdappa" <[email protected]>
>>> >
>>> > List of areas where we need invalidation notification:
>>> > 1. Any changes to xattrs used by xlators to store metadata (like the dht
>>> > layout xattr, afr xattrs, etc.).
>>> > 2. Scenarios where an individual xlator feels it needs a lookup. For
>>> > example, failed directory creation on a non-hashed subvol in dht during
>>> > mkdir. Though dht reports the mkdir as successful, it would be better not
>>> > to cache this inode, as a subsequent lookup will heal the directory and
>>> > make things better.
>>> > 3. Removal of files.
>>> > 4. writev on a brick (to invalidate the read cache on the client).
>>> >
>>> > Other questions:
>>> > 5. Does md-cache have cache management, like an LRU or an upper limit for
>>> > the cache?
>>> > 6. Network disconnects and invalidating the cache. When a network
>>> > disconnect happens, we need to invalidate the cache for inodes present on
>>> > that brick, as we might be missing some notifications. The current
>>> > approach of purging the cache of all inodes might not be optimal, as it
>>> > might roll back the benefits of caching. Also, please note that network
>>> > disconnects are not rare events.
>>> >
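For point 6, purging only the inodes served by the disconnected brick needs
an index from brick to cached inodes. A minimal sketch follows, assuming for
simplicity that each cached inode is tied to exactly one brick (not true with
replication, where an inode lives on several bricks). BrickAwareCache and its
methods are hypothetical names, not md-cache code.

```python
from collections import defaultdict


class BrickAwareCache:
    """Cache that can drop only the entries belonging to one brick."""

    def __init__(self):
        self.entries = {}                  # gfid -> (brick, stat)
        self.by_brick = defaultdict(set)   # brick -> set of gfids

    def put(self, gfid, brick, stat):
        self.invalidate(gfid)              # drop any stale brick mapping
        self.entries[gfid] = (brick, stat)
        self.by_brick[brick].add(gfid)

    def get(self, gfid):
        entry = self.entries.get(gfid)
        return None if entry is None else entry[1]

    def invalidate(self, gfid):
        entry = self.entries.pop(gfid, None)
        if entry is not None:
            self.by_brick[entry[0]].discard(gfid)

    def on_disconnect(self, brick):
        # Upcalls from this brick may have been missed, so its entries can
        # no longer be trusted. Entries from other bricks stay cached.
        gfids = self.by_brick.pop(brick, set())
        for gfid in gfids:
            self.entries.pop(gfid, None)
        return len(gfids)
```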
>>> > regards,
>>> > Raghavendra
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> [email protected]
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>
>>
>>
>> --
>> Raghavendra G
>>
>
>
>
> --
> Raghavendra G
>



-- 
Raghavendra G

