Another parallel effort could be trying to configure the number of
inodes/dentries cached by the kernel VFS using the /proc/sys/vm interface.
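For example, raising vfs_cache_pressure above its default of 100 makes the
kernel reclaim dentries and inodes more aggressively. A minimal sketch of
setting it programmatically; this is just the C equivalent of
`sysctl -w vm.vfs_cache_pressure=200`, and 200 is purely an illustrative
value that would need tuning per workload:

    #include <stdio.h>

    /* Write a new value to the vfs_cache_pressure sysctl (needs root).
     * 200 is only an example; the right value is workload-dependent. */
    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/vfs_cache_pressure", "w");

        if (!f) {
            perror("fopen /proc/sys/vm/vfs_cache_pressure");
            return 1;
        }
        fprintf(f, "200\n");
        fclose(f);
        return 0;
    }

The kernel documentation describes the knob: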
==============================================================

vfs_cache_pressure
------------------

This percentage value controls the tendency of the kernel to reclaim the
memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative
performance impact. Reclaim code needs to take various locks to find freeable
directory and inode objects. With vfs_cache_pressure=1000, it will look for
ten times more freeable objects than there are.

We also have an article for sysadmins which has a relevant section:

<quote>
With GlusterFS, many users with a lot of storage and many small files easily
end up using a lot of RAM on the server side due to 'inode/dentry' caching,
leading to decreased performance when the kernel keeps crawling through
data-structures on a 40GB RAM system. Changing this value higher than 100 has
helped many users to achieve fair caching and more responsiveness from the
kernel.
</quote>

The complete article can be found at:
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/

regards,

On Tue, Sep 5, 2017 at 5:20 PM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
> +gluster-devel
>
> Ashish just spoke to me about the need for GC of inodes due to some state
> in the inode that is being proposed in EC. Hence adding more people to the
> conversation.
>
> On 4 September 2017 at 12:34, Csaba Henk <ch...@redhat.com> wrote:
>> I don't know; it depends on how sophisticated a GC we need/want/can get
>> by with. I guess the complexity will be inherent, i.e. that of the
>> algorithm chosen and of how we address concurrency & performance impacts,
>> but once that's got right, the other aspects of the implementation won't
>> be hard.
>>
>> E.g. would it be good just to maintain a simple LRU list?
>
> Yes. I was also thinking of leveraging the lru list. We can invalidate the
> first "n" inodes from the lru list of the fuse inode table.
>
>> That might work for starters.
>>
>> Csaba
>>
>> On Mon, Sep 4, 2017 at 8:48 AM, Nithya Balachandran
>> <nbala...@redhat.com> wrote:
>>> On 4 September 2017 at 12:14, Csaba Henk <ch...@redhat.com> wrote:
>>>> Basically, this is how I see the fuse invalidate calls: as rescuers
>>>> of sanity.
>>>>
>>>> Normally, when you have a lot of a certain kind of stuff that tends to
>>>> accumulate, the immediate thought is: let's set up some garbage
>>>> collection mechanism that will take care of keeping the accumulation
>>>> at bay. But that doesn't work with inodes in a naive way, as they are
>>>> referenced from the kernel, so we have to keep them around until the
>>>> kernel tells us it's giving up its reference. However, with the fuse
>>>> invalidate calls we can take the initiative and instruct the kernel:
>>>> "hey, kernel, give up your references to this thing!"
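To make this concrete: in libfuse these are the low-level notify calls
(gluster's own fuse bridge writes the equivalent FUSE_NOTIFY_INVAL_*
messages to /dev/fuse directly). A minimal sketch against the libfuse 3
API; the session handle, name and inode numbers below are placeholders,
and return codes are ignored for brevity:

    #define FUSE_USE_VERSION 34
    #include <fuse_lowlevel.h>
    #include <string.h>

    /* Ask the kernel to drop the dentry parent/name and to invalidate
     * the attribute/data cache of inode `ino`.  When the kernel
     * eventually evicts the inode it sends FORGET, and the filesystem
     * can then really free it. */
    static void give_up_kernel_refs(struct fuse_session *se,
                                    fuse_ino_t parent, const char *name,
                                    fuse_ino_t ino)
    {
        fuse_lowlevel_notify_inval_entry(se, parent, name, strlen(name));
        fuse_lowlevel_notify_inval_inode(se, ino, 0, 0); /* 0,0: whole cache */
    }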
>>>> So we are actually free to implement any kind of inode GC in
>>>> glusterfs; we just have to take care to add the proper callback to
>>>> fuse_invalidate_* and we are good to go.
>>>
>>> That sounds good and something we need to do in the near future. Is
>>> this something that is easy to implement?
>>>
>>>> Csaba
>>>>
>>>> On Mon, Sep 4, 2017 at 7:00 AM, Nithya Balachandran
>>>> <nbala...@redhat.com> wrote:
>>>>> On 4 September 2017 at 10:25, Raghavendra Gowdappa
>>>>> <rgowd...@redhat.com> wrote:
>>>>>> ----- Original Message -----
>>>>>> > From: "Nithya Balachandran" <nbala...@redhat.com>
>>>>>> > Sent: Monday, September 4, 2017 10:19:37 AM
>>>>>> > Subject: Fuse mounts and inodes
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> > One of the reasons for the memory consumption in gluster fuse
>>>>>> > mounts is the number of inodes in the table which are never
>>>>>> > kicked out.
>>>>>> >
>>>>>> > Is there any way to default to an entry-timeout and
>>>>>> > attribute-timeout value while mounting Gluster using Fuse? Say
>>>>>> > 60s each, so those entries will be purged periodically?
>>>>>>
>>>>>> Once the entry times out, inodes won't be purged. The kernel sends a
>>>>>> lookup to revalidate the mapping of path to inode. AFAIK, reverse
>>>>>> invalidation (see inode_invalidate) is the only way to make the
>>>>>> kernel forget inodes/attributes.
>>>>>
>>>>> Is that something that can be done from the Fuse mount? Or is this
>>>>> something that needs to be added to Fuse?
>>>>>
>>>>>> > Regards,
>>>>>> > Nithya

--
Raghavendra G
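PS: a rough sketch of the lru-list idea from the thread above, i.e.
invalidating the first "n" inodes from the fuse inode table's lru list.
inode_table_t, the table's lru list and inode_invalidate() do exist in the
gluster sources, but the iteration below is simplified: real code would have
to hold the table lock and deal with races against concurrent lookups:

    #include "inode.h"  /* gluster's inode table and list macros */

    /* Sketch only: ask the kernel to give up the n least recently used
     * inodes.  The kernel answers with FORGETs, after which normal
     * ref-counting can purge them. */
    static void
    fuse_inode_table_gc(inode_table_t *table, uint32_t n)
    {
        inode_t *inode = NULL;
        inode_t *tmp = NULL;

        list_for_each_entry_safe(inode, tmp, &table->lru, list)
        {
            if (n-- == 0)
                break;
            inode_invalidate(inode);  /* reverse invalidation to kernel */
        }
    }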