Yea, open_fd_count is broken…


We have been working on the right way to fix it.



Frank



From: bharat singh [mailto:bharat064...@gmail.com]
Sent: Saturday, February 10, 2018 7:42 PM
To: Malahal Naineni <mala...@gmail.com>
Cc: nfs-ganesha-devel@lists.sourceforge.net
Subject: Re: [Nfs-ganesha-devel] Ganesha V2.5.2: mdcache high open_fd_count



Hey,



I think there is a leak in open_fd_count.



fsal_rdwr() uses fsal_open() to open the file, but uses obj->obj_ops.close(obj) 
to close the file and there is no decrement of open_fd_count.

So this counter keeps increasing and I could easily hit the 4k hard limit with 
prolonged read/writes.



I changed it to use fsal_close() as it also does the decrement. After this 
change the open_fd_count was looking OK.
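
To make the imbalance described above concrete, here is a minimal sketch of a counter that is incremented by a counting open wrapper but skipped by a raw close. All names (toy_obj, toy_counted_open, toy_raw_close, toy_counted_close, toy_run_cycles) are hypothetical stand-ins for illustration; this is not Ganesha's code, only a model of the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not Ganesha's real types or functions. */
struct toy_obj {
	bool is_open;
};

static size_t toy_open_fd_count; /* models the global open_fd_count */

/* Counting wrapper: opens the object and bumps the counter
 * (models the role fsal_open() plays in the description above). */
static void toy_counted_open(struct toy_obj *obj)
{
	obj->is_open = true;
	toy_open_fd_count++;
}

/* Raw close: closes the object but never touches the counter
 * (models calling obj->obj_ops.close(obj) directly). */
static void toy_raw_close(struct toy_obj *obj)
{
	obj->is_open = false;
}

/* Counting wrapper: closes and decrements
 * (models the role fsal_close() plays in the description above). */
static void toy_counted_close(struct toy_obj *obj)
{
	assert(obj->is_open); /* in this model we only close open objects */
	toy_raw_close(obj);
	toy_open_fd_count--;
}

/* Run n open/close cycles and return the counter afterwards. */
static size_t toy_run_cycles(size_t n, bool use_counted_close)
{
	struct toy_obj obj = { .is_open = false };

	toy_open_fd_count = 0;
	for (size_t i = 0; i < n; i++) {
		toy_counted_open(&obj);
		if (use_counted_close)
			toy_counted_close(&obj);
		else
			toy_raw_close(&obj);
	}
	return toy_open_fd_count;
}
```

In this model, toy_run_cycles(5000, false) leaves the counter at 5000 even though no object is still open, while toy_run_cycles(5000, true) returns it to 0. That matches the observation that the counter only looked sane once the close path also decremented.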

But recently I saw open_fd_count underflow to 
open_fd_count=18446744073709551615.



So I suspect a double close. Any suggestions?
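
The reported value 18446744073709551615 equals 2^64 - 1, which is what an unsigned 64-bit counter holds after being decremented once from 0. The sketch below assumes the counter is an unsigned 64-bit integer (consistent with the printed value) and uses hypothetical names (toy_file, toy_close_unguarded, toy_close_guarded, toy_double_close) to show how one open followed by two closes wraps the counter, and how a guard on the open state avoids it. It is a single-threaded model, not Ganesha's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; names are illustrative, not Ganesha's API. */
struct toy_file {
	bool is_open;
};

/* Unconditional close: decrements even if the file is already closed. */
static void toy_close_unguarded(struct toy_file *f, uint64_t *count)
{
	f->is_open = false;
	(*count)--; /* unsigned arithmetic wraps modulo 2^64 */
}

/* Guarded close: only the call that actually closes the file decrements.
 * Returns true if this call performed the close. */
static bool toy_close_guarded(struct toy_file *f, uint64_t *count)
{
	if (!f->is_open)
		return false; /* a second close is a no-op */
	f->is_open = false;
	assert(*count > 0); /* an open file must have been counted */
	(*count)--;
	return true;
}

/* Open once, close twice; return the resulting counter. */
static uint64_t toy_double_close(bool guarded)
{
	struct toy_file f = { .is_open = true };
	uint64_t count = 1; /* one open file accounted for */

	for (int i = 0; i < 2; i++) {
		if (guarded)
			toy_close_guarded(&f, &count);
		else
			toy_close_unguarded(&f, &count);
	}
	return count;
}
```

Here toy_double_close(false) yields UINT64_MAX, the same value seen in the log, and toy_double_close(true) yields 0. A real server is multi-threaded, so the state check and the decrement would need to happen atomically or under a lock; the sketch leaves that out.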



Code snippet from V2.5-stable/src/FSAL/fsal_helper.c:

fsal_status_t fsal_rdwr(struct fsal_obj_handle *obj,
                        fsal_io_direction_t io_direction,
                        uint64_t offset, size_t io_size,
                        size_t *bytes_moved, void *buffer,
                        bool *eof,
                        bool *sync, struct io_info *info)
{
...
        loflags = obj->obj_ops.status(obj);
        while ((!fsal_is_open(obj))
               || (loflags && loflags != FSAL_O_RDWR && loflags != openflags)) {
                loflags = obj->obj_ops.status(obj);
                if ((!fsal_is_open(obj))
                    || (loflags && loflags != FSAL_O_RDWR
                        && loflags != openflags)) {
                        fsal_status = fsal_open(obj, openflags);
                        if (FSAL_IS_ERROR(fsal_status))
                                goto out;
                        opened = true;
                }
                loflags = obj->obj_ops.status(obj);
        }
...
                if ((fsal_status.major != ERR_FSAL_NOT_OPENED)
                    && (obj->obj_ops.status(obj) != FSAL_O_CLOSED)) {
                        LogFullDebug(COMPONENT_FSAL,
                                     "fsal_rdwr_plus: CLOSING file %p",
                                     obj);

                        fsal_status = obj->obj_ops.close(obj);   /* >>>>>>>> using fsal_close here? */
                        if (FSAL_IS_ERROR(fsal_status)) {
                                LogCrit(COMPONENT_FSAL,
                                        "Error closing file in fsal_rdwr_plus: %s.",
                                        fsal_err_txt(fsal_status));
                        }
                }
...
        if (opened) {
                fsal_status = obj->obj_ops.close(obj);   /* >>>>>>>> using fsal_close here? */
                if (FSAL_IS_ERROR(fsal_status)) {
                        LogEvent(COMPONENT_FSAL,
                                 "fsal_rdwr_plus: close = %s",
                                 fsal_err_txt(fsal_status));
                        goto out;
                }
        }
...
}





On Tue, Jan 2, 2018 at 12:30 AM, Malahal Naineni <mala...@gmail.com> wrote:

The links I gave you will have everything you need. You should be able to 
download gerrit reviews by "git review -d <number>" or download from the gerrit 
web gui.



"390496" is merged upstream, but the other one is not merged yet.



$ git log --oneline --grep='Fix closing global file descriptors' origin/next

5c2efa8f0 Fix closing global file descriptors











On Tue, Jan 2, 2018 at 3:22 AM, bharat singh <bharat064...@gmail.com> wrote:

Thanks Malahal



Can you point me to these issues/fixes? I will try to patch V2.5-stable and run 
my tests.



Thanks,

Bharat



On Mon, Jan 1, 2018 at 10:20 AM, Malahal Naineni <mala...@gmail.com> wrote:

>> I see that mdcache keeps growing beyond the high water mark and lru 
>> reclamation can’t keep up.



mdcache is different from the "FD" cache. I don't think we found an issue with 
mdcache itself. We found a couple of issues with the "FD cache":



1) https://review.gerrithub.io/#/c/391266/

2) https://review.gerrithub.io/#/c/390496/



Neither of them is in V2.5-stable at this point. We will have to backport 
these and others soon.



Regards, Malahal.



On Mon, Jan 1, 2018 at 11:04 PM, bharat singh <bharat064...@gmail.com> wrote:

Adding nfs-ganesha-support..





On Fri, Dec 29, 2017 at 11:01 AM, bharat singh <bharat064...@gmail.com> wrote:

Hello,



I am testing the NFSv3 Ganesha implementation against the nfstest_io tool. I 
see that mdcache keeps growing beyond the high water mark and LRU reclamation 
can't keep up.



[cache_lru] lru_run :INODE LRU :CRIT :Futility count exceeded.  The LRU thread 
is unable to make progress in reclaiming FDs.  Disabling FD cache.

mdcache_lru_fds_available :INODE LRU :INFO :FDs above high water mark, waking 
LRU thread. open_fd_count=14196, lru_state.fds_hiwat=3686, 
lru_state.fds_lowat=2048, lru_state.fds_hard_limit=4055
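
As a worked check on the numbers in the log above: the three thresholds are consistent with 99%, 90%, and 50% of a 4096-descriptor limit, rounded down. Both the 4096 limit and the percentages are assumptions inferred from the arithmetic, not values stated in this thread. The helper names (pct_of, classify_fd_pressure) are hypothetical and only illustrate the comparison the log message implies.

```c
#include <assert.h>
#include <stdint.h>

/* Integer percentage of a limit, rounded down. */
static uint32_t pct_of(uint32_t limit, uint32_t pct)
{
	assert(pct <= 100);
	return (uint32_t)(((uint64_t)limit * pct) / 100);
}

/* Hypothetical classification of the open-descriptor count. */
enum fd_pressure {
	FD_BELOW_LOWAT,
	FD_BETWEEN_MARKS,
	FD_ABOVE_HIWAT,
	FD_ABOVE_HARD_LIMIT
};

/* Compare the current count against the three thresholds. */
static enum fd_pressure classify_fd_pressure(uint64_t open_fds,
					     uint32_t lowat,
					     uint32_t hiwat,
					     uint32_t hard_limit)
{
	if (open_fds > hard_limit)
		return FD_ABOVE_HARD_LIMIT;
	if (open_fds > hiwat)
		return FD_ABOVE_HIWAT;
	if (open_fds < lowat)
		return FD_BELOW_LOWAT;
	return FD_BETWEEN_MARKS;
}
```

With an assumed limit of 4096, pct_of(4096, 99), pct_of(4096, 90), and pct_of(4096, 50) give 4055, 3686, and 2048, matching fds_hard_limit, fds_hiwat, and fds_lowat in the log. The logged open_fd_count of 14196 is well past the hard limit, which fits a counter that no longer tracks the real number of open descriptors.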



I am on Ganesha V2.5.2 with default config settings.



So a couple of questions:

1. Is Ganesha tested against tools of this kind, which do a bunch of 
opens and closes in quick succession?

2. Is there a way to suppress these error messages and/or expedite the LRU 
reclamation process?

3. Any suggestions regarding the use of tools like this with Ganesha?





Thanks,

Bharat







--

-Bharat







------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel








