[Nfs-ganesha-devel] read-only open causing file to be truncated with NFSv4/ganesha.

2018-10-12 Thread Pradeep
This list has been deprecated. Please subscribe to the new devel list at
lists.nfs-ganesha.org.

Hello,

I'm seeing an issue with a specific application flow against the Ganesha
server (2.6.5). Here is what the application does:

- fd1 = open(file1, O_WRONLY|O_CREAT|O_TRUNC)
  Ganesha creates a state and stores the flags in fd->openflags (see
  vfs_open2()).
- write(fd1, ...)
- fd2 = open(file1, O_RDONLY)
  Ganesha finds the state from the first open, takes its stored flags and
  calls fsal_reopen2() with the old and new flags OR'd together. This causes
  the file to be truncated, because the original open was done with O_TRUNC.
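
For illustration, here is a minimal standalone sketch of that flag merge (this
is not the actual vfs_open2()/fsal_reopen2() code, just the failure mode it
produces):

#include <fcntl.h>

/* If the stored flags from the first open still carry O_TRUNC, OR'ing them
 * into the flags of a later read-only open truncates the file again. */
int reopen_merged(const char *path, int stored_flags, int new_flags)
{
    int merged = stored_flags | new_flags;  /* O_TRUNC survives the merge */

    return open(path, merged, 0666);        /* re-truncates the file */
}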

I don't see this behavior with kernel NFS or a local filesystem. Any
suggestions on fixing this? Here is a quick and dirty program to reproduce
it - you can see that the second stat prints zero as the size on a mount
from a Ganesha server.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
  char *fn = argv[1];
  int fd;
  char buf[1024];
  struct stat stbuf;
  int rc;


  fd = open(fn, O_WRONLY|O_CREAT|O_TRUNC, 0666);
  printf("open(%s) = %d\n", fn, fd);

  ssize_t ret;
  ret = write(fd, buf, sizeof(buf));
  printf("write returned %ld\n", ret);

  rc = stat(fn, &stbuf);
  printf("size: %ld\n", (long)stbuf.st_size);  /* expected: 1024 */

  int fd1;
  fd1 = open(fn, O_RDONLY);
  printf("open(%s) = %d\n", fn, fd1);

  rc = stat(fn, &stbuf);
  printf("size: %ld\n", (long)stbuf.st_size);  /* prints 0 when served by Ganesha */

  return 0;
}

Thanks,
Pradeep


[Nfs-ganesha-devel] Couple of stat issues.

2018-04-10 Thread Pradeep
Hello,

Here are a couple of statistics-related issues I noticed:

1. NFSv4 compounds are counted multiple times:

For individual ops, Ganesha takes this path:
server_stats_nfsv4_op_done() -> record_nfsv4_op() ->
record_op(&sp->compounds, ...)
Once the compound is complete, we take another path:
server_stats_compound_done() -> record_compound() ->
record_op(&sp->compounds, ...)
In both cases the same counter is incremented, so sp->compounds is
incremented one extra time in both the export and the client stats.
I think the second call is meant to measure the number of compounds from a
client (not the individual ops). So is it OK to use a different variable
for that?

2. In nfs_rpc_execute(), queue_wait is set to the difference between
op_ctx->start_time and reqdata->time_queued. But reqdata->time_queued is
never set (in the old code, pre 2.6-dev5, nfs_rpc_enqueue_req() used to
set it; now only the 9P code sets it). Is nfs_rpc_decode_request() a good
place to set it?
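
For reference, here is a self-contained sketch of the measurement in question
(the field names mirror the Ganesha ones, but this is illustrative code, not
the actual dispatcher):

#include <time.h>

struct demo_req {
    struct timespec time_queued;   /* would be stamped at decode time */
    struct timespec start_time;    /* stamped when execution starts */
};

/* Stamp the request as soon as it is decoded - the placement being proposed
 * for nfs_rpc_decode_request(). */
static void on_decode(struct demo_req *r)
{
    clock_gettime(CLOCK_MONOTONIC, &r->time_queued);
}

/* queue_wait is then simply start_time - time_queued, in nanoseconds. */
static long queue_wait_ns(const struct demo_req *r)
{
    return (r->start_time.tv_sec - r->time_queued.tv_sec) * 1000000000L
         + (r->start_time.tv_nsec - r->time_queued.tv_nsec);
}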


Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2018-04-04 Thread Pradeep
Hi Daniel,

I tried increasing lanes to 1023. The usage looks better, but still over
the limit:

$2 = {entries_hiwat = 10, entries_used = 299838, chunks_hiwat = 10,
chunks_used = 1235, fds_system_imposed = 1048576,
  fds_hard_limit = 1038090, fds_hiwat = 943718, fds_lowat = 524288,
futility = 0, per_lane_work = 50, biggest_window = 419430,
  prev_fd_count = 39434, prev_time = 1522775283, caching_fds = true}

I'm trying to simulate a build workload by running the SpecFS SWBUILD
workload. This is with Ganesha 2.7 and FSAL_VFS. The server has 4 CPUs and
12 GB of memory.

For build 8 (40 processes), the latency increased from 5 ms (with 17 lanes)
to 22 ms (with 1023 lanes) and the test failed to achieve the required IOPS.

Thanks,
Pradeep

On Tue, Apr 3, 2018 at 7:58 AM, Pradeep <pradeeptho...@gmail.com> wrote:

> Hi Daniel,
>
> Sure I will try that.
>
> One thing I tried is to not allocate new entries and return
> NFS4ERR_DELAY in the hope that the increased refcnt at LRU is
> temporary. This worked for some time; but then I hit a case where I
> see all the entries at the LRU of L1 has a refcnt of 2 and the
> subsequent entries have a refcnt of 1. All L2's were empty. I realized
> that whenever a new entry is created, the refcnt is 2 and it is put at
> the LRU. Also promotions from L2 moves them to LRU of L1. So it is
> likely that many threads may end up finding no entries at LRU and end
> allocating new entries.
>
> Then I tried another experiment: Invoke lru_wake_thread() when the
> number of entries is greater than entries_hiwat; but still allocate a
> new entry for the current thread. This worked. I had to make a change
> in lru_run() to allow demotion in case of 'entries > entries_hiwat' in
> addition to max FD check. The side effect would be that it will close
> FDs and demote to L2. Almost all of these FDs are opened in the
> context of setattr/getattr; so attributes are already in cache and FDs
> are probably useless until the cache expires.  I think your idea of
> moving further down the lane may be a better approach.
>
> I will try your suggestion next. With 1023 lanes, it is unlikely that
> all lanes will have an active entry.
>
> Thanks,
> Pradeep
>
> On 4/3/18, Daniel Gryniewicz <d...@redhat.com> wrote:
> > So, the way this is supposed to work is that getting a ref when the ref
> > is 1 is always an LRU_REQ_INITIAL ref, so that moves it to the MRU.  At
> > that point, further refs don't move it around in the queue, just
> > increment the refcount.  This should be the case, because
> > mdcache_new_entry() and mdcache_find_keyed() both get an INITIAL ref,
> > and all other refs require you to already have a pointer to the entry
> > (and therefore a ref).
> >
> > Can you try something, since you have a reproducer?  It seems that, with
> > 1.7 million files, 17 lanes may be a bit low.  Can you try with
> > something ridiculously large, like 1023, and see if that makes a
> > difference?
> >
> > I suspect we'll have to add logic to move further down the lanes if
> > futility hits.
> >
> > Daniel
> >
> > On 04/02/2018 12:30 PM, Pradeep wrote:
> >> We discussed this a while ago. I'm running into this again with 2.6.0.
> >> Here is a snapshot of the lru_state (I set the max entries to 10):
> >>
> >> {entries_hiwat = 20, entries_used = 1772870, chunks_hiwat = 10,
> >> chunks_used = 16371, lru_reap_l1 = 8116842,
> >>lru_reap_l2 = 1637334, lru_reap_failed = 1637334, attr_from_cache =
> >> 31917512, attr_from_cache_for_client = 5975849,
> >>fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat =
> >> 943718, fds_lowat = 524288, futility = 0, per_lane_work = 50,
> >>biggest_window = 419430, prev_fd_count = 0, prev_time = 1522647830,
> >> caching_fds = true}
> >>
> >> As you can see it has grown well beyond the limit set (1.7 million vs
> >> 200K max size). lru_reap_failed indicates number of times the reap
> >> failed from L1 and L2.
> >> I'm wondering what can cause the reap to fail once it reaches a steady
> >> state. It appears to me that the entry at LRU (head of the queue) is
> >> actually being used (refcnt > 1) and there are entries in the queue with
> >> refcnt == 1. But those are not being looked at. My understanding is that
> >> if an entry is accessed, it must move to MRU (tail of the queue). Any
> >> idea why the entry at LRU can have a refcnt > 1?
> >>
> >> This can happen if the refcnt is incremented without QLOCK and if
> >> lru_reap_impl() is called at the same time from another thread, it will
> >> skip the first entry and return NULL.

Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2018-04-03 Thread Pradeep
Hi Daniel,

Sure I will try that.

One thing I tried is to not allocate new entries and return
NFS4ERR_DELAY, in the hope that the increased refcnt at the LRU is
temporary. This worked for some time, but then I hit a case where all
the entries at the LRU end of L1 had a refcnt of 2 and the subsequent
entries had a refcnt of 1. All L2s were empty. I realized that whenever
a new entry is created, its refcnt is 2 and it is put at the LRU end.
Promotions from L2 also move entries to the LRU end of L1. So it is
likely that many threads end up finding no reapable entries at the LRU
and allocate new entries instead.

Then I tried another experiment: invoke lru_wake_thread() when the
number of entries is greater than entries_hiwat, but still allocate a
new entry for the current thread. This worked. I had to make a change
in lru_run() to allow demotion when 'entries > entries_hiwat', in
addition to the max-FD check. The side effect is that it will close
FDs and demote entries to L2. Almost all of these FDs are opened in the
context of setattr/getattr, so the attributes are already in the cache
and the FDs are probably useless until the cache expires.  I think your
idea of moving further down the lane may be a better approach.
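
Roughly, the second experiment looks like this (illustrative names only, not
the exact mdcache symbols and not the real patch):

/* On entry allocation: if the cache is over its high water mark, kick the
 * LRU background thread so lru_run() can demote/close early, but still
 * allocate for the current request instead of returning NFS4ERR_DELAY. */
if (lru_state.entries_used > lru_state.entries_hiwat)
    lru_wake_thread();

entry = alloc_new_entry();   /* hypothetical allocation helper */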

I will try your suggestion next. With 1023 lanes, it is unlikely that
all lanes will have an active entry.

Thanks,
Pradeep

On 4/3/18, Daniel Gryniewicz <d...@redhat.com> wrote:
> So, the way this is supposed to work is that getting a ref when the ref
> is 1 is always an LRU_REQ_INITIAL ref, so that moves it to the MRU.  At
> that point, further refs don't move it around in the queue, just
> increment the refcount.  This should be the case, because
> mdcache_new_entry() and mdcache_find_keyed() both get an INITIAL ref,
> and all other refs require you to already have a pointer to the entry
> (and therefore a ref).
>
> Can you try something, since you have a reproducer?  It seems that, with
> 1.7 million files, 17 lanes may be a bit low.  Can you try with
> something ridiculously large, like 1023, and see if that makes a
> difference?
>
> I suspect we'll have to add logic to move further down the lanes if
> futility hits.
>
> Daniel
>
> On 04/02/2018 12:30 PM, Pradeep wrote:
>> We discussed this a while ago. I'm running into this again with 2.6.0.
>> Here is a snapshot of the lru_state (I set the max entries to 10):
>>
>> {entries_hiwat = 20, entries_used = 1772870, chunks_hiwat = 10,
>> chunks_used = 16371, lru_reap_l1 = 8116842,
>>    lru_reap_l2 = 1637334, lru_reap_failed = 1637334, attr_from_cache =
>> 31917512, attr_from_cache_for_client = 5975849,
>>    fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat =
>> 943718, fds_lowat = 524288, futility = 0, per_lane_work = 50,
>>    biggest_window = 419430, prev_fd_count = 0, prev_time = 1522647830,
>> caching_fds = true}
>>
>> As you can see it has grown well beyond the limit set (1.7 million vs
>> 200K max size). lru_reap_failed indicates number of times the reap
>> failed from L1 and L2.
>> I'm wondering what can cause the reap to fail once it reaches a steady
>> state. It appears to me that the entry at LRU (head of the queue) is
>> actually being used (refcnt > 1) and there are entries in the queue with
>> refcnt == 1. But those are not being looked at. My understanding is that
>> if an entry is accessed, it must move to MRU (tail of the queue). Any
>> idea why the entry at LRU can have a refcnt > 1?
>>
>> This can happen if the refcnt is incremented without QLOCK and if
>> lru_reap_impl() is called at the same time from another thread, it will
>> skip the first entry and return NULL. This was done
>> in _mdcache_lru_ref() which could cause the refcnt on the head of the
>> queue to be incremented while some other thread looks at it holding a
>> QLOCK. I tried moving the increment/dequeue in _mdcache_lru_ref() inside
>> QLOCK; but that did not help.
>>
>> Also if get_ref() is called for the entry at the LRU for some reason,
>> it will just increment the refcnt and return. I think the assumption is
>> that by the time get_ref() is called, the entry is supposed to be out of
>> the LRU.
>>
>>
>> Thanks,
>> Pradeep
>>
>



Re: [Nfs-ganesha-devel] Multiprotocol support in ganesha

2018-03-06 Thread Pradeep
Hi Daniel,

What I meant is a use case where someone needs to access the same export
through the NFS protocol using the Ganesha server and through the SMB
protocol using the Samba server, with both Samba and Ganesha running on the
same host. Obviously, the file can't be held open by both Ganesha and Samba
at the same time, so we need to close the open FDs (if they are only held
for caching). Linux provides oplock-style leases (fcntl() with F_SETLEASE)
so that a process is notified when another process tries to open the file,
and this can be used to synchronize with Samba.
Samba seems to support this already:
https://github.com/samba-team/samba/blob/master/source3/smbd/oplock_linux.c
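
As a rough sketch of the lease mechanism itself (independent of how Ganesha's
FD cache would actually wire it in):

#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>

/* Called when another process (e.g. smbd) tries to open the leased file;
 * an FD cache would close or downgrade its cached descriptor here. */
static void lease_break(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)si; (void)ctx;
    /* si->si_fd identifies which cached fd lost its lease */
}

int take_read_lease(int fd)
{
    struct sigaction sa;

    sa.sa_sigaction = lease_break;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) == -1)
        return -1;
    if (fcntl(fd, F_SETSIG, SIGIO) == -1)    /* deliver SIGIO with si_fd filled in */
        return -1;
    return fcntl(fd, F_SETLEASE, F_RDLCK);   /* EAGAIN if already open for write */
}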

Thanks,

On Tue, Mar 6, 2018 at 9:29 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

> Ganesha has multi-protocol (NFS3, NFS4, and 9P).  There are no plans to
> add CIFS, since that is an insanely complicated protocol, and has a
> userspace daemon implementation already (in the form of Samba).  I
> personally wouldn't reject such support if it was offered, but as far as I
> know, no one is even thinking about working on it.
>
> Daniel
>
>
> On 03/06/2018 12:20 PM, Pradeep wrote:
>
>> Hello,
>>
>> Are there plans to implement multiprotocol (NFS and CIFS accessing the same
>> export/share) in ganesha? I believe current FD cache will need changes to
>> support that.
>>
>> Thanks,
>> Pradeep
>>
>>
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>
>>
>>
>> ___
>> Nfs-ganesha-devel mailing list
>> Nfs-ganesha-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>
>>
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>


Re: [Nfs-ganesha-devel] READDIR doesn't return all entries.

2018-02-15 Thread Pradeep
Hi Frank/Sachin,

I will get back to you with tcpdump/logs if I can recreate it. There were
around 2000 directories, and a "find" from the export root was issued before
the "ls" from the client.

Thanks,
Pradeep

On Tue, Feb 13, 2018 at 9:38 AM, Frank Filz <ffilz...@mindspring.com> wrote:

> What FSAL? Is the test anything other than creating a large number of
> files (how many?) and then doing a readdir and comparing?
>
>
>
> Frank
>
>
>
> *From:* Pradeep [mailto:pradeeptho...@gmail.com]
> *Sent:* Monday, February 12, 2018 5:36 PM
> *To:* nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>
> *Subject:* [Nfs-ganesha-devel] READDIR doesn't return all entries.
>
>
>
> Hello,
>
>
>
> I noticed that with large number of directory entries, READDIR does not
> return all entries. It happened with RC5; but works fine in RC2. I looked
> through the changes and the offending change seems to be this one:
>
>
>
> https://github.com/nfs-ganesha/nfs-ganesha/commit/985564cbd147b6acc5dd6de61a3ca8fbc6062eda
>
>
>
> (reverted the change and verified that all entries are returned without
> this change)
>
>
>
> Still looking into why it broke READDIR for me. Any insights on debugging
> this would be helpful.
>
>
>
> Thanks,
>
> Pradeep
>
>
>
>
>


[Nfs-ganesha-devel] READDIR doesn't return all entries.

2018-02-12 Thread Pradeep
Hello,

I noticed that with a large number of directory entries, READDIR does not
return all entries. It happens with RC5 but works fine in RC2. I looked
through the changes and the offending change seems to be this one:

https://github.com/nfs-ganesha/nfs-ganesha/commit/985564cbd147b6acc5dd6de61a3ca8fbc6062eda

(reverted the change and verified that all entries are returned without
this change)

Still looking into why it broke READDIR for me. Any insights on debugging
this would be helpful.

Thanks,
Pradeep


[Nfs-ganesha-devel] Using root privileges when using kerberos exports with Ganesha.

2018-02-08 Thread Pradeep
Hello,

It looks like Ganesha converts certain principals to UID/GID 0
(idmapper.c:principal2uid()). I noticed that when a client uses Kerberos
with AD, the default principal is @. So when NFS operations
are tried as root on the client, it sends the principal in @
format, which will not be mapped to UID/GID 0 on the Ganesha side.

Has anyone successfully used privileged access to NFS exports with
Kerberos/AD and the Ganesha server? If yes, could you share how you were
able to achieve that?

Thanks,
Pradeep


Re: [Nfs-ganesha-devel] Crash in graceful shutdown.

2018-02-01 Thread Pradeep
Running some NFS workload and sending SIGTERM to ganesha (sudo killall
-TERM ganesha.nfsd) will reproduce it.

But you might hit a double-free problem before that - here is a patch that
fixes it.

https://review.gerrithub.io/#/c/398092/

Feel free to rework the patch above.

On Thu, Feb 1, 2018 at 6:59 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

> I've been actually deliberately leaving that crash in.  It indicates a
> refcount leak (and is usually the only indicator of a refcount leak, other
> than slowly growing memory over a long time).
>
> Can you get me a reproducer for this?  If so, I can track down the leak.
>
> Daniel
>
>
> On 02/01/2018 09:51 AM, Pradeep wrote:
>
>> Hello,
>>
>> In graceful shutdown of ganesha (do_shutdown()), the way object
>> handles are released is first by calling unexport() and then
>> destroy_fsals(). One issue I'm seeing is unexport in MDCACHE will not
>> release objects if refcnt is non-zero (which can happen if files are
>> open). When it comes to destroy_fsals() -> shutdown_handles() ->
>> mdcache_hdl_release() -> ... -> mdcache_lru_clean(), we don't have an
>> op_ctx. So it crashes in mdcache_lru_clean().
>>
>> A simple fix would be to create op_ctx if it is NULL in
>> mdcache_hdl_release(). But I'm wondering if unexport is supposed to
>> free all handles in MDCACHE?
>>
>> This is with 2.6-rc2 in case you want to look at code.
>>
>> Thanks,
>> Pradeep
>>
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Nfs-ganesha-devel mailing list
>> Nfs-ganesha-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>
>>
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Move nfs_Init_svc() after create_pseudofs().

2018-01-31 Thread Pradeep
On Wed, Jan 31, 2018 at 2:23 PM, William Allen Simpson <
william.allen.simp...@gmail.com> wrote:

> On 1/31/18 3:11 PM, GerritHub wrote:
>
>> Frank Filz *posted comments* on this change.
>>
>> View Change 
>>
>> It seems to me that the dupreq2_pkginit() is already in about the right
> place, just after nfs_Init_client_id().  Moving it before doesn't do much.
>
>
Done.

>
> Patch set 2:
>>
>> (3 comments)
>>
>>   *
>>
>> File src/MainNFSD/nfs_init.c: > /c/397596/2/src/MainNFSD/nfs_init.c>
>>
>>   o
>>
>> Patch Set #2, Line 820: > /c/397596/2/src/MainNFSD/nfs_init.c@820> |LogInfo(COMPONENT_INIT, "RPC
>> resources successfully initialized");|
>>
>> Hmm, should we do this after starting grace so we don't process
>> requests before setting up grace period?
>>
>> OK, as long as it does not send NLM.
>
> We need to move nfs_Init_admin_thread() down, because that starts dbus,
> and dbus can start, terminate, and affect grace.  So it should be here
> after nfs_start_grace().
>

Done.

>
>
>   o
>>
>> Patch Set #2, Line 823: |fsal_save_ganesha_credentials();|
>>
>> Hmm, this should be done earlier...
>>
>> Where?
>
>
>   o
>>
>> Patch Set #2, Line 962: |nsm_unmonitor_all();|
>>
>> Something makes me think this maybe needs to be earlier...
>>
>> As to this last, you cannot do it until after nfs_Init_svc(), as it
> makes client calls.  Do you want it before or after nfs_start_grace()?
>


[Nfs-ganesha-devel] Correct initialization sequence

2018-01-30 Thread Pradeep
Hello,

It is possible to receive requests any time after nfs_Init_svc() has
completed, yet we initialize several things in nfs_Init() after that point.
This could lead to processing of incoming requests racing with the rest of
initialization (e.g. dupreq2_pkginit()). Is it possible to re-order
nfs_Init_svc() so that the rest of Ganesha is ready to process requests as
soon as we start listening on the NFS port? Another way is to return
NFS4ERR_DELAY until 'nfs_init.init_complete' is true. Any thoughts?
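
For the second option, the check would be something like this at the top of
compound processing (a sketch only; nfs_init.init_complete is the flag named
above, the surrounding names are illustrative):

if (!nfs_init.init_complete) {
    /* Tell the client to back off and retry once initialization is done. */
    res->status = NFS4ERR_DELAY;
    return;
}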


Thanks,
Pradeep


Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-29 Thread Pradeep
Hi Bill,

Are you planning to pull this into the next ganesha RC?

Thanks,
Pradeep

On Sun, Jan 28, 2018 at 7:13 AM, William Allen Simpson <
william.allen.simp...@gmail.com> wrote:

> On 1/27/18 4:07 PM, Pradeep wrote:
>
>> Here is what I see in the log (the '2' is what I added to figure out
>> which recv failed):
>> nfs-ganesha-199008[svc_948] rpc :TIRPC :WARN :svc_vc_recv: 0x7f91c0861400
>> fd 21 recv errno 11 (try again) 2 176​
>>
>> The fix looks good. Thanks Bill.
>>
> Thanks for the excellent report.  I wish everybody did such well
> researched reports!
>
> Yeah, the 2 isn't really needed, because I used "svc_vc_wait" and
> "svc_vc_recv" (__func__) to differentiate the 2 messages.
>
> This is really puzzling, since it should never happen.  It's the
> recv() with NO WAIT.  And we are level-triggered, so we shouldn't be
> in this code without an event.
>
> If it needed more data, it should be WOULD BLOCK, but it's giving
> EAGAIN.  No idea what that means here.
>
> Hope it's not happening often
>


Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread Pradeep
On Sat, Jan 27, 2018 at 7:00 AM, William Allen Simpson <
william.allen.simp...@gmail.com> wrote:

> On 1/27/18 9:56 AM, William Allen Simpson wrote:
>
>> I'm not able to reproduce.  Could you tell me which EAGAIN is
>> happening?  The log line will say "svc_vc_wait" or "svc_vc_recv",
>> and have the actual error code on it.  Maybe this is EWOULDBLOCK?
>>
>> Of course, neither EAGAIN or EWOULDBLOCK should be happening on a
>> level triggered event.  But the old code had a log, so it's there.
>>
>
> I've stashed the patch on
>   https://github.com/linuxbox2/ntirpc/tree/was16backport
>
> Could you see whether this fixed it for you?
>
> And report the log line(s)?  Is this happening often?
>

Here is what I see in the log (the '2' is what I added to figure out which
recv failed):
nfs-ganesha-199008[svc_948] rpc :TIRPC :WARN :svc_vc_recv: 0x7f91c0861400
fd 21 recv errno 11 (try again) 2 176​

The fix looks good. Thanks Bill.

Pradeep


Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-26 Thread Pradeep
Hi Dan,

In svc_vc_recv(), we handle the case of an incomplete receive by rearming
the FD and returning (if xd->sx_fbtbc is not zero). Shouldn't we be doing
the same in the EAGAIN case? epoll is armed ONESHOT, so new data won't
generate new events until epoll_ctl() is called, right?

I tried adding the rearming code in the EAGAIN cases and was able to run
the test without the receive hang.

diff --git a/src/svc_vc.c b/src/svc_vc.c
index f5377df..496444a 100644
--- a/src/svc_vc.c
+++ b/src/svc_vc.c
@@ -680,6 +680,12 @@ svc_vc_recv(SVCXPRT *xprt)
 		code = errno;
 
 		if (code == EAGAIN || code == EWOULDBLOCK) {
+			if (unlikely(svc_rqst_rearm_events(xprt))) {
+				__warnx(TIRPC_DEBUG_FLAG_ERROR,
+					"%s: %p fd %d svc_rqst_rearm_events failed (will set dead)",
+					__func__, xprt, xprt->xp_fd);
+				SVC_DESTROY(xprt);
+			}
 			__warnx(TIRPC_DEBUG_FLAG_WARN,
 				"%s: %p fd %d recv errno %d (try again)",
 				"svc_vc_wait", xprt, xprt->xp_fd, code);
@@ -731,8 +737,14 @@ svc_vc_recv(SVCXPRT *xprt)
 		code = errno;
 
 		if (code == EAGAIN || code == EWOULDBLOCK) {
+			if (unlikely(svc_rqst_rearm_events(xprt))) {
+				__warnx(TIRPC_DEBUG_FLAG_ERROR,
+					"%s: %p fd %d svc_rqst_rearm_events failed (will set dead)",
+					__func__, xprt, xprt->xp_fd);
+				SVC_DESTROY(xprt);
+			}
 			__warnx(TIRPC_DEBUG_FLAG_SVC_VC,
-				"%s: %p fd %d recv errno %d (try again)",
+				"%s: %p fd %d recv errno %d (try again) 2",
 				__func__, xprt, xprt->xp_fd, code);
 			return SVC_STAT(xprt);


On Fri, Jan 26, 2018 at 6:24 AM, Matt Benjamin <mbenj...@redhat.com> wrote:

> Yes, I wasn't claiming there is anything missing.  Before 2.6, there
> was a rearm method being called.
>
> Matt
>
> On Fri, Jan 26, 2018 at 9:20 AM, Daniel Gryniewicz <d...@redhat.com>
> wrote:
> > I don't think you re-arm a FD in epoll.  You arm it once, and it fires
> until
> > you disarm it, as far as I know.  You just call epoll_wait() to get new
> > events.
> >
> > The thread model is a bit odd;  When the epoll fires, all the events are
> > found, and a thread is submitted for each one except one.  That one is
> > handled in the local thread (since it's expected that most epoll triggers
> > will have one event on them, thus using the current hot thread).  In
> > addition, a new thread is submitted to go back and wait for events, so
> > there's no delay handling new events.  So EAGAIN is handled by just
> > indicating this thread is done, and returning it to the thread pool.
> When
> > the socket is ready again, it will trigger a new event on the thread
> waiting
> > on the epoll.
> >
> > Bill, please correct me if I'm wrong.
> >
> > Daniel
> >
> >
> > On 01/25/2018 09:13 PM, Matt Benjamin wrote:
> >>
> >> Hmm.  We used to handle that ;)
> >>
> >> Matt
> >>
> >> On Thu, Jan 25, 2018 at 9:11 PM, Pradeep <pradeeptho...@gmail.com>
> wrote:
> >>>
> >>> If recv() returns EAGAIN, then svc_vc_recv() returns without rearming
> the
> >>> epoll_fd. How does it get back to svc_vc_recv() again?
> >>>
> >>> On Wed, Jan 24, 2018 at 9:26 PM, Pradeep <pradeeptho...@gmail.com>
> wrote:
> >>>>
> >>>>
> >>>> Hello,
> >>>>
> >>>> I seem to be hitting a corner case where ganesha (2.6-rc2) does not
> >>>> respond to a RENEW request from 4.0 client. Enabled the debug logs and
> >>>> noticed that NFS layer has not seen the RENEW request (I can see it in
> >>>> tcpdump).
> >>>>
> >>>> I collected netstat output periodically and found that there is a time
> >>>> window of ~60 sec where the receive buffer size remains the same. This
> >>>> means
> >>>> the RPC layer somehow missed a 'recv' call. Now if I enable debug on
> >>>> TIRPC,
> >>>> I can't reproduce the issue. Any pointers to potential races where I
> >>>> could
> >

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-25 Thread Pradeep
If recv() returns EAGAIN, then svc_vc_recv() returns without rearming the
epoll_fd. How does it get back to svc_vc_recv() again?

On Wed, Jan 24, 2018 at 9:26 PM, Pradeep <pradeeptho...@gmail.com> wrote:

> Hello,
>
> I seem to be hitting a corner case where ganesha (2.6-rc2) does not
> respond to a RENEW request from 4.0 client. Enabled the debug logs and
> noticed that NFS layer has not seen the RENEW request (I can see it in
> tcpdump).
>
> I collected netstat output periodically and found that there is a time
> window of ~60 sec where the receive buffer size remains the same. This
> means the RPC layer somehow missed a 'recv' call. Now if I enable debug on
> TIRPC, I can't reproduce the issue. Any pointers to potential races where I
> could enable selective prints would be helpful.
>
> svc_rqst_epoll_event() resets SVC_XPRT_FLAG_ADDED. Is it possible for
> another thread to svc_rqst_rearm_events()? In that case if
> svc_rqst_epoll_event() could reset the flag set by svc_rqst_rearm_events
> and complete the current receive before the other thread could call
> epoll_ctl(), right?
>
> Thanks,
> Pradeep
>


[Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-24 Thread Pradeep
Hello,

I seem to be hitting a corner case where Ganesha (2.6-rc2) does not respond
to a RENEW request from a 4.0 client. I enabled the debug logs and noticed
that the NFS layer has not seen the RENEW request (I can see it in tcpdump).

I collected netstat output periodically and found that there is a time
window of ~60 sec where the receive buffer size remains the same. This
means the RPC layer somehow missed a 'recv' call. Now if I enable debug on
TIRPC, I can't reproduce the issue. Any pointers to potential races where I
could enable selective prints would be helpful.

svc_rqst_epoll_event() resets SVC_XPRT_FLAG_ADDED. Is it possible for
another thread to call svc_rqst_rearm_events() at the same time? In that
case svc_rqst_epoll_event() could reset the flag set by
svc_rqst_rearm_events() and complete the current receive before the other
thread calls epoll_ctl(), right?

Thanks,
Pradeep


[Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-22 Thread Pradeep
Hello,

I'm running into a crash in libntirpc with rc2:

#2  
#3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e
"localhost", prog=100024, vers=1,
netclass=0x57592a "tcp", tp=0x0) at
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_generic.c:197
#4  0x0049a21c in clnt_ncreate (hostname=0x57592e "localhost",
prog=100024, vers=1,
nettype=0x57592a "tcp") at
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/ntirpc/rpc/clnt.h:395
#5  0x0049a4d2 in nsm_connect () at
/usr/src/debug/nfs-ganesha-2.6-rc2/Protocols/NLM/nsm.c:58
#6  0x0049c10d in nsm_unmonitor_all () at
/usr/src/debug/nfs-ganesha-2.6-rc2/Protocols/NLM/nsm.c:267
#7  0x004449d4 in nfs_start (p_start_info=0x7c8b28
)
at /usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_init.c:963
#8  0x0041cd2e in main (argc=10, argv=0x7fff68b294d8)
at /usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_main.c:499
(gdb) f 3
#3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e
"localhost", prog=100024, vers=1,
netclass=0x57592a "tcp", tp=0x0) at
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_generic.c:197
197 if (CLNT_SUCCESS(clnt))
(gdb) print clnt
$1 = (CLIENT *) 0x0

Looked at dev.22 and we were handling this error case correctly there.


Re: [Nfs-ganesha-devel] Race between fsal_find_fd() and 'open downgrade'

2018-01-10 Thread Pradeep
Thanks Frank. I will try it out. Any concerns about holding the lock
across system calls (though it is a read lock)?
On 1/10/18, Frank Filz <ffilz...@mindspring.com> wrote:
> We have a patch under review for FSAL_GPFS for that issue, awaiting the
> submitter to extend the patch to cover other FSALs also. If you want to
> verify the fix would work for your case, it should be easy to take the fix
> and do the same thing in FSAL_VFS.
>
> https://review.gerrithub.io/#/c/390141/
>
> As soon as this patch is merged into V2.6, it will be slated for backport
> to
> V2.5-stable.
>
> Frank
>
>> -Original Message-
>> From: Pradeep [mailto:pradeep.tho...@gmail.com]
>> Sent: Wednesday, January 10, 2018 8:06 PM
>> To: nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>
>> Subject: [Nfs-ganesha-devel] Race between fsal_find_fd() and 'open
>> downgrade'
>>
>> Hello,
>>
>> I'm seeing a write failure because another thread in ganesha doing 'open
>> downgrade' closed the FD the find_fd() returned. Any suggestion on how to
>> fix the race?
>>
>> vfs_reopen2()
>> {...
>> status = vfs_open_my_fd(myself, openflags, posix_flags, my_fd);
>>
>> if (!FSAL_IS_ERROR(status)) {
>> /* Close the existing file descriptor and copy the new
>>  * one over.
>>  */
>> vfs_close_my_fd(my_share_fd);
>> *my_share_fd = fd;
>>
>> vfs_write2()
>> {
>> ..
>> status = find_fd(&my_fd, obj_hdl, bypass, state, openflags,
>>  &has_lock, &closefd, false);
>>
>> if (FSAL_IS_ERROR(status)) {
>> LogDebug(COMPONENT_FSAL,
>>  "find_fd failed %s",
>> msg_fsal_err(status.major));
>> goto out;
>> }
>>
>> fsal_set_credentials(op_ctx->creds);
>>
>> nb_written = pwrite(my_fd, buffer, buffer_size, offset);
>>
>>
> 
> --
>> Check out the vibrant tech community on one of the world's most engaging
>> tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Nfs-ganesha-devel mailing list
>> Nfs-ganesha-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>



[Nfs-ganesha-devel] Race between fsal_find_fd() and 'open downgrade'

2018-01-10 Thread Pradeep
Hello,

I'm seeing a write failure because another thread in Ganesha doing an 'open
downgrade' closed the FD that find_fd() returned. Any suggestions on how to
fix the race?

vfs_reopen2()
{...
status = vfs_open_my_fd(myself, openflags, posix_flags, my_fd);

if (!FSAL_IS_ERROR(status)) {
/* Close the existing file descriptor and copy the new
 * one over.
 */
vfs_close_my_fd(my_share_fd);
*my_share_fd = fd;

vfs_write2()
{
..
status = find_fd(&my_fd, obj_hdl, bypass, state, openflags,
 &has_lock, &closefd, false);

if (FSAL_IS_ERROR(status)) {
LogDebug(COMPONENT_FSAL,
 "find_fd failed %s", msg_fsal_err(status.major));
goto out;
}

fsal_set_credentials(op_ctx->creds);

nb_written = pwrite(my_fd, buffer, buffer_size, offset);
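
One rough sketch of a fix (this is not the GerritHub patch itself) is to keep
the object's rwlock held in read mode across the I/O, so the reopen/downgrade
path, which needs the write lock to swap the fd, cannot close it mid-write:

/* Illustrative only; lock and field names approximate the FSAL code. */
PTHREAD_RWLOCK_rdlock(&obj_hdl->obj_lock);
nb_written = pwrite(my_fd, buffer, buffer_size, offset);
PTHREAD_RWLOCK_unlock(&obj_hdl->obj_lock);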



[Nfs-ganesha-devel] Is there a way to update params in FSAL block with SIGHUP?

2018-01-09 Thread Pradeep
Hello,

Is it possible to update export-specific parameters added in the FSAL block
using SIGHUP only (instead of first doing RemoveExport and then SIGHUP)?
RemoveExport has the downside that it blows away the cache and fails
in-flight I/Os.

For example, FSAL_VFS has the parameter fsid_type. It doesn't seem to get
updated with just a SIGHUP after a change.

static struct config_item export_params[] = {
CONF_ITEM_NOOP("name"),
CONF_ITEM_TOKEN("fsid_type", FSID_NO_TYPE,
fsid_types,
vfs_fsal_export, fsid_type),
CONFIG_EOL
};

static struct config_block export_param_block = {
.dbus_interface_name = "org.ganesha.nfsd.config.fsal.vfs-export%d",
.blk_desc.name = "FSAL",
.blk_desc.type = CONFIG_BLOCK,
.blk_desc.u.blk.init = noop_conf_init,
.blk_desc.u.blk.params = export_params,
.blk_desc.u.blk.commit = noop_conf_commit
};

struct config_block *vfs_sub_export_param = &export_param_block;


Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-15 Thread Pradeep
Thanks for the patch Bill. dev.22 works for me.

On Fri, Dec 15, 2017 at 9:15 AM, William Allen Simpson <
william.allen.simp...@gmail.com> wrote:

> On 12/14/17 1:13 PM, William Allen Simpson wrote:
>
>> This is May 2015 code, based upon 2012 code.  Obviously, we haven't
>> been testing error responses ;)
>>
>> I wanted to add a personal thank you for such an excellent bug report.
> The patch went in the next branch yesterday, and should show up in the
> Ganesha dev.22 branch today.  You are credited in the log message.
>
> There are (at least) 2 different code paths for handling similar reply
> message output, and this one wasn't correct.  Back in 2005, I'd spent a
> fair amount of time trying to debug it (the reason for all those logging
> messages), and had found missing breaks and returns.
>
> But I wasn't looking at the formatting for error messages that weren't
> being tested, and missed the obvious.  Next year, we should probably
> try to find all the code paths and consolidate.
>
> What versions do you need backported?
>
>


Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-13 Thread Pradeep
On Tue, Dec 12, 2017 at 10:22 PM, Matt Benjamin <mbenj...@redhat.com> wrote:

> That sounds right, I'm uncertain whether this has regressed in the
> text, or maybe in the likelihood of inlining in the new dispatch
> model.  Bill?
>
>
It doesn't look like this code has changed recently. Is it possible that
some other function that used to encode the XID got removed recently?

I also see that the XID is missing in other error cases as well. Perhaps it
is better to move this up to handle all MSG_DENIED cases.



> Matt
>
> On Wed, Dec 13, 2017 at 9:38 AM, Pradeep <pradeeptho...@gmail.com> wrote:
> > Hello,
> >
> > When using krb5 exports, I noticed that TIRPC does not send XID in
> response
> > - see xdr_reply_encode() for MSG_DENIED case. Looks like Linux clients
> can't
> > decode the message and go in to an infinite loop retrying the same NFS
> > operation. I tried adding XID back (like it is done for normal case) and
> it
> > seems to have fixed the problem. Is this the right thing to do?
> >
> > diff --git a/src/rpc_dplx_msg.c b/src/rpc_dplx_msg.c
> > index 01e5a5c..a585e8a 100644
> > --- a/src/rpc_dplx_msg.c
> > +++ b/src/rpc_dplx_msg.c
> > @@ -194,9 +194,12 @@ xdr_reply_encode(XDR *xdrs, struct rpc_msg *dmsg)
> > __warnx(TIRPC_DEBUG_FLAG_RPC_MSG,
> > "%s:%u DENIED AUTH",
> > __func__, __LINE__);
> > -   buf = XDR_INLINE(xdrs, 2 * BYTES_PER_XDR_UNIT);
> > +   buf = XDR_INLINE(xdrs, 5 * BYTES_PER_XDR_UNIT);
> >
> > if (buf != NULL) {
> > +   IXDR_PUT_INT32(buf, dmsg->rm_xid);
> > +   IXDR_PUT_ENUM(buf, dmsg->rm_direction);
> > +   IXDR_PUT_ENUM(buf, dmsg->rm_reply.rp_stat);
> > IXDR_PUT_ENUM(buf, rr->rj_stat);
> > IXDR_PUT_ENUM(buf, rr->rj_why);
> > } else if (!xdr_putenum(xdrs, rr->rj_stat)) {
> >
> > 
> --
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > ___
> > Nfs-ganesha-devel mailing list
> > Nfs-ganesha-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
> >
>
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel.  734-821-5101
> fax.  734-769-8938
> cel.  734-216-5309
>


[Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-12 Thread Pradeep
Hello,

When using krb5 exports, I noticed that TIRPC does not send the XID in the
response - see xdr_reply_encode() for the MSG_DENIED case. Linux clients
can't decode the message and go into an infinite loop retrying the same
NFS operation. I tried adding the XID back (as is done for the normal case)
and it seems to have fixed the problem. Is this the right thing to do?

diff --git a/src/rpc_dplx_msg.c b/src/rpc_dplx_msg.c
index 01e5a5c..a585e8a 100644
--- a/src/rpc_dplx_msg.c
+++ b/src/rpc_dplx_msg.c
@@ -194,9 +194,12 @@ xdr_reply_encode(XDR *xdrs, struct rpc_msg *dmsg)
__warnx(TIRPC_DEBUG_FLAG_RPC_MSG,
"%s:%u DENIED AUTH",
__func__, __LINE__);
-   buf = XDR_INLINE(xdrs, 2 * BYTES_PER_XDR_UNIT);
+   buf = XDR_INLINE(xdrs, 5 * BYTES_PER_XDR_UNIT);

if (buf != NULL) {
+   IXDR_PUT_INT32(buf, dmsg->rm_xid);
+   IXDR_PUT_ENUM(buf, dmsg->rm_direction);
+   IXDR_PUT_ENUM(buf, dmsg->rm_reply.rp_stat);
IXDR_PUT_ENUM(buf, rr->rj_stat);
IXDR_PUT_ENUM(buf, rr->rj_why);
} else if (!xdr_putenum(xdrs, rr->rj_stat)) {


[Nfs-ganesha-devel] enqueued_reqs/dequeued_reqs

2017-12-11 Thread Pradeep
It looks like we don't increment enqueued_reqs/dequeued_reqs in the RPC
path anymore - nfs_rpc_enqueue_req() was replaced with
nfs_rpc_process_request(). Now that both values are always zero, the health
checker (get_ganesha_health) will never detect any RPC hangs. Should the
enqueued_reqs/dequeued_reqs increments be moved into
nfs_rpc_process_request()?
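
Something like the following is what I have in mind (a sketch; the counter
names follow the existing stats, the function body is illustrative):

void nfs_rpc_process_request(request_data_t *reqdata)
{
    atomic_inc_uint64_t(&enqueued_reqs);   /* request picked up */

    /* ... decode and execute the request as today ... */

    atomic_inc_uint64_t(&dequeued_reqs);   /* request completed */
}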

Thanks,
Pradeep


[Nfs-ganesha-devel] Ganesha stuck in release_openstate().

2017-11-29 Thread Pradeep
Hello all,

I'm seeing a case with 2.6-dev12 where a 'state' is in the owner's
state_list (nfs4_owner->so_state_list) but is not in the ht_state_id
hash table. This causes state_del_locked() to return without cleaning
up the state, so release_openstate() goes into an infinite loop (the
errcnt is never incremented). And since we hold clientid->cid_mutex
while calling release_openstate(), all new mounts hang as well.

Any thoughts on how Ganesha ends up having an open state in
nfs4_owner->so_state_list but not in ht_state_id?
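
To illustrate the shape of the hang (this is a sketch, not the exact Ganesha
code):

/* release_openstate()-style loop: if state_del_locked() cannot find the
 * state in ht_state_id, it returns without unlinking it from
 * so_state_list, so the list never shrinks and errcnt never grows. */
while (errcnt < STATE_ERR_MAX) {
    state = first_entry(&owner->so_state_list);   /* always the same state */
    if (state == NULL)
        break;
    state_del_locked(state);   /* silently does nothing in this case */
}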

Thanks,
Pradeep



Re: [Nfs-ganesha-devel] [PATCH v2] nfs: Fix ugly referral attributes

2017-11-06 Thread Pradeep
Hi Frank,

I should be able to do that. Thanks Chuck for the fix.

On Mon, Nov 6, 2017 at 9:34 AM, Frank Filz <ffilz...@mindspring.com> wrote:
> Pradeep,
>
> Could you verify this patch with nfs-ganesha? Looking at the code, it looks 
> like we will supply the attributes requested.
>
> Thanks
>
> Frank
>
>> -Original Message-
>> From: linux-nfs-ow...@vger.kernel.org [mailto:linux-nfs-
>> ow...@vger.kernel.org] On Behalf Of Chuck Lever
>> Sent: Sunday, November 5, 2017 12:45 PM
>> To: trond.mykleb...@primarydata.com; anna.schuma...@netapp.com
>> Cc: linux-...@vger.kernel.org
>> Subject: [PATCH v2] nfs: Fix ugly referral attributes
>>
>> Before traversing a referral and performing a mount, the mounted-on
>> directory looks strange:
>>
>> dr-xr-xr-x. 2 4294967294 4294967294 0 Dec 31  1969 dir.0
>>
>> nfs4_get_referral is wiping out any cached attributes with what was returned
>> via GETATTR(fs_locations), but the bit mask for that operation does not
>> request any file attributes.
>>
>> Retrieve owner and timestamp information so that the memcpy in
>> nfs4_get_referral fills in more attributes.
>>
>> Changes since v1:
>> - Don't request attributes that the client unconditionally replaces
>> - Request only MOUNTED_ON_FILEID or FILEID attribute, not both
>> - encode_fs_locations() doesn't use the third bitmask word
>>
>> Fixes: 6b97fd3da1ea ("NFSv4: Follow a referral")
>> Suggested-by: Pradeep Thomas <pradeeptho...@gmail.com>
>> Signed-off-by: Chuck Lever <chuck.le...@oracle.com>
>> Cc: sta...@vger.kernel.org
>> ---
>>  fs/nfs/nfs4proc.c |   18 --
>>  1 file changed, 8 insertions(+), 10 deletions(-)
>>
>> I could send this as an incremental, but that just seems to piss off
>> distributors, who will just squash them all together anyway.
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 6c61e2b..2662879 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -254,15 +254,12 @@ static int nfs4_map_errors(int err)
>>  };
>>
>>  const u32 nfs4_fs_locations_bitmap[3] = {
>> - FATTR4_WORD0_TYPE
>> - | FATTR4_WORD0_CHANGE
>> + FATTR4_WORD0_CHANGE
>>   | FATTR4_WORD0_SIZE
>>   | FATTR4_WORD0_FSID
>>   | FATTR4_WORD0_FILEID
>>   | FATTR4_WORD0_FS_LOCATIONS,
>> - FATTR4_WORD1_MODE
>> - | FATTR4_WORD1_NUMLINKS
>> - | FATTR4_WORD1_OWNER
>> + FATTR4_WORD1_OWNER
>>   | FATTR4_WORD1_OWNER_GROUP
>>   | FATTR4_WORD1_RAWDEV
>>   | FATTR4_WORD1_SPACE_USED
>> @@ -6763,9 +6760,7 @@ static int _nfs4_proc_fs_locations(struct rpc_clnt *client, struct inode *dir,
>>  struct page *page)
>>  {
>>   struct nfs_server *server = NFS_SERVER(dir);
>> - u32 bitmask[3] = {
>> - [0] = FATTR4_WORD0_FSID |
>> FATTR4_WORD0_FS_LOCATIONS,
>> - };
>> + u32 bitmask[3];
>>   struct nfs4_fs_locations_arg args = {
>>   .dir_fh = NFS_FH(dir),
>>   .name = name,
>> @@ -6784,12 +6779,15 @@ static int _nfs4_proc_fs_locations(struct rpc_clnt *client, struct inode *dir,
>>
>>   dprintk("%s: start\n", __func__);
>>
>> + bitmask[0] = nfs4_fattr_bitmap[0] |
>> FATTR4_WORD0_FS_LOCATIONS;
>> + bitmask[1] = nfs4_fattr_bitmap[1];
>> +
>>   /* Ask for the fileid of the absent filesystem if mounted_on_fileid
>>* is not supported */
>>   if (NFS_SERVER(dir)->attr_bitmask[1] &
>> FATTR4_WORD1_MOUNTED_ON_FILEID)
>> - bitmask[1] |= FATTR4_WORD1_MOUNTED_ON_FILEID;
>> + bitmask[0] &= ~FATTR4_WORD0_FILEID;
>>   else
>> - bitmask[0] |= FATTR4_WORD0_FILEID;
>> + bitmask[1] &= ~FATTR4_WORD1_MOUNTED_ON_FILEID;
>>
>>   nfs_fattr_init(_locations->fattr);
>>   fs_locations->server = server;
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the
>> body of a message to majord...@vger.kernel.org More majordomo info at
>> http://vger.kernel.org/majordomo-info.html
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>



Re: [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.

2017-11-01 Thread Pradeep
Adding linux-nfs.

Is this supposed to work with Linux NFS clients (see the problem
description at the end of this email)?

NFSv4 referrals with Linux clients do not work with 'stat', 'ls', etc.; the
client follows referrals only after a 'cd'.
tcpdump is attached.



> On Mon, Oct 30, 2017 at 3:24 PM, Frank Filz <ffilz...@mindspring.com>
> wrote:
>
>> Oh, I had forgotten about that patch…
>>
>>
>>
>> Can you try any other clients? This may be a client issue (I did see some
>> suspicious code in the client).
>>
>>
>>
>> It may also be that you need a fully qualified path (starting with a /).
>>
>>
>>
>> It looks like Ganesha is doing the right thing though.
>>
>>
>>
>> Frank
>>
>>
>>
>> *From:* Pradeep [mailto:pradeep.tho...@gmail.com]
>> *Sent:* Monday, October 30, 2017 2:21 PM
>> *To:* Frank Filz <ffilz...@mindspring.com>
>> *Cc:* nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>;
>> ssaurabh.w...@gmail.com
>> *Subject:* Re: [Nfs-ganesha-devel] NFSv4 referrals not working with
>> ganesha.
>>
>>
>>
>> Hi Frank,
>>
>>
>>
>> This is with latest version of Ganesha. The referral support is already
>> in VFS: https://review.gerrithub.io/c/353684
>>
>>
>>
>> tcpdump is attached. From the tcpdump, we can see that the stat sent a
>> LOOKUP for the remote export and received a moved error. It also sent back
>> the fs_locations. But the client (CentOS 7.3) never followed that with a
>> LOOKUP to the remote server.
>>
>>
>>
>> You can see that packet #41 has the correct FS locations. But client does
>> not do another lookup to get the correct attributes.
>>
>>
>>
>> $ stat /mnt/nfs_d1
>>
>>   File: ‘/mnt/nfs_d1’
>>
>>   Size: 0   Blocks: 0  IO Block: 1048576 directory
>>
>> Device: 28h/40d Inode: 1   Links: 2
>>
>> Access: (0555/dr-xr-xr-x)  Uid: (4294967294/ UNKNOWN)   Gid: (4294967294/
>> UNKNOWN)
>>
>> Context: system_u:object_r:nfs_t:s0
>>
>> Access: 1969-12-31 16:00:00.0 -0800
>>
>> Modify: 1969-12-31 16:00:00.0 -0800
>>
>> Change: 1969-12-31 16:00:00.0 -0800
>>
>>  Birth: -
>>
>>
>>
>>
>>
>> On Mon, Oct 30, 2017 at 12:13 PM, Frank Filz <ffilz...@mindspring.com>
>> wrote:
>>
>> What version of Ganesha? I assume by “native” FSAL, you mean FSAL_VFS?
>> Did you add the fs locations XATTR support? FSAL_GPFS currently has the
>> only in-tree referral support and I’m not sure it necessarily works, but
>> I’m unable to test it.
>>
>>
>>
>> If you have code for FSAL_VFS to add the fs locations attribute, go ahead
>> and post it and I could poke at it.
>>
>>
>>
>> Also, tcpdump traces might help understand what is going wrong.
>>
>>
>>
>> Frank
>>
>>
>>
>> *From:* Pradeep [mailto:pradeep.tho...@gmail.com]
>> *Sent:* Monday, October 30, 2017 11:45 AM
>> *To:* nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>
>> *Cc:* ssaurabh.w...@gmail.com
>> *Subject:* [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.
>>
>>
>>
>> Hi all,
>>
>>
>>
>> We are testing NFSv4 referral for Linux CentOS 7 with nfs-ganesha and are
>> running
>>
>> into some serious issues.
>>
>>
>>
>> Although, we were able to set up NFSv4 referral using the native Ganesha
>> FSAL,
>>
>> we could not get it fully functional for all Linux client system calls.
>>
>> Basically, the NFSv4 spec suggests to return a NFS4ERR_MOVED on a
>>
>> LOOKUP done for a remote export. However, this breaks the `stat` system
>> call on
>>
>> Linux CentOS 7 (stat’ results in a LOOKUP,GETFH,GETATTR compound). An
>> easy way to
>>
>> reproduce the broken behavior is:
>>
>> 1) mount the root of the pseudo file system and
>>
>> 2) issue a `stat` command on the remote export.
>>
>> The stat returned are corrupt.
>>
>>
>>
>> After digging into the CentOS 7 client code, we realized that the stat
>> operation
>>
>> is never expected to follow the referral. However, switching to returning
>> NFS4_OK
>>
>> for stat, then breaks `cd` or a `ls -l` command, because now we don't
>> know when
>>
>> to follow the referral.
>>
>>
>>
>> Does anyone have a successful experience in setting up the NFSv4
>> referrals that they could share? Or, some suggestions on what we might
>> be doing wrong?

Re: [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.

2017-11-01 Thread Pradeep
Adding linux-nfs (did not work last couple of times because of email format).

 Is this supposed to work with Linux NFS clients (see the problem
description at the end of this email)?

The NFSv4 referrals with Linux clients does not work with 'stat', 'ls'
etc., But linux client follows referrals after a 'cd'. Is this the
expected behavior?

tcpdump is attached.


>>
>> On Mon, Oct 30, 2017 at 3:24 PM, Frank Filz <ffilz...@mindspring.com>
>> wrote:
>>>
>>> Oh, I had forgotten about that patch…
>>>
>>>
>>>
>>> Can you try any other clients? This may be a client issue (I did see some
>>> suspicious code in the client).
>>>
>>>
>>>
>>> It may also be that you need a fully qualified path (starting with a /).
>>>
>>>
>>>
>>> It looks like Ganesha is doing the right thing though.
>>>
>>>
>>>
>>> Frank
>>>
>>>
>>>
>>> From: Pradeep [mailto:pradeep.tho...@gmail.com]
>>> Sent: Monday, October 30, 2017 2:21 PM
>>> To: Frank Filz <ffilz...@mindspring.com>
>>> Cc: nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>;
>>> ssaurabh.w...@gmail.com
>>> Subject: Re: [Nfs-ganesha-devel] NFSv4 referrals not working with
>>> ganesha.
>>>
>>>
>>>
>>> Hi Frank,
>>>
>>>
>>>
>>> This is with latest version of Ganesha. The referral support is already
>>> in VFS: https://review.gerrithub.io/c/353684
>>>
>>>
>>>
>>> tcpdump is attached. From the tcpdump, we can see that the stat sent a
>>> LOOKUP for the remote export and received a moved error. It also sent back
>>> the fs_locations. But the client (CentOS 7.3) never followed that with a
>>> LOOKUP to the remote server.
>>>
>>>
>>>
>>> You can see that packet #41 has the correct FS locations. But client does
>>> not do another lookup to get the correct attributes.
>>>
>>>
>>>
>>> $ stat /mnt/nfs_d1
>>>
>>>   File: ‘/mnt/nfs_d1’
>>>
>>>   Size: 0   Blocks: 0  IO Block: 1048576 directory
>>>
>>> Device: 28h/40d Inode: 1   Links: 2
>>>
>>> Access: (0555/dr-xr-xr-x)  Uid: (4294967294/ UNKNOWN)   Gid: (4294967294/
>>> UNKNOWN)
>>>
>>> Context: system_u:object_r:nfs_t:s0
>>>
>>> Access: 1969-12-31 16:00:00.0 -0800
>>>
>>> Modify: 1969-12-31 16:00:00.0 -0800
>>>
>>> Change: 1969-12-31 16:00:00.0 -0800
>>>
>>>  Birth: -
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Oct 30, 2017 at 12:13 PM, Frank Filz <ffilz...@mindspring.com>
>>> wrote:
>>>
>>> What version of Ganesha? I assume by “native” FSAL, you mean FSAL_VFS?
>>> Did you add the fs locations XATTR support? FSAL_GPFS currently has the only
>>> in-tree referral support and I’m not sure it necessarily works, but I’m
>>> unable to test it.
>>>
>>>
>>>
>>> If you have code for FSAL_VFS to add the fs locations attribute, go ahead
>>> and post it and I could poke at it.
>>>
>>>
>>>
>>> Also, tcpdump traces might help understand what is going wrong.
>>>
>>>
>>>
>>> Frank
>>>
>>>
>>>
>>> From: Pradeep [mailto:pradeep.tho...@gmail.com]
>>> Sent: Monday, October 30, 2017 11:45 AM
>>> To: nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>
>>> Cc: ssaurabh.w...@gmail.com
>>> Subject: [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.
>>>
>>>
>>>
>>> Hi all,
>>>
>>>
>>>
>>> We are testing NFSv4 referral for Linux CentOS 7 with nfs-ganesha and are
>>> running
>>>
>>> into some serious issues.
>>>
>>>
>>>
>>> Although, we were able to set up NFSv4 referral using the native Ganesha
>>> FSAL,
>>>
>>> we could not get it fully functional for all Linux client system calls.
>>>
>>> Basically, the NFSv4 spec suggests to return a NFS4ERR_MOVED on a
>>>
>>> LOOKUP done for a remote export. However, this breaks the `stat` system
>>> call on
>>>
>>> Linux CentOS 7 (stat’ results in a LOOKUP,GETFH,GETATTR compound). An
>>> easy way to
>>>
>>> reproduce the broken behavior is:
>>>
>>> 1) mount the root of the pseudo file system and
>>>
>>> 2) issue a `stat` command on the remote export.
>>>
>>> The stat returned are corrupt.
>>>
>>>
>>>
>>> After digging into the CentOS 7 client code, we realized that the stat
>>> operation
>>>
>>> is never expected to follow the referral. However, switching to returning
>>> NFS4_OK
>>>
>>> for stat, then breaks `cd` or a `ls -l` command, because now we don't
>>> know when
>>>
>>> to follow the referral.
>>>
>>>
>>>
>>> Does anyone have a successful experience in setting up the NFSv4
>>> referrals that
>>>
>>> they could share? Or, if some suggestions on what we might be doing
>>> wrong?
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>
>>
>


nfs_remote_export1.pcap
Description: Binary data


Re: [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.

2017-10-30 Thread Pradeep
Adding linux-nfs.

Is this supposed to work with Linux NFS clients (see the problem
description at the end of this email)?

NFSv4 referrals with Linux clients do not work with 'stat', 'ls', etc., but
the client follows referrals only after a 'cd'.

On Mon, Oct 30, 2017 at 3:24 PM, Frank Filz <ffilz...@mindspring.com> wrote:

> Oh, I had forgotten about that patch…
>
>
>
> Can you try any other clients? This may be a client issue (I did see some
> suspicious code in the client).
>
>
>
> It may also be that you need a fully qualified path (starting with a /).
>
>
>
> It looks like Ganesha is doing the right thing though.
>
>
>
> Frank
>
>
>
> *From:* Pradeep [mailto:pradeep.tho...@gmail.com]
> *Sent:* Monday, October 30, 2017 2:21 PM
> *To:* Frank Filz <ffilz...@mindspring.com>
> *Cc:* nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>;
> ssaurabh.w...@gmail.com
> *Subject:* Re: [Nfs-ganesha-devel] NFSv4 referrals not working with
> ganesha.
>
>
>
> Hi Frank,
>
>
>
> This is with latest version of Ganesha. The referral support is already in
> VFS: https://review.gerrithub.io/c/353684
>
>
>
> tcpdump is attached. From the tcpdump, we can see that the stat sent a
> LOOKUP for the remote export and received a moved error. It also sent back
> the fs_locations. But the client (CentOS 7.3) never followed that with a
> LOOKUP to the remote server.
>
>
>
> You can see that packet #41 has the correct FS locations. But client does
> not do another lookup to get the correct attributes.
>
>
>
> $ stat /mnt/nfs_d1
>
>   File: ‘/mnt/nfs_d1’
>
>   Size: 0   Blocks: 0  IO Block: 1048576 directory
>
> Device: 28h/40d Inode: 1   Links: 2
>
> Access: (0555/dr-xr-xr-x)  Uid: (4294967294/ UNKNOWN)   Gid: (4294967294/
> UNKNOWN)
>
> Context: system_u:object_r:nfs_t:s0
>
> Access: 1969-12-31 16:00:00.0 -0800
>
> Modify: 1969-12-31 16:00:00.0 -0800
>
> Change: 1969-12-31 16:00:00.0 -0800
>
>  Birth: -
>
>
>
>
>
> On Mon, Oct 30, 2017 at 12:13 PM, Frank Filz <ffilz...@mindspring.com>
> wrote:
>
> What version of Ganesha? I assume by “native” FSAL, you mean FSAL_VFS? Did
> you add the fs locations XATTR support? FSAL_GPFS currently has the only
> in-tree referral support and I’m not sure it necessarily works, but I’m
> unable to test it.
>
>
>
> If you have code for FSAL_VFS to add the fs locations attribute, go ahead
> and post it and I could poke at it.
>
>
>
> Also, tcpdump traces might help understand what is going wrong.
>
>
>
> Frank
>
>
>
> *From:* Pradeep [mailto:pradeep.tho...@gmail.com]
> *Sent:* Monday, October 30, 2017 11:45 AM
> *To:* nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>
> *Cc:* ssaurabh.w...@gmail.com
> *Subject:* [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.
>
>
>
> Hi all,
>
>
>
> We are testing NFSv4 referral for Linux CentOS 7 with nfs-ganesha and are
> running
>
> into some serious issues.
>
>
>
> Although, we were able to set up NFSv4 referral using the native Ganesha
> FSAL,
>
> we could not get it fully functional for all Linux client system calls.
>
> Basically, the NFSv4 spec suggests to return a NFS4ERR_MOVED on a
>
> LOOKUP done for a remote export. However, this breaks the `stat` system
> call on
>
> Linux CentOS 7 (stat’ results in a LOOKUP,GETFH,GETATTR compound). An easy
> way to
>
> reproduce the broken behavior is:
>
> 1) mount the root of the pseudo file system and
>
> 2) issue a `stat` command on the remote export.
>
> The stat returned are corrupt.
>
>
>
> After digging into the CentOS 7 client code, we realized that the stat
> operation
>
> is never expected to follow the referral. However, switching to returning
> NFS4_OK
>
> for stat, then breaks `cd` or a `ls -l` command, because now we don't know
> when
>
> to follow the referral.
>
>
>
> Does anyone have a successful experience in setting up the NFSv4 referrals
> that
>
> they could share? Or, if some suggestions on what we might be doing wrong?
>
>
>
> Thanks
>
>
>
>
>


nfs_remote_export1.pcap
Description: Binary data


Re: [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.

2017-10-30 Thread Pradeep
Hi Frank,

This is with latest version of Ganesha. The referral support is already in
VFS: https://review.gerrithub.io/c/353684

tcpdump is attached. From the tcpdump, we can see that the stat sent a
LOOKUP for the remote export and received a moved error. It also sent back
the fs_locations. But the client (CentOS 7.3) never followed that with a
LOOKUP to the remote server.

You can see that packet #41 has the correct FS locations. But client does
not do another lookup to get the correct attributes.

$ stat /mnt/nfs_d1
  File: ‘/mnt/nfs_d1’
  Size: 0   Blocks: 0  IO Block: 1048576 directory
Device: 28h/40d Inode: 1   Links: 2
Access: (0555/dr-xr-xr-x)  Uid: (4294967294/ UNKNOWN)   Gid: (4294967294/
UNKNOWN)
Context: system_u:object_r:nfs_t:s0
Access: 1969-12-31 16:00:00.0 -0800
Modify: 1969-12-31 16:00:00.0 -0800
Change: 1969-12-31 16:00:00.0 -0800
 Birth: -


On Mon, Oct 30, 2017 at 12:13 PM, Frank Filz <ffilz...@mindspring.com>
wrote:

> What version of Ganesha? I assume by “native” FSAL, you mean FSAL_VFS? Did
> you add the fs locations XATTR support? FSAL_GPFS currently has the only
> in-tree referral support and I’m not sure it necessarily works, but I’m
> unable to test it.
>
>
>
> If you have code for FSAL_VFS to add the fs locations attribute, go ahead
> and post it and I could poke at it.
>
>
>
> Also, tcpdump traces might help understand what is going wrong.
>
>
>
> Frank
>
>
>
> *From:* Pradeep [mailto:pradeep.tho...@gmail.com]
> *Sent:* Monday, October 30, 2017 11:45 AM
> *To:* nfs-ganesha-devel <nfs-ganesha-devel@lists.sourceforge.net>
> *Cc:* ssaurabh.w...@gmail.com
> *Subject:* [Nfs-ganesha-devel] NFSv4 referrals not working with ganesha.
>
>
>
> Hi all,
>
>
>
> We are testing NFSv4 referral for Linux CentOS 7 with nfs-ganesha and are
> running
>
> into some serious issues.
>
>
>
> Although, we were able to set up NFSv4 referral using the native Ganesha
> FSAL,
>
> we could not get it fully functional for all Linux client system calls.
>
> Basically, the NFSv4 spec suggests to return a NFS4ERR_MOVED on a
>
> LOOKUP done for a remote export. However, this breaks the `stat` system
> call on
>
> Linux CentOS 7 (stat’ results in a LOOKUP,GETFH,GETATTR compound). An easy
> way to
>
> reproduce the broken behavior is:
>
> 1) mount the root of the pseudo file system and
>
> 2) issue a `stat` command on the remote export.
>
> The stat returned are corrupt.
>
>
>
> After digging into the CentOS 7 client code, we realized that the stat
> operation
>
> is never expected to follow the referral. However, switching to returning
> NFS4_OK
>
> for stat, then breaks `cd` or a `ls -l` command, because now we don't know
> when
>
> to follow the referral.
>
>
>
> Does anyone have a successful experience in setting up the NFSv4 referrals
> that
>
> they could share? Or, if some suggestions on what we might be doing wrong?
>
>
>
> Thanks
>
>
>


nfs_remote_export1.pcap
Description: Binary data


[Nfs-ganesha-devel] Crash in mdcache_alloc_handle() during unexport

2017-10-05 Thread Pradeep
Hello,

This issue is with 2.6-dev.11 (don't think it is specific to this version).

It appears that there is a race between mdcache_unexport() and
mdcache_alloc_handle(). If a request comes in after the MDC_UNEXPORT flag is set,
mdcache tries to free the entry by calling these:

/* Map the export before we put this entry into the LRU, but after
it's
 * well enough set up to be able to be unrefed by unexport should
there
 * be a race.
 */
status = mdc_check_mapping(result);

if (unlikely(FSAL_IS_ERROR(status))) {
/* The current export is in process to be unexported, don't
 * create new mdcache entries.
 */
LogDebug(COMPONENT_CACHE_INODE,
 "Trying to allocate a new entry %p for export id %"
 PRIi16" that is in the process of being
unexported",
 result, op_ctx->ctx_export->export_id);
mdcache_put(result);
mdcache_kill_entry(result);
return NULL;
}

At this point, the entry is neither in any LRU queue nor in any partition
(AVL tree).
So _mdcache_kill_entry() will call mdcache_lru_cleanup_push() which will
try to dequeue:

if (!(lru->qid == LRU_ENTRY_CLEANUP)) {
struct lru_q *q;

/* out with the old queue */
q = lru_queue_of(entry);  <--- NULL since we haven't
inserted it.
LRU_DQ_SAFE(lru, q); <-- crash here.


I think if we call mdcache_lru_unref() instead of mdcache_kill_entry(), it
will correctly free the entry.

If the idea of calling mdcache_kill_entry() is to insert into the cleanup
queue, then adding a check before LRU_DQ_SAFE() in
mdcache_lru_cleanup_push() should fix it too.
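
A minimal sketch of that second option, assuming mdcache_lru_cleanup_push()
keeps its current shape (only the NULL check is new; untested):

        if (!(lru->qid == LRU_ENTRY_CLEANUP)) {
                struct lru_q *q;

                /* out with the old queue; a freshly allocated entry that
                 * was never inserted into any queue has no queue yet, so
                 * skip the dequeue in that case */
                q = lru_queue_of(entry);
                if (q != NULL)
                        LRU_DQ_SAFE(lru, q);

                /* the rest of mdcache_lru_cleanup_push() stays unchanged:
                 * move the entry onto the cleanup queue as before */
        }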

Thanks,
Pradeep


[Nfs-ganesha-devel] Refreshing an export

2017-10-05 Thread Pradeep
Hello,

If we were to refresh an existing NFS export (meaning clean up all MDCACHE
entries in that export and release all FSAL object handles), we could use
the DBUS interface RemoveExport and AddExport. One problem with this
approach is that right after RemoveExport, all in-flight requests will
start receiving ESTALE errors. Is there a way to pause all transports in
TIRPC, so that we can block requests for this duration? The flow would be
something like this:

- Pause all transports.
- Drain in-flight requests.
- RemoveExport
- AddExport

The alternative is to restart the ganesha process, which blocks I/Os for
the 'grace' period and is not ideal.

Thanks,
Pradeep


Re: [Nfs-ganesha-devel] Crash in TIRPC with Ganesha 2.6-dev.5

2017-08-31 Thread Pradeep
Thanks Dan and Bill for the quick response. As Dan suggested, is moving
svc_rqst_xprt_register() to the end of svc_vc_rendezvous() the right fix?

On Thu, Aug 31, 2017 at 8:30 AM, William Allen Simpson <
william.allen.simp...@gmail.com> wrote:

> On 8/31/17 9:14 AM, Daniel Gryniewicz wrote:
>
>> On 08/30/2017 10:06 PM, Pradeep wrote:
>>
>>> Hi all,
>>>
>>> I'm hitting a crash in TIRPC with Ganesha 2.6-dev.5. It appears to me
>>> that there is a race between a incoming RPC message on a new xprt (for
>>> which accept() was done on the FD) and TIRPC setting the process_cb on the
>>> new xprt.
>>>
>>> We set the xprt->xp_dispatch.process_cb() from the rendezvous function
>>> (nfs_rpc_dispatch_tcp_NFS in case of NFS/TCP). This is called at the end of
>>> svc_vc_rendezvous(). But before this happens an RPC request could be
>>> invoking svc_vc_recv() because we have already called accept(). Shouldn't
>>> we setup xprt before accept()?
>>>
>>
>> Not the accept itself, but adding the accepted fd to epoll, which is also
>> happening before the rendezvous.  I think the call to
>> svc_rqst_xprt_register() needs to be last, or a lock needs to be taken.
>>
>> Bill?
>>
>> Yes, that's a problem.  I checked v2.5 (ntirpc 1.5) and that has the
> same issue.  It's registering the epoll before doing other essential
> things, like setting up the recvsize and sendsize, and calling (old)
> xp_recv_user_data (now named nfs_rpc_dispatch_tcp_NFS).
>
> My guess is you're seeing it because the 2.6 epoll loop is much faster.
> We're expecting to find more of these timing and code ordering errors.
>
> But it looks like a relatively easy fix.
>
> Thanks for the excellent detailed report.  So helpful!
>


Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2017-08-14 Thread Pradeep
On Fri, Aug 11, 2017 at 8:52 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

> Right, this is reaping.  I was thinking it was the lane thread.  Reaping
> only looks at the single LRU of each queue.  We should probably look at
> some small number of each lane, like 2 or 3.
>

This is the lane thread, right? The background thread (lane thread?) moves
entries from L1 to L2 depending on the refcnt. Once it is moved, it can be
reaped by lru_reap_impl().

A couple of experiments I tried that helped limit the number of cached
inodes to somewhere close to entries_hiwat:
1. Added a check in lru_run() to invoke lru_run_lane() if the number of
cached entries is above entries_hiwat (rough sketch below).
2. Removed the limit on per_lane_work.
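
A rough sketch of experiment 1, assuming the lru_run()/lru_run_lane()
structure of mdcache_lru.c; the field names (entries_used, entries_hiwat,
fds_lowat, open_fd_count) are how I read the 2.5 code and are shown as
plain loads for brevity where the real code would use the atomic fetch
helpers. Untested:

        /* hypothetical extra trigger in lru_run(): demote entries when the
         * entry count is over the high water mark, even if the open FD
         * count is still below the low water mark */
        bool over_entries =
                lru_state.entries_used > lru_state.entries_hiwat;

        if (over_entries || open_fd_count > lru_state.fds_lowat) {
                /* existing per-lane work (lru_run_lane() over all the
                 * lanes) runs here, unchanged */
        }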

There were some comments on limiting promotions (from L2 to L1 or within
L1). Any suggestions on specific things to try out?

Thanks,
Pradeep


>
> Frank, this, in combination with the PIN lane, it probably the issue.
>
> Daniel
>
> On 08/11/2017 11:21 AM, Pradeep wrote:
>
>> Hi Daniel,
>>
>> I'm testing with 2.5.1. I haven't changed those parameters. Those
>> parameters only affect once you are in lru_run_lane(), right? Since the FDs
>> are lower than low-watermark, it never calls lru_run_lane().
>>
>> Thanks,
>> Pradeep
>>
>> On Fri, Aug 11, 2017 at 5:43 AM, Daniel Gryniewicz <d...@redhat.com
>> <mailto:d...@redhat.com>> wrote:
>>
>> Have you set Reaper_Work?  Have you changed LRU_N_Q_LANES?  (and
>> which version of Ganesha?)
>>
>> Daniel
>>
>> On 08/10/2017 07:12 PM, Pradeep wrote:
>>
>> Debugged this a little more. It appears that the entries that
>> can be reaped are not at the LRU position (head) of the L1
>> queue. So those can be free'd later by lru_run(). I don't see it
>> happening either for some reason.
>>
>> (gdb) p LRU[1].L1
>> $29 = {q = {next = 0x7fb459e71960, prev = 0x7fb3ec3c0d30}, id =
>> LRU_ENTRY_L1, size = 260379}
>>
>> head of the list is an entry with refcnt 2; but there are
>> several entries with refcnt 1.
>>
>> (gdb) p *(mdcache_lru_t *)0x7fb459e71960
>> $30 = {q = {next = 0x7fb43ddea8a0, prev = 0x7d68a0 <LRU+224>},
>> qid = LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 2}
>> (gdb) p *(mdcache_lru_t *)0x7fb43ddea8a0
>> $31 = {q = {next = 0x7fb3f041f9a0, prev = 0x7fb459e71960}, qid =
>> LRU_ENTRY_L1, refcnt = 1, flags = 0, lane = 1, cf = 0}
>> (gdb) p *(mdcache_lru_t *)0x7fb3f041f9a0
>> $32 = {q = {next = 0x7fb466960200, prev = 0x7fb43ddea8a0}, qid =
>> LRU_ENTRY_L1, refcnt = 1, flags = 0, lane = 1, cf = 0}
>> (gdb) p *(mdcache_lru_t *)0x7fb466960200
>> $33 = {q = {next = 0x7fb451e20570, prev = 0x7fb3f041f9a0}, qid =
>> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}
>>
>> The entries with refcnt 1 are moved to L2 by the background
>> thread (lru_run). However it does it only of the open file count
>> is greater than low water mark. In my case, the open_fd_count is
>> not high; so lru_run() doesn't call lru_run_lane() to demote
>> those entries to L2. What is the best approach to handle this
>> scenario?
>>
>> Thanks,
>> Pradeep
>>
>>
>>
>> On Mon, Aug 7, 2017 at 6:08 AM, Daniel Gryniewicz
>> <d...@redhat.com <mailto:d...@redhat.com>
>> <mailto:d...@redhat.com <mailto:d...@redhat.com>>> wrote:
>>
>>  It never has been.  In cache_inode, a pin-ref kept it from
>> being
>>  reaped, now any ref beyond 1 keeps it.
>>
>>  On Fri, Aug 4, 2017 at 1:31 PM, Frank Filz
>> <ffilz...@mindspring.com <mailto:ffilz...@mindspring.com>
>>  <mailto:ffilz...@mindspring.com
>>
>> <mailto:ffilz...@mindspring.com>>> wrote:
>>   >> I'm hitting a case where mdcache keeps growing well
>> beyond the
>>  high water
>>   >> mark. Here is a snapshot of the lru_state:
>>   >>
>>   >> 1 = {entries_hiwat = 10, entries_used = 2306063,
>> chunks_hiwat =
>>   > 10,
>>   >> chunks_used = 16462,
>>   >>
>>   >> It has grown to 2.3 million entries and each entry is
>> ~1.6K.
>>   >>

Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2017-08-11 Thread Pradeep
Hi Daniel,

I'm testing with 2.5.1. I haven't changed those parameters. Those
parameters only take effect once you are in lru_run_lane(), right? Since
the FD count is lower than the low watermark, it never calls
lru_run_lane().

Thanks,
Pradeep

On Fri, Aug 11, 2017 at 5:43 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

> Have you set Reaper_Work?  Have you changed LRU_N_Q_LANES?  (and which
> version of Ganesha?)
>
> Daniel
>
> On 08/10/2017 07:12 PM, Pradeep wrote:
>
>> Debugged this a little more. It appears that the entries that can be
>> reaped are not at the LRU position (head) of the L1 queue. So those can be
>> free'd later by lru_run(). I don't see it happening either for some reason.
>>
>> (gdb) p LRU[1].L1
>> $29 = {q = {next = 0x7fb459e71960, prev = 0x7fb3ec3c0d30}, id =
>> LRU_ENTRY_L1, size = 260379}
>>
>> head of the list is an entry with refcnt 2; but there are several entries
>> with refcnt 1.
>>
>> (gdb) p *(mdcache_lru_t *)0x7fb459e71960
>> $30 = {q = {next = 0x7fb43ddea8a0, prev = 0x7d68a0 <LRU+224>}, qid =
>> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 2}
>> (gdb) p *(mdcache_lru_t *)0x7fb43ddea8a0
>> $31 = {q = {next = 0x7fb3f041f9a0, prev = 0x7fb459e71960}, qid =
>> LRU_ENTRY_L1, refcnt = 1, flags = 0, lane = 1, cf = 0}
>> (gdb) p *(mdcache_lru_t *)0x7fb3f041f9a0
>> $32 = {q = {next = 0x7fb466960200, prev = 0x7fb43ddea8a0}, qid =
>> LRU_ENTRY_L1, refcnt = 1, flags = 0, lane = 1, cf = 0}
>> (gdb) p *(mdcache_lru_t *)0x7fb466960200
>> $33 = {q = {next = 0x7fb451e20570, prev = 0x7fb3f041f9a0}, qid =
>> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}
>>
>> The entries with refcnt 1 are moved to L2 by the background thread
>> (lru_run). However it does it only of the open file count is greater than
>> low water mark. In my case, the open_fd_count is not high; so lru_run()
>> doesn't call lru_run_lane() to demote those entries to L2. What is the best
>> approach to handle this scenario?
>>
>> Thanks,
>> Pradeep
>>
>>
>>
>> On Mon, Aug 7, 2017 at 6:08 AM, Daniel Gryniewicz <d...@redhat.com
>> <mailto:d...@redhat.com>> wrote:
>>
>> It never has been.  In cache_inode, a pin-ref kept it from being
>> reaped, now any ref beyond 1 keeps it.
>>
>> On Fri, Aug 4, 2017 at 1:31 PM, Frank Filz <ffilz...@mindspring.com
>> <mailto:ffilz...@mindspring.com>> wrote:
>>  >> I'm hitting a case where mdcache keeps growing well beyond the
>> high water
>>  >> mark. Here is a snapshot of the lru_state:
>>  >>
>>  >> 1 = {entries_hiwat = 10, entries_used = 2306063, chunks_hiwat
>> =
>>  > 10,
>>  >> chunks_used = 16462,
>>  >>
>>  >> It has grown to 2.3 million entries and each entry is ~1.6K.
>>  >>
>>  >> I looked at the first entry in lane 0, L1 queue:
>>  >>
>>  >> (gdb) p LRU[0].L1
>>  >> $9 = {q = {next = 0x7fad64256f00, prev = 0x7faf21a1bc00}, id =
>>  >> LRU_ENTRY_L1, size = 254628}
>>  >> (gdb) p (mdcache_entry_t *)(0x7fad64256f00-1024)
>>  >> $10 = (mdcache_entry_t *) 0x7fad64256b00
>>  >> (gdb) p $10->lru
>>  >> $11 = {q = {next = 0x7fad65ea0f00, prev = 0x7d67c0 }, qid =
>>  >> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 0, cf = 0}
>>  >> (gdb) p $10->fh_hk.inavl
>>  >> $13 = true
>>  >
>>  > The refcount 2 prevents reaping.
>>  >
>>  > There could be a refcount leak.
>>  >
>>  > Hmm, though, I thought the entries_hwmark was a hard limit, guess
>> not...
>>  >
>>  > Frank
>>  >
>>  >> Lane 1:
>>  >> (gdb) p LRU[1].L1
>>  >> $18 = {q = {next = 0x7fad625c0300, prev = 0x7faec08c5100}, id =
>>  >> LRU_ENTRY_L1, size = 253006}
>>  >> (gdb) p (mdcache_entry_t *)(0x7fad625c0300 - 1024)
>>  >> $21 = (mdcache_entry_t *) 0x7fad625bff00
>>  >> (gdb) p $21->lru
>>  >> $22 = {q = {next = 0x7fad66fce600, prev = 0x7d68a0 <LRU+224>},
>> qid =
>>  >> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}
>>  >>
>>  >> (gdb) p $21->fh_hk.inavl
>>  >> $24 = true
>>  >>
>>  >> As per LRU_ENTRY_RECLAIMABLE(), these entry should be
>> reclaimable. Not sure why it is not able to claim it. Any ideas?

Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2017-08-10 Thread Pradeep
Debugged this a little more. It appears that the entries that can be
reaped are not at the LRU position (head) of the L1 queue, so they can only
be freed later by lru_run(). I don't see that happening either, for some
reason.

(gdb) p LRU[1].L1
$29 = {q = {next = 0x7fb459e71960, prev = 0x7fb3ec3c0d30}, id =
LRU_ENTRY_L1, size = 260379}

The head of the list is an entry with refcnt 2, but there are several
entries with refcnt 1.

(gdb) p *(mdcache_lru_t *)0x7fb459e71960
$30 = {q = {next = 0x7fb43ddea8a0, prev = 0x7d68a0 <LRU+224>}, qid =
LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 2}
(gdb) p *(mdcache_lru_t *)0x7fb43ddea8a0
$31 = {q = {next = 0x7fb3f041f9a0, prev = 0x7fb459e71960}, qid =
LRU_ENTRY_L1, refcnt = 1, flags = 0, lane = 1, cf = 0}
(gdb) p *(mdcache_lru_t *)0x7fb3f041f9a0
$32 = {q = {next = 0x7fb466960200, prev = 0x7fb43ddea8a0}, qid =
LRU_ENTRY_L1, refcnt = 1, flags = 0, lane = 1, cf = 0}
(gdb) p *(mdcache_lru_t *)0x7fb466960200
$33 = {q = {next = 0x7fb451e20570, prev = 0x7fb3f041f9a0}, qid =
LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}

The entries with refcnt 1 are moved to L2 by the background thread
(lru_run). However, it does that only if the open file count is greater
than the low water mark. In my case, the open_fd_count is not high, so
lru_run() doesn't call lru_run_lane() to demote those entries to L2. What
is the best approach to handle this scenario?

Thanks,
Pradeep



On Mon, Aug 7, 2017 at 6:08 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

> It never has been.  In cache_inode, a pin-ref kept it from being
> reaped, now any ref beyond 1 keeps it.
>
> On Fri, Aug 4, 2017 at 1:31 PM, Frank Filz <ffilz...@mindspring.com>
> wrote:
> >> I'm hitting a case where mdcache keeps growing well beyond the high
> water
> >> mark. Here is a snapshot of the lru_state:
> >>
> >> 1 = {entries_hiwat = 10, entries_used = 2306063, chunks_hiwat =
> > 10,
> >> chunks_used = 16462,
> >>
> >> It has grown to 2.3 million entries and each entry is ~1.6K.
> >>
> >> I looked at the first entry in lane 0, L1 queue:
> >>
> >> (gdb) p LRU[0].L1
> >> $9 = {q = {next = 0x7fad64256f00, prev = 0x7faf21a1bc00}, id =
> >> LRU_ENTRY_L1, size = 254628}
> >> (gdb) p (mdcache_entry_t *)(0x7fad64256f00-1024)
> >> $10 = (mdcache_entry_t *) 0x7fad64256b00
> >> (gdb) p $10->lru
> >> $11 = {q = {next = 0x7fad65ea0f00, prev = 0x7d67c0 }, qid =
> >> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 0, cf = 0}
> >> (gdb) p $10->fh_hk.inavl
> >> $13 = true
> >
> > The refcount 2 prevents reaping.
> >
> > There could be a refcount leak.
> >
> > Hmm, though, I thought the entries_hwmark was a hard limit, guess not...
> >
> > Frank
> >
> >> Lane 1:
> >> (gdb) p LRU[1].L1
> >> $18 = {q = {next = 0x7fad625c0300, prev = 0x7faec08c5100}, id =
> >> LRU_ENTRY_L1, size = 253006}
> >> (gdb) p (mdcache_entry_t *)(0x7fad625c0300 - 1024)
> >> $21 = (mdcache_entry_t *) 0x7fad625bff00
> >> (gdb) p $21->lru
> >> $22 = {q = {next = 0x7fad66fce600, prev = 0x7d68a0 <LRU+224>}, qid =
> >> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}
> >>
> >> (gdb) p $21->fh_hk.inavl
> >> $24 = true
> >>
> >> As per LRU_ENTRY_RECLAIMABLE(), these entry should be reclaimable. Not
> >> sure why it is not able to claim it. Any ideas?
> >>
> >> Thanks,
> >> Pradeep
> >>
> >>
> > 
> 
> > --
>


Re: [Nfs-ganesha-devel] Dir_Chunk and Detached_Mult got values reversed?

2017-08-08 Thread Pradeep
Thanks Daniel & Frank for confirming. How do I submit a patch? Here is the
diff:

diff --git a/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c b/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
index e6d7be4..ee5f6e2 100644
--- a/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
+++ b/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_read_conf.c
@@ -64,9 +64,9 @@
         CONF_ITEM_UI32("Dir_Max", 1, UINT32_MAX, 65536,
                        mdcache_parameter, dir.avl_max),
         CONF_ITEM_UI32("Dir_Chunk", 0, UINT32_MAX, 128,
-                       mdcache_parameter, dir.avl_detached_mult),
-        CONF_ITEM_UI32("Detached_Mult", 1, UINT32_MAX, 1,
                        mdcache_parameter, dir.avl_chunk),
+        CONF_ITEM_UI32("Detached_Mult", 1, UINT32_MAX, 1,
+                       mdcache_parameter, dir.avl_detached_mult),
         CONF_ITEM_UI32("Entries_HWMark", 1, UINT32_MAX, 10,
                        mdcache_parameter, entries_hwmark),
         CONF_ITEM_UI32("Chunks_HWMark", 1, UINT32_MAX, 10,
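
For anyone adjusting these from the config file once the patch is in, the
options map to the cache block in ganesha.conf. This is a hedged example:
the block name (MDCACHE here, CACHEINODE in older configs) and the layout
are from memory, so check the config samples shipped with your version; the
numbers are purely illustrative:

MDCACHE
{
        # maps to dir.avl_chunk once the swap above is applied
        Dir_Chunk = 128;
        # maps to dir.avl_detached_mult
        Detached_Mult = 1;
        # caps on cached entries and directory chunks
        Entries_HWMark = 100000;
        Chunks_HWMark = 100000;
}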

On Tue, Aug 8, 2017 at 5:58 AM, Daniel Gryniewicz <d...@redhat.com> wrote:

> On 08/07/2017 11:24 PM, Pradeep wrote:
>
>> It appears that the commit 440a048887c99e17e2a582e8fb242d5c0a042a79
>> has reversed two mdcache config parameters and as a result the default
>> value of dir.avl_chunk became 1 from 128. Is this intentional? Ideally
>> the "Dir_Chunk" should map to dir.avl_chunk and "Detached_Mult" should
>> map to dir.avl_detached_mult, isn't it?
>>
>>  CONF_ITEM_UI32("Dir_Chunk", 0, UINT32_MAX, 128,
>> mdcache_parameter, dir.avl_detached_mult),
>>  CONF_ITEM_UI32("Detached_Mult", 1, UINT32_MAX, 1,
>> mdcache_parameter, dir.avl_chunk),
>>
>>
> This looks wrong.  Good catch.  This is also broken on 2.5, so the fix
> will need to be backported.  Do you want to submit a patch, or should I?
>
> Daniel
>
> 


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Ensure partition lock and qlane lock ordering

2017-08-07 Thread Pradeep
Hi Dan,

This change could lead to a thread trying to acquire the qlock while
already holding it (see below). The *lock_already_held* flag is now
removed.

#2  0x7faf71f7dc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x005227c0 in _mdcache_lru_unref (entry=0x7faee0dbe100,
flags=0, func=0x58b8c0 <__func__.23709> "lru_reap_impl", line=691)
at 
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1874
#4  0x0051e914 in lru_reap_impl (qid=LRU_ENTRY_L1)
at 
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:691
#5  0x0051eac6 in lru_try_reap_entry () at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:718

Thanks,
Pradeep

On 8/7/17, GerritHub <supp...@gerritforge.com> wrote:
> From Daniel Gryniewicz <d...@redhat.com>:
>
> Daniel Gryniewicz has uploaded this change for review. (
> https://review.gerrithub.io/373094
>
>
> Change subject: Ensure partition lock and qlane lock ordering
> ..
>
> Ensure partition lock and qlane lock ordering
>
> Because of the initial ref path, the partition lock must be taken before
> the qlane lock.  Ensure that all paths (noteably unref paths) take these
> locks in the correct order to avoid deadlock.
>
> Deadlock found by Pradeep <pradeep.tho...@gmail.com>
>
> Change-Id: I8abe86f6d3b6221c5221a29518504195718aa5f3
> Signed-off-by: Daniel Gryniewicz <d...@redhat.com>
> ---
> M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h
> M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c
> M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h
> 3 files changed, 29 insertions(+), 31 deletions(-)
>
>
>
>   git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha
> refs/changes/94/373094/1
> --
> To view, visit https://review.gerrithub.io/373094
> To unsubscribe, visit https://review.gerrithub.io/settings
>
> Gerrit-Project: ffilz/nfs-ganesha
> Gerrit-Branch: next
> Gerrit-MessageType: newchange
> Gerrit-Change-Id: I8abe86f6d3b6221c5221a29518504195718aa5f3
> Gerrit-Change-Number: 373094
> Gerrit-PatchSet: 1
> Gerrit-Owner: Daniel Gryniewicz <d...@redhat.com>
>



Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2017-08-04 Thread Pradeep
My mistake. As you both correctly pointed out, refcnt needs to be 1 for
reclaim. It is initialized to 2, so something must be doing an
unref()/put() to bring it down to 1.

On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
> On 08/04/2017 01:14 PM, Pradeep wrote:
>> Hello,
>>
>> I'm hitting a case where mdcache keeps growing well beyond the high
>> water mark. Here is a snapshot of the lru_state:
>>
>> 1 = {entries_hiwat = 10, entries_used = 2306063, chunks_hiwat =
>> 10, chunks_used = 16462,
>>
>> It has grown to 2.3 million entries and each entry is ~1.6K.
>>
>> I looked at the first entry in lane 0, L1 queue:
>>
>> (gdb) p LRU[0].L1
>> $9 = {q = {next = 0x7fad64256f00, prev = 0x7faf21a1bc00}, id =
>> LRU_ENTRY_L1, size = 254628}
>> (gdb) p (mdcache_entry_t *)(0x7fad64256f00-1024)
>> $10 = (mdcache_entry_t *) 0x7fad64256b00
>> (gdb) p $10->lru
>> $11 = {q = {next = 0x7fad65ea0f00, prev = 0x7d67c0 }, qid =
>> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 0, cf = 0}
>> (gdb) p $10->fh_hk.inavl
>> $13 = true
>>
>> Lane 1:
>> (gdb) p LRU[1].L1
>> $18 = {q = {next = 0x7fad625c0300, prev = 0x7faec08c5100}, id =
>> LRU_ENTRY_L1, size = 253006}
>> (gdb) p (mdcache_entry_t *)(0x7fad625c0300 - 1024)
>> $21 = (mdcache_entry_t *) 0x7fad625bff00
>> (gdb) p $21->lru
>> $22 = {q = {next = 0x7fad66fce600, prev = 0x7d68a0 <LRU+224>}, qid =
>> LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}
>>
>> (gdb) p $21->fh_hk.inavl
>> $24 = true
>>
>> As per LRU_ENTRY_RECLAIMABLE(), these entry should be reclaimable. Not
>> sure why it is not able to claim it. Any ideas?
>>
>
> refcnt == 2 is not reclaimable.  Reclaimable is refcnt == 1.  It checks
> for 2 because it just took a ref.  Unless you're actually processing
> that lane, and so seeing the ref taken during that processing, refcnt
> will be 3 when processing, and it won't be reclaimed.
>
> Daniel
>


[Nfs-ganesha-devel] mdcache growing beyond limits.

2017-08-04 Thread Pradeep
Hello,

I'm hitting a case where mdcache keeps growing well beyond the high
water mark. Here is a snapshot of the lru_state:

1 = {entries_hiwat = 10, entries_used = 2306063, chunks_hiwat =
10, chunks_used = 16462,

It has grown to 2.3 million entries and each entry is ~1.6K.

I looked at the first entry in lane 0, L1 queue:

(gdb) p LRU[0].L1
$9 = {q = {next = 0x7fad64256f00, prev = 0x7faf21a1bc00}, id =
LRU_ENTRY_L1, size = 254628}
(gdb) p (mdcache_entry_t *)(0x7fad64256f00-1024)
$10 = (mdcache_entry_t *) 0x7fad64256b00
(gdb) p $10->lru
$11 = {q = {next = 0x7fad65ea0f00, prev = 0x7d67c0 }, qid =
LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 0, cf = 0}
(gdb) p $10->fh_hk.inavl
$13 = true

Lane 1:
(gdb) p LRU[1].L1
$18 = {q = {next = 0x7fad625c0300, prev = 0x7faec08c5100}, id =
LRU_ENTRY_L1, size = 253006}
(gdb) p (mdcache_entry_t *)(0x7fad625c0300 - 1024)
$21 = (mdcache_entry_t *) 0x7fad625bff00
(gdb) p $21->lru
$22 = {q = {next = 0x7fad66fce600, prev = 0x7d68a0 <LRU+224>}, qid =
LRU_ENTRY_L1, refcnt = 2, flags = 0, lane = 1, cf = 1}

(gdb) p $21->fh_hk.inavl
$24 = true

As per LRU_ENTRY_RECLAIMABLE(), these entries should be reclaimable. Not
sure why it is not able to reclaim them. Any ideas?

Thanks,
Pradeep



Re: [Nfs-ganesha-devel] assert() in mdcache_lru_clean()

2017-08-04 Thread Pradeep
For the first export, saved_ctx will be NULL, so the assignment at line
1161 makes op_ctx NULL as well. When mdcache_lru_unref() is then called,
op_ctx is NULL, which makes the assert in mdcache_lru_clean() fire.

Perhaps we can move the assignment at line 1161 to after the
mdcache_lru_unref() call?
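
Concretely, something like this at the bottom of lru_run_lane()'s per-entry
processing (a sketch of the reordering only; the unref call and flag are
shown as I understand them from 2.5.1, and everything else stays as is):

                /* drop our reference while the loop's own op_ctx is still
                 * in place, so mdcache_lru_clean() sees a valid context */
                mdcache_lru_unref(entry, LRU_FLAG_NONE);

                /* restore the caller's context only after the unref; today
                 * this happens before it (line 1161), which leaves op_ctx
                 * NULL when saved_ctx was NULL for the first export */
                op_ctx = saved_ctx;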


On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
> Yep.  We save the old context, run our loop, and then restore the old
> context.
>
> On 08/04/2017 10:45 AM, Pradeep wrote:
>> Thanks Daniel. I see it being initialized. But then it is overwritten
>> from saved_ctx, right?
>>
>> https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1161
>>
>> On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
>>> Here:
>>>
>>> https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1127
>>>
>>> On 08/04/2017 10:36 AM, Pradeep wrote:
>>>> Hi Daniel,
>>>>
>>>> I could not find where op_ctx gets populated in lru_run_lane(). I'm
>>>> using
>>>> 2.5.1.
>>>>
>>>> https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1019
>>>>
>>>> On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
>>>>> It should be valid.  lru_run_lane() sets up op_ctx, so it should be
>>>>> set
>>>>> correctly even in the LRU thread case.
>>>>>
>>>>> Daniel
>>>>>
>>>>> On 08/04/2017 09:54 AM, Pradeep wrote:
>>>>>> It looks like the assert() below and the comment in
>>>>>> mdcache_lru_clean() may not be valid in all cases. For example, if
>>>>>> cache is getting cleaned in the context of the LRU background thread,
>>>>>> the op_ctx will be NULL and the code may get into the 'else' part
>>>>>> (lru_run() -> lru_run_lane() -> _mdcache_lru_unref() ->
>>>>>> mdcache_lru_clean()):
>>>>>>
>>>>>> Do any of the calls after the 'if-else' block use 'op_ctx'? If those
>>>>>> don't us 'op_ctx', the 'else' part can be safely removed, right?
>>>>>>
>>>>>>   if (export_id >= 0 && op_ctx != NULL &&
>>>>>>op_ctx->ctx_export != NULL &&
>>>>>>op_ctx->ctx_export->export_id != export_id) {
>>>>>> 
>>>>>>} else {
>>>>>>/* We MUST have a valid op_ctx based on
>>>>>> the
>>>>>> conditions
>>>>>> * we could get here. first_export_id coild
>>>>>> be
>>>>>> -1
>>>>>> or it
>>>>>> * could match the current op_ctx export.
>>>>>> In
>>>>>> either case
>>>>>> * we will trust the current op_ctx.
>>>>>> */
>>>>>>assert(op_ctx);
>>>>>>assert(op_ctx->ctx_export);
>>>>>>LogFullDebug(COMPONENT_CACHE_INODE,
>>>>>> "Trusting op_ctx export id
>>>>>> %"PRIu16,
>>>>>>
>>>>>> op_ctx->ctx_export->export_id);
>>>>>> 
>>>>>>


Re: [Nfs-ganesha-devel] assert() in mdcache_lru_clean()

2017-08-04 Thread Pradeep
Thanks Daniel. I see it being initialized. But then it is overwritten
from saved_ctx, right?

https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1161

On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
> Here:
>
> https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1127
>
> On 08/04/2017 10:36 AM, Pradeep wrote:
>> Hi Daniel,
>>
>> I could not find where op_ctx gets populated in lru_run_lane(). I'm using
>> 2.5.1.
>>
>> https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1019
>>
>> On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
>>> It should be valid.  lru_run_lane() sets up op_ctx, so it should be set
>>> correctly even in the LRU thread case.
>>>
>>> Daniel
>>>
>>> On 08/04/2017 09:54 AM, Pradeep wrote:
>>>> It looks like the assert() below and the comment in
>>>> mdcache_lru_clean() may not be valid in all cases. For example, if
>>>> cache is getting cleaned in the context of the LRU background thread,
>>>> the op_ctx will be NULL and the code may get into the 'else' part
>>>> (lru_run() -> lru_run_lane() -> _mdcache_lru_unref() ->
>>>> mdcache_lru_clean()):
>>>>
>>>> Do any of the calls after the 'if-else' block use 'op_ctx'? If those
>>>> don't us 'op_ctx', the 'else' part can be safely removed, right?
>>>>
>>>>  if (export_id >= 0 && op_ctx != NULL &&
>>>>   op_ctx->ctx_export != NULL &&
>>>>   op_ctx->ctx_export->export_id != export_id) {
>>>> 
>>>>   } else {
>>>>   /* We MUST have a valid op_ctx based on the
>>>> conditions
>>>>* we could get here. first_export_id coild be
>>>> -1
>>>> or it
>>>>* could match the current op_ctx export. In
>>>> either case
>>>>* we will trust the current op_ctx.
>>>>*/
>>>>   assert(op_ctx);
>>>>   assert(op_ctx->ctx_export);
>>>>   LogFullDebug(COMPONENT_CACHE_INODE,
>>>>"Trusting op_ctx export id
>>>> %"PRIu16,
>>>>op_ctx->ctx_export->export_id);
>>>> 
>>>>


Re: [Nfs-ganesha-devel] assert() in mdcache_lru_clean()

2017-08-04 Thread Pradeep
Hi Daniel,

I could not find where op_ctx gets populated in lru_run_lane(). I'm using 2.5.1.

https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c#L1019

On 8/4/17, Daniel Gryniewicz <d...@redhat.com> wrote:
> It should be valid.  lru_run_lane() sets up op_ctx, so it should be set
> correctly even in the LRU thread case.
>
> Daniel
>
> On 08/04/2017 09:54 AM, Pradeep wrote:
>> It looks like the assert() below and the comment in
>> mdcache_lru_clean() may not be valid in all cases. For example, if
>> cache is getting cleaned in the context of the LRU background thread,
>> the op_ctx will be NULL and the code may get into the 'else' part
>> (lru_run() -> lru_run_lane() -> _mdcache_lru_unref() ->
>> mdcache_lru_clean()):
>>
>> Do any of the calls after the 'if-else' block use 'op_ctx'? If those
>> don't us 'op_ctx', the 'else' part can be safely removed, right?
>>
>> if (export_id >= 0 && op_ctx != NULL &&
>>  op_ctx->ctx_export != NULL &&
>>  op_ctx->ctx_export->export_id != export_id) {
>> 
>>  } else {
>>  /* We MUST have a valid op_ctx based on the
>> conditions
>>   * we could get here. first_export_id coild be -1
>> or it
>>   * could match the current op_ctx export. In
>> either case
>>   * we will trust the current op_ctx.
>>   */
>>  assert(op_ctx);
>>  assert(op_ctx->ctx_export);
>>  LogFullDebug(COMPONENT_CACHE_INODE,
>>   "Trusting op_ctx export id
>> %"PRIu16,
>>   op_ctx->ctx_export->export_id);
>> 
>>


[Nfs-ganesha-devel] assert() in mdcache_lru_clean()

2017-08-04 Thread Pradeep
It looks like the assert() below and the comment in mdcache_lru_clean()
may not be valid in all cases. For example, if the cache is getting cleaned
in the context of the LRU background thread, op_ctx will be NULL and the
code may get into the 'else' part (lru_run() -> lru_run_lane() ->
_mdcache_lru_unref() -> mdcache_lru_clean()):

Do any of the calls after the 'if-else' block use 'op_ctx'? If they don't
use 'op_ctx', the 'else' part can be safely removed, right?

   if (export_id >= 0 && op_ctx != NULL &&
op_ctx->ctx_export != NULL &&
op_ctx->ctx_export->export_id != export_id) {

} else {
/* We MUST have a valid op_ctx based on the conditions
 * we could get here. first_export_id coild be -1 or it
 * could match the current op_ctx export. In either case
 * we will trust the current op_ctx.
 */
assert(op_ctx);
assert(op_ctx->ctx_export);
LogFullDebug(COMPONENT_CACHE_INODE,
 "Trusting op_ctx export id %"PRIu16,
 op_ctx->ctx_export->export_id);




Re: [Nfs-ganesha-devel] deadlock in lru_reap_impl()

2017-08-03 Thread Pradeep
Thanks Frank. I merged your patch and am now hitting another deadlock.
Here are the two threads:

The thread below holds the partition lock in 'read' mode and tries to
acquire the queue lock:

Thread 143 (Thread 0x7faf82f72700 (LWP 143573)):
#0  0x7fafd1c371bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fafd1c32d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x7fafd1c32c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x005221fd in _mdcache_lru_ref (entry=0x7fae78d19000, flags=2,
func=0x58ec80 <__func__.23467> "mdcache_find_keyed", line=881) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1813
#4  0x00532686 in mdcache_find_keyed (key=0x7faf82f70760,
entry=0x7faf82f707e8) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:881

874 *entry = cih_get_by_key_latch(key, ,
875 CIH_GET_RLOCK |
CIH_GET_UNLOCK_ON_MISS,
876 __func__, __LINE__);
877 if (likely(*entry)) {
878 fsal_status_t status;
879
880 /* Initial Ref on entry */
881 status = mdcache_lru_ref(*entry, LRU_REQ_INITIAL);


This thread is already holding the queue lock and is trying to acquire the
partition lock in write mode:

Thread 188 (Thread 0x7faf9979f700 (LWP 143528)):
#0  0x7fafd1c3403e in pthread_rwlock_wrlock () from
/lib64/libpthread.so.0
#1  0x0052fc61 in cih_remove_checked (entry=0x7fad62914e00) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h:394
#2  0x00530b3e in mdc_clean_entry (entry=0x7fad62914e00) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:272
#3  0x0051df7e in mdcache_lru_clean (entry=0x7fad62914e00) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:590
#4  0x00522cca in _mdcache_lru_unref (entry=0x7fad62914e00,
flags=8, func=0x58b700 <__func__.23710> "lru_reap_impl", line=690) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1922
#5  0x0051ea38 in lru_reap_impl (qid=LRU_ENTRY_L1) at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:690




On Fri, Jul 28, 2017 at 1:34 PM, Frank Filz <ffilz...@mindspring.com> wrote:

> Hmm, well, that’s easy to fix…
>
>
>
> Instead of:
>
>
>
> mdcache_lru_unref(entry, LRU_UNREF_QLOCKED);
>
> goto next_lane;
>
>
>
> It could:
>
>
>
> QUNLOCK(qlane);
>
> mdcache_put(entry);
>
> continue;
>
>
>
> Fix posted here:
>
>
>
> https://review.gerrithub.io/371764
>
>
>
> Frank
>
>
>
>
>
> *From:* Pradeep [mailto:pradeep.tho...@gmail.com]
> *Sent:* Friday, July 28, 2017 12:44 PM
> *To:* nfs-ganesha-devel@lists.sourceforge.net
> *Subject:* [Nfs-ganesha-devel] deadlock in lru_reap_impl()
>
>
>
>
>
> I'm hitting another deadlock in mdcache with 2.5.1 base.  In this case two
> threads are in different places in lru_reap_impl()
>
>
>
> Thread 1:
>
>
>
> 636 QLOCK(qlane);
>
> 637 lru = glist_first_entry(>q, mdcache_lru_t, q);
>
> 638 if (!lru)
>
> 639 goto next_lane;
>
> 640 refcnt = atomic_inc_int32_t(>refcnt);
>
> 641 entry = container_of(lru, mdcache_entry_t, lru);
>
> 642 if (unlikely(refcnt != (LRU_SENTINEL_REFCOUNT +
> 1))) {
>
> 643 /* cant use it. */
>
> 644 mdcache_lru_unref(entry,
> LRU_UNREF_QLOCKED);
>
>
>
> ​mdcache_lru_unref() could lead to the set of calls below:​
>
>
>
> ​mdcache_lru_unref() -> mdcache_lru_clean() -> mdc_clean_entry()
> -> cih_remove_checked()
>
>
>
> This tries to get partition lock which is held by 'Thread 2' which is
> trying to acquire queue lane lock.
>
>
>
> Thread 2:
>
> 650 if (cih_latch_entry(>fh_hk.key, ,
> CIH_GET_WLOCK,
>
> 651 __func__, __LINE__)) {
>
> 652 QLOCK(qlane);
>
>
>
> Stack traces:
>
>
>
> Thread 1:
>
>
> #0  0x7f571328103e in pthread_rwlock_wrlock () from
> /lib64/libpthread.so.0
>
> #1  0x0052f928 in cih_remove_checked (entry=0x7f548e86c400)
>
> at /usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/
> Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h:394
>
> 

[Nfs-ganesha-devel] deadlock in lru_reap_impl()

2017-07-28 Thread Pradeep
I'm hitting another deadlock in mdcache with 2.5.1 base.  In this case two
threads are in different places in lru_reap_impl()

Thread 1:

636 QLOCK(qlane);
637 lru = glist_first_entry(>q, mdcache_lru_t, q);
638 if (!lru)
639 goto next_lane;
640 refcnt = atomic_inc_int32_t(>refcnt);
641 entry = container_of(lru, mdcache_entry_t, lru);
642 if (unlikely(refcnt != (LRU_SENTINEL_REFCOUNT +
1))) {
643 /* cant use it. */
644 mdcache_lru_unref(entry, LRU_UNREF_QLOCKED);

​mdcache_lru_unref() could lead to the set of calls below:​

​mdcache_lru_unref() -> mdcache_lru_clean() -> mdc_clean_entry()
-> cih_remove_checked()

This tries to get the partition lock, which is held by 'Thread 2', which
in turn is trying to acquire the queue lane lock.

Thread 2:
650 if (cih_latch_entry(>fh_hk.key, ,
CIH_GET_WLOCK,
651 __func__, __LINE__)) {
652 QLOCK(qlane);

Stack traces:

Thread 1:

#0  0x7f571328103e in pthread_rwlock_wrlock () from
/lib64/libpthread.so.0
#1  0x0052f928 in cih_remove_checked (entry=0x7f548e86c400)
at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h:394
#2  0x00530805 in mdc_clean_entry (entry=0x7f548e86c400)
at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:272
#3  0x0051df7e in mdcache_lru_clean (entry=0x7f548e86c400)
at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:590
#4  0x005229c0 in _mdcache_lru_unref (entry=0x7f548e86c400,
flags=8, func=0x58b5c0 <__func__.23710> "lru_reap_impl", line=687)
at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1918
#5  0x0051e83a in lru_reap_impl (qid=LRU_ENTRY_L1)
at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:687

Thread 2:
#0  0x7f57132841bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7f571327fd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x7f571327fc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0051e4f5 in lru_reap_impl (qid=LRU_ENTRY_L1)
at
/usr/src/debug/nfs-ganesha-2.5.1-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:652


[Nfs-ganesha-devel] deadlock between mdcache_rename() and mdcache_getattrs()

2017-07-28 Thread Pradeep
Hello,

I seem to be hitting a deadlock in Ganesha mdcache. I have debugged it
and here is what I see on the hung process:

Thread1 is doing a rename operation. Lets say file1 being moved from
'dir0' to 'dir1'. This calls mdcache_rename()
- Holds content_lock for both parents - dir0 and dir1
- In mdcache_refresh_attrs_no_invalidate() on 'dir1'
  o Tries to hold attr_lock in write mode [ attr_lock is needed when
calling mdcache_refresh_attrs() ]

Thread2 is doing a getattr on 'dir1' [the destination directory of the
rename]. This calls mdcache_getattrs().
- Holds attr_lock in write mode.
- Calls mdcache_refresh_attrs( invalidate=true)
  o Tries to hold content_lock in write mode if ‘invalidate’ is needed.

These two threads end up in a deadlock, and all other threads waiting for
the attr_lock in READ mode are also hung.

To fix this, is it OK to drop the content_lock before calling
mdcache_refresh_attrs() in mdcache_rename()? From the locking guidelines,
it appears that mdcache_refresh_attrs() doesn't require the content_lock.
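
In code, the idea is roughly the following at the end of mdcache_rename()
(a sketch only; the parent-entry variable names are guesses on my part, and
whether the refresh is really safe without the content_lock is exactly the
question):

        /* release the parents' content_lock before refreshing attributes,
         * so we never hold a content_lock while waiting for an attr_lock */
        PTHREAD_RWLOCK_unlock(&mdc_olddir->content_lock);
        PTHREAD_RWLOCK_unlock(&mdc_newdir->content_lock);

        /* the refresh takes attr_lock internally; mdcache_getattrs() takes
         * attr_lock first and content_lock second, so this ordering avoids
         * the lock inversion */
        mdcache_refresh_attrs_no_invalidate(mdc_olddir);
        mdcache_refresh_attrs_no_invalidate(mdc_newdir);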
