[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Improve debug of NLM_SHARE and NLM_UNSHARE

2018-04-04 Thread GerritHub
From Frank Filz:

Frank Filz has uploaded this change for review. ( 
https://review.gerrithub.io/406501


Change subject: Improve debug of NLM_SHARE and NLM_UNSHARE
..

Improve debug of NLM_SHARE and NLM_UNSHARE

Change-Id: I0f2a3ca6617e1a69797c8acdc7577a30990a645f
Signed-off-by: Frank S. Filz 
---
M src/Protocols/NLM/nlm_Share.c
M src/Protocols/NLM/nlm_Unshare.c
M src/SAL/state_share.c
M src/include/sal_functions.h
4 files changed, 59 insertions(+), 23 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/01/406501/1
--
To view, visit https://review.gerrithub.io/406501
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0f2a3ca6617e1a69797c8acdc7577a30990a645f
Gerrit-Change-Number: 406501
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz 
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Allow NLM_SHARE access none, deny something (deny read for example)

2018-04-04 Thread GerritHub
From Frank Filz:

Frank Filz has uploaded this change for review. ( 
https://review.gerrithub.io/406504


Change subject: Allow NLM_SHARE access none, deny something (deny read for 
example)
..

Allow NLM_SHARE access none, deny something (deny read for example)

It turns out Windows issues an NLM_SHARE access none, deny read in
the process of deleting a file. In order to push this down to the
FSAL, we need to have SOME kind of open mode, so we open for read.

This prevents an ugly crash...
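As a sketch of the idea (hypothetical names; the real change lives in src/SAL/state_share.c): when the requested share access is none but a deny mask is present, substitute read as the mode used for the FSAL open.

```c
#include <assert.h>

/* Hypothetical flag values, modeled on share-reservation bitmasks;
 * the real definitions live in the nfs-ganesha headers. */
#define SHARE_ACCESS_NONE  0
#define SHARE_ACCESS_READ  1
#define SHARE_ACCESS_WRITE 2

/* An FSAL open needs SOME access mode.  When the client asks for
 * access none (with a non-trivial deny mask), fall back to read so
 * the underlying open has a valid mode instead of an empty one. */
static int effective_open_access(int share_access)
{
	return share_access == SHARE_ACCESS_NONE ? SHARE_ACCESS_READ
						 : share_access;
}
```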

Change-Id: I7cdfaf81583c02edcee6d2029e08c0ebdb5b7a0f
Signed-off-by: Frank S. Filz 
---
M src/SAL/state_share.c
1 file changed, 9 insertions(+), 2 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/04/406504/1
--
To view, visit https://review.gerrithub.io/406504
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7cdfaf81583c02edcee6d2029e08c0ebdb5b7a0f
Gerrit-Change-Number: 406504
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz 
--


[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Fix refcounting of NLM_SHARE state_t

2018-04-04 Thread GerritHub
From Frank Filz:

Frank Filz has uploaded this change for review. ( 
https://review.gerrithub.io/406502


Change subject: Fix refcounting of NLM_SHARE state_t
..

Fix refcounting of NLM_SHARE state_t

We must hold an additional refcount if the state_t has an active
share on it, and must only drop that reference when the share is
completely removed.
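The invariant can be sketched with a toy model (illustrative field and function names, not the real state_t API): the first share pins one extra reference, and that reference is dropped only when the share is completely gone.

```c
#include <assert.h>

/* Toy model of the refcount rule; the real state_t layout differs. */
struct toy_state {
	int refcount;      /* references held on the state_t */
	int share_active;  /* does the state_t carry an active share? */
};

/* Taking the first share on the state_t pins an extra reference. */
static void share_attach(struct toy_state *s)
{
	if (!s->share_active) {
		s->refcount++;
		s->share_active = 1;
	}
}

/* The extra reference is dropped only when the share is completely
 * removed, never while any share remains on the state_t. */
static void share_detach(struct toy_state *s)
{
	if (s->share_active) {
		s->share_active = 0;
		s->refcount--;
	}
}
```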

Change-Id: Ibea69d39e9329a721d0ee55cb6633e96945249a0
Signed-off-by: Frank S. Filz 
---
M src/SAL/state_share.c
1 file changed, 13 insertions(+), 6 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/02/406502/1
--
To view, visit https://review.gerrithub.io/406502
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibea69d39e9329a721d0ee55cb6633e96945249a0
Gerrit-Change-Number: 406502
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz 
--


[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Allow for Windows use of multiple NLM_SHARE from same owner

2018-04-04 Thread GerritHub
From Frank Filz:

Frank Filz has uploaded this change for review. ( 
https://review.gerrithub.io/406503


Change subject: Allow for Windows use of multiple NLM_SHARE from same owner
..

Allow for Windows use of multiple NLM_SHARE from same owner

Windows may end up with more than one NLM_SHARE per owner per file.
Each share is handled independently, each being countered with an
NLM_UNSHARE.

Implement a counter for each access and deny mode, and use those
counters to maintain a union of all the access and deny modes the
owner holds on the file. Only when an UNSHARE drops a particular
mode's counter to 0 is that mode potentially removed from the union.

The specific example I have seen is:

SHARE RW access, deny write
SHARE read access, deny none
UNSHARE read access, deny none
UNSHARE RW access, deny write

Note that the access and deny counters are managed separately, so
the code will actually allow:

SHARE RW access, deny write
SHARE read access, deny none
UNSHARE read access, deny write
UNSHARE RW access, deny none

I will trust that clients do the right thing...

Also, if an UNSHARE doesn't match an in use access and deny, the
UNSHARE is just ignored and no error is reported.

Note that the old code was buggy here in a variety of ways; for
example, in the sequence above it would actually leave the file
deny none after the second SHARE request.

Additionally, the state_lock is held in write mode over the operation.
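The counter scheme can be sketched as follows (hypothetical names; the real fields live in src/include/sal_data.h). Here access and deny values double as 2-bit masks (1 = read, 2 = write, 3 = both), and the effective mode is the union over all modes with a non-zero counter:

```c
#include <assert.h>

#define MODE_MAX 4	/* access/deny values 0..3: none, read, write, rw */

/* Per-owner counters, one per access value and one per deny value. */
struct toy_share {
	unsigned access_cnt[MODE_MAX];
	unsigned deny_cnt[MODE_MAX];
};

/* The effective mask is the union of every mode whose counter is
 * non-zero; each value v also serves as its own 2-bit mask. */
static unsigned union_of(const unsigned cnt[MODE_MAX])
{
	unsigned mask = 0;

	for (int v = 1; v < MODE_MAX; v++)
		if (cnt[v] > 0)
			mask |= v;
	return mask;
}

static void do_share(struct toy_share *s, int access, int deny)
{
	s->access_cnt[access]++;
	s->deny_cnt[deny]++;
}

/* An UNSHARE that doesn't match an in-use mode is simply ignored;
 * otherwise the counters drop, and a mode leaves the union only
 * when its counter reaches 0. */
static void do_unshare(struct toy_share *s, int access, int deny)
{
	if (s->access_cnt[access] > 0)
		s->access_cnt[access]--;
	if (s->deny_cnt[deny] > 0)
		s->deny_cnt[deny]--;
}
```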

Change-Id: I1284b654bf97c1b53b782fe142f28463d61288f5
Signed-off-by: Frank S. Filz 
---
M src/SAL/state_share.c
M src/include/sal_data.h
2 files changed, 88 insertions(+), 28 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/03/406503/1
--
To view, visit https://review.gerrithub.io/406503
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1284b654bf97c1b53b782fe142f28463d61288f5
Gerrit-Change-Number: 406503
Gerrit-PatchSet: 1
Gerrit-Owner: Frank Filz 
--


Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2018-04-04 Thread Daniel Gryniewicz
Okay, thanks.  That confirms to me that we need to do something else. 
I'll start to look into this ASAP.


Daniel


Re: [Nfs-ganesha-devel] mdcache growing beyond limits.

2018-04-04 Thread Pradeep
Hi Daniel,

I tried increasing lanes to 1023. The usage looks better, but still over
the limit:

$2 = {entries_hiwat = 10, entries_used = 299838, chunks_hiwat = 10,
chunks_used = 1235, fds_system_imposed = 1048576,
  fds_hard_limit = 1038090, fds_hiwat = 943718, fds_lowat = 524288,
futility = 0, per_lane_work = 50, biggest_window = 419430,
  prev_fd_count = 39434, prev_time = 1522775283, caching_fds = true}

I'm trying to simulate a build workload by running the SpecFS SWBUILD
workload. This is with Ganesha 2.7 and FSAL_VFS. The server has 4
CPUs/12 GB of memory.

For build 8 (40 processes), the latency increased from 5ms (with 17 lanes)
to 22 ms (with 1023 lanes) and the test failed to achieve required IOPs.

Thanks,
Pradeep

On Tue, Apr 3, 2018 at 7:58 AM, Pradeep  wrote:

> Hi Daniel,
>
> Sure I will try that.
>
> One thing I tried is to not allocate new entries and return
> NFS4ERR_DELAY in the hope that the increased refcnt at LRU is
> temporary. This worked for some time, but then I hit a case where
> all the entries at the LRU end of L1 had a refcnt of 2 and the
> subsequent entries had a refcnt of 1. All L2s were empty. I realized
> that whenever a new entry is created, its refcnt is 2 and it is put at
> the LRU end. Promotions from L2 also move entries to the LRU of L1, so
> it is likely that many threads end up finding no entries at LRU and
> allocate new entries instead.
>
> Then I tried another experiment: Invoke lru_wake_thread() when the
> number of entries is greater than entries_hiwat; but still allocate a
> new entry for the current thread. This worked. I had to make a change
> in lru_run() to allow demotion in case of 'entries > entries_hiwat' in
> addition to max FD check. The side effect would be that it will close
> FDs and demote to L2. Almost all of these FDs are opened in the
> context of setattr/getattr; so attributes are already in cache and FDs
> are probably useless until the cache expires.  I think your idea of
> moving further down the lane may be a better approach.
>
> I will try your suggestion next. With 1023 lanes, it is unlikely that
> all lanes will have an active entry.
>
> Thanks,
> Pradeep
>
> On 4/3/18, Daniel Gryniewicz  wrote:
> > So, the way this is supposed to work is that getting a ref when the ref
> > is 1 is always an LRU_REQ_INITIAL ref, so that moves it to the MRU.  At
> > that point, further refs don't move it around in the queue, just
> > increment the refcount.  This should be the case, because
> > mdcache_new_entry() and mdcache_find_keyed() both get an INITIAL ref,
> > and all other refs require you to already have a pointer to the entry
> > (and therefore a ref).
> >
> > Can you try something, since you have a reproducer?  It seems that, with
> > 1.7 million files, 17 lanes may be a bit low.  Can you try with
> > something ridiculously large, like 1023, and see if that makes a
> > difference?
> >
> > I suspect we'll have to add logic to move further down the lanes if
> > futility hits.
> >
> > Daniel
> >
> > On 04/02/2018 12:30 PM, Pradeep wrote:
> >> We discussed this a while ago. I'm running into this again with 2.6.0.
> >> Here is a snapshot of the lru_state (I set the max entries to 10):
> >>
> >> {entries_hiwat = 20, entries_used = 1772870, chunks_hiwat = 10,
> >> chunks_used = 16371, lru_reap_l1 = 8116842,
> >>lru_reap_l2 = 1637334, lru_reap_failed = 1637334, attr_from_cache =
> >> 31917512, attr_from_cache_for_client = 5975849,
> >>fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat =
> >> 943718, fds_lowat = 524288, futility = 0, per_lane_work = 50,
> >>biggest_window = 419430, prev_fd_count = 0, prev_time = 1522647830,
> >> caching_fds = true}
> >>
> >> As you can see it has grown well beyond the limit set (1.7 million vs
> >> 200K max size). lru_reap_failed indicates number of times the reap
> >> failed from L1 and L2.
> >> I'm wondering what can cause the reap to fail once it reaches a steady
> >> state. It appears to me that the entry at LRU (head of the queue) is
> >> actually being used (refcnt > 1) and there are entries in the queue with
> >> refcnt == 1. But those are not being looked at. My understanding is that
> >> if an entry is accessed, it must move to MRU (tail of the queue). Any
> >> idea why the entry at LRU can have a refcnt > 1?
> >>
> >> This can happen if the refcnt is incremented without QLOCK and if
> >> lru_reap_impl() is called at the same time from another thread, it will
> >> skip the first entry and return NULL. This was done
> >> in _mdcache_lru_ref() which could cause the refcnt on the head of the
> >> queue to be incremented while some other thread looks at it holding a
> >> QLOCK. I tried moving the increment/dequeue in _mdcache_lru_ref() inside
> >> QLOCK; but that did not help.
> >>
> >> Also if "get_ref()" is called for the entry at the LRU for some reason,
> >> it will just increment refcnt and return. I think the assumption is that
> >> by the time "get_ref() is called, the entry is
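The reap failure discussed in this thread can be sketched with a toy model (illustrative only, not the mdcache code): the reaper examines only the entry at the head of each lane, so a busy head hides reclaimable entries behind it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy LRU lane entry. */
struct toy_entry {
	int refcount;			/* 1 == held only by the LRU */
	struct toy_entry *next;		/* toward the MRU end */
};

/* The reaper looks only at the lane head, as described above.  If
 * the head is in use (refcount > 1), the reap fails for this lane,
 * even though entries further down may be reclaimable. */
static struct toy_entry *lane_reap(struct toy_entry *head)
{
	if (head != NULL && head->refcount == 1)
		return head;
	return NULL;
}
```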

[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Remove fs_lease_time. This is not being used.

2018-04-04 Thread GerritHub
From supriti.si...@suse.com:

supriti.si...@suse.com has uploaded this change for review. ( 
https://review.gerrithub.io/406359


Change subject: Remove fs_lease_time. This is not being used.
..

Remove fs_lease_time. This is not being used.

Change-Id: I34c41fca9c22f89820cd29ed63414a3b4442482e
Signed-off-by: Supriti Singh 
---
M src/FSAL/FSAL_CEPH/main.c
M src/FSAL/FSAL_GLUSTER/main.c
M src/FSAL/FSAL_GPFS/main.c
M src/FSAL/FSAL_MEM/mem_main.c
M src/FSAL/FSAL_PROXY/main.c
M src/FSAL/FSAL_PSEUDO/main.c
M src/FSAL/FSAL_RGW/main.c
M src/FSAL/FSAL_VFS/panfs/main.c
M src/FSAL/FSAL_VFS/vfs/main-c.in.cmake
M src/FSAL/FSAL_VFS/xfs/main.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_export.c
M src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_main.c
M src/FSAL/Stackable_FSALs/FSAL_NULL/export.c
M src/FSAL/Stackable_FSALs/FSAL_NULL/main.c
M src/FSAL/default_methods.c
M src/FSAL/fsal_config.c
M src/include/FSAL/fsal_config.h
M src/include/fsal_api.h
M src/include/fsal_types.h
19 files changed, 0 insertions(+), 82 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/59/406359/1
--
To view, visit https://review.gerrithub.io/406359
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I34c41fca9c22f89820cd29ed63414a3b4442482e
Gerrit-Change-Number: 406359
Gerrit-PatchSet: 1
Gerrit-Owner: supriti.si...@suse.com
--


[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: C++ fix - cannot define variable in a header

2018-04-04 Thread GerritHub
From Daniel Gryniewicz:

Daniel Gryniewicz has uploaded this change for review. ( 
https://review.gerrithub.io/406344


Change subject: C++ fix - cannot define variable in a header
..

C++ fix - cannot define variable in a header
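The underlying rule: a variable defined (not just declared) in a header gets one copy in every translation unit that includes it, which is a multiple-definition error under C++. The usual fix, sketched here with a hypothetical variable name, is to declare it `extern` in the header and define it in exactly one source file:

```c
#include <assert.h>

/* What the header (e.g. nfs_core.h) should contain: a declaration
 * only, no storage allocated. */
extern int toy_shared_counter;

/* What exactly one .c file should contain: the single definition
 * that allocates (and optionally initializes) the storage. */
int toy_shared_counter = 0;
```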

Change-Id: I5869841cd45aef1f9221f980385822211f11068f
Signed-off-by: Daniel Gryniewicz 
---
M src/include/nfs_core.h
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/44/406344/1
--
To view, visit https://review.gerrithub.io/406344
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5869841cd45aef1f9221f980385822211f11068f
Gerrit-Change-Number: 406344
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Gryniewicz 
--