Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: gtest/CMakeLists: libraries for commit2

2018-04-23 Thread William Allen Simpson

This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
On 4/23/18 10:01 AM, GerritHub wrote:



Frank, who is responsible for changing GerritHub?




Re: [Nfs-ganesha-devel] Couple of stat issues.

2018-04-10 Thread William Allen Simpson

On 4/10/18 8:49 PM, Pradeep wrote:

2. In nfs_rpc_execute(), the queue_wait is set to the difference between 
op_ctx->start_time and reqdata->time_queued. But reqdata->time_queued is never 
set (in the old code - pre 2.6-dev5, nfs_rpc_enqueue_req() used to set it; now only 9P 
code sets it).
Is nfs_rpc_decode_request() a good place to set it?


This has no meaning, as there is no queue anymore (other than
vestigial code moved to 9P until they have time to fix it).

Better to remove queue_wait entirely.



Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-28 Thread William Allen Simpson

On 3/27/18 9:34 AM, William Allen Simpson wrote:

On 3/25/18 1:44 PM, William Allen Simpson wrote:

On 3/23/18 1:30 PM, William Allen Simpson wrote:

Ran some apples to apples comparisons today V2.7-dev.5:


Without the client-side rbtrees, rpcping works a lot better:


Thought of a small tweak to the list adding routine, so it doesn't
kick the epoll timer unless the SVCXPRT was added to the end of its
timeout list (a much rarer occurrence, but it could happen).

The numbers don't change much, so I ran more of them.  Not sorted
this time, but you get the gist.  Still seeing a huge improvement
around 1,000, with a rough plateau over 10,000 calls.


I've spent some time trying to figure out the plateau.  Turned out
it was a programming error, a break where it should be a continue.
So all the later entries were timing out, but counted as responses.
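For what it's worth, a minimal sketch of the break-versus-continue
distinction (invented names; the real rpcping reply loop is more
involved than this):

#include <stdbool.h>
#include <stddef.h>

/* Illustrative only: "continue" skips one unanswered slot and keeps
 * scanning, while a "break" in the same place abandons every later slot,
 * so the bookkeeping after the loop no longer matches what completed. */
struct slot { bool replied; };

static size_t count_replies(struct slot *slots, size_t n, size_t *timeouts)
{
    size_t replies = 0;

    *timeouts = 0;
    for (size_t i = 0; i < n; i++) {
        if (!slots[i].replied) {
            ++*timeouts;   /* the timeout counter mentioned below */
            continue;      /* a "break" here would skip all later slots */
        }
        replies++;
    }
    return replies;
}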

Sadly, those higher throughput numbers can be disregarded. :(

In my latest rpcping code, I've added a counter for timeouts.

Happily, the profile shows spending our time in recv() and writev(),
exactly as it should be.  No more 48% of time in rbtree_insert.

Still having a problem with rpcping running to completion.  Needs
more debugging...



Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-27 Thread William Allen Simpson

On 3/25/18 1:44 PM, William Allen Simpson wrote:

On 3/23/18 1:30 PM, William Allen Simpson wrote:

Ran some apples to apples comparisons today V2.7-dev.5:


Without the client-side rbtrees, rpcping works a lot better:


Thought of a small tweak to the list adding routine, so it doesn't
kick the epoll timer unless the SVCXPRT was added to the end of its
timeout list (a much rarer occurrence, but it could happen).
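Roughly what I have in mind, as an untested sketch with made-up names
(the actual svc_rqst list and its ordering in ntirpc may well differ):

#include <stdbool.h>
#include <stdint.h>
#include <sys/queue.h>

/* Sketch only: keep the timeout list ordered so the soonest deadline sits
 * at the tail, and report "kick the epoll timer" only when the new entry
 * actually lands there.  Field names are invented. */
struct timeout_entry {
    TAILQ_ENTRY(timeout_entry) q;
    uint64_t expires;                   /* absolute deadline */
};
TAILQ_HEAD(timeout_list, timeout_entry);

static bool timeout_insert(struct timeout_list *list, struct timeout_entry *te)
{
    struct timeout_entry *it;

    TAILQ_FOREACH(it, list, q) {
        if (te->expires > it->expires) {
            TAILQ_INSERT_BEFORE(it, te, q);
            return false;               /* common case: timer left alone */
        }
    }
    TAILQ_INSERT_TAIL(list, te, q);     /* new soonest deadline (rarer) */
    return true;                        /* caller re-arms the epoll timeout */
}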

The numbers don't change much, so I ran more of them.  Not sorted
this time, but you get the gist.  Still seeing a huge improvement
around 1,000, with a rough plateau over 10,000 calls.

But the raw data looks to me like Ganesha edges up past the kernel
around 1,000,000.  Or maybe the extra Ganesha system call overhead
cancels out distributed over a longer period of time?

Probably need to run hundreds of times to get a better distribution,
but more than I'm willing to do by hand.

Happy baking!


Ganesha (worst, best):

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 33950.1556, total 33950.1556
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 43668.3435, total 43668.3435



rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 151800.6287, total 151800.6287
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 167828.8817, total 167828.8817


rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 144967.5809, total 144967.5809
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 219739.3627, total 219739.3627
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 218477.8040, total 218477.8040
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 126693.0146, total 126693.0146
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 131807.8768, total 131807.8768

rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 265231.6362, total 265231.6362
rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 281711.3287, total 281711.3287
rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 258412.9101, total 258412.9101
rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 244638.8736, total 244638.8736
rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 264594.2726, total 264594.2726

rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 281988.8465, total 281988.8465
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 282341.2245, total 282341.2245
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 286837.9973, total 286837.9973
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 277970.8432, total 277970.8432
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 285086.8682, total 285086.8682

rpcping tcp localhost count=100 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 292704.4142, total 292704.4142
rpcping tcp localhost count=100 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 296892.2598, total 296892.2598
rpcping tcp localhost count=100 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 287227.5968, total 287227.5968
rpcping tcp localhost count=100 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 295969.2889, total 295969.2889
rpcping tcp localhost count=100 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 294702.5526, total 294702.5526





Kernel (worst, best):

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 46826.6383, total 46826.6383
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 52915.1652, total 52915.1652



rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 175773.3986, total 175773.3986
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 189168.4778, total 189168.4778


rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-25 Thread William Allen Simpson

On 3/23/18 1:30 PM, William Allen Simpson wrote:

Ran some apples to apples comparisons today V2.7-dev.5:


Without the client-side rbtrees, rpcping works a lot better:



Ganesha (worst, best):

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 33950.1556, total 33950.1556
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 43668.3435, total 43668.3435



rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 151800.6287, total 151800.6287
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 167828.8817, total 167828.8817



Kernel (worst, best):

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 46826.6383, total 46826.6383
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 52915.1652, total 52915.1652



rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 175773.3986, total 175773.3986
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 189168.4778, total 189168.4778



Re: [Nfs-ganesha-devel] rpcping profile

2018-03-25 Thread William Allen Simpson

On 3/24/18 7:50 AM, William Allen Simpson wrote:

Noting that the top problem is exactly my prediction by knowledge of
the code:
   clnt_req_callback() opr_rbtree_insert()

The second is also exactly as expected:

   svc_rqst_expire_insert() opr_rbtree_insert() svc_rqst_expire_cmpf()

These are both inserted in ascending order, sorted in ascending order,
and removed in ascending order...

QED: rb_tree is a poor data structure for this purpose.


I've replaced those 2 rbtrees with TAILQ, so that we are not
spending 49% of the time there anymore, and am now seeing:

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 151800.6287, total 151800.6287
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 167828.8817, total 167828.8817

This is probably good enough for now.  Time to move on to
more interesting things.
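For the record, the shape of the change, reduced to a sketch with
invented names (not the actual clnt_req code): since xids are issued in
ascending order and retire in the same order on TCP, inserts go at the
tail and removals almost always hit the head.

#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

struct pending_call {
    TAILQ_ENTRY(pending_call) q;
    uint32_t xid;
};
TAILQ_HEAD(pending_list, pending_call);

static void call_issue(struct pending_list *list, struct pending_call *pc)
{
    TAILQ_INSERT_TAIL(list, pc, q);     /* O(1), no rebalancing */
}

static struct pending_call *call_retire(struct pending_list *list, uint32_t xid)
{
    struct pending_call *pc;

    TAILQ_FOREACH(pc, list, q) {        /* normally terminates at the head */
        if (pc->xid == xid) {
            TAILQ_REMOVE(list, pc, q);
            return pc;
        }
    }
    return NULL;                        /* unmatched reply */
}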



[Nfs-ganesha-devel] rpcping profile

2018-03-24 Thread William Allen Simpson

Using local file tests/rpcping.
Using local file ../profile.
Total: 989 samples
 321  32.5%  32.5%  321  32.5% svc_rqst_expire_cmpf
 149  15.1%  47.5%  475  48.0% opr_rbtree_insert
 139  14.1%  61.6%  140  14.2% __writev
  56   5.7%  67.2%   66   6.7% __GI___pthread_mutex_lock
  32   3.2%  70.5%   32   3.2% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
  32   3.2%  73.7%   32   3.2% futex_abstimed_wait_cancelable
  23   2.3%  76.0%   23   2.3% __libc_write
  21   2.1%  78.2%   23   2.3% __libc_recv
  18   1.8%  80.0%   18   1.8% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/misc/../sysdeps/unix/syscall-template.S:84
  17   1.7%  81.7%   17   1.7% __libc_read
  15   1.5%  83.2%   15   1.5% clnt_req_xid_cmpf
  14   1.4%  84.6%   14   1.4% _int_free
  14   1.4%  86.0%   14   1.4% futex_wake
  11   1.1%  87.2%   13   1.3% __GI_epoll_pwait
   8   0.8%  88.0%9   0.9% _int_malloc
   8   0.8%  88.8%  494  49.9% svc_rqst_expire_insert
   7   0.7%  89.5%   20   2.0% __libc_calloc
   7   0.7%  90.2%7   0.7% opr_rbtree_first
   6   0.6%  90.8%6   0.6% __GI___libc_free
   5   0.5%  91.3%5   0.5% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
   5   0.5%  91.8%  150  15.2% svc_vc_recv
   5   0.5%  92.3%  470  47.5% work_pool_thread
   4   0.4%  92.7%4   0.4% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:133
   3   0.3%  93.0%3   0.3% __condvar_dec_grefs
   3   0.3%  93.3%3   0.3% atomic_add_uint64_t
   2   0.2%  93.5%2   0.2% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/inet/../sysdeps/x86_64/htonl.S:29
   2   0.2%  93.7%2   0.2% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/cancellation.S:67
   2   0.2%  93.9%   34   3.4% __pthread_mutex_unlock_usercnt
   2   0.2%  94.1%2   0.2% _seterr_reply
   2   0.2%  94.3%2   0.2% atomic_add_uint32_t@2665d
   2   0.2%  94.5%9   0.9% clnt_req_release
   2   0.2%  94.7%2   0.2% malloc_consolidate
   2   0.2%  94.9%  142  14.4% svc_ioq_flushv
   2   0.2%  95.1%3   0.3% svc_ref_it@401421
   2   0.2%  95.3%  255  25.8% svc_rqst_epoll_loop
   2   0.2%  95.6%   12   1.2% xdr_ioq_release
   2   0.2%  95.8%6   0.6% xdr_ioq_setup
   1   0.1%  95.9%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/misc/../sysdeps/unix/syscall-template.S:84
   1   0.1%  96.0%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/cancellation.S:59
   1   0.1%  96.1%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/cancellation.S:60
   1   0.1%  96.2%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/cancellation.S:67
   1   0.1%  96.3%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:296
   1   0.1%  96.4%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:158
   1   0.1%  96.5%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:189
   1   0.1%  96.6%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:193
   1   0.1%  96.7%1   0.1% ?? 
/usr/src/debug/glibc-2.26-137-g247c1ddd30/nptl/../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:138
   1   0.1%  96.8%1   0.1% 0x7fffde7caa20
   1   0.1%  96.9%1   0.1% 0x7fffde7caa25
   1   0.1%  97.0%1   0.1% 0x7fffde7caaa1
   1   0.1%  97.1%1   0.1% __GI___libc_malloc
   1   0.1%  97.2%1   0.1% __condvar_get_private
   1   0.1%  97.3%1   0.1% __pthread_cond_destroy
   1   0.1%  97.4%1   0.1% __pthread_cond_signal
   1   0.1%  97.5%1   0.1% __pthread_cond_wait_common
   1   0.1%  97.6%1   0.1% _init
   1   0.1%  97.7%1   0.1% alloc_perturb
   1   0.1%  97.8%1   0.1% atomic_add_uint32_t@401377
   1   0.1%  97.9%1   0.1% atomic_dec_uint32_t@e3d8
   1   0.1%  98.0%2   0.2% atomic_inc_uint32_t@401394
   1   0.1%  98.1%4   0.4% atomic_inc_uint64_t
   1   0.1%  98.2%1   0.1% atomic_postclear_uint32_t_bits
   1   0.1%  98.3%1   0.1% atomic_postset_uint16_t_bits
   1   0.1%  98.4%1   0.1% atomic_sub_uint32_t
   1   

[Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-23 Thread William Allen Simpson

Ran some apples to apples comparisons today V2.7-dev.5:

Ganesha (worst, best):

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 33950.1556, total 33950.1556
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 43668.3435, total 43668.3435

rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 17578.4042, total 17578.4042
rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 17587.1451, total 17587.1451

rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 15981.8650, total 15981.8650
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 16086.4951, total 16086.4951

Kernel (worst, best):

rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 46826.6383, total 46826.6383
rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 52915.1652, total 52915.1652

rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 17334.0426, total 17334.0426
rpcping tcp localhost count=1 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 18545.5056, total 18545.5056

rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 16413.7585, total 16413.7585
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 16463.0157, total 16463.0157

Conclusion:

For light concurrent loads, we're slower.  For heavier loads, we're the same.




Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-23 Thread William Allen Simpson

On 3/23/18 7:59 AM, Daniel Gryniewicz wrote:

Thanks, Tomk.  PR is here: https://review.gerrithub.io/404945



Actually, it seems fairly elegant.

ntirpc and rdma also have the USE_ and _USE_ convention.  Both
require libraries, and would benefit from defaults with
enforcement checking for the cmake parameter line.

How hard would it be to convert?  Or would you prefer waiting
until these FSALs are pulled, and then try next week?



Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.4

2018-03-18 Thread William Allen Simpson

On 3/17/18 11:01 AM, Jeff Layton wrote:

See:

 https://review.gerrithub.io/c/404231/


Thanks.  A more pro-active approach to be sure.  I just assumed
Frank would quickly fix it and push a new dev.4a when he saw it Sat.
Nice to see I'm not the only one coding on weekends.



Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.4

2018-03-17 Thread William Allen Simpson

On 3/16/18 7:23 PM, Frank Filz wrote:

Branch next

Tag:V2.7-dev.4

NOTE: This merge includes an ntirpc pullup, please update your submodule

This is a big merge with a lot of cleanup.


Doesn't compile for me.



[ 17%] Building C object Protocols/NFS/CMakeFiles/nfsproto.dir/nfs4_Compound.c.o
In file included from /home/bill/rdma/install/include/ntirpc/rpc/xdr.h:52:0,
 from 
/home/bill/rdma/install/include/ntirpc/rpc/xdr_inline.h:53,
 from /home/bill/rdma/nfs-ganesha/src/include/gsh_rpc.h:17,
 from /home/bill/rdma/nfs-ganesha/src/include/uid2grp.h:44,
 from /home/bill/rdma/nfs-ganesha/src/include/fsal_types.h:42,
 from /home/bill/rdma/nfs-ganesha/src/include/fsal_api.h:43,
 from /home/bill/rdma/nfs-ganesha/src/include/fsal.h:62,
 from 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:35:
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c: In function 
‘nfs4_Compound’:
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:697:27: error: ?: 
using integer constants in boolean context, the expression will always evaluate 
to ‘true’ [-Werror=int-in-bool-context]
bad_pos ? NIV_INFO : NIV_DEBUG,
/home/bill/rdma/install/include/ntirpc/intrinsic.h:32:42: note: in definition 
of macro ‘unlikely’
 #define unlikely(x)  __builtin_expect(!!(x), 0)
  ^
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:696:4: note: in 
expansion of macro ‘LogAtLevel’
LogAtLevel(COMPONENT_SESSIONS,
^~
cc1: all warnings being treated as errors
make[2]: *** [Protocols/NFS/CMakeFiles/nfsproto.dir/build.make:231: 
Protocols/NFS/CMakeFiles/nfsproto.dir/nfs4_Compound.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:917: 
Protocols/NFS/CMakeFiles/nfsproto.dir/all] Error 2
make: *** [Makefile:152: all] Error 2
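A minimal standalone reproduction of the warning, assuming only that
NIV_INFO and NIV_DEBUG are distinct nonzero constants (values made up
here); hoisting the ternary out of the boolean context is one way to
silence it, though the fix actually merged may differ:

#define unlikely(x)  __builtin_expect(!!(x), 0)
#define NIV_INFO  5     /* made-up values; only "distinct, nonzero" matters */
#define NIV_DEBUG 6

static int component_level = NIV_DEBUG;

int warns(int bad_pos)
{
    /* gcc 7+: "?: using integer constants in boolean context" */
    return unlikely(bad_pos ? NIV_INFO : NIV_DEBUG);
}

int quiet(int bad_pos)
{
    int level = bad_pos ? NIV_INFO : NIV_DEBUG;

    /* the boolean context now sees a comparison, not bare constants */
    return unlikely(component_level >= level);
}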



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup NTIRPC through #124

2018-03-16 Thread William Allen Simpson

On 3/16/18 10:07 AM, GerritHub wrote:

william.allen.simp...@gmail.com has uploaded this change for *review*.



I see that our ci.centos.org now provides dbench and iozone.

The dbench results are in its log:

+ tail -21 ../dbenchTestLog.txt

 Operation                Count    AvgLat    MaxLat
 --------------------------------------------------
 Deltree                    102     9.799    27.590
 Flush                   284316     1.637   203.259
 Close                  2979801     0.007     0.330
 LockX                    13208     0.007     0.079
 Mkdir                       51     0.011     0.059
 Rename                  171774     0.073     0.463
 ReadX                  6358865     0.010    38.319
 WriteX                 2022375     0.048    40.888
 Unlink                  819204     0.090    38.363
 UnlockX                  13208     0.006     0.063
 FIND_FIRST             1421549     0.044    38.320
 SET_FILE_INFORMATION    330438     0.024     0.310
 QUERY_FILE_INFORMATION  644319     0.004     0.242
 QUERY_PATH_INFORMATION 3676827     0.015    40.851
 QUERY_FS_INFORMATION    674193     0.010    37.783
 NTCreateX              4056560     0.049   122.097


Where are the iozone results from ../ioZoneLog.txt?



Re: [Nfs-ganesha-devel] A question about rpc requests maybe for Bill

2018-03-15 Thread William Allen Simpson

On 3/15/18 7:57 PM, Frank Filz wrote:

NFS v4.1 has a max request size option for the session, I’m wondering if 
there’s a way to get the size of a given request easily.


Depends on how that's defined.  Bytes following header?  And what you
need to do with it.

It might be simplest to add a data length field to the struct svc_req,
and set it during decode.
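Something like the following untested sketch (invented names, not the
actual ntirpc struct svc_req layout); the session's max-request check
then becomes a single comparison against that field:

#include <stddef.h>
#include <stdint.h>

/* Invented layout, for illustration only. */
struct svc_req_sketch {
    uint32_t rq_xid;
    uint32_t rq_data_len;   /* bytes following the RPC header, set at decode */
    /* ... everything else ... */
};

static void decode_note_length(struct svc_req_sketch *req,
                               size_t record_len, size_t header_len)
{
    req->rq_data_len = (record_len > header_len)
                     ? (uint32_t)(record_len - header_len)
                     : 0;
}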



Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread William Allen Simpson

On 3/15/18 10:23 AM, Daniel Gryniewicz wrote:

Can you try again with a larger count, like 100k?  500 is still quite
small for a loop benchmark like this.


In the code, I commented that 500 is minimal.  I've done a pile of
100, 200, 300, and they perform roughly the same as 500.

rpcping tcp localhost count=100 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 46812.8194, total 46812.8194
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 41285.4267, total 41285.4267

100k is a lot less (when it works).

tests/rpcping tcp localhost -c 10
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 15901.7190, total 15901.7190
tests/rpcping tcp localhost -c 10
rpcping tcp localhost count=10 threads=1 workers=5 (port=2049 
program=13 version=3 procedure=0): mean 15894.9971, total 15894.9971

tests/rpcping tcp localhost -c 10 -t 2
double free or corruption (out)
Aborted (core dumped)

tests/rpcping tcp localhost -c 10 -t 2
double free or corruption (out)
corrupted double-linked list (not small)
Aborted (core dumped)

Looks like we have a nice dump test case! ;)



Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread William Allen Simpson

On 3/14/18 3:33 AM, William Allen Simpson wrote:

rpcping tcp localhost threads=1 count=500 (port=2049 program=13 version=3 
procedure=0): mean 51285.7754, total 51285.7754


DanG pushed the latest code onto ntirpc this morning, and I'll submit a
pullup for Ganesha later today.

I've changed the calculations to be in the final loop, holding onto
the hope that the original design of averaging each thread's result
might introduce quantization errors.  But it didn't significantly
change the results.

I've improved the pretty print a bit, now including the worker pool.
The default 5 worker threads are each handling the incoming replies
concurrently, so they hopefully keep working without a thread switch.

Another thing I've noted is that the best result is almost always the
first result after an idle period.  That's opposite of my expectations.

Could it be that the Ganesha worker pool size of 200 (default)
or 500 (configured) is much too large, causing thread scheduler thrashing?

rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 50989.4139, total 50989.4139
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 32562.0173, total 32562.0173
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 34479.7577, total 34479.7577
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 34070.8189, total 34070.8189
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 33861.2689, total 33861.2689
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 35843.8433, total 35843.8433
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 35367.2721, total 35367.2721
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 31642.2972, total 31642.2972
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 34738.4166, total 34738.4166
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 33211.7319, total 33211.7319
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 35000.5520, total 35000.5520
rpcping tcp localhost count=500 threads=1 workers=5 (port=2049 program=13 
version=3 procedure=0): mean 36557.6578, total 36557.6578



Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson

On 3/14/18 7:27 AM, Matt Benjamin wrote:

Daniel doesn't think you've measured much accurately yet, but at least
the effort (if not the discussion) aims to.


I'm sure Daniel can speak for himself.  At your time of writing,
Daniel had not yet arrived in the office after my post this am.

So I'm assuming you're speculating.  Or denigrating my work and
attributing that sentiment to Daniel.  I'd appreciate you cease
doing that.

I've done my best with Tigran's code design that you held onto
for 6 years without putting it into the tree or keeping it up-to-date.

At this time, there's no indication any numbers are in error.

If you have quantitative information, please provide it.



Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson

On 3/13/18 1:58 PM, Daniel Gryniewicz wrote:

rpcping was not thread safe.  I have fixes for it incoming.


With DanG's significant help, we now have better timing results.

There was an implicit assumption in the ancient code that it was
calling single threaded tirpc, while ntirpc is multi-threaded.

The documentation on clock_gettime() says that we cannot obtain
correct timer results between threads.  The starting and stopping
timer calls must be on the same thread.
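The shape of the measurement, reduced to a sketch (the real code times
the reply callbacks, not a placeholder; CLOCK_MONOTONIC assumed):

#include <stdio.h>
#include <time.h>

static double elapsed_s(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) +
           (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
    const int count = 1000;     /* placeholder for the calls issued */
    struct timespec start, stop;
    double e;

    clock_gettime(CLOCK_MONOTONIC, &start);
    /* ... issue the calls and wait for every callback to complete ... */
    clock_gettime(CLOCK_MONOTONIC, &stop);

    e = elapsed_s(start, stop);
    printf("elapsed %.6f s, mean %.4f calls/s\n",
           e, e > 0.0 ? count / e : 0.0);
    return 0;
}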

I've returned the code to the original design that records only
client replies, not the connection create and destroy.  As
expected, the reports have improved by that margin.

Same result.  More calls ::= slower times.

rpcping tcp localhost threads=1 count=500 (port=2049 program=13 version=3 
procedure=0): mean 51285.7754, total 51285.7754
rpcping tcp localhost threads=1 count=1000 (port=2049 program=13 version=3 
procedure=0): mean 44849.7587, total 44849.7587
rpcping tcp localhost threads=1 count=2000 (port=2049 program=13 version=3 
procedure=0): mean 32418.8600, total 32418.8600
rpcping tcp localhost threads=1 count=3000 (port=2049 program=13 version=3 
procedure=0): mean 22578.4432, total 22578.4432
rpcping tcp localhost threads=1 count=5000 (port=2049 program=13 version=3 
procedure=0): mean 18748.8576, total 18748.8576
rpcping tcp localhost threads=1 count=7000 (port=2049 program=13 version=3 
procedure=0): mean 18532.9326, total 18532.9326
rpcping tcp localhost threads=1 count=1 (port=2049 program=13 version=3 
procedure=0): mean 17750.2026, total 17750.2026

As before, multiple call threads are not helping:

rpcping tcp localhost threads=2 count=750 (port=2049 program=13 version=3 
procedure=0): mean 14615.7612, total 29231.5224
rpcping tcp localhost threads=3 count=750 (port=2049 program=13 version=3 
procedure=0): mean 8456.7597, total 25370.2792
rpcping tcp localhost threads=5 count=750 (port=2049 program=13 version=3 
procedure=0): mean 3851.8920, total 19259.4602

We've tried limiting the number of reply threads (was one worker per
reply up to 500 with recycling above that), but the overhead of creating
threads is swamped by something else.  No consistent difference.



Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson

On 3/13/18 8:27 AM, Matt Benjamin wrote:

On Tue, Mar 13, 2018 at 2:38 AM, William Allen Simpson
<william.allen.simp...@gmail.com> wrote:

but if we assume xids retire in xid order also,


They do.  Should be no variance.  Eliminating the dupreq caching --
also using the rbtree -- significantly improved the timing.


It's certainly correct not to cache, but it's also a special case that
arises from...benchmarking with rpcping, not NFS.


Nevertheless, "significantly improved the timing".

Duplicates are rare.  The DRC needs to be able to get out of the way,
and shouldn't add significant overhead.



Same goes for retire order.  Who said, let's assume the rpcping
requests retire in order?  Oh yes, me above.  


Actually, me in an earlier part of the thread.



Do you think NFS
requests in general are required to retire in arrival order?  No, of
course not.  What workload is the general case for the DRC?  NFS.


The question is not, do (RPC CALL) NFS requests retire in arrival order.

The question in this thread is how far out of order do RPC REPLY retire,
and best computer science data structure(s) for this workload.



Apparently picked the worst tree choice for this data, according to
computer science. If all you have is a hammer...


What motivates you to write this stuff?


Correctness.



Here are two facts you may have overlooked:

1. The DRC has a constant insert-delete workload, and for this
application, IIRC, I put the last inserted entries directly into the
cache.  This both applies standard art on trees (rbtree vs avl
perfomance on insert/delete heavy workloads, and ostensibly avoids
searching the tree in the common case;  I measured hitrate informally,
looked to be working).


I have no idea why we are off on this tangent here.  The subject is
rpcping, not the DRC.

As to the DRC, we know that in fact the ntirpc "citihash" was of the
wrong data in GSS (the always changing ciphertext instead of the
plaintext), so in that case there was *no* hit rate at all.

In ntirpc v1.6, we now have a formal call to checksum, instead of an
ad hoc addition to the decode.  So we should be getting a better hit
rate.  I look forward to publication of your hit rate results.


2. the key in the DRC caches is hk, not xid.


That should improve the results for DRC RB-trees.

As I've mentioned before, I've never really examined the DRC code.
In person yesterday afternoon, you agreed that the repeated mallocs
in that code provide contention during concurrent thread processing
in the main path.

I've promised to take a look during my zero-copy efforts.

But this thread is about rpcping data structures.



What have you compared it to?  Need a gtest of avl and tailq with the
same data.  That's what the papers I looked at do...


[...]

The rb tree either is, or isn't a major contributor to latency.  We'll
ditch it if it is.  Substituting a tailq (linear search) seems an
unlikely choice, but if you can prove your case with the numbers, no
one's going to object.


Thank you.  I'll probably try that in a week or so.

Right now, as mentioned on the conference call, I need some help
diagnosing why the rpcping code crashes.  Some assumptions about
threading seem to be wrong.  DanG is helping immensely!



Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson

On 3/13/18 2:38 AM, William Allen Simpson wrote:

In my measurements, using the new CLNT_CALL_BACK(), the client thread
starts sending a stream of pings.  In every case, it peaks at a
relatively stable rate.


DanG suggested that timing was dominated by the system time calls.

The previous numbers were switched to a finer grained timer than
the original code.  JeffL says that clock_gettime() should have had
negligible overhead.

But just to make sure, I've eliminated the per thread timers and
substituted one before and one after.  Unlike previously, this
will include the overhead of setting up the client, in addition to
completing all the callback returns.

Same result.  More calls ::= slower times.

rpcping tcp localhost threads=1 count=1000 (port=2049 program=13 version=3 
procedure=0): average 36012.0254, total 36012.0254
rpcping tcp localhost threads=1 count=1500 (port=2049 program=13 version=3 
procedure=0): average 33720.9125, total 33720.9125
rpcping tcp localhost threads=1 count=2000 (port=2049 program=13 version=3 
procedure=0): average 25604.7542, total 25604.7542
rpcping tcp localhost threads=1 count=3000 (port=2049 program=13 version=3 
procedure=0): average 21170.0836, total 21170.0836
rpcping tcp localhost threads=1 count=5000 (port=2049 program=13 version=3 
procedure=0): average 18163.2451, total 18163.2451

Including the 3-way handshake time for setting up the clients does affect
the overall throughput numbers.

rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 version=3 
procedure=0): average 10379.3976, total 20758.7951
rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 version=3 
procedure=0): average 10746.9395, total 21493.8790

rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 version=3 
procedure=0): average 5473.3780, total 16420.1339
rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 version=3 
procedure=0): average 5886.5549, total 17659.6646

rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 version=3 
procedure=0): average 3396.9438, total 16984.7190
rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 version=3 
procedure=0): average 3455.3026, total 17276.5131



Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson

On 3/12/18 6:25 PM, Matt Benjamin wrote:

If I understand correctly, we always insert records in xid order, and
xid is monotonically increasing by 1.  I guess pings might come back
in any order, 


No, they always come back in order.  This is TCP.  I've gone to some
lengths to fix the problem that operations were being executed in
arbitrary order.  (As was reported in the past.)

For UDP, there is always the possibility of loss or re-ordering of
datagrams, one of the reasons for switching to TCP in NFSv3 (and
eliminating UDP in NFSv4).

Threads can still block in apparently random order, because of
timing variances inside FSAL calls.  Should not be an issue here.


but if we assume xids retire in xid order also, 


They do.  Should be no variance.  Eliminating the dupreq caching --
also using the rbtree -- significantly improved the timing.

Apparently picked the worst tree choice for this data, according to
computer science.  If all you have is a hammer...



and keep
a window of 1 records in-tree, that seems maybe like a reasonable
starting point for measuring this?


I've not tried 10,000 or 100,000 recently.  (The original code
default sent 100,000.)

I've not recorded how many remain in-tree during the run.

In my measurements, using the new CLNT_CALL_BACK(), the client thread
starts sending a stream of pings.  In every case, it peaks at a
relatively stable rate.

For 1,000, <4,000/s.  For 100, 40,000/s.  Fairly linear relationship.

By running multiple threads, I showed that each individual thread ran
roughly the same (on average).  But there is some variance per run.

I only posted the 5 thread results, lowest and highest achieved.

My original message had up to 200 threads and 4 results, but I decided
such a long series was overkill, so removed them before sending.

That 4,000 and 40,000 per client thread was stable across all runs.



I wrote a gtest program (gerrit) that I think does the above in a
single thread, no locks, for 1M cycles (search, remove, insert).  On
lemon, compiled at O2, the gtest profiling says the test finishes in
less than 150ms (I saw as low as 124).  That's over 6M cycles/s, I
think.


What have you compared it to?  Need a gtest of avl and tailq with the
same data.  That's what the papers I looked at do...



[Nfs-ganesha-devel] WAVL tree

2018-03-12 Thread William Allen Simpson

New in 2015.

https://en.wikipedia.org/wiki/WAVL_tree

There's a C++ intrusive container implementation at:

https://fuchsia.googlesource.com/zircon/+/master/system/ulib/fbl/include/fbl/intrusive_wavl_tree.h

I've not found a standard C implementation yet.



Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson

[These are with a Ganesha that doesn't dupreq cache the null operation.]

Just how slow is this RB tree?

Here's a comparison of 1000 entries versus 100 entries in ops per second:

rpcping tcp localhost threads=5 count=1000 (port=2049 program=13 version=3 
procedure=0): average 2963.2517, total 14816.2587
rpcping tcp localhost threads=5 count=1000 (port=2049 program=13 version=3 
procedure=0): average 3999.0897, total 19995.4486

rpcping tcp localhost threads=5 count=100 (port=2049 program=13 version=3 
procedure=0): average 39738.1842, total 198690.9208
rpcping tcp localhost threads=5 count=100 (port=2049 program=13 version=3 
procedure=0): average 39913.1032, total 199565.5161



Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson

One of the limiting factors in our Ganesha performance is that the
NULL operation is going through the dupreq code.  That can be
easily fixed with a check that jumps to nocache.
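Sketch of that check (names invented; the real dispatch path would jump
to its existing nocache label):

#include <stdbool.h>
#include <stdint.h>

#define NFSPROC_NULL 0  /* procedure 0 is NULL in every RPC program */

/* Sketch only: decide up front that a NULL ping is never worth caching,
 * so it can jump straight to the existing nocache path. */
static bool drc_should_cache(uint32_t proc)
{
    return proc != NFSPROC_NULL;
}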

One of the limiting factors in our ntirpc performance seems to be the
call_replies tree that stores the xid of calls to match replies.

Currently, we are using an RB tree.  The XID advances sequentially.

BTW, we have the same problem with fd.  The fd advances sequentially.

Performing sequential inserts, the AVL algorithm is 37.5% faster!
  
https://refactoringlightly.wordpress.com/2017/10/29/performance-of-avl-red-black-trees-in-java/

There is one tree per connection.  We don't really need to worry much
about out of order replies.  So the best structure would be a simpler
tailq list.

In the short term, we discussed hashing the XID before insertion.
But that still has the rapid insertion/deletion issue.

Apparently, insertions and deletions can be so slow that it takes a
count of about 40 before AVL outperforms simple list.  In the ping
case, we'll never see more than 1.  Its replies will always be
sequential on a TCP connection.

Do we know of any NFS case where we expect any RPC call to return
behind more than 40 other calls on the same connection?

Do we know of any NFS case where we expect to make 40 concurrent
RPC calls on the same connection?

[Remember, we are not talking about client to server calls.  We
are talking about server to client back-channel.]



Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson

[top post]
Matt produced a new tcp-only variant that skipped rpcbind.

I tried it, and immediately got crashes.  So I've pushed out a few
bug fixes.  With my fixes, here are the results on my desktop.

First and foremost, I compared with my prior results against rpcbind,
and they were comparable.

Next I tried the default nfsv3:

rpcping tcp localhost threads=1 count=1500 (port=2049 program=13 version=3 
procedure=0): 5., total 5.
rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 version=3 
procedure=0): 5000., total 1.
rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 version=3 
procedure=0): 4166.6667, total 12500.

rpcping tcp localhost threads=5 count=500 (port=2049 program=13 version=3 
procedure=0): 4469.6970, total 22348.4848
rpcping tcp localhost threads=7 count=500 (port=2049 program=13 version=3 
procedure=0): 3019.9580, total 21139.7059
rpcping tcp localhost threads=10 count=500 (port=2049 program=13 version=3 
procedure=0): 1769.2308, total 17692.3077

Note we are almost entirely bound by Ganesha.  Results are progressively
worse than against rpcbind.

Finally, I tried nfsv4:

rpcping tcp localhost threads=1 count=1500 (port=2049 program=13 version=4 
procedure=0): 25000., total 25000.
rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 version=4 
procedure=0): 13068.1818, total 26136.3636
rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 version=4 
procedure=0): 4000., total 12000.
rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 version=4 
procedure=0): 2743.1290, total 13715.6448

rpcping tcp localhost threads=7 count=500 (port=2049 program=13 version=4 
procedure=0): 2521.0084, total 17647.0588
rpcping tcp localhost threads=10 count=500 (port=2049 program=13 version=4 
procedure=0): 1731.3390, total 17313.3903
rpcping tcp localhost threads=15 count=500 (port=2049 program=13 version=4 
procedure=0): 1142.3732, total 17135.5981


On 3/8/18 8:03 PM, William Allen Simpson wrote:

rpcping tcp localhost threads=3 count=100 (program=10 version=4 
procedure=0): .6667, total 2.
rpcping tcp localhost threads=5 count=100 (program=10 version=4 
procedure=0): 1., total 5.
rpcping tcp localhost threads=7 count=100 (program=10 version=4 
procedure=0): 8571.4286, total 6.
rpcping tcp localhost threads=10 count=100 (program=10 version=4 
procedure=0): 7000., total 7.
rpcping tcp localhost threads=15 count=100 (program=10 version=4 
procedure=0): 5666.6667, total 85000.
rpcping tcp localhost threads=20 count=100 (program=10 version=4 
procedure=0): 3750., total 75000.
rpcping tcp localhost threads=25 count=100 (program=10 version=4 
procedure=0): 2420., total 60500.





Re: [Nfs-ganesha-devel] zero-copy read

2018-03-11 Thread William Allen Simpson

On 3/11/18 7:15 AM, William Allen Simpson wrote:

On 3/10/18 11:18 AM, Matt Benjamin wrote:

Marcus has code that prototypes using gss_iov from mit-krb5 1.1.12.  I
recall describing this to you in 2013.


That would be surprising, as I didn't start working on this project
until a year or so later than that...

Anyway, last year Marcus sent me a link to his prototype.  It's
hopelessly out of date by now.  I'll need to start over.


And it isn't there anymore in any case.  Did somebody do housecleaning?

  g...@github.com:linuxbox2/gss2




Re: [Nfs-ganesha-devel] [nfs-ganesha/ntirpc] rpcping ncreatef (#115)

2018-03-11 Thread William Allen Simpson

On 3/9/18 5:38 PM, Matt Benjamin wrote:

I might be missing something, but it looked to me like the trick to talking to 
nfs-ganesha is to bypass the binder more-or-less as an nfsv4 backchannel does.


I'm not sure this is a good idea, unless we are really desperate for
Ganesha numbers.  I've already shown that we get 85,000 per second in
ntirpc itself, which should be more than enough for now.

The backchannel works because we already have a forward channel.

Let's figure out why rpcbind isn't working on our systems.  Or SE
Linux isn't allowing contact.  Or whatever.

Needs a lot better debugging output than a single message number for
all such problems.  There are sub-message numbers in rpcbind and
getnetconfig and getnetpath, but they aren't being reported.  I've
looked into replacing their annoying thread-local-variables.



I look forward to all the flames.


This doesn't handle IPv6, unlike my previous commit.  That needs to be
fixed before this is approved.  Traditionally (that is ping and
traceroute), we'd have -4 and -6 for 4-only and 6-only respectively.

Presumably we will handle raw interface via rpcbind, as existing.  But
the Usage information is incorrect.




You can view, comment on, or merge this pull request online at:

https://github.com/nfs-ganesha/ntirpc/pull/115


Commit Summary

  * rpcping: permit binding by connected socket (omitting bind)
  * rpcping: hook old and new opts via getopt_long_only


Not a big fan of making every parameter require a - flag.  Prefer
required parameters to be positional.  Obviously the original coder
agreed with me on that point!

Protocol is always required, and makes sense to be the first
positional parameter.

rpcping  []

Because for raw, destination is unused, it should be the last
positional parameter.

If you really must have a flag, traditionally -P for --protocol.

It does make it easier to change some optional parameters without
regard to their order.

You are missing process.  Let's use -s for --process ,
-m for --program .  I'd already specified defaults.

Also -b for --rpcbind -- you have a boolean rpcbind flag, but
nothing to set it yet.

Traditional -p for --port, -c for --count (as-is).

That leaves -t for --threads (as-is), -v for --version (as-is).
I've already specified defaults.
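As a sketch of how those flags could hang together with getopt_long
(hypothetical option table, not the actual rpcping patch; I've used
"procedure" as the name behind -s):

#include <getopt.h>
#include <stdio.h>

static const struct option long_opts[] = {
    { "port",      required_argument, NULL, 'p' },
    { "count",     required_argument, NULL, 'c' },
    { "threads",   required_argument, NULL, 't' },
    { "program",   required_argument, NULL, 'm' },
    { "procedure", required_argument, NULL, 's' },
    { "version",   required_argument, NULL, 'v' },
    { "rpcbind",   no_argument,       NULL, 'b' },
    { NULL, 0, NULL, 0 }
};

int main(int argc, char **argv)
{
    int c;

    while ((c = getopt_long(argc, argv, "p:c:t:m:s:v:b",
                            long_opts, NULL)) != -1) {
        if (c == 'b')
            puts("use rpcbind");
        else
            printf("-%c = %s\n", c, optarg ? optarg : "");
    }
    /* remaining positional arguments: protocol first, destination last */
    for (; optind < argc; optind++)
        printf("positional: %s\n", argv[optind]);
    return 0;
}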

I'm sure there is some package out there that handles "-" and "--".
Use the same as traceroute?



Re: [Nfs-ganesha-devel] zero-copy read

2018-03-11 Thread William Allen Simpson

On 3/10/18 11:18 AM, Matt Benjamin wrote:

Marcus has code that prototypes using gss_iov from mit-krb5 1.1.12.  I
recall describing this to you in 2013.


That would be surprising, as I didn't start working on this project
until a year or so later than that...

Anyway, last year Marcus sent me a link to his prototype.  It's
hopelessly out of date by now.  I'll need to start over.



Re: [Nfs-ganesha-devel] zero-copy read

2018-03-10 Thread William Allen Simpson

On 3/10/18 10:24 AM, William Allen Simpson wrote:

Finally, and what I'll do this weekend, my attempt to edit
xdr_nfs23.c won't pass checkpatch commit, because all the headers
are still pre-1989 pre-ANSI K&R...

Unfortunately, Red Hat Linux doesn't seem to have cproto built-in,
even though it's on the usual https://linux.die.net/man/1/cproto.


Found it, installed it, and it wouldn't work in our complex
everything-is-a-sub-library source structure.  So I did it by regex.

But as I delved deeper, I'll have to make GSS work on vector i-o, as
it currently requires one big buffer input.  This will be awhile.



[Nfs-ganesha-devel] zero-copy read

2018-03-10 Thread William Allen Simpson

Now that DanG has a workable vector i-o for read and write, I'm
trying again to make reading zero-copy.  Man-oh-man, do we have
our work cut out for us...

It seems that currently we provide a buffer to read.  Then XDR
makes a new object, puts headers into it, makes another data_val
and copies data into that, then it is all eventually passed to
ntirpc where a buffer is created and all copied into that.

If GSS, another copy is made.  (This one cannot be avoided.)

So we're copying large amounts of data 4-5 times.   Not counting
whatever the FSAL library call does internally.
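For contrast, the direction I'd like to head: hand the reply header and
the READ payload to the kernel as separate iovec entries, so the payload
is never copied into a contiguous reply buffer first.  A toy sketch
(invented names, not the ntirpc send path):

#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* One writev(), no payload memcpy here. */
static ssize_t send_read_reply(int fd, const void *hdr, size_t hdr_len,
                               const void *data, size_t data_len)
{
    struct iovec iov[2] = {
        { .iov_base = (void *)hdr,  .iov_len = hdr_len  },
        { .iov_base = (void *)data, .iov_len = data_len },
    };

    return writev(fd, iov, 2);
}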

Then there is NFS_LOOKAHEAD_READ, and a nfs_request_lookahead.
Could somebody explain what that is doing?

AFAICT, the only test is in dup_req, and it won't keep the dup_req
"because large, though idempotent".  Isn't a large read exactly
where we'd benefit from holding onto a dup_req?

NFS_LOOKAHEAD_HIGH_LATENCY is never used.

There are a lot of XDR tests for x_public having the pointer to
nfs_request_lookahead, yet setting that pointer is one of the
early things in nfs_worker_thread.c nfs_rpc_process_request().

Finally, and what I'll do this weekend, my attempt to edit
xdr_nfs23.c won't pass checkpatch commit, because all the headers
are still pre-1989 pre-ANSI K&R...

Unfortunately, Red Hat Linux doesn't seem to have cproto built-in,
even though it's on the usual https://linux.die.net/man/1/cproto.



[Nfs-ganesha-devel] nfs_worker_thread never executed code section

2018-03-10 Thread William Allen Simpson

Looking at the reply code just above this, I discovered a never-executed
code section.  I'm surprised the compiler isn't giving a warning.

Checking the Lieb re-indent commit, there was a label handle_err.

I'll go ahead and remove it.  It merely seems to be old logging.
Just a heads-up for somebody to remember, in case this code is needed again.
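As a minimal illustration of why the compiler stays quiet (not the
Ganesha code; as far as I can tell GCC's -Wunreachable-code has been a
no-op for years, while clang's would flag it):

int demo(int dpq_status)
{
    goto freeargs;          /* unconditional on every path */

    dpq_status = -1;        /* dead: was reachable only via "handle_err:" */

freeargs:
    return dpq_status;
}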

02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1411)
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1412)   /* 
Finish any request not already deleted */
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1413)   if 
(dpq_status == DUPREQ_SUCCESS)
00c21a6878 (William Allen Simpson 2015-06-12 07:36:46 -0400 1414)   
dpq_status = nfs_dupreq_finish(&reqdata->r_u.req.svc, res_nfs);
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1415)   goto 
freeargs;
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1416)
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1417)   /* 
Reject the request for authentication reason (incompatible
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1418)* file 
handle) */
822fdbc104 (Frank S. Filz 2014-03-20 15:01:31 -0400 1419)   if 
(isInfo(COMPONENT_DISPATCH) || isInfo(COMPONENT_EXPORT)) {
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1420)   
char dumpfh[1024];
33be2e2c11 (Frank S. Filz 2015-05-19 18:41:15 -0400 1421)
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1422)   
sprint_fhandle3(dumpfh, (nfs_fh3 *) arg_nfs);
822fdbc104 (Frank S. Filz 2014-03-20 15:01:31 -0400 1423)   
LogInfo(COMPONENT_DISPATCH,
66a09b1375 (Frank S. Filz 2017-12-07 11:30:32 -0800 1424)   
"%s Request from host %s V3 not allowed on this export, proc=%"
66a09b1375 (Frank S. Filz 2017-12-07 11:30:32 -0800 1425)   
PRIu32 ", FH=%s",
ba42f03b2c (Frank S. Filz 2014-05-19 14:23:01 -0400 1426)   
progname, client_ip,
eb51d7a97b (William Allen Simpson 2017-01-13 19:19:55 -0500 1427)  
 reqdata->r_u.req.svc.rq_msg.cb_proc, dumpfh);
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1428)   }
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1429)   auth_rc 
= AUTH_FAILED;
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1430)
02526d7325 (Jim Lieb  2013-10-10 20:50:47 -0700 1431)  auth_failure:





Re: [Nfs-ganesha-devel] rpcping

2018-03-08 Thread William Allen Simpson

On 3/8/18 12:33 PM, William Allen Simpson wrote:

Still having no luck.  Instead of relying on RPC itself, checked
with Ganesha about what it registers, and tried some of those.


Without running Ganesha, rpcinfo reports portmapper services by default
on my machine.  Can talk to it via localhost (but not 127.0.0.1 loopback).

bill@simpson91:~/rdma/build_ganesha$ rpcinfo
   program version netid     address                service    owner
    100000    4    tcp6      ::.0.111               portmapper superuser
    100000    3    tcp6      ::.0.111               portmapper superuser
    100000    4    udp6      ::.0.111               portmapper superuser
    100000    3    udp6      ::.0.111               portmapper superuser
    100000    4    tcp       0.0.0.0.0.111          portmapper superuser
    100000    3    tcp       0.0.0.0.0.111          portmapper superuser
    100000    2    tcp       0.0.0.0.0.111          portmapper superuser
    100000    4    udp       0.0.0.0.0.111          portmapper superuser
    100000    3    udp       0.0.0.0.0.111          portmapper superuser
    100000    2    udp       0.0.0.0.0.111          portmapper superuser
    100000    4    local     /run/rpcbind.sock      portmapper superuser
    100000    3    local     /run/rpcbind.sock      portmapper superuser

TCP works.  UDP with the same parameters hangs forever.

tests/rpcping tcp localhost 1 1000 100000 4
rpcping tcp localhost threads=1 count=1000 (program=100000 version=4
procedure=0): 50000.0000, total 50000.0000
tests/rpcping tcp localhost 1 10000 100000 4
rpcping tcp localhost threads=1 count=10000 (program=100000 version=4
procedure=0): 17543.8596, total 17543.8596
tests/rpcping tcp localhost 1 100000 100000 4
^C

What's interesting to me is that 1,000 async calls have much
better throughput (calls per second) than 10,000.  Hard to
say where the bottleneck is without profiling.

100,000 async calls bogs down so long that I gave up.  Same
with 2 threads and 10,000 -- or 3 threads down to 100.

tests/rpcping tcp localhost 2 1000 100000 4
rpcping tcp localhost threads=2 count=1000 (program=100000 version=4
procedure=0): 8333.3333, total 16666.6667
tests/rpcping tcp localhost 2 10000 100000 4
^C

tests/rpcping tcp localhost 3 1000 100000 4
^C
tests/rpcping tcp localhost 3 500 100000 4
^C
tests/rpcping tcp localhost 3 100 100000 4
rpcping tcp localhost threads=3 count=100 (program=100000 version=4
procedure=0): 6666.6667, total 20000.0000
tests/rpcping tcp localhost 5 100 100000 4
rpcping tcp localhost threads=5 count=100 (program=100000 version=4
procedure=0): 10000.0000, total 50000.0000
tests/rpcping tcp localhost 7 100 100000 4
rpcping tcp localhost threads=7 count=100 (program=100000 version=4
procedure=0): 8571.4286, total 60000.0000
tests/rpcping tcp localhost 10 100 100000 4
rpcping tcp localhost threads=10 count=100 (program=100000 version=4
procedure=0): 7000.0000, total 70000.0000
tests/rpcping tcp localhost 15 100 100000 4
rpcping tcp localhost threads=15 count=100 (program=100000 version=4
procedure=0): 5666.6667, total 85000.0000
tests/rpcping tcp localhost 20 100 100000 4
rpcping tcp localhost threads=20 count=100 (program=100000 version=4
procedure=0): 3750.0000, total 75000.0000
tests/rpcping tcp localhost 25 100 100000 4
rpcping tcp localhost threads=25 count=100 (program=100000 version=4
procedure=0): 2420.0000, total 60500.0000

Note that 5 threads and 100 catches up to 1 thread and 1,000?

So the bottleneck is probably in ntirpc.  That seems validated by 7 to
25 threads; portmapper will handle more requests (with diminishing
returns), but ntirpc cannot handle more results (on the same thread).

Oh well, against nfs-ganesha still doesn't work.

tests/rpcping tcp localhost 1 10 100003 4
clnt_ncreate failed: RPC: Unknown protocol
tests/rpcping tcp localhost 1 10 100003 3
clnt_ncreate failed: RPC: Unknown protocol

But it's in the rpcinfo:

   program version netid     address                service    owner
    100000    4    tcp6      ::.0.111               portmapper superuser
    100000    3    tcp6      ::.0.111               portmapper superuser
    100000    4    udp6      ::.0.111               portmapper superuser
    100000    3    udp6      ::.0.111               portmapper superuser
    100000    4    tcp       0.0.0.0.0.111          portmapper superuser
    100000    3    tcp       0.0.0.0.0.111          portmapper superuser
    100000    2    tcp       0.0.0.0.0.111          portmapper superuser
    100000    4    udp       0.0.0.0.0.111          portmapper superuser
    100000    3    udp       0.0.0.0.0.111          portmapper superuser
    100000    2    udp       0.0.0.0.0.111          portmapper superuser
    100000    4    local     /run/rpcbind.sock      portmapper superuser
    100000    3    local     /run/rpcbind.sock      portmapper superuser
    100003    3    udp       0.0.0.0.8.1            nfs        superuser
    100003    3    udp6      :::0.0.0.0.8.1

[Nfs-ganesha-devel] rpcping

2018-03-08 Thread William Allen Simpson

Still having no luck.  Instead of relying on RPC itself, checked
with Ganesha about what it registers, and tried some of those.

The default procedure is 0, which according to every RFC is reserved to
do nothing.  But rpcbind is not finding the program and version.

To be honest, I'm not sure how this can ever be used for real testing.
According to the documentation:

3.3.0 Procedure 0: NULL - Do nothing
...
  Since the NULL procedure takes no NFS version 3 protocol
  arguments and returns no NFS version 3 protocol response,
  it can not return an NFS version 3 protocol error.

But we need some kind of response for timing!
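
The reply itself is the response we time: procedure 0 carries no arguments
and no results, but the client still blocks until the accepted reply comes
back.  A minimal synchronous sketch against the classic TI-RPC client API
(clnt_create()/clnt_call() with xdr_void) -- not the ntirpc rpcping program,
and with the program/version numbers below purely illustrative -- looks like:

#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <rpc/rpc.h>

/* Hypothetical program/version for illustration; a real run would use
 * whatever the server registered with rpcbind (e.g. 100003/3 for NFS). */
#define PING_PROG 100000
#define PING_VERS 4

int main(int argc, char *argv[])
{
	const char *host = argc > 1 ? argv[1] : "localhost";
	struct timeval tout = { 5, 0 };
	struct timespec t0, t1;
	int i, count = 1000;
	CLIENT *clnt;

	clnt = clnt_create(host, PING_PROG, PING_VERS, "tcp");
	if (clnt == NULL) {
		clnt_pcreateerror("clnt_create");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < count; i++) {
		/* NULLPROC (procedure 0): no args, no results, but the
		 * accepted reply is the event we time. */
		if (clnt_call(clnt, NULLPROC,
			      (xdrproc_t) xdr_void, NULL,
			      (xdrproc_t) xdr_void, NULL, tout) != RPC_SUCCESS) {
			clnt_perror(clnt, "clnt_call");
			break;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%d calls in %.3f s\n", i,
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	clnt_destroy(clnt);
	return 0;
}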

bill@simpson91:~/rdma/ntirpc$ cd ../build~ntirpc/
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping
Usage: rpcping[ [ [ 
[

NFS

bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp localhost 1 10 100003 2
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp localhost 1 10 100003 3
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp localhost 1 10 100003 4
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100003 4
clnt_ncreate failed: RPC: Unknown protocol


Mount

bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100005 4
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100005 3
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100005 2
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100005 1
clnt_ncreate failed: RPC: Unknown protocol


rpcbind itself

bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100000 1
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100000 2
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100000 3
clnt_ncreate failed: RPC: Unknown protocol
bill@simpson91:~/rdma/build~ntirpc$ tests/rpcping tcp 127.0.0.1 1 10 100000 4
clnt_ncreate failed: RPC: Unknown protocol



Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson

On 2/24/18 5:18 AM, William Allen Simpson wrote:

On 2/24/18 4:42 AM, William Allen Simpson wrote:

[top post for visibility]

Says ntirpc pullup (twice), but doesn't actually have:
  * "Pullup NTIRPC through #106"

Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd
agreed this was the best time to do it, and it had all the expected
+1 and +2.  Literally no changes to running code, so why?


So I rebased on this tag, and our CI compile fails:

error: Failed build dependencies:
 libcephfs-devel >= 10.2.0 is needed by 
nfs-ganesha-2.7-dev.1.el7.centos.x86_64


But success for CI CentOS cthon04 and CEA 9P and pynfs!

So who is able to update the trigger-fsal_gluster CI?

Do we need a separate trigger-fsal_ceph CI?



Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson

On 2/24/18 4:42 AM, William Allen Simpson wrote:

[top post for visibility]

Says ntirpc pullup (twice), but doesn't actually have:
  * "Pullup NTIRPC through #106"

Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd
agreed this was the best time to do it, and it had all the expected
+1 and +2.  Literally no changes to running code, so why?


So I rebased on this tag, and our CI compile fails:

error: Failed build dependencies:
libcephfs-devel >= 10.2.0 is needed by 
nfs-ganesha-2.7-dev.1.el7.centos.x86_64




Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson

[top post for visibility]

Says ntirpc pullup (twice), but doesn't actually have:
 * "Pullup NTIRPC through #106"

Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd
agreed this was the best time to do it, and it had all the expected
+1 and +2.  Literally no changes to running code, so why?

On 2/23/18 7:51 PM, Frank Filz wrote:

Branch next

Tag:V2.7-dev.1

NOTE: This merge includes an ntirpc pullup, please update your submodule

Release Highlights

* ntirpc pullup

* NFS4: allow reads and writes against delegations that have been

   recalled but not yet returned

* A bunch of rados stuff relating to config and recovery

* spec and cmake fixes

* Set op_ctx when reverting or failing exports

* Moving NFSv4 to POSIX ACL mapping code to a common code base

* fsal_rgw: support directory object as an export

Signed-off-by: Frank S. Filz 

Contents:

981cb36 Frank S. Filz V2.7-dev.1

9cfc995 taoCH fsal_rgw: support directory object as an export

312d583 Sriram Patil Moving NFSv4 to POSIX ACL mapping code to a common code 
base

c030a6c Daniel Gryniewicz Set op_ctx when reverting or failing exports

a91d57d Kaleb S. KEITHLEY build: compile conf_lex.c with _GNU_SOURCE to get 
caddr_t definition

a664baa Jeff Layton spec: allow packagers to remove dependency on rpcbind

1e9d029 Jeff Layton MainNFSD: invert _NO_PORTMAPPER option and rename to RPCBIND

daa6703 Jeff Layton specfile: fix libcephfs-devel and librgw-devel BuildRequires

f6d1f74 Jeff Layton doc: fix typo in ganesha-config manpage

45e9340 Jeff Layton RADOS_URLS: enable them by default

b9ddb96 Jeff Layton RADOS_URLS: allow the RADOS_URLS config block to be optional

30d9cb2 Jeff Layton SAL: consolidate rados cluster connect code

1802f57 Jeff Layton rados_ng: make a bunch of symbols static

67ec3db Jeff Layton NFS4: allow reads and writes against delegations that have 
been recalled but not yet returned





Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Remove unused fsal_read2() and fsal_write2()

2018-02-23 Thread William Allen Simpson

On 2/22/18 1:32 PM, GerritHub wrote:

Daniel Gryniewicz has uploaded this change for *review*.

View Change 

Remove unused fsal_read2() and fsal_write2()


I've reviewed, but the write showed up in my inbox before the read,
followed by this cleanup.  But the write changed things added by
the read changes.  And most of this cleanup was added by one of the
other patches.  Stuff appearing and then disappearing was confusing
to me.  Probably others, too.

So I've recommended squashing the 3 patches.  Will make future
bisection easier.  Not that we'll ever need it, as otherwise
this seems like good work.



Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson

On 2/21/18 11:35 AM, Frank Filz wrote:

There's a -n or --no-verify option that will bypass the commit hooks. I suggest 
trying to commit without that first to make sure the only checkpatch 
errors/warnings are for the spacing around * and then commit again with -n to 
bypass checkpatch to actually commit.


Boy oh boy, I wish I'd known about that!!!

So that's what I've done on the nfs41.h formatting.

I've also removed the K definitions that checkpatch hates.



I suppose now that it seems to complain either way, we can fix them
all to not have the space, and just force the commit and ignore the
checkpatch warnings (when I see ONLY warnings for things I know we
can't/won't fix in gerrithub, I ignore them and merge anyway). I'm not
sure how much of a potential merge headache changing them all would
cause though. I don't think nfsv41.h gets changed all that much so
even if we had to backport something, the potential manual merge wouldn't

be awful.



New -dev cycle, so it's time.


Yes, this is the right time to do it.


[...] > I sympathize with your grumbling about this one... I've chosen
the set of checkpatch checks to enable based on what seems to work
best for our code and keeps us as consistent style as possible. I
realize some folks have preferences opposite some of the checks, but
our code style is now what it is...


Or not, as in this case.

Anyway, I'll try to push a patch for those 2 files by tomorrow.


Ok, thanks


Turned out nfs23.h hadn't been updated into inlines for performance,
so doesn't seem to have these strange errors.  Hopefully nobody else
will want to patch nfs41.h this week.



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: MainNFSD: invert _NO_PORTMAPPER option

2018-02-21 Thread William Allen Simpson

On 2/21/18 4:51 PM, Jeff Layton wrote:

On Wed, 2018-02-21 at 13:40 -0800, Frank Filz wrote:

We could take this opportunity to change the option to RPCBIND...



Fair enough.


I'd support this.



I actually disagree with the "no udp" statement above too. UDP is great
for single-shot request protocols like rpcbind, and the NFS client will
use it. DDoS is a possibility, but who exposes their rpcbind port to the
Internet?


Unfortunately, millions of websites.  At one time, portmapper was a
leading method of DDoS.

Actually, it's *NOT* great.  When Ganesha/ntirpc cannot find something,
it drops back from TCP to UDP.  And then tries over and over into the
void.  There's no return signal from UDP.

When the TCP service isn't available, you get a nice RST flag.  No need
for all these retry timeouts that UDP requires.
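
To make the contrast concrete, here is a hedged little demo (not Ganesha
code; it assumes nothing is listening on loopback port 2049): the TCP
connect() fails immediately because of the RST, while the UDP sendto()
happily "succeeds" and the caller only learns anything by waiting out a
timeout.

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
	struct sockaddr_in sa;
	int tfd, ufd;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(2049);		/* assumption: port is closed */
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	tfd = socket(AF_INET, SOCK_STREAM, 0);
	if (connect(tfd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
		printf("TCP: immediate failure: %s\n", strerror(errno));

	ufd = socket(AF_INET, SOCK_DGRAM, 0);
	if (sendto(ufd, "ping", 4, 0, (struct sockaddr *)&sa, sizeof(sa)) == 4)
		printf("UDP: send \"succeeded\"; only a receive timeout tells us otherwise\n");

	close(tfd);
	close(ufd);
	return 0;
}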

UDP turned out to be a security nightmare for NFS.  We all remember the
IP fragmentation DDoS?

That's why we tried (circa 1992) to eliminate IP fragmentation in IPv6.
Steve Deering was all over this.  DNS and NFS were the big culprits,
and NFS over UDP yields far bigger IP fragment chains than DNS



In any case, the real fix to this issue is to move to protocols that
don't require rpcbind at all. That means NFSv4.0 at a minimum (though
obviously v4.1+ would be preferred).


Ah, you're speaking to my heart.  But we apparently still have a lot
of UDP downstream, and now FSAL_PROXY.

When will we ever get away from the sins of our fathers, unto the 7th
generation?



Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson

On 2/21/18 8:28 AM, William Allen Simpson wrote:

Anyway, I'll try to push a patch for those 2 files by tomorrow.


ERROR: "foo * bar" should be "foo *bar"
#18656: FILE: src/include/nfsv41.h:9900:
+static inline bool xdr_CB_COMPOUND4res(XDR * xdrs,

total: 450 errors, 14 warnings, 18658 lines checked

If those are fixed, we have remaining:

ERROR: need consistent spacing around '*' (ctx:WxV)
#4757: FILE: src/include/nfsv41.h:3696:
+extern COMPOUND4res *nfsproc4_compound_4(COMPOUND4args *, CLIENT *);
 ^

ERROR: need consistent spacing around '*' (ctx:WxV)
#4795: FILE: src/include/nfsv41.h:3719:
+extern CB_COMPOUND4res *cb_compound_1(CB_COMPOUND4args *, CLIENT *);
^

total: 7 errors, 14 warnings, 18659 lines checked


And no way around many of them, since fixing this:

WARNING: Avoid multiple line dereference - prefer 
'objp->nad_new_entry_cookie.nad_new_entry_cookie_val'
#13467: FILE: src/include/nfsv41.h:9151:
+  (char **)&objp->nad_new_entry_cookie.
+  nad_new_entry_cookie_val,

WARNING: Avoid multiple line dereference - prefer 
'objp->nad_new_entry_cookie.nad_new_entry_cookie_len'
#13469: FILE: src/include/nfsv41.h:9153:
+  (u_int *) &objp->nad_new_entry_cookie.
+  nad_new_entry_cookie_len, 1, sizeof(nfs_cookie4),

changes to a complaint about too long line:

WARNING: line over 80 characters
#13466: FILE: src/include/nfsv41.h:9150:
+  (char **)&objp->nad_new_entry_cookie.nad_new_entry_cookie_val,

WARNING: line over 80 characters
#13467: FILE: src/include/nfsv41.h:9151:
+  (u_int *) &objp->nad_new_entry_cookie.nad_new_entry_cookie_len,

So un-indented the whole thing to make more room.

But even that won't fix:

WARNING: line over 80 characters
#9587: FILE: src/include/nfsv41.h:6149:
+  &objp->open_none_delegation4_u.ond_server_will_signal_avail))

WARNING: line over 80 characters
#11239: FILE: src/include/nfsv41.h:7261:
+  &objp->GET_DIR_DELEGATION4res_non_fatal_u.gddrnf_will_signal_deleg_avail))


There are irredeemable errors, because checkpatch won't allow K&R C,
but the #ifdef __STDC__ #else doesn't actually compile these lines:

ERROR: Bad function definition - int nfs4_program_4_freeresult() should 
probably be int nfs4_program_4_freeresult(void)
#4773: FILE: src/include/nfsv41.h:3708:
+extern int nfs4_program_4_freeresult();

ERROR: Bad function definition - int nfs4_callback_1_freeresult() should 
probably be int nfs4_callback_1_freeresult(void)
#4812: FILE: src/include/nfsv41.h:3731:
+extern int nfs4_callback_1_freeresult();

total: 4 errors, 2 warnings, 18577 lines checked

Would have been much easier to turn off this pointless check!!!
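
For anyone puzzled by that last pair: in C an empty parameter list is an
old-style (K&R) declaration that says nothing about the arguments, which is
what the rpcgen-emitted #ifdef __STDC__/#else block leaves behind and what
checkpatch objects to.  With made-up names (not the real prototypes):

/* K&R-style declaration: an empty list says nothing about the parameters,
 * so the compiler cannot check calls against it. */
extern int old_style_freeresult();

/* C89+ prototype: explicitly "no parameters"; mismatched calls are
 * diagnosed.  This is the form checkpatch is asking for. */
extern int new_style_freeresult(void);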



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: MainNFSD: invert _NO_PORTMAPPER option

2018-02-21 Thread William Allen Simpson

On 2/21/18 1:59 PM, GerritHub wrote:

Jeff Layton has uploaded this change for *review*.

View Change 

MainNFSD: invert _NO_PORTMAPPER option

The fact that this is a "negative" option is confusing. Change it
to a "PORTMAPPER" option, and have it default to ON.


While I vaguely agree with the former in principle, in this day and age
we really should stop using the name PORTMAPPER.  Replaced by rpcbind a
long time ago, and shouldn't be shipping with modern systems.

In ntirpc, PORTMAP is as expected the old version 2 UDP-only call.  We
should kill it.

We really shouldn't encourage folks to use a UDP system that has long
had known DDoS attacks.

And we really should be migrating from NFS 2 UDP to NFS 3 TCP, as a
minimum supported version

Also, we have talked about adding rpcbind itself to Ganesha or ntirpc.



Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson

On 2/20/18 1:06 PM, Frank Filz wrote:

As I'm trying to update nfs41.h, I've run into the problem that the commit
check is complaining that the pointer '*' on parameters is sometimes " * v"
and others " *v" -- usually the same function definition.

Presumably the generator made these.  They are cosmetic.

Why oh why are we checking this now, after all these years?

Do I need to make a pass fixing all these header files before doing real
coding?


Or can we turn off this silly cosmetic check?


The problem is that checkpatch gets confused between multiplication and
pointer. Shame on C for overloading *...

The problem is that checkpatch doesn't recognize, for example, SVCXPRT as a
type, so it thinks SVCXPRT *xprt is a multiplication rather than a
declaration.

The number of places this causes problems is tiny (and mostly confined to
nfsv41.h and nfs23.h) that I have just decided to live with SVCXPRT * xprt.


Well, I cannot, because it won't let me commit anything in those files.



[...]
I suppose now that it seems to complain either way, we can fix them all to
not have the space, and just force the commit and ignore the checkpatch
warnings (when I see ONLY warnings for things I know we can't/won't fix in
gerrithub, I ignore them and merge anyway). I'm not sure how much of a
potential merge headache changing them all would cause though. I don't think
nfsv41.h gets changed all that much so even if we had to backport something,
the potential manual merge wouldn't be awful.


New -dev cycle, so it's time.



[...] > I sympathize with your grumbling about this one... I've chosen the set 
of
checkpatch checks to enable based on what seems to work best for our code
and keeps us as consistent style as possible. I realize some folks have
preferences opposite some of the checks, but our code style is now what it
is...


Or not, as in this case.

Anyway, I'll try to push a patch for those 2 files by tomorrow.



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-21 Thread William Allen Simpson

I really don't have time today to respond to every one-liner
throw-away comment here, so I'll try to stick to the most cogent.

On 2/20/18 8:33 AM, Matt Benjamin wrote:

On Tue, Feb 20, 2018 at 8:12 AM, William Allen Simpson
<william.allen.simp...@gmail.com> wrote:

On 2/18/18 2:47 PM, Matt Benjamin wrote:


On Fri, Feb 16, 2018 at 11:23 AM, William Allen Simpson


But the planned 2.7 improvements are mostly throughput related, not IOPS.


Not at all, though I am trying to ensure that we get async FSAL ops
in.  There are people working on IOPs too.


async FSAL ops are not likely to further improve IOPS.

As I've pointed out many times in the past, async only allows
the same number of threads to handle more concurrent operations.

But it's actually slower.  It basically doubles the number of
system calls.  System calls are one of the reasons that Ganesha is
slower than knfsd.  It also ruins CPU cache coherency.

If we're CPU bound or network bandwidth constrained, it won't help.


Not everything being worked on is a direct answer to this problem.


Anything that isn't about protocol correctness or performance
should be lower priority.

Async FSAL ops could -- in a few cases where network or disk spinning
dominate the thread timing -- improve thread utilization.  But it
will never improve IOPS.



The effort that I've been waiting for *3* years -- add io vectors to
the FSAL interface, so that we can zero-copy buffers -- is more
likely to improve throughput.


Which, as you've noted, also is happening.


Finally.  Again, I've been asking for it now 3 years.  For V2.7, this is
what I anticipate will result in the most performance increase.



Moreover, going through the code and removing locking and other such
bottlenecks is more likely to improve IOPS.


No one is disputing this.  It is not a new discovery, however.


And yet we seem to have to keep reminding folks.  I remember that
Malahal did a survey of lock timing/congestion a couple of years ago.

Perhaps he's willing to run it again?



If Ganesha is adding 6 ms to every read operation, we have a serious
problem, and need to profile immediately!



That's kind of what our team is doing.  I look forward to your work
with rpc-ping-mt.


Well, you were able to send me a copy after 6 pm on a Friday, so I'm
now taking a look at it.  Hopefully I'll have something by Friday.


It came up in a meeting on Friday, that's why I sent it Friday.  I got
busy in meetings and issues, that's why after 6 pm.



But I really wish you'd posted it 3 years ago.  It doesn't really test
IOPS, other than whatever bandwidth limit is imposed by the interface,
but it does test the client call interface.


It measures minimal request latency all the way to nfs-ganesha's, or
the Linux kernel's, rpcnull handler--the "top" of the request handling
stack, in the given client/server/network configuration.  Scaled up,
it can report the max such calls/s, which is a kind of best-possible
value for iops, taking FSAL ops to have 0 latency.


As I've mentioned elsewhere, this should be entirely dominated by the
link speed and protocol.  We should see UDP as slowest, TCP in the
middle, and RDMA as fastest.

OTOH, the "max such calls/s" would be reported by using XDR_RAW, which
is currently not working.



It was posted to this list by Tigran, iirc, in 2011 or 2012.


In which case, I really wish Tigran had put it in the tests folder,
so we'd have been continuously using and updating it.  The gtests
are a great improvement.  Now all we need to do is run them every
week for each new -dev release to track improvements/regressions.

I didn't know about this old test, because my archives only go back
to late 2014.  I wasn't involved before then.

Moreover, I didn't appreciate being criticized for not running an
old test that wasn't in the tree and wasn't maintained.



We've been using Jeff Layton's delegation callback work to test, and
this test would have been better and easier.

But a unit test is not what we need.  I wrote "profile".  We need to
know where the CPU bottlenecks are in Ganesha itself.


You also wrote unit test.


Looking back over this thread, I don't see those words.

I wrote "profile".



[Nfs-ganesha-devel] gtest results? profiles?

2018-02-20 Thread William Allen Simpson

Now that we have a pile of nice gtests, who is compiling the results?
Please post them here

Also, DanG told me yesterday that he has a profile of the lookup
test.  Please post that here.  That will allow us to better target
the CPU bottlenecks.



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-20 Thread William Allen Simpson

On 2/18/18 2:47 PM, Matt Benjamin wrote:

On Fri, Feb 16, 2018 at 11:23 AM, William Allen Simpson

But the planned 2.7 improvements are mostly throughput related, not IOPS.


Not at all, though I am trying to ensure that we get async FSAL ops
in.  There are people working on IOPs too.


async FSAL ops are not likely to further improve IOPS.

As I've pointed out many times in the past, async only allows
the same number of threads to handle more concurrent operations.

But it's actually slower.  It basically doubles the number of
system calls.  System calls are one of the reasons that Ganesha is
slower than knfsd.  It also ruins CPU cache coherency.

If we're CPU bound or network bandwidth constrained, it won't help.

The effort that I've been waiting for *3* years -- add io vectors to
the FSAL interface, so that we can zero-copy buffers -- is more
likely to improve throughput.

Moreover, going through the code and removing locking and other such
bottlenecks is more likely to improve IOPS.



If Ganesha is adding 6 ms to every read operation, we have a serious
problem, and need to profile immediately!



That's kind of what our team is doing.  I look forward to your work
with rpc-ping-mt.


Well, you were able to send me a copy after 6 pm on a Friday, so I'm
now taking a look at it.  Hopefully I'll have something by Friday.

But I really wish you'd posted it 3 years ago.  It doesn't really test
IOPS, other than whatever bandwidth limit is imposed by the interface,
but it does test the client call interface.

We've been using Jeff Layton's delegation callback work to test, and
this test would have been better and easier.

But a unit test is not what we need.  I wrote "profile".  We need to
know where the CPU bottlenecks are in Ganesha itself.



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL

2018-02-20 Thread William Allen Simpson

On 2/19/18 12:12 PM, Sachin Punadikar wrote:

Hi Bill,
I rechecked the logs & discussed with Daniel. I missed seeing the log entries
related to FSAL.
So for this customer it looks like an FSAL issue rather than a Ganesha issue.


Thanks for the update.  The default log levels don't always show enough.





Re: [Nfs-ganesha-devel] READDIR doesn't return all entries.

2018-02-19 Thread William Allen Simpson

On 2/13/18 8:00 PM, Frank Filz wrote:

You still don’t mention FSAL…

I’m suspecting non-unique cookies from the FSAL as a cause. You may want to turn on CACHE_INODE and NFS_READDIR to FULL_DEBUG to see what is going on. A tcpdump trace won’t show anything useful (since we won’t see what cookies are being provided for the 
missing directory entries).



It's been several days, and nobody has been able to reproduce.

Please send your entire configuration.



Re: [Nfs-ganesha-devel] testing

2018-02-18 Thread William Allen Simpson

On 2/15/18 1:17 PM, Frank Filz wrote:

Between your test message and this test message, I've received 5
messages with Subject: ACL support.

My Friday messages do not yet appear in the list archive:

List-Archive: 




[Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-18 Thread William Allen Simpson

As I'm trying to update nfs41.h, I've run into the problem
that the commit check is complaining that the pointer '*' on
parameters is sometimes " * v" and others " *v" -- usually
the same function definition.

Presumably the generator made these.  They are cosmetic.

Why oh why are we checking this now, after all these years?

Do I need to make a pass fixing all these header files
before doing real coding?

Or can we turn off this silly cosmetic check?



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL

2018-02-18 Thread William Allen Simpson

On 2/15/18 6:44 AM, GerritHub wrote:

Sachin Punadikar has uploaded this change for *review*.

View Change 

"mdc_lookup" do not dispatch to FSAL


Are you sure?  Do you have an actual reproducible error case?



"mdc_lookup" function first attempts to get the entry from cache
via function "mdc_try_get_cached". On getting an ESTALE error, it
should dispatch to FSAL, but was again calling "mdc_try_get_cached".
Rectified code to make call to "mdc_lookup_uncached", so FSAL code
gets invoked.


I'm not the mdcache expert, but don't think this is correct.  The
comments already explain.

It tries under read lock (fastest).  If stale, it write locks and
tries again.  If still fails, at the uncached label, then it does
the mdc_lookup_uncached().

mdc_try_get_cached() is likely faster than mdc_lookup_uncached().



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-18 Thread William Allen Simpson

On 2/14/18 8:32 AM, Daniel Gryniewicz wrote:

How many clients are you using?  Each client op can only (currently) be handled 
in a single thread, and clients won't send more ops until the current one is 
ack'd, so Ganesha can basically only parallelize on a per-client basis at the 
moment.


Actually, 2.6 should handle as many concurrent client requests as you like.
(Up to 250 of them.)  That's one of its features.

The client is not sending concurrent requests.



I'm sure there are locking issues; so far we've mostly worked on correctness 
rather than performance.  2.6 has changed the threading model a fair amount, 
and 2.7 will have more improvements, but it's a slow process.


But the planned 2.7 improvements are mostly throughput related, not IOPS.



On 02/13/2018 06:38 PM, Deepak Jagtap wrote:

Yeah user-kernel context switching is definitely adding up latency, but I 
wonder ifrpc or some locking overhead is also in the picture.


ifrpc?



With 70% read 30% random workload nfs ganesha CPU usage was close to 170% while 
remaining 2 cores were pretty much unused (~18K IOPS, latency ~8ms)

With 100% read 30% random nfs ganesha CPU usage ~250% ( ~50K IOPS, latency 
~2ms).


Those latency numbers seem suspect to me.  The dominant latency should be
the file system.  The system calls shouldn't add more than microseconds.

If Ganesha is adding 6 ms to every read operation, we have a serious
problem, and need to profile immediately!



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-13 Thread William Allen Simpson

On 2/13/18 1:21 AM, Malahal Naineni wrote:

If your latency is high, then you most likely need to change 
Dispatch_Max_Reqs_Xprt. What your Dispatch_Max_Reqs_Xprt value?


That shouldn't do anything anymore in V2.6, other than 9P.



Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-07 Thread William Allen Simpson

On 2/6/18 10:40 AM, Daniel Gryniewicz wrote:

On 02/06/2018 10:26 AM, William Allen Simpson wrote:

On 2/6/18 8:25 AM, Daniel Gryniewicz wrote:

Hi, all.

I've worked up a sample API for async/vector for FSAL ops.  The example op is read(), and I've "implemented" it for all FSALs, so that I can verify that it does, in fact, work for some definition of work. 


I'm a bit surprised it works, as the alloca needs the sizeof struct PLUS the
sizeof iov * iov_count.  Right now it's writing off the end.


I believe the empty array construct includes a size of 1.  If I'm wrong, then 
it's an easy fix (and this code will go away anyway, and never be committed).


No, it's zero.  Yes, an easy fix.

I'm assuming this code will be committed sometime in the near future.
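
For reference, the sizing rule being discussed is the usual
flexible-array-member one; a hedged sketch (invented struct name, not the
actual patch) of an allocation that leaves room for the vector:

#include <stdlib.h>
#include <sys/uio.h>

/* Hypothetical argument block for an async read; the real patch may differ.
 * The iov[] flexible array member contributes 0 bytes to sizeof(), so space
 * for the vector must be added explicitly. */
struct read_cb_arg {
	size_t iov_count;
	struct iovec iov[];	/* flexible array member: sizeof() sees 0 */
};

static struct read_cb_arg *alloc_read_arg(size_t iov_count)
{
	struct read_cb_arg *arg;

	arg = calloc(1, sizeof(*arg) + iov_count * sizeof(arg->iov[0]));
	if (arg != NULL)
		arg->iov_count = iov_count;
	return arg;
}

The same total size applies whether the block comes from alloca() or the
heap.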



"asynchronous: has an 'h' in it.

"it's" means "it is".  Most places should be "its".

To be async, need to move the status return into the arg struct, and pass
NULL for the caller's parameter at the top level.


Return is its own argument to the callback.



I'd prefer to have the new struct contain all the common arguments.

Every level needs to be able to set the status, so putting the result in
the struct makes the code cleaner than copying stuff in every wrapper.



Why not move the other arguments into the struct?
   * bypass
   * state
   * offset


Because those are pure in arguments, and were unchanged, so minimal code 
changes.  The iov was put into the arg to avoid multiple mallocs, and I put 
iov_count with iov.  The rest are out arguments.


Obviously, all [in] arguments can be in the struct.  Set and forget once
at the top

Even [out] pointer arguments can be in the struct.

Removing long parameter lists makes the code cleaner (and faster).

And some of this will need to be done to remove op_ctx dependencies.
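
To make the suggestion concrete, here is a hedged sketch (invented names,
not the proposed FSAL API) of folding the [in] arguments and the status into
one block, with a single callback typedef that read and write could share:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

struct fsal_obj_handle;			/* opaque here */

/* Hypothetical common argument block for async read/write. */
struct fsal_cb_arg {
	/* [in] set once by the caller */
	bool bypass;
	void *state;
	uint64_t offset;
	size_t iov_count;
	/* [out] filled in by the FSAL before the callback fires */
	int status;
	size_t io_amount;
	bool end_of_file;
	struct iovec iov[];		/* flexible array, as above */
};

/* One callback shape for both directions. */
typedef void (*fsal_cb)(struct fsal_obj_handle *obj,
			struct fsal_cb_arg *arg, void *caller_data);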



Also it will be the same for write, so we can just name it
struct fsal_cb_arg -- and the function typedef fsal_cb to match.


It may be.  I didn't look at write, this is a proof-of-concept for read, and 
not in any way intended to be final.


Yeah, as we talked earlier, I was looking at the bigger picture.  This is a
nice clean proof-of-concept.  I like it.  Now I'm talking details.




Why not get rid of fsal_read2(), and call the function directly in
the 3 places it's used?


I'm considering it.  That was a good point to break this particular 
proof-of-concept, but much must change for async to be plumbed all the way back 
to the top of the protocols.


Yes, again I'm forward looking.  If we do it now, then we don't have to
undo anything later.  Makes the patches easier to understand.



[Nfs-ganesha-devel] V2.6 fixed UDP address and port (and more)

2018-02-06 Thread William Allen Simpson

For a long time, there have been problems with UDP.  It has not been a
priority, under the assumption that most folks have moved to TCP.  And
it wasn't tested much.

Malahal tried a quick and dirty fix with a copy of the IP address in
each service request structure.  But all UDP requests were using the
same service transport; the copy there was constantly being munged, and
that's the copy the standard functions and macros referenced.

Also, all UDP connections used the same buffer for input.  *AND* output.
There was much lock wrangling.

In V2.6, each UDP request gets a new clone of the service transport, with
its own buffer (similar to the technique that Dominique did for RDMA).
Now they can be processed in parallel.  No locking in the input path.
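
The idea, as a hedged sketch rather than the ntirpc code itself: every
datagram lands in a freshly allocated request that owns its buffer and its
copy of the peer address, so nothing downstream has to lock the shared
listening transport.

#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical per-request clone; ntirpc's real svc_dg code differs. */
struct udp_request {
	struct sockaddr_storage peer;	/* private copy of the source address */
	socklen_t peer_len;
	ssize_t len;
	unsigned char buf[65536];	/* private receive buffer */
};

static struct udp_request *udp_recv_one(int fd)
{
	struct udp_request *req = calloc(1, sizeof(*req));

	if (req == NULL)
		return NULL;
	req->peer_len = sizeof(req->peer);
	req->len = recvfrom(fd, req->buf, sizeof(req->buf), 0,
			    (struct sockaddr *)&req->peer, &req->peer_len);
	if (req->len < 0) {
		free(req);
		return NULL;		/* caller rearms/retries */
	}
	/* req now owns its buffer and address: it can be decoded and
	 * answered on any thread without locking the listening socket. */
	return req;
}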

If you are seeing UDP crashes with odd IP addresses in previous versions,
now you know the answer is to update to V2.6 downstream.

Guaranteed at least some increase in performance!

This was a significant re-write, not easy to backport.  Please test.
Please



Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-06 Thread William Allen Simpson

On 2/6/18 8:25 AM, Daniel Gryniewicz wrote:

Hi, all.

I've worked up a sample API for async/vector for FSAL ops.  The example op is read(), and I've "implemented" it for all FSALs, so that I can verify that it does, in fact, work for some definition of work.  


I'm a bit surprised it works, as the alloca needs the sizeof struct PLUS the
sizeof iov * iov_count.  Right now it's writing off the end.

"asynchronous: has an 'h' in it.

"it's" means "it is".  Most places should be "its".

To be async, need to move the status return into the arg struct, and pass
NULL for the caller's parameter at the top level.

Why not move the other arguments into the struct?
  * bypass
  * state
  * offset

Also it will be the same for write, so we can just name it
struct fsal_cb_arg -- and the function typedef fsal_cb to match.

Why not get rid of fsal_read2(), and call the function directly in
the 3 places it's used?

Anyway, a good effort.  I see how you've wrapped for stacking.  Thanks!



Re: [Nfs-ganesha-devel] Features board list

2018-02-01 Thread William Allen Simpson

On 2/1/18 8:04 AM, Supriti Singh wrote:

It seems like github does not support organization level board to be visible.
https://github.com/isaacs/github/issues/935 :/ 


Let's just keep this design.  If github already knows about the issue,
then maybe they'll fix it.

My problem is that I only rarely log in, perhaps once a week, whenever
I'm making a ntirpc pullup request.

Unlike gerrithub, github handles email replies. So github is better
for workflow.



I forgot to ask first, but is there already a trello board for nfs-ganesha? I 
can see mention of nfs-ganesha in
gluster[1] and ceph[2][3] trello board. If yes and if that can be made public, 
then we would not need github project
boards.

[1] https://trello.com/c/AhGWdQYh/104-protocol-support-nfs-ganesha-samba
[2] https://trello.com/c/6DHdMUyH/144-cephfs-manila-nfs-ganesha
[3] https://trello.com/c/6DHdMUyH/144-cephfs-manila-nfs-ganesha



Nicer to have on the github project.  Those are FSAL specific, and not
well coordinated.



Re: [Nfs-ganesha-devel] Features board list

2018-01-31 Thread William Allen Simpson

On 1/31/18 1:11 PM, Supriti Singh wrote:
I have created a new board here: https://github.com/orgs/nfs-ganesha/projects . It's an organization-wide project board. Everyone who belongs to the nfs-ganesha organization should have write access. Can you check if you have write access to this board?


YES!  But the page isn't visible (404 error) until I've logged in.

The other board was visible even not logged in.  That would be handy.


It makes
more sense to have an organization-wide board, as we can track ntirpc and ci-tests development features as well.



Good idea.



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Move nfs_Init_svc() after create_pseudofs().

2018-01-31 Thread William Allen Simpson

On 1/31/18 3:11 PM, GerritHub wrote:

Frank Filz *posted comments* on this change.

View Change 


It seems to me that the dupreq2_pkginit() is already in about the right
place, just after nfs_Init_client_id().  Moving it before doesn't do much.



Patch set 2:

(3 comments)

  *

File src/MainNFSD/nfs_init.c: 


  o

Patch Set #2, Line 820: 
 
|LogInfo(COMPONENT_INIT, "RPC resources successfully initialized");|

Hmm, should we do this after starting grace so we don't process 
requests before setting up grace period?


OK, as long as it does not send NLM.

We need to move nfs_Init_admin_thread() down, because that starts dbus,
and dbus can start, terminate, and affect grace.  So it should be here
after nfs_start_grace().



  o

Patch Set #2, Line 823: 
 
|fsal_save_ganesha_credentials();|

Hmm, this should be done earlier...


Where?



  o

Patch Set #2, Line 962: 
 
|nsm_unmonitor_all();|

Something makes me think this maybe needs to be earlier...


As to this last, you cannot do it until after nfs_Init_svc(), as it
makes client calls.  Do you want it before or after nfs_start_grace()?



Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson

On 1/31/18 10:33 AM, Daniel Gryniewicz wrote:

On 01/31/2018 10:27 AM, William Allen Simpson wrote:

On 1/31/18 8:44 AM, Daniel Gryniewicz wrote:

Agreed.

Daniel

On 01/30/2018 11:46 PM, Malahal Naineni wrote:

Looking at the code, dupreq2_pkginit() only depends on Ganesha config 
processing to initialize few things, so it should be OK to call anytime after 
Ganesha config processing.

Regards, Malahal.

On Wed, Jan 31, 2018 at 8:00 AM, Pradeep <pradeeptho...@gmail.com 
<mailto:pradeeptho...@gmail.com>> wrote:

    Hi Bill,

    Is it ok to move dupreq2_pkginit() before nfs_Init_svc() so that we
    won't hit the crash below?


It seems OK to me.  The previous culprit was delegation callbacks
happened before nfs_Init_svc().  Anything that does output (or expects
input) has to be after initializing ntirpc svc.

DanG, could you add this move to your pullup?  That might trigger
another test, too.


Should probably be a different PR, since it's unrelated to ntirpc.  If Pradeep 
doesn't want to submit it, I can.


My thought was exactly the opposite (in the next message) -- both
nfs_Init_svc() and nfs_Init_admin_thread() should be after all cache init.

So I thought of it as directly related



Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson

On 1/31/18 8:44 AM, Daniel Gryniewicz wrote:

Agreed.

Daniel

On 01/30/2018 11:46 PM, Malahal Naineni wrote:

Looking at the code, dupreq2_pkginit() only depends on Ganesha config 
processing to initialize few things, so it should be OK to call anytime after 
Ganesha config processing.

Regards, Malahal.

On Wed, Jan 31, 2018 at 8:00 AM, Pradeep wrote:

    Hi Bill,

    Is it ok to move dupreq2_pkginit() before nfs_Init_svc() so that we
    won't hit the crash below?


It seems OK to me.  The previous culprit was delegation callbacks
happened before nfs_Init_svc().  Anything that does output (or expects
input) has to be after initializing ntirpc svc.

DanG, could you add this move to your pullup?  That might trigger
another test, too.



Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson

On 1/30/18 9:36 AM, William Allen Simpson wrote:

But the code is obscure, so I could be missing something.


Also, it bears repeating that the dupreq cache wasn't working for
secure connections.  Pre-V2.6 checksummed the ciphertext, which is by
definition different on every request.  We'd never see duplicates.

One of the innovations in V2.6 (ntirpc 1.6) is that the checksum is of the
plaintext.  So duplicate requests will be detected.

I'm not sure how often we have duplicate requests, but it should be
working now.
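
Said another way (a hedged sketch, not the ntirpc implementation, with
FNV-1a standing in for whatever hash the DRC really uses): the checksum has
to be taken over the decoded call body, after RPCSEC_GSS unwrapping, or a
retransmission of a privacy-protected request never matches its first
arrival.

#include <stddef.h>
#include <stdint.h>

/* Simple FNV-1a, standing in for the real DRC hash. */
static uint64_t checksum(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t h = 0xcbf29ce484222325ULL;

	while (len--) {
		h ^= *p++;
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Pre-2.6 behaviour: hashing the on-the-wire bytes of a GSS-privacy request
 * hashes ciphertext, which differs on every retransmission, so nothing ever
 * matches.  2.6 behaviour: hash the unwrapped (plaintext) argument bytes,
 * so a retransmitted request produces the same key. */
static uint64_t drc_key(const void *plaintext_args, size_t len, uint32_t xid)
{
	return checksum(plaintext_args, len) ^ xid;
}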



Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson

On 1/30/18 9:22 AM, William Allen Simpson wrote:

On 1/29/18 3:32 PM, Frank Filz wrote:

I haven't looked at how the SVCXPRT structure has changed, but if there's a
field in there we can attach a Ganesha structure to that would be cool, or
if not, if we could add one.


There are two: xp_u1, and xp_u2.

Right now, Ganesha is using xp_u2 for dup request cache pointers.

But I've eliminated all old usage of xp_u1 in V2.6.


Looking at src/RPCAL/nfs_dupreq.c, I'm not sure why that doesn't already
have a client or export reference there.  It seems we'll return the
duplicate data to any client that happens to use the same xid.  Seems
like a bug

But the code is obscure, so I could be missing something.
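
If that reading is right, the fix is to make the caller part of the key; a
hedged sketch of such a key (invented, not the actual nfs_dupreq.c
structures, and assuming the sockaddr_storage is zeroed before it is filled
in):

#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical duplicate-request cache key: xid alone is only unique per
 * client, so the peer address (and ideally program/version/proc) has to
 * participate in matching. */
struct drc_key {
	struct sockaddr_storage client;	/* who sent it */
	uint32_t xid;			/* their transaction id */
	uint64_t cksum;			/* checksum of the plaintext args */
};

static int drc_key_equal(const struct drc_key *a, const struct drc_key *b)
{
	return a->xid == b->xid &&
	       a->cksum == b->cksum &&
	       memcmp(&a->client, &b->client, sizeof(a->client)) == 0;
}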



Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson

On 1/29/18 3:32 PM, Frank Filz wrote:

I haven't looked at how the SVCXPRT structure has changed, but if there's a
field in there we can attach a Ganesha structure to that would be cool, or
if not, if we could add one.


There are two: xp_u1, and xp_u2.

Right now, Ganesha is using xp_u2 for dup request cache pointers.

But I've eliminated all old usage of xp_u1 in V2.6.



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-30 Thread William Allen Simpson

On 1/29/18 2:27 PM, Daniel Gryniewicz wrote:

On 01/29/2018 02:09 PM, William Allen Simpson wrote:

On 1/29/18 1:13 PM, GerritHub wrote:

Daniel Gryniewicz has uploaded this change for *review*.

View Change <https://review.gerrithub.io/397004>

Pullup ntirpc 1.6

(svc_vc) rearm after EAGAIN and EWOULDBLOCK

(Note, previous pullup was erroneously from 1.7)


All my weekend patches need to be backported to the 1.6 branch.  There
are string errors and clnt_control errors fixed.





I'm not sure I agree.  clnt_control() isn't called with unknown values, so a default return of false isn't important; it's never called with CLSET_XID, so that case isn't important.  And RDMA doesn't work, even with these fixes correct?  I can't be 
convinced otherwise, but it seemed the only important fix for 1.6 was the EAGAIN one.


Daniel


The error string commas are very important, as V2.6 does a lot more
error reporting now.

I think that clnt_control is important, especially given the bad error
returns and that this might be downstream for years.  OTOH, it would also
apply to V2.5, V2.4, et alia going back years, and nobody has cared.

I agree we can hold off on RDMA for now (until next week).

Sorry you cannot be convinced otherwise.
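
For the record, the behaviour in question is small; a hedged sketch of the
shape (invented request codes and private data, not the ntirpc source) where
an unknown request falls through to a FALSE return instead of silently
claiming success:

#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical request codes and private data, standing in for the real
 * CLSET_/CLGET_ values in clnt.h. */
enum { GET_XID, SET_XID, SET_TIMEOUT };

struct ct_data {
	uint32_t xid;
	struct timeval timeout;
};

static bool ct_control(struct ct_data *ct, int request, void *info)
{
	switch (request) {
	case GET_XID:
		*(uint32_t *)info = ct->xid;
		return true;
	case SET_XID:
		ct->xid = *(uint32_t *)info;
		return true;
	case SET_TIMEOUT:
		ct->timeout = *(struct timeval *)info;
		return true;
	default:
		/* Unknown request: report failure instead of pretending
		 * the setting took effect. */
		return false;
	}
}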



Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-29 Thread William Allen Simpson

On 1/29/18 1:13 PM, GerritHub wrote:

Daniel Gryniewicz has uploaded this change for *review*.

View Change 

Pullup ntirpc 1.6

(svc_vc) rearm after EAGAIN and EWOULDBLOCK

(Note, previous pullup was erroneously from 1.7)


All my weekend patches need to be backported to the 1.6 branch.  There
are string errors and clnt_control errors fixed.





Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-28 Thread William Allen Simpson

On 1/27/18 4:07 PM, Pradeep wrote:

​Here is what I see in the log (the '2' is what I added to figure out which 
recv failed):
nfs-ganesha-199008[svc_948] rpc :TIRPC :WARN :svc_vc_recv: 0x7f91c0861400 fd 21 
recv errno 11 (try again) 2 176​

The fix looks good. Thanks Bill.


Thanks for the excellent report.  I wish everybody did such well
researched reports!

Yeah, the 2 isn't really needed, because I used "svc_vc_wait" and
"svc_vc_recv" (__func__) to differentiate the 2 messages.

This is really puzzling, since it should never happen.  It's the
recv() with NO WAIT.  And we are level-triggered, so we shouldn't be
in this code without an event.

If it needed more data, it should be WOULD BLOCK, but it's giving
EAGAIN.  No idea what that means here.

Hope it's not happening often



Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread William Allen Simpson

On 1/27/18 9:56 AM, William Allen Simpson wrote:

I'm not able to reproduce.  Could you tell me which EAGAIN is
happening?  The log line will say "svc_vc_wait" or "svc_vc_recv",
and have the actual error code on it.  Maybe this is EWOULDBLOCK?

Of course, neither EAGAIN or EWOULDBLOCK should be happening on a
level triggered event.  But the old code had a log, so it's there.


I've stashed the patch on
  https://github.com/linuxbox2/ntirpc/tree/was16backport

Could you see whether this fixed it for you?

And report the log line(s)?  Is this happening often?



Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread William Allen Simpson

On 1/26/18 8:53 PM, William Allen Simpson wrote:

In fact, I don't understand how we could get EAGAIN, according to the
documentation.  But it's logged.  Good idea about differentiating the
two identical log lines.  I'd prefer text rather than the number 2.


And in the adjacent code, you'll see that I already had a text
differentiation.



I'll code it up, with acknowledgement.  Thanks again!


I'm not able to reproduce.  Could you tell me which EAGAIN is
happening?  The log line will say "svc_vc_wait" or "svc_vc_recv",
and have the actual error code on it.  Maybe this is EWOULDBLOCK?

Of course, neither EAGAIN or EWOULDBLOCK should be happening on a
level triggered event.  But the old code had a log, so it's there.



[Nfs-ganesha-devel] V2.6-rc4 connect to statd failed

2018-01-27 Thread William Allen Simpson

With Dan's latest ntirpc update, I'm seeing a new error.  But this is
my first testing on Fedora 27, so maybe a Fedora change?

nsm_connect :NLM :CRIT :connect to statd failed: RPC: Unknown protocol

Actually, that's not exactly how the error looks; the string list is
missing its commas.  My bad.  I've got a patch for that, too.
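
The bug class, for anyone who hasn't been bitten by it (made-up table, not
the actual ntirpc string list): with one comma missing, C quietly
concatenates the two adjacent string literals, every later entry shifts down
by one, and error reporting prints the wrong message.

#include <stdio.h>

static const char *rpc_errstr[] = {
	"RPC: Success",
	"RPC: Can't encode arguments"	/* <-- missing comma */
	"RPC: Can't decode result",	/* merges with the line above */
	"RPC: Unable to send",
	"RPC: Unable to receive",
};

int main(void)
{
	/* Index 3 should be "RPC: Unable to send", but because two literals
	 * merged, this prints "RPC: Unable to receive". */
	printf("%s\n", rpc_errstr[3]);
	return 0;
}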

Also, "destroy_fsals :FSAL :CRIT :Extra references (1) hanging around to FSAL 
PSEUDO"

And LeakSanitizer no longer works in gdb.  Bother.  Circa October,
according to my search.



Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-26 Thread William Allen Simpson

On 1/26/18 12:18 PM, Pradeep wrote:
In svc_vc_recv(), we handle the case of incomplete receive by rearming the FD and returning ( if xd->sx_fbtbc is not zero). In the case of EAGAIN also shouldn't we be doing the same? epoll is ONESHOT; so new receives won't give new events until epoll_ctl() 
is called, right?


I tried adding the rearming code in EAGAIN cases and was able run the test 
without receive hang.


I'm on PTO, but I'll look at this tomorrow.  So glad that somebody is
finally rigorously testing this code that was added half a year ago!

This may be some code left over from my tests with triggered (couldn't
get to work) instead of one-shot.  Triggered should be faster, with
fewer system calls, a greater concern in today's MELTDOWN environment.

In fact, I don't understand how we could get EAGAIN, according to the
documentation.  But it's logged.  Good idea about differentiating the
two identical log lines.  I'd prefer text rather than the number 2.

I'll code it up, with acknowledgement.  Thanks again!
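
For anyone following along, here is a minimal standalone sketch of the
re-arming pattern being discussed -- not the actual svc_vc_recv()/svc_rqst
code, just the idea that with EPOLLONESHOT every early return (partial
record or EAGAIN) must re-arm the fd, or that socket goes silent:

#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* hypothetical helper: re-register interest after a ONESHOT event */
static int
rearm_fd(int epfd, int fd, void *xprt)
{
    struct epoll_event ev = {
        .events = EPOLLIN | EPOLLONESHOT,
        .data.ptr = xprt,
    };

    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* hypothetical receive step driven by one epoll event */
static void
recv_event(int epfd, int fd, void *xprt, char *buf, size_t len)
{
    ssize_t rc = recv(fd, buf, len, MSG_DONTWAIT);

    if (rc < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* nothing to read after all: re-arm and wait for
             * the next event, otherwise no further events come */
            (void)rearm_fd(epfd, fd, xprt);
            return;
        }
        /* hard error: tear down the transport */
        return;
    }

    /* decode rc bytes; if the record is still incomplete,
     * re-arm here as well before returning */
}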

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson

On 1/23/18 9:35 AM, William Allen Simpson wrote:

On 1/23/18 9:31 AM, Daniel Gryniewicz wrote:

On 01/23/2018 09:04 AM, William Allen Simpson wrote:

On 1/22/18 8:08 PM, Pradeep wrote:

Looked at dev.22 and we were handling this error case correctly there.


No, we're handling this error case correctly now.

Either you forgot to update your ntirpc, or there's a serious error in
ntirpc.  NULL should never be returned here.



Based on the backtrace, ntirpc is updated (ie, the line numbers all line up).   
There must be some way to return NULL here that we missed.


OK, should be obvious from the back trace.  I'll come by this afternoon
and we can track it down.


Needless to say, I cannot reproduce.

I'm guessing, and this is completely a WAG, that there's something weird
going on in __rpcb_findaddr_timed().  There are two places where it
calls getclnthandle(), and that can return NULL.  But we want to catch
any failures there, because that's a misconfiguration.

This is a recursive calling routine, called by clnt_tli_ncreate(), that
calls clnt_tli_ncreate() itself

So I guess we have to add some kind of error message indicating
misconfiguration, instead of crashing (or simply quitting as the old
PORTMAP code does).  This PORTMAP code is really terrible!
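
For context, the guard under discussion amounts to something like the
standalone sketch below (hypothetical names, not the actual clnt_generic.c):
never assume the lower create path returned a handle, and turn a NULL into
a reported misconfiguration rather than a dereference:

#include <stdio.h>

struct client;      /* stands in for the real CLIENT handle */

/* hypothetical lower-level create; may legitimately return NULL
 * (e.g. rpcbind/portmap misconfiguration) */
static struct client *
create_handle(const char *host, const char *nettype)
{
    (void)host;
    (void)nettype;
    return NULL;    /* simulate the failure path */
}

static struct client *
create_checked(const char *host, const char *nettype)
{
    struct client *clnt = create_handle(host, nettype);

    if (clnt == NULL) {
        /* report it; the crashing code went on to use the
         * handle (CLNT_SUCCESS) and dereferenced the NULL */
        fprintf(stderr, "client create failed for %s/%s\n",
                host, nettype);
        return NULL;
    }
    return clnt;
}

int
main(void)
{
    return create_checked("localhost", "tcp") == NULL;
}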

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson

On 1/23/18 9:31 AM, Daniel Gryniewicz wrote:

On 01/23/2018 09:04 AM, William Allen Simpson wrote:

On 1/22/18 8:08 PM, Pradeep wrote:

Looked at dev.22 and we were handling this error case correctly there.


No, we're handling this error case correctly now.

Either you forgot to update your ntirpc, or there's a serious error in
ntirpc.  NULL should never be returned here.



Based on the backtrace, ntirpc is updated (ie, the line numbers all line up).   
There must be some way to return NULL here that we missed.


OK, should be obvious from the back trace.  I'll come by this afternoon
and we can track it down.


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson

On 1/22/18 8:08 PM, Pradeep wrote:

Hello,

I'm running into a crash in libntirpc with rc2:

#2  
#3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e "localhost", 
prog=100024, vers=1,
     netclass=0x57592a "tcp", tp=0x0) at 
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_generic.c:197
#4  0x0049a21c in clnt_ncreate (hostname=0x57592e "localhost", 
prog=100024, vers=1,
     nettype=0x57592a "tcp") at 
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/ntirpc/rpc/clnt.h:395
#5  0x0049a4d2 in nsm_connect () at 
/usr/src/debug/nfs-ganesha-2.6-rc2/Protocols/NLM/nsm.c:58
#6  0x0049c10d in nsm_unmonitor_all () at 
/usr/src/debug/nfs-ganesha-2.6-rc2/Protocols/NLM/nsm.c:267
#7  0x004449d4 in nfs_start (p_start_info=0x7c8b28 )
     at /usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_init.c:963
#8  0x0041cd2e in main (argc=10, argv=0x7fff68b294d8)
     at /usr/src/debug/nfs-ganesha-2.6-rc2/MainNFSD/nfs_main.c:499
(gdb) f 3
#3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e "localhost", 
prog=100024, vers=1,
     netclass=0x57592a "tcp", tp=0x0) at 
/usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_generic.c:197
197                     if (CLNT_SUCCESS(clnt))
(gdb) print clnt
$1 = (CLIENT *) 0x0

Looked at dev.22 and we were handling this error case correctly there.


No, we're handling this error case correctly now.

Either you forgot to update your ntirpc, or there's a serious error in
ntirpc.  NULL should never be returned here.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] NTIRPC ENOMEM

2017-12-20 Thread William Allen Simpson

DanG has raised an interesting issue about recovery from low memory.
In Ganesha, we've been assiduously changing NULL checks to assert or
segfault on alloc failures.  Just had a few more patches by Kaleb.

Since 2013 or 2014, we've been doing the same to NTIRPC.  There are
currently 105 mem_.*alloc locations, and almost all of them
deliberately segfault.

DanG argues that we should report the ENOMEM in an error return, or
in the alternative return NULL in those cases, and let the caller
decide what to do, to make the library more general.

The current TI-RPC does return NULL in many cases.  Rarely reports
ENOMEM.  And often segfaults.

This would be a major reworking.  Should we do this?  If so, what is
the target date?
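
To make the two options concrete, a small standalone sketch
(mem_zalloc_abort/mem_zalloc_try are made-up names, not the ntirpc API)
of "abort on failure" versus "report ENOMEM and let the caller decide":

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* current style: allocation failure is fatal, callers never see NULL */
static void *
mem_zalloc_abort(size_t n)
{
    void *p = calloc(1, n);

    if (p == NULL)
        abort();
    return p;
}

/* proposed style: hand back NULL plus ENOMEM, caller decides */
static void *
mem_zalloc_try(size_t n, int *err)
{
    void *p = calloc(1, n);

    if (p == NULL)
        *err = ENOMEM;
    return p;
}

int
main(void)
{
    int err = 0;
    void *a = mem_zalloc_abort(64);
    void *b = mem_zalloc_try(64, &err);

    if (b == NULL)
        fprintf(stderr, "alloc failed: %s\n", strerror(err));
    free(a);
    free(b);
    return 0;
}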

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-15 Thread William Allen Simpson

On 12/14/17 1:13 PM, William Allen Simpson wrote:

This is May 2015 code, based upon 2012 code.  Obviously, we haven't
been testing error responses ;)


I wanted to add a personal thank you for such an excellent bug report.
The patch went in the next branch yesterday, and should show up in the
Ganesha dev.22 branch today.  You are credited in the log message.

There are (at least) 2 different code paths for handling similar reply
message output, and this one wasn't correct.  Back in 2005, I'd spent a
fair amount of time trying to debug it (the reason for all those logging
messages), and had found missing breaks and returns.

But I wasn't looking at the formatting for error messages that weren't
being tested, and missed the obvious.  Next year, we should probably
try to find all the code paths and consolidate.

What versions do you need backported?


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-14 Thread William Allen Simpson

This is May 2015 code, based upon 2012 code.  Obviously, we haven't been
testing error responses ;)

Not quite.  That would need to be duplicated for each of the error
conditions.  Instead, it should be a bit higher in the function.

Still, I'll keep it duplicated from the ACCEPTED code path, for
trivial efficiency.
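
For reference, the on-the-wire layout the patch quoted below restores for a
MSG_DENIED/AUTH_ERROR reply is five XDR units (per RFC 5531), which is
exactly what the 5 * BYTES_PER_XDR_UNIT inline reserves:

    xid            <- the unit that was missing
    rm_direction   =  REPLY
    rp_stat        =  MSG_DENIED
    rj_stat        =  AUTH_ERROR
    rj_why         =  the auth_stat reason (e.g. AUTH_TOOWEAK)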

On 12/13/17 1:22 AM, Matt Benjamin wrote:

That sounds right, I'm uncertain whether this has regressed in the
text, or maybe in the likelihood of inlining in the new dispatch
model.  Bill?

Matt

On Wed, Dec 13, 2017 at 9:38 AM, Pradeep  wrote:

Hello,

When using krb5 exports, I noticed that TIRPC does not send XID in response
- see xdr_reply_encode() for MSG_DENIED case. Looks like Linux clients can't
decode the message and go in to an infinite loop retrying the same NFS
operation. I tried adding XID back (like it is done for normal case) and it
seems to have fixed the problem. Is this the right thing to do?

diff --git a/src/rpc_dplx_msg.c b/src/rpc_dplx_msg.c
index 01e5a5c..a585e8a 100644
--- a/src/rpc_dplx_msg.c
+++ b/src/rpc_dplx_msg.c
@@ -194,9 +194,12 @@ xdr_reply_encode(XDR *xdrs, struct rpc_msg *dmsg)
 __warnx(TIRPC_DEBUG_FLAG_RPC_MSG,
 "%s:%u DENIED AUTH",
 __func__, __LINE__);
-   buf = XDR_INLINE(xdrs, 2 * BYTES_PER_XDR_UNIT);
+   buf = XDR_INLINE(xdrs, 5 * BYTES_PER_XDR_UNIT);

 if (buf != NULL) {
+   IXDR_PUT_INT32(buf, dmsg->rm_xid);
+   IXDR_PUT_ENUM(buf, dmsg->rm_direction);
+   IXDR_PUT_ENUM(buf, dmsg->rm_reply.rp_stat);
 IXDR_PUT_ENUM(buf, rr->rj_stat);
 IXDR_PUT_ENUM(buf, rr->rj_why);
 } else if (!xdr_putenum(xdrs, rr->rj_stat)) {

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel








--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Announce Push of V2.6-dev.21

2017-12-14 Thread William Allen Simpson

On 12/12/17 4:39 PM, Frank Filz wrote:

Branch next

Tag:V2.6-dev.21

Release Highlights

* new version of checkpatch

* checkpatch fixes for existing code


I'd been hoping that a mid-week release meant the crash during
shutdown was fixed, but apparently not:


Thread 270 "ganesha.nfsd" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff6808a700 (LWP 18755)]
0x7fffef8ca739 in release (exp_hdl=0x6130cec0)
at /home/bill/rdma/nfs-ganesha/src/FSAL/FSAL_VFS/export.c:79
79  LogDebug(COMPONENT_FSAL, "Releasing VFS export for %s",

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] enqueued_reqs/dequeued_reqs

2017-12-11 Thread William Allen Simpson

On 12/11/17 2:35 PM, Pradeep wrote:


It looks like, we don't increment enqueued_reqs/dequeued_reqs in the RPC anymore - nfs_rpc_enqueue_req() is replaced with nfs_rpc_process_request. Now that both values are zero, the health checker (get_ganesha_health) will never detect any RPC hangs. 
Should the enqueued_reqs/dequeued_reqs be moved to nfs_rpc_process_request()?



I've got a patch that's been sitting in the queue a couple of weeks.  It
was submitted together with another patch that went in, but Frank missed
this one somehow.

Right now, the only thing counting is 9P.  Dominique had already requested
the health counters (a couple of reviews ago), so I'd added more counting.

View Change 






--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] libntirpc thread local storage

2017-12-09 Thread William Allen Simpson

On 12/9/17 5:28 PM, Matt Benjamin wrote:

I've already proposed we remove this.  No one is invested in it, I don't think.


OK.  I'll take a poke at it today.  It makes sense that this is a
good time to handle, as we've already made a major change to
CLNT_CALL in this release.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] libntirpc thread local storage

2017-12-09 Thread William Allen Simpson

I've run into another TLS problem.  It's been there since tirpc.

Apparently, once upon a time, rpc_createerr was a static global.
It still says that in the man pages.

When a client create function fails, they stash the error there,
and return NULL for the CLIENT.  Basically, you check for NULL,
and then check rpc_createerr

This is also used extensively by the RPC bind code.

Then, they made it a keyed thread local to try and salvage it
without a major code re-write.

With async threads, that's not working.  We've got memory leaks.
AFAICT, only on errors.  But accessing them on a different
thread gives the wrong error code (or none at all).  Not good.

All the functions that use it are still not MT-safe, usually
because they stash a string in global memory without locking.
They need re-definition to alloc/free the string.

Worse, it's not a good definition.

rpc_createerr has both clnt_stat and rpc_err, but struct rpc_err
also has a clnt_stat (since original tirpc).  clnt_stat is not
consistently set properly, as it is in two places.  So the error
checking code is often wrong.

I'd like to get rid of the whole mess, but that means every client
create would have new semantics.  Fortunately, there aren't many
(in Ganesha).  Plus we already have new definitions -- all named
*_ncreate with a tirpc_compat.h to munge them back.

But should we do it now?  Or in 2.7?  We've been living with it for
years, although recent code changes have made it worse.  Again, it
only happens on errors.  Especially for RPC bind.
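
To illustrate the shape of the problem (the names below are illustrative,
not the ntirpc API): an error stashed in a shared or thread-local slot is
only meaningful on the thread that made the failing call, whereas an error
returned with the call survives an async hand-off:

#include <errno.h>
#include <stdio.h>

struct create_err {
    int cf_stat;        /* analogous to rpc_createerr.cf_stat */
    int cf_error;       /* analogous to the embedded rpc_err */
};

/* old style: failure details land in a side-channel the caller must
 * read afterwards -- wrong thread, wrong (or missing) answer */
static struct create_err last_create_err;

static void *
create_old(const char *host)
{
    (void)host;
    last_create_err.cf_stat = 1;            /* some clnt_stat value */
    last_create_err.cf_error = ECONNREFUSED;
    return NULL;
}

/* alternative: the error travels with the call itself */
static void *
create_new(const char *host, struct create_err *err)
{
    (void)host;
    err->cf_stat = 1;
    err->cf_error = ECONNREFUSED;
    return NULL;
}

int
main(void)
{
    struct create_err err;

    if (create_old("localhost") == NULL)
        printf("old: stat=%d (read from the global slot)\n",
               last_create_err.cf_stat);
    if (create_new("localhost", &err) == NULL)
        printf("new: stat=%d errno=%d\n", err.cf_stat, err.cf_error);
    return 0;
}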

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-09 Thread William Allen Simpson

On 12/8/17 10:13 AM, Matt Benjamin wrote:

I'd like to see this use of TLS as a "hidden parameter" replaced
regardless.  It has been a source of bugs, and locks us into a
pthreads execution model I think needlessly.


With future async FSAL calls, it's going to stop working.

We already have a svc_req and now clnt_req.  That's what will be
returned on the new thread.  So it behooves us to stash the op_ctx
pointer in them (as a void *).

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-08 Thread William Allen Simpson

On 12/7/17 7:54 PM, Frank Filz wrote:

Stacked FSALs often depend on op_ctx->fsal_export being set.

We also have lots of FSAL methods that take the fsal_export as a parameter.


The latter sounds better.

Now that we know every single thread local storage access involves a
hidden lock/unlock sequence in glibc "magically" invoked by the linker,
it would be better to remove as many TLS references as possible!

After all, too many lock/unlock are a real performance issue.

Perhaps we should pass op_ctx as the parameter instead.
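
A standalone sketch of the two calling styles (the struct definitions here
are cut down for illustration, not the real Ganesha types):

#include <stdio.h>

struct fsal_export { const char *name; };
struct req_op_context { struct fsal_export *fsal_export; };

/* current style: the export arrives via a thread-local "hidden parameter" */
static __thread struct req_op_context *op_ctx;

static void
lookup_tls(const char *path)
{
    /* every op_ctx access here is a TLS lookup (and, per the glibc
     * behavior noted above, potentially a hidden lock/unlock) */
    printf("TLS:   %s via %s\n", path, op_ctx->fsal_export->name);
}

/* alternative: the context travels as an explicit argument */
static void
lookup_arg(struct req_op_context *ctx, const char *path)
{
    printf("param: %s via %s\n", path, ctx->fsal_export->name);
}

int
main(void)
{
    struct fsal_export exp = { "VFS" };
    struct req_op_context ctx = { &exp };

    op_ctx = &ctx;
    lookup_tls("/export/a");
    lookup_arg(&ctx, "/export/a");
    return 0;
}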

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Announce Push of V2.6-dev.17

2017-11-14 Thread William Allen Simpson

On 11/13/17 12:35 AM, Frank Filz wrote:

5ca449d Jan-Martin Rämer handle hosts via libcidr to unify IPv4/IPv4
host/network clients


Ran pynfs, seeing some massive leaks that weren't there last week:

Direct leak of 505080 byte(s) in 12627 object(s) allocated from:
#0 0x76efcfe0 in calloc (/lib64/libasan.so.3+0xc6fe0)
#1 0x64c976 in gsh_calloc__ 
/home/bill/rdma/nfs-ganesha/src/include/abstract_mem.h:145
#2 0x64c9e7 in cidr_alloc /home/bill/rdma/nfs-ganesha/src/cidr/cidr_mem.c:16
#3 0x64c21c in cidr_from_inaddr 
/home/bill/rdma/nfs-ganesha/src/cidr/cidr_inaddr.c:56
#4 0x623042 in client_match 
/home/bill/rdma/nfs-ganesha/src/support/exports.c:2401
#5 0x625eb0 in export_check_access 
/home/bill/rdma/nfs-ganesha/src/support/exports.c:2775
#6 0x6077e4 in nfs4_export_check_access 
/home/bill/rdma/nfs-ganesha/src/support/nfs_creds.c:556
#7 0x4da4a1 in nfs4_mds_putfh 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_op_putfh.c:187
#8 0x4dac87 in nfs4_op_putfh 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_op_putfh.c:281
#9 0x4a1727 in nfs4_Compound 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:752
#10 0x47f198 in nfs_rpc_process_request 
/home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1338
#11 0x48122e in nfs_rpc_valid_NFS 
/home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1736
#12 0x75c7575a in svc_vc_decode /home/bill/rdma/ntirpc/src/svc_vc.c:812
#13 0x48b8b0 in nfs_rpc_decode_request 
/home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1626
#14 0x75c7566c in svc_vc_recv /home/bill/rdma/ntirpc/src/svc_vc.c:785
#15 0x75c71f06 in svc_rqst_xprt_task 
/home/bill/rdma/ntirpc/src/svc_rqst.c:764
#16 0x75c72383 in svc_rqst_epoll_events 
/home/bill/rdma/ntirpc/src/svc_rqst.c:936
#17 0x75c724ca in svc_rqst_epoll_loop 
/home/bill/rdma/ntirpc/src/svc_rqst.c:974
#18 0x75c72580 in svc_rqst_run_task 
/home/bill/rdma/ntirpc/src/svc_rqst.c:1010
#19 0x75c7bc4c in work_pool_thread 
/home/bill/rdma/ntirpc/src/work_pool.c:176
#20 0x760b3739 in start_thread (/lib64/libpthread.so.0+0x7739)

Direct leak of 125640 byte(s) in 3141 object(s) allocated from:
#0 0x76efcfe0 in calloc (/lib64/libasan.so.3+0xc6fe0)
#1 0x64c976 in gsh_calloc__ 
/home/bill/rdma/nfs-ganesha/src/include/abstract_mem.h:145
#2 0x64c9e7 in cidr_alloc /home/bill/rdma/nfs-ganesha/src/cidr/cidr_mem.c:16
#3 0x64c21c in cidr_from_inaddr 
/home/bill/rdma/nfs-ganesha/src/cidr/cidr_inaddr.c:56
#4 0x623042 in client_match 
/home/bill/rdma/nfs-ganesha/src/support/exports.c:2401
#5 0x625eb0 in export_check_access 
/home/bill/rdma/nfs-ganesha/src/support/exports.c:2775
#6 0x6077e4 in nfs4_export_check_access 
/home/bill/rdma/nfs-ganesha/src/support/nfs_creds.c:556
#7 0x4db60b in nfs4_op_putrootfh 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_op_putrootfh.c:98
#8 0x4a1727 in nfs4_Compound 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:752
#9 0x47f198 in nfs_rpc_process_request 
/home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1338
#10 0x48122e in nfs_rpc_valid_NFS 
/home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1736
#11 0x75c7575a in svc_vc_decode /home/bill/rdma/ntirpc/src/svc_vc.c:812
#12 0x48b8b0 in nfs_rpc_decode_request 
/home/bill/rdma/nfs-ganesha/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1626
#13 0x75c7566c in svc_vc_recv /home/bill/rdma/ntirpc/src/svc_vc.c:785
#14 0x75c71f06 in svc_rqst_xprt_task 
/home/bill/rdma/ntirpc/src/svc_rqst.c:764
#15 0x75c72383 in svc_rqst_epoll_events 
/home/bill/rdma/ntirpc/src/svc_rqst.c:936
#16 0x75c724ca in svc_rqst_epoll_loop 
/home/bill/rdma/ntirpc/src/svc_rqst.c:974
#17 0x75c72580 in svc_rqst_run_task 
/home/bill/rdma/ntirpc/src/svc_rqst.c:1010
#18 0x75c7bc4c in work_pool_thread 
/home/bill/rdma/ntirpc/src/work_pool.c:176
#19 0x760b3739 in start_thread (/lib64/libpthread.so.0+0x7739)

Direct leak of 124760 byte(s) in 3119 object(s) allocated from:
#0 0x76efcfe0 in calloc (/lib64/libasan.so.3+0xc6fe0)
#1 0x64c976 in gsh_calloc__ 
/home/bill/rdma/nfs-ganesha/src/include/abstract_mem.h:145
#2 0x64c9e7 in cidr_alloc /home/bill/rdma/nfs-ganesha/src/cidr/cidr_mem.c:16
#3 0x64c21c in cidr_from_inaddr 
/home/bill/rdma/nfs-ganesha/src/cidr/cidr_inaddr.c:56
#4 0x623042 in client_match 
/home/bill/rdma/nfs-ganesha/src/support/exports.c:2401
#5 0x625eb0 in export_check_access 
/home/bill/rdma/nfs-ganesha/src/support/exports.c:2775
#6 0x6077e4 in nfs4_export_check_access 
/home/bill/rdma/nfs-ganesha/src/support/nfs_creds.c:556
#7 0x4c69a1 in nfs4_op_lookup 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_op_lookup.c:150
#8 0x4a1727 in nfs4_Compound 
/home/bill/rdma/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:752
#9 0x47f198 in nfs_rpc_process_request 

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-07 Thread William Allen Simpson

On 11/6/17 8:41 PM, William Allen Simpson wrote:

On 11/6/17 8:38 AM, Dominique Martinet wrote:

One way that'd work for example would be have ganesha provide a pointer
to SetNameFunction at init, it's a bit ugly though.


Actually, that's how ntirpc calls various alloc and warnx functions,
via a pointer struct.  So I'll revert this code and add another
entry to that struct.  Good idea.  Ugly, but consistent.

Any other libraries that provide a thread could use the same struct
technique to instantiate a name.


Your idea seems to work, and I've posted the ntirpc side for DanG to
review.  Should have the Ganesha side for review soon thereafter.
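
Roughly, the struct-of-pointers approach looks like the standalone sketch
below (illustrative only; the actual ntirpc init parameters and names may
differ): the application registers a callback at init, and library worker
threads call through it so log lines can carry a thread name:

#include <stdio.h>

struct lib_ops {
    void (*thread_name)(const char *name);  /* may be NULL */
};

static struct lib_ops lib_ops;

/* called once by the application (e.g. ganesha) at init time */
void
lib_init(const struct lib_ops *ops)
{
    if (ops != NULL)
        lib_ops = *ops;
}

/* called from each library worker thread as it starts */
static void
lib_worker_start(int idx)
{
    char name[16];

    snprintf(name, sizeof(name), "svc_%d", idx);
    if (lib_ops.thread_name != NULL)
        lib_ops.thread_name(name);   /* e.g. wraps SetNameFunction() */
}

/* application-side implementation it might register */
static void
app_set_thread_name(const char *name)
{
    printf("thread name set to %s\n", name);
}

int
main(void)
{
    struct lib_ops ops = { app_set_thread_name };

    lib_init(&ops);
    lib_worker_start(7);
    return 0;
}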

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-06 Thread William Allen Simpson

On 11/6/17 8:38 AM, Dominique Martinet wrote:

William Allen Simpson wrote on Mon, Nov 06, 2017 at 08:12:19AM -0500:

If you've got [s]ome others in another library, they'll have to use
the same library function.


Other FSALs are in other libraries, but given how it's setup they can
use that function and I understand ntirpc cannot use ganesha functions
directly :)

One way that'd work for example would be have ganesha provide a pointer
to SetNameFunction at init, it's a bit ugly though.


Actually, that's how ntirpc calls various alloc and warnx functions,
via a pointer struct.  So I'll revert this code and add another
entry to that struct.  Good idea.  Ugly, but consistent.

Any other libraries that provide a thread could use the same struct
technique to instantiate a name.

All this is because Frank didn't like the default %p, that worked
perfectly well

Hate it when merely cosmetic changes break the code! :(

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-06 Thread William Allen Simpson

On 11/6/17 7:14 AM, GerritHub wrote:

File src/log/log_functions.c:

Patch Set #2, Line 1438: | name = work_pool_worker_name(); |

So the new hang for proxy is in this assumption -- proxy uses threads 
that do not define their thread_name but are not work_pool worker threads.

This specific problem is an easy fix (just set the thread_name!), but I find the 
assumption "no thread name -> work_pool_thread" to be pretty daring.


Tried setting the thread_name thread local variable, but doing it in a
library didn't work.  Needed a different name, and a function to fetch it.

The only threads that I know of are fridge and work_pool.

If you've got some others in another library, they'll have to use the same
library function.


Here is the exact stack trace:
  #0  0x77061f4d in __lll_lock_wait () from 
/lib64/libpthread.so.0
  #1  0x7705dd1d in _L_lock_840 () from /lib64/libpthread.so.0
  #2  0x7705dc3a in pthread_mutex_lock () from 
/lib64/libpthread.so.0
  #3  0x77ddc029 in tls_get_addr_tail () from 
/lib64/ld-linux-x86-64.so.2
  #4  0x76c31bc4 in work_pool_worker_name () at 
/opt/nfs-ganesha/src/libntirpc/src/work_pool.c:72
  #5  0x0051541d in display_log_component 
(dsp_log=0x71ed93c0, component=COMPONENT_FSAL,
 file=0x73f042e0 "/opt/nfs-ganesha/src/FSAL/FSAL_PROXY/handle.c", line=621, 
function=0x73f04b21 <__func__.21686> "pxy_rpc_recv",
 level=5) at /opt/nfs-ganesha/src/log/log_functions.c:1438
  #6  0x00515694 in display_log_component_level 
(component=COMPONENT_FSAL,
 file=0x73f042e0 "/opt/nfs-ganesha/src/FSAL/FSAL_PROXY/handle.c", line=621, 
function=0x73f04b21 <__func__.21686> "pxy_rpc_recv",
 level=NIV_EVENT, format=0x73f04519 "Socket is closed", 
arguments=0x71ed9458) at /opt/nfs-ganesha/src/log/log_functions.c:1502
  #7  0x0051585f in DisplayLogComponentLevel (component=COMPONENT_FSAL, 
file=0x73f042e0 "/opt/nfs-ganesha/src/FSAL/FSAL_PROXY/handle.c",
 line=621, function=0x73f04b21 <__func__.21686> "pxy_rpc_recv", 
level=NIV_EVENT, format=0x73f04519 "Socket is closed")
 at /opt/nfs-ganesha/src/log/log_functions.c:1709
  #8  0x73efa137 in pxy_rpc_recv (arg=0x7410ae00 
) at /opt/nfs-ganesha/src/FSAL/FSAL_PROXY/handle.c:620
  #9  0x7705bdc5 in start_thread () from /lib64/libpthread.so.0
  #10 0x7671dced in clone () from /lib64/libc.so.6

As you can see the problem seems to be that the thread local variable 
is not initialized, but I fail to see why this would hang instead of returning 
garbage.


That seems odd.  I defined the function so that a missing variable
should return NULL.



I'll submit a patch that just sets thread names for now but 
experiencing a hang there instead of some default value isn't nice. Does anyone 
have an idea?


Nope.  And no idea why you'd post this on an old patch, either.  The
discussion will be lost.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-04 Thread William Allen Simpson

On 11/4/17 1:43 AM, Matt Benjamin wrote:

oh, come on.  not sure what needs to be done to reduce log noise, but
I'm sure we can make a dent.


Apparently, you and I reviewed and wrote our messages in opposite order.

This patch does that.  Complaining about the log messages that we put in
last week to find the places that needed to be fixed strikes me as
silliness (or something), given that it just wasn't applied in a timely
fashion.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson

We already discussed this on Tuesday, October 24th.  Malahal agreed
that a half second was good, 3 seconds was OK, 5 seconds was long.
And Matt agreed we'd log more than 10 seconds.

Obviously, you have vastly more Internet experience than I, and
therefore are much better able to decide Internet timing parameters.

Also, your time is so much more valuable than mine, and you need to
post the weekly maintenance updates late (eastern time) on Friday to
avoid disrupting your thought processes -- so that I have to work
weekends, and my commits are stalled by yet another week.



Note that I'm currently taking a lot of half-week holiday over the
next few months, so hopefully this will make it in *early* next week.
Then maybe, just maybe, my next patch will be ready the week before
Thanksgiving

On 11/3/17 2:04 PM, GerritHub wrote:

Frank Filz *posted comments* on this change.

View Change 

Patch set 1:

I'd like a review by malahal before merging this one

I'm really not sure about the 3 second timeouts

Can anyone test this in something resembling a real customer environment?

To view, visit change 385451 . To unsubscribe, 
visit settings .

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: comment
Gerrit-Change-Id: I92b02eca435f4b1f6104b740c6c5b3747c380840
Gerrit-Change-Number: 385451
Gerrit-PatchSet: 1
Gerrit-Owner: william.allen.simp...@gmail.com
Gerrit-Reviewer: CEA-HPC 
Gerrit-Reviewer: Daniel Gryniewicz 
Gerrit-Reviewer: Frank Filz 
Gerrit-Reviewer: Gluster Community Jenkins 
Gerrit-Reviewer: Malahal 
Gerrit-Reviewer: openstack-ci-service+rdo-ci-cen...@redhat.com
Gerrit-Comment-Date: Fri, 03 Nov 2017 18:04:27 +
Gerrit-HasComments: No



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson

On 11/3/17 2:06 PM, Frank Filz wrote:

Please respond inside gerrit to keep the conversation in one place.


The discussion of timeouts was always in public on this list, and
was also on the public conference call a week ago.

If Gerrit doesn't record its replies, unlike Github, you should
probably make a request to the maintainers of Gerrit.

But I'm not likely to log into Github to log into Gerrit to make an
email reply that then isn't recorded in my actual email on my machine.

I prefer to keep my conversations in one place.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] tirpc warning

2017-11-03 Thread William Allen Simpson

On 11/3/17 7:46 PM, Frank Filz wrote:

Can we tone down this warning:

2017-11-03 16:14:21 [svc_4] :0 :rpc :TIRPC :INFO
:clnt_req_alloc:470 tv_sec 15 > 10

That is spamming the log when at INFO level.


I see that you didn't put in CLNT_CALL with clnt_req, which
eliminated all current examples of the warning.

So no, that's on you

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson

On 11/2/17 1:01 PM, GerritHub wrote:

Frank Filz *posted comments* on this change.

View Change 

Patch set 1:

(2 comments)

  * File src/MainNFSD/nfs_rpc_callback.c:

    Patch Set #1, Line 74: | /* retry timeout default to the moon and back */ |

    don't these go back to actual clients?

  * File src/Protocols/NLM/sm_notify.c:

    Patch Set #1, Line 20: | /* retry timeout default to the moon and back */ |

This is going to an actual client, shouldn't we have a longer timeout?


If anything, it's too large.  Do we have any clients on the moon?

It takes approximately 1.26 seconds for light to travel to the moon,
so 2.52 seconds RTT.  Leaves 480 ms of terrestrial latency, enough to
circle the earth a few times

AWS Network Latency Map reports 282 ms between data centers in
Virginia and Central Asia.  That's worst case.

Also, 3 seconds will easily handle 1500 byte packets over a 4800 bps
satellite link to any ship anywhere in the world.  I'm pretty sure
most links are faster than that nowadays

If you have worse links, fix your network.  NFS isn't going to work
well over such links anyway.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-02 Thread William Allen Simpson

On 11/1/17 3:07 PM, Frank Filz wrote:

I think we only need a single call fired off. If the client doesn't get it, 
there's not much recourse. I guess if a TCP connection actually fails, we could 
retry then, but over UDP there is no way to know what happened.

Thanks for working on cleaning this up.


Turning it into a one-shot turned out to be easiest.  Allow timeout 0,
and don't wait for timeout 0, plus refreshes = 0 to prevent retrying
for bad authentication (as we'll never know whether it was bad).

Assuming we still need the retry loop for the "spurious EAI_NONAME errors"?

And it will still retry (tearing down and rebuilding the connection) for
TCP failing.  Hard to test, probably never happens.

Hopefully DanG will have time to review and merge the ntirpc by Friday.
Matt had already taken a quick look yesterday.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson

On 11/1/17 2:27 PM, William Allen Simpson wrote:

On 11/1/17 10:07 AM, Frank Filz wrote:

So part of why that code looks bizarre? Because the NLM ASYNC RPC procedures
are bizarre...

The NLM ASYNC procedures DON'T have a normal RPC call response. Instead, the
host handling the call (normally the server, but the client in the case of
NLM_GRANTED for lock grant callbacks) makes an RPC CALL back to the sender!
The RPC library, at least at the time of writing this, had no mechanism to
fire off RPC calls and not care about a response...


Good to know.  A feature I can add!


Actually, I can do that right away.  My new code splits the call_req
setup from the CLNT_CALL in preparation for true async callbacks.
With this interface exposed, setting cc_refreshes to 0 (default 2)
means it won't try again.

It will still wait, though.


[...]
In a week or two, I'll add the feature to not await a response at all.


And that should also be easy.  Will check for NULL callback function.
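
A rough standalone sketch of the intended one-shot behaviour (the struct
and field names here are illustrative, not the final clnt_req API):
refreshes = 0 so a bad-auth reply is never retried, a zero timeout so the
sender does not wait, and a NULL callback meaning no response is expected:

#include <stdio.h>
#include <time.h>

/* illustrative request handle, not the real ntirpc clnt_req */
struct oneshot_req {
    unsigned int refreshes;                  /* 0: never retry on auth error */
    struct timespec timeout;                 /* 0: fire and forget */
    void (*callback)(struct oneshot_req *);  /* NULL: no reply expected */
};

static void
oneshot_send(struct oneshot_req *req)
{
    printf("sent: retries=%u wait=%lds callback=%s\n",
           req->refreshes, (long)req->timeout.tv_sec,
           req->callback != NULL ? "yes" : "none");
    /* with a NULL callback and a zero timeout the caller returns
     * immediately; for NLM async the peer's *_RES call back to us
     * is the only "response" that will ever arrive */
}

int
main(void)
{
    struct oneshot_req req = { 0, { 0, 0 }, NULL };

    oneshot_send(&req);
    return 0;
}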

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson

On 11/1/17 10:07 AM, Frank Filz wrote:

So part of why that code looks bizarre? Because the NLM ASYNC RPC procedures
are bizarre...

The NLM ASYNC procedures DON'T have a normal RPC call response. Instead, the
host handling the call (normally the server, but the client in the case of
NLM_GRANTED for lock grant callbacks) makes an RPC CALL back to the sender!
The RPC library, at least at the time of writing this, had no mechanism to
fire off RPC calls and not care about a response...


Good to know.  A feature I can add!



The problem is the client NEVER sends a response to the NLM__RSP RPC
callback. So I coded a short timeout so we didn't actually wait for a
response we would never get... This code was also written in haste. I had a
day or so to get it re-written and tested during one of the few
Connectathons I attended with Ganesha that Apple also attended... Since
then, I have never had a Mac client to even make sure it still works...


I've never seen these in a log.  Just reading the code, prior to
making modifications.  But my reading is that with current RPC, this
short nearly impossible timeout will start a connection, fire off 3 calls,
drop the connection and immediately reconnect, fire off 3 calls, then
drop the connection and immediately reconnect, fire off 3 calls, then
drop the connection.

Probably not the planned operation.

So what I'll do by Friday is remove the re-connection loop.  Tell me
3 times (3 fast calls) should be enough?  I'll also make the timeout
10 ms so that we don't thrash threads so much.

In a week or two, I'll add the feature to not await a response at all.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


[Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson

I'm flummoxed.  Who knows this code?

Problem 1: the timeout is set to 10 microseconds.  Holy heck?  And
historically, that's the maximum total wait time, so it would try at
least three (3) times within 10 *MICRO*seconds?

Probably should be milliseconds.

Problem 2: there's a retry loop that sets up and tears down a TCP
(or UDP) connection 3 times.  On top of the 3 tries in RPC itself?

This looks like a lot of self-flagellation, maybe because the
timeout above was set too short?

Problem 3: this isn't really async -- nlm_send_async() actually
runs a pthread_cond_timedwait() before returning.  That's sync!

But we already have a timedwait in RPC.  And a signal.  So this
completely duplicates the underlying RPC library code.  Why?


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] UID and GID mapping

2017-10-30 Thread William Allen Simpson

On 10/30/17 3:56 AM, Nitesh Sharma wrote:

Do you have any idea about nfs-ganesha UID and GID mapping


The developer's list might

What version are you using?



How can I map this entry in nfs-ganesha export which is above CephFS

/mnt/tvault_automation  *(rw,all_squash,anonuid=65534,anongid=65534)


My sample file is
=
root@d00-0c-29-05-b8-ca:~ # cat /etc/ganesha/ganesha.conf
EXPORT
{
   Export_Id = 1; # Each export needs to have a unique 'Export_Id' (mandatory)
   Path = "/"; # Export path in the related CephFS pool (mandatory)
   Pseudo = "/"; # Target NFS export path (mandatory for NFSv4)
   Access_Type = RW; # 'RO' for read-only access, default is 'None'
  #Squash = No_Root_Squash; # NFS squash option
   Squash=All, All_Squash, AllSquash, All_Anonymous, AllAnonymous;
   FSAL { # Exporting 'File System Abstraction Layer'
     Name = CEPH; # Ganesha backend, 'CEPH' for CephFS or 'RGW' for RADOS 
Gateway
   }
}
===



--
Thanks and Regards,
*Nitesh Sharma.*
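
For what it's worth, the usual way to express the knfsd anonuid/anongid
options in a Ganesha EXPORT block looks roughly like this (assuming the
Anonymous_Uid/Anonymous_Gid export options; check the export configuration
documentation for your version):

EXPORT
{
   Export_Id = 1;
   Path = "/";
   Pseudo = "/";
   Access_Type = RW;
   Squash = All_Squash;      # squash all users, like all_squash
   Anonymous_Uid = 65534;    # like anonuid=65534
   Anonymous_Gid = 65534;    # like anongid=65534
   FSAL {
     Name = CEPH;
   }
}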



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Ganesha 2.3 and 2.5 - crash in free_nfs_request

2017-10-30 Thread William Allen Simpson

On 10/27/17 7:56 AM, Sachin Punadikar wrote:

Ganesha 2.3 got segfault with below :
[...]
After analyzing the core and related code found that - In "thr_decode_rpc_request" function, if call to SVC_RECV fails, then free_nfs_request is invoked to free the resources. But so far one of the field "reqdata->r_u.req.svc.rq_auth" is not initialized 
nor allocated, which is leading to segfault.


The code in this area is same for Ganesha 2.3 and 2.5.
I have created below patch to overcome this issue. Please review and if 
suitable merge with Ganesha 2.5 stable.
https://github.com/sachinpunadikar/nfs-ganesha/commit/91baffa8bd197c78eff106f42927a370155ae6b4


While your code should be harmless, at least in V2.5 that is already
initialized with gsh_calloc().  So it should already be NULL.

The answer of course, as always, is to upgrade.  There are a lot of
fixes in V2.4 and V2.5, the current stable branch!

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Ganesha startup and shutdown

2017-10-27 Thread William Allen Simpson

On 10/27/17 6:43 PM, Frank Filz wrote:

Ganesha startup and shutdown seems to be wandering between fast, slow, and
in the case of shutdown, sometimes not cleanly...

I'm glad to see the startup time has recently improved, it was annoying for
a while.


Think that was mostly Malahal?

Also, my 2.6 UDP code uses separate send and receive buffers.  That
might have helped a little during startup (and anybody running NFSv3
UDP anytime). The old original TIRPC library code uses 1 common buffer
for all UDP calls and replies.  Was blocked waiting a lot of the time.



Shutdown had been horrible, but it would eventually shutdown, then it was
improved. And now it's horrible again.


Since the RDMA v3 work, the svc_work_pool took 240 seconds to shutdown
(an old constant from the former thrd_pool code).  But it didn't run
unless you were testing RDMA.

In 2.6, added that pool for every transport, and decreased the timeout
to 31 seconds.  Recently added a quicker 1 second shutdown wait loop
to signal the pool, and that usually runs twice.  So 2 seconds.



If anyone has any thoughts on how to stabilize these so they don't
constantly have ups and downs that would be really nice...


What do your logs say?

Mine weren't showing anything bad:

25/10/2017 13:59:47 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[Admin] 
do_shutdown :MAIN :EVENT :NFS EXIT: stopping NFS service
...
25/10/2017 13:59:50 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[Admin] 
do_shutdown :THREAD :EVENT :Worker threads successfully shut down.
...
25/10/2017 13:59:50 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[Admin] rpc :TIRPC 
:DEBUG :work_pool_shutdown() "svc_" 7
...
25/10/2017 13:59:51 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[Admin] rpc :TIRPC 
:DEBUG :work_pool_shutdown() "svc_" 1
...
25/10/2017 13:59:52 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[Admin] 
do_shutdown :MAIN :EVENT :Destroying the FSAL system.
...
25/10/2017 13:59:52 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[Admin] 
do_shutdown :MAIN :EVENT :FSAL system destroyed.
25/10/2017 13:59:52 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[main] 
nfs_start :MAIN :EVENT :NFS EXIT: regular exit
25/10/2017 13:59:52 : epoch 59f0d10c : simpson91 : ganesha.nfsd-28060[main] 
fs_clean_old_recov_dir_impl :CLIENT ID :EVENT :Failed to open old v4 recovery 
dir (/home/bill/rdma/install/var/lib/nfs/ganesha/v4old/node0), errno=2

3 seconds to unreg and shutdown all the fridge threads.
2 seconds for ntirpc worker threads.
5 seconds total?

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] kkeithle pushed to libntirpc (f27). "libntirpc 1.5.3 PR https://github.com/nfs-ganesha/ntirpc/pull/85"

2017-10-20 Thread William Allen Simpson

On 10/20/17 1:44 PM, Kaleb S. KEITHLEY wrote:

On 10/20/2017 01:32 PM, Florian Weimer wrote:

Do you regularly import fixes from the glibc code into libntirpc?



I don't know. That's a question for the ntirpc developers (cc'd)


This isn't a fix "from the glibc code"; this is a fix because of a
newly discovered bug in glibc.

Yes, there are other workarounds for changes to glibc code, compared
with ancient library practices:

auth_unix.c:

/* According to glibc comments, an intervening setgroups(2)
 * call can increase the number of supplemental groups between
 * these two getgroups(2) calls. */

rpc_callmsg.c:

/* in glibc 2.14+ x86_64, memcpy no longer tries to handle overlapping areas,
 * see Fedora Bug 691336 (NOTABUG); we dont permit overlapping segments,
 * so memcpy may be a small win over memmove.
 */
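
The auth_unix.c comment above refers to the usual two-step getgroups(2)
pattern; a minimal standalone sketch of it (not the library code):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int
main(void)
{
    gid_t *groups = NULL;
    int n;

    for (;;) {
        n = getgroups(0, NULL);          /* first call: count only */
        if (n < 0)
            return 1;
        groups = realloc(groups, (n ? n : 1) * sizeof(gid_t));
        if (groups == NULL)
            return 1;
        n = getgroups(n, groups);        /* second call: fetch them */
        if (n >= 0)
            break;
        if (errno != EINVAL)
            return 1;                    /* real error */
        /* EINVAL: the list grew between the two calls, so retry */
    }
    printf("%d supplementary groups\n", n);
    free(groups);
    return 0;
}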

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] V2.5-stable maintenance

2017-10-06 Thread William Allen Simpson

On 10/6/17 9:50 AM, Frank Filz wrote:

But if we can build a script that would easily show which patches have or have 
not been backported, that would be a big help. Right now, I’m always so 
confused as to what has actually been backported…


OR we could stop backporting so much, and release new versions every
couple of months, then say "that's fixed in the 2.x release, upgrade"!

Noting that Gluster seems to have moved from 3.8 to 3.12 in less time
than Ganesha gets out 1 update.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Pull up NTIRPC #80 & #81

2017-10-03 Thread William Allen Simpson

On 10/3/17 12:53 PM, Daniel Gryniewicz wrote:

I don't think this is blocking QE, as they haven't hit it again either, so it 
can probably just go in in the normal merge this week.


It was my understanding that they stopped testing.  If they aren't
having any more problems, why oh why did I spend a week on this?!?!

Okay then, I'll just go away until this has been merged and been
thoroughly tested by QE

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel

