Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: gtest/CMakeLists: libraries for commit2

2018-04-23 Thread William Allen Simpson
This list has been deprecated. Please subscribe to the new devel list at lists.nfs-ganesha.org. On 4/23/18 10:01 AM, GerritHub wrote: Frank, who is responsible for changing GerritHub?

Re: [Nfs-ganesha-devel] Couple of stat issues.

2018-04-10 Thread William Allen Simpson
On 4/10/18 8:49 PM, Pradeep wrote: 2. In nfs_rpc_execute(), the queue_wait is set to the difference between op_ctx->start_time and reqdata->time_queued. But reqdata->time_queued is never set (in the old code - pre 2.6-dev5, nfs_rpc_enqueue_req() used to set it; now only 9P code sets it). Is nf
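A minimal sketch of the queue_wait arithmetic described above, assuming both timestamps are struct timespec values from the same clock (the real Ganesha types may differ). The helper names are hypothetical and not part of Ganesha. It shows why an unset time_queued matters: subtracting an all-zero stamp records the whole clock value as wait time.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds from 'earlier' to 'later'. Hypothetical helper, not a
 * Ganesha function; both stamps are assumed to use the same clock. */
static int64_t timespec_diff_ns(const struct timespec *later,
				const struct timespec *earlier)
{
	assert(later != NULL && earlier != NULL);
	return (int64_t)(later->tv_sec - earlier->tv_sec) * 1000000000LL +
	       (later->tv_nsec - earlier->tv_nsec);
}

/* Queue wait for one request. Returns -1 when time_queued was never
 * stamped (all zero), so the caller can skip the sample instead of
 * recording a bogus, very large wait. */
static int64_t queue_wait_ns(const struct timespec *start_time,
			     const struct timespec *time_queued)
{
	assert(start_time != NULL && time_queued != NULL);
	if (time_queued->tv_sec == 0 && time_queued->tv_nsec == 0)
		return -1;
	return timespec_diff_ns(start_time, time_queued);
}
```

Whether to skip the sample or stamp the request on arrival is a design choice; the sketch only makes the failure mode visible.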

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-28 Thread William Allen Simpson
On 3/27/18 9:34 AM, William Allen Simpson wrote: On 3/25/18 1:44 PM, William Allen Simpson wrote: On 3/23/18 1:30 PM, William Allen Simpson wrote: Ran some apples to apples comparisons today V2.7-dev.5: Without the client-side rbtrees, rpcping works a lot better: Thought of a small tweak

[Nfs-ganesha-devel] nfstest_delegation

2018-03-28 Thread William Allen Simpson
I see that Patrice hasn't posted here about this problem yet. Linux client folks say our V2.7-dev delegations aren't working. At this week's bake-a-thon, Patrice has tried turning it on a couple of different ways. Shouldn't delegations be on by default? Could we get the nfstest suite added to

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-27 Thread William Allen Simpson
On 3/25/18 1:44 PM, William Allen Simpson wrote: On 3/23/18 1:30 PM, William Allen Simpson wrote: Ran some apples to apples comparisons today V2.7-dev.5: Without the client-side rbtrees, rpcping works a lot better: Thought of a small tweak to the list adding routine, so it doesn't kic

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-25 Thread William Allen Simpson
On 3/23/18 1:30 PM, William Allen Simpson wrote: Ran some apples to apples comparisons today V2.7-dev.5: Without the client-side rbtrees, rpcping works a lot better: Ganesha (worst, best): rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 version=3 procedure=0

Re: [Nfs-ganesha-devel] rpcping profile

2018-03-25 Thread William Allen Simpson
On 3/24/18 7:50 AM, William Allen Simpson wrote: Noting that the top problem is exactly my prediction by knowledge of the code:   clnt_req_callback() opr_rbtree_insert() The second is also exactly as expected:   svc_rqst_expire_insert() opr_rbtree_insert() svc_rqst_expire_cmpf() These are

[Nfs-ganesha-devel] rpcping profile

2018-03-24 Thread William Allen Simpson
Using local file tests/rpcping.
Using local file ../profile.
Total: 989 samples
     321  32.5%  32.5%      321  32.5% svc_rqst_expire_cmpf
     149  15.1%  47.5%      475  48.0% opr_rbtree_insert
     139  14.1%  61.6%      140  14.2% __writev
      56   5.7%  67.2%       66   6.7% __GI___pthread

[Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-23 Thread William Allen Simpson
Ran some apples to apples comparisons today V2.7-dev.5: Ganesha (worst, best): rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 version=3 procedure=0): mean 33950.1556, total 33950.1556 rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13

Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-23 Thread William Allen Simpson
On 3/23/18 7:59 AM, Daniel Gryniewicz wrote: Thanks, Tomk.  PR is here: https://review.gerrithub.io/404945 Actually, it seems fairly elegant. ntirpc and rdma also have the USE_ and _USE_ convention. Both require libraries, and would benefit from defaults with enforcement checking for the cma

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.4

2018-03-18 Thread William Allen Simpson
On 3/17/18 11:01 AM, Jeff Layton wrote: See: https://review.gerrithub.io/c/404231/ Thanks. A more pro-active approach to be sure. I just assumed Frank would quickly fix it and push a new dev.4a when he saw it Sat. Nice to see I'm not the only one coding on weekends.

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.4

2018-03-17 Thread William Allen Simpson
On 3/16/18 7:23 PM, Frank Filz wrote: Branch next Tag:V2.7-dev.4 NOTE: This merge includes an ntirpc pullup, please update your submodule This is a big merge with a lot of cleanup. Doesn't compile for me. [ 17%] Building C object Protocols/NFS/CMakeFiles/nfsproto.dir/nfs4_Compound.c.o In fi

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup NTIRPC through #124

2018-03-16 Thread William Allen Simpson
On 3/16/18 10:07 AM, GerritHub wrote: william.allen.simp...@gmail.com has uploaded this change for *review*. View Change I see that our ci.centos.org now provides dbench and iozone. The dbench results are in its log: + tail -21 ../dbenchTestLog.txt Oper

Re: [Nfs-ganesha-devel] A question about rpc requests maybe for Bill

2018-03-15 Thread William Allen Simpson
On 3/15/18 7:57 PM, Frank Filz wrote: NFS v4.1 has a max request size option for the session, I’m wondering if there’s a way to get the size of a given request easily. Depends on how that's defined. Bytes following header? And what you need to do with it. It might be simplest to add a data

Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread William Allen Simpson
On 3/15/18 10:23 AM, Daniel Gryniewicz wrote: Can you try again with a larger count, like 100k? 500 is still quite small for a loop benchmark like this. In the code, I commented that 500 is minimal. I've done a pile of 100, 200, 300, and they perform roughly the same as 500. rpcping tcp loca

Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread William Allen Simpson
On 3/14/18 3:33 AM, William Allen Simpson wrote: rpcping tcp localhost threads=1 count=500 (port=2049 program=13 version=3 procedure=0): mean 51285.7754, total 51285.7754 DanG pushed the latest code onto ntirpc this morning, and I'll submit a pullup for Ganesha later today. I'

Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson
On 3/14/18 7:27 AM, Matt Benjamin wrote: Daniel doesn't think you've measured much accurately yet, but at least the effort (if not the discussion) aims to. I'm sure Daniel can speak for himself. At your time of writing, Daniel had not yet arrived in the office after my post this am. So I'm as

Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson
On 3/13/18 1:58 PM, Daniel Gryniewicz wrote: rpcping was not thread safe.  I have fixes for it incoming. With DanG's significant help, we now have better timing results. There was an implicit assumption in the ancient code that it was calling single threaded tirpc, while ntirpc is multi-thread

Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson
On 3/13/18 8:27 AM, Matt Benjamin wrote: On Tue, Mar 13, 2018 at 2:38 AM, William Allen Simpson wrote: but if we assume xids retire in xid order also, They do. Should be no variance. Eliminating the dupreq caching -- also using the rbtree -- significantly improved the timing.

Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson
On 3/13/18 2:38 AM, William Allen Simpson wrote: In my measurements, using the new CLNT_CALL_BACK(), the client thread starts sending a stream of pings.  In every case, it peaks at a relatively stable rate. DanG suggested that timing was dominated by the system time calls. The previous

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
On 3/12/18 6:25 PM, Matt Benjamin wrote: If I understand correctly, we always insert records in xid order, and xid is monotonically increasing by 1. I guess pings might come back in any order, No, they always come back in order. This is TCP. I've gone to some lengths to fix the problem that
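If calls are appended in increasing xid order and replies return in that same order, a plain FIFO can match replies without a tree. The following is only a sketch under that assumption: the type and function names are hypothetical, and this is not the ntirpc call_replies code. Appending is O(1), and an in-order reply is found at the head on the first comparison; an out-of-order xid still works, but walks the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending-call record; not an ntirpc structure. */
struct pending_call {
	uint32_t xid;
	struct pending_call *next;
};

struct pending_queue {
	struct pending_call *head;
	struct pending_call **tailp;	/* address of the final next pointer */
};

static void pending_init(struct pending_queue *q)
{
	q->head = NULL;
	q->tailp = &q->head;
}

/* O(1) append; callers are assumed to append in increasing xid order. */
static void pending_append(struct pending_queue *q, struct pending_call *pc)
{
	assert(pc != NULL);
	pc->next = NULL;
	*q->tailp = pc;
	q->tailp = &pc->next;
}

/* Unlink and return the call with this xid, or NULL if not pending. */
static struct pending_call *pending_take(struct pending_queue *q, uint32_t xid)
{
	struct pending_call **pp = &q->head;

	while (*pp != NULL) {
		struct pending_call *pc = *pp;

		if (pc->xid == xid) {
			*pp = pc->next;
			if (q->tailp == &pc->next)
				q->tailp = pp;	/* removed the last node */
			pc->next = NULL;
			return pc;
		}
		pp = &pc->next;
	}
	return NULL;
}
```

A real client would also need locking around the queue and a timeout path; both are left out here.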

[Nfs-ganesha-devel] WAVL tree

2018-03-12 Thread William Allen Simpson
New in 2015. https://en.wikipedia.org/wiki/WAVL_tree There's a C++ intrusive container implementation at: https://fuchsia.googlesource.com/zircon/+/master/system/ulib/fbl/include/fbl/intrusive_wavl_tree.h I've not found a standard C implementation yet.

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
[These are with a Ganesha that doesn't dupreq cache the null operation.] Just how slow is this RB tree? Here's a comparison of 1000 entries versus 100 entries in ops per second: rpcping tcp localhost threads=5 count=1000 (port=2049 program=13 version=3 procedure=0): average 2963.2517, tota

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
One of the limiting factors in our Ganesha performance is that the NULL operation is going through the dupreq code. That can be easily fixed with a check that jumps to nocache. One of the limiting factors in our ntirpc performance seems to be the call_replies tree that stores the xid of calls to

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
total 17647.0588 rpcping tcp localhost threads=10 count=500 (port=2049 program=13 version=4 procedure=0): 1731.3390, total 17313.3903 rpcping tcp localhost threads=15 count=500 (port=2049 program=13 version=4 procedure=0): 1142.3732, total 17135.5981 On 3/8/18 8:03 PM, William Allen Si

Re: [Nfs-ganesha-devel] zero-copy read

2018-03-11 Thread William Allen Simpson
On 3/11/18 7:15 AM, William Allen Simpson wrote: On 3/10/18 11:18 AM, Matt Benjamin wrote: Marcus has code that prototypes using gss_iov from mit-krb5 1.1.12.  I recall describing this to you in 2013. That would be surprising, as I didn't start working on this project until a year or so

Re: [Nfs-ganesha-devel] [nfs-ganesha/ntirpc] rpcping ncreatef (#115)

2018-03-11 Thread William Allen Simpson
On 3/9/18 5:38 PM, Matt Benjamin wrote: I might be missing something, but it looked to me like the trick to talking to nfs-ganesha is to bypass the binder more-or-less as an nfsv4 backchannel does. I'm not sure this is a good idea, unless we are really desperate for Ganesha numbers. I've alre

Re: [Nfs-ganesha-devel] zero-copy read

2018-03-11 Thread William Allen Simpson
On 3/10/18 11:18 AM, Matt Benjamin wrote: Marcus has code that prototypes using gss_iov from mit-krb5 1.1.12. I recall describing this to you in 2013. That would be surprising, as I didn't start working on this project until a year or so later than that. Anyway, last year Marcus sent me a

Re: [Nfs-ganesha-devel] zero-copy read

2018-03-10 Thread William Allen Simpson
On 3/10/18 10:24 AM, William Allen Simpson wrote: Finally, and what I'll do this weekend, my attempt to edit xdr_nfs23.c won't pass checkpatch commit, because all the headers are still pre-1989 pre-ANSI K&R-style. Unfortunately, Red Hat Linux doesn't seem to have cproto bu

[Nfs-ganesha-devel] zero-copy read

2018-03-10 Thread William Allen Simpson
Now that DanG has a workable vector i-o for read and write, I'm trying again to make reading zero-copy. Man-oh-man, do we have our work cut out for us. It seems that currently we provide a buffer to read. Then XDR makes a new object, puts headers into it, makes another data_val and copies da

[Nfs-ganesha-devel] nfs_worker_thread never executed code section

2018-03-10 Thread William Allen Simpson
-0700 1413) if (dpq_status == DUPREQ_SUCCESS) 00c21a6878 (William Allen Simpson 2015-06-12 07:36:46 -0400 1414) dpq_status = nfs_dupreq_finish(&reqdata->r_u.req.svc, res_nfs); 02526d7325 (Jim Lieb 2013-10-10 20:50:47 -0700 1415) goto fre

Re: [Nfs-ganesha-devel] rpcping

2018-03-08 Thread William Allen Simpson
On 3/8/18 12:33 PM, William Allen Simpson wrote: Still having no luck.  Instead of relying on RPC itself, checked with Ganesha about what it registers, and tried some of those. Without running Ganesha, rpcinfo reports portmapper services by default on my machine. Can talk to it via localhost

[Nfs-ganesha-devel] rpcping

2018-03-08 Thread William Allen Simpson
Still having no luck. Instead of relying on RPC itself, checked with Ganesha about what it registers, and tried some of those. The default procedure is 0, that according to every RFC is reserved for do nothing. But rpcbind is not finding program and version. To be honest, I'm not sure how this

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson
On 2/24/18 5:18 AM, William Allen Simpson wrote: On 2/24/18 4:42 AM, William Allen Simpson wrote: [top post for visibility] Says ntirpc pullup (twice), but doesn't actually have:   * "Pullup NTIRPC through #106" Missing "(nfs41.h) unindent" checkpatch cleanup, even th

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson
On 2/24/18 4:42 AM, William Allen Simpson wrote: [top post for visibility] Says ntirpc pullup (twice), but doesn't actually have:  * "Pullup NTIRPC through #106" Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd agreed this was the best tim

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson
[top post for visibility] Says ntirpc pullup (twice), but doesn't actually have: * "Pullup NTIRPC through #106" Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd agreed this was the best time to do it, and it had all the expected +1 and +2. Literally no changes to running code,

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Remove unused fsal_read2() and fsal_write2()

2018-02-23 Thread William Allen Simpson
On 2/22/18 1:32 PM, GerritHub wrote: Daniel Gryniewicz has uploaded this change for *review*. View Change Remove unused fsal_read2() and fsal_write2() I've reviewed, but the write showed up in my inbox before the read, followed by this cleanup. But the wr

Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson
On 2/21/18 11:35 AM, Frank Filz wrote: There's a -n or --no-verify option that will bypass the commit hooks. I suggest trying to commit without that first to make sure the only checkpatch errors/warnings are for the spacing around * and then commit again with -n to bypass checkpatch to actuall

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: MainNFSD: invert _NO_PORTMAPPER option

2018-02-21 Thread William Allen Simpson
On 2/21/18 4:51 PM, Jeff Layton wrote: On Wed, 2018-02-21 at 13:40 -0800, Frank Filz wrote: We could take this opportunity to change the option to RPCBIND... Fair enough. I'd support this. I actually disagree with the "no udp" statement above too. UDP is great for single-shot request pro

Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson
On 2/21/18 8:28 AM, William Allen Simpson wrote: Anyway, I'll try to push a patch for those 2 files by tomorrow. ERROR: "foo * bar" should be "foo *bar" #18656: FILE: src/include/nfsv41.h:9900: +static inline bool xdr_CB_COMPOUND4res(XDR * xdrs, total: 450 errors,

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: MainNFSD: invert _NO_PORTMAPPER option

2018-02-21 Thread William Allen Simpson
On 2/21/18 1:59 PM, GerritHub wrote: Jeff Layton has uploaded this change for *review*. View Change MainNFSD: invert _NO_PORTMAPPER option The fact that this is a "negative" option is confusing. Change it to a "PORTMAPPER" option, and have it default to ON.

Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson
On 2/20/18 1:06 PM, Frank Filz wrote: As I'm trying to update nfs41.h, I've run into the problem that the commit check is complaining that the pointer '*' on parameters is sometimes " * v" and others " *v" -- usually the same function definition. Presumably the generator made these. They ar

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-21 Thread William Allen Simpson
I really don't have time today to respond to every one-liner throw-away comment here, so I'll try to stick to the most cogent. On 2/20/18 8:33 AM, Matt Benjamin wrote: On Tue, Feb 20, 2018 at 8:12 AM, William Allen Simpson wrote: On 2/18/18 2:47 PM, Matt Benjamin wrote: On Fri, Fe

[Nfs-ganesha-devel] gtest results? profiles?

2018-02-20 Thread William Allen Simpson
Now that we have a pile of nice gtests, who is compiling the results? Please post them here. Also, DanG told me yesterday that he has a profile of the lookup test. Please post that here. That will allow us to better target the CPU bottlenecks.

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-20 Thread William Allen Simpson
On 2/18/18 2:47 PM, Matt Benjamin wrote: On Fri, Feb 16, 2018 at 11:23 AM, William Allen Simpson But the planned 2.7 improvements are mostly throughput related, not IOPS. Not at all, though I am trying to ensure that we get async FSAL ops in. There are people working on IOPs too. async

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL

2018-02-20 Thread William Allen Simpson
On 2/19/18 12:12 PM, Sachin Punadikar wrote: Hi Bill, I rechecked the logs & discussed with Daniel. I missed the log entries related to FSAL. So for this customer it looks like an FSAL issue rather than a Ganesha issue. Thanks for the update. The default log levels don't always show enough.

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-19 Thread William Allen Simpson
On 2/16/18 4:35 PM, Deepak Jagtap wrote:    There is only one client with 2 worker threads each worker thread generating 64 outstanding requests. Configured both servers (knfs & ganesha) with 128 worker threads.    I think client is fine  and is sending the request concurrently.  For knfs an

Re: [Nfs-ganesha-devel] READDIR doesn't return all entries.

2018-02-19 Thread William Allen Simpson
On 2/13/18 8:00 PM, Frank Filz wrote: You still don’t mention FSAL… I’m suspecting non-unique cookies from the FSAL as a cause. You may want to turn on CACHE_INODE and NFS_READDIR to FULL_DEBUG to see what is going on. A tcpdump trace won’t show anything useful (since we won’t see what cookies a

Re: [Nfs-ganesha-devel] testing

2018-02-18 Thread William Allen Simpson
On 2/15/18 1:17 PM, Frank Filz wrote: Between your test message, and this test message, I've received 5 messages Subject: ACL support. My Friday messages do not yet appear in the list archive: List-Archive:

[Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-18 Thread William Allen Simpson
As I'm trying to update nfs41.h, I've run into the problem that the commit check is complaining that the pointer '*' on parameters is sometimes " * v" and others " *v" -- usually the same function definition. Presumably the generator made these. They are cosmetic. Why oh why are we checking thi

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL

2018-02-18 Thread William Allen Simpson
On 2/15/18 6:44 AM, GerritHub wrote: Sachin Punadikar has uploaded this change for *review*. View Change "mdc_lookup" do not dispatch to FSAL Are you sure? Do you have an actual reproducible error case? "mdc_lookup" function first attempts to get the e

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-18 Thread William Allen Simpson
On 2/14/18 8:32 AM, Daniel Gryniewicz wrote: How many clients are you using? Each client op can only (currently) be handled in a single thread, and clients won't send more ops until the current one is ack'd, so Ganesha can basically only parallelize on a per-client basis at the moment. Act

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-15 Thread William Allen Simpson
On 2/13/18 6:20 PM, Deepak Jagtap wrote: Tried both v2.5-stable and 2.6 (next branch). Noticed marginal improvement, ~19K IOPS with 2.6 compared to ~18K IOPS with 2.5. Somewhat disappointed that you only found a 5.6% improvement in IOPs. V2.6 streamlines the input path. (V2.5 streamlined the

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-13 Thread William Allen Simpson
On 2/13/18 1:21 AM, Malahal Naineni wrote: If your latency is high, then you most likely need to change Dispatch_Max_Reqs_Xprt. What is your Dispatch_Max_Reqs_Xprt value? That shouldn't do anything anymore in V2.6, other than 9P.

Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-07 Thread William Allen Simpson
On 2/6/18 10:40 AM, Daniel Gryniewicz wrote: On 02/06/2018 10:26 AM, William Allen Simpson wrote: On 2/6/18 8:25 AM, Daniel Gryniewicz wrote: Hi, all. I've worked up a sample API for async/vector for FSAL ops.  The example op is read(), and I've "implemented" it for all

[Nfs-ganesha-devel] V2.6 fixed UDP address and port (and more)

2018-02-06 Thread William Allen Simpson
For a long time, there have been problems with UDP. It has not been a priority, under the assumption that most folks have moved to TCP. And it wasn't tested much. Malahal tried a quick and dirty fix with a copy of the IP address in each service request structure. But all UDP requests were usin

Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-06 Thread William Allen Simpson
On 2/6/18 8:25 AM, Daniel Gryniewicz wrote: Hi, all. I've worked up a sample API for async/vector for FSAL ops.  The example op is read(), and I've "implemented" it for all FSALs, so that I can verify that it does, in fact, work for some definition of work. I'm a bit surprised it works, as t

Re: [Nfs-ganesha-devel] Features board list

2018-02-01 Thread William Allen Simpson
On 2/1/18 8:04 AM, Supriti Singh wrote: It seems like github does not support organization level board to be visible. https://github.com/isaacs/github/issues/935 :/ Let's just keep this design. If github already knows about the issue, then maybe they'll fix it. My problem is that I only rare

Re: [Nfs-ganesha-devel] Features board list

2018-01-31 Thread William Allen Simpson
On 1/31/18 1:11 PM, Supriti Singh wrote: I have created a new board here: https://github.com/orgs/nfs-ganesha/projects . It's an organization-wide project board. Everyone who belongs to the nfs-ganesha organization should have write access. Can you check if you have write access to this board? YES! Bu

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Move nfs_Init_svc() after create_pseudofs().

2018-01-31 Thread William Allen Simpson
On 1/31/18 3:11 PM, GerritHub wrote: Frank Filz *posted comments* on this change. View Change It seems to me that the dupreq2_pkginit() is already in about the right place, just after nfs_Init_client_id(). Moving it before doesn't do much. Patch set 2:

Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson
On 1/31/18 10:33 AM, Daniel Gryniewicz wrote: On 01/31/2018 10:27 AM, William Allen Simpson wrote: On 1/31/18 8:44 AM, Daniel Gryniewicz wrote: Agreed. Daniel On 01/30/2018 11:46 PM, Malahal Naineni wrote: Looking at the code, dupreq2_pkginit() only depends on Ganesha config processing to

Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson
On 1/31/18 10:27 AM, William Allen Simpson wrote: On 1/31/18 8:44 AM, Daniel Gryniewicz wrote: Agreed. Daniel On 01/30/2018 11:46 PM, Malahal Naineni wrote: Looking at the code, dupreq2_pkginit() only depends on Ganesha config processing to initialize a few things, so it should be OK to call

Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson
On 1/31/18 8:44 AM, Daniel Gryniewicz wrote: Agreed. Daniel On 01/30/2018 11:46 PM, Malahal Naineni wrote: Looking at the code, dupreq2_pkginit() only depends on Ganesha config processing to initialize a few things, so it should be OK to call anytime after Ganesha config processing. Regards,

Re: [Nfs-ganesha-devel] Features board list

2018-01-31 Thread William Allen Simpson
On 1/30/18 12:03 PM, Supriti Singh wrote: Hello all, As discussed in community call, I am sharing the feature board list: https://github.com/nfs-ganesha/nfs-ganesha/projects for nfs-ganesha 2.6 and 2.7. The aim is to use these boards to track the planned features for major release. The hope is t

Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson
On 1/30/18 9:36 AM, William Allen Simpson wrote: But the code is obscure, so I could be missing something. Also, it bears repeating that the dupreq cache wasn't working for secure connections. Pre-V2.6 checksummed the ciphertext, which is by definition different on every request.

Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson
On 1/30/18 9:22 AM, William Allen Simpson wrote: On 1/29/18 3:32 PM, Frank Filz wrote: I haven't looked at how the SVCXPRT structure has changed, but if there's a field in there we can attach a Ganesha structure to that would be cool, or if not, if we could add one. There are two:

Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson
On 1/29/18 3:32 PM, Frank Filz wrote: I haven't looked at how the SVCXPRT structure has changed, but if there's a field in there we can attach a Ganesha structure to that would be cool, or if not, if we could add one. There are two: xp_u1, and xp_u2. Right now, Ganesha is using xp_u2 for dup r

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-30 Thread William Allen Simpson
On 1/29/18 2:27 PM, Daniel Gryniewicz wrote: On 01/29/2018 02:09 PM, William Allen Simpson wrote: On 1/29/18 1:13 PM, GerritHub wrote: Daniel Gryniewicz has uploaded this change for *review*. View Change <https://review.gerrithub.io/397004> Pullup ntirpc 1.6 (svc_vc) rearm after EAGA

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-29 Thread William Allen Simpson
On 1/29/18 1:13 PM, GerritHub wrote: Daniel Gryniewicz has uploaded this change for *review*. View Change Pullup ntirpc 1.6 (svc_vc) rearm after EAGAIN and EWOULDBLOCK (Note, previous pullup was erroneously from 1.7) All my weekend patches need to be bac

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-28 Thread William Allen Simpson
On 1/27/18 4:07 PM, Pradeep wrote: Here is what I see in the log (the '2' is what I added to figure out which recv failed): nfs-ganesha-199008[svc_948] rpc :TIRPC :WARN :svc_vc_recv: 0x7f91c0861400 fd 21 recv errno 11 (try again) 2 176 The fix looks good. Thanks Bill. Thanks for the excell

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread William Allen Simpson
On 1/27/18 9:56 AM, William Allen Simpson wrote: I'm not able to reproduce.  Could you tell me which EAGAIN is happening?  The log line will say "svc_vc_wait" or "svc_vc_recv", and have the actual error code on it.  Maybe this is EWOULDBLOCK? Of course, neither EAGAIN

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread William Allen Simpson
On 1/26/18 8:53 PM, William Allen Simpson wrote: In fact, I don't understand how we could get EAGAIN, according to the documentation.  But it's logged.  Good idea about differentiating the two identical log lines.  I'd prefer text rather than the number 2. And in the adjacent c

[Nfs-ganesha-devel] V2.6-rc4 connect to statd failed

2018-01-27 Thread William Allen Simpson
With Dan's latest ntirpc update, I'm seeing a new error. But this is my first testing on Fedora 27, so maybe a Fedora change? nsm_connect :NLM :CRIT :connect to statd failed: RPC: Unknown protocol Actually, that's not exactly how the error looks; the string list is missing its commas. My bad.

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-26 Thread William Allen Simpson
On 1/26/18 12:18 PM, Pradeep wrote: In svc_vc_recv(), we handle the case of incomplete receive by rearming the FD and returning ( if xd->sx_fbtbc is not zero). In the case of EAGAIN also shouldn't we be doing the same? epoll is ONESHOT; so new receives won't give new events until epoll_ctl() is c
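The rearm question above can be sketched with plain Linux epoll. This is an illustration only, not the svc_vc_recv() code: recv_or_rearm() and RECV_REARMED are hypothetical names. With EPOLLONESHOT the descriptor is disabled after one event, so a receive that returns EAGAIN or EWOULDBLOCK has to call epoll_ctl() with EPOLL_CTL_MOD before returning, or no further events arrive for that descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical return code: nothing to read, descriptor rearmed. */
#define RECV_REARMED (-2)

/* Try a non-blocking receive on a ONESHOT-registered fd.
 * Returns bytes read (> 0), 0 if the peer closed, -1 on a hard error,
 * or RECV_REARMED when no data was available and the fd was rearmed. */
static ssize_t recv_or_rearm(int epfd, int fd, void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL);
	n = recv(fd, buf, len, MSG_DONTWAIT);
	if (n >= 0)
		return n;

	if (errno == EAGAIN || errno == EWOULDBLOCK) {
		struct epoll_event ev = { .events = EPOLLIN | EPOLLONESHOT };

		ev.data.fd = fd;
		/* ONESHOT disabled the fd after the last event; re-enable it. */
		if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0)
			return RECV_REARMED;
	}
	return -1;
}
```

A caller would treat RECV_REARMED as "return to the event loop and wait for the next event", much like the incomplete-receive case mentioned in the message.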

Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson
On 1/23/18 9:35 AM, William Allen Simpson wrote: On 1/23/18 9:31 AM, Daniel Gryniewicz wrote: On 01/23/2018 09:04 AM, William Allen Simpson wrote: On 1/22/18 8:08 PM, Pradeep wrote: Looked at dev.22 and we were handling this error case correctly there. No, we're handling this error

Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson
On 1/23/18 9:31 AM, Daniel Gryniewicz wrote: On 01/23/2018 09:04 AM, William Allen Simpson wrote: On 1/22/18 8:08 PM, Pradeep wrote: Looked at dev.22 and we were handling this error case correctly there. No, we're handling this error case correctly now. Either you forgot to update

Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson
On 1/22/18 8:08 PM, Pradeep wrote: Hello, I'm running into a crash in libntirpc with rc2: #2  #3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e "localhost", prog=100024, vers=1,     netclass=0x57592a "tcp", tp=0x0) at /usr/src/debug/nfs-ganesha-2.6-rc2/libntirpc/src/clnt_gener

[Nfs-ganesha-devel] ntirpc v1.6.0 tagged, so nfs-ganesha v2.6.0 should be tagged

2018-01-13 Thread William Allen Simpson
Unfortunately, on Friday DanG tagged ntirpc v1.6.0. We had decided in conference calls to wait another week or so, until pull #99 had an opportunity to be widely tested, and my related RDMA patches could be updated. But DanG was sick, and missed the calls. That's why decisions made on calls sho

[Nfs-ganesha-devel] NTIRPC ENOMEM

2017-12-20 Thread William Allen Simpson
DanG has raised an interesting issue about recovery from low memory. In Ganesha, we've been assiduously changing NULL checks to assert or segfault on alloc failures. Just had a few more patches by Kaleb. Since 2013 or 2014, we've been doing the same to NTIRPC. There are currently 105 mem_.*allo

Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-15 Thread William Allen Simpson
On 12/14/17 1:13 PM, William Allen Simpson wrote: This is May 2015 code, based upon 2012 code.  Obviously, we haven't been testing error responses ;) I wanted to add a personal thank you for such an excellent bug report. The patch went in the next branch yesterday, and should show

Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-14 Thread William Allen Simpson
This is May 2015 code, based upon 2012 code. Obviously, we haven't been testing error responses ;) Not quite. That would need to be duplicated for each of the error conditions. Instead, it should be a bit higher in the function. Still, I'll keep it duplicated from the ACCEPTED code path,

Re: [Nfs-ganesha-devel] Announce Push of V2.6-dev.21

2017-12-14 Thread William Allen Simpson
On 12/12/17 4:39 PM, Frank Filz wrote: Branch next Tag:V2.6-dev.21 Release Highlights * new version of checkpatch * checkpatch fixes for existing code I'd been hoping that a mid-week release meant the crash during shutdown was fixed, but apparently not: Thread 270 "ganesha.nfsd" received

[Nfs-ganesha-devel] dev.20 segfault on shutdown

2017-12-11 Thread William Allen Simpson
I was testing code I'd written over the weekend, but it segfaulted on shutdown after running pynfs (pynfs itself was successful.) No problems simply starting and pkilling without doing any work. Gradually backed things out, until I'm at the 1a75e52 V2.6-dev.20, but still seeing the problem on sh

Re: [Nfs-ganesha-devel] enqueued_reqs/dequeued_reqs

2017-12-11 Thread William Allen Simpson
On 12/11/17 2:35 PM, Pradeep wrote: It looks like we don't increment enqueued_reqs/dequeued_reqs in the RPC anymore - nfs_rpc_enqueue_req() is replaced with nfs_rpc_process_request. Now that both values are zero, the health checker (get_ganesha_health) will never detect any RPC hangs. Should t

Re: [Nfs-ganesha-devel] libntirpc thread local storage

2017-12-09 Thread William Allen Simpson
On 12/9/17 5:28 PM, Matt Benjamin wrote: I've already proposed we remove this.  No one is invested in it, I don't think. OK. I'll take a poke at it today. It makes sense that this is a good time to handle, as we've already made a major change to CLNT_CALL in this release. ---

[Nfs-ganesha-devel] libntirpc thread local storage

2017-12-09 Thread William Allen Simpson
I've run into another TLS problem. It's been there since tirpc. Apparently, once upon a time, rpc_createerr was a static global. It still says that in the man pages. When a client create function fails, they stash the error there, and return NULL for the CLIENT. Basically, you check for NULL,
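
The convention described above can be sketched as follows. This is not the libntirpc API: demo_client, demo_createerr and both create functions are hypothetical stand-ins. The first function stashes the failure reason in a shared slot, and the second shows one possible alternative (an assumption, not necessarily what was adopted) where the caller owns the error storage.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the libntirpc types. */
enum demo_stat { DEMO_SUCCESS = 0, DEMO_UNKNOWNHOST, DEMO_SYSTEMERROR };

struct demo_client {
	int fd;
};

/* Legacy style: a single shared slot holds the last create error. */
static enum demo_stat demo_createerr = DEMO_SUCCESS;

/* Returns NULL on failure; the reason is left in demo_createerr. */
struct demo_client *demo_create_legacy(const char *host)
{
	if (host == NULL || host[0] == '\0') {
		demo_createerr = DEMO_UNKNOWNHOST;
		return NULL;
	}
	struct demo_client *clnt = calloc(1, sizeof(*clnt));
	if (clnt == NULL) {
		demo_createerr = DEMO_SYSTEMERROR;
		return NULL;
	}
	clnt->fd = -1;	/* no real connection in this sketch */
	return clnt;
}

/* Alternative: the error is returned through caller-owned storage. */
struct demo_client *demo_create(const char *host, enum demo_stat *err)
{
	assert(err != NULL);

	*err = DEMO_SUCCESS;
	if (host == NULL || host[0] == '\0') {
		*err = DEMO_UNKNOWNHOST;
		return NULL;
	}
	struct demo_client *clnt = calloc(1, sizeof(*clnt));
	if (clnt == NULL) {
		*err = DEMO_SYSTEMERROR;
		return NULL;
	}
	clnt->fd = -1;
	return clnt;
}
```

With the out-parameter form, the caller still checks for NULL but reads the reason from its own variable, so no global or thread-local lookup is involved.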

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Consolidate 9P queues and workers

2017-12-09 Thread William Allen Simpson
On 11/27/17 1:47 PM, GerritHub wrote: william.allen.simp...@gmail.com has uploaded this change for *review*. View Change Consolidate 9P queues and workers Move worker_thread and dispatch_thread queuing into 9P. This is no longer used by NFS RPC. Is there

Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-09 Thread William Allen Simpson
On 12/8/17 10:13 AM, Matt Benjamin wrote: I'd like to see this use of TLS as a "hidden parameter" replaced regardless. It has been a source of bugs, and locks us into a pthreads execution model I think needlessly. With future async FSAL calls, it's going to stop working. We already have a svc

Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-08 Thread William Allen Simpson
On 12/7/17 7:54 PM, Frank Filz wrote: Stacked FSALs often depend on op_ctx->fsal_export being set. We also have lots of FSAL methods that take the fsal_export as a parameter. The latter sounds better. Now that we know every single thread local storage access involves a hidden lock/unlock sequ

Re: [Nfs-ganesha-devel] Announce Push of V2.6-dev.17

2017-11-14 Thread William Allen Simpson
On 11/13/17 12:35 AM, Frank Filz wrote: 5ca449d Jan-Martin Rämer handle hosts via libcidr to unify IPv4/IPv4 host/network clients Ran pynfs, seeing some massive leaks that weren't there last week: Direct leak of 505080 byte(s) in 12627 object(s) allocated from: #0 0x76efcfe0 in calloc

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-07 Thread William Allen Simpson
On 11/6/17 8:41 PM, William Allen Simpson wrote: On 11/6/17 8:38 AM, Dominique Martinet wrote: One way that'd work for example would be have ganesha provide a pointer to SetNameFunction at init, it's a bit ugly though. Actually, that's how ntirpc calls various alloc and warnx f

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-06 Thread William Allen Simpson
On 11/6/17 8:38 AM, Dominique Martinet wrote: William Allen Simpson wrote on Mon, Nov 06, 2017 at 08:12:19AM -0500: If you've got [s]ome others in another library, they'll have to use the same library function. Other FSALs are in other libraries, but given how it's setup th

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-06 Thread William Allen Simpson
On 11/6/17 7:14 AM, GerritHub wrote: File src/log/log_functions.c: o Patch Set #2, Line 1438: |name = work_pool_worker_name();|

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-04 Thread William Allen Simpson
On 11/4/17 1:43 AM, Matt Benjamin wrote: oh, come on. not sure what needs to be done to reduce log noise, but I'm sure we can make a dent. Apparently, you and I reviewed and wrote our messages in opposite order. This patch does that. Complaining about the log messages now that we put in l

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson
We already discussed this on Tuesday, October 24th. Malahal agreed that a half second was good, 3 seconds was OK, 5 seconds was long. And Matt agreed we'd log more than 10 seconds. Obviously, you have vastly more Internet experience than I, and therefore are much better able to decide Internet t

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson
On 11/3/17 2:06 PM, Frank Filz wrote: Please respond inside gerrit to keep the conversation in one place. The discussion of timeouts was always in public on this list, and was also on the public conference call a week ago. If Gerrit doesn't record its replies, unlike Github, you should probabl

Re: [Nfs-ganesha-devel] tirpc warning

2017-11-03 Thread William Allen Simpson
On 11/3/17 7:46 PM, Frank Filz wrote: Can we tone down this warning: 2017-11-03 16:14:21 [svc_4] :0 :rpc :TIRPC :INFO :clnt_req_alloc:470 tv_sec 15 > 10 That is spamming the log when at INFO level. I see that you didn't put in CLNT_CALL with clnt_req, that eliminated all current examples of t

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson
On 11/2/17 1:01 PM, GerritHub wrote: Frank Filz *posted comments* on this change. View Change Patch set 1: (2 comments) * File src/MainNFSD/nfs_rpc_callback.c: o

Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-02 Thread William Allen Simpson
On 11/1/17 3:07 PM, Frank Filz wrote: I think we only need a single call fired off. If the client doesn't get it, there's not much recourse. I guess if a TCP connection actually fails, we could retry then, but over UDP there is no way to know what happened. Thanks for working on cleaning this

Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson
On 11/1/17 2:27 PM, William Allen Simpson wrote: On 11/1/17 10:07 AM, Frank Filz wrote: So part of why that code looks bizarre? Because the NLM ASYNC RPC procedures are bizarre... The NLM ASYNC procedures DON'T have a normal RPC call response. Instead, the host handling the call (normall
