Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: gtest/CMakeLists: libraries for commit2

2018-04-23 Thread William Allen Simpson
This list has been deprecated. Please subscribe to the new devel list at lists.nfs-ganesha.org. On 4/23/18 10:01 AM, GerritHub wrote: Frank, who is responsible for changing GerritHub?

Re: [Nfs-ganesha-devel] Couple of stat issues.

2018-04-10 Thread William Allen Simpson
On 4/10/18 8:49 PM, Pradeep wrote: 2. In nfs_rpc_execute(), the queue_wait is set to the difference between op_ctx->start_time and reqdata->time_queued. But reqdata->time_queued is never set (in the old code - pre 2.6-dev5, nfs_rpc_enqueue_req() used to set it; now only 9P code sets it). Is

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-28 Thread William Allen Simpson
On 3/27/18 9:34 AM, William Allen Simpson wrote: On 3/25/18 1:44 PM, William Allen Simpson wrote: On 3/23/18 1:30 PM, William Allen Simpson wrote: Ran some apples to apples comparisons today V2.7-dev.5: Without the client-side rbtrees, rpcping works a lot better: Thought of a small tweak

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-27 Thread William Allen Simpson
On 3/25/18 1:44 PM, William Allen Simpson wrote: On 3/23/18 1:30 PM, William Allen Simpson wrote: Ran some apples to apples comparisons today V2.7-dev.5: Without the client-side rbtrees, rpcping works a lot better: Thought of a small tweak to the list adding routine, so it doesn't kick

Re: [Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-25 Thread William Allen Simpson
On 3/23/18 1:30 PM, William Allen Simpson wrote: Ran some apples to apples comparisons today V2.7-dev.5: Without the client-side rbtrees, rpcping works a lot better: Ganesha (worst, best): rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 version=3 procedure=0

Re: [Nfs-ganesha-devel] rpcping profile

2018-03-25 Thread William Allen Simpson
On 3/24/18 7:50 AM, William Allen Simpson wrote: Noting that the top problem is exactly my prediction by knowledge of the code:   clnt_req_callback() opr_rbtree_insert() The second is also exactly as expected:   svc_rqst_expire_insert() opr_rbtree_insert() svc_rqst_expire_cmpf

[Nfs-ganesha-devel] rpcping profile

2018-03-24 Thread William Allen Simpson
Using local file tests/rpcping. Using local file ../profile.
Total: 989 samples
     321  32.5%  32.5%      321  32.5% svc_rqst_expire_cmpf
     149  15.1%  47.5%      475  48.0% opr_rbtree_insert
     139  14.1%  61.6%      140  14.2% __writev
      56   5.7%  67.2%       66   6.7%

[Nfs-ganesha-devel] rpcping comparison nfs-server

2018-03-23 Thread William Allen Simpson
Ran some apples to apples comparisons today V2.7-dev.5: Ganesha (worst, best): rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13 version=3 procedure=0): mean 33950.1556, total 33950.1556 rpcping tcp localhost count=1000 threads=1 workers=5 (port=2049 program=13

Re: [Nfs-ganesha-devel] nss_getpwnam: name 't...@my.dom@localdomain' does not map into domain 'nix.my.dom'

2018-03-23 Thread William Allen Simpson
On 3/23/18 7:59 AM, Daniel Gryniewicz wrote: Thanks, Tomk.  PR is here: https://review.gerrithub.io/404945 Actually, it seems fairly elegant. ntirpc and rdma also have the USE_ and _USE_ convention. Both require libraries, and would benefit from defaults with enforcement checking for the

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.4

2018-03-18 Thread William Allen Simpson
On 3/17/18 11:01 AM, Jeff Layton wrote: See: https://review.gerrithub.io/c/404231/ Thanks. A more pro-active approach to be sure. I just assumed Frank would quickly fix it and push a new dev.4a when he saw it Sat. Nice to see I'm not the only one coding on weekends.

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.4

2018-03-17 Thread William Allen Simpson
On 3/16/18 7:23 PM, Frank Filz wrote: Branch next Tag:V2.7-dev.4 NOTE: This merge includes an ntirpc pullup, please update your submodule This is a big merge with a lot of cleanup. Doesn't compile for me. [ 17%] Building C object Protocols/NFS/CMakeFiles/nfsproto.dir/nfs4_Compound.c.o In

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup NTIRPC through #124

2018-03-16 Thread William Allen Simpson
On 3/16/18 10:07 AM, GerritHub wrote: william.allen.simp...@gmail.com has uploaded this change for *review*. View Change I see that our ci.centos.org now provides dbench and iozone. The dbench results are in its log: + tail -21 ../dbenchTestLog.txt

Re: [Nfs-ganesha-devel] A question about rpc requests maybe for Bill

2018-03-15 Thread William Allen Simpson
On 3/15/18 7:57 PM, Frank Filz wrote: NFS v4.1 has a max request size option for the session, I’m wondering if there’s a way to get the size of a given request easily. Depends on how that's defined. Bytes following header? And what you need to do with it. It might be simplest to add a data

Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread William Allen Simpson
On 3/15/18 10:23 AM, Daniel Gryniewicz wrote: Can you try again with a larger count, like 100k? 500 is still quite small for a loop benchmark like this. In the code, I commented that 500 is minimal. I've done a pile of 100, 200, 300, and they perform roughly the same as 500. rpcping tcp

Re: [Nfs-ganesha-devel] rpcping

2018-03-15 Thread William Allen Simpson
On 3/14/18 3:33 AM, William Allen Simpson wrote: rpcping tcp localhost threads=1 count=500 (port=2049 program=13 version=3 procedure=0): mean 51285.7754, total 51285.7754 DanG pushed the latest code onto ntirpc this morning, and I'll submit a pullup for Ganesha later today. I've changed

Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson
On 3/14/18 7:27 AM, Matt Benjamin wrote: Daniel doesn't think you've measured much accurately yet, but at least the effort (if not the discussion) aims to. I'm sure Daniel can speak for himself. At your time of writing, Daniel had not yet arrived in the office after my post this am. So I'm

Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson
On 3/13/18 1:58 PM, Daniel Gryniewicz wrote: rpcping was not thread safe.  I have fixes for it incoming. With DanG's significant help, we now have better timing results. There was an implicit assumption in the ancient code that it was calling single threaded tirpc, while ntirpc is

Re: [Nfs-ganesha-devel] rpcping

2018-03-14 Thread William Allen Simpson
On 3/13/18 8:27 AM, Matt Benjamin wrote: On Tue, Mar 13, 2018 at 2:38 AM, William Allen Simpson <william.allen.simp...@gmail.com> wrote: but if we assume xids retire in xid order also, They do. Should be no variance. Eliminating the dupreq caching -- also using the rbtree -- signifi

Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson
On 3/13/18 2:38 AM, William Allen Simpson wrote: In my measurements, using the new CLNT_CALL_BACK(), the client thread starts sending a stream of pings.  In every case, it peaks at a relatively stable rate. DanG suggested that timing was dominated by the system time calls. The previous

Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson
On 3/12/18 6:25 PM, Matt Benjamin wrote: If I understand correctly, we always insert records in xid order, and xid is monotonically increasing by 1. I guess pings might come back in any order, No, they always come back in order. This is TCP. I've gone to some lengths to fix the problem
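If records really are added in increasing xid order and replies really do come back in that same order, a plain FIFO is enough to track outstanding calls: insert at the tail, match at the head, both O(1). The following is a minimal sketch in C under that assumption. The names `pending_call`, `pending_queue`, `pq_push`, and `pq_match_head` are hypothetical and are not taken from ntirpc.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one outstanding call, keyed by xid. */
struct pending_call {
	uint32_t xid;
	struct pending_call *next;
};

/* Singly linked FIFO: push at the tail, match at the head. */
struct pending_queue {
	struct pending_call *head;
	struct pending_call *tail;
};

/* Append a call. O(1), relying on xids being issued in increasing order. */
static bool pq_push(struct pending_queue *q, uint32_t xid)
{
	struct pending_call *pc = malloc(sizeof(*pc));

	if (pc == NULL)
		return false;
	pc->xid = xid;
	pc->next = NULL;
	if (q->tail != NULL)
		q->tail->next = pc;
	else
		q->head = pc;
	q->tail = pc;
	return true;
}

/* Retire the oldest call if the reply xid matches it. O(1).
 * Returns false when the queue is empty or the reply is out of order. */
static bool pq_match_head(struct pending_queue *q, uint32_t reply_xid)
{
	struct pending_call *pc = q->head;

	if (pc == NULL || pc->xid != reply_xid)
		return false;
	q->head = pc->next;
	if (q->head == NULL)
		q->tail = NULL;
	free(pc);
	return true;
}
```

An out-of-order reply makes `pq_match_head()` return false, so the sketch only fits a transport where the ordering assumption holds.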

[Nfs-ganesha-devel] WAVL tree

2018-03-12 Thread William Allen Simpson
New in 2015. https://en.wikipedia.org/wiki/WAVL_tree There's a C++ intrusive container implementation at: https://fuchsia.googlesource.com/zircon/+/master/system/ulib/fbl/include/fbl/intrusive_wavl_tree.h I've not found a standard C implementation yet.

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
[These are with a Ganesha that doesn't dupreq cache the null operation.] Just how slow is this RB tree? Here's a comparison of 1000 entries versus 100 entries in ops per second: rpcping tcp localhost threads=5 count=1000 (port=2049 program=13 version=3 procedure=0): average 2963.2517,

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
One of the limiting factors in our Ganesha performance is that the NULL operation is going through the dupreq code. That can be easily fixed with a check that jumps to nocache. One of the limiting factors in our ntirpc performance seems to be the call_replies tree that stores the xid of calls

Re: [Nfs-ganesha-devel] rpcping

2018-03-12 Thread William Allen Simpson
17647.0588 rpcping tcp localhost threads=10 count=500 (port=2049 program=13 version=4 procedure=0): 1731.3390, total 17313.3903 rpcping tcp localhost threads=15 count=500 (port=2049 program=13 version=4 procedure=0): 1142.3732, total 17135.5981 On 3/8/18 8:03 PM, William Allen Simpson
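The figures quoted above are consistent with the reported total being the per-thread rate multiplied by the thread count (1731.3390 × 10 ≈ 17313.39, 1142.3732 × 15 ≈ 17135.60). That reading is inferred from the numbers, not from the rpcping source. A small sketch in C, with the hypothetical helper name `rpcping_total`:

```c
/* Aggregate rate implied by a per-thread rate, assuming every thread
 * sustains that rate over the same interval. Hypothetical helper,
 * not part of rpcping. */
static double rpcping_total(double per_thread_rate, unsigned int threads)
{
	return per_thread_rate * (double)threads;
}
```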

Re: [Nfs-ganesha-devel] zero-copy read

2018-03-11 Thread William Allen Simpson
On 3/11/18 7:15 AM, William Allen Simpson wrote: On 3/10/18 11:18 AM, Matt Benjamin wrote: Marcus has code that prototypes using gss_iov from mit-krb5 1.1.12.  I recall describing this to you in 2013. That would be surprising, as I didn't start working on this project until a year or so later

Re: [Nfs-ganesha-devel] [nfs-ganesha/ntirpc] rpcping ncreatef (#115)

2018-03-11 Thread William Allen Simpson
On 3/9/18 5:38 PM, Matt Benjamin wrote: I might be missing something, but it looked to me like the trick to talking to nfs-ganesha is to bypass the binder more-or-less as an nfsv4 backchannel does. I'm not sure this is a good idea, unless we are really desperate for Ganesha numbers. I've

Re: [Nfs-ganesha-devel] zero-copy read

2018-03-11 Thread William Allen Simpson
On 3/10/18 11:18 AM, Matt Benjamin wrote: Marcus has code that prototypes using gss_iov from mit-krb5 1.1.12. I recall describing this to you in 2013. That would be surprising, as I didn't start working on this project until a year or so later than that. Anyway, last year Marcus sent me a

Re: [Nfs-ganesha-devel] zero-copy read

2018-03-10 Thread William Allen Simpson
On 3/10/18 10:24 AM, William Allen Simpson wrote: Finally, and what I'll do this weekend, my attempt to edit xdr_nfs23.c won't pass checkpatch commit, because all the headers are still pre-1989 pre-ANSI K&R. Unfortunately, Red Hat Linux doesn't seem to have cproto built-in, even though it's

[Nfs-ganesha-devel] zero-copy read

2018-03-10 Thread William Allen Simpson
Now that DanG has a workable vector i-o for read and write, I'm trying again to make reading zero-copy. Man-oh-man, do we have our work cut out for us. It seems that currently we provide a buffer to read. Then XDR makes a new object, puts headers into it, makes another data_val and copies

[Nfs-ganesha-devel] nfs_worker_thread never executed code section

2018-03-10 Thread William Allen Simpson
) if (dpq_status == DUPREQ_SUCCESS) 00c21a6878 (William Allen Simpson 2015-06-12 07:36:46 -0400 1414) dpq_status = nfs_dupreq_finish(>r_u.req.svc, res_nfs); 02526d7325 (Jim Lieb 2013-10-10 20:50:47 -0700 1415) goto freeargs; 02526d7325 (Jim Lieb 2013-10

Re: [Nfs-ganesha-devel] rpcping

2018-03-08 Thread William Allen Simpson
On 3/8/18 12:33 PM, William Allen Simpson wrote: Still having no luck.  Instead of relying on RPC itself, checked with Ganesha about what it registers, and tried some of those. Without running Ganesha, rpcinfo reports portmapper services by default on my machine. Can talk to it via localhost

[Nfs-ganesha-devel] rpcping

2018-03-08 Thread William Allen Simpson
Still having no luck. Instead of relying on RPC itself, checked with Ganesha about what it registers, and tried some of those. The default procedure is 0, which according to every RFC is reserved for doing nothing. But rpcbind is not finding program and version. To be honest, I'm not sure how

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson
On 2/24/18 5:18 AM, William Allen Simpson wrote: On 2/24/18 4:42 AM, William Allen Simpson wrote: [top post for visibility] Says ntirpc pullup (twice), but doesn't actually have:   * "Pullup NTIRPC through #106" Missing "(nfs41.h) unindent" checkpatch cleanup, eve

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson
On 2/24/18 4:42 AM, William Allen Simpson wrote: [top post for visibility] Says ntirpc pullup (twice), but doesn't actually have:  * "Pullup NTIRPC through #106" Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd agreed this was the best time to d

Re: [Nfs-ganesha-devel] Announce Push of V2.7-dev.1

2018-02-24 Thread William Allen Simpson
[top post for visibility] Says ntirpc pullup (twice), but doesn't actually have: * "Pullup NTIRPC through #106" Missing "(nfs41.h) unindent" checkpatch cleanup, even though we'd agreed this was the best time to do it, and it had all the expected +1 and +2. Literally no changes to running

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Remove unused fsal_read2() and fsal_write2()

2018-02-23 Thread William Allen Simpson
On 2/22/18 1:32 PM, GerritHub wrote: Daniel Gryniewicz has uploaded this change for *review*. View Change Remove unused fsal_read2() and fsal_write2() I've reviewed, but the write showed up in my inbox before the read, followed by this cleanup. But the

Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson
On 2/21/18 11:35 AM, Frank Filz wrote: There's a -n or --no-verify option that will bypass the commit hooks. I suggest trying to commit without that first to make sure the only checkpatch errors/warnings are for the spacing around * and then commit again with -n to bypass checkpatch to

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: MainNFSD: invert _NO_PORTMAPPER option

2018-02-21 Thread William Allen Simpson
On 2/21/18 4:51 PM, Jeff Layton wrote: On Wed, 2018-02-21 at 13:40 -0800, Frank Filz wrote: We could take this opportunity to change the option to RPCBIND... Fair enough. I'd support this. I actually disagree with the "no udp" statement above too. UDP is great for single-shot request

Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson
On 2/21/18 8:28 AM, William Allen Simpson wrote: Anyway, I'll try to push a patch for those 2 files by tomorrow. ERROR: "foo * bar" should be "foo *bar" #18656: FILE: src/include/nfsv41.h:9900: +static inline bool xdr_CB_COMPOUND4res(XDR * xdrs, total: 450 errors, 14 w

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: MainNFSD: invert _NO_PORTMAPPER option

2018-02-21 Thread William Allen Simpson
On 2/21/18 1:59 PM, GerritHub wrote: Jeff Layton has uploaded this change for *review*. View Change MainNFSD: invert _NO_PORTMAPPER option The fact that this is a "negative" option is confusing. Change it to a "PORTMAPPER" option, and have it default to

Re: [Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-21 Thread William Allen Simpson
On 2/20/18 1:06 PM, Frank Filz wrote: As I'm trying to update nfs41.h, I've run into the problem that the commit check is complaining that the pointer '*' on parameters is sometimes " * v" and others " *v" -- usually the same function definition. Presumably the generator made these. They

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-21 Thread William Allen Simpson
I really don't have time today to respond to every one-liner throw-away comment here, so I'll try to stick to the most cogent. On 2/20/18 8:33 AM, Matt Benjamin wrote: On Tue, Feb 20, 2018 at 8:12 AM, William Allen Simpson <william.allen.simp...@gmail.com> wrote: On 2/18/18 2:47 PM

[Nfs-ganesha-devel] gtest results? profiles?

2018-02-20 Thread William Allen Simpson
Now that we have a pile of nice gtests, who is compiling the results? Please post them here. Also, DanG told me yesterday that he has a profile of the lookup test. Please post that here. That will allow us to better target the CPU bottlenecks.

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-20 Thread William Allen Simpson
On 2/18/18 2:47 PM, Matt Benjamin wrote: On Fri, Feb 16, 2018 at 11:23 AM, William Allen Simpson But the planned 2.7 improvements are mostly throughput related, not IOPS. Not at all, though I am trying to ensure that we get async FSAL ops in. There are people working on IOPs too. async

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL

2018-02-20 Thread William Allen Simpson
On 2/19/18 12:12 PM, Sachin Punadikar wrote: Hi Bill, I rechecked the logs & discussed with Daniel. I missed to see the log entries related to FSAL. So for this customer it looks like FSAL issue than a Ganesha issue. Thanks for the update. The default log levels don't always show enough.

Re: [Nfs-ganesha-devel] READDIR doesn't return all entries.

2018-02-19 Thread William Allen Simpson
On 2/13/18 8:00 PM, Frank Filz wrote: You still don’t mention FSAL… I’m suspecting non-unique cookies from the FSAL as a cause. You may want to turn on CACHE_INODE and NFS_READDIR to FULL_DEBUG to see what is going on. A tcpdump trace won’t show anything useful (since we won’t see what cookies

Re: [Nfs-ganesha-devel] testing

2018-02-18 Thread William Allen Simpson
On 2/15/18 1:17 PM, Frank Filz wrote: Between your test message, and this test message, I've received 5 messages Subject: ACL support. My Friday messages do not yet appear in the list archive: List-Archive:

[Nfs-ganesha-devel] inconsistent '*' spacing

2018-02-18 Thread William Allen Simpson
As I'm trying to update nfs41.h, I've run into the problem that the commit check is complaining that the pointer '*' on parameters is sometimes " * v" and others " *v" -- usually the same function definition. Presumably the generator made these. They are cosmetic. Why oh why are we checking

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL

2018-02-18 Thread William Allen Simpson
On 2/15/18 6:44 AM, GerritHub wrote: Sachin Punadikar has uploaded this change for *review*. View Change "mdc_lookup" do not dispatch to FSAL Are you sure? Do you have an actual reproducible error case? "mdc_lookup" function first attempts to get the

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-18 Thread William Allen Simpson
On 2/14/18 8:32 AM, Daniel Gryniewicz wrote: How many clients are you using? Each client op can only (currently) be handled in a single thread, and clients won't send more ops until the current one is ack'd, so Ganesha can basically only parallelize on a per-client basis at the moment.

Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-13 Thread William Allen Simpson
On 2/13/18 1:21 AM, Malahal Naineni wrote: If your latency is high, then you most likely need to change Dispatch_Max_Reqs_Xprt. What is your Dispatch_Max_Reqs_Xprt value? That shouldn't do anything anymore in V2.6, other than 9P.

Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-07 Thread William Allen Simpson
On 2/6/18 10:40 AM, Daniel Gryniewicz wrote: On 02/06/2018 10:26 AM, William Allen Simpson wrote: On 2/6/18 8:25 AM, Daniel Gryniewicz wrote: Hi, all. I've worked up a sample API for async/vector for FSAL ops.  The example op is read(), and I've "implemented" it for all FSALs, so

[Nfs-ganesha-devel] V2.6 fixed UDP address and port (and more)

2018-02-06 Thread William Allen Simpson
For a long time, there have been problems with UDP. It has not been a priority, under the assumption that most folks have moved to TCP. And it wasn't tested much. Malahal tried a quick and dirty fix with a copy of the IP address in each service request structure. But all UDP requests were

Re: [Nfs-ganesha-devel] WIP example API for async/vector FSAL ops

2018-02-06 Thread William Allen Simpson
On 2/6/18 8:25 AM, Daniel Gryniewicz wrote: Hi, all. I've worked up a sample API for async/vector for FSAL ops.  The example op is read(), and I've "implemented" it for all FSALs, so that I can verify that it does, in fact, work for some definition of work. I'm a bit surprised it works, as

Re: [Nfs-ganesha-devel] Features board list

2018-02-01 Thread William Allen Simpson
On 2/1/18 8:04 AM, Supriti Singh wrote: It seems like github does not support organization level board to be visible. https://github.com/isaacs/github/issues/935 :/ Let's just keep this design. If github already knows about the issue, then maybe they'll fix it. My problem is that I only

Re: [Nfs-ganesha-devel] Features board list

2018-01-31 Thread William Allen Simpson
On 1/31/18 1:11 PM, Supriti Singh wrote: I have created a new board here: https://github.com/orgs/nfs-ganesha/projects . It's an organization-wide project board. Everyone who belongs to nfs-ganesha organization should have write access. Can you check if you have write access to this board? YES!

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Move nfs_Init_svc() after create_pseudofs().

2018-01-31 Thread William Allen Simpson
On 1/31/18 3:11 PM, GerritHub wrote: Frank Filz *posted comments* on this change. View Change It seems to me that the dupreq2_pkginit() is already in about the right place, just after nfs_Init_client_id(). Moving it before doesn't do much. Patch set 2:

Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson
On 1/31/18 10:33 AM, Daniel Gryniewicz wrote: On 01/31/2018 10:27 AM, William Allen Simpson wrote: On 1/31/18 8:44 AM, Daniel Gryniewicz wrote: Agreed. Daniel On 01/30/2018 11:46 PM, Malahal Naineni wrote: Looking at the code, dupreq2_pkginit() only depends on Ganesha config processing

Re: [Nfs-ganesha-devel] Correct initialization sequence

2018-01-31 Thread William Allen Simpson
On 1/31/18 8:44 AM, Daniel Gryniewicz wrote: Agreed. Daniel On 01/30/2018 11:46 PM, Malahal Naineni wrote: Looking at the code, dupreq2_pkginit() only depends on Ganesha config processing to initialize few things, so it should be OK to call anytime after Ganesha config processing. Regards,

Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson
On 1/30/18 9:36 AM, William Allen Simpson wrote: But the code is obscure, so I could be missing something. Also, it bears repeating that the dupreq cache wasn't working for secure connections. Pre-V2.6 checksummed the ciphertext, which is by definition different on every request. We'd never

Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson
On 1/30/18 9:22 AM, William Allen Simpson wrote: On 1/29/18 3:32 PM, Frank Filz wrote: I haven't looked at how the SVCXPRT structure has changed, but if there's a field in there we can attach a Ganesha structure to that would be cool, or if not, if we could add one. There are two: xp_u1

Re: [Nfs-ganesha-devel] Is there a field in the SVCXPRT Ganesha can use

2018-01-30 Thread William Allen Simpson
On 1/29/18 3:32 PM, Frank Filz wrote: I haven't looked at how the SVCXPRT structure has changed, but if there's a field in there we can attach a Ganesha structure to that would be cool, or if not, if we could add one. There are two: xp_u1, and xp_u2. Right now, Ganesha is using xp_u2 for dup

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-30 Thread William Allen Simpson
On 1/29/18 2:27 PM, Daniel Gryniewicz wrote: On 01/29/2018 02:09 PM, William Allen Simpson wrote: On 1/29/18 1:13 PM, GerritHub wrote: Daniel Gryniewicz has uploaded this change for *review*. View Change <https://review.gerrithub.io/397004> Pullup ntirpc 1.6 (svc_vc) rearm after

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Pullup ntirpc 1.6

2018-01-29 Thread William Allen Simpson
On 1/29/18 1:13 PM, GerritHub wrote: Daniel Gryniewicz has uploaded this change for *review*. View Change Pullup ntirpc 1.6 (svc_vc) rearm after EAGAIN and EWOULDBLOCK (Note, previous pullup was erroneously from 1.7) All my weekend patches need to be

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-28 Thread William Allen Simpson
On 1/27/18 4:07 PM, Pradeep wrote: Here is what I see in the log (the '2' is what I added to figure out which recv failed): nfs-ganesha-199008[svc_948] rpc :TIRPC :WARN :svc_vc_recv: 0x7f91c0861400 fd 21 recv errno 11 (try again) 2 176 The fix looks good. Thanks Bill. Thanks for the

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread William Allen Simpson
On 1/27/18 9:56 AM, William Allen Simpson wrote: I'm not able to reproduce.  Could you tell me which EAGAIN is happening?  The log line will say "svc_vc_wait" or "svc_vc_recv", and have the actual error code on it.  Maybe this is EWOULDBLOCK? Of course, neither EAGAIN o

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-27 Thread William Allen Simpson
On 1/26/18 8:53 PM, William Allen Simpson wrote: In fact, I don't understand how we could get EAGAIN, according to the documentation.  But it's logged.  Good idea about differentiating the two identical log lines.  I'd prefer text rather than the number 2. And in the adjacent code, you'll see

[Nfs-ganesha-devel] V2.6-rc4 connect to statd failed

2018-01-27 Thread William Allen Simpson
With Dan's latest ntirpc update, I'm seeing a new error. But this is my first testing on Fedora 27, so maybe a Fedora change? nsm_connect :NLM :CRIT :connect to statd failed: RPC: Unknown protocol Actually, that's not exactly how the error looks; the string list is missing its commas. My bad.

Re: [Nfs-ganesha-devel] 'missed' recv with 2.6-rc2?

2018-01-26 Thread William Allen Simpson
On 1/26/18 12:18 PM, Pradeep wrote: In svc_vc_recv(), we handle the case of incomplete receive by rearming the FD and returning ( if xd->sx_fbtbc is not zero). In the case of EAGAIN also shouldn't we be doing the same? epoll is ONESHOT; so new receives won't give new events until epoll_ctl() is
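The point about one-shot epoll can be sketched as follows: once an event has fired, the descriptor stays disabled until `epoll_ctl()` is called again with `EPOLL_CTL_MOD`, so a receive that ends in EAGAIN needs the same rearm as an incomplete one. This is a minimal illustration in C using only the standard Linux epoll and socket calls. The helper names `rearm_oneshot` and `recv_or_rearm` are hypothetical, and this is not the svc_vc_recv() code.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Re-enable a one-shot registration so the next arrival raises an event. */
static int rearm_oneshot(int epfd, int fd)
{
	struct epoll_event ev = {
		.events = EPOLLIN | EPOLLONESHOT,
		.data.fd = fd,
	};

	return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Non-blocking receive. On EAGAIN/EWOULDBLOCK nothing was consumed,
 * so rearm and report "try later" instead of treating it as fatal.
 * Returns bytes read (>0), 0 when rearmed, -1 on EOF or a real error.
 * After a successful read the caller still has to rearm before
 * waiting for more data. */
static ssize_t recv_or_rearm(int epfd, int fd, void *buf, size_t len)
{
	ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);

	if (n > 0)
		return n;
	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return rearm_oneshot(epfd, fd) == 0 ? 0 : -1;
	return -1;
}
```

Without the rearm in the EAGAIN branch, later data on the descriptor would never produce another event, which matches the 'missed' recv symptom in the subject line.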

Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson
On 1/23/18 9:35 AM, William Allen Simpson wrote: On 1/23/18 9:31 AM, Daniel Gryniewicz wrote: On 01/23/2018 09:04 AM, William Allen Simpson wrote: On 1/22/18 8:08 PM, Pradeep wrote: Looked at dev.22 and we were handling this error case correctly there. No, we're handling this error case

Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson
On 1/23/18 9:31 AM, Daniel Gryniewicz wrote: On 01/23/2018 09:04 AM, William Allen Simpson wrote: On 1/22/18 8:08 PM, Pradeep wrote: Looked at dev.22 and we were handling this error case correctly there. No, we're handling this error case correctly now. Either you forgot to update your

Re: [Nfs-ganesha-devel] NULL pointer deref in clnt_ncreate_timed()

2018-01-23 Thread William Allen Simpson
On 1/22/18 8:08 PM, Pradeep wrote: Hello, I'm running into a crash in libntirpc with rc2: #2  #3  0x7f9004de31f4 in clnt_ncreate_timed (hostname=0x57592e "localhost", prog=100024, vers=1,     netclass=0x57592a "tcp", tp=0x0) at

[Nfs-ganesha-devel] NTIRPC ENOMEM

2017-12-20 Thread William Allen Simpson
DanG has raised an interesting issue about recovery from low memory. In Ganesha, we've been assiduously changing NULL checks to assert or segfault on alloc failures. Just had a few more patches by Kaleb. Since 2013 or 2014, we've been doing the same to NTIRPC. There are currently 105
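The policy described here, failing hard on allocation failure instead of returning NULL to every caller, can be sketched as a wrapper like the one below. This is an illustration in C only. The name `alloc_or_abort` is hypothetical and is not the Ganesha or NTIRPC allocator.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: never returns NULL, so callers need no check.
 * A zero-size request is rounded up to 1 so that a NULL result always
 * means failure. */
static void *alloc_or_abort(size_t size)
{
	void *p = malloc(size ? size : 1);

	if (p == NULL) {
		fprintf(stderr, "out of memory allocating %zu bytes\n", size);
		abort();
	}
	return p;
}
```

The trade-off is the one raised in the message: callers get simpler, but the process cannot recover from low memory once every allocation site aborts.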

Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-15 Thread William Allen Simpson
On 12/14/17 1:13 PM, William Allen Simpson wrote: This is May 2015 code, based upon 2012 code.  Obviously, we haven't been testing error responses ;) I wanted to add a personal thank you for such an excellent bug report. The patch went in the next branch yesterday, and should show up

Re: [Nfs-ganesha-devel] XID missing in error path for RPC AUTH failure.

2017-12-14 Thread William Allen Simpson
This is May 2015 code, based upon 2012 code. Obviously, we haven't been testing error responses ;) Not quite. That would need to be duplicated for each of the error conditions. Instead, it should be a bit higher in the function. Still, I'll keep it duplicated from the ACCEPTED code path,

Re: [Nfs-ganesha-devel] Announce Push of V2.6-dev.21

2017-12-14 Thread William Allen Simpson
On 12/12/17 4:39 PM, Frank Filz wrote: Branch next Tag:V2.6-dev.21 Release Highlights * new version of checkpatch * checkpatch fixes for existing code I'd been hoping that a mid-week release meant the crash during shutdown was fixed, but apparently not: Thread 270 "ganesha.nfsd" received

Re: [Nfs-ganesha-devel] enqueued_reqs/dequeued_reqs

2017-12-11 Thread William Allen Simpson
On 12/11/17 2:35 PM, Pradeep wrote: It looks like, we don't increment enqueued_reqs/dequeued_reqs in the RPC anymore - nfs_rpc_enqueue_req() is replaced with nfs_rpc_process_request. Now that both values are zero, the health checker (get_ganesha_health) will never detect any RPC hangs. Should

Re: [Nfs-ganesha-devel] libntirpc thread local storage

2017-12-09 Thread William Allen Simpson
On 12/9/17 5:28 PM, Matt Benjamin wrote: I've already proposed we remove this.  No one is invested in it, I don't think. OK. I'll take a poke at it today. It makes sense that this is a good time to handle, as we've already made a major change to CLNT_CALL in this release.

[Nfs-ganesha-devel] libntirpc thread local storage

2017-12-09 Thread William Allen Simpson
I've run into another TLS problem. It's been there since tirpc. Apparently, once upon a time, rpc_createerr was a static global. It still says that in the man pages. When a client create function fails, they stash the error there, and return NULL for the CLIENT. Basically, you check for NULL,

Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-09 Thread William Allen Simpson
On 12/8/17 10:13 AM, Matt Benjamin wrote: I'd like to see this use of TLS as a "hidden parameter" replaced regardless. It has been a source of bugs, and locks us into a pthreads execution model I think needlessly. With future async FSAL calls, it's going to stop working. We already have a

Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx

2017-12-08 Thread William Allen Simpson
On 12/7/17 7:54 PM, Frank Filz wrote: Stacked FSALs often depend on op_ctx->fsal_export being set. We also have lots of FSAL methods that take the fsal_export as a parameter. The latter sounds better. Now that we know every single thread local storage access involves a hidden lock/unlock

Re: [Nfs-ganesha-devel] Announce Push of V2.6-dev.17

2017-11-14 Thread William Allen Simpson
On 11/13/17 12:35 AM, Frank Filz wrote: 5ca449d Jan-Martin Rämer handle hosts via libcidr to unify IPv4/IPv4 host/network clients Ran pynfs, seeing some massive leaks that weren't there last week: Direct leak of 505080 byte(s) in 12627 object(s) allocated from: #0 0x76efcfe0 in calloc

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-07 Thread William Allen Simpson
On 11/6/17 8:41 PM, William Allen Simpson wrote: On 11/6/17 8:38 AM, Dominique Martinet wrote: One way that'd work for example would be have ganesha provide a pointer to SetNameFunction at init, it's a bit ugly though. Actually, that's how ntirpc calls various alloc and warnx functions, via

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-06 Thread William Allen Simpson
On 11/6/17 8:38 AM, Dominique Martinet wrote: William Allen Simpson wrote on Mon, Nov 06, 2017 at 08:12:19AM -0500: If you've got [s]ome others in another library, they'll have to use the same library function. Other FSALs are in other libraries, but given how it's setup they can use

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: (log_functions) display work_pool name

2017-11-06 Thread William Allen Simpson
On 11/6/17 7:14 AM, GerritHub wrote: File src/log/log_functions.c: o Patch Set #2, Line 1438: |name = work_pool_worker_name();|

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-04 Thread William Allen Simpson
On 11/4/17 1:43 AM, Matt Benjamin wrote: oh, come on. not sure what needs to be done to reduce log noise, but I'm sure we can make a dent. Apparently, you and I reviewed and wrote our messages in opposite order. This patch does that Complaining about the log messages now that we put in

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson
We already discussed this on Tuesday, October 24th. Malahal agreed that a half second was good, 3 seconds was OK, 5 seconds was long. And Matt agreed we'd log more than 10 seconds. Obviously, you have vastly more Internet experience than I, and therefore are much better able to decide Internet

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson
On 11/3/17 2:06 PM, Frank Filz wrote: Please respond inside gerrit to keep the conversation in one place. The discussion of timeouts was always in public on this list, and was also on the public conference call a week ago. If Gerrit doesn't record its replies, unlike Github, you should

Re: [Nfs-ganesha-devel] tirpc warning

2017-11-03 Thread William Allen Simpson
On 11/3/17 7:46 PM, Frank Filz wrote: Can we tone down this warning: 2017-11-03 16:14:21 [svc_4] :0 :rpc :TIRPC :INFO :clnt_req_alloc:470 tv_sec 15 > 10 That is spamming the log when at INFO level. I see that you didn't put in CLNT_CALL with clnt_req, that eliminated all current examples of

Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CLNT_CALL with clnt_req

2017-11-03 Thread William Allen Simpson
On 11/2/17 1:01 PM, GerritHub wrote: Frank Filz *posted comments* on this change. View Change Patch set 1: (2 comments) * File src/MainNFSD/nfs_rpc_callback.c: o

Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-02 Thread William Allen Simpson
On 11/1/17 3:07 PM, Frank Filz wrote: I think we only need a single call fired off. If the client doesn't get it, there's not much recourse. I guess if a TCP connection actually fails, we could retry then, but over UDP there is no way to know what happened. Thanks for working on cleaning this

Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson
On 11/1/17 2:27 PM, William Allen Simpson wrote: On 11/1/17 10:07 AM, Frank Filz wrote: So part of why that code looks bizarre? Because the NLM ASYNC RPC procedures are bizarre... The NLM ASYNC procedures DON'T have a normal RPC call response. Instead, the host handling the call (normally

Re: [Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson
On 11/1/17 10:07 AM, Frank Filz wrote: So part of why that code looks bizarre? Because the NLM ASYNC RPC procedures are bizarre... The NLM ASYNC procedures DON'T have a normal RPC call response. Instead, the host handling the call (normally the server, but the client in the case of NLM_GRANTED

[Nfs-ganesha-devel] nlm_async retries

2017-11-01 Thread William Allen Simpson
I'm flummoxed. Who knows this code? Problem 1: the timeout is set to 10 microseconds. Holy heck? And historically, that's the maximum total wait time, so it would try at least three (3) times within 10 *MICRO*seconds? Probably should be milliseconds. Problem 2: there's a retry loop that
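
To make the scale of Problem 1 concrete, here is a small sketch of the arithmetic using `struct timeval`. The helper names are hypothetical, and splitting the total evenly across three tries is an assumption made only for illustration, not a description of the actual retry code.

```c
#include <assert.h>
#include <sys/time.h>

/* Hypothetical helper: a timeout expressed in microseconds. */
static struct timeval demo_tv_from_us(long us)
{
	struct timeval tv;

	assert(us >= 0);
	tv.tv_sec = us / 1000000L;
	tv.tv_usec = us % 1000000L;
	return tv;
}

/* Hypothetical helper: a timeout expressed in milliseconds. */
static struct timeval demo_tv_from_ms(long ms)
{
	assert(ms >= 0);
	return demo_tv_from_us(ms * 1000L);
}

/* Total length of a timeval in microseconds. */
static long demo_tv_total_usec(struct timeval tv)
{
	return (long)tv.tv_sec * 1000000L + (long)tv.tv_usec;
}

/* Budget per attempt if the total is split evenly (assumption). */
static long demo_per_try_usec(struct timeval total, int tries)
{
	assert(tries > 0);
	return demo_tv_total_usec(total) / tries;
}
```

Under that assumption, a 10 microsecond total leaves about 3 microseconds per attempt, while 10 milliseconds leaves about 3333 microseconds per attempt.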

Re: [Nfs-ganesha-devel] UID and GID mapping

2017-10-30 Thread William Allen Simpson
On 10/30/17 3:56 AM, Nitesh Sharma wrote: Do you have any idea about nfs-ganesha UID and GID mapping The developer's list might What version are you using? How can I map this entry in nfs-ganesha export which is above CephFS /mnt/tvault_automation 

Re: [Nfs-ganesha-devel] Ganesha 2.3 and 2.5 - crash in free_nfs_request

2017-10-30 Thread William Allen Simpson
On 10/27/17 7:56 AM, Sachin Punadikar wrote: Ganesha 2.3 got segfault with below : [...] After analyzing the core and related code found that - In "thr_decode_rpc_request" function, if call to SVC_RECV fails, then free_nfs_request is invoked to free the resources. But so far one of the field

Re: [Nfs-ganesha-devel] Ganesha startup and shutdown

2017-10-27 Thread William Allen Simpson
On 10/27/17 6:43 PM, Frank Filz wrote: Ganesha startup and shutdown seems to be wandering between fast, slow, and in the case of shutdown, sometimes not cleanly... I'm glad to see the startup time has recently improved, it was annoying for a while. Think that was mostly Malahal? Also, my 2.6

Re: [Nfs-ganesha-devel] kkeithle pushed to libntirpc (f27). "libntirpc 1.5.3 PR https://github.com/nfs-ganesha/ntirpc/pull/85"

2017-10-20 Thread William Allen Simpson
On 10/20/17 1:44 PM, Kaleb S. KEITHLEY wrote: On 10/20/2017 01:32 PM, Florian Weimer wrote: Do you regularly import fixes from the glibc code into libntirpc? I don't know. That's a question for the ntirpc developers (cc'd) This isn't a fix "from the glibc code", this is a fix because of a

Re: [Nfs-ganesha-devel] V2.5-stable maintenance

2017-10-06 Thread William Allen Simpson
On 10/6/17 9:50 AM, Frank Filz wrote: But if we can build a script that would easily show which patches have or have not been backported, that would be a big help. Right now, I’m always so confused as to what has actually been backported… OR we could stop backporting so much, and release new

Re: [Nfs-ganesha-devel] Pull up NTIRPC #80 & #81

2017-10-03 Thread William Allen Simpson
On 10/3/17 12:53 PM, Daniel Gryniewicz wrote: I don't think this is blocking QE, as they haven't hit it again either, so it can probably just go in in the normal merge this week. It was my understanding that they stopped testing. If they aren't having any more problems, why oh why did I
