Re: [Nfs-ganesha-devel] libntirpc thread local storage
On 12/9/17 5:28 PM, Matt Benjamin wrote: I've already proposed we remove this. No one is invested in it, I don't think. OK. I'll take a poke at it today. It makes sense that this is a good time to handle, as we've already made a major change to CLNT_CALL in this release. -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] libntirpc thread local storage
I've already proposed we remove this. No one is invested in it, I don't think. Matt On Dec 10, 2017 2:38 AM, "William Allen Simpson" < william.allen.simp...@gmail.com> wrote: > I've run into another TLS problem. It's been there since tirpc. > > Apparently, once upon a time, rpc_createerr was a static global. > It still says that in the man pages. > > When a client create function fails, they stash the error there, > and return NULL for the CLIENT. Basically, you check for NULL, > and then check rpc_createerr > > This is also used extensively by the RPC bind code. > > Then, they made it a keyed thread local to try and salvage it > without a major code re-write. > > With async threads, that's not working. We've got memory leaks. > AFAICT, only on errors. But accessing them on a different > thread gives the wrong error code (or none at all). Not good. > > All the functions that use it are still not MT-safe, usually > because they stash a string in global memory without locking. > They need re-definition to alloc/free the string. > > Worse, it's not a good definition. > > rpc_createerr has both clnt_stat and rpc_err, but struct rpc_err > also has a clnt_stat (since original tirpc). clnt_stat is not > consistently set properly, as it is in two places. So the error > checking code is often wrong. > > I'd like to get rid of the whole mess, but that means every client > create would have new semantics. Fortunately, there aren't many > (in Ganesha). Plus we already have new definitions -- all named > *_ncreate with a tirpc_compat.h to munge them back. > > But should we do it now? Or in 2.7? We've been living with it for > years, although recent code changes have made it worse. Again, it > only happens on errors. Especially for RPC bind. > > > -- > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > ___ > Nfs-ganesha-devel mailing list > Nfs-ganesha-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel > -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
[Nfs-ganesha-devel] libntirpc thread local storage
I've run into another TLS problem. It's been there since tirpc. Apparently, once upon a time, rpc_createerr was a static global. It still says that in the man pages. When a client create function fails, they stash the error there, and return NULL for the CLIENT. Basically, you check for NULL, and then check rpc_createerr This is also used extensively by the RPC bind code. Then, they made it a keyed thread local to try and salvage it without a major code re-write. With async threads, that's not working. We've got memory leaks. AFAICT, only on errors. But accessing them on a different thread gives the wrong error code (or none at all). Not good. All the functions that use it are still not MT-safe, usually because they stash a string in global memory without locking. They need re-definition to alloc/free the string. Worse, it's not a good definition. rpc_createerr has both clnt_stat and rpc_err, but struct rpc_err also has a clnt_stat (since original tirpc). clnt_stat is not consistently set properly, as it is in two places. So the error checking code is often wrong. I'd like to get rid of the whole mess, but that means every client create would have new semantics. Fortunately, there aren't many (in Ganesha). Plus we already have new definitions -- all named *_ncreate with a tirpc_compat.h to munge them back. But should we do it now? Or in 2.7? We've been living with it for years, although recent code changes have made it worse. Again, it only happens on errors. Especially for RPC bind. -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] Stacked FSALs and fsal_export parameters and op_ctx
On 12/8/17 10:13 AM, Matt Benjamin wrote: I'd like to see this use of TLS as a "hidden parameter" replaced regardless. It has been a source of bugs, and locks us into a pthreads execution model I think needlessly. With future async FSAL calls, it's going to stop working. We already have a svc_req and now clnt_req. That's what will be returned on the new thread. So it behooves us to stash the op_ctx pointer in them (as a void *). -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel