Hi Malahal, My read on this is that the clnt_err() callout probably has to be deprecated, and clnt_call() updated to take a pointer to struct rpc_err to return the error detail.
The behavior in clnt_vc_call() and friends was ok for duplex-10, but it conflicts with idempotent and backchannel calls. We'd want to do this and the matching change to the callers on Ganesha 2.0 and ntirpc duplex-11. Thoughts on that? Matt ----- "Malahal Naineni" <mala...@us.ibm.com> wrote: > Matt, clnt_vc_call() frees the memory that is later accessed by > clnt_vc_geterr. It all happens in the same *thread*. You see two > stack > traces because I enabled valgrind to keep origins. > > Regrads, Malahal. > > Matt W. Benjamin [m...@cohortfs.com] wrote: > > Hi, > > > > So this is all happening in clnt_vc_call()? I'll have a look at > it. > > > > Matt > > > > ----- "Malahal Naineni" <mala...@us.ibm.com> wrote: > > > > > Hi All, > > > > > > valgrind reported that clnt_vc_geterr() is accessing freed > memory. > > > Code review shows that nlm_send_async() calls clnt_call() and > then > > > calls > > > clnt_sperror() on some errors. This is clearly a bug to access > rpc > > > context > > > memory that was freed in a prior call. I am not familiar with RPC > > > code, > > > it should be an easy fix for someone familiar with RPC code. Can > > > someone > > > please fix or give pointers to fix it. > > > > > > Here is the entire valgrind report (ran against V2.2-stable, but > line > > > numbers should NOT change that much). > > > > > > ==7012== Thread 35: > > > ==7012== Invalid read of size 8 > > > ==7012== at 0x535704: clnt_vc_geterr (clnt_vc.c:582) > > > ==7012== by 0x533FFB: clnt_sperror (clnt_perror.c:83) > > > ==7012== by 0x48FF07: nlm_send_async (nlm_async.c:283) > > > ==7012== by 0x491132: nlm4_send_grant_msg (nlm_util.c:205) > > > ==7012== by 0x49B203: state_async_func_caller > (state_async.c:81) > > > ==7012== by 0x508E5A: fridgethr_start_routine > (fridgethr.c:562) > > > ==7012== by 0x61EEDF2: start_thread (in > > > /usr/lib64/libpthread-2.17.so) > > > ==7012== by 0x67011AC: clone (in /usr/lib64/libc-2.17.so) > > > ==7012== Address 0x87f5a88 is 136 bytes inside a block of size > 184 > > > free'd > > > ==7012== at 0x4C29577: free (in > > > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) > > > ==7012== by 0x53AAB9: free_rpc_call_ctx (rpc_ctx.c:251) > > > ==7012== by 0x5356B3: clnt_vc_call (clnt_vc.c:567) > > > ==7012== by 0x48FE71: nlm_send_async (nlm_async.c:267) > > > ==7012== by 0x491132: nlm4_send_grant_msg (nlm_util.c:205) > > > ==7012== by 0x49B203: state_async_func_caller > (state_async.c:81) > > > ==7012== by 0x508E5A: fridgethr_start_routine > (fridgethr.c:562) > > > ==7012== by 0x61EEDF2: start_thread (in > > > /usr/lib64/libpthread-2.17.so) > > > ==7012== by 0x67011AC: clone (in /usr/lib64/libc-2.17.so) > > > > > > Regards, Malahal. > > > PS: I can make the context pointer NULL when we free, and use > > > xdrs->x_lib[1] > > > in the check instead of xdrs->x_lib[0] in clnt_vc_geterr(). Wonder > if > > > that is a right method. > > > > -- > > Matt Benjamin > > CohortFS, LLC. > > 315 West Huron Street, Suite 140A > > Ann Arbor, Michigan 48103 > > > > http://cohortfs.com > > > > tel. 734-761-4689 > > fax. 734-769-8938 > > cel. 734-216-5309 > > -- Matt Benjamin CohortFS, LLC. 315 West Huron Street, Suite 140A Ann Arbor, Michigan 48103 http://cohortfs.com tel. 734-761-4689 fax. 734-769-8938 cel. 734-216-5309 ------------------------------------------------------------------------------ Monitor 25 network devices or servers for free with OpManager! OpManager is web-based network management software that monitors network devices and physical & virtual servers, alerts via email & sms for fault. Monitor 25 devices for free with no restriction. Download now http://ad.doubleclick.net/ddm/clk/292181274;119417398;o _______________________________________________ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel