>> That trace is the NSM clnt_dg clnt_call, the only use of outgoing UDP.
>> It's a mess, and has been a mess for a long time.
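We get a file descriptor "fd" and then create "rec" for it, but while
destroying things, we close "fd" first and only then call rpc_dplx_unref().
Once "fd" is closed, another thread can be handed the same descriptor number
while the old "rec" is still around. Re-arranging these in clnt_dg_destroy()
(and other places) might help fix this issue, but I am not positive as I am
not familiar with this code.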
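
To illustrate why the order matters, here is a minimal single-threaded
sketch. It is not ntirpc code: "struct rec", rec_table, fake_open(),
fake_close(), rec_insert(), rec_lookup() and rec_unref() are all hypothetical
stand-ins, and the array only imitates an fd-indexed tree and the kernel
handing out the lowest free descriptor.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FD 16

/* Hypothetical stand-in for the per-fd "rec"; not the real ntirpc struct. */
struct rec {
    int fd;
    int refcnt;
};

static struct rec *rec_table[MAX_FD]; /* imitates the fd-indexed tree */
static int fd_in_use[MAX_FD];         /* imitates the kernel's fd table */

/* Imitates socket()/accept(): returns the lowest free descriptor. */
static int fake_open(void)
{
    for (int fd = 0; fd < MAX_FD; fd++) {
        if (!fd_in_use[fd]) {
            fd_in_use[fd] = 1;
            return fd;
        }
    }
    return -1;
}

/* Imitates close(): the descriptor number becomes reusable at once. */
static void fake_close(int fd)
{
    fd_in_use[fd] = 0;
}

/* Register a rec under its fd with one reference. */
static void rec_insert(struct rec *r)
{
    r->refcnt = 1;
    rec_table[r->fd] = r;
}

/* Look up whatever rec is currently registered for this fd number. */
static struct rec *rec_lookup(int fd)
{
    return rec_table[fd];
}

/* Drop one reference; the last one removes the rec from the table. */
static void rec_unref(struct rec *r)
{
    assert(r->refcnt > 0);
    if (--r->refcnt == 0)
        rec_table[r->fd] = NULL;
}
```

If fake_close() runs before rec_unref(), a fake_open() in between gets the
same number back and rec_lookup() still returns the old "rec". With
rec_unref() first, the lookup for the reused number finds nothing.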
I am also working on a blind replacement of "fd" by "struct gfd", where
struct gfd holds the "fd" as well as a "generation number". The generation
number is incremented whenever such an "fd" is created (e.g. by an accept()
or socket() call). The changes are many but they are trivial.
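
A rough sketch of the "struct gfd" idea follows. Only the name "struct gfd"
and the notion of a generation number come from the description above; the
field names, the global counter, gfd_wrap() and gfd_equal() are assumptions
made for illustration, and real multi-threaded code would need an atomic
counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the descriptor plus the generation it was created in. */
struct gfd {
    int fd;
    uint64_t gen;
};

/* Assumed global counter; would have to be atomic in threaded code. */
static uint64_t gfd_next_gen = 1;

/* Wrap a descriptor just returned by socket() or accept(). */
static struct gfd gfd_wrap(int fd)
{
    assert(fd >= 0);
    struct gfd g = { .fd = fd, .gen = gfd_next_gen++ };
    return g;
}

/* Two handles match only if both the number and the generation agree. */
static bool gfd_equal(struct gfd a, struct gfd b)
{
    return a.fd == b.fd && a.gen == b.gen;
}
```

With this, a stale handle that still carries a reused descriptor number no
longer compares equal to the handle of the new socket, so a lookup keyed on
the whole struct would not return the old "rec".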
Any thoughts?
Regards, Malahal.
On Fri, Aug 11, 2017 at 8:50 PM, Malahal Naineni <mala...@gmail.com> wrote:
> We do support TCP and UDP for NSM. If this customer is using clients that
> support TCP, then we don't need UDP for this customer. Isn't there a config
> where we can say the daemon supports only TCP for NSM?
>
> There might be a couple of MacOS customers that need UDP, but I am
> not sure. Clients (including Linux) do use UDP first if it is
> available.
>
> Regards, Malahal.
>
> On Fri, Aug 11, 2017 at 8:17 PM, William Allen Simpson <
> william.allen.simp...@gmail.com> wrote:
>
>> On 8/11/17 8:56 AM, Matt Benjamin wrote:
>>
>>> On Fri, Aug 11, 2017 at 8:44 AM, William Allen Simpson
>>> <william.allen.simp...@gmail.com> wrote:
>>>
>>>> On 8/11/17 8:26 AM, William Allen Simpson wrote:
>>>>
>>>>>
>>>>> On 8/11/17 2:29 AM, Malahal Naineni wrote:
>>>>>
>>>>>>
>>>>>> Following confirms that Thread1 (TCP) is trying to use the same "rec"
>>>>>> as Thread42 (UDP), it is easy to reproduce on the customer system!
>>>>>>
>>>>> There are 2 duplicated fd-indexed trees, not well coordinated. My 2015
>>>>> code to fix this went in Feb/Mar timeframe for Ganesha v2.5/ntirpc 1.5.
>>>>>
>>>>
>>>>
>>>> That trace is the NSM clnt_dg clnt_call, the only use of outgoing UDP.
>>>> It's a mess, and has been a mess for a long time.
>>>>
>>>> There is still an analogous problem (Dominique reported) where UDP
>>>> uses poll() on an fd at the same time that TCP uses epoll() on the
>>>> same fd.
>>>>
>>>> That's why I was asking whether your IBM systems support TCP for NSM?
>>>>
>>>> It would be a much easier back-portable fix to Ganesha to require TCP.
>>>> The code passes "tcp" parameter, but for some as yet unknown reason
>>>> tries UDP, too.
>>>>
>>>> Again, does IBM support TCP for NSM?
>>>>
>>>
>>> That doesn't sound like it's fixing anything at all. If someone wants
>>> to do this on a downstream, they're welcome, but we've already had the
>>> upstream discussion about this.
>>>
>> Who is this royal "we"?
>>
>> Everybody agrees that we need to support UDP incoming to Ganesha. That's
>> src/svc_dg.c.
>>
>> Whereas src/clnt_dg.c has long been problematic. As it is used in only
>> one place, it doesn't get much testing. Ganesha v2.5/ntirpc v1.5 tried to
>> fix the known non-MT cases in clnt_dg. And my code finally blessed on
>> Tuesday may have fixed some more. But that won't help IBM shipping v2.3.
>>
>> Linux supports TCP for NSM. If IBM supports TCP too, we're good to go.
>> That looks like a relatively simple easily back-portable fix.
>>
>> I'm pretty sure that Malahal actually has to please customers.
>>
>
>