> I'm unclear why the added NCE would usually be in the unresolved state;
> ire_nce_init() does:

As Thirumalai pointed out, this is because the ire_fp_mp passed in is
NULL.

It is NULL because the fp_mp itself can be freed (see
ndp_fastpath_flush()).


> The problem that I see with the above is that if the NCE already exists
> but is not ND_REACHABLE, then we will not replace it with an ND_REACHABLE

Right, so the safest temporary fix (until we clean up the ip_newroute()
path) is to make sure that we do the following in ip_newroute() for
this case (adding an ire_cache for an offlink host, based on the
gateway's ire_cache):
   nce->nce_state = ND_REACHABLE;
   nce_fastpath(nce);
with the returned nce.
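
To make the intent of those two lines concrete, here is a rough,
self-contained toy model. It is not the real kernel code: the types
below are simplified stand-ins, nce_fp_ready is a made-up field, the
nce_fastpath() body is a stub, and nce_mark_reachable() is a
hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: these types are stand-ins, not the real Solaris definitions. */
typedef enum { ND_INCOMPLETE, ND_REACHABLE } nd_state_t;

typedef struct nce {
    nd_state_t nce_state;     /* resolution state of the neighbor entry */
    int        nce_fp_ready;  /* hypothetical flag: fastpath header built */
} nce_t;

/* Stub standing in for nce_fastpath(); it only records that it ran. */
static void
nce_fastpath(nce_t *nce)
{
    nce->nce_fp_ready = 1;
}

/*
 * Hypothetical helper for the offlink case: the gateway's ire_cache is
 * already resolved, so force the returned nce to ND_REACHABLE and
 * (re)build its fastpath, whether the nce was just created or already
 * existed in some other state.
 */
static void
nce_mark_reachable(nce_t *nce)
{
    assert(nce != NULL);
    nce->nce_state = ND_REACHABLE;
    nce_fastpath(nce);
}
```

The point of doing this unconditionally on the returned nce is that it
also covers the case raised above, where the nce already exists but is
not ND_REACHABLE.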

> Yes, I overlooked this.  I've done some testing and this is true in my
> bits -- and seems to be true in onnv as well.  Is there a reason why
> ire_fp_mp has to be NULL? 

I recall running into race conditions where the fastpath would delete
the nce in between the calls from ip_newroute and the ire* functions. 

> I have not tested against onnv.  

Ok. So I guess there are no tests in the ipmp test suite to trigger this
particular case.

>                                    Is it possible that:
>   6508701 ire_add_v4() often adds unresolved IREs even when told not to
> ... is playing a role here?  Specifically, before that fix, ire_add_v4()
> will add unresolved IREs regardless of the allow_unresolved flag.  So

Nope, the root cause is different. Even if you add the ire, the packets
would never get sent unless someone kicked off arp.

> wouldn't that mask this bug?  If so, seems like IPMP should be pretty
> broken in Nevada right now.

I believe this case might not be easily encountered, even with ipmp,
which is why we have not seen it. Does the ipmp test suite actually
trigger a case where we send packets to an offlink dst through various
interfaces in an ipmp group, and there is only 1 gateway on the lan?

--Sowmini

