Sowmini,
I've tracked down the latest NCE-related issue. I'm not sure if this is
related to our proposed fix to the earlier problem, or something more
fundamental. However, it seems like this is off-the-rails in Nevada too.
To see the problem, suppose we've established an NCE to a destination but
the NCE has expired. In that case, when we next create an IRE that uses
the NCE, we will go through ire_create() -> ... -> ire_nce_init() ->
nce_reinit() -> ndp_add_v4(). The nce_reinit() function will pass a NULL
res_mp to ndp_add_v4(), which will then proceed to use ill_resolver_mp:

        /*
         * This one holds link layer address; if res_mp has been provided
         * by the caller, accept it without any further checks. Otherwise,
         * for V4, we fill it up with ill_resolver_mp here, then in
         * in ire_arpresolve(), we fill it up with the ARP query
         * once its formulated.
         */
        if (res_mp != NULL) {
                template = res_mp;
        } else {
                if (ill->ill_resolver_mp == NULL) {
                        freeb(mp);
                        return (EINVAL);
                }
-->             template = copyb(ill->ill_resolver_mp);
        }
        ...
-->     nce->nce_res_mp = template;

However, here, ill_resolver_mp is not a DL_UNITDATA_REQ, but an areq_t,
because of this code in ill_dl_up():

        areq_mp = ill_arp_alloc(ill,
            (uchar_t *)&ip_areq_template, 0);
        if (areq_mp == NULL) {
                return (ENOMEM);
        }
        freemsg(ill->ill_resolver_mp);
-->     ill->ill_resolver_mp = areq_mp;

As per the proposed fix, in ip_newroute() we will set nce_state to
ND_REACHABLE and call nce_fastpath() -> ill_fastpath_probe(), but
ill_fastpath_probe() expects a dl_unitdata_req_t in nce_res_mp, not an
areq_t. The result is that we send a completely bogus DL_IOC_HDR_INFO
message down to the driver, which means it returns EINVAL and
nce_fp_mp remains NULL. Also, nce_res_mp continues to point to an areq_t
rather than a dl_unitdata_req_t -- but ip_wput_ire() doesn't know this, so
it happily sends bogus "DL_UNITDATA_REQ" messages to the driver (with
dl_primitive set to 0x4103 aka AR_ENTRY_QUERY), causing the driver to send
up a DL_ERROR_ACK each time IP attempts to send a packet.
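
To make the type confusion concrete, here is a minimal user-space sketch
(not the actual kernel code). The struct layouts and every sk_* name are
hypothetical simplifications invented for illustration; the only values
taken from the real world are 0x4103 for AR_ENTRY_QUERY (as seen above)
and 0x07, which is the DLPI value of DL_UNITDATA_REQ. It shows that a
consumer reading the first 32-bit word of an areq_t-style message sees
0x4103 where it expects DL_UNITDATA_REQ, and what a defensive check on
that word might look like:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DL_UNITDATA_REQ per DLPI; AR_ENTRY_QUERY as observed in the trace above. */
#define SK_DL_UNITDATA_REQ      0x07u
#define SK_AR_ENTRY_QUERY       0x4103u

/* Hypothetical, simplified stand-in for dl_unitdata_req_t. */
typedef struct {
        uint32_t dl_primitive;
        uint32_t dl_dest_addr_length;
        uint32_t dl_dest_addr_offset;
} sk_unitdata_req_t;

/* Hypothetical, simplified stand-in for areq_t (an ARP command block). */
typedef struct {
        uint32_t areq_cmd;
        uint32_t areq_name_offset;
        uint32_t areq_name_length;
} sk_areq_t;

/*
 * Return the first 32-bit word of a message buffer, which is what a DLPI
 * consumer would interpret as dl_primitive. Returns UINT32_MAX if the
 * buffer is too short to hold one.
 */
uint32_t
sk_msg_primitive(const void *buf, size_t len)
{
        uint32_t prim;

        if (len < sizeof (prim))
                return (UINT32_MAX);
        memcpy(&prim, buf, sizeof (prim));
        return (prim);
}

/*
 * Defensive check: nonzero only if the buffer is large enough to be a
 * unitdata request and its first word really is DL_UNITDATA_REQ.
 */
int
sk_is_unitdata_req(const void *buf, size_t len)
{
        return (len >= sizeof (sk_unitdata_req_t) &&
            sk_msg_primitive(buf, len) == SK_DL_UNITDATA_REQ);
}
```

With something like sk_is_unitdata_req() guarding the fastpath probe and
the transmit path, an areq_t left in nce_res_mp would be rejected up
front rather than handed to the driver. Whether a check like this is the
right fix, as opposed to never storing an areq_t there in the first
place, is a separate question.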
Thoughts?
--
meem