Sowmini,

I've tracked down the latest NCE-related issue.  I'm not sure if this is
related to our proposed fix to the earlier problem, or something more
fundamental.  However, it seems like this is off-the-rails in Nevada too.

To see the problem, suppose we've established an NCE to a destination but
the NCE has expired.  In that case, when we next create an IRE that uses
the NCE, we will go through ire_create() -> ... -> ire_nce_init() ->
nce_reinit() -> ndp_add_v4().  The nce_reinit() function will pass a NULL
res_mp to ndp_add_v4(), which will then proceed to use ill_resolver_mp:

        /*
         * This one holds link layer address; if res_mp has been provided
         * by the caller, accept it without any further checks. Otherwise,
         * for V4, we fill it up with ill_resolver_mp here, then in
         * in ire_arpresolve(), we fill it up with the ARP query
         * once its formulated.
         */
        if (res_mp != NULL) {
                template = res_mp;
        } else  {
                if (ill->ill_resolver_mp == NULL) {
                        freeb(mp);
                        return (EINVAL);
                }
-->             template = copyb(ill->ill_resolver_mp);
        }
        ...
-->     nce->nce_res_mp = template;

However, here, ill_resolver_mp is not a DL_UNITDATA_REQ, but an areq_t,
because of this code in ill_dl_up():

                areq_mp = ill_arp_alloc(ill,
                        (uchar_t *)&ip_areq_template, 0);
                if (areq_mp == NULL) {
                        return (ENOMEM);
                }
                freemsg(ill->ill_resolver_mp);
-->             ill->ill_resolver_mp = areq_mp;
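
To make the two excerpts above concrete, here is a toy user-space model
of how the areq_t ends up in the NCE.  It is only a sketch: the toy_*
names and the reduction of each mblk to its leading primitive word are
made up for illustration and are not the real IP data structures.  The
0x07 value is DL_UNITDATA_REQ from DLPI, and 0x4103 is the
AR_ENTRY_QUERY value quoted later in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TOY_DL_UNITDATA_REQ	0x07u	/* DLPI primitive value */
#define	TOY_AR_ENTRY_QUERY	0x4103u	/* value quoted in this mail */

/* Hypothetical stand-ins: each "mblk" is reduced to its primitive word. */
typedef struct toy_ill {
	uint32_t	resolver_prim;	/* models ill_resolver_mp */
	int		has_resolver;	/* 0 models a NULL ill_resolver_mp */
} toy_ill_t;

typedef struct toy_nce {
	uint32_t	res_prim;	/* models nce_res_mp */
} toy_nce_t;

/* Models the ill_dl_up() excerpt: the template becomes an ARP query. */
static void
toy_ill_dl_up(toy_ill_t *ill)
{
	ill->resolver_prim = TOY_AR_ENTRY_QUERY;
	ill->has_resolver = 1;
}

/*
 * Models the template selection in ndp_add_v4(): a caller-supplied
 * res_mp is accepted as is, otherwise the ill's template is copied.
 * Returns 0 on success, -1 where the real code returns EINVAL.
 */
static int
toy_ndp_add(const toy_ill_t *ill, const uint32_t *res_prim, toy_nce_t *nce)
{
	assert(ill != NULL && nce != NULL);
	if (res_prim != NULL) {
		nce->res_prim = *res_prim;
	} else {
		if (!ill->has_resolver)
			return (-1);
		nce->res_prim = ill->resolver_prim;
	}
	return (0);
}
```

Calling toy_ill_dl_up() and then toy_ndp_add() with a NULL res_prim
(which is what nce_reinit() does) leaves res_prim at 0x4103 in this
model, i.e. the NCE inherits an ARP query rather than a unitdata
request.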

As per the proposed fix, in ip_newroute() we will set nce_state to
ND_REACHABLE and call nce_fastpath() -> ill_fastpath_probe(), but
ill_fastpath_probe() expects a dl_unitdata_req_t in nce_res_mp, not an
areq_t.  The result is that we send a completely bogus DL_IOC_HDR_INFO
message down to the driver, which returns EINVAL, and nce_fp_mp remains
NULL.  Also, nce_res_mp continues to point to an areq_t rather than a
dl_unitdata_req_t -- but ip_wput_ire() doesn't know this, so it happily
sends bogus "DL_UNITDATA_REQ" messages to the driver (with dl_primitive
set to 0x4103, aka AR_ENTRY_QUERY), causing the driver to send up a
DL_ERROR_ACK each time IP attempts to send a packet.
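
As a rough illustration of the mismatch (not a proposed fix), the sketch
below shows the kind of check on the leading primitive word that would
tell the two message types apart.  The sketch_* names are hypothetical,
and the buffer stands in for the data that nce_res_mp points at; the
only values assumed are DL_UNITDATA_REQ (0x07 in DLPI) and the 0x4103
mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_DL_UNITDATA_REQ	0x07u	/* DLPI primitive value */
#define	SKETCH_AR_ENTRY_QUERY	0x4103u	/* value quoted in this mail */

/*
 * Hypothetical helper: returns 1 if the buffer starts with the
 * DL_UNITDATA_REQ primitive, 0 otherwise (including short buffers).
 */
static int
sketch_is_unitdata_req(const void *buf, size_t len)
{
	uint32_t prim;

	if (buf == NULL || len < sizeof (prim))
		return (0);
	/* memcpy avoids assuming the buffer is suitably aligned. */
	memcpy(&prim, buf, sizeof (prim));
	return (prim == SKETCH_DL_UNITDATA_REQ);
}

/* Usage: an areq_t-style buffer fails the check, a unitdata one passes. */
static void
sketch_selftest(void)
{
	uint32_t areq = SKETCH_AR_ENTRY_QUERY;
	uint32_t ud = SKETCH_DL_UNITDATA_REQ;

	assert(!sketch_is_unitdata_req(&areq, sizeof (areq)));
	assert(sketch_is_unitdata_req(&ud, sizeof (ud)));
}
```

Neither ill_fastpath_probe() nor ip_wput_ire() makes a check like this
today as far as I can tell, which is why the areq_t goes down to the
driver unnoticed.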

Thoughts?
-- 
meem
