http://defect.opensolaris.org/bz/show_bug.cgi?id=11092


amaguire <alan.maguire at sun.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |CAUSEKNOWN


--- Comment #16 from amaguire <alan.maguire at sun.com> 2009-09-18 19:39:54 UTC ---
Updating since the cause is (somewhat) known. Part of the solution is in 11437
(ensuring that we only go online* when we lose an address), and part of the
solution is to go to WAITING_FOR_ADDR in all cases except where static
addresses are used. The final piece of the puzzle is that addresses in
nwamd_handle_if_state_event appear to be truncated sometimes. For example, bge1
adds an ADDRCONF address of , but all that appears is 2002:a08:39f0:3::. At the
same time bge0:1's ADDRCONF address doesn't get truncated. Here's the
appropriate portion of the log (this is with an nwamd with the 11437 and
WAITING_FOR_ADDR fixes):

Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 223585 daemon.debug] 1: nwamd_ncu_handle_if_state_event: if interface:bge1, state (offline*, (re)initialized but not configured)
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 204897 daemon.debug] 1: nwamd_ncu_handle_if_state_event: new addr: 2002:a08:39f0:3::
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 399939 daemon.error] 1: nwamd_ncu_handle_if_state_event: can't find lifnum for index 15 addr 2002:a08:39f0:3::
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 732162 daemon.debug] 1: nwamd_event_dequeue: nonblock
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 413473 daemon.debug] 1: dequeueing event 373 of type 23 (QUEUE_QUIET) for object none
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 140469 daemon.debug] 5: routing message NEWADDR: index 14 flags 1
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 247535 daemon.debug] 5: netmask: ffff:ffff:ffff:ffff::
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 508102 daemon.debug] 5: interface name: link bge0:1
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 760460 daemon.debug] 5: interface address: 2002:a08:39f0:3:9544:b19a:b05e:b4a7
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 599442 daemon.debug] 5: broadcast address: ::
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 403732 daemon.debug] 5: enqueueing event 374 12 (IF_STATE) for object (80d0088) interface:bge0
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 715218 daemon.debug] 1: dequeueing event 374 of type 12 (IF_STATE) for object interface:bge0
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 906461 daemon.debug] 1: (80d0088) interface:bge0: running method for event IF_STATE
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 861422 daemon.debug] 1: nwamd_ncu_handle_if_state_event: if interface:bge0, state (offline*, (re)initialized but not configured)
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 204897 daemon.debug] 1: nwamd_ncu_handle_if_state_event: new addr: 2002:a08:39f0:3::
Sep 18 14:51:18 whitestar3-0.East.Sun.COM nwamd[102325]: [ID 391747 daemon.error] 1: nwamd_ncu_handle_if_state_event: can't find lifnum for index 14 addr 2002:a08:39f0:3::

Neither address is truncated in the RTM_NEWADDR message, so the truncation
must occur during the construction of the IF_STATE event.
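
One observation that may help narrow this down: the truncated value is exactly
what you get by ANDing the full address with the /64 netmask reported in the
routing message. The sketch below (Python, using only the bge0:1 address and
netmask from the log above) demonstrates that coincidence and nothing more.
Whether the IF_STATE construction actually applies the netmask is an
assumption, not something confirmed here, and apply_netmask() is a
hypothetical helper, not an nwamd function.

```python
import ipaddress


def apply_netmask(addr: str, netmask: str) -> str:
    """Return addr ANDed with netmask, in compressed IPv6 text form.

    Hypothetical helper for illustration; not part of nwamd.
    """
    addr_bits = int(ipaddress.IPv6Address(addr))
    mask_bits = int(ipaddress.IPv6Address(netmask))
    return str(ipaddress.IPv6Address(addr_bits & mask_bits))


# Values copied from the RTM_NEWADDR portion of the log for bge0:1.
full_addr = "2002:a08:39f0:3:9544:b19a:b05e:b4a7"
netmask = "ffff:ffff:ffff:ffff::"

# Prints the masked address so it can be compared with the "new addr" log line.
print(apply_netmask(full_addr, netmask))
```

If the printed value matches the "new addr" line, the prefix-only address could
plausibly come from a masked copy of the address (or from a copy that stops
after the first 8 bytes), which would be one place to start looking.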

Haven't root-caused the truncation yet, but the result is that
lifnum_for_addr() fails, and we don't go ONLINE.
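
To illustrate why a truncated address would make the lookup fail, here is a
minimal model in Python. It assumes lifnum_for_addr() does an exact-match
comparison against the addresses configured on the logical interfaces; the real
function is not shown in this bug, so lifnum_for_addr_model() and the LIFS
table are hypothetical stand-ins. The only real data is the bge0:1 address from
the log.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical table of logical interface number -> configured address.
# Only the bge0:1 entry (lifnum 1) is taken from the log above.
LIFS: Dict[int, str] = {
    1: "2002:a08:39f0:3:9544:b19a:b05e:b4a7",
}


def lifnum_for_addr_model(lifs: Dict[int, str], addr: str) -> Optional[int]:
    """Return the lifnum whose address equals addr exactly, else None.

    Assumed behavior for illustration; not the nwamd implementation.
    """
    wanted = ipaddress.IPv6Address(addr)
    for lifnum, lif_addr in lifs.items():
        if ipaddress.IPv6Address(lif_addr) == wanted:
            return lifnum
    return None


# The full address from RTM_NEWADDR is found; the truncated one is not.
print(lifnum_for_addr_model(LIFS, "2002:a08:39f0:3:9544:b19a:b05e:b4a7"))
print(lifnum_for_addr_model(LIFS, "2002:a08:39f0:3::"))
```

Under that assumption, the prefix-only address can never match a configured
address, which is consistent with the "can't find lifnum" errors in the log.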

Why does this happen for some v6 addresses and not others? For a shared
prioritized NCU group of wired links, both at the same priority, I see one come
up and the other not due to the truncation issue. However, it seems to be
random which comes up and which doesn't (sometimes it's bge0, other times
bge1). What is consistent is that one comes up and the other does not, because
the truncated address causes lifnum_for_addr() to fail. Weird...
