IPMP depends on the probe target responding to pings in a consistent way. If the probe target responds to some probes but not others, then probe-based failure detection may not work. The piece of code you refer to essentially tries to ensure that consistency when the probe target is itself running IPMP. In the case where there are no routers and the local network has only Solaris machines, all running IPMP, this edge case becomes critical. At least one of our groups used to run IPMP in this mode, and reported problems. In the degenerate case there are just two machines, each with a few interfaces in an IPMP group, connected back to back or through a switch.
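A toy model of the two-machine case may make the inconsistency concrete. This is only a sketch under stated assumptions: the interface names A, B (on the probing host) and C, D (on the probe target), the SPREAD table, and the replies_seen helper are all made up for illustration and are not taken from the IP code. It assumes a transmit failure on C and compares load-spread replies with replies sent out on the interface that received the ping.

```python
# Toy model only; every name here is hypothetical, not from the Solaris source.
TX_FAILED = {"C"}               # target interfaces whose transmit path is broken
SPREAD = {"A": "C", "B": "D"}   # assumed load-spreading choice, keyed on reply destination


def replies_seen(same_interface):
    """Return {probing interface: sorted list of targets whose replies arrived}."""
    seen = {}
    for src in ("A", "B"):
        arrived = []
        for target in ("C", "D"):
            # Either reply on the interface that received the ping,
            # or let the (assumed) load-spreading table pick the interface.
            out_if = target if same_interface else SPREAD[src]
            if out_if not in TX_FAILED:
                arrived.append(target)
        seen[src] = sorted(arrived)
    return seen


print(replies_seen(same_interface=False))  # load spreading on
print(replies_seen(same_interface=True))   # reply on the receiving interface
```

In this model, with load spreading A sees no replies at all while B sees replies from both targets, so the probing host would blame A. With same-interface replies, A and B both lose only the replies from C, which is the consistent view the restriction is meant to produce.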
As an example, consider the degenerate case above. We have interfaces A and B in an IPMP group on host H1, and interfaces C and D in another IPMP group on the probe target machine, H2. Suppose there is a transmit path failure on C at time T. Until failure detection happens on H2, IP may use either C or D to send out the ping response. IP load spreads based on the ping source address (by creating a destination-based ire cache entry) and sends the ping response out on some interface. So it is possible that the responses to pings originating from A (to both C and D) go out on C, while the responses to pings originating from B go out on D. With a transmit failure on C, A stops seeing responses altogether, while B still sees responses from both C and D. At time T + 10, H1 would misdiagnose that A has failed. By imposing the restriction that ping replies go out on the same interface on which the ping was received, we make A and B see a consistent probe target failure and avoid this misdiagnosis.

Thirumalai

Peter Memishian wrote:

> > > If we don't turnoff load spreading, the packets might get dropped if
> > > there are no non-FAILED/INACTIVE interfaces for it to go out.
> > >
> > >Specifically, it seems to me that the interface must be functioning in
> > >order to have data addresses and thus to receive an ICMP_ECHO_REQUEST
> > >message in the first place. Given that the interface is functioning, that
> > >means there must be at least one functioning interface for the packet to
> > >go out of and thus I don't see how the above comment applies.
> > >
> > True we received the echo request, but I think the assumption is that the
> > transmit and receive are separate paths in the NIC and could potentially
> > fail independently.
>
>But if the transmit path of the interface has failed and IPMP is enabled
>on that interface, then in.mpathd will have already marked it IFF_FAILED.
>If the transmit path is not being tested (e.g., because only link-based
>failure detection is enabled) then the interface will not be marked
>IFF_FAILED and the comment above doesn't apply.
>
>(Further, this concern seems suspect, since IPMP may pick any set of hosts
>as probe targets, and those be running other operating systems that have
>no regard for the concerns of in.mpathd's probes.)
>
>Do you recall what caused this code to be added? It seems to be there
>since IPMP was introduced in Solaris 8. I think it either needs to be
>clearly justified or removed.
