On 26/07/10 12:57 PM, James Carlson wrote:
Darren Reed wrote:
On 26/07/10 08:18 AM, James Carlson wrote:
darren.r...@oracle.com wrote:
Author: Darren Reed <darren.r...@oracle.com>
Repository: /hg/onnv/onnv-gate
Latest revision: 2794a0c9cce102961d08f075ee1f569073b99786
Total changesets: 1
Log message:
6965774 bound the interface index in IP to [1,65535]
Files:
update: usr/src/uts/common/inet/ip/ip_if.c
update: usr/src/uts/common/net/if.h
I suspect this change will have two serious effects:
- SNMP is now potentially broken. The interface IDs cannot be reused
without an engine reboot indication, and restricting to a 16 bit range
makes reuse very much more likely than before.
Which interface IDs cannot be used?
The SNMP ifIndex value itself cannot be recycled without reporting a
reboot in the SNMP engine. (It's not necessary that the *system* itself
be rebooted, but that at least the SNMP engine behave as though the
system were rebooted by bumping up the generation number.)
See RFC 2863 section 3.1.5.
Do you mean to say that SNMP cannot handle two different
network interfaces having the same interface ID during the
lifetime of the SNMP agent?
Correct.
But to me this sounds like a fundamental problem with SNMP, and
choosing 32 bits for a network interface identifier does not
represent a better design or solution. It only pushes the
problem out to some point in the future in the hope that it never
appears on your watch.
Correct. At 2^32 IDs, "not on my watch" means that a system churning
away at (say) 5 plumb/unplumb sequences per second would last a bit over
27 years before the feared roll-over occurs. My plans at that time
include being either retired or dead. Maybe both.
2^16 IDs passes by much more quickly.
- Systems with large numbers of virtual interfaces (tunnels and PPP
links, for example) may now run out of identifiers when this wasn't
previously possible.
This was considered.
65,535 is a LOT of network interfaces.
Further, the limit is only per instance of IP.
A "lot" depends on what you're doing. At just one replumb a minute,
you'd wrap in 45 days.
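
The roll-over estimates quoted so far can be checked with a few lines of arithmetic. This is only a sketch: the function name and the assumption of a constant plumb/unplumb rate are illustrative, and the rates are the ones mentioned in this thread.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def seconds_until_wrap(id_space: int, allocations_per_second: float) -> float:
    """Time to consume every ID once, assuming a constant allocation rate
    and no reuse of released IDs (hypothetical helper for illustration)."""
    return id_space / allocations_per_second


# 2^32 IDs at 5 plumb/unplumb sequences per second.
years_32 = seconds_until_wrap(2**32, 5) / SECONDS_PER_YEAR

# [1,65535] at one replumb per minute.
days_16 = seconds_until_wrap(65_535, 1 / 60) / SECONDS_PER_DAY

# [1,65535] at the same 5 per second used for the 32-bit estimate.
hours_16_fast = seconds_until_wrap(65_535, 5) / SECONDS_PER_HOUR

print(f"32-bit space at 5/s:   {years_32:.1f} years")
print(f"16-bit space at 1/min: {days_16:.1f} days")
print(f"16-bit space at 5/s:   {hours_16_fast:.1f} hours")
```

Under these assumptions the 32-bit figure comes out a little over 27 years and the one-per-minute figure at about 45 days, matching the numbers above.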
A given installation of Solaris can only support 1024 zones and
it would require every zone to be using a shared network instance
and 64 network interfaces for it to be a problem in that direction.
Simultaneously, yes, that's one possibility.
Sequentially, though, it's much easier to get there.
And things like tunnels can get you there much faster. I'm *SURE* that
punchin has rolled over that 2^16 limit many times over.
I'd hazard a guess that punchin gets rebooted for upgrading
before that happens. To give that more context, Dan did not
seem to think that fixing the IKE daemon's naive use of the
index from the routing message was more important than a P4,
which is a likely indication that the index has not passed 65535.
Can you imagine how long "ifconfig -a" would take to run on a
system with that many network interfaces?
The numbers are intentionally like PIDs; they're not reused until the
worst happens. I believe that's the point you may be missing here.
One of the ideas that I floated around before addressing this
was to have DEBUG kernels start their IP interface index
allocation at 100,000, since we do something similar for PIDs
but nobody was interested in that.
But to take that further, except for PIDs under 100, the system
does reuse the PID number space, so why shouldn't it reuse
network interface IDs?
Was any consideration given to fixing the applications that rely on
these old BSD interfaces? That's what we had been doing since at least
2.6 -- making the applications aware that ifIndex numbers could be
ambiguous, and using other means to verify the data. There are numerous
examples of this work in the source base. Let me know if you need
pointers; I can google it for you. ;-}
In general, having aliases in the ifIndex space seen by the old BSD
interfaces is "not a problem" for most applications, because getting
routing socket messages just means that it's time to use the ioctls to
get the real current information. The routing socket messages
themselves contain too little data to make a fully functioning program
on Solaris anyway.
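
A minimal sketch of that pattern, for illustration only: treat the index in a routing socket message as a hint, then re-query the system for the current state. Python's socket.if_indextoname() stands in here for the ioctl-based lookups; the InterfaceCache class and its return values are hypothetical names, not anything from the Solaris source base.

```python
import socket
from typing import Callable, Dict, Optional


def resolve_name(index: int) -> Optional[str]:
    """Ask the OS which interface currently owns this index, if any."""
    try:
        return socket.if_indextoname(index)
    except OSError:
        return None


class InterfaceCache:
    """Hypothetical cache that never trusts a bare index from a message."""

    def __init__(self, resolver: Callable[[int], Optional[str]] = resolve_name):
        self._resolver = resolver
        self._names: Dict[int, str] = {}

    def on_routing_message(self, index: int) -> str:
        """Re-resolve the index and report what changed since last time."""
        current = self._resolver(index)
        previous = self._names.get(index)
        if current is None:
            # The index no longer maps to any interface.
            self._names.pop(index, None)
            return "gone"
        self._names[index] = current
        if previous is None:
            return "new"
        # Same index, different name: the index was recycled or aliased.
        return "unchanged" if previous == current else "replaced"
```

Comparing names is the simplest possible check; a real application would likely compare more of the interface state before deciding that nothing changed.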
Or perhaps "fixing" these old interfaces with some new mechanism?
In fixing the situation to support more than 16 bits for a network
interface identifier, changes to the routing message would be
required. That change would then break our compatibility with
every open source application that exists and uses those messages.
It would also break compatibility with older applications built for
Solaris. If a new interface was introduced and the old one left in
place for compatibility reasons, that doesn't stop the ones using
the old interface from causing strange behaviour when the index
exceeds 65535. The sockaddr_dl message is used by applications to
both send and receive routing messages.
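
To illustrate the kind of strange behaviour meant here, the sketch below assumes that the index travels in a 16-bit unsigned field and that a larger kernel index is silently truncated on the way in, as a plain C assignment would do. Both the helper name and the truncation behaviour are assumptions for illustration, not a description of the actual Solaris code path.

```python
INDEX_FIELD_MASK = 0xFFFF  # assumed width of the index field: 16 bits


def index_seen_by_application(kernel_index: int) -> int:
    """Value an application would read back if the kernel index were
    truncated to 16 bits (hypothetical helper)."""
    return kernel_index & INDEX_FIELD_MASK


# Two different interfaces become indistinguishable to the application.
old_interface = 3
new_interface = 65_536 + 3

print(index_seen_by_application(old_interface))  # 3
print(index_seen_by_application(new_interface))  # 3
print(index_seen_by_application(65_536))         # 0
```

In this model an index just past the limit aliases onto a low-numbered interface, and index 65536 itself reads back as 0.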
Indeed. That's exactly why this interface has been left alone for
decades, and instead the fixes were put into the applications using the
interfaces.
If I quickly look at current BSD source code, I find:
- index allocation using the smallest available index
- limited to USHORT_MAX
Linux uses a signed integer and has its own routing message
protocol (that seems to not use the index...)
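
The BSD-style policy described above (hand out the smallest free index, never exceed a 16-bit limit) can be sketched as follows. This is an illustration of the policy only, not the BSD or Solaris implementation; the class name, the linear scan, and the INDEX_LIMIT constant are assumptions.

```python
INDEX_LIMIT = 65_535  # assumed upper bound, matching the [1,65535] range


class SmallestFreeIndexAllocator:
    """Hypothetical allocator that always reuses the lowest free index."""

    def __init__(self, limit: int = INDEX_LIMIT):
        self._limit = limit
        self._in_use = set()

    def allocate(self) -> int:
        # Linear scan for clarity; a real kernel would use a faster structure.
        for candidate in range(1, self._limit + 1):
            if candidate not in self._in_use:
                self._in_use.add(candidate)
                return candidate
        raise RuntimeError("no free interface index")

    def release(self, index: int) -> None:
        self._in_use.discard(index)


allocator = SmallestFreeIndexAllocator()
first, second, third = (allocator.allocate() for _ in range(3))
allocator.release(second)
reused = allocator.allocate()  # index 2 is handed out again immediately
```

Note that this policy reuses a released index right away, which is the opposite of the PID-like behaviour discussed earlier in the thread, where an index is not reused until the counter wraps.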
I don't believe that the fix applied for CR 6965774 is really the right
idea. It perhaps makes some sense in a Windows-like environment where
you're encouraged (sometimes forcefully) to reboot every few hours or
so, but not so much for an OS that runs for a long period of time.
Tossing away the 32-bit counter that we put into IP decades ago seems
like a step backwards.
I think that the correct solution is to fix SNMP to not assume
or assert that an interface index should be unique for the
entire "uptime" of a host.
It may also be that a new routing message format could be whipped
up and spread around so that this and other limitations can be
addressed. But that will take substantially longer and carries
with it more peril.
Until one of those two happens, it is a mistake to not limit the
index allocation to [1,65535] because the only way of "fixing"
the problems that can occur once 65535 is passed is with a
reboot. To that end, any minor SNMP inconvenience seems trivial.
Or to put it differently, the potential for problems posed by
passing 65535 vastly outweighs the SNMP side of the equation.
Darren
_______________________________________________
networking-discuss mailing list
networking-discuss@opensolaris.org