[brussels-dev] [networking-discuss] shrinking the ndd tunable list

Kacheong Poon Tue, 03 Mar 2009 00:43:18 +0800

sowmini.varadhan at sun.com wrote:

> I wasn't suggesting that. However, Solaris has an enormous number
> of tunables compared to other OSes. It's not clear if we
> are papering over some problems (or lack of flexibility) in the kernel
> or just designing "solutions in search of a problem", to use
> your own phrase from the earlier tcp discussion.
> 
> AFAIK equivalents of these tunables do not exist on comparable
> OSes, so I don't know how they deal with the "problem apps"
> etc. that we are designing against.



There can be many reasons for this.  For example, the other
OSes' developers do not care :-)  Or those OSes do not talk
to such a diverse set of peers, so their developers have
not encountered those strange situations.  Just because the
other OSes (actually, which ones have you checked?) do
not have so many tunables does not mean that those ndd params
serve no purpose.  Note very clearly that I am not saying
that the existing ndd params are all very useful...


> - if we want to have an admin knob to fix up the ttl to handle
>   problem apps (and the admin really cannot fix/kill the problem
>   app itself!) why isn't one ttl tunable (for ip) enough? What does
>   it mean to provide ndd tunables for tcp_def_ttl, ip_def_ttl (which
>   is really icmp_err_ttl) and icmp_def_ttl (which is really raw ip ttl)?
>   And what does it mean to provide a tunable for ipv6 unicast hops?


From reading what you wrote, they are for different purposes.
So are you asserting that affecting all types of apps is
equivalent to or better than affecting, say, only raw socket
apps?


>>> - icmp_bsd_compat: sounds like something that should
>>>   be a setsockopt, if we want something other than the
>>>   default. Does anyone actually alter the default here?
> 
> this one is defined as "if 1 (default) the length field in the IP
> header of received datagrams is adjusted to exclude the length of the
> IP header. This is compatible with Berkeley derived implementations and
> is for applications reading raw IP or raw ICMP packets. If 0, the
> length is not changed." 
> 
> But if the admin sets this to 0, all the existing raw-socket apps 
> that look at the length would be broken! How does it help to have
> a big button for the whole machine in this case? 


So I guess this one is for bug compatibility with old BSD.
I'm wondering if the current *BSDs still behave like this.
If not, it means that some sys admins may actually turn this
off to make newer apps work.  Have you checked this out?
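
To make the difference concrete, here is a rough sketch in C of what a
raw-socket reader has to do under each setting, going only by the
definition quoted above.  raw_ip_payload_len() is a hypothetical helper,
not an existing Solaris interface, and it assumes the length field is
read in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): return the number of payload
 * bytes that follow the IPv4 header of a datagram read from a raw socket.
 *
 * bsd_compat != 0: the stack has already subtracted the IP header length
 *                  from the length field (the icmp_bsd_compat = 1 case),
 *                  so the field is the payload length as is.
 * bsd_compat == 0: the field is unchanged and still includes the header.
 *
 * Assumption: pkt points at a complete IPv4 header and the length field
 * is in network byte order.
 */
size_t
raw_ip_payload_len(const uint8_t *pkt, int bsd_compat)
{
	size_t hdr_len = (size_t)(pkt[0] & 0x0f) * 4;	/* IHL is in 32-bit words */
	size_t len_field = ((size_t)pkt[2] << 8) | pkt[3];

	assert(hdr_len >= 20);		/* minimum IPv4 header */
	if (bsd_compat)
		return (len_field);
	assert(len_field >= hdr_len);
	return (len_field - hdr_len);
}
```

An app that hard-codes one of the two branches gets a length that is
off by the header size when the machine-wide setting is the other one.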


>>> - [ip, tcp, icmp, udp]_wroff_extra- who modifies these
>>>   from defaults? The stack should self-tune these based
>>>   on the ill_phys_addr_length it learns through DLPI.
> 
> First, I don't know why you would want to do this per ulp, for
> the whole machine.  Second, I don't know why you would want to do this
> at all: if a driver-writer is working on some exotic driver that has
> a different link-layer length, then they should just provide
> that link-layer length in the DLPI messages, and IP should
> adjust suitably. What is the purpose of providing this knob
> to your average admin who is likely running a mix of ethernet,
> tunnel, ibd and other interfaces? 


I suspect that this is for historical reasons, since the
transports, IP and the drivers are all separate STREAMS modules.
And I guess there was no communication between them about
exactly this info.

One interesting question to think about is whether using
ill_phys_addr_length is good enough.  All the _wroff_extra
params have values bigger than most MAC header lengths.  One
use of this I can think of is for interesting alignment.
Then using ill_phys_addr_length is probably not OK.  I don't
know if it actually matters.  But I'd suggest you check
it out before removing them and changing the code to use
ill_phys_addr_length.
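
As an illustration of the alignment concern, here is a minimal sketch
in C.  It is not the actual stack code: wroff_for() and its parameters
are hypothetical, and it only shows how a write offset derived from the
header lengths alone can differ from one that also carries an extra pad
and is rounded up to an alignment boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): head room to reserve in front
 * of the payload so that lower layers can prepend their headers in place.
 *
 * hdrs_len: total length of the headers expected to be prepended
 * extra:    additional pad, in the spirit of the *_wroff_extra params
 * align:    required alignment of the result, must be a power of two
 */
size_t
wroff_for(size_t hdrs_len, size_t extra, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	/* Round the sum up to the next multiple of align. */
	return ((hdrs_len + extra + align - 1) & ~(align - 1));
}
```

For example, with 54 bytes of headers (14 Ethernet + 20 IP + 20 TCP, as
an assumed case) and 8-byte alignment, wroff_for(54, 0, 8) gives 56,
while wroff_for(54, 32, 8) gives 88.  Whether that difference matters
in practice is exactly what should be checked first.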




-- 

                                                K. Poon.
                                                kacheong.poon at sun.com
