Re: [networking-discuss] Fwd: svccfg apply profile.xml - network addresses do not change

2011-08-22 Thread Darren Reed

On 22/08/11 09:03 PM, Mark Haywood wrote:

 On 08/22/11 15:04, Darren Reed wrote:

On 22/08/11 04:16 PM, Mark Haywood wrote:
 Comment from the net-install service to which your profile is 
applying its properties:


#
# The network/install service will configure interfaces using the
# property values supplied by the install profile. Once the configuration
# has been applied, the service will not apply another configuration
# unless an unconfiguration has been performed first.
#

The service maintains an SMF property, "config/applied" to track 
this. If you don't want to "sysconfig unconfigure", then you'll have 
to reset the "config/applied" property to "false". I suppose you 
could do that manually or via the profile.


Is this the correct way to change "config/applied" to false?











I'm not sure if that would work or not. In particular, I thought that 
the property group and property value types had to be specified. I 
assume that you already have a profile which sets the network/install 
property values. If so, then I would insert:






That is how it is defined in the service manifest itself and would, I 
assume, work for the profile.


Or you could simply use svccfg to reset the property to false. 


Using the below XML as the only contents for a file that I named "b", 
doing an "svccfg apply b" neither returned an error nor worked.


Darren


'/usr/share/lib/xml/dtd/service_bundle.dtd.1'>










value='10.134.67.103/24'/>







___
networking-discuss mailing list
networking-discuss@opensolaris.org


Re: [networking-discuss] Fwd: svccfg apply profile.xml - network addresses do not change

2011-08-22 Thread Darren Reed

On 22/08/11 04:16 PM, Mark Haywood wrote:
 Comment from the net-install service to which your profile is 
applying its properties:


#
# The network/install service will configure interfaces using the
# property values supplied by the install profile. Once the configuration
# has been applied, the service will not apply another configuration
# unless an unconfiguration has been performed first.
#

The service maintains an SMF property, "config/applied" to track this. 
If you don't want to "sysconfig unconfigure", then you'll have to 
reset the "config/applied" property to "false". I suppose you could do 
that manually or via the profile.


Is this the correct way to change "config/applied" to false?










Cheers,
Darren




[networking-discuss] svccfg apply profile.xml - network addresses do not change

2011-08-22 Thread Darren Reed

Is there something below that I'm doing wrong?

Darren

Boot device: /pci@7c0/pci@0/pci@8/scsi@2/disk@0,0:a  File and args:
SunOS Release 5.11 Version on-hg_ob2 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
DEBUG enabled
misc/forthdebug (170417 bytes) loaded
Hostname: netvirt-b1
Aug 22 06:01:11 svc.startd[12]: svc:/system/ocm:default: Method "/lib/svc/method/svc-ocm start" failed with exit status 95.
Aug 22 06:01:11 svc.startd[12]: system/ocm:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)


netvirt-b1 console login: root
Password:
Last login: Mon Aug 22 05:54:42 on console
Oracle Corporation  SunOS 5.11  on-hg_ob2   Aug. 16, 2011
SunOS Internal Development: dr146992 2011-Aug-16 [on-hg_ob2]
root@netvirt-b1:~# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff00
net0: flags=1000843 mtu 1500 index 2
        inet 10.0.0.10 netmask ff00 broadcast 10.255.255.255
        ether 0:14:4f:af:e:e2
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
net0: flags=20002000840 mtu 1500 index 2
        inet6 ::/0
        ether 0:14:4f:af:e:e2
root@netvirt-b1:~# cd /var/svc/profile
root@netvirt-b1:/var/svc/profile# ls
abc             profile.xml
root@netvirt-b1:/var/svc/profile# svccfg apply profile.xml
root@netvirt-b1:/var/svc/profile# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff00
net0: flags=1000843 mtu 1500 index 2
        inet 10.0.0.10 netmask ff00 broadcast 10.255.255.255
        ether 0:14:4f:af:e:e2
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
net0: flags=20002000840 mtu 1500 index 2
        inet6 ::/0
        ether 0:14:4f:af:e:e2
root@netvirt-b1:/var/svc/profile# grep v4_addr profile.xml
root@netvirt-b1:/var/svc/profile# grep address profile.xml

value='10.134.67.103/24'/>







root@netvirt-b1:/var/svc/profile# svcadm refresh network
root@netvirt-b1:/var/svc/profile# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff00
net0: flags=1000843 mtu 1500 index 2
        inet 10.0.0.10 netmask ff00 broadcast 10.255.255.255
        ether 0:14:4f:af:e:e2
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
net0: flags=20002000840 mtu 1500 index 2
        inet6 ::/0
        ether 0:14:4f:af:e:e2
root@netvirt-b1:/var/svc/profile# cp profile.xml profile2.xml
root@netvirt-b1:/var/svc/profile# svccfg apply profile2.xml
root@netvirt-b1:/var/svc/profile# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff00
net0: flags=1000843 mtu 1500 index 2
        inet 10.0.0.10 netmask ff00 broadcast 10.255.255.255
        ether 0:14:4f:af:e:e2
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
net0: flags=20002000840 mtu 1500 index 2
        inet6 ::/0
        ether 0:14:4f:af:e:e2
root@netvirt-b1:/var/svc/profile# reboot
syncing file systems... done
rebooting...

SC Alert: Host System has Reset
|

Sun Fire(TM) T1000, No Keyboard
Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.30.4.b, 8064 MB memory available, Serial #78581474.
Ethernet address 0:14:4f:af:e:e2, Host ID: 84af0ee2.



Boot device: /pci@7c0/pci@0/pci@8/scsi@2/disk@0,0:a  File and args:
SunOS Release 5.11 Version on-hg_ob2 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
DEBUG enabled
misc/forthdebug (170417 bytes) loaded
Hostname: netvirt-b1
Aug 22 06:12:20 svc.startd[12]: svc:/system/ocm:default: Method "/lib/svc/method/svc-ocm start" failed with exit status 95.
Aug 22 06:12:20 svc.startd[12]: system/ocm:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)


netvirt-b1 console login: root
Password:
Last login: Mon Aug 22 06:04:48 on console
Oracle Corporation  SunOS 5.11  on-hg_ob2   Aug. 16, 2011
SunOS Internal Development: dr146992 2011-Aug-16 [on-hg_ob2]
root@netvirt-b1:~# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
net0: flags=1000843 mtu 1500 index 2
        inet 10.0.0.10 netmask ff00 broadcast 10.255.255.255
        ether 0:14:4f:af:e:e2
lo0: flags=2002000849 mtu 8252 index 1
        inet6 ::1/128
net0: flags=20002000840 mtu 1500 index 2
        inet6 ::/0
        ether 0:14:4f:af:e:e2
root@netvirt-b1:~# cd /var/svc/profile
root@netvirt-b1:/var/svc/profile# cat profile2.xml

'/usr/share/lib/xml/dtd/service_bundle.dtd.1'>


...






value='10.134.67.103/24'/>










Re: [networking-discuss] JUmpStart Installation and ZFS flar installation

2011-06-08 Thread Darren Reed

This is not a list for the discussion of platforms other than OpenSolaris.

The suggestions you provided are not in any way helpful in answering the 
original question.


In the future, please stick to providing information related to the 
topic at hand.


I will also go further and note that JumpStart is no longer used for 
automated network installs of OpenSolaris - Automated Install (AI) is. 
Questions about AI are best raised in other forums on opensolaris.org. 
Suffice to say that AI is very different to JumpStart.


Darren

On  6/06/11 04:43 PM, Daniel Payno wrote:

[NON-PC]

In my early comp.os* and bbs's days, this would've gotten a blazing fast RTFM...

Kind of miss the old days 0;)

[NON-PC]
[useful]

If you're familiar with Red Hat Linux's Kickstart autoconfiguration, you'll 
grasp the essence very fast and will only need to learn the new syntax and the 
required services.  Both share basic functionality: tftp, bootp/dhcp, etc.

Googling it will produce tons of useful links

73s!

On 07/06/2011, at 01:15, Darren Reed wrote:

   

On  1/06/11 01:44 AM, Praneeth Ravikanti wrote:
 

Can anybody please explain what is meant by JumpStart installation? As far as I 
know, Solaris installation with a ZFS flar can be done only with JumpStart 
installation. Is that correct? If so, what is JumpStart installation and how 
can we do it? Can a JumpStart installation be done from any server or any 
bootable CD?

   

There is a lot of documentation on JumpStart installation.

I would suggest that you start reading it here:
http://download.oracle.com/docs/cd/E19253-01/821-0437/index.html

Darren


Re: [networking-discuss] JUmpStart Installation and ZFS flar installation

2011-06-06 Thread Darren Reed

On  1/06/11 01:44 AM, Praneeth Ravikanti wrote:

Can anybody please explain what is meant by JumpStart installation? As far as I 
know, Solaris installation with a ZFS flar can be done only with JumpStart 
installation. Is that correct? If so, what is JumpStart installation and how 
can we do it? Can a JumpStart installation be done from any server or any 
bootable CD?
   


There is a lot of documentation on JumpStart installation.

I would suggest that you start reading it here:
http://download.oracle.com/docs/cd/E19253-01/821-0437/index.html

Darren



Re: [networking-discuss] Bizarre code in dladm

2011-05-23 Thread Darren Reed

On 23/05/11 08:58 AM, James Carlson wrote:

Brian Utterback wrote:
   

On 05/20/11 22:10, James Carlson wrote:
 

As for whether it has any effect on -g, I don't know.  I can't imagine
that any competent compiler implementation would have trouble generating
proper code for that construct, but I guess I don't have a good
imagination for bad compilers ...

   

A lot of compilers will turn off all optimizations when -g is used.
Others will turn off some opts. It was not that long ago that you were
not allowed to use -O and -g at the same time. Clearly using the -g
option in this case turns off the compile time optimizations that remove
the bogus function call. This makes sense because the point of -g is to
force the compiler to retain as much as possible the correspondence
between the source code and the generated assembly code so the developer
can debug it in a sane manner.
 

Disabling optimizations that might confuse those attempting to debug the
code is one thing, but I think leaving in a call that the compiler knows
can't possibly be made (due to constant evaluation) is quite another.
For one thing, the external linkage means that it can trigger dynamic
library loading that (for the "optimized" code) would just never happen.
  And I think allowing that would add to (not lessen) the potential
confusion.

I would not expect "-g" to turn off basic constant folding on most
compilers, and this doesn't look like an exception to me.
   


Well I can tell you that SUNWspro fails to compile
dladm with -g as the command line option.



But, anyway, the point was to explain the construct.  It's not an
uncommon way to verify a relationship between structures at compile time.

Another example would be:

	if (offsetof(struct mystruct, my_exposed_member) != 16)
		someone_broke_binary_compatibility_for_mystruct();
   



If you know that your compiler (such as gcc) doesn't mind
things like this, use:

#ifdef __GNUC__
	if (offsetof(struct mystruct, my_exposed_member) != 16)
		someone_broke_binary_compatibility_for_mystruct();
#endif


The behaviour you're expecting is quite rightly compiler
dependent and there's no reason to expect it to work on
every compiler.

I'd actively discourage people from ever writing such code.
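
For comparison, here is a minimal sketch of a compile-time check that does not rely on the optimizer removing a call to an undefined function, so it behaves the same with or without -g. The macro name COMPILE_TIME_ASSERT, the layout given to struct mystruct, and the helper exposed_member_offset() are all hypothetical and chosen only for illustration; this is not what dladm.c does. The trick is a typedef of an array whose size becomes -1 when the condition is false, which any C89 compiler must reject at compile time.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical macro: declares an array type of size 1 when cond is true
 * and size -1 (a hard compile error) when cond is false. No code is
 * generated and no external symbol is referenced.
 */
#define	COMPILE_TIME_ASSERT(cond, name) \
	typedef char compile_time_assert_##name[(cond) ? 1 : -1]

/* Illustrative layout only; the real struct is not shown in the thread. */
struct mystruct {
	char	my_header[16];
	int	my_exposed_member;
};

/* Fails to compile if the exposed member ever moves away from offset 16. */
COMPILE_TIME_ASSERT(offsetof(struct mystruct, my_exposed_member) == 16,
    mystruct_offset_is_16);

/* Hypothetical helper so the offset can also be inspected at run time. */
size_t
exposed_member_offset(void)
{
	return (offsetof(struct mystruct, my_exposed_member));
}
```

Compilers that support C11 can express the same check with _Static_assert, which also gives a readable diagnostic.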

Darren

/ws/onnv-tools/SUNWspro/sunstudio12.1/bin/cc -g -m32 
-Wc,-Qassembler-ounrefsym=0 -xspace -W0,-Lt -Xa -xildoff -errtags=yes 
-errwarn=%all -erroff=E_EMPTY_TRANSLATION_UNIT 
-erroff=E_STATEMENT_NOT_REACHED -xc99=%none -Wd,-xsafe=unboundsym 
-W2,-xwrap_int -W0,-xglobalstatic -features=conststrings 
-DTEXT_DOMAIN="SUNW_OST_OSCMD" -D_TS_ERRNO 
-I/net/mintslice.us.oracle.com/biscuit/on-hg_ob2/proto/root_sparc/usr/include 
-zguidance -zfatal-warnings -Bdirect 
-M/net/mintslice.us.oracle.com/biscuit/on-hg_ob2/usr/src/common/mapfiles/common/map.noexstk 
-M/net/mintslice.us.oracle.com/biscuit/on-hg_ob2/usr/src/common/mapfiles/common/map.pagealign 
-Wl,-I/lib/ld.so.1 -o dladm dladm.c 
-L/net/mintslice.us.oracle.com/biscuit/on-hg_ob2/proto/root_sparc_stub/lib 
-L/net/mintslice.us.oracle.com/biscuit/on-hg_ob2/proto/root_sparc_stub/usr/lib 
-lsocket -ldladm -ldlpi -lkstat -lbsm -linetutil -ldevinfo -zlazyload 
-lrstp -znolazyload

Undefined                       first referenced
 symbol                             in file
brlsum_t_is_too_large               dladm.o
brsum_t_is_too_large                dladm.o
ld: fatal: symbol referencing errors. No output written to dladm



[networking-discuss] Bizarre code in dladm

2011-05-20 Thread Darren Reed

Whilst trying to compile dladm with "-g", I received this error:
...
Undefined                       first referenced
 symbol                             in file
brlsum_t_is_too_large               dladm.o
brsum_t_is_too_large                dladm.o
ld: fatal: symbol referencing errors. No output written to dladm

Looking in dladm.c, I find this:

+#ifndef lint
+	/* This is a compile-time assertion; optimizer normally fixes this */
+	extern void brlsum_t_is_too_large(void);
+
+	if (sizeof (*brlsum) > sizeof (state->ls_prevstats))
+		brlsum_t_is_too_large();
+#endif

... given that this prohibits compiling with -g,
can someone explain why this belongs in dladm.c?

Darren




[networking-discuss] dladm, -t and -R

2010-12-22 Thread Darren Reed

Shouldn't the use of -t and -R be mutually exclusive?

Darren



Re: [networking-discuss] Raw Wifi Headers

2010-10-06 Thread Darren Reed

Saadia Fatima wrote:

Hi,
How can I pass raw Wifi headers to pcap? In the MAC layer, the Wifi headers are 
converted into Ethernet headers and passed on to the upper layers. How can I 
bypass that?
  


If you're using a Solaris that has the BPF driver and libpcap using that 
driver, then you should only see the native type headers (802.11) and 
not the 'cooked' headers (802.3) on any Wifi device.


What do you see with "tcpdump -L wifi0" and "tcpdump -c3 -envvv -i wifi0"?

Darren



Re: [networking-discuss] LRO implementation

2010-10-06 Thread Darren Reed

tom60 wrote:

Darren,

Thank you.

I also think of modifying the tcp checksum, but that means the Solaris 
OS would re-calculate each TCP packet's check sum.


Why would it do that?




I guess there is a way to announce that the checksum of the received 
TCP packets has been verified by HW so re-calculation by the OS is 
not necessary. But somehow setting HCK_FULLCKSUM_OK alone is not enough.


See Rao's email - the driver needs to be advertising the capability too.

Darren



Tom

--
From: "Darren Reed" 
Sent: Tuesday, October 05, 2010 7:25 PM
To: 
Subject: Re: [networking-discuss] LRO implementation


On  4/10/10 07:13 PM, Tom Chen wrote:

Hello,

I am implementing LRO on a 10G nic. I encountered a lot of "packet 
out of order" issues.

Here is how I process LRO rx interrupt which follows our Linux driver:
1. From rx descriptors, we know how many extra bytes appended behind 
the current tcp packet and the last appended packet's sequence number.
2. Read the tcp packet and extended payload
3. Modify the IP header's total_length field to reflect those extra 
payloads
4. Since IP header has been changed, calculate new IP checksum and 
update it in IP header
5. Update TCP header sequence field with new sequence number
6. Indicate to the OS that this LRO packet has good checksum by 
"mac_hcksum_set(mp, 0, 0, 0, 0, HCK_FULLCKSUM_OK);"
7. send to OS.

However, it looks like the packet is rejected by OS and netstat 
shows large number of "tcpInErrs". I am not sure how does the OS 
know that the packet is corrupted? Probably I changed tcp header's 
sequence number but did not re-calculate the TCP checksum? I did not 
do because it is too time-consuming. Does the OS redo whole TCP 
segments' checksum and verify even if the packet has 
HCK_FULLCKSUM_OK flag set?




If you changed the tcp header sequence number by x then you need to 
adjust the checksum by -x.


Darren



Re: [networking-discuss] LRO implementation

2010-10-05 Thread Darren Reed

On  4/10/10 07:13 PM, Tom Chen wrote:

Hello,

I am implementing LRO on a 10G nic. I encountered a lot of "packet out of 
order" issues.
Here is how I process LRO rx interrupt which follows our Linux driver:
1. From rx descriptors, we know how many extra bytes appended behind the 
current tcp packet and the last appended packet's sequence number.
2. Read the tcp packet and extended payload
3. Modify the IP header's total_length field to reflect those extra payloads
4. Since IP header has been changed, calculate new IP checksum and update it in 
IP header
5. Update TCP header sequence field with new sequence number
6. Indicate to the OS that this LRO packet has good checksum by "mac_hcksum_set(mp, 
0, 0, 0, 0, HCK_FULLCKSUM_OK);"
7. send to OS.

However, it looks like the packet is rejected by OS and netstat shows large number of 
"tcpInErrs". I am not sure how does the OS know that the packet is corrupted? 
Probably I changed tcp header's sequence number but did not re-calculate the TCP 
checksum? I did not do because it is too time-consuming. Does the OS redo whole TCP 
segments' checksum and verify even if the packet has HCK_FULLCKSUM_OK flag set?
   


If you changed the tcp header sequence number by x then you need to 
adjust the checksum by -x.
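
The adjustment can be done without re-summing the whole segment. What follows is a minimal sketch, assuming the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')) applied to each 16-bit half of the sequence number. The function names are hypothetical, all values are treated as logical 16-bit words in a consistent byte order, and cksum_full() exists only so the result can be cross-checked against a full recomputation; none of this is taken from the driver under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit accumulator down to a 16-bit one's-complement sum. */
uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/* RFC 1624 eqn. 3 for a single 16-bit word that changed value. */
uint16_t
cksum_update16(uint16_t cksum, uint16_t old_word, uint16_t new_word)
{
	uint32_t sum = (uint16_t)~cksum;

	sum += (uint16_t)~old_word;
	sum += new_word;
	return ((uint16_t)~fold16(sum));
}

/* Adjust a checksum for a 32-bit sequence number that was rewritten. */
uint16_t
cksum_update_seq(uint16_t cksum, uint32_t old_seq, uint32_t new_seq)
{
	cksum = cksum_update16(cksum, (uint16_t)(old_seq >> 16),
	    (uint16_t)(new_seq >> 16));
	cksum = cksum_update16(cksum, (uint16_t)(old_seq & 0xffff),
	    (uint16_t)(new_seq & 0xffff));
	return (cksum);
}

/* Reference only: full one's-complement checksum over n 16-bit words. */
uint16_t
cksum_full(const uint16_t *words, int n)
{
	uint32_t sum = 0;
	int i;

	for (i = 0; i < n; i++)
		sum += words[i];
	return ((uint16_t)~fold16(sum));
}
```

The cost is a handful of additions per packet regardless of payload size, which addresses the concern that recalculating the checksum is too time-consuming.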


Darren



Re: [networking-discuss] ipadm weirdness

2010-08-30 Thread Darren Reed

Girish Moodalbail wrote:

On 8/28/10 4:30 AM, Darren Reed wrote:

 From a fresh AI install of 146...

root@netvirt-d1:~# ipadm show-if
ipadm: Could not get interface(s): Operation failed
root@netvirt-d1:~# ifconfig -a
ifconfig: could not get addresses from kernel: Operation failed
root@netvirt-d1:~# ifconfig e1000g0
e1000g0: flags=1004843 mtu 1500 index 2
        inet 10.5.233.117 netmask ff00 broadcast 10.5.233.255
        ether 0:23:8b:77:26:d8
root@netvirt-d1:~# ipadm show-if e1000g0
ipadm: Could not get interface(s): Operation failed


The reason 'ifconfig e1000g0' works while 'ipadm show-if e1000g0' does 
not is that ipadm show-if has a 'PERSISTENT' column to display, and it 
needs to communicate with the 'ipmgmtd' daemon to retrieve the value 
for that column.


Also, 'ifconfig -a' calls ipadm_addr_info(), which in turn tries to 
communicate with the daemon.


It can be inferred from the truss output below that the daemon fails 
to open the ipadm data-store for some reason.


Does this system still show this symptom? If so, can you email me 
access details for the system? If not, is it reproducible?


I haven't touched it since I started having these problems and wrote the 
email.


As per the prompt, netvirt-d1 is the host, and ssh should let you log in 
to it.


Darren


root@netvirt-d1:~# dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g0     phys      1500   up       --         --
igb0        phys      1500   up       --         --
e1000g1     phys      1500   up       --         --
igb1        phys      1500   up       --         --
root@netvirt-d1:~# ps -ef | grep ipmgmt
netadm 49 1 0 19:08:39 ? 0:00 /lib/inet/ipmgmtd
root 1944 1659 0 19:24:38 console 0:00 grep ipmgmt
root@netvirt-d1:~# truss -p 49 &
root@netvirt-d1:~# ipadm show-if e1000g0
/4: door_return(0xFE96E92C, 4, 0x, 0xFE96EE00, 1007360) = 0
/4: open("///etc/ipadm/ipadm.conf", O_RDONLY) Err#13 EACCES [file_dac_search]
ipadm: Could not get interface(s): Operation failed
root@netvirt-d1:/etc/ipadm# ls -ald /etc
drwxr-xr-x 83 root sys 230 Aug 27 19:13 /etc
root@netvirt-d1:/etc/ipadm# ls -ald /etc/ipadm
drwxr-xr-x 2 netadm netadm 3 Aug 28 2010 /etc/ipadm
root@netvirt-d1:/etc/ipadm# ls -al /etc/ipadm/ipadm.conf
-rw-r--r-- 1 netadm netadm 999 Aug 28 2010 /etc/ipadm/ipadm.conf
root@netvirt-d1:/etc/ipadm# ls -ald /proc/`pgrep ipmgmt`
dr-x--x--x 5 netadm netadm 864 Aug 27 19:08 /proc/49

Thoughts, anyone?

Darren



[networking-discuss] ipadm weirdness

2010-08-28 Thread Darren Reed

From a fresh AI install of 146...

root@netvirt-d1:~# ipadm show-if
ipadm: Could not get interface(s): Operation failed
root@netvirt-d1:~# ifconfig -a
ifconfig: could not get addresses from kernel: Operation failed
root@netvirt-d1:~# ifconfig e1000g0
e1000g0: flags=1004843 mtu 1500 index 2
        inet 10.5.233.117 netmask ff00 broadcast 10.5.233.255
        ether 0:23:8b:77:26:d8
root@netvirt-d1:~# ipadm show-if e1000g0
ipadm: Could not get interface(s): Operation failed

root@netvirt-d1:~# truss ipadm show-if e1000g0
execve("/sbin/ipadm", 0x08047E60, 0x08047E70)  argc = 3
sysinfo(SI_MACHINE, "i86pc", 257)   = 6
mmap(0x, 32, PROT_READ|PROT_WRITE|PROT_EXEC, 
MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFB
mmap(0x, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 
0) = 0xFEFA
mmap(0x, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 
0) = 0xFEF9
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, 
MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEF8

memcntl(0xFEFB7000, 31888, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
memcntl(0x0805, 7996, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
resolvepath("/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
resolvepath("/sbin/ipadm", "/sbin/ipadm", 1023) = 11
sysconfig(_CONFIG_PAGESIZE) = 4096
stat64("/sbin/ipadm", 0x08047AA4)   = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
stat64("/lib/libc.so.1", 0x08047254)= 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY)= 3
mmapobj(3, MMOBJ_INTERPRET, 0xFEF80AC0, 0x080472C0, 0x) = 0
close(3)= 0
memcntl(0xFEE2, 185224, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, 
MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEE1
mmap(0x0001, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, 
MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE0

getcontext(0x08047904)
getrlimit(RLIMIT_STACK, 0x080478FC) = 0
getpid()= 1910 [1909]
lwp_private(0, 1, 0xFEE02A00)   = 0x01C3
setustack(0xFEE02A60)
sysi86(SI86FPSTART, 0xFEF77C4C, 0x133F, 0x1F80) = 0x0001
brk(0x08068F50) = 0
brk(0x0806AF50) = 0
stat64("/lib/libipadm.so.1", 0x08047378)= 0
resolvepath("/lib/libipadm.so.1", "/lib/libipadm.so.1", 1023) = 18
open("/lib/libipadm.so.1", O_RDONLY)= 3
mmapobj(3, MMOBJ_INTERPRET, 0xFEE107C0, 0x080473E4, 0x) = 0
close(3)= 0
memcntl(0xFEDD, 19428, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
stat64("/lib/libsocket.so.1", 0x08046FC8)   = 0
resolvepath("/lib/libsocket.so.1", "/lib/libsocket.so.1", 1023) = 19
open("/lib/libsocket.so.1", O_RDONLY)   = 3
mmapobj(3, MMOBJ_INTERPRET, 0xFEE10ED0, 0x08047034, 0x) = 0
close(3)= 0
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, 
MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEDC

memcntl(0xFE7C, 16524, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
stat64("/lib/libnsl.so.1", 0x08046C18)  = 0
resolvepath("/lib/libnsl.so.1", "/lib/libnsl.so.1", 1023) = 16
open("/lib/libnsl.so.1", O_RDONLY)  = 3
mmapobj(3, MMOBJ_INTERPRET, 0xFEDC0598, 0x08046C84, 0x) = 0
close(3)= 0
memcntl(0xFEAE, 78408, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
sigfillset(0xFEF77320)  = 0
so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, 0x, SOV_DEFAULT) = 3
so_socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP, 0x, SOV_DEFAULT) = 4
zone_lookup(0x) = 0
stat64("/lib/libdladm.so.1", 0x08047338)= 0
resolvepath("/lib/libdladm.so.1", "/lib/libdladm.so.1", 1023) = 18
open("/lib/libdladm.so.1", O_RDONLY)= 5
mmapobj(5, MMOBJ_INTERPRET, 0xFEDC0C10, 0x080473A4, 0x) = 0
close(5)= 0
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, 
MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFED6

memcntl(0xFED7, 47200, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
stat64("/lib/libcurses.so.1", 0x08046F88)   = 0
resolvepath("/lib/libcurses.so.1", "/lib/libcurses.so.1", 1023) = 19
open("/lib/libcurses.so.1", O_RDONLY)   = 5
mmapobj(5, MMOBJ_INTERPRET, 0xFED606F0, 0x08046FF4, 0x) = 0
close(5)= 0
memcntl(0xFED1, 54252, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
open("/dev/dld", O_RDWR)= 5
sysconfig(_CONFIG_PAGESIZE) = 4096
stat64("/lib/libinetutil.so.1", 0x08047318) = 0
resolvepath("/lib/libinetutil.so.1", "/lib/libinetutil.so.1", 1023) = 21
open("/lib/libinetutil.so.1", O_RDONLY) = 6
mmapobj(6, MMOBJ_INTERPRET, 0xFED60D38, 0x08047384, 0x) = 0
close(6)= 0
mmap(0x, 4

Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-28 Thread Darren Reed

On 28/07/10 05:09 AM, James Carlson wrote:

Darren Reed wrote:
   

On 27/07/10 01:47 PM, James Carlson wrote:
 

...
BSD has for decades provided sa_len, so you don't have to guess at how
long your sockaddrs really are.  Solaris does not, and there's weird
(and fragile) switch (sa.sa_family) in every application that has to
deal with routing socket messages.

   

Something that has been floated about, from time to time,
is making sa_family_t an 8bit type rather than the current
16bit type and using the other 8 bits for sa_len. If done
such that the ordering of the fields is different for x86 and
sparc then there should be few (if any) binary compatibility
issues.
 

That's (unfortunately) not really true.  If you receive a routing socket
message with flags in rtm_addrs, you need to look at the sockaddrs that
follow for each flag set.  One of the first things an application will
do with that message will be to read sa_family from those sockaddrs that
follow the rt_msghdr, which means dereferencing and switching on the
first *two* bytes of the structure.
   

Those existing applications are already compiled using the old sockaddr
definition, and they will always read the first two bytes as being the
family.  They can't be told not to read both bytes without a recompile.
  If you redefine one of those bytes to be the length value, then the
family number that the application sees will be bogus (e.g., AF 0x1002
rather than 2), and the application will fail.
   


I think this is just a case of being strict with what you send
(i.e. solaris only generates messages with sa_len==0) and
liberal with what you receive.
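
To make the two positions concrete, here is a minimal sketch of what an already-compiled application would see. It assumes, as the proposal suggests, that the field order differs per architecture so that the 8-bit family always lands in the low-order byte of the old 16-bit sa_family; the function name legacy_family_view is hypothetical. With a length of 16 and family 2 the old binary reads 0x1002, which is the bogus value James describes, while a length of 0 leaves the value it reads unchanged.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Value an old binary obtains when it reads a 16-bit sa_family from a
 * sockaddr whose first two bytes now hold an 8-bit length and an 8-bit
 * family, with the family in the low-order byte (assumed layout).
 */
uint16_t
legacy_family_view(uint8_t sa_len, uint8_t sa_family)
{
	return ((uint16_t)(((unsigned int)sa_len << 8) | sa_family));
}
```

Under that assumption, old binaries keep working only for as long as every message they receive carries a zero length byte.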

Darren



Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-27 Thread Darren Reed
On 27/07/10 01:47 PM, James Carlson wrote:
> ...
> BSD has for decades provided sa_len, so you don't have to guess at how
> long your sockaddrs really are.  Solaris does not, and there's weird
> (and fragile) switch (sa.sa_family) in every application that has to
> deal with routing socket messages.
>

Something that has been floated about, from time to time,
is making sa_family_t an 8bit type rather than the current
16bit type and using the other 8 bits for sa_len. If done
such that the ordering of the fields is different for x86 and
sparc then there should be few (if any) binary compatibility
issues.


Darren



Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-27 Thread Darren Reed

On 27/07/10 12:52 PM, Sebastien Roy wrote:

On 07/27/10 03:30 PM, Dan McDonald wrote:

On Tue, Jul 27, 2010 at 03:22:27PM -0400, James Carlson wrote:




The only thing that's really affected by this problem is the old BSD
routing socket interface.  In the case of adding a route, you can always
use the interface name instead of the ifIndex (sockaddr_dl supports
both), and that works fine.  In the case of receiving a "surprising"
truncated rtm_index value, you can do the smart thing and validate all
potentially matching interface information you've got, as dhcpagent
already does by doing a comparison under a 0x mask.


Someone mentioned IKE earlier in this thread.  IKE _only_ uses the
sockaddr_dl's index to create an appropriate listener for an IPv6
link-local address.


I thought it used sockaddr_in6 with sin6_scope_id for this (which 
funnily enough is a 32-bit field).


One weird thing about all of this is that sockaddr_dl is used at the 
link-layer, and our IP interface index concept doesn't exist at that 
layer; it's an IP interface concept.  It's odd that the range of IP 
interface index would be at all related to the sdl_index field in 
sockaddr_dl to begin with.  We do have the concept of datalink ID, but 
that is an implementation detail, and not meant to be used by 
applications.  Even if it were not an implementation detail, it has no 
relationship with the IP interface index, and it would sure be nice if 
there was a correlation between an index at the link-layer and an 
index at the IP layer (for the interfaces that have objects in both 
layers, such as most IP interfaces plumbed over actual datalinks).


Yes, I've often thought the same thing - that it would be beneficial if 
there was a 1 to 1 relationship between the link layer id and the one in IP.


But if you take Jim's concern seriously, that interface indices cannot 
be reused, then the mechanism behind selecting one for "ifconfig plumb" 
operations needs to change quite substantially because the new number 
needs to exist at both the datalink and IP layers.


Darren



Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-27 Thread Darren Reed

On 27/07/10 04:38 AM, James Carlson wrote:

Darren Reed wrote:

Finally, given that you feel so strongly about this and given that
this is OpenSolaris, feel free to file a bug in bugzilla along with
a new design and code that fixes this issue. Nothing speaks louder
in an open source community than contributions of new working code.


The only "new design" I would offer would be to back this errant fix out
of the gate.  The code was better the way it was before, and the change
does not actually fix any real problem that anyone encountered.  It's
gratuitous, and I'm surprised that a reviewer didn't object.

Given that this is OpenSolaris, I was hoping that we could have a
meaningful conversation about this change on a mailing list that's
intended for that purpose.  Obviously, that's not to be, because I'm
instead getting weird demands that I name other "multitasking operating
systems" that implement standardized protocols.


Let me summarise it like this: there was a feature present that
nobody used and if they did, they would need to reboot their
system after they did so in order to restore proper operation.
Is that the kind of feature we need/want in Solaris?

Darren



Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-26 Thread Darren Reed

On 26/07/10 02:10 PM, James Carlson wrote:

Darren Reed wrote:

On 26/07/10 12:57 PM, James Carlson wrote:

Sequentially, though, it's much easier to get there.

And things like tunnels can get you there much faster.  I'm *SURE* that
punchin has rolled over that 2^16 limit many times over.


I'd hazard a guess that punchin gets rebooted for upgrading
before that happens. To give that more context, Dan did not
seem to think that fixing the IKE daemon's naive use of the
index from the routing message was more important than a P4,
which is a likely indication that the index has not passed 65535.


I think that misses the point I was making.

Creating and destroying a tunnel requires allocating a new ifIndex
number every time (unless it's the "same" tunnel and you can save all
the interface counters in the process).  Even a short run of a punchin
server sees a lot of plumbing and unplumbing as users log in and out.
That's a real-world usage of the system that results in many
plumbs/unplumbs, and that thus consumes a lot of the ifIndex space, with
wrap-arounds being a likely result if the space is artificially constrained.


And to carry this forward further, if the number space did pass 65535 then
the punchin server would need to be rebooted because the system would then
start to misbehave in unexpected ways.



I wasn't talking about whether the IKE daemon could or should be fixed
to do something different than what it currently does.


It's my understanding that punchin uses IPSec and thus IKE.



Indeed.  That's exactly why this interface has been left alone for
decades, and instead the fixes were put into the applications using the
interfaces.


If I quickly look at current BSD source code, I find:
- index allocation using the smallest available index
- limited to USHORT_MAX
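
As an illustration of the two properties listed above, here is a minimal Python sketch of a "smallest available index" allocator capped at a 16-bit limit. It is not the BSD code; the class and method names are hypothetical and only the USHORT_MAX cap comes from the message.

```python
USHORT_MAX = 65535  # upper bound mentioned in the message


class IndexAllocator:
    """Hypothetical allocator: always hands out the smallest free index."""

    def __init__(self, limit=USHORT_MAX):
        self.limit = limit
        self.in_use = set()

    def allocate(self):
        # Scan upward from 1 and take the first index not currently in use.
        for candidate in range(1, self.limit + 1):
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        raise RuntimeError("no free interface index")

    def release(self, index):
        # A released index becomes eligible for reuse immediately.
        self.in_use.discard(index)
```

With this policy an index is recycled as soon as it is freed, so the number space never grows past the number of simultaneously plumbed interfaces.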


Right.  That's why the Solaris code (up until this change) truncated the
index value down to 16 bits for those interfaces.


And in truncating the index value down to 16 bits, the old behaviour
caused broken behaviour when the index number passed 65535.
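
To make the aliasing concrete, the following Python sketch shows what truncating a 32-bit index to 16 bits does once the counter passes 65535. It is only an illustration, not the Solaris code, and the function name is hypothetical.

```python
def truncate_to_16_bits(ifindex: int) -> int:
    """Keep only the low 16 bits, as a 16-bit index field would."""
    return ifindex & 0xFFFF


# Two distinct 32-bit indexes that become indistinguishable after truncation.
first = 3
later = 65536 + 3
assert first != later
assert truncate_to_16_bits(first) == truncate_to_16_bits(later) == 3
```

Any consumer that only sees the 16-bit value cannot tell these two interfaces apart.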



The SIOCGLIFINDEX interface, though, is Solaris-specific, so the full 32
bits is (or was once) available there.  Now it's broken.


I disagree.

The interface works perfectly fine, reporting an interface index
that maps to an existing IP interface. Furthermore, there's now
a guarantee that the number returned from SIOCGLIFINDEX will
match up with those received in routing protocol messages.



I don't believe that the fix applied for CR 6965774 is really the right
idea.  It perhaps makes some sense in a Windows-like environment where
you're encouraged (sometimes forcefully) to reboot every few hours or
so, but not so much for an OS that runs for a long period of time.
Tossing away the 32-bit counter that we put into IP decades ago seems
like a step backwards.


I think that the correct solution is to fix SNMP to not assume
or assert that an interface index should be unique for the
entire "uptime" of a host.


How would that be "fixed?"

It's a widely-implemented standard, and the idea that ifIndex numbers
are not recycled is baked into the way the standard operates.  It's not
something that vendors get a choice on.

I don't understand what "fix" means in that context.


Aside from Linux and derived products, which vendors of a
multitasking operating system operate in accordance with
the RFC in question?



Until one of those two happens, it is a mistake to not limit the
index allocation to [1,65535] because the only way of "fixing"
the problems that can occur once 65535 is passed is with a
reboot. To that end, any minor SNMP inconvenience seems trivial.
Or to put it differently, the potential for problems posed by
passing 65535 vastly outweigh the SNMP side of the equation.


I strongly disagree.  This change has broken an important feature in
Solaris -- the 32-bit ifIndex numbers -- and I'd very much like to see
it restored.  The fact that the old BSD interfaces had limitations
doesn't mean that the rest of the system needs to be hobbled as well.


How many times have you taken advantage of this and used this feature?

Faced with the choice of either potentially causing problems for
SNMP or forcing users to reboot their systems in order to avoid
further networking behaviour issues, which do you choose?

Maybe a way of putting this is that the new behaviour is a lesser
evil than the old behaviour and that nirvana is yet to be reached.

Finally, given that you feel so strongly about this and given that
this is OpenSolaris, feel free to file a bug in bugzilla along with
a new design and code that fixes this issue. Nothing speaks louder
in an open source community than contributions of new working code.

Darren



Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-26 Thread Darren Reed

On 26/07/10 12:57 PM, James Carlson wrote:

Darren Reed wrote:

On 26/07/10 08:18 AM, James Carlson wrote:

darren.r...@oracle.com wrote:

Author: Darren Reed
Repository: /hg/onnv/onnv-gate
Latest revision: 2794a0c9cce102961d08f075ee1f569073b99786
Total changesets: 1
Log message:
6965774 bound the interface index in IP to [1,65535]

Files:
 update: usr/src/uts/common/inet/ip/ip_if.c
 update: usr/src/uts/common/net/if.h


I suspect this change will have two serious effects:

- SNMP is now potentially broken.  The interface IDs cannot be reused
without an engine reboot indication, and restricting to a 16 bit range
makes reuse very much more likely than before.


Which interface IDs cannot be used?


The SNMP ifIndex value itself cannot be recycled without reporting a
reboot in the SNMP engine.  (It's not necessary that the *system* itself
be rebooted, but that at least the SNMP engine behave as though the
system were rebooted by bumping up the generation number.)

See RFC 2863 section 3.1.5.


Do you mean to say that SNMP cannot handle two different
network interfaces having the same interface ID during the
lifetime of the SNMP agent?


Correct.


But to me this sounds like a fundamental problem with SNMP and
that choosing 32 bits for a network interface identifier does not
represent a better design or solution, only pushing the problem
out to some point in the future and hoping that it never
appears on your watch.


Correct.  At 2^32 IDs, "not on my watch" means that a system churning
away at (say) 5 plumb/unplumb sequences per second would last a bit over
27 years before the feared roll-over occurs.  My plans at that time
include being either retired or dead.  Maybe both.

2^16 IDs passes by much more quickly.


- Systems with large numbers of virtual interfaces (tunnels and PPP
links, for example) may now run out of identifiers when this wasn't
previously possible.


This was considered.

65,535 is a LOT of network interfaces.
Further, the limit is only per instance of IP.


A "lot" depends on what you're doing.  At just one replumb a minute,
you'd wrap in 45 days.
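
The two roll-over estimates quoted above (a bit over 27 years for 2^32 IDs at 5 plumb/unplumb sequences per second, and about 45 days for the 16-bit range at one replumb a minute) can be checked with a short Python calculation. The helper name and the 365.25-day year are assumptions made for this sketch.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25  # assumed average year length


def days_until_wrap(id_space: int, allocations_per_second: float) -> float:
    """Days needed to consume id_space indexes at a constant allocation rate."""
    return id_space / allocations_per_second / SECONDS_PER_DAY


# 32-bit space at 5 plumb/unplumb sequences per second.
years_32 = days_until_wrap(2**32, 5) / DAYS_PER_YEAR

# 16-bit space [1, 65535] at one replumb per minute.
days_16 = days_until_wrap(65535, 1 / 60)

print(f"32-bit space: {years_32:.1f} years")
print(f"16-bit space: {days_16:.1f} days")
```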


A given installation of Solaris can only support 1024 zones and
it would require every zone to be using a shared network instance
and 64 network interfaces for it to be a problem in that direction.


Simultaneously, yes, that's one possibility.

Sequentially, though, it's much easier to get there.

And things like tunnels can get you there much faster.  I'm *SURE* that
punchin has rolled over that 2^16 limit many times over.


I'd hazard a guess that punchin gets rebooted for upgrading
before that happens. To give that more context, Dan did not
seem to think that fixing the IKE daemon's naive use of the
index from the routing message was more important than a P4,
which is a likely indication that the index has not passed 65535.




Can you imagine how long "ifconfig -a" would take to run on a
system with that many network interfaces?


The numbers are intentionally like PIDs; they're not reused until the
worst happens.  I believe that's the point you may be missing here.


One of the ideas that I floated around before addressing this
was to have DEBUG kernels start their IP interface index
allocation at 100,000, since we do something similar for PIDs
but nobody was interested in that.

But to take that further, except for PIDs under 100, the system
does reuse the PID number space, so why shouldn't it reuse
network interface IDs?



Was any consideration given to fixing the applications that rely on
these old BSD interfaces?  That's what we had been doing since at least
2.6 -- making the applications aware that ifIndex numbers could be
ambiguous, and using other means to verify the data.  There are numerous
examples of this work in the source base.  Let me know if you need
pointers; I can google it for you.  ;-}

In general, having aliases in the ifIndex space seen by the old BSD
interfaces is "not a problem" for most applications, because getting
routing socket messages just means that it's time to use the ioctls to
get the real current information.  The routing socket messages
themselves contain too little data to make a fully functioning program
on Solaris anyway.

Or perhaps "fixing" these old interfaces with some new mechanism?


In fixing the situation to support more than 16 bits for a network
interface identifier, changes to the routing message would be
required. That change would then break our compatibility with
every open source application that exists and uses those messages.
It would also break compatibility with older applications built for
Solaris. If a new interface was introduced and the old one left in
place for compatibility reasons, that doesn't stop the ones using
the old interface from causing strange behaviour when the index
exceeds 65535. The sockaddr_dl message is used by

Re: [networking-discuss] 6965774 bound the interface index in IP to [1, 65535]

2010-07-26 Thread Darren Reed

On 26/07/10 08:18 AM, James Carlson wrote:

darren.r...@oracle.com wrote:

Author: Darren Reed
Repository: /hg/onnv/onnv-gate
Latest revision: 2794a0c9cce102961d08f075ee1f569073b99786
Total changesets: 1
Log message:
6965774 bound the interface index in IP to [1,65535]

Files:
update: usr/src/uts/common/inet/ip/ip_if.c
update: usr/src/uts/common/net/if.h


I suspect this change will have two serious effects:

   - SNMP is now potentially broken.  The interface IDs cannot be reused
without an engine reboot indication, and restricting to a 16 bit range
makes reuse very much more likely than before.


Which interface IDs cannot be used?
Do you mean to say that SNMP cannot handle two different
network interfaces having the same interface ID during the
lifetime of the SNMP agent?

But to me this sounds like a fundamental problem with SNMP and
that choosing 32 bits for a network interface identifier does not
represent a better design or solution, only pushing the problem
out to some point in the future and hoping that it never
appears on your watch.



   - Systems with large numbers of virtual interfaces (tunnels and PPP
links, for example) may now run out of identifiers when this wasn't
previously possible.


This was considered.

65,535 is a LOT of network interfaces.
Further, the limit is only per instance of IP.

A given installation of Solaris can only support 1024 zones and
it would require every zone to be using a shared network instance
and 64 network interfaces for it to be a problem in that direction.

Can you imagine how long "ifconfig -a" would take to run on a
system with that many network interfaces?

I suspect that anyone who is pushing beyond using 1000 network
interfaces at any particular point in time is going to be facing
all sorts of performance issues.



Was any consideration given to fixing the applications that rely on
these old BSD interfaces?  That's what we had been doing since at least
2.6 -- making the applications aware that ifIndex numbers could be
ambiguous, and using other means to verify the data.  There are numerous
examples of this work in the source base.  Let me know if you need
pointers; I can google it for you.  ;-}

In general, having aliases in the ifIndex space seen by the old BSD
interfaces is "not a problem" for most applications, because getting
routing socket messages just means that it's time to use the ioctls to
get the real current information.  The routing socket messages
themselves contain too little data to make a fully functioning program
on Solaris anyway.

Or perhaps "fixing" these old interfaces with some new mechanism?


In fixing the situation to support more than 16 bits for a network
interface identifier, changes to the routing message would be
required. That change would then break our compatibility with
every open source application that exists and uses those messages.
It would also break compatibility with older applications built for
Solaris. If a new interface was introduced and the old one left in
place for compatibility reasons, that doesn't stop the ones using
the old interface from causing strange behaviour when the index
exceeds 65535. The sockaddr_dl message is used by applications to
both send and receive routing messages.

In the quick survey that I did of source code that used this interface,
only the IKE daemon appeared to be able to be changed easily to use
SIOCGLIFINDEX.

Darren



Re: [networking-discuss] rx packet duplicates

2010-07-09 Thread Darren Reed

I believe that this is CR#6873233.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6873233

The problem is that your stack is blown due to recursion.

Darren

On  7/07/10 09:50 PM, Jesse Off wrote:

Okay, this duplication of packets only happens when snoop is running (ethernet 
is in promiscuous mode).  It is real to the rest of the system though and not 
just a bug in snoop as opensolaris will also externally react to each packet 
twice.

Our opensolaris server crashed 5 times yesterday on its first day of
deployment.  Each time I was either running snoop or had just stopped running
snoop.  The crash dump stack traces seem to point to some stack overflow
regarding promiscuous mode + ipfilter.

It would seem running snoop or otherwise putting an interface into promiscuous
mode is a recipe for duplicate packets and crashes on opensolaris.

fr_check+0x23(ff0708db5e74, 14, 3, 0, ff002f66c070, ff002f66c1e0)
ipf_hook+0xd2(ff002f66c1f0, 0, 0, ff06f1464000)
ipf_hook4_in+0x27(ff06fd30d000, ff002f66c1f0, ff06f1464000)
hook_run+0x90(ff06fd3d3a00, ff06fd30d000, ff002f66c1f0)
ip_input+0x433(ff06fe3c0928, 0, ff07006de400, ff002f66c2c0)
dls_rx_promisc+0x179(ff0701679b48, 0, ff07006de400, 1)
mac_promisc_dispatch_one+0x5f(ff078b118d28, ff07100d1120, 1)
mac_promisc_dispatch+0x105(ff070151c098, ff07100d1120, ff06ff8c8310)
mac_tx_send+0x423(ff06ff8c8310, ff070246edc8, ff07100d1120, 0)
mac_tx_single_ring_mode+0xf4(ff0701788200, ff07100d1120, 0, 0, 0)
mac_tx+0x302(ff06ff8c8310, ff07100d1120, 0, 0, 0)
str_mdata_fastpath_put+0xa4(ff0701679b48, ff07100d1120, 0, 0)
ip_xmit_v4+0x3bc(ff07100d1120, ff07049b56c8, 0, 0, 0)
ip_rput_forward+0x5b8(ff07049b56c8, ff0708c63c74, ff07100d1120, 
ff06fe3c0928)
ip_rput_process_forward+0x30f(ff07017a52f0, ff07100d1120, 
ff07049b56c8, ff0708c63c74, ff06fe3c0928, 0)
ip_fast_forward+0x87d(ff07049b56c8, a600a8c0, ff06fe3c0928, 
ff07100d1120)
ip_input+0x600(ff06fe3c0928, 0, ff07100d1120, ff002f66cb30)
dls_rx_promisc+0x179(ff0701679b48, 0, ff07100d1120, 1)
mac_promisc_dispatch_one+0x5f(ff078b118d28, ff070cd41760, 1)
mac_promisc_dispatch+0x105(ff070151c098, ff070cd41760, ff06ff8c8310)
mac_tx_send+0x423(ff06ff8c8310, ff070246edc8, ff070cd41760, 0)
mac_tx_single_ring_mode+0xf4(ff0701788200, ff070cd41760, 0, 0, 0)
mac_tx+0x302(ff06ff8c8310, ff070cd41760, 0, 0, 0)
str_mdata_fastpath_put+0xa4(ff0701679b48, ff070cd41760, 0, 0)
ip_xmit_v4+0x3bc(ff070cd41760, ff07049b56c8, 0, 0, 0)
ip_rput_forward+0x5b8(ff07049b56c8, ff0708db5834, ff070cd41760, 
ff06fe3c0928)
ip_rput_process_forward+0x30f(ff07017a52f0, ff070cd41760, 
ff07049b56c8, ff0708db5834, ff06fe3c0928, 0)
ip_fast_forward+0x87d(ff07049b56c8, a600a8c0, ff06fe3c0928, 
ff070cd41760)
ip_input+0x600(ff06fe3c0928, 0, ff070cd41760, ff002f66d3a0)
dls_rx_promisc+0x179(ff0701679b48, 0, ff070cd41760, 1)
mac_promisc_dispatch_one+0x5f(ff078b118d28, ff070083e180, 1)
mac_promisc_dispatch+0x105(ff070151c098, ff070083e180, ff06ff8c8310)
mac_tx_send+0x423(ff06ff8c8310, ff070246edc8, ff070083e180, 0)
mac_tx_single_ring_mode+0xf4(ff0701788200, ff070083e180, 0, 0, 0)
mac_tx+0x302(ff06ff8c8310, ff070083e180, 0, 0, 0)
str_mdata_fastpath_put+0xa4(ff0701679b48, ff070083e180, 0, 0)
ip_xmit_v4+0x3bc(ff070083e180, ff07049b56c8, 0, 0, 0)
ip_rput_forward+0x5b8(ff07049b56c8, ff07092849f4, ff070083e180, 
ff06fe3c0928)
ip_rput_process_forward+0x30f(ff07017a52f0, ff070083e180, 
ff07049b56c8, ff07092849f4, ff06fe3c0928, 0)
ip_fast_forward+0x87d(ff07049b56c8, a600a8c0, ff06fe3c0928, 
ff070083e180)
ip_input+0x600(ff06fe3c0928, 0, ff070083e180, ff002f66dc10)
dls_rx_promisc+0x179(ff0701679b48, 0, ff070083e180, 1)
mac_promisc_dispatch_one+0x5f(ff078b118d28, ff0708ababe0, 1)
mac_promisc_dispatch+0x105(ff070151c098, ff0708ababe0, ff06ff8c8310)
mac_tx_send+0x423(ff06ff8c8310, ff070246edc8, ff0708ababe0, 0)
mac_tx_single_ring_mode+0xf4(ff0701788200, ff0708ababe0, 0, 0, 0)
mac_tx+0x302(ff06ff8c8310, ff0708ababe0, 0, 0, 0)
str_mdata_fastpath_put+0xa4(ff0701679b48, ff0708ababe0, 0, 0)
ip_xmit_v4+0x3bc(ff0708ababe0, ff07049b56c8, 0, 0, 0)
ip_rput_forward+0x5b8(ff07049b56c8, ff0708abc734, ff0708ababe0, 
ff06fe3c0928)
ip_rput_process_forward+0x30f(ff07017a52f0, ff0708ababe0, 
ff07049b56c8, ff0708abc734, ff06fe3c0928, 0)
ip_fast_forward+0x87d(ff07049b56c8, a600a8c0, ff06fe3c0928, 
ff0708ababe0)
ip_input+0x600(ff06fe3c0928, 0, ff0708ababe0, ff002f66e480)
dls_rx_promisc+0x179(ff0701679b48, 0, ff0708ababe0, 1)
mac_

Re: [networking-discuss] Intel 82599 woes

2010-06-11 Thread Darren Reed

Steven Stallion wrote:

(First off, I apologize for the cross post - I wanted to include some
folks familiar with intel hardware I know are sitting on the
networking-discuss list)

I'm noticing some odd behavior on snv_134 with an intel 82599 variant
(10gbe x2 SFP+) using a custom driver. Essentially, the device is never
clearing the reset bit once a soft reset is issued. Out of curiosity, I
attached the ixgbe driver and it exhibited the same exact results.

The same hardware works fine with Linux installed, so this appears to be
an issue with Solaris.
  


"custom driver" means you're developing your own driver?

Darren



Re: [networking-discuss] loop back address

2010-05-26 Thread Darren Reed

On 25/05/10 08:47 AM, James Carlson wrote:

Garrett D'Amore wrote:

On 5/25/2010 6:56 AM, PRDEEP KUMAR wrote:

Hi Experts,

  I am new to networking. Is there any specific reason why
only 127.0.0.1 is used as a loopback address? Why not other addresses?


Historical convention?  The standards say that 127/24 is reserved for
loopback addresses.


It's actually /8.  See RFC 1122 section 3.2.1.3(g), which describes an
entire Class A for loopback.
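
The difference between /8 and /24 is easy to check with Python's standard ipaddress module. This is only an illustrative sketch; the helper name is hypothetical.

```python
import ipaddress

# The whole 127/8 block is reserved for loopback.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def is_loopback(addr: str) -> bool:
    """True if addr falls anywhere inside 127.0.0.0/8."""
    return ipaddress.ip_address(addr) in LOOPBACK_NET


assert is_loopback("127.0.0.1")
assert is_loopback("127.1.0.1")
assert not is_loopback("128.0.0.1")

# A /24 would cover only 127.0.0.0 through 127.0.0.255.
narrow = ipaddress.ip_network("127.0.0.0/24")
assert ipaddress.ip_address("127.1.0.1") not in narrow
```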


  You could use any other 127 address, I suppose (and
I've done so), but I suspect 127.0.0.1 is so firmly entrenched in the
minds of admins and developers that you'd probably find things that
break if you tried a different address.


Long ago, at a company far away, we were able to use 127.x.x.x for
private communication among a collection of interconnected machines by
configuring Ethernet interfaces with addresses like 127.1.0.1/24 and by
reconfiguring lo0, which normally is configured as 127.0.0.1/8, with
127.0.0.1/24 instead.  It actually worked, and allowed customers to use
the rest of the interfaces for any legal IP address, though it wasn't
what you might call "standards conformant."

I don't know of anyone who has changed lo0's address away from 127.0.0.1
... nor any reason to do so.  I suspect that wasn't the original
poster's intent, though.


I tried that... about 6 years ago with BSD but it took a few
changes to disable various assumptions about where a packet
with a 127/8 address could come from. I think the most annoying
one was that code threw away LOOPBACK_NET packets if the
interface used was not also marked with the IFF_LOOPBACK flag.

It's the type of thing that you could easily imagine as being
possible with the etherstub device and vnic's attached to
it and supporting various zones with opensolaris. I suspect,
however, that it wouldn't be quite that easy...

Darren



Re: [networking-discuss] What SAP value to use for 802.2 and 802.3?

2010-05-12 Thread Darren Reed

On 12/05/10 04:57 PM, Garrett D'Amore wrote:

On 05/12/10 04:51 PM, Jason King wrote:
On Wed, May 12, 2010 at 6:46 PM, Darren Reed  
wrote:

Looking through the networking code, it is hard to tell if
Open/Solaris supports doing an explicit bind to a SAP that
refers directly to either 802.2 or 802.3. Am I missing
something?

If it is true that we don't and we were to support that,
what values would be appropriate?

The reasoning behind these questions is that I'm trying to
make sense of what it would mean to create a socket like
this on Open/Solaris:

fd = socket(PF_PACKET, SOCK_RAW, ETH_P_802_2);

or


fd = socket(PF_PACKET, SOCK_RAW, ETH_P_802_3);

Of course it may be that it just doesn't make sense on
Open/Solaris but I'm curious to know what others think.

I don't believe it does -- the best you can do is bind to SAP 0, then
push the pfil module configured to pass the desired packets through.


There used to be an llc2 module that would help with this... I'm not 
sure these days though.  You should check llc2 to see if it will work 
with modern drivers.


llc2 still exists (in the closed part of the source code tree.)

I suppose I could equate ETH_P_802_2 with 802.2 SNAP?

Darren



[networking-discuss] What SAP value to use for 802.2 and 802.3?

2010-05-12 Thread Darren Reed


Looking through the networking code, it is hard to tell if
Open/Solaris supports doing an explicit bind to a SAP that
refers directly to either 802.2 or 802.3. Am I missing
something?

If it is true that we don't and we were to support that,
what values would be appropriate?

The reasoning behind these questions is that I'm trying to
make sense of what it would mean to create a socket like
this on Open/Solaris:

fd = socket(PF_PACKET, SOCK_RAW, ETH_P_802_2);

or


fd = socket(PF_PACKET, SOCK_RAW, ETH_P_802_3);

Of course it may be that it just doesn't make sense on
Open/Solaris but I'm curious to know what others think.

Darren



Re: [networking-discuss] preferred way to configure ipfilter?

2010-03-09 Thread Darren Reed

Ivan Wang wrote:

Hi all,

Sorry if this is kind of an old question.

I've been using /etc/ipf/ipf.conf to configure ipfilter. Recently I
checked the ipfilter svc method out of curiosity and saw that there is a new
way, using the smf firewall_config property group, to automate ipf ruleset
generation.

Is this to be the preferred way to configure ipfilter? I can't find a way to
selectively admit ICMP packets with the new facility, though.
  


The SMF method is very coarse.

Darren



Re: [networking-discuss] Extremely slow and unreliable networking.

2010-03-09 Thread Darren Reed

On  9/03/10 11:40 AM, Garrett D'Amore wrote:

On 03/09/10 09:47, Jason King wrote:

I thought some virtualization software still used it, or at least
still defaulted to it and made it a bit annoying to change the
emulated nic (though I might be mistaken about that) -- if someone
could confirm that, that might be a decent reason to keep it.


I think you're right -- ISTR that Microsoft VirtualPC emulated a pcn.  
Ugh.


When I run NetBSD under another popular virtual workstation
environment, the driver that probes there is pcn and that's what
first came to my mind when I saw this email, not real hardware...

Darren



Re: [networking-discuss] BSD stuff that should be ported

2010-03-08 Thread Darren Reed

On  8/03/10 09:25 AM, Pedro F. Giffuni wrote:




- Original message -

Pedro F. Giffuni wrote:

I just wanted to mention that while I find OpenSolaris very
attractive for everyday use I am still surprised that many
technologies from the BSDs have not been considered by Solaris for
inclusion, especially when the tough work is done already and the
license was thought to permit wide adoption.

Here are the descriptions for a couple of such tools from FreeBSD:

IPFW (recently ported to linux too)

http://en.wikipedia.org/wiki/Ipfirewall

OpenSolaris uses (and comes with) IPFilter rather than IPFW.  If you
really do need IPFW, then it sounds like you'll want to launch a
project to port it over.

FreeBSD comes with IPFW and PF: both are very different but
maintainers develop preferences over time that are difficult to
change. Having it would open a space for Opensolaris on BSD
shops. Yes, I'd like it ported, and a Google Summer of Code
project did that for linux but I don't currently have the
resources (time in particular) to do it.
   



Last I checked, FreeBSD shipped with ipfw, pf and ipf.

At various points in time, people have decided to use one
or the other, based on what's currently in vogue.

There are differences between them and there are specific
features that are in ipfw that aren't in pf/ipf (and of course
vice versa.)

As a member of the FreeBSD community, I don't get the
impression that adding ipfw to OpenSolaris would help
OpenSolaris in a significant way.

The main feature of FreeBSD's networking that I'd like to
see considered would be dummynet, e.g:
http://info.iet.unipi.it/~luigi/dummynet/

Hopefully simnet can evolve to do some of that.

Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-08 Thread Darren Reed

Antoon Huiskens wrote:

On 02/ 7/10 12:17 AM, Darren Reed wrote:

Antoon Huiskens wrote:

On 02/ 5/10 06:26 PM, Darren Reed wrote:

Finally, to track the fixing of this, bug 6922926:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6922926

You may also be interested in keeping abreast of 6923836.

Unfortunately the data isn't "live" and up to date, so patience may 
be required.


Darren


great!

let me know. Won't mind at all to test a fix.


Which build are you using? uname -a output?

Darren

At the time of the initial mail this was 131. In between I upgraded to 132,
which has the same issue. I still have BEs from 127 up.


fwiw: my main gripe is that I can't see 802.11 packets with snoop. Was 
hoping that wireshark & friends might fix that...


With your capture file that you sent me, find a hex editor and change 
byte 20 from 0x01 to 0x69 and reopen it with wireshark et al.
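
For anyone who prefers a script to a hex editor, the same change can be sketched in Python. This assumes the capture is a classic pcap file written little-endian, where bytes 20-23 of the 24-byte global header hold the link type, 0x01 is Ethernet and 0x69 (105) is IEEE 802.11. The function and constant names are hypothetical.

```python
import struct

PCAP_MAGIC_LE = 0xA1B2C3D4   # classic pcap magic, read as little-endian
LINKTYPE_OFFSET = 20         # offset of the link-type field in the header
LINKTYPE_ETHERNET = 1        # 0x01
LINKTYPE_IEEE802_11 = 105    # 0x69


def patch_linktype(data: bytes, new_linktype: int = LINKTYPE_IEEE802_11) -> bytes:
    """Return a copy of the capture with the header link type replaced."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != PCAP_MAGIC_LE:
        raise ValueError("not a little-endian classic pcap file")
    patched = bytearray(data)
    struct.pack_into("<I", patched, LINKTYPE_OFFSET, new_linktype)
    return bytes(patched)


# Build a bare 24-byte global header to demonstrate the change:
# magic, version 2.4, thiszone, sigfigs, snaplen, link type.
header = struct.pack("<IHHiIII", PCAP_MAGIC_LE, 2, 4, 0, 0, 1536,
                     LINKTYPE_ETHERNET)
assert header[20] == 0x01
assert patch_linktype(header)[20] == 0x69
```

For a real capture, read the file in binary mode, pass its contents through patch_linktype, and write the result to a new file so the original stays untouched.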


Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-06 Thread Darren Reed

Antoon Huiskens wrote:

On 02/ 5/10 06:26 PM, Darren Reed wrote:

Finally, to track the fixing of this, bug 6922926:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6922926

You may also be interested in keeping abreast of 6923836.

Unfortunately the data isn't "live" and up to date, so patience may 
be required.


Darren


great!

let me know. Won't mind at all to test a fix.


Which build are you using? uname -a output?

Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-05 Thread Darren Reed

Finally, to track the fixing of this, bug 6922926:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6922926

You may also be interested in keeping abreast of 6923836.

Unfortunately the data isn't "live" and up to date, so patience may be 
required.


Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-05 Thread Darren Reed

Antoon,

The problem is that bpf is receiving the packets with the 802.11 header
present but it thinks that the link type is ethernet. The fix for this
is to have bpf use the native mac type for the link, rather than the
"cooked" one.

I also need to look at how to have bpf provide both 802.11 headers and
802.3 headers - or more correctly, provide both types of headers when
there is a difference between the native type and cooked type of a mac
plugin.

Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-04 Thread Darren Reed

Antoon Huiskens wrote:

On 02/ 3/10 04:54 PM, Darren Reed wrote:

Antoon Huiskens wrote:

On 02/ 3/10 04:05 PM, Darren Reed wrote:
In the meantime, if ethernet headers are unimportant, you should
be able to do this:


pfexec tcpdump -y IPNET -i iwk0

Darren


That works indeed.

Any thoughts on how we can diagnose the ethernet headers issue (I 
like to work my way up the stack..)


I think the first thing to do is confirm what is being passed into 
bpf is correct with the dtrace script below.


Run the script and then do "tcpdump -y EN10MB -i iwk0 -c 1".

Darren

#!/usr/sbin/dtrace -Fs

mblk_t *m;
size_t len;

fbt:bpf:bpf_mtap:entry {
   m = (mblk_t *)arg2;
   len = m->b_wptr - m->b_rptr;
   printf("%d:msg %p sz %d len %d", arg3, m, msgdsize(m), len);
   tracemem(m->b_rptr, 20);
}
fbt:bpf:bpf_mtap:return {}


named the script bpftrace.d:

$ pfexec dtrace -s bpftrace.d
...
$ pfexec tcpdump -y EN10MB -i iwk0 -c 1


Can you try this again, but this time make two changes:
change "tracemem(m->b_rptr, 20);" to "tracemem(m->b_rptr,40);" and
run tcpdump like this: "pfexec tcpdump -vXe -y EN10MB -i iwk0 -c 1".

Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-03 Thread Darren Reed

Antoon Huiskens wrote:

On 02/ 3/10 04:05 PM, Darren Reed wrote:
In the meantime, if ethernet headers are unimportant, you should be
able to do this:


pfexec tcpdump -y IPNET -i iwk0

Darren


That works indeed.

Any thoughts on how we can diagnose the ethernet headers issue (I like 
to work my way up the stack..)


I think the first thing to do is confirm what is being passed into bpf 
is correct with the dtrace script below.


Run the script and then do "tcpdump -y EN10MB -i iwk0 -c 1".

Darren

#!/usr/sbin/dtrace -Fs

mblk_t *m;
size_t len;

fbt:bpf:bpf_mtap:entry {
   m = (mblk_t *)arg2;
   len = m->b_wptr - m->b_rptr;
   printf("%d:msg %p sz %d len %d", arg3, m, msgdsize(m), len);
   tracemem(m->b_rptr, 20);
}
fbt:bpf:bpf_mtap:return {}



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-03 Thread Darren Reed
In the meantime, if ethernet headers are unimportant, you should be
able to do this:


pfexec tcpdump -y IPNET -i iwk0

Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-03 Thread Darren Reed

Antoon Huiskens wrote:

On 02/ 3/10 11:23 AM, Darren Reed wrote:
Rather than the usual 14 bytes, you've got 18 bytes prepended to your 
IP packets.


The confusing part is the 2 bytes in front of the MAC addresses and 
the 2 bytes between the MAC addresses and the ethernet type.


given that my network works (I observe this whilst typing this email 
on the laptop that has the defect) and also snoop does the right 
thing, I'd say this looks like a defect in libpcap or?


libpcap and tcpdump, etc, do not modify the data they receive, they just 
print it out.


Similarly, bpf should only be recording what it receives from the driver/IP.


These all appear to be broadcast packets of one type or another.

Here's a few more (http requests to blogs.sun.com) I have a hard time 
creating a filter that makes sense though:-)


Of course, for some reason there are 4 bytes in there that should not be
there.


Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-03 Thread Darren Reed
Rather than the usual 14 bytes, you've got 18 bytes prepended to your IP 
packets.


The confusing part is the 2 bytes in front of the MAC addresses and the 
2 bytes between the MAC addresses and the ethernet type.


These all appear to be broadcast packets of one type or another.

Antoon Huiskens wrote:
$  pfexec tcpdump -v -X -s 1536 -c 3 -i iwk0 
tcpdump: listening on iwk0, link-type EN10MB (Ethernet), capture size 1536 bytes
13:00:08.382924 ff:ff:ff:ff:00:0b (oui Unknown) > 08:22:00:00:ff:ff (oui Unknown), ethertype Unknown (0x0e9e), length 110: 
	0x:  4340 001d e019 ead1 7054  0300   c...@..pt..

0x0010:  0800 4500 004e 7c19  8011 c689 0a00  ..E..N|.
0x0020:  e3fc 0a00  0089 0089 003a f94c 87a1  ...:.L..
0x0030:  0110 0001    2046 4446 4645  ...FDFFE
0x0040:  4f43 4e45 4244 4644 4a44 4145  4845  OCNEBDFDJDAEDDHE
0x0050:  4345 4344 4945 4745 4341 4100 0020 0001  CECDIEGECAA.
13:00:08.383053 ff:ff:ff:ff:00:0b (oui Unknown) > 08:02:00:00:ff:ff (oui Unknown), ethertype Unknown (0x0e9e), length 110: 
	0x:  4340 001f 3bc0 37bd 8054  0300   c...@..;.7..T..

0x0010:  0800 4500 004e 0071  8011 3266 0a00  ..E..N.q2f..
0x0020:  f3c8 0a00  0089 0089 003a 2f19 8013  ...:/...
0x0030:  0110 0001    2045 4a46 4445  ...EJFDE
0x0040:  4246 4545 4246 4143 4143 4143 4143 4143  BFEEBFACACACACAC
0x0050:  4143 4143 4143 4143 4141 4100 0020 0001  ACACACACAAA.
13:00:08.485307 00:00:00:02:00:0b (oui Ethernet) > 08:02:00:00:33:33 (oui Unknown), ethertype Unknown (0x0e9e), length 88: 
	0x:  4340 001f 5bbe 892b a054  0300   c...@..[..+.t..

0x0010:  86dd 6000  0010 3aff fe80    ..`.:...
0x0020:   021f 5bff febe 892b ff02    [+..
0x0030:      0002 8500 b11c   
0x0040:   0101 001f 5bbe 892b ..[..+
  




Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-02 Thread Darren Reed

Antoon Huiskens wrote:

On 02/ 1/10 09:57 PM, Darren Reed wrote:


If you do "pfexec tcpdump -L -i iwk0", what do you see?

Darren



$ pfexec tcpdump -L -i iwk0
Data link types (use option -y to set):
  DOCSIS (DOCSIS) (printing not supported)
  IPNET (Solaris IPNET)
  EN10MB (Ethernet)


Ok... and if you do this:

pfexec tcpdump -v -X -s 1536 -c 3 -i iwk0

What do you see?

Darren



Re: [networking-discuss] libpcap/tshark/tcpdump produce garbled output on iwk0

2010-02-01 Thread Darren Reed


If you do "pfexec tcpdump -L -i iwk0", what do you see?

Darren




Re: [networking-discuss] ipnat brokes HTTP or FTP-proxy connection

2010-01-07 Thread Darren Reed

On 8/01/2010 7:46 AM, Sergey Klyaus wrote:

Hello.

I have two workstations. One acts as the NAT box, ftp/nfs/smb server and 
torrent client, and has two Intel 82567 interfaces; the second (Ubuntu 
9.10 installed) is connected directly to its e1000g0 interface:
e1000g0: flags=1100843  mtu 1500 index 2
        inet 192.168.24.1 netmask ffffff00 broadcast 192.168.24.255
e1000g2: flags=1100843  mtu 1500 index 4
        inet 92.53.64.137 netmask ffffffc0 broadcast 92.53.64.191

There are no firewall rules, but I have a simple NAT configuration:
map e1000g2 192.168.24.1/24 ->  92.53.64.137/32 proxy port ftp ftp/tcp
map e1000g2 192.168.23.1/24 ->  92.53.64.137/32 proxy port ftp ftp/tcp
map e1000g2 192.168.24.1/24 ->  92.53.64.137/32 portmap tcp/udp 4:6
map e1000g2 192.168.24.1/24 ->  92.53.64.137/32
map e1000g2 192.168.23.1/24 ->  92.53.64.137/32 portmap tcp/udp 6:65000
map e1000g2 192.168.23.1/24 ->  92.53.64.137/32
rdr e1000g2 92.53.64.137 port 6346 ->  192.168.24.5 port 6346 tcp
rdr e1000g2 92.53.64.137 port 6346 ->  192.168.24.5 port 6346 udp

So, the problem is that HTTP or FTP connections are suddenly broken for no apparent reason. 
ipmon reports that the connection has not expired, but the Ubuntu machine behind the NAT 
reports a "Connection reset by peer" error.

Here is snoop capture of last 4 packets: http://paste.org/pastebin/view/14237
   


Unfortunately providing a capture of only the last 4 packets does not 
give me enough information to determine anything about what might be 
happening to cause the connections to close.


Darren



Re: [networking-discuss] Socket filter design

2009-12-15 Thread Darren Reed

On 15/12/09 10:50 AM, Anders Persson wrote:

On Mon, Dec 14, 2009 at 09:24:54PM -0800, Darren Reed wrote:
  

On 14/12/09 07:28 PM, Anders Persson wrote:


Hi Darren,

My responses are inline.

On Wed, Dec 09, 2009 at 08:34:32AM +1100, Darren Reed wrote:
  
  

...


5.2
Is there any limit to how many connections can be in the ESTABLISHED 
TCP state but not returned via accept because of filters?

How can I tell which sockets are in this state? netstat? something else?



For a given listener, the number of deferred connections is only limited by the
backlog size. There is no global limit on the number of deferred connections.

I'll add a per-filter kstat that shows how many connections are currently
deferred.
  
  

I suppose this is fine.

Some additional kstats that would be useful:
- number of times a filter is attached to a socket (maybe passive/active
 counters are different)



Sure.

  

- number of times a filter has been detached from a socket



There is already a stat that shows how many sockets a filter is
attached to, so that, together with the two counters above, should be sufficient.
  


Yup.



- number of sockets that are waiting for the filter to say it is ok
 to become established



Right, that's the "deferred count", and I'll add that.

  

- number of sockets that are closed after the filter has been attached
 but before it was able to finish its work (think connection dropped
 because only "GET " was received on a http filter socket.)



A filter can maintain its own kstats as well, and I think that this
type of counter would fit better there.
  


Do you have any feel for if there is a number of statistics
that all socket filters should keep track of?
And if so, should the plumbing that supports them support
a base set of stats that a socket filter can extend?

The model I'm thinking of here is ethernet drivers that all
now have a large pool of common statistics that are provided
through the gld/mii interfaces.



From the first two, the total number of active sockets with attached
filters can be derived

Some administrative problems I can foresee:
- my web server is using socket filters, I can connect to the server
 but I can't see any connections being accepted by the server.
 Where is my TCP connection?
 How do I know how many other TCP connections are
 in a similar state?
 How can I tell what state the socket is in within the filter?

Normally the way we know what state a TCP connection is in
is to use "netstat" and look for ESTABLISHED, SYN_RECEIVED,
FIN_WAIT_2, etc. This introduces a whole new set of sub-states
that we currently have no insight into. I don't know if this project
should be required to provide something that allows a filter to set
a sub-state or even how that would be observed.



The problems you mention above are not new, they already exist with
STREAMS and kssl. But I agree, having netstat be able to show
which connections have been deferred would be a useful debugging tool
and it's something I would consider addressing as a follow-up RFE.
  


Ok.

Darren


Re: [networking-discuss] Socket filter design

2009-12-14 Thread Darren Reed

On 14/12/09 07:28 PM, Anders Persson wrote:

Hi Darren,

My responses are inline.

On Wed, Dec 09, 2009 at 08:34:32AM +1100, Darren Reed wrote:
  

...

5.2
Is there any limit to how many connections can be in the ESTABLISHED TCP 
state but not returned via accept because of filters?

How can I tell which sockets are in this state? netstat? something else?



For a given listener, the number of deferred connections is only limited by the
backlog size. There is no global limit on the number of deferred connections.

I'll add a per-filter kstat that shows how many connections are currently
deferred.
  


I suppose this is fine.

Some additional kstats that would be useful:
- number of times a filter is attached to a socket (maybe passive/active
 counters are different)
- number of times a filter has been detached from a socket
- number of sockets that are waiting for the filter to say it is ok
 to become established
- number of sockets that are closed after the filter has been attached
 but before it was able to finish its work (think connection dropped
 because only "GET " was received on a http filter socket.)

From the first two, the total number of active sockets with attached
filters can be derived
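
As a small illustration of that derivation, here is a sketch in C. The
structure and function names are hypothetical, made up for this example;
they are not kstat names from the design document. It only assumes two
monotonically increasing counters, one for attaches and one for detaches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-filter counters; not taken from the design document. */
struct filter_counters {
    uint64_t attaches;  /* times the filter was attached to a socket */
    uint64_t detaches;  /* times the filter was detached from a socket */
};

/* Sockets that currently have the filter attached. */
static uint64_t
filter_active_sockets(const struct filter_counters *c)
{
    /* A filter cannot be detached more often than it was attached. */
    assert(c->detaches <= c->attaches);
    return (c->attaches - c->detaches);
}
```

With 10 attaches and 3 detaches recorded, 7 sockets still have the filter
attached.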

Some administrative problems I can foresee:
- my web server is using socket filters, I can connect to the server
 but I can't see any connections being accepted by the server.
 Where is my TCP connection?
 How do I know how many other TCP connections are
 in a similar state?
 How can I tell what state the socket is in within the filter?

Normally the way we know what state a TCP connection is in
is to use "netstat" and look for ESTABLISHED, SYN_RECEIVED,
FIN_WAIT_2, etc. This introduces a whole new set of sub-states
that we currently have no insight into. I don't know if this project
should be required to provide something that allows a filter to set
a sub-state or even how that would be observed.



8.2.1
Looking at sofop_attach, it would seem that a number of its arguments 
are only used for passive opens. For example, there will be no useful 
address information available for an active open. Isn't it therefore 
better to have a different attach for both socket() and accept()? This 
would better mirror the design elsewhere that has a callback per socket 
function (bind, etc.)



I considered having separate callbacks, but I kept coming back to it being
a single event, and hence the one callback. Also, a split would not simplify
the callback for passive attach, so I would like to keep it as it is for
now.
  


The difference, as I see it, is that a socket filter needs to infer that
a connection is being passively opened, rather than actively opened,
via the parameters to the callback.

With different callbacks, there's no need to infer which type of open
it is and nor is there any danger to such code from changes in how
the parameters are used.
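
A rough sketch of the difference, in C. All of the type, field and
function names below are hypothetical and invented for this illustration;
this is not the sofop_attach signature from the design document. It
assumes that, with a single callback, the only hint about the kind of
open is whether a peer address was supplied.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a socket address; hypothetical. */
struct peer_addr {
    int family;
};

enum open_kind { OPEN_ACTIVE, OPEN_PASSIVE };

/*
 * Single callback: the filter has to infer the kind of open from its
 * arguments, here from whether a peer address was passed in.
 */
static enum open_kind
single_attach(const struct peer_addr *peer)
{
    return (peer != NULL ? OPEN_PASSIVE : OPEN_ACTIVE);
}

/* Separate callbacks: the kind of open is known from which one is called. */
struct filter_ops {
    enum open_kind (*attach_active)(void);
    enum open_kind (*attach_passive)(const struct peer_addr *peer);
};

static enum open_kind
my_attach_active(void)
{
    return (OPEN_ACTIVE);
}

static enum open_kind
my_attach_passive(const struct peer_addr *peer)
{
    assert(peer != NULL);   /* a passive open always has a peer */
    return (OPEN_PASSIVE);
}

static const struct filter_ops my_filter_ops = {
    my_attach_active,
    my_attach_passive
};
```

In the second form, a later change to what the parameters mean for active
opens cannot silently change what the passive path sees.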

Darren


[networking-discuss] Extension of GLD for transmitting packet when promiscuous

2009-12-13 Thread Darren Reed

At present the delivery of packets, on the transmit side, to promiscuous
receivers is handled by GLD before the packet is handed to the driver.

In some cases the driver makes few, if any, changes to the packet (such
as ethernet) but in others (such as IP tunneling), substantial changes are
made.

The problem is most easily seen when using a tool such as snoop on a
network interface that is doing hardware checksum offload: checksum
validation of the IP or TCP headers by programs such as snoop fails.
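
To illustrate why a capture tool reports a failure in this case, here is
a minimal sketch in C of the standard Internet checksum (RFC 1071) check
applied to an IPv4 header. The function names are made up for this
example and are not taken from snoop; the point is only that a header
whose checksum field has not been filled in yet, because the hardware
will do that later, does not verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of big-endian 16-bit words, folded to 16 bits. */
static uint16_t
inet_sum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(len % 2 == 0);   /* IPv4 headers are a multiple of 4 bytes */
    for (i = 0; i < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return ((uint16_t)sum);
}

/* Returns 1 if the header, checksum field included, verifies. */
static int
ipv4_hdr_cksum_ok(const uint8_t *hdr, size_t hlen)
{
    return (inet_sum16(hdr, hlen) == 0xffff);
}

/* What the driver or hardware does later: fill in bytes 10 and 11. */
static void
ipv4_hdr_cksum_fill(uint8_t *hdr, size_t hlen)
{
    uint16_t ck;

    hdr[10] = hdr[11] = 0;
    ck = (uint16_t)~inet_sum16(hdr, hlen);
    hdr[10] = (uint8_t)(ck >> 8);
    hdr[11] = (uint8_t)(ck & 0xff);
}
```

A promiscuous receiver that is handed the packet before
ipv4_hdr_cksum_fill() has run (or before the hardware equivalent) sees
ipv4_hdr_cksum_ok() fail even though the packet on the wire is fine.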

For network devices where the updates to header fields in the packet
are made by hardware, there is little that we can do to improve the
situation.

But this does not apply to network devices such as the IP tunnel device.

One option to handle this would be to have a function that could be
called by mac to "fill in" the header details before the packet is delivered
to the promiscuous callback.

Another option is to not deliver the packet to the promiscuous callback
from GLD, but from the driver itself, after the driver has finished filling
in the header - a delayed promiscuous callback.

My preference is for the latter approach as the changes do not appear to
be as fraught as the former.

Thoughts?

Darren



Re: [networking-discuss] Socket filter design

2009-12-08 Thread Darren Reed

Anders Persson wrote:

Hi Folks,

A design document for socket filters is now available at:
  http://cr.opensolaris.org/~anders/sockfilter/sockfilter-design.pdf
  


2.1/2.2
How do I know which socket filters are available to enable with SMF?

3.1
If filter "A" fails to attach and filter "B" is defined as being 
"before" or "after" "A", will "B" be attached?
Similarly, will the configuration of "B" fail if "A" has not been 
configured yet?


3.2
Shouldn't FILF_LIST be used with getsockopt, not setsockopt?

5.2
Is there any limit to how many connections can be in the ESTABLISHED TCP 
state but not returned via accept because of filters?

How can I tell which sockets are in this state? netstat? something else?

6.1
If no filter is configured, will pfiles display anything to indicate this?
Or will the line simply not be output?

8.2.1
Looking at sofop_attach, it would seem that a number of its arguments 
are only used for passive opens. For example, there will be no useful 
address information available for an active open. Isn't it therefore 
better to have a different attach for both socket() and accept()? This 
would better mirror the design elsewhere that has a callback per socket 
function (bind, etc.)


8.4.1
Can the bypass flag be set on a programmatic filter?
If so, does an application need to know this state when calling FILF_DETACH?



Re: [networking-discuss] Err#22 EINVAL when call ioctl(..,SIOCGIFHWADDR, ...);

2009-12-04 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

Sebastien Roy wrote:


That should probably be fixed.  Since the introduction of SIOCGIFHWADDR
in build 128, all of those #ifdefs protecting its use on PF_INET sockets
in 3rd party software are now being compiled in and failing on Solaris.
This is the 2nd instance I've seen of this.
  
  

Given that SIOCGIFHWADDR has already been ARC'd, do you see a need to
ARC it for use with PF_INET sockets?



That sounds like an under-the-radar bug fix to me.
  


There's probably a bit of follow-on work to support it in linux branded 
zones, too.


Darren



Re: [networking-discuss] Err#22 EINVAL when call ioctl(..,SIOCGIFHWADDR, ...);

2009-12-04 Thread Darren Reed

Sebastien Roy wrote:

On Fri, 2009-12-04 at 21:03 +1100, Darren Reed wrote:
  

Yevgeniy Litvinenko wrote:


In OpenSolaris include files I found SIOCGIFHWADDR:
/usr/include/sys/sockio.h:#define   SIOCGIFHWADDR   _IOWR('i', 185, int)
/* PF_PACKET */

Is this call (ioctl(SIOCGIFHWADDR)) valid? Or is this a bug?
  
  
At present, SIOCGIFHWADDR is only supported on PF_PACKET sockets, not 
PF_INET sockets.



That should probably be fixed.  Since the introduction of SIOCGIFHWADDR
in build 128, all of those #ifdefs protecting its use on PF_INET sockets
in 3rd party software are now being compiled in and failing on Solaris.
This is the 2nd instance I've seen of this.
  


Given that SIOCGIFHWADDR has already been ARC'd, do you see a need to 
ARC it for use with PF_INET sockets?


Darren



Re: [networking-discuss] Err#22 EINVAL when call ioctl(..,SIOCGIFHWADDR, ...);

2009-12-04 Thread Darren Reed

Yevgeniy Litvinenko wrote:

Hi.

I've built Nmap 5.10BETA1 from source.
When I run it I get an error:
Failed to determine the MAC address of bge0!: Invalid argument (22)

truss nmap gives:
...
11146:   1.6062 so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, 0x, 
SOV_XPG4_2) = 4
11146:   1.6063 ioctl(4, SIOCGIFCONF, 0x08043358)   = 0
11146:   1.6063 ioctl(4, SIOCGIFNETMASK, 0x08043308)= 0
11146:   1.6064 ioctl(4, SIOCGIFFLAGS, 0x08043308)  = 0
11146:   1.6064 ioctl(4, SIOCGIFNETMASK, 0x08043308)= 0
11146:   1.6064 ioctl(4, SIOCGIFFLAGS, 0x08043308)  = 0
11146:   1.6065 ioctl(4, _IOWRN('i', 185, 4), 0x08043308)   Err#22 EINVAL
11146:   1.6065 fstat64(2, 0x08042300)  = 0
...

In the source files of nmap I found this ioctl:
...
#ifdef SIOCGIFHWADDR
  memcpy(&tmpifr.ifr_addr, sin, MIN(sizeof(tmpifr.ifr_addr), sizeof(*sin)));
  rc = ioctl(sd, SIOCGIFHWADDR, &tmpifr);
  if (rc < 0 && errno != EADDRNOTAVAIL)
    pfatal("Failed to determine the MAC address of %s!", tmpifr.ifr_name);
  else if (rc >= 0)
    memcpy(devs[count].mac, &tmpifr.ifr_addr.sa_data, 6);
#else
...

In OpenSolaris include files I found SIOCGIFHWADDR:
/usr/include/sys/sockio.h:#define   SIOCGIFHWADDR   _IOWR('i', 185, int)
/* PF_PACKET */

Is this call (ioctl(SIOCGIFHWADDR)) valid? Or is this a bug?
  


At present, SIOCGIFHWADDR is only supported on PF_PACKET sockets, not 
PF_INET sockets.
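
For third-party code such as the nmap fragment quoted above, one way to
cope is to treat EINVAL from this ioctl as "not supported on this socket
type" rather than as a fatal error. The helper below is a hypothetical
sketch in C, not nmap's code and not a Solaris interface; the enum and
function names are invented for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of the result of the SIOCGIFHWADDR ioctl. */
enum hwaddr_result {
    HWADDR_OK,          /* ioctl succeeded, the MAC address is valid */
    HWADDR_UNAVAILABLE, /* no MAC address, carry on without one */
    HWADDR_FATAL        /* unexpected failure */
};

static enum hwaddr_result
classify_hwaddr_ioctl(int rc, int err)
{
    assert(rc >= 0 || err != 0);    /* a failed ioctl must have set errno */
    if (rc >= 0)
        return (HWADDR_OK);
    if (err == EADDRNOTAVAIL)       /* already tolerated by the quoted code */
        return (HWADDR_UNAVAILABLE);
    if (err == EINVAL)              /* e.g. the ioctl is not supported here */
        return (HWADDR_UNAVAILABLE);
    return (HWADDR_FATAL);
}
```

A caller would pass the ioctl return value and errno, and only call its
fatal error routine for HWADDR_FATAL.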


Darren



Re: [networking-discuss] software to block MSN

2009-12-03 Thread Darren Reed

qing wrote:

Do you know any good software to block MSN or other chat tools? Many employees 
waste a lot of time chatting with other people during working hours.
  


This is an arms race - if you block it, people will find something else 
to use for chat.


Your company needs a policy that says people are forbidden from using 
chat at work.




Re: [networking-discuss] Mobile networking

2009-12-03 Thread Darren Reed

James Carlson wrote:

...
 but a question: I haven't been paying much attention to how
things have changed over the past few years, so could you summarize the
story for Shim6 versus PI addresses?

It's certainly good that there's progress, but if it turns out to be
another Mobile IP ...


It does feel like it, doesn't it?

I wonder if the world is low enough on real use of mobile networking (by 
more than a handful) that this part of networking is really still very 
much experimental...


Darren



Re: [networking-discuss] New project proposal: Shim6: Level 3 Multihoming Shim Protocol for IPv6

2009-12-03 Thread Darren Reed

+1

Erik Nordmark wrote:

-- OPENSOLARIS PROJECT PROPOSAL --


Project Name:
Shim6: Level 3 Multihoming Shim Protocol for IPv6

Project Synopsis:
Implement RFC 5533, 5534, and 5535 in OpenSolaris.

Project Purpose:
The Shim6 protocol has been developed in the IETF and recently
appeared as a set of three RFCs.
Shim6 provides the ability for hosts to recover from IP
communication failures when the hosts have two or more IP
addresses, by the shim switching from using one pair of IP
addresses to another pair. This is done transparently to TCP,
UDP and other transport protocols.

The switch between IP addresses is a potential source of
security problems. Shim6 avoids those by using Hash Based
Addresses (HBA) or Cryptographically Generated Addresses (CGA),
which are light-weight methods that do not require any 3rd party
infrastructure (such as PKI) yet provide protection against off
path attackers trying to use shim6 to redirect packets.

The discussion will take place on the existing
networking-discuss@opensolaris.org list.

Proposed Community Sponsors:
 Networking

Participants:
 Project lead:
 Erik Nordmark



Re: [networking-discuss] getting the list of TCP retransmissions victims

2009-11-29 Thread Darren Reed

Vladimir Kotal wrote:


In order to test my changes in IPsec SA expiration handling I'd like 
to get a list of processes/connections which experience TCP 
retransmissions (to see if they belong in the group of processes which 
transfer the data using the SAs).


Thanks to the MIB probes in ip module this is easy to accomplish for 
the latter case but I am not able to get a process name out of 
tcp_t/conn_t/sonode et al. The pid of the process is stashed in conn_t 
so this is fine but I'd like to get the execname as well. Sadly, the 
pid2proc mdb dcmd simulated in dtrace cannot be used here, I think.


Any ideas ?


You could use dtrace to trace fork/exec calls and create an array 
indexed on pid that stashes the execname...


But that presumes that the process is created after you start the dtrace 
script.


Darren



Re: [networking-discuss] Contributor/core-contributor grants for Team IPsec

2009-11-25 Thread Darren Reed

Dan McDonald wrote:

On 11/20/09 15:45, darren.r...@sun.com wrote:
  

On (11/20/09 17:02), Sebastien Roy wrote:
  

On Fri, 2009-11-20 at 12:00 -0800, Darren Reed wrote:


In light of Dan McDonald's previous and ongoing work in IPsec, I'd like
to nominate him for core-contributor status in the networking
community.
  

+1
  

And that makes 3 +1's (Sebastien Roy, James Carlson, Sowmini Varadhan.)

Congratulations, Dan, welcome to the fold.




Thanks (assuming I'm not jumping the gun over anyone's objections...).

I took a look at the grants of other Team IPsec members:

User        Networking grant

markfen     Contributor (EXPIRED 2009-02-24)
pwernau     Contributor (EXPIRED 2009-02-24)
sommerfeld  Contributor (EXPIRED 2009-02-24)
vkotal      None


Assuming I haven't jumped the gun over anyone's objections, I would like to
nominate refreshed grants:

FOR CORE CONTRIBUTOR:  Bill Sommerfeld (sommerfeld), Mark Fenwick (markfen)
  


I don't want to get personal, but does Mark even read this mailing list?

My thunderbird counts 0 emails from him in the last 12 months to this list.
What value would having him as a core contributor have to the community 
processes that involve core contributors? i.e., I have no evidence that he 
wants to be a part of this community or even knows it exists, yet you're 
proposing that he should be a core contributor.


You're a regular contributor to the community, in different ways, which 
made it an easy nomination but I think that the title of core 
contributor needs to reflect on not only what they do with code, but 
also that they demonstrate that they're part of the community. Otherwise 
we end up with lots of dead weight. It would be like if someone 
nominated me to be a core contributor for security - it would have no 
real benefit to the security community because I don't have the time to 
keep abreast of what's happening there and thus don't really contribute 
to the opensolaris security community at large.


Darren



Re: [networking-discuss] Nomination of Dan McDonald to core-contributor status in the networking community

2009-11-20 Thread Darren Reed

On 11/20/09 14:18, sowmini.varad...@sun.com wrote:

On (11/20/09 17:02), Sebastien Roy wrote:
  

On Fri, 2009-11-20 at 12:00 -0800, Darren Reed wrote:

In light of Dan McDonald's previous and ongoing work in IPsec, I'd like 
to nominate him for core-contributor status in the networking community.
  


+1
  


And that makes 3 +1's (Sebastien Roy, James Carlson, Sowmini Varadhan.)

Congratulations, Dan, welcome to the fold.

Darren


[networking-discuss] Nomination of Dan McDonald to core-contributor status in the networking community

2009-11-20 Thread Darren Reed
In light of Dan McDonald's previous and ongoing work in IPsec, I'd like 
to nominate him for core-contributor status in the networking community.


Darren



Re: [networking-discuss] New project announcement: "Solaris Network Performance Projects"

2009-11-17 Thread Darren Reed

On 11/16/09 13:16, Artem Kachitchkine wrote:



 > *) Pluggable congestion control

Sounds reasonable, but could we get more detail on what's in and out of
scope for these two projects -- e.g., is pluggable congestion control
limited to TCP or also applicable to SCTP?


The goal is to support SCTP as well as TCP. A new type of loadable 
kernel modules will implement congestion control algorithms. More 
specifically, the function of each algorithm will be to dynamically 
adjust, increase or decrease, sending rate based on its unique notion 
of available bandwidth and presence of congestion in the network.


Are there any plans to import interfaces from the crossbow project to 
provide hints to IP about what the available bandwidth might be?


Or is available bandwidth only a result of discovery through 
sending/receiving traffic?


Will all of the existing congestion control algorithms be modified to 
use the new interfaces delivered by this?


If so, will this include our implementation of Explicit Congestion 
Notification (ECN) for TCP?


Darren



[networking-discuss] tcpdump 4.0.0 and libpcap 1.0.0 integrated into SFW

2009-11-10 Thread Darren Reed

With the putback today of:

   PSARC/2009/147 tcpdump
   6744125 libpcap should be IPv6 capable
   6808014 tcpdump to be included into SFW consolidation
   6841989 libpcap needs to be updated version 1.0.0
   6863377 libpcap needs to be built to support BPF


tcpdump 4.0.0 and libpcap 1.0.0 are now available via SFW.

The first build for which binary files will be available
is build 129.

Stay tuned for a further update to provide Solaris IPNET
capture support for wireshark.

Darren




Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-09 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

Anyone who wants to can get the source using Mercurial from
hg.opensolaris.org.

Run tar and gzip.
  
  

And the directory tree you get won't build anywhere outside of the ON tree.



Of course not.  It has not been ported anywhere else.

Who are you proposing to do that porting work?  It's a *lot* of work,
and I suspect it's basically a worthless task given the much better
alternatives available.
  


But yet it is currently essential to us and undoubtedly others too.



Are
there _any_ developers out there who give a flying fig about snoop?  All
you need is one.
  


I haven't asked all developers, so I've no idea.



As far as I can see, we've got everything to gain and nothing to lose by
trying it.



No.  Because it's silly.
  
  

Why do you say that?

So far as I can tell, you're making a lot of assumptions about something
that's unknown.



You're talking about an astroturf community, which makes it just plain
silly.  Throwing source at sourceforge does not cause developers to
spring into being any more than posting the source on opensolaris.org or
any other place.
  


Yup, agreed.

Maybe the point in time where this is an interesting idea has passed; I 
don't know. (I'm thinking maybe in the mid 1990s it would have been an 
interesting idea: pre-ethereal/wireshark.)


Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-09 Thread Darren Reed

Nicolas Williams wrote:

On Mon, Nov 09, 2009 at 01:11:17PM -0800, Darren Reed wrote:
  

James Carlson wrote:


If nobody cares -- as it seems to be today -- then it'll never happen.
  
Well maybe that's not a bad idea - extract snoop from ON, convert it 
into something that builds on Solaris 8, 9, 10 and 11 and whack it up on 
sourceforge.



What's in it for us?  It sounds like a waste of time.
  


It satisfies the desire at Sun to no longer do any work on it whilst 
keeping it available for those that "need it."



OpenSolaris is [mostly] open source, snoop is in the open source part of
OpenSolaris.  That should be enough.
  


Except that all changes to anything on opensolaris need to come from or 
go through someone at Sun.



...

I would say it's silly because you're suggesting, I think, that someone
at Sun (?) do the work to get snoop to be a standalone tool that can be
hosted outside ON... but we all seem to agree that snoop is dead to us!
(except for compatibility reasons)

What is the "everything" that we've got to gain by doing this?  Why
isn't snoop's presence in ONNV sufficiently "open source"?
  


So it is dead but we want to keep it in the source code tree for 
opensolaris.
We don't dare remove it because there are scripts that exist that rely 
on it.
We don't want to add new features to it because there are better tools 
more worthy.

So we're left with this... thing... that rots.

IMHO, the current situation is far sillier than what I'm suggesting.

Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-09 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

James Carlson wrote:


Darren Reed wrote:
 
  

How many times have you seen people on the 'web clamour about "why
didn't they open source it if they're going to discontinue it?"?



Sun *DID* open source it -- well over four years ago.  No more "open
sourcing" is needed to get snoop out there.
  

Where can I download snoop.tar.gz to build and compile on Linux?



Anyone who wants to can get the source using Mercurial from
hg.opensolaris.org.

Run tar and gzip.
  


And the directory tree you get won't build anywhere outside of the ON tree.



...
Someone will have to do that work, and it's certainly not trivial.

  

Or even just to get all of the current decoders, etc, for snoop on
Solaris 8, 9 or 10?



Mercurial, qv.
  


No, mercurial will give me something that compiles as part of ON.



While it might be "open source'd", because it is all CDDL'd, today it is
part of OpenSolaris. To do any work on it today is thus not attractive
for anyone outside of Sun.



So it seems you're saying that the fact that nobody's tar'd up that one
directory and tossed it on a web server is the reason that there's no
vibrant external snoop development community.  It's the one blocking
issue.  They can do all the complicated work of getting a DLPI-dependent
program running on a non-DLPI system, but Mercurial is a mystery.
  


Why does everything need to have a "vibrant development community"?

Statements like that sound like the mutterings of managers and people 
who are disconnected with open source.




In other words, you've got the cart before the horse.  If there is
indeed someone out there who wants to take the snoop sources and run a
new project on sourceforge with them, then more power to that person.
Good luck with it.  You (or anyone else; Sun employee or not) can do
that right _now_ without waiting for any special approval or changes in
ON.  The source is free for the taking.  Today.
  
  

I don't believe that we need two projects maintaining a program called
"snoop". Furthermore, I can't see how anyone would be attracted to work
on something outside of opensolaris when the "official" one is at
opensolaris.



I still think you've got the cart before the horse.

Start that external community of wacky snoop believers first, and, if it
happens, then shutting down the unnecessary one in ON becomes trivial.

If nobody cares -- as it seems to be today -- then it'll never happen.
  


Well maybe that's not a bad idea - extract snoop from ON, convert it 
into something that builds on Solaris 8, 9, 10 and 11 and whack it up on 
sourceforge.




...
You can't treat sourceforge (or any other such site) as a virtual India
where out-of-date projects can be sent for death by maintenance.  It
won't work.
  

How do you know that?
Has it been tried before?

If it hasn't, isn't it worth trying out?

As far as I can see, we've got everything to gain and nothing to lose by
trying it.



No.  Because it's silly.
  


Why do you say that?

So far as I can tell, you're making a lot of assumptions about something 
that's unknown.


Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-09 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

How many times have you seen people on the 'web clamour about "why
didn't they open source it if they're going to discontinue it?"?



Sun *DID* open source it -- well over four years ago.  No more "open
sourcing" is needed to get snoop out there.


Where can I download snoop.tar.gz to build and compile on Linux?

Or even just to get all of the current decoders, etc, for snoop on 
Solaris 8, 9 or 10?


While it might be "open source'd", because it is all CDDL'd, today it is 
part of OpenSolaris. To do any work on it today is thus not attractive 
for anyone outside of Sun.




In other words, you've got the cart before the horse.  If there is
indeed someone out there who wants to take the snoop sources and run a
new project on sourceforge with them, then more power to that person.
Good luck with it.  You (or anyone else; Sun employee or not) can do
that right _now_ without waiting for any special approval or changes in
ON.  The source is free for the taking.  Today.
  


I don't believe that we need two projects maintaining a program called 
"snoop". Furthermore, I can't see how anyone would be attracted to work 
on something outside of opensolaris when the "official" one is at 
opensolaris.



...
You can't treat sourceforge (or any other such site) as a virtual India
where out-of-date projects can be sent for death by maintenance.  It
won't work.


How do you know that?
Has it been tried before?

If it hasn't, isn't it worth trying out?

As far as I can see, we've got everything to gain and nothing to lose by 
trying it.


Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-09 Thread Darren Reed

On 11/08/09 04:00, Brian Utterback wrote:

Darren Reed wrote:


Another question in my mind is why shouldn't we kickstart a snoop 
project on sourceforge? If it fails to attract a lot of attention, so 
be it, but isn't it worthwhile as an experiment? If it doesn't go 
anywhere, then that's a decision that the community at large has 
made, rather than one that Sun makes.


If Sun wants to stop developing snoop and similarly envisages ceasing 
support of it, why shouldn't Sun enable folks to continue hacking on 
it?

Why don't they now? Because Sun owns it.

Darren



An interesting idea, but I am not sure that snoop is a good place to 
start. Has anyone ever expressed an interest in hacking snoop? Does it 
have any attribute that makes it a better starting point for hacking 
than wireshark?


Back in the 20th century, when SunOS4 still existed, people wanted snoop 
there..


But the point isn't necessarily to "hack on it", but rather for those 
that want to continue using it in whichever scripts they've built around 
it to be able to continue doing that and to be able to update it as is 
necessary.


How many times have you seen people on the 'web clamour about "why 
didn't they open source it if they're going to discontinue it?"?


On the hacking side of things, I know there have been times I've wanted 
to do more interesting things in snoop, like decode the ASN.1 in 
Kerberos UDP packets.



Remember that the whole reason that we must keep it in the distro is 
that so much is built on it in scripts, tests, etc. That means that if 
we did pass it to sourceforge or something like that, the one proviso 
is that it would have to remain backwards compatible forever (or at 
least until we did drop it). This would make it even less attractive 
for hacking.


snoop functions and behaves in a manner that is unique to snoop. I think 
the value of snoop is in it being snoop, not in changing it to be like 
something else. Thus I'm not terribly afraid of it becoming incompatible 
with itself.


Just because something is an open source project on sourceforge does not 
mean that backward compatibility ceases to become an issue. For example, 
if we assume that such a thing on sourceforge did happen, at least an 
initial pull of that source code would be used to construct the package 
for Open/Solaris. If that sourceforge project wanted its code updates to 
continue to be used in Open/Solaris then it seems natural to me for 
it to keep being useful by maintaining backward compatibility.


And not to mention snoop is pretty hairy internally. Again not the 
best place to start hacking.


I'm sure there are pieces of code in much worse condition, today, that 
still get hacked on. I mean Linux is still alive and kicking, isn't it?


Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-07 Thread Darren Reed

On 11/04/09 08:24, Garrett D'Amore wrote:

Brian Utterback wrote:



Steven Stallion wrote:


+1 for keeping snoop.

snoop presents a minimal quick and dirty method of inspecting packets;
this has been invaluable during network driver development over a serial
console.


Is there anything that snoop provides in this regard that tshark does 
not? I suspect that once you give it a try, you will find that tshark 
works as well or better for your needs.


I'm aware of only one thing snoop provides that tshark does not: 
familiarity for people who've been using Solaris for a very long time. 
:-)


Eliminating snoop at this time is not practical.  I think "abandoning" 
snoop shouldn't happen at least until we have had a significant 
release cycle that included an alternative.  As it stands, there has 
not yet been an actual production release that included wireshark.


I wasn't ever proposing that snoop should be eliminated from the 
distribution.


What I was proposing is that the "how" it is packaged be changed.

So there seems to be wide agreement that no new development, at Sun, 
should happen for snoop. That's good because it lets us move on to what 
to do with it in its afterlife.


The next question that needs to be asked is who should support it and how?
Does that role need to be played by Sun forever and a day?

What I'd like people to consider is that the removal of snoop from ON 
does not necessarily mean that it needs to be removed from the distribution.


A distribution is made up of a plethora of packages, each one compiled 
from source code that may or may not have its origins inside of Sun.


Another question in my mind is why shouldn't we kickstart a snoop 
project on sourceforge? If it fails to attract a lot of attention, so be 
it, but isn't it worthwhile as an experiment? If it doesn't go anywhere, 
then that's a decision that the community at large has made, rather than 
one that Sun makes.


If Sun wants to stop developing snoop and similarly envisages ceasing 
support of it, why shouldn't Sun enable folks to continue hacking on it?

Why don't they now? Because Sun owns it.

Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-03 Thread Darren Reed

Sebastien Roy wrote:

On Wed, 2009-11-04 at 00:35 +1100, Darren Reed wrote:
  

Sebastien Roy wrote:


Wireshark and tshark are in SFW, but there is IMO much work to be done
to call it a suitable alternative/replacement.  For one, we have a large
number of Solaris networking test suites that depend on the snoop CLI
and output format (both the snoop file format and the terminal output),
and one would need to evaluate how to handle those dependencies.  I have
no doubt that similar dependencies exist for user scripts and tools.

For that reason, I'm not sure that we'll be able to exclude snoop from a
Solaris distribution anytime soon.
  
  

I agree.

But a Solaris distribution already includes components from SFW, not 
just ON.



Yes, I know, I was simply making an argument against ripping out snoop
from the distribution.

  

And our test suite already depends on bits from SFW.

So there is no Rubicon that needs crossing here.

Why shouldn't snoop become SFW?

Do we need to have a PSARC case or something to announce the EO-whatever 
for snoop in order to unbundle it from ON?



The only architecture that this would affect are the potential
consolidation-private interfaces that snoop uses to do its job.  If it
makes extensive use of ON consolidation-private interfaces, then the
person doing this work needs to work out how to untangle that.
Otherwise, moving the clump of source code associated with snoop from
one manure pile to another doesn't really affect the architecture of the
system.

  
So the idea I'm proposing is not to "rip snoop out" but to change the 
location and the home of its source code so that it is no longer 
considered an integral part of Solaris like it is now. It becomes just 
another tool for sniffing packets, alongside tcpdump and wireshark, that 
we bring in from the outside.



I personally don't see the benefit in doing that.  Snoop still needs to
be maintained in previous Solaris versions (at least for high-priority
bug fixes), and the process for integrating such changes involves
integrating changes to the current release under development.  Moving
consolidations increases the cost associated with integrating such fixes
(IMO) without much benefit.

I also think that we have much bigger problems to solve in making sure
that Wireshark and tcpdump can actually replace snoop eventually.  Let's
think about the engineering problems associated with that.  Snoop can
sit in ON just as well as it can sit on sourceforge, I personally don't
see the point in moving it.
  


If this is the case then any and all arguments at PSARC, or wherever 
else, in favour of using something else other than snoop are neutered by 
the above comments.


In other words, I do not want to hear another person from within our 
organisation say that we need or want to dump snoop in favour of wireshark 
or something else again - or at least not until such time that they have all 
of the problems addressed.


If this issue does get raised again, I hope that the prevailing opinion 
will be that the community wanted to keep snoop until such time as the 
community is ready to see it removed - regardless of what any ARC 
committee might think is best.


Personally, I don't care one way or another, but as a group, it seems 
networking is not ready to let go. If your manager asks you about this 
topic, feel free to refer them to this thread about what the future of 
snoop should be.


Darren



Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-03 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

Well, if we no longer wish to support snoop and do not wish to bundle it
with the standard product, why not pull it out into a tar-ball of source
code that can be downloaded and compiled?

And maybe we could go a step further: create a project to
maintain/enhance snoop on a website such as sourceforge and let the
community take complete control of it? The community at large could even
port it to Linux or BSD, if such is their wish. Having it hosted on a
non-opensolaris web site would be important for many reasons.



Even if snoop is "deleted" from the source tree, it's still present in
the Mercurial history, so someone who really does care about it can
easily fetch the sources.

And if there's nobody who cares, why bother trying to create a community
somewhere for it?  It seems too much like astroturf to me.  Either there
are people who care -- and who can trivially take a fork of the source
from ON and do with it as they please -- or there's nobody who cares.
In both cases, we needn't "help" them by creating a ghost ship on
sourceforge.  (And a weird one at that, since the code certainly will
require a substantial amount of work to be even minimally portable.)
  


Right, but if someone else wants to do it, why should we care or try to 
stop them?

(btw, I've got the SunOS4 binary for snoop around somewhere...)

Creating forks of software also creates headaches. You have got to be 
really dedicated to do it.


Taking any part of Solaris and making changes to it is unattractive for 
anyone outside of Sun because the process of contributing those changes 
back is not very friendly or welcoming.


Maybe snoop will be just another open source project that is idle most 
of the time, maybe not.


What I do know is that apart from the use of snoop for test suites, it 
is not often used and it is even less often the tool of choice.


What I'm encouraging people to think about is a way for us to have our 
cake and eat it too.


We want to get rid of snoop from ON but at the same time, we want to be 
able to use it. SFW seems like the perfect answer.


Darren

p.s. i suppose another answer is to move the snoop source from ON to STC 
but that only placates those in testing, not the wider community.




Re: [networking-discuss] Is there a place in heaven for snoop?

2009-11-03 Thread Darren Reed

Sebastien Roy wrote:

On Mon, 2009-11-02 at 07:19 -0500, James Carlson wrote:
  

The only thing blocking the removal of snoop in the past has been the
lack of wireshark/tshark integration as a replacement.  Has that finally
been fixed?  If so, then just do it.



Wireshark and tshark are in SFW, but there is IMO much work to be done
to call it a suitable alternative/replacement.  For one, we have a large
number of Solaris networking test suites that depend on the snoop CLI
and output format (both the snoop file format and the terminal output),
and one would need to evaluate how to handle those dependencies.  I have
no doubt that similar dependencies exist for user scripts and tools.

For that reason, I'm not sure that we'll be able to exclude snoop from a
Solaris distribution anytime soon.
  


I agree.

But a Solaris distribution already includes components from SFW, not 
just ON.


And our test suite already depends on bits from SFW.

So there is no Rubicon that needs crossing here.

Why shouldn't snoop become SFW?

Do we need to have a PSARC case or something to announce the EO-whatever 
for snoop in order to unbundle it from ON?


So the idea I'm proposing is not to "rip snoop out" but to change the 
location and the home of its source code so that it is no longer 
considered an integral part of Solaris like it is now. It becomes just 
another tool for sniffing packets, alongside tcpdump and wireshark, that 
we bring in from the outside.


Darren



[networking-discuss] Is there a place in heaven for snoop?

2009-11-02 Thread Darren Reed
As many of us know, snoop is a bit long in the tooth when it comes to 
some aspects of network traffic analysis when compared to more modern 
tools such as wireshark. The end-game most people have in mind is for 
snoop to eventually be removed from ON... and that got me thinking: why 
not cut snoop free?


What do I mean by "cut snoop free"?

Well, if we no longer wish to support snoop and do not wish to bundle it 
with the standard product, why not pull it out into a tar-ball of source 
code that can be downloaded and compiled?


And maybe we could go a step further: create a project to 
maintain/enhance snoop on a website such as sourceforge and let the 
community take complete control of it? The community at large could even 
port it to Linux or BSD, if such is their wish. Having it hosted on a 
non-opensolaris web site would be important for many reasons.


If we can do the above, then we could move snoop from being an ON 
component to an SFW component (or similar) and still allow those who 
want it to use pkg to install it. So we solve our issue (having to 
support/enhance a program we don't want to spend a lot of time on, 
snoop) and at the same time keep it available for those that "must have 
it."


In the past, the biggest obstacle to us doing this would have been the 
source code being closed. That problem is solved.


Thoughts?

Darren



[networking-discuss] tcpdump/libpcap code review

2009-10-20 Thread Darren Reed

To integrate tcpdump and update libpcap to version 1.0.0,
I need a set of eyes or two to look over the changes here:

http://cr.opensolaris.org/~darrenr/sfwnv-pcap/

Thanks,
Darren



Re: [networking-discuss] Automated installs and dladm

2009-10-19 Thread Darren Reed

Sebastien Roy wrote:

On Tue, 2009-10-20 at 01:16 +1100, Darren Reed wrote:
  

Sebastien Roy wrote:


On Mon, 2009-10-19 at 23:50 +1100, Darren Reed wrote:
  
  

In many system administration environments, it is desirable to do
system installation via jumpstart. Although jumpstart as we know
it today will go away with IPS, IPS will provide a new mechanism
to do automated network installs so this need does not go away...

As it is today, dladm only appears to support automated installs
creating bridge devices, leaving IP tunnels, vnics, aggregations
and others out in the cold.

How many of those that do not support -R can be "fixed"?



It looks like every dladm create-* subcommand supports a -R option.  I'm
unclear about what you're referring to.
  
  

$ dladm |& grep create
create-aggr  [-t] [-P ] [-L ] [-T ] [-u 
]

create-secobj[-t] [-f ] -c  
create-vlan  [-ft] -l  -v  [link]
create-iptun [-t] -T  [-a {local|remote}=,...] ]
create-vnic  [-t] -l  [-m  | auto |
create-etherstub [-t] 
create-bridge[-R ] [-P ] [-p ]

oh... the "usage" output disagrees with the man page...
silly me for not checking the man page.



That seems like a bug in the create-bridge synopsis.  I believe -R was
taken out of all of the subcommand synopses in the usage output because
it caused too much clutter and was redundant.  It looks like one was
re-introduced with create-bridge.
  


No, the bug is in thinking that a command line option does not belong
in the subcommand synopsis that is displayed as above. There should
always be 100% agreement between the synopses and the man page.

Darren



Re: [networking-discuss] Automated installs and dladm

2009-10-19 Thread Darren Reed

Sebastien Roy wrote:

On Mon, 2009-10-19 at 23:50 +1100, Darren Reed wrote:
  

In many system administration environments, it is desirable to do
system installation via jumpstart. Although jumpstart as we know
it today will go away with IPS, IPS will provide a new mechanism
to do automated network installs so this need does not go away...

As it is today, dladm only appears to support automated installs
creating bridge devices, leaving IP tunnels, vnics, aggregations
and others out in the cold.

How many of those that do not support -R can be "fixed"?



It looks like every dladm create-* subcommand supports a -R option.  I'm
unclear about what you're referring to.
  


$ dladm |& grep create
   create-aggr  [-t] [-P ] [-L ] [-T ] [-u 
]

   create-secobj[-t] [-f ] -c  
   create-vlan  [-ft] -l  -v  [link]
   create-iptun [-t] -T  [-a {local|remote}=,...] ]
   create-vnic  [-t] -l  [-m  | auto |
   create-etherstub [-t] 
   create-bridge[-R ] [-P ] [-p ]

oh... the "usage" output disagrees with the man page...
silly me for not checking the man page.

Darren



[networking-discuss] Automated installs and dladm

2009-10-19 Thread Darren Reed

In many system administration environments, it is desirable to do
system installation via jumpstart. Although jumpstart as we know
it today will go away with IPS, IPS will provide a new mechanism
to do automated network installs so this need does not go away...

As it is today, dladm only appears to support automated installs
creating bridge devices, leaving IP tunnels, vnics, aggregations
and others out in the cold.

How many of those that do not support -R can be "fixed"?

Darren



Re: [networking-discuss] MAC and IP Address with ioctl

2009-10-16 Thread Darren Reed

yonah wrote:

Hi,

I found the solution for the IP address (I have to use the inet_ntop function), 
but ARP does not work for IPv6 (as I found on the internet). How can I get the 
MAC address for IPv6 interfaces only?

Interface Name:xnf0:1
Hardware Address:00:00:00:00:00:00:
IP Adresse (V6):2001:db8::216:3eff:fe7f:e4d8

That's my output without arp but with inet_ntop.
  


The MAC address is a property of the physical interface, not the IP 
interface.


I don't believe that you can have xnf0 with MAC address 11:22:33:44:55:66
for IPv4 and xnf0 with MAC address 66:55:44:33:22:11 for IPv6.

Darren



Re: [networking-discuss] MAC and IP Address with ioctl

2009-10-11 Thread Darren Reed

yonah wrote:

Hi,

I need to get the MAC address and IP address for IPv6 in C on 
OpenSolaris.

I found the following code:

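#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sockio.h>
#include <net/if.h>
#include <net/if_arp.h>
#include <netinet/in.h>

int
main(int argc, char *argv[])
{
	int i, n, sock;
	unsigned char *p;
	struct lifconf lic;
	struct lifreq lifrs[30];
	struct lifnum num;
	struct arpreq ar;
	struct sockaddr_in *soap, *soap2;

	/* Enumerate all IP addresses of the system */

	sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
	if (sock < 0) {
		perror("socket");
		return (1);
	}
	num.lifn_family = AF_INET;
	num.lifn_flags = 0;
	ioctl(sock, SIOCGLIFNUM, &num);
	lic.lifc_family = AF_INET;
	lic.lifc_flags = 0;
	lic.lifc_len = sizeof (lifrs);
	lic.lifc_buf = (caddr_t)&(lifrs[0]);
	if (ioctl(sock, SIOCGLIFCONF, &lic) < 0) {
		perror("SIOCGLIFCONF");
		close(sock);
		return (1);
	}

	/* Get the ethernet address for each of them */

	/* lifc_len now holds the number of bytes actually filled in */
	n = lic.lifc_len / sizeof (struct lifreq);
	for (i = 0; i < n; i++) {
		/* Copy the interface's IPv4 address into the ARP request */
		soap = (struct sockaddr_in *)&(lifrs[i].lifr_addr);
		memset(&ar, 0, sizeof (ar));
		soap2 = (struct sockaddr_in *)&(ar.arp_pa);
		*soap2 = *soap;

		/* Print IP address */

		p = (unsigned char *)&(soap->sin_addr);
		printf("%s: %u.%u.%u.%u - ", lifrs[i].lifr_name,
		    p[0], p[1], p[2], p[3]);

		/* Get ethernet address */

		if (ioctl(sock, SIOCGARP, &ar) < 0) {
			printf("No ethernet address.\n");
		} else {
			p = (unsigned char *)&(ar.arp_ha.sa_data);
			printf("%02X:%02X:%02X:%02X:%02X:%02X\n",
			    p[0], p[1], p[2], p[3], p[4], p[5]);
		}
	}

	close(sock);
	return (0);
}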



But this works only for IPv4. I changed the address family and structures to 
IPv6, but it does not work.
  


Which part of the above does not work for IPv6?
What happens if you use SIOCGLIFCONF and SIOCGLIFADDR for IPv6?

Darren



Re: [networking-discuss] Irresponsiveness when lots of active network connections

2009-10-11 Thread Darren Reed

Mika Borner wrote:

Darren Reed wrote:


What launches the loginproxy command?



The loginproxy is an IMAP4 proxy.

There is a setting for MaxThreads (currently set to 16384) and 
MaxSocketsPerThread (currently 128).


The Documentation mentions for MaxThreads: "The number of available 
POP3/IMAP4 server threads. Each user requires one thread".


At the moment I can see 262 loginproxy LWPs and netstat shows me 10254 
established connections...


Is any of this driven out of inetd?

Darren



Re: [networking-discuss] Irresponsiveness when lots of active network connections

2009-10-10 Thread Darren Reed

Mika Borner wrote:

Hi,

On a T2 (8-core) we see unresponsiveness when there is a high number of 
network connections, even on interfaces that do not have a high 
payload. Logins can take ages.


What launches the loginproxy command?

Darren



Re: [networking-discuss] plumbing an interface

2009-10-09 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

For example, maybe I want to develop an on-demand tunnel protocol and
implementation. Let's assume that it's a GLDv3 interface and I want to use
my own tunnel protocol, not one of those supported by iptun (for example
openvpn.)



The subtle but important difference is that you're talking about
creating and destroying higher-level software abstractions (tunnels),
while the original poster and I were discussing plumbing of low-level
physical (Ethernet) interfaces.
  


I chose tunnels because they were an easy example.

But the same problem applies to any network interface.

NWAM is designed around the principle of activating a network interface
and configuring it because of what it is attached to. For a lot of
applications, that is fine. It is a good model for a laptop that has a
wireless/LAN interface and can be extended to file/web servers.

But the design does not include how to manage network interfaces that
are desired to be up/down/plumbed/unplumbed based on some other
characteristic.



I think these have some different behaviors and, despite what meem said,
I'm not seeing a lot of semantic goodness in being able to insert a new
adapter (or create a new VLAN), but not being able to talk about IP on
top of that interface until I specifically say, yeah, I'd like fries^WIP
with that.  It's at best confusing, because it's different from most
other operating systems, and being able to say "no IP on X" is not
terribly different (at least to me) from saying "IP is down and
unconfigured on X."
  


So what I hear you saying is that the API needs to be fuller,
and as an example, rather than having different actions to plumb
and then add a network interface, it should be a single function.



Whilst NWAM provides us with "network automagic", sometimes what
people (or applications) want is "programmatic networking" instead or in
addition to the "automagic".



I agree that being able to create and destroy tunnels used for VPNs
would be a good thing.  It's less obvious to me how that should interact
with NWAM, though.  Do I _really_ need to have that application doing
all of the interface configuration work?


Why not?

If it isn't obvious how it should work with NWAM then perhaps the
answer is that it shouldn't.



For an application that's
well-integrated with OpenSolaris, I think the answer is "no" -- NWAM
should always handle that job, even if there are goofy proprietary means
for determining properties.
  


What do you mean by "well integrated with OpenSolaris"?

Do you mean compiles and runs?
Or is bundled and part of an OpenSolaris based distribution?

Darren



Re: [networking-discuss] plumbing an interface

2009-10-08 Thread Darren Reed

James Carlson wrote:

...
You're welcome to the opinion, but I don't think it has anything to do
with interface stability, or determining what things you can rely on.

There are deeper issues here.  If the OpenSolaris community does in fact
do away with the need for applications that know about plumbing by
making it all automatic (as the NWAM team has prototyped, and as it
seems we're driving towards), then what benefit do we get by having
function call wrappers over a non-existent feature?

Why make easy what we wish nobody needed to do, and what we expect to
make obsolete?


Well, to express this idea in terms of NWAM, I suspect what would be
needed is to have sub-profiles that are specific to an application that
exist within a "location" based profile. But even then, that might be too
generic.

For example, maybe I want to develop an on-demand tunnel protocol and
implementation. Let's assume that it's a GLDv3 interface and I want to use
my own tunnel protocol, not one of those supported by iptun (for example
openvpn.)

Now I want my application to do authentication of the remote end before
it creates the tunnel interface. The application does some sort of
out-of-band authentication (maybe web based) and then wants to create a
tunnel based on the success of that authentication.

My "location" hasn't changed, it could be "any" network that I'm attached
to and thus there is no reason to activate a different NWAM profile. That
might even be exactly what I don't want to do. Tying the interface
to the NWAM profile(s) would be an incorrect association between the
state of the application and NWAM. Thus the tunnel interface should not
be visible "all the time" as a "just in case" thing but only as required.

In this particular high level example, the state of the tunnel link is
tied to internal application state, thus it is not necessarily desirable
to link the tunnel state to the NWAM profile state.

Whilst NWAM provides us with "network automagic", sometimes what
people (or applications) want is "programmatic networking" instead or in
addition to the "automagic".

Darren



Re: [networking-discuss] plumbing an interface

2009-10-05 Thread Darren Reed

On 10/05/09 10:53, Girish Moodalbail wrote:

On 10/05/09 13:19, Darren Reed wrote:

Girish Moodalbail wrote:

On 10/05/09 09:43, Gireesh Nagabhushana wrote:
So, are we getting APIs which can be used for interface 
configurations?


Plumb/unplumb was his initial requirement. His last mail was not just on
plumb/unplumb but on a wide number of operations on network interfaces
which we do through ndd/dladm/ifconfig..


We know the gaping hole in doing network interface configuration 
programmatically in Solaris. We are addressing this issue in 
Brussels II project. More on it here:


http://arc.opensolaris.org/caselog/PSARC/2009/306/commitment.materials/

and here

http://opensolaris.org/os/project/brussels/

However it will be a while before we could make those APIs public.


Why?


For the simple reason that Brussels II delivers components in a phased 
manner and by the first phase we wouldn't have every feature supported 
to make the APIs stable and public.


And why is it necessary to support every feature with the first release 
of a stable API?


That's the best excuse to never make an API stable/public because new 
features will always be added.


Darren



Re: [networking-discuss] plumbing an interface

2009-10-05 Thread Darren Reed

Girish Moodalbail wrote:

On 10/05/09 09:43, Gireesh Nagabhushana wrote:

So, are we getting APIs which can be used for interface configurations?

Plumb/unplumb was his initial requirement. His last mail was not just on
plumb/unplumb but on a wide number of operations on network interfaces
which we do through ndd/dladm/ifconfig..


We know the gaping hole in doing network interface configuration 
programmatically in Solaris. We are addressing this issue in Brussels 
II project. More on it here:


http://arc.opensolaris.org/caselog/PSARC/2009/306/commitment.materials/

and here

http://opensolaris.org/os/project/brussels/

However it will be a while before we could make those APIs public.


Why?

That's quite sad :-(

We seem to sometimes miss the obvious things, in our pursuit of the
finer details. For example, how hard would it be to support a library
function call that was "x = create_interface("foo0")" and then one that
was "add_address_to_interface(x, "192.168.1.1/24")"?

Darren



Re: [networking-discuss] What is this code's history and reason for being here still?

2009-10-05 Thread Darren Reed

I jazzed it up a little and the test still passed, so I've filed 6888104,
so people can add comments there...

The original bug that introduced the code, 076 is not available
to bugster, so I can't cross reference it in the bug database :-(

Darren



Re: [networking-discuss] What is this code's history and reason for being here still?

2009-10-02 Thread Darren Reed

Darren Reed wrote:


Looking through ip.c for the refactor review, I found this gem;

ip_net_mask(ipaddr_t addr)
{
	uchar_t *up = (uchar_t *)&addr;
	ipaddr_t mask = 0;
	uchar_t *maskp = (uchar_t *)&mask;

#if defined(__i386) || defined(__amd64)
#define TOTALLY_BRAIN_DAMAGED_C_COMPILER
#endif
#ifdef  TOTALLY_BRAIN_DAMAGED_C_COMPILER
	maskp[0] = maskp[1] = maskp[2] = maskp[3] = 0;
#endif


It predates the refactoring project so I'm reluctant to mention it here
but has anyone attempted to confirm if the above is still required?

It looks like a workaround for an optimisation bug?



To test if this was still a problem, I used the following:

$ /ws/onnv-tools/SUNWspro/SS12/bin/cc -m64 -Ui386 -U__i386 -xO3 
-D_ASM_INLINES -Xa -xspace -Wu,-save_args -v -xildoff -g -xc99=%all 
-W0,-noglobal -xdebugformat=stabs -errtags=yes -errwarn=%all 
-W0,-xglobalstatic -xstrconst -xinline=tcp_set_ws_value -D_SYSCALL32 
-D_SYSCALL32_IMPL -D_ELF64 -D_DDI_STRICT -Dsun -D__sun -D__SVR4 foo.c -lnsl

$ ./a.out 1.2.3.4 224.1.1.1 192.168.1.1 129.1.1.1
255.0.0.0
240.0.0.0
255.255.255.0
255.255.0.0

$ cat foo.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>

#ifdef  _BIG_ENDIAN
#define N_IN_CLASSA_NET         IN_CLASSA_NET
#define N_IN_CLASSD_NET         IN_CLASSD_NET
#define N_INADDR_UNSPEC_GROUP   INADDR_UNSPEC_GROUP
#define N_IN_LOOPBACK_NET       (ipaddr_t)0x7f000000U
#else /* _BIG_ENDIAN */
#define N_IN_CLASSA_NET         (ipaddr_t)0x000000ffU
#define N_IN_CLASSD_NET         (ipaddr_t)0x000000f0U
#define N_INADDR_UNSPEC_GROUP   (ipaddr_t)0x000000e0U
#define N_IN_LOOPBACK_NET       (ipaddr_t)0x0000007fU
#endif /* _BIG_ENDIAN */
#define CLASSD(addr)    (((addr) & N_IN_CLASSD_NET) == N_INADDR_UNSPEC_GROUP)
#define CLASSE(addr)    (((addr) & N_IN_CLASSD_NET) == N_IN_CLASSD_NET)
#define IP_LOOPBACK_ADDR(addr)  \
    (((addr) & N_IN_CLASSA_NET) == N_IN_LOOPBACK_NET)

/*
 * Return the network mask
 * associated with the specified address.
 */
ipaddr_t
ip_net_mask(ipaddr_t addr)
{
    uchar_t *up = (uchar_t *)&addr;
    ipaddr_t mask = 0;
    uchar_t *maskp = (uchar_t *)&mask;

    /*
     * The trailing X means the #ifdef below is never true, so the
     * workaround is compiled out of this test build.
     */
#if defined(__i386) || defined(__amd64)
#define TOTALLY_BRAIN_DAMAGED_C_COMPILERX
#endif
#ifdef  TOTALLY_BRAIN_DAMAGED_C_COMPILER
    maskp[0] = maskp[1] = maskp[2] = maskp[3] = 0;
#endif
    if (CLASSD(addr)) {
        maskp[0] = 0xF0;
        return (mask);
    }

    /* We assume Class E default netmask to be 32 */
    if (CLASSE(addr))
        return (0xffffffffU);

    if (addr == 0)
        return (0);
    maskp[0] = 0xFF;
    if ((up[0] & 0x80) == 0)
        return (mask);

    maskp[1] = 0xFF;
    if ((up[0] & 0xC0) == 0x80)
        return (mask);

    maskp[2] = 0xFF;
    if ((up[0] & 0xE0) == 0xC0)
        return (mask);

    /* Otherwise return no mask */
    return ((ipaddr_t)0);
}

int
main(int argc, char *argv[])
{
    struct in_addr ip;
    ipaddr_t in;

    while (argc > 1) {
        inet_pton(AF_INET, argv[1], &in);
        ip.s_addr = ip_net_mask(in);
        printf("%s\n", inet_ntoa(ip));
        argc--;
        argv++;
    }

    return (0);
}

Looks safe for removal... comments on compiling?
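
As a cross-check on the byte-store version, the same classful rules can
be written with plain arithmetic on a host-order value, which leaves
nothing for an optimiser to get wrong around aliased byte pointers. This
is only a sketch: classful_mask_host() is a made-up name, not something
in ip.c, and it assumes the caller converts with ntohl()/htonl().

```c
#include <stdint.h>

/*
 * Classful default mask for an IPv4 address. Both the argument and the
 * result are in host byte order, so no byte-wise stores are needed.
 */
uint32_t
classful_mask_host(uint32_t addr)
{
    if ((addr & 0xf0000000U) == 0xe0000000U)    /* class D */
        return (0xf0000000U);
    if ((addr & 0xf0000000U) == 0xf0000000U)    /* class E */
        return (0xffffffffU);
    if (addr == 0)
        return (0);
    if ((addr & 0x80000000U) == 0)              /* class A */
        return (0xff000000U);
    if ((addr & 0xc0000000U) == 0x80000000U)    /* class B */
        return (0xffff0000U);
    return (0xffffff00U);                       /* class C */
}
```

For the four addresses in the test run above it gives the same masks
(255.0.0.0, 240.0.0.0, 255.255.255.0 and 255.255.0.0) once the result is
passed back through htonl().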

Darren



[networking-discuss] What is this code's history and reason for being here still?

2009-10-02 Thread Darren Reed


Looking through ip.c for the refactor review, I found this gem;

ip_net_mask(ipaddr_t addr)
{
    uchar_t *up = (uchar_t *)&addr;
    ipaddr_t mask = 0;
    uchar_t *maskp = (uchar_t *)&mask;

#if defined(__i386) || defined(__amd64)
#define TOTALLY_BRAIN_DAMAGED_C_COMPILER
#endif
#ifdef  TOTALLY_BRAIN_DAMAGED_C_COMPILER
    maskp[0] = maskp[1] = maskp[2] = maskp[3] = 0;
#endif


It predates the refactoring project so I'm reluctant to mention it here
but has anyone attempted to confirm if the above is still required?

It looks like a workaround for an optimisation bug?

Darren



Re: [networking-discuss] Start of code review for IP datapath refactoring

2009-09-30 Thread Darren Reed

Erik Nordmark wrote:

Darren Reed wrote:
There's some more files and analysis to do but to get it started, 
here's a couple...


$SRC/uts/common/inet/ip/ip_netinfo.c
- the changes here seem fine but it would seem like it is worthwhile
 adding some stub functions (in the future) so that the
 "if (strcmp(NHF_INET...)" is not necessary. See 6886534.


OK. Should I take ownership of that RFE as part of doing that, or does 
the RFE cover some other cleanup so I should leave it open?


If the net_no_*() stub functions get pushed into the neti module, 
there's no reason why you can't take ownership of the RFE.


Darren




Re: [networking-discuss] Start of code review for IP datapath refactoring

2009-09-29 Thread Darren Reed
There's some more files and analysis to do but to get it started, here's 
a couple...


$SRC/uts/common/inet/ip/ip_netinfo.c
- the changes here seem fine but it would seem like it is worthwhile
 adding some stub functions (in the future) so that the
 "if (strcmp(NHF_INET...)" is not necessary. See 6886534.
$SRC/uts/common/inet/ipf/fil.c
- this reflects earlier conversation, so all is fine




Re: [networking-discuss] is the zoneid signed or unsigned?

2009-09-18 Thread Darren Reed

On 18/09/09 10:44 AM, James Carlson wrote:

Darren Reed wrote:
  

James Carlson wrote:


What kind of confusion are you expecting?
  
  

If it is an opaque type, then how does it get printed?



You have to use one of the look-up functions to convert it to a string
for printing.  Zones are named, not numbered, even in the kernel.

This was a key architectural decision made by (I think) Andy Tucker back
when Kevlar was being designed.  I wanted zoneids to be like UIDs, GIDs,
and other UNIX IDs -- used as small integers everywhere, and converted
to names when necessary by use of name services.  Andy and the other
kernel folks disagreed, and felt that strings were better, and integers
would be allocated on the fly (non-permanently) and never used as zone
identifiers, except in performance-sensitive (and entirely internal)
contexts.

  

As an unsigned integer for all values, except -1, or as a signed integer?



I still think it's properly "neither."  Users can't reasonably do
anything with those ephemeral numbers, so printing them (or using them
in user interfaces) would be a mistake.
  


Do a "man snoop" and search for the word "zone".

Darren


Re: [networking-discuss] is the zoneid signed or unsigned?

2009-09-18 Thread Darren Reed

James Carlson wrote:

Darren Reed wrote:
  

Guys, In most parts of the source code, the zoneid is unsigned,
except for where we use ALL_ZONES. Then in some places,
we assign or expect -1 to be the zoneid, for example in what
psh prints and expects to see.

It would seem that we want the zoneid to be unsigned except
for when its value is -1. This seems... like it could lead to
confusion or other bad things.



It looks to me like it's neither signed nor unsigned, but rather an
opaque type.  It supports only equality tests, not general arithmetic
ones, so signedness doesn't seem to come into it.

What kind of confusion are you expecting?
  


If it is an opaque type, then how does it get printed?
As an unsigned integer for all values, except -1, or as a signed integer?
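
The difference is easy to see with a stand-in type. The sketch below
assumes a 32-bit id and uses SKETCH_ALL_ZONES as a placeholder for the
ALL_ZONES sentinel; format_id() is a hypothetical helper, not an
existing interface.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_ALL_ZONES (-1)   /* stand-in for the ALL_ZONES sentinel */

/* Format an id as signed or unsigned; returns what snprintf() returns. */
int
format_id(char *buf, size_t len, int32_t id, int as_unsigned)
{
    if (as_unsigned)
        return (snprintf(buf, len, "%" PRIu32, (uint32_t)id));
    return (snprintf(buf, len, "%" PRId32, id));
}
```

Formatted as signed, the sentinel comes out as "-1"; formatted as
unsigned, the same bits come out as "4294967295", so two tools that
disagree on the format will not recognise each other's output.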

Darren



[networking-discuss] is the zoneid signed or unsigned?

2009-09-17 Thread Darren Reed

Guys, In most parts of the source code, the zoneid is unsigned,
except for where we use ALL_ZONES. Then in some places,
we assign or expect -1 to be the zoneid, for example in what
psh prints and expects to see.

It would seem that we want the zoneid to be unsigned except
for when its value is -1. This seems... like it could lead to
confusion or other bad things.

Thoughts?

Darren



Re: [networking-discuss] multiple addresses to a mac address, using dhcp

2009-09-11 Thread Darren Reed

On 11/09/09 11:10 AM, Daniel wrote:

Hi,

I have a situation where I'm trying to determine an IP address from a mac 
address on a network that uses a DHCP server. I use the following command:

arp -a | grep "$macAddress" | awk '{print $2}'

but this is returning multiple addresses. I believe what's happening is that 
we're restarting these hosts frequently during testing and the old leases for 
the mac address are not up yet and so there are multiple entries in the dhcp 
table.

What's the best way to solve this? I need to determine for sure which IP 
address is being used. Should I just ping each address that I get or is there a 
better way to do this?
  


You need to make sure that your test hosts shut down cleanly and, as 
part of that, actively release the DHCP address that they received from 
the DHCP server.


Darren



Re: [networking-discuss] Start of design review for IP datapath refactoring

2009-09-10 Thread Darren Reed

On  9/09/09 08:46 AM, Kacheong Poon wrote:

Darren Reed wrote:

Given the [], I'm not sure if this is an editorial mistake or not, 
but something that has been brought  up a few times is expanding the 
mblk_t (or dblk_t?) to store a packet length and other useful 
information. Whilst the mblk_t is currently a nice multiple of system 
word size (and in some cases cache line size as well), 
micro-optimisation shouldn't be a design goal. With the above comment 
and the IP Refactoring project almost behind you now, what are your 
thoughts on allowing for this type of change?



My guess is that this special field does not help much since
ULPs, such as TCP and SCTP, can use multiple mblks per send.
Similarly, since TCP/SCTP can segment (or break up) what the
app layer sends down, knowing the size of an app layer send
call may not help avoid the use of msgdsize().


"can" but don't always.

I think it would be a worthwhile experiment to modify the
mblk_t, add in a size field, "fix" all of the places where this
would help and see what, if any, performance change there
is. That's the only way to really find out. Although I wonder
if there are more "hidden" benefits from a change like this
than there are obvious ones.
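
Roughly what I have in mind, as a sketch only: struct sketch_mblk is a
cut-down stand-in for the real mblk_t, b_msglen is the hypothetical new
field, and the walk ignores block types, which the real msgdsize() does
not.

```c
#include <stddef.h>

/* Simplified stand-in for a STREAMS message block; not the real mblk_t. */
struct sketch_mblk {
    struct sketch_mblk *b_cont; /* next block in the same message */
    unsigned char *b_rptr;      /* first valid byte */
    unsigned char *b_wptr;      /* one past the last valid byte */
    size_t b_msglen;            /* hypothetical cached message length */
};

/* Walk the chain and add up the bytes, as a msgdsize()-style helper would. */
size_t
sketch_msgdsize(const struct sketch_mblk *mp)
{
    size_t total = 0;

    for (; mp != NULL; mp = mp->b_cont)
        total += (size_t)(mp->b_wptr - mp->b_rptr);
    return (total);
}

/* With a cached length on the head block, the walk becomes one load. */
size_t
sketch_msglen(const struct sketch_mblk *mp)
{
    return (mp->b_msglen);
}

/* Whoever builds or trims the chain has to keep the cached value current. */
void
sketch_set_msglen(struct sketch_mblk *mp)
{
    mp->b_msglen = sketch_msgdsize(mp);
}
```

The cost is in the last function: every place that links, unlinks or
adjusts b_rptr/b_wptr would need to update the cached length, which is
the "fix all of the places" part of the experiment.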

Darren



Re: [networking-discuss] Start of design review for IP datapath refactoring

2009-09-08 Thread Darren Reed

Erik,

In the design document, is the following:

"[If Volo could pass down the size of the message to the sendmsg 
function we could avoid msgdsize() all together.]"


Given the [], I'm not sure if this is an editorial mistake or not, but 
something that has been brought  up a few times is expanding the mblk_t 
(or dblk_t?) to store a packet length and other useful information. 
Whilst the mblk_t is currently a nice multiple of system word size (and 
in some cases cache line size as well), micro-optimisation shouldn't be 
a design goal. With the above comment and the IP Refactoring project 
almost behind you now, what are your thoughts on allowing for this type 
of change?


Darren



Re: [networking-discuss] ipnat -l ioctl(SIOCGNATS): I/O error

2009-09-02 Thread Darren Reed

On  1/09/09 08:29 AM, Darko Hojnik wrote:

Thank you, I don't get the error any more, but my NAT still doesn't work.
I have opened a new thread...
http://opensolaris.org/jive/thread.jspa?threadID=111859
  


Is ipfilter enabled?



Re: [networking-discuss] How to realize a OpenSolaris based NAT Router?

2009-08-28 Thread Darren Reed

Darko Hojnik wrote:

...

How can I enable NAT on the Wifi?


Look up documentation on how to enable and use IPFilter for Solaris (10).

Darren



Re: [networking-discuss] snoop and LSO

2009-08-26 Thread Darren Reed

On 26/08/09 09:00 PM, Sebastien Roy wrote:

On Wed, 2009-08-26 at 20:53 -0700, Darren Reed wrote:
  
On 26/08/09 08:41 PM, Sebastien Roy wrote: 


On Wed, 2009-08-26 at 19:42 -0700, Yunsong (Roamer) Lu wrote:
  
  
Yes, to determine whether the packet is an LSO one, snoop needs to get the 
meta data. It's certain that snoop shouldn't regard every large (>MTU) 
packet as an LSO one. :)



Seeing as snoop currently doesn't know the MTU
  

What makes you say that?

Doesn't dlpi_info() return that in snoop_capture.c`open_datalink?

Or am I missing something?

It's di_max_sdu that gets used, not mtu_size, throughout the code...

This, however, seems a bit... bogus...
   /* for backward compatibility, allow known interface mtu_sizes */
   if (interface->mtu_size > dlinfo.di_max_sdu)
       dlinfo.di_max_sdu = interface->mtu_size;



Ah, this is what I was thinking of.  Bogus indeed.  Does that get saved
into the capture file?
  


You're going to love this...

from snoop_capture.c`cap_open_read:

   dlinfo.di_max_sdu = MAXINT; /* Decode any stored packet. */

Darren


Re: [networking-discuss] snoop and LSO

2009-08-26 Thread Darren Reed

On 26/08/09 08:41 PM, Sebastien Roy wrote:

On Wed, 2009-08-26 at 19:42 -0700, Yunsong (Roamer) Lu wrote:
  
Yes, to determine whether the packet is an LSO one, snoop needs to get the 
meta data. It's certain that snoop shouldn't regard every large (>MTU) 
packet as an LSO one. :)



Seeing as snoop currently doesn't know the MTU


What makes you say that?

Doesn't dlpi_info() return that in snoop_capture.c`open_datalink?

Or am I missing something?

It's di_max_sdu that gets used, not mtu_size, throughout the code...

This, however, seems a bit... bogus...
   /* for backward compatibility, allow known interface mtu_sizes */
   if (interface->mtu_size > dlinfo.di_max_sdu)
       dlinfo.di_max_sdu = interface->mtu_size;

Darren



Re: [networking-discuss] Default TCP window size -- why still 48k?

2009-08-26 Thread Darren Reed

On 26/08/09 07:17 PM, Dan McDonald wrote:

On Thu, Aug 27, 2009 at 10:12:33AM +0800, Kacheong Poon wrote:




So at 10Mbit, that's:

1250000 bytes/sec * 0.185sec == 231250 bytes.


I think the default should be changed.

Perhaps to 256k then?  (It would cover your case that way.)

I guess that should be fine for most usage.


Cool!  I *thought* there was an RFE already filed.  Perhaps not.

Does anyone else in the peanut gallery have any thoughts?


Peanut gallery? :)

So there are two considerations:
1) buffer allocation on sockets when we're running 1000s
  of TCP connections: @1M, 10,000 sockets = 20GB of RAM,
  @48k, 10,000 sockets = 900MB of RAM
  ... yes, the buffers aren't allocated "straight away"
  but it is a measure of the obligation being advertised.

2) reduced security through a larger window making it easier
  to guess sequence numbers. 48k = 1 in ~87,000,
  1M = 1 in 4096

Applications that want a bigger window can issue a setsockopt
to expand the kernel buffer size (and thus the window.)
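
The numbers in this thread can be reproduced with a few lines of
arithmetic. In the sketch below the helper names are made up for
illustration; the factor of two assumes one send and one receive buffer
per socket, and request_socket_buffers() uses the standard SO_RCVBUF and
SO_SNDBUF socket options, which the kernel is free to clamp.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Bandwidth-delay product in bytes for a link speed and round-trip time. */
uint64_t
bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    return (bits_per_sec / 8 * rtt_ms / 1000);
}

/* Worst-case buffer space advertised: send plus receive for every socket. */
uint64_t
buffer_commitment_bytes(uint64_t nsockets, uint64_t window)
{
    return (nsockets * 2 * window);
}

/* Number of window-sized slots in the 32-bit sequence space. */
uint64_t
seq_guess_odds(uint64_t window)
{
    return ((UINT64_C(1) << 32) / window);
}

/* Ask for larger per-socket buffers; returns 0 on success, -1 on error. */
int
request_socket_buffers(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof (bytes)) != 0)
        return (-1);
    return (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof (bytes)));
}
```

With a 1M window, 10,000 sockets commit about 20GB and an attacker has a
1 in 4096 chance per guess; with 48k the same sockets commit a little
under 1GB and the odds are roughly 1 in 87,000. An application that needs
the larger window on one connection can call request_socket_buffers()
on that socket alone.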

Darren


Re: [networking-discuss] snoop and LSO

2009-08-26 Thread Darren Reed

Unless you have meta data that says the packet is scheduled to have
LSO applied to it and the network interface supports LSO, there
is no way to accurately determine that a packet that "looks wrong"
is going to turn out ok when it is sent by the NIC over the wire.

You cannot use only this information:
(warning) packet length greater than MTU in buffer offset 7384: length=3000

and decide "yes, it's ok, it will be LSO'd" without more data.
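
Put as code, the decision needs inputs that snoop does not have today.
This is a sketch with invented names: struct sketch_pkt_meta and
classify_length() are hypothetical, standing in for whatever per-packet
flags and interface capabilities would have to be passed up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-packet metadata; snoop does not receive this today. */
struct sketch_pkt_meta {
    bool have_meta;         /* metadata accompanied the packet */
    bool lso_marked;        /* packet was flagged for LSO */
    bool nic_lso_enabled;   /* the interface has LSO enabled */
};

enum sketch_verdict {
    PKT_OK,                 /* fits within the MTU */
    PKT_LSO_EXPECTED,       /* oversize, but the NIC will segment it */
    PKT_WARN_OVERSIZE       /* oversize and nothing says LSO applies */
};

enum sketch_verdict
classify_length(size_t pktlen, size_t mtu, const struct sketch_pkt_meta *m)
{
    if (pktlen <= mtu)
        return (PKT_OK);
    if (m != NULL && m->have_meta && m->lso_marked && m->nic_lso_enabled)
        return (PKT_LSO_EXPECTED);
    return (PKT_WARN_OVERSIZE);
}
```

With only the length and the MTU (m == NULL), a 3000 byte packet on a
1500 byte link can only ever be reported as oversize.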

Darren

On 26/08/09 06:28 PM, Yunsong (Roamer) Lu wrote:

What's your point? Could you please describe it with more details?

Thanks,

Roamer


Darren Reed wrote:
No, I'm saying that just because the packet contents (data) look like 
they will be the subject of LSO/hardware checksum does not mean that 
it will. The flags and other data from the dblk_t are needed.


Darren



Re: [networking-discuss] snoop and LSO

2009-08-26 Thread Darren Reed

Yunsong (Roamer) Lu wrote:

Darren Reed wrote:

On 25/08/09 05:59 PM, Yunsong (Roamer) Lu wrote:

Darren Reed wrote:
I don't think snoop is supposed to do all the sanity checks for the 
driver. Actually it's simple for snoop users to figure out 
such misbehavior without seeing a warning.


If I send a packet using a raw socket, to a network interface, that 
looks just like the one that comes from inside the kernel, how is 
snoop going to know the difference?

Are you talking about sending an LSO packet from a raw socket to an interface?


Not quite.

The "warnings" that snoop prints are because sanity checks fail due 
to the way the packet in the mblk_t's is put together by the kernel. 
It is possible for a user application, using a raw socket, to build a 
packet that looks exactly like it and send that.


To snoop, both packets will look the same.

To the driver, one (from the kernel) will have all of the hardware 
checksum bits and the other (from the application) will not.

Are you suggesting that snoop should warn for such packets?


No, I'm saying that just because the packet contents (data) look like 
they will be the subject of LSO/hardware checksum does not mean that it 
will. The flags and other data from the dblk_t are needed.


Darren



Re: [networking-discuss] snoop and LSO

2009-08-25 Thread Darren Reed

On 25/08/09 05:59 PM, Yunsong (Roamer) Lu wrote:

Darren Reed wrote:
I don't think snoop is supposed to do all the sanity checks for the 
driver. Actually it's simple for snoop users to figure out such 
misbehavior without seeing a warning.


If I send a packet using a raw socket, to a network interface, that 
looks just like the one that comes from inside the kernel, how is 
snoop going to know the difference?

Are you talking about sending an LSO packet from a raw socket to an interface?


Not quite.

The "warnings" that snoop prints are because sanity checks fail due to 
the way the packet in the mblk_t's is put together by the kernel. It is 
possible for a user application, using a raw socket, to build a packet 
that looks exactly like it and send that.


To snoop, both packets will look the same.

To the driver, one (from the kernel) will have all of the hardware 
checksum bits and the other (from the application) will not.


Darren



Re: [networking-discuss] snoop and LSO

2009-08-25 Thread Darren Reed

On 25/08/09 05:40 PM, Yunsong (Roamer) Lu wrote:

Darren Reed wrote:

Yunsong (Roamer) Lu wrote:

...
The simplest way to fix the CR is to let snoop check the LSO 
information with the packet and avoid warning about it. But I'm not sure 
it's enough.


Is that enough?

Even though the system shouldn't do it, if an mblk_t is marked with 
LSO information and delivered to a driver that doesn't support it, it 
will then attempt to transmit the packet as-is, correct?
I don't think snoop is supposed to do all the sanity checks for the driver. 
Actually it's simple for snoop users to figure out such 
misbehavior without seeing a warning.


Really?

If I send a packet using a raw socket, to a network interface, that 
looks just like the one that comes from inside the kernel, how is snoop 
going to know the difference?


Mind you, the one from the application won't have the checksum flags set 
on the mblk_t and will thus be transmitted "as is" - if at all possible.


Darren



Re: [networking-discuss] snoop and LSO

2009-08-25 Thread Darren Reed

On 25/08/09 10:32 AM, Peter Memishian wrote:
 > > The simplest way to fix the CR is to let snoop check the LSO 
 > > information with the packet and avoid warning about it. But I'm not sure 
 > > it's enough. Since LSO packets will be segmented by the NIC hardware, 
 > > on the wire there are only MTU-size packets. Are there users who 
 > > expect to have an option for snoop to see the "expected packets on the 
 > > wire"? For example, `snoop -?` to parse the LSO packet header into 
 > > multiple regular headers that are expected to be seen on the wire?
 > 
 > I think snoop should only report what it really sees. For physical 
 > devices, on the tx side, snoop can never see post-segmentation packets 
 > on the wire. I don't think snoop should make a guess and report what it 
 > has imagined to the users.

 > >
 > > It gets a little bit complex when we implement LSO on top of VNICs, as 
 > > is still being discussed. When snooping a VNIC created on e1000g, 
 > > should the snoops be seeing original LSO packets as sent to e1000g or 
 > > post-segmentation packets as seen on the wire? Any thoughts?
 > 
 > If "snoop -d vnic0", I think it should report original LSO packets. If 
 > the real traffic passes through e1000g, "snoop -d e1000g0" should report 
 > the real packets that are passed to e1000g (if e1000g LSO is disabled, 
 > it should show regular Ethernet frames to users).


Yes -- and FWIW we do something similar when hardware checksums are
enabled (snoop simply reports what's in the packet, even if it's invalid).

That said, there should be some facility that allows snoop to know this is
an LSO packet and make this clear to the user examining the dump.  Of
course, we could also disable LSO in this case, but I think that would be
a mistake because it's likely the problem the user is trying to
troubleshoot is related to LSO -- and thus by disabling LSO we are only
making their life harder.  (There is intentionally no supported
administrative mechanism for disabling LSO; we need to keep it that way.)


That's not very helpful when diagnosing problems with incorrect
checksums in packets. Luckily there is a workaround:
# echo 'dohwcksum/W 0' | adb -w -k

With no way to discover what features like this are enabled on
network interfaces and no way to control them (except for the
above), diagnosing checksum issues is harder than it needs to be
on Solaris.

By contrast, with modern BSD systems, all of this is available, e.g;

nfe0: flags=8843 mtu 1500
   capabilities=3f00
   enabled=0
   address: 00:24:8c:5c:66:80
   media: Ethernet 100baseTX (100baseTX half-duplex)
   status: active
   inet 192.168.1.254 netmask 0xffffff00 broadcast 192.168.1.255
   inet6 fe80::224:8cff:fe5c:6680%nfe0 prefixlen 64 scopeid 0x1


Every one of those capabilities can be individually enabled or
disabled, allowing for developers and administrators to make precise
changes in the processing of packet checksums when tracking down a
fault, not to mention working around it until a fix is available.
In this instance, none of the capabilities are enabled. FWIW, this
list only represents those that this NIC supports (there are more)
and those that are enabled (i.e. none.)

The comment about it being likely a problem related to LSO is only
guesswork until you can eliminate LSO from the processing path and
determine if the fault is then reproducible or not. One way to try
and eliminate LSO from the equation is to use dtrace to trace the
packet through the kernel - but we all know what the limitations
with that are. The other is to turn LSO off and see if the problem
persists.

Regardless of which particular option any one person feels is best,
we should not only be enabling our users to choose for themselves
but also enabling ourselves to diagnose problems by delivering
tools that make it possible to retrieve and manage all configuration
information. I suppose there's also mdb that can be used to retrieve
network interface configuration, but that's quite raw.

I haven't checked to see if AIX/HPUX allow this to be tweaked,
I suspect HPUX's heritage means it does not, but I do know that
BSD, Linux and Windows all allow it to be controlled at a finer
level than Solaris does. Strangely, I don't hear any horror
stories from people about it being possible to manage network
interfaces in that way on those platforms. Maybe because it is
not the monster that it is being made out to be?

Darren



Re: [networking-discuss] snoop and LSO

2009-08-25 Thread Darren Reed

On 24/08/09 09:46 PM, Dan Mick wrote:

Darren Reed wrote:

On 24/08/09 07:39 PM, Dan Mick wrote:

Snoop issues a pile of warnings when run on an LSO-enabled NIC:

(warning) packet length greater than MTU in buffer offset 7384: length=3000


As I understand it, these are hard to avoid.  It is, however, not 
hard to make snoop not whinge about them.


Is this behavior useful enough that it should be kept, perhaps under 
-V or -v, or is nuking it acceptable?


(and, yes, snoop's unmaintainable, etc., etc., but we're talking 
about a small relief fix)


A proper fix for this would involve snoop querying the device
driver, before it opens it for sniffing packets and finding out
what capabilities it has enabled. It should also receive metadata
with the packet from the device about whether or not any given
packet is marked up for such a capability and only then ignore
problems such as you're describing.

Otherwise there is no way to distinguish a well formed packet
from a badly formed one.


Yes.  I guess I'm asking if that warning is useful enough to be 
warranted, even in non-verbose mode, when it's clear that many of our 
NICs will cause it to be issued willy-nilly right now, polluting the 
actual desired output.


I agree that a more-proper fix would be better, but I'm told that 
wireshark is imminent, and so "perfect is the enemy of the good" may 
apply.


wireshark is not going to magically make the problem go away,
nor will wireshark using bpf be able to make it go away. A similar
problem is seen with tcpdump on BSD boxes that make use of hardware
checksum and there too they have the same problem as here: it isn't
possible to know if the captured packet for transmit is intended to
have its checksums calculated by the hardware.
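
For reference, this is the check a capture tool can make on its own: the
RFC 1071 Internet checksum over the bytes it was handed. The sketch below
is generic C and not code taken from snoop or tcpdump; inet_checksum() is
a name chosen for illustration, and it is meant for packet-sized inputs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over len bytes, treating the data as
 * big-endian 16-bit words. The result is returned as a host-order number.
 */
uint16_t
inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;    /* pad the odd byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return ((uint16_t)~sum);
}
```

Run over an IPv4 header that already carries its checksum, a result of 0
means the header is consistent. A non-zero result on a captured transmit
packet says only that the bytes do not add up yet; without metadata from
the driver it cannot say whether the hardware was going to fill the field
in afterwards.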

Darren


