carp init delay

2013-04-03 Thread Camiel Dobbelaar


In some cases when a network port comes up, it does not indicate that 
the network is ready.  But on linkup, carp(4) will try to get out of the 
INIT state as soon as possible.  And because all is quiet it will decide 
to become master.


This then leads to master-master situations.

Here are two examples when this can happen, there are probably more:

(1) spanning tree may be in effect, and not yet forwarding

(2) a powering-up or rebooting switch that activates its ports 
immediately, but does not forward anything while not completely up yet 
(this may be an openbsd bridge too)


I wonder if carp(4) needs an extra knob (*shudder*) to pause in the INIT 
state while the rest of the network gets ready after a linkup.


I see in the source code there are already two mechanisms/workarounds 
that are related, but a pause may be a bit more generic:

- sc_suppress
- sc_delayed_arp
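
The pause idea can be sketched as a hold-down counter that is armed on linkup. This is only an illustration under stated assumptions: the struct, the `holddown` knob and the one-second `tick()` are hypothetical and are not taken from ip_carp.c, and the real state machine has more detail (for example, a master yielding to a better advertisement).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the kernel's definitions. */
enum vstate { ST_INIT, ST_BACKUP, ST_MASTER };

struct vnode {
	enum vstate state;
	bool link_up;
	int holddown;    /* assumed knob: seconds to stay in INIT after linkup */
	int hold_left;   /* seconds of hold-down remaining */
	int quiet;       /* seconds without hearing an advertisement */
	int master_down; /* quiet seconds before a backup promotes itself */
};

/* Link state change: fall back to INIT and arm the hold-down on linkup. */
void
link_event(struct vnode *n, bool up)
{
	assert(n != NULL);
	n->link_up = up;
	n->state = ST_INIT;
	n->hold_left = up ? n->holddown : 0;
	n->quiet = 0;
}

/* One-second tick; heard_adv is true if a peer advertisement arrived. */
void
tick(struct vnode *n, bool heard_adv)
{
	if (!n->link_up)
		return;
	switch (n->state) {
	case ST_INIT:
		if (n->hold_left > 0 && --n->hold_left > 0)
			break;	/* still waiting for the network */
		n->state = ST_BACKUP;
		n->quiet = 0;
		break;
	case ST_BACKUP:
		if (heard_adv)
			n->quiet = 0;
		else if (++n->quiet >= n->master_down)
			n->state = ST_MASTER;
		break;
	case ST_MASTER:
		break;	/* preemption is left out of this sketch */
	}
}
```

With `holddown` set to 0 the node leaves INIT on the first tick, which matches the behaviour described above; a non-zero value keeps it out of the election while spanning tree or a booting switch is still blocking traffic.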

Anyone else observe/fix this by other means?  Opinions?




Re: carp init delay

2013-04-03 Thread Camiel Dobbelaar


When the system with the carp interfaces comes up, a sleep in the 
hostname.if file works.  An arping might be an optimization of that.


But I'd like carp to react properly to events *outside* the system.

When I unplug/plug a network cable, spanning tree can kick in again on 
the switch.  *Then* I'd like carp to pause.


Or the example I mentioned earlier when a switch is powered off and on.

I think handling that belongs in the kernel, and not some userland 
voodoo (ifstated/cron scripts) to clean it up.   :-)




On 4/3/13 3:37 PM, sven falempin wrote:

my 2 cents:
timing is always a problem, maybe you could arping the next hop and then
activate the carp ?


On Wed, Apr 3, 2013 at 9:34 AM, Stuart Henderson s...@spacehopper.org wrote:


On 2013/04/03 14:54, Camiel Dobbelaar wrote:


In some cases when a network port comes up, it does not indicate that
the network is ready.  But on linkup, carp(4) will try to get out of
the INIT state as soon as possible.  And because all is quiet it will
decide to become master.

This then leads to master-master situations.

Here are two examples when this can happen, there are probably more:

(1) spanning tree may be in effect, and not yet forwarding

(2) a powering-up or rebooting switch that activates its ports
immediately, but does not forward anything while not completely up
yet (this may be an openbsd bridge too)

I wonder if carp(4) needs an extra knob (*shudder*) to pause in the
INIT state while the rest of the network gets ready after a linkup.

I see in the source code there are already two mechanisms/workarounds
that are related, but a pause may be a bit more generic:
- sc_suppress
- sc_delayed_arp

Anyone else observe/fix this by other means?  Opinions?




slightly messy, though at least this also applies to the case with
things other than carp which could also have problems: add !sleep 5
or something in hostname.if for the physical interface...









Re: carp init delay

2013-04-03 Thread Camiel Dobbelaar



On 4/3/13 3:54 PM, Stuart Henderson wrote:

On 2013/04/03 15:43, Camiel Dobbelaar wrote:



On 4/3/13 3:34 PM, Stuart Henderson wrote:

In some cases when a network port comes up, it does not indicate that
the network is ready.  But on linkup, carp(4) will try to get out of
the INIT state as soon as possible.  And because all is quiet it will
decide to become master.
Anyone else observe/fix this by other means?  Opinions?



slightly messy, though at least this also applies to the case with
things other than carp which could also have problems: add !sleep 5
or something in hostname.if for the physical interface...


Yes, I already use that.  That solves the case where the system with
the carp interfaces itself is rebooted.

But not the other cases.  Hence I'd like that sleep applied always,
instead of only when /etc/netstart is run.  :-)


Ah, I see what you mean. Still there are things other than carp
where this might also apply - for example the pfsync initial_bulk
that gets handled via if_linkstatehooks (but maybe also userland
things)..


Pausing carp may help the pfsync case too?

Wasn't the major problem there caused by the freshly booted backup going 
to master too soon and cancelling the bulk update?





Re: help testing bridge diff

2012-09-24 Thread Camiel Dobbelaar
On Sun, 23 Sep 2012, Stefan Sperling wrote:

 On Thu, Sep 20, 2012 at 10:11:20AM +0200, Camiel Dobbelaar wrote:
  I need help testing this bridge diff, as I cannot test (or even imagine) 
  all the possible bridge setups.
  
  It brings a nice speed improvement and simplifies the code.
  
  Testing especially appreciated with gif, tun and vether interfaces in the 
  bridge.
  
  I can provide i386 and amd64 kernels to make it convenient.  :-)
  
  Thanks!
 
 The diff reads fine. I like the idea, storing a pointer to the bridge
 port itself makes much more sense than having everyone and their uncle
 loop over the iflist in the bridge softc to find the port.
 
 I'll note that apart from making the output path more efficient this
 diff also hides the bridge iflist internals inside the bridge core code.
 Are you planning on changing the list to e.g. a tree going forward?

I think the list is ok as long as it's for simple bridge maintenance.  Not 
for lookups.
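
The difference between the two lookup styles can be sketched as follows. The structs are simplified, hypothetical stand-ins (not the kernel's struct ifnet or struct bridge_iflist); only the idea of storing a back-pointer to the bridge port in the interface comes from the diff.

```c
#include <assert.h>
#include <stddef.h>

struct port;

/* Hypothetical stand-in for an interface. */
struct iface {
	const char *name;
	struct port *bridgeport;	/* back-pointer; NULL if not in a bridge */
};

/* Hypothetical stand-in for a bridge member entry. */
struct port {
	struct iface *ifp;
	struct port *next;
	unsigned flags;
};

/* Old style: walk the member list to find the port, O(n). */
struct port *
port_by_walk(struct port *head, const struct iface *ifp)
{
	struct port *p;

	for (p = head; p != NULL; p = p->next)
		if (p->ifp == ifp)
			return p;
	return NULL;
}

/* New style: follow the pointer stored in the interface, O(1). */
struct port *
port_by_pointer(const struct iface *ifp)
{
	assert(ifp != NULL);
	return ifp->bridgeport;
}

/* Adding a member keeps the list and the back-pointer in sync. */
void
port_add(struct port **head, struct port *p, struct iface *ifp)
{
	p->ifp = ifp;
	p->flags = 0;
	p->next = *head;
	*head = p;
	ifp->bridgeport = p;
}
```

The list stays around for maintenance (adding, removing, iterating over members), while per-packet code only follows the pointer.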

There is one performance killer still there that loops over the list to 
compare MAC addresses to see if the packet is for the machine itself.  I 
think it may be possible to use the bridge routecache for this.  So that 
MAC addresses of bridgeports point to self.  There is also a carp 
check in there unfortunately, which may make this a bit harder.
 
 I'm gonna test this on my firewalls. Let's see if it runs as well
 as it looks :)
 
 There are two small changes buried in the diff which are unrelated
 to the overall change you're making to the bridge code, see below.

Yes, the mbuf.h change as well.  I'll commit those separately.

And a new diff.  A misplaced bracket in in_arpinput() caused make release 
to fail (as you noticed).


Index: dev/isa/if_ie.c
===================================================================
RCS file: /cvs/src/sys/dev/isa/if_ie.c,v
retrieving revision 1.35
diff -u -p -r1.35 if_ie.c
--- dev/isa/if_ie.c	28 Nov 2008 02:44:17 -0000	1.35
+++ dev/isa/if_ie.c	24 Sep 2012 15:14:52 -0000
@@ -1054,16 +1054,16 @@ check_eh(sc, eh, to_bpf)
 		 */
 #if NBPFILTER > 0
 		*to_bpf = (sc->sc_arpcom.ac_if.if_bpf != 0) ||
-		    (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		    (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #else
-		*to_bpf = (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		*to_bpf = (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #endif
 		/* If for us, accept and hand up to BPF */
 		if (ether_equal(eh->ether_dhost, sc->sc_arpcom.ac_enaddr))
 			return 1;
 
 #if NBPFILTER > 0
-		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge == NULL)
+		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridgeport == NULL)
 			*to_bpf = 2; /* we don't need to see it */
 #endif
 
@@ -1095,9 +1095,9 @@ check_eh(sc, eh, to_bpf)
 		 */
 #if NBPFILTER > 0
 		*to_bpf = (sc->sc_arpcom.ac_if.if_bpf != 0) ||
-		    (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		    (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #else
-		*to_bpf = (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		*to_bpf = (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #endif
 		/* We want to see multicasts. */
 		if (eh->ether_dhost[0] & 1)
@@ -1109,7 +1109,7 @@ check_eh(sc, eh, to_bpf)
 
 		/* Anything else goes to BPF but nothing else. */
 #if NBPFILTER > 0
-		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge == NULL)
+		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridgeport == NULL)
 			*to_bpf = 2;
 #endif
 		return 1;
Index: net/bridgestp.c
===================================================================
RCS file: /cvs/src/sys/net/bridgestp.c,v
retrieving revision 1.41
diff -u -p -r1.41 bridgestp.c
--- net/bridgestp.c	20 Sep 2012 14:10:18 -0000	1.41
+++ net/bridgestp.c	24 Sep 2012 15:14:52 -0000
@@ -1641,7 +1641,6 @@ void
 bstp_ifstate(void *arg)
 {
 	struct ifnet *ifp = (struct ifnet *)arg;
-	struct bridge_softc *sc;
 	struct bridge_iflist *p;
 	struct bstp_port *bp;
 	struct bstp_state *bs;
@@ -1649,16 +1648,11 @@ bstp_ifstate(void *arg)
 
 	if (ifp->if_type == IFT_BRIDGE)
 		return;
-	sc = (struct bridge_softc *)ifp->if_bridge;
 
 	s = splnet();
-	LIST_FOREACH(p, &sc->sc_iflist, next) {
-		if ((p->bif_flags & IFBIF_STP) == 0)
-			continue;
-		if (p->ifp == ifp)
-			break;
-	}
-	if (p == LIST_END(&sc->sc_iflist))
+	if ((p = (struct bridge_iflist *)ifp->if_bridgeport) == NULL)
+		goto done;
+	if ((p->bif_flags & IFBIF_STP) == 0)
 		goto done;
 	if ((bp = p->bif_stp) == NULL)
 		goto done;
@@ -2121,7 +2115,7 @@ bstp_ifsflags(struct bstp_port *bp, u_in
 int
 bstp_ioctl(struct ifnet *ifp, u_long cmd

Re: proto cksum madness

2012-09-22 Thread Camiel Dobbelaar
On 21-9-2012 23:40, Stuart Henderson wrote:
 $ ifconfig vr0 hwfeatures|head -2
 vr0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
 hwfeatures=8017<CSUM_IPv4,CSUM_TCPv4,CSUM_UDPv4,VLAN_MTU,WOL>
 
 No problems noticed yet. (this is running i386).
 
 $ ifconfig vlan6 hwfeatures|head -2
 vlan6: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
 hwfeatures=0
 
 Is it right/expected that CSUM_* aren't propagated to the vlan ifaces?

It is correct.  vr(4) does not have VLAN_HWTAGGING.

The comment in the code explains it:

 /*
  * If the parent interface can do hardware-assisted
  * VLAN encapsulation, then propagate its hardware-
  * assisted checksumming flags.
  *
  * If the card cannot handle hardware tagging, it cannot
  * possibly compute the correct checksums for tagged packets.
  *
  * This brings up another possibility, do cards exist which
  * have all of these capabilities but cannot utilize them together?
  */
 if (p->if_capabilities & IFCAP_VLAN_HWTAGGING)
         ifv->ifv_if.if_capabilities = p->if_capabilities &
             IFCAP_CSUM_MASK;
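
As a standalone illustration of that rule, here is a minimal sketch. The bit values and names below are hypothetical and chosen only for this example; they are not the kernel's IFCAP_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits for this sketch only. */
#define CAP_CSUM_IPV4	0x01u
#define CAP_CSUM_TCPV4	0x02u
#define CAP_CSUM_UDPV4	0x04u
#define CAP_VLAN_MTU	0x10u
#define CAP_VLAN_HWTAG	0x20u
#define CAP_CSUM_MASK	(CAP_CSUM_IPV4 | CAP_CSUM_TCPV4 | CAP_CSUM_UDPV4)

/*
 * Capabilities a vlan interface inherits from its parent: the checksum
 * bits, but only if the parent can also tag in hardware.
 */
uint32_t
vlan_caps_from_parent(uint32_t parent_caps)
{
	/* The tagging bit must not be part of the checksum mask. */
	assert((CAP_CSUM_MASK & CAP_VLAN_HWTAG) == 0);

	if (parent_caps & CAP_VLAN_HWTAG)
		return parent_caps & CAP_CSUM_MASK;
	return 0;
}
```

A parent like the vr0 above (checksum bits and VLAN_MTU, no hardware tagging) yields 0, which is the `hwfeatures=0` seen on vlan6.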



help testing bridge diff

2012-09-20 Thread Camiel Dobbelaar
I need help testing this bridge diff, as I cannot test (or even imagine) 
all the possible bridge setups.

It brings a nice speed improvement and simplifies the code.

Testing especially appreciated with gif, tun and vether interfaces in the 
bridge.

I can provide i386 and amd64 kernels to make it convenient.  :-)

Thanks!


Index: dev/isa/if_ie.c
===================================================================
RCS file: /cvs/src/sys/dev/isa/if_ie.c,v
retrieving revision 1.35
diff -u -p -r1.35 if_ie.c
--- dev/isa/if_ie.c	28 Nov 2008 02:44:17 -0000	1.35
+++ dev/isa/if_ie.c	18 Sep 2012 09:55:59 -0000
@@ -1054,16 +1054,16 @@ check_eh(sc, eh, to_bpf)
 		 */
 #if NBPFILTER > 0
 		*to_bpf = (sc->sc_arpcom.ac_if.if_bpf != 0) ||
-		    (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		    (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #else
-		*to_bpf = (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		*to_bpf = (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #endif
 		/* If for us, accept and hand up to BPF */
 		if (ether_equal(eh->ether_dhost, sc->sc_arpcom.ac_enaddr))
 			return 1;
 
 #if NBPFILTER > 0
-		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge == NULL)
+		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridgeport == NULL)
 			*to_bpf = 2; /* we don't need to see it */
 #endif
 
@@ -1095,9 +1095,9 @@ check_eh(sc, eh, to_bpf)
 		 */
 #if NBPFILTER > 0
 		*to_bpf = (sc->sc_arpcom.ac_if.if_bpf != 0) ||
-		    (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		    (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #else
-		*to_bpf = (sc->sc_arpcom.ac_if.if_bridge != NULL);
+		*to_bpf = (sc->sc_arpcom.ac_if.if_bridgeport != NULL);
 #endif
 		/* We want to see multicasts. */
 		if (eh->ether_dhost[0] & 1)
@@ -1109,7 +1109,7 @@ check_eh(sc, eh, to_bpf)
 
 		/* Anything else goes to BPF but nothing else. */
 #if NBPFILTER > 0
-		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge == NULL)
+		if (*to_bpf && sc->sc_arpcom.ac_if.if_bridgeport == NULL)
 			*to_bpf = 2;
 #endif
 		return 1;
Index: net/bridgestp.c
===================================================================
RCS file: /cvs/src/sys/net/bridgestp.c,v
retrieving revision 1.40
diff -u -p -r1.40 bridgestp.c
--- net/bridgestp.c	9 Jul 2011 04:53:33 -0000	1.40
+++ net/bridgestp.c	18 Sep 2012 09:56:00 -0000
@@ -1640,7 +1640,6 @@ void
 bstp_ifstate(void *arg)
 {
 	struct ifnet *ifp = (struct ifnet *)arg;
-	struct bridge_softc *sc;
 	struct bridge_iflist *p;
 	struct bstp_port *bp;
 	struct bstp_state *bs;
@@ -1648,16 +1647,11 @@ bstp_ifstate(void *arg)
 
 	if (ifp->if_type == IFT_BRIDGE)
 		return;
-	sc = (struct bridge_softc *)ifp->if_bridge;
 
 	s = splnet();
-	LIST_FOREACH(p, &sc->sc_iflist, next) {
-		if ((p->bif_flags & IFBIF_STP) == 0)
-			continue;
-		if (p->ifp == ifp)
-			break;
-	}
-	if (p == LIST_END(&sc->sc_iflist))
+	if ((p = (struct bridge_iflist *)ifp->if_bridgeport) == NULL)
+		goto done;
+	if ((p->bif_flags & IFBIF_STP) == 0)
 		goto done;
 	if ((bp = p->bif_stp) == NULL)
 		goto done;
@@ -2120,7 +2114,7 @@ bstp_ifsflags(struct bstp_port *bp, u_in
 int
 bstp_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
-	struct bridge_softc *sc = (struct bridge_softc *)ifp;
+	struct bridge_softc *sc = (struct bridge_softc *)ifp->if_softc;
 	struct bstp_state *bs = sc->sc_stp;
 	struct ifbrparam *ifbp = (struct ifbrparam *)data;
 	struct ifbreq *ifbr = (struct ifbreq *)data;
@@ -2137,15 +2131,8 @@ bstp_ioctl(struct ifnet *ifp, u_long cmd
 			err = ENOENT;
 			break;
 		}
-		if ((caddr_t)sc != ifs->if_bridge) {
-			err = ESRCH;
-			break;
-		}
-		LIST_FOREACH(p, &sc->sc_iflist, next) {
-			if (p->ifp == ifs)
-				break;
-		}
-		if (p == LIST_END(&sc->sc_iflist)) {
+		p = (struct bridge_iflist *)ifs->if_bridgeport;
+		if (p == NULL || p->bridge_sc != sc) {
 			err = ESRCH;
 			break;
 		}
Index: net/if.c
===================================================================
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.241
diff -u -p -r1.241 if.c
--- net/if.c	3 Jan 2012 23:41:51 -0000	1.241
+++ net/if.c	18 Sep 2012 09:56:00 -0000
@@ -531,7 +531,7 @@ if_detach(struct ifnet *ifp)
 
 #if NBRIDGE > 0
/* Remove the interface from any bridge it is 

Re: bridge loop detection

2012-03-02 Thread Camiel Dobbelaar
On 2-3-2012 15:49, Matthieu Herrb wrote:
 On Fri, Mar 02, 2012 at 03:19:34PM +0100, Camiel Dobbelaar wrote:
 I think the bridge loop detection in if_ethersubr.c can be removed.  It 
 taxes all bridge output traffic, but I don't think it ever kicks in.

 It was added in 2001 by angelos:
 http://www.openbsd.org/cgi-bin/cvsweb/src/sys/net/if_ethersubr.c.diff?r1=1.48;r2=1.49;f=h

 I'd say the following ethertypes are safe, they push packets further down 
 the stack so they cannot be bridged again: ieee80211, trunk, vlan

 tun and gre cannot be part of a bridge.
 
 While I don't understand the network stack very deeply, I for sure
 have tun interfaces (in tap mode, with link0 set) part of a bridge on
 my openvpn gateway:
 
 bridge0: flags=41<UP,RUNNING>
         groups: bridge
         priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6
         proto rstp
         tun0 flags=3<LEARNING,DISCOVER>
                 port 14 ifpriority 0 ifcost 0
         vlan4 flags=3<LEARNING,DISCOVER>
                 port 6 ifpriority 0 ifcost 0
 
 
 

Yes, you are right, I missed the layer-2 tap mode.  It looks like it
only pushes packets further down as well (so they cannot be bridged
twice).  But the tun code is a little more challenging so not 100% sure yet.

If you can try the diff on this setup, that would be nice.



vitaminstore 24x7

2012-02-06 Thread Camiel Dobbelaar
Vitaminstore is in 24x7 now.  They are allowed to contact us now when
the server is down.

They did that anyway a few times, so not a lot changes.  :-)

What we do is solve hardware and resource problems, and maybe a very
occasional iisreset.  For the rest they probably need their China devs.



Re: vitaminstore 24x7

2012-02-06 Thread Camiel Dobbelaar
Sorry everyone, wrong tech@   :-(


On 6-2-2012 9:31, Camiel Dobbelaar wrote:
 Vitaminstore is in 24x7 now.  They are allowed to contact us now when
 the server is down.
 
 They did that anyway a few times, so not a lot changes.
 
 What we do is solve hardware and resource problems, and maybe a very
 occasional iisreset.  For the rest they probably need their China devs.



Re: relayd imsg race

2012-01-02 Thread Camiel Dobbelaar
On Mon, 5 Dec 2011, Camiel Dobbelaar wrote:
  Another might be to inhibit the processing of IMSG_HOST_STATUS only until
  the configuration has been completed (that is after receiving 
  IMSG_CFG_DONE):
 
 I'm going to try this one.  I'm not sure how bad it is to discard
 messages though.

I tried it, and it does not work correctly.  Because the imsg is dropped 
while the process is marked inactive, you get desynchronized and this code 
for example still breaks:

	if (host->check_cnt != st.check_cnt) {
		log_debug("%s: host %d => %d", __func__,
		    host->conf.id, host->up);
		fatalx("pfe_dispatch_hce: desynchronized");
	}

Maybe that can be fixed, if we can assume that it's not bad to drop 
some status messages once in a while.

I tried another approach below: only start the processes if _all_ of them 
have loaded the config.  This should fix the configuration race after 
startup completely.

There's still a race while reloading though.  Some processes might still 
be active with an old config, while others may be busy purging their old 
config before loading the new one.  The right way would be to pause all 
the processes first.  But I'd say that's a separate problem.  :-)
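
The "start only when all of them have loaded the config" idea amounts to a countdown barrier in the parent. The sketch below is an assumption about how that could look, not the code from the diff: `barrier_arm()`, `barrier_ack()` and the structs are hypothetical, and only the message names echo IMSG_CFG_DONE and IMSG_CTL_START.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical message types echoing the names used in the diff. */
enum msg { MSG_CFG_DONE, MSG_CTL_START };

struct parent {
	int pending;	/* children that have not confirmed their config yet */
	int started;	/* number of start broadcasts sent so far */
};

struct child {
	bool configured;
	bool running;
};

/* Parent: expect one confirmation from each of nchildren processes. */
void
barrier_arm(struct parent *p, int nchildren)
{
	assert(nchildren > 0);
	p->pending = nchildren;
}

/*
 * Parent: a child confirmed that it has the complete config.  Returns
 * true exactly once, when the last child has confirmed and the start
 * message should be broadcast.
 */
bool
barrier_ack(struct parent *p)
{
	if (p->pending == 0)
		return false;	/* spurious confirmation, ignore */
	if (--p->pending > 0)
		return false;	/* still waiting for other children */
	p->started++;
	return true;
}

/* Child: loading the config and starting work are separate steps. */
void
child_dispatch(struct child *c, enum msg m)
{
	switch (m) {
	case MSG_CFG_DONE:
		c->configured = true;
		break;
	case MSG_CTL_START:
		if (c->configured)
			c->running = true;
		break;
	}
}
```

Because no child begins sending status messages before the broadcast, none of them can receive a host id it has not loaded yet.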

--
Cam


Index: hce.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/hce.c,v
retrieving revision 1.61
diff -u -p -r1.61 hce.c
--- hce.c	12 Nov 2011 19:36:17 -0000	1.61
+++ hce.c	2 Jan 2012 13:57:40 -0000
@@ -355,6 +355,8 @@ hce_dispatch_parent(int fd, struct privs
 		break;
 	case IMSG_CFG_DONE:
 		config_getcfg(env, imsg);
+		break;
+	case IMSG_CTL_START:
 		hce_setup_events();
 		break;
 	case IMSG_CTL_RESET:
Index: parse.y
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.159
diff -u -p -r1.159 parse.y
--- parse.y	21 Sep 2011 18:45:40 -0000	1.159
+++ parse.y	2 Jan 2012 13:57:40 -0000
@@ -2280,9 +2280,6 @@ load_config(const char *filename, struct
 		errors++;
 	}
 
-	if (TAILQ_EMPTY(conf->sc_relays))
-		conf->sc_prefork_relay = 0;
-
 	/* Cleanup relay list to inherit */
 	while ((rlay = TAILQ_FIRST(&relays)) != NULL) {
 		TAILQ_REMOVE(&relays, rlay, rl_entry);
Index: pfe.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/pfe.c,v
retrieving revision 1.71
diff -u -p -r1.71 pfe.c
--- pfe.c	12 Nov 2011 19:36:17 -0000	1.71
+++ pfe.c	2 Jan 2012 13:57:40 -0000
@@ -203,6 +203,8 @@ pfe_dispatch_parent(int fd, struct privs
 		config_getcfg(env, imsg);
 		init_filter(env, imsg->fd);
 		init_tables(env);
+		break;
+	case IMSG_CTL_START:
 		pfe_setup_events();
 		pfe_sync();
 		break;
Index: relay.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.143
diff -u -p -r1.143 relay.c
--- relay.c	21 Sep 2011 18:45:40 -0000	1.143
+++ relay.c	2 Jan 2012 13:57:40 -0000
@@ -2577,6 +2577,8 @@ relay_dispatch_parent(int fd, struct pri
 		break;
 	case IMSG_CFG_DONE:
 		config_getcfg(env, imsg);
+		break;
+	case IMSG_CTL_START:
 		relay_launch();
 		break;
 	case IMSG_CTL_RESET:
Index: relayd.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
retrieving revision 1.104
diff -u -p -r1.104 relayd.c
--- relayd.c	4 Sep 2011 20:26:58 -0000	1.104
+++ relayd.c	2 Jan 2012 13:57:40 -0000
@@ -49,6 +49,7 @@
 __dead void	 usage(void);
 
 int		 parent_configure(struct relayd *);
+void		 parent_configure_done(struct relayd *);
 void		 parent_reload(struct relayd *, u_int, const char *);
 void		 parent_sig_handler(int, short, void *);
 void		 parent_shutdown(struct relayd *);
@@ -292,6 +293,9 @@ parent_configure(struct relayd *env)
 	TAILQ_FOREACH(rlay, env->sc_relays, rl_entry)
 		config_setrelay(env, rlay);
 
+	/* HCE, PFE and the preforked relays need to reload their config. */
+	env->sc_reload = 2 + env->sc_prefork_relay;
+
 	for (id = 0; id < PROC_MAX; id++) {
 		if (id == privsep_process)
 			continue;
@@ -308,7 +312,6 @@ parent_configure(struct relayd *env)
 		} else
 			s = -1;
 
-		env->sc_reload++;
 		proc_compose_imsg(env->sc_ps, id, -1, IMSG_CFG_DONE, s,
 		    &cf, sizeof(cf));
 	}
@@ -354,6 +357,28 @@ parent_reload(struct relayd *env, u_int 
 }
 
 void

Re: relayd imsg race

2011-12-05 Thread Camiel Dobbelaar
On 5-12-2011 19:45, Sebastian Benoit wrote:
 I see relayd crashes like this: (1)
 fatal: relay_dispatch_pfe: invalid host id

 or like this: (2)
 fatal: pfe_dispatch_hce: invalid host id

 There is a race of the hce and the other childs (pfe and relays)
 between loading the configuration and start of processing IMSG_HOST_STATUS
 messages.
 
 The problem is that in hce_setup_events() the host checks are started before
 all childs have all of the configuration.

Yes, I experienced the same thing, see:
http://marc.info/?l=openbsd-bugs&m=132207738531052&w=2

 A quick hack is to insert a sleep(1) at the beginning of hce_setup_events().

No, that does not work.  I've seen crashes with sleeps of up to 3 seconds
on my system, and it is still a race.

 A fix might be to make 'invalid host id' non fatal:

That might lead to crashes later on, especially if the hce notifies
about new host ids that the other processes have not loaded yet.

 Another might be to inhibit the processing of IMSG_HOST_STATUS only until
 the configuration has been completed (that is after receiving IMSG_CFG_DONE):

I'm going to try this one.  I'm not sure how bad it is to discard
messages though.



Re: raise max value for tcp autosizing buffer [WAS: misc@ network tuning for high bandwidth and high latency]

2011-12-04 Thread Camiel Dobbelaar
On 4-12-2011 13:01, Sebastian Reitenbach wrote:
 the default maximum size of the tcp send and receive buffer used by the 
 autosizing algorithm is way too small, when trying to get maximum speed with 
 high bandwidth and high latency connections.

I have tweaked SB_MAX on a system too, but it was for UDP.

When running a busy Unbound resolver, the recommendation is to bump the
receive buffer to 4M or even 8M.  See
http://unbound.net/documentation/howto_optimise.html

Otherwise a lot of queries are dropped when the cache is cold.
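
To see what a given system actually grants, a program can request a receive buffer and read the value back. This is a minimal sketch using the standard socket calls; the helper name `udp_rcvbuf_granted()` is made up for this example, and how an oversized request is handled (clamped or refused) depends on the operating system.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <assert.h>
#include <unistd.h>

/*
 * Ask for a receive buffer of `want` bytes on a fresh UDP socket and
 * return the size the kernel reports afterwards, or -1 on error.
 */
int
udp_rcvbuf_granted(int want)
{
	int s, got = 0;
	socklen_t len = sizeof(got);

	assert(want > 0);
	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
		return -1;
	/* The request may fail or be clamped if it exceeds the limit. */
	(void)setsockopt(s, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want));
	if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, &got, &len) == -1)
		got = -1;
	close(s);
	return got;
}
```

Calling it with 4 * 1024 * 1024 and comparing the result with the request shows whether the compiled-in maximum is in the way.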

I don't think there's a magic value that's right for everyone, so a
sysctl would be nice.  Maybe separate ones for tcp and udp.

I know similar sysctl's have been removed recently, and that they are
sometimes abused, but I'd say we have two valid use cases now.

So I'd love some more discussion.  :-)

--
Cam



Re: use M_PROTO1 in bridge output too

2011-11-03 Thread Camiel Dobbelaar
No one interested in this one?  I have another bridge speedup diff after 
this.

On Fri, 28 Oct 2011, Camiel Dobbelaar wrote:

 M_PROTO1 is used by if_bridge on the input path.  On the output path it's 
 used now only by if_bridge for if_gif.  I think we can use it generically 
 to mark packets as processed by bridge in the output path.
 
 The diff simplifies things and avoids mtag checking and allocation so is 
 more efficient too.
 
 The old code checks if a packet has passed the _same_ bridge already, but 
 as an interface can only be a member of one bridge I think the flag is 
 sufficient.
 
 It looks like the only other user of M_PROTO1 is netbt/hci_link.c, but 
 that can be fixed if the diff is acceptable otherwise.
 
 Tested lightly in a bridge/gif setup, but could use some more testing.
 (especially with ipsec in the mix too)
 
 
 Index: if_bridge.c
 ===================================================================
 RCS file: /cvs/src/sys/net/if_bridge.c,v
 retrieving revision 1.193
 diff -u -p -r1.193 if_bridge.c
 --- if_bridge.c	4 Jul 2011 06:54:49 -0000	1.193
 +++ if_bridge.c	28 Oct 2011 17:55:04 -0000
 @@ -2813,9 +2813,8 @@ bridge_ifenqueue(struct bridge_softc *sc
  #if NGIF > 0
  	/* Packet needs etherip encapsulation. */
  	if (ifp->if_type == IFT_GIF) {
 -		m->m_flags |= M_PROTO1;
 -
  		/* Count packets input into the gif from outside */
 +		/* XXX do this in if_gif? */
  		ifp->if_ipackets++;
  		ifp->if_ibytes += m->m_pkthdr.len;
  	}
 @@ -2844,6 +2843,7 @@ bridge_ifenqueue(struct bridge_softc *sc
  	}
  #endif
  	len = m->m_pkthdr.len;
 +	m->m_flags |= M_PROTO1;
  	mflags = m->m_flags;
  	IFQ_ENQUEUE(&ifp->if_snd, m, NULL, error);
  	if (error) {
 Index: if_ethersubr.c
 ===================================================================
 RCS file: /cvs/src/sys/net/if_ethersubr.c,v
 retrieving revision 1.151
 diff -u -p -r1.151 if_ethersubr.c
 --- if_ethersubr.c	9 Jul 2011 00:47:18 -0000	1.151
 +++ if_ethersubr.c	28 Oct 2011 17:55:04 -0000
 @@ -382,40 +382,8 @@ ether_output(ifp0, m0, dst, rt0)
  	 * Interfaces that are bridge members need special handling
  	 * for output.
  	 */
 -	if (ifp->if_bridge) {
 -		struct m_tag *mtag;
 -
 -		/*
 -		 * Check if this packet has already been sent out through
 -		 * this bridge, in which case we simply send it out
 -		 * without further bridge processing.
 -		 */
 -		for (mtag = m_tag_find(m, PACKET_TAG_BRIDGE, NULL); mtag;
 -		    mtag = m_tag_find(m, PACKET_TAG_BRIDGE, mtag)) {
 -#ifdef DEBUG
 -			/* Check that the information is there */
 -			if (mtag->m_tag_len != sizeof(caddr_t)) {
 -				error = EINVAL;
 -				goto bad;
 -			}
 -#endif
 -			if (!bcmp(&ifp->if_bridge, mtag + 1, sizeof(caddr_t)))
 -				break;
 -		}
 -		if (mtag == NULL) {
 -			/* Attach a tag so we can detect loops */
 -			mtag = m_tag_get(PACKET_TAG_BRIDGE, sizeof(caddr_t),
 -			    M_NOWAIT);
 -			if (mtag == NULL) {
 -				error = ENOBUFS;
 -				goto bad;
 -			}
 -			bcopy(&ifp->if_bridge, mtag + 1, sizeof(caddr_t));
 -			m_tag_prepend(m, mtag);
 -			error = bridge_output(ifp, m, NULL, NULL);
 -			return (error);
 -		}
 -	}
 +	if (ifp->if_bridge && !(m->m_flags & M_PROTO1))
 +		return (bridge_output(ifp, m, NULL, NULL));
  #endif
  	mflags = m->m_flags;
  	len = m->m_pkthdr.len;



use M_PROTO1 in bridge output too

2011-10-28 Thread Camiel Dobbelaar
M_PROTO1 is used by if_bridge on the input path.  On the output path it's 
used now only by if_bridge for if_gif.  I think we can use it generically 
to mark packets as processed by bridge in the output path.

The diff simplifies things and avoids mtag checking and allocation so is 
more efficient too.

The old code checks if a packet has passed the _same_ bridge already, but 
as an interface can only be a member of one bridge I think the flag is 
sufficient.

It looks like the only other user of M_PROTO1 is netbt/hci_link.c, but 
that can be fixed if the diff is acceptable otherwise.

Tested lightly in a bridge/gif setup, but could use some more testing.
(especially with ipsec in the mix too)
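
The flag-based approach can be sketched outside the kernel like this. The types and the `PKT_BRIDGED` bit are hypothetical stand-ins for the mbuf and M_PROTO1; the point is only that a single "already bridged" bit is enough to stop the output path from re-entering the bridge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PKT_BRIDGED 0x1u	/* hypothetical bit standing in for M_PROTO1 */

struct pkt {
	unsigned flags;
	int bridge_passes;	/* times the bridge processed this packet */
	int wire_sends;		/* times it was queued for transmission */
};

struct nif {
	bool in_bridge;
};

void if_output(struct nif *, struct pkt *);

/* Bridge processing: mark the packet, then hand it back for output. */
void
bridge_out(struct nif *ifp, struct pkt *m)
{
	m->bridge_passes++;
	m->flags |= PKT_BRIDGED;
	if_output(ifp, m);
}

/* Interface output: divert to the bridge only if not marked yet. */
void
if_output(struct nif *ifp, struct pkt *m)
{
	assert(ifp != NULL && m != NULL);
	if (ifp->in_bridge && !(m->flags & PKT_BRIDGED)) {
		bridge_out(ifp, m);
		return;
	}
	m->wire_sends++;
}
```

Compared with the tag list, there is no allocation that can fail and no search on every packet.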


Index: if_bridge.c
===================================================================
RCS file: /cvs/src/sys/net/if_bridge.c,v
retrieving revision 1.193
diff -u -p -r1.193 if_bridge.c
--- if_bridge.c	4 Jul 2011 06:54:49 -0000	1.193
+++ if_bridge.c	28 Oct 2011 17:55:04 -0000
@@ -2813,9 +2813,8 @@ bridge_ifenqueue(struct bridge_softc *sc
 #if NGIF > 0
 	/* Packet needs etherip encapsulation. */
 	if (ifp->if_type == IFT_GIF) {
-		m->m_flags |= M_PROTO1;
-
 		/* Count packets input into the gif from outside */
+		/* XXX do this in if_gif? */
 		ifp->if_ipackets++;
 		ifp->if_ibytes += m->m_pkthdr.len;
 	}
@@ -2844,6 +2843,7 @@ bridge_ifenqueue(struct bridge_softc *sc
 	}
 #endif
 	len = m->m_pkthdr.len;
+	m->m_flags |= M_PROTO1;
 	mflags = m->m_flags;
 	IFQ_ENQUEUE(&ifp->if_snd, m, NULL, error);
 	if (error) {
Index: if_ethersubr.c
===================================================================
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.151
diff -u -p -r1.151 if_ethersubr.c
--- if_ethersubr.c	9 Jul 2011 00:47:18 -0000	1.151
+++ if_ethersubr.c	28 Oct 2011 17:55:04 -0000
@@ -382,40 +382,8 @@ ether_output(ifp0, m0, dst, rt0)
 	 * Interfaces that are bridge members need special handling
 	 * for output.
 	 */
-	if (ifp->if_bridge) {
-		struct m_tag *mtag;
-
-		/*
-		 * Check if this packet has already been sent out through
-		 * this bridge, in which case we simply send it out
-		 * without further bridge processing.
-		 */
-		for (mtag = m_tag_find(m, PACKET_TAG_BRIDGE, NULL); mtag;
-		    mtag = m_tag_find(m, PACKET_TAG_BRIDGE, mtag)) {
-#ifdef DEBUG
-			/* Check that the information is there */
-			if (mtag->m_tag_len != sizeof(caddr_t)) {
-				error = EINVAL;
-				goto bad;
-			}
-#endif
-			if (!bcmp(&ifp->if_bridge, mtag + 1, sizeof(caddr_t)))
-				break;
-		}
-		if (mtag == NULL) {
-			/* Attach a tag so we can detect loops */
-			mtag = m_tag_get(PACKET_TAG_BRIDGE, sizeof(caddr_t),
-			    M_NOWAIT);
-			if (mtag == NULL) {
-				error = ENOBUFS;
-				goto bad;
-			}
-			bcopy(&ifp->if_bridge, mtag + 1, sizeof(caddr_t));
-			m_tag_prepend(m, mtag);
-			error = bridge_output(ifp, m, NULL, NULL);
-			return (error);
-		}
-	}
+	if (ifp->if_bridge && !(m->m_flags & M_PROTO1))
+		return (bridge_output(ifp, m, NULL, NULL));
 #endif
 	mflags = m->m_flags;
 	len = m->m_pkthdr.len;



carp destroy

2011-10-21 Thread Camiel Dobbelaar
Destroying a carp interface does not restore the demote count of the 
carp group.

Reason is that the interface is removed from the carp group by 
if_clone_destroy() before carp_clone_destroy() is run.  The second reason 
is a simple bug introduced in ip_carp.c, rev 1.175.

The diff removes if_delgroup() from if_clone_destroy().  This is possible 
because if_detach() that is run later on removes the interface from all 
the groups as well.  This seems to work fine.  I cannot deduce from the 
CVS history why it was added to if_clone_destroy...  can anyone 
remember?  (Henning?)
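
The sign bug is easy to see in a reduced model. The structs and functions below are hypothetical and much simpler than ip_carp.c; they only show that an interface has to give back the negative of its own contribution when it detaches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an interface group and a carp interface. */
struct group {
	int demote;	/* sum of all members' contributions */
};

struct vif {
	struct group *grp;
	int contrib;	/* what this interface has added to grp->demote */
};

/* Apply adj to the group counter and record this interface's share. */
void
demote_adj(struct vif *v, int adj)
{
	assert(v != NULL && v->grp != NULL);
	v->grp->demote += adj;
	v->contrib += adj;
}

/* Detach: return exactly what this interface contributed. */
void
vif_detach(struct vif *v)
{
	if (v->contrib)
		demote_adj(v, -v->contrib);	/* note the minus sign */
}
```

Passing the positive count on detach, as the old code did, would double the contribution instead of removing it.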

After the diff (fxp1 has no carrier on purpose):

camield@rifraf $ ifconfig fxp1
fxp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	lladdr 00:d0:b7:47:3c:07
	priority: 0
	media: Ethernet autoselect (none)
	status: no carrier
	inet 10.38.38.10 netmask 0xffffff00 broadcast 10.38.38.255
	inet6 fe80::2d0:b7ff:fe47:3c07%fxp1 prefixlen 64 scopeid 0x3
camield@rifraf $ sudo ifconfig carp11 vhid 11 carpdev fxp1
camield@rifraf $ ifconfig carp11
carp11: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> mtu 1500
	lladdr 00:00:5e:00:01:0b
	priority: 0
	carp: INIT carpdev fxp1 vhid 11 advbase 1 advskew 0
	groups: carp
	inet6 fe80::200:5eff:fe00:10b%carp11 prefixlen 64 scopeid 0x6
camield@rifraf $ ifconfig -g carp
carp: carp demote count 1
camield@rifraf $ sudo ifconfig carp11 destroy
camield@rifraf $ ifconfig -g carp
carp: carp demote count 0
camield@rifraf $ tail -2 /var/log/messages
Oct 21 13:48:25 rifraf /bsd: carp: carp11 demoted group carp by 1 to 1 (carpdev)
Oct 21 13:48:33 rifraf /bsd: carp: carp11 demoted group carp by -1 to 0 (detach)


Index: net/if.c
===================================================================
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.239
diff -u -p -r1.239 if.c
--- net/if.c	9 Jul 2011 00:47:18 -0000	1.239
+++ net/if.c	21 Oct 2011 08:23:04 -0000
@@ -712,7 +712,7 @@ if_clone_destroy(const char *name)
 {
 	struct if_clone *ifc;
 	struct ifnet *ifp;
-	int s, ret;
+	int s;
 
 	ifc = if_clone_lookup(name, NULL);
 	if (ifc == NULL)
@@ -731,12 +731,7 @@ if_clone_destroy(const char *name)
 		splx(s);
 	}
 
-	if_delgroup(ifp, ifc->ifc_name);
-
-	if ((ret = (*ifc->ifc_destroy)(ifp)) != 0)
-		if_addgroup(ifp, ifc->ifc_name);
-
-	return (ret);
+	return ((*ifc->ifc_destroy)(ifp));
 }
 
 /*
Index: netinet/ip_carp.c
===================================================================
RCS file: /cvs/src/sys/netinet/ip_carp.c,v
retrieving revision 1.191
diff -u -p -r1.191 ip_carp.c
--- netinet/ip_carp.c	16 Oct 2011 21:07:19 -0000	1.191
+++ netinet/ip_carp.c	21 Oct 2011 08:23:04 -0000
@@ -980,7 +980,7 @@ carpdetach(struct carp_softc *sc)
 	carp_del_all_timeouts(sc);
 
 	if (sc->sc_demote_cnt)
-		carp_group_demote_adj(&sc->sc_if, sc->sc_demote_cnt, "detach");
+		carp_group_demote_adj(&sc->sc_if, -sc->sc_demote_cnt, "detach");
 	sc->sc_suppress = 0;
 	sc->sc_sendad_errors = 0;



more sunix puc cards

2011-10-21 Thread Camiel Dobbelaar
Adapted from NetBSD, which in turn got the id's from Linux.

Not sure about the PUC_MAX_PORTS bump from 8 to 16 (and the comment), it 
grows pucdata.o from 13k to 23k (on i386) for just one card.


Index: sys/dev/pci/pcidevs
===================================================================
RCS file: /cvs/src/sys/dev/pci/pcidevs,v
retrieving revision 1.1624
diff -u -r1.1624 pcidevs
--- sys/dev/pci/pcidevs	9 Oct 2011 21:39:11 -0000	1.1624
+++ sys/dev/pci/pcidevs	11 Oct 2011 06:57:54 -0000
@@ -320,6 +320,7 @@
 vendor	SYMPHONY2	0x1c1c	Symphony Labs
 vendor	TEKRAM2		0x1de1	Tekram
 vendor	TEHUTI		0x1fc9	Tehuti Networks
+vendor	SUNIX2		0x1fd4	Sunix
 vendor	HINT		0x3388	Hint
 vendor	3DLABS		0x3d3d	3D Labs
 vendor	AVANCE2		0x4005	Avance Logic
@@ -5230,6 +5231,7 @@
 /* Sunix */
 product	SUNIX 40XX	0x7168	40XX
 product	SUNIX 4018A	0x7268	4018A
+product	SUNIX2 50XX	0x1999	50XX
 
 /* Surecom products */
 product	SURECOM NE34	0x0e34	NE-34
Index: sys/dev/pci/pucdata.c
===
RCS file: /cvs/src/sys/dev/pci/pucdata.c,v
retrieving revision 1.75
diff -u -r1.75 pucdata.c
--- sys/dev/pci/pucdata.c   9 Oct 2011 21:46:32 -   1.75
+++ sys/dev/pci/pucdata.c   11 Oct 2011 06:57:54 -
@@ -1463,6 +1463,120 @@
},
 
/*
+* SUNIX 50XX series of serial/parallel combo cards.
+* Tested with 5066A.
+*/
+   {   /* SUNIX 5008 1P */
+   {   PCI_VENDOR_SUNIX2, PCI_PRODUCT_SUNIX2_50XX, 0x1fd4, 0x0100 },
+   {   0xffff, 0xffff, 0xffff, 0xeff0 },
+   {
+   { PUC_PORT_TYPE_LPT, 0x14, 0x00, 0x00 },
+   },
+   },
+
+   {   /* SUNIX 5016 16S */
+   {   PCI_VENDOR_SUNIX2, PCI_PRODUCT_SUNIX2_50XX, 0x1fd4, 0x0010 },
+   {   0xffff, 0xffff, 0xffff, 0xffff },
+   {
+   { PUC_PORT_TYPE_COM, 0x10, 0x00, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x08, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x10, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x18, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x00, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x08, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x10, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x18, COM_FREQ * 8 },
+/*
+ * PUC_MAX_PORTS needs to be raised in order to reach these ports
+ */
+#if PUC_MAX_PORTS >= 16
+   { PUC_PORT_TYPE_COM, 0x14, 0x20, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x28, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x30, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x38, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x40, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x48, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x50, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x58, COM_FREQ * 8 },
+#endif /* PUC_MAX_PORTS >= 16 */
+   },
+   },
+
+   {   /* SUNIX 5027 1S */
+   {   PCI_VENDOR_SUNIX2, PCI_PRODUCT_SUNIX2_50XX, 0x1fd4, 0x0001 },
+   {   0xffff, 0xffff, 0xffff, 0xffff },
+   {
+   { PUC_PORT_TYPE_COM, 0x10, 0x00, COM_FREQ * 8 },
+   },
+   },
+
+   {   /* SUNIX 5037 2S */
+   {   PCI_VENDOR_SUNIX2, PCI_PRODUCT_SUNIX2_50XX, 0x1fd4, 0x0002 },
+   {   0xffff, 0xffff, 0xffff, 0xffff },
+   {
+   { PUC_PORT_TYPE_COM, 0x10, 0x00, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x08, COM_FREQ * 8 },
+   },
+   },
+
+   {   /* SUNIX 5056 4S */
+   {   PCI_VENDOR_SUNIX2, PCI_PRODUCT_SUNIX2_50XX, 0x1fd4, 0x0004 },
+   {   0xffff, 0xffff, 0xffff, 0xffff },
+   {
+   { PUC_PORT_TYPE_COM, 0x10, 0x00, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x08, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x10, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x18, COM_FREQ * 8 },
+   },
+   },
+
+   {   /* SUNIX 5066 8S */
+   {   PCI_VENDOR_SUNIX2, PCI_PRODUCT_SUNIX2_50XX, 0x1fd4, 0x0008 },
+   {   0xffff, 0xffff, 0xffff, 0xffff },
+   {
+   { PUC_PORT_TYPE_COM, 0x10, 0x00, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x08, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x10, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x10, 0x18, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x00, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x08, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x10, COM_FREQ * 8 },
+   { PUC_PORT_TYPE_COM, 0x14, 0x18, COM_FREQ * 8 },
+   },
+   },
+
+   {   /* SUNIX 5069 1S / 1P */
+   {   

Re: rdr-to ::1

2011-08-02 Thread Camiel Dobbelaar
On 1-8-2011 23:59, Alexander Bluhm wrote:
> On Wed, Jul 27, 2011 at 12:44:21AM +0200, Alexander Bluhm wrote:
>> On Fri, May 20, 2011 at 11:54:09AM +0200, Camiel Dobbelaar wrote:
>>> I'll spend some more time on this, but maybe there's an IPv6 guru that
>>> can lend a hand?  :-)
>>
>> Just removing the check seems wrong to me.  This would allow ::1
>> addresses from the wire.  Also the goto hbhcheck would get lost.
>
> I have reconsidered the existing loopback check in ip6_input().  It
> is wrong.  The check that ::1 is not allowed from the wire must be
> before pf_test().  Otherwise pf could reroute or redirect such a
> packet.
>
> KAME moved the check in rev 1.189 of their ip6_input.c.  They also
> removed the special goto ours logic for ::1.  I do not change that
> now before release so leave the goto where it is.
>
> Redirect or nat to ::1 should work with this diff.  But I still
> believe that divert-to is more suitable for that.
>
> ok?


Fixes the problem for me.

And looks correct according to that KAME rev.

(and I agree with the remark about divert-to; I'll prepare an IPv6
example for the ftp-proxy manpage)
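
For readers following along, the rule under discussion can be sketched in a few lines of C. This is an illustration only, not the committed code: toy_loopback_ok and its rcvif_is_loopback argument are hypothetical, and the sketch assumes the check runs on the packet as received, before any packet filter has had a chance to rewrite the destination.

```c
#include <netinet/in.h>

/*
 * Return 1 if the packet may go on to the packet filter, 0 if it must
 * be dropped first.  A loopback source or destination is acceptable
 * only when the packet arrived on a loopback interface.
 */
static int
toy_loopback_ok(const struct in6_addr *src, const struct in6_addr *dst,
    int rcvif_is_loopback)
{
	if (IN6_IS_ADDR_LOOPBACK(src) || IN6_IS_ADDR_LOOPBACK(dst))
		return rcvif_is_loopback;
	return 1;
}
```

Because the test looks at the addresses as they came off the wire, a later redirect to ::1 no longer trips it, while a forged ::1 packet from the network is still rejected.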

--
Cam



svlan taghash

2011-07-04 Thread Camiel Dobbelaar
When svlan(4) was introduced, it got its own taghash in if_vlan.c.
This wasn't necessary as etype was already checked in the old hash
lookup.

So simplify the code again, and use the savings for some extra hash 
buckets.  :-)

It survives this stacked test between i386 and sparc64 (use .1 on one system 
and .2 on the other, then ping all IPs):

#!/bin/sh
VLANDEV=fxp0
IP4=1

ifconfig svlan100 vlan 100 vlanprio 1 vlandev $VLANDEV \
10.10.101.$IP4 netmask 255.255.255.0
ifconfig svlan200 vlan 200 vlanprio 2 vlandev svlan100 \
10.10.102.$IP4 netmask 255.255.255.0
ifconfig vlan300  vlan 300 vlanprio 3 vlandev svlan200 \
10.10.103.$IP4 netmask 255.255.255.0
ifconfig svlan300 vlan 300 vlanprio 4 vlandev vlan300 \
10.10.104.$IP4 netmask 255.255.255.0
ifconfig vlan200  vlan 200 vlanprio 5 vlandev svlan300 \
10.10.105.$IP4 netmask 255.255.255.0
ifconfig vlan100  vlan 100 vlanprio 6 vlandev vlan200 \
10.10.106.$IP4 netmask 255.255.255.0

tcpdump:

19:53:17.566551 QinQ svid 100 pri 1 QinQ svid 200 pri 2 802.1Q vid 300 pri 
3 QinQ svid 300 pri 4 802.1Q vid 200 pri 5 802.1Q vid 100 pri 6 
10.10.106.1 > 10.10.106.2: icmp: echo request

19:53:17.566636 QinQ svid 100 pri 1 QinQ svid 200 pri 2 802.1Q vid 300 pri 
3 QinQ svid 300 pri 4 802.1Q vid 200 pri 5 802.1Q vid 100 pri 6 
10.10.106.2 > 10.10.106.1: icmp: echo reply


--
Cam


Index: if_vlan.c
===
RCS file: /cvs/src/sys/net/if_vlan.c,v
retrieving revision 1.87
diff -u -r1.87 if_vlan.c
--- if_vlan.c   18 Feb 2011 17:06:45 -  1.87
+++ if_vlan.c   4 Jul 2011 18:33:47 -
@@ -77,11 +77,11 @@
 #include <net/if_vlan_var.h>
 
 extern struct	ifaddr	**ifnet_addrs;
-u_long vlan_tagmask, svlan_tagmask;
+u_long vlan_tagmask;
 
-#define TAG_HASH_SIZE		32
+#define TAG_HASH_SIZE		64
 #define TAG_HASH(tag)		(tag & vlan_tagmask)
-LIST_HEAD(vlan_taghash, ifvlan)	*vlan_tagh, *svlan_tagh;
+LIST_HEAD(vlan_taghash, ifvlan)	*vlan_tagh;
 
 void	vlan_start(struct ifnet *ifp);
 int	vlan_ioctl(struct ifnet *ifp, u_long cmd, caddr_t addr);
@@ -107,18 +107,12 @@
 void
 vlanattach(int count)
 {
-	/* Normal VLAN */
 	vlan_tagh = hashinit(TAG_HASH_SIZE, M_DEVBUF, M_NOWAIT,
 	    &vlan_tagmask);
 	if (vlan_tagh == NULL)
 		panic("vlanattach: hashinit");
-	if_clone_attach(&vlan_cloner);
 
-	/* Service-VLAN for QinQ/802.1ad provider bridges */
-	svlan_tagh = hashinit(TAG_HASH_SIZE, M_DEVBUF, M_NOWAIT,
-	    &svlan_tagmask);
-	if (svlan_tagh == NULL)
-		panic("vlanattach: hashinit");
+	if_clone_attach(&vlan_cloner);
 	if_clone_attach(&svlan_cloner);
 }
 
@@ -277,13 +271,11 @@
 {
 	struct ifvlan *ifv;
 	struct ifnet *ifp = m->m_pkthdr.rcvif;
-	struct vlan_taghash *tagh;
 	u_int tag;
 	u_int16_t etype;
 
 	if (m->m_flags & M_VLANTAG) {
 		etype = ETHERTYPE_VLAN;
-		tagh = vlan_tagh;
 		tag = EVL_VLANOFTAG(m->m_pkthdr.ether_vtag);
 	} else {
 		if (m->m_len < EVL_ENCAPLEN &&
@@ -293,11 +285,10 @@
 		}
 
 		etype = ntohs(eh->ether_type);
-		tagh = etype == ETHERTYPE_QINQ ? svlan_tagh : vlan_tagh;
 		tag = EVL_VLANOFTAG(ntohs(*mtod(m, u_int16_t *)));
 	}
 
-	LIST_FOREACH(ifv, &tagh[TAG_HASH(tag)], ifv_list) {
+	LIST_FOREACH(ifv, &vlan_tagh[TAG_HASH(tag)], ifv_list) {
 		if (m->m_pkthdr.rcvif == ifv->ifv_p && tag == ifv->ifv_tag &&
 		    etype == ifv->ifv_type)
 			break;
@@ -359,7 +350,6 @@
 {
 	struct ifaddr *ifa1, *ifa2;
 	struct sockaddr_dl *sdl1, *sdl2;
-	struct vlan_taghash *tagh;
 	u_int flags;
 	int s;
 
@@ -449,8 +439,7 @@
 
 	ifv->ifv_tag = tag;
 	s = splnet();
-	tagh = ifv->ifv_type == ETHERTYPE_QINQ ? svlan_tagh : vlan_tagh;
-	LIST_INSERT_HEAD(&tagh[TAG_HASH(tag)], ifv, ifv_list);
+	LIST_INSERT_HEAD(&vlan_tagh[TAG_HASH(tag)], ifv, ifv_list);
 
 	/* Register callback for physical link state changes */
 	ifv->lh_cookie = hook_establish(p->if_linkstatehooks, 1,



Re: svlan taghash

2011-07-04 Thread Camiel Dobbelaar
On Mon, 4 Jul 2011, Camiel Dobbelaar wrote:
 When svlan(4) was introduced, it got its own taghash in if_vlan.c.
 This wasn't necessary as etype was already checked in the old hash
 lookup.

Hmmm, etype wasn't checked in rev 1.83 so it looks like my memory is 
shady...  I also see a comment about ID conflicts so my diff is probably 
wrong.  Disregard please.
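
For reference, the idea behind the withdrawn diff (one table, with vlan and svlan entries told apart by EtherType during lookup) looks roughly like the sketch below. It is not a claim that the diff was right: the toy_ names are invented, the bucket chain is a plain singly linked list rather than the kernel's LIST macros, and 0x8100 and 0x88a8 are the 802.1Q and QinQ EtherTypes.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_SIZE	64			/* must be a power of two */
#define TOY_HASH(tag)	((tag) & (TOY_HASH_SIZE - 1))

struct toy_vlan {
	struct toy_vlan	*next;		/* bucket chain */
	const void	*parent;	/* parent interface */
	uint16_t	 tag;		/* VLAN ID, 0-4095 */
	uint16_t	 etype;		/* 0x8100 (802.1Q) or 0x88a8 (QinQ) */
};

static struct toy_vlan *toy_buckets[TOY_HASH_SIZE];

/* Insert at the head of the bucket selected by the tag. */
static void
toy_insert(struct toy_vlan *v)
{
	v->next = toy_buckets[TOY_HASH(v->tag)];
	toy_buckets[TOY_HASH(v->tag)] = v;
}

/* Match on parent, tag and EtherType so both kinds can share buckets. */
static struct toy_vlan *
toy_lookup(const void *parent, uint16_t tag, uint16_t etype)
{
	struct toy_vlan *v;

	for (v = toy_buckets[TOY_HASH(tag)]; v != NULL; v = v->next)
		if (v->parent == parent && v->tag == tag &&
		    v->etype == etype)
			return v;
	return NULL;
}
```

Sharing the table is only safe if every lookup and every conflict check compares the EtherType as well, which is exactly the point in doubt above.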



bridge interface search

2011-06-28 Thread Camiel Dobbelaar
This diff changes the if_bridge pointer of an interface (struct ifnet) to 
not point to the bridge but to its own bridge interface configuration.
Should be safe because an interface can only be part of one bridge.

This way all the LIST_FOREACH linear searches in the bridge code can be 
replaced.  There are also two of those in the forwarding path, so this diff 
should make the bridge faster, especially with lots of interfaces.
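
A stripped-down sketch of the change, assuming nothing beyond what is described above: each interface points at its own port record, and the port record points back at its bridge. All toy_ names are invented for the illustration.

```c
#include <stddef.h>

struct toy_bridge;
struct toy_ifnet;

/* Per-port configuration, owned by exactly one bridge. */
struct toy_port {
	struct toy_bridge	*bridge;	/* owning bridge */
	struct toy_port		*next;		/* bridge's port list */
	struct toy_ifnet	*ifp;		/* member interface */
};

struct toy_ifnet {
	struct toy_port		*bridge_port;	/* NULL when not bridged */
};

struct toy_bridge {
	struct toy_port		*ports;
};

/* Old style: walk the bridge's list until the interface turns up. */
static struct toy_port *
toy_find_linear(struct toy_bridge *br, struct toy_ifnet *ifp)
{
	struct toy_port *p;

	for (p = br->ports; p != NULL; p = p->next)
		if (p->ifp == ifp)
			return p;
	return NULL;
}

/* New style: follow the interface's pointer and check the owner. */
static struct toy_port *
toy_find_direct(struct toy_bridge *br, struct toy_ifnet *ifp)
{
	struct toy_port *p = ifp->bridge_port;

	if (p == NULL || p->bridge != br)
		return NULL;
	return p;
}
```

The direct version costs the same no matter how many ports the bridge has, and the owner check keeps the old "not a member of this bridge" error case.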

I've renamed it to if_bridge_port to smoke out all users and because 
it's clearer.  (my fingers itch to rename bridge_iflist too as noted in 
the diff :-) )

Most of the diff is mechanical.  The if_ether.c change got pretty hairy 
though and could use some more eyes.

And the whole thing could use some substantial testing...



Index: sys/dev/isa/if_ie.c
===
RCS file: /cvs/src/sys/dev/isa/if_ie.c,v
retrieving revision 1.35
diff -u -r1.35 if_ie.c
--- sys/dev/isa/if_ie.c 28 Nov 2008 02:44:17 -  1.35
+++ sys/dev/isa/if_ie.c 24 Jun 2011 18:29:18 -
@@ -1054,16 +1054,16 @@
 */
 #if NBPFILTER > 0
 	*to_bpf = (sc->sc_arpcom.ac_if.if_bpf != 0) ||
-	    (sc->sc_arpcom.ac_if.if_bridge != NULL);
+	    (sc->sc_arpcom.ac_if.if_bridge_port != NULL);
 #else
-	*to_bpf = (sc->sc_arpcom.ac_if.if_bridge != NULL);
+	*to_bpf = (sc->sc_arpcom.ac_if.if_bridge_port != NULL);
 #endif
 	/* If for us, accept and hand up to BPF */
 	if (ether_equal(eh->ether_dhost, sc->sc_arpcom.ac_enaddr))
 		return 1;
 
 #if NBPFILTER > 0
-	if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge == NULL)
+	if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge_port == NULL)
 		*to_bpf = 2; /* we don't need to see it */
 #endif
 
@@ -1095,9 +1095,9 @@
 	 */
 #if NBPFILTER > 0
 	*to_bpf = (sc->sc_arpcom.ac_if.if_bpf != 0) ||
-	    (sc->sc_arpcom.ac_if.if_bridge != NULL);
+	    (sc->sc_arpcom.ac_if.if_bridge_port != NULL);
 #else
-	*to_bpf = (sc->sc_arpcom.ac_if.if_bridge != NULL);
+	*to_bpf = (sc->sc_arpcom.ac_if.if_bridge_port != NULL);
 #endif
 	/* We want to see multicasts. */
 	if (eh->ether_dhost[0] & 1)
@@ -1109,7 +1109,7 @@
 
 	/* Anything else goes to BPF but nothing else. */
 #if NBPFILTER > 0
-	if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge == NULL)
+	if (*to_bpf && sc->sc_arpcom.ac_if.if_bridge_port == NULL)
 		*to_bpf = 2;
 #endif
 	return 1;
Index: sys/net/bridgestp.c
===
RCS file: /cvs/src/sys/net/bridgestp.c,v
retrieving revision 1.39
diff -u -r1.39 bridgestp.c
--- sys/net/bridgestp.c 20 Nov 2010 14:23:09 -  1.39
+++ sys/net/bridgestp.c 24 Jun 2011 18:29:18 -
@@ -1644,7 +1644,7 @@
 
 	if (ifp->if_type == IFT_BRIDGE)
 		return;
-	sc = (struct bridge_softc *)ifp->if_bridge;
+	sc = ((struct bridge_iflist *)ifp->if_bridge_port)->bridge_sc;
 
 	s = splnet();
 	LIST_FOREACH(p, &sc->sc_iflist, next) {
@@ -2133,15 +2133,8 @@
 			err = ENOENT;
 			break;
 		}
-		if ((caddr_t)sc != ifs->if_bridge) {
-			err = ESRCH;
-			break;
-		}
-		LIST_FOREACH(p, &sc->sc_iflist, next) {
-			if (p->ifp == ifs)
-				break;
-		}
-		if (p == LIST_END(&sc->sc_iflist)) {
+		p = (struct bridge_iflist *)ifs->if_bridge_port;
+		if (p == NULL || p->bridge_sc != sc) {
 			err = ESRCH;
 			break;
 		}
Index: sys/net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.234
diff -u -r1.234 if.c
--- sys/net/if.c13 Mar 2011 15:31:41 -  1.234
+++ sys/net/if.c24 Jun 2011 18:29:18 -
@@ -531,7 +531,7 @@
 
 #if NBRIDGE > 0
 	/* Remove the interface from any bridge it is part of.  */
-	if (ifp->if_bridge)
+	if (ifp->if_bridge_port)
 		bridge_ifdetach(ifp);
 #endif
 
@@ -1101,7 +1101,7 @@
 		carp_carpdev_state(ifp);
 #endif
 #if NBRIDGE > 0
-	if (ifp->if_bridge)
+	if (ifp->if_bridge_port)
 		bstp_ifstate(ifp);
 #endif
 	rt_ifmsg(ifp);
@@ -1137,7 +1137,7 @@
 		carp_carpdev_state(ifp);
 #endif
 #if NBRIDGE > 0
-	if (ifp->if_bridge)
+	if (ifp->if_bridge_port)
 		bstp_ifstate(ifp);
 #endif
 	rt_ifmsg(ifp);
Index: sys/net/if.h
===
RCS file: /cvs/src/sys/net/if.h,v
retrieving revision 1.122
diff -u -r1.122 if.h
--- 

mark arp broadcasts in the mbuf

2011-06-26 Thread Camiel Dobbelaar
Mark ARP request broadcasts as such in the mbuf flags.  FreeBSD and NetBSD 
both have this.

Without this, bridge_output() drops ARP request broadcasts on interfaces 
without the discover flag:

	if ((p->bif_flags & IFBIF_DISCOVER) == 0 &&
	    (m->m_flags & (M_BCAST | M_MCAST)) == 0)
		continue;
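
The effect of that test can be tried in isolation. In the sketch below the flag values are placeholders picked for the example, not the kernel's definitions, and toy_skip_port is a hypothetical helper that returns 1 where the loop above would hit the continue.

```c
/* Placeholder flag values for illustration; not the kernel's. */
#define TOY_M_BCAST		0x0100
#define TOY_M_MCAST		0x0200
#define TOY_IFBIF_DISCOVER	0x0002

/* Return 1 if the bridge port would be skipped for this packet. */
static int
toy_skip_port(unsigned int port_flags, unsigned int mbuf_flags)
{
	return (port_flags & TOY_IFBIF_DISCOVER) == 0 &&
	    (mbuf_flags & (TOY_M_BCAST | TOY_M_MCAST)) == 0;
}
```

An ARP request that leaves without the broadcast flag falls into the first case on a non-discover port and is dropped; setting M_BCAST before if_output moves it into the second.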


Index: if_ether.c
===
RCS file: /cvs/src/sys/netinet/if_ether.c,v
retrieving revision 1.88
diff -u -r1.88 if_ether.c
--- if_ether.c  22 Jul 2010 00:41:55 -  1.88
+++ if_ether.c  26 Jun 2011 17:30:29 -
@@ -359,6 +359,7 @@
 	bcopy((caddr_t)tip, (caddr_t)ea->arp_tpa, sizeof(ea->arp_tpa));
 	sa.sa_family = pseudo_AF_HDRCMPLT;
 	sa.sa_len = sizeof(sa);
+	m->m_flags |= M_BCAST;
 	(*ifp->if_output)(ifp, m, &sa, (struct rtentry *)0);
 }
 
@@ -994,6 +995,7 @@
 	    sizeof(ea->arp_tha));
 	sa.sa_family = pseudo_AF_HDRCMPLT;
 	sa.sa_len = sizeof(sa);
+	m->m_flags |= M_BCAST;
 	ifp->if_output(ifp, m, &sa, (struct rtentry *)0);
 }



ifconfig vlan diff

2011-06-26 Thread Camiel Dobbelaar
vlandev (parent) does not need to be physical, and can be changed on the 
fly now.


Index: ifconfig.8
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
retrieving revision 1.216
diff -u -r1.216 ifconfig.8
--- ifconfig.8  13 Mar 2011 21:24:20 -  1.216
+++ ifconfig.8  21 Jun 2011 13:02:03 -
@@ -1487,11 +1487,11 @@
 vlan header for packets sent from the vlan interface.
 This value cannot be changed once it is set for an interface.
 .It Cm vlandev Ar parent-interface
-Associate with physical interface
-.Ar iface .
+Associate with interface
+.Ar parent-interface .
 Packets transmitted through the vlan interface will be
-diverted to the specified physical interface
-.Ar iface
+diverted to the specified interface
+.Ar parent-interface
 with 802.1Q vlan encapsulation.
 Packets with 802.1Q encapsulation received
 by the parent interface with the correct vlan tag will be diverted to
@@ -1506,12 +1506,8 @@
 the interface name, for instance
 .Cm vlan5
 will be assigned 802.1Q tag 5.
-If the vlan interface already has
-a physical interface associated with it, this command will fail.
-To change the association to another physical interface, the existing
-association must be cleared first.
 .It Fl vlandev
-Disassociate from the physical interface.
+Disassociate from the parent interface.
 This breaks the link between the vlan interface and its parent,
 clears its vlan tag, flags, and link address, and shuts the interface down.
 .It Cm vlanprio Ar vlan-priority



rdr-to ::1

2011-05-20 Thread Camiel Dobbelaar
inet6 pf rules that rdr-to ::1 do not currently work.  Matching
packets just disappear, and the "packets that violated scope rules"
counter in the output of "netstat -s -p ip6" gets incremented.

It came up before on misc@:
http://marc.info/?t=12680425912r=1w=2

The attached diff removes the check (let's call it check #1) that drops
the packet.  This is just to point out where the problem is because the
BSD's have diverged here:

FreeBSD _replaced_ check #1 with another check (#2) in this diff from
2004 (from kame):
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/ip6_input.c.diff?r1=1.68;r2=1.69

NetBSD _replaced_ check #1 with something totally different
(attributed to jinmei@kame):
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/ip6_input.c?rev=1.81&content-type=text/x-cvsweb-markup

Check #2 was _added_ to OpenBSD in 2006 (attributed to jinmei@kame):
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet6/ip6_input.c.diff?r1=1.72;r2=1.73

Basically, check #1 is gone in FreeBSD and NetBSD and the diff syncs us
closer to FreeBSD.

I'm unsure if it's the right thing to do though for a few reasons. The
FreeBSD diff has a second part that I cannot yet tell is related or not.
 Or maybe the NetBSD diff could be better.  And OpenBSD also seems to
have other checks in this area.

I'll spend some more time on this, but maybe there's an IPv6 guru that
can lend a hand?  :-)

--
Cam




Index: ip6_input.c
===
RCS file: /cvs/src/sys/netinet6/ip6_input.c,v
retrieving revision 1.99
diff -u -r1.99 ip6_input.c
--- ip6_input.c 3 Apr 2011 13:56:05 -   1.99
+++ ip6_input.c 20 May 2011 09:30:14 -
@@ -270,7 +270,6 @@
 		in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr);
 		goto bad;
 	}
-
 	if (IN6_IS_ADDR_MC_INTFACELOCAL(&ip6->ip6_dst) &&
 	    !(m->m_flags & M_LOOP)) {
 		/*
@@ -340,19 +339,6 @@
 	ip6 = mtod(m, struct ip6_hdr *);
 	srcrt = !IN6_ARE_ADDR_EQUAL(&odst, &ip6->ip6_dst);
 #endif
-
-	if (IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) ||
-	    IN6_IS_ADDR_LOOPBACK(&ip6->ip6_dst)) {
-		if (m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) {
-			ours = 1;
-			deliverifp = m->m_pkthdr.rcvif;
-			goto hbhcheck;
-		} else {
-			ip6stat.ip6s_badscope++;
-			in6_ifstat_inc(m->m_pkthdr.rcvif, ifs6_in_addrerr);
-			goto bad;
-		}
-	}
 
 	/* drop packets if interface ID portion is already filled */
 	if ((m->m_pkthdr.rcvif->if_flags & IFF_LOOPBACK) == 0) {



Re: vlan vlandev fix

2011-02-17 Thread Camiel Dobbelaar
On 16-2-2011 14:27, Reyk Floeter wrote:
> My previous change to vlan(4) allows to change the vlandev and vlan id
> on-the-fly without re-creating the vlan interface.

I hesitated to ask this simple question, because I might be overlooking
something, but what exactly is the advantage over just using
/etc/netstart?  Fewer dropped packets?


--
Cam



ifconfig vlan tag range

2011-02-09 Thread Camiel Dobbelaar
The valid range for vlan tags in OpenBSD is 0-4095 (inclusive).  Fix
both checks.

Makes vlan0 autoconfig work (obj/ifconfig has the diff):

# ifconfig vlan0 vlandev fxp0
ifconfig: invalid vlan tag and device specification

# obj/ifconfig vlan0 vlandev fxp0

and gives a better error message on tags > 4095:

# ifconfig vlan1 vlan 5000 vlandev fxp0
ifconfig: SIOCSETVLAN: Invalid argument

# obj/ifconfig vlan1 vlan 5000 vlandev fxp0
ifconfig: vlan tag 5000: too large
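
The range check itself is easy to try outside ifconfig. The sketch below is a stand-in, not the ifconfig code: toy_parse_vlan_tag is a made-up helper built on strtol() so that it also compiles on systems without strtonum(), and its messages merely imitate the style of the output above.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a VLAN tag in the range 0-4095 (inclusive).
 * Returns the tag, or -1 with *errmsg pointing at a short reason.
 */
static int
toy_parse_vlan_tag(const char *s, const char **errmsg)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0') {
		*errmsg = "invalid";
		return -1;
	}
	if (v < 0) {
		*errmsg = "too small";
		return -1;
	}
	if (v > 4095 || errno == ERANGE) {
		*errmsg = "too large";
		return -1;
	}
	*errmsg = NULL;
	return (int)v;
}
```

With both bounds in one place, tag 0 is accepted and 5000 is rejected before any ioctl is attempted.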

--
Cam
Index: ifconfig.c
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
retrieving revision 1.242
diff -u -r1.242 ifconfig.c
--- ifconfig.c  9 Nov 2010 21:14:47 -   1.242
+++ ifconfig.c  5 Feb 2011 14:28:56 -
@@ -3351,7 +3351,7 @@
 	struct vlanreq vreq;
 	const char *errmsg = NULL;
 
-	__tag = tag = strtonum(val, 0, 65535, &errmsg);
+	__tag = tag = strtonum(val, 0, 4095, &errmsg);
 	if (errmsg)
 		errx(1, "vlan tag %s: %s", val, errmsg);
 	__have_tag = 1;
@@ -3411,7 +3411,7 @@
 
 	if (!__have_tag && vreq.vlr_tag == 0) {
 		skip = strcspn(ifr.ifr_name, "0123456789");
-		tag = strtonum(ifr.ifr_name + skip, 1, 4095, &estr);
+		tag = strtonum(ifr.ifr_name + skip, 0, 4095, &estr);
 		if (estr != NULL)
 			errx(1, "invalid vlan tag and device specification");
 		vreq.vlr_tag = tag;



Re: carp shutdown in /etc/rc

2011-02-05 Thread Camiel Dobbelaar
On 5-2-2011 2:15, Ted Unangst wrote:
> On Fri, Feb 4, 2011 at 7:21 AM, Camiel Dobbelaar c...@sentia.nl wrote:
>> With hundreds of (vlan) interfaces, a shutdown takes quite a while.
>>    # bring carp interfaces down gracefully
>> -   ifconfig | while read a b; do
>> +   ifconfig carp | while read a b; do
>
> going back to the original issue, does "ifconfig | grep carp | while
> read a b" make things faster?

No, it's ifconfig itself that takes long.

With 2000 vlan interfaces and 1 carp interface:
# time ifconfig | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
1m11.29s real 0m12.07s user 0m59.03s system
# time ifconfig carp | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m0.06s real 0m0.01s user 0m0.05s system

With 1000 vlan interfaces:
# time ifconfig | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m16.66s real 0m2.88s user 0m13.72s system
# time ifconfig carp | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m0.03s real 0m0.00s user 0m0.02s system

With 500 vlan interfaces:
# time ifconfig | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m3.18s real 0m0.67s user 0m2.49s system
# time ifconfig carp | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m0.02s real 0m0.00s user 0m0.01s system

With 200 vlan interfaces:
# time ifconfig | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m0.35s real 0m0.07s user 0m0.27s system
# time ifconfig carp | grep ^carp
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
0m0.01s real 0m0.00s user 0m0.00s system


It does not scale linearly, but the real world usage (200) is fine.  I
think we can drop the diff, since it turned out not to be so obvious and
clean...
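
To put a number on "not linear": going from 1000 to 2000 vlan interfaces raises the real time of plain ifconfig from 16.66s to 71.29s, a factor of about 4.3, which is slightly more than the factor of 4 that a purely quadratic cost would give. A small helper for that comparison, with toy_excess being a made-up name:

```c
/*
 * Compare an observed slowdown with the slowdown predicted by n^power
 * growth when the interface count goes from n1 to n2.  A result above
 * 1.0 means the measured time grew faster than n^power.
 */
static double
toy_excess(double n1, double t1, double n2, double t2, int power)
{
	double predicted = 1.0;
	int i;

	for (i = 0; i < power; i++)
		predicted *= n2 / n1;
	return (t2 / t1) / predicted;
}
```

Called as toy_excess(1000, 16.66, 2000, 71.29, 2), it returns a value a little above 1, so over this range the cost grows at least quadratically with the number of interfaces.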

--
Cam



Re: carp shutdown in /etc/rc

2011-02-05 Thread Camiel Dobbelaar
On 5-2-2011 11:02, Henning Brauer wrote:
 on the other side, fixing ifconfig very slow with lots of interfaces
 deserves to be fixed anyway. looking at the code - either getifaddrs is
 slow (which in turn wouldn't be ifconfig only), or the ioctls ifconfig
 does in getinfo(). that's just 5 tho. wonder wether making one big
 ioctl that returns everything those 5 would help - wouldn't win a beauty
 price for sure. not that ifconfig would ever qualify.

If ifconfig.c is instrumented with a little perl script:

# cat ioctl_debug.pl
#!/usr/bin/perl -pi.orig

if (m{ioctl\(\w+, (\w+),}) {
	my $sig = $1;
	s{ioctl\(}{printf("ioctl $sig\\n") != -1 && ioctl(};
}

and then built like this:
# perl ioctl_debug.pl ifconfig.c
# make
# make install

you can see this:

# ifconfig carp80
ioctl SIOCGIFFLAGS
ioctl SIOCGIFXFLAGS
ioctl SIOCGIFMETRIC
ioctl SIOCGIFMTU
ioctl SIOCGIFRDOMAIN
ioctl SIOCGIFFLAGS
ioctl SIOCGIFXFLAGS
ioctl SIOCGIFMETRIC
ioctl SIOCGIFMTU
ioctl SIOCGIFRDOMAIN
carp80: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:50
ioctl SIOCGIFDESCR
ioctl SIOCGIFPRIORITY
priority: 0
ioctl SIOCGETKALIVE
ioctl SIOCGETVLAN
ioctl SIOCGVH
carp: BACKUP carpdev vlan80 vhid 80 advbase 1 advskew 0
ioctl SIOCGETPFSYNC
ioctl PPPOEGETPARMS
ioctl SIOCGIFTIMESLOT
ioctl SIOCGIFGENERIC
ioctl SIOCGTRUNKPORT
ioctl SIOCGTRUNK
ioctl SIOCGETPFLOW
ioctl SIOCGIFGROUP
ioctl SIOCGIFGROUP
groups: carp
ioctl SIOCGIFMEDIA
status: backup
ioctl SIOCGLIFPHYADDR
inet6 fe80::200:5eff:fe00:150%carp80ioctl SIOCGIFNETMASK_IN6
 prefixlen 64ioctl SIOCGIFAFLAG_IN6
 scopeid 0x7ioctl SIOCGIFALIFETIME_IN6

inet 10.10.80.1ioctl SIOCGIFNETMASK
 netmask 0xffffff00ioctl SIOCGIFBRDADDR
 broadcast 10.10.80.255

# ifconfig vlan
ioctl SIOCGIFGMEMB
ioctl SIOCGIFGMEMB
ioctl SIOCGIFFLAGS
ioctl SIOCGIFXFLAGS
ioctl SIOCGIFMETRIC
ioctl SIOCGIFMTU
ioctl SIOCGIFRDOMAIN
vlan80: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:d0:59:b6:f4:27
ioctl SIOCGIFDESCR
ioctl SIOCGIFPRIORITY
priority: 0
ioctl SIOCGETKALIVE
ioctl SIOCGETVLAN
ioctl SIOCGETVLANPRIO
vlan: 80 priority: 0 parent interface: fxp0
ioctl SIOCGVH
ioctl SIOCGETPFSYNC
ioctl PPPOEGETPARMS
ioctl SIOCGIFTIMESLOT
ioctl SIOCGIFGENERIC
ioctl SIOCGTRUNKPORT
ioctl SIOCGTRUNK
ioctl SIOCGETPFLOW
ioctl SIOCGIFGROUP
ioctl SIOCGIFGROUP
groups: vlan
ioctl SIOCGIFMEDIA
status: active
ioctl SIOCGLIFPHYADDR
inet6 fe80::2d0:59ff:feb6:f427%vlan80ioctl SIOCGIFNETMASK_IN6
 prefixlen 64ioctl SIOCGIFAFLAG_IN6
 scopeid 0x6ioctl SIOCGIFALIFETIME_IN6

inet 10.10.80.2ioctl SIOCGIFNETMASK
 netmask 0xffffff00ioctl SIOCGIFBRDADDR
 broadcast 10.10.80.255

# ifconfig fxp
ioctl SIOCGIFGMEMB
ioctl SIOCGIFFLAGS
ioctl SIOCGIFXFLAGS
ioctl SIOCGIFMETRIC
ioctl SIOCGIFMTU
ioctl SIOCGIFRDOMAIN
fxp0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:d0:59:b6:f4:27
ioctl SIOCGIFDESCR
ioctl SIOCGIFPRIORITY
priority: 0
ioctl SIOCGETKALIVE
ioctl SIOCGETVLAN
ioctl SIOCGVH
ioctl SIOCGETPFSYNC
ioctl PPPOEGETPARMS
ioctl SIOCGIFTIMESLOT
ioctl SIOCGIFGENERIC
ioctl SIOCGTRUNKPORT
ioctl SIOCGTRUNK
ioctl SIOCGETPFLOW
ioctl SIOCGIFGROUP
ioctl SIOCGIFGROUP
groups: egress
ioctl SIOCGIFMEDIA
ioctl SIOCGIFMEDIA
media: Ethernet autoselect (100baseTX full-duplex)
status: active
ioctl SIOCG80211NWID
ioctl SIOCG80211NWKEY
ioctl SIOCG80211WPAPSK
ioctl SIOCG80211POWER
ioctl SIOCG80211CHANNEL
ioctl SIOCG80211BSSID
ioctl SIOCG80211TXPOWER
ioctl SIOCG80211WPAPARMS
ioctl SIOCGLIFPHYADDR
inet6 fe80::2d0:59ff:feb6:f427%fxp0ioctl SIOCGIFNETMASK_IN6
 prefixlen 64ioctl SIOCGIFAFLAG_IN6
 scopeid 0x2ioctl SIOCGIFALIFETIME_IN6

inet 192.168.28.129ioctl SIOCGIFNETMASK
 netmask 0xffffff00ioctl SIOCGIFBRDADDR
 broadcast 192.168.28.255

So yeah, it looks like ifconfig can be made a little smarter.



Re: carp shutdown in /etc/rc

2011-02-04 Thread Camiel Dobbelaar
On 4-2-2011 15:06, Stuart Henderson wrote:
> On 2011/02/04 14:37, Camiel Dobbelaar wrote:
>> On 4-2-2011 13:32, Henning Brauer wrote:
>>> * Camiel Dobbelaar c...@sentia.nl [2011-02-04 13:21]:
>>>> With hundreds of (vlan) interfaces, a shutdown takes quite a while.
>>>> Fix below.
>>>
>>> hmm. this relies on all carp interfaces being in the carp interface
>>> group. while that is the default, it is not necessarily so.
>>
>> I didn't know that a groupname takes precedence, neither did the
>> manpage.  :-)
>>
>> +If an interface group with that name exists, all interfaces in the group
>> +will be shown.
>
> seems it's more complicated than that - if you remove all interfaces
> from group carp, 'ifconfig carp' lists nothing:
>
> $ ifconfig |grep ^carp
> carp1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
> carp2: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
> $ ifconfig carp
> $
>
> but with another type, this doesn't apply:
>
> $ sudo ifconfig bge0 group em
> $ ifconfig em | egrep '(0: |groups:)'
> bge0: flags=28843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NOINET6> mtu 1500
> groups: em
> $ sudo ifconfig bge0 -group em
> $ ifconfig em | egrep '(0: |groups:)'
> em0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
> groups: egress

It looks like group carp is never removed from the system, even if it
becomes empty.  That makes sense since some daemons use it as a default.

But the manpage diff is still correct, isn't it?

--
Cam