Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Henning Brauer
* Johan Ymerson  [2015-05-19 19:25]:
> On Tue, 2015-05-19 at 11:16 +, Johan Ymerson wrote:
> > On Tue, 2015-05-19 at 11:24 +0100, Stuart Henderson wrote:
> > > On 2015/05/19 10:10, Johan Ymerson wrote:
> > > Yes I understand that, but if carp init was counted in "LINK_STATE_DOWN"
> > > then the metric would be 65535 which I think would still avoid the
> > > problem you're seeing, and would involve less special-casing in ospfd.
> > Yes, that would also resolve the issue, but it is a bit illogical to
> > announce a network we cannot possibly route traffic to (due to hardware
> > problems).
> After some more testing I think we can conclude that this is most
> definitely a kernel issue.

hmm. there's definately more to it.

> If the network cable is unplugged while the machine is running, ifconfig
> reports the following:
> carp2: flags=8803 mtu 1500
> lladdr 00:00:5e:00:01:03
> priority: 0
> carp: INIT carpdev em2 vhid 3 advbase 1 advskew 100
> groups: carp
> status: invalid
> inet 192.168.254.1 netmask 0xfe00 broadcast 192.168.255.255
> In this case network is announced with the correct metric (65535).
> 
> However, if the machine is started with the cable unplugged, ifconfig
> instead reports this:
> carp2: flags=8803 mtu 1500
> lladdr 00:00:5e:00:01:03
> priority: 0
> carp: INIT carpdev em2 vhid 3 advbase 1 advskew 100
> groups: carp
> inet 192.168.254.1 netmask 0xfe00 broadcast 192.168.255.255
> In this case the network is incorrectly announced as available with low
> metric.
> 
> Initializing if_link_state in the kernel carp code does seem to fix the
> issue:
> Index: sys/netinet/ip_carp.c
> ===
> RCS file: /cvs/src/sys/netinet/ip_carp.c,v
> retrieving revision 1.247
> diff -u -p -r1.247 ip_carp.c
> --- sys/netinet/ip_carp.c   4 Mar 2015 10:59:52 -   1.247
> +++ sys/netinet/ip_carp.c   19 May 2015 17:22:18 -
> @@ -751,6 +751,7 @@ carp_clone_create(ifc, unit)
> ifp->if_addrlen = ETHER_ADDR_LEN;
> ifp->if_hdrlen = ETHER_HDR_LEN;
> ifp->if_mtu = ETHERMTU;
> +   ifp->if_link_state = LINK_STATE_INVALID;
> IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN);
> IFQ_SET_READY(&ifp->if_snd);
> if_attach(ifp);

just for completeness: LINK_STATE_INVALID is 1, and that's what
carp_set_state uses for everything but master and backup. so far so
good.

ifp is part of the sc which in turn is malloc'd with M_ZERO in
carp_clone_create, so link state will be 0 aka LINK_STATE_UNKNOWN.

however, at the end of carp_clone_create, we call
carp_set_state_all(sc, INIT) which should take care of that.

Now the question is why that doesn't work, your one-liner above SHOULD
not make a difference. Either the fact that you set the link state
before if_attach() makes a difference (I don't see how atm), or
something isn't working as expected/intended in carp_set_state_all()
resp. its sibling carp_set_state(). printf debugging time?

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services GmbH, http://bsws.de, Full-Service ISP
Secure Hosting, Mail and DNS. Virtual & Dedicated Servers, Root to Fully Managed
Henning Brauer Consulting, http://henningbrauer.com/



[Patch] httpd - don't leak fcgi file descriptors

2015-05-19 Thread Todd Mortimer
Hi tech,

The attached patch fixes a problem I’ve been having with httpd +
php_fpm + owncloud on 5.7. The patch is against 5.7-release.

After several days running owncloud with httpd, php_fpm started
complaining about hitting pm.max_children, and top would show a
bunch of idle php_fpm processes waiting on netio. Eventually httpd
would start returning error 500 and owncloud would stop working.
Restarting php_fpm or httpd would temporarily fix the issue. The
same server with nginx did not have the same problem.

I’ve had this patch running for 5 days now, and php_fpm isnt leaving
idle processes lying around anymore. I did run with some debugging
output to verify that clt->clt_fd is sometimes not -1 when it is
overwritten with the new socket fd.

I’m happy to test or revise if needed. 

Thank you!
Todd


Index: server_fcgi.c
===
RCS file: /cvs/src/usr.sbin/httpd/server_fcgi.c,v
retrieving revision 1.52
diff -u -p -u -p -r1.52 server_fcgi.c
--- server_fcgi.c   23 Feb 2015 19:22:43 -  1.52
+++ server_fcgi.c   15 May 2015 22:12:30 -
@@ -31,6 +31,7 @@
#include 
#include 
#include 
+#include 
#include 

#include "httpd.h"
@@ -152,6 +153,9 @@ server_fcgi(struct httpd *env, struct cl
   errstr = "failed to allocate evbuffer";
   goto fail;
   }
+
+   if (clt->clt_fd != -1)
+   close(clt->clt_fd);

   clt->clt_fd = fd;
   if (clt->clt_srvbev != NULL)



[patch]rcs: xstrdup just wrappes strdup

2015-05-19 Thread Fritjof Bornebusch
Hi,

xstrdup just wrappes strdup, so there is no need to call xmalloc and
strlcpy instead.

Regards,
--F.


 
Index: xmalloc.c
===
RCS file: /cvs/src/usr.bin/rcs/xmalloc.c,v
retrieving revision 1.8
diff -u -p -r1.8 xmalloc.c
--- xmalloc.c   26 Mar 2015 15:17:30 -  1.8
+++ xmalloc.c   19 May 2015 18:54:22 -
@@ -76,13 +76,11 @@ xfree(void *ptr)
 char *
 xstrdup(const char *str)
 {
-   size_t len;
char *cp;
 
-   len = strlen(str) + 1;
-   cp = xmalloc(len);
-   if (strlcpy(cp, str, len) >= len)
-   errx(1, "xstrdup: string truncated");
+   if ((cp = strdup(str)) == NULL)
+   errx(1, "xstrdup: copy string failed");
+
return cp;
 }
 


pgpsSRRSWzADK.pgp
Description: PGP signature


Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Johan Ymerson
On Tue, 2015-05-19 at 11:16 +, Johan Ymerson wrote:
> On Tue, 2015-05-19 at 11:24 +0100, Stuart Henderson wrote:
> > On 2015/05/19 10:10, Johan Ymerson wrote:
> > 
> > Yes I understand that, but if carp init was counted in "LINK_STATE_DOWN"
> > then the metric would be 65535 which I think would still avoid the
> > problem you're seeing, and would involve less special-casing in ospfd.
> > 
> 
> Yes, that would also resolve the issue, but it is a bit illogical to
> announce a network we cannot possibly route traffic to (due to hardware
> problems).
> 
> I had a look at how BGPd handles this, and it only looks at
> LINK_STATE_DOWN. It will also miss the case when
> linkstate==LINK_STATE_UNKNOWN.
> That brings us back to why the kernel's carp code doesn't initialize the
> linkstate to something more useful than UNKNOWN.
> 
> /Johan
> 

After some more testing I think we can conclude that this is most
definitely a kernel issue.
If the network cable is unplugged while the machine is running, ifconfig
reports the following:
carp2: flags=8803 mtu 1500
lladdr 00:00:5e:00:01:03
priority: 0
carp: INIT carpdev em2 vhid 3 advbase 1 advskew 100
groups: carp
status: invalid
inet 192.168.254.1 netmask 0xfe00 broadcast 192.168.255.255
In this case network is announced with the correct metric (65535).

However, if the machine is started with the cable unplugged, ifconfig
instead reports this:
carp2: flags=8803 mtu 1500
lladdr 00:00:5e:00:01:03
priority: 0
carp: INIT carpdev em2 vhid 3 advbase 1 advskew 100
groups: carp
inet 192.168.254.1 netmask 0xfe00 broadcast 192.168.255.255
In this case the network is incorrectly announced as available with low
metric.

Initializing if_link_state in the kernel carp code does seem to fix the
issue:
Index: sys/netinet/ip_carp.c
===
RCS file: /cvs/src/sys/netinet/ip_carp.c,v
retrieving revision 1.247
diff -u -p -r1.247 ip_carp.c
--- sys/netinet/ip_carp.c   4 Mar 2015 10:59:52 -   1.247
+++ sys/netinet/ip_carp.c   19 May 2015 17:22:18 -
@@ -751,6 +751,7 @@ carp_clone_create(ifc, unit)
ifp->if_addrlen = ETHER_ADDR_LEN;
ifp->if_hdrlen = ETHER_HDR_LEN;
ifp->if_mtu = ETHERMTU;
+   ifp->if_link_state = LINK_STATE_INVALID;
IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN);
IFQ_SET_READY(&ifp->if_snd);
if_attach(ifp);

/Johan


/Johan



Re: em(4) support for another Tolopai based board

2015-05-19 Thread Dariusz Swiderski
On Sat, 16 May 2015, Dariusz Swiderski wrote:

> Hi,
> 
> Attached patch implements support for yet another Tolopai (EP80579)
> based soho router. As some of you might remeber I wrote the original
> support for this platform during h2k9.
> 
> In this case, the board is named Teak 3020, and not only it does
> not follow intel recomedation on GCU usage, but also uses Realtek PHY!
> 
> Attached diff was tested by me on my old Axiomtek Tolopai board, and on 
> Teak 3020 board by Holger Gleass.
> 
> Comments?
> 

Hi,

fixed version below. i still think this should just go in, if we want to 
split the driver i volunteer to detatch the Tolopai platform as a first 
one and will submit a bigger patch.

greets
--
dms

Index: pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.297
diff -u -r1.297 if_em.c
--- pci/if_em.c 12 May 2015 20:20:18 -  1.297
+++ pci/if_em.c 19 May 2015 14:08:36 -
@@ -187,7 +187,10 @@
{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_ICH10_R_BM_V },
{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_1 },
{ PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_2 },
-   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_3 }
+   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_3 },
+   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_4 },
+   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_5 },
+   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_EP80579_LAN_6 }
 };
 
 /*
@@ -298,6 +301,8 @@
pci_chipset_tag_t   pc = pa->pa_pc;
void *gcu;
 
+   INIT_DEBUGOUT("em_defer_attach: begin");
+
if ((gcu = em_lookup_gcu(self)) == 0) {
printf("%s: No GCU found, defered attachment failed\n",
sc->sc_dv.dv_xname);
@@ -321,9 +326,9 @@
 
em_setup_interface(sc); 
 
-   em_update_link_status(sc);  
-
em_setup_link(&sc->hw); 
+
+   em_update_link_status(sc);
 }
 
 /*
Index: pci/if_em_hw.c
===
RCS file: /cvs/src/sys/dev/pci/if_em_hw.c,v
retrieving revision 1.84
diff -u -r1.84 if_em_hw.c
--- pci/if_em_hw.c  12 May 2015 02:33:39 -  1.84
+++ pci/if_em_hw.c  19 May 2015 14:08:36 -
@@ -60,6 +60,8 @@
 #include 
 #include 
 
+#include 
+
 #define STATIC
 
 static int32_t em_swfw_sync_acquire(struct em_hw *, uint16_t);
@@ -266,6 +268,9 @@
case I350_I_PHY_ID:
hw->phy_type = em_phy_82580;
break;
+   case RTL8211_E_PHY_ID:
+   hw->phy_type = em_phy_rtl8211;
+   break;
case BME1000_E_PHY_ID:
if (hw->phy_revision == 1) {
hw->phy_type = em_phy_bm;
@@ -609,13 +614,19 @@
hw->icp__port_num = 0;
break;
case E1000_DEV_ID_EP80579_LAN_2:
+   case E1000_DEV_ID_EP80579_LAN_4:
hw->mac_type = em_icp_;
hw->icp__port_num = 1;
break;
case E1000_DEV_ID_EP80579_LAN_3:
+   case E1000_DEV_ID_EP80579_LAN_5:
hw->mac_type = em_icp_;
hw->icp__port_num = 2;
break;
+   case E1000_DEV_ID_EP80579_LAN_6:
+   hw->mac_type = em_icp_;
+   hw->icp__port_num = 3;
+   break;
default:
/* Should never have loaded on this device */
return -E1000_ERR_MAC_TYPE;
@@ -707,7 +718,10 @@
case E1000_DEV_ID_EP80579_LAN_1:
case E1000_DEV_ID_EP80579_LAN_2:
case E1000_DEV_ID_EP80579_LAN_3:
-   hw->media_type = em_media_type_oem;
+   case E1000_DEV_ID_EP80579_LAN_4:
+   case E1000_DEV_ID_EP80579_LAN_5:
+   case E1000_DEV_ID_EP80579_LAN_6:
+   hw->media_type = em_media_type_copper;
break;
default:
switch (hw->mac_type) {
@@ -2477,6 +2491,134 @@
return ret_val;
 }
 
+static int32_t
+em_copper_link_rtl8211_setup(struct em_hw *hw)
+{
+   int32_t ret_val;
+   uint16_t phy_data;
+
+   DEBUGFUNC("em_copper_link_rtl8211_setup: begin");
+
+   if (!hw) {
+   return -1;
+   }
+   /* SW Reset the PHY so all changes take effect */
+   em_phy_hw_reset(hw);
+
+   /* Enable CRS on TX. This must be set for half-duplex operation. */
+   phy_data = 0;
+
+   ret_val = em_read_phy_reg_ex(hw, RGEPHY_CR, &phy_data);
+   if (ret_val) {
+   printf("Unable to read RGEPHY_CR register\n");
+   return ret_val;
+   }
+   DEBUGOUT3("RTL8211: Rx phy_id=%X addr=%X SPEC_CTRL=%X\n", hw->phy_id,
+   hw->phy_addr, phy_data);
+   phy_data |= RGEPHY_CR_ASSERT_CRS;
+
+   ret

Re: chroot: add -c to use login class with -u

2015-05-19 Thread Todd C. Miller
On Tue, 19 May 2015 09:56:53 -0600, Theo de Raadt wrote:

> I think the consensus was easy to form.  People using -u right now
> collect root's giant limits, which is not sensible.

Actually, they retain the invoking user's limits.  But either way,
it is surprising.

 - todd



Re: chroot: add -c to use login class with -u

2015-05-19 Thread Theo de Raadt
> The consensus seems to be that "chroot -u" should apply the settings
> in /etc/login.conf by default.  Since this is a non-standard flag
> we can do what we like with it.  I should have used setusercontext()
> when I added -u to chroot in the first place.
> 
> We can add a "-c class" option in the future if there turns out to
> be a need for it.

Looks good.

I think the consensus was easy to form.  People using -u right now
collect root's giant limits, which is not sensible.

> Index: usr.sbin/chroot/chroot.8
> ===
> RCS file: /cvs/src/usr.sbin/chroot/chroot.8,v
> retrieving revision 1.14
> diff -u -p -u -r1.14 chroot.8
> --- usr.sbin/chroot/chroot.8  8 Jul 2010 06:52:30 -   1.14
> +++ usr.sbin/chroot/chroot.8  19 May 2015 15:47:52 -
> @@ -77,6 +77,11 @@ and
>  databases unless overridden by the
>  .Fl g
>  option.
> +Additional settings may be applied as specified in
> +.Xr login.conf 5
> +depending on
> +.Ar user Ns 's
> +login class.
>  .El
>  .Sh ENVIRONMENT
>  .Bl -tag -width SHELL
> @@ -95,6 +100,7 @@ is used.
>  .Sh SEE ALSO
>  .Xr ldd 1 ,
>  .Xr group 5 ,
> +.Xr login.conf 5 ,
>  .Xr passwd 5 ,
>  .Xr environ 7
>  .Sh HISTORY
> Index: usr.sbin/chroot/chroot.c
> ===
> RCS file: /cvs/src/usr.sbin/chroot/chroot.c,v
> retrieving revision 1.13
> diff -u -p -u -r1.13 chroot.c
> --- usr.sbin/chroot/chroot.c  27 Oct 2009 23:59:51 -  1.13
> +++ usr.sbin/chroot/chroot.c  19 May 2015 15:48:29 -
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -50,11 +51,14 @@ main(int argc, char **argv)
>  {
>   struct group*grp;
>   struct passwd   *pwd;
> + login_cap_t *lc;
>   const char  *shell;
>   char*user, *group, *grouplist;
>   gid_t   gidlist[NGROUPS_MAX];
>   int ch, ngids;
> + int flags = LOGIN_SETALL & ~(LOGIN_SETLOGIN|LOGIN_SETUSER);
>  
> + lc = NULL;
>   ngids = 0;
>   pwd = NULL;
>   user = grouplist = NULL;
> @@ -80,8 +84,12 @@ main(int argc, char **argv)
>   if (argc < 1)
>   usage();
>  
> - if (user != NULL && (pwd = getpwnam(user)) == NULL)
> - errx(1, "no such user `%s'", user);
> + if (user != NULL) {
> + if ((pwd = getpwnam(user)) == NULL)
> + errx(1, "no such user `%s'", user);
> + if ((lc = login_getclass(pwd->pw_class)) == NULL)
> + err(1, "unable to get login class for `%s'", user);
> + }
>  
>   while ((group = strsep(&grouplist, ",")) != NULL) {
>   if (*group == '\0')
> @@ -99,11 +107,11 @@ main(int argc, char **argv)
>   err(1, "setgid");
>   if (setgroups(ngids, gidlist) != 0)
>   err(1, "setgroups");
> - } else if (pwd != NULL) {
> - if (setgid(pwd->pw_gid) != 0)
> - err(1, "setgid");
> - if (initgroups(user, pwd->pw_gid) == -1)
> - err(1, "initgroups");
> + flags &= ~LOGIN_SETGROUP;
> + }
> + if (lc != NULL) {
> + if (setusercontext(lc, pwd, pwd->pw_uid, flags) == -1)
> + err(1, "setusercontext");
>   }
>  
>   if (chroot(argv[0]) != 0 || chdir("/") != 0)
> @@ -115,7 +123,6 @@ main(int argc, char **argv)
>   setlogin(pwd->pw_name);
>   if (setuid(pwd->pw_uid) != 0)
>   err(1, "setuid");
> - endgrent();
>   }
>  
>   if (argv[1]) {
> 



Re: chroot: add -c to use login class with -u

2015-05-19 Thread Todd C. Miller
The consensus seems to be that "chroot -u" should apply the settings
in /etc/login.conf by default.  Since this is a non-standard flag
we can do what we like with it.  I should have used setusercontext()
when I added -u to chroot in the first place.

We can add a "-c class" option in the future if there turns out to
be a need for it.

 - todd

Index: usr.sbin/chroot/chroot.8
===
RCS file: /cvs/src/usr.sbin/chroot/chroot.8,v
retrieving revision 1.14
diff -u -p -u -r1.14 chroot.8
--- usr.sbin/chroot/chroot.88 Jul 2010 06:52:30 -   1.14
+++ usr.sbin/chroot/chroot.819 May 2015 15:47:52 -
@@ -77,6 +77,11 @@ and
 databases unless overridden by the
 .Fl g
 option.
+Additional settings may be applied as specified in
+.Xr login.conf 5
+depending on
+.Ar user Ns 's
+login class.
 .El
 .Sh ENVIRONMENT
 .Bl -tag -width SHELL
@@ -95,6 +100,7 @@ is used.
 .Sh SEE ALSO
 .Xr ldd 1 ,
 .Xr group 5 ,
+.Xr login.conf 5 ,
 .Xr passwd 5 ,
 .Xr environ 7
 .Sh HISTORY
Index: usr.sbin/chroot/chroot.c
===
RCS file: /cvs/src/usr.sbin/chroot/chroot.c,v
retrieving revision 1.13
diff -u -p -u -r1.13 chroot.c
--- usr.sbin/chroot/chroot.c27 Oct 2009 23:59:51 -  1.13
+++ usr.sbin/chroot/chroot.c19 May 2015 15:48:29 -
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,11 +51,14 @@ main(int argc, char **argv)
 {
struct group*grp;
struct passwd   *pwd;
+   login_cap_t *lc;
const char  *shell;
char*user, *group, *grouplist;
gid_t   gidlist[NGROUPS_MAX];
int ch, ngids;
+   int flags = LOGIN_SETALL & ~(LOGIN_SETLOGIN|LOGIN_SETUSER);
 
+   lc = NULL;
ngids = 0;
pwd = NULL;
user = grouplist = NULL;
@@ -80,8 +84,12 @@ main(int argc, char **argv)
if (argc < 1)
usage();
 
-   if (user != NULL && (pwd = getpwnam(user)) == NULL)
-   errx(1, "no such user `%s'", user);
+   if (user != NULL) {
+   if ((pwd = getpwnam(user)) == NULL)
+   errx(1, "no such user `%s'", user);
+   if ((lc = login_getclass(pwd->pw_class)) == NULL)
+   err(1, "unable to get login class for `%s'", user);
+   }
 
while ((group = strsep(&grouplist, ",")) != NULL) {
if (*group == '\0')
@@ -99,11 +107,11 @@ main(int argc, char **argv)
err(1, "setgid");
if (setgroups(ngids, gidlist) != 0)
err(1, "setgroups");
-   } else if (pwd != NULL) {
-   if (setgid(pwd->pw_gid) != 0)
-   err(1, "setgid");
-   if (initgroups(user, pwd->pw_gid) == -1)
-   err(1, "initgroups");
+   flags &= ~LOGIN_SETGROUP;
+   }
+   if (lc != NULL) {
+   if (setusercontext(lc, pwd, pwd->pw_uid, flags) == -1)
+   err(1, "setusercontext");
}
 
if (chroot(argv[0]) != 0 || chdir("/") != 0)
@@ -115,7 +123,6 @@ main(int argc, char **argv)
setlogin(pwd->pw_name);
if (setuid(pwd->pw_uid) != 0)
err(1, "setuid");
-   endgrent();
}
 
if (argv[1]) {



pcap N listeners vs N devs

2015-05-19 Thread def
Hi!
As bpf devices is a some limited resource, and as far as i know the kernal is 
like single-thread when performing filtering each packet thr many active bpf 
devices.
So the question applying to ethernet trunk where one eth port spans multiple 
vlans, or i.e. multiple mpe.In terms of performance/botlenecks - is the 
multiple different bpf filters applied to one port (eth) looking better than 
such filters applied to many vlan interfaces with unique bpf dev for each.
Additional question.Is there any known or theoretical number of unique active 
bpf devices whenkernal can be locked by filtering processing?
Thanks.

--


Re: Brainy: Memory Leak in ICMP

2015-05-19 Thread Martin Pieuchot
On 19/05/15(Tue) 15:28, Maxime Villard wrote:
> -- netinet/ip_icmp.c --
> 
> 925   rt = rtalloc(sintosa(&sin), RT_REPORT|RT_RESOLVE, rtableid);
>   if (rt == NULL)
>   return (NULL);
> 
>   /* Check if the route is actually usable */
>   if (rt->rt_flags & (RTF_REJECT | RTF_BLACKHOLE) ||
>   (rt->rt_flags & RTF_UP) == 0)
>   return (NULL);
> 
> ---
> 
> 'rt' is not released.
> 
> Found by The Brainy Code Scanner.

Indeed!  Thanks for the report, I just committed a fix.

Martin



Re: chroot: add -c to use login class with -u

2015-05-19 Thread trondd
On Mon, May 18, 2015 6:30 pm, Todd C. Miller wrote:
> Currently, "chroot -u" doesn't use the settings in /etc/login.conf.

Nice.  I was missing this option.

>
> Open questions:
>
> 1) Should this just be default behavior with -u?  Are there cases
>when you would *not* want to set the priority and resouce limits
>based on the target user when one is specified?

When I was setting up an application in a chroot, my expectation was that
-u would apply that user's class.  When I noticed that it didn't, I looked
for a -c to set the class, which didn't exist.  So I would vote for the
class being set by default and -c to override it.  I believe this is what
su does...?

Tim.



Brainy: Memory Leak in ICMP

2015-05-19 Thread Maxime Villard
Hi,
I put here a bug among others:

-- netinet/ip_icmp.c --

925 rt = rtalloc(sintosa(&sin), RT_REPORT|RT_RESOLVE, rtableid);
if (rt == NULL)
return (NULL);

/* Check if the route is actually usable */
if (rt->rt_flags & (RTF_REJECT | RTF_BLACKHOLE) ||
(rt->rt_flags & RTF_UP) == 0)
return (NULL);

---

'rt' is not released.

Found by The Brainy Code Scanner.

Maxime



Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Johan Ymerson
On Tue, 2015-05-19 at 11:24 +0100, Stuart Henderson wrote:
> On 2015/05/19 10:10, Johan Ymerson wrote:
> 
> Yes I understand that, but if carp init was counted in "LINK_STATE_DOWN"
> then the metric would be 65535 which I think would still avoid the
> problem you're seeing, and would involve less special-casing in ospfd.
> 

Yes, that would also resolve the issue, but it is a bit illogical to
announce a network we cannot possibly route traffic to (due to hardware
problems).

I had a look at how BGPd handles this, and it only looks at
LINK_STATE_DOWN. It will also miss the case when
linkstate==LINK_STATE_UNKNOWN.
That brings us back to why the kernel's carp code doesn't initialize the
linkstate to something more useful than UNKNOWN.

/Johan



Re: chroot: add -c to use login class with -u

2015-05-19 Thread Marc Espie
On Mon, May 18, 2015 at 04:30:33PM -0600, Todd C. Miller wrote:
> Currently, "chroot -u" doesn't use the settings in /etc/login.conf.
> This adds a -c option to apply the via setusercontext().  We can't
> use LOGIN_SETALL since the uid change has to happen after chroot(2)
> and the groups may be specified via the -g option.
> 
> Open questions:
> 
> 1) Should this just be default behavior with -u?  Are there cases
>when you would *not* want to set the priority and resouce limits
>based on the target user when one is specified?

Color me surprised, I just looked at posix, and discovered that chroot is
*not* a standard utility.  So yeah, that would even make sense to me.

> 2) Does it make sense to apply LOGIN_SETPATH, LOGIN_SETPATH
>and LOGIN_SETUMASK or are we better off with just LOGIN_SETPRIORITY
>and LOGIN_SETRESOURCES?  Ultimately it depends on whether you
>expect the chroot'd environment to be setup like a login session
>or not.  We don't invoke the shell as a login shell though.

For my use case, everything can be set up as a login session... Note that
I would use the chroot -c -u user /directory cmd   directly, so there is
no login shell involved.

> Opinions?

Looks fine to me as it currently stands.

>  - todd
> 
> Index: usr.sbin/chroot/chroot.8
> ===
> RCS file: /cvs/src/usr.sbin/chroot/chroot.8,v
> retrieving revision 1.14
> diff -u -p -u -r1.14 chroot.8
> --- usr.sbin/chroot/chroot.8  8 Jul 2010 06:52:30 -   1.14
> +++ usr.sbin/chroot/chroot.8  18 May 2015 20:02:55 -
> @@ -37,6 +37,7 @@
>  .Nd change root directory
>  .Sh SYNOPSIS
>  .Nm chroot
> +.Op Fl c
>  .Op Fl g Ar group,group,...
>  .Op Fl u Ar user
>  .Ar newroot
> @@ -56,6 +57,13 @@ command is restricted to the superuser.
>  .Pp
>  The options are as follows:
>  .Bl -tag -width Ds
> +.It Fl c
> +Apply the user's login class as defined in
> +.Xr login.conf 5
> +before executing
> +.Ar command .
> +This option may only be used in conjunction with
> +.Fl u .
>  .It Fl g Ar group,group,...
>  Override the primary and supplemental group IDs.
>  The primary group ID is set to the first group in the list.
> @@ -94,6 +102,7 @@ is used.
>  .El
>  .Sh SEE ALSO
>  .Xr ldd 1 ,
> +.Xr login.conf 5 ,
>  .Xr group 5 ,
>  .Xr passwd 5 ,
>  .Xr environ 7
> Index: usr.sbin/chroot/chroot.c
> ===
> RCS file: /cvs/src/usr.sbin/chroot/chroot.c,v
> retrieving revision 1.13
> diff -u -p -u -r1.13 chroot.c
> --- usr.sbin/chroot/chroot.c  27 Oct 2009 23:59:51 -  1.13
> +++ usr.sbin/chroot/chroot.c  18 May 2015 19:56:15 -
> @@ -35,8 +35,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -50,16 +52,21 @@ main(int argc, char **argv)
>  {
>   struct group*grp;
>   struct passwd   *pwd;
> + login_cap_t *lc = NULL;
>   const char  *shell;
>   char*user, *group, *grouplist;
>   gid_t   gidlist[NGROUPS_MAX];
>   int ch, ngids;
> + boolcflag = false;
>  
>   ngids = 0;
>   pwd = NULL;
>   user = grouplist = NULL;
> - while ((ch = getopt(argc, argv, "g:u:")) != -1) {
> + while ((ch = getopt(argc, argv, "cg:u:")) != -1) {
>   switch(ch) {
> + case 'c':
> + cflag = true;
> + break;
>   case 'u':
>   user = optarg;
>   if (*user == '\0')
> @@ -83,6 +90,12 @@ main(int argc, char **argv)
>   if (user != NULL && (pwd = getpwnam(user)) == NULL)
>   errx(1, "no such user `%s'", user);
>  
> + if (cflag) {
> + if (pwd == NULL)
> + errx(1, "login class requires a user");
> + lc = login_getclass(pwd->pw_class);
> + }
> +
>   while ((group = strsep(&grouplist, ",")) != NULL) {
>   if (*group == '\0')
>   continue;
> @@ -104,6 +117,11 @@ main(int argc, char **argv)
>   err(1, "setgid");
>   if (initgroups(user, pwd->pw_gid) == -1)
>   err(1, "initgroups");
> + if (lc != NULL) {
> + const int flags = LOGIN_SETALL & 
> ~(LOGIN_SETGROUP|LOGIN_SETLOGIN|LOGIN_SETUSER);
> + if (setusercontext(lc, pwd, pwd->pw_uid, flags) == -1)
> + err(1, "setusercontext");
> + }
>   }
>  
>   if (chroot(argv[0]) != 0 || chdir("/") != 0)



Re: id: add -c option

2015-05-19 Thread Marc Espie
On Mon, May 18, 2015 at 03:12:24PM -0600, Todd C. Miller wrote:
> This adds "id -c" to display a user's login class.  If no user is
> specified, it looks up the passwd entry based on the real uid and
> displays the corresponding login class.
> 
> This is similar to "id -c" in FreeBSD (but they keep the login class
> in the kernel).
> 
>  - todd
> 
> Index: usr.bin/id/id.1
> ===
> RCS file: /cvs/src/usr.bin/id/id.1,v
> retrieving revision 1.17
> diff -u -p -u -r1.17 id.1
> --- usr.bin/id/id.1   3 Sep 2010 11:09:29 -   1.17
> +++ usr.bin/id/id.1   18 May 2015 16:47:46 -
> @@ -43,6 +43,9 @@
>  .Nm id
>  .Op Ar user
>  .Nm id
> +.Fl c
> +.Op Ar user
> +.Nm id
>  .Fl G Op Fl n
>  .Op Ar user
>  .Nm id
> @@ -70,6 +73,9 @@ In this case, the real and effective IDs
>  .Pp
>  The options are as follows:
>  .Bl -tag -width Ds
> +.It Fl c
> +Display the login class of the real user ID or the specified
> +.Ar user .
>  .It Fl G
>  Display the different group IDs (effective, real and supplementary)
>  as whitespace separated numbers, in no particular order.
> Index: usr.bin/id/id.c
> ===
> RCS file: /cvs/src/usr.bin/id/id.c,v
> retrieving revision 1.22
> diff -u -p -u -r1.22 id.c
> --- usr.bin/id/id.c   16 Jan 2015 06:40:08 -  1.22
> +++ usr.bin/id/id.c   18 May 2015 21:10:13 -
> @@ -38,6 +38,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  void current(void);
>  void pretty(struct passwd *);
> @@ -52,12 +53,12 @@ main(int argc, char *argv[])
>  {
>   struct group *gr;
>   struct passwd *pw;
> - int Gflag, ch, gflag, nflag, pflag, rflag, uflag;
> + int ch, cflag, Gflag, gflag, nflag, pflag, rflag, uflag;
>   uid_t uid;
>   gid_t gid;
>   const char *opts;
>  
> - Gflag = gflag = nflag = pflag = rflag = uflag = 0;
> + cflag = Gflag = gflag = nflag = pflag = rflag = uflag = 0;
>  
>   if (strcmp(getprogname(), "groups") == 0) {
>   Gflag = 1;
> @@ -72,10 +73,13 @@ main(int argc, char *argv[])
>   if (argc > 1)
>   usage();
>   } else
> - opts = "Ggnpru";
> + opts = "cGgnpru";
>  
>   while ((ch = getopt(argc, argv, opts)) != -1)
>   switch(ch) {
> + case 'c':
> + cflag = 1;
> + break;
>   case 'G':
>   Gflag = 1;
>   break;
> @@ -101,7 +105,7 @@ main(int argc, char *argv[])
>   argc -= optind;
>   argv += optind;
>  
> - switch (Gflag + gflag + pflag + uflag) {
> + switch (cflag + Gflag + gflag + pflag + uflag) {
>   case 1:
>   break;
>   case 0:
> @@ -117,6 +121,16 @@ main(int argc, char *argv[])
>  
>   pw = *argv ? who(*argv) : NULL;
>  
> + if (cflag) {
> + if (pw == NULL)
> + pw = getpwuid(getuid());
> + if (pw != NULL && pw->pw_class != NULL && *pw->pw_class != '\0')
> + (void)printf("%s\n", pw->pw_class);
> + else
> + (void)printf("%s\n", LOGIN_DEFCLASS);
> + exit(0);
> + }
> +
>   if (gflag) {
>   gid = pw ? pw->pw_gid : rflag ? getgid() : getegid();
>   if (nflag && (gr = getgrgid(gid)))
> @@ -325,6 +339,7 @@ usage(void)
>   (void)fprintf(stderr, "usage: whoami\n");
>   } else {
>   (void)fprintf(stderr, "usage: id [user]\n");
> + (void)fprintf(stderr, "   id -c [user]\n");
>   (void)fprintf(stderr, "   id -G [-n] [user]\n");
>   (void)fprintf(stderr, "   id -g [-nr] [user]\n");
>   (void)fprintf(stderr, "   id -p [user]\n");
Since I asked for it, I obviously approve of this change.



Re: vlan+bridge fix

2015-05-19 Thread Martin Pieuchot
On 15/05/15(Fri) 17:34, mxb wrote:
> Diff is applied. So far no problems.
> Unfortunately I can’t test this fully - no vlans on my side.

Thanks for testing.  A "no regression" report is always welcome.

There's some more issues with bridge+vlan but jasper@ also confirmed
this diff improve the situation.

Can I have oks?

> > On 15 maj 2015, at 13:14, Martin Pieuchot  wrote:
> > 
> > I have one setup with multiple interfaces in a bridge and on some of
> > these interfaces some vlan(4)s.  But there's currently a bug that
> > prevent us to send (receive is fine) VLAN packets in such config.
> > Diff below fixes that.
> > 
> > The problem is that vlan_output() does not pass its parent interface
> > to ether_output().  That's a mis-design that should be fixed later.
> > The reason for not passing the parent interface is that we want to
> > tcpdump(8) packets on vlan interfaces and the easiest hack^Wsolution
> > was to add a bpf handler in vlan_start()*.
> > 
> > Since my vlans are not part of the bridge, the check below is never
> > true and my packets never go through the bridge.  By moving this
> > check to if_output() we kill two birds with one diff.  First of
> > all we fix this vlan bug and secondly we simplify ether_output()
> > which in turn will allow us to fix all pseudo-interface *output()
> > functions.
> > 
> > One of the goals of if_output() is to move all bpf handlers instead
> > of having them in multiple if_start().  Of course, this will also
> > help us removing the various "#if PSEUDODRIVER" from our stack...
> > 
> > Ok?
> > 
> > *: Note that for the exact same reason we cannot tcpdump output
> > packets on a carp(4) interface, this will be fixed at the same
> > time in upcoming diffs.
> > 
> > 
> > Index: net/if_ethersubr.c
> > ===
> > RCS file: /cvs/src/sys/net/if_ethersubr.c,v
> > retrieving revision 1.198
> > diff -u -p -r1.198 if_ethersubr.c
> > --- net/if_ethersubr.c  15 May 2015 10:15:13 -  1.198
> > +++ net/if_ethersubr.c  15 May 2015 10:58:37 -
> > @@ -363,47 +363,6 @@ ether_output(struct ifnet *ifp0, struct 
> > if (ether_addheader(&m, ifp, etype, esrc, edst) == -1)
> > senderr(ENOBUFS);
> > 
> > -#if NBRIDGE > 0
> > -   /*
> > -* Interfaces that are bridgeports need special handling for output.
> > -*/
> > -   if (ifp->if_bridgeport) {
> > -   struct m_tag *mtag;
> > -
> > -   /*
> > -* Check if this packet has already been sent out through
> > -* this bridgeport, in which case we simply send it out
> > -* without further bridge processing.
> > -*/
> > -   for (mtag = m_tag_find(m, PACKET_TAG_BRIDGE, NULL); mtag;
> > -   mtag = m_tag_find(m, PACKET_TAG_BRIDGE, mtag)) {
> > -#ifdef DEBUG
> > -   /* Check that the information is there */
> > -   if (mtag->m_tag_len != sizeof(caddr_t)) {
> > -   error = EINVAL;
> > -   goto bad;
> > -   }
> > -#endif
> > -   if (!memcmp(&ifp->if_bridgeport, mtag + 1,
> > -   sizeof(caddr_t)))
> > -   break;
> > -   }
> > -   if (mtag == NULL) {
> > -   /* Attach a tag so we can detect loops */
> > -   mtag = m_tag_get(PACKET_TAG_BRIDGE, sizeof(caddr_t),
> > -   M_NOWAIT);
> > -   if (mtag == NULL) {
> > -   error = ENOBUFS;
> > -   goto bad;
> > -   }
> > -   memcpy(mtag + 1, &ifp->if_bridgeport, sizeof(caddr_t));
> > -   m_tag_prepend(m, mtag);
> > -   error = bridge_output(ifp, m, NULL, NULL);
> > -   return (error);
> > -   }
> > -   }
> > -#endif
> > -
> > len = m->m_pkthdr.len;
> > 
> > error = if_output(ifp, m);
> > Index: net/if.c
> > ===
> > RCS file: /cvs/src/sys/net/if.c,v
> > retrieving revision 1.331
> > diff -u -p -r1.331 if.c
> > --- net/if.c15 May 2015 10:15:13 -  1.331
> > +++ net/if.c15 May 2015 10:58:37 -
> > @@ -450,6 +450,40 @@ if_output(struct ifnet *ifp, struct mbuf
> > length = m->m_pkthdr.len;
> > mflags = m->m_flags;
> > 
> > +#if NBRIDGE > 0
> > +   /*
> > +* Interfaces that are bridgeports need special handling for output.
> > +*/
> > +   if (ifp->if_bridgeport) {
> > +   struct m_tag *mtag;
> > +
> > +   /*
> > +* Check if this packet has already been sent out through
> > +* this bridgeport, in which case we simply send it out
> > +* without further bridge processing.
> > +*/
> > +   for (mtag = m_tag_find(m, PACKET_TAG_BRIDGE, NULL); mtag;
> > +   mtag = m_tag_find(m, 

Re: More if_output()

2015-05-19 Thread Martin Pieuchot
On 15/05/15(Fri) 15:53, Martin Pieuchot wrote:
> Some more if_output() conversion.  The xl bits are here because I'd
> like to reduce the number of places where IFQ_ENQUEUE() is used.
> 
> After applying this diff you should only have a couple left.

Anyone?

> Ok?
> 
> Index: dev/usb/if_upl.c
> ===
> RCS file: /cvs/src/sys/dev/usb/if_upl.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 if_upl.c
> --- dev/usb/if_upl.c  10 Apr 2015 08:41:43 -  1.64
> +++ dev/usb/if_upl.c  15 May 2015 13:43:51 -
> @@ -888,28 +888,5 @@ int
>  upl_output(struct ifnet *ifp, struct mbuf *m, struct sockaddr *dst,
>  struct rtentry *rt0)
>  {
> - int s, len, error;
> -
> - DPRINTFN(10,("%s: %s: enter\n",
> -  ((struct upl_softc *)ifp->if_softc)->sc_dev.dv_xname,
> -  __func__));
> -
> - len = m->m_pkthdr.len;
> - s = splnet();
> - /*
> -  * Queue message on interface, and start output if interface
> -  * not yet active.
> -  */
> - IFQ_ENQUEUE(&ifp->if_snd, m, NULL, error);
> - if (error) {
> - /* mbuf is already freed */
> - splx(s);
> - return (error);
> - }
> - ifp->if_obytes += len;
> - if ((ifp->if_flags & IFF_OACTIVE) == 0)
> - (*ifp->if_start)(ifp);
> - splx(s);
> -
> - return (0);
> + return (if_output(ifp, m));
>  }
> Index: dev/ic/xl.c
> ===
> RCS file: /cvs/src/sys/dev/ic/xl.c,v
> retrieving revision 1.123
> diff -u -p -r1.123 xl.c
> --- dev/ic/xl.c   24 Mar 2015 11:23:02 -  1.123
> +++ dev/ic/xl.c   15 May 2015 13:43:24 -
> @@ -177,9 +177,6 @@ int xl_list_tx_init_90xB(struct xl_softc
>  void xl_wait(struct xl_softc *);
>  void xl_mediacheck(struct xl_softc *);
>  void xl_choose_xcvr(struct xl_softc *, int);
> -#ifdef notdef
> -void xl_testpacket(struct xl_softc *);
> -#endif
>  
>  int xl_miibus_readreg(struct device *, int, int);
>  void xl_miibus_writereg(struct device *, int, int, int);
> @@ -659,35 +656,6 @@ xl_iff_905b(struct xl_softc *sc)
>  
>   XL_SEL_WIN(7);
>  }
> -
> -#ifdef notdef
> -void
> -xl_testpacket(struct xl_softc *sc)
> -{
> - struct mbuf *m;
> - struct ifnet*ifp;
> - int error;
> -
> - ifp = &sc->sc_arpcom.ac_if;
> -
> - MGETHDR(m, M_DONTWAIT, MT_DATA);
> -
> - if (m == NULL)
> - return;
> -
> - memcpy(mtod(m, struct ether_header *)->ether_dhost,
> - &sc->sc_arpcom.ac_enaddr, ETHER_ADDR_LEN);
> - memcpy(mtod(m, struct ether_header *)->ether_shost,
> - &sc->sc_arpcom.ac_enaddr, ETHER_ADDR_LEN);
> - mtod(m, struct ether_header *)->ether_type = htons(3);
> - mtod(m, unsigned char *)[14] = 0;
> - mtod(m, unsigned char *)[15] = 0;
> - mtod(m, unsigned char *)[16] = 0xE3;
> - m->m_len = m->m_pkthdr.len = sizeof(struct ether_header) + 3;
> - IFQ_ENQUEUE(&ifp->if_snd, m, NULL, error);
> - xl_start(ifp);
> -}
> -#endif
>  
>  void
>  xl_setcfg(struct xl_softc *sc)
> Index: net80211/ieee80211_input.c
> ===
> RCS file: /cvs/src/sys/net80211/ieee80211_input.c,v
> retrieving revision 1.133
> diff -u -p -r1.133 ieee80211_input.c
> --- net80211/ieee80211_input.c14 Mar 2015 03:38:51 -  1.133
> +++ net80211/ieee80211_input.c15 May 2015 13:43:51 -
> @@ -827,7 +827,6 @@ ieee80211_deliver_data(struct ieee80211c
>   !(ic->ic_flags & IEEE80211_F_NOBRIDGE) &&
>   eh->ether_type != htons(ETHERTYPE_PAE)) {
>   struct ieee80211_node *ni1;
> - int error, len;
>  
>   if (ETHER_IS_MULTICAST(eh->ether_dhost)) {
>   m1 = m_copym2(m, 0, M_COPYALL, M_DONTWAIT);
> @@ -843,18 +842,8 @@ ieee80211_deliver_data(struct ieee80211c
>   m = NULL;
>   }
>   }
> - if (m1 != NULL) {
> - len = m1->m_pkthdr.len;
> - IFQ_ENQUEUE(&ifp->if_snd, m1, NULL, error);
> - if (error)
> - ifp->if_oerrors++;
> - else {
> - if (m != NULL)
> - ifp->if_omcasts++;
> - ifp->if_obytes += len;
> - if_start(ifp);
> - }
> - }
> + if (m1 != NULL)
> + if_output(ifp, m1);
>   }
>  #endif
>   if (m != NULL) {
> Index: net80211/ieee80211_output.c
> ===
> RCS file: /cvs/src/sys/net80211/ieee80211_output.c,v
> retrieving revision 1.94
> diff -u -p -r1.94 ieee80211_output.c
> --- net80211/ieee80211_output.c   14 Mar 2015 03:38:51 -  1.94
> +++ net80211/i

Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Stuart Henderson
On 2015/05/19 10:10, Johan Ymerson wrote:
> On Tue, 2015-05-19 at 10:10 +0100, Stuart Henderson wrote:
> > On 2015/05/19 09:03, Johan Ymerson wrote:
> > > On Fri, 2015-05-15 at 17:59 +0200, Johan Ymerson wrote:
> > > > I have found a peculiar behaviour in ospfd when the physical link of the
> > > > parent carp interface is down. The carp interface net is then announced
> > > > with it's regular metric.
> > > > 
> > > > An example:
> > > > The cable of em2, parent of carp2 (192.168.254.0/23), is unplugged. Here
> > > > is what is announced, seen by another machine running bird:
> > > > 
> > > > router 192.168.200.4
> > > > distance 10
> > > > network 192.168.200.0/24 metric 10
> > > > stubnet 192.168.202.0/24 metric 65535
> > > > stubnet 192.168.254.0/23 metric 10
> > > > stubnet 195.58.98.144/28 metric 65535
> > > > stubnet 92.33.0.200/30 metric 65535
> > > > stubnet 192.168.253.0/24 metric 10
> > > > 
> > > > 192.168.254.0/23 is announced with metric 10. All other interfaces in
> > > > the same carp group are announced with metric 65535 because the
> > > > link-down state of em2 has demoted the carp group, as it should.
> > > 
> > > After reading my initial post I realize I wasn't clear about the result
> > > of this.
> > > If you have a redundant router set up with carp on one side and ospf on
> > > the other, and plug out a network cable on the carp side on the master,
> > > one will loose network connectivity to that network.
> > > 
> > > In our case we lost Internet access until we realized what was wrong and
> > > shut down the master.
> > 
> > I'm not keen on (relatively complex) special-casing in ospfd for this,
> > I think this is the pertinent question:
> > 
> > > > Also, is the carp kernel code really correct when it leaves the
> > > > interface link state as "unknown" when in carp init state?
> > 
> 
> I don't think we under each other. ospfd already has special-casing for
> CARP-interfaces. It is this special case that introduce this bug.
> 
> A normal interface is not announced when link is down. For carp
> interfaces however, there is an exception for link down so that ospfd
> announces the interface but with a metric of 65535. This is to allow
> faster fail over when carp master changes.
> However, the first exception:
>   !LINK_STATE_IS_UP(iface->linkstate)
> which allow the interface to be announced, triggers on both physical
> link down and on carp backup mode, while the second exception:
>   iface->linkstate == LINK_STATE_DOWN
> which sets the metric to 65535, triggers only on carp backup mode.
> 
> The result is that we announce routes to a network we have no connection
> to, and that is a very bad thing.

Yes I understand that, but if carp init was counted in "LINK_STATE_DOWN"
then the metric would be 65535 which I think would still avoid the
problem you're seeing, and would involve less special-casing in ospfd.



Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Johan Ymerson
On Tue, 2015-05-19 at 10:10 +0100, Stuart Henderson wrote:
> On 2015/05/19 09:03, Johan Ymerson wrote:
> > On Fri, 2015-05-15 at 17:59 +0200, Johan Ymerson wrote:
> > > I have found a peculiar behaviour in ospfd when the physical link of the
> > > parent carp interface is down. The carp interface net is then announced
> > > with it's regular metric.
> > > 
> > > An example:
> > > The cable of em2, parent of carp2 (192.168.254.0/23), is unplugged. Here
> > > is what is announced, seen by another machine running bird:
> > > 
> > > router 192.168.200.4
> > > distance 10
> > > network 192.168.200.0/24 metric 10
> > > stubnet 192.168.202.0/24 metric 65535
> > > stubnet 192.168.254.0/23 metric 10
> > > stubnet 195.58.98.144/28 metric 65535
> > > stubnet 92.33.0.200/30 metric 65535
> > > stubnet 192.168.253.0/24 metric 10
> > > 
> > > 192.168.254.0/23 is announced with metric 10. All other interfaces in
> > > the same carp group are announced with metric 65535 because the
> > > link-down state of em2 has demoted the carp group, as it should.
> > 
> > After reading my initial post I realize I wasn't clear about the result
> > of this.
> > If you have a redundant router set up with carp on one side and ospf on
> > the other, and plug out a network cable on the carp side on the master,
> > one will loose network connectivity to that network.
> > 
> > In our case we lost Internet access until we realized what was wrong and
> > shut down the master.
> 
> I'm not keen on (relatively complex) special-casing in ospfd for this,
> I think this is the pertinent question:
> 
> > > Also, is the carp kernel code really correct when it leaves the
> > > interface link state as "unknown" when in carp init state?
> 

I don't think we under each other. ospfd already has special-casing for
CARP-interfaces. It is this special case that introduce this bug.

A normal interface is not announced when link is down. For carp
interfaces however, there is an exception for link down so that ospfd
announces the interface but with a metric of 65535. This is to allow
faster fail over when carp master changes.
However, the first exception:
  !LINK_STATE_IS_UP(iface->linkstate)
which allow the interface to be announced, triggers on both physical
link down and on carp backup mode, while the second exception:
  iface->linkstate == LINK_STATE_DOWN
which sets the metric to 65535, triggers only on carp backup mode.

The result is that we announce routes to a network we have no connection
to, and that is a very bad thing.

/Johan





Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Stuart Henderson
On 2015/05/19 09:03, Johan Ymerson wrote:
> On Fri, 2015-05-15 at 17:59 +0200, Johan Ymerson wrote:
> > I have found a peculiar behaviour in ospfd when the physical link of the
> > parent carp interface is down. The carp interface net is then announced
> > with it's regular metric.
> > 
> > An example:
> > The cable of em2, parent of carp2 (192.168.254.0/23), is unplugged. Here
> > is what is announced, seen by another machine running bird:
> > 
> > router 192.168.200.4
> > distance 10
> > network 192.168.200.0/24 metric 10
> > stubnet 192.168.202.0/24 metric 65535
> > stubnet 192.168.254.0/23 metric 10
> > stubnet 195.58.98.144/28 metric 65535
> > stubnet 92.33.0.200/30 metric 65535
> > stubnet 192.168.253.0/24 metric 10
> > 
> > 192.168.254.0/23 is announced with metric 10. All other interfaces in
> > the same carp group are announced with metric 65535 because the
> > link-down state of em2 has demoted the carp group, as it should.
> 
> After reading my initial post I realize I wasn't clear about the result
> of this.
> If you have a redundant router set up with carp on one side and ospf on
> the other, and plug out a network cable on the carp side on the master,
> one will loose network connectivity to that network.
> 
> In our case we lost Internet access until we realized what was wrong and
> shut down the master.

I'm not keen on (relatively complex) special-casing in ospfd for this,
I think this is the pertinent question:

> > Also, is the carp kernel code really correct when it leaves the
> > interface link state as "unknown" when in carp init state?



Re: ospfd announces carp interface with physical link down

2015-05-19 Thread Johan Ymerson
On Fri, 2015-05-15 at 17:59 +0200, Johan Ymerson wrote:
> I have found a peculiar behaviour in ospfd when the physical link of the
> parent carp interface is down. The carp interface net is then announced
> with it's regular metric.
> 
> An example:
> The cable of em2, parent of carp2 (192.168.254.0/23), is unplugged. Here
> is what is announced, seen by another machine running bird:
> 
> router 192.168.200.4
> distance 10
> network 192.168.200.0/24 metric 10
> stubnet 192.168.202.0/24 metric 65535
> stubnet 192.168.254.0/23 metric 10
> stubnet 195.58.98.144/28 metric 65535
> stubnet 92.33.0.200/30 metric 65535
> stubnet 192.168.253.0/24 metric 10
> 
> 192.168.254.0/23 is announced with metric 10. All other interfaces in
> the same carp group are announced with metric 65535 because the
> link-down state of em2 has demoted the carp group, as it should.

After reading my initial post I realize I wasn't clear about the result
of this.
If you have a redundant router set up with carp on one side and ospf on
the other, and plug out a network cable on the carp side on the master,
one will loose network connectivity to that network.

In our case we lost Internet access until we realized what was wrong and
shut down the master.

/Johan


> 
> This behaviour is cased by the test for the carp state "down" doesn't
> check for link state "unknown".
> 
> Here is a patch that prevents ospfd from announcing the interface when
> the physical interface is down. One could also argue that it should
> announce it with metric 65535, as in the carp backup state. But I feel
> it is better to not announce it at all since the link down state
> prevents us from becoming the master.
> 
> Index: ospfe.c
> ===
> RCS file: /cvs/src/usr.sbin/ospfd/ospfe.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 ospfe.c
> --- ospfe.c 10 Feb 2015 05:24:48 -  1.90
> +++ ospfe.c 15 May 2015 13:02:40 -
> @@ -880,7 +880,8 @@ orig_rtr_lsa(struct area *area)
> if (!(iface->flags & IFF_UP) ||
> (!LINK_STATE_IS_UP(iface->linkstate) &&
> !(iface->media_type == IFT_CARP &&
> -   iface->linkstate == LINK_STATE_DOWN)))
> +   iface->linkstate == LINK_STATE_DOWN)) ||
> +   (iface->media_type == IFT_CARP && 
> iface->linkstate == LINK_STATE_UNKNOWN))
> continue;
> log_debug("orig_rtr_lsa: stub net, "
> "interface %s", iface->name);
> 
> 
> However, this if statement is difficult to understand as it is and
> should probably be rewritten, maybe something like this:
> 
> Index: ospfe.c
> ===
> RCS file: /cvs/src/usr.sbin/ospfd/ospfe.c,vretrieving revision 1.90
> diff -u -p -r1.90 ospfe.c
> --- ospfe.c 10 Feb 2015 05:24:48 -  1.90
> +++ ospfe.c 15 May 2015 15:13:38 -
> @@ -877,11 +877,17 @@ orig_rtr_lsa(struct area *area)
>  *backup carp interfaces have linkstate down, but
>  *we still announce them.
>  */
> -   if (!(iface->flags & IFF_UP) ||
> -   (!LINK_STATE_IS_UP(iface->linkstate) &&
> -   !(iface->media_type == IFT_CARP &&
> -   iface->linkstate == LINK_STATE_DOWN)))
> -   continue;
> +   if (!(iface->flags & IFF_UP))
> + continue; /* admin down */
> +   
> +   if (iface->media_type == IFT_CARP) {
> + if (iface->linkstate < LINK_STATE_DOWN)
> +   continue; /* physical link down on carp if or 
> invaild */
> +   } else {
> + if (!LINK_STATE_IS_UP(iface->linkstate))
> +   continue; /* UP or UNKNOWN */
> +   }
> +
> log_debug("orig_rtr_lsa: stub net, "
> "interface %s", iface->name);
>  
> 
> Also, is the carp kernel code really correct when it leaves the
> interface link state as "unknown" when in carp init state?
> 
> /Johan Ymerson
> 
>