mcast-proxy daemon

2017-05-19 Thread Rafael Zalamena
Hello tech@,

I have been developing a new daemon for OpenBSD that fills in a gap in
the multicast protocol support for network edges. More specifically I'm
talking about a multicast proxy. I'm sending this e-mail to share the
daemon code and see if there is interest in it.

The mcast-proxy is a less featured multicast routing daemon that is
mostly used on equipment that faces client networks (end users). It is
mainly used when you don't need a full multicast routing daemon (like
dvmrpd, mrouted or pim), but you want to use your network's resources
efficiently. This implementation has the following features:

* Support IPv4 (IGMPv1/v2) multicast proxy
* Support IPv6 (MLDv1) multicast proxy
* Privilege dropping (runs as user)
* chroot jailing

The development of this daemon brought improvements to the IPv6
multicast stack, like:

* Initial MP support
  Now IPv6 multicast routing code uses the art routing table to store
  the multicast routes. This also means you can see your multicast
  routes in route(8).
* Support multiple rdomains
  The mif (multicast interface) entries are now rdomain specific, so
  you can reuse the same mif ids in different rdomains.
* Fixed a few problems in MLD code that prevented some client/server
  functionality

Note: the daemon is not yet pledge()d as pledge(2) has no support for
the MRT(6)_* setsockopt() calls.

Note 2: IPv6 multicast proxy requires an OpenBSD -current, because of
the recent kernel changes and netstat(8).

---

To run multicast routing protocols on your machines you have to configure
the following settings:

* Allow multicast routing:
  # rcctl enable multicast

* (IPv4 only) Allow IGMP packets:
  IGMP packets carry IP options, so your PF pass rule has to accept
  packets with IP options. Example: change 'pass' to 'pass allow-opts'.

* Add a multicast route (if the default doesn't exist or is not correct)
  IPv4: route add 224/8 192.168.0.1
  IPv6: route add ff00::/8 fe80::fce1:baff:fed0:2001%vio1

* In case you are using the default route for multicast you might need
  to specify an alternate multicast source. By default mcast-proxy only
  accepts multicast traffic from the same network as your interface.

  Example:
  em0 has IPv6 address: 2001:db8::100, but the multicast traffic comes
  from 2001:db9::10.

  The mcast-proxy.conf:
  ...
  interface em0 {
          source 2001:db9::/64
          upstream
  }
  ...

  The same applies for IPv4.
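
The source check described above can be sketched in C as a plain prefix
match. This is only an illustration under my own assumptions, not the
daemon's code: prefix_match() and source_allowed() are hypothetical names
that do not appear in the diff.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <assert.h>
#include <string.h>

/* Return 1 if the first plen bits of addr and net are equal, else 0. */
static int
prefix_match(const struct in6_addr *addr, const struct in6_addr *net, int plen)
{
	int bytes = plen / 8, bits = plen % 8;
	unsigned char mask;

	assert(plen >= 0 && plen <= 128);
	/* Compare the whole bytes covered by the prefix first. */
	if (memcmp(addr->s6_addr, net->s6_addr, bytes) != 0)
		return 0;
	if (bits == 0)
		return 1;
	/* Then compare the remaining high bits of the next byte. */
	mask = (unsigned char)(0xff << (8 - bits));
	return (addr->s6_addr[bytes] & mask) == (net->s6_addr[bytes] & mask);
}

/*
 * Hypothetical textual wrapper: returns 1 when src is inside net/plen,
 * 0 when it is not and -1 when one of the addresses does not parse.
 */
static int
source_allowed(const char *src, const char *net, int plen)
{
	struct in6_addr a, n;

	if (inet_pton(AF_INET6, src, &a) != 1 ||
	    inet_pton(AF_INET6, net, &n) != 1)
		return -1;
	return prefix_match(&a, &n, plen);
}
```

With the configuration above, traffic from 2001:db9::10 falls inside
2001:db9::/64 and would be accepted, while em0's own address
2001:db8::100 does not fall inside that prefix.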

---

How to build it:

* Save this e-mail (e.g. /tmp/mail)
* Create a new directory (e.g. mkdir /tmp/mcast-proxy)
* Apply the diff in this e-mail
  (e.g. cd /tmp/mcast-proxy; patch -p0 -i /tmp/mail)
* Build it (e.g. cd /tmp/mcast-proxy; make obj; make)
* Run it (e.g. /tmp/mcast-proxy/obj/mcast-proxy)

Reading the man pages:
* The daemon man page:
  cd /tmp/mcast-proxy; mandoc mcast-proxy.8 | less
* The configuration man page:
  cd /tmp/mcast-proxy; mandoc mcast-proxy.conf.5 | less

---

The daemon code is split into the following files:

* mcast-proxy.c: all IGMP/MLD related packet parsing
* mrt.c: the multicast routing table in userland
* kroute.c: all kernel interactions
* util.c: misc functions that did not fit the other files


Here is the daemon code:

diff --git Makefile Makefile
new file mode 100644
index 000..d99eaed
--- /dev/null
+++ Makefile
@@ -0,0 +1,14 @@
+# $OpenBSD:$
+
+SRCS= mcast-proxy.c kroute.c log.c mrt.c parse.y util.c
+PROG= mcast-proxy
+MAN = mcast-proxy.8 mcast-proxy.conf.5
+
+CFLAGS   += -I${.CURDIR}
+CFLAGS += -Wall -Wextra -Wshadow
+CFLAGS += -Wmissing-prototypes -Wmissing-declarations
+CFLAGS += -Wstrict-prototypes -Wpointer-arith -Wsign-compare
+DPADD   = ${LIBEVENT}
+LDADD   = -levent
+
+.include 
diff --git kroute.c kroute.c
new file mode 100644
index 000..af32091
--- /dev/null
+++ kroute.c
@@ -0,0 +1,1251 @@
+/* $OpenBSD:$  */
+
+/*
+ * Copyright (c) 2017 Rafael Zalamena <rzalam...@openbsd.org>
+ *
+ * Permission to use, copy, modify, and/or distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "mcast-proxy.h"
+
+#define MAX_RTSOCK_BUF (128 * 1024)
+
+int bad_add

Re: mpe(4), mpw(4) and splsoftnet()

2016-12-20 Thread Rafael Zalamena
On Mon, Dec 19, 2016 at 11:48:31AM +0100, Martin Pieuchot wrote:
> Interface ioctl(2) are now always run at IPL_SOFTNET, so let's get rid
> of recursive splsoftnet()/splx() dances.
> 
> ok?

ok rzalamena@



Re: igmp: set rtableid on new mbufs

2016-12-16 Thread Rafael Zalamena
On Wed, Dec 14, 2016 at 06:59:42PM +0100, Martin Pieuchot wrote:
> On 14/12/16(Wed) 16:54, Rafael Zalamena wrote:
> > After running the igmpproxy in multiple domains I noticed that the kernel
> > started complaining about sending packets on wrong domains. Here is the
> > exact message:
> > "
> > vio1: trying to send packet on wrong domain. if 1 vs. mbuf 0
> > "
> > 
> > After some debugging I traced the problem to the igmp_sendpkt() function
> > and it seems that it is missing to set the mbuf rdomain, so this is
> > exactly what this diff does.
> 
> It doesn't make sense to call if_get(9) when all the callers of
> igmp_sendpkt() already have a reference to the sending ifp.  if_get(9)
> has a cost and adds complexity.  I'd rather pass ifp or the rdomain to
> igmp_sendpkt().

Following mpi@'s suggestion, here is a new diff that removes the
if_get()/if_put() from igmp_sendpkt() and makes passing the ifp the
caller's responsibility.
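
The idea can be sketched outside the kernel as follows. This is a toy
model under my own assumptions, not the igmp.c code: struct sketch_ifnet,
struct sketch_mbuf and both functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the interface and the packet header. */
struct sketch_ifnet {
	unsigned int if_rdomain;	/* routing domain of the interface */
};
struct sketch_mbuf {
	unsigned int ph_rtableid;	/* routing domain stamped on the packet */
};

/*
 * The caller hands in the ifp it already holds, so no lookup by index
 * is needed here, and the packet is tagged with the ifp's rdomain.
 */
static void
sketch_sendpkt(const struct sketch_ifnet *ifp, struct sketch_mbuf *m)
{
	assert(ifp != NULL && m != NULL);
	m->ph_rtableid = ifp->if_rdomain;
}

/* Mimics the check behind "trying to send packet on wrong domain". */
static int
sketch_output_ok(const struct sketch_ifnet *ifp, const struct sketch_mbuf *m)
{
	return ifp->if_rdomain == m->ph_rtableid;
}
```

A freshly allocated packet (rdomain 0) sent on an interface in rdomain 1
fails the check until sketch_sendpkt() has stamped it.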

ok?

Index: sys/netinet/igmp.c
===
RCS file: /home/obsdcvs/src/sys/netinet/igmp.c,v
retrieving revision 1.57
diff -u -p -r1.57 igmp.c
--- sys/netinet/igmp.c  14 Dec 2016 17:15:56 -  1.57
+++ sys/netinet/igmp.c  16 Dec 2016 11:19:42 -
@@ -104,7 +104,7 @@ static struct mbuf *router_alert;
 struct igmpstat igmpstat;
 
 void igmp_checktimer(struct ifnet *);
-void igmp_sendpkt(struct in_multi *, int, in_addr_t);
+void igmp_sendpkt(struct ifnet *, struct in_multi *, int, in_addr_t);
 int rti_fill(struct in_multi *);
 struct router_info * rti_find(struct ifnet *);
 void igmp_input_if(struct ifnet *, struct mbuf *, int);
@@ -509,7 +509,7 @@ igmp_joingroup(struct in_multi *inm)
if ((i = rti_fill(inm)) == -1)
goto out;
 
-   igmp_sendpkt(inm, i, 0);
+   igmp_sendpkt(ifp, inm, i, 0);
inm->inm_state = IGMP_DELAYING_MEMBER;
inm->inm_timer = IGMP_RANDOM_DELAY(
IGMP_MAX_HOST_REPORT_DELAY * PR_FASTHZ);
@@ -534,7 +534,8 @@ igmp_leavegroup(struct in_multi *inm)
if (!IN_LOCAL_GROUP(inm->inm_addr.s_addr) &&
ifp && (ifp->if_flags & IFF_LOOPBACK) == 0)
if (inm->inm_rti->rti_type != IGMP_v1_ROUTER)
-   igmp_sendpkt(inm, IGMP_HOST_LEAVE_MESSAGE,
+   igmp_sendpkt(ifp, inm,
+   IGMP_HOST_LEAVE_MESSAGE,
INADDR_ALLROUTERS_GROUP);
break;
case IGMP_LAZY_MEMBER:
@@ -582,10 +583,10 @@ igmp_checktimer(struct ifnet *ifp)
} else if (--inm->inm_timer == 0) {
if (inm->inm_state == IGMP_DELAYING_MEMBER) {
if (inm->inm_rti->rti_type == IGMP_v1_ROUTER)
-   igmp_sendpkt(inm,
+   igmp_sendpkt(ifp, inm,
IGMP_v1_HOST_MEMBERSHIP_REPORT, 0);
else
-   igmp_sendpkt(inm,
+   igmp_sendpkt(ifp, inm,
IGMP_v2_HOST_MEMBERSHIP_REPORT, 0);
inm->inm_state = IGMP_IDLE_MEMBER;
}
@@ -611,22 +612,17 @@ igmp_slowtimo(void)
 }
 
 void
-igmp_sendpkt(struct in_multi *inm, int type, in_addr_t addr)
+igmp_sendpkt(struct ifnet *ifp, struct in_multi *inm, int type,
+in_addr_t addr)
 {
-   struct ifnet *ifp;
struct mbuf *m;
struct igmp *igmp;
struct ip *ip;
struct ip_moptions imo;
 
-   if ((ifp = if_get(inm->inm_ifidx)) == NULL)
-   return;
-
MGETHDR(m, M_DONTWAIT, MT_HEADER);
-   if (m == NULL) {
-   if_put(ifp);
+   if (m == NULL)
return;
-   }
 
/*
 * Assume max_linkhdr + sizeof(struct ip) + IGMP_MINLEN
@@ -674,7 +670,6 @@ igmp_sendpkt(struct in_multi *inm, int t
 #endif /* MROUTING */
 
ip_output(m, router_alert, NULL, IP_MULTICASTOPTS, &imo, NULL, 0);
-   if_put(ifp);
 
++igmpstat.igps_snd_reports;
 }



dhcrelay(8): allow multiple interfaces on l2

2016-12-13 Thread Rafael Zalamena
This diff implements support for allowing dhcrelay(8) to run on multiple
source interfaces with just one instance when using layer 2. This is
useful if you want to run dhcrelay(8) on multiple interfaces and want to
use the same circuit-id/remote-id (e.g. have multiple vlan(4)s on the
same interface).
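
The interface list this diff introduces is built on the queue(3) TAILQ
macros. A minimal sketch of that pattern, assuming simplified types:
struct sketch_intf and the sketch_* functions are hypothetical and only
loosely mirror the lookup/add/remove prototypes in the diff below.

```c
#include <sys/queue.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16	/* assumption: stands in for IFNAMSIZ */

struct sketch_intf {
	TAILQ_ENTRY(sketch_intf) entry;	/* list linkage */
	char name[SKETCH_IFNAMSIZ];
};
TAILQ_HEAD(sketch_intflist, sketch_intf);

static struct sketch_intflist sketch_list =
    TAILQ_HEAD_INITIALIZER(sketch_list);

/* Name the interface and append it to the list. */
static void
sketch_add(struct sketch_intf *intf, const char *name)
{
	assert(intf != NULL && name != NULL);
	snprintf(intf->name, sizeof(intf->name), "%s", name);
	TAILQ_INSERT_TAIL(&sketch_list, intf, entry);
}

/* Find an interface by name, or return NULL. */
static struct sketch_intf *
sketch_lookup(const char *name)
{
	struct sketch_intf *intf;

	TAILQ_FOREACH(intf, &sketch_list, entry) {
		if (strcmp(intf->name, name) == 0)
			return intf;
	}
	return NULL;
}

/* Unlink an interface; the caller still owns the memory. */
static void
sketch_remove(struct sketch_intf *intf)
{
	TAILQ_REMOVE(&sketch_list, intf, entry);
}
```

Compared to a hand-rolled next pointer, removal does not need a walk to
find the previous element.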

Extras: simplified the poll dispatch code, removed some excessive
broadcasts on layer 2 BOOTREPLY messages and added usage examples to the
man page.

ok?

Index: dhcpd.h
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcpd.h,v
retrieving revision 1.18
diff -u -p -r1.18 dhcpd.h
--- dhcpd.h 12 Dec 2016 15:41:05 -  1.18
+++ dhcpd.h 13 Dec 2016 15:40:27 -
@@ -39,6 +39,8 @@
  * Enterprises, see ``http://www.vix.com''.
  */
 
+#include 
+
 #define SERVER_PORT 67
 #define CLIENT_PORT 68
 
@@ -90,7 +92,8 @@ enum dhcp_relay_mode {
 };
 
 struct interface_info {
-   struct interface_info   *next;
+   TAILQ_ENTRY(interface_info)
+entry;
struct hardware  hw_address;
struct in_addr   primary_address;
char name[IFNAMSIZ];
@@ -103,9 +106,9 @@ struct interface_info {
struct ifreq ifr;
int  noifmedia;
int  errors;
-   int  dead;
u_int16_tindex;
 };
+TAILQ_HEAD(intflist, interface_info);
 
 struct timeout {
struct timeout  *next;
@@ -149,6 +152,9 @@ void dispatch(void);
 void got_one(struct protocol *);
 void add_protocol(char *, int, void (*)(struct protocol *), void *);
 void remove_protocol(struct protocol *);
+struct interface_info *lookup_interface(const char *);
+void add_interface(struct interface_info *);
+void remove_interface(struct interface_info *);
 
 /* packet.c */
 void assemble_hw_header(struct interface_info *, unsigned char *,
@@ -169,6 +175,7 @@ extern int server_fd;
 extern time_t cur_time;
 extern int log_priority;
 extern int log_perror;
+extern struct intflist intflist;
 
 static inline struct sockaddr_in *
 ss2sin(struct sockaddr_storage *ss)
Index: dhcrelay.8
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcrelay.8,v
retrieving revision 1.14
diff -u -p -r1.14 dhcrelay.8
--- dhcrelay.8  13 Dec 2016 06:55:32 -  1.14
+++ dhcrelay.8  13 Dec 2016 15:40:27 -
@@ -66,7 +66,7 @@ whence the original request came.
 .Pp
 The server might be a name, address or interface.
 .Nm
-will operate in layer 2 mode when the specified servers are interfaces,
+will operate in layer 2 mode when the specified destinations are interfaces,
 otherwise it will operate in layer 3 mode.
 .Pp
 The name of at least one DHCP server to which DHCP and BOOTP requests
@@ -106,6 +106,10 @@ The name of the network interface that
 should attempt to configure.
 For layer 3 mode at least one IPv4 address has to be configured on this
 interface.
+Multiple network interfaces may be specified to avoid running more than
+one instances of
+.Nm
+when using the layer 2 mode.
 .It Fl o
 Add the relay agent information option.
 By default, this is only enabled for the
@@ -118,6 +122,29 @@ relay agent information sub-option value
 .Nm
 should append on relayed packets.
 If this option is not specified it will use the destination address by default.
+.El
+.Sh EXAMPLES
+Listen on interface em0 in layer 3 mode and relay it to two different servers:
+.Pp
+.Dl # dhcrelay -i em0 10.0.0.1 10.0.0.2
+.Pp
+Listen on em1 in layer 3 mode and append Relay Agent Information:
+.Pp
+.Dl # dhcrelay -o -i em1 10.0.0.3
+.Pp
+Use a different circuit-id for em1:
+.Pp
+.Dl # dhcrelay -o -C new-circuit -i em1 10.0.0.3
+.Pp
+Listen on em2 and relay it to em3 using layer 2:
+.Pp
+.Dl # dhcrelay -i em2 em3
+.Pp
+Use layer 2 relay for more than one listening interface and relay it through
+em3:
+.Pp
+.Dl # dhcrelay -i em0 -i em1 -i em2 em3
+.Pp
 .El
 .Sh SEE ALSO
 .Xr dhclient 8 ,
Index: dhcrelay.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcrelay.c,v
retrieving revision 1.53
diff -u -p -r1.53 dhcrelay.c
--- dhcrelay.c  13 Dec 2016 15:28:19 -  1.53
+++ dhcrelay.c  13 Dec 2016 15:40:27 -
@@ -95,6 +95,8 @@ enum dhcp_relay_mode   drm = DRM_UNKNOWN;
 const char *rai_circuit = NULL;
 const char *rai_remote = NULL;
 
+struct intflist intflist = TAILQ_HEAD_INITIALIZER(intflist);
+
 struct server_list {
struct interface_info *intf;
struct server_list *next;
@@ -127,12 +129,28 @@ main(int argc, char *argv[])
daemonize = 0;
break;
case 'i':
-   if (interfaces != NULL)
-   usage();
+   /* Only layer 2 allows multiple input interfaces. */
+   

dhcrelay(8): fix default layer 3 remote-id

2016-12-13 Thread Rafael Zalamena
After the many iterations of the layer 2 diff, I noticed I broke the
layer 3 default Relay Agent Information insertion: the relayed packet is
using the wrong address in the remote-id field.

This diff makes the Relay Agent Information init function run later so
that it gets the right address for the default remote-id.
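
The selection the diff performs can be sketched like this. It is a
simplified model under my own assumptions: struct sketch_ctx and
default_remote_id() are hypothetical, and the op values are the standard
BOOTP ones (1 for BOOTREQUEST, 2 for BOOTREPLY).

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

#define SKETCH_BOOTREQUEST	1
#define SKETCH_BOOTREPLY	2

/* Hypothetical, reduced packet context: only the two addresses. */
struct sketch_ctx {
	struct sockaddr_in src;
	struct sockaddr_in dst;
};

/*
 * Pick the address used as default remote-id: the destination for
 * replies, the source for everything else, as in the diff below.
 */
static const struct in_addr *
default_remote_id(const struct sketch_ctx *pc, int bootop)
{
	assert(pc != NULL);
	if (bootop == SKETCH_BOOTREPLY)
		return &pc->dst.sin_addr;
	return &pc->src.sin_addr;
}
```

The point is that the choice depends on the packet direction, which is
only known once the op field has been looked at.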

ok?

Index: dhcrelay.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/dhcrelay.c,v
retrieving revision 1.52
diff -u -p -r1.52 dhcrelay.c
--- dhcrelay.c  13 Dec 2016 09:29:05 -  1.52
+++ dhcrelay.c  13 Dec 2016 13:57:16 -
@@ -72,7 +72,7 @@ char  *print_hw_addr(int, int, unsigned c
 void got_response(struct protocol *);
 int get_rdomain(char *);
 
-void relay_agentinfo(struct packet_ctx *, struct interface_info *);
+void relay_agentinfo(struct packet_ctx *, struct interface_info *, int);
 
 int relay_agentinfo_cmp(struct packet_ctx *pc, uint8_t *, int);
 ssize_t relay_agentinfo_append(struct packet_ctx *, struct dhcp_packet *,
@@ -337,8 +337,6 @@ relay(struct interface_info *ip, struct 
return;
}
 
-   relay_agentinfo(pc, ip);
-
/* If it's a bootreply, forward it to the client. */
if (packet->op == BOOTREPLY) {
/* Filter packet that were not meant for us. */
@@ -373,6 +371,7 @@ relay(struct interface_info *ip, struct 
memset(pc->pc_dmac, 0xff, sizeof(pc->pc_dmac));
}
 
+   relay_agentinfo(pc, interfaces, packet->op);
if ((length = relay_agentinfo_remove(pc, packet,
length)) == -1) {
note("ignoring BOOTREPLY with invalid "
@@ -420,6 +419,7 @@ relay(struct interface_info *ip, struct 
if (!packet->giaddr.s_addr)
packet->giaddr = ip->primary_address;
 
+   relay_agentinfo(pc, interfaces, packet->op);
if ((length = relay_agentinfo_append(pc, packet, length)) == -1) {
note("ignoring BOOTREQUEST with invalid "
"relay agent information");
@@ -559,9 +559,11 @@ got_response(struct protocol *l)
 }
 
 void
-relay_agentinfo(struct packet_ctx *pc, struct interface_info *intf)
+relay_agentinfo(struct packet_ctx *pc, struct interface_info *intf,
+int bootop)
 {
-   static u_int8_t buf[8];
+   static u_int8_t  buf[8];
+   struct sockaddr_in  *sin;
 
if (oflag == 0)
return;
@@ -579,10 +581,15 @@ relay_agentinfo(struct packet_ctx *pc, s
pc->pc_circuitlen = 2;
 
if (rai_remote == NULL) {
+   if (bootop == BOOTREPLY)
+   sin = ss2sin(&pc->pc_dst);
+   else
+   sin = ss2sin(&pc->pc_src);
+
pc->pc_remote =
-   (uint8_t *)&ss2sin(&pc->pc_dst)->sin_addr;
+   (uint8_t *)&sin->sin_addr;
pc->pc_remotelen =
-   sizeof(ss2sin(&pc->pc_dst)->sin_addr);
+   sizeof(sin->sin_addr);
}
} else {
pc->pc_circuit = (u_int8_t *)rai_circuit;
@@ -867,7 +874,7 @@ l2relay(struct interface_info *ip, struc
return;
}
 
-   relay_agentinfo(pc, ip);
+   relay_agentinfo(pc, ip, dp->op);
 
switch (dp->op) {
case BOOTREQUEST:



multicast: propagate rdomain for add_vif

2016-12-12 Thread Rafael Zalamena
After trying to run igmpproxy daemon in different rdomains I noted that
it fails with the following message:
"
ERRO: MRT_ADD_VIF; Errno(49): Can't assign requested address
"

In the following line:
"
if ( setsockopt( MRouterFD, IPPROTO_IP, MRT_ADD_VIF,
 (char *)&VifCtl, sizeof( VifCtl ) ) )
my_log( LOG_ERR, errno, "MRT_ADD_VIF" );
"

With some help from mikeb@ we found out that even though the system is
configured for multicast (multicast=YES), setsockopt() was failing
because it was not able to add the address. We traced where it was
failing and found out that MRT_ADD_VIF wasn't propagating the rdomain,
so it was always trying to install in rdomain 0 even when we ran
igmpproxy in a different rdomain.

This diff just makes ip_mrouter_set() propagate the rdomain so that
ifa_ifwithaddr() receives the right rdomain and no longer fails.
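
A toy model of why the lookup failed, under my own assumptions: the
address table, struct sketch_ifaddr and the sketch_* functions are
hypothetical and only imitate the shape of the kernel lookup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address entry: an IPv4 address bound to an rdomain. */
struct sketch_ifaddr {
	uint32_t addr;
	unsigned int rdomain;
};

/* An address only matches inside the rdomain it was configured in. */
static const struct sketch_ifaddr *
sketch_ifwithaddr(const struct sketch_ifaddr *tbl, size_t n, uint32_t addr,
    unsigned int rdomain)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (tbl[i].addr == addr && tbl[i].rdomain == rdomain)
			return &tbl[i];
	}
	return NULL;
}

/* Returns 0 on success or EADDRNOTAVAIL when the lookup fails. */
static int
sketch_add_vif(const struct sketch_ifaddr *tbl, size_t n, uint32_t addr,
    unsigned int rdomain)
{
	if (sketch_ifwithaddr(tbl, n, addr, rdomain) == NULL)
		return EADDRNOTAVAIL;
	return 0;
}
```

Looking up an address that lives in rdomain 1 with a hardcoded rdomain 0
yields EADDRNOTAVAIL, which matches the "Can't assign requested address"
error above; passing the socket's rdomain makes the same lookup succeed.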

ok?

Index: sys/netinet/ip_mroute.c
===
RCS file: /home/obsdcvs/src/sys/netinet/ip_mroute.c,v
retrieving revision 1.93
diff -u -p -r1.93 ip_mroute.c
--- sys/netinet/ip_mroute.c 29 Nov 2016 15:52:12 -  1.93
+++ sys/netinet/ip_mroute.c 12 Dec 2016 14:51:37 -
@@ -130,7 +130,7 @@ int get_vif_cnt(struct sioc_vif_req *);
 int get_vif_ctl(struct vifctl *);
 int ip_mrouter_init(struct socket *, struct mbuf *);
 int get_version(struct mbuf *);
-int add_vif(struct mbuf *);
+int add_vif(struct socket *, struct mbuf *);
 int del_vif(struct mbuf *);
 void update_mfc_params(struct mfc *, struct mfcctl2 *);
 void init_mfc_params(struct mfc *, struct mfcctl2 *);
@@ -293,7 +293,7 @@ ip_mrouter_set(struct socket *so, int op
error = ip_mrouter_done();
break;
case MRT_ADD_VIF:
-   error = add_vif(*mp);
+   error = add_vif(so, *mp);
break;
case MRT_DEL_VIF:
error = del_vif(*mp);
@@ -773,8 +773,9 @@ static struct sockaddr_in sin = { sizeof
  * Add a vif to the vif table
  */
 int
-add_vif(struct mbuf *m)
+add_vif(struct socket *so, struct mbuf *m)
 {
+   struct inpcb *inp;
struct vifctl *vifcp;
struct vif *vifp;
struct ifaddr *ifa;
@@ -809,8 +810,9 @@ add_vif(struct mbuf *m)
} else
 #endif
{
+   inp = sotoinpcb(so);
sin.sin_addr = vifcp->vifc_lcl_addr;
-   ifa = ifa_ifwithaddr(sintosa(&sin), /* XXX */ 0);
+   ifa = ifa_ifwithaddr(sintosa(&sin), inp->inp_rtableid);
if (ifa == NULL)
return (EADDRNOTAVAIL);
}



Re: dhcrelay(8): add support for layer 2 relaying

2016-12-10 Thread Rafael Zalamena
On Fri, Dec 09, 2016 at 11:55:17PM +0100, Reyk Floeter wrote:
> On Fri, Dec 09, 2016 at 10:08:09AM +0100, Rafael Zalamena wrote:
> > On Thu, Dec 08, 2016 at 08:43:20PM +0100, Rafael Zalamena wrote:
> > > This diff implements layer 2 relaying support for dhcrelay with further
> > > support for Relay Agent Info (RFC 3046). This feature is mostly used by
> > > switched networks that might not be using IP addresses when in the edge
> > > with the customer.
> > > 
> > > Basically this diff allows you to run dhcrelay on interfaces without
> > > addresses and doesn't require you to specify a DHCP server address.
> > > Instead you just need to specify the output port.
> > > 
> > > I also updated the man page to show the new options for layer 2 relaying
> > > Relay Agent Info knobs, since you might want to let the remote DHCP
> > > server know where the DHCP packet is coming from.
> > 
> > I forgot to add the man page in the last diff, here is a new one with
> > the man page modifications.
> > 
> > ok?
> > 
> 
> See comments below.
> 

Thanks for the in-depth review. See my comments on the following
snippets:

> For the circuit-id, you could default to the interface name or index
> where the packet was received on.  The remote-id could even default to
> the an hostname or address.  And how does it differ from the existing
> -o (see below)?

Yes we could do that, but then we also have to define a way to make dhcrelay
not use Relay Agent Information (L2 without packet modifications). Can we
fix this in another diff?

> The encoding is different to the DHO_RELAY_AGENT_INFORMATION (option
> 82) that I added for enc0/IPsec where the circuit-id is just the
> interface index and the remote id an IP address.  I know this is
> related to the other standard, but could this be merged with
> relay_agentinfo() somehow or documented that there is a difference?

It is actually the same standard, you just added different information.
Normally you can use the 'giaddr' for layer 3 identification, but I
guess you stumbled on something else and you added relay_agentinfo().

I'm happy to make Layer 3 Relay Agent Information use the same code, but
can we do that in another diff?

Besides the two points mentioned above, I fixed everything else you
commented on.

ok?

Index: bpf.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.13
diff -u -p -r1.13 bpf.c
--- bpf.c   8 Dec 2016 19:18:15 -   1.13
+++ bpf.c   10 Dec 2016 01:52:46 -
@@ -93,6 +93,38 @@ if_register_send(struct interface_info *
 }
 
 /*
+ * Packet filter program: 'ip and udp and dst port CLIENT_PORT'
+ */
+struct bpf_insn dhcp_bpf_sfilter[] = {
+   /* Make sure this is an IP packet... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+   /* Make sure it's a UDP packet... */
+   BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+   /* Make sure this isn't a fragment... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+   BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
+
+   /* Get the IP header length... */
+   BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+   /* Make sure it's to the right port... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_IND, 16),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, CLIENT_PORT, 0, 1),
+
+   /* If we passed all the tests, ask for the whole packet. */
+   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
+
+   /* Otherwise, drop it. */
+   BPF_STMT(BPF_RET+BPF_K, 0),
+};
+
+int dhcp_bpf_sfilter_len = sizeof(dhcp_bpf_sfilter) / sizeof(struct bpf_insn);
+
+/*
  * Packet filter program: 'ip and udp and dst port SERVER_PORT'
  */
 struct bpf_insn dhcp_bpf_filter[] = {
@@ -161,6 +193,38 @@ struct bpf_insn dhcp_bpf_efilter[] = {
 int dhcp_bpf_efilter_len = sizeof(dhcp_bpf_efilter) / sizeof(struct bpf_insn);
 
 /*
+ * Packet write filter program: 'ip and udp and src port CLIENT_PORT'
+ */
+struct bpf_insn dhcp_bpf_swfilter[] = {
+   /* Make sure this is an IP packet... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+   /* Make sure it's a UDP packet... */
+   BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+   /* Make sure this isn't a fragment... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+   BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
+
+   /* Get the IP header length... */
+   BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+   /* Make sure it's from the right port... */
+   BPF_STMT(BPF_LD + BPF_H + 

Re: dhcrelay(8): add support for layer 2 relaying

2016-12-09 Thread Rafael Zalamena
On Thu, Dec 08, 2016 at 08:43:20PM +0100, Rafael Zalamena wrote:
> This diff implements layer 2 relaying support for dhcrelay with further
> support for Relay Agent Info (RFC 3046). This feature is mostly used by
> switched networks that might not be using IP addresses when in the edge
> with the customer.
> 
> Basically this diff allows you to run dhcrelay on interfaces without
> addresses and doesn't require you to specify a DHCP server address.
> Instead you just need to specify the output port.
> 
> I also updated the man page to show the new options for layer 2 relaying
> Relay Agent Info knobs, since you might want to let the remote DHCP
> server know where the DHCP packet is coming from.

I forgot to add the man page in the last diff, here is a new one with
the man page modifications.

ok?

Index: bpf.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.13
diff -u -p -r1.13 bpf.c
--- bpf.c   8 Dec 2016 19:18:15 -   1.13
+++ bpf.c   9 Dec 2016 09:03:44 -
@@ -93,6 +93,38 @@ if_register_send(struct interface_info *
 }
 
 /*
+ * Packet filter program: 'ip and udp and dst port CLIENT_PORT'
+ */
+struct bpf_insn dhcp_bpf_sfilter[] = {
+   /* Make sure this is an IP packet... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+   /* Make sure it's a UDP packet... */
+   BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+   /* Make sure this isn't a fragment... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+   BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
+
+   /* Get the IP header length... */
+   BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+   /* Make sure it's to the right port... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_IND, 16),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, CLIENT_PORT, 0, 1),
+
+   /* If we passed all the tests, ask for the whole packet. */
+   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
+
+   /* Otherwise, drop it. */
+   BPF_STMT(BPF_RET+BPF_K, 0),
+};
+
+int dhcp_bpf_sfilter_len = sizeof(dhcp_bpf_sfilter) / sizeof(struct bpf_insn);
+
+/*
  * Packet filter program: 'ip and udp and dst port SERVER_PORT'
  */
 struct bpf_insn dhcp_bpf_filter[] = {
@@ -161,6 +193,38 @@ struct bpf_insn dhcp_bpf_efilter[] = {
 int dhcp_bpf_efilter_len = sizeof(dhcp_bpf_efilter) / sizeof(struct bpf_insn);
 
 /*
+ * Packet write filter program: 'ip and udp and src port CLIENT_PORT'
+ */
+struct bpf_insn dhcp_bpf_swfilter[] = {
+   /* Make sure this is an IP packet... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+   /* Make sure it's a UDP packet... */
+   BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+   /* Make sure this isn't a fragment... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+   BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
+
+   /* Get the IP header length... */
+   BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+   /* Make sure it's from the right port... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_IND, 14),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, CLIENT_PORT, 0, 1),
+
+   /* If we passed all the tests, ask for the whole packet. */
+   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
+
+   /* Otherwise, drop it. */
+   BPF_STMT(BPF_RET+BPF_K, 0),
+};
+
+int dhcp_bpf_swfilter_len = sizeof(dhcp_bpf_swfilter) / sizeof(struct bpf_insn);
+
+/*
  * Packet write filter program: 'ip and udp and src port SERVER_PORT'
  */
 struct bpf_insn dhcp_bpf_wfilter[] = {
@@ -193,7 +257,7 @@ struct bpf_insn dhcp_bpf_wfilter[] = {
 int dhcp_bpf_wfilter_len = sizeof(dhcp_bpf_wfilter) / sizeof(struct bpf_insn);
 
 void
-if_register_receive(struct interface_info *info)
+if_register_receive(struct interface_info *info, int isserver)
 {
struct bpf_version v;
struct bpf_program p;
@@ -234,7 +298,10 @@ if_register_receive(struct interface_inf
info->rbuf_len = 0;
 
/* Set up the bpf filter program structure. */
-   if (info->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
+   if (isserver) {
+   p.bf_len = dhcp_bpf_sfilter_len;
+   p.bf_insns = dhcp_bpf_sfilter;
+   } else if (info->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
p.bf_len = dhcp_bpf_efilter_len;
p.bf_insns = dhcp_bpf_efilter;
} else {
@@ -245,8 +312,13 @@ if_register_receive(struct interface_inf
error("Can't install packet filter program: %m");
 
/* Set up the bpf write filter program structure. */
-   p.bf_len = dhcp_bpf_wfilter_len;
-   p.bf_insns = dhcp_bpf_wfilter;
+   if (isserver) {
+   p.bf_

dhcrelay(8): filter BOOTREPLY packets

2016-12-08 Thread Rafael Zalamena
This diff makes dhcrelay(8) drop BOOTREPLY packets that were not meant
for us, that is replies whose giaddr is not our primary address. This is
a safety check suggested by jca@ to avoid relaying packets with the
address of other relays.
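
The check itself is a single address comparison. A standalone sketch,
assuming hypothetical helper names (neither function exists in dhcrelay):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <assert.h>
#include <stddef.h>

/* A reply is ours only when its giaddr equals our primary address. */
static int
sketch_reply_is_ours(struct in_addr giaddr, struct in_addr primary)
{
	return giaddr.s_addr == primary.s_addr;
}

/*
 * Hypothetical textual wrapper: 1 when the reply is ours, 0 when it
 * belongs to another relay, -1 when an address does not parse.
 */
static int
sketch_reply_check(const char *giaddr, const char *primary)
{
	struct in_addr g, p;

	assert(giaddr != NULL && primary != NULL);
	if (inet_pton(AF_INET, giaddr, &g) != 1 ||
	    inet_pton(AF_INET, primary, &p) != 1)
		return -1;
	return sketch_reply_is_ours(g, p);
}
```

A reply carrying another relay's giaddr returns 0 and would be dropped
instead of being forwarded to the client.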

ok?

Index: dhcrelay.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcrelay.c,v
retrieving revision 1.49
diff -u -p -r1.49 dhcrelay.c
--- dhcrelay.c  8 Dec 2016 19:18:15 -   1.49
+++ dhcrelay.c  8 Dec 2016 19:52:51 -
@@ -276,6 +276,11 @@ relay(struct interface_info *ip, struct 
 
/* If it's a bootreply, forward it to the client. */
if (packet->op == BOOTREPLY) {
+   /* Filter packet that were not meant for us. */
+   if (packet->giaddr.s_addr !=
+   interfaces->primary_address.s_addr)
+   return;
+
bzero(&to, sizeof(to));
if (!(packet->flags & htons(BOOTP_BROADCAST))) {
to.sin_addr = packet->yiaddr;



dhcrelay(8): add support for layer 2 relaying

2016-12-08 Thread Rafael Zalamena
This diff implements layer 2 relaying support for dhcrelay with further
support for Relay Agent Info (RFC 3046). This feature is mostly used by
switched networks that might not be using IP addresses at the edge
facing the customer.

Basically this diff allows you to run dhcrelay on interfaces without
addresses and doesn't require you to specify a DHCP server address.
Instead you just need to specify the output port.

I also updated the man page to show the new layer 2 relaying options and
the Relay Agent Info knobs, since you might want to let the remote DHCP
server know where the DHCP packet is coming from.
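
For readers not used to BPF programs, the new receive filter in the diff
below ('ip and udp and dst port CLIENT_PORT') can be read as the
following plain C check on an Ethernet frame. This is my own sketch, not
code from the diff: sketch_sfilter() is a hypothetical name and the port
value 68 is taken from the CLIENT_PORT define in dhcpd.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETHERTYPE_IP	0x0800
#define SKETCH_IPPROTO_UDP	17
#define SKETCH_CLIENT_PORT	68

/* Read a big-endian 16-bit value at offset off. */
static unsigned int
sketch_be16(const uint8_t *f, size_t off)
{
	return ((unsigned int)f[off] << 8) | f[off + 1];
}

/* Return 1 to accept the frame, 0 to drop it. */
static int
sketch_sfilter(const uint8_t *f, size_t len)
{
	size_t ihl;

	assert(f != NULL);
	if (len < 24)
		return 0;
	/* Ethertype at offset 12 must be IP. */
	if (sketch_be16(f, 12) != SKETCH_ETHERTYPE_IP)
		return 0;
	/* IP protocol at offset 23 must be UDP. */
	if (f[23] != SKETCH_IPPROTO_UDP)
		return 0;
	/* Fragment offset bits at offset 20 must be zero. */
	if ((sketch_be16(f, 20) & 0x1fff) != 0)
		return 0;
	/* IP header length in bytes, what BPF_MSH loads into X. */
	ihl = (size_t)(f[14] & 0x0f) * 4;
	if (len < ihl + 18)
		return 0;
	/* UDP destination port sits at X + 16. */
	return sketch_be16(f, ihl + 16) == SKETCH_CLIENT_PORT;
}
```

The write filter differs only in the last step: it loads the halfword at
X + 14, which is the UDP source port.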

ok?

Index: bpf.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.13
diff -u -p -r1.13 bpf.c
--- bpf.c   8 Dec 2016 19:18:15 -   1.13
+++ bpf.c   8 Dec 2016 19:34:44 -
@@ -93,6 +93,38 @@ if_register_send(struct interface_info *
 }
 
 /*
+ * Packet filter program: 'ip and udp and dst port CLIENT_PORT'
+ */
+struct bpf_insn dhcp_bpf_sfilter[] = {
+   /* Make sure this is an IP packet... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+   /* Make sure it's a UDP packet... */
+   BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+   /* Make sure this isn't a fragment... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+   BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
+
+   /* Get the IP header length... */
+   BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+   /* Make sure it's to the right port... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_IND, 16),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, CLIENT_PORT, 0, 1),
+
+   /* If we passed all the tests, ask for the whole packet. */
+   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
+
+   /* Otherwise, drop it. */
+   BPF_STMT(BPF_RET+BPF_K, 0),
+};
+
+int dhcp_bpf_sfilter_len = sizeof(dhcp_bpf_sfilter) / sizeof(struct bpf_insn);
+
+/*
  * Packet filter program: 'ip and udp and dst port SERVER_PORT'
  */
 struct bpf_insn dhcp_bpf_filter[] = {
@@ -161,6 +193,38 @@ struct bpf_insn dhcp_bpf_efilter[] = {
 int dhcp_bpf_efilter_len = sizeof(dhcp_bpf_efilter) / sizeof(struct bpf_insn);
 
 /*
+ * Packet write filter program: 'ip and udp and src port CLIENT_PORT'
+ */
+struct bpf_insn dhcp_bpf_swfilter[] = {
+   /* Make sure this is an IP packet... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 12),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
+
+   /* Make sure it's a UDP packet... */
+   BPF_STMT(BPF_LD + BPF_B + BPF_ABS, 23),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
+
+   /* Make sure this isn't a fragment... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
+   BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
+
+   /* Get the IP header length... */
+   BPF_STMT(BPF_LDX + BPF_B + BPF_MSH, 14),
+
+   /* Make sure it's from the right port... */
+   BPF_STMT(BPF_LD + BPF_H + BPF_IND, 14),
+   BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, CLIENT_PORT, 0, 1),
+
+   /* If we passed all the tests, ask for the whole packet. */
+   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
+
+   /* Otherwise, drop it. */
+   BPF_STMT(BPF_RET+BPF_K, 0),
+};
+
+int dhcp_bpf_swfilter_len = sizeof(dhcp_bpf_swfilter) / sizeof(struct bpf_insn);
+
+/*
  * Packet write filter program: 'ip and udp and src port SERVER_PORT'
  */
 struct bpf_insn dhcp_bpf_wfilter[] = {
@@ -193,7 +257,7 @@ struct bpf_insn dhcp_bpf_wfilter[] = {
 int dhcp_bpf_wfilter_len = sizeof(dhcp_bpf_wfilter) / sizeof(struct bpf_insn);
 
 void
-if_register_receive(struct interface_info *info)
+if_register_receive(struct interface_info *info, int isserver)
 {
struct bpf_version v;
struct bpf_program p;
@@ -234,7 +298,10 @@ if_register_receive(struct interface_inf
info->rbuf_len = 0;
 
/* Set up the bpf filter program structure. */
-   if (info->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
+   if (isserver) {
+   p.bf_len = dhcp_bpf_sfilter_len;
+   p.bf_insns = dhcp_bpf_sfilter;
+   } else if (info->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
p.bf_len = dhcp_bpf_efilter_len;
p.bf_insns = dhcp_bpf_efilter;
} else {
@@ -245,8 +312,13 @@ if_register_receive(struct interface_inf
error("Can't install packet filter program: %m");
 
/* Set up the bpf write filter program structure. */
-   p.bf_len = dhcp_bpf_wfilter_len;
-   p.bf_insns = dhcp_bpf_wfilter;
+   if (isserver) {
+   p.bf_len = dhcp_bpf_swfilter_len;
+   p.bf_insns = dhcp_bpf_swfilter;
+   } else {
+   p.bf_len = dhcp_bpf_wfilter_len;
+   p.bf_insns = dhcp_bpf_wfilter;
+   }
 
if (ioctl(info->rfdesc, BIOCSETWF, &p) == -1)
  

Re: dhcrelay(8): clean up function prototypes

2016-12-08 Thread Rafael Zalamena
On Thu, Dec 08, 2016 at 06:59:18PM +0100, Jeremie Courreges-Anglas wrote:
> Rafael Zalamena <rzalam...@gmail.com> writes:
> 
> [...]
> 
> >> Another problem: the relay->server code uses send(2) on a connected
> >> socket and thus has no destination IP issue.  But the relay->client path
> >> now uses the source address from the server->relay packet.  I think
> >> we should keep using interfaces->primary_address here...
> 
> [...]
> 
> > Here is the new diff with the improvements, ok?
> 
> Thanks.  Your diff still makes use of the server's source address to
> relay the BOOTREPLY to the client.  I'd say that this is an important
> change and I'm not sure at all whether it is desirable.  It surely can't
> be committed as part of a cleanup diff.

Sorry, I forgot about that one. I have a multi-relay setup and it does
make a difference.

Here is a new diff with your suggestion.

ok?

Index: bpf.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.12
diff -u -p -r1.12 bpf.c
--- bpf.c   8 Dec 2016 09:29:50 -   1.12
+++ bpf.c   8 Dec 2016 18:41:43 -
@@ -258,24 +258,23 @@ if_register_receive(struct interface_inf
 
 ssize_t
 send_packet(struct interface_info *interface,
-struct dhcp_packet *raw, size_t len, struct in_addr from,
-struct sockaddr_in *to, struct hardware *hto)
+struct dhcp_packet *raw, size_t len, struct packet_ctx *pc)
 {
unsigned char buf[256];
struct iovec iov[2];
int result, bufp = 0;
 
if (interface->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
-   socklen_t slen = sizeof(*to);
+   socklen_t slen = pc->pc_dst.ss_len;
result = sendto(server_fd, raw, len, 0,
-   (struct sockaddr *)to, slen);
+   (struct sockaddr *)&pc->pc_dst, slen);
goto done;
}
 
/* Assemble the headers... */
-   assemble_hw_header(interface, buf, &bufp, hto);
-   assemble_udp_ip_header(interface, buf, &bufp, from.s_addr,
-   to->sin_addr.s_addr, to->sin_port, (unsigned char *)raw, len);
+   assemble_hw_header(interface, buf, &bufp, pc);
+   assemble_udp_ip_header(interface, buf, &bufp, pc,
+   (unsigned char *)raw, len);
 
/* Fire it off */
iov[0].iov_base = (char *)buf;
@@ -292,7 +291,7 @@ send_packet(struct interface_info *inter
 
 ssize_t
 receive_packet(struct interface_info *interface, unsigned char *buf,
-size_t len, struct sockaddr_in *from, struct hardware *hfrom)
+size_t len, struct packet_ctx *pc)
 {
int length = 0, offset = 0;
struct bpf_hdr hdr;
@@ -358,7 +357,7 @@ receive_packet(struct interface_info *in
 
/* Decode the physical header... */
offset = decode_hw_header(interface,
-   interface->rbuf, interface->rbuf_offset, hfrom);
+   interface->rbuf, interface->rbuf_offset, pc);
 
/*
 * If a physical layer checksum failed (dunno of any
@@ -374,7 +373,7 @@ receive_packet(struct interface_info *in
 
/* Decode the IP and UDP headers... */
offset = decode_udp_ip_header(interface, interface->rbuf,
-   interface->rbuf_offset, from, hdr.bh_caplen);
+   interface->rbuf_offset, pc, hdr.bh_caplen);
 
/* If the IP or UDP checksum was bad, skip the packet... */
if (offset < 0) {
Index: dhcpd.h
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/dhcpd.h,v
retrieving revision 1.16
diff -u -p -r1.16 dhcpd.h
--- dhcpd.h 8 Dec 2016 09:29:50 -   1.16
+++ dhcpd.h 8 Dec 2016 18:41:43 -
@@ -42,15 +42,28 @@
 #define SERVER_PORT 67
 #define CLIENT_PORT 68
 
+/* Maximum size of client hardware address. */
+#define CHADDR_SIZE 16
+
+struct packet_ctx {
+   uint8_t  pc_htype;
+   uint8_t  pc_hlen;
+   uint8_t  pc_smac[CHADDR_SIZE];
+   uint8_t  pc_dmac[CHADDR_SIZE];
+
+   struct sockaddr_storage  pc_src;
+   struct sockaddr_storage  pc_dst;
+};
+
 struct iaddr {
int len;
-   unsigned char iabuf[16];
+   unsigned char iabuf[CHADDR_SIZE];
 };
 
 struct hardware {
u_int8_t htype;
u_int8_t hlen;
-   u_int8_t haddr[16];
+   u_int8_t haddr[CHADDR_SIZE];
 };
 
 /* Possible states in which the client can be. */
@@ -112,15 +125,13 @@ int if_register_bpf(struct interface_inf
 void if_register_send(struct interface_info *);
 void if_register_receive(struct interface_info *);
 ssize_t send_packet(struct interface_info *,
-struct dhcp_packet *, s

Re: dhcrelay(8): clean up function prototypes

2016-12-08 Thread Rafael Zalamena
On Thu, Dec 08, 2016 at 05:07:41PM +0100, Jeremie Courreges-Anglas wrote:
> ---snipped---
> I think you've summoned Cthulhu with this diff. :)
> 
> I started noticing wrong behavior when testing a multi-relay
> setup, as prompted by Patrick's diff. See
> 
>   marc.info/?l=openbsd-tech&m=148096893918967&w=2
> 
> Such a setup didn't work before, and is still half-broken in -current.
> 
> For the record:
> 
> --8<--
> Addresses start with 192.168...
> 
> [server] <-> [relay1] <-> [relay2] <-> [client]
>   1.1       1.2    2.1   2.2    3.1
> 
> Here dhcrelay on relay2 ignores replies sent by 192.168.1.1 to
> 192.168.3.1, because its UDP socket is bound to 192.168.3.1 but
> connected to 192.168.2.2.
> -->8--
> 
> A second problem is that, if [relay1] is on the path between [server] and
> [relay2], the BPF socket will catch [server]'s reply, think it has to
> handle it (I don't think it should), and send it to [client].  The code
> in dhcrelay(8) doesn't make enough checks to cope with such a situation.
> This is not a big problem in practice, but now your diff comes in.
> 
> In got_response(), you store client_port, 68, in
> ss2sin(&pc->pc_dst)->sin_port.  But in got_one() -> receive_packet() ->
> decode_udp_ip_header(), you store the UDP port of the incoming packet.
> In the case of a [server]->[relay2] message, the port is 67.  [relay1] thus
> sends a packet to the client with 67 as the destination port.  This
> packet matches the BPF filter of [relay1], which then goes into an
> infinite loop.
> 
> IIUC, an obvious improvement would be to ignore BOOTREPLY packets with
> a giaddr that doesn't match our primary address.  relay() could use such
> a check:
> 
>   /* If it's a bootreply, forward it to the client. */
>   if (packet->op == BOOTREPLY) {
> 
>   /* Is this packet actually for us? */
>   if (packet->giaddr.s_addr != interfaces->primary_address.s_addr)
>   return;
>   [...]
>   }
> 
> Maybe there is a drawback that I can't see, but blindly relaying like we
> do now can't be a good thing.

This is not the purpose of this diff, but we can fix it in a later diff.
I want to keep the scope on simplifying the function prototypes rather
than on fixing pre-existing problems.

> 
> The reason why port 67 ends up on the wire instead of port 68 is
> probably this:
> 
>   bzero(&to, sizeof(to));
>   if (!(packet->flags & htons(BOOTP_BROADCAST))) {
>   to.sin_addr = packet->yiaddr;
>   to.sin_port = client_port;
>   } else {
>   to.sin_addr.s_addr = htonl(INADDR_BROADCAST);
>   to.sin_port = client_port;
>   }
>   to.sin_family = AF_INET;
>   to.sin_len = sizeof to;
>   memcpy(&ss2sin(&pc->pc_dst)->sin_addr, &to.sin_addr,
>   sizeof(ss2sin(&pc->pc_dst)->sin_addr));
> 
> The memcpy copies only the IP address, not the port.  This should
> be something like
> 
>   *ss2sin(&pc->pc_dst) = to;

New diff has this fix, thank you!
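
For the archive, here is a minimal standalone sketch of the difference
between the two copies discussed above. It is not dhcrelay code:
make_sin(), copy_addr_only() and copy_whole() are hypothetical helpers
written only for this illustration, and the addresses are made up.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build an IPv4 sockaddr from text and a host-order port. */
static struct sockaddr_in
make_sin(const char *ip, unsigned short port)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	inet_pton(AF_INET, ip, &sin.sin_addr);
	return sin;
}

/* Mirrors the memcpy() variant: only the address is copied, the port is kept. */
static void
copy_addr_only(struct sockaddr_in *dst, const struct sockaddr_in *src)
{
	memcpy(&dst->sin_addr, &src->sin_addr, sizeof(dst->sin_addr));
}

/* Mirrors the struct assignment: family, port and address are all copied. */
static void
copy_whole(struct sockaddr_in *dst, const struct sockaddr_in *src)
{
	*dst = *src;
}
```

If the destination previously held the port of the received packet (67),
copy_addr_only() leaves that stale port in place, while copy_whole()
replaces it with the port stored in the source structure.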

> 
> Another problem: the relay->server code uses send(2) on a connected
> socket and thus has no destination IP issue.  But the relay->client path
> now uses the source address from the server->relay packet.  I think
> we should keep using interfaces->primary_address here...
> 
> No idea if the proposals above would fix all the problems, but I have to
> finish this mail. :)
> 
> Aside from this:
> - please drop the extra netinet/if_ether.h includes
> - not a problem per se, but such a simple function like ss2sin could
>   probably be defined static inline in a header

I accept your suggestions and I have applied them.

Here is the new diff with the improvements, ok?

Index: bpf.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.12
diff -u -p -r1.12 bpf.c
--- bpf.c   8 Dec 2016 09:29:50 -   1.12
+++ bpf.c   8 Dec 2016 17:03:21 -
@@ -258,24 +258,23 @@ if_register_receive(struct interface_inf
 
 ssize_t
 send_packet(struct interface_info *interface,
-struct dhcp_packet *raw, size_t len, struct in_addr from,
-struct sockaddr_in *to, struct hardware *hto)
+struct dhcp_packet *raw, size_t len, struct packet_ctx *pc)
 {
unsigned char buf[256];
struct iovec iov[2];
int result, bufp = 0;
 
if (interface->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
-   socklen_t slen = sizeof(*to);
+   socklen_t slen = pc->pc_dst.ss_len;
result = sendto(server_fd, raw, len, 0,
-   (struct sockaddr *)to, slen);
+   (struct sockaddr *)&pc->pc_dst, slen);
goto done;
}
 
/* Assemble the headers... */
-   assemble_hw_header(interface, buf, &bufp, hto);
-   assemble_udp_ip_header(interface, buf, &bufp, from.s_addr,
-   to->sin_addr.s_addr, 

Re: dhcrelay(8): clean up function prototypes

2016-12-08 Thread Rafael Zalamena
On Wed, Dec 07, 2016 at 09:36:24PM +0100, Jeremie Courreges-Anglas wrote:
> Rafael Zalamena <rzalam...@gmail.com> writes:
> 
> > I'm implementing some features for dhcrelay and to make them fit I need
> > some cleanups in dhcrelay(8) first. This diff changes most of the
> > input/output function prototypes to take one parameter with all addresses
> > instead of passing multiple parameters.
> >
> > Basically this will make input functions gather more information (source/
> > destination MACs, source/destination IPs, source/destination ports) and
> > use it in the output instead of trying to figure out this information along
> > the way.
> >
> > With this we will be able to add IPv6 support and layer 2 relaying.
> 
> Nice. :)
> 
> [...]
> 
> > ok?
> 
> This conflicts with a diff that has been committed by patrick@, you'll
> need to refresh it.

I updated the diff with the latest commits from patrick@. Basically, instead
of clearing hto to trigger the memset(, 0xff,) on the destination MAC, we just
update the pc_dmac field (see assemble_hw_header()).

> 
> I didn't review it entirely, but please address the point below.

I changed the struct fields sss and dss to src and dst respectively.

ok?

Index: bpf.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.11
diff -u -p -r1.11 bpf.c
--- bpf.c   28 May 2016 07:00:18 -  1.11
+++ bpf.c   8 Dec 2016 09:10:23 -
@@ -258,24 +258,23 @@ if_register_receive(struct interface_inf
 
 ssize_t
 send_packet(struct interface_info *interface,
-struct dhcp_packet *raw, size_t len, struct in_addr from,
-struct sockaddr_in *to, struct hardware *hto)
+struct dhcp_packet *raw, size_t len, struct packet_ctx *pc)
 {
unsigned char buf[256];
struct iovec iov[2];
int result, bufp = 0;
 
if (interface->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
-   socklen_t slen = sizeof(*to);
+   socklen_t slen = pc->pc_dst.ss_len;
result = sendto(server_fd, raw, len, 0,
-   (struct sockaddr *)to, slen);
+   (struct sockaddr *)&pc->pc_dst, slen);
goto done;
}
 
/* Assemble the headers... */
-   assemble_hw_header(interface, buf, &bufp, hto);
-   assemble_udp_ip_header(interface, buf, &bufp, from.s_addr,
-   to->sin_addr.s_addr, to->sin_port, (unsigned char *)raw, len);
+   assemble_hw_header(interface, buf, &bufp, pc);
+   assemble_udp_ip_header(interface, buf, &bufp, pc,
+   (unsigned char *)raw, len);
 
/* Fire it off */
iov[0].iov_base = (char *)buf;
@@ -292,7 +291,7 @@ send_packet(struct interface_info *inter
 
 ssize_t
 receive_packet(struct interface_info *interface, unsigned char *buf,
-size_t len, struct sockaddr_in *from, struct hardware *hfrom)
+size_t len, struct packet_ctx *pc)
 {
int length = 0, offset = 0;
struct bpf_hdr hdr;
@@ -358,7 +357,7 @@ receive_packet(struct interface_info *in
 
/* Decode the physical header... */
offset = decode_hw_header(interface,
-   interface->rbuf, interface->rbuf_offset, hfrom);
+   interface->rbuf, interface->rbuf_offset, pc);
 
/*
 * If a physical layer checksum failed (dunno of any
@@ -374,7 +373,7 @@ receive_packet(struct interface_info *in
 
/* Decode the IP and UDP headers... */
offset = decode_udp_ip_header(interface, interface->rbuf,
-   interface->rbuf_offset, from, hdr.bh_caplen);
+   interface->rbuf_offset, pc, hdr.bh_caplen);
 
/* If the IP or UDP checksum was bad, skip the packet... */
if (offset < 0) {
Index: dhcpd.h
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/dhcpd.h,v
retrieving revision 1.15
diff -u -p -r1.15 dhcpd.h
--- dhcpd.h 7 Dec 2016 13:19:18 -   1.15
+++ dhcpd.h 8 Dec 2016 09:10:23 -
@@ -42,15 +42,28 @@
 #define SERVER_PORT 67
 #define CLIENT_PORT 68
 
+/* Maximum size of client hardware address. */
+#define CHADDR_SIZE 16
+
+struct packet_ctx {
+   uint8_t  pc_htype;
+   uint8_t  pc_hlen;
+   uint8_t  pc_smac[CHADDR_SIZE];
+   uint8_t  pc_dmac[CHADDR_SIZE];
+
+   struct sockaddr_storage  pc_src;
+   struct sockaddr_storage  pc_dst;
+};
+
 struct iaddr {
int len;
-   unsigned char iabuf[16];
+   unsigned char iabuf[CHADDR_SIZE];
 };
 
 struct hardware {
u_int8_t htype;
u_int8_t hlen;
-   u_int8_t 

Re: dhcrelay(8): simplify get_interface()

2016-12-07 Thread Rafael Zalamena
On Wed, Dec 07, 2016 at 05:34:05PM +0100, Rafael Zalamena wrote:
> This diff simplifies the get_interface() function and makes it more
> straightforward. It also makes dhcrelay(8) print a more informative error
> message when running in layer 3 mode (default) on interfaces without an
> address.
> 
> I'll use this code later to allow get_interface() on interfaces without an
> IP address.

I forgot to make it return NULL if no interfaces are found (e.g. invalid
interface name), so it failed with a cryptic death message:
"Can't attach interface vip1 to bpf device: Device not configured"

This updated diff fixes the problem.
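
To make the three outcomes explicit (name not found, interface without
an IPv4 address, interface with a primary address), here is a rough
standalone sketch of the lookup. It is not the code from the diff:
lookup_primary() and its return values are assumptions made for this
example, and it walks a list in the format returned by getifaddrs(3)
instead of calling getifaddrs() itself.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <string.h>

/*
 * Hypothetical helper: walk an ifaddrs list looking for ifname.
 * Returns -1 if the name never appears, 0 if the interface exists but
 * has no usable IPv4 address, and 1 if *primary was filled in.
 */
static int
lookup_primary(const struct ifaddrs *list, const char *ifname,
    struct in_addr *primary)
{
	const struct ifaddrs *ifa;
	const struct sockaddr_in *sin;
	int found = 0;

	primary->s_addr = 0;
	for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
		if (strcmp(ifname, ifa->ifa_name) != 0)
			continue;
		found = 1;

		if (ifa->ifa_addr == NULL ||
		    ifa->ifa_addr->sa_family != AF_INET)
			continue;
		/* We already have the primary address. */
		if (primary->s_addr != 0)
			continue;
		sin = (const struct sockaddr_in *)ifa->ifa_addr;
		if (sin->sin_addr.s_addr == htonl(INADDR_LOOPBACK))
			continue;
		primary->s_addr = sin->sin_addr.s_addr;
	}

	if (!found)
		return -1;
	return primary->s_addr != 0;
}
```

A caller can then report "no such interface" and "interface has no
address" separately instead of failing later in the BPF attach.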

ok?

Index: bpf.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.11
diff -u -p -r1.11 bpf.c
--- bpf.c   28 May 2016 07:00:18 -  1.11
+++ bpf.c   7 Dec 2016 17:49:16 -
@@ -75,7 +75,7 @@ if_register_bpf(struct interface_info *i
error("Can't open bpf device: %m");
 
/* Set the BPF device to point at this interface. */
-   if (ioctl(sock, BIOCSETIF, info->ifp) == -1)
+   if (ioctl(sock, BIOCSETIF, &info->ifr) == -1)
error("Can't attach interface %s to bpf device: %m",
info->name);
 
Index: dhcpd.h
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/dhcpd.h,v
retrieving revision 1.15
diff -u -p -r1.15 dhcpd.h
--- dhcpd.h 7 Dec 2016 13:19:18 -   1.15
+++ dhcpd.h 7 Dec 2016 17:49:16 -
@@ -76,7 +76,7 @@ struct interface_info {
size_t   rbuf_max;
size_t   rbuf_offset;
size_t   rbuf_len;
-   struct ifreq*ifp;
+   struct ifreq ifr;
int  noifmedia;
int  errors;
int  dead;
Index: dhcrelay.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/dhcrelay.c,v
retrieving revision 1.44
diff -u -p -r1.44 dhcrelay.c
--- dhcrelay.c  7 Dec 2016 13:19:18 -   1.44
+++ dhcrelay.c  7 Dec 2016 17:49:16 -
@@ -165,6 +165,9 @@ main(int argc, char *argv[])
 
if (interfaces == NULL)
error("no interface given");
+   if (interfaces->primary_address.s_addr == 0)
+   error("interface '%s' does not have an address",
+   interfaces->name);
 
/* Default DHCP/BOOTP ports. */
server_port = htons(SERVER_PORT);
Index: dispatch.c
===
RCS file: /home/obsdcvs/src/usr.sbin/dhcrelay/dispatch.c,v
retrieving revision 1.12
diff -u -p -r1.12 dispatch.c
--- dispatch.c  7 Dec 2016 13:19:18 -   1.12
+++ dispatch.c  7 Dec 2016 17:49:16 -
@@ -79,15 +79,15 @@ get_interface(const char *ifname, void (
 {
struct interface_info   *iface;
struct ifaddrs  *ifap, *ifa;
-   struct ifreq*tif;
-   struct sockaddr_in   foo;
+   struct sockaddr_in  *sin;
+   int  found = 0;
 
if ((iface = calloc(1, sizeof(*iface))) == NULL)
error("failed to allocate memory");
 
if (strlcpy(iface->name, ifname, sizeof(iface->name)) >=
sizeof(iface->name))
-   error("interface name too long");
+   error("interface name '%s' too long", ifname);
 
if (getifaddrs(&ifap) != 0)
error("getifaddrs failed");
@@ -101,6 +101,8 @@ get_interface(const char *ifname, void (
if (strcmp(ifname, ifa->ifa_name))
continue;
 
+   found = 1;
+
/*
 * If we have the capability, extract link information
 * and record it in a linked list.
@@ -120,31 +122,28 @@ get_interface(const char *ifname, void (
memcpy(iface->hw_address.haddr,
LLADDR(foo), foo->sdl_alen);
} else if (ifa->ifa_addr->sa_family == AF_INET) {
-   struct iaddr addr;
+   /* We already have the primary address. */
+   if (iface->primary_address.s_addr != 0)
+   continue;
 
-   memcpy(&foo, ifa->ifa_addr, sizeof(foo));
-   if (foo.sin_addr.s_addr == htonl(INADDR_LOOPBACK))
+   sin = (struct sockaddr_in *)ifa->ifa_addr;
+   if (sin->sin_addr.s_addr == htonl(INADDR_LOOPBACK))
continue;
-   if (!iface->ifp) {
-  

dhcrelay(8): simplify get_interface()

2016-12-07 Thread Rafael Zalamena
This diff simplifies the get_interface() function and makes it more
straightforward. It also makes dhcrelay(8) print a more informative error
message when running in layer 3 mode (default) on interfaces without an
address.

I'll use this code later to allow get_interface() on interfaces without an
IP address.

ok?

Index: bpf.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.11
diff -u -p -r1.11 bpf.c
--- bpf.c   28 May 2016 07:00:18 -  1.11
+++ bpf.c   7 Dec 2016 16:29:14 -
@@ -75,7 +75,7 @@ if_register_bpf(struct interface_info *i
error("Can't open bpf device: %m");
 
/* Set the BPF device to point at this interface. */
-   if (ioctl(sock, BIOCSETIF, info->ifp) == -1)
+   if (ioctl(sock, BIOCSETIF, &info->ifr) == -1)
error("Can't attach interface %s to bpf device: %m",
info->name);
 
Index: dhcpd.h
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcpd.h,v
retrieving revision 1.15
diff -u -p -r1.15 dhcpd.h
--- dhcpd.h 7 Dec 2016 13:19:18 -   1.15
+++ dhcpd.h 7 Dec 2016 16:29:14 -
@@ -76,7 +76,7 @@ struct interface_info {
size_t   rbuf_max;
size_t   rbuf_offset;
size_t   rbuf_len;
-   struct ifreq*ifp;
+   struct ifreq ifr;
int  noifmedia;
int  errors;
int  dead;
Index: dhcrelay.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcrelay.c,v
retrieving revision 1.44
diff -u -p -r1.44 dhcrelay.c
--- dhcrelay.c  7 Dec 2016 13:19:18 -   1.44
+++ dhcrelay.c  7 Dec 2016 16:29:14 -
@@ -165,6 +165,9 @@ main(int argc, char *argv[])
 
if (interfaces == NULL)
error("no interface given");
+   if (interfaces->primary_address.s_addr == 0)
+   error("interface '%s' does not have an address",
+   interfaces->name);
 
/* Default DHCP/BOOTP ports. */
server_port = htons(SERVER_PORT);
Index: dispatch.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dispatch.c,v
retrieving revision 1.12
diff -u -p -r1.12 dispatch.c
--- dispatch.c  7 Dec 2016 13:19:18 -   1.12
+++ dispatch.c  7 Dec 2016 16:29:14 -
@@ -79,15 +79,14 @@ get_interface(const char *ifname, void (
 {
struct interface_info   *iface;
struct ifaddrs  *ifap, *ifa;
-   struct ifreq*tif;
-   struct sockaddr_in   foo;
+   struct sockaddr_in  *sin;
 
if ((iface = calloc(1, sizeof(*iface))) == NULL)
error("failed to allocate memory");
 
if (strlcpy(iface->name, ifname, sizeof(iface->name)) >=
sizeof(iface->name))
-   error("interface name too long");
+   error("interface name '%s' too long", ifname);
 
if (getifaddrs(&ifap) != 0)
error("getifaddrs failed");
@@ -120,31 +119,23 @@ get_interface(const char *ifname, void (
memcpy(iface->hw_address.haddr,
LLADDR(foo), foo->sdl_alen);
} else if (ifa->ifa_addr->sa_family == AF_INET) {
-   struct iaddr addr;
+   /* We already have the primary address. */
+   if (iface->primary_address.s_addr != 0)
+   continue;
 
-   memcpy(&foo, ifa->ifa_addr, sizeof(foo));
-   if (foo.sin_addr.s_addr == htonl(INADDR_LOOPBACK))
+   sin = (struct sockaddr_in *)ifa->ifa_addr;
+   if (sin->sin_addr.s_addr == htonl(INADDR_LOOPBACK))
continue;
-   if (!iface->ifp) {
-   int len = IFNAMSIZ + ifa->ifa_addr->sa_len;
 
-   if ((tif = malloc(len)) == NULL)
-   error("no space to remember ifp");
-   strlcpy(tif->ifr_name, ifa->ifa_name, IFNAMSIZ);
-   memcpy(&tif->ifr_addr, ifa->ifa_addr,
-   ifa->ifa_addr->sa_len);
-   iface->ifp = tif;
-   iface->primary_address = foo.sin_addr;
-   }
-   addr.len = 4;
-   memcpy(addr.iabuf, &foo.sin_addr.s_addr, addr.len);
+   iface->primary_address = sin->sin_addr;
}
}
 
freeifaddrs(ifap);
 
-   if (!iface->ifp)
-   error("%s: not found", iface->name);
+   if (strlcpy(iface->ifr.ifr_name, ifname,

Re: dhcrelay: pledge(2)

2016-12-07 Thread Rafael Zalamena
On Wed, Dec 07, 2016 at 02:47:25PM +0100, Reyk Floeter wrote:
> Hi,
> 
> dhcrelay drops privs but isn't pledged yet - here it is.
> 
> It is simpler than dhclient: it only needs stdio and route because it
> pre-opens all file descriptors (UDP, bpf), does the bpf ioctls before,
> and only needs "route" for interface status ioctls on runtime.
> 
> OK?

I didn't finish my implementations, but from what I've tested it seems to
be working. I don't expect anything different.

ok rzalamena@

> 
> Reyk
> 
> Index: usr.sbin/dhcrelay/dhcrelay.c
> ===
> RCS file: /cvs/src/usr.sbin/dhcrelay/dhcrelay.c,v
> retrieving revision 1.44
> diff -u -p -u -p -r1.44 dhcrelay.c
> --- usr.sbin/dhcrelay/dhcrelay.c  7 Dec 2016 13:19:18 -   1.44
> +++ usr.sbin/dhcrelay/dhcrelay.c  7 Dec 2016 13:42:07 -
> @@ -248,6 +248,9 @@ main(int argc, char *argv[])
>   log_perror = 0;
>   }
>  
> + if (pledge("stdio route", NULL) == -1)
> + error("pledge");
> +
>   dispatch();
>   /* not reached */
>  
> 



Re: dhcrelay(8): clean up function prototypes

2016-12-07 Thread Rafael Zalamena
On Wed, Dec 07, 2016 at 02:49:55PM +0100, Rafael Zalamena wrote:
> ---snipped---
> 

Actually the code below is not wrong: there are some scenarios where you
need this to make relayed DHCP work. I'm not touching the part I noted
before.

The diff that I sent before still stands and has nothing to do with this
note.

> ---
> Note:
> While testing this I noticed that even though the server socket is sending
> the wrong source port, the dhcp server doesn't care about it and it works.
> But this can be easily fixed by changing this line in dhcrelay.c:
> ...
> main() {
> ...
>   laddr.sin_port = server_port;
> ...
> 
> to
>   laddr.sin_port = client_port;
> 
> I'll fix this in another diff.
> --



dhcrelay(8): clean up function prototypes

2016-12-07 Thread Rafael Zalamena
I'm implementing some features for dhcrelay and to make them fit I need
some cleanups in dhcrelay(8) first. This diff changes most of the
input/output function prototypes to take one parameter with all addresses
instead of passing multiple parameters.

Basically this will make input functions gather more information (source/
destination MACs, source/destination IPs, source/destination ports) and
use it in the output instead of trying to figure out this information along
the way.

With this we will be able to add IPv6 support and layer 2 relaying.
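
As a rough illustration of the idea (not the code from the diff), the
sketch below fills one context structure on input and reads it back on
output. The structure layout follows struct packet_ctx from this diff;
ss2sin() is assumed to be a plain cast helper, and pc_fill_ipv4() and
pc_dst_port() are hypothetical names used only here.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define CHADDR_SIZE 16

struct packet_ctx {
	uint8_t			 pc_htype;
	uint8_t			 pc_hlen;
	uint8_t			 pc_smac[CHADDR_SIZE];
	uint8_t			 pc_dmac[CHADDR_SIZE];

	struct sockaddr_storage	 pc_sss;
	struct sockaddr_storage	 pc_dss;
};

/* Assumed helper: view a sockaddr_storage as an IPv4 sockaddr. */
static inline struct sockaddr_in *
ss2sin(struct sockaddr_storage *ss)
{
	return (struct sockaddr_in *)ss;
}

/* Hypothetical input step: record addresses and ports once. */
static void
pc_fill_ipv4(struct packet_ctx *pc, struct in_addr src, uint16_t sport,
    struct in_addr dst, uint16_t dport)
{
	memset(pc, 0, sizeof(*pc));
	ss2sin(&pc->pc_sss)->sin_family = AF_INET;
	ss2sin(&pc->pc_sss)->sin_addr = src;
	ss2sin(&pc->pc_sss)->sin_port = htons(sport);
	ss2sin(&pc->pc_dss)->sin_family = AF_INET;
	ss2sin(&pc->pc_dss)->sin_addr = dst;
	ss2sin(&pc->pc_dss)->sin_port = htons(dport);
}

/* Hypothetical output step: one parameter instead of from/to/hto. */
static uint16_t
pc_dst_port(struct packet_ctx *pc)
{
	return ntohs(ss2sin(&pc->pc_dss)->sin_port);
}
```

Because the storage is a sockaddr_storage, the same structure could
later carry a sockaddr_in6 without changing the function prototypes.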

---
Note:
While testing this I noticed that even though the server socket is sending
the wrong source port, the dhcp server doesn't care about it and it works.
But this can be easily fixed by changing this line in dhcrelay.c:
...
main() {
...
laddr.sin_port = server_port;
...

to
laddr.sin_port = client_port;

I'll fix this in another diff.
--

ok?

Index: bpf.c
===
RCS file: /cvs/src/usr.sbin/dhcrelay/bpf.c,v
retrieving revision 1.11
diff -u -p -r1.11 bpf.c
--- bpf.c   28 May 2016 07:00:18 -  1.11
+++ bpf.c   7 Dec 2016 13:44:35 -
@@ -258,24 +258,23 @@ if_register_receive(struct interface_inf
 
 ssize_t
 send_packet(struct interface_info *interface,
-struct dhcp_packet *raw, size_t len, struct in_addr from,
-struct sockaddr_in *to, struct hardware *hto)
+struct dhcp_packet *raw, size_t len, struct packet_ctx *pc)
 {
unsigned char buf[256];
struct iovec iov[2];
int result, bufp = 0;
 
if (interface->hw_address.htype == HTYPE_IPSEC_TUNNEL) {
-   socklen_t slen = sizeof(*to);
+   socklen_t slen = pc->pc_dss.ss_len;
result = sendto(server_fd, raw, len, 0,
-   (struct sockaddr *)to, slen);
+   (struct sockaddr *)&pc->pc_dss, slen);
goto done;
}
 
/* Assemble the headers... */
-   assemble_hw_header(interface, buf, &bufp, hto);
-   assemble_udp_ip_header(interface, buf, &bufp, from.s_addr,
-   to->sin_addr.s_addr, to->sin_port, (unsigned char *)raw, len);
+   assemble_hw_header(interface, buf, &bufp, pc);
+   assemble_udp_ip_header(interface, buf, &bufp, pc,
+   (unsigned char *)raw, len);
 
/* Fire it off */
iov[0].iov_base = (char *)buf;
@@ -292,7 +291,7 @@ send_packet(struct interface_info *inter
 
 ssize_t
 receive_packet(struct interface_info *interface, unsigned char *buf,
-size_t len, struct sockaddr_in *from, struct hardware *hfrom)
+size_t len, struct packet_ctx *pc)
 {
int length = 0, offset = 0;
struct bpf_hdr hdr;
@@ -358,7 +357,7 @@ receive_packet(struct interface_info *in
 
/* Decode the physical header... */
offset = decode_hw_header(interface,
-   interface->rbuf, interface->rbuf_offset, hfrom);
+   interface->rbuf, interface->rbuf_offset, pc);
 
/*
 * If a physical layer checksum failed (dunno of any
@@ -374,7 +373,7 @@ receive_packet(struct interface_info *in
 
/* Decode the IP and UDP headers... */
offset = decode_udp_ip_header(interface, interface->rbuf,
-   interface->rbuf_offset, from, hdr.bh_caplen);
+   interface->rbuf_offset, pc, hdr.bh_caplen);
 
/* If the IP or UDP checksum was bad, skip the packet... */
if (offset < 0) {
Index: dhcpd.h
===
RCS file: /cvs/src/usr.sbin/dhcrelay/dhcpd.h,v
retrieving revision 1.15
diff -u -p -r1.15 dhcpd.h
--- dhcpd.h 7 Dec 2016 13:19:18 -   1.15
+++ dhcpd.h 7 Dec 2016 13:44:35 -
@@ -42,15 +42,28 @@
 #define SERVER_PORT 67
 #define CLIENT_PORT 68
 
+/* Maximum size of client hardware address. */
+#define CHADDR_SIZE 16
+
+struct packet_ctx {
+   uint8_t  pc_htype;
+   uint8_t  pc_hlen;
+   uint8_t  pc_smac[CHADDR_SIZE];
+   uint8_t  pc_dmac[CHADDR_SIZE];
+
+   struct sockaddr_storage  pc_sss;
+   struct sockaddr_storage  pc_dss;
+};
+
 struct iaddr {
int len;
-   unsigned char iabuf[16];
+   unsigned char iabuf[CHADDR_SIZE];
 };
 
 struct hardware {
u_int8_t htype;
u_int8_t hlen;
-   u_int8_t haddr[16];
+   u_int8_t haddr[CHADDR_SIZE];
 };
 
 /* Possible states in which the client can be. */
@@ -112,15 +125,13 @@ int if_register_bpf(struct interface_inf
 void if_register_send(struct interface_info *);
 void if_register_receive(struct interface_info *);
 ssize_t send_packet(struct interface_info *,
-struct dhcp_packet *, size_t, struct in_addr,
-struct sockaddr_in *, struct hardware *);
+struct dhcp_packet *, size_t, struct 

Re: ntpd(8): use stack instead of heap

2016-12-02 Thread Rafael Zalamena
On Sat, Oct 01, 2016 at 07:05:51PM +0200, Rafael Zalamena wrote:
> The ntpd(8) constraint fork+exec diff changed the way the constraint
> processes are created, but it introduced new calloc()s to avoid
> increasing the diff size and to keep the focus on the problem. Now that
> the fork+exec is in, this diff moves those variables onto the stack.
> 
> No functional changes, just changing variables storage location.
> 
> ok?

Ping.

Updated diff to apply on the latest ntpd sources.

ok?
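
For readers skimming the archive, the shape of the change is roughly the
following. This is a generic sketch, not ntpd code: struct msg,
msg_read(), child_heap() and child_stack() are hypothetical stand-ins.
Note that calloc() returns zeroed memory while a stack variable starts
uninitialized, so the sketch zeroes it explicitly; whether the real code
needs that depends on parts not shown in the diff.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the structures moved by the diff. */
struct msg {
	int	fd;
	size_t	namelen;
};

/* Hypothetical reader that fills the structure through a pointer. */
static void
msg_read(struct msg *m)
{
	m->fd = 3;
	m->namelen = 5;
}

/* Before: the structure lives on the heap and is accessed with '->'. */
static size_t
child_heap(void)
{
	struct msg *m;
	size_t n;

	if ((m = calloc(1, sizeof(*m))) == NULL)
		return 0;
	msg_read(m);
	n = m->namelen;
	free(m);
	return n;
}

/* After: the structure lives on the stack; callers pass its address. */
static size_t
child_stack(void)
{
	struct msg m;

	memset(&m, 0, sizeof(m));	/* calloc() used to do this for us */
	msg_read(&m);
	return m.namelen;
}
```

Both variants behave the same; the stack version drops the allocation
failure path.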

Index: usr.sbin/ntpd//constraint.c
===
RCS file: /home/obsdcvs/src/usr.sbin/ntpd/constraint.c,v
retrieving revision 1.34
diff -u -p -r1.34 constraint.c
--- usr.sbin/ntpd//constraint.c 18 Oct 2016 22:05:47 -  1.34
+++ usr.sbin/ntpd//constraint.c 2 Dec 2016 16:27:15 -
@@ -321,8 +321,8 @@ priv_constraint_readquery(struct constra
 void
 priv_constraint_child(const char *pw_dir, uid_t pw_uid, gid_t pw_gid)
 {
-   struct constraint   *cstr;
-   struct ntp_addr_msg *am;
+   struct constraint    cstr;
+   struct ntp_addr_msg  am;
uint8_t *data;
static char  addr[NI_MAXHOST];
struct timeval   rectv, xmttv;
@@ -336,10 +336,6 @@ priv_constraint_child(const char *pw_dir
if (setpriority(PRIO_PROCESS, 0, 0) == -1)
log_warn("could not set priority");
 
-   if ((cstr = calloc(1, sizeof(*cstr))) == NULL ||
-   (am = calloc(1, sizeof(*am))) == NULL)
-   fatal("%s: calloc", __func__);
-
/* Init TLS and load CA certs before chroot() */
if (tls_init() == -1)
fatalx("tls_init");
@@ -368,9 +364,9 @@ priv_constraint_child(const char *pw_dir
if (pledge("stdio inet", NULL) == -1)
fatal("pledge");
 
-   cstr->fd = CONSTRAINT_PASSFD;
-   imsg_init(&cstr->ibuf, cstr->fd);
-   priv_constraint_readquery(cstr, am, &data);
+   cstr.fd = CONSTRAINT_PASSFD;
+   imsg_init(&cstr.ibuf, cstr.fd);
+   priv_constraint_readquery(&cstr, &am, &data);
 
/*
 * Get the IP address as name and set the process title accordingly.
@@ -378,8 +374,8 @@ priv_constraint_child(const char *pw_dir
 * any DNS operation, so it is safe to be called without the dns
 * pledge.
 */
-   if (getnameinfo((struct sockaddr *)&cstr->addr->ss,
-   SA_LEN((struct sockaddr *)&cstr->addr->ss),
+   if (getnameinfo((struct sockaddr *)&cstr.addr->ss,
+   SA_LEN((struct sockaddr *)&cstr.addr->ss),
addr, sizeof(addr), NULL, 0,
NI_NUMERICHOST) != 0)
fatalx("%s getnameinfo", __func__);
@@ -398,21 +394,21 @@ priv_constraint_child(const char *pw_dir
fatal("%s fcntl F_SETFD", __func__);
 
/* Get remaining data from imsg in the unpriv child */
-   if (am->namelen) {
-   if ((cstr->addr_head.name =
-   get_string(data, am->namelen)) == NULL)
+   if (am.namelen) {
+   if ((cstr.addr_head.name =
+   get_string(data, am.namelen)) == NULL)
fatalx("invalid IMSG_CONSTRAINT_QUERY name");
-   data += am->namelen;
+   data += am.namelen;
}
-   if (am->pathlen) {
-   if ((cstr->addr_head.path =
-   get_string(data, am->pathlen)) == NULL)
+   if (am.pathlen) {
+   if ((cstr.addr_head.path =
+   get_string(data, am.pathlen)) == NULL)
fatalx("invalid IMSG_CONSTRAINT_QUERY path");
}
 
/* Run! */
if ((ctx = httpsdate_query(addr,
-   CONSTRAINT_PORT, cstr->addr_head.name, cstr->addr_head.path,
+   CONSTRAINT_PORT, cstr.addr_head.name, cstr.addr_head.path,
conf->ca, conf->ca_len, &rectv, &xmttv)) == NULL) {
/* Abort with failure but without warning */
exit(1);
@@ -422,10 +418,10 @@ priv_constraint_child(const char *pw_dir
iov[0].iov_len = sizeof(rectv);
iov[1].iov_base = &xmttv;
iov[1].iov_len = sizeof(xmttv);
-   imsg_composev(&cstr->ibuf,
+   imsg_composev(&cstr.ibuf,
IMSG_CONSTRAINT_RESULT, 0, 0, -1, iov, 2);
do {
-   rv = imsg_flush(&cstr->ibuf);
+   rv = imsg_flush(&cstr.ibuf);
} while (rv == -1 && errno == EAGAIN);
 
/* Tear down the TLS connection after sending the result */



switchd(8): learn remote switch tables

2016-12-02 Thread Rafael Zalamena
Learn the remote switch's flow table properties so we can use this
information to decide where to install the default table-miss flow for
OpenFlow 1.3. This is not needed by OpenFlow 1.0 since it already does this
by default.

This diff implements the functions to ask the remote switch for table
information and to parse it into data structures that we can use to
decide where switchd(8) can install flows and of what kind. After the
table information is parsed and stored, we use it to select the first
table with the capabilities we need to send packets to the controller.

Even though this is already enough to make switchd(8) work with
switch(4) and HP 3800 by default, it doesn't consider the possibility
of having installed flows that change the normal table processing. So
the next implementation step is to obtain flow information for all tables
and make switchd(8) also consider this when choosing the table.
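
The "first table with the capabilities we need" step can be pictured
with the small sketch below. It is only an illustration: struct
flow_table, the TBL_CAP_* bits and select_table() are hypothetical and
do not correspond to the structures in the diff or to OpenFlow
constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits learned from the table features reply. */
#define TBL_CAP_WILDCARD	0x01	/* accepts a match-all flow */
#define TBL_CAP_OUTPUT_CTRL	0x02	/* can output to the controller */

/* Hypothetical per-table record kept for each connection. */
struct flow_table {
	uint8_t		id;
	uint32_t	caps;
};

/*
 * Return the id of the first table that has every bit in 'needed',
 * or -1 if no table qualifies.
 */
static int
select_table(const struct flow_table *tables, size_t n, uint32_t needed)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((tables[i].caps & needed) == needed)
			return tables[i].id;
	}
	return -1;
}
```

With this shape the hard-coded table number goes away: the caller asks
for TBL_CAP_WILDCARD | TBL_CAP_OUTPUT_CTRL and installs the table-miss
flow in whatever table comes back, or skips it when the result is -1.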

ok?

Index: usr.sbin/switchd/ofp.c
===
RCS file: /cvs/src/usr.sbin/switchd/ofp.c,v
retrieving revision 1.17
diff -u -p -r1.17 ofp.c
--- usr.sbin/switchd/ofp.c  2 Dec 2016 14:39:46 -   1.17
+++ usr.sbin/switchd/ofp.c  2 Dec 2016 15:03:51 -
@@ -231,14 +231,11 @@ ofp_nextstate(struct switchd *sc, struct
/* Let's not ask this while we don't use it. */
ofp13_flow_stats(sc, con, OFP_PORT_ANY, OFP_GROUP_ID_ANY,
OFP_TABLE_ID_ALL);
-   ofp13_table_features(sc, con, 0);
ofp13_desc(sc, con);
 #endif
+   rv |= ofp13_table_features(sc, con, 0);
rv |= ofp13_setconfig(sc, con, OFP_CONFIG_FRAG_NORMAL,
OFP_CONTROLLER_MAXLEN_NO_BUFFER);
-
-   /* Use table '0' for switch(4) and '100' for HP 3800. */
-   rv |= ofp13_tablemiss_sendctrl(sc, con, 0);
break;
 
 
Index: usr.sbin/switchd/ofp13.c
===
RCS file: /cvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.41
diff -u -p -r1.41 ofp13.c
--- usr.sbin/switchd/ofp13.c2 Dec 2016 14:39:46 -   1.41
+++ usr.sbin/switchd/ofp13.c2 Dec 2016 15:03:52 -
@@ -70,10 +70,8 @@ int   ofp13_packet_in(struct switchd *, s
struct ofp_header *, struct ibuf *);
 int ofp13_flow_removed(struct switchd *, struct switch_connection *,
struct ofp_header *, struct ibuf *);
-int ofp13_parse_instruction(struct ibuf *, struct ofp_instruction *);
-int ofp13_parse_action(struct ibuf *, struct ofp_action_header *);
-int ofp13_parse_oxm(struct ibuf *, struct ofp_ox_match *);
-int ofp13_parse_tableproperties(struct ibuf *, struct ofp_table_features 
*);
+int ofp13_tableproperties(struct switch_connection *, struct ibuf *,
+   off_t, size_t, int);
 int ofp13_multipart_reply(struct switchd *, struct switch_connection *,
struct ofp_header *, struct ibuf *);
 int ofp13_validate_tableproperty(struct ibuf *, off_t, int);
@@ -104,6 +102,9 @@ int  ofp13_setconfig_validate(struct swi
struct sockaddr_storage *, struct sockaddr_storage *,
struct ofp_header *, struct ibuf *);
 
+int ofp13_switchconfigure(struct switchd *, struct switch_connection *);
+int ofp13_getflowtable(struct switch_connection *);
+
 struct ofp_callback ofp13_callbacks[] = {
{ OFP_T_HELLO,  ofp13_hello, ofp_validate_hello },
{ OFP_T_ERROR,  NULL, ofp13_validate_error },
@@ -1013,7 +1014,7 @@ ofp13_packet_in(struct switchd *sc, stru
struct ofp_ox_match *oxm;
struct packetpkt;
struct ibuf *obuf = NULL;
-   int  ret = -1;
+   int  table, ret = -1;
ssize_t  len, mlen;
uint32_t srcport = 0, dstport;
int  addflow = 0, sendbuffer = 0;
@@ -1091,6 +1092,13 @@ ofp13_packet_in(struct switchd *sc, stru
 
  again:
if (addflow) {
+   table = ofp13_getflowtable(con);
+   if (table > OFP_TABLE_ID_MAX || table < 0) {
+   /* This switch doesn't support installing flows. */
+   addflow = 0;
+   goto again;
+   }
+
if ((fm = ibuf_advance(obuf, sizeof(*fm))) == NULL)
goto done;
 
@@ -1101,6 +1109,7 @@ ofp13_packet_in(struct switchd *sc, stru
fm->fm_hard_timeout = 0; /* permanent */
fm->fm_priority = 0;
fm->fm_buffer_id = pin->pin_buffer_id;
+   fm->fm_table_id = table;
fm->fm_flags = htons(OFP_FLOWFLAG_SEND_FLOW_REMOVED);
if (pin->pin_buffer_id == htonl(OFP_PKTOUT_NO_BUFFER))
sendbuffer = 1;
@@ 

Re: vio(4): fixup crash on up/down

2016-11-24 Thread Rafael Zalamena
On Wed, Nov 23, 2016 at 09:10:44PM +0100, Stefan Fritsch wrote:
> On Wed, 23 Nov 2016, Rafael Zalamena wrote:
> 
> > > Maybe something like this is enough already (untested):
> > 
> > I tried your diff without Mike's if_vio diff and it doesn't panic anymore,
> > however it doesn't work.
> > 
> > vioX can send packets to host, host receives them and reply, but vioX
> > doesn't see any packets back. I don't even need to touch the interface
> > up/down status to see this happening. Also when the interface comes
> > up after being shutdown it sends a bunch of packets to host.
> 
> Sorry, device_status is a bitmask, not a plain value.
> 
> Try the patch below. The first hunk is to fix the 'sends a bunch of 
> packets'. If it causes any problems, leave it out.

This diff fixes the panic and makes the interface work again after a 'down'.

Thank you and Mike for fixing this. I'm going to play with this
more today and I can give feedback later if anything bad happens.
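
For the archive, the difference between the two versions of the check can be
shown with a small standalone sketch. The bit values are the ones from the
virtio specification; the STATUS_* names and the two helper functions are
made up for this example and are not the vmd(8) identifiers.

```c
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define STATUS_ACKNOWLEDGE	0x01
#define STATUS_DRIVER		0x02
#define STATUS_DRIVER_OK	0x04
#define STATUS_FEATURES_OK	0x08

/* Plain comparison: stops matching as soon as any other bit is set. */
int
ready_by_compare(uint32_t status)
{
	return (status == STATUS_DRIVER_OK);
}

/* Bitmask test: only looks at the DRIVER_OK bit. */
int
ready_by_mask(uint32_t status)
{
	return ((status & STATUS_DRIVER_OK) != 0);
}
```

A driver that is up normally has ACKNOWLEDGE and DRIVER set together with
DRIVER_OK, so the plain comparison rejects it while the mask test accepts
it. That matches what I saw with the first diff: no packets were delivered
to the guest.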

> 
> diff --git usr.sbin/vmd/virtio.c usr.sbin/vmd/virtio.c
> index 93def73..6436e6a 100644
> --- usr.sbin/vmd/virtio.c
> +++ usr.sbin/vmd/virtio.c
> @@ -703,6 +703,13 @@ virtio_net_io(int dir, uint16_t reg, uint32_t *data, 
> uint8_t *intr,
>   break;
>   case VIRTIO_CONFIG_DEVICE_STATUS:
>   dev->cfg.device_status = *data;
> + if (*data == 0) {
> + dev->vq[0].last_avail = 0;
> + dev->vq[0].notified_avail = 0;
> + dev->vq[1].last_avail = 0;
> + dev->vq[1].notified_avail = 0;
> + /* XXX do proper reset */
> + }
>   break;
>   default:
>   break;
> @@ -796,6 +803,9 @@ vionet_enq_rx(struct vionet_dev *dev, char *pkt, ssize_t 
> sz, int *spc)
>  
>   ret = 0;
>  
> + if (!(dev->cfg.device_status & VIRTIO_CONFIG_DEVICE_STATUS_DRIVER_OK))
> + return ret;
> +
>   vr_sz = vring_size(VIONET_QUEUE_SIZE);
>   q_gpa = dev->vq[0].qa;
>   q_gpa = q_gpa * VIRTIO_PAGE_SIZE;



Re: vio(4): fixup crash on up/down

2016-11-23 Thread Rafael Zalamena
On Wed, Nov 23, 2016 at 09:03:46AM +0100, Stefan Fritsch wrote:
> On Wed, 23 Nov 2016, Mike Belopuhov wrote:
> > > I guess we could do that. But then we cannot free the mbufs on DOWN
> > > until the device has used them.
> > 
> > Diff to this effect is below.  Works on vmd and qemu (original
> > one didn't because I kept the virtio_reset).
> > 
> > > That sounds like an unnecessary waste of memory to me.
> > > 
> > 
> > This is not so much memory we lose and then if you up it again
> > you're going to have it all back.  We can revert to the present
> > behavior once vmd matures, in the meantime people won't have to
> > juggle diffs around in their trees :)
> 
> I am not convinced. Doing a reset allows to recover from all kinds of 
> problems with DOWN/UP. That was useful when we had bugs in the event_idx 
> implementation.
> 
> Also, I don't like to change code that is known to work with at least 4 
> independent device implementations to work around problems in one 
> incomplete implementation that we can easily change.
> 
> Maybe something like this is enough already (untested):

I tried your diff without Mike's if_vio diff and it doesn't panic anymore,
however it doesn't work.

vioX can send packets to the host, and the host receives them and replies,
but vioX doesn't see any packets back. I don't even need to touch the
interface up/down status to see this happening. Also, when the interface
comes up after being shut down it sends a bunch of packets to the host.

> 
> --- usr.sbin/vmd/virtio.c 2016-10-20 05:05:49.049943724 +0200
> +++ usr.sbin/vmd/virtio.c 2016-11-23 08:55:38.829501275 +0100
> @@ -796,6 +796,9 @@
>  
>   ret = 0;
>  
> + if (dev->cfg.device_status != VIRTIO_CONFIG_DEVICE_STATUS_DRIVER_OK)
> + return ret;
> +
>   vr_sz = vring_size(VIONET_QUEUE_SIZE);
>   q_gpa = dev->vq[0].qa;
>   q_gpa = q_gpa * VIRTIO_PAGE_SIZE;
> 



switchd(8): negotiate versions with hello

2016-11-22 Thread Rafael Zalamena
Teach switchd(8) how to negotiate the protocol version using the hello
bitmap header. This way switchd(8) is able to fall back to a lower version
or use a higher one based on the bitmap.

This diff also prevents connections from switching version in the middle of
the operation.

This is the first step before adding a state machine to switchd(8): move the
hello to a common function and make it step into the first state: HELLO_WAIT.
(next diff)
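
The negotiation rule can be sketched as below. This is an illustration of
the rule as I read it in the OpenFlow 1.3 spec (bit N of the bitmap means
wire version N, pick the highest version both sides have, and without a
bitmap use the smaller of the two header versions). The VER_* names and both
functions are hypothetical and are not the ones used in the diff.

```c
#include <stdint.h>

#define VER_NONE	0	/* no common version found */
#define VER_1_0		1	/* OpenFlow 1.0 wire version */
#define VER_1_3		4	/* OpenFlow 1.3 wire version */

/* Hypothetical helper: bit N of the bitmap stands for wire version N. */
#define VER_BIT(v)	(UINT32_C(1) << (v))

/* Pick the highest version that is present in both bitmaps. */
int
negotiate_bitmap(uint32_t ours, uint32_t theirs)
{
	uint32_t common = ours & theirs;
	int	 v;

	/* Version 0 is not a valid wire version, so stop before bit 0. */
	for (v = 31; v > 0; v--) {
		if (common & VER_BIT(v))
			return (v);
	}

	return (VER_NONE);
}

/* Without a bitmap, fall back to the smaller of the header versions. */
int
negotiate_header(int ours, int theirs)
{
	return (ours < theirs ? ours : theirs);
}
```

Once a version has been picked it is stored in the connection, which is what
lets the ofp_input() check below reject packets that try to switch version
in the middle of the operation.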

ok?

Index: ofp.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp.c,v
retrieving revision 1.15
diff -u -p -r1.15 ofp.c
--- ofp.c   4 Nov 2016 22:27:08 -   1.15
+++ ofp.c   22 Nov 2016 14:55:12 -
@@ -132,6 +132,13 @@ ofp_input(struct switch_connection *con,
return (-1);
}
 
+   if (con->con_version != OFP_V_0 &&
+   oh->oh_version != con->con_version) {
+   log_debug("wrong version %d, expected %d",
+   oh->oh_version, con->con_version);
+   return (-1);
+   }
+
switch (oh->oh_version) {
case OFP_V_1_0:
if (ofp10_input(sc, con, oh, ibuf) != 0)
@@ -165,6 +172,10 @@ ofp_open(struct privsep *ps, struct swit
log_info("%s: new connection %u.%u from switch %u",
__func__, con->con_id, con->con_instance,
sw == NULL ? 0 : sw->sw_id);
+
+   /* Send the hello with the latest version we support. */
+   if (ofp_send_hello(ps->ps_env, con, OFP_V_1_3) == -1)
+   return (-1);
 
return (0);
 }
Index: ofp10.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp10.c,v
retrieving revision 1.16
diff -u -p -r1.16 ofp10.c
--- ofp10.c 21 Nov 2016 18:19:51 -  1.16
+++ ofp10.c 22 Nov 2016 15:08:02 -
@@ -60,7 +60,7 @@ intofp10_validate_packet_out(struct sw
struct ofp_header *, struct ibuf *);
 
 struct ofp_callback ofp10_callbacks[] = {
-   { OFP10_T_HELLO,ofp10_hello, NULL },
+   { OFP10_T_HELLO,ofp10_hello, ofp_validate_hello },
{ OFP10_T_ERROR,NULL, ofp10_validate_error },
{ OFP10_T_ECHO_REQUEST, ofp10_echo_request, NULL },
{ OFP10_T_ECHO_REPLY,   NULL, NULL },
@@ -262,13 +262,8 @@ ofp10_hello(struct switchd *sc, struct s
return (-1);
}
 
-   /* Echo back the received Hello packet */
-   oh->oh_version = OFP_V_1_0;
-   oh->oh_length = htons(sizeof(*oh));
-   oh->oh_xid = htonl(con->con_xidnxt++);
-   if (ofp10_validate(sc, >con_local, >con_peer, oh, NULL) != 0)
+   if (ofp_recv_hello(sc, con, oh, ibuf) == -1)
return (-1);
-   ofp_output(con, oh, NULL);
 
 #if 0
(void)write(fd, , sizeof(oh));
Index: ofp13.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.39
diff -u -p -r1.39 ofp13.c
--- ofp13.c 21 Nov 2016 19:33:12 -  1.39
+++ ofp13.c 22 Nov 2016 14:49:27 -
@@ -109,7 +109,7 @@ int  ofp13_tablemiss_sendctrl(struct swi
uint8_t);
 
 struct ofp_callback ofp13_callbacks[] = {
-   { OFP_T_HELLO,  ofp13_hello, NULL },
+   { OFP_T_HELLO,  ofp13_hello, ofp_validate_hello },
{ OFP_T_ERROR,  NULL, ofp13_validate_error },
{ OFP_T_ECHO_REQUEST,   ofp13_echo_request, NULL },
{ OFP_T_ECHO_REPLY, NULL, NULL },
@@ -639,13 +639,8 @@ ofp13_hello(struct switchd *sc, struct s
return (-1);
}
 
-   /* Echo back the received Hello packet */
-   oh->oh_version = OFP_V_1_3;
-   oh->oh_length = htons(sizeof(*oh));
-   oh->oh_xid = htonl(con->con_xidnxt++);
-   if (ofp13_validate(sc, >con_local, >con_peer, oh, NULL) != 0)
+   if (ofp_recv_hello(sc, con, oh, ibuf) == -1)
return (-1);
-   ofp_output(con, oh, NULL);
 
/* Ask for switch features so we can get more information. */
if (ofp13_featuresrequest(sc, con) == -1)
Index: ofp_common.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp_common.c,v
retrieving revision 1.7
diff -u -p -r1.7 ofp_common.c
--- ofp_common.c17 Nov 2016 13:10:26 -  1.7
+++ ofp_common.c22 Nov 2016 14:53:45 -
@@ -43,6 +43,8 @@
 #include "switchd.h"
 #include "ofp_map.h"
 
+intofp_setversion(struct switch_connection *, int);
+
 int
 ofp_validate_header(struct switchd *sc,
 struct sockaddr_storage *src, struct sockaddr_storage *dst,
@@ -114,6 +116,177 @@ ofp_output(struct switch_connection *con
}
 
ofrelay_write(con, buf);
+
+   return (0);
+}
+
+int
+ofp_send_hello(struct switchd *sc, struct switch_connection *con, int version)
+{
+   struct 

switchd(8): more oxm basic checks

2016-11-17 Thread Rafael Zalamena
This diff adds the missing IP_PROTO oxm validation and adds more hasmask
checks that reject a mask on types that should not have one.
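
The idea behind the hasmask checks, as a standalone sketch: in an OXM header
the lowest bit of the field byte says whether a mask follows the value, and
a masked entry carries twice the value size. oxm_check_len() and its
parameters are hypothetical names for this example; the real checks are in
ofp13_validate_oxm_basic() below.

```c
#include <stddef.h>
#include <stdint.h>

/* OXM field byte: the upper 7 bits are the field id, the low bit is hasmask. */
#define OXM_HASMASK(fh)	((fh) & 0x1)

/*
 * Hypothetical check: 'paylen' is the payload length from the OXM header,
 * 'vlen' the size of the field value and 'maskable' says whether the field
 * is allowed to carry a mask at all.
 */
int
oxm_check_len(uint8_t fh, uint8_t paylen, size_t vlen, int maskable)
{
	if (OXM_HASMASK(fh)) {
		if (!maskable)
			return (-1);
		vlen *= 2;	/* value followed by a mask of the same size */
	}

	return (paylen == vlen ? 0 : -1);
}
```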

ok?

Index: ofp13.c
===
RCS file: /cvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.29
diff -u -p -r1.29 ofp13.c
--- ofp13.c 17 Nov 2016 16:24:00 -  1.29
+++ ofp13.c 17 Nov 2016 17:46:32 -
@@ -183,6 +183,8 @@ ofp13_validate_oxm_basic(struct ibuf *ib
case OFP_XM_T_IN_PORT:
case OFP_XM_T_IN_PHY_PORT:
case OFP_XM_T_MPLS_LABEL:
+   if (hasmask)
+   return (-1);
if ((ui32 = ibuf_seek(ibuf, off, sizeof(*ui32))) == NULL)
return (-1);
 
@@ -205,12 +207,26 @@ ofp13_validate_oxm_basic(struct ibuf *ib
log_debug("\t\t%llu", be64toh(*ui64));
break;
 
-   case OFP_XM_T_ETH_DST:
-   case OFP_XM_T_ETH_SRC:
case OFP_XM_T_ARP_SHA:
case OFP_XM_T_ARP_THA:
case OFP_XM_T_IPV6_ND_SLL:
case OFP_XM_T_IPV6_ND_TLL:
+   if (hasmask)
+   return (-1);
+   if ((ui8 = ibuf_seek(ibuf, off, ETHER_ADDR_LEN)) == NULL)
+   return (-1);
+
+   buf[0] = 0;
+   for (i = 0; i < ETHER_ADDR_LEN; i++) {
+   snprintf(hex, sizeof(hex), "%02x", *(ui8 + i));
+   strlcat(buf, hex, sizeof(buf));
+   }
+
+   log_debug("\t\t%s", buf);
+   break;
+
+   case OFP_XM_T_ETH_DST:
+   case OFP_XM_T_ETH_SRC:
len = ETHER_ADDR_LEN;
if (hasmask)
len *= 2;
@@ -245,15 +261,22 @@ ofp13_validate_oxm_basic(struct ibuf *ib
log_debug("\t\t0x%04x", ntohs(*ui16));
break;
 
-   case OFP_XM_T_ARP_OP:
-   case OFP_XM_T_VLAN_VID:
-   case OFP_XM_T_IP_PROTO:
case OFP_XM_T_TCP_SRC:
case OFP_XM_T_TCP_DST:
case OFP_XM_T_UDP_SRC:
case OFP_XM_T_UDP_DST:
case OFP_XM_T_SCTP_SRC:
case OFP_XM_T_SCTP_DST:
+   case OFP_XM_T_ARP_OP:
+   if (hasmask)
+   return (-1);
+   if ((ui16 = ibuf_seek(ibuf, off, sizeof(*ui16))) == NULL)
+   return (-1);
+
+   log_debug("\t\t%d", ntohs(*ui16));
+   break;
+
+   case OFP_XM_T_VLAN_VID:
case OFP_XM_T_IPV6_EXTHDR:
len = sizeof(*ui16);
if (hasmask)
@@ -283,12 +306,15 @@ ofp13_validate_oxm_basic(struct ibuf *ib
 
case OFP_XM_T_IP_DSCP:
case OFP_XM_T_IP_ECN:
+   case OFP_XM_T_IP_PROTO:
case OFP_XM_T_ICMPV4_TYPE:
case OFP_XM_T_ICMPV4_CODE:
case OFP_XM_T_ICMPV6_TYPE:
case OFP_XM_T_ICMPV6_CODE:
case OFP_XM_T_MPLS_TC:
case OFP_XM_T_MPLS_BOS:
+   if (hasmask)
+   return (-1);
if ((ui8 = ibuf_seek(ibuf, off, sizeof(*ui8))) == NULL)
return (-1);
 
@@ -314,9 +340,24 @@ ofp13_validate_oxm_basic(struct ibuf *ib
log_debug("\t\t%#08x", ntohl(*ui32));
break;
 
+   case OFP_XM_T_IPV6_ND_TARGET:
+   if (hasmask)
+   return (-1);
+   if ((ui8 = ibuf_seek(ibuf, off,
+   sizeof(struct in6_addr))) == NULL)
+   return (-1);
+
+   buf[0] = 0;
+   for (i = 0; i < (int)sizeof(struct in6_addr); i++) {
+   snprintf(hex, sizeof(hex), "%02x", *(ui8 + i));
+   strlcat(buf, hex, sizeof(buf));
+   }
+
+   log_debug("\t\t%s", buf);
+   break;
+
case OFP_XM_T_IPV6_SRC:
case OFP_XM_T_IPV6_DST:
-   case OFP_XM_T_IPV6_ND_TARGET:
len = sizeof(struct in6_addr);
if (hasmask)
len *= 2;



switchd(8): add more packet-out validations

2016-11-07 Thread Rafael Zalamena
Now that we have the flow-mod validation with the action/instructions
support we can extend the usage of these functions for the packet-out
validation.

This diff increases the packet-out validation coverage by also doing
instructions and packet truncation checks.
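
The loop prevention part can be sketched on its own as follows. This is a
simplified illustration and not the code from the diff: walk_actions(),
get_be16() and ACT_HDR_LEN are hypothetical names, and the only assumption
is that every action starts with a 16-bit type followed by a 16-bit total
length, both in network byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define ACT_HDR_LEN	4	/* 16-bit type followed by 16-bit length */

static uint16_t
get_be16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Walk 'alen' bytes of actions stored at the start of 'buf'. Return the
 * offset just past the last action, or -1 if an action is truncated or
 * would not advance the offset.
 */
long
walk_actions(const uint8_t *buf, size_t buflen, size_t alen)
{
	size_t	off = 0, len;

	if (alen > buflen)
		return (-1);

	while (off < alen) {
		if (alen - off < ACT_HDR_LEN)
			return (-1);	/* truncated action header */

		len = get_be16(buf + off + 2);
		/* Loop prevention: every action must move us forward. */
		if (len < ACT_HDR_LEN || len > alen - off)
			return (-1);

		off += len;
	}

	return ((long)off);
}
```

An action with a zero length would otherwise keep the parser on the same
offset forever, and that is the case the "diff == 0" test in the diff
guards against.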

ok?

Index: ofp13.c
===
RCS file: /cvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.25
diff -u -p -r1.25 ofp13.c
--- ofp13.c 7 Nov 2016 13:27:11 -   1.25
+++ ofp13.c 7 Nov 2016 13:33:34 -
@@ -462,10 +462,9 @@ ofp13_validate_packet_out(struct switchd
 struct ofp_header *oh, struct ibuf *ibuf)
 {
struct ofp_packet_out   *pout;
-   size_t   len;
-   off_toff;
+   size_t   len, plen, diff;
+   off_toff, noff;
struct ofp_action_header*ah;
-   struct ofp_action_output*ao;
 
off = 0;
if ((pout = ibuf_seek(ibuf, off, sizeof(*pout))) == NULL) {
@@ -474,36 +473,43 @@ ofp13_validate_packet_out(struct switchd
return (-1);
}
 
-   log_debug("\tbuffer %d port %s "
-   "actions length %u",
+   off += sizeof(*pout);
+   len = ntohs(pout->pout_actions_len);
+   log_debug("\tbuffer %d in_port %s actions_len %lu",
ntohl(pout->pout_buffer_id),
-   print_map(ntohl(pout->pout_in_port), ofp_port_map),
-   ntohs(pout->pout_actions_len));
-   len = ntohl(pout->pout_actions_len);
+   print_map(ntohl(pout->pout_in_port), ofp_port_map), len);
 
-   off += sizeof(*pout);
-   while ((ah = ibuf_seek(ibuf, off, len)) != NULL &&
-   ntohs(ah->ah_len) >= (uint16_t)sizeof(*ah)) {
-   switch (ntohs(ah->ah_type)) {
-   case OFP_ACTION_OUTPUT:
-   ao = (struct ofp_action_output *)ah;
-   log_debug("\t\taction type %s length %d "
-   "port %s max length %d",
-   print_map(ntohs(ao->ao_type), ofp_action_map),
-   ntohs(ao->ao_len),
-   print_map(ntohs(ao->ao_port), ofp_port_map),
-   ntohs(ao->ao_max_len));
-   break;
-   default:
-   log_debug("\t\taction type %s length %d",
-   print_map(ntohs(ah->ah_type), ofp_action_map),
-   ntohs(ah->ah_len));
-   break;
-   }
-   if (pout->pout_buffer_id == (uint32_t)-1)
-   break;
-   off += ntohs(ah->ah_len);
+parse_next_action:
+   if ((ah = ibuf_seek(ibuf, off, sizeof(*ah))) == NULL)
+   return (-1);
+
+   noff = off;
+   ofp13_validate_action(sc, oh, ibuf, , ah);
+
+   diff = off - noff;
+   /* Loop prevention. */
+   if (off < noff || diff == 0)
+   return (-1);
+
+   len -= diff;
+   if (len)
+   goto parse_next_action;
+
+   /* Check for encapsulated packet truncation. */
+   len = ntohs(oh->oh_length) - off;
+   plen = ibuf_length(ibuf) - off;
+
+   if (plen < len) {
+   log_debug("\ttruncated packet %lu < %lu", plen, len);
+
+   /* Buffered packets can be truncated */
+   if (pout->pout_buffer_id != OFP_PKTOUT_NO_BUFFER)
+   len = plen;
+   else
+   return (-1);
}
+   if (ibuf_seek(ibuf, off, len) == NULL)
+   return (-1);
 
return (0);
 }



Re: switchd(8): add flow_mod validation

2016-10-31 Thread Rafael Zalamena
On Mon, Oct 24, 2016 at 07:05:08PM +0200, Rafael Zalamena wrote:
> On Wed, Oct 12, 2016 at 05:39:17PM +0200, Rafael Zalamena wrote:
> > This diff teaches switchd(8) how to validate flow_mod messages, more
> > specifically the flow instructions and actions. The oxm validations
> > were already implemented so we get them for free here.
> 
> I've updated the flow_mod diff to also include the following changes:
>  - Better loop detection like I did for tcpdump(8);
>  - A small fix in packet-in OXM parsing (see note below);
>  - Reuse actions validation for packet-out and implement missing
>payload truncation check;
>  - Moved the new code away from packet-out to make diff looks less
>confusing;
> 
> Note:
> In packet-in we shouldn't use omlen with header size included, it only
> works because the padding is zeroed out. To avoid one more loop and
> erroneous zero header reading we should remove the header size.

I broke the last diff into more pieces, one of them was the header and it
was committed last week. So here is the new diff for the flow mod validation.

ok?

Index: ofp13.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.21
diff -u -p -r1.21 ofp13.c
--- ofp13.c 13 Oct 2016 08:29:14 -  1.21
+++ ofp13.c 31 Oct 2016 16:00:10 -
@@ -59,6 +59,12 @@ int   ofp13_features_reply(struct switchd
 int ofp13_validate_error(struct switchd *,
struct sockaddr_storage *, struct sockaddr_storage *,
struct ofp_header *, struct ibuf *);
+int ofp13_validate_action(struct switchd *, struct ofp_header *,
+   struct ibuf *, off_t *, struct ofp_action_header *);
+int ofp13_validate_instruction(struct switchd *, struct ofp_header *,
+   struct ibuf *, off_t *, struct ofp_instruction *);
+int ofp13_validate_flow_mod(struct switchd *, struct sockaddr_storage *,
+   struct sockaddr_storage *, struct ofp_header *, struct ibuf *);
 int ofp13_validate_oxm_basic(struct ibuf *, off_t, int, uint8_t);
 int ofp13_validate_oxm(struct switchd *, struct ofp_ox_match *,
struct ofp_header *, struct ibuf *, off_t);
@@ -129,7 +135,7 @@ struct ofp_callback ofp13_callbacks[] = 
{ OFP_T_FLOW_REMOVED,   ofp13_flow_removed, NULL },
{ OFP_T_PORT_STATUS,NULL, NULL },
{ OFP_T_PACKET_OUT, NULL, ofp13_validate_packet_out },
-   { OFP_T_FLOW_MOD,   NULL, NULL },
+   { OFP_T_FLOW_MOD,   NULL, ofp13_validate_flow_mod },
{ OFP_T_GROUP_MOD,  NULL, NULL },
{ OFP_T_PORT_MOD,   NULL, NULL },
{ OFP_T_TABLE_MOD,  NULL, NULL },
@@ -646,6 +652,298 @@ ofp13_features_reply(struct switchd *sc,
 #endif
ofp13_setconfig(sc, con, OFP_CONFIG_FRAG_NORMAL,
OFP_CONTROLLER_MAXLEN_NO_BUFFER);
+
+   return (0);
+}
+
+int
+ofp13_validate_action(struct switchd *sc, struct ofp_header *oh,
+struct ibuf *ibuf, off_t *off, struct ofp_action_header *ah)
+{
+   struct ofp_action_output*ao;
+   struct ofp_action_mpls_ttl  *amt;
+   struct ofp_action_push  *ap;
+   struct ofp_action_pop_mpls  *apm;
+   struct ofp_action_group *ag;
+   struct ofp_action_nw_ttl*ant;
+   struct ofp_action_set_field *asf;
+   struct ofp_action_set_queue *asq;
+   struct ofp_ox_match *oxm;
+   size_t   len;
+   int  type;
+   off_tmoff;
+
+   type = ntohs(ah->ah_type);
+   len = ntohs(ah->ah_len);
+
+   switch (type) {
+   case OFP_ACTION_OUTPUT:
+   if (len != sizeof(*ao))
+   return (-1);
+   if ((ao = ibuf_seek(ibuf, *off, sizeof(*ao))) == NULL)
+   return (-1);
+
+   *off += len;
+   log_debug("\t\taction %s len %lu port %s max_len %d",
+   print_map(type, ofp_action_map), len,
+   print_map(ntohl(ao->ao_port), ofp_port_map),
+   ntohs(ao->ao_max_len));
+   break;
+   case OFP_ACTION_SET_MPLS_TTL:
+   if (len != sizeof(*amt))
+   return (-1);
+   if ((amt = ibuf_seek(ibuf, *off, sizeof(*amt))) == NULL)
+   return (-1);
+
+   *off += len;
+   log_debug("\t\taction %s len %lu ttl %d",
+   print_map(type, ofp_action_map), len, amt->amt_ttl);
+   break;
+   case OFP_ACTION_PUSH_VLAN:
+   case OFP_ACTION_PUSH_MPLS:
+   case OFP_ACTION_PUSH_PBB:
+   if (len != sizeof(*ap))
+   return (-1);
+   if ((ap = ibuf_seek(ibuf, 

Re: switch(4): add more input validations

2016-10-31 Thread Rafael Zalamena
On Fri, Oct 28, 2016 at 07:56:12PM +0400, Reyk Floeter wrote:
> > On 28.10.2016, at 19:20, Rafael Zalamena <rzalam...@gmail.com> wrote:
> > This diff teaches switch(4) how to do more validations on dynamic input
> > field types, like: ofp_match (has N oxms), ofp_action_header (might be
> > followed by N actions) and ofp_instruction (might have N actions inside).
> > 
> > This is important because the internal switch structures reuse the ofp_match
> > and friends from the packet and blindly trusts it to be correct, so to be
> > able to do that we must ensure that we are receiving them correctly.
> > We need to pay special attention to the fields that these macros use:
> > - OFP_OXM_FOREACH;
> > - OFP_ACTION_FOREACH;
> > - OFP_FLOW_MOD_MSG_INSTRUCTION_OFFSET;
> > - OFP_FLOW_MOD_INSTRUCTON_FOREACH;
> > - OFP_I_ACTIONS_FOREACH;
> > - OFP_BUCKETS_FOREACH;
> ---snip---

I've committed most of the small diffs, but one of them turned out to be
different so I sent another email ('switch(4): input validation:
swofp_flow_entry_put_instructions').

Here is the diff after all those commits, with just one small change:
I have implemented set_queue action validation that was missing.

ok?

Index: switchofp.c
===
RCS file: /cvs/src/sys/net/switchofp.c,v
retrieving revision 1.23
diff -u -p -r1.23 switchofp.c
--- switchofp.c 31 Oct 2016 08:06:27 -  1.23
+++ switchofp.c 31 Oct 2016 14:52:13 -
@@ -211,8 +211,11 @@ int swofp_flow_cmp_strict(struct swofp_
 int swofp_flow_filter(struct swofp_flow_entry *, uint64_t, uint64_t,
uint32_t, uint32_t);
 voidswofp_flow_timeout(struct switch_softc *);
+int swofp_validate_oxm(struct ofp_ox_match *, int *);
 int swofp_validate_flow_match(struct ofp_match *, int *);
-int swofp_validate_flow_instruction(struct ofp_instruction *, int *);
+int swofp_validate_flow_instruction(struct ofp_instruction *, size_t,
+   int *);
+int swofp_validate_action(struct ofp_action_header *, size_t, int *);
 
 /*
  * OpenFlow protocol compare oxm
@@ -1807,75 +1810,242 @@ swofp_ox_cmp_ether_addr(struct ofp_ox_ma
}
 }
 
-
-/* TODO: validation for match */
 int
-swofp_validate_flow_match(struct ofp_match *om, int *err)
+swofp_validate_oxm(struct ofp_ox_match *oxm, int *err)
 {
-   struct ofp_oxm_class *handler;
-   struct ofp_ox_match *oxm;
+   struct ofp_oxm_class*handler;
+   int  length, hasmask;
+   int  neededlen;
 
-   OFP_OXM_FOREACH(om, ntohs(om->om_length), oxm) {
-   handler = swofp_lookup_oxm_handler(oxm);
-   if (handler == NULL ||
-   handler->oxm_match == NULL) {
-   *err = OFP_ERRMATCH_BAD_FIELD;
-   return (-1);
-   }
+   handler = swofp_lookup_oxm_handler(oxm);
+   if (handler == NULL || handler->oxm_match == NULL) {
+   *err = OFP_ERRMATCH_BAD_FIELD;
+   return (-1);
+   }
+
+   hasmask = OFP_OXM_GET_HASMASK(oxm);
+   length = oxm->oxm_length;
+
+   neededlen = (hasmask) ?
+   (handler->oxm_len * 2) : (handler->oxm_len);
+   if (oxm->oxm_length != neededlen) {
+   *err = OFP_ERRMATCH_BAD_LEN;
+   return (-1);
}
 
return (0);
 }
 
 int
-swofp_validate_flow_action_set_field(struct ofp_action_set_field *oasf)
+swofp_validate_flow_match(struct ofp_match *om, int *err)
 {
-   struct ofp_ox_match *oxm;
-   struct ofp_oxm_class*handler;
-
-   oxm = (struct ofp_ox_match *)oasf->asf_field;
+   struct ofp_ox_match *oxm;
 
-   handler = swofp_lookup_oxm_handler(oxm);
-   if (handler == NULL)
-   return (OFP_ERRACTION_SET_TYPE);
-   if (handler->oxm_set == NULL)
-   return (OFP_ERRACTION_SET_TYPE);
+   /*
+* TODO this function is missing checks for:
+* - OFP_ERRMATCH_BAD_TAG;
+* - OFP_ERRMATCH_BAD_VALUE;
+* - OFP_ERRMATCH_BAD_MASK;
+* - OFP_ERRMATCH_BAD_PREREQ;
+* - OFP_ERRMATCH_DUP_FIELD;
+*/
+   OFP_OXM_FOREACH(om, ntohs(om->om_length), oxm) {
+   if (swofp_validate_oxm(oxm, err))
+   return (*err);
+   }
 
return (0);
 }
 
-/* TODO: validation for instruction */
 int
-swofp_validate_flow_instruction(struct ofp_instruction *oi, int *error)
+swofp_validate_flow_instruction(struct ofp_instruction *oi, size_t total,
+int *err)
 {
struct ofp_action_header*oah;
struct ofp_instruction_actions  *oia;
+   int  ilen;
+
+   ilen = ntohs(oi->i_len);
+   /* Check for bigger than packet or smaller than header. */
+   if (ilen > total || ilen < sizeof(*oi)

switch(4): input validation: swofp_flow_entry_put_instructions

2016-10-31 Thread Rafael Zalamena
This diff is a part of the bigger diff to add more input validations to
the switch(4) OpenFlow protocol parser.

In this diff we reworked the swofp_flow_entry_put_instructions() function
with the following changes:
- Avoid leaking memory on repeated instructions. It is not possible to
  use the same instruction more than once, however the spec doesn't
  specify any errors for this occasion. (OpenFlow 1.3.5 spec page 26);
- Apply the same error return technique for this function;
- Remove old goto label;
- Fix some whitespace/tab issues;
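
The leak fix from the first item boils down to the pattern below. It is a
userland sketch with made-up names (struct flow_entry, fe_goto_table and
entry_set_goto() are hypothetical), using malloc(3)/free(3) instead of the
kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical flow entry that keeps one private copy per instruction. */
struct flow_entry {
	void	*fe_goto_table;
};

/*
 * Store a copy of the instruction, releasing the copy left behind by an
 * earlier instruction of the same type so that it is not leaked.
 */
int
entry_set_goto(struct flow_entry *fe, const void *inst, size_t len)
{
	void	*copy;

	if ((copy = malloc(len)) == NULL)
		return (-1);
	memcpy(copy, inst, len);

	free(fe->fe_goto_table);	/* free(NULL) is a no-op */
	fe->fe_goto_table = copy;

	return (0);
}
```

Note that the userland free(3) used in this sketch takes no size argument,
while the kernel free(9) used in the diff does.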

ok?

Index: net/switchofp.c
===
RCS file: /cvs/src/sys/net/switchofp.c,v
retrieving revision 1.23
diff -u -p -r1.23 switchofp.c
--- net/switchofp.c 31 Oct 2016 08:06:27 -  1.23
+++ net/switchofp.c 31 Oct 2016 08:24:43 -
@@ -195,7 +195,7 @@ int  swofp_validate_buckets(struct switc
  * Flow entry
  */
 int swofp_flow_entry_put_instructions(struct mbuf *,
-   struct swofp_flow_entry *);
+   struct swofp_flow_entry *, int *error);
 voidswofp_flow_entry_instruction_free(struct swofp_flow_entry *);
 voidswofp_flow_entry_free(struct swofp_flow_entry **);
 voidswofp_flow_entry_add(struct switch_softc *, struct swofp_flow_table *,
@@ -4570,12 +4570,12 @@ swofp_send_flow_removed(struct switch_so
  */
 int
 swofp_flow_entry_put_instructions(struct mbuf *m,
-struct swofp_flow_entry *swfe)
+struct swofp_flow_entry *swfe, int *error)
 {
struct ofp_flow_mod *ofm;
struct ofp_instruction  *oi;
caddr_t  inst;
-   int  start, len, off, error;
+   int  start, len, off;
 
ofm = mtod(m, struct ofp_flow_mod *);
 
@@ -4591,41 +4591,69 @@ swofp_flow_entry_put_instructions(struct
for (off = start; off < start + len; off += ntohs(oi->i_len)) {
oi = (struct ofp_instruction *)(mtod(m, caddr_t) + off);
 
-   if (swofp_validate_flow_instruction(oi, ))
-   goto failed;
+   if (swofp_validate_flow_instruction(oi, error))
+   return (-1);
 
if ((inst = malloc(ntohs(oi->i_len), M_DEVBUF,
-   M_DONTWAIT|M_ZERO)) == NULL) {
-   error = OFP_ERRFLOWMOD_UNKNOWN;
-   goto failed;
+   M_DONTWAIT|M_ZERO)) == NULL) {
+   *error = OFP_ERRFLOWMOD_UNKNOWN;
+   return (-1);
}
memcpy(inst, oi, ntohs(oi->i_len));
 
switch (ntohs(oi->i_type)) {
case OFP_INSTRUCTION_T_GOTO_TABLE:
+   if (swfe->swfe_goto_table)
+   free(swfe->swfe_goto_table, M_DEVBUF,
+   ntohs(oi->i_len));
+
swfe->swfe_goto_table =
(struct ofp_instruction_goto_table *)inst;
break;
case OFP_INSTRUCTION_T_WRITE_META:
+   if (swfe->swfe_write_metadata)
+   free(swfe->swfe_write_metadata, M_DEVBUF,
+   ntohs(oi->i_len));
+
swfe->swfe_write_metadata =
(struct ofp_instruction_write_metadata *)inst;
break;
case OFP_INSTRUCTION_T_WRITE_ACTIONS:
+   if (swfe->swfe_write_actions)
+   free(swfe->swfe_write_actions, M_DEVBUF,
+   ntohs(oi->i_len));
+
swfe->swfe_write_actions =
(struct ofp_instruction_actions *)inst;
break;
case OFP_INSTRUCTION_T_APPLY_ACTIONS:
+   if (swfe->swfe_apply_actions)
+   free(swfe->swfe_apply_actions, M_DEVBUF,
+   ntohs(oi->i_len));
+
swfe->swfe_apply_actions =
(struct ofp_instruction_actions *)inst;
break;
case OFP_INSTRUCTION_T_CLEAR_ACTIONS:
+   if (swfe->swfe_clear_actions)
+   free(swfe->swfe_clear_actions, M_DEVBUF,
+   ntohs(oi->i_len));
+
swfe->swfe_clear_actions =
(struct ofp_instruction_actions *)inst;
break;
case OFP_INSTRUCTION_T_METER:
+   if (swfe->swfe_meter)
+   free(swfe->swfe_meter, M_DEVBUF,
+   ntohs(oi->i_len));
+
swfe->swfe_meter = (struct ofp_instruction_meter *)inst;
break;
case OFP_INSTRUCTION_T_EXPERIMENTER:
+  

switch(4): add more input validations

2016-10-28 Thread Rafael Zalamena
This diff teaches switch(4) how to do more validations on dynamic input
field types, like: ofp_match (has N oxms), ofp_action_header (might be
followed by N actions) and ofp_instruction (might have N actions inside).

This is important because the internal switch structures reuse the ofp_match
and friends from the packet and blindly trust them to be correct, so to be
able to do that we must ensure that we are receiving them correctly.
We need to pay special attention to the fields that these macros use:
 - OFP_OXM_FOREACH;
 - OFP_ACTION_FOREACH;
 - OFP_FLOW_MOD_MSG_INSTRUCTION_OFFSET;
 - OFP_FLOW_MOD_INSTRUCTON_FOREACH;
 - OFP_I_ACTIONS_FOREACH;
 - OFP_BUCKETS_FOREACH;

Other small fixes:
 - Simplify OFP_FLOW_MOD_MSG_INSTRUCTION_OFFSET() macro;
 - Change swofp_flow_table_add() malloc behaviour to be like all the rest
   of the code (don't wait for memory);
 - swofp_flow_entry_put_instructions() doesn't need a pointer to an mbuf
   pointer, we only read it;
 - Change the validation function parameters: we can't check for non-errors
   using the value 0, because the spec actually uses it for some errors.
   Instead use int pointer to get error value and use the return value only
   to find out if the validation succeeded or not;
 - Return more accurate error messages: some error messages were being sent
   with the wrong type/code combination;
 - Removed swofp_validate_flow_action_set_field() as it was incorporated
   in action validation;
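
The new calling convention for the validation functions looks roughly like
the sketch below. The ERR_* values and validate_len() are hypothetical and
only illustrate why the return value and the error code have to be kept
apart: 0 is itself a valid error code.

```c
#include <stdint.h>

/* Hypothetical error codes: note that 0 is a real error, not "no error". */
#define ERR_BAD_TYPE	0
#define ERR_BAD_LEN	1

/*
 * Return 0 on success and -1 on failure. The error code is only written
 * to '*err' when the validation fails.
 */
int
validate_len(uint16_t len, uint16_t expected, int known_type, int *err)
{
	if (!known_type) {
		*err = ERR_BAD_TYPE;
		return (-1);
	}
	if (len != expected) {
		*err = ERR_BAD_LEN;
		return (-1);
	}

	return (0);
}
```

The caller checks the return value first and reads the error code only on
failure, so a code of 0 is never mistaken for success.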

TODO for next diffs:
 - We still need to code the validation for group messages, but since I
   haven't started using it yet, I didn't touch it;
 - We still need to implement validation for some specific OXM errors:
   duplicated OXM, invalid wildcard, invalid value or missing prereq;
 - swofp_validate_buckets() could use validate_action() with some effort;

Even though switchd(8) does packet validation (both on input and output),
we should commit this to enable other vendors/software to use the OpenBSD
switch(4) directly, otherwise we will need more effort to implement this
for switchd(8) relaying and force people to use switchd(8) needlessly.

ok?

Index: net/ofp.h
===
RCS file: /home/obsdcvs/src/sys/net/ofp.h,v
retrieving revision 1.2
diff -u -p -r1.2 ofp.h
--- net/ofp.h   30 Sep 2016 12:40:00 -  1.2
+++ net/ofp.h   25 Oct 2016 08:50:14 -
@@ -95,7 +95,7 @@ struct ofp_hello_element_versionbitmap {
 
 /* Ports */
 #define OFP_PORT_MAX   0xff00  /* Maximum number of physical 
ports */
-#defineOFP_PORT_INPUT  0xfff8  /* Send back to input 
port */
+#defineOFP_PORT_INPUT  0xfff8  /* Send back to input 
port */
 #define OFP_PORT_FLOWTABLE 0xfff9  /* Perform actions in flow 
table */
 #define OFP_PORT_NORMAL0xfffa  /* Let switch decide */
 #define OFP_PORT_FLOOD 0xfffb  /* All non-block ports except 
input */
@@ -179,9 +179,9 @@ struct ofp_switch_features {
 
 /* Switch capabilities */
 #define OFP_SWCAP_FLOW_STATS   0x1 /* Flow statistics */
-#define OFP_SWCAP_TABLE_STATS  0x2 /* Table statistics */
-#define OFP_SWCAP_PORT_STATS   0x4 /* Port statistics */
-#define OFP_SWCAP_GROUP_STATS  0x8 /* Group statistics */
+#define OFP_SWCAP_TABLE_STATS  0x2 /* Table statistics */
+#define OFP_SWCAP_PORT_STATS   0x4 /* Port statistics */
+#define OFP_SWCAP_GROUP_STATS  0x8 /* Group statistics */
 #define OFP_SWCAP_IP_REASM 0x20/* Can reassemble IP frags */
 #define OFP_SWCAP_QUEUE_STATS  0x40/* Queue statistics */
 #define OFP_SWCAP_ARP_MATCH_IP 0x80/* Match IP addresses in ARP 
pkts */
@@ -314,15 +314,15 @@ struct ofp_action_mpls_ttl {
 struct ofp_action_push {
uint16_tap_type;
uint16_tap_len;
-   uint16_tap_ethertype;
-   uint8_t pad[2];
+   uint16_tap_ethertype;
+   uint8_t ap_pad[2];
 } __packed;
 
 struct ofp_action_pop_mpls {
uint16_tapm_type;
uint16_tapm_len;
uint16_tapm_ethertype;
-   uint8_t pad[2];
+   uint8_t apm_pad[2];
 } __packed;
 
 struct ofp_action_group {
@@ -342,6 +342,12 @@ struct ofp_action_set_field {
uint16_tasf_type;
uint16_tasf_len;
uint8_t asf_field[4];
+} __packed;
+
+struct ofp_action_set_queue {
+   uint16_tasq_type;
+   uint16_tasq_len;
+   uint32_tasq_queue_id;
 } __packed;
 
 /* Packet-Out Message */
Index: net/switchofp.c
===
RCS file: /home/obsdcvs/src/sys/net/switchofp.c,v
retrieving revision 1.18
diff -u -p -r1.18 switchofp.c
--- net/switchofp.c 28 Oct 2016 09:01:49 -  1.18
+++ net/switchofp.c 28 Oct 2016 

Re: snmpd(8): teach how to fork+exec

2016-10-28 Thread Rafael Zalamena
On Sat, Oct 22, 2016 at 10:32:27PM +0200, Rafael Zalamena wrote:
> On Sat, Oct 22, 2016 at 08:14:16PM +0200, Jeremie Courreges-Anglas wrote:
> > Rafael Zalamena <rzalam...@gmail.com> writes:
> > > On Fri, Oct 21, 2016 at 01:26:36PM +0200, Jeremie Courreges-Anglas wrote:
> > >> Rafael Zalamena <rzalam...@gmail.com> writes:
> > >> ---snip---
>
> Short answer:
> Yes, you are correct to note this and I think now that it is probably better
> to write another diff to solve this problem. I'll get back at this diff later.
> 
> ---snip---
> 
> Just to clarify:
> I talked with reyk@ about this global env variables in the last hackathon,
> and we reached the conclusion that the best way to handle this is to use
> the ps_env whenever is possible, however since a lot of functions don't
> get access to ps, we must decide what does less changes to the daemon:
> 1) Use a single global variable (look at the httpd(8) commits);
> 2) Keep using the env (relayd(8) case);

The diff that makes snmpd(8) use only one global env is in, so now we can
move on with the fork+exec diff.

Here is the updated diff.

ok?

Index: proc.c
===
RCS file: /cvs/src/usr.sbin/snmpd/proc.c,v
retrieving revision 1.20
diff -u -p -r1.20 proc.c
--- proc.c  7 Dec 2015 16:05:56 -   1.20
+++ proc.c  28 Oct 2016 08:06:34 -
@@ -1,7 +1,7 @@
 /* $OpenBSD: proc.c,v 1.20 2015/12/07 16:05:56 reyk Exp $  */
 
 /*
- * Copyright (c) 2010 - 2014 Reyk Floeter <r...@openbsd.org>
+ * Copyright (c) 2010 - 2016 Reyk Floeter <r...@openbsd.org>
  * Copyright (c) 2008 Pierre-Yves Ritschard <p...@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
@@ -22,6 +22,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -34,8 +35,12 @@
 
 #include "snmpd.h"
 
-voidproc_open(struct privsep *, struct privsep_proc *,
-   struct privsep_proc *, size_t);
+voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
+   int, char **);
+voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
+voidproc_open(struct privsep *, int, int);
+voidproc_accept(struct privsep *, int, enum privsep_procid,
+   unsigned int);
 voidproc_close(struct privsep *);
 int proc_ispeer(struct privsep_proc *, unsigned int, enum privsep_procid);
 voidproc_shutdown(struct privsep_proc *);
@@ -55,204 +60,383 @@ proc_ispeer(struct privsep_proc *procs, 
return (0);
 }
 
-void
-proc_init(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc)
+enum privsep_procid
+proc_getid(struct privsep_proc *procs, unsigned int nproc,
+const char *proc_name)
 {
-   unsigned int i, j, src, dst;
-   struct privsep_pipes*pp;
+   struct privsep_proc *p;
+   unsigned int proc;
 
-   /*
-* Allocate pipes for all process instances (incl. parent)
-*
-* - ps->ps_pipes: N:M mapping
-* N source processes connected to M destination processes:
-* [src][instances][dst][instances], for example
-* [PROC_RELAY][3][PROC_CA][3]
-*
-* - ps->ps_pp: per-process 1:M part of ps->ps_pipes
-* Each process instance has a destination array of socketpair fds:
-* [dst][instances], for example
-* [PROC_PARENT][0]
-*/
-   for (src = 0; src < PROC_MAX; src++) {
-   /* Allocate destination array for each process */
-   if ((ps->ps_pipes[src] = calloc(ps->ps_ninstances,
-   sizeof(struct privsep_pipes))) == NULL)
-   fatal("proc_init: calloc");
+   for (proc = 0; proc < nproc; proc++) {
+   p = &procs[proc];
+   if (strcmp(p->p_title, proc_name))
+   continue;
 
-   for (i = 0; i < ps->ps_ninstances; i++) {
-   pp = &ps->ps_pipes[src][i];
+   return (p->p_id);
+   }
 
-   for (dst = 0; dst < PROC_MAX; dst++) {
-   /* Allocate maximum fd integers */
-   if ((pp->pp_pipes[dst] =
-   calloc(ps->ps_ninstances,
-   sizeof(int))) == NULL)
-   fatal("proc_init: calloc");
+   return (PROC_MAX);
+}
 
-   /* Mark fd as unused */
-   for (j = 0; j < ps->ps_ninstances; j++)
-   pp->pp_pipes[dst][j] = -1;
-   }
-   }
-   }
+void
+proc_exec(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc,
+int argc, char 

Re: switchd(8): add flow_mod validation

2016-10-24 Thread Rafael Zalamena
On Wed, Oct 12, 2016 at 05:39:17PM +0200, Rafael Zalamena wrote:
> This diff teaches switchd(8) how to validate flow_mod messages, more
> specifically the flow instructions and actions. The oxm validations
> were already implemented so we get them for free here.

I've updated the flow_mod diff to also include the following changes:
 - Better loop detection like I did for tcpdump(8);
 - A small fix in packet-in OXM parsing (see note below);
 - Reuse actions validation for packet-out and implement missing
   payload truncation check;
 - Moved the new code away from packet-out to make the diff look less
   confusing;

Note:
In packet-in we shouldn't use omlen with the header size included; it only
works because the padding is zeroed out. To avoid one extra loop iteration
and an erroneous read of a zeroed header, we should subtract the header size.
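
To illustrate the note, here is a minimal sketch of the length arithmetic,
assuming the OpenFlow 1.3 match layout (a 4-byte type/length header followed
by the OXM fields, then padding up to 8 bytes). The struct and helper names
are hypothetical and are not the switchd(8) code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the match header: type + length, 4 bytes. */
struct match_hdr {
	uint16_t mh_type;
	uint16_t mh_length;	/* header + OXM fields, padding excluded */
};

/* Round up to the 8-byte alignment used by OpenFlow 1.3 structures. */
static size_t
align8(size_t len)
{
	return ((len + 7) & ~(size_t)7);
}

/*
 * Bytes of OXM fields to walk: the advertised length minus the header.
 * Returns 0 when the advertised length cannot even hold the header.
 */
static size_t
oxm_payload_len(size_t advertised)
{
	if (advertised < sizeof(struct match_hdr))
		return (0);
	return (advertised - sizeof(struct match_hdr));
}
```

With an advertised length of 12, oxm_payload_len() gives 8 bytes to walk
and align8() gives 16 for the whole match. Walking 12 bytes from the first
OXM field would instead run 4 bytes into the padding.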

ok?

Index: sys/net/ofp.h
===
RCS file: /home/obsdcvs/src/sys/net/ofp.h,v
retrieving revision 1.2
diff -u -p -r1.2 ofp.h
--- sys/net/ofp.h   30 Sep 2016 12:40:00 -  1.2
+++ sys/net/ofp.h   12 Oct 2016 15:22:59 -
@@ -315,14 +315,14 @@ struct ofp_action_push {
uint16_tap_type;
uint16_tap_len;
uint16_tap_ethertype;
-   uint8_t pad[2];
+   uint8_t ap_pad[2];
 } __packed;
 
 struct ofp_action_pop_mpls {
uint16_tapm_type;
uint16_tapm_len;
uint16_tapm_ethertype;
-   uint8_t pad[2];
+   uint8_t apm_pad[2];
 } __packed;
 
 struct ofp_action_group {
@@ -342,6 +342,12 @@ struct ofp_action_set_field {
uint16_tasf_type;
uint16_tasf_len;
uint8_t asf_field[4];
+} __packed;
+
+struct ofp_action_set_queue {
+   uint16_tasq_type;
+   uint16_tasq_len;
+   uint32_tasq_queue_id;
 } __packed;
 
 /* Packet-Out Message */
Index: usr.sbin/switchd/ofp13.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.21
diff -u -p -r1.21 ofp13.c
--- usr.sbin/switchd/ofp13.c13 Oct 2016 08:29:14 -  1.21
+++ usr.sbin/switchd/ofp13.c24 Oct 2016 16:49:46 -
@@ -59,6 +59,12 @@ int   ofp13_features_reply(struct switchd
 int ofp13_validate_error(struct switchd *,
struct sockaddr_storage *, struct sockaddr_storage *,
struct ofp_header *, struct ibuf *);
+int ofp13_validate_action(struct switchd *, struct ofp_header *,
+   struct ibuf *, off_t *, struct ofp_action_header *);
+int ofp13_validate_instruction(struct switchd *, struct ofp_header *,
+   struct ibuf *, off_t *, struct ofp_instruction *);
+int ofp13_validate_flow_mod(struct switchd *, struct sockaddr_storage *,
+   struct sockaddr_storage *, struct ofp_header *, struct ibuf *);
 int ofp13_validate_oxm_basic(struct ibuf *, off_t, int, uint8_t);
 int ofp13_validate_oxm(struct switchd *, struct ofp_ox_match *,
struct ofp_header *, struct ibuf *, off_t);
@@ -129,7 +135,7 @@ struct ofp_callback ofp13_callbacks[] = 
{ OFP_T_FLOW_REMOVED,   ofp13_flow_removed, NULL },
{ OFP_T_PORT_STATUS,NULL, NULL },
{ OFP_T_PACKET_OUT, NULL, ofp13_validate_packet_out },
-   { OFP_T_FLOW_MOD,   NULL, NULL },
+   { OFP_T_FLOW_MOD,   NULL, ofp13_validate_flow_mod },
{ OFP_T_GROUP_MOD,  NULL, NULL },
{ OFP_T_PORT_MOD,   NULL, NULL },
{ OFP_T_TABLE_MOD,  NULL, NULL },
@@ -421,6 +427,7 @@ ofp13_validate_packet_in(struct switchd 
log_debug("\tmatch type %s length %zu (padded to %zu)",
print_map(ntohs(om->om_type), ofp_match_map),
mlen, OFP_ALIGN(mlen) + ETHER_ALIGN);
+   mlen -= sizeof(*om);
 
/* current match offset, aligned offset after all matches */
moff = off + sizeof(*om);
@@ -468,10 +475,9 @@ ofp13_validate_packet_out(struct switchd
 struct ofp_header *oh, struct ibuf *ibuf)
 {
struct ofp_packet_out   *pout;
-   size_t   len;
-   off_toff;
+   size_t   len, plen, diff;
+   off_toff, noff;
struct ofp_action_header*ah;
-   struct ofp_action_output*ao;
 
off = 0;
if ((pout = ibuf_seek(ibuf, off, sizeof(*pout))) == NULL) {
@@ -480,36 +486,43 @@ ofp13_validate_packet_out(struct switchd
return (-1);
}
 
-   log_debug("\tbuffer %d port %s "
-   "actions length %u",
+   off += sizeof(*pout);
+   len = ntohs(pout->pout_actions_len);
+   log_debug("\tbuffer %d in_port %s actions_len %lu",
ntohl(pout->pout_

tun(4)/tap(4): fix mbuf header space check

2016-10-24 Thread Rafael Zalamena
The tun(4)/tap(4) function tun_dev_write() checks against the wrong size for
the mbuf packet header. We must check against MHLEN (the mbuf header data
storage size) and not MINCLSIZE (smallest amount of data of a cluster).

For the curious:
MGETHDR() calls m_gethdr() which uses mbpool to get the mbuf data
storage. mbpool is initialized at mbinit() which sets the members
allocation size to MSIZE.

- MSIZE is "256";
- MLEN is "(MSIZE - sizeof(struct m_hdr))"
- MHLEN is "(MLEN - sizeof(pkthdr))";
- MINCLSIZE is "MHLEN + MLEN + 1";
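
As a rough sketch of how these sizes relate, the snippet below plugs in
made-up structure sizes (the real sizeof() values depend on the
architecture) and compares the old and new checks. All X_* names and both
helper functions are hypothetical, not kernel code:

```c
#include <stddef.h>

/* Assumed sizes for illustration only; the real values are per-arch. */
#define X_MSIZE		256
#define X_M_HDR_SIZE	32	/* hypothetical sizeof(struct m_hdr) */
#define X_PKTHDR_SIZE	72	/* hypothetical sizeof(struct pkthdr) */

#define X_MLEN		(X_MSIZE - X_M_HDR_SIZE)
#define X_MHLEN		(X_MLEN - X_PKTHDR_SIZE)
#define X_MINCLSIZE	(X_MHLEN + X_MLEN + 1)

/* Mirrors the patched check: ask for a cluster once resid reaches MHLEN. */
static int
needs_cluster(size_t resid)
{
	return (resid >= X_MHLEN);
}

/* Mirrors the old check: only ask for a cluster at MINCLSIZE and above. */
static int
old_needs_cluster(size_t resid)
{
	return (resid >= X_MINCLSIZE);
}
```

With these assumed numbers MHLEN is 152 and MINCLSIZE is 377, so a 200 byte
write is larger than the header mbuf storage but the old check would not
have requested a cluster for it.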

ok?

Index: sys/net/if_tun.c
===
RCS file: /home/obsdcvs/src/sys/net/if_tun.c,v
retrieving revision 1.169
diff -u -p -r1.169 if_tun.c
--- sys/net/if_tun.c4 Sep 2016 15:46:39 -   1.169
+++ sys/net/if_tun.c24 Oct 2016 07:43:03 -
@@ -895,7 +895,7 @@ tun_dev_write(struct tun_softc *tp, stru
if (m == NULL)
return (ENOBUFS);
mlen = MHLEN;
-   if (uio->uio_resid >= MINCLSIZE) {
+   if (uio->uio_resid >= MHLEN) {
MCLGET(m, M_DONTWAIT);
if (!(m->m_flags & M_EXT)) {
m_free(m);
@@ -926,7 +926,7 @@ tun_dev_write(struct tun_softc *tp, stru
break;
}
mlen = MLEN;
-   if (uio->uio_resid >= MINCLSIZE) {
+   if (uio->uio_resid >= MHLEN) {
MCLGET(m, M_DONTWAIT);
if (!(m->m_flags & M_EXT)) {
error = ENOBUFS;



snmpd(8): turn snmpd_env the only global

2016-10-23 Thread Rafael Zalamena
This diff removes all "extern struct snmpd *" lines from source files,
replaces all 'env' occurrences with 'snmpd_env' and adds the extern
declaration for snmpd_env in the snmpd.h header.

With this diff we only need to guarantee that this variable is set,
we avoid shadowing other 'env' variables and we diminish the confusion
about this env variable thing.

We need this diff (or something with the same effect) to proceed with
fork+exec, because jca@ found out that the traphandler child process does
not set env. We could alternatively just set env in traphandler p_init() and
remove snmpd_env, however it looks cleaner to me to not have all these
extern declarations in all .c files and to have a proper name for 'env'.

ok?


Index: kroute.c
===
RCS file: /home/obsdcvs/src/usr.sbin/snmpd/kroute.c,v
retrieving revision 1.33
diff -u -p -r1.33 kroute.c
--- kroute.c3 Sep 2016 15:45:02 -   1.33
+++ kroute.c22 Oct 2016 22:20:00 -
@@ -44,8 +44,6 @@
 
 #include "snmpd.h"
 
-extern struct snmpd*env;
-
 struct ktable  **krt;
 u_intkrt_size;
 
@@ -173,8 +171,9 @@ kr_init(void)
 &opt, sizeof(opt)) == -1)
log_warn("%s: SO_USELOOPBACK", __func__);   /* not fatal */
 
-   if (env->sc_rtfilter && setsockopt(kr_state.ks_fd, PF_ROUTE,
-   ROUTE_MSGFILTER, &env->sc_rtfilter, sizeof(env->sc_rtfilter)) == -1)
+   if (snmpd_env->sc_rtfilter && setsockopt(kr_state.ks_fd, PF_ROUTE,
+   ROUTE_MSGFILTER, &snmpd_env->sc_rtfilter,
+   sizeof(snmpd_env->sc_rtfilter)) == -1)
log_warn("%s: ROUTE_MSGFILTER", __func__);
 
/* grow receive buffer, don't wanna miss messages */
Index: mib.c
===
RCS file: /home/obsdcvs/src/usr.sbin/snmpd/mib.c,v
retrieving revision 1.80
diff -u -p -r1.80 mib.c
--- mib.c   17 Nov 2015 12:30:23 -  1.80
+++ mib.c   22 Oct 2016 22:26:52 -
@@ -58,8 +58,6 @@
 #include "snmpd.h"
 #include "mib.h"
 
-extern struct snmpd*env;
-
 /*
  * Defined in SNMPv2-MIB.txt (RFC 3418)
  */
@@ -255,7 +253,7 @@ mib_sysor(struct oid *oid, struct ber_oi
 int
 mib_getsnmp(struct oid *oid, struct ber_oid *o, struct ber_element **elm)
 {
-   struct snmp_stats   *stats = &env->sc_stats;
+   struct snmp_stats   *stats = &snmpd_env->sc_stats;
long longi;
struct statsmap {
u_int8_t m_id;
@@ -316,7 +314,7 @@ mib_getsnmp(struct oid *oid, struct ber_
 int
 mib_setsnmp(struct oid *oid, struct ber_oid *o, struct ber_element **elm)
 {
-   struct snmp_stats   *stats = &env->sc_stats;
+   struct snmp_stats   *stats = &snmpd_env->sc_stats;
long longi;
 
 if (ber_get_integer(*elm, &i) == -1)
@@ -354,11 +352,11 @@ mib_engine(struct oid *oid, struct ber_o
 {
switch (oid->o_oid[OIDIDX_snmpEngine]) {
case 1:
-   *elm = ber_add_nstring(*elm, env->sc_engineid,
-   env->sc_engineid_len);
+   *elm = ber_add_nstring(*elm, snmpd_env->sc_engineid,
+   snmpd_env->sc_engineid_len);
break;
case 2:
-   *elm = ber_add_integer(*elm, env->sc_engine_boots);
+   *elm = ber_add_integer(*elm, snmpd_env->sc_engine_boots);
break;
case 3:
*elm = ber_add_integer(*elm, snmpd_engine_time());
@@ -375,7 +373,7 @@ mib_engine(struct oid *oid, struct ber_o
 int
 mib_usmstats(struct oid *oid, struct ber_oid *o, struct ber_element **elm)
 {
-   struct snmp_stats   *stats = &env->sc_stats;
+   struct snmp_stats   *stats = &snmpd_env->sc_stats;
long longi;
struct statsmap {
u_int8_t m_id;
@@ -697,7 +695,7 @@ mib_hrdevice(struct oid *oid, struct ber
 
/* Get and verify the current row index */
idx = o->bo_id[OIDIDX_hrDeviceEntry];
-   if (idx > (u_int)env->sc_ncpu)
+   if (idx > (u_int)snmpd_env->sc_ncpu)
return (1);
 
/* Tables need to prepend the OID on their own */
@@ -748,7 +746,7 @@ mib_hrprocessor(struct oid *oid, struct 
 
/* Get and verify the current row index */
idx = o->bo_id[OIDIDX_hrDeviceEntry];
-   if (idx > (u_int)env->sc_ncpu)
+   if (idx > (u_int)snmpd_env->sc_ncpu)
return (1);
else if (idx < 1)
idx = 1;
@@ -766,9 +764,9 @@ mib_hrprocessor(struct oid *oid, struct 
 * The percentage of time that the system was not
 * idle during the last minute.
 */
-   if (env->sc_cpustates == NULL)
+   if (snmpd_env->sc_cpustates == NULL)
return (-1);
-   cptime2 = env->sc_cpustates + (CPUSTATES * (idx - 1));
+   cptime2 = snmpd_env->sc_cpustates + (CPUSTATES * (idx - 1));
val = 100 -
 

Re: snmpd(8): teach how to fork+exec

2016-10-22 Thread Rafael Zalamena
On Sat, Oct 22, 2016 at 08:14:16PM +0200, Jeremie Courreges-Anglas wrote:
> Rafael Zalamena <rzalam...@gmail.com> writes:
> 
> > On Fri, Oct 21, 2016 at 01:26:36PM +0200, Jeremie Courreges-Anglas wrote:
> >> Rafael Zalamena <rzalam...@gmail.com> writes:
> >> > On Fri, Oct 14, 2016 at 06:47:09PM +0200, Rafael Zalamena wrote:
> >> >> On Mon, Sep 26, 2016 at 03:45:59PM +0200, Rafael Zalamena wrote:
> >> >> ---snip---
> >> >
> >> > I got feedback from jca@ that the trap handler wasn't working, so after
> >> > trying to reproduce the problem myself I found one 'env' global variable
> >> > that was not being set and the child process was dying silently.
> >> > (thanks jca@ !)
> >> >
> >> > Instead of depending on snmpe.c:snmpe env initialization (p_init), I'm
> >> > now calling smi_setenv() to do that in the main() function so all children
> >> > get the same behaviour. Also note that we don't have an 'extern' env in
> >> > smi.c anymore.
> >> >
> >> > ok?
> >> 
> >> Works fine here, but then I don't understand the relationship between
> >> static struct snmpd *env in smi.c and struct snmpd *env in snmpe.c.
> >
> > smi.c had a "extern struct snmpd *env" and that variable was only being
> > set during the snmpe initialzation (p_init). Since with fork+exec the
> > child process runs entirely from scratch (no memory / socket sharing with
> > the parent process), we need to set it somewhere else.
> >
> > It is a known problem that everyone that used to set things in the p_init
> > and expected it to work for everyother process was wrong. sunil@ found this
> > the hard-way when he found out that p_env wasn't being set for his process
> > and he noticed that now p_init is ran in the child process already. Before
> > fork+exec the p_init() functions were run by the parent process.
> 
> I understand this, but...
> 
> > To fix the current problem I made the 'env' for smi.c to be a local file
> > global variable and set it in the main() process for every child.
> 
> right now there are mixed uses of a global 'env' variable, a global
> 'snmpd_env' variable, some local 'env' variables set using ps_env or
> cs_env fields.  I fear that throwing another *file-local* 'env' variable
> in the mix makes the code harder to follow.

Short answer:
Yes, you are correct to note this and I now think that it is probably better
to write another diff to solve this problem. I'll get back to this diff later.


Brief background:
With httpd(8) and relayd(8) we had the same problem: every file depended
on some function setting a global env, and it was being set by some p_init()
that is no longer called in that process because of fork+exec.


This case:
Before fork+exec, snmpe did us the favor of setting its global env in p_init,
which every file uses and which traphandler was also using indirectly. Now
that we do fork+exec the traphandler process doesn't get env anymore; however,
traphandler seems to only use smi.c, so the new diff addresses this problem.

I agree with you that this is not the final solution, but it is not the
objective of this diff to solve this problem.

If we don't feel that it is safe to proceed with this correction (properly
set env for all files even for traphandler), we must write another diff
that only handles this problem.


Just to clarify:
I talked with reyk@ about these global env variables at the last hackathon,
and we reached the conclusion that the best way to handle this is to use
ps_env whenever possible. However, since a lot of functions don't get
access to ps, we must decide which option requires fewer changes to the daemon:
1) Use a single global variable (look at the httpd(8) commits);
2) Keep using the env (relayd(8) case);

> 
> Also, why would smi.c be special?
> 
> kroute.c:47:extern struct snmpd   *env;
> mib.c:61:extern struct snmpd  *env;
> mps.c:48:extern struct snmpd *env;
> timer.c:45:extern struct snmpd*env;
> trap.c:42:extern struct snmpd *env;
> usm.c:45:extern struct snmpd  *env;



Re: snmpd(8): teach how to fork+exec

2016-10-21 Thread Rafael Zalamena
On Fri, Oct 21, 2016 at 01:26:36PM +0200, Jeremie Courreges-Anglas wrote:
> Rafael Zalamena <rzalam...@gmail.com> writes:
> > On Fri, Oct 14, 2016 at 06:47:09PM +0200, Rafael Zalamena wrote:
> >> On Mon, Sep 26, 2016 at 03:45:59PM +0200, Rafael Zalamena wrote:
> >> ---snip---
> >
> > I got feedback from jca@ that the trap handler wasn't working, so after
> > trying to reproduce the problem myself I found one 'env' global variable
> > that was not being set and the child process was dying silently.
> > (thanks jca@ !)
> >
> > Instead of depending on snmpe.c:snmpe env initialization (p_init), I'm
> > now calling smi_setenv() to do that in the main() function so all children
> > get the same behaviour. Also note that we don't have an 'extern' env in
> > smi.c anymore.
> >
> > ok?
> 
> Works fine here, but then I don't understand the relationship between
> static struct snmpd *env in smi.c and struct snmpd *env in snmpe.c.

smi.c had an "extern struct snmpd *env" and that variable was only being
set during the snmpe initialization (p_init). Since with fork+exec the
child process runs entirely from scratch (no memory / socket sharing with
the parent process), we need to set it somewhere else.

It is a known problem that everyone who used to set things in p_init
and expected it to work for every other process was wrong. sunil@ found this
out the hard way when he discovered that p_env wasn't being set for his
process and noticed that p_init is now run in the child process. Before
fork+exec the p_init() functions were run by the parent process.

To fix the current problem I made the 'env' in smi.c a file-local
global variable and set it in main() for every child.



Re: snmpd(8): teach how to fork+exec

2016-10-21 Thread Rafael Zalamena
On Fri, Oct 14, 2016 at 06:47:09PM +0200, Rafael Zalamena wrote:
> On Mon, Sep 26, 2016 at 03:45:59PM +0200, Rafael Zalamena wrote:
> > Lets teach snmpd(8) how to fork+exec using the proc.c file from the latest
> > switchd(8) diff.
> > 
> > Note 1: I just tested the basic operations: startup and teardown.
> > Note 2: the kill with close will be implemented in another diff with the
> > ps_pid removal.
> > 
> 
> I've update the diff with the latest proc.c changes.
> 
> Note: I still have to implement kill with close().
> 

I got feedback from jca@ that the trap handler wasn't working, so after
trying to reproduce the problem myself I found one 'env' global variable
that was not being set and the child process was dying silently.
(thanks jca@ !)

Instead of depending on snmpe.c:snmpe env initialization (p_init), I'm
now calling smi_setenv() to do that in the main() function so all children
get the same behaviour. Also note that we don't have an 'extern' env in
smi.c anymore.

ok?

Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/snmpd/proc.c,v
retrieving revision 1.20
diff -u -p -r1.20 proc.c
--- proc.c  7 Dec 2015 16:05:56 -   1.20
+++ proc.c  14 Oct 2016 15:42:19 -
@@ -1,7 +1,7 @@
 /* $OpenBSD: proc.c,v 1.20 2015/12/07 16:05:56 reyk Exp $  */
 
 /*
- * Copyright (c) 2010 - 2014 Reyk Floeter <r...@openbsd.org>
+ * Copyright (c) 2010 - 2016 Reyk Floeter <r...@openbsd.org>
  * Copyright (c) 2008 Pierre-Yves Ritschard <p...@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
@@ -22,6 +22,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -34,8 +35,12 @@
 
 #include "snmpd.h"
 
-voidproc_open(struct privsep *, struct privsep_proc *,
-   struct privsep_proc *, size_t);
+voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
+   int, char **);
+voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
+voidproc_open(struct privsep *, int, int);
+voidproc_accept(struct privsep *, int, enum privsep_procid,
+   unsigned int);
 voidproc_close(struct privsep *);
 int proc_ispeer(struct privsep_proc *, unsigned int, enum privsep_procid);
 voidproc_shutdown(struct privsep_proc *);
@@ -55,204 +60,383 @@ proc_ispeer(struct privsep_proc *procs, 
return (0);
 }
 
-void
-proc_init(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc)
+enum privsep_procid
+proc_getid(struct privsep_proc *procs, unsigned int nproc,
+const char *proc_name)
 {
-   unsigned int i, j, src, dst;
-   struct privsep_pipes*pp;
+   struct privsep_proc *p;
+   unsigned int proc;
 
-   /*
-* Allocate pipes for all process instances (incl. parent)
-*
-* - ps->ps_pipes: N:M mapping
-* N source processes connected to M destination processes:
-* [src][instances][dst][instances], for example
-* [PROC_RELAY][3][PROC_CA][3]
-*
-* - ps->ps_pp: per-process 1:M part of ps->ps_pipes
-* Each process instance has a destination array of socketpair fds:
-* [dst][instances], for example
-* [PROC_PARENT][0]
-*/
-   for (src = 0; src < PROC_MAX; src++) {
-   /* Allocate destination array for each process */
-   if ((ps->ps_pipes[src] = calloc(ps->ps_ninstances,
-   sizeof(struct privsep_pipes))) == NULL)
-   fatal("proc_init: calloc");
+   for (proc = 0; proc < nproc; proc++) {
+   p = &procs[proc];
+   if (strcmp(p->p_title, proc_name))
+   continue;
 
-   for (i = 0; i < ps->ps_ninstances; i++) {
-   pp = &ps->ps_pipes[src][i];
+   return (p->p_id);
+   }
 
-   for (dst = 0; dst < PROC_MAX; dst++) {
-   /* Allocate maximum fd integers */
-   if ((pp->pp_pipes[dst] =
-   calloc(ps->ps_ninstances,
-   sizeof(int))) == NULL)
-   fatal("proc_init: calloc");
+   return (PROC_MAX);
+}
 
-   /* Mark fd as unused */
-   for (j = 0; j < ps->ps_ninstances; j++)
-   pp->pp_pipes[dst][j] = -1;
-   }
-   }
-   }
+void
+proc_exec(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc,
+int argc, char **argv)
+{
+   unsigned int proc, nargc, i, proc_i;
+   char**nargv;
+   struct privsep_proc *p;
+ 

tcpdump(8): teach how to read ofp

2016-10-19 Thread Rafael Zalamena
.T ...d
  0040:  0003 0010     0010  
  0050:  fffb        
  0060:  38ea a771 1441 0806 0001 0800 0604  ..8..q.A
  0070: 0001 38ea a771 1441 ac17 01cb    ..8..q.A
  0080:  ac17 013c       .<..
  0090:      

---snip---

And here is the diff.

ok?

Index: Makefile
===
RCS file: /home/obsdcvs/src/usr.sbin/tcpdump/Makefile,v
retrieving revision 1.59
diff -u -p -r1.59 Makefile
--- Makefile14 Oct 2015 04:55:17 -  1.59
+++ Makefile18 Oct 2016 10:19:50 -
@@ -47,7 +47,7 @@ SRCS= tcpdump.c addrtoname.c privsep.c p
print-ip6.c print-ip6opts.c print-icmp6.c print-dhcp6.c print-frag6.c \
print-bgp.c print-ospf6.c print-ripng.c print-rt6.c print-stp.c \
print-etherip.c print-lwres.c print-lldp.c print-cdp.c print-pflog.c \
-   print-pfsync.c pf_print_state.c \
+   print-pfsync.c pf_print_state.c print-ofp.c \
print-udpencap.c print-carp.c \
print-802_11.c print-iapp.c print-mpls.c print-slow.c \
gmt2local.c savestr.c setsignal.c in_cksum.c
Index: interface.h
===
RCS file: /home/obsdcvs/src/usr.sbin/tcpdump/interface.h,v
retrieving revision 1.67
diff -u -p -r1.67 interface.h
--- interface.h 11 Jul 2016 00:27:50 -  1.67
+++ interface.h 18 Oct 2016 10:18:54 -
@@ -274,6 +274,7 @@ extern void mpls_print(const u_char *, u
 extern void lldp_print(const u_char *, u_int);
 extern void slow_print(const u_char *, u_int);
 extern void gtp_print(const u_char *, u_int, u_short, u_short);
+extern void ofp_print(const u_char *, u_int);
 
 #ifdef INET6
 extern void ip6_print(const u_char *, u_int);
Index: print-tcp.c
===
RCS file: /home/obsdcvs/src/usr.sbin/tcpdump/print-tcp.c,v
retrieving revision 1.35
diff -u -p -r1.35 print-tcp.c
--- print-tcp.c 16 Nov 2015 00:16:39 -  1.35
+++ print-tcp.c 18 Oct 2016 16:11:37 -
@@ -123,6 +123,10 @@ static struct tcp_seq_hash tcp_seq_hash[
 #endif
 #define NETBIOS_SSN_PORT 139
 
+/* OpenFlow TCP ports. */
+#define OLD_OFP_PORT   6633
+#define OFP_PORT   6653
+
 static int tcp_cksum(const struct ip *ip, const struct tcphdr *tp, int len)
 {
union phu {
@@ -665,6 +669,9 @@ tcp_print(const u_char *bp, u_int length
} else {
if (sport == BGP_PORT || dport == BGP_PORT)
bgp_print(bp, length);
+       else if (sport == OLD_OFP_PORT || dport == OLD_OFP_PORT ||
+   sport == OFP_PORT || dport == OFP_PORT)
+   ofp_print(bp, length);
 #if 0
else if (sport == NETBIOS_SSN_PORT || dport == NETBIOS_SSN_PORT)
nbt_tcp_print(bp, length);
--- /dev/null   Wed Oct 19 12:02:31 2016
+++ print-ofp.c Wed Oct 19 12:00:39 2016
@@ -0,0 +1,912 @@
+/* $$  */
+
+/*
+ * Copyright (c) 2016 Rafael Zalamena <rzalam...@openbsd.org>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+
+#include "interface.h"
+
+/* Size of action header without the padding. */
+#define AH_UNPADDED(offsetof(struct ofp_action_header, ah_pad))
+
+voidofp_print_hello(const u_char *, u_int, u_int);
+voidofp_print_featuresreply(const u_char *, u_int);
+voidofp_print_setconfig(const u_char *, u_int);
+voidofp_print_packetin(const u_char *, u_int);
+voidofp_print_packetout(const u_char *, u_int);
+voidofp_print_flowremoved(const u_char *, u_int);
+
+voidoxm_print_halfword(const u_char *, u_int, int, int);
+voidoxm_print_word(const u_char *, u_int, int, int);
+voidoxm_print_quad(const u_char *, u_int, int, int);
+voidoxm_print_ether(const u_char *, u_int, int);
+voidofp_print_oxm(struct ofp_ox_match *, const u_char *, u_int);
+
+voidaction_print_output(const u_char *, u_int);
+voidaction_print_group(const u_char *, u_int

ifconfig(8): fix set switch(4) datapath id

2016-10-17 Thread Rafael Zalamena
There are two inconsistencies in the ifconfig(8) switch(4) configuration:
1) The datapath ID is an unsigned 64-bit integer, not a signed one;
2) The ifconfig(8) man page says that the parameter is "datapath", not
   "datapathid";

This diff fixes both problems and lets us configure the datapath id
correctly.
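
To show why the signed parse is a problem, here is a small standalone
sketch (not the ifconfig(8) code) that parses the same string with
strtoull(3) and with strtoll(3). The function names parse_dpid() and
parse_dpid_signed() are made up for this example:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a datapath id as an unsigned 64-bit value.
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int
parse_dpid(const char *arg, uint64_t *out)
{
	char *endptr;
	unsigned long long v;

	errno = 0;
	v = strtoull(arg, &endptr, 0);
	if (arg[0] == '\0' || endptr[0] != '\0' || errno == ERANGE)
		return (-1);
	*out = v;
	return (0);
}

/* Same parse with strtoll(), to show where the signed version gives up. */
static int
parse_dpid_signed(const char *arg, int64_t *out)
{
	char *endptr;
	long long v;

	errno = 0;
	v = strtoll(arg, &endptr, 0);
	if (arg[0] == '\0' || endptr[0] != '\0' || errno == ERANGE)
		return (-1);
	*out = v;
	return (0);
}
```

A datapath id such as 0xffffffffffffffff is accepted by parse_dpid() but
rejected by parse_dpid_signed(), because strtoll(3) reports ERANGE for
anything above the signed 64-bit maximum.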

ok?


Index: brconfig.c
===
RCS file: /home/obsdcvs/src/sbin/ifconfig/brconfig.c,v
retrieving revision 1.11
diff -u -p -r1.11 brconfig.c
--- brconfig.c  3 Sep 2016 17:13:48 -   1.11
+++ brconfig.c  17 Oct 2016 10:11:25 -
@@ -1051,7 +1051,7 @@ switch_datapathid(const char *arg, int d
char *endptr;
 
errno = 0;
-   newdpid = strtoll(arg, &endptr, 0);
+   newdpid = strtoull(arg, &endptr, 0);
if (arg[0] == '\0' || endptr[0] != '\0' || errno == ERANGE)
errx(1, "invalid arg for datapath-id: %s", arg);
 
Index: ifconfig.c
===
RCS file: /home/obsdcvs/src/sbin/ifconfig/ifconfig.c,v
retrieving revision 1.330
diff -u -p -r1.330 ifconfig.c
--- ifconfig.c  3 Sep 2016 13:46:57 -   1.330
+++ ifconfig.c  17 Oct 2016 09:48:33 -
@@ -517,7 +517,7 @@ const structcmd {
{ "-roaming",   0,  0,  umb_roaming },
{ "patch",  NEXTARG,0,  setpair },
{ "-patch", 1,  0,  unsetpair },
-   { "datapathid", NEXTARG,0,  switch_datapathid },
+   { "datapath",   NEXTARG,0,  switch_datapathid },
{ "portno", NEXTARG2,   0,  NULL, switch_portno },
{ "addlocal",   NEXTARG,0,  addlocal },
 #else /* SMALL */



Re: snmpd(8): teach how to fork+exec

2016-10-14 Thread Rafael Zalamena
On Mon, Sep 26, 2016 at 03:45:59PM +0200, Rafael Zalamena wrote:
> Lets teach snmpd(8) how to fork+exec using the proc.c file from the latest
> switchd(8) diff.
> 
> Note 1: I just tested the basic operations: startup and teardown.
> Note 2: the kill with close will be implemented in another diff with the
> ps_pid removal.
> 

I've updated the diff with the latest proc.c changes.

Note: I still have to implement kill with close().

ok?

Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/snmpd/proc.c,v
retrieving revision 1.20
diff -u -p -r1.20 proc.c
--- proc.c  7 Dec 2015 16:05:56 -   1.20
+++ proc.c  14 Oct 2016 15:42:19 -
@@ -1,7 +1,7 @@
 /* $OpenBSD: proc.c,v 1.20 2015/12/07 16:05:56 reyk Exp $  */
 
 /*
- * Copyright (c) 2010 - 2014 Reyk Floeter <r...@openbsd.org>
+ * Copyright (c) 2010 - 2016 Reyk Floeter <r...@openbsd.org>
  * Copyright (c) 2008 Pierre-Yves Ritschard <p...@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
@@ -22,6 +22,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -34,8 +35,12 @@
 
 #include "snmpd.h"
 
-voidproc_open(struct privsep *, struct privsep_proc *,
-   struct privsep_proc *, size_t);
+voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
+   int, char **);
+voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
+voidproc_open(struct privsep *, int, int);
+voidproc_accept(struct privsep *, int, enum privsep_procid,
+   unsigned int);
 voidproc_close(struct privsep *);
 int proc_ispeer(struct privsep_proc *, unsigned int, enum privsep_procid);
 voidproc_shutdown(struct privsep_proc *);
@@ -55,204 +60,383 @@ proc_ispeer(struct privsep_proc *procs, 
return (0);
 }
 
-void
-proc_init(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc)
+enum privsep_procid
+proc_getid(struct privsep_proc *procs, unsigned int nproc,
+const char *proc_name)
 {
-   unsigned int i, j, src, dst;
-   struct privsep_pipes*pp;
+   struct privsep_proc *p;
+   unsigned int proc;
 
-   /*
-* Allocate pipes for all process instances (incl. parent)
-*
-* - ps->ps_pipes: N:M mapping
-* N source processes connected to M destination processes:
-* [src][instances][dst][instances], for example
-* [PROC_RELAY][3][PROC_CA][3]
-*
-* - ps->ps_pp: per-process 1:M part of ps->ps_pipes
-* Each process instance has a destination array of socketpair fds:
-* [dst][instances], for example
-* [PROC_PARENT][0]
-*/
-   for (src = 0; src < PROC_MAX; src++) {
-   /* Allocate destination array for each process */
-   if ((ps->ps_pipes[src] = calloc(ps->ps_ninstances,
-   sizeof(struct privsep_pipes))) == NULL)
-   fatal("proc_init: calloc");
+   for (proc = 0; proc < nproc; proc++) {
+   p = &procs[proc];
+   if (strcmp(p->p_title, proc_name))
+   continue;
 
-   for (i = 0; i < ps->ps_ninstances; i++) {
-   pp = &ps->ps_pipes[src][i];
+   return (p->p_id);
+   }
 
-   for (dst = 0; dst < PROC_MAX; dst++) {
-   /* Allocate maximum fd integers */
-   if ((pp->pp_pipes[dst] =
-   calloc(ps->ps_ninstances,
-   sizeof(int))) == NULL)
-   fatal("proc_init: calloc");
+   return (PROC_MAX);
+}
 
-   /* Mark fd as unused */
-   for (j = 0; j < ps->ps_ninstances; j++)
-   pp->pp_pipes[dst][j] = -1;
-   }
-   }
-   }
+void
+proc_exec(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc,
+int argc, char **argv)
+{
+   unsigned int proc, nargc, i, proc_i;
+   char**nargv;
+   struct privsep_proc *p;
+   char num[32];
+   int  fd;
+
+   /* Prepare the new process argv. */
+   nargv = calloc(argc + 5, sizeof(char *));
+   if (nargv == NULL)
+   fatal("%s: calloc", __func__);
+
+   /* Copy call argument first. */
+   nargc = 0;
+   nargv[nargc++] = argv[0];
+
+   /* Set process name argument and save the position. */
+   nargv[nargc++] = "-P";
+   proc_i = nargc;
+   nargc++;
+
+   /* Point process instance arg to stack and copy the original args. */
+  

Re: relayd(8): proc.c sync and remove fd limit change

2016-10-14 Thread Rafael Zalamena
On Tue, Oct 11, 2016 at 02:02:46AM +0200, Rafael Zalamena wrote:
> This diff brings the relayd(8) proc.c up-to-date and removes the file limit
> alteration in relayd.c. The file limit alteration is not needed anymore
> since now the number of descriptors pre-allocated is very small (only one
> descriptor per child + 2 to distribute fds between children).
> 
> It would be nice to have some feedback on this diff since this daemon is
> the one that makes the most use of proc.c's multiple child instances.

Here is an updated diff with the proc_flush_imsg() fix from reyk@.

Summary:
 * Fix the msgbuf_write() usage idiom;
 * Add context to fatal() messages;
 * Use proc_flush_imsg() instead of manually using imsg_flush();
 * Use fewer fds on startup.

ok?

Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/relayd/proc.c,v
retrieving revision 1.36
diff -u -p -r1.36 proc.c
--- proc.c  5 Oct 2016 17:31:28 -   1.36
+++ proc.c  14 Oct 2016 15:14:21 -
@@ -37,8 +37,6 @@
 
 voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
int, char **);
-voidproc_connectpeer(struct privsep *, enum privsep_procid, int,
-   struct privsep_pipes *);
 voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
 voidproc_open(struct privsep *, int, int);
 voidproc_accept(struct privsep *, int, enum privsep_procid,
@@ -157,72 +155,38 @@ proc_exec(struct privsep *ps, struct pri
 }
 
 void
-proc_connectpeer(struct privsep *ps, enum privsep_procid id, int inst,
-struct privsep_pipes *pp)
-{
-   unsigned int i, j;
-   struct privsep_fd pf;
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   /* Don't send socket to child itself. */
-   if (i == (unsigned int)id &&
-   j == (unsigned int)inst)
-   continue;
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   pf.pf_procid = i;
-   pf.pf_instance = j;
-   proc_compose_imsg(ps, id, inst, IMSG_CTL_PROCFD,
-   -1, pp->pp_pipes[i][j], &pf, sizeof(pf));
-   pp->pp_pipes[i][j] = -1;
-   }
-   }
-}
-
-/* Inter-connect all process except with ourself. */
-void
 proc_connect(struct privsep *ps)
 {
-   unsigned int src, i, j;
-   struct privsep_pipes*pp;
struct imsgev   *iev;
+   unsigned int src, dst, inst;
 
-   /* Listen on appropriate pipes. */
-   src = privsep_process;
-   pp = &ps->ps_pipes[src][ps->ps_instance];
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Don't listen to ourself. */
-   if (i == src)
-   continue;
+   /* Don't distribute any sockets if we are not really going to run. */
+   if (ps->ps_noaction)
+   return;
 
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
+   for (dst = 0; dst < PROC_MAX; dst++) {
+   /* We don't communicate with ourselves. */
+   if (dst == PROC_PARENT)
+   continue;
 
-   iev = &ps->ps_ievs[i][j];
-   imsg_init(&iev->ibuf, pp->pp_pipes[i][j]);
+   for (inst = 0; inst < ps->ps_instances[dst]; inst++) {
+   iev = &ps->ps_ievs[dst][inst];
+   imsg_init(&iev->ibuf, ps->ps_pp->pp_pipes[dst][inst]);
event_set(&iev->ev, iev->ibuf.fd, iev->events,
iev->handler, iev->data);
event_add(&iev->ev, NULL);
}
}
 
-   /* Exchange pipes between process. */
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
+   /* Distribute the socketpair()s for everyone. */
+   for (src = 0; src < PROC_MAX; src++)
+   for (dst = src; dst < PROC_MAX; dst++) {
+   /* Parent already distributed its fds. */
+   if (src == PROC_PARENT || dst == PROC_PARENT)
+   continue;
 
-   for (j = 0; j < ps->ps_instances[i]; j++)
-   proc_connectpeer(ps, i, j, &ps->ps_pipes[i][j]);
-   }
+   proc_open(ps, src, dst);
+

switch(4): kill unused function

2016-10-14 Thread Rafael Zalamena
The switch(4) device has a function called switch_forward_flooder()
which doesn't seem to be used anywhere.

In switchofp.c we have swofp_action_output(), which is the place where
it would most likely be called, but it already contains the code that
does the same job.

Since it doesn't seem to fit anywhere I think we should just remove it,
and that's what this diff does.

ok?

Index: net/if_switch.c
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.c,v
retrieving revision 1.9
diff -u -p -r1.9 if_switch.c
--- net/if_switch.c 8 Oct 2016 23:36:10 -   1.9
+++ net/if_switch.c 14 Oct 2016 13:28:18 -
@@ -85,8 +85,6 @@ intswitch_stop(struct ifnet *, int);
 struct mbuf
*switch_port_ingress(struct switch_softc *, struct ifnet *,
struct mbuf *);
-voidswitch_forward_flooder(struct switch_softc *,
-   struct switch_flow_classify *, struct mbuf *);
 voidswitch_port_egress(struct switch_softc *, struct switch_fwdp_queue *,
struct mbuf *);
 int switch_ifenqueue(struct switch_softc *, struct ifnet *,
@@ -725,25 +720,6 @@ switch_port_ingress(struct switch_softc 
 #endif /* NPF */
 
return (m);
-}
-
-void
-switch_forward_flooder(struct switch_softc *sc,
-struct switch_flow_classify *swfcl, struct mbuf *m)
-{
-   struct switch_port   *swpo;
-   struct switch_fwdp_queue fwdp_q;
-   uint32_t src_port_no;
-
-   src_port_no = swfcl->swfcl_in_port;
-   TAILQ_INIT(&fwdp_q);
-   TAILQ_FOREACH(swpo, &sc->sc_swpo_list, swpo_list_next) {
-   if (swpo->swpo_port_no == src_port_no)
-   continue;
-   TAILQ_INSERT_HEAD(&fwdp_q, swpo, swpo_fwdp_next);
-   }
-
-   switch_port_egress(sc, &fwdp_q, m);
 }
 
 void



switch(4): fix packet_out message handling

2016-10-14 Thread Rafael Zalamena
The switch(4) packet_out handler wasn't handling some cases, so here is
the missing code.

1) pout_buffer_id is a 4-byte field and the code was using the wrong
   define to check for the absence of buffers;
2) When a buffer_id was sent the code didn't handle it; now when this
   happens we send an error message back;
3) Call the classifier function again for packet_out, otherwise the
   *_set_field* action functions will panic();
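
To illustrate point 1, here is a small sketch of why the define matters. The
constant values are the ones the OpenFlow 1.3 specification gives for the
32-bit "no buffer" id and the 16-bit max_len sentinel; the macro and function
names are local stand-ins, not the kernel's. All-ones reads the same in either
byte order, so no conversion is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Values as given in the OpenFlow 1.3 specification; names are local. */
#define NO_BUFFER_ID		0xffffffffU	/* 32-bit buffer_id sentinel */
#define MAXLEN_NO_BUFFER	0xffffU		/* 16-bit max_len sentinel */

/* New test: true only when the packet data travels inside the message. */
int
pktout_has_inline_data(uint32_t buffer_id)
{
	return (buffer_id == NO_BUFFER_ID);
}

/* Old test: a 32-bit id compared against the 16-bit sentinel. */
int
pktout_old_test(uint32_t buffer_id)
{
	return (buffer_id != MAXLEN_NO_BUFFER);
}

/* The old test cannot tell "no buffer" apart from a real buffer id. */
void
pktout_compare(void)
{
	assert(pktout_has_inline_data(NO_BUFFER_ID));
	assert(!pktout_has_inline_data(7));
	assert(pktout_old_test(NO_BUFFER_ID));
	assert(pktout_old_test(7));	/* buffered, yet treated as inline */
}
```

With the old comparison nearly every id takes the inline-data branch, which is
why a request naming a real buffer went unnoticed instead of being rejected.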

ok?

Index: net/if_switch.c
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.c,v
retrieving revision 1.9
diff -u -p -r1.9 if_switch.c
--- net/if_switch.c 8 Oct 2016 23:36:10 -   1.9
+++ net/if_switch.c 13 Oct 2016 16:08:28 -
@@ -124,9 +124,6 @@ struct mbuf
 struct mbuf
*switch_flow_classifier_tunnel(struct mbuf *, int *,
struct switch_flow_classify *);
-struct mbuf
-   *switch_flow_classifier(struct mbuf *, uint32_t,
-   struct switch_flow_classify *);
 voidswitch_flow_classifier_dump(struct switch_softc *,
struct switch_flow_classify *);
 voidswitchattach(int);
Index: net/if_switch.h
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.h,v
retrieving revision 1.4
diff -u -p -r1.4 if_switch.h
--- net/if_switch.h 7 Oct 2016 08:18:22 -   1.4
+++ net/if_switch.h 13 Oct 2016 16:08:49 -
@@ -216,6 +216,9 @@ void switch_port_egress(struct switch_s
 int switch_swfcl_dup(struct switch_flow_classify *,
struct switch_flow_classify *);
 voidswitch_swfcl_free(struct switch_flow_classify *);
+struct mbuf
+   *switch_flow_classifier(struct mbuf *, uint32_t,
+   struct switch_flow_classify *);
 
 /* switchctl.c */
 voidswitch_dev_destroy(struct switch_softc *);
Index: net/switchofp.c
===
RCS file: /home/obsdcvs/src/sys/net/switchofp.c,v
retrieving revision 1.13
diff -u -p -r1.13 switchofp.c
--- net/switchofp.c 12 Oct 2016 09:50:55 -  1.13
+++ net/switchofp.c 14 Oct 2016 10:22:53 -
@@ -5080,7 +5080,7 @@ swofp_recv_packet_out(struct switch_soft
pout = mtod(m, struct ofp_packet_out *);
 
al_start = offsetof(struct ofp_packet_out, pout_actions);
-   if (pout->pout_buffer_id != OFP_CONTROLLER_MAXLEN_NO_BUFFER) {
+   if (pout->pout_buffer_id == OFP_PKTOUT_NO_BUFFER) {
/*
 * It's not necessary to deep copy at here because it's done
 * in m_dup_pkt().
@@ -5098,6 +5098,17 @@ swofp_recv_packet_out(struct switch_soft
}
 
mc = mcn;
+   } else {
+   /* TODO We don't do buffering yet. */
+   swofp_send_error(sc, m, OFP_ERRTYPE_BAD_REQUEST,
+   OFP_ERRREQ_BUFFER_UNKNOWN);
+   return (0);
+   }
+
+   mc = switch_flow_classifier(mc, pout->pout_in_port, );
+   if (mc == NULL) {
+   m_freem(m);
+   return (0);
}
 
TAILQ_INIT(_fwdp_q);



switchd(8): add flow_mod validation

2016-10-12 Thread Rafael Zalamena
This diff teaches switchd(8) how to validate flow_mod messages, more
specifically the flow instructions and actions. The oxm validations
were already implemented so we get them for free here.
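
As a rough illustration of what validating an action list involves, the sketch
below walks a buffer of type/length-prefixed actions and checks every length
before using it. It assumes the OpenFlow 1.3 layout (2-byte type, 2-byte
length, actions padded to 8 bytes). The function name actions_validate() is
hypothetical and this is not the code from the diff, which works on ibufs and
also decodes each action type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACTION_HDR_LEN	4	/* type (2 bytes) + len (2 bytes) */
#define ACTION_ALIGN	8	/* OpenFlow 1.3 pads actions to 8 bytes */

/* Read a big-endian 16-bit value. */
static uint16_t
be16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Walk a list of actions, checking every length before trusting it.
 * Returns the number of actions, or -1 if the list is malformed.
 */
int
actions_validate(const uint8_t *buf, size_t buflen)
{
	size_t	 off = 0;
	uint16_t len;
	int	 count = 0;

	assert(buf != NULL || buflen == 0);

	while (off < buflen) {
		/* The header itself must fit in what is left. */
		if (buflen - off < ACTION_HDR_LEN)
			return (-1);
		len = be16(buf + off + 2);
		/* Reject short, unaligned or overlong actions. */
		if (len < ACTION_ALIGN || len % ACTION_ALIGN != 0 ||
		    len > buflen - off)
			return (-1);
		off += len;
		count++;
	}
	return (count);
}
```

The zero-length check matters most: without it a crafted action with len 0
would keep the loop from ever advancing.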

ok?

Index: sys/net/ofp.h
===
RCS file: /cvs/src/sys/net/ofp.h,v
retrieving revision 1.2
diff -u -p -r1.2 ofp.h
--- sys/net/ofp.h   30 Sep 2016 12:40:00 -  1.2
+++ sys/net/ofp.h   12 Oct 2016 15:36:30 -
@@ -315,14 +315,14 @@ struct ofp_action_push {
uint16_t ap_type;
uint16_t ap_len;
uint16_t ap_ethertype;
-   uint8_t pad[2];
+   uint8_t ap_pad[2];
 } __packed;
 
 struct ofp_action_pop_mpls {
uint16_t apm_type;
uint16_t apm_len;
uint16_t apm_ethertype;
-   uint8_t pad[2];
+   uint8_t apm_pad[2];
 } __packed;
 
 struct ofp_action_group {
@@ -342,6 +342,12 @@ struct ofp_action_set_field {
uint16_t asf_type;
uint16_t asf_len;
uint8_t asf_field[4];
+} __packed;
+
+struct ofp_action_set_queue {
+   uint16_t asq_type;
+   uint16_t asq_len;
+   uint32_t asq_queue_id;
 } __packed;
 
 /* Packet-Out Message */
Index: usr.sbin/switchd/ofp13.c
===
RCS file: /cvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.20
diff -u -p -r1.20 ofp13.c
--- usr.sbin/switchd/ofp13.c12 Oct 2016 15:18:56 -  1.20
+++ usr.sbin/switchd/ofp13.c12 Oct 2016 15:36:30 -
@@ -54,6 +54,12 @@ int   ofp13_echo_request(struct switchd *
 int ofp13_validate_error(struct switchd *,
struct sockaddr_storage *, struct sockaddr_storage *,
struct ofp_header *, struct ibuf *);
+int ofp13_validate_action(struct switchd *, struct ofp_header *,
+   struct ibuf *, off_t *, struct ofp_action_header *);
+int ofp13_validate_instruction(struct switchd *, struct ofp_header *,
+   struct ibuf *, off_t *, struct ofp_instruction *);
+int ofp13_validate_flow_mod(struct switchd *, struct sockaddr_storage *,
+   struct sockaddr_storage *, struct ofp_header *, struct ibuf *);
 int ofp13_validate_oxm_basic(struct ibuf *, off_t, int, uint8_t);
 int ofp13_validate_oxm(struct switchd *, struct ofp_ox_match *,
struct ofp_header *, struct ibuf *, off_t);
@@ -121,7 +127,7 @@ struct ofp_callback ofp13_callbacks[] = 
{ OFP_T_FLOW_REMOVED,   ofp13_flow_removed, NULL },
{ OFP_T_PORT_STATUS,NULL, NULL },
{ OFP_T_PACKET_OUT, NULL, ofp13_validate_packet_out },
-   { OFP_T_FLOW_MOD,   NULL, NULL },
+   { OFP_T_FLOW_MOD,   NULL, ofp13_validate_flow_mod },
{ OFP_T_GROUP_MOD,  NULL, NULL },
{ OFP_T_PORT_MOD,   NULL, NULL },
{ OFP_T_TABLE_MOD,  NULL, NULL },
@@ -501,6 +507,274 @@ ofp13_validate_packet_out(struct switchd
if (pout->pout_buffer_id == (uint32_t)-1)
break;
off += ntohs(ah->ah_len);
+   }
+
+   return (0);
+}
+
+int
+ofp13_validate_action(struct switchd *sc, struct ofp_header *oh,
+struct ibuf *ibuf, off_t *off, struct ofp_action_header *ah)
+{
+   struct ofp_action_output*ao;
+   struct ofp_action_mpls_ttl  *amt;
+   struct ofp_action_push  *ap;
+   struct ofp_action_pop_mpls  *apm;
+   struct ofp_action_group *ag;
+   struct ofp_action_nw_ttl*ant;
+   struct ofp_action_set_field *asf;
+   struct ofp_action_set_queue *asq;
+   struct ofp_ox_match *oxm;
+   int  len, type;
+   off_tmoff;
+
+   type = ntohs(ah->ah_type);
+   len = ntohs(ah->ah_len);
+   switch (type) {
+   case OFP_ACTION_OUTPUT:
+   if ((ao = ibuf_seek(ibuf, *off, sizeof(*ao))) == NULL)
+   return (-1);
+
+   *off += len;
+   log_debug("\t\taction %s len %d port %u max_len %d",
+   print_map(type, ofp_action_map), len, ntohl(ao->ao_port),
+   ntohs(ao->ao_max_len));
+   break;
+   case OFP_ACTION_SET_MPLS_TTL:
+   if ((amt = ibuf_seek(ibuf, *off, sizeof(*amt))) == NULL)
+   return (-1);
+
+   *off += len;
+   log_debug("\t\taction %s len %d ttl %d",
+   print_map(type, ofp_action_map), len, amt->amt_ttl);
+   break;
+   case OFP_ACTION_PUSH_VLAN:
+   case OFP_ACTION_PUSH_MPLS:
+   case OFP_ACTION_PUSH_PBB:
+   if ((ap = ibuf_seek(ibuf, *off, sizeof(*ap))) == NULL)
+   return (-1);
+
+   *off += len;
+   log_debug("\t\taction 

switchd(8): implement the setconfig message

2016-10-12 Thread Rafael Zalamena
This diff teaches switchd(8) how to send the set_config message for
OpenFlow 1.3.5. We need this to set the default miss_send_len to
a value greater than zero so we can receive packets from the switch(4)
with the payload.
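
For reference, a set_config message on the wire is small: the 8-byte OpenFlow
header followed by two 16-bit fields, all in network byte order. The sketch
below encodes one by hand. The message type 9 and version 0x04 are taken from
the OpenFlow 1.3 specification; the function and macro names are hypothetical,
and this is not how switchd builds it (the diff below uses ibufs and
struct ofp_switch_config).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFP13_VERSION		0x04	/* OpenFlow 1.3 wire version */
#define OFP13_T_SET_CONFIG	9	/* OFPT_SET_CONFIG */
#define SET_CONFIG_LEN		12	/* header + flags + miss_send_len */

/* Store a 16-bit value in network byte order. */
static void
put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/* Store a 32-bit value in network byte order. */
static void
put_be32(uint8_t *p, uint32_t v)
{
	put_be16(p, (uint16_t)(v >> 16));
	put_be16(p + 2, (uint16_t)(v & 0xffff));
}

/* Encode a set_config message; returns its length, or 0 if buf is too small. */
size_t
set_config_encode(uint8_t *buf, size_t buflen, uint32_t xid,
    uint16_t flags, uint16_t miss_send_len)
{
	assert(buf != NULL);
	if (buflen < SET_CONFIG_LEN)
		return (0);

	buf[0] = OFP13_VERSION;
	buf[1] = OFP13_T_SET_CONFIG;
	put_be16(buf + 2, SET_CONFIG_LEN);	/* total message length */
	put_be32(buf + 4, xid);			/* transaction id */
	put_be16(buf + 8, flags);
	put_be16(buf + 10, miss_send_len);
	return (SET_CONFIG_LEN);
}
```

A miss_send_len of 0xffff is the specification's "no buffer" value, which asks
the switch to send the complete packet to the controller.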

ok?


Index: ofp13.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.19
diff -u -p -r1.19 ofp13.c
--- ofp13.c 7 Oct 2016 08:49:53 -   1.19
+++ ofp13.c 12 Oct 2016 12:47:40 -
@@ -99,6 +99,12 @@ struct ofp_group_mod *
 struct ofp_bucket *
ofp13_bucket(struct ibuf *, uint16_t, uint32_t, uint32_t);
 
+int ofp13_setconfig_validate(struct switchd *,
+   struct sockaddr_storage *, struct sockaddr_storage *,
+   struct ofp_header *, struct ibuf *);
+int ofp13_setconfig(struct switchd *, struct switch_connection *,
+   uint16_t, uint16_t);
+
 struct ofp_callback ofp13_callbacks[] = {
{ OFP_T_HELLO,  ofp13_hello, NULL },
{ OFP_T_ERROR,  NULL, ofp13_validate_error },
@@ -109,7 +115,7 @@ struct ofp_callback ofp13_callbacks[] = 
{ OFP_T_FEATURES_REPLY, NULL, NULL },
{ OFP_T_GET_CONFIG_REQUEST, NULL, NULL },
{ OFP_T_GET_CONFIG_REPLY,   NULL, NULL },
-   { OFP_T_SET_CONFIG, NULL, NULL },
+   { OFP_T_SET_CONFIG, NULL, ofp13_setconfig_validate },
{ OFP_T_PACKET_IN,  ofp13_packet_in,
ofp13_validate_packet_in },
{ OFP_T_FLOW_REMOVED,   ofp13_flow_removed, NULL },
@@ -585,6 +591,8 @@ ofp13_hello(struct switchd *sc, struct s
OFP_TABLE_ID_ALL);
ofp13_table_features(sc, con, 0);
ofp13_desc(sc, con);
+   ofp13_setconfig(sc, con, OFP_CONFIG_FRAG_REASM,
+   OFP_CONTROLLER_MAXLEN_NO_BUFFER);
 
return (0);
 }
@@ -1461,4 +1469,48 @@ ofp13_bucket(struct ibuf *ibuf, uint16_t
b->b_watch_port = htonl(watchport);
b->b_watch_group = htonl(watchgroup);
return (b);
+}
+
+int
+ofp13_setconfig_validate(struct switchd *sc,
+struct sockaddr_storage *src, struct sockaddr_storage *dst,
+struct ofp_header *oh, struct ibuf *ibuf)
+{
+   struct ofp_switch_config*cfg;
+
+   if ((cfg = ibuf_seek(ibuf, 0, sizeof(*cfg))) == NULL)
+   return (-1);
+
+   log_debug("\tflags %#04x miss_send_len %d",
+   ntohs(cfg->cfg_flags), ntohs(cfg->cfg_miss_send_len));
+   return (0);
+}
+
+int
+ofp13_setconfig(struct switchd *sc, struct switch_connection *con,
+ uint16_t flags, uint16_t misslen)
+{
+   struct ibuf *ibuf;
+   struct ofp_switch_config*cfg;
+   struct ofp_header   *oh;
+   int  rv;
+
+   if ((ibuf = ibuf_static()) == NULL ||
+   (cfg = ibuf_advance(ibuf, sizeof(*cfg))) == NULL)
+   return (-1);
+
+   cfg->cfg_flags = htons(flags);
+   cfg->cfg_miss_send_len = htons(misslen);
+
+   oh = &cfg->cfg_oh;
+   oh->oh_version = OFP_V_1_3;
+   oh->oh_type = OFP_T_SET_CONFIG;
+   oh->oh_length = htons(ibuf_length(ibuf));
+   oh->oh_xid = htonl(con->con_xidnxt++);
+   if (ofp13_validate(sc, &con->con_local, &con->con_peer, oh, ibuf) != 0)
+   return (-1);
+
+   rv = ofp_output(con, NULL, ibuf);
+   ibuf_free(ibuf);
+   return (rv);
 }



Re: vmd/vmctl load/reload/reset

2016-10-12 Thread Rafael Zalamena
On Wed, Oct 12, 2016 at 02:06:35PM +0200, Reyk Floeter wrote:
> On Wed, Oct 12, 2016 at 01:44:25PM +0200, Reyk Floeter wrote:
> > Hi,
> > 
> > vmctl reload is currently broken, the attached diff fixes it and
> > re-introduces the semantics that originally came from iked:
> > 
> > - load/reload just reloads the configuration without clearing any
> > running configuration.  This way you can start vmd with a few
> > configured vms, terminate one vm, and reload the configuration which
> > will restart the terminated vm but keep the running ones.
> > 
> > - reset clears the configuration (and possibly terminates vms) without
> > reloading the configuration.  "vmctl reset" thus terminates all vms.
> > 
> > # vmctl load /etc/my-personal-vm.conf
> > # vmctl reload
> > # vmctl reset
> > 
> > OK?
> > 
> 
> ajacoutot@ reminded me of SIGHUP:
> change it to reload instead of reset on HUP.
> 
> Updated diff, OK?

It fixes my reload problem (vmctl and SIGHUP), "vmctl load " works,
"vmctl reset" also works and the code reads fine.

ok rzalamena@



relayd(8): proc.c sync and remove fd limit change

2016-10-10 Thread Rafael Zalamena
This diff brings the relayd(8) proc.c up-to-date and removes the file limit
alteration in relayd.c. The file limit alteration is not needed anymore
since now the number of descriptors pre-allocated is very small (only one
descriptor per child + 2 to distribute fds between children).
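
To give a feel for the numbers, the sketch below compares the descriptor count
stated above (one per child plus two) with a simple model of allocating
everything up front. The full-mesh model, one socketpair() for every pair of
distinct instances including the parent, is an assumption made for
illustration only; the real old allocation may have differed.

```c
#include <assert.h>

/*
 * Descriptors held at startup if one socketpair() (2 fds) is
 * pre-allocated for every pair of distinct instances, parent included.
 * This full-mesh model is an assumption, not the old proc.c layout.
 */
unsigned int
fds_full_mesh(unsigned int nchildren)
{
	unsigned int n = nchildren + 1;		/* children + parent */

	assert(nchildren < 65535);	/* keep n * (n - 1) within 32 bits */
	return (2 * (n * (n - 1) / 2));
}

/* Descriptors pre-allocated as described: one per child + 2. */
unsigned int
fds_lazy(unsigned int nchildren)
{
	return (nchildren + 2);
}
```

Under that model ten child instances need 110 descriptors up front against 12,
and the gap grows quadratically with the number of instances.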

It would be nice to have some feedback on this diff since this daemon is
the one that makes the most use of proc.c's multiple child instances.

ok?


Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/relayd/proc.c,v
retrieving revision 1.36
diff -u -p -r1.36 proc.c
--- proc.c  5 Oct 2016 17:31:28 -   1.36
+++ proc.c  10 Oct 2016 23:55:24 -
@@ -37,8 +37,6 @@
 
 voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
int, char **);
-voidproc_connectpeer(struct privsep *, enum privsep_procid, int,
-   struct privsep_pipes *);
 voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
 voidproc_open(struct privsep *, int, int);
 voidproc_accept(struct privsep *, int, enum privsep_procid,
@@ -157,72 +155,38 @@ proc_exec(struct privsep *ps, struct pri
 }
 
 void
-proc_connectpeer(struct privsep *ps, enum privsep_procid id, int inst,
-struct privsep_pipes *pp)
-{
-   unsigned int i, j;
-   struct privsep_fd pf;
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   /* Don't send socket to child itself. */
-   if (i == (unsigned int)id &&
-   j == (unsigned int)inst)
-   continue;
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   pf.pf_procid = i;
-   pf.pf_instance = j;
-   proc_compose_imsg(ps, id, inst, IMSG_CTL_PROCFD,
-   -1, pp->pp_pipes[i][j], &pf, sizeof(pf));
-   pp->pp_pipes[i][j] = -1;
-   }
-   }
-}
-
-/* Inter-connect all process except with ourself. */
-void
 proc_connect(struct privsep *ps)
 {
-   unsigned int src, i, j;
-   struct privsep_pipes*pp;
struct imsgev   *iev;
+   unsigned int src, dst, inst;
 
-   /* Listen on appropriate pipes. */
-   src = privsep_process;
-   pp = &ps->ps_pipes[src][ps->ps_instance];
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Don't listen to ourself. */
-   if (i == src)
-   continue;
+   /* Don't distribute any sockets if we are not really going to run. */
+   if (ps->ps_noaction)
+   return;
 
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
+   for (dst = 0; dst < PROC_MAX; dst++) {
+   /* We don't communicate with ourselves. */
+   if (dst == PROC_PARENT)
+   continue;
 
-   iev = &ps->ps_ievs[i][j];
-   imsg_init(&iev->ibuf, pp->pp_pipes[i][j]);
+   for (inst = 0; inst < ps->ps_instances[dst]; inst++) {
+   iev = &ps->ps_ievs[dst][inst];
+   imsg_init(&iev->ibuf, ps->ps_pp->pp_pipes[dst][inst]);
event_set(&iev->ev, iev->ibuf.fd, iev->events,
iev->handler, iev->data);
event_add(&iev->ev, NULL);
}
}
 
-   /* Exchange pipes between process. */
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
+   /* Distribute the socketpair()s for everyone. */
+   for (src = 0; src < PROC_MAX; src++)
+   for (dst = src; dst < PROC_MAX; dst++) {
+   /* Parent already distributed its fds. */
+   if (src == PROC_PARENT || dst == PROC_PARENT)
+   continue;
 
-   for (j = 0; j < ps->ps_instances[i]; j++)
-   proc_connectpeer(ps, i, j, &ps->ps_pipes[i][j]);
-   }
+   proc_open(ps, src, dst);
+   }
 }
 
 void
@@ -230,17 +194,41 @@ proc_init(struct privsep *ps, struct pri
 int argc, char **argv, enum privsep_procid proc_id)
 {
struct privsep_proc *p = NULL;
+   struct privsep_pipes*pa, *pb;
unsigned int proc;
-   unsigned int src, dst;
+   unsigned int dst;
+   int  fds[2];
+
+   /* Don't initiate anything if we are not really going to run. */
+   if 

Re: httpd(8)/proc.c: use less fds on startup

2016-10-10 Thread Rafael Zalamena
On Mon, Oct 10, 2016 at 12:32:49PM +0200, Reyk Floeter wrote:
> On Tue, Oct 04, 2016 at 11:54:37PM +0200, Rafael Zalamena wrote:
> > On Tue, Oct 04, 2016 at 07:46:52PM +0200, Rafael Zalamena wrote:
> > > This diff makes proc.c daemons use fewer file descriptors on startup,
> > > which considerably increases the number of children we can have. It
> > > also improves the solution for a bug reported on bugs@:
> > > "httpd errors out with 'too many open files'".
> > > 
> > > To achieve that I delayed the socket distribution and made a minimal
> > > socket allocation in proc_init(), so only the necessary child sockets
> > > are allocated and passed with proc_exec(). After event_init() is called
> > > we call proc_connect(), which creates each socketpair() and sends it
> > > immediately instead of accumulating them.
> > > 
> > > Note: We still have to calculate how many fds we will want to have and
> > >   then limit the daemon prefork configuration.
> > > 
> > > ok?
> > > 
> > 
> > Paul de Weerd still found a problem with the diff: httpd(8) would not
> > exit successfully with the '-n' flag. It happened because the new
> > proc_connect() code tried to write fds to child processes that were
> > never started because of the ps_noaction flag. (thanks Paul!)
> > 
> > This new diff just adds a check for ps_noaction in proc_init() and
> > proc_connect() so it doesn't try to do anything if we are not really
> > going to start the daemon. I also removed the ps_noaction check from
> > proc_run() since we are not going to get to that point anymore.
> > 
> > ok?
> > 
> 
> The diff looks fine and works on httpd, but it doesn't seem to work on
> vmd and switchd (i didn't test relayd).  So there might be something
> wrong, please test applying it to them first.
> 
> Reyk
> 

It was aborting because msgbuf_write() was being called after the data
had already been imsg_flush()ed, so it returned 0 (like a closed socket
would) and we fatal()ed. The fix is simply to call imsg_event_add() to
remove the EV_WRITE, so msgbuf_write() won't be called anymore without
any data pending.

The diff also got some readability improvements to make proc_open()
smaller and with fewer conditionals.

ok?


Index: proc.c
===
RCS file: /cvs/src/usr.sbin/httpd/proc.c,v
retrieving revision 1.32
diff -u -p -r1.32 proc.c
--- proc.c  10 Oct 2016 16:31:35 -  1.32
+++ proc.c  10 Oct 2016 16:40:09 -
@@ -37,8 +37,6 @@
 
 voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
int, char **);
-voidproc_connectpeer(struct privsep *, enum privsep_procid, int,
-   struct privsep_pipes *);
 voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
 voidproc_open(struct privsep *, int, int);
 voidproc_accept(struct privsep *, int, enum privsep_procid,
@@ -157,72 +155,38 @@ proc_exec(struct privsep *ps, struct pri
 }
 
 void
-proc_connectpeer(struct privsep *ps, enum privsep_procid id, int inst,
-struct privsep_pipes *pp)
-{
-   unsigned int i, j;
-   struct privsep_fd pf;
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   /* Don't send socket to child itself. */
-   if (i == (unsigned int)id &&
-   j == (unsigned int)inst)
-   continue;
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   pf.pf_procid = i;
-   pf.pf_instance = j;
-   proc_compose_imsg(ps, id, inst, IMSG_CTL_PROCFD,
-   -1, pp->pp_pipes[i][j], &pf, sizeof(pf));
-   pp->pp_pipes[i][j] = -1;
-   }
-   }
-}
-
-/* Inter-connect all process except with ourself. */
-void
 proc_connect(struct privsep *ps)
 {
-   unsigned int src, i, j;
-   struct privsep_pipes*pp;
struct imsgev   *iev;
+   unsigned int src, dst, inst;
 
-   /* Listen on appropriate pipes. */
-   src = privsep_process;
-   pp = &ps->ps_pipes[src][ps->ps_instance];
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Don't listen to ourself. */
-   if (i == src)
-   continue;
+   /* Don't distribute any sockets if we are not really going to run. */
+ 

httpd(8): dup2() fix for proc.c

2016-10-05 Thread Rafael Zalamena
This diff fixes the same problem ntpd(8) had with dup2() when oldd == newd.

Quick background:
when you dup2(oldd, newd) and oldd == newd, the CLOEXEC flag won't be removed
from the descriptor. We could use dup3() to detect this, but it is easier/faster
just to compare the fds and do the fcntl() ourselves.
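
The idiom can be shown in isolation. The helper below is a minimal sketch
mirroring the logic of the diff; the name fd_place() is hypothetical and the
error handling is reduced to a return value instead of fatal().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor `fd` available as `target` without close-on-exec.
 * Returns 0 on success, -1 on error (errno set by dup2() or fcntl()).
 */
int
fd_place(int fd, int target)
{
	assert(fd >= 0 && target >= 0);

	if (fd != target) {
		/* The copy made by dup2() has FD_CLOEXEC cleared. */
		if (dup2(fd, target) == -1)
			return (-1);
		return (0);
	}
	/* Same descriptor: dup2() changes nothing, clear the flag by hand. */
	if (fcntl(fd, F_SETFD, 0) == -1)
		return (-1);
	return (0);
}
```

Calling dup2(fd, fd) on a descriptor that has FD_CLOEXEC set succeeds and
leaves the flag in place, so the descriptor would silently vanish across
execvp(); fd_place(fd, fd) clears it.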

ok?


Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/httpd/proc.c,v
retrieving revision 1.27
diff -u -p -r1.27 proc.c
--- proc.c  28 Sep 2016 12:01:04 -  1.27
+++ proc.c  5 Oct 2016 16:50:40 -
@@ -22,6 +22,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -131,7 +132,12 @@ proc_exec(struct privsep *ps, struct pri
break;
case 0:
/* Prepare parent socket. */
-   dup2(fd, PROC_PARENT_SOCK_FILENO);
+   if (fd != PROC_PARENT_SOCK_FILENO) {
+   if (dup2(fd, PROC_PARENT_SOCK_FILENO)
+   == -1)
+   fatal("dup2");
+   } else if (fcntl(fd, F_SETFD, 0) == -1)
+   fatal("fcntl");
 
execvp(argv[0], nargv);
fatal("%s: execvp", __func__);



Re: httpd(8)/proc.c: use less fds on startup

2016-10-04 Thread Rafael Zalamena
On Tue, Oct 04, 2016 at 07:46:52PM +0200, Rafael Zalamena wrote:
> This diff makes proc.c daemons use fewer file descriptors on startup,
> which considerably increases the number of children we can have. It
> also improves the solution for a bug reported on bugs@:
> "httpd errors out with 'too many open files'".
> 
> To achieve that I delayed the socket distribution and made a minimal
> socket allocation in proc_init(), so only the necessary child sockets
> are allocated and passed with proc_exec(). After event_init() is called
> we call proc_connect(), which creates each socketpair() and sends it
> immediately instead of accumulating them.
> 
> Note: We still have to calculate how many fds we will want to have and
>   then limit the daemon prefork configuration.
> 
> ok?
> 

Paul de Weerd still found a problem with the diff: httpd(8) would not
exit successfully with the '-n' flag. It happened because the new
proc_connect() code tried to write fds to child processes that were
never started because of the ps_noaction flag. (thanks Paul!)

This new diff just adds a check for ps_noaction in proc_init() and
proc_connect() so it doesn't try to do anything if we are not really
going to start the daemon. I also removed the ps_noaction check from
proc_run() since we are not going to get to that point anymore.

ok?


Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/httpd/proc.c,v
retrieving revision 1.27
diff -u -p -r1.27 proc.c
--- proc.c  28 Sep 2016 12:01:04 -  1.27
+++ proc.c  4 Oct 2016 21:50:43 -
@@ -36,8 +36,6 @@
 
 voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
int, char **);
-voidproc_connectpeer(struct privsep *, enum privsep_procid, int,
-   struct privsep_pipes *);
 voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
 voidproc_open(struct privsep *, int, int);
 voidproc_accept(struct privsep *, int, enum privsep_procid,
@@ -147,72 +145,18 @@ proc_exec(struct privsep *ps, struct pri
 }
 
 void
-proc_connectpeer(struct privsep *ps, enum privsep_procid id, int inst,
-struct privsep_pipes *pp)
-{
-   unsigned int i, j;
-   struct privsep_fd pf;
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   /* Don't send socket to child itself. */
-   if (i == (unsigned int)id &&
-   j == (unsigned int)inst)
-   continue;
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   pf.pf_procid = i;
-   pf.pf_instance = j;
-   proc_compose_imsg(ps, id, inst, IMSG_CTL_PROCFD,
-   -1, pp->pp_pipes[i][j], &pf, sizeof(pf));
-   pp->pp_pipes[i][j] = -1;
-   }
-   }
-}
-
-/* Inter-connect all process except with ourself. */
-void
 proc_connect(struct privsep *ps)
 {
-   unsigned int src, i, j;
-   struct privsep_pipes*pp;
-   struct imsgev   *iev;
-
-   /* Listen on appropriate pipes. */
-   src = privsep_process;
-   pp = &ps->ps_pipes[src][ps->ps_instance];
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Don't listen to ourself. */
-   if (i == src)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   iev = &ps->ps_ievs[i][j];
-   imsg_init(&iev->ibuf, pp->pp_pipes[i][j]);
-   event_set(&iev->ev, iev->ibuf.fd, iev->events,
-   iev->handler, iev->data);
-   event_add(&iev->ev, NULL);
-   }
-   }
+   unsigned int src, dst;
 
-   /* Exchange pipes between process. */
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
+   /* Don't distribute any sockets if we are not really going to run. */
+   if (ps->ps_noaction)
+   return;
 
-   for (j = 0; j < ps->ps_instances[i]; j++)
-   proc_connectpeer(ps, i, j, &ps->ps_pipes[i][j]);
-   }
+   /* Distribute the socketpair()s for everyone. */
+   for (src = 0; src < PROC_MAX; src++)
+   for (dst = src; dst < PROC_MAX; dst++)
+   

httpd(8)/proc.c: use less fds on startup

2016-10-04 Thread Rafael Zalamena
This diff makes proc.c daemons use fewer file descriptors on startup,
which considerably increases the number of children we can have. It
also improves the solution for a bug reported on bugs@:
"httpd errors out with 'too many open files'".

To achieve that I delayed the socket distribution and made a minimal
socket allocation in proc_init(), so only the necessary child sockets
are allocated and passed with proc_exec(). After event_init() is called
we call proc_connect(), which creates each socketpair() and sends it
immediately instead of accumulating them.
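
The "create a socketpair() and send it right away" step relies on passing
descriptors between processes over a UNIX socket. proc.c does this through
imsg; the sketch below shows the underlying SCM_RIGHTS mechanism with plain
sendmsg()/recvmsg(). The names fd_send() and fd_recv() are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Properly aligned buffer for one control message carrying one int. */
union fd_cmsgbuf {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor `fd` over UNIX socket `sock`; 0 on success, -1 on error. */
int
fd_send(int sock, int fd)
{
	struct msghdr	 msg;
	struct cmsghdr	*cmsg;
	struct iovec	 iov;
	union fd_cmsgbuf cmsgbuf;
	char		 byte = 0;

	assert(sock >= 0 && fd >= 0);

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;		/* carry one byte of ordinary data */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor from `sock`; returns it, or -1 on error. */
int
fd_recv(int sock)
{
	struct msghdr	 msg;
	struct cmsghdr	*cmsg;
	struct iovec	 iov;
	union fd_cmsgbuf cmsgbuf;
	char		 byte;
	int		 fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return (fd);
}
```

Once the descriptor has been sent, the sender can close its own copy, which is
what keeps the parent's descriptor count low when each pair is handed over as
soon as it is created.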

Note: We still have to calculate how many fds we will want to have and
  then limit the daemon prefork configuration.

ok?


Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/httpd/proc.c,v
retrieving revision 1.27
diff -u -p -r1.27 proc.c
--- proc.c  28 Sep 2016 12:01:04 -  1.27
+++ proc.c  4 Oct 2016 17:26:41 -
@@ -36,8 +36,6 @@
 
 voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
int, char **);
-voidproc_connectpeer(struct privsep *, enum privsep_procid, int,
-   struct privsep_pipes *);
 voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
 voidproc_open(struct privsep *, int, int);
 voidproc_accept(struct privsep *, int, enum privsep_procid,
@@ -147,72 +145,14 @@ proc_exec(struct privsep *ps, struct pri
 }
 
 void
-proc_connectpeer(struct privsep *ps, enum privsep_procid id, int inst,
-struct privsep_pipes *pp)
-{
-   unsigned int i, j;
-   struct privsep_fd    pf;
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   /* Don't send socket to child itself. */
-   if (i == (unsigned int)id &&
-   j == (unsigned int)inst)
-   continue;
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   pf.pf_procid = i;
-   pf.pf_instance = j;
-   proc_compose_imsg(ps, id, inst, IMSG_CTL_PROCFD,
-   -1, pp->pp_pipes[i][j], &pf, sizeof(pf));
-   pp->pp_pipes[i][j] = -1;
-   }
-   }
-}
-
-/* Inter-connect all process except with ourself. */
-void
 proc_connect(struct privsep *ps)
 {
-   unsigned int src, i, j;
-   struct privsep_pipes*pp;
-   struct imsgev   *iev;
-
-   /* Listen on appropriate pipes. */
-   src = privsep_process;
-   pp = &ps->ps_pipes[src][ps->ps_instance];
-
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Don't listen to ourself. */
-   if (i == src)
-   continue;
-
-   for (j = 0; j < ps->ps_instances[i]; j++) {
-   if (pp->pp_pipes[i][j] == -1)
-   continue;
-
-   iev = &ps->ps_ievs[i][j];
-   imsg_init(&iev->ibuf, pp->pp_pipes[i][j]);
-   event_set(&iev->ev, iev->ibuf.fd, iev->events,
-   iev->handler, iev->data);
-   event_add(&iev->ev, NULL);
-   }
-   }
-
-   /* Exchange pipes between process. */
-   for (i = 0; i < PROC_MAX; i++) {
-   /* Parent is already connected with everyone. */
-   if (i == PROC_PARENT)
-   continue;
+   unsigned int src, dst;
 
-   for (j = 0; j < ps->ps_instances[i]; j++)
-   proc_connectpeer(ps, i, j, &ps->ps_pipes[i][j]);
-   }
+   /* Distribute the socketpair()s for everyone. */
+   for (src = 0; src < PROC_MAX; src++)
+   for (dst = src; dst < PROC_MAX; dst++)
+   proc_open(ps, src, dst);
 }
 
 void
@@ -220,17 +160,37 @@ proc_init(struct privsep *ps, struct pri
 int argc, char **argv, enum privsep_procid proc_id)
 {
struct privsep_proc *p = NULL;
+   struct privsep_pipes*pa, *pb;
unsigned int proc;
-   unsigned int src, dst;
+   unsigned int dst;
+   int  fds[2];
 
if (proc_id == PROC_PARENT) {
privsep_process = PROC_PARENT;
proc_setup(ps, procs, nproc);
 
-   /* Open socketpair()s for everyone. */
-   for (src = 0; src < PROC_MAX; src++)
-   for (dst = 0; dst < PROC_MAX; dst++)
-   proc_open(ps, src, dst);
+   /*
+* Create the children sockets so we can use them 
+* to distribute the rest of the socketpair()s 

bridge(4): fix span interface removal

2016-10-03 Thread Rafael Zalamena
While working on the "notify bridge of interface removal with hook" diff I
noticed that nothing removes span ports when their interface is destroyed.
To reproduce this problem, do the following steps:

# ifconfig vether0 up
# ifconfig bridge0 up
# ifconfig bridge0 addspan vether0
# ifconfig vether0 destroy
# ifconfig bridge0 # vether0 is still there!

The diff below fixes this problem by adding a hook for span ports as well,
and we end up with fewer lines of duplicated code.
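
The deduplication idea can be sketched in user space with the queue(3)
macros: one helper owns the "unlink and free" step, and both the
delete-by-name path and the destroy-everything path call it. This is only an
illustration; struct span_port and the span_*() functions are hypothetical
and are not the if_bridge.c code.

```c
#include <sys/queue.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical span port entry, only for this sketch. */
struct span_port {
	char			 name[16];
	TAILQ_ENTRY(span_port)	 next;
};
TAILQ_HEAD(span_list, span_port);

/* The single place that knows how to unlink and free a port. */
static void
span_detach(struct span_list *list, struct span_port *p)
{
	TAILQ_REMOVE(list, p, next);
	free(p);
}

/* Append a port with the given name; returns 0 on success, -1 on error. */
int
span_add(struct span_list *list, const char *name)
{
	struct span_port *p;

	if ((p = calloc(1, sizeof(*p))) == NULL)
		return (-1);
	strncpy(p->name, name, sizeof(p->name) - 1);
	TAILQ_INSERT_TAIL(list, p, next);
	return (0);
}

/* Remove one port by name; returns 0 if found, -1 otherwise. */
int
span_del(struct span_list *list, const char *name)
{
	struct span_port *p;

	TAILQ_FOREACH(p, list, next) {
		if (strncmp(p->name, name, sizeof(p->name)) == 0) {
			span_detach(list, p);
			return (0);	/* p is gone, stop iterating */
		}
	}
	return (-1);
}

/* Remove every port; returns how many were removed. */
int
span_destroy_all(struct span_list *list)
{
	struct span_port *p;
	int n = 0;

	while ((p = TAILQ_FIRST(list)) != NULL) {
		span_detach(list, p);
		n++;
	}
	return (n);
}
```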


ok?

Index: net/if_bridge.c
===
RCS file: /home/obsdcvs/src/sys/net/if_bridge.c,v
retrieving revision 1.286
diff -u -p -r1.286 if_bridge.c
--- net/if_bridge.c 3 Oct 2016 12:26:13 -   1.286
+++ net/if_bridge.c 3 Oct 2016 12:53:07 -
@@ -107,6 +107,7 @@
 void   bridgeattach(int);
 intbridge_ioctl(struct ifnet *, u_long, caddr_t);
 void   bridge_ifdetach(void *);
+void   bridge_spandetach(void *);
 intbridge_input(struct ifnet *, struct mbuf *, void *);
 void   bridge_process(struct ifnet *, struct mbuf *);
 void   bridgeintr_frame(struct bridge_softc *, struct ifnet *, struct mbuf *);
@@ -215,10 +216,8 @@ bridge_clone_destroy(struct ifnet *ifp)
bridge_rtflush(sc, IFBF_FLUSHALL);
while ((bif = TAILQ_FIRST(&sc->sc_iflist)) != NULL)
bridge_delete(sc, bif);
-   while ((bif = TAILQ_FIRST(&sc->sc_spanlist)) != NULL) {
-   TAILQ_REMOVE(&sc->sc_spanlist, bif, next);
-   free(bif, M_DEVBUF, sizeof *bif);
-   }
+   while ((bif = TAILQ_FIRST(&sc->sc_spanlist)) != NULL)
+   bridge_spandetach(bif);
 
bstp_destroy(sc->sc_stp);
 
@@ -408,6 +407,9 @@ bridge_ioctl(struct ifnet *ifp, u_long c
}
p->ifp = ifs;
p->bif_flags = IFBIF_SPAN;
+   p->bridge_sc = sc;
+   p->bif_dhcookie = hook_establish(ifs->if_detachhooks, 0,
+   bridge_spandetach, p);
SIMPLEQ_INIT(&p->bif_brlin);
SIMPLEQ_INIT(&p->bif_brlout);
TAILQ_INSERT_TAIL(&sc->sc_spanlist, p, next);
@@ -418,8 +420,7 @@ bridge_ioctl(struct ifnet *ifp, u_long c
TAILQ_FOREACH(p, &sc->sc_spanlist, next) {
if (strncmp(p->ifp->if_xname, req->ifbr_ifsname,
sizeof(p->ifp->if_xname)) == 0) {
-   TAILQ_REMOVE(&sc->sc_spanlist, p, next);
-   free(p, M_DEVBUF, sizeof *p);
+   bridge_spandetach(p);
break;
}
}
@@ -581,6 +582,17 @@ bridge_ifdetach(void *arg)
sc = bif->bridge_sc;
 
bridge_delete(sc, bif);
+}
+
+void
+bridge_spandetach(void *arg)
+{
+   struct bridge_iflist *p = (struct bridge_iflist *)arg;
+   struct bridge_softc *sc = p->bridge_sc;
+
+   hook_disestablish(p->ifp->if_detachhooks, p->bif_dhcookie);
+   TAILQ_REMOVE(&sc->sc_spanlist, p, next);
+   free(p, M_DEVBUF, sizeof(*p));
 }
 
 int



ntpd(8): use safer dup3() instead of dup2()

2016-10-02 Thread Rafael Zalamena
This diff is an improvement and an attempt to fix the bug where ntpd(8)
does not always stay running.

During the review of the syslogd fork+exec diff I noticed the use of dup3()
and went to read its man page: dup2() doesn't always remove the CLOEXEC
flag from the descriptor, so using dup3() is a better idea because it
returns a useful error when oldd == newd instead of silently leaving the
flag in place.

So if dup3() fails with EINVAL it means oldd == newd and we should
remove the CLOEXEC flag ourselves.
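
As a standalone sketch of that logic (outside of ntpd), the helper below
wraps the dup3() call and the fcntl() fallback. The name
fd_prepare_for_exec() is made up for this example, and the _GNU_SOURCE
define is only an assumption needed to get the dup3() prototype on glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* assumption: needed for dup3() on glibc only */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make `oldd` reachable as `newd` with close-on-exec cleared, so the
 * descriptor survives a later exec. Hypothetical helper for illustration.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int
fd_prepare_for_exec(int oldd, int newd)
{
	if (dup3(oldd, newd, 0) == -1) {
		/* With flags == 0, EINVAL is taken to mean oldd == newd. */
		if (errno != EINVAL)
			return (-1);
		/* Same descriptor: clear the close-on-exec flag by hand. */
		if (fcntl(oldd, F_SETFD, 0) == -1)
			return (-1);
	}
	return (0);
}
```

The daemon code calls fatal() where this sketch returns -1; returning an
error just keeps the example easy to exercise.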

ok?


Index: util.c
===
RCS file: /home/obsdcvs/src/usr.sbin/ntpd/util.c,v
retrieving revision 1.22
diff -u -p -r1.22 util.c
--- util.c  14 Sep 2016 13:20:16 -  1.22
+++ util.c  2 Oct 2016 18:31:41 -
@@ -16,6 +16,8 @@
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -184,7 +186,12 @@ start_child(char *pname, int cfd, int ar
break;
case 0:
/* Prepare the parent socket and execute. */
-   dup2(cfd, PARENT_SOCK_FILENO);
+   if (dup3(cfd, PARENT_SOCK_FILENO, 0) == -1) {
+   if (errno != EINVAL)
+   fatal("%s: dup3", __func__);
+   if (fcntl(cfd, F_SETFD, 0) == -1)
+   fatal("%s: fcntl", __func__);
+   }
 
execvp(argv[0], nargv);
fatal("%s: execvp", __func__);



bridge(4): use hook to notify interface detach

2016-10-02 Thread Rafael Zalamena
This diff does for bridge(4) what the switch(4) diff did for switch(4).

This diff removes bridge(4) code from if.c and uses the detach hook to
be notified about interface removals.

ok?


Index: net/if.c
===
RCS file: /home/obsdcvs/src/sys/net/if.c,v
retrieving revision 1.451
diff -u -p -r1.451 if.c
--- net/if.c28 Sep 2016 08:31:42 -  1.451
+++ net/if.c2 Oct 2016 20:40:30 -
@@ -895,12 +895,6 @@ if_deactivate(struct ifnet *ifp)
 */
dohooks(ifp->if_detachhooks, HOOK_REMOVE | HOOK_FREE);
 
-#if NBRIDGE > 0
-   /* Remove the interface from any bridge it is part of.  */
-   if (ifp->if_bridgeport)
-   bridge_ifdetach(ifp);
-#endif
-
 #if NSWITCH > 0
if (ifp->if_switchport)
switch_port_detach(ifp);
Index: net/if_bridge.c
===
RCS file: /home/obsdcvs/src/sys/net/if_bridge.c,v
retrieving revision 1.285
diff -u -p -r1.285 if_bridge.c
--- net/if_bridge.c 29 Sep 2016 11:37:44 -  1.285
+++ net/if_bridge.c 2 Oct 2016 20:39:56 -
@@ -106,6 +106,7 @@
 
 void   bridgeattach(int);
 intbridge_ioctl(struct ifnet *, u_long, caddr_t);
+void   bridge_ifdetach(void *);
 intbridge_input(struct ifnet *, struct mbuf *, void *);
 void   bridge_process(struct ifnet *, struct mbuf *);
 void   bridgeintr_frame(struct bridge_softc *, struct ifnet *, struct mbuf *);
@@ -244,6 +245,7 @@ bridge_delete(struct bridge_softc *sc, s
 
p->ifp->if_bridgeport = NULL;
error = ifpromisc(p->ifp, 0);
+   hook_disestablish(p->ifp->if_detachhooks, p->bif_dhcookie);
 
if_ih_remove(p->ifp, bridge_input, NULL);
TAILQ_REMOVE(&sc->sc_iflist, p, next);
@@ -356,6 +358,8 @@ bridge_ioctl(struct ifnet *ifp, u_long c
SIMPLEQ_INIT(&p->bif_brlin);
SIMPLEQ_INIT(&p->bif_brlout);
ifs->if_bridgeport = (caddr_t)p;
+   p->bif_dhcookie = hook_establish(ifs->if_detachhooks, 0,
+   bridge_ifdetach, ifs);
if_ih_insert(p->ifp, bridge_input, NULL);
TAILQ_INSERT_TAIL(&sc->sc_iflist, p, next);
break;
@@ -567,8 +571,9 @@ bridge_ioctl(struct ifnet *ifp, u_long c
 
 /* Detach an interface from a bridge.  */
 void
-bridge_ifdetach(struct ifnet *ifp)
+bridge_ifdetach(void *arg)
 {
+   struct ifnet *ifp = (struct ifnet *)arg;
struct bridge_softc *sc;
struct bridge_iflist *bif;
 
Index: net/if_bridge.h
===
RCS file: /home/obsdcvs/src/sys/net/if_bridge.h,v
retrieving revision 1.52
diff -u -p -r1.52 if_bridge.h
--- net/if_bridge.h 29 Sep 2016 11:37:44 -  1.52
+++ net/if_bridge.h 2 Oct 2016 20:35:26 -
@@ -398,6 +398,7 @@ struct bridge_iflist {
struct brl_head bif_brlout; /* output rules */
struct  ifnet *ifp; /* member interface */
u_int32_t   bif_flags;  /* member flags */
+   void*bif_dhcookie;
 };
 #define bif_state  bif_stp->bp_state
 
@@ -450,7 +451,6 @@ struct bridge_softc {
 extern const u_int8_t bstp_etheraddr[];
 struct llc;
 
-void   bridge_ifdetach(struct ifnet *);
 intbridge_output(struct ifnet *, struct mbuf *, struct sockaddr *,
 struct rtentry *);
 void   bridge_update(struct ifnet *, struct ether_addr *, int);



switch(4): use hook to notify interface detach

2016-10-02 Thread Rafael Zalamena
mpi@ suggested that it would be possible to use if_detachhooks to handle
the interface teardown instead of adding code to if.c, so this diff does
exactly that.

Not only do we get to remove switch(4) code from if.c, we also end up
with fewer lines of code by removing some duplicated teardown code in
switch_clone_destroy().
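
For readers not familiar with the detach hook idea, here is a small
user-space sketch of the pattern: a port registers a callback on the
interface it joins, keeps the returned cookie so it can unregister, and the
interface runs all callbacks when it goes away. The names hook_add(),
hook_remove(), hook_run_all() and struct demo_port are made up for this
example; this is not the kernel hook API used in the diff.

```c
#include <sys/queue.h>
#include <stdlib.h>

struct hook {
	void			(*fn)(void *);
	void			*arg;
	TAILQ_ENTRY(hook)	 next;
};
TAILQ_HEAD(hook_list, hook);

/* Register fn(arg); the returned cookie is used to unregister later. */
void *
hook_add(struct hook_list *hl, void (*fn)(void *), void *arg)
{
	struct hook *h;

	if ((h = malloc(sizeof(*h))) == NULL)
		return (NULL);
	h->fn = fn;
	h->arg = arg;
	TAILQ_INSERT_TAIL(hl, h, next);
	return (h);
}

/* Unregister a hook that has not run yet. */
void
hook_remove(struct hook_list *hl, void *cookie)
{
	struct hook *h = cookie;

	TAILQ_REMOVE(hl, h, next);
	free(h);
}

/*
 * Run and free every registered hook. Simplification of this sketch:
 * each hook is unlinked before it is called, so callbacks must not call
 * hook_remove() on their own cookie.
 */
void
hook_run_all(struct hook_list *hl)
{
	struct hook *h;

	while ((h = TAILQ_FIRST(hl)) != NULL) {
		TAILQ_REMOVE(hl, h, next);
		h->fn(h->arg);
		free(h);
	}
}

/* Example subscriber: a port that wants to know when it is detached. */
struct demo_port {
	int	attached;
};

void
demo_port_detach(void *arg)
{
	struct demo_port *p = arg;

	p->attached = 0;
}
```

The point of the design is that the generic interface code no longer needs
to know about each driver; it only runs whatever hooks were registered.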

ok?


Index: net/if_switch.c
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.c,v
retrieving revision 1.7
diff -u -p -r1.7 if_switch.c
--- net/if_switch.c 29 Sep 2016 11:37:44 -  1.7
+++ net/if_switch.c 2 Oct 2016 20:03:33 -
@@ -74,6 +74,7 @@ intswitch_port_set_local(struct switch
 int switch_port_unset_local(struct switch_softc *, struct switch_port *);
 int switch_ioctl(struct ifnet *, unsigned long, caddr_t);
 int switch_port_add(struct switch_softc *, struct ifbreq *);
+voidswitch_port_detach(void *);
 int switch_port_del(struct switch_softc *, struct ifbreq *);
 int switch_port_list(struct switch_softc *, struct ifbifconf *);
 int switch_output(struct ifnet *, struct mbuf *, struct sockaddr *,
@@ -215,16 +216,9 @@ switch_clone_destroy(struct ifnet *ifp)
struct ifnet*ifs;
 
TAILQ_FOREACH_SAFE(swpo, &sc->sc_swpo_list, swpo_list_next, tp) {
-   if ((ifs = if_get(swpo->swpo_ifindex)) != NULL) {
-   if (swpo->swpo_flags & IFBIF_LOCAL)
-   switch_port_unset_local(sc, swpo);
-   ifs->if_switchport = NULL;
-   ifpromisc(ifs, 0);
-   if_ih_remove(ifs, switch_input, NULL);
-   if_put(ifs);
-   TAILQ_REMOVE(&sc->sc_swpo_list, swpo, swpo_list_next);
-   free(swpo, M_DEVBUF, sizeof(*swpo));
-   } else
+   if ((ifs = if_get(swpo->swpo_ifindex)) != NULL)
+   switch_port_detach(ifs);
+   else
log(LOG_ERR, "failed to cleanup on ifindex(%d)\n",
swpo->swpo_ifindex);
}
@@ -576,6 +570,8 @@ switch_port_add(struct switch_softc *sc,
ifs->if_switchport = (caddr_t)swpo;
if_ih_insert(ifs, switch_input, NULL);
swpo->swpo_port_no = swofp_assign_portno(sc, ifs->if_index);
+   swpo->swpo_dhcookie = hook_establish(ifs->if_detachhooks, 0,
+   switch_port_detach, ifs);
 
nanouptime(&swpo->swpo_appended);
 
@@ -630,8 +626,9 @@ done:
 }
 
 void
-switch_port_detach(struct ifnet *ifp)
+switch_port_detach(void *arg)
 {
+   struct ifnet*ifp = (struct ifnet *)arg;
struct switch_softc *sc = ifp->if_softc;
struct switch_port  *swpo;
 
@@ -640,6 +637,7 @@ switch_port_detach(struct ifnet *ifp)
switch_port_unset_local(sc, swpo);
 
ifp->if_switchport = NULL;
+   hook_disestablish(ifp->if_detachhooks, swpo->swpo_dhcookie);
ifpromisc(ifp, 0);
if_ih_remove(ifp, switch_input, NULL);
TAILQ_REMOVE(&sc->sc_swpo_list, swpo, swpo_list_next);
Index: net/if_switch.h
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.h,v
retrieving revision 1.3
diff -u -p -r1.3 if_switch.h
--- net/if_switch.h 28 Sep 2016 08:31:42 -  1.3
+++ net/if_switch.h 2 Oct 2016 19:47:44 -
@@ -174,6 +174,7 @@ struct switch_port {
struct timespec  swpo_appended;
struct switch_softc *swpo_switch;
uint32_t swpo_flags;
+   void*swpo_dhcookie;
void(*swop_bk_start)(struct ifnet *);
 };
 
@@ -215,7 +216,6 @@ void switch_port_egress(struct switch_s
 int switch_swfcl_dup(struct switch_flow_classify *,
struct switch_flow_classify *);
 voidswitch_swfcl_free(struct switch_flow_classify *);
-voidswitch_port_detach(struct ifnet *);
 
 /* switchctl.c */
 voidswitch_dev_destroy(struct switch_softc *);
Index: net/if.c
===
RCS file: /home/obsdcvs/src/sys/net/if.c,v
retrieving revision 1.451
diff -u -p -r1.451 if.c
--- net/if.c28 Sep 2016 08:31:42 -  1.451
+++ net/if.c2 Oct 2016 19:51:40 -
@@ -130,10 +130,6 @@
 #include 
 #endif
 
-#if NSWITCH > 0
-#include 
-#endif
-
 void   if_attachsetup(struct ifnet *);
 void   if_attachdomain(struct ifnet *);
 void   if_attach_common(struct ifnet *);
@@ -899,11 +895,6 @@ if_deactivate(struct ifnet *ifp)
/* Remove the interface from any bridge it is part of.  */
if (ifp->if_bridgeport)
bridge_ifdetach(ifp);
-#endif
-
-#if NSWITCH > 0
-   if (ifp->if_switchport)
-   switch_port_detach(ifp);
 #endif
 
 #if NCARP > 0



Re: syslogd fork+exec

2016-10-01 Thread Rafael Zalamena
On Thu, Sep 29, 2016 at 08:09:23PM +0200, Alexander Bluhm wrote:
> Hi,
> 
> With this diff syslogd(8) does an exec on itself in the privileged
> parent process to reshuffle its memory layout.
> 
> As syslogd only forks once, it does not really matter whether we
> fork+exec in the child or in the parent.  To do it in the parent
> is easier as it has much less state.
> 
> ok?
> 
> bluhm

Your diff looks good and you made me realize that I should use dup3()
instead of dup2() to create the children sockets.

Short explanation for outsiders: dup2(fd1, fd2) duplicates fd1 onto fd2
and clears the CLOEXEC flag, except when fd1 == fd2: in that case the fd
keeps CLOEXEC and things will not work. This case does not occur in
httpd(8), relayd(8), ntpd(8) and switchd(8), but since code might be
copied around it would be good to fix it there as well.

I'm using this diff and it works in my default configuration, but since
I'm not familiar with syslogd I don't feel comfortable giving oks here.

I made one comment inline in the snipped diff below.

> 
> Index: usr.sbin/syslogd/privsep.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/syslogd/privsep.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 privsep.c
> --- usr.sbin/syslogd/privsep.c28 Jun 2016 18:22:50 -  1.61
> +++ usr.sbin/syslogd/privsep.c29 Sep 2016 17:55:03 -
> @@ -194,38 +162,87 @@ priv_init(char *conf, int numeric, int l
>   if (fd_unix[i] != -1)
>   close(fd_unix[i]);
>  
> - /* Save the config file specified by the child process */
> - if (strlcpy(config_file, conf, sizeof config_file) >= 
> sizeof(config_file))
> - errx(1, "config_file truncation");
> + if (dup3(socks[0], 3, 0) == -1)
> + err(1, "dup3 priv sock failed");
> + snprintf(childnum, sizeof(childnum), "%d", child_pid);
> + if ((privargv = reallocarray(NULL, argc + 3, sizeof(char *))) == NULL)
> + err(1, "alloc priv argv failed");
> + for (i = 0; i < argc; i++)
> + privargv[i] = argv[i];
> + privargv[i++] = "-P";
> + privargv[i++] = childnum;
> + privargv[i++] = NULL;
> + execv(privargv[0], privargv);
> + err(1, "exec priv '%s' failed", privargv[0]);
> +}
>  
> - if (stat(config_file, &cf_info) < 0)
> - err(1, "stat config file failed");
> +__dead void
> +priv_exec(char *conf, int numeric, int child, int argc, char *argv[])
> +{
> + int i, fd, sock, cmd, addr_len, result, restart;
> + size_t path_len, protoname_len, hostname_len, servname_len;
> + char path[PATH_MAX], protoname[5];
> + char hostname[NI_MAXHOST], servname[NI_MAXSERV];
> + struct sockaddr_storage addr;
> + struct stat cf_info, cf_stat;
> + struct addrinfo hints, *res0;
> + struct sigaction sa;
>  
> - /* Save whether or not the child can have access to getnameinfo(3) */
> - if (numeric > 0)
> - allow_getnameinfo = 0;
> - else
> - allow_getnameinfo = 1;
> + if (pledge("stdio rpath wpath cpath dns getpw sendfd id proc exec",
> + NULL) == -1)
> + err(1, "pledge priv");
> +
> + if (argc <= 2 || strcmp("-P", argv[argc - 2]) != 0)
> + errx(1, "exec without priv");
> + argv[argc -= 2] = NULL;
> +
> + sock = 3;
> + for (fd = 4; fd < 1024; fd++)
> + close(fd);

This could be replaced with "closefrom(4);".

> +
> + child_pid = child;
> +
> + memset(&sa, 0, sizeof(sa));
> + sigemptyset(&sa.sa_mask);
> + sa.sa_flags = SA_RESTART;
> + sa.sa_handler = SIG_DFL;
> + for (i = 1; i < _NSIG; i++)
> + sigaction(i, &sa, NULL);
> +
> + /* Pass TERM/HUP/INT/QUIT through to child, and accept CHLD */
> + sa.sa_handler = sig_pass_to_chld;
> + sigaction(SIGTERM, &sa, NULL);
> + sigaction(SIGHUP, &sa, NULL);
> + sigaction(SIGINT, &sa, NULL);
> + sigaction(SIGQUIT, &sa, NULL);
> + sa.sa_handler = sig_got_chld;
> + sa.sa_flags |= SA_NOCLDSTOP;
> + sigaction(SIGCHLD, &sa, NULL);
> +
> + setproctitle("[priv]");
> +
> + if (stat(conf, &cf_info) < 0)
> + err(1, "stat config file failed");
>  
>   TAILQ_INIT();
>   increase_state(STATE_CONFIG);
>   restart = 0;
>  
>   while (cur_state < STATE_QUIT) {
> - if (may_read(socks[0], &cmd, sizeof(int)))
> + if (may_read(sock, &cmd, sizeof(int)))
>   break;
>   switch (cmd) {
>   case PRIV_OPEN_TTY:
>   logdebug("[priv]: msg PRIV_OPEN_TTY received\n");
>   /* Expecting: length, path */
> - must_read(socks[0], &path_len, sizeof(size_t));
> + must_read(sock, &path_len, sizeof(size_t));
>   if (path_len == 0 || path_len > sizeof(path))
>   _exit(1);
> - must_read(socks[0], &path, path_len);
> +   

ntpd(8): use stack instead of heap

2016-10-01 Thread Rafael Zalamena
The ntpd(8) constraint fork+exec diff changed the way the constraint
processes are created, but it introduced new calloc()s to keep the diff
small and focused on the problem. Now that the fork+exec is in, this
diff moves those variables to the stack.

No functional changes, just changing variables storage location.
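
A generic sketch of this kind of conversion, not taken from constraint.c:
the structures and the read_query()/run_query() functions are hypothetical.
One detail worth keeping in mind is that calloc() returns zeroed memory
while stack variables start out indeterminate, so this sketch clears them
explicitly; whether the real code needs that depends on how the variables
are filled in.

```c
#include <string.h>

/* Hypothetical message layout, only for this sketch. */
struct demo_addr_msg {
	size_t	namelen;
	size_t	pathlen;
};

struct demo_query {
	int	fd;
};

/* Hypothetical reader: pretends a message with a name was received. */
static void
read_query(struct demo_query *q, struct demo_addr_msg *am)
{
	(void)q;
	am->namelen = 11;
}

/* Returns the total length of the trailing data described by the message. */
size_t
run_query(int fd)
{
	struct demo_query	 q;	/* was: calloc(1, sizeof(*q)) */
	struct demo_addr_msg	 am;	/* was: calloc(1, sizeof(*am)) */

	/* Stack storage is not zeroed for us the way calloc() memory is. */
	memset(&q, 0, sizeof(q));
	memset(&am, 0, sizeof(am));

	q.fd = fd;
	read_query(&q, &am);	/* pass addresses instead of pointers */

	return (am.namelen + am.pathlen);
}
```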

ok?


Index: constraint.c
===
RCS file: /home/obsdcvs/src/usr.sbin/ntpd/constraint.c,v
retrieving revision 1.32
diff -u -p -r1.32 constraint.c
--- constraint.c26 Sep 2016 17:17:01 -  1.32
+++ constraint.c1 Oct 2016 18:54:35 -
@@ -317,8 +317,8 @@ priv_constraint_readquery(struct constra
 void
 priv_constraint_child(const char *pw_dir, uid_t pw_uid, gid_t pw_gid)
 {
-   struct constraint   *cstr;
-   struct ntp_addr_msg *am;
+   struct constraint    cstr;
+   struct ntp_addr_msg  am;
uint8_t *data;
static char  addr[NI_MAXHOST];
struct timeval   rectv, xmttv;
@@ -332,10 +332,6 @@ priv_constraint_child(const char *pw_dir
if (setpriority(PRIO_PROCESS, 0, 0) == -1)
log_warn("could not set priority");
 
-   if ((cstr = calloc(1, sizeof(*cstr))) == NULL ||
-   (am = calloc(1, sizeof(*am))) == NULL)
-   fatal("%s: calloc", __func__);
-
/* Init TLS and load CA certs before chroot() */
if (tls_init() == -1)
fatalx("tls_init");
@@ -364,9 +360,9 @@ priv_constraint_child(const char *pw_dir
if (pledge("stdio inet", NULL) == -1)
fatal("pledge");
 
-   cstr->fd = CONSTRAINT_PASSFD;
-   imsg_init(&cstr->ibuf, cstr->fd);
-   priv_constraint_readquery(cstr, am, &data);
+   cstr.fd = CONSTRAINT_PASSFD;
+   imsg_init(&cstr.ibuf, cstr.fd);
+   priv_constraint_readquery(&cstr, &am, &data);
 
/*
 * Get the IP address as name and set the process title accordingly.
@@ -374,8 +370,8 @@ priv_constraint_child(const char *pw_dir
 * any DNS operation, so it is safe to be called without the dns
 * pledge.
 */
-   if (getnameinfo((struct sockaddr *)&cstr->addr->ss,
-   SA_LEN((struct sockaddr *)&cstr->addr->ss),
+   if (getnameinfo((struct sockaddr *)&cstr.addr->ss,
+   SA_LEN((struct sockaddr *)&cstr.addr->ss),
addr, sizeof(addr), NULL, 0,
NI_NUMERICHOST) != 0)
fatalx("%s getnameinfo", __func__);
@@ -394,21 +390,21 @@ priv_constraint_child(const char *pw_dir
fatal("%s fcntl F_SETFD", __func__);
 
/* Get remaining data from imsg in the unpriv child */
-   if (am->namelen) {
-   if ((cstr->addr_head.name =
-   get_string(data, am->namelen)) == NULL)
+   if (am.namelen) {
+   if ((cstr.addr_head.name =
+   get_string(data, am.namelen)) == NULL)
fatalx("invalid IMSG_CONSTRAINT_QUERY name");
-   data += am->namelen;
+   data += am.namelen;
}
-   if (am->pathlen) {
-   if ((cstr->addr_head.path =
-   get_string(data, am->pathlen)) == NULL)
+   if (am.pathlen) {
+   if ((cstr.addr_head.path =
+   get_string(data, am.pathlen)) == NULL)
fatalx("invalid IMSG_CONSTRAINT_QUERY path");
}
 
/* Run! */
if ((ctx = httpsdate_query(addr,
-   CONSTRAINT_PORT, cstr->addr_head.name, cstr->addr_head.path,
+   CONSTRAINT_PORT, cstr.addr_head.name, cstr.addr_head.path,
conf->ca, conf->ca_len, &rectv, &xmttv)) == NULL) {
/* Abort with failure but without warning */
exit(1);
@@ -418,9 +414,9 @@ priv_constraint_child(const char *pw_dir
iov[0].iov_len = sizeof(rectv);
iov[1].iov_base = &xmttv;
iov[1].iov_len = sizeof(xmttv);
-   imsg_composev(&cstr->ibuf,
+   imsg_composev(&cstr.ibuf,
IMSG_CONSTRAINT_RESULT, 0, 0, -1, iov, 2);
-   imsg_flush(&cstr->ibuf);
+   imsg_flush(&cstr.ibuf);
 
/* Tear down the TLS connection after sending the result */
httpsdate_free(ctx);



netstart+switch(4): delay interface start

2016-09-27 Thread Rafael Zalamena
switch(4) needs to have its interface start up delayed, otherwise the
netstart script will fail to configure switch(4) with virtual interfaces
like vether(4). This diff adds switch(4) to the delayed list just like
bridge(4).

ok?


Index: netstart
===
RCS file: /home/obsdcvs/src/etc/netstart,v
retrieving revision 1.170
diff -u -p -r1.170 netstart
--- netstart9 Sep 2016 19:48:16 -   1.170
+++ netstart27 Sep 2016 10:04:47 -
@@ -251,7 +251,7 @@ fi
 
 # Configure all the non-loopback interfaces which we know about, but
 # do not start interfaces which must be delayed. Refer to hostname.if(5)
-ifmstart "" "trunk svlan vlan carp gif gre pfsync pppoe tun bridge pflow"
+ifmstart "" "trunk svlan vlan carp gif gre pfsync pppoe tun bridge pflow switch"
 
 # The trunk interfaces need to come up first in this list.
 # The (s)vlan interfaces need to come up after trunk.
@@ -283,7 +283,7 @@ fi
 # require routes to be set. TUN might depend on PPPoE, and GIF or GRE may
 # depend on either of them. PFLOW might bind to ip addresses configured
 # on either of them.
-ifmstart "pppoe tun gif gre bridge pflow"
+ifmstart "pppoe tun gif gre bridge pflow switch"
 
 # Reject 127/8 other than 127.0.0.1.
 route -qn add -net 127 127.0.0.1 -reject >/dev/null



switch(4): don't panic when destroying interfaces

2016-09-26 Thread Rafael Zalamena
switch(4) is currently not handling device removal when the interface is
being destroyed.

Example:
# ifconfig switch0 up
# ifconfig vether0 up
# ifconfig switch0 add vether0
# ifconfig vether0 destroy # kernel panic here

This diff fixes it by calling the switch port detach at the right time.

ok?


Index: if.c
===
RCS file: /home/obsdcvs/src/sys/net/if.c,v
retrieving revision 1.450
diff -u -p -r1.450 if.c
--- if.c22 Sep 2016 14:50:11 -  1.450
+++ if.c26 Sep 2016 16:05:45 -
@@ -130,6 +130,10 @@
 #include 
 #endif
 
+#if NSWITCH > 0
+#include 
+#endif
+
 void   if_attachsetup(struct ifnet *);
 void   if_attachdomain(struct ifnet *);
 void   if_attach_common(struct ifnet *);
@@ -895,6 +899,11 @@ if_deactivate(struct ifnet *ifp)
/* Remove the interface from any bridge it is part of.  */
if (ifp->if_bridgeport)
bridge_ifdetach(ifp);
+#endif
+
+#if NSWITCH > 0
+   if (ifp->if_switchport)
+   switch_port_detach(ifp);
 #endif
 
 #if NCARP > 0
Index: if_switch.c
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.c,v
retrieving revision 1.5
diff -u -p -r1.5 if_switch.c
--- if_switch.c 4 Sep 2016 17:11:09 -   1.5
+++ if_switch.c 26 Sep 2016 16:08:17 -
@@ -629,6 +629,23 @@ done:
return (error);
 }
 
+void
+switch_port_detach(struct ifnet *ifp)
+{
+   struct switch_softc *sc = ifp->if_softc;
+   struct switch_port  *swpo;
+
+   swpo = (struct switch_port *)ifp->if_switchport;
+   if (swpo->swpo_flags & IFBIF_LOCAL)
+   switch_port_unset_local(sc, swpo);
+
+   ifp->if_switchport = NULL;
+   ifpromisc(ifp, 0);
+   if_ih_remove(ifp, switch_input, NULL);
+   TAILQ_REMOVE(&sc->sc_swpo_list, swpo, swpo_list_next);
+   free(swpo, M_DEVBUF, sizeof(*swpo));
+}
+
 int
 switch_port_del(struct switch_softc *sc, struct ifbreq *req)
 {
@@ -645,13 +662,7 @@ switch_port_del(struct switch_softc *sc,
}
 
if (swpo) {
-   if (swpo->swpo_flags & IFBIF_LOCAL)
-   switch_port_unset_local(sc, swpo);
-   ifs->if_switchport = NULL;
-   ifpromisc(ifs, 0);
-   if_ih_remove(ifs, switch_input, NULL);
-   TAILQ_REMOVE(&sc->sc_swpo_list, swpo, swpo_list_next);
-   free(swpo, M_DEVBUF, sizeof(*swpo));
+   switch_port_detach(ifs);
if_put(ifs);
error = 0;
} else
Index: if_switch.h
===
RCS file: /home/obsdcvs/src/sys/net/if_switch.h,v
retrieving revision 1.2
diff -u -p -r1.2 if_switch.h
--- if_switch.h 4 Sep 2016 16:47:41 -   1.2
+++ if_switch.h 26 Sep 2016 16:05:45 -
@@ -215,6 +215,7 @@ void switch_port_egress(struct switch_s
 int switch_swfcl_dup(struct switch_flow_classify *,
struct switch_flow_classify *);
 voidswitch_swfcl_free(struct switch_flow_classify *);
+voidswitch_port_detach(struct ifnet *);
 
 /* switchctl.c */
 voidswitch_dev_destroy(struct switch_softc *);



snmpd(8): teach how to fork+exec

2016-09-26 Thread Rafael Zalamena
Let's teach snmpd(8) how to fork+exec using the proc.c file from the latest
switchd(8) diff.

Note 1: I just tested the basic operations: startup and teardown.
Note 2: the kill with close will be implemented in another diff with the
ps_pid removal.
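
The part of fork+exec that tends to need explaining is how the child learns
which process it is: the parent re-executes the same binary with extra
arguments. Below is a reduced sketch of that argv construction, modelled on
the "-P" and "-I" handling in the diff. The function child_argv() and its
signature are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "argv[0] -P <title> -I <inst> argv[1..argc-1]" as a new
 * NULL-terminated vector. The strings are shared with the caller and
 * `num` must stay valid while the vector is in use; only the vector
 * itself is allocated and has to be freed by the caller.
 * Returns NULL on error.
 */
char **
child_argv(int argc, char **argv, const char *title, unsigned int inst,
    char *num, size_t numlen)
{
	char	**nargv;
	int	  nargc = 0, i;

	if (argc < 1)
		return (NULL);

	/* argc originals + "-P" title "-I" num + terminating NULL. */
	if ((nargv = calloc((size_t)argc + 5, sizeof(char *))) == NULL)
		return (NULL);

	snprintf(num, numlen, "%u", inst);

	nargv[nargc++] = argv[0];
	nargv[nargc++] = "-P";
	nargv[nargc++] = (char *)title;
	nargv[nargc++] = "-I";
	nargv[nargc++] = num;
	for (i = 1; i < argc; i++)
		nargv[nargc++] = argv[i];
	nargv[nargc] = NULL;

	return (nargv);
}
```

The resulting vector is what would be handed to execvp() in the child after
fork().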

ok?


Index: proc.c
===
RCS file: /home/obsdcvs/src/usr.sbin/snmpd/proc.c,v
retrieving revision 1.20
diff -u -p -r1.20 proc.c
--- proc.c  7 Dec 2015 16:05:56 -   1.20
+++ proc.c  26 Sep 2016 13:39:29 -
@@ -34,8 +34,14 @@
 
 #include "snmpd.h"
 
-voidproc_open(struct privsep *, struct privsep_proc *,
-   struct privsep_proc *, size_t);
+voidproc_exec(struct privsep *, struct privsep_proc *, unsigned int,
+   int, char **);
+voidproc_connectpeer(struct privsep *, enum privsep_procid, int,
+   struct privsep_pipes *);
+voidproc_setup(struct privsep *, struct privsep_proc *, unsigned int);
+voidproc_open(struct privsep *, int, int);
+voidproc_accept(struct privsep *, int, enum privsep_procid,
+   unsigned int);
 voidproc_close(struct privsep *);
 int proc_ispeer(struct privsep_proc *, unsigned int, enum privsep_procid);
 voidproc_shutdown(struct privsep_proc *);
@@ -55,11 +61,262 @@ proc_ispeer(struct privsep_proc *procs, 
return (0);
 }
 
+enum privsep_procid
+proc_getid(struct privsep_proc *procs, unsigned int nproc,
+const char *proc_name)
+{
+   struct privsep_proc *p;
+   unsigned int proc;
+
+   for (proc = 0; proc < nproc; proc++) {
+   p = &procs[proc];
+   if (strcmp(p->p_title, proc_name))
+   continue;
+
+   return (p->p_id);
+   }
+
+   return (PROC_MAX);
+}
+
+void
+proc_exec(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc,
+int argc, char **argv)
+{
+   unsigned int proc, nargc, i, proc_i;
+   char**nargv;
+   struct privsep_proc *p;
+   char num[32];
+   int  fd;
+
+   /* Prepare the new process argv. */
+   nargv = calloc(argc + 5, sizeof(char *));
+   if (nargv == NULL)
+   fatal("%s: calloc", __func__);
+
+   /* Copy call argument first. */
+   nargc = 0;
+   nargv[nargc++] = argv[0];
+
+   /* Set process name argument and save the position. */
+   nargv[nargc++] = "-P";
+   proc_i = nargc;
+   nargc++;
+
+   /* Point process instance arg to stack and copy the original args. */
+   nargv[nargc++] = "-I";
+   nargv[nargc++] = num;
+   for (i = 1; i < (unsigned int) argc; i++)
+   nargv[nargc++] = argv[i];
+
+   nargv[nargc] = NULL;
+
+   for (proc = 0; proc < nproc; proc++) {
+   p = &procs[proc];
+
+   /* Update args with process title. */
+   nargv[proc_i] = (char *)(uintptr_t)p->p_title;
+
+   /* Fire children processes. */
+   for (i = 0; i < ps->ps_instances[p->p_id]; i++) {
+   /* Update the process instance number. */
+   snprintf(num, sizeof(num), "%u", i);
+
+   fd = ps->ps_pipes[p->p_id][i].pp_pipes[PROC_PARENT][0];
+   ps->ps_pipes[p->p_id][i].pp_pipes[PROC_PARENT][0] = -1;
+
+   switch (fork()) {
+   case -1:
+   fatal("%s: fork", __func__);
+   break;
+   case 0:
+   /* Prepare parent socket. */
+   dup2(fd, PROC_PARENT_SOCK_FILENO);
+
+   execvp(argv[0], nargv);
+   fatal("%s: execvp", __func__);
+   break;
+   default:
+   /* Close child end. */
+   close(fd);
+   break;
+   }
+   }
+   }
+   free(nargv);
+}
+
 void
-proc_init(struct privsep *ps, struct privsep_proc *procs, unsigned int nproc)
+proc_connectpeer(struct privsep *ps, enum privsep_procid id, int inst,
+struct privsep_pipes *pp)
 {
-   unsigned int i, j, src, dst;
+   unsigned int i, j;
+   struct privsep_fd    pf;
+
+   for (i = 0; i < PROC_MAX; i++) {
+   /* Parent is already connected with everyone. */
+   if (i == PROC_PARENT)
+   continue;
+
+   for (j = 0; j < ps->ps_instances[i]; j++) {
+   /* Don't send socket to child itself. */
+   if (i == (unsigned int)id &&
+   j == (unsigned int)inst)
+   continue;
+   if (pp->pp_pipes[i][j] == -1)
+   

snmpd(8): fix compilation warnings with DEBUG

2016-09-26 Thread Rafael Zalamena
This diff fixes two compiler warnings when compiling with DEBUG defined.
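
For context, this class of warning comes from passing 64-bit values to a
"%d" conversion. A portable way to avoid it is to cast to long long and use
a matching conversion, as in the sketch below. The function format_idle() is
made up for this example, and it uses "%lld" with explicit casts rather than
the exact format string from the diff.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Format a per-CPU idle message into buf. Both 64-bit values are cast to
 * long long so the conversions match on every platform, whatever the
 * underlying types of int64_t and time_t are.
 * Returns what snprintf() returns.
 */
int
format_idle(char *buf, size_t len, int cpu, int64_t idle_pct, time_t secs)
{
	return (snprintf(buf, len, "cpu%d %lld%% idle in %llds", cpu,
	    (long long)idle_pct, (long long)secs));
}
```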

ok?


Index: timer.c
===
RCS file: /home/obsdcvs/src/usr.sbin/snmpd/timer.c,v
retrieving revision 1.5
diff -u -p -r1.5 timer.c
--- timer.c 27 Aug 2016 01:50:07 -  1.5
+++ timer.c 23 Sep 2016 19:23:22 -
@@ -70,7 +70,7 @@ timer_cpu(int fd, short event, void *arg
(void)percentages(CPUSTATES, cptime2, cp_time[n],
cp_old[n], cp_diff[n]);
 #ifdef DEBUG
-   log_debug("timer_cpu: cpu%d %d%% idle in %ds", n,
+   log_debug("timer_cpu: cpu%d %llu%% idle in %llus", n,
(cptime2[CP_IDLE] > 1000 ?
1000 : (cptime2[CP_IDLE] / 10)), tv.tv_sec);
 #endif



switchd(8): set the pktbuf for packet_in messages

2016-09-23 Thread Rafael Zalamena
The pkt_buf variable is never set for incoming packet_in messages; this
diff fixes that.

ok?


Index: packet.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/packet.c,v
retrieving revision 1.3
diff -u -p -r1.3 packet.c
--- packet.c21 Jul 2016 08:39:23 -  1.3
+++ packet.c21 Sep 2016 11:33:44 -
@@ -63,7 +63,7 @@ packet_input(struct switchd *sc, struct 
return (-1);
 
pkt->pkt_len = ibuf_dataleft(ibuf);
-   if ((pkt->pkt_eh = eh = ibuf_getdata(ibuf, sizeof(*eh))) == NULL) {
+   if ((eh = ibuf_getdata(ibuf, sizeof(*eh))) == NULL) {
log_debug("short packet");
return (-1);
}
@@ -86,6 +86,9 @@ packet_input(struct switchd *sc, struct 
 
if (dstport)
*dstport = dst == NULL ? OFP_PORT_ANY : dst->mac_port;
+
+   pkt->pkt_eh = eh;
+   pkt->pkt_buf = (uint8_t *) eh;
 
return (0);
 }



switchd(8): fix memory leak and loop

2016-09-23 Thread Rafael Zalamena
This diff fixes a memory leak in ofp_read() that happens on every message
and an infinite loop that happens when the remote switch closes the
connection.
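
The loop comes from how read(2) reports a closed peer: it returns 0, and
with an event-driven reader the descriptor stays readable, so ignoring that
return value makes the handler fire again and again. A minimal sketch of
treating 0 as "connection closed" follows; conn_read() and enum conn_state
are made up for this example and it leaves out EINTR/EAGAIN handling.

```c
#include <sys/types.h>
#include <unistd.h>

enum conn_state {
	CONN_DATA,	/* *nread bytes were read into buf */
	CONN_CLOSED,	/* peer closed the connection (read returned 0) */
	CONN_ERROR	/* read failed, errno is set */
};

/* Read once from fd and classify the result. */
enum conn_state
conn_read(int fd, char *buf, size_t len, ssize_t *nread)
{
	ssize_t n;

	if ((n = read(fd, buf, len)) == -1)
		return (CONN_ERROR);
	if (n == 0)
		return (CONN_CLOSED);	/* EOF: caller should tear down */
	*nread = n;
	return (CONN_DATA);
}
```

A caller would close the connection and remove its event on both
CONN_CLOSED and CONN_ERROR, which is what the "goto fail" in the diff does.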

ok?

Index: ofp.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp.c,v
retrieving revision 1.7
diff -u -p -r1.7 ofp.c
--- ofp.c   14 Sep 2016 13:46:51 -  1.7
+++ ofp.c   21 Sep 2016 11:59:46 -
@@ -146,6 +146,7 @@ void
 ofp_close(struct switch_connection *con)
 {
log_info("%s: connection %u closed", __func__, con->con_id);
+   event_del(&con->con_ev);
switch_remove(con->con_sc, con->con_switch);
close(con->con_fd);
TAILQ_REMOVE(_head, con, con_next);
@@ -203,7 +204,7 @@ ofp_read(int fd, short event, void *arg)
if ((len = read(fd, buf, sizeof(buf))) == -1)
goto fail;
if (len == 0)
-   return;
+   goto fail;
 
if ((ibuf = ibuf_new(buf, len)) == NULL)
goto fail;
@@ -236,6 +237,7 @@ ofp_read(int fd, short event, void *arg)
goto fail;
}
 
+   ibuf_release(ibuf);
return;
 
  fail:



switchd(8): more debug messages

2016-09-23 Thread Rafael Zalamena
Enable more debug messages to help develop the flow modification messages.
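
The new cases all follow the same pattern: look up an error code in a table and print its name. A rough sketch of that kind of lookup is below. The struct, the table contents and the function name are hypothetical stand-ins, not switchd's struct constmap or print_map(), and the entries are not real OpenFlow error codes.

```c
#include <stddef.h>

/* Hypothetical table entry; the real struct constmap may differ. */
struct code_name {
	unsigned int	 cn_code;
	const char	*cn_name;
};

/* Illustrative entries only, not the OpenFlow error code values. */
static const struct code_name example_map[] = {
	{ 0, "EXAMPLE_BAD_TYPE" },
	{ 1, "EXAMPLE_BAD_LEN" },
	{ 0, NULL }		/* terminator: cn_name == NULL */
};

/* Return the name for code, or NULL when the table has no entry. */
const char *
lookup_name(unsigned int code, const struct code_name *map)
{
	size_t i;

	for (i = 0; map[i].cn_name != NULL; i++) {
		if (map[i].cn_code == code)
			return (map[i].cn_name);
	}
	return (NULL);
}
```

A NULL result corresponds to the "default: code = NULL" branch in the diff.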

ok?

Index: ofp13.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.5
diff -u -p -r1.5 ofp13.c
--- ofp13.c 21 Jul 2016 14:25:36 -  1.5
+++ ofp13.c 22 Sep 2016 18:35:50 -
@@ -291,6 +347,15 @@ ofp13_validate_error(struct switchd *sc,
case OFP_ERRTYPE_FLOW_MOD_FAILED:
code = print_map(ntohs(err->err_code), ofp_errflowmod_map);
break;
+   case OFP_ERRTYPE_BAD_MATCH:
+   code = print_map(ntohs(err->err_code), ofp_errmatch_map);
+   break;
+   case OFP_ERRTYPE_BAD_INSTRUCTION:
+   code = print_map(ntohs(err->err_code), ofp_errinst_map);
+   break;
+   case OFP_ERRTYPE_BAD_REQUEST:
+   code = print_map(ntohs(err->err_code), ofp_errreq_map);
+   break;
default:
code = NULL;
break;
Index: ofp_map.h
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp_map.h,v
retrieving revision 1.3
diff -u -p -r1.3 ofp_map.h
--- ofp_map.h   20 Jul 2016 19:57:54 -  1.3
+++ ofp_map.h   21 Sep 2016 13:43:30 -
@@ -52,5 +52,8 @@ extern struct constmap ofp_flowcmd_map[]
 extern struct constmap ofp_flowflag_map[];
 extern struct constmap ofp_errtype_map[];
 extern struct constmap ofp_errflowmod_map[];
+extern struct constmap ofp_errmatch_map[];
+extern struct constmap ofp_errinst_map[];
+extern struct constmap ofp_errreq_map[];
 
 #endif /* _SWITCHD_OFP_MAP_H */



tcpdump mpls pseudowire support

2016-07-07 Thread Rafael Zalamena
This diff teaches tcpdump to recognize only those MPLS pseudowires that
use control words. This should not be a problem since control words are
used by default unless configured otherwise (ldpd does this).

It also makes it possible to print encapsulated ethernet packets with the
new ethernet print function ether_tryprint(). Ethernet packets
encapsulated in MPLS pseudowires with no control words won't be shown,
because, as discussed previously, we would have to make a lot of guesses
to find that out.
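
A minimal sketch of the control word check, assuming the generic pseudowire control word layout (top four bits zero, low 16 bits carrying the sequence number). The EX_ masks and the function name are written for this example and are not copied from the diff.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout: the top four bits are zero and the low 16 bits
 * carry the sequence number. Masks are illustrative.
 */
#define EX_CW_ZERO_MASK		0xf0000000U
#define EX_CW_SEQ_MASK		0x0000ffffU

/*
 * Try to consume a control word at *bp. Returns 1 and advances
 * *bp / shrinks *lenp on success, 0 if no control word is present.
 */
int
ex_controlword_parse(const uint8_t **bp, size_t *lenp, uint32_t *seqp)
{
	const uint8_t *p = *bp;
	uint32_t cw;

	if (*lenp < sizeof(cw))
		return (0);

	/* Network byte order, assembled without alignment assumptions. */
	cw = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];

	/* An IPv4 or IPv6 payload starts with nibble 4 or 6, never 0. */
	if (cw & EX_CW_ZERO_MASK)
		return (0);

	*seqp = cw & EX_CW_SEQ_MASK;
	*bp += sizeof(cw);
	*lenp -= sizeof(cw);
	return (1);
}
```

When the function returns 1, the bytes at *bp can be handed to an ethernet printer, which is what the diff does with ether_tryprint().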

ok?

Note:
this is actually an updated diff of an old email I sent last year; the
differences between that and this are:
 * Use 'extern' in interface.h so the prototype looks like the others;
 * Reordered the variable definition sequence;
 * print-mpls.c now uses the kernel headers to obtain the pseudowire
   macro definitions like CW_ZERO_MASK and CW_FRAG_MASK;


Index: interface.h
===
RCS file: /cvs/src/usr.sbin/tcpdump/interface.h,v
retrieving revision 1.66
diff -u -p -r1.66 interface.h
--- interface.h 15 Nov 2015 20:35:36 -  1.66
+++ interface.h 7 Jul 2016 17:27:29 -
@@ -205,6 +205,7 @@ extern void pfsync_if_print(u_char *, co
 extern void pfsync_ip_print(const u_char *, u_int, const u_char *);
 extern void ether_if_print(u_char *, const struct pcap_pkthdr *,
const u_char *);
+extern void ether_tryprint(const u_char *, u_int, int);
 extern void fddi_if_print(u_char *, const struct pcap_pkthdr *, const u_char 
*);
 extern void ppp_ether_if_print(u_char *, const struct pcap_pkthdr *,
const u_char *);
Index: print-ether.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-ether.c,v
retrieving revision 1.30
diff -u -p -r1.30 print-ether.c
--- print-ether.c   16 Nov 2015 00:16:39 -  1.30
+++ print-ether.c   7 Jul 2016 17:27:29 -
@@ -89,29 +89,34 @@ u_short extracted_ethertype;
 void
 ether_if_print(u_char *user, const struct pcap_pkthdr *h, const u_char *p)
 {
-   u_int caplen = h->caplen;
-   u_int length = h->len;
-   struct ether_header *ep;
-   u_short ether_type;
-
ts_print(&h->ts);
 
-   if (caplen < sizeof(struct ether_header)) {
-   printf("[|ether]");
-   goto out;
-   }
-
/*
 * Some printers want to get back at the ethernet addresses,
 * and/or check that they're not walking off the end of the packet.
 * Rather than pass them all the way down, we set these globals.
 */
-   packetp = p;
-   snapend = p + caplen;
+   snapend = p + h->caplen;
+
+   ether_tryprint(p, h->len, 1);
+}
+
+void
+ether_tryprint(const u_char *p, u_int length, int first_header)
+{
+   struct ether_header *ep;
+   u_int caplen = snapend - p;
+   u_short ether_type;
+
+   if (caplen < sizeof(struct ether_header)) {
+   printf("[|ether]");
+   goto out;
+   }
 
if (eflag)
ether_print(p, length);
 
+   packetp = p;
length -= sizeof(struct ether_header);
caplen -= sizeof(struct ether_header);
ep = (struct ether_header *)p;
@@ -152,14 +157,15 @@ ether_if_print(u_char *user, const struc
default_print(p, caplen);
}
}
-   if (xflag) {
+   if (xflag && first_header) {
if (eflag)
default_print(packetp, snapend - packetp);
else
default_print(p, caplen);
}
  out:
-   putchar('\n');
+   if (first_header)
+   putchar('\n');
 }
 
 /*
Index: print-mpls.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-mpls.c,v
retrieving revision 1.2
diff -u -p -r1.2 print-mpls.c
--- print-mpls.c30 Jun 2010 19:01:06 -  1.2
+++ print-mpls.c7 Jul 2016 17:27:29 -
@@ -26,15 +26,24 @@
  * POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include 
+#include 
+#include 
+
 #include 
 
 #include "interface.h"
 #include "extract.h"   /* must come after interface.h */
 
+#define CW_SEQUENCE_MASK   (0xU)
+
+int controlword_tryprint(const u_char **, u_int *);
+
 void
 mpls_print(const u_char *bp, u_int len)
 {
u_int32_t tag, label, exp, bottom, ttl;
+   int has_cw;
 
  again:
if (bp + sizeof(tag) > snapend)
@@ -56,6 +65,9 @@ mpls_print(const u_char *bp, u_int len)
if (!bottom)
goto again;
 
+   /* Handle pseudowire control word. */
+   has_cw = controlword_tryprint(&bp, &len);
+
/*
 * guessing the underlying protocol is about all we can do if
 * it's not explicitly defined.
@@ -107,7 +119,34 @@ mpls_print(const u_char *bp, u_int len)
}
}
 
+   if (has_cw)
+   ether_tryprint(bp, len, 0);
+
return;
 trunc:
printf("[|mpls]");
+}

Re: tcpdump mpls pseudowire support

2015-09-26 Thread Rafael Zalamena
On Fri, Jul 17, 2015 at 03:24:17PM -0300, Rafael Zalamena wrote:
> This diff adds support for detection of pseudowires inside of MPLS tagged
> packets. Basically it teaches MPLS to look for ethernet headers when there
> is no sign of IP headers.
> 
> --- SNIPPED OLD DIFF ---

This is an updated diff to teach tcpdump mpls parser to identify VPLS
packets and display the ethernet payload. This time it doesn't do any
clever tricks: it will only show VPLS encapsulated ethernet packet if
there is a control word (which is used by default in ldpd(8)).
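
Before the control word can be checked, the parser has to reach the bottom of the label stack (the "if (!bottom) goto again" loop in mpls_print()). A minimal sketch of that walk follows, using the standard 32-bit MPLS shim layout (20-bit label, 3-bit exp, 1-bit bottom-of-stack, 8-bit TTL). The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_mpls_entry {
	uint32_t label;		/* 20 bits */
	uint8_t	 exp;		/* 3 bits */
	uint8_t	 bottom;	/* 1 bit: last entry in the stack */
	uint8_t	 ttl;		/* 8 bits */
};

/*
 * Walk the label stack at buf. Stores the last decoded entry in *last
 * and returns the number of bytes consumed up to and including the
 * bottom-of-stack entry, or 0 if the buffer ends before the bottom.
 */
size_t
ex_mpls_walk(const uint8_t *buf, size_t len, struct ex_mpls_entry *last)
{
	size_t off = 0;
	uint32_t tag;

	while (len - off >= 4) {
		tag = ((uint32_t)buf[off] << 24) |
		    ((uint32_t)buf[off + 1] << 16) |
		    ((uint32_t)buf[off + 2] << 8) |
		    (uint32_t)buf[off + 3];
		last->label = tag >> 12;
		last->exp = (tag >> 9) & 0x7;
		last->bottom = (tag >> 8) & 0x1;
		last->ttl = tag & 0xff;
		off += 4;
		if (last->bottom)
			return (off);
	}
	return (0);
}
```

The payload, and therefore a possible control word, starts at buf plus the returned offset.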

ok?

Index: interface.h
===
RCS file: /cvs/src/usr.sbin/tcpdump/interface.h,v
retrieving revision 1.65
diff -u -p -r1.65 interface.h
--- interface.h 5 Apr 2015 17:02:57 -   1.65
+++ interface.h 26 Sep 2015 21:57:42 -
@@ -205,6 +205,7 @@ extern void pfsync_if_print(u_char *, co
 extern void pfsync_ip_print(const u_char *, u_int, const u_char *);
 extern void ether_if_print(u_char *, const struct pcap_pkthdr *,
const u_char *);
+void ether_tryprint(const u_char *, u_int, int);
 extern void fddi_if_print(u_char *, const struct pcap_pkthdr *, const u_char 
*);
 extern void ppp_ether_if_print(u_char *, const struct pcap_pkthdr *,
const u_char *);
Index: print-ether.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-ether.c,v
retrieving revision 1.29
diff -u -p -r1.29 print-ether.c
--- print-ether.c   16 Jan 2015 06:40:21 -  1.29
+++ print-ether.c   26 Sep 2015 21:58:24 -
@@ -89,29 +89,34 @@ u_short extracted_ethertype;
 void
 ether_if_print(u_char *user, const struct pcap_pkthdr *h, const u_char *p)
 {
-   u_int caplen = h->caplen;
-   u_int length = h->len;
-   struct ether_header *ep;
-   u_short ether_type;
-
ts_print(&h->ts);
 
-   if (caplen < sizeof(struct ether_header)) {
-   printf("[|ether]");
-   goto out;
-   }
-
/*
 * Some printers want to get back at the ethernet addresses,
 * and/or check that they're not walking off the end of the packet.
 * Rather than pass them all the way down, we set these globals.
 */
-   packetp = p;
-   snapend = p + caplen;
+   snapend = p + h->caplen;
+
+   ether_tryprint(p, h->len, 1);
+}
+
+void
+ether_tryprint(const u_char *p, u_int length, int first_header)
+{
+   u_int caplen = snapend - p;
+   struct ether_header *ep;
+   u_short ether_type;
+
+   if (caplen < sizeof(struct ether_header)) {
+   printf("[|ether]");
+   goto out;
+   }
 
if (eflag)
ether_print(p, length);
 
+   packetp = p;
length -= sizeof(struct ether_header);
caplen -= sizeof(struct ether_header);
ep = (struct ether_header *)p;
@@ -152,14 +157,15 @@ ether_if_print(u_char *user, const struc
default_print(p, caplen);
}
}
-   if (xflag) {
+   if (xflag && first_header) {
if (eflag)
default_print(packetp, snapend - packetp);
else
default_print(p, caplen);
}
  out:
-   putchar('\n');
+   if (first_header)
+   putchar('\n');
 }
 
 /*
Index: print-mpls.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-mpls.c,v
retrieving revision 1.2
diff -u -p -r1.2 print-mpls.c
--- print-mpls.c30 Jun 2010 19:01:06 -  1.2
+++ print-mpls.c26 Sep 2015 21:57:53 -
@@ -31,10 +31,17 @@
 #include "interface.h"
 #include "extract.h"   /* must come after interface.h */
 
+#define CW_ZERO_MASK   (0xf000U)
+#define CW_FRAG_MASK   (0x0fffU)
+#define CW_SEQUENCE_MASK   (0xU)
+
+int controlword_tryprint(const u_char **, u_int *);
+
 void
 mpls_print(const u_char *bp, u_int len)
 {
u_int32_t tag, label, exp, bottom, ttl;
+   int has_cw;
 
  again:
if (bp + sizeof(tag) > snapend)
@@ -56,6 +63,9 @@ mpls_print(const u_char *bp, u_int len)
if (!bottom)
goto again;
 
+   /* Handle pseudowire control word. */
+   has_cw = controlword_tryprint(&bp, &len);
+
/*
 * guessing the underlying protocol is about all we can do if
 * it's not explicitly defined.
@@ -107,7 +117,34 @@ mpls_print(const u_char *bp, u_int len)
}
}
 
+   if (has_cw)
+   ether_tryprint(bp, len, 0);
+
return;
 trunc:
printf("[|mpls]");
+}
+
+/* Print control word if any and returns 1 on success. */
+int
+controlword_tryprint(const u_char **bp, u_int *lenp)
+{
+   u_int32_t cw, frag, seq;
+
+   if (*lenp < 4)
+   retu

make robots(6) use ppoll() and timespec

2015-08-24 Thread Rafael Zalamena
Following the worm(6) bug fix and improvement, make robots(6) do it as
well.

Changes:
 * Replace timeval with timespec structs;
 * Use clock_gettime(CLOCK_UPTIME) instead of gettimeofday();
 * Use timespec*() instead of manual math operations;
 * Use ppoll() instead of poll() + math operations;

No functional change, except that with this diff robots(6) is no longer
affected by clock changes and suspend.
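
A portable sketch of the timespec bookkeeping described above. It assumes the samples come from a monotonic clock and that the remaining time is what gets handed to ppoll(2) as its timeout. The ex_ helpers are written out here because the timespecadd()/timespecsub()/timespeccmp() macros are BSD-specific; all names are hypothetical.

```c
#include <time.h>

#define EX_NSEC_PER_SEC	1000000000L

/* Return a + b with the nanosecond field normalized. */
struct timespec
ex_ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= EX_NSEC_PER_SEC) {
		r.tv_sec++;
		r.tv_nsec -= EX_NSEC_PER_SEC;
	}
	return (r);
}

/* Return a - b with the nanosecond field normalized (a must be >= b). */
struct timespec
ex_ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += EX_NSEC_PER_SEC;
	}
	return (r);
}

/* Return <0, 0 or >0 when a is before, equal to or after b. */
int
ex_ts_cmp(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return (a.tv_sec < b.tv_sec ? -1 : 1);
	if (a.tv_nsec != b.tv_nsec)
		return (a.tv_nsec < b.tv_nsec ? -1 : 1);
	return (0);
}

/*
 * Compute how much of the wait interval is left at time now.
 * Returns 1 and fills *left when time remains, 0 when the
 * deadline (start + interval) has been reached.
 */
int
ex_time_left(struct timespec start, struct timespec interval,
    struct timespec now, struct timespec *left)
{
	struct timespec deadline = ex_ts_add(start, interval);

	if (ex_ts_cmp(deadline, now) <= 0)
		return (0);
	*left = ex_ts_sub(deadline, now);
	return (1);
}
```

Because only differences between monotonic samples are used, a wall-clock step cannot lengthen or shorten the wait.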

ok?

Index: extern.c
===
RCS file: /cvs/src/games/robots/extern.c,v
retrieving revision 1.6
diff -u -p -r1.6 extern.c
--- extern.c3 Nov 2014 22:14:54 -   1.6
+++ extern.c25 Aug 2015 01:22:36 -
@@ -62,7 +62,7 @@ int   Score;  /* Current score */
 intStart_level = 1;/* Level on which to start */
 intWait_bonus; /* bonus for waiting */
 
-struct timeval tv; /* how long to wait; could be an option */
+struct timespec tv; /* how long to wait; could be an option */
 
 COORD  Max;/* Max area robots take up */
 COORD  Min;/* Min area robots take up */
Index: main.c
===
RCS file: /cvs/src/games/robots/main.c,v
retrieving revision 1.19
diff -u -p -r1.19 main.c
--- main.c  3 Nov 2014 22:14:54 -   1.19
+++ main.c  25 Aug 2015 01:22:53 -
@@ -70,7 +70,6 @@ main(int ac, char *av[])
Real_time = TRUE;
/* Could be a command-line option */
tv.tv_sec = 3;
-   tv.tv_usec = 0;
break;
case 'a':
Start_level = 4;
Index: move.c
===
RCS file: /cvs/src/games/robots/move.c,v
retrieving revision 1.10
diff -u -p -r1.10 move.c
--- move.c  3 Nov 2014 22:14:54 -   1.10
+++ move.c  25 Aug 2015 02:01:19 -
@@ -43,8 +43,7 @@ get_move(void)
 {
int c;
int retval;
-   struct timeval t, tod;
-   struct timezone tz;
+   struct timespec t, tn;
 #ifdef FANCY
int lastmove;
 #endif
@@ -61,9 +60,8 @@ get_move(void)
}
 #endif
if (Real_time) {
-   t.tv_sec = tv.tv_sec;
-   t.tv_usec = tv.tv_usec;
-   (void)gettimeofday(&tod, &tz);
+   t = tv;
+   clock_gettime(CLOCK_UPTIME, &tn);
}
for (;;) {
if (Teleport && must_telep())
@@ -94,8 +92,7 @@ over:
 
pfd[0].fd = STDIN_FILENO;
pfd[0].events = POLLIN;
-   retval = poll(pfd, 1,
-   t.tv_sec * 1000 + t.tv_usec / 1000);
+   retval = ppoll(pfd, 1, &t, NULL);
if (retval > 0)
c = getchar();
else/* Don't move if timed out or error */
@@ -203,15 +200,16 @@ teleport:
break;
}
if (Real_time) {
-   (void)gettimeofday(&t, &tz);
-   t.tv_sec = tod.tv_sec + tv.tv_sec - t.tv_sec;
-   t.tv_usec = tod.tv_usec + tv.tv_usec - t.tv_usec;
-   if (t.tv_usec < 0) {
-   t.tv_sec--;
-   t.tv_usec += 1000000;   /* Now it must be > 0 */
-   }
-   if (t.tv_sec < 0)
+   /* Update current time. */
+   clock_gettime(CLOCK_UPTIME, &t);
+
+   /* Check whether tv time has passed. */
+   timespecadd(&tn, &tv, &tn);
+   if (timespeccmp(&tn, &t, <))
goto ret;
+
+   /* Keep the difference otherwise. */
+   timespecsub(&tn, &t, &t);
}
}
 ret:
Index: robots.h
===
RCS file: /cvs/src/games/robots/robots.h,v
retrieving revision 1.8
diff -u -p -r1.8 robots.h
--- robots.h16 Nov 2014 04:49:48 -  1.8
+++ robots.h25 Aug 2015 02:04:58 -
@@ -107,7 +107,7 @@ extern char Cnt_move, Field[Y_FIELDSIZE]
 extern int Count, Level, Num_robots, Num_scores, Score,
Start_level, Wait_bonus;
 
-extern struct timeval  tv;
+extern struct timespec tv;
 
 extern COORD   Max, Min, My_pos, Robots[];
 



worm(6) remove cheating bug

2015-08-23 Thread Rafael Zalamena
I just fixed a bug which allowed people to cheat in worm(6). This bug was
found by deraadt@ while reviewing the tech@ mail thread
'Fwd: worm.c removing unused variables'.

To reproduce the bug, simply hold the spacebar and your worm won't move.

Highlights:
 * Use the unused time variables to record how long it has been since the
   last valid key press;
 * Changed the process() function to return whether the key pressed was
   valid or not;
 * Use clock_gettime(CLOCK_UPTIME) instead of gettimeofday() to provide
   a reliable clock source even across suspend/wake up;
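
The idea behind the fix can be sketched as a deadline that only a valid key press is allowed to push back. This is a simplified model with hypothetical names, not the code from worm.c.

```c
#include <time.h>

/* Hypothetical state for the sketch: when the worm must move next. */
struct ex_move_timer {
	struct timespec deadline;
};

/* Return 1 if the deadline has been reached at time now. */
int
ex_timer_expired(const struct ex_move_timer *mt, struct timespec now)
{
	if (now.tv_sec != mt->deadline.tv_sec)
		return (now.tv_sec > mt->deadline.tv_sec);
	return (now.tv_nsec >= mt->deadline.tv_nsec);
}

/* Re-arm the timer one second after now. */
void
ex_timer_rearm(struct ex_move_timer *mt, struct timespec now)
{
	mt->deadline = now;
	mt->deadline.tv_sec += 1;
}

/*
 * Handle a key press. Only a key that actually moved the worm
 * (valid != 0) pushes the deadline back; an ignored key leaves it
 * untouched, so holding such a key cannot postpone the forced move.
 */
void
ex_timer_keypress(struct ex_move_timer *mt, struct timespec now, int valid)
{
	if (valid)
		ex_timer_rearm(mt, now);
}
```

In the diff, the return value of process() plays the role of the valid argument.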

ok?

Index: worm.c
===
RCS file: /cvs/src/games/worm/worm.c,v
retrieving revision 1.30
diff -u -p -r1.30 worm.c
--- worm.c  22 Aug 2015 14:47:41 -  1.30
+++ worm.c  23 Aug 2015 20:42:42 -
@@ -78,7 +78,7 @@ void  leave(int);
 void   life(void);
 void   newpos(struct body *);
 struct body*newlink(void);
-void   process(int);
+intprocess(int);
 void   prize(void);
 intrnd(int);
 void   setup(void);
@@ -88,10 +88,11 @@ int
 main(int argc, char **argv)
 {
int retval;
-   struct timeval t, tod;
-   struct timezone tz;
struct pollfd pfd[1];
const char *errstr;
+   struct timespec t, tn, tdiff;
+
+   memset(&t, 0, sizeof(t));
 
setbuf(stdout, outbuf);
signal(SIGINT, leave);
@@ -154,18 +155,34 @@ main(int argc, char **argv)
running--;
process(lastch);
} else {
-   /* fflush(stdout); */
-   /* Delay could be a command line option */
-   t.tv_sec = 1;
-   t.tv_usec = 0;
-   (void)gettimeofday(&tod, &tz);
+   /* Check for user input timeout. */
+   clock_gettime(CLOCK_UPTIME, &tn);
+   if (timespeccmp(&t, &tn, <=)) {
+   t = tn;
+   t.tv_sec += 1;
+
+   process(lastch);
+   continue;
+   }
+
+   /* Prepare next read */
pfd[0].fd = STDIN_FILENO;
pfd[0].events = POLLIN;
-   retval = poll(pfd, 1, t.tv_sec * 1000 + t.tv_usec / 1000);
-   if (retval > 0)
-   process(getch());
-   else
-   process(lastch);
+   timespecsub(&t, &tn, &tdiff);
+   retval = poll(pfd, 1, (tdiff.tv_sec * 1000) +
+   (tdiff.tv_nsec / 1000000));
+
+   /* Nothing to do if timed out or signal. */
+   if (retval <= 0)
+   continue;
+
+   /* Only update timer if valid key was pressed. */
+   if (process(getch()) == 0)
+   continue;
+
+   /* Update using clock_gettime(), tn is too old now. */
+   clock_gettime(CLOCK_UPTIME, &t);
+   t.tv_sec += 1;
}
}
 }
@@ -245,7 +262,7 @@ prize(void)
wrefresh(tv);
 }
 
-void
+int
 process(int ch)
 {
int x,y;
@@ -300,21 +317,21 @@ process(int ch)
break;
case '\f':
setup();
-   return;
+   return (0);
case CNTRL('Z'):
suspend(0);
-   return;
+   return (0);
case CNTRL('C'):
crash();
-   return;
+   return (0);
case CNTRL('D'):
crash();
-   return;
+   return (0);
case ERR:
leave(0);
-   return;
+   return (0);
default:
-   return;
+   return (0);
}
lastch = ch;
if (growing == 0) {
@@ -352,6 +369,7 @@ process(int ch)
wmove(tv, head->y, head->x);
wrefresh(tv);
}
+   return (1);
 }
 
 struct body *



Re: worm(6) remove cheating bug

2015-08-23 Thread Rafael Zalamena
On Sun, Aug 23, 2015 at 06:07:46PM -0300, Rafael Zalamena wrote:
> I just fixed a bug which allowed people to cheat in worm(6). This bug was
> found out by deraadt@ when peer reviewing the mail thread in tech@
> 'Fwd: worm.c removing unused variables'.
> 
> To reproduce the bug simply hold spacebar and your worm won't move.
> 
> Highlights:
>  * Use the unused time variables to record how long has been since the
>    last valid key press;
>  * Changed the process() function to return whether the key pressed was
>    valid or not;
>  * Use clock_gettime(CLOCK_UPTIME) instead of gettimeofday() to provide
>    reliable clock source even on suspend/wake up;
> 

More feedback from deraadt@ and guenter@.

Changes:
 * Replaced memset() with timespecclear();
 * Fixed the time out comment before clock_gettime();
 * Replaced poll() with ppoll() to use timespec directly instead of math
   operations;

ok?

Index: worm.c
===
RCS file: /cvs/src/games/worm/worm.c,v
retrieving revision 1.30
diff -u -p -r1.30 worm.c
--- worm.c  22 Aug 2015 14:47:41 -  1.30
+++ worm.c  24 Aug 2015 02:51:09 -
@@ -78,7 +78,7 @@ void  leave(int);
 void   life(void);
 void   newpos(struct body *);
 struct body*newlink(void);
-void   process(int);
+intprocess(int);
 void   prize(void);
 intrnd(int);
 void   setup(void);
@@ -88,10 +88,11 @@ int
 main(int argc, char **argv)
 {
int retval;
-   struct timeval t, tod;
-   struct timezone tz;
struct pollfd pfd[1];
const char *errstr;
+   struct timespec t, tn, tdiff;
+
+   timespecclear(&t);
 
setbuf(stdout, outbuf);
signal(SIGINT, leave);
@@ -154,18 +155,33 @@ main(int argc, char **argv)
running--;
process(lastch);
} else {
-   /* fflush(stdout); */
-   /* Delay could be a command line option */
-   t.tv_sec = 1;
-   t.tv_usec = 0;
-   (void)gettimeofday(&tod, &tz);
+   /* Check for timeout. */
+   clock_gettime(CLOCK_UPTIME, &tn);
+   if (timespeccmp(&t, &tn, <=)) {
+   t = tn;
+   t.tv_sec += 1;
+
+   process(lastch);
+   continue;
+   }
+
+   /* Prepare next read */
pfd[0].fd = STDIN_FILENO;
pfd[0].events = POLLIN;
-   retval = poll(pfd, 1, t.tv_sec * 1000 + t.tv_usec / 1000);
-   if (retval > 0)
-   process(getch());
-   else
-   process(lastch);
+   timespecsub(&t, &tn, &tdiff);
+   retval = ppoll(pfd, 1, &tdiff, NULL);
+
+   /* Nothing to do if timed out or signal. */
+   if (retval <= 0)
+   continue;
+
+   /* Only update timer if valid key was pressed. */
+   if (process(getch()) == 0)
+   continue;
+
+   /* Update using clock_gettime(), tn is too old now. */
+   clock_gettime(CLOCK_UPTIME, &t);
+   t.tv_sec += 1;
}
}
 }
@@ -245,7 +261,7 @@ prize(void)
wrefresh(tv);
 }
 
-void
+int
 process(int ch)
 {
int x,y;
@@ -300,21 +316,21 @@ process(int ch)
break;
case '\f':
setup();
-   return;
+   return (0);
case CNTRL('Z'):
suspend(0);
-   return;
+   return (0);
case CNTRL('C'):
crash();
-   return;
+   return (0);
case CNTRL('D'):
crash();
-   return;
+   return (0);
case ERR:
leave(0);
-   return;
+   return (0);
default:
-   return;
+   return (0);
}
lastch = ch;
if (growing == 0) {
@@ -352,6 +368,7 @@ process(int ch)
wmove(tv, head->y, head->x);
wrefresh(tv);
}
+   return (1);
 }
 
 struct body *



tcpdump mpls pseudowire support

2015-07-17 Thread Rafael Zalamena
This diff adds support for detection of pseudowires inside of MPLS tagged
packets. Basically it teaches MPLS to look for ethernet headers when there
is no sign of IP headers.
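
The guessing logic this diff adds to mpls_print() can be sketched as a small classifier on the first payload byte. The enum and function name are hypothetical; the nibble tests mirror the switch statement in the diff.

```c
#include <stddef.h>
#include <stdint.h>

enum ex_payload {
	EX_PAYLOAD_NONE,
	EX_PAYLOAD_IPV4,
	EX_PAYLOAD_IPV6,
	EX_PAYLOAD_ETHER
};

/* Guess what follows the MPLS label stack from its first byte. */
enum ex_payload
ex_guess_payload(const uint8_t *bp, size_t len)
{
	if (len < 1)
		return (EX_PAYLOAD_NONE);

	switch (bp[0] & 0xf0) {
	case 0x40:
		/* A valid IPv4 header length is at least 5 32-bit words. */
		if ((bp[0] & 0x0f) < 5)
			return (EX_PAYLOAD_ETHER);
		return (EX_PAYLOAD_IPV4);
	case 0x60:
		return (EX_PAYLOAD_IPV6);
	default:
		return (EX_PAYLOAD_ETHER);
	}
}
```

The heuristic is fragile: an ethernet frame whose destination address starts with 0x45, for example, is classified as IPv4.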

Index: interface.h
===
RCS file: /cvs/src/usr.sbin/tcpdump/interface.h,v
retrieving revision 1.65
diff -u -p -r1.65 interface.h
--- interface.h 5 Apr 2015 17:02:57 -   1.65
+++ interface.h 17 Jul 2015 18:16:43 -
@@ -205,6 +205,7 @@ extern void pfsync_if_print(u_char *, co
 extern void pfsync_ip_print(const u_char *, u_int, const u_char *);
 extern void ether_if_print(u_char *, const struct pcap_pkthdr *,
const u_char *);
+void ether_tryprint(const u_char *, u_int);
 extern void fddi_if_print(u_char *, const struct pcap_pkthdr *, const u_char 
*);
 extern void ppp_ether_if_print(u_char *, const struct pcap_pkthdr *,
const u_char *);
Index: print-ether.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-ether.c,v
retrieving revision 1.29
diff -u -p -r1.29 print-ether.c
--- print-ether.c   16 Jan 2015 06:40:21 -  1.29
+++ print-ether.c   17 Jul 2015 18:16:43 -
@@ -89,29 +89,34 @@ u_short extracted_ethertype;
 void
 ether_if_print(u_char *user, const struct pcap_pkthdr *h, const u_char *p)
 {
-   u_int caplen = h->caplen;
-   u_int length = h->len;
-   struct ether_header *ep;
-   u_short ether_type;
-
ts_print(&h->ts);
 
-   if (caplen < sizeof(struct ether_header)) {
-   printf("[|ether]");
-   goto out;
-   }
-
/*
 * Some printers want to get back at the ethernet addresses,
 * and/or check that they're not walking off the end of the packet.
 * Rather than pass them all the way down, we set these globals.
 */
-   packetp = p;
-   snapend = p + caplen;
+   snapend = p + h->caplen;
+
+   ether_tryprint(p, h->len);
+}
+
+void
+ether_tryprint(const u_char *p, u_int length)
+{
+   u_int caplen = snapend - p;
+   struct ether_header *ep;
+   u_short ether_type;
+
+   if (caplen < sizeof(struct ether_header)) {
+   printf("[|ether]");
+   goto out;
+   }
 
if (eflag)
ether_print(p, length);
 
+   packetp = p;
length -= sizeof(struct ether_header);
caplen -= sizeof(struct ether_header);
ep = (struct ether_header *)p;
Index: print-mpls.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-mpls.c,v
retrieving revision 1.2
diff -u -p -r1.2 print-mpls.c
--- print-mpls.c30 Jun 2010 19:01:06 -  1.2
+++ print-mpls.c17 Jul 2015 18:16:43 -
@@ -31,6 +31,12 @@
 #include "interface.h"
 #include "extract.h"   /* must come after interface.h */
 
+#define CW_ZERO_MASK   (0xf000U)
+#define CW_FRAG_MASK   (0x0fffU)
+#define CW_SEQUENCE_MASK   (0xU)
+
+void controlword_print(const u_char **, u_int *);
+
 void
 mpls_print(const u_char *bp, u_int len)
 {
@@ -56,6 +62,9 @@ mpls_print(const u_char *bp, u_int len)
if (!bottom)
goto again;
 
+   /* Handle pseudowire control word if any. */
+   controlword_print(&bp, &len);
+
/*
 * guessing the underlying protocol is about all we can do if
 * it's not explicitly defined.
@@ -99,15 +108,48 @@ mpls_print(const u_char *bp, u_int len)
 
switch (bp[0] & 0xf0) {
case 0x40:
+   /*
+* IPv4 second nibble is the header length and its
+* value must be at least 5 bytes long.
+*/
+   if ((bp[0] & 0x0f) < 5) {
+   ether_tryprint(bp, len);
+   break;
+   }
+
ip_print(bp, len);
break;
case 0x60:
ip6_print(bp, len);
break;
+   default:
+   ether_tryprint(bp, len);
+   break;
}
}
 
return;
 trunc:
printf("[|mpls]");
+}
+
+void
+controlword_print(const u_char **bp, u_int *lenp)
+{
+   u_int32_t cw, frag, seq;
+
+   if (*lenp < 4)
+   return;
+
+   cw = EXTRACT_32BITS(*bp);
+   if (cw & CW_ZERO_MASK)
+   return;
+
+   *bp += sizeof(cw);
+   *lenp += sizeof(cw);
+
+   frag = (cw & CW_FRAG_MASK) >> 16;
+   seq = cw & CW_SEQUENCE_MASK;
+
+   printf("CW(frag %u, sequence %u) ", frag, seq);
 }



indent ifconfig(8) bridge rules output

2015-07-17 Thread Rafael Zalamena
This diff indents the output of bridge rules in ifconfig or ifconfig bridgeX.

Old output:
$ ifconfig bridge0
bridge0: flags=41<UP,RUNNING>
	groups: bridge
	priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp
	designated: id 00:00:00:00:00:00 priority 0
	tun1 flags=3<LEARNING,DISCOVER>
		port 148 ifpriority 0 ifcost 0
block in on tun1 src 00:11:22:33:44:55
block in on tun1 src 00:11:22:33:44:56
block out on tun1 src 00:11:22:33:44:56
	Addresses (max cache: 100, timeout: 240):

New output:
$ ifconfig bridge0
bridge0: flags=41<UP,RUNNING>
	groups: bridge
	priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp
	designated: id 00:00:00:00:00:00 priority 0
	tun1 flags=3<LEARNING,DISCOVER>
		port 148 ifpriority 0 ifcost 0
	block in on tun1 src 00:11:22:33:44:55
	block in on tun1 src 00:11:22:33:44:56
	block out on tun1 src 00:11:22:33:44:56
	Addresses (max cache: 100, timeout: 240):

Also I kept the 'ifconfig bridgeX rules interface' non-indented:
$ ifconfig bridge0 rules tun1
block in on tun1 src 00:11:22:33:44:55
block in on tun1 src 00:11:22:33:44:56
block out on tun1 src 00:11:22:33:44:56

Index: sbin/ifconfig//brconfig.c
===
RCS file: /cvs/src/sbin/ifconfig/brconfig.c,v
retrieving revision 1.8
diff -u -p -r1.8 brconfig.c
--- sbin/ifconfig//brconfig.c   13 Oct 2013 12:18:18 -  1.8
+++ sbin/ifconfig//brconfig.c   18 Jul 2015 04:41:02 -
@@ -322,7 +322,7 @@ bridge_list(char *delim)
stpstates[reqp->ifbr_state],
stproles[reqp->ifbr_role]);
printf("\n");
-   bridge_rules(buf, 0);
+   bridge_rules(buf, 1);
}
free(bifc.ifbic_buf);
 }
@@ -742,7 +742,7 @@ bridge_flushrule(const char *ifname, int
 }
 
 void
-bridge_rules(const char *ifname, int d)
+bridge_rules(const char *ifname, int usetab)
 {
char *inbuf = NULL, *inb;
struct ifbrlconf ifc;
@@ -766,6 +766,10 @@ bridge_rules(const char *ifname, int d)
ifrp = ifc.ifbrl_req;
for (i = 0; i < ifc.ifbrl_len; i += sizeof(*ifrp)) {
ifrp = (struct ifbrlreq *)((caddr_t)ifc.ifbrl_req + i);
+
+   if (usetab)
+   printf("\t");
+
bridge_showrule(ifrp);
}
 }



Re: mpe(4) broken on -current

2015-03-21 Thread Rafael Zalamena
On Thu, Mar 19, 2015 at 11:50 PM, Rafael Zalamena rzalam...@gmail.com wrote:
 On Thu, Mar 19, 2015 at 8:32 AM, Martin Pieuchot m...@openbsd.org wrote:
 On 18/03/15(Wed) 22:58, Rafael Zalamena wrote:
 mpe(4) is not installing routes / label in the interface in -current.
 
 Snippet:
 # ifconfig mpe0 mplslabel 100
 ifconfig: SIOCSETLABEL: Network is unreachable
 
 Quickly looking at the code I found out that since the old MPLS route
 installer function (mpe_newlabel) doesn't include an ifa pointer later
 on rt_getifa() will fail and return ENETUNREACH.
 
 Trace:
 mpe_newlabel - rtrequest1 - switch (RTM_ADD) - rt_getifa
 
 I tried moving it to rt_ifa_add() using my old VPLS datapath diffs,
 but there are some other problems like panic()s or NULL MPLS routes
 installed for mpeX that might be happening because of my poor
 understanding of the new network stack design (no more
 ifp-if_lladdr).
 
 So mpe(4) was also abusing if_lladdr?
 
 (this commit: 
 https://github.com/rzalamena/vpls-src/commit/675216b75b665f42b06bd2b0b18cbd0deab84f57)
 
 This is good.  You can initialize sc_ifa in mpe_clone_create(), look at
 how enc(4) does it.
 
 --- SNIPPED OLD CHAT ---
 
 Thanks, I'll send a diff sometime soon if you don't do it first.

Here is a diff to fix the mpe(4) route installation that wasn't working.
Code changes:
* Add sc_ifa field and change sc_shim to sc_smpls (struct shim_hdr ->
  struct sockaddr_smpls) in mpe_softc;
  sc_ifa will be used by rt_ifa_* functions to install routes and sc_smpls
  was changed to simplify route install.
* Removed old mpe_newlabel() function and replaced it with rt_ifa_*() calls;
* Introduced code to deal with MPLS routes in rt_ifa_add() and rt_ifa_del();
  rt_ifa_add() and rt_ifa_del() should work on rdomain 0 when dealing with MPLS.

Index: sys/net/if_mpe.c
===
RCS file: /cvs/src/sys/net/if_mpe.c,v
retrieving revision 1.41
diff -u -p -r1.41 if_mpe.c
--- sys/net/if_mpe.c22 Dec 2014 11:05:53 -  1.41
+++ sys/net/if_mpe.c21 Mar 2015 19:00:13 -
@@ -57,7 +57,6 @@ int   mpeioctl(struct ifnet *, u_long, cad
 void   mpestart(struct ifnet *);
 intmpe_clone_create(struct if_clone *, int);
 intmpe_clone_destroy(struct ifnet *);
-intmpe_newlabel(struct ifnet *, int, struct shim_hdr *);
 
 LIST_HEAD(, mpe_softc) mpeif_list;
 struct if_clonempe_cloner =
@@ -85,7 +84,6 @@ mpe_clone_create(struct if_clone *ifc, i
M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL)
return (ENOMEM);
 
-   mpeif-sc_shim.shim_label = 0;
mpeif-sc_unit = unit;
ifp = mpeif-sc_if;
snprintf(ifp-if_xname, sizeof ifp-if_xname, mpe%d, unit);
@@ -105,6 +103,12 @@ mpe_clone_create(struct if_clone *ifc, i
bpfattach(ifp-if_bpf, ifp, DLT_LOOP, sizeof(u_int32_t));
 #endif
 
+   mpeif-sc_ifa.ifa_ifp = ifp;
+   mpeif-sc_ifa.ifa_rtrequest = link_rtrequest;
+   mpeif-sc_ifa.ifa_addr = (struct sockaddr *) ifp-if_sadl;
+   mpeif-sc_smpls.smpls_len = sizeof(mpeif-sc_smpls);
+   mpeif-sc_smpls.smpls_family = AF_MPLS;
+
LIST_INSERT_HEAD(mpeif_list, mpeif, sc_list);
 
return (0);
@@ -114,9 +118,17 @@ int
 mpe_clone_destroy(struct ifnet *ifp)
 {
struct mpe_softc*mpeif = ifp-if_softc;
+   int s;
 
LIST_REMOVE(mpeif, sc_list);
 
+   if (mpeif-sc_smpls.smpls_label) {
+   s = splsoftnet();
+   rt_ifa_del(mpeif-sc_ifa, RTF_MPLS | RTF_UP,
+   smplstosa(mpeif-sc_smpls));
+   splx(s);
+   }
+
if_detach(ifp);
free(mpeif, M_DEVBUF, 0);
return (0);
@@ -292,7 +304,7 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
case SIOCGETLABEL:
ifm = ifp-if_softc;
shim.shim_label =
-   ((ntohl(ifm-sc_shim.shim_label  MPLS_LABEL_MASK)) 
+   ((ntohl(ifm-sc_smpls.smpls_label  MPLS_LABEL_MASK)) 
MPLS_LABEL_OFFSET);
error = copyout(shim, ifr-ifr_data, sizeof(shim));
break;
@@ -306,11 +318,11 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
break;
}
shim.shim_label = htonl(shim.shim_label  MPLS_LABEL_OFFSET);
-   if (ifm-sc_shim.shim_label == shim.shim_label)
+   if (ifm-sc_smpls.smpls_label == shim.shim_label)
break;
LIST_FOREACH(ifm, mpeif_list, sc_list) {
if (ifm != ifp-if_softc 
-   ifm-sc_shim.shim_label == shim.shim_label) {
+   ifm-sc_smpls.smpls_label == shim.shim_label) {
error = EEXIST;
break;
}
@@ -319,25 +331,29 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
break;
ifm = ifp-if_softc;
s = splsoftnet

QinQ regression

2014-11-18 Thread Rafael Zalamena
The diff inlined in this mail fixes a regression introduced by a commit
that removed the VLAN tagging from vlan_start.

Since ether_addheader is only called once now (by the first and innermost
VLAN) we need to add tags there. Doing all the tagging in
ether_addheader avoids all the ether_header popping and pushing that we
used to do in vlan_start, so we also gain a small performance improvement.

I've added a new function called vlan_addheader that does all the VLAN
tagging and also prepends the ethernet header: it walks all parent
interfaces pointed to by the VLAN softc (ifv_p) to decide how many VLANs
will be prepended, then it allocates the necessary memory and does all the
VLAN tagging. (If there are too many VLANs in the stack, M_PREPEND will
return NULL and ether_addheader will fail.)

To decide whether the parent interface is a VLAN or not I've used the
if_output function to determine the interface type.
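
The header layout that vlan_addheader has to produce can be sketched in userland with a flat buffer instead of an mbuf. This is only an illustration under that assumption: the names are hypothetical and the tags are passed in as an array rather than discovered by walking ifv_p.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_ETHER_ADDR_LEN	6
#define EX_VLAN_TAG_LEN		4	/* TPID + TCI */

struct ex_vlan_tag {
	uint16_t tpid;	/* tag protocol id, host byte order */
	uint16_t tci;	/* priority bits and VLAN id, host byte order */
};

/* Store a 16-bit value in network byte order. */
static void
ex_put16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Write dst, src, ntags VLAN tags (outermost first) and the payload
 * ethertype into buf. Returns the header length, or 0 if buf is
 * too small to hold it.
 */
size_t
ex_build_header(uint8_t *buf, size_t buflen, const uint8_t *dst,
    const uint8_t *src, const struct ex_vlan_tag *tags, size_t ntags,
    uint16_t etype)
{
	size_t need, off, i;

	need = 2 * EX_ETHER_ADDR_LEN + ntags * EX_VLAN_TAG_LEN + 2;
	if (buflen < need)
		return (0);

	memcpy(buf, dst, EX_ETHER_ADDR_LEN);
	memcpy(buf + EX_ETHER_ADDR_LEN, src, EX_ETHER_ADDR_LEN);
	off = 2 * EX_ETHER_ADDR_LEN;
	for (i = 0; i < ntags; i++) {
		ex_put16(buf + off, tags[i].tpid);
		ex_put16(buf + off + 2, tags[i].tci);
		off += EX_VLAN_TAG_LEN;
	}
	ex_put16(buf + off, etype);
	return (need);
}
```

The size is computed once up front, which matches the single M_PREPEND in the diff: one allocation for the ethernet header plus every tag.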

Index: net/if_ethersubr.c
===
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.177
diff -u -p -r1.177 if_ethersubr.c
--- net/if_ethersubr.c  6 Nov 2014 14:28:47 -   1.177
+++ net/if_ethersubr.c  18 Nov 2014 13:11:30 -
@@ -206,18 +206,10 @@ ether_addheader(struct mbuf **m, struct 
((*m)->m_pkthdr.pf.prio << EVL_PRIO_BITS);
/* don't return, need to add regular ethernet header */
} else {
-   struct ether_vlan_header*evh;
-
-   M_PREPEND(*m, sizeof(*evh), M_DONTWAIT);
-   if (*m == NULL)
+   *m = vlan_addheader(*m, ifp, etype, esrc, edst);
+   if (*m == NULL)
return (-1);
-   evh = mtod(*m, struct ether_vlan_header *);
-   memcpy(evh->evl_dhost, edst, sizeof(evh->evl_dhost));
-   memcpy(evh->evl_shost, esrc, sizeof(evh->evl_shost));
-   evh->evl_proto = etype;
-   evh->evl_encap_proto = htons(ifv->ifv_type);
-   evh->evl_tag = htons(ifv->ifv_tag +
-   ((*m)->m_pkthdr.pf.prio << EVL_PRIO_BITS));
+
(*m)->m_flags &= ~M_VLANTAG;
return (0);
}
Index: net/if_vlan.c
===
RCS file: /cvs/src/sys/net/if_vlan.c,v
retrieving revision 1.109
diff -u -p -r1.109 if_vlan.c
--- net/if_vlan.c   7 Oct 2014 11:16:23 -   1.109
+++ net/if_vlan.c   18 Nov 2014 13:11:30 -
@@ -788,3 +788,69 @@ vlan_ether_resetmulti(struct ifvlan *ifv
 		(void)(*p->if_ioctl)(p, SIOCADDMULTI, (caddr_t)ifr);
 	}
 }
+
+/*
+ * This procedure allocates the space required for the ethernet header +
+ * all VLANs; it also fills in the information of all inner VLANs and the
+ * packet ethernet header.
+ *
+ * When successful this function returns the new mbuf, else NULL.
+ */
+struct mbuf *
+vlan_addheader(struct mbuf *m, const struct ifnet *ifp0, uint16_t etype,
+    u_char *esrc, u_char *edst)
+{
+	struct	ifvlan *ifv = ifp0->if_softc;
+	struct	ether_vlan_header *evh;
+	struct	ifnet *ifp;
+	int	nvlan = 0, off;
+	struct vlan_shim {
+		uint16_t	vs_proto;
+		uint16_t	vs_tag;
+	} vs;
+
+#define ISVLAN(ifp)	\
+	((ifp)->if_output == vlan_output)
+
+	/* Find out how many VLANs we are going to prepend */
+	while ((ifp = ifv->ifv_p) != NULL && ISVLAN(ifp)) {
+		ifv = ifp->if_softc;
+		nvlan++;
+	}
+#undef ISVLAN
+
+	/* Prepend VLANs + ethernet */
+	M_PREPEND(m, (nvlan * sizeof(vs)) + sizeof(*evh), M_DONTWAIT);
+	if (m == NULL)
+		return (NULL);
+
+	/*
+	 * Calculate the offset of packet ethertype and copy:
+	 * First VLAN + (VLAN header size * VLAN count) - ethertype size
+	 */
+	off = sizeof(*evh) + (nvlan * sizeof(vs)) - sizeof(etype);
+	m_copyback(m, off, sizeof(etype), &etype, M_NOWAIT);
+
+	/* Fill inner VLAN information if any */
+	ifv = ifp0->if_softc;
+	while (nvlan-- > 0) {
+		vs.vs_tag = htons((ifv->ifv_prio << EVL_PRIO_BITS) +
+		    ifv->ifv_tag);
+		vs.vs_proto = htons(ifv->ifv_type);
+
+		off -= sizeof(vs);
+		m_copyback(m, off, sizeof(vs), &vs, M_NOWAIT);
+
+		ifp = ifv->ifv_p;
+		ifv = ifp->if_softc;
+	}
+
+	evh = mtod(m, struct ether_vlan_header *);
+	memcpy(evh->evl_dhost, edst, sizeof(evh->evl_dhost));
+	memcpy(evh->evl_shost, esrc, sizeof(evh->evl_shost));
+	evh->evl_encap_proto = htons(ifv->ifv_type);
+	evh->evl_tag = htons(ifv->ifv_tag +
+	    (m->m_pkthdr.pf.prio << EVL_PRIO_BITS));
+
+	return (m);
+}
Index: 

Re: VPLS patch [0/3]: introduction

2014-11-14 Thread Rafael Zalamena
On Sun, Sep 14, 2014 at 11:48:11PM -0300, Rafael Zalamena wrote:
> The following mails will contain patches that implement the VPLS datapath
> in OpenBSD. Applying all patches should allow people to configure a
> network using VPLS manually.
>
> --- snipped diffs descriptions ---
>
> How to use:
>  * Create an MPLS network.
>    Example: http://2011.eurobsdcon.org/papers/jeker/MPLS.pdf
>  * Create a pseudowire at both ends of your network (PEs)
>    # ifconfig wire<number> encap ethernet wirelabel <local label> \
>        <remote label> neighbor <other PE address> controlword up
>
>    # ifconfig wire0 encap ethernet wirelabel 500 500 neighbor 1.2.3.4 up
>      or
>    # ifconfig wire0 encap ethernet wirelabel 500 500 neighbor 1.2.3.4 \
>        controlword up
>  * Create a bridge between the interface facing your customer (CE) and
>    your wireX, on both PEs where you are configuring the VPN.
>
> --- more comments snipped ---
>
> TODO list:
> * interface configuration code - SIOCSETWIRECFG / SIOCGETWIRECFG (DONE)
> * add / remove wire label (DONE)
> * add / remove wire control label (DONE)
> * ethernet-vlan support (WIP)

Ethernet-tagged support is almost complete; it does not work yet when
packets arrive with 2 or more tags. I'm having problems testing this
because neither 5.6 nor -current does QinQ properly. (I'll be sending
proposed diffs to fix this soon.)

> * ifconfig(8) integration
> ** show wire configuration (DONE)
> ** configure wire (DONE)
> * man page:
> ** wire(4) (TODO)
> ** ifconfig(8) (TODO)

wire(4) now depends on some other diffs that are unrelated to this one,
please see:

(update mpe to use rt_ifa, wire will use that too)
http://marc.info/?l=openbsd-tech&m=141280528700615&w=2

(fix bridge + vlan, bridge expects the complete packet)
http://marc.info/?l=openbsd-tech&m=141575896420071&w=2


Here is the diff for the man pages:
diff --git sbin/ifconfig/ifconfig.8 sbin/ifconfig/ifconfig.8
index 0449b9f..99de2e4 100644
--- sbin/ifconfig/ifconfig.8
+++ sbin/ifconfig/ifconfig.8
@@ -185,7 +185,8 @@ At least the following devices can be created on demand:
 .Xr trunk 4 ,
 .Xr tun 4 ,
 .Xr vether 4 ,
-.Xr vlan 4
+.Xr vlan 4 ,
+.Xr wire 4
 .It Cm debug
 Enable driver-dependent debugging code; usually, this turns on
 extra console error logging.
@@ -494,6 +495,8 @@ and
 .Xr gre 4 )
 .It
 .Xr vlan 4
+.It
+.Xr wire 4
 .El
 .\" BRIDGE
 .Sh BRIDGE
@@ -1546,6 +1549,53 @@ Disassociate from the parent interface.
 This breaks the link between the vlan interface and its parent,
 clears its vlan tag, flags, and link address, and shuts the interface down.
 .El
+.\" wire
+.Sh WIRE
+.nr nS 1
+.Bk -words
+.Nm ifconfig
+.Ar wire-interface
+.Op Cm wirelabel Ar local-label Ar remote-label Cm neighbor Ar dest-address
+.Op Cm encap Ar encapsulation
+.Op Oo Fl Oc Ns Cm controlword
+.Ek
+.nr nS 0
+.Pp
+The following options are available for a
+.Xr wire 4
+interface:
+.Bl -tag -width Ds
+.It Cm wirelabel Ar local-label Ar remote-label
+Set wire local label to
+.Ar local-label
+and remote label to
+.Ar remote-label .
+The
+.Ar local-label
+is a 20-bit number which will be used to create a local label route to
+the wire interface and the
+.Ar remote-label
+is another 20-bit number which will be used to create the output label header.
+.It Cm neighbor Ar dest-address
+Sets the destination address where this wire should output. The
+.Ar dest-address
+is an IPv4 address that will be used to find the nexthop in the MPLS
+network.
+.It Cm encap Ar encapsulation
+Configures the wire encapsulation type with value
+.Ar encapsulation
+which can be
+.Ql ethernet
+or
+.Ql ethernet-tagged .
+By default it's assumed to be
+.Ql ethernet
+mode.
+.It Cm controlword
+Configure the wire interface to use control-words.
+.It Cm -controlword
+Remove control-word configuration from the interface.
+.El
 .Sh EXAMPLES
 Assign the
 address of 192.168.1.10 with a network mask of

diff --git wire.4 wire.4
new file mode 100644
index 000..b5b5cf3
--- /dev/null
+++ wire.4
@@ -0,0 +1,97 @@
+.\" Copyright (C) 2014 Rafael F. Zalamena rzalam...@gmail.com
+.\"
+.\" Permission to use, copy, modify, and distribute this software for any
+.\" purpose with or without fee is hereby granted, provided that the above
+.\" copyright notice and this permission notice appear in all copies.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+.\"
+.Dd $Mdocdate: September 23 2014 $
+.Dt WIRE 4
+.Os
+.Sh NAME
+.Nm wire
+.Nd Pseudowire
+.Sh SYNOPSIS
+.Cd "pseudo-device wire"
+.Pp
+.Fd #include <sys/types.h>
+.Fd #include <netmpls/mpls.h>
+.Sh DESCRIPTION

Re: VPLS patch [2/3]: the wire (pseudowire) implementation

2014-11-14 Thread Rafael Zalamena
On Sat, Sep 20, 2014 at 12:33:11PM -0300, Rafael Zalamena wrote:
> On Sun, Sep 14, 2014 at 11:51:07PM -0300, Rafael Zalamena wrote:
> > The following patch implements the basics of the wire network interface.
> >
> > --- snipped ---
>
> I've added support for tcpdump'ing the wire interface, it will get all
> data flowing through the wire without the MPLS / VPLS labels and control
> word (just like enc(4) does for IPsec). If you want to see the full packet
> see the interface facing your MPLS network (tcpdump needs some work to
> identify the VPLS label and control word).
>

Updated diff of wire(4) (with VLAN support):
diff --git sys/conf/GENERIC sys/conf/GENERIC
index a265eea..5f79987 100644
--- sys/conf/GENERIC
+++ sys/conf/GENERIC
@@ -93,6 +93,7 @@ pseudo-device systrace 1  # system call tracing device
 # clonable devices
 pseudo-device	bpfilter	# packet filter
 pseudo-device	bridge		# network bridging support
+pseudo-device	wire		# pseudowire support
 pseudo-device	carp		# CARP protocol support
 pseudo-device	gif		# IPv[46] over IPv[46] tunnel (RFC1933)
 pseudo-device	gre		# GRE encapsulation interface
diff --git sys/conf/files sys/conf/files
index ab7af00..6af6812 100644
--- sys/conf/files
+++ sys/conf/files
@@ -550,6 +550,7 @@ pseudo-device tun: ifnet
 pseudo-device bpfilter: ifnet
 pseudo-device enc: ifnet
 pseudo-device bridge: ifnet, ether
+pseudo-device wire: ifnet, ether
 pseudo-device vlan: ifnet, ether
 pseudo-device carp: ifnet, ether
 pseudo-device sppp: ifnet
@@ -787,6 +788,7 @@ file net/if_tun.c			tun			needs-count
 file net/if_bridge.c			bridge			needs-count
 file net/bridgestp.c			bridge
 file net/if_vlan.c			vlan			needs-count
+file net/if_wire.c			wire			needs-count
 file net/pipex.c   pipex
 file net/radix.c
 file net/radix_mpath.c !small_kernel
diff --git sys/net/if_bridge.c sys/net/if_bridge.c
index a70f813..8f031b7 100644
--- sys/net/if_bridge.c
+++ sys/net/if_bridge.c
@@ -36,6 +36,7 @@
 #include "pf.h"
 #include "carp.h"
 #include "vlan.h"
+#include "wire.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
@@ -365,6 +366,11 @@ bridge_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 			/* Nothing needed */
 		}
 #endif /* NGIF */
+#if NWIRE > 0
+		else if (ifs->if_type == IFT_MPLSTUNNEL) {
+			/* Nothing needed */
+		}
+#endif /* NWIRE */
 		else {
 			error = EINVAL;
 			break;
diff --git sys/net/if_wire.c sys/net/if_wire.c
new file mode 100644
index 000..9fb20af
--- /dev/null
+++ sys/net/if_wire.c
@@ -0,0 +1,512 @@
+/*
+ * Copyright (c) 2014 Rafael Zalamena rzalam...@gmail.com
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include "bpfilter.h"
+#include "vlan.h"
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/mbuf.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+#include <sys/errno.h>
+
+#include <net/if.h>
+#include <net/if_types.h>
+#include <net/route.h>
+
+#include <netinet/in.h>
+
+#include <netinet/if_ether.h>
+#include <netmpls/mpls.h>
+
+#if NBPFILTER > 0
+#include <net/bpf.h>
+#endif /* NBPFILTER */
+
+#if NVLAN > 0
+#include <net/if_vlan_var.h>
+#endif
+
+struct wire_softc {
+	struct	ifnet sc_if;
+	u_int32_t	sc_flags;
+	u_int32_t	sc_type;
+	struct	shim_hdr sc_lshim;
+	struct	shim_hdr sc_rshim;
+	struct	sockaddr sc_nexthop;
+};
+
+void	wireattach(int);
+int	wire_clone_create(struct if_clone *, int);
+int	wire_clone_destroy(struct ifnet *);
+int	wire_ioctl(struct ifnet *, u_long, caddr_t);
+int	wire_output(struct ifnet *, struct mbuf *, struct sockaddr *,
+	    struct rtentry *);
+void	wire_start(struct ifnet *);
+struct mbuf *wire_vlan_handle(struct mbuf *, struct wire_softc *);
+int	wire_labelroute(struct ifnet *, struct shim_hdr *, int);
+
+struct if_clone wire_cloner =
+    IF_CLONE_INITIALIZER("wire", wire_clone_create, wire_clone_destroy);
+
+/* ARGSUSED */
+void
+wireattach(int n)
+{
+	if_clone_attach(&wire_cloner

Re: VPLS patch [3/3]: ifconfig(8) wire support

2014-11-14 Thread Rafael Zalamena
On Sat, Sep 20, 2014 at 12:43:09PM -0300, Rafael Zalamena wrote:
 On Sun, Sep 14, 2014 at 11:52:22PM -0300, Rafael Zalamena wrote:
  Adds support for wire configuration and status printing.
  
  --- snipped ---
 
 This patch fixes the ifconfig(8) default encapsulation to 'ethernet',
 as it should only display 'none' when it's not configured.
 
 Also, changed the 'ethernet-vlan' to 'ethernet-tagged', thanks to Renato
 feedback.
 

Updated ifconfig(8) wire support diff:
diff --git sbin/ifconfig/ifconfig.c sbin/ifconfig/ifconfig.c
index 133ff55..3a363ff 100644
--- sbin/ifconfig/ifconfig.c
+++ sbin/ifconfig/ifconfig.c
@@ -122,6 +122,10 @@ struct sockaddr_in netmask;
 
 #ifndef SMALL
 struct ifaliasreq  addreq;
+
+int			wconfig = 0;
+int			wcwconfig = 0;
+struct ifwirereq   iwrsave;
 #endif /* SMALL */
 
 char   name[IFNAMSIZ];
@@ -191,10 +195,16 @@ void  setmediainst(const char *, int);
 void	settimeslot(const char *, int);
 void	timeslot_status(void);
 void	setmpelabel(const char *, int);
+void	process_wire_commands(void);
+void	setwireencap(const char *, int);
+void	setwirelabel(const char *, const char *);
+void	setwireneighbor(const char *, int);
+void	setwirecontrolword(const char *, int);
 void	setvlantag(const char *, int);
 void	setvlandev(const char *, int);
 void	unsetvlandev(const char *, int);
 void	mpe_status(void);
+void	wire_status(void);
 void	vlan_status(void);
 void	setinstance(const char *, int);
 int	main(int, char *[]);
@@ -362,6 +372,11 @@ const struct   cmd {
 	{ "mpls",	IFXF_MPLS,	0,	setifxflags },
 	{ "-mpls",	-IFXF_MPLS,	0,	setifxflags },
 	{ "mplslabel",	NEXTARG,	0,	setmpelabel },
+	{ "wirelabel",	NEXTARG2,	0,	NULL, setwirelabel },
+	{ "neighbor",	NEXTARG,	0,	setwireneighbor },
+	{ "controlword", 1,		0,	setwirecontrolword },
+	{ "-controlword", 0,		0,	setwirecontrolword },
+	{ "encap",	NEXTARG,	0,	setwireencap },
 	{ "advbase",	NEXTARG,	0,	setcarp_advbase },
 	{ "advskew",	NEXTARG,	0,	setcarp_advskew },
 	{ "carppeer",	NEXTARG,	0,	setcarppeer },
@@ -754,6 +769,9 @@ nextarg:
 	/* Process any media commands that may have been issued. */
 	process_media_commands();
 
+	/* Process wire commands */
+	process_wire_commands();
+
 	if (af == AF_INET6 && explicit_prefix == 0) {
 		/*
 		 * Aggregatable address architecture defines all prefixes
@@ -2919,6 +2937,7 @@ status(int link, struct sockaddr_dl *sdl, int ls)
sppp_status();
trunk_status();
mpe_status();
+   wire_status();
pflow_status();
 #endif
getifgroups();
@@ -3357,6 +3376,56 @@ mpe_status(void)
 	printf("\tmpls label: %d\n", shim.shim_label);
 }
 
+void
+wire_status(void)
+{
+	struct sockaddr_in *sin;
+	struct ifwirereq iwr;
+
+	bzero(&iwr, sizeof(iwr));
+	ifr.ifr_data = (caddr_t) &iwr;
+	if (ioctl(s, SIOCGETWIRECFG, (caddr_t) &ifr) == -1)
+		return;
+
+	printf("\tencapsulation-type ");
+	switch (iwr.iwr_type) {
+	case IWR_TYPE_NONE:
+		printf("none");
+		break;
+	case IWR_TYPE_ETHERNET:
+		printf("ethernet");
+		break;
+	case IWR_TYPE_ETHERNET_TAGGED:
+		printf("ethernet-vlan");
+		break;
+	default:
+		printf("unknown");
+		break;
+	}
+
+	if (iwr.iwr_flags & IWR_FLAG_CONTROLWORD)
+		printf(", control-word");
+
+	printf("\n");
+
+	printf("\tmpls label: ");
+	if (iwr.iwr_lshim.shim_label == 0)
+		printf("local none ");
+	else
+		printf("local %u ", iwr.iwr_lshim.shim_label);
+
+	if (iwr.iwr_rshim.shim_label == 0)
+		printf("remote none\n");
+	else
+		printf("remote %u\n", iwr.iwr_rshim.shim_label);
+
+	sin = (struct sockaddr_in *) &iwr.iwr_nexthop;
+	if (sin->sin_addr.s_addr == 0)
+		printf("\tneighbor: none\n");
+	else
+		printf("\tneighbor: %s\n", inet_ntoa(sin->sin_addr));
+}
+
 /* ARGSUSED */
 void
 setmpelabel(const char *val, int d)
@@ -3373,6 +3442,112 @@ setmpelabel(const char *val, int d)
 	if (ioctl(s, SIOCSETLABEL, (caddr_t)&ifr) == -1)
 		warn("SIOCSETLABEL");
 }
+
+void
+process_wire_commands(void)
+{
+	struct	sockaddr_in *sin, *sinn;
+	struct	ifwirereq iwr;
+
+	if (wconfig == 0)
+		return;
+
+	bzero(&iwr, sizeof(iwr));
+	ifr.ifr_data = (caddr_t) &iwr;
+	if (ioctl(s, SIOCGETWIRECFG, (caddr_t) &ifr) == -1)
+		err(1, "SIOCGETWIRECFG");
+
+	if (iwrsave.iwr_type == 0

Re: VPLS patch [0/3]: introduction

2014-11-14 Thread Rafael Zalamena
On Fri, Nov 14, 2014 at 05:41:32PM +0100, Mike Belopuhov wrote:
 On 14 November 2014 17:26, Rafael Zalamena rzalam...@gmail.com wrote:
  On Sun, Sep 14, 2014 at 11:48:11PM -0300, Rafael Zalamena wrote:
  The following mails will contain patchs that implement the VPLS datapath
  in OpenBSD. Applying all patchs should allow people to configure a
  network using VPLS manually.
 
  --- snipped diffs descriptions ---
 
  How to use:
   * Create a MPLS network.
 Example: http://2011.eurobsdcon.org/papers/jeker/MPLS.pdf
   * Create a pseudowire in both ends of your network (PEs)
 # ifconfig wirenumber encap ethernet wirelabel local label \
 remote label neighbor other PE address controlword up
 
 # ifconfig wire0 encap ethernet wirelabel 500 500 neighbor 1.2.3.4 up
   or
 # ifconfig wire0 encap ethernet wirelabel 500 500 neighbor 1.2.3.4 \
 controlword up
   * Create a bridge between the interface facing your customer (CE) and
 your wireX, also in both PEs you are configuring the VPN.
 
  --- more comments snipped ---
 
  TODO list:
  * interface configuration code - SIOCSETWIRECFG / SIOCGETWIRECFG (DONE)
  * add / remove wire label (DONE)
  * add / remove wire control label (DONE)
  * ethernet-vlan support (WIP)
 
  Ethernet-tagged support almost complete, it's not working in the case when
  you have packets coming with 2 or more tags. I'm having problems to test
  this since 5.6 and -current doesn't do QinQ properly. (I'll be sending
  proposal diffs to fix this soon)
 
  * ifconfig(8) integration
  ** show wire configuration (DONE)
  ** configure wire (DONE)
  * man page:
  ** wire(4) (TODO)
  ** ifconfig(8) (TODO)
 
  wire(4) is depending on some other diffs now that are unrelated to this,
  please see:
 
  (update mpe to use rt_ifa, wire will use that too)
  http://marc.info/?l=openbsd-tech&m=141280528700615&w=2
 
  (fix bridge + vlan, bridge expects the complete packet)
  http://marc.info/?l=openbsd-tech&m=141575896420071&w=2
 
 
 is it possible to call it something other than just wire(4)?
 vpls maybe?

pseudowire(4) then? (looks long to me)

I wouldn't call it vpls, as it may later be expanded to also do VPWS.



Re: VLAN + bridge regression

2014-11-13 Thread Rafael Zalamena
On Thu, Nov 13, 2014 at 04:19:23PM +0100, Martin Pieuchot wrote:
 On 12/11/14(Wed) 00:22, Rafael Zalamena wrote:
  The diff attached to this mail fixes the bridge output for VLANs noted in
  this link:
  http://marc.info/?l=openbsd-misc&m=141508025731320&w=2
  
  Now when we are doing VLAN input we check whether the VLAN is a bridge port
  or not, if it does then we have to do nothing and just pass the packet and
  the bridge will handle it. This way we save some time by not doing VLAN
  popping in the packets.
 
 In this case, would it makes more sense to move the vlan_input() chunk 
 after the bridge_input() one in ether_input()?

The problem: (long answer)
It seems so, but this way (1) we would do unnecessary VLAN popping (with
QinQ or without HW VLAN support), (2) we would have to add more code to do
tag re-insertion (which is the whole point of (1)) and (3) it does not seem
to work with QinQ in the current code.

The whole problem with the bridge is that it shoves the packets directly
into the interface, and the VLANs no longer do tag insertion in if_start
(which didn't look right, btw).
We would normally assume that ether_output is always called, but the bridge
assumes that the packet is already complete (as if ether_output had been
called), so it looks 'more right' to me that we keep this promise to the
bridge. Otherwise we would have to add more VLAN code to the bridge or
handle VLAN cases everywhere that receives packets from the bridge.


Conclusion: (short answer)
The problem with moving the vlan_input chunk is that we would have to do
tag re-insertion in some cases, it might break QinQ, and it looks to be a
more intrusive code change than this diff.

  
  --- snipped ---



ether_ifdetach: remove unreachable code

2014-11-11 Thread Rafael Zalamena
Remove unreachable code from ether_ifdetach; it has been disabled under
#if 0 for almost 11 years.

Index: net/if_ethersubr.c
===
RCS file: /home/rzalamena/obsdcvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.177
diff -u -p -u -r1.177 if_ethersubr.c
--- net/if_ethersubr.c  6 Nov 2014 14:28:47 -   1.177
+++ net/if_ethersubr.c  11 Nov 2014 12:17:13 -
@@ -810,11 +810,6 @@ ether_ifdetach(struct ifnet *ifp)
LIST_REMOVE(enm, enm_list);
free(enm, M_IFMADDR, 0);
}
-
-#if 0
-   /* moved to if_detach() */
-   if_free_sadl(ifp);
-#endif
 }
 
 #if 0



VLAN + bridge regression

2014-11-11 Thread Rafael Zalamena
The diff attached to this mail fixes the bridge output for VLANs noted in
this link:
http://marc.info/?l=openbsd-misc&m=141508025731320&w=2

Now when we are doing VLAN input we check whether the VLAN is a bridge port
or not; if it is, we do nothing and just pass the packet on so that the
bridge handles it. This way we save some time by not popping the VLAN tag
from the packets.

The bridge was not ready or consistent in some places to handle this, so
some checks were altered to also consider tagged packets without the
M_VLANTAG flag (this means we continue to ignore tagged packets in
'bridge_ip' and 'bridge_blocknonip').
Also, when copying packets, remember to copy the packet's M_VLANTAG flag as
well.

Altered the function vlan_input to update the ether_input ifp pointer, so
now ether_input doesn't need to be re-entered when we didn't pop the VLAN
tag.

Lightly tested in my VPLS setup.
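
The input decision described above can be sketched in userland roughly as
follows. This is only an illustration under simplifying assumptions:
struct iface, struct packet and vlan_input_sketch() are hypothetical
stand-ins, not the kernel's ifnet, mbuf or vlan_input(), and the tag
popping is reduced to clearing a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct iface {
	const char	*name;
	bool		 is_bridge_port;
	unsigned long	 ipackets;
};

struct packet {
	bool		 tagged;	/* VLAN tag still present */
	struct iface	*rcvif;		/* receiving interface */
};

/* Same convention as the diff: 0 means consumed, 1 means keep going. */
enum { PKT_CONSUMED = 0, PKT_CONTINUE = 1 };

int
vlan_input_sketch(struct packet *pkt, struct iface *vlanif,
    struct iface **ifp0)
{
	pkt->rcvif = vlanif;
	vlanif->ipackets++;

	if (vlanif->is_bridge_port) {
		/*
		 * Leave the tag alone and tell the caller which interface
		 * to continue with, so the bridge sees the full packet.
		 */
		*ifp0 = vlanif;
		return (PKT_CONTINUE);
	}

	/* Not in a bridge: pop the tag and take the packet. */
	pkt->tagged = false;
	return (PKT_CONSUMED);
}
```

The caller's interface pointer is only rewritten in the bridge case,
which is what lets the input path continue without being re-entered.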

Index: net/if_bridge.c
===
RCS file: /cvs/src/sys/net/if_bridge.c,v
retrieving revision 1.227
diff -u -p -r1.227 if_bridge.c
--- net/if_bridge.c 8 Sep 2014 06:24:13 -   1.227
+++ net/if_bridge.c 12 Nov 2014 01:40:03 -
@@ -1373,6 +1373,9 @@ bridge_input(struct ifnet *ifp, struct e
 			if (mc == NULL)
 				return (m);
 			bcopy(eh, mtod(mc, caddr_t), ETHER_HDR_LEN);
+			if (m->m_flags & M_VLANTAG)
+				mc->m_flags |= M_VLANTAG;
+
 			s = splnet();
 			if (IF_QFULL(&sc->sc_if.if_snd)) {
 				m_freem(mc);
@@ -2064,12 +2067,13 @@ bridge_blocknonip(struct ether_header *e
 	if (m->m_pkthdr.len < ETHER_HDR_LEN)
 		return (1);
 
+	etype = ntohs(eh->ether_type);
 #if NVLAN > 0
-	if (m->m_flags & M_VLANTAG)
+	if ((m->m_flags & M_VLANTAG) || etype == ETHERTYPE_VLAN ||
+	    etype == ETHERTYPE_QINQ)
 		return (1);
 #endif
 
-	etype = ntohs(eh->ether_type);
 	switch (etype) {
 	case ETHERTYPE_ARP:
 	case ETHERTYPE_REVARP:
@@ -2399,12 +2403,12 @@ bridge_ip(struct bridge_softc *sc, int d
 	int hlen;
 	u_int16_t etype;
 
+	etype = ntohs(eh->ether_type);
 #if NVLAN > 0
-	if (m->m_flags & M_VLANTAG)
+	if ((m->m_flags & M_VLANTAG) || etype == ETHERTYPE_VLAN ||
+	    etype == ETHERTYPE_QINQ)
 		return (m);
 #endif
-
-	etype = ntohs(eh->ether_type);
 
 	if (etype != ETHERTYPE_IP && etype != ETHERTYPE_IPV6) {
 		if (etype > ETHERMTU ||
Index: net/if_ethersubr.c
===
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.177
diff -u -p -r1.177 if_ethersubr.c
--- net/if_ethersubr.c  6 Nov 2014 14:28:47 -   1.177
+++ net/if_ethersubr.c  12 Nov 2014 01:40:03 -
@@ -552,7 +552,7 @@ ether_input(struct ifnet *ifp0, struct e
 
 #if NVLAN > 0
 	if (((m->m_flags & M_VLANTAG) || etype == ETHERTYPE_VLAN ||
-	    etype == ETHERTYPE_QINQ) && (vlan_input(eh, m) == 0))
+	    etype == ETHERTYPE_QINQ) && (vlan_input(eh, m, &ifp) == 0))
 		return;
 #endif
 
Index: net/if_vlan.c
===
RCS file: /cvs/src/sys/net/if_vlan.c,v
retrieving revision 1.109
diff -u -p -r1.109 if_vlan.c
--- net/if_vlan.c   7 Oct 2014 11:16:23 -   1.109
+++ net/if_vlan.c   12 Nov 2014 01:40:03 -
@@ -47,6 +47,7 @@
  * will not modify the ethernet header.
  */
 
+#include "bridge.h"
 #include "vlan.h"
 
 #include <sys/param.h>
@@ -272,7 +273,7 @@ vlan_start(struct ifnet *ifp)
  * vlan_input() returns 0 if it has consumed the packet, 1 otherwise.
  */
 int
-vlan_input(struct ether_header *eh, struct mbuf *m)
+vlan_input(struct ether_header *eh, struct mbuf *m, struct ifnet **ifp0)
 {
 	struct ifvlan	*ifv;
 	struct ifnet	*ifp = m->m_pkthdr.rcvif;
@@ -320,6 +321,21 @@ vlan_input(struct ether_header *eh, stru
 	 * reentrant!).
 	 */
 	m->m_pkthdr.rcvif = &ifv->ifv_if;
+
+#if NBRIDGE > 0
+	/* If we are in a bridge, let it handle it */
+	if (ifv->ifv_if.if_bridgeport != NULL) {
+		*ifp0 = &ifv->ifv_if;
+#if NBPFILTER > 0
+		if (ifv->ifv_if.if_bpf)
+			bpf_mtap_hdr(ifv->ifv_if.if_bpf, (char *)eh,
+			    ETHER_HDR_LEN, m, BPF_DIRECTION_IN, NULL);
+#endif
+		ifv->ifv_if.if_ipackets++;
+		return (1);
+	}
+#endif
+
 	if (m->m_flags & M_VLANTAG) {
 		m->m_flags &= ~M_VLANTAG;
 	} else {
Index: net/if_vlan_var.h
===
RCS file: /cvs/src/sys/net/if_vlan_var.h,v
retrieving revision 1.24
diff -u -p -r1.24 if_vlan_var.h
--- net/if_vlan_var.h   24 Oct 2013 11:14:33 -  1.24
+++ net/if_vlan_var.h   12 Nov 2014 01:40:03 -
@@ -95,7 +95,7 @@ structifvlan {
 #define  

Re: mpe patch: use rt_ifa_{add,del}

2014-10-14 Thread Rafael Zalamena
On Wed, Oct 08, 2014 at 06:54:14PM -0300, Rafael Zalamena wrote:
 On Wed, Oct 08, 2014 at 09:22:44AM +0200, Martin Pieuchot wrote:
  On 07/10/14(Tue) 18:44, Rafael Zalamena wrote:
   On Sat, Oct 04, 2014 at 07:39:03PM -0300, Rafael Zalamena wrote:
On Thu, Oct 02, 2014 at 02:36:12PM +0200, Martin Pieuchot wrote:
 On 01/10/14(Wed) 21:54, Rafael Zalamena wrote:
  --- old chat snip ---

   
   Code changed:
* Replaced old function that used to create routes in favor of rt_ifa_*
* Modified rt_ifa_{add,del} to handle MPLS addresses: when creating an
  route to a MPLS interface it means we want to remove labels. Also MPLS
  only works on rdomain 0
  
  Even if they only work on rdomain 0, I'd prefer not to add code to
  enforce this behavior.  It's like making it harder for people to make it
  work any rdomain.
  
  Other than that, I'm ok with your diff.
  
 
 I removed the code that hardcoded RTF_MPLS to rdomain 0, now we use a
 function to handle the rdomain switching to install routes.
 
 Index: sys/net/if_mpe.c
 ===
 RCS file: /home/rzalamena/obsdcvs/src/sys/net/if_mpe.c,v
 retrieving revision 1.35
 diff -u -p -r1.35 if_mpe.c
 --- sys/net/if_mpe.c  22 Jul 2014 11:06:09 -  1.35
 +++ sys/net/if_mpe.c  8 Oct 2014 21:48:15 -
 @@ -61,7 +61,7 @@ int mpeioctl(struct ifnet *, u_long, cad
  void mpestart(struct ifnet *);
  int  mpe_clone_create(struct if_clone *, int);
  int  mpe_clone_destroy(struct ifnet *);
 -int  mpe_newlabel(struct ifnet *, int, struct shim_hdr *);
 +int  mpe_iflabelroute(struct ifnet *, struct shim_hdr *, int);
  
  LIST_HEAD(, mpe_softc)   mpeif_list;
  struct if_clone  mpe_cloner =
 @@ -333,10 +333,10 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
   ifm = ifp-if_softc;
   if (ifm-sc_shim.shim_label) {
   /* remove old MPLS route */
 - mpe_newlabel(ifp, RTM_DELETE, ifm-sc_shim);
 + mpe_iflabelroute(ifp, ifm-sc_shim, 0);
   }
   /* add new MPLS route */
 - error = mpe_newlabel(ifp, RTM_ADD, shim);
 + error = mpe_iflabelroute(ifp, shim, 1);
   if (error)
   break;
   ifm-sc_shim.shim_label = shim.shim_label;
 @@ -346,8 +346,7 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
   ifm = ifp-if_softc;
   if (ifr-ifr_rdomainid != ifp-if_rdomain) {
   if (ifm-sc_shim.shim_label) {
 - shim.shim_label = ifm-sc_shim.shim_label;
 - error = mpe_newlabel(ifp, RTM_ADD, shim);
 + mpe_iflabelroute(ifp, ifm-sc_shim, 1);
   }
   }
   /* return with ENOTTY so that the parent handler finishes */
 @@ -443,37 +442,29 @@ mpe_input6(struct mbuf *m, struct ifnet 
  }
  #endif   /* INET6 */
  
 +/*
 + * Install or remove mpe interface label routes using rdomain 0.
 + */
  int
 -mpe_newlabel(struct ifnet *ifp, int cmd, struct shim_hdr *shim)
 +mpe_iflabelroute(struct ifnet *ifp, struct shim_hdr *shim, int add)
  {
 - struct rtentry *nrt;
 - struct sockaddr_mpls dst;
 - struct rt_addrinfo info;
 - int error;
 -
 - bzero(dst, sizeof(dst));
 - dst.smpls_len = sizeof(dst);
 - dst.smpls_family = AF_MPLS;
 - dst.smpls_label = shim-shim_label;
 -
 - bzero(info, sizeof(info));
 - info.rti_flags = RTF_UP | RTF_MPLS;
 - info.rti_mpls = MPLS_OP_POP;
 - info.rti_info[RTAX_DST] = smplstosa(dst);
 - info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)ifp-if_sadl;
 -
 - error = rtrequest1(cmd, info, RTP_CONNECTED, nrt, 0);
 - rt_missmsg(cmd, info, error ? 0 : nrt-rt_flags, ifp, error, 0);
 - if (cmd == RTM_DELETE) {
 - if (error == 0  nrt != NULL) {
 - if (nrt-rt_refcnt = 0) {
 - nrt-rt_refcnt++;
 - rtfree(nrt);
 - }
 - }
 - }
 - if (cmd == RTM_ADD  error == 0  nrt != NULL) {
 - nrt-rt_refcnt--;
 - }
 + int error;
 + struct  sockaddr_mpls smpls;
 + u_short rdomain = ifp-if_rdomain;
 +
 + ifp-if_rdomain = 0;
 +
 + memset(smpls, 0, sizeof(smpls));
 + smpls.smpls_family = AF_MPLS;
 + smpls.smpls_label = shim-shim_label;
 + smpls.smpls_len = sizeof(smpls);
 + if (add)
 + error = rt_ifa_add(ifp-if_lladdr, RTF_MPLS | RTF_UP,
 + smplstosa(smpls));
 + else
 + error = rt_ifa_del(ifp-if_lladdr, RTF_MPLS | RTF_UP,
 + smplstosa(smpls));
 +
 + ifp-if_rdomain = rdomain;
   return (error);
  }
 Index: sys/net/route.c
 ===
 RCS file: /home/rzalamena/obsdcvs/src/sys/net/route.c,v
 retrieving revision 1.185
 diff -u -p -r1.185

Re: mpe patch: use rt_ifa_{add,del}

2014-10-08 Thread Rafael Zalamena
On Wed, Oct 08, 2014 at 09:22:44AM +0200, Martin Pieuchot wrote:
 On 07/10/14(Tue) 18:44, Rafael Zalamena wrote:
  On Sat, Oct 04, 2014 at 07:39:03PM -0300, Rafael Zalamena wrote:
   On Thu, Oct 02, 2014 at 02:36:12PM +0200, Martin Pieuchot wrote:
On 01/10/14(Wed) 21:54, Rafael Zalamena wrote:
 --- old chat snip ---
   
  
  Code changed:
   * Replaced old function that used to create routes in favor of rt_ifa_*
   * Modified rt_ifa_{add,del} to handle MPLS addresses: when creating an
 route to a MPLS interface it means we want to remove labels. Also MPLS
 only works on rdomain 0
 
 Even if they only work on rdomain 0, I'd prefer not to add code to
 enforce this behavior.  It's like making it harder for people to make it
 work any rdomain.
 
 Other than that, I'm ok with your diff.
 

I removed the code that hardcoded RTF_MPLS to rdomain 0; now we use a
function that handles the rdomain switching when installing routes.
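
The switching pattern itself is small: remember the interface's rdomain,
set it to 0 for the duration of the route operation, and put it back
afterwards. A minimal userland sketch, assuming simplified types (struct
iface, route_op, with_rdomain0() and fake_route_op() are hypothetical
names for illustration, not kernel interfaces):

```c
/* Hypothetical, simplified interface: only the field this sketch needs. */
struct iface {
	unsigned short	rdomain;
};

/* Hypothetical callback type standing in for the route add/del call. */
typedef int (*route_op)(struct iface *, unsigned int);

/*
 * Run op with the interface temporarily placed in rdomain 0, then put
 * the original rdomain back whether or not op succeeded.
 */
int
with_rdomain0(struct iface *ifp, route_op op, unsigned int label)
{
	unsigned short saved = ifp->rdomain;
	int error;

	ifp->rdomain = 0;
	error = op(ifp, label);
	ifp->rdomain = saved;

	return (error);
}

/* Example operation: records the rdomain it saw, fails for label 0. */
unsigned short seen_rdomain;

int
fake_route_op(struct iface *ifp, unsigned int label)
{
	seen_rdomain = ifp->rdomain;
	return (label == 0 ? -1 : 0);
}
```

Restoring the saved value on the error path as well keeps the interface
in its configured rdomain even when installing the route fails.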

Index: sys/net/if_mpe.c
===
RCS file: /home/rzalamena/obsdcvs/src/sys/net/if_mpe.c,v
retrieving revision 1.35
diff -u -p -r1.35 if_mpe.c
--- sys/net/if_mpe.c22 Jul 2014 11:06:09 -  1.35
+++ sys/net/if_mpe.c8 Oct 2014 21:48:15 -
@@ -61,7 +61,7 @@ int   mpeioctl(struct ifnet *, u_long, cad
 void	mpestart(struct ifnet *);
 int	mpe_clone_create(struct if_clone *, int);
 int	mpe_clone_destroy(struct ifnet *);
-int	mpe_newlabel(struct ifnet *, int, struct shim_hdr *);
+int	mpe_iflabelroute(struct ifnet *, struct shim_hdr *, int);
 
 LIST_HEAD(, mpe_softc)	mpeif_list;
 struct if_clone	mpe_cloner =
@@ -333,10 +333,10 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
 		ifm = ifp->if_softc;
 		if (ifm->sc_shim.shim_label) {
 			/* remove old MPLS route */
-			mpe_newlabel(ifp, RTM_DELETE, &ifm->sc_shim);
+			mpe_iflabelroute(ifp, &ifm->sc_shim, 0);
 		}
 		/* add new MPLS route */
-		error = mpe_newlabel(ifp, RTM_ADD, &shim);
+		error = mpe_iflabelroute(ifp, &shim, 1);
 		if (error)
 			break;
 		ifm->sc_shim.shim_label = shim.shim_label;
@@ -346,8 +346,7 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
 		ifm = ifp->if_softc;
 		if (ifr->ifr_rdomainid != ifp->if_rdomain) {
 			if (ifm->sc_shim.shim_label) {
-				shim.shim_label = ifm->sc_shim.shim_label;
-				error = mpe_newlabel(ifp, RTM_ADD, &shim);
+				mpe_iflabelroute(ifp, &ifm->sc_shim, 1);
 			}
 		}
 		/* return with ENOTTY so that the parent handler finishes */
@@ -443,37 +442,29 @@ mpe_input6(struct mbuf *m, struct ifnet 
 }
 #endif /* INET6 */
 
+/*
+ * Install or remove mpe interface label routes using rdomain 0.
+ */
 int
-mpe_newlabel(struct ifnet *ifp, int cmd, struct shim_hdr *shim)
+mpe_iflabelroute(struct ifnet *ifp, struct shim_hdr *shim, int add)
 {
-	struct rtentry *nrt;
-	struct sockaddr_mpls dst;
-	struct rt_addrinfo info;
-	int error;
-
-	bzero(&dst, sizeof(dst));
-	dst.smpls_len = sizeof(dst);
-	dst.smpls_family = AF_MPLS;
-	dst.smpls_label = shim->shim_label;
-
-	bzero(&info, sizeof(info));
-	info.rti_flags = RTF_UP | RTF_MPLS;
-	info.rti_mpls = MPLS_OP_POP;
-	info.rti_info[RTAX_DST] = smplstosa(&dst);
-	info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)ifp->if_sadl;
-
-	error = rtrequest1(cmd, &info, RTP_CONNECTED, &nrt, 0);
-	rt_missmsg(cmd, &info, error ? 0 : nrt->rt_flags, ifp, error, 0);
-	if (cmd == RTM_DELETE) {
-		if (error == 0 && nrt != NULL) {
-			if (nrt->rt_refcnt <= 0) {
-				nrt->rt_refcnt++;
-				rtfree(nrt);
-			}
-		}
-	}
-	if (cmd == RTM_ADD && error == 0 && nrt != NULL) {
-		nrt->rt_refcnt--;
-	}
+	int	error;
+	struct	sockaddr_mpls smpls;
+	u_short	rdomain = ifp->if_rdomain;
+
+	ifp->if_rdomain = 0;
+
+	memset(&smpls, 0, sizeof(smpls));
+	smpls.smpls_family = AF_MPLS;
+	smpls.smpls_label = shim->shim_label;
+	smpls.smpls_len = sizeof(smpls);
+	if (add)
+		error = rt_ifa_add(ifp->if_lladdr, RTF_MPLS | RTF_UP,
+		    smplstosa(&smpls));
+	else
+		error = rt_ifa_del(ifp->if_lladdr, RTF_MPLS | RTF_UP,
+		    smplstosa(&smpls));
+
+	ifp->if_rdomain = rdomain;
 	return (error);
 }
Index: sys/net/route.c
===
RCS file: /home/rzalamena/obsdcvs/src/sys/net/route.c,v
retrieving revision 1.185
diff -u -p -r1.185 route.c
--- sys/net/route.c 2 Oct 2014 12:21:20 -

Re: mpe patch: use rt_ifa_{add,del}

2014-10-04 Thread Rafael Zalamena
On Thu, Oct 02, 2014 at 02:36:12PM +0200, Martin Pieuchot wrote:
 On 01/10/14(Wed) 21:54, Rafael Zalamena wrote:
  --- old chat snip ---
  
  Code change:
   * Moved label address from softc to lladdr ifa
 
 I'm afraid this is not what we want.  The rest of your diff looks fine
 but moving the storage to be represented as a 'destination address'
 might make sense, but not attached on the lladdr ifa.
 

I tried to use ifp->if_addrlist, but it looks like this KASSERT in rtsock.c
doesn't like this.
rtsock.c:1322
TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) {
	KASSERT(ifa->ifa_addr->sa_family != AF_LINK);

See sys/net/rtsock.c: revision 1.144

Do not put any link-layer address on the per-ifp lists or on the RB-
Tree.
 
Since interfaces only support one link-layer address accessible via the
if_sadl member, there's no need to have it elsewhere.  This improves
various address lookups because the first element of the list, the link-
layer address, won't necessarily be discarded.

Finally remove the empty netmask associated to every link-layer address.
This hack was needed to (ab)use the address & netmask comparison code to
do a strcmp() on the interface name embedded in the sdl_data field.

ok henning@, claudio@


So I moved the ifa back to softc.

   * Changed rt_ifa_add to default RTF_MPLS routes to do a POP and only
  use rdomain 0 (MPLS only works on rdomain 0, and no other action makes
  sense when creating an MPLS route to an interface)

Also changed rt_ifa_del() to remove routes from rdomain 0 as well.

   * Removed old code that installed mpe MPLS routes
   * Conflicting labels verification is now done by routing (see rt_ifa_add())

I had to put back the old verification for installed routes, since the
rt_ifa_add() checks failed to find already existing routes when changing
rdomains or when mpe interfaces were down.
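
For readers following along, the verification amounts to walking the list
of mpe interfaces and refusing a label that another interface already
holds. Below is a minimal standalone sketch of that idea, assuming a
hypothetical singly linked `sk_mpe` list in place of the kernel's
mpe_softc list; none of the `sk_` names exist in the tree.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mpe softc entry: one label per interface. */
struct sk_mpe {
	uint32_t	 label;
	struct sk_mpe	*next;
};

/*
 * Return EEXIST when an interface other than `self` already uses `label`,
 * 0 otherwise. Setting the label an interface already has is not a conflict.
 */
int
sk_label_in_use(const struct sk_mpe *head, const struct sk_mpe *self,
    uint32_t label)
{
	const struct sk_mpe *it;

	assert(self != NULL);

	for (it = head; it != NULL; it = it->next) {
		if (it != self && it->label == label)
			return (EEXIST);
	}
	return (0);
}
```

Because the check looks only at the interface list, it does not depend on
whether the route is currently installed, which is the case the routing
table lookup missed.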

 
 This looks ok.
 
  This was tested in the setup described in:
  http://2011.eurobsdcon.org/papers/jeker/MPLS.pdf
  
  Here is the diff:
  --- snipped old diff ---

Updated diff:

Index: net/if_mpe.c
===
RCS file: /home/rzalamena/obsdcvs//src/sys/net/if_mpe.c,v
retrieving revision 1.35
diff -u -p -r1.35 if_mpe.c
--- net/if_mpe.c22 Jul 2014 11:06:09 -  1.35
+++ net/if_mpe.c4 Oct 2014 22:21:17 -
@@ -61,7 +61,6 @@ int   mpeioctl(struct ifnet *, u_long, cad
 void   mpestart(struct ifnet *);
 intmpe_clone_create(struct if_clone *, int);
 intmpe_clone_destroy(struct ifnet *);
-intmpe_newlabel(struct ifnet *, int, struct shim_hdr *);
 
 LIST_HEAD(, mpe_softc) mpeif_list;
 struct if_clonempe_cloner =
@@ -84,13 +83,19 @@ mpe_clone_create(struct if_clone *ifc, i
 {
struct ifnet*ifp;
struct mpe_softc*mpeif;
+   struct sockaddr_mpls*smpls;
int  s;
 
if ((mpeif = malloc(sizeof(*mpeif),
M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL)
return (ENOMEM);
 
-   mpeif-sc_shim.shim_label = 0;
+   smpls = malloc(sizeof(*smpls), M_IFADDR, M_NOWAIT | M_ZERO);
+   if (smpls == NULL) {
+   free(mpeif, M_DEVBUF, 0);
+   return (ENOMEM);
+   }
+
mpeif-sc_unit = unit;
ifp = mpeif-sc_if;
snprintf(ifp-if_xname, sizeof ifp-if_xname, mpe%d, unit);
@@ -110,6 +115,12 @@ mpe_clone_create(struct if_clone *ifc, i
bpfattach(ifp-if_bpf, ifp, DLT_LOOP, sizeof(u_int32_t));
 #endif
 
+   smpls-smpls_family = AF_MPLS;
+   smpls-smpls_len = sizeof(*smpls);
+   mpeif-sc_ifa.ifa_ifp = ifp;
+   mpeif-sc_ifa.ifa_addr = (struct sockaddr *) ifp-if_sadl;
+   mpeif-sc_ifa.ifa_dstaddr = smplstosa(smpls);
+
s = splnet();
LIST_INSERT_HEAD(mpeif_list, mpeif, sc_list);
splx(s);
@@ -128,6 +139,7 @@ mpe_clone_destroy(struct ifnet *ifp)
splx(s);
 
if_detach(ifp);
+   free(mpeif-sc_ifa.ifa_dstaddr, M_IFADDR, 0);
free(mpeif, M_DEVBUF, 0);
return (0);
 }
@@ -278,8 +290,9 @@ int
 mpeioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
int  error;
-   struct mpe_softc*ifm;
+   struct mpe_softc*ifm, *ifmn;
struct ifreq*ifr;
+   struct sockaddr_mpls*smpls;
struct shim_hdr  shim;
 
ifr = (struct ifreq *)data;
@@ -304,13 +317,15 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
break;
case SIOCGETLABEL:
ifm = ifp-if_softc;
+   smpls = satosmpls(ifm-sc_ifa.ifa_dstaddr);
shim.shim_label =
-   ((ntohl(ifm-sc_shim.shim_label  MPLS_LABEL_MASK)) 
+   ((ntohl(smpls-smpls_label  MPLS_LABEL_MASK)) 
MPLS_LABEL_OFFSET);
error = copyout(shim, ifr-ifr_data, sizeof(shim));
break;
case SIOCSETLABEL:
ifm = ifp

mpe patch: use rt_ifa_{add,del}

2014-10-01 Thread Rafael Zalamena
This new diff aims to simplify the mpe(4) device and also to improve
the old code that handled the installation of MPLS interface routes.

I followed what mpi@ said:

On Tue, Sep 30, 2014 at 11:00:25AM +0200, Martin Pieuchot wrote:
 Hello Rafael,
 
 On 14/09/14(Sun) 23:49, Rafael Zalamena wrote:
  The following patch is just a preparation for the code that is coming to
  implement the wire network interface (the VPLS datapath) to work on OpenBSD.
  
  This code turns the mpe code that handles route and labels into some general
  use functions that will be called by mpe and wire.
 
 Would it be possible to use  the new rt_ifa_add() and rt_ifa_del() instead of
 keeping what is basically a copy of the old rtinit()?
 
 In your case you want to use the lladdr's ifa and you can check for
 RTF_MPLS in the flags to add the corresponding MPLS_OP_POP value.
 
 --- patch snipped ---

Code change:
 * Moved label address from softc to lladdr ifa
 * Changed rt_ifa_add to default RTF_MPLS routes to do a POP and only
use rdomain 0 (MPLS only works on rdomain 0, and no other action makes
sense when creating an MPLS route to an interface)
 * Removed old code that installed mpe MPLS routes
 * Conflicting labels verification is now done by routing (see rt_ifa_add())
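
The RTF_MPLS defaulting described in the list above can be pictured as a
small decision step. The sketch below is a standalone model, not the
kernel code: the `sk_` names and flag values are hypothetical, and it only
shows "MPLS route implies POP and rdomain 0".

```c
#include <assert.h>

/* Illustrative flag values; they are not the kernel's RTF_* constants. */
#define SK_RTF_UP	0x1u
#define SK_RTF_MPLS	0x2u

enum sk_mpls_op { SK_MPLS_OP_NONE, SK_MPLS_OP_POP };

/* What the route request would carry after defaults are applied. */
struct sk_route_req {
	unsigned int	flags;
	unsigned int	rdomain;
	enum sk_mpls_op	op;
};

struct sk_route_req
sk_ifa_route_defaults(unsigned int flags, unsigned int if_rdomain)
{
	struct sk_route_req req;

	req.flags = flags;
	req.rdomain = if_rdomain;
	req.op = SK_MPLS_OP_NONE;

	/* MPLS interface routes always pop the label and live in rdomain 0. */
	if (flags & SK_RTF_MPLS) {
		req.op = SK_MPLS_OP_POP;
		req.rdomain = 0;
	}

	/* Invariant: a POP route is never placed outside rdomain 0. */
	assert(req.op == SK_MPLS_OP_NONE || req.rdomain == 0);
	return (req);
}
```

Non-MPLS routes pass through with the interface's own rdomain untouched.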

This was tested in the setup described in:
http://2011.eurobsdcon.org/papers/jeker/MPLS.pdf

Here is the diff:
diff --git sys/net/if_mpe.c sys/net/if_mpe.c
index 74039dc..8580ef3 100644
--- sys/net/if_mpe.c
+++ sys/net/if_mpe.c
@@ -61,7 +61,6 @@ int   mpeioctl(struct ifnet *, u_long, caddr_t);
 void   mpestart(struct ifnet *);
 intmpe_clone_create(struct if_clone *, int);
 intmpe_clone_destroy(struct ifnet *);
-intmpe_newlabel(struct ifnet *, int, struct shim_hdr *);
 
 LIST_HEAD(, mpe_softc) mpeif_list;
 struct if_clonempe_cloner =
@@ -84,13 +83,19 @@ mpe_clone_create(struct if_clone *ifc, int unit)
 {
struct ifnet*ifp;
struct mpe_softc*mpeif;
+   struct sockaddr_mpls*smpls;
int  s;
 
if ((mpeif = malloc(sizeof(*mpeif),
M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL)
return (ENOMEM);
 
-   mpeif-sc_shim.shim_label = 0;
+   smpls = malloc(sizeof(*smpls), M_IFADDR, M_NOWAIT | M_ZERO);
+   if (smpls == NULL) {
+   free(mpeif, M_DEVBUF, 0);
+   return (ENOMEM);
+   }
+
mpeif-sc_unit = unit;
ifp = mpeif-sc_if;
snprintf(ifp-if_xname, sizeof ifp-if_xname, mpe%d, unit);
@@ -110,6 +115,10 @@ mpe_clone_create(struct if_clone *ifc, int unit)
bpfattach(ifp-if_bpf, ifp, DLT_LOOP, sizeof(u_int32_t));
 #endif
 
+   smpls-smpls_family = AF_MPLS;
+   smpls-smpls_len = sizeof(*smpls);
+   ifp-if_lladdr-ifa_dstaddr = smplstosa(smpls);
+
s = splnet();
LIST_INSERT_HEAD(mpeif_list, mpeif, sc_list);
splx(s);
@@ -127,6 +136,7 @@ mpe_clone_destroy(struct ifnet *ifp)
LIST_REMOVE(mpeif, sc_list);
splx(s);
 
+   free(ifp-if_lladdr-ifa_dstaddr, M_IFADDR, 0);
if_detach(ifp);
free(mpeif, M_DEVBUF, 0);
return (0);
@@ -278,7 +288,7 @@ int
 mpeioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
int  error;
-   struct mpe_softc*ifm;
+   struct sockaddr_mpls*smpls;
struct ifreq*ifr;
struct shim_hdr  shim;
 
@@ -303,14 +313,13 @@ mpeioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
ifp-if_mtu = ifr-ifr_mtu;
break;
case SIOCGETLABEL:
-   ifm = ifp-if_softc;
+   smpls = satosmpls(ifp-if_lladdr-ifa_dstaddr);
shim.shim_label =
-   ((ntohl(ifm-sc_shim.shim_label  MPLS_LABEL_MASK)) 
+   ((ntohl(smpls-smpls_label  MPLS_LABEL_MASK)) 
MPLS_LABEL_OFFSET);
error = copyout(shim, ifr-ifr_data, sizeof(shim));
break;
case SIOCSETLABEL:
-   ifm = ifp-if_softc;
if ((error = copyin(ifr-ifr_data, shim, sizeof(shim
break;
if (shim.shim_label  MPLS_LABEL_MAX ||
@@ -319,36 +328,29 @@ mpeioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
break;
}
shim.shim_label = htonl(shim.shim_label  MPLS_LABEL_OFFSET);
-   if (ifm-sc_shim.shim_label == shim.shim_label)
-   break;
-   LIST_FOREACH(ifm, mpeif_list, sc_list) {
-   if (ifm != ifp-if_softc 
-   ifm-sc_shim.shim_label == shim.shim_label) {
-   error = EEXIST;
-   break;
-   }
-   }
-   if (error)
+
+   smpls = satosmpls(ifp-if_lladdr-ifa_dstaddr);
+   if (smpls-smpls_label == shim.shim_label

Re: VPLS patch [2/3]: the wire (pseudowire) implementation

2014-09-20 Thread Rafael Zalamena
On Sun, Sep 14, 2014 at 11:51:07PM -0300, Rafael Zalamena wrote:
 The following patch implements the basics of the wire network interface.
 
 --- snipped ---

I've added support for tcpdump'ing the wire interface: it sees all data
flowing through the wire without the MPLS / VPLS labels and control word
(just like enc(4) does for IPsec). If you want to see the full packet,
capture on the interface facing your MPLS network (tcpdump needs some work
to identify the VPLS label and control word).

Partial patch (if you applied the one before this):

PATCH START

diff --git sys/net/if_wire.c sys/net/if_wire.c
index 41840cf..b9301fc 100644
--- sys/net/if_wire.c
+++ sys/net/if_wire.c
@@ -30,6 +30,13 @@
 #include <netinet/if_ether.h>
 #include <netmpls/mpls.h>
 
+#include "bpfilter.h"
+
+#if NBPFILTER > 0
+#include <net/bpf.h>
+#endif /* NBPFILTER */
+
+
 void	wireattach(int);
 int	wire_clone_create(struct if_clone *, int);
 int	wire_clone_destroy(struct ifnet *);
@@ -90,6 +97,10 @@ wire_clone_create(struct if_clone *ifc, int unit)
 	if_attach(ifp);
 	if_alloc_sadl(ifp);
 
+#if NBPFILTER > 0
+	bpfattach(&ifp->if_bpf, ifp, DLT_EN10MB, ETHER_HDR_LEN);
+#endif
+
 	s = splnet();
 	LIST_INSERT_HEAD(&wire_list, sc, sc_list);
 	splx(s);
@@ -253,6 +264,11 @@ wire_input(struct ifnet *ifp, struct mbuf *m)
 		}
 	}
 
+#if NBPFILTER > 0
+	if (sc->sc_if.if_bpf)
+		bpf_mtap(sc->sc_if.if_bpf, m, BPF_DIRECTION_IN);
+#endif
+
 	eh = mtod(m, struct ether_header *);
 	m_adj(m, ETHER_HDR_LEN);
 
@@ -316,6 +332,11 @@ wire_start(struct ifnet *ifp)
 			continue;
 		}
 
+#if NBPFILTER > 0
+		if (sc->sc_if.if_bpf)
+			bpf_mtap(sc->sc_if.if_bpf, m, BPF_DIRECTION_OUT);
+#endif /* NBPFILTER */
+
 		rt = rtalloc1(&sc->sc_nexthop, RT_REPORT, 0);
 		if (rt == NULL) {
 			m_freem(m);



Here is the full patch: (you still need to apply VPLS patch [1/3])

PATCH START

diff --git sys/conf/GENERIC sys/conf/GENERIC
index 309528e..444dcbe 100644
--- sys/conf/GENERIC
+++ sys/conf/GENERIC
@@ -95,6 +95,7 @@ pseudo-device systrace 1  # system call tracing device
 # clonable devices
 pseudo-device  bpfilter# packet filter
 pseudo-device  bridge  # network bridging support
+pseudo-device  wire# pseudowire support
 pseudo-device  carp# CARP protocol support
 pseudo-device  gif # IPv[46] over IPv[46] tunnel (RFC1933)
 pseudo-device  gre # GRE encapsulation interface
diff --git sys/conf/files sys/conf/files
index 4220371..755b6cd 100644
--- sys/conf/files
+++ sys/conf/files
@@ -552,6 +552,7 @@ pseudo-device tun: ifnet
 pseudo-device bpfilter: ifnet
 pseudo-device enc: ifnet
 pseudo-device bridge: ifnet, ether
+pseudo-device wire: ifnet, ether
 pseudo-device vlan: ifnet, ether
 pseudo-device carp: ifnet, ether
 pseudo-device sppp: ifnet
@@ -790,6 +791,7 @@ file net/if_tun.c   tun 
needs-count
 file net/if_bridge.c   bridge  needs-count
 file net/bridgestp.c   bridge
 file net/if_vlan.c vlanneeds-count
+file net/if_wire.c wireneeds-count
 file net/pipex.c   pipex
 file net/radix.c
 file net/radix_mpath.c !small_kernel
diff --git sys/net/if_bridge.c sys/net/if_bridge.c
index fa40d36..6d09113 100644
--- sys/net/if_bridge.c
+++ sys/net/if_bridge.c
@@ -36,6 +36,7 @@
 #include pf.h
 #include carp.h
 #include vlan.h
+#include wire.h
 
 #include sys/param.h
 #include sys/systm.h
@@ -365,6 +366,11 @@ bridge_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
/* Nothing needed */
}
 #endif /* NGIF */
+#if NWIRE  0
+   else if (ifs-if_type == IFT_MPLSTUNNEL) {
+   /* Nothing needed */
+   }
+#endif /* NWIRE */
else {
error = EINVAL;
break;
diff --git sys/net/if_wire.c sys/net/if_wire.c
new file mode 100644
index 000..b9301fc
--- /dev/null
+++ sys/net/if_wire.c
@@ -0,0 +1,387 @@
+/*
+ * Copyright (c) 2014 Rafael Zalamena rzalam...@gmail.com
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR

Re: VPLS patch [3/3]: ifconfig(8) wire support

2014-09-20 Thread Rafael Zalamena
On Sun, Sep 14, 2014 at 11:52:22PM -0300, Rafael Zalamena wrote:
 Adds support for wire configuration and status printing.
 
 --- snipped ---

This patch fixes the ifconfig(8) default encapsulation so that it is
'ethernet'; 'none' should only be displayed when the wire is not configured.

Also changed 'ethernet-vlan' to 'ethernet-tagged', thanks to Renato's
feedback.
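
To make the default encapsulation fix easier to see, here is a standalone
sketch of the before and after logic. The `sk_` names and the enum values
are hypothetical stand-ins for iwrsave.iwr_type, iwr.iwr_type and the
IWR_TYPE_* constants; the only assumption taken from the patch is that
"not configured" compares equal to 0.

```c
#include <assert.h>

/* Hypothetical values; only "none == 0" mirrors the checks in the patch. */
enum {
	SK_TYPE_NONE = 0,
	SK_TYPE_ETHERNET = 1,
	SK_TYPE_ETHERNET_TAGGED = 2
};

/* Before: the default is stored, then immediately overwritten. */
int
sk_encap_before(int requested, int current)
{
	if (requested == SK_TYPE_NONE) {
		if (current == SK_TYPE_NONE)
			requested = SK_TYPE_ETHERNET;

		requested = current;	/* clobbers the default above */
	}
	return (requested);
}

/* After: the default is applied to the value that gets copied. */
int
sk_encap_after(int requested, int current)
{
	if (requested == SK_TYPE_NONE) {
		if (current == SK_TYPE_NONE)
			current = SK_TYPE_ETHERNET;

		requested = current;
	}

	/* The result is always a configured type. */
	assert(requested != SK_TYPE_NONE);
	return (requested);
}
```

With nothing requested and nothing configured, the first version still
yields 'none' while the second yields 'ethernet'; an explicit request or
an existing configuration is preserved by both.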

Partial patch (if you applied the one before this):
PATCH START

diff --git sbin/ifconfig/ifconfig.c sbin/ifconfig/ifconfig.c
index 3a363ff..b6f70a5 100644
--- sbin/ifconfig/ifconfig.c
+++ sbin/ifconfig/ifconfig.c
@@ -3396,7 +3396,7 @@ wire_status(void)
 		printf("ethernet");
 		break;
 	case IWR_TYPE_ETHERNET_TAGGED:
-		printf("ethernet-vlan");
+		printf("ethernet-tagged");
 		break;
 	default:
 		printf("unknown");
@@ -3459,7 +3459,7 @@ process_wire_commands(void)
 
 	if (iwrsave.iwr_type == 0) {
 		if (iwr.iwr_type == 0)
-			iwrsave.iwr_type = IWR_TYPE_ETHERNET;
+			iwr.iwr_type = IWR_TYPE_ETHERNET;
 
 		iwrsave.iwr_type = iwr.iwr_type;
 	}
@@ -3498,7 +3498,7 @@ setwireencap(const char *value, int d)
 
 	if (strcmp(value, "ethernet") == 0)
 		iwrsave.iwr_type = IWR_TYPE_ETHERNET;
-	else if (strcmp(value, "ethernet-vlan") == 0)
+	else if (strcmp(value, "ethernet-tagged") == 0)
 		iwrsave.iwr_type = IWR_TYPE_ETHERNET_TAGGED;
 	else
 		errx(1, "invalid wire encapsulation type");



This is the full patch:
PATCH START

diff --git sbin/ifconfig/ifconfig.c sbin/ifconfig/ifconfig.c
index 133ff55..b6f70a5 100644
--- sbin/ifconfig/ifconfig.c
+++ sbin/ifconfig/ifconfig.c
@@ -122,6 +122,10 @@ struct sockaddr_in netmask;
 
 #ifndef SMALL
 struct ifaliasreq  addreq;
+
+intwconfig = 0;
+intwcwconfig = 0;
+struct ifwirereq   iwrsave;
 #endif /* SMALL */
 
 char   name[IFNAMSIZ];
@@ -191,10 +195,16 @@ void  setmediainst(const char *, int);
 void   settimeslot(const char *, int);
 void   timeslot_status(void);
 void   setmpelabel(const char *, int);
+void   process_wire_commands(void);
+void   setwireencap(const char *, int);
+void   setwirelabel(const char *, const char *);
+void   setwireneighbor(const char *, int);
+void   setwirecontrolword(const char *, int);
 void   setvlantag(const char *, int);
 void   setvlandev(const char *, int);
 void   unsetvlandev(const char *, int);
 void   mpe_status(void);
+void   wire_status(void);
 void   vlan_status(void);
 void   setinstance(const char *, int);
 intmain(int, char *[]);
@@ -362,6 +372,11 @@ const struct   cmd {
{ mpls,   IFXF_MPLS,  0,  setifxflags },
{ -mpls,  -IFXF_MPLS, 0,  setifxflags },
{ mplslabel,  NEXTARG,0,  setmpelabel },
+   { wirelabel,  NEXTARG2,   0,  NULL, setwirelabel },
+   { neighbor,   NEXTARG,0,  setwireneighbor },
+   { controlword, 1, 0,  setwirecontrolword },
+   { -controlword, 0,0,  setwirecontrolword },
+   { encap,  NEXTARG,0,  setwireencap },
{ advbase,NEXTARG,0,  setcarp_advbase },
{ advskew,NEXTARG,0,  setcarp_advskew },
{ carppeer,   NEXTARG,0,  setcarppeer },
@@ -754,6 +769,9 @@ nextarg:
/* Process any media commands that may have been issued. */
process_media_commands();
 
+   /* Process wire commands */
+   process_wire_commands();
+
if (af == AF_INET6  explicit_prefix == 0) {
/*
 * Aggregatable address architecture defines all prefixes
@@ -2919,6 +2937,7 @@ status(int link, struct sockaddr_dl *sdl, int ls)
sppp_status();
trunk_status();
mpe_status();
+   wire_status();
pflow_status();
 #endif
getifgroups();
@@ -3357,6 +3376,56 @@ mpe_status(void)
printf(\tmpls label: %d\n, shim.shim_label);
 }
 
+void
+wire_status(void)
+{
+   struct sockaddr_in *sin;
+   struct ifwirereq iwr;
+
+   bzero(iwr, sizeof(iwr));
+   ifr.ifr_data = (caddr_t) iwr;
+   if (ioctl(s, SIOCGETWIRECFG, (caddr_t) ifr) == -1)
+   return;
+
+   printf(\tencapsulation-type );
+   switch (iwr.iwr_type) {
+   case IWR_TYPE_NONE:
+   printf(none);
+   break;
+   case IWR_TYPE_ETHERNET:
+   printf(ethernet);
+   break;
+   case IWR_TYPE_ETHERNET_TAGGED:
+   printf(ethernet-tagged);
+   break;
+   default

VPLS patch [0/3]: introduction

2014-09-14 Thread Rafael Zalamena
The following mails contain patches that implement the VPLS datapath
in OpenBSD. Applying all patches should allow people to configure a
network using VPLS manually.

The first patch prepares the system sources to receive the wire
implementation: it turns some mpe specific code into some generic
functions that we will be using in wire and mpe.

The second patch implements the wire datapath itself, with that we should
be able to create a pseudo device that will handle the VPLS labels and
control words. Yet, it is still missing the ethernet-vlan mode.

The third and last patch implements wire specific handling in ifconfig,
like: showing wire configuration and doing the wire configuration.

How to use:
 * Create an MPLS network.
   Example: http://2011.eurobsdcon.org/papers/jeker/MPLS.pdf
 * Create a pseudowire at both ends of your network (PEs)
   # ifconfig wire<number> encap ethernet wirelabel <local label> \
   <remote label> neighbor <other PE address> controlword up

   # ifconfig wire0 encap ethernet wirelabel 500 500 neighbor 1.2.3.4 up
 or
   # ifconfig wire0 encap ethernet wirelabel 500 500 neighbor 1.2.3.4 \
   controlword up
 * Create a bridge between the interface facing your customer (CE) and
   your wireX, also on both PEs where you are configuring the VPN.

Now every time a packet comes through the client interface it will be
encapsulated by the wire interface with a new ethernet frame and will
be sent to the neighbor PE configured. When it reaches the destination PE
the packet will be removed from the frame and sent out as it entered the
first PE.
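
As background for the labels mentioned above, each MPLS label pushed in
front of the encapsulated frame is a 32-bit label stack entry (RFC 3032):
a 20-bit label, 3 experimental bits, a bottom-of-stack bit and an 8-bit
TTL, sent in network byte order. The sketch below packs and unpacks one
such entry; the `sk_` helpers are hypothetical and are not part of the
patches.

```c
#include <assert.h>
#include <stdint.h>

#define SK_LABEL_MAX	0xfffffu	/* labels are 20 bits wide */
#define SK_LABEL_SHIFT	12		/* label sits in the top 20 bits */
#define SK_BOS_BIT	0x100u		/* bottom-of-stack flag */

/* Pack label, bottom-of-stack flag and TTL into 4 big-endian bytes. */
void
sk_shim_encode(uint32_t label, int bos, uint8_t ttl, uint8_t out[4])
{
	uint32_t v;

	assert(label <= SK_LABEL_MAX);

	v = (label << SK_LABEL_SHIFT) | (bos ? SK_BOS_BIT : 0u) | ttl;
	out[0] = (uint8_t)(v >> 24);
	out[1] = (uint8_t)(v >> 16);
	out[2] = (uint8_t)(v >> 8);
	out[3] = (uint8_t)v;
}

/* Recover the 20-bit label from 4 big-endian bytes. */
uint32_t
sk_shim_label(const uint8_t in[4])
{
	uint32_t v;

	v = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
	    ((uint32_t)in[2] << 8) | (uint32_t)in[3];
	return (v >> SK_LABEL_SHIFT);
}
```

For the label 500 used in the ifconfig examples, with the bottom-of-stack
bit set and a TTL of 255, the entry is the bytes 00 1f 41 ff.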

NOTE: there is a LDPd protocol implementation for VPLS/VPWS ongoing by
Renato that might be coming anytime soon. Also, thanks to Renato
for the help and input :) .

Comments and advice are very welcome.

TODO list:
* interface configuration code - SIOCSETWIRECFG / SIOCGETWIRECFG (DONE)
* add / remove wire label (DONE)
* add / remove wire control label (DONE)
* ethernet-vlan support (WIP)
* ifconfig(8) integration
** show wire configuration (DONE)
** configure wire (DONE)
* man page:
** wire(4) (TODO)
** ifconfig(8) (TODO)



VPLS patch [1/3]: prepare sys/ to receive pseudowire implementation

2014-09-14 Thread Rafael Zalamena
The following patch is just a preparation for the code that is coming to
implement the wire network interface (the VPLS datapath) to work on OpenBSD.

This code turns the mpe code that handles route and labels into some general
use functions that will be called by mpe and wire.

diff --git sys/net/if_mpe.c sys/net/if_mpe.c
index 74039dc..98d69f4 100644
--- sys/net/if_mpe.c
+++ sys/net/if_mpe.c
@@ -61,7 +61,6 @@ int   mpeioctl(struct ifnet *, u_long, caddr_t);
 void   mpestart(struct ifnet *);
 intmpe_clone_create(struct if_clone *, int);
 intmpe_clone_destroy(struct ifnet *);
-intmpe_newlabel(struct ifnet *, int, struct shim_hdr *);
 
 LIST_HEAD(, mpe_softc) mpeif_list;
 struct if_clonempe_cloner =
@@ -319,36 +318,17 @@ mpeioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
break;
}
shim.shim_label = htonl(shim.shim_label  MPLS_LABEL_OFFSET);
-   if (ifm-sc_shim.shim_label == shim.shim_label)
-   break;
-   LIST_FOREACH(ifm, mpeif_list, sc_list) {
-   if (ifm != ifp-if_softc 
-   ifm-sc_shim.shim_label == shim.shim_label) {
-   error = EEXIST;
-   break;
-   }
-   }
-   if (error)
-   break;
-   ifm = ifp-if_softc;
-   if (ifm-sc_shim.shim_label) {
-   /* remove old MPLS route */
-   mpe_newlabel(ifp, RTM_DELETE, ifm-sc_shim);
-   }
-   /* add new MPLS route */
-   error = mpe_newlabel(ifp, RTM_ADD, shim);
-   if (error)
-   break;
-   ifm-sc_shim.shim_label = shim.shim_label;
+   error = mpls_shim_set(ifp, shim, ifm-sc_shim);
break;
case SIOCSIFRDOMAIN:
/* must readd the MPLS route for our label */
ifm = ifp-if_softc;
if (ifr-ifr_rdomainid != ifp-if_rdomain) {
-   if (ifm-sc_shim.shim_label) {
-   shim.shim_label = ifm-sc_shim.shim_label;
-   error = mpe_newlabel(ifp, RTM_ADD, shim);
-   }
+   shim.shim_label = ifm-sc_shim.shim_label;
+
+   /* XXX trick mpls_shim_set() to reinstall the route */
+   bzero(ifm-sc_shim, sizeof(ifm-sc_shim));
+   error = mpls_shim_set(ifp, shim, ifm-sc_shim);
}
/* return with ENOTTY so that the parent handler finishes */
return (ENOTTY);
@@ -444,36 +424,13 @@ mpe_input6(struct mbuf *m, struct ifnet *ifp, struct 
sockaddr_mpls *smpls,
 #endif /* INET6 */
 
 int
-mpe_newlabel(struct ifnet *ifp, int cmd, struct shim_hdr *shim)
+mpe_label_exists(const struct shim_hdr *shim)
 {
-   struct rtentry *nrt;
-   struct sockaddr_mpls dst;
-   struct rt_addrinfo info;
-   int error;
-
-   bzero(dst, sizeof(dst));
-   dst.smpls_len = sizeof(dst);
-   dst.smpls_family = AF_MPLS;
-   dst.smpls_label = shim-shim_label;
-
-   bzero(info, sizeof(info));
-   info.rti_flags = RTF_UP | RTF_MPLS;
-   info.rti_mpls = MPLS_OP_POP;
-   info.rti_info[RTAX_DST] = smplstosa(dst);
-   info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)ifp-if_sadl;
-
-   error = rtrequest1(cmd, info, RTP_CONNECTED, nrt, 0);
-   rt_missmsg(cmd, info, error ? 0 : nrt-rt_flags, ifp, error, 0);
-   if (cmd == RTM_DELETE) {
-   if (error == 0  nrt != NULL) {
-   if (nrt-rt_refcnt = 0) {
-   nrt-rt_refcnt++;
-   rtfree(nrt);
-   }
-   }
-   }
-   if (cmd == RTM_ADD  error == 0  nrt != NULL) {
-   nrt-rt_refcnt--;
-   }
-   return (error);
+   struct  mpe_softc *mpe_sc;
+
+   LIST_FOREACH(mpe_sc, mpeif_list, sc_list)
+   if (shim-shim_label == mpe_sc-sc_shim.shim_label)
+   return (1);
+
+   return (0);
 }
diff --git sys/netmpls/mpls.h sys/netmpls/mpls.h
index 2903aa4..0363d86 100644
--- sys/netmpls/mpls.h
+++ sys/netmpls/mpls.h
@@ -176,9 +176,13 @@ extern int mpls_inkloop;
 void   mpls_init(void);
 void   mplsintr(void);
 
+int mpe_label_exists(const struct shim_hdr *);
+
 struct mbuf*mpls_shim_pop(struct mbuf *);
 struct mbuf*mpls_shim_swap(struct mbuf *, struct rt_mpls *);
 struct mbuf*mpls_shim_push(struct mbuf *, struct rt_mpls *);
+int mpls_shim_set(struct ifnet *, struct shim_hdr *,
+   struct shim_hdr *);
 
 int mpls_sysctl(int *, u_int, void *, size_t *, void *, size_t);
 voidmpls_input(struct mbuf *);
diff --git sys/netmpls/mpls_shim.c sys/netmpls/mpls_shim.c
index 

VPLS patch [2/3]: the wire (pseudowire) implementation

2014-09-14 Thread Rafael Zalamena
The following patch implements the basics of the wire network interface.

diff --git sys/conf/GENERIC sys/conf/GENERIC
index 309528e..444dcbe 100644
--- sys/conf/GENERIC
+++ sys/conf/GENERIC
@@ -95,6 +95,7 @@ pseudo-device systrace 1  # system call tracing device
 # clonable devices
 pseudo-device  bpfilter# packet filter
 pseudo-device  bridge  # network bridging support
+pseudo-device  wire# pseudowire support
 pseudo-device  carp# CARP protocol support
 pseudo-device  gif # IPv[46] over IPv[46] tunnel (RFC1933)
 pseudo-device  gre # GRE encapsulation interface
diff --git sys/conf/files sys/conf/files
index 4220371..755b6cd 100644
--- sys/conf/files
+++ sys/conf/files
@@ -552,6 +552,7 @@ pseudo-device tun: ifnet
 pseudo-device bpfilter: ifnet
 pseudo-device enc: ifnet
 pseudo-device bridge: ifnet, ether
+pseudo-device wire: ifnet, ether
 pseudo-device vlan: ifnet, ether
 pseudo-device carp: ifnet, ether
 pseudo-device sppp: ifnet
@@ -790,6 +791,7 @@ file net/if_tun.c   tun 
needs-count
 file net/if_bridge.c   bridge  needs-count
 file net/bridgestp.c   bridge
 file net/if_vlan.c vlanneeds-count
+file net/if_wire.c wireneeds-count
 file net/pipex.c   pipex
 file net/radix.c
 file net/radix_mpath.c !small_kernel
diff --git sys/net/if_bridge.c sys/net/if_bridge.c
index fa40d36..6d09113 100644
--- sys/net/if_bridge.c
+++ sys/net/if_bridge.c
@@ -36,6 +36,7 @@
 #include pf.h
 #include carp.h
 #include vlan.h
+#include wire.h
 
 #include sys/param.h
 #include sys/systm.h
@@ -365,6 +366,11 @@ bridge_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
/* Nothing needed */
}
 #endif /* NGIF */
+#if NWIRE  0
+   else if (ifs-if_type == IFT_MPLSTUNNEL) {
+   /* Nothing needed */
+   }
+#endif /* NWIRE */
else {
error = EINVAL;
break;
diff --git sys/net/if_wire.c sys/net/if_wire.c
new file mode 100644
index 000..41840cf
--- /dev/null
+++ sys/net/if_wire.c
@@ -0,0 +1,366 @@
+/*
+ * Copyright (c) 2014 Rafael Zalamena rzalam...@gmail.com
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include sys/param.h
+#include sys/systm.h
+#include sys/mbuf.h
+#include sys/socket.h
+#include sys/ioctl.h
+#include sys/errno.h
+
+#include net/if.h
+#include net/if_types.h
+#include net/route.h
+
+#include netinet/in.h
+
+#include netinet/if_ether.h
+#include netmpls/mpls.h
+
+void   wireattach(int);
+intwire_clone_create(struct if_clone *, int);
+intwire_clone_destroy(struct ifnet *);
+intwire_ioctl(struct ifnet *, u_long, caddr_t);
+intwire_output(struct ifnet *, struct mbuf *, struct sockaddr *,
+struct rtentry *);
+void   wire_start(struct ifnet *);
+
+struct wire_softc {
+   struct  ifnet sc_if;
+   u_int32_t   sc_flags;
+   u_int32_t   sc_type;
+   struct  shim_hdr sc_lshim;
+   struct  shim_hdr sc_rshim;
+   struct  sockaddr sc_nexthop;
+
+   LIST_ENTRY(wire_softc) sc_list;
+};
+
+LIST_HEAD(, wire_softc) wire_list;
+
+struct if_clone wire_cloner =
+IF_CLONE_INITIALIZER(wire, wire_clone_create, wire_clone_destroy);
+
+/* ARGSUSED */
+void
+wireattach(int n)
+{
+   LIST_INIT(wire_list);
+   if_clone_attach(wire_cloner);
+}
+
+int
+wire_clone_create(struct if_clone *ifc, int unit)
+{
+   struct  wire_softc *sc;
+   struct  ifnet *ifp;
+   int s;
+
+   sc = malloc(sizeof(*sc), M_DEVBUF, M_NOWAIT | M_ZERO);
+   if (sc == NULL)
+   return (ENOMEM);
+
+   ifp = sc-sc_if;
+   snprintf(ifp-if_xname, sizeof(ifp-if_xname), %s%d,
+   ifc-ifc_name, unit);
+   ifp-if_softc = sc;
+   ifp-if_mtu = ETHERMTU;
+   ifp-if_flags = IFF_POINTOPOINT;
+   ifp-if_ioctl = wire_ioctl;
+   ifp-if_output = wire_output;
+   ifp-if_start = wire_start;
+   ifp-if_type = IFT_MPLSTUNNEL;
+   ifp-if_hdrlen = ETHER_HDR_LEN;
+   IFQ_SET_MAXLEN(ifp-if_snd, IFQ_MAXLEN

VPLS patch [3/3]: ifconfig(8) wire support

2014-09-14 Thread Rafael Zalamena
Adds support for wire configuration and status printing.

diff --git sbin/ifconfig/ifconfig.c sbin/ifconfig/ifconfig.c
index 133ff55..3a363ff 100644
--- sbin/ifconfig/ifconfig.c
+++ sbin/ifconfig/ifconfig.c
@@ -122,6 +122,10 @@ struct sockaddr_in netmask;
 
 #ifndef SMALL
 struct ifaliasreq  addreq;
+
+intwconfig = 0;
+intwcwconfig = 0;
+struct ifwirereq   iwrsave;
 #endif /* SMALL */
 
 char   name[IFNAMSIZ];
@@ -191,10 +195,16 @@ void  setmediainst(const char *, int);
 void   settimeslot(const char *, int);
 void   timeslot_status(void);
 void   setmpelabel(const char *, int);
+void   process_wire_commands(void);
+void   setwireencap(const char *, int);
+void   setwirelabel(const char *, const char *);
+void   setwireneighbor(const char *, int);
+void   setwirecontrolword(const char *, int);
 void   setvlantag(const char *, int);
 void   setvlandev(const char *, int);
 void   unsetvlandev(const char *, int);
 void   mpe_status(void);
+void   wire_status(void);
 void   vlan_status(void);
 void   setinstance(const char *, int);
 intmain(int, char *[]);
@@ -362,6 +372,11 @@ const struct   cmd {
{ mpls,   IFXF_MPLS,  0,  setifxflags },
{ -mpls,  -IFXF_MPLS, 0,  setifxflags },
{ mplslabel,  NEXTARG,0,  setmpelabel },
+   { wirelabel,  NEXTARG2,   0,  NULL, setwirelabel },
+   { neighbor,   NEXTARG,0,  setwireneighbor },
+   { controlword, 1, 0,  setwirecontrolword },
+   { -controlword, 0,0,  setwirecontrolword },
+   { encap,  NEXTARG,0,  setwireencap },
{ advbase,NEXTARG,0,  setcarp_advbase },
{ advskew,NEXTARG,0,  setcarp_advskew },
{ carppeer,   NEXTARG,0,  setcarppeer },
@@ -754,6 +769,9 @@ nextarg:
/* Process any media commands that may have been issued. */
process_media_commands();
 
+   /* Process wire commands */
+   process_wire_commands();
+
if (af == AF_INET6  explicit_prefix == 0) {
/*
 * Aggregatable address architecture defines all prefixes
@@ -2919,6 +2937,7 @@ status(int link, struct sockaddr_dl *sdl, int ls)
sppp_status();
trunk_status();
mpe_status();
+   wire_status();
pflow_status();
 #endif
getifgroups();
@@ -3357,6 +3376,56 @@ mpe_status(void)
printf(\tmpls label: %d\n, shim.shim_label);
 }
 
+void
+wire_status(void)
+{
+   struct sockaddr_in *sin;
+   struct ifwirereq iwr;
+
+   bzero(iwr, sizeof(iwr));
+   ifr.ifr_data = (caddr_t) iwr;
+   if (ioctl(s, SIOCGETWIRECFG, (caddr_t) ifr) == -1)
+   return;
+
+   printf(\tencapsulation-type );
+   switch (iwr.iwr_type) {
+   case IWR_TYPE_NONE:
+   printf(none);
+   break;
+   case IWR_TYPE_ETHERNET:
+   printf(ethernet);
+   break;
+   case IWR_TYPE_ETHERNET_TAGGED:
+   printf(ethernet-vlan);
+   break;
+   default:
+   printf(unknown);
+   break;
+   }
+
+   if (iwr.iwr_flags  IWR_FLAG_CONTROLWORD)
+   printf(, control-word);
+
+   printf(\n);
+
+   printf(\tmpls label: );
+   if (iwr.iwr_lshim.shim_label == 0)
+   printf(local none );
+   else
+   printf(local %u , iwr.iwr_lshim.shim_label);
+
+   if (iwr.iwr_rshim.shim_label == 0)
+   printf(remote none\n);
+   else
+   printf(remote %u\n, iwr.iwr_rshim.shim_label);
+
+   sin = (struct sockaddr_in *) iwr.iwr_nexthop;
+   if (sin-sin_addr.s_addr == 0)
+   printf(\tneighbor: none\n);
+   else
+   printf(\tneighbor: %s\n, inet_ntoa(sin-sin_addr));
+}
+
 /* ARGSUSED */
 void
 setmpelabel(const char *val, int d)
@@ -3373,6 +3442,112 @@ setmpelabel(const char *val, int d)
if (ioctl(s, SIOCSETLABEL, (caddr_t)ifr) == -1)
warn(SIOCSETLABEL);
 }
+
+void
+process_wire_commands(void)
+{
+	struct	sockaddr_in *sin, *sinn;
+	struct	ifwirereq iwr;
+
+	if (wconfig == 0)
+		return;
+
+	bzero(&iwr, sizeof(iwr));
+	ifr.ifr_data = (caddr_t) &iwr;
+	if (ioctl(s, SIOCGETWIRECFG, (caddr_t) &ifr) == -1)
+		err(1, "SIOCGETWIRECFG");
+
+	if (iwrsave.iwr_type == 0) {
+		if (iwr.iwr_type == 0)
+			iwrsave.iwr_type = IWR_TYPE_ETHERNET;
+		else
+			iwrsave.iwr_type = iwr.iwr_type;
+	}
+	if (wcwconfig == 0)
+		iwrsave.iwr_flags |= iwr.iwr_flags;
+
+	if (iwrsave.iwr_lshim.shim_label == 0 ||
+	    iwrsave.iwr_rshim.shim_label == 0) {
+		if (iwr.iwr_lshim.shim_label == 0 ||
+		    iwr.iwr_rshim.shim_label == 0)

Re: L2VPN in OpenBSD

2014-08-19 Thread Rafael Zalamena
On Tue, Aug 19, 2014 at 03:48:51PM -0400, Tim Epkes wrote:
> All,
>
> I noticed in a few write-ups by Claudio that PWE3 and VPLS were next on the
> roadmap.  This seemed to be a few years ago.  Any progress in that regard?
>  Is their a page that tracks that status?  Very interested, Thanks
>
> Tim

Yes, renato@ and I are actively working on this.

There is no page tracking the status, but it has been discussed on misc@.

http://marc.info/?l=openbsd-misc&m=140744694729898&w=2



Re: daily(8) scratch and junk files removal

2014-07-02 Thread Rafael Zalamena
On Wed, Jul 02, 2014 at 08:49:34AM -0500, Shawn K. Quinn wrote:
> On Tue, 2014-07-01 at 19:07 -0300, Rafael Zalamena wrote:
> > I also noted that would only happen on one machine which I had setup
> > one partition for /var/tmp and /tmp (and /tmp -> /var/tmp). After some
> > investigation I found out that the code that daily(8) uses to clean
> > /var/tmp is different from /tmp.
>
> Putting /var/tmp and /tmp together is a really bad idea. /var/tmp is
> supposed to survive a reboot, /tmp isn't.
>
> -- 
> Shawn K. Quinn skqu...@rushpost.com
>

Thank you for the reminder. I got my setup wrong and should have done
my homework first. It seems that people got this wrong a while back
too: the first appearance of 'ssh-*' was back in 2000:
etc/daily 1.31:
Prune /tmp traversal at .X11-unix
Since /tmp might be a link to /var/tmp, prune at ssh-* or .X11-unix
like the find on /tmp does.

I've made a quick test and there is no problem with 'tmux-*' existing in
/var/tmp after a reboot. Running 'tmux attach' against a socket with no
tmux server behind it just prints 'no sessions', and running 'tmux'
starts a new session with no problem. I don't know about 'ssh-*' though.

With the diff I mailed we stay compatible with people who expect
/tmp to be a symbolic link to /var/tmp, but if we want to fix this for
good we should remove the other /var/tmp exceptions and warn users about
it.



daily(8) scratch and junk files removal

2014-07-01 Thread Rafael Zalamena
I noticed a problem on one of my OpenBSD installations where tmux(1)
would lose its session socket after a few inactive days. Every time
that happened I quickly fixed it by sending SIGUSR1 (as suggested by
the man page) to restore the session socket.

I also noted that it only happened on a machine where I had set up a
single partition for /var/tmp and /tmp (with /tmp -> /var/tmp). After
some investigation I found out that the code that daily(8) uses to clean
/var/tmp is different from the code for /tmp.

/tmp:
	Test if /tmp exists and is not a symlink
	Clean all files unaccessed in 3 days except: ssh-*,
	X11-unix, ICE-unix and portslocks
	Clean all directories unmodified in 3 days except: ssh-*,
	vi.recover, X11-unix, ICE-unix and portslocks

/var/tmp:
	This one has rules similar to /tmp, but instead of cleaning
	unaccessed regular files, it cleans everything that is not a
	directory.

The following diff fixes the issue by ignoring directories belonging to
tmux(1) sessions. Another solution would be changing the find(1) type
flag to only remove regular files (instead of 'not directories', which
includes special files like sockets, pipes, devices, etc.), but I found
the first one less intrusive and more correct.


Index: etc/daily
===
RCS file: /cvs/src/etc/daily,v
retrieving revision 1.80
diff -u -p -r1.80 daily
--- etc/daily   24 Apr 2014 19:04:54 -  1.80
+++ etc/daily   1 Jul 2014 00:49:54 -
@@ -49,7 +49,7 @@ if [ -d /tmp -a ! -L /tmp ]; then
 	cd /tmp && {
 	find -x . \
 	    \( -path './ssh-*' -o -path ./.X11-unix -o -path ./.ICE-unix \
-		-o -path ./portslocks \) \
+		-o -path ./portslocks -o -path './tmux-*' \) \
 	    -prune -o -type f -atime +3 -execdir rm -f -- {} \; 2>/dev/null
 	find -x . -type d -mtime +1 ! -path ./vi.recover ! -path ./.X11-unix \
 	    ! -path ./.ICE-unix ! -path ./portslocks ! -name . \
@@ -60,7 +60,7 @@ if [ -d /var/tmp -a ! -L /var/tmp ]; the
 	cd /var/tmp && {
 	find -x . \
 	    \( -path './ssh-*' -o -path ./.X11-unix -o -path ./.ICE-unix \
-		-o -path ./portslocks \) \
+		-o -path ./portslocks -o -path './tmux-*' \) \
 	    -prune -o ! -type d -atime +7 -execdir rm -f -- {} \; 2>/dev/null
 	find -x . -type d -mtime +1 ! -path ./vi.recover ! -path ./.X11-unix \
 	    ! -path ./.ICE-unix ! -path ./portslocks ! -name . \



[PATCH] SLIST mergesort implementation

2013-11-25 Thread Rafael Zalamena
This is an implementation of the merge sort algorithm for SLIST in
queue(3).

Merge sort is a stable algorithm that provides us a worst case run time
of O(n lg n) and uses at most O(n) of stack (where 'n' is the current
number of elements in the list).

The patch attached to this mail provides the following macros:
SLIST_MERGESORT_PROTOTYPE(name, type, field)
SLIST_MERGESORT_PROTOTYPE_STATIC(name, type, field)

These macros generate the merge sort function prototypes, where:
 - 'name' is the prefix prepended to the function name;
 - 'type' is the struct type that we are using;
 - 'field' is the data structure pointer to the next entry;


SLIST_MERGESORT_GENERATE(name, type, field, cmp)
SLIST_MERGESORT_GENERATE_STATIC(name, type, field, cmp)

These macros generate the merge sort functions, where:
 - 'name' is the prefix prepended to the function name;
 - 'type' is the struct type that we are using;
 - 'field' is the data structure pointer to the next entry;
 - 'cmp' is the compare function that determines where to move the items
('< 0' means move left, '> 0' means move right and '0' means the items
are equal);
NOTE: the compare function MUST return a signed value. The exact type
does not matter, but returning an integer (int) is recommended.
The parameters must be 'struct type *' or 'void *'.


SLIST_MERGESORT(name, head)

This macro calls the merge sort function with the name prefix 'name' and
returns a pointer to the list head (struct type *); if the list is empty,
NULL is returned instead.
 - 'name' is the prefix prepended to the function name;
 - 'head' is a pointer to the list head created by SLIST_HEAD();


It is possible to generate multiple sort functions using the 'name'
parameter illustrated by 'NHEAD' in the sample in this mail.

Usage example:
 SAMPLE START
#include <sys/queue.h>

#include <err.h>
#include <stdio.h>
#include <stdlib.h>

struct number {
	int n_number;
	SLIST_ENTRY(number) n_entry;
};
static SLIST_HEAD(NHEAD, number) nhead = SLIST_HEAD_INITIALIZER(nhead);

static int
number_cmp(struct number *n, struct number *nn)
{
	if (n->n_number > nn->n_number)
		return (1);
	else if (n->n_number < nn->n_number)
		return (-1);
	return (0);
}

SLIST_MERGESORT_PROTOTYPE_STATIC(NHEAD, number, n_entry)
SLIST_MERGESORT_GENERATE_STATIC(NHEAD, number, n_entry, number_cmp)

int
main(int argc, char *argv[])
{
	struct	number *n;
	int i;

	for (i = 1; i < 500; i++) {
		n = calloc(1, sizeof(*n));
		if (n == NULL)
			err(1, "calloc");

		n->n_number = i;
		SLIST_INSERT_HEAD(&nhead, n, n_entry);
	}

	printf("List:\n");
	SLIST_FOREACH(n, &nhead, n_entry)
		printf(" %d", n->n_number);
	printf("\n");

	SLIST_MERGESORT(NHEAD, &nhead);

	printf("List sorted:\n");
	SLIST_FOREACH(n, &nhead, n_entry)
		printf(" %d", n->n_number);
	printf("\n");

	exit(EXIT_SUCCESS);
}
 SAMPLE END


Index: queue.h
===
RCS file: /cvs/src/sys/sys/queue.h,v
retrieving revision 1.38
diff -u -p -r1.38 queue.h
--- queue.h 3 Jul 2013 15:05:21 -   1.38
+++ queue.h 26 Nov 2013 01:16:44 -
@@ -161,6 +161,57 @@ struct {   
\
}   \
 } while (0)
 
+
+#define SLIST_MERGESORT_PROTOTYPE(name, type, field)   \
+   SLIST_MERGESORT_PROTOTYPE_INTERNAL(name, type, field,)
+#define SLIST_MERGESORT_PROTOTYPE_STATIC(name, type, field)\
+   SLIST_MERGESORT_PROTOTYPE_INTERNAL(name, type, field, 
__attribute__((__unused__)) static)
+#define SLIST_MERGESORT_PROTOTYPE_INTERNAL(name, type, field, attr)\
+   attr struct type *name##_MERGE(struct type *, struct type *);   \
+   attr struct type *name##_MERGESORT(struct type *);
+
+#define SLIST_MERGESORT_GENERATE(name, type, field, cmp)   \
+   SLIST_MERGESORT_GENERATE_INTERNAL(name, type, field, cmp,)
+#define SLIST_MERGESORT_GENERATE_STATIC(name, type, field, cmp)
\
+   SLIST_MERGESORT_GENERATE_INTERNAL(name, type, field, cmp, 
__attribute__((__unused__)) static)
+#define SLIST_MERGESORT_GENERATE_INTERNAL(name, type, field, cmp, attr)
\
+   attr struct type *  \
+   name##_MERGE(struct type *s, struct type *e) {  \
+   struct type head;   \
+	struct type *a = &head;					\
+	while ((s != NULL) && (e != NULL)) {			\
+		if ((cmp)(s, e) < 0) {				\
+			a->field.sle_next = s;			\
+			a = s;					\
+   

Re: [RFC4448] wire(4) patch v1

2012-08-26 Thread Rafael Zalamena
On Wed, Aug 22, 2012 at 12:20:19AM -0300, Rafael Zalamena wrote:
> I've being working on a project that involves MPLS and OpenBSD, I
> recently started coding and I would really love to get some
> comments/advices about it.
>
> This thread will contain code about the wire(4) pseudo device (RFC4448),
> it will be used in the future by ldpd(8) or ifconfig(8) to configure
> pseudowires (RFC4447). This device will allow us to develop VPLS support
> in LDPd later (RFC4762).
>
> Since I'm quite new to this expect rough edges.
>

Updated v1 patch:
* Cleaned up code - removed code in if_output() and if_start() that
  didn't make sense.
* Setting and getting labels works (fixed setlabel)
* Moved the wire_input() call to its proper place in ether_input()
* Creating a wire and associating it with an ethernet + mpe interface
  works - needs a custom program that calls the appropriate ioctl()


--- /dev/null   Mon Aug 27 00:22:40 2012
+++ net/if_mpe.hTue Aug 21 23:50:31 2012
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2012 Rafael F. Zalamena rzalam...@gmail.com
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+struct mpe_softc {
+	struct ifnet		sc_if;		/* the interface */
+	int			sc_unit;
+	struct shim_hdr		sc_shim;
+	LIST_ENTRY(mpe_softc)	sc_list;
+};
+
+int mpe_newlabel(struct ifnet *, int, struct shim_hdr *);
--- /dev/null   Mon Aug 27 00:30:25 2012
+++ net/if_wire.c   Mon Aug 27 00:29:49 2012
@@ -0,0 +1,418 @@
+/*
+ * Copyright (c) 2012 Rafael F. Zalamena rzalam...@gmail.com
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "bpfilter.h"
+#include "vlan.h"
+
+#include <sys/param.h>
+#include <sys/proc.h>
+#include <sys/systm.h>
+#include <sys/mbuf.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+#include <sys/errno.h>
+#include <sys/kernel.h>
+#include <machine/cpu.h>
+
+#include <net/if.h>
+#include <net/if_types.h>
+#include <net/if_llc.h>
+#include <net/route.h>
+#include <net/netisr.h>
+
+#ifdef INET
+#include <netinet/in.h>
+#include <netinet/in_systm.h>
+#include <netinet/in_var.h>
+#include <netinet/ip.h>
+#include <netinet/ip_var.h>
+#include <netinet/if_ether.h>
+#endif
+
+#if NBPFILTER > 0
+#include <net/bpf.h>
+#endif
+
+#include <netmpls/mpls.h>
+
+#include <net/if_mpe.h>
+#include <net/if_wire.h>
+
+#ifdef WIRE_DEBUG
+#define	DPRINTF(fmt, args...)	printf("wire: " fmt "\n", ## args)
+#else
+#define	DPRINTF(fmt, args...)
+#endif
+
+
+void	wireattach(int);
+int	wire_clone_create(struct if_clone *, int);
+int	wire_clone_destroy(struct ifnet *);
+int	wire_ioctl(struct ifnet *, u_long, caddr_t);
+void	wire_init(struct wire_softc *);
+void	wire_stop(struct wire_softc *);
+int	wire_output(struct ifnet *, struct mbuf *,
+	    struct sockaddr *, struct rtentry *);
+void	wire_start(struct ifnet *);
+void	wire_input(struct mbuf *, struct ether_header *,
+	    struct wire_softc *);
+
+LIST_HEAD(, wire_softc) wire_list;
+
+struct if_clone

Re: cwm tiling

2012-06-09 Thread Rafael Zalamena
On Sat, Jun 9, 2012 at 9:53 AM, Weldon Goree wel...@b.rontosaur.us wrote:
> On Sat, 2012-06-09 at 14:26 +0300, Paul Irofti wrote:
>> I agree completley with you. Being able to tile just a given virtual
>> desktop and leave the others intact would be pretty awesome.
>
> Except they aren't desktops. Desktops are exclusively selected, cwm
> groups aren't (necessarily), and have a z index based on last selection.
> How (eg) xmonad handles tags is a better basis for multi-desktop tiling
> than how cwm handles groups.
>
> Weldon


+1, but I would never want tiling to be the default behavior.

2 reasons:
1 - The author didn't intend it, and it might make old or current users
feel like it's not the same cwm.
2 - In my personal experience the windows that I want tiled are never
closed, so I like setting the tiling property manually.
The current diff that implements tiling meets both conditions, so I
like it!



ldpctl(8) fix invalid uptime

2012-05-10 Thread Rafael Zalamena
This patch fixes the invalid uptime shown for interfaces that are not
active (no link). When ldpd is running on an interface with no link,
ldpctl shows an invalid uptime value.

Steps to reproduce:
1 - Configure ldpd on an interface without link
2 - Start ldpd
3 - Run 'ldpctl show interfaces'

Bugged result:
Interface   Address            State  Linkstate  Uptime
re0         192.168.1.40/24    ACTIVE no carrier 2210w1d0

Expected result:
Interface   Address            State  Linkstate  Uptime
re0         192.168.1.40/24    ACTIVE no carrier 00:00:00


Index: interface.c
===
RCS file: /cvs/src/usr.sbin/ldpd/interface.c,v
retrieving revision 1.8
diff -u -p -r1.8 interface.c
--- interface.c 4 Jul 2011 04:34:14 -   1.8
+++ interface.c 11 May 2012 02:52:26 -
@@ -298,7 +298,8 @@ if_to_ctl(struct iface *iface)
} else
ictl.hello_timer = -1;
 
-	if (iface->state != IF_STA_DOWN) {
+	if (iface->state != IF_STA_DOWN &&
+	    iface->uptime != 0) {
 		ictl.uptime = now.tv_sec - iface->uptime;
 	} else
 		ictl.uptime = 0;