bpe(4): 802.1Q Provider Backbone Bridge edge

2018-12-18 Thread David Gwynne
IEEE 802.1ah-2008 introduced Provider Backbone Bridges, aka mac-in-mac
support. This was adopted as part of 802.1Q-2011.

It basically provides Ethernet-over-Ethernet overlay networking. Unlike
vlan and svlan, the entire Ethernet packet is encapsulated in another
one. The motivation for this is to avoid the need for intermediate
switches to learn all the "customer" MAC addresses; they just need to
know about the PBB endpoints.

However, like vlan it does have a concept of a vnetid, and has the
ability to store the packet priority. The vnetid is 24 bits and doesn't
appear to have any reserved values.
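For concreteness, the 24-bit vnetid travels in the 802.1Q I-TAG control word alongside the priority bits. A rough sketch of the packing (assuming the usual I-PCP/I-DEI/I-SID layout; treat the exact bit positions as an illustration, not a reference):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of an I-TAG control word: 3-bit priority (I-PCP), 1-bit drop
 * eligible (I-DEI), and a 24-bit service instance id (I-SID).  The
 * I-SID is the 24-bit "vnetid" discussed above.
 */
#define ITAG_PCP_SHIFT	29
#define ITAG_DEI_SHIFT	28
#define ITAG_ISID_MASK	0x00ffffffU

static uint32_t
itag_pack(uint8_t pcp, uint8_t dei, uint32_t isid)
{
	return ((uint32_t)(pcp & 0x7) << ITAG_PCP_SHIFT) |
	    ((uint32_t)(dei & 0x1) << ITAG_DEI_SHIFT) |
	    (isid & ITAG_ISID_MASK);
}

static uint32_t
itag_isid(uint32_t itag)
{
	return itag & ITAG_ISID_MASK;
}
```

Since the I-SID occupies the full low 24 bits, there is no room for (and apparently no need of) reserved values.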

Thoughts?

Index: share/man/man4/Makefile
===
RCS file: /cvs/src/share/man/man4/Makefile,v
retrieving revision 1.697
diff -u -p -r1.697 Makefile
--- share/man/man4/Makefile 23 Nov 2018 12:38:44 -  1.697
+++ share/man/man4/Makefile 19 Dec 2018 04:16:07 -
@@ -15,7 +15,7 @@ MAN=  aac.4 ac97.4 acphy.4 acrtc.4 \
auacer.4 audio.4 aue.4 auglx.4 auich.4 auixp.4 autri.4 auvia.4 \
axe.4 axen.4 axppmic.4 azalia.4 \
bce.4 bcmaux.4 bcmdog.4  bcmrng.4 bcmtemp.4 berkwdt.4 bge.4 \
-   bgw.4 bio.4 bktr.4 bmtphy.4 bnx.4 bnxt.4 \
+   bgw.4 bio.4 bpe.4 bktr.4 bmtphy.4 bnx.4 bnxt.4 \
boca.4 bpf.4 brgphy.4 bridge.4 brswphy.4 bwfm.4 bwi.4 bytgpio.4 \
cac.4 cas.4 cardbus.4 carp.4 ccp.4 ccpmic.4 cd.4 cdce.4 cfxga.4 \
ch.4 chvgpio.4 ciphy.4 ciss.4 clcs.4 clct.4 cmpci.4 \
Index: share/man/man4/bpe.4
===
RCS file: share/man/man4/bpe.4
diff -N share/man/man4/bpe.4
--- /dev/null   1 Jan 1970 00:00:00 -
+++ share/man/man4/bpe.419 Dec 2018 04:16:07 -
@@ -0,0 +1,159 @@
+.\" $OpenBSD$
+.\"
+.\" Copyright (c) 2018 David Gwynne 
+.\"
+.\" Permission to use, copy, modify, and distribute this software for any
+.\" purpose with or without fee is hereby granted, provided that the above
+.\" copyright notice and this permission notice appear in all copies.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+.\"
+.Dd $Mdocdate: November 16 2018 $
+.Dt BPE 4
+.Os
+.Sh NAME
+.Nm bpe
+.Nd Backbone Provider Edge pseudo-device
+.Sh SYNOPSIS
+.Cd "pseudo-device bpe"
+.Sh DESCRIPTION
+The
+.Nm bpe
+driver allows construction of IEEE 802.1Q Provider Backbone Bridge
+(PBB) networks by acting as a Backbone Edge Bridge (BEB).
+PBB, also known as mac-in-mac, was originally specified in
+IEEE 802.1ah-2008 and became part of IEEE 802.1Q-2011.
+.Pp
+A Provider Backbone Bridge Network (PBBN) consists of BEBs
+interconnected by Backbone Core Bridges (BCBs) to form an Ethernet
+network for the transport of encapsulated Ethernet packets.
+Where VLAN and SVLAN protocols add a shim to differentiate Ethernet
+packets for different networks but retain the Ethernet addresses
+of encapsulated traffic, PBB completely encapsulates Ethernet packets
+for transmission between BEBs on a PBBN.
+This removes the need for intermediate BCB devices on the backbone
+network to learn the Ethernet addresses of devices on the encapsulated
+network, but requires each BEB to maintain a mapping of addresses
+on the encapsulated network to peer BEBs.
+.Pp
+A PBB packet consists of an outer Ethernet frame containing the
+Ethernet addresses of BEBs and the PBB Ethernet protocol type
+(0x88e7), a 32-bit Backbone Service Instance Tag (I-TAG), followed
+by the encapsulated Ethernet frame.
+The I-TAG contains a 24-bit Backbone Service Instance Identifier
+(I-SID) to differentiate different PBBNs on the same backbone network.
+.Pp
+IEEE 802.1Q describes Customer VLANs being encapsulated by PBB,
+which in turn uses an S-VLAN service.
+This can be implemented with
+.Xr vlan 4
+using a
+.Nm bpe
+interface as the parent,
+and with the
+.Nm bpe
+interface using
+.Xr svlan 4
+as the parent.
+.Nm bpe
+itself does not require this topology, allowing flexible deployment
+and network designs.
+.Pp
+The
+.Nm bpe
+driver implements a learning bridge on each interface.
+The driver will learn the mapping of BEBs to encapsulated Ethernet
+addresses based on traffic received from other devices on the backbone
+network.
+Traffic sent to broadcast, multicast, or unknown unicast Ethernet
+addresses will be flooded to a multicast address on the backbone network.
+The multicast address used for each PBB Service Instance
+will begin with 01:1e:83 as the first three octets, with the I-SID
+as the last three octets, e.g., a
+.Nm bpe
+interface 
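The flood-group derivation the manual page text describes — 01:1e:83 as the first three octets, the I-SID as the last three — can be sketched as follows (illustrative only, not the driver's actual code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Derive the backbone multicast address used to flood broadcast,
 * multicast, and unknown-unicast traffic for a given PBB Service
 * Instance: fixed 01:1e:83 prefix, then the 24-bit I-SID.
 */
static void
bpe_flood_group(uint32_t isid, uint8_t dst[6])
{
	dst[0] = 0x01;
	dst[1] = 0x1e;
	dst[2] = 0x83;
	dst[3] = (isid >> 16) & 0xff;
	dst[4] = (isid >> 8) & 0xff;
	dst[5] = isid & 0xff;
}
```

Because the group address embeds the I-SID, BEBs serving different service instances never receive each other's flooded traffic.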

Re: Please test: HZ bump

2018-12-18 Thread Ian Sutton
On Mon, Aug 14, 2017 at 3:07 PM Martin Pieuchot  wrote:
>
> I'd like to improve the fairness of the scheduler, with the goal of
> mitigating userland starvations.  For that the kernel needs to have
> a better understanding of the amount of executed time per task.
>
> The smallest interval currently usable on all our architectures for
> such accounting is a tick.  With the current HZ value of 100, this
> smallest interval is 10ms.  I'd like to bump this value to 1000.
>
> The diff below intentionally bump other `hz' value to keep current
> ratios.  We certainly want to call schedclock(), or a similar time
> accounting function, at a higher frequency than 16 Hz.  However this
> will be part of a later diff.
>
> I'd be really interested in test reports.  mlarkin@ raised a good
> question: is your battery lifetime shorter with this diff?
>
> Comments, oks?
>

I'd like to revisit this patch. It makes our armv7 platform more
usable for what it is meant to do, i.e. be a microcontroller. I
imagine on other platforms it would accrue similar benefits as well.

I've tested this patch and found delightfully proportional results.
Currently, at HZ = 100, the minimum latency for a sleep call from
userspace is about 10ms:

https://ce.gl/baseline.jpg

After the patch, which bumps HZ from 100 --> 1000, we see a tenfold
decrease in this latency:

https://ce.gl/with-mpi-hz-patch.jpg

This signal is generated with gpio(4) ioctl calls from userspace,
e.g.: for(;;) { HI(pin); usleep(1); LO(pin); usleep(1); }

I'd like to see more folks test this and other devs share their
thoughts: What are the risks associated with bumping HZ globally?
Drawbacks? Reasons for hesitation?
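For scale: the kernel derives the tick length directly from HZ (tick = 1000000 / HZ in conf/param.c), so the tenfold latency drop shown above follows arithmetically — a trivial illustration:

```c
#include <assert.h>

/*
 * tick is the number of microseconds between clock interrupts,
 * 1000000 / HZ.  HZ=100 gives 10ms granularity for sleeps and
 * scheduling; HZ=1000 gives 1ms.
 */
static long
tick_us(int hz)
{
	return 1000000L / hz;
}
```

This is also why usleep(1) can never return in less than one tick: the request is rounded up to the next clock interrupt.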

Thanks,
Ian Sutton



> Index: conf/param.c
> ===
> RCS file: /cvs/src/sys/conf/param.c,v
> retrieving revision 1.37
> diff -u -p -r1.37 param.c
> --- conf/param.c6 May 2016 19:45:35 -   1.37
> +++ conf/param.c14 Aug 2017 17:03:23 -
> @@ -76,7 +76,7 @@
>  # define DST 0
>  #endif
>  #ifndef HZ
> -#defineHZ 100
> +#defineHZ 1000
>  #endif
>  int hz = HZ;
>  int tick = 1000000 / HZ;
> Index: kern/kern_clock.c
> ===
> RCS file: /cvs/src/sys/kern/kern_clock.c,v
> retrieving revision 1.93
> diff -u -p -r1.93 kern_clock.c
> --- kern/kern_clock.c   22 Jul 2017 14:33:45 -  1.93
> +++ kern/kern_clock.c   14 Aug 2017 19:50:49 -
> @@ -406,12 +406,11 @@ statclock(struct clockframe *frame)
> if (p != NULL) {
> p->p_cpticks++;
> /*
> -* If no schedclock is provided, call it here at ~~12-25 Hz;
> +* If no schedclock is provided, call it here;
>  * ~~16 Hz is best
>  */
> if (schedhz == 0) {
> -   if ((++curcpu()->ci_schedstate.spc_schedticks & 3) ==
> -   0)
> +   if ((spc->spc_schedticks & 0x3f) == 0)
> schedclock(p);
> }
> }
> Index: arch/amd64/isa/clock.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/isa/clock.c,v
> retrieving revision 1.25
> diff -u -p -r1.25 clock.c
> --- arch/amd64/isa/clock.c  11 Aug 2017 21:18:11 -  1.25
> +++ arch/amd64/isa/clock.c  14 Aug 2017 17:19:35 -
> @@ -303,8 +303,8 @@ rtcdrain(void *v)
>  void
>  i8254_initclocks(void)
>  {
> -   stathz = 128;
> -   profhz = 1024;
> +   stathz = 1024;
> +   profhz = 8192;
>
> isa_intr_establish(NULL, 0, IST_PULSE, IPL_CLOCK, clockintr,
> 0, "clock");
> @@ -321,7 +321,7 @@ rtcstart(void)
>  {
> static struct timeout rtcdrain_timeout;
>
> -   mc146818_write(NULL, MC_REGA, MC_BASE_32_KHz | MC_RATE_128_Hz);
> +   mc146818_write(NULL, MC_REGA, MC_BASE_32_KHz | MC_RATE_1024_Hz);
> mc146818_write(NULL, MC_REGB, MC_REGB_24HR | MC_REGB_PIE);
>
> /*
> @@ -577,10 +577,10 @@ setstatclockrate(int arg)
> if (initclock_func == i8254_initclocks) {
> if (arg == stathz)
> mc146818_write(NULL, MC_REGA,
> -   MC_BASE_32_KHz | MC_RATE_128_Hz);
> +   MC_BASE_32_KHz | MC_RATE_1024_Hz);
> else
> mc146818_write(NULL, MC_REGA,
> -   MC_BASE_32_KHz | MC_RATE_1024_Hz);
> +   MC_BASE_32_KHz | MC_RATE_8192_Hz);
> }
>  }
>
> Index: arch/armv7/omap/dmtimer.c
> ===
> RCS file: /cvs/src/sys/arch/armv7/omap/dmtimer.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 dmtimer.c
> --- arch/armv7/omap/dmtimer.c   22 Jan 2015 14:33:01 -  1.6
> +++ arch/armv7/omap/dmtimer.c   14 Aug 2017 17:16:01 -
> @@ -296,8 +296,8 @@ 

smtpd: update table api

2018-12-18 Thread Eric Faurot
Hi.

This diff changes the internal table interface.  The backends now
return results as formatted strings; parsing is delegated to the upper
layer.

It's been lightly tested already, but more tests would be very welcome,
especially with setups involving lots of tables (including external ones).

Eric.

Index: smtpd.h
===
RCS file: /cvs/src/usr.sbin/smtpd/smtpd.h,v
retrieving revision 1.594
diff -u -p -r1.594 smtpd.h
--- smtpd.h 13 Dec 2018 17:08:10 -  1.594
+++ smtpd.h 17 Dec 2018 16:33:09 -
@@ -375,8 +375,8 @@ struct table_backend {
void   *(*open)(struct table *);
int (*update)(struct table *);
void(*close)(void *);
-   int (*lookup)(void *, struct dict *, const char *, enum 
table_service, union lookup *);
-   int (*fetch)(void *, struct dict *, enum table_service, union 
lookup *);
+   int (*lookup)(void *, struct dict *, const char *, enum 
table_service, char **);
+   int (*fetch)(void *, struct dict *, enum table_service, char **);
 };
 
 
@@ -1601,8 +1601,6 @@ int table_regex_match(const char *, cons
 void   table_open_all(struct smtpd *);
 void   table_dump_all(struct smtpd *);
 void   table_close_all(struct smtpd *);
-int table_parse_lookup(enum table_service, const char *, const char *,
-union lookup *);
 
 
 /* to.c */
Index: table.c
===
RCS file: /cvs/src/usr.sbin/smtpd/table.c,v
retrieving revision 1.32
diff -u -p -r1.32 table.c
--- table.c 2 Nov 2018 13:45:59 -   1.32
+++ table.c 17 Dec 2018 16:09:02 -
@@ -53,6 +53,8 @@ extern struct table_backend table_backen
 static const char * table_service_name(enum table_service);
 static const char * table_backend_name(struct table_backend *);
 static const char * table_dump_lookup(enum table_service, union lookup *);
+static int table_parse_lookup(enum table_service, const char *, const char *,
+union lookup *);
 static int parse_sockaddr(struct sockaddr *, int, const char *);
 
 static unsigned int last_table_id = 0;
@@ -125,7 +127,7 @@ table_lookup(struct table *table, struct
 union lookup *lk)
 {
int r;
-   charlkey[1024];
+   charlkey[1024], *buf = NULL;
 
if (table->t_backend->lookup == NULL)
return (-1);
@@ -135,9 +137,9 @@ table_lookup(struct table *table, struct
return -1;
}
 
-   r = table->t_backend->lookup(table->t_handle, params, lkey, kind, lk);
+   r = table->t_backend->lookup(table->t_handle, params, lkey, kind, lk ? &buf : NULL);
 
-   if (r == 1)
+   if (r == 1) {
log_trace(TRACE_LOOKUP, "lookup: %s \"%s\" as %s in table %s:%s 
-> %s%s%s",
lk ? "lookup" : "check",
lkey,
@@ -145,8 +147,11 @@ table_lookup(struct table *table, struct
table_backend_name(table->t_backend),
table->t_name,
lk ? "\"" : "",
-   (lk) ? table_dump_lookup(kind, lk): "found",
+   (lk) ? buf : "found",
lk ? "\"" : "");
+   if (buf)
+   r = table_parse_lookup(kind, lkey, buf, lk);
+   }
else
log_trace(TRACE_LOOKUP, "lookup: %s \"%s\" as %s in table %s:%s 
-> %d",
lk ? "lookup" : "check",
@@ -156,6 +161,8 @@ table_lookup(struct table *table, struct
table->t_name,
r);
 
+   free(buf);
+
return (r);
 }
 
@@ -163,20 +170,24 @@ int
 table_fetch(struct table *table, struct dict *params, enum table_service kind, 
union lookup *lk)
 {
int r;
+   char*buf = NULL;
 
if (table->t_backend->fetch == NULL)
return (-1);
 
-   r = table->t_backend->fetch(table->t_handle, params, kind, lk);
+   r = table->t_backend->fetch(table->t_handle, params, kind, lk ? &buf : NULL);
 
-   if (r == 1)
+   if (r == 1) {
log_trace(TRACE_LOOKUP, "lookup: fetch %s from table %s:%s -> 
%s%s%s",
table_service_name(kind),
table_backend_name(table->t_backend),
table->t_name,
lk ? "\"" : "",
-   (lk) ? table_dump_lookup(kind, lk): "found",
+   (lk) ? buf : "found",
lk ? "\"" : "");
+   if (buf)
+   r = table_parse_lookup(kind, NULL, buf, lk);
+   }
else
log_trace(TRACE_LOOKUP, "lookup: fetch %s from table %s:%s -> 
%d",
table_service_name(kind),
@@ -184,6 +195,8 @@ table_fetch(struct table *table, struct 
table->t_name,
r);
 
+   free(buf);
+
return (r);
 }
 
@@ -535,7 +548,7 @@ table_close_all(struct smtpd *conf)
table_close(t);
 }
 

Re: Patch for install64.octeon : EdgeRouter 6 info

2018-12-18 Thread Visa Hankala
On Mon, Dec 17, 2018 at 11:22:40PM -0500, Chris McGee wrote:
> Hi:
> 
>   I would like to add some info for Edgerouter 6
> (and presumably ER4, and maybe also ER12?) to install64.octeon.
> The document is great but it won't get a new user booting on the new
> 4-core machines with MMC drives.
> 
> I tried to make it as brief as possible while pointing the user in the right
> direction, so for example it mentions that you're going to need to drop
> bsd.mp into the msdos kernel loader partition but doesn't explain how
> to do that. Seemed to be the right level of detail for this document.
> 
> Here is a diff with my additions. Diff is from
> /OpenBSD/6.4/INSTALL.octeon.
> 
> me@box> diff INSTALL.octeon INSTALL.octeon.er6
> 690a691,692
> > For the EdgeRouter Lite:
> >
> 702a705,710
> > For the EdgeRouter 6, installing to the internal MMC drive:
> >
> >   # setenv bootcmd 'fatload mmc 0 ${loadaddr} bsd;bootoctlinux 
> > coremask=0xf rootdev=/dev/sd0'
> >   # setenv bootdelay 5
> >   # saveenv
> >
> 707c715
> < On multi-core systems, the numcores parameter enables the secondary CPUs.
> ---
> > On multi-core systems, the numcores parameter enables multiple cores.
> 708a717,719
> > Note that this boot command does not actually put a multiprocessor kernel in
> > place; you will also need to copy the bsd.mp kernel to the octeon MS-DOS
> > partition (disklabel i by default) on your boot drive for multicore support.
> 709a721
> > Example booting from USB on the Edgerouter Lite:
> 711a724,726
> > Example booting from USB on the EdgeRouter 6:
> >   fatload usb 0 ${loadaddr} bsd; bootoctlinux rootdev=sd0 numcores=4
> >
> 716a732,736
> > If you installed from a USB stick to the MMC on an EdgeRouter 4/6/8:
> > The machine assigns sd0 to USB first if present, then to MMC if present.
> > If you leave the USB install stick in, the machine will try to boot it.
> > Removing the USB device will cause sd0 to be assigned to mmc0 next boot,
> > allowing the machine to boot your newly-installed OpenBSD drive.

Good points. However, I would like to keep the text general and avoid
listing machine specifics if possible. Does the patch below make the
text any clearer?

In principle, the installer could show an example bootoctlinux command
with the correct parameters for the system.

Future snapshots built after today should handle the copying of bsd.mp
automatically.

Index: notes/octeon/install
===
RCS file: src/distrib/notes/octeon/install,v
retrieving revision 1.16
diff -u -p -r1.16 install
--- notes/octeon/install30 Nov 2017 15:25:37 -  1.16
+++ notes/octeon/install18 Dec 2018 16:21:22 -
@@ -56,8 +56,8 @@ restore it later if needed:
 
 ${bootcmd} is run by U-Boot when ${autoload} is enabled. Now create a new
 ${bootcmd} which will load an ELF file called 'bsd' from the first active FAT
-partition on the first CF card or USB device. The FAT partition has been 
created
-by the installer.
+partition on the first CF card. The FAT partition has been created by the
+installer.
 
# setenv bootcmd 'fatload ide 0:1 ${loadaddr} bsd;bootoctlinux 
rootdev=/dev/octcf0'
# setenv bootdelay 5
@@ -71,9 +71,19 @@ by the installer.
Protected 1 sectors
#
 
-If you have installed onto USB use the following bootcmd instead:
+If you have installed onto eMMC, SATA or USB, use the following
+bootcmd instead:
 
-  fatload usb 0 ${loadaddr} bsd; bootoctlinux rootdev=sd0
+  fatload <media> 0 ${loadaddr} bsd; bootoctlinux rootdev=sd0
+
+where you replace ``<media>'' with ``mmc'', ``sata'' or ``usb''.
+
+For stable root disk selection, you can specify the disk
+by disklabel(8) UID (DUID):
+
+  fatload usb 0 ${loadaddr} bsd; bootoctlinux rootdev=<duid>
+
+where ``<duid>'' is the DUID of your root disk.
 
 On multi-core systems, the numcores parameter enables the secondary CPUs.
 Use the total number of cores on your system as the value of the parameter.



MPLSv6 2/2 : bgpd diff

2018-12-18 Thread Denis Fondras
Here is a series of diffs to enable MPLSv6, MPLS transport over IPv6.

Second diff: add support for exchanging IPv6 MPLS routes with bgpd(8).

(***)
pe1# cat /etc/hostname.mpe0
rdomain 2
mplslabel 42
inet6 2001:db8::2/128
up
(***)
pe1# cat /etc/hostname.vio0
rdomain 2
inet6 2001:db8::2 126
up
(***)
pe1# cat /etc/hostname.vio1
mpls
inet6 2001:db8:1::2 126
up
(***)
pe1# cat /etc/hostname.lo0 
inet6 2001:db8:fffe::1 128
up
(***)
pe1# cat /etc/bgpd.conf
router-id 10.0.0.2
AS 65530

rdomain 2 {
  descr "CUSTOMER1"
  rd 65530:2
  import-target rt 65530:2
  export-target rt 65530:2
  depend on mpe0
  network inet connected
  network inet6 connected
}
group "ibgp" {
  announce IPv4 vpn
  announce IPv4 unicast
  announce IPv6 vpn
  announce IPv6 unicast
  remote-as 65530
  neighbor 10.255.254.2 {
local-address 10.255.254.1
descr PE2v4
down
  }
  neighbor 2001:db8:fffe::2 {
local-address 2001:db8:fffe::1
descr PE2v6
  }
}
allow from ibgp
allow to ibgp
(***)
pe1# bgpctl sh rib  
  
flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
   S = Stale, E = Error
origin validation state: N = not-found, V = valid, ! = invalid
origin: i = IGP, e = EGP, ? = Incomplete

flags ovs destination  gateway  lpref   med aspath origin   
  
AI*>N rd 65530:2 2001:db8::/126 rd 0:0 ::  100 0 i
I*> N rd 65530:2 2001:db8:::/126 2001:db8:fffe::2100 0 i
  
(***)
pe2# tcpdump -n -i vio1 mpls
tcpdump: listening on vio1, link-type EN10MB
08:13:01.870005 MPLS(label 42, exp 0, ttl 62) 2001:db8::1 > 2001:db8:::1: 
icmp6: echo request 
08:13:01.870882 MPLS(label 26, exp 0, ttl 63) MPLS(label 42, exp 0, ttl 63) 
2001:db8:::1 > 2001:db8::1: icmp6: echo reply 
08:13:02.362564 MPLS(label 42, exp 0, ttl 62) 2001:db8::1 > 2001:db8:::1: 
icmp6: echo request 
08:13:02.363173 MPLS(label 26, exp 0, ttl 63) MPLS(label 42, exp 0, ttl 63) 
2001:db8:::1 > 2001:db8::1: icmp6: echo reply 
08:13:02.865183 MPLS(label 42, exp 0, ttl 62) 2001:db8::1 > 2001:db8:::1: 
icmp6: echo request
(***)

We can only exchange MPLS routes with the same address family as the
transport AF.

Unfortunately I don't have gear to test interoperability. It seems there are
very few implementations that support this. Does anyone have access to such
hardware?
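As an aside on the configuration above: the `rd 65530:2` strings are type-0 route distinguishers (2-byte type of zero, 2-byte AS number, 4-byte assigned number), which is what gets packed into the 64-bit `rd` member of the new struct vpn6_addr. A hedged sketch of the encoding:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a type-0 route distinguisher: the high 16 bits are the type
 * (zero), the next 16 the AS number, the low 32 the assigned number.
 * Illustrative only; bgpd has its own RD handling.
 */
static uint64_t
rd_type0(uint16_t as, uint32_t number)
{
	return ((uint64_t)as << 32) | number;
}
```

The RD only disambiguates overlapping customer prefixes between rdomains; the import-target/export-target communities control which routes each rdomain actually accepts.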

Index: bgpd/bgpd.h
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
retrieving revision 1.357
diff -u -p -r1.357 bgpd.h
--- bgpd/bgpd.h 11 Dec 2018 09:02:14 -  1.357
+++ bgpd/bgpd.h 18 Dec 2018 11:04:07 -
@@ -154,7 +154,8 @@ extern const struct aid aid_vals[];
 #defineAID_INET1
 #defineAID_INET6   2
 #defineAID_VPN_IPv43
-#defineAID_MAX 4
+#defineAID_VPN_IPv64
+#defineAID_MAX 5
 #defineAID_MIN 1   /* skip AID_UNSPEC since that is a 
dummy */
 
 #define AID_VALS   {   \
@@ -162,14 +163,16 @@ extern const struct aid aid_vals[];
{ AFI_UNSPEC, AF_UNSPEC, SAFI_NONE, "unspec"},  \
{ AFI_IPv4, AF_INET, SAFI_UNICAST, "IPv4 unicast" },\
{ AFI_IPv6, AF_INET6, SAFI_UNICAST, "IPv6 unicast" },   \
-   { AFI_IPv4, AF_INET, SAFI_MPLSVPN, "IPv4 vpn" } \
+   { AFI_IPv4, AF_INET, SAFI_MPLSVPN, "IPv4 vpn" },\
+   { AFI_IPv6, AF_INET6, SAFI_MPLSVPN, "IPv6 vpn" }\
 }
 
 #define AID_PTSIZE {   \
0,  \
sizeof(struct pt_entry4),   \
sizeof(struct pt_entry6),   \
-   sizeof(struct pt_entry_vpn4)\
+   sizeof(struct pt_entry_vpn4),   \
+   sizeof(struct pt_entry_vpn6)\
 }
 
 struct vpn4_addr {
@@ -181,6 +184,15 @@ struct vpn4_addr {
u_int8_tpad2;
 };
 
+struct vpn6_addr {
+   u_int64_t   rd;
+   struct in6_addr addr;
+   u_int8_tlabelstack[21]; /* max that makes sense */
+   u_int8_tlabellen;
+   u_int8_tpad1;
+   u_int8_tpad2;
+};
+
 #define BGP_MPLS_BOS   

MPLSv6 1/2: kernel diff

2018-12-18 Thread Denis Fondras
Here is a series of diffs to enable MPLSv6, MPLS transport over IPv6.

First diff: allow mpe(4) to handle IPv6 traffic.
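For readers following the mpeoutput() changes below: the shim it prepends is a single 32-bit word holding a 20-bit label, 3-bit EXP, a bottom-of-stack flag, and an 8-bit TTL (copied from ip6_hops when mpls_mapttl_ip is set). A host-order sketch of the packing — the kernel keeps shim_label in network byte order:

```c
#include <assert.h>
#include <stdint.h>

/*
 * MPLS label stack entry layout (RFC 3032): label in bits 31-12,
 * EXP in bits 11-9, bottom-of-stack in bit 8, TTL in bits 7-0.
 */
#define MPLS_LABEL_SHIFT	12
#define MPLS_EXP_SHIFT		9
#define MPLS_BOS_BIT		(1U << 8)

static uint32_t
mpls_shim(uint32_t label, uint8_t exp, int bos, uint8_t ttl)
{
	return ((label & 0xfffff) << MPLS_LABEL_SHIFT) |
	    ((uint32_t)(exp & 0x7) << MPLS_EXP_SHIFT) |
	    (bos ? MPLS_BOS_BIT : 0) | ttl;
}
```

The tcpdump output in the companion bgpd mail (label 42, exp 0, ttl 62) is exactly this word decoded, with an extra transport label stacked on top by the core.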

Index: net/if_ethersubr.c
===
RCS file: /cvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.255
diff -u -p -r1.255 if_ethersubr.c
--- net/if_ethersubr.c  12 Dec 2018 05:38:26 -  1.255
+++ net/if_ethersubr.c  18 Dec 2018 11:03:33 -
@@ -246,18 +246,28 @@ ether_resolve(struct ifnet *ifp, struct 
sizeof(eh->ether_dhost));
break;
 #ifdef INET6
+do_v6:
case AF_INET6:
error = nd6_resolve(ifp, rt, m, dst, eh->ether_dhost);
if (error)
return (error);
break;
 #endif
+do_v4:
case AF_INET:
-   case AF_MPLS:
error = arpresolve(ifp, rt, m, dst, eh->ether_dhost);
if (error)
return (error);
break;
+   case AF_MPLS:
+   switch (rt->rt_gateway->sa_family) {
+   case AF_INET:
+   goto do_v4;
+#ifdef INET6
+   case AF_INET6:
+   goto do_v6;
+#endif
+   }
default:
senderr(EHOSTUNREACH);
}
Index: net/if_mpe.c
===
RCS file: /cvs/src/sys/net/if_mpe.c,v
retrieving revision 1.64
diff -u -p -r1.64 if_mpe.c
--- net/if_mpe.c9 Jan 2018 15:24:24 -   1.64
+++ net/if_mpe.c18 Dec 2018 11:03:33 -
@@ -85,7 +85,7 @@ mpe_clone_create(struct if_clone *ifc, i
mpeif->sc_unit = unit;
	ifp = &mpeif->sc_if;
snprintf(ifp->if_xname, sizeof ifp->if_xname, "mpe%d", unit);
-   ifp->if_flags = IFF_POINTOPOINT;
+   ifp->if_flags = IFF_POINTOPOINT|IFF_MULTICAST;
ifp->if_xflags = IFXF_CLONED;
ifp->if_softc = mpeif;
ifp->if_mtu = MPE_MTU;
@@ -157,6 +157,16 @@ mpestart(struct ifnet *ifp0)
sizeof(in_addr_t));
m_adj(m, sizeof(in_addr_t));
break;
+#ifdef INET6
+   case AF_INET6:
+   memset(sa, 0, sizeof(struct sockaddr_in6));
+   satosin6(sa)->sin6_family = af;
+   satosin6(sa)->sin6_len = sizeof(struct sockaddr_in6);
+		bcopy(mtod(m, caddr_t), &satosin6(sa)->sin6_addr,
+   sizeof(struct in6_addr));
+   m_adj(m, sizeof(struct in6_addr));
+   break;
+#endif
default:
m_freem(m);
continue;
@@ -204,6 +214,9 @@ mpeoutput(struct ifnet *ifp, struct mbuf
int error;
int off;
in_addr_t   addr;
+#ifdef INET6
+   struct in6_addr addr6;
+#endif
u_int8_top = 0;
 
 #ifdef DIAGNOSTIC
@@ -251,6 +264,39 @@ mpeoutput(struct ifnet *ifp, struct mbuf
m_copyback(m, sizeof(sa_family_t), sizeof(in_addr_t),
	    &addr, M_NOWAIT);
break;
+#ifdef INET6
+   case AF_INET6:
+   if (!rt || !(rt->rt_flags & RTF_MPLS)) {
+   m_freem(m);
+   error = ENETUNREACH;
+   goto out;
+   }
+   shim.shim_label = ((struct rt_mpls *)rt->rt_llinfo)->mpls_label;
+   shim.shim_label |= MPLS_BOS_MASK;
+   op =  ((struct rt_mpls *)rt->rt_llinfo)->mpls_operation;
+   if (op != MPLS_OP_PUSH) {
+   m_freem(m);
+   error = ENETUNREACH;
+   goto out;
+   }
+   if (mpls_mapttl_ip) {
+   struct ip6_hdr  *ip6;
+   ip6 = mtod(m, struct ip6_hdr *);
+   shim.shim_label |= htonl(ip6->ip6_hops) & MPLS_TTL_MASK;
+   } else
+   shim.shim_label |= htonl(mpls_defttl) & MPLS_TTL_MASK;
+   off = sizeof(sa_family_t) + sizeof(struct in6_addr);
+   M_PREPEND(m, sizeof(shim) + off, M_DONTWAIT);
+   if (m == NULL) {
+   error = ENOBUFS;
+   goto out;
+   }
+   *mtod(m, sa_family_t *) = AF_INET6;
+   addr6 = satosin6(rt->rt_gateway)->sin6_addr;
+   m_copyback(m, sizeof(sa_family_t), sizeof(struct in6_addr),
+	    &addr6, M_NOWAIT);
+   break;
+#endif
default:
m_freem(m);
error = EPFNOSUPPORT;
@@ -354,6 +400,9 @@ mpeioctl(struct ifnet *ifp, u_long cmd, 
}
/* return with ENOTTY so that the 

Re: ospf6d reports no buffer space available on vmx interface

2018-12-18 Thread Stuart Henderson
On 2018/12/18 11:34, Arnaud BRAND wrote:
> Hi,
> 
> I'm running 6.4 stable, with latest syspatches.
> 
> I saw ospf6d reporting this in the logs
> Dec 18 08:18:10 obsd64-ic1 ospf6d[68658]: send_packet: error sending packet
> on interface vmx1: No buffer space available
> 
> Searching the web, I gathered that netstat -m might shed some light, so I
> proceeded :
> obsd64-ic1# netstat -m
> 610 mbufs in use:
> 543 mbufs allocated to data
> 8 mbufs allocated to packet headers
> 59 mbufs allocated to socket names and addresses
> 13/200 mbuf 2048 byte clusters in use (current/peak)
> 0/30 mbuf 2112 byte clusters in use (current/peak)
> 1/56 mbuf 4096 byte clusters in use (current/peak)
> 0/48 mbuf 8192 byte clusters in use (current/peak)
> 475/2170 mbuf 9216 byte clusters in use (current/peak)
> 0/0 mbuf 12288 byte clusters in use (current/peak)
> 0/0 mbuf 16384 byte clusters in use (current/peak)
> 0/0 mbuf 65536 byte clusters in use (current/peak)
> 10196/23304/524288 Kbytes allocated to network (current/peak/max)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
> 
> So if there were no requests denied or delayed and the peak was only 24MB
> out of 512MB max, what could cause ospf6d to complain ?
> Should I be worried about this message ?
> 
> Looking at the sendto man page I get that it can return ENOBUFS in two cases
> :
> Case 1 - The system was unable to allocate an internal buffer
> -> this seems to not be the case as shown above
> 
> This leaves only case 2 : The output queue for a network interface was full.
> 
> Looking at netstat -id I see drops on vmx1 and vmx3.
> Both of these cards are VMXNET3 cards connected to the different
> VLANs/Portgroups on the same vswitch which has two 10G uplinks to the
> switches.
> 
> sysctl | grep drops shows
> net.inet.ip.ifq.drops=0
> net.inet6.ip6.ifq.drops=0
> net.pipex.inq.drops=0
> net.pipex.outq.drops=0
> 
> I'm out of ideas for places where to look next.
> Please, could a network guru provide some insight/help ?
> Or just tell me that it's not worth bothering and I should stop here ?
> 
> Thanks for your help and have a nice day !
> Arnaud

It may be worth trying e1000/em(4). I had quite frequent panics with
vmx(4) (https://marc.info/?l=openbsd-bugs=2=1=vmxnet3_getbuf=b);
the same VM has been totally stable since switching to em(4).




Re: request for testing: patch for boot loader out of mem

2018-12-18 Thread Otto Moerbeek
On Mon, Dec 17, 2018 at 10:53:26PM +0100, diego righi wrote:

> Tested also today's snapshot (2018/12/17) on the same hardware but with a
> 500GB disk with a single big "a" partition; it works:

The diff has been committed, thanks for testing.

-Otto



Re: Patch for install64.octeon : EdgeRouter 6 info

2018-12-18 Thread Stefan Sperling
On Mon, Dec 17, 2018 at 11:22:40PM -0500, Chris McGee wrote:
> so for example it mentions that you're going to need to drop
> bsd.mp into the msdos kernel loader partition but doesn't explain how
> to do that. Seemed to be the right level of detail for this document.

The installer is supposed to put the right kernel on the FAT partition,
isn't it? If that doesn't work on these machines the install script
should be fixed. I don't see harm in updating the docs but we shouldn't
be documenting broken behaviour.

> me@box> diff INSTALL.octeon INSTALL.octeon.er6
> 690a691,692
> > For the EdgeRouter Lite:

Please send patches in unidiff format (diff -u).