[PATCH] usr.sbin/rpki-client: remove -f (force) option

2020-06-30 Thread Job Snijders
Remove rpki-client's -f command line option

I haven't come across a use case that requires tricking the software
into accepting out-of-date manifests. Anyone using -f? I think this is a
leftover from the initial debugging era.

OK?

Index: extern.h
===
RCS file: /cvs/src/usr.sbin/rpki-client/extern.h,v
retrieving revision 1.30
diff -u -p -r1.30 extern.h
--- extern.h24 Jun 2020 14:39:21 -  1.30
+++ extern.h30 Jun 2020 10:21:04 -
@@ -289,7 +289,7 @@ struct cert *cert_read(int);
 
 voidmft_buffer(char **, size_t *, size_t *, const struct mft *);
 voidmft_free(struct mft *);
-struct mft *mft_parse(X509 **, const char *, int);
+struct mft *mft_parse(X509 **, const char *);
 int mft_check(const char *, struct mft *);
 struct mft *mft_read(int);
 
Index: main.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v
retrieving revision 1.71
diff -u -p -r1.71 main.c
--- main.c  24 Jun 2020 14:39:21 -  1.71
+++ main.c  30 Jun 2020 10:21:05 -
@@ -148,7 +148,7 @@ struct filepath_tree  fpt = RB_INITIALIZ
 /*
  * Mark that our subprocesses will never return.
  */
-static voidproc_parser(int, int) __attribute__((noreturn));
+static voidproc_parser(int) __attribute__((noreturn));
 static voidproc_rsync(char *, char *, int, int)
__attribute__((noreturn));
 static voidbuild_chain(const struct auth *, STACK_OF(X509) **);
@@ -892,8 +892,8 @@ proc_parser_roa(struct entity *entp,
  * Return the mft on success or NULL on failure.
  */
 static struct mft *
-proc_parser_mft(struct entity *entp, int force, X509_STORE *store,
-X509_STORE_CTX *ctx, struct auth_tree *auths, struct crl_tree *crlt)
+proc_parser_mft(struct entity *entp, X509_STORE *store, X509_STORE_CTX *ctx,
+   struct auth_tree *auths, struct crl_tree *crlt)
 {
struct mft  *mft;
X509*x509;
@@ -902,7 +902,7 @@ proc_parser_mft(struct entity *entp, int
STACK_OF(X509)  *chain;
 
assert(!entp->has_dgst);
-   if ((mft = mft_parse(, entp->uri, force)) == NULL)
+   if ((mft = mft_parse(, entp->uri)) == NULL)
return NULL;
 
a = valid_ski_aki(entp->uri, auths, mft->ski, mft->aki);
@@ -1127,7 +1127,7 @@ build_crls(const struct auth *a, struct 
  * The process will exit cleanly only when fd is closed.
  */
 static void
-proc_parser(int fd, int force)
+proc_parser(int fd)
 {
struct tal  *tal;
struct cert *cert;
@@ -1249,8 +1249,7 @@ proc_parser(int fd, int force)
 */
break;
case RTYPE_MFT:
-   mft = proc_parser_mft(entp, force,
-   store, ctx, , );
+   mft = proc_parser_mft(entp, store, ctx, , );
c = (mft != NULL);
io_simple_buffer(, , , , sizeof(int));
if (mft != NULL)
@@ -1500,8 +1499,7 @@ int
 main(int argc, char *argv[])
 {
int  rc = 1, c, proc, st, rsync,
-fl = SOCK_STREAM | SOCK_CLOEXEC, noop = 0,
-force = 0;
+fl = SOCK_STREAM | SOCK_CLOEXEC, noop = 0;
size_t   i, j, eid = 1, outsz = 0, talsz = 0;
pid_tprocpid, rsyncpid;
int  fd[2];
@@ -1539,7 +1537,7 @@ main(int argc, char *argv[])
if (pledge("stdio rpath wpath cpath fattr proc exec unveil", NULL) == 
-1)
err(1, "pledge");
 
-   while ((c = getopt(argc, argv, "b:Bcd:e:fjnot:T:v")) != -1)
+   while ((c = getopt(argc, argv, "b:Bcd:e:jnot:T:v")) != -1)
switch (c) {
case 'b':
bind_addr = optarg;
@@ -1556,9 +1554,6 @@ main(int argc, char *argv[])
case 'e':
rsync_prog = optarg;
break;
-   case 'f':
-   force = 1;
-   break;
case 'j':
outformats |= FORMAT_JSON;
break;
@@ -1634,7 +1629,7 @@ main(int argc, char *argv[])
err(1, "%s: unveil", cachedir);
if (pledge("stdio rpath", NULL) == -1)
err(1, "pledge");
-   proc_parser(fd[0], force);
+   proc_parser(fd[0]);
/* NOTREACHED */
}
 
@@ -1826,7 +1821,7 @@ main(int argc, char *argv[])
 
 usage:
fprintf(stderr,
-   "usage: rpki-client [-Bcfjnov] [-b sourceaddr] [-d cachedir]"
+   "usage: rpki-client [-Bcjnov] [-b sourceaddr] [-d cachedir]"
" [-e rsync_prog]\n"
"   [-T table] [-t tal] [outputdir]\n");
return 1;
Index: mft.c

Re: suggest to run rpki-client hourly

2020-04-16 Thread Job Snijders
Now that cron(8) was put on a quick steroids programme, we have new
options available! Awesome work Todd, Theo.

On Mon, Apr 13, 2020 at 02:43:27PM +, Job Snijders wrote:
> I'm reviewing some of the timers associated with the workings of the
> end-to-end propagation from ROA to VRP. I think suggesting to run
> rpki-client only once a day can make for needless brittleness.
> 
> Running rpki-client just once a day also results in only making a rsync
> fetch attempt once a day. If the connection can't be established because
> of a transient network issue, the RP can easily end up going without
> contact with the CA Publication Point for close to 48 hours. A lot of
> CRLs appear to have expiration dates in the range of '24 hours'.
> 
> I think attempting to contact a CA PP at least once an hour is more
> appropriate for the various 24-48h sliding windows that are in play.

In autonomous systems running bgpd(8) and rpki-client(8) on their edge
routers, I believe it to be beneficial if out-of-the-box the routers
don't all do rpki fetches & bgp loads at the same time. It is expected
behavior for RPKI information to un-evenly percolate towards the BGP
edge in a staggered way. From a network operational perspective should a
support request come in to the ISP, responding "once an hour" will
satisfy most situations.

In cases where rpki-client for some reason ends up taking longer than an
hour, the next execution attempt of the command will be skipped. Better
to just try again an hour later, this helps avoid concurrent rpki-client
processes crossing streams.

I think 'once an hour' is a reasonable balance between the needs of
internet users (the ROAs creators who may depend urgently on an
expedient distribution of updated RPKI information); considerations for
what the Internet's CA infrastructure realisticly can support; and what
network operators are willing to tolerate in churn. We have to hold the
throttle open at the right position.

Anyway, I consider "once an hour" a big upgrade from the decades old
"once every 24 hour"-mantra the IRR brought us. :-)

Kind regards,

Job

Index: etc/crontab
===
RCS file: /cvs/src/etc/crontab,v
retrieving revision 1.26
diff -u -p -r1.26 crontab
--- etc/crontab 15 Apr 2020 03:24:08 -  1.26
+++ etc/crontab 16 Apr 2020 22:29:16 -
@@ -19,4 +19,4 @@ HOME=/var/log
 30 5   1   *   *   /bin/sh /etc/monthly
 #~ *   *   *   *   /usr/libexec/spamd-setup
 
-#0~20  9   *   *   *   -n rpki-client -v && bgpctl reload
+#~ *   *   *   *   -s -n rpki-client -v && bgpctl reload



Re: suggest to run rpki-client hourly

2020-04-13 Thread Job Snijders
On Mon, Apr 13, 2020 at 02:43:27PM +, Job Snijders wrote:
> I'm reviewing some of the timers associated with the workings of the
> end-to-end propagation from ROA to VRP. I think suggesting to run
> rpki-client only once a day can make for needless brittleness.
> 
> Running rpki-client just once a day also results in only making a rsync
> fetch attempt once a day. If the connection can't be established because
> of a transient network issue, the RP can easily end up going without
> contact with the CA Publication Point for close to 48 hours. A lot of
> CRLs appear to have expiration dates in the range of '24 hours'.
> 
> I think attempting to contact a CA PP at least once an hour is more
> appropriate for the various 24-48h sliding windows that are in play.
> 
> Thoughts? OK?

Small update: if we go hourly we should jiggle RANDOM accordingly

OK?

Index: crontab
===
RCS file: /cvs/src/etc/crontab,v
retrieving revision 1.25
diff -u -p -r1.25 crontab
--- crontab 4 Dec 2019 15:07:51 -   1.25
+++ crontab 13 Apr 2020 14:48:06 -
@@ -19,4 +19,4 @@ HOME=/var/log
 30 5   1   *   *   /bin/sh /etc/monthly
 #0 *   *   *   *   sleep $((RANDOM \% 2048)) && 
/usr/libexec/spamd-setup
 
-#0 9   *   *   *   -n sleep $((RANDOM \% 4096)) && 
rpki-client -v && bgpctl reload
+#0 *   *   *   *   -n sleep $((RANDOM \% 2048)) && 
rpki-client -v && bgpctl reload



suggest to run rpki-client hourly

2020-04-13 Thread Job Snijders
Hi,

I'm reviewing some of the timers associated with the workings of the
end-to-end propagation from ROA to VRP. I think suggesting to run
rpki-client only once a day can make for needless brittleness.

Running rpki-client just once a day also results in only making a rsync
fetch attempt once a day. If the connection can't be established because
of a transient network issue, the RP can easily end up going without
contact with the CA Publication Point for close to 48 hours. A lot of
CRLs appear to have expiration dates in the range of '24 hours'.

I think attempting to contact a CA PP at least once an hour is more
appropriate for the various 24-48h sliding windows that are in play.

Thoughts? OK?

Kind regards,

Job

Index: crontab
===
RCS file: /cvs/src/etc/crontab,v
retrieving revision 1.25
diff -u -p -r1.25 crontab
--- crontab 4 Dec 2019 15:07:51 -   1.25
+++ crontab 13 Apr 2020 14:34:45 -
@@ -19,4 +19,4 @@ HOME=/var/log
 30 5   1   *   *   /bin/sh /etc/monthly
 #0 *   *   *   *   sleep $((RANDOM \% 2048)) && 
/usr/libexec/spamd-setup
 
-#0 9   *   *   *   -n sleep $((RANDOM \% 4096)) && 
rpki-client -v && bgpctl reload
+#0 *   *   *   *   -n sleep $((RANDOM \% 4096)) && 
rpki-client -v && bgpctl reload



Re: BIRD 1.x/2.x support at rpki-client

2020-03-06 Thread Job Snijders
On Fri, Mar 06, 2020 at 07:11:56PM +0100, Robert Scheck wrote:
> On Fri, 06 Mar 2020, Sebastian Benoit wrote:
> > Note that I haven't tried this with bird 1 or 2 yet ;)
> > comments, oks?
> 
> I did not try it yet, but I think BIRD 1 also needs something like "define
> force_roa_table_update = %lld;" and maybe some table definition. I will try
> BIRD 1 and 2 during the weekend explicitly and provide specific feedback or
> suggest diffs.

BIRD 1 doesn't need such a define, the mechanism is somewhat less sexy,
on BIRD 1.6.x you have to issue the following to make a new staticly
configured ROAS definition be applied:

birdc configure # load new config
birdc flush roa # throw away old stuff
birdc reload all# apply new stuff to all protocols

The table is defined by declaring (and populating) the table as the
-current rpki-client does for BIRD v1.

Kind regards,

Job



Re: BIRD 1.x/2.x support at rpki-client

2020-03-06 Thread Job Snijders
I have a small suggestion, in some deployments I saw the convention to
name it as following so it is clear the data came from user provided
data rather than internal bird structures 

I tested Benno's patch against BIRD 1.6.6 - wfm.

Index: main.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v
retrieving revision 1.58
diff -u -p -r1.58 main.c
--- main.c  11 Feb 2020 18:41:39 -  1.58
+++ main.c  6 Mar 2020 17:25:56 -
@@ -156,7 +156,7 @@ static void build_chain(const struct aut
 static voidbuild_crls(const struct auth *, struct crl_tree *,
STACK_OF(X509_CRL) **);
 
-const char *bird_tablename = "roa";
+const char *bird_tablename = "ROAS";
 
 int verbose;
 



Re: BIRD 1.x/2.x support at rpki-client

2020-03-06 Thread Job Snijders
On Fri, Mar 06, 2020 at 12:24:18PM +0100, Sebastian Benoit wrote:
> Robert Scheck(rob...@fedoraproject.org) on 2020.03.03 01:20:24 +0100:
> > job@ suggested to move this from GitHub to tech@ list (as upstream):
> > 
> > 1. Currently, BIRD 1.x support in rpki-client seems to be broken: As per
> >BIRD upstream the "combined format" produced by rpki-client can't be
> >used as-is with BIRD 1.x due to separated daemons (and configuration
> >files) for IPv4 and IPv6.
> > 2. Lack of BIRD 2.x support in rpki-client, which requires a different
> >output/configuration format (semi-finished pull request at GitHub).
> 
> Hi, can you point me to that pull request, i could not find it.

I believe Robert is referring to this snippet of code:


https://patch-diff.githubusercontent.com/raw/kristapsdz/rpki-client/pull/21.patch

Kind regards,

Job



Re: BIRD 1.x/2.x support at rpki-client

2020-03-04 Thread Job Snijders
We are still at the early stages of RPKI deployment, so if we make it easier to 
plug things into BIRD1 is beneficial given the wide deployment scale.

Only /very/ recently was rpki-client packaged for some of the Linux distros, so 
if we add support for all formats now - it’ll improve the applicability of the 
rpki-client software. 

When NIC.CZ announces the EOL for BIRD1 we can plan to remove that output 
format.

It’s too soon to not support BIRD1 imho.

Kind regards,

Job



Re: BIRD 1.x/2.x support at rpki-client

2020-03-04 Thread Job Snijders
I think we still need to support BIRD 1 for the foreseeable future, NIC.CZ 
hasn’t communicated plans to deprecate BIRD1 and still supports it; and BIRD1 
still is widely deployed.

I’m somewhat preferential to just generate all 3 BIRD flavors if -B is given as 
command line option.

Kind regards,

Job



Re: BIRD 1.x/2.x support at rpki-client

2020-03-04 Thread Job Snijders
On Wed, Mar 4, 2020, at 00:55, Robert Scheck wrote:
> > The idea is you can specify many outputs.  That will make the commandline
> > very long, especially for the way we run it in cron.
> 
> Oh! I'm sorry, I didn't see the idea of specifying many outputs.

Yeah, its nice to do things in one batch for all things that might exist in an 
organisation:

job@vurt ~$ doas rpki-client -Bjoc

job@vurt ~$ ls -alhtr /var/db/rpki-client/
total 49128
drwxr-xr-x  14 root  wheel   1.0K Mar  1 14:53 ..
-rw-r--r--   1 _rpki-client  wheel   4.3M Mar  4 12:16 openbgpd
-rw-r--r--   1 _rpki-client  wheel   4.8M Mar  4 12:16 bird
-rw-r--r--   1 _rpki-client  wheel   4.1M Mar  4 12:16 csv
-rw-r--r--   1 _rpki-client  wheel  10.7M Mar  4 12:16 json
drwxr-xr-x   2 _rpki-client  wheel   512B Mar  4 12:16 .

Saves everyone valuable time! ;-)

Kind regards,

Job



Re: IPv6 Support for umb(4)

2020-01-29 Thread Job Snijders
On Tue, Jan 28, 2020 at 03:03:47PM +0100, Gerhard Roth wrote:
> this patch adds IPv6 support to umb(4).

OK job@

Tested with 'telnet -6 towel.blinkenlights.nl' on Fibocom L831-EAU on
IIJ MIO's network (Japan), with 'inet6 autoconf' in /etc/hostname.umb0 :-)

job@vurt ~$ doas ifconfig umb0
umb0: flags=208851 mtu 1500
index 5 priority 6 llprio 3
roaming enabled registration home network
state up cell-class LTE rssi -95dBm speed 47.7Mps up 143Mps down
SIM initialized PIN valid (3 attempts left)
subscriber-id 440030011055456 ICC-id 898103010455217 provider JP 
DOCOMO
device L831-EAU-00 v1.0.0 IMEI 862727030473719 firmware 
L831_V2E.0C.00.07
APN iijmio.jp
dns 202.232.2.32 202.232.2.33 2001:240::32 2001:240::33
groups: egress
status: active
inet6 fe80::fa59:71ff:fe9d:67dc%umb0 ->  prefixlen 64 scopeid 0x5
inet 100.94.237.118 --> 100.94.237.1 netmask 0xff00
inet6 fe80::15:9392:1801%umb0 -> fe80::15:9392:1802%umb0 prefixlen 128 
scopeid 0x5
inet6 2001:240:240c:9431:18e0:d5ed:c293:aafd ->  prefixlen 64 autoconf
inet6 2001:240:240c:9431:15ac:21ce:8471:27e1 ->  prefixlen 64 autoconf 
autoconfprivacy pltime 82252 vltime 601067



Re: Add #define for RFC8622 IPTOS_DSCP_LE codepoint

2020-01-25 Thread Job Snijders
On Sat, Jan 25, 2020 at 11:36:53PM +1100, Damien Miller wrote:
> This adds a #define for the "lower effort" DSCP code point specified
> by https://tools.ietf.org/html/rfc8622
> 
> People have asked to be able to use this OpenSSH for "don't care"
> traffic.
> 
> ok?

OK job@



Re: bgpd max-prefix out limit

2020-01-22 Thread Job Snijders
On Wed, Jan 22, 2020 at 05:02:32AM +0100, Claudio Jeker wrote:
> This diff implements 'max-prefix NUM out' which is a simple way to
> avoid leaking full tables to upstream or peers. If the limit is
> triggered the session will be closed with a NOTIFICATION (kind of
> suicide for the good of the Internet).
> 
> In bgpctl the counters are visible in the 'bgpctl show nei' output.
> 
> Works for me (adopting the maxprefix regress test to test for
> 'max-prefix NUM out'.

This is an implementation of 
https://tools.ietf.org/html/draft-sa-idr-maxprefix-00

OK job@



[PATCH] nc(1): print IP address in verbose mode (-v)

2019-10-23 Thread Job Snijders
Dear all,

Scratching a small itch: telnet(1) nicely prints what IP addresses it is
attempting to connect to, I'd like 'nc -v' to do the same, see below:

$ nc -v localhost 23
nc: connect to localhost (127.0.0.1) port 23 (tcp) failed: Connection refused
nc: connect to localhost (::1) port 23 (tcp) failed: Connection refused

$ # dont bother printing IPs when -n or host == ipaddr
$ nc -v -n 127.0.0.1 23
nc: connect to 127.0.0.1 port 23 (tcp) failed: Connection refused

$ dig a www.peeringdb.com +short
54.209.79.173
3.221.183.110
$ nc -v www.peeringdb.com 80
Connection to www.peeringdb.com (54.209.79.173) 80 port [tcp/www] succeeded!
^C

OK?

Kind regards,

Job

Index: netcat.c
===
RCS file: /cvs/src/usr.bin/nc/netcat.c,v
retrieving revision 1.207
diff -u -p -r1.207 netcat.c
--- netcat.c17 Oct 2019 14:29:24 -  1.207
+++ netcat.c23 Oct 2019 20:55:21 -
@@ -125,7 +125,7 @@ voidhelp(void) __attribute__((noreturn)
 intlocal_listen(const char *, const char *, struct addrinfo);
 void   readwrite(int, struct tls *);
 void   fdpass(int nfd) __attribute__((noreturn));
-intremote_connect(const char *, const char *, struct addrinfo);
+intremote_connect(const char *, const char *, struct addrinfo, char *);
 inttimeout_tls(int, struct tls *, int (*)(struct tls *));
 inttimeout_connect(int, const struct sockaddr *, socklen_t);
 intsocks_connect(const char *, const char *, struct addrinfo,
@@ -151,6 +151,7 @@ main(int argc, char *argv[])
 {
int ch, s = -1, ret, socksv;
char *host, *uport;
+   char ipaddr[NI_MAXHOST];
struct addrinfo hints;
struct servent *sv;
socklen_t len;
@@ -677,7 +678,8 @@ main(int argc, char *argv[])
proxy, proxyport, proxyhints, socksv,
Pflag);
else
-   s = remote_connect(host, portlist[i], hints);
+   s = remote_connect(host, portlist[i], hints,
+   ipaddr);
 
if (s == -1)
continue;
@@ -701,10 +703,14 @@ main(int argc, char *argv[])
uflag ? "udp" : "tcp");
}
 
-   fprintf(stderr,
-   "Connection to %s %s port [%s/%s] "
-   "succeeded!\n", host, portlist[i],
-   uflag ? "udp" : "tcp",
+   fprintf(stderr, "Connection to %s", host);
+
+   /* if there is something to report, print IP */
+   if (!nflag && (strcmp(host, ipaddr) != 0))
+   fprintf(stderr, " (%s)", ipaddr);
+
+   fprintf(stderr, " %s port [%s/%s] succeeded!\n",
+   portlist[i], uflag ? "udp" : "tcp",
sv ? sv->s_name : "*");
}
if (Fflag)
@@ -916,10 +922,11 @@ unix_listen(char *path)
  * port or source address if needed. Returns -1 on failure.
  */
 int
-remote_connect(const char *host, const char *port, struct addrinfo hints)
+remote_connect(const char *host, const char *port, struct addrinfo hints,
+char *ipaddr)
 {
struct addrinfo *res, *res0;
-   int s = -1, error, on = 1, save_errno;
+   int s = -1, error, herr, on = 1, save_errno;
 
if ((error = getaddrinfo(host, port, , )))
errx(1, "getaddrinfo for host \"%s\" port %s: %s", host,
@@ -952,11 +959,26 @@ remote_connect(const char *host, const c
 
set_common_sockopts(s, res->ai_family);
 
+   if ((herr = getnameinfo(res->ai_addr, res->ai_addrlen, ipaddr,
+   NI_MAXHOST, NULL, 0, NI_NUMERICHOST)) != 0) {
+   if (herr == EAI_SYSTEM)
+   err(1, "getnameinfo");
+   else
+   errx(1, "getnameinfo: %s", gai_strerror(herr));
+   }
+
if (timeout_connect(s, res->ai_addr, res->ai_addrlen) == 0)
break;
-   if (vflag)
-   warn("connect to %s port %s (%s) failed", host, port,
-   uflag ? "udp" : "tcp");
+
+   if (vflag) {
+   /* only print IP if there is something to report */
+   if (nflag || (strncmp(host, ipaddr, NI_MAXHOST) == 0))
+   warn("connect to %s port %s (%s) failed", host,
+   port, uflag ? "udp" : "tcp");
+   else
+   warn("connect to %s (%s) port %s (%s) failed",
+   

Re: OpenBSD crossed 400,000 commits

2019-10-08 Thread Job Snijders
On Tue, Oct 08, 2019 at 09:57:42PM -0600, Theo de Raadt wrote:
> Sometime in the last week OpenBSD crossed 400,000 commits (*) upon all
> our repositories since starting at 1995/10/18 08:37:01
> Canada/Mountain.  That's a lot of commits by a lot of amazing people.

Great achievement! Time to pop a bottle of Opën to celebrate ;-)

https://twitter.com/openbsd/status/1181803375479816193/



Re: bgpd fail before daemonizing on config errors

2019-08-08 Thread Job Snijders
On Thu, Aug 08, 2019 at 11:48:08AM +0200, Claudio Jeker wrote:
> With the introduction of re-exec of the childs the config parsing happened
> after bgpd demonized. This is super annoying and therefor this diff
> changes that. It will make bgpd fail on startup if there is an issue with
> the config file. I could not move the control socket setup before
> daemonizing since that call will send out imsgs and so the childs need to
> be up and running. To change this more refactoring is needed but this is
> already a good first step.
> 
> OK?

OK job@



Re: bgpd adj-rib-out rewrite

2019-07-16 Thread Job Snijders
On Wed, Jul 10, 2019 at 10:08:38PM +0200, Claudio Jeker wrote:
> This diff is a bit of a monster. It changes the Adj-RIB-Out to be a
> peer specific set of RB trees instead of using a rib in the original
> sense. The reason for this is that the more peers a system has the
> more elements end up being linked into the adj-rib-out and many
> operations do linear searches which does not scale.
> 
> I did some testing with 4000 peers sending 1 prefix each which then
> are sent back to all peers (resulting in 16Mio updates being put in
> Adj-RIB-Out).  Without this diff the system takes about 1h to bring up
> all sessions. With the diff the system finishes in around 5min.
> 
> To not increase the memory footprint struct prefix is now using a
> union for the lists or RB trees. Additionally the rib dump runner was
> adjusted so that it also works with the Adj-RIB-Out. bgpctl show rib
> out changed a bit since it will dump now one peer after the other
> apart from that behaviour should be the same.

Some testing indicates that things indeed seem faster.

More reports from the real world would be appreciated, to better
understand the specific performance characteristics this change
introduces in various different deployment models, but that can happen
in tree.

OK job@

Kind regards,

Job



Re: Remove irrfilter from bgpctl

2019-06-24 Thread Job Snijders
Hi,

On Mon, Jun 24, 2019 at 12:41:08PM +0200, Claudio Jeker wrote:
> I think it is time to remove the bgpctl irrfilter code. It is not
> useful and it is better to use tools like bgpq3 to build as-sets and
> prefix-sets for bgpd filtering.

Agreed. I'd love to have a replacement in base some day, but 'bgpctl
irrfilter' is not suitable.

> If anyone is still using irrfilter please speak up now.

Even if you are using irrfilter today, I'd recommend you to switch to
bgpq3 or an equivalent. I can't get irrfilter's functionality to work at
all with my own ASNs.

[job@kiera ~]$ mkdir /tmp/y
[job@kiera ~]$ bgpctl irrfilter -o /tmp/y 8283
irrfilter for: 8283, writing to /tmp/y
[job@kiera ~]$ ls /tmp/y
[job@kiera ~]$

Any attempt to parse RPSL is insanity anyway...

OK job@

Kind regards,

Job



Re: bgpd set nexthop 198.51.100.42 clarifications

2019-05-28 Thread Job Snijders
On Tue, May 28, 2019 at 05:17:08PM +0200, Claudio Jeker wrote:
> On Tue, May 28, 2019 at 01:28:32PM +0200, Job Snijders wrote:
> > On Mon, May 13, 2019 at 09:03:41PM +0200, Claudio Jeker wrote:
> > > When using a rule forcing the nexthop to a specific address bgpd
> > > currently does not mark that nexthop as no-modify. In other words
> > > the default rules for nexthop propagation applies. This means that
> > > for ebgp it only sends out the set nexthop when this nexthop is connected
> > > and on the same network as the peer. So while the Adj-RIB-Out shows the
> > > right nexthop it is actually not on the wire.
> > > 
> > > This diff changes set nexthop 198.51.100.42 to also imply set nexthop
> > > no-modify. This way the set nexthop is always on the wire.
> > > The problem with that is that it will hand you a nice footgun ready to
> > > blow of your big toe (but in the end the current behaviour is doing the
> > > same just with a different angle of attack) .
> > > 
> > > The set nexthop section in bgpd.conf.5 needs to be adjusted once a
> > > decision is made on how to handle this.
> > 
> > I think I'm not a big fan of this change.
> > 
> > Section 5.1.3 of RFC 4271 (the core BGP spec) is very explicit on
> > what NEXT_HOPs are valid to send over a non-multihop BGP session.
> > Only addresses that are part of the linknet between the two routers
> > are valid NEXT_HOP addresses on the wire. This changes makes it
> > trivial to send not-valid NEXT_HOPs to a neighbor, this may result
> > in very hard to debug troublecases. Feels like we'll be handing way
> > too much rope to users, especially since it facilitates protocol
> > violations.
> > 
> > I am not aware of a real world BGP implementation that would allow
> > you to send completely arbitrary NEXT_HOPs.
> 
> I came to a similar conclusion and will send out a better diff. The
> idea is that for the non-multihop BGP sessions we should require the
> nexthop to be in the same network. 

With 'same network', you mean the NEXT_HOP IP address must be part of
the router-to-router linknet, right?

> The ebgp multihop case is currently always sticking to the local
> address and should probably respect the nexthop if set explicitly
> (that is also what the RFC suggests). 

Right.

> ibgp seems to do the right thing.

yup, in IBGP the NEXT_HOP can be anything.

> The same rules need to be applied to "set nexthop no-modify" since
> that is currently on massive hammer.

Right.

Thanks!

Kind regards,

Job



Re: bgpd set nexthop 198.51.100.42 clarifications

2019-05-28 Thread Job Snijders
Hi,

On Mon, May 13, 2019 at 09:03:41PM +0200, Claudio Jeker wrote:
> When using a rule forcing the nexthop to a specific address bgpd
> currently does not mark that nexthop as no-modify. In other words
> the default rules for nexthop propagation applies. This means that
> for ebgp it only sends out the set nexthop when this nexthop is connected
> and on the same network as the peer. So while the Adj-RIB-Out shows the
> right nexthop it is actually not on the wire.
> 
> This diff changes set nexthop 198.51.100.42 to also imply set nexthop
> no-modify. This way the set nexthop is always on the wire.
> The problem with that is that it will hand you a nice footgun ready to
> blow of your big toe (but in the end the current behaviour is doing the
> same just with a different angle of attack) .
> 
> The set nexthop section in bgpd.conf.5 needs to be adjusted once a
> decision is made on how to handle this.

I think I'm not a big fan of this change.

Section 5.1.3 of RFC 4271 (the core BGP spec) is very explicit on what
NEXT_HOPs are valid to send over a non-multihop BGP session. Only
addresses that are part of the linknet between the two routers are valid
NEXT_HOP addresses on the wire. This changes makes it trivial to send
not-valid NEXT_HOPs to a neighbor, this may result in very hard to debug
troublecases. Feels like we'll be handing way too much rope to users,
especially since it facilitates protocol violations.

I am not aware of a real world BGP implementation that would allow you
to send completely arbitrary NEXT_HOPs.

Kind regards,

Job



Re: bgpd set nexthop 198.51.100.42 clarifications

2019-05-27 Thread Job Snijders
On Mon, May 13, 2019 at 21:11 Claudio Jeker 
wrote:

> When using a rule forcing the nexthop to a specific address bgpd
> currently does not mark that nexthop as no-modify. In other words
> the default rules for nexthop propagation applies. This means that
> for ebgp it only sends out the set nexthop when this nexthop is connected
> and on the same network as the peer. So while the Adj-RIB-Out shows the
> right nexthop it is actually not on the wire.
>
> This diff changes set nexthop 198.51.100.42 to also imply set nexthop
> no-modify. This way the set nexthop is always on the wire.
> The problem with that is that it will hand you a nice footgun ready to
> blow of your big toe (but in the end the current behaviour is doing the
> same just with a different angle of attack) .
>
> The set nexthop section in bgpd.conf.5 needs to be adjusted once a
> decision is made on how to handle this.
> --
> :wq Claudio
>
> Index: rde_rib.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
> retrieving revision 1.190
> diff -u -p -r1.190 rde_rib.c
> --- rde_rib.c   7 Mar 2019 07:42:36 -   1.190
> +++ rde_rib.c   13 May 2019 17:32:14 -
> @@ -1491,7 +1491,7 @@ nexthop_modify(struct nexthop *setnh, en
> break;
> nexthop_put(*nexthop);
> *nexthop = nexthop_ref(setnh);
> -   *flags = 0;
> +   *flags = NEXTHOP_NOMODIFY;
> break;
> default:
> break;
>
>


Re: Stream Control Transmission Protocol SCTP RFC4960

2019-05-23 Thread Job Snijders
On Thu, May 23, 2019 at 19:50 Denis  wrote:

>
> SCTP(4) present in FreeBSD 12.0
>
> OpenBSD implementation planned?


Nothing planned as far as I know.

Out of curiosity - what is your use case? Do you really use it? It doesn’t
seem to be a widely used protocol.

Kind regards,

Job


Re: NSD & Unbound refusing to bind to IPv6 when anycast flag set ?

2019-05-17 Thread Job Snijders
On Fri, May 17, 2019 at 2:13 PM Stuart Henderson  wrote:
> On 2019/05/16 23:37, Rachel Roch wrote:
> > > RFC3513 says this:
> > >
> > >  o An anycast address must not be used as the source address of
> > >  an IPv6 packet.
> > >
> > >  o An anycast address must not be assigned to an IPv6 host, that
> > >  is, it may be assigned to an IPv6 router only.
> > >
> > > And to help ensure this, the kernel denies binding to an address marked
> > > with the anycast flag (see netinet6/in6_pcb.c).
> > >
> > > This was obsoleted by RFC4291, including this change:
> > >
> > >  o The restrictions on using IPv6 anycast addresses were removed because
> > >  there is now sufficient experience with the use of anycast addresses,
> > >  the issues are not specific to IPv6, and the GROW working group is
> > >  working in this area.
> > >
> > > So I think this restriction can now be removed, at least with this
> > > change, but more might be needed
> >
> > Certainly in my case the current OpenBSD situation represents a bit too 
> > much "nanny knows best".
>
> No, it represents "following the (old) RFCs".

patches welcome, indeed the openbsd behaviour is adhering to
now-outdated standards.

> > My use-case is anycast DNS with NSD and Unbound.
> >
> > Both NSD and unbound provide config parameters that allow distinguishing 
> > between listen address and source address.
> >
> > But then again, is there any real reason to use the anycast flag ?  To make 
> > NSD and unbound work I reconfigured to remove the anycast flag from IPv6 
> > addresses and nothing seems broken ?
> >
> If you are doing a typical "internet anycast services" setup with some
> routing protocol announcing the anycasted address then I don't see a use
> for the flag, AFAICT it was mostly in conjunction with using an anycast
> address for a local router, it feels like the usual IPv6 overengineering
> to me..

Overengineering or not, there is no reason to disallow binding to
interfaces which have the ANYCAST flag set.

Kind regards,

Job



Re: tcpdump support extended bgp shutdown communication

2019-05-11 Thread Job Snijders
OK job@

On Sat, May 11, 2019 at 14:37 Claudio Jeker 
wrote:

> bgpd already got support for extended shutdown communication messages.
> This adds the same support to tcpdump.
>
> OK?
> --
> :wq Claudio
>
> Index: print-bgp.c
> ===
> RCS file: /cvs/src/usr.sbin/tcpdump/print-bgp.c,v
> retrieving revision 1.27
> diff -u -p -r1.27 print-bgp.c
> --- print-bgp.c 28 Dec 2018 11:54:10 -  1.27
> +++ print-bgp.c 25 Apr 2019 08:20:27 -
> @@ -241,7 +241,7 @@ static const char *bgpnotify_minor_fsm[]
>  /* RFC 8203 */
>  #define BGP_NOTIFY_MINOR_CEASE_SHUT2
>  #define BGP_NOTIFY_MINOR_CEASE_RESET   4
> -#define BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN  128
> +#define BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN  255
>  static const char *bgpnotify_minor_cease[] = {
> NULL, "Maximum Number of Prefixes Reached", "Administrative
> Shutdown",
> "Peer De-configured", "Administrative Reset", "Connection
> Rejected",
> @@ -982,14 +982,14 @@ bgp_notification_print(const u_char *dat
> u_int16_t af;
> u_int8_t safi;
> const u_char *p;
> -   uint8_t shutdown_comm_length;
> +   size_t shutdown_comm_length;
> char shutstring[BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN + 1];
>
> TCHECK2(dat[0], BGP_NOTIFICATION_SIZE);
> memcpy(, dat, BGP_NOTIFICATION_SIZE);
>
> /* sanity checking */
> -   if (length +   if (length < BGP_NOTIFICATION_SIZE)
> return;
>
> printf(": error %s,", bgp_notify_major(bgpn.bgpn_major));
> @@ -1027,22 +1027,23 @@ bgp_notification_print(const u_char *dat
> (length >= BGP_NOTIFICATION_SIZE + 1)) {
> p = dat + BGP_NOTIFICATION_SIZE;
> TCHECK2(*p, 1);
> -   shutdown_comm_length = *(p);
> +   shutdown_comm_length = *p;
>
> /* sanity checking */
> if (shutdown_comm_length == 0)
> return;
> if (shutdown_comm_length >
> BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN)
> -   return;
> -   if (length < (shutdown_comm_length + 1 +
> BGP_NOTIFICATION_SIZE))
> -   return;
> +   goto trunc;
> +   if (length < (shutdown_comm_length + 1 +
> +   BGP_NOTIFICATION_SIZE))
> +   goto trunc;
> TCHECK2(*(p+1), shutdown_comm_length);
>
> /* a proper shutdown communication */
> -   printf(", Shutdown Communication [len %u]: \"",
> +   printf(", Shutdown Communication [len %zu]: \"",
> shutdown_comm_length);
> -   memset(shutstring, 0,
> BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN + 1);
> +   memset(shutstring, 0, sizeof(shutstring));
> memcpy(shutstring, p+1, shutdown_comm_length);
> safeputs(shutstring);
> printf("\"");
>
>


[PATCH] bgpctl(8): improve user interface for RPKI Origin Validation

2019-04-01 Thread Job Snijders
Dear all,

I've consulted with numerous user interface experts, their consistent
advice was to facilitate internalization by provoking simpler, stronger
emotions through the text based interface.

bgpctl(8) will now provide simplified 'SAD' or 'HAPPY' ascii ideograms
to help network operators quickly understand whether a route is valid or
not. I suspect this patch is the final missing piece to make the
struggle for RPKI adoption a thing of the past.

Example output:

$ bgpctl show rib
flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
   S = Stale, E = Error
origin validation state: ? = not-found, :-) = valid, :-( = invalid
origin: i = IGP, e = EGP, ? = Incomplete

flags ovs destination  gateway  lpref   med aspath origin
I*>   :-) 1.0.0.0/24   192.147.168.1  100 0 2914 13335 i
I*> ? 199.185.136.0/23 192.147.168.1  100 0 2914 3257 22512 
i
I*>   :-( 199.47.227.0/24  192.147.168.1  100 0 2914 7018 17216 
i

Kind regards,

Job

diff --git usr.sbin/bgpctl/bgpctl.c usr.sbin/bgpctl/bgpctl.c
index ae5dab9c620..f998514fe90 100644
--- usr.sbin/bgpctl/bgpctl.c
+++ usr.sbin/bgpctl/bgpctl.c
@@ -1127,7 +1127,7 @@ show_rib_summary_head(void)
 {
printf("flags: * = Valid, > = Selected, I = via IBGP, A = Announced,\n"
"   S = Stale, E = Error\n");
-   printf("origin validation state: N = not-found, V = valid, ! = invalid\n");
+   printf("origin validation state: ? = not-found, :-) = valid, :-( = 
invalid\n");
printf("origin: i = IGP, e = EGP, ? = Incomplete\n\n");
printf("%-5s %3s %-20s %-15s  %5s %5s %s\n", "flags", "ovs", "destination",
"gateway", "lpref", "med", "aspath origin");
@@ -1204,11 +1204,11 @@ print_ovs(u_int8_t validation_state, int sum)
 {
switch (validation_state) {
case ROA_INVALID:
-   return (sum ? "!" : "invalid");
+   return (sum ? ":-(" : "invalid");
case ROA_VALID:
-   return (sum ? "V" : "valid");
+   return (sum ? ":-)" : "valid");
default:
-   return (sum ? "N" : "not-found");
+   return (sum ? "?" : "not-found");
}
 }



Re: bgpd optimize filter rules

2018-12-03 Thread Job Snijders
On Mon, Dec 03, 2018 at 12:14:13PM +0100, Claudio Jeker wrote:
> There is a trivial optimization that bgpd can do when loading the filter
> ruleset. If the rule is the same as the previous rule than the filterset
> can be merged.  e.g.
> 
> match from ebgp set community delete $myAS:*
> match from ebgp set community $myAS:15
> match from ebgp set med 100
> 
> Will be optimized into:
> 
> match from ebgp set { metric 100 community delete $myAS:* community 
> $myAS:15 }
> 
> The following diff is doing this and saves around 5% of the rules in
> arouteserver configs and probably similar amount in other peoples config.

OK job@



Re: bgpd refactor aspath_match a bit

2018-11-27 Thread Job Snijders
On Tue, Nov 27, 2018 at 06:23:53PM +0100, Claudio Jeker wrote:
> On Tue, Nov 27, 2018 at 04:21:53PM +0100, Job Snijders wrote:
> > On Fri, Nov 23, 2018 at 03:55:18PM +0100, Claudio Jeker wrote:
> > > For origin validation I chacked the source_as in struct rde_aspath
> > > this is not really the right place. It should be in struct aspath
> > > since that holds all the ASPATH related stuff. Change this, move
> > > aspath_match out of util.c back into rde_attr.c and adjust code to use
> > > the cached value also in match from any source-as XYZ rules.
> > > This last bit causes a minor behavioural change since the old code
> > > extracted the last non AS_SET asnumber. The new code follows the ROA
> > > RFC and returns the rightmost AS for AS_SEQUENCE, the local AS for
> > > empty paths and AS_NONE (which is 0) for everything else.
> > > So now 'match from any source-as 0' will return all paths that do not
> > > have a final AS_SEQUENCE segment.
> > > 
> > > The reason for this change is that I don't want to have two different
> > > behaviours for what we call source-as (the one in roa-set and the one on a
> > > filter).
> > 
> > Something is off, it seems 'source-as 0' is matching anything that has
> > an AS_SET attribute set:
> > 
> > $ bgpctl show rib source-as 0 | head
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
> >S = Stale, E = Error
> > origin validation state: N = not-found, V = valid, ! = invalid
> > origin: i = IGP, e = EGP, ? = Incomplete
> > 
> > flags ovs destination  gateway  lpref   med aspath 
> > origin
> > I*> N 5.39.176.0/21192.147.168.1  100 0 2914 8530 { 
> > 198753 } ?
> > I*> N 5.101.110.0/24   192.147.168.1  100 0 2914 14061 
> > { 46652 } i
> > I*> N 5.175.0.0/19 192.147.168.1  100 0 2914 1299 
> > 20773 { 8972 } i
> > I*> N 8.41.202.0/24192.147.168.1  100 0 2914 13789 
> > 30372 { 40179 } i
> > 
> > Similarly, this should return at least 5.39.176.0/21:
> > 
> > $ bgpctl show rib source-as 8530
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
> >S = Stale, E = Error
> > origin validation state: N = not-found, V = valid, ! = invalid
> > origin: i = IGP, e = EGP, ? = Incomplete
> > 
> > flags ovs destination  gateway  lpref   med aspath 
> > origin
> > I*> N 80.87.16.0/20192.147.168.1  100 0 2914 8530 ?
> > I*> N 87.236.128.0/21  192.147.168.1  100 0 2914 8530 ?
> > I*> N 88.151.152.0/21  192.147.168.1  100 0 2914 8530 ?
> > I*> N 89.38.120.0/21   192.147.168.1  100 0 2914 8530 i
> > I*> N 93.115.176.0/20  192.147.168.1  100 0 2914 8530 i
> > I*> N 185.52.144.0/22  192.147.168.1  100 0 2914 8530 ?
> > 
> 
> I implemented source-as the way ROA is defining it. So anything which ends
> with a AS_SET will return AS_NONE (which is 0). OpenBGPD has no way to
> have an AS_PATH that has a real 0 in the AS_PATH (those UPDATES are
> treated as withdraw). Because of this also the 5.39.176.0/21 is no longer
> matching in 'bgpctl show rib source-as 8530'.

I'm not sure it should behave that way.

'bgpctl show rib source-as 8530' really ought to return prefixes like
80.87.16.0/20 but also 5.39.176.0/21.

> I'm a bit on the edge here about where to go and currently prefer to
> follow a RFC (which in this case is RFC6811).
> 
>  o  Route Origin ASN: The origin AS number derived from a Route as
> follows:
> 
> *  the rightmost AS in the final segment of the AS_PATH attribute
>  in the Route if that segment is of type AS_SEQUENCE, or
> 
> *  the BGP speaker's own AS number if that segment is of type
>AS_CONFED_SEQUENCE or AS_CONFED_SET or if the AS_PATH is empty,
>or
> 
> *  the distinguished value "NONE" if the final segment of the
>AS_PATH attribute is of any other type.
> 
> As mentioned above I found it strange when behaviour is different because
> of where it is used.

RFC 6811 describes how BGP Origin Validation should be performed, but is
not a guideline how to treat AS_SETs in general or how CLIs should
function. Perhaps 'source-as 0' should just throw an error in all
contexts (both bgpd.conf and bgpctl), since 0 can never be in an AS_SET
or AS_SEQUENCE. Perhaps we shouldn't overload the number.

1/ Maybe 'source-as self' or 'source-as none' can be used to match all
routes originated by the AS in which this bgpd instance runs.

2/ Similarly, the program could be made so that 'AS 15562' (lets use
15562 as example) from the Global Configuration passed on to bgpctl and
used in the filter ruleset means both "what is originated by 15562" and
also what is originated in the own AS (and doesn't have AS_PATH yet, but
we know the router runs in 15562 because of the global config parameter).

Kind regards,

Job



Re: bgpd refactor aspath_match a bit

2018-11-27 Thread Job Snijders
Hi Claudio,

On Fri, Nov 23, 2018 at 03:55:18PM +0100, Claudio Jeker wrote:
> For origin validation I chacked the source_as in struct rde_aspath
> this is not really the right place. It should be in struct aspath
> since that holds all the ASPATH related stuff. Change this, move
> aspath_match out of util.c back into rde_attr.c and adjust code to use
> the cached value also in match from any source-as XYZ rules.
> This last bit causes a minor behavioural change since the old code
> extracted the last non AS_SET asnumber. The new code follows the ROA
> RFC and returns the rightmost AS for AS_SEQUENCE, the local AS for
> empty paths and AS_NONE (which is 0) for everything else.
> So now 'match from any source-as 0' will return all paths that do not
> have a final AS_SEQUENCE segment.
> 
> The reason for this change is that I don't want to have two different
> behaviours for what we call source-as (the one in roa-set and the one on a
> filter).

Something is off, it seems 'source-as 0' is matching anything that has
an AS_SET attribute set:

$ bgpctl show rib source-as 0 | head
flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
   S = Stale, E = Error
origin validation state: N = not-found, V = valid, ! = invalid
origin: i = IGP, e = EGP, ? = Incomplete

flags ovs destination  gateway  lpref   med aspath origin
I*> N 5.39.176.0/21192.147.168.1  100 0 2914 8530 { 
198753 } ?
I*> N 5.101.110.0/24   192.147.168.1  100 0 2914 14061 { 
46652 } i
I*> N 5.175.0.0/19 192.147.168.1  100 0 2914 1299 20773 
{ 8972 } i
I*> N 8.41.202.0/24192.147.168.1  100 0 2914 13789 
30372 { 40179 } i

Similarly, this should return at least 5.39.176.0/21:

$ bgpctl show rib source-as 8530
flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
   S = Stale, E = Error
origin validation state: N = not-found, V = valid, ! = invalid
origin: i = IGP, e = EGP, ? = Incomplete

flags ovs destination  gateway  lpref   med aspath origin
I*> N 80.87.16.0/20192.147.168.1  100 0 2914 8530 ?
I*> N 87.236.128.0/21  192.147.168.1  100 0 2914 8530 ?
I*> N 88.151.152.0/21  192.147.168.1  100 0 2914 8530 ?
I*> N 89.38.120.0/21   192.147.168.1  100 0 2914 8530 i
I*> N 93.115.176.0/20  192.147.168.1  100 0 2914 8530 i
I*> N 185.52.144.0/22  192.147.168.1  100 0 2914 8530 ?

Kind regards,

Job



Re: bgpd refactor community code

2018-11-27 Thread Job Snijders
On Thu, Nov 22, 2018 at 05:56:20PM +0100, Claudio Jeker wrote:
> On Tue, Nov 13, 2018 at 06:53:55PM +0100, Claudio Jeker wrote:
> > This is a large diff that changes the way communities are stored in
> > filters and filter_sets. Both standard communities and large communities
> > now share the same data structure for lookups and at the same time the
> > filters are extended to allow more then one community to match per rule
> > (currently the maximum is 3). I did leave extended communities outside for
> > now since this diff is already big enough but they will follow in a second
> > step.
> > 
> > So now filters like
> > deny to any large-community 196618:0:666 large-community 196618:0:3
> > deny to any community 13030:1016 community 13030:5
> > will work and match only if both communities are matched.
> > 
> > As a side effect the bgpctl show rib code is changed and there is no
> > longer a limitation that only one of the filter is allowed to be used.
> > In other words 'bgpctl show rib as 13030 community 1303:1036' will
> > work now.
> > 
> > Apart from that there should be no other visible change.
> > 
> > Please test and report back

OK job@



Re: prevent bgpd from starting when control socket already used

2018-11-11 Thread Job Snijders
Shouldnt we already bomb out at the following?

cannot bind to 0.0.0.0:179: Address already in use
cannot bind to [::]:179: Address already in use

In any regard, I agree with the functionality proposed. No strong opinion
on the diff itself.

Kind regards,

Job

On Sun, Nov 11, 2018 at 22:35 Remi Locherer  wrote:

> Hi,
>
> I heard from two devs that started a 2nd bgpd by accident (forgot -n for
> a config check) which then caused downtime.
>
> Below diff adds a check to bgpd similar to the one we have now in ospfd and
> ospf6d: if another process is listening on the control socket bgpd exits.
>
> The situation is a bit different than in the ospfd case:
> - The socket path is defined in the config and not with an option.
> - The config is parsed relatively late (after daemon).
>
> Because of that rcctl would not detect that bgpd exits. But this only
> applies
> when the first bgpd was started without using the rc scripts.
>
> bgpd already detects that it can not bind the port if another instance is
> running. Adding a fatal there can have an effect when reloading after a
> config change. Loosing all sessions is not nice.
>
> Of course if the socket path is changed in the config and then reloaded the
> same can happen. I think this is less likely than changing a listen
> address.
>
> I think it is sufficient to add this check only to the control socket and
> not to the restricted socket.
>
> The diff in action:
>
> # /usr/src/usr.sbin/bgpd/obj/bgpd -dv
> startup
> session engine ready
> rereading config
> cannot bind to 0.0.0.0:179: Address already in use
> cannot bind to [::]:179: Address already in use
> control_check: socket in use
> fatal in bgpd: control socket check failed
> rib_new: Adj-RIB-In -> 0
> rib_new: Adj-RIB-Out -> 1
> route decision engine ready
> peer closed imsg connection
> fatal in RDE: Lost connection to parent
> peer closed imsg connection
> SE: Lost connection to parent
> session engine exiting
> #
>
> Comments? OKs?
>
> Remi
>
>
> cvs diff: Diffing .
> Index: bgpd.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/bgpd.c,v
> retrieving revision 1.204
> diff -u -p -r1.204 bgpd.c
> --- bgpd.c  29 Sep 2018 08:11:11 -  1.204
> +++ bgpd.c  11 Nov 2018 20:33:32 -
> @@ -939,6 +939,8 @@ control_setup(struct bgpd_config *conf)
> }
> if ((cname = strdup(conf->csock)) == NULL)
> fatal("strdup");
> +   if ((control_check(cname)) == -1)
> +   fatalx("control socket check failed");
> if ((fd = control_init(0, cname)) == -1)
> fatalx("control socket setup failed");
> if (control_listen(fd) == -1)
> Index: control.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/control.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 control.c
> --- control.c   11 Aug 2017 16:02:53 -  1.90
> +++ control.c   10 Nov 2018 20:52:46 -
> @@ -38,6 +38,32 @@ void  control_result(struct ctl_conn *,
>  ssize_t imsg_read_nofd(struct imsgbuf *);
>
>  int
> +control_check(char *path)
> +{
> +   struct sockaddr_un   sun;
> +   int  fd;
> +
> +   bzero(, sizeof(sun));
> +   sun.sun_family = AF_UNIX;
> +   strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
> +
> +   if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1) {
> +   log_warn("control_check: socket check");
> +   return (-1);
> +   }
> +
> +   if (connect(fd, (struct sockaddr *), sizeof(sun)) == 0) {
> +   log_warnx("control_check: socket in use");
> +   close(fd);
> +   return (-1);
> +   }
> +
> +   close(fd);
> +
> +   return (0);
> +}
> +
> +int
>  control_init(int restricted, char *path)
>  {
> struct sockaddr_un   sun;
> Index: session.h
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/session.h,v
> retrieving revision 1.125
> diff -u -p -r1.125 session.h
> --- session.h   24 Oct 2018 08:26:37 -  1.125
> +++ session.h   10 Nov 2018 21:53:04 -
> @@ -250,6 +250,7 @@ void prepare_listeners(struct bgpd_conf
>  int get_mpe_label(struct rdomain *);
>
>  /* control.c */
> +intcontrol_check(char *);
>  intcontrol_init(int, char *);
>  intcontrol_listen(int);
>  void   control_shutdown(int);
>
>


Re: ifconfig(8) to deny non-contiguous netmask / take 2

2018-09-30 Thread Job Snijders
OK job@



[PATCH] bgpd: expose ROA origin validation state in show rib

2018-09-29 Thread Job Snijders
Dear all,

This small patch exposes the origin validation state in 'bgpctl show
rib' and 'bgpctl show rib detail'. This will help debugging, and draw
attention to routing problems.

I know we're weary of spending horizontal space, but I think spending 3
chars to show the OV state (and as such make it a first class citizen)
will be worth it. When I tried putting the ovs state in the 'flags'
column it just didn't look right.

Example output:

$ bgpctl show rib
flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
   S = Stale, E = Error
origin validation state: N = not-found, V = valid, ! = invalid
origin: i = IGP, e = EGP, ? = Incomplete

flags ovs destination  gateway  lpref   med aspath origin
*>  V 192.0.2.0/24 100.64.2.3 100 0 2 i
*>  ! 192.0.2.0/26 100.64.2.3 100 0 2 i
$ bgpctl show rib detail

BGP routing table entry for 192.0.2.0/24
2
Nexthop 100.64.2.3 (via 100.64.2.3) from 100.64.2.3 (192.147.168.1)
Origin IGP, metric 0, localpref 100, weight 0, ovs valid, external, 
valid, best
Last update: 00:02:19 ago

BGP routing table entry for 192.0.2.0/26
2
Nexthop 100.64.2.3 (via 100.64.2.3) from 100.64.2.3 (192.147.168.1)
Origin IGP, metric 0, localpref 100, weight 0, ovs invalid, external, 
valid, best
Last update: 00:02:19 ago

OK?

Kind regards,

Job

diff --git usr.sbin/bgpctl/bgpctl.c usr.sbin/bgpctl/bgpctl.c
index 7275ea6a2c4..70648ed8f5b 100644
--- usr.sbin/bgpctl/bgpctl.c
+++ usr.sbin/bgpctl/bgpctl.c
@@ -72,8 +72,9 @@ intshow_nexthop_msg(struct imsg *);
 voidshow_interface_head(void);
 int show_interface_msg(struct imsg *);
 voidshow_rib_summary_head(void);
-voidprint_prefix(struct bgpd_addr *, u_int8_t, u_int8_t);
+voidprint_prefix(struct bgpd_addr *, u_int8_t, u_int8_t, u_int8_t);
 const char *print_origin(u_int8_t, int);
+const char *print_ovs(u_int8_t, int);
 voidprint_flags(u_int8_t, int);
 int show_rib_summary_msg(struct imsg *);
 int show_rib_detail_msg(struct imsg *, int, int);
@@ -183,6 +184,7 @@ main(int argc, char *argv[])
ribreq.neighbor = neighbor;
ribreq.aid = res->aid;
ribreq.flags = res->flags;
+   ribreq.validation_state = res->validation_state;
show_mrt.arg = 
if (!(res->flags & F_CTL_DETAIL))
show_rib_summary_head();
@@ -1183,17 +1185,20 @@ show_rib_summary_head(void)
 {
printf("flags: * = Valid, > = Selected, I = via IBGP, A = Announced,\n"
"   S = Stale, E = Error\n");
+   printf("origin validation state: N = not-found, V = valid, ! = 
invalid\n");
printf("origin: i = IGP, e = EGP, ? = Incomplete\n\n");
-   printf("%-5s %-20s %-15s  %5s %5s %s\n", "flags", "destination",
+   printf("%-5s %3s %-20s %-15s  %5s %5s %s\n", "flags", "ovs", 
"destination",
"gateway", "lpref", "med", "aspath origin");
 }
 
 void
-print_prefix(struct bgpd_addr *prefix, u_int8_t prefixlen, u_int8_t flags)
+print_prefix(struct bgpd_addr *prefix, u_int8_t prefixlen, u_int8_t flags,
+u_int8_t ovs)
 {
char*p;
 
print_flags(flags, 1);
+   printf("%3s ", print_ovs(ovs, 1));
if (asprintf(, "%s/%u", log_addr(prefix), prefixlen) == -1)
err(1, NULL);
printf("%-20s", p);
@@ -1252,6 +1257,19 @@ print_flags(u_int8_t flags, int sum)
}
 }
 
+const char *
+print_ovs(u_int8_t validation_state, int sum)
+{
+   switch (validation_state) {
+   case ROA_INVALID:
+   return (sum ? "!" : "invalid");
+   case ROA_VALID:
+   return (sum ? "V" : "valid");
+   default:
+   return (sum ? "N" : "not-found");
+   }
+}
+
 int
 show_rib_summary_msg(struct imsg *imsg)
 {
@@ -1309,7 +1327,7 @@ show_rib_brief(struct ctl_show_rib *r, u_char *asdata)
 {
char*aspath;
 
-   print_prefix(>prefix, r->prefixlen, r->flags);
+   print_prefix(>prefix, r->prefixlen, r->flags, r->validation_state);
printf(" %-15s ", log_addr(>exit_nexthop));
printf(" %5u %5u ", r->local_pref, r->med);
 
@@ -1346,8 +1364,9 @@ show_rib_detail(struct ctl_show_rib *r, u_char *asdata, 
int nodescr, int flag0)
id.s_addr = htonl(r->remote_id);
printf("%s)%c", inet_ntoa(id), EOL0(flag0));
 
-   printf("Origin %s, metric %u, localpref %u, weight %u, ",
-   print_origin(r->origin, 0), r->med, r->local_pref, r->weight);
+   printf("Origin %s, metric %u, localpref %u, weight %u, ovs %s, ",
+   print_origin(r->origin, 0), r->med, r->local_pref, r->weight,
+   print_ovs(r->validation_state, 0));
print_flags(r->flags, 0);
 
now = time(NULL);
diff --git 

Re: bgpd ROA validation

2018-09-25 Thread Job Snijders
On Tue, Sep 25, 2018 at 12:23:48PM +0200, Claudio Jeker wrote:
> On Sat, Sep 22, 2018 at 09:48:24PM +0000, Job Snijders wrote:
> > Seems we are getting very close. Some suggestions to simplify the
> > experience for the end user.
> > 
> > Let's start with supporting just one (unnamed) roa-set, so far I've
> > really not come across a use case where multiple ROA tables are
> > useful.  I say this having implemented origin validation on both
> > ISPs and IXPs.
> > 
> > The semantics of the Origin Validation procedure that apply to
> > "Validated ROA Payloads" are not compatible with how the ARIN WHOIS
> > OriginAS semantics, so roa-set will never be used for ARIN WHOIS
> > data.
> 
> Please explain further, because honestly both the ruleset that
> arouteserver generates for ARIN WHOIS OriginAS and ROA are doing
> the same. They connect a source-as with a prefix. This is what a
> roa-set is giving you. It implements a way to quickly lookup a 2 key
> element (AS + prefix). 

The big difference between IRR, ARIN WHOIS and RPKI is that the former
two carry no exclusivity, and the latter (RPKI) semantically means
someting different than IRR or WHOIS data can convey.

When an object from IRR or WHOIS indicates that something like
"192.0.2.0/24 source-as 65001", it does *not* mean that 192.0.2.0/24
can't be originated by other ASNs. There may be other sources (other
IRRs) that also have route objects covering the prefix, or not. It just
means that 65001 is one of the valid source ASNs, it is a rather weak
statement.

However, if the same information "192.0.2.0/24 source-as 65001" comes
from a RPKI ROA, it means that *ALL OTHER* announcements from different
source ASNs or with different prefix lengths are invalid.

You can't apply the origin validation procedure if the information fed
into the procedure has no proof of termination.

Phrased differently: this is why you can do RPKI Origin Validation on
all your EBGP sessions, including your transit providers. But with IRR
based filters you can't really apply them on you transit providers.
IRR/WHOIS are only an allowlist, where as RPKI ROAs is a combined allow
+ blocklist.

> Depending on how this table is used it can be used for both cases.
> This is the technical view based on looking at rulesets and thinking
> on how to write them in a way that is fast and understandable.  I
> understand that from an administration point of view RPKI ROA is much
> stronger and can therefor be used more strictly (e.g. with the simple
> rule deny quick from ebgp roa-set RPKI invalid). The generic origin
> validatiation as a supporting measure to IRR based prefixset filtering
> is only allowing additional prefixes based on the roa-set (e.g. match
> from ebgp \ large-community $INTCOMM_ORIGIN_OK roa-set ARINDB valid \
> set large-community $INTCOMM_PREF_OK). They are two different things
> but can use the same filter building blocks.  Maybe naming it roa-set
> is wrong (since it is too much connected with RPKI) but the lookup
> table is usable for more than just RPKI. This is why I think
> supporting multiple tables is benefitial. I also agree that a
> simplified user experience for RPKI is a good thing, it should be
> simple to use.

At this point in time I do not thing we should support multiple tables.
Let's finish RPKI ROA origin validation and later, separately, look if
we can optimise for IRR/WHOIS data.

> > There currently is 1 RPKI, and therefor we should have just 1 roa-set.
> 
> I would fully agree here if ARIN would make their trust anchor freely
> distributable. 

That is entirely unrelated, and being worked on outside of OpenBSD
scope.

> But yes in general only one RPKI table is needed.  My point is that
> roa-set is more generic than RPKI. It is not an RPKI only thing. It
> can be used for more than that and we should support this as well
> since it makes for much better and efficient configs.

I disagree, roa-set is an RPKI only thing. There currently is nothing
else to support. No other type of published routing information has the
same semantics as a ROA and are thus suitable for the origin validation
procedure.

> > The advantage of having only one roa-set is that it does not need to be
> > referenced in the policies.
> > 
> > I propose the patch is amended to accomodate the following:
> > 
> > roa-set {
> > 165.254.255.0/24 source-as 15562
> > 193.0.0.0/21 source-as 
> > }
> > 
> > deny from any ovs invalid
> > match from any ovs valid set community local-as:42
> > match from any ovs unknown set community local-as:43
> > 
> > I suggest the filter keyword 'ovs' (origin validation state) is
> > introduced to allow easy mat

Re: bgpd ROA validation

2018-09-22 Thread Job Snijders
Hi claudio,

Seems we are getting very close. Some suggestions to simplify the
experience for the end user.

Let's start with supporting just one (unnamed) roa-set, so far I've
really not come across a use case where multiple ROA tables are useful.
I say this having implemented origin validation on both ISPs and IXPs.

The semantics of the Origin Validation procedure that apply to
"Validated ROA Payloads" are not compatible with how the ARIN WHOIS
OriginAS semantics, so roa-set will never be used for ARIN WHOIS data.
There currently is 1 RPKI, and therefor we should have just 1 roa-set.

The advantage of having only one roa-set is that it does not need to be
referenced in the policies.

I propose the patch is amended to accomodate the following:

roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 source-as 
}

deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs unknown set community local-as:43

I suggest the filter keyword 'ovs' (origin validation state) is
introduced to allow easy matching. I think this looks better. If the
roa-set is not defined or empty, all route announcements are 'unknown'.

I'm working on a patch to introduce the 'ovs' keyword to bgpctl too,
folks can do stuff like "bgpctl show rib ovs valid". I'll share that
with you privately.

Kind regards,

Job



Re: bgpd ROA validation

2018-09-22 Thread Job Snijders
On Fri, Sep 21, 2018 at 05:29:24PM +0200, Claudio Jeker wrote:
> I currently use the RIPE RPKI validator to grab a JSON file (e.g.
> http://localcert.ripe.net:8088/export.json) and feed that to this perl
> script to convert it into bgpd syntax:

For now I recommend using https://rpki.gin.ntt.net/api/export.json -
should be reliable.

The localcert.ripe.net instance is not run as a production service, more
as a sort of public facing demonstration of an old version of the
validator.

Of course best operational practise is to run your own RPKI validator.
Currently there are two implementations available that can provide the
'export.json' file:

https://github.com/RIPE-NCC/rpki-validator-3 (java, more mature)
https://github.com/nlnetlabs/routinator (rust, beta)

We'll have to use one of these until we have our own openbsd validator.

Kind regards,

Job



Re: bgpd roa-set backend

2018-09-14 Thread Job Snijders
Dear Claudio,

On Fri, Sep 14, 2018 at 04:59:51PM +0200, Claudio Jeker wrote:
> This diff extends the existing trie code for prefix-set to also work with
> roa-set. Unlike prefix-set there is no need for a prefixlen mask during
> lookup, instead the source-as needs to be checked and also if the
> prefixlen of the prefix is allowed.
> The lookup can return 3 states:
> ROA_UNKNONW: prefix is not covered by any entry
> ROA_VALID: prefix is covered and the source-as matches as does the prefixlen
> ROA_INVALID: there was a covering ROA entry that did not match source-as
>   or prefixlen

I didn't test yet - but from your above description I have these three
questions.

Given this ROA file with two entries: 

prefix 80.128.0.0/11 maxlen 11 source-as 0
prefix 80.128.0.0/11 maxlen 11 source-as 3320

1/ The following announcement "80.128.0.0/11 AS_PATH 2914_3320" received
via (E)BGP must result in ROA_VALID state. An announcement must be
considered ROA_VALID if it correctly matches /any/ of the configured
ROAs. So the presence of "80.128.0.0/11 maxlen 11 source-as 0" should
not invalidate any other ROA. Does this match your understanding too?

> The source-as check is done with an as_set and should therefor scale
> well.  In general these lookups need to be quick since all prefixes
> will go through roa lookups.

2/ I take it that (iff a ROA file is defined) any and all announcements
will go through the lookup? (I think this is good.)

3/ If the contents of the roa_file are change, and a 'bgpctl reload' is
issued - the validation state should be reevaluated. For instance if we
start with:

prefix 80.128.0.0/11 maxlen 11 source-as 0
prefix 80.128.0.0/11 maxlen 11 source-as 3320

and have announcement 80.128.0.0/11 AS_PATH 2914_3320 (ROA_VALID) in
Loc-RIB, the moment the ROA file is changed to:

prefix 80.128.0.0/11 maxlen 11 source-as 0
prefix 80.128.0.0/11 maxlen 11 source-as 666

the announcement 80.128.0.0/11 AS_PATH 2914_3320 should become
ROA_INVALID. Should we get to the point where we can filter based on
validation state, and I configured something like:

deny from ebgp validation-state invalid

The 80.128.0.0/11 AS_PATH 2914_3320 announcement should become an
infeasible/rejected path.

> The frontend code (parse.y and imsg passing) is missing, since this is
> already fairly large and it is tested by the unit tests I decided to
> send this out as an idividual step.

yes.

Kind regards,

Job



Re: bgpd: refine source-as matching

2018-08-09 Thread Job Snijders
On Thu, Aug 09, 2018 at 03:10:11PM +0200, Claudio Jeker wrote:
> Per rfc6472 AS_SET should no longer be used but some AS still do.
> Until now source-as would take the rightmost AS number of an AS_PATH
> no matter if it was an AS_SEQUENCE or an AS_SET. Thit is not correct.

Indeed, good find!

> Also because AS_SET are used in aggregation source-as should match
> against the aggregator AS (which it the rightmost as of the previous
> AS_PATH segment).  At the same time move the peer-as check out of the
> loop, there is no reason to have that inside the for loop.
> 
> This diff implements this and also makes aspath_extract() and the
> lenght checks a bit more paranoid.

tested:

$ bgpctl show rib source-as 8530 5.39.176.0/21
flags: * = Valid, > = Selected, I = via IBGP, A = Announced, S = Stale
origin: i = IGP, e = EGP, ? = Incomplete

flags destination  gateway  lpref   med aspath origin
I*>   5.39.176.0/21192.147.168.1  100 0 2914 8530 { 
198753 } ?

OK job@

Kind regards,

Job



[PATCH] column(1): -r to right justify

2018-07-04 Thread Job Snijders
Dear all,

Following some back and forth on how disklabel output should be
formatted, I proposed to Kenneth to extend the column(1) utility. All that
was missing is the ability to right justify. I've longed for this
feature for a while: I often use 'column -t' to prettify data coming
from an awk pipeline.

Example:

job@vurt ~$ netstat -r | column -t -r | tail -5 
ff01::%iwm0/32  fe80::4708:d2be:9a Um 03  - 4 
iwm0  
 ff01::%lo0/32   localhost Um 01  32768 4  
lo0  
 ff02::/16   localhost   UGRS 01  32768 8  
lo0  
ff02::%iwm0/32  fe80::4708:d2be:9a Um 03  - 4 
iwm0  
 ff02::%lo0/32   localhost Um 01  32768 4  
lo0  


Patch courtesy of Kenneth R Westerback. OK?

Index: column.1
===
RCS file: /cvs/src/usr.bin/column/column.1,v
retrieving revision 1.18
diff -u -p -r1.18 column.1
--- column.124 Oct 2016 13:53:05 -  1.18
+++ column.14 Jul 2018 10:27:54 -
@@ -40,6 +40,7 @@
 .Nm column
 .Op Fl tx
 .Op Fl c Ar columns
+.Op Fl r Op Ar list
 .Op Fl s Ar sep
 .Op Ar
 .Sh DESCRIPTION
@@ -66,6 +67,16 @@ The options are as follows:
 Output is formatted for a display
 .Ar columns
 wide.
+.It Fl r Op Ar list
+Table mode will right justify the column contents for the
+specified columns.
+.Ar list
+is a list of comma separated column numbers or column ranges.
+Column numbers start at 1.
+The list must not contain whitespace.
+If no
+.Ar list
+is provided then all columns will be right justified.
 .It Fl s Ar sep
 Specify a set of characters to delimit columns for the
 .Fl t
Index: column.c
===
RCS file: /cvs/src/usr.bin/column/column.c,v
retrieving revision 1.26
diff -u -p -r1.26 column.c
--- column.c22 Jun 2018 12:27:00 -  1.26
+++ column.c4 Jul 2018 10:28:00 -
@@ -47,7 +47,8 @@
 void  c_columnate(void);
 void *ereallocarray(void *, size_t, size_t);
 void  input(FILE *);
-void  maketbl(void);
+int   rightjustify(const char *, const int);
+void  maketbl(const int, const char *);
 void  print(void);
 void  r_columnate(void);
 __dead void usage(void);
@@ -69,8 +70,8 @@ main(int argc, char *argv[])
 {
struct winsize win;
FILE *fp;
-   int ch, tflag, xflag;
-   char *p;
+   int ch, rflag, tflag, xflag;
+   char *p, *rcols;
const char *errstr;
 
setlocale(LC_CTYPE, "");
@@ -87,14 +88,19 @@ main(int argc, char *argv[])
if (pledge("stdio rpath", NULL) == -1)
err(1, "pledge");
 
-   tflag = xflag = 0;
-   while ((ch = getopt(argc, argv, "c:s:tx")) != -1) {
+   rcols = NULL;
+   rflag = 0; tflag = xflag = 0;
+   while ((ch = getopt(argc, argv, "c:r::s:tx")) != -1) {
switch(ch) {
case 'c':
termwidth = strtonum(optarg, 1, INT_MAX, );
if (errstr != NULL)
errx(1, "%s: %s", errstr, optarg);
break;
+   case 'r':
+   rflag = 1;
+   rcols = optarg;
+   break;
case 's':
if ((separator = reallocarray(NULL, strlen(optarg) + 1,
sizeof(*separator))) == NULL)
@@ -139,7 +145,7 @@ main(int argc, char *argv[])
return eval;
 
if (tflag)
-   maketbl();
+   maketbl(rflag, rcols);
else if (*maxwidths >= termwidth)
print();
else if (xflag)
@@ -207,18 +213,69 @@ print(void)
puts(table[row]->content);
 }
 
+int
+rightjustify(const char *rcols, const int col)
+{
+   const char *errstr;
+   char c, *num, *temp;
+   long long ch, rangestart;
+   unsigned int i;
+
+   if (rcols == NULL)
+   return 1;
+
+   temp = strdup(rcols);
+   num = temp;
+   rangestart = -1;
+
+   c = 0;
+   for (i = 0; i <= strlen(rcols); i++) {
+   ch = temp[i];
+   if (ch == ',' || ch == '-')
+   temp[i] = '\0';
+   if (temp[i] != '\0')
+   continue;
+
+   c = strtonum(num, 1, INT_MAX, );
+   if (errstr != NULL)
+   break;
+   c--;/* Users are 1-based. Reality is 0-based. */
+
+   if (c == col)
+   goto found;
+   if (ch == '-') {
+   rangestart = c;
+   } else if ((ch == ',' || ch == '\0') && rangestart != -1) {
+   if (rangestart <= col && c >= col)
+   goto found;
+   rangestart = -1;
+   }
+   num = temp + i + 1;
+   }
+
+   

Re: BGPD - refactor nexthop handling

2018-06-24 Thread Job Snijders
On Thu, Jun 21, 2018 at 08:59:45PM +0200, Claudio Jeker wrote:
> This is the first step of some larger reshuffling of how the RDE is
> working. One of the things needed is proper reference counting for
> nexthops since I want to kill nexthop_link and nexthop_unlink in the
> long run.
> 
> Even though an intermediat step the result is IMO a lot cleaner than
> before. Caching the nexthop in the filter_set and holding a refcnt
> while doing the UPDATE processing makes a lot of the special code
> disapear.
> 
> Please test and review

Ran this on my home router and a route collector for a few days - no
issues.

OK job@



Add 6to4 anycast prefixes to examples/bgpd.conf

2018-06-21 Thread Job Snijders
Hi,

Globally anycasted 6to4 has outlived its usefulness.
Operational discussion: http://seclists.org/nanog/2018/Jun/268

Kind regards,

Job

diff --git etc/examples/bgpd.conf etc/examples/bgpd.conf
index a5fa7234a3c..77f610b9a06 100644
--- etc/examples/bgpd.conf
+++ etc/examples/bgpd.conf
@@ -118,6 +118,7 @@ deny from any prefix 127.0.0.0/8 prefixlen >= 8 # 
localhost [RFC1122]
 deny from any prefix 169.254.0.0/16 prefixlen >= 16# link local [RFC3927]
 deny from any prefix 172.16.0.0/12 prefixlen >= 12 # private space 
[RFC1918]
 deny from any prefix 192.0.2.0/24 prefixlen >= 24  # TEST-NET-1 [RFC5737]
+deny from any prefix 192.88.99.0/24 prefixlen >= 24# 6to4 anycast [RFC7526]
 deny from any prefix 192.168.0.0/16 prefixlen >= 16# private space 
[RFC1918]
 deny from any prefix 198.18.0.0/15 prefixlen >= 15 # benchmarking [RFC2544]
 deny from any prefix 198.51.100.0/24 prefixlen >= 24   # TEST-NET-2 [RFC5737]
@@ -131,6 +132,7 @@ deny from any prefix 0100::/64 prefixlen >= 64  
# Discard-Only [RFC]
 deny from any prefix 2001:2::/48 prefixlen >= 48   # BMWG [RFC5180]
 deny from any prefix 2001:10::/28 prefixlen >= 28  # ORCHID [RFC4843]
 deny from any prefix 2001:db8::/32 prefixlen >= 32 # docu range [RFC3849]
+deny from any prefix 2002::/16 prefixlen >= 16 # 6to4 anycast [RFC7526]
 deny from any prefix 3ffe::/16 prefixlen >= 16 # old 6bone
 deny from any prefix fc00::/7 prefixlen >= 7   # unique local unicast
 deny from any prefix fe80::/10 prefixlen >= 10 # link local unicast



Re: Should whois(1) and IPv6 default to ANICHOST?

2018-06-17 Thread Job Snijders
OK job@

On Sun, 17 Jun 2018 at 17:00, Florian Obser  wrote:

> I like it, I think the heuristic is good enough.
>
> OK florian@ or I take OKs to commit it myself ;)
>
> On Sun, May 20, 2018 at 07:48:34PM +0100, Mikolaj Kucharski wrote:
> > Hi,
> >
> > This is very naive patch for whois(1) which makes it work
> > by default for IPv6 addresses. I went with very minimal
> > approach here. If string contains colon, it's asumed to
> > be IPv6 address. Comments welcome.
> >
> > Please CC me, as I'm not subscribed to the list.
> >
> > Index: whois.c
> > ===
> > RCS file: /cvs/src/usr.bin/whois/whois.c,v
> > retrieving revision 1.56
> > diff -u -p -u -r1.56 whois.c
> > --- whois.c   26 Jul 2017 15:48:38 -  1.56
> > +++ whois.c   4 Apr 2018 19:01:45 -
> > @@ -278,7 +278,8 @@ whois(const char *query, const char *ser
> >   * If the TLD is a number, query ARIN, otherwise, use
> TLD.whois-server.net.
> >   * If the domain does not contain '.', check to see if it is an NSI
> handle
> >   * (starts with '!') or a CORE handle (COCO-[0-9]+ or COHO-[0-9]+) or an
> > - * ASN (starts with AS). Fall back to NICHOST for the non-handle case.
> > + * ASN (starts with AS) or IPv6 address (contains ':'). Fall back to
> > + * NICHOST for the non-handle and non-IPv6 case.
> >   */
> >  char *
> >  choose_server(const char *name, const char *country, char **tofree)
> > @@ -305,6 +306,8 @@ choose_server(const char *name, const ch
> >   else if ((strncasecmp(name, "AS", 2) == 0) &&
> >   strtol(name + 2, , 10) > 0 && *ep == '\0')
> >   return (MNICHOST);
> > + else if (strchr(name, ':') != NULL) /* IPv6 address */
> > + return (ANICHOST);
> >   else
> >   return (NICHOST);
> >   } else if (isdigit((unsigned char)*(++qhead)))
> >
>
> --
> I'm not entirely sure you are real.
>
>


Re: [patch] crontab(5) add -n option to suppress mail when the run was successful

2018-06-12 Thread Job Snijders
On Tue, Jun 12, 2018 at 09:54:47AM -0600, Theo de Raadt wrote:
> I would prefer if the -q and -n descriptions were in a table.  I dislike
> the ancient style of describing such things inline (harder to spot).
> And it really falls down when there are multiple ones.  How do you
> feel about that jmc?

Agreed, I changed it a bit to improve readability.

> Also, do -qn and -nq work?  How about -nnn.  Not saying those make a
> lot of sense, but once getopt syntax is borrowed it should probably be
> honoured.

I redid this piece a little bit, and opted to go a bit stricter to leave
as much freedom as possible for future extensions.

OK:
-n command
-n -q command
-q -n command
-q command
command

Not OK:
-nn command
-qn command
-q -q command
-n -n -q command

My thinking is by being strict now, we make it possible to add arguments
to options in the future. If we allow "-nn" or "-nq" now, we won't be
able to allow "-n...@instituut.net" in the future. Or maybe we'll want
"-v" to mean something different than "-vv". I don't know, so prefer to
be less forgiving.

Kind regards,

Job


diff --git usr.sbin/cron/crontab.5 usr.sbin/cron/crontab.5
index 9c2e651980a..d9330698fd3 100644
--- usr.sbin/cron/crontab.5
+++ usr.sbin/cron/crontab.5
@@ -193,15 +193,29 @@ will be changed into newline characters, and all data
 after the first
 .Ql %
 will be sent to the command as standard input.
-If the
+.Ss Options
+The
 .Ar command
-field begins with
-.Ql -q ,
-execution will not be logged.
+field can begin with one or more options.
+.Bl -tag -width Ds
+.It Fl n
+No mail is send after a successful run.
+The execution output will only be mailed if the command exits with a non-zero
+exit code.
+The
+.Ql -n
+option is an attempt to cure potentially copious volumes of mail coming from
+.Xr cron 8 .
+.It Fl q
+Execution will not be logged.
+.El
+.Pp
 Use whitespace to separate
-.Ql -q
-from the command.
+.Ql -n ,
+.Ql -q ,
+and the command.
 .Pp
+.Ss Execution
 Commands are executed by
 .Xr cron 8
 when the
@@ -329,6 +343,9 @@ Ranges may include
 .It
 Months or days of the week can be specified by name.
 .It
+Mailing after a successful run can be suppressed with
+.Ql -n .
+.It
 Logging can be suppressed with
 .Ql -q .
 .It
diff --git usr.sbin/cron/do_command.c usr.sbin/cron/do_command.c
index 6a4022fcc9a..4fbca61d170 100644
--- usr.sbin/cron/do_command.c
+++ usr.sbin/cron/do_command.c
@@ -3,6 +3,7 @@
 /* Copyright 1988,1990,1993,1994 by Paul Vixie
  * Copyright (c) 2004 by Internet Systems Consortium, Inc. ("ISC")
  * Copyright (c) 1997,2000 by Internet Software Consortium, Inc.
+ * Copyright (c) 2018 Job Snijders 
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -80,7 +81,6 @@ child_process(entry *e, user *u)
char **p, *input_data, *usernm;
auth_session_t *as;
login_cap_t *lc;
-   int children = 0;
extern char **environ;
 
/* mark ourselves as different to PS command watchers */
@@ -146,7 +146,9 @@ child_process(entry *e, user *u)
 
/* fork again, this time so we can exec the user's command.
 */
-   switch (fork()) {
+
+   pid_t   jobpid;
+   switch (jobpid = fork()) {
case -1:
syslog(LOG_ERR, "(CRON) CAN'T FORK (%m)");
_exit(EXIT_FAILURE);
@@ -260,8 +262,6 @@ child_process(entry *e, user *u)
break;
}
 
-   children++;
-
/* middle process, child of original cron, parent of process running
 * the user's command.
 */
@@ -283,7 +283,8 @@ child_process(entry *e, user *u)
 * we would block here.  thus we must fork again.
 */
 
-   if (*input_data && fork() == 0) {
+   pid_t   stdinjob;
+   if (*input_data && (stdinjob = fork()) == 0) {
FILE *out = fdopen(stdin_pipe[WRITE_PIPE], "w");
int need_newline = FALSE;
int escaped = FALSE;
@@ -331,8 +332,6 @@ child_process(entry *e, user *u)
 */
close(stdin_pipe[WRITE_PIPE]);
 
-   children++;
-
/*
 * read output from the grandchild.  it's stderr has been redirected to
 * it's stdout, which has been redirected to our pipe.  if there is any
@@ -342,15 +341,17 @@ child_process(entry *e, user *u)
 
(void) signal(SIGPIPE, SIG_IGN);
in = fdopen(stdout_pipe[READ_PIPE], "r");
+
+   char*mailto;
+   FILE*mail = NULL;
+   int status = 0;
+   pid_t   mailpid;
+   size_t  bytes = 1;
+
if (in != NULL) {
int ch = getc(in);
 
if (ch != EOF) {
-   FILE*mail = NULL;
-   char*mailto;
-   size_t  bytes = 1;
-   

[patch] crontab(5) add -n option to suppress mail when the run was successful

2018-06-11 Thread Job Snijders
Dear all,

Managing the flow of email coming from cron(8) can be a challenge,
especially when you manage a lot of machines. A pattern I see in system
administration is that either a ton of logic is put in wrappers/scripts
to sensibly deal with any output - or even worse all output is zapped
with this dreaded pattern:

* * * * * command_goes_here 2>&1 >/dev/null # piloting blind

In the above example when your cron job fails, you will never know about
it. You were depending on cron to backup some files? Screwed! The job
has been failing due to filesystem permission errors for weeks - all
your files are gone.

There are workarounds for cron(8)'s shortcomings: you can either put
logic in your scripts to only output when things went wrong, but then
you'll not know about what didn't go wrong: "do x || echo error x".

Another approach is to use wrappers that buffer all output until it is
clear what the exit code of the underlaying command was, and then print
the output so cron will email. Shims such as cronic [1] or chronic [2]
are popular, but not part of base.

To improve the situation I propose to add a simple crontab(5)
convenience option called "-n" (mnemonic "No mail if run successful").
Note that options already are a thing in vixie cron ("-q" has existed
for decades?), but are not part of POSIX.

With this "no mail if success" option you can do things like:

* * * * * -n cp -rv src/ dest/

With the above example crontab(5) entry you'll only receive a mail from
cron(8) if the cp(1) encountered some kind of error. You'll also have in
that email up until what point cp(1) actually was able to copy files.

The "-n" option also encourages folks to be liberal with adding trace
options to their shell scripts like "set -o errexit -o nounset -o xtrace",
and just focus on making sure the script exits with a sensible exit
code. This way when there is some kind of problem, you can read the
full context from the cron mail and be more productive; and if there is
no problem, you won't receive an email, reducing clutter in your inbox.

Kind regards,

Job


[1]: https://habilis.net/cronic/
[2]: https://joeyh.name/code/moreutils/

 usr.sbin/cron/crontab.5|  17 +++-
 usr.sbin/cron/do_command.c | 100 -
 usr.sbin/cron/entry.c  |   5 +++
 usr.sbin/cron/structs.h|   1 +
 4 files changed, 75 insertions(+), 48 deletions(-)

diff --git usr.sbin/cron/crontab.5 usr.sbin/cron/crontab.5
index 9c2e651980a..700010faadf 100644
--- usr.sbin/cron/crontab.5
+++ usr.sbin/cron/crontab.5
@@ -193,14 +193,27 @@ will be changed into newline characters, and all data
 after the first
 .Ql %
 will be sent to the command as standard input.
+.Pp
+If the
+.Ar command
+field begins with
+.Ql -n ,
+the execution output will only be mailed if the command exits with a non-zero
+exit code.
+No mail is sent after a successful run.
+The
+.Ql -n
+option is an attempt to cure potentially copious volumes of mail coming from
+.Xr cron 8 .
 If the
 .Ar command
 field begins with
 .Ql -q ,
 execution will not be logged.
 Use whitespace to separate
-.Ql -q
-from the command.
+.Ql -n ,
+.Ql -q ,
+and the command.
 .Pp
 Commands are executed by
 .Xr cron 8
diff --git usr.sbin/cron/do_command.c usr.sbin/cron/do_command.c
index 6a4022fcc9a..5d60bff6421 100644
--- usr.sbin/cron/do_command.c
+++ usr.sbin/cron/do_command.c
@@ -3,6 +3,7 @@
 /* Copyright 1988,1990,1993,1994 by Paul Vixie
  * Copyright (c) 2004 by Internet Systems Consortium, Inc. ("ISC")
  * Copyright (c) 1997,2000 by Internet Software Consortium, Inc.
+ * Copyright (c) 2018 Job Snijders 
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -80,7 +81,6 @@ child_process(entry *e, user *u)
char **p, *input_data, *usernm;
auth_session_t *as;
login_cap_t *lc;
-   int children = 0;
extern char **environ;
 
/* mark ourselves as different to PS command watchers */
@@ -146,7 +146,9 @@ child_process(entry *e, user *u)
 
/* fork again, this time so we can exec the user's command.
 */
-   switch (fork()) {
+
+   pid_t   jobpid = -1;
+   switch (jobpid = fork()) {
case -1:
syslog(LOG_ERR, "(CRON) CAN'T FORK (%m)");
_exit(EXIT_FAILURE);
@@ -260,8 +262,6 @@ child_process(entry *e, user *u)
break;
}
 
-   children++;
-
/* middle process, child of original cron, parent of process running
 * the user's command.
 */
@@ -283,7 +283,8 @@ child_process(entry *e, user *u)
 * we would block here.  thus we must fork again.
 */
 
-   if (*input_data && fork() == 0) {
+   pid_t   stdinjob = -1;
+   if (*input_data && (stdinjob = fork()) == 0) {
FI

Re: bgpd: stop with the announce confusion

2018-06-07 Thread Job Snijders
On Thu, Jun 07, 2018 at 12:14:07PM +0200, Claudio Jeker wrote:
> > It would be helpful during upgrades if it's possible to write some
> > configurations that work the same on both the old and new versions.
> > That way the configuration can be changed to a version which will
> > still work before you get to the point where you can't simply revert
> > to the old config file). I think if the new version would accept
> > but ignore "announce all", that would be enough.
> 
> Yes, I can add a dummy "announce *" handler. In general people should be
> already able to write filters that work before and after the change.
> Happy to add a bit of code to ease the change but it also comes with a
> price that users may not adjust their configs because they don't fail to
> load.

I'd prefer if only "announce all" is allowed as a no-op (but a warning
about the directive's deprecation is printed), for "announce self" and
"announce default-route" just throw an error and refuse to start the
daemon.

Kind regards,

Job



Re: bgpd: stop with the announce confusion

2018-06-07 Thread Job Snijders
On Wed, Jun 06, 2018 at 11:04:56PM +0200, Claudio Jeker wrote:
> The following diff does a few things.
> a) it removes the `announce (all|none|self|default-route)` version
> b) `announce none` is now `export none`
> c) `announce default-route` is now `export default-route`
> d) the examples file is adjusted accordingly and also the network
>statements are no setting a large-community which is used by the
>filters.
> e) the filters are split in input and output and to a lesser extent
>ibgp vs ebgp.
> f) since anounce self is gone the filters will now default deny (which is
>not a big thing and has the benefit that at start all routes show up in
>Adj-RIB-In first).

OK job@

I built a release & tested on a few routers.

I like Stuart's suggestion, if the directive "announce all" is made a
no-op in 6.4 (and removed in 6.5), I can produce configs that work on
both 6.3 and 6.4, making the upgrade easier.

Kind regards,

Job



Re: bgpd ignore aspath with to large attributes

2018-05-31 Thread Job Snijders
On Wed, May 30, 2018 at 03:18:45PM +0200, Claudio Jeker wrote:
> This adds a protection to handle aspaths overly large attributes in
> bgpd. The main idea is to protect other bgp routes downstream for
> hitting the limit with is often not well catched.

I am not sure this is sound logic. The BGP UPDATE limit has been 4096
for decades, and protocol developers are even looking into extending it
beyond 4096. If a downstream router cannot handle the legal limit that
is on them, they need to fix their non-compliance.

> The limit is currently a bit arbitarily set to 4096 - 1024 which is
> afaik the same limit that bird uses.

Arbitrary limits are arbitrary, not sure I agree with how BIRD handles
this either.

Kind regards,

Job



Re: Eliminate trailing whitespace & typo in chmod

2018-04-13 Thread Job Snijders
https://en.wikipedia.org/wiki/If_and_only_if

As a non-native speaker, it took some years before I realized the use of
“iff” is not a typo.

Kind regards,

Job


Re: Stop ping telling world its pid

2018-04-11 Thread Job Snijders
When things arrive out of sequence, that usually is of special interest to
network operator people. Not sure  the sequence field can easily be
overloaded to increase “validity”.

I’m not great at math, with a 16 bit random value, wouldn’t we start
running into ID collisions around 256 concurrent ping processes? Perhaps
that already is the case today and this patch does nothing to improve that,
but also doesn’t make it worse.

Kind regards,

Job


Re: high sofnet load with gif(4) and icmp

2018-04-05 Thread Job Snijders
Hi,

I'm optimistic about this patch: where previously running traffic
through this router over gif
tunnels would result in memory exhaustion, the problem now seems gone.
memory graph: http://instituut.net/~job/screenshots/ee7f0fa5304032a2.png

Should perhaps an errata / syspatch blob be prepared for the 6.3 users
that rely on gif(4)?
I can attempt to create one for amd64.

Kind regards,

Job

On Thu, Apr 5, 2018 at 2:38 PM, Alexander Bluhm  wrote:
> OK bluhm@
>
> On Thu, Apr 05, 2018 at 09:14:32AM +1000, David Gwynne wrote:
>> Index: if_gif.c
>> ===
>> RCS file: /cvs/src/sys/net/if_gif.c,v
>> retrieving revision 1.113
>> diff -u -p -r1.113 if_gif.c
>> --- if_gif.c  15 Mar 2018 21:01:18 -  1.113
>> +++ if_gif.c  4 Apr 2018 23:12:02 -
>> @@ -403,6 +403,8 @@ gif_output(struct ifnet *ifp, struct mbu
>>   error = ENOBUFS;
>>   goto drop;
>>   }
>> + memcpy((caddr_t)(mtag + 1), >if_index, sizeof(ifp->if_index));
>> + m_tag_prepend(m, mtag);
>>
>>   m->m_pkthdr.ph_family = dst->sa_family;
>>
>



Re: [PATCH] Update default QoS markers for ssh

2018-04-01 Thread Job Snijders
On Sun, Apr 01, 2018 at 11:29:55AM +0100, Stuart Henderson wrote:
> On 2018/03/31 16:10, Job Snijders wrote:
> > TL;DR: I propose to update the defaults to use DSCP "AF21" (Low
> > Latency Data) for interactive session traffic, and CS1 ("Lower
> > Effort") for non-interactive traffic. This applies to both IPv4 and
> > IPv6.
> 
> I think this is the right thing to do, but needs handling in
> conjunction with changes to PF, which has dual queue-setting
> (IPTOS_LOWDELAY packets go in one queue along with empty TCP ACKs,
> other packets can go in another lower priority queue).
> 
> Since firewalls are often updated less often than hosts I think it
> would be better if the PF side was done first with some time to give
> people chance to update, though it doesn't need to be too long (they
> can set qos in sshd_config if they want to retain the old setting).
> 
> A few other pieces use IPTOS_LOWDELAY (pfsync, carp, vxlan, etherip)
> which I think would want switching too.

What we're looking at is some kind of "migrate TOS to DSCP" project.
There is a few tens of files where TOS values need to be examined and an
appropiate DSCP value choosen. I'm not sure I'd go as far as blindly
replacing IPTOS_LOWDELAY with IPTOS_DSCP_AF21, but perhaps as a
transition that is what is needed in some places.

We can start by treating IPTOS_LOWDELAY and IPTOS_DSCP_AF21 similarly in
pf (see untested patch below), and then accept patches that migrate from
TOS to DSCP?

As far as I understand, out-of-the-box pf doesn't do anything with TOS
values, so users wlil have to have explicitly configured something on
the firewall or clients anyhow?

Kind regards,

Job

 share/man/man5/pf.conf.5 | 12 +---
 sys/net/pf.c |  8 +---
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git share/man/man5/pf.conf.5 share/man/man5/pf.conf.5
index 13e23423daa..f5a65e1b0d4 100644
--- share/man/man5/pf.conf.5
+++ share/man/man5/pf.conf.5
@@ -679,7 +679,9 @@ interface, the queueing priority will also be written as 
the priority
 code point in the 802.1Q VLAN header.
 If two priorities are given, TCP ACKs with no data payload and packets
 which have a TOS of
-.Cm lowdelay
+.Cm lowdelay ,
+or DSCP value
+.Cm af21 ,
 will be assigned to the second one.
 Packets with a higher priority number are processed first,
 and packets with the same priority are processed
@@ -702,7 +704,9 @@ section.
 Packets matching this rule will be assigned to the specified
 .Ar queue .
 If two queues are given, packets which have a TOS of
-.Cm lowdelay
+.Cm lowdelay ,
+or DSCP value
+.Cm af21 ,
 and TCP ACKs with no data payload will be assigned to the second one.
 See
 .Sx QUEUEING
@@ -1608,7 +1612,9 @@ Normally only one
 .Ar queue
 is specified; when a second one is specified it will instead be used for
 packets which have a TOS of
-.Cm lowdelay
+.Cm lowdelay ,
+or DSCP value
+.Cm af21 ,
 and for TCP ACKs with no data payload.
 .Pp
 To continue the previous example, the examples below would specify the
diff --git sys/net/pf.c sys/net/pf.c
index d841f834af1..aac20603a01 100644
--- sys/net/pf.c
+++ sys/net/pf.c
@@ -2804,7 +2804,7 @@ pf_build_tcp(const struct pf_rule *r, sa_family_t af,
h->ip_len = htons(tlen);
h->ip_v = 4;
h->ip_hl = sizeof(*h) >> 2;
-   h->ip_tos = IPTOS_LOWDELAY;
+   h->ip_tos = IPTOS_DSCP_AF21;
h->ip_len = htons(len);
h->ip_off = htons(ip_mtudisc ? IP_DF : 0);
h->ip_ttl = ttl ? ttl : ip_defttl;
@@ -7020,7 +7020,8 @@ done:
pf_scrub(pd.m, s->state_flags, pd.af, s->min_ttl,
s->set_tos);
pf_tag_packet(pd.m, s->tag, s->rtableid[pd.didx]);
-   if (pqid || (pd.tos & IPTOS_LOWDELAY)) {
+   if (pqid || (pd.tos & IPTOS_LOWDELAY)
+   || (pd.tos & IPTOS_DSCP_AF21)) {
qid = s->pqid;
if (s->state_flags & PFSTATE_SETPRIO)
pd.m->m_pkthdr.pf.prio = s->set_prio[1];
@@ -7032,7 +7033,8 @@ done:
} else {
pf_scrub(pd.m, r->scrub_flags, pd.af, r->min_ttl,
r->set_tos);
-   if (pqid || (pd.tos & IPTOS_LOWDELAY)) {
+   if (pqid || (pd.tos & IPTOS_LOWDELAY)
+   || (pd.tos & IPTOS_DSCP_AF21)) {
qid = r->pqid;
if (r->scrub_flags & PFSTATE_SETPRIO)
pd.m->m_pkthdr.pf.prio = r->set_prio[1];



[PATCH] Update default QoS markers for ssh

2018-03-31 Thread Job Snijders
Dear all,

There may be opportunity for improvement of ssh(1) and sshd(8)'s default
QoS markers for better integration in environments that can offer either
layer-2 or layer-3 prioritisation profiles. Currently ssh(1) and sshd(8)
set obsoleted values 'lowdelay' for interactive sessions and
'throughput' for non-interactive sessions.

TL;DR: I propose to update the defaults to use DSCP "AF21" (Low Latency
Data) for interactive session traffic, and CS1 ("Lower Effort") for
non-interactive traffic. This applies to both IPv4 and IPv6.

The openbsd 'lowdelay' value translates to IP TOS '0x10', and
'throughput' translates to IP TOS 0x08'. The IP TOS field is 8 bits, and
was defined in RFC 1349 (1992 A.D.). IP TOS was deprecated in RFC 2474
(1998 A.D.) with the introduction of "Differentiated Services Field" aka
DSCP.

When the world transitioned from TOS to DSCP, the IP TOS field was
chopped up into two parts: the 6 most significant bits became the
DSField (which contains a DSCP), and the 2 least significant bits became
a place for ECN experimentation. (Similarly in IPv6, the 'Traffic Class'
became the DSField). To convert the IP TOS value to a DSCP value you
shift the value 2 bits to the right: 'lowdelay' = DSCP 0x04 and
'throughput' = DSCP 0x02. It should be noted that DSCP 0x04 and DSCP
0x02 have no meaning. Both 0x02 and 0x04 map to Precedence "routine",
which is the /least/ important Precedence.

Using IP TOS values is problematic because network operators can't match
on TOS values on today's equipment. We looked at Cisco IOS XR, Juniper
Junos, and Nokia SR-OS - and there aren't classifiers to match on TOS
values. Instead, the systems I mentioned allow you to match on DSCP
value (and perhaps IPv4 precedence).

The ability to correctly match QoS markers of this nature is important
in controlled environments, you'll generally see that the DSCP values
are fully trusted, or not honored at all (such as when crossing Internet
boundaries). An example would be how traffic from VOIP phones/apps are
treated in enterprise office networks, perhaps even in the entire branch
network. The industry practise is that iff QoS markers are honored, they
are translated 1:1 - so AF21 goes into AF21 (or equivalent).

I selected AF21 as this is the highest priority within the low-latency
service class (and it is higher than what we have today). SSH is elastic
and time-sensitive data, where a user is waiting for a response via the
network in order to continue with a task at hand. As such, these flows
should be considered foreground traffic, with delays or drops to such
traffic directly impacting user-productivity.

You may ask, "why not EF?" - the risk with EF is that you can easily end
up in a tiny policer queue, as an example some CPEs by default only
allow up to 100Kbps of EF traffic. EF is meant for inelastic traffic,
where for instance the codec sends packets at the rate the codec
produces them, regardless of availability of capacity.

For bulk SSH traffic, the "Lower Effort" marker was chosen to enable
networks implementing a scavanger/lower-than-best effort class to
discriminate scp(1) below normal activities, such as web surfing. In
general this type of bulk SSH traffic is a background activity. Today
this traffic would go into 'best effort', potentially drowning out other
traffic on wireless links, so kicking this into 'background' may improve
the user experience.

An advantage of using "AF21" for interactive SSH and "CS1" for bulk SSH
is that they are recognisable values on all common platforms (IANA
https://www.iana.org/assignments/dscp-registry/dscp-registry.xml ), and
for AF21 specifically a rigorous definition of the intended behavior
exists https://tools.ietf.org/html/rfc4594#section-4.7 in addition to
the definition of the Assured Forwarding PHB group
https://tools.ietf.org/html/rfc2597, and for CS1 (Lower Effort) there is
https://tools.ietf.org/html/rfc3662 - finally this document is also a
recommened reading: https://tools.ietf.org/html/rfc8325

Another advantage is that the first three bits of "AF21" map to the
equivalent I 802.1D PCP, IEEE 802.11e, MPLS EXP/Cos and IP
Precedence value of 2 (also known as "Immediate", or "AC_BE"), and CS1's
first 3 bits map to I 802.1D PCP, IEEE 802.11e, MPLS/Cos and IP
Precedence value 1 ("Background" or "AC_BK"). 

The below patch ensures that in environments where people are using QoS,
the openssh defaults can easily matched both for interactive and bulk
SSH traffic in all layers of the transport stack (wired/wirless
Ethernet, MPLS, IP) and priortize accordingly.

Kind regards,

Job

The below patch should be portable, but I've only tested it on my own
openbsd machines.

 usr.bin/ssh/readconf.c| 4 ++--
 usr.bin/ssh/servconf.c| 4 ++--
 usr.bin/ssh/ssh_config.5  | 6 --
 usr.bin/ssh/sshd_config.5 | 6 --
 4 files changed, 12 insertions(+), 8 deletions(-)

diff --git usr.bin/ssh/readconf.c usr.bin/ssh/readconf.c
index 5d17b725600..d3a121373d8 100644
--- 

Re: handle updates via Adj-RIB-Out

2018-03-05 Thread Job Snijders
Claudio,

How best to test this change proposal? Should this maybe be tested on
one of the yycix route servers?

I'll let it run on my home router, if that doesn't cause issues in a
week or so; we can consider rs2.yycix.ca

Kind regards,

Job

On Fri, Mar 02, 2018 at 04:55:23PM +0100, Claudio Jeker wrote:
> On Wed, Feb 07, 2018 at 05:52:09AM +0100, Claudio Jeker wrote:
> > This diff changes the way bgpd does updates. Instead of having its own
> > special update queue/tree it uses a regular RIB (Adj-RIB-Out) to store all
> > updates to be sent. Stuff that has been sent is linked to the prefixes
> > queue. On the peer there are also queues for updates and withdraws.
> > The whole update code becomes a lot simpler but also results in the bulk
> > of the diff. Other changes include the bgpctl show rib handling (we can
> > just walk the Adj-RIB-Out now). Last but not least the EOR records are
> > also now a magic rde_aspath (flag F_ATTR_EOR) which is added to the update
> > queue.
> > 
> > This diff is still very large and the changes are intrusive so reviews and
> > testing is very welcome.
> 
> No news on this? Anyone?
> 
> -- 
> :wq Claudio
> 
> Index: rde.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.377
> diff -u -p -r1.377 rde.c
> --- rde.c 7 Feb 2018 00:02:02 -   1.377
> +++ rde.c 7 Feb 2018 00:02:18 -
> @@ -80,8 +80,6 @@ void rde_dump_rib_as(struct prefix *, 
>int);
>  void  rde_dump_filter(struct prefix *,
>struct ctl_show_rib_request *);
> -void  rde_dump_filterout(struct rde_peer *, struct prefix *,
> -  struct ctl_show_rib_request *);
>  void  rde_dump_upcall(struct rib_entry *, void *);
>  void  rde_dump_prefix_upcall(struct rib_entry *, void *);
>  void  rde_dump_ctx_new(struct ctl_show_rib_request *, pid_t,
> @@ -2262,71 +2260,33 @@ rde_dump_rib_as(struct prefix *p, struct
>  }
>  
>  void
> -rde_dump_filterout(struct rde_peer *peer, struct prefix *p,
> -struct ctl_show_rib_request *req)
> +rde_dump_filter(struct prefix *p, struct ctl_show_rib_request *req)
>  {
> - struct bgpd_addr addr;
> - struct rde_aspath   *asp, *fasp;
> - enum filter_actions  a;
> + struct rde_aspath   *asp;
>  
> - if (up_test_update(peer, p) != 1)
> + if (req->peerid && req->peerid != prefix_peer(p)->conf.id)
>   return;
> + if (p->flags & F_PREFIX_USE_PEER)
> + return; /* pending withdraw, skip */
>  
> - pt_getaddr(p->re->prefix, );
>   asp = prefix_aspath(p);
> - a = rde_filter(out_rules, , peer, asp, ,
> - p->re->prefix->prefixlen, asp->peer);
> - if (fasp)
> - fasp->peer = asp->peer;
> - else
> - fasp = asp;
> -
> - if (a == ACTION_ALLOW)
> - rde_dump_rib_as(p, fasp, req->pid, req->flags);
> -
> - if (fasp != asp)
> - path_put(fasp);
> -}
> -
> -void
> -rde_dump_filter(struct prefix *p, struct ctl_show_rib_request *req)
> -{
> - struct rde_peer *peer;
> - struct rde_aspath   *asp;
> -
> - if (req->flags & F_CTL_ADJ_IN ||
> - !(req->flags & (F_CTL_ADJ_IN|F_CTL_ADJ_OUT))) {
> - asp = prefix_aspath(p);
> - if (req->peerid && req->peerid != asp->peer->conf.id)
> - return;
> - if (req->type == IMSG_CTL_SHOW_RIB_AS &&
> - !aspath_match(asp->aspath->data, asp->aspath->len,
> - >as, req->as.as))
> - return;
> - if (req->type == IMSG_CTL_SHOW_RIB_COMMUNITY &&
> - !community_match(asp, req->community.as,
> - req->community.type))
> - return;
> - if (req->type == IMSG_CTL_SHOW_RIB_EXTCOMMUNITY &&
> - !community_ext_match(asp, >extcommunity, 0))
> - return;
> - if (req->type == IMSG_CTL_SHOW_RIB_LARGECOMMUNITY &&
> - !community_large_match(asp, req->large_community.as,
> - req->large_community.ld1, req->large_community.ld2))
> - return;
> - if ((req->flags & F_CTL_ACTIVE) && p->re->active != p)
> - return;
> - rde_dump_rib_as(p, asp, req->pid, req->flags);
> - } else if (req->flags & F_CTL_ADJ_OUT) {
> - if (p->re->active != p)
> - /* only consider active prefix */
> - return;
> - if (req->peerid) {
> - if ((peer = peer_get(req->peerid)) != NULL)
> - rde_dump_filterout(peer, p, req);
> - return;
> - }
> - }
> + if (req->type == IMSG_CTL_SHOW_RIB_AS &&
> + !aspath_match(asp->aspath->data, asp->aspath->len,
> + >as, 

Re: [PATCH] bgplg error message fix

2018-02-02 Thread Job Snijders
On Fri, Feb 02, 2018 at 02:38:54PM +0100, Pierre Emeriaud wrote:
> While playing with bgplg I noticed that 'show ip bgp detail as' error
> message is wrong: expects an AS number but asks for a prefix.

Thanks! Committed.

Kind regards,

Job



Re: /etc/rc: fsck -y

2018-01-18 Thread Job Snijders
On Thu, Jan 18, 2018 at 12:22:21PM +, Stuart Henderson wrote:
> A flag (fsck_flags) would be a better idea than a file, and more
> flexible (users with many disks and low RAM could use the same
> mechanism to force "-l 1" for example).

fsck_flags would be an idea. I prefer not to bring local changes into /etc/rc

Kind regards,

Job

diff --git etc/rc etc/rc
index c88e13ce7ab..bfe76b1c598 100644
--- etc/rc
+++ etc/rc
@@ -349,7 +349,7 @@ if [[ -e /fastboot ]]; then
echo "Fast boot: skipping disk checks."
 elif [[ $1 == autoboot ]]; then
echo "Automatic boot in progress: starting file system checks."
-   do_fsck
+   do_fsck $fsck_flags
 fi
 
 # From now on, allow user to interrupt (^C) the boot process.
diff --git etc/rc.conf etc/rc.conf
index 05146d58a4e..7f116896b2d 100644
--- etc/rc.conf
+++ etc/rc.conf
@@ -100,6 +100,7 @@ multicast=NO# Reject IPv4 multicast packets 
by default
 
 # miscellaneous other flags
 amd_master=/etc/amd/master # AMD 'master' map
+fsck_flags=# "-l 1" to limit the number of parallel checks
 library_aslr=YES   # set to NO to disable library randomization
 savecore_flags=# "-z" to compress
 spamd_black=NO # set to YES to run spamd without greylisting



Re: [patch] faq5.html, added missing branch to update -stable trees.

2018-01-06 Thread Job Snijders
Hi,

On Sat, Jan 6, 2018 at 9:53 AM, Christoph R. Murauer  wrote:
> Here is a diff for the missing -rOPENBSD_6_2 branch to update the
> -stable CVS trees as described in https://www.openbsd.org/anoncvs.html
> at Updating an existing tree.

I don't think this is needed in this context of the documentation.

The -r tag specification is "sticky" when you use this option with
"cvs checkout"
or "cvs update" to make your own copy of a file: cvs remembers the tag and
continues to use it on future update commands, until you specify otherwise.

http://man.openbsd.org/cvs

Kind regards,

Job



Re: pckbd: go back to using table 2 by default

2018-01-02 Thread Job Snijders
Hi all,

I often observed on my Thinkpad x270 that after an upgrade via bsd.rd,
the first reboot resulted in keystrokes being garbage (and at second
reboot everything was fine again).

The below patch seems to be an improvement.

Kind regards,

Job

On Tue, Jan 02, 2018 at 09:36:49AM -0600, joshua stein wrote:
> In 2007 I changed this code to use table 3 by default, falling back 
> on table 2 (the previous default) or 1.  This was done just to make 
> the keyboard on the OQO model 01 work, and these devices are long 
> gone.
> 
> 10 years later, some newer Lenovo machines seem to have trouble 
> trying this mode which can occasionally leave the keyboard in a 
> state where it generates bogus keys when typing.  It also causes a 
> long delay when booting because the table changes have to timeout:
> 
> pckbd: trying table 3
> pckbc_cmd: lost 0xee
> pckbc_cmd: timeout
> pckbd: table set of 3 failed
> pckbd: trying table 2
> pckbc_cmd: lost 0xee
> pckbc_cmd: timeout
> pckbd: table set of 2 failed
> pckbd: trying table 1
> pckbc_cmd: lost 0xee
> pckbc_cmd: timeout
> pckbd: table set of 1 failed
> pckbd: settling on table 1
> 
> This patch reverts back to using table 2 by default and if it says 
> it was already in table 2, leaves it alone rather than forcibly 
> changing to that mode again, which is how Linux behaves.
> 
> 
> Index: sys/dev/pckbc/pckbd.c
> ===
> RCS file: /cvs/src/sys/dev/pckbc/pckbd.c,v
> retrieving revision 1.43
> diff -u -p -u -p -r1.43 pckbd.c
> --- sys/dev/pckbc/pckbd.c 14 Apr 2016 07:06:03 -  1.43
> +++ sys/dev/pckbc/pckbd.c 2 Jan 2018 15:21:05 -
> @@ -214,8 +214,7 @@ int
>  pckbd_set_xtscancode(pckbc_tag_t kbctag, pckbc_slot_t kbcslot,
>  struct pckbd_internal *id)
>  {
> - /* default to have the 8042 translate the keyboard with table 3. */
> - int table = 3;
> + int table;
>  
>   if (pckbc_xt_translation(kbctag)) {
>  #ifdef DEBUG
> @@ -234,12 +233,21 @@ pckbd_set_xtscancode(pckbc_tag_t kbctag,
>   if (id != NULL)
>   id->t_translating = 0;
>   } else {
> - if (id != NULL)
> + if (id != NULL) {
>   id->t_translating = 1;
> + if (id->t_table == 0) {
> + /*
> +  * Don't bother explicitly setting into set 2,
> +  * it's the default.
> +  */
> + id->t_table = 2;
> + return (0);
> + }
> + }
>   }
>  
>   /* keep falling back until we hit a table that looks usable. */
> - for (; table >= 1; table--) {
> + for (table = 3; table >= 1; table--) {
>   u_char cmd[2];
>  #ifdef DEBUG
>   printf("pckbd: trying table %d\n", table);
> 



Re: Add "-c command" option to script(1)

2017-12-16 Thread Job Snijders
On Sat, Dec 16, 2017 at 09:45:02AM +0100, Paul de Weerd wrote:
> On Fri, Dec 15, 2017 at 12:24:45PM +0100, Paul de Weerd wrote:
> | I've updated the diff to add this example as per jmc's suggestion.  It
> | now has:
> | 
> | - add the `-c command` feature
> | - updates usage
> | - removes /* ARGSUSED */ lint comments
> | - documents the -c feature
> | - adds an example to the manpage
> 
> jmc@ pointed out a missing colon at the end of the example.  Apologies
> for the extra noise.  Updated diff (still covers the above five
> changes) included.

OK job@



Re: Add "-c command" option to script(1)

2017-12-15 Thread Job Snijders
On Thu, Dec 14, 2017 at 09:23:29AM +0100, Paul de Weerd wrote:
> Another use I personally find very convenient is this:
> 
> [weerd@pom] $ script -c "vmctl start test -c"
> 
> Hope others see value here too :)

That is a great use case.

Kind regards,

Job



Re: Include hostname in shell prompts by default

2017-12-09 Thread Job Snijders
On Sat, Dec 09, 2017 at 06:44:44PM +0100, Theo Buehler wrote:
> Theo asked me to make sure that all our shells print a prompt
> including the hostname by default.

Thank you. This is a significant improvement.

I keep finding myself updating /etc/profile everywhere to ensure I don't
lose my bearings on where I am.

Kind regards,

Job



Re: pf neighbor discovery hop limit

2017-12-04 Thread Job Snijders
On Mon, Dec 04, 2017 at 02:55:16PM +0100, Alexander Bluhm wrote:
> RFC 4861 requires that all neighbor discovery packets have 255 in
> their IPv6 header hop limit field.  Let pf drop neighbor solicitation,
> neighbor advertisement, router solicitation, router advertisement,
> and redirect ICMP6 packets that do not comply.  This enforces that
> bogus packets cannot be routed when pf is enabled.
> 
> ok?

Wouldn't this be a duplicate of "if (ip6->ip6_hlim != 255)" checks done
in sys/netinet6/{icmp6,nd6_nbr,nd6_rtr}.c ?

Kind regards,

Job



Re: [PATCH] amd64/bsd.rd: add growfs(8)

2017-11-07 Thread Job Snijders
On Mon, Nov 06, 2017 at 04:14:48PM -0700, Theo de Raadt wrote:
> I agree on that.  So please put it into the correct lists files for
> all the unlimited ramdisks.
> 
> Job, the situation is a little nit-picky but try to do it for all the
> architectures and I'll give you fast feedback.

This is what I was able to test. The current state of affairs: growfs
in bsd.rd will cost 16K on amd64 and 21K on i386.

  Filesystem  Size  Used  Avail Capacity mounted on
amd64 with growfs:/dev/rd0a   3.5M  3.1M  365K90%/
amd64 without growfs: /dev/rd0a   3.5M  3.1M  381K89%/
i386 with growfs: /dev/rd0a   3.0M  2.7M  294K90%/
i386 without growfs:  /dev/rd0a   3.0M  2.7M  315K90%/

Below is the MI patch. I glanced at Florian's slaacd commit to figure
out where the link lines should go.

Kind regards,

Job

diff --git distrib/alpha/bsd.rd/list.local distrib/alpha/bsd.rd/list.local
index 4d2d3f1875b..c8d52363fe5 100644
--- distrib/alpha/bsd.rd/list.local
+++ distrib/alpha/bsd.rd/list.local
@@ -1,3 +1,4 @@
+LINK   instbin sbin/growfs
 LINK   instbin sbin/mount_cd9660
 LINK   instbin sbin/dhclient
 LINK   instbin bin/mt bin/eject
diff --git distrib/amd64/ramdisk_cd/list.local 
distrib/amd64/ramdisk_cd/list.local
index 49d677cb6d5..094ead2f06a 100644
--- distrib/amd64/ramdisk_cd/list.local
+++ distrib/amd64/ramdisk_cd/list.local
@@ -9,6 +9,7 @@ LINKinstbin sbin/mount_msdos
 LINK   instbin sbin/mount_udf
 LINK   instbin sbin/newfs_msdos
 LINK   instbin sbin/fsck_msdos
+LINK   instbin sbin/growfs
 LINK   instbin sbin/slaacd
 
 COPY   ${DESTDIR}/etc/ssl/cert.pem etc/ssl/cert.pem
diff --git distrib/arm64/ramdisk/list distrib/arm64/ramdisk/list
index d1b4f696646..3f3a2926aff 100644
--- distrib/arm64/ramdisk/list
+++ distrib/arm64/ramdisk/list
@@ -35,6 +35,7 @@ LINK  instbin sbin/fdisk
 LINK   instbin sbin/fsck
 LINK   instbin sbin/fsck_ext2fs
 LINK   instbin sbin/fsck_ffs
+LINK   instbin sbin/growfs
 LINK   instbin sbin/ifconfig
 LINK   instbin sbin/init
 LINK   instbin sbin/mknod
diff --git distrib/armv7/ramdisk/list distrib/armv7/ramdisk/list
index dd2b1ddc618..02b4800f226 100644
--- distrib/armv7/ramdisk/list
+++ distrib/armv7/ramdisk/list
@@ -35,6 +35,7 @@ LINK  instbin sbin/fdisk
 LINK   instbin sbin/fsck
 LINK   instbin sbin/fsck_ext2fs
 LINK   instbin sbin/fsck_ffs
+LINK   instbin sbin/growfs
 LINK   instbin sbin/ifconfig
 LINK   instbin sbin/init
 LINK   instbin sbin/mknod
diff --git distrib/hppa/ramdisk/list.local distrib/hppa/ramdisk/list.local
index d2130f3bbde..d4598cba7bf 100644
--- distrib/hppa/ramdisk/list.local
+++ distrib/hppa/ramdisk/list.local
@@ -5,6 +5,7 @@ LINKinstbin sbin/disklabel
 LINK   instbin usr/bin/grep usr/bin/egrep 
usr/bin/fgrep
 LINK   instbin usr/bin/more usr/bin/less
 LINK   instbin sbin/bioctl
+LINK   instbin sbin/growfs
 LINK   instbin sbin/slaacd
 
 # copy the MAKEDEV script and make some devices
diff --git distrib/i386/ramdisk_cd/list.local distrib/i386/ramdisk_cd/list.local
index 38879e31040..eed3304bb06 100644
--- distrib/i386/ramdisk_cd/list.local
+++ distrib/i386/ramdisk_cd/list.local
@@ -1,6 +1,7 @@
 # $OpenBSD: list.local,v 1.38 2017/07/08 15:42:46 florian Exp $
 
 # add local links; use bin/sh since instbin has already been unlinked
+LINK   instbin sbin/growfs
 LINK   instbin sbin/mount_ext2fs
 LINK   instbin sbin/mount_msdos
 LINK   instbin sbin/mount_udf
diff --git distrib/landisk/ramdisk/list distrib/landisk/ramdisk/list
index 0aa2b9109d8..6295dd433dc 100644
--- distrib/landisk/ramdisk/list
+++ distrib/landisk/ramdisk/list
@@ -34,6 +34,7 @@ LINK  instbin sbin/fdisk
 LINK   instbin sbin/fsck
 LINK   instbin sbin/fsck_ext2fs
 LINK   instbin 

Re: [PATCH] amd64/bsd.rd: add growfs(8)

2017-11-06 Thread Job Snijders
Thanks for the feedback.

I'll get to work on a MI patch and test on amd64 + i386, then pass it on to you.



[PATCH] amd64/bsd.rd: add growfs(8)

2017-11-05 Thread Job Snijders
Goodmorning everyone,

While quite some resizing scenarios can be done from within single user
mode, resizing the root partition requires you to bring your own
growfs(8) binary into the ramdisk environment. The below patch adds
growfs(8) to the amd64 ramdisk to simplify such operations.

I tested this on amd64 by building full releases with and without the
patch to compare sizes, and by actually growing partitions from within
bsd.rd.

Adding growfs to the crunched binary adds roughly 16 kilobytes, leaving
365K of free space in the the ramdisk's root. I think it's worth it, but
would appreciate feedback from others.

Kind regards,

Job

diff --git distrib/amd64/common/list distrib/amd64/common/list
index 75e1757b6ad..164b67b9bef 100644
--- distrib/amd64/common/list
+++ distrib/amd64/common/list
@@ -32,6 +32,7 @@ LINK  instbin sbin/dmesg
 LINK   instbin sbin/fdisk
 LINK   instbin sbin/fsck
 LINK   instbin sbin/fsck_ffs
+LINK   instbin sbin/growfs
 LINK   instbin sbin/ifconfig
 LINK   instbin sbin/init
 LINK   instbin sbin/kbd
diff --git distrib/special/Makefile distrib/special/Makefile
index c8a22ff7012..bd465ff1fb6 100644
--- distrib/special/Makefile
+++ distrib/special/Makefile
@@ -3,8 +3,8 @@
 SUBDIR=libstubs \
arch bioctl cat chmod chroot cp date dd df dhclient disklabel dmesg \
doas ed eeprom encrypt fdisk fsck fsck_ext2fs fsck_ffs fsck_msdos ftp \
-   ftp-ssl grep gzip hostname ifconfig init installboot kbd ksh ln ls md5 \
-   mkdir mknod mkuboot more mount mount_cd9660 mount_ext2fs \
+   ftp-ssl grep growfs gzip hostname ifconfig init installboot kbd ksh ln \
+   ls md5 mkdir mknod mkuboot more mount mount_cd9660 mount_ext2fs \
mount_ffs mount_msdos mount_nfs mount_udf mt mv newfs newfs_ext2fs \
newfs_msdos pax pdisk ping pwd_mkdb reboot restore rm route sed \
signify slaacd sleep stty sync sysctl umount
diff --git distrib/special/growfs/Makefile distrib/special/growfs/Makefile
new file mode 100644
index 000..e872d157c79
--- /dev/null
+++ distrib/special/growfs/Makefile
@@ -0,0 +1,10 @@
+# $OpenBSD: Makefile,v 1.8 2015/11/23 18:35:18 mmcc Exp $
+
+PROG=  growfs
+SRCS=  growfs.c
+
+DPADD= ${LIBUTIL}
+LDADD= -lutil
+
+.PATH: ${.CURDIR}/../../../sbin/growfs
+.include 



Re: Remove TCP_FACK

2017-10-25 Thread Job Snijders
This has been committed. Since the patch changed the userland ABI, don't
forget to rebuild (at least) fstat, netstat & tcpbench.

Kind regards,

Job



Re: Refactor TCP partial ACK handling

2017-10-24 Thread Job Snijders
On Tue, Oct 24, 2017 at 03:21:08PM +0200, Mike Belopuhov wrote:
> I didn't do it because tcp_var.h is where tcp keeps all of it's prototypes
> but I don't mind moving them into tcp_input.c.  Any objections?  Otherwise
> I'll check in the diff below.

ok job@



Remove TCP_FACK

2017-10-24 Thread Job Snijders
Dear all,

This patch builds upon the work shared in the following email. Mike's
patch is a prerequisite to apply this patch.

Date: Tue, 24 Oct 2017 15:21:08 +0200
From: Mike Belopuhov 
Subject: Re: Refactor TCP partial ACK handling

TCP_FACK was disabled by provos@ in June 1999. This patch removes
the TCP_FACK option and associated #if{,n}def code.

TCP_FACK is an algorithm that decides that when something is lost, all
not SACKed packets until the most forward SACK are lost. It may be a
correct estimate, if network does not reorder packets. 

The algorithm described in RFC 6675 may be a better replacement. This
culling patch can provide guidance how and where to implement 6675.

Kind regards,

Job

 share/man/man4/options.4 |   5 ---
 sys/conf/GENERIC |   1 -
 sys/netinet/tcp_input.c  | 111 +--
 sys/netinet/tcp_output.c |  42 --
 sys/netinet/tcp_timer.c  |   5 ---
 sys/netinet/tcp_usrreq.c |   5 ---
 sys/netinet/tcp_var.h|   6 ---
 usr.bin/netstat/inet.c   |   3 --
 8 files changed, 1 insertion(+), 177 deletions(-)

diff --git share/man/man4/options.4 share/man/man4/options.4
index c28d4e27896..737dc29efea 100644
--- share/man/man4/options.4
+++ share/man/man4/options.4
@@ -445,11 +445,6 @@ TCP to adjust the transmission rate using this signal.
 Both communication endpoints negotiate enabling
 .Em ECN
 functionality at the TCP connection establishment.
-.It Cd option TCP_FACK
-Turns on forward acknowledgements allowing a more precise estimate of
-outstanding data during the fast recovery phase by using
-.Em SACK
-information.
 .It Cd option TCP_SIGNATURE
 Turns on support for the TCP MD5 Signature option (RFC 2385).
 This is used by
diff --git sys/conf/GENERIC sys/conf/GENERIC
index 6df800175ed..e385b45785c 100644
--- sys/conf/GENERIC
+++ sys/conf/GENERIC
@@ -47,7 +47,6 @@ optionFUSE# FUSE
 option SOCKET_SPLICE   # Socket Splicing for TCP and UDP
 option TCP_ECN # Explicit Congestion Notification for TCP
 option TCP_SIGNATURE   # TCP MD5 Signatures, for BGP routing sessions
-#optionTCP_FACK# Forward Acknowledgements for TCP
 
 option INET6   # IPv6
 option IPSEC   # IPsec
diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
index 8d172e2905c..4321d85854c 100644
--- sys/netinet/tcp_input.c
+++ sys/netinet/tcp_input.c
@@ -974,10 +974,6 @@ findpcb:
if (SEQ_GT(tp->snd_una, tp->snd_last))
 #endif
tp->snd_last = tp->snd_una;
-#ifdef TCP_FACK
-   tp->snd_fack = tp->snd_una;
-   tp->retran_data = 0;
-#endif
m_freem(m);
 
/*
@@ -1566,18 +1562,7 @@ trimthenstep6:
 */
if (TCP_TIMER_ISARMED(tp, TCPT_REXMT) == 0)
tp->t_dupacks = 0;
-#ifdef TCP_FACK
-   /*
-* In FACK, can enter fast rec. if the receiver
-* reports a reass. queue longer than 3 segs.
-*/
-   else if (++tp->t_dupacks == tcprexmtthresh ||
-   ((SEQ_GT(tp->snd_fack, tcprexmtthresh *
-   tp->t_maxseg + tp->snd_una)) &&
-   SEQ_GT(tp->snd_una, tp->snd_last))) {
-#else
else if (++tp->t_dupacks == tcprexmtthresh) {
-#endif /* TCP_FACK */
tcp_seq onxt = tp->snd_nxt;
u_long win =
ulmin(tp->snd_wnd, tp->snd_cwnd) /
@@ -1603,15 +1588,6 @@ trimthenstep6:
 #endif
tcpstat_inc(tcps_cwr_frecovery);

tcpstat_inc(tcps_sack_recovery_episode);
-#ifdef TCP_FACK
-   tp->t_dupacks = tcprexmtthresh;
-   (void) tcp_output(tp);
-   /*
-* During FR, snd_cwnd is held
-* constant for FACK.
-*/
-   tp->snd_cwnd = tp->snd_ssthresh;
-#else
/*
 * tcp_output() will send
 * oldest SACK-eligible rtx.
@@ -1619,7 +1595,6 @@ trimthenstep6:
(void) tcp_output(tp);
 

Re: Enable TCP selective acknowledgements (SACK) on all kernels

2017-10-23 Thread Job Snijders
On Sun, Oct 22, 2017 at 04:04:30PM +0200, Mike Belopuhov wrote:
> > If this is as expected, OK job@
> 
> It's setting the option in my build here:
> 
> 15:55:20.336682 fe:e1:bb:d1:a2:f0 fe:e1:ba:d0:55:1e 0800 78: \
>   10.50.50.34.17078 > 10.50.50.1.80: S [tcp sum ok] 1313610867:1313610867(0) \
>   win 16384  3624645505 0> \
>   (DF) (ttl 64, id 25292, len 64)

I went back and tested again, but with a different packetsniffer. You
were right. I now question the tool I used to observe this traffic on a
middle box is giving me correct output.

I can confirm there is no difference observable with tcpdump between
miniroot ftp(1) and normal ftp(1). SACK is set in both contexts.

All good! :)

Kind regards,

Job



Re: Enable TCP selective acknowledgements (SACK) on all kernels

2017-10-22 Thread Job Snijders
On Thu, Oct 19, 2017 at 06:55:05PM +0200, Mike Belopuhov wrote:
> SACK has been enabled in GENERIC kernels for over a decade and it's
> time to make it an official part of the TCP stack. 

I tested your diff by doing an amd64 release build and testing both the
newly created /bsd and /bsd.rd, I observed no problems and SACK was
available in both boot scenarios.

One thing that stood out to me is that the miniroot's "SMALL" ftp(1)
didn't sent SACK as permitted tcp option. However, after chrooting into
my normal environment and using the real /usr/bin/ftp, I observed that
SACK was available and used.

If this is as expected, OK job@

> This grows bsd.rd on amd64 by 8k but Theo said it's within reasonable.
> OK?

$ ls -latr /bsd.rd /tmp/bsd.rd
-rw---  1 root  wheel  9787542 Oct 22 08:36 /bsd.rd
-rw---  1 job   wheel  9782763 Oct 19 12:48 /old/bsd.rd

> diff --git sys/conf/GENERIC sys/conf/GENERIC
> -option   TCP_SACK# Selective Acknowledgements for TCP

I think the below patch may be an appropriate companion for removal of
the option.

Kind regards,

Job

diff --git share/man/man4/options.4 share/man/man4/options.4
index 3e15d4c8c4f..3945611607e 100644
--- share/man/man4/options.4
+++ share/man/man4/options.4
@@ -454,20 +454,6 @@ Turns on forward acknowledgements allowing a more precise 
estimate of
 outstanding data during the fast recovery phase by using
 .Em SACK
 information.
-This option can only be used together with
-.Em TCP_SACK .
-.It Cd option TCP_SACK
-Turns on selective acknowledgements.
-Additional information about
-segments already received can be transmitted back to the sender,
-thus indicating segments that have been lost and allowing for
-a swifter recovery.
-Both communication endpoints need to support
-.Em SACK .
-The fallback behaviour is NewReno fast recovery phase, which allows
-one lost segment to be recovered per round trip time.
-When more than one segment has been dropped per window, the transmission can
-continue without waiting for a retransmission timeout.
 .It Cd option TCP_SIGNATURE
 Turns on support for the TCP MD5 Signature option (RFC 2385).
 This is used by



Re: netstat(1) print PID for sockets.

2017-07-02 Thread Job Snijders
On Sat, Jul 01, 2017 at 08:50:36PM +0200, Sebastian Benoit wrote:
> Sebastian Benoit(be...@openbsd.org) on 2017.07.01 20:40:17 +0200:
> 
> oks for this?
> 
>   /* filter listening sockets out unless -a is set */
> - if (!aflag && istcp && kf->t_state <= TCPS_LISTEN)
> + if (!(aflag || lflag) && istcp && kf->t_state <= TCPS_LISTEN)
> + return;
> + else if (!(aflag||lflag) && isany)

formatting: (aflag||lflag) -> (aflag || lflag)

> +++ usr.bin/netstat/netstat.1
> @@ -38,7 +38,7 @@
>  .Nd show network status
>  .Sh SYNOPSIS
>  .Nm netstat
> -.Op Fl AaBn
> +.Op Fl AaBln

Shouldn't 'v' be added to 'AaBln' too?

Other than that, ok!

Kind regards,

Job



[PATCH] bin: Add -v option to cp(1), mv(1), rm(1)

2017-06-27 Thread Job Snijders
Dear all,

This patch adds a '-v' option to cp(1), mv(1), and rm(1). If the '-v'
option is used, the utility will display the name of each file after it
has been successfully copied, moved or removed.

This patch rolls in feedback from recent discussion. The manpage updates
are specific as to what the -v option does for each utility and the
messages written to the standard output are now minimalistic in nature.

example use:

$ touch a
$ mv -v a b
a -> b
$ cp -v b c
b -> c
$ rm -v b c
b
c

Hat tip to Paul de Weerd for helping with testing.

Kind regards,

Job

diff --git bin/cp/cp.1 bin/cp/cp.1
index 8573d801ca5..2896406498c 100644
--- bin/cp/cp.1
+++ bin/cp/cp.1
@@ -41,14 +41,14 @@
 .Nd copy files
 .Sh SYNOPSIS
 .Nm cp
-.Op Fl fip
+.Op Fl fipv
 .Oo
 .Fl R
 .Op Fl H | L | P
 .Oc
 .Ar source target
 .Nm cp
-.Op Fl fip
+.Op Fl fipv
 .Oo
 .Fl R
 .Op Fl H | L | P
@@ -145,6 +145,8 @@ use a utility such as
 or
 .Xr tar 1
 instead.
+.It Fl v
+Display the source and destination after each copy.
 .El
 .Pp
 For each destination file that already exists, its contents are
diff --git bin/cp/cp.c bin/cp/cp.c
index 643d82ed9fa..480f18b0ff1 100644
--- bin/cp/cp.c
+++ bin/cp/cp.c
@@ -71,7 +71,7 @@
 PATH_T to = { to.p_path, "" };
 
 uid_t myuid;
-int Rflag, fflag, iflag, pflag, rflag;
+int Rflag, fflag, iflag, pflag, rflag, vflag;
 mode_t myumask;
 
 enum op { FILE_TO_FILE, FILE_TO_DIR, DIR_TO_DNE };
@@ -88,7 +88,7 @@ main(int argc, char *argv[])
char *target;
 
Hflag = Lflag = Pflag = Rflag = 0;
-   while ((ch = getopt(argc, argv, "HLPRfipr")) != -1)
+   while ((ch = getopt(argc, argv, "HLPRfiprv")) != -1)
switch (ch) {
case 'H':
Hflag = 1;
@@ -119,6 +119,9 @@ main(int argc, char *argv[])
case 'r':
rflag = 1;
break;
+   case 'v':
+   vflag = 1;
+   break;
default:
usage();
break;
@@ -394,6 +397,9 @@ copy(char *argv[], enum op type, int fts_options)
case S_IFLNK:
if (copy_link(curr, !fts_dne(curr)))
rval = 1;
+   else if (vflag)
+   (void)fprintf(stdout, "%s -> %s\n",
+   curr->fts_path, to.p_path);
break;
case S_IFDIR:
if (!Rflag && !rflag) {
@@ -415,6 +421,9 @@ copy(char *argv[], enum op type, int fts_options)
if (mkdir(to.p_path,
curr->fts_statp->st_mode | S_IRWXU) < 0)
err(1, "%s", to.p_path);
+   else if (vflag)
+   (void)fprintf(stdout, "%s -> %s\n",
+   curr->fts_path, to.p_path);
} else if (!S_ISDIR(to_stat.st_mode))
errc(1, ENOTDIR, "%s", to.p_path);
break;
@@ -426,6 +435,9 @@ copy(char *argv[], enum op type, int fts_options)
} else
if (copy_file(curr, fts_dne(curr)))
rval = 1;
+   if (!rval && vflag)
+   (void)fprintf(stdout, "%s -> %s\n",
+   curr->fts_path, to.p_path);
break;
case S_IFIFO:
if (Rflag) {
@@ -434,6 +446,9 @@ copy(char *argv[], enum op type, int fts_options)
} else
if (copy_file(curr, fts_dne(curr)))
rval = 1;
+   if (!rval && vflag)
+   (void)fprintf(stdout, "%s -> %s\n",
+   curr->fts_path, to.p_path);
break;
case S_IFSOCK:
warnc(EOPNOTSUPP, "%s", curr->fts_path);
@@ -441,6 +456,9 @@ copy(char *argv[], enum op type, int fts_options)
default:
if (copy_file(curr, fts_dne(curr)))
rval = 1;
+   else if (vflag)
+   (void)fprintf(stdout, "%s -> %s\n",
+   curr->fts_path, to.p_path);
break;
}
}
diff --git bin/cp/utils.c bin/cp/utils.c
index 6a3c5178647..2189dd4be1f 100644
--- bin/cp/utils.c
+++ bin/cp/utils.c
@@ -307,9 +307,9 @@ void
 usage(void)
 {
(void)fprintf(stderr,
-   "usage: %s [-fip] [-R [-H | -L | -P]] source target\n", __progname);
+   "usage: %s [-fipv] [-R [-H | -L | -P]] source 

Re: [PATCH 2/3] openbgpd: Add support for 'unknown' well-known communities

2017-06-25 Thread Job Snijders
On Sun, Jun 25, 2017 at 11:41:05PM +0200, Sebastian Benoit wrote:
> ok
> 
> as wor the WELLKNOWN, what do other implementations do?

I'm not aware of other implementations that do a blanket replacement of
"65535:" with something like "WELLKNOWN:" in their CLI output.

Most implementations (after native support for a given well-known
community is added), do replace strings like "65535:666" with a human
readable version like "BLACKHOLE". Subsequently, they often accept both
the numeric and text version of the community as input. In this regard
openbsd is already aligned with the industry pattern.

> If others print it or accept it as input, lets keep it.
> If not, remove it from both your diff and bgpctls output.

ok, below:


diff --git usr.sbin/bgpctl/bgpctl.c usr.sbin/bgpctl/bgpctl.c
index 4d9701da35b..4242b3484ca 100644
--- usr.sbin/bgpctl/bgpctl.c
+++ usr.sbin/bgpctl/bgpctl.c
@@ -1548,7 +1548,7 @@ show_community(u_char *data, u_int16_t len)
printf("BLACKHOLE");
break;
default:
-   printf("WELLKNOWN:%hu", v);
+   printf("%hu:%hu", a, v);
break;
}
else
diff --git usr.sbin/bgpctl/parser.c usr.sbin/bgpctl/parser.c
index 85300d1cd32..0846436a16b 100644
--- usr.sbin/bgpctl/parser.c
+++ usr.sbin/bgpctl/parser.c
@@ -994,10 +994,6 @@ done:
case COMMUNITY_BLACKHOLE:
/* valid */
break;
-   default:
-   /* unknown */
-   fprintf(stderr, "Unknown well-known community\n");
-   return (0);
}
 
if ((fs = calloc(1, sizeof(struct filter_set))) == NULL)
diff --git usr.sbin/bgpd/parse.y usr.sbin/bgpd/parse.y
index f0c96051e17..7eec31f2bda 100644
--- usr.sbin/bgpd/parse.y
+++ usr.sbin/bgpd/parse.y
@@ -3019,10 +3019,6 @@ parsecommunity(struct filter_community *c, char *s)
 
if ((i = getcommunity(s)) == COMMUNITY_ERROR)
return (-1);
-   if (i == COMMUNITY_WELLKNOWN) {
-   yyerror("Bad community AS number");
-   return (-1);
-   }
as = i;
 
if ((i = getcommunity(p)) == COMMUNITY_ERROR)



Re: [PATCH] cp(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
On Sun, Jun 25, 2017 at 02:06:20PM +0200, Job Snijders wrote:
> This patch adds a -v option to cp(1) for more verbose output.

NetBSD/FreeBSD/DragonFly/OSX's cp(1) with "-v" print file names without
the single quotes, which might indeed be more appealing to the eye:

$ touch a b; mkdir c
$ cp -v a b c
a -> c/a
b -> c/b

> +.It Fl v
> +Explain what is being done.

A more compendious option description for bin/cp/cp.1 would be: 

.It Fl v
Display the source and destination of each copied file.

The above 2 suggestions make sense to the proposed mv(1) patch too,
modulo s/copied/moved/.

Kind regards,

Job



Re: [PATCH] rm(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
On Sun, Jun 25, 2017 at 04:09:11PM +0200, Job Snijders wrote:
> --- bin/rm/rm.1
> +++ bin/rm/rm.1
> @@ -95,6 +95,8 @@ that directory is skipped.
>  .It Fl r
>  Equivalent to
>  .Fl R .
> +.It Fl v
> +Explain what is being done.
   

On second thought, "Display what files were removed." would be a more
succinct description of what the "-v" option in 'rm' does.



Re: [PATCH] cp(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
Dear Alexander,

On Sun, Jun 25, 2017 at 06:13:40PM +0200, Alexander Hall wrote:
> On June 25, 2017 2:06:20 PM GMT+02:00, Job Snijders <j...@instituut.net> 
> wrote:
> >This patch adds a -v option to cp(1) for more verbose output.
> >
> > $ touch a b; mkdir c
> > $ cp -v a b c
> > 'a' -> 'c/a'
> > 'b' -> 'c/b'
> > $ cp -rv c d
> > 'c' -> 'd/'
> > 'c/a' -> 'd/a'
> > 'c/b' -> 'd/b'
> 
> Pardon my ignorance, but why?

A fair question.  I myself have two use cases, but others may have their
own to add.

When a glob pattern is used, it can be beneficial to immediately observe
(during the execution of the command) which files have been copied.

When copying very large trees, the -v option provides some insight as to
what progress the cp operation has made so far.

Alternatively one can use rsync(1), but that is not part of the base.

> Is this a gnu thing? 

Not specifically: freebsd, netbsd, darwin and dragonfly have it too.

Kind regards,

Job



Re: [PATCH] rm(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
Hi Ingo,

Thanks for taking the time to review this.

On Sun, Jun 25, 2017 at 03:12:26PM +0200, Ingo Schwarze wrote:
> Job Snijders wrote on Sun, Jun 25, 2017 at 02:06:16PM +0200:
> 
> > This patch adds a '-v' option to rm(1) for more verbose output.
> 
> Do not add new options to standard utilities, unless you can show that
> they are unusually useful in practice *and* practically every other
> system out there has them, with a compatible user interface across
> practically all systems.

I checked a number of systems: busybox, HP-UX, and SunOS don't have -v.

FreeBSD, NetBSD, DragonFlyBSD, Slackware, Minix, Debian, CentOS, Suse,
and Darwin do have a -v option to report which files were deleted.

> Adding -v to rm(1) probably wouldn't make the very high bar against
> adding non-standard options even if almost everybody else had it
> (which i didn't check, to not waste time) because it's completely
> useless.

I appreciate the bar is high, and I doubted whether I should send the
patch at all. However, without some proposed code on the table we would
not be able to have this conversation.

> If you are really unsure, study the output of
> 
>   $ find *
> 
> first, before typing
> 
>   $ rm -rf *
> 
> No non-standard option is needed at all for this.

The 'find *' will show you what existed at the moment of executing the
'find' command, where on the other hand 'rm -rfv *' gives a report of
which files actually were succesfully deleted through the execution of
the 'rm' command.

> It's also strongly in violation of the UNIX philosophy (each utility,
> do one thing, and do it well). rm(1) removes files, and does so well.
> rm(1) does *not* trespass on ls(1) and find(1) territory to list
> files.

I'm not sure I agree with you on this point. To me ls(1) and rm(1) have
different uses, rm(1) being able to tell which files it deleted, is not
at all the same as requesting a listing of files.

> Besides, they way your proposed rm(1) extension lists files is not
> doing it well at all. The output is awful.

Yes, the output could be simplified, below a version of the patch to
align with how freebsd and netbsd do it:

$ touch f; mkdir d; touch d/f
$ rm -rfv *
d/f
d
f

Sticking to the standards is a strong argument, and I understand that
changes to core utilities like rm must be thoroughly vetted. Please
don't take offense in me attempting to propose a change.

Kind regards,

Job

diff --git bin/rm/rm.1 bin/rm/rm.1
index 5c8aefaab7d..edc68b60001 100644
--- bin/rm/rm.1
+++ bin/rm/rm.1
@@ -95,6 +95,8 @@ that directory is skipped.
 .It Fl r
 Equivalent to
 .Fl R .
+.It Fl v
+Explain what is being done.
 .El
 .Pp
 The
@@ -148,7 +150,7 @@ utility is compliant with the
 specification.
 .Pp
 The flags
-.Op Fl dP
+.Op Fl dPv
 are extensions to that specification.
 .Sh HISTORY
 An
diff --git bin/rm/rm.c bin/rm/rm.c
index 3242ef5f410..fc0904d0325 100644
--- bin/rm/rm.c
+++ bin/rm/rm.c
@@ -50,7 +50,7 @@
 
 extern char *__progname;
 
-int dflag, eval, fflag, iflag, Pflag, stdin_ok;
+int dflag, eval, fflag, iflag, Pflag, vflag, stdin_ok;
 
 intcheck(char *, char *, struct stat *);
 void   checkdot(char **);
@@ -73,7 +73,7 @@ main(int argc, char *argv[])
int ch, rflag;
 
Pflag = rflag = 0;
-   while ((ch = getopt(argc, argv, "dfiPRr")) != -1)
+   while ((ch = getopt(argc, argv, "dfiPRrv")) != -1)
switch(ch) {
case 'd':
dflag = 1;
@@ -93,6 +93,9 @@ main(int argc, char *argv[])
case 'r':   /* Compatibility. */
rflag = 1;
break;
+   case 'v':
+   vflag = 1;
+   break;
default:
usage();
}
@@ -201,8 +204,11 @@ rm_tree(char **argv)
case FTS_DP:
case FTS_DNR:
if (!rmdir(p->fts_accpath) ||
-   (fflag && errno == ENOENT))
+   (fflag && errno == ENOENT)) {
+   if (vflag)
+   (void)fprintf(stdout, "%s\n", 
p->fts_path);
continue;
+   }
break;
 
case FTS_F:
@@ -213,8 +219,11 @@ rm_tree(char **argv)
/* FALLTHROUGH */
default:
if (!unlink(p->fts_accpath) ||
-   (fflag && errno == ENOENT))
+   (fflag && errno == ENOENT)) {
+   if (vflag)
+   (void)fprintf(stdout, "%s\n", 
p->fts_path);
continue;
+  

Re: [PATCH 2/3] openbgpd: Add support for 'unknown' well-known communities

2017-06-25 Thread Job Snijders
Small update.

I renamed the 'msb' argument ('most significant bits') to 'part' to
improve readability. In Community 15562:4, '15562' is part 0 and the '4'
is part 1. Same type of logic might be useful down the road for Large
Communities which would have 3 parts.

- Job

diff --git usr.sbin/bgpctl/parser.c usr.sbin/bgpctl/parser.c
index 85300d1cd32..c9d63f9ade3 100644
--- usr.sbin/bgpctl/parser.c
+++ usr.sbin/bgpctl/parser.c
@@ -413,7 +413,7 @@ int  parse_addr(const char *, struct 
bgpd_addr *);
 int parse_asnum(const char *, size_t, u_int32_t *);
 int parse_number(const char *, struct parse_result *,
 enum token_type);
-int getcommunity(const char *);
+int getcommunity(const char *, int);
 int parse_community(const char *, struct parse_result *);
 u_int   getlargecommunity(const char *);
 int parse_largecommunity(const char *, struct parse_result 
*);
@@ -927,7 +927,7 @@ parse_number(const char *word, struct parse_result *r, enum 
token_type type)
 }
 
 int
-getcommunity(const char *s)
+getcommunity(const char *s, int part)
 {
const char  *errstr;
u_int16_tuval;
@@ -935,6 +935,9 @@ getcommunity(const char *s)
if (strcmp(s, "*") == 0)
return (COMMUNITY_ANY);
 
+   if (part == 0 && strcmp(s, "WELLKNOWN") == 0)
+   return (COMMUNITY_WELLKNOWN);
+
uval = strtonum(s, 0, USHRT_MAX, );
if (errstr)
errx(1, "Community is %s: %s", errstr, s);
@@ -978,8 +981,8 @@ parse_community(const char *word, struct parse_result *r)
}
*p++ = 0;
 
-   as = getcommunity(word);
-   type = getcommunity(p);
+   as = getcommunity(word, 0);
+   type = getcommunity(p, 1);
 
 done:
if (as == 0) {
@@ -994,10 +997,6 @@ done:
case COMMUNITY_BLACKHOLE:
/* valid */
break;
-   default:
-   /* unknown */
-   fprintf(stderr, "Unknown well-known community\n");
-   return (0);
}
 
if ((fs = calloc(1, sizeof(struct filter_set))) == NULL)
diff --git usr.sbin/bgpd/parse.y usr.sbin/bgpd/parse.y
index f0c96051e17..ec4ed956b60 100644
--- usr.sbin/bgpd/parse.y
+++ usr.sbin/bgpd/parse.y
@@ -146,7 +146,7 @@ void copy_filterset(struct filter_set_head 
*,
 voidmerge_filter_lists(struct filter_head *, struct filter_head *);
 struct filter_rule *get_rule(enum action_types);
 
-int getcommunity(char *);
+int getcommunity(char *, int);
 int parsecommunity(struct filter_community *, char *);
 int64_t getlargecommunity(char *);
 int parselargecommunity(struct filter_largecommunity *, char *);
@@ -2963,11 +2963,13 @@ symget(const char *nam)
 }
 
 int
-getcommunity(char *s)
+getcommunity(char *s, int part)
 {
int  val;
const char  *errstr;
 
+   if (part == 0 && strcmp(s, "WELLKNOWN") == 0)
+   return (COMMUNITY_WELLKNOWN);
if (strcmp(s, "*") == 0)
return (COMMUNITY_ANY);
if (strcmp(s, "neighbor-as") == 0)
@@ -3017,15 +3019,11 @@ parsecommunity(struct filter_community *c, char *s)
}
*p++ = 0;
 
-   if ((i = getcommunity(s)) == COMMUNITY_ERROR)
+   if ((i = getcommunity(s, 0)) == COMMUNITY_ERROR)
return (-1);
-   if (i == COMMUNITY_WELLKNOWN) {
-   yyerror("Bad community AS number");
-   return (-1);
-   }
as = i;
 
-   if ((i = getcommunity(p)) == COMMUNITY_ERROR)
+   if ((i = getcommunity(p, 1)) == COMMUNITY_ERROR)
return (-1);
c->as = as;
c->type = i;



Re: [PATCH 2/3] openbgpd: Add support for 'unknown' well-known communities

2017-06-25 Thread Job Snijders
On Sun, Jun 25, 2017 at 02:09:22PM +0200, Peter Hessler wrote:
> : $ bgpctl show rib community WELLKNOWN:0
> : ..
> : $ bgpctl show rib community WELLKNOWN:*
> : ..
> 
> Eh, I don't really see a reason to have syntatic sugar for
> '65535'.  In this case, I'm more likely to remember then number than
> which string to use ;).
> 
> : $ doas cat /etc/bgpd.conf | grep set
> : match from any set { community WELLKNOWN:0 community 65535:1 }
> 
> Same as before.  OK for setting 65535:1, but 'E' for
> 'WELLKNOWN:0'.
> 
> However, if we have one, then we need to have the other.

Current bgpd(8) on 6.1 uses the WELLKNOWN sugar in bgpctl output, i felt
it would be proper to allow it in bgpd.conf and bgpctl input too.

6.1 example:

[job@karen ~] $ bgpctl show rib  detail 192.147.168.0/25

BGP routing table entry for 192.147.168.0/25
Nexthop 192.147.168.249 (via 192.147.168.249) from 
192.147.168.249 (192.147.168.1)
Origin IGP, metric 0, localpref 100, weight 0, internal, valid, 
best
Last update: 00:00:08 ago
Communities: WELLKNOWN:1

Kind regards,

Job



[PATCH] cp(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
Dear team,

This patch adds a -v option to cp(1) for more verbose output.

$ touch a b; mkdir c
$ cp -v a b c
'a' -> 'c/a'
'b' -> 'c/b'
$ cp -rv c d
'c' -> 'd/'
'c/a' -> 'd/a'
'c/b' -> 'd/b'

Kind regards,

Job

diff --git bin/cp/cp.1 bin/cp/cp.1
index 8573d801ca5..d4346d23f1d 100644
--- bin/cp/cp.1
+++ bin/cp/cp.1
@@ -41,14 +41,14 @@
 .Nd copy files
 .Sh SYNOPSIS
 .Nm cp
-.Op Fl fip
+.Op Fl fipv
 .Oo
 .Fl R
 .Op Fl H | L | P
 .Oc
 .Ar source target
 .Nm cp
-.Op Fl fip
+.Op Fl fipv
 .Oo
 .Fl R
 .Op Fl H | L | P
@@ -145,6 +145,8 @@ use a utility such as
 or
 .Xr tar 1
 instead.
+.It Fl v
+Explain what is being done.
 .El
 .Pp
 For each destination file that already exists, its contents are
diff --git bin/cp/cp.c bin/cp/cp.c
index 643d82ed9fa..819e02f7be8 100644
--- bin/cp/cp.c
+++ bin/cp/cp.c
@@ -71,7 +71,7 @@
 PATH_T to = { to.p_path, "" };
 
 uid_t myuid;
-int Rflag, fflag, iflag, pflag, rflag;
+int Rflag, fflag, iflag, pflag, rflag, vflag;
 mode_t myumask;
 
 enum op { FILE_TO_FILE, FILE_TO_DIR, DIR_TO_DNE };
@@ -88,7 +88,7 @@ main(int argc, char *argv[])
char *target;
 
Hflag = Lflag = Pflag = Rflag = 0;
-   while ((ch = getopt(argc, argv, "HLPRfipr")) != -1)
+   while ((ch = getopt(argc, argv, "HLPRfiprv")) != -1)
switch (ch) {
case 'H':
Hflag = 1;
@@ -119,6 +119,9 @@ main(int argc, char *argv[])
case 'r':
rflag = 1;
break;
+   case 'v':
+   vflag = 1;
+   break;
default:
usage();
break;
@@ -394,6 +397,9 @@ copy(char *argv[], enum op type, int fts_options)
case S_IFLNK:
if (copy_link(curr, !fts_dne(curr)))
rval = 1;
+   else if (vflag)
+   (void)fprintf(stdout, "'%s' -> '%s'\n",
+   curr->fts_path, to.p_path);
break;
case S_IFDIR:
if (!Rflag && !rflag) {
@@ -415,6 +421,9 @@ copy(char *argv[], enum op type, int fts_options)
if (mkdir(to.p_path,
curr->fts_statp->st_mode | S_IRWXU) < 0)
err(1, "%s", to.p_path);
+   else if (vflag)
+   (void)fprintf(stdout, "'%s' -> '%s'\n",
+   curr->fts_path, to.p_path);
} else if (!S_ISDIR(to_stat.st_mode))
errc(1, ENOTDIR, "%s", to.p_path);
break;
@@ -426,6 +435,9 @@ copy(char *argv[], enum op type, int fts_options)
} else
if (copy_file(curr, fts_dne(curr)))
rval = 1;
+   if (!rval && vflag)
+   (void)fprintf(stdout, "'%s' -> '%s'\n",
+   curr->fts_path, to.p_path);
break;
case S_IFIFO:
if (Rflag) {
@@ -434,6 +446,9 @@ copy(char *argv[], enum op type, int fts_options)
} else
if (copy_file(curr, fts_dne(curr)))
rval = 1;
+   if (!rval && vflag)
+   (void)fprintf(stdout, "'%s' -> '%s'\n",
+   curr->fts_path, to.p_path);
break;
case S_IFSOCK:
warnc(EOPNOTSUPP, "%s", curr->fts_path);
@@ -441,6 +456,9 @@ copy(char *argv[], enum op type, int fts_options)
default:
if (copy_file(curr, fts_dne(curr)))
rval = 1;
+   else if (vflag)
+   (void)fprintf(stdout, "'%s' -> '%s'\n",
+   curr->fts_path, to.p_path);
break;
}
}
diff --git bin/cp/utils.c bin/cp/utils.c
index 6a3c5178647..2189dd4be1f 100644
--- bin/cp/utils.c
+++ bin/cp/utils.c
@@ -307,9 +307,9 @@ void
 usage(void)
 {
(void)fprintf(stderr,
-   "usage: %s [-fip] [-R [-H | -L | -P]] source target\n", __progname);
+   "usage: %s [-fipv] [-R [-H | -L | -P]] source target\n", 
__progname);
(void)fprintf(stderr,
-   "   %s [-fip] [-R [-H | -L | -P]] source ... directory\n",
+   "   %s [-fipv] [-R [-H | -L | -P]] source ... directory\n",
__progname);
exit(1);
 }



[PATCH] rm(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
Hi all,

This patch adds a '-v' option to rm(1) for more verbose output.

$ mkdir a; touch a/b; touch c
$ rm -rfv *
removed 'a/b'
removed directory 'a'
removed 'c'

Kind regards,

Job

diff --git bin/rm/rm.1 bin/rm/rm.1
index 5c8aefaab7d..7de2c7067ee 100644
--- bin/rm/rm.1
+++ bin/rm/rm.1
@@ -95,6 +95,8 @@ that directory is skipped.
 .It Fl r
 Equivalent to
 .Fl R .
+.It Fl v
+Explain what is being done.
 .El
 .Pp
 The
diff --git bin/rm/rm.c bin/rm/rm.c
index 3242ef5f410..574609c64e9 100644
--- bin/rm/rm.c
+++ bin/rm/rm.c
@@ -50,7 +50,7 @@
 
 extern char *__progname;
 
-int dflag, eval, fflag, iflag, Pflag, stdin_ok;
+int dflag, eval, fflag, iflag, Pflag, vflag, stdin_ok;
 
 intcheck(char *, char *, struct stat *);
 void   checkdot(char **);
@@ -73,7 +73,7 @@ main(int argc, char *argv[])
int ch, rflag;
 
Pflag = rflag = 0;
-   while ((ch = getopt(argc, argv, "dfiPRr")) != -1)
+   while ((ch = getopt(argc, argv, "dfiPRrv")) != -1)
switch(ch) {
case 'd':
dflag = 1;
@@ -93,6 +93,9 @@ main(int argc, char *argv[])
case 'r':   /* Compatibility. */
rflag = 1;
break;
+   case 'v':
+   vflag = 1;
+   break;
default:
usage();
}
@@ -201,8 +204,12 @@ rm_tree(char **argv)
case FTS_DP:
case FTS_DNR:
if (!rmdir(p->fts_accpath) ||
-   (fflag && errno == ENOENT))
+   (fflag && errno == ENOENT)) {
+   if (vflag)
+   (void)fprintf(stdout,
+   "removed directory '%s'\n", 
p->fts_path);
continue;
+   }
break;
 
case FTS_F:
@@ -213,8 +220,12 @@ rm_tree(char **argv)
/* FALLTHROUGH */
default:
if (!unlink(p->fts_accpath) ||
-   (fflag && errno == ENOENT))
+   (fflag && errno == ENOENT)) {
+   if (vflag)
+   (void)fprintf(stdout,
+   "removed '%s'\n", p->fts_path);
continue;
+   }
}
warn("%s", p->fts_path);
eval = 1;
@@ -262,7 +273,8 @@ rm_file(char **argv)
if (rval && (!fflag || errno != ENOENT)) {
warn("%s", f);
eval = 1;
-   }
+   } else if (vflag)
+   (void)fprintf(stdout, "removed '%s'\n", f);
}
 }
 
@@ -430,6 +442,6 @@ skip:
 void
 usage(void)
 {
-   (void)fprintf(stderr, "usage: %s [-dfiPRr] file ...\n", __progname);
+   (void)fprintf(stderr, "usage: %s [-dfiPRrv] file ...\n", __progname);
exit(1);
 }



[PATCH] mv(1): add -v option for verbosity

2017-06-25 Thread Job Snijders
Hi all,

This patch adds a -v option to mv(1) for more verbose output.

$ touch a
$ mv -v a b
'a' -> 'b'
$ mkdir c
$ mv -v b c
'b' -> 'c/b'
$ mv -v c d
'e' -> 'd'

And here is an example of the output of the situation mentioned in the
'caveats' section in the manpage:

$ touch f g; mkdir -p d/f
$ mv -v f g d
mv: rename f to d/f: Is a directory
'g' -> 'd/g'
$ echo $?
1

Kind regards,

Job

diff --git bin/mv/mv.1 bin/mv/mv.1
index cb6c9d92673..fc8e642017e 100644
--- bin/mv/mv.1
+++ bin/mv/mv.1
@@ -103,6 +103,8 @@ The
 option overrides any previous
 .Fl f
 options.
+.It Fl v
+Explain what is being done.
 .El
 .Pp
 The
diff --git bin/mv/mv.c bin/mv/mv.c
index 003aca59e87..fa8654b50e4 100644
--- bin/mv/mv.c
+++ bin/mv/mv.c
@@ -51,7 +51,7 @@
 
 extern char *__progname;
 
-int fflg, iflg;
+int fflg, iflg, vflg;
 int stdin_ok;
 
 extern int cpmain(int argc, char **argv);
@@ -71,7 +71,7 @@ main(int argc, char *argv[])
int ch;
char path[PATH_MAX];
 
-   while ((ch = getopt(argc, argv, "if")) != -1)
+   while ((ch = getopt(argc, argv, "ifv")) != -1)
switch (ch) {
case 'i':
fflg = 0;
@@ -81,6 +81,9 @@ main(int argc, char *argv[])
iflg = 0;
fflg = 1;
break;
+   case 'v':
+   vflg = 1;
+   break;
default:
usage();
}
@@ -208,8 +211,11 @@ do_move(char *from, char *to)
 *  message to standard error, and do nothing more with the
 *  current source file...
 */
-   if (!rename(from, to))
+   if (!rename(from, to)) {
+   if (vflg)
+   (void)fprintf(stdout, "'%s' -> '%s'\n", from, to);
return (0);
+   }
 
if (errno != EXDEV) {
warn("rename %s to %s", from, to);
@@ -339,6 +345,10 @@ err:   if (unlink(to))
warn("%s: remove", from);
return (1);
}
+
+   if (vflg)
+   (void)fprintf(stdout, "'%s' -> '%s'\n", from, to);
+
return (0);
 }
 
@@ -362,14 +372,17 @@ mvcopy(char *from, char *to)
_exit(1);
}
 
+   if (vflg)
+   (void)fprintf(stdout, "'%s' -> '%s'\n", from, to);
+
return (0);
 }
 
 void
 usage(void)
 {
-   (void)fprintf(stderr, "usage: %s [-fi] source target\n", __progname);
-   (void)fprintf(stderr, "   %s [-fi] source ... directory\n",
+   (void)fprintf(stderr, "usage: %s [-fiv] source target\n", __progname);
+   (void)fprintf(stderr, "   %s [-fiv] source ... directory\n",
__progname);
exit(1);
 }



[PATCH 3/3] openbgpd: Add well-known community GRACEFUL_SHUTDOWN

2017-06-23 Thread Job Snijders
Dear team,

This patch adds support for the "graceful shutdown" well-known
community as described in draft-ietf-grow-bgp-gshut.

An example implementation would be to add the following to your
bgpd.conf:

match from any community GRACEFUL_SHUTDOWN set { localpref 0 }

Kind regards,

Job

---
 etc/examples/bgpd.conf| 4 
 usr.sbin/bgpctl/bgpctl.c  | 3 +++
 usr.sbin/bgpctl/parser.c  | 7 ++-
 usr.sbin/bgpd/bgpd.conf.5 | 2 ++
 usr.sbin/bgpd/bgpd.h  | 1 +
 usr.sbin/bgpd/parse.y | 6 +-
 6 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/etc/examples/bgpd.conf b/etc/examples/bgpd.conf
index 2ec37b2c752..1caf200ceab 100644
--- a/etc/examples/bgpd.conf
+++ b/etc/examples/bgpd.conf
@@ -87,6 +87,10 @@ allow from any inet6 prefixlen 16 - 48
 #allow from any prefix 0.0.0.0/0
 #allow from any prefix ::/0
 
+# Honor requests to gracefully shutdown BGP sessions
+# https://tools.ietf.org/html/draft-ietf-grow-bgp-gshut
+match from any community GRACEFUL_SHUTDOWN set { localpref 0 }
+
 # https://www.arin.net/announcements/2014/20140130.html
 # This block will be subject to a minimum size allocation of /28 and a
 # maximum size allocation of /24. ARIN should use sparse allocation when
diff --git a/usr.sbin/bgpctl/bgpctl.c b/usr.sbin/bgpctl/bgpctl.c
index 4d9701da35b..8baa8be0ff2 100644
--- a/usr.sbin/bgpctl/bgpctl.c
+++ b/usr.sbin/bgpctl/bgpctl.c
@@ -1532,6 +1532,9 @@ show_community(u_char *data, u_int16_t len)
v = ntohs(v);
if (a == COMMUNITY_WELLKNOWN)
switch (v) {
+   case COMMUNITY_GRACEFUL_SHUTDOWN:
+   printf("GRACEFUL_SHUTDOWN");
+   break;
case COMMUNITY_NO_EXPORT:
printf("NO_EXPORT");
break;
diff --git a/usr.sbin/bgpctl/parser.c b/usr.sbin/bgpctl/parser.c
index 0d1e5d9fb3a..4ea16533b71 100644
--- a/usr.sbin/bgpctl/parser.c
+++ b/usr.sbin/bgpctl/parser.c
@@ -953,7 +953,11 @@ parse_community(const char *word, struct parse_result *r)
int  as, type;
 
/* Well-known communities */
-   if (strcasecmp(word, "NO_EXPORT") == 0) {
+   if (strcasecmp(word, "GRACEFUL_SHUTDOWN") == 0) {
+   as = COMMUNITY_WELLKNOWN;
+   type = COMMUNITY_GRACEFUL_SHUTDOWN;
+   goto done;
+   } else if (strcasecmp(word, "NO_EXPORT") == 0) {
as = COMMUNITY_WELLKNOWN;
type = COMMUNITY_NO_EXPORT;
goto done;
@@ -991,6 +995,7 @@ done:
}
if (as == COMMUNITY_WELLKNOWN)
switch (type) {
+   case COMMUNITY_GRACEFUL_SHUTDOWN:
case COMMUNITY_NO_EXPORT:
case COMMUNITY_NO_ADVERTISE:
case COMMUNITY_NO_EXPSUBCONFED:
diff --git a/usr.sbin/bgpd/bgpd.conf.5 b/usr.sbin/bgpd/bgpd.conf.5
index 6cecd7a5a80..3afc54ef385 100644
--- a/usr.sbin/bgpd/bgpd.conf.5
+++ b/usr.sbin/bgpd/bgpd.conf.5
@@ -1173,6 +1173,7 @@ to do wildcard matching.
 Alternatively, well-known communities may be given by name instead and
 include
 .Ic BLACKHOLE ,
+.Ic GRACEFUL_SHUTDOWN ,
 .Ic NO_EXPORT ,
 .Ic NO_ADVERTISE ,
 .Ic NO_EXPORT_SUBCONFED ,
@@ -1444,6 +1445,7 @@ is an AS number and
 is a locally-significant number between zero and
 .Li 65535 .
 Alternately, well-known communities may be specified by name:
+.Ic GRACEFUL_SHUTDOWN ,
 .Ic NO_EXPORT ,
 .Ic NO_ADVERTISE ,
 .Ic NO_EXPORT_SUBCONFED ,
diff --git a/usr.sbin/bgpd/bgpd.h b/usr.sbin/bgpd/bgpd.h
index db52f858241..ef4e30ffd94 100644
--- a/usr.sbin/bgpd/bgpd.h
+++ b/usr.sbin/bgpd/bgpd.h
@@ -750,6 +750,7 @@ struct filter_peers {
 #defineCOMMUNITY_LOCAL_AS  -4
 #defineCOMMUNITY_UNSET -5
 #defineCOMMUNITY_WELLKNOWN 0x
+#defineCOMMUNITY_GRACEFUL_SHUTDOWN 0x  /* 
draft-ietf-grow-bgp-gshut */
 #defineCOMMUNITY_BLACKHOLE 0x029A  /* RFC 7999 */
 #defineCOMMUNITY_NO_EXPORT 0xff01
 #defineCOMMUNITY_NO_ADVERTISE  0xff02
diff --git a/usr.sbin/bgpd/parse.y b/usr.sbin/bgpd/parse.y
index 73bdb3a0cb9..0b09f83bc0a 100644
--- a/usr.sbin/bgpd/parse.y
+++ b/usr.sbin/bgpd/parse.y
@@ -2991,7 +2991,11 @@ parsecommunity(struct filter_community *c, char *s)
int i, as;
 
/* Well-known communities */
-   if (strcasecmp(s, "NO_EXPORT") == 0) {
+   if (strcasecmp(s, "GRACEFUL_SHUTDOWN") == 0) {
+   c->as = COMMUNITY_WELLKNOWN;
+   c->type = COMMUNITY_GRACEFUL_SHUTDOWN;
+   return (0);
+   } else if (strcasecmp(s, "NO_EXPORT") == 0) {
c->as = COMMUNITY_WELLKNOWN;
c->type = COMMUNITY_NO_EXPORT;
return (0);



[PATCH 2/3] openbgpd: Add support for 'unknown' well-known communities

2017-06-23 Thread Job Snijders
Dear team,

This patch makes 'unknown' well-known communities more of a first-class
citizen.

A powerful property of well-known communities is that (often) operators
can implement the feature associated with a given well-known community
through their local routing policy, ahead of time before their vendor
releasing native support in the implementation. 

Things that work now:

$ bgpctl show rib community 65535:0
..
$ bgpctl show rib community WELLKNOWN:0
..
$ bgpctl show rib community WELLKNOWN:*
..
$ doas cat /etc/bgpd.conf | grep set
match from any set { community WELLKNOWN:0 community 65535:1 }

Kind regards,

Job

---
 usr.sbin/bgpctl/parser.c | 15 +++
 usr.sbin/bgpd/parse.y| 14 ++
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/usr.sbin/bgpctl/parser.c b/usr.sbin/bgpctl/parser.c
index 85300d1cd32..0d1e5d9fb3a 100644
--- a/usr.sbin/bgpctl/parser.c
+++ b/usr.sbin/bgpctl/parser.c
@@ -413,7 +413,7 @@ int  parse_addr(const char *, struct 
bgpd_addr *);
 int parse_asnum(const char *, size_t, u_int32_t *);
 int parse_number(const char *, struct parse_result *,
 enum token_type);
-int getcommunity(const char *);
+int getcommunity(const char *, int);
 int parse_community(const char *, struct parse_result *);
 u_int   getlargecommunity(const char *);
 int parse_largecommunity(const char *, struct parse_result 
*);
@@ -927,7 +927,7 @@ parse_number(const char *word, struct parse_result *r, enum 
token_type type)
 }
 
 int
-getcommunity(const char *s)
+getcommunity(const char *s, int msb)
 {
const char  *errstr;
u_int16_tuval;
@@ -935,6 +935,9 @@ getcommunity(const char *s)
if (strcmp(s, "*") == 0)
return (COMMUNITY_ANY);
 
+   if (msb == 1 && strcmp(s, "WELLKNOWN") == 0)
+   return (COMMUNITY_WELLKNOWN);
+
uval = strtonum(s, 0, USHRT_MAX, );
if (errstr)
errx(1, "Community is %s: %s", errstr, s);
@@ -978,8 +981,8 @@ parse_community(const char *word, struct parse_result *r)
}
*p++ = 0;
 
-   as = getcommunity(word);
-   type = getcommunity(p);
+   as = getcommunity(word, 1);
+   type = getcommunity(p, 0);
 
 done:
if (as == 0) {
@@ -994,10 +997,6 @@ done:
case COMMUNITY_BLACKHOLE:
/* valid */
break;
-   default:
-   /* unknown */
-   fprintf(stderr, "Unknown well-known community\n");
-   return (0);
}
 
if ((fs = calloc(1, sizeof(struct filter_set))) == NULL)
diff --git a/usr.sbin/bgpd/parse.y b/usr.sbin/bgpd/parse.y
index f0c96051e17..73bdb3a0cb9 100644
--- a/usr.sbin/bgpd/parse.y
+++ b/usr.sbin/bgpd/parse.y
@@ -146,7 +146,7 @@ void copy_filterset(struct filter_set_head 
*,
 voidmerge_filter_lists(struct filter_head *, struct filter_head *);
 struct filter_rule *get_rule(enum action_types);
 
-int getcommunity(char *);
+int getcommunity(char *, int);
 int parsecommunity(struct filter_community *, char *);
 int64_t getlargecommunity(char *);
 int parselargecommunity(struct filter_largecommunity *, char *);
@@ -2963,11 +2963,13 @@ symget(const char *nam)
 }
 
 int
-getcommunity(char *s)
+getcommunity(char *s, int msb)
 {
int  val;
const char  *errstr;
 
+   if (msb == 1 && strcmp(s, "WELLKNOWN") == 0)
+   return (COMMUNITY_WELLKNOWN);
if (strcmp(s, "*") == 0)
return (COMMUNITY_ANY);
if (strcmp(s, "neighbor-as") == 0)
@@ -3017,15 +3019,11 @@ parsecommunity(struct filter_community *c, char *s)
}
*p++ = 0;
 
-   if ((i = getcommunity(s)) == COMMUNITY_ERROR)
+   if ((i = getcommunity(s, 1)) == COMMUNITY_ERROR)
return (-1);
-   if (i == COMMUNITY_WELLKNOWN) {
-   yyerror("Bad community AS number");
-   return (-1);
-   }
as = i;
 
-   if ((i = getcommunity(p)) == COMMUNITY_ERROR)
+   if ((i = getcommunity(p, 0)) == COMMUNITY_ERROR)
return (-1);
c->as = as;
c->type = i;



[PATCH 1/3] openbgpd: Allow localpref of zero

2017-06-23 Thread Job Snijders
Dear team,

The lowest valid BGP LOCAL_PREF is 0, allowing bgpd to set 0 too will
accomodate interopability.

Kind regards,

Job

--- a/usr.sbin/bgpd/parse.y
+++ b/usr.sbin/bgpd/parse.y
@@ -1988,7 +1988,7 @@ filter_set_opt: LOCALPREF NUMBER  {
}
if (($$ = calloc(1, sizeof(struct filter_set))) == NULL)
fatal(NULL);
-   if ($2 > 0) {
+   if ($2 >= 0) {
$$->type = ACTION_SET_LOCALPREF;
$$->action.metric = $2;
} else {



Re: [PATCH] ntpd: allow to specify a source IP address for outgoing queries

2017-05-30 Thread Job Snijders
Dear team,

Henning Brauer (off-list) made a few suggestions, which I summerized in
the following four points:

1) poor initialization style, instead of:
struct xxx yyy = {
.property = zzz
};

use:
struct xxx yyy;
yyy.property = zzz;

2) avoid creation of a dubiously named newly defined 'struct dual_addr'
filled with sockaddr_storage strucs: we already know what needs to
be shipped around: sockaddr_in & sockaddr_in6.

3) the bzero() wasn't needed. conf is passed in (as xconf) is lconf in
main() which is zero'd there, which isn't even necessary.

4) memset/memcpy/memmove are nowadays prefered over bzero/bcopy for
performance reasons.

Below is a new version of the patch.

Kind regards,

Job

---
 usr.sbin/ntpd/client.c| 12 
 usr.sbin/ntpd/ntp.c   |  2 ++
 usr.sbin/ntpd/ntpd.conf.5 |  8 
 usr.sbin/ntpd/ntpd.h  |  4 
 usr.sbin/ntpd/parse.y | 31 ++-
 5 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/usr.sbin/ntpd/client.c b/usr.sbin/ntpd/client.c
index ad53f6098c1..f7379946fd4 100644
--- a/usr.sbin/ntpd/client.c
+++ b/usr.sbin/ntpd/client.c
@@ -137,11 +137,23 @@ client_query(struct ntp_peer *p)
 
if (p->query->fd == -1) {
struct sockaddr *sa = (struct sockaddr *)>addr->ss;
+   struct sockaddr *qa4 = (struct sockaddr *)>query_addr4;
+   struct sockaddr *qa6 = (struct sockaddr *)>query_addr6;
 
if ((p->query->fd = socket(p->addr->ss.ss_family, SOCK_DGRAM,
0)) == -1)
fatal("client_query socket");
 
+   if (p->addr->ss.ss_family == qa4->sa_family) {
+   if (bind(p->query->fd, qa4, SA_LEN(qa4)) == -1)
+   fatal("couldn't bind to IPv4 query address: %s",
+   log_sockaddr(qa4));
+   } else if (p->addr->ss.ss_family == qa6->sa_family) {
+   if (bind(p->query->fd, qa6, SA_LEN(qa6)) == -1)
+   fatal("couldn't bind to IPv6 query address: %s",
+   log_sockaddr(qa6));
+   }
+
if (connect(p->query->fd, sa, SA_LEN(sa)) == -1) {
if (errno == ECONNREFUSED || errno == ENETUNREACH ||
errno == EHOSTUNREACH || errno == EADDRNOTAVAIL) {
diff --git a/usr.sbin/ntpd/ntp.c b/usr.sbin/ntpd/ntp.c
index 2fbaaf7149f..2184028cbc4 100644
--- a/usr.sbin/ntpd/ntp.c
+++ b/usr.sbin/ntpd/ntp.c
@@ -521,6 +521,8 @@ ntp_dispatch_imsg_dns(void)
if (peer->addr_head.pool) {
npeer = new_peer();
npeer->weight = peer->weight;
+   npeer->query_addr4 = peer->query_addr4;
+   npeer->query_addr6 = peer->query_addr6;
h->next = NULL;
npeer->addr = h;
npeer->addr_head.a = h;
diff --git a/usr.sbin/ntpd/ntpd.conf.5 b/usr.sbin/ntpd/ntpd.conf.5
index 4d2d15c66d7..d2ebd67edb5 100644
--- a/usr.sbin/ntpd/ntpd.conf.5
+++ b/usr.sbin/ntpd/ntpd.conf.5
@@ -67,6 +67,14 @@ or
 listen on 127.0.0.1
 listen on ::1
 listen on 127.0.0.1 rtable 4
+.It Xo Ic query from Ar address
+.Xc
+Specify a Local IP address the
+.Xr ntpd 8
+daemon should use for outgoing queries.
+.Bd -literal -offset indent
+query from 192.0.2.1
+query from 2001:db8::1
 .Ed
 .It Xo Ic sensor Ar device
 .Op Ic correction Ar microseconds
diff --git a/usr.sbin/ntpd/ntpd.h b/usr.sbin/ntpd/ntpd.h
index fb9cd87118a..c1e8ce469fc 100644
--- a/usr.sbin/ntpd/ntpd.h
+++ b/usr.sbin/ntpd/ntpd.h
@@ -153,6 +153,8 @@ struct ntp_peer {
struct ntp_query*query;
struct ntp_offsetreply[OFFSET_ARRAY_SIZE];
struct ntp_offsetupdate;
+   struct sockaddr_in   query_addr4;
+   struct sockaddr_in6  query_addr6;
enum client_statestate;
time_t   next;
time_t   deadline;
@@ -219,6 +221,8 @@ struct ntpd_conf {
TAILQ_HEAD(constraints, constraint) constraints;
struct ntp_status   status;
struct ntp_freq freq;
+   struct sockaddr_in  query_addr4;
+   struct sockaddr_in6 query_addr6;
u_int32_t   scale;
int debug;
int verbose;
diff --git a/usr.sbin/ntpd/parse.y b/usr.sbin/ntpd/parse.y
index 8da19a218e0..c39ccf57ef7 100644
--- a/usr.sbin/ntpd/parse.y

Re: sys/socket.h: make sstosa() available to everyone

2017-05-30 Thread Job Snijders
On Tue, May 30, 2017 at 01:29:07PM -0600, Theo de Raadt wrote:
> I don't think this trivial thing should be pushed into the public
> namespace.
> 
> Personally I think this construct is really contrived.

ok. Another downside might be that it can negatively impact portability.

Thanks,

Job



sys/socket.h: make sstosa() available to everyone

2017-05-30 Thread Job Snijders
Hi,

Might be out of my depth here, but would be nice if the sstosa() is
available to everyone, not just _KERNEL

If accepted, 'define sstosa' can to be removed from
usr.sbin/ftp-proxy/ftp-proxy.c.

Kind regards,

Job

---
 sys/sys/socket.h | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/sys/sys/socket.h b/sys/sys/socket.h
index 27cd9b5962e..48cd94e072a 100644
--- a/sys/sys/socket.h
+++ b/sys/sys/socket.h
@@ -229,6 +229,16 @@ struct sockaddr_storage {
unsigned char   __ss_pad3[240]; /* pad to a total of 256 bytes */
 };
 
+/*
+ * inline function to convert struct sockaddr_storage to struct sockaddr
+ * in a typesafe manner instead of sprinkling casts everywhere.
+ */
+static inline struct sockaddr *
+sstosa(struct sockaddr_storage *ss)
+{
+return ((struct sockaddr *)(ss));
+}
+
 #ifdef _KERNEL
 /*
  * Structure used by kernel to pass protocol
@@ -535,12 +545,6 @@ __END_DECLS
 
 void   pfctlinput(int, struct sockaddr *);
 
-static inline struct sockaddr *
-sstosa(struct sockaddr_storage *ss)
-{
-   return ((struct sockaddr *)(ss));
-}
-
 #endif /* !_KERNEL */
 
 #endif /* !_SYS_SOCKET_H_ */



Re: [PATCH] ntpd: allow to specify a source IP address for outgoing queries

2017-05-30 Thread Job Snijders
On Sun, May 28, 2017 at 10:52:24PM +0200, Sebastian Benoit wrote:
> which makes me think:
> would a global local-address be good enough?

Attached is a patch that allows you to specify the source for outgoing
queries, as a global option. Example ntpd.conf:

query from 165.254.255.33
query from 2001:728:1808::26
servers ntp.ring.nlnog.net

I have a number of remarks myself:

- unsure about the bzero() in parse_config()

- should we check 2+ declarations of 'query from', or just use the
  last one like this patch does now, (we don't check for duplicate
  'weight' etc either)

- the ipv4 / ipv6 approach with 'struct dual_addr' seems clumsy, is
  this what life is like in an ipv4 + ipv6 world? Any suggestions
  how to improve?

Kind regards,

Job

---
 src/usr.sbin/ntpd/client.c| 13 +
 src/usr.sbin/ntpd/ntp.c   |  1 +
 src/usr.sbin/ntpd/ntpd.conf.5 |  8 
 src/usr.sbin/ntpd/ntpd.h  |  7 +++
 src/usr.sbin/ntpd/parse.y | 31 ++-
 5 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/src/usr.sbin/ntpd/client.c b/src/usr.sbin/ntpd/client.c
index ddbb1281..7d921dcf 100644
--- a/src/usr.sbin/ntpd/client.c
+++ b/src/usr.sbin/ntpd/client.c
@@ -137,11 +137,24 @@ client_query(struct ntp_peer *p)
 
if (p->query->fd == -1) {
struct sockaddr *sa = (struct sockaddr *)>addr->ss;
+   struct sockaddr *qa4 = (struct sockaddr *)>query_addr.v4;
+   struct sockaddr *qa6 = (struct sockaddr *)>query_addr.v6;
 
if ((p->query->fd = socket(p->addr->ss.ss_family, SOCK_DGRAM,
0)) == -1)
fatal("client_query socket");
 
+   if (p->addr->ss.ss_family == qa4->sa_family) {
+   if (bind(p->query->fd, qa4, SA_LEN(qa4)) == -1)
+   fatal("couldn't bind to IPv4 query address: %s",
+   log_sockaddr(qa4));
+   }
+   else if (p->addr->ss.ss_family == qa6->sa_family) {
+   if (bind(p->query->fd, qa6, SA_LEN(qa6)) == -1)
+   fatal("couldn't bind to IPv6 query address: %s",
+   log_sockaddr(qa6));
+   }
+
if (connect(p->query->fd, sa, SA_LEN(sa)) == -1) {
if (errno == ECONNREFUSED || errno == ENETUNREACH ||
errno == EHOSTUNREACH || errno == EADDRNOTAVAIL) {
diff --git a/src/usr.sbin/ntpd/ntp.c b/src/usr.sbin/ntpd/ntp.c
index f3366640..b0f80294 100644
--- a/src/usr.sbin/ntpd/ntp.c
+++ b/src/usr.sbin/ntpd/ntp.c
@@ -521,6 +521,7 @@ ntp_dispatch_imsg_dns(void)
if (peer->addr_head.pool) {
npeer = new_peer();
npeer->weight = peer->weight;
+   npeer->query_addr = peer->query_addr;
h->next = NULL;
npeer->addr = h;
npeer->addr_head.a = h;
diff --git a/src/usr.sbin/ntpd/ntpd.conf.5 b/src/usr.sbin/ntpd/ntpd.conf.5
index 6e4e0012..b8f03b22 100644
--- a/src/usr.sbin/ntpd/ntpd.conf.5
+++ b/src/usr.sbin/ntpd/ntpd.conf.5
@@ -67,6 +67,14 @@ or
 listen on 127.0.0.1
 listen on ::1
 listen on 127.0.0.1 rtable 4
+.It Xo Ic source from Ar address
+.Xc
+Specify a Local IP address the
+.Xr ntpd 8
+daemon should use for outgoing queries.
+.Bd -literal -offset indent
+query from 10.0.0.1
+query from 2001:db8::1
 .Ed
 .It Xo Ic sensor Ar device
 .Op Ic correction Ar microseconds
diff --git a/src/usr.sbin/ntpd/ntpd.h b/src/usr.sbin/ntpd/ntpd.h
index 613b29b2..ded2948a 100644
--- a/src/usr.sbin/ntpd/ntpd.h
+++ b/src/usr.sbin/ntpd/ntpd.h
@@ -106,6 +106,11 @@ struct listen_addr {
int  rtable;
 };
 
+struct dual_addr {
+   struct sockaddr_storage v4;
+   struct sockaddr_storage v6;
+};
+
 struct ntp_addr {
struct ntp_addr *next;
struct sockaddr_storage  ss;
@@ -153,6 +158,7 @@ struct ntp_peer {
struct ntp_query*query;
struct ntp_offsetreply[OFFSET_ARRAY_SIZE];
struct ntp_offsetupdate;
+   struct dual_addr query_addr;
enum client_statestate;
time_t   next;
time_t   deadline;
@@ -219,6 +225,7 @@ struct ntpd_conf {
TAILQ_HEAD(constraints, constraint) constraints;
struct ntp_status   status;
struct ntp_freq freq;
+   struct dual_addrquery_addr;
u_int32_t   scale;
int 

Re: tcpdump: enable some more bgp info

2017-05-30 Thread Job Snijders
On Tue, May 30, 2017 at 10:21:17AM +0200, Michal Mazurek wrote:
> On 12:15:06, 29.05.17, Job Snijders wrote:
> > perhaps add a comment like /* RFC 6608 */ above the below:
> 
> Right, it will make it more consistent.
> 
> > > +static const char *bgpnotify_minor_fsm[] = {
> > > + NULL, "In OpenSent State", "In OpenConfirm State",
> > > + "In Established State",
> > > +};
> > 
> > and maybe s/NULL/"Unspecified Error"/
> 
> If it's NULL, then tcpdump will print out the number:
>
>   if (p == NULL) {
>   snprintf(buf, sizeof(buf), "#%d", minor);

Perhaps there is a misunderstanding on your part or on my part.

In the registry created by RFC 6608, the value "0" is the BGP Finite
State Machine Error subcode meaning "Unspecified Error". I think that
when a name is assigned to a value, the name should be printed (like
your patch does for subcode values 1, 2, and 3).

If no name is known for the error subcode, just printing the number is
useful indeed.

Kind regards,

Job



Re: tcpdump: enable some more bgp info

2017-05-29 Thread Job Snijders
On Mon, May 29, 2017 at 12:02:33PM +0200, Michal Mazurek wrote:
> The error information for bgp was commited in 2009
> (bgpnotify_minor_cease, bgpnotify_minor_cap) but never enabled, so do
> that here. Also add FSM error codes.

perhaps add a comment like /* RFC 6608 */ above the below:

> +static const char *bgpnotify_minor_fsm[] = {
> + NULL, "In OpenSent State", "In OpenConfirm State",
> + "In Established State",
> +};

and maybe s/NULL/"Unspecified Error"/

Kind regards,

Job



[PATCH] ntpd: allow to specify a source IP address for outgoing queries

2017-05-28 Thread Job Snijders
Dear team,

I have the following use-case on some of my routers: ntpd will
opportunistically select a source address, regardless of whether that
source address is actually a globally routable IP address. Most of the
time this is great, but not in some deployment scenarios.

For instance, IP addresses from the AMS-IX or DE-CIX peering LANs cannot
be expected to be functional source IP addresses for Internet use, as
many ISPs drop those prefixes at their border. If the NTP server happens
to be across the Peering LAN, you'll probably want to use an IP address
from your own advertised IP space, like the router's loopback address.

Thus, I set out to patch ntpd so a source address can be explicitly
specified. This is probably mostly useful for machines which have a
bunch of IPs spread over multiple interfaces like bgp routers.

However, in implementing this feature, I somewhat bumped my head against
"how to properly accomodate dual-stack IPv4 + IPv6 environments from a
user-interface point of view?".

If you configure the following:

"servers ntp.ring.nlnog.net local-address 165.254.255.27"

Do you expect that any  records resolved for ntp.ring.nlnog.net are
ignored, and for the queries to the A records behind ntp.ring.nlnog.net
"165.254.255.27" is used as source address? Or would one expect that for
IPv4 queries "165.254.255.27" is used, and for IPv6 queries "whatever"
is used (kind of like the current behavior)?

Or, should I make it so that you can do the following:

"servers ntp.ring.nlnog.net \
local-address 165.254.255.27,2001:728:1808::26"

or allow one to repeat the 'local-address' keyword:

"servers ntp.ring.nlnog.net \
local-address 165.254.255.27 \
local-address 2001:728:1808::26"

or use different keywords for ipv4 and ipv6:

"servers ntp.ring.nlnog.net \
local-address4 165.254.255.27 \
local-address6 2001:728:1808::26"

It feels like a balance must be struck between between deterministic
behaviour, and the software attempting to get the time from somewhere,
somehow. Any guidance on the topic would be appreciated!

Personally I think i am somewhat in favor of the comma separated
approach: "local-address 165.254.255.27,2001:728:1808::26" - if you only
specifiy one address, that AFI will be determinstic, and the software
will just automatically select a source address for the other AFI. If
you specify two addresses (comma separated, one IPv4 and one IPv6),
you'll have determinstic behavior for both AFIs.

Look forward to your feedback.

Kind regards,

Job

---
 src/usr.sbin/ntpd/client.c|  6 ++
 src/usr.sbin/ntpd/ntp.c   |  1 +
 src/usr.sbin/ntpd/ntpd.conf.5 |  1 +
 src/usr.sbin/ntpd/ntpd.h  |  1 +
 src/usr.sbin/ntpd/parse.y | 41 -
 5 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/src/usr.sbin/ntpd/client.c b/src/usr.sbin/ntpd/client.c
index ddbb1281..31ff782d 100644
--- a/src/usr.sbin/ntpd/client.c
+++ b/src/usr.sbin/ntpd/client.c
@@ -137,11 +137,17 @@ client_query(struct ntp_peer *p)
 
if (p->query->fd == -1) {
struct sockaddr *sa = (struct sockaddr *)>addr->ss;
+   struct sockaddr *la = (struct sockaddr *)>local_addr;
 
if ((p->query->fd = socket(p->addr->ss.ss_family, SOCK_DGRAM,
0)) == -1)
fatal("client_query socket");
 
+   if(p->addr->ss.ss_family == la->sa_family)
+   if (bind(p->query->fd, la, SA_LEN(la)) == -1)
+   fatal("couldn't bind to local-address: %s",
+   log_sockaddr(la));
+
if (connect(p->query->fd, sa, SA_LEN(sa)) == -1) {
if (errno == ECONNREFUSED || errno == ENETUNREACH ||
errno == EHOSTUNREACH || errno == EADDRNOTAVAIL) {
diff --git a/src/usr.sbin/ntpd/ntp.c b/src/usr.sbin/ntpd/ntp.c
index f3366640..f22f1ca4 100644
--- a/src/usr.sbin/ntpd/ntp.c
+++ b/src/usr.sbin/ntpd/ntp.c
@@ -521,6 +521,7 @@ ntp_dispatch_imsg_dns(void)
if (peer->addr_head.pool) {
npeer = new_peer();
npeer->weight = peer->weight;
+   npeer->local_addr = peer->local_addr;
h->next = NULL;
npeer->addr = h;
npeer->addr_head.a = h;
diff --git a/src/usr.sbin/ntpd/ntpd.conf.5 b/src/usr.sbin/ntpd/ntpd.conf.5
index 6e4e0012..07bc2174 100644
--- a/src/usr.sbin/ntpd/ntpd.conf.5
+++ b/src/usr.sbin/ntpd/ntpd.conf.5
@@ -131,6 +131,7 @@ A stratum value other than the default of 1 can be assigned 
using the
 keyword.
 .It Xo Ic server Ar address
 .Op Ic weight Ar weight-value
+.Op Ic local-address Ar address
 .Xc
 Specify the IP address or the hostname of an NTP
 server to 

Re: [bgpd] RFC 7607 Codification of AS 0 Processing

2017-05-28 Thread Job Snijders
On Fri, May 26, 2017 at 09:40:49PM +0200, Peter Hessler wrote:
> On 2017 May 26 (Fri) at 20:01:00 +0200 (+0200), Peter Hessler wrote:
> :Apropos of "I found it", I implemented support for RFC 7607.  It's a
> :super short RFC, but basically it forbids use of AS 0 anywhere.
> :
> :OK?
>
> Fixed some denglish in an error message, mention the RFC in the man
> page, and don't take down the session if we receive AS0 in the path.

Indeed, the better behaviour is to apply 'Treat-as-withdraw' to a BGP
UPDATE which contains AS 0 anywhere in the AS_PATH/AS4_PATH.

Kind regards,

Job



Re: tcpdump: decode BGP Administrative Shutdown Communication

2017-04-19 Thread Job Snijders
On Mon, Apr 17, 2017 at 01:56:17PM -0600, Theo de Raadt wrote:
> +   memset(string, 0, 129);
> +   memcpy(string, p+1, shutdown_comm_length);
> +   safeputs(string);
> 
> Please don't copy numbers like that. If this is a string, why not use
> string functions that gaurantee truncation and truncation detection...

The payload of the shutdown communication (which starts at p+1) is not a
C string: the content of the field wich is memcpy'd, is not
null-terminated. Instead the field's size is available through
'shutdown_comm_length'.

The realisation that a shutdown communication may contain \0 (since NUL is a
valid UTF-8 char), led me to alter the proposed changes. A debugging tool like
tcpdump should display trash too. This 0003 patch avoids the memset/memcpy and
can deal with trash in the shutdown communication through a new util 
putlchars().

example output: 

BGP (NOTIFICATION: error Cease, subcode #2, Shutdown Communication 
(length: 52): "This is a test of the sh\000\000\000\000wn communication 
system.") (DF) [tos 0xc0] (ttl 255, id 40416, len 126)

Kind regards,

Job


diff --git a/usr.sbin/tcpdump/print-bgp.c b/usr.sbin/tcpdump/print-bgp.c
index d028d671893..e25fdbd930a 100644
--- a/usr.sbin/tcpdump/print-bgp.c
+++ b/usr.sbin/tcpdump/print-bgp.c
@@ -228,9 +228,13 @@ static const char *bgpnotify_minor_update[] = {
 
 /* RFC 4486 */
 #define BGP_NOTIFY_MINOR_CEASE_MAXPRFX  1
+/* draft-ietf-idr-shutdown-07 */
+#define BGP_NOTIFY_MINOR_CEASE_SHUT2
+#define BGP_NOTIFY_MINOR_CEASE_RESET   4
+#define BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN  128
 static const char *bgpnotify_minor_cease[] = {
-   NULL, "Maximum Number of Prefixes Reached", "Administratively Shutdown",
-   "Peer De-configured", "Administratively Reset", "Connection Rejected",
+   NULL, "Maximum Number of Prefixes Reached", "Administrative Shutdown",
+   "Peer De-configured", "Administrative Reset", "Connection Rejected",
"Other Configuration Change", "Connection Collision Resolution",
"Out of Resources",
 };
@@ -302,6 +306,21 @@ static const char *afnumber[] = AFNUM_NAME_STR;
sizeof(afnumber)/sizeof(afnumber[0]), (x)))
 
 
+static void
+print_hex(const u_char *p, u_int len)
+{
+   while (len--)
+   printf("%02x", *p++);
+}
+
+static void
+putlchars(const u_char *str, u_int len)
+{
+   while (len--)
+   safeputchar(*str++);
+}
+
+
 static const char *
 num_or_str(const char **table, size_t siz, int value)
 {
@@ -996,6 +1015,8 @@ bgp_notification_print(const u_char *dat, int length)
u_int16_t af;
u_int8_t safi;
const u_char *p;
+   uint8_t shutdown_comm_length;
+   uint8_t remainder_offset;
 
TCHECK2(dat[0], BGP_NOTIFICATION_SIZE);
memcpy(, dat, BGP_NOTIFICATION_SIZE);
@@ -1026,9 +1047,54 @@ bgp_notification_print(const u_char *dat, int length)
 
printf(" Max Prefixes: %u", EXTRACT_32BITS(p+3));
}
+
+   /*
+* draft-ietf-idr-shutdown describes a method to send a
+* message intended for human consumption regarding the
+* Administrative Shutdown or Reset event. This is called
+* the "Shutdown Communication". The communication is
+* UTF-8 encoded and may be no longer than 128 bytes.
+*/
+
+   if ((bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_SHUT ||
+   bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_RESET) &&
+   (length >= BGP_NOTIFICATION_SIZE + 1)) {
+   p = dat + BGP_NOTIFICATION_SIZE;
+   TCHECK2(*p, 1);
+   shutdown_comm_length = *(p);
+   remainder_offset = 0;
+   /* if we received garbage, make sure we hexdump it all 
*/
+   if (shutdown_comm_length >
+   BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN ||
+   shutdown_comm_length > (length - 
BGP_NOTIFICATION_SIZE) + 1)
+   printf(", invalid Shutdown Communication 
length");
+   else if (shutdown_comm_length == 0) {
+   printf(", empty Shutdown Communication");
+   remainder_offset += 1;
+   }
+   /* a proper shutdown communication */
+   else {
+   TCHECK2(*(p+1), shutdown_comm_length);
+   printf(", Shutdown Communication (length: %u): 
\"",
+   shutdown_comm_length);
+   putlchars(p+1, shutdown_comm_length);
+   printf("\"");
+   remainder_offset += 

Re: tcpdump: decode BGP Administrative Shutdown Communication

2017-04-17 Thread Job Snijders
Hi all,

Daan Keuper (Computest) was kind enough to review the diff, he pointed
out the following:

safeputs() expects a null-terminated string. Since shutdown_comm_length
won't exceed BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN (128), the
following will ensure a null-terminated string is passed to safeputs().

> + char string[128];
^^^
> + memset(string, 0, 128);
  ^^^
> + memcpy(string, p+1, shutdown_comm_length);
> + safeputs(string);

128 should be 129. oops!

I'd like to defer to more experienced programmers on the aesthetics of
using "129" rather then "BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN + 1",
or perhaps even an entirely other approach to ensure the string is
null-terminated.

Also added an additional check "shutdown_comm_length > (length -
BGP_NOTIFICATION_SIZE)" which is useful to prevent in case the shutdown
communication claims to be longer then the BGP message actually is.

Kind regards,

Job


diff --git a/usr.sbin/tcpdump/print-bgp.c b/usr.sbin/tcpdump/print-bgp.c
index d028d671893..0886304926e 100644
--- a/usr.sbin/tcpdump/print-bgp.c
+++ b/usr.sbin/tcpdump/print-bgp.c
@@ -228,9 +228,13 @@ static const char *bgpnotify_minor_update[] = {
 
 /* RFC 4486 */
 #define BGP_NOTIFY_MINOR_CEASE_MAXPRFX  1
+/* draft-ietf-idr-shutdown-07 */
+#define BGP_NOTIFY_MINOR_CEASE_SHUT2
+#define BGP_NOTIFY_MINOR_CEASE_RESET   4
+#define BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN  128
 static const char *bgpnotify_minor_cease[] = {
-   NULL, "Maximum Number of Prefixes Reached", "Administratively Shutdown",
-   "Peer De-configured", "Administratively Reset", "Connection Rejected",
+   NULL, "Maximum Number of Prefixes Reached", "Administrative Shutdown",
+   "Peer De-configured", "Administrative Reset", "Connection Rejected",
"Other Configuration Change", "Connection Collision Resolution",
"Out of Resources",
 };
@@ -302,6 +306,13 @@ static const char *afnumber[] = AFNUM_NAME_STR;
sizeof(afnumber)/sizeof(afnumber[0]), (x)))
 
 
+static void
+print_hex(const u_char *p, u_int len)
+{
+   while (len--)
+   printf("%02x", *p++);
+}
+
 static const char *
 num_or_str(const char **table, size_t siz, int value)
 {
@@ -996,6 +1007,9 @@ bgp_notification_print(const u_char *dat, int length)
u_int16_t af;
u_int8_t safi;
const u_char *p;
+   uint8_t shutdown_comm_length;
+   uint8_t remainder_offset;
+   char string[129];
 
TCHECK2(dat[0], BGP_NOTIFICATION_SIZE);
memcpy(, dat, BGP_NOTIFICATION_SIZE);
@@ -1026,9 +1040,56 @@ bgp_notification_print(const u_char *dat, int length)
 
printf(" Max Prefixes: %u", EXTRACT_32BITS(p+3));
}
+
+   /*
+* draft-ietf-idr-shutdown describes a method to send a
+* message intended for human consumption regarding the
+* Administrative Shutdown or Reset event. This is called
+* the "Shutdown Communication". The communication is
+* UTF-8 encoded and may be no longer than 128 bytes.
+*/
+
+   if ((bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_SHUT ||
+   bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_RESET) &&
+   (length >= BGP_NOTIFICATION_SIZE + 1)) {
+   p = dat + BGP_NOTIFICATION_SIZE;
+   TCHECK2(*p, 1);
+   shutdown_comm_length = *(p);
+   remainder_offset = 0;
+   /* seems we received garbage, hexdump it all */
+   if (shutdown_comm_length >
+   BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN ||
+   shutdown_comm_length > (length - 
BGP_NOTIFICATION_SIZE))
+   printf(", invalid Shutdown Communication 
length");
+   else if (shutdown_comm_length == 0) {
+   printf(", empty Shutdown Communication");
+   remainder_offset += 1;
+   }
+   /* a proper shutdown communication */
+   else {
+   TCHECK2(*(p+1), shutdown_comm_length);
+   printf(", Shutdown Communication (length: %u): 
\"",
+   shutdown_comm_length);
+   memset(string, 0, 129);
+   memcpy(string, p+1, shutdown_comm_length);
+   safeputs(string);
+   printf("\"");
+   remainder_offset += shutdown_comm_length + 1;
+   }
+   /* if there is trailing data, hexdump it */
+

Re: tcpdump: decode BGP Administrative Shutdown Communication

2017-04-17 Thread Job Snijders
Hi OpenBSD,

bgpd(8) as shipped in OpenBSD 6.1 supports draft-ietf-idr-shutdown-07.
The below patch adds support to tcpdump(8) to decode such shutdown
communication.

This is an improved version of the patch proposal I sent in January.

Kind regards,

Job



diff --git a/usr.sbin/tcpdump/print-bgp.c b/usr.sbin/tcpdump/print-bgp.c
index d028d671893..2871b92909f 100644
--- a/usr.sbin/tcpdump/print-bgp.c
+++ b/usr.sbin/tcpdump/print-bgp.c
@@ -228,9 +228,13 @@ static const char *bgpnotify_minor_update[] = {
 
 /* RFC 4486 */
 #define BGP_NOTIFY_MINOR_CEASE_MAXPRFX  1
+/* draft-ietf-idr-shutdown-07 */
+#define BGP_NOTIFY_MINOR_CEASE_SHUT2
+#define BGP_NOTIFY_MINOR_CEASE_RESET   4
+#define BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN  128
 static const char *bgpnotify_minor_cease[] = {
-   NULL, "Maximum Number of Prefixes Reached", "Administratively Shutdown",
-   "Peer De-configured", "Administratively Reset", "Connection Rejected",
+   NULL, "Maximum Number of Prefixes Reached", "Administrative Shutdown",
+   "Peer De-configured", "Administrative Reset", "Connection Rejected",
"Other Configuration Change", "Connection Collision Resolution",
"Out of Resources",
 };
@@ -302,6 +306,13 @@ static const char *afnumber[] = AFNUM_NAME_STR;
sizeof(afnumber)/sizeof(afnumber[0]), (x)))
 
 
+static void
+print_hex(const u_char *p, u_int len)
+{
+   while (len--)
+   printf("%02x", *p++);
+}
+
 static const char *
 num_or_str(const char **table, size_t siz, int value)
 {
@@ -996,6 +1007,9 @@ bgp_notification_print(const u_char *dat, int length)
u_int16_t af;
u_int8_t safi;
const u_char *p;
+   uint8_t shutdown_comm_length;
+   uint8_t remainder_offset;
+   char string[128];
 
TCHECK2(dat[0], BGP_NOTIFICATION_SIZE);
memcpy(, dat, BGP_NOTIFICATION_SIZE);
@@ -1026,9 +1040,55 @@ bgp_notification_print(const u_char *dat, int length)
 
printf(" Max Prefixes: %u", EXTRACT_32BITS(p+3));
}
+
+   /*
+* draft-ietf-idr-shutdown describes a method to send a
+* message intended for human consumption regarding the
+* Administrative Shutdown or Reset event. This is called
+* the "Shutdown Communication". The communication is
+* UTF-8 encoded and may be no longer than 128 bytes.
+*/
+
+   if ((bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_SHUT ||
+   bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_RESET) &&
+   (length >= BGP_NOTIFICATION_SIZE + 1)) {
+   p = dat + BGP_NOTIFICATION_SIZE;
+   TCHECK2(*p, 1);
+   shutdown_comm_length = *(p);
+   remainder_offset = 0;
+   /* seems we received garbage, hexdump it all */
+   if (shutdown_comm_length >
+   BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN)
+   printf(", invalid Shutdown Communication 
length");
+   else if (shutdown_comm_length == 0) {
+   printf(", empty Shutdown Communication");
+   remainder_offset += 1;
+   }
+   /* a proper shutdown communication */
+   else {
+   TCHECK2(*(p+1), shutdown_comm_length);
+   printf(", Shutdown Communication (length: %u): 
\"",
+   shutdown_comm_length);
+   memset(string, 0, 128);
+   memcpy(string, p+1, shutdown_comm_length);
+   safeputs(string);
+   printf("\"");
+   remainder_offset += shutdown_comm_length + 1;
+   }
+   /* if there is trailing data, hexdump it */
+   if (length -
+   (remainder_offset + BGP_NOTIFICATION_SIZE) > 0) {
+   printf(", Data: (length: %u) ",
+   length - (remainder_offset +
+   BGP_NOTIFICATION_SIZE));
+   print_hex(p + remainder_offset, length -
+   (remainder_offset + BGP_NOTIFICATION_SIZE));
+   }
+   }
}
 
return;
+
 trunc:
printf("[|BGP]");
 }



Re: [PATCH] openbgpd: Add support for BGP Validation State Communities (RFC 8097)

2017-04-03 Thread Job Snijders
On Wed, Mar 29, 2017 at 12:14:24AM +0200, Sebastian Benoit wrote:
> Job Snijders(j...@instituut.net) on 2017.03.28 14:12:42 -0500:
> > 
> > match from any \
> > set { ext-community bovs not-found \
> 
> do other implementations call this "bovs"?  dont get me wrong, i think
> junipers origin-validation-state-valid is too long.

¯\_(ツ)_/¯, alternatively, "ovs"?

> > match from any \
> > prefix 2a02:898::/32 source-as 8283 \
> > set { ext-community bovs valid }
> 
> So if we want to implement rfc6810/rfc6811 (BGP Prefix Origin
> Validation), would this syntax become redundant? Should we not
> implement 
> 
>   match from any validation-state valid ...
> 
> at this point, and hide the community behind that? And when we get
> Origin Validation, its transparent in the ruleset where the
> information came from, from our own lookup or through the community.
> 
> Just a thought, not something that makes the config a bit cleaner.

I do not think it will be redundant, we should clearly separate:

"a well-known community which can be set by anyone", and
"the validation state internal to the daemon".

I'd consider it an important feature to be able to match based on what a
bgp community alleges about the validation state, and what the actual
validation state is based on the daemon's own lookup.

Also, through this patch one is now able to match on the community
itself, which also means one can now scrub the community when received
from other (untrusted) parties.

So, I recommend against conflation.

> > --- bgpd/util.c 24 Jan 2017 04:22:42 -  1.24
> > +++ bgpd/util.c 28 Mar 2017 19:01:34 -
> > [ snip ]
> > +   case EXT_COMMUNITY_VALIDATION_STATE:
> > +   return ("bovs"); /* RFC 8097 */
> 
> spell out what bovs means?

ack.

> > +log_ext_bovs_value(u_int8_t value)
> > +{
> > [ snip ]
> > +}
> 
> this function is only used in bgpctl. it should be moved there.  we
> are trying to get rid of the reacharound *ctl programs are doing, so
> you should not add another function.

ack.

On Thu, Mar 30, 2017 at 12:48:45AM +0200, Claudio Jeker wrote:
> On Tue, Mar 28, 2017 at 02:12:42PM -0500, Job Snijders wrote:
> >  int
> > +parse_bgp_validation_state(char *state)
> > [ snip ]
> 
> Is there an expectation that this list will grow? I find it a bit
> overkill to use bsearch for 3 values. I understand we use similar
> constructs for other tables. Doing 3 strcmp in a row would have been
> easier I think.

I don't expect the list to grow any time soon. I've changed it.

Kind regards,

Job


Index: bgpctl/bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.193
diff -u -p -r1.193 bgpctl.c
--- bgpctl/bgpctl.c 23 Jan 2017 23:38:51 -  1.193
+++ bgpctl/bgpctl.c 3 Apr 2017 07:24:37 -
@@ -2,7 +2,7 @@
 
 /*
  * Copyright (c) 2003 Henning Brauer <henn...@openbsd.org>
- * Copyright (c) 2016 Job Snijders <j...@instituut.net>
+ * Copyright (c) 2016, 2017 Job Snijders <j...@instituut.net>
  * Copyright (c) 2016 Peter Hessler <phess...@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
@@ -84,6 +84,7 @@ void   show_rib_detail(struct ctl_show_r
 voidshow_attr(void *, u_int16_t);
 voidshow_community(u_char *, u_int16_t);
 voidshow_large_community(u_char *, u_int16_t);
+const char *log_ext_ovs_value(u_int8_t);
 voidshow_ext_community(u_char *, u_int16_t);
 char   *fmt_mem(int64_t);
 int show_rib_memory_msg(struct imsg *);
@@ -1578,6 +1579,24 @@ show_large_community(u_char *data, u_int
}
 }
 
+const char *
+log_ext_ovs_value(u_int8_t value)
+{
+   static char s[6];
+
+   switch (value) {
+   case EXT_COMMUNITY_VALIDATION_STATE_VALID:
+   return ("valid");
+   case EXT_COMMUNITY_VALIDATION_STATE_NOTFOUND:
+   return ("not-found");
+   case EXT_COMMUNITY_VALIDATION_STATE_INVALID:
+   return ("invalid");
+   default:
+   snprintf(s, sizeof(s), "[%u?]", value);
+   return (s);
+   }
+}
+
 void
 show_ext_community(u_char *data, u_int16_t len)
 {
@@ -1585,7 +1604,7 @@ show_ext_community(u_char *data, u_int16
struct in_addr  ip;
u_int32_t   as4, u32;
u_int16_t   i, as2, u16;
-   u_int8_ttype, subtype;
+   u_int8_ttype, subtype, state;
 
if (len & 0x7)
return;
@@ -1618,6 +1637,13 @@ show_ext_community(u_char *data, u_int16
ext = betoh6

[PATCH] openbgpd: Add support for BGP Validation State Communities (RFC 8097)

2017-03-28 Thread Job Snijders
Dear all,

BGP Origin Validation State communities are non-transitive opaque
extended communities to carry the origination Autonomous System
validation state inside an autonomous system. IBGP speakers that
receive this validation state can configure local policies that allow it
to influence their decision process. https://tools.ietf.org/html/rfc8097

This patch allows to configurations such as the below. This
configuration was generated based on an external validation source (in
this case RPKI) and allows manipulation through well-known identifiers.

match from any \
set { ext-community bovs not-found \
  ext-community delete bovs invalid \
  ext-community delete bovs valid }
match from any \
prefix 2a02:898::/32 or-longer \
set { ext-community bovs invalid }
match from any \
prefix 2a02:898::/32 source-as 8283 \
set { ext-community bovs valid }
match from any ext-community bovs valid \
set { ext-community delete bovs invalid \
  ext-community delete bovs not-found }
match from any ext-community bovs invalid \
set { ext-community delete bovs not-found }

The following new mapping exists between keywords and Extended BGP
Communities:

keywords| ext type | subtype | value
+--+-+---
bovs valid  |   0x2b   |   0x0   |  0x0
bovs not-found  |   0x2b   |   0x0   |  0x1
bovs invalid|   0x2b   |   0x0   |  0x2

[job@kiera ~]$ bgpctl show rib detail 2a02:898::/32 longer-prefixes

BGP routing table entry for 2a02:898::/32
2914 8283
Nexthop 2001:728:0:1000::2 (via ???) from AS15562_scarlett_IPv6 
(165.254.255.1)
Origin IGP, metric 0, localpref 100, weight 0, internal
Last update: 00:00:49 ago
Communities: 2914:420 2914:1206 2914:2203 2914:3200 65504:8283
Ext. communities: bovs valid

[job@kiera ~]$

Index: bgpctl/bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.193
diff -u -p -r1.193 bgpctl.c
--- bgpctl/bgpctl.c 23 Jan 2017 23:38:51 -  1.193
+++ bgpctl/bgpctl.c 28 Mar 2017 19:01:33 -
@@ -1585,7 +1585,7 @@ show_ext_community(u_char *data, u_int16
struct in_addr  ip;
u_int32_t   as4, u32;
u_int16_t   i, as2, u16;
-   u_int8_ttype, subtype;
+   u_int8_ttype, subtype, state;
 
if (len & 0x7)
return;
@@ -1618,6 +1618,13 @@ show_ext_community(u_char *data, u_int16
ext = betoh64(ext) & 0xLL;
printf("%s 0x%llx", log_ext_subtype(subtype), ext);
break;
+   case EXT_COMMUNITY_NON_TRANS_OPAQUE:
+   if (subtype == EXT_COMMUNITY_VALIDATION_STATE) {
+   state = data[i + 7];
+   printf("%s %s", log_ext_subtype(subtype),
+   log_ext_bovs_value(state));
+   break;
+   }
default:
memcpy(, data + i, sizeof(ext));
printf("0x%llx", betoh64(ext));
Index: bgpd/bgpd.8
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.8,v
retrieving revision 1.52
diff -u -p -r1.52 bgpd.8
--- bgpd/bgpd.8 19 Feb 2017 11:38:24 -  1.52
+++ bgpd/bgpd.8 28 Mar 2017 19:01:33 -
@@ -390,6 +390,17 @@ control socket
 .%R draft-ietf-idr-shutdown
 .%T BGP Administrative Shutdown Communication
 .Re
+.Pp
+.Rs
+.%A P. Mohapatra
+.%A K. Patel
+.%A J. Scudder
+.%A D. Ward
+.%A R. Bush
+.%D March 2017
+.%R RFC 8097
+.%T BGP Prefix Origin Validation State Extended Community
+.Re
 .Sh HISTORY
 The
 .Nm
Index: bgpd/bgpd.conf.5
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.conf.5,v
retrieving revision 1.152
diff -u -p -r1.152 bgpd.conf.5
--- bgpd/bgpd.conf.513 Jan 2017 18:59:12 -  1.152
+++ bgpd/bgpd.conf.528 Mar 2017 19:01:33 -
@@ -1195,6 +1195,11 @@ which is expanded to the current neighbo
 .Ic ext-community
 .Ar subtype Ar numvalue
 .Xc
+.It Xo
+.Ic ext-community
+.Ar bovs
+.Pq Ic valid | not-found | invalid
+.Xc
 This rule applies only to
 .Em UPDATES
 where the
@@ -1456,6 +1461,11 @@ to do wildcard matching.
 .Ic ext-community Op Ar delete
 .Ar subtype Ar numvalue
 .Xc
+.It Xo
+.Ic ext-community Op Ar delete
+.Ar bovs
+.Pq Ic valid | not-found | invalid
+.Xc
 Set or delete the
 .Em Extended Community
 AS path attribute.
@@ -1481,6 +1491,7 @@ odi  OSPF Domain Identifier
 ort  OSPF Route Type
 ori  OSPF Router ID
 bdc  BGP Data Collection
+bovs BGP Origin Validation State
 .Ed
 .Pp
 Not all type and subtype value pairs are allowed by IANA and the parser
Index: bgpd/bgpd.h

Re: openbgpd: support for bgp administrative shutdown communication

2017-01-09 Thread Job Snijders
Dear all,

The below is based on feedback from Sebastian Benoit, Theo de Raadt,
and Peter Hessler. The patch adds less lines of code, and adheres
better to style(9). Thank you for your time.

Kind regards,

Job


Index: bgpctl/bgpctl.8
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.8,v
retrieving revision 1.71
diff -u -p -r1.71 bgpctl.8
--- bgpctl/bgpctl.8 26 Oct 2016 17:24:13 -  1.71
+++ bgpctl/bgpctl.8 9 Jan 2017 21:52:31 -
@@ -14,7 +14,7 @@
 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: October 26 2016 $
+.Dd $Mdocdate: January 9 2017 $
 .Dt BGPCTL 8
 .Os
 .Sh NAME
@@ -104,8 +104,14 @@ Destroy a previously cloned peer.
 The peer must be down before calling this function.
 .Ar peer
 may be the neighbor's address or description.
-.It Cm neighbor Ar peer Cm down
-Take the BGP session to the specified neighbor down.
+.It Cm neighbor Ar peer Cm down Op Ar reason
+Take the BGP session to the specified neighbor down. If a
+.Ar reason
+is provided, the
+.Ar reason
+is sent as Administrative Shutdown Communication to the neighbor. The
+.Ar reason
+cannot exceed 128 octets.
 .Ar peer
 may be the neighbor's address or description.
 .It Cm neighbor Ar peer Cm refresh
Index: bgpctl/bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.190
diff -u -p -r1.190 bgpctl.c
--- bgpctl/bgpctl.c 14 Oct 2016 16:05:35 -  1.190
+++ bgpctl/bgpctl.c 9 Jan 2017 21:52:31 -
@@ -162,6 +162,7 @@ main(int argc, char *argv[])
 
memcpy(, >peeraddr, sizeof(neighbor.addr));
strlcpy(neighbor.descr, res->peerdesc, sizeof(neighbor.descr));
+   strlcpy(neighbor.shutcomm, res->shutcomm, sizeof(neighbor.shutcomm));
 
if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
err(1, "control_init: socket");
@@ -722,6 +723,13 @@ show_neighbor_msg(struct imsg *imsg, enu
inet_ntoa(ina));
printf("%s\n", print_auth_method(p->auth.method));
printf("  BGP state = %s", statenames[p->state]);
+   if (p->conf.down) {
+   printf(", marked down");
+   if (*(p->conf.shutcomm)) {
+   printf(" with shutdown reason \"%s\"",
+   log_shutcomm(p->conf.shutcomm));
+   }
+   }
if (p->stats.last_updown != 0)
printf(", %s for %s",
p->state == STATE_ESTABLISHED ? "up" : "down",
@@ -756,6 +764,10 @@ show_neighbor_msg(struct imsg *imsg, enu
break;
print_neighbor_msgstats(p);
printf("\n");
+   if (*(p->stats.last_shutcomm)) {
+   printf("  Last received shutdown reason: \"%s\"\n",
+   log_shutcomm(p->stats.last_shutcomm));
+   }
if (p->state == STATE_IDLE) {
static const char   *errstr;
 
Index: bgpctl/parser.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/parser.c,v
retrieving revision 1.74
diff -u -p -r1.74 parser.c
--- bgpctl/parser.c 14 Oct 2016 16:05:35 -  1.74
+++ bgpctl/parser.c 9 Jan 2017 21:52:31 -
@@ -45,6 +45,7 @@ enum token_type {
PREFIX,
PEERDESC,
RIBNAME,
+   SHUTDOWN_COMMUNICATION,
COMMUNITY,
LARGE_COMMUNITY,
LOCALPREF,
@@ -245,9 +246,15 @@ static const struct token t_neighbor[] =
{ ENDTOKEN, "", NONE,   NULL}
 };
 
+static const struct token t_nei_mod_shutc[] = {
+   { NOTOKEN,  "", NONE,   NULL},
+   { SHUTDOWN_COMMUNICATION,   "", NONE,   NULL},
+   { ENDTOKEN, "", NONE,   NULL}
+};
+
 static const struct token t_neighbor_modifiers[] = {
{ KEYWORD,  "up",   NEIGHBOR_UP,NULL},
-   { KEYWORD,  "down", NEIGHBOR_DOWN,  NULL},
+   { KEYWORD,  "down", NEIGHBOR_DOWN,  
t_nei_mod_shutc},
{ KEYWORD,  "clear",NEIGHBOR_CLEAR, NULL},
{ KEYWORD,  "refresh",  NEIGHBOR_RREFRESH,  NULL},
{ KEYWORD,  "destroy",  NEIGHBOR_DESTROY,   NULL},
@@ -571,6 +578,16 @@ match_token(int *argc, char **argv[], co
t = [i];
}
break;
+   case SHUTDOWN_COMMUNICATION:
+   if (!match && word != NULL && wordlen > 0) {
+   if (strlcpy(res.shutcomm, word,
+   

openbgpd: support for bgp administrative shutdown communication

2017-01-08 Thread Job Snijders
Dear OpenBSD developers,

This patch adds support for the "BGP Administrative Shutdown
Communication" to bgpd(8) and bgpctl(8).

The draft-ietf-idr-shutdown 
(https://tools.ietf.org/html/draft-ietf-idr-shutdown)
document specifies a mechanism to transmit a short freeform message
across the wire as part of an Administrative Shutdown Cease NOTIFICATION
message. This message serves to inform the neighbor why the BGP session
was shut down. This is useful to communicate for instance a ticket
reference, contact person or other information to the neighbor.

Example usage:

[job@kiera ~]$ bgpctl show
NeighborASMsgRcvd  MsgSent  OutQ Up/Down   State/PrfRcvd
165.254.255.24  15562  70684  60 0   00:21:16  33653

[job@kiera ~]$ bgpctl neighbor 165.254.255.24 down "goodbye, we are 
upgrading to openbsd 6.1"
request processed

[job@kiera ~]$ bgpctl show
NeighborASMsgRcvd  MsgSent  OutQ Up/Down  State/PrfRcvd
165.254.255.24  15562  70708 62  0   00:00:08 Idle

[job@kiera ~]$ bgpctl show neighbor 165.254.255.24 | grep reason
  BGP state = Idle, marked down with shutdown reason "goodbye, we are 
upgrading to openbsd 6.1", down for 00:00:17

On the neighbor's side you'll see the following in syslog:

Jan  8 19:28:54 shutdown bgpd[50719]: neighbor 165.254.255.26: received 
notification: Cease, administratively down
Jan  8 19:28:54 shutdown bgpd[50719]: neighbor 165.254.255.26: received 
shutdown reason: "goodbye, we are upgrading to openbsd 6.1"

and this is also visible through bgpctl(8):

[job@shutdown~]$ bgpctl show neighbor 165.254.255.26 | grep reason
  Last received shutdown reason: "goodbye, we are upgrading to openbsd 6.1"

This work has been tested against pmacct and exabgp which also
support draft-ietf-idr-shutdown.

The BGP Administrative Shutdown Communication feature for OpenBGPD was
developed by Peter van Dijk <peter.van.d...@powerdns.com> and Job Snijders 
<j...@ntt.net>.

Kind regards,

Job


Index: usr.sbin/bgpctl/bgpctl.8
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.8,v
retrieving revision 1.69
diff -u -p -r1.69 bgpctl.8
--- usr.sbin/bgpctl/bgpctl.825 May 2016 14:15:59 -  1.69
+++ usr.sbin/bgpctl/bgpctl.88 Jan 2017 17:45:24 -
@@ -104,8 +104,14 @@ Destroy a previously cloned peer.
 The peer must be down before calling this function.
 .Ar peer
 may be the neighbor's address or description.
-.It Cm neighbor Ar peer Cm down
-Take the BGP session to the specified neighbor down.
+.It Cm neighbor Ar peer Cm down Op Ar reason
+Take the BGP session to the specified neighbor down. If a
+.Ar reason
+is provided, the 
+.Ar reason
+is send as Administrative Shutdown Communication to the neighbor. The
+.Ar reason
+cannot exceed 128 octets.
 .Ar peer
 may be the neighbor's address or description.
 .It Cm neighbor Ar peer Cm refresh
Index: usr.sbin/bgpctl/bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.188
diff -u -p -r1.188 bgpctl.c
--- usr.sbin/bgpctl/bgpctl.c3 Jun 2016 17:36:37 -   1.188
+++ usr.sbin/bgpctl/bgpctl.c8 Jan 2017 17:45:24 -
@@ -159,6 +159,7 @@ main(int argc, char *argv[])
 
memcpy(, >peeraddr, sizeof(neighbor.addr));
strlcpy(neighbor.descr, res->peerdesc, sizeof(neighbor.descr));
+   strlcpy(neighbor.shutcomm, res->shutcomm, sizeof(neighbor.shutcomm));
 
if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
err(1, "control_init: socket");
@@ -702,6 +703,13 @@ show_neighbor_msg(struct imsg *imsg, enu
inet_ntoa(ina));
printf("%s\n", print_auth_method(p->auth.method));
printf("  BGP state = %s", statenames[p->state]);
+   if (p->conf.down) {
+   printf(", marked down");
+   if (*(p->conf.shutcomm)) {
+   printf(" with shutdown reason \"%s\"",
+   log_shutcomm(p->conf.shutcomm));
+   }
+   }
if (p->stats.last_updown != 0)
printf(", %s for %s",
p->state == STATE_ESTABLISHED ? "up" : "down",
@@ -736,6 +744,10 @@ show_neighbor_msg(struct imsg *imsg, enu
break;
print_neighbor_msgstats(p);
printf("\n");
+   if (*(p->stats.last_shutcomm)) {
+   printf("  Last received shutdown reason: \"%s\"\n",
+   log_shutcomm(p->stats.last_shutcomm));
+   }
if (p->state == STATE_ID

tcpdump: decode BGP Administrative Shutdown Communication

2017-01-07 Thread Job Snijders
This patch adds support to tcpdump(8) to decode BGP Administrative
Shutdown Communications in human readable form.

The draft-ietf-idr-shutdown 
(https://tools.ietf.org/html/draft-ietf-idr-shutdown)
specification documents a mechanism to transmit a short freeform UTF-8
message as part of a BGP Cease NOTIFICATION message to inform the
neighbor why the BGP session is being shutdown.

This modified tcpdump(8) is used in the development of the
draft-ietf-idr-shutdown feature for OpenBGPD.

example, snipped from 'tcpdump -t -n -v -s 1500 host 165.254.255.17 and port 
179':

BGP (NOTIFICATION: error Cease, subcode #2, Shutdown Communication
(len 124): "NTT will perform maintenance on this router. This is
tracked in TICKET-1-24824294. Contact n...@ntt.net for more
information.")

UTF-8 stuff like 濾 is printed like this:

BGP (NOTIFICATION: error Cease, subcode #2, Shutdown Communication
(len 26): "and here is a unicorn \360\237\246\204")

Index: usr.sbin/tcpdump/print-bgp.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-bgp.c,v
retrieving revision 1.20
diff -u -p -r1.20 print-bgp.c
--- usr.sbin/tcpdump/print-bgp.c27 Oct 2016 08:21:58 -  1.20
+++ usr.sbin/tcpdump/print-bgp.c7 Jan 2017 11:15:01 -
@@ -228,6 +228,9 @@ static const char *bgpnotify_minor_updat
 
 /* RFC 4486 */
 #define BGP_NOTIFY_MINOR_CEASE_MAXPRFX  1
+/* draft-ietf-idr-shutdown */
+#define BGP_NOTIFY_MINOR_CEASE_ADMIN   2
+#define BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN  128
 static const char *bgpnotify_minor_cease[] = {
NULL, "Maximum Number of Prefixes Reached", "Administratively Shutdown",
"Peer De-configured", "Administratively Reset", "Connection Rejected",
@@ -302,6 +305,13 @@ static const char *afnumber[] = AFNUM_NA
sizeof(afnumber)/sizeof(afnumber[0]), (x)))
 
 
+static void
+print_hex(const u_char *p, u_int len)
+{
+   while (len--)
+   printf("%02x", *p++);
+}
+
 static const char *
 num_or_str(const char **table, size_t siz, int value)
 {
@@ -996,6 +1006,9 @@ bgp_notification_print(const u_char *dat
u_int16_t af;
u_int8_t safi;
const u_char *p;
+   uint8_t shutdown_comm_length;
+   uint8_t remainder_offset;
+   char string[128];
 
TCHECK2(dat[0], BGP_NOTIFICATION_SIZE);
memcpy(, dat, BGP_NOTIFICATION_SIZE);
@@ -1026,9 +1039,57 @@ bgp_notification_print(const u_char *dat
 
printf(" Max Prefixes: %u", EXTRACT_32BITS(p+3));
}
+
+   /*
+* draft-ietf-idr-shutdown describes a method to send a
+* message intended for human consumption regarding the
+* Administrative Shutdown event. This is called the
+* "Administrative Shutdown Communication". The message
+* is UTF-8 encoded and may be no longer than 128 bytes.
+*/
+
+   if (bgpn.bgpn_minor == BGP_NOTIFY_MINOR_CEASE_ADMIN &&
+   length >= 1) {
+   p = dat + BGP_NOTIFICATION_SIZE;
+   TCHECK2(*p, 1);
+   shutdown_comm_length = *(p);
+   remainder_offset = 0;
+   /* seems we received garbage, hexdump it all */
+   if (shutdown_comm_length >
+   BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN)
+   printf(", invalid Shutdown Communication (len 
%i > %i)",
+   shutdown_comm_length,
+   BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN);
+   else if (shutdown_comm_length == 0) {
+   printf(", empty Shutdown Communication");
+   remainder_offset += 1;
+   }
+   /* a proper shutdown communication */
+   else if (shutdown_comm_length <=
+   BGP_NOTIFY_MINOR_CEASE_ADMIN_SHUTDOWN_LEN) {
+   TCHECK2(*p, shutdown_comm_length);
+   printf(", Shutdown Communication (len %u): \"",
+   shutdown_comm_length);
+   memset(string, 0, 128);
+   memcpy(string, p+1, shutdown_comm_length);
+   safeputs(string);
+   printf("\"");
+   remainder_offset += shutdown_comm_length + 1;
+   }
+   /* if there is trailing data, hexdump it */
+   if (length -
+   (remainder_offset + BGP_NOTIFICATION_SIZE) > 0) {
+   printf(", Data: (len %u) ",
+   length - (remainder_offset +
+ 

tcpdump: decode Large BGP Communities

2016-10-12 Thread Job Snijders
This patch adds support to tcpdump(8) to decode Large BGP
Communities in human readable form.

Example:

[ snip ] BGP (UPDATE: (Path attributes: (ORIGIN[T] IGP)
(AS_PATH[T] 65000)
(NEXT_HOP[T] pxtr-2.meerval.net)
(COMMUNITIES[OT] 666:666 2914:0)
(LARGE_COMMUNITIES[OT] 2914:0:666 2914:4294927296:123))
(NLRI: 9.9.9.10/32)) (DF) (ttl 63, id 2806, len 138)

Kind regards,

Job


Index: src/usr.sbin/tcpdump/print-bgp.c
===
RCS file: /cvs/src/usr.sbin/tcpdump/print-bgp.c,v
retrieving revision 1.18
diff -u -p -u -r1.18 print-bgp.c
--- src/usr.sbin/tcpdump/print-bgp.c20 Oct 2015 11:29:07 -  1.18
+++ src/usr.sbin/tcpdump/print-bgp.c12 Oct 2016 16:09:51 -
@@ -134,6 +134,7 @@ struct bgp_attr {
 #define BGPTYPE_EXTD_COMMUNITIES   16  /* RFC4360 */
 #define BGPTYPE_AS4_PATH   17  /* RFC4893 */
 #define BGPTYPE_AGGREGATOR418  /* RFC4893 */
+#define BGPTYPE_LARGE_COMMUNITIES  30  /* 
draft-ietf-idr-large-community */
 
 #define BGP_AS_SET 1
 #define BGP_AS_SEQUENCE2
@@ -265,7 +266,8 @@ static const char *bgpattr_type[] = {
"MULTI_EXIT_DISC", "LOCAL_PREF", "ATOMIC_AGGREGATE", "AGGREGATOR",
"COMMUNITIES", "ORIGINATOR_ID", "CLUSTER_LIST", "DPA",
"ADVERTISERS", "RCID_PATH", "MP_REACH_NLRI", "MP_UNREACH_NLRI",
-   "EXTD_COMMUNITIES", "AS4_PATH", "AGGREGATOR4",
+   "EXTD_COMMUNITIES", "AS4_PATH", "AGGREGATOR4", NULL, NULL, NULL,
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, "LARGE_COMMUNITIES",
 };
 #define bgp_attr_type(x) \
num_or_str(bgpattr_type, \
@@ -544,6 +546,21 @@ bgp_attr_print(const struct bgp_attr *at
}
tlen -= 4;
p += 4;
+   }
+   break;
+   case BGPTYPE_LARGE_COMMUNITIES:
+   if (len == 0 || len % 12) {
+   printf(" invalid len");
+   break;
+   }
+   while (tlen>0) {
+   TCHECK2(p[0], 12);
+   printf(" %u:%u:%u",
+   EXTRACT_32BITS(p),
+   EXTRACT_32BITS(p + 4),
+   EXTRACT_32BITS(p + 8));
+   tlen -= 12;
+   p += 12;
}
break;
case BGPTYPE_ORIGINATOR_ID:



  1   2   >