Re: bgpd, announce to ibgp from 2 routers, prefixes only show up from 1

2021-11-30 Thread Claudio Jeker
On Mon, Nov 29, 2021 at 10:38:21PM +0100, Sebastian Benoit wrote:
> Stuart Henderson(s...@spacehopper.org) on 2021.11.13 00:11:08 +:
> > I have a pair of -current routers running bgpd (let's call them rtr-a
> > and rtr-b) on a subnet which also has some vpn gateways and firewalls.
> > 
> > These routers provide a carp address which the vpn gateways are using
> > as default route. There are some networks behind the vpn gateways (a
> > /32 to accept incoming vpn connections and some other prefixes that vpn
> > clients are numbered from).
> > 
> > rtr-a and rtr-b have static routes to those networks, and they have
> > network statements in bgpd.conf to announce them to their ibgp peers
> > ("network 172.24.232.0/21 set nexthop XXX" etc) so the paths are reachable
> > from the rest of the network. (This is replacing an existing setup using
> > ospf, trying to remove routing protocols from machines that don't really
> > need them).
> > 
> > It is working but something seems a little odd - the paths are announced
> > from both routers briefly and show up on the rest of the network from
> > both rtr-a and rtr-b. But after a few seconds, rtr-b receives these
> > paths from rtr-a, and then rtr-b stops announcing them itself. (they
> > stop showing in "bgpctl sh rib out" on rtr-b; "bgpctl sh nex" does
> > correctly identify the associated nexthops as connected/UP).
> > 
> > Is this expected/correct behaviour?
> 
> It is expected: once rtr-b receives the route from rtr-a, it will run the
> route decision process on it. If both routers are configured identically
> except for the router-id, one of the routes will be preferred at either the
> "oldest path" or the "lowest bgp id" criteria.
> 
> As only one route is a best route, that one will be announced to the
> neighbors. However this is IBGP. In a set of IBGP connected routers, a
> router will not announce a route to other IBGP peers that it received
> on an IBGP session. Thus, rtr-b will stop announcing that route.
> 
> When rtr-a goes down, the session is shut down or the prefix is filtered,
> bgpd won't see the "better" route anymore and announce its own instead.
> 
> > I'd prefer to have them announced from both rtr-a and rtr-b, so there's
> > no blackhole period if rtr-a is restarted while rtr-b figures out that
> > it should start announcing them, etc. (No need for tracking carp state
> > in this case, I'm not using stateful pf rules on the traffic involved).
> 
> This is a place where ospf might give you faster failover, especially with
> the redistribute ... depend on ... syntax.
>  
> > If rtr-b stops seeing the prefixes from rtr-a (either by taking down
> > the ibgp session, or by filtering) I see the announcements from both
> > rtr-a and rtr-b again. So the obvious workaround is to filter, but
> > I thought I'd ask first in case it's something that is better handled
> > by code changes rather than config.

The other way is to alter the localpref, as-path or metric of those routes
so that both rtr-a and rtr-b consider their own route the "better" one and
keep announcing it.

You can do this in multiple ways. One way would be to use something like
this:
pass out on ibgp metric +1
or
pass in on ibgp metric -1
 
Long term it would be nice to reintroduce route metrics and use this
to sort nexthops in bgpd.

-- 
:wq Claudio



Re: Put non-NULL pledge abort in the man page

2021-11-25 Thread Claudio Jeker
On Thu, Nov 25, 2021 at 04:55:23AM -0600, Luke Small wrote:
> I ran ktrace. Kdump said the last thing it did was try to load
> /usr/libexec/ld.so
> 
> To main(), before the unveil pledge is dropped, I added:
> 
> if (unveil("/usr/libexec/", "rx") == -1)
> err(1, "unveil, line: % d", __LINE__);
> 
> After running it again, it spits out an error message:
> 
> ld.so: pkg_ping: can't load library 'libc.so.96.1'
> 
> So I put in:
> 
> if (unveil("usr/lib/", "rx") == -1)
> err(1, "unveil, line: %d", __LINE__);
> 
> Now it successfully execv()s into the new process space!
> Now in the newly created program, which hasn’t set new pledge execpromises,
> it won’t successfully run ftp(1) because it wasn’t granted the inet
> execpromise.
> 
> execpromises seems to have carried over!

Don't use execpromises. That feature is not working and no tool in OpenBSD
uses it.

-- 
:wq Claudio



Re: Dynamic routing and REJECT,LLINFO,CLONED routes

2021-11-07 Thread Claudio Jeker
On Sun, Nov 07, 2021 at 12:46:43PM +0100, Denis Fondras wrote:
> I came up with this diff to overcome my problem.
> 
> Index: rtable.c
> ===
> RCS file: /cvs/src/sys/net/rtable.c,v
> retrieving revision 1.75
> diff -u -p -r1.75 rtable.c
> --- rtable.c  25 May 2021 22:45:09 -  1.75
> +++ rtable.c  7 Nov 2021 11:21:33 -
> @@ -834,6 +834,10 @@ rtable_mpath_insert(struct art_node *an,
>   return;
>   }
>  
> + /* Unreachable on-link route will not preferred */
> + if (ISSET(mrt->rt_flags, RTF_LLINFO|RTF_REJECT))
> + prio = 0;
> +
>   /* Iterate until we find the route to be placed after ``rt''. */
>   while (mrt->rt_priority <= prio && SRPL_NEXT_LOCKED(mrt, rt_next)) {
>   prt = mrt;
> 
> Le Sun, Nov 07, 2021 at 10:11:54AM +0100, Denis Fondras a écrit :
> > Hi,
> > 
> > I am using BGP to connect 2 OpenBSD-current routers :
> > 
> > [static default GW]---RT1---[bgp]---RT2
> > 
> > I announce an IPv4 /32 from RT2.
> > After I start both RT1 and RT2, traffic flows to RT2 /32 without any issue.
> > However if I reboot RT2 (let's say for sysupgrade), RT1 loses the /32
> > (which is expected) but as traffic is still directed to the /32 (because
> > of a constant ping towards the /32 for example), RT1 installs a route for
> > the /32 with these flags :
> > 
> > flags: 
> > (The REJECT flag is dropped after a timeout but comes back a few seconds
> > later)
> > 
> > From there I cannot get the /32 back from BGP until I manually delete the
> > automatically installed HOST route. Is there any way to deal with it without
> > manual intervention ?
> > 
> > Denis

To be honest, you have arp or ND running on that prefix and then overload
it with a /32 route. You really need to explain why you do that. This is
in my opinion a broken setup.

We don't want to add hacks for setups that are inherently broken. If
something is directly connected it should use that direct link.

-- 
:wq Claudio



Re: Asyncronous IO

2021-11-04 Thread Claudio Jeker
On Wed, Nov 03, 2021 at 03:37:01PM +, cho...@jtan.com wrote:
> I program on OpenBSD and am writing a library which presents an API
> for IO. POSIX defines an API[*] for asynchronous IO and I would like
> my code to support it but this API is unavailable in OpenBSD.
> 
> Is the lack intentional (perhaps there are other plans) or is it
> simply the case that no-one has sat down and written it yet?

A bit of both. AIO is not often used. Plain poll(2)/select(2) or
libevent is much preferred for building an async API.
AIO is complex and in most cases not needed.
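
As a rough illustration of the poll(2) approach, here is a minimal sketch
that waits for a descriptor to become readable and then reads once. The
function name read_when_ready() and its return convention are made up for
this example; a real library would keep a set of descriptors and callbacks
(which is roughly what libevent provides).

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become
 * readable, then read once.
 * Returns bytes read, 0 on timeout or EOF, -1 on error.
 */
ssize_t
read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n == -1)
		return (-1);
	if (n == 0)
		return (0);		/* timed out, nothing to read */
	if (pfd.revents & (POLLIN | POLLHUP))
		return (read(fd, buf, len));
	return (-1);			/* POLLERR or POLLNVAL */
}
```

The caller never blocks longer than timeout_ms, so the same loop can
service several descriptors without threads or AIO.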
 
> I don't mind that the async parts will not (yet) work on OpenBSD
> because I can always test them elsewhere but I would like to know
> which backend API(s) I should write against and therefore what
> OpenBSD intends to do regarding AIO in the future.
> 
> Cheers,
> 
> Matthew
> 
> [*] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/aio.h.html
> 

-- 
:wq Claudio



Re: httpd(8) - Internal Server error (500) on invalid request

2021-10-21 Thread Claudio Jeker
On Thu, Oct 21, 2021 at 04:38:43PM +0200, Sebastian Benoit wrote:
> J. K.(openbsd.l...@krottmayer.com) on 2021.10.21 14:10:16 +0200:
> > Another question, to httpd(8). Tried the following query.
> > Used an invalid HTTP Version number (typo).
> > 
> > $ telnet 10.42.42.183 80
> > [Shortened]
> > GET / HTTP/1.2
> > [content]
> > 
> > httpd provide here the site. Without checking the not existent version
> > (1.2) number and the Host. Okay, that's maybe stupid from me to
> > start a request with an invalid version number. But should not also
> > the server answer with 400 (bad request)?
> > 
> > According to the source only HTTP/1.1 is checked. All other request
> > will be accepted. Okay, I'm not a RFC specialist. Still a newbie.
> 
> This diff makes httpd return "505 HTTP Version Not Supported"
> for < 0.9 and > 1.9 http versions. Anything from 1.1 to 1.9 is
> interpreted as 1.1. This is what nginx does too.
> 
> ok?
> 
> diff --git usr.sbin/httpd/server_http.c usr.sbin/httpd/server_http.c
> index 6a74f3e45c5..52aaf3711c2 100644
> --- usr.sbin/httpd/server_http.c
> +++ usr.sbin/httpd/server_http.c
> @@ -51,6 +51,7 @@ int  server_http_authenticate(struct server_config *,
>   struct client *);
>  char *server_expand_http(struct client *, const char *,
>   char *, size_t);
> +int   http_version_num(char *);
>  
>  static struct http_method http_methods[] = HTTP_METHODS;
>  static struct http_error  http_errors[] = HTTP_ERRORS;
> @@ -198,6 +199,19 @@ done:
>   return (ret);
>  }
>  
> +int http_version_num(char *version)

KNF please.

> +{
> + if (strcmp(version, "HTTP/0.9") == 0)
> + return (9);
> + if (strcmp(version, "HTTP/1.0") == 0)
> + return (10);
> + /* any other version 1.x gets downgraded to 1.1 */
> + if (strncmp(version, "HTTP/1", 6) == 0)
> + return (11);
> +
> + return (0);
> +}
> +
>  void
>  server_read_http(struct bufferevent *bev, void *arg)
>  {
> @@ -207,6 +221,7 @@ server_read_http(struct bufferevent *bev, void *arg)
>   char*line = NULL, *key, *value;
>   const char  *errstr;
>   size_t   size, linelen;
> + int  version;
>   struct kv   *hdr = NULL;
>  
>   getmonotime(&clt->clt_tv_last);
> @@ -329,12 +344,29 @@ server_read_http(struct bufferevent *bev, void *arg)
>   *desc->http_query++ = '\0';
>  
>   /*
> -  * Have to allocate the strings because they could
> +  * We have to allocate the strings because they could
>* be changed independently by the filters later.
> +  * Allow HTTP version 0.9 to 1.1.
> +  * Downgrade http version > 1.1 <= 1.9 to version 1.1.
> +  * Return HTTP Version Not Supported for anything else.
>*/
> - if ((desc->http_version =
> - strdup(desc->http_version)) == NULL)
> - goto fail;
> +
> + version = http_version_num(desc->http_version);

I would prefer if this code did not store the version in
desc->http_version until after the strdup(). The way these strdup calls
work is just wonky. Especially in the failure cases this may result in
calling free on the wrong thing.

> + if (version == 11) {
> + if ((desc->http_version =
> + strdup("HTTP/1.1")) == NULL)
> + goto fail;
> + } else {
> + if ((desc->http_version =
> + strdup(desc->http_version)) == NULL)
> + goto fail;
> + }
> +
> + if (version == 0) {
> + server_abort_http(clt, 505, "bad http version");
> + goto abort;
> + }

I would prefer to have this as:
if (version == 0) {
} else if (version == 11) {
} else {
}

-- 
:wq Claudio



Re: httpd(8) - Internal Server error (500) on invalid request

2021-10-21 Thread Claudio Jeker
On Thu, Oct 21, 2021 at 01:21:33PM +0200, Sebastian Benoit wrote:
> J. K.(openbsd.l...@krottmayer.com) on 2021.10.21 11:55:47 +0200:
> > Hi,
> > 
> > I don't know if this is a real issue from OpenBSD's httpd(8).
> > Tried some requests to httpd(8) for the purpose of education.
> > 
> > Simple tried the following request:
> > 
> > $ telnet 10.42.42.183 80
> > Trying 10.42.42.183...
> > Connected to 10.42.42.183.
> > Escape character is '^]'.
> > GET / HTTP/1.1
> > fasfsdfsfd
> > 
> > Here without the colon httpd(8) return an internal server
> > error.
> > 
> > Can somebody verify this behavior?
> > 
> > Noticed with OpenBSD 7.0. Is this a correct behavior (RFC
> > conform)?
> > 
> > Thanks in advance!
> > 
> > Kind regrads,
> > 
> > J. K.
> 
> Hi,
> 
> yes. The server should probably answer with a "Bad Request" instead.
> 
> Fix below. ok?

OK claudio@
 
> diff --git usr.sbin/httpd/server_http.c usr.sbin/httpd/server_http.c
> index 732add41283..fce3c21af72 100644
> --- usr.sbin/httpd/server_http.c
> +++ usr.sbin/httpd/server_http.c
> @@ -268,8 +268,14 @@ server_read_http(struct bufferevent *bev, void *arg)
>   else if (*key == ' ' || *key == '\t')
>   /* Multiline headers wrap with a space or tab */
>   value = NULL;
> - else
> + else {
> + /* Not a multiline header, should have a : */
>   value = strchr(key, ':');
> + if (value == NULL) {
> + server_abort_http(clt, 400, "malformed");
> + goto abort;
> + }
> + }
>   if (value == NULL) {
>   if (clt->clt_line == 1) {
>   server_abort_http(clt, 400, "malformed");
> 

-- 
:wq Claudio



Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

2021-09-29 Thread Claudio Jeker
On Wed, Sep 29, 2021 at 08:07:43PM +1000, Andrew Lemin wrote:
> Hi Claudio,
> 
> So you probably guessed I am using 'route-to { GW1, GW2, GW3, GW4 } random'
> (and was wanting to add 'sticky-address' to this) based on your reply :)
> 
> "it will make sure that selected default routes are sticky to source/dest
> pairs" - Are you saying that even though multipath routing uses hashing to
> select the path (https://www.ietf.org/rfc/rfc2992.txt - "The router first
> selects a key by performing a hash (e.g., CRC16) over the packet header
> fields that identify a flow."), subsequent new sessions to the same dest IP
> with different source ports will still get the same path? I thought a new
> session with a new tuple to the same dest IP would get a different hashed
> path with multipath?

OpenBSD multipath routing implements gateway selection by Hash-Threshold
from RFC 2992. It therefore routes the same src/dst pair over the same
nexthop as long as there are no changes to the route. If one of your
links drops then some sessions will move links but the goal of
hash-threshold is to minimize the affected sessions.

> "On rerouting the multipath code reshuffles the selected routes in a way to
> minimize the affected sessions." - Are you saying, in the case where one
> path goes down, it will migrate all the entries only for that failed path
> onto the remaining good paths (like ecmp-fast-reroute ?)

No, some sessions on good paths may also migrate to other links; this is
how the hash-threshold algorithm works.

Split with 4 nexthops, now let's assume link 2 dies and stuff gets
reshuffled:
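+=============+=============+=============+=============+
|   link   1  |   link   2  |   link   3  |   link   4  |
+=============+====+========+========+====+=============+
|   link   1       |   link   3      |   link   4       |
+=======================================================+
Unaffected sessions for drop
 ^^^^^^^^^^^^^               ^^^^^^^^      ^^^^^^^^^^^^^
Affected sessions because of drop
               #############          ####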
Using other ways to split the hash into buckets (e.g. a simple modulo)
causes more change.
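
A small sketch to put numbers on that. It assumes a 16-bit hash key and
equal-sized regions; this is an illustration of the RFC 2992 idea, not
the kernel code, and all names are made up. It counts how many keys end
up on a different link when link 2 is removed from {1, 2, 3, 4}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hash-threshold: split the 16-bit key space into n equal regions. */
unsigned int
pick_threshold(uint16_t key, unsigned int n)
{
	return (((uint32_t)key * n) >> 16);
}

/* Naive alternative: bucket by remainder. */
unsigned int
pick_modulo(uint16_t key, unsigned int n)
{
	return (key % n);
}

/*
 * Count the 16-bit keys whose link changes when link 2 is removed
 * from the set {1, 2, 3, 4}.
 */
unsigned int
count_moved(unsigned int (*pick)(uint16_t, unsigned int))
{
	static const int before[] = { 1, 2, 3, 4 };
	static const int after[] = { 1, 3, 4 };
	unsigned int moved = 0;
	uint32_t key;

	for (key = 0; key <= UINT16_MAX; key++) {
		if (before[pick((uint16_t)key, 4)] !=
		    after[pick((uint16_t)key, 3)])
			moved++;
	}
	return (moved);
}
```

With these assumptions count_moved(pick_threshold) covers about a third
of the key space (the quarter that was on link 2 plus a small slice of
link 3), while count_moved(pick_modulo) covers about three quarters.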

Btw. route-to with 4 gateways will not detect a link failure, so 25% of
your traffic will be dropped when one link dies. This is another
advantage of multipath routing.

Cheers
-- 
:wq Claudio

> Thanks for your time, Andy.
> 
> On Wed, Sep 29, 2021 at 5:21 PM Claudio Jeker wrote:
> 
> > On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> > > I see this question died on its arse! :)
> > >
> > > This is still an issue for outbound load-balancing over multiple internet
> > > links.
> > >
> > > PF's 'sticky-address' parameter only works on source IPs (because it was
> > > originally designed for use when hosting your own server pools - inbound
> > > load balancing).
> > > I.e. There is no way to configure 'sticky-address' to consider destination
> > > IPs for outbound load balancing, so all subsequent outbound connections to
> > > the same target IP originate from the same internet connection.
> > >
> > > The reason why this is desirable is because an increasing number of
> > > websites use single sign on mechanisms (quite a few different architectures
> > > expose the issue described here). After a user's outbound connection is
> > > initially randomly load balanced onto an internet connection, their browser
> > > is redirected into opening multiple additional sockets towards the
> > > website's load balancers / cloud gateways, which redirect the connections
> > > to different internal servers for different parts of the site/page, and the
> > > SSO authentication/cookies passed on the additional sockets must
> > > originate from the same IP as the original socket. As a result outbound
> > > load-balancing does not work for these sites.
> > >
> > > The ideal functionality would be for 'sticky-address' to consider both
> > > source IP and destination IP after initially being load balanced by
> > > round-robin or random.
> >
> > Just use multipath routing, it will make sure that selected default routes
> > are sticky to source/dest pairs. You may want the states to be interface
> > bound if you need to nat-to on those links.
> >
> > On rerouting the multipath code reshuffles the selected routes in a way to
> > minimize the affected sessions. All this is done without any extra memory
> > usage since the hashing function is smart.
> >
> > --
> > :wq Claudio

Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

2021-09-29 Thread Claudio Jeker
On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> I see this question died on its arse! :)
> 
> This is still an issue for outbound load-balancing over multiple internet
> links.
> 
> PF's 'sticky-address' parameter only works on source IPs (because it was
> originally designed for use when hosting your own server pools - inbound
> load balancing).
> I.e. There is no way to configure 'sticky-address' to consider destination
> IPs for outbound load balancing, so all subsequent outbound connections to
> the same target IP originate from the same internet connection.
> 
> The reason why this is desirable is because an increasing number of
> websites use single sign on mechanisms (quite a few different architectures
> expose the issue described here). After a user's outbound connection is
> initially randomly load balanced onto an internet connection, their browser
> is redirected into opening multiple additional sockets towards the
> website's load balancers / cloud gateways, which redirect the connections
> to different internal servers for different parts of the site/page, and the
> SSO authentication/cookies passed on the additional sockets must
> originate from the same IP as the original socket. As a result outbound
> load-balancing does not work for these sites.
> 
> The ideal functionality would be for 'sticky-address' to consider both
> source IP and destination IP after initially being load balanced by
> round-robin or random.

Just use multipath routing, it will make sure that selected default routes
are sticky to source/dest pairs. You may want the states to be interface
bound if you need to nat-to on those links.

On rerouting the multipath code reshuffles the selected routes in a way to
minimize the affected sessions. All this is done without any extra memory
usage since the hashing function is smart.

-- 
:wq Claudio

 
> Thanks again, Andy.
> 
> On Sat, Apr 3, 2021 at 12:40 PM Andy Lemin  wrote:
> 
> > Hi smart people :)
> >
> > The current implementation of ‘sticky-address‘ relates only to a sticky
> > source IP.
> > https://www.openbsd.org/faq/pf/pools.html
> >
> > This is used for inbound server load balancing, by ensuring that all
> > socket connections from the same client/user/IP on the internet goes to the
> > same server on your local server pool.
> >
> > This works great for ensuring simplified memory management of session
> > artefacts on the application being hosted (the servers do not have to
> > synchronise the users session data as extra sockets from that user will
> > always connect to the same local server)
> >
> > However sticky-address does not have an equivalent for sticky destination
> > IPs. For example when doing outbound load balancing over multiple ISP
> > links, every single socket is load balanced randomly. This causes many
> > websites to break (especially cookie login and single-sign-on style
> > enterprise services), as the first outbound socket will originate randomly
> > from one of the local ISP IPs, and the users login session/SSO (on the
> > server side) will belong to that first random IP.
> >
> > When the user then browses to or uses another part of that same website
> > which requires additional sockets, the additional sockets will pass the SSO
> > credentials from the first socket, but the extra socket connection will
> > again be randomly load-balanced, and so the remote server will reject the
> > connection as it is originating from the wrong source IP etc.
> >
> > Therefore can I please propose a “sticky-address for destination IPs” as
> > an analogue to the existing sticky-address for source IPs?
> >
> > This is now such a problem that we have to use sticky-address even on
> > outbound load-balancing connections, which causes internal user1 to always
> > use the same ISP for _everthing_ etc. While this does stop the breakage, it
> > does not result in evenly distributed balancing of traffic, as users are
> > locked to one single transit, for all their web browsing for the rest of
> > the day after being randomly balanced once first-thing in the morning,
> > rather than all users balancing over all transits throughout the day.
> >
> > Another pain; using the current source-ip sticky-address for outbound
> > balancing, makes it hard to drain transits for maintenance. For example
> > without source sticky-address balancing, you can just remove the transit
> > from the Pf rule, and after some time, all traffic will eventually move
> > over to the other transits, allowing the first to be shut down for whatever
> > needs. But with the current source-ip sticky-address, that first transit
> > will take months to drain in a real-world situations..
> >
> > lastly just as a nice-to-have, how feasible would a deterministic load
> > balancing algorithm be? So that balancing selection is done based on the
> > “least utilised” path?
> >
> > Thanks for your time and consideration,
> > Kindest regards Andy

Re: Blog comparing open source BGP stacks

2021-08-25 Thread Claudio Jeker
On Wed, Aug 25, 2021 at 02:01:26PM +0200, Kristjan Komlosi wrote:
> On 24. 08. 21 21:59, Laura Smith wrote:
> > Would be interesting to hear comments from the community on this comparison 
> > : https://elegantnetwork.github.io/posts/followup-measuring-BGP-stacks/
> > 
> > N.B. For the record, don't shoot the messenger, I had nothing to do with 
> > these tests, I just became aware of them via the BIRD list.  I am 
> > particularly interested in the OpenBSD community comments given one person 
> > on the BIRD list had this to say of OpenBGPD: "OpenBGPd has always been a 
> > dog.".
> > 
> 
> I'm no expert at all, but I'd imagine that OpenBGPD performs at least
> somewhat differently on Linux, which seems to be what the author used in the
> tests. My personal BGP server runs OpenBSD on a 512MB VPS, using about 150MB
> of RAM with full IPv6 table and routing my traffic just fine, though I can
> imagine the tables turning very quickly with lots of neighbors, as the
> benchmark shows. I could try replicating their setup on an OpenBSD system,
> but I don't have good enough hardware at hand at the moment.

The massive amount of memory used by OpenBGPD comes from the fact that,
unlike BIRD, OpenBGPD runs with a full Adj-RIB-Out.
The tests result in a large number of prefixes that need to be tracked.
If you have 100 peers announcing 10000 random prefixes each then you end
up with 100 * 100 * 10000 = 100 million elements to manage. This is not a
realistic test since in most cases the number of routes in the Adj-RIB-Out
is limited (even on route servers). In the end for day to day use OpenBGPD
performs well enough for many people. Future releases will focus more on
performance and optimizing Adj-RIB-Out is on the list.

-- 
:wq Claudio



Re: WireGuard host crashes roughly every week

2021-08-04 Thread Claudio Jeker
On Wed, Aug 04, 2021 at 08:36:07PM +1000, Matt Dunwoodie wrote:
> On Tue, 3 Aug 2021 13:02:15 -0500
> "Matt P."  wrote:
> 
> > Hi Stuart!
> > 
> > Your advice led me to discover, the issue happens only with the
> > "PersistentKeepalive = 25" option I had enabled on each wg-quick
> > peer. Looks like you could recreate it by making a few no-address
> > peers with this option enabled.
> 
> Hi Matt,
> 
> This insight was very helpful. It looks like mbufs are not freed if
> we're sending to a peer with no endpoint. Specifically, "wg_send" is
> expected to free the mbuf if there is an error sending. This (untested)
> patch should fix it.
> 
> Cheers,
> Matt
> 
> diff --git if_wg.c if_wg.c
> index 18333eda4cb..5f4319558ab 100644
> --- if_wg.c
> +++ if_wg.c
> @@ -810,6 +810,7 @@ wg_send(struct wg_softc *sc, struct wg_endpoint *e, struct mbuf *m)
>   IPPROTO_IPV6);
>  #endif
>   } else {
> + m_freem(m);
>   return EAFNOSUPPORT;
>   }
>  
> 

Diff looks sensible. OK claudio@
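
The rule the diff restores is that a function which takes ownership of a
buffer has to release it on every path, including the early error return.
A toy userland sketch of that contract follows; struct buf, buf_get(),
buf_free() and send_buf() are stand-ins invented for this example and
have nothing to do with the real mbuf API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct buf {
	int	family;		/* 4, 6 or something unsupported */
};

int	bufs_live;		/* number of buffers not yet freed */

struct buf *
buf_get(int family)
{
	struct buf *b;

	if ((b = malloc(sizeof(*b))) == NULL)
		return (NULL);
	b->family = family;
	bufs_live++;
	return (b);
}

void
buf_free(struct buf *b)
{
	free(b);
	bufs_live--;
}

/* Consumes b: it is freed on success and on every error path. */
int
send_buf(struct buf *b)
{
	if (b->family != 4 && b->family != 6) {
		buf_free(b);	/* the free that was missing */
		return (EAFNOSUPPORT);
	}
	/* pretend the lower layer sent the data and released it */
	buf_free(b);
	return (0);
}
```

If the error branch returns without the free, bufs_live grows by one for
every keepalive sent to a peer without an endpoint, which matches the
slow leak described above.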

-- 
:wq Claudio



Re: iked choosing the wrong policy?

2021-07-27 Thread Claudio Jeker
On Tue, Jul 27, 2021 at 07:32:09AM -, Stuart Henderson wrote:
> On 2021-07-27, Vladimir Nikishkin  wrote:
> > Hello, everyone.
> >
> > This is my iked.conf:
> >
> > ```
> > ikev2 "for-phone" passive esp \
> > from any to 10.0.3.2/32 \
> > local egress peer any \
> ...
> > dstid phone.mine \
> 
> > ikev2 "for-laptop" passive esp \
> > from any to 10.0.3.3/32 \
> > local egress peer any \
> ...
> > dstid laptop.mine \
> 
> Two policies with "peer any" doesn't work.
> 
> > How to correct the setup?
> 
> Maybe it's possible by modifying the code, I'm not sure if the
> id is sent early enough though so it might not be possible.

This is one of the biggest annoyances of iked. It does not even help to
use different IPs and 'local' to split up the rules. I would love for
someone to fix this.

-- 
:wq Claudio



Re: DHCP non-issues

2021-07-20 Thread Claudio Jeker
On Tue, Jul 20, 2021 at 08:53:03AM -, Stuart Henderson wrote:
> On 2021-07-19, jungle Boogie  wrote:
> > On Mon, 19 Jul 2021 at 04:48, Christian Weisgerber wrote:
> >>
> >> Look guys, it's simple.
> >>
> >> If you want IPv6 (SLAAC) autoconfiguration, you set "inet6 autoconf"
> >> for that interface.  slaacd(8) will then automatically handle things.
> >>
> >> If you want IPv4 (DHCP) autoconfiguration, you set "inet autoconf"
> >> for that interface.  dhcpleased(8) will then automatically handle
> >> things.  If you require special DHCP options that dhcpleased(8)
> >> doesn't include, then you don't enable autoconfiguration and run
> >> dhclient(8) instead, which can be extensively configured.
> >>
> >> Both slaacd(8) and dhcpleased(8) pass nameserver information to
> >> resolvd(8), which adds those nameservers to /etc/resolv.conf unless
> >> unwind(8) is running.  If you don't want that to happen for some
> >> other reason, you turn off resolvd(8).
> >>
> >
> > Sounds like great information to put in current.html:
> > https://www.openbsd.org/faq/current.html
> > I think folks are surprised by the change and want to know how to
> > handle the new daemons in certain situations.
> > Your explanation above is very helpful and probably could be used in
> > current.html
> > I imagine the 7.0 "what's new" section will contain something similar.
> >
> >
> > What do I need to do to have WireGuard start at boot when I want to
> > use a hostname in my hostname.wg0 interface file?
> >
> > Currently, the interface doesn't come up as expected:
> > ifconfig: no address associated with name
> >
> > Are these my options?
> > a. use dhclient
> > b. make a script to start the interface later
> > c. use ip address
> 
> or d. add an entry to /etc/hosts
> 
> Some people are also running into problems with hostnames in pf.conf;
> a c and d apply in that case too.
> 
> Some of this could be fixed by having a way to ask dhcpleased to wait
> (with timeout) for an address during boot. For your example with wg,
> as well as that, netstart would need to be split i.e. start standard
> interfaces, then dhcpleased/unwind/resolvd, then tunnel interfaces.
> 
> I was going to say the same would apply for hostnames used in fstab
> if /usr and /var are NFS-mounted; but actually /usr and /var can't
> be NFS-mounted if you rely on addresses from dhcpleased to reach the
> NFS server anyway (these daemons need access to /var so they need
> to be started after /usr and /var are mounted).
> 

Actually this needs to be fixed in /etc/netstart, dhcpleased / slaacd. Until
now systems with dynamic IPs had the 10 second wait of dhclient to make sure
the interfaces are up and configured. This is no longer the case, and because
of this stuff breaks left and right.

Up until now the system relied on the fact that after /etc/netstart ran
the interfaces were up and configured (static or dynamic) and all
following services relied on this fact. Honestly adding host entries is
not a solution because it will not work in all cases, e.g. pf rules using
interface names as addresses will not work correctly.

There must be a way to wait at the end of netstart to ensure that network
configuration settled or timed out. IIRC dlg@ had a diff that allowed
something along these lines.

We already hit this issue with slaacd on IPv6-only setups and ignored it.
Now it affects everyone, let's not ignore it again.
-- 
:wq Claudio



Re: VLANs isolation

2021-07-13 Thread Claudio Jeker
On Tue, Jul 13, 2021 at 11:34:28AM +0200, Radek wrote:
> Hello,
> I'm going to build a router with +40 vlans.
> I need to block access from every vlan to each other (and then enable traffic 
> between certain vlans as needed).
> 
> How can I do this? Is there any one liner pf block rule to do this?  

Not really but you can try:

block out on vlan received-on vlan

It really depends on how you want to build your filters (outbound or
inbound filtering). Maybe it is better to just start with a block all rule
and slowly allow traffic back. You can use interface groups and pf tags to
help with rule writing.

-- 
:wq Claudio



Re: rpki-client and BLACKHOLE routes

2021-06-23 Thread Claudio Jeker
On Wed, Jun 23, 2021 at 11:40:25AM +0200, Hrvoje Popovski wrote:
> Hi all,
> 
> first of all, thank you for rpki-client, it's so easy to use it and to
> get the job done.
> I'm playing with rpki-client and denying ovs invalid statement and I've
> seen that with default ovs config statement (deny from ebgp ovs invalid)
> BLACKHOLE routes are blocked/invalid.
> 
> What is the right way to allow BLACKHOLE routes through rpki ? Or if
> someone can give me a hint on what to do.
> 

BLACKHOLE routes normally have a more specific check so you can re-allow
them after the ovs invalid check (for that you need to remove the quick
from the default ruleset, or allow the blackholes with quick before the
ovs check).

I guess you can use something along the lines of:
allow quick from group clients inet prefixlen 32 community $BLACKHOLE set nexthop blackhole
allow quick from group clients inet6 prefixlen 128 community $BLACKHOLE set nexthop blackhole

I guess you also have some client prefix-sets that should be added to the
filter rule so that one client cannot blackhole for another.

BLACKHOLE routes are done in many ways and I'm not sure if there is
consensus on who is allowed to announce what. Also, if there are multiple
paths to the destination, should the blackhole only be active if the
covering route is from the same peer?

-- 
:wq Claudio



Re: EACCES of UDP packet

2021-06-22 Thread Claudio Jeker
On Tue, Jun 22, 2021 at 04:48:26PM +0800, Siegfried Levin wrote:
> > Why have you chosen to hide information that may be useful in debugging 
> > your problem?
> 
> I’m truly sorry for the inconvenience but I do have some concerns of security 
> and privacy. I confirm it is not a broadcast address because it is the public 
> IP of the server and this issue has a probability of 1% to happen. The 
> address cannot just be a broadcast address at 1% of the time while not at the 
> rest of 99%. I also double checked it by SSHing to the address I copied from 
> the kdump, if it makes sense.
> 
> > So, since the manpage mentions blocking pf, I suggest the hypothesis "it 
> > returns EACCES because pf is blocking your packets".  I can think of 
> > several ways to test that; what testing have you performed to confirm or 
> > rule out that possibility?  "doas pfctl -d; run test; doas pfctl -e”?
> 
> This issue is really hard to reproduce because the application works at most 
> of the time, but I think you are right. I’ll be watching the pf log in next 
> weeks.
> 

Also check the various counters of netstat -s and especially pfctl -si (or
systat pf). In the pfctl output especially check memory, congestion or
state errors.
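
A rough sketch of narrowing the pfctl output down to those counters. The
function name is made up for illustration, and the counter names (memory,
congestion, state-mismatch, state-insert, state-limit) assume the usual
layout of the Counters section in pfctl -si; adjust them to what your
system actually prints.

```shell
#!/bin/sh
# Read `pfctl -si` output on stdin and print only the memory, congestion
# and state-related counters whose total is non-zero.
pf_suspect_counters() {
	awk '$1 ~ /^(memory|congestion|state-mismatch|state-insert|state-limit)$/ &&
	    $2 + 0 > 0 { print $1, $2 }'
}
```

Typical use would be "doas pfctl -si | pf_suspect_counters", run before and
after the EACCES shows up, to see which counter moved.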

-- 
:wq Claudio



Re: Prometheus on OpenBSD - does it work?

2021-06-15 Thread Claudio Jeker
On Tue, Jun 15, 2021 at 04:24:08PM +0200, Julien Pivotto wrote:
> Hello,
> 
> I am a Prometheus maintainer and we have received a bug regarding
> Prometheus - prometheus would no longer work on OpenBSD since we
> introduced MMAP:
> 
> https://github.com/prometheus/prometheus/issues/8877
> https://github.com/prometheus/prometheus/issues/8799
> 
> I would like to know if the facts here are accurate and, on the
> opposite, if there are happy openbsd users of Prometheus 2.19+.
> 
> I see that Prometheus 2.24 is packaged upstream, so I guess there are
> users. Can you please interact with us so we can better understand the
> situation at play.
> 

Unlike other operating systems, OpenBSD does not automatically synchronize
mmap-ed memory of a file with write() calls to the same file (OpenBSD has
no unified buffer cache). It requires msync(2) to make sure that mappings
are properly updated.

While prometheus works, it also does not. I looked into the TSDB code
and came to the conclusion that many operations (especially compaction)
fail because TSDB writes to file handles but uses mmaps of the same files
at the same time.

I fixed one case (the one mentioned in the issues, in index/index.go),
but then more errors show up when running the tsdb go test, including a
SEGV in db_test.go.

I played a bit more with this and skipping the bad test in db_test.go it
seems to mostly pass but errors out at the end:

level=error msg="WAL corruption detected; truncating" err="unexpected
CRC32 checksum 7c1a52ff, want 1020304"
file=/tmp/test_corrupted095078964/01 pos=44
PASS
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 17761 in state chan send, with
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1 on top of
the stack:
goroutine 17761 [chan send]:
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1(0xc001262fd0,
0xc0eff0)
/usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:571 +0x72
created by github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut
/usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:570 +0x7a

 Goroutine 18135 in state chan send, with
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1 on top of
the stack:
goroutine 18135 [chan send]:
github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut.func1(0xc99290,
0xc000be24b0)
/usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:571 +0x72
created by github.com/prometheus/prometheus/tsdb.(*SegmentWAL).cut
/usr/ports/pobj/prometheus-2.27.1/go/src/all/tsdb/wal.go:570 +0x7a
]
exit status 1
FAIL	github.com/prometheus/prometheus/tsdb	83.561s

The TSDB code is very hard to follow and debug. There are mmaps all over
the place and it is unclear which files are written to and which are not.
Also the MmapFile structs are not stored in some other structs, so it is
not that simple to call msync.

-- 
:wq Claudio

$OpenBSD$

Add msync to sync mmap buffers

diff --git tsdb/fileutil/mmap.go tsdb/fileutil/mmap.go
index 4dbca4f97..516991c60 100644
--- tsdb/fileutil/mmap.go
+++ tsdb/fileutil/mmap.go
@@ -71,3 +71,7 @@ func (f *MmapFile) File() *os.File {
 func (f *MmapFile) Bytes() []byte {
 	return f.b
 }
+
+func (f *MmapFile) Sync() error {
+	return sync(f.b)
+}
diff --git tsdb/fileutil/mmap_unix.go tsdb/fileutil/mmap_unix.go
index 043f4d408..c21829989 100644
--- tsdb/fileutil/mmap_unix.go
+++ tsdb/fileutil/mmap_unix.go
@@ -28,3 +28,7 @@ func mmap(f *os.File, length int) ([]byte, error) {
 func munmap(b []byte) (err error) {
 	return unix.Munmap(b)
 }
+
+func sync(b []byte) error {
+	return unix.Msync(b, unix.MS_ASYNC)
+}
diff --git tsdb/fileutil/mmap_windows.go tsdb/fileutil/mmap_windows.go
index b94226412..c54b6b125 100644
--- tsdb/fileutil/mmap_windows.go
+++ tsdb/fileutil/mmap_windows.go
@@ -44,3 +44,7 @@ func munmap(b []byte) error {
 	}
 	return nil
 }
+
+func sync(b []byte) error {
+	return nil
+}
diff --git tsdb/index/index.go tsdb/index/index.go
index a6ade9455..723f2bc73 100644
--- tsdb/index/index.go
+++ tsdb/index/index.go
@@ -552,6 +552,7 @@ func (w *Writer) finishSymbols() error {
 	if err := w.writeAt(w.buf1.Get(), hashPos); err != nil {
 		return err
 	}
+	w.symbolFile.Sync()
 
 	// Load in the symbol table efficiently for the rest of the index writing.
 	w.symbols, err = NewSymbols(realByteSlice(w.symbolFile.Bytes()), FormatV2, int(w.toc.Symbols))



Re: Howto measure pps at forwarding plane

2021-06-10 Thread Claudio Jeker
On Thu, Jun 10, 2021 at 09:23:03AM -, Stuart Henderson wrote:
> On 2021-06-10, Valdrin MUJA  wrote:
> > Hello,
> >
> > I'm trying to figure out how much packets are being forwarded on my OpenBSD 
> > firewall.
> > Here a small script i wrote.
> >
> >
> > #!/bin/sh
> >
> >
> > VAL1=`netstat -s | grep 'packets forwarded' | head -1 | awk -F ' ' '{print 
> > $1}'`
> >
> > sleep 1
> >
> > VAL2=`netstat -s | grep 'packets forwarded' | head -1 | awk -F ' ' '{print 
> > $1}'`
> >
> >
> > echo "$(($VAL2-$VAL1))"
> >
> >
> > But i can not be sure if i am doing the right thing?
> > Can anyone check it please.
> > Thanks.
> >
> 
> If you are only interested in IPv4 then yes that'll do it.
> This would save some cpu cycles though:
> 
> VAL1=`netstat -s | awk '/packets forwarded/ { print $1; exit }'`
> 

And use netstat -spip which limits the number of sysctls made in netstat.
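
One way to put both suggestions together, as a sketch only. The function
names are hypothetical; the awk match on 'packets forwarded' is the one
from the message above, and like it this only counts IPv4.

```shell
#!/bin/sh
# Read the IPv4 "packets forwarded" counter. -p ip restricts netstat to
# the IP statistics, so it has fewer sysctls to query.
forwarded_count() {
	netstat -s -p ip | awk '/packets forwarded/ { print $1; exit }'
}

# Packets per second between two counter samples.
# $1 = first sample, $2 = second sample, $3 = seconds between them (default 1).
pps() {
	echo $(( ($2 - $1) / ${3:-1} ))
}
```

It would be used as: a=$(forwarded_count); sleep 1; b=$(forwarded_count);
pps "$a" "$b". A longer interval (pass it as the third argument) smooths
out bursts.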

-- 
:wq Claudio



Re: openbgpd "depend on"

2021-06-09 Thread Claudio Jeker
On Wed, Jun 09, 2021 at 09:57:32AM +0200, open...@kene.nu wrote:
> Hello,
> 
> Just a question and maybe a suggestion. I am implementing a few DCs that
> use vxlan symmetric routing and hence, layer2 redundancy protocols like
> CARP (and VRRP/HSRP) do not work as intended due to evpn layer2 being the
> technology of choice to announce ARP entries.
> 
> This led me to try out the "depend on carp" functionality that is available
> on openbgpd. It does what I want, partially. It would be much more usable
> if you cold define what this functionality does in case of a CARP backup
> state. Currently it puts the bgp neighbor into Idle state. However, it
> would be better if one could define that it should as-path prepend and/or
> add a metric (MED) instead. This way, carp failovers would not rely on the
> tedious and relatively time consuming process of setting up a BGP session
> and announcing prefixes before it can truly be carp master.
> 
> WDYT?

The 'depend on' feature was added to use a CARP cluster as a BGP border
router (e.g. at an IXP that only gives one IP/port). In that case the
backup carp interface is not able to open a TCP session. The backup carp
interface is not reachable and the session would conflict with the master
session.

What you would like is to add 'depend on' to announcements (network
10.0.0.0/24 depend on carp0) or probably as a filter (match to group
uplinks depend on carp set med 100). At least this is how I understand
your request.

-- 
:wq Claudio



Re: pf, relayd, TCP keep alive and NAT, oh my!

2021-06-01 Thread Claudio Jeker
On Tue, Jun 01, 2021 at 10:25:38AM +1000, Cameron Simpson wrote:
> Can I enforce or implement TCP keep alives on a TCP stream via my 
> firewall?
> 
> Background:
> 
> I've got a client with an OpenBSD firewall and a Telstra NBN modem as 
> their modem.
> 
> Their IMAP server is upstream in the cloud (Unbuntu, courier imap). I 
> have this odd problem which I am beginning to suspect is the NBN modem 
> getting bored and dropping its NAT entries. Let me explain...
> 
> At the firewall end I see about 30 ESTABLISHED connections to the IMAP 
> server. At the IMAP server I see over 500, which is about where the IMAP 
> service stops accepting new connections, leading to errors from the 
> client mail readers.
> 
> My current theory is that the IMAP client connections issue the IMAP 
> IDLE command and go passive, waiting for email notifications from the 
> server.  So we have an idle TCP connection across the firewall and 
> across the NBN modem (which NATs).
> 
> My conjecture is that at some point the modem discards idle connection 
> states. (This could just as well happen at any other intermediate 
> stateful router too.) After that event, the client end does something 
> which tries to use the connection, gets an RST from the modem, clean 
> tidyup happens on the client and in the firewall.
> 
> At the server end, none of this is seen and the imapd just sits around 
> idle, never releasing the connection and never stopping the matching 
> daemon process. This gradually rises to hit the server's configured 
> connection limit and it stops accepting new things.
> 
> If I had TCP keep alive turned on, both ends might tidy themselves up.  
> I can't enable that on the clients (various mail readers) or, 
> apparently, on the server configuration. I can't do it in PF because PF 
> just copies packets. I can't seem to do it in relayd either, though that 
> seems the obvious way to intercept the connection for this purpose.
> 
> Any suggestions?

Make sure you use 'block return' at least for the imap connections. This
way when the state is dropped the firewall will issue a RST packet to the
server which will close the connection.

On OpenBSD there is the 'net.inet.tcp.always_keepalive' sysctl to enable
keepalive by default. So that is something you can enable on the IMAP
server to force keep-alive on there. Other systems have similar knobs.
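
A minimal sketch of turning that knob on for an OpenBSD host and keeping
it across reboots. Both function names are made up for illustration, and
/etc/sysctl.conf as the default target is an assumption about where the
setting should live on your system.

```shell
#!/bin/sh
# Enable keepalive probes for all TCP sockets right away (needs root).
enable_keepalive() {
	sysctl net.inet.tcp.always_keepalive=1
}

# Append the setting to a sysctl.conf-style file unless it is already there.
# $1 = file to update (defaults to /etc/sysctl.conf).
persist_keepalive() {
	conf=${1:-/etc/sysctl.conf}
	grep -qx 'net.inet.tcp.always_keepalive=1' "$conf" 2>/dev/null ||
	    echo 'net.inet.tcp.always_keepalive=1' >> "$conf"
}
```

Running persist_keepalive twice leaves a single line in the file, so it is
safe to call from a setup script.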

-- 
:wq Claudio



Re: openrsync manpage error

2021-05-17 Thread Claudio Jeker
On Fri, May 14, 2021 at 12:31:32PM +0300, Irshad Sulaiman wrote:
> Hi
>  Originally I was trying sync usb drive with openbsd box I was getting 
> same error 
> 
> Below is eg: I have two files bar and baz in home dir and dest is destination 
> directory 
> While trying to sync I get error 
> And if I try ‘rsync’ as command I get error not found 
> Iam in 6.9 release with syspatch updated 
> 
> 
> irshad:/home/irshad/test# ls
> bar  baz  dest
> irshad:/home/irshad/test# openrsync -t  bar baz dest/
> openrsync: error: unexpected end of file
> irshad:/home/irshad/test# openrsync -t  bar baz root@192.168.1.1:bar
> root@192.168.1.1's password:
> ash: rsync: not found
> openrsync: error: unexpected end of file
> irshad:/home/irshad/test# rsync
> ksh: rsync: not found
> irshad:/home/irshad/test# uname -a
> OpenBSD openbsd.local 6.9 GENERIC.MP#473 amd64
> irshad:/home/irshad/test#
> 

Yes, this behaviour is expected right now, since we install openrsync
as openrsync but --rsync-path defaults to rsync (as it should).
This is normally not an issue since the remote server most probably has
rsync installed. I also have rsync installed on most of my systems so I
did not notice this.

Right now people should use the rsync package since openrsync is not
compatible enough to work well in all scenarios.
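
If you want to stay with openrsync on both ends anyway, the workaround
from your diff can be wrapped up like this. It is only a sketch: the
function name is hypothetical, and it assumes the other side has openrsync
in its PATH.

```shell
#!/bin/sh
# Push files with openrsync to a destination that has openrsync but no rsync.
# $1 is the destination (for example host:dest), the rest are source files.
push_with_openrsync() {
	dest=$1
	shift
	openrsync -t --rsync-path=openrsync "$@" "$dest"
}
```

For the example from your mail that would be: push_with_openrsync dest/ bar baz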


> > On 14-May-2021, at 12:02 PM, Claudio Jeker  wrote:
> > 
> > On Fri, May 14, 2021 at 12:44:45AM +0300, Irshad Sulaiman wrote:
> >> Hi 
> >> 
> >> I have modified error in openrsync(1) manpage in Example section isn’t
> >> that ‘openrsync -t'  instead of 'rsync -t ‘
> >> And without --rsync-path= it gives an following error 'openrsync: error:
> >> unexpected end of file’
> > 
> > I did try all three examples and they do work for me without adding
> > --rsync-path=. On which command did you get the unexpected result.
> > Can you share the exact way to reproduce this issue?
> > 
> >> Apologize if Iam wrong 
> >> 
> >> Thanks 
> >> Irshad 
> >> 
> >> 
> >> 
> >> Index: rsync.1
> >> ===
> >> RCS file: /cvs/src/usr.bin/rsync/rsync.1,v
> >> retrieving revision 1.24
> >> diff -u -p -r1.24 rsync.1
> >> --- rsync.1   31 Mar 2021 20:36:05 -  1.24
> >> +++ rsync.1   13 May 2021 21:25:57 -
> >> @@ -234,7 +234,7 @@ with the local
> >> and
> >> .Pa ../src/baz :
> >> .Pp
> >> -.Dl % rsync -t ../src/bar ../src/baz host:dest
> >> +.Dl % openrsync -t --rsync-path=openrsync  ../src/bar ../src/baz host:dest
> >> .Pp
> >> To update the out-of-date local files
> >> .Pa bar
> >> @@ -245,7 +245,7 @@ with the remote files
> >> and
> >> .Pa host:src/baz :
> >> .Pp
> >> -.Dl % rsync -t host:src/bar :src/baz \&.
> >> +.Dl % openrsync -t --rsync-path=openrsync  host:src/bar :src/baz \&.
> >> .Pp
> >> To update the out-of-date local files
> >> .Pa ../dest/bar
> >> @@ -256,7 +256,7 @@ with
> >> and
> >> .Pa baz :
> >> .Pp
> >> -.Dl % rsync -t bar baz ../dest
> >> +.Dl % openrsync -t --rsync-path=openrsync  bar baz ../dest
> >> .\" .Sh DIAGNOSTICS
> >> .Sh SEE ALSO
> >> .Xr ssh 1
> >> 
> > 
> > -- 
> > :wq Claudio
> 

-- 
:wq Claudio



Re: openrsync manpage error

2021-05-14 Thread Claudio Jeker
On Fri, May 14, 2021 at 12:44:45AM +0300, Irshad Sulaiman wrote:
> Hi 
> 
> I have modified error in openrsync(1) manpage in Example section isn’t
> that ‘openrsync -t'  instead of 'rsync -t ‘
> And without --rsync-path= it gives an following error 'openrsync: error:
> unexpected end of file’

I did try all three examples and they do work for me without adding
--rsync-path=. On which command did you get the unexpected result?
Can you share the exact way to reproduce this issue?

> Apologize if Iam wrong 
> 
> Thanks 
> Irshad 
> 
> 
> 
> Index: rsync.1
> ===
> RCS file: /cvs/src/usr.bin/rsync/rsync.1,v
> retrieving revision 1.24
> diff -u -p -r1.24 rsync.1
> --- rsync.1   31 Mar 2021 20:36:05 -  1.24
> +++ rsync.1   13 May 2021 21:25:57 -
> @@ -234,7 +234,7 @@ with the local
>  and
>  .Pa ../src/baz :
>  .Pp
> -.Dl % rsync -t ../src/bar ../src/baz host:dest
> +.Dl % openrsync -t --rsync-path=openrsync  ../src/bar ../src/baz host:dest
>  .Pp
>  To update the out-of-date local files
>  .Pa bar
> @@ -245,7 +245,7 @@ with the remote files
>  and
>  .Pa host:src/baz :
>  .Pp
> -.Dl % rsync -t host:src/bar :src/baz \&.
> +.Dl % openrsync -t --rsync-path=openrsync  host:src/bar :src/baz \&.
>  .Pp
>  To update the out-of-date local files
>  .Pa ../dest/bar
> @@ -256,7 +256,7 @@ with
>  and
>  .Pa baz :
>  .Pp
> -.Dl % rsync -t bar baz ../dest
> +.Dl % openrsync -t --rsync-path=openrsync  bar baz ../dest
>  .\" .Sh DIAGNOSTICS
>  .Sh SEE ALSO
>  .Xr ssh 1
> 

-- 
:wq Claudio



Re: pf firewall bridge0 vether0 blocks DHCP for bridge interfaces connected to Windows

2021-03-10 Thread Claudio Jeker
On Wed, Mar 10, 2021 at 08:40:55PM +0100, da...@hajes.org wrote:
> Hi,
> 
> I did set up OpenBSD router/firewall on PC Engines APU4d4 box.
> 
> First interface is WAN that connects to Internet.
> 
> Remaining three interfaces are bridged with bridge0 via vether0.
> 
> firewall doesn't block LAN/bridge traffic on vether0.
> 
> DHCPD runs on bridge.
> 
> Two Linux hosts (connected to em2 and em3) connect without problem but
> Windows host DHCP requests are blocked on em1.
> 
> I didn't find any info regarding pf and bridging.

Please check bridge(4) manpage, especially the NOTES section.
 
> set skip on lo0
> set skip on bridge0

This line is useless. Packets never show up on bridge0. You need to add
the physical interfaces and vether0 to your ruleset.
 
> So far I have found a kludge for Windows "set skip on em1"
> 
> Once, above by line is present in pf.conf, Win 10 host is allowed to acquire
> IP address. Interesting is that Linux has no issues to acquire IP addresses
> via DHCP.
> 
> Any suggestions, please?
 
You need to fix your pf.conf.

> Is it something screwed up in Windows such as short 3-way-handshake?

I doubt it. Your ruleset is most probably not allowing packets to pass
properly over the bridge. Since you did not share your pf.conf file it is
impossible to give you a better answer. 

-- 
:wq Claudio



Re: iSCSI LUN mount on boot

2021-02-20 Thread Claudio Jeker
On Fri, Feb 19, 2021 at 07:48:25PM -0500, Ashton Fagg wrote:
> I'm curious as to what other folks are doing for mounting iSCSI volumes
> at boot time. I've successfully configured iscsid, and mounting the
> volume manually works as expected.
> 
> I found this article [1] which suggests that hotplugd should be used.
> 
> I also found this old presentation [2] which suggests it should "just
> work" with an entry in /etc/fstab. Maybe I did not get this correct, as:
> 
> .a /mnt/test ffs rw,noatime,nodev,nosuid,nofail 1 2
> 
> causes the machine to go into single-user mode on boot (presumably
> because the iSCSI daemon hasn't yet started).
> 
> Am I missing something here? Is hotplugd the preferred way to accomplish this?

Yeah, the documentation is not great.

You need to add 'net' to the mount options in /etc/fstab for iscsi drives.
Then our rc script will do the right thing and mount these drives late
(after iscsid started).

.a /mnt/test ffs rw,noatime,nodev,nosuid,net 1 2

With that it should work. You can not use iscsi for /, /usr, /var but it
works for /home or /var/www.

-- 
:wq Claudio



Re: Unknown process modifying routing table

2021-02-06 Thread Claudio Jeker
On Sat, Feb 06, 2021 at 02:16:20PM +0100, Otto Moerbeek wrote:
> On Sat, Feb 06, 2021 at 12:18:40PM +, James wrote:
> 
> > I've disabled my VPN on the machine as well as dhclient, connecting via a
> > fixed static IP address and DNS servers. My routing table is still being
> > modifed by PID 0 (which I assume to be the kernel) every 30 minutes or so.
> > Ntpd is also disabled.
> > 
> > I have also caught my machine communicating to one the of the IPs via TCP
> > and have a pcap dump from wireshark. No actual data was sent other than a
> > TCP timestamp.
> > 
> > > If your default route is a VPN,
> > > please show how you establish the VPN to be your default route.
> > > 
> > The default route is established mannually in a script that is run after the
> > VPN starts. Essentially it does the following:
> > 
> >     route add $VPN_HOST $DEFAULT_GW
> > 
> >     route change default $VPN_HOST
> > 
> > 
> > I do not belive the VPN to be the cause of this problem.
> > 
> > 
> > Any tips on debugging the kernel to track the cause of these route changes
> > would be greatly appreciated.
> > 
> > 
> > Thanks,
> > 
> 
> The kernel uses the routing table to store things like PMTU discovery
> data and ARP entries,
> 

Also showing the route -n monitor output will help to identify what is
going on.

-- 
:wq Claudio



Re: Ask ospfd

2021-02-01 Thread Claudio Jeker
On Tue, Feb 02, 2021 at 12:06:37PM +0700, Adiwangsa Kusumah wrote:
> Dear All,
> 
> I have topology as below:
> 
> UP1 UP2
> \ /
>   \  /
>   OBSD6.6
> /\
>   /\
> OSPF1OSPF2
> 
> 
> I use openbgpd to upstream and  openospfd to internal
> I want my openbsd send 0.0.0.0/0 to my ospf (single area)
> 
> At my bgpd.conf  I add
> network 0.0.0.0/0
> 
> Ay my ospfd I tri to add
> redistribute default
> and/or
> redistribute 0.0.0.0/0
> 
> when i check my ospf, there is no 0.0.0.0 send to my internal network
> 
> ospfctl sh database self-originated
> 
> Link ID Adv Router  Age  Seq#   Checksum
> 10. xxx.xxx.248  103.xxx.xxx.11   1225 0x8048 0x2471
> 10. xxx.xxx.252  103. xxx.xxx.11   1225 0x804a 0xf797
> 103. xxx.xxx.72103. xxx.xxx.11   1225 0x8048 0xe1c4
> 103. xxx.xxx.60   103. xxx.xxx.11   1225 0x804a 0x858d
> 103. xxx.xxx.12   103. xxx.xxx.11   1225 0x804a 0x3b05
> 
> Is that any additional configuration at my bgpd.conf or my ospfd.conf?
> Your advice will be appreciated.
> 

ospfd(8) redistribute requires that the corresponding route is present in
the routing table (route -n get default). This is not the case for
bgpd(8). So make sure that you have a default route in the kernel routing
table.
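
A quick check to run before digging further into ospfd.conf, as a sketch.
has_default_route is a hypothetical helper, and it assumes route(8) exits
non-zero when the lookup finds no matching route.

```shell
#!/bin/sh
# Succeeds if the kernel routing table has a default route.
has_default_route() {
	route -n get default >/dev/null 2>&1
}
```

For example: has_default_route || echo "no default route, nothing to redistribute"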

-- 
:wq Claudio



Re: ospf on wg(4)

2021-01-30 Thread Claudio Jeker
On Sat, Jan 30, 2021 at 09:14:50AM +, Olivier Cherrier wrote:
> On Fri, Jan 29, 2021 at 10:32:45PM +0100, bast...@durel.org wrote:
> > 
> 
> Hello Bastien,
> 
> > It is possible, I use it myself. You have to allow multicast address on
> > wg(4) interface(s):
> > 225.0.0.5 for all OSPF routers
> > 224.0.0.6 for all DR/BDR
>  
> Arfff indeed.
> Thank you for helping me on this. It works now.
> 
> (Note it is 224.0.0.5 and 224.0.0.6 though).
> 

So wireguard reinvented IPsec flows but even worse. Wow.

-- 
:wq Claudio



Re: bgpd not including MED attribute on updates

2021-01-28 Thread Claudio Jeker
On Thu, Jan 28, 2021 at 02:51:33PM +0100, open...@kene.nu wrote:
> In my case MED is changed with + on every eBGP hop. I use it to
> calculate the total MED over several hops to decide the best path from a
> latency point of view.
> 
> My intention with listing the advertised prefix from R1 was to show that
> there is a MED present. As per the tcpdump I did, the MED attribute is not
> included in the BGP update packets. This I have confirmed is the case for
> all prefixes on R1 that has an iBGP nexthop. Any prefixes that in R1 has an
> eBGP nexthop advertises MED as expected.
> 
> The "bgpd -vn" for R1:
> 
> AS 64660
> router-id 172.30.37.1
> socket "/var/run/bgpd.sock.0"
> holdtime 9
> rde med compare always
> nexthop qualify via bgp
> 
> prefix-set "internal" {
> 
> }
> 
> rde rib Adj-RIB-In no evaluate
> rde rib Loc-RIB rtable 0 fib-update yes
> 
> neighbor 172.30.1.54 {
> descr "R2"
> remote-as 64840
> enforce neighbor-as yes
> enforce local-as yes
> announce IPv4 unicast
> }
> 
> group "rr" {
> neighbor 172.30.37.25 {
> descr "rr1"
> remote-as 64660
> local-address 172.30.37.1
> enforce neighbor-as no
> enforce local-as yes
> announce IPv4 unicast
> }
> neighbor 172.30.37.39 {
> descr "rr2"
> remote-as 64660
> local-address 172.30.37.1
> enforce neighbor-as no
> enforce local-as yes
> announce IPv4 unicast
> }
> }
> 
> deny from any
> deny to any
> allow to ebgp prefix-set "internal"
> allow to ibgp prefix-set "internal"
> allow from ebgp prefix-set "internal"
> allow from group "rr" prefix-set "internal"
> match to ibgp set { nexthop self }
> match from 172.30.1.54 set { metric +23 }

Any route learned via rr1 or rr2 will not pass the MED on to R2 because
the system does not touch the MED; bgpd therefore considers the MED
received from rr1 and rr2 to have originated from outside, and so it is
excluded from UPDATEs to EBGP peers.

You should add a 'match from ibgp set med +0' rule, which makes a MED
learned via IBGP eligible to be sent out.
 
> On Thu, Jan 28, 2021 at 2:01 PM Claudio Jeker 
> wrote:
> 
> > On Thu, Jan 28, 2021 at 12:41:29PM +0100, open...@kene.nu wrote:
> > > Hello,
> > >
> > > I am experiencing this on 6.8, fully syspatched.
> > >
> > > root@R1():~ # uname -a
> > > OpenBSD R1 6.8 GENERIC.MP#4 amd64
> > >
> > > The problem is that R1 sends updates with MED set to 0 even though I
> > expect
> > > it not to be. Upon reviewing a tcpdump pcap taken at R2, the MED
> > attribute
> > > is not even included in said update sent from R1.
> > >
> > > This only applies to some, not all updates, in my case it seems to affect
> > > routes where R1 has an ospf discovered nexthop. (172.30.37.2)
> > >
> > > root@R1():~ # route -n get 172.30.37.2 | grep priority
> > >priority: 32 (ospf)
> > >
> > > root@R1():~ # route -n get 172.30.1.110 | grep priority
> > >priority: 8 (static)
> > >
> > > root@R1():~ # bgpctl sh ip bgp neigh R2 out | egrep "172.30.194.[1234]"
> > > *   N 172.30.194.1/32  172.30.1.110  100   210 64750 i
> > > *   N 172.30.194.2/32  172.30.37.2   100   251 64750 i
> > > *   N 172.30.194.3/32  172.30.1.110  100   210 64750 i
> > > *   N 172.30.194.4/32  172.30.1.110  100   210 64750 i
> > >
> > > root@R2():~ $ bgpctl sh ip bgp neigh R1 in | egrep "172.30.194.[1234]"
> > > *   N 172.30.194.1/32  172.30.1.55100   210 64660 64750
> > i
> > > *   N 172.30.194.2/32  172.30.1.55100 0 64660 64750
> > i
> > > *   N 172.30.194.3/32  172.30.1.55100   210 64660 64750
> > i
> > > *   N 172.30.194.4/32  172.30.1.55100   210 64660 64750
> > i
> >
> > Please remember that MED is not really a transitive attribute. It only
> > hops into an AS but not accross it. So a MED recv from an EBGP session is
> > not forwarded. If the MED is changed (e.g. set med +1 -- maybe set med +0
> > works as well, don't remember) then the MED will be passed on.
> > From the output the session between R1 and R2 is EBGP so it very much
> > depends on your filter rules. If the MED was changed by the ruleset it
> > will be sent if not it will be filtered.
> >
> > With the limited information it is not really possible to know. Note, the
> > adj-rib-out output on R1 shows the prefix before the attribute is stripped.
> > Also the ASPATH prepend happens then.
> >
> > --
> > :wq Claudio
> >

-- 
:wq Claudio



Re: bgpd not including MED attribute on updates

2021-01-28 Thread Claudio Jeker
On Thu, Jan 28, 2021 at 12:41:29PM +0100, open...@kene.nu wrote:
> Hello,
> 
> I am experiencing this on 6.8, fully syspatched.
> 
> root@R1():~ # uname -a
> OpenBSD R1 6.8 GENERIC.MP#4 amd64
> 
> The problem is that R1 sends updates with MED set to 0 even though I expect
> it not to be. Upon reviewing a tcpdump pcap taken at R2, the MED attribute
> is not even included in said update sent from R1.
> 
> This only applies to some, not all updates, in my case it seems to affect
> routes where R1 has an ospf discovered nexthop. (172.30.37.2)
> 
> root@R1():~ # route -n get 172.30.37.2 | grep priority
>priority: 32 (ospf)
> 
> root@R1():~ # route -n get 172.30.1.110 | grep priority
>priority: 8 (static)
> 
> root@R1():~ # bgpctl sh ip bgp neigh R2 out | egrep "172.30.194.[1234]"
> *   N 172.30.194.1/32  172.30.1.110  100   210 64750 i
> *   N 172.30.194.2/32  172.30.37.2   100   251 64750 i
> *   N 172.30.194.3/32  172.30.1.110  100   210 64750 i
> *   N 172.30.194.4/32  172.30.1.110  100   210 64750 i
> 
> root@R2():~ $ bgpctl sh ip bgp neigh R1 in | egrep "172.30.194.[1234]"
> *   N 172.30.194.1/32  172.30.1.55100   210 64660 64750 i
> *   N 172.30.194.2/32  172.30.1.55100 0 64660 64750 i
> *   N 172.30.194.3/32  172.30.1.55100   210 64660 64750 i
> *   N 172.30.194.4/32  172.30.1.55100   210 64660 64750 i

Please remember that MED is not really a transitive attribute. It only
hops into an AS but not across it. So a MED received from an EBGP session
is not forwarded. If the MED is changed (e.g. set med +1 -- maybe set med +0
works as well, don't remember) then the MED will be passed on.
From the output the session between R1 and R2 is EBGP so it very much
depends on your filter rules. If the MED was changed by the ruleset it
will be sent; if not, it will be filtered.

With the limited information it is not really possible to know. Note, the
adj-rib-out output on R1 shows the prefix before the attribute is stripped.
Also the ASPATH prepend happens then.

-- 
:wq Claudio



Re: osp6d p2p send_ls_update

2021-01-06 Thread Claudio Jeker
On Tue, Dec 29, 2020 at 06:39:36PM +0200, Kapetanakis Giannis wrote:
> Hi,
> 
> I've changed today my config from broadcast to p2p for both ipv4 and ipv6.
> 
> In ospf6d I get this quite often:
> 
> Dec 29 17:39:00 ospf6d[40695]: send_packet: error sending packet on interface 
> vlanX: Network is unreachable
> Dec 29 17:39:00 ospf6d[40695]: send_ls_update: Network is unreachable
> 
> debugging send_packet shows:
> Dec 29 18:12:57 ospf6d[65033]: send_packet: error sending packet on interface 
> vlanX to ::: Network is unreachable
> Dec 29 18:12:57 ospf6d[65033]: send_ls_update: Network is unreachable
> 
> The dst_address of send_packet is :::
> This comes from send_ls_update
> 
> system is current (20 dec).
> 
> maybe something more is missing for P2P?
> 

I just sent a patch for this to tech@. I included the diff here as well.
With this my P2P link works now.

-- 
:wq Claudio

Index: lsupdate.c
===
RCS file: /cvs/src/usr.sbin/ospf6d/lsupdate.c,v
retrieving revision 1.18
diff -u -p -r1.18 lsupdate.c
--- lsupdate.c  15 Jul 2020 14:47:41 -  1.18
+++ lsupdate.c  6 Jan 2021 11:28:43 -
@@ -474,7 +474,7 @@ ls_retrans_timer(int fd, short event, vo
 			/* ls_retrans_list_free retriggers the timer */
 			return;
 		} else if (nbr->iface->type == IF_TYPE_POINTOPOINT)
-			memcpy(&addr, &nbr->iface->dst, sizeof(addr));
+			memcpy(&addr, &nbr->addr, sizeof(addr));
 		else
 			inet_pton(AF_INET6, AllDRouters, &addr);
 	} else
Index: packet.c
===
RCS file: /cvs/src/usr.sbin/ospf6d/packet.c,v
retrieving revision 1.17
diff -u -p -r1.17 packet.c
--- packet.c23 Dec 2019 07:33:49 -  1.17
+++ packet.c6 Jan 2021 11:52:08 -
@@ -82,12 +82,9 @@ send_packet(struct iface *iface, struct 
     struct in6_addr *dst)
 {
 	struct sockaddr_in6	sa6;
-	struct msghdr		msg;
-	struct iovec		iov[1];
 
-	/* setup buffer */
+	/* setup sockaddr */
 	bzero(&sa6, sizeof(sa6));
-
 	sa6.sin6_family = AF_INET6;
 	sa6.sin6_len = sizeof(sa6);
 	sa6.sin6_addr = *dst;
@@ -104,15 +101,8 @@ send_packet(struct iface *iface, struct 
 			return (-1);
 	}
 
-	bzero(&msg, sizeof(msg));
-	msg.msg_name = &sa6;
-	msg.msg_namelen = sizeof(sa6);
-	iov[0].iov_base = buf->buf;
-	iov[0].iov_len = ibuf_size(buf);
-	msg.msg_iov = iov;
-	msg.msg_iovlen = 1;
-
-	if (sendmsg(iface->fd, &msg, 0) == -1) {
+	if (sendto(iface->fd, buf->buf, ibuf_size(buf), 0,
+	    (struct sockaddr *)&sa6, sizeof(sa6)) == -1) {
 		log_warn("send_packet: error sending packet on interface %s",
 		    iface->name);
 		return (-1);
@@ -186,11 +176,16 @@ recv_packet(int fd, short event, void *b
 	 * AllDRouters is only valid for DR and BDR but this is checked later.
 	 */
 	inet_pton(AF_INET6, AllSPFRouters, &addr);
-
 	if (!IN6_ARE_ADDR_EQUAL(&dest, &addr)) {
 		inet_pton(AF_INET6, AllDRouters, &addr);
 		if (!IN6_ARE_ADDR_EQUAL(&dest, &addr)) {
-			if (!IN6_ARE_ADDR_EQUAL(&dest, &iface->addr)) {
+			struct iface_addr *ia;
+
+			TAILQ_FOREACH(ia, &iface->ifa_list, entry) {
+				if (IN6_ARE_ADDR_EQUAL(&dest, &ia->addr))
+					break;
+			}
+			if (ia == NULL) {
 				log_debug("recv_packet: packet sent to wrong "
 				    "address %s, interface %s",
 				    log_in6addr(&dest), iface->name);



Re: osp6d p2p send_ls_update

2020-12-29 Thread Claudio Jeker
On Tue, Dec 29, 2020 at 06:39:36PM +0200, Kapetanakis Giannis wrote:
> Hi,
> 
> I've changed today my config from broadcast to p2p for both ipv4 and ipv6.
> 
> In ospf6d I get this quite often:
> 
> Dec 29 17:39:00 ospf6d[40695]: send_packet: error sending packet on interface 
> vlanX: Network is unreachable
> Dec 29 17:39:00 ospf6d[40695]: send_ls_update: Network is unreachable
> 
> debugging send_packet shows:
> Dec 29 18:12:57 ospf6d[65033]: send_packet: error sending packet on interface 
> vlanX to ::: Network is unreachable
> Dec 29 18:12:57 ospf6d[65033]: send_ls_update: Network is unreachable
> 
> The dst_address of send_packet is :::
> This comes from send_ls_update
> 
> system is current (20 dec).
> 
> maybe something more is missing for P2P?

Yes, I also ran into it and had no energy yet to fix it.
There was a fix in ospfd that was not ported over.

-- 
:wq Claudio



Re: OSPF and CARP interfaces

2020-12-22 Thread Claudio Jeker
On Tue, Dec 22, 2020 at 02:04:27PM +0100, open...@kene.nu wrote:
> Hello,
> I am seeing what I deem to be unexpected behavior with ospfd and depending
> on carp interfaces.
> Running 6.8 with latest patches applied on all three routers.
> 
> # uname -a
> OpenBSD extfw1.lab.kambi.com 6.8 GENERIC.MP#2 amd64
> 
> My setup is as following;
> Two openbsd boxes (FW1 and FW2) acting as a firewall pair sharing carp
> interfaces.
> Single openbsd box (R1) that in this instance acts as a client trying to
> reach servers that are reachable via the FWs.
> VLan20 (actually carp20) is my nexthop (BGP wise) to reach any networks
> behind the FW pair.
> VLan21 is the link network between all the three boxes. The FWs share a
> carp21 interface.
> 
> My FW ospfd.conf (same on all three boxes apart from the "depend on" which
> is absent from R1):
> router-id 
> 
> area 0.0.0.0 {
> interface lo1
> interface vlan20 {
> depend on carp20
> }
> interface vlan21 {
> depend on carp21
> }
> }

I would change the config to just use

area 0.0.0.0 {
interface lo1
interface carp20
interface vlan21
}

This way the network on vlan20/carp20 will be announced depending on the
carp state with the backup system announcing the same route with a high
metric. There is no need to use "depend on" for such a simple case.

For vlan21 I would not do that since there you want reachability in any
case especially if you announce BGP networks on the firewalls with the
carp21 address (instead of the default vlan21 one).
 
> Carp20:
> root@FW1:~ # ifconfig carp20 | grep inet
> inet 172.30.9.21 netmask 0xfffffff0 broadcast 172.30.9.31
> 
> Now to the strange part. I see that the selected route in R1 points to FW1
> even though carp20/21 on FW1 is in state BACKUP. No matter what I do, apart
> from setting static metrics, ospfd on R1 always selects FW1 as nexthop.
> root@FW1:~ # ifconfig vlan21 | grep inet
> inet 172.30.9.34 netmask 0xfffffff0 broadcast 172.30.9.47
> root@FW1:~ # ifconfig carp20 | grep carp:
> carp: BACKUP carpdev vlan20 vhid 1 advbase 1 advskew 10
> root@FW1:~ # ifconfig carp21 | grep carp:
> carp: BACKUP carpdev vlan21 vhid 1 advbase 1 advskew 10
> 
> root@FW2:~ # ifconfig vlan21 | grep inet
> inet 172.30.9.35 netmask 0xfffffff0 broadcast 172.30.9.47
> root@FW2:~ # ifconfig carp20 | grep carp:
> carp: MASTER carpdev vlan20 vhid 1 advbase 1 advskew 100
> root@FW2:~ # ifconfig carp21 | grep carp:
> carp: MASTER carpdev vlan21 vhid 1 advbase 1 advskew 100
> 
> root@R1:~ # ospfctl sh neigh
> ID              Pri State        DeadTime Address         Iface     Uptime
> 172.30.9.4      1   FULL/OTHER   00:00:38 172.30.9.35     vlan21    00:21:33
> 172.30.9.3      1   FULL/BCKUP   00:00:38 172.30.9.34     vlan21    00:22:14
> 
> root@R1:~ # ospfctl sh fib | grep 172.30.9.16/2
> *O   32 172.30.9.16/28   172.30.9.34
> *O   32 172.30.9.16/28   172.30.9.35
> 
> root@R1:~ # ospfctl sh rib | grep 172.30.9.16/2
> 172.30.9.16/28   172.30.9.34   Intra-Area   Network   20   00:30:33
> 172.30.9.16/28   172.30.9.35   Intra-Area   Network   20   00:29:56
> 
> root@R1:~ # route -n get 172.30.9.21
>    route to: 172.30.9.21
> destination: 172.30.9.16
>        mask: 255.255.255.240
>     gateway: 172.30.9.34
>   interface: vlan21
>  if address: 172.30.9.37
>    priority: 32 (ospf)
>       flags: 
>      use       mtu    expire
>       11         0         0
> 
> As seen above R1 selects 172.30.9.34 as the nexthop based on ospf which is
> wrong. It should be 172.30.9.35 as FW2 is carp master for carp20/21. What I
> in the end want to achieve is that the router with carp20/21 MASTER should
> be the preferred carp20 nexthop. An assumption can be made that carp20/21
> will always have the same FW as master in my case.

-- 
:wq Claudio



Re: RISC-V and OpenBSD

2020-12-09 Thread Claudio Jeker
On Wed, Dec 09, 2020 at 05:30:48PM +0200, Mihai Popescu wrote:
> Would it be interesting from the OpenBSD point of view [1] ?
> 
> [1] http://www.micromagic.com/news/RISCv-Fastest_PR.pdf

No, this is just PR. We need HW to run on.

-- 
:wq Claudio



Re: APU4 hardware network interfaces tied together

2020-11-16 Thread Claudio Jeker
On Mon, Nov 16, 2020 at 06:37:50PM -0700, John McGuigan wrote:
> On Mon, Nov 16, 2020, 6:05 PM Stuart Henderson  wrote:
> 
> >
> > bridge (and theoretically switch but I never got it to do anything
> > useful) make a group of ports act like a network switch (maybe with
> > filtering between the ports).
> >
> 
> I've been having issues with switch (4) as well... The reason I decided to
> go for switch vs bridge on my APU2 is that, from what I understood, bridge
> invokes some ugly locks in the kernel whereas switch was written without as
> big of locks in mind. I could be wrong here but maybe someone can correct
> me.

From my knowledge, switch(4) has the same limitation as bridge(4) when it
comes to locks. Both require the big kernel lock to operate.
 
> I have a feeling there is something wrong with switch(4) but I haven't been
> able to independently test that.

switch(4) is mostly for people that want to play with SDN and should not
be used as bridge(4) replacement. It is far from finished.

-- 
:wq Claudio



Re: Impact of 002_icmp6.patch

2020-10-30 Thread Claudio Jeker
On Fri, Oct 30, 2020 at 11:15:31AM +0100, js-openbsd-m...@webkeks.org wrote:
> > On 30.10.2020 at 01:28, Theo de Raadt wrote:
> > 
> > js-openbsd-m...@webkeks.org wrote:
> > 
> >> I just saw
> >> https://ftp.openbsd.org/pub/OpenBSD/patches/6.8/common/002_icmp6.patch.sig,
> >> however, it's unclear from the description and the context around the
> >> patch if this is a read after free or write after free (or both).
> > 
> > I think it is fair you can study the code yourself and make your own
> > factual determination.
> 
> As said, it is not immediately obvious to me if this is just read-after-free 
> or also write-after-free. Hence I was hoping someone who either wrote the fix 
> or who is more familiar with the code than me could enlighten me. It's not 
> one of those obvious fixes where you see the buffer overflow just below.
> 
> >> In the case of a write after free, would this change "Only two remote
> >> holes in the default install, in a heck of a long time!" to three? Or
> >> does it need more than IPv6 being configured?
> > 
> > First off, is ipv6 deployment really part of the default install?  No,
> > not really it takes some effort to configure v6, it is not natural.
> 
> The same could be said for v4 though, so is networking not considered part of 
> the default install? How did the 2 remote holes happen without network then, 
> though? Please help me understand, because the installer asked me for IPv6 
> just as it did for IPv4, so I would consider them both equally default.
> 
> > It is active on the loopback, but then that's not remote..
> 
> What about link-local IPv6? That's active by default, isn't it?
> 
> In any case, are you saying just removing the inet6 address from all 
> interfaces would be a sufficient workaround if an immediate update is not 
> possible? (Of course, only as a workaround until it's possible)
> 
> > But there's a bigger assumption in your mail:
> > 
> > We've released the errata as security because it is possibly exploitable
> > or could cause a crash, and we have a rapid fix release process.  It was
> > released without even seeing any evidence of a remote crash, nor any
> > evidence of a remote exploit.  Incorrect code gets fixed, and if we
> > judge it important we release a fix to the public in expedited fashion,
> > and apparently get judged for doing so.
> 
> And that is good. But it still does not help in determining the impact, i.e.: 
> Was this just a remote DoS (read-after-free) or a potential RCE 
> (write-after-free)? For the former, I would just update; for the latter, it is 
> time to reinstall my machines.
> 
> > Now that the fix is released and deployed by most openbsd users, we
> > quickly become uncurious and head back to other work.  The only
> > conversations related to this are asking how we can harden the mbuf
> > layer to avoid similar issues in the future.
> 
> Which seems like a good strategy, but still, don't you think it's valuable to 
> know what the maximum impact was in the worst-case? I fully agree with being 
> over cautious and calling something an RCE rather than a DoS when it's 
> unclear (a write-after-free could look like a DoS at first and turn out to be 
> RCE, after all), but some things are limited in impact (a read-after-free 
> usually isn't more than a DoS).
> 
> > I guess many other operating systems would wait weeks or months to
> > collect all the "facts" and make a fancy disclosure, but we shipped
> > source and binary fixes in just over 24 hours.
> 
> Again, I think that time is better spent fixing it fast than writing a fancy 
> disclosure. I am merely curious if this was just read-after-free or 
> write-after-free (or both) to make my own risk determination.
> 
> > So, is it a remote crash?  Possibly, but we'd like to see a packet
> > that causes it.
> > 
> > Next after that, is it a remote exploit?
> > 
> > I think it is fair to wait for facts.
> 
> So, what you're saying is, it is only tagged as a security out of caution, 
> not because it necessarily is exploitable?
> 
> > I also think you are a troll.
> 
> Not everybody trying to understand the impact of a security bug is a troll ;).
> 
> I merely brought up the 2 remote holes because I was wondering if this could 
> be used as a signal that it's not remotely exploitable, as it's still 2.

Honestly, as one of the devs involved with this security fix, I can tell
you that I don't know. It is a use-after-free in some situations.
Is it reachable from remote? I don't know.
Is it reachable from local? Maybe.
Is the use-after-free exploitable? Damn hard to tell, it is for sure not easy.
Was there a PoC exploit? No, there was no PoC.
I will not invest hours of my time to figure out something that does not
really interest me. The fix is out, everyone can update.

-- 
:wq Claudio



Re: Primepower 250 vs Sunfire v215

2020-09-20 Thread Claudio Jeker
On Sun, Sep 20, 2020 at 08:00:45PM +0300, Kihaguru Gathura wrote:
> > The Primepower is bigger and needs more power but if you find a box with
> > good CPUs and memory it should run faster than a V215
> 
> How did the performance of the PrimePower 250 SCSI drives compare to Sun
> Fire V215 SAS drives?

Any spinning rust is slow compared to SSDs. I run my Fire V215 with an
NVMe disk for the busy partitions (but boot from the SAS drives). This is
not really possible with the Primepower 250 (hard to find any kind of SSD
for that system).

-- 
:wq Claudio



Re: Primepower 250 vs Sunfire v215

2020-09-20 Thread Claudio Jeker
On Sun, Sep 20, 2020 at 09:02:55AM +0300, Kihaguru Gathura wrote:
> Hi,
> 
> For those who have experience with older Sparc machines, Which hardware
> offers better reliability/stability?
> 
> Fujitsu Primepower 250 or Sun fire V215.
> 

Depends mostly on how well they were handled. Also if they are equipped
with all the PSUs. I used both for a long time, neither caused me issues.
The Primepower is bigger and needs more power but if you find a box with
good CPUs and memory it should run faster than a V215.
On the other hand the V215 has PCIe slots and so NVMe disks are an option.

-- 
:wq Claudio



Re: pf, send(2) and EACCES

2020-08-28 Thread Claudio Jeker
On Fri, Aug 28, 2020 at 11:40:17AM -0400, Daniel Jakots wrote:
> On Fri, 28 Aug 2020 16:06:48 +0200, Sebastien Marie 
> wrote:
> 
> > - generate a lot of postgresql access. From the postgresql thread, the
> > statement seems to be a SELECT, so it would be fine to run in a loop
> > (hoping for no cache and real traffic generated).
> > 
> > - run pfctl -Treplace in a loop (with a set of different files as the
> > kernel code takes care if host are added, changed, deleted)
> 
> I ran the select on one machine and the pfctl -Treplace on db1 both in
> a `while :` for about two hours and it didn't happen.
> 
> I'll try again if the problem happens genuinely again.

Have a look at the pf(4) stats, and especially check whether the congestion
counter increases when you see the error. If pf(4) detects network congestion
then ruleset evaluation is skipped and only state matching happens. In
that case you can get EACCES for connections that would normally be
allowed by pf(4).

-- 
:wq Claudio



Re: bgpd config advice needed

2020-08-24 Thread Claudio Jeker
On Mon, Aug 24, 2020 at 04:36:10PM +, Laura Smith wrote:
> Hi,
> 
> Let's say I've got a scenario where I've got transit ISPs and peering 
> connections.
> 
> My general config rule is that I use med to prioritise peering over transit 
> (because localpref is too high up in the BGP selection algorithm, so 
> localpref is a sledgehammer to crack a nut).
> 
> That setup has served me well.  But now with increasing peering connections, 
> I'm seeing the wrong peer being selected for a route, e.g. (IPs and ASNs 
> obfuscated to protect the innocent) 
> 
> *>  N 2001:db8:::/29   2001:db8::::1    100   100 64512 65500 i
> *   N 2001:db8:::/29   2001:db8::::2    100   100 65500 65500 i
> 
> In this example, both 64512 and 65500 are peers (med=100) but obviously 65500 
> 65500 should be the preferred route.
> 
> What options do I have to resolve this sort of tie-break ?  Ideally I'd
> like to find something that would resolve all such instances rather than
> have to introduce config hacks on a per-peer basis.
> 

A possible option is to prefer announcements from the neighbor which is
the originator. To do this you can use a rule like:

   match from ebgp source-as neighbor-as set med +100

Now it is a bit strange that an AS is prepending on peering. I wonder why
they do that (is their connection to the IX undersized?).
-- 
:wq Claudio



Re: rtables and kernel routes

2020-08-21 Thread Claudio Jeker
On Fri, Aug 21, 2020 at 08:45:36AM +0200, open...@kene.nu wrote:
> Hello,
> 
> I am seeing rather strange, or maybe expected, behaviour. I utilise
> rtables to send internal traffic towards the internet via a default
> route in rtable 2. The traffic is punted to rtable 2 with pf. The
> strangeness I am seeing is that unless there is a matching dummy route
> in rtable 0 the traffic gets dropped on ingress hence the pf ruleset
> that moves it into rtable 2 is never evaluated.
> 
> Is this expected? The man pages for rdomain seems to suggest so but it
> is not explicitly stated.

I guess with internal traffic you mean traffic on the local LAN that is
forwarded by the router. Not traffic local to the machine.

pf(4) runs twice in your box. Once on packet reception (in rules) and once
before sending out a packet (out rules). In between these two checkpoints
packet forwarding happens (if forwarding is enabled and traffic is
not for the local system). During forwarding a route lookup is made and
based on that lookup the packet is sent out on the right interface.
If this lookup fails the packet can't be forwarded and is dropped. Now
the pf hook for out rules happens after this point and so a valid route is
required to get there.

In your case you either need a (default) route in rtable 0 so that traffic
makes it to the out rule that then changes the rtable to 2 and sends out
the packet towards the internet or you need to change the rtable on input
(match in ... rtable 2) so that the forwarding lookup is done on rtable 2
(where there is a valid route to the destination).

It seems most people prefer to write pf rulesets like yours with out rules
and so a dummy default route in rtable 0 is needed but from a technical
perspective it is better to do the rtable change on input. By doing so you
actually save an extra route lookup (the one on rtable 0 hitting the dummy
route).
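
To make the ordering concrete, here is a toy model in C. It is only a
sketch of the sequence described above (in rules, route lookup, out rules),
not kernel code. struct toy_router, toy_forward() and the assumption that
packets start out in rtable 0 are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NRTABLES 3

/* Toy state, hypothetical: which rtables hold a route to the destination. */
struct toy_router {
	bool	has_route[NRTABLES];
	int	in_rule_rtable;		/* rtable set by an "in" rule, -1 if none */
	int	out_rule_rtable;	/* rtable set by an "out" rule, -1 if none */
};

enum toy_verdict { TOY_DROPPED, TOY_SENT };

/* Walk one forwarded packet through the model and count route lookups. */
enum toy_verdict
toy_forward(const struct toy_router *r, int *lookups)
{
	int rtable = 0;		/* assumption: the packet arrives in rtable 0 */

	assert(r->in_rule_rtable < NRTABLES && r->out_rule_rtable < NRTABLES);
	*lookups = 0;

	/* 1. "in" rules run first and may already select another rtable. */
	if (r->in_rule_rtable != -1)
		rtable = r->in_rule_rtable;

	/* 2. Forwarding lookup. Without a route the packet is dropped here. */
	(*lookups)++;
	if (!r->has_route[rtable])
		return (TOY_DROPPED);

	/* 3. "out" rules. Switching the rtable now costs a second lookup. */
	if (r->out_rule_rtable != -1 && r->out_rule_rtable != rtable) {
		rtable = r->out_rule_rtable;
		(*lookups)++;
		if (!r->has_route[rtable])
			return (TOY_DROPPED);
	}
	return (TOY_SENT);
}
```

With a route only in rtable 2, an out rule alone ends in TOY_DROPPED after
one lookup. Adding a dummy route in rtable 0 gets the packet sent after two
lookups, and an in rule gets it sent after a single lookup.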

-- 
:wq Claudio



Re: CPU usage of httpd+slowcgi

2020-07-27 Thread Claudio Jeker
On Mon, Jul 27, 2020 at 02:54:25PM +0100, Stuart Henderson wrote:
> Replying back on-list, I don't do support-type mails off-list, and other
> people know more about sparc64 hardware than me.
> 
> On 2020/07/26 22:38, Kihaguru Gathura wrote:
> > Hi Stuart,
> > 
> > For legacy, single-core CPU's such as Sparc64 V.
> > Would OpenBSD cope well with more number of CPU's or less as in previous 
> > case?
> > 
> > Example.
> > 
> > 2 CPU's (primepower 250) -> 4 CPU's (PrimePower 450) -> 8 CPU's(PrimePower 
> > 650) -> 16 CPU's
> > (PrimePower 850) -> 32 CPU's (Primepower 1500)
> 
> It depends on the workload. I'd have thought for most things the max
> really usable at the moment is probably somewhere in the region of 4-8
> cpu cores before kernel locking gets in the way too much.
> 
> FWIW sparc64 ports builds are now done on T4 and they're really fast.
> I think (but am not 100% sure) that this is carved into ldoms so the
> number of cores visible to each OpenBSD instance is limited (so
> contention between cores in the kernel is also limited).

The Primepower 250s are decent and IIRC you can get dual core SPARC64-VI
CPUs for those. They use a fair amount of power. The bigger irons are fun
but honestly the weight and power consumption are just not worth it.
A Primepower 250 is comparable to a fast V215. At least that is my
experience.

Better to look for an M3000 or M4000 or as suggested for a T4-1. Also make
sure you get good CPUs in them (esp. the M4000 comes with a few options).

-- 
:wq Claudio



Re: OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory

2020-06-30 Thread Claudio Jeker
On Tue, Jun 30, 2020 at 10:23:07AM +0200, Laurent CARON wrote:
> Hi,
> 
> 
> I'm running a pretty busy OpenBGPd router (~250 bgp sessions) with 4 IPv4
> and 4 IPv6 full views, plus a few IX sessions.
> 
> 
> # bgpctl show rib mem
> RDE memory statistics
>     820983 IPv4 unicast network entries using 31.3M of memory
>     203228 IPv6 unicast network entries using 10.9M of memory
>    1935802 rib entries using 118M of memory
>    6348318 prefix entries using 775M of memory
>     728103 BGP path attribute entries using 50.0M of memory
>    and holding 6348318 references
>     464633 BGP AS-PATH attribute entries using 22.3M of memory
>    and holding 728103 references
>  29055 entries for 371905 BGP communities using 8.6M of memory
>    and holding 6348318 references
>  18541 BGP attributes entries using 724K of memory
>    and holding 1618379 references
>  18540 BGP attributes using 145K of memory
>  0 as-set elements in 0 tables using 0B of memory
>     64 prefix-set elements using 3.0K of memory
> RIB using 1008M of memory
> Sets using 3.0K of memory
> 
> RDE hash statistics
>     path hash: size 131072, 728103 entries
>     min 0 max 19 avg/std-dev = 5.555/2.268
>     aspath hash: size 131072, 464633 entries
>     min 0 max 17 avg/std-dev = 3.545/1.853
>     comm hash: size 16384, 29055 entries
>     min 0 max 8 avg/std-dev = 1.773/0.925
>     attr hash: size 16384, 18541 entries
>     min 0 max 8 avg/std-dev = 1.132/0.848
> 
> 
> More often than not the BGPd daemon is crashing (although having plenty of
> RAM (80G) on the server) with: /var/log/messages
> 
> fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate
> memory
> 
> fatal in RDE: prefix_alloc: Cannot allocate memory
> 
> fatal in RDE: communities_copy: Cannot allocate memory
> 
> peer closed imsg connection
> main: Lost connection to RDE
> peer closed imsg connection
> SE: Lost connection to RDE
> peer closed imsg connection
> SE: Lost connection to RDE control
> Can't send message 57 to RDE, pipe closed
> last message repeated 12 times
> peer closed imsg connection
> SE: Lost connection to parent
> neighbor A.B.C.D (sas-v4-001): sending notification: Cease, administratively
> down
> 
> 
> :/etc/login.conf:
> 
> default:\
>     :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin
> /usr/local/sbin:\
>     :umask=022:\
>     :datasize-max=768M:\
>     :datasize-cur=768M:\
>     :maxproc-max=256:\
>     :maxproc-cur=128:\
>     :openfiles-max=1024:\
>     :openfiles-cur=512:\
>     :stacksize-cur=4M:\
>     :localcipher=blowfish,a:\
>     :tc=auth-defaults:\
>     :tc=auth-ftp-defaults:
> 
> daemon:\
>     :ignorenologin:\
>     :datasize=infinity:\
>     :maxproc=infinity:\
>     :openfiles-max=1024:\
>     :openfiles-cur=128:\
>     :stacksize-cur=8M:\
>     :localcipher=blowfish,a:\
>     :tc=default:
> 
> bgpd:\
>     :openfiles=512:\
>     :tc=daemon:
> 
> How can I pinpoint the source of the problem ?
> 

Can you monitor the VSZ and RSS of the RDE process with ps aux | grep bgpd
and/or top? What is the maximum you notice? Also, how do you start bgpd?
Make sure the limits from login.conf are actually applied (using rcctl
start should do that while doas bgpd would not).
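
One way to see which limits a process started in a given way ends up with
is to ask from inside such a process. The sketch below uses getrlimit(2)
with RLIMIT_DATA. get_data_limit() and print_limit() are illustrative
helpers, not part of bgpd. Resource limits are inherited by child
processes, so a small program calling them, started the same way as bgpd,
should report the datasize the RDE would get. Note that the RIB alone is
reported at 1008M above, which would not fit if the 768M datasize of the
default class were in effect.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/resource.h>

/* Fetch the data segment limits of the calling process. 0 on success. */
int
get_data_limit(rlim_t *cur, rlim_t *max)
{
	struct rlimit rl;

	assert(cur != NULL && max != NULL);
	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return (-1);
	*cur = rl.rlim_cur;	/* soft limit, the one malloc runs into */
	*max = rl.rlim_max;	/* hard limit */
	return (0);
}

/* Print one limit in MB, spelling out "infinity" like login.conf does. */
void
print_limit(const char *name, rlim_t v)
{
	if (v == RLIM_INFINITY)
		printf("%s: infinity\n", name);
	else
		printf("%s: %llu MB\n", name, (unsigned long long)(v >> 20));
}
```

Calling get_data_limit() and passing both values to print_limit() is
enough to compare the two ways of starting the daemon.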

Cheers
-- 
:wq Claudio



Re: Convert ffs1 to ffs2?

2020-05-20 Thread Claudio Jeker
On Wed, May 20, 2020 at 11:30:00AM +0300, Михаил Попов wrote:
> > "Possible" is irrelevant. Lots of things are _possible_ but not done.
> 
> Then only rsyncing?

There is also dump and restore.
 
> Why not adding at least one of a well tested journaled FS like XFS to OpenBSD?
> Is XFS too fat and complex to be secure?

There is a lot of talk but nobody ever did the work. It is hard work and
will take a lot of time. Also the license needs to be compatible with
OpenBSD.
 
> Does OpenBSD work well if system root is stored via NFS, say on a Linux ZFS?

OpenBSD can run diskless but not sure if it works well, that depends on
your workload and opinion.

-- 
:wq Claudio



Re: RT_TABLEID_MAX behavior changed?

2020-05-19 Thread Claudio Jeker
On Tue, May 19, 2020 at 11:21:13AM +0300, Bars Bars wrote:
> it seems i figured out why userland was 'broken' on recompiled kernel
> with changed RT_TABLEID_MAX.
> I dont think things are really broken, may be i dont them right way, please
> advice.
> 
> I could reproduce the issue (all steps are done exactly as in openbsd.org
> faq).
> I changed RT_TABLEID_MAX, recompiled the kernel, booted from it, and change
> didnt work on userland.
> Then i rebuilt userland, rebooted, all works now.
> Now if i apply some patch from errata, there is kernel re-linking done, and
> just after that kernel change doesnt work.
> Is it expected behavior? How can i fix it? syspatch -r doesnt help.
> 

You cannot syspatch a system with a custom kernel. You need to apply
the patches yourself. syspatch only works for non-modified kernels.
It should actually check this by making sure that the kernel signature is
correct, so I am not sure what exactly happened, but I guess you never
properly installed your kernel including the relink directory and so
syspatch relinked a default kernel over your modified one.

> пн, 18 мая 2020 г. в 13:31, Bars Bars :
> 
> > To be more convinient, when i said about that its limit became shorter its
> > relevant to sys/net/rtable.c struct dommp.
> >   struct dommp {
> >           unsigned int   limit;
> >           /*
> >            * Array to get the routing domain and loopback interface related to
> >            * a routing table. Format:
> >            *
> >            * 8 unused bits | 16 bits for loopback index | 8 bits for rdomain
> >            */
> >           unsigned int  *value;
> >   };
> >
> > In the past the maximum value was limited to u_int16_t in some deep places,
> > but nowadays there are only 8 bits allocated to it based on the struct + 8
> > unused bits which i hope i can safely add to the allocation.
> > I worried these unused bits are not guaranteed to users, so actually the
> > limit is 8 bits instead of 16 in earlier releases.
> >
> >
> >
> > пн, 18 мая 2020 г. в 11:51, Bars Bars :
> >
> >> Hi, Claudio
> >>
> >> I mean these in sys/socket.h
> >> /*
> >>  * Maximum number of alternate routing tables
> >>  */
> >> #define RT_TABLEID_MAX  8000
> >> #define RT_TABLEID_BITS 16
> >> #define RT_TABLEID_MASK 0x
> >>
> >>
> >> пн, 18 мая 2020 г. в 10:18, Claudio Jeker :
> >>
> >>> On Sun, May 17, 2020 at 10:16:28PM +0300, Bars Bars wrote:
> >>> > it seems the things work just when i rebuild userland completely (im
> >>> pretty
> >>> > sure i did it only with compiling kernel in past, correct me if i
> >>> wrong?).
> >>> >
> >>> > btw, questions for the Devs.
> >>> > Looking at the cvs history, i really worried that you do not expand
> >>> > rt_tableid_max limit for the years, moreover now its actually 8 bits
> >>> > shorter than it was before loopback to rdomain map. There are many
> >>> people
> >>> > with more than such a number of vpns, for example if they setup
> >>> centralized
> >>> > vpns setup, or border inter AS router role on the box.
> >>>
> >>> Sorry, your mail is incredibly imprecise and unclear. There is no
> >>> rt_tableid_max in OpenBSD, at least not in my tree (grep -r rt_tableid_max
> >>> returned nothing). So I have no idea what you are talking about and am
> >>> therefore not able to give you a better answer.
> >>>
> >>> > вс, 17 мая 2020 г., 10:25 Bars Bars :
> >>> >
> >>> > > Hey, guys.
> >>> > >
> >>> > > I always used the rt_tableid_max expanded to 16 bit range in past
> >>> releases
> >>> > > 5.x and after rebuilding the kernel it worked immediately.
> >>> > > But now I installed 6.6 on the new system, and after changing
> >>> > > rt_tableid_max (and new rt_tableid_mask and bits values too), my
> >>> whole
> >>> > > userland throw an rtable / rdomain too large error.
> >>> > > Is there behaviour change?
> >>> > > The only thing changed (as i know) it is news net/trable.c struct to
> >>> map
> >>> > > loopback to domain, where there is only 8 unused bits to which i can
> >>> expand
> >>> > > tableid value.
> >>> > >
> >>> > >
> >>>
> >>> --
> >>> :wq Claudio
> >>>
> >>

-- 
:wq Claudio



Re: RT_TABLEID_MAX behavior changed?

2020-05-18 Thread Claudio Jeker
On Sun, May 17, 2020 at 10:16:28PM +0300, Bars Bars wrote:
> it seems the things work just when i rebuild userland completely (im pretty
> sure i did it only with compiling kernel in past, correct me if i wrong?).
> 
> btw, questions for the Devs.
> Looking at the cvs history, i really worried that you do not expand
> rt_tableid_max limit for the years, moreover now its actually 8 bits
> shorter than it was before loopback to rdomain map. There are many people
> with more than such a number of vpns, for example if they setup centralized
> vpns setup, or border inter AS router role on the box.
 
Sorry, your mail is incredibly imprecise and unclear. There is no
rt_tableid_max in OpenBSD, at least not in my tree (grep -r rt_tableid_max
returned nothing). So I have no idea what you are talking about and am
therefore not able to give you a better answer.
 
> вс, 17 мая 2020 г., 10:25 Bars Bars :
> 
> > Hey, guys.
> >
> > I always used the rt_tableid_max expanded to 16 bit range in past releases
> > 5.x and after rebuilding the kernel it worked immediately.
> > But now I installed 6.6 on the new system, and after changing
> > rt_tableid_max (and new rt_tableid_mask and bits values too), my whole
> > userland throw an rtable / rdomain too large error.
> > Is there behaviour change?
> > The only thing changed (as i know) it is news net/trable.c struct to map
> > loopback to domain, where there is only 8 unused bits to which i can expand
> > tableid value.
> >
> >

-- 
:wq Claudio



Re: OSPF lsa_check issue

2020-05-06 Thread Claudio Jeker
On Wed, May 06, 2020 at 03:23:06PM +0100, Richard Chivers wrote:
> Hi,
> 
> Thanks so much for the diff, it appears to have resolved the issue.
> 
> We are now trying to establish whether we need the fix widely deployed or
> only on the box that originates with the large LSA updates, pushing it over
> the 1500mtu.
> 
> We are going to run some tests, but our expectation is that when the DR
> sends the message from the originating router on to its neighbors that they
> will then see the same issue.
> 
> Out of interest is there any way of just announcing a single network.
> 
> In this particular case the large LS-Update is caused because we have many
> interfaces, but these are all carp so will failover in one hit anyway. We
> have allocated 10.128.0.0/16 to this firewall so there are many networks,
> but anything in our network with a destination of 10.128.0.0/16 can end up
> here.
> 
> We tried something like *redistribute 10.128.0.0/16 
> depend on carp0*, but what that appears to do is limit advertisements to
> the subnets that fall within that range, so we still have a very large LSA
> update anyway.
> 
> Just wondering if there was any workaround, as it would just simplify
> processing etc.
> 
> It is probably a non issue anyway now, with the fix, but just interested if
> anyone has done anything similar.

Without the exact config it is hard to judge, but you are advertising a lot
of stub networks in the router LSA. Stub networks come from interface
statements that are passive or have no active peers. So to reduce the size
of the router LSA, one option is to remove some of the interfaces and cover
them with redistribute connected, which uses Type-5 LSAs instead.

-- 
:wq Claudio



Re: OSPF lsa_check issue

2020-05-06 Thread Claudio Jeker
On Wed, May 06, 2020 at 09:33:11AM +0100, Richard Chivers wrote:
> Hi,
> 
> Some progress has been made, we can now replicate this consistently and it
> appears that whenever a LS update exceeds the mtu (1500) we get this issue
> of lsa_check bad age.
> 
> When running with the diff Claudio sent we start getting a bunch of errors
> complaining about:
> 
> recv_ls_update: bad packet size, neighbor ID x.x.x.x
> lsa_check: bad packet size
> 
> We don't ever move to a state of FULL/DR or similar.
> 
> Does anyone have any suggestions? We are just starting to look at the wider
> code to see if we can comprehend what may be occurring, but it will
> likely be a steep learning curve :)
> 

Just realized that my diff was wrong since ibuf_reserve() would change the
write position of the buffer and so you end up with some empty space in
the buffer.

Here is a better diff. This is using ibuf_size to get the current write
position and then ibuf_seek() to write the age back into the right spot.
Using the position instead of the pointer has the benefit that a realloc()
in ibuf_add() will not result in the stale pointer to lsage that the
current code has.
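
The pointer versus offset point can be shown with a small stand-alone
sketch. struct gbuf, gbuf_add(), gbuf_seek() and gbuf_add_with_age() below
are hypothetical stand-ins written for this example. They only mimic the
idea of a growable buffer and are not the ibuf API used in the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, NOT the real struct ibuf. */
struct gbuf {
	unsigned char	*data;
	size_t		 len;	/* bytes written so far */
	size_t		 cap;	/* bytes allocated */
};

/* Append len bytes, growing the allocation if needed. 0 on success. */
int
gbuf_add(struct gbuf *b, const void *src, size_t len)
{
	if (b->len + len > b->cap) {
		size_t ncap = (b->len + len) * 2;
		unsigned char *p = realloc(b->data, ncap);

		if (p == NULL)
			return (-1);
		b->data = p;	/* pointers saved before this are now stale */
		b->cap = ncap;
	}
	memcpy(b->data + b->len, src, len);
	b->len += len;
	return (0);
}

/* Return a pointer to len bytes at offset off, NULL if out of range. */
void *
gbuf_seek(struct gbuf *b, size_t off, size_t len)
{
	if (off > b->len || len > b->len - off)
		return (NULL);
	return (b->data + off);
}

/*
 * Append a blob whose first two bytes are an age field, then overwrite
 * that field in network byte order. The field is located by offset, so
 * a realloc() inside gbuf_add() cannot invalidate it.
 */
int
gbuf_add_with_age(struct gbuf *b, const void *blob, size_t len, uint16_t age)
{
	size_t		 off = b->len;	/* remember position, not address */
	unsigned char	 be[2] = { (unsigned char)(age >> 8),
			    (unsigned char)(age & 0xff) };
	void		*p;

	assert(b != NULL && blob != NULL);
	if (len < sizeof(be) || gbuf_add(b, blob, len) == -1)
		return (-1);
	if ((p = gbuf_seek(b, off, sizeof(be))) == NULL)
		return (-1);
	memcpy(p, be, sizeof(be));
	return (0);
}
```

Had gbuf_add_with_age() saved b->data + b->len before calling gbuf_add(),
the first append into an empty buffer would already write through a
pointer that realloc() no longer guarantees to be valid.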

I have currently no ospf setup so my testing is limited.
-- 
:wq Claudio

Index: lsupdate.c
===
RCS file: /cvs/src/usr.sbin/ospfd/lsupdate.c,v
retrieving revision 1.47
diff -u -p -r1.47 lsupdate.c
--- lsupdate.c  19 Nov 2019 09:55:55 -  1.47
+++ lsupdate.c  6 May 2020 08:48:19 -
@@ -175,8 +175,8 @@ int
 add_ls_update(struct ibuf *buf, struct iface *iface, void *data, u_int16_t len,
     u_int16_t older)
 {
-	void		*lsage;
-	u_int16_t	 age;
+	size_t		ageoff;
+	u_int16_t	age;
 
 	if ((size_t)iface->mtu < sizeof(struct ip) + sizeof(struct ospf_hdr) +
 	    sizeof(u_int32_t) + ibuf_size(buf) + len + MD5_DIGEST_LENGTH) {
@@ -186,7 +186,7 @@ add_ls_update(struct ibuf *buf, struct i
 			return (0);
 	}
 
-	lsage = ibuf_reserve(buf, 0);
+	ageoff = ibuf_size(buf);
 	if (ibuf_add(buf, data, len)) {
 		log_warn("add_ls_update");
 		return (0);
@@ -198,7 +198,7 @@ add_ls_update(struct ibuf *buf, struct i
 	if ((age += older + iface->transmit_delay) >= MAX_AGE)
 		age = MAX_AGE;
 	age = htons(age);
-	memcpy(lsage, &age, sizeof(age));
+	memcpy(ibuf_seek(buf, ageoff, sizeof(age)), &age, sizeof(age));
 
 	return (1);
 }



Re: OSPF lsa_check issue

2020-05-05 Thread Claudio Jeker
On Tue, May 05, 2020 at 10:51:40AM +0200, Claudio Jeker wrote:
> On Tue, May 05, 2020 at 09:07:34AM +0100, Richard Chivers wrote:
> > After some more work this morning we have managed to extract the
> > information from tcpdump of the full LS-Update packet, we couldn't see it
> > on bsd, but running:
> > 
> > tcpdump -v -r ~/Downloads/ospf.pcap on osx did the trick.
> > 
> > What we are seeing is that a pair of firewalls are both sending updates
> > like this:
> > 
> > 07:16:09.346525 IP (tos 0xc0, ttl 1, id 47473, offset 0, flags [+], proto
> > OSPF (89), length 1500)
> > x.x.x.x > ospf-dsig.mcast.net: OSPFv2, LS-Update, length 1480 [len 1672]
> > Router-ID x.x.x.x, Backbone Area, Authentication Type: simple (1)
> > Simple text password: dslkfjld, 1 LSA
> >  LSA #1
> >  Advertising Router x.x.x.x, seq 0x806e, age 0s, length 1624
> >Router LSA (1), LSA-ID: x.x.x.x
> >Options: [External]
> >Router LSA Options: [ASBR]
> >  Stub Network: 10.128.32.128, Mask: 255.255.255.128
> > topology default (0), metric 10
> >  Stub Network: 10.128.9.0, Mask: 255.255.255.128
> > *{ another 50 or so networks here}*
> > 
> > Each time we get one of these updates the DR logs the lsa_check: bad age.
> > 
> > Another 5 or so seconds later the same LS-Update comes in with the same seq
> > number. This appears to continue indefinitely. Our only fix appears to be
> > restarting ospfd on the routers.
> > 
> > Does anyone have an idea what is going wrong here?
> > 
> > Something we have considered being a problem is that we do have many
> > interfaces, we have 90 or so, so the LS-Update packets are quite large and
> > do get fragmented, as we are using a 1500mtu.
> > 
> > The fact that ospfd sees the age and complains though makes us think this
> > is not a problem.
> > 
> 
> Looking at the tcpdump output there is something strange with the various
> reported length fields. Is it possible to get the raw packet dumps?
> 

Can you try the following diff and see if it fixes the issue?

-- 
:wq Claudio

Index: lsupdate.c
===
RCS file: /cvs/src/usr.sbin/ospfd/lsupdate.c,v
retrieving revision 1.47
diff -u -p -r1.47 lsupdate.c
--- lsupdate.c  19 Nov 2019 09:55:55 -  1.47
+++ lsupdate.c  5 May 2020 09:20:50 -
@@ -186,7 +186,7 @@ add_ls_update(struct ibuf *buf, struct i
return (0);
}
 
-   lsage = ibuf_reserve(buf, 0);
+   lsage = ibuf_reserve(buf, len);
if (ibuf_add(buf, data, len)) {
log_warn("add_ls_update");
return (0);



Re: OSPF lsa_check issue

2020-05-05 Thread Claudio Jeker
On Tue, May 05, 2020 at 09:07:34AM +0100, Richard Chivers wrote:
> After some more work this morning we have managed to extract the
> information from tcpdump of the full LS-Update packet, we couldn't see it
> on bsd, but running:
> 
> tcpdump -v -r ~/Downloads/ospf.pcap on osx did the trick.
> 
> What we are seeing is that a pair of firewalls are both sending updates
> like this:
> 
> 07:16:09.346525 IP (tos 0xc0, ttl 1, id 47473, offset 0, flags [+], proto
> OSPF (89), length 1500)
> x.x.x.x > ospf-dsig.mcast.net: OSPFv2, LS-Update, length 1480 [len 1672]
> Router-ID x.x.x.x, Backbone Area, Authentication Type: simple (1)
> Simple text password: dslkfjld, 1 LSA
>  LSA #1
>  Advertising Router x.x.x.x, seq 0x806e, age 0s, length 1624
>Router LSA (1), LSA-ID: x.x.x.x
>Options: [External]
>Router LSA Options: [ASBR]
>  Stub Network: 10.128.32.128, Mask: 255.255.255.128
> topology default (0), metric 10
>  Stub Network: 10.128.9.0, Mask: 255.255.255.128
> *{ another 50 or so networks here}*
> 
> Each time we get one of these updates the DR logs the lsa_check: bad age.
> 
> Another 5 or so seconds later the same LS-Update comes in with the same seq
> number. This appears to continue indefinitely. Our only fix appears to be
> restarting ospfd on the routers.
> 
> Does anyone have an idea what is going wrong here?
> 
> Something we have considered being a problem is that we do have many
> interfaces, we have 90 or so, so the LS-Update packets are quite large and
> do get fragmented, as we are using a 1500mtu.
> 
> The fact that ospfd sees the age and complains though makes us think this
> is not a problem.
> 

Looking at the tcpdump output there is something strange with the various
reported length fields. Is it possible to get the raw packet dumps?

-- 
:wq Claudio



Re: bad AGGREGATOR, AS 0 not allowed

2020-04-29 Thread Claudio Jeker
On Wed, Apr 29, 2020 at 05:45:30PM +0200, Marko Cupać wrote:
> Hi,
> 
> on 6.6-RELEASE amd64, (sys)patched up to 019_smtpd_exec, I am noticing
> these:
> 
> Apr 29 17:23:33 bgp1 bgpd[42338]: neighbor IP.ADD.RE.SS (desc): bad
> AGGREGATOR, AS 0 not allowed, attribute discarded
> 
> My bgpd.conf is almost default, announcing my AS to two upstream peers.
> 
> I wrote to my peer, they said they are not sending me AS 0, and to clear my
> session.
> 
> After 'bgpctl neighbor desc clear' I'm still getting these messages.
> 
> Is this related to:
> [https://marc.info/?l=openbsd-tech&m=156510627921885&w=2]
> 
> Can I safely disregard this, and wait for next release for these messages to
> disappear?

At the moment this warning has not been removed, so you will see it even in
the next release. It does indeed have to do with the fact that AS 0 is not
allowed even in the AGGREGATOR attribute. Now your neighbor is sending you
such an attribute which indicates that their routers do not handle RFC7607
correctly. At the moment there are a handful of prefixes in the DFZ that
are sent with an AGGREGATOR attribute that has AS 0 and this is what
triggers it. You normally get the error on the initial sync.
I wanted to make the error better and include ASPATH / prefix but at the
time this problem happens this information is not available. Time to look
at this again so that the finger pointing is more helpful.
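
To illustrate the check behind that warning, here is a minimal C sketch. It
is not bgpd's code: the function name and return convention are made up for
this example, and it assumes the AGGREGATOR attribute body is either 6 bytes
(2-byte AS plus IPv4 address) or 8 bytes (4-byte AS plus IPv4 address).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from bgpd: extract the AS number from
 * an AGGREGATOR attribute body and reject AS 0 as required by RFC 7607.
 * Returns 0 and stores the AS on success, -1 if the attribute should be
 * discarded (unexpected length or AS 0).
 */
static int
aggregator_get_as(const uint8_t *data, size_t len, uint32_t *as)
{
	assert(data != NULL && as != NULL);

	if (len == 6)		/* 2-byte AS followed by an IPv4 address */
		*as = ((uint32_t)data[0] << 8) | data[1];
	else if (len == 8)	/* 4-byte AS followed by an IPv4 address */
		*as = ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16) |
		    ((uint32_t)data[2] << 8) | data[3];
	else
		return -1;

	if (*as == 0)
		return -1;	/* AS 0 not allowed, discard the attribute */
	return 0;
}
```

Only the attribute is dropped in this sketch; the prefix itself would still
be processed, which matches the "attribute discarded" wording in the log.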

-- 
:wq Claudio



Re: Ospfd default route query

2020-04-27 Thread Claudio Jeker
On Mon, Apr 27, 2020 at 07:26:08PM +0100, Richard Chivers wrote:
> Hi,
> 
> That makes a lot of sense thanks, and appears to have solved the problem,
> we had a route added through our loopback interface in production"
> "!/sbin/route add -reject default 127.0.0.1"
> 
> Is that the best/general practise in general?

I would use a -blackhole route (no need to send out ICMP messages) but
yes, that is what I normally use in such a case (at least for the DFZ).
 
> Cheers
> 
> Richard
> 
> On Mon, Apr 27, 2020 at 8:25 AM Claudio Jeker 
> wrote:
> 
> > On Sun, Apr 26, 2020 at 08:44:42PM +0100, Richard Chivers wrote:
> > > Not sure how I missed the clear information in the man page...
> > >
> > > "If set to default, a default route pointing to this router will be
> > > announced over OSPF"
> > >
> > > It seems I am just having an issue and it should work as I expected.
> > >
> > > I will do some more diagnosis in the morning...
> > >
> >
> > I think the man page is not optimal here. ospfd(8) and ospf6d(8) will only
> > redistribute networks that are in the FIB. So in case of redistribute
> > default the router needs to have a default route 0/0 or ::/0 in the
> > routing table. Also that route's priority needs to be less than 32
> > to be picked up.
> >
> > This is different from bgpd where the network statements and export
> > default-route statement work even if there is no matching route in the
> > FIB.
> >
> > > On Sun, 26 Apr 2020, 17:09 Richard Chivers, 
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > Hope someone can help, I am having a strange issue and can't seem to
> > > > isolate the problem.
> > > >
> > > > We have "redistribute default" set globally on our bgp/ibgp speakers
> > > > in the ospfd.conf. The bsd boxes are all 6.6.
> > > >
> > > > These routers are connected via ibgp to some other routers and have
> > > > external bgp sessions taking at present a couple of basic network
> > > > announcements from their egbp peers. e.g. 2.2.2.0/24 ( we have faked
> > our
> > > > transit provider)
> > > >
> > > > Connected to these routers we have a pair of firewalls, which
> > previously
> > > > received a default route from the bgp/ibgp speakers.
> > > >
> > > > I am trying to understand exactly what the redistribute default in the
> > > > ospfd.conf does. I assume it is saying if i have a static default
> > route or
> > > > another default route from an upstream then tell other routers about
> > it? Or
> > > > is it saying tell others to use me as a default route. I can't seem to
> > find
> > > > anything specific in the docs to clarify this, and would like to
> > understand
> > > > it clearly if pos.
> > > >
> > > > In our case our previous configuration on 5.8 and this configuration
> > has a
> > > > static route on the bgp speakers of 0.0.0.0/24 -> 127.0.0.1.
> > > >
> > > > If I do a ospfctl sh rib or ospfctl sh data on the firewalls i just
> > don't
> > > > see any default route being provided by the bgp speakers.
> > > >
> > > > Hope this makes sense. I am sure I am missing something obvious...
> > > >
> > > > Effectively I want the bgp speakers to announce themselves as the
> > default
> > > > route for their neighbor firewalls over ospf.
> > > >
> > > > Thanks
> > > >
> >
> > --
> > :wq Claudio
> >

-- 
:wq Claudio



Re: Ospfd default route query

2020-04-27 Thread Claudio Jeker
On Sun, Apr 26, 2020 at 08:44:42PM +0100, Richard Chivers wrote:
> Not sure how I missed the clear information in the man page...
> 
> "If set to default, a default route pointing to this router will be
> announced over OSPF"
> 
> It seems I am just having an issue and it should work as I expected.
> 
> I will do some more diagnosis in the morning...
> 

I think the man page is not optimal here. ospfd(8) and ospf6d(8) will only
redistribute networks that are in the FIB. So in case of redistribute
default the router needs to have a default route 0/0 or ::/0 in the
routing table. Also that route's priority needs to be less than 32
to be picked up. 
 
This is different from bgpd where the network statements and export
default-route statement work even if there is no matching route in the
FIB.
 
> On Sun, 26 Apr 2020, 17:09 Richard Chivers,  wrote:
> 
> > Hi,
> >
> > Hope someone can help, I am having a strange issue and can't seem to
> > isolate the problem.
> >
> > We have "redistribute default" set globally on our bgp/ibgp speakers
> > in the ospfd.conf. The bsd boxes are all 6.6.
> >
> > These routers are connected via ibgp to some other routers and have
> > external bgp sessions taking at present a couple of basic network
> > announcements from their egbp peers. e.g. 2.2.2.0/24 ( we have faked our
> > transit provider)
> >
> > Connected to these routers we have a pair of firewalls, which previously
> > received a default route from the bgp/ibgp speakers.
> >
> > I am trying to understand exactly what the redistribute default in the
> > ospfd.conf does. I assume it is saying if i have a static default route or
> > another default route from an upstream then tell other routers about it? Or
> > is it saying tell others to use me as a default route. I can't seem to find
> > anything specific in the docs to clarify this, and would like to understand
> > it clearly if pos.
> >
> > In our case our previous configuration on 5.8 and this configuration has a
> > static route on the bgp speakers of 0.0.0.0/24 -> 127.0.0.1.
> >
> > If I do a ospfctl sh rib or ospfctl sh data on the firewalls i just don't
> > see any default route being provided by the bgp speakers.
> >
> > Hope this makes sense. I am sure I am missing something obvious...
> >
> > Effectively I want the bgp speakers to announce themselves as the default
> > route for their neighbor firewalls over ospf.
> >
> > Thanks
> >

-- 
:wq Claudio



Re: socket I/O on openbsd

2020-04-22 Thread Claudio Jeker
On Tue, Apr 21, 2020 at 10:48:46PM -0300, Gustavo Rios wrote:
> Dear gentleman,
> 
> i have the an ANSI C code that do the following:
> 
> 0. open a socket
> 1. write data to the socket
> 2. close the writing end of the socket
> 3. read data from the socket
> 4. close the read end of the socket
> 
> The the step number 4 returns an error, why ?
> 
> Here it is (Only the relevant part of the code )
> 
> if (!r) r = apx_connect(s, );
> if (!r) r = pmp_set(, 1ul, );
> if (!r) r = pmpsend(s, );
> if (!r) r = apx_shutdown(s, shut_wr);
> if (!r) r = pmprecv(, s, );
> if (!r) r = apx_shutdown(s, shut_rd);
> 

This is not helpful. What errno is returned? What kind of socket is
used? There are many more questions...
The best way to report this is to use ktrace and show its output.
From that we can see which syscalls are issued and if there is indeed
an error on shutdown().
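
As a starting point for such a report, a minimal sketch of a wrapper that
makes the errno visible. The name checked_shutdown() is made up for this
example and has nothing to do with the apx_shutdown() in the quoted code,
whose implementation is not shown.

```c
#include <sys/socket.h>

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Call shutdown(2) and return 0 on success or the errno value on
 * failure, printing the reason so it can be included in a report.
 */
static int
checked_shutdown(int fd, int how)
{
	assert(how == SHUT_RD || how == SHUT_WR || how == SHUT_RDWR);

	if (shutdown(fd, how) == -1) {
		int e = errno;	/* save it, fprintf may change errno */

		fprintf(stderr, "shutdown(%d, %d): %s\n", fd, how,
		    strerror(e));
		return e;
	}
	return 0;
}
```

Running the program under ktrace(1) and reading the result with kdump(1)
shows the same information at the syscall level without changing the code.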

-- 
:wq Claudio



Re: BGPD announce deprecation query

2020-04-19 Thread Claudio Jeker
On Sun, Apr 19, 2020 at 08:07:48AM +0100, Richard Chivers wrote:
> Hi,
> 
> Just been building a copy of our production system in vagrant to test
> upgrading to the latest version, in order to resolve an issue we were
> having.
> 
> In our current config we have:
> 
> group "core" {
> local-address $localaddr
> remote-as xx
> announce all
> neighbor x.x.x.x {
> descr "router-a"
> }
> neighbor x.x.x.x {
> descr "router-b"
> }
> }
> 
> From the upgrade guide it says: In OpenBSD 6.4, the announce keyword was
> deprecated in bgpd.conf(5). It has now been removed and must be replaced
> with export.
> 
> We also have another group with announce none
> 
> Is it fair to suggest that removing the announce all will be the same as
> not having it in >= 6.4, and that we replace announce none with export none.
> 
> Probably a stupid question, but I only touch BGP occasionally, and was just
> hoping to understand in more detail.
> 
> The group core is our own internal bgp speakers, each of these also have
> transit connections too.
> 
> All our config is templated using ansible, so we can easily adjust the
> config based on the actual version.
> 
> Probably worth saying we are running on 6.6 with patches applied, in the
> test environment.

Yes, you can just remove announce all from your config. I guess you
already have the needed input and output filters in place to ensure only
the right thing is accepted and announced. Actually since the core group
is ibgp even in the old config announce all is not needed since that was
the default for ibgp sessions.

announce none can just be replaced with export none. The result is the
same and no prefix will be announced to these peers even if the filters
would allow them.

As mentioned the important change was that the filter switched from a
default allow rule to a default deny rule both for incoming and outgoing
filters. So you need to check your ruleset and maybe add some additional
filters. Something like
allow from ibgp
allow to ibgp
may do the trick.

-- 
:wq Claudio



Re: MultiPath / ADD_PATH for bgpd

2020-04-16 Thread Claudio Jeker
On Wed, Apr 15, 2020 at 08:16:14PM +0100, Richard Chivers wrote:
> Hi,
> 
> Just wondering if anyone can help.
> 
> I saw back in late 2018 that there were some initial plans for ADD_PATH and
> Multipath in bgpd, it was in a list on a slide right after the portable
> version. https://youtu.be/4gOoPxGKKjA?t=1500
> 
> Does anyone know if there are still plans in this area, or if there has
> been any progress, we are really interested to explore using this in a
> project we are working on, and just keen to understand if it may be coming?
> 

The plan still holds but the timeline got a bit mixed up. Unless someone
steps up ADD_PATH will not show up in the 6.8 release but probably in 6.9.

-- 
:wq Claudio



Re: OSPF seems to stops processing updates

2020-04-13 Thread Claudio Jeker
On Mon, Apr 13, 2020 at 02:08:31PM +0200, Remi Locherer wrote:
> On Mon, Apr 13, 2020 at 12:05:10PM +0100, Richard Chivers wrote:
> > Thanks. Please see my comments below.
> > 
> > On Mon, 13 Apr 2020, 10:18 Remi Locherer,  wrote:
> > 
> > > Hi Richard,
> > >
> > > On Mon, Apr 13, 2020 at 08:38:31AM +0100, Richard Chivers wrote:
> > > > We have been having a strange issue, whereby OSPF stops updating
> > > properly.
> > > >
> > > > We can see an entry for an ip route in the database but it is not in the
> > > > kernel routing table, and when it is the DR, other routers then do not
> > > have
> > > > the route at all.
> > > >
> > > > We are seeing this across multiple boxes. We have 10+ ospf speakers, and
> > > > seem to see the issue at different times.
> > > >
> > > > The problem starts with:
> > > >
> > > > ospfd[6960]: recv_db_description: neighbor ID x.x.x.x: seq num mismatch,
> > > > bad flags
> > >
> > > The neighbor sent a db desc with the master flag set differently than what
> > > this ospfd instance recorded before for that particular neighbor.
> > >
> > > See 2nd last item on page 100 of RFC 2328:
> > > https://tools.ietf.org/html/rfc2328#page-100
> > 
> > 
> > Thanks, should the routers just recover then from this scenario even if it
> > was happening due to lost packets, CPU pause etc.
> 
> I think so. But it may take quite a while. It might also be an bug in ospfd
> or in another implementation.

Since these issues happen with the 5.8 and 6.4 ospfd I would suggest updating
to at least 6.6 (especially the 5.8 boxes). IIRC there was some issue with ospfd
neighbor selection that caused trouble when sessions flapped. This was
fixed some time ago but I doubt 5.8 has that fix in.
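
For reference, the "bad flags" condition quoted above can be sketched in C
as follows. This follows the Exchange state rules of RFC 2328 section 10.6
and is not ospfd's code; the function name and the nbr_is_master argument
are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Database Description flag bits, RFC 2328 appendix A.3.3. */
#define DBD_MS	0x01	/* sender claims to be master */
#define DBD_M	0x02	/* more packets follow */
#define DBD_I	0x04	/* initial packet */

/*
 * Flag check for a neighbor in Exchange state. nbr_is_master is what
 * was recorded for this neighbor during ExStart. Returns 1 if the
 * packet must trigger SeqNumberMismatch, 0 if the flags are fine.
 */
static int
dbd_flags_mismatch(int nbr_is_master, uint8_t flags)
{
	int ms_set = (flags & DBD_MS) != 0;

	assert(nbr_is_master == 0 || nbr_is_master == 1);

	if (ms_set != nbr_is_master)
		return 1;	/* master/slave role changed unexpectedly */
	if (flags & DBD_I)
		return 1;	/* neighbor restarted the exchange */
	return 0;
}
```

After a mismatch the adjacency falls back to ExStart and the database
exchange starts over, which is why recovery can take a while.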

-- 
:wq Claudio



Re: BGP and carp slaves

2020-04-02 Thread Claudio Jeker
On Thu, Apr 02, 2020 at 11:34:21AM +0200, Luca Bodini wrote:
> Hi folks,
> 
> I’m just having a strange issue using OpenBSD 6.6 and BGP .
> I have two OpenBSD firewalls with a carp configuration, let’s suppose the 
> shared IP is 10.10.10.100, and I am able to announce 10.10.10.100/32 via BGP.
> Now, here is my /etc/bgpd.conf configuration:
> 
> # define our own ASN as a macro
> ASN="65000"
> rde med compare always
> 
> # global configuration
> AS $ASN
> router-id 172.10.10.3 
> 
> # list of networks that may be originated by our ASN
> prefix-set mynetworks { \
> 10.10.10.100/32\
> }
> 
> # Generate routes for the networks our ASN will originate.
> # The communities (read 'tags') are later used to match on what
> # is announced to EBGP neighbors
> network prefix-set mynetworks set { community $ASN:1 med 10 } 
> 
> # upstream providers
> group "upstreams" {
> remote-as 20746
> neighbor 172.10.10.1  {
> descr "provider router 01"
> }
> neighbor 172.10.10.2 {
> descr "provider router 02"
> }
> }
> 
> ## rules section
> allow from group upstreams prefix 0.0.0.0/0
> 
> # IBGP: allow all updates to and from our IBGP neighbors
> allow from ibgp
> allow to ibgp
> allow to ebgp prefix-set mynetworks 
> 
> The problem I’m facing is due to (i guess) provider router misconfiguration, 
> in fact, routers are forwarding traffic to carp slave and unexpectedly 
> everything is working fine: firewall is accepting connections and forwarding 
> traffic, for example if I try to SSH:
> ~# ssh -l root 10.10.10.100
> [root@fw-02 root]# ifconfig | grep vhid
> carp: BACKUP carpdev vlan100 vhid 10 advbase 1 advskew 10 
> 
> I’ve asked provider to change BGP configuration and everything now is set 
> up correctly, now, the question is:
> Is the carp slave accepting and forwarding connections by design or is it an 
> "unintended" feature?
> 

By default bgpd will just announce mynetworks without checking if
something is up or not.
You may have more luck with 'network inet connected' or even better use a
rtlabel. In that case bgpd should respect the status of the route.

I normally use carp on both sides and use 'network X/Y set nexthop $CARPIP'
Where $CARPIP is the external carp IP shared between the two routers. In
this case both systems announce the same network with the same nexthop
(the carp IP) to the next routers and so no rerouting happens if the
master dies. This only works if the systems share a LAN segment for ebgp
sessions.

-- 
:wq Claudio



Re: routing with DMZ between internal and external firewall

2020-03-16 Thread Claudio Jeker
On Mon, Mar 16, 2020 at 09:49:30AM +0100, pebwindkraft wrote:
> Hi,
> 
> I have a question concerning static routes and default gateways for a DMZ
> setup, with internal and external firewall.
> A DNS in the DMZ shall be used from internal machines, and later a http
> proxy from internal and external machines.
> The setup is within a network of a bigger data centre with it's own edge
> router. I cannot change anything on this edge router.
> I am using OpenBSD 6.6, and ip forwarding is activated on both firewalls.
> Here an ASCII pic (for better viewing also here:
> https://ln2.sync.com/dl/9da92f730/wrzi9rse-xh9sqzed-cst55auv-y39rkrwj):
> 
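> |--------|   |---------|       |---------|   /-------------\
> | int_pc |---| int_fw  |-------| ext_fw  |---| Data Center |---> Internet
> |--------|   |em0   em1|   |   |em0   em1|   | Edge Router |
>              |---------|   |   |---------|   \-------------/
>                            |
>                      |------------|
>                      | DNS & http |
>                      |------------|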
> 
> Setup of default routes:
>   int_pc  -> IP address of em0 on int_fw
>   int_fw  -> IP address of em0 on ext_fw
>   DNS -> IP address of em0 on ext_fw
>   ext_fw  -> IP address of external interface
> 
> Without any firewall rules (pfctl -d), I observe:
> 
>  1.) I cannot ping from int_pc to DNS, and vice versa.
>  2.) I cannot ping from int_pc to em0 on ext_fw
> 
> I can observe with tcpdump, that ping echo request leaves int_pc, goes
> through int_fw and reaches the network card of DNS or em0 on ext_fw. As the
> default route of DNS is pointing to ext_fw, the ping echo reply is sent to
> ext_fw, which doesn't know what to do with the IP address of int_pc, and
> ignores the package. I get this.
> So I can set a static route on the DNS or on the external firewall, like
> this
> 
>   route add -inet {network of int_pc} {IP address of em1 on int_fw}
> 
> and then pinging back and forth works.
> But setting static routes on all DMZ machines and ext_fw seems doesn't seem
> right to me(?).
> 
> What would be the correct design?
> Can I use "only" the ext_fw with a static route, so that packages from DNS
> would travel twice through DMZ net (from DNS to ext_fw, and then from ext_fw
> via int_fw back to int_pc)?
> 
> The information I found on misc@ and internet is usually talking about "home
> router" with NAT and three network cards, where one leg supplies the DMZ...
> Mine is different, and I think I do not need NAT here?
> 

You need to add routes for your internal network on ext_fw and on the DNS
box. They need to know that those networks are reachable via int_fw. These
routes are more specific and will make sure that the traffic has a path
back to int_pc.

-- 
:wq Claudio



Re: size of size_t (diff angle)

2020-02-27 Thread Claudio Jeker
On Thu, Feb 27, 2020 at 02:07:36PM +0100, zeurk...@volny.cz wrote:
> Haai,
> 
> "Claudio Jeker"  wrote:
> > This has not much to do with OpenBSD.
> 
> On the contrary: these issues touch the fundaments of UNIX programming.
> 
> > As for OpenBSD, it only runs on two types of machines: ILP32 and I32LP64.
> > Any other type of machine that is not covered by these two types will
> > not run OpenBSD.
> 
> Oh yes, this is not NetBSD, me's well aware... And yet, metries hard to
> satisfy basic portability when feasible. This is consistent with OpenBSD
> practice, at least if the manual pages are anything to go by.
> 
> > In both cases size_t is defined as unsigned long which is the same as
> > uintptr_t and the same size as pointer.
> 
> Of course, in practice that's the case. You'll really get no argument
> from me there. 
> 
> > Now whether SIZE_MAX is the highest address is a different thing.
> > On OpenBSD 0..SIZE_MAX will cover the address space (in most cases
> > it actually covers more than what is possible). The highest valid
> > address is in most cases less than SIZE_MAX.
> 
> Yes, the {,in}famous halfway split... for calculations involving
> already valid {addresse,offset,size}s that hardly matters, however.
> 
> What *does* matter, is the potential lack of equivalence of the types.
> Which, as you pointed out, does not affect OpenBSD (at this time), yet
> might be a portability issue. Hence me raising it.

The time of non ILP32 or I32LP64 UNIX systems is over (at least when it
comes to userland processes). If you want a UNIX-like OS where code will
work then those are your only options. The ecosystem is not able to handle
anything else anymore. All the other discussions are theoretical and will
not result in anything that is usable to run UNIX software.

-- 
:wq Claudio



Re: size of size_t (diff angle)

2020-02-27 Thread Claudio Jeker
This has not much to do with OpenBSD.
As for OpenBSD, it only runs on two types of machines: ILP32 and I32LP64.
Any other type of machine that is not covered by these two types will
not run OpenBSD.

In both cases size_t is defined as unsigned long which is the same as
uintptr_t and the same size as pointer.

Now whether SIZE_MAX is the highest address is a different thing.
On OpenBSD 0..SIZE_MAX will cover the address space (in most cases
it actually covers more than what is possible). The highest valid
address is in most cases less than SIZE_MAX.
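
A minimal sketch of how a program could check these assumptions at run
time. The function names are made up for this example; the check only
compares widths and says nothing about which exact type size_t is defined
as.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if the platform looks like ILP32 or I32LP64 in the sense used
 * above: int is 32 bits, and long, pointers, size_t and uintptr_t all
 * have the same width.
 */
static int
is_ilp32_or_lp64(void)
{
	return sizeof(int) * CHAR_BIT == 32 &&
	    sizeof(long) == sizeof(void *) &&
	    sizeof(size_t) == sizeof(void *) &&
	    sizeof(uintptr_t) == sizeof(void *);
}

/* Abort early on anything else instead of failing in obscure ways. */
static void
require_supported_model(void)
{
	assert(is_ilp32_or_lp64());
}
```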

-- 
:wq Claudio


On Thu, Feb 27, 2020 at 01:36:39AM +0100, zeurk...@volny.cz wrote:
> Haai,
> 
> "Marc Espie"  wrote:
> >>> You're looking at the wrong type. size_t is very good for what it does.
> >>
> >> Yes; meproblem is with the 'what it does' part.
> >
> > It represents memory sizes. It works on anything with a sane
> > memory model.
> 
> The way meunderstands it, it's just an offset, plain and simple. Which
> on a sane machine is indeed of the same type as an address[0].
> 
> Unfortunately, C99 does not appear to reflect that. Now, to what degree
> (if!) we should respect C99, or take it much seriously at all, is
> another matter...
> 
> >>> Try uintptr_t
> >>
> >> Are you proposing a change to struct iovec?
> >
> > Why should I ? readv works with sizes, so size_t is adequate.
> 
> Yes, why should you? That was me implied question. You told me to use
> uintptr_t, but that will hardly solve things on the exact problem mewas
> working on (medidn't specify what it was, and you didn't ask), unless we
> change struct iovec (cue an 'over my dead body' response from theo, and
> with respect to compat, he'd be damn right).
> 
> > You were mentionning caddr_t earlier. intptr_t and uintptr_t are
> > the adequate types for working with addresses. size_t is the adequate
> > family for working with sizes.
> 
> Me's found that such statements emerge from a shallow understanding of
> the nature of C. C doesn't know sizes: indeed, it barely knows indices
> and offsets. If sizeof() would have been defined to return the index
> of the final byte, instead of the count of bytes, then the C99
> definition for size_t would've been pre-empted.
> 
> > POSIX kind-of implies readv, which means that both realms tend of
> > mesh.
> 
> Yes, that's an obvious layer error. C as a language should not be
> confused with libc, or UNIX in general. In fact, C and UNIX appear to
> only have two concrete things in common: ASCII, and the byte as the
> fundamental type. That's it.
> 
> > If you're on something where they don't, you're fucked.
> 
> Me's never been the type to play it safe. The path forward is not blind
> obedience to the ravings of committees, especially those that pretend to
> set a universal standard. 
> 
> > Good luck.
> 
> Thanks. Me's decided to ditch the {read,write}v compat wrappers and take
> the performance hit. It's all preperation for a real OS, after all:
> me'll do it right in there.
> 
> > What are you doing asking questions on an OpenBSD list, btw ?
> 
> nnx runs on OpenBSD. You must be confusing it with NetNIX, which is the
> OS that will eventually emerge.
> 
> NetNIX will not have size_t.
> 
> Baai,
> 
> --zeurkous.
> 
> [0] Except, of course, it's an 'offset + 1'. Oops. But that's the least
> of the problems if SIZE_MAX is not guaranteed to be the highest
> address...
> 
> -- 
> Friggin' Machines!
> 



Re: ahci issue corebooted X220 does not recognise usb or stata

2020-02-21 Thread Claudio Jeker
On Wed, Feb 19, 2020 at 02:34:40PM +0100, Thomas Meulendijks wrote:
> Hi OpenBSD Mailing list,
> 
> I am trying to install Openbsd via the install66.fs on a Thinkpad X220 
> [amd64] with coreboot.
> I have the problem that it does not recognize any USB or SATA device may it 
> be storage or peripherals like a keyboard, except for the boot USB.
> I tried with external USB storage, multiple different internal SSD's, 
> multiple USB keyboards, but sysctl does not show anything and dmesg does not 
> give a message when plugged in or out.
> I also tried with the grub and seabios payloads but it did not make a 
> difference.
> Coreboot and payloads are compiled at master and against the latest stable 
> version but it did not make a difference.
> coreboot Master version is 4.11-1189.
> 
> When I look at dmesg I see ahci failed, I know you guys will need to see my 
> dmesg but since I can't save it to a drive [read problem above]
> and the installation fs only has ftp to communicate to the web as far as I 
> can see and I don't know how to set up a ftp server, I am at a loss of how to 
> get it out.
> Maybe I am missing something?

The Thinkpad X220 works with its original BIOS so the problem is with your
coreboot and not with OpenBSD. coreboot fails to properly report something
(most probably some of the ACPI or SMBIOS information is wrong) and
because of that every PCI access seems to fail. Most probably pci0 is not
set up correctly.

We can't help you here, you replaced your BIOS with something that is not
quite right. You should reach out to coreboot.
 
> A part of dmesg I think may be helpful that I typed over:
> 
> 
> em0 at pci0 dev 25 function 0 "Intel 82579LM" rev 0x05: msiem0: The EEPROM 
> Checksum Is Not Valid
> em0: Unable to initialize the hardware
> ehci0 at pci0 dev 26 function 0 "Intel 6 Series USB" rev 0x05: apci 2 int 19
> ehci0: reset timeout
> ehci0 init failed, error=13
> 
> ehci1 at pci0 dev 29 function 0 "Intel 6 Series USB" rev 0x05: apci 2 int 18
> ehci1: reset timeout
> ehci1 init failed, error=13
> "Intel QM67 LPC" rev 0x05 at pci0 dev 31 function 0 not configured
> ahci0 at pci0 dev 31 function 2 "Intel 6 Series AHCI" rev 0x05: msi, unable 
> to reset controller
> "Intel 6 Series SMBus" rev 0x05 at pci0 dev 31 function 3 not configured
> "Intel 6 Series Thermal" rev 0x05 at pci0 dev 31 function 6 not configured
> 
> 
> I hope this is enough info and would greatly appreciate it if anyone could 
> help me out!
> 
> Greetings,
> 
> Thomas
> 

-- 
:wq Claudio



Re: Fwd: tap(4) performance tuning on (amd64)

2020-01-21 Thread Claudio Jeker
On Tue, Jan 21, 2020 at 09:17:20PM +, Tom Smyth wrote:
> in testing tap(4)  performance on the same box with the following config
> using claudios userlandbridge (tbridge)  in between two tap interfaces
> each tap was also added their own standard bridge(4) along with 1 physical
> interface.
> 
> iperf3client--ix0--bridge0--tap0--tbridge--tap1--bridge1--ix1---iperf3svr
> 
> with a 1socket 2 core system that gives 3Gb/s we got the following
> performance
> 
> tbridge -t gave 557Mb/s TCP throughput
> 
> btw (tbridge -t did not stop after  using ^C  or kill
> but did respond to kill -s SIGKILL )

I forgot to mark the signals to interrupt read instead of restart. So you
need another packet to arrive to exit the loop.
You can add
siginterrupt(SIGTERM, 1);
siginterrupt(SIGINT, 1);
siginterrupt(SIGHUP, 1);
before the signal() calls to install the signal handler and then ^C will
work.

> tbridge -s gave 455Mb/s TCP throughput
> 
> tbridge -p gave 448Mb/s TCP throughput
> 
> tbridge -k gave 458mb/s TCP througput
> 
> im going to try this again with more CPUs as the workload of forwarding in
> this box involves 3 bridges in series.
> 
> I will also try with the tpmr(4) driver
> so something about OpenVPN  has a bottleneck that reduces performance
> by a factor of 3 -4x
> 

I am surprised by the 20% better performance of the threaded version. I wonder
if the single threaded version maxes out the performance of a single CPU.
My tests running tcpbench just between two interfaces show no
measurable performance difference between the different modes (for either
tun or tap).

-- 
:wq Claudio



Re: tap(4) performance tuning on (amd64)

2020-01-20 Thread Claudio Jeker
On Tue, Jan 21, 2020 at 02:44:35AM +, Tom Smyth wrote:
> Claudio,
> Thanks for this,
> I compiled  it on Openbsd 6.6 (stable) amd64
> 
> it compiled without error
> 
> the binary seems to run  fine but,
> ./tbridge -k /dev/tap0 /dev/tap1
> 
> runs and displays the usage message and  gives an errorlevel of 1
> every time  use the -k or -t or -s or -p arguments   see  terminal
> conversation below
> 

Shit, I added a last minute check and as usual introduced a bug.
On line 189, change if (ch != 0) to if (mode != 0).

-- 
:wq Claudio

/*
 * Copyright (c) 2020 Claudio Jeker 
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
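/*
 * The header names were lost in the archive; the list below is the set
 * the visible code needs and may differ from the original.
 */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

volatile sig_atomic_t	quit;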

static void
do_read(int in, int out)
{
	char buf[2048];
	ssize_t n, o;

	/* forward one packet from in to out */
	n = read(in, buf, sizeof(buf));
	if (n == -1)
		err(1, "read");
	o = write(out, buf, n);
	if (o == -1)
		err(1, "write");
	if (o != n)
		errx(1, "short write");
}

static void
do_poll(int fd[2])
{
	struct pollfd pfd[2];
	int n, i;

	while (quit == 0) {
		memset(pfd, 0, sizeof(pfd));
		pfd[0].fd = fd[0];
		pfd[0].events = POLLIN;

		pfd[1].fd = fd[1];
		pfd[1].events = POLLIN;

		n = poll(pfd, 2, INFTIM);
		if (n == -1)
			err(1, "poll");
		if (n == 0)
			errx(1, "poll: timeout");
		for (i = 0; i < 2; i++) {
			if (pfd[i].revents & POLLIN)
				do_read(fd[i], fd[(i + 1) & 0x1]);
			else if (pfd[i].revents & (POLLHUP | POLLERR))
				errx(1, "fd %d revents %x", i, pfd[i].revents);
		}
	}
}

static void
do_select(int fd[2])
{
	fd_set readfds;
	int n, i, maxfd = -1;

	while (quit == 0) {
		/* select(2) modifies the set, so rebuild it every round */
		FD_ZERO(&readfds);
		for (i = 0; i < 2; i++) {
			if (fd[i] > maxfd)
				maxfd = fd[i];
			FD_SET(fd[i], &readfds);
		}
		n = select(maxfd + 1, &readfds, NULL, NULL, NULL);
		if (n == -1)
			err(1, "select");
		if (n == 0)
			errx(1, "select: timeout");
		for (i = 0; i < 2; i++) {
			if (FD_ISSET(fd[i], &readfds))
				do_read(fd[i], fd[(i + 1) & 0x1]);
		}
	}
}

static void
do_kqueue(int fd[2])
{
	struct kevent kev[2];
	int kq, i, n;

	if ((kq = kqueue()) == -1)
		err(1, "kqueue");

	/* register both fds; udata carries the index into fd[] */
	memset(kev, 0, sizeof(kev));
	for (i = 0; i < 2; i++) {
		EV_SET(&kev[i], fd[i], EVFILT_READ, EV_ADD | EV_ENABLE,
		    0, 0, (void *)(intptr_t)i);
	}
	if (kevent(kq, kev, 2, NULL, 0, NULL) == -1)
		err(1, "kevent register");

	while (quit == 0) {
		n = kevent(kq, NULL, 0, kev, 2, NULL);
		if (n == -1)
			err(1, "kevent");
		if (n == 0)
			errx(1, "kevent: timeout");
		for (i = 0; i < n; i++) {
			if (kev[i].flags & EV_ERROR)
				errc(1, kev[i].data, "kevent EV_ERROR");
			if (kev[i].filter == EVFILT_READ) {
				int r = (int)(intptr_t)kev[i].udata;
				do_read(fd[r], fd[(r + 1) & 0x1]);
			}
		}
	}
}

static void *
run_thread(void *arg)
{
	int *fd = arg;

	while (quit == 0)
		do_read(fd[0], fd[1]);

	return NULL;
}

static void
do_thread(int fd[2])
{
	pthread_t tid;
	int ret;

	ret = pthread_create(&tid, NULL, run_thread, fd);
   

Re: tap(4) performance tuning on (amd64)

2020-01-20 Thread Claudio Jeker
On Fri, Jan 10, 2020 at 01:00:49PM +0000, Tom Smyth wrote:
> Hi lads,
> 
> I have been doing some testing with tap(4) and openvpn (standard ssl )
> I have been using openvpn with tap and I have been trying with null
> encryption. null authentication,
> the performance of the tap interface  seems to be about 100-150Mb/s  on a 
> system
> which can give  3Gb/s-5Gb/s on ix(4) interfaces  in Bridge mode and
> 4-8Gb/s on tpmr mode
> I was wondering is there a sysctl setting that if modified would
> improve the tap interface performance.
> I have tried with tpmr(4) and  bridge(4)
> 
> is there a simple way  testing a tap(4) interface throughput /
> performance without Openvpn process
> 
> I can try mlvpn and wireguard
> but I would love if there was a trick where I can just test the tap(4)
> interface  with something like pair(4)...
> 
> ix0---bridge0--tap0---someprocess--tap1-bridge1--ix1
> or
> ix0--tpmr0--tap0--someprocess--tap1-tpmr1-ix1
> 
> is there a simple "someprocess" that would provide forwarding packets
> between tap0 and tap1 in userland
> so that any performance testing on tap(4) interfaces does not have the
> distractions of complex userland programs with encryption /
> encapsulation overheads
> 

I just wrote a simple tun/tap bridge for testing so here you go.
Compile it with 'cc -Wall -o tbridge tbridge.c -lpthread' and run it
with 'tbridge -k /dev/tun0 /dev/tun1' to wire tun0 and tun1 together.
You can choose between select(2), poll(2), kqueue(2) and pthreads as the
way to multiplex the reads.

For me the code triggers scheduler inefficiencies and causes packet drops
on the output queue when there are multiple packet producers.
-- 
:wq Claudio

/*
 * Copyright (c) 2020 Claudio Jeker 
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
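/*
 * NOTE: the header names were lost in the archive. The set below is an
 * assumption that covers the calls used in this file.
 */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

volatile sig_atomic_t	quit;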

static void
do_read(int in, int out)
{
	char buf[2048];
	ssize_t n, o;

	/* forward one packet from in to out */
	n = read(in, buf, sizeof(buf));
	if (n == -1)
		err(1, "read");
	o = write(out, buf, n);
	if (o == -1)
		err(1, "write");
	if (o != n)
		errx(1, "short write");
}

static void
do_poll(int fd[2])
{
	struct pollfd pfd[2];
	int n, i;

	while (quit == 0) {
		memset(pfd, 0, sizeof(pfd));
		pfd[0].fd = fd[0];
		pfd[0].events = POLLIN;

		pfd[1].fd = fd[1];
		pfd[1].events = POLLIN;

		n = poll(pfd, 2, INFTIM);
		if (n == -1)
			err(1, "poll");
		if (n == 0)
			errx(1, "poll: timeout");
		for (i = 0; i < 2; i++) {
			if (pfd[i].revents & POLLIN)
				do_read(fd[i], fd[(i + 1) & 0x1]);
			else if (pfd[i].revents & (POLLHUP | POLLERR))
				errx(1, "fd %d revents %x", i, pfd[i].revents);
		}
	}
}

static void
do_select(int fd[2])
{
	fd_set readfds;
	int n, i, maxfd = -1;

	while (quit == 0) {
		/* select(2) modifies the set, so rebuild it every round */
		FD_ZERO(&readfds);
		for (i = 0; i < 2; i++) {
			if (fd[i] > maxfd)
				maxfd = fd[i];
			FD_SET(fd[i], &readfds);
		}
		n = select(maxfd + 1, &readfds, NULL, NULL, NULL);
		if (n == -1)
			err(1, "select");
		if (n == 0)
			errx(1, "select: timeout");
		for (i = 0; i < 2; i++) {
			if (FD_ISSET(fd[i], &readfds))
				do_read(fd[i], fd[(i + 1) & 0x1]);
		}
	}
}

static void
do_kqueue(int fd[2])
{
	struct kevent kev[2];
	int kq, i, n;

	if ((kq = kqueue()) == -1)
		err(1, "kqueue");

	memset(kev, 0, sizeof(kev));
	for (i = 0; i < 2; i++) {
 

Re: The OpenBSD talk at 36c3

2019-12-30 Thread Claudio Jeker
On Sun, Dec 29, 2019 at 01:29:12PM +0100, Henry Jensen wrote:
> Greetings,
> 
> for those who didn't watched it, there is an accompanied site at
> https://isopenbsdsecu.re/
> 
> Summary: There are a lot of claims. The speaker basically said, that
> some mitigations are "cool", but other, more or less, useless.
> 
> Further accusations are, that OpenBSD still uses e-mail and cvs and not
> more advanced CI tools.
> 
> I can't say anything to the more technical claims about useless
> mitigations, since I am not a OS developer. Is there going to be a
> response from the OpenBSD team?
> 

One thing that everyone can check is the claim that 50% of our commit
messages are less than 10 chars long and 75% are less than 20 chars.
Using the git repo you can run something like this and get the numbers
yourself.

openbsd-git> git log --log-size --format="%B" | grep '^log size ' | cut -f
3 -d ' ' | awk '{ t++; if ($1 <= 10) s++; if ($1 <= 20) m++; else l++; }
END { print s " <= 10 char"; print m " <= 20 char"; print l " rest"; print
t " total" }'

12386 <= 10 char
25894 <= 20 char
176304 rest
202198 total

Sorry, but 25k is nowhere close to 75% of 202198 (it is about 13%, and
the 12386 messages of 10 chars or less are about 6%).
It seems he counted words, not characters.

-- 
:wq Claudio



Re: Readv and writev failing across ethernet

2019-12-24 Thread Claudio Jeker
On Mon, Dec 23, 2019 at 08:17:37AM -0800, Philip Guenther wrote:
> On Mon, Dec 23, 2019 at 5:04 AM Raymond, David 
> wrote:
> 
> > The "timeout" error was numerically 60.  Curiously, boards with RTL
> > 8111GR chips did not produce these errors, but those with RTL 8111H
> > chips did.  Unfortunately, this chipset seems to be in a lot of newer
> > motherboards.
> >
> > I didn't use ktrace/kdump.  The openmpi software returned the error
> > presented by readv/writev.
> >
> > It sounds like the simplest solution at this point is to try
> > non-Realtek pcie network cards.  Any suggestions?  How are Intel or
> > Broadcom cards?
> >
> 
> At this point I think you're clearly in the "device driver is buggy"
> situation.  If this device has an in-tree driver (and not something you're
> compiling locally into your kernel) then you should start a new thread
> starting with a dmesg and a clear description of the involved hardware.

I don't know what Open MPI uses for communication, but re(4) does not
return errno 60 (ETIMEDOUT), so it seems to be something else. Also, the
8111G and 8111H are treated the same way in our re(4) driver.

-- 
:wq Claudio



Re: route an IPv4 /32 to a different interface

2019-12-16 Thread Claudio Jeker
On Sun, Dec 15, 2019 at 08:57:48PM +0100, Denis Fondras wrote:
> Hi,
> 
> I have this setup :
> 
> em3: flags=8843 mtu 1500
> lladdr 
> index 4 priority 0 llprio 3
> media: Ethernet autoselect (1000baseSX full-duplex)
> status: active
> inet6 fe80::aa9:b803:8a7a:ca72%em3 prefixlen 64 scopeid 0x4
> inet 172.16.0.254 netmask 0xffffff00 broadcast 172.16.0.255
> em4: flags=8843 mtu 1500
> lladdr
> index 5 priority 0 llprio 3
> media: Ethernet autoselect (1000baseSX full-duplex)
> status: active
> inet 172.16.0.249 netmask 0xfffffffc broadcast 172.16.0.251
> inet6 fe80::29ae:98d:f238:fd68%em4 prefixlen 64 scopeid 0x5
> 
> I have a computer with IPv4 address 172.16.0.248 connected to em3.
> When I try to ping it, obviously it goes to em4.
> 
> How can I route 172.16.0.248 through em3 ?
> 
> I tried with :
> * route add 172.16.0.248/32 172.16.0.254 -iface em3
> * route add 172.16.0.248/32 -llinfo -link -static -iface em3
> but without luck.
> 

You have overlapping networks and you are trying to add an IP from the
more specific block into the less specific one. That is going to be tricky
and it will most probably not work in all cases (e.g. hosts on the more
specific network would not be able to talk to that IP).

While it may be possible to coerce the routing table into doing the right
thing, it will probably not work well.
One way to work around this is to use rdomains; another is to renumber the
network.
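
To illustrate why the ping goes out em4, here is a small sketch of
longest-prefix matching over the two connected networks from the ifconfig
output, assuming the netmasks are /24 on em3 and /30 on em4 (inferred from
the broadcast addresses). The struct net table and the lookup helper are
made up for this example and are not kernel code:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative connected-route entry; not a kernel structure. */
struct net {
	uint32_t	 addr;	/* network address, host byte order */
	int		 plen;	/* prefix length */
	const char	*ifname;
};

#define IP4(a, b, c, d)						\
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) |	\
	((uint32_t)(c) << 8) | (uint32_t)(d))

static const struct net nets[] = {
	{ IP4(172, 16, 0, 0),   24, "em3" },	/* 172.16.0.254/24 */
	{ IP4(172, 16, 0, 248), 30, "em4" },	/* 172.16.0.249/30 */
};

/* Return the interface of the most specific matching network. */
static const char *
lookup(uint32_t dst)
{
	const char *best = NULL;
	int bestlen = -1;
	size_t i;

	for (i = 0; i < sizeof(nets) / sizeof(nets[0]); i++) {
		uint32_t mask = nets[i].plen == 0 ? 0 :
		    0xffffffffU << (32 - nets[i].plen);

		if ((dst & mask) == nets[i].addr && nets[i].plen > bestlen) {
			best = nets[i].ifname;
			bestlen = nets[i].plen;
		}
	}
	return best;
}
```

Under these assumptions 172.16.0.248 falls inside the /30 on em4, which
wins over the /24 on em3, while any other host in 172.16.0.0/24 outside
.248-.251 still resolves to em3.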

-- 
:wq Claudio



Re: random packet drops with syncookies/synproxy

2019-11-09 Thread Claudio Jeker
On Sat, Nov 09, 2019 at 01:30:32PM +0100, Markus Wernig wrote:
> Hm, also no replies to that one :-)
> 
> On 11/6/19 8:15 PM, Markus Wernig wrote:
> 
> > So just to make sure: Is anybody using syncookies and/or synproxy in
> > production in a similar setup?
> 
> So nobody is using syncookies/synproxy at all?

I guess that is a reasonably safe assumption. syncookies are rather new
and probably need more battle testing. synproxy never helped me much in
case of a SYN attack since it will cause pf(4) to hit the state limit no
matter what you do and then stuff starts to break.

-- 
:wq Claudio



Re: LDAP tls: handshake failure

2019-10-24 Thread Claudio Jeker
On Thu, Oct 24, 2019 at 02:06:47PM +0200, Martijn van Duren wrote:
> On 10/24/19 1:50 PM, Robert Klein wrote:
> > Hi,
> > 
> > 
> > 
> > On Thu, 24 Oct 2019 05:26:49 +0200,
> > Predrag Punosevac wrote:
> >>
> >> Kapetanakis Giannis wrote:
> >>
> >>> On 23/10/2019 19:14, Predrag Punosevac wrote:
>  Hi Misc,
> 
>  I just upgraded a LDAP server from 6.5 to 6.6 running authorization and
>  authentication services for a 100 some member university research group.
>  It appears TLS handshake is broken. This worked perfectly on 6.5 and
>  earlier.
> 
> > 
> > [ rest deleted ]
> > 
> >> I am out of fuel to look more this tonight but I am 99% sure something
> >> have changed on 6.6 which broke the things. Maybe my configuration was
> >> wrong all along and in 6.6 few screws got tighten up which bit me for my
> >> rear end. I would appreciate any further commend or suggestions how to
> >> debug this. I would also appreciate any reports of fully working ldapd
> >> on 6.6 release
> >>
> >> Best,
> >> Predrag
> >>
> > 
> > This is related to commit “Make sure that ber in ber_scanf_elements is
> > not NULL before parsing format” (martijn@) and caused by the scan string
> > used by ber_scanf_elements on line 310 in ldape.c
> 
> Thanks for looking into this. I didn't found the time yet.
> > 
> > Regarding the commit, see also emails with subject “ber.c: Don't
> > continue on nonexistent ber” to tech@ on August, 13.
> > 
> > When you set scan string for ber_scanf_elements in line 310 of ldape.c
> > from "{se" to "{s" it works again.  Patch below.
> > 
> > When you look at the ldap_extended function on ldape.c, you see ext_val
> > is assigned to req_op in line 314.  The only use could happen in the
> > extended_ops[i]fn(req) call in line 318.  This currently can only be a
> > call to ldap_starttls (beginning at line 285, same file) which doesn't
> > use req_op either.  So it the `e' shouldn't matter.
> > 
> > At a guess, this also conforms to RFC4511, section 4.14.1.
> 
> Glancing over the RFC seems that you are correct.
> > 
> > If ldap_extended is extended to handle other operations than starttls,
> > care must be taken for an optional additional octet string in the
> > request (see definition of extended request in RFC4511, section 4.12).
> 
> Diff below should handle this. Also, you forgot to remove the ext_val.
> > 
> > 
> > Best regards
> > Robert
> > 
> martijn@
> 
> Index: ldape.c
> ===
> RCS file: /cvs/src/usr.sbin/ldapd/ldape.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 ldape.c
> --- ldape.c   28 Jun 2019 13:32:48 -  1.31
> +++ ldape.c   24 Oct 2019 12:05:19 -
> @@ -298,7 +298,6 @@ ldap_extended(struct request *req)
>  {
>   int  i, rc = LDAP_PROTOCOL_ERROR;
>   char*oid = NULL;
> - struct ber_element  *ext_val = NULL;
>   struct {
>   const char  *oid;
>   int (*fn)(struct request *);
> @@ -307,11 +306,11 @@ ldap_extended(struct request *req)
>   { NULL }
>   };
>  
> - if (ber_scanf_elements(req->op, "{se", &oid, &ext_val) != 0)
> + if (ber_scanf_elements(req->op, "{s", &oid) != 0)
>   goto done;
>  
>   log_debug("got extended operation %s", oid);
> - req->op = ext_val;
> + req->op = req->op->be_sub->be_next;
>  
>   for (i = 0; extended_ops[i].oid != NULL; i++) {
>   if (strcmp(oid, extended_ops[i].oid) == 0) {

OK claudio@

-- 
:wq Claudio



Re: Does net.mpls.maxloop_inkernel do anything?

2019-10-24 Thread Claudio Jeker
On Thu, Oct 24, 2019 at 12:01:35PM +0100, Thomas Habets wrote:
> $ cd /usr/src/sys
> $ grep mpls_inkloop -r .
> ./netmpls/mpls.h:   &mpls_inkloop, \
> ./netmpls/mpls.h:extern int mpls_inkloop;
> ./netmpls/mpls_raw.c:int mpls_inkloop = MPLS_INKERNEL_LOOP_MAX;
> $ grep -r MPLSCTL_MAXINKLOOP .
> ./netmpls/mpls.h:#define MPLSCTL_MAXINKLOOP  4
> 
> Looks like last users of this variable were removed in 2015:
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/netmpls/mpls_input.c.diff?r1=1.51&r2=1.52&f=h
> 
> So should this sysctl be retired, or is there an indirect accessor path I
> did not find?

Yes, I agree this is dead and could be GC-ed.

-- 
:wq Claudio



Re: Requesting vi tips

2019-10-18 Thread Claudio Jeker
On Fri, Oct 18, 2019 at 03:12:37PM +0100, cho...@jtan.com wrote:
> OK this has started to get on my nerves now.
> 
> I use vi to enter emails despite using evil emacs for development and
> other general editing. Rather than linking them together (they're on
> seperate machines) to enter emails in emacs I'd rather figure out
> something interesting about vi.
> 
> At the moment I limit lines to 72 characters through a laborious process
> of finding the appropriate space character myself and replacing it with
> a ^M. Obviously nonsense which is why I sometimes don't bother. (Sorry).
> 
> I know about fmt and could easily concoct the pipeline to format each
> paragraph but I wonder if there's something that can correctly parse the
> whole email and format the entire thing en masse without me writing what
> would undoubtedly be Yet Another Poor Implementation.
> 
> Alternatively is there something that would make vi do it on the fly, or
> something akin to emacs' C-q or vim's gq. Although I appreciate the fact
> that vi doesn't try to be clever.
> 

set wl=72 will limit the line length to around 72. Additionally you
can use !fmt with movement chars to reformat sections. I use !{fmt
or {!}fmt frequently to reformat the paragraph I'm in.

-- 
:wq Claudio



Re: Strong Host Model in OpenBSD network stack

2019-10-18 Thread Claudio Jeker
On Thu, Oct 17, 2019 at 09:50:28PM +0200, Bastian Kanbach wrote:
> Hello,
> 
> recently I was performing some checks that relate to the "Strong Host
> Model" and "Weak Host Model", and I noticed that OpenBSD was behaving
> different than I expected. I always assumed that the network stack of
> OpenBSD was following the "Strong Host Model", but I might be wrong with
> that:

OpenBSD does follow the "Weak Host Model". Has always been like that.
 
> Basically the Strong Host Model means that the network stack "accepts
> locally destined packets if the destination IP address in the packet
> matches an IP address assigned to the network interface on which the
> packet was received."
> 
> FreeBSD and NetBSD have a sysctl property for this, called
> "net.inet.ip.check_interface", which defaults to 0 (Weak Host Model).
> However for OpenBSD I haven't seen such a property at all.
> 
> 
> Basically my setup consisted of the following virtual machines and
> network interfaces (IP-Forwarding disabled):
> 
> 
> VM 1 (OpenBSD 6.5):
> 
> em0: 192.168.100.1/24 ("Internal Network")
> 
> em1: 10.0.0.97/24 ("NAT")
> 
> 
> VM 2 (Ubuntu Server 18.10):
> 
> ens33: 192.168.100.2/24 ("Internal Network")
> 
> 
> 
> 
> 
> As expected, ens33 of VM2 can communicate with em0 of VM1, since both
> interfaces are associated with the same Virtualbox network, and both IP
> addresses are part of the same /24 subnet.
> 
> ens33 of VM2 can't directly communicate with em1 of VM1, since the IP
> addresses are part of different subnets and no routes were configured.
> 
> 
> Then I performed 2 tests:
> 
> 
> Test 1:
> 
> Perform an arping from ens33/VM2 (192.168.100.2) to 10.0.0.97 (VM1). The
> packet was NOT answered by VM1.
> 

This is a Layer 2 ARP test. Since 10.0.0.97 is not on that interface arp
will not answer. The host model only matters for Layer 3.

> 
> Test 2:
> 
> Set the following route on VM2: ip r add 10.0.0.0/24 via 192.168.100.1.
> Then send an ICMP echo request to 10.0.0.97 (VM1), originating from
> 192.168.100.2 (VM2). VM1 replied with an ICMP echo reply (with a source
> MAC address of interface em0).
> 
> 
> While the behaviour of Test 1 indicates that the Strong Host Model is
> followed, Test 2 shows the behaviour of a "Weak Host Model".
 
No, Test 1 is not the right test for the host model.
 
> What of both is actually supposed to be the default for OpenBSD? Is
> there any kernel parameter to control these behaviours, like
> net.inet.ip.check_interface for FreeBSD or NetBSD?

We don't have a button and just follow the "Weak Host Model".
You can enforce a strong model per interface with pf(4):

block in on !em0 inet to (em0)

or

block in
pass in on em0 to (em0)
pass in on em1 to (em1)
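
The difference between the two models can be sketched in a few lines of C.
This is only an illustration using the addresses from the test setup;
struct ifaddr4, accept_local and the interface numbering are hypothetical
and have nothing to do with the actual kernel input path:

```c
#include <stddef.h>
#include <stdint.h>

enum host_model { WEAK_HOST, STRONG_HOST };

/* Illustrative address table entry: which interface owns which address. */
struct ifaddr4 {
	int		ifidx;	/* made-up index: 0 = em0, 1 = em1 */
	uint32_t	addr;	/* IPv4 address, host byte order */
};

static const struct ifaddr4 addrs[] = {
	{ 0, 0xc0a86401U },	/* em0: 192.168.100.1 */
	{ 1, 0x0a000061U },	/* em1: 10.0.0.97 */
};

/*
 * Decide whether a packet for dst that arrived on interface rcvif is
 * accepted as local.  The weak model accepts any local address; the
 * strong model also requires the address to be configured on the
 * receiving interface.
 */
static int
accept_local(enum host_model model, int rcvif, uint32_t dst)
{
	size_t i;

	for (i = 0; i < sizeof(addrs) / sizeof(addrs[0]); i++) {
		if (addrs[i].addr != dst)
			continue;
		if (model == WEAK_HOST || addrs[i].ifidx == rcvif)
			return 1;
	}
	return 0;
}
```

In Test 2 the echo request for 10.0.0.97 arrives on em0: the weak model
accepts it, while the strong model (or the pf rules above) would drop it.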

-- 
:wq Claudio



Re: Strong Host Model in OpenBSD network stack

2019-10-18 Thread Claudio Jeker
On Fri, Oct 18, 2019 at 07:21:42AM +0200, Remi Locherer wrote:
> On Thu, Oct 17, 2019 at 10:33:41PM -0600, Theo de Raadt wrote:
> > > Setting net.inet.ip.check_interface=1 on FreeBSD stopped any ICMP Echo
> > > replies immediately.
> > > 
> > > On NetBSD I set net.inet.ip.checkinterface=1 and it showed the same
> > > behaviour like FreeBSD. No replies anymore, whenever the "wrong"
> > > interface was contacted.
> > 
> > How many users set those variables?
> > 
> > A global seems this is a misguided place to establish such a policy.
> > 
> > If it was good and neccessary for everyone on all interfaces and had no
> > downsides, they would have turned it on.  But they didn't.
> > 
> > A similar feature "urpf-failed" which is more nuanced is available in
> > pf.conf, and you can properly use it on a per-interface basis, also
> > selecting to do so based un other per-rule options, rather than having
> > a 'global rule'.
> > 
> > Something blocked FreeBSD or NetBSD from turning this into the default.
> > What was that reason -- was it too damaging?
> > 
> > (I'm going to assume the people with so-called 'strong' views didn't win
> > the battle, and the so-called 'weak' view pervailed, probably because
> > the 'strong' option created breakage and prevents the dominant
> > operational model of Getting-Shit-Done.  That's why I ask how many
> > people in real life subscribe the 'strong' view by turning on this
> > option in FreeBSD/NetBSD.  3 people or is it 2?  In my experience,
> > everyone is so busy getting on about their lives they don't flip any
> > knobs which don't provide an immediately confirmable and neccessary
> > value).
> > 
> >  from source port source os source to dest port dest
> >  This rule applies only to packets with the specified source and
> >  destination addresses and ports.
> > 
> >  Addresses can be specified in CIDR notation (matching 
> > netblocks),
> >  as symbolic host names, interface names or interface group 
> > names,
> >  or as any of the following keywords:
> > 
> >  any  Any address.
> >  no-route Any address which is not currently routable.
> >  route label  Any address matching the given route(8) label.
> >  self Expands to all addresses assigned to all 
> > interfaces.
> >  <table>  Any address matching the given table.
> >  urpf-failed  Any source address that fails a unicast reverse 
> > path
> >   forwarding (URPF) check, i.e. packets coming in on
> >   an interface other than that which holds the route
> >   back to the packet's source address.
> > 
> > Convince us we should change to the strong model, and I'll embrace it.
> > 
> > You won't convince us to make a global which people don't understand...
> > 
> 
> This "strong" model is a bad fit for routers.
> 
> When this model is needed we have pf (antispoof or urpf-failed).
> Alternatively rdomains can be used (put a network interface with management
> services on it in a separate rdomain).
> 

The BSD systems and IIRC most unix systems have been following the
weak host model. As mentioned the weak model has a lot of benefits.
I see no point in changing this.

-- 
:wq Claudio



Re: bgpctl(8) community question

2019-10-10 Thread Claudio Jeker
On Mon, Oct 07, 2019 at 04:48:34PM -0500, Adam Thompson wrote:
> [OpenBSD 6.5-STABLE, up to date]
> 
> When using bgpctl(8), I'm able to do almost everything I need, but I'm
> having trouble figuring out how to do one thing:
> 
> How do I show routes that do NOT have a community (or ext-community, or
> large-community) attribute?
> 
> The best I can come up with so far is a fairly ugly AWK script that buffers
> the detailed route output, then emits it if it doesn't see a Communities:
> line.  Am I missing a better way?
> 
> Thanks,
> -Adam
> 
> N.B. manually looking through N sets of DFZ route tables isn't going to
> happen, I need a mostly-automatic solution.

Currently there is no other way to filter on prefixes which don't have a
community. You can use the ssv output to make the filtering easier or you
tag the prefixes with a different community (first set community on all
routes and then delete community again from those where you set the other
community).
 
Adding a 'not' option to community matching should be possible. I will
look into that after 6.6 is out.

-- 
:wq Claudio



Re: bgplg ping/traceroute failed

2019-10-03 Thread Claudio Jeker
On Thu, Oct 03, 2019 at 02:07:58PM -0400, Henry Bonath wrote:
> Hello Misc,
> 
> I had thought that I had configured the looking glass correctly per the man
> page,
> I have everything else working correctly, with custom header and footer
> with CSS and all works great.
> Whenever I attempt to ping/traceroute from the webpage, it simlpy reports:
> "failed."
> 
> Here is what permissions look like: (set to 4555, per the man page)
> # ls -l /var/www/bin
> total 3584
> -r-xr-xr-x  1 root  bin  336016 Apr 13 16:35 bgpctl
> -r-sr-xr-x  2 www   bin  366536 Apr 13 16:35 ping
> -r-sr-xr-x  2 www   bin  366536 Apr 13 16:35 ping6
> -r-sr-xr-x  2 www   bin  325320 Apr 13 16:35 traceroute
> -r-sr-xr-x  2 www   bin  325320 Apr 13 16:35 traceroute6

The ping* and traceroute* binaries need to be setuid root, not setuid www.
The root privs are needed to open the raw socket; after that the privs are
dropped. Also check the mail from Theo about the nosuid mount option on /var.
 
> OpenBSD version is 6.5 amd64.
> 
> Is there anything I am missing that I would need to do in order to make
> this work?
> Thanks in advance!
> -Henry

-- 
:wq Claudio



Re: bgpctl sho ri nei terse output vs man page discrepancy

2019-09-23 Thread Claudio Jeker
On Sun, Sep 22, 2019 at 04:48:18PM -0000, Stuart Henderson wrote:
> On 2019-09-22, Rachel Roch  wrote:
> > Hi,
> >
> > Hopefully I'm not missing something silly here but I've read the paragraph 
> > in the man page and it only lists 15 variables:
> >
> > "The printed numbers are the sent and received open,
> > sent and received notifications, sent and received
> > updates, sent and received keepalives, and sent and
> > received route refresh messages plus the current and
> > maximum prefix count, the number of sent and received
> > updates, and withdraws."
> >
> > But bgpctl sho ri nei outputs 16 numbers, not 15 ?
> 
> Sent and recevied updates, sent and received withdraws.
> 
> Unfortunately the peer's name/address is missing, which makes it a bit
> tricky to use with "group", though it's not very convenient to change the
> output format now ..

Better now than later. You could add the name/ip to the end.

-- 
:wq Claudio



Re: Prometheus node_exporter on OpenBSD - anyone managed ?

2019-09-21 Thread Claudio Jeker
On Fri, Sep 20, 2019 at 10:36:11AM +0200, Rachel Roch wrote:
> Claudio,
> 
> pkg_add node_exporter ?
> 
> I already had a good look at the package list on the FTP mirror and
> can't see any node_exporter there ?  pkg_add seems to agree with me, it
> says "can't find node_exporter" ?
> 
> Certainly pkg_add would be my preferred option, but it seems someone has
> forgot poor old node_exporter for recent releases ?  

node_exporter-0.18.0 is in -current since May. So yes, it has not been in
6.5 or any earlier release.
 
> Regarding the other gmake suggestion, that possibility occurred to me
> after sending yesterday's email, but I guess I would have to edit
> various source files to make sure its calling the right command.  Not
> rocket science I guess, but equally could be time consuming to make sure
> I've caught all the right spots in the code.

-- 
:wq Claudio
 
 
> Sep 20, 2019, 05:29 by cje...@diehard.n-r-g.com:
> 
> > On Thu, Sep 19, 2019 at 10:13:23PM +, Travis Cole wrote:
> >
> >>
> >> Looks like they are assuming GNU make.
> >>
> >>
> >> Try doing the build with 'gmake'.
> >>
> >>
> >> If you don't already have gmake installed:
> >>
> >>
> >> # pkg_add gmake
> >>
> >
> > Or just do `pkg_add node_exporter`. While prometheus does not provide
> > a pre-compiled binary OpenBSD does.
> >
> >> On Thu, Sep 19, 2019 at 11:49:20PM +0200, Rachel Roch wrote:
> >> > Hi,
> >> > 
> >> > The official Prometheus github repo 
> >> > (https://github.com/prometheus/node_exporter) 
> >> >  appears to suggest in 
> >> > multiple places that node_exporter is capable of working on OpenBSD.
> >> > 
> >> > But although they provide pre-compiled binaries for multiple platforms 
> >> > including NetBSD (https://github.com/prometheus/node_exporter/releases) 
> >> >  they seemingly 
> >> > don't provide a binary for OpenBSD.
> >> > 
> >> > So I tried downloading the source and compiling it, but I get a 
> >> > screenful of nasty sounding messages, e.g.:
> >> > Bad modifier: , ,$(shell $(GO) env GOPATH))) 
> >> > 
> >> > Bad modifier: , ,$(shell $(GO) env GOPATH))) 
> >> > 
> >> > No closing parenthesis in archive specification  
> >> > 
> >> > *** Parse error: Error in archive specification: "(, \.'))" 
> >> > (Makefile.common:41) 
> >> > 
> >> > *** Parse error: Need an operator in 'else' (Makefile.common:51) 
> >> > 
> >> > *** Parse error: Need an operator in '' (Makefile.common:54) 
> >> > 
> >> > *** Parse error: Need an operator in '' (Makefile.common:55) 
> >> > 
> >> > *** Parse error: Need an operator in 'endif' (Makefile.common:61)
> >> > 
> >> > Bad modifier: , ,$(shell go env GOPATH)))
> >> > 
> >> > Bad modifier: , ,$(shell go env GOPATH))) 
> >> > 
> >> > 
> >> > Given the popularity of Prometheus, I'm sure someone on-list must be 
> >> > actively running it ?
> >> > 
> >> > Thanks !
> >> > 
> >> > Rachel
> >> > 
> >>
> >
> > -- 
> > :wq Claudio
> >
> 



Re: What is the 3rd column in the learned mac address list in ifconfig

2019-09-20 Thread Claudio Jeker
On Fri, Sep 20, 2019 at 07:16:15AM +0100, Tom Smyth wrote:
> Hi all, hope those of you at eurobsdcon are enjoying your selves
> wish I was there
> I waswondering what is the  3rd column in the learned mac address list in
> the column is a number 0 or 1 after the interface name in
>   ifconfig  bridge x
> 
> ihave highlighted with ** the value i'm interested in
> Addresses (max cache: 100, timeout: 240):
> 00:17:c8:3e:08:22 em2 *0* flags=0<>

This would be the age of the entry.
 
> 
> ifconfig  bridge x
> 
> 
> bridge0: flags=41
> index 7 llprio 3
> groups: bridge
> priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto
> rstp
> em2 flags=3
> port 4 ifpriority 0 ifcost 0
> em1 flags=3
> port 3 ifpriority 0 ifcost 0
> vether0 flags=3
> port 10 ifpriority 0 ifcost 0
> Addresses (max cache: 100, timeout: 240):
> 00:17:c8:3e:08:22 em2 0 flags=0<>
> 1c:c3:eb:68:05:29 em1 0 flags=0<>
> b8:bc:1b:1e:9d:9f em1 0 flags=0<>
> 38:f9:d3:47:db:54 em1 1 flags=0<>
> 48:bf:6b:e6:27:c2 em1 0 flags=0<>
> 74:d4:35:80:51:91 em2 1 flags=0<>
> 74:44:01:81:9b:7e em1 0 flags=0<>
> 
> -- 
> Kindest regards,
> Tom Smyth.

-- 
:wq Claudio



Re: Prometheus node_exporter on OpenBSD - anyone managed ?

2019-09-19 Thread Claudio Jeker
On Thu, Sep 19, 2019 at 10:13:23PM +0000, Travis Cole wrote:
> 
> Looks like they are assuming GNU make.
> 
> 
> Try doing the build with 'gmake'.
> 
> 
> If you don't already have gmake installed:
> 
> 
> # pkg_add gmake
> 

Or just do `pkg_add node_exporter`. While Prometheus does not provide
a pre-compiled binary, OpenBSD does.

> On Thu, Sep 19, 2019 at 11:49:20PM +0200, Rachel Roch wrote:
> > Hi,
> > 
> > The official Prometheus github repo 
> > (https://github.com/prometheus/node_exporter) 
> >  appears to suggest in 
> > multiple places that node_exporter is capable of working on OpenBSD.
> > 
> > But although they provide pre-compiled binaries for multiple platforms 
> > including NetBSD (https://github.com/prometheus/node_exporter/releases) 
> >  they seemingly don't 
> > provide a binary for OpenBSD.
> > 
> > So I tried downloading the source and compiling it, but I get a screenful 
> > of nasty sounding messages, e.g.:
> > Bad modifier: , ,$(shell $(GO) env GOPATH)))
> >  
> > Bad modifier: , ,$(shell $(GO) env GOPATH)))
> >  
> > No closing parenthesis in archive specification 
> >  
> > *** Parse error: Error in archive specification: "(, \.'))" 
> > (Makefile.common:41)
> >  
> > *** Parse error: Need an operator in 'else' (Makefile.common:51)
> >  
> > *** Parse error: Need an operator in '' (Makefile.common:54)
> >  
> > *** Parse error: Need an operator in '' (Makefile.common:55)
> >  
> > *** Parse error: Need an operator in 'endif' (Makefile.common:61)   
> >  
> > Bad modifier: , ,$(shell go env GOPATH)))   
> >  
> > Bad modifier: , ,$(shell go env GOPATH))) 
> > 
> > 
> > Given the popularity of Prometheus, I'm sure someone on-list must be 
> > actively running it ?
> > 
> > Thanks !
> > 
> > Rachel
> > 
> 

-- 
:wq Claudio



Re: ldapd hangs/stalls

2019-08-28 Thread Claudio Jeker
On Wed, Aug 28, 2019 at 03:17:05PM -0400, Allan Streib wrote:
> Allan Streib  writes:
> 
> > Running a rather busy ldapd host, and seeing some hangs in responses to
> > queries.
> 
> 
> I see that fstat -u _ldapd always ends at FD 119 when the hang occurs:
> 
> [...]
> _ldapd   ldapd  42641  112* internet stream tcp 0x0 172.16.0.169:389 <-- 172.16.0.38:44708
> _ldapd   ldapd  42641  113* internet stream tcp 0x0 172.16.0.169:389 <-- 172.16.0.45:43392
> _ldapd   ldapd  42641  114* internet stream tcp 0x0 172.16.0.169:389 <-- 172.16.0.26:54300
> _ldapd   ldapd  42641  115* internet stream tcp 0x0 172.29.202.69:389 <-- 172.29.200.100:36250
> _ldapd   ldapd  42641  116* internet stream tcp 0x0 172.29.202.69:389 <-- 172.29.200.109:45362
> _ldapd   ldapd  42641  117* internet stream tcp 0x0 172.29.202.69:389 <-- 172.29.200.108:47864
> _ldapd   ldapd  42641  118* internet stream tcp 0x0 172.29.202.69:389 <-- 172.29.200.104:56746
> _ldapd   ldapd  42641  119* internet stream tcp 0x0 172.29.202.69:389 <-- 172.29.200.106:40436
> 
> 
> I tried the following:
> 
> Gave _ldapd a login class of "ldap"
> 
> Added to login.conf:
> 
> ldap:\
> :openfiles=512:\
> :tc=daemon:
> 
> restart ldapd.
> 
> Still hangs with fstat output the same.
> 

I guess the problem is in the error handling of one of the filter codes
which leaks an fd. At least I suspect that the error message about filter
type is suggesting that.
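
One way to check for a descriptor leak is to watch whether the highest fd number keeps climbing between restarts. A minimal sketch, assuming fstat(1) output laid out as in the quote above (fd number in the fourth column, optionally followed by "*"); the name max_fd is made up for illustration:

```shell
# Print the highest numeric file descriptor seen in fstat(1) output on stdin.
# Assumes the column layout quoted above: USER CMD PID FD ...
max_fd() {
    awk '$4 ~ /^[0-9]+\*?$/ {
        fd = $4
        sub(/\*$/, "", fd)        # drop the trailing "*" marker
        if (fd + 0 > max) max = fd + 0
    }
    END { print max + 0 }'
}

# Example (on the ldapd host): fstat -u _ldapd | max_fd
```

Logging that number every minute or so should show whether descriptors are only ever added and never released.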

-- 
:wq Claudio



Re: missing SYN_RECV in netstat

2019-08-20 Thread Claudio Jeker
On Tue, Aug 20, 2019 at 07:36:11PM +0200, Peter J. Philipp wrote:
> Hi,
> 
> On the NANOG list there is a thread about something synflooding:
> https://mailman.nanog.org/pipermail/nanog/2019-August/102713.html
> 
> Most of my hosts are synflooded, and I was wondering why my OpenBSD
> hosts don't show any SYN_RECV states in a netstat -nafinet.  I had to tcpdump
> to see a synflood happening on port 53 on one of my hosts, have to 
> still check the other one.   Could there be a bad pf rule I'm 
> using?  I suspect this is a worm of sorts or something.  
> 
> While not an emergency, it is inconvenient to pick out the synflooders
> with tcpdump.  Is there any better tools?

netstat does not show SYN_RECV states because those are held in the
syncache and need to finish the 3-way handshake before showing up in
netstat. I normally use tcpdump to identify synfloods but pfctl -ss will
probably show them as well (up to the moment where pf decides to switch to
syncookies).
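
To pick the senders out of a tcpdump capture, the text output can be tallied per source address. A rough sketch, assuming IPv4 and tcpdump's usual "src.port > dst.port:" line layout; syn_sources and the interface name em0 are placeholders, not anything from this thread:

```shell
# Tally source addresses from tcpdump text on stdin, busiest first.
# Prints "count address"; assumes IPv4 "addr.port > addr.port:" lines.
syn_sources() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i == ">") {
                src = $(i - 1)
                sub(/\.[0-9]+$/, "", src)   # strip the source port
                seen[src]++
                break
            }
    }
    END { for (s in seen) print seen[s], s }' | sort -rn
}

# Example, capturing 1000 bare SYNs to port 53 on a hypothetical em0:
#   tcpdump -n -c 1000 -i em0 'tcp[13] & 0x12 = 0x02 and dst port 53' | syn_sources
```

With spoofed sources the tally will be spread thin, so this mostly helps against floods coming from a few real hosts.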

-- 
:wq Claudio



Re: Building Unbound with Python module support

2019-08-07 Thread Claudio Jeker
On Wed, Aug 07, 2019 at 08:44:07AM +0100, Andy Lemin wrote:
> Morning Stuart,
> 
> So I’ve tested with the base build options properly, the initial errors I saw 
> before have gone which is good. But I have a more fundamental issue with 
> Unbound now sadly.
> 
> Swig successfully built “/usr/src/unbound/pythonmod/unboundmodule.py” and 
> installed it to “/usr/local/lib/python2.7/site-packages/unboundmodule.py”.
> 
> However unbound is unable to find it, and the following errors are seen;
> [HOME]root@bsd1:/var/unbound#/usr/local/sbin/unbound -c /var/unbound/etc/unbound.conf -dv
> [1565135861] unbound[90497:0] notice: Start of unbound 1.9.3.
> [1565135861] unbound[90497:0] debug: increased limit(open files) from 128 to 16478
> [1565135861] unbound[90497:0] debug: creating udp4 socket 127.0.0.1 53
> [1565135861] unbound[90497:0] debug: creating udp4 socket 10.10.1.5 53
> [1565135861] unbound[90497:0] debug: creating unix socket /var/run/unbound.sock
> [1565135861] unbound[90497:0] debug: switching log to syslog
> Could not find platform independent libraries 
> Could not find platform dependent libraries 
> Consider setting $PYTHONHOME to [:]
> Traceback (most recent call last):
>   File "", line 1, in 
> ImportError: No module named distutils.sysconfig
> Traceback (most recent call last):
>   File "", line 1, in 
> NameError: name 'distutils' is not defined
> Traceback (most recent call last):
>   File "", line 1, in 
> ImportError: No module named unboundmodule
> 
> 
> 
> I have tried all manner of values for PYTHONHOME and I have also tried 
> 
> --with-pythonmodule=/usr/local/lib/python2.7/site-packages
> 
> 
> Searching around shows others have found the exact same issue;
> https://nlnetlabs.nl/pipermail/unbound-users/2011-July/007371.html
> 
> What do you think about this in context of OpenBSD?
> 

unbound does a chroot(2) by default to /var/unbound and so anything in
/usr/local is unreachable. Either install the python code into the chroot
or try running unbound with chroot: "" (which disables chroot). See also
unbound.conf(5) for more info about chroot.
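
For reference, the second option is a short change in unbound.conf. This is only a sketch; running without the chroot gives up that layer of isolation, so it is probably best kept to testing:

```
server:
	# an empty string disables the chroot, see unbound.conf(5)
	chroot: ""
```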


> Sent from a teeny tiny keyboard, so please excuse typos
> 
> > On 7 Aug 2019, at 00:03, Andy Lemin  wrote:
> > 
> > Hi Stuart,
> > 
> > Thanks for your reply.
> > 
> > So I put in some leg work to set myself up so I could build a new release 
> > base system, and went digging.
> > 
> > And I found “/usr/src/usr.src/unbound/Makefile.bsd-wrapper” so I think I 
> > have found the correct build options to match with the base builds 
> > CONFIGURE_OPTS_UNBOUND
> > 
> > I will try again with these options tomorrow, and see if I have the same 
> > errors.
> > 
> > “The default install can't include Python support, because the default 
> > install of Unbound is in the base OS, and Python isn't.”
> > 
> > Facepalm.. Of course!
> > 
> > Is there a C plugin library? I would like to make this project 
> > native/portable so other users can use this project without having to 
> > rebuild Unbound?
> > 
> > Thanks Andy.
> > 
> > 
> > Sent from a teeny tiny keyboard, so please excuse typos
> > 
> >>> On 6 Aug 2019, at 19:36, Stuart Henderson  wrote:
> >>> 
> >>> On 2019-08-06, Andy Lemin  wrote:
> >>> Hi guys,
> >>> 
> >>> I’m just after some general advice as I feel like I’m doing something 
> >>> wrong, and having to hack around too much for what I believe should be 
> >>> simple.
> >>> 
> >>> I am developing a simple Python plugin for Unbound, and the default 
> >>> Unbound install on OpenBSD sadly wasn’t built with “--with-pythonmodule”.
> >>> 
> >>> So I grabbed the Unbound source code with a git clone from GitHub, 
> >>> installed dependencies, and did “./configure --with-pythonmodule”, make, 
> >>> make install etc..
> >>> 
> >>> So nothing special here. It installed to /usr/local/ rather than just 
> >>> /usr etc, and so fiddled around with /etc/rc.d/unbound to make the rc 
> >>> scripts start the custom one.
> >>> 
> >>> But I’m getting errors which requires some extra config settings to 
> >>> squash when loading the same config as with the built in Unbound. ok 
> >>> maybe newer unbound code..
> >>> 
> >>> But I am then also getting errors when trying to load the stock example 
> >>> python plugin as per the source built sphinx docs.
> >>> 
> >>> I’m not at my computer at the moment so can’t share the exact errors, but 
> >>> thought I’d ask as it feels like I’m missing something obvious!
> >>> 
> >>> Maybe I need some extra build options or static library references to 
> >>> make it as smooth as the built in Unbound? Or maybe I should be using a 
> >>> different source?
> >>> 
> >>> Any initial thoughts? I’ll post exact errors as soon as I can.
> >> 
> >> Initial thoughts are "did you use the same configure flags as much as 
> >> possible
> >> as the build in base". Really need to see the errors to be able to make any
> >> more detailed suggestions.
> >> 
> >> The default install can't 

Re: Best 1Gbe NIC

2019-08-02 Thread Claudio Jeker
On Fri, Aug 02, 2019 at 12:28:58PM +0100, Andy Lemin wrote:
> Ahhh, thank you!
> 
> I didn’t realise this had changed and now the drivers are written with
> full knowledge of the interface.

That is an overstatement but we know for sure a lot more about these cards
than many other less open ones.

> So that would make Intel Server NICs (i350 for example) some of the best
> 1Gbe cards nowadays then?

They are well supported by OpenBSD as are many other server nics like bge
and bnx. I would not call them the best; when it comes to network cards it
seems to be a race to the bottom. All chips have stuff in them that is
just not great. em(4) for example needs a major workaround because the
buffersize is specified by a bitfield. 

My view is more pessimistic: all network cards are shit, there are just
some that are less shitty. Also I prefer to use em(4) over most other
gigabit cards.

-- 
:wq Claudio

> 
> Sent from a teeny tiny keyboard, so please excuse typos
> 
> > On 2 Aug 2019, at 09:52, Jonathan Gray  wrote:
> > 
> >> On Fri, Aug 02, 2019 at 09:19:09AM +0100, Andy Lemin wrote:
> >> Hi list,
> >> 
> >> I know this is a rather classic question, but I have searched a lot on 
> >> this again recently, and I just cannot find any conclusive up to date 
> >> information?
> >> 
> >> I am looking to buy the best 1Gbe NIC possible for OpenBSD and the only 
> >> official comments I can find relate to 3COM for ISA, or community 
> >> consensus towards Chelsio for 10Gbe.
> >> 
> >> I know Intel works ok and I’ve used the i350’s before, but my 
> >> understanding is that Intel still doesn’t provide the documentation for 
> >> their NICs and so the emX driver is reverse engineered.
> > 
> > This is incorrect.  Intel provides datasheets for Ethernet parts.
> > em(4) is derived from Intel authored code for FreeBSD supplied under a
> > permissive license.
> > 
> >> 
> >> And if I remember correctly some offload features were also disabled in 
> >> the emX driver a while back as some functions where found to be insecure 
> >> on die and so it was deemed safer to bring the logic back on CPU.
> >> 
> >> So I’m looking for the best 1Gbe NIC that supports the most 
> >> offloading/best driver support/performance etc.
> >> 
> >> Thanks, Andy.
> >> 
> >> PS; could we update the official supported hardware lists? ;)
> >> All the best.
> >> 
> >> 
> >> Sent from a teeny tiny keyboard, so please excuse typos
> >> 
> 



Re: Moving from Bird to OpenBGPD

2019-07-16 Thread Claudio Jeker
On Mon, Jul 15, 2019 at 11:33:45PM -0700, BSD user wrote:
> 
> 
> On 7/14/19 11:24 PM, Claudio Jeker wrote:
> > On Sun, Jul 14, 2019 at 07:28:29PM -0700, BSD user wrote:
> > > 
> > > 
> > > On 7/14/19 12:52 AM, Denis Fondras wrote:
> > > > On Sat, Jul 13, 2019 at 09:44:28PM -0700, BSD user wrote:
> > > > > Hello,
> > > > > 
> > > > > My apologies for sending this email multiple times.
> > > > > 
> > > > > I was so mortified by Tutanota's awful text formatting that I
> > > > > created a new mail account that supported IMAP so that I could load
> > > > > it up in Thunderbird with text only mode enabled.
> > > > > 
> > > > > Once again, my apologies for my rookie mistake choosing Tutanota for
> > > > > use on an international mailing list such as this one. I hope you
> > > > > guys will give me one more chance.
> > > > > 
> > > > > My (hopefully) unmangled message is below.
> > > > > 
> > > > 
> > > > You did not include which version you are running, I'll assume this is
> > > > 6.5.  It seems you do not have any filter, OpenBGPD denies everything
> > > > by default.
> > > > 
> > > 
> > > Thanks for the reply Denis. You were right, I was missing my allow
> > > rules. After setting "allow from any AS 64515" and "allow to any" rules,
> > > everything started working. I was able to get IPv6 working as well
> > > without a hitch.
> > > 
> > > Are there any other filter rules I should be setting to secure my BGP
> > > deployment? I'm on a private ASN assigned to me by Vultr. This is my
> > > first forray into BGP land, so any advice or tips would be much
> > > appreciated.
> > 
> > Ideally you want to limit the filters to only announce what you really
> > need to announce to prevent leaking of prefixes because of a
> > misconfiguration. Also what is Vultr sending you via BGP?  Depending on
> > that you may be able to limit the input as well.
> > 
> > I guess in this simple setup it does not matter to have simple allow
> > filters since this bgpd instance is not connected to the default free zone
> > and so there is less risk of leaking or receiving leaked routes.  In
> > general if your BGP setup has more than one external neighbor you need to
> > take care of your filters to make sure that you don't leak updates from
> > one neighbor to the other.
> > 
> 
> Thanks for the reply Claudio!
> 
> You were right, my "allow from" rule was unnecessary, Vultr doesn't
> appear to be sending me anything.
> 
> I managed to get my "allow to" rule tightened up to look like this:
> 
> allow to any prefix {xxx.xxx.xxx.141/32 2001:::::/64}

This rule is perfectly fine.
 
> I tried tightening the rule down further to restrict to Vultr's upstream
> AS and IP addresses like so:
> 
> 'allow to 169.254.169.254 AS 64515 prefix 140.82.0.141/32'

The problem here is that after the peer, AS 64515 is a filter parameter that
matches if AS 64515 appears anywhere in the ASPATH. It does not appear in
the ASPATH of your own prefixes, so this rule never matches.

You can write this rule either as:
'allow to 169.254.169.254 prefix xxx.xxx.xxx.141/32'
or
'allow to AS 64515 prefix xxx.xxx.xxx.141/32'

> Unfortunately the rule doesn't work properly as my prefixes immediately
> become unpingable after loading that rule. I'm probably missing
> something obvious. Any suggestions on how to tighten down the rule further?

Normally it is enough to limit the rule to the prefixes you want to
announce. So the first rule is just fine.
 
> My final question is concerning assigning prefixes to interfaces. Is it
> best practice to assign the addresses to something like 'lo1' loopback
> interface, or should assigning it as an alias on an egress interface
> suffice? I tried and they both seem to work.

That is a bit of a style question. In some cases using lo1 has benefits
(the IP will always be UP which can matter when you have multiple
interfaces). Using it on the only interface out would have the benefit
that you could use a default route that uses the public IP as source IP
for outgoing packets by default. (using 'route add default gateway 
-ifa xxx.xxx.xxx.141')

In your case I guess neither matters so you can decide what you like
better.

-- 
:wq Claudio



Re: Moving from Bird to OpenBGPD

2019-07-15 Thread Claudio Jeker
On Sun, Jul 14, 2019 at 07:28:29PM -0700, BSD user wrote:
> 
> 
> On 7/14/19 12:52 AM, Denis Fondras wrote:
> > On Sat, Jul 13, 2019 at 09:44:28PM -0700, BSD user wrote:
> > > Hello,
> > > 
> > > My apologies for sending this email multiple times.
> > > 
> > > I was so mortified by Tutanota's awful text formatting that I
> > > created a new mail account that supported IMAP so that I could load
> > > it up in Thunderbird with text only mode enabled.
> > > 
> > > Once again, my apologies for my rookie mistake choosing Tutanota for
> > > use on an international mailing list such as this one. I hope you
> > > guys will give me one more chance.
> > > 
> > > My (hopefully) unmangled message is below.
> > > 
> > 
> > You did not include which version you are running, I'll assume this is
> > 6.5.  It seems you do not have any filter, OpenBGPD denies everything
> > by default.
> > 
> 
> Thanks for the reply Denis. You were right, I was missing my allow
> rules. After setting "allow from any AS 64515" and "allow to any" rules,
> everything started working. I was able to get IPv6 working as well
> without a hitch.
> 
> Are there any other filter rules I should be setting to secure my BGP
> deployment? I'm on a private ASN assigned to me by Vultr. This is my
> first forray into BGP land, so any advice or tips would be much
> appreciated.

Ideally you want to limit the filters to only announce what you really
need to announce to prevent leaking of prefixes because of a
misconfiguration. Also what is Vultr sending you via BGP?  Depending on
that you may be able to limit the input as well.

I guess in this simple setup it does not matter to have simple allow
filters since this bgpd instance is not connected to the default free zone
and so there is less risk of leaking or receiving leaked routes.  In
general if your BGP setup has more than one external neighbor you need to
take care of your filters to make sure that you don't leak updates from
one neighbor to the other.

-- 
:wq Claudio



Re: umsm: sparc64

2019-07-04 Thread Claudio Jeker
On Thu, Jul 04, 2019 at 12:52:15PM +0300, Kihaguru Gathura wrote:
> Hereby attached the new multiprocessor kernel with umsm working ok.
> 
> The error message appears for each connection made to cuaU. This might
> potentially populate dmesg logs over time.
> 
> Error message:
> umsm0: this device is not using CDC notify message in intr pipe.
> Please send your dmesg to , thanks.
> umsm0: intr buffer 0xc1 0x1 0x3 0x0 0x0 0x0 0x0:

Can you try this USB device on a different OpenBSD machine (e.g. an amd64
one). I wonder if this is the device just sending a bad message along the
way. The first byte should be 0xa1 (UCDC_NOTIFICATION) and not 0xc1.
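
Assuming that first byte is the usual USB bmRequestType field (direction in bit 7, type in bits 6-5, recipient in bits 4-0), the two values differ only in the type bits. A small sketch to decode them; decode_reqtype is a made-up helper:

```shell
# Decode a USB bmRequestType byte into "direction type recipient".
decode_reqtype() {
    v=$(( $1 ))
    if [ $(( v & 0x80 )) -ne 0 ]; then
        dir=device-to-host
    else
        dir=host-to-device
    fi
    case $(( (v >> 5) & 3 )) in
        0) kind=standard ;;
        1) kind=class ;;
        2) kind=vendor ;;
        *) kind=reserved ;;
    esac
    case $(( v & 0x1f )) in
        0) rcpt=device ;;
        1) rcpt=interface ;;
        2) rcpt=endpoint ;;
        3) rcpt=other ;;
        *) rcpt=reserved ;;
    esac
    echo "$dir $kind $rcpt"
}

# decode_reqtype 0xa1   (the expected value)
# decode_reqtype 0xc1   (what the device sent)
```

Read that way, 0xc1 would be a vendor-specific rather than a class-specific notification, which would fit a device speaking its own dialect, though that is a guess.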
 
> Kihaguru.
> 
> www# dmesg
> console is /pci@83,4000/isa@7/su@0,3f8
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2019 OpenBSD. All rights reserved.  https://www.OpenBSD.org
> 
> OpenBSD 6.5 (WWW.MP) #0: Thu Jul  4 08:43:43 EAT 2019
> kihag...@www.datastore.ke:/usr/src/sys/arch/sparc64/compile/WWW.MP
> real mem = 17179869184 (16384MB)
> avail mem = 16862576640 (16081MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root: Fujitsu Siemens PRIMEPOWER250 2x SPARC64 V
> cpu0 at mainbus0: FJSV,SPARC64-V (rev 5.1) @ 1979 MHz
> cpu0: physical 128K instruction (64 b/l), 128K data (64 b/l), 3072K
> external (64 b/l)
> cpu1 at mainbus0: FJSV,SPARC64-V (rev 5.1) @ 1979 MHz
> cpu1: physical 128K instruction (64 b/l), 128K data (64 b/l), 3072K
> external (64 b/l)
> psycho0 at mainbus0 addr 0xfffb2000: SUNW,psycho, impl 0, version 4, ign c0
> psycho0: bus range 0-0, PCI bus 0
> psycho0: dvma map fe00-, STC0 enabled
> pci0 at psycho0
> ebus0 at pci0 dev 1 function 0 "Sun PCIO EBus2" rev 0x01
> "FJSV,scfc" at ebus0 addr 21-210085, 22-220031, 26-260001,
> 27-28 ivec 0x23 not configured
> "FJSV,flashprom" at ebus0 addr 0-3f not configured
> clock1 at ebus0 addr 25-251fff: mk48t59
> "FJSV,panel" at ebus0 addr 210011-210011 ivec 0x25 not configured
> ebus1 at pci0 dev 7 function 0 "Acer Labs M1533 ISA" rev 0x00
> com0 at ebus1 addr 3f8-3ff ivec 0x2b: ns16550a, 16 byte fifo
> com0: console
> com1 at ebus1 addr 2e8-2ef ivec 0x2b: ns16550a, 16 byte fifo
> hme0 at pci0 dev 1 function 1 "Sun HME" rev 0x01: ivec 0xe1, address
> 00:0b:5d:f3:a7:5c
> nsphyter0 at hme0 phy 1: DP83843 10/100 PHY, rev. 0
> mpi0 at pci0 dev 2 function 1 "Symbios Logic 53c1030" rev 0x07: ivec 0xe0
> mpi0: 0, firmware 1.0.12.0
> scsibus1 at mpi0: 16 targets, initiator 7
> sym0 at scsibus1 targ 0 lun 0:  SCSI2
> 0/direct fixed serial.FUJITSU_MAT3073N_SUN72G_000506B00RAR_AAN0P5200RAR
> sd0 at scsibus0 targ 0 lun 0:  SCSI2
> 0/direct fixed serial.FUJITSU_MAT3073N_SUN72G_000506B00RAR_AAN0P5200RAR
> sd0: 70007MB, 512 bytes/sector, 143374738 sectors
> sym1 at scsibus1 targ 1 lun 0:  SCSI2
> 0/direct fixed serial.FUJITSU_MAT3073N_SUN72G_000506B00SSL_AAN0P5200SSL
> sd1 at scsibus0 targ 1 lun 0:  SCSI2
> 0/direct fixed serial.FUJITSU_MAT3073N_SUN72G_000506B00SSL_AAN0P5200SSL
> sd1: 70007MB, 512 bytes/sector, 143374738 sectors
> mpi0: target 0 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
> mpi0: target 1 Sync at 160MHz width 16bit offset 127 QAS 1 DT 1 IU 1
> pciide0 at pci0 dev 13 function 0 "Acer Labs M5229 UDMA IDE" rev 0xc4:
> DMA, channel 0 configured to native-PCI, channel 1 configured to
> native-PCI
> pciide0: using ivec 0xe4 for native-PCI interrupt
> atapiscsi0 at pciide0 channel 0 drive 0
> scsibus2 at atapiscsi0: 2 targets
> cd0 at scsibus2 targ 0 lun 0:  ATAPI
> 5/cdrom removable
> cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
> pciide0: channel 1 disabled (no drives)
> ohci0 at pci0 dev 10 function 0 "Acer Labs M5237 USB" rev 0x03: ivec
> 0xe9, version 1.0, legacy support
> usb0 at ohci0: USB revision 1.0
> uhub0 at usb0 configuration 1 interface 0 "Acer Labs OHCI root hub"
> rev 1.00/1.00 addr 1
> psycho1 at mainbus0 addr 0xfff9e000: SUNW,psycho, impl 0, version 4, ign c0
> psycho1: bus range 128-128, PCI bus 128
> psycho1: dvma map fe00-, STC0 enabled, STC1 enabled
> pci1 at psycho1
> bge0 at pci1 dev 1 function 0 "Fujitsu PRIMEPOWER250/450 LAN" rev
> 0x02, BCM5702/5703 A2 (0x1002): ivec 0xc0, address 00:0b:5d:f4:27:5c
> brgphy0 at bge0 phy 1: BCM5703 10/100/1000baseT PHY, rev. 2
> "counter-timer" at mainbus0 addr 0xfff8bc00 not configured
> umsm0 at uhub0 port 1 configuration 1 interface 0 "HUAWEI HUAWEI
> Mobile" rev 2.00/1.02 addr 2
> ucom0 at umsm0
> umsm1 at uhub0 port 1 configuration 1 interface 1 "HUAWEI HUAWEI
> Mobile" rev 2.00/1.02 addr 2
> ucom1 at umsm1
> umsm2 at uhub0 port 1 configuration 1 interface 2 "HUAWEI HUAWEI
> Mobile" rev 2.00/1.02 addr 2
> ucom2 at umsm2
> umass0 at uhub0 port 1 configuration 1 interface 3 "HUAWEI HUAWEI
> Mobile" rev 2.00/1.02 addr 2
> umass0: using SCSI over Bulk-Only
> scsibus3 at umass0: 2 targets, initiator 0
> cd1 at scsibus3 targ 1 lun 0:  SCSI2
> 5/cdrom removable
> umass1 at 

Re: man bgpd.conf + question

2019-06-29 Thread Claudio Jeker
On Fri, Jun 28, 2019 at 10:52:01PM +, Mik J wrote:
> Hello,
> I have a syntax error with  announce none 
> group "spam-bgp" {
>     remote-as   $spamASN
>     multihop 64
>     announce none
> 
> I was told recently that everything is filtered by default from 6.4 and read 
> on Internet that announce none is deprecated
> However man bgpd.conf (Openbsd 6.5) still has this command in section 
> "NEIGHBORS AND GROUPS"announce (IPv4|IPv6) (none|unicast|vpn)
> 
> Do you know what is correct ?

There is a difference between:
announce none
and
announce IPv4 none
or
announce IPv6 none

The first one no longer exists. The other two still work and disable the
multiprotocol capability for the given AFI (IPv4 or IPv6).
By default the session enables the unicast AFI for the IP family that the
session uses (e.g. announce IPv6 unicast for IPv6 sessions) and the other
AFI is disabled.
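
As an illustration, the group from the question could look roughly like this for an IPv4-only session. The neighbor statements are omitted and the choice of address families is an assumption:

```
group "spam-bgp" {
	remote-as $spamASN
	multihop 64
	# "announce none" is gone; the AFI forms below only control the
	# multiprotocol capability. What actually gets announced is decided
	# by the filter rules, which deny by default.
	announce IPv4 unicast
	announce IPv6 none
}
```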

-- 
:wq Claudio



Re: Route through different gateways depending on process

2019-06-24 Thread Claudio Jeker
On Mon, Jun 24, 2019 at 08:47:38AM +, slackwaree wrote:
> Hello,
> 
> 
> Could you maybe provide a full case study for this as it is fairly
> uncommon task?
> 
> Do you mean that I will also need +2 ip aliases next to the boxes main ip?

No. You can use either option. The question is how are the proxy users
talking to those 3 different proxies? If you want to use port 8080 for all
of them you want 3 different IPs.
 
> Eg instead of
> 192.168.10.1: 3128 3129 3130
> 
> 192.168.10.1:3128 using gateway 192.168.10.250
> 192.168.10.2:3128 using gateway 192.168.10.251
> 192.168.10.3:3128 using gateway 192.168.10.252
> 

Try it out yourself. Create an extra table and run a proxy in it.
Use tools like tcpdump, nc, etc to check if it works.

Start with:
route -T1 add default 192.168.10.250
route -T1 exec "squid command to run, ideally with debugging on"

-- 
:wq Claudio

 
> ‐‐‐ Original Message ‐‐‐
> On Friday, June 21, 2019 8:27 PM, Brian Brombacher 
>  wrote:
> 
> > You’ll also need PF rules to allow incoming traffic from your squid clients 
> > to go to the routing table where your squid process is running.
> >
> > > On Jun 21, 2019, at 10:28 AM, Claudio Jeker cje...@diehard.n-r-g.com 
> > > wrote:
> > >
> > > > On Fri, Jun 21, 2019 at 02:11:53PM +, slackwaree wrote:
> > > > Hello,
> > > > I wonder if the following scenario can be solved with OpenBSD on 1 
> > > > single machine or with VMM:
> > > > I got 3 OpenBSD vms, all of them are exactly the same running squid 
> > > > except they use different default routers to route their traffic out.
> > > > I would like to merge these to one VM if it is possible somehow to tell 
> > > > OpenBSD to use different gateway depending on the squid process.
> > > > If not would the same thing be possible with VMMs? All the gateways are 
> > > > in the same IP range.
> > >
> > > A simple way to solve this is with multiple routing tables.
> > > Create multiple routing tables with:
> > > route -T1 add default 
> > > route -T2 add default 
> > > route -T3 add default 
> > > And start the 3 squid processes with route -T1 exec, route -T2 exec.
> > > You can also use the *_rtable variable in rc.d(8) to do that
> > > automatically.
> > > This requires that the 3 squids listen on different IPs or ports.
> > > --
> > > :wq Claudio



Re: Route through different gateways depending on process

2019-06-21 Thread Claudio Jeker
On Fri, Jun 21, 2019 at 02:11:53PM +, slackwaree wrote:
> Hello,
> 
> I wonder if the following scenario can be solved with OpenBSD on 1 single 
> machine or with VMM:
> 
> I got 3 OpenBSD vms, all of them are exactly the same running squid except 
> they use different default routers to route their traffic out.
> 
> I would like to merge these to one VM if it is possible somehow to tell 
> OpenBSD to use different gateway depending on the squid process.
> 
> If not would the same thing be possible with VMMs? All the gateways are in 
> the same IP range.
> 

A simple way to solve this is with multiple routing tables.

Create multiple routing tables with:
route -T1 add default 
route -T2 add default 
route -T3 add default 

And start the 3 squid processes with route -T1 exec, route -T2 exec.
You can also use the *_rtable variable in rc.d(8) to do that
automatically.

This requires that the 3 squids listen on different IPs or ports.
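
Spelled out for three gateways, the route commands can be generated with a small loop. A sketch only: rtable_cmds is a made-up helper that prints the route(8) commands rather than running them, and the gateway addresses are placeholders:

```shell
# Print one "route -T<n> add default <gw>" line per gateway, tables 1..N.
rtable_cmds() {
    n=1
    for gw in "$@"; do
        echo "route -T$n add default $gw"
        n=$((n + 1))
    done
}

# Review the output, then run it as root:
#   rtable_cmds 192.0.2.250 192.0.2.251 192.0.2.252 | sh
# and start each proxy in its own table, e.g. route -T1 exec <squid command>
```

Printing first and piping to sh afterwards makes it easy to check the table numbers before anything touches the routing tables.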

-- 
:wq Claudio



Re: network alias on different network

2019-06-20 Thread Claudio Jeker
On Thu, Jun 20, 2019 at 07:05:57PM +, Victor Camacho wrote:
> Hi,
> 
> Using OpenBSD 6.4 and I wanted to run some alias ip addresses on one of the 
> interfaces.
> My question is, can I use a different network as an alias?
> 
> Example:
> fw3# more hostname.bge0
> inet 10.2.0.1 255.255.0.0
> inet alias 10.2.1.1 255.255.255.255
> inet alias 10.2.2.1 255.255.255.255
> inet alias 10.2.4.1 255.255.255.255
> inet alias 10.2.6.1 255.255.255.255
> inet alias 172.17.11.1 255.255.255.255
> 
> I am having a problem pinging on the 172.17.11.0 network.
> Ping 172.17.11.1
> Responds, but nothing else on the network.
> I saw one thing on the internet that said 'alias' has to be on the same 
> network, but this was not specific as far as age and what operating system.
> To me a router, routes.
> Any clarification or better way to handle this would be appreciated.
> 

You need to add 172.17.11.1 with the correct netmask. The
255.255.255.255 netmask will not allow it to see any other system on that
net. The 255.255.255.255 netmask should only be used for additional IPs
that are already covered by another IP address on that interface.
Because of this, outgoing traffic will use 10.2.0.1 as local IP address and
not one of the others (10.2.1.1, 10.2.2.1, ...) unless explicitly bound.
When using two different networks on the same interface just configure
them the usual way (alias is just telling ifconfig not to replace the
first IP address on the interface and instead add another one).
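
The effect of the netmask can be checked with a little arithmetic: two addresses are on the same network only if they agree in every bit the mask covers. A minimal sketch; ip2int and same_net are made-up helpers, and the /24 mask for 172.17.11.0 is an assumption since the thread does not state the real prefix length:

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip2int() {
    oldifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$oldifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_net ADDR1 ADDR2 NETMASK: succeed if both addresses fall in one network.
same_net() {
    m=$(ip2int "$3")
    [ $(( $(ip2int "$1") & m )) -eq $(( $(ip2int "$2") & m )) ]
}

# same_net 172.17.11.1 172.17.11.20 255.255.255.255   fails: a /32 covers only itself
# same_net 172.17.11.1 172.17.11.20 255.255.255.0     succeeds
```

Under that assumption the hostname.bge0 line would read "inet alias 172.17.11.1 255.255.255.0".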


> Here is the routing table (with public ip and mac addresses changed or 
> obscured):
> 
> fw3# route -n show
> Routing tables
> 
> Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            x.x.x.109          UGS      261 23105124     -     8 dc0
224/4              127.0.0.1          URS        0        0 32768     8 lo0
10.2/16            10.2.0.1           UCn       31     3623     -     4 bge0
10.2.0.1           00:16:41:ed:dd:47  UHLl       0    26952     -     1 bge0
10.2.1.1           00:16:41:ed:dd:47  UHLl       0   175419     -     1 bge0
10.2.1.1/32        10.2.1.1           UCn        0        0     -     4 bge0
10.2.1.11          b4:fb:e4:2c:5b:4d  UHLc       0   249998     -     3 bge0
10.2.1.200         e8:36:17:6e:89:67  UHLc       0     3730     -     3 bge0
10.2.1.207         d0:d2:b0:0c:b9:41  UHLc       0   149944     -     3 bge0
10.2.1.208         38:89:2c:dd:5c:37  UHLc       0   179441     -     3 bge0
10.2.1.213         34:08:bc:be:3f:c6  UHLc       0    39991     -     3 bge0
10.2.1.217         4c:57:ca:08:33:c8  UHLc       0     6704     -     3 bge0
10.2.1.221         b0:c0:90:4b:8c:f8  UHLc       1  1299001     -     3 bge0
10.2.1.226         78:8a:20:d6:e7:b8  UHLc       0     3626     -     3 bge0
10.2.1.243         64:c7:53:aa:68:85  UHLc       0     3720     -     3 bge0
10.2.1.245         28:ff:3c:52:6a:51  UHLc       0   171234     -     3 bge0
10.2.2.1           00:16:41:ed:dd:47  UHLl       0    46132     -     1 bge0
10.2.2.1/32        10.2.2.1           UCn        0        0     -     4 bge0
10.2.2.21          ec:b1:d7:f3:09:a9  UHLc       1   252761     -     3 bge0
10.2.2.31          ac:1f:6b:96:38:96  UHLc       1    11629     -     3 bge0
10.2.2.61          9c:93:4e:5c:b7:9e  UHLc       0   120968     -     3 bge0
10.2.2.62          9c:93:4e:2d:87:1f  UHLc       0     3833     -     3 bge0
10.2.2.101         18:60:24:e3:eb:a1  UHLc       0  1872476     -     3 bge0
10.2.2.102         18:60:24:e3:f4:80  UHLc       0  5944221     -     3 bge0
10.2.2.103         18:60:24:e3:f3:99  UHLc       0   409286     -     3 bge0
10.2.2.104         18:60:24:e3:fb:97  UHLc       0  1452694     -     3 bge0
10.2.2.105         64:51:06:2b:ba:8b  UHLc       0   559768     -     3 bge0
10.2.2.106         18:60:24:e3:f1:d2  UHLc       0   150568     -     3 bge0
10.2.2.107         64:51:06:2b:74:a3  UHLc       0   406897     -     3 bge0
10.2.2.108         18:60:24:e3:e0:63  UHLc       0  1759000     -     3 bge0
10.2.2.150         00:0b:82:c1:04:fb  UHLc       0    20780     -     3 bge0
10.2.2.155         00:0b:82:d0:28:0c  UHLc       0     3730     -     3 bge0
10.2.2.157         00:0b:82:d0:28:00  UHLc       0     3729     -     3 bge0
10.2.2.158         00:0b:82:d2:a9:aa  UHLc       0     3729     -     3 bge0
10.2.2.255         link#1             UHLc       0     3671     -     3 bge0
10.2.4.1           00:16:41:ed:dd:47  UHLl       0    75492     -     1 bge0
10.2.4.1/32        10.2.4.1           UCn        0        0     -     4 bge0
10.2.4.101         6c:62:6d:93:1e:66  UHLc       1  2203177     -     3 bge0
10.2.4.102         c8:60:00:75:f3:d1  UHLc       0    15808     -     3 bge0
10.2.4.103         bc:ae:c5:e2:15:eb  UHLc       0    95620     -     3 bge0
10.2.4.255         link#1             UHLc       0     3635     -     3 bge0
10.2.6.1           00:16:41:ed:dd:47

Re: Newer snapshots on ALIX

2019-06-19 Thread Claudio Jeker
On Wed, Jun 19, 2019 at 08:37:28AM +0200, Paul de Weerd wrote:
> Morning folks,
> 
> I ran into a problem after upgrading my ALIX to a more recent snapshot
> in that it won't boot anymore.  It gets to "entry point 0x2d0" and
> then stops.  I tried using the PXE bootloader to load the local kernel
> from disk (both bsd and bsd.rd) and to load kernels from tftp, but all
> fails in similar ways with the entry point being the last output.
> 
> I grabbed another ALIX to test, but I'm afraid I screwed that one up
> and now that one doesn't boot either anymore.  This is probably user
> error, but now I'd like to confirm: has anyone successfully upgraded
> their ALIX to a recent snapshot?
> 
> It could be that my hardware is dying on me (I should find my piggy
> bank for some nickels), so confirmation that this still works for
> others is appreciated.
> 

There were some boot(8) changes so try some older pxeboot from 6.4, 6.5 or
the snapshot archive to see when the breakage was introduced.

-- 
:wq Claudio



Re: "ucode too large"

2019-06-07 Thread Claudio Jeker
On Fri, Jun 07, 2019 at 03:43:39PM +0200, Paul de Weerd wrote:
> I've just replaced my home gateway with a brandless machine with an
> i5-7200U.  While preparing, I noticed the message "ucode too large"
> scrolling by on the serial console, just before the kernel starts.
> 
> The dmesg shows cpu0 as mode 06-8e-09:
> 
> cpu0: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz, 2395.19 MHz, 06-8e-09
> 
> While /etc/firmware/intel/06-8e-09 is the biggest file in that
> directory (at 193kB), so this probably has something to do with that
> and the MDS "fun".
> 
> Machine works fine as far as I can tell (typing this mail over an SSH
> session through it).
> 

This should work if you are using a -current EFI boot(8) or try the
following diff for the BIOS boot(8). In both cases make sure you
installboot the new code.

-- 
:wq Claudio

Index: arch/amd64/stand/libsa/exec_i386.c
===
RCS file: /cvs/src/sys/arch/amd64/stand/libsa/exec_i386.c,v
retrieving revision 1.31
diff -u -p -r1.31 exec_i386.c
--- arch/amd64/stand/libsa/exec_i386.c  28 May 2019 17:38:02 -  1.31
+++ arch/amd64/stand/libsa/exec_i386.c  7 Jun 2019 13:58:05 -
@@ -226,7 +226,7 @@ ucode_load(void)
 		return;
 
 	buflen = sb.st_size;
-	if (buflen > 128*1024) {
+	if (buflen > 256*1024) {
 		printf("ucode too large\n");
 		return;
 	}
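
The size check that this diff relaxes is easy to reproduce from userland. A sketch, assuming the size limit is the only thing rejecting the file; ucode_fits is a hypothetical helper and the limit is given in KB:

```shell
# Succeed if the file is no larger than the given limit in KB.
# The default of 128 is the bound before the diff; the diff raises it to 256.
ucode_fits() {
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le $(( ${2:-128} * 1024 )) ]
}

# Example: ucode_fits /etc/firmware/intel/06-8e-09 256 && echo "fits"
```

A file of roughly 193kB, as reported above, fails the 128 check and passes the 256 one.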



Re: OpenBSD on thinkpad x280

2019-05-25 Thread Claudio Jeker
On Sat, May 25, 2019 at 03:53:03PM +0100, Maurice McCarthy wrote:
> On 25/05/2019, Timo Myyrä  wrote:
> > Tristan Pilat  writes:
> >
> >> Hi OpenBSD users and devs!
> >>
> >> I got a new laptop in January, a thinkpad x280. At that time my system
> >> running 'current' was very slow and I assumed the video acceleration
> >> wasn't working so I just sadly stuck with Debian for a while. I then
> >> saw that an update of the inteldrm landed in current a month ago or so
> >> so I tried yesterday to reinstall current. Unfortunately the system is
> >> still barely usable. Could you guys tell me why the video acceleration
> >> isn't handled? Isn't Kaby lake compatible for now? I saw this article
> >> (https://jcs.org/2017/05/22/xiaomiair) which says it is.
> >>
> 
> You may have to adjust the aperture
> See /etc/examples/sysctl.conf
> 
> #machdep.allowaperture=2  # See xf86(4)
> 

Nope. That does not help. I bet the issue has nothing to do with
inteldrm. It is most probably an interrupt storm caused by
Thunderbolt 3. At least that seems to be something people have
complained about.

-- 
:wq Claudio



Re: need docs about udp buffer size

2019-05-16 Thread Claudio Jeker
On Thu, May 16, 2019 at 12:18:53PM +0300, kasak wrote:
> Hello! I have a litle problem with my unbound:
> 
> unbound: notice: sendto failed: No buffer space available
> 
> I think, I should increase net.inet.udp.sendspace, but I don't really
> understand what size do i need.
> 
> Is there any information about calculating needed buffer space?

It is probably not net.inet.udp.sendspace, since that value only affects
how big a packet you can send over UDP. The send buffer is only used to
move the packet to the kernel and is empty after every send.
Please check a) if there are any failures to allocate mbufs (netstat -m
and vmstat -m) and b) interface errors (netstat -i).

-- 
:wq Claudio



Re: post-6.5-upgrade bgpd(8) problem

2019-05-09 Thread Claudio Jeker
On Thu, May 09, 2019 at 10:58:54AM -0500, Adam Thompson wrote:
> I've upgraded my looking glass from 6.4 to 6.5, and an experiencing an
> unexpected problem - routes learned from one (iBGP) peer are not being
> automatically exported to other (eBGP) peers.
> 
> I did not change /etc/bgpd.conf, but behaviour seems to have changed
> nonetheless.  The upgrade from 6.4 to 6.5 appeared smooth otherwise, nothing
> to suggest subtle breakage under the hood.
> 
> As you can see below, this bgpd.conf is almost so simple it *can't* have
> problems.  Apparently "almost" being the operative word.
> 
> Under 6.4, this behaved as though "export none" only applied to the first
> group.  Under 6.5, it behaves as though "export none" is a global setting.
> 
> Under 6.5, bgpctl show produces:
> root@bgpmirror:~# bgpctl sh
> Neighbor              AS  MsgRcvd  MsgSent  OutQ  Up/Down   State/PrfRcvd
> Hermes IPv4        16796   128773       85     0  00:41:40  753748
> Hermes IPv6        16796    29727       85     0  00:41:40  68228
> MBNOG IPv4         65204       97       85     0  00:41:40  0
> MBNOG IPv6         65204       97       85     0  00:41:40  0
> BGPMon.io IPv4      6447        0       21     0  Never     Active
> isolario.it IPv4   65517       86       85     0  00:41:39  0
> isolario.it IPv6   65517       86       85     0  00:41:39  0
> and the operator of the MBNOG route collector confirms that I stopped
> sending him a full routing table at the same time I did the OS upgrade.
> 
> Any ideas?  What other information would help diagnose this problem?
> 
> Thanks,
> -Adam
> 
> 
> 
> Dmesg & bgpd.conf:
> https://gist.github.com/athompso/e334d8621ce458925e25bb44b8068341
> 
> 
> bgpd.conf, duplicated here for convenience:
> 
>   ===BOF===
>   route-collector yes

You have route-collector turned on, which disables the decision process,
so no prefix will be selected and sent out. This is the way it is
supposed to work. Your setup is not a route-collector.
In 6.4 route-collector mode was broken (as in you could not turn it on)
and I fixed this. That is why you noticed the behaviour change.

>   socket "/var/www/run/bgpd.rsock" restricted # for bgplg
> 
>   # settings
>   nexthop qualify via default
>   fib-update no
>   dump table-v2 "/var/www/htdocs/bgplg/mrt/rib-dump.mrt" 3600
>   dump updates in "/var/www/htdocs/bgplg/mrt/updates-in-%H%M" 300
>   dump all in "/var/www/htdocs/bgplg/mrt/all-in-%H%M" 300
> 
>   # myself
>   AS X
>   router-id X.X.X.X
> 
>   # neighbors
> 
>   group hermes {
>   enforce local-as no
>   enforce neighbor-as no
>   export none
> 
>   neighbor X.X.X.X {
>   remote-as X
>   descr "Hermes IPv4"
>   }
>   neighbor X:X:X:X::X {
>   remote-as X
>   descr "Hermes IPv6"
>   }
>   }
> 
>   group bgpresearch {
>   multihop 32
>   enforce local-as no
>   enforce neighbor-as no
> 
>   neighbor 192.160.102.196 {
>   remote-as 65204
>   descr "MBNOG IPv4"
>   }
>   neighbor 2620:132:3003:300::196 {
>   remote-as 65204
>   descr "MBNOG IPv6"
>   }
>   neighbor 129.82.138.6 {
>   remote-as 6447
>   descr "BGPMon.io IPv4"
>   }
>   neighbor 146.48.78.12 {
>   remote-as 65517
>   descr "isolario.it IPv4"
>   }
>   neighbor 2a00:1620:c0:4e:146:48:78:12 {
>   remote-as 65517
>   descr "isolario.it IPv6"
>   }
>   }
> 
>   # policies
>   allow quick from group hermes
>   allow quick to group bgpresearch
>   ===EOF===
> 
> (if you want to see the unredacted version of bgpd.conf, ask and I'll email
> it directly to you, I just don't want internal addresses in the public
> archive.)
> 

-- 
:wq Claudio



Re: bgpd acting up, dropping connected/static network statements

2019-05-03 Thread Claudio Jeker
On Fri, May 03, 2019 at 11:52:07AM +0200, open...@kene.nu wrote:
> Much appreciated, will test. Did this also affect previous versions
> (specifically thinking about 6.3 and 6.4)?

No. This code was changed after 6.4.
 
> On Fri, May 3, 2019 at 11:43 AM Claudio Jeker  
> wrote:
> >
> > On Fri, May 03, 2019 at 09:59:40AM +0200, open...@kene.nu wrote:
> > > Hello,
> > >
> > > I am seeing strange behaviour of bgpd in 6.5.
> > >
> > > Not sure what causes the networks in bgpd to disappear but they do
> > > disappear and performing a netstart pick the network back up again in
> > > bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor
> > > seems to be restarting the bgpd process.
> > >
> > > Excerpt form the daemon logs (bgpd restart or reload):
> > > May  3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4
> > > (LOCAL) AS64712: announce 10.1.150.0/24
> > > May  3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4
> > > (LOCAL) AS64712: withdraw announce 10.1.150.0/24
> > >
> > > If one performs a netstart, of relevant vlan interfaces, the
> > > announcements seem to survive a bgpd reload. Static routes never
> > > survive a restart or reload.
> > >
> > > Some additional commands to show behaviour:
> > > # uname -a
> > > OpenBSD host 6.5 GENERIC.MP#3 amd64
> > > # ifconfig vlan190
> > > vlan190: flags=8943 mtu 
> > > 1500
> > > lladdr 
> > > index 33 priority 0 llprio 3
> > > encap: vnetid 190 parent em0 txprio packet
> > > groups: vlan
> > > media: Ethernet autoselect (1000baseT full-duplex,master)
> > > status: active
> > > inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255
> > > # grep connected /etc/bgpd.conf
> > > network inet connected set community 65000:64712
> > > # bgpctl sh ip bgp 10.1.150.0/24
> > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
> > >S = Stale, E = Error
> > > origin validation state: N = not-found, V = valid, ! = invalid
> > > origin: i = IGP, e = EGP, ? = Incomplete
> > >
> > > flags ovs destination  gateway  lpref   med aspath origin
> > > # sh /etc/netstart vlan150
> > > # bgpctl sh ip bgp 10.1.150.0/24
> > > flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
> > >S = Stale, E = Error
> > > origin validation state: N = not-found, V = valid, ! = invalid
> > > origin: i = IGP, e = EGP, ? = Incomplete
> > >
> > > flags ovs destination  gateway  lpref   med aspath origin
> > > AI*>    N 10.1.150.0/24        0.0.0.0           100     0 i
> > >
> > >
> > > My bgpd.conf:
> > > # GLOBALS
> > > AS 64712
> > > router-id 172.30.198.4
> > > holdtime 9
> > > log updates
> > >
> > > prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15,
> > > 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24,
> > > 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 }
> > >
> > > # DEFAULT FILTERING
> > > deny from any
> > > deny to any
> > >
> > > # NETWORK STATEMENTS
> > > network 172.30.198.4/32 set community 65000:64712
> > > network inet connected set community 65000:64712
> > > network inet static set community 65000:64712
> > >
> > > # NEIGHBORS
> > > group "vpn" {
> > > announce IPv6 none
> > > route-reflector
> > > remote-as 64712
> > >
> > > neighbor 10.1.230.9 {
> > > descr "vpn1"
> > > }
> > > neighbor 10.1.230.10 {
> > > descr "vpn2"
> > > }
> > > }
> > >
> > > # SOURCE FILTERING
> > > allow to group "vpn" prefix-set internal community 65000:64712
> > > # DEST FILTERING
> > > allow from group "vpn" prefix-set internal
> > > # TRAFFIC ENGINEERING
> > > match to group "vpn" set nexthop 10.1.230.12
> > > match to any prefix 172.30.198.4/32 set nexthop self
> > >
> >
> > Thanks for the detailed report. A quick workaround is to reload the config
> > twice. Then the networks are added again. The proper fix is attached.
> >
> > The problem was that when already present networks were readded the
>

Re: bgpd acting up, dropping connected/static network statements

2019-05-03 Thread Claudio Jeker
On Fri, May 03, 2019 at 09:59:40AM +0200, open...@kene.nu wrote:
> Hello,
> 
> I am seeing strange behaviour of bgpd in 6.5.
> 
> Not sure what causes the networks in bgpd to disappear but they do
> disappear and performing a netstart pick the network back up again in
> bgpd. I cannot see this in either 6.4 or 6.3. One triggering factor
> seems to be restarting the bgpd process.
> 
> Excerpt form the daemon logs (bgpd restart or reload):
> May  3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4
> (LOCAL) AS64712: announce 10.1.150.0/24
> May  3 07:44:25 host bgpd[94972]: Rib Loc-RIB: neighbor 172.30.198.4
> (LOCAL) AS64712: withdraw announce 10.1.150.0/24
> 
> If one performs a netstart, of relevant vlan interfaces, the
> announcements seem to survive a bgpd reload. Static routes never
> survive a restart or reload.
> 
> Some additional commands to show behaviour:
> # uname -a
> OpenBSD host 6.5 GENERIC.MP#3 amd64
> # ifconfig vlan190
> vlan190: flags=8943 mtu 1500
> lladdr 
> index 33 priority 0 llprio 3
> encap: vnetid 190 parent em0 txprio packet
> groups: vlan
> media: Ethernet autoselect (1000baseT full-duplex,master)
> status: active
> inet 10.1.150.2 netmask 0xff00 broadcast 10.1.150.255
> # grep connected /etc/bgpd.conf
> network inet connected set community 65000:64712
> # bgpctl sh ip bgp 10.1.150.0/24
> flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
>S = Stale, E = Error
> origin validation state: N = not-found, V = valid, ! = invalid
> origin: i = IGP, e = EGP, ? = Incomplete
> 
> flags ovs destination  gateway  lpref   med aspath origin
> # sh /etc/netstart vlan150
> # bgpctl sh ip bgp 10.1.150.0/24
> flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
>S = Stale, E = Error
> origin validation state: N = not-found, V = valid, ! = invalid
> origin: i = IGP, e = EGP, ? = Incomplete
> 
> flags ovs destination  gateway  lpref   med aspath origin
> AI*>    N 10.1.150.0/24        0.0.0.0           100     0 i
> 
> 
> My bgpd.conf:
> # GLOBALS
> AS 64712
> router-id 172.30.198.4
> holdtime 9
> log updates
> 
> prefix-set internal { 10.0.0.0/8 prefixlen >= 16, 10.60.0.0/15,
> 172.20.0.0/16 prefixlen <= 32, 172.29.0.0/16 prefixlen >= 24,
> 172.29.248.10/31 prefixlen = 32, 172.30.0.0/16 prefixlen >= 24 }
> 
> # DEFAULT FILTERING
> deny from any
> deny to any
> 
> # NETWORK STATEMENTS
> network 172.30.198.4/32 set community 65000:64712
> network inet connected set community 65000:64712
> network inet static set community 65000:64712
> 
> # NEIGHBORS
> group "vpn" {
> announce IPv6 none
> route-reflector
> remote-as 64712
> 
> neighbor 10.1.230.9 {
> descr "vpn1"
> }
> neighbor 10.1.230.10 {
> descr "vpn2"
> }
> }
> 
> # SOURCE FILTERING
> allow to group "vpn" prefix-set internal community 65000:64712
> # DEST FILTERING
> allow from group "vpn" prefix-set internal
> # TRAFFIC ENGINEERING
> match to group "vpn" set nexthop 10.1.230.12
> match to any prefix 172.30.198.4/32 set nexthop self
> 

Thanks for the detailed report. A quick workaround is to reload the config
twice. Then the networks are added again. The proper fix is attached.

The problem was that when already present networks were re-added, the
function kr_net_redist_add() returned 0, which was propagated to
kr_net_match(), which then caused kr_redistribute() to actually remove the
prefix.

I changed the code to only return 0 when the network being added is
actually shadowed by another one and this prefix should therefore be
removed. While there I also fixed a memory leak ;)

Please test.
-- 
:wq Claudio

Index: kroute.c
===
RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
retrieving revision 1.235
diff -u -p -r1.235 kroute.c
--- kroute.c	7 Mar 2019 07:42:36 -	1.235
+++ kroute.c	3 May 2019 09:32:10 -
@@ -1230,19 +1230,19 @@ kr_net_redist_add(struct ktable *kt, str
 
 	xr = RB_INSERT(kredist_tree, &kt->kredist, r);
 	if (xr != NULL) {
-		if (dynamic == xr->dynamic || dynamic) {
+		free(r);
+
+		if (dynamic != xr->dynamic && dynamic) {
 			/*
-			 * ignore update, equal announcement already present,
-			 * or a non-dynamic announcement is already present
-			 * which has preference.
+			 * ignore update a non-dynamic announcement is
+			 * already present which has preference.
 			 */
-			free(r);
 			return 0;
 		}
 		/*
-		 * only the case where xr->dynamic == 1 and dynamic == 0
-		 * ends up here and in this case non-dynamic announcments
-		 * are preferred. Override dynamic flag.
+		 * only 

Re: Reflected IBGP VPNv4 Routes overstaying their welcome

2019-04-09 Thread Claudio Jeker
On Mon, Apr 08, 2019 at 05:08:32PM -0400, Henry Bonath wrote:
> Hello, I am seeing some BGP VPNv4 routes staying populated in
> the RIB of route-reflector clients even after dropping the originating 
> neighbor.
> 
> I'm on OpenBSD 6.4, running MPLS L3VPN.
> 
> I have 2 IBGP route-reflectors, both OpenBSD 6.4.
> I run OSPF to distribute Loopbacks into an Area (100)
> We run Cisco devices for our Provider Edge installed on site at
> Customer Premise.
> All MPLS PE devices neighbor with both route reflectors.
> 
> My bgpd.conf from the route reflectors:
> ===
> ASN="64670"
> 
> # global configuration
> AS $ASN
> router-id 172.16.16.212
> nexthop qualify via default
> 
> group "IBGP" {
> remote-as $ASN
> announce IPv4 vpn
> route-reflector 172.16.16.212
> local-address 172.16.16.212
> neighbor 100.92.64.0/18 {
> }
> 
> }
> 
> # IBGP: allow all updates to and from our IBGP neighbors
> allow from any
> allow to any
> ===
> 
> bgpd.conf from an OpenBSD PE:
> ===
> ASN="64670"
> 
> # global configuration
> AS $ASN
> router-id 100.92.127.121
> 
> rdomain 2 {
> rd 64670:37
> import-target rt 64670:37
> export-target rt 64670:37
> # advertise summary of tenant Subnet:
> network 172.29.21.0/24
> 
> # Redistribute from OSPF (Priority 32)
> network inet priority 32
> depend on mpe1
> }
> 
> group "IBGP" {
> remote-as $ASN
> announce IPv4 vpn
> set rtlabel FROM_BGP
> local-address 100.92.127.121
> neighbor 172.16.16.211 {
> descr "bgp-rr-01"
> }
> neighbor 172.16.16.212 {
> descr "bgp-rr-02"
> }
> 
> }
> 
> # IBGP: allow all updates to and from our IBGP neighbors
> allow from ibgp
> allow to ibgp
> 
> ===
> 
> The problem comes if I shutdown one of my Premise equipment PE
> devices, or an OpenBSD PE,
> on the other OpenBSD PEs that remain up, they still show the routes
> that were advertised by the
> now shutdown device.
> 
> If I log into a route reflector and run a "bgpctl show rib" those
> routes are no longer there as i expected,
> though they persist at the OpenBSD reflector clients.
> 
> Example output after shutting down the 100.92.127.21 Cisco PE observed
> from the OpenBSD PE
> that is listening to 64670:37 rt/rd:
> 
> flags: * = Valid, > = Selected, I = via IBGP, A = Announced,
>S = Stale, E = Error
> origin validation state: N = not-found, V = valid, ! = invalid
> origin: i = IGP, e = EGP, ? = Incomplete
> 
> flags ovs destination  gateway  lpref   med aspath origin
> I*> N rd 64670:37 192.168.11.0/24 100.92.127.21  100 2 ?
> I*  N rd 64670:37 192.168.11.0/24 100.92.127.21  100 2 ?
> I*> N rd 64670:37 192.168.15.0/24 100.92.127.21  100 2 ?
> I*  N rd 64670:37 192.168.15.0/24 100.92.127.21  100 2 ?
> I*> N rd 64670:37 192.168.20.0/24 100.92.127.21  100 3 ?
> I*  N rd 64670:37 192.168.20.0/24 100.92.127.21  100 3 ?
> I*> N rd 64670:37 192.168.100.0/24 100.92.127.21  100 2 ?
> I*  N rd 64670:37 192.168.100.0/24 100.92.127.21  100 2 ?
> I*> N rd 64670:37 192.168.110.0/24 100.92.127.21  100 3 ?
> I*  N rd 64670:37 192.168.110.0/24 100.92.127.21  100 3 ?
> I*> N rd 64670:37 192.168.150.0/24 100.92.127.21  100 2 ?
> I*  N rd 64670:37 192.168.150.0/24 100.92.127.21  100 2 ?
> I*> N rd 64670:37 192.168.200.0/24 100.92.127.21  100 2 ?
> I*  N rd 64670:37 192.168.200.0/24 100.92.127.21  100 2 ?
> 
> Shouldn't those routes disappear once the 100.92.127.21 router is shutdown?
> 
> Thanks for any help you all  have to offer!

Are you able to test this with -current? There were some fixes and changes
done for MPLS VPN support. I have the feeling that this may already be
fixed. Also I would disable graceful restart on the RR with 'announce
restart no' for the template.

-- 
:wq Claudio



Re: openbgpd; strip private ASNs from bgp updates

2019-03-31 Thread Claudio Jeker
On Fri, Mar 29, 2019 at 08:36:26AM +0100, open...@kene.nu wrote:
> I forgot to add to my previous email. One thing that could be useful
> in this case is to mimic the Cisco option "neighbor x.x.x.x
> remove-private-as" which removes any private ASes from the path on any
> updates to a peer.  Just throwing it out there, cant be a very
> difficult option to implement I guess?

I think changing the AS path is a bad thing: removing elements from your
AS path has a major impact on route selection and opens the door to
routing loops. In general I will only add features like 'as-override' when
there is a clear reason why they are needed.
So my question is, why do you need to use private AS numbers in your
internal network?
 
> On Thu, Mar 28, 2019 at 2:55 PM  wrote:
> >
> > That will indeed help. Will check it out.
> >
> > How I have solved it now is by having network statements on the edge
> > (/24s). To make the internal routing work I announce more specific
> > prefixes from the internal router, so externally I announce a /24
> > (from edge to peering partners) but internally I announce two /25s
> > (from internal to edge). That way internet knows how to find my /24
> > and edge knows how to find its way internally due to /25 being more
> > specific compared to /24.
> >
> > On Wed, Mar 27, 2019 at 9:33 PM Sebastian Benoit  
> > wrote:
> > >
> > > open...@kene.nu(open...@kene.nu) on 2019.03.27 12:25:33 +0100:
> > > > Hello,
> > > >
> > > > That would unforunately affect all the prefixes announced to the edge
> > > > router from the internal router. I need it to be only prefixes
> > > > announced to my peering partners.
> > > >
> > > > /Oscar
> > > >
> > > > On Tue, Mar 26, 2019 at 3:50 PM Denis Fondras  
> > > > wrote:
> > > > >
> > > > > On Tue, Mar 26, 2019 at 02:54:38PM +0100, open...@kene.nu wrote:
> > > > > > Hello,
> > > > > >
> > > > > > Is there a way to make openbgpd strip private ASNs from updates it
> > > > > > sends to certain neighbors?
> > > > > > I am using openbgpd on my edge routers and distribute routes 
> > > > > > generated
> > > > > > internally to the rest of the world. However, the internal routers 
> > > > > > use
> > > > > > private ASNs and this is obviously frowned upon by my peering
> > > > > > partners.
> > > > > >
> > > > > > I can of course have network statements on my edge routers but that
> > > > > > assumes the prefixes will always be reachable via said edge router,
> > > > > > something I can never be certain of. I would rather the updates rely
> > > > > > on the prefix actually being announced from the source.
> > > > > >
> > > > >
> > > > > Perhaps with transparent-as ?
> > >
> > > In current (snapshots) there is "as-override":
> > >
> > >  as-override (yes|no)
> > >  If set to yes, all occurrences of the neighbor AS in the AS
> > >  path will be replaced with the local AS before running the
> > >  filters.  The Adj-RIB-In still holds the unmodified AS path.
> > >  The default value is no.
> > >
> > > this is a neighbor option and used on the session to a peer that uses a
> > > private AS.
> > >
> > > You dont say much about your network structure, but if your edge router 
> > > has
> > > a normal As number, and your internal ebgp peers have private As numbers,
> > > this option will help.
> > >
> > > /Benno
> > >
> 

-- 
:wq Claudio



Re: serial console images for installing on vmd based guests

2019-03-13 Thread Claudio Jeker
On Tue, Mar 12, 2019 at 11:48:01PM -0700, Mike Larkin wrote:
> On Tue, Mar 12, 2019 at 05:37:04PM -0700, Chris Cappuccio wrote:
> > Is there any archive of serial console bootable images (w/virtio support)
> > for Linux or other OSes to boot under vmd?
> > 
> 
> You mean installer images? Like things you would install from? Tons.
> 
> If you're talking about pre-installed full OSes, it's unlikely.
> 

Debian still does not manage to ship install media with virtio
support. While it is possible to install via serial console, the installer
fails to detect disks and network interfaces.

-- 
:wq Claudio



Re: purpose of bgpd.conf dump "timeout" parameter?

2019-02-08 Thread Claudio Jeker
On Fri, Feb 08, 2019 at 03:56:12PM -0600, Adam Thompson wrote:
> In bgpd.conf(5), for the "dump" directive there is an optional "timeout"
> parameter.  What is its purpose?  I assume from the examples that it's
> denominated in seconds...

Yes it is.
 
> my first guess was to time out on attempting to write to the dump file, but
> that doesn't seem realistic.  It looks like it's a cycle, i.e. the dump file
> will be recreated every X seconds, but if so, why is it called "timeout" and
> not "interval" ?

Because naming is hard :)
 
> I see in the source that MRT_MAX_TIMEOUT is set to 7200 - does this mean
> that if I leave the parameter unset, the MRT file will be re-dumped every 2
> hrs?

No, it just limits the poll timeout; the dump will only fire once the
configured time has really run out.
 
> Yes, I'll produce a patch for the manpage if someone can explain what the
> parameter is supposed to do / how it works.

After timeout seconds the file is reopened (or a new file is opened,
depending on the strftime expansion of the filename). For "all" and
"updates" dumps this just puts new messages into a new file. For table
dumps it will issue a new dump, e.g.
dump table "/tmp/rib-dump-%H%M" 300
will create a new table dump every 5 minutes.

Hope that helps.
-- 
:wq Claudio


