Re: wg: count peers per interface not globally

2020-09-02 Thread Matt Dunwoodie
On Tue, 1 Sep 2020 20:57:09 +0200
Klemens Nanni  wrote:

> The driver increases a static peer counter across all wg interfaces
> when creating peers, the peer number is only used in debug output,
> though.
> 
> Feedback? OK?

High level, I understand the problem and this makes debugging easier.
It shouldn't affect multiple interfaces because it is only used for
debugging, and there is a wgN prefixed to the log message. Overall,
seems fine to me.

The patch isn't fully correct. When you remove a peer, sc_peer_num
will decrement, and if you add a new peer afterwards you will have
duplicate IDs. This is likely to create further headaches. A dedicated
ID counter in wg_softc should probably be used.

> Index: if_wg.c
> ===
> RCS file: /cvs/src/sys/net/if_wg.c,v
> retrieving revision 1.13
> diff -u -p -r1.13 if_wg.c
> --- if_wg.c   27 Aug 2020 21:27:17 -  1.13
> +++ if_wg.c   1 Sep 2020 18:35:41 -
> @@ -370,8 +370,6 @@ int   wg_clone_create(struct if_clone *, i
>  int  wg_clone_destroy(struct ifnet *);
>  void wgattach(int);
>  
> -uint64_t peer_counter = 0;
> -uint64_t keypair_counter = 0;
>  struct pool  wg_aip_pool;
>  struct pool  wg_peer_pool;
>  struct pool  wg_ratelimit_pool;
> @@ -398,7 +396,6 @@ wg_peer_create(struct wg_softc *sc, uint
>   if ((peer = pool_get(_peer_pool, PR_NOWAIT)) == NULL)
>   return NULL;
>  
> - peer->p_id = peer_counter++;
>   peer->p_sc = sc;
>  
>   noise_remote_init(>p_remote, public, >sc_local);
> @@ -442,7 +439,7 @@ wg_peer_create(struct wg_softc *sc, uint
>   rw_enter_write(>sc_peer_lock);
>   LIST_INSERT_HEAD(>sc_peer[idx], peer, p_pubkey_entry);
>   TAILQ_INSERT_TAIL(>sc_peer_seq, peer, p_seq_entry);
> - sc->sc_peer_num++;
> + peer->p_id = sc->sc_peer_num++;
>   rw_exit_write(>sc_peer_lock);
>  
>   DPRINTF(sc, "Peer %llu created\n", peer->p_id);
> 



wg(4) if_rtrequest

2020-08-25 Thread Matt Dunwoodie
Hi,

I doing some IPv6 setup, I came across an issue with wg(4) and ndp. The
local route is created with RTF_LLINFO, which ndp attempts to print. As
wg is a layer3 tunnel it won't have any link-local information.

This patch just sets if_rtrequest to p2p_rtrequest. Even if wg is
technically point-to-multipoint, p2p_rtrequest has the correct
semantics. ok?

The following snippet shows the symptoms: 

# ifconfig wg0 create  
# ifconfig wg0 inet6 fc00::1/64  
# ndp -an
Neighbor Linklayer Address   Netif ExpireS Flags
ndp: ioctl(SIOCGNBRINFO_IN6): Invalid argument
ndp: failed to get neighbor information
fc00::1

diff --git net/if_wg.c net/if_wg.c
index e5a1071ccf1..a1eaecf129f 100644
--- net/if_wg.c
+++ net/if_wg.c
@@ -2655,6 +2655,7 @@ wg_clone_create(struct if_clone *ifc, int unit)
ifp->if_output = wg_output;
 
ifp->if_type = IFT_WIREGUARD;
+   ifp->if_rtrequest = p2p_rtrequest;
 
if_attach(ifp);
if_alloc_sadl(ifp);



Re: m_defrag(9) leak

2020-08-25 Thread Matt Dunwoodie
On Tue, 25 Aug 2020 08:54:10 +0200
Claudio Jeker  wrote:

> On Tue, Aug 25, 2020 at 08:42:47AM +0200, Martin Pieuchot wrote:
> > Maxime Villard mentioned a leak due to a missing m_freem() in wg(4):
> > https://marc.info/?l=netbsd-tech-net=159827988018641=2
> > 
> > It seems to be that such leak is present in other uses of
> > m_defrag() in the tree.  I won't take the time to go though all of
> > them, but an audit would be welcome.  
> 
> Why does wg(4) needs m_defrag? and why is it used in this way? It
> does not even check if m_defrag() is actually needed.
>
> m_defrag() should be used by HW drivers where DMA constraints require
> the mbuf to be flattened. Software dirvers should use m_pullup /
> m_pulldown etc instead.

wg(4) uses m_defrag to bring all mbuf fragments into one, so it can
read handshake values, and decrypt data directly from the mbuf. In
both cases it needs the full length of the packet.

However you're right, it doesn't check if it needs defrag and yes
m_pullup is preferable. I have a new patch for m_pullup and includes a
comment on why it's done. Thoughts?

There is one other case of "m_pullup(m, m->m_pkthdr.len)" in sppp_input.

(To avoid any confusion with the previous patch, there is no need to
m_free(m) as m_pullup will do that for us.)

diff --git if_wg.c if_wg.c
index e5a1071ccf1..242bd105e72 100644
--- if_wg.c
+++ if_wg.c
@@ -2022,7 +2022,13 @@ wg_input(void *_sc, struct mbuf *m, struct ip *ip, 
struct ip6_hdr *ip6,
/* m has a IP/IPv6 header of hlen length, we don't need it anymore. */
m_adj(m, hlen);
 
-   if (m_defrag(m, M_NOWAIT) != 0)
+   /*
+* Ensure mbuf is contiguous over full length of packet. This is done
+* so we can directly read the handshake values in wg_handshake, and so
+* we can decrypt a transport packet by passing a single buffer to
+* noise_remote_decrypt in wg_decap.
+*/
+   if ((m = m_pullup(m, m->m_pkthdr.len)) == NULL)
return NULL;
 
if ((m->m_pkthdr.len == sizeof(struct wg_pkt_initiation) &&



Re: wg(4): encapsulated transport checksums

2020-06-30 Thread Matt Dunwoodie
On Tue, 30 Jun 2020 20:40:10 -0600
"Theo de Raadt"  wrote:

> Matt Dunwoodie  wrote:
> 
> > Depends on your definition of significant, I've seen 1-3% throughput
> > improvement without the patch.  
> 
> > Real networks require statistics, which you want to throw away.  
> 
> > Overall, it is still debatable whether to skip the IPv4 checksum as
> > modern crypto certainly offers better integrity checks. However, the
> > primary motivator for skipping the integrity checks is performance,
> > and the performance isn't severely impacted. Additionally, I can
> > sympathise with avoiding layer violations and bringing it inline
> > with other tunnels in this case.  
> 
> If it is debatable, why don't you debate it?  I don't see a debate.
> 
> Let me debate it.
> 
> The issue is not about integrity checks being needed, but about
> integrity check counters -- such counters are used to short-cut
> procedures during network diagostic failures in multi-configuration
> systems.
> 
> If a higher-level network overlay skips that counter updates for
> lower-levels, the counters are incorrect, now how do you diagnose
> quickly?  Well, you don't.

Before going any further, I should clarify that the outer packet
checksums are already checked (with or without this patch) in
ip_input_if and udp_input when being received on the lower-level
interface. Therefore, any lower-level link layer corruption to the
outer packet will be caught, dropped and counters incremented before
being passed to wg_input. Does that change any of your following points?

> It sounds like the overlay is being chosen and relevant as more
> important than the underlay.  Sorry to burst your bubble, but the
> overlay will never be the whole internet.  The underlay will persist
> for a long time, and the underlay will see errors.  But the counters
> indicating those erors will be *incoherent*.
> 
> To me, it seems your path leads to the inablity to diagnose underlying
> issues correctly and quickly
> 
> Are underlying issues suddenly absent, or rare enough, they don't need
> quick diagnosis?
> 
> Or do (all) overlay technologies now provide enough information
> access to make evaluation of underlying failures easy?
> 
> For those questions, in my experience, I don't think reality provides
> easy paths yet.
> 
> As I said, argue it from a non-wg diagnosis model.  If the argument is
> not convincing, we have to do the obvious right thing, even if it
> costs a small amount.
> 
> Honestly, i don't understand how you ended in the position you are.

Now, the "debatability" is about whether we want to check the inner
packet IPv4 checksum after successful decryption (not about counters).
The story may be different if you have any cases to add to the three
Jason sent through earlier. I said "debatable" because I still think
both sides are vaild, however when I weigh up the arguments I see
applying the patch as the right thing to do.

- Matt



Re: wg(4): encapsulated transport checksums

2020-06-30 Thread Matt Dunwoodie
On Tue, 30 Jun 2020 09:22:18 +1200 (NZST)
richard.n.proc...@gmail.com wrote:

> Hi Jason,
> 
> On 27/06/2020, at 10:09 PM, Jason A. Donenfeld 
> wrote:
> > [...] I thought I'd lay out my understanding of the matter,
> > and you can let me know whether or not this corresponds with what
> > you had in mind:
> > [...]  
> 
> My main concern is the end-to-end TCP or UDP transport checksum of the
> tunneled (or inner, or encapsulated) packet. Your argument seems to
> concern instead the per-hop IPv4 header checksum (which is also
> interesting to look at, but first things first).
> 
> As I read it, wg_decap() tells the stack to ignore the transport
> checksum of the tunneled packet, and I think this is incorrect for
> the following reason: the packet may have been corrupted outside of
> the tunnel, because the tunnel needn't cover the entire path the
> packet took through the network.
> 
> So a tunneled packet may be corrupt, and one addressed to the tunnel
> endpoint will be delivered with its end-to-end transport checksum
> unverified. Higher layers may receive corrupt data as as result.[*]  
> 
> (Routers, as intermediate network elements, are under no obligation to
> check end-to-end transport checksums. It isn't their job.)
> 
> One might argue (I do not) that this isn't a concern, because
> encryption these days is ubiquitous and comes with far stronger
> integrity checks. Nonetheless, even here transport checksums remain
> relevant for two reasons. Firsly, because networks remain unreliable:
> I see non-zero numbers of failed checksums on my systems. And,
> secondly, because higher layers require a reliable transport: TLS for
> instance will drop the connection when the MAC check fails[1].
> 
> Let's turn to the per-hop IPv4 checksum, which wg_decap() also tells
> the stack to ignore. I suspect the cost of verifying that checksum --
> 20 bytes on a hot cache line -- is negligible both absolutely and
> relative to the cost of decrypting the packet. And as the
> optimisation appears nowhere else in the tree that I can find,
> removing it would make wg(4) no worse than the other tunnels. Do you
> have evidence that the cost is in fact significant?

Depends on your definition of significant, I've seen 1-3% throughput
improvement without the patch.

Without patch:
--- 10.0.0.2 tcpbench statistics ---
4455088992 bytes sent over 121.001 seconds
bandwidth min/avg/max/std-dev = 122.940/294.737/324.080/30.782 Mbps

With patch:
--- 10.0.0.2 tcpbench statistics ---
4366145736 bytes sent over 120.999 seconds
bandwidth min/avg/max/std-dev = 123.908/288.718/329.770/31.518 Mbps

> Further, as Theo pointed out, the link layer has in any case no
> business with the IPv4 checksum, being part of the network layer.
> Layer violations are problematic for at least two reasons. Firstly,
> they constrict the design space. For instance, I suppose the IPv4
> checksum needn't be per-hop. It might be updated incrementally,
> increasing its scope. But this would be nullified by any link layer
> that, acting on a false assumption about the layer above it, told the
> stack to ignore that layer's checksum. Secondly, layer violations, an
> optimisation, which almost by definition rely on additional premises,
> burden the correctness obligation and so trade less work for the
> computer in narrow circumstances for (often much) more work for the
> fallible human to show that the process remains sound.
> 
> On that balance I now think that skipping the IPv4 check a false
> economy, too. Hopefully I have managed to make my reasons clearer
> this time around, please let me know if not, or if you disagree. 
> 
> See updated patch below, looking for oks to commit that.

Overall, it is still debatable whether to skip the IPv4 checksum as
modern crypto certainly offers better integrity checks. However, the
primary motivator for skipping the integrity checks is performance, and
the performance isn't severely impacted. Additionally, I can sympathise
with avoiding layer violations and bringing it inline with other
tunnels in this case.

I'm happy for the following patch to be committed.

- Matt

> cheers,
> Richard.
> 
> [*] By way of background, I understand the transport checksum was
> added to the early ARPANET primarily to guard against corruption
> inside its routers[0], after a router fault which "[brought] the
> network to its knees". In other words, end-to-end checks are
> necessary to catch faults in the routers; link-level checks aren't
> enough.
> 
> More generally, transport checksums set a lower bound on the
> reliability of the end-to-end transport, which decouples it from the
> reliability of the underlying network.
> 
> [0] Kleinrock "Queuing Systems, Volume 2: Computer Applications",
> John Wiley and Sons (1976), p441
> 
> [1] RFC5246 TLS v1.2 (2008), section 7.2.2.
> 
> TLS's requirement is reasonable: it would otherwise be obliged to
> reimplement for itself retransmission, which would drag with it other
> transport layer 

Re: use libc base64 code instead of libcrypt for ifconfig wg key handling

2020-06-21 Thread Matt Dunwoodie
On Mon, 22 Jun 2020 11:01:05 +1000
David Gwynne  wrote:

> libc has undocumented base64 encoding and decoding funtionality. this
> cuts ifconfig over to using it instead of the code in libcrypto.
> 
> whether the libc functionality should be "blessed" and documented is a
> separate issue.
> 
> ok?
> 
> Index: Makefile
> ===
> RCS file: /cvs/src/sbin/ifconfig/Makefile,v
> retrieving revision 1.16
> diff -u -p -r1.16 Makefile
> --- Makefile  21 Jun 2020 12:20:06 -  1.16
> +++ Makefile  21 Jun 2020 23:15:34 -
> @@ -4,7 +4,7 @@ PROG= ifconfig
>  SRCS=ifconfig.c brconfig.c sff.c
>  MAN= ifconfig.8
>  
> -LDADD=   -lutil -lm -lcrypto
> +LDADD=   -lutil -lm
>  DPADD=   ${LIBUTIL}
>  
>  .include 
> Index: ifconfig.c
> ===
> RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
> retrieving revision 1.422
> diff -u -p -r1.422 ifconfig.c
> --- ifconfig.c21 Jun 2020 12:20:06 -  1.422
> +++ ifconfig.c21 Jun 2020 23:15:35 -
> @@ -5673,14 +5673,12 @@ setifpriority(const char *id, int param)
>   * space.
>   */
>  #define WG_BASE64_KEY_LEN (4 * ((WG_KEY_LEN + 2) / 3))
> -#define WG_TMP_KEY_LEN (WG_BASE64_KEY_LEN / 4 * 3)
>  #define WG_LOAD_KEY(dst, src, fn_name) do {
>   \
> - uint8_t _tmp[WG_TMP_KEY_LEN];
>   \
> + uint8_t _tmp[WG_KEY_LEN]; int _r;
>   \ if (strlen(src) != WG_BASE64_KEY_LEN)
>   \ errx(1, fn_name " (key): invalid length");
>   \
> - if (EVP_DecodeBlock(_tmp, src,
>   \
> - WG_BASE64_KEY_LEN) != WG_TMP_KEY_LEN)
>   \
> - errx(1, fn_name " (key): invalid base64");
>   \
> + if ((_r = b64_pton(src, _tmp, sizeof(_tmp))) !=
> sizeof(_tmp)) \
> + errx(1, fn_name " (key): invalid base64 %d/%zu", _r,
> sizeof(_tmp));\ memcpy(dst, _tmp, WG_KEY_LEN);
>   \ } while (0)
>  
> @@ -5899,13 +5897,15 @@ wg_status(void)
>   if (wg_interface->i_flags & WG_INTERFACE_HAS_RTABLE)
>   printf("\twgrtable %d\n", wg_interface->i_rtable);
>   if (wg_interface->i_flags & WG_INTERFACE_HAS_PUBLIC) {
> - EVP_EncodeBlock(key, wg_interface->i_public,
> WG_KEY_LEN);
> + b64_ntop(wg_interface->i_public, WG_KEY_LEN,
> + key, sizeof(key));
>   printf("\twgpubkey %s\n", key);
>   }
>  
>   wg_peer = _interface->i_peers[0];
>   for (i = 0; i < wg_interface->i_peers_count; i++) {
> - EVP_EncodeBlock(key, wg_peer->p_public, WG_KEY_LEN);
> + b64_ntop(wg_peer->p_public, WG_KEY_LEN,
> + key, sizeof(key));
>   printf("\twgpeer %s\n", key);
>  
>   if (wg_peer->p_flags & WG_PEER_HAS_PSK)

looks good to me.

- Matt



Re: WireGuard patchset for OpenBSD, rev. 3

2020-06-21 Thread Matt Dunwoodie
On Sun, 21 Jun 2020 15:54:00 +0200
Matthieu Herrb  wrote:
> Hi,
> 
> I was wondering if there is a way to specify a routing domain/table
> for wgendpoint in ifconfig(8).
> 
> In a VPN client setup (roadwarrior style) I'd like to keep wg0 in
> rdomain 0 and put the actual physical interface in rdomain 1. So that
> all daemons (smtpd, unwind, ...) use the VPN by default and only the
> strict minimum to setup the VPN runs in rdomain 1.
> 
> Everything works if I set wg0 in rdomain1 and keep my re0 interface in
> rdomain 0, but as soon as I set rdomain 1 for re0 and rdomain 0 for
> wg0, the VPN cannot come up (and I see the UDP packets to port 51820
> trying to go out through wg0).

Yes, this is most certainly possible (I have this configuration in a
couple of places). If you haven't found it yet, the "wgrtable" option
(see ifconfig(8)) will allow you to achieve this.



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Matt Dunwoodie
On Wed, 27 May 2020 01:43:34 -0600
"Jason A. Donenfeld"  wrote:

> On Wed, May 27, 2020 at 1:34 AM Martin Pieuchot 
> wrote:
> > First question is, is it possible to use the wg(4) interface
> > without a port?  
> 
> No, that is not how WireGuard works. For details on the actual
> protocol particulars, please see
> https://www.wireguard.com/papers/wireguard.pdf . Our rev 1 submission
> has links to further links as well in the cover letter.

Oh, I hope we haven't gotten terminology mixed up. Yes, WireGuard
requires a UDP port, no it does not require any software from the ports
tree. All functionality is accessible from the base system as a "first
class citizen".

Matt



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Matt Dunwoodie
On Wed, 27 May 2020 09:34:53 +0200
Martin Pieuchot  wrote:

> Hello Matt,
> 
> Thank you for your submission.  

Hi Martin,

No worries, thank you for your feedback. This is something I want to
help make happen and if I recall correctly, someone once said that if I
wanted a new feature on OpenBSD then submit a patch :)

> On 26/05/20(Tue) 19:39, Matt Dunwoodie wrote:  
> > After some feedback and comments, we've addressed the concerns, and
> > fixed a few things from our side too. Overall the structure is
> > familiar with no major changes, so any prior readings mostly carry
> > over.
> 
> It's hard to review a such big diff.  If you're interested in getting
> feedbacks in the code itself I'd suggest you split it :)  

I do realise it is a bit unwieldy, but it is a fairly
integrated/cohesive patch. I'll have a think about splitting it up
if/when we get to a rev. 3.

> Regarding the kernel, I'd suggest you use "#if NWG > 0" like it is
> done for other pseudo-drives with 'needs-flag'.  

For the most part there is no significant changes to other parts of the
network stack, so I don't believe this should be necessary. If there is
anything in particular that you think should be flagged like that then
please do say.

> Can you re-use PACKET_TAG_GRE instead of introduce
> PACKET_TAG_WIREGUARD? The former is already used by multiple
> pseudo-drivers.  

As far as I can tell PACKET_TAG_GRE is only used in gif(4) and gre(4),
however they just use the tag to store the if_index. I also understand
m_pkthdr.ph_tagsset is limited on space so adding a new TAG type is not
a great idea therefore, I've repurposed PACKET_TAG_GIF to
PACKET_TAG_WIREGUARD.

In this case we store a little bit more information (and not the
if_index) and I would not want any kind of conflict between wg(4)
PACKET_TAG_GRE and gif(4)/gre(4) PACKET_TAG_GRE.

> Aren't wg_noise.c and wg_cookie.c meant to be used by if_wg.c only?
> Or would other parts of the kernel benefit from them?  If wg(4) is
> the only user I'm questionnings whether putting them in crypto/ is
> the best choice.  

Yes, I'm open to moving them somewhere, the were originally in crypto/
as they handled specifically the crypto side of WireGuard. The end
result though, I don't think they should be merged into if_wg.c as they
are designed to be separate to help auditing/review. Would it make
sense to have them in sys/net/?

> Why are you using rwlock to protect cookie_* and noise_* states?  Is
> any sleeping involved?  Did you consider a mutex with IPL_NONE? rwlock
> introduce sleep point that might be surprising.  Did you try the
> option WITNESS of the kernel to check your locking?  

Both the cookie_* and noise_* are expected to run concurrently, and I
wouldn't want one thread holding up another with a mutex while
performing (relatively) expensive crypto operations. Where possible and
correct, we use a read lock.

Running with WITNESS does not raise any errors, I've checked.

> Keeping comments away from source code, like you did in wg_cookie.h
> can result in them not being updated.  This isn't something common in
> the tree and after some times comments tend to lie :o)  

Makes sense, and if it isn't that common I'm more than happy to move
them.

> Would you mind using variables of more than one-letter?  That makes is
> harder to grep for patterns.  

I'll see what I can do :)
 
> If you prefix fields of variable, I'd suggest picking the firs letter
> of the two words, for example 'wt_' for "struct wg_tag" instead of
> "t_" currently.  Then grepping for 'wt_endpoint' is more likely to
> yield what we're looking for :)  

Very reasonable - will do.
 
> Is mq_push() required?  I'm always surprise to see mbuf APIs grow and
> grow over years.  Anyway, that could be a separate change.  

To be explicit, the behaviour of mq_push is required, that is: we want
to add a packet to the queue; if the queue is full we want to drop the
oldest packet in the queue.

I'm aware of APIs growing though, so understand the concern. While it
would be possible to emulate this behaviour with mq_full, and then some
combination of mq_dequeue and mq_enqueue the result would be racey and
may have unintended side effects. The end goal is that we want to
achieve the behaviour above atomically.

> ioctl(2) shouldn't require the NET_LOCK, so if your pseudo-driver uses
> its own locks to serialize access to its data, please say that in the
> comments for SIOCSWG and SIOCGWG.  

Yes, I'll add some comments.

> WG_PEERS_FOREACH{,SAFE} macros are only introducing abstraction and
> make code harder to understand.  

Absolutely, I agree here. It was a remnant from some refactoring.

> Please do not cast function pointer in task_set(), use the expected
> prototype.  

Noted, will fix.

> Why did you introduce wg_queue_*(), isn't mq_init(9) API 

Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Matt Dunwoodie
On Tue, 26 May 2020 13:28:22 +0200
Tobias Heider  wrote:

> Hi Matt,
> 
> just repeating what I commented yesterday for the new diff to make
> sure it isn't overlooked.

Thank you for repeating it, I didn't get around to addressing it before
the new diff.

> > +int
> > +wg_ioctl_get(struct wg_softc *sc, struct wg_data_io *data)
> > +{
> > +   struct wg_interface_io  *iface_p, iface_o;
> > +   struct wg_peer_io   *peer_p, peer_o;
> > +   struct wg_aip_io*aip_p;
> > +
> > +   struct wg_peer  *peer;
> > +   struct wg_aip   *aip;
> > +
> > +   size_t   size, peer_count, aip_count;
> > +   int  ret = 0, is_suser =
> > suser(curproc) == 0; +
> > +   size = sizeof(struct wg_interface_io);
> > +   if (data->wgd_size < size && !is_suser)
> > +   goto ret_size;
> > +
> > +   iface_p = data->wgd_interface;
> > +   bzero(_o, sizeof(iface_o));
> > +
> > +   rw_enter_read(>sc_lock);
> > +
> > +   if (sc->sc_udp_port != 0) {
> > +   iface_o.i_port = ntohs(sc->sc_udp_port);
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_PORT;
> > +   }
> > +
> > +   if (sc->sc_udp_rtable != 0) {
> > +   iface_o.i_rtable = sc->sc_udp_rtable;
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_RTABLE;
> > +   }
> > +
> > +   if (!is_suser)
> > +   goto out;
> > +
> > +   if (noise_local_keys(>sc_local, iface_o.i_public,
> > +   iface_o.i_private) == 0) {
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_PUBLIC;
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_PRIVATE;
> > +   }
> > +
> > +   if (((SIZE_MAX - size) / sizeof(struct wg_peer_io)) <
> > sc->sc_peer_num)
> > +   goto error;
> > +   size += sizeof(struct wg_peer_io) * sc->sc_peer_num;
> > +   if (((SIZE_MAX - size) / sizeof(struct wg_aip_io)) <
> > sc->sc_aip_num)
> > +   goto error;  
> 
> I still think those two should return an error.  'goto error' is
> misleading as it doesn't actually set ret != 0.  'error' should
> probably be renamed to 'cleanup' to avoid confusion and ret should be
> set to ERANGE .

I'll elaborate on this a bit. These goto errors are just checks for
integer overflows on size. Upon further consideration it probably
doesn't make sense if we are using a size_t. Therefore it makes sense
to remove these two checks.

> > +   size += sizeof(struct wg_aip_io) * sc->sc_aip_num;
> > +   if (data->wgd_size < size)
> > +   goto error;

This has been changed to "goto unlock_and_ret_size;". I think Jason
touched on it, but the problem is that if we return ERANGE, then
wg_data_io does not get copied back to userspace and the caller cannot
read the returned size. Therefore we need to return 0 to indicate no
error.

> > +   peer_count = 0;
> > +   peer_p = _p->i_peers[0];
> > +   WG_PEERS_FOREACH(peer, sc) {
> > +   bzero(_o, sizeof(peer_o));
> > +   peer_o.p_flags = WG_PEER_HAS_PUBLIC;
> > +   peer_o.p_protocol_version = 1;
> > +
> > +   if (noise_remote_keys(>p_remote,
> > peer_o.p_public,
> > +   peer_o.p_psk) == 0)
> > +   peer_o.p_flags |= WG_PEER_HAS_PSK;
> > +
> > +   if
> > (wg_timers_get_persistent_keepalive(>p_timers,
> > +   _o.p_pka) == 0)
> > +   peer_o.p_flags |= WG_PEER_HAS_PKA;
> > +
> > +   if (wg_peer_get_sockaddr(peer, _o.p_sa) == 0)
> > +   peer_o.p_flags |= WG_PEER_HAS_ENDPOINT;
> > +
> > +   mtx_enter(>p_counters_mtx);
> > +   peer_o.p_txbytes = peer->p_counters_tx;
> > +   peer_o.p_rxbytes = peer->p_counters_rx;
> > +   mtx_leave(>p_counters_mtx);
> > +
> > +   wg_timers_get_last_handshake(>p_timers,
> > +   _o.p_last_handshake);
> > +
> > +   aip_count = 0;
> > +   aip_p = _p->p_aips[0];
> > +   LIST_FOREACH(aip, >p_aip, a_entry) {
> > +   if ((ret = copyout(>a_data, aip_p,
> > sizeof(*aip_p))) != 0)
> > +   goto error;
> > +   aip_p++;
> > +   aip_count++;
> > +   }
> > +   peer_o.p_aips_count = aip_count;
> > +
> > +   if ((ret = copyout(_o, peer_p,
> > sizeof(peer_o))) != 0)
> > +   goto error;
> > +
> > +   peer_p = (struct wg_peer_io *)aip_p;
> > +   peer_count++;
> > +   }
> > +   iface_o.i_peers_count = peer_count;
> > +
> > +out:
> > +   if ((ret = copyout(_o, iface_p, sizeof(iface_o))) !=
> > 0)
> > +   goto error;
> > +error:
> > +   rw_exit_read(>sc_lock);
> > +   explicit_bzero(_o, sizeof(iface_o));
> > +   explicit_bzero(_o, sizeof(peer_o));
> > +ret_size:
> > +   data->wgd_size = size;
> > +   return ret;
> > +}  
> 



Re: WireGuard patchset for OpenBSD

2020-05-12 Thread Matt Dunwoodie
On Tue, 12 May 2020 17:36:15 +0200
Ingo Schwarze  wrote:

> Hi Matt,
> 
> thanks for doing all this work.  Note that i cannot provide feedback
> on the code or concepts, and also note that when the code is ready,
> a developer can commit it even if there are still issues to sort out
> with the documentation.  We do value good documentation, but the exact
> point in time when it is polished matters little.

Thanks for the feedback, I certainly appreciate it.

> All the same, i looked through the wg(4) manual page and improved
> a few details.  Please apply the following patch to what you sent.
> 
> 
> In addition to my changes, this line must be fixed before commit:
> 
>   .\" Copyright?
> 
> Please just use the normal /usr/share/misc/license.template with
> the commenting method adjusted from C to roff(7) comments.  If you
> wrote all the manual page text from scratch, use your own name and
> year(s).  If it is a derivative work based on the work of others,
> retain the original Copyright and license and add your own name and
> date(s) if your changes are significant enough.

Yes, all me. Will add.

> I feel somewhat concerned that you recommend the openssl(1) command
> for production use.  As far as i understand, the LibreSSL developers
> consider openssl(1) as a low-quality program purely intended for
> testing purposes that should not be used for production.  But that
> does not need to be addressed now, it can be improved later.

This is news to me, but what we are using it for very simply is calling
arc4random_buf, and then base64 encoding. If this isn't appropriate,
then perhaps a dedicated utility, or ifconfig integration could work.

wg (from wireguard-tools) also fills this functionality, however
getting that vs a simple key generator in base would be more work.

I'm open to suggestions here.

> When providing exactly one example, it isn't ideal that that example
> doesn't actually do anything practically useful.  Again, that's not
> a critical problem and can be improved later.
> 
> 
> Theo is right that .Ek and .nr nS are slightly awkward in manual
> pages, but they are still somewhat widespread in particular in
> driver and kernel manuals, so they wouldn't be a serious problem.
> However, the whole section containing them is useless in the first
> place, so the issue just vanishes with no additional effort.
> 
> 
> Rationale for my changes:
> 
>  * Add the missing $OpenBSD$ tag.
>  * New sentence, new line.
>  * Minor macro improvements.
>  * At a few places, purge phrases that say nothing.
>  * A few minor wording improvements.
>  * Delete the DIRECTIVES section: it contains no new information
>and ifconfig(8) was already referenced prominently above.
>  * Use the standard section name DIAGNOSTICS.

Thank you again, they're all reasonable, I'll add them in.
Matt



Re: WireGuard patchset for OpenBSD

2020-05-12 Thread Matt Dunwoodie
On Tue, 12 May 2020 14:44:45 +0200
Tobias Heider  wrote:

> Hi,
> 
> thanks for the diff!
> 
> > SipHash and ChaCha20Poly1305 are already available in the kernel.
> > The only modification here is add the short and simple chapoly AEAD
> > construction alongside the existing AE one.  
> 
> At first glance, I think you could use the crypto framework
> implementation for the chacha20-poly1305 AEAD construction (see
> sys/net/cryptosoft.c:swcr_authenc). An example for how it is used can
> be found in netinet/ip_esp.c

Hi Tobias,

Yes, that is a good suggestion and we did look into that during
development. However, for the time being I think the patch better
provides for our needs.

The patch is only ~210 lines (130:.c,80:.h), and doesn't just include
our aead chapoly, but also xchapoly which is required by the WireGuard
protocol and allows us to use random nonces, currently not provided by
swcr_authenc.

Additionally, as far as I'm aware, the cryptosoft only runs in a single
threaded taskq, while with calling the raw functions allows us to
crypt packets in parallel.

Finally, we wanted this patchset to be as auditable as
possible, so having the chapoly patch allows people to verify as easily
as possible that this is doing what we want.

So yes, for integration with the crypto(9) system, perhaps one day
after working through the above, but for the time being I don't see it
as a barrier to continuing development.

Thanks for the feedback!
Matt