Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Claudio Jeker
On Thu, May 28, 2020 at 01:07:40PM +0200, Martin Pieuchot wrote:
> On 27/05/20(Wed) 20:18, Matt Dunwoodie wrote:
> > On Wed, 27 May 2020 09:34:53 +0200
> > Martin Pieuchot  wrote:
> > > Regarding the kernel, I'd suggest you use "#if NWG > 0" like it is
> > > done for other pseudo-drives with 'needs-flag'.  
> > 
> > For the most part there is no significant changes to other parts of the
> > network stack, so I don't believe this should be necessary. If there is
> > anything in particular that you think should be flagged like that then
> > please do say.
> 
> I'm thinking of the `inp_upcall' abstraction and the in6_ifattach()
> chunk.  Is it possible to do without adding a function pointer to
> "struct inpcb" and guard the logic with #if NWG > 0" instead?  I'm not
> saying that such abstraction is not wanted, but I would prefer we think
> it through rather than add it for one particular driver/subsystem.
> 
> Have you seen the "#if NVXLAN" chunk in udp_input()?  Maybe this hook
> could also be considered for the abstraction you're suggesting.

Why is this code not just using the existing kernel socket API?
Why is there a need for yet another upcall interface?

I would not like to have more #if XYZ in udp_input() I already feel that
other codepaths in there should be reworked because this spaghetti coding
in the input path needs to stop.
 
-- 
:wq Claudio



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Martin Pieuchot
On 27/05/20(Wed) 20:18, Matt Dunwoodie wrote:
> On Wed, 27 May 2020 09:34:53 +0200
> Martin Pieuchot  wrote:
> > Regarding the kernel, I'd suggest you use "#if NWG > 0" like it is
> > done for other pseudo-drives with 'needs-flag'.  
> 
> For the most part there is no significant changes to other parts of the
> network stack, so I don't believe this should be necessary. If there is
> anything in particular that you think should be flagged like that then
> please do say.

I'm thinking of the `inp_upcall' abstraction and the in6_ifattach()
chunk.  Is it possible to do without adding a function pointer to
"struct inpcb" and guard the logic with #if NWG > 0" instead?  I'm not
saying that such abstraction is not wanted, but I would prefer we think
it through rather than add it for one particular driver/subsystem.

Have you seen the "#if NVXLAN" chunk in udp_input()?  Maybe this hook
could also be considered for the abstraction you're suggesting.

> > Aren't wg_noise.c and wg_cookie.c meant to be used by if_wg.c only?
> > Or would other parts of the kernel benefit from them?  If wg(4) is
> > the only user I'm questionnings whether putting them in crypto/ is
> > the best choice.  
> 
> Yes, I'm open to moving them somewhere, the were originally in crypto/
> as they handled specifically the crypto side of WireGuard. The end
> result though, I don't think they should be merged into if_wg.c as they
> are designed to be separate to help auditing/review. Would it make
> sense to have them in sys/net/?

In my opinion putting them in sys/net/ is better.

> > Why are you using rwlock to protect cookie_* and noise_* states?  Is
> > any sleeping involved?  Did you consider a mutex with IPL_NONE? rwlock
> > introduce sleep point that might be surprising.  Did you try the
> > option WITNESS of the kernel to check your locking?  
> 
> Both the cookie_* and noise_* are expected to run concurrently, and I
> wouldn't want one thread holding up another with a mutex while
> performing (relatively) expensive crypto operations. Where possible and
> correct, we use a read lock.

As long as rwlock are used in wg(4)'s taskqs that makes sense.  However
in the processing paths of the network stack (wg_start, wg_output,
wg_input and if you introduce it wg_enqueue) they should be avoided.
The rational is that they introduce sleeping points which we still try
to avoid. 

At least wg_input() calls wg_index_get() that tries to grab a read lock.
It would be great if those small iterations could be protected by a
mutex.  I haven't done a complete audit so they might be others ;)

I'd also suggest you look at vlan_enqueue() and see if it makes sense to
implement an if_enqueue() routine.  That should allow you to avoid
useless queueing involving locking, if HFSQ is not used on top of wg(4),
and maybe help for the double error accounting ;)

> > Is mq_push() required?  I'm always surprise to see mbuf APIs grow and
> > grow over years.  Anyway, that could be a separate change.  
> 
> To be explicit, the behaviour of mq_push is required, that is: we want
> to add a packet to the queue; if the queue is full we want to drop the
> oldest packet in the queue.
> 
> I'm aware of APIs growing though, so understand the concern. While it
> would be possible to emulate this behaviour with mq_full, and then some
> combination of mq_dequeue and mq_enqueue the result would be racey and
> may have unintended side effects. The end goal is that we want to
> achieve the behaviour above atomically.

Sure, please make sure dlg@ can review this change then :)



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Otto Moerbeek
On Thu, May 28, 2020 at 01:21:21AM -0600, Jason A. Donenfeld wrote:

> On Thu, May 28, 2020 at 1:19 AM Otto Moerbeek  wrote:
> > Of course.., I was running it from a !wxallowed mount. BTW, qemu is in
> > packages, no need to build it yourself.
> 
> Sure, but now I've been somewhat nerd sniped and am playing with this
> fcode forth implementation in qemu :-P. I wonder if there's something
> missing in the 64-bit extensions to IEEE 1275, in table.fs...

OK, can reproduce. I'll see if I can find out something.

-Otto



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Jason A. Donenfeld
On Thu, May 28, 2020 at 1:19 AM Otto Moerbeek  wrote:
> Of course.., I was running it from a !wxallowed mount. BTW, qemu is in
> packages, no need to build it yourself.

Sure, but now I've been somewhat nerd sniped and am playing with this
fcode forth implementation in qemu :-P. I wonder if there's something
missing in the 64-bit extensions to IEEE 1275, in table.fs...



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Otto Moerbeek
On Thu, May 28, 2020 at 01:05:59AM -0600, Jason A. Donenfeld wrote:

> On Thu, May 28, 2020 at 12:15 AM Otto Moerbeek  wrote:
> >
> > On Wed, May 27, 2020 at 11:28:09PM -0600, Jason A. Donenfeld wrote:
> >
> > > Hi Otto,
> > >
> > > On Wed, May 27, 2020 at 4:07 AM Otto Moerbeek  wrote:
> > > > Although I'm not terribly interested in bugs that are only seen (s0
> > > > far) using emulation, please send me the details on how you set up
> > > > qemu.
> > >
> > > Right, it could very well be a TCG bug. But maybe not. Here's how to
> > > get things [not-]working:
> > >
> > > $ qemu-system-sparc64 --version
> > > QEMU emulator version 5.0.0
> > > $ qemu-img convert -O qcow2 miniroot66.fs disk.qcow2
> > > $ qemu-img resize disk.qcow2 20G
> > > $ qemu-system-sparc64 \
> > >         -machine sun4u \
> > >         -m 1024 \
> > >         -drive file=disk.qcow2,if=ide \
> > >         -net nic,model=ne2k_pci -net user \
> > >         -nographic -serial stdio -monitor none \
> > >         -boot c
> > >
> > > OpenBIOS for Sparc64
> > > [...]
> > > Loading FCode image...
> > > Loaded 5840 bytes
> > > entry point is 0x4000
> > > Evaluating FCode...
> > > OpenBSD IEEE 1275 Bootblock 1.4
> > > ..>> OpenBSD BOOT 1.14
> > > Trying bsd...
> > > [...]
> > > OpenBSD 6.6 (RAMDISK) #84: Sat Oct 12 10:42:12 MDT 2019
> > >    dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
> > > [...]
> > > Welcome to the OpenBSD/sparc64 6.6 installation program.
> > > (I)nstall, (U)pgrade, (A)utoinstall or (S)hell?
> > >
> > > If you swap out miniroot66.fs for miniroot67.fs, you'll get the error
> > > I sent prior.
> > >
> > > Jason
> > >
> >
> > Does not work for me, error message on OpenBSD/amd64:
> >
> > Could not allocate dynamic translator buffer
> >
> > ktrace snippet:
> >
> > 74960 qemu-system-spar CALL  
> > mmap(0,0x4000,0x7 > EC>,0x1002,-1,0)
> > 74960 qemu-system-spar RET   mmap -1 errno 91 Not supported
> >
> > It's doing a RWX mapping, won't fly on OpenBSD.
> >
> >         -Otto
> 
> This sequence worked fine on my OpenBSD box for reproducing the maybe-bug. 
> (See: mount option.) YMMV:
> 
> bart ~ # pkg_add git gmake glib2 bison sdl2 gsed bash xz
> [...]
> bart ~ # ftp -o - https://download.qemu.org/qemu-5.0.0.tar.xz | unxz | tar xf 
> -
> bart ~ # cd qemu-5.0.0/
> bart ~/qemu-5.0.0 # mkdir build && cd build
> bart ~/qemu-5.0.0/build # ../configure && gmake -j$(sysctl -n hw.ncpu)
> [...]
> bart ~/qemu-5.0.0/build # ftp 
> https://cdn.openbsd.org/pub/OpenBSD/6.7/sparc64/miniroot67.fs
> [...]
> bart ~/qemu-5.0.0/build # ./qemu-img convert -O qcow2 miniroot67.fs disk.qcow2
> bart ~/qemu-5.0.0/build # ./qemu-img resize disk.qcow2 20G
> Image resized.
> bart ~/qemu-5.0.0/build # mount
> /dev/sd0a on / type ffs (local, wxallowed)
> bart ~/qemu-5.0.0/build # ./sparc64-softmmu/qemu-system-sparc64 -machine 
> sun4u -m 1024 -drive file=disk.qcow2,if=ide -net nic,model=ne2k_pci -net user 
> -nographic -serial stdio -monitor none -boot c
> OpenBIOS for Sparc64
> Configuration device id QEMU version 1 machine id 0
> kernel cmdline 
> CPUs: 1 x SUNW,UltraSPARC-IIi
> UUID: ----
> Welcome to OpenBIOS v1.1 built on Oct 28 2019 17:08
>   Type 'help' for detailed information
> Trying disk:a...
> Not a bootable ELF image
> Not a bootable a.out image
> Loading FCode image...
> Loaded 6882 bytes
> entry point is 0x4000
> Evaluating FCode...
> OpenBSD IEEE 1275 Bootblock 2.0
> ..reserved fcode word.
> Unhandled Exception 0x0030
> PC = 0xffd0f3ac NPC = 0xffd0f3b0
> Stopping execution
> 
> Jason
> 

Of course.., I was running it from a !wxallowed mount. BTW, qemu is in
packages, no need to build it yourself.

-Otto



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Jason A. Donenfeld
On Thu, May 28, 2020 at 12:15 AM Otto Moerbeek  wrote:
>
> On Wed, May 27, 2020 at 11:28:09PM -0600, Jason A. Donenfeld wrote:
>
> > Hi Otto,
> >
> > On Wed, May 27, 2020 at 4:07 AM Otto Moerbeek  wrote:
> > > Although I'm not terribly interested in bugs that are only seen (s0
> > > far) using emulation, please send me the details on how you set up
> > > qemu.
> >
> > Right, it could very well be a TCG bug. But maybe not. Here's how to
> > get things [not-]working:
> >
> > $ qemu-system-sparc64 --version
> > QEMU emulator version 5.0.0
> > $ qemu-img convert -O qcow2 miniroot66.fs disk.qcow2
> > $ qemu-img resize disk.qcow2 20G
> > $ qemu-system-sparc64 \
> >         -machine sun4u \
> >         -m 1024 \
> >         -drive file=disk.qcow2,if=ide \
> >         -net nic,model=ne2k_pci -net user \
> >         -nographic -serial stdio -monitor none \
> >         -boot c
> >
> > OpenBIOS for Sparc64
> > [...]
> > Loading FCode image...
> > Loaded 5840 bytes
> > entry point is 0x4000
> > Evaluating FCode...
> > OpenBSD IEEE 1275 Bootblock 1.4
> > ..>> OpenBSD BOOT 1.14
> > Trying bsd...
> > [...]
> > OpenBSD 6.6 (RAMDISK) #84: Sat Oct 12 10:42:12 MDT 2019
> >    dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
> > [...]
> > Welcome to the OpenBSD/sparc64 6.6 installation program.
> > (I)nstall, (U)pgrade, (A)utoinstall or (S)hell?
> >
> > If you swap out miniroot66.fs for miniroot67.fs, you'll get the error
> > I sent prior.
> >
> > Jason
> >
>
> Does not work for me, error message on OpenBSD/amd64:
>
> Could not allocate dynamic translator buffer
>
> ktrace snippet:
>
> 74960 qemu-system-spar CALL  
> mmap(0,0x4000,0x7 EC>,0x1002,-1,0)
> 74960 qemu-system-spar RET   mmap -1 errno 91 Not supported
>
> It's doing a RWX mapping, won't fly on OpenBSD.
>
>         -Otto

This sequence worked fine on my OpenBSD box for reproducing the maybe-bug. 
(See: mount option.) YMMV:

bart ~ # pkg_add git gmake glib2 bison sdl2 gsed bash xz
[...]
bart ~ # ftp -o - https://download.qemu.org/qemu-5.0.0.tar.xz | unxz | tar xf -
bart ~ # cd qemu-5.0.0/
bart ~/qemu-5.0.0 # mkdir build && cd build
bart ~/qemu-5.0.0/build # ../configure && gmake -j$(sysctl -n hw.ncpu)
[...]
bart ~/qemu-5.0.0/build # ftp 
https://cdn.openbsd.org/pub/OpenBSD/6.7/sparc64/miniroot67.fs
[...]
bart ~/qemu-5.0.0/build # ./qemu-img convert -O qcow2 miniroot67.fs disk.qcow2
bart ~/qemu-5.0.0/build # ./qemu-img resize disk.qcow2 20G
Image resized.
bart ~/qemu-5.0.0/build # mount
/dev/sd0a on / type ffs (local, wxallowed)
bart ~/qemu-5.0.0/build # ./sparc64-softmmu/qemu-system-sparc64 -machine sun4u 
-m 1024 -drive file=disk.qcow2,if=ide -net nic,model=ne2k_pci -net user 
-nographic -serial stdio -monitor none -boot c
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline 
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: ----
Welcome to OpenBIOS v1.1 built on Oct 28 2019 17:08
  Type 'help' for detailed information
Trying disk:a...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 6882 bytes
entry point is 0x4000
Evaluating FCode...
OpenBSD IEEE 1275 Bootblock 2.0
..reserved fcode word.
Unhandled Exception 0x0030
PC = 0xffd0f3ac NPC = 0xffd0f3b0
Stopping execution

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-28 Thread Otto Moerbeek
On Wed, May 27, 2020 at 11:28:09PM -0600, Jason A. Donenfeld wrote:

> Hi Otto,
> 
> On Wed, May 27, 2020 at 4:07 AM Otto Moerbeek  wrote:
> > Although I'm not terribly interested in bugs that are only seen (s0
> > far) using emulation, please send me the details on how you set up
> > qemu.
> 
> Right, it could very well be a TCG bug. But maybe not. Here's how to
> get things [not-]working:
> 
> $ qemu-system-sparc64 --version
> QEMU emulator version 5.0.0
> $ qemu-img convert -O qcow2 miniroot66.fs disk.qcow2
> $ qemu-img resize disk.qcow2 20G
> $ qemu-system-sparc64 \
> -machine sun4u \
> -m 1024 \
> -drive file=disk.qcow2,if=ide \
> -net nic,model=ne2k_pci -net user \
> -nographic -serial stdio -monitor none \
> -boot c
> 
> OpenBIOS for Sparc64
> [...]
> Loading FCode image...
> Loaded 5840 bytes
> entry point is 0x4000
> Evaluating FCode...
> OpenBSD IEEE 1275 Bootblock 1.4
> ..>> OpenBSD BOOT 1.14
> Trying bsd...
> [...]
> OpenBSD 6.6 (RAMDISK) #84: Sat Oct 12 10:42:12 MDT 2019
>dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
> [...]
> Welcome to the OpenBSD/sparc64 6.6 installation program.
> (I)nstall, (U)pgrade, (A)utoinstall or (S)hell?
> 
> If you swap out miniroot66.fs for miniroot67.fs, you'll get the error
> I sent prior.
> 
> Jason
> 

Does not work for me, error message on OpenBSD/amd64:

Could not allocate dynamic translator buffer

ktrace snippet:

74960 qemu-system-spar CALL  mmap(0,0x4000,0x7,0x1002,-1,0)
74960 qemu-system-spar RET   mmap -1 errno 91 Not supported

It's doing a RWX mapping, won't fly on OpenBSD.

-Otto




Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Jason A. Donenfeld
Hi Otto,

On Wed, May 27, 2020 at 4:07 AM Otto Moerbeek  wrote:
> Although I'm not terribly interested in bugs that are only seen (s0
> far) using emulation, please send me the details on how you set up
> qemu.

Right, it could very well be a TCG bug. But maybe not. Here's how to
get things [not-]working:

$ qemu-system-sparc64 --version
QEMU emulator version 5.0.0
$ qemu-img convert -O qcow2 miniroot66.fs disk.qcow2
$ qemu-img resize disk.qcow2 20G
$ qemu-system-sparc64 \
-machine sun4u \
-m 1024 \
-drive file=disk.qcow2,if=ide \
-net nic,model=ne2k_pci -net user \
-nographic -serial stdio -monitor none \
-boot c

OpenBIOS for Sparc64
[...]
Loading FCode image...
Loaded 5840 bytes
entry point is 0x4000
Evaluating FCode...
OpenBSD IEEE 1275 Bootblock 1.4
..>> OpenBSD BOOT 1.14
Trying bsd...
[...]
OpenBSD 6.6 (RAMDISK) #84: Sat Oct 12 10:42:12 MDT 2019
   dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
[...]
Welcome to the OpenBSD/sparc64 6.6 installation program.
(I)nstall, (U)pgrade, (A)utoinstall or (S)hell?

If you swap out miniroot66.fs for miniroot67.fs, you'll get the error
I sent prior.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Matt Dunwoodie
On Wed, 27 May 2020 01:43:34 -0600
"Jason A. Donenfeld"  wrote:

> On Wed, May 27, 2020 at 1:34 AM Martin Pieuchot 
> wrote:
> > First question is, is it possible to use the wg(4) interface
> > without a port?  
> 
> No, that is not how WireGuard works. For details on the actual
> protocol particulars, please see
> https://www.wireguard.com/papers/wireguard.pdf . Our rev 1 submission
> has links to further links as well in the cover letter.

Oh, I hope we haven't gotten terminology mixed up. Yes, WireGuard
requires a UDP port, no it does not require any software from the ports
tree. All functionality is accessible from the base system as a "first
class citizen".

Matt



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Matt Dunwoodie
On Wed, 27 May 2020 09:34:53 +0200
Martin Pieuchot  wrote:

> Hello Matt,
> 
> Thank you for your submission.  

Hi Martin,

No worries, thank you for your feedback. This is something I want to
help make happen and if I recall correctly, someone once said that if I
wanted a new feature on OpenBSD then submit a patch :)

> On 26/05/20(Tue) 19:39, Matt Dunwoodie wrote:  
> > After some feedback and comments, we've addressed the concerns, and
> > fixed a few things from our side too. Overall the structure is
> > familiar with no major changes, so any prior readings mostly carry
> > over.
> 
> It's hard to review a such big diff.  If you're interested in getting
> feedbacks in the code itself I'd suggest you split it :)  

I do realise it is a bit unwieldy, but it is a fairly
integrated/cohesive patch. I'll have a think about splitting it up
if/when we get to a rev. 3.

> Regarding the kernel, I'd suggest you use "#if NWG > 0" like it is
> done for other pseudo-drives with 'needs-flag'.  

For the most part there is no significant changes to other parts of the
network stack, so I don't believe this should be necessary. If there is
anything in particular that you think should be flagged like that then
please do say.

> Can you re-use PACKET_TAG_GRE instead of introduce
> PACKET_TAG_WIREGUARD? The former is already used by multiple
> pseudo-drivers.  

As far as I can tell PACKET_TAG_GRE is only used in gif(4) and gre(4),
however they just use the tag to store the if_index. I also understand
m_pkthdr.ph_tagsset is limited on space so adding a new TAG type is not
a great idea therefore, I've repurposed PACKET_TAG_GIF to
PACKET_TAG_WIREGUARD.

In this case we store a little bit more information (and not the
if_index) and I would not want any kind of conflict between wg(4)
PACKET_TAG_GRE and gif(4)/gre(4) PACKET_TAG_GRE.

> Aren't wg_noise.c and wg_cookie.c meant to be used by if_wg.c only?
> Or would other parts of the kernel benefit from them?  If wg(4) is
> the only user I'm questionnings whether putting them in crypto/ is
> the best choice.  

Yes, I'm open to moving them somewhere, the were originally in crypto/
as they handled specifically the crypto side of WireGuard. The end
result though, I don't think they should be merged into if_wg.c as they
are designed to be separate to help auditing/review. Would it make
sense to have them in sys/net/?

> Why are you using rwlock to protect cookie_* and noise_* states?  Is
> any sleeping involved?  Did you consider a mutex with IPL_NONE? rwlock
> introduce sleep point that might be surprising.  Did you try the
> option WITNESS of the kernel to check your locking?  

Both the cookie_* and noise_* are expected to run concurrently, and I
wouldn't want one thread holding up another with a mutex while
performing (relatively) expensive crypto operations. Where possible and
correct, we use a read lock.

Running with WITNESS does not raise any errors, I've checked.

> Keeping comments away from source code, like you did in wg_cookie.h
> can result in them not being updated.  This isn't something common in
> the tree and after some times comments tend to lie :o)  

Makes sense, and if it isn't that common I'm more than happy to move
them.

> Would you mind using variables of more than one-letter?  That makes is
> harder to grep for patterns.  

I'll see what I can do :)
 
> If you prefix fields of variable, I'd suggest picking the firs letter
> of the two words, for example 'wt_' for "struct wg_tag" instead of
> "t_" currently.  Then grepping for 'wt_endpoint' is more likely to
> yield what we're looking for :)  

Very reasonable - will do.
 
> Is mq_push() required?  I'm always surprise to see mbuf APIs grow and
> grow over years.  Anyway, that could be a separate change.  

To be explicit, the behaviour of mq_push is required, that is: we want
to add a packet to the queue; if the queue is full we want to drop the
oldest packet in the queue.

I'm aware of APIs growing though, so understand the concern. While it
would be possible to emulate this behaviour with mq_full, and then some
combination of mq_dequeue and mq_enqueue the result would be racey and
may have unintended side effects. The end goal is that we want to
achieve the behaviour above atomically.

> ioctl(2) shouldn't require the NET_LOCK, so if your pseudo-driver uses
> its own locks to serialize access to its data, please say that in the
> comments for SIOCSWG and SIOCGWG.  

Yes, I'll add some comments.

> WG_PEERS_FOREACH{,SAFE} macros are only introducing abstraction and
> make code harder to understand.  

Absolutely, I agree here. It was a remnant from some refactoring.

> Please do not cast function pointer in task_set(), use the expected
> prototype.  

Noted, will fix.

> Why did you introduce wg_queue_*(), isn't mq_init(9) API good enough?
>  

The queueing semantics for WireGuard are a little different from mq_*
for the most part due to encrypting on a taskq with multiple threads.

You may imagine a 

Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Otto Moerbeek
On Wed, May 27, 2020 at 03:14:29AM -0600, Jason A. Donenfeld wrote:

> One interesting quirk in doing this on qemu is that the 6.7 and
> -current kernel both crash:
> 
> Loading FCode image...
> Loaded 6882 bytes
> entry point is 0x4000
> Evaluating FCode...
> OpenBSD IEEE 1275 Bootblock 2.0
> Unhandled Exception 0x0030
> PC = 0xffd0f3ac NPC = 0xffd0f3b0
> Stopping execution
> 
> Luckily it works fine on 6.6, so that's where I debugged this issue.
> But this might be a bug worth looking into. Otto's recent bootblk
> patch is a possible culprit, so I've CC'd him.
> 
> Jason
> 

Although I'm not terribly interested in bugs that are only seen (s0
far) using emulation, please send me the details on how you set up
qemu.

-Otto



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Jason A. Donenfeld
On Wed, May 27, 2020 at 2:12 AM Jason A. Donenfeld  wrote:
>
> Hi again Klemens,
>
> On Tue, May 26, 2020 at 5:42 PM Jason A. Donenfeld  wrote:
> >
> > On Tue, May 26, 2020 at 4:52 PM Jason A. Donenfeld  wrote:
> > > With regards to your crash, though, that's a bit more puzzling, and
> > > I'd be interested to learn more details. Because these structs are
> > > already naturally aligned, the __packed attribute, even with the odd
> > > nesting Matt had prior, should have produced all entirely aligned
> > > accesses. That makes me think your kaboom was coming from someplace
> > > else. One possibility is that you were running the git tree on the two
> > > days that I was playing with uint128_t, only to find out that some of
> > > openbsd's patches to clang miscalculate stack sizes when they're in
> > > use, so that work was shelved for another day and the commits removed;
> > > perhaps you were just unlucky? Or you hit some other bug that's
> > > lurking. Either way, output from ddb's `bt` would at least be useful.
> >
> > Do you know off hand if we're able to assume any type of alignment
> > with mbuf->m_data? mtod just casts without any address fixup, which
> > means if mbuf->m_data isn't aligned by some other mechanism, we're in
> > trouble. But I would assume there _is_ some alignment imposed, since
> > the rest of the stack appears to parse tcp headers and such directly
> > without byte-by-byte copies being made.
>
> After a day of TCG-run compilation, I've got a working sparc64 setup
> and can confirm the issue. Working on a fix.

The latest git master should work well now:

solar# arch -s
sparc64
solar# wg-quick up demo
[#] ifconfig wg0 create
[#] wg setconf wg0 /dev/fd/63
[#] ifconfig wg0 inet 192.168.4.155/24 alias
[#] ifconfig wg0 mtu 1420
[#] ifconfig wg0 up
[#] cp /etc/resolv.conf /etc/resolv.conf.wg-quick-backup.demo
[#] printf nameserver %s\n 8.8.8.8 8.8.4.4 1.1.1.1 1.0.0.1
[#] route -q -n add -inet 0.0.0.0/1 -iface 192.168.4.155
[#] route -q -n add -inet 128.0.0.0/1 -iface 192.168.4.155
[#] route -q -n delete -inet 163.172.161.0
[#] route -q -n add -inet 163.172.161.0 -gateway 10.0.2.2
[+] Backgrounding route monitor
solar# curl zx2c4.com/ip
163.172.161.0
demo.wireguard.com

One interesting quirk in doing this on qemu is that the 6.7 and
-current kernel both crash:

Loading FCode image...
Loaded 6882 bytes
entry point is 0x4000
Evaluating FCode...
OpenBSD IEEE 1275 Bootblock 2.0
Unhandled Exception 0x0030
PC = 0xffd0f3ac NPC = 0xffd0f3b0
Stopping execution

Luckily it works fine on 6.6, so that's where I debugged this issue.
But this might be a bug worth looking into. Otto's recent bootblk
patch is a possible culprit, so I've CC'd him.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Jason A. Donenfeld
Hey David,

On Wed, May 27, 2020 at 2:26 AM David Gwynne  wrote:
>
> On Tue, May 26, 2020 at 05:42:13PM -0600, Jason A. Donenfeld wrote:
> > On Tue, May 26, 2020 at 4:52 PM Jason A. Donenfeld  wrote:
> > > With regards to your crash, though, that's a bit more puzzling, and
> > > I'd be interested to learn more details. Because these structs are
> > > already naturally aligned, the __packed attribute, even with the odd
> > > nesting Matt had prior, should have produced all entirely aligned
> > > accesses. That makes me think your kaboom was coming from someplace
> > > else. One possibility is that you were running the git tree on the two
> > > days that I was playing with uint128_t, only to find out that some of
> > > openbsd's patches to clang miscalculate stack sizes when they're in
> > > use, so that work was shelved for another day and the commits removed;
> > > perhaps you were just unlucky? Or you hit some other bug that's
> > > lurking. Either way, output from ddb's `bt` would at least be useful.
>
> When you say clang patches miscalculate the stack size with uint128_t,
> do you know which one(s) specifically? Was it -msave-args?

Yep, exactly. It looked to me like that plugin was considering each
argument of a function to be register-sized, which obviously isn't the
case for uint128_t. I had planned to come back to that after the
wireguard stuff is finished.

> Anyway, tl;dr, my guess is you're reading a 64bit value out of a packet,
> but I'm pretty sure the stack probably only provides 32bit alignment.

Yep, that seems to be exactly the case. Earlier today I looked at
every struct passed to mtod for the entire kernel, and found two other
drivers that do this with 64bit types, but maybe they're in contexts
where that's okay somehow.

Now that I've got my sparc64 rig cooking away at these kernels, I can
confirm that this is indeed the case, and it's trivially fixed by just
memcpy'ing to/from a properly variable on the stack. In my experience,
this pattern,

uint64_t i;
memcpy(, somecharbuffer, sizeof(i));
return i;

optimizes to just the simple boring cast on architectures with fast
unaligned access (e.g. amd64), while doing the smart thing for
architectures that trap or have a performance penalty. IOW, the
compiler generates optimal code over that pattern, so that's what I've
done to work around the issue here.

> - The network stack accesses bits of packets directly. I'm pretty
> sure so far this is limited to 32bit word accesses for things like
> addresses in IPv4 headers, GRE protocol fields, etc. I'm not sure
> there are (currently) any 64bit accesses.

My review earlier today indeed showed all max 32bit, except for two drivers:
- sys/net/switchofp.c and sys/net/ofp.h
- sys/dev/usb/if_athn_usb.c and sys/dev/usb/if_athn_usb.h

Either those are incorrect, are limited to archs that don't trap, or
are used in contexts where other factors make them safe. But beyond
those two, I couldn't find any others.

> Sorry for the rambling. Hopefully you know a bit more about mbuf
> and network stack alignment now though.

Not a problem at all, super appreciated. Looking forward to your other
comments on the driver too.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread David Gwynne
On Tue, May 26, 2020 at 05:42:13PM -0600, Jason A. Donenfeld wrote:
> On Tue, May 26, 2020 at 4:52 PM Jason A. Donenfeld  wrote:
> > With regards to your crash, though, that's a bit more puzzling, and
> > I'd be interested to learn more details. Because these structs are
> > already naturally aligned, the __packed attribute, even with the odd
> > nesting Matt had prior, should have produced all entirely aligned
> > accesses. That makes me think your kaboom was coming from someplace
> > else. One possibility is that you were running the git tree on the two
> > days that I was playing with uint128_t, only to find out that some of
> > openbsd's patches to clang miscalculate stack sizes when they're in
> > use, so that work was shelved for another day and the commits removed;
> > perhaps you were just unlucky? Or you hit some other bug that's
> > lurking. Either way, output from ddb's `bt` would at least be useful.

When you say clang patches miscalculate the stack size with uint128_t,
do you know which one(s) specifically? Was it -msave-args?

> Do you know off hand if we're able to assume any type of alignment
> with mbuf->m_data? mtod just casts without any address fixup, which
> means if mbuf->m_data isn't aligned by some other mechanism, we're in
> trouble. But I would assume there _is_ some alignment imposed, since
> the rest of the stack appears to parse tcp headers and such directly
> without byte-by-byte copies being made.

It's probably more correct to say that payload alignment is required by
the network stack rather than imposed. However, you could argue
that's a useless distinction to make in practice.

Anyway, tl;dr, my guess is you're reading a 64bit value out of a packet,
but I'm pretty sure the stack probably only provides 32bit alignment.

The long version of the relevant things are:

- OpenBSD runs on some architectures that require strict alignment
for accessing words, and the fault handlers for unaligned accesses
in the kernel drop you into ddb. ie, you don't want to do that.

- The network stack accesses bits of packets directly. I'm pretty
sure so far this is limited to 32bit word accesses for things like
addresses in IPv4 headers, GRE protocol fields, etc. I'm not sure
there are (currently) any 64bit accesses.

- mbufs and mbuf clusters (big buffers for packet data) are allocated
out of pools which provide 64 byte alignment. The data portion of
mbufs starts after some header structures, but should remain long
aligned. At least on amd64 and sparc64 it happens to be 16 byte
aligned. Figuring it out on a 32bit arch is too much maths for me atm.

- The mbuf API tries to keep the m_data pointer in newly allocated mbufs
long word aligned.

- However, when you're building a packet and you put a header onto it,
you usually do that with m_prepend. m_prepend will put the new header
up against the existing data, unless there is no space to do that with
the current mbuf. In that latter situation it will allocate a new mbuf
and add the new header there. Because it is a new mbuf, and as per the
previous point, the new header will start on a long word boundary.

This is mostly a problem for Ethernet related stuff (14 byte headers,
yay), and has a few consequences.

One of the consequences is that the receive side of Ethernet drivers
in OpenBSD try hard to align Ethernet payloads to 4 byte boundaries.
This means they allocate an mbuf with at least their mtu + 2 bytes,
and then chop 2 bytes off the front and give that to the chip to
rx into.

So if you're on a sparc64 running one of those, let's say the driver does
this and it gets an mbuf cluster to rx into. That cluster will be 64
byte aligned, which is good. Let's also say it gets a wg packet. The
layout of the packet will be:

- 2 byte ETHER_ALIGN pad at offset 0
- 14 byte Ethernet header at offset 2
- 20 byte IP header at offset 16 (hooray)
- 8 byte UDP header at offset 36
- 4 byte wg message type + pad at offset 44
- 4 byte wg/noise receiver index at offset 48
- 8 byte wg/noise counter/none at offset 52

52 % 8 = explosion.

However, in your diff struct noise_data is __packed, so the compiler
shouldn't make assumptions about the alignment and should do a byte
by byte load. It's hard to say what's happening without a backtrace
like you suggest.

For reasons I would suggest making struct noise_data look like this:

struct noise_data {
uint32_tr_idx;
uint32_tnonce_lo;
uint32_tnonce_hi;
};

And making the load of the counter look like this:

uint64_t ctr;

ctr = lemtoh32(>nonce_lo) |
((uint64_t)lemtoh32(>nonce_hi) << 32);

Sorry for the rambling. Hopefully you know a bit more about mbuf
and network stack alignment now though.



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Jason A. Donenfeld
Hi again Klemens,

On Tue, May 26, 2020 at 5:42 PM Jason A. Donenfeld  wrote:
>
> On Tue, May 26, 2020 at 4:52 PM Jason A. Donenfeld  wrote:
> > With regards to your crash, though, that's a bit more puzzling, and
> > I'd be interested to learn more details. Because these structs are
> > already naturally aligned, the __packed attribute, even with the odd
> > nesting Matt had prior, should have produced all entirely aligned
> > accesses. That makes me think your kaboom was coming from someplace
> > else. One possibility is that you were running the git tree on the two
> > days that I was playing with uint128_t, only to find out that some of
> > openbsd's patches to clang miscalculate stack sizes when they're in
> > use, so that work was shelved for another day and the commits removed;
> > perhaps you were just unlucky? Or you hit some other bug that's
> > lurking. Either way, output from ddb's `bt` would at least be useful.
>
> Do you know off hand if we're able to assume any type of alignment
> with mbuf->m_data? mtod just casts without any address fixup, which
> means if mbuf->m_data isn't aligned by some other mechanism, we're in
> trouble. But I would assume there _is_ some alignment imposed, since
> the rest of the stack appears to parse tcp headers and such directly
> without byte-by-byte copies being made.

After a day of TCG-run compilation, I've got a working sparc64 setup
and can confirm the issue. Working on a fix.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Matt Dunwoodie
On Tue, 26 May 2020 13:28:22 +0200
Tobias Heider  wrote:

> Hi Matt,
> 
> just repeating what I commented yesterday for the new diff to make
> sure it isn't overlooked.

Thank you for repeating it, I didn't get around to addressing it before
the new diff.

> > +int
> > +wg_ioctl_get(struct wg_softc *sc, struct wg_data_io *data)
> > +{
> > +   struct wg_interface_io  *iface_p, iface_o;
> > +   struct wg_peer_io   *peer_p, peer_o;
> > +   struct wg_aip_io*aip_p;
> > +
> > +   struct wg_peer  *peer;
> > +   struct wg_aip   *aip;
> > +
> > +   size_t   size, peer_count, aip_count;
> > +   int  ret = 0, is_suser =
> > suser(curproc) == 0; +
> > +   size = sizeof(struct wg_interface_io);
> > +   if (data->wgd_size < size && !is_suser)
> > +   goto ret_size;
> > +
> > +   iface_p = data->wgd_interface;
> > +   bzero(_o, sizeof(iface_o));
> > +
> > +   rw_enter_read(>sc_lock);
> > +
> > +   if (sc->sc_udp_port != 0) {
> > +   iface_o.i_port = ntohs(sc->sc_udp_port);
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_PORT;
> > +   }
> > +
> > +   if (sc->sc_udp_rtable != 0) {
> > +   iface_o.i_rtable = sc->sc_udp_rtable;
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_RTABLE;
> > +   }
> > +
> > +   if (!is_suser)
> > +   goto out;
> > +
> > +   if (noise_local_keys(>sc_local, iface_o.i_public,
> > +   iface_o.i_private) == 0) {
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_PUBLIC;
> > +   iface_o.i_flags |= WG_INTERFACE_HAS_PRIVATE;
> > +   }
> > +
> > +   if (((SIZE_MAX - size) / sizeof(struct wg_peer_io)) <
> > sc->sc_peer_num)
> > +   goto error;
> > +   size += sizeof(struct wg_peer_io) * sc->sc_peer_num;
> > +   if (((SIZE_MAX - size) / sizeof(struct wg_aip_io)) <
> > sc->sc_aip_num)
> > +   goto error;  
> 
> I still think those two should return an error.  'goto error' is
> misleading as it doesn't actually set ret != 0.  'error' should
> probably be renamed to 'cleanup' to avoid confusion and ret should be
> set to ERANGE .

I'll elaborate on this a bit. These goto errors are just checks for
integer overflows on size. Upon further consideration it probably
doesn't make sense if we are using a size_t. Therefore it makes sense
to remove these two checks.

> > +   size += sizeof(struct wg_aip_io) * sc->sc_aip_num;
> > +   if (data->wgd_size < size)
> > +   goto error;

This has been changed to "goto unlock_and_ret_size;". I think Jason
touched on it, but the problem is that if we return ERANGE, then
wg_data_io does not get copied back to userspace and the caller cannot
read the returned size. Therefore we need to return 0 to indicate no
error.

> > +   peer_count = 0;
> > +   peer_p = _p->i_peers[0];
> > +   WG_PEERS_FOREACH(peer, sc) {
> > +   bzero(_o, sizeof(peer_o));
> > +   peer_o.p_flags = WG_PEER_HAS_PUBLIC;
> > +   peer_o.p_protocol_version = 1;
> > +
> > +   if (noise_remote_keys(>p_remote,
> > peer_o.p_public,
> > +   peer_o.p_psk) == 0)
> > +   peer_o.p_flags |= WG_PEER_HAS_PSK;
> > +
> > +   if
> > (wg_timers_get_persistent_keepalive(>p_timers,
> > +   _o.p_pka) == 0)
> > +   peer_o.p_flags |= WG_PEER_HAS_PKA;
> > +
> > +   if (wg_peer_get_sockaddr(peer, _o.p_sa) == 0)
> > +   peer_o.p_flags |= WG_PEER_HAS_ENDPOINT;
> > +
> > +   mtx_enter(>p_counters_mtx);
> > +   peer_o.p_txbytes = peer->p_counters_tx;
> > +   peer_o.p_rxbytes = peer->p_counters_rx;
> > +   mtx_leave(>p_counters_mtx);
> > +
> > +   wg_timers_get_last_handshake(>p_timers,
> > +   _o.p_last_handshake);
> > +
> > +   aip_count = 0;
> > +   aip_p = _p->p_aips[0];
> > +   LIST_FOREACH(aip, >p_aip, a_entry) {
> > +   if ((ret = copyout(>a_data, aip_p,
> > sizeof(*aip_p))) != 0)
> > +   goto error;
> > +   aip_p++;
> > +   aip_count++;
> > +   }
> > +   peer_o.p_aips_count = aip_count;
> > +
> > +   if ((ret = copyout(_o, peer_p,
> > sizeof(peer_o))) != 0)
> > +   goto error;
> > +
> > +   peer_p = (struct wg_peer_io *)aip_p;
> > +   peer_count++;
> > +   }
> > +   iface_o.i_peers_count = peer_count;
> > +
> > +out:
> > +   if ((ret = copyout(_o, iface_p, sizeof(iface_o))) !=
> > 0)
> > +   goto error;
> > +error:
> > +   rw_exit_read(>sc_lock);
> > +   explicit_bzero(_o, sizeof(iface_o));
> > +   explicit_bzero(_o, sizeof(peer_o));
> > +ret_size:
> > +   data->wgd_size = size;
> > +   return ret;
> > +}  
> 



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Jason A. Donenfeld
Hi Martin,

To answer a few but not all of your questions:

On Wed, May 27, 2020 at 1:34 AM Martin Pieuchot  wrote:
> First question is, is it possible to use the wg(4) interface without a
> port?

No, that is not how WireGuard works. For details on the actual
protocol particulars, please see
https://www.wireguard.com/papers/wireguard.pdf . Our rev 1 submission
has links to further links as well in the cover letter.

> Aren't wg_noise.c and wg_cookie.c meant to be used by if_wg.c only?  Or
> would other parts of the kernel benefit from them?  If wg(4) is the only
> user I'm questionnings whether putting them in crypto/ is the best choice.

Yes, they should only be used by if_wg.c. Moving them closer to
if_wg.c would make sense imo.

> Is IFT_WIREGUARD needed?  Isn't the interface a point-to-point interface?

It is a point to point interface, but it also has some important
traits that point to point interfaces do not have. In this case, that
means disabling automatically generated v6 ip addresses, which
wireguard explicitly does not support. (It does support manually
generated and cryptographically bound link local v6 addresses,
however.)

> Why did you re-implemented a routing table?  Did you consider storing
> the information you need in existing data structures?

WireGuard's cryptokey routing is a different structure with different
semantics and is intended to work separately from the normal system
routing table. Trying to merge the two is a recipe for disaster. Paper
has details.

> My overall impression is that this can be simplified and integrate
> better in the kernel :)  If you could explain the basic architecture or
> point me where you explained it, it would help.

See paper above. Also see links on cover letter of rev 1 submission.

Regards,
Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-27 Thread Martin Pieuchot
Hello Matt,

Thank you for your submission.

On 26/05/20(Tue) 19:39, Matt Dunwoodie wrote:
> After some feedback and comments, we've addressed the concerns, and
> fixed a few things from our side too. Overall the structure is familiar
> with no major changes, so any prior readings mostly carry over.

It's hard to review a such big diff.  If you're interested in getting
feedbacks in the code itself I'd suggest you split it :)

First question is, is it possible to use the wg(4) interface without a
port?

Regarding the kernel, I'd suggest you use "#if NWG > 0" like it is done
for other pseudo-drives with 'needs-flag'.

Can you re-use PACKET_TAG_GRE instead of introduce PACKET_TAG_WIREGUARD?
The former is already used by multiple pseudo-drivers.

Aren't wg_noise.c and wg_cookie.c meant to be used by if_wg.c only?  Or
would other parts of the kernel benefit from them?  If wg(4) is the only
user I'm questionnings whether putting them in crypto/ is the best choice.

The blake implementation and ChaCha extensions could be reviewed
independently.

Why are you using rwlock to protect cookie_* and noise_* states?  Is any
sleeping involved?  Did you consider a mutex with IPL_NONE? rwlock
introduce sleep point that might be surprising.  Did you try the option
WITNESS of the kernel to check your locking?

Keeping comments away from source code, like you did in wg_cookie.h can
result in them not being updated.  This isn't something common in the
tree and after some times comments tend to lie :o)

Would you mind using variables of more than one-letter?  That makes is
harder to grep for patterns.

If you prefix fields of variable, I'd suggest picking the firs letter of
the two words, for example 'wt_' for "struct wg_tag" instead of "t_"
currently.  Then grepping for 'wt_endpoint' is more likely to yield what
we're looking for :)

Is mq_push() required?  I'm always surprise to see mbuf APIs grow and
grow over years.  Anyway, that could be a separate change.

ioctl(2) shouldn't require the NET_LOCK, so if your pseudo-driver uses
its own locks to serialize access to its data, please say that in the
comments for SIOCSWG and SIOCGWG.

Is IFT_WIREGUARD needed?  Isn't the interface a point-to-point interface?

WG_PEERS_FOREACH{,SAFE} macros are only introducing abstraction and
make code harder to understand.

Please do not cast function pointer in task_set(), use the expected
prototype.

Why did you introduce wg_queue_*(), isn't mq_init(9) API good enough?

Why did you re-implemented a routing table?  Did you consider storing
the information you need in existing data structures?

My overall impression is that this can be simplified and integrate
better in the kernel :)  If you could explain the basic architecture or
point me where you explained it, it would help.

Thanks



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Jason A. Donenfeld
On Tue, May 26, 2020 at 4:52 PM Jason A. Donenfeld  wrote:
> With regards to your crash, though, that's a bit more puzzling, and
> I'd be interested to learn more details. Because these structs are
> already naturally aligned, the __packed attribute, even with the odd
> nesting Matt had prior, should have produced all entirely aligned
> accesses. That makes me think your kaboom was coming from someplace
> else. One possibility is that you were running the git tree on the two
> days that I was playing with uint128_t, only to find out that some of
> openbsd's patches to clang miscalculate stack sizes when they're in
> use, so that work was shelved for another day and the commits removed;
> perhaps you were just unlucky? Or you hit some other bug that's
> lurking. Either way, output from ddb's `bt` would at least be useful.

Do you know off hand if we're able to assume any type of alignment
with mbuf->m_data? mtod just casts without any address fixup, which
means if mbuf->m_data isn't aligned by some other mechanism, we're in
trouble. But I would assume there _is_ some alignment imposed, since
the rest of the stack appears to parse tcp headers and such directly
without byte-by-byte copies being made.



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Jason A. Donenfeld
Hey Klemens, Theo,

On Tue, May 26, 2020 at 2:38 PM Klemens Nanni  wrote:
>
> On Tue, May 26, 2020 at 02:23:06PM -0600, Jason A. Donenfeld wrote:
> > That's good news that it's working for you now, but I didn't change
> > anything within the last 24 hours (you mentioned "yesterday") that
> > would seem related to this. So perhaps don't get too excited just yet.
> I was surprised, too.  Perhaps something went wrong applying all those
> single patches from the wireguard-openbsd resository, but if so, I'd
> doubt that it would build in the first place...
>
> So far the interface is stable, but I keep testing.  When I find the
> time, I'll also take a look at the pieces of code which blew up earlier.

My audit is complete, and every single usage of __packed has been
removed. Those never should have been there in the first place.
WireGuard was specifically designed for kernels, and I distinctly
remember making sure things would be high performance on little mips
devices, which necessitated natural struct alignment. And indeed, all
of the structs that wireguard uses on the wire have fields that are
always aligned to each field's size, so no __packed is ever necessary.
This all has been fixed up in the git repo now.

With regards to your crash, though, that's a bit more puzzling, and
I'd be interested to learn more details. Because these structs are
already naturally aligned, the __packed attribute, even with the odd
nesting Matt had prior, should have produced all entirely aligned
accesses. That makes me think your kaboom was coming from someplace
else. One possibility is that you were running the git tree on the two
days that I was playing with uint128_t, only to find out that some of
openbsd's patches to clang miscalculate stack sizes when they're in
use, so that work was shelved for another day and the commits removed;
perhaps you were just unlucky? Or you hit some other bug that's
lurking. Either way, output from ddb's `bt` would at least be useful.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Jason A. Donenfeld
On Tue, May 26, 2020 at 2:33 PM Theo de Raadt  wrote:
>
> Jason A. Donenfeld  wrote:
>
> > Hey Klemens,
> >
> > On Tue, May 26, 2020 at 9:13 AM Klemens Nanni  wrote:
> > > I worked with the patches from the wireguard-openbsd repository after
> > > version one of this diff on tech@ became a bit old.
> > >
> > > That was until yesterday;  the kernel would panic due to memory
> > > alignment issues in various spots, `amd64# ping6 tunnel-ip.sparc64'
> > > would receive a single ICMP6 echo reply from the sparc64 peer before
> > > panicing its kernel very early in noise init code.
> > >
> > > I've had fixed a few bugs and ran into different panics during later
> > > code paths.
> > >
> > > Fortunately, using "rev. 2" of your diff on top of a clean checkout just
> > > works so far on sparc64 without any additional patches required - this
> > > is a very pleasnt suprise and I can start looking at it under production
> > > use cases more thoroughly now.
> >
> > That's good news that it's working for you now, but I didn't change
> > anything within the last 24 hours (you mentioned "yesterday") that
> > would seem related to this. So perhaps don't get too excited just yet.
> > I've got a decent MIPS setup here and a hacked up qemu for having a
> > bit more detail on alignment traps. I'll see if I can reproduce any
> > issues.
>
> I suspect problems around forcing __packed.
>
> __packed is very dangerous.
>
> If you know the sizes of the objects are correctly sized and placed to
> not have gap-pads, then not using __packed might be better.

Right, most times, uses of packed are wrong and can be avoided. I'll
work through that all and see what can be minimized. In a cursory look
I did see code to the effect of `struct { u32 a; u64 b; } __packed`,
which could trap.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Theo de Raadt
Jason A. Donenfeld  wrote:

> Hey Klemens,
> 
> On Tue, May 26, 2020 at 9:13 AM Klemens Nanni  wrote:
> > I worked with the patches from the wireguard-openbsd repository after
> > version one of this diff on tech@ became a bit old.
> >
> > That was until yesterday;  the kernel would panic due to memory
> > alignment issues in various spots, `amd64# ping6 tunnel-ip.sparc64'
> > would receive a single ICMP6 echo reply from the sparc64 peer before
> > panicing its kernel very early in noise init code.
> >
> > I've had fixed a few bugs and ran into different panics during later
> > code paths.
> >
> > Fortunately, using "rev. 2" of your diff on top of a clean checkout just
> > works so far on sparc64 without any additional patches required - this
> > is a very pleasnt suprise and I can start looking at it under production
> > use cases more thoroughly now.
> 
> That's good news that it's working for you now, but I didn't change
> anything within the last 24 hours (you mentioned "yesterday") that
> would seem related to this. So perhaps don't get too excited just yet.
> I've got a decent MIPS setup here and a hacked up qemu for having a
> bit more detail on alignment traps. I'll see if I can reproduce any
> issues.

I suspect problems around forcing __packed.

__packed is very dangerous.

If you know the sizes of the objects are correctly sized and placed to
not have gap-pads, then not using __packed might be better.

__packed won't sae you from leaking gap bytes, either.

We only run on ILP32 and I32LP64 ABIs, and all of them pack in the classic
way...



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Jason A. Donenfeld
Hey Klemens,

On Tue, May 26, 2020 at 9:13 AM Klemens Nanni  wrote:
> I worked with the patches from the wireguard-openbsd repository after
> version one of this diff on tech@ became a bit old.
>
> That was until yesterday;  the kernel would panic due to memory
> alignment issues in various spots, `amd64# ping6 tunnel-ip.sparc64'
> would receive a single ICMP6 echo reply from the sparc64 peer before
> panicing its kernel very early in noise init code.
>
> I've had fixed a few bugs and ran into different panics during later
> code paths.
>
> Fortunately, using "rev. 2" of your diff on top of a clean checkout just
> works so far on sparc64 without any additional patches required - this
> is a very pleasnt suprise and I can start looking at it under production
> use cases more thoroughly now.

That's good news that it's working for you now, but I didn't change
anything within the last 24 hours (you mentioned "yesterday") that
would seem related to this. So perhaps don't get too excited just yet.
I've got a decent MIPS setup here and a hacked up qemu for having a
bit more detail on alignment traps. I'll see if I can reproduce any
issues.

Jason



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Jason A. Donenfeld
Hey Tobias,

On Tue, May 26, 2020 at 5:28 AM Tobias Heider  wrote:
> > + if (((SIZE_MAX - size) / sizeof(struct wg_aip_io)) < sc->sc_aip_num)
> > + goto error;
>
> I still think those two should return an error.  'goto error' is misleading as
> it doesn't actually set ret != 0.  'error' should probably be renamed
> to 'cleanup' to avoid confusion and ret should be set to ERANGE .

I'll change the label name to be more clear; I agree "error" is
misleading. Returning ERANGE wouldn't be correct though; we still want
the struct to be copied back to the user so that they can use the
information in it to optionally try again (optionally, because maybe
they don't care about info) with a different buffer size.



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Klemens Nanni
On Tue, May 26, 2020 at 08:09:48AM -0600, Theo de Raadt wrote:
> I'll let you know who has sparc64 machines to help out:
> 
> kn was the developer who saw the problem.  jca is also adept
> enough to look at this with you.
I worked with the patches from the wireguard-openbsd repository after
version one of this diff on tech@ became a bit old.

That was until yesterday;  the kernel would panic due to memory
alignment issues in various spots, `amd64# ping6 tunnel-ip.sparc64'
would receive a single ICMP6 echo reply from the sparc64 peer before
panicing its kernel very early in noise init code.

I've had fixed a few bugs and ran into different panics during later
code paths.

Fortunately, using "rev. 2" of your diff on top of a clean checkout just
works so far on sparc64 without any additional patches required - this
is a very pleasnt suprise and I can start looking at it under production
use cases more thoroughly now.

playground$ arch -s ; doas ifconfig wg0
sparc64
wg0: flags=8043 mtu 1420
index 5 priority 0 llprio 3
wgport 2345
wgpubkey mrtNB07tzEJKyJDvhaov7QYt487BXLK3hnnZB+pDIhM=
wgpeer 9xuyga6Z/W7l/vnLIl67PiRcf57VujfoqpIH5VArpR4=
wgendpoint 2001:xxx 2345
tx: 701892, rx: 13283708
last handshake: 29 seconds ago
wgaip ::/0
groups: wg
inet6 2001:db8::1 prefixlen 126



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Theo de Raadt
I'll let you know who has sparc64 machines to help out:

kn was the developer who saw the problem.  jca is also adept
enough to look at this with you.



Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Tobias Heider
On Tue, May 26, 2020 at 07:39:01PM +1000, Matt Dunwoodie wrote:
> Hi tech,
> 
> After some feedback and comments, we've addressed the concerns, and
> fixed a few things from our side too. Overall the structure is familiar
> with no major changes, so any prior readings mostly carry over.
> 
> This is a quick summary of the changes. Detailed changes can be found
> on the out-of-tree repo or we can elaborate if more detail is needed.
> 
> * Allow non-root to read WireGuard network configuration of the
>   interface.
>   - This allows users to help debug issues without
> escalating to root.
> 
> * Removal of wgconf option from ifconfig.
>   - This option was not mature enough for inclusion into base. Upon
> reflection, this behaviour probably needs further consideration
> about how to integrate into ifconfig.
> 
> * Fix ioctl interface.
>   - The linked list setup was fragile, complicated, and broken, so
> we've gone with something much simpler, and user easy to program
> for, using a boring ioctl pattern that's been tried successfully
> elsewhere already.
> 
> * Limit the padding of the packet to the MTU.
>   - Fits the protocol spec and prevents unneccessary fragmentation.
> 
> * Align handshake/identity locks and timer semantics with spec and other
>   implementations.
>   - This avoids any subtle edge cases that may occur.
> 
> Here are the changes that shouldn't need any explanation:
> 
> * Updated documentation based on feedback from tech and others.
> * Maintain order of peers when adding/deleting  .
> * Rename noise_alloc to noise_upcall.
> * Remove wg_timers_fn indirection.
> 
> Minor formatting changes may also be present.
> 
> Once someone picks this up, we're more than happy to help integrate
> this and maintain it in-tree.
> 
> Signed
> Matt Dunwoodie

Hi Matt,

just repeating what I commented yesterday for the new diff to make sure
it isn't overlooked.

> +int
> +wg_ioctl_get(struct wg_softc *sc, struct wg_data_io *data)
> +{
> + struct wg_interface_io  *iface_p, iface_o;
> + struct wg_peer_io   *peer_p, peer_o;
> + struct wg_aip_io*aip_p;
> +
> + struct wg_peer  *peer;
> + struct wg_aip   *aip;
> +
> + size_t   size, peer_count, aip_count;
> + int  ret = 0, is_suser = suser(curproc) == 0;
> +
> + size = sizeof(struct wg_interface_io);
> + if (data->wgd_size < size && !is_suser)
> + goto ret_size;
> +
> + iface_p = data->wgd_interface;
> + bzero(_o, sizeof(iface_o));
> +
> + rw_enter_read(>sc_lock);
> +
> + if (sc->sc_udp_port != 0) {
> + iface_o.i_port = ntohs(sc->sc_udp_port);
> + iface_o.i_flags |= WG_INTERFACE_HAS_PORT;
> + }
> +
> + if (sc->sc_udp_rtable != 0) {
> + iface_o.i_rtable = sc->sc_udp_rtable;
> + iface_o.i_flags |= WG_INTERFACE_HAS_RTABLE;
> + }
> +
> + if (!is_suser)
> + goto out;
> +
> + if (noise_local_keys(>sc_local, iface_o.i_public,
> + iface_o.i_private) == 0) {
> + iface_o.i_flags |= WG_INTERFACE_HAS_PUBLIC;
> + iface_o.i_flags |= WG_INTERFACE_HAS_PRIVATE;
> + }
> +
> + if (((SIZE_MAX - size) / sizeof(struct wg_peer_io)) < sc->sc_peer_num)
> + goto error;
> + size += sizeof(struct wg_peer_io) * sc->sc_peer_num;
> + if (((SIZE_MAX - size) / sizeof(struct wg_aip_io)) < sc->sc_aip_num)
> + goto error;

I still think those two should return an error.  'goto error' is misleading as
it doesn't actually set ret != 0.  'error' should probably be renamed
to 'cleanup' to avoid confusion and ret should be set to ERANGE .

> + size += sizeof(struct wg_aip_io) * sc->sc_aip_num;
> + if (data->wgd_size < size)
> + goto error;
> +
> + peer_count = 0;
> + peer_p = _p->i_peers[0];
> + WG_PEERS_FOREACH(peer, sc) {
> + bzero(_o, sizeof(peer_o));
> + peer_o.p_flags = WG_PEER_HAS_PUBLIC;
> + peer_o.p_protocol_version = 1;
> +
> + if (noise_remote_keys(>p_remote, peer_o.p_public,
> + peer_o.p_psk) == 0)
> + peer_o.p_flags |= WG_PEER_HAS_PSK;
> +
> + if (wg_timers_get_persistent_keepalive(>p_timers,
> + _o.p_pka) == 0)
> + peer_o.p_flags |= WG_PEER_HAS_PKA;
> +
> + if (wg_peer_get_sockaddr(peer, _o.p_sa) == 0)
> + peer_o.p_flags |= WG_PEER_HAS_ENDPOINT;
> +
> + mtx_enter(>p_counters_mtx);
> + peer_o.p_txbytes = peer->p_counters_tx;
> + peer_o.p_rxbytes = peer->p_counters_rx;
> + mtx_leave(>p_counters_mtx);
> +
> + wg_timers_get_last_handshake(>p_timers,
> + _o.p_last_handshake);
> +
> + aip_count = 0;
> + aip_p = _p->p_aips[0];
> + LIST_FOREACH(aip, >p_aip, a_entry) {
> + if 

Re: WireGuard patchset for OpenBSD, rev. 2

2020-05-26 Thread Jason A. Donenfeld
Hey tech@,

A few things I thought I should add to our v2 revision:

First, the improvements we've made in the last few weeks have been
pretty substantial, and we've now got a much more faithful protocol
implementation. I've been running this on a few high traffic servers,
and I'll probably move demo.wireguard.com over to it soonish. The ioctl
stuff has also been cleaned up and finalized now.

It's this last point that I'm quite happy about: the latest
wireguard-tools package in ports now fully supports this patch. So that
means after patching your kernel you can easily run:

# pkg_add -Dsnap wireguard-tools

And it'll interface with this implementation like usual, providing wg(8)
and wg-quick(8). Go programmers will also be happy to learn that Matt
Layher has added support for the ioctl interface to his wgctrl-go
project.

As you might have surmised, I'm very interested in getting expanded
testing of this patch. We're still very early in the 6.8 cycle. I'm
wondering if it would make sense to check this v2 revision into cvs
_now_, which will then make it included in -current, so that folks can
test and use this easily. Then, Matt and I can continue to develop and
refine this in one of two ways:

  - Send (bi-?)monthly patches to tech@ with fixes we've been working
on.
  - Enable both of our commit bits for that part of the tree, and we'll
push patches directly.

Either one works, or maybe you have a third idea for that. I'm fairly
committed to working on this full time to get it perfect for 6.8.

The other thing I've been exploring is re-licensing the wireguard-tools
package as MIT or ISC in case we decide at some later point down the
road (not now) that maybe wg(8) would fit well in the base system.

Anyway, please go forth and test this! And thanks a lot for the feedback
from our v1. Looking forward to the same on v2.

Jason