Re: Mixed MTU hosts on a network

2018-04-14 Thread Jason A. Donenfeld
Hey Roman,

I've just tried a few ways of replicating your setup, and I can't seem
to reproduce the bug, either with the new code or old. The results you
mention are surprising too, since WireGuard or not, TCP is supposed to
negotiate the lowest common MSS. I wonder if some strange iptables
rules are getting in the way and confusing things?

Jason
___
WireGuard mailing list
WireGuard@lists.zx2c4.com
https://lists.zx2c4.com/mailman/listinfo/wireguard


Re: Mixed MTU hosts on a network

2018-04-14 Thread Roman Mamedov
On Sat, 14 Apr 2018 16:45:32 +0200
"Jason A. Donenfeld"  wrote:

> In this case, WireGuard seems to be doing the right thing. Think you
> could come up with some minimal test that exhibits the behavior you're
> seeing?

I now remember the problem in more detail. It was not with MTU 1412 on both
sides; it appeared when mixing WG MTU 1412 on the PPPoE-connected machine with
WG MTU 1420 on the other side (which has the full 1500 underlying MTU).

Here I posted about it with some tcpdumps included:
https://lists.zx2c4.com/pipermail/wireguard/2018-March/002537.html

With 1420 on the "full MTU" side, the "PPPoE" side had to set 1408 WG MTU for
things to work properly, not 1412 as would theoretically fit into its PPPoE.

I'll post an update if I come up with a short and simple reproducer sequence.

Setting 1412 on both sides seems to work fine from more testing just now.

-- 
With respect,
Roman


Re: Mixed MTU hosts on a network

2018-04-14 Thread Jason A. Donenfeld
Hi Roman,

That's strange; I'm unable to reproduce what you've described:

[+] NS1: ip link set wg0 mtu 1412
[+] NS2: ip link set wg0 mtu 1412
[+] NS1: wg set wg0 peer QXloTaPOwUTzqFElVLSD0vBc4sxjyoKtPBSaTkZHokY= endpoint 127.0.0.1:2
[+] NS2: wg set wg0 peer X0p7+UWc4wjaAmT73xAEuXLY80I6Gv8vTg6KwFHCPGs= endpoint 127.0.0.1:1
[+] NS0: iptables -A INPUT -m length --length 1473 -j DROP
[+] NS2: ping -c 1 -W 1 -s 1384 192.168.241.1
PING 192.168.241.1 (192.168.241.1) 1384(1412) bytes of data.
1392 bytes from 192.168.241.1: icmp_seq=1 ttl=64 time=0.752 ms

--- 192.168.241.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.752/0.752/0.752/0.000 ms

In this case, WireGuard seems to be doing the right thing. Think you
could come up with some minimal test that exhibits the behavior you're
seeing?

Jason


Re: Mixed MTU hosts on a network

2018-04-14 Thread Roman Mamedov
On Sat, 14 Apr 2018 16:15:07 +0200
"Jason A. Donenfeld"  wrote:

> Hi Roman,
> 
> I answered this in my first email to you, which perhaps got lost in
> the mix of emails, so I'll quote the relevant part:
> 
> > 2) When we pad the packet payload. In this case, we pad it to the
> > nearest multiple of 16, but we don't let it exceed the device MTU.
> > This is skb_padding in send.c. This behavior seems like the bug in
> > your particular case, since what matters here is the route's MTU, not
> > the device MTU. For full 1412 size packets, the payload is presumably
> > being padded to 1424, since that's still less than the device MTU. In
> > order to test this theory, try setting your route MTU, as you've
> > described in your first email, to 1408 (which is a multiple of 16). If
> > this works, let me know, as it will be good motivation for fixing
> > skb_padding. If not, then it means there's a problem elsewhere to
> > investigate too.
> 
> In short, 1408 is a multiple of 16, so it didn't get rounded up,
> whereas 1412 got rounded up to 1424.

I got that, but that still seemed to be talking about the problem with route
MTUs.

But what if I don't touch any route MTUs at all, and only set the WG device
MTU to 1412? In my further experiments that didn't work well either, causing
weird one-directional issues, and only 1408 worked.

So, is it possible to fix the padding so that 1412 can be used as the WG device
MTU on an underlying MTU of 1492? Otherwise, shouldn't there be a warning
somewhere in the docs not to just choose the largest fitting MTU according to
[1], but also to round the result down to the nearest multiple of 16?

[1] https://www.mail-archive.com/wireguard@lists.zx2c4.com/msg01856.html
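
The calculation being discussed can be sketched as follows. The overhead
figures are an assumption based on the usual WireGuard data message layout
(outer IP header, UDP header, 16-byte message header, 16-byte authentication
tag); they are not stated in this thread, and the function names are
illustrative only.

```python
# Assumed per-packet overhead of a WireGuard data message (not from this thread).
IP_HEADER = {4: 20, 6: 40}   # outer IPv4 / IPv6 header
UDP_HEADER = 8
WG_HEADER = 16               # 4-byte type, 4-byte receiver index, 8-byte counter
AUTH_TAG = 16                # authentication tag appended to the payload


def largest_wg_mtu(underlying_mtu: int, ip_version: int = 6) -> int:
    """Largest tunnel MTU whose encapsulated packets fit the underlying MTU."""
    overhead = IP_HEADER[ip_version] + UDP_HEADER + WG_HEADER + AUTH_TAG
    return underlying_mtu - overhead


def round_down_16(mtu: int) -> int:
    """Round an MTU down to a multiple of 16, as suggested above."""
    return mtu & ~15


for underlying in (1500, 1492):
    fit = largest_wg_mtu(underlying)
    print(f"underlying {underlying}: largest fit {fit}, rounded down {round_down_16(fit)}")
```

Under these assumptions, the 1492-byte PPPoE link gives 1412 as the largest
fit and 1408 after rounding down, which are the two values compared in this
thread.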

-- 
With respect,
Roman


Re: Mixed MTU hosts on a network

2018-04-14 Thread Jason A. Donenfeld
Hi Roman,

I answered this in my first email to you, which perhaps got lost in
the mix of emails, so I'll quote the relevant part:

> 2) When we pad the packet payload. In this case, we pad it to the
> nearest multiple of 16, but we don't let it exceed the device MTU.
> This is skb_padding in send.c. This behavior seems like the bug in
> your particular case, since what matters here is the route's MTU, not
> the device MTU. For full 1412 size packets, the payload is presumably
> being padded to 1424, since that's still less than the device MTU. In
> order to test this theory, try setting your route MTU, as you've
> described in your first email, to 1408 (which is a multiple of 16). If
> this works, let me know, as it will be good motivation for fixing
> skb_padding. If not, then it means there's a problem elsewhere to
> investigate too.

In short, 1408 is a multiple of 16, so it didn't get rounded up,
whereas 1412 got rounded up to 1424.
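
The rounding can be sketched in Python. This is only an illustration of the
arithmetic described above, not the kernel code; the helper names are made up,
and the device MTU of 1420 is the default mentioned elsewhere in this thread.

```python
PADDING_MULTIPLE = 16  # payloads are padded to a multiple of 16 bytes


def round_up(length: int, multiple: int = PADDING_MULTIPLE) -> int:
    """Round length up to the next multiple (multiple must be a power of two)."""
    return (length + multiple - 1) & ~(multiple - 1)


def padded_length(length: int, device_mtu: int) -> int:
    """Padded size of one packet, never allowed to exceed the device MTU."""
    return min(round_up(length), device_mtu)


for size in (1408, 1412):
    print(size, "->", round_up(size), "uncapped,",
          padded_length(size, 1420), "with device MTU 1420")
```

If this reading is right, a 1412-byte packet on a device with MTU 1420 is
capped at 1420 rather than 1424, which is still larger than 1412, while a
1408-byte packet is left alone.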

Jason


Re: Mixed MTU hosts on a network

2018-04-14 Thread Roman Mamedov
On Sat, 14 Apr 2018 15:16:56 +0200
"Jason A. Donenfeld"  wrote:

> Hi Roman,
> 
> This commit should fix it. It now has a unit test too so that we don't
> hit this issue again. Thanks for reporting it in such detail.
> 
> https://git.zx2c4.com/WireGuard/commit/?id=a88a067d5477f877003d3703bb3b95cb4e94bc46
> 
> Let me know if that fixes it on your end.
> 
> Jason

Thanks! I didn't get a chance to test it yet.

Leaving route MTUs aside, did you look into why the interface MTU of 1412
behaves erratically (while by all calculations it should just fit into 1492
underlying PPPoE MTU), with only 1408 working reliably? Is it also because of
the padding?

-- 
With respect,
Roman


Re: Mixed MTU hosts on a network

2018-04-14 Thread Jason A. Donenfeld
Hi Roman,

This commit should fix it. It now has a unit test too so that we don't
hit this issue again. Thanks for reporting it in such detail.

https://git.zx2c4.com/WireGuard/commit/?id=a88a067d5477f877003d3703bb3b95cb4e94bc46

Let me know if that fixes it on your end.

Jason


Re: Mixed MTU hosts on a network

2018-04-13 Thread Jason A. Donenfeld
On Sat, Apr 14, 2018 at 03:38:46AM +0200, Jason A. Donenfeld wrote:
> 2) When we pad the packet payload. In this case, we pad it to the
> nearest multiple of 16, but we don't let it exceed the device MTU.
> This is skb_padding in send.c. This behavior seems like the bug in
> your particular case, since what matters here is the route's MTU, not
> the device MTU. For full 1412 size packets, the payload is presumably
> being padded to 1424, since that's still less than the device MTU. In
> order to test this theory, try setting your route MTU, as you've
> described in your first email, to 1408 (which is a multiple of 16). If
> this works, let me know, as it will be good motivation for fixing
> skb_padding. If not, then it means there's a problem elsewhere to
> investigate too.
> 
> I'm CC'ing Luis on this email, as he was working on the MTU code a while back.

I'm still playing with this, but something like the following might fix
the issue, if you're interested in playing a bit.

=~=~=~=~=~=~=

diff --git a/src/device.c b/src/device.c
index 1614d61..3d18368 100644
--- a/src/device.c
+++ b/src/device.c
@@ -120,6 +120,7 @@ static netdev_tx_t xmit(struct sk_buff *skb, struct net_device *dev)
 	struct sk_buff *next;
 	struct sk_buff_head packets;
 	sa_family_t family;
+	u32 mtu;
 	int ret;

 	if (unlikely(skb_examine_untrusted_ip_hdr(skb) != skb->protocol)) {
@@ -142,6 +143,8 @@ static netdev_tx_t xmit(struct sk_buff *skb, struct net_device *dev)
 		goto err_peer;
 	}

+	mtu = dst_mtu(skb_dst(skb)) ?: skb->dev->mtu;
+
 	__skb_queue_head_init(&packets);
 	if (!skb_is_gso(skb))
 		skb->next = NULL;
@@ -168,6 +171,8 @@ static netdev_tx_t xmit(struct sk_buff *skb, struct net_device *dev)
 		 */
 		skb_dst_drop(skb);

+		PACKET_CB(skb)->mtu = mtu;
+
 		__skb_queue_tail(&packets, skb);
 	} while ((skb = next) != NULL);

diff --git a/src/queueing.h b/src/queueing.h
index d5948f3..c507536 100644
--- a/src/queueing.h
+++ b/src/queueing.h
@@ -46,6 +46,7 @@ struct packet_cb {
 	u64 nonce;
 	struct noise_keypair *keypair;
 	atomic_t state;
+	u32 mtu;
 	u8 ds;
 };
 #define PACKET_PEER(skb) (((struct packet_cb *)skb->cb)->keypair->entry.peer)
diff --git a/src/send.c b/src/send.c
index dddcc0b..e3b1ffd 100644
--- a/src/send.c
+++ b/src/send.c
@@ -116,11 +116,11 @@ static inline unsigned int skb_padding(struct sk_buff *skb)
 	 * isn't strictly neccessary, but it's better to be cautious here, especially
 	 * if that code ever changes.
 	 */
-	unsigned int last_unit = skb->len % skb->dev->mtu;
+	unsigned int last_unit = skb->len % PACKET_CB(skb)->mtu;
 	unsigned int padded_size = (last_unit + MESSAGE_PADDING_MULTIPLE - 1) & ~(MESSAGE_PADDING_MULTIPLE - 1);

-	if (padded_size > skb->dev->mtu)
-		padded_size = skb->dev->mtu;
+	if (padded_size > PACKET_CB(skb)->mtu)
+		padded_size = PACKET_CB(skb)->mtu;
 	return padded_size - last_unit;
 }
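
To see what the change is meant to do, the patched arithmetic can be modelled
in Python. This is a rough userspace sketch, not kernel code: the function
names only echo the C identifiers, a route MTU of 0 stands in for "no route
MTU available", and the 1420/1412 pair is taken from the setup discussed in
this thread.

```python
MESSAGE_PADDING_MULTIPLE = 16


def effective_mtu(route_mtu: int, device_mtu: int) -> int:
    """Model of `dst_mtu(...) ?: dev->mtu`: use the route MTU unless it is 0."""
    return route_mtu or device_mtu


def skb_padding(length: int, mtu: int) -> int:
    """Bytes of padding to append, following the arithmetic in skb_padding."""
    last_unit = length % mtu
    padded_size = (last_unit + MESSAGE_PADDING_MULTIPLE - 1) & ~(MESSAGE_PADDING_MULTIPLE - 1)
    padded_size = min(padded_size, mtu)
    return padded_size - last_unit


DEVICE_MTU, ROUTE_MTU = 1420, 1412  # values discussed in the thread

for length in (1400, 1410, 1412):
    before = length + skb_padding(length, DEVICE_MTU)
    after = length + skb_padding(length, effective_mtu(ROUTE_MTU, DEVICE_MTU))
    print(f"{length}: device-MTU cap -> {before}, route-MTU cap -> {after}")
```

In this model, capping at the device MTU lets 1410- and 1412-byte packets grow
to 1420, whereas capping at the route MTU keeps every packet at or below 1412.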



Re: Mixed MTU hosts on a network

2018-04-13 Thread Jason A. Donenfeld
Hi Roman,

I think that your idea of setting a route-based MTU _should_ work, and
it seems like a bug if it isn't working. There are two places in
WireGuard which directly touch the MTU:

1) When we split GSO superpackets up into normal sized packets. This
code is supposed to be aware of the per-route MTU you've set, so it
shouldn't be a problem. This is the call to skb_gso_segment in
device.c.

2) When we pad the packet payload. In this case, we pad it to the
nearest multiple of 16, but we don't let it exceed the device MTU.
This is skb_padding in send.c. This behavior seems like the bug in
your particular case, since what matters here is the route's MTU, not
the device MTU. For full 1412 size packets, the payload is presumably
being padded to 1424, since that's still less than the device MTU. In
order to test this theory, try setting your route MTU, as you've
described in your first email, to 1408 (which is a multiple of 16). If
this works, let me know, as it will be good motivation for fixing
skb_padding. If not, then it means there's a problem elsewhere to
investigate too.

I'm CC'ing Luis on this email, as he was working on the MTU code a while back.

Regards,
Jason


Re: Mixed MTU hosts on a network

2018-03-26 Thread Luis Ressel
On Fri, 16 Mar 2018 14:25:47 +0500
Roman Mamedov  wrote:

> What helps, is only reducing MTU of the entire wg0 interface to 1412.
> Then everything works fine. But it doesn't feel optimal to reduce MTU
> of the entire network just because of 1 or 2 hosts. I would rather
> use a couple of those mtu-override routes, if they worked.

Unfortunately, lowering the MTU of the whole tunnel interface is the
only reliable solution right now. Per-peer configurability of MTUs has
been on the project TODO list for a while, so there will be a better
solution some day. I even started to work on this a few months back, but
got sidetracked.

Cheers,
Luis


Re: Mixed MTU hosts on a network

2018-03-16 Thread Roman Mamedov
On Fri, 16 Mar 2018 15:53:43 +0500
Roman Mamedov  wrote:

> But guess what, turns out that didn't work either. Tried both OUTPUT and
> POSTROUTING chains on the "mangle" table, and set-mss all the way down to
> 1220, no matter what, the iperf3 output looked the same as before.

Actually the iptables bit is easy to explain. Even if the initial MSS is forced
to a low value on the sender, it gets negotiated back up to the maximum value
according to the MTU on the receiver (changed both IPs since then):

21:13:38.641531 IP6 fd39:30::f5a8:e923:f8cd:24b5.40052 > fd39:30::e84f:942d:7f93:ddc1.5001: Flags [S], seq 2397878391, win 27200, options [mss 1220,sackOK,TS val 566161815 ecr 0,nop,wscale 9], length 0
21:13:38.641574 IP6 fd39:30::e84f:942d:7f93:ddc1.5001 > fd39:30::f5a8:e923:f8cd:24b5.40052: Flags [S.], seq 1221117548, ack 2397878392, win 26800, options [mss 1352,sackOK,TS val 2726162536 ecr 566161815,nop,wscale 9], length 0
21:13:38.716047 IP6 fd39:30::f5a8:e923:f8cd:24b5.40052 > fd39:30::e84f:942d:7f93:ddc1.5001: Flags [.], ack 1, win 54, options [nop,nop,TS val 566161889 ecr 2726162536], length 0
21:13:38.716444 IP6 fd39:30::f5a8:e923:f8cd:24b5.40052 > fd39:30::e84f:942d:7f93:ddc1.5001: Flags [P.], seq 1341:1605, ack 1, win 54, options [nop,nop,TS val 566161889 ecr 2726162536], length 264
21:13:38.716458 IP6 fd39:30::e84f:942d:7f93:ddc1.5001 > fd39:30::f5a8:e923:f8cd:24b5.40052: Flags [.], ack 1, win 55, options [nop,nop,TS val 2726162611 ecr 566161889,nop,nop,sack 1 {1341:1605}], length 0

So the other side really needs to have a proper MTU set. And the highest working
wg0 MTU on PPPoE turned out to be 1408, not 1412 as I assumed. As for why 1412
also works but only if set on the sender side, I've no explanation for that yet.
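
One possible way to read the numbers in these dumps, sketched in Python: each
side advertises an MSS derived from its own interface MTU, and the segments a
sender emits are bounded by the smaller of its own value and the one the peer
advertised. The header sizes (40-byte IPv6, 20-byte TCP, 12 bytes of TCP
timestamp option) are assumptions of this sketch, and the function names are
illustrative.

```python
IPV6_HEADER = 40
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12  # 10-byte timestamp option plus two NOPs


def advertised_mss(interface_mtu: int) -> int:
    """MSS a host would put in its SYN for a given interface MTU (IPv6)."""
    return interface_mtu - IPV6_HEADER - TCP_HEADER


def data_per_segment(sender_mtu: int, receiver_mtu: int) -> int:
    """Payload bytes per full segment once timestamps are subtracted."""
    mss = min(advertised_mss(sender_mtu), advertised_mss(receiver_mtu))
    return mss - TCP_TIMESTAMP_OPTION


def inner_packet_size(sender_mtu: int, receiver_mtu: int) -> int:
    """Size of a full data packet as handed to the tunnel interface."""
    payload = data_per_segment(sender_mtu, receiver_mtu)
    return payload + TCP_TIMESTAMP_OPTION + TCP_HEADER + IPV6_HEADER


for mtu in (1420, 1412, 1408):
    print(f"wg0 MTU {mtu}: advertised MSS {advertised_mss(mtu)}")
print("sender 1420, receiver 1412:", data_per_segment(1420, 1412), "bytes per segment,",
      inner_packet_size(1420, 1412), "byte packets")
```

Under these assumptions, rewriting the MSS in the sender's own SYN only limits
what the receiver sends back; the sender still fills segments up to the MSS
the receiver advertised, which would explain why the first data segment in the
dump above covers bytes 1 to 1340.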

-- 
With respect,
Roman


Re: Mixed MTU hosts on a network

2018-03-16 Thread Roman Mamedov
On Fri, 16 Mar 2018 10:35:18 +0100
Matthias Ordner  wrote:

> If you only care about TCP connections you could set a different TCP-MSS 
> with an iptables rule.

On Fri, 16 Mar 2018 11:01:51 +0100
Kalin KOZHUHAROV  wrote:

> You may need to pre-shape the packets for the "offenders", e.g.
> 
> ip6tables -t mangle -A POSTROUTING -o wg0 -d WHATEVERHOST -p tcp -m tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1352
> 
> https://www.netfilter.org/documentation/HOWTO/netfilter-extensions-HOWTO-4.html#ss4.7
> 
> O, wait! You talk IPv6...
> 
> ip6tables -t mangle -A POSTROUTING -o wg0 -d fd39:30::250/128 -p tcp -m tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1372

I knew about this option, but wanted to avoid it because it would incur more
overhead (going to iptables for this) and a bit more complexity.

But guess what, turns out that didn't work either. Tried both OUTPUT and
POSTROUTING chains on the "mangle" table, and set-mss all the way down to
1220, no matter what, the iperf3 output looked the same as before. At this
point I thought I'm going crazy or something. :)

It's not just iperf either, trying to send a file with "netcat6" into a
running listener on the other side also failed to transfer data.

Then, almost by accident, I discovered what else helps: reducing the interface
MTU only on the receiver, but by a bit more, to 1408.

So what makes it work is EITHER:

a) set MTU 1412 on wg0 at sender;

OR

b) set MTU 1408 on wg0 at receiver.

...doing both at the same time is not even necessary. Some tcpdumps from the
receiver host are attached to demonstrate (if anyone else thinks I am crazy :).

Now, I can live with just the impacted (PPPoE) hosts having a lower MTU on wg0.

But still the whole thing seems rather weird.

-- 
With respect,
Roman

Receiver mtu 1420, sender mtu 1412, successful transfer:

# tcpdump -i wg0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wg0, link-type RAW (Raw IP), capture size 262144 bytes
15:42:35.027995 IP6 fd39:30::2.42414 > fd39:30::250.5001: Flags [S], seq 4148302601, win 27040, options [mss 1352,sackOK,TS val 2239613851 ecr 0,nop,wscale 9], length 0
15:42:35.028026 IP6 fd39:30::250.5001 > fd39:30::2.42414: Flags [S.], seq 505975510, ack 4148302602, win 26960, options [mss 1360,sackOK,TS val 1473426057 ecr 2239613851,nop,wscale 9], length 0
15:42:35.102517 IP6 fd39:30::2.42414 > fd39:30::250.5001: Flags [.], ack 1, win 53, options [nop,nop,TS val 2239613925 ecr 1473426057], length 0
15:42:35.102772 IP6 fd39:30::2.42414 > fd39:30::250.5001: Flags [.], seq 1:1341, ack 1, win 53, options [nop,nop,TS val 2239613925 ecr 1473426057], length 1340
15:42:35.102785 IP6 fd39:30::250.5001 > fd39:30::2.42414: Flags [.], ack 1341, win 58, options [nop,nop,TS val 1473426131 ecr 2239613925], length 0
15:42:35.102810 IP6 fd39:30::2.42414 > fd39:30::250.5001: Flags [P.], seq 1341:2145, ack 1, win 53, options [nop,nop,TS val 2239613925 ecr 1473426057], length 804
15:42:35.102818 IP6 fd39:30::250.5001 > fd39:30::2.42414: Flags [.], ack 2145, win 64, options [nop,nop,TS val 1473426131 ecr 2239613925], length 0
15:42:35.729846 IP6 fd39:30::250.5001 > fd39:30::2.42162: Flags [F.], seq 1811803733, ack 3749581328, win 56, options [nop,nop,TS val 1473426758 ecr 2239251660,nop,nop,sack 1 {1341:2145}], length 0
15:42:35.804023 IP6 fd39:30::2.42162 > fd39:30::250.5001: Flags [.], ack 1, win 54, options [nop,nop,TS val 2239614627 ecr 1473426758,nop,nop,sack 1 {0:1}], length 0
15:42:36.939584 IP6 fd39:30::2.42414 > fd39:30::250.5001: Flags [F.], seq 2145, ack 1, win 53, options [nop,nop,TS val 2239615763 ecr 1473426131], length 0
15:42:36.939723 IP6 fd39:30::250.5001 > fd39:30::2.42414: Flags [F.], seq 1, ack 2146, win 64, options [nop,nop,TS val 1473427968 ecr 2239615763], length 0
15:42:37.014143 IP6 fd39:30::2.42414 > fd39:30::250.5001: Flags [.], ack 2, win 53, options [nop,nop,TS val 2239615837 ecr 1473427968], length 0
^C
12 packets captured
12 packets received by filter
0 packets dropped by kernel

===

Receiver mtu 1408, sender mtu 1420, successful transfer:

# tcpdump -i wg0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wg0, link-type RAW (Raw IP), capture size 262144 bytes
15:43:23.935508 IP6 fd39:30::2.42442 > fd39:30::250.5001: Flags [S], seq 1011924297, win 27200, options [mss 1360,sackOK,TS val 2239662759 ecr 0,nop,wscale 9], length 0
15:43:23.935541 IP6 fd39:30::250.5001 > fd39:30::2.42442: Flags [S.], seq 1735470303, ack 1011924298, win 26720, options [mss 1348,sackOK,TS val 1473474964 ecr 2239662759,nop,wscale 9], length 0
15:43:24.009867 IP6 fd39:30::2.42442 > fd39:30::250.5001: Flags [.], ack 1, win 54, options [nop,nop,TS val 2239662834 ecr 1473474964], length 0
15:43:24.010192 IP6 fd39:30::2.42442 > fd39:30::250.5001: Flags [.], seq 1:1337, ack 1, win 54, options

Re: Mixed MTU hosts on a network

2018-03-16 Thread Kalin KOZHUHAROV
On Fri, Mar 16, 2018 at 10:25 AM, Roman Mamedov  wrote:
> Hello,
>
> I have a host which is on PPPoE and has 1492 as underlying MTU.
>
> When WireGuard starts by default, it sets MTU of its interface to 1420. All
> TCP connections trying to send a stream of data over the WG interface to that
> host, hang up (I test with iperf3).
>
> My first idea was to override the MTU for this specific host via adding a
> route:
>
> # ip -6 route add fd39:30::250/128 dev wg0 mtu 1412 metric 1
>
> # ip -6 route | grep ^fd39:30
> fd39:30::250 dev wg0  metric 1  mtu 1412
> fd39:30::/64 dev wg0  proto kernel  metric 256
>
> # ip route get fd39:30::250
> fd39:30::250 from :: dev wg0  src fd39:30::2  metric 1  mtu 1412
>
> However, this does not help at all. Even adding the corresponding route on the
> other side. Even using the "mtu lock" keyword instead of just "mtu". I am still
> puzzled why. Any ideas?
>
Isn't it because routing is done by WG itself, based on AllowedIPs, so
the routing table is not considered at all after the packet is handed
to WG?

Those are assumptions of how things work, I haven't looked at the code.

> What helps, is only reducing MTU of the entire wg0 interface to 1412. Then
> everything works fine. But it doesn't feel optimal to reduce MTU of the entire
> network just because of 1 or 2 hosts. I would rather use a couple of those
> mtu-override routes, if they worked.
>
You may need to pre-shape the packets for the "offenders", e.g.

ip6tables -t mangle -A POSTROUTING -o wg0 -d WHATEVERHOST -p tcp -m tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1352

https://www.netfilter.org/documentation/HOWTO/netfilter-extensions-HOWTO-4.html#ss4.7

O, wait! You talk IPv6...

ip6tables -t mangle -A POSTROUTING -o wg0 -d fd39:30::250/128 -p tcp -m tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1372

You can also try setting the route MTU as above and then use "... -j
TCPMSS --clamp-mss-to-pmtu", although it may be more work and/or might
not work.

Cheers,
Kalin.