On 2022-05-15, Jason McIntyre <[email protected]> wrote:
> On Sat, May 14, 2022 at 09:14:36PM -0000, Stuart Henderson wrote:
>> On 2022-05-14, Georg Pfuetzenreuter <[email protected]> wrote:
>> > pppoe(4) already has a section on this, possibly this could be used as a
>> > start.
>>
>> It's not a great start really. Mixes up information about a method to
>> set the pppoe MTU to 1500 (RFC4638) and using scrub, doesn't describe
>> the problem (says "causing conflict" but this isn't very meaningful
>> or really correct), and points at nonexistent "more information on MTU,
>> MSS and NAT" as this isn't in pf.conf(5).
>>
>>
>
> hi.
>
> if there are issues in that text, feel free to suggest how to improve
> it.
>
> - mixing mtu to 1500 and scrub: well, both concern issues with mtu. why
> wouldn't they be together in there?

They're related, but one avoids the problem in the first place (which
may or may not work, depending on the ISP and backhaul network) and the
other works around problems encountered as a result (due to
misconfiguration of other people's networks).

Putting them together in one large section isn't so bad for pppoe, though
it already makes it harder to distinguish the two. But as a base for
text relating to other interface types, the RFC 4638 bits aren't
relevant at all.

> - "causing conflict": feel free to be more specific. it's not something
> i have knowledge of

outline:
- client <> router is on ethernet and can pass packets of 1500 bytes
  (or even larger)
- router <> "the internet" can sometimes carry 1500 byte packets, but
  certain types of connection can only pass smaller packets, e.g. 1492
  bytes with standard pppoe. some ISPs have tougher restrictions
  (either outright, or "work but don't work _well_" if you go above
  some other size)
- router <> "sites accessed by tunnel/vpn over the internet" has an
  extra header inserted in packets, further reducing the available size
  for packets (usually 1420 bytes for wg(4), though it can be less if
  it's carried over a more restrictive internet connection than usual;
  other sizes for other types of tunnel/vpn)
- website (or other host "on the internet") <> "the internet" can
  typically send packets of 1500 bytes

so the two endpoints of a TCP connection (say, client and website)
can send 1500 byte packets to their immediate upstream. but the path
between them (router/ISP/internet/vpn/whatever) can only carry smaller
packets.

clear so far?
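
to put numbers on the outline: the largest packet that fits end to end
is simply the smallest MTU of any hop. a rough sketch in python; the
hop names and the list layout are made up for illustration, only the
sizes 1500, 1492 and 1420 come from the outline above.

```python
# Hypothetical hops; names are illustrative, sizes are from the outline.
hops = [
    ("client <> router (ethernet)", 1500),
    ("router <> internet (pppoe)", 1492),
    ("website <> internet", 1500),
]

def path_mtu(hops):
    """Largest packet that fits through every hop: the smallest hop MTU."""
    return min(mtu for _name, mtu in hops)

print(path_mtu(hops))                                   # 1492
print(path_mtu(hops + [("router <> vpn (wg)", 1420)]))  # 1420
```

neither endpoint sees that smaller number on its own interface, which
is the root of the problem.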
the size of packets which can be carried on a particular network
interface is "the MTU" of that interface. this defaults to the largest
size the hardware can handle or 1500, whichever is less.

for TCP there is a negotiation at connection setup between the two
sides. each looks at the MTU of the route to reach the other address
(which defaults to that of the network interface used to reach it),
subtracts the IP and TCP header sizes, and calls the result the MSS,
"maximum segment size". each tells the other side its idea of MSS (in
the TCP SYN packet) and the lower of the two is used for the connection
(so packets are capped at that size).
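
as a worked example of that arithmetic, here is a small python sketch.
it assumes IPv4 with no IP or TCP options, so 20 bytes of IP header
plus 20 bytes of TCP header; the function names are just for
illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def mss_for_mtu(mtu, ip_header=IPV4_HEADER):
    """MSS an endpoint would advertise for a route with this MTU."""
    return mtu - ip_header - TCP_HEADER

def effective_mss(mtu_a, mtu_b):
    """Both sides advertise an MSS; the lower one caps the connection."""
    return min(mss_for_mtu(mtu_a), mss_for_mtu(mtu_b))

print(mss_for_mtu(1500))          # 1460
print(mss_for_mtu(1492))          # 1452
print(effective_mss(1500, 1500))  # 1460
```

two ethernet endpoints agree on 1460, i.e. 1500 byte packets, even
though a pppoe hop in the middle would need 1452 or less.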
this is fine where the whole path can cope with the same sized packets,
but if not then a router on the path must either split the oversized
packet into fragments (much slower than simply forwarding it, involving
use of the router cpu which is usually fairly weedy) or send a
"fragmentation needed" ICMP message and rely on the other side to do
it. (the common case is for TCP connections to be generated with
packets flagged as "don't fragment" because the endpoints want to know
about the issue so they can adapt to it).

in the best case, the relevant endpoint (e.g. client or website)
receives that message and acts on it by reducing the size of packets
it then transmits. there's still some overhead from detecting the
oversized packet and reacting to it but things "work".

in the worst case, those ICMP messages don't reach the relevant
endpoint. (various possible reasons. maybe a misconfigured firewall
blocks all ICMP. maybe there's some link numbered on private addresses
in the network path and the frag-needed message was sent from a private
address and blocked by a firewall. maybe some loadbalancing or
queueing or icmp-packets-per-second limit got in the way. lots of
options).

so in that case the endpoint sending the oversize packet doesn't
know it must reduce packet size, and the packet doesn't make it
through.
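
the best and worst cases can be sketched as a toy simulation. this is
not how any real stack is written; the function, its parameters and
the retry limit are all hypothetical, it only models "router reports
its MTU, sender shrinks" versus "report never arrives".

```python
def send_with_pmtud(packet_size, hop_mtus, icmp_delivered=True, max_tries=5):
    """Return the packet size that finally got through, or None if it never did."""
    size = packet_size
    for _ in range(max_tries):
        # First hop on the path whose MTU is too small for this packet, if any.
        too_small = next((m for m in hop_mtus if m < size), None)
        if too_small is None:
            return size   # packet fitted through every hop
        if not icmp_delivered:
            continue      # sender never hears about it and retries the same size
        size = too_small  # "fragmentation needed" reports that hop's MTU
    return None

print(send_with_pmtud(1500, [1500, 1492, 1420]))                   # 1420
print(send_with_pmtud(1500, [1500, 1492], icmp_delivered=False))   # None
```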
(it's actually worse with a standard IPsec tunnel because the MTU is
that of the interface carrying the network route, usually the default
route, so in that case it also affects connections where the VPN is
run directly on an endpoint, not just where the VPN is handled on a
separate router).

anyway. when a rule with "scrub (max-mss XYZ)" is matched by a
TCP SYN packet, PF inspects the maximum segment size; if it's higher
than XYZ, it modifies the packet and sets it to XYZ instead. the effect
is that TCP packets are kept within the size which can actually be
transmitted across the network and so the problem is avoided.
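
to illustrate the idea (this is not PF's code), here is a python sketch
that walks the TCP options of a SYN and lowers the MSS option if it
exceeds a limit. the function name is made up, and it ignores the TCP
checksum, which a real implementation would also have to fix up after
changing the packet.

```python
import struct

TCPOPT_EOL, TCPOPT_NOP, TCPOPT_MSS = 0, 1, 2  # standard TCP option kinds

def clamp_mss(options: bytes, max_mss: int) -> bytes:
    """Return the TCP options with the MSS option lowered to max_mss if larger."""
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == TCPOPT_EOL:
            break                  # end of option list
        if kind == TCPOPT_NOP:
            i += 1                 # single-byte padding option
            continue
        if i + 1 >= len(out):
            break                  # truncated option, stop
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break                  # malformed option, stop
        if kind == TCPOPT_MSS and length == 4:
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > max_mss:
                struct.pack_into("!H", out, i + 2, max_mss)
        i += length
    return bytes(out)

# MSS 1460, NOP, NOP, SACK-permitted
syn_options = bytes([2, 4]) + struct.pack("!H", 1460) + bytes([1, 1, 4, 2])
clamped = clamp_mss(syn_options, 1440)
print(struct.unpack_from("!H", clamped, 2)[0])  # 1440
```

an MSS that is already at or below the limit is left alone, which
matches "reduce if higher" rather than "force to this value".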
(elephant in the room: non TCP packets. there's often no handshake
mechanism like TCP with MSS negotiation, so the only real options are to
keep packets smaller or to do some specific probing to see which packet
sizes make it through. this is usually either handled individually in
some way or other by those protocols which run across the internet, or
they just ignore it and break sometimes).
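
the "specific probing" option could look roughly like the python sketch
below: a binary search over probe sizes. gets_through() is a
hypothetical stand-in for "send a don't-fragment probe of this size and
see whether an answer comes back", and the 576..1500 range is just an
assumed search window, not something any particular protocol mandates.

```python
def probe_path_mtu(gets_through, low=576, high=1500):
    """Largest size in [low, high] for which gets_through(size) is true.

    Assumes size `low` gets through and that if a size fits,
    every smaller size fits too.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if gets_through(mid):
            low = mid        # mid fits, search the larger half
        else:
            high = mid - 1   # mid is too big, search the smaller half
    return low

path_limit = 1420  # pretend the path only carries 1420 byte packets
print(probe_path_mtu(lambda size: size <= path_limit))  # 1420
```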
> - "more information in pf.conf": yes there is information in pf.conf on
> mtu, mss, and nat, including the syntax for using them. again, why
> wouldn't we point people there?

About MTU, pf.conf(5) mostly talks in terms of fragment handling on a
machine running PF. There's a reference to path MTU discovery when
talking about IPv6, but there's no introduction for somebody who doesn't
know what the problems are.
The sum total of text about MSS is
"
max-mss number
Enforces a maximum segment size (MSS) for matching TCP packets.
"
this is not even clear about what the option does (it's actually
"reduces the MSS on TCP SYN packets if it is higher than the max-mss
value", but with the word "enforces" it could be read as "drops
the packet if MSS is not equal to this value") and says nothing about
the relation to MTU or why one might use it.

"If the maximum segment size (MSS) on matching TCP packets exceeds
<number>, modify the packet and set it to <number>" would be better than
what we have, but this definitely feels like it needs more explanation
as to why it might be needed.
Arguably pf.conf would be the wrong place to describe TCP fundamentals,
but it's probably the one place where people running into the problem on
non-pppoe interfaces are most likely to find it. (I think the logic of
including this in pppoe(4) originally was that for PF people are more
likely to copy-and-paste/modify sample configs rather than actually
read the PF manual because it's so long, and they'll _probably_ be more
likely to see it in pppoe.)
> i'm happy to try and rework the text if you think it can be improved.
info about "scrub max-mss" probably wants to go in pf.conf and just
referenced from pppoe (and maybe other places)
: MTU/MSS ISSUES
: Problems can arise on machines with private IPs connecting to the Internet
: via a machine running both Network Address Translation (NAT) and pppoe.
the above is totally wrong: it has nothing to do with NAT and private
IPs. the same applies to routed address blocks going through a
smaller-MTU network.
: Standard Ethernet uses a maximum transmission unit (MTU) of 1500 bytes,
: whereas PPPoE mechanisms need a further 8 bytes of overhead. This leaves a
: maximum MTU of 1492. pppoe sets the MTU on its interface to 1492 as a
: matter of course. However, machines connecting on a private LAN will still
: have their MTUs set to 1500, causing conflict. Using a packet filter, the
: maximum segment size (MSS) can be set (clamped) to the required value. The
: following rule in pf.conf(5) would set the MSS to 1440:
i'd prefer splitting up something like
"PPPOE AND MTU/MSS
PPPoE has an 8 byte header. When run over a network interface with the
standard Ethernet maximum transmission unit (MTU) of 1500 bytes, this
reduces the maximum available MTU to 1492. pppoe(4) sets the default
MTU to this value."
then something briefly explaining the issue (if it can be condensed)
and pointing to an expanded pf.conf(5) bit about max-mss which could be
referenced from other places too, and a separate bit about RFC 4638
negotiation, maybe something like
"MTU/MSS NEGOTIATION
When using a pppoedev configured for a higher MTU ("jumbo
frames"), the MTU for the pppoe(4) device can also be raised.
In this case pppoe(4) attempts to negotiate the higher size with
the other PPPoE endpoint using the RFC 4638 protocol.
This can allow standard Ethernet packet sizes (1500 bytes) to be
carried over PPPoE. However, RFC 4638 negotiation only takes into
account the MTU configured on[...as per original...]"
(while I'm thinking of pppoe(4), we've also got to do something with
"inet 0.0.0.0 255.255.255.255 NONE / dest 0.0.0.1", which just
doesn't work reliably - the 0.0.0.1 needs to be set *before*
bringing the interface up - "inet 0.0.0.0 255.255.255.255 0.0.0.1"
is better)