Re: pair(4) crashes on strict alignment platforms

2018-03-08 Thread David Gwynne
On Thu, Mar 08, 2018 at 07:44:13PM +0100, Stefan Sperling wrote:
> Running the first set of example commands from the pair(4) man page
> crashes the kernel on at least sparc64 and octeon.
> 
>  # ifconfig pair1 rdomain 1 10.1.1.1/24 up
>  # ifconfig pair2 rdomain 2 10.1.1.2/24 up
>  # ifconfig pair1 patch pair2
>  # route -T 1 exec ping 10.1.1.2
> 
> A netcat<->telnet connection from 10.1.1.1 to 10.1.1.2 works.
> 
> It seems the problem only happens with ping, or short packets in general.
> It looks like the crash is happening while processing the icmp echo reply.
> This code in ip_input_if() calls m_pullup() which ends up setting m->m_data
> to an unaligned address:
> 
>   if (m->m_len < sizeof (struct ip) &&
>   (m = *mp = m_pullup(m, sizeof (struct ip))) == NULL) {
>   ipstat_inc(ips_toosmall);
>   goto bad;
>   }
>   ip = mtod(m, struct ip *);
>   if (ip->ip_v != IPVERSION) {  // we crash here because ip is misaligned
> 
> Note that pair(4) has dequeued this mbuf from its send queue and doesn't
> modify it except for resetting the packet header if it exists.
> 
> Trace from sparc64:
> 
> panic: trap type 0x34 (mem address not aligned): pc=11336d4 npc=11336d8 
> pstate=44820006
> Stopped at  db_enter+0x8:   nop
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> *291733   6703  0 0x14000  0x2000  softnet
> trap(40016e6b890, 34, 11336d4, 44820006, 400023ff800, 8848) at trap+0x2e0
> Lslowtrap_reenter(400023ace00, 0, , 1c176d0, 4, 1) at 
> Lslowtrap_reenter+0xf8
> ip_input_if(40016e6bb48, 40016e6bb54, 4, 0, 400023ff800, 8848) at 
> ip_input_if+0x120
> ipv4_input(400023ff800, 18184e8, , 1c176d0, 4, 1) at 
> ipv4_input+0x3c
> ether_input(400023ff800, 400023ace00, 0, 16545e8, , 8848) at 
> ether_input+0xc8
> if_input_process(400021607c0, 40016e6bde0, 131cb20, 1c176d0, 4, 0) at 
> if_input_process+0x11c
> taskq_thread(4000216c080, 40002142fc0, 1758938, 16545e8, 0, 3b9ac800) at 
> taskq_thread+0x6c
> proc_trampoline(0, 0, 0, 0, 0, 0) at proc_trampoline+0x14
> https://www.openbsd.org/ddb.html describes the minimum info required in bug 
> reports.
> Insufficient info makes it difficult to find and fix bugs.
> 

the diff below fixes the panic.

the problem is m_adj doesn't keep m_data updated when removing all
the data from one mbuf in a chain.

in more detail, when we read the ping packet from userland into
the kernel, it's put at the front of an mbuf, which is properly
aligned for an ip packet. pair is an ethernet interface, so to send
it an ethernet header is prepended. because the ip packet is at the
start of an mbuf we allocate a new one for the ethernet header.
that ends up being 6 bytes at the end of a new mbuf, with the end
of the ethernet header properly aligned for an ip packet.

pair then sends the packet back into the network stack, which m_adjs
the ethernet header off the front of the packet. m_adj never deletes
mbufs, it just sets their lengths to 0 if there's no more data left
in an mbuf. because the ethernet header is all thats in that first
mbuf it sets m_len to 0, but doesn't touch m_data.

the ip stack then tries to access the ip header in the mbuf chain.
because the first mbuf has 0 bytes in it it uses m_pulldown to get
to the ip header. m_pulldown goes to a lot of lengths to maintain
the alignment of the payload in an mbuf. because m_adj left m_data
where an ethernet header is, ie, 6 bytes with the end aligned for
a payload), m_pulldown puts the ip header where the ethernet header
was, which is 2 byte aligned.

netcat works because non-raw ip sockets have their headers allocated
with space at the start for link headers.

this diff makes m_adj update m_data in all paths.

another option could be to have m_pullup skip over mbufs with
m_len == 0 before finding the target alignment. or do both.

Index: uipc_mbuf.c
===
RCS file: /cvs/src/sys/kern/uipc_mbuf.c,v
retrieving revision 1.253
diff -u -p -r1.253 uipc_mbuf.c
--- uipc_mbuf.c 16 Jan 2018 19:44:34 -  1.253
+++ uipc_mbuf.c 9 Mar 2018 01:34:05 -
@@ -812,11 +812,12 @@ m_adj(struct mbuf *mp, int req_len)
while (m != NULL && len > 0) {
if (m->m_len <= len) {
len -= m->m_len;
+   m->m_data += m->m_len; /* move alignment */
m->m_len = 0;
m = m->m_next;
} else {
-   m->m_len -= len;
m->m_data += len;
+   m->m_len -= len;
len = 0;
}
}



pair(4) crashes on strict alignment platforms

2018-03-08 Thread Stefan Sperling
Running the first set of example commands from the pair(4) man page
crashes the kernel on at least sparc64 and octeon.

   # ifconfig pair1 rdomain 1 10.1.1.1/24 up
   # ifconfig pair2 rdomain 2 10.1.1.2/24 up
   # ifconfig pair1 patch pair2
   # route -T 1 exec ping 10.1.1.2

A netcat<->telnet connection from 10.1.1.1 to 10.1.1.2 works.

It seems the problem only happens with ping, or short packets in general.
It looks like the crash is happening while processing the icmp echo reply.
This code in ip_input_if() calls m_pullup() which ends up setting m->m_data
to an unaligned address:

if (m->m_len < sizeof (struct ip) &&
(m = *mp = m_pullup(m, sizeof (struct ip))) == NULL) {
ipstat_inc(ips_toosmall);
goto bad;
}
ip = mtod(m, struct ip *);
if (ip->ip_v != IPVERSION) {  // we crash here because ip is misaligned

Note that pair(4) has dequeued this mbuf from its send queue and doesn't
modify it except for resetting the packet header if it exists.

Trace from sparc64:

panic: trap type 0x34 (mem address not aligned): pc=11336d4 npc=11336d8 
pstate=44820006
Stopped at  db_enter+0x8:   nop
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
*291733   6703  0 0x14000  0x2000  softnet
trap(40016e6b890, 34, 11336d4, 44820006, 400023ff800, 8848) at trap+0x2e0
Lslowtrap_reenter(400023ace00, 0, , 1c176d0, 4, 1) at 
Lslowtrap_reenter+0xf8
ip_input_if(40016e6bb48, 40016e6bb54, 4, 0, 400023ff800, 8848) at 
ip_input_if+0x120
ipv4_input(400023ff800, 18184e8, , 1c176d0, 4, 1) at 
ipv4_input+0x3c
ether_input(400023ff800, 400023ace00, 0, 16545e8, , 8848) at 
ether_input+0xc8
if_input_process(400021607c0, 40016e6bde0, 131cb20, 1c176d0, 4, 0) at 
if_input_process+0x11c
taskq_thread(4000216c080, 40002142fc0, 1758938, 16545e8, 0, 3b9ac800) at 
taskq_thread+0x6c
proc_trampoline(0, 0, 0, 0, 0, 0) at proc_trampoline+0x14
https://www.openbsd.org/ddb.html describes the minimum info required in bug 
reports.
Insufficient info makes it difficult to find and fix bugs.