On 06/12/15 10:20, Sam Russell wrote:
tl;dr mpls_output expects skb->protocol to be set to the correct
ethertype, but it isn't

https://github.com/torvalds/linux/blob/ede2059dbaf9c6557a49d466c8c7778343b208ff/net/mpls/mpls_iptunnel.c#L64

Problem:

I set up two interfaces pointed at each other, and added a static arp
entry to minimise complexity

ifconfig enp0s8 10.0.0.1/24 up
ifconfig enp0s9 up
arp -s 10.0.0.5 00:12:34:56:78:90

I then added an MPLS route

./dev/iproute2/ip/ip route add 192.168.2.0/24 encap mpls 100 via inet 10.0.0.5 dev enp0s8

I then tried to ping an IP in this route but got errors back

ping 192.168.2.1
* PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
* ping: sendmsg: Invalid argument

I then recorded calls to kfree_skb

./tools/perf/perf record -e skb:kfree_skb -g -a

This gave me the following call trace:

    100.00%   100.00%  ping     [kernel.kallsyms]  [k] kfree_skb
                |
                ---kfree_skb
                   mpls_output
                   lwtunnel_output
                   ip_local_out_sk
                   ip_send_skb
                   ip_push_pending_frames
                   raw_sendmsg
                   inet_sendmsg
                   sock_sendmsg
                   ___sys_sendmsg
                   __sys_sendmsg
                   sys_sendmsg
                   entry_SYSCALL_64_fastpath
                   sendmsg
                   0

I then went through mpls_output() in mpls_iptunnel.c and put a printk()
before every "goto drop", and found that the one being hit is the drop
taken when skb->protocol fails to match

https://github.com/torvalds/linux/blob/ede2059dbaf9c6557a49d466c8c7778343b208ff/net/mpls/mpls_iptunnel.c#L64

My understanding is that skb->protocol is normally only set downstream
of dst_output. For example, a ping packet hitting a normal IPv4 route
should follow something like:

raw_sendmsg
ip_push_pending_frames
ip_send_skb
ip_local_out_sk
dst_output
ip_output

ip_output() is the first place where skb->protocol gets set

https://github.com/torvalds/linux/blob/dbd3393c56a8794fe596e7dd20d0efa613b9cf61/net/ipv4/ip_output.c#L356

The path that a packet follows when hitting an MPLS route is as follows:

raw_sendmsg
ip_push_pending_frames
ip_send_skb
ip_local_out_sk
dst_output
lwtunnel_output
mpls_output

lwtunnel_output merely dispatches to the correct output function
(mpls_output). mpls_output expects skb->protocol to be set, but nothing
has set it yet, so it drops the packet!
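
To make the failure concrete, here is a rough user-space model of the
check as I understand it. This is not the kernel code: struct fake_skb
and model_mpls_output() are names I made up, the unset protocol is
assumed to be 0, and the -EINVAL return is my guess at what the drop
path returns, which would match the "Invalid argument" from ping.

```c
#include <errno.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons */

#define ETH_P_IP   0x0800
#define ETH_P_IPV6 0x86DD

/* Hypothetical stand-in for the two sk_buff fields that matter here. */
struct fake_skb {
    uint16_t protocol;  /* network byte order; assumed 0 if never set */
    uint8_t  ttl;       /* stands in for the TTL / hop limit in the header */
};

/*
 * Model of the protocol test: pick up the TTL only when skb->protocol
 * names IPv4 or IPv6, otherwise take the drop path.
 * Returns 0 and stores the TTL on success, -EINVAL on drop.
 */
static int model_mpls_output(const struct fake_skb *skb, uint8_t *ttl)
{
    if (skb->protocol == htons(ETH_P_IP) ||
        skb->protocol == htons(ETH_P_IPV6)) {
        *ttl = skb->ttl;
        return 0;
    }
    return -EINVAL;     /* the "goto drop" case */
}
```

In this model a locally generated packet arrives with protocol still 0,
so it falls through to the drop path even though the payload is a
perfectly good IPv4 packet.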

Any suggestions on how mpls_output should detect the protocol?

Thanks for reporting this and for your analysis.

We could write wrappers around lwtunnel_output for the v4 and v6 cases that set skb->protocol accordingly and then call lwtunnel_output. However, mpls_output already relies on the AF-specific type of the dst, so I think the simpler fix is to test the type of the dst in mpls_output rather than skb->protocol.
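
As a rough illustration of the second option, here is a user-space sketch, not a patch. struct fake_dst, struct fake_skb and model_mpls_output_by_dst() are hypothetical names, and I'm assuming the address family can be read off the dst (for instance from its dst_ops), which I haven't checked against the tree.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>  /* AF_INET, AF_INET6, AF_UNSPEC */

/* Hypothetical stand-in for a dst entry that knows its address family. */
struct fake_dst {
    int family;
};

/* Hypothetical stand-in for the sk_buff fields used below. */
struct fake_skb {
    uint16_t protocol;            /* left unset (0) on locally sent packets */
    uint8_t  ttl;                 /* stands in for the TTL / hop limit */
    const struct fake_dst *dst;   /* route the packet was matched against */
};

/*
 * Decide how to read the TTL from the family of the dst instead of
 * skb->protocol, so an unset protocol no longer causes a drop.
 * Returns 0 and stores the TTL on success, -EINVAL on drop.
 */
static int model_mpls_output_by_dst(const struct fake_skb *skb, uint8_t *ttl)
{
    switch (skb->dst->family) {
    case AF_INET:
    case AF_INET6:
        *ttl = skb->ttl;
        return 0;
    default:
        return -EINVAL;   /* unknown family: still drop */
    }
}
```

With this shape, a packet whose protocol field is still 0 but whose dst is an AF_INET route is accepted, while a dst of any other family is still dropped.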

Thanks,
Rob