On Tue, Jan 23, 2024 at 7:41 AM Ilya Maximets <[email protected]> wrote:
>
> On 1/22/24 23:24, Mike Pattrick wrote:
> > On Mon, Jan 22, 2024 at 4:10 PM Ilya Maximets <[email protected]> wrote:
> >>
> >> On 1/22/24 21:33, Mike Pattrick wrote:
> >>> On Mon, Jan 22, 2024 at 1:47 PM Ilya Maximets <[email protected]> wrote:
> >>>>
> >>>> On 1/22/24 18:51, Mike Pattrick wrote:
> >>>>> The OVN test suite identified a bug in dp_packet_ol_send_prepare() where
> >>>>> a double encapsulated BFD packet would trigger a seg fault. This
> >>>>> happened because we had assumed that if IP checksumming was marked as
> >>>>> offloaded, that checksum would occur in the innermost packet.
> >>>>>
> >>>>> This change will check that the inner packet has an L3 and L4 before
> >>>>> checksumming those parts. And if missing, will resume checksumming the
> >>>>> outer components.
> >>>>
> >>>> Hrm. This looks like a workaround rather than a fix. If the inner
> >>>> packet
> >>>> doesn't have L3 header, dp-packet must not have a flag for L3
> >>>> checksumming
> >>>> set in the first place. And if it does have inner L3, the offset must be
> >>>> initialized. I guess, some of the offsets can be not initialized,
> >>>> because
> >>>> the packet is never parsed by either miniflow_extract() or
> >>>> parse_tcp_flags().
> >>>> And the bfd_put_packet() doesn't seem to set them.
> >>>
> >>> I think you're right, I stepped through the problem in GDB and it was
> >>> clear that the flags weren't getting reset properly between BFD
> >>> packets. I'll send an updated patch.
> >>
> >> Yeah, flags preservation is one thing, the other is that l3/l4_ofs are just
> >> not populated in these packets, which doesn't sound right.
> >>
> >> It's also not clear why the packet is marked for inner checksum offload if
> >> the inner checksum is actually fully calculated and correct. I understand
> >> that the flags are carried over from a previous packet, but why did that
> >> previous packet have this flag set?
> >
> > Here's an example:
> >
> > monitor_run()
> > \_ dp_packet_use_stub(&packet, stub, sizeof stub); <--- initializes packet
> > once
> > \_ while(!heap_is_empty(&monitor_heap) <-- loop where the same packet
> > can be reused
> > \_ monitor_mport_run()
> > \_ dp_packet_clear() <- Data cleared, but flags are not cleared
> > \_ cfm_compose_ccm() / bfd_put_packet() / lldp_put_packet() <-
> > one or more can run
> > \_ Note that cfm_compose_ccm and lldp_put_packet call
> > eth_compose, but bfd_put_packet doesn't. bfd_put_packet doesn't reset
> > the offsets like eth_compose does.
>
> Sounds like a bug to me. bfd_put_packet() should set correct offsets
> in the packet, otherwise we'll get random failures with different
> actions that may depend on these offsets to be populated.
Agreed. Do you want me to homologate this as part of this series?
>
> > \_ ofproto_dpif_send_packet()
> > .... non-relevant stack trace ...
> > \_ netdev_push_header()
> > \_ First run, push geneve header and toggle geneve flag
> > \_ Second run, detect geneve header flag, call
> > dp_packet_ol_send_prepare()
> >
> > Given the above, I think it makes sense to clear the offload flags in
> > dp_packet_clear().
>
> I agree with that. But it still doesn't explain why the
> DP_PACKET_OL_TX_IP_CKSUM
> is set after the first run. The inner checksum is fully calculated and
> correct.
> There should be no Tx offloading for it set. Only the
> DP_PACKET_OL_TX_OUTER_IP_CKSUM
> should be set in this packet. Or am I missing something?
bfd_put_packet() calculates the checksum, so it will always be
correct, miniflow_extract() sets DP_PACKET_OL_TX_IP_CKSUM if the
previous packet had DP_PACKET_OL_RX_IP_CKSUM_GOOD, which it would have
gotten from dp_packet_ol_send_prepare().
>
> >
> > -M
> >
> >>
> >>>
> >>>> BTW, is there actually a double encapsulation in the original OVN test?
> >>>> Sounds strange.
> >>>
> >>> The problem was exposed in netdev_push_header() in
> >>> dp_packet_ol_send_prepare(packet, 0);
> >>
> >> That's true, but the packet dump in gdb showed a plain BFD packet.
> >> So, it was a first encapsulation, not double.
> >>
> >>>
> >>> -M
> >>>
> >>>>
> >>>>>
> >>>>> Fixes: 8b5fe2dc6080 ("userspace: Add Generic Segmentation Offloading.")
> >>>>> Reported-by: Dumitru Ceara <[email protected]>
> >>>>> Reported-at: https://issues.redhat.com/browse/FDP-300
> >>>>> Signed-off-by: Mike Pattrick <[email protected]>
> >>>>> ---
> >>>>> lib/dp-packet.h | 10 ++++++++--
> >>>>> lib/packets.c | 6 +++---
> >>>>> 2 files changed, 11 insertions(+), 5 deletions(-)
> >>>>
> >>>> Best regards, Ilya Maximets.
> >>>>
> >>>
> >>
> >
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev