Re: [ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".
On 11/10/2018 15:50, Eelco Chaudron wrote: > Hi Tiago, > > It has been a while since I reviewed this patchset, guess before my > summer holiday ;) > I picked this patch out of the set as it does not have my acked-by, and > I assume all the other which have it did not change too much to warrant > a review. If they did, let me know. > > See inline comments... > > Cheers, Hi Eelco, Thanks for the comments. Yeah, I've kept them because I, too, think there haven't been changes on those patches, aside from smaller fixes. Most of the changes have fallen now on the linearization patch (09/14). If I remember of something I'll make sure I'll ping you on the specific patch. Thanks again for that. > > Eelco > > On 10 Oct 2018, at 18:22, Tiago Lam wrote: > >> Previous commits have added support to the dp_packet API to handle >> multi-segmented packets, where data is not stored contiguously in >> memory. However, in some cases, it is inevitable and data must be >> provided contiguously. Examples of such cases are when performing >> csums >> over the entire packet data, or when write()'ing to a file descriptor >> (for a tap interface, for example). For such cases, the dp_packet API >> has been extended to provide a way to transform a multi-segmented >> DPBUF_DPDK packet into a DPBUF_MALLOC system packet (at the expense of >> a >> copy of memory). If the packet's data is already stored in memory >> contigously then there's no need to convert the packet. >> >> Thus, the main use cases that were assuming that a dp_packet's data is >> always held contiguously in memory were changed to make use of the new >> "linear functions" in the dp_packet API when there's a need to >> traverse >> the entire's packet data. Per the example above, when the packet's >> data >> needs to be write() to the tap's file descriptor, or when the >> conntrack >> module needs to verify a packet's checksum, the data is now >> linearized. >> >> Additionally, the layer functions, such as dp_packet_l3() and >> variants, >> have been modified to check if there's enough data in the packet >> before >> returning a pointer to the data (and callers have been modified >> accordingly). This requirement is needed to guarantee that a caller >> doesn't read beyond the available memory. > > I think the last might be a problem, as now the dp_packet_l2_5(), > dp_packet_l3() and dp_packet_l4() might start returning NULL. > Some the changes you made to the code referencing this functions would > crash if they return NULL. > > Can I assume you verified that all these calls will be guarded by > dp_packet_linearize() so we do not access a NULL pointer? I understand your point. Ilya also brought it up in v9 and v10. This is my point of view. At the moment the miniflow_extract(), where the layer offsets are initially set, is extracting the data contiguously. It is mentioned as part of the "limitations" that headers should be within the first mbuf (1920B). So, the only other way to get a NULL is if while manipulating the packet one sets the headers up to more than those 1920B. I haven't been able to find a case where this happens, since even if you are pushing headers you won't go above the headroom (128B by default). While I haven't verified all cases, I would say that this would happen rarely and would consider it a corner case (you also have to had opted-in for multi-segs mbufs option). So, my point here, really, is that I'd rather find and fix those corner cases than polluting the code with NULL checks (as I mentioned before, the code under ovn/ won't be dealing with DPBUF_DPDK so it won't be returning NULL). But I understand this is the second person pointing this out, so I'll extend the scope to cover the miniflow_extract() and deal with the NULL returns. (There are some in-line comments below which I didn't reply to because they are related to this.) > > In addition, why have you decided not to use the dp_packet_linearize() > inside these functions (I think I have an idea)? > This might allow skipping the: > > if (!dp_packet_is_linear(pkt)) { > dp_packet_linearize(pkt); > } > > Maybe the above can be changed to just dp_packet_linearize(), and have > the dp_packet_is_linear() part of the first call. They are both inline > functions already. A previous version introduced a dp_packet_linear_data() API, which would linearize the packet and return the pointer to the data. As Ilya pointed out in the patch, and I agreed with, because dp_packet_linear_data() would change the address space of where the data is, it would be very easy for a caller to misuse it. So this has evolved into two separate functions; One to check if the packet is linear and another to linearize it. This would be the main reason, I'd say. The other is more personal, but it feels odd to me to have a standalone call to `dp_packet_linearize()` in the middle of the code without any other context. This, I think, gives some more context. > >
Re: [ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".
Hi Tiago, It has been a while since I reviewed this patchset, guess before my summer holiday ;) I picked this patch out of the set as it does not have my acked-by, and I assume all the other which have it did not change too much to warrant a review. If they did, let me know. See inline comments... Cheers, Eelco On 10 Oct 2018, at 18:22, Tiago Lam wrote: Previous commits have added support to the dp_packet API to handle multi-segmented packets, where data is not stored contiguously in memory. However, in some cases, it is inevitable and data must be provided contiguously. Examples of such cases are when performing csums over the entire packet data, or when write()'ing to a file descriptor (for a tap interface, for example). For such cases, the dp_packet API has been extended to provide a way to transform a multi-segmented DPBUF_DPDK packet into a DPBUF_MALLOC system packet (at the expense of a copy of memory). If the packet's data is already stored in memory contigously then there's no need to convert the packet. Thus, the main use cases that were assuming that a dp_packet's data is always held contiguously in memory were changed to make use of the new "linear functions" in the dp_packet API when there's a need to traverse the entire's packet data. Per the example above, when the packet's data needs to be write() to the tap's file descriptor, or when the conntrack module needs to verify a packet's checksum, the data is now linearized. Additionally, the layer functions, such as dp_packet_l3() and variants, have been modified to check if there's enough data in the packet before returning a pointer to the data (and callers have been modified accordingly). This requirement is needed to guarantee that a caller doesn't read beyond the available memory. I think the last might be a problem, as now the dp_packet_l2_5(), dp_packet_l3() and dp_packet_l4() might start returning NULL. Some the changes you made to the code referencing this functions would crash if they return NULL. Can I assume you verified that all these calls will be guarded by dp_packet_linearize() so we do not access a NULL pointer? In addition, why have you decided not to use the dp_packet_linearize() inside these functions (I think I have an idea)? This might allow skipping the: if (!dp_packet_is_linear(pkt)) { dp_packet_linearize(pkt); } Maybe the above can be changed to just dp_packet_linearize(), and have the dp_packet_is_linear() part of the first call. They are both inline functions already. Signed-off-by: Tiago Lam --- lib/bfd.c | 3 +- lib/cfm.c | 5 +- lib/conntrack-icmp.c | 4 +- lib/conntrack-private.h | 4 +- lib/conntrack-tcp.c | 6 +- lib/conntrack.c | 109 ++-- lib/crc32c.c | 35 +++-- lib/crc32c.h | 2 + lib/dp-packet.c | 18 +++ lib/dp-packet.h | 288 +- lib/dpif-netdev.c | 5 + lib/dpif-netlink.c| 5 + lib/dpif.c| 9 ++ lib/flow.c| 44 --- lib/lacp.c| 3 +- lib/mcast-snooping.c | 8 +- lib/netdev-bsd.c | 5 + lib/netdev-dummy.c| 13 +- lib/netdev-linux.c| 13 +- lib/netdev-native-tnl.c | 38 +++--- lib/odp-execute.c | 28 ++-- lib/ofp-print.c | 10 +- lib/ovs-lldp.c| 3 +- lib/packets.c | 151 -- lib/packets.h | 7 + lib/pcap-file.c | 2 +- ofproto/ofproto-dpif-upcall.c | 20 ++- ofproto/ofproto-dpif-xlate.c | 42 -- ovn/controller/pinctrl.c | 29 +++-- tests/test-conntrack.c| 2 +- tests/test-rstp.c | 8 +- tests/test-stp.c | 8 +- 32 files changed, 637 insertions(+), 290 deletions(-) diff --git a/lib/bfd.c b/lib/bfd.c index cc8c685..12e076a 100644 --- a/lib/bfd.c +++ b/lib/bfd.c @@ -721,7 +721,8 @@ bfd_process_packet(struct bfd *bfd, const struct flow *flow, if (!msg) { VLOG_INFO_RL(, "%s: Received too-short BFD control message (only " "%"PRIdPTR" bytes long, at least %d required).", - bfd->name, (uint8_t *) dp_packet_tail(p) - l7, + bfd->name, dp_packet_size(p) - + (l7 - (uint8_t *) dp_packet_data(p)), BFD_PACKET_LEN); goto out; } diff --git a/lib/cfm.c b/lib/cfm.c index 71d2c02..83baf2a 100644 --- a/lib/cfm.c +++ b/lib/cfm.c @@ -584,7 +584,7 @@ cfm_compose_ccm(struct cfm *cfm, struct dp_packet *packet, atomic_read_relaxed(>extended, ); -ccm = dp_packet_l3(packet); +ccm = dp_packet_l3(packet, sizeof(*ccm)); ccm->mdlevel_version = 0; ccm->opcode = CCM_OPCODE;
[ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".
Previous commits have added support to the dp_packet API to handle multi-segmented packets, where data is not stored contiguously in memory. However, in some cases, it is inevitable and data must be provided contiguously. Examples of such cases are when performing csums over the entire packet data, or when write()'ing to a file descriptor (for a tap interface, for example). For such cases, the dp_packet API has been extended to provide a way to transform a multi-segmented DPBUF_DPDK packet into a DPBUF_MALLOC system packet (at the expense of a copy of memory). If the packet's data is already stored in memory contigously then there's no need to convert the packet. Thus, the main use cases that were assuming that a dp_packet's data is always held contiguously in memory were changed to make use of the new "linear functions" in the dp_packet API when there's a need to traverse the entire's packet data. Per the example above, when the packet's data needs to be write() to the tap's file descriptor, or when the conntrack module needs to verify a packet's checksum, the data is now linearized. Additionally, the layer functions, such as dp_packet_l3() and variants, have been modified to check if there's enough data in the packet before returning a pointer to the data (and callers have been modified accordingly). This requirement is needed to guarantee that a caller doesn't read beyond the available memory. Signed-off-by: Tiago Lam --- lib/bfd.c | 3 +- lib/cfm.c | 5 +- lib/conntrack-icmp.c | 4 +- lib/conntrack-private.h | 4 +- lib/conntrack-tcp.c | 6 +- lib/conntrack.c | 109 ++-- lib/crc32c.c | 35 +++-- lib/crc32c.h | 2 + lib/dp-packet.c | 18 +++ lib/dp-packet.h | 288 +- lib/dpif-netdev.c | 5 + lib/dpif-netlink.c| 5 + lib/dpif.c| 9 ++ lib/flow.c| 44 --- lib/lacp.c| 3 +- lib/mcast-snooping.c | 8 +- lib/netdev-bsd.c | 5 + lib/netdev-dummy.c| 13 +- lib/netdev-linux.c| 13 +- lib/netdev-native-tnl.c | 38 +++--- lib/odp-execute.c | 28 ++-- lib/ofp-print.c | 10 +- lib/ovs-lldp.c| 3 +- lib/packets.c | 151 -- lib/packets.h | 7 + lib/pcap-file.c | 2 +- ofproto/ofproto-dpif-upcall.c | 20 ++- ofproto/ofproto-dpif-xlate.c | 42 -- ovn/controller/pinctrl.c | 29 +++-- tests/test-conntrack.c| 2 +- tests/test-rstp.c | 8 +- tests/test-stp.c | 8 +- 32 files changed, 637 insertions(+), 290 deletions(-) diff --git a/lib/bfd.c b/lib/bfd.c index cc8c685..12e076a 100644 --- a/lib/bfd.c +++ b/lib/bfd.c @@ -721,7 +721,8 @@ bfd_process_packet(struct bfd *bfd, const struct flow *flow, if (!msg) { VLOG_INFO_RL(, "%s: Received too-short BFD control message (only " "%"PRIdPTR" bytes long, at least %d required).", - bfd->name, (uint8_t *) dp_packet_tail(p) - l7, + bfd->name, dp_packet_size(p) - + (l7 - (uint8_t *) dp_packet_data(p)), BFD_PACKET_LEN); goto out; } diff --git a/lib/cfm.c b/lib/cfm.c index 71d2c02..83baf2a 100644 --- a/lib/cfm.c +++ b/lib/cfm.c @@ -584,7 +584,7 @@ cfm_compose_ccm(struct cfm *cfm, struct dp_packet *packet, atomic_read_relaxed(>extended, ); -ccm = dp_packet_l3(packet); +ccm = dp_packet_l3(packet, sizeof(*ccm)); ccm->mdlevel_version = 0; ccm->opcode = CCM_OPCODE; ccm->tlv_offset = 70; @@ -759,8 +759,7 @@ cfm_process_heartbeat(struct cfm *cfm, const struct dp_packet *p) atomic_read_relaxed(>extended, ); eth = dp_packet_eth(p); -ccm = dp_packet_at(p, (uint8_t *)dp_packet_l3(p) - (uint8_t *)dp_packet_data(p), -CCM_ACCEPT_LEN); +ccm = dp_packet_l3(p, CCM_ACCEPT_LEN); if (!ccm) { VLOG_INFO_RL(, "%s: Received an unparseable 802.1ag CCM heartbeat.", diff --git a/lib/conntrack-icmp.c b/lib/conntrack-icmp.c index 40fd1d8..0575d0e 100644 --- a/lib/conntrack-icmp.c +++ b/lib/conntrack-icmp.c @@ -63,7 +63,7 @@ icmp_conn_update(struct conn *conn_, struct conntrack_bucket *ctb, static bool icmp4_valid_new(struct dp_packet *pkt) { -struct icmp_header *icmp = dp_packet_l4(pkt); +struct icmp_header *icmp = dp_packet_l4(pkt, sizeof *icmp); return icmp->icmp_type == ICMP4_ECHO_REQUEST || icmp->icmp_type == ICMP4_INFOREQUEST @@ -73,7 +73,7 @@ icmp4_valid_new(struct dp_packet *pkt) static bool icmp6_valid_new(struct dp_packet *pkt) { -struct icmp6_header *icmp6 = dp_packet_l4(pkt); +struct icmp6_header *icmp6 =