Re: [ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".

2018-10-12 Thread Lam, Tiago
On 11/10/2018 15:50, Eelco Chaudron wrote:
> Hi Tiago,
> 
> It has been a while since I reviewed this patchset, guess before my 
> summer holiday ;)
> I picked this patch out of the set as it does not have my acked-by, and 
> I assume all the other which have it did not change too much to warrant 
> a review. If they did, let me know.
> 
> See inline comments...
> 
> Cheers,

Hi Eelco,

Thanks for the comments. Yeah, I've kept them because I, too, think
there haven't been changes on those patches, aside from smaller fixes.
Most of the changes have fallen now on the linearization patch (09/14).

If I remember of something I'll make sure I'll ping you on the specific
patch. Thanks again for that.

> 
> Eelco
> 
> On 10 Oct 2018, at 18:22, Tiago Lam wrote:
> 
>> Previous commits have added support to the dp_packet API to handle
>> multi-segmented packets, where data is not stored contiguously in
>> memory. However, in some cases, it is inevitable and data must be
>> provided contiguously. Examples of such cases are when performing 
>> csums
>> over the entire packet data, or when write()'ing to a file descriptor
>> (for a tap interface, for example). For such cases, the dp_packet API
>> has been extended to provide a way to transform a multi-segmented
>> DPBUF_DPDK packet into a DPBUF_MALLOC system packet (at the expense of 
>> a
>> copy of memory). If the packet's data is already stored in memory
>> contigously then there's no need to convert the packet.
>>
>> Thus, the main use cases that were assuming that a dp_packet's data is
>> always held contiguously in memory were changed to make use of the new
>> "linear functions" in the dp_packet API when there's a need to 
>> traverse
>> the entire's packet data. Per the example above, when the packet's 
>> data
>> needs to be write() to the tap's file descriptor, or when the 
>> conntrack
>> module needs to verify a packet's checksum, the data is now 
>> linearized.
>>
>> Additionally, the layer functions, such as dp_packet_l3() and 
>> variants,
>> have been modified to check if there's enough data in the packet 
>> before
>> returning a pointer to the data (and callers have been modified
>> accordingly). This requirement is needed to guarantee that a caller
>> doesn't read beyond the available memory.
> 
> I think the last might be a problem, as now the dp_packet_l2_5(), 
> dp_packet_l3() and dp_packet_l4() might start returning NULL.
> Some the changes you made to the code referencing this functions would 
> crash if they return NULL.
> 
> Can I assume you verified that all these calls will be guarded by 
> dp_packet_linearize() so we do not access a NULL pointer?

I understand your point. Ilya also brought it up in v9 and v10. This is
my point of view.

At the moment the miniflow_extract(), where the layer offsets are
initially set, is extracting the data contiguously. It is mentioned as
part of the "limitations" that headers should be within the first mbuf
(1920B). So, the only other way to get a NULL is if while manipulating
the packet one sets the headers up to more than those 1920B. I haven't
been able to find a case where this happens, since even if you are
pushing headers you won't go above the headroom (128B by default). While
I haven't verified all cases, I would say that this would happen rarely
and would consider it a corner case (you also have to had opted-in for
multi-segs mbufs option).

So, my point here, really, is that I'd rather find and fix those corner
cases than polluting the code with NULL checks (as I mentioned before,
the code under ovn/ won't be dealing with DPBUF_DPDK so it won't be
returning NULL).

But I understand this is the second person pointing this out, so I'll
extend the scope to cover the miniflow_extract() and deal with the NULL
returns.

(There are some in-line comments below which I didn't reply to because
they are related to this.)

> 
> In addition, why have you decided not to use the dp_packet_linearize() 
> inside these functions (I think I have an idea)?
> This might allow skipping the:
> 
>   if (!dp_packet_is_linear(pkt)) {
>  dp_packet_linearize(pkt);
>  }
> 
> Maybe the above can be changed to just dp_packet_linearize(), and have 
> the dp_packet_is_linear() part of the first call. They are both inline 
> functions already.

A previous version introduced a dp_packet_linear_data() API, which would
linearize the packet and return the pointer to the data. As Ilya pointed
out in the patch, and I agreed with, because dp_packet_linear_data()
would change the address space of where the data is, it would be very
easy for a caller to misuse it. So this has evolved into two separate
functions; One to check if the packet is linear and another to linearize
it. This would be the main reason, I'd say. The other is more personal,
but it feels odd to me to have a standalone call to
`dp_packet_linearize()` in the middle of the code without any other
context. This, I think, gives some more context.

> 
> 

Re: [ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".

2018-10-11 Thread Eelco Chaudron

Hi Tiago,

It has been a while since I reviewed this patchset, guess before my 
summer holiday ;)
I picked this patch out of the set as it does not have my acked-by, and 
I assume all the other which have it did not change too much to warrant 
a review. If they did, let me know.


See inline comments...

Cheers,

Eelco

On 10 Oct 2018, at 18:22, Tiago Lam wrote:


Previous commits have added support to the dp_packet API to handle
multi-segmented packets, where data is not stored contiguously in
memory. However, in some cases, it is inevitable and data must be
provided contiguously. Examples of such cases are when performing 
csums

over the entire packet data, or when write()'ing to a file descriptor
(for a tap interface, for example). For such cases, the dp_packet API
has been extended to provide a way to transform a multi-segmented
DPBUF_DPDK packet into a DPBUF_MALLOC system packet (at the expense of 
a

copy of memory). If the packet's data is already stored in memory
contigously then there's no need to convert the packet.

Thus, the main use cases that were assuming that a dp_packet's data is
always held contiguously in memory were changed to make use of the new
"linear functions" in the dp_packet API when there's a need to 
traverse
the entire's packet data. Per the example above, when the packet's 
data
needs to be write() to the tap's file descriptor, or when the 
conntrack
module needs to verify a packet's checksum, the data is now 
linearized.


Additionally, the layer functions, such as dp_packet_l3() and 
variants,
have been modified to check if there's enough data in the packet 
before

returning a pointer to the data (and callers have been modified
accordingly). This requirement is needed to guarantee that a caller
doesn't read beyond the available memory.


I think the last might be a problem, as now the dp_packet_l2_5(), 
dp_packet_l3() and dp_packet_l4() might start returning NULL.
Some the changes you made to the code referencing this functions would 
crash if they return NULL.


Can I assume you verified that all these calls will be guarded by 
dp_packet_linearize() so we do not access a NULL pointer?


In addition, why have you decided not to use the dp_packet_linearize() 
inside these functions (I think I have an idea)?

This might allow skipping the:

 if (!dp_packet_is_linear(pkt)) {
dp_packet_linearize(pkt);
}

Maybe the above can be changed to just dp_packet_linearize(), and have 
the dp_packet_is_linear() part of the first call. They are both inline 
functions already.




Signed-off-by: Tiago Lam 
---
 lib/bfd.c |   3 +-
 lib/cfm.c |   5 +-
 lib/conntrack-icmp.c  |   4 +-
 lib/conntrack-private.h   |   4 +-
 lib/conntrack-tcp.c   |   6 +-
 lib/conntrack.c   | 109 ++--
 lib/crc32c.c  |  35 +++--
 lib/crc32c.h  |   2 +
 lib/dp-packet.c   |  18 +++
 lib/dp-packet.h   | 288 
+-

 lib/dpif-netdev.c |   5 +
 lib/dpif-netlink.c|   5 +
 lib/dpif.c|   9 ++
 lib/flow.c|  44 ---
 lib/lacp.c|   3 +-
 lib/mcast-snooping.c  |   8 +-
 lib/netdev-bsd.c  |   5 +
 lib/netdev-dummy.c|  13 +-
 lib/netdev-linux.c|  13 +-
 lib/netdev-native-tnl.c   |  38 +++---
 lib/odp-execute.c |  28 ++--
 lib/ofp-print.c   |  10 +-
 lib/ovs-lldp.c|   3 +-
 lib/packets.c | 151 --
 lib/packets.h |   7 +
 lib/pcap-file.c   |   2 +-
 ofproto/ofproto-dpif-upcall.c |  20 ++-
 ofproto/ofproto-dpif-xlate.c  |  42 --
 ovn/controller/pinctrl.c  |  29 +++--
 tests/test-conntrack.c|   2 +-
 tests/test-rstp.c |   8 +-
 tests/test-stp.c  |   8 +-
 32 files changed, 637 insertions(+), 290 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index cc8c685..12e076a 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -721,7 +721,8 @@ bfd_process_packet(struct bfd *bfd, const struct 
flow *flow,

 if (!msg) {
 VLOG_INFO_RL(, "%s: Received too-short BFD control message 
(only "

  "%"PRIdPTR" bytes long, at least %d required).",
- bfd->name, (uint8_t *) dp_packet_tail(p) - l7,
+ bfd->name, dp_packet_size(p) -
+ (l7 - (uint8_t *) dp_packet_data(p)),
  BFD_PACKET_LEN);
 goto out;
 }
diff --git a/lib/cfm.c b/lib/cfm.c
index 71d2c02..83baf2a 100644
--- a/lib/cfm.c
+++ b/lib/cfm.c
@@ -584,7 +584,7 @@ cfm_compose_ccm(struct cfm *cfm, struct dp_packet 
*packet,


 atomic_read_relaxed(>extended, );

-ccm = dp_packet_l3(packet);
+ccm = dp_packet_l3(packet, sizeof(*ccm));
 ccm->mdlevel_version = 0;
 ccm->opcode = CCM_OPCODE;
 

[ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".

2018-10-10 Thread Tiago Lam
Previous commits have added support to the dp_packet API to handle
multi-segmented packets, where data is not stored contiguously in
memory. However, in some cases, it is inevitable and data must be
provided contiguously. Examples of such cases are when performing csums
over the entire packet data, or when write()'ing to a file descriptor
(for a tap interface, for example). For such cases, the dp_packet API
has been extended to provide a way to transform a multi-segmented
DPBUF_DPDK packet into a DPBUF_MALLOC system packet (at the expense of a
copy of memory). If the packet's data is already stored in memory
contigously then there's no need to convert the packet.

Thus, the main use cases that were assuming that a dp_packet's data is
always held contiguously in memory were changed to make use of the new
"linear functions" in the dp_packet API when there's a need to traverse
the entire's packet data. Per the example above, when the packet's data
needs to be write() to the tap's file descriptor, or when the conntrack
module needs to verify a packet's checksum, the data is now linearized.

Additionally, the layer functions, such as dp_packet_l3() and variants,
have been modified to check if there's enough data in the packet before
returning a pointer to the data (and callers have been modified
accordingly). This requirement is needed to guarantee that a caller
doesn't read beyond the available memory.

Signed-off-by: Tiago Lam 
---
 lib/bfd.c |   3 +-
 lib/cfm.c |   5 +-
 lib/conntrack-icmp.c  |   4 +-
 lib/conntrack-private.h   |   4 +-
 lib/conntrack-tcp.c   |   6 +-
 lib/conntrack.c   | 109 ++--
 lib/crc32c.c  |  35 +++--
 lib/crc32c.h  |   2 +
 lib/dp-packet.c   |  18 +++
 lib/dp-packet.h   | 288 +-
 lib/dpif-netdev.c |   5 +
 lib/dpif-netlink.c|   5 +
 lib/dpif.c|   9 ++
 lib/flow.c|  44 ---
 lib/lacp.c|   3 +-
 lib/mcast-snooping.c  |   8 +-
 lib/netdev-bsd.c  |   5 +
 lib/netdev-dummy.c|  13 +-
 lib/netdev-linux.c|  13 +-
 lib/netdev-native-tnl.c   |  38 +++---
 lib/odp-execute.c |  28 ++--
 lib/ofp-print.c   |  10 +-
 lib/ovs-lldp.c|   3 +-
 lib/packets.c | 151 --
 lib/packets.h |   7 +
 lib/pcap-file.c   |   2 +-
 ofproto/ofproto-dpif-upcall.c |  20 ++-
 ofproto/ofproto-dpif-xlate.c  |  42 --
 ovn/controller/pinctrl.c  |  29 +++--
 tests/test-conntrack.c|   2 +-
 tests/test-rstp.c |   8 +-
 tests/test-stp.c  |   8 +-
 32 files changed, 637 insertions(+), 290 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index cc8c685..12e076a 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -721,7 +721,8 @@ bfd_process_packet(struct bfd *bfd, const struct flow *flow,
 if (!msg) {
 VLOG_INFO_RL(, "%s: Received too-short BFD control message (only "
  "%"PRIdPTR" bytes long, at least %d required).",
- bfd->name, (uint8_t *) dp_packet_tail(p) - l7,
+ bfd->name, dp_packet_size(p) -
+ (l7 - (uint8_t *) dp_packet_data(p)),
  BFD_PACKET_LEN);
 goto out;
 }
diff --git a/lib/cfm.c b/lib/cfm.c
index 71d2c02..83baf2a 100644
--- a/lib/cfm.c
+++ b/lib/cfm.c
@@ -584,7 +584,7 @@ cfm_compose_ccm(struct cfm *cfm, struct dp_packet *packet,
 
 atomic_read_relaxed(>extended, );
 
-ccm = dp_packet_l3(packet);
+ccm = dp_packet_l3(packet, sizeof(*ccm));
 ccm->mdlevel_version = 0;
 ccm->opcode = CCM_OPCODE;
 ccm->tlv_offset = 70;
@@ -759,8 +759,7 @@ cfm_process_heartbeat(struct cfm *cfm, const struct 
dp_packet *p)
 atomic_read_relaxed(>extended, );
 
 eth = dp_packet_eth(p);
-ccm = dp_packet_at(p, (uint8_t *)dp_packet_l3(p) - (uint8_t 
*)dp_packet_data(p),
-CCM_ACCEPT_LEN);
+ccm = dp_packet_l3(p, CCM_ACCEPT_LEN);
 
 if (!ccm) {
 VLOG_INFO_RL(, "%s: Received an unparseable 802.1ag CCM heartbeat.",
diff --git a/lib/conntrack-icmp.c b/lib/conntrack-icmp.c
index 40fd1d8..0575d0e 100644
--- a/lib/conntrack-icmp.c
+++ b/lib/conntrack-icmp.c
@@ -63,7 +63,7 @@ icmp_conn_update(struct conn *conn_, struct conntrack_bucket 
*ctb,
 static bool
 icmp4_valid_new(struct dp_packet *pkt)
 {
-struct icmp_header *icmp = dp_packet_l4(pkt);
+struct icmp_header *icmp = dp_packet_l4(pkt, sizeof *icmp);
 
 return icmp->icmp_type == ICMP4_ECHO_REQUEST
|| icmp->icmp_type == ICMP4_INFOREQUEST
@@ -73,7 +73,7 @@ icmp4_valid_new(struct dp_packet *pkt)
 static bool
 icmp6_valid_new(struct dp_packet *pkt)
 {
-struct icmp6_header *icmp6 = dp_packet_l4(pkt);
+struct icmp6_header *icmp6 =