Re: [ovs-dev] [PATCH v2 1/2] upcall: Fix minor race when deleting ukeys.
FWIW the slice logic in the for loop is meant to ensure that no revalidator threads concurrently clean the same map. There's two cases where the main thread might call into this as well though, one is when reconfiguring threads (which shouldn't conflict, because it stops all revalidators first), and the ovs-appctl revalidator/purge case, which could hit something like this, but I don't think there is a case where we use this functionality while OVS is under load, so the likelyhood of problems is low. On 11 August 2015 at 12:18, Ethan Jackson et...@nicira.com wrote: Yeah sorry this patch is not great. That second problem is solved in the follow on. I'd be somewhat inclined to drop this one all together. Ethan On Tue, Aug 11, 2015 at 11:15 AM, Jarno Rajahalme jrajaha...@nicira.com wrote: On Aug 11, 2015, at 10:42 AM, Jarno Rajahalme jrajaha...@nicira.com wrote: On Aug 10, 2015, at 6:46 PM, Ethan J. Jackson et...@nicira.com wrote: From: Ethan Jackson et...@nicira.com Since revalidator_sweep() doesn't hold the ukey mutex for each full loop iteration, it's theoretically possible that two threads may call ukey_delete() on the same ukey. If this happens, they both will attempt to remove the ukey from he cmap, causing the loser of the race to fail. Signed-off-by: Ethan J. Jackson et...@nicira.com --- ofproto/ofproto-dpif-upcall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 0f2e186..fddb535 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -2076,7 +2076,6 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) flow_exists = ukey-flow_exists; seq_mismatch = (ukey-dump_seq != dump_seq ukey-reval_seq != reval_seq); -ovs_mutex_unlock(ukey-mutex); if (flow_exists (purge There is a call to push_ukey_ops() between these blocks. It calls push_ukey_ops__(), which will take a lock on the ukeys, including the last one added and still locked, so this would result in double locking. I missed the fact that handle_missed_revalidation() in here also takes the ukey lock. Jarno This could be resolved by moving the “ if (n_ops == REVALIDATE_MAX_BATCH)” block after the (moved) ukey unlock, similarly to the remainder ops handling at the end. However, I’m not sure if this would defeat the purpose of this patch. Also, it seems that generally the umap lock is taken first and ukey locks after, so there is a possibility of a deadlock if the locks are taken in the reverse order, like in this patch. It would be really great if we could employ clang thread safety analysis more than we do in this file currently, but I’m not sure if that is possible. Jarno @@ -2095,6 +2094,7 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) ukey_delete(umap, ukey); ovs_mutex_unlock(umap-mutex); } +ovs_mutex_unlock(ukey-mutex); } if (n_ops) { -- 2.1.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 3/8] lib/ofpbuf: add ofpbuf_use_ds() API
Add an API to convert a dynamic string object into ofpbuf. Signed-off-by: Andy Zhou az...@nicira.com --- lib/ofpbuf.c | 10 ++ lib/ofpbuf.h | 1 + 2 files changed, 11 insertions(+) diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c index 9b9d6b2..392f843 100644 --- a/lib/ofpbuf.c +++ b/lib/ofpbuf.c @@ -53,6 +53,16 @@ ofpbuf_use(struct ofpbuf *b, void *base, size_t allocated) ofpbuf_use__(b, base, allocated, 0, OFPBUF_MALLOC); } +/* Converts ds into ofpbuf 'b'. 'b' contains the 'ds-allocated' bytes of + * memory starting at 'ds-string'. 'ds' should not be modified any more. + * The memory allocated for 'ds' will be freed (with free()) if 'b' is + * resized or freed. */ +void +ofpbuf_use_ds(struct ofpbuf *b, const struct ds *ds) +{ +ofpbuf_use__(b, ds-string, ds-allocated + 1, ds-length, OFPBUF_MALLOC); +} + /* Initializes 'b' as an empty ofpbuf that contains the 'allocated' bytes of * memory starting at 'base'. 'base' should point to a buffer on the stack. * (Nothing actually relies on 'base' being allocated on the stack. It could diff --git a/lib/ofpbuf.h b/lib/ofpbuf.h index 9e82de2..17257a0 100644 --- a/lib/ofpbuf.h +++ b/lib/ofpbuf.h @@ -84,6 +84,7 @@ struct ofpbuf { } void ofpbuf_use(struct ofpbuf *, void *, size_t); +void ofpbuf_use_ds(struct ofpbuf *, const struct ds *); void ofpbuf_use_stack(struct ofpbuf *, void *, size_t); void ofpbuf_use_stub(struct ofpbuf *, void *, size_t); void ofpbuf_use_const(struct ofpbuf *, const void *, size_t); -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 8/8] json: increase Json string initial buffer size
Json string are usually created and freed immediately. Thus, there should not be any downside in creating a larger buffer initially to avoid the cost of moving strings around in memory due to realloc() call. The following script is used as benchmark to measure number of bytes reallocated: ovs-vsctl add-br br0 for i in `seq 0 100`; do ovs-vsctl add-port br0 p$i done ovs-vsctl del-br br0 'ds_bytes_relocated' coverage counter shows that the bytes relocated are 2.5Mbytes/sec and 17Kbytes/sec, before and after this patch applied, respectively. Signed-off-by: Andy Zhou az...@nicira.com --- lib/json.c| 2 +- lib/json.h| 10 -- lib/jsonrpc.c | 6 -- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/lib/json.c b/lib/json.c index dd85213..be76810 100644 --- a/lib/json.c +++ b/lib/json.c @@ -1462,7 +1462,7 @@ json_to_string(const struct json *json, int flags) { struct ds ds; -ds_init(ds); +json_ds_init(ds); json_to_ds(json, flags, ds); return ds_steal_cstr(ds); } diff --git a/lib/json.h b/lib/json.h index 3497035..c6b4ff5 100644 --- a/lib/json.h +++ b/lib/json.h @@ -30,14 +30,13 @@ *should be unique). */ +#include dynamic-string.h #include shash.h #ifdef __cplusplus extern C { #endif -struct ds; - /* Type of a JSON value. */ enum json_type { JSON_NULL, /* null */ @@ -127,6 +126,13 @@ enum { }; char *json_to_string(const struct json *, int flags); void json_to_ds(const struct json *, int flags, struct ds *); +/* Initialize 'ds', with reserved size suitable for typical Json objects */ +static inline void +json_ds_init(struct ds *ds) +{ +ds_init(ds); +ds_reserve(ds, 32 * 1024); /* pre allocate 32Kbytes buffer */ +} /* JSON string formatting operations. */ diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index b482777..81b173f 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -198,7 +198,8 @@ jsonrpc_log_msg(const struct jsonrpc *rpc, const char *title, const struct jsonrpc_msg *msg) { if (VLOG_IS_DBG_ENABLED()) { -struct ds s = DS_EMPTY_INITIALIZER; +struct ds s; +json_ds_init(s); if (msg-method) { ds_put_format(s, , method=\%s\, msg-method); } @@ -238,7 +239,7 @@ jsonrpc_send(struct jsonrpc *rpc, struct jsonrpc_msg *msg) { struct ofpbuf *buf; struct json *json; -struct ds ds = DS_EMPTY_INITIALIZER; +struct ds ds; size_t length; if (rpc-status) { @@ -246,6 +247,7 @@ jsonrpc_send(struct jsonrpc *rpc, struct jsonrpc_msg *msg) return rpc-status; } +json_ds_init(ds); jsonrpc_log_msg(rpc, send, msg); json = jsonrpc_msg_to_json(msg); -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 6/8] lib/ofpbuf: make ofpbuf_use() static
There is no external users for ofpbuf_use() directly. Signed-off-by: Andy Zhou az...@nicira.com --- lib/ofpbuf.c | 2 +- lib/ofpbuf.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c index 392f843..c190f8b 100644 --- a/lib/ofpbuf.c +++ b/lib/ofpbuf.c @@ -47,7 +47,7 @@ ofpbuf_use__(struct ofpbuf *b, void *base, size_t allocated, size_t size, * memory starting at 'base'. 'base' should be the first byte of a region * obtained from malloc(). It will be freed (with free()) if 'b' is resized or * freed. */ -void +static void ofpbuf_use(struct ofpbuf *b, void *base, size_t allocated) { ofpbuf_use__(b, base, allocated, 0, OFPBUF_MALLOC); diff --git a/lib/ofpbuf.h b/lib/ofpbuf.h index 17257a0..873065e 100644 --- a/lib/ofpbuf.h +++ b/lib/ofpbuf.h @@ -83,7 +83,6 @@ struct ofpbuf { .source = OFPBUF_STUB, \ } -void ofpbuf_use(struct ofpbuf *, void *, size_t); void ofpbuf_use_ds(struct ofpbuf *, const struct ds *); void ofpbuf_use_stack(struct ofpbuf *, void *, size_t); void ofpbuf_use_stub(struct ofpbuf *, void *, size_t); -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/8] lib/dynamic-string: coding style fix
Remove tabs per coding style Signed-off-by: Andy Zhou az...@nicira.com --- lib/dynamic-string.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/dynamic-string.h b/lib/dynamic-string.h index 95172d1..f1e0a36 100644 --- a/lib/dynamic-string.h +++ b/lib/dynamic-string.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the License); * you may not use this file except in compliance with the License. @@ -63,7 +63,7 @@ int ds_get_preprocessed_line(struct ds *, FILE *, int *line_number); int ds_get_test_line(struct ds *, FILE *); void ds_put_strftime_msec(struct ds *, const char *format, long long int when, - bool utc); + bool utc); char *xastrftime_msec(const char *format, long long int when, bool utc); char *ds_cstr(struct ds *); -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 5/8] lib/jsonrpc: make use of ofpbuf_use_ds()
Make use of ofpbuf_use_ds() to simplify code. Signed-off-by: Andy Zhou az...@nicira.com --- lib/jsonrpc.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index 282f3bb..b482777 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -240,7 +240,6 @@ jsonrpc_send(struct jsonrpc *rpc, struct jsonrpc_msg *msg) struct json *json; struct ds ds = DS_EMPTY_INITIALIZER; size_t length; -char *s; if (rpc-status) { jsonrpc_msg_destroy(msg); @@ -252,12 +251,10 @@ jsonrpc_send(struct jsonrpc *rpc, struct jsonrpc_msg *msg) json = jsonrpc_msg_to_json(msg); json_to_ds(json, 0, ds); length = ds.length; -s = ds_steal_cstr(ds); json_destroy(json); buf = xmalloc(sizeof *buf); -ofpbuf_use(buf, s, length); -buf-size = length; +ofpbuf_use_ds(buf, ds); list_push_back(rpc-output, buf-list_node); rpc-output_count++; rpc-backlog += length; -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 2/8] lib/ofpbuf: refactor ofpbuf_use__() API
Add the size to its parameter list. Signed-off-by: Andy Zhou az...@nicira.com --- lib/ofpbuf.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c index 05b513c..9b9d6b2 100644 --- a/lib/ofpbuf.c +++ b/lib/ofpbuf.c @@ -33,12 +33,12 @@ ofpbuf_init__(struct ofpbuf *b, size_t allocated, enum ofpbuf_source source) } static void -ofpbuf_use__(struct ofpbuf *b, void *base, size_t allocated, +ofpbuf_use__(struct ofpbuf *b, void *base, size_t allocated, size_t size, enum ofpbuf_source source) { b-base = base; b-data = base; -b-size = 0; +b-size = size; ofpbuf_init__(b, allocated, source); } @@ -50,7 +50,7 @@ ofpbuf_use__(struct ofpbuf *b, void *base, size_t allocated, void ofpbuf_use(struct ofpbuf *b, void *base, size_t allocated) { -ofpbuf_use__(b, base, allocated, OFPBUF_MALLOC); +ofpbuf_use__(b, base, allocated, 0, OFPBUF_MALLOC); } /* Initializes 'b' as an empty ofpbuf that contains the 'allocated' bytes of @@ -70,7 +70,7 @@ ofpbuf_use(struct ofpbuf *b, void *base, size_t allocated) void ofpbuf_use_stack(struct ofpbuf *b, void *base, size_t allocated) { -ofpbuf_use__(b, base, allocated, OFPBUF_STACK); +ofpbuf_use__(b, base, allocated, 0, OFPBUF_STACK); } /* Initializes 'b' as an empty ofpbuf that contains the 'allocated' bytes of @@ -90,7 +90,7 @@ ofpbuf_use_stack(struct ofpbuf *b, void *base, size_t allocated) void ofpbuf_use_stub(struct ofpbuf *b, void *base, size_t allocated) { -ofpbuf_use__(b, base, allocated, OFPBUF_STUB); +ofpbuf_use__(b, base, allocated, 0, OFPBUF_STUB); } /* Initializes 'b' as an ofpbuf whose data starts at 'data' and continues for @@ -103,8 +103,7 @@ ofpbuf_use_stub(struct ofpbuf *b, void *base, size_t allocated) void ofpbuf_use_const(struct ofpbuf *b, const void *data, size_t size) { -ofpbuf_use__(b, CONST_CAST(void *, data), size, OFPBUF_STACK); -b-size = size; +ofpbuf_use__(b, CONST_CAST(void *, data), size, size, OFPBUF_STACK); } /* Initializes 'b' as an empty ofpbuf with an initial capacity of 'size' -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 7/8] lib/dynamic-string: add a coverage counter for reloated bytes
Keep a count of bytes moved due to the realloc() call. Signed-off-by: Andy Zhou az...@nicira.com --- lib/dynamic-string.c | 9 + 1 file changed, 9 insertions(+) diff --git a/lib/dynamic-string.c b/lib/dynamic-string.c index a6c8f6c..89ead08 100644 --- a/lib/dynamic-string.c +++ b/lib/dynamic-string.c @@ -15,6 +15,7 @@ */ #include config.h +#include coverage.h #include dynamic-string.h #include inttypes.h #include stdlib.h @@ -23,6 +24,8 @@ #include timeval.h #include util.h +COVERAGE_DEFINE(ds_bytes_relocated); + /* Initializes 'ds' as an empty string buffer. */ void ds_init(struct ds *ds) @@ -58,9 +61,15 @@ void ds_reserve(struct ds *ds, size_t min_length) { if (min_length ds-allocated || !ds-string) { +char *string_ = ds-string; +size_t size_ = ds-allocated; + ds-allocated += MAX(min_length, ds-allocated); ds-allocated = MAX(8, ds-allocated); ds-string = xrealloc(ds-string, ds-allocated + 1); +if (string_ (string_ != ds-string)) { +COVERAGE_ADD(ds_bytes_relocated, size_); +} } } -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 4/8] jsonrpc: use json_to_ds to speed up jsonrpc_send
This change reuses the string length that available from 'ds', saving a strlen() call. Signed-off-by: Andy Zhou az...@nicira.com --- lib/jsonrpc.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index ae51b42..282f3bb 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -238,6 +238,7 @@ jsonrpc_send(struct jsonrpc *rpc, struct jsonrpc_msg *msg) { struct ofpbuf *buf; struct json *json; +struct ds ds = DS_EMPTY_INITIALIZER; size_t length; char *s; @@ -249,8 +250,9 @@ jsonrpc_send(struct jsonrpc *rpc, struct jsonrpc_msg *msg) jsonrpc_log_msg(rpc, send, msg); json = jsonrpc_msg_to_json(msg); -s = json_to_string(json, 0); -length = strlen(s); +json_to_ds(json, 0, ds); +length = ds.length; +s = ds_steal_cstr(ds); json_destroy(json); buf = xmalloc(sizeof *buf); -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [ovs-discuss] tc ingress qdisc of tapB disappeared when del-port tapA from bridge
Hi, I need some help about ovs QOS policing. I want to know the meaning of the flag VALID_POLICING In function netdev_linux_set_policing(), whether parameter kbits_rateequals to 0 or not, VALID_POLICING flag will be set. In my opinion, only when users set ingress_policing_rate to none zero, then VALID_POLICING should be set, if ingress_policing_rate is cleared or set to 0, then VALID_POLICING should be unset am I right? or there is another meaning of this flag? At 2015-08-10 21:36:51, ychen ychen103...@163.com wrote: There are still something puzzled me, can you do some help? 1. what's the meaning of the flag VALID_POLICING? I don't see any meaning of the flag VALID_POLICING in function netdev_linux_set_policing(). you see, whether the parameter kbits_rateequals to 0, the clause netdev-cache_valid |= VALID_POLICING; will be executed only if the conditions matches error !=0 another strange phenomenon is that, when i set breakpoint on function netdev_linux_set_policing, I found that netdev-cache_valid equals to 0x73 which means VALID_POLICING has set before this function but I can't find anywhere to set this flag except in function netdev_linux_set_policing. why? 2. which event triggered message RTM_NEWLINK? I found that when use command add-port br0 tap111, first function netdev_linux_set_policing() will be called, then netdev_linux_update() with message RTM_NEWLINK and this message will lead to netdev-cache_valid to clear the flag VALID_POLICING. my question is, the port tap111 is a system port, I have created it in kernel before add it to ovs bridge. and I didn't to anything like up the port or change the port's namespace or change mtu/mac etc, but why RTM_NEWLINK is trigerred? At 2015-07-23 23:51:41, Ben Pfaff b...@nicira.com wrote: On Thu, Jul 23, 2015 at 11:12:22AM +0800, ychen wrote: sorry, but how to contribute to this thread: http://openvswitch.org/pipermail/discuss/2015-May/017687.html It seems I can only reply this thread http://openvswitch.org/pipermail/discuss/2015-July/018324.html I don't see how any of your solutions solve the problem described in http://openvswitch.org/pipermail/discuss/2015-May/017687.html. I don't see much value in just tweaking the parameters of the problem. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] dev@openvswitch.org
The original message was received at Tue, 11 Aug 2015 14:36:35 +0530 from [222.51.252.46] - The following addresses had permanent fatal errors - dev@openvswitch.org ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] This thing won't help you to become a romantic person, but it will assist you in being a real stud instead!
When you'll be able to answer her desires? Good health good times! Improve it swiftly with Cialis Soft. Place an order in our site! Make your choice! A E shone and twinkled, and his usually pale face was flushed and ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Purge all ukeys when reconfigure pmd.
On 11.08.2015 05:47, Alex Wang wrote: When dpdk configuration changes, all pmd threads are recreated and rx queues of each port are reloaded. After this process, rx queue could be mapped to a different pmd thread other than the one before reconfiguration. However, this is totally transparent to ofproto layer modules. So, if the ofproto-dpif-upcall module still holds ukeys generated before pmd thread recreation, this old ukey will collide with the ukey for the new upcalls from same traffic flow, causing flow installation failure. To fix the bug, this commit adds a new call-back function in dpif layer for notifying upper layer the purging of datapath (e.g. pmd threads recreation in dpif-netdev). So, the ofproto-dpif-upcall module can react properly with deleting all the ukeys and with collecting all flows' last stats. Reported-by: Ilya Maximets i.maxim...@samsung.com Signed-off-by: Alex Wang al...@nicira.com --- lib/dpif-netdev.c | 25 + lib/dpif-netlink.c|1 + lib/dpif-provider.h | 11 ++- lib/dpif.c|8 lib/dpif.h| 12 ofproto/ofproto-dpif-upcall.c | 16 6 files changed, 72 insertions(+), 1 deletion(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index c144352..a54853f 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -203,6 +203,11 @@ struct dp_netdev { upcall_callback *upcall_cb; /* Callback function for executing upcalls. */ void *upcall_aux; +/* Callback function for notifying the purging of dp flows (during + * reseting pmd threads). */ +dp_purge_callback *dp_purge_cb; +void *dp_purge_aux; + /* Stores all 'struct dp_netdev_pmd_thread's. */ struct cmap poll_threads; @@ -223,6 +228,8 @@ struct dp_netdev { static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp, odp_port_t); +static void dp_netdev_disable_upcall(struct dp_netdev *); +static void dp_netdev_enable_upcall(struct dp_netdev *); enum dp_stat_type { DP_STAT_EXACT_HIT, /* Packets that had an exact match (emc). */ @@ -2440,10 +2447,18 @@ dpif_netdev_pmd_set(struct dpif *dpif, unsigned int n_rxqs, const char *cmask) free(dp-pmd_cmask); dp-pmd_cmask = cmask ? xstrdup(cmask) : NULL; +/* Disables upcall and notifies higher layer about the incoming + * datapath purging. */ +dp_netdev_disable_upcall(dp); +dp-dp_purge_cb(dp-dp_purge_aux); + /* Restores the non-pmd. */ dp_netdev_set_nonpmd(dp); /* Restores all pmd threads. */ dp_netdev_reset_pmd_threads(dp); + +/* Enables upcall after pmd reconfiguration. */ +dp_netdev_enable_upcall(dp); } It is better to enable upcalls before restoring PMD threads to prevent dropping of packets. return 0; @@ -3419,6 +3434,15 @@ struct dp_netdev_execute_aux { }; static void +dpif_netdev_register_dp_purge_cb(struct dpif *dpif, dp_purge_callback *cb, + void *aux) +{ +struct dp_netdev *dp = get_dp_netdev(dpif); +dp-dp_purge_aux = aux; +dp-dp_purge_cb = cb; +} + +static void dpif_netdev_register_upcall_cb(struct dpif *dpif, upcall_callback *cb, void *aux) { @@ -3665,6 +3689,7 @@ const struct dpif_class dpif_netdev_class = { NULL, /* recv */ NULL, /* recv_wait */ NULL, /* recv_purge */ +dpif_netdev_register_dp_purge_cb, dpif_netdev_register_upcall_cb, dpif_netdev_enable_upcall, dpif_netdev_disable_upcall, diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 8884a9f..f8f3092 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -2314,6 +2314,7 @@ const struct dpif_class dpif_netlink_class = { dpif_netlink_recv, dpif_netlink_recv_wait, dpif_netlink_recv_purge, +NULL, /* register_dp_purge_cb */ NULL, /* register_upcall_cb */ NULL, /* enable_upcall */ NULL, /* disable_upcall */ diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h index 28ea86f..8f2690f 100644 --- a/lib/dpif-provider.h +++ b/lib/dpif-provider.h @@ -361,13 +361,22 @@ struct dpif_class { * return. */ void (*recv_purge)(struct dpif *dpif); +/* When 'dpif' is about to purge the datapath, the higher layer may want + * to be notified so that it could try reacting accordingly (e.g. grabbing + * all flow stats before they are gone). + * + * Registers an upcall callback function with 'dpif'. This is only used if double 'if' here too. + * if 'dpif' needs to notify
Re: [ovs-dev] [PATCH 2/5] system-common-macros: Don't use bash to exec in ns.
I'm not an autoconf expert, but having NS_CHECK_EXEC and NS_CHECK_EXEC_UNQUOTED for this purpose might be confusing, since there already are AT_CHECK and AT_CHECK_UNQUOTED, and the UNQUOTED refers to comparison text. How about this? # NS_EXEC([namespace], [command]) # # Execute 'command' in 'namespace' m4_define([NS_EXEC], [ip netns exec $1 sh NS_EXEC_HEREDOC $2 NS_EXEC_HEREDOC ]) On 10/08/2015 21:17, Joe Stringer joestrin...@nicira.com wrote: Hmm. The problem I was having is that if we wrap 'command' in quotes here, and 'command' itself includes quotes (eg, command='echo foo | bar'), then rather than quoting foo, the command ends up quoting everything up to 'foo', then unquoting 'foo', then quoting everything after 'foo'. The ICMP related test in the conntrack branch has an instance of this. Maybe the right solution is to have two variants: NS_CHECK_EXEC() and NS_CHECK_EXEC_UNQUOTED(). I've seen this other places in OVS testsuite. On 10 August 2015 at 11:40, Daniele Di Proietto diproiet...@vmware.com wrote: I guess you need this to use quotes() inside 'command'. One effect of this change is that if 'command' contains a pipe (or , or ||) just the first command will be executed inside the namespace. I'm not sure if it's a big problem. What do you think? On 08/08/2015 00:28, Joe Stringer joestrin...@nicira.com wrote: ip netns exec $namespace $command doesn't need to use bash to execute the command. Remove it. Signed-off-by: Joe Stringer joestrin...@nicira.com --- tests/system-common-macros.at | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at index 1321e58..fdc2bc7 100644 --- a/tests/system-common-macros.at +++ b/tests/system-common-macros.at @@ -25,7 +25,7 @@ m4_define([ADD_NAMESPACES], # # Execute 'command' in 'namespace' m4_define([NS_EXEC], -[ip netns exec $1 bash -c $2] +[ip netns exec $1 $2] ) # NS_CHECK_EXEC([namespace], [command], other_params...) -- 2.1.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC] dpdk: support multiple queues in vhost
-Original Message- From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Flavio Leitner Sent: Friday, July 31, 2015 11:30 PM To: dev@openvswitch.org Cc: Flavio Leitner Subject: [ovs-dev] [RFC] dpdk: support multiple queues in vhost This RFC is based on the vhost multiple queues work on dpdk-dev: http://dpdk.org/ml/archives/dev/2015-June/019345.html Hi Flavio - the patch looks good, one minor comment below. Signed-off-by: Flavio Leitner f...@redhat.com --- lib/netdev-dpdk.c | 61 - -- 1 file changed, 40 insertions(+), 21 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 5ae805e..493172c 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -215,12 +215,9 @@ struct netdev_dpdk { * If the numbers match, 'txq_needs_locking' is false, otherwise it is * true and we will take a spinlock on transmission */ int real_n_txq; +int real_n_rxq; bool txq_needs_locking; -/* Spinlock for vhost transmission. Other DPDK devices use spinlocks in - * dpdk_tx_queue */ -rte_spinlock_t vhost_tx_lock; - /* virtio-net structure for vhost device */ OVSRCU_TYPE(struct virtio_net *) virtio_dev; @@ -602,13 +599,10 @@ dpdk_dev_parse_name(const char dev_name[], const char prefix[], static int vhost_construct_helper(struct netdev *netdev_) OVS_REQUIRES(dpdk_mutex) { -struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_); - if (rte_eal_init_ret) { return rte_eal_init_ret; } -rte_spinlock_init(netdev-vhost_tx_lock); return netdev_dpdk_init(netdev_, -1, DPDK_DEV_VHOST); } @@ -791,9 +785,16 @@ netdev_dpdk_vhost_set_multiq(struct netdev *netdev_, unsigned int n_txq, ovs_mutex_lock(dpdk_mutex); ovs_mutex_lock(netdev-mutex); +rte_free(netdev-tx_q); +/* FIXME: the number of vqueues needs to match */ netdev-up.n_txq = n_txq; -netdev-real_n_txq = 1; -netdev-up.n_rxq = 1; +netdev-up.n_rxq = n_rxq; + +/* vring has txq = rxq */ +netdev-real_n_txq = n_rxq; +netdev-real_n_rxq = n_rxq; +netdev-txq_needs_locking = netdev-real_n_txq != netdev-up.n_txq; +netdev_dpdk_alloc_txq(netdev, netdev-up.n_txq); ovs_mutex_unlock(netdev-mutex); ovs_mutex_unlock(dpdk_mutex); @@ -904,14 +905,14 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq_, struct netdev *netdev = rx-up.netdev; struct netdev_dpdk *vhost_dev = netdev_dpdk_cast(netdev); struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(vhost_dev); -int qid = 1; +int qid = rxq_-queue_id; uint16_t nb_rx = 0; if (OVS_UNLIKELY(!is_vhost_running(virtio_dev))) { return EAGAIN; } -nb_rx = rte_vhost_dequeue_burst(virtio_dev, qid, +nb_rx = rte_vhost_dequeue_burst(virtio_dev, VIRTIO_TXQ + qid * 2, vhost_dev-dpdk_mp-mp, (struct rte_mbuf **)packets, NETDEV_MAX_BURST); @@ -958,8 +959,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet **packets, } static void -__netdev_dpdk_vhost_send(struct netdev *netdev, struct dp_packet **pkts, - int cnt, bool may_steal) +__netdev_dpdk_vhost_send(struct netdev *netdev, int qid, + struct dp_packet **pkts, int cnt, + bool may_steal) { struct netdev_dpdk *vhost_dev = netdev_dpdk_cast(netdev); struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(vhost_dev); @@ -974,13 +976,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, struct dp_packet **pkts, goto out; } -/* There is vHost TX single queue, So we need to lock it for TX. */ -rte_spinlock_lock(vhost_dev-vhost_tx_lock); +if (vhost_dev-txq_needs_locking) { +qid = qid % vhost_dev-real_n_txq; +rte_spinlock_lock(vhost_dev-tx_q[qid].tx_lock); +} do { +int vhost_qid = VIRTIO_RXQ + qid * VIRTIO_QNUM; unsigned int tx_pkts; -tx_pkts = rte_vhost_enqueue_burst(virtio_dev, VIRTIO_RXQ, +tx_pkts = rte_vhost_enqueue_burst(virtio_dev, vhost_qid, cur_pkts, cnt); if (OVS_LIKELY(tx_pkts)) { /* Packets have been sent.*/ @@ -999,7 +1004,7 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, struct dp_packet **pkts, * Unable to enqueue packets to vhost interface. * Check available entries before retrying. */ -while (!rte_vring_available_entries(virtio_dev, VIRTIO_RXQ)) { +while (!rte_vring_available_entries(virtio_dev, vhost_qid)) { if (OVS_UNLIKELY((rte_get_timer_cycles() - start) timeout)) { expired = 1; break; @@ -1011,7 +1016,10 @@
Re: [ovs-dev] [PATCH v2 1/2] upcall: Fix minor race when deleting ukeys.
On Aug 10, 2015, at 6:46 PM, Ethan J. Jackson et...@nicira.com wrote: From: Ethan Jackson et...@nicira.com Since revalidator_sweep() doesn't hold the ukey mutex for each full loop iteration, it's theoretically possible that two threads may call ukey_delete() on the same ukey. If this happens, they both will attempt to remove the ukey from he cmap, causing the loser of the race to fail. Signed-off-by: Ethan J. Jackson et...@nicira.com --- ofproto/ofproto-dpif-upcall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 0f2e186..fddb535 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -2076,7 +2076,6 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) flow_exists = ukey-flow_exists; seq_mismatch = (ukey-dump_seq != dump_seq ukey-reval_seq != reval_seq); -ovs_mutex_unlock(ukey-mutex); if (flow_exists (purge There is a call to push_ukey_ops() between these blocks. It calls push_ukey_ops__(), which will take a lock on the ukeys, including the last one added and still locked, so this would result in double locking. This could be resolved by moving the “ if (n_ops == REVALIDATE_MAX_BATCH)” block after the (moved) ukey unlock, similarly to the remainder ops handling at the end. However, I’m not sure if this would defeat the purpose of this patch. Also, it seems that generally the umap lock is taken first and ukey locks after, so there is a possibility of a deadlock if the locks are taken in the reverse order, like in this patch. It would be really great if we could employ clang thread safety analysis more than we do in this file currently, but I’m not sure if that is possible. Jarno @@ -2095,6 +2094,7 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) ukey_delete(umap, ukey); ovs_mutex_unlock(umap-mutex); } +ovs_mutex_unlock(ukey-mutex); } if (n_ops) { -- 2.1.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ovn: Add lflow-list to ovn-sbctl.
Thx, this is really really useful!!! Just few minor suggestions below, On Mon, Aug 10, 2015 at 8:33 PM, Russell Bryant rbry...@redhat.com wrote: I frequently view the contents of the Logical_Flow table while working on OVN. Add a patch that can output the contents of this table in a sorted way that makes it easier to read through. It's sorted by logical datapath, pipeline, table id, priority, and match. Signed-off-by: Russell Bryant rbry...@redhat.com --- ovn/utilities/ovn-sbctl.8.in | 6 +++ ovn/utilities/ovn-sbctl.c| 94 2 files changed, 100 insertions(+) Example output: $ ./ovn-2port-setup.sh + ovn-nbctl lswitch-add sw0 + ovn-nbctl lport-add sw0 sw0-port1 + ovn-nbctl lport-add sw0 sw0-port2 + ovn-nbctl lport-set-macs sw0-port1 00:00:00:00:00:01 + ovn-nbctl lport-set-macs sw0-port2 00:00:00:00:00:02 + ovn-nbctl lport-set-port-security sw0-port1 00:00:00:00:00:01 + ovn-nbctl lport-set-port-security sw0-port2 00:00:00:00:00:02 + ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=sw0-port1 + ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=sw0-port2 $ ovn-sbctl lflow-list Datapath: 03f8a791-b5bb-47b6-b373-09b75b8e26f3 Pipeline: ingress table_id=0, priority=100, match=(eth.src[40]), action=(drop;) table_id=0, priority=100, match=(vlan.present), action=(drop;) table_id=0, priority= 50, match=(inport == sw0-port1 eth.src == {00:00:00:00:00:01}), action=(next;) table_id=0, priority= 50, match=(inport == sw0-port2 eth.src == {00:00:00:00:00:02}), action=(next;) table_id=0, priority= 0, match=(1), action=(drop;) table_id=1, priority=100, match=(eth.dst[40]), action=(outport = _MC_flood; output;) table_id=1, priority= 50, match=(eth.dst == 00:00:00:00:00:01), action=(outport = sw0-port1; output;) table_id=1, priority= 50, match=(eth.dst == 00:00:00:00:00:02), action=(outport = sw0-port2; output;) Datapath: 03f8a791-b5bb-47b6-b373-09b75b8e26f3 Pipeline: egress table_id=0, priority= 0, match=(1), action=(next;) table_id=1, priority=100, match=(eth.dst[40]), action=(output;) table_id=1, priority= 50, match=(outport == sw0-port1 eth.dst == {00:00:00:00:00:01}), action=(output;) table_id=1, priority= 50, match=(outport == sw0-port2 eth.dst == {00:00:00:00:00:02}), action=(output;) diff --git a/ovn/utilities/ovn-sbctl.8.in b/ovn/utilities/ovn-sbctl.8.in index b5c796e..9f9168e 100644 --- a/ovn/utilities/ovn-sbctl.8.in +++ b/ovn/utilities/ovn-sbctl.8.in @@ -146,6 +146,12 @@ Without \fB\-\-if\-exists\fR, attempting to unbind a logical port that is not bound is an error. With \fB\-\-if\-exists\fR, attempting to unbind logical port that is not bound has no effect. . +.SS Logical Flow Commands +. +.IP \fBlflow\-list\fR [\fIlogical\-datapath\fR] +List logical flows. If \fIlogical\-datapath\fR is specified, only list flows for +that logical datapath. +. .so lib/db-ctl-base.man .SH EXIT STATUS .IP 0 diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c index cbde60a..0bcb795 100644 --- a/ovn/utilities/ovn-sbctl.c +++ b/ovn/utilities/ovn-sbctl.c @@ -300,6 +300,9 @@ Port binding commands:\n\ lport-bind LPORT CHASSISbind logical port LPORT to CHASSIS\n\ lport-unbind LPORT reset the port binding of logical port LPORT\n\ \n\ +Logical flow commands:\n\ + lflow-list [DATAPATH] List logical flows for all or a single datapath\n\ +\n\ %s\ \n\ Options:\n\ @@ -470,6 +473,13 @@ pre_get_info(struct ctl_context *ctx) ovsdb_idl_add_column(ctx-idl, sbrec_port_binding_col_logical_port); ovsdb_idl_add_column(ctx-idl, sbrec_port_binding_col_chassis); + +ovsdb_idl_add_column(ctx-idl, sbrec_logical_flow_col_logical_datapath); +ovsdb_idl_add_column(ctx-idl, sbrec_logical_flow_col_pipeline); +ovsdb_idl_add_column(ctx-idl, sbrec_logical_flow_col_actions); +ovsdb_idl_add_column(ctx-idl, sbrec_logical_flow_col_priority); +ovsdb_idl_add_column(ctx-idl, sbrec_logical_flow_col_table_id); +ovsdb_idl_add_column(ctx-idl, sbrec_logical_flow_col_match); } static struct cmd_show_table cmd_show_tables[] = { @@ -584,6 +594,86 @@ cmd_lport_unbind(struct ctl_context *ctx) } } +static int +lflow_cmp(const void *lf1_, const void *lf2_) +{ +const struct sbrec_logical_flow *lf1, *lf2; + +lf1 = *((struct sbrec_logical_flow **) lf1_); +lf2 = *((struct sbrec_logical_flow **) lf2_); + +#define CMP(expr) \ +do { \ +int res; \ +res = (expr); \ +if (res) { \ +return res; \ +} \ +} while (0) + +CMP(memcmp(lf1-logical_datapath-header_.uuid, + lf2-logical_datapath-header_.uuid, +sizeof(lf1-logical_datapath-header_.uuid))); We have a uuid_compare_3way() function defined in lib/uuid.c, should we use that? +
Re: [ovs-dev] [PATCH v2 2/2] ofproto: Allow in-place modifications of datapath flows.
On Aug 10, 2015, at 6:46 PM, Ethan J. Jackson et...@nicira.com wrote: From: Ethan Jackson et...@nicira.com There are certain use cases (such as bond rebalancing) where a datapath flow's actions may change, while it's wildcard pattern remains the same. Before this patch, revalidators would note the change, delete the flow, and wait for the handlers to install an updated version. This is inefficient, as many packets could get punted to userspace before the new flow is finally installed. To improve the situation, this patch implements in place modification of datapath flows. If the revalidators detect the only change to a given ukey is its actions, instead of deleting it, it does a put with the MODIFY flag set. Signed-off-by: Ethan Jackson et...@nicira.com --- ofproto/ofproto-dpif-upcall.c | 169 +++--- tests/ofproto-dpif.at | 40 ++ 2 files changed, 150 insertions(+), 59 deletions(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index fddb535..8ef61bc 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -150,6 +150,12 @@ enum upcall_type { IPFIX_UPCALL/* Per-bridge sampling. */ }; +enum reval_result { +UKEY_KEEP, +UKEY_DELETE, +UKEY_MODIFY +}; + struct upcall { struct ofproto_dpif *ofproto; /* Parent ofproto. */ const struct recirc_id_node *recirc; /* Recirculation context. */ @@ -1663,33 +1669,41 @@ should_revalidate(const struct udpif *udpif, uint64_t packets, return false; } -static bool +/* Verifies that the datapath actions of 'ukey' are still correct, and pushes + * 'stats' for it. + * + * Returns a recommended action for 'ukey', options include: + * UKEY_DELETE The ukey should be deleted. + * UKEY_KEEP The ukey is fine as is. + * UKEY_MODIFY The ukey's actions should be changed but is otherwise + * fine. Callers should change the actions to those found + * in the caller supplied 'odp_actions' buffer. */ +static enum reval_result revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey, -const struct dpif_flow_stats *stats, uint64_t reval_seq) +const struct dpif_flow_stats *stats, +struct ofpbuf *odp_actions, uint64_t reval_seq) OVS_REQUIRES(ukey-mutex) { -uint64_t odp_actions_stub[1024 / 8]; -struct ofpbuf odp_actions = OFPBUF_STUB_INITIALIZER(odp_actions_stub); - struct xlate_out xout, *xoutp; struct netflow *netflow; struct ofproto_dpif *ofproto; struct dpif_flow_stats push; struct flow flow, dp_mask; struct flow_wildcards wc; +enum reval_result result; uint64_t *dp64, *xout64; ofp_port_t ofp_in_port; struct xlate_in xin; long long int last_used; int error; size_t i; -bool ok; bool need_revalidate; -ok = false; +result = UKEY_DELETE; xoutp = NULL; netflow = NULL; +ofpbuf_clear(odp_actions); need_revalidate = (ukey-reval_seq != reval_seq); last_used = ukey-stats.used; push.used = stats-used; @@ -1703,20 +1717,19 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey, if (need_revalidate last_used !should_revalidate(udpif, push.n_packets, last_used)) { -ok = false; goto exit; } /* We will push the stats, so update the ukey stats cache. */ ukey-stats = *stats; if (!push.n_packets !need_revalidate) { -ok = true; +result = UKEY_KEEP; goto exit; } if (ukey-xcache !need_revalidate) { xlate_push_stats(ukey-xcache, push); -ok = true; +result = UKEY_KEEP; goto exit; } @@ -1739,7 +1752,7 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey, } xlate_in_init(xin, ofproto, flow, ofp_in_port, NULL, push.tcp_flags, - NULL, need_revalidate ? wc : NULL, odp_actions); + NULL, need_revalidate ? wc : NULL, odp_actions); if (push.n_packets) { xin.resubmit_stats = push; xin.may_learn = true; @@ -1749,18 +1762,14 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey, xoutp = xout; if (!need_revalidate) { -ok = true; +result = UKEY_KEEP; goto exit; } if (xout.slow) { -ofpbuf_clear(odp_actions); +ofpbuf_clear(odp_actions); compose_slow_path(udpif, xout, flow, flow.in_port.odp_port, - odp_actions); -} - -if (!ofpbuf_equal(odp_actions, ukey-actions)) { -goto exit; + odp_actions); } if (odp_flow_key_to_mask(ukey-mask, ukey-mask_len, ukey-key, @@ -1781,18 +1790,24 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey, } } -ok = true; +
Re: [ovs-dev] [PATCH v2 1/2] upcall: Fix minor race when deleting ukeys.
On Aug 11, 2015, at 10:42 AM, Jarno Rajahalme jrajaha...@nicira.com wrote: On Aug 10, 2015, at 6:46 PM, Ethan J. Jackson et...@nicira.com wrote: From: Ethan Jackson et...@nicira.com Since revalidator_sweep() doesn't hold the ukey mutex for each full loop iteration, it's theoretically possible that two threads may call ukey_delete() on the same ukey. If this happens, they both will attempt to remove the ukey from he cmap, causing the loser of the race to fail. Signed-off-by: Ethan J. Jackson et...@nicira.com --- ofproto/ofproto-dpif-upcall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 0f2e186..fddb535 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -2076,7 +2076,6 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) flow_exists = ukey-flow_exists; seq_mismatch = (ukey-dump_seq != dump_seq ukey-reval_seq != reval_seq); -ovs_mutex_unlock(ukey-mutex); if (flow_exists (purge There is a call to push_ukey_ops() between these blocks. It calls push_ukey_ops__(), which will take a lock on the ukeys, including the last one added and still locked, so this would result in double locking. I missed the fact that handle_missed_revalidation() in here also takes the ukey lock. Jarno This could be resolved by moving the “ if (n_ops == REVALIDATE_MAX_BATCH)” block after the (moved) ukey unlock, similarly to the remainder ops handling at the end. However, I’m not sure if this would defeat the purpose of this patch. Also, it seems that generally the umap lock is taken first and ukey locks after, so there is a possibility of a deadlock if the locks are taken in the reverse order, like in this patch. It would be really great if we could employ clang thread safety analysis more than we do in this file currently, but I’m not sure if that is possible. Jarno @@ -2095,6 +2094,7 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) ukey_delete(umap, ukey); ovs_mutex_unlock(umap-mutex); } +ovs_mutex_unlock(ukey-mutex); } if (n_ops) { -- 2.1.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Purge all ukeys when reconfigure pmd.
Thx, IIya for the review, please see my reply below, On Tue, Aug 11, 2015 at 4:41 AM, Ilya Maximets i.maxim...@samsung.com wrote: On 11.08.2015 05:47, Alex Wang wrote: When dpdk configuration changes, all pmd threads are recreated and rx queues of each port are reloaded. After this process, rx queue could be mapped to a different pmd thread other than the one before reconfiguration. However, this is totally transparent to ofproto layer modules. So, if the ofproto-dpif-upcall module still holds ukeys generated before pmd thread recreation, this old ukey will collide with the ukey for the new upcalls from same traffic flow, causing flow installation failure. To fix the bug, this commit adds a new call-back function in dpif layer for notifying upper layer the purging of datapath (e.g. pmd threads recreation in dpif-netdev). So, the ofproto-dpif-upcall module can react properly with deleting all the ukeys and with collecting all flows' last stats. Reported-by: Ilya Maximets i.maxim...@samsung.com Signed-off-by: Alex Wang al...@nicira.com --- lib/dpif-netdev.c | 25 + lib/dpif-netlink.c|1 + lib/dpif-provider.h | 11 ++- lib/dpif.c|8 lib/dpif.h| 12 ofproto/ofproto-dpif-upcall.c | 16 6 files changed, 72 insertions(+), 1 deletion(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index c144352..a54853f 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -203,6 +203,11 @@ struct dp_netdev { upcall_callback *upcall_cb; /* Callback function for executing upcalls. */ void *upcall_aux; +/* Callback function for notifying the purging of dp flows (during + * reseting pmd threads). */ +dp_purge_callback *dp_purge_cb; +void *dp_purge_aux; + /* Stores all 'struct dp_netdev_pmd_thread's. */ struct cmap poll_threads; @@ -223,6 +228,8 @@ struct dp_netdev { static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp, odp_port_t); +static void dp_netdev_disable_upcall(struct dp_netdev *); +static void dp_netdev_enable_upcall(struct dp_netdev *); enum dp_stat_type { DP_STAT_EXACT_HIT, /* Packets that had an exact match (emc). */ @@ -2440,10 +2447,18 @@ dpif_netdev_pmd_set(struct dpif *dpif, unsigned int n_rxqs, const char *cmask) free(dp-pmd_cmask); dp-pmd_cmask = cmask ? xstrdup(cmask) : NULL; +/* Disables upcall and notifies higher layer about the incoming + * datapath purging. */ +dp_netdev_disable_upcall(dp); +dp-dp_purge_cb(dp-dp_purge_aux); + /* Restores the non-pmd. */ dp_netdev_set_nonpmd(dp); /* Restores all pmd threads. */ dp_netdev_reset_pmd_threads(dp); + +/* Enables upcall after pmd reconfiguration. */ +dp_netdev_enable_upcall(dp); } It is better to enable upcalls before restoring PMD threads to prevent dropping of packets. Sorry, the comments are misleading, the dp_netdev_reset_pmd_threads() is for recreating all pmd threads (not exactly restoring). If we move the dp_netdev_enable_upcall() before the function, then upcall will be enabled before destroying all existing pmd threads, and we will have the same issue again (the existing pmd threads will try installing flows/creating ukeys before getting destroyed). Also, since recreating pmd threads is considered a major configuration change and is disruptive, I think it should be okay to resume upcall handling right after all new pmd threads are created. return 0; @@ -3419,6 +3434,15 @@ struct dp_netdev_execute_aux { }; static void +dpif_netdev_register_dp_purge_cb(struct dpif *dpif, dp_purge_callback *cb, + void *aux) +{ +struct dp_netdev *dp = get_dp_netdev(dpif); +dp-dp_purge_aux = aux; +dp-dp_purge_cb = cb; +} + +static void dpif_netdev_register_upcall_cb(struct dpif *dpif, upcall_callback *cb, void *aux) { @@ -3665,6 +3689,7 @@ const struct dpif_class dpif_netdev_class = { NULL, /* recv */ NULL, /* recv_wait */ NULL, /* recv_purge */ +dpif_netdev_register_dp_purge_cb, dpif_netdev_register_upcall_cb, dpif_netdev_enable_upcall, dpif_netdev_disable_upcall, diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 8884a9f..f8f3092 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -2314,6 +2314,7 @@ const struct dpif_class dpif_netlink_class = { dpif_netlink_recv, dpif_netlink_recv_wait, dpif_netlink_recv_purge, +NULL,
Re: [ovs-dev] [PATCH 2/2] ofproto-dpif-upcall: Add VLOG_WARN_RL logs for upcall_cb() error.
Acked-by: Jarno Rajahalme jrajaha...@nicira.com On Aug 10, 2015, at 7:47 PM, Alex Wang al...@nicira.com wrote: Signed-off-by: Alex Wang al...@nicira.com --- ofproto/ofproto-dpif-upcall.c |4 1 file changed, 4 insertions(+) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 6385abc..4fed956 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1060,6 +1060,7 @@ upcall_cb(const struct dp_packet *packet, const struct flow *flow, ovs_u128 *ufi const struct nlattr *userdata, struct ofpbuf *actions, struct flow_wildcards *wc, struct ofpbuf *put_actions, void *aux) { +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); struct udpif *udpif = aux; unsigned int flow_limit; struct upcall upcall; @@ -1090,6 +1091,7 @@ upcall_cb(const struct dp_packet *packet, const struct flow *flow, ovs_u128 *ufi } if (udpif_get_n_flows(udpif) = flow_limit) { +VLOG_WARN_RL(rl, upcall_cb failure: datapath flow limit reached); error = ENOSPC; goto out; } @@ -1097,11 +1099,13 @@ upcall_cb(const struct dp_packet *packet, const struct flow *flow, ovs_u128 *ufi /* Prevent miss flow installation if the key has recirculation ID but we * were not able to get a reference on it. */ if (type == DPIF_UC_MISS upcall.recirc !upcall.have_recirc_ref) { +VLOG_WARN_RL(rl, upcall_cb failure: no reference for recirc flow); error = ENOSPC; goto out; } if (upcall.ukey !ukey_install(udpif, upcall.ukey)) { +VLOG_WARN_RL(rl, upcall_cb failure: ukey installation fails); error = ENOSPC; } out: -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC] dpdk: support multiple queues in vhost
On Tue, Aug 11, 2015 at 04:24:01PM +, Traynor, Kevin wrote: -Original Message- From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Flavio Leitner Sent: Friday, July 31, 2015 11:30 PM To: dev@openvswitch.org Cc: Flavio Leitner Subject: [ovs-dev] [RFC] dpdk: support multiple queues in vhost This RFC is based on the vhost multiple queues work on dpdk-dev: http://dpdk.org/ml/archives/dev/2015-June/019345.html Hi Flavio - the patch looks good, one minor comment below. Signed-off-by: Flavio Leitner f...@redhat.com --- lib/netdev-dpdk.c | 61 - -- 1 file changed, 40 insertions(+), 21 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 5ae805e..493172c 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -215,12 +215,9 @@ struct netdev_dpdk { * If the numbers match, 'txq_needs_locking' is false, otherwise it is * true and we will take a spinlock on transmission */ int real_n_txq; +int real_n_rxq; bool txq_needs_locking; -/* Spinlock for vhost transmission. Other DPDK devices use spinlocks in - * dpdk_tx_queue */ -rte_spinlock_t vhost_tx_lock; - /* virtio-net structure for vhost device */ OVSRCU_TYPE(struct virtio_net *) virtio_dev; @@ -602,13 +599,10 @@ dpdk_dev_parse_name(const char dev_name[], const char prefix[], static int vhost_construct_helper(struct netdev *netdev_) OVS_REQUIRES(dpdk_mutex) { -struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_); - if (rte_eal_init_ret) { return rte_eal_init_ret; } -rte_spinlock_init(netdev-vhost_tx_lock); return netdev_dpdk_init(netdev_, -1, DPDK_DEV_VHOST); } @@ -791,9 +785,16 @@ netdev_dpdk_vhost_set_multiq(struct netdev *netdev_, unsigned int n_txq, ovs_mutex_lock(dpdk_mutex); ovs_mutex_lock(netdev-mutex); +rte_free(netdev-tx_q); +/* FIXME: the number of vqueues needs to match */ netdev-up.n_txq = n_txq; -netdev-real_n_txq = 1; -netdev-up.n_rxq = 1; +netdev-up.n_rxq = n_rxq; + +/* vring has txq = rxq */ +netdev-real_n_txq = n_rxq; +netdev-real_n_rxq = n_rxq; +netdev-txq_needs_locking = netdev-real_n_txq != netdev-up.n_txq; +netdev_dpdk_alloc_txq(netdev, netdev-up.n_txq); ovs_mutex_unlock(netdev-mutex); ovs_mutex_unlock(dpdk_mutex); @@ -904,14 +905,14 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq_, struct netdev *netdev = rx-up.netdev; struct netdev_dpdk *vhost_dev = netdev_dpdk_cast(netdev); struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(vhost_dev); -int qid = 1; +int qid = rxq_-queue_id; uint16_t nb_rx = 0; if (OVS_UNLIKELY(!is_vhost_running(virtio_dev))) { return EAGAIN; } -nb_rx = rte_vhost_dequeue_burst(virtio_dev, qid, +nb_rx = rte_vhost_dequeue_burst(virtio_dev, VIRTIO_TXQ + qid * 2, vhost_dev-dpdk_mp-mp, (struct rte_mbuf **)packets, NETDEV_MAX_BURST); @@ -958,8 +959,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet **packets, } static void -__netdev_dpdk_vhost_send(struct netdev *netdev, struct dp_packet **pkts, - int cnt, bool may_steal) +__netdev_dpdk_vhost_send(struct netdev *netdev, int qid, + struct dp_packet **pkts, int cnt, + bool may_steal) { struct netdev_dpdk *vhost_dev = netdev_dpdk_cast(netdev); struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(vhost_dev); @@ -974,13 +976,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, struct dp_packet **pkts, goto out; } -/* There is vHost TX single queue, So we need to lock it for TX. */ -rte_spinlock_lock(vhost_dev-vhost_tx_lock); +if (vhost_dev-txq_needs_locking) { +qid = qid % vhost_dev-real_n_txq; +rte_spinlock_lock(vhost_dev-tx_q[qid].tx_lock); +} do { +int vhost_qid = VIRTIO_RXQ + qid * VIRTIO_QNUM; unsigned int tx_pkts; -tx_pkts = rte_vhost_enqueue_burst(virtio_dev, VIRTIO_RXQ, +tx_pkts = rte_vhost_enqueue_burst(virtio_dev, vhost_qid, cur_pkts, cnt); if (OVS_LIKELY(tx_pkts)) { /* Packets have been sent.*/ @@ -999,7 +1004,7 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, struct dp_packet **pkts, * Unable to enqueue packets to vhost interface. * Check available entries before retrying. */ -while (!rte_vring_available_entries(virtio_dev, VIRTIO_RXQ)) { +while
Re: [ovs-dev] [PATCH 2/5] system-common-macros: Don't use bash to exec in ns.
This looks like a reasonable solution, I'll have a play around with it. On 11 August 2015 at 05:25, Daniele Di Proietto diproiet...@vmware.com wrote: I'm not an autoconf expert, but having NS_CHECK_EXEC and NS_CHECK_EXEC_UNQUOTED for this purpose might be confusing, since there already are AT_CHECK and AT_CHECK_UNQUOTED, and the UNQUOTED refers to comparison text. How about this? # NS_EXEC([namespace], [command]) # # Execute 'command' in 'namespace' m4_define([NS_EXEC], [ip netns exec $1 sh NS_EXEC_HEREDOC $2 NS_EXEC_HEREDOC ]) On 10/08/2015 21:17, Joe Stringer joestrin...@nicira.com wrote: Hmm. The problem I was having is that if we wrap 'command' in quotes here, and 'command' itself includes quotes (eg, command='echo foo | bar'), then rather than quoting foo, the command ends up quoting everything up to 'foo', then unquoting 'foo', then quoting everything after 'foo'. The ICMP related test in the conntrack branch has an instance of this. Maybe the right solution is to have two variants: NS_CHECK_EXEC() and NS_CHECK_EXEC_UNQUOTED(). I've seen this other places in OVS testsuite. On 10 August 2015 at 11:40, Daniele Di Proietto diproiet...@vmware.com wrote: I guess you need this to use quotes() inside 'command'. One effect of this change is that if 'command' contains a pipe (or , or ||) just the first command will be executed inside the namespace. I'm not sure if it's a big problem. What do you think? On 08/08/2015 00:28, Joe Stringer joestrin...@nicira.com wrote: ip netns exec $namespace $command doesn't need to use bash to execute the command. Remove it. Signed-off-by: Joe Stringer joestrin...@nicira.com --- tests/system-common-macros.at | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at index 1321e58..fdc2bc7 100644 --- a/tests/system-common-macros.at +++ b/tests/system-common-macros.at @@ -25,7 +25,7 @@ m4_define([ADD_NAMESPACES], # # Execute 'command' in 'namespace' m4_define([NS_EXEC], -[ip netns exec $1 bash -c $2] +[ip netns exec $1 $2] ) # NS_CHECK_EXEC([namespace], [command], other_params...) -- 2.1.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2 1/2] upcall: Fix minor race when deleting ukeys.
Yeah sorry this patch is not great. That second problem is solved in the follow on. I'd be somewhat inclined to drop this one all together. Ethan On Tue, Aug 11, 2015 at 11:15 AM, Jarno Rajahalme jrajaha...@nicira.com wrote: On Aug 11, 2015, at 10:42 AM, Jarno Rajahalme jrajaha...@nicira.com wrote: On Aug 10, 2015, at 6:46 PM, Ethan J. Jackson et...@nicira.com wrote: From: Ethan Jackson et...@nicira.com Since revalidator_sweep() doesn't hold the ukey mutex for each full loop iteration, it's theoretically possible that two threads may call ukey_delete() on the same ukey. If this happens, they both will attempt to remove the ukey from he cmap, causing the loser of the race to fail. Signed-off-by: Ethan J. Jackson et...@nicira.com --- ofproto/ofproto-dpif-upcall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 0f2e186..fddb535 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -2076,7 +2076,6 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) flow_exists = ukey-flow_exists; seq_mismatch = (ukey-dump_seq != dump_seq ukey-reval_seq != reval_seq); -ovs_mutex_unlock(ukey-mutex); if (flow_exists (purge There is a call to push_ukey_ops() between these blocks. It calls push_ukey_ops__(), which will take a lock on the ukeys, including the last one added and still locked, so this would result in double locking. I missed the fact that handle_missed_revalidation() in here also takes the ukey lock. Jarno This could be resolved by moving the “ if (n_ops == REVALIDATE_MAX_BATCH)” block after the (moved) ukey unlock, similarly to the remainder ops handling at the end. However, I’m not sure if this would defeat the purpose of this patch. Also, it seems that generally the umap lock is taken first and ukey locks after, so there is a possibility of a deadlock if the locks are taken in the reverse order, like in this patch. It would be really great if we could employ clang thread safety analysis more than we do in this file currently, but I’m not sure if that is possible. Jarno @@ -2095,6 +2094,7 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) ukey_delete(umap, ukey); ovs_mutex_unlock(umap-mutex); } +ovs_mutex_unlock(ukey-mutex); } if (n_ops) { -- 2.1.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/2] ofproto-dpif-upcall: Add VLOG_WARN_RL logs for upcall_cb() error.
Thx, applied to master~ On Tue, Aug 11, 2015 at 11:16 AM, Jarno Rajahalme jrajaha...@nicira.com wrote: Acked-by: Jarno Rajahalme jrajaha...@nicira.com On Aug 10, 2015, at 7:47 PM, Alex Wang al...@nicira.com wrote: Signed-off-by: Alex Wang al...@nicira.com --- ofproto/ofproto-dpif-upcall.c |4 1 file changed, 4 insertions(+) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 6385abc..4fed956 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1060,6 +1060,7 @@ upcall_cb(const struct dp_packet *packet, const struct flow *flow, ovs_u128 *ufi const struct nlattr *userdata, struct ofpbuf *actions, struct flow_wildcards *wc, struct ofpbuf *put_actions, void *aux) { +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); struct udpif *udpif = aux; unsigned int flow_limit; struct upcall upcall; @@ -1090,6 +1091,7 @@ upcall_cb(const struct dp_packet *packet, const struct flow *flow, ovs_u128 *ufi } if (udpif_get_n_flows(udpif) = flow_limit) { +VLOG_WARN_RL(rl, upcall_cb failure: datapath flow limit reached); error = ENOSPC; goto out; } @@ -1097,11 +1099,13 @@ upcall_cb(const struct dp_packet *packet, const struct flow *flow, ovs_u128 *ufi /* Prevent miss flow installation if the key has recirculation ID but we * were not able to get a reference on it. */ if (type == DPIF_UC_MISS upcall.recirc !upcall.have_recirc_ref) { +VLOG_WARN_RL(rl, upcall_cb failure: no reference for recirc flow); error = ENOSPC; goto out; } if (upcall.ukey !ukey_install(udpif, upcall.ukey)) { +VLOG_WARN_RL(rl, upcall_cb failure: ukey installation fails); error = ENOSPC; } out: -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2 2/2] ofproto: Allow in-place modifications of datapath flows.
On 10 August 2015 at 18:46, Ethan J. Jackson et...@nicira.com wrote: exit: -if (ok) { +if (result != UKEY_DELETE) { ukey-reval_seq = reval_seq; } -if (netflow !ok) { +if (netflow result != UKEY_DELETE) { netflow_flow_clear(netflow, flow); } I think netflow should be cleared only if we delete. +if (purge) { +result = UKEY_DELETE; +} else if (ukey-dump_seq == dump_seq + || ukey-reval_seq == reval_seq) { +result = UKEY_KEEP; Indentation in elseif. -} else if (!flow_exists) { -ovs_mutex_lock(umap-mutex); -ukey_delete(umap, ukey); -ovs_mutex_unlock(umap-mutex); +} else if (result == UKEY_MODIFY) { +ofpbuf_delete(ukey-actions); +ukey-actions = ofpbuf_clone(odp_actions); +modify_op_init(ops[n_ops++], ukey); Per-batch push_ukey_ops() occurs for UKEY_DELETE case, but not UKEY_MODIFY. this may allow n_ops to exceed REVALIDATE_MAX_BATCH if a bunch of modify operations all occur at once. diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 58c426b..121f84d 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -6985,3 +6985,43 @@ recirc_id=0,ip,in_port=1,dl_vlan=10,nw_frag=no, actions: del ]) OVS_VSWITCHD_STOP AT_CLEANUP + +# Tests in place modification of installed datapath flows. +AT_SETUP([ofproto-dpif - in place modification]) +OVS_VSWITCHD_START( + [add-port br0 p1 -- set Interface p1 type=dummy ofport_request=1]) +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg]) + +AT_CHECK([ovs-ofctl del-flows br0]) +AT_CHECK([ovs-ofctl add-flow br0 in_port=1,actions=mod_vlan_vid:3,output:local]) + +ovs-appctl vlog/set PATTERN:ANY:'%c|%p|%m' + +ovs-appctl time/stop + +for i in 1 2 3; do +ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x1234)' +done + +AT_CHECK([ovs-appctl dpif/dump-flows br0 | STRIP_UFID | STRIP_USED | sort], [0], [dnl +recirc_id(0),in_port(1),eth_type(0x1234), packets:2, bytes:120, used:0.0s, actions:push_vlan(vid=3,pcp=0),100 +]) + +AT_CHECK([ovs-ofctl add-flow br0 priority=6,in_port=1,actions=mod_vlan_vid:4,output:local]) + +ovs-appctl time/warp 500 +ovs-appctl time/warp 500 + +for i in 1 2 3; do +ovs-appctl netdev-dummy/receive p1 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x1234)' +done + +AT_CHECK([ovs-appctl dpif/dump-flows br0 | STRIP_UFID | STRIP_USED | sort], [0], [dnl +recirc_id(0),in_port(1),eth_type(0x1234), packets:5, bytes:300, used:0.0s, actions:push_vlan(vid=4,pcp=0),100 +]) + +AT_CHECK([cat ovs-vswitchd.log | grep 'modify' | STRIP_UFID ], [0], [dnl +dpif|DBG|dummy@ovs-dummy: put[[modify]] skb_priority(0/0),skb_mark(0/0),recirc_id(0),dp_hash(0/0),in_port(1),eth(src=50:54:00:00:00:09/00:00:00:00:00:00,dst=50:54:00:00:00:0a/00:00:00:00:00:00),eth_type(0x1234), actions:push_vlan(vid=4,pcp=0),100 +]) +OVS_VSWITCHD_STOP +AT_CLEANUP Have you tried running this test on travis or a similar environment under high load? I wonder if it's prone to race conditions, because the test script doesn't do any waiting to ensure that the packets are handled before it checks the logs. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2 2/2] ofproto: Allow in-place modifications of datapath flows.
On 11 August 2015 at 11:13, Jarno Rajahalme jrajaha...@nicira.com wrote: -if (!keep) { +if (result == UKEY_DELETE) { delete_op_init(udpif, ops[n_ops++], ukey); +} else if (result == UKEY_MODIFY) { +ofpbuf_delete(ukey-actions); +ukey-actions = ofpbuf_clone(odp_actions); ukey’s actions is documented to be read only, so you should check if it is safe to change it. Also, ukey is said to be RCU protected, so maybe the ukey-actions pointer should be changed to an RCU pointer and the old key postpone deleted after the new pointer is set? Or, have you checked that holding the ukey mutex while changing the actions is enough? In that case you should update the struct ukey definition accordingly. I think that SFLOW_UPCALL handling may dislike this; Generally I think that the assumption is that you will lock the ukey while using its contents. Either way, that code will need to be updated and the definition documentation should be changed accordingly. +continue; +} + +if (purge) { +result = UKEY_DELETE; +} else if (ukey-dump_seq == dump_seq + || ukey-reval_seq == reval_seq) { +result = UKEY_KEEP; +} else { +struct dpif_flow_stats stats; +COVERAGE_INC(revalidate_missed_dp_flow); +memset(stats, 0, sizeof stats); +result = revalidate_ukey(udpif, ukey, stats, odp_actions, + reval_seq); +} + +if (result == UKEY_DELETE) { struct ukey_op *op = ops[n_ops++]; delete_op_init(udpif, op, ukey); @@ -2089,17 +2137,20 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) push_ukey_ops(udpif, umap, ops, n_ops); push_key_ops() will try to take the locks of the ukeys, so it would double lock the key we are currently processing. push_ukey_ops__() takes the ukey mutex, yes. This is avoided today by unlocking earlier in the function. IMO the most likely race condition trigger here is when a user executes ovs-appctl revalidator/purge, because the main thread has a small chance to iterate across a umap in synchronization (but slightly out of time) with a corresponding revalidator thread. Jarno suggested offline adding some new variable to ukey which indicates that the ukey has been swept this round. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/5] system-kmod-macros: Fix VSWITCHD_STOP.
On 10 August 2015 at 11:32, Daniele Di Proietto diproiet...@vmware.com wrote: I'm surprised as well that the tests were passing. Sorry about this and thanks for finding out Acked-by: Daniele Di Proietto diproiet...@vmware.com Thanks, I applied this patch to master. I plan to revise the later patches to try to handle support detection and graceful startup/shutdown better. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH RFC] coverage: fix coverage accounting for pmd threads
Just trying share an alternative, could we still do the coverage clear inside 'pmd_thread_main()'. However, instead of contending for the 'coverage_mutex', use 'ovs_mutex_trylock()'. Since pmd thread runs in infinite loop, it can constantly try grabbing the lock and will eventually acquire it. Something like this: diff --git a/lib/coverage.c b/lib/coverage.c index 6121956..9f23f99 100644 --- a/lib/coverage.c +++ b/lib/coverage.c @@ -272,6 +272,37 @@ coverage_clear(void) } } +/* Runs approximately every COVERAGE_CLEAR_INTERVAL amount of time to + * synchronize per-thread counters with global counters. Exits immediately + * without updating the 'thread_time' if cannot acquire the 'coverage_mutex'. + * This is to avoid lock contention for some performance-critical threads. + */ +void +coverage_try_clear(void) +{ +long long int now, *thread_time; + +now = time_msec(); +thread_time = coverage_clear_time_get(); + +/* Initialize the coverage_clear_time. */ +if (*thread_time == LLONG_MIN) { +*thread_time = now + COVERAGE_CLEAR_INTERVAL; +} + +if (now = *thread_time) { +if (ovs_mutex_trylock(coverage_mutex)) { +size_t i; +for (i = 0; i n_coverage_counters; i++) { +struct coverage_counter *c = coverage_counters[i]; +c-total += c-count(); +} +ovs_mutex_unlock(coverage_mutex); +*thread_time = now + COVERAGE_CLEAR_INTERVAL; +} +} +} + /* Runs approximately every COVERAGE_RUN_INTERVAL amount of time to update the * coverage counters' 'min' and 'hr' array. 'min' array is for cumulating * per second counts into per minute count. 'hr' array is for cumulating per diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index c144352..96884ec 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -2695,6 +2695,7 @@ reload: lc = 0; +coverage_try_clear(); emc_cache_slow_sweep(pmd-flow_cache); ovsrcu_quiesce(); On Tue, Aug 11, 2015 at 6:24 PM, Alex Wang al...@nicira.com wrote: Hi IIya, Thx a lot for writing up this fix, The reason that you did not see the pmd related coverage counter stats (in ovs-appctl coverage/show) is that we do not call coverage_clear() anywhere in pmd_thread_main(). This is in that the coverage_clear() requires holding the global mutex in the coverage module which can be super expensive for dpdk packet processing. Want to admit that your fix is more complicated than I expect. Especially, changing the global (extern) struct 'counter_##COUNTER' to per-thread and expanding the array size by number of threads times can also be expensive for main thread... Do you think there could be a simpler way to achieve this? I'd like to think about other possible solutions more, Thanks, Alex Wang, On Mon, Aug 10, 2015 at 7:45 AM, Ilya Maximets i.maxim...@samsung.com wrote: Currently coverage_counter_register() is called only from constructor functions at program initialization time by main thread. Coverage counter values are static and thread local. This means, that all COVERAGE_INC() calls from pmd threads leads to increment of thread local variables of that threads that can't be accessed by anyone else. And they are not used in final coverage accounting. Fix that by adding ability to add/remove coverage counters in runtime for newly created threads and counting coverage using counters from all threads. Signed-off-by: Ilya Maximets i.maxim...@samsung.com --- lib/coverage.c| 83 +-- lib/coverage.h| 27 -- lib/dpif-netdev.c | 4 ++- 3 files changed, 91 insertions(+), 23 deletions(-) diff --git a/lib/coverage.c b/lib/coverage.c index 6121956..2ae9e7a 100644 --- a/lib/coverage.c +++ b/lib/coverage.c @@ -30,14 +30,22 @@ VLOG_DEFINE_THIS_MODULE(coverage); /* The coverage counters. */ static struct coverage_counter **coverage_counters = NULL; -static size_t n_coverage_counters = 0; static size_t allocated_coverage_counters = 0; +static void (**coverage_initializers)(void) = NULL; +static size_t n_coverage_initializers = 0; +static size_t allocated_coverage_initializers = 0; + static struct ovs_mutex coverage_mutex = OVS_MUTEX_INITIALIZER; DEFINE_STATIC_PER_THREAD_DATA(long long int, coverage_clear_time, LLONG_MIN); static long long int coverage_run_time = LLONG_MIN; +volatile unsigned int *coverage_val[2048] = {NULL}; +volatile size_t n_coverage_counters = 0; + +DEFINE_STATIC_PER_THREAD_DATA(unsigned int, coverage_thread_start, 0); + /* Index counter used to compute the moving average array's index. */ static unsigned int idx_count = 0; @@ -54,7 +62,44 @@ coverage_counter_register(struct coverage_counter* counter) allocated_coverage_counters, sizeof(struct
Re: [ovs-dev] [PATCH RFC] coverage: fix coverage accounting for pmd threads
Hi IIya, Thx a lot for writing up this fix, The reason that you did not see the pmd related coverage counter stats (in ovs-appctl coverage/show) is that we do not call coverage_clear() anywhere in pmd_thread_main(). This is in that the coverage_clear() requires holding the global mutex in the coverage module which can be super expensive for dpdk packet processing. Want to admit that your fix is more complicated than I expect. Especially, changing the global (extern) struct 'counter_##COUNTER' to per-thread and expanding the array size by number of threads times can also be expensive for main thread... Do you think there could be a simpler way to achieve this? I'd like to think about other possible solutions more, Thanks, Alex Wang, On Mon, Aug 10, 2015 at 7:45 AM, Ilya Maximets i.maxim...@samsung.com wrote: Currently coverage_counter_register() is called only from constructor functions at program initialization time by main thread. Coverage counter values are static and thread local. This means, that all COVERAGE_INC() calls from pmd threads leads to increment of thread local variables of that threads that can't be accessed by anyone else. And they are not used in final coverage accounting. Fix that by adding ability to add/remove coverage counters in runtime for newly created threads and counting coverage using counters from all threads. Signed-off-by: Ilya Maximets i.maxim...@samsung.com --- lib/coverage.c| 83 +-- lib/coverage.h| 27 -- lib/dpif-netdev.c | 4 ++- 3 files changed, 91 insertions(+), 23 deletions(-) diff --git a/lib/coverage.c b/lib/coverage.c index 6121956..2ae9e7a 100644 --- a/lib/coverage.c +++ b/lib/coverage.c @@ -30,14 +30,22 @@ VLOG_DEFINE_THIS_MODULE(coverage); /* The coverage counters. */ static struct coverage_counter **coverage_counters = NULL; -static size_t n_coverage_counters = 0; static size_t allocated_coverage_counters = 0; +static void (**coverage_initializers)(void) = NULL; +static size_t n_coverage_initializers = 0; +static size_t allocated_coverage_initializers = 0; + static struct ovs_mutex coverage_mutex = OVS_MUTEX_INITIALIZER; DEFINE_STATIC_PER_THREAD_DATA(long long int, coverage_clear_time, LLONG_MIN); static long long int coverage_run_time = LLONG_MIN; +volatile unsigned int *coverage_val[2048] = {NULL}; +volatile size_t n_coverage_counters = 0; + +DEFINE_STATIC_PER_THREAD_DATA(unsigned int, coverage_thread_start, 0); + /* Index counter used to compute the moving average array's index. */ static unsigned int idx_count = 0; @@ -54,7 +62,44 @@ coverage_counter_register(struct coverage_counter* counter) allocated_coverage_counters, sizeof(struct coverage_counter*)); } -coverage_counters[n_coverage_counters++] = counter; +coverage_counters[n_coverage_counters] = counter; +coverage_val[n_coverage_counters++] = NULL; +} + +void +coverage_initializer_register(void (*f)(void)) +{ +if (n_coverage_initializers = allocated_coverage_initializers) { +coverage_initializers = x2nrealloc(coverage_initializers, + allocated_coverage_initializers, + sizeof *coverage_initializers); +} +coverage_initializers[n_coverage_initializers++] = f; +} + +void +coverage_register_new_thread(void) +{ +int i; +ovs_mutex_lock(coverage_mutex); +*coverage_thread_start_get() = n_coverage_counters; +for (i = 0; i n_coverage_initializers; i++) +coverage_initializers[i](); +ovs_mutex_unlock(coverage_mutex); +} + +void +coverage_unregister_thread(void) +{ +int i; +ovs_mutex_lock(coverage_mutex); +for (i = *coverage_thread_start_get() + n_coverage_initializers; + i n_coverage_counters; i++) { +coverage_counters[i - n_coverage_initializers] = coverage_counters[i]; +coverage_val[i - n_coverage_initializers] = coverage_val[i]; +} +n_coverage_counters -= n_coverage_initializers; +ovs_mutex_unlock(coverage_mutex); } static void @@ -102,13 +147,12 @@ coverage_hash(void) uint32_t hash = 0; int n_groups, i; +ovs_mutex_lock(coverage_mutex); /* Sort coverage counters into groups with equal totals. */ c = xmalloc(n_coverage_counters * sizeof *c); -ovs_mutex_lock(coverage_mutex); for (i = 0; i n_coverage_counters; i++) { c[i] = coverage_counters[i]; } -ovs_mutex_unlock(coverage_mutex); qsort(c, n_coverage_counters, sizeof *c, compare_coverage_counters); /* Hash the names in each group along with the rank. */ @@ -130,6 +174,7 @@ coverage_hash(void) i = j; } +ovs_mutex_unlock(coverage_mutex); free(c); return hash_int(n_groups, hash); @@ -200,9