Re: [ovs-dev] [PATCH] jsonrpc-server: Enforce uniqueness of monitor IDs.
> On Dec 21, 2017, at 4:30 PM, Ben Pfaff wrote:
>
> What do you think of this version?

Yes, it's much clearer to me.  Thanks!

> @@ -1458,10 +1466,12 @@ ovsdb_jsonrpc_monitor_cond_change(struct ovsdb_jsonrpc_session *s,
>      }
>
>      /* Change monitor id */
> -    hmap_remove(&s->monitors, &m->node);
> -    json_destroy(m->monitor_id);
> -    m->monitor_id = json_clone(params->u.array.elems[1]);
> -    hmap_insert(&s->monitors, &m->node, json_hash(m->monitor_id, 0));
> +    if (changing_id) {
> +        hmap_remove(&s->monitors, &m->node);
> +        json_destroy(m->monitor_id);
> +        m->monitor_id = json_clone(new_monitor_id);
> +        hmap_insert(&s->monitors, &m->node, json_hash(m->monitor_id, 0));
> +    }

It doesn't matter, but you can probably drop the comment, since the "if"
statement is pretty clear.

Acked-by: Justin Pettit

--Justin
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3 6/8] ovsdb-client: Add new "restore" command.
On Wed, Dec 20, 2017 at 02:23:54PM -0800, Justin Pettit wrote:
> > On Dec 13, 2017, at 3:10 PM, Ben Pfaff wrote:
> >
> > diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
> > index ddc573a47c94..1f860768df2d 100644
> > --- a/Documentation/ref/ovsdb.7.rst
> > +++ b/Documentation/ref/ovsdb.7.rst
> > @@ -317,9 +317,14 @@ Another way to make a backup is to use ``ovsdb-client backup``, which
> >  connects to a running database server and outputs an atomic snapshot of its
> >  schema and content, in the same format used for on-disk databases.
> >
> > -To restore from a backup, stop the database server or servers, overwrite
> > -the database file with the backup (e.g. with ``cp``), and then
> > -restart the servers.
> > +Multiple options are also available when the time comes to restore a database
> > +from a backup.  One option is to stop the database server or servers, overwrite
> > +the database file with the backup (e.g. with ``cp``), and then restart the
> > +servers.  Another way is to use ``ovsdb-client restore``, which connects to a
> > +running database server and replaces the data in one of its databases by a
> > +provided snapshot.  Using ``ovsdb-client restore`` has the disadvantage that
> > +UUIDs of rows in the restored database will differ from those in the snapshot,
> > +because the OVSDB protocol does not allow clients to specify row UUIDs.
>
> Do you think it's worth mentioning the cases where it is useful?  Does that
> come later with the clustering change?

Thanks, I added that it allows avoiding any downtime for the database.

> > diff --git a/ovsdb/ovsdb-client.1.in b/ovsdb/ovsdb-client.1.in
> > index 2e2df5e5aa7f..30de9c536600 100644
> > --- a/ovsdb/ovsdb-client.1.in
> > +++ b/ovsdb/ovsdb-client.1.in
> > @@ -33,6 +33,9 @@ ovsdb\-client \- command-line interface to \fBovsdb-server\fR(1)
> >  \fBovsdb\-client \fR[\fIoptions\fR] \fBbackup\fI \fR[\fIserver\fR] \fR[\fIdatabase\fR] > \fIsnapshot\fR
> >  .br
> > +\fBovsdb\-client \fR[\fIoptions\fR] [\fB\-\-force\fR]
> > +\fBrestore\fI \fR[\fIserver\fR] \fR[\fIdatabase\fR] < \fIsnapshot\fR
>
> Same comment as the earlier patches about unnecessary markups.

Fixed, thanks.

> > @@ -41,7 +44,6 @@ ovsdb\-client \- command-line interface to \fBovsdb-server\fR(1)
> >  \fBovsdb\-client \fR[\fIoptions\fR] \fBmonitor\-cond\fI \fR[\fIserver\fR] \fR[\fIdatabase\fR] \fIconditions
> >  \fItable\fR [\fIcolumn\fR[\fB,\fIcolumn\fR]...]...
> >  .IP "Testing Commands:"
> > -.br
> >  \fBovsdb\-client \fR[\fIoptions\fR] \fBlock\fI \fR[\fIserver\fR] \fIlock\fR
> >  .br
> >  \fBovsdb\-client \fR[\fIoptions\fR] \fBsteal\fI \fR[\fIserver\fR] \fIlock\fR
> > @@ -156,6 +158,30 @@ database is in use.
> >  The output does not include ephemeral columns, which by design do not
> >  survive across restarts of \fBovsdb\-server\fR.
> >  .
> > +.IP "[\fB\-\-force\fR] \fBrestore\fI \fR[\fIserver\fR] \fR[\fIdatabase\fR] < \fIsnapshot\fR"
>
> Here too.

Ditto.

> > +Reads \fIsnapshot\fR, which must be in the format used for OVSDB
> > +standalone and active-backup databases.  Then, connects to
>
> Do you think it's worth mentioning how you'd get it in that format?

> > +Normally \fBrestore\fR exits with a failure if \fBsnapshot\fR and the
> > +server's database have different schemas.  In such a case, it is a
> > +good idea to convert the database to the new schema before restoring,
> > +e.g. with \fBovsdb\-client convert\fR.  Use \fB\-\-force\fR to proceed
> > +regardless of schema differences even though the restore might fail
> > +with an error or succeed with surprising results.
>
> I love surprises!

Then you should use --force all the time!

> > diff --git a/ovsdb/ovsdb-client.c b/ovsdb/ovsdb-client.c
> > index 568c46b84d54..1af19d5dbc28 100644
> > --- a/ovsdb/ovsdb-client.c
> > +++ b/ovsdb/ovsdb-client.c
> > @@ -279,6 +289,8 @@ usage(void)
> >             "dump contents of DATABASE on SERVER to stdout\n"
> >             "\n  backup [SERVER] [DATABASE] > DB\n"
> >             "dump database contents in the form of a database file\n"
> > +           "\n  [--force] restore [SERVER] [DATABASE] < DB\n"
> > +           "restore database contents from a database file\n"
>
> There's obviously some precedent, but the man page uses "snapshot" instead of
> "DB".  It might be nice to use consistent terms--especially since the
> difference between "DATABASE" and "DB" is not obvious without context--but
> it's not a big deal.

Thanks, I changed these to SNAPSHOT.

> Acked-by: Justin Pettit

Thanks!  I applied this to master.
Re: [ovs-dev] [PATCH] netdev-tc-offloads: Use customary types for buffer.
> On Nov 27, 2017, at 1:53 PM, Ben Pfaff wrote:
>
> This function uses local array set_buff[] to store Netlink attributes.
> It declares set_buff as an array of character pointers, which is a strange
> type for a buffer of non-character-pointer objects.  In OVS it is
> customary to use an ofpbuf with a stub of uint64_t objects (to ensure
> proper alignment, otherwise uint8_t would be more usual).  This commit
> changes to that more usual form.
>
> Signed-off-by: Ben Pfaff

Acked-by: Justin Pettit

--Justin
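For readers outside the OVS tree, the aligned-stub idiom Ben describes (a stack array typed as uint64_t but used as byte storage, so anything copied into it is suitably aligned) can be sketched in isolation. This is illustrative only; `append_bytes` is not an OVS function, and OVS itself wraps the stub in an ofpbuf via ofpbuf_use_stub():

```c
#include <stdint.h>
#include <string.h>

/* Appends 'n' bytes of 'data' to a buffer whose storage is a uint64_t
 * array.  Declaring the storage as uint64_t (instead of char) guarantees
 * alignment suitable for any object later read back out of it, which is
 * the point of the OVS "stub" convention.  Returns the new used length. */
static size_t
append_bytes(uint64_t *stub, size_t used, const void *data, size_t n)
{
    memcpy((uint8_t *) stub + used, data, n);
    return used + n;
}
```

The caller sizes the array in units of 8 bytes, e.g. `uint64_t stub[1024 / 8];`, mirroring declarations like `odp_actions_stub[1024 / 8]` elsewhere in this thread.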
Re: [ovs-dev] [PATCH] jsonrpc-server: Enforce uniqueness of monitor IDs.
On Wed, Dec 20, 2017 at 03:03:37PM -0800, Justin Pettit wrote:
>
> > On Dec 8, 2017, at 1:01 PM, Ben Pfaff wrote:
> >
> > This oversight allowed monitor IDs to be duplicated when the
> > monitor_cond_change request changed them.
> >
> > Signed-off-by: Ben Pfaff
> > ---
> >  ovsdb/jsonrpc-server.c | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
> > index da3b3835f0d7..36e50241b040 100644
> > --- a/ovsdb/jsonrpc-server.c
> > +++ b/ovsdb/jsonrpc-server.c
> > @@ -1420,6 +1420,14 @@ ovsdb_jsonrpc_monitor_cond_change(struct ovsdb_jsonrpc_session *s,
> >          goto error;
> >      }
> >
> > +    struct ovsdb_jsonrpc_monitor *m2
> > +        = ovsdb_jsonrpc_monitor_find(s, params->u.array.elems[1]);
> > +    if (m2 && m2 != m) {
> > +        error = ovsdb_syntax_error(params->u.array.elems[1], NULL,
> > +                                   "duplicate monitor ID");
> > +        goto error;
> > +    }
>
> As we discussed off-line, I found the logic a bit confusing.  It might be
> clearer with either a comment or a slight restructuring of the code.
>
> Acked-by: Justin Pettit

Thanks.  What do you think of this version?

--8<--cut here-->8--

From: Ben Pfaff
Date: Thu, 21 Dec 2017 16:29:32 -0800
Subject: [PATCH] jsonrpc-server: Enforce uniqueness of monitor IDs.

This oversight allowed monitor IDs to be duplicated when the
monitor_cond_change request changed them.

Signed-off-by: Ben Pfaff
Acked-by: Justin Pettit
---
 ovsdb/jsonrpc-server.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
index f4a14d4fb2c2..07e8d1f17b51 100644
--- a/ovsdb/jsonrpc-server.c
+++ b/ovsdb/jsonrpc-server.c
@@ -1411,6 +1411,14 @@ ovsdb_jsonrpc_monitor_cond_change(struct ovsdb_jsonrpc_session *s,
         goto error;
     }

+    const struct json *new_monitor_id = params->u.array.elems[1];
+    bool changing_id = !json_equal(m->monitor_id, new_monitor_id);
+    if (changing_id && ovsdb_jsonrpc_monitor_find(s, new_monitor_id)) {
+        error = ovsdb_syntax_error(new_monitor_id, NULL,
+                                   "duplicate monitor ID");
+        goto error;
+    }
+
     monitor_cond_change_reqs = params->u.array.elems[2];
     if (monitor_cond_change_reqs->type != JSON_OBJECT) {
         error =
@@ -1458,10 +1466,12 @@ ovsdb_jsonrpc_monitor_cond_change(struct ovsdb_jsonrpc_session *s,
     }

     /* Change monitor id */
-    hmap_remove(&s->monitors, &m->node);
-    json_destroy(m->monitor_id);
-    m->monitor_id = json_clone(params->u.array.elems[1]);
-    hmap_insert(&s->monitors, &m->node, json_hash(m->monitor_id, 0));
+    if (changing_id) {
+        hmap_remove(&s->monitors, &m->node);
+        json_destroy(m->monitor_id);
+        m->monitor_id = json_clone(new_monitor_id);
+        hmap_insert(&s->monitors, &m->node, json_hash(m->monitor_id, 0));
+    }

     /* Send the new update, if any, represents the difference from the old
      * condition and the new one. */
--
2.10.2
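The final hunk's rename-in-place pattern (reject a duplicate ID; only remove and re-insert the hash-map entry when the ID actually changes, since the hash is derived from the ID) can be sketched outside OVS with a toy monitor table. All names here are illustrative, and plain strings stand in for JSON monitor IDs:

```c
#include <string.h>

#define MAX_MONITORS 8

struct monitor {
    char id[32];
    int in_use;
};

static struct monitor monitors[MAX_MONITORS];

static struct monitor *
monitor_find(const char *id)
{
    for (int i = 0; i < MAX_MONITORS; i++) {
        if (monitors[i].in_use && !strcmp(monitors[i].id, id)) {
            return &monitors[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 if 'new_id' would duplicate another monitor's
 * ID.  Mirrors the patch: the duplicate check and the re-keying both
 * happen only when the ID really changes. */
static int
monitor_change_id(struct monitor *m, const char *new_id)
{
    int changing_id = strcmp(m->id, new_id) != 0;

    if (changing_id && monitor_find(new_id)) {
        return -1;              /* Duplicate monitor ID. */
    }
    if (changing_id) {
        /* In OVS this is where the hmap_remove()/hmap_insert() pair runs,
         * because the entry's hash comes from the ID. */
        strcpy(m->id, new_id);
    }
    return 0;
}
```

Note that requesting a "change" to the monitor's current ID succeeds without touching the table, exactly the case the `changing_id` flag exists for.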
Re: [ovs-dev] [PATCH v2] odp-util: Use flexible sized buffer to hold Geneve options.
> On Dec 4, 2017, at 11:51 AM, Ben Pfaff wrote:
>
> The 'mask' buffer in parse_odp_action() is supposed to always be big
> enough:
>     /* 'mask' is big enough to hold any key. */
>
> Geneve options can be really big and the comment was wrong.  In addition,
> the user might supply more options than can really fit in any case, so
> we might as well just use a stub.
>
> Found by libfuzzer.
>
> Reported-by: Bhargava Shastry
> Signed-off-by: Ben Pfaff

Acked-by: Justin Pettit

--Justin
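The "stub" fix Ben mentions amounts to parsing into a small stack buffer that spills to the heap when the input (such as a long list of Geneve options) outgrows any fixed bound. A self-contained sketch of that idea, with invented names (OVS itself gets this behavior from ofpbuf_use_stub(), so none of this code is from the tree):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A byte buffer that starts life in 'stub' and moves to the heap only if
 * appended data exceeds the stub's size. */
struct grow_buf {
    uint8_t *data;           /* Points at 'stub' until a heap copy is made. */
    size_t size, allocated;
    uint8_t stub[64];
};

static void
grow_buf_init(struct grow_buf *b)
{
    b->data = b->stub;
    b->size = 0;
    b->allocated = sizeof b->stub;
}

static void
grow_buf_put(struct grow_buf *b, const void *data, size_t n)
{
    if (b->size + n > b->allocated) {
        /* Outgrew the stub (or a previous heap block): reallocate. */
        size_t new_alloc = (b->size + n) * 2;
        uint8_t *copy = malloc(new_alloc);
        memcpy(copy, b->data, b->size);
        if (b->data != b->stub) {
            free(b->data);
        }
        b->data = copy;
        b->allocated = new_alloc;
    }
    memcpy(b->data + b->size, data, n);
    b->size += n;
}

static void
grow_buf_uninit(struct grow_buf *b)
{
    if (b->data != b->stub) {
        free(b->data);
    }
}
```

The common case (small input) never touches the allocator, while oversized input still parses correctly instead of overrunning a "big enough" fixed array.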
Re: [ovs-dev] [PATCH] ovsdb-client: Show even constraint-breaking data in "dump" output.
> On Dec 8, 2017, at 12:37 PM, Ben Pfaff wrote:
>
> The ovsdb-client "dump" command is a fairly low-level tool that can be
> used, among other purposes, to debug the OVSDB protocol.  It's better if
> it just prints what the server sends without being too judgmental about it.
> Thus, we might as well ignore constraints for the purpose of dumping
> tables.
>
> Signed-off-by: Ben Pfaff

Acked-by: Justin Pettit

--Justin
Re: [ovs-dev] [PATCH] ovs-vswitchd: Allow more convenient 0x short form for specifying DPIDs.
On Wed, Dec 20, 2017 at 03:09:09PM -0800, Justin Pettit wrote:
>
> > On Oct 23, 2017, at 2:01 PM, Ben Pfaff wrote:
> >
> > Until now, ovs-vswitchd has insisted that other-config:datapath-id be
> > exactly 16 hex digits.  This commit allows shorter forms prefixed by 0x.
> > This was prompted by Faucet controller examples such as this one:
> > https://github.com/faucetsdn/faucet/blob/master/docs/README_config.rst
> > which tend to suggest datapath IDs like 0x1.
> >
> > CC: Josh Bailey
> > Signed-off-by: Ben Pfaff
>
> Acked-by: Justin Pettit

Thanks, applied to master.
Re: [ovs-dev] [PATCH] ovsdb-server: Forbid user-specified databases with reserved names.
> On Dec 8, 2017, at 12:53 PM, Ben Pfaff wrote:
>
> +    if (argc > 2 && argv[1][0] == '_') {
> +        unixctl_command_reply_error(conn,
> +                                    "cannot compact built-in databases");
> +        return;
> +    }

This error condition seems a little odd to me.  I think it will only provide
this warning if multiple databases are specified, and only complain if the
first one is built-in.

Acked-by: Justin Pettit

--Justin
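The fix Justin is hinting at is to validate every database argument rather than only argv[1]. A hedged sketch of that loop (names are illustrative, not the actual ovsdb-server code):

```c
/* Returns nonzero if any database name in argv[1..argc-1] is a reserved,
 * built-in name (by convention, names starting with '_', e.g. "_Server").
 * Checking every argument avoids the bug where only the first name is
 * inspected. */
static int
any_builtin_database(int argc, const char *argv[])
{
    for (int i = 1; i < argc; i++) {
        if (argv[i][0] == '_') {
            return 1;
        }
    }
    return 0;
}
```

With this shape, `compact Open_vSwitch _Server` is rejected just like `compact _Server`, regardless of argument order.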
Re: [ovs-dev] [PATCH] table: Add --max-column-width option.
> On Dec 8, 2017, at 12:54 PM, Ben Pfaff wrote:
>
> This can make it easier to read tables that contain wide data in some
> columns.
>
> Signed-off-by: Ben Pfaff

Acked-by: Justin Pettit

--Justin
Re: [ovs-dev] [PATCH] dpif: Use spaces instead of tabs.
> On Dec 21, 2017, at 3:24 PM, Gregory Rose wrote:
>
> On 12/21/2017 2:28 PM, Justin Pettit wrote:
>> Signed-off-by: Justin Pettit
>> ---
>>  lib/dpif.h | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/dpif.h b/lib/dpif.h
>> index ab898f4be2dd..2e739dc784be 100644
>> --- a/lib/dpif.h
>> +++ b/lib/dpif.h
>> @@ -787,9 +787,9 @@ const char *dpif_upcall_type_to_string(enum dpif_upcall_type);
>>  struct dpif_upcall {
>>      /* All types. */
>>      struct dp_packet packet;    /* Packet data,'dp_packet' should be the first
>> -                                   member to avoid a hole. This is because
>> -                                   'rte_mbuf' in dp_packet is aligned atleast
>> -                                   on a 64-byte boundary */
>> +                                   member to avoid a hole. This is because
>> +                                   'rte_mbuf' in dp_packet is aligned atleast
>> +                                   on a 64-byte boundary */
>>      enum dpif_upcall_type type;
>>      struct nlattr *key;         /* Flow key. */
>>      size_t key_len;             /* Length of 'key' in bytes. */
>
> Reviewed-by: Greg Rose

Thanks.  I pushed this to master.

--Justin
Re: [ovs-dev] [PATCH] xcache: Handle null argument for xlate_cache_uninit().
> On Dec 21, 2017, at 10:05 AM, Yifeng Sun wrote:
>
> Thanks, looks good to me.
>
> Reviewed-by: Yifeng Sun

Thanks.  I pushed this with yours and Ben's acks.

--Justin
Re: [ovs-dev] [PATCH] ofproto-dpif: Fix a couple minor issues in comments.
> On Dec 21, 2017, at 10:21 AM, Ben Pfaff wrote:
>
> On Wed, Dec 20, 2017 at 07:42:39PM -0800, Justin Pettit wrote:
>> Signed-off-by: Justin Pettit
>
> Acked-by: Ben Pfaff

Thanks.  I pushed this with yours and Yifeng's acks.

--Justin
Re: [ovs-dev] [PATCH] dpif: Use spaces instead of tabs.
On 12/21/2017 2:28 PM, Justin Pettit wrote:
> Signed-off-by: Justin Pettit
> ---
>  lib/dpif.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/dpif.h b/lib/dpif.h
> index ab898f4be2dd..2e739dc784be 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -787,9 +787,9 @@ const char *dpif_upcall_type_to_string(enum dpif_upcall_type);
>  struct dpif_upcall {
>      /* All types. */
>      struct dp_packet packet;    /* Packet data,'dp_packet' should be the first
> -                                   member to avoid a hole. This is because
> -                                   'rte_mbuf' in dp_packet is aligned atleast
> -                                   on a 64-byte boundary */
> +                                   member to avoid a hole. This is because
> +                                   'rte_mbuf' in dp_packet is aligned atleast
> +                                   on a 64-byte boundary */
>      enum dpif_upcall_type type;
>      struct nlattr *key;         /* Flow key. */
>      size_t key_len;             /* Length of 'key' in bytes. */

Reviewed-by: Greg Rose
[ovs-dev] [PATCH] dpif: Use spaces instead of tabs.
Signed-off-by: Justin Pettit
---
 lib/dpif.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/dpif.h b/lib/dpif.h
index ab898f4be2dd..2e739dc784be 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -787,9 +787,9 @@ const char *dpif_upcall_type_to_string(enum dpif_upcall_type);
 struct dpif_upcall {
     /* All types. */
     struct dp_packet packet;    /* Packet data,'dp_packet' should be the first
-                                   member to avoid a hole. This is because
-                                   'rte_mbuf' in dp_packet is aligned atleast
-                                   on a 64-byte boundary */
+                                   member to avoid a hole. This is because
+                                   'rte_mbuf' in dp_packet is aligned atleast
+                                   on a 64-byte boundary */
     enum dpif_upcall_type type;
     struct nlattr *key;         /* Flow key. */
     size_t key_len;             /* Length of 'key' in bytes. */
--
2.7.4
[ovs-dev] [no-slow 6/6] ofproto-dpif: Don't slow-path controller actions with pause.
A previous patch removed slow-pathing for controller actions with the
exception of ones that specified "pause".  This commit removes that
restriction so that no controller actions are slow-pathed.

Signed-off-by: Justin Pettit
---
 lib/odp-util.h                     |   2 +-
 ofproto/ofproto-dpif-upcall.c      |  31 ++--
 ofproto/ofproto-dpif-xlate-cache.c |  13 ---
 ofproto/ofproto-dpif-xlate-cache.h |   1 -
 ofproto/ofproto-dpif-xlate.c       | 159 +++--
 ofproto/ofproto-dpif.c             |   1 -
 6 files changed, 87 insertions(+), 120 deletions(-)

diff --git a/lib/odp-util.h b/lib/odp-util.h
index ff9ecf00e3c2..365194404eea 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -43,7 +43,6 @@ struct pkt_metadata;
     SPR(SLOW_LACP,   "lacp",   "Consists of LACP packets")            \
     SPR(SLOW_STP,    "stp",    "Consists of STP packets")             \
     SPR(SLOW_LLDP,   "lldp",   "Consists of LLDP packets")            \
-    SPR(SLOW_PAUSE,  "pause",  "Controller action with pause")        \
     SPR(SLOW_ACTION, "action",                                        \
         "Uses action(s) not supported by datapath")

@@ -299,6 +298,7 @@ enum user_action_cookie_type {
 /* Flags for controller cookies. */
 enum user_action_controller_flags {
     UACF_DONT_SEND    = 1 << 0, /* Don't send the packet to controller. */
+    UACF_CONTINUATION = 1 << 1, /* Send packet-in as a continuation. */
 };

 struct user_action_cookie_header {

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 405a3f74368e..5b6e2b002816 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1449,6 +1449,8 @@ process_upcall(struct udpif *udpif, struct upcall *upcall,
             break;
         }

+        const struct frozen_state *state = &recirc_node->state;
+
         struct ofproto_async_msg *am = xmalloc(sizeof *am);
         *am = (struct ofproto_async_msg) {
             .controller_id = cookie->controller.controller_id,
@@ -1460,7 +1462,7 @@ process_upcall(struct udpif *udpif, struct upcall *upcall,
                                          dp_packet_size(packet)),
                 .packet_len = dp_packet_size(packet),
                 .reason = cookie->controller.reason,
-                .table_id = recirc_node->state.table_id,
+                .table_id = state->table_id,
                 .cookie = get_32aligned_be64(
                              &cookie->controller.rule_cookie),
                 .userdata = (cookie->controller.userdata_len
@@ -1474,18 +1476,36 @@ process_upcall(struct udpif *udpif, struct upcall *upcall,
             },
         };

+        if (cookie->controller.flags & UACF_CONTINUATION) {
+            am->pin.up.stack = (state->stack_size
+                                ? xmemdup(state->stack, state->stack_size)
+                                : NULL),
+            am->pin.up.stack_size = state->stack_size,
+            am->pin.up.mirrors = state->mirrors,
+            am->pin.up.conntracked = state->conntracked,
+            am->pin.up.actions = (state->ofpacts_len
+                                  ? xmemdup(state->ofpacts,
+                                            state->ofpacts_len) : NULL),
+            am->pin.up.actions_len = state->ofpacts_len,
+            am->pin.up.action_set = (state->action_set_len
+                                     ? xmemdup(state->action_set,
+                                               state->action_set_len)
+                                     : NULL),
+            am->pin.up.action_set_len = state->action_set_len,
+            am->pin.up.bridge = upcall->ofproto->uuid;
+        }
+
         /* We don't want to use the upcall 'flow', since it may be
          * more specific than the point at which the "controller"
          * action was specified. */
         struct flow frozen_flow;

         frozen_flow = *flow;
-        if (!recirc_node->state.conntracked) {
+        if (!state->conntracked) {
             flow_clear_conntrack(&frozen_flow);
         }

-        frozen_metadata_to_flow(&recirc_node->state.metadata,
-                                &frozen_flow);
+        frozen_metadata_to_flow(&state->metadata, &frozen_flow);
         flow_get_metadata(&frozen_flow, &am->pin.up.base.flow_metadata);

         ofproto_dpif_send_async_msg(upcall->ofproto, am);
@@ -1515,9 +1535,6 @@ handle_upcalls(struct udpif *udpif, struct upcall *upcalls,
      *
      *   - For SLOW_ACTION, translation executes the actions directly.
      *
-     *   - For SLOW_PAUSE, translation needs to handle a pause request
-     *     from the controller.
-     *
      * The loop fills 'ops' with an array of operations to execute in the
      * datapath. */
     n_ops = 0;
diff --git a/ofproto/ofproto-dpif-xlate-cache.c
[ovs-dev] [no-slow 3/6] ofp-actions: Add action "debug_slow" for testing slow-path.
It isn't otherwise useful and in fact hurts performance so it's
disabled without --enable-dummy.  An upcoming commit will make use of
this.

Signed-off-by: Justin Pettit
---
 include/openvswitch/ofp-actions.h |  1 +
 lib/ofp-actions.c                 | 48 ++-
 ofproto/ofproto-dpif-xlate.c      |  7 ++
 tests/ofproto-dpif.at             | 19
 4 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/include/openvswitch/ofp-actions.h b/include/openvswitch/ofp-actions.h
index 25f61ef93ace..49316c4b39b4 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -128,6 +128,7 @@ struct vl_mff_map;
      * These are intentionally undocumented, subject to change, and        \
      * only accepted if ovs-vswitchd is started with --enable-dummy. */    \
     OFPACT(DEBUG_RECIRC, ofpact_null, ofpact, "debug_recirc")             \
+    OFPACT(DEBUG_SLOW, ofpact_null, ofpact, "debug_slow")                 \
                                                                           \
     /* Instructions. */                                                   \
     OFPACT(METER, ofpact_meter, ofpact, "meter")                          \
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index e5d03e155543..cef6170fedde 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -355,6 +355,9 @@ enum ofp_raw_action_type {
 /* These are intentionally undocumented, subject to change, and ovs-vswitchd */
 /* accepts them only if started with --enable-dummy. */

+    /* NX1.0+(254): void. */
+    NXAST_RAW_DEBUG_SLOW,
+
     /* NX1.0+(255): void. */
     NXAST_RAW_DEBUG_RECIRC,
 };
@@ -473,6 +476,7 @@ ofpact_next_flattened(const struct ofpact *ofpact)
     case OFPACT_UNROLL_XLATE:
     case OFPACT_CT_CLEAR:
     case OFPACT_DEBUG_RECIRC:
+    case OFPACT_DEBUG_SLOW:
     case OFPACT_METER:
     case OFPACT_CLEAR_ACTIONS:
     case OFPACT_WRITE_METADATA:
@@ -5800,7 +5804,7 @@ format_SAMPLE(const struct ofpact_sample *a,
     ds_put_format(s, "%s)%s", colors.paren, colors.end);
 }

-/* debug_recirc instruction. */
+/* debug instructions. */

 static bool enable_debug;

@@ -5847,6 +5851,43 @@ format_DEBUG_RECIRC(const struct ofpact_null *a OVS_UNUSED,
     ds_put_format(s, "%sdebug_recirc%s", colors.value, colors.end);
 }

+static enum ofperr
+decode_NXAST_RAW_DEBUG_SLOW(struct ofpbuf *out)
+{
+    if (!enable_debug) {
+        return OFPERR_OFPBAC_BAD_VENDOR_TYPE;
+    }
+
+    ofpact_put_DEBUG_SLOW(out);
+    return 0;
+}
+
+static void
+encode_DEBUG_SLOW(const struct ofpact_null *n OVS_UNUSED,
+                  enum ofp_version ofp_version OVS_UNUSED,
+                  struct ofpbuf *out)
+{
+    put_NXAST_DEBUG_SLOW(out);
+}
+
+static char * OVS_WARN_UNUSED_RESULT
+parse_DEBUG_SLOW(char *arg OVS_UNUSED,
+                 const struct ofputil_port_map *port_map OVS_UNUSED,
+                 struct ofpbuf *ofpacts,
+                 enum ofputil_protocol *usable_protocols OVS_UNUSED)
+{
+    ofpact_put_DEBUG_SLOW(ofpacts);
+    return NULL;
+}
+
+static void
+format_DEBUG_SLOW(const struct ofpact_null *a OVS_UNUSED,
+                  const struct ofputil_port_map *port_map OVS_UNUSED,
+                  struct ds *s)
+{
+    ds_put_format(s, "%sdebug_slow%s", colors.value, colors.end);
+}
+
 /* Action structure for NXAST_CT.
  *
  * Pass traffic to the connection tracker.
@@ -7149,6 +7190,7 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
     case OFPACT_WRITE_ACTIONS:
     case OFPACT_WRITE_METADATA:
     case OFPACT_DEBUG_RECIRC:
+    case OFPACT_DEBUG_SLOW:
         return false;
     default:
         OVS_NOT_REACHED();
@@ -7217,6 +7259,7 @@ ofpact_is_allowed_in_actions_set(const struct ofpact *a)
     case OFPACT_STACK_POP:
     case OFPACT_STACK_PUSH:
     case OFPACT_DEBUG_RECIRC:
+    case OFPACT_DEBUG_SLOW:
         /* The action set may only include actions and thus
          * may not include any instructions */
@@ -7439,6 +7482,7 @@ ovs_instruction_type_from_ofpact_type(enum ofpact_type type)
     case OFPACT_UNROLL_XLATE:
     case OFPACT_SAMPLE:
     case OFPACT_DEBUG_RECIRC:
+    case OFPACT_DEBUG_SLOW:
     case OFPACT_CT:
     case OFPACT_CT_CLEAR:
     case OFPACT_NAT:
@@ -8105,6 +8149,7 @@ ofpact_check__(enum ofputil_protocol *usable_protocols, struct ofpact *a,
         return OFPERR_OFPBAC_BAD_TYPE;

     case OFPACT_DEBUG_RECIRC:
+    case OFPACT_DEBUG_SLOW:
         return 0;

     case OFPACT_ENCAP:
@@ -8620,6 +8665,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, ofp_port_t port)
     case OFPACT_METER:
     case OFPACT_GROUP:
     case OFPACT_DEBUG_RECIRC:
+    case OFPACT_DEBUG_SLOW:
     case OFPACT_CT:
     case OFPACT_CT_CLEAR:
     case OFPACT_NAT:
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index a2b4fdb3b6be..3bdd64d4d552 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -5298,6 +5298,7 @@ reversible_actions(const
[ovs-dev] [no-slow 2/6] ofproto-dpif: Reorganize upcall handling.
- This reduces the number of times upcall cookies are processed. - It separate true miss calls from slow-path actions. The reorganization will also be useful for a future commit. Signed-off-by: Justin Pettit--- ofproto/ofproto-dpif-upcall.c | 91 +-- 1 file changed, 45 insertions(+), 46 deletions(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 02cf5415bc3d..46b75fe35a2b 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -183,6 +183,7 @@ struct udpif { enum upcall_type { BAD_UPCALL, /* Some kind of bug somewhere. */ MISS_UPCALL,/* A flow miss. */ +SLOW_PATH_UPCALL, /* Slow path upcall. */ SFLOW_UPCALL, /* sFlow sample. */ FLOW_SAMPLE_UPCALL, /* Per-flow sampling. */ IPFIX_UPCALL/* Per-bridge sampling. */ @@ -210,8 +211,7 @@ struct upcall { uint16_t mru; /* If !0, Maximum receive unit of fragmented IP packet */ -enum dpif_upcall_type type;/* Datapath type of the upcall. */ -const struct nlattr *userdata; /* Userdata for DPIF_UC_ACTION Upcalls. */ +enum upcall_type type; /* Type of the upcall. */ const struct nlattr *actions; /* Flow actions in DPIF_UC_ACTION Upcalls. */ bool xout_initialized; /* True if 'xout' must be uninitialized. */ @@ -235,6 +235,8 @@ struct upcall { size_t key_len;/* Datapath flow key length. */ const struct nlattr *out_tun_key; /* Datapath output tunnel key. */ +union user_action_cookie cookie; + uint64_t odp_actions_stub[1024 / 8]; /* Stub for odp_actions. 
*/ }; @@ -367,7 +369,8 @@ static int ukey_acquire(struct udpif *, const struct dpif_flow *, static void ukey_delete__(struct udpif_key *); static void ukey_delete(struct umap *, struct udpif_key *); static enum upcall_type classify_upcall(enum dpif_upcall_type type, -const struct nlattr *userdata); +const struct nlattr *userdata, +union user_action_cookie *cookie); static void put_op_init(struct ukey_op *op, struct udpif_key *ukey, enum dpif_flow_put_flags flags); @@ -813,7 +816,6 @@ recv_upcalls(struct handler *handler) upcall->key = dupcall->key; upcall->key_len = dupcall->key_len; -upcall->ufid = >ufid; upcall->out_tun_key = dupcall->out_tun_key; upcall->actions = dupcall->actions; @@ -969,11 +971,13 @@ udpif_revalidator(void *arg) } static enum upcall_type -classify_upcall(enum dpif_upcall_type type, const struct nlattr *userdata) +classify_upcall(enum dpif_upcall_type type, const struct nlattr *userdata, +union user_action_cookie *cookie) { -union user_action_cookie cookie; size_t userdata_len; +memset(cookie, 0, sizeof *cookie); + /* First look at the upcall type. 
*/ switch (type) { case DPIF_UC_ACTION: @@ -994,29 +998,30 @@ classify_upcall(enum dpif_upcall_type type, const struct nlattr *userdata) return BAD_UPCALL; } userdata_len = nl_attr_get_size(userdata); -if (userdata_len < sizeof cookie.type -|| userdata_len > sizeof cookie) { +if (userdata_len < sizeof cookie->type +|| userdata_len > sizeof *cookie) { VLOG_WARN_RL(&rl, "action upcall cookie has unexpected size %"PRIuSIZE, userdata_len); return BAD_UPCALL; } -memset(&cookie, 0, sizeof cookie); -memcpy(&cookie, nl_attr_get(userdata), userdata_len); -if (userdata_len == MAX(8, sizeof cookie.sflow) -&& cookie.type == USER_ACTION_COOKIE_SFLOW) { + +memcpy(cookie, nl_attr_get(userdata), userdata_len); + +if (userdata_len == MAX(8, sizeof cookie->sflow) +&& cookie->type == USER_ACTION_COOKIE_SFLOW) { return SFLOW_UPCALL; -} else if (userdata_len == MAX(8, sizeof cookie.slow_path) - && cookie.type == USER_ACTION_COOKIE_SLOW_PATH) { -return MISS_UPCALL; -} else if (userdata_len == MAX(8, sizeof cookie.flow_sample) - && cookie.type == USER_ACTION_COOKIE_FLOW_SAMPLE) { +} else if (userdata_len == MAX(8, sizeof cookie->slow_path) + && cookie->type == USER_ACTION_COOKIE_SLOW_PATH) { +return SLOW_PATH_UPCALL; +} else if (userdata_len == MAX(8, sizeof cookie->flow_sample) + && cookie->type == USER_ACTION_COOKIE_FLOW_SAMPLE) { return FLOW_SAMPLE_UPCALL; -} else if (userdata_len == MAX(8, sizeof cookie.ipfix) - && cookie.type == USER_ACTION_COOKIE_IPFIX) { +} else if (userdata_len == MAX(8, sizeof cookie->ipfix) + && cookie->type == USER_ACTION_COOKIE_IPFIX) { return IPFIX_UPCALL; }
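The refactored control flow can be illustrated with a standalone sketch (hypothetical, simplified types — the real code uses `union user_action_cookie`, netlink attribute helpers, and the per-cookie size checks shown in the diff): the cookie is copied out of the userdata exactly once, and both the classification and the parsed cookie are handed back to the caller, which is what lets later handling stages avoid re-processing the cookie.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the OVS types (hypothetical). */
enum upcall_type { BAD_UPCALL, MISS_UPCALL, SLOW_PATH_UPCALL,
                   SFLOW_UPCALL, FLOW_SAMPLE_UPCALL, IPFIX_UPCALL };
enum cookie_type { COOKIE_SFLOW = 1, COOKIE_SLOW_PATH,
                   COOKIE_FLOW_SAMPLE, COOKIE_IPFIX };

struct mini_cookie {
    uint16_t type;          /* One of 'enum cookie_type'. */
    uint8_t payload[14];    /* Type-specific data. */
};

/* Parse the cookie once, then hand both the classification and the
 * parsed cookie back to the caller so it need not re-parse userdata. */
static enum upcall_type
classify(const void *userdata, size_t len, struct mini_cookie *cookie)
{
    memset(cookie, 0, sizeof *cookie);
    if (len < sizeof cookie->type || len > sizeof *cookie) {
        return BAD_UPCALL;      /* Unexpected cookie size. */
    }
    memcpy(cookie, userdata, len);
    switch (cookie->type) {
    case COOKIE_SFLOW:       return SFLOW_UPCALL;
    case COOKIE_SLOW_PATH:   return SLOW_PATH_UPCALL; /* no longer MISS */
    case COOKIE_FLOW_SAMPLE: return FLOW_SAMPLE_UPCALL;
    case COOKIE_IPFIX:       return IPFIX_UPCALL;
    default:                 return BAD_UPCALL;
    }
}
```

As in the patch, a slow-path cookie now yields its own `SLOW_PATH_UPCALL` result instead of being folded into `MISS_UPCALL`.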
[ovs-dev] [no-slow 1/6] ofproto-dpif: Add ability to look up an ofproto by UUID.
This will have callers in the future. Signed-off-by: Justin Pettit --- ofproto/ofproto-dpif-trace.c | 2 +- ofproto/ofproto-dpif.c | 92 +++- ofproto/ofproto-dpif.h | 13 +-- 3 files changed, 77 insertions(+), 30 deletions(-) diff --git a/ofproto/ofproto-dpif-trace.c b/ofproto/ofproto-dpif-trace.c index d5da48e326bb..4999d1d6f326 100644 --- a/ofproto/ofproto-dpif-trace.c +++ b/ofproto/ofproto-dpif-trace.c @@ -354,7 +354,7 @@ parse_flow_and_packet(int argc, const char *argv[], goto exit; } -*ofprotop = ofproto_dpif_lookup(argv[1]); +*ofprotop = ofproto_dpif_lookup_by_name(argv[1]); if (!*ofprotop) { error = "Unknown bridge name"; goto exit; diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 43b7b89f26e4..7d628e11328a 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -58,6 +58,7 @@ #include "openvswitch/ofp-print.h" #include "openvswitch/ofp-util.h" #include "openvswitch/ofpbuf.h" +#include "openvswitch/uuid.h" #include "openvswitch/vlog.h" #include "ovs-lldp.h" #include "ovs-rcu.h" @@ -71,6 +72,7 @@ #include "unaligned.h" #include "unixctl.h" #include "util.h" +#include "uuid.h" #include "vlan-bitmap.h" VLOG_DEFINE_THIS_MODULE(ofproto_dpif); @@ -187,7 +189,12 @@ COVERAGE_DEFINE(rev_mcast_snooping); struct shash all_dpif_backers = SHASH_INITIALIZER(&all_dpif_backers); /* All existing ofproto_dpif instances, indexed by ->up.name. */ -struct hmap all_ofproto_dpifs = HMAP_INITIALIZER(&all_ofproto_dpifs); +struct hmap all_ofproto_dpifs_by_name = + HMAP_INITIALIZER(&all_ofproto_dpifs_by_name); + +/* All existing ofproto_dpif instances, indexed by ->uuid. 
*/ +struct hmap all_ofproto_dpifs_by_uuid = + HMAP_INITIALIZER(&all_ofproto_dpifs_by_uuid); static bool ofproto_use_tnl_push_pop = true; static void ofproto_unixctl_init(void); @@ -268,7 +275,8 @@ enumerate_names(const char *type, struct sset *names) struct ofproto_dpif *ofproto; sset_clear(names); -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) { if (strcmp(type, ofproto->up.type)) { continue; } @@ -311,7 +319,8 @@ lookup_ofproto_dpif_by_port_name(const char *name) { struct ofproto_dpif *ofproto; -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) { if (sset_contains(&ofproto->ports, name)) { return ofproto; } @@ -368,7 +377,8 @@ type_run(const char *type) simap_init(&tmp_backers); simap_swap(&backer->tnl_backers, &tmp_backers); -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) { struct ofport_dpif *iter; if (backer != ofproto->backer) { @@ -433,7 +443,8 @@ type_run(const char *type) backer->need_revalidate = 0; xlate_txn_start(); -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) { struct ofport_dpif *ofport; struct ofbundle *bundle; @@ -522,7 +533,8 @@ process_dpif_all_ports_changed(struct dpif_backer *backer) const char *devname; sset_init(&devnames); -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) { if (ofproto->backer == backer) { struct ofport *ofport; @@ -552,8 +564,8 @@ process_dpif_port_change(struct dpif_backer *backer, const char *devname) return; } -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, - &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) 
{ if (simap_contains(&ofproto->backer->tnl_backers, devname)) { return; } @@ -604,7 +616,8 @@ process_dpif_port_error(struct dpif_backer *backer, int error) { struct ofproto_dpif *ofproto; -HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) { +HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_by_name_node, + &all_ofproto_dpifs_by_name) { if (ofproto->backer == backer) { sset_clear(&ofproto->port_poll_set); ofproto->port_poll_errno = error; @@ -1458,8 +1471,12 @@ construct(struct ofproto *ofproto_) } } -hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node, +hmap_insert(&all_ofproto_dpifs_by_name, +
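The two-index scheme this patch introduces can be sketched in isolation (a hypothetical simplification: toy singly linked lists stand in for OVS's intrusive `hmap`s, and a plain integer stands in for `struct uuid`). Each bridge carries one link per index, so the same object is reachable both by name and by UUID without being duplicated:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* A bridge object that lives in two indexes at once. */
struct bridge {
    char name[16];
    uint64_t uuid;
    struct bridge *next_by_name;   /* link in the by-name index */
    struct bridge *next_by_uuid;   /* link in the by-uuid index */
};

static struct bridge *all_by_name;   /* analogous to all_ofproto_dpifs_by_name */
static struct bridge *all_by_uuid;   /* analogous to all_ofproto_dpifs_by_uuid */

/* Insert the same object into both indexes (as construct() does). */
static void
bridge_register(struct bridge *br)
{
    br->next_by_name = all_by_name; all_by_name = br;
    br->next_by_uuid = all_by_uuid; all_by_uuid = br;
}

static struct bridge *
lookup_by_name(const char *name)
{
    for (struct bridge *br = all_by_name; br; br = br->next_by_name) {
        if (!strcmp(br->name, name)) { return br; }
    }
    return NULL;
}

static struct bridge *
lookup_by_uuid(uint64_t uuid)
{
    for (struct bridge *br = all_by_uuid; br; br = br->next_by_uuid) {
        if (br->uuid == uuid) { return br; }
    }
    return NULL;
}
```

In the real patch both indexes are hash maps, so both lookups stay O(1); the point here is only the shape: two intrusive nodes per object, two global containers.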
Re: [ovs-dev] Maintaining openvswitch-switch-dpdk in Debian
On Thu, Dec 21, 2017 at 09:28:20AM -0500, Josh Knight wrote: > I'm interested in using the openvswitch-switch-dpdk package (currently in > the Ubuntu repos) on a Debian system, however it does not appear to be in > the Debian repos. > > Are there any efforts or plans to maintain this package in the Debian > repos? I see that 'openvswitch-switch' package is maintained in both repos > by the same groups/people, so I'm surprised the dpdk version is not already > in Debian. I think that someone at Ubuntu added it, but they have not contributed it back. I'm actually hoping to get out of the Debian package maintenance business. I've basically turned the downstream Debian packaging over to Thomas Goirand and the openstack-dev folks, so the right thing to do might be to ask them to help. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Criptomonedas
New technological schemes for the preservation and transfer of value. Cryptocurrencies: innovation, application and impact on the financial sector. January 19 - Lic. Jonathan H. Stahl Ducker - 9am-3pm. You cannot miss the opportunity to attend this seminar, in which you will learn about the current financial system and its various areas of opportunity, and will analyze the characteristics and applications of new protocols for the preservation and distribution of "money" and means of payment. You will also learn about the application of these new communication protocols through case studies that highlight the financial, regulatory and technical aspects of various projects. BENEFITS OF ATTENDING: - You will learn the fundamentals of the Mexican Financial System. - You will master the language used in the cryptocurrency ecosystem. - You will analyze international cases and experiences. - You will understand the national and international legal and regulatory landscape. - You will obtain tools and models for analysis. Need the information right away? Reply to this email with the word: Criptomonedas + name - telephone - email. Call center: 018002120744 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] xcache: Handle null argument for xlate_cache_uninit().
On Wed, Dec 20, 2017 at 07:42:41PM -0800, Justin Pettit wrote: > Most other OVS libraries' delete and uninitialization functions allow a > null argument, but this one would cause a segfault. > > Signed-off-by: Justin Pettit Acked-by: Ben Pfaff ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto-dpif: Fix a couple minor issues in comments.
On Wed, Dec 20, 2017 at 07:42:39PM -0800, Justin Pettit wrote: > Signed-off-by: Justin Pettit Acked-by: Ben Pfaff ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] xcache: Handle null argument for xlate_cache_uninit().
Thanks, looks good to me. Reviewed-by: Yifeng Sun On Wed, Dec 20, 2017 at 7:42 PM, Justin Pettit wrote: > Most other OVS libraries' delete and uninitialization functions allow a > null argument, but this one would cause a segfault. > > Signed-off-by: Justin Pettit > --- > ofproto/ofproto-dpif-xlate-cache.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/ofproto/ofproto-dpif-xlate-cache.c > b/ofproto/ofproto-dpif-xlate-cache.c > index d319d287eadb..24fc769a7a0d 100644 > --- a/ofproto/ofproto-dpif-xlate-cache.c > +++ b/ofproto/ofproto-dpif-xlate-cache.c > @@ -279,6 +279,9 @@ xlate_cache_clear(struct xlate_cache *xcache) > void > xlate_cache_uninit(struct xlate_cache *xcache) > { > +if (!xcache) { > +return; > +} > xlate_cache_clear(xcache); > ofpbuf_uninit(&xcache->entries); > } > -- > 2.7.4 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH net v3] openvswitch: Fix pop_vlan action for double tagged frames
From: Eric Garver Date: Wed, 20 Dec 2017 15:09:22 -0500 > skb_vlan_pop() expects skb->protocol to be a valid TPID for double > tagged frames. So set skb->protocol to the TPID and let skb_vlan_pop() > shift the true ethertype into position for us. > > Fixes: 5108bbaddc37 ("openvswitch: add processing of L3 packets") > Signed-off-by: Eric Garver Applied and queued up for -stable, thanks. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ofproto-dpif: Fix a couple minor issues in comments.
Thanks for the change. Reviewed-by: Yifeng Sun On Wed, Dec 20, 2017 at 7:42 PM, Justin Pettit wrote: > Signed-off-by: Justin Pettit > --- > ofproto/ofproto-dpif-upcall.h | 2 +- > ofproto/ofproto-dpif-xlate.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/ofproto/ofproto-dpif-upcall.h b/ofproto/ofproto-dpif-upcall.h > index efa3cd190084..cef1d34198d6 100644 > --- a/ofproto/ofproto-dpif-upcall.h > +++ b/ofproto/ofproto-dpif-upcall.h > @@ -24,7 +24,7 @@ struct ofpbuf; > struct seq; > struct simap; > > -/* Udif is responsible for retrieving upcalls from the kernel and > processing > +/* Udpif is responsible for retrieving upcalls from the kernel and > processing > * them. Additionally, it's responsible for maintaining the datapath flow > * table. */ > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > index 9b3a2f28a348..a2b4fdb3b6be 100644 > --- a/ofproto/ofproto-dpif-xlate.c > +++ b/ofproto/ofproto-dpif-xlate.c > @@ -1387,7 +1387,7 @@ xlate_lookup_ofproto(const struct dpif_backer > *backer, const struct flow *flow, > } > > /* Given a datapath and flow metadata ('backer', and 'flow' respectively), > - * optionally populates 'ofproto' with the ofproto_dpif, 'ofp_in_port' > with the > + * optionally populates 'ofprotop' with the ofproto_dpif, 'ofp_in_port' > with the > * openflow in_port, and 'ipfix', 'sflow', and 'netflow' with the > appropriate > * handles for those protocols if they're enabled. Caller may use the > returned > * pointers until quiescing, for longer term use additional references > must > -- > 2.7.4 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 RESEND] openvswitch: Trim off padding before L3+ netfilter processing
IPv4 and IPv6 packets may arrive with lower-layer padding that is not included in the L3 length. For example, a short IPv4 packet may have up to 6 bytes of padding following the IP payload when received on an Ethernet device. In the normal IPv4 receive path, ip_rcv() trims the packet to ip_hdr->tot_len before invoking netfilter hooks (including conntrack and nat). In the IPv6 receive path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly in the br_netfilter receive path, br_validate_ipv4() and br_validate_ipv6() trim the packet to the L3 length before invoking NF_INET_PRE_ROUTING hooks. In the OVS conntrack receive path, ovs_ct_execute() pulls the skb to the L3 header but does not trim it to the L3 length before calling nf_conntrack_in(NF_INET_PRE_ROUTING). When nf_conntrack_proto_tcp encounters a packet with lower-layer padding, nf_checksum() fails and logs "nf_ct_tcp: bad TCP checksum". While extra zero bytes don't affect the checksum, the length in the IP pseudoheader does. That length is based on skb->len, and without trimming, it doesn't match the length the sender used when computing the checksum. The assumption throughout nf_conntrack and nf_nat is that skb->len reflects the length of the L3 header and payload, so there is no need to refer back to ip_hdr->tot_len or ipv6_hdr->payload_len. This change brings OVS into line with other netfilter users, trimming IPv4 and IPv6 packets prior to L3+ netfilter processing. Signed-off-by: Ed Swierk--- (resent with Pravin's correct address) v2: - Trim packet in nat receive path as well as conntrack - Free skb on error --- net/openvswitch/conntrack.c | 34 ++ 1 file changed, 34 insertions(+) diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index b27c5c6..1bdc78f 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -703,6 +703,33 @@ static bool skb_nfct_cached(struct net *net, return ct_executed; } +/* Trim the skb to the L3 length. 
Assumes the skb is already pulled to + * the L3 header. The skb is freed on error. + */ +static int skb_trim_l3(struct sk_buff *skb) +{ + unsigned int nh_len; + int err; + + switch (skb->protocol) { + case htons(ETH_P_IP): + nh_len = ntohs(ip_hdr(skb)->tot_len); + break; + case htons(ETH_P_IPV6): + nh_len = ntohs(ipv6_hdr(skb)->payload_len) + + sizeof(struct ipv6hdr); + break; + default: + nh_len = skb->len; + } + + err = pskb_trim_rcsum(skb, nh_len); + if (err) + kfree_skb(skb); + + return err; +} + #ifdef CONFIG_NF_NAT_NEEDED /* Modelled after nf_nat_ipv[46]_fn(). * range is only used for new, uninitialized NAT state. @@ -715,8 +742,12 @@ static int ovs_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct, { int hooknum, nh_off, err = NF_ACCEPT; + /* The nat module expects to be working at L3. */ nh_off = skb_network_offset(skb); skb_pull_rcsum(skb, nh_off); + err = skb_trim_l3(skb); + if (err) + return err; /* See HOOK2MANIP(). */ if (maniptype == NF_NAT_MANIP_SRC) @@ -,6 +1142,9 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb, /* The conntrack module expects to be working at L3. */ nh_ofs = skb_network_offset(skb); skb_pull_rcsum(skb, nh_ofs); + err = skb_trim_l3(skb); + if (err) + return err; if (key->ip.frag != OVS_FRAG_TYPE_NONE) { err = handle_fragments(net, key, info->zone.id, skb); -- 1.9.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
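The checksum argument in the commit message can be demonstrated with a toy Internet-checksum calculation (a hypothetical simplification: a bare 16-bit ones'-complement sum with only the pseudo-header's length field modeled, even-length buffers only). Trailing zero padding contributes nothing to the sum over the bytes, but the length folded into the pseudo-header does change the result, so checksumming over an untrimmed `skb->len` disagrees with what the sender computed:

```c
#include <stdint.h>
#include <stddef.h>

/* Accumulate a 16-bit ones'-complement sum over 'len' bytes (even
 * lengths only), folding carries at the end. */
static uint32_t
csum_add(uint32_t sum, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i += 2) {
        sum += (uint32_t) buf[i] << 8 | buf[i + 1];
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return sum;
}

/* Checksum 'data_len' bytes of a segment together with a pseudo-header
 * whose only modeled field is the claimed TCP length. */
static uint16_t
csum_with_len(const uint8_t *seg, size_t data_len, uint16_t claimed_len)
{
    uint8_t lenbuf[2] = { (uint8_t) (claimed_len >> 8),
                          (uint8_t) claimed_len };
    uint32_t sum = csum_add(0, lenbuf, 2);   /* pseudo-header length */
    sum = csum_add(sum, seg, data_len);      /* segment bytes */
    return (uint16_t) ~sum;
}
```

With a 4-byte payload followed by 2 bytes of Ethernet padding, including the zero padding in the summed bytes changes nothing; taking the padded length as the pseudo-header length does, which is exactly the "bad TCP checksum" failure mode the patch removes by trimming first.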
Re: [ovs-dev] [PATCH v8 5/6] dpif-netdev: Time based output batching.
Thanks everyone for working on this. For now, I've incorporated changes from incremental patch and re-split it for simpler review. Rebased on current istokes/dpdk_merge and sent as v9. Documentation patch sent together with the series + change to NEWS. Best regards, Ilya Maximets. On 21.12.2017 16:42, Kevin Traynor wrote: > On 12/21/2017 12:27 PM, Jan Scheurich wrote: > I had a cheaper solution in mind that would accept the inaccuracy of the > accounting of tx cycles during rx batch processing (it >>> statistically slightly biases the load distribution towards the more >>> heavily loaded rx queues). And instead of attributing the tx cycles outside >>> the rx batching based on a stored rx queue per packet, I would simply >>> estimate the actual processing cycles per rx queue per interval by >>> distributing the total PMD processing cycles in the interval over the >>> polled rx queues weighted by their counted processing cycles. I'm not sure if I fully understand your idea, but batches will likely be sent on one of the next polling cycles, not on the cycle where they were received. Moreover, send cycles are very different for different port types. And also packets from a single rx queue may have different destinations (different types of output ports). I'm not sure if it is possible to correctly count cycles in this case. >>> >>> I agree, tx processing can be very different depending on destination, >>> especially with larger packet sizes where some paths may have a copy. >> >> I understand. There can be a big difference between physical and vhostuser >> ports, especially if the copy passes over the QPI. And if the per packet tx >> accounting is not causing too much overhead, it would be the best. >> >> My suggestion for simplification was not aiming at the same accuracy as >> Ilya's patch. Perhaps rebalancing doesn't really require that accuracy. 
>> Often the new rx queue distribution will anyway behave differently compared to >> the prediction because of changed batching characteristics. >> > > Sure, that is why I had suggested to perhaps make no assumption about > future batch opportunities and just add the full cost of tx to each rxq > that has at least one packet in the flush. OTOH, then the stats for the > current situation could look worse than things actually are. Also, it > might not really simplify the code as Ilya's patch just gives each rxq > some cycles per packet in a simple loop, whereas changing to give a full > cycle count per rxq would mean checking for duplicate rxqs etc. I guess > it's debatable. > > Anyway, I think both those schemes are a fix for attributing cycles to > the wrong rxq, or not counting them at all. > >>> Have I missed something? > After all, Kevin's algorithm only needs comparable relative rx queue loads > across the PMDs to function. In this context I would also >>> suggest to exclude idle polling cycles from the algorithm because they are >>> not load related and always eat up the total cycle budget of the >>> PMD anyhow. >>> >>> Idle polling per rxq is not counted in the balance algorithm (for the >>> reason you say) but it's needed to calculate total cycles over the >>> correct time period for providing the user with an rxq percentage of pmd >>> used stats. Counting (2/3) and calculations (3/3) are in the balance >>> stats patches for ref. 
It was simpler to just > extend the rxq counting code a bit to capture idle cycles, than to try > and synchronize with the full PMD cycles and I didn't want to change > their behaviour. Actually, with the rework to the data structure, I > think measuring idle/proc cycles for an rxq simplified the existing code > a little too. > > thanks, > Kevin. > >> BR, Jan >> > > > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v9 5/5] NEWS: Mark output packet batching support.
The new feature should be mentioned in NEWS, especially because it has user-visible configuration options. Signed-off-by: Ilya Maximets --- NEWS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/NEWS b/NEWS index 057df93..9b86912 100644 --- a/NEWS +++ b/NEWS @@ -27,6 +27,8 @@ Post-v2.8.0 * Add support for vHost IOMMU * New debug appctl command 'netdev-dpdk/get-mempool-info'. * All the netdev-dpdk appctl commands described in ovs-vswitchd man page. + - Userspace datapath: + * Output packet batching support. v2.8.0 - 31 Aug 2017 -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v9 4/5] docs: Describe output packet batching in DPDK guide.
Added information about output packet batching and a way to configure 'tx-flush-interval'. Signed-off-by: Ilya Maximets Co-authored-by: Jan Scheurich Signed-off-by: Jan Scheurich --- Documentation/intro/install/dpdk.rst | 58 1 file changed, 58 insertions(+) diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 3fecb5c..040e62e 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -568,6 +568,64 @@ not needed i.e. jumbo frames are not needed, it can be forced off by adding chains of descriptors it will make more individual virtio descriptors available for rx to the guest using dpdkvhost ports and this can improve performance. +Output Packet Batching +~~ + +To take advantage of batched transmit functions, OVS collects packets in +intermediate queues before sending when processing a batch of received packets. +Even if packets are matched by different flows, OVS uses a single send +operation for all packets destined to the same output port. + +Furthermore, OVS is able to buffer packets in these intermediate queues for a +configurable amount of time to reduce the frequency of send bursts at medium +load levels when the packet receive rate is high, but the receive batch size +is still very small. This is particularly beneficial for packets transmitted to +VMs using an interrupt-driven virtio driver, where the interrupt overhead is +significant for the OVS PMD, the host operating system and the guest driver. + +The ``tx-flush-interval`` parameter can be used to specify the time in +microseconds OVS should wait between two send bursts to a given port (default +is ``0``). When the intermediate queue fills up before that time is over, the +buffered packet batch is sent immediately:: + +$ ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50 + +This parameter influences both throughput and latency, depending on the traffic +load on the port. 
In general lower values decrease latency while higher values +may be useful to achieve higher throughput. + +Low traffic (``packet rate < 1 / tx-flush-interval``) should not experience +any significant latency or throughput increase as packets are forwarded +immediately. + +At intermediate load levels +(``1 / tx-flush-interval < packet rate < 32 / tx-flush-interval``) traffic +should experience an average latency increase of up to +``1 / 2 * tx-flush-interval`` and a possible throughput improvement. + +Very high traffic (``packet rate >> 32 / tx-flush-interval``) should experience +the average latency increase equal to ``32 / (2 * packet rate)``. Most send +batches in this case will contain the maximum number of packets (``32``). + +A ``tx-flush-interval`` value of ``50`` microseconds has been shown to provide a +good performance increase in a ``PHY-VM-PHY`` scenario on an ``x86`` system for +interrupt-driven guests while keeping the latency increase at a reasonable +level: + + https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341628.html + +.. note:: + Throughput impact of this option significantly depends on the scenario and + the traffic patterns. For example: a ``tx-flush-interval`` value of ``50`` + microseconds shows performance degradation in ``PHY-VM-PHY`` with bonded PHY + scenario while testing with ``256 - 1024`` packet flows: + +https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341700.html + +The average number of packets per output batch can be checked in PMD stats:: + +$ ovs-appctl dpif-netdev/pmd-stats-show + Limitations -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
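The latency model described above can be expressed as a tiny helper (a hypothetical toy model only — it reproduces the documented bounds, not the measured behavior, and assumes a fixed maximum batch of 32 packets): at very high rates a batch fills before the timer fires, so the mean added wait is half the batch-fill time; otherwise it is bounded by half the flush interval.

```c
#include <stddef.h>

enum { MAX_BATCH = 32 };   /* maximum send batch size, per the text */

/* Expected average added latency in microseconds for a given packet
 * rate (packets per microsecond) and tx-flush-interval (microseconds). */
static double
avg_added_latency_us(double pkts_per_us, double flush_interval_us)
{
    /* Time needed to accumulate a full batch at this rate. */
    double batch_fill_us = (double) MAX_BATCH / pkts_per_us;

    if (batch_fill_us < flush_interval_us) {
        /* Very high traffic: batches fill before the timer expires,
         * so the average wait is 32 / (2 * packet rate). */
        return batch_fill_us / 2.0;
    }
    /* Intermediate load: the flush timer bounds the wait, so the
     * average increase is up to tx-flush-interval / 2. */
    return flush_interval_us / 2.0;
}
```

For example, with ``tx-flush-interval=50`` a 10 Mpps stream (10 packets/µs) fills a batch in 3.2 µs, giving about 1.6 µs of average added latency, while a 0.1 packets/µs stream is timer-bounded at 25 µs.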
[ovs-dev] [PATCH v9 3/5] dpif-netdev: Time based output batching.
This allows collecting packets from more than one RX burst and sending them together, with a configurable interval. 'other_config:tx-flush-interval' can be used to configure the time that a packet may wait in the output batch before sending. 'tx-flush-interval' has microsecond resolution. Signed-off-by: Ilya Maximets --- lib/dpif-netdev.c| 116 --- vswitchd/vswitch.xml | 16 +++ 2 files changed, 107 insertions(+), 25 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index ed60ff1..c2515bf 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -85,6 +85,9 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev); #define MAX_RECIRC_DEPTH 6 DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0) +/* Use instant packet send by default. */ +#define DEFAULT_TX_FLUSH_INTERVAL 0 + /* Configuration parameters. */ enum { MAX_FLOWS = 65536 }; /* Maximum number of flows in flow table. */ enum { MAX_METERS = 65536 };/* Maximum number of meters. */ @@ -271,6 +274,9 @@ struct dp_netdev { struct hmap ports; struct seq *port_seq; /* Incremented whenever a port changes. */ +/* The time that a packet can wait in output batch for sending. */ +atomic_uint32_t tx_flush_interval; + /* Meters. */ struct ovs_mutex meter_locks[N_METER_LOCKS]; struct dp_meter *meters[MAX_METERS]; /* Meter bands. */ @@ -532,6 +538,7 @@ struct tx_port { int qid; long long last_used; struct hmap_node node; +long long flush_time; struct dp_packet_batch output_pkts; struct dp_netdev_rxq *output_pkts_rxqs[NETDEV_MAX_BURST]; }; @@ -623,6 +630,9 @@ struct dp_netdev_pmd_thread { * than 'cmap_count(dp->poll_threads)'. */ uint32_t static_tx_qid; +/* Number of filled output batches. */ +int n_output_batches; + struct ovs_mutex port_mutex;/* Mutex for 'poll_list' and 'tx_ports'. */ /* List of rx queues to poll. 
*/ struct hmap poll_list OVS_GUARDED; @@ -716,8 +726,9 @@ static void dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd, static void dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd, struct rxq_poll *poll) OVS_REQUIRES(pmd->port_mutex); -static void -dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd); +static int +dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd, + bool force); static void reconfigure_datapath(struct dp_netdev *dp) OVS_REQUIRES(dp->port_mutex); @@ -1317,6 +1328,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class, conntrack_init(&dp->conntrack); atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN); +atomic_init(&dp->tx_flush_interval, DEFAULT_TX_FLUSH_INTERVAL); cmap_init(&dp->poll_threads); @@ -2987,7 +2999,7 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute) dp_packet_batch_init_packet(&pp, execute->packet); dp_netdev_execute_actions(pmd, &pp, false, execute->flow, execute->actions, execute->actions_len); -dp_netdev_pmd_flush_output_packets(pmd); +dp_netdev_pmd_flush_output_packets(pmd, true); if (pmd->core_id == NON_PMD_CORE_ID) { ovs_mutex_unlock(&dp->non_pmd_mutex); @@ -3036,6 +3048,16 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) smap_get_ullong(other_config, "emc-insert-inv-prob", DEFAULT_EM_FLOW_INSERT_INV_PROB); uint32_t insert_min, cur_min; +uint32_t tx_flush_interval, cur_tx_flush_interval; + +tx_flush_interval = smap_get_int(other_config, "tx-flush-interval", + DEFAULT_TX_FLUSH_INTERVAL); +atomic_read_relaxed(&dp->tx_flush_interval, &cur_tx_flush_interval); +if (tx_flush_interval != cur_tx_flush_interval) { +atomic_store_relaxed(&dp->tx_flush_interval, tx_flush_interval); +VLOG_INFO("Flushing interval for tx queues set to %"PRIu32" us", + tx_flush_interval); +} if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) { free(dp->pmd_cmask); @@ -3285,13 +3307,14 @@ dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx) return 
processing_cycles; } -static void +static int dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, struct tx_port *p) { int tx_qid; int output_cnt; bool dynamic_txqs; +uint32_t tx_flush_interval; enum pmd_cycles_counter_type save_pmd_cycles_type; /* In case we're in PMD_CYCLES_PROCESSING state we need to count * cycles for rxq we're processing now. */ @@ -3311,28 +3334,45 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, } output_cnt = dp_packet_batch_size(&p->output_pkts); +ovs_assert(output_cnt > 0); netdev_send(p->port->netdev, tx_qid, &p->output_pkts, dynamic_txqs);
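The flush decision the patch adds to each tx port can be sketched in isolation (a hypothetical simplification of the real `struct tx_port`, which also tracks queue ids and cycle counters): a port's queued batch is sent either when it is full, or when the current time has passed the port's `flush_time`, which is set to the arrival time plus `tx-flush-interval` when the first packet was queued.

```c
#include <stdint.h>

enum { BATCH_MAX = 32 };   /* NETDEV_MAX_BURST in real OVS */

/* Toy stand-in for a tx port's output-batch state. */
struct toy_txport {
    int queued;                /* packets currently buffered */
    int64_t flush_time_us;     /* deadline for a timer-forced flush */
};

/* Should this port's batch be sent now? Full batches go immediately;
 * partial batches wait until the flush deadline. */
static int
must_flush(const struct toy_txport *p, int64_t now_us)
{
    return p->queued >= BATCH_MAX
           || (p->queued > 0 && now_us >= p->flush_time_us);
}
```

With `tx-flush-interval=0` (the default), the deadline equals the arrival time, so any non-empty batch flushes on the next check, reproducing the instant-send behavior.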
[ovs-dev] [PATCH v9 2/5] dpif-netdev: Count cycles on per-rxq basis.
Upcoming time-based output batching will allow collecting packets from different RX queues in a single output batch. Let's keep the list of RX queues for each output packet and collect cycles for them on send.

Signed-off-by: Ilya Maximets
---
 lib/dpif-netdev.c | 93 +--
 1 file changed, 63 insertions(+), 30 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 23c01d0..ed60ff1 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -533,6 +533,7 @@ struct tx_port {
     long long last_used;
     struct hmap_node node;
     struct dp_packet_batch output_pkts;
+    struct dp_netdev_rxq *output_pkts_rxqs[NETDEV_MAX_BURST];
 };

 /* A set of properties for the current processing loop that is not directly
@@ -544,6 +545,10 @@ struct dp_netdev_pmd_thread_ctx {
     long long now;
     /* Used to count cycles. See 'cycles_count_end()' */
     unsigned long long last_cycles;
+    /* RX queue from which last packet was received. */
+    struct dp_netdev_rxq *last_rxq;
+    /* Indicates how the last counted cycles should be treated. */
+    enum pmd_cycles_counter_type current_pmd_cycles_type;
 };

 /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate
@@ -3197,42 +3202,53 @@ cycles_counter(void)
 /* Fake mutex to make sure that the calls to cycles_count_* are balanced */
 extern struct ovs_mutex cycles_counter_fake_mutex;

-/* Start counting cycles. Must be followed by 'cycles_count_end()' */
+/* Start counting cycles. Must be followed by 'cycles_count_end()'.
+ * Counting starts from the idle type state. */
 static inline void
 cycles_count_start(struct dp_netdev_pmd_thread *pmd)
     OVS_ACQUIRES(&cycles_counter_fake_mutex)
     OVS_NO_THREAD_SAFETY_ANALYSIS
 {
+    pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_IDLE;
     pmd->ctx.last_cycles = cycles_counter();
 }

-/* Stop counting cycles and add them to the counter 'type' */
+/* Stop counting cycles and add them to the counter of the current type. */
 static inline void
-cycles_count_end(struct dp_netdev_pmd_thread *pmd,
-                 enum pmd_cycles_counter_type type)
+cycles_count_end(struct dp_netdev_pmd_thread *pmd)
     OVS_RELEASES(&cycles_counter_fake_mutex)
     OVS_NO_THREAD_SAFETY_ANALYSIS
 {
     unsigned long long interval = cycles_counter() - pmd->ctx.last_cycles;
+    enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type;

     non_atomic_ullong_add(&pmd->cycles.n[type], interval);
 }

-/* Calculate the intermediate cycle result and add to the counter 'type' */
+/* Calculate the intermediate cycle result and add to the counter of
+ * the current type */
 static inline void
 cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd,
-                          struct dp_netdev_rxq *rxq,
-                          enum pmd_cycles_counter_type type)
+                          struct dp_netdev_rxq **rxqs, int n_rxqs)
     OVS_NO_THREAD_SAFETY_ANALYSIS
 {
     unsigned long long new_cycles = cycles_counter();
     unsigned long long interval = new_cycles - pmd->ctx.last_cycles;
+    enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type;
+    int i;
+
     pmd->ctx.last_cycles = new_cycles;

     non_atomic_ullong_add(&pmd->cycles.n[type], interval);
-    if (rxq && (type == PMD_CYCLES_PROCESSING)) {
+    if (n_rxqs && (type == PMD_CYCLES_PROCESSING)) {
         /* Add to the amount of current processing cycles. */
-        non_atomic_ullong_add(&rxq->cycles[RXQ_CYCLES_PROC_CURR], interval);
+        interval /= n_rxqs;
+        for (i = 0; i < n_rxqs; i++) {
+            if (rxqs[i]) {
+                non_atomic_ullong_add(&rxqs[i]->cycles[RXQ_CYCLES_PROC_CURR],
+                                      interval);
+            }
+        }
     }
 }

@@ -3276,6 +3292,16 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
     int tx_qid;
     int output_cnt;
     bool dynamic_txqs;
+    enum pmd_cycles_counter_type save_pmd_cycles_type;
+
+    /* In case we're in PMD_CYCLES_PROCESSING state we need to count
+     * cycles for the rxq we're processing now. */
+    cycles_count_intermediate(pmd, &pmd->ctx.last_rxq, 1);
+
+    /* Save current cycles counting state to restore after accounting
+     * send cycles. */
+    save_pmd_cycles_type = pmd->ctx.current_pmd_cycles_type;
+    pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_PROCESSING;

     dynamic_txqs = p->port->dynamic_txqs;
     if (dynamic_txqs) {
@@ -3291,6 +3317,10 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
     dp_netdev_count_packet(pmd, DP_STAT_SENT_PKTS, output_cnt);
     dp_netdev_count_packet(pmd, DP_STAT_SENT_BATCHES, 1);
+
+    /* Update send cycles for all the rx queues and restore previous state. */
+    cycles_count_intermediate(pmd, p->output_pkts_rxqs, output_cnt);
+    pmd->ctx.current_pmd_cycles_type = save_pmd_cycles_type;
 }

 static void
@@ -3307,7 +3337,7 @@ dp_netdev_pmd_flush_output_packets(struct
[ovs-dev] [PATCH v9 1/5] dpif-netdev: Use microsecond granularity.
Upcoming time-based output batching will require microsecond granularity for its flexible configuration.

Signed-off-by: Ilya Maximets
---
 lib/dpif-netdev.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 39f2940..23c01d0 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -178,12 +178,13 @@ struct emc_cache {

 /* Simple non-wildcarding single-priority classifier. */

-/* Time in ms between successive optimizations of the dpcls subtable vector */
-#define DPCLS_OPTIMIZATION_INTERVAL 1000
+/* Time in microseconds between successive optimizations of the dpcls
+ * subtable vector */
+#define DPCLS_OPTIMIZATION_INTERVAL 1000000LL

-/* Time in ms of the interval in which rxq processing cycles used in
- * rxq to pmd assignments is measured and stored. */
-#define PMD_RXQ_INTERVAL_LEN 10000LL
+/* Time in microseconds of the interval in which rxq processing cycles used
+ * in rxq to pmd assignments is measured and stored. */
+#define PMD_RXQ_INTERVAL_LEN 10000000LL

 /* Number of intervals for which cycles are stored
  * and used during rxq to pmd assignment. */
@@ -358,7 +359,7 @@ enum rxq_cycles_counter_type {
     RXQ_N_CYCLES
 };

-#define XPS_TIMEOUT_MS 500LL
+#define XPS_TIMEOUT 500000LL    /* In microseconds. */

 /* Contained by struct dp_netdev_port's 'rxqs' member. */
 struct dp_netdev_rxq {
@@ -802,7 +803,7 @@ emc_cache_slow_sweep(struct emc_cache *flow_cache)
 static inline void
 pmd_thread_ctx_time_update(struct dp_netdev_pmd_thread *pmd)
 {
-    pmd->ctx.now = time_msec();
+    pmd->ctx.now = time_usec();
 }

 /* Returns true if 'dpif' is a netdev or dummy dpif, false otherwise. */
@@ -4258,7 +4259,7 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
     memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);

     /* All packets will hit the meter at the same time. */
-    long_delta_t = (now - meter->used); /* msec */
+    long_delta_t = (now - meter->used) / 1000; /* msec */

     /* Make sure delta_t will not be too large, so that bucket will not
      * wrap around below. */
@@ -4414,7 +4415,7 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id *meter_id,
     meter->flags = config->flags;
     meter->n_bands = config->n_bands;
     meter->max_delta_t = 0;
-    meter->used = time_msec();
+    meter->used = time_usec();

     /* set up bands */
     for (i = 0; i < config->n_bands; ++i) {
@@ -4962,7 +4963,7 @@ packet_batch_per_flow_execute(struct packet_batch_per_flow *batch,
     struct dp_netdev_flow *flow = batch->flow;

     dp_netdev_flow_used(flow, batch->array.count, batch->byte_count,
-                        batch->tcp_flags, pmd->ctx.now);
+                        batch->tcp_flags, pmd->ctx.now / 1000);

     actions = dp_netdev_flow_get_actions(flow);

@@ -5337,7 +5338,7 @@ dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd,
             continue;
         }
         interval = pmd->ctx.now - tx->last_used;
-        if (tx->qid >= 0 && (purge || interval >= XPS_TIMEOUT_MS)) {
+        if (tx->qid >= 0 && (purge || interval >= XPS_TIMEOUT)) {
             port = tx->port;
             ovs_mutex_lock(&port->txq_used_mutex);
             port->txq_used[tx->qid]--;
@@ -5358,7 +5359,7 @@ dpif_netdev_xps_get_tx_qid(const struct dp_netdev_pmd_thread *pmd,

     interval = pmd->ctx.now - tx->last_used;
     tx->last_used = pmd->ctx.now;

-    if (OVS_LIKELY(tx->qid >= 0 && interval < XPS_TIMEOUT_MS)) {
+    if (OVS_LIKELY(tx->qid >= 0 && interval < XPS_TIMEOUT)) {
         return tx->qid;
     }

@@ -5737,7 +5738,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
         conntrack_execute(&dp->conntrack, packets_, aux->flow->dl_type, force,
                           commit, zone, setmark, setlabel, aux->flow->tp_src,
                           aux->flow->tp_dst, helper, nat_action_info_ref,
-                          pmd->ctx.now);
+                          pmd->ctx.now / 1000);
         break;
     }
--
2.7.4
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v9 0/5] Output packet batching (Time-based).
This patch set is inspired by [1] from Bhanuprakash Bodireddy. The implementation in [1] looks very complex and introduces many pitfalls [2] for later code modifications, such as possible packet stalls. This version aims to provide simple and flexible output packet batching at a higher level, without introducing changes at the netdev layer (and even simplifying it).

[1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet transmission.
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html
[2] For example: https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html

Most of the series is already in the merge tree and is not included in v9.

Testing of the 'PVP with OVS bonding on phy ports' scenario shows a significant performance improvement of up to +25% (v9 shows no performance difference from v6 here):
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341700.html

Other testing results for v6:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341605.html
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341628.html

Version 9:
* Made on top of current DPDK_MERGE tree (cc4891f).
* Patches 1-4,6 from v8 dropped as already included in DPDK_MERGE.
* "Time based ..." patch updated with per-rxq cycles counting from the following incremental:
  https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/342255.html
  and split into 3 logically separate parts for easier review:
    1. Microsecond granularity for dpif-netdev.
    2. Cycles accounting changes.
    3. Time based batching itself.
  There were no code changes compared with v8 + incremental, only a re-split.
* Output batching mentioned in NEWS.
* Documentation patch included in this series.

Version 8:
* Minor rebase on top of current master.

Version 7:
* Rebased on current istokes/dpdk_merge (3eb8d4f)
* Dropped dp_netdev_pmd_thread structure refactoring patch. Not needed since revert applied.
Version 6:
* Rebased on current master:
  - Added new patch to refactor the dp_netdev_pmd_thread structure according to the following suggestion:
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341230.html
  NOTE: I still prefer reverting the padding-related patch. The rebase was done so as not to block acceptance of this series. Revert patch and discussion here:
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341153.html
* Added comment about pmd_thread_ctx_time_update() usage.

Version 5:
* pmd_thread_ctx_time_update() calls moved to different places so that they are called only from dp_netdev_process_rxq_port() and the main polling functions: pmd_thread_main, dpif_netdev_run and dpif_netdev_execute. All other functions should use the cached time from pmd->ctx.now, which is guaranteed to be updated at least once per polling cycle.
* 'may_steal' patch returned to the version from v3 because 'may_steal' in QoS is a completely different variable. This patch only removes 'may_steal' from the netdev API.
* 2 more usec functions added to timeval to have a complete public API.
* Checking of 'output_cnt' turned into an assertion.

Version 4:
* Rebased on current master.
* Rebased on top of "Keep latest measured time for PMD thread." (Jan Scheurich)
* Microsecond resolution related patches integrated.
* Time-based batching without RFC tag.
* 'output_time' renamed to 'flush_time'. (Jan Scheurich)
* 'flush_time' update moved to 'dp_netdev_pmd_flush_output_on_port'. (Jan Scheurich)
* 'output-max-latency' renamed to 'tx-flush-interval'.
* Added patch for output batching statistics.

Version 3:
* Rebased on current master.
* Time based RFC: fixed assert on n_output_batches <= 0.

Version 2:
* Rebased on current master.
* Added time based batching RFC patch.
* Fixed mixing packets with different sources in same batch.

Ilya Maximets (5):
  dpif-netdev: Use microsecond granularity.
  dpif-netdev: Count cycles on per-rxq basis.
  dpif-netdev: Time based output batching.
  docs: Describe output packet batching in DPDK guide.
  NEWS: Mark output packet batching support.

 Documentation/intro/install/dpdk.rst | 58 +
 NEWS                                 |  2 +
 lib/dpif-netdev.c                    | 224 +--
 vswitchd/vswitch.xml                 | 16 +++
 4 files changed, 238 insertions(+), 62 deletions(-)

--
2.7.4
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] tunnel: fix tnl_find() after packet_type changed
Hi,

I sent a new version to the dev list. Please ignore this one. I changed the subject, since the problem does not seem to be restricted to tunnel handling.
https://patchwork.ozlabs.org/patch/851949/

Best regards,
Zoltan

> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Zoltán Balogh
> Sent: Monday, December 11, 2017 4:39 PM
> To: Jan Scheurich; d...@openvswitch.org; Ben Pfaff
> Subject: Re: [ovs-dev] [PATCH] tunnel: fix tnl_find() after packet_type changed
>
> Hi,
>
> I've been working on the solution proposed by Jan. I faced a new issue with
> patch ports. Let's have a config like in unit test 1025:
> "1025: ofproto-dpif - balance-tcp bonding, different recirc flow"
>
>        +-----------+
>        |  br-int   | p5 (OF 5)
>        |           o--<---
>        +-----o-----+
>              | br1- (patch port: OF 101)
>              |
>        +-----+
>        |
>        | br1+ (patch port: OF 100)
>      +--o--+
>      | br1 |
>      |     |
>      +-o-o-+
>   p3 |   | p4
> (OF 3)   |(OF 4)
>      |bond1|
>      +--+--+
>         |
>
> In the unit test, a packet is received on port p5. The packet is then output to
> 'br1-', which is a patch port; its pair is 'br1+' on br1. Ports p3 and p4 are in
> bond1, and the packet is recirculated in br1 before it is sent out on bond1.
>
> So, when it's recirculated, ofproto xlate will end up in xlate_lookup_ofproto_()
> at some point in time. The vanilla code does xport_lookup() based on the flow,
> which is built up from the dp_packet (due to miniflow_extract/expand) and has
> in port 5 (p5). So the flow has in port 5 as well. Furthermore, the returned
> ofproto_dpif will be xport->xbridge->ofproto: br-int.
>
> If we use the recirc ID extracted from the flow (or from the dp_packet), which is
> 2, then the state's uuid will refer to the bridge br1. This is the bridge where
> the recirc action was added to odp_actions. On the other hand, the in port
> stored in the state is 0x (undefined). It's neither 5 (p5) nor 100 (br1+).
>
> So, if we want to build our code based on the recirc ID extracted from the flow
> (or dp_packet), we won't get the same result as the vanilla code in the case of
> using patch ports and recirculating in the second bridge.
>
> Does anyone have an idea how to resolve this?
> Maybe the vanilla code or the patch port concept is broken?
>
> If we used the original value of packet_type in tnl_find(), we could still
> work around this.
>
> Best regards,
> Zoltan
>
> > -----Original Message-----
> > From: Jan Scheurich
> > Sent: Friday, December 08, 2017 5:18 PM
> > To: Zoltán Balogh ; d...@openvswitch.org
> > Cc: Ben Pfaff
> > Subject: RE: [PATCH] tunnel: fix tnl_find() after packet_type changed
> >
> > Hi Zoltan,
> > Please find answers below.
> > Regards, Jan
> >
> > > -----Original Message-----
> > > From: Zoltán Balogh
> > > Sent: Friday, 08 December, 2017 12:56
> > > To: Jan Scheurich ; d...@openvswitch.org
> > > Cc: Ben Pfaff
> > > Subject: RE: [PATCH] tunnel: fix tnl_find() after packet_type changed
> > >
> > > Hello Jan,
> > >
> > > Thank you for the review.
> > > The recirc_id_node, and thus the frozen_state with the ofproto_uuid,
> > > can be retrieved from the recirc ID via recirc_id_node_find(). So I
> > > think it would be feasible to get the ofproto from the recirc ID without
> > > calling tnl_find(). In addition we would need to store in_port in the
> > > frozen_state.
> >
> > Yes, my thinking was to look up the recirculation context in
> > xlate_lookup_ofproto_() if flow->recirc_id != 0 and get bridge and
> > in_port from the frozen data in that case. But I was looking for a way
> > to store the recirc context pointer so that it need not be looked up
> > once more later during xlate_actions().
> >
> > BTW: in_port is already stored in frozen_state.metadata.in_port
> >
> > > However, if we do not fix this in tnl_find(), then we need to make sure
> > > that neither xlate_lookup_ofproto() nor xlate_lookup() are called
> > > after flow->packet_type has changed during xlate, don't we? For instance,
> > > calling revalidate() can lead to a call to xlate_lookup().
> >
> > revalidate() is called from revalidator threads to pass an artificial
> > packet representing the megaflow through the pipeline. The flow
> > constructed from the ukey will contain the recirc_id and it should still
> > hit the same recirculation context with the frozen_state, as a MISS
> > upcall for the originally recirculated packet would.
> >
> > > Btw, what about the case when we have an L2 and an L3 port with the same
> > > local IP? A packet is received on one of them, but the packet_type has
> > > changed during xlate and later revalidate leads to
[ovs-dev] [PATCH] xlate: don't add dummy eth header when doing recirc
In xlate_actions(), when a packet comes from an L3 port and its packet_type is not Ethernet, a dummy Ethernet header is added to the packet by setting flow->packet_type to Ethernet and zeroing out flow->dl_src and flow->dl_dst. This step should be skipped if the packet is recirculated, i.e. if xin->frozen_state is set properly.

Signed-off-by: Zoltan Balogh
CC: Jan Scheurich
---
 ofproto/ofproto-dpif-xlate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 9b3a2f28a..d85441cb4 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -7041,7 +7041,7 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout)
         struct xport *in_port = get_ofp_port(xbridge,
                                              ctx.base_flow.in_port.ofp_port);

-        if (flow->packet_type != htonl(PT_ETH) && in_port &&
+        if (!xin->frozen_state && flow->packet_type != htonl(PT_ETH) && in_port &&
             in_port->pt_mode == NETDEV_PT_LEGACY_L3 && ctx.table_id == 0) {
             /* Add dummy Ethernet header to non-L2 packet if it's coming from a
              * L3 port. So all packets will be L2 packets for lookup.
--
2.14.1
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] xlate: fix xport lookup for recirc
xlate_lookup() and xlate_lookup_ofproto_() provide in_port and ofproto based on an xport determined using the flow, which is extracted from the packet. The lookup can happen due to recirculation as well. It can happen that packet_type has been modified during xlate before recirculation is triggered, so the lookup fails or delivers the wrong xport. This can be worked around by propagating the xport to ctx->xin after the very first lookup and storing it in the frozen state of the recirculation. So, when a lookup is performed due to recirculation, the xport can be retrieved from the frozen state.

Signed-off-by: Zoltan Balogh
CC: Jan Scheurich
---
 ofproto/ofproto-dpif-rid.c    |  4 +++-
 ofproto/ofproto-dpif-rid.h    |  1 +
 ofproto/ofproto-dpif-upcall.c |  2 ++
 ofproto/ofproto-dpif-xlate.c  | 30 ++++++++++++++++++++++++++++++
 ofproto/ofproto-dpif-xlate.h  |  3 +++
 5 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
index fc5700489..79de90520 100644
--- a/ofproto/ofproto-dpif-rid.c
+++ b/ofproto/ofproto-dpif-rid.c
@@ -158,7 +158,8 @@ frozen_state_equal(const struct frozen_state *a, const struct frozen_state *b)
             && ofpacts_equal(a->ofpacts, a->ofpacts_len,
                              b->ofpacts, b->ofpacts_len)
             && ofpacts_equal(a->action_set, a->action_set_len,
-                             b->action_set, b->action_set_len));
+                             b->action_set, b->action_set_len)
+            && a->xport_in == b->xport_in);
 }

 /* Lockless RCU protected lookup. If node is needed accross RCU quiescent
@@ -285,6 +286,7 @@ recirc_alloc_id(struct ofproto_dpif *ofproto)
             .ipv6_dst = in6addr_any,
         },
         .in_port = OFPP_NONE },
+        .xport_in = NULL,
     };
     return recirc_alloc_id__(&state, frozen_state_hash(&state))->id;
 }
diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h
index 19fc27c7c..f02221094 100644
--- a/ofproto/ofproto-dpif-rid.h
+++ b/ofproto/ofproto-dpif-rid.h
@@ -143,6 +143,7 @@ struct frozen_state {
     size_t stack_size;
     mirror_mask_t mirrors;        /* Mirrors already output. */
     bool conntracked;             /* Conntrack occurred prior to freeze. */
+    const struct xport *xport_in; /* Port packet was received on. */

     /* Actions to be translated when thawing. */
     struct ofpact *ofpacts;
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 02cf5415b..92a92d288 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -236,6 +236,8 @@ struct upcall {
     const struct nlattr *out_tun_key; /* Datapath output tunnel key. */

     uint64_t odp_actions_stub[1024 / 8]; /* Stub for odp_actions. */
+
+    const struct xport *xport_in; /* Port the packet was received on. */
 };

 /* Ukeys must transition through these states using transition_ukey(). */
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 9b3a2f28a..c9fb32338 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -1362,12 +1362,36 @@ xlate_lookup_ofproto_(const struct dpif_backer *backer, const struct flow *flow,
     struct xlate_cfg *xcfg = ovsrcu_get(struct xlate_cfg *, &xcfgp);
     const struct xport *xport;

+    /* If packet is recirculated, xport can be retrieved from frozen state. */
+    if (flow->recirc_id) {
+        const struct recirc_id_node *recirc_id_node;
+
+        recirc_id_node = recirc_id_node_find(flow->recirc_id);
+
+        if (OVS_UNLIKELY(!recirc_id_node)) {
+            return NULL;
+        }
+
+        /* If recirculation was initiated due to bond (in_port = OFPP_NONE),
+         * xport cannot be retrieved from frozen state. */
+        if (recirc_id_node->state.metadata.in_port != OFPP_NONE) {
+            xport = recirc_id_node->state.xport_in;
+            /* Verify, if xport is valid. */
+            if (xport && hmap_contains(&xcfg->xports, &xport->hmap_node) &&
+                xport->xbridge && xport->xbridge->ofproto) {
+                goto out;
+            }
+        }
+    }
+
     xport = xport_lookup(xcfg, tnl_port_should_receive(flow)
                          ? tnl_port_receive(flow)
                          : odp_port_to_ofport(backer, flow->in_port.odp_port));
     if (OVS_UNLIKELY(!xport)) {
         return NULL;
     }
+
+out:
     *xportp = xport;
     if (ofp_in_port) {
         *ofp_in_port = xport->ofp_port;
@@ -4623,6 +4647,7 @@ finish_freezing__(struct xlate_ctx *ctx, uint8_t table)
         .stack_size = ctx->stack.size,
         .mirrors = ctx->mirrors,
         .conntracked = ctx->conntracked,
+        .xport_in = ctx->xin->xport_in,
         .ofpacts = ctx->frozen_actions.data,
         .ofpacts_len = ctx->frozen_actions.size,
         .action_set = ctx->action_set.data,
@@ -6613,6 +6638,7 @@ xlate_in_init(struct xlate_in *xin, struct ofproto_dpif *ofproto,
     xin->odp_actions
Re: [ovs-dev] [PATCH v8 5/6] dpif-netdev: Time based output batching.
On 12/21/2017 12:27 PM, Jan Scheurich wrote: I had a cheaper solution in mind that would accept the inaccuracy of the accounting of tx cycles during rx batch processing (it >> statistically slightly biases the load distribution towards the more heavily >> loaded rx queues). And instead of attributing the tx cycles outside >> the rx batching based on a stored rx queue per packet, I would simply >> estimate the actual processing cycles per rx queue per interval by >> distributing the total PMD processing cycles in the interval over the polled >> rx queues weighted by their counted processing cycles. >>> >>> I'm not sure if I fully understand your idea, but batches are likely will >>> be sent >>> on one of the next polling cycles, not on the cycle where they was received. >>> Moreover, send cycles are very different for different port types. And also >>> packets >>> form a single rx queue may have different destinations (different types of >>> output ports). >>> I'm not sure if it possible to correctly count cycles in this case. >>> >> >> I agree, tx processing can be very different depending on destination, >> especially with larger packet sizes where some paths may have a copy. > > I understand. There can be a big difference between physical and vhostuser > ports, especially if the copy passes over the QPI. And if the per packet tx > accounting is not causing too much overhead, it would be the best. > > My suggestion for simplification was not aiming at the same accuracy as > Ilya's patch. Perhaps rebalancing doesn't really require that accuracy. Often > the new rx queue distribution will anyway behave different compared to the > prediction because of changed batching characteristics. > Sure, that is why I had suggested to perhaps make no assumption about future batch opportunities and just add the full cost of tx to each rxq that has at least one packet in the flush. OTOH, then the stats for the current situation could look worse than things actually are. 
Also, it might not really simplify the code, as Ilya's patch just gives each
rxq some cycles per packet in a simple loop, whereas changing to give a full
cycle count per rxq would mean checking for duplicate rxqs etc. I guess it's
debatable. Anyway, I think both those schemes are a fix for attributing cycles
to the wrong rxq, or not counting them at all.

>>> Am I missed something?
>>>
>>>> After all, Kevin's algorithm only needs comparable relative rx queue
>>>> loads across the PMDs to function. In this context I would also suggest
>>>> to exclude idle polling cycles from the algorithm because they are not
>>>> load related and always eat up the total cycle budget of the PMD anyhow.
>>
>> Idle polling per rxq is not counted in the balance algorithm (for the
>> reason you say) but it's needed to calculate total cycles over the
>> correct time period for providing the user with an rxq percentage of pmd
>> used stats. Counting (2/3) and calculations (3/3) are in the balance
>> stats patches for ref.
>
> Sorry, I didn't check the patches carefully. I just assumed because you do
> count idle polling cycles per rx queue. But the total number of cycles per
> interval could easily be obtained from the PMD cycles. Perhaps a chance to
> simplify?

Good question, it had been my first thought too. The issue was that the full
PMD cycles are not in sync with the rxq measurement intervals and are also
controlled by the user with pmd-stats-clear. It was simpler to just extend the
rxq counting code a bit to capture idle cycles than to try to synchronize with
the full PMD cycles, and I didn't want to change their behaviour. Actually,
with the rework to the data structure, I think measuring idle/proc cycles for
an rxq simplified the existing code a little too.

thanks,
Kevin.

> BR, Jan

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v8 5/6] dpif-netdev: Time based output batching.
> >> I had a cheaper solution in mind that would accept the inaccuracy of the > >> accounting of tx cycles during rx batch processing (it > statistically slightly biases the load distribution towards the more heavily > loaded rx queues). And instead of attributing the tx cycles outside > the rx batching based on a stored rx queue per packet, I would simply > estimate the actual processing cycles per rx queue per interval by > distributing the total PMD processing cycles in the interval over the polled > rx queues weighted by their counted processing cycles. > > > > I'm not sure if I fully understand your idea, but batches are likely will > > be sent > > on one of the next polling cycles, not on the cycle where they was received. > > Moreover, send cycles are very different for different port types. And also > > packets > > form a single rx queue may have different destinations (different types of > > output ports). > > I'm not sure if it possible to correctly count cycles in this case. > > > > I agree, tx processing can be very different depending on destination, > especially with larger packet sizes where some paths may have a copy. I understand. There can be a big difference between physical and vhostuser ports, especially if the copy passes over the QPI. And if the per packet tx accounting is not causing too much overhead, it would be the best. My suggestion for simplification was not aiming at the same accuracy as Ilya's patch. Perhaps rebalancing doesn't really require that accuracy. Often the new rx queue distribution will anyway behave different compared to the prediction because of changed batching characteristics. > > > Am I missed something? > > > >> After all, Kevin algorithm only needs comparable relative rx queue loads > >> across the PMDs to function. In this context I would also > suggest to exclude idle polling cycles from the algorithm because they are > not load related and always eat up the total cycle budget of the > PMD anyhow. 
> Idle polling per rxq is not counted in the balance algorithm (for the
> reason you say) but it's needed to calculate total cycles over the
> correct time period for providing the user with an rxq percentage of pmd
> used stats. Counting (2/3) and calculations (3/3) are in the balance
> stats patches for ref.

Sorry, I didn't check the patches carefully. I just assumed because you do
count idle polling cycles per rx queue. But the total number of cycles per
interval could easily be obtained from the PMD cycles. Perhaps a chance to
simplify?

BR, Jan
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] tests: Flunk OVN tests if populating ARP tables failed.
If we cannot talk to the ovs-vswitchd process for some reason, e.g. it has terminated prematurely, we want to fail the test as soon as possible. Otherwise the test will likely fail later on due to ARP tables not being populated, which will make troubleshooting the failure harder.

Signed-off-by: Jakub Sitnicki
---
 tests/ofproto-macros.at |  7 ---
 tests/ovn.at            | 46 +++---
 2 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index 55a4e77..1fe0223 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -271,23 +271,24 @@ ovn_attach() {
     start_daemon ovn-controller || return 1
 }

-# ovn_populate_arp
+# OVN_POPULATE_ARP
 #
 # This pre-populates the ARP tables of all of the OVN instances that have been
 # started with ovn_attach(). That means that packets sent from one hypervisor
 # to another never get dropped or delayed by ARP resolution, which makes
 # testing easier.
-ovn_populate_arp() {
+ovn_populate_arp__() {
     for e1 in $arp_table; do
         set `echo $e1 | sed 's/,/ /g'`; sb1=$1 br1=$2 ip=$3 mac=$4
         for e2 in $arp_table; do
             set `echo $e2 | sed 's/,/ /g'`; sb2=$1 br2=$2
             if test $sb1,$br1 != $sb2,$br2; then
-                as $sb2 ovs-appctl tnl/neigh/set $br2 $ip $mac
+                as $sb2 ovs-appctl tnl/neigh/set $br2 $ip $mac || return 1
             fi
         done
     done
 }
+m4_define([OVN_POPULATE_ARP], [AT_CHECK(ovn_populate_arp__, [0], [ignore])])

 # Strips 'xid=0x1234' from ovs-ofctl output.
 strip_xids () {
diff --git a/tests/ovn.at b/tests/ovn.at
index 53fc536..8873bf3 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1185,7 +1185,7 @@ ovn-nbctl acl-add lsw0 to-lport 1000 'eth.type == 0x1237 && eth.src == $set1 &&
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP

 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -1721,7 +1721,7 @@ done
 ovn-nbctl --wait=sb sync
 ovn-sbctl dump-flows
 
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # XXX This is now the 3rd copy of these functions in this file ...
@@ -1924,7 +1924,7 @@ ovs-vsctl add-port br-phys vif3 -- set Interface vif3 options:tx_pcap=hv3/vif3-t
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -2094,7 +2094,7 @@ ovs-vsctl add-port br-phys vif3 -- set Interface vif3 options:tx_pcap=hv3/vif3-t
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -2291,7 +2291,7 @@ done
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -2638,7 +2638,7 @@ done
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -3054,7 +3054,7 @@ ovs-vsctl -- add-port br-int hv2-vif1 -- \
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -3454,7 +3454,7 @@ ovs-vsctl -- add-port br-int hv2-vif1 -- \
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -3666,7 +3666,7 @@ ovs-vsctl -- add-port br-int hv2-vif1 -- \
 # Pre-populate the hypervisors' ARP tables so that we don't lose any
 # packets for ARP resolution (native tunneling doesn't queue packets
 # for ARP resolution).
-ovn_populate_arp
+OVN_POPULATE_ARP
 
 # Allow some time for ovn-northd and ovn-controller to catch up.
 # XXX This should be more systematic.
@@ -3806,7 +3806,7 @@ ovs-vsctl --
Re: [ovs-dev] [PATCH v8 5/6] dpif-netdev: Time based output batching.
> >>     DP_PACKET_BATCH_FOR_EACH (packet, packets_) {
> >> +        p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] =
> >> +                                                      pmd->ctx.last_rxq;
> >>         dp_packet_batch_add(&p->output_pkts, packet);
> >>     }
> >>     return;
> >> ---
> >>
> >> What do you think?
> >>
> 
> Thanks Ilya. I didn't test or check the cycle counts but on first review
> it looks good. I had something similar in mind about saving the rxq's
> that contributed packets. I'm not sure if there's any performance hit
> with adding the cycles used on a per pkt basis - if so, we could do
> something a bit less accurate. In my original email I also missed the
> case where flush is done with pkts from multiple rxq's but cycles are
> attributed solely to one rxq, but this patch fixes that too.
> 
> I'm not sure yet of the impact towards rework on patches Jan sent for
> pmd debug or I sent for balance stats. It seems a lot of people are
> finishing up for the holidays in the next day or two, so I think we have
> some more time for discussion and test before it is merged anyway.
> 
> btw, Bhanu/Ilya thanks for tx batching :-)
> 
> Kevin.

I also had a quick glance at the incremental patch and it looks correct,
as it addresses both errors of wrongly attributed and unaccounted tx
cycles. I haven't tested it, though, and have no feedback regarding the
performance penalty, if any.

I had a cheaper solution in mind that would accept the inaccuracy of
accounting tx cycles during rx batch processing (it statistically biases
the load distribution slightly towards the more heavily loaded rx
queues). Instead of attributing the tx cycles outside the rx batching
based on a stored rx queue per packet, I would simply estimate the
actual processing cycles per rx queue per interval by distributing the
total PMD processing cycles in the interval over the polled rx queues,
weighted by their counted processing cycles. After all, Kevin's
algorithm only needs comparable relative rx queue loads across the PMDs
to function.
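The weighted redistribution described above can be sketched as a simple
normalization step. This is only an illustration of the idea, not code
from dpif-netdev: the struct and function names (`rxq_est`,
`estimate_rxq_cycles`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-rxq record; the real dpif-netdev structures differ. */
struct rxq_est {
    uint64_t counted_cycles;   /* processing cycles measured for this rxq,
                                * including any misattributed tx cycles */
    uint64_t estimated_cycles; /* result: this rxq's share of PMD cycles */
};

/* Distribute the PMD's total processing cycles for an interval over its
 * polled rx queues, weighted by the cycles counted per queue.  Absolute
 * accuracy is not needed: the rebalancing algorithm only compares
 * relative rx queue loads across PMDs, and those ratios are preserved. */
static void
estimate_rxq_cycles(struct rxq_est *rxqs, size_t n, uint64_t total_pmd_cycles)
{
    uint64_t counted_sum = 0;

    for (size_t i = 0; i < n; i++) {
        counted_sum += rxqs[i].counted_cycles;
    }
    for (size_t i = 0; i < n; i++) {
        rxqs[i].estimated_cycles = counted_sum
            ? total_pmd_cycles * rxqs[i].counted_cycles / counted_sum
            : 0;
    }
}
```

For example, two queues that counted 300 and 100 cycles on a PMD that
spent 1000 processing cycles in total would be estimated at 750 and 250
cycles, keeping the 3:1 load ratio the rebalancer compares.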
In this context I would also suggest excluding idle polling cycles from
the algorithm, because they are not load-related and always eat up the
total cycle budget of the PMD anyhow.

Finally, the order of merging our patches matters. I'm sure my PMD
performance metrics refactoring conflicts with both Ilya's and Kevin's
latest changes. I don't know which order will cause the least rework,
but we should decide on an order and work accordingly to avoid
unnecessary churn.

BR, Jan

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev