[ovs-dev] [patch_v2 3/3] tests: Enhance the pmd stats test.

2017-08-13 Thread Darrell Ball
The pmd stats test is enhanced to include megaflow stats
counting and checking.

Signed-off-by: Darrell Ball 
---
 tests/pmd.at | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tests/pmd.at b/tests/pmd.at
index b6732ea..0b1731a 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -206,6 +206,25 @@ pmd thread numa_id  core_id :
lost:0
 ])
 
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0])
+(
+for i in `seq 0 5`;
+do
+pkt="in_port(7),eth(src=50:54:00:00:00:77,dst=50:54:00:00:01:78),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.1.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)"
+AT_CHECK([ovs-appctl netdev-dummy/receive p0 $pkt])
+done
+)
+ovs-appctl time/warp 100
+
+AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 5], [0], [dnl
+pmd thread numa_id  core_id :
+   emc hits:19
+   megaflow hits:6
+   avg. subtable lookups per hit:1.00
+   miss:1
+   lost:0
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
1.9.1



[ovs-dev] [patch_v2 2/3] dpif-netdev: Refactor some pmd stats.

2017-08-13 Thread Darrell Ball
The per-packet stats are presently overlapping in that the
miss stats include the lost stats; make these stats non-overlapping
for clarity, and make this explicit in the dp_stat_type enum.  This
also eliminates the need to increment two 'miss' stats for a
single packet.

The subtable lookup stat is renamed to make it
clear that it relates to masked lookups.

The stats that sum to the total number of packets seen are defined
in dp_stat_type, and an API is created to total them in case
these stats are further subdivided in the future.

Signed-off-by: Darrell Ball 
---
 lib/dpif-netdev.c | 58 ---
 1 file changed, 38 insertions(+), 20 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 17e1666..38f5203 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -323,12 +323,21 @@ static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp,
 OVS_REQUIRES(dp->port_mutex);
 
 enum dp_stat_type {
-DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
-DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
-DP_STAT_MISS,   /* Packets that did not match. */
-DP_STAT_LOST,   /* Packets not passed up to the client. */
-DP_STAT_LOOKUP_HIT, /* Number of subtable lookups for flow table
-   hits */
+DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
+DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
+DP_STAT_MISS,   /* Packets that did not match and were passed
+   up to the client. */
+DP_STAT_LOST,   /* Packets that did not match and were not
+   passed up to client. */
+DP_N_TOT_PKT_STAT,  /* The above statistics account for the total
+   number of packets seen and should be
+   non overlapping with each other. */
+DP_STAT_MASKED_LOOKUP_HIT = DP_N_TOT_PKT_STAT,  /* Number of subtable
+   lookups for flow table
+   hits. Each MASKED_HIT
+   hit will have >= 1
+   MASKED_LOOKUP_HIT
+   hit(s). */
 DP_N_STATS
 };
 
@@ -749,13 +758,22 @@ enum pmd_info_type {
 PMD_INFO_SHOW_RXQ /* Show poll-lists of pmd threads. */
 };
 
+static unsigned long long
+dp_netdev_calcul_total_packets(unsigned long long *stats)
+{
+unsigned long long total_packets = 0;
+for (uint8_t i = 0; i < DP_N_TOT_PKT_STAT; i++) {
+total_packets += stats[i];
+}
+return total_packets;
+}
+
 static void
 pmd_info_show_stats(struct ds *reply,
 struct dp_netdev_pmd_thread *pmd,
 unsigned long long stats[DP_N_STATS],
 uint64_t cycles[PMD_N_CYCLES])
 {
-unsigned long long total_packets;
 uint64_t total_cycles = 0;
 int i;
 
@@ -773,10 +791,6 @@ pmd_info_show_stats(struct ds *reply,
 }
 }
 
-/* Sum of all the matched and not matched packets gives the total.  */
-total_packets = stats[DP_STAT_EXACT_HIT] + stats[DP_STAT_MASKED_HIT]
-+ stats[DP_STAT_MISS];
-
 for (i = 0; i < PMD_N_CYCLES; i++) {
 if (cycles[i] > pmd->cycles_zero[i]) {
cycles[i] -= pmd->cycles_zero[i];
@@ -804,7 +818,8 @@ pmd_info_show_stats(struct ds *reply,
   "\tmiss:%llu\n\tlost:%llu\n",
   stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
   stats[DP_STAT_MASKED_HIT] > 0
-  ? (1.0*stats[DP_STAT_LOOKUP_HIT])/stats[DP_STAT_MASKED_HIT]
+  ? (1.0 * stats[DP_STAT_MASKED_LOOKUP_HIT])
+ / stats[DP_STAT_MASKED_HIT]
   : 0,
   stats[DP_STAT_MISS], stats[DP_STAT_LOST]);
 
@@ -820,6 +835,9 @@ pmd_info_show_stats(struct ds *reply,
   cycles[PMD_CYCLES_PROCESSING],
   cycles[PMD_CYCLES_PROCESSING] / (double)total_cycles * 100);
 
+/* Sum of all the matched and not matched packets gives the total.  */
+unsigned long long total_packets =
+ dp_netdev_calcul_total_packets(stats);
 if (total_packets == 0) {
 return;
 }
@@ -4811,7 +4829,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
 struct dpcls *cls;
 struct dpcls_rule *rules[PKT_ARRAY_SIZE];
 struct dp_netdev *dp = pmd->dp;
-int miss_cnt = 0, lost_cnt = 0;
+int miss_upcall_cnt = 0, miss_no_upcall_cnt = 0, lost_cnt = 0;
 int lookup_cnt = 0, add_lookup_cnt;
 bool any_miss;
 size_t i;
@@ -4853,7 +4871,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
 continue;
 }
 
-miss_cnt++;
+   

[ovs-dev] [patch_v2 1/3] dpif-netdev: Fix per packet cycles statistics.

2017-08-13 Thread Darrell Ball
From: Ilya Maximets 

The DP_STAT_LOOKUP_HIT statistic was mistakenly used in the calculation
of the total number of packets.  This leads to completely wrong
per-packet cycles statistics.

For example:

emc hits:0
megaflow hits:253702308
avg. subtable lookups per hit:1.50
miss:0
lost:0
avg cycles per packet: 248.32 (157498766585/634255770)

In this case the total_packets value of 634255770 was used for the
average per-packet calculation:

  total_packets = 'megaflow hits' + 'megaflow hits' * 1.5

The real value should be 620.80 (157498766585/253702308).
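
For reference, the arithmetic behind these figures:

  wrong total packets = 253702308 * (1 + 1.5) = 634255770
  wrong average       = 157498766585 / 634255770 ~= 248.32 cycles/packet
  real average        = 157498766585 / 253702308 ~= 620.80 cycles/packet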

Fix this by summing only the stats that reflect match/no match.
Direct summing of the required values was chosen instead of
excluding some stats in a loop, to keep the calculation clear and
to avoid similar issues in the future.

CC: Jan Scheurich 
Fixes: 3453b4d62a98 ("dpif-netdev: dpcls per in_port with sorted subtables")
Signed-off-by: Ilya Maximets 
Acked-by: Jan Scheurich 
---
 lib/dpif-netdev.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e2cd931..17e1666 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -755,7 +755,7 @@ pmd_info_show_stats(struct ds *reply,
 unsigned long long stats[DP_N_STATS],
 uint64_t cycles[PMD_N_CYCLES])
 {
-unsigned long long total_packets = 0;
+unsigned long long total_packets;
 uint64_t total_cycles = 0;
 int i;
 
@@ -771,13 +771,12 @@ pmd_info_show_stats(struct ds *reply,
 } else {
 stats[i] = 0;
 }
-
-if (i != DP_STAT_LOST) {
-/* Lost packets are already included in DP_STAT_MISS */
-total_packets += stats[i];
-}
 }
 
+/* Sum of all the matched and not matched packets gives the total.  */
+total_packets = stats[DP_STAT_EXACT_HIT] + stats[DP_STAT_MASKED_HIT]
++ stats[DP_STAT_MISS];
+
 for (i = 0; i < PMD_N_CYCLES; i++) {
 if (cycles[i] > pmd->cycles_zero[i]) {
cycles[i] -= pmd->cycles_zero[i];
-- 
1.9.1



[ovs-dev] [patch_v2 0/3] dpif-netdev: Fix and refactor pmd stats.

2017-08-13 Thread Darrell Ball
Fix total pmd stats calculation, refactor pmd stats and enhance
the pmd stats test.

Darrell Ball (2):
  dpif-netdev: Refactor some pmd stats.
  tests: Enhance the pmd stats test.

Ilya Maximets (1):
  dpif-netdev: Fix per packet cycles statistics.

 lib/dpif-netdev.c | 59 +++
 tests/pmd.at  | 19 ++
 2 files changed, 57 insertions(+), 21 deletions(-)

-- 
1.9.1



Re: [ovs-dev] [PATCH RFC v3 4/4] dpif-netdev: Time based output batching.

2017-08-13 Thread Jan Scheurich
> This allows collecting packets from more than one RX burst
> and sending them together, with a configurable maximum latency.
> 
> 'other_config:output-max-latency' can be used to configure the
> time that a packet can wait in an output batch before being sent.
> 
> Signed-off-by: Ilya Maximets 
> ---
> Notes:
> 
> * This is an RFC and should not be used for performance testing.
> * Millisecond granularity is used for now. Can be easily switched
>   to use microseconds instead.

From earlier in-house trials we know we need to target flush times of 50 us or
less, so we clearly need better time resolution.  Sub-ms timing in a PMD should
be based on TSC cycles, which are already kept in the pmd struct.  Could you
provide a corresponding patch for performance testing?
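
For illustration, a cycle-based check could look roughly like the sketch below
(the 'flush_tsc' field and the 'output_max_latency_us' / 'cycles_per_us'
conversion are assumptions made for this sketch, not part of the patch):

    /* Convert the configured latency to TSC cycles once, at (re)configuration
     * time, so the per-packet path only needs a subtraction and a compare. */
    uint64_t max_latency_cycles = output_max_latency_us * cycles_per_us;

    /* In the send path, flush a port's pending batch once it has waited
     * longer than the configured latency. */
    if (!dp_packet_batch_is_empty(&p->output_pkts)
        && cycles_counter() - p->flush_tsc >= max_latency_cycles) {
        /* ... send p->output_pkts to the netdev and reset p->flush_tsc ... */
    }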

> 
>  lib/dpif-netdev.c| 121
> ++-
>  vswitchd/vswitch.xml |  15 +++
>  2 files changed, 115 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index dcf55f3..0d78ae4 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -85,6 +85,9 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev);
>  #define MAX_RECIRC_DEPTH 5
>  DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0)
> 
> +/* Use instant packet send by default. */
> +#define DEFAULT_OUTPUT_MAX_LATENCY 0
> +
>  /* Configuration parameters. */
>  enum { MAX_FLOWS = 65536 }; /* Maximum number of flows in flow table. */
>  enum { MAX_METERS = 65536 };/* Maximum number of meters. */
> @@ -262,6 +265,9 @@ struct dp_netdev {
>  struct hmap ports;
>  struct seq *port_seq;   /* Incremented whenever a port changes. */
> 
> +/* The time that a packet can wait in output batch for sending. */
> +atomic_uint32_t output_max_latency;
> +
>  /* Meters. */
>  struct ovs_mutex meter_locks[N_METER_LOCKS];
>  struct dp_meter *meters[MAX_METERS]; /* Meter bands. */
> @@ -502,6 +508,7 @@ struct tx_port {
>  int qid;
>  long long last_used;
>  struct hmap_node node;
> +long long output_time;

Rename to flush_time?

>  struct dp_packet_batch output_pkts;
>  };
> 
> @@ -574,6 +581,9 @@ struct dp_netdev_pmd_thread {
>   * than 'cmap_count(dp->poll_threads)'. */
>  uint32_t static_tx_qid;
> 
> +/* Number of filled output batches. */
> +int n_output_batches;
> +
>  struct ovs_mutex port_mutex;/* Mutex for 'poll_list' and 'tx_ports'. */
>  /* List of rx queues to poll. */
>  struct hmap poll_list OVS_GUARDED;
> @@ -669,9 +679,9 @@ static void dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd,
>  static void dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
>                                         struct rxq_poll *poll)
>  OVS_REQUIRES(pmd->port_mutex);
> -static void
> +static int
>  dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
> -   long long now);
> +   long long now, bool force);
>  static void reconfigure_datapath(struct dp_netdev *dp)
>  OVS_REQUIRES(dp->port_mutex);
>  static bool dp_netdev_pmd_try_ref(struct dp_netdev_pmd_thread *pmd);
> @@ -1193,6 +1203,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class,
>  conntrack_init(&dp->conntrack);
> 
>  atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN);
> +atomic_init(&dp->output_max_latency, DEFAULT_OUTPUT_MAX_LATENCY);
> 
>  cmap_init(&dp->poll_threads);
> 
> @@ -2858,7 +2869,7 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
>  dp_packet_batch_init_packet(&pp, execute->packet);
>  dp_netdev_execute_actions(pmd, &pp, false, execute->flow,
>execute->actions, execute->actions_len, now);
> -dp_netdev_pmd_flush_output_packets(pmd, now);
> +dp_netdev_pmd_flush_output_packets(pmd, now, true);
> 
>  if (pmd->core_id == NON_PMD_CORE_ID) {
>  ovs_mutex_unlock(&pmd->non_pmd_mutex);
> @@ -2907,6 +2918,16 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
>  smap_get_ullong(other_config, "emc-insert-inv-prob",
>  DEFAULT_EM_FLOW_INSERT_INV_PROB);
>  uint32_t insert_min, cur_min;
> +uint32_t output_max_latency, cur_max_latency;
> +
> +output_max_latency = smap_get_int(other_config, "output-max-latency",
> +  DEFAULT_OUTPUT_MAX_LATENCY);
> +atomic_read_relaxed(&dp->output_max_latency, &cur_max_latency);
> +if (output_max_latency != cur_max_latency) {
> +atomic_store_relaxed(&dp->output_max_latency, output_max_latency);
> +VLOG_INFO("Output maximum latency set to %"PRIu32" ms",
> +  output_max_latency);
> +}
> 
>  if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
>  free(dp->pmd_cmask);
> @@ -3107,11 +3128,12 @@ cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd,
>  

Re: [ovs-dev] [PATCH v3 0/4] Output packet batching.

2017-08-13 Thread Jan Scheurich
Hi Ilya,

Thanks for providing these patches. We appreciate a simple/maintainable 
approach to Tx batching. 

In our use cases time-based Tx batching across multiple Rx batches provides the 
biggest value, so we will focus on the complete series including the final RFC 
patch.

BR, Jan

> 
> This patch-set is inspired by [1] from Bhanuprakash Bodireddy.
> The implementation of [1] looks very complex and introduces many pitfalls [2]
> for later code modifications, like packets getting stuck.
> 
> This version is targeted at simple and flexible output packet batching at a
> higher level, without adding anything to the netdev layer (and even
> simplifying it).
> 
> The patch set consists of 3 patches. All the functionality is introduced in
> the first patch. The two others are just cleanups so that netdevs do not do
> unnecessary things.
> 
> The 4th patch is just an RFC with a possible time-based implementation.
> It should not be considered for performance testing.
> 
> Basic testing of 'PVP with OVS bonding on phy ports' scenario shows
> significant performance improvement.
> More accurate and intensive testing required.
> 
> [1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet
> transmission.
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html
> 
> [2] For example:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html
> 
> Version 3:
> 
>   * Rebased on current master.
>   * Time based RFC: fixed assert on n_output_batches <= 0.
> 
> Version 2:
> 
>   * Rebased on current master.
>   * Added time based batching RFC patch.
>   * Fixed mixing packets with different sources in same batch.
> 
> Ilya Maximets (4):
>   dpif-netdev: Output packet batching.
>   netdev: Remove unused may_steal.
>   netdev-dpdk: Remove useless cutlen.
>   dpif-netdev: Time based output batching.
> 
>  lib/dpif-netdev.c | 197
> ++
>  lib/netdev-bsd.c  |   4 +-
>  lib/netdev-dpdk.c |  30 +++-
>  lib/netdev-dummy.c|   4 +-
>  lib/netdev-linux.c|   4 +-
>  lib/netdev-provider.h |   7 +-
>  lib/netdev.c  |  12 +--
>  lib/netdev.h  |   2 +-
>  vswitchd/vswitch.xml  |  15 
>  9 files changed, 208 insertions(+), 67 deletions(-)
> 
> --
> 2.7.4
> 


Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Keep latest measured time for PMD thread.

2017-08-13 Thread Jan Scheurich
> > For me struct dp_netdev_pmd_thread is *the* PMD thread context in
> > dpif-netdev. I don't really see the benefit of creating a sub-struct
> > dp_netdev_pmd_thread_ctx and moving certain data into that. Can you
> > explain your motivation?
> Hello Jan,
> 
> IMHO all other fields in struct dp_netdev_pmd_thread have a direct relation
> with the thread itself (like core_id, need_reload) or they are its direct
> accessories (like mutexes, stats, caches, lists of polled ports). On the
> other hand, 'time' and 'last_cycles' have no actual relation with the thread
> and can be moved out of struct dp_netdev_pmd_thread without any logical
> issues. These fields are characteristics of the outside environment. For
> example, 'emc_insert_min', which I'm introducing in the next patch, is
> actually a characteristic of the last received packet (maybe of the port),
> but not of the thread itself.
> 

I see your point. But I am afraid that such classification will continue to be 
fairly subjective and there is little technical or readability gain, if any, by 
doing this. Or are we going to pass/copy/zero the entire sub-struct outside the 
context of the pmd struct at some point? Perhaps it would be enough to group 
these fields in struct dp_netdev_pmd_thread and head the section with a comment?
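
To make the two options concrete, a rough sketch (member names here are taken
from this thread and are illustrative only, not from the patch):

    /* (a) Keep the fields in struct dp_netdev_pmd_thread, grouped under a
     * comment marking them as per-iteration context rather than properties
     * of the thread itself. */
    struct dp_netdev_pmd_thread {
        /* ... existing members ... */

        /* Context of the current processing loop; characteristics of the
         * environment, not of the thread. */
        long long now;            /* Latest measured time. */
        uint64_t last_cycles;     /* Latest TSC sample. */
        uint32_t emc_insert_min;  /* EMC insertion probability in effect. */
    };

    /* (b) Factor the same fields into a sub-struct, as proposed. */
    struct dp_netdev_pmd_thread_ctx {
        long long now;
        uint64_t last_cycles;
        uint32_t emc_insert_min;
    };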

BR, Jan


Re: [ovs-dev] [PATCH net-next v2] openvswitch: enable NSH support

2017-08-13 Thread Jan Scheurich
> From: Jiri Benc [mailto:jb...@redhat.com]
> Sent: Friday, 11 August, 2017 12:23
> 
> On Fri, 11 Aug 2017 10:09:36 +, Jan Scheurich wrote:
> > Unless someone can explain to me why the datapath should understand the
> > internal structure/format of metadata in push_nsh, I would strongly
> > vote to keep the metadata as variable length octet sequence in the
> > non-structured OVS_ACTION_ATTR_PUSH_NSH
> 
> Could be but it still needs to be in a different attribute and not in
> the ovs_action_push_nsh structure.
> 
> Separate attributes for MD1/MD2 have the advantage of easier validation:
> with a separate MD1 type attribute, the size check is easier. With an
> unstructured MD attribute, we'd need to look into the
> OVS_ACTION_ATTR_NSH_BASE_HEADER attribute for mdtype and then validate
> the unstructured MD attribute size manually. Not a big deal, though.
> I don't have strong opinion here.
> 
> But I do have strong opinion that MD needs to go into a separate
> attribute, whether there are separate attributes for MD1/2 or not.

Jiri, I am not too familiar with conventions on the OVS netlink interface 
regarding the handling of variable length fields. What is the benefit of 
structuring the push_nsh action into

OVS_ACTION_ATTR_PUSH_NSH
+-- OVS_ACTION_ATTR_NSH_BASE_HEADER
+-- OVS_ACTION_ATTR_NSH_MD

instead of grouping the base header fields and the variable length MD into a 
single monolithic attribute OVS_ACTION_ATTR_PUSH_NSH? Is the main concern a 
potential future need to add additional parameters to the push_nsh datapath 
action? Are there examples for such structured actions other than 
OVS_ACTION_ATTR_CT where the need is obvious because it is very polymorphic?
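
For what it's worth, a minimal sketch of how the nested encoding could look on
the userspace side, using OVS's existing netlink helpers (the
OVS_ACTION_ATTR_NSH_* attribute names and the 'base_hdr'/'md' variables are
simply the ones from this thread, not a settled API):

    /* Encode push_nsh as a nested action: the fixed base/service-path header
     * in one attribute, the metadata as an opaque variable-length attribute. */
    size_t start = nl_msg_start_nested(actions, OVS_ACTION_ATTR_PUSH_NSH);
    nl_msg_put_unspec(actions, OVS_ACTION_ATTR_NSH_BASE_HEADER,
                      &base_hdr, sizeof base_hdr);
    if (md_len) {
        nl_msg_put_unspec(actions, OVS_ACTION_ATTR_NSH_MD, md, md_len);
    }
    nl_msg_end_nested(actions, start);

    /* The datapath validation then only has to check that md_len is
     * consistent with the mdtype carried in the base header (a fixed size
     * for MD1, a multiple of 4 bytes for MD2). */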

BR, Jan

BTW:  The name OVS_ACTION_ATTR_NSH_BASE_HEADER is misleading because in the NSH 
draft the term base header is used for the first 32-bit word, whereas here it 
includes also the 32-bit Service Path header.


Re: [ovs-dev] Running OVS on a wireless node

2017-08-13 Thread Raymond Burkholder

On 08/13/17 12:15, Sgh snhz wrote:


I want to run OVS on a wireless node in an emulator.

After running ovsdb and ovs-vswitchd, I am confused how to create a bridge
or configure the switch to work on wireless nodes (with only one
interface).

Could you help me to configure OVS in wireless nodes?


You don't provide much information, but perhaps the following may be of 
assistance:


connecting openvswitch to a hostapd based wireless access-point:

http://blog.raymond.burkholder.net/index.php?/archives/762-Using-Quilt-to-Patch-a-Debian-Package-hostapd.html

Some other general linux based wireless information:

http://blog.raymond.burkholder.net/index.php?/categories/75-Wireless

--
Raymond Burkholder
https://blog.raymond.burkholder.net/




[ovs-dev] Running OVS on a wireless node

2017-08-13 Thread Sgh snhz
Hello,



I want to run OVS on a wireless node in an emulator.

After running ovsdb and ovs-vswitchd, I am confused how to create a bridge
or configure the switch to work on wireless nodes (with only one
interface).

Could you help me to configure OVS in wireless nodes?



Thank you in advance


Re: [ovs-dev] [patch_v5] dp-packet: Reset DPDK HWOL checksum flags on init.

2017-08-13 Thread Chandran, Sugesh


Regards
_Sugesh


> > > >
> > > > Correct, I reused reset_dp_packet_checksum_ol_flags() to do the
> > > > initialization as well.  I could also have created a separate function.
> > > >
> > > > In case a DPDK dev is used, those flags will be managed by DPDK.
> > > >
> > > > That sounds like a bug in itself--is there a missing call to
> > > > initialize the mbuf somewhere?
> > > >
> > > > Are you suggesting to initialize the whole mbuf for each packet?
> > >
> > > The issue that I'm raising is that it's unusual to take an
> > > uninitialized, indeterminate field, and then initialize it by clearing a
> > > few bits.  It's much more conventional to initialize it by setting it to
> > > zero, like this:
> > >
> > > p->mbuf.ol_flags = 0;
> > >
> > > That is better; I will create a separate function then.
> > > I will resend.
> > > Thanks
> > [Sugesh] I also agree with Ben here.
> > Currently OVS uses only checksum offload flags from the mbuf (as far as I
> > am aware).  But there are other flag bits that may get used in the future,
> > like TSO.  So it's better to initialize the mbuf properly before using it.
> >
> > Here is the mbuf reset function in DPDK that gets called when an existing
> > piece of memory is mapped to an mbuf.
> > I believe only the ol_flags are relevant for now in OVS.
> >
> > There is no higher cost associated with initializing all the ol_flags vs
> > some flags, so that is fine.
> > It will be done twice in the case of a packet received from a dpdk device,
> > but it is a small cost.
> > I was more concerned about the dual responsibility conflict when packets
> > are received from a DPDK device, and this is why I wanted to limit the
> > scope of the flag management in OVS; it should be ok, though.
> > Hence, I mentioned that I will initialize all the ol_flags.
> >
> > JTBC, are you suggesting to initialize all the fields below?
> > This would mean that when packets are received from a DPDK dev, both the
> > rte library and OVS would separately initialize all the fields below -
> > meaning it would be done twice for each packet.
> >
> [Sugesh] No, we don't have to initialize all the fields.  But we can have a
> generic function to init all mbuf fields that are relevant in OVS.
> For now it's only ol_flags.  In the future this function must be updated
> when we use more fields from rte_mbuf.
> 
> We are on the same page then.
> I plan for the function to have a generic name, so this is fine.
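
A minimal sketch of what such a generically named helper could look like (the
name 'dp_packet_mbuf_init' is just an assumption for this sketch, not
necessarily what the actual patch uses):

    /* Initialize the rte_mbuf fields that OVS cares about when a dp_packet is
     * (re)initialized; today that is only ol_flags. */
    static inline void
    dp_packet_mbuf_init(struct dp_packet *p)
    {
        p->mbuf.ol_flags = 0;
    }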
> 
> 
> Sorry, I really didn't get the case you mentioned above, where the init
> gets called twice.
> What I can see in the code is that, during the dpdk mp init, dp_packet_init
> is called for rte_mbuf initialization.  This happens at port initiation.
> 
> On packet reception, there won't be any call to dp_packet_init.
> 
> I meant the memory related to packets that are received, not the reception
> time itself.
> However, the memory will be reused for subsequent packets, right?  You
> can't use a piece of memory once and never use it again - this is called a
> memory leak.  This means each mbuf will need to be initialized again and
> again.
> 
> I did some background investigation around this, Sugesh.
> I originally 

Re: [ovs-dev] [PATCH v2 2/3] ovs-router: Set suitable type to netdev_open().

2017-08-13 Thread Tonghao Zhang
On Sat, Aug 12, 2017 at 1:15 AM, Ben Pfaff  wrote:
> I don't know.  This patch seems to say that it fixes a problem if we
> revert commit 8c2c225e481d ("netdev: Fix netdev_open() to track and
> recreate classless interfaces").  That patch hasn't been reverted.
> Given that, can you explain the value of this patch?
This patch and 8c2c225e481d ("netdev: Fix netdev_open() to track and
recreate classless interfaces") may fix the same bug, and I explained it in
the commit message. I think we should still pass a suitable type to
netdev_open() even though 8c2c225e481d ("netdev: Fix netdev_open() to track
and recreate classless interfaces") has been applied. If we don't need it,
that is fine with me.
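
For context, the idea is roughly the following (a sketch only; get_netdev_type()
is the helper from the call trace quoted further down, and error handling is
elided):

    /* In ovs_router_insert__(): resolve the datapath-specific class of the
     * device first, so e.g. vxlan_sys_4789 is opened as a 'vxlan' netdev
     * instead of falling back to the generic 'system' class. */
    const char *type = get_netdev_type(dev_name);   /* "vxlan", "system", ... */
    struct netdev *dev;
    int error = netdev_open(dev_name, type, &dev);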

> Is removing route_table_reset() related to the rest of the patch?  If it
> is related, please add the explanation to the commit message.  If it is
> not related, then please submit it as a separate patch with the
> explanation that you gave below.
yes, if this patch is ok, I will send v3.

> I spent a few minutes playing with the style of this patch.  I'm
> appending it in a style closer to what we usually prefer.  This should
> not have changed the behavior at all.
>
> I wonder whether we should change the behavior of netdev_open() with a
> NULL type, so that it somehow searches for the proper type.
>
> Thank you for your work making OVS better.
>
> On Fri, Aug 11, 2017 at 06:02:06PM +0800, Tonghao Zhang wrote:
>> Hi Ben, this patch is ok ?
>>
>> On Fri, Aug 4, 2017 at 10:45 AM, Tonghao Zhang  
>> wrote:
>> > We can avoid the deadlock by removing the route_table_reset() call from
>> > route_table_init().
>> >
>> > The call trace is below:
>> >
>> > dp_initialize (ovsthread_once)
>> >   route_table_init
>> >     route_table_reset
>> >       route_table_handle_msg
>> >         ovs_router_insert__
>> >
>> > ovs_router_insert__
>> >   get_src_addr
>> >     get_netdev_type
>> >       dp_enumerate_types
>> >         dp_initialize (ovsthread_once)
>> >
>> >
>> > After removing the route_table_reset() call from route_table_init(),
>> > ovs-router works well.  We run route_table_run() periodically, and
>> > route_table_valid is initialized to false (so we will call
>> > route_table_reset() in that case anyway).  It is therefore unnecessary to
>> > reset the route table immediately in route_table_init().
>> >
>> > On Fri, Aug 4, 2017 at 2:00 AM, Ben Pfaff  wrote:
>> >> On Tue, Jul 18, 2017 at 08:44:15PM -0700, Tonghao Zhang wrote:
>> >>> The ovs-router module uses netdev_open() to get routes.
>> >>> But this module always calls netdev_open() with a type
>> >>> that is NULL. This module may open eth0, vethx,
>> >>> vxlan_sys_4789, or br0 if these network devices exist, and
>> >>> these devices will be opened as the 'system' type.
>> >>>
>> >>> When debugging, something calls netdev_ref() on it.  After reverting
>> >>> "netdev: Fix netdev_open() to adhere to class type if given",
>> >>> and when we call 'ovs-appctl dpctl/show' or 'ovs-dpctl show',
>> >>> the info is shown as below. The vxlan_sys_4789 is up
>> >>> (e.g. ifconfig vxlan_sys_4789 up).
>> >>>
>> >>> $ ovs-dpctl show
>> >>> system@ovs-system:
>> >>>   lookups: hit:4053 missed:118 lost:3
>> >>>   flows: 0
>> >>>   masks: hit:4154 total:1 hit/pkt:1.00
>> >>>   port 0: ovs-system (internal)
>> >>>   port 1: user-ovs-vm (internal)
>> >>>   port 2: vxlan_sys_4789 (vxlan)
>> >>>
>> >>> But the info should be as below.
>> >>> $ ovs-dpctl show
>> >>> system@ovs-system:
>> >>>   lookups: hit:4053 missed:118 lost:3
>> >>>   flows: 0
>> >>>   masks: hit:4154 total:1 hit/pkt:1.00
>> >>>   port 0: ovs-system (internal)
>> >>>   port 1: user-ovs-vm (internal)
>> >>>   port 2: vxlan_sys_4789 (vxlan: packet_type=ptap)
>> >>>
>> >>> Because the netdev class of the 'system' type does not have
>> >>> 'get_config', while tunnel vports do have 'get_config', we can
>> >>> get the config info (e.g. packet_type=ptap) of tunnel vports.
>> >>>
>> >>> If we only revert the patch, there is a bug all the time. The
>> >>> patch which Eelco posted is fine to me; that patch avoids the issue.
>> >>> URL: https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335560.html
>> >>> But even without it, the patch I posted also avoids the problem.
>> >>>
>> >>> However, we should check the type in the ovs-router module; this
>> >>> patch works well together with the patch Eelco posted.
>> >>>
>> >>> Signed-off-by: Tonghao Zhang 
>> >>
>> >> Thanks for the bug fix.
>> >>
>> >> This patch seems fine but can you help me understand the change to
>> >> route_table_init()?  It isn't obviously connected to the rest of the
>> >> patch.
>> >>
>> >> Thanks,
>> >>
>> >> Ben.
>> >>
>> >>> diff --git a/lib/route-table.c b/lib/route-table.c
>> >>> index 67fda31..fc6845f 100644
>> >>> --- a/lib/route-table.c
>> >>> +++ b/lib/route-table.c
>> >>> @@ -112,7 +112,6 @@