[ovs-dev] Why 'OVS_VSWITCHD_STOP' do not run?

2017-08-17 Thread Sam
Hi all,

I'm running my OVS test case (my-test.at) like this, but when
'OVS_VSWITCHD_START' or another AT_CHECK fails, 'OVS_VSWITCHD_STOP' does
not run. Why is that, and how can I make 'OVS_VSWITCHD_STOP' run anyway?

Thank you~
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] gitignore: Ignore cxx-check

2017-08-17 Thread Xiao Liang
Add cxx-check to .gitignore

Signed-off-by: Xiao Liang 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index cc864ecc7..8019bee41 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,3 +75,4 @@ testsuite.tmp.orig
 /tests/lcov/
 /Documentation/_build
 /.venv
+/cxx-check
-- 
2.14.1



Re: [ovs-dev] How to refresh test cases in tests folder?

2017-08-17 Thread Sam
I had to re-run ./boot.sh; then the test cases were refreshed.

2017-08-18 9:17 GMT+08:00 Sam :

> Hi all,
>
> I'm running my test cases, and I defined a new macro in ofproto-macro.at
> like this:
>
> m4_define([_OVS_VSWITCHD_START],
>>   [OVS_RUNDIR=/usr/local/var/run/openvswitch; export OVS_RUNDIR
>>OVS_LOGDIR=/usr/local/var/log/openvswitch; export OVS_LOGDIR
>>OVS_DBDIR=/usr/local/var/run/openvswitch; export OVS_DBDIR
>>dnl  OVS_SYSCONFDIR=; export OVS_SYSCONFDIR
>>/etc/init.d/openvswitch stop
>>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | grep
>> "Killing ovsdb-server"])
>>/etc/init.d/openvswitch start
>>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | \
>> grep "Enabling remote OVSDB managers."])
>> ])
>
>
> This turned out wrong, so I changed it to this:
>
> m4_define([_OVS_VSWITCHD_START],
>>   [OVS_RUNDIR=/usr/local/var/run/openvswitch; export OVS_RUNDIR
>>OVS_LOGDIR=/usr/local/var/log/openvswitch; export OVS_LOGDIR
>>OVS_DBDIR=/usr/local/var/run/openvswitch; export OVS_DBDIR
>>dnl  OVS_SYSCONFDIR=; export OVS_SYSCONFDIR
>>/etc/init.d/openvswitch stop
>>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | \
>> grep "Killing ovsdb-server"])
>>/etc/init.d/openvswitch start
>>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | \
>> grep "Enabling remote OVSDB managers."])
>> ])
>
>
> But after I reconfigured OVS and ran `make check`, I got the same error as
> before, which is:
>
> /home/gangyewei-3/mvs/mvs/tests/testsuite.dir/at-groups/1/test-source:
>> line 41: Enabling remote OVSDB managers.: command not found
>> stdout:
>> ./netdev-dpdk.at:24: exit code was 1, expected 0
>
>
> I think this is caused by the bug in the first version, but I have already
> changed it to the second version, so I don't know why I still get the error.
> Do I have to re-run `boot.sh` in the OVS folder to regenerate the test cases?
>
>


Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Output packet batching.

2017-08-17 Thread Gao Zhenyu
Hi Ilya,

   Thanks for working on it.
   This patch calls dp_packet_batch_clone, so I have a concern about the
performance. Could you please share some performance numbers with and
without your patch?

Thanks
Zhenyu Gao

2017-08-10 23:38 GMT+08:00 Ilya Maximets :

> While processing an incoming batch of packets, they are scattered
> across many per-flow batches and sent separately.
>
> This becomes an issue when using more than a few flows.
>
> For example, if we have balanced-tcp OvS bonding with 2 ports,
> there will be 256 datapath internal flows for each dp_hash
> pattern. This leads to a single received batch being scattered
> across all of those 256 per-flow batches and send being invoked
> for each packet separately. This behaviour greatly degrades the
> overall performance of netdev_send because of the inability to
> take advantage of vectorized transmit functions.
> But half of the datapath flows (with 2 ports in bonding) will
> have the same output actions. This means that we can collect
> them back in a single place and send them at once using a single
> call to netdev_send. This patch introduces a per-port packet
> batch for output packets for that purpose.
>
> 'output_pkts' batch is thread local and located in send port cache.
>
> Signed-off-by: Ilya Maximets 
> ---
>  lib/dpif-netdev.c | 104 ++
> 
>  1 file changed, 82 insertions(+), 22 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e2cd931..a2a25be 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -502,6 +502,7 @@ struct tx_port {
>  int qid;
>  long long last_used;
>  struct hmap_node node;
> +struct dp_packet_batch output_pkts;
>  };
>
>  /* PMD: Poll modes drivers.  PMD accesses devices via polling to eliminate
> @@ -633,9 +634,10 @@ static void dp_netdev_execute_actions(struct
> dp_netdev_pmd_thread *pmd,
>size_t actions_len,
>long long now);
>  static void dp_netdev_input(struct dp_netdev_pmd_thread *,
> -struct dp_packet_batch *, odp_port_t port_no);
> +struct dp_packet_batch *, odp_port_t port_no,
> +long long now);
>  static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *,
> -  struct dp_packet_batch *);
> +  struct dp_packet_batch *, long long
> now);
>
>  static void dp_netdev_disable_upcall(struct dp_netdev *);
>  static void dp_netdev_pmd_reload_done(struct dp_netdev_pmd_thread *pmd);
> @@ -667,6 +669,9 @@ static void dp_netdev_add_rxq_to_pmd(struct
> dp_netdev_pmd_thread *pmd,
>  static void dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
> struct rxq_poll *poll)
>  OVS_REQUIRES(pmd->port_mutex);
> +static void
> +dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
> +   long long now);
>  static void reconfigure_datapath(struct dp_netdev *dp)
>  OVS_REQUIRES(dp->port_mutex);
>  static bool dp_netdev_pmd_try_ref(struct dp_netdev_pmd_thread *pmd);
> @@ -2809,6 +2814,7 @@ dpif_netdev_execute(struct dpif *dpif, struct
> dpif_execute *execute)
>  struct dp_netdev *dp = get_dp_netdev(dpif);
>  struct dp_netdev_pmd_thread *pmd;
>  struct dp_packet_batch pp;
> +long long now = time_msec();
>
>  if (dp_packet_size(execute->packet) < ETH_HEADER_LEN ||
>  dp_packet_size(execute->packet) > UINT16_MAX) {
> @@ -2851,8 +2857,8 @@ dpif_netdev_execute(struct dpif *dpif, struct
> dpif_execute *execute)
>
>  dp_packet_batch_init_packet(, execute->packet);
>  dp_netdev_execute_actions(pmd, , false, execute->flow,
> -  execute->actions, execute->actions_len,
> -  time_msec());
> +  execute->actions, execute->actions_len,
> now);
> +dp_netdev_pmd_flush_output_packets(pmd, now);
>
>  if (pmd->core_id == NON_PMD_CORE_ID) {
>  ovs_mutex_unlock(>non_pmd_mutex);
> @@ -3101,6 +3107,37 @@ cycles_count_intermediate(struct
> dp_netdev_pmd_thread *pmd,
>  non_atomic_ullong_add(>cycles.n[type], interval);
>  }
>
> +static void
> +dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
> +   struct tx_port *p, long long now)
> +{
> +int tx_qid;
> +bool dynamic_txqs;
> +
> +dynamic_txqs = p->port->dynamic_txqs;
> +if (dynamic_txqs) {
> +tx_qid = dpif_netdev_xps_get_tx_qid(pmd, p, now);
> +} else {
> +tx_qid = pmd->static_tx_qid;
> +}
> +
> +netdev_send(p->port->netdev, tx_qid, >output_pkts, true,
> dynamic_txqs);
> +dp_packet_batch_init(>output_pkts);
> +}
> +
> +static void
> +dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
> + 

Re: [ovs-dev] [PATCH v1] netdev-dpdk: Execute QoS Checking before copying to mbuf

2017-08-17 Thread Gao Zhenyu
Hi Ian,

   This patch has been pending for a long time. May I know if I should revise it?


Thanks
Zhenyu Gao

2017-08-08 17:34 GMT+08:00 Gao Zhenyu :

> Thanks for the review. Please let me know if you have any concerns about it. :)
>
> Thanks
> Zhenyu Gao
>
> 2017-08-08 17:08 GMT+08:00 Stokes, Ian :
>
>> Hi Darrell,
>>
>> I am, I've had a cursory look over it already but was planning to do a
>> deeper dive later this week as I have a few concerns about the approach.
>>
>> Ian
>>
>> > -Original Message-
>> > From: Darrell Ball [mailto:db...@vmware.com]
>> > Sent: Tuesday, August 8, 2017 3:07 AM
>> > To: Gao Zhenyu ; Ben Pfaff ;
>> > Chandran, Sugesh ; ovs dev
>> > ; Stokes, Ian 
>> > Subject: Re: [ovs-dev] [PATCH v1] netdev-dpdk: Execute QoS Checking
>> before
>> > copying to mbuf
>> >
>> > Hi Ian
>> >
>> > Are you interested in reviewing this ?
>> >
>> > Thanks Darrell
>> >
>> > -Original Message-
>> > From:  on behalf of Gao Zhenyu
>> > 
>> > Date: Monday, August 7, 2017 at 6:36 PM
>> > To: Ben Pfaff , "Chandran, Sugesh"
>> > , ovs dev 
>> > Subject: Re: [ovs-dev] [PATCH v1] netdev-dpdk: Execute QoS Checking
>> before
>> >   copying to mbuf
>> >
>> > ping
>> >
>> > Thanks
>> > Zhenyu Gao
>> >
>> > 2017-07-25 18:27 GMT+08:00 Zhenyu Gao :
>> >
>> > > In the dpdk_do_tx_copy function, all packets were copied to mbuf
>> > > first, but QoS checking may drop some of them.
>> > > Move the QoS checking in front of copying the data to mbuf; it helps
>> > > to reduce useless copies.
>> > >
>> > > Signed-off-by: Zhenyu Gao 
>> > > ---
>> > >  lib/netdev-dpdk.c | 55 ++
>> > > ++---
>> > >  1 file changed, 36 insertions(+), 19 deletions(-)
>> > >
>> > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> > > index ea17b97..bb8bd8f 100644
>> > > --- a/lib/netdev-dpdk.c
>> > > +++ b/lib/netdev-dpdk.c
>> > > @@ -258,7 +258,7 @@ struct dpdk_qos_ops {
>> > >   * For all QoS implementations it should always be non-null.
>> > >   */
>> > >  int (*qos_run)(struct qos_conf *qos_conf, struct rte_mbuf
>> > **pkts,
>> > > -   int pkt_cnt);
>> > > +   int pkt_cnt, bool may_steal);
>> > >  };
>> > >
>> > >  /* dpdk_qos_ops for each type of user space QoS implementation */
>> > > @@ -1438,7 +1438,8 @@ netdev_dpdk_policer_pkt_handle(struct
>> > > rte_meter_srtcm *meter,
>> > >
>> > >  static int
>> > >  netdev_dpdk_policer_run(struct rte_meter_srtcm *meter,
>> > > -struct rte_mbuf **pkts, int pkt_cnt)
>> > > +struct rte_mbuf **pkts, int pkt_cnt,
>> > > +bool may_steal)
>> > >  {
>> > >  int i = 0;
>> > >  int cnt = 0;
>> > > @@ -1454,7 +1455,9 @@ netdev_dpdk_policer_run(struct
>> rte_meter_srtcm
>> > > *meter,
>> > >  }
>> > >  cnt++;
>> > >  } else {
>> > > -rte_pktmbuf_free(pkt);
>> > > +if (may_steal) {
>> > > +rte_pktmbuf_free(pkt);
>> > > +}
>> > >  }
>> > >  }
>> > >
>> > > @@ -1463,12 +1466,13 @@ netdev_dpdk_policer_run(struct
>> > rte_meter_srtcm
>> > > *meter,
>> > >
>> > >  static int
>> > >  ingress_policer_run(struct ingress_policer *policer, struct
>> > rte_mbuf
>> > > **pkts,
>> > > -int pkt_cnt)
>> > > +int pkt_cnt, bool may_steal)
>> > >  {
>> > >  int cnt = 0;
>> > >
>> > >  rte_spinlock_lock(>policer_lock);
>> > > -cnt = netdev_dpdk_policer_run(>in_policer, pkts,
>> > pkt_cnt);
>> > > +cnt = netdev_dpdk_policer_run(>in_policer, pkts,
>> > > +  pkt_cnt, may_steal);
>> > >  rte_spinlock_unlock(>policer_lock);
>> > >
>> > >  return cnt;
>> > > @@ -1572,7 +1576,7 @@ netdev_dpdk_vhost_rxq_recv(struct
>> netdev_rxq
>> > *rxq,
>> > >  dropped = nb_rx;
>> > >  nb_rx = ingress_policer_run(policer,
>> > >  (struct rte_mbuf **) batch-
>> > >packets,
>> > > -nb_rx);
>> > > +nb_rx, true);
>> > >  dropped -= nb_rx;
>> > >  }
>> > >
>> > > @@ -1609,7 +1613,7 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq,
>> > struct
>> > > dp_packet_batch *batch)
>> > >  dropped = nb_rx;
>> >   

Re: [ovs-dev] [PATCH v1] ovn: Fix BFD error config on gateway

2017-08-17 Thread Gao Zhenyu
Thanks for the suggestion. A test case should be added to the OVN tests, but
I am not familiar with OVN testing and am busy with other work right now.
Since this issue affects many consumers who try to use HA gateways, I prefer
to push this fix to OVS master first; then we have more time to write a test
case for it.

Thanks
Zhenyu Gao

2017-08-18 1:21 GMT+08:00 Anil Venkata :

> Hi Zhenyu Gao
>
> Is it possible for you to add a test case for this scenario. This test on
> the master code( without your patch) should fail, and your patch should
> make the test pass.
>
> Thanks
> Anil
>
> On Wed, Aug 16, 2017 at 12:17 PM, Zhenyu Gao 
> wrote:
>
>> The bfd_calculate_chassis function calculates a gateway's peer
>> datapaths to figure out which chassis should be added to bfd_chassis.
>> But in most circumstances, a gateway's peer datapath has NO chassis,
>> so it may disable BFD on some tunnel ports. Then a port on a remote
>> OVS cannot send packets out, because it believes all the remote
>> gateways are down.
>>
>> This patch goes through the whole graph and visits every datapath
>> port which is connected with the gateway.
>>
>> Signed-off-by: Zhenyu Gao 
>> ---
>>  ovn/controller/bfd.c | 102 ++
>> -
>>  1 file changed, 84 insertions(+), 18 deletions(-)
>>
>> diff --git a/ovn/controller/bfd.c b/ovn/controller/bfd.c
>> index d1448b1..dccfc98 100644
>> --- a/ovn/controller/bfd.c
>> +++ b/ovn/controller/bfd.c
>> @@ -100,6 +100,88 @@ bfd_calculate_active_tunnels(const struct
>> ovsrec_bridge *br_int,
>>  }
>>  }
>>
>> +struct local_datapath_node {
>> +struct ovs_list node;
>> +struct local_datapath *dp;
>> +};
>> +
>> +static void
>> +bfd_travel_gw_related_chassis(struct local_datapath *dp,
>> +  const struct hmap *local_datapaths,
>> +  struct ovsdb_idl_index_cursor *cursor,
>> +  struct sbrec_port_binding *lpval,
>> +  struct sset *bfd_chassis)
>> +{
>> +struct ovs_list dp_list;
>> +const struct sbrec_port_binding *pb;
>> +struct sset visited_dp = SSET_INITIALIZER(_dp);
>> +const char *dp_key;
>> +struct local_datapath_node *dp_binding;
>> +
>> +if (!(dp_key = smap_get(>datapath->external_ids,
>> "logical-router")) &&
>> +!(dp_key = smap_get(>datapath->external_ids,
>> "logical-switch"))) {
>> +VLOG_INFO("datapath has no uuid, cannot travel graph");
>> +return;
>> +}
>> +
>> +sset_add(_dp, dp_key);
>> +
>> +ovs_list_init(_list);
>> +dp_binding = xmalloc(sizeof *dp_binding);
>> +dp_binding->dp = dp;
>> +ovs_list_push_back(_list, _binding->node);
>> +
>> +/*
>> + * Go through whole graph to figure out all chassis which may deliver
>> + * packets to gateway. */
>> +do {
>> +dp_binding = CONTAINER_OF(ovs_list_pop_front(_list),
>> +  struct local_datapath_node, node);
>> +dp = dp_binding->dp;
>> +free(dp_binding);
>> +for (size_t i = 0; i < dp->n_peer_dps; i++) {
>> +const struct sbrec_datapath_binding *pdp = dp->peer_dps[i];
>> +if (!pdp) {
>> +continue;
>> +}
>> +
>> +if (!(dp_key = smap_get(>external_ids,
>> "logical-router")) &&
>> +!(dp_key = smap_get(>external_ids,
>> "logical-switch"))) {
>> +continue;
>> +}
>> +
>> +if (sset_contains(_dp, dp_key)) {
>> +continue;
>> +}
>> +
>> +sset_add(_dp, dp_key);
>> +
>> +struct hmap_node *node = hmap_first_with_hash(local_dat
>> apaths,
>> +
>> pdp->tunnel_key);
>> +if (!node) {
>> +continue;
>> +}
>> +
>> +dp_binding = xmalloc(sizeof *dp_binding);
>> +dp_binding->dp = CONTAINER_OF(node, struct local_datapath,
>> +  hmap_node);
>> +ovs_list_push_back(_list, _binding->node);
>> +
>> +sbrec_port_binding_index_set_datapath(lpval, pdp);
>> +SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, cursor, lpval) {
>> +if (pb->chassis) {
>> +const char *chassis_name = pb->chassis->name;
>> +if (chassis_name) {
>> +sset_add(bfd_chassis, chassis_name);
>> +}
>> +}
>> +}
>> +}
>> +} while (!ovs_list_is_empty(_list));
>> +
>> +sset_destroy(_dp);
>> +}
>> +
>>  static void
>>  bfd_calculate_chassis(struct controller_ctx *ctx,
>>const struct sbrec_chassis *our_chassis,
>> @@ -155,24 +237,8 @@ bfd_calculate_chassis(struct controller_ctx *ctx,
>>  }
>>  }
>>  if (our_chassis_is_gw_for_dp) {
>> -for (size_t i = 0; i < dp->n_peer_dps; i++) {

[ovs-dev] How to refresh test cases in tests folder?

2017-08-17 Thread Sam
Hi all,

I'm running my test cases, and I defined a new macro in ofproto-macro.at
like this:

m4_define([_OVS_VSWITCHD_START],
>   [OVS_RUNDIR=/usr/local/var/run/openvswitch; export OVS_RUNDIR
>OVS_LOGDIR=/usr/local/var/log/openvswitch; export OVS_LOGDIR
>OVS_DBDIR=/usr/local/var/run/openvswitch; export OVS_DBDIR
>dnl  OVS_SYSCONFDIR=; export OVS_SYSCONFDIR
>/etc/init.d/openvswitch stop
>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | grep
> "Killing ovsdb-server"])
>/etc/init.d/openvswitch start
>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | \
> grep "Enabling remote OVSDB managers."])
> ])


This turned out wrong, so I changed it to this:

m4_define([_OVS_VSWITCHD_START],
>   [OVS_RUNDIR=/usr/local/var/run/openvswitch; export OVS_RUNDIR
>OVS_LOGDIR=/usr/local/var/log/openvswitch; export OVS_LOGDIR
>OVS_DBDIR=/usr/local/var/run/openvswitch; export OVS_DBDIR
>dnl  OVS_SYSCONFDIR=; export OVS_SYSCONFDIR
>/etc/init.d/openvswitch stop
>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | \
> grep "Killing ovsdb-server"])
>/etc/init.d/openvswitch start
>OVS_WAIT_UNTIL([tail -n 3 $OVS_LOGDIR/ovs-ctl.log | \
> grep "Enabling remote OVSDB managers."])
> ])


But after I reconfigured OVS and ran `make check`, I got the same error as
before, which is:

/home/gangyewei-3/mvs/mvs/tests/testsuite.dir/at-groups/1/test-source: line
> 41: Enabling remote OVSDB managers.: command not found
> stdout:
> ./netdev-dpdk.at:24: exit code was 1, expected 0


I think this is caused by the bug in the first version, but I have already
changed it to the second version, so I don't know why I still get the error.
Do I have to re-run `boot.sh` in the OVS folder to regenerate the test cases?


Re: [ovs-dev] [PATCH 2/2] ofproto-dpif-ipfix: add interface Information Elements to flow key

2017-08-17 Thread Greg Rose

On 08/16/2017 01:54 AM, Szczerbik, PrzemyslawX wrote:

Hi,

I haven't received any feedback on this patch for quite some time.
Is there anything I can do to expedite the review process?

Regards,
Przemek



[snip]

I'll have a look at it over the next few days and see if I can provide some 
feedback.  I was investigating other issues with ipfix some time back but got 
pulled into some other duties.  Hopefully I can get it working and provide some 
feedback.

Thanks,

- Greg



Re: [ovs-dev] [PATCH 4/4] netdev-tc-offloads: Add support for action set

2017-08-17 Thread Joe Stringer
On 17 August 2017 at 00:52, Paul Blakey  wrote:
>
>
> On 15/08/2017 20:24, Joe Stringer wrote:
>>
>> On 15 August 2017 at 01:19, Paul Blakey  wrote:
>>>
>>>
>>>
>>> On 15/08/2017 10:28, Paul Blakey wrote:




 On 15/08/2017 04:04, Joe Stringer wrote:
>
>
> On 8 August 2017 at 07:21, Roi Dayan  wrote:
>>
>>
>> From: Paul Blakey 
>>
>> Implement support for offloading ovs action set using
>> tc header rewrite action.
>>
>> Signed-off-by: Paul Blakey 
>> Reviewed-by: Roi Dayan 
>> ---
>
>
>
> 
>
>> @@ -274,6 +346,48 @@ netdev_tc_flow_dump_destroy(struct
>> netdev_flow_dump
>> *dump)
>>return 0;
>>}
>>
>> +static void
>> +parse_flower_rewrite_to_netlink_action(struct ofpbuf *buf,
>> +   struct tc_flower *flower)
>> +{
>> +char *mask = (char *) >rewrite.mask;
>> +char *data = (char *) >rewrite.key;
>> +
>> +for (int type = 0; type < ARRAY_SIZE(set_flower_map); type++) {
>> +char *put = NULL;
>> +size_t nested = 0;
>> +int len = ovs_flow_key_attr_lens[type].len;
>> +
>> +if (len <= 0) {
>> +continue;
>> +}
>> +
>> +for (int j = 0; j < ARRAY_SIZE(set_flower_map[type]); j++) {
>> +struct netlink_field *f = _flower_map[type][j];
>> +
>> +if (!f->size) {
>> +break;
>> +}
>
>
>
> Seems like maybe there's similar feedback here to the previous patch
> wrt sparsely populated array set_flower_map[].
>

 I'll answer there.
>>>
>>>
>>>
>>> Scratch that, I actually meant to write it here.
>>>
>>> With this map we have fast (direct-lookup) access on the put path
>>> (parse_put_flow_set_masked_action), and slow access on the reverse dump
>>> path (parse_flower_rewrite_to_netlink_action), which iterates over all
>>> indices, although it skips them quickly if they are empty.
>>>
>>> Changing this to a sparse map would slow down both paths, but dumping
>>> would be faster. We didn't write these maps for performance but for the
>>> convenience of adding new supported attributes.
>>
>>
>> OK, thanks for the explanation. I figured there must be a reason it
>> was indexed that way, I just didn't see it in my first review pass
>> through the series. Obviously fast path should be fast, and typically
>> we don't worry quite as much about dumping fast (although I have in
>> the past tested this to determine how many flows we will attempt to
>> populate in adverse conditions).
>>
>>> What we can do to both maps is:
>>>
>>> 1) Using a one-time init (ovsthread_once), we can create a fast
>>> (direct-lookup) map for both paths: generate a reverse map at init time
>>> and use that instead for dumping.
>>
>>
>> Until there's a clear, known advantage I'm not sure if we should do this.
>>
>>> 2) Rewrite it in a non-sparse way and use it for both.
>>
>>
>> I think that we generally are more interested in 'put' performance so
>> if this sacrifices it for 'dump' performance, then it's probably not a
>> good idea (though it may still be too early to really optimize these).
>>
>>> 3) Rewrite it using a switch statement; for now the logic is pretty much
>>> the same for all attributes.
>>>
>>> What do you think?
>>
>>
>> Do you think that this third option would improve the readability,
>> allow more compile-time checking on the code, or more code reuse?
>>
>
> Hi Joe, thanks for the review.
>
> Since it can be implemented by a map, as in this patch set, all the switch
> cases would have the same logic of translating from OVS set-action
> attributes to flower members, so I don't think it adds much readability
> unless you're looking at a single case.
> And regarding compile-time checking, we still need to convert from flower
> key members to tc pedit action raw offsets/values, so we won't have that
> there.
>
> I'd like to remove the tc flower struct in the future
> and work directly on struct match and the actions,
> so then we will have only the set-action-to-pedit conversion (and one
> conversion for all the rest).
>
> I think I'd rather keep the map until it no longer fits the task, or do
> the above, if that's OK with you.

That seems reasonable enough to me.

I'm not sure exactly what you have in mind for removing the tc flower
struct in future, but I recognise that there's a bunch of inefficient
translation between different formats in this path today so I hope
we'll be able to improve this in the future. One of the things I've
had in mind for some time is to update the dpif layer to use 'struct
flow' for matches rather than OVS netlink-formatted matches, so then
the conversion can be pushed closer down to the kernel and we 

[ovs-dev] [PATCH v2 2/2] nsh: add new flow key 'ttl'

2017-08-17 Thread Yi Yang
The IETF NSH draft will be approved by the end of August; the NSH header
format has been finalized and won't change anymore, so we need to follow
this final spec to implement NSH.

The kernel datapath also needs finalized uAPIs; they can't be changed
once they are merged.

This patch adds the new NSH key 'ttl'; the bits of the flags and mdtype
fields are also changed to follow the final spec.

A new action nsh_dec_ttl will be added later; that won't impact the
kernel uAPI.

Signed-off-by: Yi Yang 
---
 datapath/linux/compat/include/linux/openvswitch.h |   2 +-
 include/openvswitch/flow.h|   6 +-
 include/openvswitch/meta-flow.h   |  31 ++-
 include/openvswitch/nsh.h | 296 +++---
 include/openvswitch/packets.h |  12 +-
 lib/flow.c|  25 +-
 lib/flow.h|   2 +-
 lib/match.c   |  12 +-
 lib/meta-flow.c   |  56 +++-
 lib/meta-flow.xml |   6 +-
 lib/nx-match.c|  16 +-
 lib/odp-execute.c |  40 +--
 lib/odp-util.c| 191 +-
 lib/odp-util.h|   2 +-
 lib/packets.c |   1 +
 ofproto/ofproto-dpif-xlate.c  |   7 +-
 tests/nsh.at  |  40 +--
 17 files changed, 493 insertions(+), 252 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index d7f9029..c6bc5d7 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -503,9 +503,9 @@ enum ovs_nsh_key_attr {
 
 struct ovs_nsh_key_base {
__u8 flags;
+   __u8 ttl;
__u8 mdtype;
__u8 np;
-   __u8 pad;
__be32 path_hdr;
 };
 
diff --git a/include/openvswitch/flow.h b/include/openvswitch/flow.h
index a658a58..cd61fff 100644
--- a/include/openvswitch/flow.h
+++ b/include/openvswitch/flow.h
@@ -146,7 +146,7 @@ struct flow {
 struct eth_addr arp_tha;/* ARP/ND target hardware address. */
 ovs_be16 tcp_flags; /* TCP flags. With L3 to avoid matching L4. */
 ovs_be16 pad2;  /* Pad to 64 bits. */
-struct flow_nsh nsh;/* Network Service Header keys */
+struct ovs_key_nsh nsh; /* Network Service Header keys */
 
 /* L4 (64-bit aligned) */
 ovs_be16 tp_src;/* TCP/UDP/SCTP source port/ICMP type. */
@@ -159,13 +159,13 @@ struct flow {
 };
 BUILD_ASSERT_DECL(sizeof(struct flow) % sizeof(uint64_t) == 0);
 BUILD_ASSERT_DECL(sizeof(struct flow_tnl) % sizeof(uint64_t) == 0);
-BUILD_ASSERT_DECL(sizeof(struct flow_nsh) % sizeof(uint64_t) == 0);
+BUILD_ASSERT_DECL(sizeof(struct ovs_key_nsh) % sizeof(uint64_t) == 0);
 
 #define FLOW_U64S (sizeof(struct flow) / sizeof(uint64_t))
 
 /* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
 BUILD_ASSERT_DECL(offsetof(struct flow, igmp_group_ip4) + sizeof(uint32_t)
-  == sizeof(struct flow_tnl) + sizeof(struct flow_nsh) + 300
+  == sizeof(struct flow_tnl) + sizeof(struct ovs_key_nsh) + 300
   && FLOW_WC_SEQ == 40);
 
 /* Incremental points at which flow classification may be performed in
diff --git a/include/openvswitch/meta-flow.h b/include/openvswitch/meta-flow.h
index 436501f..14e6b59 100644
--- a/include/openvswitch/meta-flow.h
+++ b/include/openvswitch/meta-flow.h
@@ -1757,6 +1757,21 @@ enum OVS_PACKED_ENUM mf_field_id {
  */
 MFF_NSH_FLAGS,
 
+/* "nsh_ttl".
+ *
+ * TTL field in NSH base header.
+ *
+ * Type: u8.
+ * Maskable: no.
+ * Formatting: decimal.
+ * Prerequisites: NSH.
+ * Access: read/write.
+ * NXM: none.
+ * OXM: NXOXM_NSH_TTL(2) since OF1.3 and v2.8.
+ */
+MFF_NSH_TTL,
+
+
 /* "nsh_mdtype".
  *
  * mdtype field in NSH base header.
@@ -1767,7 +1782,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read-only.
  * NXM: none.
- * OXM: NXOXM_NSH_MDTYPE(2) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_MDTYPE(3) since OF1.3 and v2.8.
  */
 MFF_NSH_MDTYPE,
 
@@ -1781,7 +1796,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read-only.
  * NXM: none.
- * OXM: NXOXM_NSH_NP(3) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_NP(4) since OF1.3 and v2.8.
  */
 MFF_NSH_NP,
 
@@ -1795,7 +1810,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read/write.
  * NXM: none.
- * OXM: NXOXM_NSH_SPI(4) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_SPI(5) since OF1.3 and v2.8.
  */
 MFF_NSH_SPI,
 
@@ -1809,7 +1824,7 @@ enum 

[ovs-dev] [PATCH v2 1/2] nsh: rework NSH netlink keys and actions

2017-08-17 Thread Yi Yang
Per kernel datapath requirements, this patch changes OVS_KEY_ATTR_NSH
to a nested attribute and adds three new NSH sub-attribute keys:

OVS_NSH_KEY_ATTR_BASE: for the fixed-length NSH base header
OVS_NSH_KEY_ATTR_MD1:  for the fixed-length MD type 1 context
OVS_NSH_KEY_ATTR_MD2:  for variable-length MD type 2 metadata

The NSH match fields, the set action, and PUSH_NSH all use the nested
attribute format below:

OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD1
OVS_KEY_ATTR_NSH end

or

OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD2
OVS_KEY_ATTR_NSH end

In addition, the NSH encap and decap actions are renamed push_nsh
and pop_nsh to match the action naming convention.

Signed-off-by: Yi Yang 
---
 datapath/linux/compat/include/linux/openvswitch.h |  57 +-
 include/openvswitch/nsh.h |  31 +-
 include/openvswitch/packets.h |  11 +-
 lib/dpif-netdev.c |   4 +-
 lib/dpif.c|   4 +-
 lib/flow.c|  43 +-
 lib/match.c   |  12 +-
 lib/meta-flow.c   |  13 +-
 lib/nx-match.c|   4 +-
 lib/odp-execute.c |  75 ++-
 lib/odp-util.c| 738 ++
 lib/odp-util.h|   4 +
 lib/packets.c |  23 +-
 lib/packets.h |   5 +-
 ofproto/ofproto-dpif-ipfix.c  |   4 +-
 ofproto/ofproto-dpif-sflow.c  |   4 +-
 ofproto/ofproto-dpif-xlate.c  |  24 +-
 tests/nsh.at  |  28 +-
 18 files changed, 808 insertions(+), 276 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index bc6c94b..d7f9029 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -369,7 +369,7 @@ enum ovs_key_attr {
 #ifndef __KERNEL__
/* Only used within userspace data path. */
OVS_KEY_ATTR_PACKET_TYPE,  /* be32 packet type */
-   OVS_KEY_ATTR_NSH,  /* struct ovs_key_nsh */
+   OVS_KEY_ATTR_NSH,  /* Nested set of ovs_nsh_key_* */
 #endif
 
__OVS_KEY_ATTR_MAX
@@ -492,13 +492,27 @@ struct ovs_key_ct_labels {
};
 };
 
-struct ovs_key_nsh {
-__u8 flags;
-__u8 mdtype;
-__u8 np;
-__u8 pad;
-__be32 path_hdr;
-__be32 c[4];
+enum ovs_nsh_key_attr {
+   OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
+   OVS_NSH_KEY_ATTR_MD1,   /* struct ovs_nsh_key_md1. */
+   OVS_NSH_KEY_ATTR_MD2,   /* variable-length octets. */
+   __OVS_NSH_KEY_ATTR_MAX
+};
+
+#define OVS_NSH_KEY_ATTR_MAX (__OVS_NSH_KEY_ATTR_MAX - 1)
+
+struct ovs_nsh_key_base {
+   __u8 flags;
+   __u8 mdtype;
+   __u8 np;
+   __u8 pad;
+   __be32 path_hdr;
+};
+
+#define NSH_MD1_CONTEXT_SIZE 4
+
+struct ovs_nsh_key_md1 {
+   __be32 context[NSH_MD1_CONTEXT_SIZE];
 };
 
 /* OVS_KEY_ATTR_CT_STATE flags */
@@ -793,24 +807,7 @@ struct ovs_action_push_eth {
struct ovs_key_ethernet addresses;
 };
 
-#define OVS_ENCAP_NSH_MAX_MD_LEN 16
-/*
- * struct ovs_action_encap_nsh - %OVS_ACTION_ATTR_ENCAP_NSH
- * @flags: NSH header flags.
- * @mdtype: NSH metadata type.
- * @mdlen: Length of NSH metadata in bytes.
- * @np: NSH next_protocol: Inner packet type.
- * @path_hdr: NSH service path id and service index.
- * @metadata: NSH metadata for MD type 1 or 2
- */
-struct ovs_action_encap_nsh {
-uint8_t flags;
-uint8_t mdtype;
-uint8_t mdlen;
-uint8_t np;
-__be32 path_hdr;
-uint8_t metadata[OVS_ENCAP_NSH_MAX_MD_LEN];
-};
+#define OVS_PUSH_NSH_MAX_MD_LEN 248
 
 /**
  * enum ovs_nat_attr - Attributes for %OVS_CT_ATTR_NAT.
@@ -887,8 +884,8 @@ enum ovs_nat_attr {
  * @OVS_ACTION_ATTR_PUSH_ETH: Push a new outermost Ethernet header onto the
  * packet.
  * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the packet.
- * @OVS_ACTION_ATTR_ENCAP_NSH: encap NSH action to push NSH header.
- * @OVS_ACTION_ATTR_DECAP_NSH: decap NSH action to remove NSH header.
+ * @OVS_ACTION_ATTR_PUSH_NSH: push NSH header to the packet.
+ * @OVS_ACTION_ATTR_POP_NSH: pop the outermost NSH header off the packet.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -930,8 +927,8 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
OVS_ACTION_ATTR_CLONE, /* Nested OVS_CLONE_ATTR_*.  */
OVS_ACTION_ATTR_METER, /* u32 meter number. */
-   OVS_ACTION_ATTR_ENCAP_NSH,

[ovs-dev] [PATCH v2 0/2] nsh: rework NSH netlink keys and actions

2017-08-17 Thread Yi Yang
v1->v2
  - Reworked per kernel datapath review comments
  - Added the new NSH key 'ttl'
  - Added many helpers in nsh.h and replaced much code
    with these helpers
  - nsh.h now includes the latest NSH spec
  - The bits of the flags and mdtype fields have changed

This patch series reworks the NSH netlink keys and actions
per kernel datapath requirements. OVS_KEY_ATTR_NSH is
changed to a nested key, and encap_nsh and decap_nsh are
renamed push_nsh and pop_nsh. PUSH_NSH uses nested
OVS_KEY_ATTR_NSH keys to transfer the NSH header data.

It also adds the new NSH key 'ttl' to follow the latest
IETF NSH draft:

https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/

I have double-checked with one of its authors: this is the final
version, which will be approved as an IETF RFC, and the NSH header
format won't change anymore.

Yi Yang (2):
  nsh: rework NSH netlink keys and actions
  nsh: add new flow key 'ttl'

 datapath/linux/compat/include/linux/openvswitch.h |  57 +-
 include/openvswitch/flow.h|   6 +-
 include/openvswitch/meta-flow.h   |  31 +-
 include/openvswitch/nsh.h | 319 -
 include/openvswitch/packets.h |   9 +-
 lib/dpif-netdev.c |   4 +-
 lib/dpif.c|   4 +-
 lib/flow.c|  68 +-
 lib/flow.h|   2 +-
 lib/match.c   |  24 +-
 lib/meta-flow.c   |  69 +-
 lib/meta-flow.xml |   6 +-
 lib/nx-match.c|  20 +-
 lib/odp-execute.c |  83 ++-
 lib/odp-util.c| 757 +-
 lib/odp-util.h|   4 +
 lib/packets.c |  24 +-
 lib/packets.h |   5 +-
 ofproto/ofproto-dpif-ipfix.c  |   4 +-
 ofproto/ofproto-dpif-sflow.c  |   4 +-
 ofproto/ofproto-dpif-xlate.c  |  31 +-
 tests/nsh.at  |  54 +-
 22 files changed, 1179 insertions(+), 406 deletions(-)

-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4] route-table: Remove netdevs in netdev_hash when deleted

2017-08-17 Thread Greg Rose

On 08/14/2017 12:11 AM, fukaige wrote:

From: Kaige Fu 

Start a virtual machine with its backend tap device attached to a brought-up
Linux bridge. If we delete the Linux bridge while the VM is still running,
we get the following error when trying to create an OVS bridge with the
same name.

The reason is that the ovs-router subsystem adds the Linux bridge to
netdev_shash but does not remove it when the bridge is deleted in this
situation. When the bridge is deleted, OVS receives an RTM_DELLINK message;
take that chance to remove the bridge from netdev_shash.

ovs-vsctl: Error detected while setting up 'br-eth'.  See ovs-vswitchd log for 
details.

ovs-vswitchd log:
2017-05-11T03:45:25.293Z|00026|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports recirculation
2017-05-11T03:45:25.293Z|00027|ofproto_dpif|INFO|system@ovs-system: MPLS label 
stack length probed as 1
2017-05-11T03:45:25.293Z|00028|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports unique flow ids
2017-05-11T03:45:25.293Z|00029|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports ct_state
2017-05-11T03:45:25.293Z|00030|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports ct_zone
2017-05-11T03:45:25.293Z|00031|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports ct_mark
2017-05-11T03:45:25.293Z|00032|ofproto_dpif|INFO|system@ovs-system: Datapath 
supports ct_label
2017-05-11T03:45:25.364Z|1|ofproto_dpif_upcall(handler226)|INFO|received 
packet on unassociated datapath port 0
2017-05-11T03:45:25.368Z|00033|netdev_linux|WARN|ethtool command ETHTOOL_GFLAGS 
on network device br-eth failed: No such device
2017-05-11T03:45:25.368Z|00034|dpif|WARN|system@ovs-system: failed to add 
br-eth as port: No such device
2017-05-11T03:45:25.368Z|00035|bridge|INFO|bridge br-eth: using datapath ID 
2a51cf9f2841
2017-05-11T03:45:25.368Z|00036|connmgr|INFO|br-eth: added service controller 
"punix:/var/run/openvswitch/br-eth.mgmt"

Signed-off-by: Kaige Fu 
---
  lib/route-table.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/route-table.c b/lib/route-table.c
index 67fda31..4e40c37 100644
--- a/lib/route-table.c
+++ b/lib/route-table.c
@@ -33,6 +33,7 @@
  #include "ovs-router.h"
  #include "packets.h"
  #include "rtnetlink.h"
+#include "tnl-ports.h"
  #include "openvswitch/vlog.h"

  /* Linux 2.6.36 added RTA_MARK, so define it just in case we're building with
@@ -333,10 +334,16 @@ name_table_init(void)


  static void
-name_table_change(const struct rtnetlink_change *change OVS_UNUSED,
+name_table_change(const struct rtnetlink_change *change,
void *aux OVS_UNUSED)
  {
  /* Changes to interface status can cause routing table changes that some
   * versions of the linux kernel do not advertise for some reason. */
  route_table_valid = false;
+
+if (change && change->nlmsg_type == RTM_DELLINK) {
+if (change->ifname) {
+tnl_port_map_delete_ipdev(change->ifname);
+}
+}
  }


Works as advertised...

Tested-by: Greg Rose 
Reviewed-by: Greg Rose 




Re: [ovs-dev] [dpdk-dev] [ovs-dpdk-tests] where is ovs-dpdk test case?

2017-08-17 Thread Joe Stringer
On 17 August 2017 at 05:07, Kavanagh, Mark B  wrote:
>
>>From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Sam
>>Sent: Thursday, August 17, 2017 7:31 AM
>>To: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org; d...@dpdk.org
>
> Hi Sam,
>
> Just a heads-up that d...@dpdk.org is strictly for DPDK development threads - 
> I've removed it from this thread.
>
> Also, responses to your queries are inline.
>
> Thanks!
> Mark
>
>
>>Subject: [dpdk-dev] [ovs-dpdk-tests] where is ovs-dpdk test case?
>>
>>Hi all,
>>
>>I'm working with ovs-dpdk and I want to run the ovs-dpdk test cases. But I
>>found there are no test cases for 'netdev' type bridges and none for the
>>ovs-dpdk datapath (which is pmd_thread_main).
>
> Do you mean unit tests (as in OvS autotests), or integration tests?
>
> If the former, then unfortunately there are none at present within OvS (but 
> feel free to add some!); if you'd like to run integration tests, you could 
> take a look at VSPERF: https://etherpad.opnfv.org/p/vSwitchIntegrationTests.

"make check" runs unit tests against the userspace datapath, and it
should work with a DPDK-enabled build of OVS. I don't know how much it
tests the DPDK-specific portions of this, however. For example, it
won't use DPDK devices.


[ovs-dev] [PATCHv2] checkpatch: Enforce bracing around conditionals.

2017-08-17 Thread Joe Stringer
The coding style states that BSD-style brace placement should be used,
and even single statements should be enclosed. Add checks to checkpatch
for this, particularly for 'else' statements.

Signed-off-by: Joe Stringer 
---
v2: Combine in same check as if_and_for_end_with_bracket_check
---
 utilities/checkpatch.py | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index 43f10bb3ded3..185ddaf0d5e9 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -95,6 +95,8 @@ __regex_ends_with_bracket = \
 __regex_ptr_declaration_missing_whitespace = re.compile(r'[a-zA-Z0-9]\*[^*]')
 __regex_is_comment_line = re.compile(r'^\s*(/\*|\*\s)')
 __regex_trailing_operator = re.compile(r'^[^ ]* [^ ]*[?:]$')
+__regex_conditional_else_bracing = re.compile(r'^\s*else\s*{?$')
+__regex_conditional_else_bracing2 = re.compile(r'^\s*}\selse\s*$')
 
 skip_leading_whitespace_check = False
 skip_trailing_whitespace_check = False
@@ -186,6 +188,10 @@ def if_and_for_end_with_bracket_check(line):
 return True
 if __regex_ends_with_bracket.search(line) is None:
 return False
+if __regex_conditional_else_bracing.match(line) is not None:
+return False
+if __regex_conditional_else_bracing2.match(line) is not None:
+return False
 return True
 
 
-- 
2.14.1
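As a quick sanity check of the two patterns above, this standalone sketch (not part of the patch; `flags_else_bracing` is an illustrative name) shows which 'else' lines they flag:

```python
import re

# The two patterns added by the patch, copied verbatim from checkpatch.py.
__regex_conditional_else_bracing = re.compile(r'^\s*else\s*{?$')
__regex_conditional_else_bracing2 = re.compile(r'^\s*}\selse\s*$')

def flags_else_bracing(line):
    """Return True when the line violates the '} else {' brace style."""
    if __regex_conditional_else_bracing.match(line) is not None:
        return True
    return __regex_conditional_else_bracing2.match(line) is not None

print(flags_else_bracing("    } else {"))  # False: correct BSD style
print(flags_else_bracing("    else {"))    # True: '}' missing before else
print(flags_else_bracing("    } else"))    # True: '{' missing after else
```

In the patch itself these checks return False from if_and_for_end_with_bracket_check, i.e. the line fails the bracing check.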



Re: [ovs-dev] [PATCH] checkpatch: Enforce bracing around conditionals.

2017-08-17 Thread Joe Stringer
On 17 August 2017 at 07:35, Aaron Conole  wrote:
> Hi Joe,
>
> Joe Stringer  writes:
>
>> The coding style states that BSD-style brace placement should be used,
>> and even single statements should be enclosed. Add checks to checkpatch
>> for this.
>>
>> Signed-off-by: Joe Stringer 
>> ---
>>  utilities/checkpatch.py | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
>> index 43f10bb3ded3..c8381c98403b 100755
>> --- a/utilities/checkpatch.py
>> +++ b/utilities/checkpatch.py
>> @@ -95,6 +95,8 @@ __regex_ends_with_bracket = \
>>  __regex_ptr_declaration_missing_whitespace = 
>> re.compile(r'[a-zA-Z0-9]\*[^*]')
>>  __regex_is_comment_line = re.compile(r'^\s*(/\*|\*\s)')
>>  __regex_trailing_operator = re.compile(r'^[^ ]* [^ ]*[?:]$')
>> +__regex_conditional_else_bracing = re.compile(r'^\s*else\s*{?$')
>> +__regex_conditional_else_bracing2 = re.compile(r'^\s*}\selse\s*$')
>>
>>  skip_leading_whitespace_check = False
>>  skip_trailing_whitespace_check = False
>> @@ -212,6 +214,12 @@ def trailing_operator(line):
>>  return __regex_trailing_operator.match(line) is not None
>>
>>
>> +def if_else_bracing_check(line):
>> +if __regex_conditional_else_bracing.match(line) is not None:
>> +return True
>> +return __regex_conditional_else_bracing2.match(line) is not None
>> +
>> +
>
> Would it make more sense to put this in with the other bracing checks in
> if_and_for_end_with_bracket_check ?

Yes! Apparently I missed this before. Thanks, I'll respin.


Re: [ovs-dev] [patch_v4 2/2] dpif-netdev: Refactor some pmd stats.

2017-08-17 Thread Darrell Ball


On 8/17/17, 1:03 PM, "ovs-dev-boun...@openvswitch.org on behalf of Darrell 
Ball"  wrote:

On 8/17/17, 6:48 AM, "ovs-dev-boun...@openvswitch.org on behalf of Jan 
Scheurich"  wrote:

Hi Darrell,

Please find my specific comments below.

I have a general concern about the pmd-stats-show counters: the output 
refers to the number of processed packets (e.g. processing cycles/pkt etc.), 
but the data actually presented is the total number of *passes of packets 
through the datapath*. 

In the early OVS days there was little difference, but with an increasing 
amount of recirculation (for tunnel decap, pop_mpls, bond selection, 
conntrack), nowadays each packet passes through the datapath several times, 
and the output becomes confusing for users, especially as it does not contain 
any indication of the average number of datapath passes per packet.

I would strongly suggest that we add a new PMD stats counter for the 
actually received packets, which allows us to calculate and include the 
average number of passes per packet and the true average processing cost per 
packet. 

What do you think?


[Darrell] This is an excellent proposal. We have had this level of 
recirculation for 3+ years, so it is nothing new, but better late to fix this 
than never.

 Processing cycles is mainly useful for checking single 
flows and comparing relative numbers for code changes, however it is definitely 

>>

[Darrell] JTBC, the above should be "Average processing cycles" rather than 
"Processing cycles".

<

 better to use the actual packets received, rather than 
packet passes. 

 I can do this enhancement/fix as part of this patch.

 I will add the counter and go one step further and display 
both received packets and total recirculations. The processing cycles will be 
based on received packets, as you pointed out.


BR, Jan

> The per packets stats are presently overlapping in that miss stats include
> lost stats; make these stats non-overlapping for clarity and make this clear
> in the dp_stat_type enum.  This also eliminates the need to increment two
> 'miss' stats for a single packet.
> 
> The subtable lookup stats is renamed to make it clear that it relates to
> masked lookups.
> The stats that total to the number of packets seen are defined in
> dp_stat_type and an api is created to total the stats in case these stats are
> further divided in the future.
> 
> The pmd stats test is enhanced to include megaflow stats counting and
> checking.
> Also, miss and lost stats are annotated to make it clear what they mean.
> 
> Signed-off-by: Darrell Ball 
> ---
>  lib/dpif-netdev.c | 78 ++--
> ---
>  tests/pmd.at  | 31 +-
>  2 files changed, 74 insertions(+), 35 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 17e1666..dfc6684
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -323,12 +323,19 @@ static struct dp_netdev_port
> *dp_netdev_lookup_port(const struct dp_netdev *dp,
>  OVS_REQUIRES(dp->port_mutex);
> 
>  enum dp_stat_type {
> -DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
> -DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
> -DP_STAT_MISS,   /* Packets that did not match. */
> -DP_STAT_LOST,   /* Packets not passed up to the client. */
> -DP_STAT_LOOKUP_HIT, /* Number of subtable lookups for flow table
> -   hits */
> +DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
> +DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
> +DP_STAT_MISS,   /* Packets that did not match and upcall was
> +   done. */
> +DP_STAT_LOST,   /* Packets that did not match and upcall was
> + 

Re: [ovs-dev] [patch_v4 2/2] dpif-netdev: Refactor some pmd stats.

2017-08-17 Thread Darrell Ball


On 8/17/17, 6:48 AM, "ovs-dev-boun...@openvswitch.org on behalf of Jan 
Scheurich"  wrote:

Hi Darrell,

Please find my specific comments below.

I have a general concern for the pmd-stats-show counters: The output refers 
to the number of processed packets (e.g. processing cycles/pkt etc). But the 
data actually presented is the total number of *passes of packets through the 
datapath*. 

In the early OVS days there was little difference, but with an increasing 
amount of recirculation (for tunnel decap, pop_mpls, bond selection, 
conntrack), nowadays each packet passes through the datapath several times and 
the output becomes confusing for users, especially as it does not contain any 
indication of the average number of datapath passes per packet.

I would strongly suggest that we add a new PMD stats counter for the 
actually received packets, which allows us to calculate and include the average 
number of passes per packet and true average processing cost per pkt. 

What do you think?


[Darrell] This is an excellent proposal. We have had this level of 
recirculations for 3+ years, so it is nothing new, but better late to fix this 
than never.
 Processing cycles is mainly useful for checking single flows 
and comparing relative numbers for code changes, however it is definitely 
 better to use the actual packets received, rather than packet 
passes. 
 I can do this enhancement/fix as part of this patch.
 I will add the counter and go one step further and display 
both received packets and total recirculations. The processing cycles will be
 based on received packets, as you pointed out.


BR, Jan

> The per packets stats are presently overlapping in that miss stats include
> lost stats; make these stats non-overlapping for clarity and make this 
clear
> in the dp_stat_type enum.  This also eliminates the need to increment two
> 'miss' stats for a single packet.
> 
> The subtable lookup stats is renamed to make it clear that it relates to
> masked lookups.
> The stats that total to the number of packets seen are defined in
> dp_stat_type and an api is created to total the stats in case these stats 
are
> further divided in the future.
> 
> The pmd stats test is enhanced to include megaflow stats counting and
> checking.
> Also, miss and lost stats are annotated to make it clear what they mean.
> 
> Signed-off-by: Darrell Ball 
> ---
>  lib/dpif-netdev.c | 78 ++--
> ---
>  tests/pmd.at  | 31 +-
>  2 files changed, 74 insertions(+), 35 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 17e1666..dfc6684
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -323,12 +323,19 @@ static struct dp_netdev_port
> *dp_netdev_lookup_port(const struct dp_netdev *dp,
>  OVS_REQUIRES(dp->port_mutex);
> 
>  enum dp_stat_type {
> -DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc).
> */
> -DP_STAT_MASKED_HIT, /* Packets that matched in the flow 
table.
> */
> -DP_STAT_MISS,   /* Packets that did not match. */
> -DP_STAT_LOST,   /* Packets not passed up to the client. 
*/
> -DP_STAT_LOOKUP_HIT, /* Number of subtable lookups for flow
> table
> -   hits */
> +DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
> +DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
> +DP_STAT_MISS,   /* Packets that did not match and upcall was
> +   done. */
> +DP_STAT_LOST,   /* Packets that did not match and upcall was
> +   not done. */

If I understand the implementation correctly, this counts packets for which 
an upcall was done but failed so that the packet was dropped. There are no 
packets with a miss for which we currently do not do an upcall. Why do you 
suggest to change the meaning of the counter?


[Darrell] The usage now is:

miss: means an upcall was actually done; meaning we processed at the ofproto 
layer successfully. This brings this counter in sync with the counter in
the ofproto upcall code itself (n_upcalls++;), which counts ‘successful 
upcalls’. This is also what the user expects by OVS “miss upcall”; the user 
expects
a miss upcall to mean the packet was processed by the ofproto layer; the user 
really does not care that an ‘attempt was made’ to process at the ofproto layer.

lost: now means specifically a failed upcall; I actually had no idea what it 
might mean (based on the o/p of 
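The non-overlapping accounting being discussed can be sketched as follows. This is an illustrative model, not the dpif-netdev code; names are shortened from the C enum:

```python
from enum import Enum

class DpStat(Enum):
    # Each datapath pass increments exactly one of these, so the counters
    # are non-overlapping and sum to the number of packet passes seen.
    EXACT_HIT = "exact match (emc)"
    MASKED_HIT = "matched in the flow table"
    MISS = "no match; upcall done"
    LOST = "no match; upcall failed, packet dropped"

def packets_seen(stats):
    """Total the per-outcome counters back into packet passes."""
    return sum(stats.get(s, 0) for s in DpStat)

stats = {DpStat.EXACT_HIT: 90, DpStat.MASKED_HIT: 8, DpStat.MISS: 2}
print(packets_seen(stats))  # 100
```

With the old overlapping definition, a lost packet incremented both MISS and LOST, so no such total existed; the patch makes the sum well defined.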

Re: [ovs-dev] [PATCH v2] datapath-windows: Do not modify port field for ICMP during SNAT/DNAT

2017-08-17 Thread aserdean
I applied this on master and branch-2.8

Thanks,
Alin.

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Alin Serdean
> Sent: Thursday, August 17, 2017 10:47 PM
> To: Anand Kumar ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2] datapath-windows: Do not modify port
> field for ICMP during SNAT/DNAT
> 
> Thanks a lot for the patch and changes!
> 
> Acked-by: Alin Gabriel Serdean 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Anand Kumar
> > Sent: Wednesday, August 16, 2017 1:29 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH v2] datapath-windows: Do not modify port
> > field for ICMP during SNAT/DNAT
> >
> > During SNAT/DNAT, we should not be updating the port field of
> > ct_endpoint struct, as ICMP packets do not have port information.
> > Since port and icmp_id are overlapped in ct_endpoint struct, icmp_id gets
> > changed.
> > As a result, NAT lookup fails to find a matching entry.
> >
> > This patch addresses this issue by not modifying icmp_id field during
> > SNAT/DNAT only for ICMP traffic
> >
> > The current NAT module doesn't take the ICMP type/code into account
> > during the lookups. Fix this to make it similar with the other
> > conntrack module.
> >
> > Signed-off-by: Anand Kumar 
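The overlap being described can be reproduced with a tiny ctypes sketch. This is a hypothetical layout that mirrors only the port/icmp_id overlap, not the real ct_endpoint struct:

```python
import ctypes

class CtEndpointU(ctypes.Union):
    # port and icmp_id share the same storage, as in the ct_endpoint union.
    _fields_ = [("port", ctypes.c_uint16),
                ("icmp_id", ctypes.c_uint16)]

ep = CtEndpointU()
ep.icmp_id = 0x1234     # id carried by the ICMP echo request
ep.port = 0x0050        # SNAT rewriting the "port" field...
print(hex(ep.icmp_id))  # 0x50: the ICMP id was silently clobbered
```

Skipping the port rewrite for ICMP traffic, as the patch does, leaves icmp_id intact so the reverse NAT lookup can match.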
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



Re: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to use ICMP Type and Code

2017-08-17 Thread aserdean
I applied this on master and branch-2.8

Thanks,
Alin.

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of aserd...@ovn.org
> Sent: Thursday, August 17, 2017 10:48 PM
> To: 'Anand Kumar' ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to
> use ICMP Type and Code
> 
> Acked-by: Alin Gabriel Serdean 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Anand Kumar
> > Sent: Wednesday, August 16, 2017 9:23 PM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to
> > use ICMP Type and Code
> >
> > - Also add some padding for the ct_endpoint's union, so that each
> > member of ct_endpoint's union are of same size.
> >
> > Co-authored-by: Sairam Venugopal 
> > Signed-off-by: Anand Kumar 
> > ---
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



Re: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to use ICMP Type and Code

2017-08-17 Thread aserdean
Acked-by: Alin Gabriel Serdean 

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Anand Kumar
> Sent: Wednesday, August 16, 2017 9:23 PM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to use
> ICMP Type and Code
> 
> - Also add some padding for the ct_endpoint's union, so that each member
> of ct_endpoint's union are of same size.
> 
> Co-authored-by: Sairam Venugopal 
> Signed-off-by: Anand Kumar 
> ---



[ovs-dev] [PATCH v1 1/1] Build the JSON C extension for the Python lib

2017-08-17 Thread Terry Wilson
The JSON C extension performs much better than the pure Python
version, so build it when producing RPMs.

Signed-off-by: Terry Wilson 
---
 rhel/openvswitch-fedora.spec.in | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
index 59e8ff8..29982e8 100644
--- a/rhel/openvswitch-fedora.spec.in
+++ b/rhel/openvswitch-fedora.spec.in
@@ -125,9 +125,9 @@ Tailored Open vSwitch SELinux policy
 %package -n %{_py2}-openvswitch
 Summary: Open vSwitch python2 bindings
 License: ASL 2.0
-BuildArch: noarch
 Requires: %{_py2}
 Requires: %{_py2}-six
+Requires: openvswitch-devel
 %{?python_provide:%python_provide python2-openvswitch = %{version}-%{release}}
 %description -n %{_py2}-openvswitch
 Python bindings for the Open vSwitch database
@@ -136,9 +136,9 @@ Python bindings for the Open vSwitch database
 %package -n python3-openvswitch
 Summary: Open vSwitch python3 bindings
 License: ASL 2.0
-BuildArch: noarch
 Requires: python3
 Requires: python3-six
+Requires: openvswitch-devel
 %{?python_provide:%python_provide python3-openvswitch = %{version}-%{release}}
 
 %description -n python3-openvswitch
@@ -227,7 +227,8 @@ Docker network plugins for OVN.
--with-dpdk=$(dirname %{_datadir}/dpdk/*/.config) \
 %endif
--enable-ssl \
-   --with-pkidir=%{_sharedstatedir}/openvswitch/pki
+   --with-pkidir=%{_sharedstatedir}/openvswitch/pki \
+   --enable-shared
 
 /usr/bin/perl build-aux/dpdkstrip.pl \
 %if %{with dpdk}
@@ -283,14 +284,25 @@ install -p -m 0755 
rhel/etc_sysconfig_network-scripts_ifup-ovs \
 $RPM_BUILD_ROOT/%{_sysconfdir}/sysconfig/network-scripts/ifup-ovs
 
 install -d -m 0755 $RPM_BUILD_ROOT%{python2_sitelib}
-cp -a $RPM_BUILD_ROOT/%{_datadir}/openvswitch/python/* \
+cp -a $RPM_BUILD_ROOT/%{_datadir}/openvswitch/python/ovstest \
$RPM_BUILD_ROOT%{python2_sitelib}
 
+# We can't build the Python lib until after the libopenvswitch install
+pushd python
+# py2/py3_build, but with extra CFLAGS--is there a better way to do this?
+CFLAGS="%{optflags} -I ../include -L $RPM_BUILD_ROOT%{_libdir}" %{__python} \
+setup.py %{?py_setup_args} build --executable="%{__python2} -s"
+
+%if 0%{?fedora} > 22 || %{with build_python3}
+CFLAGS="%{optflags} -I ../include -L $RPM_BUILD_ROOT%{_libdir}" %{__python3} \
+setup.py %{?py_setup_args} build --executable="%{__python3} -s"
+%endif
+
+%py2_install
 %if 0%{?fedora} > 22 || %{with build_python3}
-install -d -m 0755 $RPM_BUILD_ROOT%{python3_sitelib}
-cp -a $RPM_BUILD_ROOT/%{_datadir}/openvswitch/python/ovs \
-   $RPM_BUILD_ROOT%{python3_sitelib}
+%py3_install
 %endif
+popd
 
 rm -rf $RPM_BUILD_ROOT/%{_datadir}/openvswitch/python/
 
@@ -481,12 +493,14 @@ fi
 %{_datadir}/selinux/packages/%{name}/openvswitch-custom.pp
 
 %files -n %{_py2}-openvswitch
-%{python2_sitelib}/ovs
+%{python2_sitearch}/ovs
+%{python2_sitearch}/ovs-*.egg-info
 %doc COPYING
 
 %if 0%{?fedora} > 22 || %{with build_python3}
 %files -n python3-openvswitch
-%{python3_sitelib}/ovs
+%{python3_sitearch}/ovs
+%{python3_sitearch}/ovs-*.egg-info
 %doc COPYING
 %endif
 
@@ -508,6 +522,7 @@ fi
 %files devel
 %{_libdir}/*.a
 %{_libdir}/*.la
+%{_libdir}/lib*.so*
 %{_libdir}/pkgconfig/*.pc
 %{_includedir}/openvswitch/*
 %{_includedir}/openflow/*
-- 
2.9.4



Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for recirc packets

2017-08-17 Thread Darrell Ball


On 8/17/17, 5:22 AM, "Jan Scheurich"  wrote:

The RSS hash threshold method looks like the only pseudo-random 
criterion that we can use that produces a consistent result for every packet of 
a flow and does not require more information. Of course elephant flows with an 
unlucky hash value might never get to use the EMC, but that is a risk we have 
with any stateless selection scheme.



[Darrell] It is probably something I know by another name, but JTBC, can 
you define the “RSS hash threshold method” ?



I am referring to Billy's proposal 
(https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/336509.html)


In essence it suggests only selecting packets for EMC lookup whose RSS 
hash is above a certain threshold. The lookup probability is determined by the 
threshold (e.g. a threshold of 0.75 * UINT32_MAX corresponds to 25%). It is 
pseudo-random since, assuming that the hash result is uniformly distributed, 
flows will profit from EMC lookup with the same probability.
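A minimal sketch of that selection scheme (illustrative only; `should_lookup_emc` is a made-up name, and the real patch would test the packet's RSS hash in C):

```python
import random

UINT32_MAX = 2**32 - 1

def should_lookup_emc(rss_hash, lookup_prob):
    # Select the packet when its hash exceeds the threshold.  All packets
    # of a flow share one RSS hash, so a flow is either always or never
    # selected: pseudo-random across flows, consistent within a flow.
    threshold = int((1.0 - lookup_prob) * UINT32_MAX)
    return rss_hash > threshold

# With uniformly distributed hashes, roughly lookup_prob of all flows
# end up selected for EMC lookup.
random.seed(1)
hashes = [random.getrandbits(32) for _ in range(100_000)]
fraction = sum(should_lookup_emc(h, 0.25) for h in hashes) / len(hashes)
print(f"{fraction:.2f}")  # close to 0.25
```

The open question discussed below is how to adapt lookup_prob dynamically to the EMC fill level and/or hit rate.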


[Darrell] ahh, there is no actual patch yet, just an e-mail.
I see, you have coined the term "RSS hash threshold method" 
for the approach; the nomenclature makes sense now.
I'll have separate comments, of course, on the proposal itself.


  
The new thing required will be the dynamic adjustment of lookup 
probability to the EMC fill level and/or hit ratio.



[Darrell] Did you mean insertion probability rather than lookup probability 
?



No, I actually meant dynamic adaptation of lookup probability. We don't 
want to reduce the EMC lookup probability when the EMC is not yet overloaded, 
but only when the EMC hit rate degrades due to collisions. When we devise an 
algorithm to adapt lookup probability, we can study if it could make sense to 
also dynamically adjust the currently fixed (configurable) EMC insertion 
probability based on EMC fill level and/or hit rate.


[Darrell] Now that I know what you are referring to above, it is a lot easier 
to make the linkage. 



BR, Jan







Re: [ovs-dev] [PATCH 1/2] tests/system-offloads-traffic.at: add sanity check

2017-08-17 Thread Joe Stringer
On 17 August 2017 at 00:17, Roi Dayan  wrote:
>
>
> On 17/08/2017 08:32, Roi Dayan wrote:
>>
>>
>>
>> On 17/08/2017 01:17, Joe Stringer wrote:
>>>
>>> On 16 August 2017 at 05:14, Roi Dayan  wrote:

 Doing dump-flows also alters the netdev ports list.
 So doing it before the actual test adds a check to
 make sure we don't break that list.

 Signed-off-by: Roi Dayan 
 Reviewed-by: Paul Blakey 
>>>
>>>
>>> I'm actually not sure what the requirements are to run these offload
>>> tests. I tried running them on a 4.4 kernel, and the first test passed
>>> while the second failed; I assume that this is because 4.4's TC flower
>>> support is not new enough.
>>>
>>> Then I tried with a 4.12 kernel and neither test passed, and I have
>>> extra flows being reported in the dump-flows output.
>>>
>>> I believe that I understand what this patch is trying to achieve, but
>>> I don't know how I'm supposed to validate it.
>>>
>>
>
> In the past you had an issue that cls_flower was not configured
> in your kernel. could be the same issue now?

Maybe it is, I changed which setup I was testing in. I'll double-check.


Re: [ovs-dev] [PATCH 0/3] updated selinux policy for Open vSwitch

2017-08-17 Thread Flavio Leitner
On Wed, 16 Aug 2017 16:04:49 -0400
Aaron Conole  wrote:

> This series brings about a policy update to openvswitch allowing it to
> run on a RHEL / Fedora system, even as a non-root user, with selinux set
> to Enforcing.
> 
> The first two patches make some changes to the way the selinux policy is
> built to have a macro-like effect, allowing the dpdk policy to be enabled
> or disabled based on the build.  This is chosen instead of using an selinux
> boolean, because it is more transparent to the end user.
> 
> All of this work was tested by passing traffic, including via a dpdk bridge.
> 
> Aaron Conole (3):
>   rhel: make the selinux policy intermediate
>   makefile: hook up dpdkstrip preprocessor
>   selinux: update policy to reflect non-root and dpdk support
> 
>  Makefile.am  |  4 
>  rhel/openvswitch-fedora.spec.in  |  1 +
>  selinux/automake.mk  |  2 +-
>  selinux/openvswitch-custom.te| 16 -
>  selinux/openvswitch-custom.te.in | 52 
> 
>  5 files changed, 58 insertions(+), 17 deletions(-)
>  delete mode 100644 selinux/openvswitch-custom.te
>  create mode 100644 selinux/openvswitch-custom.te.in
> 

Looks good to me.
Acked-by: Flavio Leitner 






Re: [ovs-dev] [PATCH v1] ovn: Fix BFD error config on gateway

2017-08-17 Thread Anil Venkata
Hi Zhenyu Gao

Is it possible for you to add a test case for this scenario? This test on
the master code (without your patch) should fail, and your patch should
make the test pass.

Thanks
Anil

On Wed, Aug 16, 2017 at 12:17 PM, Zhenyu Gao 
wrote:

> The bfd_calculate_chassis function calculates a gateway's peer
> datapaths to figure out which chassis should be added to bfd_chassis.
> But in most circumstances, a gateway's peer datapath has NO chassis.
> So it may disable BFD on some tunnel ports. Then a port on a remote
> ovs cannot send packets out because it believes all remote gateways
> are down.
>
> This patch will go through the whole graph and visit all datapath
> ports which connect with the gateway.
>
> Signed-off-by: Zhenyu Gao 
> ---
>  ovn/controller/bfd.c | 102 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 84 insertions(+), 18 deletions(-)
>
> diff --git a/ovn/controller/bfd.c b/ovn/controller/bfd.c
> index d1448b1..dccfc98 100644
> --- a/ovn/controller/bfd.c
> +++ b/ovn/controller/bfd.c
> @@ -100,6 +100,88 @@ bfd_calculate_active_tunnels(const struct
> ovsrec_bridge *br_int,
>  }
>  }
>
> +struct local_datapath_node {
> +struct ovs_list node;
> +struct local_datapath *dp;
> +};
> +
> +static void
> +bfd_travel_gw_related_chassis(struct local_datapath *dp,
> +  const struct hmap *local_datapaths,
> +  struct ovsdb_idl_index_cursor *cursor,
> +  struct sbrec_port_binding *lpval,
> +  struct sset *bfd_chassis)
> +{
> +struct ovs_list dp_list;
> +const struct sbrec_port_binding *pb;
> +struct sset visited_dp = SSET_INITIALIZER(&visited_dp);
> +const char *dp_key;
> +struct local_datapath_node *dp_binding;
> +
> +if (!(dp_key = smap_get(&dp->datapath->external_ids, "logical-router")) &&
> +!(dp_key = smap_get(&dp->datapath->external_ids, "logical-switch"))) {
> +VLOG_INFO("datapath has no uuid, cannot travel graph");
> +return;
> +}
> +
> +sset_add(&visited_dp, dp_key);
> +
> +ovs_list_init(&dp_list);
> +dp_binding = xmalloc(sizeof *dp_binding);
> +dp_binding->dp = dp;
> +ovs_list_push_back(&dp_list, &dp_binding->node);
> +
> +/*
> + * Go through whole graph to figure out all chassis which may deliver
> + * packets to the gateway. */
> +do {
> +dp_binding = CONTAINER_OF(ovs_list_pop_front(&dp_list),
> +  struct local_datapath_node, node);
> +dp = dp_binding->dp;
> +free(dp_binding);
> +for (size_t i = 0; i < dp->n_peer_dps; i++) {
> +const struct sbrec_datapath_binding *pdp = dp->peer_dps[i];
> +if (!pdp) {
> +continue;
> +}
> +
> +if (!(dp_key = smap_get(&pdp->external_ids, "logical-router")) &&
> +!(dp_key = smap_get(&pdp->external_ids, "logical-switch"))) {
> +continue;
> +}
> +
> +if (sset_contains(&visited_dp, dp_key)) {
> +continue;
> +}
> +
> +sset_add(&visited_dp, dp_key);
> +
> +struct hmap_node *node = hmap_first_with_hash(local_datapaths,
> +  pdp->tunnel_key);
> +if (!node) {
> +continue;
> +}
> +
> +dp_binding = xmalloc(sizeof *dp_binding);
> +dp_binding->dp = CONTAINER_OF(node, struct local_datapath,
> +  hmap_node);
> +ovs_list_push_back(&dp_list, &dp_binding->node);
> +
> +sbrec_port_binding_index_set_datapath(lpval, pdp);
> +SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, cursor, lpval) {
> +if (pb->chassis) {
> +const char *chassis_name = pb->chassis->name;
> +if (chassis_name) {
> +sset_add(bfd_chassis, chassis_name);
> +}
> +}
> +}
> +}
> +} while (!ovs_list_is_empty(&dp_list));
> +
> +sset_destroy(&visited_dp);
> +}
> +
>  static void
>  bfd_calculate_chassis(struct controller_ctx *ctx,
>const struct sbrec_chassis *our_chassis,
> @@ -155,24 +237,8 @@ bfd_calculate_chassis(struct controller_ctx *ctx,
>  }
>  }
>  if (our_chassis_is_gw_for_dp) {
> -for (size_t i = 0; i < dp->n_peer_dps; i++) {
> -const struct sbrec_datapath_binding *pdp =
> dp->peer_dps[i];
> -if (!pdp) {
> -continue;
> -}
> -
> -sbrec_port_binding_index_set_datapath(lpval, pdp);
> -SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, &cursor, lpval) {
> -if (pb->chassis) {
> -/* Gateway node has to enable bfd to all nodes
> hosting
> - * connected network ports */
> -const 

[ovs-dev] packet loopback because different xcfgp pointer is used in xlate_normal*

2017-08-17 Thread Huanle Han
Hi, Ben
 I found a bug in ovs where an mcast packet loops back because a different
xcfg pointer is used in xlate_normal*().
 I found it on v2.5.0 and I haven't verified it on master yet.

Here is my investigation:
In function xlate_normal_mcast_send_group, the in-bundle pointer and
out-bundle pointer may be unequal even when they are actually the same
port. ovs sends the packet back because mcast_xbundle != in_xbundle. I
think it's because they come from different xcfgs.
```

xcfg = ovsrcu_get(struct xlate_cfg *, &xcfgp);
LIST_FOR_EACH(b, bundle_node, &grp->bundle_lru) {
mcast_xbundle = xbundle_lookup(xcfg, b->port);
if (mcast_xbundle && mcast_xbundle != in_xbundle) {
xlate_report(ctx, OFT_DETAIL, "forwarding to mcast group port");
output_normal(ctx, mcast_xbundle, xvlan);
}
  ..
}
```

Question 1: Has this bug been fixed in master? If it has, in which commit?


The mac-learning table of my vswitch is used up, so entries expire all the
time. mac_learning_run() returning need_revalidate makes the xbridge/xbundle
reset in type_run() frequently.
Question 2: Is it necessary to reset xbridge/xbundle/xport
when REV_MAC_LEARNING?


Thanks, Huanle
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic

2017-08-17 Thread O Mahony, Billy
Hi All,

> -Original Message-
> From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> Sent: Thursday, August 17, 2017 5:22 PM
> To: O Mahony, Billy ; Kevin Traynor
> ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic
> 
> Good discussion. Some thoughts:
> 
> 1. Prioritizing queues by assigning them to dedicated PMDs is a simple and
> effective but very crude method, considering that you have to reserve an
> entire (logical) core for that. So I am all for a more economic and perhaps
> slightly less deterministic option!
[[BO'M]] I would agree. I was just drawing attention to the two-part nature of 
the patch, i.e. that it's not dpdk specific as such but comes with a dpdk 
implementation.
> 
> 2. Offering the option to prioritize certain queues in OVS-DPDK is a highly
> desirable feature. We have at least one important use case in OpenStack
> (prioritizing "in-band" infrastructure control plane traffic over tenant 
> data, in
> case both are carried on the same physical network). In our case the traffic
> separation would be done per VLAN. Can we add this to the list of supported
> filters?
[[BO'M]] Good to know about use-cases. I'll dig a bit on that wrt dpdk 
drivers and hardware.
> 
> 3. It would be nice to be able to combine priority queues with filters with a
> number of RSS queues without filter. Is this a XL710 HW limitation or only a
> limitation of the drivers and DPDK APIs?
[[BO'M]] Again I'll have to dig on this. Our go to guy for this is on vacation 
at the moment. Remind me if I don't get back with a response. 
> 
> BR, Jan
> 
> 
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of O Mahony, Billy
> > Sent: Thursday, 17 August, 2017 18:07
> > To: Kevin Traynor ; d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive
> > traffic
> >
> > Hi Kevin,
> >
> > Thanks for the comments - more inline.
> >
> > Billy.
> >
> > > -Original Message-
> > > From: Kevin Traynor [mailto:ktray...@redhat.com]
> > > Sent: Thursday, August 17, 2017 3:37 PM
> > > To: O Mahony, Billy ; d...@openvswitch.org
> > > Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive
> > > traffic
> > >
> > > Hi Billy,
> > >
> > > I just happened to be about to send a reply to the previous
> > > patchset, so adding comments here instead.
> > >
> > > On 08/17/2017 03:24 PM, Billy O'Mahony wrote:
> > > > Hi All,
> > > >
> > > > v2: Addresses various review comments; Applies cleanly on 0bedb3d6.
> > > >
> > > > This patch set provides a method to request ingress scheduling on
> > > interfaces.
> > > > It also provides an implementation of same for DPDK physical ports.
> > > >
> > > > This allows specific packet types to be:
> > > > * forwarded to their destination port ahead of other packets.
> > > > and/or
> > > > * be less likely to be dropped in an overloaded situation.
> > > >
> > > > It was previously discussed
> > > > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-
> > > May/044395.htm
> > > > l
> > > > and RFC'd
> > > > https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335237.ht
> > > > ml
> > > >
> > > > Limitations of this patch:
> > > > * The patch uses the Flow Director filter API in DPDK and has only
> > > > been tested  on Fortville (XL710) NIC.
> > > > * Prioritization is limited to:
> > > > ** eth_type
> > > > ** Fully specified 5-tuple src & dst ip and port numbers for UDP &
> > > > TCP packets
> > > > * ovs-appctl dpif-netdev/pmd-*-show o/p should indicate rxq
> > > prioritization.
> > > > * any requirements for a more granular prioritization mechanism
> > > >
> > >
> > > In general I like the idea of splitting priority traffic to a
> > > specific queue but I have concerns about the implementation. I
> > > shared most of these when we met already but adding here too. Not a
> detailed review.
> > [[BO'M]] No worries. If we get the high-level sorted out first the
> > details will fall into place :)
> > >
> > > - It is using deprecated DPDK filter API.
> > > http://dpdk.org/doc/guides/rel_notes/deprecation.html
> > [[BO'M]] Yes it looks like a move to the shiny new Flow API is in order.
> > >
> > > - It is an invasive change that seems to be for only one Intel NIC
> > > in the DPDK datapath. Even then it is very limited as it only works
> > > when that Intel NIC is using exactly 2 rx queues.
> > [[BO'M]] That's the current case but is really a limitation of
> > FlowDirectorAPI/DPDK/XL710 combination. Maybe Flow API will allow to
> > RSS over many queues and place the prioritized traffic on another queue.
> > >
> > > - It's a hardcoded opaque QoS which will have a negative impact on
> > > whichever queues happen to land on the same pmd so it's
> > > unpredictable which queues will be affected. It could effect other
> > > latency sensitive traffic that cannot 

Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic

2017-08-17 Thread Jan Scheurich
Good discussion. Some thoughts:

1. Prioritizing queues by assigning them to dedicated PMDs is a simple and 
effective but very crude method, considering that you have to reserve an entire 
(logical) core for that. So I am all for a more economic and perhaps slightly 
less deterministic option!

2. Offering the option to prioritize certain queues in OVS-DPDK is a highly 
desirable feature. We have at least one important use case in OpenStack 
(prioritizing "in-band" infrastructure control plane traffic over tenant data, 
in case both are carried on the same physical network). In our case the traffic 
separation would be done per VLAN. Can we add this to the list of supported 
filters?

3. It would be nice to be able to combine priority queues with filters with a 
number of RSS queues without filter. Is this a XL710 HW limitation or only a 
limitation of the drivers and DPDK APIs?

BR, Jan


> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of O Mahony, Billy
> Sent: Thursday, 17 August, 2017 18:07
> To: Kevin Traynor ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic
> 
> Hi Kevin,
> 
> Thanks for the comments - more inline.
> 
> Billy.
> 
> > -Original Message-
> > From: Kevin Traynor [mailto:ktray...@redhat.com]
> > Sent: Thursday, August 17, 2017 3:37 PM
> > To: O Mahony, Billy ; d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive
> > traffic
> >
> > Hi Billy,
> >
> > I just happened to be about to send a reply to the previous patchset,
> > so adding comments here instead.
> >
> > On 08/17/2017 03:24 PM, Billy O'Mahony wrote:
> > > Hi All,
> > >
> > > v2: Addresses various review comments; Applies cleanly on 0bedb3d6.
> > >
> > > This patch set provides a method to request ingress scheduling on
> > interfaces.
> > > It also provides an implementation of same for DPDK physical ports.
> > >
> > > This allows specific packet types to be:
> > > * forwarded to their destination port ahead of other packets.
> > > and/or
> > > * be less likely to be dropped in an overloaded situation.
> > >
> > > It was previously discussed
> > > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-
> > May/044395.htm
> > > l
> > > and RFC'd
> > > https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335237.html
> > >
> > > Limitations of this patch:
> > > * The patch uses the Flow Director filter API in DPDK and has only
> > > been tested  on Fortville (XL710) NIC.
> > > * Prioritization is limited to:
> > > ** eth_type
> > > ** Fully specified 5-tuple src & dst ip and port numbers for UDP &
> > > TCP packets
> > > * ovs-appctl dpif-netdev/pmd-*-show o/p should indicate rxq
> > prioritization.
> > > * any requirements for a more granular prioritization mechanism
> > >
> >
> > In general I like the idea of splitting priority traffic to a specific
> > queue but I have concerns about the implementation. I shared most of
> > these when we met already but adding here too. Not a detailed review.
> [[BO'M]] No worries. If we get the high-level sorted out first the details 
> will
> fall into place :)
> >
> > - It is using deprecated DPDK filter API.
> > http://dpdk.org/doc/guides/rel_notes/deprecation.html
> [[BO'M]] Yes it looks like a move to the shiny new Flow API is in order.
> >
> > - It is an invasive change that seems to be for only one Intel NIC in
> > the DPDK datapath. Even then it is very limited as it only works when
> > that Intel NIC is using exactly 2 rx queues.
> [[BO'M]] That's the current case but is really a limitation of
> FlowDirectorAPI/DPDK/XL710 combination. Maybe Flow API will allow to
> RSS over many queues and place the prioritized traffic on another queue.
> >
> > - It's a hardcoded opaque QoS which will have a negative impact on
> > whichever queues happen to land on the same pmd so it's unpredictable
> > which queues will be affected. It could effect other latency sensitive
> > traffic that cannot by prioritized because of the limitations above.
> >
> > - I guess multiple priority queues could land on the same pmd and
> > starve each other?
> [[BO'M]] Interaction with pmd assignment is definitely an issue that needs
> to be addressed. I know there is work in-flight in that regard so it will be
> easier to address that when the in-flight work lands.
> >
> > I think a more general, less restricted scheme using DPDK rte_flow API
> > with controls on the effects to other traffic is needed. Perhaps if a
> > user is very concerned with latency on traffic from a port, they would
> > be ok with dedicating a pmd to it.
> [[BO'M]] You are proposing to prioritize queues by allocating a single pmd
> to them rather than by changing the pmds read algorithm to favor
> prioritized queues? For sure that could be another implementation of the
> solution.
> 
> If we look at the patch set as containing two distinct things as per the 

Re: [ovs-dev] [PATCH v4 0/6] OVS-DPDK rxq to pmd assignment improvements.

2017-08-17 Thread Kevin Traynor
On 08/16/2017 09:02 PM, Darrell Ball wrote:
> 
> 
> -Original Message-
> From: Kevin Traynor 
> Organization: Red Hat
> Date: Wednesday, August 16, 2017 at 12:19 PM
> To: Darrell Ball , "d...@openvswitch.org" 
> , "ian.sto...@intel.com" , 
> "jan.scheur...@ericsson.com" , 
> "bhanuprakash.bodire...@intel.com" , 
> "mark.b.kavan...@intel.com" , 
> "gvrose8...@gmail.com" 
> Subject: Re: [ovs-dev] [PATCH v4 0/6] OVS-DPDK rxq to pmd assignment 
> improvements.
> 
> On 08/16/2017 07:28 AM, Darrell Ball wrote:
> > Hi Kevin
> > 
> > I did some basic testing and parsed all the code.
> > I have some high level comments, below.
> > I’ll add the other comments tomorrow.
> > 
> 
> Thanks Darrell.
> 
> > 1/ The redistribution is based on the last interval – I wonder if a 
> number of intervals may be better.
> >  Possibly a single interval could result in a skewed decision ?
> > 
> 
> Sure, it could happen if there was some spike. Ian had made a similar
> comment. The only danger is that if the interval is too long then the
> older information is less relevant and also a rebalance would be 
> occasional.
> 
> In the v4 I decoupled the interval used here from anything else, so it
> can easily be increased if there is a better suggestion for length.
> 
> [Darrell]:  You could keep it simple for now and use a little longer time - 
> maybe a ‘minute’ and a future
>  patch could use a weighted average maybe or something else ? 
> This is the only major concern
>  I have with this series. Point 2 can come as a follow-up 
> series.
>  
> 

Yes, completely agree - a user command and any stats required to support
it can be follow up.

@Jan, can you comment on length of time to take into consideration when
re-balancing - when we talked before you had some views on it. I will
post a v5 when we get consensus around that.

thanks,
Kevin.


> 
> Ian also suggested making it user configurable, but I'd only want to add
> that if it's clear that different interval lengths are needed for
> different use cases. Otherwise it's another low level config the user
> has to grapple with and that has to be maintained.
> 
> [Darrell] If we ultimately have a show/rebalance maintenance command in a 
> future series, a
> configurable interval would be moot I think.
> 
> 
> > 2/ I see a rebalance command added that can be used other than at 
> reconfiguration time.
> 
> Yep, I'd really like to hear opinions about whether the command to
> rebalance is useful for people in production. I prefer the case where
> rebalance is done as part of a happening anyway reconfig because it
> doesn't need user intervention. But you don't want someone having to add
> a VM in order to do a rebalance :-)
> 
> >  Rather than for testing (per the patch use case), maybe this could 
> be used as a maintenance command when combined
> >  with  
> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_patch_768757_=DwIDaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=kd7bxA2tqkzEpBpen6pNG2aWw61Pv0_EQtg7tddsFX8=QLFdqh_ZEJpnsKwnemFEqEIAbNBfXI4NgzUD1FnshEs=
>   ?
> 
> Sugesh's patch re-implemented the data collection from this RFC, but
> used the data for displaying stats.
> 
> [Darrell] got it.
> 
> .It's not compatible with this or
> Ciara's merged patch for collecting pmd stats.
> 
> [Darrell] agreed
> 
>  I agree that increased
> stats reporting would be useful to complement a rebalance command in
> production.
> 
> [Darrell] Anyhow, the show/rebalance can come as a follow-up series that
>  you could do alone or both you and Sugesh.
> 
> 
> Kevin.
> 
> >If this is the case, patch 6 could have more description how best to 
> do that.
> >I will look at Sugesh’s patch tomorrow as well so we can expedite.
> > 
> > Thanks Darrell
> > 
> >  
> > 
> > 
> > 
> > -Original Message-
> > From:  on behalf of Kevin Traynor 
> 
> > Date: Wednesday, August 9, 2017 at 8:45 AM
> > To: "d...@openvswitch.org" , 
> "ian.sto...@intel.com" , "jan.scheur...@ericsson.com" 
> , "bhanuprakash.bodire...@intel.com" 
> , "mark.b.kavan...@intel.com" 
> , "gvrose8...@gmail.com" 
> > Subject: [ovs-dev] [PATCH v4 0/6] OVS-DPDK rxq to pmd assignment
> improvements.
> > 
> > For the DPDK datapath, by default rxqs are assigned to available 
> pmds

Re: [ovs-dev] [PATCH] nsh: rework NSH netlink keys and actions

2017-08-17 Thread Jan Scheurich
> > As discussed please adjust the netlink uAPI for NSH as follows:
> >
> > OVS_KEY_ATTR_NSH
> > -- OVS_NSH_KEY_ATTR_BASEmandatory
> > -- OVS_NSH_KEY_ATTR_MD1 conditional: if mdtype=MD1
> >
> > OVS_ACTION_ATTR_PUSH_NSH
> > -- OVS_NSH_KEY_ATTR_BASEmandatory
> > -- OVS_NSH_PUSH_ATTR_CONTEXTconditional: currently if MD1 or
> > MD2 with TLV encap properties
> >
> > OVS_ACTION_ATTR_POP_NSH no data
> >
> > Remove OVS_NSH_KEY_ATTR_MD2 because the MD2 context headers
> will not
> > be modelled as dedicated key fields but mapped to some generic TLV
> fields,
> > similar to but independent from the current tunnel metadata TLV
> registers.
> >
> > Using the name OVS_NSH_KEY_ATTR_MD2 for the opaque context data
> in
> > PUSH_NSH would be misleading as the variable length byte array can
> carry
> > any type of metadata format: MD1, MD2, or any future MD type. For
> that
> > reason I prefer the generic name OVS_NSH_PUSH_ATTR_CONTEXT.
> 
> Then why even have ATTR_MD1? Sounds like a single attribute is enough to
> support all cases.

OVS_NSH_KEY_ATTR_MD1 represents a struct containing the four fixed length 
context headers defined in MD1. These are match fields in OVS and must be 
represented explicitly in OVS_KEY_ATTR_NSH. But their presence is conditional 
on MD type 1, that's why Jiri rightly insisted on having it split from 
OVS_NSH_KEY_ATTR_BASE.

In OVS_ACTION_ATTR_PUSH_NSH we again have the same mandatory header part 
represented by OVS_NSH_KEY_ATTR_BASE, this time followed by an optional 
variable length opaque container for up to 248 octets of context data (the 
format is of no concern for the push_nsh action). The content of the 
ATTR_CONTEXT, if any, is compiled by ofproto when the action is sent.

BR, Jan
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic

2017-08-17 Thread O Mahony, Billy
Hi Kevin,

Thanks for the comments - more inline.

Billy.

> -Original Message-
> From: Kevin Traynor [mailto:ktray...@redhat.com]
> Sent: Thursday, August 17, 2017 3:37 PM
> To: O Mahony, Billy ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic
> 
> Hi Billy,
> 
> I just happened to be about to send a reply to the previous patchset, so
> adding comments here instead.
> 
> On 08/17/2017 03:24 PM, Billy O'Mahony wrote:
> > Hi All,
> >
> > v2: Addresses various review comments; Applies cleanly on 0bedb3d6.
> >
> > This patch set provides a method to request ingress scheduling on
> interfaces.
> > It also provides an implementation of same for DPDK physical ports.
> >
> > This allows specific packet types to be:
> > * forwarded to their destination port ahead of other packets.
> > and/or
> > * be less likely to be dropped in an overloaded situation.
> >
> > It was previously discussed
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-
> May/044395.htm
> > l
> > and RFC'd
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335237.html
> >
> > Limitations of this patch:
> > * The patch uses the Flow Director filter API in DPDK and has only
> > been tested  on Fortville (XL710) NIC.
> > * Prioritization is limited to:
> > ** eth_type
> > ** Fully specified 5-tuple src & dst ip and port numbers for UDP & TCP
> > packets
> > * ovs-appctl dpif-netdev/pmd-*-show o/p should indicate rxq
> prioritization.
> > * any requirements for a more granular prioritization mechanism
> >
> 
> In general I like the idea of splitting priority traffic to a specific queue 
> but I
> have concerns about the implementation. I shared most of these when we
> met already but adding here too. Not a detailed review.
[[BO'M]] No worries. If we get the high-level sorted out first the details will 
fall into place :) 
> 
> - It is using deprecated DPDK filter API.
> http://dpdk.org/doc/guides/rel_notes/deprecation.html
[[BO'M]] Yes it looks like a move to the shiny new Flow API is in order.
> 
> - It is an invasive change that seems to be for only one Intel NIC in the DPDK
> datapath. Even then it is very limited as it only works when that Intel NIC is
> using exactly 2 rx queues.
[[BO'M]] That's the current case but is really a limitation of 
FlowDirectorAPI/DPDK/XL710 combination. Maybe Flow API will allow to RSS over 
many queues and place the prioritized traffic on another queue.
> 
> - It's a hardcoded opaque QoS which will have a negative impact on
> whichever queues happen to land on the same pmd so it's unpredictable
> which queues will be affected. It could effect other latency sensitive traffic
> that cannot by prioritized because of the limitations above.
> 
> - I guess multiple priority queues could land on the same pmd and starve
> each other?
[[BO'M]] Interaction with pmd assignment is definitely an issue that needs to 
be addressed. I know there is work in-flight in that regard so it will be 
easier to address that when the in-flight work lands.
> 
> I think a more general, less restricted scheme using DPDK rte_flow API with
> controls on the effects to other traffic is needed. Perhaps if a user is very
> concerned with latency on traffic from a port, they would be ok with
> dedicating a pmd to it.
[[BO'M]] You are proposing to prioritize queues by allocating a single pmd to 
them rather than by changing the pmd's read algorithm to favor prioritized 
queues? For sure that could be another implementation of the solution.

If we look at the patch set as containing two distinct things as per the cover 
letter "the patch set provides a method to request ingress scheduling on
interfaces. It also provides an implementation of same for DPDK physical 
ports." then this would change the second part but the first would still be 
valid. Each port type in any case would have to come up with its own 
implementation - it's just that for non-physical ports which cannot offload 
the prioritization decision it is not worth the effort - as was noted in an 
earlier RFC.

> 
> thanks,
> Kevin.
> 
> > Initial results:
> > * even when userspace OVS is very much overloaded and
> >   dropping significant numbers of packets the drop rate for prioritized 
> > traffic
> >   is running at 1/1000th of the drop rate for non-prioritized traffic.
> >
> > * the latency profile of prioritized traffic through userspace OVS is also
> much
> >   improved
> >
> > 1e0 |*
> > |*
> > 1e-1|* | Non-prioritized pkt latency
> > |* * Prioritized pkt latency
> > 1e-2|*
> > |*
> > 1e-3|*   |
> > |*   |
> > 1e-4|*   | | |
> > |*   |*| |
> > 1e-5|*   |*| | |
> > |*   |*|*| |  |
> > 1e-6|*   |*|*|*|  |
> > |*   |*|*|*|*

Re: [ovs-dev] [PATCH v2] datapath-windows: Do not modify port field for ICMP during SNAT/DNAT

2017-08-17 Thread Shashank Ram

From: ovs-dev-boun...@openvswitch.org  on 
behalf of Anand Kumar 
Sent: Tuesday, August 15, 2017 3:29 PM
To: d...@openvswitch.org
Subject: [ovs-dev] [PATCH v2] datapath-windows: Do not modify port field for
ICMP during SNAT/DNAT

During SNAT/DNAT, we should not be updating the port field of the
ct_endpoint struct, as ICMP packets do not have port information. Since
port and icmp_id overlap in the ct_endpoint struct, the icmp_id gets
changed. As a result, the NAT lookup fails to find a matching entry.

This patch addresses the issue by not modifying the icmp_id field during
SNAT/DNAT for ICMP traffic.

The current NAT module doesn't take the ICMP type/code into account
during lookups. Fix this to make it similar to the other conntrack
module.

Signed-off-by: Anand Kumar 
---
___

Acked-by: Shashank Ram 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to use ICMP Type and Code

2017-08-17 Thread Shashank Ram

From: ovs-dev-boun...@openvswitch.org  on 
behalf of Anand Kumar 
Sent: Wednesday, August 16, 2017 11:23 AM
To: d...@openvswitch.org
Subject: [ovs-dev] [PATCH v3] datapath-windows: Update Orig Tuple to use
ICMP Type and Code

- Also add some padding to the ct_endpoint's union, so that each member
of the ct_endpoint's union is of the same size.

Co-authored-by: Sairam Venugopal 
Signed-off-by: Anand Kumar 
---
___

Acked-by: Shashank Ram 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] nsh: rework NSH netlink keys and actions

2017-08-17 Thread Eric Garver
On Thu, Aug 17, 2017 at 12:04:25PM +, Jan Scheurich wrote:
> Hi Yi,
> 
> As discussed please adjust the netlink uAPI for NSH as follows:
> 
> OVS_KEY_ATTR_NSH
> -- OVS_NSH_KEY_ATTR_BASE  mandatory
> -- OVS_NSH_KEY_ATTR_MD1   conditional: if mdtype=MD1
> 
> OVS_ACTION_ATTR_PUSH_NSH
> -- OVS_NSH_KEY_ATTR_BASE  mandatory
> -- OVS_NSH_PUSH_ATTR_CONTEXT  conditional: currently if MD1 or 
>   MD2 with TLV encap properties
> 
> OVS_ACTION_ATTR_POP_NSH   no data
> 
> Remove OVS_NSH_KEY_ATTR_MD2 because the MD2 context headers will not 
> be modelled as dedicated key fields but mapped to some generic TLV fields, 
> similar to but independent from the current tunnel metadata TLV registers.
> 
> Using the name OVS_NSH_KEY_ATTR_MD2 for the opaque context data in 
> PUSH_NSH would be misleading as the variable length byte array can carry
> any type of metadata format: MD1, MD2, or any future MD type. For that
> reason I prefer the generic name OVS_NSH_PUSH_ATTR_CONTEXT. 

Then why even have ATTR_MD1? Sounds like a single attribute is enough to
support all cases.

> 
> Note that this attribute will not be used as part of the NSH key, so you 
> might 
> consider generalizing the NSH attribute enum definition to 
> 
> enum ovs_nsh_attr {
>   OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
>   OVS_NSH_KEY_ATTR_MD1,   /* struct ovs_nsh_key_md1. */
>   OVS_NSH_PUSH_ATTR_CONTEXT,   /* opaque context headers */
>   __OVS_NSH_ATTR_MAX
> };
> 
> BR, Jan
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Yi Yang
> > Sent: Wednesday, 16 August, 2017 09:56
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH] nsh: rework NSH netlink keys and actions
> > 
> > Per kernel data path requirements, this patch changes
> > OVS_KEY_ATTR_NSH
> > to nested attribute and adds three new NSH sub attribute keys:
> > 
> > OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
> > OVS_NSH_KEY_ATTR_MD1:  for length-fixed MD type 1 context
> > OVS_NSH_KEY_ATTR_MD2:  for length-variable MD type 2 metadata
> > 
> > NSH match fields, set and PUSH_NSH action all use the below
> > nested attribute format:
> > 
> > OVS_KEY_ATTR_NSH begin
> > OVS_NSH_KEY_ATTR_BASE
> > OVS_NSH_KEY_ATTR_MD1
> > OVS_KEY_ATTR_NSH end
> > 
> > or
> > 
> > OVS_KEY_ATTR_NSH begin
> > OVS_NSH_KEY_ATTR_BASE
> > OVS_NSH_KEY_ATTR_MD2
> > OVS_KEY_ATTR_NSH end
> > 
> > In addition, NSH encap and decap actions are renamed as push_nsh
> > and pop_nsh to meet action naming convention.
> > 
> > Signed-off-by: Yi Yang 
> > ---
> >  datapath/linux/compat/include/linux/openvswitch.h |  57 +-
> >  include/openvswitch/nsh.h |  32 +-
> >  include/openvswitch/packets.h |  11 +-
> >  lib/dpif-netdev.c |   4 +-
> >  lib/dpif.c|   4 +-
> >  lib/flow.c|  15 +-
> >  lib/match.c   |  12 +-
> >  lib/meta-flow.c   |  13 +-
> >  lib/nx-match.c|   4 +-
> >  lib/odp-execute.c |  76 ++-
> >  lib/odp-util.c| 713 
> > ++
> >  lib/odp-util.h|   4 +
> >  lib/packets.c |  24 +-
> >  lib/packets.h |   6 +-
> >  ofproto/ofproto-dpif-ipfix.c  |   4 +-
> >  ofproto/ofproto-dpif-sflow.c  |   4 +-
> >  ofproto/ofproto-dpif-xlate.c  |  24 +-
> >  tests/nsh.at  |  28 +-
> >  18 files changed, 773 insertions(+), 262 deletions(-)
> > 
> > diff --git a/datapath/linux/compat/include/linux/openvswitch.h
> > b/datapath/linux/compat/include/linux/openvswitch.h
> > index bc6c94b..d7f9029 100644
> > --- a/datapath/linux/compat/include/linux/openvswitch.h
> > +++ b/datapath/linux/compat/include/linux/openvswitch.h
> > @@ -369,7 +369,7 @@ enum ovs_key_attr {
> >  #ifndef __KERNEL__
> > /* Only used within userspace data path. */
> > OVS_KEY_ATTR_PACKET_TYPE,  /* be32 packet type */
> > -   OVS_KEY_ATTR_NSH,  /* struct ovs_key_nsh */
> > +   OVS_KEY_ATTR_NSH,  /* Nested set of ovs_nsh_key_* */
> >  #endif
> > 
> > __OVS_KEY_ATTR_MAX
> > @@ -492,13 +492,27 @@ struct ovs_key_ct_labels {
> > };
> >  };
> > 
> > -struct ovs_key_nsh {
> > -__u8 flags;
> > -__u8 mdtype;
> > -__u8 np;
> > -__u8 pad;
> > -__be32 path_hdr;
> > -__be32 c[4];
> > +enum ovs_nsh_key_attr {
> > +   OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
> > +   OVS_NSH_KEY_ATTR_MD1,   /* struct 
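As an editorial aside, the nested OVS_KEY_ATTR_NSH layout described in this patch ("OVS_KEY_ATTR_NSH begin ... OVS_NSH_KEY_ATTR_BASE ... end") can be sketched as ordinary netlink-style TLV packing. The Python model below is illustrative only: the attribute numbers and exact struct layouts are assumptions, not the real openvswitch.h values.

```python
import struct

# Hypothetical attribute numbers -- placeholders, not real openvswitch.h values.
OVS_KEY_ATTR_NSH = 30
OVS_NSH_KEY_ATTR_BASE = 0
OVS_NSH_KEY_ATTR_MD1 = 1

def nlattr(attr_type, payload):
    """One netlink attribute: 16-bit length, 16-bit type, 4-byte-aligned payload."""
    header = struct.pack("=HH", 4 + len(payload), attr_type)
    pad = b"\x00" * ((-(4 + len(payload))) % 4)
    return header + payload + pad

# OVS_NSH_KEY_ATTR_BASE: flags, mdtype, np, pad, path_hdr (ovs_nsh_key_base-like).
base = struct.pack("=BBBBI", 0, 1, 0x3, 0, 0x000100FF)
# OVS_NSH_KEY_ATTR_MD1: four 32-bit context words (MD type 1).
md1 = struct.pack("=4I", 1, 2, 3, 4)

# "OVS_KEY_ATTR_NSH begin ... end" is just an outer attribute wrapping the two.
nested = nlattr(OVS_NSH_KEY_ATTR_BASE, base) + nlattr(OVS_NSH_KEY_ATTR_MD1, md1)
msg = nlattr(OVS_KEY_ATTR_NSH, nested)
print(len(msg))  # 36 bytes: 4 + (4 + 8) + (4 + 16)
```

The MD2 case would simply substitute a variable-length OVS_NSH_KEY_ATTR_MD2 payload for the fixed 16-byte MD1 context block.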

Re: [ovs-dev] [patch_v4] dp-packet: Reset DPDK HWOL checksum flags on init.

2017-08-17 Thread Aaron Conole
Aaron Conole  writes:

> Darrell Ball  writes:
>
>> -Original Message-
>> From: Aaron Conole 
>> Date: Thursday, August 10, 2017 at 10:13 AM
>> To: Darrell Ball 
>> Cc: ovs dev 
>> Subject: Re: [ovs-dev] [patch_v4] dp-packet: Reset DPDK HWOL checksum flags 
>> on init.
>>
>> Darrell Ball  writes:
>> 
>> > -Original Message-
>> > From: Aaron Conole 
>> > Date: Wednesday, August 9, 2017 at 12:51 PM
>> > To: Darrell Ball 
>> > Cc: Joe Stringer , Darrell Ball , ovs
>> > dev 
>> > Subject: Re: [ovs-dev] [patch_v4] dp-packet: Reset DPDK HWOL
>> > checksum flags on init.
>> >
>> > Darrell Ball  writes:
>> > 
>> > > Thanks Joe
>> > > I forgot to add your Tested-by to V5; I have been testing this 
>> myself;
>> > > but let me know if you would like it added – I can send a V6.
>> > 
>> > It will automatically be added by patchwork.  It is sufficient to
>> > download (ex:
>> > 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_patch_799499_mbox=DwIFaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=LK8s59ajLx5MbmH5Ng8SG1F1nAgKZJ_0JBs7phbl6C8=HRG5wrQrEXCl_AA8uAGbroVSvYGUIQjQkF8HrMRvNYI=
>> > )
>> > 
>> > I usually use patchwork when working on a series - if I download 
>> it from
>> > patchwork, I can be confident that all the tags are applied 
>> properly, so
>> > I won't forget.  Plus, all the discussion happens there, so I can
>> > quickly browse it.
>> >
>> > Thanks Aaron
>> > In this case, the Tested-by was applied to V4 and I sent a V5
>> > which did not have the Tested-by.
>> > The easiest solution would have been to ask Joe to respond to
>> > V5 – well, it is moot now anyways with V6 on the way.
>> 
>> Yep, that happens.
>> 
>> > The full list is available at:
>> > 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_project_openvswitch_list_=DwIFaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=LK8s59ajLx5MbmH5Ng8SG1F1nAgKZJ_0JBs7phbl6C8=rqaCq5Jr-Vy-cfQ_t5px-zGtdb55CQn9cIAZCC1PvfE=
>> >
>> > Thanks, I am on this page most of the day (
>> > 
>> > It would actually be cool to have a few more admins to troll 
>> patchwork
>> > and do things like ping for status (ex:
>> > 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_patch_719492_=DwIFaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=LK8s59ajLx5MbmH5Ng8SG1F1nAgKZJ_0JBs7phbl6C8=y0RPF47IRVw8BujWHzB9oc7IUihgLuCenLfteZ2vUr4=
>> >
>> > I don’t have any context on this patch.
>> > Aaron, would you like to resurrect this patch with rebase ?;
>> > it is RHEL related.
>> 
>> It doesn't need a rebase; it applies cleanly.  The version of patchwork
>> running doesn't recognize my ack, so someone will have to add it
>> manually (or omit it... doesn't matter too much, since Ben ack'd as
>> well).
>> 
>> But it would be cool to get it applied;  maybe even to get back as far
>> as branch-2.7 (I think it was posted in time for 2.7 branch).
>>
>>
>> One thing is I don’t know if Daniele had a chance to test this,
>> although it looks simple enough.
>> Did you get a chance to test it ? if so, maybe a tested-by  ?
>
> I believe I tested it at the time I acked it.  I'll set up a test for it
> tomorrow, though.

Disregard my message (and my ack).  I've tested, but it's not clear how
best to use this.  I'm going to be revisiting this area and seeing what
I can do to make it more friendly.

Thanks, Darrell.

>> > is this still needed?  it
>> > does seem to mostly apply).  Then we could make sure we don't miss
>> > things.
>> > 
>> > > Darrell
>> > >
>> > > -Original Message-
>> > > From:  on behalf of Joe
>> > > Stringer 
>> > > Date: Tuesday, August 8, 2017 at 4:51 PM
>> > > To: Darrell Ball 
>> > > Cc: ovs dev 
>> > > Subject: Re: [ovs-dev] [patch_v4] dp-packet: Reset DPDK HWOL 
>> checksum
>> > > flags on init.
>> > >
>> > > On 8 August 2017 at 16:39, Darrell Ball  
>> wrote:
>> > > > Reset the DPDK HWOL checksum flags in dp_packet_init_.
>> > > > The new HWOL bad checksum flag is uninitialized on
>> > > > non-dpdk ports and
>> > > > this is noticed as test failures using netdev-dummy
>> > > > ports where the bad
>> > > > checksum flag is checked.
>> > > >
>> 

Re: [ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic

2017-08-17 Thread Kevin Traynor
Hi Billy,

I just happened to be about to send a reply to the previous patchset, so
adding comments here instead.

On 08/17/2017 03:24 PM, Billy O'Mahony wrote:
> Hi All,
> 
> v2: Addresses various review comments; Applies cleanly on 0bedb3d6.
> 
> This patch set provides a method to request ingress scheduling on interfaces.
> It also provides an implementation of the same for DPDK physical ports.
> 
> This allows specific packet types to be:
> * forwarded to their destination port ahead of other packets.
> and/or
> * be less likely to be dropped in an overloaded situation.
> 
> It was previously discussed
> https://mail.openvswitch.org/pipermail/ovs-discuss/2017-May/044395.html
> and RFC'd
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335237.html
> 
> Limitations of this patch:
> * The patch uses the Flow Director filter API in DPDK and has only been tested
>  on Fortville (XL710) NIC.
> * Prioritization is limited to:
> ** eth_type
> ** Fully specified 5-tuple src & dst ip and port numbers for UDP & TCP packets
> * ovs-appctl dpif-netdev/pmd-*-show o/p should indicate rxq prioritization.
> * any requirements for a more granular prioritization mechanism
> 

In general I like the idea of splitting priority traffic to a specific
queue but I have concerns about the implementation. I shared most of
these when we met already but adding here too. Not a detailed review.

- It is using deprecated DPDK filter API.
http://dpdk.org/doc/guides/rel_notes/deprecation.html

- It is an invasive change that seems to be for only one Intel NIC in
the DPDK datapath. Even then it is very limited as it only works when
that Intel NIC is using exactly 2 rx queues.

- It's a hardcoded opaque QoS which will have a negative impact on
whichever queues happen to land on the same pmd so it's unpredictable
which queues will be affected. It could affect other latency-sensitive
traffic that cannot be prioritized because of the limitations above.

- I guess multiple priority queues could land on the same pmd and starve
each other?

I think a more general, less restricted scheme using DPDK rte_flow API
with controls on the effects to other traffic is needed. Perhaps if a
user is very concerned with latency on traffic from a port, they would
be ok with dedicating a pmd to it.

thanks,
Kevin.

> Initial results:
> * even when userspace OVS is very much overloaded and
>   dropping significant numbers of packets the drop rate for prioritized 
> traffic
>   is running at 1/1000th of the drop rate for non-prioritized traffic.
> 
> * the latency profile of prioritized traffic through userspace OVS is also 
> much
>   improved
> 
> 1e0 |*
> |*
> 1e-1|* | Non-prioritized pkt latency
> |* * Prioritized pkt latency
> 1e-2|*
> |*
> 1e-3|*   |
> |*   |
> 1e-4|*   | | |
> |*   |*| |
> 1e-5|*   |*| | |
> |*   |*|*| |  |
> 1e-6|*   |*|*|*|  |
> |*   |*|*|*|* |
> 1e-7|*   |*|*|*|* |*
> |*   |*|*|*|* |*
> 1e-8|*   |*|*|*|* |*
>   0-1 1-20 20-40 40-50 50-60 60-70 ... 120-400
> Latency (us)
> 
>  Proportion of packets per latency bin @ 80% Max Throughput
>   (Log scale)
> 
> 
> Regards,
> Billy.
> 
> billy O'Mahony (4):
>   netdev: Add set_ingress_sched to netdev api
>   netdev-dpdk: Apply ingress_sched config to dpdk phy ports
>   dpif-netdev: Add rxq prioritization
>   docs: Document ingress scheduling feature
> 
>  Documentation/howto/dpdk.rst|  31 +++
>  include/openvswitch/ofp-parse.h |   3 +
>  lib/dpif-netdev.c   |  25 --
>  lib/netdev-bsd.c|   1 +
>  lib/netdev-dpdk.c   | 192 
> +++-
>  lib/netdev-dummy.c  |   1 +
>  lib/netdev-linux.c  |   1 +
>  lib/netdev-provider.h   |  10 +++
>  lib/netdev-vport.c  |   1 +
>  lib/netdev.c|  22 +
>  lib/netdev.h|   1 +
>  vswitchd/bridge.c   |   4 +
>  vswitchd/vswitch.xml|  31 +++
>  13 files changed, 315 insertions(+), 8 deletions(-)
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] checkpatch: Enforce bracing around conditionals.

2017-08-17 Thread Aaron Conole
Hi Joe,

Joe Stringer  writes:

> The coding style states that BSD-style brace placement should be used,
> and even single statements should be enclosed. Add checks to checkpatch
> for this.
>
> Signed-off-by: Joe Stringer 
> ---
>  utilities/checkpatch.py | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
> index 43f10bb3ded3..c8381c98403b 100755
> --- a/utilities/checkpatch.py
> +++ b/utilities/checkpatch.py
> @@ -95,6 +95,8 @@ __regex_ends_with_bracket = \
>  __regex_ptr_declaration_missing_whitespace = re.compile(r'[a-zA-Z0-9]\*[^*]')
>  __regex_is_comment_line = re.compile(r'^\s*(/\*|\*\s)')
>  __regex_trailing_operator = re.compile(r'^[^ ]* [^ ]*[?:]$')
> +__regex_conditional_else_bracing = re.compile(r'^\s*else\s*{?$')
> +__regex_conditional_else_bracing2 = re.compile(r'^\s*}\selse\s*$')
>  
>  skip_leading_whitespace_check = False
>  skip_trailing_whitespace_check = False
> @@ -212,6 +214,12 @@ def trailing_operator(line):
>  return __regex_trailing_operator.match(line) is not None
>  
>  
> +def if_else_bracing_check(line):
> +if __regex_conditional_else_bracing.match(line) is not None:
> +return True
> +return __regex_conditional_else_bracing2.match(line) is not None
> +
> +

Would it make more sense to put this in with the other bracing checks in
if_and_for_end_with_bracket_check ?

>  checks = [
>  {'regex': None,
>   'match_name':
> @@ -250,6 +258,11 @@ checks = [
>   'check': lambda x: trailing_operator(x),
>   'print':
>   lambda: print_error("Line has '?' or ':' operator at end of line")},
> +
> +{'regex': '(.c|.h)(.in)?$', 'match_name': None,
> + 'prereq': lambda x: not is_comment_line(x),
> + 'check': lambda x: if_else_bracing_check(x),
> + 'print': lambda: print_error("Improper bracing in conditional 
> statement")}
>  ]


[ovs-dev] [PATCH v2 4/4] docs: Document ingress scheduling feature

2017-08-17 Thread Billy O'Mahony
Signed-off-by: Billy O'Mahony 
---
 Documentation/howto/dpdk.rst | 31 +++
 vswitchd/vswitch.xml | 31 +++
 2 files changed, 62 insertions(+)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index d7f6610..c7bfc01 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -188,6 +188,37 @@ respective parameter. To disable the flow control at tx 
side, run::
 
 $ ovs-vsctl set Interface dpdk-p0 options:tx-flow-ctrl=false
 
+Ingress Scheduling
+------------------
+
+The ingress scheduling feature is described in general in
+``ovs-vswitchd.conf.db (5)``.
+
+Interfaces of type ``dpdk`` support ingress scheduling only for
+either ether_type or else a fully specified combination of src and
+dst ip address and port numbers for TCP or UDP packets.
+
+To prioritize packets for Precision Time Protocol:
+
+$ ovs-vsctl set Interface dpdk-p0 \
+other_config:ingress_sched=eth_type=0x88F7
+
+To prioritize UDP packets between specific IP source and destination:
+
+$ ovs-vsctl set Interface dpdk-p0 \
+other_config:ingress_sched=udp,ip_src=1.1.1.1,ip_dst=2.2.2.2,\
+udp_src=11,udp_dst=22
+
+If unsupported ingress scheduling configuration is specified or it cannot be
+applied for any reason, a warning message is logged and the Interface operates
+as if no ingress scheduling was configured.
+
+Interfaces of type ``dpdkvhostuserclient``, ``dpdkr`` and ``dpdkvhostuser`` do
+not support ingress scheduling.
+
+Currently only the match fields listed above are supported. No wildcarding of
+fields is supported.
+
 pdump
 -----
 
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 074535b..4564536 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -2923,6 +2923,37 @@
   
 
 
+
+  
+   Packets matching the ingress_sched value are prioritized. This means
+   some combination of:
+  
+  
+
+ prioritized packets are forwarded to their destination port before
+ non-prioritized
+
+
+ prioritized packets are less likely to be dropped in an overloaded
+ situation than non-prioritized packets
+
+  
+  
+   Ingress scheduling is supported with the best effort of the Interface.
+   It may be dependent on the interface type and its supporting
+   implementation devices. Different interface types may have different
+   levels of support for the feature and the same interface type attached
+   to different devices (physical NICs or vhost ports, device driver,
+   NIC model) may also offer different levels of support.
+  
+  
+
+ The format of the ingress_sched field is specified in ovs-fields(7) in
+ the ``Matching'' and ``FIELD REFERENCE'' sections.
+
+  
+
+
 
   
 BFD, defined in RFC 5880 and RFC 5881, allows point-to-point
-- 
2.7.4



[ovs-dev] [PATCH v2 3/4] dpif-netdev: Add rxq prioritization

2017-08-17 Thread Billy O'Mahony
If an rxq is marked as 'prioritized' then keep reading from this queue until
there are no packets available. Only then proceed to other queues.

Signed-off-by: Billy O'Mahony 
---
 lib/dpif-netdev.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 9ce3456..d9f014d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -488,6 +488,7 @@ struct dp_netdev_pmd_cycles {
 struct polled_queue {
 struct netdev_rxq *rx;
 odp_port_t port_no;
+uint8_t is_priority;
 };
 
 /* Contained by struct dp_netdev_pmd_thread's 'poll_list' member. */
@@ -3801,6 +3802,8 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread 
*pmd,
 HMAP_FOR_EACH (poll, node, >poll_list) {
 poll_list[i].rx = poll->rxq->rx;
 poll_list[i].port_no = poll->rxq->port->port_no;
+poll_list[i].is_priority = \
+(poll->rxq->rx->queue_id == poll->rxq->rx->netdev->priority_rxq);
 i++;
 }
 
@@ -3849,15 +3852,24 @@ reload:
 lc = UINT_MAX;
 }
 
+unsigned int log_cnt = 0;
+int streak_len;
+const unsigned int MAX_STREAK_LEN = 100;
 cycles_count_start(pmd);
 for (;;) {
+log_cnt++;
 for (i = 0; i < poll_cnt; i++) {
-process_packets =
-dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
-   poll_list[i].port_no);
-cycles_count_intermediate(pmd,
-  process_packets ? PMD_CYCLES_PROCESSING
-  : PMD_CYCLES_IDLE);
+streak_len = 0;
+do {
+process_packets =
+dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
+   poll_list[i].port_no);
+cycles_count_intermediate(pmd,
+   process_packets ? PMD_CYCLES_PROCESSING
+   : PMD_CYCLES_IDLE);
+streak_len++;
+} while (process_packets && poll_list[i].is_priority &&
+ streak_len < MAX_STREAK_LEN);
 }
 
 if (lc++ > 1024) {
-- 
2.7.4
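The do/while added above can be summarized with a toy Python model (not the real C): a queue flagged is_priority is re-polled until it is empty, bounded by MAX_STREAK_LEN bursts, before the PMD moves on to the next queue in its poll list. Queue names here are hypothetical.

```python
from collections import deque

MAX_STREAK_LEN = 100  # same bound as the patch

def run_poll_pass(queues, poll_list):
    """poll_list: (name, is_priority) tuples; the loop mirrors the patch's do/while."""
    for name, is_priority in poll_list:
        streak_len = 0
        while True:
            q = queues[name]
            process_packets = len(q) > 0
            if process_packets:
                q.popleft()                      # "process" one burst
            streak_len += 1
            if not (process_packets and is_priority
                    and streak_len < MAX_STREAK_LEN):
                break

queues = {"p0rxq1": deque(range(5)), "p0rxq0": deque(range(5))}
run_poll_pass(queues, [("p0rxq1", True), ("p0rxq0", False)])
# The priority queue is fully drained in one pass; the other gets one burst.
print(len(queues["p0rxq1"]), len(queues["p0rxq0"]))  # 0 4
```

This also makes Kevin's starvation concern concrete: with a small MAX_STREAK_LEN the non-priority queue is still serviced every pass; without the bound, a saturated priority queue would be drained indefinitely.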



[ovs-dev] [PATCH v2 2/4] netdev-dpdk: Apply ingress_sched config to dpdk phy ports

2017-08-17 Thread Billy O'Mahony
Ingress scheduling configuration is given effect by way of Flow Director
filters. A small subset of the possible ingress scheduling is
implemented in this patch.

Signed-off-by: Billy O'Mahony 
---
 include/openvswitch/ofp-parse.h |   3 +
 lib/dpif-netdev.c   |   1 +
 lib/netdev-dpdk.c   | 181 ++--
 vswitchd/bridge.c   |   2 +
 4 files changed, 180 insertions(+), 7 deletions(-)

diff --git a/include/openvswitch/ofp-parse.h b/include/openvswitch/ofp-parse.h
index 013a8f3..1991694 100644
--- a/include/openvswitch/ofp-parse.h
+++ b/include/openvswitch/ofp-parse.h
@@ -41,6 +41,9 @@ struct ofputil_table_mod;
 struct ofputil_bundle_msg;
 struct ofputil_tlv_table_mod;
 struct simap;
+struct tun_table;
+struct flow_wildcards;
+struct ofputil_port_map;
 enum ofputil_protocol;
 
 char *parse_ofp_str(struct ofputil_flow_mod *, int command, const char *str_,
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e2cd931..9ce3456 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -44,6 +44,7 @@
 #include "dp-packet.h"
 #include "dpif.h"
 #include "dpif-provider.h"
+#include "netdev-provider.h"
 #include "dummy.h"
 #include "fat-rwlock.h"
 #include "flow.h"
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 1ffedd4..d9aab4f 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -49,6 +49,8 @@
 #include "openvswitch/list.h"
 #include "openvswitch/ofp-print.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
+#include "openvswitch/ofp-util.h"
 #include "ovs-numa.h"
 #include "ovs-thread.h"
 #include "ovs-rcu.h"
@@ -175,6 +177,10 @@ static const struct rte_eth_conf port_conf = {
 .txmode = {
 .mq_mode = ETH_MQ_TX_NONE,
 },
+.fdir_conf = {
+.mode = RTE_FDIR_MODE_PERFECT,
+},
+
 };
 
 /*
@@ -351,6 +357,11 @@ enum dpdk_hw_ol_features {
 NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
 };
 
+union ingress_filter {
+struct rte_eth_ethertype_filter ethertype;
+struct rte_eth_fdir_filter fdir;
+};
+
 struct netdev_dpdk {
 struct netdev up;
 dpdk_port_t port_id;
@@ -390,8 +401,11 @@ struct netdev_dpdk {
 /* If true, device was attached by rte_eth_dev_attach(). */
 bool attached;
 
-/* Ingress Scheduling config */
+/* Ingress Scheduling config & state. */
 char *ingress_sched_str;
+bool ingress_sched_changed;
+enum rte_filter_type ingress_filter_type;
+union ingress_filter ingress_filter;
 
 /* In dpdk_list. */
 struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
@@ -674,6 +688,22 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
n_rxq, int n_txq)
 int i;
 struct rte_eth_conf conf = port_conf;
 
+/* Ingress scheduling uses an extra RX queue reserved for prioritized
+   frames but using RSS will 'pollute' that queue by distributing
+   non-priority packets onto it.  Therefore RSS is not compatible with
+   ingress scheduling.  Also requesting anything other than two queues
+   with ingress scheduling is wasteful as without RSS only two queues are
+   required.  Rather than force n_rxq to two here and override the
+   configured value it's less surprising for the user to issue a warning
+   (see dpdk_apply_ingress_scheduling()) and not enable ingress scheduling.
+*/
+if (dev->ingress_sched_str && n_rxq == 2) {
+conf.rxmode.mq_mode = ETH_MQ_RX_NONE;
+}
+else {
+conf.rxmode.mq_mode = ETH_MQ_RX_RSS;
+}
+
 /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly
  * enabled. */
 if (dev->mtu > ETHER_MTU) {
@@ -757,6 +787,128 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_apply_ingress_scheduling(struct netdev_dpdk *dev, int n_rxq)
+{
+if (!dev->ingress_sched_str) {
+return;
+}
+
+/* See dpdk_eth_dev_queue_setup for n_rxq requirement */
+if (n_rxq != 2) {
+VLOG_ERR("Interface %s: Ingress scheduling config ignored; " \
+ "Requires n_rxq==2.", dev->up.name);
+return;
+}
+
+int priority_q_id = n_rxq-1;
+char *key, *val, *str, *iter;
+
+ovs_be32 ip_src, ip_dst;
+ip_src = ip_dst = 0;
+
+uint16_t eth_type, port_src, port_dst;
+eth_type = port_src = port_dst = 0;
+uint8_t ip_proto = 0;
+int diag = 0;
+
+/* delete any existing filter */
+if (dev->ingress_filter_type == RTE_ETH_FILTER_FDIR) {
+diag = rte_eth_dev_filter_ctrl(dev->port_id, RTE_ETH_FILTER_FDIR,
+RTE_ETH_FILTER_DELETE, >ingress_filter.fdir);
+} else if (dev->ingress_filter_type == RTE_ETH_FILTER_ETHERTYPE) {
+diag = rte_eth_dev_filter_ctrl(dev->port_id, RTE_ETH_FILTER_ETHERTYPE,
+RTE_ETH_FILTER_DELETE, >ingress_filter.ethertype);
+}
+
+/* str_to_x on error returns malloc'd str we'll need to free */
+char *mallocd_str;
+/* Parse the 
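The message is truncated mid-parse above, but given the ingress_sched strings shown in the companion docs patch ("udp,ip_src=1.1.1.1,ip_dst=2.2.2.2,udp_src=11,udp_dst=22"), the tokenizing step presumably works along these lines. This is a rough illustration with field names taken from the docs; the parsing details are an assumption, not the real implementation:

```python
def parse_ingress_sched(s):
    """Split a comma-separated ingress_sched string into a field dict."""
    fields = {}
    for tok in s.split(","):
        if "=" in tok:
            key, val = tok.split("=", 1)
            fields[key] = val
        else:
            fields["ip_proto"] = tok   # bare token such as "udp" or "tcp"
    return fields

cfg = parse_ingress_sched("udp,ip_src=1.1.1.1,ip_dst=2.2.2.2,udp_src=11,udp_dst=22")
print(cfg["ip_proto"], cfg["ip_src"], cfg["udp_dst"])   # udp 1.1.1.1 22
```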

[ovs-dev] [PATCH v2 1/4] netdev: Add set_ingress_sched to netdev api

2017-08-17 Thread Billy O'Mahony
Passes ingress_sched config item from other_config column of Interface
table to the netdev.

Signed-off-by: Billy O'Mahony 
---
 lib/netdev-bsd.c  |  1 +
 lib/netdev-dpdk.c | 21 +
 lib/netdev-dummy.c|  1 +
 lib/netdev-linux.c|  1 +
 lib/netdev-provider.h | 10 ++
 lib/netdev-vport.c|  1 +
 lib/netdev.c  | 22 ++
 lib/netdev.h  |  1 +
 vswitchd/bridge.c |  2 ++
 9 files changed, 60 insertions(+)

diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
index 8a4cdb3..3943b7b 100644
--- a/lib/netdev-bsd.c
+++ b/lib/netdev-bsd.c
@@ -1506,6 +1506,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum 
netdev_flags off,
 netdev_bsd_get_etheraddr,\
 netdev_bsd_get_mtu,  \
 NULL, /* set_mtu */  \
+NULL, /* set_ingress_sched */\
 netdev_bsd_get_ifindex,  \
 netdev_bsd_get_carrier,  \
 NULL, /* get_carrier_resets */   \
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 1aaf6f7..1ffedd4 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -390,6 +390,9 @@ struct netdev_dpdk {
 /* If true, device was attached by rte_eth_dev_attach(). */
 bool attached;
 
+/* Ingress Scheduling config */
+char *ingress_sched_str;
+
 /* In dpdk_list. */
 struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
 
@@ -1081,6 +1084,9 @@ netdev_dpdk_destruct(struct netdev *netdev)
 }
 
 free(dev->devargs);
+if (dev->ingress_sched_str) {
+free(dev->ingress_sched_str);
+}
 common_destruct(dev);
 
 ovs_mutex_unlock(_mutex);
@@ -2004,6 +2010,20 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
 }
 
 static int
+netdev_dpdk_set_ingress_sched(struct netdev *netdev,
+  const char *ingress_sched_str)
+{
+struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+free(dev->ingress_sched_str);
+if (ingress_sched_str) {
+dev->ingress_sched_str = xstrdup(ingress_sched_str);
+}
+
+return 0;
+}
+
+static int
 netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
 
 static int
@@ -3342,6 +3362,7 @@ unlock:
 netdev_dpdk_get_etheraddr,\
 netdev_dpdk_get_mtu,  \
 netdev_dpdk_set_mtu,  \
+netdev_dpdk_set_ingress_sched,\
 netdev_dpdk_get_ifindex,  \
 GET_CARRIER,  \
 netdev_dpdk_get_carrier_resets,   \
diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index f731af1..0c36d2d 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -1378,6 +1378,7 @@ netdev_dummy_update_flags(struct netdev *netdev_,
 netdev_dummy_get_etheraddr, \
 netdev_dummy_get_mtu,   \
 netdev_dummy_set_mtu,   \
+NULL,   /* set_ingress_sched */ \
 netdev_dummy_get_ifindex,   \
 NULL,   /* get_carrier */   \
 NULL,   /* get_carrier_resets */\
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 2ff3e2b..00cbe17 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -2847,6 +2847,7 @@ netdev_linux_update_flags(struct netdev *netdev_, enum 
netdev_flags off,
 netdev_linux_get_etheraddr, \
 netdev_linux_get_mtu,   \
 netdev_linux_set_mtu,   \
+NULL,   /* set_ingress_sched */ \
 netdev_linux_get_ifindex,   \
 netdev_linux_get_carrier,   \
 netdev_linux_get_carrier_resets,\
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index b3c57d5..6cbd066 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -34,6 +34,7 @@ extern "C" {
 
 struct netdev_tnl_build_header_params;
 #define NETDEV_NUMA_UNSPEC OVS_NUMA_UNSPEC
+#define NETDEV_PRIO_RXQ_UNSPEC (-1)
 
 /* A network device (e.g. an Ethernet device).
  *
@@ -76,6 +77,7 @@ struct netdev {
  * modify them. */
 int n_txq;
 int n_rxq;
+int priority_rxq;   /* id of prioritized rxq. -1 = None */
 int ref_cnt;/* Times this devices was opened. */
 struct shash_node *node;/* Pointer to element in global map. */
 struct ovs_list saved_flags_list; /* Contains "struct netdev_saved_flags". 
*/
@@ -412,6 +414,14 @@ struct netdev_class {
  * null if it would always 

[ovs-dev] [PATCH 0/4] prioritizing latency sensitive traffic

2017-08-17 Thread Billy O'Mahony
Hi All,

v2: Addresses various review comments; Applies cleanly on 0bedb3d6.

This patch set provides a method to request ingress scheduling on interfaces.
It also provides an implementation of the same for DPDK physical ports.

This allows specific packet types to be:
* forwarded to their destination port ahead of other packets.
and/or
* be less likely to be dropped in an overloaded situation.

It was previously discussed
https://mail.openvswitch.org/pipermail/ovs-discuss/2017-May/044395.html
and RFC'd
https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335237.html

Limitations of this patch:
* The patch uses the Flow Director filter API in DPDK and has only been tested
 on Fortville (XL710) NIC.
* Prioritization is limited to:
** eth_type
** Fully specified 5-tuple src & dst ip and port numbers for UDP & TCP packets
* ovs-appctl dpif-netdev/pmd-*-show o/p should indicate rxq prioritization.
* any requirements for a more granular prioritization mechanism

Initial results:
* even when userspace OVS is very much overloaded and
  dropping significant numbers of packets the drop rate for prioritized traffic
  is running at 1/1000th of the drop rate for non-prioritized traffic.

* the latency profile of prioritized traffic through userspace OVS is also much
  improved

1e0 |*
|*
1e-1|* | Non-prioritized pkt latency
|* * Prioritized pkt latency
1e-2|*
|*
1e-3|*   |
|*   |
1e-4|*   | | |
|*   |*| |
1e-5|*   |*| | |
|*   |*|*| |  |
1e-6|*   |*|*|*|  |
|*   |*|*|*|* |
1e-7|*   |*|*|*|* |*
|*   |*|*|*|* |*
1e-8|*   |*|*|*|* |*
  0-1 1-20 20-40 40-50 50-60 60-70 ... 120-400
Latency (us)

 Proportion of packets per latency bin @ 80% Max Throughput
  (Log scale)


Regards,
Billy.

billy O'Mahony (4):
  netdev: Add set_ingress_sched to netdev api
  netdev-dpdk: Apply ingress_sched config to dpdk phy ports
  dpif-netdev: Add rxq prioritization
  docs: Document ingress scheduling feature

 Documentation/howto/dpdk.rst|  31 +++
 include/openvswitch/ofp-parse.h |   3 +
 lib/dpif-netdev.c   |  25 --
 lib/netdev-bsd.c|   1 +
 lib/netdev-dpdk.c   | 192 +++-
 lib/netdev-dummy.c  |   1 +
 lib/netdev-linux.c  |   1 +
 lib/netdev-provider.h   |  10 +++
 lib/netdev-vport.c  |   1 +
 lib/netdev.c|  22 +
 lib/netdev.h|   1 +
 vswitchd/bridge.c   |   4 +
 vswitchd/vswitch.xml|  31 +++
 13 files changed, 315 insertions(+), 8 deletions(-)

-- 
2.7.4



Re: [ovs-dev] [PATCH v3] openvswitch: enable NSH support

2017-08-17 Thread Eric Garver
On Thu, Aug 17, 2017 at 07:49:41AM +0800, Yang, Yi wrote:
> On Wed, Aug 16, 2017 at 11:15:28PM +0800, Eric Garver wrote:
> > On Wed, Aug 16, 2017 at 01:35:30PM +0800, Yi Yang wrote:
> > > +
> > > +#define NSH_DST_PORT4790 /* UDP Port for NSH on VXLAN. */
> > > +#define ETH_P_NSH   0x894F   /* Ethertype for NSH. */
> > 
> > ETH_P_NSH probably belongs in include/uapi/linux/if_ether.h with all the
> > other ETH_P_* defines.
> > 
> 
> Ok, I'll move it to include/uapi/linux/if_ether.h, but in userspace, we
> still need to keep it in nsh.h.
> 
> > >  
> > > +struct ovs_key_nsh {
> > > + __u8 flags;
> > > + __u8 mdtype;
> > > + __u8 np;
> > > + __u8 pad;
> > > + __be32 path_hdr;
> > > + __be32 context[NSH_MD1_CONTEXT_SIZE];
> > > +};
> > > +
> > >  struct sw_flow_key {
> > >   u8 tun_opts[IP_TUNNEL_OPTS_MAX];
> > >   u8 tun_opts_len;
> > > @@ -144,6 +154,7 @@ struct sw_flow_key {
> > >   };
> > >   } ipv6;
> > >   };
> > > + struct ovs_key_nsh nsh; /* network service header */
> > 
> > Are you intentionally not reserving space in the flow key for
> > OVS_NSH_KEY_ATTR_MD2? I know it's not supported yet, but much of the
> > code is stubbed out for it - just making sure this wasn't an oversight.
> > 
> 
> For MD type 2, we'll reuse tun_metedata keys in struct flow_tnl which
> will be reworked and it will be shared by NSH and GENEVE, so we won't
> have new keys in "struct ovs_key_nsh" for MD type 2.

Be careful here. VXLAN also uses tun_metadata for GBP. VXLAN-GPE (+ NSH)
and VXLAN-GBP are mutually exclusive AFAICS, but you should verify it
all behaves as expected.

[..]
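As context for the path_hdr field in struct ovs_key_nsh discussed above: the NSH service path header packs a 24-bit Service Path Identifier (SPI) and an 8-bit Service Index (SI) into one 32-bit word (per the NSH specification). A small sketch of that packing:

```python
NSH_SPI_MASK = 0xFFFFFF00
NSH_SI_MASK = 0x000000FF

def nsh_path_hdr(spi, si):
    """Pack a 24-bit SPI and 8-bit SI into the 32-bit path header word."""
    assert 0 <= spi < (1 << 24) and 0 <= si < (1 << 8)
    return (spi << 8) | si

def nsh_path_spi(path_hdr):
    return (path_hdr & NSH_SPI_MASK) >> 8

def nsh_path_si(path_hdr):
    return path_hdr & NSH_SI_MASK

hdr = nsh_path_hdr(spi=0x123456, si=0xFF)
print(hex(hdr))  # 0x123456ff
```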


Re: [ovs-dev] [patch_v4 2/2] dpif-netdev: Refactor some pmd stats.

2017-08-17 Thread Jan Scheurich
Hi Darrell,

Please find my specific comments below.

I have a general concern for the pmd-stats-show counters: The output refers to 
the number of processed packets (e.g. processing cycles/pkt etc). But the data 
actually presented is the total number of *passes of packets through the 
datapath*. 

In the early OVS days there was little difference, but with an increasing number 
of recirculations (for tunnel decap, pop_mpls, bond selection, conntrack), 
nowadays each packet passes the datapath several times and the output becomes 
confusing for users, especially as it does not contain any indication of the 
average number of datapath passes per packet.

I would strongly suggest that we add a new PMD stats counter for the actually 
received packets, which allows us to calculate and include the average number 
of passes per packet and true average processing cost per pkt. 

What do you think?

BR, Jan

> The per packets stats are presently overlapping in that miss stats include
> lost stats; make these stats non-overlapping for clarity and make this clear
> in the dp_stat_type enum.  This also eliminates the need to increment two
> 'miss' stats for a single packet.
> 
> The subtable lookup stats is renamed to make it clear that it relates to
> masked lookups.
> The stats that total to the number of packets seen are defined in
> dp_stat_type and an api is created to total the stats in case these stats are
> further divided in the future.
> 
> The pmd stats test is enhanced to include megaflow stats counting and
> checking.
> Also, miss and lost stats are annotated to make it clear what they mean.
> 
> Signed-off-by: Darrell Ball 
> ---
>  lib/dpif-netdev.c | 78 ++--
> ---
>  tests/pmd.at  | 31 +-
>  2 files changed, 74 insertions(+), 35 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 17e1666..dfc6684
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -323,12 +323,19 @@ static struct dp_netdev_port
> *dp_netdev_lookup_port(const struct dp_netdev *dp,
>  OVS_REQUIRES(dp->port_mutex);
> 
>  enum dp_stat_type {
> -DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc).
> */
> -DP_STAT_MASKED_HIT, /* Packets that matched in the flow table.
> */
> -DP_STAT_MISS,   /* Packets that did not match. */
> -DP_STAT_LOST,   /* Packets not passed up to the client. */
> -DP_STAT_LOOKUP_HIT, /* Number of subtable lookups for flow
> table
> -   hits */
> +DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
> +DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
> +DP_STAT_MISS,   /* Packets that did not match and upcall was
> +   done. */
> +DP_STAT_LOST,   /* Packets that did not match and upcall was
> +   not done. */

If I understand the implementation correctly, this counts packets for which an 
upcall was done but failed so that the packet was dropped. There are no packets 
with a miss for which we currently do not do an upcall. Why do you suggest to 
change the meaning of the counter?

> +DP_N_PER_PKT_CNT,   /* The above statistics account for the total
> +   number of packets seen and should not be
> +   overlapping with each other. */

I find this definition very obscure. I think it would be much clearer (and less 
likely to break again in the future) to explicitly sum up the above four 
non-overlapping counters to obtain the total number of packets as suggested by 
Ilya. It would be good to keep the comment, though.

> +DP_STAT_MASKED_LOOKUP = DP_N_PER_PKT_CNT,  /* Number of
> subtable lookups
> +  for flow table hits. Each
> +  MASKED_HIT hit will have
> +  >= 1 MASKED_LOOKUP(s). */
>  DP_N_STATS
>  };

+1 for the clarification of the comment.

> +static unsigned long long
> +dp_netdev_calcul_total_packets(unsigned long long *stats) {
> +unsigned long long total_packets = 0;
> +for (uint8_t i = 0; i < DP_N_PER_PKT_CNT; i++) {
> +total_packets += stats[i];
> +}
> +return total_packets;
> +}

Please compute the sum explicitly. Much clearer and better maintainable.
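The explicit sum the reviewer asks for might look like the sketch below. The enum is reproduced minimally (with the counter names from the quoted diff) so the snippet stands alone; this is an illustration of the suggestion, not the actual patch code.

```c
#include <assert.h>

/* Per-packet counters from the quoted diff, reproduced minimally. */
enum dp_stat_type {
    DP_STAT_EXACT_HIT,   /* Packets that had an exact match (EMC). */
    DP_STAT_MASKED_HIT,  /* Packets that matched in the flow table. */
    DP_STAT_MISS,        /* Packets that missed; upcall was done. */
    DP_STAT_LOST,        /* Packets that missed; upcall not done/failed. */
    DP_N_PER_PKT_CNT,    /* Number of per-packet counters above. */
};

/* Explicit, self-documenting sum of the four non-overlapping per-packet
 * counters, as the reviewer suggests instead of a loop bounded by
 * DP_N_PER_PKT_CNT. Adding a fifth per-packet counter then forces the
 * author to update this sum consciously. */
static unsigned long long
total_packets(const unsigned long long *stats)
{
    return stats[DP_STAT_EXACT_HIT] + stats[DP_STAT_MASKED_HIT]
           + stats[DP_STAT_MISS] + stats[DP_STAT_LOST];
}
```

The trade-off is maintainability by explicitness (the reviewer's preference) versus automatic extensibility (the loop in the patch).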

>  ds_put_format(reply,
>"\temc hits:%llu\n\tmegaflow hits:%llu\n"
> -  "\tavg. subtable lookups per hit:%.2f\n"
> -  "\tmiss:%llu\n\tlost:%llu\n",
> +  "\tavg. subtable lookups per megaflow hit:%.2f\n"
> +  "\tmiss(upcall done):%llu\n\tlost(upcall not
> + done):%llu\n",
>stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
>stats[DP_STAT_MASKED_HIT] > 0
> -  ?
> 

Re: [ovs-dev] [PATCH 2/4] netdev-dpdk: Apply ingress_sched config to dpdk phy ports

2017-08-17 Thread O Mahony, Billy
Hi Mark,

Thanks for the very useful review comments.

I'll send a rev'd patch set shortly.

/Billy

> -Original Message-
> From: Kavanagh, Mark B
> Sent: Friday, August 4, 2017 4:14 PM
> To: O Mahony, Billy ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 2/4] netdev-dpdk: Apply ingress_sched config
> to dpdk phy ports
> 
> >From: ovs-dev-boun...@openvswitch.org
> >[mailto:ovs-dev-boun...@openvswitch.org]
> >On Behalf Of Billy O'Mahony
> >Sent: Thursday, July 20, 2017 5:11 PM
> >To: d...@openvswitch.org
> >Subject: [ovs-dev] [PATCH 2/4] netdev-dpdk: Apply ingress_sched config
> >to dpdk phy ports
> >
> >Ingress scheduling configuration is given effect by way of Flow
> >Director filters. A small subset of the ingress scheduling possible is
> >implemented in this patch.
> >
> >Signed-off-by: Billy O'Mahony 
> 
> Hi Billy,
> 
> As a general comment, this patch doesn't apply to HEAD of master, so please
> rebase as part of rework.
> 
> Review comments inline.
> 
> Thanks,
> Mark
> 
> 
> >---
> > include/openvswitch/ofp-parse.h |   3 +
> > lib/dpif-netdev.c   |   1 +
> > lib/netdev-dpdk.c   | 167
> ++-
> >-
> > vswitchd/bridge.c   |   2 +
> > 4 files changed, 166 insertions(+), 7 deletions(-)
> >
> >diff --git a/include/openvswitch/ofp-parse.h
> >b/include/openvswitch/ofp-parse.h index fc5784e..08d6086 100644
> >--- a/include/openvswitch/ofp-parse.h
> >+++ b/include/openvswitch/ofp-parse.h
> >@@ -37,6 +37,9 @@ struct ofputil_table_mod;  struct ofputil_bundle_msg;
> >struct ofputil_tlv_table_mod;  struct simap;
> >+struct tun_table;
> >+struct flow_wildcards;
> >+struct ofputil_port_map;
> > enum ofputil_protocol;
> >
> > char *parse_ofp_str(struct ofputil_flow_mod *, int command, const char
> >*str_, diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> >47a9fa0..d35566f 100644
> >--- a/lib/dpif-netdev.c
> >+++ b/lib/dpif-netdev.c
> >@@ -44,6 +44,7 @@
> > #include "dp-packet.h"
> > #include "dpif.h"
> > #include "dpif-provider.h"
> >+#include "netdev-provider.h"
> 
> If a setter function for modifying netdev->ingress_sched_str is
> implemented, there is no need to include netdev-provider.h
> 
> > #include "dummy.h"
> > #include "fat-rwlock.h"
> > #include "flow.h"
> >diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> >e74c50f..e393abf 100644
> >--- a/lib/netdev-dpdk.c
> >+++ b/lib/netdev-dpdk.c
> >@@ -33,6 +33,8 @@
> > #include 
> > #include 
> >
> >+#include 
> >+#include 
> 
> Move these include below, with the other openvswitch include file.
> 
> > #include "dirs.h"
> > #include "dp-packet.h"
> > #include "dpdk.h"
> >@@ -169,6 +171,10 @@ static const struct rte_eth_conf port_conf = {
> > .txmode = {
> > .mq_mode = ETH_MQ_TX_NONE,
> > },
> >+.fdir_conf = {
> >+.mode = RTE_FDIR_MODE_PERFECT,
> 
> As you mentioned in your cover letter, you've only tested on a Fortville NIC.
> How widely supported are the Flow Director features across DPDK-
> supported NICs?
[[BO'M]] I'm not sure. Probably many NICs have the capabilities required - but
I don't know if the DPDK drivers expose them.
> 
> >+},
> >+
> > };
> >
> > enum { DPDK_RING_SIZE = 256 };
> >@@ -330,6 +336,11 @@ enum dpdk_hw_ol_features {
> > NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,  };
> >
> >+union ingress_filter {
> >+struct rte_eth_ethertype_filter ethertype;
> >+struct rte_eth_fdir_filter fdir;
> >+};
> >+
> > struct netdev_dpdk {
> > struct netdev up;
> > dpdk_port_t port_id;
> >@@ -369,8 +380,11 @@ struct netdev_dpdk {
> > /* If true, device was attached by rte_eth_dev_attach(). */
> > bool attached;
> >
> >-/* Ingress Scheduling config */
> >+/* Ingress Scheduling config & state. */
> > char *ingress_sched_str;
> >+bool ingress_sched_changed;
> >+enum rte_filter_type ingress_filter_type;
> >+union ingress_filter ingress_filter;
> >
> > /* In dpdk_list. */
> > struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex); @@ -653,6
> >+667,15 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int
> n_rxq,
> >int n_txq)
> > int i;
> > struct rte_eth_conf conf = port_conf;
> >
> >+/* Ingress scheduling requires ETH_MQ_RX_NONE so limit it to when
> exactly
> >+ * two rxqs are defined. Otherwise MQ will not work as expected.
> >+ */
> 
> So if a user later enables multiq (i.e. > 2 rxqs), that implicitly disables 
> ingress
> scheduling?
[[BO'M]] Yes. RSS and ingress scheduling are incompatible - at least I couldn't
figure out how to make them compatible. It could be set up so that RSS would be
disabled, but that would be favoring the less common new use case over the
existing common use case.
> 
> In any event, the comment here needs to be much clearer, as this is the first
> mention within the code that ingress scheduling is limited to just 2 rxqs. Is
> that limitation just for the PoC?
[[BO'M]] Unless I can 



Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for recirc packets

2017-08-17 Thread Jan Scheurich
Hi Antonio,

> > Is there a reason to assume that a deterministic selection on some non-
> random
> > criteria like the recirculation count will on average (over deployments
> and
> > applications) give a better performance than a random selection?
> 
> [Antonio]
> If we consider latency and jitter a deterministic solution should be
> more preferable than a solution which behaves differently depending
> on the particular values of the packet fields, eg the IP addresses.

Do you have measurements showing that latency is significantly affected 
by EMC hit vs DPCLS hit?  I wouldn't think so.  Only throughput should vary.

Probabilistic EMC lookup should only apply in situations where the EMC is
overloaded, meaning we have thousands of packet flows. In this case we
maximize the aggregate throughput of the statistical flow mix. But it is not
the case that a flow using the EMC would see higher throughput than analogous
flows that don't.

> > I don't believe so. For example, the number of "EMC flows" in each pass
> through
> > the datapath can differ hugely: 1 GRE tunnel flow in first pass (from phy
> > port), 100K tenant flows after tunnel decapsulation. Or 100K tenant flows
> in
> > first pass (from VM) but 1 flow after NSH encapsulation in second pass.
> 
> [Antonio]
> Maybe I'm wrong but shouldn't the different flows encapped in a GRE
> tunnel hit the EMC in different locations? Because even if they all have the
> same outer IP addresses, they differ in the L4 ports so the 5-tuple hash
> - and the emc locations - should vary. Same thing for NSH encapsulation?

Neither GRE nor NSH packets have L4 ports for RSS hashing. GRE is a separate
IP protocol (not UDP). All packets of a GRE tunnel share the same pair of IP
addresses. NSH is even a non-IP protocol.
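To illustrate why all tunnel traffic collapses onto a single hash: with no L4 ports in the outer header, the hash input is identical for every packet of the tunnel, regardless of the inner tenant flow. The hash below is a stand-in (an FNV-style mix over a 3-tuple), not the NIC's actual Toeplitz RSS hash.

```c
#include <stdint.h>

/* Illustrative 3-tuple hash over the outer header fields available for a
 * GRE packet (src IP, dst IP, IP protocol). Because no L4 ports exist in
 * the outer header, every inner tenant flow of the same tunnel produces
 * the same hash value -- i.e. one EMC bucket for the whole tunnel. */
static uint32_t
outer_hdr_hash(uint32_t ip_src, uint32_t ip_dst, uint8_t proto)
{
    uint32_t h = 2166136261u;           /* FNV-1a offset basis. */
    h = (h ^ ip_src) * 16777619u;       /* FNV-1a prime mix per field. */
    h = (h ^ ip_dst) * 16777619u;
    h = (h ^ proto) * 16777619u;
    return h;
}
```

So in the GRE -> VM direction, the first pass sees effectively one "flow" no matter how many tenant flows are carried inside.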

> > I believe a random selection with dynamically adapted probability is the
> best
> > we can do without a priori knowledge about the traffic patterns and
> pipeline
> > organization.
> 
> [Antonio]
> This proposal is orthogonal to other approaches that look at the usage
> of the single locations, eg policies not to overwrite active locations or to
> reduce in general the emc usage.
> I think we should consider both the two strategies to tackle two different
> aspects of the thrashing and use emc more efficiently:
>  1. skip emc lookup/insert for recirc packets (which is only activated when
>emc entries exceeds EMC_RECIRCT_NO_INSERT_THRESHOLD);
>  2. any other strategy that limits emc usage or offers a better entries
> eviction.
> 
> So - being agnostic of what's the traffic type - if we have 100k flows
> that could potentially be recirculated:
>  1. allows to tackle the thrashing due to recirculation, which is activated
> when the emc entries exceeds a threshold.
>  2. allows to limit the emc usage to fewer flows because we don't want
> 100k flows to hit emc.

First of all: we only discuss limiting EMC lookups in the case of EMC overload.
I still don't think that it is a good idea to generally skip EMC lookups for
recirculated flows in that situation. It may be the right thing to do in some
scenarios (e.g. GRE -> VM), but exactly the wrong one in others (e.g. VM -> GRE).

If we go for a probabilistic reduction of EMC lookups we'd statistically have a
balanced improvement in all (known and unknown) scenarios.

BR, Jan
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/4] tc: Add header rewrite using tc pedit action

2017-08-17 Thread Paul Blakey



On 15/08/2017 20:04, Joe Stringer wrote:

On 15 August 2017 at 01:19, Paul Blakey  wrote:



On 15/08/2017 00:56, Joe Stringer wrote:


On 8 August 2017 at 07:21, Roi Dayan  wrote:


From: Paul Blakey 







@@ -1062,7 +1422,7 @@ nl_msg_put_flower_options(struct ofpbuf *request,
struct tc_flower *flower)
   nl_msg_put_flower_tunnel(request, flower);
   }

-nl_msg_put_flower_acts(request, flower);
+return 0;
   }



The above snippet seems like it could be logically separate and kept
as a different patch, as it shouldn't affect behaviour at all but it
does actually introduce additional error checking. This could assist
git bisect if necessary.



We'll do that, thanks.


OK.



Looking at this again, we only changed the return from void to int 
because of this change which can fail, so we can't separate this without 
a commit that just always returns 0.



Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for recirc packets

2017-08-17 Thread Jan Scheurich
    The RSS hash threshold method looks like the only pseudo-random criterion
that we can use that produces a consistent result for every packet of a flow and
does not require more information. Of course elephant flows with an unlucky hash
value might never get to use the EMC, but that risk exists with any stateless
selection scheme.

[Darrell] It is probably something I know by another name, but JTBC, can you 
define the “RSS hash threshold method” ?

I am referring to Billy's proposal 
(https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/336509.html)

In essence it suggests selecting packets for EMC lookup only when their RSS hash
is above a certain threshold. The lookup probability is determined by the
threshold (e.g. a threshold of 0.75 * UINT32_MAX corresponds to 25%). It is
pseudo-random in that, assuming the hash result is uniformly distributed, all
flows profit from EMC lookup with the same probability.


    The new thing required will be the dynamic adjustment of lookup probability 
to the EMC fill level and/or hit ratio.

[Darrell] Did you mean insertion probability rather than lookup probability ?

No, I actually meant dynamic adaptation of lookup probability. We don't want to 
reduce the EMC lookup probability when the EMC is not yet overloaded, but only 
when the EMC hit rate degrades due to collisions. When we devise an algorithm 
to adapt lookup probability, we can study if it could make sense to also 
dynamically adjust the currently fixed (configurable) EMC insertion probability 
based on EMC fill level and/or hit rate.
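One conceivable shape for the adaptation step discussed here, purely illustrative: all names, thresholds, and step sizes below are invented, and a real algorithm would need hysteresis and tuning.

```c
#include <stdint.h>

/* Hypothetical periodic adjustment of the EMC lookup probability (in
 * percent): back off when the EMC is nearly full yet the hit rate is
 * poor (a sign of thrashing), and otherwise creep back up to probe for
 * changed traffic patterns, as the text suggests. */
static uint32_t
adapt_lookup_prob(uint32_t prob_pct, uint32_t fill_pct, uint32_t hit_pct)
{
    if (fill_pct > 90 && hit_pct < 50) {
        /* Overloaded and thrashing: reduce lookups, but keep a floor. */
        return prob_pct > 20 ? prob_pct - 10 : prob_pct;
    }
    /* Healthy (or recovering): probe upward toward full lookup. */
    return prob_pct + 5 > 100 ? 100 : prob_pct + 5;
}
```

The same hook could later drive the (currently fixed, configurable) insertion probability, as the text proposes.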

BR, Jan


Re: [ovs-dev] [dpdk-dev] [ovs-dpdk-tests] where is ovs-dpdk test case?

2017-08-17 Thread Kavanagh, Mark B

>From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Sam
>Sent: Thursday, August 17, 2017 7:31 AM
>To: ovs-disc...@openvswitch.org; ovs-dev@openvswitch.org; d...@dpdk.org

Hi Sam,

Just a heads-up that d...@dpdk.org is strictly for DPDK development threads - 
I've removed it from this thread.

Also, responses to your queries are inline.

Thanks!
Mark 


>Subject: [dpdk-dev] [ovs-dpdk-tests] where is ovs-dpdk test case?
>
>Hi all,
>
>I'm working with ovs-dpdk, I want to run ovs-dpdk test case. But I found
>there is no test case for 'netdev' type bridge and no test case for
>ovs-dpdk datapath(which is pmd_thread_main).

Do you mean unit tests (as in OvS autotests), or integration tests?

If the former, then unfortunately there are none at present within OvS (but 
feel free to add some!); if you'd like to run integration tests, you could take 
a look at VSPERF: https://etherpad.opnfv.org/p/vSwitchIntegrationTests.

>
>So my question is where could I find these test cases?
>
>Also I change some code in dpdk-vhost client mode, how could I find test
>case for this module?
>
>Thank you~


Re: [ovs-dev] [PATCH] nsh: rework NSH netlink keys and actions

2017-08-17 Thread Jan Scheurich
Hi Yi,

As discussed please adjust the netlink uAPI for NSH as follows:

OVS_KEY_ATTR_NSH
-- OVS_NSH_KEY_ATTR_BASE      mandatory
-- OVS_NSH_KEY_ATTR_MD1       conditional: if mdtype=MD1

OVS_ACTION_ATTR_PUSH_NSH
-- OVS_NSH_KEY_ATTR_BASE      mandatory
-- OVS_NSH_PUSH_ATTR_CONTEXT  conditional: currently if MD1 or
                              MD2 with TLV encap properties

OVS_ACTION_ATTR_POP_NSH       no data

Remove OVS_NSH_KEY_ATTR_MD2 because the MD2 context headers will not 
be modelled as dedicated key fields but mapped to some generic TLV fields, 
similar to but independent from the current tunnel metadata TLV registers.

Using the name OVS_NSH_KEY_ATTR_MD2 for the opaque context data in 
PUSH_NSH would be misleading as the variable length byte array can carry
any type of metadata format: MD1, MD2, or any future MD type. For that
reason I prefer the generic name OVS_NSH_PUSH_ATTR_CONTEXT. 

Note that this attribute will not be used as part of the NSH key, so you might 
consider generalizing the NSH attribute enum definition to 

enum ovs_nsh_attr {
OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
OVS_NSH_KEY_ATTR_MD1,   /* struct ovs_nsh_key_md1. */
OVS_NSH_PUSH_ATTR_CONTEXT,   /* opaque context headers */
__OVS_NSH_ATTR_MAX
};

BR, Jan


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Yi Yang
> Sent: Wednesday, 16 August, 2017 09:56
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH] nsh: rework NSH netlink keys and actions
> 
> Per kernel data path requirements, this patch changes
> OVS_KEY_ATTR_NSH
> to nested attribute and adds three new NSH sub attribute keys:
> 
> OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
> OVS_NSH_KEY_ATTR_MD1:  for length-fixed MD type 1 context
> OVS_NSH_KEY_ATTR_MD2:  for length-variable MD type 2 metadata
> 
> NSH match fields, set and PUSH_NSH action all use the below
> nested attribute format:
> 
> OVS_KEY_ATTR_NSH begin
> OVS_NSH_KEY_ATTR_BASE
> OVS_NSH_KEY_ATTR_MD1
> OVS_KEY_ATTR_NSH end
> 
> or
> 
> OVS_KEY_ATTR_NSH begin
> OVS_NSH_KEY_ATTR_BASE
> OVS_NSH_KEY_ATTR_MD2
> OVS_KEY_ATTR_NSH end
> 
> In addition, NSH encap and decap actions are renamed as push_nsh
> and pop_nsh to meet action naming convention.
> 
> Signed-off-by: Yi Yang 
> ---
>  datapath/linux/compat/include/linux/openvswitch.h |  57 +-
>  include/openvswitch/nsh.h |  32 +-
>  include/openvswitch/packets.h |  11 +-
>  lib/dpif-netdev.c |   4 +-
>  lib/dpif.c|   4 +-
>  lib/flow.c|  15 +-
>  lib/match.c   |  12 +-
>  lib/meta-flow.c   |  13 +-
>  lib/nx-match.c|   4 +-
>  lib/odp-execute.c |  76 ++-
>  lib/odp-util.c| 713 
> ++
>  lib/odp-util.h|   4 +
>  lib/packets.c |  24 +-
>  lib/packets.h |   6 +-
>  ofproto/ofproto-dpif-ipfix.c  |   4 +-
>  ofproto/ofproto-dpif-sflow.c  |   4 +-
>  ofproto/ofproto-dpif-xlate.c  |  24 +-
>  tests/nsh.at  |  28 +-
>  18 files changed, 773 insertions(+), 262 deletions(-)
> 
> diff --git a/datapath/linux/compat/include/linux/openvswitch.h
> b/datapath/linux/compat/include/linux/openvswitch.h
> index bc6c94b..d7f9029 100644
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -369,7 +369,7 @@ enum ovs_key_attr {
>  #ifndef __KERNEL__
>   /* Only used within userspace data path. */
>   OVS_KEY_ATTR_PACKET_TYPE,  /* be32 packet type */
> - OVS_KEY_ATTR_NSH,  /* struct ovs_key_nsh */
> + OVS_KEY_ATTR_NSH,  /* Nested set of ovs_nsh_key_* */
>  #endif
> 
>   __OVS_KEY_ATTR_MAX
> @@ -492,13 +492,27 @@ struct ovs_key_ct_labels {
>   };
>  };
> 
> -struct ovs_key_nsh {
> -__u8 flags;
> -__u8 mdtype;
> -__u8 np;
> -__u8 pad;
> -__be32 path_hdr;
> -__be32 c[4];
> +enum ovs_nsh_key_attr {
> + OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
> + OVS_NSH_KEY_ATTR_MD1,   /* struct ovs_nsh_key_md1. */
> + OVS_NSH_KEY_ATTR_MD2,   /* variable-length octets. */
> + __OVS_NSH_KEY_ATTR_MAX
> +};
> +
> +#define OVS_NSH_KEY_ATTR_MAX (__OVS_NSH_KEY_ATTR_MAX - 1)
> +
> +struct ovs_nsh_key_base {
> + __u8 flags;
> + __u8 mdtype;
> + __u8 np;
> + __u8 pad;
> + __be32 path_hdr;
> +};
> +
> +#define NSH_MD1_CONTEXT_SIZE 4
> +
> 

Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for recirc packets

2017-08-17 Thread Fischetti, Antonio
Thanks Jan for your feedback and the interesting usecases described. 
Please find below some questions/comments I added inline.

Regards,
-Antonio


> -Original Message-
> From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> Sent: Wednesday, August 16, 2017 5:24 PM
> To: Fischetti, Antonio ; Darrell Ball
> ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert for
> recirc packets
> 
> Hi,
> 
> I agree that in the event of EMC overload it is beneficial to reduce the 
> number
> of EMC insertions and lookups as they just generate overhead and degrade
> overall throughput. At the same time we want to keep as much of the EMC
> acceleration as possible for a fraction of traffic that can benefit from EMC
> most.

[Antonio] 
Perfectly agree, the goal should be to reserve the emc acceleration to a 
'fraction'
of the traffic.


> 
> For EMC insertion we have already done this earlier by introducing
> probabilistic EMC insertion, which greatly reduces the costly effect of EMC
> thrashing. But we didn't touch the lookup part. How should we select the
> packets (or rather packet datapath traversals) for which to perform lookup?
> 
> There are several proposals in the air: Only do it for the first pass, not for
> recirculated packets, only do it for RSS hash values below a (dynamic)
> threshold, possibly others.
> 
> For EMC insertion we consciously settled on a random selection as the datapath
> has no a priori insight into which flows are better candidates than others and
> big flows that benefit most have a higher chance of getting cached.
> 
> Is there a reason to assume that a deterministic selection on some non-random
> criteria like the recirculation count will on average (over deployments and
> applications) give a better performance than a random selection?

[Antonio]
If we consider latency and jitter a deterministic solution should be 
more preferable than a solution which behaves differently depending 
on the particular values of the packet fields, eg the IP addresses.


> 
> I don't believe so. For example, the number of "EMC flows" in each pass 
> through
> the datapath can differ hugely: 1 GRE tunnel flow in first pass (from phy
> port), 100K tenant flows after tunnel decapsulation. Or 100K tenant flows in
> first pass (from VM) but 1 flow after NSH encapsulation in second pass.

[Antonio]
Maybe I'm wrong but shouldn't the different flows encapped in a GRE 
tunnel hit the EMC in different locations? Because even if they all have the 
same outer IP addresses, they differ in the L4 ports so the 5-tuple hash
- and the emc locations - should vary. Same thing for NSH encapsulation?


> 
> I believe a random selection with dynamically adapted probability is the best
> we can do without a priori knowledge about the traffic patterns and pipeline
> organization.

[Antonio]
This proposal is orthogonal to other approaches that look at the usage
of the single locations, eg policies not to overwrite active locations or to 
reduce in general the emc usage. 
I think we should consider both the two strategies to tackle two different 
aspects of the thrashing and use emc more efficiently:
 1. skip emc lookup/insert for recirc packets (which is only activated when 
   the number of EMC entries exceeds EMC_RECIRCT_NO_INSERT_THRESHOLD);
 2. any other strategy that limits emc usage or offers a better entries 
eviction.

So - being agnostic of what's the traffic type - if we have 100k flows 
that could potentially be recirculated:
 1. allows to tackle the thrashing due to recirculation, which is activated
when the number of EMC entries exceeds a threshold. 
 2. allows to limit the emc usage to fewer flows because we don't want 
100k flows to hit emc.

> 
> The RSS hash threshold method looks like the only pseudo-random criterion that
> we can use that produces a consistent result for every packet of a flow and
> does not require more information. Of course elephant flows with an unlucky
> hash value might never get to use the EMC, but that risk exists with any
> stateless selection scheme.
> 
> The new thing required will be the dynamic adjustment of lookup probability to
> the EMC fill level and/or hit ratio. Any ideas for that? I guess we'd need a
> scheme that periodically increases the probability again to probe for changed
> traffic patterns.
> 
> Once we have that I think the same dynamic probability could be possible to 
> use
> also for probabilistic EMC insertion.
> 
> BR, Jan
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Fischetti, Antonio
> > Sent: Wednesday, 16 August, 2017 14:42
> > To: Darrell Ball ; d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Skip EMC lookup/insert
> > for recirc packets
> >
> >
> > > -Original Message-
> > > From: Darrell Ball [mailto:db...@vmware.com]
> > > Sent: 

Re: [ovs-dev] [PATCH 3/4] tc: Add header rewrite using tc pedit action

2017-08-17 Thread Paul Blakey



On 14/08/2017 16:00, Simon Horman wrote:

On Tue, Aug 08, 2017 at 05:21:53PM +0300, Roi Dayan wrote:

From: Paul Blakey 

To be later used to implement ovs action set offloading.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
---
  lib/tc.c | 372 ++-
  lib/tc.h |  12 +++
  2 files changed, 381 insertions(+), 3 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c


...


@@ -589,6 +758,10 @@ nl_parse_single_action(struct nlattr *action, struct 
tc_flower *flower)
  nl_parse_act_vlan(act_options, flower);
  } else if (!strcmp(act_kind, "tunnel_key")) {
  nl_parse_act_tunnel_key(act_options, flower);
+} else if (!strcmp(act_kind, "pedit")) {
+nl_parse_act_pedit(act_options, flower);
+} else if (!strcmp(act_kind, "csum")) {
+/* not doing anything for now */
  } else {
  VLOG_ERR_RL(&error_rl, "unknown tc action kind: %s", act_kind);
  return EINVAL;
@@ -790,6 +963,48 @@ tc_get_tc_cls_policy(enum tc_offload_policy policy)
  }
  
  static void

+nl_msg_put_act_csum(struct ofpbuf *request, uint32_t flags)
+{
+size_t offset;
+
+nl_msg_put_string(request, TCA_ACT_KIND, "csum");
+offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS);
+{
+struct tc_csum parm = { .action = TC_ACT_PIPE,
+.update_flags = flags };


I think it would be a bit nicer to move param to the top of the function
and remove the extra { } scope it is currently inside.


+
+nl_msg_put_unspec(request, TCA_CSUM_PARMS, &parm, sizeof parm);
+}
+nl_msg_end_nested(request, offset);
+}
+
+static void
+nl_msg_put_act_pedit(struct ofpbuf *request, struct tc_pedit *parm,
+ struct tc_pedit_key_ex *ex)
+{
+size_t ksize = sizeof *parm + (parm->nkeys * sizeof(struct tc_pedit_key));
+size_t offset, offset_keys_ex, offset_key;
+int i;
+
+nl_msg_put_string(request, TCA_ACT_KIND, "pedit");
+offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS);
+{


The extra { } scope here seems unnecessary.


+parm->action = TC_ACT_PIPE;
+
+nl_msg_put_unspec(request, TCA_PEDIT_PARMS_EX, parm, ksize);
+offset_keys_ex = nl_msg_start_nested(request, TCA_PEDIT_KEYS_EX);
+for (i = 0; i < parm->nkeys; i++, ex++) {
+offset_key = nl_msg_start_nested(request, TCA_PEDIT_KEY_EX);
+nl_msg_put_u16(request, TCA_PEDIT_KEY_EX_HTYPE, ex->htype);
+nl_msg_put_u16(request, TCA_PEDIT_KEY_EX_CMD, ex->cmd);
+nl_msg_end_nested(request, offset_key);
+}
+nl_msg_end_nested(request, offset_keys_ex);
+}
+nl_msg_end_nested(request, offset);
+}
+
+static void
  nl_msg_put_act_push_vlan(struct ofpbuf *request, uint16_t vid, uint8_t prio)
  {
  size_t offset;


...



Hi and thanks for the review,

The above style corresponds with the rest of the file (scope and
nl_msg_put_act_* funcs).
Maybe we could do a style patch that removes all these extra scopes in a later
patch?
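For reference, the hoisted-`parm` form Mark suggests for `nl_msg_put_act_csum()` might look like the sketch below. The `nl_msg_*` helpers and the kernel structs are replaced by minimal stand-ins (they only track lengths) so the snippet compiles on its own; the real definitions live in OVS's netlink helpers and the kernel's tc_csum uAPI.

```c
#include <stdint.h>
#include <string.h>

/* Minimal stand-ins for the OVS netlink helpers and kernel structs, just
 * enough to make the sketch self-contained. */
struct ofpbuf { char data[256]; size_t len; };
struct tc_csum { int action; uint32_t update_flags; };
#define TC_ACT_PIPE 3
enum { TCA_ACT_KIND = 1, TCA_ACT_OPTIONS = 2, TCA_CSUM_PARMS = 1 };

static void nl_msg_put_string(struct ofpbuf *b, int t, const char *s)
    { (void) t; b->len += strlen(s); }
static size_t nl_msg_start_nested(struct ofpbuf *b, int t)
    { (void) t; return b->len; }
static void nl_msg_end_nested(struct ofpbuf *b, size_t off)
    { (void) b; (void) off; }
static void nl_msg_put_unspec(struct ofpbuf *b, int t, const void *p, size_t n)
    { (void) t; memcpy(b->data + b->len, p, n); b->len += n; }

/* The restructure suggested in review: 'parm' hoisted to the top of the
 * function and the extra { } scope removed. Behavior is unchanged. */
static void
nl_msg_put_act_csum(struct ofpbuf *request, uint32_t flags)
{
    struct tc_csum parm = { .action = TC_ACT_PIPE, .update_flags = flags };
    size_t offset;

    nl_msg_put_string(request, TCA_ACT_KIND, "csum");
    offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS);
    nl_msg_put_unspec(request, TCA_CSUM_PARMS, &parm, sizeof parm);
    nl_msg_end_nested(request, offset);
}
```

Whether to apply it here or in a later file-wide style patch is the open question in the thread.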



Re: [ovs-dev] [PATCH 3/4] tc: Add header rewrite using tc pedit action

2017-08-17 Thread Paul Blakey



On 15/08/2017 20:04, Joe Stringer wrote:

On 15 August 2017 at 01:19, Paul Blakey  wrote:



On 15/08/2017 00:56, Joe Stringer wrote:


On 8 August 2017 at 07:21, Roi Dayan  wrote:


From: Paul Blakey 

@@ -346,6 +425,96 @@ nl_parse_flower_ip(struct nlattr **attrs, struct
tc_flower *flower) {
   }
   }

+static const struct nl_policy pedit_policy[] = {
+[TCA_PEDIT_PARMS_EX] = { .type = NL_A_UNSPEC,
+ .min_len = sizeof(struct tc_pedit),
+ .optional = false, },
+[TCA_PEDIT_KEYS_EX]   = { .type = NL_A_NESTED,
+  .optional = false, },
+};
+
+static int
+nl_parse_act_pedit(struct nlattr *options, struct tc_flower *flower)
+{
+struct nlattr *pe_attrs[ARRAY_SIZE(pedit_policy)];
+const struct tc_pedit *pe;
+const struct tc_pedit_key *keys;
+const struct nlattr *nla, *keys_ex, *ex_type;
+const void *keys_attr;
+char *rewrite_key = (void *) &flower->rewrite.key;
+char *rewrite_mask = (void *) &flower->rewrite.mask;



These are actually 'struct tc_flower_key', shouldn't we use the actual
types? (I realise there is pointer arithmetic below, but usually we
restrict the usage of void casting and pointer arithmetic as much as
possible).



Yes, it's for the pointer arithmetic. The (void *) cast was for clang's
warning "cast increases required alignment of target type", which we
couldn't find another way to suppress.
I don't think alignment is an issue here.


Ah. I don't have particularly much experience with pointer alignment.




+size_t keys_ex_size, left;
+int type, i = 0;
+
+if (!nl_parse_nested(options, pedit_policy, pe_attrs,
+ ARRAY_SIZE(pedit_policy))) {
+VLOG_ERR_RL(&error_rl, "failed to parse pedit action options");
+return EPROTO;
+}
+
+pe = nl_attr_get_unspec(pe_attrs[TCA_PEDIT_PARMS_EX], sizeof *pe);
+keys = pe->keys;
+keys_attr = pe_attrs[TCA_PEDIT_KEYS_EX];
+keys_ex = nl_attr_get(keys_attr);
+keys_ex_size = nl_attr_get_size(keys_attr);
+
+NL_ATTR_FOR_EACH(nla, left, keys_ex, keys_ex_size) {
+if (i >= pe->nkeys) {
+break;
+}
+
+if (nl_attr_type(nla) == TCA_PEDIT_KEY_EX) {
+ex_type = nl_attr_find_nested(nla, TCA_PEDIT_KEY_EX_HTYPE);
+type = nl_attr_get_u16(ex_type);
+
+for (int j = 0; j < ARRAY_SIZE(flower_pedit_map); j++) {
+struct flower_key_to_pedit *m = &flower_pedit_map[j];
+int flower_off = j;
+int sz = m->size;
+int mf = m->offset;
+
+if (!sz || m->htype != type) {
+   continue;
+}



If flower_pedit_map were just a regular array and the offset were
included in 'struct flower_key_to_pedit', then I don't think we'd need
the above check?


+
+/* Check overlap between current pedit key, which is always
+ * 4 bytes (range [off, off + 3]), and a map entry in
+ * flower_pedit_map (range [mf, mf + sz - 1]). */
+if ((keys->off >= mf && keys->off < mf + sz)
+    || (keys->off + 3 >= mf && keys->off + 3 < mf + sz)) {
+int diff = flower_off + (keys->off - mf);
+uint32_t *dst = (void *) (rewrite_key + diff);
+uint32_t *dst_m = (void *) (rewrite_mask + diff);
+uint32_t mask = ~(keys->mask);
+uint32_t zero_bits;
+
+if (keys->off < mf) {
+zero_bits = 8 * (mf - keys->off);
+mask &= UINT32_MAX << zero_bits;
+} else if (keys->off + 4 > mf + m->size) {
+zero_bits = 8 * (keys->off + 4 - mf - m->size);
+mask &= UINT32_MAX >> zero_bits;
+}



I think this is all trying to avoid having a giant switch case here
which would handle all of the possible flower key attributes, similar
to somewhere else where this kind of iteration occurs.


Right that was the objective.

Is there any overlap between this code and calc_offsets()?


calc_offsets was to calculate the word offsets and masks for the first
and last word to write. Here it's the reverse, so it's already in word size
with correct masks, and easier to write back.


OK.



Could we somehow trigger an assert or a strong warning message if the
offsets of our flower_pedit_map and the flower key don't align, given
that there's a bunch of pointer arithmetic happening here?



Not sure what you mean here; we are aligned to words (4 bytes) and
"cut" the words if they overflow the target. We also padded the flower.rewrite
key and mask so we won't overflow into a flower member we don't mean to write.


What I was trying to figure out is if there is any additional
compile-time check we could have to ensure that we 

Re: [ovs-dev] [PATCH 4/4] netdev-tc-offloads: Add support for action set

2017-08-17 Thread Paul Blakey



On 15/08/2017 20:30, Joe Stringer wrote:

On 8 August 2017 at 07:21, Roi Dayan  wrote:

From: Paul Blakey 

Implement support for offloading ovs action set using
tc header rewrite action.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
---




Another couple of bits I noticed while responding..


@@ -454,10 +572,56 @@ netdev_tc_flow_dump_next(struct netdev_flow_dump *dump,
  }

  static int
+parse_put_flow_set_masked_action(struct tc_flower *flower,
+ const struct nlattr *set,
+ size_t set_len,
+ bool hasmask)
+{
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+const struct nlattr *set_attr;
+char *key = (char *) &flower->rewrite.key;
+char *mask = (char *) &flower->rewrite.mask;
+size_t set_left;
+int i, j;
+
+NL_ATTR_FOR_EACH_UNSAFE(set_attr, set_left, set, set_len) {
+int type = nl_attr_type(set_attr);
+size_t size = nl_attr_get_size(set_attr) / 2;
+char *set_data = CONST_CAST(char *, nl_attr_get(set_attr));
+char *set_mask = set_data + size;
+
+if (type >= ARRAY_SIZE(set_flower_map)) {
+VLOG_DBG_RL(&rl, "unsupported set action type: %d", type);
+return EOPNOTSUPP;


I think that this assumes that 'set_flower_map' has every entry
populated up until the maximum supported key field - but are some
missing - for example OVS_KEY_ATTR_VLAN. Could we also check by
indexing into set_flower_map and checking if it's a valid entry?


Good catch, thanks, will be fixed.




+}
+
+for (i = 0; i < ARRAY_SIZE(set_flower_map[type]); i++) {
+struct netlink_field *f = &set_flower_map[type][i];


Separate thought - you're iterating nlattrs above then iterating all
the key attributes in the flower_map here. Couldn't you just directly
index into the packet field that this action will modify?


+
+if (!f->size) {
+break;
+}


I think that headers that are not in the set_flower_map will hit this,
which would be unsupported (similar to above comment).


This is for null entries in the map. For example, OVS_KEY_ATTR_IPV6 has only
two entries in the second dimension (src and dst) but it can have up to 
3, so the third entry's size is zeroed to skip it.





+
+for (j = 0; j < f->size; j++) {
+char maskval = hasmask ? set_mask[f->offset + j] : 0xFF;
+
+key[f->flower_offset + j] = maskval & set_data[f->offset + j];
+mask[f->flower_offset + j] = maskval;
+}
+}
+}
+
+if (!is_all_zeros(&flower->rewrite, sizeof flower->rewrite)) {
+flower->rewrite.rewrite = true;
+}
+
+return 0;
+}
+
+static int
  parse_put_flow_set_action(struct tc_flower *flower, const struct nlattr *set,
size_t set_len)
  {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
  const struct nlattr *set_attr;
  size_t set_left;




Thanks for the review, guys. We'll send a new patch set.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 4/4] netdev-tc-offloads: Add support for action set

2017-08-17 Thread Paul Blakey



On 15/08/2017 20:24, Joe Stringer wrote:

On 15 August 2017 at 01:19, Paul Blakey  wrote:



On 15/08/2017 10:28, Paul Blakey wrote:




On 15/08/2017 04:04, Joe Stringer wrote:


On 8 August 2017 at 07:21, Roi Dayan  wrote:


From: Paul Blakey 

Implement support for offloading ovs action set using
tc header rewrite action.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
---






@@ -274,6 +346,48 @@ netdev_tc_flow_dump_destroy(struct netdev_flow_dump *dump)
   return 0;
   }

+static void
+parse_flower_rewrite_to_netlink_action(struct ofpbuf *buf,
+   struct tc_flower *flower)
+{
+char *mask = (char *) &flower->rewrite.mask;
+char *data = (char *) &flower->rewrite.key;
+
+for (int type = 0; type < ARRAY_SIZE(set_flower_map); type++) {
+char *put = NULL;
+size_t nested = 0;
+int len = ovs_flow_key_attr_lens[type].len;
+
+if (len <= 0) {
+continue;
+}
+
+for (int j = 0; j < ARRAY_SIZE(set_flower_map[type]); j++) {
+struct netlink_field *f = &set_flower_map[type][j];
+
+if (!f->size) {
+break;
+}



Seems like maybe there's similar feedback here to the previous patch
wrt sparsely populated array set_flower_map[].



I'll answer there.



Scratch that, I actually meant to write it here.

With this map we have fast (direct lookup) access on the put path
(parse_put_flow_set_masked_action), and slow access on the reverse dump
path (parse_flower_rewrite_to_netlink_action), which iterates over all
indices, although it should skip them fast if they are empty.

Changing this to a sparse map would slow both paths, but dumping would
be faster. We didn't write these maps for performance but for convenience of
adding new supported attributes.


OK, thanks for the explanation. I figured there must be a reason it
was indexed that way, I just didn't see it in my first review pass
through the series. Obviously fast path should be fast, and typically
we don't worry quite as much about dumping fast (although I have in
the past tested this to determine how many flows we will attempt to
populate in adverse conditions).


What we can do to both maps is:

1) Using ovsthread_once, we can create a fast (direct lookup) access map
for both paths - generating a reverse map at init time and using that
instead for dumping.


Until there's a clear, known advantage I'm not sure if we should do this.


2) Rewrite it in a non sparse way, use it for both.


I think that we generally are more interested in 'put' performance so
if this sacrifices it for 'dump' performance, then it's probably not a
good idea (though it may still be too early to really optimize these).


3) Rewrite it using a switch case; the logic for now is pretty much the same
for all attributes.

What do you think?


Do you think that this third option would improve the readability,
allow more compile-time checking on the code, or more code reuse?



Hi Joe, Thanks for the review,

Since it can be implemented by a map, as with this patch set, all the 
switch cases would have the same logic of translating from OVS action set 
attributes to flower members, so I think it's not much of an increase in 
readability unless you're looking at a single case.
And regarding compile-time checking, we still need to convert from 
flower key members to tc action pedit raw offsets/values, so we won't 
have that there.


I'd like to remove the tc flower struct in the future
and work directly on struct match and actions,
so then we will have only the set-action-to-pedit conversion (and one 
conversion for all the rest).


I think I'd rather keep the map until it no longer fits the task, or until 
we do the above, if that's OK with you.
















Re: [ovs-dev] Compliance of NSH implementation with latest NSH draft

2017-08-17 Thread Jan Scheurich
Hi Yi,
I would like to have agreement in the team (including Ben and Jiri) on how to 
proceed. I see the following two main options:

  1.  We do not touch the code in the OVS 2.8 branch, mark the NSH support in 2.8 
experimental, document that it complies with NSH drafts up to version 12 (Feb 
2017), and align it with the final RFC in the next release. We support 
NSH only with the userspace datapath and we only fix the remaining bugs in NSH.

On OVS master and Linux net-next we continue to finalize NSH support and align 
with the RFC when approved (including support for TTL and the dec_nsh_ttl action). 
This would give us some time to wait for the final RFC and sort out the open 
issues regarding the netlink uAPI.
  2.  We align NSH support in OVS 2.8 userspace with draft v19 without support 
for NSH TTL field and dec_nsh_ttl action. This will require some rework in 
nsh.h and the datapath code for parsing and setting/pushing NSH headers.
Support for NSH TTL field and dec_nsh_ttl action can be added at a later stage.

You continue to work on the NSH kernel datapath in net-next according to the latest 
draft/final RFC, in the hope that NSH in net-next upstream can be compatible 
with the uAPI released in 2.8. The risk is that it won't be, and then there would 
be no kernel datapath support for NSH in 2.8 anyhow.
I personally see a risk in adding even more late changes on the 2.8 release 
branch in a rush and tend to favor option 1, but I would very much like to hear 
Ben's and Jiri's opinion on this.
BR, Jan

From: Yang, Yi Y [mailto:yi.y.y...@intel.com]
Sent: Thursday, 17 August, 2017 02:17
To: Jan Scheurich ; Zoltán Balogh 

Cc: Ben Pfaff (b...@ovn.org) ; Jiri Benc (jb...@redhat.com) 
; 'ovs-dev@openvswitch.org' ; Eric 
Garver 
Subject: RE: Compliance of NSH implementation with latest NSH draft

Jan, I think we can adjust struct nsh_hdr to adapt to the new NSH draft. Do you 
have time to add an nsh_ttl match field and a corresponding dec_nsh_ttl action? 
You can use ovs_key_nsh_rework in gitlab; I have updated it with the new changes 
I did and it works as before.

From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
Sent: Thursday, August 17, 2017 12:22 AM
To: Yang, Yi Y; Zoltán Balogh
Cc: Ben Pfaff (b...@ovn.org); Jiri Benc (jb...@redhat.com);
'ovs-dev@openvswitch.org'; Eric Garver
Subject: Compliance of NSH implementation with latest NSH draft

I checked the IETF tracker for NSH draft version 19:
It seems like final IETF reviews are concluding (deadline Aug 25), so we might 
consider updating the OVS implementation in 2.8 to align with version 19. At 
minimum this would mean restricting the nsh_mdtype field to 4 bits and possibly 
adjusting the nsh_flags field to the changed bits in the base header.
I believe we could add the nsh_ttl match field and a corresponding 
dec_nsh_ttl action at a later stage.
What do you think?
BR, Jan

From: Jan Scheurich
Sent: Wednesday, 16 August, 2017 16:05
To: Yang, Yi Y; Zoltán Balogh
Cc: Ben Pfaff (b...@ovn.org); Jiri Benc (jb...@redhat.com);
ovs-dev@openvswitch.org
Subject: RE: Yang, Yi Y sent you a message in Skype for Business-FYI: basically 
I have worked out a ovs nsh version per their requirements

Hi Yi,
It is not correct that our implementation does not support MD2. OVS 2.8 will be 
able to receive and forward packets with NSH MD1 and MD2 headers (SFF use 
case). It should also be able to push an NSH header in MD1 or MD2 format, with MD2 
TLVs specified as encap properties (classifier use case), and pop an NSH header 
in MD1 or MD2 format (last SFF use case).
The only limitation OVS 2.8 has is that it cannot match on or modify MD2 TLV 
metadata. This will require the introduction of generic TLV match fields to be 
mapped to specific TLV protocol headers, which we have postponed to a future 
release.
Regarding the Netlink representation of NSH key and PUSH_NSH action:
For the NSH key we agreed with Jiri and Ben that the nested OVS_KEY_ATTR_NSH 
must have separate members  OVS_NSH_KEY_ATTR_BASE and OVS_NSH_KEY_ATTR_MD1, as 
presence of MD1 is optional. A specific MD1 attribute is required to carry the 
four fixed context headers as key fields.
We do have a problem with the NSH_KEY_BASE as the base header remains a moving 
target in the NSH draft. The format has 

Re: [ovs-dev] [PATCH 1/2] tests/system-offloads-traffic.at: add sanity check

2017-08-17 Thread Roi Dayan



On 17/08/2017 08:32, Roi Dayan wrote:



On 17/08/2017 01:17, Joe Stringer wrote:

On 16 August 2017 at 05:14, Roi Dayan  wrote:

Doing dump-flows also alters the netdev ports list.
So doing it before the actual test adds a check to
make sure we don't break that list.

Signed-off-by: Roi Dayan 
Reviewed-by: Paul Blakey 


I'm actually not sure what the requirements are to run these offload
tests. I tried running them on a 4.4 kernel, and the first test passed
while the second failed; I assume that this is because 4.4's TC flower
support is not new enough.

Then I tried with a 4.12 kernel and neither test passed, and I have
extra flows being reported in the dump-flows output.

I believe that I understand what this patch is trying to achieve, but
I don't know how I'm supposed to validate it.





In the past you had an issue where cls_flower was not configured
in your kernel. Could it be the same issue now?



Strange. There are no special requirements, and this test should work
with kernel 4.10 and up, I think.
Without the fix the first passes and the second fails.
With the fix both the first and second pass.

I currently test using kernel 4.13-rc2 and this is my output:


before the fix:


# make check-offloads
..
datapath offloads

  1: offloads - ping between two ports - offloads disabled ok
  2: offloads - ping between two ports - offloads enabled FAILED
(system-offloads-traffic.at:55)



after the fix:


# make check-offloads
..
datapath offloads

  1: offloads - ping between two ports - offloads disabled ok
  2: offloads - ping between two ports - offloads enabled ok




[ovs-dev] [ovs-dpdk-tests] where is ovs-dpdk test case?

2017-08-17 Thread Sam
Hi all,

I'm working with ovs-dpdk and I want to run the ovs-dpdk test cases. But I found
there is no test case for the 'netdev' type bridge and no test case for the
ovs-dpdk datapath (which is pmd_thread_main).

So my question is: where could I find these test cases?

Also, I changed some code in dpdk-vhost client mode; how could I find the test
case for this module?

Thank you~


[ovs-dev] [PATCH V3 0/2] Fix rules not offloaded after executing dpctl command

2017-08-17 Thread Roi Dayan
Hi,

The first patch is a small update to the test to catch this error early.
The second patch is the actual fix: not cleaning the netdev_ports map early.

V2->V3
- Fix cleaning netdev ports before dpif_uninit()

V1->V2
- Change order of commits for fix to be first
- Small refactor to function declaration

Thanks,
Roi


Roi Dayan (2):
  dpif: Fix cleanup of netdev_ports map
  tests/system-offloads-traffic.at: add sanity check

 lib/dpif.c   | 20 ++--
 tests/system-offloads-traffic.at |  1 +
 2 files changed, 15 insertions(+), 6 deletions(-)

-- 
2.8.0



[ovs-dev] [PATCH V3 1/2] dpif: Fix cleanup of netdev_ports map

2017-08-17 Thread Roi Dayan
Executing dpctl commands from userspace also calls
dpif_open()/dpif_close(), but does not really create another dpif;
it uses a clone.
As the netdev_ports map is global, we avoid adding duplicate entries
but also need to make sure we are not removing needed entries.
With this commit we make sure only the last dpif_close() cleans
the netdev_ports map.

Fixes: 6595cb95a4a9 ("dpif: Clean up netdev_ports map on dpif_close().")
Signed-off-by: Roi Dayan 
Reviewed-by: Paul Blakey 
---
 lib/dpif.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/lib/dpif.c b/lib/dpif.c
index 4c5eac6..79b2e6c 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -428,6 +428,18 @@ dpif_create_and_open(const char *name, const char *type, struct dpif **dpifp)
 return error;
 }
 
+static void
+dpif_remove_netdev_ports(struct dpif *dpif) {
+    struct dpif_port_dump port_dump;
+    struct dpif_port dpif_port;
+
+    DPIF_PORT_FOR_EACH (&dpif_port, &port_dump, dpif) {
+        if (!dpif_is_internal_port(dpif_port.type)) {
+            netdev_ports_remove(dpif_port.port_no, dpif->dpif_class);
+        }
+    }
+}
+
 /* Closes and frees the connection to 'dpif'.  Does not destroy the datapath
  * itself; call dpif_delete() first, instead, if that is desirable. */
 void
@@ -435,15 +447,11 @@ dpif_close(struct dpif *dpif)
 {
     if (dpif) {
         struct registered_dpif_class *rc;
-        struct dpif_port_dump port_dump;
-        struct dpif_port dpif_port;
 
         rc = shash_find_data(&dpif_classes, dpif->dpif_class->type);
 
-        DPIF_PORT_FOR_EACH (&dpif_port, &port_dump, dpif) {
-            if (!dpif_is_internal_port(dpif_port.type)) {
-                netdev_ports_remove(dpif_port.port_no, dpif->dpif_class);
-            }
+        if (rc->refcount == 1) {
+            dpif_remove_netdev_ports(dpif);
         }
         dpif_uninit(dpif, true);
         dp_class_unref(rc);
-- 
2.8.0
