Re: [ovs-dev] [PATCH net-next v4] net: openvswitch: add hash info to upcall

2019-11-13 Thread Pravin Shelar
On Wed, Nov 13, 2019 at 7:05 AM  wrote:
>
> From: Tonghao Zhang 
>
> When using the kernel datapath, the upcall don't
> include skb hash info relatived. That will introduce
> some problem, because the hash of skb is important
> in kernel stack. For example, VXLAN module uses
> it to select UDP src port. The tx queue selection
> may also use the hash in stack.
>
> Hash is computed in different ways. Hash is random
> for a TCP socket, and hash may be computed in hardware,
> or software stack. Recalculation hash is not easy.
>
> Hash of TCP socket is computed:
> tcp_v4_connect
> -> sk_set_txhash (is random)
>
> __tcp_transmit_skb
> -> skb_set_hash_from_sk
>
> There will be one upcall, without information of skb
> hash, to ovs-vswitchd, for the first packet of a TCP
> session. The rest packets will be processed in Open vSwitch
> modules, hash kept. If this tcp session is forward to
> VXLAN module, then the UDP src port of first tcp packet
> is different from rest packets.
>
> TCP packets may come from the host or dockers, to Open vSwitch.
> To fix it, we store the hash info to upcall, and restore hash
> when packets sent back.
>
> +---+  +-+
> |   Docker/VMs  |  | ovs-vswitchd|
> ++--+  +-++--+
>  |   ^|
>  |   ||
>  |   |  upcallv restore packet hash (not 
> recalculate)
>  | +-++--+
>  |  tap netdev | |   vxlan module
>  +---> +-->  Open vSwitch ko +-->
>or internal type| |
>+-+
>
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
> Signed-off-by: Tonghao Zhang 

Acked-by: Pravin B Shelar 

Thanks.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Formación en Comercio Exterior

2019-11-13 Thread Cierre de Inscripciones
Buen día
 
El cupo del curso está por llenarse y quise aprovechar la oportunidad de 
hacerte una última invitación:
 
•Nombre: Formación en Comercio Exterior 
•¿Cuándo?: Viernes 22 de Noviembre   
•Formato: En línea con interacción en vivo.
•Lugar: En Vivo desde su computadora

Proporcionaremos las herramientas más importantes y de vanguardia que se están 
utilizando para medir la satisfacción de los clientes y que nos ayudarán a 
centrar la
estrategia de nuestra organización en las necesidades de nuestro cliente.
 
Si el tema del curso no se apega a tus necesidades podemos ayudarte con cursos 
privados personalizados en línea o presenciales. 
Solicita informes sin compromiso.
 
¡Esperamos te animes a acompañarnos! En este curso participan empresas de todo 
el país y nuestro instructor es un experto en el tema que puede atender 
directamente las dudas y comentarios de tu equipo de trabajo.

 
Solicita información respondiendo a este correo con la palabra Comercio, junto 
con los siguientes datos:

Nombre:
Correo electrónico:
Número telefónico:


Números de Atención: (045) 55 15 54 66 30 - (045) 55 85 56 72 93 - (045) 55 30 
16 70 85  


Qué tengas un gran día.
Saludos.


Si desea dejar de recibir nuestra promoción favor de responder con la palabra 
baja o enviar un correo a bajas @innovalearn.net


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 ovn 0/3] Improve ovn-detrace support for parsing OpenFlow cookies.

2019-11-13 Thread Han Zhou
On Wed, Nov 13, 2019 at 1:51 AM Dumitru Ceara  wrote:
>
> Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow
translations.")
> added support for more types of cookies (partial SB uuids) that are stored
> in OpenFlow entries created by ovn-controller.
>
> The last patch in this series implements support for parsing all these
> different cookies in ovn-detrace too.
>
> The first patch is a bug fix required as ovn-detrace was broken after
> moving OVN to its own separate rundir.
>
> The second patch fixes line parsing in ovn-detrace.
>
> CC: Han Zhou 
> Signed-off-by: Dumitru Ceara 
>
> Dumitru Ceara (3):
>   ovn-detrace: Fix rundir.
>   ovn-detrace: Fix line parsing.
>   ovn-detrace: Add support for other types of SB cookies.
>
>
>  utilities/ovn-detrace.in |  201
--
>  1 file changed, 138 insertions(+), 63 deletions(-)
>
>
> ---
> v4:
> - Address Han's comments:
>   - Fix printing of last logical flow information.
> - Added Acked-by from Mark on patches 1 & 3. I didn't add it to patch 2
>   because I significantly changed patch 2 in v4.
> v3:
> - Remove stray "%s".
> - Rename pprint to print_p to avoid name clashes with the library
function.
> v2:
> - Address Mark's comments:
>   - properly handle potential collisions between cookie -> UUID mappings.
>   - when looking up SB records by cookie, match on the first part of the
> UUID.
> - Further refactor ovn-detrace to simplify prefixing outputs.
> - Change print statements to print() calls.
>

Thanks Dumitru. I applied the series to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 ovn 2/2] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-13 Thread Han Zhou
On Wed, Nov 13, 2019 at 8:14 AM Dumitru Ceara  wrote:
>
> On Wed, Nov 13, 2019 at 4:30 PM Han Zhou  wrote:
> >
> >
> >
> > On Wed, Nov 13, 2019 at 2:42 AM Dumitru Ceara  wrote:
> >>
> >> On Tue, Nov 12, 2019 at 8:50 PM Han Zhou  wrote:
> >> >
> >> >
> >> >
> >> >
> >> > On Tue, Nov 12, 2019 at 10:10 AM Dumitru Ceara 
wrote:
> >> > >
> >> > > On Tue, Nov 12, 2019 at 6:17 PM Han Zhou  wrote:
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Tue, Nov 12, 2019 at 2:29 AM Dumitru Ceara 
wrote:
> >> > > > >
> >> > > > > ARP request and ND NS packets for router owned IPs were being
> >> > > > > flooded in the complete L2 domain (using the MC_FLOOD
multicast group).
> >> > > > > However this creates a scaling issue in scenarios where
aggregation
> >> > > > > logical switches are connected to more logical routers (~350).
The
> >> > > > > logical pipelines of all routers would have to be executed
before the
> >> > > > > packet is finally replied to by a single router, the owner of
the IP
> >> > > > > address.
> >> > > > >
> >> > > > > This commit limits the broadcast domain by bypassing the L2
Lookup stage
> >> > > > > for ARP requests that will be replied by a single router. The
packets
> >> > > > > are forwarded only to the router port that owns the target IP
address.
> >> > > > >
> >> > > > > IPs that are owned by the routers and for which this fix
applies are:
> >> > > > > - IP addresses configured on the router ports.
> >> > > > > - VIPs.
> >> > > > > - NAT IPs.
> >> > > > >
> >> > > > > Reported-at: https://bugzilla.redhat.com/1756945
> >> > > > > Reported-by: Anil Venkata 
> >> > > > > Signed-off-by: Dumitru Ceara 
> >> > > > >
> >> > > > > ---
> >> > > > > v7:
> >> > > > > - Address Han's comments:
> >> > > > > - Remove flooding for all ARPs received on VLAN networks.
To avoid
> >> > > > >   that we now identify self originated (G)ARPs by matching
on source
> >> > > > >   MAC address too.
> >> > > > > - Rename REGBIT_NOT_VXLAN to FLAGBIT_NOT_VXLAN.
> >> > > > > - Fix ovn-sb manpage.
> >> > > > > - Split patch in a series of 2:
> >> > > > > - patch1: fixes the get_router_load_balancer_ips()
function.
> >> > > > > - patch2: limits the ARP/ND broadcast domain.
> >> > > > > v6:
> >> > > > > - Address Han's comments:
> >> > > > > - remove flooding of ARPs targeting OVN owned IP addresses.
> >> > > > > - update ovn-architecture documentation.
> >> > > > > - rename ARP handling functions.
> >> > > > > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest
to take into
> >> > > > > account the new way of forwarding ARPs.
> >> > > > > - Also, properly deal with ARP packets on VLAN-backed networks.
> >> > > > > v5: Address Numan's comments: update comments & make autotest
more
> >> > > > > robust.
> >> > > > > v4: Rebase.
> >> > > > > v3: Properly deal with VXLAN traffic. Address review comments
from
> >> > > > > Numan (add autotests). Fix function
get_router_load_balancer_ips.
> >> > > > > Rebase -> deal with IPv6 NAT too.
> >> > > > > v2: Move ARP broadcast domain limiting to table
S_SWITCH_IN_L2_LKUP to
> >> > > > > address localnet ports too.
> >> > > > > ---
> >> > > > >  northd/ovn-northd.8.xml |   14 ++
> >> > > > >  northd/ovn-northd.c |  230
+++
> >> > > > >  ovn-architecture.7.xml  |   19 +++
> >> > > > >  tests/ovn.at|  307
+--
> >> > > > >  4 files changed, 530 insertions(+), 40 deletions(-)
> >> > > > >
> >> > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> >> > > > > index 0a33dcd..344cc0d 100644
> >> > > > > --- a/northd/ovn-northd.8.xml
> >> > > > > +++ b/northd/ovn-northd.8.xml
> >> > > > > @@ -1005,6 +1005,20 @@ output;
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > +Priority-80 flows for each port connected to a
logical router
> >> > > > > +matching self originated GARP/ARP request/ND packets.
These packets
> >> > > > > +are flooded to the MC_FLOOD which
contains all logical
> >> > > > > +ports.
> >> > > > > +  
> >> > > > > +
> >> > > > > +  
> >> > > > > +Priority-75 flows for each IP address/VIP/NAT address
owned by a
> >> > > > > +router port connected to the switch. These flows
match ARP requests
> >> > > > > +and ND packets for the specific IP addresses.
Matched packets are
> >> > > > > +forwarded only to the router that owns the IP address.
> >> > > > > +  
> >> > > > > +
> >> > > > > +  
> >> > > > >  A priority-70 flow that outputs all packets with an
Ethernet broadcast
> >> > > > >  or multicast eth.dst to the
MC_FLOOD
> >> > > > >  multicast group.
> >> > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> >> > > > > index 32f3200..d6beb97 100644
> >> > > > > --- a/northd/ovn-northd.c
> >> > > > > +++ b/northd/ovn-northd.c
> >> > > > > @@ -210,6 +210,8 @@ enum ovn_stage {
> >> > > > >  #d

Re: [ovs-dev] [PATCH ovn v2 00/13] OVN Interconnection

2019-11-13 Thread Numan Siddique
On Thu, Oct 31, 2019 at 2:43 AM Han Zhou  wrote:
>
> The series supports interconnecting multiple OVN deployments (e.g.  located at
> multiple data centers) through logical routers connected with tansit logical
> switches with overlay tunnels, managed through OVN control plane.  See the
> ovn-architecture.rst document updates for more details, and find the
> instructions in Documentation/tutorials/ovn-interconnection.rst.
>
> Han Zhou (13):
>   ovn-architecture: Add documentation for OVN interconnection feature.
>   ovn-inb: Interconnection northbound DB schema and CLI.
>   ovn-isb: Interconnection southbound DB schema and CLI.
>   ovn-ic: Interconnection controller with AZ registeration.
>   ovn-northd.c: Refactor allocate_tnlid.
>   ovn-ic: Transit switch controller.
>   ovn-sb: Add columns is_interconn and is_remote to Chassis.
>   ovn-ic: Interconnection gateway controller.
>   ovn-ic: Interconnection port controller.
>   ovn.at: e2e test for OVN interconnection.
>   ovn-ctl: Refactor to reduce redundant code.
>   ovn-ctl: Support commands for interconnection.
>   tutorial: Add tutorial for OVN Interconnection.
>

Hi Han,

Thanks for working on this feature. It's an interesting use case.

I had a quick look at all the patches.

I have few comments

1. I would suggest to rename the DBs as  ovn-ic-nb.ovsschema (and the
same for ovn-isb).
The DB name is - OVN_IC_Northbound. So it would make sense to have
- ovn-ic-nb.ovsschema
 I would also suggest to rename the utilities to ovn-ic-nbctl and
ovn-ic-sbctl.
With  ovn-inbctl/ovn-isbctl, it is really confusing.

2. ovn-ic service writes to interconnect south db, ovn north db and
ovn south db. Writing to ic south db and
ovn north db is fine. But it seems a little odd for ovn-ic to
write to south db. From what I understand it writes
to south db for 3 purposes
  a. Updating the tunnel_key column of datapath_binding
representing the transit switch
  b. Updating the key column of port_binding representing the
logical router port connecting to the transit switch.
  c. Creating chassis rows for remote gateway chassis.

   I think it's better if ovn-ic can delegate all these to ovn-northd.
For (a) and (b), ovn-ic can set the generated tunnel key
   in the other_config/options column of Logical switch/Logical switch
port. ovn-northd can internally set this value to
   the south db.

   For (c), I think its better we add another table - Remote_Chassis
(or some other appropriate name) . ovn-ic will create rows
   in this table for each remote chassis and ovn-northd will create
chassis row in south db.
   We already have Gateway_Chassis table in North db. So I think it's
fine if we add Remote_Chassis table. The only odd thing
   would be is to store the encap details in North db.

   With this, ovn-ic, doesn't need to worry about syncing between
interconnect south db and ovn south db. In my opinion ovn-ic
   should act more like CMS to each availability zone and hence should
not do any write transactions to the south db.

Any concerns with this proposed approach ?

3. In patch 7,its better to rename the ovs configuration option -
"external_ids:is-interconn"  to "external_ids:ovn-is-interconn".
   You also need to document it in ovn-controller-8.xml.

   Or maybe we can remove this option - external_ids:is-interconn. We
probably don't need this if we do like below

   2 (c) can also be done this way
 - User creates transit switch.
 - ovn-ic creates transit switch in north db.
 - then the user adds the logical router port - logical switch
port to the transit switch in availability zone - az1 (for example)
 - then the user creates gw chassis - for example
  ovn-nbctl lrp-set-gateway-chassis lrp-lr1-ts1  [priority]
 - ovn-ic on az1 ,learns this and creates a Gateway_Chassis
row in the interconnect south db.
 - ovn-ic on other az's then create Remote_Chassis rows in the North db.
- ovn-northd on other az's then create chassis row in south db
with "is_remote" set to true.

I am not sure if this complicates further and hence its better
that ovn-ic writes to the south dbs. But we can discuss further on
this.


4. Can you please add a section in ovn-architecture about the "Journey
of  a packet in inter-az scenario" ? This would really
help in understanding the feature.

Thanks
Numan


>  .gitignore  |6 +
>  Documentation/automake.mk   |1 +
>  Documentation/tutorials/index.rst   |1 +
>  Documentation/tutorials/ovn-interconnection.rst |  188 
>  Makefile.am |1 +
>  NEWS|5 +
>  TODO.rst|   10 +
>  automake.mk |   75 ++
>  controller/binding.c|6 +-
>  controller/chassis.c|

Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

2019-11-13 Thread dwilder

On 2019-11-12 17:24, Lance Yang (Arm Technology China) wrote:

Hi,

For pip missing issue, we have a workaround.  In
.travis/linux-prepare.sh. You can install pip or something else with
package management commands.

For arm64 on travis, we check the environment variables in the
container and found a variable called "TRAVIS_ARCH". Thus we can use
this variable to install specific software for a specific CPU
architecture.

You can find a successful build here:
https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_yzyuestc_ovs_jobs_610716477&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=WN9SDoyPaU_O5oa3DAKsM2wYQpKnc2bhF_O-3r50JVs&s=p5lQOFy_H4apDm3OJqyJFVo04orvtxDBtLgMzLs0cZ4&e=



Thank you for your workaround Lance.  It would be best if the multi-arch 
images were congruent. I fear this may be an ongoing issue as distro's 
content can very between architectures. I am waiting for travis's 
response on the issue. I will submit an update to my ppc64le commit in a 
few days if travis cant resolve it.



-Original Message-
From: ovs-dev-boun...@openvswitch.org
 On Behalf Of dwilder
Sent: Wednesday, November 13, 2019 8:29 AM
To: Ilya Maximets 
Cc: ovs-dev@openvswitch.org; wil...@us.ibm.com
Subject: Re: [ovs-dev] [PATCH v3] travis: support ppc64le builds

On 2019-11-12 11:30, Ilya Maximets wrote:

On 12.11.2019 18:57, dwilder wrote:

On 2019-11-08 14:52, Ilya Maximets wrote:

On 06.11.2019 20:20, David Wilder wrote:

Add support for travis-ci ppc64le builds.

- Updated matrix in .travis.yml to include an arch: ppc64le build.
- Move package install needed for 32bit builds to
.travis/linux-prepare.sh.

To keep the total build time at an acceptable level only a single
build job is included in the matrix for ppc64le.

A build report example can be found here [1] [0]
https://urldefense.proofpoint.com/v2/url?u=http-3A__travis-2Dci.org
_&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQ
beAyz8i_vwCCaI&m=_UtshYcJsj4Pt3b9hfgEiyaEIT3j9gPEIgmBatCEqCo&s=r0fB
Os-21CKcV4kyZGnzh3fcjrpR8caYSl8K2i1St54&e=
[1]
https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.or
g_djlwilder_ovs_builds_607851729&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&
r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=_UtshYcJsj4Pt3b9hfg
EiyaEIT3j9gPEIgmBatCEqCo&s=7t2rzVasH7Xq_R7jWkWZO9rkgm4KHMH-WavBzCRb
F74&e=
Signed-off-by: David Wilder 
---
Addressed review comments:
- Cleaned up linux-prepare.sh (v2)
- Switch from os: linux-ppc64le to arch: ppc64le (v3)


What a wonderful world of undocumented features. :)

Anyway, I just tried this patch and it fails for me because of
missing pip:

https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org
_igsilya_ovs_jobs_609402867&d=DwICaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndx
yKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=_UtshYcJsj4Pt3b9hfgEiyaEIT
3j9gPEIgmBatCEqCo&s=PF1oO_KkZFd_RRKToj6UBN2t2YhvTVE5XnVD1GF9u60&e=

pip install --disable-pip-version-check --user six flake8 hacking
./.travis/linux-prepare.sh: line 15: pip: command not found

Restarting the job doesn't help.

I'm wondering what is the base image and who controls preinstalled
software?
Maybe it makes sense to hold on until travis will start supporting
ppc64le officially?

Best regards, Ilya Maximets.


Unfortunately it not just ppc the arm64 image has the same issue :) I
added a arm64 build on this attempt.

https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_
djlwilder_ovs_builds_610517731&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7n
dxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=rVTSRTh9jExG_mX8boqA-OQcZ
kWmbO2g7TjgtUe6jws&s=sCvTr8MwXrqa7AEOs60tuqnquBqbRKlp_7-WacGzJWc&e=

I will attempt to adjust the package list.


I think, you could just report this to TravisCI support and see what
they will answer.  It'll be much better if they will just update their
images.

And the good news that ppc64le is officially supported now:
https://urldefense.proofpoint.com/v2/url?u=https-3A__blog.travis-2Dci.
com_2019-2D11-2D12-2Dmulti-2Dcpu-2Darchitecture-2Dibm-2Dpower-2Dibm-2D
z&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeA
yz8i_vwCCaI&m=rVTSRTh9jExG_mX8boqA-OQcZkWmbO2g7TjgtUe6jws&s=CRYw9o6At3
rFT_gd_5r1Uw07mv_SEPvQ6LgBLfMyqgg&e=


That is great news!
I reported the issue to travis:
https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.community_t_pip-2Dis-2Dnot-2Dinstalled-2Din-2Dppc64le-2Dand-2Darm64-2Dimages_5902-3Fu-3Ddjlwilder&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=WN9SDoyPaU_O5oa3DAKsM2wYQpKnc2bhF_O-3r50JVs&s=dIOPvVHVZcpNxq8BgX026yL8uhzxz19lOzjBJhB15xg&e=

I added a workaround if we need it. This is my latest travis run.
https://urldefense.proofpoint.com/v2/url?u=https-3A__www.travis-2Dci.org_djlwilder_ovs_builds_611123734&d=DwIGaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=7ndxyKjH9UrBD68Us2WP1wI4BwEBQbeAyz8i_vwCCaI&m=WN9SDoyPaU_O5oa3DAKsM2wYQpKnc2bhF_O-3r50JVs&s=QcFJcytLX5PkWH1FK5SZDaFwSijTPi01rCNT02pBsms&e=

[ovs-dev] [PATCH] ovsdb raft: Fix election timer parsing in snapshot RPC.

2019-11-13 Thread Han Zhou
Commit a76ba825 took care of saving and restoring election timer in
file header snapshot, but it didn't handle the parsing of election
timer in install_snapshot_request/reply RPC, which results in problems,
e.g. when election timer change log is compacted in snapshot and then a
new node join the cluster, the new node will use the default timer
instead of the new value.  This patch fixed it by parsing election
timer in snapshot RPC.

At the same time the patch updates the test case to cover the DB compact and
join senario. The test reveals another 2 problems related to clustered DB
compact, as commented in the test case's XXX, which need to be addressed
separately.

Signed-off-by: Han Zhou 
---
 ovsdb/raft-rpc.c   |  5 +
 ovsdb/raft.c   |  2 +-
 tests/ovsdb-cluster.at | 49 +
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/ovsdb/raft-rpc.c b/ovsdb/raft-rpc.c
index 56c07d4..18c83fe 100644
--- a/ovsdb/raft-rpc.c
+++ b/ovsdb/raft-rpc.c
@@ -511,6 +511,7 @@ raft_install_snapshot_request_to_jsonrpc(
 json_object_put(args, "last_servers", json_clone(rq->last_servers));
 json_object_put_format(args, "last_eid",
UUID_FMT, UUID_ARGS(&rq->last_eid));
+raft_put_uint64(args, "election_timer", rq->election_timer);
 
 json_object_put(args, "data", json_clone(rq->data));
 }
@@ -527,6 +528,9 @@ raft_install_snapshot_request_from_jsonrpc(
 rq->last_index = raft_parse_required_uint64(p, "last_index");
 rq->last_term = raft_parse_required_uint64(p, "last_term");
 rq->last_eid = raft_parse_required_uuid(p, "last_eid");
+/* election_timer is optional in file header, but is always populated in
+ * install_snapshot_request. */
+rq->election_timer = raft_parse_required_uint64(p, "election_timer");
 
 rq->data = json_nullable_clone(
 ovsdb_parser_member(p, "data", OP_OBJECT | OP_ARRAY));
@@ -541,6 +545,7 @@ raft_format_install_snapshot_request(
 ds_put_format(s, " last_term=%"PRIu64, rq->last_term);
 ds_put_format(s, " last_eid="UUID_FMT, UUID_ARGS(&rq->last_eid));
 ds_put_cstr(s, " last_servers=");
+ds_put_format(s, " election_timer=%"PRIu64, rq->election_timer);
 
 struct hmap servers;
 struct ovsdb_error *error =
diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index a45c7f8..f354d50 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -3257,7 +3257,7 @@ raft_send_install_snapshot_request(struct raft *raft,
 .last_servers = raft->snap.servers,
 .last_eid = raft->snap.eid,
 .data = raft->snap.data,
-.election_timer = raft->election_timer,
+.election_timer = raft->election_timer, /* use latest value */
 }
 };
 raft_send(raft, &rpc);
diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
index 23ed7ec..72f8740 100644
--- a/tests/ovsdb-cluster.at
+++ b/tests/ovsdb-cluster.at
@@ -233,6 +233,55 @@ done
 OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s1 cluster/status $schema_name | grep 
"Election timer: 4000"])
 OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s2 cluster/status $schema_name | grep 
"Election timer: 4000"])
 
+# Latest timer should be restored after DB compact and restart.
+# This is to test the install_snapshot RPC.
+
+# XXX: Insert data before compact, because otherwise transaction will trigger
+# busy loop after compact.
+# poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164 (89% CPU 
usage)
+AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
+  {"op": "insert",
+   "table": "simple",
+   "row": {"i": 1}}]]'], [0], [ignore], [ignore])
+
+# Compact online
+for i in `seq $n`; do
+AT_CHECK([ovs-appctl -t "`pwd`"/s$i ovsdb-server/compact])
+done
+
+# XXX: Insert data after compact, because otherwise vote will fail after
+# cluster restart after compact. There will be error logs like:
+# raft|ERR|internal error: deferred vote_request message completed but not 
ready to send because message index 9 is past last synced index 0: s2 
vote_request: term=6 last_log_index=9 last_log_term=4
+AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
+  {"op": "insert",
+   "table": "simple",
+   "row": {"i": 1}}]]'], [0], [ignore], [ignore])
+
+for i in `seq $n`; do
+printf "\ns$i: stopping\n"
+OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid])
+done
+for i in `seq $n`; do
+AT_CHECK([ovsdb-server -v -vconsole:off -vsyslog:off --detach --no-chdir 
--log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i --remote=punix:s$i.ovsdb 
s$i.db])
+done
+for i in `seq $n`; do
+OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s$i cluster/status $schema_name | 
grep "Election timer: 4000"])
+done
+
+# Wait until cluster is ready
+for i in `seq $n`; do
+OVS_WAIT_WHILE([ovs-appctl -t "`pwd`"/s$i cluster/status $schema_name | 
grep "Leader: unknown"])
+done
+
+# Newly joined member should use latest timer value
+AT_CHECK([ovsdb-tool join-cluster s4.db $sche

Re: [ovs-dev] [PATCH] compat: Add missing inline keyword

2019-11-13 Thread Gregory Rose


On 11/5/2019 5:03 PM, Gregory Rose wrote:


On 11/5/2019 3:14 PM, Ben Pfaff wrote:

On Tue, Nov 05, 2019 at 02:14:24PM -0800, Greg Rose wrote:

The missing inline keyword before the definition of the
rpl_nf_ct_tmpl_free() function causes spurious warnings about the
function not being used on some older kernels.  Add the keyword
to suppress the warning.

Signed-off-by: Greg Rose 

The commit message and the patch make sense given my experience of GCC
behavior.  I haven't tested it.

Acked-by: Ben Pfaff 


I only did a compile test of it on Travis.

https://travis-ci.org/gvrose8192/ovs-experimental/builds/607794710

There's one error but it's a resource or infrastructure issue I think.

Thanks,

- Greg


Pinging the maintainers...

This is a benign issue for now but if anyone ever tried to use 
rpl_nf_ct_tmpl_free in two different modules they'd

get a linker error.  That probably won't happen but it's still not correct.

Thanks,

- Greg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] dpif-netdev: Modify dfc_processing function to void function

2019-11-13 Thread Malvika Gupta
dfc_processing function returns the number of packets left to be processed
in 'packets' array via dp_packet_batch_size() function. dfc_processing function
is called only from dp_netdev_input__ and its return value is not checked upon
function return. Moreover, dp_packet_batch_is_empty() called after function
return from dfc_processing itself calls dp_packet_batch_size to check if
'packets' array is empty. This patch modifies the dfc_processing function to a
void function to remove the above code redundancy and cleans the code.

Signed-off-by: Malvika Gupta 
---
 lib/dpif-netdev.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 28bc224c2..dd4a6da2f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -6450,16 +6450,13 @@ smc_lookup_batch(struct dp_netdev_pmd_thread *pmd,
  * beginning of the 'packets' array. The pointers of missed keys are put in the
  * missed_keys pointer array for future processing.
  *
- * The function returns the number of packets that needs to be processed in the
- * 'packets' array (they have been moved to the beginning of the vector).
- *
  * For performance reasons a caller may choose not to initialize the metadata
  * in 'packets_'.  If 'md_is_valid' is false, the metadata in 'packets'
  * is not valid and must be initialized by this function using 'port_no'.
  * If 'md_is_valid' is true, the metadata is already valid and 'port_no'
  * will be ignored.
  */
-static inline size_t
+static inline void
 dfc_processing(struct dp_netdev_pmd_thread *pmd,
struct dp_packet_batch *packets_,
struct netdev_flow_key *keys,
@@ -6575,15 +6572,11 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
 
 pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_EXACT_HIT, n_emc_hit);
 
-if (!smc_enable_db) {
-return dp_packet_batch_size(packets_);
+if (smc_enable_db) {
+/* Packets miss EMC will do a batch lookup in SMC if enabled */
+smc_lookup_batch(pmd, keys, missed_keys, packets_,
+  n_missed, flow_map, index_map);
 }
-
-/* Packets miss EMC will do a batch lookup in SMC if enabled */
-smc_lookup_batch(pmd, keys, missed_keys, packets_,
- n_missed, flow_map, index_map);
-
-return dp_packet_batch_size(packets_);
 }
 
 static inline int
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] dpif-netdev-perf: aarch64 support for accurate timing update of TSC cycle counter

2019-11-13 Thread Malvika Gupta
The accurate timing implementation in this patch gets the wall clock counter via
cntvct_el0 register access. This call is portable to all aarch64 architectures
and has been verified on an 64-bit arm server.

Suggested-by: Yanqin Wei 
Signed-off-by: Malvika Gupta 
---
 lib/dpif-netdev-perf.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h
index ce369375b..4ea7cc355 100644
--- a/lib/dpif-netdev-perf.h
+++ b/lib/dpif-netdev-perf.h
@@ -220,6 +220,11 @@ cycles_counter_update(struct pmd_perf_stats *s)
 asm volatile("rdtsc" : "=a" (l), "=d" (h));
 
 return s->last_tsc = ((uint64_t) h << 32) | l;
+#elif !defined(_MSC_VER) && defined(__aarch64__)
+uint64_t tsc;
+asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
+
+return s->last_tsc = tsc;
 #elif defined(__linux__)
 return rdtsc_syscall(s);
 #else
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] dpdk: Migrate to the new pdump API.

2019-11-13 Thread David Marchand
On Wed, Nov 13, 2019 at 4:48 PM Ilya Maximets  wrote:
> On 12.11.2019 21:13, David Marchand wrote:
> > This is pure speculation, but when I saw the crash before, I thought
> > that the problem was in the way ovs creates its thread without the
> > dpdk being aware of it.
>
> At least, lcore_ids are set.

It works, but this is a hack.

>
> > dpdk pdump component expects that it's running on a EAL thread, with a
> > known lcore, and *boom* when it dereferences some uninitialized
> > structures/resources.
> >
> > I did not really investigate, I just fear we have this class of
> > issues, since dpdk (and its sub components) is not instructed by ovs
> > how it placed its threads.
> > ovs has been doing this for some time, without people hitting bugs, so
> > I might just be paranoid.
>
> BTW, it seems for me that all this "EAL threads" concept is an "RTE"
> legacy.  I understand that automatic support for dynamically created
> threads in the outer application sounds like a Sci-Fi fantasy, but as
> far as DPDK tries to be a library, I think, it should at least allow
> users to register their own threads.
> Large applications that wants to use DPDK, but also wants to be usable
> without it will likely not use EAL-threads because it's not flexible
> and not re-usable.  Otherwise they will need to create layers of proxy
> libraries to abstract their thread management.

Not entering the debate on what is legacy.

I agree that DPDK should provide a way to register threads.
I don't remember discussion with OVS people when/if DPDK integration
has been done.
If it happened, pointers are welcome.


Anyway, it seems worth looking at, but not necessarily easy.
It would make dpdk a little more like a library.
I will put this in my dpdk todolist (for the next release(s)).


--
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 ovn 2/2] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-13 Thread Dumitru Ceara
On Wed, Nov 13, 2019 at 4:30 PM Han Zhou  wrote:
>
>
>
> On Wed, Nov 13, 2019 at 2:42 AM Dumitru Ceara  wrote:
>>
>> On Tue, Nov 12, 2019 at 8:50 PM Han Zhou  wrote:
>> >
>> >
>> >
>> >
>> > On Tue, Nov 12, 2019 at 10:10 AM Dumitru Ceara  wrote:
>> > >
>> > > On Tue, Nov 12, 2019 at 6:17 PM Han Zhou  wrote:
>> > > >
>> > > >
>> > > >
>> > > > On Tue, Nov 12, 2019 at 2:29 AM Dumitru Ceara  
>> > > > wrote:
>> > > > >
>> > > > > ARP request and ND NS packets for router owned IPs were being
>> > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast 
>> > > > > group).
>> > > > > However this creates a scaling issue in scenarios where aggregation
>> > > > > logical switches are connected to more logical routers (~350). The
>> > > > > logical pipelines of all routers would have to be executed before the
>> > > > > packet is finally replied to by a single router, the owner of the IP
>> > > > > address.
>> > > > >
>> > > > > This commit limits the broadcast domain by bypassing the L2 Lookup 
>> > > > > stage
>> > > > > for ARP requests that will be replied by a single router. The packets
>> > > > > are forwarded only to the router port that owns the target IP 
>> > > > > address.
>> > > > >
>> > > > > IPs that are owned by the routers and for which this fix applies are:
>> > > > > - IP addresses configured on the router ports.
>> > > > > - VIPs.
>> > > > > - NAT IPs.
>> > > > >
>> > > > > Reported-at: https://bugzilla.redhat.com/1756945
>> > > > > Reported-by: Anil Venkata 
>> > > > > Signed-off-by: Dumitru Ceara 
>> > > > >
>> > > > > ---
>> > > > > v7:
>> > > > > - Address Han's comments:
>> > > > > - Remove flooding for all ARPs received on VLAN networks. To 
>> > > > > avoid
>> > > > >   that we now identify self originated (G)ARPs by matching on 
>> > > > > source
>> > > > >   MAC address too.
>> > > > > - Rename REGBIT_NOT_VXLAN to FLAGBIT_NOT_VXLAN.
>> > > > > - Fix ovn-sb manpage.
>> > > > > - Split patch in a series of 2:
>> > > > > - patch1: fixes the get_router_load_balancer_ips() function.
>> > > > > - patch2: limits the ARP/ND broadcast domain.
>> > > > > v6:
>> > > > > - Address Han's comments:
>> > > > > - remove flooding of ARPs targeting OVN owned IP addresses.
>> > > > > - update ovn-architecture documentation.
>> > > > > - rename ARP handling functions.
>> > > > > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to take 
>> > > > > into
>> > > > > account the new way of forwarding ARPs.
>> > > > > - Also, properly deal with ARP packets on VLAN-backed networks.
>> > > > > v5: Address Numan's comments: update comments & make autotest more
>> > > > > robust.
>> > > > > v4: Rebase.
>> > > > > v3: Properly deal with VXLAN traffic. Address review comments from
>> > > > > Numan (add autotests). Fix function get_router_load_balancer_ips.
>> > > > > Rebase -> deal with IPv6 NAT too.
>> > > > > v2: Move ARP broadcast domain limiting to table S_SWITCH_IN_L2_LKUP 
>> > > > > to
>> > > > > address localnet ports too.
>> > > > > ---
>> > > > >  northd/ovn-northd.8.xml |   14 ++
>> > > > >  northd/ovn-northd.c |  230 +++
>> > > > >  ovn-architecture.7.xml  |   19 +++
>> > > > >  tests/ovn.at|  307 
>> > > > > +--
>> > > > >  4 files changed, 530 insertions(+), 40 deletions(-)
>> > > > >
>> > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>> > > > > index 0a33dcd..344cc0d 100644
>> > > > > --- a/northd/ovn-northd.8.xml
>> > > > > +++ b/northd/ovn-northd.8.xml
>> > > > > @@ -1005,6 +1005,20 @@ output;
>> > > > >
>> > > > >
>> > > > >
>> > > > > +Priority-80 flows for each port connected to a logical 
>> > > > > router
>> > > > > +matching self originated GARP/ARP request/ND packets. These 
>> > > > > packets
>> > > > > +are flooded to the MC_FLOOD which contains all 
>> > > > > logical
>> > > > > +ports.
>> > > > > +  
>> > > > > +
>> > > > > +  
>> > > > > +Priority-75 flows for each IP address/VIP/NAT address owned 
>> > > > > by a
>> > > > > +router port connected to the switch. These flows match ARP 
>> > > > > requests
>> > > > > +and ND packets for the specific IP addresses.  Matched 
>> > > > > packets are
>> > > > > +forwarded only to the router that owns the IP address.
>> > > > > +  
>> > > > > +
>> > > > > +  
>> > > > >  A priority-70 flow that outputs all packets with an 
>> > > > > Ethernet broadcast
>> > > > >  or multicast eth.dst to the 
>> > > > > MC_FLOOD
>> > > > >  multicast group.
>> > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> > > > > index 32f3200..d6beb97 100644
>> > > > > --- a/northd/ovn-northd.c
>> > > > > +++ b/northd/ovn-northd.c
>> > > > > @@ -210,6 +210,8 @@ enum ovn_stage {
>> > > > >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[4]"
>> > > > >  #defi

Re: [ovs-dev] [PATCH v3] dpdk: Migrate to the new pdump API.

2019-11-13 Thread Ilya Maximets
On 12.11.2019 21:13, David Marchand wrote:
> On Tue, Nov 12, 2019 at 8:41 PM Ilya Maximets  wrote:
>>
>> On 12.11.2019 19:51, Stokes, Ian wrote:
>>>
>>>
>>> On 11/12/2019 5:15 PM, David Marchand wrote:
 On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian  wrote:
> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>> switched pdump to use generic DPDK IPC instead of sockets.
>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>
>> Signed-off-by: Ilya Maximets 
>
> Thanks for the patch Ilya.
>
> I see compilation passing now on dpdk-latest with this applied.
>
> https://travis-ci.org/istokes/ovs/builds/610915636
>
> I still had issues with running PDUMP, but those issues are specific to
> PDUMP setup in my environment. A separate issue we can discuss further
> on the deprecation thread as it seems unrelated to this patch.
>
> @David, are you happy to ack the patch (I see some of the changes are
> from your side).

 I did not work on the crash I saw, but it was most likely a problem on my 
 side.
 This looks good to me.
>>>
>>> From a some further testing on my side I'm also seeing a crash, 
>>> specifically OVS crashes out once packets are received. PDUMP is still 
>>> running but complains of being unable to communicate with the primary 
>>> process and then exits. Is this similar to what you saw?
>>>
>>> @Ilya, by chance did you see anything like this?
>>
>>
>> Honestly, I never tried to use pdump, and I don't really want to try
>> preparing the setup for it (building ASLR disabled kernel and stuff).
>>
>>
>>> I'll investigate further myself tomorrow. I was going to hold off on the 
>>> merge in the meantime. Thoughts?
>>
>> There are 2 options here:
>> 1. Apply this patch and hope that DPDK will be fixed someday.
>>+ Optionally apply deprecation patch.
> 
> This is pure speculation, but when I saw the crash before, I thought
> that the problem was in the way ovs creates its thread without the
> dpdk being aware of it.

At least, lcore_ids are set.

> dpdk pdump component expects that it's running on a EAL thread, with a
> known lcore, and *boom* when it dereferences some uninitialized
> structures/resources.
> 
> I did not really investigate, I just fear we have this class of
> issues, since dpdk (and its sub components) is not instructed by ovs
> how it placed its threads.
> ovs has been doing this for some time, without people hitting bugs, so
> I might just be paranoid.

BTW, it seems for me that all this "EAL threads" concept is an "RTE"
legacy.  I understand that automatic support for dynamically created
threads in the outer application sounds like a Sci-Fi fantasy, but as
far as DPDK tries to be a library, I think, it should at least allow
users to register their own threads.
Large applications that wants to use DPDK, but also wants to be usable
without it will likely not use EAL-threads because it's not flexible
and not re-usable.  Otherwise they will need to create layers of proxy
libraries to abstract their thread management.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 ovn 2/2] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-13 Thread Han Zhou
On Wed, Nov 13, 2019 at 2:42 AM Dumitru Ceara  wrote:

> On Tue, Nov 12, 2019 at 8:50 PM Han Zhou  wrote:
> >
> >
> >
> >
> > On Tue, Nov 12, 2019 at 10:10 AM Dumitru Ceara 
> wrote:
> > >
> > > On Tue, Nov 12, 2019 at 6:17 PM Han Zhou  wrote:
> > > >
> > > >
> > > >
> > > > On Tue, Nov 12, 2019 at 2:29 AM Dumitru Ceara 
> wrote:
> > > > >
> > > > > ARP request and ND NS packets for router owned IPs were being
> > > > > flooded in the complete L2 domain (using the MC_FLOOD multicast
> group).
> > > > > However this creates a scaling issue in scenarios where aggregation
> > > > > logical switches are connected to more logical routers (~350). The
> > > > > logical pipelines of all routers would have to be executed before
> the
> > > > > packet is finally replied to by a single router, the owner of the
> IP
> > > > > address.
> > > > >
> > > > > This commit limits the broadcast domain by bypassing the L2 Lookup
> stage
> > > > > for ARP requests that will be replied by a single router. The
> packets
> > > > > are forwarded only to the router port that owns the target IP
> address.
> > > > >
> > > > > IPs that are owned by the routers and for which this fix applies
> are:
> > > > > - IP addresses configured on the router ports.
> > > > > - VIPs.
> > > > > - NAT IPs.
> > > > >
> > > > > Reported-at: https://bugzilla.redhat.com/1756945
> > > > > Reported-by: Anil Venkata 
> > > > > Signed-off-by: Dumitru Ceara 
> > > > >
> > > > > ---
> > > > > v7:
> > > > > - Address Han's comments:
> > > > > - Remove flooding for all ARPs received on VLAN networks. To
> avoid
> > > > >   that we now identify self originated (G)ARPs by matching on
> source
> > > > >   MAC address too.
> > > > > - Rename REGBIT_NOT_VXLAN to FLAGBIT_NOT_VXLAN.
> > > > > - Fix ovn-sb manpage.
> > > > > - Split patch in a series of 2:
> > > > > - patch1: fixes the get_router_load_balancer_ips() function.
> > > > > - patch2: limits the ARP/ND broadcast domain.
> > > > > v6:
> > > > > - Address Han's comments:
> > > > > - remove flooding of ARPs targeting OVN owned IP addresses.
> > > > > - update ovn-architecture documentation.
> > > > > - rename ARP handling functions.
> > > > > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to
> take into
> > > > > account the new way of forwarding ARPs.
> > > > > - Also, properly deal with ARP packets on VLAN-backed networks.
> > > > > v5: Address Numan's comments: update comments & make autotest more
> > > > > robust.
> > > > > v4: Rebase.
> > > > > v3: Properly deal with VXLAN traffic. Address review comments from
> > > > > Numan (add autotests). Fix function
> get_router_load_balancer_ips.
> > > > > Rebase -> deal with IPv6 NAT too.
> > > > > v2: Move ARP broadcast domain limiting to table
> S_SWITCH_IN_L2_LKUP to
> > > > > address localnet ports too.
> > > > > ---
> > > > >  northd/ovn-northd.8.xml |   14 ++
> > > > >  northd/ovn-northd.c |  230 +++
> > > > >  ovn-architecture.7.xml  |   19 +++
> > > > >  tests/ovn.at|  307
> +--
> > > > >  4 files changed, 530 insertions(+), 40 deletions(-)
> > > > >
> > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > > index 0a33dcd..344cc0d 100644
> > > > > --- a/northd/ovn-northd.8.xml
> > > > > +++ b/northd/ovn-northd.8.xml
> > > > > @@ -1005,6 +1005,20 @@ output;
> > > > >
> > > > >
> > > > >
> > > > > +Priority-80 flows for each port connected to a logical
> router
> > > > > +matching self originated GARP/ARP request/ND packets.
> These packets
> > > > > +are flooded to the MC_FLOOD which contains
> all logical
> > > > > +ports.
> > > > > +  
> > > > > +
> > > > > +  
> > > > > +Priority-75 flows for each IP address/VIP/NAT address
> owned by a
> > > > > +router port connected to the switch. These flows match
> ARP requests
> > > > > +and ND packets for the specific IP addresses.  Matched
> packets are
> > > > > +forwarded only to the router that owns the IP address.
> > > > > +  
> > > > > +
> > > > > +  
> > > > >  A priority-70 flow that outputs all packets with an
> Ethernet broadcast
> > > > >  or multicast eth.dst to the
> MC_FLOOD
> > > > >  multicast group.
> > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > > index 32f3200..d6beb97 100644
> > > > > --- a/northd/ovn-northd.c
> > > > > +++ b/northd/ovn-northd.c
> > > > > @@ -210,6 +210,8 @@ enum ovn_stage {
> > > > >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[4]"
> > > > >  #define REGBIT_SKIP_LOOKUP_NEIGHBOR "reg9[5]"
> > > > >
> > > > > +#define FLAGBIT_NOT_VXLAN "flags[1] == 0"
> > > > > +
> > > > >  /* Returns an "enum ovn_stage" built from the arguments. */
> > > > >  static enum ovn_stage
> > > > >  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline
> pipeline,
> > > 

Re: [ovs-dev] [PATCH] dpdk: Deprecate pdump support.

2019-11-13 Thread Flavio Leitner
On Mon, 11 Nov 2019 19:52:56 +0100
Ilya Maximets  wrote:

> The conventional way for packet dumping in OVS is to use ovs-tcpdump
> that works via traffic mirroring.  DPDK pdump could probably be used
> for some lower level debugging, but it is not commonly used for
> various reasons.
> 
> There are lots of limitations for using this functionality in
> practice. Most of them connected with running secondary pdump process
> and memory layout issues like requirement to disable ASLR in kernel.
> More details are available in DPDK guide:
> https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
> 
> Beside the functional limitations it's also hard to use this
> functionality correctly.  User must be sure that OVS and pdump utility
> are running on different CPU cores, which is hard because non-PMD
> threads could float over available CPU cores.  This or any other
> misconfiguration will likely lead to crash of the pdump utility
> or/and OVS.
> 
> Another problem is that the user must actually have this special pdump
> utility in a system and it might be not available in distributions.
> 
> This change disables pdump support by default introducing special
> configuration option '--enable-dpdk-pdump'.  Deprecation warnings will
> be shown to users on configuration and in runtime.
> 
> Claiming to completely remove this functionality from OVS in one
> of the next releases.
> 
> Signed-off-by: Ilya Maximets 
> Acked-by: Aaron Conole 
> ---
> 
> Version 1:
>   * No changes since RFC.
>   * Added ACK from Aaron.
> 
>  .travis/linux-build.sh  |  4 +++-
>  Documentation/topics/dpdk/pdump.rst |  8 +++-
>  NEWS|  4 
>  acinclude.m4| 24 ++--
>  lib/dpdk.c  |  2 ++
>  5 files changed, 34 insertions(+), 8 deletions(-)

New option noted in NEWS, warning for those users enabling the
option, documentation updated.
LGTM
Acked-by: Flavio Leitner 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] dpdk: Migrate to the new pdump API.

2019-11-13 Thread Ilya Maximets
On 13.11.2019 15:28, Stokes, Ian wrote:
> 
> 
> On 11/12/2019 7:40 PM, Ilya Maximets wrote:
>> On 12.11.2019 19:51, Stokes, Ian wrote:
>>>
>>>
>>> On 11/12/2019 5:15 PM, David Marchand wrote:
 On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian  wrote:
> On 11/11/2019 3:01 PM, Ilya Maximets wrote:
>> DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
>> switched pdump to use generic DPDK IPC instead of sockets.
>> Old API was deprecated and removed.  Updating OVS code accordingly.
>>
>> Signed-off-by: Ilya Maximets 
>
> Thanks for the patch Ilya.
>
> I see compilation passing now on dpdk-latest with this applied.
>
> https://travis-ci.org/istokes/ovs/builds/610915636
>
> I still had issues with running PDUMP, but those issues are specific to
> PDUMP setup in my environment. A separate issue we can discuss further
> on the deprecation thread as it seems unrelated to this patch.
>
> @David, are you happy to ack the patch (I see some of the changes are
> from your side).

 I did not work on the crash I saw, but it was most likely a problem on my 
 side.
 This looks good to me.
>>>
>>>  From a some further testing on my side I'm also seeing a crash, 
>>> specifically OVS crashes out once packets are received. PDUMP is still 
>>> running but complains of being unable to communicate with the primary 
>>> process and then exits. Is this similar to what you saw?
>>>
>>> @Ilya, by chance did you see anything like this?
>>
>>
>> Honestly, I never tried to use pdump, and I don't really want to try
>> preparing the setup for it (building ASLR disabled kernel and stuff).
>>
> 
> I haven't run it a while myself either. Typically use ovs-tcpdump.
> 
> 
>>
>>> I'll investigate further myself tomorrow. I was going to hold off on the 
>>> merge in the meantime. Thoughts?
>>
>> There are 2 options here:
>> 1. Apply this patch and hope that DPDK will be fixed someday.
>>     + Optionally apply deprecation patch.
>> 2. Completely remove pdump support now without prior deprecation
>>     because it just doesn't work.
> I'd like option 1, apply the patches & deprecate on dpdk-latest. This at 
> least gets the dpdk-latest passing with rc2 for the moment (I'd do a rebase 
> for ovs master with this also).
> 
> Before removing pdump support completely I'd like to investigate further, if 
> it can be confirmed to be a DPDK/OVS compatibility issue WRT the PMDs as 
> David has suggested, then thats fine,  once confirmed I feel it would be fair 
> to remove it completely then as it just doesn't work.
> 
> Thoughts?

OK for me.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4] net: openvswitch: add hash info to upcall

2019-11-13 Thread xiangxia . m . yue
From: Tonghao Zhang 

When using the kernel datapath, the upcall don't
include skb hash info relatived. That will introduce
some problem, because the hash of skb is important
in kernel stack. For example, VXLAN module uses
it to select UDP src port. The tx queue selection
may also use the hash in stack.

Hash is computed in different ways. Hash is random
for a TCP socket, and hash may be computed in hardware,
or software stack. Recalculation hash is not easy.

Hash of TCP socket is computed:
tcp_v4_connect
-> sk_set_txhash (is random)

__tcp_transmit_skb
-> skb_set_hash_from_sk

There will be one upcall, without information of skb
hash, to ovs-vswitchd, for the first packet of a TCP
session. The rest packets will be processed in Open vSwitch
modules, hash kept. If this tcp session is forward to
VXLAN module, then the UDP src port of first tcp packet
is different from rest packets.

TCP packets may come from the host or dockers, to Open vSwitch.
To fix it, we store the hash info to upcall, and restore hash
when packets sent back.

+---+  +-+
|   Docker/VMs  |  | ovs-vswitchd|
++--+  +-++--+
 |   ^|
 |   ||
 |   |  upcallv restore packet hash (not 
recalculate)
 | +-++--+
 |  tap netdev | |   vxlan module
 +---> +-->  Open vSwitch ko +-->
   or internal type| |
   +-+

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
Signed-off-by: Tonghao Zhang 
---
v4:
* all hash of skb to upcall
* remove the pad_packet, nla_put makes sure len is align
* add OVS_PACKET_ATTR_HASH at the end of ovs_packet_attr.
v3:
* add enum ovs_pkt_hash_types
* avoid duplicate call the skb_get_hash_raw.
* explain why we should fix this problem.
---
 include/uapi/linux/openvswitch.h |  4 +++-
 net/openvswitch/datapath.c   | 26 +-
 net/openvswitch/datapath.h   | 12 
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 1887a451c388..a87b44cd5590 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -173,6 +173,7 @@ enum ovs_packet_cmd {
  * @OVS_PACKET_ATTR_LEN: Packet size before truncation.
  * %OVS_PACKET_ATTR_USERSPACE action specify the Maximum received fragment
  * size.
+ * @OVS_PACKET_ATTR_HASH: Packet hash info (e.g. hash, sw_hash and l4_hash in 
skb).
  *
  * These attributes follow the &struct ovs_header within the Generic Netlink
  * payload for %OVS_PACKET_* commands.
@@ -190,7 +191,8 @@ enum ovs_packet_attr {
OVS_PACKET_ATTR_PROBE,  /* Packet operation is a feature probe,
   error logging should be suppressed. */
OVS_PACKET_ATTR_MRU,/* Maximum received IP fragment size. */
-   OVS_PACKET_ATTR_LEN,/* Packet size before truncation. */
+   OVS_PACKET_ATTR_LEN,/* Packet size before truncation. */
+   OVS_PACKET_ATTR_HASH,   /* Packet hash. */
__OVS_PACKET_ATTR_MAX
 };
 
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 2088619c03f0..8ce1f773378d 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -350,7 +350,8 @@ static size_t upcall_msg_size(const struct dp_upcall_info 
*upcall_info,
size_t size = NLMSG_ALIGN(sizeof(struct ovs_header))
+ nla_total_size(hdrlen) /* OVS_PACKET_ATTR_PACKET */
+ nla_total_size(ovs_key_attr_size()) /* OVS_PACKET_ATTR_KEY */
-   + nla_total_size(sizeof(unsigned int)); /* OVS_PACKET_ATTR_LEN 
*/
+   + nla_total_size(sizeof(unsigned int)) /* OVS_PACKET_ATTR_LEN */
+   + nla_total_size(sizeof(u64)); /* OVS_PACKET_ATTR_HASH */
 
/* OVS_PACKET_ATTR_USERDATA */
if (upcall_info->userdata)
@@ -393,6 +394,7 @@ static int queue_userspace_packet(struct datapath *dp, 
struct sk_buff *skb,
size_t len;
unsigned int hlen;
int err, dp_ifindex;
+   u64 hash;
 
dp_ifindex = get_dpifindex(dp);
if (!dp_ifindex)
@@ -504,6 +506,19 @@ static int queue_userspace_packet(struct datapath *dp, 
struct sk_buff *skb,
pad_packet(dp, user_skb);
}
 
+   /* Add OVS_PACKET_ATTR_HASH */
+   hash = skb_get_hash_raw(skb);
+   if (skb->sw_hash)
+   hash |= OVS_PACKET_HASH_SW_BIT;
+
+   if (skb->l4_hash)
+   hash |= OVS_PACKET_HASH_L4_BIT;
+
+   if (nla_put(user_skb, OVS_PACKET_ATTR_HASH, sizeof (u64), &hash)) {
+   err = -ENOBUFS;
+   goto out;
+   }
+

Re: [ovs-dev] [PATCH v3] dpdk: Migrate to the new pdump API.

2019-11-13 Thread Stokes, Ian



On 11/12/2019 7:40 PM, Ilya Maximets wrote:

On 12.11.2019 19:51, Stokes, Ian wrote:



On 11/12/2019 5:15 PM, David Marchand wrote:

On Tue, Nov 12, 2019 at 6:07 PM Stokes, Ian  wrote:

On 11/11/2019 3:01 PM, Ilya Maximets wrote:

DPDK commit 660098d61f57 ("pdump: use generic multi-process channel")
switched pdump to use generic DPDK IPC instead of sockets.
Old API was deprecated and removed.  Updating OVS code accordingly.

Signed-off-by: Ilya Maximets 


Thanks for the patch Ilya.

I see compilation passing now on dpdk-latest with this applied.

https://travis-ci.org/istokes/ovs/builds/610915636

I still had issues with running PDUMP, but those issues are specific to
PDUMP setup in my environment. A separate issue we can discuss further
on the deprecation thread as it seems unrelated to this patch.

@David, are you happy to ack the patch (I see some of the changes are
from your side).


I did not work on the crash I saw, but it was most likely a problem on my side.
This looks good to me.


 From a some further testing on my side I'm also seeing a crash, specifically 
OVS crashes out once packets are received. PDUMP is still running but complains 
of being unable to communicate with the primary process and then exits. Is this 
similar to what you saw?

@Ilya, by chance did you see anything like this?



Honestly, I never tried to use pdump, and I don't really want to try
preparing the setup for it (building ASLR disabled kernel and stuff).



I haven't run it a while myself either. Typically use ovs-tcpdump.





I'll investigate further myself tomorrow. I was going to hold off on the merge 
in the meantime. Thoughts?


There are 2 options here:
1. Apply this patch and hope that DPDK will be fixed someday.
+ Optionally apply deprecation patch.
2. Completely remove pdump support now without prior deprecation
because it just doesn't work.
I'd like option 1, apply the patches & deprecate on dpdk-latest. This at 
least gets the dpdk-latest passing with rc2 for the moment (I'd do a 
rebase for ovs master with this also).


Before removing pdump support completely I'd like to investigate 
further, if it can be confirmed to be a DPDK/OVS compatibility issue WRT 
the PMDs as David has suggested, then thats fine,  once confirmed I feel 
it would be fair to remove it completely then as it just doesn't work.


Thoughts?



Best regards, Ilya Maximets.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.

2019-11-13 Thread Ilya Maximets
On 13.11.2019 1:21, William Tu wrote:
> On Thu, Nov 7, 2019 at 3:36 AM Ilya Maximets  wrote:
>>
>> Until now there was only two options for XDP mode in OVS: SKB or DRV.
>> i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
>>
>> Devices like 'veth' interfaces in Linux supports native XDP, but
>> doesn't support zero-copy mode.  This case can not be covered by
>> existing API and we have to use slower generic XDP for such devices.
>> There are few more issues, e.g. TCP is not supported in generic XDP
>> mode for veth interfaces due to kernel limitations, however it is
>> supported in native mode.
>>
>> This change introduces ability to use native XDP without zero-copy
>> along with best-effort configuration option that enabled by default.
>> In best-effort case OVS will sequentially try different modes starting
>> from the fastest one and will choose the first acceptable for current
>> interface.  This will guarantee the best possible performance.
>>
>> If user will want to choose specific mode, it's still possible by
>> setting the 'options:xdp-mode'.
>>
>> This change additionally changes the API by renaming the configuration
>> knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
>> themselves to be more user-friendly.
>>
>> The full list of currently supported modes:
>>   * native-with-zerocopy - former DRV
>>   * native   - new one, DRV without zero-copy
>>   * generic  - former SKB
>>   * best-effort  - new one, chooses the best available from
>>3 above modes
>>
>> Since 'best-effort' is a default mode, users will not need to
>> explicitely set 'xdp-mode' in most cases.
>>
>> TCP related tests enabled back in system afxdp testsuite, because
>> 'best-effort' will choose 'native' mode for veth interfaces
>> and this mode has no issues with TCP.
>>
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> I'm thinking about adding some tests case and patches depends on
> this API change. If there is no more comment, I will apply to master.

I think, it might be good to wait for some comments from Eelco.

@Eelco, do you have a plan to review the code? Do you agree with the
API update?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.

2019-11-13 Thread Ilya Maximets
On 08.11.2019 15:22, William Tu wrote:
> On Thu, Nov 07, 2019 at 12:36:33PM +0100, Ilya Maximets wrote:
>> Until now there was only two options for XDP mode in OVS: SKB or DRV.
>> i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
>>
>> Devices like 'veth' interfaces in Linux supports native XDP, but
>> doesn't support zero-copy mode.  This case can not be covered by
>> existing API and we have to use slower generic XDP for such devices.
>> There are few more issues, e.g. TCP is not supported in generic XDP
>> mode for veth interfaces due to kernel limitations, however it is
>> supported in native mode.
>>
>> This change introduces ability to use native XDP without zero-copy
>> along with best-effort configuration option that enabled by default.
>> In best-effort case OVS will sequentially try different modes starting
>> from the fastest one and will choose the first acceptable for current
>> interface.  This will guarantee the best possible performance.
>>
>> If user will want to choose specific mode, it's still possible by
>> setting the 'options:xdp-mode'.
>>
>> This change additionally changes the API by renaming the configuration
>> knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
>> themselves to be more user-friendly.
>>
>> The full list of currently supported modes:
>>   * native-with-zerocopy - former DRV
>>   * native   - new one, DRV without zero-copy
>>   * generic  - former SKB
>>   * best-effort  - new one, chooses the best available from
>>3 above modes
> 
> Since we are renaming the mode, in doc, should we tell user the mapping
> of these mode to kernel AF_XDP's mode?
> So let user know generic mode in OVS  = generic SKB in kernel,
> native mode in OVS = native mode without zc...

It might make sense to document that 'generic' uses 'XDP_SKB'
and 'native' uses 'XDP_DRV', but it seems that 'generic'/'native'
terms are widely used in XDP related community and projects, i.e.
should be well known.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 ovn 2/2] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

2019-11-13 Thread Dumitru Ceara
On Tue, Nov 12, 2019 at 8:50 PM Han Zhou  wrote:
>
>
>
>
> On Tue, Nov 12, 2019 at 10:10 AM Dumitru Ceara  wrote:
> >
> > On Tue, Nov 12, 2019 at 6:17 PM Han Zhou  wrote:
> > >
> > >
> > >
> > > On Tue, Nov 12, 2019 at 2:29 AM Dumitru Ceara  wrote:
> > > >
> > > > ARP request and ND NS packets for router owned IPs were being
> > > > flooded in the complete L2 domain (using the MC_FLOOD multicast group).
> > > > However this creates a scaling issue in scenarios where aggregation
> > > > logical switches are connected to more logical routers (~350). The
> > > > logical pipelines of all routers would have to be executed before the
> > > > packet is finally replied to by a single router, the owner of the IP
> > > > address.
> > > >
> > > > This commit limits the broadcast domain by bypassing the L2 Lookup stage
> > > > for ARP requests that will be replied by a single router. The packets
> > > > are forwarded only to the router port that owns the target IP address.
> > > >
> > > > IPs that are owned by the routers and for which this fix applies are:
> > > > - IP addresses configured on the router ports.
> > > > - VIPs.
> > > > - NAT IPs.
> > > >
> > > > Reported-at: https://bugzilla.redhat.com/1756945
> > > > Reported-by: Anil Venkata 
> > > > Signed-off-by: Dumitru Ceara 
> > > >
> > > > ---
> > > > v7:
> > > > - Address Han's comments:
> > > > - Remove flooding for all ARPs received on VLAN networks. To avoid
> > > >   that we now identify self originated (G)ARPs by matching on source
> > > >   MAC address too.
> > > > - Rename REGBIT_NOT_VXLAN to FLAGBIT_NOT_VXLAN.
> > > > - Fix ovn-sb manpage.
> > > > - Split patch in a series of 2:
> > > > - patch1: fixes the get_router_load_balancer_ips() function.
> > > > - patch2: limits the ARP/ND broadcast domain.
> > > > v6:
> > > > - Address Han's comments:
> > > > - remove flooding of ARPs targeting OVN owned IP addresses.
> > > > - update ovn-architecture documentation.
> > > > - rename ARP handling functions.
> > > > - Adapt "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR" autotest to take 
> > > > into
> > > > account the new way of forwarding ARPs.
> > > > - Also, properly deal with ARP packets on VLAN-backed networks.
> > > > v5: Address Numan's comments: update comments & make autotest more
> > > > robust.
> > > > v4: Rebase.
> > > > v3: Properly deal with VXLAN traffic. Address review comments from
> > > > Numan (add autotests). Fix function get_router_load_balancer_ips.
> > > > Rebase -> deal with IPv6 NAT too.
> > > > v2: Move ARP broadcast domain limiting to table S_SWITCH_IN_L2_LKUP to
> > > > address localnet ports too.
> > > > ---
> > > >  northd/ovn-northd.8.xml |   14 ++
> > > >  northd/ovn-northd.c |  230 +++
> > > >  ovn-architecture.7.xml  |   19 +++
> > > >  tests/ovn.at|  307 
> > > > +--
> > > >  4 files changed, 530 insertions(+), 40 deletions(-)
> > > >
> > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > index 0a33dcd..344cc0d 100644
> > > > --- a/northd/ovn-northd.8.xml
> > > > +++ b/northd/ovn-northd.8.xml
> > > > @@ -1005,6 +1005,20 @@ output;
> > > >
> > > >
> > > >
> > > > +Priority-80 flows for each port connected to a logical router
> > > > +matching self originated GARP/ARP request/ND packets. These 
> > > > packets
> > > > +are flooded to the MC_FLOOD which contains all 
> > > > logical
> > > > +ports.
> > > > +  
> > > > +
> > > > +  
> > > > +Priority-75 flows for each IP address/VIP/NAT address owned by 
> > > > a
> > > > +router port connected to the switch. These flows match ARP 
> > > > requests
> > > > +and ND packets for the specific IP addresses.  Matched packets 
> > > > are
> > > > +forwarded only to the router that owns the IP address.
> > > > +  
> > > > +
> > > > +  
> > > >  A priority-70 flow that outputs all packets with an Ethernet 
> > > > broadcast
> > > >  or multicast eth.dst to the MC_FLOOD
> > > >  multicast group.
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index 32f3200..d6beb97 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -210,6 +210,8 @@ enum ovn_stage {
> > > >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[4]"
> > > >  #define REGBIT_SKIP_LOOKUP_NEIGHBOR "reg9[5]"
> > > >
> > > > +#define FLAGBIT_NOT_VXLAN "flags[1] == 0"
> > > > +
> > > >  /* Returns an "enum ovn_stage" built from the arguments. */
> > > >  static enum ovn_stage
> > > >  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline 
> > > > pipeline,
> > > > @@ -1202,6 +1204,34 @@ ovn_port_allocate_key(struct ovn_datapath *od)
> > > >1, (1u << 15) - 1, &od->port_key_hint);
> > > >  }
> > > >
> > > > +/* Returns true if the logical switch port 'enabled' column

[ovs-dev] [PATCH v4 ovn 3/3] ovn-detrace: Add support for other types of SB cookies.

2019-11-13 Thread Dumitru Ceara
Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow
translations.") added cookies for Port_Binding, Mac_Binding,
Multicast_Group, Chassis records to the OpenfFlow entries they
generate.

Add support for parsing such cookies in ovn-detrace too.
Also:
- refactor ovn-detrace to allow a more generic way of defining
  cookie handlers.
- properly handle potential collisions between cookie -> UUID
  mappings.
- change print statements to print() calls.
- add custom printing functions to simplify prefixing outputs.

Acked-by: Mark Michelson 
Signed-off-by: Dumitru Ceara 
---
 utilities/ovn-detrace.in |  194 --
 1 file changed, 134 insertions(+), 60 deletions(-)

diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index accbcd2..c645658 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -14,6 +14,9 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+from __future__ import print_function
+
+import functools
 import getopt
 import os
 import re
@@ -37,7 +40,7 @@ argv0 = sys.argv[0]
 
 
 def usage():
-print """\
+print("""\
 %(argv0)s:
 usage: %(argv0)s < FILE
 where FILE is output from ovs-appctl ofproto/trace.
@@ -47,9 +50,21 @@ The following options are also available:
   -V, --version   display version information
   --ovnsb=DATABASEuse DATABASE as southbound DB
   --ovnnb=DATABASEuse DATABASE as northbound DB\
-""" % {'argv0': argv0}
+""" % {'argv0': argv0})
 sys.exit(0)
 
+print_p = functools.partial(print, '  * ')
+print_h = functools.partial(print, '   * ')
+
+def datapath_str(datapath):
+return '"%s" (%s)' % (str(datapath.external_ids.get('name')),
+  datapath.uuid)
+
+def chassis_str(chassis):
+if len(chassis) == 0:
+return ''
+ch = chassis[0]
+return 'chassis-name "%s", chassis-str "%s"' % (ch.name, ch.hostname)
 
 class OVSDB(object):
 @staticmethod
@@ -87,59 +102,112 @@ class OVSDB(object):
 def get_table(self, table_name):
 return self._idl_conn.tables[table_name]
 
-def _find_row(self, table_name, find):
-return next(
-(row for row in self.get_table(table_name).rows.values()
- if find(row)), None)
+def _find_row(self, table_name, find_fn):
+return filter(find_fn, self.get_table(table_name).rows.values())
 
-def _find_row_by_name(self, table_name, value):
+def _find_rows_by_name(self, table_name, value):
 return self._find_row(table_name, lambda row: row.name == value)
 
-def find_row_by_partial_uuid(self, table_name, value):
-return self._find_row(table_name, lambda row: value in str(row.uuid))
-
-
-def get_lflow_from_cookie(ovnsb_db, cookie):
-return ovnsb_db.find_row_by_partial_uuid('Logical_Flow', cookie)
-
-
-def print_lflow(lflow, prefix):
-ldp_uuid = lflow.logical_datapath.uuid
-ldp_name = str(lflow.logical_datapath.external_ids.get('name'))
-
-print '%sLogical datapath: "%s" (%s) [%s]' % (prefix,
-  ldp_name,
-  ldp_uuid,
-  lflow.pipeline)
-print "%sLogical flow: table=%s (%s), priority=%s, " \
-  "match=(%s), actions=(%s)" % (prefix,
-lflow.table_id,
-lflow.external_ids.get('stage-name'),
-lflow.priority,
-str(lflow.match).strip('"'),
-str(lflow.actions).strip('"'))
-
-
-def print_lflow_nb_hint(lflow, prefix, ovnnb_db):
-external_ids = lflow.external_ids
-if external_ids.get('stage-name') in ['ls_in_acl',
-  'ls_out_acl']:
-acl_hint = external_ids.get('stage-hint')
-if not acl_hint:
-return
-acl = ovnnb_db.find_row_by_partial_uuid('ACL', acl_hint)
-if not acl:
-return
-output = "%sACL: %s, priority=%s, " \
- "match=(%s), %s" % (prefix,
- acl.direction,
- acl.priority,
- acl.match.strip('"'),
- acl.action)
-if acl.log:
-output += ' (log)'
-print output
-
+def find_rows_by_partial_uuid(self, table_name, value):
+return self._find_row(table_name,
+  lambda row: str(row.uuid).startswith(value))
+
+class CookieHandler(object):
+def __init__(self, db, table):
+self._db = db
+self._table = table
+
+def get_records(self, cookie):
+# Adjust cookie to include leading zeroes if needed.
+cookie = cookie.zfill(8)
+return self._db.fi

[ovs-dev] [PATCH v4 ovn 2/3] ovn-detrace: Fix line parsing.

2019-11-13 Thread Dumitru Ceara
The script was not properly parsing rows containing flows from
tables 0-9 because ofproto/trace adds leading whitespace for
pretty printing flows.
Also, add a check to make sure that the cookie refers to a
Logical_Flow before trying to print the record.

Signed-off-by: Dumitru Ceara 
---
 utilities/ovn-detrace.in |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index 9471e37..accbcd2 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -184,7 +184,7 @@ def main():
 ovsdb_ovnnb = OVSDB(ovnnb_db, 'OVN_Northbound')
 
 regex_cookie = re.compile(r'^.*cookie 0x([0-9a-fA-F]+)')
-regex_table_id = re.compile(r'^[0-9]+\.')
+regex_table_id = re.compile(r'^ *[0-9]+\.')
 cookie = None
 while True:
 line = sys.stdin.readline()
@@ -192,8 +192,9 @@ def main():
 # print lflow info when the current flow block ends
 if regex_table_id.match(line) or line.strip() == '':
 lflow = get_lflow_from_cookie(ovsdb_ovnsb, cookie)
-print_lflow(lflow, "  * ")
-print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb)
+if lflow:
+print_lflow(lflow, "  * ")
+print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb)
 cookie = None
 
 print line.strip()

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 ovn 1/3] ovn-detrace: Fix rundir.

2019-11-13 Thread Dumitru Ceara
After the separation of OVS and OVN rundirs, the ovn-detrace script
hasn't been updated to use the OVN rundir instead of the OVS one.

Acked-by: Mark Michelson 
Signed-off-by: Dumitru Ceara 
---
 utilities/ovn-detrace.in |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index c842adc..9471e37 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -169,16 +169,16 @@ def main():
  "(use --help for help)\n" % argv0)
 sys.exit(1)
 
-ovs_rundir = os.getenv('OVS_RUNDIR', '@RUNDIR@')
+ovn_rundir = os.getenv('OVN_RUNDIR', '@OVN_RUNDIR@')
 if not ovnsb_db:
 ovnsb_db = os.getenv('OVN_SB_DB')
 if not ovnsb_db:
-ovnsb_db = 'unix:%s/ovnsb_db.sock' % ovs_rundir
+ovnsb_db = 'unix:%s/ovnsb_db.sock' % ovn_rundir
 
 if not ovnnb_db:
 ovnnb_db = os.getenv('OVN_NB_DB')
 if not ovnnb_db:
-ovnnb_db = 'unix:%s/ovnnb_db.sock' % ovs_rundir
+ovnnb_db = 'unix:%s/ovnnb_db.sock' % ovn_rundir
 
 ovsdb_ovnsb = OVSDB(ovnsb_db, 'OVN_Southbound')
 ovsdb_ovnnb = OVSDB(ovnnb_db, 'OVN_Northbound')

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 ovn 0/3] Improve ovn-detrace support for parsing OpenFlow cookies.

2019-11-13 Thread Dumitru Ceara
Commit eb25a7da639e ("Improve debuggability of OVN to OpenFlow translations.")
added support for more types of cookies (partial SB uuids) that are stored
in OpenFlow entries created by ovn-controller.

The last patch in this series implements support for parsing all these
different cookies in ovn-detrace too.

The first patch is a bug fix required as ovn-detrace was broken after
moving OVN to its own separate rundir.

The second patch fixes line parsing in ovn-detrace.

CC: Han Zhou 
Signed-off-by: Dumitru Ceara 

Dumitru Ceara (3):
  ovn-detrace: Fix rundir.
  ovn-detrace: Fix line parsing.
  ovn-detrace: Add support for other types of SB cookies.


 utilities/ovn-detrace.in |  201 --
 1 file changed, 138 insertions(+), 63 deletions(-)


---
v4:
- Address Han's comments:
  - Fix printing of last logical flow information.
- Added Acked-by from Mark on patches 1 & 3. I didn't add it to patch 2
  because I significantly changed patch 2 in v4.
v3:
- Remove stray "%s".
- Rename pprint to print_p to avoid name clashes with the library function.
v2:
- Address Mark's comments:
  - properly handle potential collisions between cookie -> UUID mappings.
  - when looking up SB records by cookie, match on the first part of the
UUID.
- Further refactor ovn-detrace to simplify prefixing outputs.
- Change print statements to print() calls.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Fix use of dangling pointers in I-P runtime_data.

2019-11-13 Thread Dumitru Ceara
On Fri, Nov 8, 2019 at 8:27 PM Han Zhou  wrote:
>
>
>
> On Fri, Nov 8, 2019 at 11:22 AM Han Zhou  wrote:
> >
> > 1. storage data and the void *arg of init() breaks the engine node data 
> > encapsulation.
> > 2. engine_node_valid(&en_flow_output, engine_run_id) is not needed? Should 
> > use storage to access instead?
> > 3. refactor of engine is good but better to be a separate commit
> > 4. we can have a new interface: engine_get_data(), which returns data if it 
> > is valid. we should never expose the data directly. We should init the 
> > engine node with dynamically allocated engine data structure (and remember 
> > to free during destroy)
>
> Oops! please ignore the above part since it was draft and I forgot to delete 
> after editing the formal response, mostly redundant :-)
> Real response started here =>
> >
> > Hi Dumitru,
> >
> > Sorry for late response.
> > On Mon, Nov 4, 2019 at 4:54 AM Dumitru Ceara  wrote:
> > >
> > > The incremental processing engine might stop a run before the
> > > en_runtime_data node is processed. In such cases the ed_runtime_data
> > > fields might contain pointers to already deleted SB records. For
> > > example, if a port binding corresponding to a patch port is removed from
> > > the SB database and the incremental processing engine aborts before the
> > > en_runtime_data node is processed then the corresponding local_datapath
> > > hashtable entry in ed_runtime_data is stale and will store a pointer to
> > > the already freed sbrec_port_binding record.
> > >
> > > This will cause invalid memory accesses in various places (e.g.,
> > > pinctrl_run() -> prepare_ipv6_ras()).
> > >
> > > To fix the issue we need a way to track how each node was processed
> > > during an engine run. This commit transforms the 'changed' field in
> > > struct engine_node in a 'state' field. Possible node states are:
> > > - "New": the node is not yet initialized.
> > > - "Stale": data in the node is not up to date with the DB.
> > > - "Updated": data in the node is valid but was updated during
> > >   the last run of the engine.
> > > - "Valid": data in the node is valid and didn't change during
> > >   the last run of the engine.
> > > - "Aborted": during the last run, processing was aborted for
> > >   this node.
> > > - "Destroyed": the node was already cleaned up.
> > >
> > > We also add a separation between engine node data that can be accessed
> > > at any time (regardless if the last engine run was successful or not)
> > > and data that may be accessed only if the nodes are up to date. This
> > > helps avoiding custom "engine_node_valid" handlers for different
> > > nodes.
> > >
> > > The commit also simplifies the logic of calling engine_run and
> > > engine_need_run in order to reduce the number of external variables
> > > required to track the result of the last engine execution.
> > >
> > > Functions that need to be called from the main loop and depend on
> > > various data contents of the engine's nodes are now called only if
> > > the data is up to date.
> > >
> > > CC: Han Zhou 
> > > Fixes: ca278d98a4f5 ("ovn-controller: Initial use of incremental engine - 
> > > quiet mode.")
> > > Signed-off-by: Dumitru Ceara 
> > >
> > > ---
> > > v2: Address Han's comments:
> > > - call engine_node_valid() in all the places where node local data is
> > >   used.
> > > - move out "global" data outside the engine nodes. Make a clear
> > >   separation between data that can be safely used at any time and data
> > >   that can be used only when the engine run was successful.
> >
> > I am concerned with this kind of separation of *global* data, which breaks 
> > the data encapsulation of engine node, and can easily result in hidden 
> > dependency. As you know the main purpose of the I-P engine is to force 
> > explicit dependency exposed between different engine nodes thus ensure the 
> > correctness (at least it helps to ensure) of incremental processing.
> >
> > Here is my proposal to address the problem with better encapsulation.
> >
> > Firstly, let's avoid direct engine data access outside of engine module. At 
> > engine node construction, instead of using reference of stack variable 
> > (such as struct ed_type_runtime_data ed_runtime_data), we can allocate the 
> > memory in the engine node's init() interface, and free in the cleanup() 
> > interface. This way, there will be no way to directly access engine data 
> > like &ed_runtime_data.local_datapaths.
> >
> > Secondly, let's add a engine module interface engine_get_data() to retrieve 
> > *and validate* data for an engine node:
> > void *
> > engine_get_data(struct engine_node *node, uint64_t run_id)
> > {
> > if (engine_node_valid(node, run_id)) {
> > return node->data;
> > }
> > return NULL;
> > }
> >
> > This should be used whenever an engine node data need to be accessed. (It 
> > may be even better to use node name as input instead of node structure, but 
> > it seems ok to me either way)
> >
> > Now let

Re: [ovs-dev] [PATCH v3 ovn 2/3] ovn-detrace: Fix line parsing.

2019-11-13 Thread Dumitru Ceara
On Tue, Nov 12, 2019 at 11:16 PM Han Zhou  wrote:
>
>
> Hi Dumitru,
>
> Thanks for improving the tool.

Hi Han,

Thanks for the review.

>
> On Tue, Nov 12, 2019 at 8:47 AM Dumitru Ceara  wrote:
> >
> > The script was never parsing the first cookie in the input. Also, add a
> > check to make sure that the cookie refers to a Logical_Flow before
> > trying to print the record.
> >
> > Signed-off-by: Dumitru Ceara 
> > ---
> >  utilities/ovn-detrace.in |   24 ++--
> >  1 file changed, 14 insertions(+), 10 deletions(-)
> >
> > diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
> > index 9471e37..34b6b0e 100755
> > --- a/utilities/ovn-detrace.in
> > +++ b/utilities/ovn-detrace.in
> > @@ -188,22 +188,26 @@ def main():
> >  cookie = None
> >  while True:
> >  line = sys.stdin.readline()
> > +
> > +if line == '':
> > +break
> > +
> > +line = line.strip()
> > +
> >  if cookie:
> >  # print lflow info when the current flow block ends
> > -if regex_table_id.match(line) or line.strip() == '':
> > +if regex_table_id.match(line):
> >  lflow = get_lflow_from_cookie(ovsdb_ovnsb, cookie)
> > -print_lflow(lflow, "  * ")
> > -print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb)
> > -cookie = None
> > +if lflow:
> > +print_lflow(lflow, "  * ")
> > +print_lflow_nb_hint(lflow, "* ", ovsdb_ovnnb)
> > +cookie = None
> >
> > -print line.strip()
> > -if line == "":
> > -break
> > +print line
> >
> >  m = regex_cookie.match(line)
> > -if not m:
> > -continue
> > -cookie = m.group(1)
> > +if m:
> > +cookie = m.group(1)
> >
> >
> >  if __name__ == "__main__":
> >
>
> Maybe I missed something here, but the original logic seems to be correct to 
> me. It always parses the current line - if there is a cookie, it is parsed. 
> And then it prints the previous logical flow information using the last 
> cookie that was parsed when a *new* flow or line break is encountered. And 
> finally, it ensures the last logical flow information is printed because the 
> "break" is AFTER the "if cookie" block.

With the following (simplified) input the original code doesn't work:
$ cat /tmp/test_input
 0. in_port=1, priority 100, cookie 0x6b55ca8d
 8. reg14=0x2,metadata=0x3, priority 50, cookie 0xf0b23940
 9. metadata=0x3, priority 0, cookie 0x892dcbdd
10. metadata=0x3, priority 0, cookie 0xf35a08d5
11. ip,metadata=0x3, priority 100, cookie 0x627415f1

In the first three lines we don't match regex_table_id.match(line)
because of the leading whitespace.

> Now with the change, it seems the last logical flow (related to the last 
> cookie) may not be printed, if it happens to be the last line. (although it 
> is not a problem if the last line doesn't has cookie)

My bad, I'll send a v4 taking care of this case too.

Thanks,
Dumitru

>
> Thanks,
> Han

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev