Re: [ovs-dev] OVN Soft Freeze

2020-01-29 Thread Han Zhou
On Tue, Jan 28, 2020 at 1:37 PM Han Zhou  wrote:
>
>
>
> On Tue, Jan 21, 2020 at 3:08 AM Mark Michelson 
wrote:
> >
> > Hi all,
> >
> > OVN has entered its "soft freeze" state. That means that any new
> > features that should be added prior to the next release need to have
> > reviews posted already.
> >
> > At this point, if anyone has any specific new features they want
> > reviewed prior to the release, please reply on this thread.
> >
> > Hard freeze (and creation of the branch) will be on 1 February.
> >
> > Thanks,
> > Mark Michelson
>
> Hi Mark,
>
> I sent v3 for the OVN interconnection series yesterday. I hope it can be
> included in this release.
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=155577
>
> Thanks,
> Han
>

Hi Mark,

One more patch before releasing just to add the news about ECMP support.
https://patchwork.ozlabs.org/patch/1231255/

Thanks,
Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] NEWS: Add the news for ECMP support.

2020-01-29 Thread Han Zhou
Signed-off-by: Han Zhou 
---
 NEWS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/NEWS b/NEWS
index 2b8cd6f..d4f5dee 100644
--- a/NEWS
+++ b/NEWS
@@ -6,6 +6,7 @@ Post-OVS-v2.12.0
    - Added Stateless Floating IP support in OVN.
    - Added Forwarding Group support in OVN.
    - Added support for MLD Snooping and MLD Querier.
+   - Added support for ECMP routes in OVN router.
 
 v2.12.0 - 03 Sep 2019
 -
-- 
2.1.0



Re: [ovs-dev] [PATCH ovn v1] ovn-openstack.rst: Account for networking-ovn-migration

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Flavio Fernandes, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 82 characters long (recommended limit is 79)
#56 FILE: Documentation/tutorials/ovn-openstack.rst:163:
  Support for `Centos 7 in Devstack 
`_

WARNING: Line is 80 characters long (recommended limit is 79)
#57 FILE: Documentation/tutorials/ovn-openstack.rst:164:
  is going away, but you can still use it. Especially while Centos 8 support

WARNING: Line is 86 characters long (recommended limit is 79)
#58 FILE: Documentation/tutorials/ovn-openstack.rst:165:
  is not finished. The one important caveat for making Centos 7 work with Devstack

Lines checked: 143, Warnings: 3, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


[ovs-dev] [PATCH] userspace: Enable TSO support for non-DPDK.

2020-01-29 Thread William Tu
In some cases we want to use the userspace datapath without the
DPDK library, e.g. with AF_XDP. This patch enables TSO (TCP
Segmentation Offload) support for non-DPDK builds. I measured the
performance using:
  iperf3 -c (ns0) -> veth peer -> OVS -> veth peer -> iperf3 -s (ns1)
and got around 6 Gbps, similar to the result with DPDK.

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/643599592
Signed-off-by: William Tu 
---
 lib/dp-packet.h | 95 -
 lib/userspace-tso.c |  9 ++---
 2 files changed, 62 insertions(+), 42 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 69ae5dfac7f0..d0a964b87fb8 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -53,7 +53,25 @@ enum OVS_PACKED_ENUM dp_packet_source {
 enum dp_packet_offload_mask {
 DP_PACKET_OL_RSS_HASH_MASK  = 0x1, /* Is the 'rss_hash' valid? */
 DP_PACKET_OL_FLOW_MARK_MASK = 0x2, /* Is the 'flow_mark' valid? */
+DP_PACKET_OL_RX_L4_CKSUM_BAD = 1 << 3,
+DP_PACKET_OL_RX_IP_CKSUM_BAD = 1 << 4,
+DP_PACKET_OL_RX_L4_CKSUM_GOOD = 1 << 5,
+DP_PACKET_OL_RX_IP_CKSUM_GOOD = 1 << 6,
+DP_PACKET_OL_TX_TCP_SEG = 1 << 7,
+DP_PACKET_OL_TX_IPV4 = 1 << 8,
+DP_PACKET_OL_TX_IPV6 = 1 << 9,
+DP_PACKET_OL_TX_TCP_CKSUM = 1 << 10,
+DP_PACKET_OL_TX_UDP_CKSUM = 1 << 11,
+DP_PACKET_OL_TX_SCTP_CKSUM = 1 << 12,
 };
+
+#define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CKSUM | \
+ DP_PACKET_OL_TX_UDP_CKSUM | \
+ DP_PACKET_OL_TX_SCTP_CKSUM)
+#define DP_PACKET_OL_RX_IP_CKSUM_MASK (DP_PACKET_OL_RX_IP_CKSUM_GOOD | \
+   DP_PACKET_OL_RX_IP_CKSUM_BAD)
+#define DP_PACKET_OL_RX_L4_CKSUM_MASK (DP_PACKET_OL_RX_L4_CKSUM_GOOD | \
+   DP_PACKET_OL_RX_L4_CKSUM_BAD)
 #else
 /* DPDK mbuf ol_flags that are not really an offload flags.  These are mostly
  * related to mbuf memory layout and OVS should not touch/clear them. */
@@ -737,82 +755,79 @@ dp_packet_set_allocated(struct dp_packet *b, uint16_t s)
 b->allocated_ = s;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_is_tso(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_is_tso(const struct dp_packet *b)
 {
-return false;
+return !!(b->ol_flags & DP_PACKET_OL_TX_TCP_SEG);
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_is_ipv4(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_is_ipv4(const struct dp_packet *b)
 {
-return false;
+return !!(b->ol_flags & DP_PACKET_OL_TX_IPV4);
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline uint64_t
-dp_packet_hwol_l4_mask(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_mask(const struct dp_packet *b)
 {
-return 0;
+return b->ol_flags & DP_PACKET_OL_TX_L4_MASK;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_l4_is_tcp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_tcp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_TCP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_l4_is_udp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_udp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_UDP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline bool
-dp_packet_hwol_l4_is_sctp(const struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_l4_is_sctp(const struct dp_packet *b)
 {
-return false;
+return (b->ol_flags & DP_PACKET_OL_TX_L4_MASK) ==
+DP_PACKET_OL_TX_SCTP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline void
-dp_packet_hwol_set_tx_ipv4(struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_set_tx_ipv4(struct dp_packet *b)
 {
+b->ol_flags |= DP_PACKET_OL_TX_IPV4;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline void
-dp_packet_hwol_set_tx_ipv6(struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_set_tx_ipv6(struct dp_packet *b)
 {
+b->ol_flags |= DP_PACKET_OL_TX_IPV6;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline void
-dp_packet_hwol_set_csum_tcp(struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_set_csum_tcp(struct dp_packet *b)
 {
+b->ol_flags |= DP_PACKET_OL_TX_TCP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. */
 static inline void
-dp_packet_hwol_set_csum_udp(struct dp_packet *b OVS_UNUSED)
+dp_packet_hwol_set_csum_udp(struct dp_packet *b)
 {
+b->ol_flags |= DP_PACKET_OL_TX_UDP_CKSUM;
 }
 
-/* There are no implementation when not DPDK enabled datapath. 
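
The flag tests in this patch can be tried outside OVS. Below is a minimal
standalone sketch, not the OVS implementation: the bit values are copied
from the patch, while `struct pkt` and the `pkt_*` helper names are
hypothetical stand-ins for `struct dp_packet` and the `dp_packet_hwol_*`
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the patch. */
enum {
    OL_TX_TCP_SEG    = 1 << 7,
    OL_TX_IPV4       = 1 << 8,
    OL_TX_TCP_CKSUM  = 1 << 10,
    OL_TX_UDP_CKSUM  = 1 << 11,
    OL_TX_SCTP_CKSUM = 1 << 12,
};

#define OL_TX_L4_MASK (OL_TX_TCP_CKSUM | OL_TX_UDP_CKSUM | OL_TX_SCTP_CKSUM)

/* Hypothetical stand-in for struct dp_packet: only the flags word. */
struct pkt {
    uint32_t ol_flags;
};

/* Mark the packet as needing TCP segmentation plus a TCP checksum. */
static inline void
pkt_set_tso_tcp(struct pkt *p)
{
    assert(p != NULL);
    p->ol_flags |= OL_TX_TCP_SEG | OL_TX_TCP_CKSUM;
}

/* True if the TCP segmentation bit is set. */
static inline bool
pkt_is_tso(const struct pkt *p)
{
    return !!(p->ol_flags & OL_TX_TCP_SEG);
}

/* True only if TCP is the sole L4 checksum flag that is set. */
static inline bool
pkt_l4_is_tcp(const struct pkt *p)
{
    return (p->ol_flags & OL_TX_L4_MASK) == OL_TX_TCP_CKSUM;
}

/* True only if UDP is the sole L4 checksum flag that is set. */
static inline bool
pkt_l4_is_udp(const struct pkt *p)
{
    return (p->ol_flags & OL_TX_L4_MASK) == OL_TX_UDP_CKSUM;
}
```

Because the L4 helpers compare the masked value for equality, a packet
with both the TCP and UDP checksum bits set matches neither helper.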

[ovs-dev] [PATCH] netdev-linux: Prepend the std packet in the TSO packet

2020-01-29 Thread Flavio Leitner
TSO packets are usually close to 50k or 60k bytes long, so change
the approach to copy fewer bytes when receiving a packet from the
kernel. Instead of extending the MTU-sized packet and appending
the remaining TSO data from the TSO buffer, allocate a TSO packet
with enough headroom to prepend the std packet data.
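
The prepend idea can be illustrated with a small standalone sketch. It
assumes a hypothetical `struct buf` with `buf_put()` and `buf_push()`
helpers, loosely modeled on what a packet buffer with headroom offers; it
is not the code in this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified packet buffer. 'base' is the allocation and
 * 'data' points at the first byte in use. */
struct buf {
    unsigned char *base;
    unsigned char *data;
    size_t size;        /* Bytes in use, starting at 'data'. */
    size_t allocated;   /* Total bytes allocated at 'base'. */
};

/* Allocate room for 'size' bytes of data with 'headroom' spare bytes
 * in front of it. */
static struct buf *
buf_new_with_headroom(size_t size, size_t headroom)
{
    struct buf *b = malloc(sizeof *b);
    assert(b != NULL);
    b->base = malloc(headroom + size);
    assert(b->base != NULL);
    b->data = b->base + headroom;
    b->size = 0;
    b->allocated = headroom + size;
    return b;
}

static size_t
buf_headroom(const struct buf *b)
{
    return (size_t) (b->data - b->base);
}

/* Append 'n' bytes at the tail (the large TSO remainder). */
static void
buf_put(struct buf *b, const void *src, size_t n)
{
    assert(buf_headroom(b) + b->size + n <= b->allocated);
    memcpy(b->data + b->size, src, n);
    b->size += n;
}

/* Prepend 'n' bytes into the headroom (the small MTU-sized part). */
static void
buf_push(struct buf *b, const void *src, size_t n)
{
    assert(n <= buf_headroom(b));
    b->data -= n;
    memcpy(b->data, src, n);
    b->size += n;
}

static void
buf_delete(struct buf *b)
{
    free(b->base);
    free(b);
}
```

With this layout the large remainder stays where it was received, and
only the MTU-sized head is copied into the headroom.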

Suggested-by: Ben Pfaff 
Signed-off-by: Flavio Leitner 
---
 lib/dp-packet.c|   8 +--
 lib/dp-packet.h|   2 +
 lib/netdev-linux-private.h |   3 +-
 lib/netdev-linux.c | 117 ++---
 4 files changed, 78 insertions(+), 52 deletions(-)

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 8dfedcb7c..cd2623500 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -243,8 +243,8 @@ dp_packet_copy__(struct dp_packet *b, uint8_t *new_base,
 
 /* Reallocates 'b' so that it has exactly 'new_headroom' and 'new_tailroom'
  * bytes of headroom and tailroom, respectively. */
-static void
-dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
+void
+dp_packet_resize(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
 {
 void *new_base, *new_data;
 size_t new_allocated;
@@ -297,7 +297,7 @@ void
 dp_packet_prealloc_tailroom(struct dp_packet *b, size_t size)
 {
 if (size > dp_packet_tailroom(b)) {
-dp_packet_resize__(b, dp_packet_headroom(b), MAX(size, 64));
+dp_packet_resize(b, dp_packet_headroom(b), MAX(size, 64));
 }
 }
 
@@ -308,7 +308,7 @@ void
 dp_packet_prealloc_headroom(struct dp_packet *b, size_t size)
 {
 if (size > dp_packet_headroom(b)) {
-dp_packet_resize__(b, MAX(size, 64), dp_packet_tailroom(b));
+dp_packet_resize(b, MAX(size, 64), dp_packet_tailroom(b));
 }
 }
 
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 69ae5dfac..9a9d35183 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -152,6 +152,8 @@ struct dp_packet *dp_packet_clone_with_headroom(const struct dp_packet *,
 struct dp_packet *dp_packet_clone_data(const void *, size_t);
 struct dp_packet *dp_packet_clone_data_with_headroom(const void *, size_t,
  size_t headroom);
+void dp_packet_resize(struct dp_packet *b, size_t new_headroom,
+  size_t new_tailroom);
 static inline void dp_packet_delete(struct dp_packet *);
 
 static inline void *dp_packet_at(const struct dp_packet *, size_t offset,
diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
index 143616ca8..e5c13fc37 100644
--- a/lib/netdev-linux-private.h
+++ b/lib/netdev-linux-private.h
@@ -44,7 +44,8 @@ struct netdev_rxq_linux {
 struct netdev_rxq up;
 bool is_tap;
 int fd;
-char *aux_bufs[NETDEV_MAX_BURST]; /* Batch of preallocated TSO buffers. */
+struct dp_packet *aux_bufs[NETDEV_MAX_BURST]; /* Preallocated TSO
+ packets. */
 };
 
 int netdev_linux_construct(struct netdev *);
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 6add3e2fc..c30a22287 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1052,15 +1052,6 @@ static struct netdev_rxq *
 netdev_linux_rxq_alloc(void)
 {
 struct netdev_rxq_linux *rx = xzalloc(sizeof *rx);
-if (userspace_tso_enabled()) {
-int i;
-
-/* Allocate auxiliay buffers to receive TSO packets. */
-for (i = 0; i < NETDEV_MAX_BURST; i++) {
-rx->aux_bufs[i] = xmalloc(LINUX_RXQ_TSO_MAX_LEN);
-}
-}
-
 return &rx->up;
 }
 
@@ -1172,7 +1163,7 @@ netdev_linux_rxq_destruct(struct netdev_rxq *rxq_)
 }
 
 for (i = 0; i < NETDEV_MAX_BURST; i++) {
-free(rx->aux_bufs[i]);
+dp_packet_delete(rx->aux_bufs[i]);
 }
 }
 
@@ -1238,13 +1229,18 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
 virtio_net_hdr_size = 0;
 }
 
-std_len = VLAN_ETH_HEADER_LEN + mtu + virtio_net_hdr_size;
+/* The length here needs to be accounted in the same way when the
+ * aux_buf is allocated so that it can be prepended to TSO buffer. */
+std_len = virtio_net_hdr_size + VLAN_ETH_HEADER_LEN + mtu;
 for (i = 0; i < NETDEV_MAX_BURST; i++) {
  buffers[i] = dp_packet_new_with_headroom(std_len, DP_NETDEV_HEADROOM);
  iovs[i][IOV_PACKET].iov_base = dp_packet_data(buffers[i]);
  iovs[i][IOV_PACKET].iov_len = std_len;
- iovs[i][IOV_AUXBUF].iov_base = rx->aux_bufs[i];
- iovs[i][IOV_AUXBUF].iov_len = LINUX_RXQ_TSO_MAX_LEN;
+ if (iovlen == IOV_TSO_SIZE) {
+ iovs[i][IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
+ iovs[i][IOV_AUXBUF].iov_len = dp_packet_size(rx->aux_bufs[i]);
+ }
+
  mmsgs[i].msg_hdr.msg_name = NULL;
  mmsgs[i].msg_hdr.msg_namelen = 0;
  mmsgs[i].msg_hdr.msg_iov = iovs[i];
@@ -1268,6 +1264,8 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
 }
 
 for (i = 0; i < retval; 

[ovs-dev] [PATCH ovn v1] ovn-openstack.rst: Account for networking-ovn-migration

2020-01-29 Thread Flavio Fernandes
The networking-ovn repo has been migrated into Neutron [0]
as of Ussuri release. This change implements the necessary
updates to the OVN OpenStack tutorial.

Other minor changes here include commands needed to make
Devstack work with Centos 7, as well as the removal of
workarounds that are no longer needed.

[0]: https://review.opendev.org/#/c/658414/19/specs/ussuri/ml2ovs-ovn-convergence.rst

Signed-off-by: Flavio Fernandes 
---
 Documentation/tutorials/ovn-openstack.rst | 51 ---
 1 file changed, 18 insertions(+), 33 deletions(-)

diff --git a/Documentation/tutorials/ovn-openstack.rst b/Documentation/tutorials/ovn-openstack.rst
index 3ef052396..2e4f63404 100644
--- a/Documentation/tutorials/ovn-openstack.rst
+++ b/Documentation/tutorials/ovn-openstack.rst
@@ -43,7 +43,7 @@ potential users to understand how OVN works and how to debug and
 troubleshoot it.
 
 In addition to new material, this tutorial incorporates content from
-``testing.rst`` in OpenStack networking-ovn, by Russell Bryant and
+``ovn_devstack.rst`` in OpenStack neutron, by Russell Bryant and
 others.  Without that example, this tutorial could not have been
 written.
 
@@ -60,7 +60,7 @@ packaging for developers, in a way that allows you to follow along
 with the tutorial in full.
 
 Unless you have a spare computer laying about, it's easiest to install
-DevStacck in a virtual machine.  This tutorial was built using a VM
+DevStack in a virtual machine.  This tutorial was built using a VM
 implemented by KVM and managed by virt-manager.  I recommend
 configuring the VM configured for the x86-64 architecture, 6 GB RAM, 2
 VCPUs, and a 20 GB virtual disk.
@@ -102,7 +102,7 @@ Here are step-by-step instructions to get started:
 
 1. Install a VM.
 
-   I tested these instructions with Centos 7.3.  Download the "minimal
+   I tested these instructions with Centos 7.6.  Download the "minimal
install" ISO and booted it.  The install is straightforward.  Be
sure to enable networking, and set a host name, such as
"ovn-devstack-1".  Add a regular (non-root) user, and check the box
@@ -160,6 +160,13 @@ Here are step-by-step instructions to get started:
 
.. note::
 
+  Support for `Centos 7 in Devstack 
`_
+  is going away, but you can still use it. Especially while Centos 8 support
+  is not finished. The one important caveat for making Centos 7 work with Devstack
+  is that you will explicitly have to install these packages as well::
+
+   $ sudo yum install python3 python3-devel
+
   If you installed a 32-bit i386 guest (against the advice above),
   install a non-PAE kernel and reboot into it at this point::
 
@@ -169,12 +176,12 @@ Here are step-by-step instructions to get started:
   Be sure to select the non-PAE kernel from the list at boot.
   Without this step, DevStack will fail to install properly later.
 
-3. Get copies of DevStack and OVN and set them up::
+3. Get copies of DevStack and Neutron and set them up::
 
- $ git clone http://git.openstack.org/openstack-dev/devstack.git
- $ git clone http://git.openstack.org/openstack/networking-ovn.git
+ $ git clone https://git.openstack.org/openstack-dev/devstack.git
+ $ git clone https://git.openstack.org/openstack/neutron.git
  $ cd devstack
- $ cp ../networking-ovn/devstack/local.conf.sample local.conf
+ $ cp ../neutron/devstack/ovn-local.conf.sample local.conf
 
.. note::
 
@@ -219,16 +226,7 @@ Here are step-by-step instructions to get started:
the alternative command-line interfaces because they are easier to
explain and to cut and paste.
 
-5. As of this writing, you need to run the following to fix a problem
-   with using VM consoles from the OpenStack web instance::
-
- $ (cd /opt/stack/noVNC && git checkout v0.6.0)
-
-   See
-   https://serenity-networks.com/how-to-fix-setkeycodes-00-and-unknown-key-pressed-console-errors-on-openstack/
-   for more details.
-
-6. The firewall in the VM by default allows SSH access but not HTTP.
+5. The firewall in the VM by default allows SSH access but not HTTP.
You will probably want HTTP access to use the OpenStack web
interface.  The following command enables that.  (It also enables
every other kind of network access, so if you're concerned about
@@ -240,7 +238,7 @@ Here are step-by-step instructions to get started:
 
(You need to re-run this if you reboot the VM.)
 
-7. To use OpenStack command line utilities in the tutorial, run::
+6. To use OpenStack command line utilities in the tutorial, run::
 
  $ . ~/devstack/openrc admin
 
@@ -1331,20 +1329,6 @@ with an IP address from the "private" network, then we create a
 floating IP address on the "public" network, then we associate the
 port with the floating IP address.
 
-As of this writing, you may need to run the following to fix a
-problem with associating a logical port of router with the external
-gateway::
-
- 

Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Yifeng Sun
Got it. Thanks.
Yifeng

On Wed, Jan 29, 2020 at 3:04 PM Flavio Leitner  wrote:
>
> On Wed, Jan 29, 2020 at 02:42:27PM -0800, Yifeng Sun wrote:
> > Hi Flavio,
>
> Hi Yifeng, thanks for looking into this.
>
> > I found this piece of code in kernel's drivers/net/virtio_net.c and
> > its function receive_buf():
> > if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
> > skb->ip_summed = CHECKSUM_UNNECESSARY;
> > My understanding is that vhost_user needs to set flag
> > VIRTIO_NET_HDR_F_DATA_VALID so that
> > guest's kernel will skip packet's checksum validation.
> >
> > Then I looked through dpdk's source code but didn't find any place
> > that sets this flag. So I made
> > some changes as below, and TCP starts working between 2 VMs without
> > any kernel change.
> >
> > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> > index 73bf98bd9..5e45db655 100644
> > --- a/lib/librte_vhost/virtio_net.c
> > +++ b/lib/librte_vhost/virtio_net.c
> > @@ -437,6 +437,7 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf,
> > struct virtio_net_hdr *net_hdr)
> > ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0);
> > ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0);
> > -   ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0);
> > +   net_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
> > }
> >
> > /* IP cksum verification cannot be bypassed, then calculate here */
>
> No, it actually uses ->flags to pass VIRTIO_NET_HDR_F_NEEDS_CSUM and
> then we pass the start and offset.
>
> HTH,
> fbl
>
> >
> >
> > Any comments will be appreciated!
> >
> > Thanks a lot,
> > Yifeng
> >
> > On Wed, Jan 29, 2020 at 1:21 PM Flavio Leitner  wrote:
> > >
> > > On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote:
> > > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner  
> > > > wrote:
> > > > >
> > > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > > > > Sure.
> > > > > >
> > > > > > Firstly, make sure userspace-tso-enable is true
> > > > > > # ovs-vsctl get Open_vSwitch . other_config
> > > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > > > > userspace-tso-enable="true"}
> > > > > >
> > > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM 
> > > > > > host:
> > > > > > 
> > > > > >   
> > > > > >> > > > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > > > > >   
> > > > > >   
> > > > > > 
> > > > > > 
> > > > >
> > > > > I have other options set, but I don't think they are related:
> > > > > > > > > ufo='off' mrg_rxbuf='on'/>
> > > > >
> > > >
> > > > Is mrg_rxbuf required to be on?
> > >
> > > No.
> > >
> > > > I saw that when userspace TSO is enabled, we set the external buffer
> > > > flag RTE_VHOST_USER_EXTBUF_SUPPORT
> > >
> > > Yes.
> > >
> > > > Is this the same thing?
> > >
> > > No.
> > >
> > > mrg_rxbuf says that we want the virtio ring to support chained ring
> > > entries. If it is disabled, the virtio ring is populated with
> > > entries of maximum buffer length. If it is enabled, a packet uses
> > > one entry or chains several entries in the virtio ring, so each
> > > entry can be smaller. None of this is visible to OvS.
> > >
> > > RTE_VHOST_USER_EXTBUF_SUPPORT controls how a packet is provided to
> > > OvS after it has been pulled out of the virtio rings. We currently
> > > have three options:
> > >
> > > 1) LINEARBUF
> > > It supports data length up to the packet provided (~MTU size).
> > >
> > > 2) EXTBUF
> > > If the packet is too big for #1, allocate a buffer large enough
> > > to fit the data. We get a big packet, but instead of data being
> > > along with the packet's metadata, it's in an external buffer.
> > >
> > > 
> > >+---> [ big buffer]
> > >
> > > Well, actually we make partial use of unused buffer to store
> > > struct rte_mbuf_ext_shared_info.
> > >
> > > 3) If neither LINEARBUF nor EXTBUF is provided (default),
> > > vhost lib can provide large packets as a chain of mbufs, which
> > > OvS doesn't support today.
> > >
> > > HTH,
> > > --
> > > fbl
>
> --
> fbl


Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Flavio Leitner
On Wed, Jan 29, 2020 at 02:42:27PM -0800, Yifeng Sun wrote:
> Hi Flavio,

Hi Yifeng, thanks for looking into this.

> I found this piece of code in kernel's drivers/net/virtio_net.c and
> its function receive_buf():
> if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
> skb->ip_summed = CHECKSUM_UNNECESSARY;
> My understanding is that vhost_user needs to set flag
> VIRTIO_NET_HDR_F_DATA_VALID so that
> guest's kernel will skip packet's checksum validation.
> 
> Then I looked through dpdk's source code but didn't find any place
> that sets this flag. So I made
> some changes as below, and TCP starts working between 2 VMs without
> any kernel change.
> 
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 73bf98bd9..5e45db655 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -437,6 +437,7 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf,
> struct virtio_net_hdr *net_hdr)
> ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0);
> ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0);
> -   ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0);
> +   net_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
> }
> 
> /* IP cksum verification cannot be bypassed, then calculate here */

No, it actually uses ->flags to pass VIRTIO_NET_HDR_F_NEEDS_CSUM and 
then we pass the start and offset.
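
To make the difference between the two flags concrete, here is a minimal
sketch of how a sender asks the receiver to complete a checksum. The
struct is a simplified local copy of the legacy `virtio_net_hdr` layout
and the flag values follow the virtio specification; the `vnet_*` names
and the helper itself are hypothetical, and fields are kept in host byte
order for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined by the virtio specification. */
#define VNET_HDR_F_NEEDS_CSUM 1   /* Checksum still has to be filled in. */
#define VNET_HDR_F_DATA_VALID 2   /* Checksum was already validated. */

/* Simplified local copy of the legacy virtio_net_hdr layout. */
struct vnet_hdr {
    uint8_t flags;
    uint8_t gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;    /* Where checksumming starts in the packet. */
    uint16_t csum_offset;   /* Checksum field offset from csum_start. */
};

/* Hypothetical helper: request checksum completion for an L4 header
 * that starts 'l4_start' bytes into the packet and whose checksum
 * field sits 'csum_field_offset' bytes into that header. */
static void
vnet_hdr_request_csum(struct vnet_hdr *h, uint16_t l4_start,
                      uint16_t csum_field_offset)
{
    assert(h != NULL);
    h->flags = VNET_HDR_F_NEEDS_CSUM;
    h->csum_start = l4_start;
    h->csum_offset = csum_field_offset;
}
```

For TCP over IPv4 without options on plain Ethernet, the call would be
`vnet_hdr_request_csum(&h, 14 + 20, 16)`, since the TCP checksum field
is 16 bytes into the TCP header.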

HTH,
fbl

> 
> 
> Any comments will be appreciated!
> 
> Thanks a lot,
> Yifeng
> 
> On Wed, Jan 29, 2020 at 1:21 PM Flavio Leitner  wrote:
> >
> > On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote:
> > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner  wrote:
> > > >
> > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > > > Sure.
> > > > >
> > > > > Firstly, make sure userspace-tso-enable is true
> > > > > # ovs-vsctl get Open_vSwitch . other_config
> > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > > > userspace-tso-enable="true"}
> > > > >
> > > > > Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> > > > > 
> > > > >   
> > > > >> > > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > > > >   
> > > > >   
> > > > > 
> > > > > 
> > > >
> > > > I have other options set, but I don't think they are related:
> > > > > > > ufo='off' mrg_rxbuf='on'/>
> > > >
> > >
> > > Is mrg_rxbuf required to be on?
> >
> > No.
> >
> > > I saw that when userspace TSO is enabled, we set the external buffer
> > > flag RTE_VHOST_USER_EXTBUF_SUPPORT
> >
> > Yes.
> >
> > > Is this the same thing?
> >
> > No.
> >
> > mrg_rxbuf says that we want the virtio ring to support chained ring
> > entries. If it is disabled, the virtio ring is populated with
> > entries of maximum buffer length. If it is enabled, a packet uses
> > one entry or chains several entries in the virtio ring, so each
> > entry can be smaller. None of this is visible to OvS.
> >
> > RTE_VHOST_USER_EXTBUF_SUPPORT controls how a packet is provided to
> > OvS after it has been pulled out of the virtio rings. We currently
> > have three options:
> >
> > 1) LINEARBUF
> > It supports data length up to the packet provided (~MTU size).
> >
> > 2) EXTBUF
> > If the packet is too big for #1, allocate a buffer large enough
> > to fit the data. We get a big packet, but instead of data being
> > along with the packet's metadata, it's in an external buffer.
> >
> > 
> >+---> [ big buffer]
> >
> > Well, actually we make partial use of unused buffer to store
> > struct rte_mbuf_ext_shared_info.
> >
> > 3) If neither LINEARBUF nor EXTBUF is provided (default),
> > vhost lib can provide large packets as a chain of mbufs, which
> > OvS doesn't support today.
> >
> > HTH,
> > --
> > fbl

-- 
fbl


Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Yifeng Sun
Hi Flavio,

I found this piece of code in kernel's drivers/net/virtio_net.c and
its function receive_buf():
if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
skb->ip_summed = CHECKSUM_UNNECESSARY;
My understanding is that vhost_user needs to set flag
VIRTIO_NET_HDR_F_DATA_VALID so that
guest's kernel will skip packet's checksum validation.

Then I looked through dpdk's source code but didn't find any place
that sets this flag. So I made
some changes as below, and TCP starts working between 2 VMs without
any kernel change.

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 73bf98bd9..5e45db655 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -437,6 +437,7 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf,
struct virtio_net_hdr *net_hdr)
ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0);
ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0);
-   ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0);
+   net_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
}

/* IP cksum verification cannot be bypassed, then calculate here */


Any comments will be appreciated!

Thanks a lot,
Yifeng

On Wed, Jan 29, 2020 at 1:21 PM Flavio Leitner  wrote:
>
> On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote:
> > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner  wrote:
> > >
> > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > > Sure.
> > > >
> > > > Firstly, make sure userspace-tso-enable is true
> > > > # ovs-vsctl get Open_vSwitch . other_config
> > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > > userspace-tso-enable="true"}
> > > >
> > > > Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> > > > 
> > > >   
> > > >> > > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > > >   
> > > >   
> > > > 
> > > > 
> > >
> > > I have other options set, but I don't think they are related:
> > > > > ufo='off' mrg_rxbuf='on'/>
> > >
> >
> > Is mrg_rxbuf required to be on?
>
> No.
>
> > I saw that when userspace TSO is enabled, we set the external buffer
> > flag RTE_VHOST_USER_EXTBUF_SUPPORT
>
> Yes.
>
> > Is this the same thing?
>
> No.
>
> mrg_rxbuf says that we want the virtio ring to support chained ring
> entries. If it is disabled, the virtio ring is populated with
> entries of maximum buffer length. If it is enabled, a packet uses
> one entry or chains several entries in the virtio ring, so each
> entry can be smaller. None of this is visible to OvS.
>
> RTE_VHOST_USER_EXTBUF_SUPPORT controls how a packet is provided to
> OvS after it has been pulled out of the virtio rings. We currently
> have three options:
>
> 1) LINEARBUF
> It supports data length up to the packet provided (~MTU size).
>
> 2) EXTBUF
> If the packet is too big for #1, allocate a buffer large enough
> to fit the data. We get a big packet, but instead of data being
> along with the packet's metadata, it's in an external buffer.
>
> 
>+---> [ big buffer]
>
> Well, actually we make partial use of unused buffer to store
> struct rte_mbuf_ext_shared_info.
>
> 3) If neither LINEARBUF nor EXTBUF is provided (default),
> vhost lib can provide large packets as a chain of mbufs, which
> OvS doesn't support today.
>
> HTH,
> --
> fbl


Re: [ovs-dev] [PATCH ovn v4 13/13] tutorial: Add tutorial for OVN Interconnection.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 85 characters long (recommended limit is 79)
#122 FILE: Documentation/tutorials/ovn-interconnection.rst:77:
  --ovn-northd-nb-db= --ovn-northd-sb-db= [more options] start_ic

Lines checked: 266, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Flavio Leitner
On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote:
> On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner  wrote:
> >
> > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > Sure.
> > >
> > > Firstly, make sure userspace-tso-enable is true
> > > # ovs-vsctl get Open_vSwitch . other_config
> > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > userspace-tso-enable="true"}
> > >
> > > Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> > > 
> > >   
> > >> > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > >   
> > >   
> > > 
> > > 
> >
> > I have other options set, but I don't think they are related:
> > > ufo='off' mrg_rxbuf='on'/>
> >
> 
> Is mrg_rxbuf required to be on?

No.

> I saw that when userspace TSO is enabled, we set the external buffer
> flag RTE_VHOST_USER_EXTBUF_SUPPORT

Yes.

> Is this the same thing?

No.

mrg_rxbuf says that we want the virtio ring to support chained ring
entries. If it is disabled, the virtio ring is populated with
entries of maximum buffer length. If it is enabled, a packet uses
one entry or chains several entries in the virtio ring, so each
entry can be smaller. None of this is visible to OvS.

RTE_VHOST_USER_EXTBUF_SUPPORT controls how a packet is provided to
OvS after it has been pulled out of the virtio rings. We currently
have three options:

1) LINEARBUF
It supports data length up to the packet provided (~MTU size).

2) EXTBUF
If the packet is too big for #1, allocate a buffer large enough
to fit the data. We get a big packet, but instead of data being
along with the packet's metadata, it's in an external buffer.


   +---> [ big buffer]

Well, actually we make partial use of unused buffer to store
struct rte_mbuf_ext_shared_info.

3) If neither LINEARBUF nor EXTBUF is provided (default),
vhost lib can provide large packets as a chain of mbufs, which
OvS doesn't support today.
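
The three options can be summarized as a small decision function. This is
only a sketch of the description above, not the vhost library code: the
enum, the function name and its parameters are hypothetical, and dropping
an oversized packet in the LINEARBUF-only case is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for how a received packet reaches OvS. */
enum rx_layout {
    RX_LINEAR,    /* Data fits in the packet's own buffer. */
    RX_EXTBUF,    /* Data lives in a separately allocated big buffer. */
    RX_CHAINED,   /* Data is spread over a chain of mbufs. */
    RX_DROPPED    /* Assumption: too big and only linear is allowed. */
};

/* 'pkt_len' is the received data length and 'mbuf_room' the space
 * available in a single packet buffer (roughly MTU sized). */
static enum rx_layout
choose_rx_layout(size_t pkt_len, size_t mbuf_room,
                 bool extbuf, bool linearbuf)
{
    assert(mbuf_room > 0);
    if (pkt_len <= mbuf_room) {
        return RX_LINEAR;       /* Option 1 is enough. */
    }
    if (extbuf) {
        return RX_EXTBUF;       /* Option 2: external big buffer. */
    }
    if (linearbuf) {
        return RX_DROPPED;      /* Assumed behavior, see lead-in. */
    }
    return RX_CHAINED;          /* Option 3: default, chained mbufs. */
}
```

For a 60000-byte TSO packet and a 2048-byte buffer, the sketch returns
`RX_EXTBUF` when the external buffer flag is set and `RX_CHAINED` when
neither flag is set.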

HTH,
-- 
fbl


Re: [ovs-dev] [PATCH ovn v4 12/13] ovn-ctl: Support commands for interconnection.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 85 characters long (recommended limit is 79)
#54 FILE: utilities/ovn-ctl:112:
$DB_IC_NB_SYNC_FROM_PORT $ic_nb_active_conf_file ovn_ic_nb_db.ctl

WARNING: Line is 85 characters long (recommended limit is 79)
#59 FILE: utilities/ovn-ctl:117:
$DB_IC_SB_SYNC_FROM_PORT $ic_sb_active_conf_file 
ovn_ic_sb_db.ctl

WARNING: Line is 91 characters long (recommended limit is 79)
#198 FILE: utilities/ovn-ctl:494:
OVS_RUNDIR=${OVS_RUNDIR} start_ovn_daemon "$OVN_IC_PRIORITY" 
"$OVN_IC_WRAPPER" "$@"

WARNING: Line is 80 characters long (recommended limit is 79)
#402 FILE: utilities/ovn-ctl:838:
  restart_ic_ovsdbrestart ovn interconnection ovsdb-server processes

WARNING: Line is 81 characters long (recommended limit is 79)
#403 FILE: utilities/ovn-ctl:839:
  restart_ic_nb_ovsdb   restart ovn ic-northbound db ovsdb-server 
process

WARNING: Line is 81 characters long (recommended limit is 79)
#404 FILE: utilities/ovn-ctl:840:
  restart_ic_sb_ovsdb   restart ovn ic-southbound db ovsdb-server 
process

WARNING: Line is 82 characters long (recommended limit is 79)
#418 FILE: utilities/ovn-ctl:855:
  promote_ic_nb   promote ovn ic-northbound db backup server to 
active

WARNING: Line is 82 characters long (recommended limit is 79)
#419 FILE: utilities/ovn-ctl:856:
  promote_ic_sb   promote ovn ic-southbound db backup server to 
active

WARNING: Line is 81 characters long (recommended limit is 79)
#420 FILE: utilities/ovn-ctl:857:
  demote_ic_nbdemote ovn ic-northbound db active server to 
backup

WARNING: Line is 81 characters long (recommended limit is 79)
#421 FILE: utilities/ovn-ctl:858:
  demote_ic_sbdemote ovn ic-southbound db active server to 
backup

WARNING: Line is 83 characters long (recommended limit is 79)
#432 FILE: utilities/ovn-ctl:880:
  --ovn-manage-ovsdb=yes|noWhether or not the OVN NB/SB databases 
should be

WARNING: Line is 85 characters long (recommended limit is 79)
#445 FILE: utilities/ovn-ctl:896:
  --ovn-ic-log=STRINGovn-ic process logging params (default: 
$OVN_IC_LOG)

WARNING: Line is 83 characters long (recommended limit is 79)
#446 FILE: utilities/ovn-ctl:897:
  --ovn-ic-logfile=STRINGovn-ic process log file (default: 
$OVN_IC_LOGFILE)

WARNING: Line is 85 characters long (recommended limit is 79)
#466 FILE: utilities/ovn-ctl:964:
  --db-ic-nb-addr=ADDROVN IC Northbound db ptcp address (default: 
$DB_IC_NB_ADDR)

WARNING: Line is 82 characters long (recommended limit is 79)
#467 FILE: utilities/ovn-ctl:965:
  --db-ic-nb-port=PORTOVN IC Northbound db ptcp port (default: 
$DB_IC_NB_PORT)

WARNING: Line is 85 characters long (recommended limit is 79)
#468 FILE: utilities/ovn-ctl:966:
  --db-ic-sb-addr=ADDROVN IC Southbound db ptcp address (default: 
$DB_IC_SB_ADDR)

WARNING: Line is 82 characters long (recommended limit is 79)
#469 FILE: utilities/ovn-ctl:967:
  --db-ic-sb-port=PORTOVN IC Southbound db ptcp port (default: 
$DB_IC_SB_PORT)

WARNING: Line is 83 characters long (recommended limit is 79)
#470 FILE: utilities/ovn-ctl:968:
  --ovn-ic-nb-logfile=FILE OVN IC Northbound log file (default: 
$OVN_IC_NB_LOGFILE)

WARNING: Line is 83 characters long (recommended limit is 79)
#471 FILE: utilities/ovn-ctl:969:
  --ovn-ic-sb-logfile=FILE OVN IC Southbound log file (default: 
$OVN_IC_SB_LOGFILE)

WARNING: Line is 108 characters long (recommended limit is 79)
#472 FILE: utilities/ovn-ctl:970:
  --db-ic-nb-sync-from-addr=ADDR OVN IC Northbound active db tcp address 
(default: $DB_IC_NB_SYNC_FROM_ADDR)

WARNING: Line is 105 characters long (recommended limit is 79)
#473 FILE: utilities/ovn-ctl:971:
  --db-ic-nb-sync-from-port=PORT OVN IC Northbound active db tcp port (default: 
$DB_IC_NB_SYNC_FROM_PORT)

WARNING: Line is 109 characters long (recommended limit is 79)
#474 FILE: utilities/ovn-ctl:972:
  --db-ic-nb-sync-from-proto=PROTO OVN IC Northbound active db transport 
(default: $DB_IC_NB_SYNC_FROM_PROTO)

WARNING: Line is 123 characters long (recommended limit is 79)
#475 FILE: utilities/ovn-ctl:973:
  --db-ic-nb-create-insecure-remote=yes|no Create ptcp OVN IC Northbound remote 
(default: $DB_IC_NB_CREATE_INSECURE_REMOTE)

WARNING: Line is 108 characters long (recommended limit is 79)
#476 FILE: utilities/ovn-ctl:974:
  --db-ic-sb-sync-from-addr=ADDR OVN IC Southbound active db tcp address 
(default: $DB_IC_SB_SYNC_FROM_ADDR)

WARNING: Line is 105 characters long (recommended limit is 79)
#477 FILE: utilities/ovn-ctl:975:
  --db-ic-sb-sync-from-port=ADDR OVN IC Southbound active db tcp port (default: 
$DB_IC_SB_SYNC_FROM_PORT)

WARNING: Line is 109 characters long (recommended limit is 79)
#478 FILE: 

Re: [ovs-dev] [PATCH ovn v4 11/13] ovn-ctl: Refactor to reduce redundant code.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 83 characters long (recommended limit is 79)
#58 FILE: utilities/ovn-ctl:72:
echo "$sync_from_proto:$sync_from_addr:$sync_from_port" > 
$active_conf_file

WARNING: Line is 104 characters long (recommended limit is 79)
#65 FILE: utilities/ovn-ctl:76:
ovn-appctl -t $OVN_RUNDIR/$ctl_file 
ovsdb-server/set-active-ovsdb-server `cat $active_conf_file`

WARNING: Line is 84 characters long (recommended limit is 79)
#66 FILE: utilities/ovn-ctl:77:
ovn-appctl -t $OVN_RUNDIR/$ctl_file 
ovsdb-server/connect-active-ovsdb-server

Lines checked: 115, Warnings: 3, Errors: 0
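
For reference, the kind of check behind the "Line is N characters long"
warnings can be sketched as below. This is only an illustration with
made-up names, not the project's checkpatch script, and it measures
plain bytes per line without any of the exceptions a real checker may
apply.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_LINE_LIMIT 79   /* Recommended limit quoted in the report. */

/* Hypothetical helper: counts the lines in 'text' whose length, not
 * counting the trailing newline, is greater than 'limit'. */
static size_t
sketch_count_long_lines(const char *text, size_t limit)
{
    size_t warnings = 0;

    assert(text != NULL);
    while (*text) {
        size_t len = strcspn(text, "\n");   /* Length up to the newline. */

        if (len > limit) {
            warnings++;
        }
        text += len;
        if (*text == '\n') {
            text++;                         /* Step over the newline. */
        }
    }
    return warnings;
}
```

Calling sketch_count_long_lines(patch_text, SKETCH_LINE_LIMIT) on the
added lines of a patch would give a count comparable to the "Warnings"
figure above, assuming line length were the only rule being checked.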


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH ovn v4 09/13] ovn-ic: Interconnection port controller.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#321 FILE: ic/ovn-ic.c:652:
/* XXX: Sync encap so that multiple encaps can be used for the same

Lines checked: 778, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH ovn v4 06/13] ovn-ic: Transit switch controller.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 80 characters long (recommended limit is 79)
#325 FILE: ovn-sb.xml:2367:
this key the value of the same interconn-ts key of the


Re: [ovs-dev] [PATCH ovn v4 04/13] ovn-ic: Interconnection controller with AZ registeration.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 81 characters long (recommended limit is 79)
#83 FILE: ic/ovn-ic.8.xml:12:
  daemon which communicates with global interconnection databases 
IC_NB/IC_SB

WARNING: Line is 82 characters long (recommended limit is 79)
#125 FILE: ic/ovn-ic.8.xml:54:
http://www.w3.org/2003/XInclude"/>

WARNING: Line is 80 characters long (recommended limit is 79)
#128 FILE: ic/ovn-ic.8.xml:57:
http://www.w3.org/2003/XInclude"/>

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#283 FILE: ic/ovn-ic.c:86:
  --ovnnb-db=DATABASE   connect to ovn-nb database at DATABASE\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#285 FILE: ic/ovn-ic.c:88:
  --ovnsb-db=DATABASE   connect to ovn-sb database at DATABASE\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#287 FILE: ic/ovn-ic.c:90:
  --unixctl=SOCKET  override default control socket name\n\

WARNING: Comment with 'xxx' marker
#502 FILE: ic/ovn-ic.c:305:
/* ovn-nb db. XXX: add only needed tables and columns */

WARNING: Comment with 'xxx' marker
#506 FILE: ic/ovn-ic.c:309:
/* ovn-sb db. XXX: add only needed tables and columns */

WARNING: Line is 111 characters long (recommended limit is 79)
#1138 FILE: tutorial/ovs-sandbox:406:

PATH=$builddir/controller:$builddir/controller-vtep:$builddir/northd:$builddir/ic:$builddir/utilities:$PATH

Lines checked: 1178, Warnings: 14, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH ovn v4 03/13] ovn-ic-sb: Interconnection southbound DB schema and CLI.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 80 characters long (recommended limit is 79)
#269 FILE: ovn-ic-sb.ovsschema:34:
 "refTable": "Availability_Zone"}}},

WARNING: Line is 80 characters long (recommended limit is 79)
#308 FILE: ovn-ic-sb.ovsschema:73:
 "refTable": "Availability_Zone"}}},

WARNING: Line is 82 characters long (recommended limit is 79)
#400 FILE: ovn-ic-sb.xml:30:
These tables contain objects that are availability zone specific.  Each 
object

WARNING: Line is 83 characters long (recommended limit is 79)
#401 FILE: ovn-ic-sb.xml:31:
is owned and populated by one availability zone, and read by other 
availability

WARNING: Line is 82 characters long (recommended limit is 79)
#466 FILE: ovn-ic-sb.xml:96:
  Each row in this table represents an Availability Zone.  Each OVN 
deployment

WARNING: Line is 81 characters long (recommended limit is 79)
#467 FILE: ovn-ic-sb.xml:97:
  is considered an availability zone from OVN control plane perspective, 
with

WARNING: Line is 81 characters long (recommended limit is 79)
#468 FILE: ovn-ic-sb.xml:98:
  its own central components, such as northbound and southbound databases 
and

WARNING: Line is 81 characters long (recommended limit is 79)
#576 FILE: ovn-ic-sb.xml:206:
see  column of the 
OVN

WARNING: Line is 81 characters long (recommended limit is 79)
#577 FILE: ovn-ic-sb.xml:207:
Southbound database's 

WARNING: Line is 81 characters long (recommended limit is 79)
#645 FILE: ovn-ic-sb.xml:275:
  The Ethernet address and IP addresses used by the corresponding 
logical

WARNING: Line is 84 characters long (recommended limit is 79)
#646 FILE: ovn-ic-sb.xml:276:
  router port peering with the transit switch port.  It is a string 
combined

WARNING: Line is 82 characters long (recommended limit is 79)
#689 FILE: ovn-ic-sb.xml:319:
  
ssl:host[:port]

WARNING: Line is 82 characters long (recommended limit is 79)
#707 FILE: ovn-ic-sb.xml:337:
  
tcp:host[:port]

WARNING: Line is 84 characters long (recommended limit is 79)
#713 FILE: ovn-ic-sb.xml:343:
  address, wrap it in square brackets, e.g. 
tcp:[::1]:6640.

WARNING: Line is 85 characters long (recommended limit is 79)
#719 FILE: ovn-ic-sb.xml:349:
  
pssl:[port][:host]

WARNING: Line is 80 characters long (recommended limit is 79)
#731 FILE: ovn-ic-sb.xml:361:
  A valid SSL configuration must be provided when this form is used,

WARNING: Line is 85 characters long (recommended limit is 79)
#743 FILE: ovn-ic-sb.xml:373:
  
ptcp:[port][:host]

WARNING: Line is 110 characters long (recommended limit is 79)
#819 FILE: ovn-ic-sb.xml:449:
  type='{"type": "string", "enum": ["set", ["VOID", "BACKOFF", 
"CONNECTING", "ACTIVE", "IDLE"]]}'>

WARNING: Line is 96 characters long (recommended limit is 79)
#1165 FILE: utilities/ovn-ic-sbctl.8.xml:4:
ovn-ic-sbctl -- Open Virtual Network interconnection southbound db 
management utility

WARNING: Line is 96 characters long (recommended limit is 79)
#1168 FILE: utilities/ovn-ic-sbctl.8.xml:7:
ovn-ic-sbctl [options] command 
[arg...]

WARNING: Line is 90 characters long (recommended limit is 79)
#1171 FILE: utilities/ovn-ic-sbctl.8.xml:10:
This utility can be used to manage the OVN interconnection southbound 
database.

WARNING: Line is 81 characters long (recommended limit is 79)
#1191 FILE: utilities/ovn-ic-sbctl.8.xml:30:
These commands query and modify the contents of ovsdb 
tables.

WARNING: Line is 92 characters long (recommended limit is 79)
#1193 FILE: utilities/ovn-ic-sbctl.8.xml:32:
as such they operate at a lower level than other ovn-ic-sbctl 
commands.

WARNING: Line is 82 characters long (recommended limit is 79)
#1195 FILE: utilities/ovn-ic-sbctl.8.xml:34:
Each of these commands has a table parameter to identify a 
table

WARNING: Line is 87 characters long (recommended limit is 79)
#1219 FILE: utilities/ovn-ic-sbctl.8.xml:58:
http://www.w3.org/2003/XInclude"/>

WARNING: Line is 114 characters long (recommended limit is 79)
#1233 FILE: utilities/ovn-ic-sbctl.8.xml:72:
  [--inactivity-probe=msecs] 
set-connection target...

WARNING: Line is 80 characters long (recommended limit is 79)
#1236 FILE: utilities/ovn-ic-sbctl.8.xml:75:
--inactivity-probe=msecs to override the default

WARNING: Line is 83 characters long (recommended limit is 79)
#1237 FILE: utilities/ovn-ic-sbctl.8.xml:76:
idle connection inactivity probe time.  Use 0 to disable inactivity 
probes.

WARNING: Line is 86 characters long (recommended limit is 79)
#1268 FILE: utilities/ovn-ic-sbctl.8.xml:107:
  Otherwise, the default is unix:@RUNDIR@/ovn_ic_sb_db.sock, 

Re: [ovs-dev] [PATCH ovn v4 02/13] ovn-ic-nb: Interconnection northbound DB schema and CLI.

2020-01-29 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 85 characters long (recommended limit is 79)
#346 FILE: ovn-ic-nb.xml:35:
  Northbound configuration for OVN interconnection.  This table must have 
exactly

WARNING: Line is 80 characters long (recommended limit is 79)
#378 FILE: ovn-ic-nb.xml:67:
  Each row represents one transit logical switch for interconnection between

WARNING: Line is 82 characters long (recommended limit is 79)
#470 FILE: ovn-ic-nb.xml:159:
  
ssl:host[:port]

WARNING: Line is 81 characters long (recommended limit is 79)
#477 FILE: ovn-ic-nb.xml:166:
  specified via command-line options or the  
table.

WARNING: Line is 82 characters long (recommended limit is 79)
#488 FILE: ovn-ic-nb.xml:177:
  
tcp:host[:port]

WARNING: Line is 84 characters long (recommended limit is 79)
#494 FILE: ovn-ic-nb.xml:183:
  address, wrap it in square brackets, e.g. 
tcp:[::1]:6640.

WARNING: Line is 85 characters long (recommended limit is 79)
#500 FILE: ovn-ic-nb.xml:189:
  
pssl:[port][:host]

WARNING: Line is 80 characters long (recommended limit is 79)
#512 FILE: ovn-ic-nb.xml:201:
  A valid SSL configuration must be provided when this form is used,

WARNING: Line is 85 characters long (recommended limit is 79)
#524 FILE: ovn-ic-nb.xml:213:
  
ptcp:[port][:host]

WARNING: Line is 110 characters long (recommended limit is 79)
#599 FILE: ovn-ic-nb.xml:288:
  type='{"type": "string", "enum": ["set", ["VOID", "BACKOFF", 
"CONNECTING", "ACTIVE", "IDLE"]]}'>

WARNING: Line is 96 characters long (recommended limit is 79)
#848 FILE: utilities/ovn-ic-nbctl.8.xml:4:
ovn-ic-nbctl -- Open Virtual Network interconnection northbound db 
management utility

WARNING: Line is 96 characters long (recommended limit is 79)
#851 FILE: utilities/ovn-ic-nbctl.8.xml:7:
ovn-ic-nbctl [options] command 
[arg...]

WARNING: Line is 90 characters long (recommended limit is 79)
#854 FILE: utilities/ovn-ic-nbctl.8.xml:10:
This utility can be used to manage the OVN interconnection northbound 
database.

WARNING: Line is 81 characters long (recommended limit is 79)
#900 FILE: utilities/ovn-ic-nbctl.8.xml:56:
These commands query and modify the contents of ovsdb 
tables.

WARNING: Line is 92 characters long (recommended limit is 79)
#902 FILE: utilities/ovn-ic-nbctl.8.xml:58:
as such they operate at a lower level than other ovn-ic-nbctl 
commands.

WARNING: Line is 82 characters long (recommended limit is 79)
#904 FILE: utilities/ovn-ic-nbctl.8.xml:60:
Each of these commands has a table parameter to identify a 
table

WARNING: Line is 87 characters long (recommended limit is 79)
#928 FILE: utilities/ovn-ic-nbctl.8.xml:84:
http://www.w3.org/2003/XInclude"/>

WARNING: Line is 114 characters long (recommended limit is 79)
#942 FILE: utilities/ovn-ic-nbctl.8.xml:98:
  [--inactivity-probe=msecs] 
set-connection target...

WARNING: Line is 80 characters long (recommended limit is 79)
#945 FILE: utilities/ovn-ic-nbctl.8.xml:101:
--inactivity-probe=msecs to override the default

WARNING: Line is 83 characters long (recommended limit is 79)
#946 FILE: utilities/ovn-ic-nbctl.8.xml:102:
idle connection inactivity probe time.  Use 0 to disable inactivity 
probes.

WARNING: Line is 86 characters long (recommended limit is 79)
#977 FILE: utilities/ovn-ic-nbctl.8.xml:133:
  Otherwise, the default is unix:@RUNDIR@/ovn_ic_nb_db.sock, 
but this

WARNING: Line is 81 characters long (recommended limit is 79)
#986 FILE: utilities/ovn-ic-nbctl.8.xml:142:
  is a clustered database, ovn-ic-nbctl will avoid servers 
other

WARNING: Line is 82 characters long (recommended limit is 79)
#989 FILE: utilities/ovn-ic-nbctl.8.xml:145:
  --no-leader-only, ovn-ic-nbctl will use any 
server

WARNING: Line is 80 characters long (recommended limit is 79)
#999 FILE: utilities/ovn-ic-nbctl.8.xml:155:
http://www.w3.org/2003/XInclude"/>

WARNING: Line is 81 characters long (recommended limit is 79)
#1004 FILE: utilities/ovn-ic-nbctl.8.xml:160:
http://www.w3.org/2003/XInclude"/>

WARNING: Line is 89 characters long (recommended limit is 79)
#1012 FILE: utilities/ovn-ic-nbctl.8.xml:168:
http://www.w3.org/2003/XInclude"/>

WARNING: Line is 82 characters long (recommended limit is 79)
#1016 FILE: utilities/ovn-ic-nbctl.8.xml:172:
http://www.w3.org/2003/XInclude"/>

WARNING: Line lacks whitespace around operator
#1330 FILE: utilities/ovn-ic-nbctl.c:306:
  ts-add SWITCH  create a transit switch named SWITCH\n\

WARNING: Line lacks whitespace around operator
#1331 FILE: utilities/ovn-ic-nbctl.c:307:
  ts-del SWITCH  delete SWITCH\n\

WARNING: Line lacks whitespace around operator
#1332 FILE: utilities/ovn-ic-nbctl.c:308:
  ts-list 

Re: [ovs-dev] [PATCH v2] dpdk: Support running PMD threads on cores > RTE_MAX_LCORE.

2020-01-29 Thread Aaron Conole
David Marchand  writes:

> Most DPDK components make the assumption that rte_lcore_id() returns
> a valid lcore_id in [0..RTE_MAX_LCORE[ range (with the exception of
> the LCORE_ID_ANY special value).
> OVS does not currently check which value is set in
> RTE_PER_LCORE(_lcore_id) which exposes us to potential crashes on DPDK
> side.
> Introduce an lcore allocator in OVS for PMD threads and map them to
> unused lcores from DPDK à la --lcores.
> The physical cores on which the PMD threads are running are still
> important information when debugging, so keep those in the PMD
> thread names but add a new debug log when starting them.
> Mapping between OVS threads and DPDK lcores can be dumped with a new
> dpdk/dump-lcores command.
> Synchronize DPDK internals on numa and cpuset for the PMD threads by
> registering them via the rte_thread_set_affinity() helper.
> Signed-off-by: David Marchand 
> ---

Thanks for this work.  The current failure condition that exists
today gets confusing for users.

I do prefer having part of OvS internally handling this - it will only
be used with dpdk ports when the EAL gets initialized, so I dislike
documenting some kinds of options.  This is where I like having the
abstraction layer for option passing because we can fix things up for
the user.
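
For anyone following along, the allocator idea from the commit message
(hand each PMD thread the lowest unused lcore id, tracked in a bitmap)
can be sketched roughly as below. This is not the code from the patch:
the names and the SKETCH_MAX_LCORE value are assumptions standing in
for RTE_MAX_LCORE, and the mutex that guards the real bitmap is left
out for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_MAX_LCORE 128   /* Assumed value, stands in for RTE_MAX_LCORE. */
#define SKETCH_NO_LCORE  (-1)  /* Returned when every lcore id is taken. */

#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
    ((SKETCH_MAX_LCORE + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* One bit per lcore id; a set bit means the id is in use. */
static unsigned long sketch_lcore_bitmap[SKETCH_WORDS];

/* Claims and returns the lowest free lcore id, or SKETCH_NO_LCORE. */
static int
sketch_lcore_alloc(void)
{
    for (int id = 0; id < SKETCH_MAX_LCORE; id++) {
        unsigned long *word = &sketch_lcore_bitmap[id / SKETCH_BITS_PER_WORD];
        unsigned long bit = 1UL << (id % SKETCH_BITS_PER_WORD);

        if (!(*word & bit)) {
            *word |= bit;
            return id;
        }
    }
    return SKETCH_NO_LCORE;
}

/* Releases an id previously returned by sketch_lcore_alloc(). */
static void
sketch_lcore_free(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_LCORE);
    sketch_lcore_bitmap[id / SKETCH_BITS_PER_WORD] &=
        ~(1UL << (id % SKETCH_BITS_PER_WORD));
}
```

The point of the indirection is that the id handed out is always below
the maximum, whatever physical core the thread is pinned to, which is
the property the commit message says most DPDK components assume.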

> Changelog since v1:
> - rewired existing configuration 'dpdk-lcore-mask' to use --lcores,
> - switched to a bitmap to track lcores,
> - added a command to dump current mapping (Flavio): used an experimental
>   API to get DPDK lcores cpuset since it is the most reliable/portable
>   information,
> - used the same code for the logs when starting DPDK/PMD threads,
> - addressed Ilya comments,
> ---
>  lib/dpdk-stub.c   |   8 ++-
>  lib/dpdk.c| 163 --
>  lib/dpdk.h|   5 +-
>  lib/dpif-netdev.c |   3 +-
>  4 files changed, 170 insertions(+), 9 deletions(-)
> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
> index c332c217c..90473bc8e 100644
> --- a/lib/dpdk-stub.c
> +++ b/lib/dpdk-stub.c
> @@ -39,7 +39,13 @@ dpdk_init(const struct smap *ovs_other_config)
>  }
>  void
> -dpdk_set_lcore_id(unsigned cpu OVS_UNUSED)
> +dpdk_init_thread_context(unsigned cpu OVS_UNUSED)
> +{
> +/* Nothing */
> +}
> +
> +void
> +dpdk_uninit_thread_context(void)
>  {
>  /* Nothing */
>  }
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 37ea2973c..0173366a0 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -30,6 +30,7 @@
>  #include 
>  #endif
> +#include "bitmap.h"
>  #include "dirs.h"
>  #include "fatal-signal.h"
>  #include "netdev-dpdk.h"
> @@ -39,6 +40,7 @@
>  #include "ovs-numa.h"
>  #include "smap.h"
>  #include "svec.h"
> +#include "unixctl.h"
>  #include "util.h"
>  #include "vswitch-idl.h"
> @@ -54,6 +56,9 @@ static bool dpdk_initialized = false; /* Indicates successful initialization
> * of DPDK. */
>  static bool per_port_memory = false; /* Status of per port memory support */
> +static struct ovs_mutex lcore_bitmap_mutex = OVS_MUTEX_INITIALIZER;
> +static unsigned long *lcore_bitmap OVS_GUARDED_BY(lcore_bitmap_mutex);
> +
>  static int
>  process_vhost_flags(char *flag, const char *default_val, int size,
>  const struct smap *ovs_other_config,
> @@ -94,6 +99,38 @@ args_contains(const struct svec *args, const char *value)
>  return false;
>  }
> +static void
> +construct_dpdk_lcore_option(const struct smap *ovs_other_config,
> +struct svec *args)
> +{
> +const char *cmask = smap_get(ovs_other_config, "dpdk-lcore-mask");
> +struct svec lcores = SVEC_EMPTY_INITIALIZER;
> +struct ovs_numa_info_core *core;
> +struct ovs_numa_dump *cores;
> +int index = 0;
> +
> +if (!cmask) {
> +return;
> +}
> +if (args_contains(args, "-c") || args_contains(args, "-l") ||
> +args_contains(args, "--lcores")) {
> +VLOG_WARN("Ignoring database defined option 'dpdk-lcore-mask' "
> +  "due to dpdk-extra config");
> +return;
> +}
> +
> +cores = ovs_numa_dump_cores_with_cmask(cmask);
> +FOR_EACH_CORE_ON_DUMP(core, cores) {
> +svec_add_nocopy(&lcores, xasprintf("%d@%d", index, core->core_id));
> +index++;
> +}
> +svec_terminate(&lcores);
> +ovs_numa_dump_destroy(cores);
> +svec_add(args, "--lcores");
> +svec_add_nocopy(args, svec_join(&lcores, ",", ""));
> +svec_destroy(&lcores);
> +}
> +
>  static void
>  construct_dpdk_options(const struct smap *ovs_other_config, struct svec *args)
>  {
> @@ -103,7 +140,6 @@ construct_dpdk_options(const struct smap *ovs_other_config, struct svec *args)
>  bool default_enabled;
>  const char *default_value;
>  } opts[] = {
> -{"dpdk-lcore-mask",   "-c", false, NULL},
>  {"dpdk-hugepage-dir", "--huge-dir", false, NULL},
>  {"dpdk-socket-limit", "--socket-limit", false, NULL},
>  };

Re: [ovs-dev] [PATCH ovn v3 00/13] OVN Interconnection

2020-01-29 Thread Han Zhou
On Wed, Jan 29, 2020 at 11:44 AM Numan Siddique  wrote:
>
> On Tue, Jan 28, 2020 at 8:26 AM Han Zhou  wrote:
> >
> > The series supports interconnecting multiple OVN deployments (e.g. located
> > at multiple data centers) through logical routers connected with transit
> > logical switches with overlay tunnels, managed through OVN control plane.
> > See the ovn-architecture.rst document updates for more details, and find
> > the instructions in Documentation/tutorials/ovn-interconnection.rst.
> >
> > v2 -> v3:
> >
> >   - Addressed Numan's comments:
> >       - Rename ovn-inbctl => ovn-ic-nbctl, ovn-isbctl => ovn-ic-sbctl.
> >       - Update tunnel keys through northd instead of directly updating
> >         SB-DB by ovn-ic.
> >       - Rename is-interconn to ovn-is-interconn in chassis ovsdb settings.
> >       - Add a section in ovn-architecture for "A day in the life of a
> >         packet crossing AZs".
> >       - Set hostname for chassis in test cases.
> >
> >   - In addition, there are some other changes:
> >       - Avoid unnecessary tunnel and bfd sessions to remote chassis.
> >       - Use external_ids keys "is-remote" and "is-interconn" in SB Chassis
> >         table, instead of adding new columns is_remote and is_interconn,
> >         to avoid too many columns.
> >
> > Han Zhou (13):
> >   ovn-architecture: Add documentation for OVN interconnection feature.
> >   ovn-ic-nb: Interconnection northbound DB schema and CLI.
> >   ovn-ic-sb: Interconnection southbound DB schema and CLI.
> >   ovn-ic: Interconnection controller with AZ registeration.
> >   ovn-northd.c: Refactor allocate_tnlid.
> >   ovn-ic: Transit switch controller.
> >   ovn-sb: Add keys is_interconn and is_remote to Chassis's external_ids.
> >   ovn-ic: Interconnection gateway controller.
> >   ovn-ic: Interconnection port controller.
> >   ovn.at: e2e test for OVN interconnection.
> >   ovn-ctl: Refactor to reduce redundant code.
> >   ovn-ctl: Support commands for interconnection.
> >   tutorial: Add tutorial for OVN Interconnection.
>
> Hi Han,
>
> Can you please rebase this series ? Some of the patches are not
> applying on the present master.
>
> Thanks
> Numan
>
OK. I rebased it to v4:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=155905

Thanks,
Han


[ovs-dev] [PATCH ovn v4 12/13] ovn-ctl: Support commands for interconnection.

2020-01-29 Thread Han Zhou
Add support for managing IC-NB and IC-SB DBs, and ovn-ic daemon.

Signed-off-by: Han Zhou 
---
 utilities/ovn-ctl   | 364 +++-
 utilities/ovn-ctl.8.xml |  91 
 2 files changed, 453 insertions(+), 2 deletions(-)

diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index 2e4e773..c7cb42b 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -33,6 +33,9 @@ done
 ovnnb_active_conf_file="$ovn_etcdir/ovnnb-active.conf"
 ovnsb_active_conf_file="$ovn_etcdir/ovnsb-active.conf"
 ovn_northd_db_conf_file="$ovn_etcdir/ovn-northd-db-params.conf"
+ic_nb_active_conf_file="$ovn_etcdir/ic-nb-active.conf"
+ic_sb_active_conf_file="$ovn_etcdir/ic-sb-active.conf"
+ovn_ic_db_conf_file="$ovn_etcdir/ovn-ic-db-params.conf"
 ## ----- ##
 ## start ##
 ## ----- ##
@@ -61,6 +64,19 @@ stop_ovsdb () {
 stop_sb_ovsdb
 }
 
+stop_ic_nb_ovsdb() {
+stop_xx_ovsdb $DB_IC_NB_PID ovn_ic_nb_db.ctl
+}
+
+stop_ic_sb_ovsdb() {
+stop_xx_ovsdb $DB_IC_SB_PID ovn_ic_sb_db.ctl
+}
+
+stop_ic_ovsdb () {
+stop_ic_nb_ovsdb
+stop_ic_sb_ovsdb
+}
+
 demote_xx_ovsdb () {
 local sync_from_addr=$1
 local sync_from_proto=$2
@@ -91,6 +107,16 @@ demote_ovnsb() {
 $DB_SB_SYNC_FROM_PORT $ovnsb_active_conf_file ovnsb_db.ctl
 }
 
+demote_ic_nb() {
+demote_xx_ovsdb $DB_IC_NB_SYNC_FROM_ADDR $DB_IC_NB_SYNC_FROM_PROTO \
+$DB_IC_NB_SYNC_FROM_PORT $ic_nb_active_conf_file ovn_ic_nb_db.ctl
+}
+
+demote_ic_sb() {
+demote_xx_ovsdb $DB_IC_SB_SYNC_FROM_ADDR $DB_IC_SB_SYNC_FROM_PROTO \
+$DB_IC_SB_SYNC_FROM_PORT $ic_sb_active_conf_file ovn_ic_sb_db.ctl
+}
+
 promote_xx_ovsdb() {
 local active_conf_file=$1
 local ctl_file=$2
@@ -106,6 +132,14 @@ promote_ovnsb() {
 promote_xx_ovsdb $ovnsb_active_conf_file ovnsb_db.ctl
 }
 
+promote_ic_nb() {
+promote_xx_ovsdb $ic_nb_active_conf_file ovn_ic_nb_db.ctl
+}
+
+promote_ic_sb() {
+promote_xx_ovsdb $ic_sb_active_conf_file ovn_ic_sb_db.ctl
+}
+
 start_ovsdb__() {
 local DB=$1 db=$2 schema_name=$3 table_name=$4
 local db_pid_file
@@ -255,7 +289,7 @@ $cluster_remote_port
 # Initialize the database if it's running standalone,
 # active-passive, or is the first server in a cluster.
 if test -z "$cluster_remote_addr"; then
-ovn-${db}ctl init
+$(echo ovn-${db}ctl | tr _ -) init
 fi
 
 if test $mode = cluster; then
@@ -284,6 +318,19 @@ start_ovsdb () {
 start_sb_ovsdb
 }
 
+start_ic_nb_ovsdb() {
+start_ovsdb__ IC_NB ic_nb OVN_IC_Northbound IC_NB_Global
+}
+
+start_ic_sb_ovsdb() {
+start_ovsdb__ IC_SB ic_sb OVN_IC_Southbound IC_SB_Global
+}
+
+start_ic_ovsdb () {
+start_ic_nb_ovsdb
+start_ic_sb_ovsdb
+}
+
 sync_status() {
 ovn-appctl -t $OVN_RUNDIR/ovn${1}_db.ctl ovsdb-server/sync-status | awk '{if(NR==1) print $2}'
 }
@@ -318,6 +365,36 @@ status_ovsdb () {
   fi
 }
 
+status_ic_nb() {
+if ! pidfile_is_running $DB_IC_NB_PID; then
+echo "not-running"
+else
+echo "running/$(sync_status ic_nb)"
+fi
+}
+
+status_ic_sb() {
+if ! pidfile_is_running $DB_IC_SB_PID; then
+echo "not-running"
+else
+echo "running/$(sync_status ic_sb)"
+fi
+}
+
+status_ic_ovsdb () {
+  if ! pidfile_is_running $DB_IC_NB_PID; then
+  log_success_msg "OVN IC-Northbound DB is not running"
+  else
+  log_success_msg "OVN IC-Northbound DB is running"
+  fi
+
+  if ! pidfile_is_running $DB_IC_SB_PID; then
+  log_success_msg "OVN IC-Southbound DB is not running"
+  else
+  log_success_msg "OVN IC-Southbound DB is running"
+  fi
+}
+
 run_nb_ovsdb() {
 DB_NB_DETACH=no
 start_nb_ovsdb
@@ -328,6 +405,16 @@ run_sb_ovsdb() {
 start_sb_ovsdb
 }
 
+run_ic_nb_ovsdb() {
+DB_IC_NB_DETACH=no
+start_ic_nb_ovsdb
+}
+
+run_ic_sb_ovsdb() {
+DB_IC_SB_DETACH=no
+start_ic_sb_ovsdb
+}
+
 start_northd () {
 if [ ! -e $ovn_northd_db_conf_file ]; then
 if test X"$OVN_MANAGE_OVSDB" = Xyes; then
@@ -373,6 +460,41 @@ start_northd () {
 fi
 }
 
+start_ic () {
+if [ ! -e $ovn_ic_db_conf_file ]; then
+ovn_ic_params="--ovnnb-db=$OVN_NORTHD_NB_DB \
+   --ovnsb-db=$OVN_NORTHD_SB_DB \
+   --ic-nb-db=$OVN_IC_NB_DB \
+   --ic-sb-db=$OVN_IC_SB_DB"
+else
+ovn_ic_params="`cat $ovn_ic_db_conf_file`"
+fi
+
+if daemon_is_running ovn-ic; then
+log_success_msg "ovn-ic is already running"
+else
+set ovn-ic
+if test X"$OVN_IC_LOGFILE" != X; then
+set "$@" --log-file=$OVN_IC_LOGFILE
+fi
+if test X"$OVN_IC_SSL_KEY" != X; then
+set "$@" --private-key=$OVN_IC_SSL_KEY
+fi
+if test X"$OVN_IC_SSL_CERT" != X; then
+set "$@" --certificate=$OVN_IC_SSL_CERT
+fi
+if test X"$OVN_IC_SSL_CA_CERT" != X; then
+set "$@" --ca-cert=$OVN_IC_SSL_CA_CERT
+fi
+
+[ 

[ovs-dev] [PATCH ovn v4 13/13] tutorial: Add tutorial for OVN Interconnection.

2020-01-29 Thread Han Zhou
Added tutorial, and also updated NEWS and TODO.

Tested-by: Aliasgar Ginwala 
Signed-off-by: Han Zhou 
---
 Documentation/automake.mk   |   1 +
 Documentation/tutorials/index.rst   |   1 +
 Documentation/tutorials/ovn-interconnection.rst | 188 
 NEWS|   5 +
 TODO.rst|   6 +
 5 files changed, 201 insertions(+)
 create mode 100644 Documentation/tutorials/ovn-interconnection.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index bf21663..2f33753 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -20,6 +20,7 @@ DOC_SOURCE = \
Documentation/tutorials/ovn-sandbox.rst \
Documentation/tutorials/ovn-ipsec.rst \
Documentation/tutorials/ovn-rbac.rst \
+   Documentation/tutorials/ovn-interconnection.rst \
Documentation/topics/index.rst \
Documentation/topics/testing.rst \
Documentation/topics/high-availability.rst \
diff --git a/Documentation/tutorials/index.rst b/Documentation/tutorials/index.rst
index 1cf083e..4ff6e16 100644
--- a/Documentation/tutorials/index.rst
+++ b/Documentation/tutorials/index.rst
@@ -43,3 +43,4 @@ vSwitch.
ovn-openstack
ovn-rbac
ovn-ipsec
+   ovn-interconnection
diff --git a/Documentation/tutorials/ovn-interconnection.rst b/Documentation/tutorials/ovn-interconnection.rst
new file mode 100644
index 0000000..2f9d6d7
--- /dev/null
+++ b/Documentation/tutorials/ovn-interconnection.rst
@@ -0,0 +1,188 @@
+..
+  Licensed under the Apache License, Version 2.0 (the "License"); you may
+  not use this file except in compliance with the License. You may obtain
+  a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in OVN documentation:
+
+  ===  Heading 0 (reserved for the title in a document)
+  ---  Heading 1
+  ~~~  Heading 2
+  +++  Heading 3
+  '''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+===
+OVN Interconnection
+===
+
+This document provides a guide for interconnecting multiple OVN deployments
+with OVN managed tunneling.  More details about the OVN Interconnection design
+can be found in the ``ovn-architecture``\(7) manpage.
+
+This document assumes two or more OVN deployments are set up and running
+normally, possibly at different data-centers, and that the gateway chassis of
+each OVN deployment have IP addresses that are reachable from each other.
+
+Setup Interconnection Databases
+---
+
+To interconnect different OVNs, you need to create global OVSDB databases that
+store interconnection data.  The databases can be set up on any nodes that are
+accessible from all the central nodes of each OVN deployment.  It is
+recommended that the global databases are set up with HA, with nodes in
+different availability zones, to avoid a single point of failure.
+
+1. Install OVN packages on each global database node.
+
+2. Start OVN IC-NB and IC-SB databases.
+
+   On each global database node ::
+
+$ ovn-ctl [options] start_ic_ovsdb
+
+   Options depend on the HA mode you use.  To start standalone mode with TCP
+   connections, use ::
+
+$ ovn-ctl --db-ic-nb-create-insecure-remote=yes \
+  --db-ic-sb-create-insecure-remote=yes start_ic_ovsdb
+
+   This command starts IC database servers that accept both unix socket and
+   TCP connections.  For other modes, see more details with ::
+
+$ ovn-ctl --help
+
+Register OVN to Interconnection Databases
+-
+
+For each OVN deployment, set an availability zone name ::
+
+$ ovn-nbctl set NB_Global . name=
+
+The name should be unique across all OVN deployments, e.g. ovn-east,
+ovn-west, etc.
+
+For each OVN deployment, start the ``ovn-ic`` daemon on central nodes ::
+
+$ ovn-ctl --ovn-ic-nb-db= --ovn-ic-sb-db= \
+  --ovn-northd-nb-db= --ovn-northd-sb-db= [more options] start_ic
+
+An example of  is ``tcp::6645``, or for
+clustered DB: ``tcp::6645,tcp::6645,tcp::6645``.
+ is similar, but usually with a different port number, typically,
+6646.
+
+For  and , use same connection methods as for starting
+``northd``.
+
+Verify each OVN registration in the global IC-SB database using
+``ovn-ic-sbctl``, either on a global DB node or on another node with the
+proper DB connection method specified in options ::
+
+$ ovn-ic-sbctl show
+
+Configure Gateways
+--
+
+For each OVN 

[ovs-dev] [PATCH ovn v4 11/13] ovn-ctl: Refactor to reduce redundant code.

2020-01-29 Thread Han Zhou
This patch reduces redundant code in preparation for the next patch,
which adds support for interconnection-related DBs and daemon.

Signed-off-by: Han Zhou 
---
 utilities/ovn-ctl | 61 +++
 1 file changed, 35 insertions(+), 26 deletions(-)

diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index 576a983..2e4e773 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -42,16 +42,18 @@ pidfile_is_running () {
 test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && pid_exists "$pid"
 } >/dev/null 2>&1
 
-stop_nb_ovsdb() {
-if pidfile_is_running $DB_NB_PID; then
-ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl exit
+stop_xx_ovsdb() {
+if pidfile_is_running $1; then
+ovn-appctl -t $OVN_RUNDIR/$2 exit
 fi
 }
 
+stop_nb_ovsdb() {
+stop_xx_ovsdb $DB_NB_PID ovnnb_db.ctl
+}
+
 stop_sb_ovsdb() {
-if pidfile_is_running $DB_SB_PID; then
-ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl exit
-fi
+stop_xx_ovsdb $DB_SB_PID ovnsb_db.ctl
 }
 
 stop_ovsdb () {
@@ -59,42 +61,49 @@ stop_ovsdb () {
 stop_sb_ovsdb
 }
 
-demote_ovnnb() {
-if test ! -z "$DB_NB_SYNC_FROM_ADDR"; then
-echo "$DB_NB_SYNC_FROM_PROTO:$DB_NB_SYNC_FROM_ADDR:$DB_NB_SYNC_FROM_PORT" > $ovnnb_active_conf_file
+demote_xx_ovsdb () {
+local sync_from_addr=$1
+local sync_from_proto=$2
+local sync_from_port=$3
+local active_conf_file=$4
+local ctl_file=$5
+
+if test ! -z "$sync_from_addr"; then
+echo "$sync_from_proto:$sync_from_addr:$sync_from_port" > $active_conf_file
 fi
 
-if test -e $ovnnb_active_conf_file; then
-ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl ovsdb-server/set-active-ovsdb-server `cat $ovnnb_active_conf_file`
-ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl ovsdb-server/connect-active-ovsdb-server
+if test -e $active_conf_file; then
+ovn-appctl -t $OVN_RUNDIR/$ctl_file ovsdb-server/set-active-ovsdb-server `cat $active_conf_file`
+ovn-appctl -t $OVN_RUNDIR/$ctl_file ovsdb-server/connect-active-ovsdb-server
 else
 echo >&2 "$0: active server details not set"
 exit 1
 fi
 }
 
+demote_ovnnb() {
+demote_xx_ovsdb $DB_NB_SYNC_FROM_ADDR $DB_NB_SYNC_FROM_PROTO \
+$DB_NB_SYNC_FROM_PORT $ovnnb_active_conf_file ovnnb_db.ctl
+}
+
 demote_ovnsb() {
-if test ! -z "$DB_SB_SYNC_FROM_ADDR"; then
-echo "$DB_SB_SYNC_FROM_PROTO:$DB_SB_SYNC_FROM_ADDR:$DB_SB_SYNC_FROM_PORT" > $ovnsb_active_conf_file
-fi
+demote_xx_ovsdb $DB_SB_SYNC_FROM_ADDR $DB_SB_SYNC_FROM_PROTO \
+$DB_SB_SYNC_FROM_PORT $ovnsb_active_conf_file ovnsb_db.ctl
+}
 
-if test -e $ovnsb_active_conf_file; then
-ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl ovsdb-server/set-active-ovsdb-server `cat $ovnsb_active_conf_file`
-ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl ovsdb-server/connect-active-ovsdb-server
-else
-echo >&2 "$0: active server details not set"
-exit 1
-fi
+promote_xx_ovsdb() {
+local active_conf_file=$1
+local ctl_file=$2
+rm -f $active_conf_file
+ovn-appctl -t $OVN_RUNDIR/$ctl_file ovsdb-server/disconnect-active-ovsdb-server
 }
 
 promote_ovnnb() {
-rm -f $ovnnb_active_conf_file
-ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl ovsdb-server/disconnect-active-ovsdb-server
+promote_xx_ovsdb $ovnnb_active_conf_file ovnnb_db.ctl
 }
 
 promote_ovnsb() {
-rm -f $ovnsb_active_conf_file
-ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl ovsdb-server/disconnect-active-ovsdb-server
+promote_xx_ovsdb $ovnsb_active_conf_file ovnsb_db.ctl
 }
 
 start_ovsdb__() {
-- 
2.1.0


[ovs-dev] [PATCH ovn v4 03/13] ovn-ic-sb: Interconnection southbound DB schema and CLI.

2020-01-29 Thread Han Zhou
This patch introduces OVN_IC_Southbound DB schema and the CLI
ovn-ic-sbctl that manages the DB.

Signed-off-by: Han Zhou 
---
 .gitignore   |3 +
 automake.mk  |   36 ++
 debian/ovn-common.install|1 +
 debian/ovn-common.manpages   |2 +
 lib/.gitignore   |3 +
 lib/automake.mk  |   17 +-
 lib/ovn-ic-sb-idl.ann|9 +
 lib/ovn-util.c   |   13 +
 lib/ovn-util.h   |1 +
 ovn-ic-sb.ovsschema  |  129 ++
 ovn-ic-sb.xml|  582 
 tests/automake.mk|2 +
 tests/ovn-ic-sbctl.at|  112 +
 tests/testsuite.at   |1 +
 utilities/.gitignore |2 +
 utilities/automake.mk|8 +
 utilities/ovn-ic-sbctl.8.xml |  148 ++
 utilities/ovn-ic-sbctl.c | 1017 ++
 18 files changed, 2085 insertions(+), 1 deletion(-)
 create mode 100644 lib/ovn-ic-sb-idl.ann
 create mode 100644 ovn-ic-sb.ovsschema
 create mode 100644 ovn-ic-sb.xml
 create mode 100644 tests/ovn-ic-sbctl.at
 create mode 100644 utilities/ovn-ic-sbctl.8.xml
 create mode 100644 utilities/ovn-ic-sbctl.c

diff --git a/.gitignore b/.gitignore
index d4f8c10..7ca9b38 100644
--- a/.gitignore
+++ b/.gitignore
@@ -70,6 +70,9 @@
 /ovn-ic-nb.5
 /ovn-ic-nb.gv
 /ovn-ic-nb.pic
+/ovn-ic-sb.5
+/ovn-ic-sb.gv
+/ovn-ic-sb.pic
 /package.m4
 /stamp-h1
 /_build-gcc
diff --git a/automake.mk b/automake.mk
index 59063c2..2b0bb4b 100644
--- a/automake.mk
+++ b/automake.mk
@@ -91,6 +91,35 @@ ovn-ic-nb.5: \
$(srcdir)/ovn-ic-nb.xml > $@.tmp && \
mv $@.tmp $@
 
+# OVN interconnection southbound E-R diagram
+#
+# If "python" or "dot" is not available, then we do not add graphical diagram
+# to the documentation.
+if HAVE_DOT
+ovn-ic-sb.gv: ${OVSDIR}/ovsdb/ovsdb-dot.in $(srcdir)/ovn-ic-sb.ovsschema
+   $(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn-ic-sb.ovsschema > $@
+ovn-ic-sb.pic: ovn-ic-sb.gv ${OVSDIR}/ovsdb/dot2pic
+   $(AM_V_GEN)(dot -T plain < ovn-ic-sb.gv | $(PYTHON) ${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \
+   mv $@.tmp $@
+OVN_IC_SB_PIC = ovn-ic-sb.pic
+OVN_IC_SB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_IC_SB_PIC)
+CLEANFILES += ovn-ic-sb.gv ovn-ic-sb.pic
+endif
+
+# OVN interconnection southbound schema documentation
+EXTRA_DIST += ovn-ic-sb.xml
+CLEANFILES += ovn-ic-sb.5
+man_MANS += ovn-ic-sb.5
+
+ovn-ic-sb.5: \
+   ${OVSDIR}/ovsdb/ovsdb-doc $(srcdir)/ovn-ic-sb.xml $(srcdir)/ovn-ic-sb.ovsschema $(OVN_IC_SB_PIC)
+   $(AM_V_GEN)$(OVSDB_DOC) \
+   $(OVN_IC_SB_DOT_DIAGRAM_ARG) \
+   --version=$(VERSION) \
+   $(srcdir)/ovn-ic-sb.ovsschema \
+   $(srcdir)/ovn-ic-sb.xml > $@.tmp && \
+   mv $@.tmp $@
+
 # Version checking for ovn-nb.ovsschema.
 ALL_LOCAL += ovn-nb.ovsschema.stamp
 ovn-nb.ovsschema.stamp: ovn-nb.ovsschema
@@ -108,8 +137,15 @@ ovn-ic-nb.ovsschema.stamp: ovn-ic-nb.ovsschema
$(srcdir)/build-aux/cksum-schema-check $? $@
 CLEANFILES += ovn-ic-nb.ovsschema.stamp
 
+# Version checking for ovn-ic-sb.ovsschema.
+ALL_LOCAL += ovn-ic-sb.ovsschema.stamp
+ovn-ic-sb.ovsschema.stamp: ovn-ic-sb.ovsschema
+   $(srcdir)/build-aux/cksum-schema-check $? $@
+CLEANFILES += ovn-ic-sb.ovsschema.stamp
+
 pkgdata_DATA += ovn-nb.ovsschema
 pkgdata_DATA += ovn-sb.ovsschema
 pkgdata_DATA += ovn-ic-nb.ovsschema
+pkgdata_DATA += ovn-ic-sb.ovsschema
 
 CLEANFILES += ovn-sb.ovsschema.stamp
diff --git a/debian/ovn-common.install b/debian/ovn-common.install
index 59b8018..e3c3c00 100644
--- a/debian/ovn-common.install
+++ b/debian/ovn-common.install
@@ -1,6 +1,7 @@
 usr/bin/ovn-nbctl
 usr/bin/ovn-sbctl
 usr/bin/ovn-ic-nbctl
+usr/bin/ovn-ic-sbctl
 usr/bin/ovn-trace
 usr/bin/ovn-detrace
 usr/share/openvswitch/scripts/ovn-ctl
diff --git a/debian/ovn-common.manpages b/debian/ovn-common.manpages
index e7d3e4d..ba0fe8a 100644
--- a/debian/ovn-common.manpages
+++ b/debian/ovn-common.manpages
@@ -2,9 +2,11 @@ ovn/ovn-architecture.7
 ovn/ovn-nb.5
 ovn/ovn-sb.5
 ovn/ovn-ic-nb.5
+ovn/ovn-ic-sb.5
 ovn/utilities/ovn-ctl.8
 ovn/utilities/ovn-nbctl.8
 ovn/utilities/ovn-sbctl.8
 ovn/utilities/ovn-ic-nbctl.8
+ovn/utilities/ovn-ic-sbctl.8
 ovn/utilities/ovn-trace.8
 ovn/utilities/ovn-detrace.1
diff --git a/lib/.gitignore b/lib/.gitignore
index 3af2923..7f67f1d 100644
--- a/lib/.gitignore
+++ b/lib/.gitignore
@@ -8,4 +8,7 @@
 /ovn-ic-nb-idl.c
 /ovn-ic-nb-idl.h
 /ovn-ic-nb-idl.ovsidl
+/ovn-ic-sb-idl.c
+/ovn-ic-sb-idl.h
+/ovn-ic-sb-idl.ovsidl
 /ovn-dirs.c
diff --git a/lib/automake.mk b/lib/automake.mk
index 5f6561a..f3e9c88 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -31,7 +31,9 @@ nodist_lib_libovn_la_SOURCES = \
lib/ovn-sb-idl.c \
lib/ovn-sb-idl.h \
lib/ovn-ic-nb-idl.c \
-   lib/ovn-ic-nb-idl.h
+   lib/ovn-ic-nb-idl.h \
+   lib/ovn-ic-sb-idl.c \
+   lib/ovn-ic-sb-idl.h
 
 CLEANFILES += 

[ovs-dev] [PATCH ovn v4 09/13] ovn-ic: Interconnection port controller.

2020-01-29 Thread Han Zhou
Sync interconnection logical ports and bindings between NB, SB
and IC-SB.  With this patch, the OVN interconnection works end to
end.

Signed-off-by: Han Zhou 
---
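
Reviewer note: a minimal standalone sketch of the port-type test that the
controller/binding.c hunk below introduces, assuming only what that hunk
shows (a regular VIF has an empty type, and "remote" ports are now exempt).
The helper name port_type_forces_recompute is hypothetical and does not
exist in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when a changed Port_Binding of the
 * given type should, by its type alone, force a recompute.  Regular VIFs
 * (empty type) and "remote" ports do not. */
static bool
port_type_forces_recompute(const char *type)
{
    assert(type != NULL);
    return strcmp(type, "") && strcmp(type, "remote");
}
```

Each strcmp() returns nonzero when the strings differ, so the expression is
true only when the type is neither empty nor "remote".
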
 controller/binding.c   |   6 +-
 ic/ovn-ic.c| 385 -
 lib/ovn-util.c |   7 +
 lib/ovn-util.h |   2 +
 northd/ovn-northd.c|  69 +++--
 ovn-architecture.7.xml |   2 +-
 ovn-nb.xml |  35 -
 tests/ovn-ic.at|  65 +
 8 files changed, 552 insertions(+), 19 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 6e752db..c3376e2 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -834,11 +834,13 @@ binding_evaluate_port_binding_changes(
  * - If a regular VIF is unbound from this chassis, the local ovsdb
  *   interface table will be updated, which will trigger recompute.
  *
- * - If the port is not a regular VIF, always trigger recompute. */
+ * - If the port is not a regular VIF, and not a "remote" port,
+ *   always trigger recompute. */
 if (binding_rec->chassis == chassis_rec
 || is_our_chassis(chassis_rec, binding_rec,
   active_tunnels, _to_iface, local_lports)
-|| strcmp(binding_rec->type, "")) {
+|| (strcmp(binding_rec->type, "") && strcmp(binding_rec->type,
+"remote"))) {
 changed = true;
 break;
 }
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index f352036..25ca3f7 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -60,6 +60,10 @@ struct ic_context {
 struct ovsdb_idl_txn *ovnsb_txn;
 struct ovsdb_idl_txn *ovninb_txn;
 struct ovsdb_idl_txn *ovnisb_txn;
+struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *sbrec_chassis_by_name;
+struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
 };
 
 struct ic_state {
@@ -182,7 +186,7 @@ ts_run(struct ic_context *ctx)
 }
 }
 
-/* Create NB Logical_Switch for each TS */
+/* Create/update NB Logical_Switch for each TS */
 ICNBREC_TRANSIT_SWITCH_FOR_EACH (ts, ctx->ovninb_idl) {
 ls = shash_find_and_delete(_tses, ts->name);
 if (!ls) {
@@ -394,6 +398,366 @@ gateway_run(struct ic_context *ctx, const struct icsbrec_availability_zone *az)
 shash_destroy(_gws);
 }
 
+static const struct nbrec_logical_switch *
+find_ts_in_nb(struct ic_context *ctx, char *ts_name)
+{
+const struct nbrec_logical_switch *key =
+nbrec_logical_switch_index_init_row(ctx->nbrec_ls_by_name);
+nbrec_logical_switch_index_set_name(key, ts_name);
+
+const struct nbrec_logical_switch *ls;
+bool found = false;
+NBREC_LOGICAL_SWITCH_FOR_EACH_EQUAL (ls, key, ctx->nbrec_ls_by_name) {
+const char *ls_ts_name = smap_get(&ls->other_config, "interconn-ts");
+if (ls_ts_name && !strcmp(ts_name, ls_ts_name)) {
+found = true;
+break;
+}
+}
+nbrec_logical_switch_index_destroy_row(key);
+
+if (found) {
+return ls;
+}
+return NULL;
+}
+
+static const struct sbrec_port_binding *
+find_sb_pb_by_name(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+   const char *name)
+{
+const struct sbrec_port_binding *key =
+sbrec_port_binding_index_init_row(sbrec_port_binding_by_name);
+sbrec_port_binding_index_set_logical_port(key, name);
+
+const struct sbrec_port_binding *pb =
+sbrec_port_binding_index_find(sbrec_port_binding_by_name, key);
+sbrec_port_binding_index_destroy_row(key);
+
+return pb;
+}
+
+static const struct sbrec_port_binding *
+find_peer_port(struct ic_context *ctx,
+   const struct sbrec_port_binding *sb_pb)
+{
+const char *peer_name = smap_get(&sb_pb->options, "peer");
+if (!peer_name) {
+return NULL;
+}
+
+return find_sb_pb_by_name(ctx->sbrec_port_binding_by_name, peer_name);
+}
+
+static const struct sbrec_port_binding *
+find_crp_from_lrp(struct ic_context *ctx,
+  const struct sbrec_port_binding *lrp_pb)
+{
+char *crp_name = ovn_chassis_redirect_name(lrp_pb->logical_port);
+
+const struct sbrec_port_binding *pb =
+find_sb_pb_by_name(ctx->sbrec_port_binding_by_name, crp_name);
+
+free(crp_name);
+return pb;
+}
+
+static const struct sbrec_port_binding *
+find_crp_for_sb_pb(struct ic_context *ctx,
+   const struct sbrec_port_binding *sb_pb)
+{
+const struct sbrec_port_binding *peer = find_peer_port(ctx, sb_pb);
+if (!peer) {
+return NULL;
+}
+
+return find_crp_from_lrp(ctx, peer);
+}
+
+static const char *
+get_lrp_address_for_sb_pb(struct ic_context *ctx,
+  const struct sbrec_port_binding *sb_pb)
+{
+const struct sbrec_port_binding *peer = 

[ovs-dev] [PATCH ovn v4 08/13] ovn-ic: Interconnection gateway controller.

2020-01-29 Thread Han Zhou
Sync local and remote gateways between SB and IC-SB.

Signed-off-by: Han Zhou 
---
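
Reviewer note: the order-insensitive encap comparison performed by
is_gateway_data_changed() in the diff below can be sketched with plain
structs.  This is a simplified illustration, not the code in the patch: the
struct and function names are hypothetical, the per-encap options map is
left out, and each set is assumed to contain no duplicate (type, ip) pairs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an encap row. */
struct encap {
    const char *type;
    const char *ip;
};

/* Returns true if the two encap sets differ, ignoring element order. */
static bool
encaps_differ(const struct encap *a, size_t n_a,
              const struct encap *b, size_t n_b)
{
    assert((a || !n_a) && (b || !n_b));
    if (n_a != n_b) {
        return true;
    }
    for (size_t i = 0; i < n_a; i++) {
        bool found = false;
        for (size_t j = 0; j < n_b; j++) {
            if (!strcmp(a[i].type, b[j].type) && !strcmp(a[i].ip, b[j].ip)) {
                found = true;
                break;
            }
        }
        if (!found) {
            return true;    /* a[i] has no counterpart in b. */
        }
    }
    return false;
}
```

With equal counts and no duplicates, finding every element of one set in
the other is enough to conclude that the sets are equal, which is why a
single pass in one direction suffices.
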
 ic/ovn-ic.c | 147 
 tests/ovn-ic.at |  55 +
 2 files changed, 202 insertions(+)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 82bd86e..f352036 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -248,6 +248,152 @@ ts_run(struct ic_context *ctx)
 shash_destroy(_dps);
 }
 
+/* Returns true if any information in gw and chassis is different. */
+static bool
+is_gateway_data_changed(const struct icsbrec_gateway *gw,
+   const struct sbrec_chassis *chassis)
+{
+if (strcmp(gw->hostname, chassis->hostname)) {
+return true;
+}
+
+if (gw->n_encaps != chassis->n_encaps) {
+return true;
+}
+
+for (int g = 0; g < gw->n_encaps; g++) {
+
+bool found = false;
+const struct icsbrec_encap *gw_encap = gw->encaps[g];
+for (int s = 0; s < chassis->n_encaps; s++) {
+const struct sbrec_encap *chassis_encap = chassis->encaps[s];
+if (!strcmp(gw_encap->type, chassis_encap->type) &&
+!strcmp(gw_encap->ip, chassis_encap->ip)) {
+found = true;
+if (!smap_equal(&gw_encap->options, &chassis_encap->options)) {
+return true;
+}
+break;
+}
+}
+if (!found) {
+return true;
+}
+}
+
+return false;
+}
+
+static void
+sync_isb_gw_to_sb(struct ic_context *ctx,
+  const struct icsbrec_gateway *gw,
+  const struct sbrec_chassis *chassis)
+{
+sbrec_chassis_set_hostname(chassis, gw->hostname);
+sbrec_chassis_update_external_ids_setkey(chassis, "is-remote", "true");
+
+/* Sync encaps used by this gateway. */
+ovs_assert(gw->n_encaps);
+struct sbrec_encap *sb_encap;
+struct sbrec_encap **sb_encaps =
+xmalloc(gw->n_encaps * sizeof *sb_encaps);
+for (int i = 0; i < gw->n_encaps; i++) {
+sb_encap = sbrec_encap_insert(ctx->ovnsb_txn);
+sbrec_encap_set_chassis_name(sb_encap, gw->name);
+sbrec_encap_set_ip(sb_encap, gw->encaps[i]->ip);
+sbrec_encap_set_type(sb_encap, gw->encaps[i]->type);
+sbrec_encap_set_options(sb_encap, &gw->encaps[i]->options);
+sb_encaps[i] = sb_encap;
+}
+sbrec_chassis_set_encaps(chassis, sb_encaps, gw->n_encaps);
+free(sb_encaps);
+}
+
+static void
+sync_sb_gw_to_isb(struct ic_context *ctx,
+  const struct sbrec_chassis *chassis,
+  const struct icsbrec_gateway *gw)
+{
+icsbrec_gateway_set_hostname(gw, chassis->hostname);
+
+/* Sync encaps used by this chassis. */
+ovs_assert(chassis->n_encaps);
+struct icsbrec_encap *isb_encap;
+struct icsbrec_encap **isb_encaps =
+xmalloc(chassis->n_encaps * sizeof *isb_encaps);
+for (int i = 0; i < chassis->n_encaps; i++) {
+isb_encap = icsbrec_encap_insert(ctx->ovnisb_txn);
+icsbrec_encap_set_gateway_name(isb_encap,
+  chassis->name);
+icsbrec_encap_set_ip(isb_encap, chassis->encaps[i]->ip);
+icsbrec_encap_set_type(isb_encap,
+  chassis->encaps[i]->type);
+icsbrec_encap_set_options(isb_encap,
+ &chassis->encaps[i]->options);
+isb_encaps[i] = isb_encap;
+}
+icsbrec_gateway_set_encaps(gw, isb_encaps,
+  chassis->n_encaps);
+free(isb_encaps);
+}
+
+static void
+gateway_run(struct ic_context *ctx, const struct icsbrec_availability_zone *az)
+{
+if (!ctx->ovnisb_txn || !ctx->ovnsb_txn) {
+return;
+}
+
+struct shash local_gws = SHASH_INITIALIZER(&local_gws);
+struct shash remote_gws = SHASH_INITIALIZER(&remote_gws);
+const struct icsbrec_gateway *gw;
+ICSBREC_GATEWAY_FOR_EACH (gw, ctx->ovnisb_idl) {
+if (gw->availability_zone == az) {
+shash_add(&local_gws, gw->name, gw);
+} else {
+shash_add(&remote_gws, gw->name, gw);
+}
+}
+
+const struct sbrec_chassis *chassis;
+SBREC_CHASSIS_FOR_EACH (chassis, ctx->ovnsb_idl) {
+if (smap_get_bool(&chassis->external_ids, "is-interconn", false)) {
+gw = shash_find_and_delete(&local_gws, chassis->name);
+if (!gw) {
+gw = icsbrec_gateway_insert(ctx->ovnisb_txn);
+icsbrec_gateway_set_availability_zone(gw, az);
+icsbrec_gateway_set_name(gw, chassis->name);
+sync_sb_gw_to_isb(ctx, chassis, gw);
+} else if (is_gateway_data_changed(gw, chassis)) {
+sync_sb_gw_to_isb(ctx, chassis, gw);
+}
+} else if (smap_get_bool(&chassis->external_ids, "is-remote", false)) {
+gw = shash_find_and_delete(&remote_gws, chassis->name);
+if (!gw) {
+sbrec_chassis_delete(chassis);
+} else if 

[ovs-dev] [PATCH ovn v4 07/13] ovn-sb: Add keys is_interconn and is_remote to Chassis's external_ids.

2020-01-29 Thread Han Zhou
Support the new keys in external_ids column of Chassis table for
OVN interconnection.  Also, populate the is-interconn key according
to external_ids:ovn-is-interconn key of Open_vSwitch table on the
chassis.

This patch also avoids creating tunnels or BFD sessions with remote
chassis.

Signed-off-by: Han Zhou 
---
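
Reviewer note: the filtering added to bfd_calculate_chassis() in the diff
below (skip reference chassis marked "is-remote") can be sketched as
follows.  This is a simplified model under stated assumptions: the struct
and function names are hypothetical, and the external_ids lookup is reduced
to a plain boolean field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a Chassis row. */
struct chassis {
    const char *name;
    bool is_remote;     /* Models external_ids:is-remote. */
};

/* Copies the names of all non-remote chassis into 'out', which must have
 * room for 'n_ref' entries.  Returns the number of names stored. */
static size_t
collect_bfd_chassis(const struct chassis *ref, size_t n_ref,
                    const char **out)
{
    size_t n = 0;

    assert((ref && out) || !n_ref);
    for (size_t i = 0; i < n_ref; i++) {
        if (ref[i].is_remote) {
            continue;   /* Remote chassis never get a BFD session. */
        }
        out[n++] = ref[i].name;
    }
    return n;
}
```
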
 controller/bfd.c|  6 +-
 controller/chassis.c| 25 +++--
 controller/encaps.c | 15 ---
 controller/encaps.h |  3 ++-
 controller/ovn-controller.8.xml |  6 ++
 controller/ovn-controller.c |  2 +-
 northd/ovn-northd.c |  4 +++-
 ovn-sb.xml  | 15 +++
 8 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/controller/bfd.c b/controller/bfd.c
index 10cd5fc..2b1e87f 100644
--- a/controller/bfd.c
+++ b/controller/bfd.c
@@ -151,7 +151,11 @@ bfd_calculate_chassis(
 if (is_ha_chassis) {
 /* It's an HA chassis. So add the ref_chassis to the bfd set. */
 for (size_t i = 0; i < ha_chassis_grp->n_ref_chassis; i++) {
-sset_add(_chassis, ha_chassis_grp->ref_chassis[i]->name);
+struct sbrec_chassis *ref_ch = ha_chassis_grp->ref_chassis[i];
+if (smap_get_bool(&ref_ch->external_ids, "is-remote", false)) {
+continue;
+}
+sset_add(_chassis, ref_ch->name);
 }
 } else {
 /* This is not an HA chassis. Check if this chassis is present
diff --git a/controller/chassis.c b/controller/chassis.c
index 978273e..522893e 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -92,6 +92,8 @@ struct ovs_chassis_cfg {
 struct sset encap_ip_set;
 /* Interface type list formatted in the OVN-SB Chassis required format. */
 struct ds iface_types;
+/* Is this chassis an interconnection gateway. */
+bool is_interconn;
 };
 
 static void
@@ -172,6 +174,12 @@ get_datapath_type(const struct ovsrec_bridge *br_int)
 return "";
 }
 
+static bool
+get_is_interconn(const struct smap *ext_ids)
+{
+return smap_get_bool(ext_ids, "ovn-is-interconn", false);
+}
+
 static void
 update_chassis_transport_zones(const struct sset *transport_zones,
const struct sbrec_chassis *chassis_rec)
@@ -285,19 +293,23 @@ chassis_parse_ovs_config(const struct ovsrec_open_vswitch_table *ovs_table,
 sset_destroy(&ovs_cfg->encap_ip_set);
 }
 
+ovs_cfg->is_interconn = get_is_interconn(>external_ids);
+
 return true;
 }
 
 static void
 chassis_build_external_ids(struct smap *ext_ids, const char *bridge_mappings,
const char *datapath_type, const char *cms_options,
-   const char *chassis_macs, const char *iface_types)
+   const char *chassis_macs, const char *iface_types,
+   bool is_interconn)
 {
 smap_replace(ext_ids, "ovn-bridge-mappings", bridge_mappings);
 smap_replace(ext_ids, "datapath-type", datapath_type);
 smap_replace(ext_ids, "ovn-cms-options", cms_options);
 smap_replace(ext_ids, "iface-types", iface_types);
 smap_replace(ext_ids, "ovn-chassis-mac-mappings", chassis_macs);
+smap_replace(ext_ids, "is-interconn", is_interconn ? "true" : "false");
 }
 
 /*
@@ -309,6 +321,7 @@ chassis_external_ids_changed(const char *bridge_mappings,
  const char *cms_options,
  const char *chassis_macs,
  const struct ds *iface_types,
+ bool is_interconn,
  const struct sbrec_chassis *chassis_rec)
 {
 const char *chassis_bridge_mappings =
@@ -345,6 +358,12 @@ chassis_external_ids_changed(const char *bridge_mappings,
 return true;
 }
 
+bool chassis_is_interconn =
+smap_get_bool(&chassis_rec->external_ids, "is-interconn", false);
+if (chassis_is_interconn != is_interconn) {
+return true;
+}
+
 return false;
 }
 
@@ -524,6 +543,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
  ovs_cfg->cms_options,
  ovs_cfg->chassis_macs,
 &ovs_cfg->iface_types,
+ ovs_cfg->is_interconn,
  chassis_rec)) {
 struct smap ext_ids;
 
@@ -532,7 +552,8 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
ovs_cfg->datapath_type,
ovs_cfg->cms_options,
ovs_cfg->chassis_macs,
-   ds_cstr_ro(&ovs_cfg->iface_types));
+   ds_cstr_ro(&ovs_cfg->iface_types),
+   ovs_cfg->is_interconn);
 

[ovs-dev] [PATCH ovn v4 10/13] ovn.at: e2e test for OVN interconnection.

2020-01-29 Thread Han Zhou
Test with 5 AZs, each with 1 GW, 1 HV, 5 VIFs, 5 LRs, connected to
5 transit switches. Verify traffic through each TS between each pair
of AZs.

Signed-off-by: Han Zhou 
---
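
Reviewer note: the addressing plan and the number of injected packets in
the test below can be checked with a short C sketch.  The function names
are hypothetical; the formats are taken from the shell variables in the
test (lsp_ip=10.$az.$i.123 and lsp_mac=00:00:00:0$az:0$i:00).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Number of (source AZ, destination AZ, transit switch) combinations with
 * source != destination, i.e. one injected packet per combination. */
static int
n_injected_packets(int n_az, int n_ts)
{
    return n_az * (n_az - 1) * n_ts;
}

/* Formats the VIF address "10.<az>.<ts>.123" used for lsp<az>-<ts>. */
static void
format_vif_ip(char *buf, size_t size, int az, int ts)
{
    snprintf(buf, size, "10.%d.%d.123", az, ts);
}

/* Formats the VIF MAC "00:00:00:0<az>:0<ts>:00".  The "0%d" pattern only
 * yields a valid MAC for single-digit indexes, hence the assertion. */
static void
format_vif_mac(char *buf, size_t size, int az, int ts)
{
    assert(az >= 1 && az <= 9 && ts >= 1 && ts <= 9);
    snprintf(buf, size, "00:00:00:0%d:0%d:00", az, ts);
}
```

With n_az=5 and n_ts=5 this gives 5 * 4 * 5 = 100 packets.  The MAC scheme
is one reason the test would need adjusting if n_az or n_ts grew past 9.
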
 tests/ovn.at | 147 +++
 1 file changed, 147 insertions(+)

diff --git a/tests/ovn.at b/tests/ovn.at
index f130441..1ae26fd 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -18022,6 +18022,153 @@ ovn-nbctl --wait=hv --timeout=3 sync
 AT_CHECK([ovn-trace --ovs lsw0 'inport == "lp1" && eth.type == 0x1234' | grep "dl_type=0x1234 actions="], [0], [ignore])
 
 OVN_CLEANUP([hv1])
+AT_CLEANUP
+
+AT_SETUP([ovn -- interconnection])
+ovn_init_ic_db
+n_az=5
+n_ts=5
+for i in `seq 1 $n_az`; do
+ovn_start az$i
+done
+
+net_add n1
+
+# 1 HV and 1 GW per AZ
+for az in `seq 1 $n_az`; do
+sim_add hv$az
+as hv$az
+ovs-vsctl add-br br-phys
+ovn_az_attach az$az n1 br-phys 192.168.$az.1 16
+for p in `seq 1 $n_ts`; do
+ovs-vsctl -- add-port br-int vif$p -- \
+set interface vif$p external-ids:iface-id=lsp$az-$p \
+options:tx_pcap=hv$az/vif$p-tx.pcap \
+options:rxq_pcap=hv$az/vif$p-rx.pcap \
+ofport-request=$p
+done
+
+sim_add gw$az
+as gw$az
+ovs-vsctl add-br br-phys
+ovn_az_attach az$az n1 br-phys 192.168.$az.2 16
+ovs-vsctl set open . external-ids:ovn-is-interconn=true
+done
+
+for ts in `seq 1 $n_ts`; do
+ovn-ic-nbctl create Transit_Switch name=ts$ts
+done
+
+for az in `seq 1 $n_az`; do
+ovn_as az$az
+
+# Each AZ has n_ts LSPi->LSi->LRi connecting to each TSi
+for i in `seq 1 $n_ts`; do
+lsp_mac=00:00:00:0$az:0$i:00
+lrp_ls_mac=00:00:00:0$az:0$i:01
+lrp_ts_mac=00:00:00:0$az:0$i:02
+lsp_ip=10.$az.$i.123
+lrp_ls_ip=10.$az.$i.1
+lrp_ts_ip=169.254.$i.$az
+
+ovn-nbctl ls-add ls$az-$i
+ovn-nbctl lsp-add ls$az-$i lsp$az-$i
+ovn-nbctl lsp-set-addresses lsp$az-$i "$lsp_mac $lsp_ip"
+
+ovn-nbctl lr-add lr$az-$i
+
+ovn-nbctl lrp-add lr$az-$i lrp-lr$az-$i-ls$az-$i $lrp_ls_mac $lrp_ls_ip/24
+ovn-nbctl lsp-add ls$az-$i lsp-ls$az-$i-lr$az-$i
+ovn-nbctl lsp-set-addresses lsp-ls$az-$i-lr$az-$i router
+ovn-nbctl lsp-set-type lsp-ls$az-$i-lr$az-$i router
+ovn-nbctl lsp-set-options lsp-ls$az-$i-lr$az-$i router-port=lrp-lr$az-$i-ls$az-$i
+
+ovn-nbctl lrp-add lr$az-$i lrp-lr$az-$i-ts$i $lrp_ts_mac $lrp_ts_ip/24
+ovn-nbctl lsp-add ts$i lsp-ts$i-lr$az-$i
+ovn-nbctl lsp-set-addresses lsp-ts$i-lr$az-$i router
+ovn-nbctl lsp-set-type lsp-ts$i-lr$az-$i router
+ovn-nbctl lsp-set-options lsp-ts$i-lr$az-$i router-port=lrp-lr$az-$i-ts$i
+ovn-nbctl lrp-set-gateway-chassis lrp-lr$az-$i-ts$i gw$az
+
+for remote in `seq 1 $n_az`; do
+if test $az = $remote; then
+continue
+fi
+ovn-nbctl lr-route-add lr$az-$i 10.$remote.$i.0/24 169.254.$i.$remote
+done
+done
+done
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+OVN_POPULATE_ARP
+
+for i in `seq 1 $n_az`; do
+AT_CHECK([ovn_as az$i ovn-nbctl --timeout=3 --wait=sb sync], [0], [ignore])
+done
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+sleep 2
+
+# Send packets between AZs on each TS
+for s_az in `seq 1 $n_az`; do
+for d_az in `seq 1 $n_az`; do
+if test $s_az = $d_az; then
+continue
+fi
+
+for i in `seq 1 $n_ts`; do
+lsp_smac=00:00:00:0${s_az}:0$i:00
+lsp_dmac=00:00:00:0${d_az}:0$i:00
+lrp_ls_smac=00:00:00:0${s_az}:0$i:01
+lrp_ls_dmac=00:00:00:0${d_az}:0$i:01
+lsp_sip=10.${s_az}.$i.123
+lsp_dip=10.${d_az}.$i.123
+
+packet="inport==\"lsp${s_az}-$i\" && eth.src==$lsp_smac && eth.dst==$lrp_ls_smac &&
+ip4 && ip.ttl==64 && ip4.src==$lsp_sip && ip4.dst==$lsp_dip &&
+udp && udp.src==53 && udp.dst==4369"
+AT_CHECK([as hv${s_az} ovs-appctl -t ovn-controller inject-pkt "$packet"])
+
+# Packet to Expect
+# The TTL should be decremented by 2.
+packet="eth.src==$lrp_ls_dmac && eth.dst==$lsp_dmac &&
+ip4 && ip.ttl==62 && ip4.src==$lsp_sip && ip4.dst==$lsp_dip &&
+udp && udp.src==53 && udp.dst==4369"
+echo $packet | ovstest test-ovn expr-to-packets >> ${d_az}-$i.expected
+done
+done
+done
+
+echo "-INB dump-"
+ovn-ic-nbctl show
+echo "-"
+
+echo "-ISB dump-"
+ovn-ic-sbctl show
+echo "-"
+ovn-ic-sbctl list gateway
+echo "-"
+ovn-ic-sbctl list datapath_binding
+echo "-"
+ovn-ic-sbctl list 

[ovs-dev] [PATCH ovn v4 04/13] ovn-ic: Interconnection controller with AZ registration.

2020-01-29 Thread Han Zhou
This patch introduces the interconnection controller, ovn-ic, and
implements the basic AZ registration feature: taking the AZ
name from the NB DB and creating an Availability_Zone entry in
the IC-SB DB.

Signed-off-by: Han Zhou 
---
 Makefile.am  |   1 +
 ic/.gitignore|   2 +
 ic/automake.mk   |  10 ++
 ic/ovn-ic.8.xml  | 120 +
 ic/ovn-ic.c  | 467 +++
 ovn-nb.ovsschema |   5 +-
 ovn-nb.xml   |   7 +
 tests/automake.mk|   4 +-
 tests/ovn-ic.at  |  28 +++
 tests/ovn-macros.at  | 161 +++---
 tests/testsuite.at   |   1 +
 tutorial/ovs-sandbox |  78 -
 12 files changed, 854 insertions(+), 30 deletions(-)
 create mode 100644 ic/.gitignore
 create mode 100644 ic/automake.mk
 create mode 100644 ic/ovn-ic.8.xml
 create mode 100644 ic/ovn-ic.c
 create mode 100644 tests/ovn-ic.at

diff --git a/Makefile.am b/Makefile.am
index 8eed7a7..d524c0d 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -500,4 +500,5 @@ include selinux/automake.mk
 include controller/automake.mk
 include controller-vtep/automake.mk
 include northd/automake.mk
+include ic/automake.mk
 include build-aux/automake.mk
diff --git a/ic/.gitignore b/ic/.gitignore
new file mode 100644
index 000..1b73eb4
--- /dev/null
+++ b/ic/.gitignore
@@ -0,0 +1,2 @@
+/ovn-ic
+/ovn-ic.8
diff --git a/ic/automake.mk b/ic/automake.mk
new file mode 100644
index 000..8e71bc3
--- /dev/null
+++ b/ic/automake.mk
@@ -0,0 +1,10 @@
+# ovn-ic
+bin_PROGRAMS += ic/ovn-ic
+ic_ovn_ic_SOURCES = ic/ovn-ic.c
+ic_ovn_ic_LDADD = \
+   lib/libovn.la \
+   $(OVSDB_LIBDIR)/libovsdb.la \
+   $(OVS_LIBDIR)/libopenvswitch.la
+man_MANS += ic/ovn-ic.8
+EXTRA_DIST += ic/ovn-ic.8.xml
+CLEANFILES += ic/ovn-ic.8
diff --git a/ic/ovn-ic.8.xml b/ic/ovn-ic.8.xml
new file mode 100644
index 000..f5ef994
--- /dev/null
+++ b/ic/ovn-ic.8.xml
@@ -0,0 +1,120 @@
+
+
+Name
+ovn-ic -- Open Virtual Network interconnection controller
+
+Synopsis
+ovn-ic [options]
+
+Description
+
+  ovn-ic, OVN interconnection controller, is a centralized
+  daemon which communicates with global interconnection databases IC_NB/IC_SB
+  to configure and exchange data with local NB/SB for interconnecting
+  with other OVN deployments.
+
+
+Options
+
+  --ovnnb-db=database
+  
+The OVSDB database containing the OVN Northbound Database.  If the
+OVN_NB_DB environment variable is set, its value is used
+as the default.  Otherwise, the default is
+unix:@RUNDIR@/ovnnb_db.sock.
+  
+  --ovnsb-db=database
+  
+The OVSDB database containing the OVN Southbound Database.  If the
+OVN_SB_DB environment variable is set, its value is used
+as the default.  Otherwise, the default is
+unix:@RUNDIR@/ovnsb_db.sock.
+  
+  --ic-nb-db=database
+  
+The OVSDB database containing the OVN Interconnection Northbound
+Database.  If the OVN_IC_NB_DB environment variable is set,
+its value is used as the default.  Otherwise, the default is
+unix:@RUNDIR@/ovn_ic_nb_db.sock.
+  
+  --ic-sb-db=database
+  
+The OVSDB database containing the OVN Interconnection Southbound
+Database.  If the OVN_IC_SB_DB environment variable is set,
+its value is used as the default.  Otherwise, the default is
+unix:@RUNDIR@/ovn_ic_sb_db.sock.
+  
+
+
+  database in the above options must be an OVSDB active or
+  passive connection method, as described in ovsdb(7).
+
+
+Daemon Options
+http://www.w3.org/2003/XInclude"/>
+
+Logging Options
+http://www.w3.org/2003/XInclude"/>
+
+PKI Options
+
+  PKI configuration is required in order to use SSL for the connections to
+  the Northbound and Southbound databases.
+
+http://www.w3.org/2003/XInclude"/>
+
+Other Options
+http://www.w3.org/2003/XInclude"/>
+
+http://www.w3.org/2003/XInclude"/>
+
+Runtime Management Commands
+
+  ovs-appctl can send commands to a running
+  ovn-ic process.  The currently supported commands
+  are described below.
+  
+  exit
+  
+Causes ovn-ic to gracefully terminate.
+  
+
+  pause
+  
+Pauses the ovn-ic operation from processing any database changes.
+This will also instruct ovn-ic to drop any lock on SB DB.
+  
+
+  resume
+  
+Resumes the ovn-ic operation to process database contents.  This will
+also instruct ovn-ic to aspire for the lock on SB DB.
+  
+
+  is-paused
+  
+Returns "true" if ovn-ic is currently paused, "false" otherwise.
+  
+
+  status
+  
+Prints this server's status.  Status will be "active" if ovn-ic has
+acquired OVSDB lock on SB DB, "standby" if it has not or "paused" if
+this instance is paused.

[ovs-dev] [PATCH ovn v4 00/13] OVN Interconnection

2020-01-29 Thread Han Zhou
The series supports interconnecting multiple OVN deployments (e.g.  located at
multiple data centers) through logical routers connected with transit logical
switches with overlay tunnels, managed through OVN control plane.  See the
ovn-architecture.rst document updates for more details, and find the
instructions in Documentation/tutorials/ovn-interconnection.rst.

v3 -> v4: rebase on master

v2 -> v3:

  - Addressed Numan's comments:
- Rename ovn-inbctl => ovn-ic-nbctl ovn-isbctl => ovn-ic-sbctl.
- Update tunnel keys through northd instead of directly update SB-DB by
  ovn-ic.
- Rename is-interconn to ovn-is-interconn in chassis ovsdb settings.
- Add a section in ovn-architecture for "A day in the life of a packet
  crossing AZs".
- Set hostname for chassis in test cases.

  - In addition, there are some other changes:
- Avoid unnecessary tunnel and bfd sessions to remote chassis.
- Use external_ids keys "is-remote" and "is-interconn" in SB Chassis
  table, instead of adding new columns is_remote and is_interconn, to
  avoid too many columns.

Han Zhou (13):
  ovn-architecture: Add documentation for OVN interconnection feature.
  ovn-ic-nb: Interconnection northbound DB schema and CLI.
  ovn-ic-sb: Interconnection southbound DB schema and CLI.
  ovn-ic: Interconnection controller with AZ registeration.
  ovn-northd.c: Refactor allocate_tnlid.
  ovn-ic: Transit switch controller.
  ovn-sb: Add keys is_interconn and is_remote to Chassis's external_ids.
  ovn-ic: Interconnection gateway controller.
  ovn-ic: Interconnection port controller.
  ovn.at: e2e test for OVN interconnection.
  ovn-ctl: Refactor to reduce redundant code.
  ovn-ctl: Support commands for interconnection.
  tutorial: Add tutorial for OVN Interconnection.

 .gitignore  |6 +
 Documentation/automake.mk   |1 +
 Documentation/tutorials/index.rst   |1 +
 Documentation/tutorials/ovn-interconnection.rst |  188 
 Makefile.am |1 +
 NEWS|5 +
 TODO.rst|6 +
 automake.mk |   71 ++
 controller/bfd.c|6 +-
 controller/binding.c|6 +-
 controller/chassis.c|   25 +-
 controller/encaps.c |   15 +-
 controller/encaps.h |3 +-
 controller/ovn-controller.8.xml |6 +
 controller/ovn-controller.c |2 +-
 debian/ovn-common.install   |2 +
 debian/ovn-common.manpages  |4 +
 ic/.gitignore   |2 +
 ic/automake.mk  |   10 +
 ic/ovn-ic.8.xml |  120 +++
 ic/ovn-ic.c | 1104 +++
 lib/.gitignore  |6 +
 lib/automake.mk |   31 +-
 lib/ovn-ic-nb-idl.ann   |9 +
 lib/ovn-ic-sb-idl.ann   |9 +
 lib/ovn-util.c  |   92 ++
 lib/ovn-util.h  |   17 +
 northd/ovn-northd.c |  224 +++--
 ovn-architecture.7.xml  |  144 ++-
 ovn-ic-nb.ovsschema |   75 ++
 ovn-ic-nb.xml   |  371 
 ovn-ic-sb.ovsschema |  129 +++
 ovn-ic-sb.xml   |  582 
 ovn-nb.ovsschema|5 +-
 ovn-nb.xml  |   66 +-
 ovn-sb.xml  |   24 +
 tests/automake.mk   |8 +-
 tests/ovn-ic-nbctl.at   |   65 ++
 tests/ovn-ic-sbctl.at   |  112 +++
 tests/ovn-ic.at |  188 
 tests/ovn-macros.at |  161 +++-
 tests/ovn.at|  147 +++
 tests/testsuite.at  |3 +
 tutorial/ovs-sandbox|   78 +-
 utilities/.gitignore|4 +
 utilities/automake.mk   |   16 +
 utilities/ovn-ctl   |  425 -
 utilities/ovn-ctl.8.xml |   91 ++
 utilities/ovn-ic-nbctl.8.xml|  174 
 utilities/ovn-ic-nbctl.c|  950 +++
 utilities/ovn-ic-sbctl.8.xml|  148 +++
 utilities/ovn-ic-sbctl.c| 1017 +
 52 files changed, 6788 insertions(+), 167 deletions(-)

[ovs-dev] [PATCH ovn v4 01/13] ovn-architecture: Add documentation for OVN interconnection feature.

2020-01-29 Thread Han Zhou
Signed-off-by: Han Zhou 
---
 ovn-architecture.7.xml | 144 -
 1 file changed, 143 insertions(+), 1 deletion(-)

diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index c43f16d..defcdc9 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -1246,7 +1246,14 @@
   
 Distributed gateway ports are logical router patch ports
 that directly connect distributed logical routers to logical
-switches with localnet ports.
+switches with external connection.
+  
+
+  
+There are two types of external connections.  Firstly, connection to
+physical network through a localnet port.  Secondly, connection to
+another OVN deployment, which will be introduced in section "OVN
+Deployments Interconnection".
   
 
   
@@ -1820,6 +1827,141 @@
 
   
 
+  OVN Deployments Interconnection (TODO)
+
+  
+It is not uncommon for an operator to deploy multiple OVN clusters, for
+two main reasons.  Firstly, an operator may prefer to deploy one OVN
+cluster for each availability zone, e.g. in different physical regions,
+to avoid single point of failure.  Secondly, there is always an upper limit
+for a single OVN control plane to scale.
+  
+
+  
+Although the control planes of the different availability zones (AZs) are
+independent from each other, the workloads from different AZs may need
+to communicate across the zones.  The OVN interconnection feature provides
+a native way to interconnect different AZs by L3 routing through transit
+overlay networks between logical routers of different AZs.
+  
+
+  
+A global OVN Interconnection Northbound database is introduced for the
+operator (probably through CMS systems) to configure transit logical
+switches that connect logical routers from different AZs.  A transit
+switch is similar to a regular logical switch, but it is used for
+interconnection purpose only.  Typically, each transit switch can be used
+to connect all logical routers that belong to same tenant across all AZs.
+  
+
+  
+A dedicated daemon process ovn-ic, OVN interconnection
+controller, in each AZ will consume this data and populate corresponding
+logical switches to their own northbound databases for each AZ, so that
+logical routers can be connected to the transit switch by creating
+patch port pairs in their northbound databases.  Any router ports
+connected to the transit switches are considered interconnection ports,
+which will be exchanged between AZs.
+  
+
+  
+Physically, when workloads from different AZs communicate, packets
+need to go through multiple hops: source chassis, source gateway,
+destination gateway and destination chassis.  All these hops are connected
+through tunnels so that the packets never leave overlay networks.
+A distributed gateway port is required to connect the logical router to a
+transit switch, with a gateway chassis specified, so that the traffic can
+be forwarded through the gateway chassis.
+  
+
+  
+A global OVN Interconnection Southbound database is introduced for
+exchanging control plane information between the AZs.  The data in
+this database is populated and consumed by the ovn-ic,
+of each AZ.  The main information in this database includes:
+  
+
+  
+
+  Datapath bindings for transit switches, which mainly contains the tunnel
+  keys generated for each transit switch.  Separate key ranges are reserved
+  for transit switches so that they will never conflict with any tunnel
+  keys locally assigned for datapaths within each AZ.
+
+
+  Availability zones, which are registered by ovn-ic
+  from each AZ.
+
+
+  Gateways.  Each AZ specifies chassis that are supposed to work
+  as interconnection gateways, and the ovn-ic will
+  populate this information to the interconnection southbound DB.
+  The ovn-ic from all the other AZs will learn the
+  gateways and populate to their own southbound DB as a chassis.
+
+
+  Port bindings for logical switch ports created on the transit switch.
+  Each AZ maintains their logical router to transit switch connections
+  independently, but ovn-ic automatically populates
+  local port bindings on transit switches to the global interconnection
+  southbound DB, and learns remote port bindings from other AZs back
+  to its own northbound and southbound DBs, so that logical flows
+  can be produced and then translated to OVS flows locally, which finally
+  enables data plane communication.
+
+  
+
+  
+The tunnel keys for transit switch datapaths and related port bindings
+must be agreed across all AZs.  This is ensured by generating and storing
+the keys in the global interconnection southbound database.  Any
+ovn-ic from any AZ can allocate the key, but race conditions
+are 

[ovs-dev] [PATCH ovn v4 02/13] ovn-ic-nb: Interconnection northbound DB schema and CLI.

2020-01-29 Thread Han Zhou
This patch introduces OVN_IC_Northbound DB schema and the CLI
ovn-ic-nbctl that manages the DB.

Signed-off-by: Han Zhou 
---
 .gitignore   |   3 +
 automake.mk  |  35 ++
 debian/ovn-common.install|   1 +
 debian/ovn-common.manpages   |   2 +
 lib/.gitignore   |   3 +
 lib/automake.mk  |  16 +-
 lib/ovn-ic-nb-idl.ann|   9 +
 lib/ovn-util.c   |  13 +
 lib/ovn-util.h   |   1 +
 ovn-ic-nb.ovsschema  |  75 
 ovn-ic-nb.xml| 371 +
 tests/automake.mk|   2 +
 tests/ovn-ic-nbctl.at|  65 +++
 tests/testsuite.at   |   1 +
 utilities/.gitignore |   2 +
 utilities/automake.mk|   8 +
 utilities/ovn-ic-nbctl.8.xml | 174 
 utilities/ovn-ic-nbctl.c | 950 +++
 18 files changed, 1730 insertions(+), 1 deletion(-)
 create mode 100644 lib/ovn-ic-nb-idl.ann
 create mode 100644 ovn-ic-nb.ovsschema
 create mode 100644 ovn-ic-nb.xml
 create mode 100644 tests/ovn-ic-nbctl.at
 create mode 100644 utilities/ovn-ic-nbctl.8.xml
 create mode 100644 utilities/ovn-ic-nbctl.c

diff --git a/.gitignore b/.gitignore
index 6fee075..d4f8c10 100644
--- a/.gitignore
+++ b/.gitignore
@@ -67,6 +67,9 @@
 /ovn-sb.5
 /ovn-sb.gv
 /ovn-sb.pic
+/ovn-ic-nb.5
+/ovn-ic-nb.gv
+/ovn-ic-nb.pic
 /package.m4
 /stamp-h1
 /_build-gcc
diff --git a/automake.mk b/automake.mk
index 591e007..59063c2 100644
--- a/automake.mk
+++ b/automake.mk
@@ -62,6 +62,34 @@ ovn-sb.5: \
$(srcdir)/ovn-sb.xml > $@.tmp && \
mv $@.tmp $@
 
+# OVN interconnection northbound E-R diagram
+#
+# If "python" or "dot" is not available, then we do not add graphical diagram
+# to the documentation.
+if HAVE_DOT
+ovn-ic-nb.gv: ${OVSDIR}/ovsdb/ovsdb-dot.in $(srcdir)/ovn-ic-nb.ovsschema
+   $(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn-ic-nb.ovsschema > $@
+ovn-ic-nb.pic: ovn-ic-nb.gv ${OVSDIR}/ovsdb/dot2pic
+   $(AM_V_GEN)(dot -T plain < ovn-ic-nb.gv | $(PYTHON) ${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \
+   mv $@.tmp $@
+OVN_IC_NB_PIC = ovn-ic-nb.pic
+OVN_IC_NB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_IC_NB_PIC)
+CLEANFILES += ovn-ic-nb.gv ovn-ic-nb.pic
+endif
+
+# OVN interconnection northbound schema documentation
+EXTRA_DIST += ovn-ic-nb.xml
+CLEANFILES += ovn-ic-nb.5
+man_MANS += ovn-ic-nb.5
+
+ovn-ic-nb.5: \
+   ${OVSDIR}/ovsdb/ovsdb-doc $(srcdir)/ovn-ic-nb.xml $(srcdir)/ovn-ic-nb.ovsschema $(OVN_IC_NB_PIC)
+   $(AM_V_GEN)$(OVSDB_DOC) \
+   $(OVN_IC_NB_DOT_DIAGRAM_ARG) \
+   --version=$(VERSION) \
+   $(srcdir)/ovn-ic-nb.ovsschema \
+   $(srcdir)/ovn-ic-nb.xml > $@.tmp && \
+   mv $@.tmp $@
 
 # Version checking for ovn-nb.ovsschema.
 ALL_LOCAL += ovn-nb.ovsschema.stamp
@@ -74,7 +102,14 @@ ALL_LOCAL += ovn-sb.ovsschema.stamp
 ovn-sb.ovsschema.stamp: ovn-sb.ovsschema
$(srcdir)/build-aux/cksum-schema-check $? $@
 
+# Version checking for ovn-ic-nb.ovsschema.
+ALL_LOCAL += ovn-ic-nb.ovsschema.stamp
+ovn-ic-nb.ovsschema.stamp: ovn-ic-nb.ovsschema
+   $(srcdir)/build-aux/cksum-schema-check $? $@
+CLEANFILES += ovn-ic-nb.ovsschema.stamp
+
 pkgdata_DATA += ovn-nb.ovsschema
 pkgdata_DATA += ovn-sb.ovsschema
+pkgdata_DATA += ovn-ic-nb.ovsschema
 
 CLEANFILES += ovn-sb.ovsschema.stamp
diff --git a/debian/ovn-common.install b/debian/ovn-common.install
index 90484d2..59b8018 100644
--- a/debian/ovn-common.install
+++ b/debian/ovn-common.install
@@ -1,5 +1,6 @@
 usr/bin/ovn-nbctl
 usr/bin/ovn-sbctl
+usr/bin/ovn-ic-nbctl
 usr/bin/ovn-trace
 usr/bin/ovn-detrace
 usr/share/openvswitch/scripts/ovn-ctl
diff --git a/debian/ovn-common.manpages b/debian/ovn-common.manpages
index 249349e..e7d3e4d 100644
--- a/debian/ovn-common.manpages
+++ b/debian/ovn-common.manpages
@@ -1,8 +1,10 @@
 ovn/ovn-architecture.7
 ovn/ovn-nb.5
 ovn/ovn-sb.5
+ovn/ovn-ic-nb.5
 ovn/utilities/ovn-ctl.8
 ovn/utilities/ovn-nbctl.8
 ovn/utilities/ovn-sbctl.8
+ovn/utilities/ovn-ic-nbctl.8
 ovn/utilities/ovn-trace.8
 ovn/utilities/ovn-detrace.1
diff --git a/lib/.gitignore b/lib/.gitignore
index 3eed458..3af2923 100644
--- a/lib/.gitignore
+++ b/lib/.gitignore
@@ -5,4 +5,7 @@
 /ovn-sb-idl.c
 /ovn-sb-idl.h
 /ovn-sb-idl.ovsidl
+/ovn-ic-nb-idl.c
+/ovn-ic-nb-idl.h
+/ovn-ic-nb-idl.ovsidl
 /ovn-dirs.c
diff --git a/lib/automake.mk b/lib/automake.mk
index 0c8245c..5f6561a 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -29,7 +29,9 @@ nodist_lib_libovn_la_SOURCES = \
lib/ovn-nb-idl.c \
lib/ovn-nb-idl.h \
lib/ovn-sb-idl.c \
-   lib/ovn-sb-idl.h
+   lib/ovn-sb-idl.h \
+   lib/ovn-ic-nb-idl.c \
+   lib/ovn-ic-nb-idl.h
 
 CLEANFILES += $(nodist_lib_libovn_la_SOURCES)
 
@@ -74,3 +76,15 @@ lib/ovn-nb-idl.ovsidl: $(OVN_NB_IDL_FILES)
$(AM_V_GEN)$(OVSDB_IDLC) annotate $(OVN_NB_IDL_FILES) > $@.tmp && \
mv $@.tmp $@
 
+# ovn-ic-nb IDL
+OVSIDL_BUILT += \
+   

[ovs-dev] [PATCH ovn v4 05/13] ovn-northd.c: Refactor allocate_tnlid.

2020-01-29 Thread Han Zhou
Move allocate_tnlid() and related interfaces to ovn_util module,
so that they can be reused by ovn-ic (in later patches). At the same
time, define macros for the range of datapath tunnel keys, and
reserve a range with ((1u << 16) - 1) keys for global transit
switch datapaths, among the ((1u << 24) - 1) datapath tunnel key
space.
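
For illustration only (not part of the patch): a minimal sketch of how the
reserved ranges work out, assuming the macro values this patch adds to
lib/ovn-util.h.  The demo_* helpers below are hypothetical names made up for
this note and do not exist in OVN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Key-space macros as added to lib/ovn-util.h by this patch. */
#define OVN_MAX_DP_KEY ((1u << 24) - 1)
#define OVN_MAX_DP_GLOBAL_NUM ((1u << 16) - 1)
#define OVN_MIN_DP_KEY_LOCAL 1
#define OVN_MAX_DP_KEY_LOCAL (OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM)
#define OVN_MIN_DP_KEY_GLOBAL (OVN_MAX_DP_KEY_LOCAL + 1)
#define OVN_MAX_DP_KEY_GLOBAL OVN_MAX_DP_KEY

/* Hypothetical helper: true if 'key' is in the per-AZ (local) range. */
static inline bool
demo_dp_key_is_local(uint32_t key)
{
    return key >= OVN_MIN_DP_KEY_LOCAL && key <= OVN_MAX_DP_KEY_LOCAL;
}

/* Hypothetical helper: true if 'key' is in the range reserved for
 * transit switch datapaths. */
static inline bool
demo_dp_key_is_global(uint32_t key)
{
    return key >= OVN_MIN_DP_KEY_GLOBAL && key <= OVN_MAX_DP_KEY_GLOBAL;
}

/* Hypothetical helper: number of keys in the global range. */
static inline uint32_t
demo_dp_global_key_count(void)
{
    return OVN_MAX_DP_KEY_GLOBAL - OVN_MIN_DP_KEY_GLOBAL + 1;
}

/* Sanity checks: the two ranges are adjacent, do not overlap, and key 0
 * belongs to neither of them. */
static inline void
demo_dp_key_ranges_check(void)
{
    assert(OVN_MIN_DP_KEY_GLOBAL == OVN_MAX_DP_KEY_LOCAL + 1);
    assert(demo_dp_global_key_count() == OVN_MAX_DP_GLOBAL_NUM);
    assert(!demo_dp_key_is_local(0) && !demo_dp_key_is_global(0));
}
```

With these values the local range is 1..16711680 and the global (transit
switch) range is 16711681..16777215, i.e. 65535 keys.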

Signed-off-by: Han Zhou 
---
 lib/ovn-util.c  | 59 +
 lib/ovn-util.h  | 12 
 northd/ovn-northd.c | 85 +
 3 files changed, 85 insertions(+), 71 deletions(-)

diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index 18c13a8..70007fd 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -455,3 +455,62 @@ datapath_is_switch(const struct sbrec_datapath_binding *ldp)
 {
 return smap_get(&ldp->external_ids, "logical-switch") != NULL;
 }
+
+struct tnlid_node {
+struct hmap_node hmap_node;
+uint32_t tnlid;
+};
+
+void
+ovn_destroy_tnlids(struct hmap *tnlids)
+{
+struct tnlid_node *node;
+HMAP_FOR_EACH_POP (node, hmap_node, tnlids) {
+free(node);
+}
+hmap_destroy(tnlids);
+}
+
+void
+ovn_add_tnlid(struct hmap *set, uint32_t tnlid)
+{
+struct tnlid_node *node = xmalloc(sizeof *node);
+hmap_insert(set, &node->hmap_node, hash_int(tnlid, 0));
+node->tnlid = tnlid;
+}
+
+static bool
+tnlid_in_use(const struct hmap *set, uint32_t tnlid)
+{
+const struct tnlid_node *node;
+HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash_int(tnlid, 0), set) {
+if (node->tnlid == tnlid) {
+return true;
+}
+}
+return false;
+}
+
+static uint32_t
+next_tnlid(uint32_t tnlid, uint32_t min, uint32_t max)
+{
+return tnlid + 1 <= max ? tnlid + 1 : min;
+}
+
+uint32_t
+ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
+   uint32_t max, uint32_t *hint)
+{
+for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != *hint;
+ tnlid = next_tnlid(tnlid, min, max)) {
+if (!tnlid_in_use(set, tnlid)) {
+ovn_add_tnlid(set, tnlid);
+*hint = tnlid;
+return tnlid;
+}
+}
+
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "all %s tunnel ids exhausted", name);
+return 0;
+}
diff --git a/lib/ovn-util.h b/lib/ovn-util.h
index 5685f43..4f9bbf6 100644
--- a/lib/ovn-util.h
+++ b/lib/ovn-util.h
@@ -90,4 +90,16 @@ uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
uint16_t priority,
const char *match, const char *actions);
 bool datapath_is_switch(const struct sbrec_datapath_binding *);
+
+#define OVN_MAX_DP_KEY ((1u << 24) - 1)
+#define OVN_MAX_DP_GLOBAL_NUM ((1u << 16) - 1)
+#define OVN_MIN_DP_KEY_LOCAL 1
+#define OVN_MAX_DP_KEY_LOCAL (OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM)
+#define OVN_MIN_DP_KEY_GLOBAL (OVN_MAX_DP_KEY_LOCAL + 1)
+#define OVN_MAX_DP_KEY_GLOBAL OVN_MAX_DP_KEY
+struct hmap;
+void ovn_destroy_tnlids(struct hmap *tnlids);
+void ovn_add_tnlid(struct hmap *set, uint32_t tnlid);
+uint32_t ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
+uint32_t max, uint32_t *hint);
 #endif
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 44fe6cf..5157690 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -307,65 +307,6 @@ Options:\n\
 stream_usage("database", true, true, false);
 }
 
-struct tnlid_node {
-struct hmap_node hmap_node;
-uint32_t tnlid;
-};
-
-static void
-destroy_tnlids(struct hmap *tnlids)
-{
-struct tnlid_node *node;
-HMAP_FOR_EACH_POP (node, hmap_node, tnlids) {
-free(node);
-}
-hmap_destroy(tnlids);
-}
-
-static void
-add_tnlid(struct hmap *set, uint32_t tnlid)
-{
-struct tnlid_node *node = xmalloc(sizeof *node);
-hmap_insert(set, &node->hmap_node, hash_int(tnlid, 0));
-node->tnlid = tnlid;
-}
-
-static bool
-tnlid_in_use(const struct hmap *set, uint32_t tnlid)
-{
-const struct tnlid_node *node;
-HMAP_FOR_EACH_IN_BUCKET (node, hmap_node, hash_int(tnlid, 0), set) {
-if (node->tnlid == tnlid) {
-return true;
-}
-}
-return false;
-}
-
-static uint32_t
-next_tnlid(uint32_t tnlid, uint32_t min, uint32_t max)
-{
-return tnlid + 1 <= max ? tnlid + 1 : min;
-}
-
-static uint32_t
-allocate_tnlid(struct hmap *set, const char *name, uint32_t min, uint32_t max,
-   uint32_t *hint)
-{
-for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != *hint;
- tnlid = next_tnlid(tnlid, min, max)) {
-if (!tnlid_in_use(set, tnlid)) {
-add_tnlid(set, tnlid);
-*hint = tnlid;
-return tnlid;
-}
-}
-
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(&rl, "all %s tunnel ids exhausted", name);
-return 0;
-}
-
 struct ovn_chassis_qdisc_queues {
 struct hmap_node key_node;
 

[ovs-dev] [PATCH ovn v4 06/13] ovn-ic: Transit switch controller.

2020-01-29 Thread Han Zhou
Process transit switches and sync them between IC-NB, IC-SB and NB.
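
For illustration only (not part of the patch): the transit switch controller
takes datapath keys from the global range through ovn_allocate_tnlid() with a
static hint.  A minimal sketch of that hint-based circular scan follows.  It
assumes a tiny key range and a plain boolean array in place of the hmap used
in lib/ovn-util.c, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Deliberately tiny range for the demo.  The patch itself uses
 * OVN_MIN_DP_KEY_GLOBAL..OVN_MAX_DP_KEY_GLOBAL. */
#define DEMO_MIN_KEY 10u
#define DEMO_MAX_KEY 13u

/* Hypothetical stand-in for the hmap-based set of keys in use. */
struct demo_key_set {
    bool in_use[DEMO_MAX_KEY + 1];
};

/* Returns the key after 'key', wrapping from 'max' back to 'min'. */
static uint32_t
demo_next_key(uint32_t key, uint32_t min, uint32_t max)
{
    return key + 1 <= max ? key + 1 : min;
}

/* Returns a free key in [min, max] and marks it as used, or 0 if no key
 * could be found.  Same loop shape as ovn_allocate_tnlid(): the scan
 * starts right after '*hint', wraps around, and stops when it gets back
 * to '*hint', so the hint value itself is not tested in that call. */
static uint32_t
demo_allocate_key(struct demo_key_set *set, uint32_t min, uint32_t max,
                  uint32_t *hint)
{
    assert(min >= 1 && min <= max && max <= DEMO_MAX_KEY);

    for (uint32_t key = demo_next_key(*hint, min, max); key != *hint;
         key = demo_next_key(key, min, max)) {
        if (!set->in_use[key]) {
            set->in_use[key] = true;
            *hint = key;
            return key;
        }
    }
    return 0;
}
```

Starting from an empty set with the hint at DEMO_MIN_KEY, successive calls
return 11, 12, 13, then wrap around to 10, and finally 0 once the range is
exhausted.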

Signed-off-by: Han Zhou 
---
 ic/ovn-ic.c | 107 
 lib/ovn-util.c  |   6 +--
 lib/ovn-util.h  |   1 +
 northd/ovn-northd.c |  70 --
 ovn-nb.xml  |  24 
 ovn-sb.xml  |   9 +
 tests/ovn-ic.at |  40 
 7 files changed, 242 insertions(+), 15 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index a6d27c0..82bd86e 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -147,11 +147,118 @@ az_run(struct ic_context *ctx)
 return NULL;
 }
 
+static uint32_t
+allocate_ts_dp_key(struct hmap *dp_tnlids)
+{
+static uint32_t hint = OVN_MIN_DP_KEY_GLOBAL;
+return ovn_allocate_tnlid(dp_tnlids, "transit switch datapath",
+  OVN_MIN_DP_KEY_GLOBAL, OVN_MAX_DP_KEY_GLOBAL,
+  &hint);
+}
+
+static void
+ts_run(struct ic_context *ctx)
+{
+const struct icnbrec_transit_switch *ts;
+
+struct hmap dp_tnlids = HMAP_INITIALIZER(&dp_tnlids);
+struct shash isb_dps = SHASH_INITIALIZER(&isb_dps);
+const struct icsbrec_datapath_binding *isb_dp;
+ICSBREC_DATAPATH_BINDING_FOR_EACH (isb_dp, ctx->ovnisb_idl) {
+shash_add(&isb_dps, isb_dp->transit_switch, isb_dp);
+ovn_add_tnlid(&dp_tnlids, isb_dp->tunnel_key);
+}
+
+/* Sync INB TS to AZ NB */
+if (ctx->ovnnb_txn) {
+struct shash nb_tses = SHASH_INITIALIZER(&nb_tses);
+const struct nbrec_logical_switch *ls;
+
+/* Get current NB Logical_Switch with other_config:interconn-ts */
+NBREC_LOGICAL_SWITCH_FOR_EACH (ls, ctx->ovnnb_idl) {
+const char *ts_name = smap_get(&ls->other_config, "interconn-ts");
+if (ts_name) {
+shash_add(&nb_tses, ts_name, ls);
+}
+}
+
+/* Create NB Logical_Switch for each TS */
+ICNBREC_TRANSIT_SWITCH_FOR_EACH (ts, ctx->ovninb_idl) {
+ls = shash_find_and_delete(&nb_tses, ts->name);
+if (!ls) {
+ls = nbrec_logical_switch_insert(ctx->ovnnb_txn);
+nbrec_logical_switch_set_name(ls, ts->name);
+nbrec_logical_switch_update_other_config_setkey(ls,
+"interconn-ts",
+ts->name);
+}
+isb_dp = shash_find_data(&isb_dps, ts->name);
+if (isb_dp) {
+int64_t nb_tnl_key = smap_get_int(&ls->other_config,
+  "requested-tnl-key",
+  0);
+if (nb_tnl_key != isb_dp->tunnel_key) {
+VLOG_DBG("Set other_config:requested-tnl-key %"PRId64
+ " for transit switch %s in NB.",
+ isb_dp->tunnel_key, ts->name);
+char *tnl_key_str = xasprintf("%"PRId64,
+  isb_dp->tunnel_key);
+nbrec_logical_switch_update_other_config_setkey(
+ls, "requested-tnl-key", tnl_key_str);
+free(tnl_key_str);
+}
+}
+}
+
+/* Delete extra NB Logical_Switch with other_config:interconn-ts */
+struct shash_node *node;
+SHASH_FOR_EACH (node, &nb_tses) {
+nbrec_logical_switch_delete(node->data);
+}
+shash_destroy(&nb_tses);
+}
+
+/* Sync TS between INB and ISB.  This is performed after syncing with AZ
+ * SB, to avoid uncommitted ISB datapath tunnel key to be synced back to
+ * AZ. */
+if (ctx->ovnisb_txn) {
+/* Create ISB Datapath_Binding */
+ICNBREC_TRANSIT_SWITCH_FOR_EACH (ts, ctx->ovninb_idl) {
+isb_dp = shash_find_and_delete(&isb_dps, ts->name);
+if (!isb_dp) {
+/* Allocate tunnel key */
+int64_t dp_key = allocate_ts_dp_key(&dp_tnlids);
+if (!dp_key) {
+continue;
+}
+
+isb_dp = icsbrec_datapath_binding_insert(ctx->ovnisb_txn);
+icsbrec_datapath_binding_set_transit_switch(isb_dp, ts->name);
+icsbrec_datapath_binding_set_tunnel_key(isb_dp, dp_key);
+}
+}
+
+/* Delete extra ISB Datapath_Binding */
+struct shash_node *node;
+SHASH_FOR_EACH (node, &isb_dps) {
+icsbrec_datapath_binding_delete(node->data);
+}
+}
+ovn_destroy_tnlids(&dp_tnlids);
+shash_destroy(&isb_dps);
+}
+
 static void
 ovn_db_run(struct ic_context *ctx)
 {
 const struct icsbrec_availability_zone *az = az_run(ctx);
 VLOG_DBG("Availability zone: %s", az ? az->name : "not created yet.");
+
+if (!az) {
+return;
+}
+
+ts_run(ctx);
 }
 
 static void
diff --git 

Re: [ovs-dev] [PATCH ovn v3 00/13] OVN Interconnection

2020-01-29 Thread Numan Siddique
On Tue, Jan 28, 2020 at 8:26 AM Han Zhou  wrote:
>
> The series supports interconnecting multiple OVN deployments (e.g.  located at
> multiple data centers) through logical routers connected with transit logical
> switches with overlay tunnels, managed through OVN control plane.  See the
> ovn-architecture.rst document updates for more details, and find the
> instructions in Documentation/tutorials/ovn-interconnection.rst.
>
> v2 -> v3:
>
>   - Addressed Numan's comments:
> - Rename ovn-inbctl => ovn-ic-nbctl ovn-isbctl => ovn-ic-sbctl.
> - Update tunnel keys through northd instead of directly update SB-DB by
>   ovn-ic.
> - Rename is-interconn to ovn-is-interconn in chassis ovsdb settings.
> - Add a section in ovn-architecture for "A day in the life of a packet
>   crossing AZs".
> - Set hostname for chassis in test cases.
>
>   - In addition, there are some other changes:
> - Avoid unnecessary tunnel and bfd sessions to remote chassis.
> - Use external_ids keys "is-remote" and "is-interconn" in SB Chassis
>   table, instead of adding new columns is_remote and is_interconn, to
>   avoid too many columns.
>
> Han Zhou (13):
>   ovn-architecture: Add documentation for OVN interconnection feature.
>   ovn-ic-nb: Interconnection northbound DB schema and CLI.
>   ovn-ic-sb: Interconnection southbound DB schema and CLI.
>   ovn-ic: Interconnection controller with AZ registeration.
>   ovn-northd.c: Refactor allocate_tnlid.
>   ovn-ic: Transit switch controller.
>   ovn-sb: Add keys is_interconn and is_remote to Chassis's external_ids.
>   ovn-ic: Interconnection gateway controller.
>   ovn-ic: Interconnection port controller.
>   ovn.at: e2e test for OVN interconnection.
>   ovn-ctl: Refactor to reduce redundant code.
>   ovn-ctl: Support commands for interconnection.
>   tutorial: Add tutorial for OVN Interconnection.

Hi Han,

Can you please rebase this series ? Some of the patches are not
applying on the present master.

Thanks
Numan

>
>  .gitignore  |6 +
>  Documentation/automake.mk   |1 +
>  Documentation/tutorials/index.rst   |1 +
>  Documentation/tutorials/ovn-interconnection.rst |  188 
>  Makefile.am |1 +
>  NEWS|5 +
>  TODO.rst|6 +
>  automake.mk |   71 ++
>  controller/bfd.c|6 +-
>  controller/binding.c|6 +-
>  controller/chassis.c|   25 +-
>  controller/encaps.c |   15 +-
>  controller/encaps.h |3 +-
>  controller/ovn-controller.8.xml |6 +
>  controller/ovn-controller.c |2 +-
>  debian/ovn-common.install   |2 +
>  debian/ovn-common.manpages  |4 +
>  ic/.gitignore   |2 +
>  ic/automake.mk  |   10 +
>  ic/ovn-ic.8.xml |  120 +++
>  ic/ovn-ic.c | 1104 +++
>  lib/.gitignore  |6 +
>  lib/automake.mk |   31 +-
>  lib/ovn-ic-nb-idl.ann   |9 +
>  lib/ovn-ic-sb-idl.ann   |9 +
>  lib/ovn-util.c  |   92 ++
>  lib/ovn-util.h  |   16 +
>  northd/ovn-northd.c |  224 +++--
>  ovn-architecture.7.xml  |  144 ++-
>  ovn-ic-nb.ovsschema |   75 ++
>  ovn-ic-nb.xml   |  371 
>  ovn-ic-sb.ovsschema |  129 +++
>  ovn-ic-sb.xml   |  582 
>  ovn-nb.ovsschema|5 +-
>  ovn-nb.xml  |   66 +-
>  ovn-sb.xml  |   24 +
>  tests/automake.mk   |8 +-
>  tests/ovn-ic-nbctl.at   |   65 ++
>  tests/ovn-ic-sbctl.at   |  112 +++
>  tests/ovn-ic.at |  188 
>  tests/ovn-macros.at |  161 +++-
>  tests/ovn.at|  147 +++
>  tests/testsuite.at  |3 +
>  tutorial/ovs-sandbox|   78 +-
>  utilities/.gitignore|4 +
>  utilities/automake.mk   |   16 +
>  utilities/ovn-ctl   |  425 -
>  utilities/ovn-ctl.8.xml |   91 ++
>  

Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread William Tu
On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner  wrote:
>
> On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > Sure.
> >
> > Firstly, make sure userspace-tso-enable is true
> > # ovs-vsctl get Open_vSwitch . other_config
> > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > userspace-tso-enable="true"}
> >
> > Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> > 
> >   
> >> path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> >   
> >   
> > 
> > 
>
> I have other options set, but I don't think they are related:
> ufo='off' mrg_rxbuf='on'/>
>

Is mrg_rxbuf required to be on?
I saw when enable userspace tso, we are setting external buffer
RTE_VHOST_USER_EXTBUF_SUPPORT

Is this the same thing?
William


Re: [ovs-dev] [PATCHv2] docs: Add header install command for afxdp.

2020-01-29 Thread William Tu
On Thu, Jan 23, 2020 at 09:03:11AM -0800, William Tu wrote:
> The 'XDP_RING_NEED_WAKEUP' and related flags are defined in if_xdp.h, so if
> users are building their own kernel, users have to update the kernel's
> header files, by doing:
> 
>   $ make headers_install INSTALL_HDR_PATH=/usr
> 
> Otherwise the following error shows:
> /usr/local/include/bpf/xsk.h: In function 'xsk_ring_prod__needs_wakeup':
> /usr/local/include/bpf/xsk.h:82:21: error: 'XDP_RING_NEED_WAKEUP' undeclared \
>   (first use in this function)
>   return *r->flags & XDP_RING_NEED_WAKEUP;
> 
> Reported-by: Tomek Osinski 
> Reported-at: https://osinstom.github.io/en/tutorial/ovs-afxdp-installation/
> Signed-off-by: William Tu 
> Acked-by: Ben Pfaff 
> ---
>  Documentation/intro/install/afxdp.rst | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/Documentation/intro/install/afxdp.rst 
> b/Documentation/intro/install/afxdp.rst
> index c4685fa7ebac..99003e4dbdb2 100644
> --- a/Documentation/intro/install/afxdp.rst
> +++ b/Documentation/intro/install/afxdp.rst
> @@ -108,6 +108,14 @@ vSwitch with AF_XDP will require the following:
>  
>* CONFIG_XDP_SOCKETS_DIAG=y (Debugging)
>  
> +- If you're building your own kernel, be sure that you're installing kernel
> +  headers too.  For example, with the following command::
> +
> +make headers_install INSTALL_HDR_PATH=/usr
> +
> +- If you're using a kernel from the distribution, be sure that the
> +  corresponding kernel headers package is installed.
> +
>  - Once your AF_XDP-enabled kernel is ready, if possible, run
>**./xdpsock -r -N -z -i ** under linux/samples/bpf.
>This is an OVS independent benchmark tools for AF_XDP.
> -- 
> 2.7.4
> 

I applied to master, thanks


Re: [ovs-dev] [PATCH] conntrack: Fix conntrack new state

2020-01-29 Thread William Tu
On Fri, Dec 20, 2019 at 01:16:39PM -0800, Ben Pfaff wrote:
> On Fri, Dec 20, 2019 at 09:51:08AM -0800, Yi-Hung Wei wrote:
> > In connection tracking system, a connection is established if we
> > see packets from both directions.  However, in userspace datapath's
> > conntrack, if we send a connection setup packet in one direction
> > twice, it will make the connection to be in established state.
> > 
> > This patch fixes the aforementioned issue, and adds a system traffic
> > test for UDP and TCP traffic to avoid regression.
> > 
> > Fixes: a489b16854b59 ("conntrack: New userspace connection tracker.")
> > Signed-off-by: Yi-Hung Wei 
> > ---

LGTM. I applied this to master, and branch 2.13.
Thanks
William
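
For readers skimming the thread, the rule from the quoted commit message can
be pictured with a toy state machine: a connection only becomes established
once packets have been seen in both directions, and a repeated packet in the
same direction does not count.  This is only a sketch with made-up names
(demo_conn, demo_conn_update).  It is not the OVS userspace conntrack code
and it ignores protocol details such as TCP flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet direction relative to the first packet seen. */
enum demo_dir {
    DEMO_DIR_ORIG,
    DEMO_DIR_REPLY
};

/* Hypothetical connection states for this sketch. */
enum demo_ct_state {
    DEMO_CT_NEW,
    DEMO_CT_ESTABLISHED
};

/* Hypothetical per-connection record: which directions have been seen. */
struct demo_conn {
    bool seen_orig;
    bool seen_reply;
};

/* Records a packet in direction 'dir' and returns the resulting state.
 * The connection is established only when both directions were seen, so
 * sending the same setup packet twice in one direction stays "new". */
static enum demo_ct_state
demo_conn_update(struct demo_conn *conn, enum demo_dir dir)
{
    assert(conn != NULL);

    if (dir == DEMO_DIR_ORIG) {
        conn->seen_orig = true;
    } else {
        conn->seen_reply = true;
    }
    return conn->seen_orig && conn->seen_reply
           ? DEMO_CT_ESTABLISHED : DEMO_CT_NEW;
}
```

Feeding two DEMO_DIR_ORIG packets keeps the state at DEMO_CT_NEW, and the
first DEMO_DIR_REPLY packet moves it to DEMO_CT_ESTABLISHED.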



Re: [ovs-dev] [PATCH] kbuild: Include external modules compile flags

2020-01-29 Thread Gregory Rose


On 1/28/2020 7:37 AM, Gregory Rose wrote:

On 1/27/2020 7:35 PM, Masahiro Yamada wrote:

On Tue, Jan 28, 2020 at 6:50 AM Greg Rose  wrote:

Since this commit:
'commit 9b9a3f20cbe0 ("kbuild: split final module linking out into Makefile.modfinal")'

at least one out-of-tree external kernel module build fails
during the modfinal make phase because Makefile.modfinal does
not include the ccflags-y variable from the external module's Kbuild.

ccflags-y is passed only for compiling C files in that directory.

It is not used for compiling *.mod.c
This is true for both in-kernel and external modules.

So, ccflags-y is not a good choice
for passing such flags that should be globally effective.


Maybe, KCFLAGS should work.


module:
    $(MAKE) KCFLAGS=...  M=$(PWD) -C /lib/modules/$(uname -r)/build modules




Hi Masahiro,

I'm unable to get that to work.  KCFLAGS does not seem to be used in
Makefile.modfinal.


[snip]

--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -21,6 +21,10 @@ __modfinal: $(modules)
  modname = $(notdir $(@:.mod.o=))
  part-of-module = y

+# Include the module's Makefile to find KBUILD_EXTRA_SYMBOLS
+include $(if $(wildcard $(KBUILD_EXTMOD)/Kbuild), \
+ $(KBUILD_EXTMOD)/Kbuild)
+


I continue to wonder why it is so bad to include the external module's
Kbuild.  It used to be included in Makefile.modpost and did no harm, and in
fact was what made our external build work at all in the past.  Without the
ability to define our local kernel module build environment during the
modfinal make I see no way forward.

That said, I'm no expert on the Linux kernel Makefile interdependencies.  If
you have some other idea we could try I'm game.

Thanks,

- Greg



[ovs-dev] Is this d...@openvswitch.org still active?

2020-01-29 Thread Nelson Corey
I have tried to email this d...@openvswitch.org address several times but
got no response.  Please get back to me at your earliest convenience if you
receive this email.

Sincerely,
Nelson Corey



Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Yifeng Sun
Hi Ilya,

The whole output of 'ethtool -k ens6' is here:

$ ethtool -k ens6
Features for ens6:
rx-checksumming: on [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp-mangleid-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: on [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
yfs@ubuntu:~$ ethtool -k ens6 | grep rx
rx-checksumming: on [fixed]
rx-vlan-offload: off [fixed]
rx-vlan-filter: on [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]

Thanks,
Yifeng

On Wed, Jan 29, 2020 at 4:07 AM Ilya Maximets  wrote:
>
> On 29.01.2020 12:25, Flavio Leitner wrote:
> > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> >> Sure.
> >>
> >> Firstly, make sure userspace-tso-enable is true
> >> # ovs-vsctl get Open_vSwitch . other_config
> >> {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> >> userspace-tso-enable="true"}
> >>
> >> Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> >> 
> >>   
> >>>> path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> >>   
> >>   
> >> 
> >> 
> >
> > I have other options set, but I don't think they are related:
> > > ufo='off' mrg_rxbuf='on'/>> > ecn='off' ufo='off'/>
> >
> >
> >>   
> >>   
> >>>> function='0x0'/>
> >> 
> >>
> >> When VM boots up, turn on tx, tso and sg
> >> # ethtool -K ens6 tx on
> >> # ethtool -K ens6 tso on
> >> # ethtool -K ens6 sg on
>
> Could you, please, provide the output of 'ethtool -k ens6'?
> If for some reason rx offloading is not enabled by default, you need to
> enable it too.
>
> >
> > All the needed offloading features are turned on by default,
> > so I don't change anything in my testbed.
> >
> >> Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on 

Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Yifeng Sun
Hi Flavio,

Sorry, one change in my last email was incorrect. It should be:
in tcp_v4_rcv()
  - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
  + if (0)

The kernel version I am using is ubuntu 18.04's default kernel:
$ uname -r
4.15.0-76-generic
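
For context, here is a minimal sketch of the kind of check that the change above bypasses. It is a generic RFC 1071 style Internet checksum, not the kernel's implementation: the function name `inet_checksum_sketch` is made up for illustration, and the real TCP checksum additionally covers a pseudo-header built from the IP addresses, protocol and length. A receiver that sums the data together with the transmitted checksum expects a result of zero; when the sender left the checksum to be filled in by offloading and nobody computed it, that test fails and the segment is counted as a checksum error.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: one's-complement Internet checksum over 'len' bytes,
 * treating the data as big-endian 16-bit words (RFC 1071 style).
 * The 32-bit accumulator is sufficient for packet-sized inputs. */
static uint16_t
inet_checksum_sketch(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t) data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len) {
        /* Odd trailing byte is padded with a zero byte. */
        sum += (uint32_t) data[0] << 8;
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}
```

Summing a buffer that already contains a correct checksum yields 0, which is the condition a receiver tests; any other value means the segment is dropped.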

Thanks,
Yifeng

On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner  wrote:
>
> On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > Sure.
> >
> > Firstly, make sure userspace-tso-enable is true
> > # ovs-vsctl get Open_vSwitch . other_config
> > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > userspace-tso-enable="true"}
> >
> > Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> > 
> >   
> >> path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> >   
> >   
> > 
> > 
>
> I have other options set, but I don't think they are related:
> ufo='off' mrg_rxbuf='on'/>
>
>
>
> >   
> >   
> >> function='0x0'/>
> > 
> >
> > When VM boots up, turn on tx, tso and sg
> > # ethtool -K ens6 tx on
> > # ethtool -K ens6 tso on
> > # ethtool -K ens6 sg on
>
> All the needed offloading features are turned on by default,
> so I don't change anything in my testbed.
>
> > Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM.
> > Iperf doesn't work if there is no change to VM's kernel. `tcpdump` shows
> > that iperf server received packets with invalid TCP checksum.
> > `nstat -a` shows that TcpInCsumErr number is accumulating.
> >
> > After adding changes to VM's kernel as below, iperf works properly.
> > in tcp_v4_rcv()
> >   - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> >   + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> >
> > static inline bool tcp_checksum_complete(struct sk_buff *skb)
> > {
> > return 0;
> > }
>
> That's odd. Which kernel is that? Maybe I can try the same version.
> I am using 5.2.14-200.fc30.x86_64.
>
> Looks like somehow the packet lost its offloading flags, then kernel
> has to check the csum and since it wasn't calculated before, it's
> just random garbage.
>
> fbl
>
>
> >
> >
> >
> > Best,
> > Yifeng
> >
> > On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner  wrote:
> > >
> > > On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote:
> > > > Hi Flavio,
> > > >
> > > > Thanks for the explanation. I followed the steps in the document but
> > > > TCP connection still failed to build between 2 VMs.
> > > >
> > > > I finally modified VM's kernel directly to disable TCP checksum 
> > > > validation
> > > > to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs.
> > >
> > > Could you provide more details on how you did that? What's running
> > > inside the VM?
> > >
> > > I don't change anything inside of the VMs (Linux) in my testbed.
> > >
> > > fbl
> > >
> > >
> > > >
> > > > Best,
> > > > Yifeng
> > > >
> > > >
> > > > On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner  
> > > > wrote:
> > > > >
> > > > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote:
> > > > > > Hi Ilya,
> > > > > >
> > > > > > Thanks for your reply.
> > > > > >
> > > > > > The thing is, if checksum offloading is enabled in both VMs, then
> > > > > > sender VM will send a packet with invalid TCP checksum, and later
> > > > > > OVS will send this packet to receiver VM directly without
> > > > > > calculating a valid checksum. As a result, receiver VM will drop
> > > > > > this packet because it contains invalid checksum. This is what
> > > > > > happened when I tried this patch.
> > > > > >
> > > > >
> > > > > When TSO is enabled, the TX checksumming offloading is required,
> > > > > then you will see invalid checksum. This is well documented here:
> > > > >
> > > > > https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso
> > > > >
> > > > > "Additionally, if the traffic is headed to a VM within the same host
> > > > > further optimization can be expected. As the traffic never leaves
> > > > > the machine, no MTU needs to be accounted for, and thus no
> > > > > segmentation and checksum calculations are required, which saves yet
> > > > > more cycles."
> > > > >
> > > > > Therefore, it's expected to see bad csum in the traffic dumps.
> > > > >
> > > > > To use the feature, you need few steps: enable the feature in OvS
> > > > > enable in qemu and inside the VM. The linux guest usually enable
> > > > > the feature by default if qemu offers it.
> > > > >
> > > > > HTH,
> > > > > fbl
> > > > >
> > > > >
> > > > > > Best,
> > > > > > Yifeng
> > > > > >
> > > > > > On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets  
> > > > > > wrote:
> > > > > > >
> > > > > > > On 27.01.2020 18:24, Yifeng Sun wrote:
> > > > > > > > Hi Flavio,
> > > > > > > >
> > > > > > > > I am testing your patch using iperf between 2 VMs on the same 
> > > > > > > > host.
> > > > > > > > 

Re: [ovs-dev] [PATCH ovn v2] ovn-northd: Consider load balancer active backends in router pipeline

2020-01-29 Thread Numan Siddique
On Wed, Jan 29, 2020 at 7:05 PM Dumitru Ceara  wrote:
>
> On 1/27/20 2:06 PM, num...@ovn.org wrote:
> > From: Numan Siddique 
> >
> > The commit [1] which added load balancer health check support
> > missed out on updating the logical flows to consider only
> > active backends in the logical router pipeline if a load balancer
> > is associated. This patch fixes it. It also refactors the code
> > a bit.
> >
> > Without this, an offline backend may be chosen for load balancing,
> > resulting in the packet loss.
> >
> > This patch also adds logical flows in ingress ACL and egress ACL logical
> > switch datapath pipeline to skip the ACL policies for service monitor health
> > check traffic.
> >
> > [1] - ba0d6eae960d("ovn-northd: Add support for Load Balancer health check")
> >
> > Reported-by: Maciej Józefczyk 
> > Fixes - ba0d6eae960d("ovn-northd: Add support for Load Balancer health 
> > check")
> > Signed-off-by: Numan Siddique 
>
> Looks good to me, just a couple of minor comments below. Otherwise:

Thanks for the review, I addressed your comments and applied the patch
to master.

Numan

>
> Acked-by: Dumitru Ceara 
>
> Thanks,
> Dumitru
>
> > ---
> >
> > v1 -> v2
> > --
> >   * Addressed the issues reported by Maciej - added the flows for the
> > service monitor packets to bypass the ACL pipeline.
> >
> >  northd/ovn-northd.8.xml |  22 
> >  northd/ovn-northd.c | 272 +++-
> >  tests/ovn.at|  27 
> >  3 files changed, 176 insertions(+), 145 deletions(-)
> >
> > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > index bcb320be9..74bf2b9a8 100644
> > --- a/northd/ovn-northd.8.xml
> > +++ b/northd/ovn-northd.8.xml
> > @@ -420,6 +420,17 @@
> >  in the request direction are skipped here to let a newly created
> >  ACL re-allow this connection.
> >
> > +
> > +  
> > +A priority 34000 logical flow is added for each logical switch 
> > datapath
> > +with the match eth.dst = E to allow the 
> > service
> > +monitor reply packet destined to ovn-controller
> > +with the action next, where E is the
> > +service monitor mac defined in the
> > + > +db="OVN_Northbound"/> colum of  > +db="OVN_Northbound"/> table.
> > +  
> >  
> >
> >  Ingress Table 7: from-lport QoS Marking
> > @@ -1279,6 +1290,17 @@ output;
> >  to allow the DNS reply packet from the
> >  Ingress Table 15:DNS responses.
> >
> > +
> > +  
> > +A priority 34000 logical flow is added for each logical switch 
> > datapath
> > +with the match eth.src = E to allow the 
> > service
> > +monitor request packet generated by ovn-controller
> > +with the action next, where E is the
> > +service monitor mac defined in the
> > + > +db="OVN_Northbound"/> colum of  > +db="OVN_Northbound"/> table.
> > +  
> >  
> >
> >  Egress Table 5: to-lport QoS Marking
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index d094587a6..9a0a19db8 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -3027,6 +3027,7 @@ struct lb_vip {
> >  struct lb_vip_backend {
> >  char *ip;
> >  uint16_t port;
> > +int addr_family;
> >
> >  struct ovn_port *op; /* Logical port to which the ip belong to. */
> >  bool health_check;
> > @@ -3182,6 +3183,7 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
> > *lbs,
> >
> >  lb->vips[n_vips].backends[i].ip = backend_ip;
> >  lb->vips[n_vips].backends[i].port = backend_port;
> > +lb->vips[n_vips].backends[i].addr_family = addr_family;
> >  lb->vips[n_vips].backends[i].op = op;
> >  lb->vips[n_vips].backends[i].svc_mon_src_ip = svc_mon_src_ip;
> >
> > @@ -3245,6 +3247,41 @@ ovn_lb_destroy(struct ovn_lb *lb)
> >  free(lb->vips);
> >  }
> >
> > +static void build_lb_vip_ct_lb_actions(struct lb_vip *lb_vip,
> > +   struct ds *action)
> > +{
> > +if (lb_vip->health_check) {
> > +ds_put_cstr(action, "ct_lb(");
> > +
> > +size_t n_active_backends = 0;
> > +for (size_t k = 0; k < lb_vip->n_backends; k++) {
> > +struct lb_vip_backend *backend = _vip->backends[k];
> > +bool is_up = true;
> > +if (backend->health_check && backend->sbrec_monitor &&
> > +backend->sbrec_monitor->status &&
> > +strcmp(backend->sbrec_monitor->status, "online")) {
> > +is_up = false;
> > +}
> > +
>
> Now that we're refactoring this code we could also remove the 'is_up'
> variable and just "continue" if the backend is not online.
>
> > +if (is_up) {
> > +n_active_backends++;
> > +ds_put_format(action, "%s:%"PRIu16",",
> > +backend->ip, backend->port);
> > 

Re: [ovs-dev] [PATCH ovn v2] ovn-northd: Consider load balancer active backends in router pipeline

2020-01-29 Thread Dumitru Ceara
On 1/27/20 2:06 PM, num...@ovn.org wrote:
> From: Numan Siddique 
> 
> The commit [1] which added load balancer health check support
> missed out on updating the logical flows to consider only
> active backends in the logical router pipeline if a load balancer
> is associated. This patch fixes it. It also refactors the code
> a bit.
> 
> Without this, an offline backend may be chosen for load balancing,
> resulting in the packet loss.
> 
> This patch also adds logical flows in ingress ACL and egress ACL logical
> switch datapath pipeline to skip the ACL policies for service monitor health
> check traffic.
> 
> [1] - ba0d6eae960d("ovn-northd: Add support for Load Balancer health check")
> 
> Reported-by: Maciej Józefczyk 
> Fixes - ba0d6eae960d("ovn-northd: Add support for Load Balancer health check")
> Signed-off-by: Numan Siddique 

Looks good to me, just a couple of minor comments below. Otherwise:

Acked-by: Dumitru Ceara 

Thanks,
Dumitru

> ---
> 
> v1 -> v2
> --
>   * Addressed the issues reported by Maciej - added the flows for the
> service monitor packets to bypass the ACL pipeline.
> 
>  northd/ovn-northd.8.xml |  22 
>  northd/ovn-northd.c | 272 +++-
>  tests/ovn.at|  27 
>  3 files changed, 176 insertions(+), 145 deletions(-)
> 
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index bcb320be9..74bf2b9a8 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -420,6 +420,17 @@
>  in the request direction are skipped here to let a newly created
>  ACL re-allow this connection.
>
> +
> +  
> +A priority 34000 logical flow is added for each logical switch 
> datapath
> +with the match eth.dst = E to allow the 
> service
> +monitor reply packet destined to ovn-controller
> +with the action next, where E is the
> +service monitor mac defined in the
> + +db="OVN_Northbound"/> colum of  +db="OVN_Northbound"/> table.
> +  
>  
>  
>  Ingress Table 7: from-lport QoS Marking
> @@ -1279,6 +1290,17 @@ output;
>  to allow the DNS reply packet from the
>  Ingress Table 15:DNS responses.
>
> +
> +  
> +A priority 34000 logical flow is added for each logical switch 
> datapath
> +with the match eth.src = E to allow the 
> service
> +monitor request packet generated by ovn-controller
> +with the action next, where E is the
> +service monitor mac defined in the
> + +db="OVN_Northbound"/> colum of  +db="OVN_Northbound"/> table.
> +  
>  
>  
>  Egress Table 5: to-lport QoS Marking
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index d094587a6..9a0a19db8 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -3027,6 +3027,7 @@ struct lb_vip {
>  struct lb_vip_backend {
>  char *ip;
>  uint16_t port;
> +int addr_family;
>  
>  struct ovn_port *op; /* Logical port to which the ip belong to. */
>  bool health_check;
> @@ -3182,6 +3183,7 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
> *lbs,
>  
>  lb->vips[n_vips].backends[i].ip = backend_ip;
>  lb->vips[n_vips].backends[i].port = backend_port;
> +lb->vips[n_vips].backends[i].addr_family = addr_family;
>  lb->vips[n_vips].backends[i].op = op;
>  lb->vips[n_vips].backends[i].svc_mon_src_ip = svc_mon_src_ip;
>  
> @@ -3245,6 +3247,41 @@ ovn_lb_destroy(struct ovn_lb *lb)
>  free(lb->vips);
>  }
>  
> +static void build_lb_vip_ct_lb_actions(struct lb_vip *lb_vip,
> +   struct ds *action)
> +{
> +if (lb_vip->health_check) {
> +ds_put_cstr(action, "ct_lb(");
> +
> +size_t n_active_backends = 0;
> +for (size_t k = 0; k < lb_vip->n_backends; k++) {
> +struct lb_vip_backend *backend = _vip->backends[k];
> +bool is_up = true;
> +if (backend->health_check && backend->sbrec_monitor &&
> +backend->sbrec_monitor->status &&
> +strcmp(backend->sbrec_monitor->status, "online")) {
> +is_up = false;
> +}
> +

Now that we're refactoring this code we could also remove the 'is_up'
variable and just "continue" if the backend is not online.
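
For illustration, a minimal self-contained sketch of that suggestion. It is not the actual ovn-northd code: `struct backend_sketch` is a hypothetical stand-in for `struct lb_vip_backend`, a plain `char` buffer replaces OVN's `struct ds`, and only the health-check branch is modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical simplified backend; 'status' is NULL when no service
 * monitor status is known. */
struct backend_sketch {
    const char *ip;
    uint16_t port;
    bool health_check;
    const char *status;
};

/* Writes "ct_lb(ip:port,...);" for the active backends, or "drop;" when
 * none is active.  Returns false if 'out' is too small. */
static bool
build_ct_lb_action_sketch(const struct backend_sketch *backends,
                          size_t n_backends, char *out, size_t size)
{
    size_t len, n_active = 0;
    int n = snprintf(out, size, "ct_lb(");

    if (n < 0 || (size_t) n >= size) {
        return false;
    }
    len = (size_t) n;

    for (size_t k = 0; k < n_backends; k++) {
        const struct backend_sketch *b = &backends[k];

        if (b->health_check && b->status && strcmp(b->status, "online")) {
            continue;   /* Backend reported as not online: skip it. */
        }
        n = snprintf(out + len, size - len, "%s%s:%u",
                     n_active ? "," : "", b->ip, (unsigned) b->port);
        if (n < 0 || (size_t) n >= size - len) {
            return false;
        }
        len += (size_t) n;
        n_active++;
    }

    if (!n_active) {
        n = snprintf(out, size, "drop;");
        return n >= 0 && (size_t) n < size;
    }
    n = snprintf(out + len, size - len, ");");
    return n >= 0 && (size_t) n < size - len;
}
```

With the early `continue` the `is_up` flag is no longer needed. Writing the separator before each entry instead of after it is a further simplification of this sketch that avoids trimming a trailing comma.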

> +if (is_up) {
> +n_active_backends++;
> +ds_put_format(action, "%s:%"PRIu16",",
> +backend->ip, backend->port);
> +}
> +}
> +
> +if (!n_active_backends) {
> +ds_clear(action);
> +ds_put_cstr(action, "drop;");
> +} else {
> +ds_chomp(action, ',');
> +ds_put_cstr(action, ");");
> +}
> +} else {
> +ds_put_format(action, "ct_lb(%s);", lb_vip->backend_ips);
> +}
> +}
> +
>  static void
>  

Re: [ovs-dev] [PATCH ovn 2/2] ovn-northd: Support hairpinning for logical switch load balancing.

2020-01-29 Thread Dumitru Ceara
On 1/20/20 2:25 PM, Numan Siddique wrote:
> On Thu, Jan 16, 2020 at 9:08 PM Dumitru Ceara  wrote:
>>
>> In case a VIF is trying to connect to a load balancer VIP that includes in
>> its backends the VIF itself, traffic would get DNAT-ed, ct_lb(VIP), but
>> when it reaches the VIF, the VIF will try to reply locally as the source IP
>> is known to be local. For this kind of hairpinning to work properly, reply
>> traffic must be sent back through OVN and the way to enforce that is to
>> perform SNAT (VIF source IP -> VIP) on hairpinned packets.
>>
>> For load balancers configured on gateway logical routers we already have the
>> possibility of using 'lb_force_snat_ip' but for load balancers configured
>> on logical switches there's no such configuration.
>>
>> For this second case we take an automatic approach which determines if
>> load balanced traffic needs to be hairpinned and execute the SNAT. To achieve
>> this, two new stages are added to the logical switch ingress pipeline:
>> - Ingress Table 11: Pre-Hairpin: which matches on load balanced traffic
>>   coming from VIFs that needs to be hairpinned and sets REGBIT_HAIRPIN
>>   (reg0[6]) to 1. If the traffic is in the direction that initiated the
>>   connection then 'ct_snat(VIP)' is performed, otherwise 'ct_snat' is
>>   used to unSNAT replies.
>> - Ingress Table 12: Hairpin: which hairpins packets at L2 (swaps Ethernet
>>   addresses and loops traffic back on the ingress port) if REGBIT_HAIRPIN
>>   is 1.
>>
>> Also, update all references to logical switch ingress pipeline tables to use
>> the correct indices.
>>
>> Reported-at: https://github.com/ovn-org/ovn-kubernetes/issues/817
>> Signed-off-by: Dumitru Ceara 
>> ---
>>  northd/ovn-northd.8.xml   |   57 --
>>  northd/ovn-northd.c   |  260 
>> ++---
>>  tests/ovn.at  |  209 
>>  utilities/ovn-trace.8.xml |4 -
>>  4 files changed, 406 insertions(+), 124 deletions(-)
> 
> Hi Dumitru,
> 
> The patch LGTM. I have a small comment below, please take a look.
> 
> Can you please add or enhance the system tests in system-ovn.at to
> handle this scenario ?
> 
> Thanks
> Numan
> 

Thanks for reviewing this Numan. V2 posted at
https://patchwork.ozlabs.org/patch/1230876/ includes system tests for LB
hairpinning and addresses your comments.

Regards,
Dumitru



[ovs-dev] [PATCH ovn v2] ovn-northd: Support hairpinning for logical switch load balancing.

2020-01-29 Thread Dumitru Ceara
In case a VIF is trying to connect to a load balancer VIP that includes in
its backends the VIF itself, traffic would get DNAT-ed, ct_lb(VIP), but
when it reaches the VIF, the VIF will try to reply locally as the source IP
is known to be local. For this kind of hairpinning to work properly, reply
traffic must be sent back through OVN and the way to enforce that is to
perform SNAT (VIF source IP -> VIP) on hairpinned packets.

For load balancers configured on gateway logical routers we already have the
possibility of using 'lb_force_snat_ip' but for load balancers configured
on logical switches there's no such configuration.

For this second case we take an automatic approach which determines if
load balanced traffic needs to be hairpinned and execute the SNAT. To achieve
this, two new stages are added to the logical switch ingress pipeline:
- Ingress Table 11: Pre-Hairpin: which matches on load balanced traffic
  coming from VIFs that needs to be hairpinned and sets REGBIT_HAIRPIN
  (reg0[6]) to 1. If the traffic is in the direction that initiated the
  connection then 'ct_snat(VIP)' is performed, otherwise 'ct_snat' is
  used to unSNAT replies.
- Ingress Table 12: Hairpin: which hairpins packets at L2 (swaps Ethernet
  addresses and loops traffic back on the ingress port) if REGBIT_HAIRPIN
  is 1.

Also, update all references to logical switch ingress pipeline tables to use
the correct indices.
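
As a rough illustration of the two stages described above, here is a toy model in plain C. It is not how ovn-northd expresses this (the real logic is logical flows): `struct pkt_sketch`, its fields and both function names are hypothetical, and only the request direction is modelled, not the unSNAT of replies.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified packet model for illustration only. */
struct pkt_sketch {
    uint8_t eth_src[6];
    uint8_t eth_dst[6];
    uint32_t ip_src;
    uint32_t ip_dst;       /* Backend IP once load balancing has DNAT-ed. */
    int in_port;
    int out_port;
    bool load_balanced;
    bool hairpin;          /* Stands in for REGBIT_HAIRPIN (reg0[6]). */
};

/* Pre-Hairpin: load balanced traffic whose destination now equals its
 * source is marked and source-NATed to the VIP. */
static bool
pre_hairpin_sketch(struct pkt_sketch *p, uint32_t vip)
{
    if (!p->load_balanced || p->ip_src != p->ip_dst) {
        return false;
    }
    p->hairpin = true;
    p->ip_src = vip;       /* Models ct_snat(VIP). */
    return true;
}

/* Hairpin: swap Ethernet addresses and loop back on the ingress port. */
static void
hairpin_sketch(struct pkt_sketch *p)
{
    uint8_t tmp[6];

    if (!p->hairpin) {
        return;
    }
    memcpy(tmp, p->eth_src, sizeof tmp);
    memcpy(p->eth_src, p->eth_dst, sizeof tmp);
    memcpy(p->eth_dst, tmp, sizeof tmp);
    p->out_port = p->in_port;
}
```

Because the source becomes the VIP, the backend no longer sees its own address as the peer and replies through OVN instead of answering locally.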

Reported-at: https://github.com/ovn-org/ovn-kubernetes/issues/817
Signed-off-by: Dumitru Ceara 

---
v2:
- add system-ovn.at tests for LB hairpinning.
- address Numan's comments.
---
 northd/ovn-northd.8.xml   |  57 +++---
 northd/ovn-northd.c   | 260 +++---
 tests/ovn.at  | 209 -
 tests/system-ovn.at   | 158 
 utilities/ovn-trace.8.xml |   4 +-
 5 files changed, 564 insertions(+), 124 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index dc07a70..ee1e58e 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -527,7 +527,40 @@
   
 
 
-Ingress Table 11: ARP/ND responder
+Ingress Table 11: Pre-Hairpin
+
+  
+For all configured load balancer backends a priority-2 flow that
+matches on traffic that needs to be hairpinned, i.e., after load
+balancing the destination IP matches the source IP, which sets
+reg0[6] = 1  and executes ct_snat(VIP)
+to force replies to these packets to come back through OVN.
+  
+  
+For all configured load balancer backends a priority-1 flow that
+matches on replies to hairpinned traffic, i.e., destination IP is VIP,
+source IP is the backend IP and source L4 port is backend port, which
+sets reg0[6] = 1  and executes ct_snat;.
+  
+  
+A priority-0 flow that simply moves traffic to the next table.
+  
+
+
+Ingress Table 12: Hairpin
+
+  
+A priority-1 flow that hairpins traffic matched by non-default
+flows in the Pre-Hairpin table. Hairpinning is done at L2, Ethernet
+addresses are swapped and the packets are looped back on the input
+port.
+  
+  
+A priority-0 flow that simply moves traffic to the next table.
+  
+
+
+Ingress Table 13: ARP/ND responder
 
 
   This table implements ARP/ND responder in a logical switch for known
@@ -811,7 +844,7 @@ output;
   
 
 
-Ingress Table 12: DHCP option processing
+Ingress Table 14: DHCP option processing
 
 
   This table adds the DHCPv4 options to a DHCPv4 packet from the
@@ -868,11 +901,11 @@ next;
   
 
   
-A priority-0 flow that matches all packets to advances to table 11.
+A priority-0 flow that matches all packets to advances to table 15.
   
 
 
-Ingress Table 13: DHCP responses
+Ingress Table 15: DHCP responses
 
 
   This table implements DHCP responder for the DHCP replies generated by
@@ -950,11 +983,11 @@ output;
   
 
   
-A priority-0 flow that matches all packets to advances to table 12.
+A priority-0 flow that matches all packets to advances to table 16.
   
 
 
-Ingress Table 14 DNS Lookup
+Ingress Table 16 DNS Lookup
 
 
   This table looks up and resolves the DNS names to the corresponding
@@ -983,7 +1016,7 @@ reg0[4] = dns_lookup(); next;
   
 
 
-Ingress Table 15 DNS Responses
+Ingress Table 17 DNS Responses
 
 
   This table implements DNS responder for the DNS replies generated by
@@ -1018,7 +1051,7 @@ output;
   
 
 
-Ingress table 16 External ports
+Ingress table 18 External ports
 
 
   Traffic from the external logical ports enter the ingress
@@ -1046,11 +1079,11 @@ output;
   
 
   
-A priority-0 flow that matches all packets to advances to table 17.
+A 

Re: [ovs-dev] [PATCH v3 ovn 0/2] Add MLD support.

2020-01-29 Thread Dumitru Ceara
On 1/29/20 1:28 PM, Numan Siddique wrote:
> On Wed, Jan 29, 2020 at 2:59 PM Dumitru Ceara  wrote:
>>
>> The first patch of the series is a minor fix of how IP multicast traffic
>> is matched.
>>
>> The second patch extends the already existing IPv4 Multicast support
>> (IGMP snooping, IGMP querier, relay and static flood config) to IPv6
>> by implementing MLDv1 & MLDv2 snooping and querier.
>>
>> Signed-off-by: Dumitru Ceara 
> 
> Thanks Dumitru
> I applied this series to master.
> 
> Numan
> 

Thanks Numan!



Re: [ovs-dev] [PATCH v3 ovn 0/2] Add MLD support.

2020-01-29 Thread Numan Siddique
On Wed, Jan 29, 2020 at 2:59 PM Dumitru Ceara  wrote:
>
> The first patch of the series is a minor fix of how IP multicast traffic
> is matched.
>
> The second patch extends the already existing IPv4 Multicast support
> (IGMP snooping, IGMP querier, relay and static flood config) to IPv6
> by implementing MLDv1 & MLDv2 snooping and querier.
>
> Signed-off-by: Dumitru Ceara 

Thanks Dumitru
I applied this series to master.

Numan

>
> Dumitru Ceara (2):
>   ovn-northd: Fix ipv4.mcast logical field.
>   ovn: Add MLD support.
>
>
>  NEWS|1
>  controller/pinctrl.c|  359 +++--
>  lib/logical-fields.c|   36 +++
>  lib/ovn-l7.h|   97 
>  northd/ovn-northd.8.xml |   24 ++
>  northd/ovn-northd.c |  109 +++--
>  ovn-nb.xml  |4
>  ovn-sb.ovsschema|5
>  ovn-sb.xml  |5
>  tests/ovn.at|  579 
> +++
>  tests/system-ovn.at |   73 +-
>  11 files changed, 1162 insertions(+), 130 deletions(-)
>
>
> ---
> v3:
> - Reorder IN_IP_INPUT router flows so that flows that reply to ARP/ND have
>   higher priority than multicast flows. This fixes system-ovn.at failures.
> - Update ovn-northd man page accordingly.
> - Fix system-ovn.at IGMP test to properly check the group name and add MLD
>   system test.
> v2:
> - Rebase and fix conflict in NEWS.
> - Add Mark's acks.
>
>


Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Ilya Maximets
On 29.01.2020 12:25, Flavio Leitner wrote:
> On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
>> Sure.
>>
>> Firstly, make sure userspace-tso-enable is true
>> # ovs-vsctl get Open_vSwitch . other_config
>> {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
>> userspace-tso-enable="true"}
>>
>> Next, create 2 VMs with vhostuser-type interface on the same KVM host:
>> 
>>   
>>   > path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
>>   
>>   
>> 
>> 
> 
> I have other options set, but I don't think they are related:
> ufo='off' mrg_rxbuf='on'/>> ecn='off' ufo='off'/>
> 
> 
>>   
>>   
>>   > function='0x0'/>
>> 
>>
>> When VM boots up, turn on tx, tso and sg
>> # ethtool -K ens6 tx on
>> # ethtool -K ens6 tso on
>> # ethtool -K ens6 sg on

Could you, please, provide the output of 'ethtool -k ens6'?
If for some reason rx offloading is not enabled by default, you need to
enable it too.

> 
> All the needed offloading features are turned on by default,
> so I don't change anything in my testbed.
> 
>> Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM.
>> Iperf doesn't work if there is no change to VM's kernel. `tcpdump` shows
>> that iperf server received packets with invalid TCP checksum.
>> `nstat -a` shows that TcpInCsumErr number is accumulating.
>>
>> After adding changes to VM's kernel as below, iperf works properly.
>> in tcp_v4_rcv()
>>   - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
>>   + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
>>
>> static inline bool tcp_checksum_complete(struct sk_buff *skb)
>> {
>> return 0;
>> }
> 
> That's odd. Which kernel is that? Maybe I can try the same version.
> I am using 5.2.14-200.fc30.x86_64.
> 
> Looks like somehow the packet lost its offloading flags, then kernel
> has to check the csum and since it wasn't calculated before, it's 
> just random garbage.
> 
> fbl
> 
> 
>>
>>
>>
>> Best,
>> Yifeng
>>
>> On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner  wrote:
>>>
>>> On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote:
>>>> Hi Flavio,
>>>>
>>>> Thanks for the explanation. I followed the steps in the document but
>>>> TCP connection still failed to build between 2 VMs.
>>>>
>>>> I finally modified VM's kernel directly to disable TCP checksum validation
>>>> to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs.
>>>
>>> Could you provide more details on how you did that? What's running
>>> inside the VM?
>>>
>>> I don't change anything inside of the VMs (Linux) in my testbed.
>>>
>>> fbl
>>>
>>>

>>>> Best,
>>>> Yifeng
>>>>
>>>>
>>>> On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner  wrote:
>>>>>
>>>>> On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote:
>>>>>> Hi Ilya,
>>>>>>
>>>>>> Thanks for your reply.
>>>>>>
>>>>>> The thing is, if checksum offloading is enabled in both VMs, then
>>>>>> sender VM will send a packet with invalid TCP checksum, and later OVS
>>>>>> will send this packet to receiver VM directly without calculating a
>>>>>> valid checksum. As a result, receiver VM will drop this packet because
>>>>>> it contains invalid checksum. This is what happened when I tried this
>>>>>> patch.
>>>>>>
>>>>>
>>>>> When TSO is enabled, the TX checksumming offloading is required,
>>>>> then you will see invalid checksum. This is well documented here:
>>>>>
>>>>> https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso
>>>>>
>>>>> "Additionally, if the traffic is headed to a VM within the same host
>>>>> further optimization can be expected. As the traffic never leaves
>>>>> the machine, no MTU needs to be accounted for, and thus no
>>>>> segmentation and checksum calculations are required, which saves yet
>>>>> more cycles."
>>>>>
>>>>> Therefore, it's expected to see bad csum in the traffic dumps.
>>>>>
>>>>> To use the feature, you need few steps: enable the feature in OvS
>>>>> enable in qemu and inside the VM. The linux guest usually enable
>>>>> the feature by default if qemu offers it.
>>>>>
>>>>> HTH,
>>>>> fbl
>>>>>
>>>>>
>>>>>> Best,
>>>>>> Yifeng
>>>>>>
>>>>>> On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets  wrote:
>>>>>>>
>>>>>>> On 27.01.2020 18:24, Yifeng Sun wrote:
>>>>>>>> Hi Flavio,
>>>>>>>>
>>>>>>>> I am testing your patch using iperf between 2 VMs on the same host.
>>>>>>>> But it seems that TCP connection can't be created between these 2 VMs.
>>>>>>>> When inspecting further, I found that TCP packets have invalid checksums.
>>>>>>>> This might be the reason.
>>>>>>>>
>>>>>>>> I am wondering if I missed something in the setup? Thanks a lot.
>>>>>>>
>>>>>>> I didn't test myself, but according to current design, checksum offloading
>>>>>>> (rx and tx) should be enabled in both VMs.  Otherwise all the packets will
>>>>>>> be dropped by 

Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

2020-01-29 Thread Flavio Leitner
On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> Sure.
> 
> Firstly, make sure userspace-tso-enable is true
> # ovs-vsctl get Open_vSwitch . other_config
> {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> userspace-tso-enable="true"}
> 
> Next, create 2 VMs with vhostuser-type interface on the same KVM host:
> 
>   
>path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
>   
>   
> 
> 

I have other options set, but I don't think they are related:
   
   


>   
>   
>function='0x0'/>
> 
> 
> When VM boots up, turn on tx, tso and sg
> # ethtool -K ens6 tx on
> # ethtool -K ens6 tso on
> # ethtool -K ens6 sg on

All the needed offloading features are turned on by default,
so I don't change anything in my testbed.

> Then run 'iperf -s' on one VM and 'iperf -c xx.xx.xx.xx' on another VM.
> Iperf doesn't work if there is no change to VM's kernel. `tcpdump` shows
> that iperf server received packets with invalid TCP checksum.
> `nstat -a` shows that TcpInCsumErr number is accumulating.
> 
> After adding changes to VM's kernel as below, iperf works properly.
> in tcp_v4_rcv()
>   - if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
>   + if (skb_checksum_init(skb, IPPROTO_TCP, inet_compute_pseudo))
> 
> static inline bool tcp_checksum_complete(struct sk_buff *skb)
> {
> return 0;
> }

That's odd. Which kernel is that? Maybe I can try the same version.
I am using 5.2.14-200.fc30.x86_64.

Looks like somehow the packet lost its offloading flags, then kernel
has to check the csum and since it wasn't calculated before, it's 
just random garbage.
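
To make the failure mode concrete, here is a minimal sketch of the
Internet checksum (RFC 1071) and of what a receiver-side check does
with a checksum field that was never filled in. This is not the kernel
code: the helper names are made up for illustration, and a real TCP
check also covers the pseudo-header, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds 'len' bytes to a running one's-complement sum, reading the data
 * as 16-bit big-endian words (RFC 1071).  Hypothetical helper. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len) {
        sum += (uint32_t)(data[0] << 8);   /* Pad the odd trailing byte. */
    }
    return sum;
}

/* Folds the carries back in and returns the complemented 16-bit result. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Value to store in the checksum field, assuming that field is zero
 * in 'data' while this is computed. */
static uint16_t csum_compute(const uint8_t *data, size_t len)
{
    return csum_fold(csum_add(0, data, len));
}

/* Returns 1 if the data, including its stored checksum, verifies. */
static int csum_verify(const uint8_t *data, size_t len)
{
    return csum_fold(csum_add(0, data, len)) == 0;
}
```

If the sender relied on offloading and nobody completed the checksum,
the stored field is effectively arbitrary, so a software check like
csum_verify() fails unless the packet still carries the flags telling
the receiver to skip it.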

fbl


> 
> 
> 
> Best,
> Yifeng
> 
> On Tue, Jan 28, 2020 at 2:52 PM Flavio Leitner  wrote:
> >
> > On Tue, Jan 28, 2020 at 02:21:30PM -0800, Yifeng Sun wrote:
> > > Hi Flavio,
> > >
> > > Thanks for the explanation. I followed the steps in the document but
> > > TCP connection still failed to build between 2 VMs.
> > >
> > > I finally modified VM's kernel directly to disable TCP checksum validation
> > > to get it working properly. I got 30.0Gbps for 'iperf' between 2 VMs.
> >
> > Could you provide more details on how you did that? What's running
> > inside the VM?
> >
> > I don't change anything inside of the VMs (Linux) in my testbed.
> >
> > fbl
> >
> >
> > >
> > > Best,
> > > Yifeng
> > >
> > >
> > > On Tue, Jan 28, 2020 at 4:00 AM Flavio Leitner  wrote:
> > > >
> > > > On Mon, Jan 27, 2020 at 05:17:01PM -0800, Yifeng Sun wrote:
> > > > > Hi Ilya,
> > > > >
> > > > > Thanks for your reply.
> > > > >
> > > > > The thing is, if checksum offloading is enabled in both VMs, then
> > > > > sender VM will send
> > > > > a packet with invalid TCP checksum, and later OVS will send this
> > > > > packet to receiver
> > > > > VM directly without calculating a valid checksum. As a result,
> > > > > receiver VM will drop
> > > > > this packet because it contains invalid checksum. This is what
> > > > > happened when I tried
> > > > > this patch.
> > > > >
> > > >
> > > > When TSO is enabled, the TX checksumming offloading is required,
> > > > then you will see invalid checksum. This is well documented here:
> > > >
> > > > https://github.com/openvswitch/ovs/blob/master/Documentation/topics/userspace-tso.rst#userspace-datapath---tso
> > > >
> > > > "Additionally, if the traffic is headed to a VM within the same host
> > > > further optimization can be expected. As the traffic never leaves
> > > > the machine, no MTU needs to be accounted for, and thus no
> > > > segmentation and checksum calculations are required, which saves yet
> > > > more cycles."
> > > >
> > > > Therefore, it's expected to see bad csum in the traffic dumps.
> > > >
> > > > > To use the feature, you need a few steps: enable the feature in OvS,
> > > > > enable it in qemu and inside the VM. The Linux guest usually enables
> > > > > the feature by default if qemu offers it.
> > > >
> > > > HTH,
> > > > fbl
> > > >
> > > >
> > > > > Best,
> > > > > Yifeng
> > > > >
> > > > > On Mon, Jan 27, 2020 at 12:09 PM Ilya Maximets  
> > > > > wrote:
> > > > > >
> > > > > > On 27.01.2020 18:24, Yifeng Sun wrote:
> > > > > > > Hi Flavio,
> > > > > > >
> > > > > > > I am testing your patch using iperf between 2 VMs on the same 
> > > > > > > host.
> > > > > > > But it seems that TCP connection can't be created between these 2 
> > > > > > > VMs.
> > > > > > > When inspecting further, I found that TCP packets have invalid 
> > > > > > > checksums.
> > > > > > > This might be the reason.
> > > > > > >
> > > > > > > I am wondering if I missed something in the setup? Thanks a lot.
> > > > > >
> > > > > > I didn't test myself, but according to current design, checksum 
> > > > > > offloading
> > > > > > > (rx and tx) should be enabled in both VMs.  Otherwise all the 
> > > > > > packets will
> > > > > > be dropped by the guest kernel.
> > > > > >
> > > > > > Best regards, Ilya Maximets.
> > > >
> > > > --
> > > > 



[ovs-dev] [PATCH v3 ovn 2/2] ovn: Add MLD support.

2020-01-29 Thread Dumitru Ceara
Extend the existing infrastructure used for IPv4 multicast to
IPv6 multicast:
- snoop MLDv1 & MLDv2 reports.
- if multicast querier is configured, generate MLDv2 queries.
- support IPv6 multicast relay.
- support static flood configuration for IPv6 multicast too.

Acked-by: Mark Michelson 
Signed-off-by: Dumitru Ceara 
---
 NEWS                    |    1
 controller/pinctrl.c    |  359 +++--
 lib/logical-fields.c    |   33 +++
 lib/ovn-l7.h            |   97
 northd/ovn-northd.8.xml |   24 ++
 northd/ovn-northd.c     |  107 +++--
 ovn-nb.xml              |    4
 ovn-sb.ovsschema        |    5
 ovn-sb.xml              |    5
 tests/ovn.at            |  579 +++
 tests/system-ovn.at     |   73 +-
 11 files changed, 1159 insertions(+), 128 deletions(-)

diff --git a/NEWS b/NEWS
index 9e7d601..2b8cd6f 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,7 @@ Post-OVS-v2.12.0
- Added IPv6 NAT support for OVN routers.
- Added Stateless Floating IP support in OVN.
- Added Forwarding Group support in OVN.
+   - Added support for MLD Snooping and MLD Querier.
 
 v2.12.0 - 03 Sep 2019
 -
diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 5825bb1..a35c73a 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -263,7 +263,7 @@ static void ip_mcast_sync(
 struct ovsdb_idl_index *sbrec_igmp_groups,
 struct ovsdb_idl_index *sbrec_ip_multicast)
 OVS_REQUIRES(pinctrl_mutex);
-static void pinctrl_ip_mcast_handle_igmp(
+static void pinctrl_ip_mcast_handle(
 struct rconn *swconn,
 const struct flow *ip_flow,
 struct dp_packet *pkt_in,
@@ -1908,8 +1908,8 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
);
 break;
 case ACTION_OPCODE_IGMP:
-pinctrl_ip_mcast_handle_igmp(swconn, , ,
- _metadata, );
+pinctrl_ip_mcast_handle(swconn, , , _metadata,
+);
 break;
 
 case ACTION_OPCODE_PUT_ARP:
@@ -3205,32 +3205,55 @@ pinctrl_compose_ipv4(struct dp_packet *packet, struct eth_addr eth_src,
 packet->packet_type = htonl(PT_ETH);
 
 struct eth_header *eh = dp_packet_put_zeros(packet, sizeof *eh);
-eh->eth_dst = eth_dst;
-eh->eth_src = eth_src;
-
 struct ip_header *nh = dp_packet_put_zeros(packet, sizeof *nh);
 
+eh->eth_dst = eth_dst;
+eh->eth_src = eth_src;
 eh->eth_type = htons(ETH_TYPE_IP);
 dp_packet_set_l3(packet, nh);
 nh->ip_ihl_ver = IP_IHL_VER(5, 4);
-nh->ip_tot_len = htons(sizeof(struct ip_header) + ip_payload_len);
+nh->ip_tot_len = htons(sizeof *nh + ip_payload_len);
 nh->ip_tos = IP_DSCP_CS6;
 nh->ip_proto = ip_proto;
 nh->ip_frag_off = htons(IP_DF);
 
-/* Setting tos and ttl to 0 and 1 respectively. */
 packet_set_ipv4(packet, ipv4_src, ipv4_dst, 0, ttl);
 
 nh->ip_csum = 0;
 nh->ip_csum = csum(nh, sizeof *nh);
 }
 
+static void
+pinctrl_compose_ipv6(struct dp_packet *packet, struct eth_addr eth_src,
+ struct eth_addr eth_dst, struct in6_addr *ipv6_src,
+ struct in6_addr *ipv6_dst, uint8_t ip_proto, uint8_t ttl,
+ uint16_t ip_payload_len)
+{
+dp_packet_clear(packet);
+packet->packet_type = htonl(PT_ETH);
+
+struct eth_header *eh = dp_packet_put_zeros(packet, sizeof *eh);
+struct ip6_hdr *nh = dp_packet_put_zeros(packet, sizeof *nh);
+
+eh->eth_dst = eth_dst;
+eh->eth_src = eth_src;
+eh->eth_type = htons(ETH_TYPE_IPV6);
+dp_packet_set_l3(packet, nh);
+
+nh->ip6_vfc = 0x60;
+nh->ip6_nxt = ip_proto;
+nh->ip6_plen = htons(ip_payload_len);
+
+packet_set_ipv6(packet, ipv6_src, ipv6_dst, 0, 0, ttl);
+}
+
 /*
  * Multicast snooping configuration.
  */
 struct ip_mcast_snoop_cfg {
 bool enabled;
-bool querier_enabled;
+bool querier_v4_enabled;
+bool querier_v6_enabled;
 
 uint32_t table_size;   /* Max number of allowed multicast groups. */
 uint32_t idle_time_s;  /* Idle timeout for multicast groups. */
@@ -3238,10 +3261,19 @@ struct ip_mcast_snoop_cfg {
 uint32_t query_max_resp_s; /* Multicast query max-response field. */
 uint32_t seq_no;   /* Used for flushing learnt groups. */
 
-struct eth_addr query_eth_src; /* Src ETH address used for queries. */
-struct eth_addr query_eth_dst; /* Dst ETH address used for queries. */
-ovs_be32 query_ipv4_src;   /* Src IPv4 address used for queries. */
-ovs_be32 query_ipv4_dst;   /* Dsc IPv4 address used for queries. */
+struct eth_addr query_eth_src;/* Src ETH address used for queries. */
+struct eth_addr query_eth_v4_dst; /* Dst ETH address used for IGMP
+   * queries.
+   */
+struct eth_addr query_eth_v6_dst; /* Dst ETH address used for MLD
+  

[ovs-dev] [PATCH v3 ovn 1/2] ovn-northd: Fix ipv4.mcast logical field.

2020-01-29 Thread Dumitru Ceara
Acked-by: Mark Michelson 
Signed-off-by: Dumitru Ceara 
---
 lib/logical-fields.c |    3 ++-
 northd/ovn-northd.c  |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index 8fb591c..5748b67 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -158,7 +158,8 @@ ovn_init_symtab(struct shash *symtab)
 expr_symtab_add_field(symtab, "ip4.dst", MFF_IPV4_DST, "ip4", false);
 expr_symtab_add_predicate(symtab, "ip4.src_mcast",
   "ip4.src[28..31] == 0xe");
-expr_symtab_add_predicate(symtab, "ip4.mcast", "ip4.dst[28..31] == 0xe");
+expr_symtab_add_predicate(symtab, "ip4.mcast",
+  "eth.mcast && ip4.dst[28..31] == 0xe");
 
 expr_symtab_add_predicate(symtab, "icmp4", "ip4 && ip.proto == 1");
 expr_symtab_add_field(symtab, "icmp4.type", MFF_ICMPV4_TYPE, "icmp4",
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 1e26098..e11d614 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -6305,7 +6305,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
  * ports - RFC 4541, section 2.1.2, item 2.
  */
 ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 85,
-  "ip4 && ip4.dst == 224.0.0.0/24",
+  "ip4.mcast && ip4.dst == 224.0.0.0/24",
   "outport = \""MC_FLOOD"\"; output;");
 
 /* Forward uregistered IP multicast to routers with relay enabled



[ovs-dev] [PATCH v3 ovn 0/2] Add MLD support.

2020-01-29 Thread Dumitru Ceara
The first patch of the series is a minor fix of how IP multicast traffic
is matched.

The second patch extends the already existing IPv4 Multicast support
(IGMP snooping, IGMP querier, relay and static flood config) to IPv6
by implementing MLDv1 & MLDv2 snooping and querier.
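
For readers less familiar with the IPv6 side: a querier needs an
Ethernet destination for its MLD queries, and IPv6 multicast addresses
map to Ethernet addresses by the RFC 2464 rule (33:33 followed by the
last four octets of the IPv6 address). The sketch below shows that
mapping only; the helper name is made up and the constants the patch
actually uses are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Derives the Ethernet multicast address for an IPv6 multicast
 * destination (RFC 2464, section 7).  Hypothetical helper. */
static void ipv6_mcast_to_eth(const uint8_t ip6[16], uint8_t mac[6])
{
    assert(ip6[0] == 0xff);         /* Must be an IPv6 multicast address. */
    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &ip6[12], 4);   /* Low-order 32 bits of the address. */
}
```

For example, the all-nodes address ff02::1, which MLD general queries
are sent to, maps to 33:33:00:00:00:01.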

Signed-off-by: Dumitru Ceara 

Dumitru Ceara (2):
  ovn-northd: Fix ipv4.mcast logical field.
  ovn: Add MLD support.


 NEWS|1 
 controller/pinctrl.c|  359 +++--
 lib/logical-fields.c|   36 +++
 lib/ovn-l7.h|   97 
 northd/ovn-northd.8.xml |   24 ++
 northd/ovn-northd.c |  109 +++--
 ovn-nb.xml  |4 
 ovn-sb.ovsschema|5 
 ovn-sb.xml  |5 
 tests/ovn.at|  579 +++
 tests/system-ovn.at |   73 +-
 11 files changed, 1162 insertions(+), 130 deletions(-)


---
v3:
- Reorder IN_IP_INPUT router flows so that flows that reply to ARP/ND have
  higher priority than multicast flows. This fixes system-ovn.at failures.
- Update ovn-northd man page accordingly.
- Fix system-ovn.at IGMP test to properly check the group name and add MLD
  system test.
v2:
- Rebase and fix conflict in NEWS.
- Add Mark's acks.



Re: [ovs-dev] [PATCH 1/1] flow: Fix parsing l3_ofs with partial offloading

2020-01-29 Thread Eli Britstein
ping

On 1/14/2020 3:21 PM, Eli Britstein wrote:
> l3_ofs should be set for all Ethernet packets, not just IPv4/IPv6 ones.
> Otherwise, for example, ARP over VLAN-tagged packets may be processed
> incorrectly, such as by an action that changes the VLAN ID. Fix it.
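
A standalone sketch of the idea, not the lib/flow.c code: the L3 offset
depends only on the Ethernet header and any VLAN tags, so it can be
computed before looking at whether the payload is IPv4, IPv6, ARP or
anything else. The function and macro names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_VLAN 0x8100   /* 802.1Q tag */
#define ETHERTYPE_QINQ 0x88a8   /* 802.1ad service tag */

/* Returns the offset of the L3 header in 'frame' and stores the final
 * EtherType in '*eth_type', or returns -1 if the frame is too short. */
static int l3_offset(const uint8_t *frame, size_t len, uint16_t *eth_type)
{
    size_t ofs = 12;            /* Skip destination and source MAC. */
    uint16_t type;

    assert(frame != NULL && eth_type != NULL);
    for (;;) {
        if (len < ofs + 2) {
            return -1;
        }
        type = (uint16_t)((frame[ofs] << 8) | frame[ofs + 1]);
        ofs += 2;
        if (type != ETHERTYPE_VLAN && type != ETHERTYPE_QINQ) {
            break;
        }
        ofs += 2;               /* Skip the tag's TCI field. */
    }
    *eth_type = type;
    return (int)ofs;
}
```

For an ARP frame carrying one VLAN tag this yields offset 18 with
EtherType 0x0806, independent of any IPv4/IPv6 parsing.
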
>
> Fixes: aab96ec4d81e ("dpif-netdev: retrieve flow directly from the flow mark")
> Signed-off-by: Eli Britstein 
> Reviewed-by: Roi Dayan 
> ---
>   lib/flow.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/lib/flow.c b/lib/flow.c
> index 45bb96b54..5c32b4a01 100644
> --- a/lib/flow.c
> +++ b/lib/flow.c
> @@ -1107,6 +1107,7 @@ parse_tcp_flags(struct dp_packet *packet)
>   if (OVS_UNLIKELY(eth_type_mpls(dl_type))) {
>   packet->l2_5_ofs = (char *)data - frame;
>   }
> +packet->l3_ofs = (char *)data - frame;
>   if (OVS_LIKELY(dl_type == htons(ETH_TYPE_IP))) {
>   const struct ip_header *nh = data;
>   int ip_len;
> @@ -1116,7 +1117,6 @@ parse_tcp_flags(struct dp_packet *packet)
>   return 0;
>   }
>   dp_packet_set_l2_pad_size(packet, size - tot_len);
> -packet->l3_ofs = (uint16_t)((char *)nh - frame);
>   nw_proto = nh->ip_proto;
>   nw_frag = ipv4_get_nw_frag(nh);
>   
> @@ -1129,7 +1129,6 @@ parse_tcp_flags(struct dp_packet *packet)
>   if (OVS_UNLIKELY(!ipv6_sanity_check(nh, size))) {
>   return 0;
>   }
> -packet->l3_ofs = (uint16_t)((char *)nh - frame);
>   data_pull(, , sizeof *nh);
>   
>   plen = ntohs(nh->ip6_plen); /* Never pull padding. */