Re: [ovs-discuss] external port range on internal logical ip seems weird

2020-04-21 Thread Ankur Sharma
Hi Flavio,

Glad to see your feedback, please find my replies inline.

Regards,
Ankur


From: Flavio Fernandes 
Sent: Tuesday, April 21, 2020 6:59 AM
To: Ankur Sharma 
Cc: Numan Siddique ; Mark Michelson ; 
Terry Wilson ; ovs-discuss@openvswitch.org 

Subject: external port range on internal logical ip seems weird

[cc Numan, Mark, Terry, ovs-discuss]

Hi Ankur,

I'm taking a deeper look at the changes for external port range [0] and 
scratching
my head a little bit about a particular behavior.

Let me start by describing the basic setup I'm using:

1 internal switch with 1 logical port to represent a vm (10.0.0.3/24)
1 public switch (172.16.0.0/24)
1 rtr that connects both logical switches (10.0.0.1, 172.16.0.100)
1 snat_and_dnat rule for translating the ip, using port range

NOTE: The exact script is in this gist [1].
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl ls-add public
...
ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:03 10.0.0.3"
ovn-nbctl lr-add lr0
ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
...
ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.16.0.100/24
...
ovn-nbctl --portrange lr-nat-add lr0 dnat_and_snat 172.16.0.110 10.0.0.3 
sw0-port1 30:54:00:00:00:03 8080-8082

And this is what the logical flows look like for NAT:
[root@ovn-central /]# ovn-sbctl dump-flows lr0 | grep -i -e 'ct_' -e 'nat'
  table=5 (lr_in_unsnat   ), priority=100  , match=(ip && ip4.dst == 
172.16.0.110 && inport == "lr0-public"), action=(ct_snat;)
  table=5 (lr_in_unsnat   ), priority=0, match=(1), action=(next;)
  table=6 (lr_in_dnat ), priority=100  , match=(ip && ip4.dst == 
172.16.0.110 && inport == "lr0-public"), action=(ct_dnat(10.0.0.3,8080-8082);)
  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
  table=0 (lr_out_undnat  ), priority=100  , match=(ip && ip4.src == 
10.0.0.3 && outport == "lr0-public"), action=(eth.src = 30:54:00:00:00:03; 
ct_dnat;)
  table=0 (lr_out_undnat  ), priority=0, match=(1), action=(next;)
  table=1 (lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
  table=1 (lr_out_snat), priority=33   , match=(ip && ip4.src == 
10.0.0.3 && outport == "lr0-public"), action=(eth.src = 30:54:00:00:00:03; 
ct_snat(172.16.0.110,8080-8082);)
  table=1 (lr_out_snat), priority=0, match=(1), action=(next;)
  table=2 (lr_out_egr_loop), priority=100  , match=(ip4.dst == 172.16.0.110 
&& outport == "lr0-public" && is_chassis_resident("sw0-port1")), action=(clone 
{ ct_clear; inport = outport; outport = ""; flags = 0; flags.loopback = 1; reg0 
= 0; reg1 = 0; reg2 = 0; reg3 = 0; reg4 = 0; reg5 = 0; reg6 = 0; reg7 = 0; reg8 
= 0; reg9 = 0; reg9[0] = 1; next(pipeline=ingress, table=0); };)

Out of that:
[root@ovn-central /]# ovn-sbctl dump-flows lr0 | grep 8080
  table=6 (lr_in_dnat ), priority=100  , match=(ip && ip4.dst == 
172.16.0.110 && inport == "lr0-public"), action=(ct_dnat(10.0.0.3,8080-8082);)
  table=1 (lr_out_snat), priority=33   , match=(ip && ip4.src == 
10.0.0.3 && outport == "lr0-public"), action=(eth.src = 30:54:00:00:00:03; 
ct_snat(172.16.0.110,8080-8082);)

The rule "ct_dnat(10.0.0.3,8080-8082)" -- line 40 in gist [1] --  seems wrong 
to me because external port range should, as the name suggests, be only applied 
to the external ip[2]. Am I missing something? That particular code lives here 
[3][4].

What do you think? Maybe we also need "internal_port_range" semantics?

[ANKUR]: The idea behind the port range is to specify the range for port
address translation (PAT). Netfilter allows the port to be translated as
well while doing (src/dest) IP translation. This PAT happens in either
direction (depending on SNAT or DNAT), which is probably why the word
"external" is causing confusion. We don't need separate semantics; we can
simply rename "external_port_range" to the more generic "port_range".
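
For comparison, here is roughly what the equivalent PAT looks like in
netfilter terms (these iptables rules reuse the addresses from the example
above purely as an illustration; they are not what OVN programs):

# DNAT direction: rewrite the destination address and map the
# destination port into the range
iptables -t nat -A PREROUTING -p tcp -d 172.16.0.110 \
    -j DNAT --to-destination 10.0.0.3:8080-8082

# SNAT direction: rewrite the source address and map the source
# port into the same range
iptables -t nat -A POSTROUTING -p tcp -s 10.0.0.3 \
    -j SNAT --to-source 172.16.0.110:8080-8082

In other words, the range attaches to whichever address is being rewritten
in that direction, which is what the ct_dnat/ct_snat flows above show and
why "external" in the name is misleading.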

Not sure if you agree, but it could be easier to 

Re: [ovs-discuss] Question about RAFT cluster status output

2020-04-21 Thread Winson Wang
On Tue, Apr 21, 2020 at 2:19 PM Han Zhou  wrote:

>
>
> On Tue, Apr 21, 2020 at 2:12 PM Winson Wang 
> wrote:
> >
> > Hi Han
> >
> > I have question about the Connections output in my RAFT cluster.
> > Connections: -> ->3c2d <-29ce <-3c2d
> > Should the ""  be 29ce?
> >
> Yes, you are right. This seems to be a bug. Do you know how to reproduce
> this?
>

I am seeing this on my 646-node k8s cluster with OVN CNI.
To reproduce it, I think you need to drive the SB DB into a CPU-busy
state, for example by restarting all the ovn-controller clients, or by
an NB change that makes the SB generate a large flow count (e.g. 200K
flows) in a short time.

I run a background script to check the raft node role every 10 seconds;
here is some of its output for the role changes during the stress period.

SB role changed from follower to candidate on 13:21:06
SB role changed from candidate to leader on 13:21:16
SB role changed from leader to follower on 13:22:13
SB role changed from follower to candidate on 13:46:54
SB role changed from candidate to follower on 13:47:05
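
A minimal sketch of such a role watcher (assuming the ovnsb_db.ctl socket
path used in the cluster/status command quoted below):

#!/bin/bash
# Poll the SB raft role every 10 seconds and print transitions.
ctl=/var/run/openvswitch/ovnsb_db.ctl
prev=""
while true; do
    role=$(ovs-appctl -t "$ctl" cluster/status OVN_Southbound \
           | awk '/^Role:/ {print $2}')
    if [ -n "$role" ] && [ "$role" != "$prev" ]; then
        echo "$(date '+%H:%M:%S') SB role changed from ${prev:-unknown} to $role"
        prev="$role"
    fi
    sleep 10
done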



>
> > cluster leader output:
> > ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl cluster/status
> OVN_Southbound
> > bb7d
> > Name: OVN_Southbound
> > Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
> > Server ID: bb7d (bb7d4188-d0d5-4e5e-b0a3-fbbbaa849418)
> > Address: tcp:10.0.2.153:6644
> > Status: cluster member
> > Role: leader
> > Term: 41
> > Leader: self
> > Vote: self
> >
> > Election timer: 8000
> > Log: [8899, 8946]
> > Entries not yet committed: 0
> > Entries not yet applied: 0
> > Connections: -> ->3c2d <-29ce <-3c2d
> > Servers:
> > bb7d (bb7d at tcp:10.0.2.153:6644) (self) next_index=8920
> match_index=8945
> > 29ce (29ce at tcp:10.0.2.151:6644) next_index=8946 match_index=8945
> > 3c2d (3c2d at tcp:10.0.2.152:6644) next_index=8946 match_index=8945
> >
> > Name: OVN_Southbound
> > Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
> > Server ID: 3c2d (3c2d4666-f10d-49dc-a1b4-dec50720a79f)
> > Address: tcp:10.0.2.152:6644
> > Status: cluster member
> > Role: follower
> > Term: 41
> > Leader: bb7d
> > Vote: unknown
> >
> > Election timer: 8000
> > Log: [8898, 8946]
> > Entries not yet committed: 0
> > Entries not yet applied: 0
> > Connections: -> <-29ce <-bb7d ->bb7d
> > Servers:
> > bb7d (bb7d at tcp:10.0.2.153:6644)
> > 29ce (29ce at tcp:10.0.2.151:6644)
> > 3c2d (3c2d at tcp:10.0.2.152:6644) (self)
> >
> > 29ce
> > Name: OVN_Southbound
> > Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
> > Server ID: 29ce (29ce6194-ec71-4c8f-ba70-1953568ed4cc)
> > Address: tcp:10.0.2.151:6644
> > Status: cluster member
> > Role: follower
> > Term: 41
> > Leader: bb7d
> > Vote: bb7d
> >
> > Election timer: 8000
> > Log: [8875, 8929]
> > Entries not yet committed: 0
> > Entries not yet applied: 0
> > Connections: <-3c2d ->3c2d <-bb7d ->bb7d
> > Servers:
> > bb7d (bb7d at tcp:10.0.2.153:6644)
> > 29ce (29ce at tcp:10.0.2.151:6644) (self)
> > 3c2d (3c2d at tcp:10.0.2.152:6644)
> >
> > --
> > Winson
>


-- 
Winson


Re: [ovs-discuss] Question about RAFT cluster status output

2020-04-21 Thread Han Zhou
On Tue, Apr 21, 2020 at 2:12 PM Winson Wang  wrote:
>
> Hi Han
>
> I have question about the Connections output in my RAFT cluster.
> Connections: -> ->3c2d <-29ce <-3c2d
> Should the ""  be 29ce?
>
Yes, you are right. This seems to be a bug. Do you know how to reproduce
this?

> cluster leader output:
> ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl cluster/status
OVN_Southbound
> bb7d
> Name: OVN_Southbound
> Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
> Server ID: bb7d (bb7d4188-d0d5-4e5e-b0a3-fbbbaa849418)
> Address: tcp:10.0.2.153:6644
> Status: cluster member
> Role: leader
> Term: 41
> Leader: self
> Vote: self
>
> Election timer: 8000
> Log: [8899, 8946]
> Entries not yet committed: 0
> Entries not yet applied: 0
> Connections: -> ->3c2d <-29ce <-3c2d
> Servers:
> bb7d (bb7d at tcp:10.0.2.153:6644) (self) next_index=8920
match_index=8945
> 29ce (29ce at tcp:10.0.2.151:6644) next_index=8946 match_index=8945
> 3c2d (3c2d at tcp:10.0.2.152:6644) next_index=8946 match_index=8945
>
> Name: OVN_Southbound
> Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
> Server ID: 3c2d (3c2d4666-f10d-49dc-a1b4-dec50720a79f)
> Address: tcp:10.0.2.152:6644
> Status: cluster member
> Role: follower
> Term: 41
> Leader: bb7d
> Vote: unknown
>
> Election timer: 8000
> Log: [8898, 8946]
> Entries not yet committed: 0
> Entries not yet applied: 0
> Connections: -> <-29ce <-bb7d ->bb7d
> Servers:
> bb7d (bb7d at tcp:10.0.2.153:6644)
> 29ce (29ce at tcp:10.0.2.151:6644)
> 3c2d (3c2d at tcp:10.0.2.152:6644) (self)
>
> 29ce
> Name: OVN_Southbound
> Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
> Server ID: 29ce (29ce6194-ec71-4c8f-ba70-1953568ed4cc)
> Address: tcp:10.0.2.151:6644
> Status: cluster member
> Role: follower
> Term: 41
> Leader: bb7d
> Vote: bb7d
>
> Election timer: 8000
> Log: [8875, 8929]
> Entries not yet committed: 0
> Entries not yet applied: 0
> Connections: <-3c2d ->3c2d <-bb7d ->bb7d
> Servers:
> bb7d (bb7d at tcp:10.0.2.153:6644)
> 29ce (29ce at tcp:10.0.2.151:6644) (self)
> 3c2d (3c2d at tcp:10.0.2.152:6644)
>
> --
> Winson


[ovs-discuss] Question about RAFT cluster status output

2020-04-21 Thread Winson Wang
Hi Han

I have a question about the Connections output in my RAFT cluster.
Connections: *->* ->3c2d <-29ce <-3c2d
Should the ""  be 29ce?

cluster leader output:
ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl cluster/status
OVN_Southbound
bb7d
Name: OVN_Southbound
Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
Server ID: bb7d (bb7d4188-d0d5-4e5e-b0a3-fbbbaa849418)
Address: tcp:10.0.2.153:6644
Status: cluster member
Role: leader
Term: 41
Leader: self
Vote: self

Election timer: 8000
Log: [8899, 8946]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: *->* ->3c2d <-29ce <-3c2d
Servers:
bb7d (bb7d at tcp:10.0.2.153:6644) (self) next_index=8920
match_index=8945
29ce (29ce at tcp:10.0.2.151:6644) next_index=8946 match_index=8945
3c2d (3c2d at tcp:10.0.2.152:6644) next_index=8946 match_index=8945

Name: OVN_Southbound
Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
Server ID: 3c2d (3c2d4666-f10d-49dc-a1b4-dec50720a79f)
Address: tcp:10.0.2.152:6644
Status: cluster member
Role: follower
Term: 41
Leader: bb7d
Vote: unknown

Election timer: 8000
Log: [8898, 8946]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: -> <-29ce <-bb7d ->bb7d
Servers:
bb7d (bb7d at tcp:10.0.2.153:6644)
29ce (29ce at tcp:10.0.2.151:6644)
3c2d (3c2d at tcp:10.0.2.152:6644) (self)

29ce
Name: OVN_Southbound
Cluster ID: c316 (c316d7c4-6a72-4124-aa62-657b7c50c5c6)
Server ID: 29ce (29ce6194-ec71-4c8f-ba70-1953568ed4cc)
Address: tcp:10.0.2.151:6644
Status: cluster member
Role: follower
Term: 41
Leader: bb7d
Vote: bb7d

Election timer: 8000
Log: [8875, 8929]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-3c2d ->3c2d <-bb7d ->bb7d
Servers:
bb7d (bb7d at tcp:10.0.2.153:6644)
29ce (29ce at tcp:10.0.2.151:6644) (self)
3c2d (3c2d at tcp:10.0.2.152:6644)

-- 
Winson


[ovs-discuss] OVS bridge VS. Linux bridge: 2 libvirt's VMs both using OVS inside testlab

2020-04-21 Thread Igor Podlesny
The VMs have:
  - a virtio_net Ethernet adapter;
  - OVS installed, with its bridge configured atop that Ethernet adapter.

When those VMs are connected to a "standard" Linux bridge, everything
works flawlessly: the VMs can ping the hypervisor bridge by its IP and
ping each other just fine.

When connecting just one of them to the hypervisor's own OVS bridge,
no issues were spotted either. But launching both VMs simultaneously is
where it breaks. In this example a ping from one VM to another was
echoed back, but only once:

box-64-69% ping 192.168.64.70
PING 192.168.64.70 (192.168.64.70) 56(84) bytes of data.
64 bytes from 192.168.64.70: icmp_seq=1 ttl=64 time=4.88 ms
^C
--- 192.168.64.70 ping statistics ---
5 packets transmitted, 1 received, 80% packet loss, time 58ms
rtt min/avg/max/mdev = 4.879/4.879/4.879/0.000 ms

A set of questions arises:
  - Is it due to a loop? Enabling, say, RSTP across all 3 involved OVS
bridges doesn't change anything.
  - How can it be debugged? (see the sketch below)
  - And why does it behave differently from the "standard" Linux bridge?
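
A few commands that usually help narrow this kind of problem down (bridge
and port names below are placeholders for the actual setup):

# See which port each MAC was learned on; an entry flapping between
# ports is a strong hint of a loop.
ovs-appctl fdb/show <bridge>

# Look at the datapath flows actually being hit for the ICMP traffic.
ovs-appctl dpctl/dump-flows | grep -i icmp

# Trace how the bridge would forward a given packet.
ovs-appctl ofproto/trace <bridge> in_port=<port>,icmp,dl_src=<vm1-mac>,dl_dst=<vm2-mac>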

-- 
End of message. Next message?


[ovs-discuss] IPv4 to IPv6 NAT and vice a versa

2020-04-21 Thread Brendan Doyle

Hi,

I noticed that a patch for IPv6 NAT support went in late last year, but all
the test cases in the patch had either both IPv4 addresses or both IPv6
addresses. So I'm just wondering: is it possible to do IPv4-to-IPv6 NAT and
vice versa?

Thanks


Re: [ovs-discuss] ssh not working between VMs on different hypervisors

2020-04-21 Thread Brendan Doyle
Solved, the darn MTU! I forgot that it needs to be lowered to account for
the tunnel overhead.

On both VMs: "ip link set eth1 mtu 1400"

Now ssh works :)
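
For reference, the usual back-of-the-envelope Geneve-over-IPv4 overhead
(the OVN metadata size is approximate):

  outer IPv4 header       20
  outer UDP header         8
  Geneve base header       8
  OVN Geneve options      ~8
  inner Ethernet header   14
  --------------------------
  total                  ~58   =>  1500 - 58 = ~1442

so anything at or below ~1442 should fit, and 1400 simply leaves a
comfortable margin.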


On 21/04/2020 15:29, Numan Siddique wrote:

On Tue, Apr 21, 2020 at 6:37 PM Brendan Doyle  wrote:

Folks,

Anybody seen this, is it a known problem?

VM1 on hypervisor 1

ping IP of VM1 on hypervisor2

# ping -c1 192.16.1.5
PING 192.16.1.5 (192.16.1.5) 56(84) bytes of data.
64 bytes from 192.16.1.5: icmp_seq=1 ttl=64 time=0.494 ms

But
# ssh 192.16.1.5
Connection closed by 192.16.1.5 port 22


ssh works between VMs on the same hypervisor, seems going through the
tunnel is the
problem.

I don't think its a tunnel problem.

On VM2 hypervisor, you can run tcpdump on genev_sys_6081 interface and
see if you receive the ssh packets.

Thanks
Numan



configuration
===
VM1 hypervisor1
---
ca-rain06-vmovs-1 ~]# ip a sh  eth1
3: eth1:  mtu 1500 qdisc pfifo_fast
state UP group default qlen 1000
  link/ether 52:54:00:be:06:16 brd ff:ff:ff:ff:ff:ff
  inet 192.16.1.6/24 brd 192.16.1.255 scope global eth1
 valid_lft forever preferred_lft forever
  inet6 fe80::5054:ff:febe:616/64 scope link
 valid_lft forever preferred_lft forever

VM1 hypervisor2

ca-rain05-vmovs-1 ~]#  ip a sh  eth1
3: eth1:  mtu 1500 qdisc pfifo_fast
state UP group default qlen 1000
  link/ether 52:54:00:e6:4f:46 brd ff:ff:ff:ff:ff:ff
  inet 192.16.1.5/24 brd 192.16.1.255 scope global eth1
 valid_lft forever preferred_lft forever
  inet6 fe80::5054:ff:fee6:4f46/64 scope link
 valid_lft forever preferred_lft forever



hypervisor1
-
# ovs-vsctl show
dbcc7c2e-cf07-4052-b40d-d4f47f5560b0
  Bridge br-int
  fail_mode: secure
  Port ovn-ca-rai-0
  Interface ovn-ca-rai-0
  type: geneve
  options: {csum="true", key=flow, remote_ip="172.20.1.17"}
  Port ovn-ca-rai-1
  Interface ovn-ca-rai-1
  type: geneve
  options: {csum="true", key=flow, remote_ip="172.20.1.5"}
  Port vnet3
  Interface vnet3
  Port vnet1
  Interface vnet1
  Port vnet5
  Interface vnet5
  Port br-int
  Interface br-int
  type: internal
  ovs_version: "2.13.90"

[ca-rain06 ~]# ip a sh  bond0
7: bond0:  mtu 1500 qdisc
noqueue state UP group default qlen 1000
  link/ether 98:03:9b:59:af:1c brd ff:ff:ff:ff:ff:ff
  inet 172.20.1.6/24 brd 172.20.1.255 scope global bond0
 valid_lft forever preferred_lft forever
  inet6 fe80::9a03:9bff:fe59:af1c/64 scope link
 valid_lft forever preferred_lft forever



hypervisor2

# ovs-vsctl show
169dc085-7224-42c3-b119-390b7d0fe450
  Bridge br-int
  fail_mode: secure
  Port vnet6
  Interface vnet6
  Port br-int
  Interface br-int
  type: internal
  Port ovn-ca-rai-1
  Interface ovn-ca-rai-1
  type: geneve
  options: {csum="true", key=flow, remote_ip="172.20.1.6"}
  Port ovn-ca-rai-0
  Interface ovn-ca-rai-0
  type: geneve
  options: {csum="true", key=flow, remote_ip="172.20.1.17"}
  Port vnet4
  Interface vnet4
  Port vnet8
  Interface vnet8
  Port vnet2
  Interface vnet2
  ovs_version: "2.13.90"

[ca-rain05 ~]# ip a sh  bond0
7: bond0:  mtu 1500 qdisc
noqueue state UP group default qlen 1000
  link/ether 98:03:9b:2d:91:a2 brd ff:ff:ff:ff:ff:ff
  inet 172.20.1.5/24 brd 172.20.1.255 scope global bond0
 valid_lft forever preferred_lft forever
  inet6 fe80::9a03:9bff:fe2d:91a2/64 scope link
 valid_lft forever preferred_lft forever


OVN Central (on a different hypervisor)
-
# ovn-sbctl show
Chassis ca-rain17
  hostname: ca-rain17.us.oracle.com
  Encap geneve
  ip: "172.20.1.17"
  options: {csum="true"}
Chassis ca-rain06
  hostname: ca-rain06.us.oracle.com
  Encap geneve
  ip: "172.20.1.6"
  options: {csum="true"}
  Port_Binding "47433b54-ac10-42f1-ae84-cc6fbb580297"
  Port_Binding "06e85cca-867a-44fc-b2c1-be62f2fb06c0"
  Port_Binding "284195d2-9280-4334-900e-571ecd00327a"
Chassis ca-rain05
  hostname: ca-rain05.us.oracle.com
  Encap geneve
  ip: "172.20.1.5"
  options: {csum="true"}
  Port_Binding "ce78fd2b-4c68-428c-baf1-71718e7f3871"
  Port_Binding "269089c4-9464-41ec-9f63-6b3804b34b07"
  Port_Binding "00bff7c0-2e2d-41ba-9485-3b5fa9801365"
  Port_Binding "1cb7d760-90b0-4201-9517-88cb2de31c79"

# ovn-nbctl show
switch 10073c55-8f96-411f-a3b6-89b13389a084 (ls_vcn2)
  port 

Re: [ovs-discuss] ssh not working between VMs on different hypervisors

2020-04-21 Thread Numan Siddique
On Tue, Apr 21, 2020 at 6:37 PM Brendan Doyle  wrote:
>
> Folks,
>
> Anybody seen this, is it a known problem?
>
> VM1 on hypervisor 1
> 
> ping IP of VM1 on hypervisor2
>
> # ping -c1 192.16.1.5
> PING 192.16.1.5 (192.16.1.5) 56(84) bytes of data.
> 64 bytes from 192.16.1.5: icmp_seq=1 ttl=64 time=0.494 ms
>
> But
> # ssh 192.16.1.5
> Connection closed by 192.16.1.5 port 22
>
>
> ssh works between VMs on the same hypervisor, seems going through the
> tunnel is the
> problem.

I don't think it's a tunnel problem.

On VM2 hypervisor, you can run tcpdump on genev_sys_6081 interface and
see if you receive the ssh packets.
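
For example (interface names taken from the configuration below; the
filters are only an illustration):

# on the destination hypervisor, watch the decapsulated traffic:
tcpdump -i genev_sys_6081 -nn 'tcp port 22'
# and on the underlay interface, watch the tunnel itself:
tcpdump -i bond0 -nn 'udp port 6081 or icmp'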

Thanks
Numan

>
>
> configuration
> ===
> VM1 hypervisor1
> ---
> ca-rain06-vmovs-1 ~]# ip a sh  eth1
> 3: eth1:  mtu 1500 qdisc pfifo_fast
> state UP group default qlen 1000
>  link/ether 52:54:00:be:06:16 brd ff:ff:ff:ff:ff:ff
>  inet 192.16.1.6/24 brd 192.16.1.255 scope global eth1
> valid_lft forever preferred_lft forever
>  inet6 fe80::5054:ff:febe:616/64 scope link
> valid_lft forever preferred_lft forever
>
> VM1 hypervisor2
> 
> ca-rain05-vmovs-1 ~]#  ip a sh  eth1
> 3: eth1:  mtu 1500 qdisc pfifo_fast
> state UP group default qlen 1000
>  link/ether 52:54:00:e6:4f:46 brd ff:ff:ff:ff:ff:ff
>  inet 192.16.1.5/24 brd 192.16.1.255 scope global eth1
> valid_lft forever preferred_lft forever
>  inet6 fe80::5054:ff:fee6:4f46/64 scope link
> valid_lft forever preferred_lft forever
>
>
>
> hypervisor1
> -
> # ovs-vsctl show
> dbcc7c2e-cf07-4052-b40d-d4f47f5560b0
>  Bridge br-int
>  fail_mode: secure
>  Port ovn-ca-rai-0
>  Interface ovn-ca-rai-0
>  type: geneve
>  options: {csum="true", key=flow, remote_ip="172.20.1.17"}
>  Port ovn-ca-rai-1
>  Interface ovn-ca-rai-1
>  type: geneve
>  options: {csum="true", key=flow, remote_ip="172.20.1.5"}
>  Port vnet3
>  Interface vnet3
>  Port vnet1
>  Interface vnet1
>  Port vnet5
>  Interface vnet5
>  Port br-int
>  Interface br-int
>  type: internal
>  ovs_version: "2.13.90"
>
> [ca-rain06 ~]# ip a sh  bond0
> 7: bond0:  mtu 1500 qdisc
> noqueue state UP group default qlen 1000
>  link/ether 98:03:9b:59:af:1c brd ff:ff:ff:ff:ff:ff
>  inet 172.20.1.6/24 brd 172.20.1.255 scope global bond0
> valid_lft forever preferred_lft forever
>  inet6 fe80::9a03:9bff:fe59:af1c/64 scope link
> valid_lft forever preferred_lft forever
>
>
>
> hypervisor2
> 
> # ovs-vsctl show
> 169dc085-7224-42c3-b119-390b7d0fe450
>  Bridge br-int
>  fail_mode: secure
>  Port vnet6
>  Interface vnet6
>  Port br-int
>  Interface br-int
>  type: internal
>  Port ovn-ca-rai-1
>  Interface ovn-ca-rai-1
>  type: geneve
>  options: {csum="true", key=flow, remote_ip="172.20.1.6"}
>  Port ovn-ca-rai-0
>  Interface ovn-ca-rai-0
>  type: geneve
>  options: {csum="true", key=flow, remote_ip="172.20.1.17"}
>  Port vnet4
>  Interface vnet4
>  Port vnet8
>  Interface vnet8
>  Port vnet2
>  Interface vnet2
>  ovs_version: "2.13.90"
>
> [ca-rain05 ~]# ip a sh  bond0
> 7: bond0:  mtu 1500 qdisc
> noqueue state UP group default qlen 1000
>  link/ether 98:03:9b:2d:91:a2 brd ff:ff:ff:ff:ff:ff
>  inet 172.20.1.5/24 brd 172.20.1.255 scope global bond0
> valid_lft forever preferred_lft forever
>  inet6 fe80::9a03:9bff:fe2d:91a2/64 scope link
> valid_lft forever preferred_lft forever
>
>
> OVN Central (on a different hypervisor)
> -
> # ovn-sbctl show
> Chassis ca-rain17
>  hostname: ca-rain17.us.oracle.com
>  Encap geneve
>  ip: "172.20.1.17"
>  options: {csum="true"}
> Chassis ca-rain06
>  hostname: ca-rain06.us.oracle.com
>  Encap geneve
>  ip: "172.20.1.6"
>  options: {csum="true"}
>  Port_Binding "47433b54-ac10-42f1-ae84-cc6fbb580297"
>  Port_Binding "06e85cca-867a-44fc-b2c1-be62f2fb06c0"
>  Port_Binding "284195d2-9280-4334-900e-571ecd00327a"
> Chassis ca-rain05
>  hostname: ca-rain05.us.oracle.com
>  Encap geneve
>  ip: "172.20.1.5"
>  options: {csum="true"}
>  Port_Binding "ce78fd2b-4c68-428c-baf1-71718e7f3871"
>  Port_Binding "269089c4-9464-41ec-9f63-6b3804b34b07"
>  Port_Binding "00bff7c0-2e2d-41ba-9485-3b5fa9801365"
>  Port_Binding "1cb7d760-90b0-4201-9517-88cb2de31c79"
>
> # ovn-nbctl show
> switch 10073c55-8f96-411f-a3b6-89b13389a084 (ls_vcn2)
>  port 

[ovs-discuss] ssh not working between VMs on different hypervisors

2020-04-21 Thread Brendan Doyle

Folks,

Anybody seen this, is it a known problem?

VM1 on hypervisor 1

ping IP of VM1 on hypervisor2

# ping -c1 192.16.1.5
PING 192.16.1.5 (192.16.1.5) 56(84) bytes of data.
64 bytes from 192.16.1.5: icmp_seq=1 ttl=64 time=0.494 ms

But
# ssh 192.16.1.5
Connection closed by 192.16.1.5 port 22


ssh works between VMs on the same hypervisor, so it seems that going
through the tunnel is the problem.


configuration
===
VM1 hypervisor1
---
ca-rain06-vmovs-1 ~]# ip a sh  eth1
3: eth1:  mtu 1500 qdisc pfifo_fast 
state UP group default qlen 1000

    link/ether 52:54:00:be:06:16 brd ff:ff:ff:ff:ff:ff
    inet 192.16.1.6/24 brd 192.16.1.255 scope global eth1
   valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:febe:616/64 scope link
   valid_lft forever preferred_lft forever

VM1 hypervisor2

ca-rain05-vmovs-1 ~]#  ip a sh  eth1
3: eth1:  mtu 1500 qdisc pfifo_fast 
state UP group default qlen 1000

    link/ether 52:54:00:e6:4f:46 brd ff:ff:ff:ff:ff:ff
    inet 192.16.1.5/24 brd 192.16.1.255 scope global eth1
   valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fee6:4f46/64 scope link
   valid_lft forever preferred_lft forever



hypervisor1
-
# ovs-vsctl show
dbcc7c2e-cf07-4052-b40d-d4f47f5560b0
    Bridge br-int
    fail_mode: secure
    Port ovn-ca-rai-0
    Interface ovn-ca-rai-0
    type: geneve
    options: {csum="true", key=flow, remote_ip="172.20.1.17"}
    Port ovn-ca-rai-1
    Interface ovn-ca-rai-1
    type: geneve
    options: {csum="true", key=flow, remote_ip="172.20.1.5"}
    Port vnet3
    Interface vnet3
    Port vnet1
    Interface vnet1
    Port vnet5
    Interface vnet5
    Port br-int
    Interface br-int
    type: internal
    ovs_version: "2.13.90"

[ca-rain06 ~]# ip a sh  bond0
7: bond0:  mtu 1500 qdisc 
noqueue state UP group default qlen 1000

    link/ether 98:03:9b:59:af:1c brd ff:ff:ff:ff:ff:ff
    inet 172.20.1.6/24 brd 172.20.1.255 scope global bond0
   valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9bff:fe59:af1c/64 scope link
   valid_lft forever preferred_lft forever



hypervisor2

# ovs-vsctl show
169dc085-7224-42c3-b119-390b7d0fe450
    Bridge br-int
    fail_mode: secure
    Port vnet6
    Interface vnet6
    Port br-int
    Interface br-int
    type: internal
    Port ovn-ca-rai-1
    Interface ovn-ca-rai-1
    type: geneve
    options: {csum="true", key=flow, remote_ip="172.20.1.6"}
    Port ovn-ca-rai-0
    Interface ovn-ca-rai-0
    type: geneve
    options: {csum="true", key=flow, remote_ip="172.20.1.17"}
    Port vnet4
    Interface vnet4
    Port vnet8
    Interface vnet8
    Port vnet2
    Interface vnet2
    ovs_version: "2.13.90"

[ca-rain05 ~]# ip a sh  bond0
7: bond0:  mtu 1500 qdisc 
noqueue state UP group default qlen 1000

    link/ether 98:03:9b:2d:91:a2 brd ff:ff:ff:ff:ff:ff
    inet 172.20.1.5/24 brd 172.20.1.255 scope global bond0
   valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9bff:fe2d:91a2/64 scope link
   valid_lft forever preferred_lft forever


OVN Central (on a different hypervisor)
-
# ovn-sbctl show
Chassis ca-rain17
    hostname: ca-rain17.us.oracle.com
    Encap geneve
    ip: "172.20.1.17"
    options: {csum="true"}
Chassis ca-rain06
    hostname: ca-rain06.us.oracle.com
    Encap geneve
    ip: "172.20.1.6"
    options: {csum="true"}
    Port_Binding "47433b54-ac10-42f1-ae84-cc6fbb580297"
    Port_Binding "06e85cca-867a-44fc-b2c1-be62f2fb06c0"
    Port_Binding "284195d2-9280-4334-900e-571ecd00327a"
Chassis ca-rain05
    hostname: ca-rain05.us.oracle.com
    Encap geneve
    ip: "172.20.1.5"
    options: {csum="true"}
    Port_Binding "ce78fd2b-4c68-428c-baf1-71718e7f3871"
    Port_Binding "269089c4-9464-41ec-9f63-6b3804b34b07"
    Port_Binding "00bff7c0-2e2d-41ba-9485-3b5fa9801365"
    Port_Binding "1cb7d760-90b0-4201-9517-88cb2de31c79"

# ovn-nbctl show
switch 10073c55-8f96-411f-a3b6-89b13389a084 (ls_vcn2)
    port 06e85cca-867a-44fc-b2c1-be62f2fb06c0
    addresses: ["52:54:00:2a:7b:49 192.17.1.6"]
    port ce78fd2b-4c68-428c-baf1-71718e7f3871
    addresses: ["52:54:00:d8:6e:eb 192.17.1.5"]
    port vcn2_subnet1-lr_vcn2
    type: router
    addresses: ["40:44:00:00:00:50"]
    router-port: lr_vcn2-vcn2_subnet1
switch 0e58d3cd-c36a-4651-8b42-4821f653bcb2 (ls_vcn3)
    port vcn3_subnet1-lr_vcn3
    type: router
    addresses: ["40:44:00:00:00:60"]
    router-port: lr_vcn3-vcn3_subnet1
    port 269089c4-9464-41ec-9f63-6b3804b34b07
    addresses: ["52:54:00:30:38:35 

Re: [ovs-discuss] [OVN] OVN Load balancing algorithm

2020-04-21 Thread Daniel Alvarez Sanchez
Thanks Numan for the investigation and the great explanation!

On Tue, Apr 21, 2020 at 9:38 AM Numan Siddique  wrote:

> On Fri, Apr 17, 2020 at 12:56 PM Han Zhou  wrote:
> >
> >
> >
> > On Tue, Apr 7, 2020 at 7:03 AM Maciej Jozefczyk 
> wrote:
> > >
> > > Hello!
> > >
> > > I would like to ask you to clarify how the OVN Load balancing
> algorithm works.
> > >
> > > Based on the action [1]:
> > > 1) If connection is alive the same 'backend' will be chosen,
> > >
> > > 2) If it is a new connection the backend will be chosen based on
> selection_method=dp_hash [2].
> > > Based on changelog the dp_hash uses '5 tuple hash' [3].
> > > The hash is calculated based on values: source and destination IP,
> source port, protocol and arbitrary value - 42. [4]
> > > Based on that information we could name it SOURCE_IP_PORT.
> > >
> > > Unfortunately we recently got a bug report in OVN Octavia provider
> driver project, that the Load Balancing in OVN
> > > works differently [5]. The report shows even when the test uses the
> same source ip and port, but new TCP connection,
> > > traffic is randomly distributed, but based on [2] it shouldn't?
> > >
> > > Is it a bug?  Is something else taken to account while creating a
> hash? Can it be fixed in OVS/OVN?
> > >
> > >
> > >
> > > Thanks,
> > > Maciej
> > >
> > >
> > > [1]
> https://github.com/ovn-org/ovn/blob/branch-20.03/lib/actions.c#L1017
> > > [2]
> https://github.com/ovn-org/ovn/blob/branch-20.03/lib/actions.c#L1059
> > > [3]
> https://github.com/openvswitch/ovs/blob/d58b59c17c70137aebdde37d3c01c26a26b28519/NEWS#L364-L371
> > > [4]
> https://github.com/openvswitch/ovs/blob/74286173f4d7f51f78e9db09b07a6d4d65263252/lib/flow.c#L2217
> > > [5] https://bugs.launchpad.net/neutron/+bug/1871239
> > >
> > > --
> > > Best regards,
> > > Maciej Józefczyk
> >
> > Hi Maciej,
> >
> > Thanks for reporting. It is definitely strange that same 5-tuple flow
> resulted in hitting different backends. I didn't observed such behavior
> before (maybe I should try again myself to confirm). Can you make sure
> during the testing the group bucket didn't change? You can do so by:
> > # ovs-ofctl dump-groups br-int
> > and also check the group stats and see if multiple buckets has counter
> increased during the test
> > # ovs-ofctl dump-group-stats br-int [group]
> >
> > For the 5-tuple hash function you are seeing flow_hash_5tuple(), it is
> using all the 5-tuples. It adds both ports (src and dst) at once:
> >/* Add both ports at once. */
> > hash = hash_add(hash,
> > ((const uint32_t *)flow)[offsetof(struct flow,
> tp_src)
> >  / sizeof(uint32_t)]);
> >
> > The tp_src is the start of the offset, and the size is 32, meaning both
> src and dst, each is 16 bits. (Although I am not sure if dp_hash method is
> using this function or not. Need to check more code)
> >
> > BTW, I am not sure why Neutron give it the name SOURCE_IP_PORT. Shall it
> be called just 5-TUPLE, since protocol, destination IP and PORT are also
> considered in the hash.
> >
>
>
> Hi Maciej and Han,
>
> I did some testing and I can confirm as you're saying. OVN is not
> choosing the same backend with the src ip, src port fixed.
>
> I think there is an issue with OVN on how it is programming the group
> flows.  OVN is setting the selection_method as dp_hash.
> But when ovs-vswitchd receives the  GROUP_MOD openflow message, I
> noticed that the selection_method is not set.
> From the code I see that selection_method will be encoded only if
> ovn-controller uses openflow version 1.5 [1]
>
> Since selection_method is NULL, vswitchd uses the dp_hash method [2].
> dp_hash means it uses the hash calculated by
> the datapath. In the case of kernel datapath, from what I understand
> it uses skb_get_hash().
>
> I modified the vswitchd code to use the selection_method "hash" if
> selection_method is not set. In this case the load balancer
> works as expected. For a fixed src ip, src port, dst ip and dst port,
> the group action is selecting the same bucket always. [3]
>
> I think we need to fix a few issues in OVN
>   - Use openflow 1.5 so that ovn can set selection_method
>  -  Use "hash" method if dp_hash is not choosing the same bucket for
> 5-tuple hash.
>   - May be provide the option for the CMS to choose an algorithm i.e.
> to use dp_hash or hash.
>
I'd rather not expose this to the CMS, as it depends on the datapath
implementation as per [0], but maybe it makes sense to eventually abstract
it to the CMS in a more LB-ish way (common algorithm names used in load
balancing) in case the LB feature is at some point enhanced to support
more algorithms.

I believe that for OVN LB users, using OF 1.5 to force the use of 'hash'
would be the best solution now.

My 2 cents

Re: [ovs-discuss] [OVN] OVN Load balancing algorithm

2020-04-21 Thread Numan Siddique
On Fri, Apr 17, 2020 at 12:56 PM Han Zhou  wrote:
>
>
>
> On Tue, Apr 7, 2020 at 7:03 AM Maciej Jozefczyk  wrote:
> >
> > Hello!
> >
> > I would like to ask you to clarify how the OVN Load balancing algorithm 
> > works.
> >
> > Based on the action [1]:
> > 1) If connection is alive the same 'backend' will be chosen,
> >
> > 2) If it is a new connection the backend will be chosen based on 
> > selection_method=dp_hash [2].
> > Based on changelog the dp_hash uses '5 tuple hash' [3].
> > The hash is calculated based on values: source and destination IP,  source 
> > port, protocol and arbitrary value - 42. [4]
> > Based on that information we could name it SOURCE_IP_PORT.
> >
> > Unfortunately we recently got a bug report in OVN Octavia provider driver 
> > project, that the Load Balancing in OVN
> > works differently [5]. The report shows even when the test uses the same 
> > source ip and port, but new TCP connection,
> > traffic is randomly distributed, but based on [2] it shouldn't?
> >
> > Is it a bug?  Is something else taken to account while creating a hash? Can 
> > it be fixed in OVS/OVN?
> >
> >
> >
> > Thanks,
> > Maciej
> >
> >
> > [1] https://github.com/ovn-org/ovn/blob/branch-20.03/lib/actions.c#L1017
> > [2] https://github.com/ovn-org/ovn/blob/branch-20.03/lib/actions.c#L1059
> > [3] 
> > https://github.com/openvswitch/ovs/blob/d58b59c17c70137aebdde37d3c01c26a26b28519/NEWS#L364-L371
> > [4] 
> > https://github.com/openvswitch/ovs/blob/74286173f4d7f51f78e9db09b07a6d4d65263252/lib/flow.c#L2217
> > [5] https://bugs.launchpad.net/neutron/+bug/1871239
> >
> > --
> > Best regards,
> > Maciej Józefczyk
>
> Hi Maciej,
>
> Thanks for reporting. It is definitely strange that same 5-tuple flow 
> resulted in hitting different backends. I didn't observed such behavior 
> before (maybe I should try again myself to confirm). Can you make sure during 
> the testing the group bucket didn't change? You can do so by:
> # ovs-ofctl dump-groups br-int
> and also check the group stats and see if multiple buckets has counter 
> increased during the test
> # ovs-ofctl dump-group-stats br-int [group]
>
> For the 5-tuple hash function you are seeing flow_hash_5tuple(), it is using 
> all the 5-tuples. It adds both ports (src and dst) at once:
>/* Add both ports at once. */
> hash = hash_add(hash,
> ((const uint32_t *)flow)[offsetof(struct flow, tp_src)
>  / sizeof(uint32_t)]);
>
> The tp_src is the start of the offset, and the size is 32, meaning both src 
> and dst, each is 16 bits. (Although I am not sure if dp_hash method is using 
> this function or not. Need to check more code)
>
> BTW, I am not sure why Neutron give it the name SOURCE_IP_PORT. Shall it be 
> called just 5-TUPLE, since protocol, destination IP and PORT are also 
> considered in the hash.
>


Hi Maciej and Han,

I did some testing and I can confirm what you're seeing: OVN is not
choosing the same backend even with the src ip and src port fixed.

I think there is an issue with how OVN is programming the group
flows. OVN is setting the selection_method as dp_hash, but when
ovs-vswitchd receives the GROUP_MOD openflow message, I noticed that
the selection_method is not set. From the code I see that
selection_method will be encoded only if ovn-controller uses openflow
version 1.5 [1].

Since selection_method is NULL, vswitchd uses the dp_hash method [2].
dp_hash means it uses the hash calculated by
the datapath. In the case of kernel datapath, from what I understand
it uses skb_get_hash().

I modified the vswitchd code to use the selection_method "hash" if
selection_method is not set. In this case the load balancer
works as expected. For a fixed src ip, src port, dst ip and dst port,
the group action is selecting the same bucket always. [3]

I think we need to fix a few issues in OVN:
  - Use openflow 1.5 so that ovn can set selection_method.
  - Use the "hash" method if dp_hash is not choosing the same bucket
for a given 5-tuple.
  - Maybe provide an option for the CMS to choose the algorithm, i.e.
dp_hash or hash.

I'll look into how to support this.

[1] - https://github.com/openvswitch/ovs/blob/master/lib/ofp-group.c#L2120
   https://github.com/openvswitch/ovs/blob/master/lib/ofp-group.c#L2082

[2] - 
https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif.c#L5108
[3] - 
https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-xlate.c#L4553
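
For anyone who wants to experiment by hand, a select group with an explicit
hash selection method can be installed via OpenFlow 1.5 from the command
line; the group id, buckets and hash fields below are made up for
illustration:

ovs-ofctl -O OpenFlow15 add-group br-int \
  'group_id=1,type=select,selection_method=hash,fields(ip_src,ip_dst,tcp_src,tcp_dst),bucket=output:2,bucket=output:3'

# check what vswitchd actually stored:
ovs-ofctl -O OpenFlow15 dump-groups br-int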


Thanks
Numan


> Thanks,
> Han