Re: [ovs-dev] [PATCH v3 ovn] Add missing checkpatch.py script

2019-08-20 Thread Aaron Conole
0-day Robot  writes:

> Bleep bloop.  Greetings Lorenzo Bianconi, I am a robot and I have tried out 
> your patch.
> Thanks for your contribution.
>
> I encountered some error that I wasn't expecting.  See the details below.
>
>
> git-am:
> error: The following untracked working tree files would be overwritten by 
> merge:
>   utilities/checkpatch.py
> Please move or remove them before you can merge.
> Aborting
> Failed to merge in the changes.
> Patch failed at 0001 Add missing checkpatch.py script
> The copy of the patch that failed is found in:
>
> /var/lib/jenkins/jobs/upstream_build_from_pw/workspace/OVN/.git/rebase-apply/patch
> When you have resolved this problem, run "git am --resolved".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
>

Oops!  This is a bug in the robot.  I'll fix it up.

> Please check this out.  If you feel there has been an error, please email 
> acon...@redhat.com
>
> Thanks,
> 0-day Robot
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 9/9] system-traffic: Add zone-based conntrack timeout policy test

2019-08-20 Thread Darrell Ball
On Mon, Aug 12, 2019 at 5:22 PM Yi-Hung Wei  wrote:

> On Sun, Aug 11, 2019 at 12:30 PM Darrell Ball  wrote:
> >
> > I did some further testing and ran into another issue; in this case,
> one, I did not expect.
> >
> > I added an additional sending of packets at the end of the test after
> this check:
> >
> > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0],
> [dnl
> > ])
> >
> > Below is new code
> >
> > dnl Do it again
> > dnl Send ICMP and UDP traffic
> > NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 |
> FORMAT_PING], [0], [dnl
> > 3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > ])
> > AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1
> packet=5054000a505400090800451c0011a4cd0a0101010a010102000100020008
> actions=resubmit(,0)"])
> >
> > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2) | sort],
> [0], [dnl
> >
> icmp,orig=(src=10.1.1.1,dst=10.1.1.2,id=,type=8,code=0),reply=(src=10.1.1.2,dst=10.1.1.1,id=,type=0,code=0),zone=5
> >
> udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=,dport=),reply=(src=10.1.1.2,dst=10.1.1.1,sport=,dport=),zone=5
> > ])
> >
> > dnl Wait until the timeout expire.
> > dnl We intend to wait a bit longer, because conntrack does not recycle
> the entry right after it is expired.
> > sleep 5
> >
> > AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0],
> [dnl
> > ])
> >
> > The test fails bcoz the second time with short timeouts, the conntrack
> entries are not cleanup up quickly
> >
> > @@ -0,0 +1,2 @@
> >
> +icmp,orig=(src=10.1.1.1,dst=10.1.1.2,id=,type=8,code=0),reply=(src=10.1.1.2,dst=10.1.1.1,id=,type=0,code=0),zone=5
> >
> +udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=,dport=),reply=(src=10.1.1.2,dst=10.1.1.1,sport=,dport=),zone=5
>
>
> Thanks for testing!   This test actually catch a kernel bug when ovs
> kernel handles conntrack cache.  It works for me on my ubuntu xenial
> VM with 4.4 kernel.
>
> Since this requires upstream kernel change, it will be backported to
> OVS once the fix gets upstream.
>
> Thanks,
>
> -Yi-Hung
>
> diff --git a/datapath/conntrack.c b/datapath/conntrack.c
> index f85d0a2572f6..ad48b559bcde 100644
> --- a/datapath/conntrack.c
> +++ b/datapath/conntrack.c
> @@ -76,6 +76,7 @@ enum ovs_ct_nat {
>  /* Conntrack action context for execution. */
>  struct ovs_conntrack_info {
> struct nf_conntrack_helper *helper;
> +   struct nf_ct_timeout *nf_ct_timeout;
> struct nf_conntrack_zone zone;
> struct nf_conn *ct;
> u8 commit : 1;
> @@ -745,6 +746,13 @@ static bool skb_nfct_cached(struct net *net,
> if (help && rcu_access_pointer(help->helper) !=
> info->helper)
> return false;
> }
> +   if (info->nf_ct_timeout) {
> +   struct nf_conn_timeout *timeout_ext;
> +
> +   timeout_ext = nf_ct_timeout_find(ct);
> +   if (!timeout_ext || info->nf_ct_timeout !=
> timeout_ext->timeout)
> +   return false;
> +   }
> /* Force conntrack entry direction to the current packet? */
> if (info->force && CTINFO2DIR(ctinfo) != IP_CT_DIR_ORIGINAL) {
> /* Delete the conntrack entry if confirmed, else just
> release
> @@ -1704,6 +1712,8 @@ int ovs_ct_copy_action(struct net *net, const
> struct nlattr *attr,
>   ct_info.timeout))
> pr_info_ratelimited("Failed to associated timeout "
> "policy `%s'\n",
> ct_info.timeout);
> +   else
> +   ct_info.nf_ct_timeout =
> nf_ct_timeout_find(ct_info.ct)->timeout;
> }
>

Forgot to respond to this one earlier.
I did review, unit test and system test these changes and they are fine.

Thanks Darrell



>
> if (helper) {
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 4/9] ct-dpif, dpif-netlink: Add conntrack timeout policy support

2019-08-20 Thread Darrell Ball
On Tue, Aug 20, 2019 at 12:30 PM Yi-Hung Wei  wrote:

> On Tue, Aug 20, 2019 at 12:46 AM Darrell Ball  wrote:
> > After fixing a bug in my proposed incremental and adding tracking of an
> already removed sub timeout policy:
> > Pls double check.
>
> Thanks for the proposed incremental.
>
> I checked all the other logging places in dpif-netlink, we usually do
> not log the successfully cases in the INFO level.  As the discussion
> in the e-mail thread, I think the successful cases does not provide
> much useful information, so I made some minor changes based on the
> proposed incremental.  I will fold in the following diff.
>

Looks good

As mentioned earlier, tracking the timeout profile deletion timing at INFO
level is
not that important in general. So, as long as we don't spam the log, this
part should
be fine.


>
> Thanks,
>
> -Yi-Hung
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 1d4ee60bd199..85827cd65503 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -2822,11 +2822,10 @@ dpif_netlink_ct_dump_done(struct dpif *dpif
> OVS_UNUSED,
>struct ct_dpif_dump_state *dump_)
>  {
>  struct dpif_netlink_ct_dump_state *dump;
> -int err;
>
>  INIT_CONTAINER(dump, dump_, up);
>
> -err = nl_ct_dump_done(dump->nl_ct_dump);
> +int err = nl_ct_dump_done(dump->nl_ct_dump);
>  free(dump);
>  return err;
>  }
> @@ -3318,32 +3317,32 @@ out:
>  return err;
>  }
>
> -/* Returns 0 if all the sub timeout policies are deleted or
> - * not exist in the kernel. */
> +/* Returns 0 if all the sub timeout policies are deleted or not exist in
> the
> + * kernel.  Returns 1 if any sub timeout policy deletion failed. */
>  static int
>  dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED,
> uint32_t tp_id)
>  {
>  struct ds nl_tp_name = DS_EMPTY_INITIALIZER;
> -int err = 0;
> +int ret = 0;
>
>  for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) {
>  dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num,
>  tp_protos[i].l4num, _tp_name);
> -err = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
> +int err = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
>  if (err == ENOENT) {
>  err = 0;
>  }
>  if (err) {
> -VLOG_WARN_RL(_rl, "failed to delete timeout policy %s
> (%s)",
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(6, 6);
> +VLOG_INFO_RL(, "failed to delete timeout policy %s (%s)",
>   ds_cstr(_tp_name), ovs_strerror(err));
> -goto out;
> +ret = 1;
>  }
>  }
>
> -out:
>  ds_destroy(_tp_name);
> -return err;
> +return ret;
>  }
>
>  struct dpif_netlink_ct_timeout_policy_dump_state {
> @@ -3392,10 +3391,9 @@
> dpif_netlink_ct_timeout_policy_dump_start(struct dpif *dpif
> OVS_UNUSED,
>void **statep)
>  {
>  struct dpif_netlink_ct_timeout_policy_dump_state *dump_state;
> -int err;
>
>  *statep = dump_state = xzalloc(sizeof *dump_state);
> -err = nl_ct_timeout_policy_dump_start(_state->nl_dump_state);
> +int err = nl_ct_timeout_policy_dump_start(_state->nl_dump_state);
>  if (err) {
>  free(dump_state);
>  return err;
>
> <- end of diff
> -->
>
> >
> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> > index 1d4ee60..cba4432 100644
> > --- a/lib/dpif-netlink.c
> > +++ b/lib/dpif-netlink.c
> > @@ -2822,11 +2822,10 @@ dpif_netlink_ct_dump_done(struct dpif *dpif
> OVS_UNUSED,
> >struct ct_dpif_dump_state *dump_)
> >  {
> >  struct dpif_netlink_ct_dump_state *dump;
> > -int err;
> >
> >  INIT_CONTAINER(dump, dump_, up);
> >
> > -err = nl_ct_dump_done(dump->nl_ct_dump);
> > +int err = nl_ct_dump_done(dump->nl_ct_dump);
> >  free(dump);
> >  return err;
> >  }
> > @@ -3319,7 +3318,8 @@ out:
> >  }
> >
> >  /* Returns 0 if all the sub timeout policies are deleted or
> > - * not exist in the kernel. */
> > + * not exist in the kernel; returns 1 if any sub timeout policy deletion
> > + * failed. */
> >  static int
> >  dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED,
> > uint32_t tp_id)
> > @@ -3330,18 +3330,19 @@ dpif_netlink_ct_del_timeout_policy(struct dpif
> *dpif OVS_UNUSED,
> >  for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) {
> >  dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num,
> >  tp_protos[i].l4num, _tp_name);
> > -err = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
> > -if (err == ENOENT) {
> > -err = 0;
> > -}
> > -if (err) {
> > -VLOG_WARN_RL(_rl, "failed to delete timeout policy %s
> (%s)",
> > -   

Re: [ovs-dev] [PATCH v4 4/9] ct-dpif, dpif-netlink: Add conntrack timeout policy support

2019-08-20 Thread Yi-Hung Wei
On Tue, Aug 20, 2019 at 12:46 AM Darrell Ball  wrote:
> After fixing a bug in my proposed incremental and adding tracking of an 
> already removed sub timeout policy:
> Pls double check.

Thanks for the proposed incremental.

I checked all the other logging places in dpif-netlink, we usually do
not log the successfully cases in the INFO level.  As the discussion
in the e-mail thread, I think the successful cases does not provide
much useful information, so I made some minor changes based on the
proposed incremental.  I will fold in the following diff.

Thanks,

-Yi-Hung

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 1d4ee60bd199..85827cd65503 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -2822,11 +2822,10 @@ dpif_netlink_ct_dump_done(struct dpif *dpif OVS_UNUSED,
   struct ct_dpif_dump_state *dump_)
 {
 struct dpif_netlink_ct_dump_state *dump;
-int err;

 INIT_CONTAINER(dump, dump_, up);

-err = nl_ct_dump_done(dump->nl_ct_dump);
+int err = nl_ct_dump_done(dump->nl_ct_dump);
 free(dump);
 return err;
 }
@@ -3318,32 +3317,32 @@ out:
 return err;
 }

-/* Returns 0 if all the sub timeout policies are deleted or
- * not exist in the kernel. */
+/* Returns 0 if all the sub timeout policies are deleted or not exist in the
+ * kernel.  Returns 1 if any sub timeout policy deletion failed. */
 static int
 dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED,
uint32_t tp_id)
 {
 struct ds nl_tp_name = DS_EMPTY_INITIALIZER;
-int err = 0;
+int ret = 0;

 for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) {
 dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num,
 tp_protos[i].l4num, _tp_name);
-err = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
+int err = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
 if (err == ENOENT) {
 err = 0;
 }
 if (err) {
-VLOG_WARN_RL(_rl, "failed to delete timeout policy %s (%s)",
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(6, 6);
+VLOG_INFO_RL(, "failed to delete timeout policy %s (%s)",
  ds_cstr(_tp_name), ovs_strerror(err));
-goto out;
+ret = 1;
 }
 }

-out:
 ds_destroy(_tp_name);
-return err;
+return ret;
 }

 struct dpif_netlink_ct_timeout_policy_dump_state {
@@ -3392,10 +3391,9 @@
dpif_netlink_ct_timeout_policy_dump_start(struct dpif *dpif
OVS_UNUSED,
   void **statep)
 {
 struct dpif_netlink_ct_timeout_policy_dump_state *dump_state;
-int err;

 *statep = dump_state = xzalloc(sizeof *dump_state);
-err = nl_ct_timeout_policy_dump_start(_state->nl_dump_state);
+int err = nl_ct_timeout_policy_dump_start(_state->nl_dump_state);
 if (err) {
 free(dump_state);
 return err;

<- end of diff -->

>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 1d4ee60..cba4432 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -2822,11 +2822,10 @@ dpif_netlink_ct_dump_done(struct dpif *dpif 
> OVS_UNUSED,
>struct ct_dpif_dump_state *dump_)
>  {
>  struct dpif_netlink_ct_dump_state *dump;
> -int err;
>
>  INIT_CONTAINER(dump, dump_, up);
>
> -err = nl_ct_dump_done(dump->nl_ct_dump);
> +int err = nl_ct_dump_done(dump->nl_ct_dump);
>  free(dump);
>  return err;
>  }
> @@ -3319,7 +3318,8 @@ out:
>  }
>
>  /* Returns 0 if all the sub timeout policies are deleted or
> - * not exist in the kernel. */
> + * not exist in the kernel; returns 1 if any sub timeout policy deletion
> + * failed. */
>  static int
>  dpif_netlink_ct_del_timeout_policy(struct dpif *dpif OVS_UNUSED,
> uint32_t tp_id)
> @@ -3330,18 +3330,19 @@ dpif_netlink_ct_del_timeout_policy(struct dpif *dpif 
> OVS_UNUSED,
>  for (int i = 0; i < ARRAY_SIZE(tp_protos); ++i) {
>  dpif_netlink_format_tp_name(tp_id, tp_protos[i].l3num,
>  tp_protos[i].l4num, _tp_name);
> -err = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
> -if (err == ENOENT) {
> -err = 0;
> -}
> -if (err) {
> -VLOG_WARN_RL(_rl, "failed to delete timeout policy %s 
> (%s)",
> - ds_cstr(_tp_name), ovs_strerror(err));
> -goto out;
> +int err2 = nl_ct_del_timeout_policy(ds_cstr(_tp_name));
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(6, 6);
> +VLOG_INFO_RL(, err2 == ENOENT
> + ? "timeout policy already removed %s (%s)"
> + : !err2 ? "deleted timeout policy %s (%s)"
> + : "failed to delete timeout policy %s (%s)",
> + 

Re: [ovs-dev] [PATCH v1] ovs-lib: Fix standalone db migration to raft

2019-08-20 Thread Han Zhou
On Mon, Aug 19, 2019 at 6:36 PM Aliasgar Ginwala  wrote:
>
> Current code of create-cluster from standalone db takes backup of existing
> standalone db and then generates a new clustered dbs from backup dbs.
Hence,
> during migration if nb and sb  dbs are still present, create-cluster will
fail
> saying file exists and will not really convert  dbs to clustered dbs. This
> patch fixes the same.
>
> e.g message that pops up while migration from standalone to raft cluster:
>  * Backing up database to
/etc/openvswitch/ovnnb_db.db.backup5.13.0-1278623084
> ovsdb-tool: I/O error: /etc/openvswitch/ovnnb_db.db: create failed (File
exists)
>  * Creating cluster database /etc/openvswitch/ovnnb_db.db from existing
one
>
> Signed-off-by: aginwala 
> ---
>  utilities/ovs-lib.in | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index fa840ec63..76ce79b9c 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -485,6 +485,7 @@ create_cluster () {
>  elif ovsdb_tool db-is-standalone "$DB_FILE"; then
>  # Convert standalone database to clustered.
>  backup_db || return 1
> +rm -f "$DB_FILE"
>  action "Creating cluster database $DB_FILE from existing one" \
> ovsdb_tool create-cluster "$DB_FILE" "$backup"
"$LOCAL_ADDR"
>  fi
> --
> 2.20.1 (Apple Git-117)
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Thanks Aliasgar.

Acked-by: Han Zhou 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 ovn] Add missing checkpatch.py script

2019-08-20 Thread 0-day Robot
Bleep bloop.  Greetings Lorenzo Bianconi, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: The following untracked working tree files would be overwritten by merge:
utilities/checkpatch.py
Please move or remove them before you can merge.
Aborting
Failed to merge in the changes.
Patch failed at 0001 Add missing checkpatch.py script
The copy of the patch that failed is found in:
   
/var/lib/jenkins/jobs/upstream_build_from_pw/workspace/OVN/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 ovn] Add missing checkpatch.py script

2019-08-20 Thread Lorenzo Bianconi
Add utility checkpatch.py script used to check patch correctness before
submitting it. Introduce checkpatch.at unit tests

Signed-off-by: Lorenzo Bianconi 
---
Changes since v2:
- add checkpatch.at unit test

Changes since v1:
- fix subject
---
 tests/automake.mk   |1 +
 tests/checkpatch.at |  335 +
 tests/testsuite.at  |1 +
 utilities/automake.mk   |3 +-
 utilities/checkpatch.py | 1006 +++
 5 files changed, 1345 insertions(+), 1 deletion(-)
 create mode 100755 tests/checkpatch.at
 create mode 100755 utilities/checkpatch.py

diff --git a/tests/automake.mk b/tests/automake.mk
index 5411772a4..b19144fe4 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -19,6 +19,7 @@ COMMON_MACROS_AT = \
 
 TESTSUITE_AT = \
tests/testsuite.at \
+   tests/checkpatch.at \
tests/ovn.at \
tests/ovn-northd.at \
tests/ovn-nbctl.at \
diff --git a/tests/checkpatch.at b/tests/checkpatch.at
new file mode 100755
index 0..fe21acdf2
--- /dev/null
+++ b/tests/checkpatch.at
@@ -0,0 +1,335 @@
+AT_BANNER([checkpatch])
+
+OVS_START_SHELL_HELPERS
+# try_checkpatch PATCH [ERRORS]
+#
+# Runs checkpatch under Python 2 and Python 3, if installed, on the given
+# PATCH, expecting the specified set of ERRORS (and warnings).
+try_checkpatch() {
+AT_SKIP_IF([test $HAVE_PYTHON2 = no && test $HAVE_PYTHON3 = no])
+# Take the patch to test from $1.  Remove an initial four-space indent
+# from it and, if it is just headers with no body, add a null body.
+echo "$1" | sed 's/^//' > test.patch
+if grep '---' expout >/dev/null 2>&1; then :
+else
+printf '\n---\n' >> test.patch
+fi
+
+# Take expected output from $2.
+if test -n "$2"; then
+echo "$2" | sed 's/^//' > expout
+else
+: > expout
+fi
+
+try_checkpatch__ "$HAVE_PYTHON2" "$PYTHON2"
+try_checkpatch__ "$HAVE_PYTHON3" "$PYTHON3"
+}
+try_checkpatch__() {
+if test $1 = no; then
+:
+elif test -s expout; then
+AT_CHECK([$2 $top_srcdir/utilities/checkpatch.py -q test.patch],
+ [1], [stdout])
+AT_CHECK([sed '/^Lines checked:/,$d' stdout], [0], [expout])
+else
+AT_CHECK([$2 $top_srcdir/utilities/checkpatch.py -q test.patch])
+fi
+}
+OVS_END_SHELL_HELPERS
+
+AT_SETUP([checkpatch - sign-offs])
+
+# Sign-off for single author who is also the committer.
+try_checkpatch \
+   "Author: A
+Commit: A
+
+Signed-off-by: A"
+try_checkpatch \
+   "Author: A
+Commit: A" \
+   "ERROR: Author A needs to sign off."
+
+# Single author but somehow the mailing list is the author.
+try_checkpatch \
+   "Author: Foo Bar via dev 
+Commit: A
+
+Signed-off-by: A" \
+   "ERROR: Author should not be mailing list."
+
+# Sign-off for single author and different committer.
+try_checkpatch \
+   "Author: A
+Commit: B
+
+Signed-off-by: A
+Signed-off-by: B"
+try_checkpatch \
+   "Author: A
+Commit: B" \
+   "ERROR: Author A needs to sign off.
+ERROR: Committer B needs to sign off."
+
+# Sign-off for multiple authors with one author also the committer.
+try_checkpatch \
+   "Author: A
+Commit: A
+
+Signed-off-by: A
+Co-authored-by: B
+Signed-off-by: B"
+try_checkpatch \
+   "Author: A
+Commit: A
+
+Co-authored-by: B
+Signed-off-by: B" \
+   "ERROR: Author A needs to sign off."
+try_checkpatch \
+   "Author: A
+Commit: A
+
+Signed-off-by: A
+Co-authored-by: B" \
+   "ERROR: Co-author B needs to sign off."
+try_checkpatch \
+   "Author: A
+Commit: A
+
+Co-authored-by: B" \
+   "ERROR: Author A needs to sign off.
+ERROR: Co-author B needs to sign off."
+
+# Sign-off for multiple authors and separate committer.
+try_checkpatch \
+   "Author: A
+Commit: C
+
+Signed-off-by: A
+Co-authored-by: B
+Signed-off-by: B
+Signed-off-by: C"
+try_checkpatch \
+   "Author: A
+Commit: C
+
+Signed-off-by: A
+Co-authored-by: B
+Signed-off-by: B" \
+   "ERROR: Committer C needs to sign off."
+
+# Extra sign-offs:
+#
+#- If we know the committer, one extra sign-off raises a warning.
+#
+#- If we do not know the committer, two extra sign-offs raise a warning.
+try_checkpatch \
+   "Author: A
+Commit: C
+
+Signed-off-by: A
+Co-authored-by: B
+Signed-off-by: B
+Signed-off-by: C
+Signed-off-by: D" \
+   "WARNING: Unexpected sign-offs from developers who are not authors or 
co-authors or committers: D"
+try_checkpatch \
+   "Author: A
+
+Signed-off-by: A
+Co-authored-by: B
+Signed-off-by: B
+Signed-off-by: C"
+try_checkpatch \
+   "Author: A
+
+Signed-off-by: A
+Co-authored-by: B
+Signed-off-by: B
+Signed-off-by: C
+Signed-off-by: D" \
+   "WARNING: Unexpected sign-offs from developers who are not authors or 
co-authors or committers: C, D"
+
+# Missing committer is OK, missing author is an error.
+try_checkpatch 

[ovs-dev] Productos especiales de acero inoxidable

2019-08-20 Thread Ofequipamientos
( https://www.ofequipamientos.com/ )
.
ARMARIOS METALICOS
ESTANTERIAS METALICOS
LOCKERS Y GUARDARROPA
CARROS Y PRODUCTOS DE ALAMBRE
RIELERAS Y GANCHERAS PARA CAMARAS FRIGORIFICAS
MESADAS Y PRODUCTOS ESPECIALES DE ACERO INOXIDABLE
.
( https://www.ofequipamientos.com/ )
.
CONSULTAS CLICK ACA ( mailto:consul...@ofequipamientos.com )
.

 Llámenos al
0341 156420480

consul...@ofequipamientos.com

Rosario. Argentina.

.
www.ofequipamientos.com ( https://www.ofequipamientos.com/ )
Este es un e-mail legal y contiene informacion de servicios y productos que 
consideramos pueden ser de su interes.
De acuerdo con la nueva Ley argentina Nro. 26.032, la libre distribucion de 
este e-mail esta
autorizada por tratarse de propositos de informacion, sin embargo, si le 
hemos causado alguna molestia por el mismo, le rogamos acepte nuestras 
disculpas.

Si lo desea puede ser quitado de nuestra lista de correos solicitandolo a 
baj...@yopmail.com
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCHv18] netdev-afxdp: add new netdev type for AF_XDP.

2019-08-20 Thread Ilya Maximets
On 20.08.2019 14:19, Eelco Chaudron wrote:
> 
> 
> On 20 Aug 2019, at 12:10, Ilya Maximets wrote:
> 
>> On 14.08.2019 19:16, William Tu wrote:
>>> On Wed, Aug 14, 2019 at 7:58 AM William Tu  wrote:

 On Wed, Aug 14, 2019 at 5:09 AM Eelco Chaudron  wrote:
>
>
>
> On 8 Aug 2019, at 17:38, Ilya Maximets wrote:
>
> 
>
> I see a rather high number of afxdp_cq_skip, which should to my
> knowledge never happen?

 I tried to investigate this previously, but didn't find anything
 suspicious.
 So, for my knowledge, this should never happen too.
 However, I only looked at the code without actually running, because
 I had no
 HW available for testing.

 While investigation and stress-testing virtual ports I found few
 issues with
 missing locking inside the kernel, so there is no trust for kernel
 part of XDP
 implementation from my side. I'm suspecting that there are some
 other bugs in
 kernel/libbpf that only could be reproduced with driver mode.

 This never happens for virtual ports with SKB mode, so I never saw
 this coverage
 counter being non-zero.
>>>
>>> Did some quick debugging, as something else has come up that needs my
>>> attention :)
>>>
>>> But once I’m in a faulty state and sent a single packet, causing
>>> afxdp_complete_tx() to be called, it tells me 2048 descriptors are
>>> ready, which is XSK_RING_PROD__DEFAULT_NUM_DESCS. So I guess that
>>> there might be some ring management bug. Maybe consumer and receiver
>>> are equal meaning 0 buffers, but it returns max? I did not look at
>>> the kernel code, so this is just a wild guess :)
>>>
>>> (gdb) p tx_done
>>> $3 = 2048
>>>
>>> (gdb) p umem->cq
>>> $4 = {cached_prod = 3830466864, cached_cons = 3578066899, mask =
>>> 2047, size = 2048, producer = 0x7f08486b8000, consumer =
>>> 0x7f08486b8040, ring = 0x7f08486b8080}
>>
>> Thanks for debugging!
>>
>> xsk_ring_cons__peek() just returns the difference between cached_prod
>> and cached_cons, but these values are too different:
>>
>> 3830466864 - 3578066899 = 252399965
>>
>> Since this value > requested, it returns requested number (2048).
>>
>> So, the ring is broken. At least broken its 'cached' part. It'll be
>> good
>> to look at *consumer and *producer values to verify the state of the
>> actual ring.
>>
>
> I’ll try to find some more time next week to debug further.
>
> William I noticed your email in xdp-newbies where you mention this
> problem of getting the wrong pointers. Did you ever follow up, or did
> further trouble shooting on the above?

 Yes, I posted here
 https://www.spinics.net/lists/xdp-newbies/msg00956.html
 "Question/Bug about AF_XDP idx_cq from xsk_ring_cons__peek?"

 At that time I was thinking about reproducing the problem using the
 xdpsock sample code from kernel. But turned out that my reproduction
 code is not correct, so not able to show the case we hit here in OVS.

 Then I put more similar code logic from OVS to xdpsock, but the problem
 does not show up. As a result, I worked around it by marking addr as
 "*addr == UINT64_MAX".

 I will debug again this week once I get my testbed back.

>>> Just to refresh my memory. The original issue is that
>>> when calling:
>>> tx_done = xsk_ring_cons__peek(>cq, CONS_NUM_DESCS, _cq);
>>> xsk_ring_cons__release(>cq, tx_done);
>>>
>>> I expect there are 'tx_done' elems on the CQ to re-cycle back to memory 
>>> pool.
>>> However, when I inspect these elems, I found some elems that 'already' been
>>> reported complete last time I call xsk_ring_cons__peek. In other word, some
>>> elems show up at CQ twice. And this cause overflow of the mempool.
>>>
>>> Thus, mark the elems on CQ as UINT64_MAX to indicate that we already
>>> seen this elem.
>>
>> William, Eelco, which HW NIC you're using? Which kernel driver?
> 
> I’m using the below on the latest bpf-next driver:
> 
> 01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ 
> Network Connection (rev 01)
> 01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ 
> Network Connection (rev 01)

Thanks for information.
I found one suspicious place inside the ixgbe driver that could break
the completion queue ring and prepared a patch:
https://patchwork.ozlabs.org/patch/1150244/

It'll be good if you can test it.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 4/4] rhel: Run ovn services with the 'openvswitch' user

2019-08-20 Thread Jaime Caamaño Ruiz

> In regards to hugepages, I would probably stop using the system wide
> /dev/hugepages and instead setup a separate mount using hugeadm
> properly owned by the openvswitch user. Then we don't need a separate
> group setup for that and we don't interfere with the system.

In regards to this, now that I think of it, I don't know the reason
behind setting up the hugetlbfs system user and changing hugepages
system path permission, which is rather disruptive, when the rest of
the dpdk initialization is not done. Maybe it's fedora specific?

If it were to me I would remove that, and let the admin setup the
appropriate mount point as part of the dpdk setup workflow.

BR
Jaime.

-Original Message-
From: Jaime Caamaño Ruiz 
Reply-to: jcaam...@suse.com
To: Numan Siddique , Jaime Caamano 
Cc: ovs dev 
Subject: Re: [PATCH ovn 4/4] rhel: Run ovn services with the
'openvswitch' user
Date: Tue, 20 Aug 2019 14:31:23 +0200

> This is patch 4 of the series. Patch 1 takes care of adding ovn
> specific run dirs (run, log, db dirs).
> Can you please take a look at the series ?

Sorry, I completely missed those. You can disregard much of what I
said.

> Is there a way to solve this issue ? If we run ovn services as
> "ovn:openvswitch" (or ovn:hugetlbs in the case
> of dpdk), can ovn-controller access  /var/run/openvswitch/db.sock ?

You need write permission on db.sock. OVS currently creates it with
0700 that allows only the user owner to write to it, while it's most
common to create socket files with 0660 that would give group write
permissions. (files lib/socket-util-unix.c and
python/ovs/socket_util.py). I don't know the original reason for the
stricter permissions on these, some else could clarify.

Then you would probably want ovn user in openvswitch group.

In regards to hugepages, I would probably stop using the system wide
/dev/hugepages and instead setup a separate mount using hugeadm
properly owned by the openvswitch user. Then we don't need a separate
group setup for that and we don't interfere with the system.

BR
Jaime.


-Original Message-
From: Numan Siddique 
To: Jaime Caamano 
Cc: ovs dev 
Subject: Re: [PATCH ovn 4/4] rhel: Run ovn services with the
'openvswitch' user
Date: Wed, 14 Aug 2019 18:44:28 +0530



On Wed, Aug 14, 2019 at 6:08 PM Jaime Caamaño Ruiz 
wrote:
> Hello
> 
> Some comments, that probably are due to me being confused checking
> the
> new repo:
> 


Hi Jaime,

This is patch 4 of the series. Patch 1 takes care of adding ovn
specific run dirs (run, log, db dirs).
Can you please take a look at the series ?

Patch 0 - https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/36
1628.html
https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=67480


 
> 1)
> 
> > sed -i 's:\(.*su\).*:\1 ovn ovn:' %{_sysconfdir}/logrotate.d/ovn
> 
> I dont see the file %{_sysconfdir}/logrotate.d/ovn included anywhere.
> Anyway, since OVN_USER_ID defaults to openvswitch:openvswitch,
> shouldn't this be


It's added in patch 3 of the series - https://patchwork.ozlabs.org/patc
h/1146492/

 
> sed -i 's:\(.*su\).*:\1 openvswitch openvswitch:'
> %{_sysconfdir}/logrotate.d/ovn
> 
> What about when the group is hugetlbfs?
> 

hugetlbfs is used when ovs  is configured with dpdk. 

ovs-vswitchd/ovsdb-server will be running as user:group -
"openvswitch:hugetlbfs".

If ovn-controller runs as user:group - "openvswitch:openvswitch" it can
still access
the file - /var/run/openvswitch/db.sock and other br-*.mgmt socket
files right ?

That's what I thought. Please correct me If I am wrong.
 
> 2)
> 
> #OVN_USER_ID="openvswitch:openvswitch"
> 
> Again, what about when the group should be hugetlbfs?
> 
> 2)
> 
> > chown -R $INSTALL_USER:$INSTALL_GROUP $ovn_dbdir
> > chown -R $INSTALL_USER:$INSTALL_GROUP $OVN_RUNDIR
> > chown -R $INSTALL_USER:$INSTALL_GROUP $ovn_logdir
> 
> Since OVN and OVS still share log and run directories (right?, anyway
> I
> dont see where ovn_dbdir and ovn_logdir are set), changing
> OVN_USER_ID
> to a different value than OVS_USER_ID without changing a bunch of
> other
> stuff will break things. Doesn't it make sense to use OVS_USER_ID
> instead of introducing OVN_USER_ID until there is a more meaningful
> split?
> 

Please see patch 1 of the series which sets ovn_logdir, ovn_dbdir etc.

The idea is to totally separate out OVN from OVS including user ids.
But since its a bit tricky
as ovn-controller needs to access /var/run/openvswitch/db.sock, I
thought of using openvswitch user
for now. But added the OVN_USER_ID so that it would be easier later to
switch to new 'ovn' user.

Is there a way to solve this issue ? If we run ovn services as
"ovn:openvswitch" (or ovn:hugetlbs in the case
of dpdk), can ovn-controller access  /var/run/openvswitch/db.sock ?
I tested it by running as "ovn:openvswitch" and it was not working for
me. May be I am doing something wrong ?

Thanks for the comments.

Numan

 
> BR
> Jaime.
> 
> -Original Message-
> From: nusid...@redhat.com
> To: 

Re: [ovs-dev] [PATCH ovn 4/4] rhel: Run ovn services with the 'openvswitch' user

2019-08-20 Thread Jaime Caamaño Ruiz
> This is patch 4 of the series. Patch 1 takes care of adding ovn
> specific run dirs (run, log, db dirs).
> Can you please take a look at the series ?

Sorry, I completely missed those. You can disregard much of what I
said.

> Is there a way to solve this issue ? If we run ovn services as
> "ovn:openvswitch" (or ovn:hugetlbs in the case
> of dpdk), can ovn-controller access  /var/run/openvswitch/db.sock ?

You need write permission on db.sock. OVS currently creates it with
0700 that allows only the user owner to write to it, while it's most
common to create socket files with 0660 that would give group write
permissions. (files lib/socket-util-unix.c and
python/ovs/socket_util.py). I don't know the original reason for the
stricter permissions on these, some else could clarify.

Then you would probably want ovn user in openvswitch group.

In regards to hugepages, I would probably stop using the system wide
/dev/hugepages and instead setup a separate mount using hugeadm
properly owned by the openvswitch user. Then we don't need a separate
group setup for that and we don't interfere with the system.

BR
Jaime.


-Original Message-
From: Numan Siddique 
To: Jaime Caamano 
Cc: ovs dev 
Subject: Re: [PATCH ovn 4/4] rhel: Run ovn services with the
'openvswitch' user
Date: Wed, 14 Aug 2019 18:44:28 +0530



On Wed, Aug 14, 2019 at 6:08 PM Jaime Caamaño Ruiz 
wrote:
> Hello
> 
> Some comments, that probably are due to me being confused checking
> the
> new repo:
> 


Hi Jaime,

This is patch 4 of the series. Patch 1 takes care of adding ovn
specific run dirs (run, log, db dirs).
Can you please take a look at the series ?

Patch 0 - https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/36
1628.html
https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=67480


 
> 1)
> 
> > sed -i 's:\(.*su\).*:\1 ovn ovn:' %{_sysconfdir}/logrotate.d/ovn
> 
> I dont see the file %{_sysconfdir}/logrotate.d/ovn included anywhere.
> Anyway, since OVN_USER_ID defaults to openvswitch:openvswitch,
> shouldn't this be


It's added in patch 3 of the series - https://patchwork.ozlabs.org/patc
h/1146492/

 
> sed -i 's:\(.*su\).*:\1 openvswitch openvswitch:'
> %{_sysconfdir}/logrotate.d/ovn
> 
> What about when the group is hugetlbfs?
> 

hugetlbfs is used when ovs  is configured with dpdk. 

ovs-vswitchd/ovsdb-server will be running as user:group -
"openvswitch:hugetlbfs".

If ovn-controller runs as user:group - "openvswitch:openvswitch" it can
still access
the file - /var/run/openvswitch/db.sock and other br-*.mgmt socket
files right ?

That's what I thought. Please correct me If I am wrong.
 
> 2)
> 
> #OVN_USER_ID="openvswitch:openvswitch"
> 
> Again, what about when the group should be hugetlbfs?
> 
> 2)
> 
> > chown -R $INSTALL_USER:$INSTALL_GROUP $ovn_dbdir
> > chown -R $INSTALL_USER:$INSTALL_GROUP $OVN_RUNDIR
> > chown -R $INSTALL_USER:$INSTALL_GROUP $ovn_logdir
> 
> Since OVN and OVS still share log and run directories (right?, anyway
> I
> dont see where ovn_dbdir and ovn_logdir are set), changing
> OVN_USER_ID
> to a different value than OVS_USER_ID without changing a bunch of
> other
> stuff will break things. Doesn't it make sense to use OVS_USER_ID
> instead of introducing OVN_USER_ID until there is a more meaningful
> split?
> 

Please see patch 1 of the series which sets ovn_logdir, ovn_dbdir etc.

The idea is to totally separate out OVN from OVS including user ids.
But since its a bit tricky
as ovn-controller needs to access /var/run/openvswitch/db.sock, I
thought of using openvswitch user
for now. But added the OVN_USER_ID so that it would be easier later to
switch to new 'ovn' user.

Is there a way to solve this issue ? If we run ovn services as
"ovn:openvswitch" (or ovn:hugetlbs in the case
of dpdk), can ovn-controller access  /var/run/openvswitch/db.sock ?
I tested it by running as "ovn:openvswitch" and it was not working for
me. May be I am doing something wrong ?

Thanks for the comments.

Numan

 
> BR
> Jaime.
> 
> -Original Message-
> From: nusid...@redhat.com
> To: d...@openvswitch.org
> Cc: Jaime Caamano 
> Subject: [PATCH ovn 4/4] rhel: Run ovn services with the
> 'openvswitch'
> user
> Date: Tue, 13 Aug 2019 21:58:22 +0530
> 
> From: Numan Siddique 
> 
> This patch could have created a new user 'ovn' for ovn services
> instead
> of using 'openvswitch' user. But this would require some amount of
> work
> and
> proper testing since the new user 'ovn' should be part of
> 'openvswitch'
> group (to access /var/run/openvswitch/db.sock.). If ovs is compiled
> with dpdk,
> then it may get tricky (as ovs-vswitchd is run as user -
> openvswitch:hugetlbfs).
> We can support a new user for 'ovn' services in the future.
> 
> Recently the commit [1] in ovs repo added support to run ovn services
> with the
> 'openvswitch' user, but this commit was not applied to ovn repo as we
> had
> already created a new OVN repo. During the OVS/OVN formal split, we
> missed
> out on applying the patch [1]. 

Re: [ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

2019-08-20 Thread Numan Siddique
On Tue, Aug 20, 2019 at 2:52 PM Ilya Maximets 
wrote:

> On 20.08.2019 12:16, Ilya Maximets wrote:
> > On 20.08.2019 11:48, Numan Siddique wrote:
> >>
> >>
> >> On Wed, Aug 14, 2019 at 9:21 PM Michele Baldessari  > wrote:
> >>
> >> On Wed, Aug 14, 2019 at 02:28:13PM +0300, Ilya Maximets wrote:
> >> > On 14.08.2019 11:39, Michele Baldessari wrote:
> >> > > In some of our destructive testing of ovn-dbs inside containers
> managed
> >> > > by pacemaker we reached a situation where /var/run/openvswitch
> had
> >> > > empty .pid files. The current code does not deal well with them
> >> > > and pidfile_is_running() returns true in such a case and this
> confuses
> >> > > the OCF resource agent.
> >> > >
> >> > > - Before this change:
> >> > > Inside a container run:
> >> > >   killall ovsdb-server;
> >> > >   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' >
> /var/run/openvswitch/ovnsb_db.pid
> >> > >
> >> > > We will observe that the cluster is unable to ever recover
> because
> >> > > it believes the ovn processes to be running when they really
> aren't and
> >> > > eventually just fails:
> >> > >  podman container set: ovn-dbs-bundle [
> 192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest <
> http://192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest>]
> >> > >ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master
> controller-0
> >> > >ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped
> controller-1
> >> > >ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave
> controller-2
> >> > >
> >> > > - After this change the cluster is able to recover from this
> state and
> >> > > correctly start the resource:
> >> > >  podman container set: ovn-dbs-bundle [
> 192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest <
> http://192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest>]
> >> > >ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master
> controller-0
> >> > >ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave
> controller-1
> >> > >ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave
> controller-2
> >> > >
> >> > > Signed-off-by: Michele Baldessari  mich...@acksyn.org>>
> >> > > ---
> >> > >  ovn/utilities/ovn-ctl | 2 +-
> >> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
> >> > > index 7e5cd469c83c..65f03e28ddba 100755
> >> > > --- a/ovn/utilities/ovn-ctl
> >> > > +++ b/ovn/utilities/ovn-ctl
> >> > > @@ -35,7 +35,7 @@
> ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
> >> > >
> >> > >  pidfile_is_running () {
> >> > >  pidfile=$1
> >> > > -test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists
> "$pid"
> >> > > +test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat
> "$pidfile"` && pid_exists "$pid"
> >> >
> >> > Hi. Thanks for the fix!
> >> >
> >> > Maybe it's better to add additional check for an empty argument to
> >> > 'pid_exists' function instead? This will cover more cases like
> invocations
> >> > from the utilities/ovs-lib.in <
> https://protect2.fireeye.com/url?k=64579ecddf75e065.64561582-0f6314ee43bf7086=http://ovs-lib.in
> >.
> >> >
> >> > I think, you may also add following tag to commit-message in this
> case:
> >> > Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as
> non-root.")
> >> >
> >> > This patch also will be needed in ovn-org/ovn repository too.
> >> > (Use 'PATCH ovn' subject prefix while sending patches targeted
> for ovn repo.)
> >> >
> >> > Best regards, Ilya Maximets.
> >>
> >> Thanks for the feedback Ilya, I have amended things (hopefully
> correctly) in
> >> http://patchwork.ozlabs.org/patch/1147111/ (I could not figure out
> how
> >> to update an existing patch in patchwork, I hope this is okay)
> >>
> >>
> >> Hi Michele and Ilya,
> >>
> >> I applied this fix to the OVN repo. It's possible that the fix to
> address this issue in ovs-lib.in <
> https://protect2.fireeye.com/url?k=64579ecddf75e065.64561582-0f6314ee43bf7086=http://ovs-lib.in>
> could
> >> be missing in some deployments if older ovs version is used. I thought
> its no harm in having
> >> the fixes in both ovn-ctl and ovs-lib.in <
> https://protect2.fireeye.com/url?k=64579ecddf75e065.64561582-0f6314ee43bf7086=http://ovs-lib.in
> >.
> >>
> >
> > Hi Numan,
> >
> > There was already v2 for this patch (a bit renamed):
> > OVS:
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361678.html
> > OVN:
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361679.html
>
> Sorry, maybe I misunderstood what you wanted to do.
> Do you suggest to apply v1 to OVN repo and v2 to OVS repo?
> What about applying v2 to OVN repo?
>
>
Yes.  v1 to 

Re: [ovs-dev] [PATCH] test: do not require python2 for CHECK_CONNTRACK macro

2019-08-20 Thread Numan Siddique
I applied this patch to OVN master as this fix is required for ovn repo too.

Thanks
Numan

On Tue, Jul 30, 2019 at 11:51 PM Darrell Ball  wrote:

> On Tue, Jul 30, 2019 at 10:41 AM Mark Michelson 
> wrote:
>
> > Why do these macros require python at all?
> >
>
> alg test tools
>
>
>
> >
> > On 7/29/19 7:48 AM, Lorenzo Bianconi wrote:
> > > Do not strictly require python2 for CHECK_CONNTRACK macro definitions
> in
> > > system-{kmod,userspace}-macros.at
> > >
> > > Signed-off-by: Lorenzo Bianconi 
> > > ---
> > >   tests/system-kmod-macros.at  | 2 +-
> > >   tests/system-userspace-macros.at | 2 +-
> > >   2 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
> > > index 554a61e9b..48e94642b 100644
> > > --- a/tests/system-kmod-macros.at
> > > +++ b/tests/system-kmod-macros.at
> > > @@ -59,7 +59,7 @@ m4_define([CONFIGURE_VETH_OFFLOADS],
> > >   # kernel conntrack tables when the test is finished.
> > >   #
> > >   m4_define([CHECK_CONNTRACK],
> > > -[AT_SKIP_IF([test $HAVE_PYTHON2 = no])
> > > +[AT_SKIP_IF([test $HAVE_PYTHON = no])
> > >m4_foreach([mod], [[nf_conntrack_ipv4], [nf_conntrack_ipv6],
> > [nf_nat_ftp],
> > >   [nf_nat_tftp]],
> > >   [modprobe mod || echo "Module mod not loaded."
> > > diff --git a/tests/system-userspace-macros.at b/tests/
> > system-userspace-macros.at
> > > index 9d5f3bf41..a411e3d89 100644
> > > --- a/tests/system-userspace-macros.at
> > > +++ b/tests/system-userspace-macros.at
> > > @@ -65,7 +65,7 @@ m4_define([CONFIGURE_VETH_OFFLOADS],
> > >   # Perform requirements checks for running conntrack tests.
> > >   #
> > >   m4_define([CHECK_CONNTRACK],
> > > -[AT_SKIP_IF([test $HAVE_PYTHON2 = no])]
> > > +[AT_SKIP_IF([test $HAVE_PYTHON = no])]
> > >   )
> > >
> > >   # CHECK_CONNTRACK_ALG()
> > >
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCHv18] netdev-afxdp: add new netdev type for AF_XDP.

2019-08-20 Thread Eelco Chaudron



On 20 Aug 2019, at 12:10, Ilya Maximets wrote:


On 14.08.2019 19:16, William Tu wrote:
On Wed, Aug 14, 2019 at 7:58 AM William Tu  
wrote:


On Wed, Aug 14, 2019 at 5:09 AM Eelco Chaudron  
wrote:




On 8 Aug 2019, at 17:38, Ilya Maximets wrote:




I see a rather high number of afxdp_cq_skip, which should to my
knowledge never happen?


I tried to investigate this previously, but didn't find anything
suspicious.
So, for my knowledge, this should never happen too.
However, I only looked at the code without actually running, 
because

I had no
HW available for testing.

While investigation and stress-testing virtual ports I found few
issues with
missing locking inside the kernel, so there is no trust for 
kernel

part of XDP
implementation from my side. I'm suspecting that there are some
other bugs in
kernel/libbpf that only could be reproduced with driver mode.

This never happens for virtual ports with SKB mode, so I never 
saw

this coverage
counter being non-zero.


Did some quick debugging, as something else has come up that 
needs my

attention :)

But once I’m in a faulty state and sent a single packet, 
causing
afxdp_complete_tx() to be called, it tells me 2048 descriptors 
are

ready, which is XSK_RING_PROD__DEFAULT_NUM_DESCS. So I guess that
there might be some ring management bug. Maybe consumer and 
receiver
are equal meaning 0 buffers, but it returns max? I did not look 
at

the kernel code, so this is just a wild guess :)

(gdb) p tx_done
$3 = 2048

(gdb) p umem->cq
$4 = {cached_prod = 3830466864, cached_cons = 3578066899, mask =
2047, size = 2048, producer = 0x7f08486b8000, consumer =
0x7f08486b8040, ring = 0x7f08486b8080}


Thanks for debugging!

xsk_ring_cons__peek() just returns the difference between 
cached_prod

and cached_cons, but these values are too different:

3830466864 - 3578066899 = 252399965

Since this value > requested, it returns requested number (2048).

So, the ring is broken. At least broken its 'cached' part. It'll 
be

good
to look at *consumer and *producer values to verify the state of 
the

actual ring.



I’ll try to find some more time next week to debug further.

William I noticed your email in xdp-newbies where you mention this
problem of getting the wrong pointers. Did you ever follow up, or 
did

further trouble shooting on the above?


Yes, I posted here
https://www.spinics.net/lists/xdp-newbies/msg00956.html
"Question/Bug about AF_XDP idx_cq from xsk_ring_cons__peek?"

At that time I was thinking about reproducing the problem using the
xdpsock sample code from kernel. But turned out that my reproduction
code is not correct, so not able to show the case we hit here in 
OVS.


Then I put more similar code logic from OVS to xdpsock, but the 
problem

does not show up. As a result, I worked around it by marking addr as
"*addr == UINT64_MAX".

I will debug again this week once I get my testbed back.


Just to refresh my memory. The original issue is that
when calling:
tx_done = xsk_ring_cons__peek(>cq, CONS_NUM_DESCS, _cq);
xsk_ring_cons__release(>cq, tx_done);

I expect there are 'tx_done' elems on the CQ to re-cycle back to 
memory pool.
However, when I inspect these elems, I found some elems that 
'already' been
reported complete last time I call xsk_ring_cons__peek. In other 
word, some

elems show up at CQ twice. And this cause overflow of the mempool.

Thus, mark the elems on CQ as UINT64_MAX to indicate that we already
seen this elem.


William, Eelco, which HW NIC you're using? Which kernel driver?


I’m using the below on the latest bpf-next driver:

01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)


//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Add core id in the PMD thread name.

2019-08-20 Thread David Marchand
On Tue, Aug 20, 2019 at 12:06 PM Ilya Maximets  wrote:
>
> On 20.08.2019 13:00, David Marchand wrote:
> > On Wed, Aug 14, 2019 at 9:45 AM Ilya Maximets  
> > wrote:
> >>
> >> On 13.08.2019 19:46, Eelco Chaudron wrote:
> >>> This is a really good idea :) One remark should we make it %03d?
> >>
> >> There is a hard limit for the thread name. It's 15 meaningful chars 
> >> excluding the
> >> terminating null byte. 'pmd-c02/id:' is 11 bytes wide keeping 4 bytes for 
> >> the
> >> thread id. 'pmd-c002/id:' is 12 bytes wide with only 3 bytes remaining for 
> >> id.
> >> Thread ids could easily become big ( > 1000) for a long running process, 
> >> that is
> >> why %02d was chosen, to save some space.
> >
> > Do we really need the /id: part?
> > A c%03d/pmd prefix would keep the existing pmd%d pattern and save 3 
> > characters.
>
> This will break lib/ovs-thread.c:thread_is_pmd() function.

Ok, not really a problem to change from my pov.

I don't really like the /id: in the thread name, as usually userspace
thread names contains simple alphanumeric characters.
But I can see no problem with this so:
Reviewed-by: David Marchand 


--
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/1] [PATCH v3 ovn] logical-fields: fix memory leak while initializing ovnfield_by_name

2019-08-20 Thread Damijan Skvarc
>This patch named "fix memory leak ...", but I don't see any leaks fixed
>in the code. It seems that you've sent diff between v1 and v2 instead of
>the patch itself. If so, you need to squash this into your v1 patch and
>send as v3.
>
>Best regards, Ilya Maximets.

Yes, Ilya, you are right. I didn't realize patch file must contain cumulative
changes (and not partial one) in case they are spread through several commits.

In our case the memory leak occurs because ovnfield_by_name hash is initialized
twice without destroying previously allocated memory. 

I know the problem can be solved in different ways (ovsthread_once, 
OVS_CONSTRUCTOR,
calling ovn_destroy_ovnfields() before reinitializing  it again,...). However my
intention in this patch is just to solve memory leak and not changing anything 
in 
the behaviour. I am also aware that memory leak is not important since
it happens only once during lifecycle of ovn-controller. Thus the one, who will 
get 
the most from this patch is the observer of valgrind reports while running test 
suite. 
Namely, while running test suite ovn-controller is restarted so many times and 
this 
memory leak is reported so many times that it obscure much more important 
issues.

If anything else needs to be done regarding initialization of ovnfield_by_name 
hash I 
would reather think about getting rid of it, since it maps into ovn_fields 
array which
is currently contituted from one entry only. I believe searching for field name 
in this
single entry table directly would be even quicker then using hash. But since I 
don't 
know about the future/plans of ovn_fields array I am not able to do this change.


Signed-off-by: Damijan Skvarc 
---
 lib/logical-fields.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index 8fb591c..13e5b83 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -57,6 +57,22 @@ add_ct_bit(const char *name, int index, struct shash *symtab)
 free(expansion);
 }
 
+static void
+ovn_init_ovnfields(void)
+{
+static bool initialized = false;
+
+if (!initialized) {
+shash_init(_by_name);
+for (int i = 0; i < OVN_FIELD_N_IDS; i++) {
+const struct ovn_field *of = _fields[i];
+ovs_assert(of->id == i); /* Fields must be in the enum order. */
+shash_add_once(_by_name, of->name, of);
+}
+initialized = true;
+}
+}
+
 void
 ovn_init_symtab(struct shash *symtab)
 {
@@ -220,12 +236,8 @@ ovn_init_symtab(struct shash *symtab)
 expr_symtab_add_field(symtab, "sctp.src", MFF_SCTP_SRC, "sctp", false);
 expr_symtab_add_field(symtab, "sctp.dst", MFF_SCTP_DST, "sctp", false);
 
-shash_init(_by_name);
-for (int i = 0; i < OVN_FIELD_N_IDS; i++) {
-const struct ovn_field *of = _fields[i];
-ovs_assert(of->id == i); /* Fields must be in the enum order. */
-shash_add_once(_by_name, of->name, of);
-}
+ovn_init_ovnfields();
+
 expr_symtab_add_ovn_field(symtab, "icmp4.frag_mtu", OVN_ICMP4_FRAG_MTU);
 }
 
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCHv18] netdev-afxdp: add new netdev type for AF_XDP.

2019-08-20 Thread Ilya Maximets
On 14.08.2019 19:16, William Tu wrote:
> On Wed, Aug 14, 2019 at 7:58 AM William Tu  wrote:
>>
>> On Wed, Aug 14, 2019 at 5:09 AM Eelco Chaudron  wrote:
>>>
>>>
>>>
>>> On 8 Aug 2019, at 17:38, Ilya Maximets wrote:
>>>
>>> 
>>>
>>> I see a rather high number of afxdp_cq_skip, which should to my
>>> knowledge never happen?
>>
>> I tried to investigate this previously, but didn't find anything
>> suspicious.
>> So, for my knowledge, this should never happen too.
>> However, I only looked at the code without actually running, because
>> I had no
>> HW available for testing.
>>
>> While investigation and stress-testing virtual ports I found few
>> issues with
>> missing locking inside the kernel, so there is no trust for kernel
>> part of XDP
>> implementation from my side. I'm suspecting that there are some
>> other bugs in
>> kernel/libbpf that only could be reproduced with driver mode.
>>
>> This never happens for virtual ports with SKB mode, so I never saw
>> this coverage
>> counter being non-zero.
>
> Did some quick debugging, as something else has come up that needs my
> attention :)
>
> But once I’m in a faulty state and sent a single packet, causing
> afxdp_complete_tx() to be called, it tells me 2048 descriptors are
> ready, which is XSK_RING_PROD__DEFAULT_NUM_DESCS. So I guess that
> there might be some ring management bug. Maybe consumer and receiver
> are equal meaning 0 buffers, but it returns max? I did not look at
> the kernel code, so this is just a wild guess :)
>
> (gdb) p tx_done
> $3 = 2048
>
> (gdb) p umem->cq
> $4 = {cached_prod = 3830466864, cached_cons = 3578066899, mask =
> 2047, size = 2048, producer = 0x7f08486b8000, consumer =
> 0x7f08486b8040, ring = 0x7f08486b8080}

 Thanks for debugging!

 xsk_ring_cons__peek() just returns the difference between cached_prod
 and cached_cons, but these values are too different:

 3830466864 - 3578066899 = 252399965

 Since this value > requested, it returns requested number (2048).

 So, the ring is broken. At least broken its 'cached' part. It'll be
 good
 to look at *consumer and *producer values to verify the state of the
 actual ring.

>>>
>>> I’ll try to find some more time next week to debug further.
>>>
>>> William I noticed your email in xdp-newbies where you mention this
>>> problem of getting the wrong pointers. Did you ever follow up, or did
>>> further trouble shooting on the above?
>>
>> Yes, I posted here
>> https://www.spinics.net/lists/xdp-newbies/msg00956.html
>> "Question/Bug about AF_XDP idx_cq from xsk_ring_cons__peek?"
>>
>> At that time I was thinking about reproducing the problem using the
>> xdpsock sample code from kernel. But turned out that my reproduction
>> code is not correct, so not able to show the case we hit here in OVS.
>>
>> Then I put more similar code logic from OVS to xdpsock, but the problem
>> does not show up. As a result, I worked around it by marking addr as
>> "*addr == UINT64_MAX".
>>
>> I will debug again this week once I get my testbed back.
>>
> Just to refresh my memory. The original issue is that
> when calling:
> tx_done = xsk_ring_cons__peek(>cq, CONS_NUM_DESCS, _cq);
> xsk_ring_cons__release(>cq, tx_done);
> 
> I expect there are 'tx_done' elems on the CQ to re-cycle back to memory pool.
> However, when I inspect these elems, I found some elems that 'already' been
> reported complete last time I call xsk_ring_cons__peek. In other word, some
> elems show up at CQ twice. And this cause overflow of the mempool.
> 
> Thus, mark the elems on CQ as UINT64_MAX to indicate that we already
> seen this elem.

William, Eelco, which HW NIC you're using? Which kernel driver?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v12] Improved Packet Drop Statistics in OVS

2019-08-20 Thread Eelco Chaudron



On 20 Aug 2019, at 11:57, Anju Thomas wrote:


Thanks for the comments Eelco. I will address them in the next patch.

On a similar note, are these the final set of comments for this patch 
?


I reviewed the complete code, but not the self test parts.


Regards
Anju

-Original Message-
From: Eelco Chaudron 
Sent: Monday, August 12, 2019 5:48 PM
To: Anju Thomas 
Cc: d...@openvswitch.org; Keshav Gupta 
Subject: Re: [ovs-dev] [PATCH v12] Improved Packet Drop Statistics in 
OVS


Hi Anju,

See comments inline…

//Eelco


On 25 Jul 2019, at 14:16, Anju Thomas wrote:


Currently OVS maintains explicit packet drop/error counters only on
port level. Packets that are dropped as part of normal OpenFlow
processing are counted in flow stats of “drop” flows or as table
misses in table stats. These can only be interpreted by controllers
that know the semantics of the configured OpenFlow pipeline.
Without that knowledge, it is impossible for an OVS user to obtain
e.g. the total number of packets dropped due to OpenFlow rules.

Furthermore, there are numerous other reasons for which packets can 
be

dropped by OVS slow path that are not related to the OpenFlow
pipeline.
The generated datapath flow entries include a drop action to avoid
further expensive upcalls to the slow path, but subsequent packets
dropped by the datapath are not accounted anywhere.

Finally, the datapath itself drops packets in certain error
situations.
Also, these drops are today not accounted for.This makes it difficult
for OVS users to monitor packet drop in an OVS instance and to alert 
a

management system in case of a unexpected increase of such drops.
Also OVS trouble-shooters face difficulties in analysing packet 
drops.


With this patch we implement following changes to address the issues
mentioned above.

1. Identify and account all the silent packet drop scenarios

2. Display these drops in ovs-appctl coverage/show

v11->v12 Addresses comments from Ben Pfaff

Co-authored-by: Rohith Basavaraja 
Co-authored-by: Keshav Gupta 
Signed-off-by: Anju Thomas 
Signed-off-by: Rohith Basavaraja 
Signed-off-by: Keshav Gupta 
---
 datapath/linux/compat/include/linux/openvswitch.h |   1 +
 lib/dpif-netdev.c |  46 -
 lib/dpif.c|   7 +
 lib/dpif.h|   4 +
 lib/odp-execute.c |  79 
 lib/odp-util.c|   9 +
 ofproto/ofproto-dpif-ipfix.c  |   1 +
 ofproto/ofproto-dpif-sflow.c  |   1 +
 ofproto/ofproto-dpif-xlate.c  |  40 -
 ofproto/ofproto-dpif-xlate.h  |   3 +
 ofproto/ofproto-dpif.c|   8 +
 ofproto/ofproto-dpif.h|   8 +-
 tests/automake.mk |   3 +-
 tests/dpif-netdev.at  |   8 +
 tests/drop-stats.at   | 210
++
 tests/ofproto-dpif.at |   2 +-
 tests/tunnel-push-pop.at  |  29 ++-
 tests/tunnel.at   |  16 +-
 18 files changed, 464 insertions(+), 11 deletions(-)  create mode
100644 tests/drop-stats.at

diff --git a/datapath/linux/compat/include/linux/openvswitch.h
b/datapath/linux/compat/include/linux/openvswitch.h
index 65a003a..415c053 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -989,6 +989,7 @@ enum ovs_action_attr {  #ifndef __KERNEL__
OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
+   OVS_ACTION_ATTR_DROP,
 #endif
__OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
   * from userspace. */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
6b99a3c..3878b8d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -101,6 +101,17 @@ enum { MAX_METERS = 65536 };/* Maximum 
number

of meters. */
 enum { MAX_BANDS = 8 }; /* Maximum number of bands / meter.
*/
 enum { N_METER_LOCKS = 64 };/* Maximum number of meters. */

+COVERAGE_DEFINE(datapath_drop_meter);
+COVERAGE_DEFINE(datapath_drop_upcall_error);
+COVERAGE_DEFINE(datapath_drop_lock_error);
+COVERAGE_DEFINE(datapath_drop_userspace_action_error);
+COVERAGE_DEFINE(datapath_drop_tunnel_push_error);
+COVERAGE_DEFINE(datapath_drop_tunnel_pop_error);
+COVERAGE_DEFINE(datapath_drop_recirc_error);
+COVERAGE_DEFINE(datapath_drop_invalid_port);
+COVERAGE_DEFINE(datapath_drop_invalid_tnl_port);
+COVERAGE_DEFINE(datapath_drop_rx_invalid_packet);
+
 /* Protects against changes to 'dp_netdevs'. */  static struct
ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER;

@@ -5762,7 +5773,7 @@ dp_netdev_run_meter(struct dp_netdev *dp, 
struct

dp_packet_batch 

Re: [ovs-dev] [PATCH] dpif-netdev: Add core id in the PMD thread name.

2019-08-20 Thread Ilya Maximets
On 20.08.2019 13:00, David Marchand wrote:
> On Wed, Aug 14, 2019 at 9:45 AM Ilya Maximets  wrote:
>>
>> On 13.08.2019 19:46, Eelco Chaudron wrote:
>>>
>>>
>>> On 13 Aug 2019, at 18:37, Ilya Maximets wrote:
>>>
 This is highly useful to see on which core PMD is running by
 only looking at the thread name. Thread Id still allows to
 distinguish different threads running on the same core over the time:

|dpif_netdev(pmd-c10/id:53)|DBG|Creating 2. subtable <...>
|dpif_netdev(pmd-c10/id:53)|DBG|flow_add: <...>, actions:2
|dpif_netdev(pmd-c09/id:70)|DBG|Core 9 processing port <..>

 In gdb, top or any other utility it's useful to quickly catch up
 needed thread without parsing logs, memory or matching threads by port
 names they're handling.

 Signed-off-by: Ilya Maximets 
 ---
  lib/dpif-netdev.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

 diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
 index d0a1c58ad..34ba03836 100644
 --- a/lib/dpif-netdev.c
 +++ b/lib/dpif-netdev.c
 @@ -4735,9 +4735,16 @@ reconfigure_pmd_threads(struct dp_netdev *dp)
  FOR_EACH_CORE_ON_DUMP(core, pmd_cores) {
  pmd = dp_netdev_get_pmd(dp, core->core_id);
  if (!pmd) {
 +struct ds name = DS_EMPTY_INITIALIZER;
 +
  pmd = xzalloc(sizeof *pmd);
  dp_netdev_configure_pmd(pmd, dp, core->core_id, 
 core->numa_id);
 -pmd->thread = ovs_thread_create("pmd", pmd_thread_main, pmd);
 +
 +ds_put_format(, "pmd-c%02d/id:", core->core_id);
>>>
>>> This is a really good idea :) One remark should we make it %03d?
>>
>> There is a hard limit for the thread name. It's 15 meaningful chars 
>> excluding the
>> terminating null byte. 'pmd-c02/id:' is 11 bytes wide keeping 4 bytes for the
>> thread id. 'pmd-c002/id:' is 12 bytes wide with only 3 bytes remaining for 
>> id.
>> Thread ids could easily become big ( > 1000) for a long running process, 
>> that is
>> why %02d was chosen, to save some space.
> 
> Do we really need the /id: part?
> A c%03d/pmd prefix would keep the existing pmd%d pattern and save 3 
> characters.

This will break lib/ovs-thread.c:thread_is_pmd() function.

> 
> Otherwise, this is a good idea yes.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Add core id in the PMD thread name.

2019-08-20 Thread David Marchand
On Wed, Aug 14, 2019 at 9:45 AM Ilya Maximets  wrote:
>
> On 13.08.2019 19:46, Eelco Chaudron wrote:
> >
> >
> > On 13 Aug 2019, at 18:37, Ilya Maximets wrote:
> >
> >> This is highly useful to see on which core PMD is running by
> >> only looking at the thread name. Thread Id still allows to
> >> distinguish different threads running on the same core over the time:
> >>
> >>|dpif_netdev(pmd-c10/id:53)|DBG|Creating 2. subtable <...>
> >>|dpif_netdev(pmd-c10/id:53)|DBG|flow_add: <...>, actions:2
> >>|dpif_netdev(pmd-c09/id:70)|DBG|Core 9 processing port <..>
> >>
> >> In gdb, top or any other utility it's useful to quickly catch up
> >> needed thread without parsing logs, memory or matching threads by port
> >> names they're handling.
> >>
> >> Signed-off-by: Ilya Maximets 
> >> ---
> >>  lib/dpif-netdev.c | 9 -
> >>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >> index d0a1c58ad..34ba03836 100644
> >> --- a/lib/dpif-netdev.c
> >> +++ b/lib/dpif-netdev.c
> >> @@ -4735,9 +4735,16 @@ reconfigure_pmd_threads(struct dp_netdev *dp)
> >>  FOR_EACH_CORE_ON_DUMP(core, pmd_cores) {
> >>  pmd = dp_netdev_get_pmd(dp, core->core_id);
> >>  if (!pmd) {
> >> +struct ds name = DS_EMPTY_INITIALIZER;
> >> +
> >>  pmd = xzalloc(sizeof *pmd);
> >>  dp_netdev_configure_pmd(pmd, dp, core->core_id, 
> >> core->numa_id);
> >> -pmd->thread = ovs_thread_create("pmd", pmd_thread_main, pmd);
> >> +
> >> +ds_put_format(, "pmd-c%02d/id:", core->core_id);
> >
> > This is a really good idea :) One remark should we make it %03d?
>
> There is a hard limit for the thread name. It's 15 meaningful chars excluding 
> the
> terminating null byte. 'pmd-c02/id:' is 11 bytes wide keeping 4 bytes for the
> thread id. 'pmd-c002/id:' is 12 bytes wide with only 3 bytes remaining for id.
> Thread ids could easily become big ( > 1000) for a long running process, that 
> is
> why %02d was chosen, to save some space.

Do we really need the /id: part?
A c%03d/pmd prefix would keep the existing pmd%d pattern and save 3 characters.

Otherwise, this is a good idea yes.


-- 
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v12] Improved Packet Drop Statistics in OVS

2019-08-20 Thread Anju Thomas
Thanks for the comments Eelco. I will address them in the next patch.

On a similar note, are these the final set of comments for this patch ?

Regards
Anju

-Original Message-
From: Eelco Chaudron  
Sent: Monday, August 12, 2019 5:48 PM
To: Anju Thomas 
Cc: d...@openvswitch.org; Keshav Gupta 
Subject: Re: [ovs-dev] [PATCH v12] Improved Packet Drop Statistics in OVS

Hi Anju,

See comments inline…

//Eelco


On 25 Jul 2019, at 14:16, Anju Thomas wrote:

> Currently OVS maintains explicit packet drop/error counters only on 
> port level. Packets that are dropped as part of normal OpenFlow 
> processing are counted in flow stats of “drop” flows or as table 
> misses in table stats. These can only be interpreted by controllers 
> that know the semantics of the configured OpenFlow pipeline.
> Without that knowledge, it is impossible for an OVS user to obtain 
> e.g. the total number of packets dropped due to OpenFlow rules.
>
> Furthermore, there are numerous other reasons for which packets can be 
> dropped by OVS slow path that are not related to the OpenFlow 
> pipeline.
> The generated datapath flow entries include a drop action to avoid 
> further expensive upcalls to the slow path, but subsequent packets 
> dropped by the datapath are not accounted anywhere.
>
> Finally, the datapath itself drops packets in certain error 
> situations.
> Also, these drops are today not accounted for.This makes it difficult 
> for OVS users to monitor packet drop in an OVS instance and to alert a 
> management system in case of a unexpected increase of such drops.
> Also OVS trouble-shooters face difficulties in analysing packet drops.
>
> With this patch we implement following changes to address the issues 
> mentioned above.
>
> 1. Identify and account all the silent packet drop scenarios
>
> 2. Display these drops in ovs-appctl coverage/show
>
> v11->v12 Addresses comments from Ben Pfaff
>
> Co-authored-by: Rohith Basavaraja 
> Co-authored-by: Keshav Gupta 
> Signed-off-by: Anju Thomas 
> Signed-off-by: Rohith Basavaraja 
> Signed-off-by: Keshav Gupta 
> ---
>  datapath/linux/compat/include/linux/openvswitch.h |   1 +
>  lib/dpif-netdev.c |  46 -
>  lib/dpif.c|   7 +
>  lib/dpif.h|   4 +
>  lib/odp-execute.c |  79 
>  lib/odp-util.c|   9 +
>  ofproto/ofproto-dpif-ipfix.c  |   1 +
>  ofproto/ofproto-dpif-sflow.c  |   1 +
>  ofproto/ofproto-dpif-xlate.c  |  40 -
>  ofproto/ofproto-dpif-xlate.h  |   3 +
>  ofproto/ofproto-dpif.c|   8 +
>  ofproto/ofproto-dpif.h|   8 +-
>  tests/automake.mk |   3 +-
>  tests/dpif-netdev.at  |   8 +
>  tests/drop-stats.at   | 210 
> ++
>  tests/ofproto-dpif.at |   2 +-
>  tests/tunnel-push-pop.at  |  29 ++-
>  tests/tunnel.at   |  16 +-
>  18 files changed, 464 insertions(+), 11 deletions(-)  create mode 
> 100644 tests/drop-stats.at
>
> diff --git a/datapath/linux/compat/include/linux/openvswitch.h
> b/datapath/linux/compat/include/linux/openvswitch.h
> index 65a003a..415c053 100644
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -989,6 +989,7 @@ enum ovs_action_attr {  #ifndef __KERNEL__
>   OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>   OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
> + OVS_ACTION_ATTR_DROP,
>  #endif
>   __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>  * from userspace. */
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 
> 6b99a3c..3878b8d 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -101,6 +101,17 @@ enum { MAX_METERS = 65536 };/* Maximum number 
> of meters. */
>  enum { MAX_BANDS = 8 }; /* Maximum number of bands / meter. 
> */
>  enum { N_METER_LOCKS = 64 };/* Maximum number of meters. */
>
> +COVERAGE_DEFINE(datapath_drop_meter);
> +COVERAGE_DEFINE(datapath_drop_upcall_error);
> +COVERAGE_DEFINE(datapath_drop_lock_error);
> +COVERAGE_DEFINE(datapath_drop_userspace_action_error);
> +COVERAGE_DEFINE(datapath_drop_tunnel_push_error);
> +COVERAGE_DEFINE(datapath_drop_tunnel_pop_error);
> +COVERAGE_DEFINE(datapath_drop_recirc_error);
> +COVERAGE_DEFINE(datapath_drop_invalid_port);
> +COVERAGE_DEFINE(datapath_drop_invalid_tnl_port);
> +COVERAGE_DEFINE(datapath_drop_rx_invalid_packet);
> +
>  /* Protects against changes to 'dp_netdevs'. */  static struct 
> ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER;
>
> @@ -5762,7 +5773,7 @@ 

Re: [ovs-dev] [PATCH v2 2/2] netdev-dpdk: Refactor vhost custom stats for extensibility.

2019-08-20 Thread David Marchand
On Mon, Aug 19, 2019 at 1:19 PM Ilya Maximets  wrote:
>
> vHost interfaces currently has only one custom statistic, but there
> might be others in the near future. This refactoring makes the code
> work in the same way as it done for dpdk and afxdp stats to keep the
> common style over the different code places and makes it easily
> extensible for the new stats addition.
>
> Signed-off-by: Ilya Maximets 
> ---
>  lib/netdev-dpdk.c | 28 ++--
>  1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 52ecf7576..bc20d6843 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -110,12 +110,6 @@ BUILD_ASSERT_DECL(MAX_NB_MBUF % 
> ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF)
>  BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF))
>% MP_CACHE_SZ == 0);
>
> -/* Size of vHost custom stats. */
> -#define VHOST_CUSTOM_STATS_SIZE  1
> -
> -/* Names of vHost custom stats. */
> -#define VHOST_STAT_TX_RETRIES"tx_retries"
> -
>  #define SOCKET0  0
>
>  /* Default size of Physical NIC RXQ */
> @@ -2827,17 +2821,31 @@ netdev_dpdk_vhost_get_custom_stats(const struct 
> netdev *netdev,
> struct netdev_custom_stats *custom_stats)
>  {
>  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +int i;
>
> -ovs_mutex_lock(>mutex);
> +#define VHOST_CSTATS \
> +VHOST_CSTAT(tx_retries)
>
> -custom_stats->size = VHOST_CUSTOM_STATS_SIZE;
> +#define VHOST_CSTAT(NAME) + 1
> +custom_stats->size = VHOST_CSTATS;
> +#undef VHOST_CSTAT
>  custom_stats->counters = xcalloc(custom_stats->size,
>   sizeof *custom_stats->counters);
> -ovs_strlcpy(custom_stats->counters[0].name, VHOST_STAT_TX_RETRIES,
> +i = 0;
> +#define VHOST_CSTAT(NAME) \
> +ovs_strlcpy(custom_stats->counters[i++].name, #NAME, \
>  NETDEV_CUSTOM_STATS_NAME_SIZE);
> +VHOST_CSTATS;
> +#undef VHOST_CSTAT
> +
> +ovs_mutex_lock(>mutex);
>
>  rte_spinlock_lock(>stats_lock);
> -custom_stats->counters[0].value = dev->tx_retries;
> +i = 0;
> +#define VHOST_CSTAT(NAME) \
> +custom_stats->counters[i++].value = dev->NAME;
> +VHOST_CSTATS;
> +#undef VHOST_CSTAT
>  rte_spinlock_unlock(>stats_lock);
>
>  ovs_mutex_unlock(>mutex);
> --
> 2.17.1
>

Reviewed-by: David Marchand 



--
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/2] netdev-dpdk: Fix not reporting rx_oversize_errors in stats.

2019-08-20 Thread David Marchand
On Mon, Aug 19, 2019 at 1:19 PM Ilya Maximets  wrote:
>
> There is a big code duplication issue with DPDK xstats that led to
> missed "rx_oversize_errors" statistics. It's defined but not used.

Good catch.


> Fix that by actually using this stat along with code refactoring that
> will allow us to not make same mistakes in the future.
> Macro definitions are perfectly suitable to automate code generation
> in such cases and already used in a couple of places in OVS for similar
> purposes.
>
> Signed-off-by: Ilya Maximets 
> ---
>  lib/netdev-dpdk.c | 102 +++---
>  1 file changed, 32 insertions(+), 70 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 48057835f..52ecf7576 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -110,34 +110,6 @@ BUILD_ASSERT_DECL(MAX_NB_MBUF % 
> ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF)
>  BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF / MIN_NB_MBUF))
>% MP_CACHE_SZ == 0);
>
> -/*
> - * DPDK XSTATS Counter names definition
> - */
> -#define XSTAT_RX_64_PACKETS  "rx_size_64_packets"
> -#define XSTAT_RX_65_TO_127_PACKETS   "rx_size_65_to_127_packets"
> -#define XSTAT_RX_128_TO_255_PACKETS  "rx_size_128_to_255_packets"
> -#define XSTAT_RX_256_TO_511_PACKETS  "rx_size_256_to_511_packets"
> -#define XSTAT_RX_512_TO_1023_PACKETS "rx_size_512_to_1023_packets"
> -#define XSTAT_RX_1024_TO_1522_PACKETS"rx_size_1024_to_1522_packets"
> -#define XSTAT_RX_1523_TO_MAX_PACKETS "rx_size_1523_to_max_packets"
> -
> -#define XSTAT_TX_64_PACKETS  "tx_size_64_packets"
> -#define XSTAT_TX_65_TO_127_PACKETS   "tx_size_65_to_127_packets"
> -#define XSTAT_TX_128_TO_255_PACKETS  "tx_size_128_to_255_packets"
> -#define XSTAT_TX_256_TO_511_PACKETS  "tx_size_256_to_511_packets"
> -#define XSTAT_TX_512_TO_1023_PACKETS "tx_size_512_to_1023_packets"
> -#define XSTAT_TX_1024_TO_1522_PACKETS"tx_size_1024_to_1522_packets"
> -#define XSTAT_TX_1523_TO_MAX_PACKETS "tx_size_1523_to_max_packets"
> -
> -#define XSTAT_RX_MULTICAST_PACKETS   "rx_multicast_packets"
> -#define XSTAT_TX_MULTICAST_PACKETS   "tx_multicast_packets"
> -#define XSTAT_RX_BROADCAST_PACKETS   "rx_broadcast_packets"
> -#define XSTAT_TX_BROADCAST_PACKETS   "tx_broadcast_packets"
> -#define XSTAT_RX_UNDERSIZED_ERRORS   "rx_undersized_errors"
> -#define XSTAT_RX_OVERSIZE_ERRORS "rx_oversize_errors"
> -#define XSTAT_RX_FRAGMENTED_ERRORS   "rx_fragmented_errors"
> -#define XSTAT_RX_JABBER_ERRORS   "rx_jabber_errors"
> -
>  /* Size of vHost custom stats. */
>  #define VHOST_CUSTOM_STATS_SIZE  1
>
> @@ -2682,51 +2654,41 @@ netdev_dpdk_convert_xstats(struct netdev_stats *stats,
> const struct rte_eth_xstat_name *names,
> const unsigned int size)
>  {
> +/* DPDK XSTATS Counter names definition. */
> +#define DPDK_XSTATS \
> +DPDK_XSTAT(multicast,   "rx_multicast_packets") \
> +DPDK_XSTAT(tx_multicast_packets,"tx_multicast_packets") \
> +DPDK_XSTAT(rx_broadcast_packets,"rx_broadcast_packets") \
> +DPDK_XSTAT(tx_broadcast_packets,"tx_broadcast_packets") \
> +DPDK_XSTAT(rx_undersized_errors,"rx_undersized_errors") \
> +DPDK_XSTAT(rx_oversize_errors,  "rx_oversize_errors"  ) \
> +DPDK_XSTAT(rx_fragmented_errors,"rx_fragmented_errors") \
> +DPDK_XSTAT(rx_jabber_errors,"rx_jabber_errors") \
> +DPDK_XSTAT(rx_1_to_64_packets,  "rx_size_64_packets"  ) \
> +DPDK_XSTAT(rx_65_to_127_packets,"rx_size_65_to_127_packets"   ) \
> +DPDK_XSTAT(rx_128_to_255_packets,   "rx_size_128_to_255_packets"  ) \
> +DPDK_XSTAT(rx_256_to_511_packets,   "rx_size_256_to_511_packets"  ) \
> +DPDK_XSTAT(rx_512_to_1023_packets,  "rx_size_512_to_1023_packets" ) \
> +DPDK_XSTAT(rx_1024_to_1522_packets, "rx_size_1024_to_1522_packets") \
> +DPDK_XSTAT(rx_1523_to_max_packets,  "rx_size_1523_to_max_packets" ) \
> +DPDK_XSTAT(tx_1_to_64_packets,  "tx_size_64_packets"  ) \
> +DPDK_XSTAT(tx_65_to_127_packets,"tx_size_65_to_127_packets"   ) \
> +DPDK_XSTAT(tx_128_to_255_packets,   "tx_size_128_to_255_packets"  ) \
> +DPDK_XSTAT(tx_256_to_511_packets,   "tx_size_256_to_511_packets"  ) \
> +DPDK_XSTAT(tx_512_to_1023_packets,  "tx_size_512_to_1023_packets" ) \
> +DPDK_XSTAT(tx_1024_to_1522_packets, "tx_size_1024_to_1522_packets") \
> +DPDK_XSTAT(tx_1523_to_max_packets,  "tx_size_1523_to_max_packets" )
> +
>  for (unsigned int i = 0; i < size; i++) {
> -if (strcmp(XSTAT_RX_64_PACKETS, names[i].name) == 0) {
> -stats->rx_1_to_64_packets = xstats[i].value;
> -} else if 

Re: [ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

2019-08-20 Thread Ilya Maximets
On 20.08.2019 12:16, Ilya Maximets wrote:
> On 20.08.2019 11:48, Numan Siddique wrote:
>>
>>
>> On Wed, Aug 14, 2019 at 9:21 PM Michele Baldessari > > wrote:
>>
>> On Wed, Aug 14, 2019 at 02:28:13PM +0300, Ilya Maximets wrote:
>> > On 14.08.2019 11:39, Michele Baldessari wrote:
>> > > In some of our destructive testing of ovn-dbs inside containers 
>> managed
>> > > by pacemaker we reached a situation where /var/run/openvswitch had
>> > > empty .pid files. The current code does not deal well with them
>> > > and pidfile_is_running() returns true in such a case and this 
>> confuses
>> > > the OCF resource agent.
>> > >
>> > > - Before this change:
>> > > Inside a container run:
>> > >   killall ovsdb-server;
>> > >   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
>> /var/run/openvswitch/ovnsb_db.pid
>> > >
>> > > We will observe that the cluster is unable to ever recover because
>> > > it believes the ovn processes to be running when they really aren't 
>> and
>> > > eventually just fails:
>> > >  podman container set: ovn-dbs-bundle 
>> [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest 
>> ]
>> > >    ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master 
>> controller-0
>> > >    ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Stopped 
>> controller-1
>> > >    ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave 
>> controller-2
>> > >
>> > > - After this change the cluster is able to recover from this state 
>> and
>> > > correctly start the resource:
>> > >  podman container set: ovn-dbs-bundle 
>> [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest 
>> ]
>> > >    ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master 
>> controller-0
>> > >    ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Slave 
>> controller-1
>> > >    ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave 
>> controller-2
>> > >
>> > > Signed-off-by: Michele Baldessari > >
>> > > ---
>> > >  ovn/utilities/ovn-ctl | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
>> > > index 7e5cd469c83c..65f03e28ddba 100755
>> > > --- a/ovn/utilities/ovn-ctl
>> > > +++ b/ovn/utilities/ovn-ctl
>> > > @@ -35,7 +35,7 @@ 
>> ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
>> > > 
>> > >  pidfile_is_running () {
>> > >      pidfile=$1
>> > > -    test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists "$pid"
>> > > +    test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` 
>> && pid_exists "$pid"
>> >
>> > Hi. Thanks for the fix!
>> >
>> > Maybe it's better to add additional check for an empty argument to
>> > 'pid_exists' function instead? This will cover more cases like 
>> invocations
>> > from the utilities/ovs-lib.in 
>> .
>> >
>> > I think, you may also add following tag to commit-message in this case:
>> > Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as 
>> non-root.")
>> >
>> > This patch also will be needed in ovn-org/ovn repository too.
>> > (Use 'PATCH ovn' subject prefix while sending patches targeted for ovn 
>> repo.)
>> >
>> > Best regards, Ilya Maximets.
>>
>> Thanks for the feedback Ilya, I have amended things (hopefully 
>> correctly) in
>> http://patchwork.ozlabs.org/patch/1147111/ (I could not figure out how
>> to update an existing patch in patchwork, I hope this is okay)
>>
>>
>> Hi Michele and Ilya,
>>
>> I applied this fix to the OVN repo. It's possible that the fix to address 
>> this issue in ovs-lib.in 
>> 
>>  could
>> be missing in some deployments if older ovs version is used. I thought its 
>> no harm in having
>> the fixes in both ovn-ctl and ovs-lib.in 
>> .
>>
> 
> Hi Numan,
> 
> There was already v2 for this patch (a bit renamed):
> OVS: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361678.html
> OVN: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361679.html

Sorry, maybe I misunderstood what you wanted to do.
Do you suggest to apply v1 to OVN repo and v2 to OVS repo?
What about applying v2 to OVN repo?

> 
> Best regards, Ilya Maximets.
> 
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

2019-08-20 Thread Ilya Maximets
On 20.08.2019 11:48, Numan Siddique wrote:
> 
> 
> On Wed, Aug 14, 2019 at 9:21 PM Michele Baldessari  > wrote:
> 
> On Wed, Aug 14, 2019 at 02:28:13PM +0300, Ilya Maximets wrote:
> > On 14.08.2019 11:39, Michele Baldessari wrote:
> > > In some of our destructive testing of ovn-dbs inside containers 
> managed
> > > by pacemaker we reached a situation where /var/run/openvswitch had
> > > empty .pid files. The current code does not deal well with them
> > > and pidfile_is_running() returns true in such a case and this confuses
> > > the OCF resource agent.
> > >
> > > - Before this change:
> > > Inside a container run:
> > >   killall ovsdb-server;
> > >   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
> /var/run/openvswitch/ovnsb_db.pid
> > >
> > > We will observe that the cluster is unable to ever recover because
> > > it believes the ovn processes to be running when they really aren't 
> and
> > > eventually just fails:
> > >  podman container set: ovn-dbs-bundle 
> [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest 
> ]
> > >    ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master 
> controller-0
> > >    ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Stopped 
> controller-1
> > >    ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave 
> controller-2
> > >
> > > - After this change the cluster is able to recover from this state and
> > > correctly start the resource:
> > >  podman container set: ovn-dbs-bundle 
> [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest 
> ]
> > >    ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master 
> controller-0
> > >    ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Slave 
> controller-1
> > >    ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave 
> controller-2
> > >
> > > Signed-off-by: Michele Baldessari  >
> > > ---
> > >  ovn/utilities/ovn-ctl | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
> > > index 7e5cd469c83c..65f03e28ddba 100755
> > > --- a/ovn/utilities/ovn-ctl
> > > +++ b/ovn/utilities/ovn-ctl
> > > @@ -35,7 +35,7 @@ 
> ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
> > > 
> > >  pidfile_is_running () {
> > >      pidfile=$1
> > > -    test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists "$pid"
> > > +    test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` 
> && pid_exists "$pid"
> >
> > Hi. Thanks for the fix!
> >
> > Maybe it's better to add additional check for an empty argument to
> > 'pid_exists' function instead? This will cover more cases like 
> invocations
> > from the utilities/ovs-lib.in 
> .
> >
> > I think, you may also add following tag to commit-message in this case:
> > Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as 
> non-root.")
> >
> > This patch also will be needed in ovn-org/ovn repository too.
> > (Use 'PATCH ovn' subject prefix while sending patches targeted for ovn 
> repo.)
> >
> > Best regards, Ilya Maximets.
> 
> Thanks for the feedback Ilya, I have amended things (hopefully correctly) 
> in
> http://patchwork.ozlabs.org/patch/1147111/ (I could not figure out how
> to update an existing patch in patchwork, I hope this is okay)
> 
> 
> Hi Michele and Ilya,
> 
> I applied this fix to the OVN repo. It's possible that the fix to address 
> this issue in ovs-lib.in 
> 
>  could
> be missing in some deployments if older ovs version is used. I thought its no 
> harm in having
> the fixes in both ovn-ctl and ovs-lib.in 
> .
> 

Hi Numan,

There was already v2 for this patch (a bit renamed):
OVS: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361678.html
OVN: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361679.html

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

2019-08-20 Thread Numan Siddique
On Wed, Aug 14, 2019 at 9:21 PM Michele Baldessari 
wrote:

> On Wed, Aug 14, 2019 at 02:28:13PM +0300, Ilya Maximets wrote:
> > On 14.08.2019 11:39, Michele Baldessari wrote:
> > > In some of our destructive testing of ovn-dbs inside containers managed
> > > by pacemaker we reached a situation where /var/run/openvswitch had
> > > empty .pid files. The current code does not deal well with them
> > > and pidfile_is_running() returns true in such a case and this confuses
> > > the OCF resource agent.
> > >
> > > - Before this change:
> > > Inside a container run:
> > >   killall ovsdb-server;
> > >   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' >
> /var/run/openvswitch/ovnsb_db.pid
> > >
> > > We will observe that the cluster is unable to ever recover because
> > > it believes the ovn processes to be running when they really aren't and
> > > eventually just fails:
> > >  podman container set: ovn-dbs-bundle [
> 192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
> > >ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master
> controller-0
> > >ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped
> controller-1
> > >ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave
> controller-2
> > >
> > > - After this change the cluster is able to recover from this state and
> > > correctly start the resource:
> > >  podman container set: ovn-dbs-bundle [
> 192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
> > >ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master
> controller-0
> > >ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave
> controller-1
> > >ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave
> controller-2
> > >
> > > Signed-off-by: Michele Baldessari 
> > > ---
> > >  ovn/utilities/ovn-ctl | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
> > > index 7e5cd469c83c..65f03e28ddba 100755
> > > --- a/ovn/utilities/ovn-ctl
> > > +++ b/ovn/utilities/ovn-ctl
> > > @@ -35,7 +35,7 @@
> ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
> > >
> > >  pidfile_is_running () {
> > >  pidfile=$1
> > > -test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists "$pid"
> > > +test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"`
> && pid_exists "$pid"
> >
> > Hi. Thanks for the fix!
> >
> > Maybe it's better to add additional check for an empty argument to
> > 'pid_exists' function instead? This will cover more cases like
> invocations
> > from the utilities/ovs-lib.in.
> >
> > I think, you may also add following tag to commit-message in this case:
> > Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as
> non-root.")
> >
> > This patch also will be needed in ovn-org/ovn repository too.
> > (Use 'PATCH ovn' subject prefix while sending patches targeted for ovn
> repo.)
> >
> > Best regards, Ilya Maximets.
>
> Thanks for the feedback Ilya, I have amended things (hopefully correctly)
> in
> http://patchwork.ozlabs.org/patch/1147111/ (I could not figure out how
> to update an existing patch in patchwork, I hope this is okay)
>
>
Hi Michele and Ilya,

I applied this fix to the OVN repo. It's possible that the fix to address
this issue in ovs-lib.in could
be missing in some deployments if older ovs version is used. I thought its
no harm in having
the fixes in both ovn-ctl and ovs-lib.in.

Thanks
Numan

Kind regards,
> Michele
> --
> Michele Baldessari
> C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2, ovn] Make pid_exists() more robust against empty pid argument

2019-08-20 Thread Daniel Alvarez Sanchez
On Wed, Aug 14, 2019 at 5:49 PM Michele Baldessari  wrote:
>
> In some of our destructive testing of ovn-dbs inside containers managed
> by pacemaker we reached a situation where /var/run/openvswitch had
> empty .pid files. The current code does not deal well with them
> and pidfile_is_running() returns true in such a case and this confuses
> the OCF resource agent.
>
> - Before this change:
> Inside a container run:
>   killall ovsdb-server;
>   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
> /var/run/openvswitch/ovnsb_db.pid
>
> We will observe that the cluster is unable to ever recover because
> it believes the ovn processes to be running when they really aren't and
> eventually just fails:
>  podman container set: ovn-dbs-bundle 
> [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
>ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
>ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped controller-1
>ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2
>
> Let's make sure pid_exists() returns false when the pid is an empty
> string.
>
> - After this change the cluster is able to recover from this state and
> correctly start the resource:
>  podman container set: ovn-dbs-bundle 
> [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
>ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
>ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave controller-1
>ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2
>
> Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")
>
> Signed-off-by: Michele Baldessari 
> ---
> v1 -> v2
> 
> - Implemented Ilya's suggestion and moved the check from
>   pidfile_is_running() to pid_exists() and re-run my tests
> ---
>  utilities/ovs-lib.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index fa840ec637f5..dc485413ef0c 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -127,7 +127,7 @@ fi
>  pid_exists () {
>  # This is better than "kill -0" because it doesn't require permission to
>  # send a signal (so daemon_status in particular works as non-root).
> -test -d /proc/"$1"
> +test -n "$1" && test -d /proc/"$1"
>  }
>
>  pid_comm_check () {
> --
> 2.21.0

Acked-By:  Daniel Alvarez 
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 4/9] ct-dpif, dpif-netlink: Add conntrack timeout policy support

2019-08-20 Thread Darrell Ball
On Mon, Aug 19, 2019 at 7:41 PM Darrell Ball  wrote:

>
>
> On Mon, Aug 19, 2019 at 12:42 PM Darrell Ball  wrote:
>
>>
>>
>> On Mon, Aug 19, 2019 at 10:52 AM Yi-Hung Wei 
>> wrote:
>>
>>> On Fri, Aug 16, 2019 at 5:07 PM Darrell Ball  wrote:
>>> >
>>> > Thanks for the patch
>>> >
>>> > Pls let me know if this incremental works for you.
>>> > Main change is logging fix for timeout policy deletion.
>>> >
>>> > Darrell
>>> >
>>> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> > index 1d4ee60..00d957b 100644
>>> > --- a/lib/dpif-netlink.c
>>> > +++ b/lib/dpif-netlink.c
>>> > @@ -2822,11 +2822,10 @@ dpif_netlink_ct_dump_done(struct dpif *dpif
>>> OVS_UNUSED,
>>> >struct ct_dpif_dump_state *dump_)
>>> >  {
>>> >  struct dpif_netlink_ct_dump_state *dump;
>>> > -int err;
>>> >
>>> >  INIT_CONTAINER(dump, dump_, up);
>>> >
>>> > -err = nl_ct_dump_done(dump->nl_ct_dump);
>>> > +int err = nl_ct_dump_done(dump->nl_ct_dump);
>>> >  free(dump);
>>> >  return err;
>>> >  }
>>> > @@ -3335,7 +3334,8 @@ dpif_netlink_ct_del_timeout_policy(struct dpif
>>> *dpif OVS_UNUSED,
>>> >  err = 0;
>>> >  }
>>> >  if (err) {
>>> > -VLOG_WARN_RL(_rl, "failed to delete timeout policy
>>> %s (%s)",
>>> > +static struct vlog_rate_limit rl =
>>> VLOG_RATE_LIMIT_INIT(1, 1);
>>>
>>> Thanks for the diff.  It looks good in general.
>>>
>>> I agree on the main concern of the proposed diff which is the original
>>> rate limit in dpif-netlink (VLOG_RATE_LIMIT_INIT(, 5)) may log too
>>> much duplicated information.  However, since we may delete more than
>>> one one timeout policy in a minute, so lowering the rate limit to
>>> VLOG_RATE_LIMIT_INIT(1, 1) may miss some useful information.   I would
>>> use somewhere in the between (VLOG_RATE_LIMIT_INIT(5, 5)) in the next
>>> version.
>>>
>>
>> TBH, I am not sure we care lots about this information. I was even
>> debating changing it debug level.
>> We have 4 billion datapath timeout profile IDs, so it is unlikely we will
>> run out.
>> Eventually, they will get cleaned up by the retry thingy.
>>
>> Also, I am not sure what action we will take by seeing these logs anyhow.
>>
>> Spamming the log is more of a concern.
>>
>
> After more testing, I noticed there are a couple other aspects we might
> have overlooked.
>
> 1/ We don't have a log for a successful deletion attempt.
>
> 2/ When we try to delete 1 of the 6 associated Netfilter timeout policies
> and it fails, we don't try to delete
>  the remaining ones and bail out of the deletion loop early.
>
> Keeping the INFO level change (from WARN), addressing '1' and '2' above
> and also folding in
> your idea to keep the overall log generation rate a little higher, I ended
> up with:
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 1d4ee60..3926cfd 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -2822,11 +2822,10 @@ dpif_netlink_ct_dump_done(struct dpif *dpif
> OVS_UNUSED,
>struct ct_dpif_dump_state *dump_)
>  {
>  struct dpif_netlink_ct_dump_state *dump;
> -int err;
>
>  INIT_CONTAINER(dump, dump_, up);
>
> -err = nl_ct_dump_done(dump->nl_ct_dump);
> +int err = nl_ct_dump_done(dump->nl_ct_dump);
>  free(dump);
>  return err;
>  }
> @@ -3334,14 +,12 @@ dpif_netlink_ct_del_timeout_policy(struct dpif
> *dpif OVS_UNUSED,
>  if (err == ENOENT) {
>  err = 0;
>  }
> -if (err) {
> -VLOG_WARN_RL(_rl, "failed to delete timeout policy %s
> (%s)",
> - ds_cstr(_tp_name), ovs_strerror(err));
> -goto out;
> -}
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(6, 6);
> +VLOG_INFO_RL(, err ? "failed to delete timeout policy %s (%s)"
> +  : "deleted timeout policy %s (%s)",
> + ds_cstr(_tp_name), ovs_strerror(err));
>  }
>
> -out:
>  ds_destroy(_tp_name);
>  return err;
>  }
> @@ -3392,10 +3389,9 @@ dpif_netlink_ct_timeout_policy_dump_start(struct
> dpif *dpif OVS_UNUSED,
>void **statep)
>  {
>  struct dpif_netlink_ct_timeout_policy_dump_state *dump_state;
> -int err;
>
>  *statep = dump_state = xzalloc(sizeof *dump_state);
> -err = nl_ct_timeout_policy_dump_start(_state->nl_dump_state);
> +int err = nl_ct_timeout_policy_dump_start(_state->nl_dump_state);
>  if (err) {
>  free(dump_state);
>  return err;
>

> Running the system test (from Patch 9) and adding a deletion request for
> the in-use timeout policy yields
>
> > 2019-08-20T01:28:41.004Z|00203|dpif_netlink|INFO|deleted timeout policy
> ovs_tp_0_tcp4 (Success)
> > 2019-08-20T01:28:41.004Z|00204|dpif_netlink|INFO|failed to delete
> timeout policy ovs_tp_0_udp4 (Device or resource busy)
> > 2019-08-20T01:28:41.004Z|00205|dpif_netlink|INFO|failed