On November 16, 2017 10:49:16 PM PST, William Tu <u9012...@gmail.com> wrote:
>When deploying OVS on a large-scale testbed, we occasionally see
>ovs-vswitchd get killed by the OOM (out-of-memory) killer after
>installing 100k rules, with ovs-vswitchd consuming more than 4 GB of
>memory.  Unfortunately, there is currently no good way to debug and
>root-cause the memory leak.  This patch adds a heuristic that relates
>the number of rules to memory usage (typically 1-2 kB per rule) and
>sets an upper bound on the memory usage of ovs-vswitchd.  If the RSS
>(resident set size) is larger than 1 GB and larger than 16 kB times
>num_rules, we kill ovs-vswitchd with SIGSEGV, hoping to generate a
>core dump file to help debugging.
>
>Signed-off-by: William Tu <u9012...@gmail.com>
>---
> lib/memory.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
>diff --git a/lib/memory.c b/lib/memory.c
>index da97476c6a45..75cce6e5dcc3 100644
>--- a/lib/memory.c
>+++ b/lib/memory.c
>@@ -25,6 +25,7 @@
> #include "timeval.h"
> #include "unixctl.h"
> #include "openvswitch/vlog.h"
>+#include <signal.h>
> 
> VLOG_DEFINE_THIS_MODULE(memory);
> 
>@@ -110,6 +111,27 @@ memory_should_report(void)
> }
> 
> static void
>+check_memory_usage(unsigned int num_rules)
>+{
>+    struct rusage usage;
>+    unsigned long int rss;
>+
>+    getrusage(RUSAGE_SELF, &usage);
>+    rss = (unsigned long int) usage.ru_maxrss; /* Peak RSS, in kB. */
>+
>+    /* Typically a rule takes about 1-2 kilobytes of memory.  If the rss
>+     * (resident set size) is larger than 1GB and x16 of num_rules, we
>+     * might have a memory leak.  Thus, kill it with SIGSEGV to generate a
>+     * coredump.
>+     */
>+    if (rss > 1024 * 1024 && rss > num_rules * 16) {
>+        VLOG_ERR("Unexpectedly high memory usage of %lu kB"
>+                 " with %u rules, raising SIGSEGV", rss, num_rules);
>+        raise(SIGSEGV);
>+    }
>+}
>+
>+static void
> compose_report(const struct simap *usage, struct ds *s)
> {
>     const struct simap_node **nodes = simap_sort(usage);
>@@ -120,6 +142,10 @@ compose_report(const struct simap *usage, struct ds *s)
>         const struct simap_node *node = nodes[i];
> 
>         ds_put_format(s, "%s:%u ", node->name, node->data);
>+
>+        if (!strcmp(node->name, "rules")) {
>+            check_memory_usage(node->data);
>+        }
>     }
>     ds_chomp(s, ' ');
>     free(nodes);
>-- 
>2.7.4
>
>_______________________________________________
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev

I know I suggested this, but I meant it only as a temporary patch while
we're trying to track down the leak, not as something that we'd carry in
the tree.
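
For anyone who wants to experiment with the threshold outside the tree, the
check in the patch boils down to the predicate sketched below.  This is only
an illustration under stated assumptions: rss_looks_leaky() and peak_rss_kb()
are hypothetical names, not OVS functions, and the 1 GB floor and the
16 kB-per-rule factor are simply the values used in the patch.  Note that
getrusage() reports ru_maxrss, which is the peak RSS (in kilobytes on Linux),
not the current RSS.

```c
#include <stdbool.h>
#include <sys/resource.h>

/* Hypothetical helper (not part of OVS): returns the peak resident set size
 * of this process as reported by getrusage(), or -1 on failure.  On Linux
 * the value is in kilobytes. */
static long
peak_rss_kb(void)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage)) {
        return -1;
    }
    return usage.ru_maxrss;
}

/* Hypothetical helper (not part of OVS): mirrors the condition in the patch.
 * Returns true if 'rss_kb' exceeds both a 1 GB floor and 16 kB per rule.
 * The arithmetic is done in 64 bits so that 'num_rules * 16' cannot wrap. */
static bool
rss_looks_leaky(unsigned long long rss_kb, unsigned int num_rules)
{
    const unsigned long long floor_kb = 1024ULL * 1024;   /* 1 GB in kB. */
    const unsigned long long per_rule_kb = 16;

    return rss_kb > floor_kb
           && rss_kb > (unsigned long long) num_rules * per_rule_kb;
}
```

A caller would first check that peak_rss_kb() did not return -1, then pass
the result together with the "rules" counter to rss_looks_leaky(), and raise
SIGSEGV (or call abort()) only when it returns true.  With the numbers from
the commit message, 4 GB of RSS and 100k rules trips the check, while the
same RSS with 1M rules does not.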