from:"Rodriguez Betancourt, Esteban"

[ovs-dev] suspicious skiplist code

2019-01-25 Thread Rodriguez Betancourt, Esteban

Hello,
In this case you are right, the "height" member is not only not used, 
it is in fact not required, and can be safely removed, without causing
security issues.

The code can't read past the end of the 'forward' array because the
skiplist "level" member, that specifies the maximum height of the whole
skiplist.

The "level" field is updated in insertions and deletions, so that in insertion
the root node will point to the newly created item (if there isn't a list there
yet). At the deletions, if the deleted item is the last one at that height then
the root is modified to point to NULL at that height, and the whole skiplist
height is decremented.

For the forward_to case
- If a node is found in a list of level/height N, then it has height N (that's 
why it was inserted in that list)
- forward_to travels throught nodes in the same level, so it is safe, as it 
doesn't go up.
- If a node has height N, then it belongs to all the lists initiated at 
root->forward[n, n-1 ,n-2, ..., 0]
- forward_to may go to lower levels, but that is safe, because of previous 
point.

So, the protection is given by the "level" field in skiplist root node, and it 
is enough
to guarantee that the code won't go off limits at 'forward' array. But yes, the 
height
field is unused, unneeded, and can be removed safely. Sorry about the trouble 
it may
have caused.

Regards,
Esteban

> I'm really suspicious about the code in lib/skiplist.c, because I
> noticed that it writes, but never reads, the 'height' member of struct
> skiplist_node.  I suspect that, therefore, in some circumstances the
> code can read past the end of the 'forward' array in skiplist_node.
>
> I'd appreciate it if someone out there has time to verify that I'm right
> or I'm wrong.
>
> Thanks,
>
> Ben.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation

2017-07-14 Thread Rodriguez Betancourt, Esteban

I think that is ok to remove that last pointer check in the generic comparison 
function. UUID is more than enough.
Regards,
Esteban

> -Original Message-
> From: Lance Richardson [mailto:lrich...@redhat.com]
> Sent: jueves, 13 de julio de 2017 16:48
> To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
> Cc: Ben Pfaff <b...@ovn.org>; d...@openvswitch.org; Albornoz, Javier
> <javier.albor...@hpe.com>; Lutz, Arnoldo
> <arnoldo.lutz.guev...@hpe.com>
> Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes
> implementation
> 
> > From: "Rodriguez Betancourt, Esteban" <esteb...@hpe.com>
> > To: "Lance Richardson" <lrich...@redhat.com>, "Ben Pfaff"
> > <b...@ovn.org>
> > Cc: d...@openvswitch.org, "Javier Albornoz" <javier.albor...@hpe.com>,
> > "Arnoldo Lutz" <arnoldo.lutz.guev...@hpe.com>
> > Sent: Thursday, 13 July, 2017 5:22:42 PM
> > Subject: RE: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes
> > implementation
> >
> > Hello,
> > The UUID comparison was added to guarantee that always rows with the
> > same keys are sorted in the same way (between executions of a
> command,
> > for example). Before that we  used the pointer comparison (which gives
> > us some randomness in sorting) but  we kept it to be really really
> > sure that we are deleting/inserting the correct row.
> >
> > The row sync effectively was intended for finding the correct row to
> > delete when there are duplicated "index keys" and there are
> > deletions/updates/inserts (note that the updates are really handled as
> > a delete-and-reinsert).
> 
> Hi Esteban,
> 
> Thanks, I think all of that makes sense. I do wonder, though, since UUIDs
> should be unique whether it would make just as much to assert that the
> addresses are equal if the UUIDs are equal.
> 
> >
> > We copied ovsdb_idl_index_read from ovsdb_idl_read because we were
> > concerned of what would happen if somebody access the indexes when a
> > transaction is being made. If the index read the new values instead of
> > the old one then the rows could be lost/wrongly sorted, etc. It is the
> > same story with
> > ovsdb_idl_index_write_() and index_set: we wanted the indexes to
> > behave correctly during a transaction with new changes.
> >
> 
> Comparing the v4 version of ovsdb_idl_index_read() against the current
> ovsdb_idl_read(), the only difference between the two that I can find is that
> ovsdb_idl_index_read() takes the table class as a function parameter while
> ovsdb_idl_read() expects row->table to be non-NULL and takes the table
> class from row->table->class.  Since ovsdb_idl_index_init_row() initializes
> row->table appropriately, ovsdb_idl_read() should produce exactly the same
> result as ovsdb_idl_index_read(), whether a transaction on that row is in
> progress or not. So it seems to me we should be able to get rid of
> ovsdb_idl_index_read().
> 
> > ovsdb_idl_index_{find,forward_to} are used inside the autogenerated
> > functions for iterating over the indexes.
> > We thought that it would be nice to be able to say things like:
> >
> > OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, end) /* [start,
> end]
> > */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, end) /* [0,
> end]
> > */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, NULL) /*
> [start,
> > "+infty"] */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL,
> NULL) /*
> > everything */
> >
> > Regards,
> > Esteban
> 
> Thanks, that makes sense.  I think it would be good to add tests for these
> cases.
> 
> Thanks again!
> 
>Lance
> 
> >
> > -Original Message-
> > From: Lance Richardson [mailto:lrich...@redhat.com]
> > Sent: jueves, 13 de julio de 2017 14:29
> > To: Ben Pfaff <b...@ovn.org>
> > Cc: d...@openvswitch.org; Rodriguez Betancourt, Esteban
> > <esteb...@hpe.com>; Albornoz, Javier <javier.albor...@hpe.com>; jorge
> > sauma <jorge.sa...@hpe.com>; Lutz, Arnoldo
> > <arnoldo.lutz.guev...@hpe.com>
> > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes
> > implementation
> >
> >
> >
> > - Original Message -
> > > From: "Ben Pfaff" <b...@ovn.org>
> > > To: "Lance Richardson" <lrich...@redhat.com>
> > > Cc: d...@openvswitch.org, esteb...@hpe.com, "javier albornoz"
> > > <javier.albor...@hpe.com>, "jorge sauma"
> &

Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation

2017-07-13 Thread Rodriguez Betancourt, Esteban

Hello,
The UUID comparison was added to guarantee that always rows with the same keys 
are 
sorted in the same way (between executions of a command, for example). Before 
that we
 used the pointer comparison (which gives us some randomness in sorting) but we 
kept it
to be really really sure that we are deleting/inserting the correct row.

The row sync effectively was intended for finding the correct row to delete 
when there are duplicated "index keys" and there are
deletions/updates/inserts (note that the updates are really handled as a 
delete-and-reinsert).

We copied ovsdb_idl_index_read from ovsdb_idl_read because we were concerned of 
what would happen if somebody
access the indexes when a transaction is being made. If the index read the new 
values instead of the old one then
the rows could be lost/wrongly sorted, etc. It is the same story with 
ovsdb_idl_index_write_() and index_set: we 
wanted the indexes to behave correctly during a transaction with new changes.

ovsdb_idl_index_{find,forward_to} are used inside the autogenerated functions 
for iterating over the indexes.
We thought that it would be nice to be able to say things like:

OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, end) /* [start, end] */
OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, end) /* [0, end] */
OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, NULL) /* [start, "+infty"] */
OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, NULL) /* everything */

Regards,
Esteban

-Original Message-
From: Lance Richardson [mailto:lrich...@redhat.com] 
Sent: jueves, 13 de julio de 2017 14:29
To: Ben Pfaff <b...@ovn.org>
Cc: d...@openvswitch.org; Rodriguez Betancourt, Esteban <esteb...@hpe.com>; 
Albornoz, Javier <javier.albor...@hpe.com>; jorge sauma <jorge.sa...@hpe.com>; 
Lutz, Arnoldo <arnoldo.lutz.guev...@hpe.com>
Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes 
implementation

- Original Message -
> From: "Ben Pfaff" <b...@ovn.org>
> To: "Lance Richardson" <lrich...@redhat.com>
> Cc: d...@openvswitch.org, esteb...@hpe.com, "javier albornoz" 
> <javier.albor...@hpe.com>, "jorge sauma"
> <jorge.sa...@hpe.com>, "arnoldo lutz guevara" 
> <arnoldo.lutz.guev...@hpe.com>
> Sent: Tuesday, 11 July, 2017 5:05:32 PM
> Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes 
> implementation
> 
> On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote:
> > This patch adds support for the creation of multicolumn indexes in 
> > the C IDL to enable for efficient search and retrieval of database 
> > rows by key.
> > 
> > Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>
> > Co-authored-by: Lance Richardson <lrich...@redhat.com>
> > Signed-off-by: Lance Richardson <lrich...@redhat.com>
> > ---
> >   v5: - Coding style fixes (checkpatch.py)
> >   - Fixed memory leak (missing ovsdb_datum_destroy() in
> > ovsdb_idl_index_destroy_row__()).
> >   - Some polishing of comment and log message text.
> 
> Thanks for reviving this series.
> 
> I don't understand ovsdb_idl_index_read().  It is almost the same as 
> ovsdb_idl_read().  It looks like ovsdb_idl_read() could be implemented 
> as a wrapper around it, but I'm also not sure why ovsdb_idl_read() 
> can't be used directly.  Also, I don't understand its comment about 
> "index_set functions", since there are no functions with index_set in their 
> names.

Hi Ben,

I neglected to respond to the comment about "index_set functions" in my 
previous response, these are automatically generated for the IDL by the next 
patch. I had deleted that comment in v6 of this series (which is in the ml 
archive but not patchwork for some reason), it didn't seem useful to me.

> 
> ovsdb_idl_index_write_() is mostly copy-paste of a part of 
> ovsdb_idl_txn_write__(), so to avoid code duplication it would be best 
> to factor that code out of ovsdb_idl_txn_write__() and call it from 
> both places.
> 
> I don't understand the behavior of ovsdb_idl_index_find() and
> ovsdb_idl_index_forward_to() for a null 'value' argument.  Is there 
> some idiomatic usage where the null and nonnull behaviors work out nicely?
> 
> The row_sync behavior is confusing.  I remember it slightly from the 
> previous iteration but I don't remember being convinced it was the 
> best way to do things.
> 

The "row_sync" stuff seems to be a matter of using additional keys (row uuid 
values first, then memory addresses) when inserting or deleting a row in order 
to handle duplicate keys.

This still seems a bit strange, but after thinking about it a bit I believe it 
is a reasonable approach; these extr

[ovs-dev] [PATCH 4/4 v4] ovsdb-idl: Autogenerated functions for compound indexes

2016-12-28 Thread Rodriguez Betancourt, Esteban

Generates and fill the default comparators for columns with
type int, real, string. Also creates the macros that allow
to iterate over the contents of the index, and perform
queries.

Signed-off-by: Arnoldo Lutz Guevara 
Signed-off-by: Esteban Rodriguez Betancourt 
Co-authored-by: Arnoldo Lutz Guevara 
Co-authored-by: Esteban Rodriguez Betancourt 
---
 ovsdb/ovsdb-idlc.in | 259 +++
 tests/ovsdb-idl.at  | 263 
 tests/test-ovsdb.c  | 240 ++-
 3 files changed, 761 insertions(+), 1 deletion(-)

diff --git a/ovsdb/ovsdb-idlc.in b/ovsdb/ovsdb-idlc.in
index 721ab50..a1ff3e6 100755
--- a/ovsdb/ovsdb-idlc.in
+++ b/ovsdb/ovsdb-idlc.in
@@ -8,6 +8,7 @@ import sys
 import ovs.json
 import ovs.db.error
 import ovs.db.schema
+from ovs.db.types import StringType, IntegerType, RealType
 
 argv0 = sys.argv[0]
 
@@ -201,6 +202,26 @@ static inline bool %(s)s_is_deleted(const struct %(s)s 
*row)
 return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_DELETE) > 0;
 }
 
+void %(s)s_index_destroy_row(const struct %(s)s *);
+int %(s)s_index_compare(struct ovsdb_idl_index_cursor *, const struct %(s)s *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_first(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_next(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_find(struct ovsdb_idl_index_cursor *, const 
struct %(s)s *);
+const struct %(s)s *%(s)s_index_forward_to(struct ovsdb_idl_index_cursor *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_get_data(const struct ovsdb_idl_index_cursor 
*);
+#define %(S)s_FOR_EACH_RANGE(ROW, CURSOR, FROM, TO) \\
+for ((ROW) = %(s)s_index_forward_to(CURSOR, FROM); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, TO) <= 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_EQUAL(ROW, CURSOR, KEY) \\
+for ((ROW) = %(s)s_index_find(CURSOR, KEY); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, KEY) == 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_BYINDEX(ROW, CURSOR) \\
+for ((ROW) = %(s)s_index_first(CURSOR); \\
+ (ROW); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+
 void %(s)s_init(struct %(s)s *);
 void %(s)s_delete(const struct %(s)s *);
 struct %(s)s *%(s)s_insert(struct ovsdb_idl_txn *);
@@ -257,6 +278,19 @@ bool %(s)s_is_updated(const struct %(s)s *, enum 
%(s)s_column_id);
 print
 
 # Table indexes.
+print "struct %(s)s * %(s)s_index_init_row(struct ovsdb_idl*, const 
struct ovsdb_idl_table_class *);" % {'s': structName}
+print
+for columnName, column in sorted(table.columns.iteritems()):
+print 'void %(s)s_index_set_%(c)s(const struct %(s)s *,' % {'s': 
structName, 'c': columnName},
+if column.type.is_smap():
+args = ['const struct smap *']
+else:
+comment, members = cMembers(prefix, tableName, columnName,
+column, True)
+args = ['%(type)s%(name)s' % member for member in members]
+print '%s);' % ', '.join(args)
+
+print
 printEnum("%stable_id" % prefix.lower(), ["%sTABLE_%s" % (prefix.upper(), 
tableName.upper()) for tableName in sorted(schema.tables)] + ["%sN_TABLES" % 
prefix.upper()])
 print
 for tableName in schema.tables:
@@ -979,6 +1013,231 @@ void
 print "free(%s);" % var
 print "}"
 
+# Index table related functions
+print '''
+/* Destroy 'row' of kind "%(t)s". The row must have been
+ * created with ovsdb_idl_index_init_row.
+ */
+void
+%(s)s_index_destroy_row(const struct %(s)s *row)
+{
+ovsdb_idl_index_destroy_row__(>header_);
+}
+''' % { 's' : structName, 't': tableName }
+print """
+/* Creates a new row of kind "%(t)s". */
+struct %(s)s *
+%(s)s_index_init_row(struct ovsdb_idl* idl, const struct ovsdb_idl_table_class 
*class)
+{""" % {'s': structName, 't': tableName}
+#for columnName, column in sorted(table.columns.iteritems()):
+#if column.type.is_smap():
+#print "smap_init(>%s);" % columnName
+print "return (struct %(s)s *) ovsdb_idl_index_init_row(idl, 
class);" % {'s': structName, 't': tableName}
+print "}"
+
+print '''
+/*  This function is used to compare "%(s)s" records on table in iteration 
loops for compound-index operations.
+After been called, cursor point to current position in the index
+Parameters: struct ovsdb_idl_index_cursor *cursor. Cursor used to iterate 
over the indexed data on this table.
+const struct "%(s)s" *const_data1,  const struct "%(s)s" 
*const_data2. Data to be compared.
+Return value: 0 if both

[ovs-dev] [PATCH 3/4 v4] ovsdb-idl: IDL Compound Indexes Implementation

2016-12-28 Thread Rodriguez Betancourt, Esteban

In the C IDL, allows to create multicolumn indexes in the
tables, that are keep synched with the data in the replica.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/ovsdb-idl-provider.h |  29 +++
 lib/ovsdb-idl.c  | 495 ++-
 lib/ovsdb-idl.h  |  59 ++
 3 files changed, 582 insertions(+), 1 deletion(-)

diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 8cfbb95..60025c4 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012, 2016 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -111,6 +112,7 @@ struct ovsdb_idl_table {
 struct hmap rows;/* Contains "struct ovsdb_idl_row"s. */
 struct ovsdb_idl *idl;   /* Containing idl. */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
+struct shash indexes;/* Contains "struct ovsdb_idl_index"s */
 struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
 struct ovsdb_idl_condition condition;
 bool cond_changed;
@@ -122,6 +124,33 @@ struct ovsdb_idl_class {
 size_t n_tables;
 };
 
+/*
+ * Contains the configuration per column of the index
+ */
+struct ovsdb_idl_index_column {
+const struct ovsdb_idl_column *column;
+column_comparator *comparer;
+int sorting_order;
+};
+
+/*
+ * Defines a IDL compound index
+ */
+struct ovsdb_idl_index {
+struct skiplist *skiplist;  /* Skiplist with pointer to
+ * rows*/
+struct ovsdb_idl_index_column *columns; /* Columns configuration */
+size_t n_columns;   /* Number of columns in index 
*/
+size_t alloc_columns;   /* Size allocated memory for
+ * columns, comparers and
+ * sorting order */
+bool row_sync;  /* Determines if the replica
+ * is modifying its content or
+ * not */
+const struct ovsdb_idl_table *table;/* Table that owns this index 
*/
+const char* index_name; /* The name of this index */
+};
+
 struct ovsdb_idl_row *ovsdb_idl_get_row_arc(
 struct ovsdb_idl_row *src,
 const struct ovsdb_idl_table_class *dst_table,
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 1be1c68..d74ab16 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -37,8 +38,10 @@
 #include "ovsdb-parser.h"
 #include "poll-loop.h"
 #include "openvswitch/shash.h"
+#include "skiplist.h"
 #include "sset.h"
 #include "util.h"
+#include "uuid.h"
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(ovsdb_idl);
@@ -221,6 +224,16 @@ ovsdb_idl_table_from_class(const struct ovsdb_idl *,
 static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);
 static void ovsdb_idl_send_cond_change(struct ovsdb_idl *idl);
 
+static int
+ovsdb_idl_index_generic_comparer(const void *, const void *, const void *);
+static struct ovsdb_idl_index *ovsdb_idl_create_index_(const struct
+   ovsdb_idl_table *table,
+   size_t allocated_cols);
+static void
+ ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *table);
+static void ovsdb_idl_add_to_indexes(const struct ovsdb_idl_row *);
+static void ovsdb_idl_remove_from_indexes(const struct ovsdb_idl_row *);
+
 /* Creates and returns a connection to database 'remote', which should be in a
  * form acceptable to jsonrpc_session_open().  The connection will maintain an
  * in-memory replica of the remote database whose schema is described by
@@ -267,6 +280,7 @@ ovsdb_idl_create(const char *remote, const struct 
ovsdb_idl_class *class,
 memset(table->modes, default_mode, tc->n_columns);
 table->need_table = false;
 shash_init(>columns);
+shash_init(>indexes);
 for (j = 0; j < tc->n_columns; j++) {
 const struct ovsdb_idl_column *column = >columns[j];
 
@@ -320,6 +334,8 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl)
 
 for (i = 0; i < idl->class->n_tables; i++) {
 struct ovsdb_idl_table *table = >tables[i];
+
+ovsdb_idl_destroy_indexes(table);
 shash_destroy(>columns);
 hmap_destroy(>rows);
 free(table->modes);

[ovs-dev] [PATCH 2/4 v4] lib:Data structures: Skiplist implementation

2016-12-28 Thread Rodriguez Betancourt, Esteban

Skiplist implementation intended for the IDL compound indexes
feature.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/automake.mk   |   2 +
 lib/skiplist.c| 313 ++
 lib/skiplist.h|  54 +
 tests/.gitignore  |   1 +
 tests/automake.mk |   2 +
 tests/library.at  |  11 ++
 tests/test-skiplist.c | 210 +
 7 files changed, 593 insertions(+)
 create mode 100644 lib/skiplist.c
 create mode 100644 lib/skiplist.h
 create mode 100644 tests/test-skiplist.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 88344a3..e76c7bc 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -229,6 +229,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/shash.c \
lib/simap.c \
lib/simap.h \
+   lib/skiplist.c \
+   lib/skiplist.h \
lib/smap.c \
lib/smap.h \
lib/socket-util.c \
diff --git a/lib/skiplist.c b/lib/skiplist.c
new file mode 100644
index 000..fcf24f6
--- /dev/null
+++ b/lib/skiplist.c
@@ -0,0 +1,313 @@
+/* Copyright (C) 2016 Hewlett Packard Enterprise Development LP
+ * All Rights Reserved.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may
+ * not use this file except in compliance with the License. You may obtain
+ * a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*
+ * Skiplist implementationn based on:
+ * "Skip List: A Probabilistic Alternative to Balanced Trees",
+ * by William Pugh.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "skiplist.h"
+#include "random.h"
+#include "util.h"
+
+/*
+ * The primary usage of the skiplists are the compound indexes
+ * at the IDL.
+ * For that use case 32 height levels is more than enough as
+ * it could indexes a table with 4.294.967.296 rows.
+ * In case that a new use case require more than that then this
+ * number can be changed, but also the way in which the random
+ * numbers are generated must be changed.
+ */
+#define SKIPLIST_MAX_LEVELS 32
+
+/*
+ * Skiplist node container
+ */
+struct skiplist_node {
+const void *data;   /* Pointer to saved data */
+uint64_t height;/* Height of this node */
+struct skiplist_node *forward[];/* Links to the next nodes */
+};
+
+/*
+ * Skiplist container
+ */
+
+struct skiplist {
+struct skiplist_node *header;   /* Pointer to head node (not first
+ * data node) */
+skiplist_comparator *cmp;   /* Pointer to the skiplist's comparison
+ * function */
+void *cfg;  /* Pointer to optional comparison
+ * configuration, used by the comparator */
+int max_levels; /* Max levels of the skiplist. */
+int64_t size;   /* Current size of the skiplist. */
+int64_t level;  /* Max number of levels used in this skiplist 
*/
+void (*free_func) (void *); /* Function that free the value's memory */
+};
+
+static int skiplist_determine_level(struct skiplist *sl);
+
+static struct skiplist_node *skiplist_create_node(int, const void *);
+
+static struct skiplist_node *skiplist_forward_to_(struct skiplist *sl,
+  const void *value,
+  struct skiplist_node
+  **update);
+
+/*
+ * skiplist_create returns a new skiplist, configured with given max_levels,
+ * data comparer and configuration.
+ * Sets a probability of 0.5 (RAND_MAX / 2).
+ */
+struct skiplist *
+skiplist_create(int max_levels, skiplist_comparator object_comparator,
+void *configuration)
+{
+random_init();
+struct skiplist *sl;
+
+sl = xmalloc(sizeof (struct skiplist));
+sl->cfg = configuration;
+sl->max_levels = max_levels < SKIPLIST_MAX_LEVELS ?
+max_levels : SKIPLIST_MAX_LEVELS;
+sl->size = 0;
+sl->level = 0;
+sl->cmp = object_comparator;
+sl->header = skiplist_create_node(sl->max_levels, NULL);
+sl->free_func = NULL;
+
+return sl;
+}
+
+/*
+ * Set a custom function that free the value's memory when
+ * destroying the skiplist.
+ */
+void
+skiplist_set_free_func(struct skiplist *sl, void (*func) (void *))
+{
+sl->free_func = func;
+}
+
+/*
+ * Determines the correspondent level for a skiplist node.
+ * Guarantees that the returned integer is less or equal
+ * to the current height of the skiplist plus 1.
+ */
+static int

[ovs-dev] [PATCH v2 1/1] json: Serialize strings using a lookup table

2016-10-05 Thread Rodriguez Betancourt, Esteban

The existing implementation uses a switch with
many conditions, that when compiled is translated
to a not optimal series of conditional jumps.

With a lookup table the generated code has less conditional jumps,
that should translate in improving the CPU ability to predict the
jumps.

Performance Comparison:
All the timings are in nanoseconds, "OVS Master" corresponds to 13a1d36.
N is the number of repetitions

Serialize vswitch.ovsschema
NOVS Master  Lookup TableDifferenceDiff per op
1   233182200369   3281332813
10 2724931   1919168  80576380576.3
100   22802794  24406648-1603854   -16038.54
1000 253645888 2062597604738612847386.128
1   23522457031906839780   44540592344540.5923
10 23967770920   19012178655  495559226549555.92265

Serialize echo example
NOVS Master  Lookup TableDifferenceDiff per op
1385712565 -8708-8708
10  17403 7312 10091 1009.1
100 5785956613  1246   12.46
1000   592310   528110 64200   64.2
1 6096334  5576109520225   52.0225
10   60970439 58477626   2492813   24.92813

Serialize mutate example
NOVS Master  Lookup TableDifferenceDiff per op
17115  19051 -11936  -11936
10  34110  39561  -5451-545.1
100296613 298645  -2032-20.32
1000  35104992930588 579911579.911
133898710   302786313620079362.0079
10  305069356  280622992   24446364244.46364

Signed-off-by: Esteban Rodriguez Betancourt 
---
I can't believe that I didn't send the patch in the previous mail, sorry about 
that.
Regards,
Esteban

 lib/json.c | 76 +-
 1 file changed, 40 insertions(+), 36 deletions(-)

diff --git a/lib/json.c b/lib/json.c
index 995f3c2..080e881 100644
--- a/lib/json.c
+++ b/lib/json.c
@@ -1610,49 +1610,53 @@ json_serialize_array(const struct json_array *array, 
struct json_serializer *s)
 ds_put_char(ds, ']');
 }
 
+const char *chars_escaping[256] = {
+"\\u", "\\u0001", "\\u0002", "\\u0003", "\\u0004", "\\u0005", 
"\\u0006", "\\u0007",
+"\\b", "\\t", "\\n", "\\u000b", "\\f", "\\r", "\\u000e", "\\u000f",
+"\\u0010", "\\u0011", "\\u0012", "\\u0013", "\\u0014", "\\u0015", 
"\\u0016", "\\u0017",
+"\\u0018", "\\u0019", "\\u001a", "\\u001b", "\\u001c", "\\u001d", 
"\\u001e", "\\u001f",
+" ", "!", "\\\"", "#", "$", "%", "&", "'",
+"(", ")", "*", "+", ",", "-", ".", "/",
+"0", "1", "2", "3", "4", "5", "6", "7",
+"8", "9", ":", ";", "<", "=", ">", "?",
+"@", "A", "B", "C", "D", "E", "F", "G",
+"H", "I", "J", "K", "L", "M", "N", "O",
+"P", "Q", "R", "S", "T", "U", "V", "W",
+"X", "Y", "Z", "[", "", "]", "^", "_",
+"`", "a", "b", "c", "d", "e", "f", "g",
+"h", "i", "j", "k", "l", "m", "n", "o",
+"p", "q", "r", "s", "t", "u", "v", "w",
+"x", "y", "z", "{", "|", "}", "~", "\x7f",
+"\x80", "\x81", "\x82", "\x83", "\x84", "\x85", "\x86", "\x87",
+"\x88", "\x89", "\x8a", "\x8b", "\x8c", "\x8d", "\x8e", "\x8f",
+"\x90", "\x91", "\x92", "\x93", "\x94", "\x95", "\x96", "\x97",
+"\x98", "\x99", "\x9a", "\x9b", "\x9c", "\x9d", "\x9e", "\x9f",
+"\xa0", "\xa1", "\xa2", "\xa3", "\xa4", "\xa5", "\xa6", "\xa7",
+"\xa8", "\xa9", "\xaa", "\xab", "\xac", "\xad", "\xae", "\xaf",
+"\xb0", "\xb1", "\xb2", "\xb3", "\xb4", "\xb5", "\xb6", "\xb7",
+"\xb8", "\xb9", "\xba", "\xbb", "\xbc", "\xbd", "\xbe", "\xbf",
+"\xc0", "\xc1", "\xc2", "\xc3", "\xc4", "\xc5", "\xc6", "\xc7",
+"\xc8", "\xc9", "\xca", "\xcb", "\xcc", "\xcd", "\xce", "\xcf",
+"\xd0", "\xd1", "\xd2", "\xd3", "\xd4", "\xd5", "\xd6", "\xd7",
+"\xd8", "\xd9", "\xda", "\xdb", "\xdc", "\xdd", "\xde", "\xdf",
+"\xe0", "\xe1", "\xe2", "\xe3", "\xe4", "\xe5", "\xe6", "\xe7",
+"\xe8", "\xe9", "\xea", "\xeb", "\xec", "\xed", "\xee", "\xef",
+"\xf0", "\xf1", "\xf2", "\xf3", "\xf4", "\xf5", "\xf6", "\xf7",
+"\xf8", "\xf9", "\xfa", "\xfb", "\xfc", "\xfd", "\xfe", "\xff"
+};
+
 static void
 json_serialize_string(const char *string, struct ds *ds)
 {
 uint8_t c;
+uint8_t c2;
+const char *escape;
 
 ds_put_char(ds, '"');
 while ((c = *string++) != '\0') {
-switch (c) {
-case '"':
-ds_put_cstr(ds, "\\\"");
-break;
-
-case '\\':
-ds_put_cstr(ds, "");
-break;
-
-case '\b':
-ds_put_cstr(ds, "\\b");
-break;
-
-case '\f':
-

[ovs-dev] [PATCH v3 1/1] json: Use reference counting in JSON objects

2016-10-04 Thread Rodriguez Betancourt, Esteban

After profiling OVSDB insert performance it was found
that some significant portion of its time OVSDB is
calling the function json_clone.

Also, the current usages of json_clone never modify the json,
just keeps it to prevent it to be freed.

With that in mind the struct json, json_create, json_clone
and json_destroy were modified to keep a count of how many
references of the json struct are left. Only when that count
reaches zero the json struct is freed.

The old "json_clone" function was renamed as "json_deep_clone".

Some examples of the performance difference:

In these tests a test table with 4 columns (string, string,
bool, integer) was used. All the tests used "commit block".

*** 50 process each inserting 1000 rows ***

Master OVS
Test Duration   131 seconds
Average Inserts Per second  746.2687 inserts/s
Average Insert Duration 134.1382 ms
Minimal Insert Duration   0.166202 ms
Maximum Insert Duration 489.8593 ms

JSON GC Patch
Test Duration   86 seconds
Average Inserts Per second1176 inserts/s
Average Insert Duration 82.26761 ms
Minimal Insert Duration  0.165448 ms
Maximum Insert Duration751.2111 ms

*** 5 process each inserting 1 rows ***

Master OVS
Test Duration  8 seconds
Average Inserts Per second  7142.857 inserts/s
Average Insert Duration0.656431 ms
Minimal Insert Duration0.125197 ms
Maximum Insert Duration   11.93203 ms

JSON GC Patch
Test Duration  7 seconds
Average Inserts Per second  8333.333 inserts/s
Average Insert Duration0.55688 ms
Minimal Insert Duration0.143233 ms
Maximum Insert Duration   26.26319 ms

Signed-off-by: Esteban Rodriguez Betancourt 
---
 include/openvswitch/json.h |  2 ++
 lib/json.c | 18 --
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/include/openvswitch/json.h b/include/openvswitch/json.h
index 13f346c..83775ed 100644
--- a/include/openvswitch/json.h
+++ b/include/openvswitch/json.h
@@ -63,6 +63,7 @@ struct json_array {
 /* A JSON value. */
 struct json {
 enum json_type type;
+size_t count;
 union {
 struct shash *object;   /* Contains "struct json *"s. */
 struct json_array array;
@@ -99,6 +100,7 @@ bool json_boolean(const struct json *);
 double json_real(const struct json *);
 int64_t json_integer(const struct json *);
 
+struct json *json_deep_clone(const struct json *);
 struct json *json_clone(const struct json *);
 void json_destroy(struct json *);
 
diff --git a/lib/json.c b/lib/json.c
index 995f3c2..7b0398a 100644
--- a/lib/json.c
+++ b/lib/json.c
@@ -333,7 +333,7 @@ static void json_destroy_array(struct json_array *array);
 void
 json_destroy(struct json *json)
 {
-if (json) {
+if (json && !--json->count) {
 switch (json->type) {
 case JSON_OBJECT:
 json_destroy_object(json->u.object);
@@ -392,7 +392,7 @@ static struct json *json_clone_array(const struct 
json_array *array);
 
 /* Returns a deep copy of 'json'. */
 struct json *
-json_clone(const struct json *json)
+json_deep_clone(const struct json *json)
 {
 switch (json->type) {
 case JSON_OBJECT:
@@ -421,6 +421,15 @@ json_clone(const struct json *json)
 }
 }
 
+/* Returns 'json', with the reference count incremented. */
+struct json *
+json_clone(const struct json *json_)
+{
+struct json *json = CONST_CAST(struct json *, json_);
+json->count++;
+return json;
+}
+
 static struct json *
 json_clone_object(const struct shash *object)
 {
@@ -547,6 +556,10 @@ json_equal_array(const struct json_array *a, const struct 
json_array *b)
 bool
 json_equal(const struct json *a, const struct json *b)
 {
+if (a == b) {
+return true;
+}
+
 if (a->type != b->type) {
 return false;
 }
@@ -1405,6 +1418,7 @@ json_create(enum json_type type)
 {
 struct json *json = xmalloc(sizeof *json);
 json->type = type;
+json->count = 1;
 return json;
 }
 
-- 
2.7.4
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] json: Serialize strings using a lookup table

2016-09-07 Thread Rodriguez Betancourt, Esteban

The existing implementation uses a switch with
many conditions, that when compiled is translated
to a not optimal series of conditional jumps.

With a lookup table the generated code has less conditional jumps,
that should translate in improving the CPU ability to predict the
jumps.

Performance Comparison:
All the timings are in nanoseconds, "OVS Master" corresponds to 13a1d36.
N is the number of repetitions

Serialize vswitch.ovsschema
NOVS Master  Lookup TableDifferenceDiff per op
1   233182200369   3281332813
10 2724931   1919168  80576380576.3
100   22802794  24406648-1603854   -16038.54
1000 253645888 2062597604738612847386.128
1   23522457031906839780   44540592344540.5923
10 23967770920   19012178655  495559226549555.92265

Serialize echo example
NOVS Master  Lookup TableDifferenceDiff per op
1385712565 -8708-8708
10  17403 7312 10091 1009.1
100 5785956613  1246   12.46
1000   592310   528110 64200   64.2
1 6096334  5576109520225   52.0225
10   60970439 58477626   2492813   24.92813

Serialize mutate example
NOVS Master  Lookup TableDifferenceDiff per op
17115  19051 -11936  -11936
10  34110  39561  -5451-545.1
100296613 298645  -2032-20.32
1000  35104992930588 579911579.911
133898710   302786313620079362.0079
10  305069356  280622992   24446364244.46364


Signed-off-by: Esteban Rodriguez Betancourt 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH v2] json: Use reference counting in json objects

2016-08-29 Thread Rodriguez Betancourt, Esteban

After profiling OVSDB insert performance it was found
that some significant portion of its time OVSDB is
calling the function json_clone.

Also, the current usages of json_clone never modify the json,
just keeps it to prevent it to be freed.

With that in mind the struct json, json_create, json_clone
and json_destroy were modified to keep a count of how many
references of the json struct are left. Only when that count
reaches zero the json struct is freed.

The old "json_clone" function was renamed as "json_deep_clone".

Some examples of the performance difference:

In these tests a test table with 4 columns (string, string,
bool, integer) was used. All the tests used "commit block".

*** 50 process each inserting 1000 rows ***

Master OVS
Test Duration   131 seconds
Average Inserts Per second  746.2687 inserts/s
Average Insert Duration 134.1382 ms
Minimal Insert Duration   0.166202 ms
Maximum Insert Duration 489.8593 ms

JSON GC Patch
Test Duration   86 seconds
Average Inserts Per second1176 inserts/s
Average Insert Duration 82.26761 ms
Minimal Insert Duration  0.165448 ms
Maximum Insert Duration751.2111 ms

*** 5 process each inserting 1 rows ***

Master OVS
Test Duration  8 seconds
Average Inserts Per second  7142.857 inserts/s
Average Insert Duration0.656431 ms
Minimal Insert Duration0.125197 ms
Maximum Insert Duration   11.93203 ms

JSON GC Patch
Test Duration  7 seconds
Average Inserts Per second  8333.333 inserts/s
Average Insert Duration0.55688 ms
Minimal Insert Duration0.143233 ms
Maximum Insert Duration   26.26319 ms

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/json.c | 18 --
 lib/json.h |  2 ++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/json.c b/lib/json.c
index 4ac250b..20909ed 100644
--- a/lib/json.c
+++ b/lib/json.c
@@ -333,7 +333,7 @@ static void json_destroy_array(struct json_array *array);
 void
 json_destroy(struct json *json)
 {
-if (json) {
+if (json && !--json->count) {
 switch (json->type) {
 case JSON_OBJECT:
 json_destroy_object(json->u.object);
@@ -390,9 +390,18 @@ json_destroy_array(struct json_array *array)
 static struct json *json_clone_object(const struct shash *object);
 static struct json *json_clone_array(const struct json_array *array);
 
+/* Increments the counter of the json instance */
+struct json *
+json_clone(const struct json *json_)
+{
+struct json *json = CONST_CAST(struct json *, json_);
+json->count++;
+return json;
+}
+
 /* Returns a deep copy of 'json'. */
 struct json *
-json_clone(const struct json *json)
+json_deep_clone(const struct json *json)
 {
 switch (json->type) {
 case JSON_OBJECT:
@@ -547,6 +556,10 @@ json_equal_array(const struct json_array *a, const struct 
json_array *b)
 bool
 json_equal(const struct json *a, const struct json *b)
 {
+if (a == b) {
+return true;
+}
+
 if (a->type != b->type) {
 return false;
 }
@@ -1405,6 +1418,7 @@ json_create(enum json_type type)
 {
 struct json *json = xmalloc(sizeof *json);
 json->type = type;
+json->count = 1;
 return json;
 }
 
diff --git a/lib/json.h b/lib/json.h
index 3497035..a76c55d 100644
--- a/lib/json.h
+++ b/lib/json.h
@@ -61,6 +61,7 @@ struct json_array {
 
 /* A JSON value. */
 struct json {
+size_t count;
 enum json_type type;
 union {
 struct shash *object;   /* Contains "struct json *"s. */
@@ -99,6 +100,7 @@ double json_real(const struct json *);
 int64_t json_integer(const struct json *);
 
 struct json *json_clone(const struct json *);
+struct json *json_deep_clone(const struct json *);
 void json_destroy(struct json *);
 
 size_t json_hash(const struct json *, size_t basis);
-- 
2.7.4
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH] json: Use reference counting in json objects

2016-08-25 Thread Rodriguez Betancourt, Esteban

For example: 
50 process inserting each 1000 rows to a test table (4 columns: string * 2, 
bool, integer) using commit block

Master OVS
Test Duration   131 seconds
Average Inserts Per second  746.2687 inserts/s
Average Insert Duration 134.1382ms
Minimal Insert Duration 0.166202ms
Maximum Insert Duration 489.8593ms

JSON GC Patch
Test Duration   86 seconds
Average Inserts Per second  1176 inserts/s
Average Insert Duration 82.26761ms
Minimal Insert Duration 0.165448ms
Maximum Insert Duration 751.2111ms

5 process inserting 1 rows each to the same test table

Master OVS
Test Duration   8 seconds
Average Inserts Per second  7142.857 inserts/s
Average Insert Duration 0.656431ms
Minimal Insert Duration 0.125197ms
Maximum Insert Duration 11.93203ms

JSON GC Patch
Test Duration   7 seconds
Average Inserts Per second  8333.333 inserts/s
Average Insert Duration 0.55688ms
Minimal Insert Duration 0.143233ms
Maximum Insert Duration 26.26319ms

Also, accordingly with profiling using callgrind functions like json_clone or 
json_destroy sometimes can take up to 20% and 36% of the ops executed inOVSDB 
(those functions and functions called by those ones...). Although, in one odd 
case it shows that json_clone takes 50% of the operations executed. We can't 
avoid calling json_destroy sometimes, but I didn't find any place where a 
cloned JSON object is updated: only cloned, passed the new pointer and release 
the old one. 

Regards,
Esteban


From: Ryan Moats [mailto:rmo...@us.ibm.com] 
Sent: jueves, 25 de agosto de 2016 9:28
To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] json: Use reference counting in json objects

"dev" <dev-boun...@openvswitch.org> wrote on 08/16/2016 04:50:37 PM:

> From: "Rodriguez Betancourt, Esteban" <esteb...@hpe.com>
> To: "dev@openvswitch.org" <dev@openvswitch.org>
> Date: 08/16/2016 04:50 PM
> Subject: [ovs-dev] [PATCH] json: Use reference counting in json objects
> Sent by: "dev" <dev-boun...@openvswitch.org>
> 
> After profiling OVSDB insert performance it was found
> that some significant portion of its time OVSDB is
> calling the function json_clone.
> 
> Also, most of the usages of json_clone never modify the json,
> just keeps it to prevent it to be freed.
> 
> With that in mind the struct json, json_create, json_clone
> and json_destroy were modified to keep a count of how many
> references of the json struct are left. Only when that count
> reaches zero the json struct is freed.
> 
> The old "json_clone" function was renamed as "json_deep_clone".
> 
> The change provides some performance improvement, depending
> on the transactions performed in OVSDB.
> 
> Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>
> ---

Can you add some quantification of the statement
"provides some performance improvement"?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH] json: Use reference counting in json objects

2016-08-16 Thread Rodriguez Betancourt, Esteban

After profiling OVSDB insert performance it was found
that some significant portion of its time OVSDB is
calling the function json_clone.

Also, most of the usages of json_clone never modify the json,
just keeps it to prevent it to be freed.

With that in mind the struct json, json_create, json_clone
and json_destroy were modified to keep a count of how many
references of the json struct are left. Only when that count
reaches zero the json struct is freed.

The old "json_clone" function was renamed as "json_deep_clone".

The change provides some performance improvement, depending
on the transactions performed in OVSDB.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/json.c | 18 --
 lib/json.h |  2 ++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/json.c b/lib/json.c
index 4ac250b..20909ed 100644
--- a/lib/json.c
+++ b/lib/json.c
@@ -333,7 +333,7 @@ static void json_destroy_array(struct json_array *array);
 void
 json_destroy(struct json *json)
 {
-if (json) {
+if (json && !--json->count) {
 switch (json->type) {
 case JSON_OBJECT:
 json_destroy_object(json->u.object);
@@ -390,9 +390,18 @@ json_destroy_array(struct json_array *array)
 static struct json *json_clone_object(const struct shash *object);
 static struct json *json_clone_array(const struct json_array *array);
 
+/* Increments the counter of the json instance */
+struct json *
+json_clone(const struct json *json_)
+{
+struct json *json = CONST_CAST(struct json *, json_);
+json->count++;
+return json;
+}
+
 /* Returns a deep copy of 'json'. */
 struct json *
-json_clone(const struct json *json)
+json_deep_clone(const struct json *json)
 {
 switch (json->type) {
 case JSON_OBJECT:
@@ -547,6 +556,10 @@ json_equal_array(const struct json_array *a, const struct 
json_array *b)
 bool
 json_equal(const struct json *a, const struct json *b)
 {
+if (a == b) {
+return true;
+}
+
 if (a->type != b->type) {
 return false;
 }
@@ -1405,6 +1418,7 @@ json_create(enum json_type type)
 {
 struct json *json = xmalloc(sizeof *json);
 json->type = type;
+json->count = 1;
 return json;
 }
 
diff --git a/lib/json.h b/lib/json.h
index 3497035..a76c55d 100644
--- a/lib/json.h
+++ b/lib/json.h
@@ -61,6 +61,7 @@ struct json_array {
 
 /* A JSON value. */
 struct json {
+size_t count;
 enum json_type type;
 union {
 struct shash *object;   /* Contains "struct json *"s. */
@@ -99,6 +100,7 @@ double json_real(const struct json *);
 int64_t json_integer(const struct json *);
 
 struct json *json_clone(const struct json *);
+struct json *json_deep_clone(const struct json *);
 void json_destroy(struct json *);
 
 size_t json_hash(const struct json *, size_t basis);
-- 
2.7.4
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH v3] ovsdb: Weak references performance fix

2016-07-13 Thread Rodriguez Betancourt, Esteban

Prevents the cloning of rows with outgoing or
incoming weak references when those rows aren't
being modified.

It improves the OVSDB Server performance when
many rows with weak references are involved
in a transaction.

Signed-off-by: Esteban Rodriguez Betancourt 
---

In the previous patch I forgot to update the test to use the new function name,
otherwise this patch is identical to the previous one.

 include/openvswitch/list.h |  7 ++
 ovsdb/row.h|  8 ---
 ovsdb/transaction.c| 57 +++---
 tests/library.at   |  4 ++--
 tests/test-list.c  | 41 +
 5 files changed, 104 insertions(+), 13 deletions(-)

diff --git a/include/openvswitch/list.h b/include/openvswitch/list.h
index 5c2cca4..ea5b3db 100644
--- a/include/openvswitch/list.h
+++ b/include/openvswitch/list.h
@@ -288,4 +288,11 @@ ovs_list_is_short(const struct ovs_list *list)
 return list->next == list->prev;
 }
 
+/* Transplant a list into another, and resets the origin list */
+static inline void
+ovs_list_push_back_all(struct ovs_list *dst, struct ovs_list *src)
+{
+ovs_list_splice(dst, src->next, src);
+}
+
 #endif /* list.h */
diff --git a/ovsdb/row.h b/ovsdb/row.h
index b1d1edd..b38c7dd 100644
--- a/ovsdb/row.h
+++ b/ovsdb/row.h
@@ -36,9 +36,11 @@ struct ovsdb_column_set;
  * ovsdb_weak_ref" structures are created for them.
  */
 struct ovsdb_weak_ref {
-struct ovs_list src_node;   /* In src->src_refs list. */
-struct ovs_list dst_node;   /* In destination row's dst_refs list. */
-struct ovsdb_row *src;  /* Source row. */
+struct ovs_list src_node;  /* In src->src_refs list. */
+struct ovs_list dst_node;  /* In destination row's dst_refs list. */
+struct ovsdb_row *src; /* Source row. */
+struct ovsdb_table *dst_table; /* Destination table. */
+struct uuid dst;   /* Destination row uuid. */
 };
 
 /* A row in a database table. */
diff --git a/ovsdb/transaction.c b/ovsdb/transaction.c
index 9e12a62..e2a16e6 100644
--- a/ovsdb/transaction.c
+++ b/ovsdb/transaction.c
@@ -436,9 +436,48 @@ ovsdb_txn_row_commit(struct ovsdb_txn *txn OVS_UNUSED,
 return NULL;
 }
 
+static struct ovsdb_error *
+ovsdb_txn_update_weak_refs(struct ovsdb_txn *txn OVS_UNUSED,
+   struct ovsdb_txn_row *txn_row)
+{
+struct ovsdb_weak_ref *weak, *next;
+
+/* Remove the weak references originating in the old version of the row */
+if (txn_row->old) {
+LIST_FOR_EACH_SAFE (weak, next, src_node, _row->old->src_refs) {
+ovs_list_remove(>src_node);
+ovs_list_remove(>dst_node);
+free(weak);
+}
+}
+
+/* Although the originating rows have the responsability of updating the
+ * weak references in the dst, is possible that some source rows aren't
+ * part of the transaction.
+ * In that situation this row needs to move the list of incoming weak
+ * references from the old row into the new one.
+ */
+if (txn_row->old && txn_row->new) {
+/* Move the incoming weak references from old to new */
+ovs_list_push_back_all(_row->new->dst_refs, 
_row->old->dst_refs);
+}
+
+/* Insert the weak references originating in the new version of the row */
+struct ovsdb_row *dst_row;
+if (txn_row->new) {
+LIST_FOR_EACH (weak, src_node, _row->new->src_refs) {
+/* dst_row MUST exist */
+dst_row = CONST_CAST(struct ovsdb_row *,
+ovsdb_table_get_row(weak->dst_table, >dst));
+ovs_list_insert(_row->dst_refs, >dst_node);
+}
+}
+
+return NULL;
+}
+
 static void
-add_weak_ref(struct ovsdb_txn *txn,
- const struct ovsdb_row *src_, const struct ovsdb_row *dst_)
+add_weak_ref(const struct ovsdb_row *src_, const struct ovsdb_row *dst_)
 {
 struct ovsdb_row *src = CONST_CAST(struct ovsdb_row *, src_);
 struct ovsdb_row *dst = CONST_CAST(struct ovsdb_row *, dst_);
@@ -448,8 +487,6 @@ add_weak_ref(struct ovsdb_txn *txn,
 return;
 }
 
-dst = ovsdb_txn_row_modify(txn, dst);
-
 if (!ovs_list_is_empty(>dst_refs)) {
 /* Omit duplicates. */
 weak = CONTAINER_OF(ovs_list_back(>dst_refs),
@@ -461,7 +498,10 @@ add_weak_ref(struct ovsdb_txn *txn,
 
 weak = xmalloc(sizeof *weak);
 weak->src = src;
-ovs_list_push_back(>dst_refs, >dst_node);
+weak->dst_table = dst->table;
+weak->dst = *ovsdb_row_get_uuid(dst);
+/* The dst_refs list is updated at commit time */
+ovs_list_init(>dst_node);
 ovs_list_push_back(>src_refs, >src_node);
 }
 
@@ -471,7 +511,7 @@ assess_weak_refs(struct ovsdb_txn *txn, struct 
ovsdb_txn_row *txn_row)
 struct ovsdb_table *table;
 struct shash_node *node;
 
-if (txn_row->old) {
+if (txn_row->old && !txn_row->new) {
 /* Mark rows that have weak references to 'txn_row' as

[ovs-dev] [PATCH v2] ovsdb: Weak references performance fix

2016-07-08 Thread Rodriguez Betancourt, Esteban

Prevents the cloning of rows with outgoing or
incoming weak references when those rows aren't
being modified.

It improves the OVSDB Server performance when
many rows with weak references are involved
in a transaction.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 include/openvswitch/list.h |  7 ++
 ovsdb/row.h|  8 ---
 ovsdb/transaction.c| 57 +++---
 tests/library.at   |  4 ++--
 tests/test-list.c  | 41 +
 5 files changed, 104 insertions(+), 13 deletions(-)

diff --git a/include/openvswitch/list.h b/include/openvswitch/list.h
index 5c2cca4..ea5b3db 100644
--- a/include/openvswitch/list.h
+++ b/include/openvswitch/list.h
@@ -288,4 +288,11 @@ ovs_list_is_short(const struct ovs_list *list)
 return list->next == list->prev;
 }
 
+/* Transplant a list into another, and resets the origin list */
+static inline void
+ovs_list_push_back_all(struct ovs_list *dst, struct ovs_list *src)
+{
+ovs_list_splice(dst, src->next, src);
+}
+
 #endif /* list.h */
diff --git a/ovsdb/row.h b/ovsdb/row.h
index b1d1edd..b38c7dd 100644
--- a/ovsdb/row.h
+++ b/ovsdb/row.h
@@ -36,9 +36,11 @@ struct ovsdb_column_set;
  * ovsdb_weak_ref" structures are created for them.
  */
 struct ovsdb_weak_ref {
-struct ovs_list src_node;   /* In src->src_refs list. */
-struct ovs_list dst_node;   /* In destination row's dst_refs list. */
-struct ovsdb_row *src;  /* Source row. */
+struct ovs_list src_node;  /* In src->src_refs list. */
+struct ovs_list dst_node;  /* In destination row's dst_refs list. */
+struct ovsdb_row *src; /* Source row. */
+struct ovsdb_table *dst_table; /* Destination table. */
+struct uuid dst;   /* Destination row uuid. */
 };
 
 /* A row in a database table. */
diff --git a/ovsdb/transaction.c b/ovsdb/transaction.c
index 9e12a62..e2a16e6 100644
--- a/ovsdb/transaction.c
+++ b/ovsdb/transaction.c
@@ -436,9 +436,48 @@ ovsdb_txn_row_commit(struct ovsdb_txn *txn OVS_UNUSED,
 return NULL;
 }
 
+static struct ovsdb_error *
+ovsdb_txn_update_weak_refs(struct ovsdb_txn *txn OVS_UNUSED,
+   struct ovsdb_txn_row *txn_row)
+{
+struct ovsdb_weak_ref *weak, *next;
+
+/* Remove the weak references originating in the old version of the row */
+if (txn_row->old) {
+LIST_FOR_EACH_SAFE (weak, next, src_node, _row->old->src_refs) {
+ovs_list_remove(>src_node);
+ovs_list_remove(>dst_node);
+free(weak);
+}
+}
+
+/* Although the originating rows have the responsability of updating the
+ * weak references in the dst, is possible that some source rows aren't
+ * part of the transaction.
+ * In that situation this row needs to move the list of incoming weak
+ * references from the old row into the new one.
+ */
+if (txn_row->old && txn_row->new) {
+/* Move the incoming weak references from old to new */
+ovs_list_push_back_all(_row->new->dst_refs, 
_row->old->dst_refs);
+}
+
+/* Insert the weak references originating in the new version of the row */
+struct ovsdb_row *dst_row;
+if (txn_row->new) {
+LIST_FOR_EACH (weak, src_node, _row->new->src_refs) {
+/* dst_row MUST exist */
+dst_row = CONST_CAST(struct ovsdb_row *,
+ovsdb_table_get_row(weak->dst_table, >dst));
+ovs_list_insert(_row->dst_refs, >dst_node);
+}
+}
+
+return NULL;
+}
+
 static void
-add_weak_ref(struct ovsdb_txn *txn,
- const struct ovsdb_row *src_, const struct ovsdb_row *dst_)
+add_weak_ref(const struct ovsdb_row *src_, const struct ovsdb_row *dst_)
 {
 struct ovsdb_row *src = CONST_CAST(struct ovsdb_row *, src_);
 struct ovsdb_row *dst = CONST_CAST(struct ovsdb_row *, dst_);
@@ -448,8 +487,6 @@ add_weak_ref(struct ovsdb_txn *txn,
 return;
 }
 
-dst = ovsdb_txn_row_modify(txn, dst);
-
 if (!ovs_list_is_empty(>dst_refs)) {
 /* Omit duplicates. */
 weak = CONTAINER_OF(ovs_list_back(>dst_refs),
@@ -461,7 +498,10 @@ add_weak_ref(struct ovsdb_txn *txn,
 
 weak = xmalloc(sizeof *weak);
 weak->src = src;
-ovs_list_push_back(>dst_refs, >dst_node);
+weak->dst_table = dst->table;
+weak->dst = *ovsdb_row_get_uuid(dst);
+/* The dst_refs list is updated at commit time */
+ovs_list_init(>dst_node);
 ovs_list_push_back(>src_refs, >src_node);
 }
 
@@ -471,7 +511,7 @@ assess_weak_refs(struct ovsdb_txn *txn, struct 
ovsdb_txn_row *txn_row)
 struct ovsdb_table *table;
 struct shash_node *node;
 
-if (txn_row->old) {
+if (txn_row->old && !txn_row->new) {
 /* Mark rows that have weak references to 'txn_row' as modified, so
  * that their weak references will get reassessed. */
 struct ovsdb_weak_ref *weak, *next;
@@ -506,7 +546,7

Re: [ovs-dev] [PATCH] ovsdb: Weak references performance fix

2016-07-08 Thread Rodriguez Betancourt, Esteban

Thanks for the comments, I'm going to send soon a new patch with the suggestions
applied.

About how the patch works:
The first attempt to stop the cascade changes was something like:

@@ -471,7 +512,7 @@ assess_weak_refs(struct ovsdb_txn *txn, struct 
ovsdb_txn_row *txn_row)
 struct ovsdb_table *table;
 struct shash_node *node;
 
-if (txn_row->old) {
+if (txn_row->old && !txn_row->new) {

The rationale was: the weak references are linked by UUID, and the UUID never 
changes, unless 
if the current row is being deleted. So, we in fact didn't need to change the 
rows that weak-referenced
the current row if the row wasn't deleted.
It didn't works: we discover that the modified rows were lacking incoming weak 
references in
the dst_refs list, so that gives us the idea of "moving" the references in 
old->dst_refs to new->dst_refs.

In the original code (dst_refs is created from scratch):

old->dst_refs = all the rows that weak referenced old

new->dst_refs = all the rows that weak referenced old and are still weak 
referencing new + rows in the transaction that weak referenced new



In the patch (dst_refs incrementally built):
Old->dst_refs = all the rows that weak referenced old

(Ideally, but expansive to calculate:)
New->dst_refs = old->dst_refs - "weak references removed within this TXN" + 
"weak references created within this TXN"

(What was implemented:)
New->dst_refs = old->dst_refs - "weak references in old rows in TXN" + "weak 
references in new rows in TXN"

The resulting sets should be equal in both cases.


There we do some more optimizations:
- If we know that the transactions must be successful at some point then,
instead of cloning dst_refs we could just move the elements between 
the lists.

- At that point we lost the rollback feature, but we aren't going to need
 it anyway (note that we didn't really touch the src_refs part).

- The references in dst_refs must point to new instead than old. 
 Previously we iterated over all the weak references in dst_refs
 to change that pointer, but using an UUID is easier, and prevents
 that iteration completely.

Regards,
Esteban

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: sábado, 2 de julio de 2016 14:06
> To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] ovsdb: Weak references performance fix
> 
> On Mon, Jun 27, 2016 at 08:07:02PM +, Rodriguez Betancourt, Esteban
> wrote:
> > Prevents the cloning of rows with outgoing or incoming weak references
> > when those rows aren't being modified.
> >
> > It improves the OVSDB Server performance when many rows with weak
> > references are involved in a transaction.
> >
> > Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>
> 
> Great idea.  Thanks for working on this!
> 
> The new ovs_list_transplant() function takes two "const" pointers to lists but
> it modifies both lists.  I don't think it makes sense for them to be const.
> 
> The new ovs_list_transplant() function has a name that doesn't give much of
> a idea of what it does.  I think that really it appends everything in 'src' 
> to 'dst',
> although those are not great names since both lists are modified.  Maybe
> ovs_list_push_back_all() would be a better name.
> 
> I think that the new ovs_list_transplant() function can be implemented in
> terms of ovs_list_splice(), as:
> ovs_list_splice(dst, >next, src); I see a couple of existing 
> uses of
> ovs_list_splice() to do that, so it's probably best to convert them to use the
> new function for consistency.
> 
> Since add_weak_ref() doesn't use its first argument now, please delete the
> parameter entirely instead of marking it OVS_UNUSED.
> 
> In add_weak_ref(), instead of using memcpy() to copy a struct uuid, please
> use an ordinary assignment with =.
> 
> I don't think that this loop in ovsdb_txn_update_weak_refs() needs to be
> the _SAFE variant: it appears to me that it iterates on ->src_refs and
> ->src_node but only modifies ->dst_refs and ->dst_node:
> > +LIST_FOR_EACH_SAFE (weak, next, src_node, _row->new-
> >src_refs) {
> > +/* dst_row MUST exist */
> > +dst_row = CONST_CAST(struct ovsdb_row *,
> > +ovsdb_table_get_row(weak->dst_table, >dst));
> > +ovs_list_insert(_row->dst_refs, >dst_node);
> > +}
> 
> It's taking me some thought to convince myself that this new version is as
> correct as the previous version.  Do you have any arguments or explanations
> to help me out?
> 
> Thanks,
> 
> Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH] ovsdb: Weak references performance fix

2016-06-27 Thread Rodriguez Betancourt, Esteban

Prevents the cloning of rows with outgoing or
incoming weak references when those rows aren't
being modified.

It improves the OVSDB Server performance when
many rows with weak references are involved
in a transaction.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 include/openvswitch/list.h | 19 +
 ovsdb/row.h|  8 ---
 ovsdb/transaction.c| 52 +-
 tests/library.at   |  2 +-
 tests/test-list.c  | 41 
 5 files changed, 113 insertions(+), 9 deletions(-)

diff --git a/include/openvswitch/list.h b/include/openvswitch/list.h
index 5c2cca4..a762ef7 100644
--- a/include/openvswitch/list.h
+++ b/include/openvswitch/list.h
@@ -288,4 +288,23 @@ ovs_list_is_short(const struct ovs_list *list)
 return list->next == list->prev;
 }
 
+/* Transplant a list into another, and resets the origin list */
+static inline void
+ovs_list_transplant(const struct ovs_list *dst_, const struct ovs_list *src_)
+{
+struct ovs_list *src, *dst;
+src = CONST_CAST(struct ovs_list *, src_);
+dst = CONST_CAST(struct ovs_list *, dst_);
+
+/* Chain last element of dst with first of src */
+src->next->prev = dst->prev;
+dst->prev->next = src->next;
+
+/* Chain last element of src with head of dst */
+src->prev->next = dst;
+dst->prev = src->prev;
+
+ovs_list_init(src);
+}
+
 #endif /* list.h */
diff --git a/ovsdb/row.h b/ovsdb/row.h
index b1d1edd..b38c7dd 100644
--- a/ovsdb/row.h
+++ b/ovsdb/row.h
@@ -36,9 +36,11 @@ struct ovsdb_column_set;
  * ovsdb_weak_ref" structures are created for them.
  */
 struct ovsdb_weak_ref {
-struct ovs_list src_node;   /* In src->src_refs list. */
-struct ovs_list dst_node;   /* In destination row's dst_refs list. */
-struct ovsdb_row *src;  /* Source row. */
+struct ovs_list src_node;  /* In src->src_refs list. */
+struct ovs_list dst_node;  /* In destination row's dst_refs list. */
+struct ovsdb_row *src; /* Source row. */
+struct ovsdb_table *dst_table; /* Destination table. */
+struct uuid dst;   /* Destination row uuid. */
 };
 
 /* A row in a database table. */
diff --git a/ovsdb/transaction.c b/ovsdb/transaction.c
index 9e12a62..5ad3de0 100644
--- a/ovsdb/transaction.c
+++ b/ovsdb/transaction.c
@@ -436,8 +436,48 @@ ovsdb_txn_row_commit(struct ovsdb_txn *txn OVS_UNUSED,
 return NULL;
 }
 
+static struct ovsdb_error *
+ovsdb_txn_update_weak_refs(struct ovsdb_txn *txn OVS_UNUSED,
+   struct ovsdb_txn_row *txn_row)
+{
+struct ovsdb_weak_ref *weak, *next;
+
+/* Remove the weak references originating in the old version of the row */
+if (txn_row->old) {
+LIST_FOR_EACH_SAFE (weak, next, src_node, _row->old->src_refs) {
+ovs_list_remove(>src_node);
+ovs_list_remove(>dst_node);
+free(weak);
+}
+}
+
+/* Although the originating rows have the responsability of updating the
+ * weak references in the dst, is possible that some source rows aren't
+ * part of the transaction.
+ * In that situation this row needs to move the list of incoming weak
+ * references from the old row into the new one.
+ */
+if (txn_row->old && txn_row->new) {
+/* Move the incoming weak references from old to new */
+ovs_list_transplant(_row->new->dst_refs, _row->old->dst_refs);
+}
+
+/* Insert the weak references originating in the new version of the row */
+struct ovsdb_row *dst_row;
+if (txn_row->new) {
+LIST_FOR_EACH_SAFE (weak, next, src_node, _row->new->src_refs) {
+/* dst_row MUST exist */
+dst_row = CONST_CAST(struct ovsdb_row *,
+ovsdb_table_get_row(weak->dst_table, >dst));
+ovs_list_insert(_row->dst_refs, >dst_node);
+}
+}
+
+return NULL;
+}
+
 static void
-add_weak_ref(struct ovsdb_txn *txn,
+add_weak_ref(struct ovsdb_txn *txn OVS_UNUSED,
  const struct ovsdb_row *src_, const struct ovsdb_row *dst_)
 {
 struct ovsdb_row *src = CONST_CAST(struct ovsdb_row *, src_);
@@ -448,8 +488,6 @@ add_weak_ref(struct ovsdb_txn *txn,
 return;
 }
 
-dst = ovsdb_txn_row_modify(txn, dst);
-
 if (!ovs_list_is_empty(>dst_refs)) {
 /* Omit duplicates. */
 weak = CONTAINER_OF(ovs_list_back(>dst_refs),
@@ -461,7 +499,10 @@ add_weak_ref(struct ovsdb_txn *txn,
 
 weak = xmalloc(sizeof *weak);
 weak->src = src;
-ovs_list_push_back(>dst_refs, >dst_node);
+weak->dst_table = dst->table;
+memcpy(>dst, ovsdb_row_get_uuid(dst), sizeof(struct uuid));
+/* The dst_refs list is updated at commit time */
+ovs_list_init(>dst_node);
 ovs_list_push_back(>src_refs, >src_node);
 }
 
@@ -471,7 +512,7 @@ assess_weak_refs(struct ovsdb_txn *txn, struct 
ovsdb_txn_row *txn_row)
 struct ovsdb_table

[ovs-dev] [PATCH] ovsdb: Strong references cascade fix

2016-06-10 Thread Rodriguez Betancourt, Esteban

Improves the performance of OVSDB avoiding the chain
reaction produced when modifing rows with a strong
reference and the pointed rows have more strong
references.

The approach taken was using the change bitmap to avoid
triggering a change count when the column hasn't changed.

One way to trigger the issue is emulating a simple linked list
with strong references within a table, where each new row
points to the previous.

Without the fix OVSDB creates a ovsdb_txn_row (and a copy
of the row) for each row in the table.
With the fix it only creates two ovsdb_txn_row: the new row and
the directly pointed row.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 ovsdb/transaction.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/ovsdb/transaction.c b/ovsdb/transaction.c
index 84ccfa3..9e12a62 100644
--- a/ovsdb/transaction.c
+++ b/ovsdb/transaction.c
@@ -271,16 +271,18 @@ update_row_ref_count(struct ovsdb_txn *txn, struct 
ovsdb_txn_row *r)
 const struct ovsdb_column *column = node->data;
 struct ovsdb_error *error;
 
-if (r->old) {
-error = ovsdb_txn_adjust_row_refs(txn, r->old, column, -1);
-if (error) {
-return OVSDB_WRAP_BUG("error decreasing refcount", error);
+if (bitmap_is_set(r->changed, column->index)) {
+if (r->old) {
+error = ovsdb_txn_adjust_row_refs(txn, r->old, column, -1);
+if (error) {
+return OVSDB_WRAP_BUG("error decreasing refcount", error);
+}
 }
-}
-if (r->new) {
-error = ovsdb_txn_adjust_row_refs(txn, r->new, column, 1);
-if (error) {
-return error;
+if (r->new) {
+error = ovsdb_txn_adjust_row_refs(txn, r->new, column, 1);
+if (error) {
+return error;
+}
 }
 }
 }
-- 
1.9.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] Question about SHA1 usage in OVSDB Server

2016-06-08 Thread Rodriguez Betancourt, Esteban

Hello,
I was performing some profiling on OVSDB, when inserting a lot of rows. The 
callgrind
results shows that the SHA1 calculation takes near 10% of the time within our 
test
(the whole file writing, including SHA1, takes like 20%).

We want to know further about why that checksum is need, in order to decide how
it could be optimized (we are considering: using processor specific 
instructions, changing
the algorithm or removing it at all).

By the way, I would want to know if there are any updates on the review of the 
IDL Compound
Indexes (pull request #127).

Thanks,
Esteban
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH 3/4 v3] ovsdb-idl: IDL Compound Indexes Implementation

2016-05-06 Thread Rodriguez Betancourt, Esteban

Send as pull request https://github.com/openvswitch/ovs/pull/127
Thanks,
Esteban
> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: viernes, 6 de mayo de 2016 10:09
> To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 3/4 v3] ovsdb-idl: IDL Compound Indexes
> Implementation
> 
> On Wed, Apr 20, 2016 at 09:57:15PM +0000, Rodriguez Betancourt, Esteban
> wrote:
> > In the C IDL, allows to create multicolumn indexes in the tables, that
> > are keep synched with the data in the replica.
> >
> > Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>
> 
> I'm getting strange patch application failures for this patch.  Maybe you
> should post it as a pull request on github.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH 1/4 v3.1] ovsdb-idl: Compound Indexes Design Document

2016-04-20 Thread Rodriguez Betancourt, Esteban

In the work made in our projects, it was found the need to have a faster
access to the rows contained in tables in the replica, as well to have
the possibility to loop over a subset of rows that meet some specified
criteria.
Those needs lead us to design and implement a functionality that
satisfies those requirements, so an implementation of special indexes were
done.
In order to keep the OVSDB server implementation unmodified and avoid
extra load of processing, the indexes are created as part of the IDL.
The indexes are created as part of the initialization of the replica request
and are maintained automatically when there are changes in the replica.

This document explains the design rationale of the compound indexes feature.

Signed-off-by: Javier Albornoz 
Signed-off-by: Esteban Rodriguez Betancourt 
Signed-off-by: Jorge Arturo Sauma Vargas 
Co-authored-by: Javier Albornoz 
Co-authored-by: Esteban Rodriguez Betancourt 
Co-authored-by: Jorge Arturo Sauma Vargas 
---
The former 1/4 v3 patch has an error in the code example (& wasn't removed). 
Here it was fixed.

 Documentation/automake.mk   |   1 +
 Documentation/c-idl-compound-indexes.md | 232 
 2 files changed, 233 insertions(+)
 create mode 100644 Documentation/c-idl-compound-indexes.md

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 5903c22..011855b 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -1,4 +1,5 @@
 docs += \
+   Documentation/c-idl-compound-indexes.md \
Documentation/committer-responsibilities.md \
Documentation/committer-grant-revocation.md \
Documentation/group-selection-method-property.txt
diff --git a/Documentation/c-idl-compound-indexes.md 
b/Documentation/c-idl-compound-indexes.md
new file mode 100644
index 000..489534f
--- /dev/null
+++ b/Documentation/c-idl-compound-indexes.md
@@ -0,0 +1,232 @@
+C IDL Compound Indexes
+==
+
+## Introduction
+
+This document describes the design and usage of the C IDL Compound Indexes
+feature that allows the developer to create indexes over any number of columns
+on the IDL side, and query them.
+
+This feature works completely on the IDL, without requiring changes to the
+OVSDB Server, OVSDB Protocol (OVSDB RFC (RFC 7047)) or
+performing additional communication with the server.
+
+Please note that in this document, the term "index" refers to the common
+database term defined as "a data structure that improves data retrieval". 
Unless
+stated otherwise, the definition for index from the OVSDB RFC (RFC 7047) is not
+used.
+
+## Typical Use Cases
+
+### Fast lookups
+
+Depending on the topology, the route table of a network device could manage
+thousands of routes. Commands such as "show ip route <*specific route*>" would
+need to do a sequential lookup of the routing table to find the specific route.
+With an index created, the lookup time could be faster.
+
+This same scenario could be applied to other features such as Access List rules
+and even interfaces lists.
+
+### Lexicographic order
+
+There are several cases where retrieving data in lexicographic order is needed.
+For example, SNMP. When an administrator or even a NMS would like to retrieve
+data from a specific device, it's possible that they will request data from 
full
+tables instead of just specific values. Also, they would like to have this
+information displayed in lexicographic order. This operation could be done by
+the SNMP daemon or by the CLI, but it would be better if the database could
+provide the data ready for consumption. Also, duplicate efforts by different
+processes will be avoided. Another use case for requesting data in 
lexicographic
+order is for user interfaces (web or CLI) where it would be better and quicker
+if the DB sends the data sorted instead of letting each process to sort the 
data
+by itself.
+
+## Implementation Design
+
+This feature maintains a collection of indexes per table. The developer can
+define any number of indexes per table.
+
+An index can be defined over any number of columns, and support the following
+options:
+
+-   Add a column with type string, int or real (using default comparators).
+-   Select ordering direction of a column (must be selected when creating the
+index).
+-   Use a custom iterator (eg: treat a string column like a IP, or sort by the
+value of "config" key in a map).
+
+For querying the index the user needs to create a cursor. That cursor points to
+a position in the index. With that, the user can perform lookups
+(by key) and/or get the following rows. The user can also compare the current
+value of the cursor to a record.
+
+For faster lookups, user would need to provide a key which will be used for 
finding
+the specific rows that meet this criteria. This key could be an IP address,

[ovs-dev] [PATCH 4/4] ovsdb-idl: Autogenerated functions for compound indexes

2016-04-20 Thread Rodriguez Betancourt, Esteban

From: Arnoldo Lutz Guevara 

Generates and fill the default comparators for columns with
type int, real, string. Also creates the macros that allow
to iterate over the contents of the index, and perform
queries.

Signed-off-by: Arnoldo Lutz Guevara 
Signed-off-by: Esteban Rodriguez Betancourt 
Co-authored-by: Arnoldo Lutz Guevara 
Co-authored-by: Esteban Rodriguez Betancourt 
---
 ovsdb/ovsdb-idlc.in | 261 ++
 tests/idltest.ovsschema |   6 +-
 tests/ovsdb-idl.at  | 267 
 tests/test-ovsdb.c  | 242 ++-
 4 files changed, 773 insertions(+), 3 deletions(-)

diff --git a/ovsdb/ovsdb-idlc.in b/ovsdb/ovsdb-idlc.in
index 26b0de4..bd0329f 100755
--- a/ovsdb/ovsdb-idlc.in
+++ b/ovsdb/ovsdb-idlc.in
@@ -8,6 +8,7 @@ import sys
 import ovs.json
 import ovs.db.error
 import ovs.db.schema
+from ovs.db.types import StringType, IntegerType, RealType
 
 argv0 = sys.argv[0]
 
@@ -186,6 +187,26 @@ const struct %(s)s *%(s)s_track_get_next(const struct 
%(s)s *);
  (ROW); \\
  (ROW) = %(s)s_track_get_next(ROW))
 
+void %(s)s_index_destroy_row(const struct %(s)s *);
+int %(s)s_index_compare(struct ovsdb_idl_index_cursor *, const struct %(s)s *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_first(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_next(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_find(struct ovsdb_idl_index_cursor *, const 
struct %(s)s *);
+const struct %(s)s *%(s)s_index_forward_to(struct ovsdb_idl_index_cursor *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_get_data(const struct ovsdb_idl_index_cursor 
*);
+#define %(S)s_FOR_EACH_RANGE(ROW, CURSOR, FROM, TO) \\
+for ((ROW) = %(s)s_index_forward_to(CURSOR, FROM); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, TO) <= 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_EQUAL(ROW, CURSOR, KEY) \\
+for ((ROW) = %(s)s_index_find(CURSOR, KEY); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, KEY) == 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_BYINDEX(ROW, CURSOR) \\
+for ((ROW) = %(s)s_index_first(CURSOR); \\
+ (ROW); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+
 void %(s)s_init(struct %(s)s *);
 void %(s)s_delete(const struct %(s)s *);
 struct %(s)s *%(s)s_insert(struct ovsdb_idl_txn *);
@@ -218,6 +239,19 @@ bool %(s)s_is_updated(const struct %(s)s *, enum 
%(s)s_column_id);
 print
 
 # Table indexes.
+print "struct %(s)s * %(s)s_index_init_row(struct ovsdb_idl*, const 
struct ovsdb_idl_table_class *);" % {'s': structName}
+print
+for columnName, column in sorted(table.columns.iteritems()):
+print 'void %(s)s_index_set_%(c)s(const struct %(s)s *,' % {'s': 
structName, 'c': columnName},
+if column.type.is_smap():
+args = ['const struct smap *']
+else:
+comment, members = cMembers(prefix, tableName, columnName,
+column, True)
+args = ['%(type)s%(name)s' % member for member in members]
+print '%s);' % ', '.join(args)
+
+print
 printEnum("%stable_id" % prefix.lower(), ["%sTABLE_%s" % (prefix.upper(), 
tableName.upper()) for tableName in sorted(schema.tables)] + ["%sN_TABLES" % 
prefix.upper()])
 print
 for tableName in schema.tables:
@@ -747,6 +781,233 @@ const struct ovsdb_datum *
'C': columnName.upper()}
 print "}"
 
+# Index table related functions
+print '''
+/* Destroy 'row' of kind "%(t)s". The row must have been
+ * created with ovsdb_idl_index_init_row.
+ */
+void
+%(s)s_index_destroy_row(const struct %(s)s *row)
+{
+ovsdb_idl_index_destroy_row__(>header_);
+}
+''' % { 's' : structName, 't': tableName }
+print """
+/* Creates a new row of kind "%(t)s". */
+struct %(s)s *
+%(s)s_index_init_row(struct ovsdb_idl* idl, const struct ovsdb_idl_table_class 
*class)
+{""" % {'s': structName, 't': tableName}
+#for columnName, column in sorted(table.columns.iteritems()):
+#if column.type.is_smap():
+#print "smap_init(>%s);" % columnName
+print "return (struct %(s)s *) ovsdb_idl_index_init_row(idl, 
class);" % {'s': structName, 't': tableName}
+print "}"
+
+print '''
+/*  This function is used to compare "%(s)s" records on table in iteration 
loops for compound-index operations.
+After been called, cursor point to current position in the index
+Parameters: struct ovsdb_idl_index_cursor *cursor. Cursor used to iterate 
over the indexed data on this table.
+

[ovs-dev] [PATCH 2/4 v3] lib:Data structures: Skiplist implementation

2016-04-20 Thread Rodriguez Betancourt, Esteban

Skiplist implementation intended for the IDL compound indexes
feature.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/automake.mk   |   2 +
 lib/skiplist.c| 313 ++
 lib/skiplist.h|  54 +
 tests/.gitignore  |   1 +
 tests/automake.mk |   2 +
 tests/library.at  |  11 ++
 tests/test-skiplist.c | 210 +
 7 files changed, 593 insertions(+)
 create mode 100644 lib/skiplist.c
 create mode 100644 lib/skiplist.h
 create mode 100644 tests/test-skiplist.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 7b829d0..5e08b44 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -220,6 +220,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/shash.h \
lib/simap.c \
lib/simap.h \
+   lib/skiplist.c \
+   lib/skiplist.h \
lib/smap.c \
lib/smap.h \
lib/socket-util.c \
diff --git a/lib/skiplist.c b/lib/skiplist.c
new file mode 100644
index 000..fcf24f6
--- /dev/null
+++ b/lib/skiplist.c
@@ -0,0 +1,313 @@
+/* Copyright (C) 2016 Hewlett Packard Enterprise Development LP
+ * All Rights Reserved.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may
+ * not use this file except in compliance with the License. You may obtain
+ * a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*
+ * Skiplist implementationn based on:
+ * "Skip List: A Probabilistic Alternative to Balanced Trees",
+ * by William Pugh.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "skiplist.h"
+#include "random.h"
+#include "util.h"
+
+/*
+ * The primary usage of the skiplists are the compound indexes
+ * at the IDL.
+ * For that use case 32 height levels is more than enough as
+ * it could indexes a table with 4.294.967.296 rows.
+ * In case that a new use case require more than that then this
+ * number can be changed, but also the way in which the random
+ * numbers are generated must be changed.
+ */
+#define SKIPLIST_MAX_LEVELS 32
+
+/*
+ * Skiplist node container
+ */
+struct skiplist_node {
+const void *data;   /* Pointer to saved data */
+uint64_t height;/* Height of this node */
+struct skiplist_node *forward[];/* Links to the next nodes */
+};
+
+/*
+ * Skiplist container
+ */
+
+struct skiplist {
+struct skiplist_node *header;   /* Pointer to head node (not first
+ * data node) */
+skiplist_comparator *cmp;   /* Pointer to the skiplist's comparison
+ * function */
+void *cfg;  /* Pointer to optional comparison
+ * configuration, used by the comparator */
+int max_levels; /* Max levels of the skiplist. */
+int64_t size;   /* Current size of the skiplist. */
+int64_t level;  /* Max number of levels used in this skiplist 
*/
+void (*free_func) (void *); /* Function that free the value's memory */
+};
+
+static int skiplist_determine_level(struct skiplist *sl);
+
+static struct skiplist_node *skiplist_create_node(int, const void *);
+
+static struct skiplist_node *skiplist_forward_to_(struct skiplist *sl,
+  const void *value,
+  struct skiplist_node
+  **update);
+
+/*
+ * skiplist_create returns a new skiplist, configured with given max_levels,
+ * data comparer and configuration.
+ * Sets a probability of 0.5 (RAND_MAX / 2).
+ */
+struct skiplist *
+skiplist_create(int max_levels, skiplist_comparator object_comparator,
+void *configuration)
+{
+random_init();
+struct skiplist *sl;
+
+sl = xmalloc(sizeof (struct skiplist));
+sl->cfg = configuration;
+sl->max_levels = max_levels < SKIPLIST_MAX_LEVELS ?
+max_levels : SKIPLIST_MAX_LEVELS;
+sl->size = 0;
+sl->level = 0;
+sl->cmp = object_comparator;
+sl->header = skiplist_create_node(sl->max_levels, NULL);
+sl->free_func = NULL;
+
+return sl;
+}
+
+/*
+ * Set a custom function that free the value's memory when
+ * destroying the skiplist.
+ */
+void
+skiplist_set_free_func(struct skiplist *sl, void (*func) (void *))
+{
+sl->free_func = func;
+}
+
+/*
+ * Determines the correspondent level for a skiplist node.
+ * Guarantees that the returned integer is less or equal
+ * to the current height of the skiplist plus 1.
+ */
+static int

[ovs-dev] [PATCH 3/4 v3] ovsdb-idl: IDL Compound Indexes Implementation

2016-04-20 Thread Rodriguez Betancourt, Esteban

In the C IDL, allows to create multicolumn indexes in the
tables, that are keep synched with the data in the replica.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/ovsdb-idl-provider.h |  29 +++
 lib/ovsdb-idl.c  | 491 ++-
 lib/ovsdb-idl.h  |  59 ++
 3 files changed, 578 insertions(+), 1 deletion(-)

diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 027f79b..4653028 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -69,6 +70,7 @@ struct ovsdb_idl_table {
 struct hmap rows;/* Contains "struct ovsdb_idl_row"s. */
 struct ovsdb_idl *idl;   /* Containing idl. */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
+struct shash indexes;/* Contains "struct ovsdb_idl_index"s */
 struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
 };
 
@@ -78,6 +80,33 @@ struct ovsdb_idl_class {
 size_t n_tables;
 };
 
+/*
+ * Contains the configuration per column of the index
+ */
+struct ovsdb_idl_index_column {
+const struct ovsdb_idl_column *column;
+column_comparator *comparer;
+int sorting_order;
+};
+
+/*
+ * Defines a IDL compound index
+ */
+struct ovsdb_idl_index {
+struct skiplist *skiplist;  /* Skiplist with pointer to
+ * rows*/
+struct ovsdb_idl_index_column *columns; /* Columns configuration */
+size_t n_columns;   /* Number of columns in index 
*/
+size_t alloc_columns;   /* Size allocated memory for
+ * columns, comparers and
+ * sorting order */
+bool row_sync;  /* Determines if the replica
+ * is modifying its content or
+ * not */
+const struct ovsdb_idl_table *table;/* Table that owns this index 
*/
+const char* index_name; /* The name of this index */
+};
+
 struct ovsdb_idl_row *ovsdb_idl_get_row_arc(
 struct ovsdb_idl_row *src,
 struct ovsdb_idl_table_class *dst_table,
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 3ab05a3..814be11 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -36,8 +37,10 @@
 #include "ovsdb-parser.h"
 #include "poll-loop.h"
 #include "shash.h"
+#include "skiplist.h"
 #include "sset.h"
 #include "util.h"
+#include "uuid.h"
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(ovsdb_idl);
@@ -204,6 +207,16 @@ ovsdb_idl_table_from_class(const struct ovsdb_idl *,
const struct ovsdb_idl_table_class *);
 static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);
 
+static int
+ovsdb_idl_index_generic_comparer(const void *, const void *, const void *);
+static struct ovsdb_idl_index *ovsdb_idl_create_index_(const struct
+   ovsdb_idl_table *table,
+   size_t allocated_cols);
+static void
+ ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *table);
+static void ovsdb_idl_add_to_indexes(const struct ovsdb_idl_row *);
+static void ovsdb_idl_remove_from_indexes(const struct ovsdb_idl_row *);
+
 /* Creates and returns a connection to database 'remote', which should be in a
  * form acceptable to jsonrpc_session_open().  The connection will maintain an
  * in-memory replica of the remote database whose schema is described by
@@ -250,6 +263,7 @@ ovsdb_idl_create(const char *remote, const struct 
ovsdb_idl_class *class,
 memset(table->modes, default_mode, tc->n_columns);
 table->need_table = false;
 shash_init(>columns);
+shash_init(>indexes);
 for (j = 0; j < tc->n_columns; j++) {
 const struct ovsdb_idl_column *column = >columns[j];
 
@@ -285,6 +299,8 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl)
 
 for (i = 0; i < idl->class->n_tables; i++) {
 struct ovsdb_idl_table *table = >tables[i];
+
+ovsdb_idl_destroy_indexes(table);
 shash_destroy(>columns);
 hmap_destroy(>rows);
 free(table->modes);
@@ -319,6 +335,7 @@ ovsdb_idl_clear(struct ovsdb_idl *idl)
 struct

[ovs-dev] [PATCH 1/4 v3] ovsdb-idl: Compound Indexes Design Document

2016-04-20 Thread Rodriguez Betancourt, Esteban

In the work made in our projects, it was found the need to have a faster
access to the rows contained in tables in the replica, as well to have
the possibility to loop over a subset of rows that meet some specified
criteria.
Those needs lead us to design and implement a functionality that
satisfies those requirements, so an implementation of special indexes were
done.
In order to keep the OVSDB server implementation unmodified and avoid
extra load of processing, the indexes are created as part of the IDL.
The indexes are created as part of the initialization of the replica request
and are maintained automatically when there are changes in the replica.

This document explains the design rationale of the compound indexes feature.

Signed-off-by: Javier Albornoz 
Signed-off-by: Esteban Rodriguez Betancourt 
Signed-off-by: Jorge Arturo Sauma Vargas 
Co-authored-by: Javier Albornoz 
Co-authored-by: Esteban Rodriguez Betancourt 
Co-authored-by: Jorge Arturo Sauma Vargas 
---
 Documentation/automake.mk   |   1 +
 Documentation/c-idl-compound-indexes.md | 232 
 2 files changed, 233 insertions(+)
 create mode 100644 Documentation/c-idl-compound-indexes.md

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 5903c22..011855b 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -1,4 +1,5 @@
 docs += \
+   Documentation/c-idl-compound-indexes.md \
Documentation/committer-responsibilities.md \
Documentation/committer-grant-revocation.md \
Documentation/group-selection-method-property.txt
diff --git a/Documentation/c-idl-compound-indexes.md 
b/Documentation/c-idl-compound-indexes.md
new file mode 100644
index 000..0be86c8
--- /dev/null
+++ b/Documentation/c-idl-compound-indexes.md
@@ -0,0 +1,232 @@
+C IDL Compound Indexes
+==
+
+## Introduction
+
+This document describes the design and usage of the C IDL Compound Indexes
+feature that allows the developer to create indexes over any number of columns
+on the IDL side, and query them.
+
+This feature works completely on the IDL, without requiring changes to the
+OVSDB Server, OVSDB Protocol (OVSDB RFC (RFC 7047)) or
+performing additional communication with the server.
+
+Please note that in this document, the term "index" refers to the common
+database term defined as "a data structure that improves data retrieval". 
Unless
+stated otherwise, the definition for index from the OVSDB RFC (RFC 7047) is not
+used.
+
+## Typical Use Cases
+
+### Fast lookups
+
+Depending on the topology, the route table of a network device could manage
+thousands of routes. Commands such as "show ip route <*specific route*>" would
+need to do a sequential lookup of the routing table to find the specific route.
+With an index created, the lookup time could be faster.
+
+This same scenario could be applied to other features such as Access List rules
+and even interfaces lists.
+
+### Lexicographic order
+
+There are several cases where retrieving data in lexicographic order is needed.
+For example, SNMP. When an administrator or even a NMS would like to retrieve
+data from a specific device, it's possible that they will request data from 
full
+tables instead of just specific values. Also, they would like to have this
+information displayed in lexicographic order. This operation could be done by
+the SNMP daemon or by the CLI, but it would be better if the database could
+provide the data ready for consumption. Also, duplicate efforts by different
+processes will be avoided. Another use case for requesting data in 
lexicographic
+order is for user interfaces (web or CLI) where it would be better and quicker
+if the DB sends the data sorted instead of letting each process to sort the 
data
+by itself.
+
+## Implementation Design
+
+This feature maintains a collection of indexes per table. The developer can
+define any number of indexes per table.
+
+An index can be defined over any number of columns, and support the following
+options:
+
+-   Add a column with type string, int or real (using default comparators).
+-   Select ordering direction of a column (must be selected when creating the
+index).
+-   Use a custom iterator (eg: treat a string column like a IP, or sort by the
+value of "config" key in a map).
+
+For querying the index the user needs to create a cursor. That cursor points to
+a position in the index. With that, the user can perform lookups
+(by key) and/or get the following rows. The user can also compare the current
+value of the cursor to a record.
+
+For faster lookups, user would need to provide a key which will be used for 
finding
+the specific rows that meet this criteria. This key could be an IP address, a
+MAC address, an ACL rule, etc. When the information is found in the data
+structure the user's

[ovs-dev] [PATCH v2] cksum: Refine schema cksum validation

2016-04-15 Thread Rodriguez Betancourt, Esteban

Calculates the cksum removing the cksum line using a more
strict regex than the used previously.
It fixes a problem when calculating the cksum of a schema that
has fields with the substring cksum (e.g.: a checksum column),
lines that the previous cksum calculation incorrectly removes
before running cksum.
Also, the tool calculate-schema-cksum is introduced. This tool
calculates the cksum of a schema file. It could be used in other
programs, instead of calculating the cksum in an eventually
different way than the expected by cksum-schema-check and other
tools.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 Makefile.am  | 1 +
 build-aux/calculate-schema-cksum | 4 
 build-aux/cksum-schema-check | 3 ++-
 3 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100755 build-aux/calculate-schema-cksum

diff --git a/Makefile.am b/Makefile.am
index bd9ee00..69dbe3d 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -107,6 +107,7 @@ EXTRA_DIST = \
boot.sh \
build-aux/cccl \
build-aux/cksum-schema-check \
+   build-aux/calculate-schema-cksum \
build-aux/dist-docs \
build-aux/sodepends.pl \
build-aux/soexpand.pl \
diff --git a/build-aux/calculate-schema-cksum b/build-aux/calculate-schema-cksum
new file mode 100755
index 000..4ce9bf8
--- /dev/null
+++ b/build-aux/calculate-schema-cksum
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+schema=$1
+sed '/"cksum": *"[0-9][0-9]* [0-9][0-9]*",/d' $schema | cksum
diff --git a/build-aux/cksum-schema-check b/build-aux/cksum-schema-check
index 0fe37e4..335e645 100755
--- a/build-aux/cksum-schema-check
+++ b/build-aux/cksum-schema-check
@@ -3,7 +3,8 @@
 schema=$1
 stamp=$2
 
-sum=`sed '/cksum/d' $schema | cksum`
+cksumcheckpath=`dirname $0`
+sum=`$cksumcheckpath/calculate-schema-cksum $schema`
 expected=`sed -n 's/.*"cksum": "\(.*\)".*/\1/p' $schema`
 if test "X$sum" = "X$expected"; then
 touch $stamp
-- 
1.9.1
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] cksum: Refine schema cksum validation

2016-04-12 Thread Rodriguez Betancourt, Esteban

By separating it in a new script, other tools can rely on this
"canonical way to calculate the cksum of a schema within this project"
(as this calculation is "implementation specific", as defined in the RFC).

We have some tools that modify the schema/generates schema files and
each had to implement this calculation, probably in a different way
(in fact, that's how we discover that the current cksum-schema-check
was calculating a different value that our scripts).

And last, but not least, developers that modify the schema
could find useful the script, if they wish to automate things like
cksum updating, or schema indentation.

Thanks,
Esteban

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: domingo, 10 de abril de 2016 13:27
> To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] cksum: Refine schema cksum validation
> 
> On Thu, Apr 07, 2016 at 10:43:20PM +, Rodriguez Betancourt, Esteban
> wrote:
> > Calculates the cksum removing the cksum line using a more strict regex
> > than the used previously.
> > It fixes a problem when calculating the cksum of a schema that has
> > fields with the substring cksum (e.g.: a checksum column), lines that
> > the previous cksum calculation incorrectly removes before running
> > cksum.
> >
> > Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>
> 
> Why does this introduce a new script?
> 
> Thanks,
> 
> Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] cksum: Refine schema cksum validation

2016-04-07 Thread Rodriguez Betancourt, Esteban

Calculates the cksum removing the cksum line using a more
strict regex than the used previously.
It fixes a problem when calculating the cksum of a schema that
has fields with the substring cksum (e.g.: a checksum column),
lines that the previous cksum calculation incorrectly removes
before running cksum.

Signed-off-by: Esteban Rodriguez Betancourt 
---
Makefile.am  | 1 +
build-aux/calculate-schema-cksum | 4 
build-aux/cksum-schema-check | 3 ++-
3 files changed, 7 insertions(+), 1 deletion(-)
create mode 100755 build-aux/calculate-schema-cksum

diff --git a/Makefile.am b/Makefile.am
index bd9ee00..69dbe3d 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -107,6 +107,7 @@ EXTRA_DIST = \
   boot.sh \
   build-aux/cccl \
   build-aux/cksum-schema-check \
+ build-aux/calculate-schema-cksum \
   build-aux/dist-docs \
   build-aux/sodepends.pl \
   build-aux/soexpand.pl \
diff --git a/build-aux/calculate-schema-cksum b/build-aux/calculate-schema-cksum
new file mode 100755
index 000..4ce9bf8
--- /dev/null
+++ b/build-aux/calculate-schema-cksum
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+schema=$1
+sed '/"cksum": *"[0-9][0-9]* [0-9][0-9]*",/d' $schema | cksum
diff --git a/build-aux/cksum-schema-check b/build-aux/cksum-schema-check
index 0fe37e4..335e645 100755
--- a/build-aux/cksum-schema-check
+++ b/build-aux/cksum-schema-check
@@ -3,7 +3,8 @@
schema=$1
stamp=$2
-sum=`sed '/cksum/d' $schema | cksum`
+cksumcheckpath=`dirname $0`
+sum=`$cksumcheckpath/calculate-schema-cksum $schema`
expected=`sed -n 's/.*"cksum": "\(.*\)".*/\1/p' $schema`
if test "X$sum" = "X$expected"; then
 touch $stamp
--
1.9.1
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH 1/4 v2] ovsdb-idl: Compound Indexes Design Document

2016-04-06 Thread Rodriguez Betancourt, Esteban

In the work made in our projects, it was found the need to have a faster
access to the rows contained in tables in the replica, as well to have
the possibility to loop over a subset of rows that meet some specified
criteria.
Those needs lead us to design and implement a functionality that
satisfies those requirements, so an implementation of special indexes were
done.
In order to keep the OVSDB server implementation unmodified and avoid
extra load of processing, the indexes are created as part of the IDL.
The indexes are created as part of the initialization of the replica request
and are maintained automatically when there are changes in the replica.

This document explains the design rationale of the compound indexes feature.

Signed-off-by: Javier Albornoz 
Signed-off-by: Esteban Rodriguez Betancourt 
Signed-off-by: Jorge Arturo Sauma Vargas 
Co-authored-by: Javier Albornoz 
Co-authored-by: Esteban Rodriguez Betancourt 
Co-authored-by: Jorge Arturo Sauma Vargas 
---
 Documentation/automake.mk   |   1 +
 Documentation/c-idl-compound-indexes.md | 226 
 2 files changed, 227 insertions(+)
 create mode 100644 Documentation/c-idl-compound-indexes.md

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 5903c22..011855b 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -1,4 +1,5 @@
 docs += \
+   Documentation/c-idl-compound-indexes.md \
Documentation/committer-responsibilities.md \
Documentation/committer-grant-revocation.md \
Documentation/group-selection-method-property.txt
diff --git a/Documentation/c-idl-compound-indexes.md 
b/Documentation/c-idl-compound-indexes.md
new file mode 100644
index 000..1cdf1f8
--- /dev/null
+++ b/Documentation/c-idl-compound-indexes.md
@@ -0,0 +1,226 @@
+C IDL Compound Indexes
+==
+
+## Introduction
+
+This document describes the design and usage of the C IDL Compound Indexes
+feature that allows the developer to create indexes over any number of columns
+on the IDL side, and query them.
+
+This feature works completely on the IDL, without requiring changes to the
+OVSDB Server, OVSDB Protocol (OVSDB RFC (RFC 7047)) or
+performing additional communication with the server.
+
+Please note that in this document, the term "index" refers to the common
+database term defined as "a data structure that improves data retrieval". 
Unless
+stated otherwise, the definition for index from the OVSDB RFC (RFC 7047) is not
+used.
+
+## Typical Use Cases
+
+### Fast lookups
+
+Depending on the topology, the route table of a network device could manage
+thousands of routes. Commands such as "show ip route <*specific route*>" would
+need to do a sequential lookup of the routing table to find the specific route.
+With an index created, the lookup time could be faster.
+
+This same scenario could be applied to other features such as Access List rules
+and even interfaces lists.
+
+### Lexicographic order
+
+There are several cases where retrieving data in lexicographic order is needed.
+For example, SNMP. When an administrator or even a NMS would like to retrieve
+data from a specific device, it's possible that they will request data from 
full
+tables instead of just specific values. Also, they would like to have this
+information displayed in lexicographic order. This operation could be done by
+the SNMP daemon or by the CLI, but it would be better if the database could
+provide the data ready for consumption. Also, duplicate efforts by different
+processes will be avoided. Another use case for requesting data in 
lexicographic
+order is for user interfaces (web or CLI) where it would be better and quicker
+if the DB sends the data sorted instead of letting each process to sort the 
data
+by itself.
+
+## Implementation Design
+
+This feature maintains a collection of indexes per table. The developer can
+define any number of indexes per table.
+
+An index can be defined over any number of columns, and support the following
+options:
+
+-   Add a column with type string, int or real (using default comparators).
+-   Select ordering direction of a column (must be selected when creating the
+index).
+-   Use a custom iterator (eg: treat a string column like a IP, or sort by the
+value of "config" key in a map).
+
+For querying the index the user needs to create a cursor. That cursor points to
+a position in the index. With that, the user can perform lookups
+(by key) and/or get the following rows. The user can also compare the current
+value of the cursor to a record.
+
+For faster lookups, user would need to provide a key which will be used for 
finding
+the specific rows that meet this criteria. This key could be an IP address, a
+MAC address, an ACL rule, etc. When the information is found in the data
+structure the user's

[ovs-dev] [PATCH 3/4 v2] ovsdb-idl: IDL Compound Indexes Implementation

2016-04-06 Thread Rodriguez Betancourt, Esteban

In the C IDL, allows to create multicolumn indexes in the
tables, that are keep synched with the data in the replica.

Signed-off-by: Esteban Rodriguez Betancourt 
---
lib/ovsdb-idl-provider.h |  30 
lib/ovsdb-idl.c  | 378 ++-
lib/ovsdb-idl.h  |  52 +++
3 files changed, 459 insertions(+), 1 deletion(-)

diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 027f79b..d9597be 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -1,4 +1,5 @@
/* Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -49,6 +50,7 @@ struct ovsdb_idl_column {
 bool mutable;
 void (*parse)(struct ovsdb_idl_row *, const struct ovsdb_datum *);
 void (*unparse)(struct ovsdb_idl_row *);
+int (*compare)(const void *, const void *); /* Perform a comparison over 
ovsrec_* */
};
 struct ovsdb_idl_table_class {
@@ -69,6 +71,7 @@ struct ovsdb_idl_table {
 struct hmap rows;/* Contains "struct ovsdb_idl_row"s. */
 struct ovsdb_idl *idl;   /* Containing idl. */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
+struct shash indexes;/* Contains "struct ovsdb_idl_index"s */
 struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
};
@@ -78,6 +81,33 @@ struct ovsdb_idl_class {
 size_t n_tables;
};
+/*
+ * Contains the configuration per column of the index
+ */
+struct ovsdb_idl_index_column {
+const struct ovsdb_idl_column *column;
+column_comparator *comparer;
+int sorting_order;
+};
+
+/*
+ * Defines a IDL compound index
+ */
+struct ovsdb_idl_index {
+struct skiplist *skiplist;  /* Skiplist with pointer to
+ * rows*/
+struct ovsdb_idl_index_column *columns; /* Columns configuration */
+size_t n_columns;   /* Number of columns in index 
*/
+size_t alloc_columns;   /* Size allocated memory for
+ * columns, comparers and
+ * sorting order */
+bool row_sync;  /* Determines if the replica
+ * is modifying its content or
+ * not */
+const struct ovsdb_idl_table *table;/* Table that owns this index 
*/
+const char* index_name; /* The name of this index */
+};
+
struct ovsdb_idl_row *ovsdb_idl_get_row_arc(
 struct ovsdb_idl_row *src,
 struct ovsdb_idl_table_class *dst_table,
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 3ab05a3..06be5e8 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -1,4 +1,5 @@
/* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -36,8 +37,10 @@
#include "ovsdb-parser.h"
#include "poll-loop.h"
#include "shash.h"
+#include "skiplist.h"
#include "sset.h"
#include "util.h"
+#include "uuid.h"
#include "openvswitch/vlog.h"
 VLOG_DEFINE_THIS_MODULE(ovsdb_idl);
@@ -204,6 +207,16 @@ ovsdb_idl_table_from_class(const struct ovsdb_idl *,
const struct ovsdb_idl_table_class *);
static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);
+static int
+ ovsdb_idl_index_generic_comparer(const void *, const void *, const void *);
+static struct ovsdb_idl_index *ovsdb_idl_create_index_(const struct
+   ovsdb_idl_table *table,
+   size_t allocated_cols);
+static void
+ ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *table);
+static void ovsdb_idl_add_to_indexes(const struct ovsdb_idl_row *);
+static void ovsdb_idl_remove_from_indexes(const struct ovsdb_idl_row *);
+
/* Creates and returns a connection to database 'remote', which should be in a
  * form acceptable to jsonrpc_session_open().  The connection will maintain an
  * in-memory replica of the remote database whose schema is described by
@@ -250,6 +263,7 @@ ovsdb_idl_create(const char *remote, const struct 
ovsdb_idl_class *class,
 memset(table->modes, default_mode, tc->n_columns);
 table->need_table = false;
 shash_init(>columns);
+shash_init(>indexes);
 for (j = 0; j < tc->n_columns; j++) {
 const struct ovsdb_idl_column *column = >columns[j];
@@ -285,6 +299,8 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl)
 for (i = 0; i < idl->class->n_tables; i++) {

[ovs-dev] [PATCH 4/4 v2] ovsdb-idl: Autogenerated functions for compound indexes

2016-04-06 Thread Rodriguez Betancourt, Esteban

From:  Arnoldo Lutz Guevara 

Generates and fill the default comparators for columns with
type int, real, string. Also creates the macros that allow
to iterate over the contents of the index, and perform
queries.

Signed-off-by: Arnoldo Lutz Guevara 
Signed-off-by: Esteban Rodriguez Betancourt 
---
ovsdb/ovsdb-idlc.in | 116 +
tests/idltest.ovsschema |   6 +-
tests/ovsdb-idl.at  | 267 
tests/test-ovsdb.c  | 213 +-
4 files changed, 599 insertions(+), 3 deletions(-)

diff --git a/ovsdb/ovsdb-idlc.in b/ovsdb/ovsdb-idlc.in
index 26b0de4..38c0d51 100755
--- a/ovsdb/ovsdb-idlc.in
+++ b/ovsdb/ovsdb-idlc.in
@@ -8,9 +8,15 @@ import sys
import ovs.json
import ovs.db.error
import ovs.db.schema
+from ovs.db.types import StringType, IntegerType, RealType
 argv0 = sys.argv[0]
+def isColumnIndexable(column):
+return not column.type.is_map()  and column.type.key.is_valid() \
+   and (column.type.is_scalar())  and \
+column.type.key.toAtomicType() in ['OVSDB_TYPE_STRING', 
'OVSDB_TYPE_REAL', 'OVSDB_TYPE_INTEGER']
+
 def parseSchema(filename):
 return ovs.db.schema.IdlSchema.from_json(ovs.json.from_file(filename))
@@ -186,6 +192,25 @@ const struct %(s)s *%(s)s_track_get_next(const struct 
%(s)s *);
  (ROW); \\
  (ROW) = %(s)s_track_get_next(ROW))
+int %(s)s_index_compare(struct ovsdb_idl_index_cursor *, const struct %(s)s *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_first(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_next(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_find(struct ovsdb_idl_index_cursor *, const 
struct %(s)s *);
+const struct %(s)s *%(s)s_index_forward_to(struct ovsdb_idl_index_cursor *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_get_data(const struct ovsdb_idl_index_cursor 
*);
+#define %(S)s_FOR_EACH_RANGE(ROW, CURSOR, FROM, TO) \\
+for ((ROW) = %(s)s_index_forward_to(CURSOR, FROM); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, TO) <= 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_EQUAL(ROW, CURSOR, KEY) \\
+for ((ROW) = %(s)s_index_find(CURSOR, KEY); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, KEY) == 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_BYINDEX(ROW, CURSOR) \\
+for ((ROW) = %(s)s_index_first(CURSOR); \\
+ (ROW); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+
void %(s)s_init(struct %(s)s *);
void %(s)s_delete(const struct %(s)s *);
struct %(s)s *%(s)s_insert(struct ovsdb_idl_txn *);
@@ -216,6 +241,10 @@ bool %(s)s_is_updated(const struct %(s)s *, enum 
%(s)s_column_id);
 print '%s);' % ', '.join(args)
 print
+for columnName, column in sorted(table.columns.iteritems()):
+if isColumnIndexable(column):
+print 'int %(s)s_index_%(c)s_cmp(const void *, const void *);' 
% {'s': structName, 'c': columnName}
+print
 # Table indexes.
 printEnum("%stable_id" % prefix.lower(), ["%sTABLE_%s" % (prefix.upper(), 
tableName.upper()) for tableName in sorted(schema.tables)] + ["%sN_TABLES" % 
prefix.upper()])
@@ -746,7 +775,90 @@ const struct ovsdb_datum *
'S': structName.upper(),
'C': columnName.upper()}
 print "}"
+# Column Index compare functions
+for columnName, column in sorted(table.columns.iteritems()):
+type = column.type
+funcDict = {'OVSDB_TYPE_STRING': 'str', 'OVSDB_TYPE_REAL': 
'double', 'OVSDB_TYPE_INTEGER': 'int'}
+if isColumnIndexable(column):
+print '''
+/*  Call an internal function defined to compare  "%(f)s" type columns for 
"%(c)s" columns
+in "%(s)s" tables.
+Parameters: void *row1, void * row2. Data to be compared. Must be of type 
corresponding the record of
+the table.
+Return value: 0 if both data values are equal, -1 if first parameter is 
less than second and 1 otherwise.
+For internal use only. Not recomended to be called directly. */ ''' % {'s' 
: structName, 'c' : columnName, 'f': type.key.toAtomicType()}
+print 'int'
+print '%(s)s_index_%(c)s_cmp(const void *row1, const void 
*row2)' % \
+{'s': structName, 'c': columnName}
+print '{'
+print 'struct %(s)s *data1 = (struct %(s)s *)row1;' % { 
's' : structName }
+print 'struct %(s)s *data2 = (struct %(s)s *)row2;' % { 
's' : structName }
+print 'return ovsdb_idl_index_%(f)scmp(data1->%(c)s, 
data2->%(c)s);' % \
+{'c': columnName, 'f': funcDict[type.key.toAtomicType()] }
+print "}"
+# Index table related functions

[ovs-dev] [PATCH 2/4 v2] lib:Data structures: Skiplist implementation

2016-04-06 Thread Rodriguez Betancourt, Esteban

Skiplist implementation intended for the IDL compound indexes
feature.

Signed-off-by: Esteban Rodriguez Betancourt 
---
lib/automake.mk   |   2 +
lib/skiplist.c| 313 ++
lib/skiplist.h|  54 +
tests/.gitignore  |   1 +
tests/automake.mk |   2 +
tests/library.at  |  11 ++
tests/test-skiplist.c | 210 +
7 files changed, 593 insertions(+)
create mode 100644 lib/skiplist.c
create mode 100644 lib/skiplist.h
create mode 100644 tests/test-skiplist.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 7b829d0..5e08b44 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -220,6 +220,8 @@ lib_libopenvswitch_la_SOURCES = \
   lib/shash.h \
   lib/simap.c \
   lib/simap.h \
+ lib/skiplist.c \
+ lib/skiplist.h \
   lib/smap.c \
   lib/smap.h \
   lib/socket-util.c \
diff --git a/lib/skiplist.c b/lib/skiplist.c
new file mode 100644
index 000..fcf24f6
--- /dev/null
+++ b/lib/skiplist.c
@@ -0,0 +1,313 @@
+/* Copyright (C) 2016 Hewlett Packard Enterprise Development LP
+ * All Rights Reserved.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may
+ * not use this file except in compliance with the License. You may obtain
+ * a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*
+ * Skiplist implementationn based on:
+ * "Skip List: A Probabilistic Alternative to Balanced Trees",
+ * by William Pugh.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "skiplist.h"
+#include "random.h"
+#include "util.h"
+
+/*
+ * The primary usage of the skiplists are the compound indexes
+ * at the IDL.
+ * For that use case 32 height levels is more than enough as
+ * it could indexes a table with 4.294.967.296 rows.
+ * In case that a new use case require more than that then this
+ * number can be changed, but also the way in which the random
+ * numbers are generated must be changed.
+ */
+#define SKIPLIST_MAX_LEVELS 32
+
+/*
+ * Skiplist node container
+ */
+struct skiplist_node {
+const void *data;   /* Pointer to saved data */
+uint64_t height;/* Height of this node */
+struct skiplist_node *forward[];/* Links to the next nodes */
+};
+
+/*
+ * Skiplist container
+ */
+
+struct skiplist {
+struct skiplist_node *header;   /* Pointer to head node (not first
+ * data node) */
+skiplist_comparator *cmp;   /* Pointer to the skiplist's comparison
+ * function */
+void *cfg;  /* Pointer to optional comparison
+ * configuration, used by the comparator */
+int max_levels; /* Max levels of the skiplist. */
+int64_t size;   /* Current size of the skiplist. */
+int64_t level;  /* Max number of levels used in this skiplist 
*/
+void (*free_func) (void *); /* Function that free the value's memory */
+};
+
+static int skiplist_determine_level(struct skiplist *sl);
+
+static struct skiplist_node *skiplist_create_node(int, const void *);
+
+static struct skiplist_node *skiplist_forward_to_(struct skiplist *sl,
+  const void *value,
+  struct skiplist_node
+  **update);
+
+/*
+ * skiplist_create returns a new skiplist, configured with given max_levels,
+ * data comparer and configuration.
+ * Sets a probability of 0.5 (RAND_MAX / 2).
+ */
+struct skiplist *
+skiplist_create(int max_levels, skiplist_comparator object_comparator,
+void *configuration)
+{
+random_init();
+struct skiplist *sl;
+
+sl = xmalloc(sizeof (struct skiplist));
+sl->cfg = configuration;
+sl->max_levels = max_levels < SKIPLIST_MAX_LEVELS ?
+max_levels : SKIPLIST_MAX_LEVELS;
+sl->size = 0;
+sl->level = 0;
+sl->cmp = object_comparator;
+sl->header = skiplist_create_node(sl->max_levels, NULL);
+sl->free_func = NULL;
+
+return sl;
+}
+
+/*
+ * Set a custom function that free the value's memory when
+ * destroying the skiplist.
+ */
+void
+skiplist_set_free_func(struct skiplist *sl, void (*func) (void *))
+{
+sl->free_func = func;
+}
+
+/*
+ * Determines the correspondent level for a skiplist node.
+ * Guarantees that the returned integer is less or equal
+ * to the current height of the skiplist plus

Re: [ovs-dev] [PATCH 3/5] ovsdb-idl: IDL Compound Indexes Implementation

2016-04-06 Thread Rodriguez Betancourt, Esteban

> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: martes, 22 de marzo de 2016 14:55
> To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 3/5] ovsdb-idl: IDL Compound Indexes 
> Implementation
> 
> On Wed, Mar 09, 2016 at 12:03:54AM +0000, Rodriguez Betancourt, 
> Esteban
> wrote:
> > In the C IDL, allows to create multicolumn indexes in the tables, 
> > that are keep synched with the data in the replica.
> >
> > Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>
> 
> Until now, ovsdb-idl.h has been divided into sections separated by 
> page- breaks (control-L), and each section has had a comment 
> explaining basically what that section describes.  Some of the 
> sections, like the one for transactions, has a lot of detail.  This 
> commit adds a new section, but it doesn't add a page break or an 
> explanatory comment.  I think it should do both.  The design document 
> from patch 1 might be a good place from which to draw for the comment.

Changed in v2 patch.

> This patch and patch 2 has a couple of typedefs for function pointers.
> CodingStyle.md says:
> 
>   A function type is a good use for a typedef because it can clarify
> code.  The type should be a function type, not a pointer-to-function
> type.  That way, the typedef name can be used to declare function
> prototypes.  (It cannot be used for function definitions, because that
> is explicitly prohibited by C89 and C99.)

Changed in v2 patch.

> Do you think that the sorting order in struct ovsdb_idl_index is 
> useful, that is, do you see a common use for indexes in descending 
> order?  If indexes in descending order are only rarely useful, then 
> they could be implemented through a custom comparison function.

This is more needed by tools that expose the data to a user, and the user wants 
to sort the data in different ways.
Also this enables the option to have in a single index some columns in one 
direction and other columns in the other direction.

> I don't understand, in ovsdb_idl_index_compare(), why NULL != NULL.  
> It seems to me that this is risky and likely to cause confusion, if 
> not now then later.
Changed in v2 patch.

> I don't understand why indexes have string names.  It seems like kind 
> of a curious design.  Why isn't the client expected to retain a 
> pointer to the index that it wants to traverse, instead of a name?

One reason is to prevent the developer from modifying the values of the index by
referencing it using a string instead of giving direct access to the index data 
structure.
The other one is to allow a more intuitive way to refer to the indexes, and to
select them when programming. 

> Please use xmalloc() instead of malloc().
Changed in v2 patch.

> I found myself wondering whether there's value in doing dynamic 
> allocation of the three parallel arrays inside struct ovsdb_idl_index.
> It would be very unusual to have an index over more than, say, 3 
> columns, and so it might be easier to have fixed-size arrays of 3 
> elements (or a few more, to allow room for expansion).  I imagine that 
> the columns in indexes will be fixed at build time, after all.
> 
> ovsdb_idl_index_add_column() seems to be working hard to avoid using 
> xrealloc(), but I don't know why.

Changed in v2 patch. The indexes support any number of "columns" for doing the 
comparison. But that number isn't limited to the number of columns in a row, 
because eventually one column can have several values indexed, for example, 
different keys from a map column.

For some "queries", the need of filtering the data first raise the number of 
columns need to 5 or more.

> I don't understand why only certain columns have a default comparison 
> function.  Why not use ovsdb_datum_compare_3way() as the default for 
> every column?  I guess that the ovsdb_datums and ovsdb_type would have 
> to be passed into column_comparator instead of a pair of void * 
> pointers, but that would arguably be an improvement anyhow, certainly 
> from a type- safety viewpoint.

We decided against using the datums for the comparison because it seems 
intuitive that the parsed values in the ovsrec are easier and faster to compare.

In some cases, a table could have a really really high number of rows, and 
receive a lot of updates per second, so is important that the comparison 
functions are fast.

> Do you have an example of a useful use for a custom comparison function?

Sorting a string column, that contains IPv4 or IPv6 values.
Sort an specific member inside a map column Sort a string as a number Sort 
strings as dates Sort by a value contained in a referenced row

> Did you consider defining a struct of three members instead of using 
> three parallel

[ovs-dev] [PATCH v2 1/3] ovsdb-idl: Add support for on-demand columns

2016-03-18 Thread Rodriguez Betancourt, Esteban

From: Sebastian Arguello 

The IDL only supports reading from columns that are being monitored.
In the case where the column represent a frequently changing entity (e.g. 
counter),
and the reads are relatively infrequent (e.g. CLI client), there is a
significant overhead in replication.

This patch introduces a new column mode called OVSDB_IDL_ON_DEMAND.
An on-demand column will have its default value if it has never been updated.
This can happen if: 1) the user has not called explicitly a fetch operation
over it or 2) the server reply with the actual value has not been
received/processed (i.e. ovsdb_idl_run() has not been called). The on-demand
columns will keep the last value received from the OVSDB server.

The on-demand columns are not updated by the IDL automatically, they are
updated when the IDL user asks it by the calling one of the new fetching
functions that were added to the IDL. When this happens, the IDL sends a select
operation to request the data from the server. After calling ovsdb_idl_run(),
the IDL updates the replica with the information received from the server.

With this new column mode, the state of the replica could diverge from the
state of the database, as some of the columns could be outdated. The process
using the IDL is responsible for requesting the information before using it.

The main user visible changes in this patch are:
  - There is a new function that adds on-demand columns:
ovsdb_idl_add_on_demand_column()
  - Functions for fetching a cells (columns for specific rows),
columns, and table were added: ovsdb_idl_fetch_row(),
ovsdb_idl_fetch_column(), and ovsdb_idl_fetch_table()
  - Functions for verifying if the fetch requests of on-demand columns
were processed were added: ovsdb_idl_is_row_fetch_pending(),
ovsdb_idl_is_column_fetch_pending(), ovsdb_idl_is_table_fetch_pending()
  - When an on-demand column is updated, the IDL seqno is changed as well

Note that the Python IDL already has a feature similar to this called
Read-only columns.

Signed-off-by: Sebastian Arguello 
Signed-off-by: Esteban Rodriguez Betancourt 
---
This is a change over the first patch sent. The other two patches remain 
without changes.
The pull request https://github.com/openvswitch/ovs/pull/110 was also updated.
 lib/ovsdb-idl-provider.h |  11 ++
 lib/ovsdb-idl.c  | 470 ++-
 lib/ovsdb-idl.h  |  32 
 3 files changed, 511 insertions(+), 2 deletions(-)

diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 190acca..b1e18d7 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -41,6 +42,10 @@ struct ovsdb_idl_row {
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
 struct ovs_list track_node; /* Rows modified/added/deleted by IDL */
 unsigned long int *updated; /* Bitmap of columns updated by IDL */
+
+size_t outstanding_fetch_reqs; /* Number of on-demand columns in this row
+  with on-going fetch operations */
+
 };
 
 struct ovsdb_idl_column {
@@ -70,6 +75,12 @@ struct ovsdb_idl_table {
 struct ovsdb_idl *idl;   /* Containing idl. */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
 struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
+bool has_pending_fetch; /* Indicates if the table has a pending fetch
+operation */
+struct shash outstanding_col_fetch_reqs; /* Contains the name of the
+columns with on-demand fetch
+request pending. It does not
+store any data, just keys */
 };
 
 struct ovsdb_idl_class {
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 4cb1c81..b29ca57 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -83,6 +84,15 @@ enum ovsdb_idl_state {
 IDL_S_MONITORING2
 };
 
+/* Keeps the information of fetch request for on-demand fetch columns. */
+struct ovsdb_idl_fetch_node {
+   struct hmap_node hmap_node;  /* To store this structure in hmaps */
+   struct shash columns;/* Contains the columns requested */
+   struct ovsdb_idl_table *table;   /* Pointer to the requested table */
+   enum ovsdb_idl_fetch_type fetch_type; /* Type of

[ovs-dev] [PATCH] install.md: Suggest jemalloc memory allocator

2016-03-15 Thread Rodriguez Betancourt, Esteban

Change installing documentation to suggest to use
jemalloc memory allocator.

This memory allocator showed great performance gains
when used at ovsdb-server and other components.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 .travis.yml | 3 +++
 INSTALL.md  | 7 +++
 2 files changed, 10 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 2b262e4..6618073 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -10,6 +10,8 @@ addons:
   - gcc-multilib
   - libssl-dev
   - llvm-dev
+  - libjemalloc1
+  - libjemalloc-dev
 
 before_install: ./.travis/prepare.sh
 
@@ -30,6 +32,7 @@ env:
   - KERNEL=3.14.60
   - KERNEL=3.12.53
   - KERNEL=3.10.96
+  - TESTSUITE=1 LIBS=-ljemalloc
 
 script: ./.travis/build.sh $OPTS
 
diff --git a/INSTALL.md b/INSTALL.md
index 9dadcee..fa7b40e 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -247,6 +247,13 @@ Here is an example:
   `% mkdir _gcc && (cd _gcc && ../configure CC=gcc)`
   `% mkdir _clang && (cd _clang && ../configure CC=clang)`
 
+Under certains loads the ovsdb-server and other components perform
+better when using the jemalloc memory allocator, instead of the GLibC
+memory allocator.
+
+If you wish to link with jemalloc add it to LIBS:
+
+  `% ./configure LIBS=-ljemalloc`
 
 Building the Sources
 
-- 
1.9.1
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH] Add --enable-jemalloc argument to build

2016-03-15 Thread Rodriguez Betancourt, Esteban

We opted for the flag option because we wanted to affect just OVSDB-Server, and 
not other programs. We don't have any specific issue with linking the other 
programs with jemalloc, but we don't have tests for the other programs so we 
are not sure if performance gain would be the same for them. We don't want to 
use jemalloc if there isn't a real benefit.  

We had checked that the programs that use the C IDL  performs better when 
linked with jemalloc, but again, we prefer to stay on the safe side and don't 
link the IDL with jemalloc, and keeping that decision to the specific 
developers.

Thanks,
Esteban

-Original Message-
From: Ben Pfaff [mailto:b...@ovn.org] 
Sent: lunes, 14 de marzo de 2016 11:10
To: Rodriguez Betancourt, Esteban <esteb...@hpe.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] Add --enable-jemalloc argument to build

On Tue, Feb 23, 2016 at 10:21:21PM +, Rodriguez Betancourt, Esteban wrote:
> During our tests with OVSDB we found out that memory allocation is 
> used intensively by the server, so we thought that using a different 
> memory allocator could increase the performance. We tried jemalloc and 
> the performance gain was between 20% and 40%.
> 
> This patch would allow anyone to enable jemalloc on their systems in 
> case you want to try out this memory allocator.
> 
> The patch adds the --enable-jemalloc flag to link ovsdb-server with 
> jemalloc instead of the default memory allocator. This can improve the 
> OVSDB Server performance under certains loads.
> 
> Signed-off-by: Esteban Rodriguez Betancourt <esteb...@hpe.com>

I've been thinking about this for a while.  It's nice that jemalloc can speed 
up ovsdb-server, and I'm happy to find that out.  Given that, I'm also happy to 
encourage users to use jemalloc.  But this patch seems like a bit of overkill.  
In the end, I guess I have a couple of
questions:

Why is it worthwhile to add a new configure flag, instead of just encouraging 
users to run "./configure LIBS=-ljemalloc"?

Why is jemalloc only suitable for ovsdb-server?

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH 5/5] tests:ovsdb-idl: Add testing for compound index feature

2016-03-08 Thread Rodriguez Betancourt, Esteban

This commit includes the tests to verify compound index behavior.
Tests are divide in single and double column indexes and skiplist.

Signed-off-by: Arnoldo Lutz Guevara 
Signed-off-by: Esteban Rodriguez Betancourt 
Co-authored-by: Arnoldo Lutz Guevara 
Co-authored-by: Esteban Rodriguez Betancourt 
---
 tests/.gitignore|   1 +
 tests/automake.mk   |   2 +
 tests/idltest.ovsschema |   6 +-
 tests/library.at|  11 ++
 tests/ovsdb-idl.at  | 267 
 tests/test-ovsdb.c  | 213 +-
 tests/test-skiplist.c   | 210 +
 7 files changed, 707 insertions(+), 3 deletions(-)
 create mode 100644 tests/test-skiplist.c

diff --git a/tests/.gitignore b/tests/.gitignore
index f4540a3..ed017c1 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -38,6 +38,7 @@
 /test-rstp
 /test-sflow
 /test-sha1
+/test-skiplist
 /test-stp
 /test-strtok_r
 /test-timeval
diff --git a/tests/automake.mk b/tests/automake.mk
index ef0a2f6..54955b9 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -167,6 +167,7 @@ valgrind_wrappers = \
tests/valgrind/test-reconnect \
tests/valgrind/test-rstp \
tests/valgrind/test-sha1 \
+   tests/valgrind/test-skiplist \
tests/valgrind/test-stp \
tests/valgrind/test-type-props \
tests/valgrind/test-unix-socket \
@@ -313,6 +314,7 @@ tests_ovstest_SOURCES = \
tests/test-rstp.c \
tests/test-sflow.c \
tests/test-sha1.c \
+   tests/test-skiplist.c \
tests/test-stp.c \
tests/test-unixctl.c \
tests/test-util.c \
diff --git a/tests/idltest.ovsschema b/tests/idltest.ovsschema
index 1d073aa..6c7f8a9 100644
--- a/tests/idltest.ovsschema
+++ b/tests/idltest.ovsschema
@@ -34,7 +34,8 @@
 "min": 0
   }
 }
-  }
+  },
+  "isRoot":true
 }, 
 "link2": {
   "columns": {
@@ -104,7 +105,8 @@
 "min": 0
   }
 }
-  }
+  },
+  "isRoot":true
 }
   }
 }
diff --git a/tests/library.at b/tests/library.at
index d5dcb12..4542156 100644
--- a/tests/library.at
+++ b/tests/library.at
@@ -51,6 +51,17 @@ AT_CHECK([ovstest test-sha1], [0], [.
 ])
 AT_CLEANUP
 
+AT_SETUP([test skiplist])
+AT_KEYWORDS([skiplist])
+AT_CHECK([ovstest test-skiplist], [0], [skiplist insert
+skiplist delete
+skiplist find
+skiplist forward_to
+skiplist random
+
+])
+AT_CLEANUP
+
 AT_SETUP([test type properties])
 AT_CHECK([test-type-props])
 AT_CLEANUP
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index ebf82a5..f4a90fd 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -776,3 +776,270 @@ OVSDB_CHECK_IDL_TRACK([track, simple idl, initially 
empty, various ops],
 014: updated columns: ba i ia r ra s
 015: done
 ]])
+
+# Tests to verify the functionality of the one column compound index.
+# It tests index for one column string and integer indexes.
+# The run of test-ovsdb generates the output of the display of data using the 
different indexes defined in
+# the program.
+# Then, some at_checks are used to verify the correctness of the corresponding 
index as well as the existence
+# of all the rows involved in the test.
+m4_define([OVSDB_CHECK_IDL_COMPOUND_INDEX_SINGLE_COLUMN_C],
+  [AT_SETUP([$1 - C])
+   AT_KEYWORDS([ovsdb server idl compound_index_single_column compound_index 
positive $5])
+   AT_CHECK([ovsdb-tool create db $abs_srcdir/idltest.ovsschema],
+  [0], [stdout], [ignore])
+   AT_CHECK([ovsdb-server '-vPATTERN:console:ovsdb-server|%c|%m' --detach 
--no-chdir --pidfile="`pwd`"/pid --remote=punix:socket 
--unixctl="`pwd`"/unixctl db], [0], [ignore], [ignore])
+   m4_if([$2], [], [],
+ [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], 
[ignore], [kill `cat pid`])])
+# Generate the data to be tested.
+   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 -c 
idl-compound-index unix:socket $3],
+[0], [stdout], [ignore], [kill `cat pid`])
+# Filter the rows of data that corresponds to the string index eliminating the 
extra columns of data.
+# This is done to verifiy that the output data is in the correct and expected 
order.
+   AT_CHECK([[cat stdout | grep -oh '[0-9]\{3\}: s=.*' | sed -e 's/ i=.*//g']],
+[0], [$4], [], [kill `cat pid`])
+# Here, the data is filtered and sorted in order to have all the rows in the 
index and be
+# able to determined that all the involved rows are present.
+   AT_CHECK([[cat stdout | grep -oh '[0-9]\{3\}: s=.*' | sort -k 1,1n -k 2,2 
-k 3,3]],
+[0], [$5], [], [kill `cat pid`])
+# Filter the rows of data that corresponds to the integer index eliminating 
the extra columns of data.
+# This is done to verifiy that the output data is in the correct and expected 
order.
+

[ovs-dev] [PATCH 4/5] ovsdb-idl: Autogenerated functions for compound indexes

2016-03-08 Thread Rodriguez Betancourt, Esteban

From: Arnoldo Lutz Guevara 

Generates and fill the default comparators for columns with
type int, real, string. Also creates the macros that allow
to iterate over the contents of the index, and perform
queries.

Signed-off-by: Arnoldo Lutz Guevara 
Signed-off-by: Esteban Rodriguez Betancourt 
---
 ovsdb/ovsdb-idlc.in | 116 
 1 file changed, 116 insertions(+)

diff --git a/ovsdb/ovsdb-idlc.in b/ovsdb/ovsdb-idlc.in
index 26b0de4..38c0d51 100755
--- a/ovsdb/ovsdb-idlc.in
+++ b/ovsdb/ovsdb-idlc.in
@@ -8,9 +8,15 @@ import sys
 import ovs.json
 import ovs.db.error
 import ovs.db.schema
+from ovs.db.types import StringType, IntegerType, RealType
 
 argv0 = sys.argv[0]
 
+def isColumnIndexable(column):
+return not column.type.is_map()  and column.type.key.is_valid() \
+   and (column.type.is_scalar())  and \
+column.type.key.toAtomicType() in ['OVSDB_TYPE_STRING', 
'OVSDB_TYPE_REAL', 'OVSDB_TYPE_INTEGER']
+
 def parseSchema(filename):
 return ovs.db.schema.IdlSchema.from_json(ovs.json.from_file(filename))
 
@@ -186,6 +192,25 @@ const struct %(s)s *%(s)s_track_get_next(const struct 
%(s)s *);
  (ROW); \\
  (ROW) = %(s)s_track_get_next(ROW))
 
+int %(s)s_index_compare(struct ovsdb_idl_index_cursor *, const struct %(s)s *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_first(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_next(struct ovsdb_idl_index_cursor *);
+const struct %(s)s *%(s)s_index_find(struct ovsdb_idl_index_cursor *, const 
struct %(s)s *);
+const struct %(s)s *%(s)s_index_forward_to(struct ovsdb_idl_index_cursor *, 
const struct %(s)s *);
+const struct %(s)s *%(s)s_index_get_data(const struct ovsdb_idl_index_cursor 
*);
+#define %(S)s_FOR_EACH_RANGE(ROW, CURSOR, FROM, TO) \\
+for ((ROW) = %(s)s_index_forward_to(CURSOR, FROM); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, TO) <= 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_EQUAL(ROW, CURSOR, KEY) \\
+for ((ROW) = %(s)s_index_find(CURSOR, KEY); \\
+ ((ROW) && %(s)s_index_compare(CURSOR, ROW, KEY) == 0); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+#define %(S)s_FOR_EACH_BYINDEX(ROW, CURSOR) \\
+for ((ROW) = %(s)s_index_first(CURSOR); \\
+ (ROW); \\
+ (ROW) = %(s)s_index_next(CURSOR))
+
 void %(s)s_init(struct %(s)s *);
 void %(s)s_delete(const struct %(s)s *);
 struct %(s)s *%(s)s_insert(struct ovsdb_idl_txn *);
@@ -216,6 +241,10 @@ bool %(s)s_is_updated(const struct %(s)s *, enum 
%(s)s_column_id);
 print '%s);' % ', '.join(args)
 
 print
+for columnName, column in sorted(table.columns.iteritems()):
+if isColumnIndexable(column):
+print 'int %(s)s_index_%(c)s_cmp(const void *, const void *);' 
% {'s': structName, 'c': columnName}
+print
 
 # Table indexes.
 printEnum("%stable_id" % prefix.lower(), ["%sTABLE_%s" % (prefix.upper(), 
tableName.upper()) for tableName in sorted(schema.tables)] + ["%sN_TABLES" % 
prefix.upper()])
@@ -746,7 +775,90 @@ const struct ovsdb_datum *
'S': structName.upper(),
'C': columnName.upper()}
 print "}"
+# Column Index compare functions
+for columnName, column in sorted(table.columns.iteritems()):
+type = column.type
+funcDict = {'OVSDB_TYPE_STRING': 'str', 'OVSDB_TYPE_REAL': 
'double', 'OVSDB_TYPE_INTEGER': 'int'}
+if isColumnIndexable(column):
+print '''
+/*  Call an internal function defined to compare  "%(f)s" type columns for 
"%(c)s" columns
+in "%(s)s" tables.
+Parameters: void *row1, void * row2. Data to be compared. Must be of type 
corresponding the record of
+the table.
+Return value: 0 if both data values are equal, -1 if first parameter is 
less than second and 1 otherwise.
+For internal use only. Not recomended to be called directly. */ ''' % {'s' 
: structName, 'c' : columnName, 'f': type.key.toAtomicType()}
+print 'int'
+print '%(s)s_index_%(c)s_cmp(const void *row1, const void 
*row2)' % \
+{'s': structName, 'c': columnName}
+print '{'
+print 'struct %(s)s *data1 = (struct %(s)s *)row1;' % { 
's' : structName } 
+print 'struct %(s)s *data2 = (struct %(s)s *)row2;' % { 
's' : structName }
+print 'return ovsdb_idl_index_%(f)scmp(data1->%(c)s, 
data2->%(c)s);' % \
+{'c': columnName, 'f': funcDict[type.key.toAtomicType()] }
+print "}"
 
+# Index table related functions
+print '''
+/*  This function is used to compare "%(s)s" records on table in iterartion 
loops for compound-index operations.

[ovs-dev] [PATCH 3/5] ovsdb-idl: IDL Compound Indexes Implementation

2016-03-08 Thread Rodriguez Betancourt, Esteban

In the C IDL, allows to create multicolumn indexes in the
tables, that are keep synched with the data in the replica.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 lib/ovsdb-idl-provider.h |   3 +
 lib/ovsdb-idl.c  | 378 +++
 lib/ovsdb-idl.h  |  63 
 3 files changed, 444 insertions(+)

diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 190acca..423217e 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -49,6 +50,7 @@ struct ovsdb_idl_column {
 bool mutable;
 void (*parse)(struct ovsdb_idl_row *, const struct ovsdb_datum *);
 void (*unparse)(struct ovsdb_idl_row *);
+int (*compare)(const void *, const void *); /* Perform a comparison over 
ovsrec_* */
 };
 
 struct ovsdb_idl_table_class {
@@ -69,6 +71,7 @@ struct ovsdb_idl_table {
 struct hmap rows;/* Contains "struct ovsdb_idl_row"s. */
 struct ovsdb_idl *idl;   /* Containing idl. */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
+struct shash indexes;/* Contains "struct ovsdb_idl_index"s */
 struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
 };
 
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 4cb1c81..24920a0 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -1,4 +1,5 @@
 /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
+ * Copyright (C) 2016 Hewlett Packard Enterprise Development LP
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -36,8 +37,10 @@
 #include "ovsdb-parser.h"
 #include "poll-loop.h"
 #include "shash.h"
+#include "skiplist.h"
 #include "sset.h"
 #include "util.h"
+#include "uuid.h"
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(ovsdb_idl);
@@ -203,6 +206,15 @@ ovsdb_idl_table_from_class(const struct ovsdb_idl *,
const struct ovsdb_idl_table_class *);
 static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);
 
+static int
+ovsdb_idl_index_generic_comparer(const void *, const void *, const void *);
+static struct ovsdb_idl_index *
+ovsdb_idl_create_index_(const struct ovsdb_idl_table *table);
+static void
+ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *table);
+static void ovsdb_idl_add_to_indexes(const struct ovsdb_idl_row *);
+static void ovsdb_idl_remove_from_indexes(const struct ovsdb_idl_row *);
+
 /* Creates and returns a connection to database 'remote', which should be in a
  * form acceptable to jsonrpc_session_open().  The connection will maintain an
  * in-memory replica of the remote database whose schema is described by
@@ -249,6 +261,7 @@ ovsdb_idl_create(const char *remote, const struct 
ovsdb_idl_class *class,
 memset(table->modes, default_mode, tc->n_columns);
 table->need_table = false;
 shash_init(>columns);
+shash_init(>indexes);
 for (j = 0; j < tc->n_columns; j++) {
 const struct ovsdb_idl_column *column = >columns[j];
 
@@ -284,6 +297,7 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl)
 
 for (i = 0; i < idl->class->n_tables; i++) {
 struct ovsdb_idl_table *table = >tables[i];
+ovsdb_idl_destroy_indexes(table);
 shash_destroy(>columns);
 hmap_destroy(>rows);
 free(table->modes);
@@ -318,6 +332,7 @@ ovsdb_idl_clear(struct ovsdb_idl *idl)
 struct ovsdb_idl_arc *arc, *next_arc;
 
 if (!ovsdb_idl_row_is_orphan(row)) {
+ovsdb_idl_remove_from_indexes(row);
 ovsdb_idl_row_unparse(row);
 }
 LIST_FOR_EACH_SAFE (arc, next_arc, src_node, >src_arcs) {
@@ -1454,6 +1469,364 @@ ovsdb_idl_row_unparse(struct ovsdb_idl_row *row)
 }
 }
 
+/*
+ * Creates a new index, that is attached to the given idl and table.
+ * The index has the given name.
+ * All the indexes must be created before the first ovsdb_idl_run is
+ * executed.
+ */
+struct ovsdb_idl_index *
+ovsdb_idl_create_index(struct ovsdb_idl *idl,
+   const struct ovsdb_idl_table_class *tc,
+   const char *index_name)
+{
+size_t i;
+struct ovsdb_idl_index *index;
+for(i = 0; i < idl->class->n_tables; i++){
+struct ovsdb_idl_table *table = >tables[i];
+
+if (table->class == tc) {
+index = ovsdb_idl_create_index_(table);
+if(!shash_add_once(>indexes,
+  index_name,
+  index)){
+VLOG_ERR("Can't repeat index name '%s' at table %s",
+   index_name, table->class->name);
+

[ovs-dev] Subject: [PATCH 2/5] lib:Data structures: Skiplist implementation

2016-03-08 Thread Rodriguez Betancourt, Esteban

Skiplist implementation intended for the IDL compound indexes
feature.

Signed-off-by: Esteban Rodriguez Betancourt 
---
lib/automake.mk |   2 +
lib/skiplist.c  | 292 
lib/skiplist.h  |  53 ++
3 files changed, 347 insertions(+)
create mode 100644 lib/skiplist.c
create mode 100644 lib/skiplist.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 27a1669..1317f85 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -224,6 +224,8 @@ lib_libopenvswitch_la_SOURCES = \
   lib/shash.h \
   lib/simap.c \
   lib/simap.h \
+ lib/skiplist.c \
+ lib/skiplist.h \
   lib/smap.c \
   lib/smap.h \
   lib/socket-util.c \
diff --git a/lib/skiplist.c b/lib/skiplist.c
new file mode 100644
index 000..01e296c
--- /dev/null
+++ b/lib/skiplist.c
@@ -0,0 +1,292 @@
+/* Copyright (C) 2016 Hewlett Packard Enterprise Development LP
+ * All Rights Reserved.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may
+ * not use this file except in compliance with the License. You may obtain
+ * a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*
+ * Skiplist implementationn based on:
+ * "Skip List: A Probabilistic Alternative to Balanced Trees",
+ * by William Pugh.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "skiplist.h"
+#include "random.h"
+
+#define SKIPLIST_MAX_LEVELS 64
+
+/*
+ * Skiplist node container
+ */
+struct skiplist_node
+{
+const void* data;   /* Pointer to saved data */
+uint64_t height;/* Height of this node */
+struct skiplist_node *forward[];/* Links to the next nodes */
+};
+
+/*
+ * Skiplist container
+ */
+
+struct skiplist
+{
+struct skiplist_node *header;   /* Pointer to head node
+   (not first data node)*/
+skiplist_comparator cmp;/* Pointer to the skiplist's comparison
+   function*/
+void *cfg;  /* Pointer to optional comparison
+   configuration, used by the comparator */
+int max_levels; /* Max levels of the skiplist. */
+uint64_t probability;   /* Probability that the node would take an
+   additional skiplist level. */
+int64_t size;   /* Current size of the skiplist. */
+int64_t level;  /* Max number of levels used in this
+   skiplist*/
+void (*free_func)(void *);  /* Function that free the value's memory */
+};
+
+static int skiplist_determine_level(struct skiplist* sl);
+
+static struct skiplist_node* skiplist_create_node(int, const void *);
+
+static struct skiplist_node*
+skiplist_forward_to_(struct skiplist* sl, const void *value,
+ struct skiplist_node **update);
+
+/*
+ * skiplist_create returns a new skiplist, configured with given max_levels,
+ * data comparer and configuration.
+ * Sets a probability of 0.5 (RAND_MAX / 2).
+ */
+struct skiplist*
+skiplist_create(int max_levels, skiplist_comparator object_comparator,
+void *configuration)
+{
+random_init();
+struct skiplist *sl;
+sl = malloc(sizeof(struct skiplist));
+sl->cfg = configuration;
+sl->max_levels = max_levels < SKIPLIST_MAX_LEVELS ?
+max_levels : SKIPLIST_MAX_LEVELS;
+sl->size = 0;
+sl->level = 0;
+sl->cmp = object_comparator;
+sl->probability = UINT32_MAX / 2;
+sl->header = skiplist_create_node(sl->max_levels, NULL);
+sl->free_func = NULL;
+
+return sl;
+}
+
+/*
+ * Set a custom function that free the value's memory when
+ * destroying the skiplist.
+ */
+void
+skiplist_set_free_func(struct skiplist* sl, void (*func)(void *))
+{
+sl->free_func = func;
+}
+
+/*
+ * Determines the correspondent level for a skiplist node.
+ */
+static int
+skiplist_determine_level(struct skiplist* sl)
+{
+int lvl = 0;
+while(random_uint32() <= sl->probability && lvl < sl->max_levels){
+lvl++;
+}
+return lvl;
+}
+
+/*
+ * Creates a new skiplist_node with given levels and data.
+ */
+static struct skiplist_node*
+skiplist_create_node(int levels, const void *object)
+{
+struct skiplist_node *new_node = malloc(sizeof(struct skiplist_node) +
+  (levels+1) * sizeof(struct skiplist_node *));
+new_node->data = object;
+

[ovs-dev] [PATCH 1/5] ovsdb-idl: Compound Indexes Design Document

2016-03-08 Thread Rodriguez Betancourt, Esteban

This is the pull request of the complete change: 
https://github.com/openvswitch/ovs/pull/111

---
From d7d82c305a610fdb9a81427aa5483e8c6a071687 Mon Sep 17 00:00:00 2001
From: Esteban Rodriguez Betancourt 
Date: Fri, 26 Feb 2016 17:23:28 -0600
Subject: [PATCH 1/5] ovsdb-idl: Compound Indexes Design Document

In the work made in our projects, it was found the need to have a faster
access to the rows contained in tables in the replica, as well to have
the possibility to loop over a subset of rows that meet some specified
criteria.
Those needs lead us to design and implement a functionality that
satisfies those requirements, so an implementation of special indexes were
done.
In order to keep the OVSDB server implementation unmodified and avoid
extra load of processing, the indexes are created as part of the IDL.
The indexes are created as part of the initialization of the replica request
and are maintained automatically when there are changes in the replica.

This document explains the design rationale of the compound indexes feature.

Signed-off-by: Javier Albornoz 
Signed-off-by: Esteban Rodriguez Betancourt 
Signed-off-by: Jorge Arturo Sauma Vargas 
Co-authored-by: Javier Albornoz 
Co-authored-by: Esteban Rodriguez Betancourt 
Co-authored-by: Jorge Arturo Sauma Vargas 
---
 Makefile.am|   1 +
 README-compound-indexes.md | 306 +
 2 files changed, 307 insertions(+)
 create mode 100644 README-compound-indexes.md

diff --git a/Makefile.am b/Makefile.am
index 75ccadf..8a8f565 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -88,6 +88,7 @@ docs = \
OPENFLOW-1.1+.md \
PORTING.md \
README.md \
+   README-compound-indexes.md \
README-lisp.md \
README-native-tunneling.md \
REPORTING-BUGS.md \
diff --git a/README-compound-indexes.md b/README-compound-indexes.md
new file mode 100644
index 000..7539d7b
--- /dev/null
+++ b/README-compound-indexes.md
@@ -0,0 +1,306 @@
+# Using indexes for special operations
+
+--
+
+## Introduction
+
+The OVSDB database was designed for powering a virtual switch, therefore not 
all
+common DB functionality was needed at the time it was implemented. However, as
+part of the of the work of our developers, they found scenarios where this 
common
+functionality would be useful. One of this functionality is using indexes for
+operations such as sorting or fast lookup.
+
+This design document describes a proposed solution for implementing this extra
+functionality based on the concept of indexes.
+
+Please note that in this document, the term "index" refers to the common
+database term defined as "a data structure that improves data retrieval". 
Unless
+stated otherwise, the definition for index from the OVSDB RFC (RFC 7047) is not
+used.
+
+## Problem statement
+
+In our projects, we would like to have available tools for special operations. 
This
+tools are not currently available on the OVSDB engine.
+
+The most common tools requested are for doing fast lookups of big tables and 
for
+a mechanism to retrieve data in lexicographic order.
+
+## Use cases
+
+### Fast lookups
+
+Depending on the topology, the route table of a network device could manage
+thousands of routes. Commands such as "show ip route <*specific route*>" would
+need to do a sequential lookup of the routing table to find the specific route.
+With an index created, the lookup time could be faster.
+
+This same scenario could be applied to other features such as Access List rules
+and even interfaces lists.
+
+### Lexicographic order
+
+There are several cases where retrieving data in lexicographic order is needed.
+For example, SNMP. When an administrator or even a NMS would like to retrieve
+data from a specific device, it's possible that they will request data from 
full
+tables instead of just specific values. Also, they would like to have this
+information displayed in lexicographic order. This operation could be done by
+the SNMP daemon or by the CLI, but it would be better if the database could
+provide the data ready for consumption. Also, duplicate efforts by different
+processes will be avoided. Another use case for requesting data in 
lexicographic
+order is for user interfaces (web or CLI) where it would be better and quicker
+if the DB sends the data sorted instead of letting each process to sort the 
data
+by itself.
+
+## Implementation
+
+The proposal is to create a data structure in memory that contains pointers to
+the rows where the desired information is stored. This data structure can be
+traversed in the order specified when creating the index.
+
+A index can be defined over any number of columns, and support the following
+options:
+
+-   Add a column with type string, int or real (using default comparators).
+-   Select if ordering

[ovs-dev] [PATCH] Add --enable-jemalloc argument to build

2016-02-23 Thread Rodriguez Betancourt, Esteban

During our tests with OVSDB we found out that memory allocation is
used intensively by the server, so we thought that using a different
memory allocator could increase the performance. We tried jemalloc
and the performance gain was between 20% and 40%.

This patch would allow anyone to enable jemalloc on their systems
in case you want to try out this memory allocator.

The patch adds the --enable-jemalloc flag to link ovsdb-server with jemalloc
instead of the default memory allocator. This can improve the OVSDB
Server performance under certains loads.

Signed-off-by: Esteban Rodriguez Betancourt 
---
 .travis.yml   |  3 +++
 DESIGN.md | 51 +++
 INSTALL.md|  4 
 configure.ac  | 14 ++
 ovsdb/automake.mk |  4 
 5 files changed, 76 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 52c9362..fa7cefa 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -10,6 +10,8 @@ addons:
   - gcc-multilib
   - libssl-dev
   - llvm-dev
+  - libjemalloc1
+  - libjemalloc-dev
 
 before_install: ./.travis/prepare.sh
 
@@ -33,6 +35,7 @@ env:
   - KERNEL=3.4.110
   - KERNEL=3.2.76
   - KERNEL=2.6.32.70
+  - OPTS="--enable-jemalloc"
 
 script: ./.travis/build.sh $OPTS
 
diff --git a/DESIGN.md b/DESIGN.md
index 6865d47..85d4f11 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -1082,6 +1082,57 @@ against desired actions in a bytewise fashion:
 Please report other discrepancies, if you notice any, so that we can
 fix or document them.
 
+jemalloc as Default Memory Manager for OVSDB-Server
+===
+
+Background
+--
+
+OVSDB-Server performs a lot of memory-related operations. Many of those are 
related to
+allocating small pieces of memory. It requests and releases memory continually 
for creating
+and destroying objects like dynamic strings, json objects, etc.
+
+Tests revealed that by replacing the GLibC memory allocator with the [jemalloc]
+(http://www.canonware.com/jemalloc) memory allocator, OVSDB-Server's 
performance can be
+significantly improved, and also, in some cases, can reduce the memory usage.
+
+## Performance Improvements
+
+Several tests (insertion, updates, pubsub and transaction size) were run in 
order to
+compare the GLibC performance and memory usage along both jemalloc and 
tcmalloc memory
+allocators.
+
+The results showed favorable results for jemalloc library.
+
+- Update 1: Each of 10 parallel workers do 25000 updates over 1000 rows.
+- Update 2: Each of 10 parallel workers do 25000 updates over 20 rows.
+- Insert: Each of 10 parallel workers insert 25000 rows.
+- Message Queue Simulation: One producer waits for requests (requested by 10 
parallel workers).
+  For each request the producer updates 512 rows, composed of one map column 
with 256 elements.
+- Transaction Size: The program inserts 100, 1000, 1, 10 and 50 
rows, alternating
+  between inserting all the rows in the same transaction, or inserting one row 
per transaction.
+
+| Test |   GLibC   |  jemalloc  |  tcmalloc  |
+|--|---|||
+| Update 1 |   28s |24s |31s |
+| Update 2 |   57s |38s |33s |
+| Insert   |   38s |32s |27s |
+| Queue|4s | 4s | 4s |
+| Size |  129.92s  |   119.87s  |   114.21s  |
+|Duration (seconds)  |
+
+
+| Test |   GLibC   |  jemalloc  |  tcmalloc  |
+|--|---|||
+| Update 1 |7.35   |18.92   |   171.42   |
+| Update 2 | 1090.7|82.88   |   111.08   |
+| Insert   |   66.66   |57.86   |   127.56   |
+| Queue| 1094.40   |28.13   |   270.47   |
+| Size |  410.17   |   116.50   |   114.21   |
+|Memory Usage (RSS, megabytes)   |
+
+The results show a consistent better results of jemalloc over GLibC memory 
allocator.
+TCMalloc was faster in some benchmarks, but uses a lot more of RAM than 
jemalloc.
 
 Suggestions
 ===
diff --git a/INSTALL.md b/INSTALL.md
index dc688ad..51d2228 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -257,6 +257,10 @@ Here is an example:
   `% mkdir _gcc && (cd _gcc && ../configure CC=gcc)`  
   `% mkdir _clang && (cd _clang && ../configure CC=clang)`
 
+To use jemalloc as the memory allocator for OVSDB Server, add
+--enable-jemalloc, which links OVSDB Server with jemalloc (the library
+must be installed on the system). For the rationale about this flag
+refer to DESIGN.md.
 
 Building the Sources
 
diff --git a/configure.ac b/configure.ac
index 49aa182..4e0369c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -43,6 +43,20 @@ AC_SYS_LARGEFILE
 LT_INIT([disable-shared])
 m4_pattern_forbid([LT_INIT]) dnl Make autoconf fail if libtool is missing.
 
+AC_ARG_ENABLE([jemalloc],
+[AS_HELP_STRING([--enable-jemalloc], [use jemalloc as the

41 matches

Mail list logo