Re: [ovs-dev] [PATCH 3/5] ovsdb-idl: IDL Compound Indexes Implementation

Rodriguez Betancourt, Esteban Wed, 06 Apr 2016 13:23:16 -0700

> From: Ben Pfaff [mailto:[email protected]]
> Sent: martes, 22 de marzo de 2016 14:55
> To: Rodriguez Betancourt, Esteban <[email protected]>
> Cc: [email protected]
> Subject: Re: [ovs-dev] [PATCH 3/5] ovsdb-idl: IDL Compound Indexes 
> Implementation
> 
> On Wed, Mar 09, 2016 at 12:03:54AM +0000, Rodriguez Betancourt, 
> Esteban
> wrote:
> > In the C IDL, allows to create multicolumn indexes in the tables, 
> > that are keep synched with the data in the replica.
> >
> > Signed-off-by: Esteban Rodriguez Betancourt <[email protected]>
> 
> Until now, ovsdb-idl.h has been divided into sections separated by 
> page- breaks (control-L), and each section has had a comment 
> explaining basically what that section describes.  Some of the 
> sections, like the one for transactions, has a lot of detail.  This 
> commit adds a new section, but it doesn't add a page break or an 
> explanatory comment.  I think it should do both.  The design document 
> from patch 1 might be a good place from which to draw for the comment.


Changed in v2 patch.

> This patch and patch 2 has a couple of typedefs for function pointers.
> CodingStyle.md says:
> 
>       A function type is a good use for a typedef because it can clarify
>     code.  The type should be a function type, not a pointer-to-function
>     type.  That way, the typedef name can be used to declare function
>     prototypes.  (It cannot be used for function definitions, because that
>     is explicitly prohibited by C89 and C99.)

Changed in v2 patch.

> Do you think that the sorting order in struct ovsdb_idl_index is 
> useful, that is, do you see a common use for indexes in descending 
> order?  If indexes in descending order are only rarely useful, then 
> they could be implemented through a custom comparison function.

This is more needed by tools that expose the data to a user, and the user wants 
to sort the data in different ways.
Also this enables the option to have in a single index some columns in one 
direction and other columns in the other direction.

> I don't understand, in ovsdb_idl_index_compare(), why NULL != NULL.  
> It seems to me that this is risky and likely to cause confusion, if 
> not now then later.
Changed in v2 patch.

> I don't understand why indexes have string names.  It seems like kind 
> of a curious design.  Why isn't the client expected to retain a 
> pointer to the index that it wants to traverse, instead of a name?


One reason is to prevent the developer from modifying the values of the index by
referencing it using a string instead of giving direct access to the index data 
structure.
The other one is to allow a more intuitive way to refer to the indexes, and to
select them when programming. 

> Please use xmalloc() instead of malloc().
Changed in v2 patch.

> I found myself wondering whether there's value in doing dynamic 
> allocation of the three parallel arrays inside struct ovsdb_idl_index.
> It would be very unusual to have an index over more than, say, 3 
> columns, and so it might be easier to have fixed-size arrays of 3 
> elements (or a few more, to allow room for expansion).  I imagine that 
> the columns in indexes will be fixed at build time, after all.
> 
> ovsdb_idl_index_add_column() seems to be working hard to avoid using 
> xrealloc(), but I don't know why.

Changed in v2 patch. The indexes support any number of "columns" for doing the 
comparison. But that number isn't limited to the number of columns in a row, 
because eventually one column can have several values indexed, for example, 
different keys from a map column.

For some "queries", the need of filtering the data first raise the number of 
columns need to 5 or more.

> I don't understand why only certain columns have a default comparison 
> function.  Why not use ovsdb_datum_compare_3way() as the default for 
> every column?  I guess that the ovsdb_datums and ovsdb_type would have 
> to be passed into column_comparator instead of a pair of void * 
> pointers, but that would arguably be an improvement anyhow, certainly 
> from a type- safety viewpoint.

We decided against using the datums for the comparison because it seems 
intuitive that the parsed values in the ovsrec are easier and faster to compare.

In some cases, a table could have a really really high number of rows, and 
receive a lot of updates per second, so is important that the comparison 
functions are fast.

> Do you have an example of a useful use for a custom comparison function?

Sorting a string column, that contains IPv4 or IPv6 values.
Sort an specific member inside a map column Sort a string as a number Sort 
strings as dates Sort by a value contained in a referenced row

> Did you consider defining a struct of three members instead of using 
> three parallel arrays in struct ovsdb_idl_index?  It might well be 
> cleaner, and certainly would be for memory allocation issues.

Changed in v2 patch

> It looks like the clients don't actually need to see the definition of 
> struct ovsdb_idl_index, so perhaps it should be moved into 
> ovsdb-idl-provider.h.

Changed in v2 patch

> This patch has some minor coding style issues; I suggest reading 
> CodingStyle.md.

Changed in v2 patch

> The generic comparison functions takes 'row_sync' into account.  Do 
> custom comparison functions also need to take 'row_sync' into account?
> (Why or why not?)

No they don't. The row_sync is used exclusively by the generic comparison 
function.
More about it below.

> It looks like row_sync is needed because the skiplist data structure 
> implemented in the previous patch does not tolerate insertion of duplicates.
> That seems like a serious drawback for a data structure that is 
> supposed to be used as an index that may contain duplicates.
> Did you consider implementing a skip list (or other balanced tree
> structure) that gracefully supports duplicates?  That would be a much 
> more straightforward solution to the problem.

Allowing duplicates in the skiplist would make difficult to have a fast 
mechanism for deleting/modifying rows.

If the "key" is duplicated then we would need to iterate over all the equal 
keys, until we found the one with the value (pointer to row) that we want. If 
the "key"
is uncommon (e.g.: an UUID) that couldn't be very bad, but if the "key" is 
common (e.g.: a Boolean) each update/deletion of a row would need to iterate 
over many rows, potentially thousands.

That would be O(n), not O(log(n)). The row_sync solution was a balance between 
good performance for compound indexes, and adding a data structure for general 
use.

> Thanks,
> 
> Ben.

I'm going to send the updated patches soon, that takes in consideration that 
observations made here and in the responses to the other patches. The pull 
request was updated as well.

Thanks,
Esteban
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH 3/5] ovsdb-idl: IDL Compound Indexes Implementation

Reply via email to