Henning, thanks for working through this. I can definitely understand that consistency across the DB modules is important architecturally. I have been thinking about this all day, and I don't think I have a favorable response to the issue of the row id as a primary key in Berkeley DB. The Berkeley database is not relational, and the extra burden of maintaining an artificial key (id) for each row will not actually improve performance as it would in a relational database. I am not an expert in DB internals, so I'll just explain things as I understand them. We need to hash this out :)
The API for querying in Berkeley DB is either:
1. get() - where you provide the key, and in our case it must be lexicographically equal in order to find a result. I believe this corresponds to the 'natural join'.
2. cursor() - where you iterate over each row, do the join on any columns you want, and build a result set.

As implemented, without the id columns, the queries use get(), which implies a natural join, i.e. exact string equality on the 'key'. In most cases this is a composite key comprised of the METADATA_KEY columns separated by a delimiter. Since the underlying access method is db_hash, the query runtime is constant. I think if we change the bdb schema to use the id column as part of the composite key, we will be limiting ourselves to cursor-based queries, since we will not know the id until after the first query. Aside, my understanding is that future development would implement queries that fetch and store the oid, such that subsequent queries against that table would use a 'WHERE id = oid' clause. (Please let me know if this assumption is incorrect.)

As I sit here, I think I would have to create a secondary bdb database for each table that requires the id column. The key would be a unique integer id, and the value would point to the row of the 'real' table. This would probably work, but it does add a layer of complexity that we take for granted in the relational databases. Today, these secondary databases are not implemented, and there are other issues not discussed, like the concept of uniqueness of the ids, etc. However, to be honest, I don't know if I can get all this secondary db stuff working in the next two months.

Please do not take this as me rejecting your ideas, but rather as full disclosure that making db_berkeley more 'relational' comes at the cost of additional complexities that are not implemented yet.

Aside, I started looking at the code for the openserctl cmds today, and I think I need to add some fifo cmds to the modules, since openser is actually running at the time the openserctl util is being invoked. This means the DBs are open and some data may not be committed to disk, etc. I thought I'd use the carrierroute module as the starting example for implementing such fifo commands, but I need a few more days to get all those commands implemented/tested.

If you prefer discussions in this working group, that is good, but I am also available via sip if you want to g


Henning Westerholt wrote:
On Thursday 11 October 2007, William Quan wrote:
I was poking around and I don't think Berkeley DB has indexes like
we're used to in relational databases (or if it does, they are not
exposed via the API).

So basically each Berkeley DB maps to a SQL 'table'. The 'rows' are
mapped to key/value pairs in the bdb, and 'columns' are
application-encapsulated fields that the module needs to manipulate.
Conceptually it's like a big hash table, where you need to know the key
for the query to find a row. Because of this, I did not include the ID
column in the tables, as it's the auto-incremented column that a
relational db would use for an index, not something that is ordinarily
provided in a query by the application. I did not see your xslt file,
but could we modify it to not include the id columns for the berkeleydb
stuff?

Hello William,

the 'id' column is currently not used by the openser server, but this is planned for future releases. For that reason we also include the id field in the dbtext tables; that db is conceptually somewhat like the berkeley_db module.

We also had some real pain in the past supporting different db tables for all the modules, so I would really like to use the same tables for this module too. If this is possible with dbtext, it should be possible with db_berkeley, too. :-)
BTW, the xml source is in db/schema, the xsl scripts are in doc/dbschema/xsl.

I use this module for registration so that involves the modules auth_db,
registrar, and usrloc. These modules use primarily tables subscriber and
location.
This stuff has been working for a while, but due to the key definition
of the subscriber and location tables, it does require you to set
use_domain=1 in the script.
I also tested tables acc and version, but the rest remain to be tested.

So, is it OK if I set the METADATA_KEY field, e.g. for subscriber, to 0 1 2 (id, username, domain)? What happens if I don't set the use_domain parameter?

I should have some more code in the next few days.

Great!

Best regards,

Henning


_______________________________________________
Devel mailing list
Devel@openser.org
http://openser.org/cgi-bin/mailman/listinfo/devel