Toru Maesaka: BlitzDB and Keyless Tables

Previously you couldn’t create a BlitzDB table without defining a primary key. That actually sounds like a nice constraint, since you should always define a primary key. However, I went ahead and made keyless tables possible. One of the reasons I’m developing BlitzDB is to get a better understanding of how the MySQL/Drizzle storage subsystem works, so implementing a hidden key generator and using it internally was something I had wanted to do for some time.

Previously this is what you got if you tried to create a table without a primary key:

drizzle> create table t1 (col1 int, col2 int, col3 text) engine=blitz;
ERROR 1173 (42000): This table type requires a primary key

Now:

drizzle> create table t1 (col1 int, col2 int, col3 text) engine=blitz;
Query OK, 0 rows affected (0 sec)

Inserting rows works as you would expect:

drizzle> insert into t1 values (1, 1, "first row");
Query OK, 1 row affected (0 sec)
 
drizzle> insert into t1 values (1, 2, "second row");
Query OK, 1 row affected (0 sec)
 
drizzle> insert into t1 values (1, 3, "third row");
Query OK, 1 row affected (0 sec)
 
drizzle> insert into t1 values (2, 1, "fourth row");
Query OK, 1 row affected (0 sec)
 
drizzle> insert into t1 values (2, 2, "fifth row");
Query OK, 1 row affected (0 sec)
 
drizzle> insert into t1 values (2, 3, "sixth row");
Query OK, 1 row affected (0 sec)

Selecting rows works fine, although since there isn’t a key column in this table, every query requires a full table scan, which is not sexy:

drizzle> select * from t1;
+------+------+------------+
| col1 | col2 | col3       |
+------+------+------------+
|    1 |    1 | first row  | 
|    1 |    2 | second row | 
|    1 |    3 | third row  | 
|    2 |    1 | fourth row | 
|    2 |    2 | fifth row  | 
|    2 |    3 | sixth row  | 
+------+------+------------+
 
drizzle> select * from t1 where col1 = 1;
+------+------+------------+
| col1 | col2 | col3       |
+------+------+------------+
|    1 |    1 | first row  | 
|    1 |    2 | second row | 
|    1 |    3 | third row  | 
+------+------+------------+
3 rows in set (0 sec)
 
drizzle> select * from t1 where col2 = 2;
+------+------+------------+
| col1 | col2 | col3       |
+------+------+------------+
|    1 |    2 | second row | 
|    2 |    2 | fifth row  | 
+------+------+------------+
2 rows in set (0 sec)
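
To make the full-scan cost concrete, here is a minimal sketch in Python (not BlitzDB's actual code). The `rows` dictionary and the `select_where` helper are hypothetical stand-ins for the storage layer; the only thing the sketch assumes is that rows are reachable solely through a hidden sequential key, so a filter on col1 or col2 has to look at every row:

```python
# Hypothetical stand-in for a keyless table: rows are stored under a
# hidden sequential key, and no column is indexed.
rows = {
    1: (1, 1, "first row"),
    2: (1, 2, "second row"),
    3: (1, 3, "third row"),
    4: (2, 1, "fourth row"),
    5: (2, 2, "fifth row"),
    6: (2, 3, "sixth row"),
}


def select_where(table, column_index, value):
    """Return the matching rows and the number of rows examined."""
    examined = 0
    matches = []
    for row in table.values():  # no index, so every row is visited
        examined += 1
        if row[column_index] == value:
            matches.append(row)
    return matches, examined


# Equivalent of: select * from t1 where col2 = 2;
matches, examined = select_where(rows, 1, 2)
```

Two rows match, but all six rows are examined to find them.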

How the internals work

BlitzDB does what most people would assume. It atomically generates a sequential unsigned 64-bit integer, then converts it to big-endian (network byte order) if the host isn't already big-endian. It then uses that value as the key under which the row is stored in TC (Tokyo Cabinet). The auto-generated key is always big-endian because I want BlitzDB tables to work on all platforms. That is, admins should be able to copy the “data files” over to another server and happily keep using the database. Keys are converted and _always_ used as big-endian inside BlitzDB.
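
The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BlitzDB's implementation: `HiddenKeyGenerator`, `decode_key` and the `store` dictionary are hypothetical names, a lock stands in for whatever atomic primitive the engine uses, and a plain dictionary stands in for the key-value store:

```python
import itertools
import struct
import threading


class HiddenKeyGenerator:
    """Hands out sequential unsigned 64-bit keys encoded as big-endian bytes."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_key(self):
        # The lock makes "read and increment" a single atomic step,
        # so two threads can never receive the same value.
        with self._lock:
            value = next(self._counter)
        # ">Q" packs an unsigned 64-bit integer in big-endian order,
        # regardless of the byte order of the host CPU.
        return struct.pack(">Q", value)


def decode_key(raw):
    """Turn an 8-byte big-endian key back into an integer."""
    return struct.unpack(">Q", raw)[0]


# Hypothetical stand-in for the key-value store.
store = {}
gen = HiddenKeyGenerator()
for row in [(1, 1, "first row"), (1, 2, "second row")]:
    store[gen.next_key()] = row
```

Because the byte layout is fixed rather than taken from the host, the same eight bytes decode to the same integer on any machine, which is what makes copying the data files between servers safe.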

Next Step

There are still some bits and pieces of the update-related code that I need to work on, but in general things are looking good. Once those tasks are done, I can start working on secondary index support, which I have some cool ideas for.

URL: http://torum.net/2009/10/blitzdb-keyless-tables/
