Hi Stewart,

I read your post about the key tuple format, but I still do not quite understand when and how those index_foo functions will be called. I have looked through the csv and archive engines, as well as the mysql filesystem and awss3 engines. None of them supports indexes. I tried to figure it out by reading the source code of the innodb engine, but I still do not quite understand it, so I have no idea how to implement an index in a storage engine. Can you give me some hints?

Also, I have not read the transaction-related source code in depth. I am not sure whether I can implement a filesystem-based or cloud-based TransactionalStorageEngine in a three-month GSoC. What's your opinion?
Thanks!

2010/4/2 Planet Drizzle <[email protected]>
>
> Stewart Smith: The Drizzle (and MySQL) Key tuple format
>
> Here’s something that’s not really documented anywhere (unless you count
> ha_innodb.cc as a source of server documentation). You may have some idea
> about the MySQL/Drizzle row buffer format. This is passed around the storage
> engine interface: in for write_row and update_row and out for the various
> scan and index read methods.
>
> If you want to see the docs for it that exist in the code, check out
> store_key_val_for_row in ha_innodb.cc.
>
> However, there is another format that is passed to your engine (and that your
> engine is expected to understand) and for lack of a better name, I’m going to
> call it the key tuple format. The first place you’ll probably see this is
> when implementing the index_read function for a Cursor (or handler in MySQL
> speak).
>
> You get two things: a pointer to the buffer and the length of the buffer.
> Since a key can be made up of multiple parts, some of which can be NULL and
> some of which can be of variable length, this buffer is not (usually) a
> simple value. If you are starting out in your engine development, you can use
> this buffer blindly as a single value for non-nullable indexes with only 1
> column.
>
> The basic format is this:
>
> - The buffer is in-order of the index. First column in the index is first in
>   the buffer, second second etc.
> - The buffer must be zero-filled. The server kernel will use memcmp to compare
>   two key values.
> - If the column is NULLable, then the first byte is set to 1 if the column is
>   null. Else, 0 means not-null.
> - From ha_innodb.cc (for BLOBs, which I haven’t put in embedded_innodb yet): If
>   the column is of a BLOB type (it must be a column prefix field in this case),
>   then we put the length of the data in the field to the next 2 bytes, in the
>   little-endian format. If the field is SQL NULL, then these 2 bytes are set to
>   0. Note that the length of data in the field is <= column prefix length.
> - For fixed length fields (such as int), the next max field length bytes are
>   for that field.
> - For VARCHAR, there is always a 2 byte (in little endian) length. This is
>   different to the row format, which may have 1 or 2 bytes. In the key tuple
>   format it is ALWAYS two bytes.
>
> I’ll discuss the use of this for rnd_pos() and position() in a later post…
>
> This blog post (but not the whole blog) is published under the Creative
> Commons Attribution-Share Alike License. Attribution is by linking back to
> this post and mentioning my name (Stewart Smith).
>
> URL:
> http://www.flamingspork.com/blog/2010/04/02/the-drizzle-and-mysql-key-tuple-format/
>
> _______________________________________________
> Mailing list: https://launchpad.net/~drizzle-discuss
> Post to : [email protected]
> Unsubscribe : https://launchpad.net/~drizzle-discuss
> More help : https://help.launchpad.net/ListHelp
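
As a rough illustration of the key tuple layout described in the quoted post, here is a minimal Python sketch that packs and unpacks such a buffer. It is not code from Drizzle or MySQL: `KeyPart`, `encode_key_tuple` and `decode_key_tuple` are hypothetical names made up for this example, and fixed-length values are treated as opaque bytes because the post does not cover their internal byte order. It also assumes that a NULL or short value still occupies the full width of its key part, padded with zeros, which is one reading of "the buffer must be zero-filled".

```python
import struct
from typing import List, NamedTuple, Optional


class KeyPart(NamedTuple):
    """One column of an index (hypothetical helper, not a Drizzle/MySQL type)."""
    kind: str       # "fixed" (e.g. int) or "varchar"
    length: int     # fixed width, or maximum data length for a varchar
    nullable: bool


def encode_key_tuple(values: List[Optional[bytes]], parts: List[KeyPart]) -> bytes:
    """Pack one value per key part, in index order, into a zero-filled buffer."""
    out = bytearray()
    for value, part in zip(values, parts):
        if value is None and not part.nullable:
            raise ValueError("NULL given for a non-nullable key part")
        if part.nullable:
            out.append(1 if value is None else 0)   # NULL flag byte: 1 means NULL
        data = b"" if value is None else value
        if len(data) > part.length:
            raise ValueError("value longer than the key part")
        if part.kind == "varchar":
            out += struct.pack("<H", len(data))     # always 2 bytes, little-endian
        elif value is not None and len(data) != part.length:
            raise ValueError("fixed-length value has the wrong width")
        # Assumption: every part occupies its full width, padded with zeros,
        # so that two equal keys also compare equal under memcmp.
        out += data.ljust(part.length, b"\x00")
    return bytes(out)


def decode_key_tuple(buf: bytes, parts: List[KeyPart]) -> List[Optional[bytes]]:
    """Walk the buffer part by part and return the values (None for NULL)."""
    values: List[Optional[bytes]] = []
    pos = 0
    for part in parts:
        is_null = False
        if part.nullable:
            is_null = buf[pos] == 1
            pos += 1
        if part.kind == "varchar":
            (used,) = struct.unpack_from("<H", buf, pos)
            pos += 2
        else:
            used = part.length
        values.append(None if is_null else bytes(buf[pos:pos + used]))
        pos += part.length                          # skip the padding as well
    return values


# Example index: a non-nullable 4-byte column followed by a nullable VARCHAR(8).
parts = [KeyPart("fixed", 4, False), KeyPart("varchar", 8, True)]
key = encode_key_tuple([b"\x2a\x00\x00\x00", b"abc"], parts)
print(len(key), decode_key_tuple(key, parts))
```

In a real engine the equivalent of `decode_key_tuple` would be driven by the index definition the server hands to the engine, walking the pointer and length received in index_read; the list of `KeyPart` entries here only stands in for that definition.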

