Public bug reported: I have noticed the keystone SQL database schema design is not scalable. It can hold maybe a few hundreds or maximum thousands of entries, but beyond this, it is going to certainly create very serious efficiency problems, both in terms storage space and query response time. Here are the main problem points I have spotted:
i) most of the tables use primary keys of varchar(64) type: role, domain, project, token, user, group etc., supposed to contain unique hex identifiers. I am not exactly sure about the rationale behind this design? If the idea is to be able to accommodate up to 16**64=10**77 distinct records, than this is clearly flawed, as the tables won't hold more than a few thousand entries given the current length of the primary key (and foreign keys, for those minor entity tables that refer to the major entity). ii) some tables have composite keys on multiple varchar(64) fields: Create Table: CREATE TABLE `assignment` ( `type` enum('UserProject','GroupProject','UserDomain','GroupDomain') NOT NULL, `actor_id` varchar(64) NOT NULL, `target_id` varchar(64) NOT NULL, `role_id` varchar(64) NOT NULL, `inherited` tinyint(1) NOT NULL, PRIMARY KEY (`type`,`actor_id`,`target_id`,`role_id`), KEY `ix_actor_id` (`actor_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) iii) some tables have unique keys defined on varchar(255) columns: Create Table: CREATE TABLE `role` ( `id` varchar(64) NOT NULL, `name` varchar(255) NOT NULL, `extra` text, PRIMARY KEY (`id`), UNIQUE KEY `name` (`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 iv) the generated public id for (user,domain) entities is currently 64 hex chars, while only 32 hex chars are needed to ensure uniqueness up to 16**16=2**64=10**19 entries, which should be more than sufficient for any practical installation. In order to remedy these problems, I propose the following improvements: i) replace the varchar(64) hex primary key by an auto-incremented integer(4) column. This will hold up to 4 billion records and will greatly reduce the storage requirements and improve query performance. ii) reduce the generated public id for (user, domain) entities to 32 hex chars, stored in binary form as two bigint(8) columns. iii) reduce the "name" field length to more manageable length or reduce index size using a hash function. ** Affects: keystone Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1524124 Title: unscalable database schema design Status in OpenStack Identity (keystone): New Bug description: I have noticed the keystone SQL database schema design is not scalable. It can hold maybe a few hundreds or maximum thousands of entries, but beyond this, it is going to certainly create very serious efficiency problems, both in terms storage space and query response time. Here are the main problem points I have spotted: i) most of the tables use primary keys of varchar(64) type: role, domain, project, token, user, group etc., supposed to contain unique hex identifiers. I am not exactly sure about the rationale behind this design? If the idea is to be able to accommodate up to 16**64=10**77 distinct records, than this is clearly flawed, as the tables won't hold more than a few thousand entries given the current length of the primary key (and foreign keys, for those minor entity tables that refer to the major entity). ii) some tables have composite keys on multiple varchar(64) fields: Create Table: CREATE TABLE `assignment` ( `type` enum('UserProject','GroupProject','UserDomain','GroupDomain') NOT NULL, `actor_id` varchar(64) NOT NULL, `target_id` varchar(64) NOT NULL, `role_id` varchar(64) NOT NULL, `inherited` tinyint(1) NOT NULL, PRIMARY KEY (`type`,`actor_id`,`target_id`,`role_id`), KEY `ix_actor_id` (`actor_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) iii) some tables have unique keys defined on varchar(255) columns: Create Table: CREATE TABLE `role` ( `id` varchar(64) NOT NULL, `name` varchar(255) NOT NULL, `extra` text, PRIMARY KEY (`id`), UNIQUE KEY `name` (`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 iv) the generated public id for (user,domain) entities is currently 64 hex chars, while only 32 hex chars are needed to ensure uniqueness up to 16**16=2**64=10**19 entries, which should be more than sufficient for any practical installation. In order to remedy these problems, I propose the following improvements: i) replace the varchar(64) hex primary key by an auto-incremented integer(4) column. This will hold up to 4 billion records and will greatly reduce the storage requirements and improve query performance. ii) reduce the generated public id for (user, domain) entities to 32 hex chars, stored in binary form as two bigint(8) columns. iii) reduce the "name" field length to more manageable length or reduce index size using a hash function. To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1524124/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp