Hi,

Here is the link [1] <http://bit.ly/vbucket> to my vbucket-style partitioning code (in C++). The code takes the key from the sharded query executed by the client as input and calculates its MD5 hash. It then calculates the index of the vbucket to look up in order to reach the right Drizzle server among the several available.
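To make the key-to-vbucket lookup concrete, here is a minimal sketch. It is not the code behind [1]. It swaps MD5 for FNV-1a (32-bit) so that it needs no external library, and it assumes a fixed vbucket count, a modulus reduction and a round-robin vbucket-to-server map. The names fnv1a_32, vbucket_for_key and build_vbucket_map are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// FNV-1a 32-bit hash. Assumption: used here only as a dependency-free
// stand-in for the MD5 hash that the actual code computes.
std::uint32_t fnv1a_32(const std::string& key) {
    std::uint32_t hash = 2166136261u;      // FNV offset basis
    for (unsigned char c : key) {
        hash ^= c;
        hash *= 16777619u;                 // FNV prime
    }
    return hash;
}

// Reduce the hash of the key to a vbucket index in [0, num_vbuckets).
// Assumption: the reduction is a plain modulus over the vbucket count.
std::size_t vbucket_for_key(const std::string& key, std::size_t num_vbuckets) {
    if (num_vbuckets == 0) {
        throw std::invalid_argument("num_vbuckets must be positive");
    }
    return fnv1a_32(key) % num_vbuckets;
}

// Build a vbucket -> server table by assigning vbuckets round-robin.
// Assumption: a real deployment may assign vbuckets differently.
std::vector<std::size_t> build_vbucket_map(std::size_t num_vbuckets,
                                           std::size_t num_servers) {
    if (num_servers == 0) {
        throw std::invalid_argument("num_servers must be positive");
    }
    std::vector<std::size_t> map(num_vbuckets);
    for (std::size_t i = 0; i < num_vbuckets; ++i) {
        map[i] = i % num_servers;
    }
    return map;
}
```

A client would call vbucket_for_key(key, n) and then index the table returned by build_vbucket_map(n, servers) to pick a server. With 12 vbuckets, the round-robin table gives each server the same number of vbuckets for 2, 3, 4 or 6 servers.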
I haven't yet applied the modulus operation to the decimal index value generated from the MD5 hash of the key in this code, because the modulus is the number of vbuckets, and that usually varies with the number of clients connecting to the set of servers. The number of vbuckets is usually kept a multiple of 12, since 12 is divisible by 2, 3, 4 and 6, which gives flexibility in distributing load evenly among the servers.

I am also going through [2] <https://github.com/membase/libvbucket>, as it uses libhashkit for different hashing algorithms such as md5, crc, fnv1_64, fnv1a_64, fnv1_32, fnv1a_32, hsieh, murmur and jenkins. I am also making good progress on an SQL parser (for extracting the key from the sharded query), using the cJSON parser inside libvbucket as a reference. I would appreciate it if somebody could suggest ideas for building the SQL query parser that extracts the key supplied by the client.

[1] http://bit.ly/vbucket
[2] https://github.com/membase/libvbucket

Regards,
Abhishek Kumar Singh
http://www.mapbender.org/User:Abhishek
BE/1349/2007
Information Technology
8th SEMESTER
BIT MESRA
Skype: singhabhishek.bit
Mobile: +91-8002111189
irc-nick: sin8h (irc.freenode.net)

On Wed, Apr 6, 2011 at 20:40, Anurag Priyam <[email protected]> wrote:
> > 3. Maintain a master shard index: This technique involves using a single
> > master table that maps various values to specific shards. It is very
> > flexible, and meets a wide variety of application situations. However, this
> > option often delivers lower performance as it requires an extra lookup for
> > each sharded SQL Statement.
>
> Range based approach is the actually very flexible application
> specific way to shard. You can literally allow applications to adapt
> in any way they want to. Just that the lookup table need not be on the
> master server.
>
> [...]
>
> > 3. Rebalancing: In some cases, the sharding scheme chosen for a database has
> > to be changed. This could happen because the sharding scheme was improperly
> > chosen (e.g. partitioning users by zip code) or the application outgrows the
> > database even after being sharded. In such cases, the database shards will
> > have to be rebalanced which means the partitioning scheme changed and all
> > existing data moved to new locations. Doing this without incurring down time
> > is extremely difficult.
> >
> > I would really appreciate if somebody from drizzle community is willing to
> > share ideas regarding handling these issues.
>
> I am quite sure you are following my thread of discussion. Very
> likely, Stewart, and I have something there.
>
> --
> Anurag Priyam
> http://about.me/yeban/
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

