Hi,

Here is the link [1] <http://bit.ly/vbucket> to my vbucket-style
partitioning code (in C++). The code takes the key from the sharded query
executed by the client as input and calculates its MD5 hash. From that hash
it then calculates the index of the vbucket to consult in order to reach a
Drizzle server out of the several servers available.

I haven't yet applied the modulus operation to the decimal index value
generated from the MD5 hash of the key in this code, because the modulus is
taken over the number of vbuckets, and that usually varies depending on the
number of clients connecting to the set of servers.
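
A minimal sketch of the hash-then-modulus step described above, not the code
behind [1]. To stay short and self-contained it uses 32-bit FNV-1a as a
stand-in for MD5; the function names fnv1a_32 and vbucket_for_key and the
parameter num_vbuckets are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 32-bit FNV-1a. ASSUMPTION: used here only as a compact stand-in for the
// MD5 hash mentioned in the text; it is not the hash used by the code in [1].
std::uint32_t fnv1a_32(const std::string& key) {
    std::uint32_t hash = 2166136261u;   // FNV offset basis
    for (unsigned char c : key) {
        hash ^= c;                      // mix in the next byte
        hash *= 16777619u;              // multiply by the FNV prime (wraps mod 2^32)
    }
    return hash;
}

// Hypothetical helper: map a shard key to a vbucket index in [0, num_vbuckets).
std::uint32_t vbucket_for_key(const std::string& key, std::uint32_t num_vbuckets) {
    assert(num_vbuckets > 0);           // modulus by zero is undefined
    return fnv1a_32(key) % num_vbuckets;
}
```

Because num_vbuckets is passed in at call time, the same hash value can be
reused unchanged when the number of vbuckets differs between deployments.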

Usually the number of vbuckets is kept as a multiple of 12, since 12 is
divisible by 2, 3, 4 and 6, which gives flexibility in distributing load
evenly among the set of servers.
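
The divisibility argument can be checked with a small sketch. It assumes a
simple round-robin assignment of vbuckets to servers, which is only one
possible mapping; the helper names vbuckets_per_server and is_evenly_spread
are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: assign vbuckets to servers round-robin and return
// how many vbuckets each server ends up owning.
std::vector<std::size_t> vbuckets_per_server(std::size_t num_vbuckets,
                                             std::size_t num_servers) {
    std::vector<std::size_t> counts(num_servers, 0);
    if (num_servers == 0) {
        return counts;                  // no servers, nothing to assign
    }
    for (std::size_t vb = 0; vb < num_vbuckets; ++vb) {
        counts[vb % num_servers] += 1;  // vbucket vb goes to server vb % num_servers
    }
    return counts;
}

// True when every server owns the same number of vbuckets.
bool is_evenly_spread(std::size_t num_vbuckets, std::size_t num_servers) {
    const std::vector<std::size_t> counts =
        vbuckets_per_server(num_vbuckets, num_servers);
    if (counts.empty()) {
        return false;
    }
    for (std::size_t c : counts) {
        if (c != counts.front()) {
            return false;
        }
    }
    return true;
}
```

With 12 vbuckets the load is even for 2, 3, 4 or 6 servers, while 5 servers
would leave some servers with 3 vbuckets and others with 2.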

I am also going through [2] <https://github.com/membase/libvbucket>, as it
uses libhashkit for different hashing algorithms such as md5, crc, fnv1_64,
fnv1a_64, fnv1_32, fnv1a_32, hsieh, murmur and jenkins. I am also making
good progress on an SQL parser (for parsing the key from the sharded query),
using the cJSON parser inside libvbucket as a reference.

I would appreciate it if somebody could suggest some ideas for building the
SQL query parser that extracts the key supplied by the client.

[1] http://bit.ly/vbucket

[2] https://github.com/membase/libvbucket

Regards,

Abhishek Kumar Singh
http://www.mapbender.org/User:Abhishek
BE/1349/2007
Information Technology
8th SEMESTER
BIT MESRA

Skype: singhabhishek.bit
Mobile: +91-8002111189
irc-nick: sin8h (irc.freenode.net)



On Wed, Apr 6, 2011 at 20:40, Anurag Priyam <[email protected]> wrote:

> > 3.Maintain a master shard index: This technique involves using a single
> > master table that maps various values to specific shards. It is very
> > flexible, and meets a wide variety of application situations. However,
> > this option often delivers lower performance as it requires an extra
> > lookup for each sharded SQL Statement.
>
> The range-based approach is actually a very flexible, application-specific
> way to shard. You can literally allow applications to adapt in any way
> they want to. It is just that the lookup table need not be on the master
> server.
>
> [...]
>
> > 3.Rebalancing: In some cases, the sharding scheme chosen for a database
> > has to be changed. This could happen because the sharding scheme was
> > improperly chosen (e.g. partitioning users by zip code) or the
> > application outgrows the database even after being sharded. In such
> > cases, the database shards will have to be rebalanced which means the
> > partitioning scheme changed and all existing data moved to new
> > locations. Doing this without incurring down time is extremely
> > difficult.
> >
> > I would really appreciate if somebody from drizzle community is willing
> > to share ideas regarding handling these issues.
>
> I am quite sure you are following my thread of discussion. Very
> likely, Stewart, and I have something there.
>
> --
> Anurag Priyam
> http://about.me/yeban/
>
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
