Hi Folks,

I am *Abhishek*, a *senior undergraduate* student at BIT Mesra, India. I would
like to contribute to Drizzle as a Google Summer of Code student this summer.
I went through the list of ideas and am particularly interested in the
*"libdrizzle native sharding"* project.

I have solid knowledge of and experience with C/C++ and computer networking.
Here are some achievements that I would like to cite to show that I have the
skill set the project requires:

   - Yellow-rated coder *(rating ~1700)* on *Topcoder*.
   - Extensive knowledge of the STL and Linux system administration.
   - Among the *top 150 coders* in the world on *Project Euler* (solved 200+
   problems).
   - Ranked *11th nationwide* in a Virtual Private Network based,
   *Capture The Flag* style hacking contest.
   - I have successfully *simulated the Secured Zone Routing Protocol* using
   the ns-2 simulator and will be publishing a research paper on it soon.
   - I have experience working with open-source communities. Last year,
   through GSoC, I got the chance to work with the *Mapbender Project*.
   - I also have a *good understanding of database internals*, and I have
   taken courses such as Distributed Systems, Compilers and Operating
   Systems, so my fundamentals are clear.

Earlier, *Andrew Hutchings* (irc-nick: LinuxJedi) was listed on the wiki page
as the mentor for this project. Since he is no longer available for it, I
would like to *request somebody from the Drizzle community to mentor the
"libdrizzle native sharding" project*.

I have already started work on this project. I have my *development
environment ready* and am *working on fixing some of the bugs*.

For this project I was thinking of implementing a plugin. This plugin would
basically serve two purposes: (a) Shard Selection and (b) Shard Resolution.
Typically, when talking about database sharding, there are a couple of
decisions an implementation needs to make:

How do we assign a shard to a new user?  (Shard Selection)
How do we resolve the shard that a current object lives in? (Shard
Resolution)

The plugin will use what we call an Index database, which holds a minimum
of two tables. The first, which the plugin creates, contains a record
for every shard in the system, along with the shard's capacity and its
current usage. This table is queried every time a new user is created within
the system. We assign the new user to the shard with the lowest usage to
capacity ratio. This allows shards to be located on different types of
hardware that should take a smaller or larger number of users. The second
table is supplied by the application using the plugin and maps each
user to the shard to be used. Whenever a request for an object begins,
the application should query this table, retrieve the shard to use,
and pass it to the plugin, which switches to that database.
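
To make the two steps concrete, here is a rough sketch in Python. It is only
an illustration under my own assumptions, not the actual plugin: the names
Shard, ShardIndex, select_shard and resolve_shard are hypothetical, and plain
in-memory dictionaries stand in for the two Index database tables.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """One row of the (hypothetical) plugin-owned shard table."""
    name: str
    capacity: int   # number of users this shard's hardware should take
    usage: int = 0  # number of users currently assigned to the shard


class ShardIndex:
    """In-memory stand-in for the Index database and its two tables."""

    def __init__(self, shards):
        # Table 1 (created by the plugin): one record per shard.
        self.shards = {s.name: s for s in shards}
        # Table 2 (supplied by the application): user -> shard name.
        self.user_to_shard = {}

    def select_shard(self, user_id):
        """Shard Selection: pick the shard with the lowest usage/capacity."""
        if user_id in self.user_to_shard:
            raise ValueError(f"user {user_id!r} already has a shard")
        # Skip shards with no capacity to avoid dividing by zero.
        candidates = [s for s in self.shards.values() if s.capacity > 0]
        if not candidates:
            raise RuntimeError("no shard with positive capacity")
        best = min(candidates, key=lambda s: s.usage / s.capacity)
        best.usage += 1
        self.user_to_shard[user_id] = best.name
        return best.name

    def resolve_shard(self, user_id):
        """Shard Resolution: look up the shard an existing user lives on."""
        return self.user_to_shard[user_id]


if __name__ == "__main__":
    index = ShardIndex([Shard("shard_a", capacity=100),
                        Shard("shard_b", capacity=200)])
    for user in ("alice", "bob", "carol"):
        print(user, "->", index.select_shard(user))
    print("carol lives on", index.resolve_shard("carol"))
```

Because selection compares ratios rather than raw counts, a shard with twice
the capacity ends up with roughly twice the users, which is the point of
storing a capacity per shard.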

Yesterday I had a discussion with Stewart Smith (irc-nick: stewart) about
this project, and he gave me some ideas regarding *libdrizzle re-sharding*,
i.e. redistribution of the data across the shards (either to *achieve proper
load balancing* or to *satisfy application invariants*). Should I also
describe how I am thinking of implementing libdrizzle re-sharding in my GSoC
application?

I look forward to a response from the community.

Abhishek Kumar Singh
http://www.mapbender.org/User:Abhishek
BE/1349/2007
Information Technology
8th SEMESTER
BIT MESRA

irc: sin8h (irc.freenode.net)
skype: singhabhishek.bit
mobile: +91-8002111189
