Hi, CouchDB devs.

I am writing to offer my time and services to the CouchDB project. I am specifically interested in the partitioning proposal set out here: http://wiki.apache.org/couchdb/Partitioning_proposal
I notice that it was last updated in September 2009.

First of all, though, I'll share my background and why I want to do this. Having completed a Masters in Software Engineering in the UK, I am about to commence work towards a PhD. My Masters project was fundamentally a performance and programming comparison between the MapReduce high-level languages Pig, Hive and JAQL. If you so wish, the paper can be found here: http://www.macs.hw.ac.uk/~rs46/publications.html

My PhD supervisor has previously worked on GpH (Glasgow Parallel Haskell), and I am in the first months of my PhD. He has urged me to look into Erlang implementations, suggesting CouchDB as an excellent example. Having looked into distributed processing in my dissertation (namely the MapReduce implementation found in Hadoop), the area of CouchDB I am most interested in is the partitioning of a database over multiple nodes in a cluster.

So, my first, naive, questions might be something like:

1. Is the problem described on the CouchDB wiki page (database partitioning) still a valid one?

2. If so, are there plans underway to solve it? I notice that there was a proposal for the Google Summer of Code in 2009 to provide a solution: http://socghop.appspot.com/document/show/user/rleeds/couchdb_cluster

3. If this is still an open problem for the CouchDB dev team, how would one get involved in the design of a partitioning architecture for CouchDB?

If this is no longer a valid problem to solve, I would remain keen to use CouchDB as a platform on which to develop a system as part of my PhD work. Is the "CouchDB proposals" page on the wiki still valid?

Any other comments or suggestions would be greatly appreciated at this early stage.

Regards,

Rob Stewart
http://www.macs.hw.ac.uk/~rs46/
