Woops, link to the repo: https://github.com/JensRantil/cushions
Cheers, Jens On Tue, Dec 3, 2013 at 12:47 AM, Jens Rantil <[email protected]> wrote: > Hi Vivek, > > I was considering using the Go programming language; It is easy to deploy, > highly concurrent, fairly easy to learn, has great support for HTTP and has > a fairly good JavaScript engine coming up[1]. Also, I've lately looked at > goraft[2], which could be useful for a high-availability implementation of > this. Are you familiar with Go? > > [1] https://github.com/robertkrimen/otto > [2] https://github.com/goraft/raft > > So far I haven't written a single line of code, but since there is > interest I just set up a Github repos[1]. I named it "cushions" in lack of > a better name (taken?). Feel free to send me your Github username if you > have one. > > As of API I'm thinking view querying and document API being identical to > CouchDB. The sharding API will be HTTP/ReST based just like CouchDB under > /_shards/ or similar. > > Cheers, > Jens > > On 2 dec 2013, at 02:58, Vivek Pathak <[email protected]> wrote: > > Hi Jens, > > Quick question, what is the language/environment you planned to use for > the feature. > > On some thinking, this looks like something which might be a useful to me. > If your work is public domain, and if I have the skills on your > environment, I would like to contribute. > > Thanks > Vivek > > > On 11/28/13, 11:59 PM, Vivek Pathak wrote: > > > Hi > > Just curious when you say this is a problem: > > 2. Inconsistent views; While transferring documents from one shard to > another there will be a moment when a document resides on two databases. > > Are you talking about the view that shows which document do not belong - > or you are talking about general views? > > If you mean general view, one possibility is to specifically ignore all > documents in the wrong shard. This will give you a stale view - but that > is exactly what you get if you use stale=ok etc in single db case as well. > > Thanks > Vivek > > > > On 11/28/2013 06:28 PM, Jens Rantil wrote: > > Hey couchers, > > For the past week, I've been playing with the thought of implementing a > sharding orchestrator for CouchDB instances. My use case is that I have a > lot of data/documents that can't be stored on a single machine and I'd like > to be able to query the data with fairly low latencies. Oh, there's a big > write throughput constantly, too. > > Querying could be done by rereducing the view results coming in from the > various shards. AFAIK, this is how BigCouch does it too. > > Now, the rebalancing act of all this is pretty straight forward; each shard > is given a generated view that displays all documents that don't belong > there in the shard (and thus need to be transferred). The orchestrator will > then periodically check if any documents should be transferred to another > server, replicate those documents and purge* them on the source server. > > * Purging since I am not interested in keeping them on disk on the source > database anymore, not even tombstones. All the document data should only > reside on the new shard. > > My issue with the described solution is that there will be a moment when > views will be showing incorrect information. > > As I see it, there are two problems that needs to be solved with this > solution: > > 1. Purging of documents: According to the purge > documentation< > http://docs.couchdb.org/en/latest/api/database/misc.html#post--db-_purge>, > > purging documents will require rebuilding entire views. That ain't good. > 2. Inconsistent views; While transferring documents from one shard to > another there will be a moment when a document resides on two databases. > > 1) could possibly be solved by introducing an replica database of the > source database so that views could be recreated offline. However, this > makes replication way more tedious than I initially expected. 2) could be > solved by simply locking view reads until the inconsistency is fixed. But > hey, we're not bug fans of that kind of global locking. Possibly patching > all map-functions in the views would also fix the issue of double > documents. At the same time, that feels kind of like a hack. > > Now, I know BigCouch is around the corner. How does it cope with the above > issues? Maybe it's introduced some new magic for this that currently is not > in CouchDB? I've been trying to Google this, but so far haven't found too > much. Feel free to point me in the right direction if you know of any > material. > > Cheers, > Jens > > > > >
