Re: A couple of couchdb questions.

Jan Lehnardt Tue, 24 Mar 2009 14:47:14 -0700

Hi Gary,


On 24 Mar 2009, at 22:14, Gary Smith wrote:

Hello,
I'm working to implement a document warehouse for about 15m documents. These documents range between 10kb - 500kb (legacy archived PDFs). Currently we do this by maintaining a mysql database with the document stored on a variety of servers (consisting of about 6TB). Most of the problems that we encounter are a) backups and b) physical access to the documents as they are on a private network. Since not much changes backups aren't really that much of a problem (but restoring is very slow). We are now looking to add documents to this regularly (about 10k per week). So we are looking to implement something new, or at least, more useful.
So we thought about using Amazon S3 for storage but these documents fall under HIPAA constraints so we have decided to do this in house.
Looking at couchdb, it pretty much does what we are looking to do. We really only want to store a document and maybe some very basic metadata (which we currently do by having both a PDF and a metadata file). Implementation doesn't look like a problem with the documented API.


This sounds like a good application.

So, the questions.
I would like to break this down into multiple servers and incorporate replication at the same time. The document says that pull is recommended over push but doesn't mention why.

Pull replication is faster, more reliable and is better at picking up failed replications again.

Does push replication require the slave (or other node) to accept the put/post request as completed?


Not sure what you mean here.

If we choose pull replication instead of push, I assume that this is something we will need to crontab out to schedule it, or does it have a background process that constantly syncs? API looks like just a single get request.


It's a POST request, but yeah, nothing happens automatically. See below.
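
For illustration, a pull replication is triggered by POSTing a small JSON body to the `_replicate` resource of the node that should receive the data. The following is only a sketch using the Python standard library; the host names and the database name `docs` are placeholders that do not come from this thread.

```python
import json
import urllib.request


def build_pull_replication(target_node, source_node, db):
    """Build (but do not send) the POST asking target_node to pull db from source_node."""
    body = json.dumps({
        "source": f"{source_node}/{db}",  # remote database to pull from
        "target": db,                     # local database on target_node
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{target_node}/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host names, for illustration only.
req = build_pull_replication(
    "http://couch-b.example:5984", "http://couch-a.example:5984", "docs"
)
# urllib.request.urlopen(req) would actually send the request.
```

Because nothing repeats the request on its own, a scheduled job (cron, for example) or the application itself has to issue it each time.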

Either way, here is what we are looking to do at this time. At two separate locations we will have multiple servers, set up in a master/master configuration. We should not run into any conflicts as updates are not allowed. IDs are unique (MD5 checksum and some other unique information).


Good.

We wanted to use 4 servers at each location, partly because each server has 4TB of space (actually 3TB of raid 5). Each server will hold files based upon the first digit of the MD5 checksum (0-3 on server A, 4-7 on server B, 8-A on server C, and B-F on server D). We were thinking of using Apache's URL rewriting to proxy the request to the proper server. This should work for get, put and post alike.

This sounds sensible. Other proxies can handle that as well, but if you are comfortable with Apache httpd, there is no reason not to use it.
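
If the application needs to apply the same partition rule itself, the mapping is small enough to sketch in a few lines of Python. This assumes the document ID begins with the hex MD5 digest (the thread only says the ID contains the checksum plus other information), and the function name is made up for the example.

```python
import hashlib


def backend_for(doc_id):
    """Return the server label (A-D) for an ID that starts with a hex MD5 digit.

    Ranges as described above: 0-3 -> A, 4-7 -> B, 8-A -> C, B-F -> D.
    """
    digit = int(doc_id[0], 16)  # accepts both upper and lower case hex
    if digit <= 0x3:
        return "A"
    if digit <= 0x7:
        return "B"
    if digit <= 0xA:
        return "C"
    return "D"


# Example: derive an ID from a file's content and find its server.
doc_id = hashlib.md5(b"hello").hexdigest()  # "5d41402a..."
server = backend_for(doc_id)                # "B"
```

Note that the ranges are not equal in size: server C covers three hex digits and server D covers five, so D would receive more documents than the others.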

We will also have the backup server at the second location (which will be the active for their location) using the same ideology.
What would be most useful is to be able to ensure that before a commit is accepted on a server we could guarantee that it has been replicated to a second box.
Any ideas or suggestions on that?

I think it is easiest when your application knows about the fact that there is no single server when doing a write. Reads can still be served through a single interface, but to make a proxy understand the replication protocol is probably not worth the effort.

Here's how I'd set it up: Your application sends a PUT request to your Apache proxy (or to the correct backend node directly), the document will end up on one of the four backend servers. Your application also knows about the backend nodes, the partition rules and can access them directly, in both locations. After the PUT request that creates the document completes in location 1 you send a POST request to the backend server in location 2 telling it to replicate from the node that the doc was written to in location 1 (issuing a pull replication). When this call returns successfully (if not, you just retry), you return to the original caller that wanted to save the document.
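
The flow above could be sketched roughly as follows. This is not a finished implementation: `put_doc` and `trigger_pull` are hypothetical callables the application would supply (the first performs the PUT in location 1, the second POSTs the pull replication to location 2), and the bounded retry count and delay are assumptions added for the example.

```python
import time


def save_with_replication(put_doc, trigger_pull, retries=3, delay=1.0):
    """Write a document, then wait until the remote pull replication succeeds.

    put_doc() is expected to raise on failure. trigger_pull() is expected to
    return True on success. Returns True only once replication is confirmed.
    """
    put_doc()
    for attempt in range(retries):
        try:
            if trigger_pull():
                return True  # safe to report success to the original caller
        except OSError:
            pass  # network error (urllib's URLError is an OSError): retry
        if attempt < retries - 1:
            time.sleep(delay)
    return False  # replication could not be confirmed; the caller decides
```

A False return means the document exists in location 1 only, so the application has to choose between accepting the write and reporting a failure.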


Downsides:

- You need to deal with the fact that the communication between backends might not work and decide what to do: save anyway (allowing reduced reliability and weak consistency) or deny the write (because the consistency and availability requirements can no longer be guaranteed).


 - Your backend nodes must be public.

In the future, CouchDB will be able to do the partitioning for you (solving the second downside). The first one can never be circumvented in a distributed system.


        
Cheers
Jan
--
