On Jul 31, 2010, at 10:45 AM, Simon Woodhead wrote:

> Hi folks,
>
> First off: thanks! CouchDB is something I found out about by accident by
> virtue of my g/f breaking her toe but that's another story, although I'm
> really glad she did. I've been reading about it lots and have finally
> deployed it in a real-world trial today.
>
> We store gazillions of small records. We already have them in JSON as they
> flow through our queuing system. Along the way we take key bits of that info
> into an RDBMS for main use but the end result is we need to keep the rest
> somewhere for easy access. We tried S3 but it was too slow, we tried
> SimpleDB but hit the limits in hours so we've finally bitten the bullet and
> are trying CouchDB. So far so good - we write with our unique id as the _id
> which makes retrieval superbly easy. We haven't got into views but don't need
> to for this application - just lots of parallel writes and a very occasional
> retrieval.
>
> However, in a little over an hour of the trial we have a 2GB database and it
> is growing quickly. This is no great surprise as there is an awful lot of
> data getting pumped in here - in raw JSON it amounts to 10-15GB per day. So
> my question to the list is: are there any approved methods of archiving?
> Sharding seems unnecessary since one node more than handles our read/write
> requirements, but it is going to need a tonne of storage. We'll also be
> replicating between sites so the total requirement will be doubled.
> Presently it is running as a VM with storage on the SAN so any usage is
> expensive.
>
> One idea I have had is to name the database to include the date and then
> databases above a certain age could be detached and compressed somewhere.
> Does that sound workable and is there an approved method for detaching? It
> looks like we could just move the file without any adverse consequences but
> I wanted to check! And re-attaching?

This is what I would suggest. The 1.x line will be binary compatible, so you should have no trouble re-activating stored databases.

> So far, CouchDB is looking like a dream come true and I'm very sure we're
> going to move other applications to it as we find our way around. I have no
> doubt we're going to be a relatively large deployment when we find our feet
> with it so thanks again to all involved.

If you hit walls, please ask here before getting frustrated. There is a lot of experience with very large scale CouchDB, so folks will be able to help.

Chris

> cheers,
> Simon
>
> --
> Simon Woodhead FCSI
> Managing Director
> Simwood eSMS Limited
> <http://www.simwood.com>
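
For anyone reading this thread later, here is a rough sketch of the date-named database idea discussed above. It only covers the naming and the "which databases are old enough to archive" decision; it does not talk to CouchDB or move any files. The `records_` prefix, the `YYYY_MM_DD` suffix, the 30-day retention, and all function names are assumptions made up for illustration, not anything from Simon's setup. The list of names would in practice come from somewhere like CouchDB's `GET /_all_dbs`.

```python
from datetime import date, timedelta

# Hypothetical naming scheme: one database per day, e.g. "records_2010_07_31".
# Only lowercase letters, digits and underscores are used, which CouchDB
# accepts in database names.
PREFIX = "records_"


def db_name_for(day):
    """Return the name of the database that holds records written on `day`."""
    return PREFIX + day.strftime("%Y_%m_%d")


def day_of(db_name):
    """Parse the date back out of a database name, or return None if the
    name does not follow the hypothetical daily naming scheme."""
    if not db_name.startswith(PREFIX):
        return None
    try:
        year, month, day = db_name[len(PREFIX):].split("_")
        return date(int(year), int(month), int(day))
    except ValueError:
        # Wrong number of parts, non-numeric parts, or an impossible date.
        return None


def archivable(db_names, today, keep_days=30):
    """Return the daily databases strictly older than `keep_days` days,
    sorted by name (which is also chronological with this scheme)."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in db_names:
        day = day_of(name)
        if day is not None and day < cutoff:
            old.append(name)
    return sorted(old)
```

Writers would call `db_name_for(date.today())` to pick the target database, and a periodic job would pass the current database list to `archivable(...)` to decide which ones to detach and compress.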
