My apologies for the overloaded use of the term "incubation". I realize it has a special meaning for Apache projects. My bad.
Thanks for all of the quick responses. It's a sign of a well-run project. I will keep my eye on the progress of CouchDB. Hopefully, it will rapidly reach the scalability point that I am looking for.

Jonathan

-----Original Message-----
From: Jan Lehnardt [mailto:[EMAIL PROTECTED]
Sent: Monday, November 03, 2008 8:50 AM
To: couchdb-user@incubator.apache.org
Subject: Re: Largest CouchDB dbs?

On Nov 3, 2008, at 14:40, Jonathan Ginter wrote:

> From what I have read, it sounds like the project is not yet ready to
> scale this large, but there are plans in place to do so (faster view
> parsers, partitioning, etc). Is there a rough target for this work? We
> have a roadmap for upcoming projects and I need to know whether CouchDB
> can be considered for the short term (i.e., within the next 4 - 6
> months) or whether we will have to give it more time to incubate and
> come back to it later on in the longer term.

No ETA, but feel free to sponsor development :)

The two biggest boosts for view generation are (as you correctly identified) JSON serialisation on the Erlang-end and actually making use of MapReduce's parallel nature. At the moment, view creation is single-threaded and limited to a single core on your system.

Just to avoid potential misunderstanding: Incubation is the process of becoming an Apache project. It has nothing to do with the software development roadmap.

Cheers
Jan
--

> Jonathan
>
> -----Original Message-----
> From: Damien Katz [mailto:[EMAIL PROTECTED]
> Sent: Monday, November 03, 2008 6:00 AM
> To: couchdb-user@incubator.apache.org
> Subject: Re: Largest CouchDB dbs?
>
> On Nov 3, 2008, at 4:38 AM, Jan Lehnardt wrote:
>
>> On Nov 3, 2008, at 05:53, Jonathan Ginter wrote:
>>
>>> I have a similar issue. I am interested in using CouchDB to host a
>>> 200+ GB database that will receive well over 200 million documents
>>> per day. Moreover, the data must roll out - i.e., constant
>>> background purging - and also support UI queries. And this is just
>>> a starting point to match the abilities of the relational database
>>> we are already running. I will want the DB to scale up from there.
>>>
>>> If there is no hope of the CouchDB being able to handle all of that
>>> - regardless of how many machines we deploy - I would like to know
>>> that now before I look any further into this project.
>>>
>>> Does anyone have a reasonable idea about whether CouchDB will be
>>> capable of such massive scalability or how many machines it would
>>> take to scale that large?
>>
>> This sounds like a scenario that CouchDB will ultimately be able to
>> handle nicely. I don't think we can give out any guarantees about
>> when and how this will be the case. Maintaining a 200+GB data set
>> would require quite some hand-wiring at the moment.
>>
>>> I would appreciate any feedback that anyone might have on this.
>>
>> I think Damien can chime in here :) Damien?
>
> This is definitely well within what couchdb should be able to do once
> partitioning is in place. I'm not really working on this yet, but
> there are a lot of people and companies interested in seeing the
> partitioning work done. So maybe some progress will be made soon.
>
> -Damien