Thanks for all the previous help. For both parts below, each document contains about 30 metadata fields plus primary content of about 5K to 10K.
The goal is to prove the feasibility of moving all our syndication services to a common platform that provides rapid customization for customer-specific syndication feeds.

Part 1
--------
I've already done a successful proof-of-concept with 100K documents, with no optimization. A couple of things I noticed:

Environment: my laptop (a recent, loaded MacBook Pro - 2.93 GHz Intel Core 2 Duo, 8 GB memory)
- Loading 100K docs took about 1 hour
- Creating a single view with 'emit([single key], doc)' took about 1 hour
- The log indicated view checkpoints every 30 sequence numbers or so.

Part 2
-------
I'm about to do a volume test of about 2 million documents.

Primary load
----------------
I will be loading in batches of about 1000 documents, using three separate Unix servers on a local network:
- One for couchdb instance
- One for feeder process
- One for database

View definition
------------------
I have two views defined, without any reduce functions.

Questions for Part 2
-------------------------
Firstly, any thoughts or hints on my larger benchmark (Part 2)?

Is it naive to hope to speed up the first creation of the view by using map functions of the form 'emit([key], null)' and then using 'include_docs' on queries?

Is there any way to control the checkpointing of views when creating the view for the first time? I'm guessing I'm looking at many hours to create a single view on 2 million documents.

Any help would be appreciated.

Regards,
Tracy
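
P.S. To make the map-function question concrete, here is a minimal sketch of the two forms being compared. The field name 'key', the sample document, and the 'emit' stub are placeholders for illustration only (inside CouchDB, 'emit' is supplied by the view server), and these are not the real view definitions. It only shows roughly how much data each form writes per row; it says nothing about actual build times.

```javascript
// Hypothetical sketch: `key` is a placeholder field name, and `emit` is
// stubbed here so the map functions can run outside CouchDB.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Form used in Part 1: the whole document is copied into the view index.
function mapWithDoc(doc) {
  emit([doc.key], doc);
}

// Proposed form: only the key is stored; the document would be fetched at
// query time by adding include_docs=true to the view request.
function mapKeyOnly(doc) {
  emit([doc.key], null);
}

// Rough size (in JSON characters) of what one map call emits for one document.
function emittedBytes(mapFn, doc) {
  rows.length = 0; // reset the stubbed index between runs
  mapFn(doc);
  return JSON.stringify(rows).length;
}

// Made-up document with about 5K of primary content.
const sampleDoc = { _id: "doc-1", key: "customer-a", content: "x".repeat(5000) };

console.log("emit with doc:", emittedBytes(mapWithDoc, sampleDoc));
console.log("emit key only:", emittedBytes(mapKeyOnly, sampleDoc));
```

With the key-only form the emitted row stays small regardless of document size, which is why I'm hoping it shortens the initial build, at the cost of an extra document lookup per row at query time.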
