Thanks for all the previous help.

For both parts, each document contains about 30 metadata fields plus primary 
content of about 5K to 10K.

The goal is to prove the feasibility of moving all our syndication services 
to a common platform that allows rapid customization of customer-specific 
syndication feeds.

Part 1
--------

I've already done a successful proof-of-concept with 100K documents.

No optimization was done.

A couple of things I noticed:

Environment: my laptop (a recent, loaded MacBook Pro, 2.93 GHz Intel Core 2 
Duo, 8 GB memory)

- Loading the 100K docs took about 1 hour.
- Creating a single view with 'emit([single key], doc)' took about 1 hour.
- The log indicated view checkpoints every 30 sequence numbers or so.

Part 2
-------

I'm about to do a volume test of about 2 million documents.

Primary load
----------------

I will be running in batches of about 1000 documents.
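
A minimal sketch of the batching, assuming the feeder posts each batch to 
CouchDB's _bulk_docs endpoint (an assumption; the load mechanism is not fixed 
above). The helper names toBatches and bulkDocsBody and the placeholder 
documents are hypothetical, and no network call is made here:

```javascript
// Split an array of documents into batches of at most `size` items.
function toBatches(docs, size) {
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// Build the JSON request body that _bulk_docs expects: {"docs": [...]}.
function bulkDocsBody(batch) {
  return JSON.stringify({ docs: batch });
}

// Example: 2500 placeholder documents become batches of 1000, 1000 and 500.
const sampleDocs = Array.from({ length: 2500 }, (_, i) => ({ _id: "doc-" + i }));
const batches = toBatches(sampleDocs, 1000);
console.log(batches.map((b) => b.length));
```

Each body would then be sent as one POST to the database's _bulk_docs URL, so 
2 million documents would mean roughly 2000 requests at this batch size.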

Three separate Unix servers on a local network:

- One for the CouchDB instance
- One for the feeder process
- One for the database

View definition
------------------

I have two views defined, without any reduce functions.

Questions for Part 2
-------------------------

Firstly, any thoughts or hints on my larger benchmark (Part 2)?

Is it naive to hope to speed up the first creation of the view by using map 
functions of the form 'emit([key],null)' and then using 'include_docs' on 
queries?
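
To make the question concrete, here is a minimal sketch of the kind of map 
function meant, assuming a hypothetical "customer" field as the key. The local 
emit() and rows array are stand-ins so the function can be exercised outside 
CouchDB; inside a design document only the map function itself would appear:

```javascript
// Local stand-in for CouchDB's emit(), collecting rows for inspection.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function as it might appear in the design document.
// "customer" is a hypothetical field name used only for illustration.
function map(doc) {
  if (doc.customer) {
    // Emit the key with a null value so the document body is not
    // copied into the view index.
    emit([doc.customer], null);
  }
}

// Exercise the map function with two placeholder documents.
map({ _id: "a", customer: "acme", body: "primary content" });
map({ _id: "b", body: "no customer field" });
console.log(JSON.stringify(rows));
```

A query against such a view would then add include_docs=true to the view URL, 
so that the document body is fetched at query time rather than stored in the 
index.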

Is there any way to control the checkpointing of views when creating the view 
for the first time? I'm guessing I'm looking at many hours to create a single 
view on 2 million documents.
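
As a rough sanity check on that guess, assuming view build time scales 
linearly with document count (an assumption; it may not hold at this volume 
or on different hardware), the Part 1 timing extrapolates as follows:

```javascript
// Measured in Part 1: building one view over 100K documents took about 1 hour.
const sampleDocs = 100000;
const sampleHours = 1;

// Planned volume for Part 2.
const targetDocs = 2000000;

// Linear extrapolation (assumption: build time is proportional to doc count).
const estimatedHours = (targetDocs / sampleDocs) * sampleHours;

console.log(estimatedHours + " hours per view"); // prints "20 hours per view"
```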


Any help would be appreciated.

Regards,

Tracy
