Hello,

I’m a bit confused about how CouchDB really works. I just launched Futon and 
see that the indexer is busy working on a design document. I have almost a 
million documents.

A few minutes later, I see three more tasks appearing, all belonging to 
different design documents. No problem, except that the total counts are all 
different:

- design doc 1: ~950,000
- design doc 2: ~450,000
- design doc 3: ~313,000
- design doc 4: ~85,000

Why are the total counts different? My understanding is/was that a database 
holds N documents. Each indexing function is passed a document and then 
checks whether it has the doc_type it expects:

function(doc) {
    if (doc.Type == "customer") {
        emit(doc._id, {LastName: doc.LastName, FirstName: doc.FirstName});
    }
}

In the initial case, I was assuming that each view would have to go through 
each document across the database and index its own doc_type. Basically, one 
round for each design document for N total documents. For example, if the 
database contains 100,000 documents and two design documents, there would be 
two active tasks listed:

- _design/customers => index 100,000 documents
- _design/orders => index 100,000 documents

Later on, the indexing would be partial and the delta (say 9,000 docs) would 
have to be reindexed by each view:

- _design/customers => index 9,000 documents
- _design/orders => index 9,000 documents
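
To make my assumption concrete, here is a minimal sketch in plain JavaScript. 
It only illustrates the model I describe above and is not how CouchDB's 
indexer is actually implemented. buildIndex, the sample documents and the 
"order" map function are made up for illustration, and emit is passed in as a 
parameter here rather than being the global that CouchDB provides:

```javascript
// Sketch of my assumed model only; not CouchDB's actual indexer.
// buildIndex and the sample documents are hypothetical.
function buildIndex(docs, mapFn) {
    var rows = [];
    var seen = 0;
    // Stand-in for CouchDB's emit(): collect the emitted key/value pairs.
    function emit(key, value) {
        rows.push({key: key, value: value});
    }
    docs.forEach(function (doc) {
        seen += 1;          // every document is visited, whatever its Type
        mapFn(doc, emit);
    });
    return {seen: seen, rows: rows};
}

var docs = [
    {_id: "c1", Type: "customer", LastName: "Doe", FirstName: "Jane"},
    {_id: "o1", Type: "order", Total: 10},
    {_id: "o2", Type: "order", Total: 25}
];

// One full pass over the same documents per design document.
var customers = buildIndex(docs, function (doc, emit) {
    if (doc.Type == "customer") {
        emit(doc._id, {LastName: doc.LastName, FirstName: doc.FirstName});
    }
});
var orders = buildIndex(docs, function (doc, emit) {
    if (doc.Type == "order") {
        emit(doc._id, doc.Total);
    }
});
```

Under this model, both passes visit the same number of documents (seen), and 
only the number of emitted rows differs. That is why I expected every active 
task to report the same total.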

This doesn’t seem to be the case. I’d love to know how indexing really works.

Thanks!

— Tito
