Hello,
I’m a bit confused about how CouchDB really works. I just launched Futon and
see that the indexer is busy working on a design document. I have almost a
million documents.
A few minutes later, I see three more tasks appearing, all belonging to
different design documents. No problem, except that the total counts all
differ:
- design doc 1: ~950,000
- design doc 2: ~450,000
- design doc 3: ~313,000
- design doc 4: ~85,000
Why are the total counts different? My understanding is/was that a database
holds N documents. Each indexing (map) function is passed a document and then
checks whether it’s the doc type it expects:
    function(doc) {
      if (doc.Type == "customer") {
        emit(doc._id, {LastName: doc.LastName, FirstName: doc.FirstName});
      }
    }
In the initial case, I was assuming that each view would have to go through
each document across the database and index its own doc_type. Basically, one
round for each design document for N total documents. For example, if the
database contains 100,000 documents and two design documents, there would be
two active tasks listed:
- _design/customers => index 100,000 documents
- _design/orders => index 100,000 documents
Later on, the indexing would be partial and the delta (say 9,000 docs) would
have to be reindexed by each view:
- _design/customers => index 9,000 documents
- _design/orders => index 9,000 documents
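To make that assumption concrete, here is a rough sketch in plain JavaScript. It is only my mental model, not how CouchDB actually indexes: buildIndex, the ordersMap function, the "order" type, and the sample documents are all made up for illustration, and emit is passed in as an argument here rather than being a global as it is in a real view.

```javascript
// Hypothetical sketch of the model I assumed. NOT CouchDB's real indexer.
// Each design document's map function is fed every document it is given.
function buildIndex(mapFn, docs) {
  const rows = [];
  let processed = 0; // how many documents this indexer had to look at
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of docs) {
    mapFn(doc, emit); // the map function decides whether to emit anything
    processed += 1;
  }
  return { rows, processed };
}

// Same shape as the view above, with emit passed in to stay self-contained.
const customersMap = (doc, emit) => {
  if (doc.Type == "customer") {
    emit(doc._id, { LastName: doc.LastName, FirstName: doc.FirstName });
  }
};

// Made-up second view for a made-up "order" doc type.
const ordersMap = (doc, emit) => {
  if (doc.Type == "order") {
    emit(doc._id, null);
  }
};

// Made-up sample data standing in for the database (or for a later delta).
const docs = [
  { _id: "c1", Type: "customer", LastName: "Doe", FirstName: "Jane" },
  { _id: "o1", Type: "order" },
  { _id: "o2", Type: "order" },
];

const customers = buildIndex(customersMap, docs);
const orders = buildIndex(ordersMap, docs);

console.log(customers.processed, customers.rows.length); // 3 1
console.log(orders.processed, orders.rows.length);       // 3 2
```

In this model, processed is the same for every design document even though each view emits a different number of rows.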
This doesn’t seem to be the case. I’d love to know how indexing really works.
Thanks!
— Tito