Hi all,

Buoyed by the response I got to my previous mail (Suggestions on View 
performance optimization/improvement), I am asking another question, this time 
about optimizing document lookups by _id.

Let us say we have a db containing a million documents, each with an _id 
generated by us [1.....1000000]. If we have to get all the documents one by one 
(assuming the search/lookup code will get random inputs from [1..1000000]), 
what would work best?

As of now what we are doing is a simple lookup like:

def getDocById(self, id):
    return self.db[id]

Doing a million lookups like this takes about 50-60 minutes on my laptop. 
Is there a better way of doing the same? I thought of fetching a bunch of keys 
in one go, caching them (LRU style), and checking the cache before hitting 
the db, but given that the input 'id' varies randomly across [1..1000000], 
that has not been a great success.
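For what it's worth, one direction I have been considering is fetching documents 
in batches rather than one GET per _id, e.g. by POSTing a list of keys to 
_all_docs with include_docs=true. A rough sketch of what I mean, using the same 
couchdb-python Database object as in the snippet above (batch size and helper 
names are just placeholders, not tested code):

```python
def chunked(ids, size):
    """Split a list of ids into batches of at most `size` each."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]

def get_docs_by_ids(db, ids, batch_size=1000):
    """Fetch many documents per HTTP round trip instead of one GET per _id.

    `db` is a couchdb-python Database object (as in the getDocById snippet).
    Passing keys=... to _all_docs should return all matching rows in one
    request; include_docs=True attaches the document bodies to the rows.
    """
    docs = {}
    for batch in chunked(ids, batch_size):
        for row in db.view('_all_docs', keys=batch, include_docs=True):
            docs[row.key] = row.doc
    return docs
```

This amortizes the per-request HTTP overhead over batch_size documents, at the 
cost of holding a batch in memory at once; I have not measured whether it beats 
single lookups on this dataset.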

Any thoughts? Ideas? Suggestions?

Environment details:
Couchdb - 0.9.0a757326
Erlang - 5.6.5
Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 
GNU/Linux
Ubuntu distribution
Centrino Dual core, 4GB RAM laptop

Thanks
Manju
