Hi all,
Buoyed by the response I got to my previous mail (Suggestions on View
performance optimization/improvement), I am asking another question, this time
about optimizing document lookup by _id.
Let us say we have a db containing a million documents, each with an _id
generated by us [1.....1000000]. If we have to get all the documents one by one
(assuming the lookup code will get random inputs from [1..1000000]), what would
work best?
As of now what we are doing is a simple lookup like:

def getDocById(self, id):
    return self.db[id]
Doing a million lookups this way takes about 50-60 minutes on my laptop.
Is there a better way of doing the same? I thought of fetching a bunch of keys
in one go, caching them (LRU style), and checking the cache before hitting the
db, but given that the input 'id' varies randomly over [1..1000000], it has not
been a great success.
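One direction worth measuring is cutting the number of HTTP round trips by
fetching documents in batches rather than one at a time. Below is a minimal
sketch of the batching logic; the names getDocsBulk and chunked are made up
here, and the commented-out _all_docs call is how I understand the couchdb-python
client exposes bulk reads (worth checking against your client's docs). The
sketch uses plain dict lookups as a stand-in so the chunking itself is easy to
test in isolation:

```python
from itertools import islice

def chunked(ids, size):
    """Yield successive lists of at most `size` ids."""
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def getDocsBulk(db, ids, batch_size=1000):
    """Fetch many documents in batches instead of one request per id.

    Assumption: `db` is a couchdb-python Database. With that client, each
    batch could be a single POST to _all_docs, roughly:
        rows = db.view('_all_docs', keys=batch, include_docs=True)
        docs.update((row.id, row.doc) for row in rows)
    Here a per-id lookup stands in for that call so the sketch is
    self-contained.
    """
    docs = {}
    for batch in chunked(ids, batch_size):
        docs.update((i, db[i]) for i in batch)  # stand-in for one bulk request
    return docs
```

Even if the random access order prevents useful caching, batching amortizes the
per-request overhead (HTTP parsing, socket round trip) over many documents,
which is usually where most of the 50-60 minutes goes.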
Any thoughts? Ideas? Suggestions?
Environment details:
Couchdb - 0.9.0a757326
Erlang - 5.6.5
Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686
GNU/Linux
Ubuntu distribution
Centrino Dual core, 4GB RAM laptop
Thanks
Manju