How many docs is that, and have you run the view incrementally? First-time index builds are painful...
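In case it helps, here is a rough sketch of how you could check the doc count and keep the index warm using the same couchdb Python module you are already using (the database name and the design_doc/view name below are placeholders, adjust them for your setup):

    import couchdb

    # connect to the local CouchDB instance (adjust the URL for your setup)
    server = couchdb.Server('http://localhost:5984')
    db = server['maillogs']          # placeholder database name

    # doc_count is roughly how many documents a full view build has to scan
    print('doc_count: %s' % db.info()['doc_count'])

    # querying the view (even with limit=1) triggers the index build; doing
    # this after each batch of inserts keeps every incremental build small
    for row in db.view('logs/by_ip', limit=1):   # placeholder design_doc/view
        print('%s -> %s' % (row.key, row.value))

If you run something like that after each batch of writes, the view only has to catch up on the new documents instead of grinding through the whole database in one go.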
On Wed, Oct 21, 2009 at 9:46 AM, Rajkumar S <[email protected]> wrote:
> Hello,
>
> I am using CouchDB 0.9.0 for storing logs from my mail server.
>
> Logs are sent from the mail servers to a RabbitMQ queue server. Log
> insertions into CouchDB are done by a Python program, after fetching them
> from RabbitMQ and converting them to JSON, using the couchdb module (from
> couchdb import *). I have a single document storing the entire history of
> each email transaction. I also have multiple RabbitMQ clients, each
> pulling from the same queue and updating the same CouchDB. This means I
> have to update the same document from different clients several times
> during the lifetime of an email message.
>
> To do this I use the message id of each mail transaction as its key
> (this appears in every log entry). When the first log entry arrives I
> check if a doc with that key is present in the db; if not, I create a new
> doc with that key. When the second log arrives I fetch the doc, convert
> it to a hash table in my program, merge the new log entry into the
> hash table, and update the doc with the updated hash table's JSON. If a
> conflict occurs, the program retries, fetching the doc, updating it,
> and storing it again until the conflict is resolved.
>
> This means for every write there is a corresponding read.
>
> Currently I am running it as a pilot and have just a single server
> logging to CouchDB. I have about 0.75 GB per day right now, with
> GETs/PUTs happening almost continuously (say 1-2 per second).
> Previously I had a test server running, and I tested a couple of map
> reduce views using that DB (about 5 MB).
>
> Now, after logging from a single production machine, I have not been able
> to run a single view so far. I get the following error if I wait long
> enough:
>
> Error: case_clause
>
> {{bad_return_value,{os_process_error,"OS process timed out."}},
>  {gen_server,call,
>   [<0.436.0>,
>    {prompt,[<<"rereduce">>,
>     [<<"function(keys, values)\n{\n return values;\n}">>],.....
>
> I have changed os_process_timeout to 50000 and removed the reduce part,
> but even after about 6 hours my map is not yet finished. Currently the
> db size is 3.6 GB.
>
> The map function I am using is:
>
> function(doc) {
>   if ("msgtype" in doc) {
>     if (doc.msgtype == "allow") {
>       if ((doc.event == "action_allowed_ip") || (doc.event == "action_allow_new")) {
>         var result = {};
>         var ip = doc.parameters.client_address;
>         result["helo"] = doc.parameters.helo_name;
>         result["event"] = doc.event;
>         result["timestamp"] = doc.timestamp;
>         result["id"] = doc._id;
>         result["from"] = doc.parameters.sender;
>         result["to"] = doc.parameters.recipient;
>         emit(ip, result);
>       }
>     }
>   }
> }
>
> Top shows that couchjs is the most active process; it shows the
> following line right now:
>
> 11410 root 20 0 90752 27m 752 R 76 0.7 1235:05 couchjs
>
> My hardware is an Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz, 4 GB RAM,
> and one SATA hard disk. I do not think this is the expected
> performance of CouchDB, so is there something I am doing wrong? Any
> tips to enhance the performance to acceptable levels?
>
> thanks and much regards,
>
> raj
>
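For anyone following along, the fetch-merge-retry loop described above looks roughly like this with the couchdb module (the database name, field names, and the merge step are placeholders for whatever your log structure needs, and the exact location of the conflict exception can differ between couchdb-python versions):

    import couchdb
    # exception class name assumed; it may live in couchdb.client or
    # couchdb.http depending on your couchdb-python version
    from couchdb import ResourceConflict

    server = couchdb.Server('http://localhost:5984')
    db = server['maillogs']                # placeholder database name

    def merge_log_entry(msg_id, entry):
        # fetch the doc keyed by the message id, merge the new log entry,
        # and retry the save until this update wins (optimistic concurrency)
        while True:
            doc = db.get(msg_id) or {'_id': msg_id}
            doc.setdefault('events', []).append(entry)   # placeholder merge
            try:
                db[msg_id] = doc        # PUT; raises on a _rev conflict
                return
            except ResourceConflict:
                # another client updated the doc first; re-read and merge again
                continue

Note that every retry is another GET plus PUT on top of the read-per-write you already have, so with several clients racing on the same message id the traffic adds up quickly.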
