Bugs item #2801629, was opened at 2009-06-05 11:50
Message generated for change (Comment added) made by boncz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2801629&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John van Schie (johnvanschie)
Assigned to: Peter Boncz (boncz)
Summary: Mserver sends no response, client hangs.

Initial Comment:
We are using MonetDB4, Feb2009-SP2, built from source (build log attached) with 64-bit OIDs. The platform is Fedora Core 10, with 64 GB RAM (55 GB cached) and 99 GB free disk space for the dbfarm.

Our application uses mclient to manipulate XML in the database. This morning, I saw an mclient process that had not terminated and was still running after approx. 12 hours. Running strace on the mclient process shows that the process is waiting on blocking I/O.

Executing a fresh mclient with the query "1+1" results in a stalled application that also waits on blocking I/O (see attached strace). It seems that Mserver has stopped sending data to the clients.

To debug this problem, I've generated stack traces of all threads of the Mserver and a list of open files (lsof). The server is still running, and I plan to keep it running until no more information is needed.

Unfortunately, the server is built with optimisation enabled, and I cannot share the data, as it is confidential.

----------------------------------------------------------------------

>Comment By: Peter Boncz (boncz)
Date: 2009-06-09 17:08

Message:
Hi John,

Thanks for the bug report. It is hard to say what has happened. It could be that the so-called short lock, which was apparently taken by an interpreter thread, is never released. In the end, this blocks all incoming queries.

The most useful info you attached is the gdb trace. However, it would really help if you could send me the last (~12) queries that went into the server. It would already be good to know whether these are read-only, document management (add_doc/del_doc) or update queries. There appears to be at least one update query there.

Another possible cause of deadlocks is sometimes bad error handling. Therefore, if there have been any error messages coming out of that MonetDB instance, that would also be great to know.

I will try to keep thinking, but if there is any additional information that you can share, it would greatly help the chances of finding a solution/fix.

thanks,
Peter

----------------------------------------------------------------------

Comment By: Sjoerd Mullender (sjoerd)
Date: 2009-06-08 18:04

Message:
This looks like a classic deadlock situation:
thread 23 is waiting for a lock in pflock_trycommit,
thread 21 is waiting for a lock in pflock_end,
threads 11, 9, 8, 6, 5, 4, 3, 2 are waiting for a lock in pflock_begin,
threads 19, 7 are waiting for a lock in set_lock.

What might be the case is that all but one of the threads waiting in pflock_begin, together with the two threads waiting in pflock_trycommit and pflock_end, are waiting for the same lock (PF_META_LOCK), which might be held by that one remaining pflock_begin thread, which itself could be waiting for another lock (PF_SHORT_LOCK). Perhaps one of the two threads waiting in set_lock holds PF_SHORT_LOCK and is itself waiting for yet another lock.

In any case, this seems an area where Peter has the expertise.

----------------------------------------------------------------------

_______________________________________________
Monetdb-bugs mailing list
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs
