Attached please find an instrumented framework\pull-agent\src\main\java\org\apache\manifoldcf\crawler\system\ResetManager.java class. Please rebuild with this class, cause the hang, and capture standard out so I can see it.
Thanks!
Karl

On Thu, Jul 7, 2011 at 2:12 PM, Karl Wright <[email protected]> wrote:
> Thanks. I maybe can send you an instrumented ResetManager class later
> today, if you are in a position to rebuild MCF and try this again.
>
> Karl
>
> On Thu, Jul 7, 2011 at 2:06 PM, Farzad Valad <[email protected]> wrote:
>> I'm attaching the current thread dump file that goes with the log file. It
>> is easy to recreate: just cause an insert failure due to a size mismatch
>> between the column and value, where the value can't fit. More than happy
>> to test and help out.
>>
>> On 7/6/2011 2:44 PM, Farzad Valad wrote:
>>>
>>> You are right, it was db error. In this case I tried to insert a value
>>> larger than the column size and the insert failed. I'll grab the log next
>>> time too, but unfortunately deleted and running another test with a larger
>>> column. As soon as it finishes or errors, I'll reproduce this one again and
>>> send you the stack trace.
>>>
>>> On 7/6/2011 2:36 PM, Karl Wright wrote:
>>>>
>>>> I have seen this before. The critical traceback, which you see for
>>>> ALL the worker threads, is:
>>>>
>>>> "Worker thread '36'" daemon prio=6 tid=0x00000000077ed000 nid=0xa98 in Object.wait() [0x000000000b1af000]
>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>         at java.lang.Object.wait(Native Method)
>>>>         at java.lang.Object.wait(Object.java:485)
>>>>         at org.apache.manifoldcf.crawler.system.ResetManager.waitForReset(ResetManager.java:107)
>>>>         - locked <0x00000000e0005528> (a org.apache.manifoldcf.crawler.system.WorkerResetManager)
>>>>         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:110)
>>>>
>>>> ManifoldCF has code in it for dealing with database errors that
>>>> requires all worker threads to be brought into the same state. This
>>>> code has never worked properly, and I've never been able to figure out
>>>> why. But the underlying problem is that you've had a database error
>>>> of some kind which requires a reset. This is usually a connection
>>>> error.
>>>>
>>>> Can you look at manifoldcf.log and send the last stack trace in it?
>>>> It could be too short a connection lifetime in either the manifoldcf
>>>> configuration or in the postgresql configuration.
>>>>
>>>> Karl
>>>>
>>>> On Wed, Jul 6, 2011 at 3:27 PM, Farzad Valad <[email protected]> wrote:
>>>>>
>>>>> So this time I went through the thread dump and don't see any socket
>>>>> waits.
>>>>> Any thoughts why it is stuck this time?
>>>>>
>>>>> Thanks,
>>>>> Farzad.
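
As a rough illustration of the mechanism described in the quoted message (all worker threads have to reach the same state before a database reset can proceed), here is a minimal Java sketch of a wait/notifyAll rendezvous with println instrumentation. It is not the ManifoldCF ResetManager. The class name ResetBarrierSketch and its methods noteResetRequired, waitForReset, and runDemo are hypothetical, and the real class may work quite differently.

```java
/** Hypothetical stand-in for a reset coordinator; NOT ManifoldCF's ResetManager. */
class ResetBarrierSketch {
    private final int totalThreads;        // number of workers that must check in
    private int waitingThreads = 0;        // workers that have arrived for the pending reset
    private boolean resetRequired = false; // set when some thread hits a database error
    private int generation = 0;            // bumped after each completed reset
    private int resetsPerformed = 0;

    ResetBarrierSketch(int totalThreads) {
        this.totalThreads = totalThreads;
    }

    /** Called by a thread that saw a database error (assumed trigger). */
    synchronized void noteResetRequired() {
        resetRequired = true;
    }

    /** Each worker calls this between work items; it blocks while a reset is pending. */
    synchronized void waitForReset(String name) throws InterruptedException {
        if (!resetRequired) {
            return; // nothing pending, keep working
        }
        waitingThreads++;
        System.out.println(name + " arrived (" + waitingThreads + "/" + totalThreads + ")");
        if (waitingThreads == totalThreads) {
            // The last arrival performs the reset and releases everyone else.
            resetsPerformed++;
            resetRequired = false;
            waitingThreads = 0;
            generation++;
            System.out.println(name + " performed reset, releasing waiters");
            notifyAll();
            return;
        }
        int myGeneration = generation;
        while (generation == myGeneration) {
            wait(); // loop guards against spurious wakeups
        }
        System.out.println(name + " released");
    }

    synchronized int getResetsPerformed() {
        return resetsPerformed;
    }

    /** Starts the given number of workers with a reset pending; returns resets performed. */
    static int runDemo(int workers) {
        ResetBarrierSketch barrier = new ResetBarrierSketch(workers);
        barrier.noteResetRequired();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final String name = "worker-" + i;
            threads[i] = new Thread(() -> {
                try {
                    barrier.waitForReset(name);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return barrier.getResetsPerformed();
    }

    public static void main(String[] args) {
        System.out.println("resets performed: " + runDemo(4));
    }
}
```

In this sketch, if any one worker never calls waitForReset while a reset is pending, the remaining workers stay parked in wait(), which would produce WAITING (on object monitor) frames similar to the ones in the dump. The printed arrival counts are the sort of output that would show which thread failed to check in.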
