I haven't published it yet... I will file a JIRA soon.
On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
ah. whew. can you point me to that change you made?

geir

Evgueni Brevnov wrote:
> I'm not aware if classlib uses SIGUSR2. In this particular case
> classlib (to be more precise it is the portlib module) does sem_wait
> which is interrupted by TM's SIGUSR2 signal. I replaced "hysem_wait"
> with "while (hysem_wait() != 0) {}". It helped to pass all tests.
>
> Evgueni
>
> On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
>> um... classlib uses SIGUSR2 as well? Doesn't our thread manager use it?
>>
>> Evgueni Brevnov wrote:
>> > Hey,
>> >
>> > Seems like the pretty old problem shows itself again. I'm talking
>> > about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
>> > uses system semaphores for synchronization purposes...and hysem_wait
>> > is interrupted by the signal:
>> >
>> > (gdb) p perror("sym_wait error:")
>> > sym_wait error:: Interrupted system call
>> >
>> > Do we have good (universal) solution for such cases?
>> >
>> > Thanks
>> > Evgueni
>> >
>> > On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
>> >>
>> >> Gregory Shimansky wrote:
>> >> > Evgueni Brevnov wrote:
>> >> >> hmmm.... strange. The patch was tested on multi-processor system
>> >> >> running SUSE9. I will check if the patch misses something. Anyway,
>> >> >> we need to wait with the patch submission until we 100% sure how
>> >> >> hythread_monitor_init should behave.
>> >> >>
>> >> >> Thanks
>> >> >> Evgueni
>> >> >>
>> >> >> On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>> >> >>> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
>> >> >>> > Hi,
>> >> >>> >
>> >> >>> > While investigating deadlock scenario which is described in
>> >> >>> > HARMONY-2006 I found out one interesting thing. It turned out
>> >> >>> > that DRL implementation of hythread_monitor_init /
>> >> >>> > hythread_monitor_init_with_name initializes and acquires a
>> >> >>> > monitor. Original spec reads: "Acquire and initialize a new
>> >> >>> > monitor from the threading library...." AFAIU that doesn't mean
>> >> >>> > to lock the monitor but get it from the threading library. So
>> >> >>> > the hythread_monitor_init should not lock the monitor.
>> >> >>> >
>> >> >>> > Could somebody comment on that?
>> >> >>>
>> >> >>> It might be that semantic is different on different platforms
>> >> >>> which is probably even worse. Your patch in HARMONY-2149 breaks
>> >> >>> nearly all of acceptance tests on Linux while everything on
>> >> >>> Windows works (ok I tested on laptop with 1 processor while Linux
>> >> >>> was a HT server, sometimes it is important for threading).
>> >> >
>> >> > I've tried to investigate the problem but didn't find the end of it
>> >> > yet. The bug seems to be ubuntu specific (<joke>shall we maybe call
>> >> > this distribution buggy and move on?</joke>).
>> >>
>> >> There is something odd about it, I'll admit... Remember the EOMEM bugs
>> >> I found in forking?
>> >>
>> >> > I didn't reproduce it on gentoo, all tests work just fine.
>> >> >
>> >> > The bug look likes this, on tests gc.Force, gc.LOS, gc.List, gc.NPE,
>> >> > gc.PhantomReferenceTest, gc.WeakReferenceTest,
>> >> > stress.WeakHashMapTest VM segfaults. The stack looks like an
>> >> > infinite recursion of 4 stack frames:
>> >> >
>> >> > #0 0xb6dcb814 in null_java_reference_handler (signum=11,
>> >> >    info=0xb71a503c, context=0xb71a50bc) at
>> >> >    /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
>> >> > #1 <signal handler called>
>> >> > #2 0xb6dcc20a in get_stack_addr () at
>> >> >    /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
>> >> > #3 0xb6dcb6cd in check_stack_overflow (info=0xb71a546c,
>> >> >    uc=0xb71a54ec) at
>> >> >    /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
>> >> > #4 0xb6dcb900 in null_java_reference_handler (signum=11,
>> >> >    info=0xb71a546c, context=0xb71a54ec) at
>> >> >    /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451
>> >> >
>> >> > and so on. The stack is very long. When I run VM with
>> >> > -Xtrace:signals I get a very long log of messages that "NPE or SOE
>> >> > detected at ...". The first time address always varies, but it
>> >> > appears to be memcpy. The next addresses are always the same, they
>> >> > point to get_stack_addr function.
>> >> >
>> >> > So I tried to find out why memcpy crashes in the first place. It
>> >> > appears to be a struct copy called from jsig_handler hysig. The
>> >> > stack looks like this (if I can trust gdb on ubuntu):
>> >> >
>> >> > #0 0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
>> >> > #1 0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0,
>> >> >    uc=0x0) at hysigunix.c:169
>> >> > #2 0xb7f9ec8b in asynchSignalReporter (userData=0x0) at
>> >> >    hysignal.c:971
>> >> > #3 0xb7baa8ef in thread_start_proc (thd=0x807a8e8,
>> >> >    p_args=0x807a8d8) at
>> >> >    /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
>> >> > #4 0xb7bb0ed4 in dummy_worker (opaque=0x0) at
>> >> >    threadproc/unix/thread.c:138
>> >> > #5 0xb7b65341 in start_thread () from
>> >> >    lib/tls/i686/cmov/libpthread.so.0
>> >> > #6 0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6
>> >> >
>> >> > In jsig_handler a struct of type sigaction is copied
>> >> >
>> >> > act = saved_sigaction[sig];
>> >> >
>> >> > and gcc replaces this statement with a call to memcpy it seems. But
>> >> > the parameter sig is quite weird if you look at it. It is
>> >> > sig=-1215196204... Now if I could only find where and this sig
>> >> > happened there... I cannot find it in the depth of classlib native
>> >> > code this late at night.