Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Evgueni Brevnov wrote:
> In other words we will observe the crash as we do now if sem_wait
> completes unsuccessfully for whatever reason...

Well, it shouldn't return an error except when interrupted by a signal, should it? The two other possible errors are EINVAL and EDEADLK, which should never happen.

Maybe we should add an assertion after it that sem_wait was successful, to catch this situation quickly. It would be a good starting point for investigation.

-- Gregory
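The two suggestions in this message (retry only on signal interruption, then assert success) can be combined in one small helper. This is a minimal sketch and not the HARMONY-2203 patch: the name sem_wait_retry_eintr is hypothetical, and it assumes a plain POSIX semaphore rather than the portlib hysem wrapper.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper, not the HARMONY-2203 patch: wait on a POSIX
 * semaphore, retrying only when a signal handler interrupted the call. */
static int sem_wait_retry_eintr(sem_t *sem)
{
    int rc;

    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);

    /* Anything other than EINTR (for example EINVAL) is unexpected here,
     * so fail loudly in debug builds instead of running code that
     * assumes the semaphore was taken. */
    assert(rc == 0);
    return rc;
}
```

Unlike a loop of the form "while (wait() != 0) {}", this version cannot spin forever on a persistent error such as EINVAL. Note that assert compiles away under NDEBUG, so a release build would still need to check the return value.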
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
In other words, we will observe the crash as we do now if sem_wait completes unsuccessfully for whatever reason...

On 11/17/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> Gregory,
>
> The code which goes after sem_wait doesn't work properly if sem_wait
> returns with an error code. So we need to either loop until sem_wait
> returns successfully or adjust the code after sem_wait to handle
> irregular cases.
>
> Thanks
> Evgueni
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Gregory,

The code that follows sem_wait doesn't work properly if sem_wait returns with an error code. So we need to either loop until sem_wait returns successfully or adjust the code after sem_wait to handle irregular cases.

Thanks
Evgueni

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> Yes - that's why I was poking him to see the patch. I was going to
> suggest something very similar.
>
> geir
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Yes - that's why I was poking him to see the patch. I was going to suggest something very similar.

geir

Gregory Shimansky wrote:
> Evgueni Brevnov wrote:
>> You can look at the change here
>> http://issues.apache.org/jira/browse/HARMONY-2203
>
> Could someone who knows classlib native code internals better than me
> comment on this JIRA? I've added my comment from the general POV.
>
> I would change the loop to detect only signal interruption like
>
>     while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);
>
> Other than that I agree with the patch. If someone does not know, every
> step in gdb also interrupts sem_wait calls, so such loops are a common
> practice when using semaphores.
>
> If someone knows classlib internal logic with this asynchronous handlers
> stuff please write your opinion.
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Evgueni Brevnov wrote:
> You can look at the change here
> http://issues.apache.org/jira/browse/HARMONY-2203

Could someone who knows the classlib native code internals better than I do comment on this JIRA? I've added my comment from the general point of view.

I would change the loop to detect only signal interruption, like

    while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);

Other than that I agree with the patch. In case someone does not know, every step in gdb also interrupts sem_wait calls, so such loops are common practice when using semaphores.

If someone knows the classlib internal logic of this asynchronous handler stuff, please write your opinion.

-- Gregory
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
You can look at the change here
http://issues.apache.org/jira/browse/HARMONY-2203

On 11/16/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> I haven't published it yet... will file a JIRA soon...
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
I haven't published it yet... will file a JIRA soon...

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> ah. whew.
>
> can you point me to that change you made?
>
> geir
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
ah. whew.

can you point me to that change you made?

geir

Evgueni Brevnov wrote:
> I'm not aware if classlib uses SIGUSR2. In this particular case
> classlib (to be more precise, the portlib module) does a sem_wait
> which is interrupted by TM's SIGUSR2 signal. I replaced "hysem_wait"
> with "while (hysem_wait() != 0) {}". It helped to pass all tests.
>
> Evgueni
>
> On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
>> um... classlib uses SIGUSR2 as well? Doesn't our thread manager use it?
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
um... classlib uses SIGUSR2 as well? Doesn't our thread manager use it?

Evgueni Brevnov wrote:
> Hey,
>
> Seems like the pretty old problem shows itself again. I'm talking
> about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
> uses system semaphores for synchronization purposes...and hysem_wait
> is interrupted by the signal:
>
>     (gdb) p perror("sym_wait error:")
>     sym_wait error:: Interrupted system call
>
> Do we have good (universal) solution for such cases?
>
> Thanks
> Evgueni
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Hey,

It seems a pretty old problem is showing itself again. I'm talking about the SIGUSR2 signal :-(... Classlib's asynchronous signal reporter uses system semaphores for synchronization purposes... and hysem_wait is interrupted by the signal:

    (gdb) p perror("sym_wait error:")
    sym_wait error:: Interrupted system call

Do we have a good (universal) solution for such cases?

Thanks
Evgueni

On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> Gregory Shimansky wrote:
>> I've tried to investigate the problem but didn't find the end of it yet.
>> The bug seems to be ubuntu specific (shall we maybe call this
>> distribution buggy and move on?).
>
> There is something odd about it, I'll admit... Remember the EOMEM bugs
> I found in forking?
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Gregory Shimansky wrote: Evgueni Brevnov wrote: hmmm strange. The patch was tested on multi-processor system running SUSE9. I will check if the patch misses something. Anyway, we need to wait with the patch submission until we 100% sure how hythread_monitor_init should behave. Thanks Evgueni On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote: On Friday 10 November 2006 17:45 Evgueni Brevnov wrote: > Hi, > > While investigating deadlock scenario which is described in > HARMONY-2006 I found out one interesting thing. It turned out that DRL > implementation of hythread_monitor_init / > hythread_monitor_init_with_name initializes and acquires a monitor. > Original spec reads: "Acquire and initialize a new monitor from the > threading library" AFAIU that doesn't mean to lock the monitor but > get it from the threading library. So the hythread_monitor_init should > not lock the monitor. > > Could somebody comment on that? It might be that semantic is different on different platforms which is probably even worse. Your patch in HARMONY-2149 breaks nearly all of acceptance tests on Linux while everything on Windows works (ok I tested on laptop with 1 processor while Linux was a HT server, sometimes it is important for threading). I've tried to investigate the problem but didn't find the end of it yet. The bug seems to be ubuntu specific (shall we maybe call this distribution buggy and move on?). There is something odd about it, I'll admit... Remember the EOMEM bugs I found in forking? I didn't reproduce it on gentoo, all tests work just fine. The bug look likes this, on tests gc.Force, gc.LOS, gc.List, gc.NPE, gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM segfaults. 
The stack looks like an infinite recursion of 4 stack frames:

#0 0xb6dcb814 in null_java_reference_handler (signum=11, info=0xb71a503c, context=0xb71a50bc)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
#1 <signal handler called>
#2 0xb6dcc20a in get_stack_addr ()
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
#3 0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
#4 0xb6dcb900 in null_java_reference_handler (signum=11, info=0xb71a546c, context=0xb71a54ec)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451

and so on. The stack is very long. When I run VM with -Xtrace:signals I get a very long log of messages that "NPE or SOE detected at ...". The first time address always varies, but it appears to be memcpy. The next addresses are always the same, they point to get_stack_addr function.

So I tried to find out why memcpy crashes in the first place. It appears to be a struct copy called from jsig_handler hysig.

The stack looks like this (if I can trust gdb on ubuntu):

#0 0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
   at hysigunix.c:169
#2 0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
#3 0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
#4 0xb7bb0ed4 in dummy_worker (opaque=0x0) at threadproc/unix/thread.c:138
#5 0xb7b65341 in start_thread () from lib/tls/i686/cmov/libpthread.so.0
#6 0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6

In jsig_handler a struct of type sigaction is copied

act = saved_sigaction[sig];

and gcc replaces this statement with a call to memcpy it seems. But the parameter sig is quite weird if you look at it. It is sig=-1215196204... Now if I could only find where and this sig happened there... I cannot find it in the depth of classlib native code this late at night.
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Gregory,

I can't reproduce the problem you described on my local Ubuntu machine, so I can only guess. My guess is that mapPortLibSignalToUnix can't find the corresponding signal in the map. That's why you have an undefined sig (-1215196204) in jsig_handler. I can think of two reasons why everything works fine on my machine:

1) Another signal is generated on my build.
2) It is just a matter of luck that eax contains some proper value upon returning from mapPortLibSignalToUnix.

That's it for now

Thanks
Evgueni

On 11/14/06, Alexei Fedotov <[EMAIL PROTECTED]> wrote:
Evgueni, That was great.

Artem, It's nice to see you online. Could you please check the last comments to http://issues.apache.org/jira/browse/HARMONY-1904 and decide what we should do about this issue?
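To make the guess above concrete, here is a minimal sketch in C of the two defensive measures it suggests: a lookup that returns a defined "not found" value on every path, and a bounds check before indexing the saved-action table. This is not the classlib code. Every name prefixed with demo_ or DEMO_ is hypothetical, and the flag values and the table size are assumptions made only for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical port-library flags; the real values in hysignal.c may differ. */
#define DEMO_PORT_SIG_INTERRUPT 0x1u
#define DEMO_PORT_SIG_TERMINATE 0x2u

#define DEMO_MAX_SIGNALS 64    /* assumed size of the saved-action table */
#define DEMO_NO_SIGNAL   (-1)  /* explicit "no mapping found" result */

struct demo_sig_map {
    unsigned port_flag;
    int unix_signal;
};

static const struct demo_sig_map demo_map[] = {
    { DEMO_PORT_SIG_INTERRUPT, SIGINT },
    { DEMO_PORT_SIG_TERMINATE, SIGTERM },
};

/* Stand-in for saved_sigaction[]; zero-initialized because it is static. */
static struct sigaction demo_saved_sigaction[DEMO_MAX_SIGNALS];

/* Every path returns a defined value. Falling off the end of a non-void
 * function without a return leaves the caller with whatever happens to be
 * in the return register, which is the failure mode guessed at above. */
int demo_map_port_signal_to_unix(unsigned port_flag)
{
    size_t i;
    for (i = 0; i < sizeof demo_map / sizeof demo_map[0]; i++) {
        if (demo_map[i].port_flag == port_flag)
            return demo_map[i].unix_signal;
    }
    return DEMO_NO_SIGNAL;
}

/* Copies the saved action only when sig is a valid table index.
 * Returns 0 on success and -1 when sig is out of range. */
int demo_load_saved_action(int sig, struct sigaction *out)
{
    if (sig <= 0 || sig >= DEMO_MAX_SIGNALS)
        return -1;
    *out = demo_saved_sigaction[sig];  /* struct copy, may compile to memcpy */
    return 0;
}
```

With a check like this, a value such as sig=-1215196204 would be rejected before the struct copy instead of crashing inside memcpy.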
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Evgueni, That was great. Artem, It's nice to see you online. Could you please check the last comments to http://issues.apache.org/jira/browse/HARMONY-1904 and decide what we should do about this issue?
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Evgueni Brevnov wrote:
hmmm strange. The patch was tested on multi-processor system running SUSE9. I will check if the patch misses something. Anyway, we need to wait with the patch submission until we 100% sure how hythread_monitor_init should behave.

Thanks
Evgueni

On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?

It might be that semantic is different on different platforms which is probably even worse. Your patch in HARMONY-2149 breaks nearly all of acceptance tests on Linux while everything on Windows works (ok I tested on laptop with 1 processor while Linux was a HT server, sometimes it is important for threading).

I've tried to investigate the problem but didn't find the end of it yet. The bug seems to be ubuntu specific (shall we maybe call this distribution buggy and move on?). I didn't reproduce it on gentoo, all tests work just fine. The bug looks like this: on tests gc.Force, gc.LOS, gc.List, gc.NPE, gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM segfaults.
The stack looks like an infinite recursion of 4 stack frames:

#0 0xb6dcb814 in null_java_reference_handler (signum=11, info=0xb71a503c, context=0xb71a50bc)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
#1 <signal handler called>
#2 0xb6dcc20a in get_stack_addr ()
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
#3 0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
#4 0xb6dcb900 in null_java_reference_handler (signum=11, info=0xb71a546c, context=0xb71a54ec)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451

and so on. The stack is very long. When I run VM with -Xtrace:signals I get a very long log of messages that "NPE or SOE detected at ...". The first time address always varies, but it appears to be memcpy. The next addresses are always the same, they point to get_stack_addr function.

So I tried to find out why memcpy crashes in the first place. It appears to be a struct copy called from jsig_handler hysig.

The stack looks like this (if I can trust gdb on ubuntu):

#0 0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
   at hysigunix.c:169
#2 0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
#3 0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
   at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
#4 0xb7bb0ed4 in dummy_worker (opaque=0x0) at threadproc/unix/thread.c:138
#5 0xb7b65341 in start_thread () from lib/tls/i686/cmov/libpthread.so.0
#6 0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6

In jsig_handler a struct of type sigaction is copied

act = saved_sigaction[sig];

and gcc replaces this statement with a call to memcpy it seems. But the parameter sig is quite weird if you look at it. It is sig=-1215196204... Now if I could only find where and this sig happened there... I cannot find it in the depth of classlib native code this late at night.

-- Gregory
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Oops,

You were right. I took a look into classlib hythread code. It looks like I incorrectly understood the documentation. This is a bug.

Thanks
Artem

On 11/13/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
Could someone familiar with classlib's implementation comment on that ?
Thanks in advance.
Evgueni

On 11/13/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> Hello Artem,
>
> Are you 100% sure? I've looked at the classlib's implementation and
> can't find where the monitor is acquired. Moreover if you look at the
> initializeSignalTools() located in
> modules\portlib\src\main\native\port\linux\hysignal.c you will find
> that it initializes new monitors with hythread_monitor_init_with_name
> and never frees these monitors. That turned out to be the reason of a
> deadlock in HARMONY-2006.
>
> Thanks
> Evgueni
>
> On 11/13/06, Artem Aliev <[EMAIL PROTECTED]> wrote:
> > > It turned out that DRL
> > > implementation of hythread_monitor_init /
> > > hythread_monitor_init_with_name initializes and acquires a monitor.
> >
> > Eugeni,
> >
> > Both drlvm and classlib hythread work this way.
> > This original hythread design that for compatibility reason was
> > implemented in drlvm.
> >
> > Thanks
> > Artem
> >
> > On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > While investigating deadlock scenario which is described in
> > > HARMONY-2006 I found out one interesting thing. It turned out that DRL
> > > implementation of hythread_monitor_init /
> > > hythread_monitor_init_with_name initializes and acquires a monitor.
> > > Original spec reads: "Acquire and initialize a new monitor from the
> > > threading library" AFAIU that doesn't mean to lock the monitor but
> > > get it from the threading library. So the hythread_monitor_init should
> > > not lock the monitor.
> > >
> > > Could somebody comment on that?
> > >
> > > Thanks
> > > Evgueni
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Could someone familiar with classlib's implementation comment on that ?
Thanks in advance.
Evgueni

On 11/13/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
Hello Artem,

Are you 100% sure? I've looked at the classlib's implementation and can't find where the monitor is acquired. Moreover if you look at the initializeSignalTools() located in modules\portlib\src\main\native\port\linux\hysignal.c you will find that it initializes new monitors with hythread_monitor_init_with_name and never frees these monitors. That turned out to be the reason of a deadlock in HARMONY-2006.

Thanks
Evgueni

On 11/13/06, Artem Aliev <[EMAIL PROTECTED]> wrote:
> > It turned out that DRL
> > implementation of hythread_monitor_init /
> > hythread_monitor_init_with_name initializes and acquires a monitor.
>
> Eugeni,
>
> Both drlvm and classlib hythread work this way.
> This original hythread design that for compatibility reason was
> implemented in drlvm.
>
> Thanks
> Artem
>
> On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > While investigating deadlock scenario which is described in
> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
> > implementation of hythread_monitor_init /
> > hythread_monitor_init_with_name initializes and acquires a monitor.
> > Original spec reads: "Acquire and initialize a new monitor from the
> > threading library" AFAIU that doesn't mean to lock the monitor but
> > get it from the threading library. So the hythread_monitor_init should
> > not lock the monitor.
> >
> > Could somebody comment on that?
> >
> > Thanks
> > Evgueni
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Hello Artem,

Are you 100% sure? I've looked at the classlib's implementation and can't find where the monitor is acquired. Moreover if you look at the initializeSignalTools() located in modules\portlib\src\main\native\port\linux\hysignal.c you will find that it initializes new monitors with hythread_monitor_init_with_name and never frees these monitors. That turned out to be the reason of a deadlock in HARMONY-2006.

Thanks
Evgueni

On 11/13/06, Artem Aliev <[EMAIL PROTECTED]> wrote:
> It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.

Eugeni,

Both drlvm and classlib hythread work this way. This original hythread design that for compatibility reason was implemented in drlvm.

Thanks
Artem

On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?
>
> Thanks
> Evgueni
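The deadlock mechanism described in this message can be sketched with plain pthreads. The code below is an illustration under stated assumptions, not the hythread implementation: demo_monitor and the demo_ functions are hypothetical, a plain (non-recursive) pthread mutex stands in for the monitor, and a non-blocking trylock stands in for a second thread that would otherwise block forever.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a hythread monitor; not the real structure. */
typedef struct demo_monitor {
    pthread_mutex_t lock;
} demo_monitor;

/* Contract A: initialize only. The monitor is handed back unlocked. */
int demo_monitor_init(demo_monitor *m)
{
    return pthread_mutex_init(&m->lock, NULL);
}

/* Contract B: initialize and acquire. The caller now owns the lock and
 * has to release it, otherwise every later locker is shut out. */
int demo_monitor_init_acquired(demo_monitor *m)
{
    int rc = pthread_mutex_init(&m->lock, NULL);
    if (rc != 0)
        return rc;
    return pthread_mutex_lock(&m->lock);
}

/* Non-blocking probe used in place of a second thread. Returns 0 if the
 * monitor could be taken (it is released again before returning) and
 * EBUSY if somebody still holds it. */
int demo_monitor_probe(demo_monitor *m)
{
    int rc = pthread_mutex_trylock(&m->lock);
    if (rc == 0)
        pthread_mutex_unlock(&m->lock);
    return rc;
}
```

Under contract A the probe succeeds right after init. Under contract B it reports EBUSY until the initializing code unlocks, which is the situation a caller is in when it initializes a monitor and never releases it.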
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
> It turned out that DRL implementation of hythread_monitor_init / hythread_monitor_init_with_name initializes and acquires a monitor.

Eugeni,

Both drlvm and classlib hythread work this way. This is the original hythread design, which was implemented in drlvm for compatibility reasons.

Thanks
Artem

On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
Hi,

While investigating deadlock scenario which is described in HARMONY-2006 I found out one interesting thing. It turned out that DRL implementation of hythread_monitor_init / hythread_monitor_init_with_name initializes and acquires a monitor. Original spec reads: "Acquire and initialize a new monitor from the threading library" AFAIU that doesn't mean to lock the monitor but get it from the threading library. So the hythread_monitor_init should not lock the monitor.

Could somebody comment on that?

Thanks
Evgueni
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
All, Evgueni's patch is a step in the right direction. Considering pthread_mutex_init as a conventional example, monitor shouldn't be locked at _init function. Test errors on Linux can just tell us that there are more places that rely on the incorrect contract of the function. -- Alexei On 11/11/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote: hmmm strange. The patch was tested on multi-processor system running SUSE9. I will check if the patch misses something. Anyway, we need to wait with the patch submission until we 100% sure how hythread_monitor_init should behave. Thanks Evgueni On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote: > On Friday 10 November 2006 17:45 Evgueni Brevnov wrote: > > Hi, > > > > While investigating deadlock scenario which is described in > > HARMONY-2006 I found out one interesting thing. It turned out that DRL > > implementation of hythread_monitor_init / > > hythread_monitor_init_with_name initializes and acquires a monitor. > > Original spec reads: "Acquire and initialize a new monitor from the > > threading library" AFAIU that doesn't mean to lock the monitor but > > get it from the threading library. So the hythread_monitor_init should > > not lock the monitor. > > > > Could somebody comment on that? > > It might be that semantic is different on different platforms which is > probably even worse. Your patch in HARMONY-2149 breaks nearly all of > acceptance tests on Linux while everything on Windows works (ok I tested on > laptop with 1 processor while Linux was a HT server, sometimes it is > important for threading). > > I think we need more investigation on whether or not the monitor has to be > locked in init. > > -- > Gregory Shimansky, Intel Middleware Products Division > -- Thank you, Alexei
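The point about callers relying on the old contract can be shown with pthread_mutex_init itself. The following is a rough sketch, not code from the patch or from the failing tests: the demo_ functions are hypothetical, and an error-checking mutex is used only because it reports the misuse through a return code instead of leaving it undefined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initializes *m as an error-checking mutex. As with any
 * pthread_mutex_init call, the mutex is left unlocked.
 * Returns 0 on success or an error number. */
int demo_checked_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* A caller written against the "init also locks" contract: it releases
 * right after init without ever locking. Returns the result of the
 * unlock, which is EPERM for an error-checking mutex that the calling
 * thread does not hold. */
int demo_legacy_caller(pthread_mutex_t *m)
{
    int rc = demo_checked_mutex_init(m);
    if (rc != 0)
        return rc;
    return pthread_mutex_unlock(m);
}
```

Code shaped like demo_legacy_caller worked by accident while init acquired the lock and starts failing once init follows the conventional contract, which is one way a correct fix can surface as new test errors.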
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
hmmm strange. The patch was tested on multi-processor system running SUSE9. I will check if the patch misses something. Anyway, we need to wait with the patch submission until we 100% sure how hythread_monitor_init should behave. Thanks Evgueni On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote: On Friday 10 November 2006 17:45 Evgueni Brevnov wrote: > Hi, > > While investigating deadlock scenario which is described in > HARMONY-2006 I found out one interesting thing. It turned out that DRL > implementation of hythread_monitor_init / > hythread_monitor_init_with_name initializes and acquires a monitor. > Original spec reads: "Acquire and initialize a new monitor from the > threading library" AFAIU that doesn't mean to lock the monitor but > get it from the threading library. So the hythread_monitor_init should > not lock the monitor. > > Could somebody comment on that? It might be that semantic is different on different platforms which is probably even worse. Your patch in HARMONY-2149 breaks nearly all of acceptance tests on Linux while everything on Windows works (ok I tested on laptop with 1 processor while Linux was a HT server, sometimes it is important for threading). I think we need more investigation on whether or not the monitor has to be locked in init. -- Gregory Shimansky, Intel Middleware Products Division
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?
On Friday 10 November 2006 17:45 Evgueni Brevnov wrote: > Hi, > > While investigating deadlock scenario which is described in > HARMONY-2006 I found out one interesting thing. It turned out that DRL > implementation of hythread_monitor_init / > hythread_monitor_init_with_name initializes and acquires a monitor. > Original spec reads: "Acquire and initialize a new monitor from the > threading library" AFAIU that doesn't mean to lock the monitor but > get it from the threading library. So the hythread_monitor_init should > not lock the monitor. > > Could somebody comment on that? It might be that semantic is different on different platforms which is probably even worse. Your patch in HARMONY-2149 breaks nearly all of acceptance tests on Linux while everything on Windows works (ok I tested on laptop with 1 processor while Linux was a HT server, sometimes it is important for threading). I think we need more investigation on whether or not the monitor has to be locked in init. -- Gregory Shimansky, Intel Middleware Products Division
[drlvm][threading] Should hythread_monitor_init() aquire the monitor?
Hi, While investigating deadlock scenario which is described in HARMONY-2006 I found out one interesting thing. It turned out that DRL implementation of hythread_monitor_init / hythread_monitor_init_with_name initializes and acquires a monitor. Original spec reads: "Acquire and initialize a new monitor from the threading library" AFAIU that doesn't mean to lock the monitor but get it from the threading library. So the hythread_monitor_init should not lock the monitor. Could somebody comment on that? Thanks Evgueni