Xiao-Feng,

Below, you said:

> But we actually found that neither Assumption 1 nor 2 was guaranteed to
> hold true, so we introduced a spinlock as a temporary solution, hoping to
> eventually remove the spinlock for the list access.

The above is curious. I tracked through the Thread Manager code and see that hythread_global_lock(NULL) is indeed held during parts of thread create, kill, and stop-the-world GC enumeration. Maybe the thread list is somehow being accessed outside of the global lock?

Assumptions 1) and 2) make complete sense to me. But is it really true that both have been violated? I assume the GC is not holding a privately maintained list of enumerable Java threads. Is this correct?

On 8/6/07, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:
> On 8/7/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
> > Some additional questions:
> > 1)
> > I think the spin lock in question is gc_platform.h, line 273,
> > "#define try_lock(x) (!atomic_cas32(&(x), LOCKED, FREE_LOCK))". Is
> > this correct?
>
> Weldon, yes. That's the GCv5 spinlock.
>
> > 2)
> > What workloads do we know the spin lock prevents from running? In other
> > words, is this spin lock preventing anyone from getting their work done?
>
> As Gregory described, a test case can't finish because of 4000 threads
> contending for the spinlock. I didn't try it, so I don't know if
> that's the root cause.
>
> > 3)
> > Is the only impact of the spin lock to waste CPU cycles? Or is the spin
> > lock actually aggravating other bugs in the system? Is the spin lock
> > causing race conditions to appear that otherwise are rarely seen? In other
> > words, does replacing the spin lock with an OS-supported mutex actually fix a
> > bug or simply make a bug less likely to cause problems?
>
> I guess it's only a CPU cycle issue. A mutex can be used here, but I
> don't know if it's the correct fix for the issue. Basically, the list of
> mutators is maintained and used by GC in the following ways:
> 1) insert/remove a mutator from the list during thread
> creation/termination (or attach/detach);
> 2) iterate the list for mutator-local data during garbage collection.
>
> For 1) above, the assumption is that the thread creation/termination process
> is serialized (Assumption 1), so we don't really need any lock
> protection for the list insertion/removal.
>
> For 2) above, the assumption is that no new mutators will be
> created/terminated during the STW collection (Assumption 2), so the list
> iteration doesn't need lock protection either.
>
> But we actually found that neither Assumption 1 nor 2 was guaranteed to
> hold true, so we introduced a spinlock as a temporary solution, hoping to
> eventually remove the spinlock for the list access.
>
> Assumption 2 is a little subtler than it looks. It's easy to
> understand that, during STW collection, no thread is created.
> But there is a special case when a thread is in the middle of its
> creation process. If the creation action is not atomic wrt STW
> collection, it is possible for a thread creation to be initiated
> before STW and finished during STW, which means this mutator is
> inserted into the list during the STW collection.
>
> To summarize the two assumptions, GC expects thread
> creation/termination to be atomic wrt each other and wrt the collection
> process.
>
> Weldon, please let me know if this expectation is proper. Otherwise,
> we will have to provide the atomicity in GC itself.
>
> Thanks,
> xiaofeng
>
> > On 8/5/07, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
> > >
> > > Weldon Washburn wrote:
> > > > Gregory, Xiao-Feng,
> > > > Thanks. We will get to this problem during redesign/reimplementation.
> > > > Weldon
> > >
> > > Ok maybe it is not really a bug, I agree. I just created it for tracking
> > > why thread.SmallStackThreadTest is excluded from the acceptance tests.
> > >
> > > > On 8/4/07, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:
> > > >> This issue is not a bug. It depends on TM's design for thread
> > > >> destruction. As I understand, Weldon suggested serializing the
> > > >> process of thread termination/creation in TM. gc_thread_kill
> > > >> should be invoked in the serialized part. If that's true, the spin
> > > >> lock can be removed. I put the spin lock there because the TM redesign
> > > >> is not finished. When it's done, we will remove the spin lock, but then
> > > >> it's TM's responsibility to invoke gc_thread_kill at the proper moment.
> > > >>
> > > >> Thanks,
> > > >> xiaofeng
> > > >>
> > > >> On 8/3/07, Gregory Shimansky (JIRA) <[EMAIL PROTECTED]> wrote:
> > > >>> [drlvm][gc_gen] Thread termination in thread.SmallStackThreadTest is
> > > >>> terribly slow
> > > >>> ----------------------------------------------------------------------
> > > >>> Key: HARMONY-4601
> > > >>> URL: https://issues.apache.org/jira/browse/HARMONY-4601
> > > >>> Project: Harmony
> > > >>> Issue Type: Bug
> > > >>> Components: DRLVM
> > > >>> Environment: Linux/ia32
> > > >>> Reporter: Gregory Shimansky
> > > >>>
> > > >>>
> > > >>> The test creates a big number of threads (4000) that mostly do nothing.
> > > >>> Threads are waiting and then are joined by the main thread. This mostly
> > > >>> doesn't consume CPU on the machine. But then all of these threads go
> > > >>> through the gc_thread_kill function. It executes thread deletion from a
> > > >>> list under a global spin-lock, so all 4000 threads are waiting on it
> > > >>> using CPU at the same time since the lock spins inside of the loop.
> > > >>> This test usually doesn't finish on linux/ia32 (for some reason it works
> > > >>> on other platforms) and times out. So I created this bug report to
> > > >>> exclude it.
> > > >>> --
> > > >>> This message is automatically generated by JIRA.
> > > >>> -
> > > >>> You can reply to this email to add a comment to the issue online.
> > > >>>
> > > >>
> > > >> --
> > > >> http://xiao-feng.blogspot.com
> > > >>
> > >
> > > --
> > > Gregory
> >
> > --
> > Weldon Washburn
>
> --
> http://xiao-feng.blogspot.com

--
Weldon Washburn
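
P.S. For what it's worth, here is a minimal sketch of the mutex alternative raised in question 3 above. It is not the GCv5 code: `Mutator`, `mutator_insert`, `mutator_remove`, `mutator_count`, and `run_demo` are hypothetical names, and a plain pthread mutex stands in for whatever lock the Thread Manager would actually provide. It only illustrates that waiters can block in the OS instead of spinning while list insert/remove and iteration stay serialized.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the GC's per-thread mutator record. */
typedef struct Mutator {
    struct Mutator *next;
    int id;
} Mutator;

static Mutator *mutator_list = NULL;
static pthread_mutex_t mutator_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert at the head of the list; contending threads block, not spin. */
static void mutator_insert(Mutator *m) {
    pthread_mutex_lock(&mutator_list_lock);
    m->next = mutator_list;
    mutator_list = m;
    pthread_mutex_unlock(&mutator_list_lock);
}

/* Unlink m if present (the role gc_thread_kill plays in the thread). */
static void mutator_remove(Mutator *m) {
    pthread_mutex_lock(&mutator_list_lock);
    for (Mutator **p = &mutator_list; *p != NULL; p = &(*p)->next) {
        if (*p == m) {
            *p = m->next;
            break;
        }
    }
    pthread_mutex_unlock(&mutator_list_lock);
}

/* Walk the list under the same lock, as a collector's iteration would. */
static size_t mutator_count(void) {
    size_t n = 0;
    pthread_mutex_lock(&mutator_list_lock);
    for (Mutator *m = mutator_list; m != NULL; m = m->next) {
        n++;
    }
    pthread_mutex_unlock(&mutator_list_lock);
    return n;
}

/* Each thread attaches itself and then detaches. */
static void *thread_body(void *arg) {
    Mutator *m = arg;
    mutator_insert(m);
    mutator_remove(m);
    return NULL;
}

/* Spawn nthreads attach/detach threads; return the final list length. */
static size_t run_demo(int nthreads) {
    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    Mutator *ms = calloc((size_t)nthreads, sizeof *ms);
    assert(tids != NULL && ms != NULL);
    for (int i = 0; i < nthreads; i++) {
        ms[i].id = i;
        int rc = pthread_create(&tids[i], NULL, thread_body, &ms[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++) {
        pthread_join(tids[i], NULL);
    }
    size_t remaining = mutator_count();
    free(ms);
    free(tids);
    return remaining;
}
```

Every thread removes the record it inserted, so run_demo should return 0 once all threads are joined. Whether this is the right fix, rather than relying on the Thread Manager's global lock, is exactly the open question in this thread.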
