Thank you Xiaofeng, [1] is very useful! I am planning to implement weakref support for all the 3 concurrent algorithms. Using read barrier, I suppose the solution of different algorithms would be similar, but I am not sure, IMHO, it is because that the referent of weakref can not be overwritten after weakref has been created. Is it right? I have submitted my proposal to the GSoC application webm, I will keep updating it these days.
Thanks Simon 2009/4/1 Xiao-Feng Li <[email protected]> > Simon, I've got some time reviewing your proposal. It looks very > feasible. Probably you'd better read my blog entry on WeakRef and > Finalizer processing of Harmony [1]. > > One question, will you implement weakref support for all the three > concurrent GC algorithm? > > Btw, please go ahead to submit/update your proposal to the GSoC > application web. > > [1] > http://xiao-feng.blogspot.com/2007/05/weak-reference-processing-in-apache.html > > Thanks, > xiaofeng > > On Tue, Mar 31, 2009 at 6:06 PM, Simon Zhou <[email protected]> > wrote: > > Hi All, > > Here is my proposal, any suggestion is welcome! > > > > *Title/Summary:* GC-1: Implement WeakReference support in Harmony > Concurent > > GC > > > > *Student:* Simon.Zhou > > > > *Student e-mail:* [email protected] > > > > *Student Major:* Computer Science > > > > *Student Degree:* Master > > > > *Student Graduation:* 2009.7 > > > > *Organization:* Fudan University > > > > *Assigned Mentor:* > > > > *Abstract:* > > > > weak reference allows a program to maintain special reference to objects, > > which is a useful API for the program interacting with the garbage > > collector to some extent. Apache Harmony has a concurrent garbage > collector > > (code name: Tick) which can perform the garbage collection without > > suspending the application threads except for root set enumeration. Tick > do > > not have weak reference supporting now, it treats all the references > object > > just as strong references, which may cause inefficiency for applications > > using weak references. In this project, I will add the weak reference > > support for Tick, which would be different from the implementation in > > generational GC, because the consistency should be maintained for the > heap > > objects carefully. Read barrier of the get() method of reference object > will > > be used for this implementation, and the performance issue will be > seriously > > considered. before implement phantom reference, I will also implement the > > finalizer feature for Tick, although it is not recommanded to be used > except > > it is necessary. Besides this, I am trying to reduce the floating garbage > > for Tick’s concurrent algorithms. > > > > *Detailed Description:* > > > > Tick is a concurrent GC of Harmony DRLVM, it can do the garbage > collection > > without suspending application threads. For concurrent GC, the most > > important issue that should be dealt with is the inconsistency problem, > that > > is if we trace the heap object graph while the mutator updating the same > > graph, inconsistency problem occurs, some live objects may be mistaken > > relaimed as garbage. In order to keep the consistency of objects, write > > barriers is inserted when concurrent tracing starts, which intercept all > the > > write operations of object reference and record useful data for > rescaning. > > Now there are 3 different concurrent collection algorithms in Tick: DLG, > > Sliding-View, Mostly-Concurrent; and 3 different write barriers are used > > repectively. see more information [1]. > > > > Experiment result shows that Tick can significantly reduce or even > eliminate > > the pause time caused by garbage collection. However, one drawback of > Tick > > is that, it do not support weak reference mechanism, which is a important > > API of Java programming language. > > > > Weak reference is an useful tool for whom are much care about the > > application memory usage, e.g. it can help developers to cache some > > temporary large objects for later reusing; also, to do some cleanup > > operations just when a object is acturally relaimed. There are 3 kinds of > > weak reference class in Java language (the weakness is decreasingly): > soft > > reference, weak reference, phantom reference. Soft reference is usually > used > > to maintain better cache for large object, such as pictures load from > web; > > weak reference can help to reduce the frequency of creating short-live > > objects; phantom reference can usully help programmer to do cleanup > > operations just when an object is physically relaimed. (from now on, I > will > > use ‘weakref’ to represent the set of all the three kinds of weak > > references) All the three kinds of reference classes have a get() method, > if > > the object has not been actually relaimed, the get() method will return a > > reference of the referent object (referent object is the object which > > reference object point to) except phantom reference (phantom reference’s > > get() method always return null, since it is just used for cleanup > > operations when object is actually relaimed). see more information [2,3] > > > > Weakrefs are always processed in GC. For generational GC, weakref feature > > can be implemented easily: when minor collection occurs, weak reference > > objects are equeued and their referents are reclaimed. when major > collection > > occurs, soft reference objects are enqueued and their referents are > > reclaimed. when finalizer are processed, phantom reference objects are > > enqueued and their referents are reclaimed. > > > > Unlike generational GC, Tick just simply treats all the references in the > > heap as strong references, which make memory pressure become bigger for > the > > applications using much data structure based on weakrefs, such as > > WeakHashMap. In this project, my job is to implement this weakref feature > > for Tick, and this work should be independent to the concurrent > algorithms > > used in Tick. > > > > For concurrent GC, weakref processing is different with generational GC, > > because the tracing thread is running concurrently with application > threads. > > Application threads may invoke the get() method to get the reference of > > referent object, it may: > > 1, put it to the thread local stack as local var or calling parameter, > such > > as > > Object localvar = weakref.get(); > > functionA(weakref.get()); > > 2, assign it to another object slot, such as > > obj.field = weakref.get(); > > these operations will change the reachability of referent object, which > may > > cause the inconsistency issues in concurrent GC. In order to deal with > it, > > what we should do is to intercept these operations while application > thread > > is running. Because all these operations use the get() method to get the > > reference of referent object, inserting the read barrier in the get() is > a > > simple and effective approach. The main job of the read barrier is to add > > the pointer of referent object to a remembered set, then the garbage > > collection thread will pick it up and trace from it. > > > > I divided the work into 3 major steps: > > > > The first step is to implement the read barrier for get() method of > > reference object. My approch is implementing the barrier using a VMHelper > > method, which is a VMMagic based helper function writen in Java. That is, > > using Java to write some hot funtions in JVM, then these Java methods can > be > > compilered and even inlined into the native code produced by JIT compiler > > [4]. This is very useful approach to reduce the read barrier’s overhead. > In > > this step, I will write a Java version read barrier and add some code to > the > > JIT compiler (Jitrino and JET) to make it compile the barrier correctly > and > > effiently. > > > > The second step is to implement the weakref processing module in Tick. > This > > module will collect all the weakrefs while concurrent marking and put > them > > to different queues depending on its reference type. When concurrent > marking > > finishes, GC threads pick up the reference object in the queues to see if > > the referent object is marked, if so, the referent object is live and > should > > not be processed, otherwise, process the referent object according the > type > > of the reference object respectively. moreover, if the reference object > is > > intercepted by the read barrier in the get() method, its referent object > > should be marked while tracing, so the consistency is well maintained. > > > > After this step, I will test Tick with benchmark using weakrefs, or using > > data structure based on the weakrefs (such as WeakHashMap). The > experiment > > result will be analyzed and compared to Tick without weakrefs supporting > and > > generational GC with weakrefs supporting. Then I will form the result > data > > to a document report. > > > > The last step is trying to reduce the floating garabge [5] for concurrent > > GC. This work is not related to weakrefs feature, but its effect may be > > similar to it, since the major goal of reducing floating garbage is to > ease > > the memory pressure while using concurrent GC. Floating garbage always > exist > > and may not be totally eliminated in concurrent GC, my job is trying to > > optimize its impact as much as possible. > > > > *Additional Information:* > > > > Personal Information: > > I am a master canidate of Computer Science in Parallel Processing > > Institution, Fudan University. My interested area includes Compiler, > > Computer Architecure, Managed Runtime etc. I started to follow Harmony > with > > interests 2 year ago, and I used to contribute to DRLVM and have submited > > some patches to the community. As an intern, I improved Harmony > concurrent > > GC by using state-based phase design and implementing an efficient > scheduler > > in the summer of 2008, the patch of this work has been merged to the > source > > trunk [6]. > > > > Schedule: > > > > I can work at least 3 days per week for this work, my schedule this > > project as follow: > > April 10 ~ April 26 Communicate with the mentor and design the interface > > of the implementation, such as, the native interface of get() method, > weak > > reference processing interfaces in GC module. Write down the design > > document. > > > > April 27 ~ May 24 implementing read barrier for get() method of > reference > > object. > > > > May 25 ~ June 30 implementing the weak and soft reference processing > > features in GC module, add finalizer feature to Tick > > > > July 1 ~ July 15 implementing phantom reference processing features. > > > > July 16 ~ Aug 1 test and analyze Tick with weak reference related > > benchmark. write document for implementing and testing. > > > > Aug 1 ~ Aug 15 try to improve Tick by reducing floating garbage. > > > > *References:* > > > > [1] > > > http://xiao-feng.blogspot.com/2007/03/comparison-between-snapshot-based-gc.html > > [2] http://www.realjenius.com/node/377 > > [3] http://www.pawlan.com/Monica/refobjs/ > > [4] > > > http://mail-archives.apache.org/mod_mbox/harmony-dev/200505.mbox/%[email protected]%3e > > [5] http://www.memorymanagement.org/glossary/f.html#floating.garbage > > [6] http://issues.apache.org/jira/browse/HARMONY-5989 > > > > -- > > From : simon.z...@ppi, Fudan University > > > > > > -- > http://people.apache.org/~xli <http://people.apache.org/%7Exli> > -- >From : simon.z...@ppi, Fudan University
