Hi All;

You may be aware that the heap/GC issue/task has been lying on the floor for
quite some time now, although at least two individuals have, in the medium-term
past, expressed interest.  Looks like now it'll get done!

Welcome, George!

-jm

-- 
==== John Morrison            ==== MaK Technologies, Inc.
==== Chief Technology Officer ==== 185 Alewife Brook Pkwy, Cambridge, MA 02138
==== [EMAIL PROTECTED]               ==== http://www.mak.com/welcome.html
==== vox:617-876-8085 x115    ==== fax:617-876-9208


Hello George;

[EMAIL PROTECTED] wrote:
> Has anyone taken up the garbage collector challenge? If not, and if you're

Mr. Iain Lowe expressed interest over two months ago, but I haven't heard from
him lately.

> still looking for help in this area, I'd be willing to do my best to step
> in. What I've got so far is an implementation of a heap suitable for a
> conservative mark / sweep garbage collector. (By the way, why did you want
> the gc to be conservative? As I understand it, conservative gc is only
> necessary when you don't know what is a pointer and what is an integer
> (say). decaf must know this for objects on the heap, and presumably does for

Yes, that property of conservative GC schemes is the driving requirement, for a
few reasons.

(1) The decaf JVM we're using uses pointers, not handles, to represent Java
objects (although all Java objects are subclassed from a single C++ base class
in the JVM so we could probably change this if it became necessary).  This rules
out a whole bunch of "moving" or "copying" GC approaches.

(2) The decaf JVM and the jjos nano/pico/femto/whatever kernel uses C++ and
dynamic memory allocation as well. Currently, both the jjos C++ objects and the
Java objects (which are, after all, C++ objects as far as the JVM is concerned)
are allocated out of the same heap.  (This is just laziness and expediency. 
Originally, I thought we would use handles and reference counting, and
compacting GC for "back up."  The best laid plans...  However, maybe simpler is
better.)

(3) The JVM uses "green threads" right now (i.e., simulated, not "real"
threads).  However, when the underlying kernel implements native threads, we're
really going to have both a Java and a native code stack for each Java thread. 
Wouldn't it be nice if we just used a conservative GC to scan both data
structures the same way?

> its stacks too?)  It is based on pages (standard 4K size), with a page only
> holding objects of a specific size. Each page has a descriptor, and objects
> are marked using a bitmap in the descriptor for their page. Scanning of the
> heap is incremental, so that the scan cost is spread over allocations rather
> than as a hit at the end of marking.

This is, to the best of my knowledge, similar to the time-honored Kaffe-adopted
solution as described at:
        http://sourceware.cygnus.com/java/papers/nosb.html

> To be done before useful:
> - Allocation of large objects (> 1/2 page). These will get whole pages to
> themselves.
>         Issues: need a freelist or similar to manage pages
> - Implement mark phase.
>         Issues: how best to handle mark stack overflow?
> - Integrate with decaf.
>         Issues: got to find my way round it first!

Todd, whom I have cc-ed on this reply (Hi, Todd!), is The Man for decaf.  If
it's an easy question, then I can help you, too.  decaf uses the C++ heap (such
as it is) provided by jjos, for which I am the main point of contact (although
Todd can help you there, too).  Perhaps it is sufficient to modify the following
jjos files, which implement whatever heap functionality we have (currently
memory chunks can be allocated, but the routine to deallocate them simply throws
them away):

common/nativecode/builtins.cc (implements the "built in" memory allocators
called by the compiler)
common/nativecode/jbheap.cc (implements the C++ heap object called by the
builtins)

With respect to design considerations, a couple of not-quite-typical
requirements:

(1) Currently, decaf does not call destructors (and thence heap deallocators)
everywhere it is possible to know that the memory object has instantaneously
become garbage.  Either Todd or I can track those down so you don't have to.  We
have, of late, become more conscientious about calling destructors.  Please
advise us if there is a suitable entry point in your conservative heap
implementation for us to mark a memory block as unused...

(2) Currently, jjos does NOT provide virtual memory services (obviously, this
only applies to the "bare iron" i386 build).  We operate in physical memory
space.  This must change eventually (although it is not a short-term concern, it
*is* a design concern).

(3) We currently plan to implement Java-thread paging, so the virtual memory
subsystem is currently envisioned to throw the memory page to be paged out "over
the wall" for the Java paging thread to page out.  And, vice versa, we plan to
have the Java pager page back in pages and throw them back over the wall to the
native-code to map into virtual memory space.  (This would be nice, because we
plan to have the other Java threads which are ready to run to continue -- this
is unlike Linux-based Java implementations, which are typically in one UNIX
process and must stop all threads entirely while pages are fetched.  This should
compensate in large part for the pokiness of Java because, no matter how pokey
Java is, it's faster than the bloody mechanical disk.)

(4) As part of the mechanics of getting this cooperation between Java and the
native code to work correctly, the vmem/GC subsystem must be able to "lock down"
regions of memory (e.g., device driver code and data, the JVM itself, etc.). 
Such memory must neither be GC-ed nor paged.  Please note our heap object is not
necessarily a global, so we can set up separate heaps if such is required...

(5) Java requires that we call any "finalize" Java methods upon Java objects
prior to their being GC-ed/reclaimed.  Todd can perhaps shed more light upon
which context the methods must be executed in ... 

> Future directions:
> - Debug and optimise
> - Incremental marking, perhaps making use of the virtual memory system

It's OK with me (and probably Todd) if there is a stop-and-GC-entirely pause for
now.  Later, with the paging scheme as described above, obviously the vmem/pmem
subsystems, the Java pager, and the GC must all play nicely together.

> - Look at other types of gc: is generational gc useful for Java?

Without having either a copying/compacting collector, I don't know right now...

> My guiding light in all this has been "Garbage Collection : Algorithms for
> Automatic Dynamic Memory Management" by Richard Jones and Rafael Lins.
> http://www.amazon.com/exec/obidos/ASIN/0471941484/o/qid=932632539/sr=8-1/002

An absolutely fabulous book -- read it cover-to-cover.  Please pay particular
attention to the brief bits about how Lisp Machines availed themselves of the
special virtual memory hardware to increase performance (I do not have the book
with me at work, but I recall finding some of these bits via the index).  In
particular, I hope that we can scan pages BEFORE they have to get paged out, as
part of the "regular" paging process, and reclaim instantly those pages which
only contain garbage, and thus avoid the overhead of "throwing the page over the
wall" and sleeping any threads at all.  This is supposed to be a Really Big Win.

> -0933432-9403644 Incidentally, do you know of anywhere that discusses gc
> specifically for Java - eg a statement of what the most efficient method
> seems to be for the "average" Java app?

The URL I sent you above is one...
        http://adept.cs.twsu.edu/~thomas/jos_web.html
        http://www.ddj.com/articles/1998/9810/9810a/9810a.htm

I have other GC-related URLs, too, if you are interested...

> Sadly, I don't get too much time to do programming in my spare time - for
> example, I only managed to get started on this a month ago. What I propose
> is that I plough on for another month or so, but if I find I'm really not
> getting anywhere fast, I'll hand it all over to you or _Quinn to do with as
> you see fit. Of course, if you're interested in seeing what's there
> currently (not a huge lot), I can make that available now.

Both Todd and I can empathize with the problem of having limited hacking time. 
However, whatever you can contribute is that much more than we can do by
ourselves.  And, now is a really good time for us to be addressing these
problems.  Todd has (pretty much) finished the functionality of the JVM, and we
can boot and run Java programs (that don't require device access).  Now, we need
to keep JOS up and running longer (not leaking memory like a sieve), and enable
others to work on device drivers, etc...

Thanks for any help you can give us!  If it's OK with you, I'd like to forward
this reply on to the kernel list, so they can know what's going on ... please
advise...

-jm

-- 
==== John Morrison            ==== MaK Technologies, Inc.
==== Chief Technology Officer ==== 185 Alewife Brook Pkwy, Cambridge, MA 02138
==== [EMAIL PROTECTED]               ==== http://www.mak.com/welcome.html
==== vox:617-876-8085 x115    ==== fax:617-876-9208


Reply via email to