I have looked further into the Tcl Blend object GC
problem. Tcl Blend depends on the Java GC system to clean
up references to Tcl_Obj* structures that are stored inside
TclObject internal reps (CObject instances). This does
not work because you cannot call Tcl_DecrRefCount from
outside of the main Tcl thread.

There have been a couple of solutions discussed so far.

1. Queue cleanup events to the Tcl event queue when
   the Java GC thread finds a Tcl_Obj* that has not
   been cleaned up properly. 

2. Create a cleanup queue in Java. The Java GC thread
   would add Tcl_Objs that need to be cleaned up to
   this queue, and Tcl would poll the queue in a
   thread-safe way at some later time.

3. Create a cleanup thread in Java. This would be like
   the cleanup queue, except that instead of having Tcl
   poll the queue, you would use another lock for this
   one case.

4. Don't clean up, just make sure a CObject never
   holds a ref that needs to be released.

There is a problem with approach #1: if the Tcl interp
does not actually use the event loop, it will never
process the queue of Tcl objects that need to be cleaned
up. That would lead to memory leaks. Solution #2 would
also not work if the Tcl event loop was not used. Solution
#3 would be really nasty. It would avoid deadlocking
the Java GC thread, but it would be a hack.
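
To make solution #2 concrete, here is a minimal sketch of what such a cleanup queue might look like. The names CleanupQueue, enqueue() and drain() are hypothetical and do not exist in Tcl Blend. The real code would pass each drained pointer to Tcl_DecrRefCount from native code on the main Tcl thread; the sketch only hands the pointers back to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of solution #2; none of these names exist in Tcl Blend.
class CleanupQueue {
    // Native Tcl_Obj* addresses waiting for a Tcl_DecrRefCount call.
    private final ConcurrentLinkedQueue<Long> pending = new ConcurrentLinkedQueue<>();

    // Called from the Java GC thread: only records the pointer,
    // it never touches the Tcl_Obj itself.
    void enqueue(long objPtr) {
        pending.add(objPtr);
    }

    // Called from the main Tcl thread: removes every queued pointer and
    // returns them so the caller can release each one on the right thread.
    List<Long> drain() {
        List<Long> ready = new ArrayList<>();
        Long ptr;
        while ((ptr = pending.poll()) != null) {
            ready.add(ptr);
        }
        return ready;
    }

    int size() {
        return pending.size();
    }
}

class CleanupQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        CleanupQueue queue = new CleanupQueue();

        // Stand-in for the GC thread discovering two unreferenced objects.
        Thread gc = new Thread(() -> {
            queue.enqueue(0x1000L);
            queue.enqueue(0x2000L);
        });
        gc.start();
        gc.join();

        System.out.println("pending before poll: " + queue.size());
        System.out.println("handed to Tcl thread: " + queue.drain().size());
        System.out.println("pending after poll: " + queue.size());
    }
}
```

The sketch also shows the weakness described above: nothing is released until something on the Tcl side calls drain(), so an interp that never polls would still leak.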

It seems the only real solution to this problem is approach
#4. It also seems like this is the most difficult solution
because it runs smack dab into the current implementation.


Here is a quick example of what is currently going on.
In this example we define a new command, "getvar".

import tcl.lang.*;

public class GetVarCmd implements Command {
  public void cmdProc(Interp interp, TclObject[] objv) throws TclException
  {
    TclObject plat = interp.getVar("tcl_platform", "platform",
                                   TCL.GLOBAL_ONLY);
    return;
  }
}

Now you load it into Tcl Blend like so:

package require java
set i [java::getinterp]
$i createCommand getvar [java::new GetVarCmd]

getvar


When you call the "getvar" command, it calls Interp.getVar().
If you look at Java_tcl_lang_Interp_getVar() in javaInterp.c
you will find a call to JavaGetTclObject().

JavaGetTclObject(env, valuePtr, NULL);

This takes the Tcl_Obj* valuePtr and wraps it in a
tcl.lang.CObject by calling "new TclObject(new CObject(objPtr))".

Here is the CObject constructor:

CObject(
    long objPtr)                // Pointer to Tcl_Obj from C.
{
    this.objPtr = objPtr;
    incrRefCount(objPtr);
}

Note how the incrRefCount() method is called for this
Tcl_Obj*; it ends up calling Tcl_IncrRefCount().


So the result of the interp.getVar() method is a
TclObject wrapped around a Tcl_Obj* that has
had its reference count incremented.

Now back to the "getvar" cmd. When it finishes,
there are no more refs to the TclObject that
was returned by the call to Interp.getVar(),
so it could get GCed at some point.


One can help it along by calling:

java::call java.lang.System gc


When the Java GC thread runs, it thinks
"oh, here is a Tcl_Obj* I should decr".
It then promptly calls CObject.decrRefCount(),
which calls Tcl_DecrRefCount (BOOM!).



There are a couple of really interesting
things to note about this example.

1. The TclObject wrapped around a CObject
   is not really like a regular Tcl_Obj*
   because there are really two different
   refCounts. The TclObject has a ref count
   that is incremented by a call to
   TclObject.preserve(), and the Tcl_Obj
   itself has another refCount that is
   totally unrelated to the TclObject's.

2. The Tcl_Obj's refCount is incremented
   by one with a call to Tcl_IncrRefCount
   when its CObject is wrapped inside a
   TclObject. The same CObject can be
   the internal rep for multiple TclObjects.
   This is very different from the way a
   regular Tcl_Obj works.

3. There is a fundamental mismatch between
   the way Tcl functions deal with Tcl_Objs
   and the way Tcl Blend methods deal with
   CObjects. Here are the comments (docs)
   for the Tcl_ObjGetVar2 function:

   Side effects:
        The ref count for the returned object is _not_ incremented to
        reflect the returned reference; if you want to keep a reference to
        the object you must increment its ref count yourself.
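
The two separate counts from points 1 and 2 can be modeled in a few lines of plain Java. This is only a simplified sketch: ModelNativeObj, ModelCObject and ModelTclObject are hypothetical stand-ins for Tcl_Obj, tcl.lang.CObject and tcl.lang.TclObject, and the native calls are replaced by a plain int field.

```java
// Simplified stand-ins for illustration only; the real classes live in
// tcl.lang and call into native code.
class ModelNativeObj {               // plays the role of a Tcl_Obj
    int refCount;
    ModelNativeObj(int refCount) { this.refCount = refCount; }
}

class ModelCObject {                 // plays the role of tcl.lang.CObject
    final ModelNativeObj obj;
    ModelCObject(ModelNativeObj obj) {
        this.obj = obj;
        obj.refCount++;              // mirrors incrRefCount(objPtr) in the constructor
    }
}

class ModelTclObject {               // plays the role of tcl.lang.TclObject
    final ModelCObject rep;
    int refCount = 0;                // Java-side count, separate from the native one
    ModelTclObject(ModelCObject rep) { this.rep = rep; }
    void preserve() { refCount++; }  // touches only the Java-side count
}

class TwoCountsDemo {
    public static void main(String[] args) {
        ModelNativeObj value = new ModelNativeObj(1);   // one ref held by the Tcl variable
        ModelCObject rep = new ModelCObject(value);
        ModelTclObject a = new ModelTclObject(rep);
        ModelTclObject b = new ModelTclObject(rep);     // same CObject, second wrapper
        a.preserve();
        System.out.println("native refCount: " + value.refCount);
        System.out.println("a refCount: " + a.refCount + ", b refCount: " + b.refCount);
    }
}
```

In this model, wrapping the value bumps the native count once, while preserve() on either wrapper changes only that wrapper's own count.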



So now what? It seems like we are really in a bind:
Tcl Blend is automatically incrementing the returned
Tcl_Obj's refCount. This is exactly what the Tcl docs
say should not happen.


Well, the cheap fix for this leaking memory problem
is to just call TclObject.release() on the returned
object.

import tcl.lang.*;

public class GetVarCmd implements Command {
  public void cmdProc(Interp interp, TclObject[] objv) throws TclException
  {
    TclObject plat = interp.getVar("tcl_platform", "platform",
                                   TCL.GLOBAL_ONLY);
    plat.release();
    return;
  }
}

That would avoid the memory leak, but it is not a very good
long term solution. Java developers are just not used to
"freeing" memory, so they will forget to do it. It is
also going to be very hard to track down one of these
memory leaks, even once a developer knows about the problem.
(Of course, it is just as bad in regular Tcl.)

When you get right down to it, this problem can
be summed up as "that is just not right". I
am seriously thinking we need a much more
radical approach to really solve this problem.

First, let's just not call Tcl_IncrRefCount
on a new CObject. I think we need to just
require that a user incr the refCount
of the native Tcl_Obj to indicate that
they want to use it later. The user
would also be responsible for decrementing
the ref count later. This would fix
interp.getVar() so that it matched the
behavior of Tcl_ObjGetVar2.

Now you might be wondering, if the user
needs to incr and decr the Tcl_Obj ref
counts, how are they going to do that?
Good question. The TclObject class
has its own refCount, which is not
the same as the one in the Tcl_Obj.

The most radical approach would be to
just rip out the refCount stuff in
TclObject. You could then map a
TclObject.preserve() to Tcl_IncrRefCount
when the InternalRep was a CObject.

Java already has a GC system available,
so we really do not need to use
ref counting because we do not free
resources manually anyway. This would
mean that preserve() and release()
would be no-ops for every other
type of Internal rep.

While I kind of like that idea,
I am not sure if it could be
implemented all that quickly.
It might require some substantial
rewriting to both Tcl Blend and Jacl.

It may be possible to use a hybrid
approach where a call to
TclObject.preserve() uses the
existing implementation with
special code for the case
where the internal rep is
a CObject. That way a
call to TclObject.preserve()
for a wrapped CObject would
end up calling Tcl_IncrRefCount.

If we did that, and then removed
the automatic call to Tcl_IncrRefCount()
when the CObject is created, I think
our problem with Interp.getVar()
would go away.
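
Here is a rough sketch of how that hybrid might behave. All of the Sketch* names are hypothetical, the native Tcl_IncrRefCount and Tcl_DecrRefCount calls are modeled with a plain int field, and mirroring release() onto the native count is my assumption about how the other half would work.

```java
// Hypothetical model of the hybrid proposal; not the real tcl.lang classes.
interface SketchInternalRep { }

class SketchNativeObj {                  // plays the role of a Tcl_Obj
    int refCount;
    SketchNativeObj(int refCount) { this.refCount = refCount; }
}

class SketchCObject implements SketchInternalRep {
    final SketchNativeObj obj;
    SketchCObject(SketchNativeObj obj) { this.obj = obj; }  // no automatic increment
    void incrRefCount() { obj.refCount++; }   // stands in for Tcl_IncrRefCount
    void decrRefCount() { obj.refCount--; }   // stands in for Tcl_DecrRefCount
}

class SketchStringRep implements SketchInternalRep {
    final String value;
    SketchStringRep(String value) { this.value = value; }
}

class SketchTclObject {
    private final SketchInternalRep rep;
    private int refCount = 0;            // existing Java-side bookkeeping

    SketchTclObject(SketchInternalRep rep) { this.rep = rep; }

    void preserve() {
        refCount++;
        if (rep instanceof SketchCObject) {
            ((SketchCObject) rep).incrRefCount();   // special case for CObject
        }
    }

    void release() {
        refCount--;
        if (rep instanceof SketchCObject) {
            ((SketchCObject) rep).decrRefCount();   // assumed mirror of preserve()
        }
    }

    int getRefCount() { return refCount; }
}

class HybridDemo {
    public static void main(String[] args) {
        SketchNativeObj value = new SketchNativeObj(1);  // one ref held by the Tcl variable
        SketchTclObject plat = new SketchTclObject(new SketchCObject(value));
        System.out.println("after wrap: " + value.refCount);
        plat.preserve();
        System.out.println("after preserve: " + value.refCount);
        plat.release();
        System.out.println("after release: " + value.refCount);
    }
}
```

In this model a caller that never calls preserve() leaves the native count untouched, so there is nothing left for the GC thread to decrement. The sketch does not enforce that preserve() and release() run on the main Tcl thread; that would still be up to the caller.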


I think this would still work the
way users would expect, and it
should not break existing code.
Take a look at the init()
method from BlendExtension.java
for a quick example.

    TclObject plat = interp.getVar("tcl_platform", "platform",
                                   TCL.GLOBAL_ONLY);

    if (plat.toString().equals("java")) {
        interp.setVar("tcljava", "tcljava",
            TclString.newInstance("jacl"), TCL.GLOBAL_ONLY);
    } else {
        interp.setVar("tcljava", "tcljava",
            TclString.newInstance("tclblend"), TCL.GLOBAL_ONLY);
    }


Here we see that TclObject.toString() is called. If
we wrapped a Tcl_Obj without incrementing its ref count,
this should still work. The toString() call would
invoke Java_tcl_lang_CObject_getString, which
would call Tcl_GetStringFromObj() (no problem there, right?).

I am sure there are cases I have overlooked,
but I think the idea is doable.

What do you think?

Are there some cases that will make this solution
impossible?

Mo DeJong
Red Hat Inc
