Max,
Well, now that you removed the call of "table.remove(principal)", the
"table" field in ReplayCache is useless.
Given that both classes are serializable, I understand that we may not
able to do as much clean up as we want.
Now that you removed the "table.remove(principal)" call from the
ReplayCache class, I think we should update the CacheTable.put(...)
method as well since Its "super.put(principal, rc)" call is now
redundant given that there is no chance for ReplayCache to remove the entry.
In addition, I think the CacheTable.put(...) method should check if the
ReplayCache object is empty and if yes, do the removal itself.
So, the cleanup of empty ReplayCache object is still done, except it's
now in CacheTable class instead of ReplayCache class which leads to the
deadlock.
Lastly, not really relevant to your changes here, but the code in both
classes doesn't seem too efficient and some even looks not quite correct.
For example, the impl of ReplayCache.put(...) method first goes through
the LinkList and insert authenticator timestamp in descending order and
then going through it again removing the expired timestamps. Wouldn't it
be better to revert the order or doing both together, i.e. check for
expiration and find the indexing while going through the list once?
Thanks,
Valerie
On 01/09/12 21:41, Weijun Wang wrote:
Hi Valerie
Please review my fix:
http://cr.openjdk.java.net/~weijun/7118809/webrev.00/
Basically, CacheTable.put() operates on ReplayCache inside it, and
ReplayCache.put() operates on CacheTable containing it, and both
methods are synchronized and using thread-safe Hashtable.
My solution is to move the "table.remove(principal)" line in
ReplayCache.put() outside the method. I search thru JDK and
CacheTable.put() is the only place the method is called:
public synchronized void put(String principal, AuthTime time, long
currTime) {
ReplayCache rc = super.get(principal);
if (rc == null) {
if (DEBUG) {
System.out.println("replay cache for " + principal + "
is null.");
}
rc = new ReplayCache(principal, this);
rc.put(time, currTime);
super.put(principal, rc);
}
else {
rc.put(time, currTime);
// re-insert the entry, since rc.put could have removed
the entry
super.put(principal, rc);
if (DEBUG) {
System.out.println("replay cache found.");
}
}
}
Curiously, you can see after each call, the ReplayCache object is
added back into the CacheTable. Therefore, the remove action is useless.
Maybe the most correct way is to remove a ReplayCache from a
CacheTable if it's empty, but that re-insert line was added in a
security fix some years ago which I cannot decipher.
So my fix simply removes the "remove" call in ReplayCache.
No regression test, hard to reproduce failure.
Thanks
Max
-------- Original Message --------
*Change Request ID*: 7118809
*Synopsis*: rcache deadlock
=== *Description*
============================================================
A DESCRIPTION OF THE PROBLEM :
The program as below:
--------------------------------------------------
import sun.security.krb5.internal.rcache.*;
import sun.security.krb5.internal.*;
import java.util.*;
public class KTest2 {
public static void main(String [] a) throws Exception{
final CacheTable ct = new CacheTable();
final long time = System.currentTimeMillis();
ct.put("TheOnlyOne", new AuthTime( time -
Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time);
final ReplayCache rc = (ReplayCache) ct.get("TheOnlyOne");
new Thread() {
public void run() {
rc.put(new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW *
1000L, 0), time + 300*1000);
}
}.start();
ct.put("TheOnlyOne", new AuthTime( time -
Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time);
}
}
--------------------------------------------------
When compiles as in: javac KTest2.java
and executed as in: java KTest2
can deadlock the hosting JVM as is reproduced by the stack-trace dump,
below:
-------------------------------------------------
Found one Java-level deadlock:
=============================
"Thread-0":
waiting to lock monitor 0x0921d720 (object 0xa5621b80, a
sun.security.krb5.internal.rcache.CacheTable),
which is held by "main"
"main":
waiting to lock monitor 0x0921caa0 (object 0xa5622fb8, a
sun.security.krb5.internal.rcache.ReplayCache),
which is held by "Thread-0"
Java stack information for the threads listed above:
===================================================
"Thread-0":
at java.util.Hashtable.remove(Hashtable.java:435)
- waiting to lock <0xa5621b80> (a
sun.security.krb5.internal.rcache.CacheTable)
at
sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:123)
- locked <0xa5622fb8> (a
sun.security.krb5.internal.rcache.ReplayCache)
at KTest2$1.run(KTest2.java:13)
"main":
at
sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:62)
- waiting to lock <0xa5622fb8> (a
sun.security.krb5.internal.rcache.ReplayCache)
at
sun.security.krb5.internal.rcache.CacheTable.put(CacheTable.java:59)
- locked <0xa5621b80> (a
sun.security.krb5.internal.rcache.CacheTable)
at KTest2.main(KTest2.java:16)
Found 1 deadlock.
...