On 01/12/2012 07:16 AM, Valerie (Yu-Ching) Peng wrote:
Max, Well, now that you removed the call of "table.remove(principal)", the "table" field in ReplayCache is useless. Given that both classes are serializable, I understand that we may not able to do as much clean up as we want.
Correct.
Now that you removed the "table.remove(principal)" call from the ReplayCache class, I think we should update the CacheTable.put(...) method as well since Its "super.put(principal, rc)" call is now redundant given that there is no chance for ReplayCache to remove the entry.
Correct.
In addition, I think the CacheTable.put(...) method should check if the ReplayCache object is empty and if yes, do the removal itself. So, the cleanup of empty ReplayCache object is still done, except it's now in CacheTable class instead of ReplayCache class which leads to the deadlock.
That was my original plan. But I have no idea why in CacheTable there is // re-insert the entry, since rc.put could have removed the entry Does it really have a secret use?
Lastly, not really relevant to your changes here, but the code in both classes doesn't seem too efficient and some even looks not quite correct. For example, the impl of ReplayCache.put(...) method first goes through the LinkList and insert authenticator timestamp in descending order and then going through it again removing the expired timestamps. Wouldn't it be better to revert the order or doing both together, i.e. check for expiration and find the indexing while going through the list once?
Maybe. Thanks Max
Thanks, Valerie On 01/09/12 21:41, Weijun Wang wrote:Hi Valerie Please review my fix: http://cr.openjdk.java.net/~weijun/7118809/webrev.00/ Basically, CacheTable.put() operates on ReplayCache inside it, and ReplayCache.put() operates on CacheTable containing it, and both methods are synchronized and using thread-safe Hashtable. My solution is to move the "table.remove(principal)" line in ReplayCache.put() outside the method. I search thru JDK and CacheTable.put() is the only place the method is called: public synchronized void put(String principal, AuthTime time, long currTime) { ReplayCache rc = super.get(principal); if (rc == null) { if (DEBUG) { System.out.println("replay cache for " + principal + " is null."); } rc = new ReplayCache(principal, this); rc.put(time, currTime); super.put(principal, rc); } else { rc.put(time, currTime); // re-insert the entry, since rc.put could have removed the entry super.put(principal, rc); if (DEBUG) { System.out.println("replay cache found."); } } } Curiously, you can see after each call, the ReplayCache object is added back into the CacheTable. Therefore, the remove action is useless. Maybe the most correct way is to remove a ReplayCache from a CacheTable if it's empty, but that re-insert line was added in a security fix some years ago which I cannot decipher. So my fix simply removes the "remove" call in ReplayCache. No regression test, hard to reproduce failure. Thanks Max -------- Original Message -------- *Change Request ID*: 7118809 *Synopsis*: rcache deadlock === *Description* ============================================================ A DESCRIPTION OF THE PROBLEM : The program as below: -------------------------------------------------- import sun.security.krb5.internal.rcache.*; import sun.security.krb5.internal.*; import java.util.*; public class KTest2 { public static void main(String [] a) throws Exception{ final CacheTable ct = new CacheTable(); final long time = System.currentTimeMillis(); ct.put("TheOnlyOne", new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time); final ReplayCache rc = (ReplayCache) ct.get("TheOnlyOne"); new Thread() { public void run() { rc.put(new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time + 300*1000); } }.start(); ct.put("TheOnlyOne", new AuthTime( time - Krb5.DEFAULT_ALLOWABLE_CLOCKSKEW * 1000L, 0), time); } } -------------------------------------------------- When compiles as in: javac KTest2.java and executed as in: java KTest2 can deadlock the hosting JVM as is reproduced by the stack-trace dump, below: ------------------------------------------------- Found one Java-level deadlock: ============================= "Thread-0": waiting to lock monitor 0x0921d720 (object 0xa5621b80, a sun.security.krb5.internal.rcache.CacheTable), which is held by "main" "main": waiting to lock monitor 0x0921caa0 (object 0xa5622fb8, a sun.security.krb5.internal.rcache.ReplayCache), which is held by "Thread-0" Java stack information for the threads listed above: =================================================== "Thread-0": at java.util.Hashtable.remove(Hashtable.java:435) - waiting to lock<0xa5621b80> (a sun.security.krb5.internal.rcache.CacheTable) at sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:123) - locked<0xa5622fb8> (a sun.security.krb5.internal.rcache.ReplayCache) at KTest2$1.run(KTest2.java:13) "main": at sun.security.krb5.internal.rcache.ReplayCache.put(ReplayCache.java:62) - waiting to lock<0xa5622fb8> (a sun.security.krb5.internal.rcache.ReplayCache) at sun.security.krb5.internal.rcache.CacheTable.put(CacheTable.java:59) - locked<0xa5621b80> (a sun.security.krb5.internal.rcache.CacheTable) at KTest2.main(KTest2.java:16) Found 1 deadlock. ...
