It would be useful to know for certain whether this also happens on 1.6 beta and 1.5.0_07... Anyone?
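For anyone willing to check another JVM, here is a minimal Java sketch of how the symptom from the quoted traces (a thread stuck waiting for a monitor that no thread holds, unchanged across dumps taken some time apart) might be looked for from inside the process. It is not part of Freenet: the class name `LostLockCheck` and both method names are made up for illustration. It relies on `java.lang.management`, so it needs 1.5 or later and will not run on 1.4.2. Whether a lost lock is reported through this API the same way it appears in a stack dump is an assumption. A single sample can also catch a thread in the brief window after the owner has released the lock, which is why two samples are compared.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Freenet code.
class LostLockCheck {

    // One snapshot: thread id -> lock name, for every thread that is BLOCKED
    // on a monitor for which the JVM reports no owner (owner id == -1).
    static Map<Long, String> sample() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // maxDepth 0: only the lock information is needed, not stack frames.
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), 0);
        Map<Long, String> suspects = new HashMap<Long, String>();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread exited between the two calls above
            }
            if (info.getThreadState() == Thread.State.BLOCKED
                    && info.getLockOwnerId() == -1
                    && info.getLockName() != null) {
                suspects.put(info.getThreadId(), info.getLockName());
            }
        }
        return suspects;
    }

    // Keep only threads that were blocked on the same ownerless lock in both samples.
    static Map<Long, String> persistent(Map<Long, String> first, Map<Long, String> second) {
        Map<Long, String> both = new HashMap<Long, String>();
        for (Map.Entry<Long, String> e : first.entrySet()) {
            if (e.getValue().equals(second.get(e.getKey()))) {
                both.put(e.getKey(), e.getValue());
            }
        }
        return both;
    }

    // Demonstration only: this inspects its own JVM. To be useful, sample()
    // would have to be called from a thread running inside the node.
    public static void main(String[] args) throws InterruptedException {
        Map<Long, String> first = sample();
        Thread.sleep(5000); // the quoted dumps were taken some seconds apart
        Map<Long, String> second = sample();
        for (Map.Entry<Long, String> e : persistent(first, second).entrySet()) {
            System.out.println("thread id " + e.getKey() + " blocked on "
                    + e.getValue() + " with no owner in both samples");
        }
    }
}
```

A plain stack dump is still the more direct evidence. On the Sun JVM that is normally kill -QUIT on the java process on Linux, or Ctrl-Break in the console window on Windows.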
On Fri, May 26, 2006 at 05:53:21PM +0100, Matthew Toseland wrote:
> Another 3 stack traces here, of a different lost lock (still around
> PacketSender).
>
> http://amphibian.dyndns.org/argh.2.txt
>
> The obvious solution would seem to be - and has been in the past -
> LD_LIBRARY_PATH. Unfortunately there are systems on which this causes a
> crash by itself e.g. some gentoo's, and nextgens tells me that some
> users seem to get the same bug on Windows, although this is difficult to
> confirm as they can't easily get a stack trace.
>
> For me this is triggered by inserts.
>
> It is known to happen on 1.4.2 and 1.5.0_06 (Sun *and* Blackdown).
>
> What we DO need to know is if it happens on Windows. Anyone who can get
> a stack dump on a Windows node, watch out for all nodes getting backed
> off due to Timeout3 or AcceptedTimeout (the same reason on all or most
> nodes), and get some stack dumps. Our past experience is that this is an
> NPTL issue and therefore linux-specific.
>
> IBM isn't tested yet. GCJ/GIJ should be immune, and nextgens is working
> on that.
>
> On Wed, May 24, 2006 at 11:27:30PM +0100, Matthew Toseland wrote:
> > Observe the two stack traces here:
> > http://amphibian.dyndns.org/argh.txt
> >
> > Look at PacketSender in both cases. There were some seconds between
> > them, but they're both the same. It has locked one lock, and it is
> > waiting for the other. The other lock is not held by any thread.
> >
> > This is accompanied by weird symptoms: Every node is backed off because
> > of an AcceptedTimeout.
> >
> > In conclusion? The current 0.7 code triggers a JVM bug - at least on my
> > machine - which kills us. I've seen the same thing with logging.
> >
> > Any ideas for a way forward? Or any ideas for why I am wrong (I hope I
> > am)? This is consistent, I just did another one, many minutes later. It
> > always has:
> >
> > "PacketSender thread for 0" daemon prio=1 tid=0x0825bbd8 nid=0x8c0 waiting for monitor entry [0xb11ff000..0xb11ff5c0]
> >         at freenet.node.KeyTracker.getNextUrgentTime(KeyTracker.java:790)
> >         - waiting to lock <0x7ef4d718> (a freenet.support.UpdatableSortedLinkedListWithForeignIndex)
> >         at freenet.node.PeerNode.getNextUrgentTime(PeerNode.java:641)
> >         - locked <0x7e129c78> (a freenet.node.PeerNode)
> >         at freenet.node.PacketSender.realRun(PacketSender.java:85)
> >         at freenet.node.PacketSender.run(PacketSender.java:47)
> >         at java.lang.Thread.run(Thread.java:595)
> >
> > And in all 3 cases, (and with the same problem with logging earlier),
> > 0x7ef4d718 is not locked by any thread.
> >
> > And it's not looping; it's the same lock it's trying to get, and the
> > same lock it's got already, in all 3 cases.
> >
> > This is with sun java 1.5.0_06.
> > --
> > Matthew J Toseland - toad at amphibian.dyndns.org
> > Freenet Project Official Codemonkey - http://freenetproject.org/
> > ICTHUS - Nothing is impossible. Our Boss says so.
> >
> > _______________________________________________
> > Devl mailing list
> > Devl at freenetproject.org
> > http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
>
> --
> Matthew J Toseland - toad at amphibian.dyndns.org
> Freenet Project Official Codemonkey - http://freenetproject.org/
> ICTHUS - Nothing is impossible. Our Boss says so.
> _______________________________________________
> Devl mailing list
> Devl at freenetproject.org
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

--
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <https://emu.freenetproject.org/pipermail/devl/attachments/20060526/3ed25ea6/attachment.pgp>