It would be useful to know for sure that this happens on 1.6 beta and
1.5.0_07... Anyone?
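
For anyone testing on another JVM, here is a rough sketch of how the
symptom could be checked from inside the running JVM rather than by
reading stack dumps by hand. It is not code from the Freenet tree:
the class LostLockCheck and its method names are made up for
illustration, and it would have to be called from within the node's own
process (for example from some debug hook, which is also an assumption).
It uses the standard java.lang.management API, available since 1.5, to
look for threads that are BLOCKED on a monitor that the JVM reports as
having no owner. A single hit can be a harmless race, so it only reports
threads that show the same state in several samples taken some seconds
apart, as in the traces quoted below.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Freenet: flags threads that stay blocked
// on a monitor which the JVM says nobody owns.
public class LostLockCheck {

    // A thread is suspect if it is BLOCKED on a named monitor and the JVM
    // reports no owner for that monitor (getLockOwnerId() returns -1).
    static boolean isBlockedOnUnownedMonitor(Thread.State state, String lockName, long lockOwnerId) {
        return state == Thread.State.BLOCKED && lockName != null && lockOwnerId == -1;
    }

    // Take one snapshot of all live threads in this JVM and describe the suspects.
    static Set<String> sample() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // maxDepth 0: lock information only, no stack frames.
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), 0);
        Set<String> suspects = new LinkedHashSet<String>();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            if (isBlockedOnUnownedMonitor(info.getThreadState(), info.getLockName(), info.getLockOwnerId())) {
                suspects.add("\"" + info.getThreadName() + "\" (id " + info.getThreadId()
                        + ") waiting to lock " + info.getLockName());
            }
        }
        return suspects;
    }

    // Keep only the suspects that appear in every one of `samples` snapshots.
    static Set<String> persistentSuspects(int samples, long intervalMillis) {
        Set<String> persistent = sample();
        for (int i = 1; i < samples; i++) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop sampling, report what we have so far
            }
            persistent.retainAll(sample());
        }
        return persistent;
    }

    public static void main(String[] args) {
        // Only inspects the JVM it runs in; the sample count and interval are arbitrary.
        for (String s : persistentSuspects(3, 5000L)) {
            System.out.println(s);
        }
    }
}
```

An empty result does not prove the bug is absent; it only means no thread
was stuck this way while the samples were taken.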

On Fri, May 26, 2006 at 05:53:21PM +0100, Matthew Toseland wrote:
> Another 3 stack traces here, of a different lost lock (still around
> PacketSender).
> 
> http://amphibian.dyndns.org/argh.2.txt
> 
> The obvious solution would seem to be - and has been in the past -
> LD_LIBRARY_PATH. Unfortunately there are systems on which this causes a
> crash by itself, e.g. some Gentoo installs, and nextgens tells me that
> some users seem to get the same bug on Windows, although this is
> difficult to confirm as they can't easily get a stack trace.
> 
> For me this is triggered by inserts.
> 
> It is known to happen on 1.4.2 and 1.5.0_06 (Sun *and* Blackdown).
> 
> What we DO need to know is whether it happens on Windows. If you run a
> Windows node, watch for all nodes getting backed off due to Timeout3 or
> AcceptedTimeout (the same reason on all or most nodes), and get some
> stack dumps when that happens. Our past experience is that this is an
> NPTL issue and therefore Linux-specific.
> 
> The IBM JVM isn't tested yet. GCJ/GIJ should be immune, and nextgens is
> working on that.
> 
> On Wed, May 24, 2006 at 11:27:30PM +0100, Matthew Toseland wrote:
> > Observe the two stack traces here:
> > http://amphibian.dyndns.org/argh.txt
> > 
> > Look at PacketSender in both cases. There were some seconds between
> > them, but they're both the same. It has locked one lock, and it is
> > waiting for the other. The other lock is not held by any thread.
> > 
> > This is accompanied by weird symptoms: every node is backed off because
> > of an AcceptedTimeout.
> > 
> > In conclusion? The current 0.7 code triggers a JVM bug - at least on my
> > machine - which kills us. I've seen the same thing with logging.
> > 
> > Any ideas for a way forward? Or any ideas for why I am wrong (I hope I
> > am)? This is consistent, I just did another one, many minutes later. It
> > always has:
> > 
> > "PacketSender thread for 0" daemon prio=1 tid=0x0825bbd8 nid=0x8c0 waiting for monitor entry [0xb11ff000..0xb11ff5c0]
> >     at freenet.node.KeyTracker.getNextUrgentTime(KeyTracker.java:790)
> >     - waiting to lock <0x7ef4d718> (a freenet.support.UpdatableSortedLinkedListWithForeignIndex)
> >     at freenet.node.PeerNode.getNextUrgentTime(PeerNode.java:641)
> >     - locked <0x7e129c78> (a freenet.node.PeerNode)
> >     at freenet.node.PacketSender.realRun(PacketSender.java:85)
> >     at freenet.node.PacketSender.run(PacketSender.java:47)
> >     at java.lang.Thread.run(Thread.java:595)
> > 
> > And in all 3 cases (as with the same problem with logging earlier),
> > 0x7ef4d718 is not locked by any thread.
> > 
> > And it's not looping; it's the same lock it's trying to get, and the
> > same lock it's got already, in all 3 cases.
> > 
> > This is with Sun Java 1.5.0_06.
> > -- 
> > Matthew J Toseland - toad at amphibian.dyndns.org
> > Freenet Project Official Codemonkey - http://freenetproject.org/
> > ICTHUS - Nothing is impossible. Our Boss says so.
> 
> 
> 
> > _______________________________________________
> > Devl mailing list
> > Devl at freenetproject.org
> > http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
> 



-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.