Here is a snapshot of 6213 collapsing:

outputBytes/hour

 Time  Events    Value
 2:00       2     3572
 3:00    3960   875207
 4:00   10879  6104331
 5:00     926   265150
 6:00    9814  3087231
 7:00    9818  4884084
 8:00    8088  2428783
 9:00    9315  2787015
10:00    7463  1606426
11:00   10363  4189291
12:00    8813  2653230
 1:00    8851  2771283
 2:00   10739  3918438
 3:00   10582  4374534
 4:00   10564  3641628
 5:00    6121  1278197
 6:00     629   129938
 7:00      14     2431

localQueryTraffic/hour

 Time  Tries  Succ  Ratio
 3:00    283   249  0.880
 4:00    428   420  0.981
 5:00      1     1  1.000
 6:00    481   456  0.948
 7:00    634   607  0.957
 8:00    607   594  0.979
 9:00    639   616  0.964
10:00    752   724  0.963
11:00    883   855  0.968
12:00    856   826  0.965
 1:00   1045  1022  0.978
 2:00   1347  1280  0.950
 3:00   1427  1152  0.807
 4:00    794   491  0.618
 5:00   1060   193  0.182
 6:00    225     0  0.000
 7:00     54     0  0.000

Now, when we go to find out why it died (from env):

Class                                                           Threads used
Checkpoint: Connection opener                                             52
freenet.interfaces.LocalNIOInterface$ConnectionShell                       2
freenet.interfaces.PublicNIOInterface$ConnectionShell                      5
freenet.node.states.data.DataStateInitiator                                1
freenet.node.states.data.TrailerWriteCallbackMessage:true:true             1

:-( I don't have the memory to burn on thousands of threads (unless Y is significantly better than Q). And the effect it has (from general):

Pooled threads running jobs: 60 (133.3%)
Reason for refusing connections: activeThreads (60) >= maximumThreads (45)

:-( Running my node out of threads just shuts it down. I get the feeling that chewing up threads on things that block indefinitely is a bad idea. Either connections are not timing out, or we are trying to contact a class of nodes that cannot be contacted (firewalls, NATs), or we are timing out too slowly, or... Using threads to serve content out of my store is more important than opening a random connection.
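For anyone who wants to see the failure mode in miniature: this is not Freenet's pool code, just a generic java.util.concurrent sketch (class name `RefusalDemo` is made up) showing how a hard-capped pool with no queue refuses cheap work once long-blocking jobs have eaten every thread, which is the same shape as the activeThreads >= maximumThreads refusal above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RefusalDemo {

    /** Runs the demo; returns the number of refused jobs. */
    static int demo() throws InterruptedException {
        // Hard cap of 2 threads, no queueing: once both threads are
        // busy, further submissions are refused outright.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.AbortPolicy());

        CountDownLatch hold = new CountDownLatch(1);

        // Occupy both threads with jobs that block indefinitely,
        // like connection opens that never time out.
        for (int i = 0; i < 2; i++) {
            pool.execute(() -> {
                try { hold.await(); } catch (InterruptedException ignored) { }
            });
        }

        // A third, cheap job -- think "serve a key out of the local
        // datastore" -- is now refused even though it would finish fast.
        int refused = 0;
        try {
            pool.execute(() -> { });
        } catch (RejectedExecutionException e) {
            refused++;
        }

        hold.countDown();                           // release the blockers
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return refused;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("refused=" + demo());    // prints refused=1
    }
}
```

The point is that the pool's refusal policy is blind to the cost of the job: two threads wedged forever are enough to lock out any number of sub-millisecond datastore reads.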
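On the "connections are not timing out" theory, a minimal sketch of what bounded opens could look like, using the connect-with-timeout overload of java.net.Socket (class and method names here are hypothetical, not Freenet's actual API):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BoundedOpen {

    /**
     * Try to open a connection, but give up after timeoutMs instead of
     * parking a pooled thread forever on a peer that will never answer
     * (firewalled or NATed nodes, dead addresses, ...).
     */
    static boolean tryOpen(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Timeout, refusal, or unreachable host: either way the
            // thread comes back to the pool promptly.
            return false;
        }
    }

    public static void main(String[] args) {
        // 192.0.2.1 is in TEST-NET-1 (RFC 5737) and is not routable,
        // so this open stalls until the 500 ms timeout fires.
        System.out.println(tryOpen("192.0.2.1", 80, 500));
    }
}
```

Whatever number you pick for the timeout, the key property is that a worker thread's occupancy is bounded, so a burst of unreachable peers degrades throughput instead of wedging the whole pool.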
The cute thing is that my node looks like it is trying to announce to get things moving again, but at 133% load that just isn't gonna happen. (Looked again just now; we're up to 144% load.) I'm sure it can recover, but it would be better to handle the situation gracefully:

- Leave 2 threads to serve content from the local datastore, and accept those queries that can be served out of the local store; then at least we can still saturate the upstream serving content when this sort of thing happens.
- Consider killing stalled opens.
- Figure out whether there is a class of nodes that just cannot be opened, and find ways to avoid trying to open them.
- And last, NIOize opens.

Another strategy would be to figure out how much stack a thread actually needs and trim it down so threads don't chew up memory as fast; that might allow me to allocate more threads.

_______________________________________________
Devl mailing list
[EMAIL PROTECTED]
http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
