This is what the CPU sampler showed me:
"Healthy" server: http://puu.sh/7ld4T.png
CPU constantly at ~75%: http://puu.sh/7lp8Y.png
Both were taken after about 15 seconds. The order of those methods doesn't change after that.
Yesterday it ran fine for 18 hours, then went to 100% again. Today is the first time it didn't go straight to 100%, and the NioProcessor threads in thread dumps mostly show poll0 (as they do when it runs smoothly). The only difference I could find is that resetWakeupSocket0 used more CPU time (when it's healthy it's below 1.000ms). That is also the method that shows up for many threads in a thread dump while it's eating 100%.
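To put numbers on the thread dump comparison instead of eyeballing it, something like the following could count how many threads are sitting in each top frame. This is only a sketch: it assumes the code can run inside the server JVM (for example from a debug command), the class and method names are made up for illustration, and the "NioProcessor" prefix is taken from the thread names mentioned above.

```java
import java.util.Map;
import java.util.TreeMap;

class TopFrameHistogram {

    /** Counts threads per top stack frame, for threads whose name starts with namePrefix. */
    static Map<String, Integer> topFrames(String namePrefix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = e.getValue();
            // Skip threads that don't match or have no frames to report.
            if (!e.getKey().getName().startsWith(namePrefix) || stack.length == 0) {
                continue;
            }
            String top = stack[0].getClassName() + "." + stack[0].getMethodName();
            counts.merge(top, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // In the server this would show e.g. how many NioProcessor threads are in
        // poll0 versus resetWakeupSocket0 at the moment of the call.
        topFrames("NioProcessor").forEach((frame, n) -> System.out.println(n + "  " + frame));
    }
}
```

Calling it a few times while the server is healthy and again while it's stuck at 100% would show whether the share of threads in resetWakeupSocket0 really shifts.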
I also noticed that (after it starts misbehaving) there are ~10 NioProcessor threads that constantly use more CPU time than all the other threads. In a thread dump, they're always at poll0 though, just like most of the other threads.
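For anyone who wants to check the per-thread CPU time without a profiler attached, here is a minimal sketch using the standard ThreadMXBean. It assumes it runs inside the server JVM and that the JVM supports thread CPU time measurement; the class name, method names and the 5 second interval are just illustrative choices.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ThreadCpuDelta {

    /** Cumulative CPU time in nanoseconds per live thread id. */
    static Map<Long, Long> snapshot(ThreadMXBean mx) {
        Map<Long, Long> cpu = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long ns = mx.getThreadCpuTime(id); // -1 if the thread died or measurement is disabled
            if (ns >= 0) {
                cpu.put(id, ns);
            }
        }
        return cpu;
    }

    /** CPU time used during the interval by matching threads, busiest first. */
    static List<String> busiest(String namePrefix, long intervalMillis) {
        List<String> lines = new ArrayList<>();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            return lines;
        }
        Map<Long, Long> before = snapshot(mx);
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return lines;
        }
        Map<Long, Long> after = snapshot(mx);

        // Pair each thread id with the CPU time it consumed between the two snapshots.
        List<long[]> deltas = new ArrayList<>();
        for (Map.Entry<Long, Long> e : after.entrySet()) {
            Long start = before.get(e.getKey());
            if (start != null) {
                deltas.add(new long[] {e.getKey(), e.getValue() - start});
            }
        }
        deltas.sort((a, b) -> Long.compare(b[1], a[1]));

        for (long[] d : deltas) {
            ThreadInfo info = mx.getThreadInfo(d[0]);
            if (info == null || !info.getThreadName().startsWith(namePrefix)) {
                continue;
            }
            lines.add(info.getThreadName() + " " + (d[1] / 1_000_000) + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        busiest("NioProcessor", 5_000).forEach(System.out::println);
    }
}
```

Since it reports the delta over a fixed interval rather than the cumulative total, it should make it easier to see whether it's always the same ~10 threads that are busy.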
The profiler, on the other hand, was quite temperamental. It either gave inconclusive results, showed nothing at all (except 1.xxx methods instrumented; refreshing didn't help either), or got stuck after a few seconds and broke the profiled application in the process, so that I had to kill the server.
I did some further packet-related tests and couldn't find any malformed or abnormally long packets, and the number of packets that run through the protocol codec is normal for the number of clients. The bandwidth is pretty normal as well while it misbehaves.
