Thank you very much, everybody. Although we don't have the right tools to 
gather concrete evidence supporting this explanation, I agree it is the 
most likely cause. And thank you very much for filing the issue.

The fact that we're on a VM and have double-layer scheduling (one by the 
hypervisor and one by the OS) adds to the complexity and could cause some 
kind of oscillation where a thread could be kept from running for longer 
than usual. This could be a hypothesis for the periodic and rather regular 
spike we've observed, and a reason others may have not seen it. It could 
also be happening just on a particular version of the hypervisor that we 
happen to have, too.

Good this is now in the clear and we know how to deal with it.

Thank you very much once again.
-Akara





On Friday, April 17, 2015 at 1:32:08 AM UTC-7, √ wrote:
>
> I'd bet on this explanation too.
>
> Having looked at the code I think it should be possible to switch 
> implementations to something without spinlocking.
>
> On Fri, Apr 17, 2015 at 8:20 AM, Roland Kuhn <[email protected] 
> <javascript:>> wrote:
>
>> Hi Akara,
>>
>> thanks for all the detailed information, I am 95% certain that I know 
>> what happens—and I’d call it a bug. To the casual observer the code in 
>> DirectByteBufferPool looks like a CAS loop, but it actually is an 
>> open-coded primitive spinlock. If you have more threads competing for this 
>> lock than you have CPUs it might well happen that the kernel decides to 
>> schedule out a thread that was holding this lock, and since it is not a 
>> normal, honest lock, the kernel has no idea why the other threads now all 
>> are spinning like mad. It probably takes some server-type hardware and a 
>> bit of patience and good monitoring to actually notice, which is why you 
>> are the first person to report it :-)
>>
>> I created a ticket <https://github.com/akka/akka/issues/17216> to track 
>> this.
>>
>> Regards,
>>
>> Roland
>>
>> 17 apr 2015 kl. 05:20 skrev Akara Sucharitakul <[email protected] 
>> <javascript:>>:
>>
>> Nope, this is actually Ubuntu 10.10 on a 4vCPU VM (likely ESX but I can't 
>> tell for sure). Couple more pieces of information:
>>
>>    - Further experiments with parallelism-factor lowered to 1.0 no 
>>    longer shows the spikes. 
>>    - Trying to profile the spikes with JMC would cause the spikes not to 
>>    happen. The sampling somehow changed the timing characteristics.
>>    - Switching the ForkJoinExecutor to Java8, the initial experiments 
>>    show similar spikes. Since this was quick implementation of 
>>    ExecutorServiceConfigurator to get Java 8 ForkJoinPool to work, there is 
>>    still a bit of refinement we need to do before we can conclude on this 
>> fact.
>>
>> I'll keep updating this thread as we find out more. Expect it to be a 
>> long process, though.
>>
>> Thanks!
>> -Akara
>>
>> On Thursday, April 16, 2015 at 5:03:48 AM UTC-7, Martynas Mickevičius 
>> wrote:
>>>
>>> Hi,
>>>
>>> are you running this application on Windows? There is a known situation 
>>> <https://github.com/akka/akka/issues/16942> where high CPU usage might 
>>> occur when using TCP with akka IO.
>>>
>>> On Tue, Apr 14, 2015 at 7:28 AM, Akara Sucharitakul <
>>> [email protected]> wrote:
>>>
>>>> Scala 2.11.6, Akka 2.3.9, Spray 1.3.2.
>>>>
>>>> When testing a particular workload, we ran into CPU spikes. The avg CPU 
>>>> load is less than 5% but about 6-7 minutes into the run, we start to see 
>>>> CPU spiking to near 100% lasting for several seconds. This repeats itself 
>>>> every 6-7 minutes. We can't correlate this with a particular GC activity 
>>>> just yet. We tested against Java 7_45 but tried Java 8 without much change.
>>>>
>>>> Spray is configured on the default-dispatcher (Spray default).
>>>>
>>>> Trying to capture thread dumps at the time spikes happening reveal 9 
>>>> threads spinning on the followings:
>>>>
>>>> "default-akka.actor.default-dispatcher-11" prio=10 
>>>> tid=0x0000000001b2b800 nid=0x3752 runnable [0x00007f1191d40000]
>>>>    java.lang.Thread.State: RUNNABLE
>>>> at 
>>>> akka.io.DirectByteBufferPool.takeBufferFromPool(DirectByteBufferPool.scala:41)
>>>> at akka.io.DirectByteBufferPool.acquire(DirectByteBufferPool.scala:31)
>>>> at akka.io.TcpConnection.doRead(TcpConnection.scala:231)
>>>> at 
>>>> akka.io.TcpConnection$$anonfun$connected$1.applyOrElse(TcpConnection.scala:87)
>>>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>> at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:28)
>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>
>>>> 2 threads show this spinning:
>>>>
>>>> "default-akka.actor.default-dispatcher-8" prio=10 
>>>> tid=0x0000000001d6f000 nid=0x374f runnable [0x00007f11928d2000]
>>>>    java.lang.Thread.State: RUNNABLE
>>>> at 
>>>> akka.io.DirectByteBufferPool.offerBufferToPool(DirectByteBufferPool.scala:66)
>>>> at akka.io.DirectByteBufferPool.release(DirectByteBufferPool.scala:34)
>>>> at akka.io.TcpConnection.doRead(TcpConnection.scala:245)
>>>> at 
>>>> akka.io.TcpConnection$$anonfun$connected$1.applyOrElse(TcpConnection.scala:87)
>>>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>> at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:28)
>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>
>>>> 1 thread is on I/O:
>>>>
>>>> "default-akka.actor.default-dispatcher-4" prio=10 
>>>> tid=0x0000000001be2800 nid=0x36ff runnable [0x00007f11924b0000]
>>>>    java.lang.Thread.State: RUNNABLE
>>>> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:295)
>>>> - locked <0x00000007d3b7e940> (a java.lang.Object)
>>>> at akka.io.TcpConnection.innerRead$1(TcpConnection.scala:216)
>>>> at akka.io.TcpConnection.doRead(TcpConnection.scala:232)
>>>> at 
>>>> akka.io.TcpConnection$$anonfun$connected$1.applyOrElse(TcpConnection.scala:87)
>>>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>>> at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:28)
>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>> at 
>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>
>>>> All in all, during this time there is no app logic being executed on 
>>>> the default dispatcher at all (12 threads on the default-dispatcher).
>>>>
>>>> For reference, here's the takeBufferFromPool implementation:
>>>>
>>>> @tailrec
>>>> private final def takeBufferFromPool(): ByteBuffer =
>>>>   if (locked.compareAndSet(false, true)) {
>>>>     val buffer =
>>>>       try if (buffersInPool > 0) {
>>>>         buffersInPool -= 1
>>>>         pool(buffersInPool)
>>>>       } else null
>>>>       finally locked.set(false)
>>>>
>>>>     // allocate new and clear outside the lock
>>>>     if (buffer == null)
>>>>       allocate(defaultBufferSize)
>>>>     else {
>>>>       buffer.clear()
>>>>       buffer
>>>>     }
>>>>   } else takeBufferFromPool() // spin while locked
>>>>
>>>>
>>>> It seems like some thread is holding *locked* (AtomicBoolean) to true 
>>>> causing the other threads to spin, but I do not seem to find the offending 
>>>> code.
>>>>
>>>> Now to the set of questions:
>>>>
>>>>    1. Using thread-pool-executor instead of fork-join-executor 
>>>>    prevents the spikes from happening. Anybody running into similar issues 
>>>>    with fork-join-executor? Any know bug in the Scala ForkJoinPool?
>>>>    2. I don't seem to find a way to use the Java 8 or Java 7 
>>>>    ForkJoinPool directly in the Akka docs. Is there a way I can test it 
>>>> out?
>>>>    3. Any other thoughts?
>>>>
>>>> Thanks in advance!
>>>> -Akara
>>>>
>>>>
>>>>
>>>> -- 
>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>> >>>>>>>>>> Check the FAQ: 
>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>> >>>>>>>>>> Search the archives: 
>>>> https://groups.google.com/group/akka-user
>>>> --- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Akka User List" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>>
>>> -- 
>>> Martynas Mickevičius
>>> Typesafe <http://typesafe.com/> – Reactive 
>>> <http://www.reactivemanifesto.org/> Apps on the JVM
>>>  
>>
>> -- 
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To post to this group, send email to [email protected] 
>> <javascript:>.
>> Visit this group at http://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>>
>>
>> *Dr. Roland Kuhn*
>> *Akka Tech Lead*
>> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
>> twitter: @rolandkuhn
>> <http://twitter.com/#!/rolandkuhn>
>>  
>>  -- 
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To post to this group, send email to [email protected] 
>> <javascript:>.
>> Visit this group at http://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Cheers,
> √
>  

-- 
>>>>>>>>>>      Read the docs: http://akka.io/docs/
>>>>>>>>>>      Check the FAQ: 
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>>      Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

Reply via email to