Florin Sultan created DIRMINA-898:
-------------------------------------
Summary: A wakeup() is issued in a thread before the target
select() is executed.
Key: DIRMINA-898
URL: https://issues.apache.org/jira/browse/DIRMINA-898
Project: MINA
Issue Type: Bug
Components: Core
Affects Versions: 2.0.4
Environment: Linux 2.6.18-164.11.1.el5xen #1 SMP Wed Jan 20 08:06:04
EST 2010 x86_64 x86_64 x86_64 GNU/Linux
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
Reporter: Florin Sultan
Context:
-------
I am using BarchartUDT (BC for short in the calls traced below) as transport.
I have integrated it with MINA release 2.0.4.
BarchartUDT is a java wrapper to the native UDT library (C++). UDT is a
reliable,
connection-oriented, streaming protocol.
The example below is the MINA sample timeserver that hangs in the first bind()
call.
Description:
-----------
Trigger : A wakeup() is issued in a thread before the target select() is
executed
in another thread, and it is lost.
It is not clear if this is a bug. The MINA code in AbstractPollingIoAcceptor
seems to rely
on the fact that the wakeup issued before a select in not lost. But the fact is
that the
two threads are racing, and if the wakeup IS lost the select will remain
blocked forever,
as in the example below.
Details:
-------
IN AbstractPollingIoAcceptor.bindInternal() the expected program order is :
1. startupAcceptor() -> Acceptor.run() -> select() -> BC::performSelect...
2. wakeup -> NioAcceptor.wakeup() -> selector.wakeup() -> BC:: wakeup
At runtime, 2. and 1. above are reversed, so the wakeup is lost and 1. hangs as
a result.
CALL TRACE
---------
Thread interleaving starting with AbstractPollingIoAcceptor.bind(Iterable):
--- AbstractIoAcceptor::bind(Iterable): calling bindInternal
--- bindInternal: calling startupAcceptor
startupAcceptor: Starting NEW Acceptor
--- bindInternal: calling wakeup()
:::BC : wakeup() sets wakeupStepCount to 1
--- bindInternal: before await
Accepptor.run: calling select()
:::BC : performSelect calling doSelectSocketUDT with wakeupBaseCount 0
wakeupStepCount 1
THREAD DUMP
-----------
The 2 threads waiting (one in select, the other in await) :
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode):
"NioUDTSocketAcceptor-1" prio=10 tid=0x00002aaab400d000 nid=0x5f8f runnable
[0x0000000040f04000]
java.lang.Thread.State: RUNNABLE
at com.barchart.udt.SocketUDT.select1(Native Method)
at com.barchart.udt.SocketUDT.select(SocketUDT.java:644)
at
com.barchart.udt.nio.SelectorUDT.doSelectSocketUDT(SelectorUDT.java:406)
at com.barchart.udt.nio.SelectorUDT.performSelect(SelectorUDT.java:528)
at com.barchart.udt.nio.SelectorUDT.doSelectReally(SelectorUDT.java:575)
at
com.barchart.udt.nio.SelectorUDT.doSelectInsideLock(SelectorUDT.java:333)
- locked <0x00000000eb2d3ee0> (a com.barchart.udt.nio.HelperNIOUDT$1)
- locked <0x00000000eb2d3ef0> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000eb2d3e78> (a com.barchart.udt.nio.SelectorUDT)
at com.barchart.udt.nio.SelectorUDT.select(SelectorUDT.java:191)
at com.barchart.udt.nio.SelectorUDT.select(SelectorUDT.java:181)
at
org.apache.mina.transport.socket.nio.NioUDTSocketAcceptor.select(NioUDTSocketAcceptor.java:279)
at
org.apache.mina.core.polling.AbstractPollingIoAcceptor$Acceptor.run(AbstractPollingIoAcceptor.jav
a:413)
at
org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"Low Memory Detector" daemon prio=10 tid=0x00000000454ca000 nid=0x5f8d runnable
[0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00000000454c7000 nid=0x5f8c waiting
on condition [0x00000000000000
00]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00000000454c2800 nid=0x5f8b waiting
on condition [0x00000000000000
00]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00000000454c0000 nid=0x5f8a waiting on
condition [0x000000000000000
0]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x000000004549e000 nid=0x5f89 in Object.wait()
[0x0000000041cf1000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000eb2d4220> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00000000eb2d4220> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=10 tid=0x000000004549c000 nid=0x5f88 in
Object.wait() [0x0000000041bf0000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000eb2d00b0> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00000000eb2d00b0> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x0000000045440000 nid=0x5f86 in Object.wait()
[0x0000000040644000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000eb2d42c0> (a
org.apache.mina.core.service.AbstractIoAcceptor$AcceptorOperati
onFuture)
at
org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:189)
- locked <0x00000000eb2d42c0> (a
org.apache.mina.core.service.AbstractIoAcceptor$AcceptorOperationFu
ture)
at
org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:130)
at
org.apache.mina.core.polling.AbstractPollingIoAcceptor.bindInternal(AbstractPollingIoAcceptor.jav
a:330)
at
org.apache.mina.core.service.AbstractIoAcceptor.bind(AbstractIoAcceptor.java:291)
- locked <0x00000000eb2d4308> (a java.lang.Object)
at
org.apache.mina.core.service.AbstractIoAcceptor.bind(AbstractIoAcceptor.java:226)
at MinaTimeServer.main(MinaTimeServer.java:86)
"VM Thread" prio=10 tid=0x0000000045495000 nid=0x5f87 runnable
"VM Periodic Task Thread" prio=10 tid=0x00000000454d5000 nid=0x5f8e waiting on
condition
JNI global references: 1193
------------------------------------------------------------------------------------------
CODE - this is the code that causes it :
protected final Set<SocketAddress> bindInternal(
List<? extends SocketAddress> localAddresses) throws Exception {
// Create a bind request as a Future operation. When the selector
// have handled the registration, it will signal this future.
AcceptorOperationFuture request = new AcceptorOperationFuture(
localAddresses);
// adds the Registration request to the queue for the Workers
// to handle
registerQueue.add(request);
// creates the Acceptor instance and has the local
// executor kick it off.
System.out.println("--- bindInternal: calling startupAcceptor");
startupAcceptor();
// As we just started the acceptor, we have to unblock the select()
// in order to process the bind request we just have added to the
// registerQueue.
System.out.println("--- bindInternal: calling wakeup()");
wakeup();
System.out.println("--- bindInternal: before await");
// Now, we wait until this request is completed.
request.awaitUninterruptibly();
The implicit assumption here seems to be that either :
1. The acceptor thread will run and call select() before this thread (the
parent thread)
calls wakep();
OR
2. The wakeup() is memorized and, even if the acceptor thread runs after the
wakeup()
was issued, its select() will be interrupted.
Please let me know if this is a race bug, or if the code implicitly relies on
#2 above.
If not, then there should be some sort of synchronization between these 2
threads,
to prevent the select in the acceptor being called before the wakeup in the
parent thread.
Related question: Are there other places in MINA that may exhibit this behavior
?
Thanks.
Florin
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira