Hi,
I'm working through the failure scenarios to understand where we could lose
data and I'm receiving a dead letter error unexpectedly.
The model is a Collector Actor and a set of Worker Actors. Collector is the
parent to the Workers. A 'boss' is watching the Collector with a restart()
SupervisorStrategy. Workers 'pull' work from the Collector, so when a
Worker finishes, it sends a 'done' request then the Collector finds the
next thing to do and sends it.
We are deliberately throwing an exception in the Collector's onReceive()
method so it gets restarted to understand the flow. We are also making the
Worker Actors sleep for 5 seconds to simulate activity when they receive
work.
Frequently we see dead letter errors on the LAST message sent from the
Collector before we throw the exception:
[INFO] [03/06/2014 08:59:04.801] [Main-akka.actor.default-dispatcher-2]
[akka://Main/user/app/Supervisor/1394114329779_4] Message
[com.silverpop.rnd.collector_failure.TestMessage] from
Actor[akka://Main/user/app/Supervisor#-138733138] to
Actor[akka://Main/user/app/Supervisor/1394114329779_4#1820908525] was not
delivered. [1] dead letters encountered. This logging can be turned off or
adjusted with configuration settings 'akka.log-dead-letters' and
'akka.log-dead-letters-during-shutdown'.
Based on the timestamps in the logs, it looks like the restart process is
stopping the Worker (who is idle at this point) BEFORE the write to its
Mailbox, so the message is getting send to the dead letter.
Sample code:
@Override
public void onReceive(Object msg) {
m_numProcessed++;
TestMessage test = (TestMessage) msg;
log.error("Received message from: " + sender().path().toString() +
" " + ((TestMessage) msg).getMessageNumber());
ActorRef sender = sender();
if (sender.isTerminated()) {
log.error("Sender was terminated, so not sending the next
message. " + sender.path());
} else {
int msgId = m_messageNumber++;
log.error("Sending message id: " + msgId);
sender().tell(new TestMessage(NUM_SECONDS, msgId), getSelf());
}
boolean blowup = true;
if (blowup && m_numProcessed % 20 == 0) {
throw new RuntimeException("Blowing up the Collector");
}
}
Full logs:
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Received message from:
akka://Main/user/app/Supervisor/1394114329779_3 16
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Sending message id: 20
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Received message from:
akka://Main/user/app/Supervisor/1394114329779_0 17
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Sending message id: 21
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Received message from:
akka://Main/user/app/Supervisor/1394114329779_2 15
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Sending message id: 22
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Received message from:
akka://Main/user/app/Supervisor/1394114329779_1 18
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor] Sending message id: 23
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-6]
[akka://Main/user/app/Supervisor] Received message from:
akka://Main/user/app/Supervisor/1394114329779_4 19
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-6]
[akka://Main/user/app/Supervisor] Sending message id: 24
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-2]
[akka://Main/user/app/Supervisor] Blowing up the Collector
java.lang.RuntimeException: Blowing up the Collector
at
com.silverpop.rnd.collector_failure.CollectorFailure.onReceive(CollectorFailure.java:68)
at
akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-2]
[akka://Main/user/app/Supervisor] Collector: I got a preRestart
[ERROR] [03/06/2014 08:59:04.797] [Main-akka.actor.default-dispatcher-6]
[akka://Main/user/app/Supervisor/1394114329779_4]
akka://Main/user/app/Supervisor/1394114329779_4 I got a post-stop message?
Why?
[INFO] [03/06/2014 08:59:04.801] [Main-akka.actor.default-dispatcher-2]
[akka://Main/user/app/Supervisor/1394114329779_4] Message
[com.silverpop.rnd.collector_failure.TestMessage] from
Actor[akka://Main/user/app/Supervisor#-138733138] to
Actor[akka://Main/user/app/Supervisor/1394114329779_4#1820908525] was not
delivered. [1] dead letters encountered. This logging can be turned off or
adjusted with configuration settings 'akka.log-dead-letters' and
'akka.log-dead-letters-during-shutdown'.
[ERROR] [03/06/2014 08:59:09.798] [Main-akka.actor.default-dispatcher-9]
[akka://Main/user/app/Supervisor/1394114329779_0]
akka://Main/user/app/Supervisor/1394114329779_0 I got a post-stop message?
Why?
[ERROR] [03/06/2014 08:59:09.798] [Main-akka.actor.default-dispatcher-10]
[akka://Main/user/app/Supervisor/1394114329779_3]
akka://Main/user/app/Supervisor/1394114329779_3 I got a post-stop message?
Why?
looking at the timestamps, the other threads
(akka://Main/user/app/Supervisor/1394114329779_0 for example) finish their
5 second sleep before getting the postShutdown message as you'd expect.
but akka://Main/user/app/Supervisor/1394114329779_4 gets the postShutdown
message immediately and then the dead letter error.
Reading the documentation, shouldn't the inbox for that Worker still exist
while Collector is doing it's restart logic? Or is the issue because the
child Actor is destroyed and not doing its own 'restart' logic?
Java 1.7, Akka 2.3.0 running on a local machine from in IntelliJ.
Thanks,
Chris
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ:
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/groups/opt_out.