Hi,

I'm working through the failure scenarios to understand where we could lose 
data and I'm receiving a dead letter error unexpectedly.

The model is a Collector Actor and a set of Worker Actors. Collector is the 
parent to the Workers. A 'boss' is watching the Collector with a restart() 
SupervisorStrategy. Workers 'pull' work from the Collector, so when a 
Worker finishes, it sends a 'done' request then the Collector finds the 
next thing to do and sends it.

We are deliberately throwing an exception in the Collector's onReceive() 
method so it gets restarted to understand the flow. We are also making the 
Worker Actors sleep for 5 seconds to simulate activity when they receive 
work.

Frequently we see dead letter errors  on the LAST message sent from the 
Collector before we throw the exception:

[INFO] [03/06/2014 08:59:04.801] [Main-akka.actor.default-dispatcher-2] 
[akka://Main/user/app/Supervisor/1394114329779_4] Message 
[com.silverpop.rnd.collector_failure.TestMessage] from 
Actor[akka://Main/user/app/Supervisor#-138733138] to 
Actor[akka://Main/user/app/Supervisor/1394114329779_4#1820908525] was not 
delivered. [1] dead letters encountered. This logging can be turned off or 
adjusted with configuration settings 'akka.log-dead-letters' and 
'akka.log-dead-letters-during-shutdown'.

Based on the timestamps in the logs, it looks like the restart process is 
stopping the Worker (who is idle at this point) BEFORE the write to its 
Mailbox, so the message is getting send to the dead letter.

Sample code:

@Override
    public void onReceive(Object msg) {

        m_numProcessed++;

        TestMessage test = (TestMessage) msg;
        log.error("Received message from: " + sender().path().toString() + 
" " + ((TestMessage) msg).getMessageNumber());

        ActorRef sender = sender();
        if (sender.isTerminated()) {
            log.error("Sender was terminated, so not sending the next 
message. " + sender.path());
        } else {
            int msgId = m_messageNumber++;
            log.error("Sending message id: " + msgId);
            sender().tell(new TestMessage(NUM_SECONDS, msgId), getSelf());
        }

        boolean blowup = true;
        if (blowup && m_numProcessed % 20 == 0) {
            throw new RuntimeException("Blowing up the Collector");
        }

    }


Full logs:

[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Received message from: 
akka://Main/user/app/Supervisor/1394114329779_3 16
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Sending message id: 20
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Received message from: 
akka://Main/user/app/Supervisor/1394114329779_0 17
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Sending message id: 21
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Received message from: 
akka://Main/user/app/Supervisor/1394114329779_2 15
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Sending message id: 22
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Received message from: 
akka://Main/user/app/Supervisor/1394114329779_1 18
[ERROR] [03/06/2014 08:59:04.794] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor] Sending message id: 23
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-6] 
[akka://Main/user/app/Supervisor] Received message from: 
akka://Main/user/app/Supervisor/1394114329779_4 19
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-6] 
[akka://Main/user/app/Supervisor] Sending message id: 24
[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-2] 
[akka://Main/user/app/Supervisor] Blowing up the Collector
java.lang.RuntimeException: Blowing up the Collector
at 
com.silverpop.rnd.collector_failure.CollectorFailure.onReceive(CollectorFailure.java:68)
at 
akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

[ERROR] [03/06/2014 08:59:04.796] [Main-akka.actor.default-dispatcher-2] 
[akka://Main/user/app/Supervisor] Collector: I got a preRestart
[ERROR] [03/06/2014 08:59:04.797] [Main-akka.actor.default-dispatcher-6] 
[akka://Main/user/app/Supervisor/1394114329779_4] 
akka://Main/user/app/Supervisor/1394114329779_4 I got a post-stop message? 
Why?
[INFO] [03/06/2014 08:59:04.801] [Main-akka.actor.default-dispatcher-2] 
[akka://Main/user/app/Supervisor/1394114329779_4] Message 
[com.silverpop.rnd.collector_failure.TestMessage] from 
Actor[akka://Main/user/app/Supervisor#-138733138] to 
Actor[akka://Main/user/app/Supervisor/1394114329779_4#1820908525] was not 
delivered. [1] dead letters encountered. This logging can be turned off or 
adjusted with configuration settings 'akka.log-dead-letters' and 
'akka.log-dead-letters-during-shutdown'.
[ERROR] [03/06/2014 08:59:09.798] [Main-akka.actor.default-dispatcher-9] 
[akka://Main/user/app/Supervisor/1394114329779_0] 
akka://Main/user/app/Supervisor/1394114329779_0 I got a post-stop message? 
Why?
[ERROR] [03/06/2014 08:59:09.798] [Main-akka.actor.default-dispatcher-10] 
[akka://Main/user/app/Supervisor/1394114329779_3] 
akka://Main/user/app/Supervisor/1394114329779_3 I got a post-stop message? 
Why?


looking at the timestamps, the other threads 
(akka://Main/user/app/Supervisor/1394114329779_0 for example) finish their 
5 second sleep before getting the postShutdown message as you'd expect. 

but akka://Main/user/app/Supervisor/1394114329779_4 gets the postShutdown 
message immediately and then the dead letter error.

Reading the documentation, shouldn't the inbox for that Worker still exist 
while Collector is doing it's restart logic? Or is the issue because the 
child Actor is destroyed and not doing its own 'restart' logic?

Java 1.7, Akka 2.3.0 running on a local machine from in IntelliJ.

Thanks,

Chris

-- 
>>>>>>>>>>      Read the docs: http://akka.io/docs/
>>>>>>>>>>      Check the FAQ: 
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>>      Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to