Hi, I am using Akka 2.4.9 with persistence, cluster and so on.
I have two nodes and everything works fine. I was sending a "custom" message from node1 to node2, but the message failed to be serialized (due to a bug in my code). This resulted in the akka-remoting connection being GATED (http://doc.akka.io/docs/akka/current/scala/remoting.html#Lifecycle_and_Failure_Recovery_Model), which in turn resulted in the cluster nodes being UNREACHABLE. So to me it looks like the serialization issue with this one single message brought the cluster "down".

Logs from node2:

2016-08-31 12:11:59,948 ERROR akkaSrc=akka.tcp://x@node2/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fx%40node1-0/endpointWriter AssociationError [akka.tcp://x@node2] -> [akka.tcp://x@node1]: Error [Failed to write message to the transport] [
akka.remote.EndpointException: *Failed to write message to the transport*
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class z.x.engine.BBDestEHInstance] value failed (z.x.engine.BBError): id cannot be null - akka.remote.EndpointWriter, akkaThrd=x-akka.remote.default-remote-dispatcher-6, thrd=x-akka.actor.default-dispatcher-4, lgr=akka.remote.EndpointWriter

Call stack snippet:

.....
.....
    at no.nextgentel.oss.akkatools.serializing.JacksonJsonSerializer.toBinary(JacksonJsonSerializer.scala:84)
    at akka.remote.MessageSerializer$.*serialize*(MessageSerializer.scala:37)
    at akka.remote.*EndpointWriter*$$anonfun$serializeMessage$1.apply(Endpoint.scala:886)
    at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:886)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:885)
    at akka.remote.EndpointWriter.writeSend(Endpoint.scala:780)
    at akka.remote.EndpointWriter$$anonfun$4.applyOrElse(Endpoint.scala:755)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:484)
    at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:447)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
    at akka.actor.ActorCell.invoke(ActorCell.scala:495)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
    at akka.dispatch.Mailbox.run(Mailbox.scala:224)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

2016-08-31 12:11:59,967 WARN akkaSrc=akka.tcp://x@node2/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fx%40node1-0 Association with remote system [akka.tcp://x@node1] *has failed, address is now gated for [5000] ms*. Reason: [Failed to write message to the transport] Caused by: [Instantiation of [simple type, class z.x.engine.BBDestEHInstance] value failed (z.x.engine.BBError): id cannot be null - akka.remote.ReliableDeliverySupervisor, akkaThrd=x-akka.remote.default-remote-dispatcher-6, thrd=x-akka.actor.default-dispatcher-4, lgr=akka.remote.ReliableDeliverySupervisor

2016-08-31 12:12:03,802 WARN akkaSrc=akka.tcp://x@node2/system/cluster/core/daemon Cluster Node [akka.tcp://x@node2] - Marking node(s) as *UNREACHABLE* [Member(address = akka.tcp://x@node1, status = Up)]. Node roles [] - akka.cluster.ClusterCoreDaemon, akkaThrd=x-akka.actor.default-dispatcher-19, thrd=x-akka.actor.default-dispatcher-28, lgr=akka.cluster.ClusterCoreDaemon

2016-08-31 12:12:03,810 WARN akkaSrc=akka.tcp://x@node2/user/$a Member detected as *unreachable*: Member(address = akka.tcp://x@node1, status = Up) - memberCount: 2 - no.nextgentel.oss.akkatools.cluster.ClusterListener, akkaThrd=x-akka.actor.default-dispatcher-78, thrd=x-akka.actor.default-dispatcher-20, lgr=no.nextgentel.oss.akkatools.cluster.ClusterListener

Logs from node1:

2016-08-31 12:12:04,708 WARN akkaSrc=akka.tcp://x@node1/user/$a Member detected as *unreachable*: Member(address = akka.tcp://x@node2, status = Up) - memberCount: 2 - no.nextgentel.oss.akkatools.cluster.ClusterListener, akkaThrd=x-akka.actor.default-dispatcher-28, thrd=x-akka.actor.default-dispatcher-28, lgr=no.nextgentel.oss.akkatools.cluster.ClusterListener

Is my understanding of the problem correct? Is this the correct behavior? Can I do something different to prevent it from happening?

-Morten

---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
