Hi Hector,

I can't see what is wrong from that log. As you say, it starts failing almost immediately, but there is some successful communication (cluster joining) before the first failure.

I suggest that you minimize your application and then, step by step, try to find which component is causing the failure. I can see that you use several external things, such as Camel, Rabbit, and Kryo.

/Patrik

On Mon, Oct 20, 2014 at 10:00 PM, Héctor Veiga <[email protected]> wrote:

> Hi Patrik,
>
> First of all, thank you for the help. We are using Akka 2.3.5. I didn't
> update to 2.3.6 because I didn't see any issue fixed on GitHub that was
> related to Akka Cluster. However, if you think it's a good idea to update,
> I can do it right away.
>
> The size of our messages is around 100KB (I don't think this should be a
> problem). I enabled DEBUG and akka.remote.log-frame-size-exceeding=5000
> but I didn't see anything new in the log.
>
> Also, there could be something wrong in my application. I am now seeing
> many dead-letter log entries like this:
>
> 2014-10-20 19:30:49,154 INFO [Main-akka.actor.default-dispatcher-4] -
> Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$3.apply$mcV$sp
> (Slf4jLogger.scala:74) - Message [com.mylib.MyObject] from
> Actor[akka://Main/user/app/dataValidator/$b#-1994167073] to
> Actor[akka://Main/deadLetters] was not delivered. [13] dead letters
> encountered. This logging can be turned off or adjusted with configuration
> settings 'akka.log-dead-letters' and
> 'akka.log-dead-letters-during-shutdown'.
>
> The message flow in my actor system should be:
> QueueConsumer > DataValidator > ClusterAwareRouter > Processor (in local node or remote node) > PublishActor.
>
> I am attaching the whole log. We also got a Kryo exception I have never
> seen before. I will keep testing things in the meantime.
>
> Thanks,
>
> Hector Veiga.
>
> On Monday, October 20, 2014 at 10:41:59 UTC-5, Patrik Nordwall wrote:
>>
>> Hi Hector,
>>
>> On Mon, Oct 20, 2014 at 4:23 AM, Héctor Veiga <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have been using Akka for a few months already, but it's the first
>>> time I am playing with Akka Cluster. I am trying to have a clustered app
>>> that will consist of 3 nodes. Each node will be reading from a messaging
>>> queue and then sending the data to the correct actor through a
>>> consistent-hashing cluster-aware router based on different parameters.
>>> When starting the cluster I have no issues. I can see all the nodes
>>> joining the cluster and they are all marked as Up. However, the moment I
>>> start to send data through the messaging queue I get a few dead-letter
>>> messages like these (sorry for the long log):
>>>
>>
>> Are you using Akka version 2.3.6?
>>
>> Do you mean that all messages go to deadLetters, also when you send a few
>> messages?
>>
>> How large are the messages that you are sending?
>> To see message size you can change akka.loglevel to DEBUG and use config
>> akka.remote.log-frame-size-exceeding=5000
>> see http://doc.akka.io/docs/akka/2.3.6/general/configuration.html#akka-remote
>>
>> I would be interested in the full log before it starts failing.
>>
>> /Patrik
>>
>>>
>>> 2014-10-20 01:42:40,051 INFO [Main-akka.actor.default-dispatcher-4] -
>>> Message [akka.remote.transport.AssociationHandle$Disassociated] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.145.54%3A56388-3#-919015728]
>>> was not delivered. [1] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:40,065 WARN [Main-akka.actor.default-dispatcher-4] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>> 2014-10-20 01:42:40,073 INFO [Main-akka.actor.default-dispatcher-4] -
>>> Message
>>> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.145.54%3A56388-3#-919015728]
>>> was not delivered. [2] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:40,850 WARN [Main-akka.actor.default-dispatcher-30] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Failed to
>>> write message to the transport].
>>> 2014-10-20 01:42:40,864 INFO [Main-akka.actor.default-dispatcher-23] -
>>> Message [akka.remote.transport.AssociationHandle$Disassociated] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FMain%40node2.myapp.com%3A3351-1/endpointWriter/endpointReader-akka.tcp%3A%2F%2FMain%40node2.myapp.com%3A3351-0#-1831360146]
>>> was not delivered. [3] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:40,868 INFO [Main-akka.actor.default-dispatcher-23] -
>>> Message [akka.remote.transport.AssociationHandle$Disassociated] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.144.222%3A48127-2#-1180771271]
>>> was not delivered. [4] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:41,486 INFO [Main-akka.actor.default-dispatcher-30] -
>>> Message [akka.remote.transport.AssociationHandle$Disassociated] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.144.217%3A58082-1#-1870263836]
>>> was not delivered. [5] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:41,487 INFO [Main-akka.actor.default-dispatcher-30] -
>>> Message [akka.remote.transport.AssociationHandle$Disassociated] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.144.217%3A58082-1#-1870263836]
>>> was not delivered. [6] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:41,488 WARN [Main-akka.actor.default-dispatcher-30] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>> 2014-10-20 01:42:41,490 INFO [Main-akka.actor.default-dispatcher-30] -
>>> Message
>>> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from
>>> Actor[akka://Main/deadLetters] to
>>> Actor[akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.144.217%3A58082-1#-1870263836]
>>> was not delivered. [7] dead letters encountered. This logging can be turned
>>> off or adjusted with configuration settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:41,785 INFO [Main-akka.actor.default-dispatcher-28] -
>>> Message [akka.cluster.ClusterHeartbeatSender$Heartbeat] from
>>> Actor[akka://Main/system/cluster/core/daemon/heartbeatSender#315593511] to
>>> Actor[akka://Main/deadLetters] was not delivered. [8] dead letters
>>> encountered. This logging can be turned off or adjusted with configuration
>>> settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:41,787 INFO [Main-akka.actor.default-dispatcher-28] -
>>> Message [akka.cluster.ClusterHeartbeatSender$Heartbeat] from
>>> Actor[akka://Main/system/cluster/core/daemon/heartbeatSender#315593511] to
>>> Actor[akka://Main/deadLetters] was not delivered. [9] dead letters
>>> encountered. This logging can be turned off or adjusted with configuration
>>> settings 'akka.log-dead-letters' and
>>> 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:41,788 INFO [Main-akka.actor.default-dispatcher-28] -
>>> Message [akka.cluster.ClusterHeartbeatSender$Heartbeat] from
>>> Actor[akka://Main/system/cluster/core/daemon/heartbeatSender#315593511] to
>>> Actor[akka://Main/deadLetters] was not delivered. [10] dead letters
>>> encountered, no more dead letters will be logged. This logging can be
>>> turned off or adjusted with configuration settings 'akka.log-dead-letters'
>>> and 'akka.log-dead-letters-during-shutdown'.
>>> 2014-10-20 01:42:46,751 WARN [Main-akka.actor.default-dispatcher-25] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Failed to
>>> write message to the transport].
>>> 2014-10-20 01:42:46,755 WARN [Main-akka.actor.default-dispatcher-25] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Failed to
>>> write message to the transport].
>>> 2014-10-20 01:42:54,205 WARN [Main-akka.actor.default-dispatcher-3] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Failed to
>>> write message to the transport].
>>> 2014-10-20 01:42:54,213 WARN [Main-akka.actor.default-dispatcher-3] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Failed to
>>> write message to the transport].
>>> 2014-10-20 01:42:54,357 WARN [Main-akka.actor.default-dispatcher-27] -
>>> Association with remote system [akka.tcp://[email protected]:3351]
>>> has failed, address is now gated for [5000] ms. Reason is: [Failed to
>>> write message to the transport].
>>>
>>> And I keep getting those messages until I stop sending data. Also, I
>>> have noticed some actors are using the hostname
>>> (akka.tcp://[email protected]:3351) and some the private network IPs
>>> (akka://Main/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FMain%4010.196.145.54%3A56388-3#-919015728).
>>> Is this an issue?
>>>
>>> Moreover, when I increase the load I start seeing messages that some
>>> nodes become UNREACHABLE and a few seconds later they become REACHABLE
>>> again. I have been reading other threads in this group and I followed a
>>> few tips, like having a cluster dispatcher and tweaking the cluster
>>> failure detector. My configuration looks like this:
>>>
>>>   cluster {
>>>     seed-nodes = ["akka.tcp://[email protected]:3351"]
>>>     use-dispatcher = cluster-dispatcher
>>>     gossip-interval = 5s
>>>     unreachable-nodes-reaper-interval = 5s
>>>     auto-down-unreachable-after = 300s
>>>     seed-node-timeout=60s
>>>
>>>     failure-detector {
>>>       acceptable-heartbeat-pause = 15s
>>>       threshold = 15
>>>       heartbeat-interval = 3s
>>>       monitored-by-nr-of-members = 5
>>>       expected-response-after = 15s
>>>     }
>>>
>>>     metrics {
>>>       enabled = off
>>>     }
>>>   }
>>>   ...
>>>   cluster-dispatcher {
>>>     type = "Dispatcher"
>>>     executor = "fork-join-executor"
>>>     fork-join-executor {
>>>       parallelism-min = 2
>>>       parallelism-max = 4
>>>     }
>>>     extensions = ["com.romix.akka.serialization.kryo.KryoSerializationExtension$"]
>>>   }
>>>
>>> And my actor deployment configuration looks like this:
>>>
>>>   /app/dataProcessor {
>>>     router = balancing-pool
>>>     nr-of-instances = 1
>>>   }
>>>
>>>   /app/smartFilterRouter {
>>>     router = consistent-hashing-group
>>>     nr-of-instances = 1000
>>>     routees.paths = ["/user/app/dataProcessor"]
>>>     cluster {
>>>       enabled = on
>>>       max-nr-of-instances-per-node = 1
>>>       allow-local-routees = on
>>>       extensions = ["com.romix.akka.serialization.kryo.KryoSerializationExtension$"]
>>>     }
>>>   }
>>>
>>> What can I tweak in order to get Akka Cluster to work properly? Am I
>>> missing any important configuration?
>>>
>>> Thank you for your help and time,
>>>
>>> Hector Veiga.
>>>
>>> --
>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Akka User List" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/akka-user.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>> --
>>
>> Patrik Nordwall
>> Typesafe <http://typesafe.com/> - Reactive apps on the JVM
>> Twitter: @patriknw

--

Patrik Nordwall
Typesafe <http://typesafe.com/> - Reactive apps on the JVM
Twitter: @patriknw
