Re: [akka-user] Re: Node quarantined
This is the latest version of akka for java 7.

On Friday, April 29, 2016 at 3:18:55 PM UTC-4, Patrik Nordwall wrote:
> There can be several reasons, but a good start is to use the latest Akka version.
> On Thursday, 28 April 2016 at 21:13, Guido Medina wrote:
>> Hi Ben,
>>
>> In my experience Netty 3 doesn't get much love; issues are barely fixed.
>> As I mentioned before, I'm running my own Netty 3.10.6 built internally.
>> Also, 3.10.0 is not even a good version; if you want, force your version
>> to 3.10.5.Final until they release 3.10.6.Final, which has nice fixes.
>>
>> Or you could get my branch, set the version to whatever is comfortable
>> for you, and build your own Netty.
>>
>> My branch: https://github.com/guidomedina/netty/commits/3.10-SFS
>>
>> It has the following milestone:
>> https://github.com/netty/netty/issues?q=milestone%3A3.10.6.Final+is%3Aclosed
>>
>> plus some minor fixes I added myself. Of interest, there is a race
>> condition fixed in 3.10.6, and I saw another between 3.10.0 and 3.10.5
>> which might be causing the issue you are experiencing.
>>
>> HTH,
>>
>> Guido.
>>
>> --
>> Read the docs: http://akka.io/docs/
>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>> Search the archives: https://groups.google.com/group/akka-user
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to akka-user+...@googlegroups.com.
>> To post to this group, send email to akka...@googlegroups.com.
>> Visit this group at https://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
Re: [akka-user] Re: Node quarantined
14:00:10.356 WARN [geyser-akka.remote.default-remote-dispatcher-9] Remoting - Association to [akka.tcp://geyser@172.16.125.13:7000] having UID [-1471771858] is irrecoverably failed. UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.
14:00:10.385 WARN [geyser-akka.remote.default-remote-dispatcher-10] a.r.EndpointWriter - AssociationError [akka.tcp://geyser@172.16.119.46:7000] -> [akka.tcp://geyser@172.16.125.13:7000]: Error [Invalid address: akka.tcp://geyser@172.16.125.13:7000] [
akka.remote.InvalidAssociation: Invalid address: akka.tcp://geyser@172.16.125.13:7000
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system has a UID that has been quarantined. Association aborted.
]
14:00:10.386 INFO [geyser-akka.remote.default-remote-dispatcher-27] Remoting - Quarantined address [akka.tcp://geyser@172.16.125.13:7000] is still unreachable or has not been restarted. Keeping it quarantined.
13:59:57.544 INFO [geyser-akka.actor.default-dispatcher-187] AngelOfTheAbyss - Unreachable member (Member(address = akka.tcp://geyser@172.16.119.42:7000, status = Up)|Size:4)
13:59:58.359 INFO [geyser-akka.actor.default-dispatcher-178] AngelOfTheAbyss - Unreachable member (Member(address = akka.tcp://geyser@172.16.125.13:7000, status = Up)|Size:3)
14:00:11.358 INFO [geyser-akka.actor.default-dispatcher-32] AngelOfTheAbyss - Member removed (Member(address = akka.tcp://geyser@172.16.119.42:7000, status = Removed)|Size:3)
14:00:11.359 INFO [geyser-akka.actor.default-dispatcher-32] AngelOfTheAbyss - Member removed (Member(address = akka.tcp://geyser@172.16.125.13:7000, status = Removed)|Size:3)
14:00:11.361 WARN [geyser-akka.remote.default-remote-dispatcher-27] Remoting - Association to [akka.tcp://geyser@172.16.119.42:7000] having UID [-477546934] is irrecoverably failed.
UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.
14:00:11.361 WARN [geyser-akka.remote.default-remote-dispatcher-27] Remoting - Association to [akka.tcp://geyser@172.16.125.13:7000] having UID [-1471771858] is irrecoverably failed. UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.

Is there anything abnormal in the logs?

Regards,
Ben

On Wednesday, March 23, 2016 at 9:33:02 AM UTC-4, Benjamin Black wrote:
> I look forward to trying out the new version. I'm not totally sure it is the
> same issue: I'm seeing this happen on a cluster where no node is being
> restarted. I shall continue to investigate what has changed on my side,
> because I wasn't seeing this before I upgraded other libraries.
>
> On Wednesday, March 23, 2016 at 2:08:10 AM UTC-4, Patrik Nordwall wrote:
>> We have fixed the issue that shows up as
>> "Error encountered while processing system message acknowledgement
>> buffer: [-1 {}] ack: ACK[6, {}]"
>>
>> https://github.com/akka/akka/pull/20093
>>
>> It will be released in 2.4.3 and 2.3.15, probably by end of next week.
>>
>> /Patrik
>> On Tuesday, 22 March 2016 at 23:39, Guido Medina <oxy...@gmail.com> wrote:
>>> Yeah, sorry, I thought it was related to the rolling restart.
>>>
>>> As for Netty, I'm using a *not-yet-published* Netty with the following
>>> fixes:
>>>
>>> https://github.com/netty/netty/issues?q=milestone%3A3.10.6.Final+is%3Aclosed
>>>
>>> You can just get it from Git and:
>>>
>>> $ git checkout 3.10
>>> $ mvn versions:set -DnewVersion=3.10.6.Final -DgenerateBackupPoms=false
>>> $ mvn clean install
>>>
>>> and see if your problem goes away.
>>>
>>> Guido.
>>> On Tuesday, March 22, 2016 at 10:27:26 PM UTC, Benjamin Black wrote:
>>>> Hi Guido, yes, I'm aware of the leaving-cluster conversation, as I
>>>> started it :-) This is a separate issue. I am observing this behavior while
>>>> the cluster seems stable, with no nodes being added/removed. I suspect that
>>>> this issue was first observed when I upgraded a different library that
>>>> brought in a new version of the netty library.
>>>>
>>>> On Tuesday, March 22, 2016 at 6:23:14 PM UTC-4, Guido Medina wrote:
>>>>> Hi Benjamin,
>>>>>
>>>>> You have nodes with predefined ports. One thing I have which
>>>>> eliminates that problem for these nodes is that only my seed node(s)
>>>>> have the port set; the rest will just get a dynamic and available port.
Re: [akka-user] Re: Node quarantined
I look forward to trying out the new version. I'm not totally sure it is the same issue: I'm seeing this happen on a cluster where no node is being restarted. I shall continue to investigate what has changed on my side, because I wasn't seeing this before I upgraded other libraries.

On Wednesday, March 23, 2016 at 2:08:10 AM UTC-4, Patrik Nordwall wrote:
> We have fixed the issue that shows up as
> "Error encountered while processing system message acknowledgement buffer:
> [-1 {}] ack: ACK[6, {}]"
>
> https://github.com/akka/akka/pull/20093
>
> It will be released in 2.4.3 and 2.3.15, probably by end of next week.
>
> /Patrik
> On Tuesday, 22 March 2016 at 23:39, Guido Medina <oxy...@gmail.com> wrote:
>> Yeah, sorry, I thought it was related to the rolling restart.
>>
>> As for Netty, I'm using a *not-yet-published* Netty with the following
>> fixes:
>>
>> https://github.com/netty/netty/issues?q=milestone%3A3.10.6.Final+is%3Aclosed
>>
>> You can just get it from Git and:
>>
>> $ git checkout 3.10
>> $ mvn versions:set -DnewVersion=3.10.6.Final -DgenerateBackupPoms=false
>> $ mvn clean install
>>
>> and see if your problem goes away.
>>
>> Guido.
>>
>> On Tuesday, March 22, 2016 at 10:27:26 PM UTC, Benjamin Black wrote:
>>> Hi Guido, yes, I'm aware of the leaving-cluster conversation, as I started
>>> it :-) This is a separate issue. I am observing this behavior while the
>>> cluster seems stable, with no nodes being added/removed. I suspect that this
>>> issue was first observed when I upgraded a different library that brought
>>> in a new version of the netty library.
>>>
>>> On Tuesday, March 22, 2016 at 6:23:14 PM UTC-4, Guido Medina wrote:
>>>> Hi Benjamin,
>>>>
>>>> You have nodes with predefined ports. One thing I have which eliminates
>>>> that problem is that only my seed node(s) have the port set; the rest will
>>>> just get a dynamic, available port, so a node gets a different port when
>>>> you do a rolling restart.
>>>> I suspect you are doing a rolling restart, right? So you need to wait
>>>> for the node with that address to completely leave the cluster (I'm also
>>>> doing that); basically you terminate your system when you receive the
>>>> message *MemberRemoved* for the *self* address.
>>>>
>>>> I think I saw a discussion related to quarantined nodes when they
>>>> re-join using the same address; not sure if it was here or an actual
>>>> GitHub ticket.
>>>>
>>>> HTH,
>>>>
>>>> Guido.
[akka-user] Re: Node quarantined
Hi Guido, yes, I'm aware of the leaving-cluster conversation, as I started it :-) This is a separate issue. I am observing this behavior while the cluster seems stable, with no nodes being added/removed. I suspect that this issue was first observed when I upgraded a different library that brought in a new version of the netty library.

On Tuesday, March 22, 2016 at 6:23:14 PM UTC-4, Guido Medina wrote:
> Hi Benjamin,
>
> You have nodes with predefined ports. One thing I have which eliminates
> that problem is that only my seed node(s) have the port set; the rest will
> just get a dynamic, available port, so a node gets a different port when
> you do a rolling restart.
>
> I suspect you are doing a rolling restart, right? So you need to wait for
> the node with that address to completely leave the cluster (I'm also doing
> that); basically you terminate your system when you receive the message
> *MemberRemoved* for the *self* address.
>
> I think I saw a discussion related to quarantined nodes when they re-join
> using the same address; not sure if it was here or an actual GitHub ticket.
>
> HTH,
>
> Guido.
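Guido's fixed-port-seed / dynamic-port-member scheme can be sketched in configuration. This is a minimal sketch, not from the thread itself; the `geyser` system name and addresses are taken from the logs above, and `port = 0` asks the OS for a free port, so a restarted node never reuses a quarantined host:port:

```hocon
akka {
  remote.netty.tcp {
    hostname = "172.16.119.46"  # this node's own address (illustrative)
    port     = 0                # 0 = pick any free port on startup
  }
  cluster.seed-nodes = [
    # only the seed node(s) keep a fixed, well-known port
    "akka.tcp://geyser@172.16.125.13:7000"
  ]
}
```

With this layout, a rolling restart of a non-seed node comes back on a fresh port, so it can never collide with its own quarantined previous incarnation.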
[akka-user] Re: Node quarantined
I see the same issue with 2.3.14.

On Tuesday, March 22, 2016 at 2:00:15 PM UTC-4, Guido Medina wrote:
> To eliminate noise, please update to 2.3.14, which has some cluster fixes
> since 2.3.11. There are also several fixes in Scala 2.11.8 (not related).
>
> I don't know; I just have the habit of keeping my libs up to date.
>
> HTH,
>
> Guido.
>
> On Tuesday, March 22, 2016 at 5:34:23 PM UTC, Benjamin Black wrote:
>> Hello,
>>
>> I'm trying to understand the cause of nodes being quarantined and
>> possible solutions for fixing it. I'm using akka 2.3.11. On the quarantined
>> node I see this logging:
>>
>> 12:45:44.204 ERROR [geyser-akka.remote.default-remote-dispatcher-6]
>> a.r.EndpointWriter - AssociationError [akka.tcp://geyser@172.16.120.174:7000]
>> <- [akka.tcp://geyser@172.17.100.105:7000]: Error [Invalid address:
>> akka.tcp://geyser@172.17.100.105:7000] [
>> akka.remote.InvalidAssociation: Invalid address: akka.tcp://geyser@172.17.100.105:7000
>> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
>> The remote system has quarantined this system. No further associations to
>> the remote system are possible until this system is restarted.
>> ]
>> 12:45:44.205 WARN [geyser-akka.remote.default-remote-dispatcher-25]
>> Remoting - Tried to associate with unreachable remote address
>> [akka.tcp://geyser@172.17.100.105:7000]. Address is now gated for 5000 ms, all
>> messages to this address will be delivered to dead letters. Reason: [The
>> remote system has quarantined this system. No further associations to the
>> remote system are possible until this system is restarted.]
>>
>> And on the node that caused the box to be quarantined I see this logging:
>>
>> 12:45:44.194 WARN [geyser-akka.remote.default-remote-dispatcher-6]
>> Remoting - Association to [akka.tcp://geyser@172.16.120.174:7000] having
>> UID [-450748474] is irrecoverably failed. UID is now quarantined and all
>> messages to this UID will be delivered to dead letters.
>> Remote actorsystem must be restarted to recover from this situation.
>> 12:45:44.202 WARN [geyser-akka.remote.default-remote-dispatcher-7]
>> a.r.EndpointWriter - AssociationError [akka.tcp://geyser@172.17.100.105:7000]
>> -> [akka.tcp://geyser@172.16.120.174:7000]: Error [Invalid address:
>> akka.tcp://geyser@172.16.120.174:7000] [
>> akka.remote.InvalidAssociation: Invalid address: akka.tcp://geyser@172.16.120.174:7000
>> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
>> The remote system has a UID that has been quarantined. Association aborted.
>> ]
>> 12:45:44.203 WARN [geyser-akka.remote.default-remote-dispatcher-7]
>> Remoting - Tried to associate with unreachable remote address
>> [akka.tcp://geyser@172.16.120.174:7000]. Address is now gated for 5000 ms, all
>> messages to this address will be delivered to dead letters. Reason: [The
>> remote system has a UID that has been quarantined. Association aborted.]
>> 12:45:44.221 ERROR [geyser-akka.remote.default-remote-dispatcher-7]
>> Remoting - Association to [akka.tcp://geyser@172.16.120.174:7000] with
>> UID [-450748474] irrecoverably failed. Quarantining address.
>> java.lang.IllegalStateException: Error encountered while processing
>> system message acknowledgement buffer: [-1 {}] ack: ACK[6, {}]
>>   at akka.remote.ReliableDeliverySupervisor$$anonfun$receive$1.applyOrElse(Endpoint.scala:288) ~[geyser.jar:1.1.17-SNAPSHOT]
>>   at akka.actor.Actor$class.aroundReceive(Actor.scala:467) ~[geyser.jar:1.1.17-SNAPSHOT]
>> Caused by: java.lang.IllegalArgumentException: Highest SEQ so far was -1
>> but cumulative ACK is 6
>>   at akka.remote.AckedSendBuffer.acknowledge(AckedDelivery.scala:103) ~[geyser.jar:1.1.17-SNAPSHOT]
>>   at akka.remote.ReliableDeliverySupervisor$$anonfun$receive$1.applyOrElse(Endpoint.scala:284) ~[geyser.jar:1.1.17-SNAPSHOT]
>>   ...
>> 11 common frames omitted
>> 12:45:44.221 WARN [geyser-akka.remote.default-remote-dispatcher-7]
>> Remoting - Association to [akka.tcp://geyser@172.16.120.174:7000] having
>> UID [-450748474] is irrecoverably failed. UID is now quarantined and all
>> messages to this UID will be delivered to dead letters. Remote actorsystem
>> must be restarted to recover from this situation.
>>
>> Quite a bit of data can be passed between the nodes (~200 Mb/sec), and maybe
>> the system is hitting a capacity issue, although I don't see any issue with >>
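The capacity concern above (~200 Mb/sec between nodes) can be probed by raising the remoting transport buffers and tolerating longer heartbeat pauses. This is a hedged sketch of standard Akka 2.3-era settings, not advice from the thread itself; the values are illustrative, not tested recommendations:

```hocon
akka.remote.netty.tcp {
  send-buffer-size    = 2097152b  # illustrative; defaults in 2.3 are much smaller
  receive-buffer-size = 2097152b
  maximum-frame-size  = 1048576b  # messages larger than this are dropped, not fragmented
}
# a node saturated by traffic may miss heartbeats; a longer acceptable pause
# reduces false "unreachable" verdicts at the cost of slower failure detection
akka.cluster.failure-detector.acceptable-heartbeat-pause = 6s
```

If quarantines correlate with traffic spikes, loosening the failure detector first is the cheaper experiment, since buffer tuning affects memory per connection.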
[akka-user] Re: Clarification on unreachable nodes in cluster
Hi Guido,

I think in your case you are shutting down before the node has communicated to the leader that it wants to leave. I wait for the MemberExited message before shutting down the node. Maybe I should wait for MemberRemoved? Either way, the ultimate aim is to not have the unreachable logic kick in and have to wait x seconds (I use 10 seconds) for the node to be auto-downed by the leader. And the reason I don't want to wait is that, according to the docs, the leader wouldn't be able to add nodes while any node in the cluster is considered unreachable, which is a problem if I'm doing a rolling restart of all the nodes.

Regards,
Ben

On Thursday, March 17, 2016 at 4:51:30 PM UTC-4, Guido Medina wrote:
> As for cluster.leave(cluster.selfAddress), my micro-services use the
> following to leave:
>
> Runtime.getRuntime().addShutdownHook(new Thread() {
>   @Override
>   public void run() {
>     final Cluster cluster = Cluster.get(system);
>     cluster.leave(cluster.selfAddress());
>     system.terminate();
>     Configurator.shutdown((LoggerContext) LogManager.getContext());
>   }
> });
>
> But honestly I have never seen that work; the other nodes just report it
> as unreachable until it times out and is completely removed. Maybe the
> shutdown happens so fast that it is useless in my case.
>
> HTH,
>
> Guido.
>
> On Thursday, March 17, 2016 at 8:33:29 PM UTC, Guido Medina wrote:
>> Hi Benjamin,
>>
>> I also rely on cluster events, and AFAIK you can expect (and trust)
>> *MemberUp* and *MemberRemoved*; these, IMHO, are the only two consistent
>> states you can trust. In other words, I register some actors only when
>> their nodes reach *MemberUp* and unregister only when their nodes reach
>> *MemberRemoved*. Any other state in between I would treat as information
>> only.
>>
>> So far I haven't had any issue with my mini-shard implementation relying
>> on only these two states; the drawback is that it may have to wait longer
>> to react.
>> HTH,
>>
>> Guido.
>>
>> On Thursday, March 17, 2016 at 6:07:48 PM UTC, Benjamin Black wrote:
>>> Hello,
>>>
>>> I'm adding logic to our service so that when a node is being restarted
>>> it gracefully leaves the cluster using cluster.leave(cluster.selfAddress).
>>> In the cluster specification doc it states:
>>>
>>> "If a node is unreachable then gossip convergence is not possible and
>>> therefore any leader actions are also not possible (for instance,
>>> allowing a node to become a part of the cluster). To be able to move
>>> forward the state of the unreachable nodes must be changed. It must
>>> become reachable again or marked as down."
>>>
>>> Is this totally true? If a node is unreachable and in the
>>> leaving/exiting/removed state, will this stop the leader from adding a new
>>> node? I ask because I have an actor that subscribes to cluster events, and I
>>> can see a node being added while another node is considered unreachable
>>> and in the exiting status:
>>>
>>> 14:02:46.843 INFO Exited member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Exiting)
>>> 14:02:51.842 INFO Unreachable member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Exiting)
>>> 14:02:53.843 INFO Removing member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Removed)
>>> 14:02:57.843 INFO Exited member Member(address = akka.tcp://geyser@172.16.119.46:7000, status = Exiting)
>>> 14:03:02.760 INFO Unreachable member Member(address = akka.tcp://geyser@172.16.119.46:7000, status = Exiting)
>>> 14:03:04.843 INFO Adding member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Up)
>>>
>>> Thanks,
>>> Ben
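The roughly 10-second auto-down delay discussed in this thread maps to a single cluster setting; a sketch of the Akka 2.3/2.4-era configuration (the value shown is illustrative, whatever the deployment actually uses applies):

```hocon
# after a node has been unreachable for this long, the leader automatically
# marks it Down, so gossip convergence and leader actions can resume
akka.cluster.auto-down-unreachable-after = 10s
```

Note that auto-downing trades safety for convenience: a network partition can cause both sides to down each other and form separate clusters, which is one reason graceful leave (waiting for MemberExited/MemberRemoved) is preferable during rolling restarts.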
[akka-user] Clarification on unreachable nodes in cluster
Hello,

I'm adding logic to our service so that when a node is being restarted it gracefully leaves the cluster using cluster.leave(cluster.selfAddress). In the cluster specification doc it states:

"If a node is unreachable then gossip convergence is not possible and therefore any leader actions are also not possible (for instance, allowing a node to become a part of the cluster). To be able to move forward the state of the unreachable nodes must be changed. It must become reachable again or marked as down."

Is this totally true? If a node is unreachable and in the leaving/exiting/removed state, will this stop the leader from adding a new node? I ask because I have an actor that subscribes to cluster events, and I can see a node being added while another node is considered unreachable and in the exiting status:

14:02:46.843 INFO Exited member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Exiting)
14:02:51.842 INFO Unreachable member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Exiting)
14:02:53.843 INFO Removing member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Removed)
14:02:57.843 INFO Exited member Member(address = akka.tcp://geyser@172.16.119.46:7000, status = Exiting)
14:03:02.760 INFO Unreachable member Member(address = akka.tcp://geyser@172.16.119.46:7000, status = Exiting)
14:03:04.843 INFO Adding member Member(address = akka.tcp://geyser@172.16.120.160:7000, status = Up)

Thanks,
Ben
[akka-user] Lost actor communication
I am running a 24-node cluster that is roughly split into two roles: frontend and backend. There is a streamer actor on a frontend node talking to a tracker actor on a backend node. There can be many streamer actors on several frontend nodes talking to one tracker actor. It would seem that at some point the streamer actor on a frontend node stops being able to communicate with the tracker actor. Communication from the backend node back to the frontend node appears to have been lost, but the backend node can still receive messages from the frontend. I say this because the streamer was able to send a poison pill to the tracker, which successfully killed the actor, but the streamer wasn't informed about the termination. I see no indication that a node has fallen from the cluster or is having problems communicating (I have logging set to INFO). Is there anything I can do to get a better idea of what is happening?

Thanks,
Ben

--
Read the docs: http://akka.io/docs/
Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
Search the archives: https://groups.google.com/group/akka-user
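To get more visibility into messages lost between nodes, Akka's standard remoting log switches can help. This is a sketch, not something suggested in the thread itself; these settings exist in the 2.3 line, and the verbose ones should only be enabled while debugging:

```hocon
akka {
  log-dead-letters = 10                # log the first 10 dead letters instead of suppressing them
  log-dead-letters-during-shutdown = off
  remote {
    log-remote-lifecycle-events = on   # association/disassociation events at INFO level
    log-sent-messages = on             # very verbose; debugging only
    log-received-messages = on         # very verbose; debugging only
  }
}
```

With these on, a Terminated message that is silently dropped (as in the streamer/tracker case above) should surface either as a dead letter on the sending side or as a lifecycle event for the broken association.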
Re: [akka-user] Akka cluster: node identity crisis
I upgraded to Akka 2.3.4 (Scala 2.10), but I'm still seeing the same issue. When I log Cluster(system).selfUniqueAddress I get something like UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,630715883), with the last number changing every time I restart. The gossip message error always has the same number, -1482656725. For example:

18:12:27.472 INFO [streaming-akka.actor.default-dispatcher-15] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.110.143:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]

On Tuesday, July 1, 2014 2:33:52 PM UTC-4, Patrik Nordwall wrote:

Please use the latest version, i.e. 2.3.4. There you find Cluster(system).selfUniqueAddress, which includes the uid. I will look into this in more detail tomorrow.

/Patrik

On 1 July 2014 at 19:57, Benjamin Black <benbl...@gmail.com> wrote:

I have a cluster of 15 nodes/boxes. I start the nodes roughly at the same time. One of the nodes is behaving oddly and continually logging "Ignoring received gossip intended for someone else". However, the node does seem to work for a while before being dropped from the cluster. Basically this one node seems to think it is someone else, while also behaving as itself. The code and config are exactly the same on all 15 nodes, so I don't understand why I'm getting this issue on only one node. Maybe this is a hardware issue?

Some logging:

11:27:45.412 INFO [main] Remoting - Starting remoting
11:27:45.638 INFO [main] Remoting - Remoting started; listening on addresses :[akka.tcp://streaming@172.17.102.128:7000]
11:27:45.660 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Starting up...
11:27:45.714 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Registered cluster JMX MBean [akka:type=Cluster]
11:27:45.715 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Started up successfully
11:27:45.830 INFO [streaming-akka.actor.default-dispatcher-3] a.a.LocalActorRef - Message [akka.cluster.InternalClusterAction$InitJoinAck] from Actor[akka.tcp://streaming@172.17.100.98:7000/system/cluster/core/daemon#1997515880] to Actor[akka://streaming/system/cluster/core/daemon/joinSeedNodeProcess-1#1132911] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
11:27:45.872 INFO [streaming-akka.actor.default-dispatcher-5] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Welcome from [akka.tcp://streaming@172.17.102.125:7000]
11:27:45.911 INFO [streaming-akka.actor.default-dispatcher-2] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.68:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
11:27:45.943 INFO [streaming-akka.actor.default-dispatcher-16] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.70:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
11:27:46.122 INFO [streaming-akka.actor.default-dispatcher-16] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.69:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]

Config:

akka {
  cluster {
    seed-nodes = [
      "akka.tcp://streaming@172.17.102.125:7000",
      "akka.tcp://streaming@172.17.100.98:7000"
    ]
  }
  remote.netty.tcp.hostname = "172.17.102.128"
}

I thought it was weird that the unique address in the gossip messages referred to a negative number. I added log.info(s"my unique ID: ${AddressUidExtension(actorSystem).addressUid}") to the confused node (I hope this is the correct code) and it gave me the answer 1549799231, while continuing to give -1482656725 in the gossip messages. I'm guessing the problem is that the gossip messages have a corrupted address, which is why the confused node believes these messages are not for itself. I'm using Akka 2.3.2.
Re: [akka-user] Akka cluster: node identity crisis
Problem resolved. I was running two clusters (dev and prod) and somehow (the mystery remains) the dev cluster was interacting with this one box in the prod cluster. My advice to anyone else who sees this issue is to check the IP addresses of the "from" nodes.

On Wednesday, July 2, 2014 4:31:32 AM UTC-4, Patrik Nordwall wrote:

On Wed, Jul 2, 2014 at 12:21 AM, Benjamin Black <benbl...@gmail.com> wrote:

I upgraded to Akka 2.3.4 (Scala 2.10), but I'm still seeing the same issue. When I log Cluster(system).selfUniqueAddress I get something like UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,630715883), with the last number changing every time I restart.

Then it would be great if you could describe how to reproduce the problem, preferably using a minimal sample such as the SimpleClusterListener in the cluster sample: https://typesafe.com/activator/template/akka-sample-cluster-scala

The gossip message error always has the same number, -1482656725. For example:

18:12:27.472 INFO [streaming-akka.actor.default-dispatcher-15] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.110.143:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]

It is not an error to see a few of these while joining (especially when joining several nodes at the same time), but if this logging continues (never ends) something is wrong. A negative uid should not be an issue; it is just a random integer that is generated when the ActorSystem is started. The reason for using the uid is to be able to differentiate a new actor system on the same host:port from an old one when it is restarted.

Regards,
Patrik

On Tuesday, July 1, 2014 2:33:52 PM UTC-4, Patrik Nordwall wrote:

Please use the latest version, i.e. 2.3.4. There you find Cluster(system).selfUniqueAddress, which includes the uid. I will look into this in more detail tomorrow.
/Patrik

1 jul 2014 kl. 19:57 skrev Benjamin Black <benbl...@gmail.com>:

I have a cluster of 15 nodes/boxes. I start the nodes roughly at the same time. One of the nodes is behaving oddly and continually logging "Ignoring received gossip intended for someone else". However, the node does seem to work for a while before being dropped from the cluster. Basically this one node seems to think it is someone else, whilst also behaving as itself. The code and config are exactly the same on all 15 nodes, so I don't understand why I'm getting this issue on only one node. Maybe this is a hardware issue? Some logging:

11:27:45.412 INFO [main] Remoting - Starting remoting
11:27:45.638 INFO [main] Remoting - Remoting started; listening on addresses :[akka.tcp://streaming@172.17.102.128:7000]
11:27:45.660 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Starting up...
11:27:45.714 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Registered cluster JMX MBean [akka:type=Cluster]
11:27:45.715 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Started up successfully
11:27:45.830 INFO [streaming-akka.actor.default-dispatcher-3] a.a.LocalActorRef - Message [akka.cluster.InternalClusterAction$InitJoinAck] from Actor[akka.tcp://streaming@172.17.100.98:7000/system/cluster/core/daemon#1997515880] to Actor[akka://streaming/system/cluster/core/daemon/joinSeedNodeProcess-1#1132911] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
11:27:45.872 INFO [streaming-akka.actor.default-dispatcher-5] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Welcome from [akka.tcp://streaming@172.17.102.125:7000]
11:27:45.911 INFO [streaming-akka.actor.default-dispatcher-2] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.68:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
11:27:45.943 INFO [streaming-akka.actor.default-dispatcher-16] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.70:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
11:27:46.122 INFO [streaming-akka.actor.default-dispatcher-16] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.69:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
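Patrik's point about the uid can be illustrated with a short self-contained sketch (hypothetical class and names; this is not Akka's actual implementation): a node identity is the host:port plus a random Int chosen at system start, so negative values like -1482656725 are expected for roughly half of all starts, and a restarted system on the same host:port counts as a different identity.

```java
import java.security.SecureRandom;
import java.util.Objects;

// Hypothetical sketch, NOT Akka internals: a cluster member is identified
// by host:port plus a random int uid generated when the system starts.
public final class UniqueAddressSketch {
    final String hostPort;
    final int uid;

    UniqueAddressSketch(String hostPort, int uid) {
        this.hostPort = hostPort;
        this.uid = uid;
    }

    // A fresh start picks any int, so about half of all uids are negative.
    static UniqueAddressSketch start(String hostPort) {
        return new UniqueAddressSketch(hostPort, new SecureRandom().nextInt());
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof UniqueAddressSketch)) return false;
        UniqueAddressSketch other = (UniqueAddressSketch) o;
        return hostPort.equals(other.hostPort) && uid == other.uid;
    }

    @Override public int hashCode() { return Objects.hash(hostPort, uid); }

    public static void main(String[] args) {
        String hostPort = "akka.tcp://streaming@172.17.102.128:7000";
        UniqueAddressSketch old = new UniqueAddressSketch(hostPort, -1482656725);
        UniqueAddressSketch restarted = new UniqueAddressSketch(hostPort, 630715883);
        // Same host:port but a different uid: a different incarnation, so
        // gossip addressed to the old uid would be ignored by the new system.
        System.out.println(old.equals(restarted)); // prints false
    }
}
```

This is only meant to show why a negative uid is harmless: the uid is compared for equality, never interpreted as a magnitude.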
[akka-user] Akka cluster: node identity crisis
I have a cluster of 15 nodes/boxes. I start the nodes roughly at the same time. One of the nodes is behaving oddly and continually logging "Ignoring received gossip intended for someone else". However, the node does seem to work for a while before being dropped from the cluster. Basically this one node seems to think it is someone else, whilst also behaving as itself. The code and config are exactly the same on all 15 nodes, so I don't understand why I'm getting this issue on only one node. Maybe this is a hardware issue? Some logging:

11:27:45.412 INFO [main] Remoting - Starting remoting
11:27:45.638 INFO [main] Remoting - Remoting started; listening on addresses :[akka.tcp://streaming@172.17.102.128:7000]
11:27:45.660 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Starting up...
11:27:45.714 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Registered cluster JMX MBean [akka:type=Cluster]
11:27:45.715 INFO [main] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Started up successfully
11:27:45.830 INFO [streaming-akka.actor.default-dispatcher-3] a.a.LocalActorRef - Message [akka.cluster.InternalClusterAction$InitJoinAck] from Actor[akka.tcp://streaming@172.17.100.98:7000/system/cluster/core/daemon#1997515880] to Actor[akka://streaming/system/cluster/core/daemon/joinSeedNodeProcess-1#1132911] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
11:27:45.872 INFO [streaming-akka.actor.default-dispatcher-5] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Welcome from [akka.tcp://streaming@172.17.102.125:7000]
11:27:45.911 INFO [streaming-akka.actor.default-dispatcher-2] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.68:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
11:27:45.943 INFO [streaming-akka.actor.default-dispatcher-16] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.70:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]
11:27:46.122 INFO [streaming-akka.actor.default-dispatcher-16] Cluster(akka://streaming) - Cluster Node [akka.tcp://streaming@172.17.102.128:7000] - Ignoring received gossip intended for someone else, from [akka.tcp://streaming@172.17.102.69:7000] to [UniqueAddress(akka.tcp://streaming@172.17.102.128:7000,-1482656725)]

Config:

akka {
  cluster {
    seed-nodes = [
      "akka.tcp://streaming@172.17.102.125:7000",
      "akka.tcp://streaming@172.17.100.98:7000"
    ]
  }
  remote.netty.tcp.hostname = "172.17.102.128"
}

I thought it was weird that the unique address in the gossip messages referred to a negative number. I added log.info(s"my unique ID: ${AddressUidExtension(actorSystem).addressUid}") to the confused node (I hope this is the correct code) and it gave me the answer 1549799231, whilst continuing to give -1482656725 in the gossip messages. I'm guessing the problem is that the gossip messages have a corrupted address, which is why the confused node believes these messages are not for itself. I'm using Akka 2.3.2.
-- Read the docs: http://akka.io/docs/ Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html Search the archives: https://groups.google.com/group/akka-user --- You received this message because you are subscribed to the Google Groups Akka User List group. To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+unsubscr...@googlegroups.com. To post to this group, send email to akka-user@googlegroups.com. Visit this group at http://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/d/optout.
[akka-user] Akka actors not getting balanced thread time
I'm creating an application that HTTP-streams data read from Kafka. A client can create multiple connections, with the data evenly balanced between the connections. I'm using Spray 1.3.1 to handle the HTTP streaming and Akka 2.3.0. Each client connection creates a streamer actor that gets data from a reader actor that is unique to the client. For example, if a client connects four times, there will be four streamer actors, all requesting data from one reader actor. What I'm witnessing is the following behavior (all connections to the same process, using the default dispatcher):

T0: 1st client connection; 1st streamer and reader created; streamer requests 1400 msgs per second from the reader
T1: 2nd client connection; 2nd streamer created; 1st streamer requesting 600 msgs per second, 2nd streamer requesting 1200 msgs per second
T2: 1st client connection killed; 1st streamer killed; 2nd streamer requesting 1700 msgs per second
T3: 3rd client connection; 3rd streamer created; 2nd and 3rd streamers each requesting 1000 per second (this is the behavior I want!)

Basically it would seem that the 1st streamer is not getting the same thread time as later streamers. Is this a crazy thought? Is there anything I can check to make sure the Akka system is set up correctly? If people think this could be an Akka bug then I can try to put together a small code example that demonstrates this behavior. Thanks.
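Not a diagnosis, but one dispatcher knob that directly affects how evenly actors share threads is throughput: the number of messages an actor may process before it has to yield its thread back to the dispatcher (a small batch by default). A hedged sketch of a dedicated dispatcher for the streamer actors; the name streamer-dispatcher and the pool sizes are made up for illustration:

```hocon
# Hypothetical dispatcher config: with throughput = 1 each streamer yields
# its thread after every message, so no single streamer can hog a thread.
streamer-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-max = 8
  }
  throughput = 1
}
```

An actor can then be attached to it when it is created, e.g. with Props' withDispatcher("streamer-dispatcher"). Whether this changes the behavior described above would need to be measured.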
Re: [akka-user] Akka actors not getting balanced thread time
I'm working on reducing my code into something I can share. My system is actually a bit more complicated than I've explained: I'm using remote actors, with other actors that coordinate everything and other actors that track offsets. I suppose I'm trying to understand how Akka allocates thread time to actors. The first streamer actor takes time to get up to full streaming speed, so maybe this affects the Akka scheduling? What kind of JVM monitoring are you referring to? I've used YourKit, but it didn't notice anything unusual.

On Wednesday, April 16, 2014 3:08:30 PM UTC-4, √ wrote:

Hi Benjamin, your question is hypothetical, and without the code and config etc. it's impossible to give a qualified answer. What does your JVM monitoring tell you?

On Wed, Apr 16, 2014 at 7:47 PM, Benjamin Black <benbl...@gmail.com> wrote:

I'm creating an application that HTTP-streams data read from Kafka. A client can create multiple connections, with the data evenly balanced between the connections. I'm using Spray 1.3.1 to handle the HTTP streaming and Akka 2.3.0. Each client connection creates a streamer actor that gets data from a reader actor that is unique to the client. For example, if a client connects four times, there will be four streamer actors, all requesting data from one reader actor. What I'm witnessing is the following behavior (all connections to the same process, using the default dispatcher):

T0: 1st client connection; 1st streamer and reader created; streamer requests 1400 msgs per second from the reader
T1: 2nd client connection; 2nd streamer created; 1st streamer requesting 600 msgs per second, 2nd streamer requesting 1200 msgs per second
T2: 1st client connection killed; 1st streamer killed; 2nd streamer requesting 1700 msgs per second
T3: 3rd client connection; 3rd streamer created; 2nd and 3rd streamers each requesting 1000 per second (this is the behavior I want!)
Basically it would seem that the 1st streamer is not getting the same thread time as later streamers. Is this a crazy thought? Is there anything I can check to make sure the Akka system is set up correctly? If people think this could be an Akka bug then I can try to put together a small code example that demonstrates this behavior. Thanks.

--
Cheers,
√
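To make the "thread time" intuition concrete, here is a small self-contained simulation (plain Java, no Akka; all names are made up) of one dispatcher thread serving two mailboxes round-robin, processing at most a fixed throughput batch per turn. The larger the batch, the longer one actor can sit idle while the other runs:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical simulation, not Akka code: one worker thread drains two
// mailboxes, taking up to `throughput` messages from a mailbox per turn.
public class ThroughputSim {
    // Returns the longest uninterrupted run of messages from one mailbox,
    // i.e. the longest stretch the other actor had to wait through.
    static int maxWait(int throughput) {
        Queue<Integer> a = new ArrayDeque<>(), b = new ArrayDeque<>();
        for (int i = 0; i < 50; i++) { a.add(i); b.add(i); }

        Queue<Queue<Integer>> runQueue = new ArrayDeque<>();
        runQueue.add(a);
        runQueue.add(b);

        int maxRun = 0;
        while (!runQueue.isEmpty()) {
            Queue<Integer> mailbox = runQueue.poll();
            int processed = 0;
            while (processed < throughput && !mailbox.isEmpty()) {
                mailbox.poll(); // "process" one message
                processed++;
            }
            maxRun = Math.max(maxRun, processed);
            if (!mailbox.isEmpty()) runQueue.add(mailbox); // reschedule
        }
        return maxRun;
    }

    public static void main(String[] args) {
        System.out.println(maxWait(1));  // prints 1: actors alternate message by message
        System.out.println(maxWait(25)); // prints 25: one actor runs 25 messages before yielding
    }
}
```

The simulation only models batching on a single thread; real dispatcher behavior also depends on pool size, work stealing, and how fast each mailbox refills, which is why profiling the actual system (as suggested above) is still the right next step.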