Hi Tom, Srini,

We have also noticed this with Boron, very sporadically, even without any explicit action taken on the shard like Srini did.
Srini,

Are you referring to “journal-plugin-fallback” from http://doc.akka.io/docs/akka/current/scala/general/configuration.html#config-akka-persistence ?

Regards,
Muthu

From: controller-dev-boun...@lists.opendaylight.org [mailto:controller-dev-boun...@lists.opendaylight.org] On Behalf Of Srini Seetharaman
Sent: Friday, August 11, 2017 9:40 AM
To: Tom Pantelis
Cc: controller-dev@lists.opendaylight.org
Subject: Re: [controller-dev] Circuit Breaker timed out

Thanks Tom. I will investigate further why the local disk operation failed. It seems strange, though, because I haven't seen anything in dmesg. The default value for call-timeout is 10s in akka.conf.

On Thu, Aug 10, 2017 at 3:20 PM, Tom Pantelis <tompante...@gmail.com> wrote:

That error is from akka persistence. It happens if the backend persistence plugin doesn't respond in time. I've only seen this in a CSIT environment whose disk activity was overloaded. The timeouts can be tweaked. I don't recall exactly what they are, but you can find them in the akka docs (the names contain circuit-breaker).

On Thu, Aug 10, 2017 at 6:01 PM, Srini Seetharaman <srini.seethara...@gmail.com> wrote:

Hi Tom,

In our ODL deployment, which is running in standalone mode with operational store persistence enabled, we saw the following error being printed. Once the member-1-default-operational shard is shut down, all write transactions after that fail and the system becomes unstable. At this point, we were probably doing fewer than 10 transactions per second. Any idea what is causing this? Has anyone seen this before?

2017-08-07 19:15:59,622 | ERROR | lt-dispatcher-23 | Shard | 176 - com.typesafe.akka.slf4j - 2.4.7 | Failed to persist event type [org.opendaylight.controller.cluster.raft.ReplicatedLogImplEntry] with sequence number [9897493] for persistenceId [member-1-shard-default-operational].
akka.pattern.CircuitBreaker$$anon$1: Circuit Breaker Timed out.
2017-08-07 19:15:59,628 | INFO | lt-dispatcher-24 | Shard | 188 - org.opendaylight.controller.sal-akka-raft - 1.4.2.Boron-SR2 | Stopping Shard member-1-shard-default-operational
2017-08-07 19:15:59,629 | ERROR | lt-dispatcher-23 | LocalThreePhaseCommitCohort | 193 - org.opendaylight.controller.sal-distributed-datastore - 1.4.2.Boron-SR2 | Failed to prepare transaction member-1-datastore-operational-fe-5-txn-791019 on backend
java.lang.RuntimeException: Transaction aborted due to shutdown.
    at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.abortPendingTransactions(ShardCommitCoordinator.java:399)[193:org.opendaylight.controller.sal-distributed-datastore:1.4.2.Boron-SR2]
    at org.opendaylight.controller.cluster.datastore.Shard.postStop(Shard.java:211)[193:org.opendaylight.controller.sal-distributed-datastore:1.4.2.Boron-SR2]
    at akka.actor.Actor$class.aroundPostStop(Actor.scala:494)[175:com.typesafe.akka.actor:2.4.7]
    at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundPostStop(PersistentActor.scala:168)[181:com.typesafe.akka.persistence:2.4.7]
    at akka.persistence.Eventsourced$class.aroundPostStop(Eventsourced.scala:223)[181:com.typesafe.akka.persistence:2.4.7]
    at akka.persistence.UntypedPersistentActor.aroundPostStop(PersistentActor.scala:168)[181:com.typesafe.akka.persistence:2.4.7]
    at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:293)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:374)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:61)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:374)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:460)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:260)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.run(Mailbox.scala:224)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[175:com.typesafe.akka.actor:2.4.7]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
2017-08-07 19:15:59,629 | WARN | ult-dispatcher-3 | ConcurrentDOMDataBroker | 193 - org.opendaylight.controller.sal-distributed-datastore - 1.4.2.Boron-SR2 | Tx: DOM-956840 Error during phase CAN_COMMIT, starting Abort
java.lang.RuntimeException: Transaction aborted due to shutdown.
    at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.abortPendingTransactions(ShardCommitCoordinator.java:399)[193:org.opendaylight.controller.sal-distributed-datastore:1.4.2.Boron-SR2]
    at org.opendaylight.controller.cluster.datastore.Shard.postStop(Shard.java:211)[193:org.opendaylight.controller.sal-distributed-datastore:1.4.2.Boron-SR2]
    at akka.actor.Actor$class.aroundPostStop(Actor.scala:494)[175:com.typesafe.akka.actor:2.4.7]
    at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundPostStop(PersistentActor.scala:168)[181:com.typesafe.akka.persistence:2.4.7]
    at akka.persistence.Eventsourced$class.aroundPostStop(Eventsourced.scala:223)[181:com.typesafe.akka.persistence:2.4.7]
    at akka.persistence.UntypedPersistentActor.aroundPostStop(PersistentActor.scala:168)[181:com.typesafe.akka.persistence:2.4.7]
    at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:293)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:374)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:61)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:374)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:460)[175:com.typesafe.akka.actor:2.4.7]
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:260)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.run(Mailbox.scala:224)[175:com.typesafe.akka.actor:2.4.7]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[175:com.typesafe.akka.actor:2.4.7]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[171:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
2017-08-07 19:15:59,630 | INFO | lt-dispatcher-17 | LocalActorRef | 176 - com.typesafe.akka.slf4j - 2.4.7 | Message [org.opendaylight.controller.cluster.raft.client.messages.GetOnDemandRaftState] from Actor[akka://opendaylight-cluster-data/temp/$b] to Actor[akka://opendaylight-cluster-data/user/shardmanager-operational/member-1-shard-default-operational#-376322108] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
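
For reference, the circuit-breaker timeouts discussed in this thread could be overridden roughly as sketched below. This is only a sketch under stated assumptions: the key names are those of the Akka persistence journal-plugin-fallback section linked earlier in the thread, the 10s call-timeout default is the value reported in the thread, and the numbers shown are illustrative rather than recommendations. It is also an assumption that the overrides belong in akka.conf under the odl-cluster-data block; check this against your own deployment.

```
odl-cluster-data {
  akka {
    persistence {
      journal-plugin-fallback {
        circuit-breaker {
          # Number of failed or timed-out journal calls tolerated
          # before the breaker opens (illustrative value).
          max-failures = 10
          # How long a single journal call may take before it counts
          # as a failure. The thread reports a default of 10s; 30s is
          # only an example.
          call-timeout = 30s
          # How long the breaker stays open before the next attempt
          # (illustrative value).
          reset-timeout = 30s
        }
      }
    }
  }
}
```

Note that a larger call-timeout only gives slow journal writes more time to complete. If the disk is overloaded, as in the CSIT case described above, the underlying I/O problem would still need to be addressed.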