On 09/15/2011 06:57 AM, Alan Wright wrote:
> Hi,
>
> I am migrating from Resin 2 to 4.0.22.
>
> So far I am starting a web tier with one load-balance server and an
> app tier with two app servers.
>
> It starts successfully, and the docs and admin applications function.
>
> After startup I am seeing errors in the logs, but all of these errors
> stopped after about 2 hours, so it appears they are related.
>
> There are references to "heartbeats" in the errors, but when I checked
> server health in the resin-admin app, they all showed OK for heartbeats.
I've added this as a bug report: http://bugs.caucho.com/view.php?id=4764.

We should rename "GlobalCacheHeartbeat" to something else to avoid the
confusion. It's not the same as the main heartbeat. Basically, it's the
app tier telling the web tier about its current dynamic server
configuration. It's a "heartbeat" only because it's a push, rather than
a poll, operation. (A rough sketch of this push-vs-poll distinction,
using hypothetical names, follows the quoted logs at the end of this
message.)

-- Scott

> Any suggestions as to why these errors appear, and what steps should
> be taken, would be great.
>
> The errors are as follows.
>
> In jvm-web-a.log, every 2 minutes:
>
> [11-09-15 12:33:22.884] {resin-65} com.caucho.bam.TimeoutException:
>   TriadFuture[[email protected]] query timeout: 60000
>   to: [email protected]
>   query: GlobalCacheHeartbeat[2ff0dbf6,59ff8cba,1316086342878]
>     at com.caucho.cloud.bam.TriadFuture.get(TriadFuture.java:80)
>     at com.caucho.cloud.bam.TriadFirstQuery.get(TriadFirstQuery.java:191)
>     at com.caucho.cloud.bam.BamTriadSender.queryTriadFirstRemote(BamTriadSender.java:675)
>     at com.caucho.cloud.globalcache.GlobalCacheActor.sendHeartbeat(GlobalCacheActor.java:224)
>     at com.caucho.cloud.globalcache.GlobalCacheManager.sendHeartbeatImpl(GlobalCacheManager.java:144)
>     at com.caucho.cloud.globalcache.GlobalCacheManager$HeartbeatTask.run(GlobalCacheManager.java:200)
>     at com.caucho.env.thread.ResinThread.runTasks(ResinThread.java:164)
>     at com.caucho.env.thread.ResinThread.run(ResinThread.java:130)
>
> In jvm-app-a.log, repeating but with no exact pattern:
>
> [11-09-15 12:33:21.774] {MailboxWorker[[email protected]]-47}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:34:20.794] {resin-44} ClusterCacheManagerImpl[app-a] cannot
>   load data for HashKey[59ff8cba] from triad
> [11-09-15 12:34:20.794] {resin-44} Missing or corrupted data in get for
>   MnodeValue[value=HashKey[59ff8cba],flags=0x6,version=1316086342878,lease=-1]
>   ProCacheEntry[key=null,keyHash=2ff0dbf6,owner=C_A]
> [11-09-15 12:34:21.774] {MailboxWorker[[email protected]]-58}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:34:21.863] {MailboxWorker[[email protected]]-56}
>   GlobalCacheActor[[email protected]]:
>   CloudPod[0,main,cluster=web-tier] data is missing for HashKey[59ff8cba]
> [11-09-15 12:35:21.774] {MailboxWorker[[email protected]]-72}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:36:21.774] {MailboxWorker[[email protected]]-82}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:36:21.864] {MailboxWorker[[email protected]]-80}
>   GlobalCacheActor[[email protected]]:
>   CloudPod[0,main,cluster=web-tier] data is missing for HashKey[59ff8cba]
> [11-09-15 12:37:21.774] {MailboxWorker[[email protected]]-91}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:38:21.774] {MailboxWorker[[email protected]]-105}
>   ClusterServer[id=web-a] notify-heartbeat-stop
>
> In jvm-app-b.log, repeating every minute:
>
> [11-09-15 12:32:21.866] {MailboxWorker[[email protected]]-12}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:33:21.774] {MailboxWorker[[email protected]]-43}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:34:21.774] {MailboxWorker[[email protected]]-55}
>   ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:35:21.774] {MailboxWorker[[email protected]]-67}
>   ClusterServer[id=web-a] notify-heartbeat-stop
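To illustrate the push-vs-poll distinction Scott describes: a polling
monitor asks each server for its status on the monitor's schedule, while
a pushing server reports its state unprompted on its own schedule. The
sketch below is a minimal illustration of that idea in plain Java. All
names in it (HeartbeatSketch, PollingMonitor, PushingServer, ping,
currentConfig) are hypothetical and are not Resin's actual classes or
API.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Minimal sketch of poll vs. push status reporting.
    // Hypothetical names throughout; not Resin code.
    public class HeartbeatSketch {

        // Poll: the monitoring side asks each server whether it is
        // alive, on the monitor's own schedule (the "main heartbeat"
        // style that resin-admin reports on).
        static class PollingMonitor {
            private final Map<String, Boolean> health = new ConcurrentHashMap<>();

            void poll(String serverId) {
                health.put(serverId, ping(serverId));
            }

            private boolean ping(String serverId) {
                return true; // stands in for a real network liveness check
            }
        }

        // Push: the server sends its current state unprompted, on its
        // own schedule. GlobalCacheHeartbeat is a "heartbeat" only in
        // this sense: the app tier pushes its dynamic server
        // configuration to the web tier.
        static class PushingServer {
            private final String serverId;
            private final Map<String, String> webTierView; // stands in for the web tier

            PushingServer(String serverId, Map<String, String> webTierView) {
                this.serverId = serverId;
                this.webTierView = webTierView;
            }

            void start(ScheduledExecutorService scheduler) {
                // Push every 2 minutes, roughly the interval visible
                // in jvm-web-a.log above.
                scheduler.scheduleAtFixedRate(
                    () -> webTierView.put(serverId, currentConfig()),
                    0, 2, TimeUnit.MINUTES);
            }

            private String currentConfig() {
                return "dynamic-config@" + System.currentTimeMillis();
            }
        }

        public static void main(String[] args) throws InterruptedException {
            ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
            Map<String, String> webTierView = new ConcurrentHashMap<>();

            new PushingServer("app-a", webTierView).start(scheduler);
            Thread.sleep(100); // let the initial push land
            System.out.println("web tier sees: " + webTierView);

            scheduler.shutdownNow();
        }
    }

In this framing, the TimeoutException in jvm-web-a.log appears to be the
app tier's push failing to get an answer from web-a within the 60000 ms
query timeout, rather than a failure of the main liveness heartbeat,
which is consistent with resin-admin showing all heartbeats as OK.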
