Re: [Resin-interest] Resin 4.0.22 Errors in logs after start

2011-09-15 Thread Scott Ferguson
On 09/15/2011 06:57 AM, Alan Wright wrote:
> Hi
>
> I am migrating from Resin 2 to 4.0.22
>
> So far I am starting a web-tier with one load-balance server and an app
> tier with two app servers.
>
> It starts successfully and the docs and admin applications function.
>
> After startup I am seeing errors in the logs, but then all these errors
> stopped after about 2 hours - so it appears they are related.
>
> There are references to "Heartbeats" in the errors but when I checked
> server health in the resin-admin app they all showed OK for heartbeats.

I've added this as a bug report http://bugs.caucho.com/view.php?id=4764.

We should rename the "GlobalCacheHeartbeat" to something else to avoid 
the confusion. It's not the same as the main heartbeat. Basically, it's 
the app-tier telling the web-tier about its current dynamic server 
configuration. It's a "heartbeat" only because it's a push, rather than 
a poll operation.
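
To illustrate the push-versus-poll distinction, here is a minimal sketch. It is not Resin's implementation: the class and method names (PushHeartbeatSketch, WebTierView, AppTierSender, ConfigHeartbeat) are hypothetical, and the "configuration" is reduced to a single version number. The point is only that the receiver never asks for state; the sender delivers it whenever it decides to.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PushHeartbeatSketch {
    /** Hypothetical message: one sender's current configuration version. */
    static final class ConfigHeartbeat {
        final String serverId;
        final long version;

        ConfigHeartbeat(String serverId, long version) {
            this.serverId = serverId;
            this.version = version;
        }
    }

    /** Hypothetical receiver (web tier). It only listens; it never polls. */
    static final class WebTierView {
        private final Map<String, Long> versions = new ConcurrentHashMap<>();

        void onHeartbeat(ConfigHeartbeat hb) {
            // Keep the newest version seen for each sender.
            versions.merge(hb.serverId, hb.version, Long::max);
        }

        /** Returns -1 if no heartbeat has arrived from this server yet. */
        long versionOf(String serverId) {
            return versions.getOrDefault(serverId, -1L);
        }
    }

    /** Hypothetical sender (app tier). It pushes its own state to the receiver. */
    static final class AppTierSender {
        private final String serverId;
        private final WebTierView target;
        private long version;

        AppTierSender(String serverId, WebTierView target) {
            this.serverId = serverId;
            this.target = target;
        }

        void configChanged() {
            version++;
        }

        // In a real system a timer would call this periodically.
        void sendHeartbeat() {
            target.onHeartbeat(new ConfigHeartbeat(serverId, version));
        }
    }

    public static void main(String[] args) {
        WebTierView view = new WebTierView();
        AppTierSender appA = new AppTierSender("app-a", view);

        appA.sendHeartbeat();
        appA.configChanged();
        appA.sendHeartbeat();

        System.out.println("app-a version seen by web tier: " + view.versionOf("app-a"));
    }
}
```

In this sketch a missed push simply leaves the receiver with stale data until the next one arrives, which is a different failure from a server being down.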

-- Scott

> Any suggestions as to why these errors appear and what steps should be
> taken would be great.
>
> The errors are as follows:
>
> in jvm-web-a.log - every 2 minutes
>
> [11-09-15 12:33:22.884] {resin-65} com.caucho.bam.TimeoutException:
> TriadFuture[global-ca...@aaa.web-tier.admin.resin] query timeout: 6
>   to: global-ca...@aaa.app-tier.admin.resin
>   query: GlobalCacheHeartbeat[2ff0dbf6,59ff8cba,1316086342878]
>   at com.caucho.cloud.bam.TriadFuture.get(TriadFuture.java:80)
>   at com.caucho.cloud.bam.TriadFirstQuery.get(TriadFirstQuery.java:191)
>   at com.caucho.cloud.bam.BamTriadSender.queryTriadFirstRemote(BamTriadSender.java:675)
>   at com.caucho.cloud.globalcache.GlobalCacheActor.sendHeartbeat(GlobalCacheActor.java:224)
>   at com.caucho.cloud.globalcache.GlobalCacheManager.sendHeartbeatImpl(GlobalCacheManager.java:144)
>   at com.caucho.cloud.globalcache.GlobalCacheManager$HeartbeatTask.run(GlobalCacheManager.java:200)
>   at com.caucho.env.thread.ResinThread.runTasks(ResinThread.java:164)
>   at com.caucho.env.thread.ResinThread.run(ResinThread.java:130)
>
>
> in jvm-app-a.log - repeating but no exact pattern
>
> [11-09-15 12:33:21.774] {MailboxWorker[cluster-heartb...@aaa.app-tier.admin.resin]-47} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:34:20.794] {resin-44} ClusterCacheManagerImpl[app-a] cannot load data for HashKey[59ff8cba] from triad
> [11-09-15 12:34:20.794] {resin-44} Missing or corrupted data in get for MnodeValue[value=HashKey[59ff8cba],flags=0x6,version=1316086342878,lease=-1] ProCacheEntry[key=null,keyHash=2ff0dbf6,owner=C_A]
> [11-09-15 12:34:21.774] {MailboxWorker[cluster-heartb...@aaa.app-tier.admin.resin]-58} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:34:21.863] {MailboxWorker[global-ca...@aaa.app-tier.admin.resin]-56} GlobalCacheActor[global-ca...@aaa.app-tier.admin.resin]: CloudPod[0,main,cluster=web-tier] data is missing for HashKey[59ff8cba]
> [11-09-15 12:35:21.774] {MailboxWorker[cluster-heartb...@aaa.app-tier.admin.resin]-72} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:36:21.774] {MailboxWorker[cluster-heartb...@aaa.app-tier.admin.resin]-82} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:36:21.864] {MailboxWorker[global-ca...@aaa.app-tier.admin.resin]-80} GlobalCacheActor[global-ca...@aaa.app-tier.admin.resin]: CloudPod[0,main,cluster=web-tier] data is missing for HashKey[59ff8cba]
> [11-09-15 12:37:21.774] {MailboxWorker[cluster-heartb...@aaa.app-tier.admin.resin]-91} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:38:21.774] {MailboxWorker[cluster-heartb...@aaa.app-tier.admin.resin]-105} ClusterServer[id=web-a] notify-heartbeat-stop
>
>
> in jvm-app-b.log - repeating every minute
>
> [11-09-15 12:32:21.866] {MailboxWorker[cluster-heartb...@baa.app-tier.admin.resin]-12} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:33:21.774] {MailboxWorker[cluster-heartb...@baa.app-tier.admin.resin]-43} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:34:21.774] {MailboxWorker[cluster-heartb...@baa.app-tier.admin.resin]-55} ClusterServer[id=web-a] notify-heartbeat-stop
> [11-09-15 12:35:21.774] {MailboxWorker[cluster-heartb...@baa.app-tier.admin.resin]-67} ClusterServer[id=web-a] notify-heartbeat-stop
>
>



___
resin-interest mailing list
resin-interest@caucho.com
http://maillist.caucho.com/mailman/listinfo/resin-interest


Re: [Resin-interest] Resin 4.0.22 Errors in logs after start

2011-09-15 Thread Alan Wright
Apologies - the errors did not stop. I had copied the log files for
review and forgotten that I had done so.

The errors continue with the same pattern, although resin-admin looks OK.


Alan

On 15/09/2011 14:57, Alan Wright wrote:
> Hi
>
> I am migrating from Resin 2 to 4.0.22
>
> So far I am starting a web-tier with one load-balance server and an app
> tier with two app servers.
>
> It starts successfully and the docs and admin applications function.
>
> After startup I am seeing errors in the logs, but then all these errors
> stopped after about 2 hours - so it appears they are related.
>

-- 


Alan Wright
Athene Systems



