Re: deployment reliability problems

Eugene Khablov Tue, 29 Apr 2008 06:03:28 -0700

On Sun, Apr 27, 2008 at 5:17 AM, Aristedes Maniatis <[EMAIL PROTECTED]> wrote:


> There appear to be some fundamental problems in WO deployment within a
> failover cluster environment. Surely there must be some workarounds for
> bigger deployments. Imagine a simple setup: two application servers with
> instances of several applications on each.
>
> * If one server goes offline suddenly (eg. a motherboard failure), running
> Javamonitor becomes almost impossible. Every click which causes a page
> refresh or change takes between 30 and 120 seconds. That is, it is tediously
> slow to perform vital work needed in the event of a failure of one node in
> the cluster. In fact, quite often Javamonitor will simply time out, forcing
> you back to the login page again.
>

This is a very rare situation; it has happened in our environment only
once. The workaround for this problem is to check server availability, or to
check for the server in the WOAdaptorInfo response.
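
A minimal sketch of such an availability check, assuming a plain TCP connect
to the wotaskd port on each application server is a good enough signal. The
host names are hypothetical, and port 1085 is an assumption about the wotaskd
configuration; adjust both to the real deployment. The short timeout is what
keeps a dead node from stalling the check.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical host names; 1085 is assumed to be the wotaskd port.
    for host in ("app1.example.com", "app2.example.com"):
        status = "up" if is_reachable(host, 1085) else "DOWN"
        print(f"{host}: {status}")
```

A host reported as DOWN could then be taken out of the configuration before
anyone has to wait on it in Javamonitor.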


>
> * If one instance of one application which Javamonitor monitors locks up
> somehow (we are still to understand what happens to the instance), then
> Javamonitor also becomes unresponsive (but not as badly) and additionally
> displays false information. For example, pressing refuse new sessions
> actually causes the application to refuse sessions, but the GUI icon does
> not change colour. Force killing the rogue instance returns everything to
> normal, but it is very hard to determine which instance is rogue.
>

We successfully solved this problem using a small script that restarts very
slow or deadlocked instances. The script also emails us the memory status, a
thread dump, and some other useful debug info about such instances.
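
The script itself is not shown here. As a rough sketch of the idea only, and
not the actual script, a watchdog could time an HTTP request to each instance
and act when the instance is slow or does not answer. The thresholds, the
instance URL, and the `restart` and `report` callbacks below are all
hypothetical placeholders; in a real setup `report` would collect and email
the diagnostics before `restart` kills the instance.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional

SLOW_SECONDS = 10.0     # assumed threshold for "very slow"
TIMEOUT_SECONDS = 30.0  # assumed cutoff for "deadlocked / unresponsive"


def probe(url: str, timeout: float = TIMEOUT_SECONDS) -> Optional[float]:
    """Return the response time in seconds, or None on error or timeout."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
    return time.monotonic() - start


def classify(elapsed: Optional[float], slow_after: float = SLOW_SECONDS) -> str:
    """Map a probe result to 'ok', 'slow' or 'unresponsive'."""
    if elapsed is None:
        return "unresponsive"
    if elapsed > slow_after:
        return "slow"
    return "ok"


def check_instance(
    url: str,
    restart: Callable[[str], None],
    report: Callable[[str, str], None],
    probe_fn: Callable[[str], Optional[float]] = probe,
) -> str:
    """Probe one instance; report and restart it unless it is healthy."""
    state = classify(probe_fn(url))
    if state != "ok":
        report(url, state)  # gather diagnostics first, while the process still exists
        restart(url)
    return state
```

Run from cron or a loop against every instance URL, this also answers the
question of which instance is the rogue one, since each instance is probed
directly rather than through Javamonitor.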


>
> * Trying to clear up the problems on a server by restarting wotaskd is
> fairly disastrous. It tries to start new copies of all instances, which
> fails since the old instances are still running on the same ports. It does
> not find the old instances, which then require manual killing one at a time.
>
>
Our wotaskd reuses existing instances. We run Mac OS X Server 10.4.11.


>
> I can't imagine that a system billed as designed for 'enterprise use' has
> these simple and fundamental issues. Of course, they only happen when
> something goes wrong, so you may not (touch wood) see them for years at a
> time. Are there techniques which I am missing here or are these problems not
> seen by others?
>
>
> Ari Maniatis
>
> -------------------------->
> ish
> http://www.ish.com.au
> Level 1, 30 Wilson Street Newtown 2042 Australia
> phone +61 2 9550 5001   fax +61 2 9550 4001
> GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A
>
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-deploy mailing list      ([email protected])
> Help/Unsubscribe/Update your Subscription:
>
> http://lists.apple.com/mailman/options/webobjects-deploy/vesper%40mactime.ru
>
> This email sent to [EMAIL PROTECTED]
>



-- 
Eugene Khablov
Media Agency "Design Maximum"

Tel: +7 863 2648211
Fax: +7 863 2645229
Web: http://www.demax.ru

