On Nov 7, 2007, at 1:47 AM, [EMAIL PROTECTED] wrote:

Hi,

we have got a problem for some months now that we can’t find a solution for.

The situation: We have an application running on four different application servers (quite a few instances on each server, all servers running Linux), controlled by monitors running on two of those servers (each monitor is responsible for 2 servers). wotaskd is running on each server as well. Finally, we have two web servers (Apache 2.0.49). We use Java 1.4.2 and WebObjects 5.2.3.

The problem: Several times a day, on each of the instances, we get session timeouts (SessionRestorationErrors). But the sessions don’t actually time out; the requests are sent to the wrong instances. Of course, the session IDs are not known on those wrong instances, so the SessionRestorationErrors occur. What we have done so far: we tried setting send timeout, receive timeout and connect timeout in “Load Balancing and Adaptor Settings” to values of one minute and above, without any success.

That is the classic solution for this type of problem. I can think of two explanations why it might not be working. The first is that your instances are stalling for longer than one minute. The other is that the problem is at a level below WebObjects.

For the first situation, we can use the apps themselves to diagnose it. Add this to your Application class:

    // Assumes Application already imports com.webobjects.appserver.*
    // and com.webobjects.foundation.*
    public WOResponse dispatchRequest(WORequest request)
    {
        NSTimestamp startTime = new NSTimestamp();

        // Let WebObjects handle the request as usual
        WOResponse response = super.dispatchRequest(request);

        NSTimestamp stopTime = new NSTimestamp();
        long milliseconds = stopTime.getTime() - startTime.getTime();

        // Comma-separated so the line is easy to grep, split, and sort
        NSLog.debug.appendln("," + request.uri() + ", - elapsed time: ," + (milliseconds / 1000.0));

        return response;
    }


You can easily grep this out of the log, split it on the commas, and sort by the time to see what the longest lag in returning a response is. If it is over a minute, I would look at:

1. Slow queries / DB contention
2. Excessive garbage collection due to memory starvation
3. Other processes on the machine (a cron job?) taking too many resources

If it is not over a minute, see below.
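
As a rough sketch of the grep, split, and sort step, something like the following could be run offline against the log lines. It is only an illustration: the class name ElapsedTimeReport and its Entry type are made up here, it assumes the log prefix in front of the first comma contains no commas of its own, and it is written for a current JDK rather than the 1.4.2 the apps run on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of WebObjects: ranks timing lines by elapsed seconds.
class ElapsedTimeReport {

    // Marker text written by the dispatchRequest() override above.
    static final String MARKER = ", - elapsed time: ,";

    // One parsed log entry: the request URI and its elapsed time in seconds.
    static final class Entry {
        final String uri;
        final double seconds;

        Entry(String uri, double seconds) {
            this.uri = uri;
            this.seconds = seconds;
        }
    }

    // Returns the timing entries found in the given log lines, slowest first.
    static List<Entry> slowestFirst(List<String> logLines) {
        List<Entry> entries = new ArrayList<Entry>();
        for (String line : logLines) {
            int marker = line.indexOf(MARKER);
            if (marker < 0) {
                continue; // not one of our timing lines
            }
            // Assumption: the first comma in the line is the one we wrote before the URI.
            int uriStart = line.indexOf(',') + 1;
            String uri = uriStart <= marker ? line.substring(uriStart, marker) : "";
            try {
                double seconds = Double.parseDouble(line.substring(marker + MARKER.length()).trim());
                entries.add(new Entry(uri, seconds));
            } catch (NumberFormatException e) {
                // Skip lines whose elapsed time cannot be parsed.
            }
        }
        // Sort descending so the longest response time comes first.
        entries.sort((a, b) -> Double.compare(b.seconds, a.seconds));
        return entries;
    }
}
```

If the first entry returned is over 60 seconds, that points at the stalled-instance explanation; if nothing comes close, the timing lines have ruled it out.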

We are logging the woadaptor now. It seems we have got some kind of connection trouble:

Error: couldn't connect to 10.0.0.40 (1085): Operation now in progress
Error: Error connecting to server 10.0.0.40
Warn: Unable to find instance 55. Attempting to select another.
Warn: Unable to find instance 55. Attempting to select another.
Warn: Unable to find instance 60. Attempting to select another.

But 10.0.0.40:1085 is up and running. This error message is only thrown about every 10 or 20 minutes, not all the time. We found some similar problems on mailing lists, but none of them was helpful so far. Any suggestions on how we can get rid of this problem? Thanks in advance.

The only other thing I can think of is that you have problems in your network or the app servers are running out of ports / file handles or some similar problem below the level of WebObjects. I have no idea how to debug that.
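
One possible starting point, offered as a sketch rather than a known fix, would be a small probe run from a web server machine that repeatedly opens a plain TCP connection to the instance's host and port and records how long each connect takes. The class name ConnectProbe, the method names, and the timeout values are all made up for illustration; this is not part of WebObjects or the adaptor, and it only shows whether connects fail or stall outside the adaptor, not why.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic, not part of WebObjects or its adaptor.
class ConnectProbe {

    // Tries one TCP connect. Returns the elapsed milliseconds on success,
    // or -1 if the connect failed or did not finish within timeoutMillis.
    static long connectMillis(String host, int port, int timeoutMillis) {
        long start = System.nanoTime();
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return (System.nanoTime() - start) / 1000000L;
        } catch (IOException e) {
            // Covers refused connections and SocketTimeoutException alike.
            return -1L;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    // Probes the target the given number of times, pausing between attempts,
    // and returns how many attempts failed or timed out.
    static int countFailures(String host, int port, int attempts,
                             int timeoutMillis, long pauseMillis) throws InterruptedException {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            if (connectMillis(host, port, timeoutMillis) < 0) {
                failures++;
            }
            Thread.sleep(pauseMillis);
        }
        return failures;
    }
}
```

For example, calling ConnectProbe.countFailures("10.0.0.40", 1085, 3600, 5000, 1000L) would probe about once a second for roughly an hour. If it reports failures at around the same times the adaptor logs its connect errors, that would suggest the trouble really is below WebObjects.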

Chuck

--

Practical WebObjects - for developers who want to increase their overall knowledge of WebObjects or who are trying to solve specific problems.
http://www.global-village.net/products/practical_webobjects


Webobjects-deploy mailing list      (Webobjects-deploy@lists.apple.com)