Sorry, last for the night. Time for me to go to sleep.

If I modify "ab" source code in Apache 2.2 to output error number
causing it to fail, get:

  Send request failed 9!

The code where modification was done was:

        if (e != APR_SUCCESS) {
            /*
             * Let's hope this traps EWOULDBLOCK too !
             */
            if (!APR_STATUS_IS_EAGAIN(e)) {
                epipe++;
                printf("Send request failed %d!\n", e);
                close_connection(c);
            }
            return;
        }

On Mac OS X this equates to:

  #define EBADF           9               /* Bad file descriptor */

Maybe when I am really bored I'll pursue further as to why.

Graham

On 06/03/2006, at 10:27 PM, Graham Dumpleton wrote:

Don't even need to rewrite test to use threads to fire off requests. If
I hardwire test to use "ab" from Apache 1.3 or Apache 2.0, then it
works okay. It is only the version of "ab" in Apache 2.0 which causes
the test to fail.

Graham

On 06/03/2006, at 9:41 PM, Graham Dumpleton wrote:

A bit more information. This may actually be a problem with the "ab" tool used for this test. I have actually seen the "ab" program give problems on
Mac OS X before even with Apache 2.0.

On 06/03/2006, at 9:02 PM, Graham Dumpleton wrote:

Traceback (most recent call last):
  File "test.py", line 2247, in test_global_lock
    self.fail("global_lock is broken (too quick)")
File "/System/Library/Frameworks/Python.framework/Versions/2.3/ lib/python2.3/unittest.py", line 270, in fail
    raise self.failureException, msg
AssertionError: global_lock is broken (too quick)

What the test does is to run "ab" to fire off five concurrent requests. The handler the request is calling acquires a global lock, sleeps for a second
and then releases the lock. Because all requests need to acquire the
lock but are blocked out for a second, the execution of the five requests should therefore take five seconds because they will in effect be serialised.

The problem is that "ab" is giving errors on 3 out of 5 of the requests:

Concurrency Level:      5
Time taken for tests:   3.514548 seconds
Complete requests:      5
Failed requests:        0
Write errors:           3
Total transferred:      540 bytes
HTML transferred:       21 bytes
Requests per second:    1.42 [#/sec] (mean)
Time per request:       3514.548 [ms] (mean)
Time per request: 702.910 [ms] (mean, across all concurrent requests)
Transfer rate:          0.00 [Kbytes/sec] received

As example, when "ab" is run in verbose mode:

LOG: header received:
HTTP/1.1 200 OK^M
Date: Mon, 06 Mar 2006 10:28:42 GMT^M
Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 2.3.5^M
Connection: close^M
Content-Type: text/plain^M
^M
test ok
LOG: Response code = 200
Send request failed!
LOG: header received:
HTTP/1.1 200 OK^M
Date: Mon, 06 Mar 2006 10:28:42 GMT^M
Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 2.3.5^M
Connection: close^M
Content-Type: text/plain^M
^M

Note the "Send request failed!" message.

Although "ab" is giving the error, the requests are in fact received by Apache
and handled. This time output is from prefork and not worker.

[Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock 1 1141640921.61 0 [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock 2 1141640921.61 0 [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 3 1141640922.61 0 [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 4 1141640922.61 0 [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 5 1141640922.61 0 [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1 1141640923.14 4 [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 2 1141640923.14 4 [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1 1141640923.22 1 [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1 1141640923.26 3 [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1 1141640923.26 2 [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1 1141640923.26 5 [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 3 1141640924.14 4 [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 4 1141640924.14 4 [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 2 1141640924.14 1 [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 5 1141640924.14 4 [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 3 1141640925.14 1 [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 4 1141640925.14 1 [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 2 1141640925.14 3 [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 5 1141640925.14 1 [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 3 1141640926.14 3 [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 4 1141640926.14 3 [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 2 1141640926.14 2 [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 5 1141640926.14 3 [Mon Mar 06 21:28:49 2006] [warn] child process 902 still did not exit, sending a SIGTERM [Mon Mar 06 21:28:51 2006] [warn] child process 902 still did not exit, sending a SIGTERM [Mon Mar 06 21:28:53 2006] [warn] child process 902 still did not exit, sending a SIGTERM [Mon Mar 06 21:28:55 2006] [error] child process 902 still did not exit, sending a SIGKILL [Mon Mar 06 21:28:56 2006] [info] removed PID file /Users/grahamd/ Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=898)
[Mon Mar 06 21:28:56 2006] [notice] caught SIGTERM, shutting down

The request with connection.id of 2 (last value) never gets to complete because the the test script is killing off Apache before it has a chance since "ab" prematurely gave up on that request. Apache didn't shutdown properly because of the problem with Py_Finalize() being called in signal handler and deadlocking because the
request handler still had the GIL.

-------------------------------------------------------------------- --
Ran 1 test in 8.658s

FAILED (failures=1)


The contents of the logs/error_log file are:

[Mon Mar 06 20:57:31 2006] [notice] mod_python: Creating 6 session mutexes based on 2 max processes and 3 max threads. [Mon Mar 06 20:57:31 2006] [notice] Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/2.3.5 configured -- resuming normal operations
[Mon Mar 06 20:57:31 2006] [info] Server built: Feb 15 2006 21:39:56
[Mon Mar 06 20:57:31 2006] [debug] worker.c(1715): AcceptMutex: sysvsem (default: sysvsem) [Mon Mar 06 20:57:32 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 20:57:33 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 20:57:34 2006] [debug] worker.c(1329): taking over scoreboard slot from 2500 (quiescing) [Mon Mar 06 20:57:34 2006] [notice] mod_python: (Re)importing module 'tests' [Mon Mar 06 20:57:37 2006] [info] [client 127.0.0.1] (9)Bad file descriptor: core_output_filter: writing data to the network [Mon Mar 06 20:57:38 2006] [info] [client 127.0.0.1] (9)Bad file descriptor: core_output_filter: writing data to the network [Mon Mar 06 20:57:39 2006] [info] removed PID file /Users/grahamd/ Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=2497)
[Mon Mar 06 20:57:39 2006] [notice] caught SIGTERM, shutting down

In the above earlier worker log, the failures in the core output filter is purely because "ab" had killed the connection and thus the file descriptor on
server side had gone bad.

Now as to whether "ab" is the problem. I know that I have seen with "ab" before when doing load testing on Mac OS X that it will often exhibit some rate of failures for no good reason even though the handlers in the server look to work perfectly okay based on logging. I don't know the reason for why "ab" does this, but reasonably confident that "ab" is probably the cause
of the failure and not Apache.

What I might do is to use threading to make concurrent requests from the
test script itself and not involve "ab" at all and see how that goes.

BTW, the log output at top was from modified test handler as below.

def global_lock(req):

    import _apache

req.log_error("global_lock 1 %s %s" % (time.time (),req.connection.id))

    _apache._global_lock(req.server, 1)

req.log_error("global_lock 2 %s %s" % (time.time (),req.connection.id))

    time.sleep(1)

req.log_error("global_lock 3 %s %s" % (time.time (),req.connection.id))

    _apache._global_unlock(req.server, 1)

req.log_error("global_lock 4 %s %s" % (time.time (),req.connection.id))

    req.write("test ok")

req.log_error("global_lock 5 %s %s" % (time.time (),req.connection.id))

    return apache.OK


Graham


Reply via email to