From: [EMAIL PROTECTED]
Operating system: Linux (Red Hat)
PHP version: 4.0.4
PHP Bug Type: OCI8 related
Bug description: Oracle Support blames PHP OCI library handling for ORA-600 server errors

We are running a large website on several machines with different OS, Apache, and PHP versions. We have encountered "hanging" Apache server processes (they do not quit on a graceful restart, only on a real restart) as well as ORA-600 errors on the database server (Oracle 8.1.6.1). These occur under extreme website load, in conjunction with unoptimized queries, or when a couple of web servers are restarted simultaneously, which kills a large number of such hanging processes.

Oracle Support attributes the ORA-600 [16365] error (an internal Oracle error) to a flaw in the handling of their OCI library on the client side. This is their answer to a case similar to ours, filed in mid-July 2001:

==================================
The main reason we are focusing on the client side is because the error generated on the server is caused by a message that was sent out of turn from the client. This is a half protocol violation error caused by the client.

Basically it means the client sent the server a message and, instead of waiting for the reply, it decided to send another message out of turn. In other cases of this where the Apache server was the Oracle client, under heavy load the Apache server has a request timeout of 5 minutes. If the server becomes heavily used and in this case starts paging and swapping, it is possible for it not to respond for this timeout period. What happens now is the Apache server times the request out. It has a signal handler which, unfortunately, then long jumps out of Oracle client library code (OCI), leaving the read system call to the server. It also leaves the client-side library code in an unknown state, therefore causing any number of possible problems.

Then the client application, unaware of the longjmp by the Apache server, would try a request to the Oracle server using the same session and cause this half protocol violation. Note this error is only generated by the MTS server.

If, however, from the above information, the connection to the Oracle server is not from the web server, then the problem will be caused by whatever other application happens to be used to access the database.

As Oracle has no control over long jumps out of its code, the only way to drop a connection in this situation is for the client to send an OCIBreak/OCIReset. Any other request will cause the error that is seen above.

Therefore, by increasing the timeout we reduce the possibility of longjmps out of the Oracle client library. Also, the application can be modified to send an OCIBreak/OCIReset to the server. The people that wrote the PHP scripting language apparently put a modification in their code to cope with this. However, I have not seen this change so I cannot comment on it.
===================

I could not find any break or reset calls in the code of oci8.c, nor any information in the mailing list or the bug database. I am a lousy C programmer, though, and maybe the problem was fixed in the latest version, but I could not find any release notes dealing with it. I always thought this was an Oracle error, but according to them it seems to be mishandling in the client-side code (in this case the web server, which acts as a client to the database).

Thanks

--
Edit bug report at: http://bugs.php.net/?id=13515&edit=1

--
PHP Development Mailing List <http://www.php.net/>
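
To make the failure mode described by Oracle Support easier to discuss, here is a minimal C sketch of it. It does not use the real OCI library or any code from oci8.c: the session type, session_request(), session_break_and_reset() and guarded_request() are hypothetical stand-ins invented for this illustration, and the timeout is simulated with raise(SIGALRM) instead of a real 5 minute wait. It only shows how a signal handler that long jumps out of a blocking call can leave a session "awaiting reply", so that the next request on the same session goes out of turn, and how a break/reset step (the role OCIBreak/OCIReset would presumably play) avoids that.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the client library's per-session state. */
typedef enum { SESSION_IDLE, SESSION_AWAITING_REPLY } session_state;

typedef struct {
    /* volatile because both fields are inspected after a siglongjmp */
    volatile session_state state;
    volatile int protocol_violations; /* what the server would log as an error */
} session;

static sigjmp_buf timeout_env;

/* Mimics a web server timeout handler: jumps out of whatever was running. */
static void on_timeout(int signo) {
    (void)signo;
    siglongjmp(timeout_env, 1);
}

/* Hypothetical request: send a message, then wait for the reply.
 * Returns 1 on success, -1 if the message was sent out of turn. */
static int session_request(session *s, int simulate_timeout) {
    if (s->state != SESSION_IDLE) {
        s->protocol_violations++;    /* previous reply was never consumed */
        return -1;
    }
    s->state = SESSION_AWAITING_REPLY;
    if (simulate_timeout)
        raise(SIGALRM);              /* handler long jumps; the next line is skipped */
    s->state = SESSION_IDLE;         /* reply received, session usable again */
    return 1;
}

/* Stand-in for sending a break followed by a reset on the session. */
static void session_break_and_reset(session *s) {
    assert(s != NULL);
    s->state = SESSION_IDLE;
}

/* Runs one request under the timeout handler.
 * Returns 1 on success, 0 on timeout, -1 on a protocol violation. */
int guarded_request(session *s, int simulate_timeout, int reset_on_timeout) {
    struct sigaction sa;
    sa.sa_handler = on_timeout;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    if (sigsetjmp(timeout_env, 1) != 0) {
        /* We arrive here via siglongjmp, i.e. the request timed out. */
        if (reset_on_timeout)
            session_break_and_reset(s);
        return 0;
    }
    return session_request(s, simulate_timeout);
}
```

In this sketch, calling guarded_request(&s, 1, 0) and then guarded_request(&s, 0, 0) on the same session makes the second call return -1 and count one protocol violation, because the session was left waiting for a reply. Passing reset_on_timeout = 1 in the first call lets the second call succeed. Whether the real fix in PHP would look like this is only an assumption on my side.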