Hi,
Thank you very much for the analysis. Here are the knldiag and rtedump from the
latest crash (last night :-( ), as you requested:

http://www.signal7.de/latestcrash.tgz

Best regards,
Robert

On Tuesday 08 October 2002 10:06, Mensing, Joerg wrote:
> Hello Robert,
> your vserver reports a disappeared COMMUNIC semid. What is that for? For
> each connection a vserver is started. The vserver uses two semaphores for
> communication. For each direction a semaphore is needed to synchronize with
> the SAP DB kernel's use of the shared communication memory segment. In an
> established connection, the vserver gets packets from the socket and puts
> the content into shared memory. Then it triggers the semaphore which is used
> by the corresponding user kernel thread (UKT) inside SAP DB. Now it will
> wait on the communication channel specific semaphore until this is
> triggered by the SAP DB kernel.
>
> The SAP DB kernel will read the request section of the shared memory and,
> after its work is done, it will put its reply into the shared memory reply
> section. After the reply is available, the kernel uses the communication
> channel specific semaphore to wake up the vserver.
>
> The vserver triggered by the semaphore will now transfer the reply section
> content into the socket and wait for the next request.
>
> If your link is still valid, from the knldiag I can see that the SAP DB
> kernel was killed by a 'Signal 9'. How can such a signal be recognized? The
> SAP DB kernel, after it is started, immediately forks a child process and
> watches over its status. The 'Watchdog' reopens the knldiag after the child
> process has terminated, writes 'famous last words' and flushes the content
> of the shared memory trace buffers. A signal '9' is the KILL signal, which
> cannot be caught or ignored. Whoever was so friendly as to send that signal
> to the SAP DB kernel thereby triggered the 'cleanup' procedure in the
> watchdog.
> This cleanup uses the 'tag file' directories, which are found in
> /usr/spool/sql/ipc/db:dbname and /usr/spool/sql/ipc/us:dbname. In these
> directories all shared memory segments and semaphores that belong to the
> SAP DB instance 'dbname' are found. The cleanup code uses the directory
> entries to remove the semaphores and shared memory segments for all
> clients (including all vservers connected to SAP DB) and for the SAP DB
> kernel itself. This cleanup code in your case removed the IPC semaphores
> with 'ipcrm()' system calls so that the vserver reported them as
> 'disappeared'. If it uses those disappeared semaphores it ends up with
> 'invalid argument'. So what you see is a follow-up of a 'Signal 9'.
>
> Best regards
> jrg
>
> P.S.: It would be nice if you could upload your knldiag of the second crash
> you recently had. The way you did with the first crash helped a lot for
> this analysis. The excerpt you sent us is not enough, since all I know now
> is that you ran out of memory :-(
> _______________________________________________
sapdb.general mailing list
[EMAIL PROTECTED]
http://listserv.sap.com/mailman/listinfo/sapdb.general
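
For readers following the thread, the request/reply handshake described in the
quoted message can be sketched roughly as below. This is only an illustration,
not SAP DB code: Python threads and threading.Semaphore stand in for the
vserver process, the UKT and the System V semaphores, and the names
CommChannel, kernel_task, vserver_roundtrip and demo are all made up for this
sketch.

```python
import threading


class CommChannel:
    """Hypothetical stand-in for one shared communication memory segment."""

    def __init__(self):
        self.request = None   # plays the role of the "request section"
        self.reply = None     # plays the role of the "reply section"
        # One semaphore per direction, both initially "not triggered".
        self.to_kernel = threading.Semaphore(0)    # vserver -> kernel (UKT)
        self.to_vserver = threading.Semaphore(0)   # kernel -> vserver


def kernel_task(chan, handle, n_requests):
    """Kernel side: wait for a request, do the work, publish the reply."""
    for _ in range(n_requests):
        chan.to_kernel.acquire()            # wait until the vserver triggers us
        chan.reply = handle(chan.request)   # read request section, write reply
        chan.to_vserver.release()           # wake up the vserver


def vserver_roundtrip(chan, packet):
    """vserver side: hand one packet to the kernel and wait for its reply."""
    chan.request = packet        # content from the socket goes into shared memory
    chan.to_kernel.release()     # trigger the semaphore the kernel waits on
    chan.to_vserver.acquire()    # block until the kernel has written the reply
    return chan.reply            # this would be written back to the socket


def demo():
    chan = CommChannel()
    packets = [b"select 1", b"select 2"]
    kernel = threading.Thread(
        target=kernel_task,
        args=(chan, lambda req: b"OK: " + req, len(packets)),
    )
    kernel.start()
    replies = [vserver_roundtrip(chan, p) for p in packets]
    kernel.join()
    return replies
```

Because each side blocks on its own semaphore, neither touches a section of the
buffer while the other is still working on it. With real System V semaphores,
removing them while a vserver still holds their ids is what would produce the
'invalid argument' error mentioned above.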

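
The watchdog idea from the quoted message (fork a child, watch its status, see
that it died from signal 9) can also be illustrated with a minimal sketch. It
assumes a POSIX system and is not the SAP DB implementation; run_watched and
kill_self are hypothetical names, and the real watchdog additionally writes to
knldiag and cleans up the IPC resources, which is left out here.

```python
import os
import signal


def run_watched(child_main):
    """Fork a child, run child_main in it, and report how the child ended."""
    pid = os.fork()
    if pid == 0:
        # Child process: run the payload, then exit without returning
        # into the parent's code path.
        code = 1
        try:
            child_main()
            code = 0
        finally:
            os._exit(code)

    # Parent process ("watchdog"): block until the child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        # The child cannot catch SIGKILL, but the parent still sees it here.
        return "killed by signal %d" % os.WTERMSIG(status)
    return "exited with code %d" % os.WEXITSTATUS(status)


def kill_self():
    """Simulate someone sending the KILL signal to the child."""
    os.kill(os.getpid(), signal.SIGKILL)
```

Calling run_watched(kill_self) reports the terminating signal even though the
child had no chance to handle it, which is why the knldiag entry is written by
the watchdog rather than by the killed process.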