Hello,

Currently, the only way to stop the openhpi daemon is to kill it, which results in an abrupt shutdown. The openhpi daemon does not support graceful shutdown. A graceful shutdown of OpenHPI closes the connections with the managed devices (like the shelf manager of ATCA, the OA of HP c-Class, etc.) and releases all the memory allocated by openhpi. The most important advantage of the graceful shutdown is that it helps in finding memory leaks.

The attached patch adds graceful shutdown support to OpenHPI, and the close() ABI is completely implemented for the simulator plugin.

The openhpi shutdown request can be sent through the kill command with the SIGUSR1 signal, or via the init.d/openhpid shell script with the shutdown option. The openhpid shell script sends the SIGUSR1 signal to the openhpid process. The signal is caught by the signal handler in the openhpid daemon, and the shutdown request is then passed on to all the openhpi threads.

Please note that the 'stop' option in the init.d/openhpid shell script (or kill <pid of openhpid>) still stops/kills openhpid abruptly. This is not changed as part of this patch.

The openhpid.sh.in file has been modified to send the SIGUSR1 signal as part of the shutdown request. However, I have not been able to test it on the different Linux distributions, and I am not sure whether the changes made for the different OSes are correct. I look forward to help from the different OS experts on this.

The focus of the graceful shutdown is for the daemon to be able to handle a shutdown signal, close client connections, close open handlers, and release all memory resources. Using this patch, memory leaks in openhpi were detected, and the openhpi memory leak bug #1823713 has been closed.

Since the changes involve many parts of the infrastructure, we can discuss the proposed changes.

The advantages
--------------
* Facilitates the use of memory leak detection tools (like valgrind).
* Encourages plug-in writers to fully implement close calls.
* Closes the connection with the managed devices, so that the managed devices can also clean up the resources they allocated.
* Most of the telecom players are very strict about memory leaks. Since openhpi may run for a few months to a few years, memory leaks in openhpi (if any) may create serious problems. End users can test for memory leaks on their own systems. This helps in increasing end-user confidence in OpenHPI.

The disadvantages
-----------------
* Complicates the socket connection a little, because of the use of timeouts.
* Requires more source code for the shutdown operation, which is really wasted since the kernel is going to dispose of the daemon process resources anyway.

The attached graceful_shutdown_notes gives more information on the changes made as part of this patch.

Regards,
PG

NOTE: This is tracked in the openhpi tracker as feature request #2220356.

Handling the openhpi shutdown request
======================================
The openhpi shutdown request can be sent through the kill command with the
SIGUSR1 signal, or via the init.d/openhpid shell script with the shutdown
option. The openhpid shell script sends the SIGUSR1 signal to the openhpid
process. The signal is caught by the signal handler in the openhpid daemon,
and the shutdown request is then passed on to all the openhpi threads.
NOTE: The openhpid.sh.in file has been modified to send the SIGUSR1 signal as
part of the shutdown request. However, I have not been able to test it on the
different Linux distributions. I look forward to help from the different OS
experts.
Below are the major changes made as part of the graceful shutdown
implementation.
Signal function
----------------
Catch the SIGUSR1 signal and notify the openhpid main thread to stop.
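As a rough illustration of this step (not the code from the patch), the handler can simply set a flag that the threads poll. The names shutdown_requested, on_sigusr1, install_shutdown_handler and shutdown_is_requested are hypothetical and chosen only for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag name; the patch may use a different variable. */
static volatile sig_atomic_t shutdown_requested = 0;

/* The handler does only async-signal-safe work: it sets the flag. */
static void on_sigusr1(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Register the SIGUSR1 handler; returns 0 on success, -1 on error. */
static int install_shutdown_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;    /* no SA_RESTART: blocking calls return EINTR */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Polled by the daemon threads at their wake-up points. */
static int shutdown_is_requested(void)
{
    return shutdown_requested != 0;
}
```

Keeping the handler down to a single flag assignment avoids calling non-reentrant functions from signal context. The actual cleanup then runs in the normal thread context.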
openhpi daemon main thread
----------------------------
a) Register signal function to catch SIGUSR1 in main() function.
b) Set the server socket timeout to 3 seconds.
c) Check for the shutdown request before listening for the HPI user connection.
On receiving the shutdown request, wait for the openhpi client threads to exit.
The g_thread_pool_free() call blocks until all the client threads have exited.
d) Notify the discovery and event threads to stop. Wake up the discovery thread
by calling the discovery wakeup function (oh_wake_discovery_thread). Wait for
the discovery and event threads to exit using the g_thread_join function.
e) Get the list of instantiated plugins and make the close ABI call for each of
them.
f) Clean up the framework event queue.
g) Clean up the domain.
h) Release the global variables.
i) Clean up the handler.
j) Exit normally.
The openhpi client threads
---------------------------
a) Check for the shutdown request. On receiving the shutdown request, stop
listening for the HPI client requests.
b) Close the session with the HPI users.
c) Clean up the memory allocated for the HPI user.
d) Exit normally.
discovery and event get threads
-------------------------------
Whenever the thread is woken up or times out while waiting, it checks for the
shutdown request. On receiving the shutdown request, the thread exits.
event pop thread
-----------------
a) Whenever the thread is woken up or times out while waiting, it checks for
the shutdown request. On receiving the shutdown request, the thread exits.
b) Use g_async_queue_timed_pop() (instead of g_async_queue_pop()) for popping
the events from the event queue.
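To show why the timed pop matters, here is a minimal sketch of a queue whose pop returns after a timeout, so that the caller gets a chance to check for the shutdown request. It is built on plain pthreads rather than GLib, and struct event_queue, queue_init, queue_push and queue_timed_pop are hypothetical stand-ins, not OpenHPI or GLib code.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <time.h>

#define QUEUE_CAP 16

/* Hypothetical fixed-size event queue standing in for the GAsyncQueue. */
struct event_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    void           *items[QUEUE_CAP];
    size_t          head, count;
};

static void queue_init(struct event_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->count = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
static int queue_push(struct event_queue *q, void *item)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = item;
        q->count++;
        pthread_cond_signal(&q->nonempty);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Pop one item, waiting at most timeout_ms; returns NULL on timeout. */
static void *queue_timed_pop(struct event_queue *q, long timeout_ms)
{
    struct timespec deadline;
    void *item = NULL;

    /* Absolute deadline on the clock used by default condition variables. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&q->lock);
    while (q->count == 0) {
        /* Stop waiting on timeout (or on any other error). */
        if (pthread_cond_timedwait(&q->nonempty, &q->lock, &deadline) != 0)
            break;
    }
    if (q->count > 0) {
        item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
    }
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

A pop thread would call queue_timed_pop() in a loop and test the shutdown flag each time it returns, whether it got an event or NULL. With an untimed pop, the thread could block forever on an empty queue and never see the request.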
Simulator plugin close() ABI
=============================
a) The annunciators are stored as part of the annunciator RDR. Release the
annunciator structures from the annunciator RDR.
b) Release the memory allocated for RPT and RDR.
c) Release the memory allocated for the event log.
d) Release the memory allocated for handler.
Preparing OpenHPI for Valgrind
-------------------------------
* Download the openhpi trunk and apply the shutdown patch.
* Enable debugging when running the configure script. This disables
optimization and enables debugging. This step is a MUST.
* Compile and install OpenHPI.
For example: # ./configure --enable-debuggable
# make; make install
Finding the memory leaks using Valgrind
----------------------------------------
* Run the openhpi daemon under valgrind.
For example: # valgrind openhpid -c <openhpi.conf file>
* Run the tests to generate the different scenarios like saftest/hpitest,
plugin specific tests etc.
* Stop the openhpi daemon by sending the SIGUSR1 signal to the valgrind
process/pid. Valgrind in turn forwards the SIGUSR1 signal to the openhpi daemon.
NOTE: Do not use the openhpid shell script to stop the openhpi daemon in this
case. The openhpid shell script tries to send the SIGUSR1 signal to the openhpi
daemon pid rather than to the valgrind process.
* The memory errors will be shown on the screen.
* Analyze and fix the memory errors (if any)
* Re-run the above steps until no more valid memory errors are reported.
Attachments:
openhpid_graceful_shutdown.patch
simulator_graceful_shutdown.patch
