Hello,

Currently, if a user wants to stop the OpenHPI daemon, the only way is to kill
it, which results in an abrupt shutdown. The daemon does not support graceful
shutdown. A graceful shutdown closes the connections with the managed devices
(such as the ATCA shelf manager or the HP c-Class OA) and releases all the
memory allocated by OpenHPI.

The most important advantage of graceful shutdown is that it helps in
finding memory leaks.

The attached patch adds graceful shutdown support to OpenHPI, and the close()
ABI is completely implemented for the simulator plugin.
A shutdown request can be sent either with the kill command and the SIGUSR1
signal, or via the init.d/openhpid shell script with the shutdown option. The
openhpid shell script sends the SIGUSR1 signal to the openhpid process. The
signal handler in the openhpid daemon catches SIGUSR1, and the shutdown
request is then passed on to all the OpenHPI threads.
Please note that the 'stop' option of the init.d/openhpid shell script (or
kill <pid of openhpid>) still stops the openhpid abruptly. This patch does not
change that.

The openhpid.sh.in file has been modified to send the SIGUSR1 signal as part of
the shutdown request. However, I am not able to test it on the different Linux
distributions, so I am not sure whether the changes made for each of them are
correct. I am looking forward to help from people familiar with those
distributions.

The focus of the graceful shutdown is for the daemon to be able to handle a
shutdown signal, close client connections, close open handlers, and release
all memory resources. Using this patch, memory leaks in OpenHPI were detected,
and the OpenHPI memory leak bug #1823713 has been closed.

Since the changes involve many parts of the infrastructure, we can discuss the 
proposed changes.

The advantages
---------------
 * Facilitates use of memory leak detection tools (like valgrind).
 * Encourages plug-in writers to fully implement close calls.
 * Closes the connection with the managed devices so that the managed devices
   can also clean up the resources they allocated.
 * Most of the telecom players are very strict on memory leaks. Since
   OpenHPI may run for a few months to a few years, memory leaks in
   OpenHPI (if any) may create serious problems. End users can test for
   memory leaks on their own systems, which helps increase their
   confidence in OpenHPI.

The disadvantages
------------------
 * Complicates the socket connection a little, because of the use of timeouts
 * Requires more source code for the shutdown operation, which is really wasted
   since the kernel is going to dispose of the daemon process resources anyway.

The attached graceful_shutdown_notes gives more information on the changes done 
as part of this patch.

Regards,
PG

NOTE: This is tracked in the OpenHPI tracker as feature request #2220356.

Handling the openhpi shutdown request
======================================

A shutdown request can be sent either with the kill command and the SIGUSR1
signal, or via the init.d/openhpid shell script with the shutdown option. The
openhpid shell script sends the SIGUSR1 signal to the openhpid process. The
signal handler in the openhpid daemon catches SIGUSR1, and the shutdown
request is then passed on to all the OpenHPI threads.
NOTE: The openhpid.sh.in file has been modified to send the SIGUSR1 signal as
part of the shutdown request. However, I am not able to test it on the
different Linux distributions. I am looking forward to help from people
familiar with those distributions.

Below are the major changes that are done as part of the graceful shutdown 
implementation.

Signal function
----------------
Catch the SIGUSR1 signal and notify the openhpid main thread to stop.
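
A minimal sketch of such a signal function is shown here for illustration
only. It is not taken from the patch: the names shutdown_requested,
on_sigusr1 and install_shutdown_handler are hypothetical, and the patch may
notify the main thread differently. The handler only sets a flag, since very
little else is safe to do inside a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag; the real patch may use a different variable. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Signal function: only record the request, nothing else. */
static void on_sigusr1(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Register the handler for SIGUSR1. Returns 0 on success, -1 on error. */
static int install_shutdown_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

The main thread would call install_shutdown_handler() once at startup and
then poll shutdown_requested at convenient points.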

openhpi daemon main thread
----------------------------
a) Register the signal function to catch SIGUSR1 in the main() function.
b) Set the server socket timeout to 3 seconds.
c) Check for a shutdown request before listening for an HPI user connection.
On receiving the shutdown request, wait for the OpenHPI client threads to exit.
The g_thread_pool_free() call blocks until all the client threads have exited.
d) Notify the discovery and event threads to stop. Wake up the discovery thread
by calling the discovery wakeup function (oh_wake_discovery_thread). Wait for
the discovery and event threads to exit using the g_thread_join function.
e) Get the list of instantiated plugins and make the close ABI call of each
instantiated plugin.
f) Clean up the framework event queue.
g) Clean up the domain.
h) Release the global variables.
i) Clean up the handler.
j) Exit normally.
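
Steps b) and c) can be sketched roughly as follows. This is an assumption-laden
illustration, not the code from the patch: it uses poll() with a 3-second
timeout as a stand-in for the server socket timeout, and the names
stop_requested, ACCEPT_TIMEOUT_MS and serve_until_shutdown are hypothetical.
The point is only that the loop never blocks for longer than the timeout
before it re-checks the shutdown request.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

#define ACCEPT_TIMEOUT_MS 3000  /* mirrors the 3-second timeout of step b) */

/* Hypothetical flag, set by the SIGUSR1 signal function. */
static volatile sig_atomic_t stop_requested = 0;

/*
 * Accept connections until a shutdown is requested.
 * Returns the number of accepted connections, or -1 on a poll() error.
 */
static int serve_until_shutdown(int listen_fd)
{
    int accepted = 0;

    while (!stop_requested) {
        struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
        int rc = poll(&pfd, 1, ACCEPT_TIMEOUT_MS);

        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: re-check flag */
            return -1;
        }
        if (rc == 0)
            continue;           /* timed out: re-check flag */

        int client = accept(listen_fd, NULL, NULL);
        if (client >= 0) {
            accepted++;
            close(client);      /* a real daemon hands this to a client thread */
        }
    }
    return accepted;
}
```

Once the loop returns, the remaining steps d) to j) would run in order.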

The openhpi client threads
---------------------------

a) Check for a shutdown request. On receiving the shutdown request, stop
listening for HPI client requests.
b) Close the session with the HPI users.
c) Clean up the memory allocated for the HPI user.
d) Exit normally.

discovery and event get threads
-------------------------------
Whenever the thread is woken up or times out while waiting, it checks for a
shutdown request. On receiving the shutdown request, the thread exits.

event pop thread
-----------------
a) Whenever the thread is woken up or times out while waiting, it checks for a
shutdown request. On receiving the shutdown request, the thread exits.
b) Use g_async_queue_timed_pop() (instead of g_async_queue_pop) for popping
the events from the event queue.
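
The reason for the timed pop is that a plain blocking pop would never return
on an empty queue, so the thread could not notice the shutdown request. The
sketch below illustrates the idea with POSIX threads rather than the GLib
async queue used by OpenHPI. Everything in it (struct event_slot, slot_push,
slot_timed_pop, slot_request_shutdown) is hypothetical and only stands in for
the real queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical one-element "queue"; stands in for the GLib async queue. */
struct event_slot {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    void           *item;           /* NULL when empty */
    int             shutting_down;  /* set once shutdown is requested */
};

/* Store one event and wake up the waiting thread. */
static void slot_push(struct event_slot *s, void *item)
{
    pthread_mutex_lock(&s->lock);
    s->item = item;
    pthread_cond_signal(&s->ready);
    pthread_mutex_unlock(&s->lock);
}

/* Mark the slot as shutting down and wake up the waiting thread. */
static void slot_request_shutdown(struct event_slot *s)
{
    pthread_mutex_lock(&s->lock);
    s->shutting_down = 1;
    pthread_cond_broadcast(&s->ready);
    pthread_mutex_unlock(&s->lock);
}

/*
 * Wait up to timeout_sec seconds for an event.
 * Returns the event, or NULL on timeout or when shutdown was requested.
 * A pending event is left in place on shutdown for the final cleanup.
 */
static void *slot_timed_pop(struct event_slot *s, int timeout_sec)
{
    struct timespec deadline;
    void *item = NULL;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&s->lock);
    while (s->item == NULL && !s->shutting_down) {
        if (pthread_cond_timedwait(&s->ready, &s->lock, &deadline) == ETIMEDOUT)
            break;
    }
    if (!s->shutting_down) {
        item = s->item;
        s->item = NULL;
    }
    pthread_mutex_unlock(&s->lock);
    return item;
}
```

A thread built on this would call slot_timed_pop() in a loop, process any
event returned, and exit the loop once shutdown has been requested.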

Simulator plugin close() ABI
=============================
a) The annunciators are stored as part of the annunciator RDR. Release the 
annunciator structures from the annunciator RDR.
b) Release the memory allocated for RPT and RDR.
c) Release the memory allocated for the event log.
d) Release the memory allocated for handler.


Preparing OpenHPI for Valgrind
-------------------------------
* Download the OpenHPI trunk and apply the shutdown patch.
* Enable debugging when running the configure script. This disables
optimization and enables debugging. This step is a MUST.
* Compile and install OpenHPI.
For example: # ./configure --enable-debuggable
             # make; make install

Finding the memory leaks using Valgrind
----------------------------------------
* Run the OpenHPI daemon under valgrind.
For example: # valgrind openhpid -c <openhpi.conf file>
* Run tests to exercise the different scenarios, such as saftest/hpitest and
plugin-specific tests.
* Stop the OpenHPI daemon by sending the SIGUSR1 signal to the valgrind
process/pid. Valgrind in turn forwards the SIGUSR1 signal to the OpenHPI
daemon.
NOTE: Do not use the openhpid shell script to stop the daemon in this case. The
openhpid shell script tries to send the SIGUSR1 signal to the openhpid daemon
pid, whereas here the signal has to go to the valgrind process.
* The memory errors are shown on the screen.
* Analyze and fix the memory errors (if any).
* Re-run the above steps until no more valid memory errors are reported.

Attachment: openhpid_graceful_shutdown.patch
Description: openhpid_graceful_shutdown.patch

Attachment: simulator_graceful_shutdown.patch
Description: simulator_graceful_shutdown.patch
