[ 
https://issues.apache.org/jira/browse/KARAF-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Krüger closed KARAF-5315.
--------------------------------

> Race condition during shutdown using SIGTERM
> --------------------------------------------
>
>                 Key: KARAF-5315
>                 URL: https://issues.apache.org/jira/browse/KARAF-5315
>             Project: Karaf
>          Issue Type: Bug
>    Affects Versions: 4.0.9
>         Environment: Linux using systemd
>            Reporter: Martin Krüger
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>             Fix For: 4.0.10, 4.1.3, 4.2.0.M1
>
>         Attachments: 
> 0001-KARAF-5315-Synchronize-access-to-SimpleFileLock.patch, 
> 0002-KARAF-5315-Signal-handler-stops-framework-directly.patch
>
>
> During shutdown using SIGTERM there is a race condition.
> {noformat}
> Error occurred shutting down framework: 
> java.nio.channels.ClosedChannelException
> java.nio.channels.ClosedChannelException
>          at sun.nio.ch.FileLockImpl.release(FileLockImpl.java:58)
>          at org.apache.karaf.main.lock.SimpleFileLock.release(SimpleFileLock.java:78)
>          at org.apache.karaf.main.Main.destroy(Main.java:642)
>          at org.apache.karaf.main.Main.main(Main.java:188)
> Main process exited, code=exited, status=254
> {noformat}
> There are several problems in the code of the Main class:
> # The variable indicating the exit condition (private boolean exiting; line 
> 89) is used by several threads but is not volatile.
> # The same is true for the lock (private Lock lock; line 87).
> # The signal handler calls Main.this.destroy(), which the main thread calls 
> again after returning from awaitShutdown() (line 581).
> Because destroy() releases the lock in its finally block, the lock is 
> released twice. The implementation in use is SimpleFileLock, whose release() 
> function is not synchronized. Since the channel of the file lock is already 
> closed by then, the second call results in the exception shown above.
> To get rid of the double release, SimpleFileLock.release() should be 
> synchronized. But I am not sure whether the double call of Main.destroy() is 
> an even bigger problem, because all activators are stopped twice too.
> {code}
>             while (timeout > 0) {
>                 timeout -= step;
>                 FrameworkEvent event = framework.waitForStop(step);
>                 if (event.getType() != FrameworkEvent.WAIT_TIMEDOUT) {
>                     activatorManager.stopKarafActivators();
>                     return true;
>                 }
>             }
> {code}
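
The double release described above can be illustrated with a minimal sketch of a file lock whose release() is both synchronized and idempotent. This is not the Karaf SimpleFileLock source and not the attached patch; the class name GuardedFileLock and its isHeld() helper are hypothetical, and only standard java.nio calls are used.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Hypothetical stand-in for a file-based lock (NOT the actual SimpleFileLock).
final class GuardedFileLock {
    private final RandomAccessFile file;
    private FileLock lock; // guarded by "this"

    GuardedFileLock(File lockFile) throws IOException {
        this.file = new RandomAccessFile(lockFile, "rw");
    }

    // Try to acquire the lock; returns true if this object holds it afterwards.
    synchronized boolean lock() throws IOException {
        if (lock == null) {
            lock = file.getChannel().tryLock();
        }
        return lock != null;
    }

    // Safe to call from several threads and more than once: only the first
    // caller releases the lock and closes the channel, later callers return.
    synchronized void release() throws IOException {
        if (lock == null) {
            return; // already released by an earlier call
        }
        try {
            if (lock.isValid()) {
                lock.release();
            }
        } finally {
            lock = null;
            file.close(); // also closes the underlying channel
        }
    }

    synchronized boolean isHeld() {
        return lock != null && lock.isValid();
    }
}
```

With this shape, a second release() coming from another thread finds the field already cleared and does nothing instead of touching the closed channel. It does not address the activators being stopped twice.
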
> Maybe synchronizing Main.destroy() is a good idea too, or a different 
> approach could be found in which the signal handler stops the framework by 
> just signaling the stop and waking up the main thread.
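
The last suggestion, a signal handler that only signals and leaves the teardown to the main thread, could look roughly like the sketch below. It assumes a hypothetical ShutdownCoordinator class and does not reproduce the Karaf Main class or the second attached patch; the destroy() body is a placeholder that only counts calls.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical coordinator: the handler thread requests the stop,
// the main thread performs the teardown exactly once.
final class ShutdownCoordinator {
    private final CountDownLatch stopRequested = new CountDownLatch(1);
    private final AtomicInteger destroyCalls = new AtomicInteger();

    // Called from the signal handler thread. It only signals and never
    // runs the teardown itself; repeated calls are harmless.
    void requestStop() {
        stopRequested.countDown();
    }

    // Called from the main thread: blocks until a stop was requested,
    // then runs the teardown on this thread only.
    void awaitStopAndDestroy() throws InterruptedException {
        stopRequested.await();
        destroy();
    }

    // Placeholder for stopping the framework and releasing the lock.
    private void destroy() {
        destroyCalls.incrementAndGet();
    }

    int destroyCalls() {
        return destroyCalls.get();
    }
}
```

Because CountDownLatch establishes a happens-before edge between countDown() and the return of await(), this variant needs no volatile flag for the exit condition, and the teardown runs on a single thread.
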



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
