[
https://issues.apache.org/jira/browse/SLING-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Carsten Ziegeler closed SLING-4477.
-----------------------------------
> JcrInstaller should not call Thread.interrupt()
> -----------------------------------------------
>
> Key: SLING-4477
> URL: https://issues.apache.org/jira/browse/SLING-4477
> Project: Sling
> Issue Type: Improvement
> Components: Installer, JCR
> Reporter: Thomas Mueller
> Assignee: Carsten Ziegeler
> Fix For: JCR Installer 3.1.14
>
>
> There JcrInstaller calls Thread.interrupt() where it's dangerous and not
> necessary. Thread.interrupt is dangerous because it closes files (when using
> the FileChannel API), including Lucene files, and the Oak persistent cache
> files. All further I/O operations with that file, including I/O operations on
> other threads, will then fail (see ClosedByInterruptException for details).
> OAK-2571 protects against closing persistent cache files, by reopening the
> files. But it results in slower performance and ugly log messages.
> Thread.interrupt is also dangerous because it does not work as expected if
> some code catches InterruptedException and does not re-throw it. See
> http://stackoverflow.com/questions/2020992/is-thread-interrupt-evil
> Thread.interrupt is not necessary in most cases. Instead, a simple "volatile
> boolean" flag is sufficient, and much safer.
> The JCR installer uses a boolean flag (active), but also Thread.interrupt, to
> stop the sleep period.
> Just before the problem occurs, I see the following messages in the log file:
> {noformat}
> 05.03.2015 15:22:26.642 *INFO* [CM Event Dispatcher (Fire ConfigurationEvent:
> pid=org.apache.sling.installer.provider.jcr.impl.JcrInstaller)]
> org.apache.sling.installer.provider.jcr.impl.JcrInstaller Deactivating Apache
> Sling JCR Installer
> {noformat}
> The very last message is from the
> org.apache.sling.installer.provider.jcr.impl.JcrInstaller, who calls:
> {noformat}
> backgroundThread.interrupt();
> {noformat}
> One possible solution is: in JcrInstaller, instead of:
> {noformat}
> try {
> Thread.sleep(RUN_LOOP_DELAY_MSEC);
> } catch (final InterruptedException ignore) {
> // ignore
> }
> {noformat}
> use:
> {noformat}
> synchronized (this) {
> if (active) {
> try {
> wait(RUN_LOOP_DELAY_MSEC);
> } catch (final InterruptedException ignore) {
> // ignore
> }
> }
> }
> {noformat}
> and instead of:
> {noformat}
> backgroundThread.active = false;
> logger.debug("Waiting for " + backgroundThread.getName() + " Thread
> to end...");
> backgroundThread.interrupt();
> {noformat}
> use:
> {noformat}
> synchronized (backgroundThread) {
> backgroundThread.active = false;
> backgroundThread.notifyAll();
> }
> logger.debug("Waiting for " + backgroundThread.getName() + " Thread
> to end...");
> {noformat}
> That's also better than what we have now, because right now, _any_ code
> within "runOneCycle" (including library and Oak code) that is doing "catch
> (InterruptedException x) { }" will let the
> "Thread.sleep(RUN_LOOP_DELAY_MSEC)" sleep one second too long.
> And people catch and ignore InterruptedException a _lot_. Using wait and
> notifyAll does not suffer from this, and has no risk of trouble.
> An alternative is to simply remove backgroundThread.interrupt(), and change
> the loop to:
> {noformat}
> for (int i = 0; i < 100 && active; i++) {
> try {
> Thread.sleep(RUN_LOOP_DELAY_MSEC / 100);
> } catch (final InterruptedException ignore) {
> // ignore
> }
> }
> {noformat}
> This will let it sleep 10 ms too long at most. It requires that
> RUN_LOOP_DELAY_MSEC is >= 100, which is ugly.
> In this case, the field "active" needs to be volatile. But is needed even for
> the current code (it's better to fix that in all cases).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)