[
https://issues.apache.org/jira/browse/HADOOP-12829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gregory Chanan updated HADOOP-12829:
------------------------------------
Summary: StatisticsDataReferenceCleaner swallows interrupt exceptions
(was: StatisticsDataReferenceCleaner swallos interrupt exceptions)
> StatisticsDataReferenceCleaner swallows interrupt exceptions
> ------------------------------------------------------------
>
> Key: HADOOP-12829
> URL: https://issues.apache.org/jira/browse/HADOOP-12829
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.8.0, 2.7.3, 2.6.4
> Reporter: Gregory Chanan
> Assignee: Gregory Chanan
>
> The StatisticsDataReferenceCleaner, implemented in HADOOP-12107, swallows
> interrupt exceptions. Over in Solr/Sentry land, we run thread leak checkers
> on our test code, which passed before this change and fails after it. Here's
> a sample report:
> {code}
> 1 thread leaked from SUITE scope at org.apache.solr.handler.TestSecureReplicationHandler:
> 1) Thread[id=16, name=org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner, state=WAITING, group=TGRP-TestSecureReplicationHandler]
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> And here's an indication that the interrupt is being ignored:
> {code}
> 25209 T16 oahf.FileSystem$Statistics$StatisticsDataReferenceCleaner.run WARN exception in the cleaner thread but it will continue to run
> java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> This is inconsistent with how other long-running threads in Hadoop, e.g.
> PeerCache, respond to being interrupted.
> The argument for doing this in HADOOP-12107 is given as
> (https://issues.apache.org/jira/browse/HADOOP-12107?focusedCommentId=14598397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14598397):
> {quote}
> Cleaner#run
> Catch and log InterruptedException in the while loop, such that thread does
> not die on a spurious wakeup. It's safe since it's a daemon thread.
> {quote}
> I'm unclear on what "spurious wakeup" means here, and the term is not mentioned in
> https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html:
> {quote}
> A thread sends an interrupt by invoking interrupt on the Thread object for
> the thread to be interrupted. For the interrupt mechanism to work correctly,
> the interrupted thread must support its own interruption.
> {quote}
> So, I believe this thread should respect interruption.
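
For illustration only, here is a minimal sketch of what "respecting interruption" could look like for a thread that blocks on ReferenceQueue.remove(). This is not the Hadoop code and not a proposed patch: the class name InterruptibleCleaner, the helper stopsOnInterrupt, and the placeholder cleanup step are all hypothetical. The only JDK behavior it relies on is that ReferenceQueue.remove() throws InterruptedException when the waiting thread is interrupted.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical stand-in for a cleaner thread; NOT the actual Hadoop class.
class InterruptibleCleaner implements Runnable {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                // Blocks until a reference is enqueued or this thread is interrupted.
                Reference<?> ref = queue.remove();
                ref.clear(); // placeholder for the real cleanup work (assumption)
            }
        } catch (InterruptedException e) {
            // Restore the interrupt status and fall through so the thread exits
            // instead of logging the exception and looping again.
            Thread.currentThread().interrupt();
        }
    }

    // Hypothetical helper: starts the cleaner as a daemon thread, interrupts it,
    // and reports whether it terminated within the given timeout.
    static boolean stopsOnInterrupt(long timeoutMillis) {
        Thread t = new Thread(new InterruptibleCleaner(), "cleaner-sketch");
        t.setDaemon(true);
        t.start();
        t.interrupt();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }
}
```

With a loop shaped like this, a test framework that interrupts leftover threads at the end of a suite would see the thread terminate rather than report it as leaked. Whether exiting is the right behavior for the real cleaner is the question this issue raises; the sketch only shows the mechanism.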