Gregory Chanan created HADOOP-12829:
---------------------------------------
Summary: StatisticsDataReferenceCleaner swallows interrupt
exceptions
Key: HADOOP-12829
URL: https://issues.apache.org/jira/browse/HADOOP-12829
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 2.6.4, 2.8.0, 2.7.3
Reporter: Gregory Chanan
Assignee: Gregory Chanan
The StatisticsDataReferenceCleaner, implemented in HADOOP-12107, swallows
interrupt exceptions. Over in Solr/Sentry land, we run thread leak checkers on
our test code, which passed before this change and fails after it. Here's a
sample report:
{code}
1 thread leaked from SUITE scope at org.apache.solr.handler.TestSecureReplicationHandler:
   1) Thread[id=16, name=org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner, state=WAITING, group=TGRP-TestSecureReplicationHandler]
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
        at java.lang.Thread.run(Thread.java:745)
{code}
And here's an indication that the interrupt is being ignored:
{code}
25209 T16 oahf.FileSystem$Statistics$StatisticsDataReferenceCleaner.run WARN exception in the cleaner thread but it will continue to run
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
        at java.lang.Thread.run(Thread.java:745)
{code}
This is inconsistent with how other long-running threads in Hadoop, e.g.
PeerCache, respond to being interrupted.
The argument for doing this in HADOOP-12107 is given as
(https://issues.apache.org/jira/browse/HADOOP-12107?focusedCommentId=14598397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14598397):
{quote}
Cleaner#run
Catch and log InterruptedException in the while loop, such that thread does not
die on a spurious wakeup. It's safe since it's a daemon thread.
{quote}
I'm unclear on what "spurious wakeup" means here, and the term is not mentioned in
https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html:
{quote}
A thread sends an interrupt by invoking interrupt on the Thread object for the
thread to be interrupted. For the interrupt mechanism to work correctly, the
interrupted thread must support its own interruption.
{quote}
So, I believe this thread should respect interruption.
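To make the proposal concrete, here is a minimal sketch of a cleaner loop that exits when interrupted instead of logging and continuing. It is not the Hadoop code: InterruptibleCleaner and stopsOnInterrupt are hypothetical names used only for illustration, and ref.clear() stands in for the real cleanup work. Note that a spurious wakeup makes Object.wait() return normally; it does not raise InterruptedException, so catching that exception only ever handles a real interrupt.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical stand-in for StatisticsDataReferenceCleaner; not the Hadoop code.
class InterruptibleCleaner implements Runnable {
    private final ReferenceQueue<Object> queue;

    InterruptibleCleaner(ReferenceQueue<Object> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                // Blocks until a reference is enqueued or the thread is interrupted.
                Reference<?> ref = queue.remove();
                ref.clear(); // placeholder for the real cleanup work
            } catch (InterruptedException e) {
                // Restore the interrupt status and leave the loop instead of
                // logging the exception and continuing to run.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Starts the cleaner, interrupts it, and reports whether it terminated.
    static boolean stopsOnInterrupt() {
        Thread t = new Thread(
            new InterruptibleCleaner(new ReferenceQueue<Object>()), "cleaner-sketch");
        t.setDaemon(true);
        t.start();
        t.interrupt();
        try {
            t.join(5000); // wait up to five seconds for the thread to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("cleaner stopped after interrupt: " + stopsOnInterrupt());
    }
}
```

With a loop shaped like this, a test framework that interrupts leftover threads at the end of a suite would see the cleaner terminate rather than remain in WAITING state.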