[ 
https://issues.apache.org/jira/browse/HADOOP-12829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-12829:
------------------------------------------
          Resolution: Fixed
       Fix Version/s: 2.9.0
    Target Version/s: 2.9.0
              Status: Resolved  (was: Patch Available)

Committed to 2.9, thanks!

> StatisticsDataReferenceCleaner swallows interrupt exceptions
> ------------------------------------------------------------
>
>                 Key: HADOOP-12829
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12829
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.8.0, 2.7.3, 2.6.4
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>            Priority: Minor
>             Fix For: 2.9.0
>
>         Attachments: HADOOP-12829.patch, HADOOP-12829.patch
>
>
> The StatisticsDataReferenceCleaner, implemented in HADOOP-12107, swallows 
> interrupt exceptions.  Over in Solr/Sentry land, we run thread leak checkers 
> on our test code, which passed before this change and fails after it.  Here's 
> a sample report:
> {code}
> 1 thread leaked from SUITE scope at 
> org.apache.solr.handler.TestSecureReplicationHandler: 
>    1) Thread[id=16, 
> name=org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner,
>  state=WAITING, group=TGRP-TestSecureReplicationHandler]
>         at java.lang.Object.wait(Native Method)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>         at 
> org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> And here's an indication that the interrupt is being ignored:
> {code}
> 25209 T16 oahf.FileSystem$Statistics$StatisticsDataReferenceCleaner.run WARN 
> exception in the cleaner thread but it will continue to run 
> java.lang.InterruptedException
>       at java.lang.Object.wait(Native Method)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>       at 
> org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3040)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> This is inconsistent with how other long-running threads in Hadoop, e.g. 
> PeerCache, respond to being interrupted.
> The argument for doing this in HADOOP-12107 is given as 
> (https://issues.apache.org/jira/browse/HADOOP-12107?focusedCommentId=14598397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14598397):
> {quote}
> Cleaner#run
> Catch and log InterruptedException in the while loop, such that thread does 
> not die on a spurious wakeup. It's safe since it's a daemon thread.
> {quote}
> I'm unclear on what "spurious wakeup" means and it is not mentioned in 
> https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html:
> {quote}
> A thread sends an interrupt by invoking interrupt on the Thread object for 
> the thread to be interrupted. For the interrupt mechanism to work correctly, 
> the interrupted thread must support its own interruption.
> {quote}
> So, I believe this thread should respect interruption.
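
For illustration only, here is a minimal sketch of a cleaner loop that respects interruption. It is not the committed patch: the class name InterruptibleCleaner and the helper stopsOnInterrupt() are hypothetical, and the only real APIs assumed are the JDK's ReferenceQueue.remove() and the Thread interrupt methods.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical stand-in for a reference-cleaning thread; not Hadoop's class.
class InterruptibleCleaner implements Runnable {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                // Blocks until a reference is enqueued or the thread is interrupted.
                Reference<?> ref = queue.remove();
                ref.clear(); // placeholder for the real cleanup work
            } catch (InterruptedException e) {
                // Restore the interrupt flag and exit instead of logging and looping.
                Thread.currentThread().interrupt();
                return;
            } catch (RuntimeException e) {
                // Unexpected failures are still logged and the loop continues.
                System.err.println("exception in the cleaner thread: " + e);
            }
        }
    }

    // Starts the cleaner as a daemon, interrupts it, and reports whether it exited.
    static boolean stopsOnInterrupt() throws InterruptedException {
        Thread t = new Thread(new InterruptibleCleaner(), "cleaner-sketch");
        t.setDaemon(true);
        t.start();
        t.interrupt();
        t.join(5000); // wait up to five seconds for the thread to finish
        return !t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cleaner stopped after interrupt: " + stopsOnInterrupt());
    }
}
```

The difference from the behaviour reported above is that InterruptedException ends the loop, so a thread leak checker that interrupts leftover threads would see this one terminate.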



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
