[ 
https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279052#comment-14279052
 ] 

Sangjin Lee commented on HADOOP-10062:
--------------------------------------

Thanks for your feedback [~vicaya]!

I did look at the timer-driven regular publishing route, but it seems to me 
that once publishMetricsNow() is synchronized, an inversion cannot happen even 
with regular publishing (an inversion meaning that an earlier sample reaches 
the sinks after a later one). In both code paths (onTimerEvent() and 
publishMetricsNow()), sample --> publish --> sink adapter enqueue() would 
happen atomically while holding the lock. Assuming SinkQueue is a FIFO queue, 
that should be enough to guarantee no inversion.
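
To make the argument concrete, here is a minimal sketch of the locking 
pattern, not the actual MetricsSystemImpl code. The class OrderedPublisher, 
its counter (standing in for sampling) and its ArrayDeque (standing in for a 
FIFO SinkQueue) are all hypothetical names invented for illustration. The 
sink-side consumer thread is not modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the metrics system; not the real Hadoop class. */
class OrderedPublisher {
    private long nextSample = 0;                                   // guarded by "this"
    private final ArrayDeque<Long> sinkQueue = new ArrayDeque<>(); // FIFO, guarded by "this"

    /** Timer-driven path. */
    synchronized void onTimerEvent() { sampleAndEnqueue(); }

    /** On-demand path; takes the same lock as the timer path. */
    synchronized void publishMetricsNow() { sampleAndEnqueue(); }

    // Sample and enqueue happen in one critical section, so no other
    // thread can slip a later sample in front of an earlier one.
    private void sampleAndEnqueue() {
        long sample = ++nextSample;   // "sample"
        sinkQueue.addLast(sample);    // "publish" + "enqueue"
    }

    /** Returns everything enqueued so far, in queue order. */
    synchronized List<Long> drain() {
        List<Long> out = new ArrayList<>(sinkQueue);
        sinkQueue.clear();
        return out;
    }

    static boolean isStrictlyIncreasing(List<Long> xs) {
        for (int i = 1; i < xs.size(); i++) {
            if (xs.get(i) <= xs.get(i - 1)) return false;
        }
        return true;
    }

    /** Hammers both paths from several threads and returns the queue contents. */
    static List<Long> runDemo(int threads, int callsPerThread) {
        OrderedPublisher p = new OrderedPublisher();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final boolean timerPath = (t % 2 == 0);
            Thread w = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    if (timerPath) p.onTimerEvent(); else p.publishMetricsNow();
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return p.drain();
    }

    public static void main(String[] args) {
        List<Long> out = runDemo(4, 1000);
        System.out.println("no inversion: " + isStrictlyIncreasing(out));
    }
}
```

If either method dropped the synchronized keyword, the sample and the enqueue 
could interleave across threads, which is the inversion being discussed.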

Could you kindly point out under what conditions a failure would occur? Thanks!

Having said that, I do like the idea of using ScheduledExecutorService over the 
Timer, as in general it is more robust in case of unexpected exceptions. I 
could make that change if needed.
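
For reference, a rough sketch of what the ScheduledExecutorService approach 
could look like. A java.util.Timer thread dies if a TimerTask throws an 
unchecked exception, which cancels everything scheduled on it. With 
scheduleAtFixedRate the executor itself survives, although later runs of a 
task that throws are suppressed, so the sketch also catches exceptions inside 
the task. The class and method names here are hypothetical, and the 
"simulated sink failure" is just a placeholder for a real publish step.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not the actual Hadoop metrics code. */
class PeriodicPublisherSketch {

    /** Returns true if the periodic task completes runsWanted runs despite a failure. */
    static boolean survivesFailure(int runsWanted) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(runsWanted);
        AtomicInteger runs = new AtomicInteger();
        try {
            scheduler.scheduleAtFixedRate(() -> {
                try {
                    // Placeholder for sample + publish; the first run fails on purpose.
                    if (runs.incrementAndGet() == 1) {
                        throw new IllegalStateException("simulated sink failure");
                    }
                } catch (RuntimeException e) {
                    // Swallow (or log) so that later runs are not suppressed.
                } finally {
                    done.countDown();
                }
            }, 0, 10, TimeUnit.MILLISECONDS);
            // Wait for the requested number of runs, with a generous timeout.
            return done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("kept running after failure: " + survivesFailure(3));
    }
}
```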

> TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
> -------------------------------------------------------------
>
>                 Key: HADOOP-10062
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10062
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 3.0.0
>         Environment: CentOS 6.4, Oracle JDK 1.6.0_31, JDK1.7.0_45
>            Reporter: Shinichi Yamashita
>            Assignee: Sangjin Lee
>            Priority: Minor
>         Attachments: HADOOP-10062-failed.txt, HADOOP-10062-success.txt, 
> HADOOP-10062.003.patch, HADOOP-10062.patch, HADOOP-10062.patch
>
>
> TestMetricsSystemImpl#testMultiThreadedPublish failed with "Metric not 
> collected!"
> {code}
> Running org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.688 sec <<< 
> FAILURE! - in org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
> testMultiThreadedPublish(org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl)
>   Time elapsed: 0.056 sec  <<< FAILURE!
> java.lang.AssertionError: Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Passed
>         at org.junit.Assert.fail(Assert.java:93)
>         at org.junit.Assert.assertTrue(Assert.java:43)
>         at 
> org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl.testMultiThreadedPublish(TestMetricsSystemImpl.java:232)
> Results :
> Failed tests:
>   TestMetricsSystemImpl.testMultiThreadedPublish:232 Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Passed
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
