[
https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279052#comment-14279052
]
Sangjin Lee commented on HADOOP-10062:
--------------------------------------
Thanks for your feedback [~vicaya]!
I did look at the timer-driven regular publishing route, but it seems to me
that by synchronizing publishMetricsNow() an inversion cannot happen even with
regular publishing (inversion meaning earlier sampling pushed to the sinks
later). In both code paths (onTimerEvent() and publishMetricsNow()), sample -->
publish --> sinkadapter enqueue() would happen atomically while holding the
lock. Assuming SinkQueue is a FIFO queue, this would certainly guarantee no
inversion.
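
To make the argument above concrete, here is a minimal sketch. It is not the actual MetricsSystemImpl code: OrderedPublisher, its counter-based "sample", and the helper methods are hypothetical stand-ins. It assumes both code paths take the same lock and that the sink queue is FIFO.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the two publishing paths; not the Hadoop classes. */
class OrderedPublisher {
    private long clock = 0;                                   // stands in for the sampling time
    private final ArrayDeque<Long> sinkQueue = new ArrayDeque<>(); // FIFO, guarded by "this"

    // Both entry points hold the same lock for the whole sample -> enqueue step.
    synchronized void onTimerEvent() { sampleAndEnqueue(); }
    synchronized void publishMetricsNow() { sampleAndEnqueue(); }

    private void sampleAndEnqueue() {
        long sample = ++clock;     // sample
        sinkQueue.addLast(sample); // publish + enqueue, still under the lock
    }

    synchronized List<Long> drain() {
        List<Long> out = new ArrayList<>(sinkQueue);
        sinkQueue.clear();
        return out;
    }

    /** True if every sample was enqueued after all earlier samples. */
    static boolean isInOrder(List<Long> samples) {
        for (int i = 1; i < samples.size(); i++) {
            if (samples.get(i) <= samples.get(i - 1)) return false;
        }
        return true;
    }

    /** Drives both paths from several threads and returns the queue contents. */
    static List<Long> runConcurrently(int threads, int perThread) {
        OrderedPublisher p = new OrderedPublisher();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final boolean timerPath = (i % 2 == 0);
            Thread t = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (timerPath) p.onTimerEvent(); else p.publishMetricsNow();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return p.drain();
    }

    public static void main(String[] args) {
        System.out.println("in order: " + isInOrder(runConcurrently(4, 1000)));
    }
}
```

If the sample were taken outside the lock, or if the two paths used different locks, a thread could sample first and enqueue second, which is the inversion in question.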
Could you kindly point out under what conditions a failure would occur? Thanks!
Having said that, I do like the idea of using ScheduledExecutorService over the
Timer, as in general it is more robust in case of unexpected exceptions. I
could make that change if needed.
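
As a rough illustration of the robustness point, the sketch below compares how java.util.Timer and ScheduledExecutorService react when a task throws an unchecked exception. The class SchedulerRobustness and its method names are hypothetical, and the failure is simulated; nothing here is taken from the patch.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical comparison harness; the failures are simulated. */
class SchedulerRobustness {

    /** Returns true if the Timer still accepts tasks after one task threw. */
    static boolean timerSurvivesFailure() {
        Timer timer = new Timer(true);
        AtomicReference<Thread> timerThread = new AtomicReference<>();
        CountDownLatch started = new CountDownLatch(1);
        timer.schedule(new TimerTask() {
            @Override public void run() {
                timerThread.set(Thread.currentThread());
                started.countDown();
                throw new RuntimeException("simulated failure in a timer task");
            }
        }, 0);
        try {
            started.await();
            timerThread.get().join(); // the uncaught exception ends the timer thread
            timer.schedule(new TimerTask() { @Override public void run() { } }, 0);
            return true;
        } catch (IllegalStateException e) {
            return false;             // the Timer now behaves as if cancelled
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /** Returns true if the executor still runs new work after a task threw. */
    static boolean executorSurvivesFailure() {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<?> failing = ses.scheduleAtFixedRate(
                () -> { throw new RuntimeException("simulated failure"); },
                0, 10, TimeUnit.MILLISECONDS);
            try {
                failing.get();        // completes exceptionally after the first run
            } catch (ExecutionException expected) {
                // only this task's later runs are suppressed
            }
            return ses.submit(() -> true).get(1, TimeUnit.SECONDS);
        } catch (Exception e) {
            return false;
        } finally {
            ses.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("Timer survives: " + timerSurvivesFailure());
        System.out.println("Executor survives: " + executorSurvivesFailure());
    }
}
```

Note that with the executor the failing periodic task itself is not rescheduled, so a publishing task would still need to catch its own exceptions to keep running.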
> TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
> -------------------------------------------------------------
>
> Key: HADOOP-10062
> URL: https://issues.apache.org/jira/browse/HADOOP-10062
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 3.0.0
> Environment: CentOS 6.4, Oracle JDK 1.6.0_31, JDK1.7.0_45
> Reporter: Shinichi Yamashita
> Assignee: Sangjin Lee
> Priority: Minor
> Attachments: HADOOP-10062-failed.txt, HADOOP-10062-success.txt,
> HADOOP-10062.003.patch, HADOOP-10062.patch, HADOOP-10062.patch
>
>
> TestMetricsSystemImpl#testMultiThreadedPublish failed with "Metric not
> collected!"
> {code}
> Running org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.688 sec <<< FAILURE! - in org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
> testMultiThreadedPublish(org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl) Time elapsed: 0.056 sec <<< FAILURE!
> java.lang.AssertionError: Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Passed
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.assertTrue(Assert.java:43)
> at org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl.testMultiThreadedPublish(TestMetricsSystemImpl.java:232)
> Results :
> Failed tests:
> TestMetricsSystemImpl.testMultiThreadedPublish:232 Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Metric not collected!
> Passed
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0
> {code}