Surya Hebbar created IMPALA-13787:
-------------------------------------
Summary: Compensate for inaccuracies of EC2 instance's clocks in
runtime-profile-test
Key: IMPALA-13787
URL: https://issues.apache.org/jira/browse/IMPALA-13787
Project: IMPALA
Issue Type: Bug
Reporter: Surya Hebbar
Assignee: Surya Hebbar
The clock on EC2 instances has a low resolution, as noted in stopwatch.h,
which is used to measure time for all of the profile's counters.
[https://github.com/apache/impala/blob/cfeb57c128c7f514f3433a0399966f46a49a1a4a/be/src/util/stopwatch.h#L102C1-L111C18]
{code:java}
/// Stop watch for reporting elapsed time in nanosec based on clock_gettime (Linux) or
/// MonotonicNanos (Apple). It is not affected by cpu frequency changes and it is not
/// affected by user setting the system clock. A monotonic clock represents monotonic
/// time since some unspecified starting point. It is good for computing elapsed time.
///
/// The time values are in nanoseconds. For most machine configurations, the clock
/// resolution will be 1 nanosecond. We fall back to low resolution in configurations
/// where the clock is expensive. For those machine configurations (notably EC2), the
/// clock resolution will be that of the system jiffy, which is between 1 and 10
/// milliseconds.
{code}
For the test in IMPALA-13751, it was suggested to reduce the sleep time between
events to 5ms to make the test run faster. On machines with a low-resolution
clock, 5ms can be shorter than the clock's resolution (between 1 and 10
milliseconds).
[https://gerrit.cloudera.org/c/22482/3/be/src/util/runtime-profile-test.cc#2001]
Hence, the tests ran successfully on machines whose clocks have a resolution
finer than 5ms.
So, whether the test is deterministic or randomized, it fails on an EC2
instance or on any machine whose clock resolution is 5ms or coarser.
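The failure mode can be illustrated with a small model. This is only a sketch under an assumption: it treats a coarse clock as one that rounds every reading down to the most recent tick boundary. The names CoarseReading and EventsDistinguishable are hypothetical and do not appear in stopwatch.h or runtime-profile-test.cc.

```cpp
#include <cstdint>

constexpr int64_t kNanosPerMilli = 1000 * 1000;

// Hypothetical model of a low-resolution clock: the reading is the true time
// rounded down to a multiple of the tick length. Assumes true_ns >= 0 and
// tick_ns > 0.
int64_t CoarseReading(int64_t true_ns, int64_t tick_ns) {
  return (true_ns / tick_ns) * tick_ns;
}

// Returns true if two events separated by sleep_ns get different coarse
// timestamps when the first event happens at first_event_ns.
bool EventsDistinguishable(int64_t first_event_ns, int64_t sleep_ns,
                           int64_t tick_ns) {
  return CoarseReading(first_event_ns, tick_ns) <
         CoarseReading(first_event_ns + sleep_ns, tick_ns);
}
```

In this model, with a 10ms tick, events at 1ms and 6ms both read as 0, so a strict greater-than check on their timestamps fails. Whether a 5ms sleep crosses a tick boundary depends on where the first event falls within the tick, which is why the outcome can vary between runs.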
The two possible small fixes are:
1. For the test, increase the sleep time between events from 5ms to 10ms
or
2. Account for the inaccuracy of the EC2 instance's clock and accept equal
timestamp values for consecutive events (use EXPECT_GE instead of EXPECT_GT)
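The difference between the two comparisons in option 2 can be sketched as follows. The helpers StrictlyIncreasing and NonDecreasing are hypothetical stand-ins for applying EXPECT_GT or EXPECT_GE to each pair of consecutive event timestamps; they are not taken from runtime-profile-test.cc.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check mirroring EXPECT_GT on consecutive timestamps:
// every timestamp must be strictly greater than the previous one.
bool StrictlyIncreasing(const std::vector<int64_t>& timestamps_ns) {
  for (std::size_t i = 1; i < timestamps_ns.size(); ++i) {
    if (timestamps_ns[i] <= timestamps_ns[i - 1]) return false;
  }
  return true;
}

// Hypothetical check mirroring EXPECT_GE on consecutive timestamps:
// equal neighbours are accepted, but going backwards is still an error.
bool NonDecreasing(const std::vector<int64_t>& timestamps_ns) {
  for (std::size_t i = 1; i < timestamps_ns.size(); ++i) {
    if (timestamps_ns[i] < timestamps_ns[i - 1]) return false;
  }
  return true;
}
```

A sequence such as {0, 0, 10000000}, which a coarse clock might produce for events a few milliseconds apart, fails the strict check but passes the relaxed one. The relaxed check still rejects events recorded out of order, so option 2 weakens the test only with respect to equal timestamps.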