Status: New
Owner: ----
Labels: Type-Defect Priority-Medium
New issue 230 by [email protected]: timer accuracy v one shot timer v
expiry time (issue_163.t)
http://code.google.com/p/memcached/issues/detail?id=230
What steps will reproduce the problem?
1. Run memcached/t/issue_163.t on a platform that has a course timer and
isn't that fast - repeatedly
What is the expected output? What do you see instead?
Test should pass,
test fails sometimes with:
not ok 4 - Objects should be reclaimed
# Failed test 'Objects should be reclaimed'
# at t/issue_163.t line 27.
# got: '0'
# expected: '1'
What version of the product are you using? On what operating system?
1.7.1.1, built from source on ARM (Panda board) running Linaro 11.09
3.0.0-1404-linaro-lt-omap kernel
Please provide any additional information below.
I believe the problem here is a combination of factors with memcached's
timing and the way the test is written and the courseness of the timer on
the board.
Bumping up the sleep from 2 to 3 seconds in the test fixes it - but there
are some underlying issues.
The test stores some data with a 1 second expiry, waits 2 seconds and
stores data again, testing that the slot got reclaimed after the expiry;
However:
1) memcached's only stores a 1 second precision timer.
2) clock_handler uses a 1 shot timer, resetting it every time it's
called, and hence clock_handler gets called less regularly than once a
second
So I added some debug:
a 1..7
b DAG: set_current_time: current_time=2 timer=1319726830/136748
c DAG: set_current_time: current_time=2 timer=1319726830/236723
d DAG: set_current_time: current_time=2 timer=1319726830/241087
e DAG do_item_alloc fall through: current_time=2 (new)exptime=3
time=1319726830/247984
f ok 1 - stored key
g ok 2 - We should have requested some memory
h DAG: set_current_time: current_time=3 timer=1319726831/563994
i DAG do_item_alloc tail loop: current_time=3 refcount=0 exptime=3
time=1319726832/296141
j DAG do_item_alloc fall through: current_time=3 (new)exptime=0
time=1319726832/296752
k ok 3 - stored key
l not ok 4 - Objects should be reclaimed
m # Failed test 'Objects should be reclaimed'
n # at t/issue_163.t line 27.
o # got: '0'
p # expected: '1'
q ok 5 - deleted key
r DAG do_item_alloc fall through: current_time=3 (new)exptime=0
time=1319726832/306609
s ok 6 - stored key
t ok 7 - we've not allocated and freed the same amount
u # Looks like you failed 1 test of 7.
So in this case the first store (on line e) happens when current_time is 2
at wallclock ...830.247984 seconds; and the expiry is set to 2+1=3
(line h) The clock_handler loop appears to have been very sluggish, and we
get back to set_current_time at ...831.563994 ~1.3s later where we set
current_time to 3
Then we go and do the second set (line i), current time is still '3' yet
the wallclock is just over 2 seconds since the original store. And since
the expiry was 3 and current time is 3, we haven't expired and hence the
test fails.
---
OK, so the OS seems to have been very sluggish at the event for
clock_handler; I think the granularity of the timer is 128Hz, so I'm not
quite sure why it took 1.3s rather than 1.008 - but the fact it's taking
any longer than 1 (which is legal) means the 2second limit in the test 163
isn't safe.
Dave