On 16 Oct 13:41 , Henk Langeveld wrote:
Before I create an issue, I'd like to know if anyone else can reproduce
this on an active server?
The following kstat/awk script reports kstat anomalies where
[rw](len)time fields inside a kstat_io[1] structure actually *decrease*
in value:
kstat -c disk -p sd:::*time 1 |
nawk '/time/ {
if (old[$1] && old[$1] > $2)
print $1,"old: "old[$1],"new:"$2
old[$1]=$2
}'
sd:39:sd39:rlentime old: 7223147599853 new:7223146752623
sd:39:sd39:rtime old: 1556295270928 new:1556295116623
sd:21:sd21:rlentime old: 44879798465601 new:44879798422271
sd:24:sd24:wlentime old: 15995502763 new:15995464571
sd:24:sd24:wtime old: 15595171901 new:15595120298
sd:38:sd38:wlentime old: 16762450959 new:16762280835
sd:38:sd38:wtime old: 16293043067 new:16292872942
sd:15:sd15:wlentime old: 16274495238 new:16274415688
sd:15:sd15:wtime old: 15857746407 new:15857664000
sd:2:sd2:wlentime old: 16300070951 new:16299959181
sd:2:sd2:wtime old: 15882033262 new:15881917149
sd:31:sd31:rlentime old: 7260803325344 new:7260802679629
sd:31:sd31:rtime old: 1558548730649 new:1558548606359
Here's the story behind this:
Several times now I've seen iostat -xn <interval> report broken values
for the wait, wsvc_t and %w columns.
The outrageously large values led me to believe we're dealing with
a 64bit signed/unsigned overflow.
Looking at the code and the definitions in kstat.h, these columns are
derived from data in the kstat_io structure.
According to the description of the structure, all fields in there
should increase monotonously.
I also have examples for the old perl-based kstat; I wanted to make
sure I wasn't seeing any artefact of the conversion to floating point.
Anyone else seeing this?
Is this a case of improper reporting (inappropriate caching) or
improper recording in the kernel (missing locks)?
rmustacc on #illumos (irc) pointed me at gethrtime_unscaled(), which
according to uts/common/os/msacct.c is inconsistent between different CPUs.
This appears to break the assumptions implied by kstat.h on 'Accumulated
time and queue length statistics'.
637 * All times are 64-bit nanoseconds (hrtime_t), as returned by
638 * gethrtime().
639 *
640 * The units of cumulative busy time are accumulated nanoseconds.
641 * The units of cumulative length*time products are elapsed time
642 * times queue length.
643 *
644 * Updates to the fields below are performed implicitly by calls to
645 * these five functions:
646 *
647 * kstat_waitq_enter()
648 * kstat_waitq_exit()
649 * kstat_runq_enter()
650 * kstat_runq_exit()
651 *
652 * kstat_waitq_to_runq() (see below)
653 * kstat_runq_back_to_waitq() (see below)
654 *
Each of these functions computes new and delta, assuming delta > 0.
new = gethrtime_unscaled();
delta = new - kiop->wlastupdate;
IMO, the [rw]lastupdate, [rw]lentime, and [rw]time fields in the kiop
structure should be protected with a test on (delta > 0).
Does this look sensible?
diff --git a/usr/src/uts/common/os/kstat_fr.c
b/usr/src/uts/common/os/kstat_fr.c
index 83b817e..473cd71 100644
--- a/usr/src/uts/common/os/kstat_fr.c
+++ b/usr/src/uts/common/os/kstat_fr.c
@@ -1325,11 +1325,14 @@ kstat_waitq_enter(kstat_io_t *kiop)
new = gethrtime_unscaled();
delta = new - kiop->wlastupdate;
- kiop->wlastupdate = new;
wcnt = kiop->wcnt++;
- if (wcnt != 0) {
- kiop->wlentime += delta * wcnt;
- kiop->wtime += delta;
+
+ if (delta > 0) {
+ kiop->wlastupdate = new;
+ if (wcnt != 0) {
+ kiop->wlentime += delta * wcnt;
+ kiop->wtime += delta;
+ }
}
}
@@ -1341,11 +1344,13 @@ kstat_waitq_exit(kstat_io_t *kiop)
new = gethrtime_unscaled();
delta = new - kiop->wlastupdate;
- kiop->wlastupdate = new;
wcnt = kiop->wcnt--;
ASSERT((int)wcnt > 0);
- kiop->wlentime += delta * wcnt;
- kiop->wtime += delta;
+ if (delta > 0) {
+ kiop->wlastupdate = new;
+ kiop->wlentime += delta * wcnt;
+ kiop->wtime += delta;
+ }
}
void
@@ -1356,11 +1361,13 @@ kstat_runq_enter(kstat_io_t *kiop)
new = gethrtime_unscaled();
delta = new - kiop->rlastupdate;
- kiop->rlastupdate = new;
rcnt = kiop->rcnt++;
- if (rcnt != 0) {
- kiop->rlentime += delta * rcnt;
- kiop->rtime += delta;
+ if (delta > 0) {
+ kiop->rlastupdate = new;
+ if (rcnt != 0) {
+ kiop->rlentime += delta * rcnt;
+ kiop->rtime += delta;
+ }
}
}
@@ -1372,11 +1379,14 @@ kstat_runq_exit(kstat_io_t *kiop)
new = gethrtime_unscaled();
delta = new - kiop->rlastupdate;
- kiop->rlastupdate = new;
rcnt = kiop->rcnt--;
+
ASSERT((int)rcnt > 0);
- kiop->rlentime += delta * rcnt;
- kiop->rtime += delta;
+ if (delta > 0) {
+ kiop->rlastupdate = new;
+ kiop->rlentime += delta * rcnt;
+ kiop->rtime += delta;
+ }
}
void
@@ -1388,18 +1398,22 @@ kstat_waitq_to_runq(kstat_io_t *kiop)
new = gethrtime_unscaled();
delta = new - kiop->wlastupdate;
- kiop->wlastupdate = new;
wcnt = kiop->wcnt--;
ASSERT((int)wcnt > 0);
- kiop->wlentime += delta * wcnt;
- kiop->wtime += delta;
+ if (delta > 0) {
+ kiop->wlastupdate = new;
+ kiop->wlentime += delta * wcnt;
+ kiop->wtime += delta;
+ }
delta = new - kiop->rlastupdate;
- kiop->rlastupdate = new;
rcnt = kiop->rcnt++;
- if (rcnt != 0) {
- kiop->rlentime += delta * rcnt;
- kiop->rtime += delta;
+ if (delta > 0) {
+ kiop->rlastupdate = new;
+ if (rcnt != 0) {
+ kiop->rlentime += delta * rcnt;
+ kiop->rtime += delta;
+ }
}
}
@@ -1412,18 +1426,22 @@ kstat_runq_back_to_waitq(kstat_io_t *kiop)
new = gethrtime_unscaled();
delta = new - kiop->rlastupdate;
- kiop->rlastupdate = new;
rcnt = kiop->rcnt--;
ASSERT((int)rcnt > 0);
- kiop->rlentime += delta * rcnt;
- kiop->rtime += delta;
+ if (delta > 0) {
+ kiop->rlastupdate = new;
+ kiop->rlentime += delta * rcnt;
+ kiop->rtime += delta;
+ }
delta = new - kiop->wlastupdate;
- kiop->wlastupdate = new;
wcnt = kiop->wcnt++;
- if (wcnt != 0) {
- kiop->wlentime += delta * wcnt;
- kiop->wtime += delta;
+ if (delta > 0) {
+ kiop->wlastupdate = new;
+ if (wcnt != 0) {
+ kiop->wlentime += delta * wcnt;
+ kiop->wtime += delta;
+ }
}
}
Regards,
Henk
-------------------------------------------
illumos-discuss
Archives: https://www.listbox.com/member/archive/182180/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175430-2e6923be
Modify Your Subscription:
https://www.listbox.com/member/?member_id=21175430&id_secret=21175430-6a77cda4
Powered by Listbox: http://www.listbox.com