Sorry for the ludicrously late reply, but I figured out what the problem
eventually was. While it was showing up in the ZFS code, the problem was
actually fsflush. The machine has 64G of RAM and when memory started to get
active, the scanning took a solid 7+ seconds to complete. Setting
As Dan said, it looks like ZFS is busy.
That's very odd, as the system isn't doing anything I/O heavy. It has only a
single zpool of five devices in a raidz serving a single filesystem, and that
filesystem only writes logs at a rate of about 10MB/s. ZFS compression is
turned off.
How much
I'm cross-posting to zfs-discuss, as this is now more of a ZFS
query than a dtrace query at this point, and I'm not sure if all the ZFS
experts are listening on dtrace-discuss (although they probably
are... :^).
The only thing that jumps out at me is the ARC size - 53.4GB, or
most of your 64GB
Can you gather some ZFS IO statistics, like
fsstat zfs 1 for a minute or so.
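A minimal sketch of that request, assuming a Solaris system where `fsstat` accepts an interval and a count and supports `-T d` for date stamps. The function name `zfs_io_minute` is made up for illustration.

```shell
# Hypothetical helper: sample ZFS file-operation statistics once per second
# for 60 seconds, with a date stamp before each sample.
# Defining the function does nothing; call zfs_io_minute to start sampling.
zfs_io_minute() {
    fsstat -T d zfs 1 60
}
```

Redirecting the output to a file (for example `zfs_io_minute > fsstat.out`) makes it easy to attach to a reply.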
Here is a snapshot from when it is exhibiting the behavior:
 new  name  name  attr  attr lookup rddir  read  read write write
file remov  chng   get   set    ops   ops   ops bytes   ops bytes
0 0 0
The only thing that jumps out at me is the ARC size - 53.4GB, or
most of your 64GB of RAM. This in-and-of-itself is not necessarily
a bad thing - if there are no other memory consumers, let ZFS cache
data in the ARC. But if something is coming along to flush dirty
ARC pages
dtrace -n ':::xcalls { @s[stack()] = count() } tick-1sec { trunc(@s,10);
printa(@s); clear(@s); }'
That will tell us where the xcalls are coming from in the kernel,
and we can go from there.
Thanks,
/jim
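For reference, a variant of the one-liner above that also prints a wall-clock timestamp before each one-second report might look like the sketch below. It is not the exact script used in this thread: the function name `xcall_top` is made up for illustration, it names the `sysinfo` provider explicitly instead of using the wildcard form, and it assumes a Solaris system with DTrace and root privileges.

```shell
# Hypothetical helper: wraps the xcall one-liner and adds a timestamp.
# Defining the function does nothing; run xcall_top as root to start tracing.
xcall_top() {
    dtrace -qn '
    sysinfo:::xcalls { @s[stack()] = count(); }  /* count xcalls per kernel stack */
    tick-1sec {
        printf("%Y\n", walltimestamp);           /* wall-clock time of this report */
        trunc(@s, 10);                           /* keep the 10 busiest stacks */
        printa(@s);
        clear(@s);                               /* zero the counts for the next second */
    }'
}
```

The timestamp makes it possible to line up a burst of cross-calls with other per-second tools running at the same time.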
Jim Leonard wrote:
We have a 16-core x86 system that, at seemingly random intervals,
Jim Mauro has provided an excellent starting point. Keep in mind that kernel
threads will show up as pid 0, so you may be seeing a kernel thread
causing the activity.
Jim L
--Original Message--
From: Jim Leonard trix...@oldskool.org
Sent: Tue, September 22, 2009 11:31 AM
To:
Thanks for the awesome two-liner, I'd been struggling with 1-second intervals
without a full-blown script.
I modified it to output walltime so that I could zoom in on the problem, and
here it is:
unix`xc_do_call+0x8f
unix`xc_wait_sync+0x36
zfs is busy?
Jim Leonard wrote:
Thanks for the awesome two-liner, I'd been struggling with 1-second intervals
without a full-blown script.
I modified it to output walltime so that I could zoom in on the problem, and
here it is:
unix`xc_do_call+0x8f
As Dan said, it looks like ZFS is busy.
How much RAM is on this system?
What release of Solaris?
Do you have any ZFS tweaks in /etc/system?
(like clamping the ARC size...)
Is the system memory constrained?
The xcalls are due to the page unmaps out of
what I'm assuming is the ZFS ARC (although
It would also be interesting to see some snapshots
of the ZFS arc kstats
kstat -n arcstats
Thanks
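To put successive arcstats snapshots in context, it can help to express the ARC size as a share of physical memory (53.4GB of 64GB in this thread). The sketch below is one way to do that. The function names `arc_pct` and `arc_snapshot` are made up for illustration, and `arc_snapshot` assumes the Solaris `kstat` and `pagesize` commands and the `zfs:0:arcstats:size` and `unix:0:system_pages:physmem` statistics.

```shell
# arc_pct ARC_BYTES RAM_BYTES: print the ARC size as a percentage of RAM.
arc_pct() {
    echo "$1 $2" | awk '{ printf "%.1f\n", 100 * $1 / $2 }'
}

# Hypothetical helper (Solaris only): print one timestamped ARC snapshot.
arc_snapshot() {
    # kstat -p prints "name<TAB>value"; the value is the second field.
    arc=$(kstat -p zfs:0:arcstats:size | awk '{ print $2 }')
    pages=$(kstat -p unix:0:system_pages:physmem | awk '{ print $2 }')
    # Physical memory in bytes = page count * page size.
    ram=$(echo "$pages $(pagesize)" | awk '{ printf "%.0f\n", $1 * $2 }')
    echo "$(date '+%H:%M:%S') arc=$arc ram=$ram pct=$(arc_pct "$arc" "$ram")"
}
```

Calling `arc_snapshot` in a loop with `sleep 1` while the problem is occurring shows whether the ARC is shrinking at the moments the stalls appear.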
Jim Leonard wrote:
Thanks for the awesome two-liner, I'd been struggling with 1-second intervals
without a full-blown script.
I modified it to output walltime so that I could zoom in on the