On Thu, May 10, 2012 at 5:37 AM, Ian Collins i...@ianshome.com wrote:
I have an application I have been using to manage data replication for a
number of years. Recently we started using a new machine as a staging
server (not that new, an x4540) running Solaris 11 with a single pool built
from 7x6 drive raidz. No dedup and no reported errors.
On that box, and nowhere else, I see empty snapshots taking 17 or 18 seconds
to write. Everywhere else they complete in under a second.
Using truss and the last published source code, it looks like the pause is
between a printf and the call to zfs_ioctl, and there aren't any other
function calls between them:
For each snapshot in a stream, there is one zfs_ioctl() call. During
that time, the kernel will read the entire substream (that is, for one
snapshot) from the input file descriptor.
100.5124 0.0004 open(/dev/zfs, O_RDWR|O_EXCL) = 10
100.7582 0.0001 read(7, \0\0\0\0\0\0\0\0ACCBBAF5.., 312) = 312
100.7586 0. read(7, 0x080464F8, 0) = 0
100.7591 0. time() = 1336628656
100.7653 0.0035 ioctl(8, ZFS_IOC_OBJSET_STATS, 0x08040CF0) = 0
100.7699 0.0022 ioctl(8, ZFS_IOC_OBJSET_STATS, 0x08040900) = 0
100.7740 0.0016 ioctl(8, ZFS_IOC_OBJSET_STATS, 0x08040580) = 0
100.7787 0.0026 ioctl(8, ZFS_IOC_OBJSET_STATS, 0x080405B0) = 0
100.7794 0.0001 write(1, r e c e i v i n g i n.., 75) = 75
118.3551 0.6927 ioctl(8, ZFS_IOC_RECV, 0x08042570) = 0
118.3596 0.0010 ioctl(8, ZFS_IOC_OBJSET_STATS, 0x08040900) = 0
118.3598 0. time() = 1336628673
118.3600 0. write(1, r e c e i v e d 3 1 2.., 45) = 45
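For scale, a quick bit of arithmetic over the timestamps above (truss's
first column, in seconds) shows where the time goes. Note the interpretation
of the second column as time charged to the call itself (truss -E style) is
my assumption; if that reading is right, only ~0.7 s of the gap is spent
inside ZFS_IOC_RECV and the rest is user-level time before the ioctl is
even entered, which matches "the pause is between a printf and the call to
zfs_ioctl":

```python
# Timestamps copied from the truss output above (first column, seconds).
write_done = 100.7794  # write(1, "receiving in...") completes
recv_done = 118.3551   # ioctl(8, ZFS_IOC_RECV) returns
in_ioctl = 0.6927      # second truss column for ZFS_IOC_RECV
                       # (assumed to be time spent inside the call)

total_gap = recv_done - write_done   # ~17.58 s between write and ioctl return
before_ioctl = total_gap - in_ioctl  # ~16.88 s unaccounted user-level time

print(f"write -> ioctl return: {total_gap:.2f} s")
print(f"inside ZFS_IOC_RECV:   {in_ioctl:.2f} s")
print(f"before entering ioctl: {before_ioctl:.2f} s")
```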
zpool iostat (1 second interval) for the period is:
tank 12.5T 6.58T 175 0 271K 0
tank 12.5T 6.58T 176 0 299K 0
tank 12.5T 6.58T 189 0 259K 0
tank 12.5T 6.58T 156 0 231K 0
tank 12.5T 6.58T 170 0 243K 0
tank 12.5T 6.58T 252 0 295K 0
tank 12.5T 6.58T 179 0 200K 0
tank 12.5T 6.58T 214 0 258K 0
tank 12.5T 6.58T 165 0 210K 0
tank 12.5T 6.58T 154 0 178K 0
tank 12.5T 6.58T 186 0 221K 0
tank 12.5T 6.58T 184 0 215K 0
tank 12.5T 6.58T 218 0 248K 0
tank 12.5T 6.58T 175 0 228K 0
tank 12.5T 6.58T 146 0 194K 0
tank 12.5T 6.58T 99 258 209K 1.50M
tank 12.5T 6.58T 196 296 294K 1.31M
tank 12.5T 6.58T 188 130 229K 776K
Can anyone offer any insight or further debugging tips?
I have yet to see a time when zpool iostat tells me something useful.
I'd take a look at iostat -xzn 1 or similar output. It could point
to imbalanced I/O or a particular disk that has abnormally high
service times.
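To act on that quickly, here is a hedged sketch of a filter that pulls out
disks with high average service times from iostat -xn output. It assumes the
usual Solaris column order (r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b
device) and an arbitrary 50 ms cutoff; check the header row of your own
iostat before trusting the column index, and note the sample lines are
synthetic, not data from Ian's box:

```python
# Hedged sketch: flag disks whose asvc_t (average service time, ms)
# exceeds a threshold in `iostat -xn` output. Column order is assumed,
# not verified against your iostat.

THRESHOLD_MS = 50.0  # arbitrary cutoff; tune to your hardware

def slow_disks(lines, threshold=THRESHOLD_MS):
    """Yield (device, asvc_t) pairs whose asvc_t exceeds threshold."""
    for line in lines:
        fields = line.split()
        if len(fields) != 11:
            continue  # blank or otherwise-shaped lines
        try:
            asvc_t = float(fields[7])  # asvc_t column
        except ValueError:
            continue  # the header row ("asvc_t") isn't a float
        if asvc_t > threshold:
            yield fields[10], asvc_t

# Synthetic sample for illustration only -- not data from the x4540.
sample = """\
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  175.0    0.0  271.0    0.0  0.0  0.9    0.0    5.2   0  12 c0t0d0
   10.0    0.0   40.0    0.0  0.0  2.0    0.0  180.4   0  99 c0t7d0
""".splitlines()

for dev, t in slow_disks(sample):
    print(f"{dev}: asvc_t {t:.1f} ms")  # prints: c0t7d0: asvc_t 180.4 ms
```

Against a live box you would feed it real output, e.g. something like
iostat -xn 1 piped into a script built around slow_disks().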
Have you installed any SRUs? If not, you could be seeing:
7060894 zfs recv is excruciatingly slow
which is fixed in Solaris 11 SRU 5.
If you are using zones and are using any https pkg(5) origins (such as
https://pkg.oracle.com/solaris/support), I suggest reading
https://forums.oracle.com/forums/thread.jspa?threadID=2380689&tstart=15
before updating to SRU 6 (SRU 5 is fine, however). The fix for the
problem mentioned in that forums thread should show up in an upcoming
SRU via CR 7157313.
--
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss