Yes, I'm not entirely sure what is going on, but simply using time(1) to check how long an iozone -A run takes:

hfs# time iozone -A
real    6m52.913s

zfs# time iozone -A
real    17m2.767s
Poking around, it is interesting to note that performance only falls apart once we hit the 524288-sized tests in iozone. After adding code to cv_timedwait() to detect timeouts, we start seeing them at exactly that size. The timeout is 4 seconds, and a stack trace leads us to:

spl: cv_wait timeout 0xffffff8042444db0 (4,0)
 txg_thread_wait (in zfs) + 92
 txg_sync_thread (in zfs) + 424
 0xffffff80002d6ff7

It hits this timeout about 28 times during an iozone run, which could account for up to 112 seconds (28 x 4 s).

But strangely, changing the timeout to 1 second, or indeed to 8 seconds, makes no difference to the final duration, so possibly I am looking at a red herring. Still, it is interesting that the timeouts from txg_thread_wait begin exactly when the I/O numbers turn terrible.


_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
