On Thu, Oct 10, 2013 at 2:51 PM, Ned Bass <[email protected]> wrote:

> On Thu, Oct 10, 2013 at 02:12:38PM -0700, Matthew Ahrens wrote:
> >
> >     In my testing I observe that dmu_tx_delay() _always_ returns here
> >     under normal conditions:
> >
> >         if (now > tx->tx_start + min_tx_time)
> >                 return;
> >
> >     I haven't yet explained why this is the case.  It must either be that
> >     dirty is staying very close to dirty_delay_min_bytes, or some overhead
> >     higher up in the call path has already incurred enough delay.
> >
> >
> > The delay would only kick in when the application is writing faster than
> > the disk can keep up.  If you just have a few spinning disks, you can
> > probably hit it with "dd if=/dev/zero of=/pool/bigfile bs=1024k".  Might
> > need several instances of that (to different files) if you have lots of
> > fast disks or few slow CPUs.
>
> I've been running fio jobs with up to 128 threads and hitting the disks
> pretty hard, but still don't see the delays kick in.  But it is a pretty
> big, fast pool, so I'm probably just unable to saturate the disks.
>
> I'll post some benchmark results soon which I hope will generate some
> interesting discussions on performance.
>
> >     The only exception to this is when I dynamically tune
> >     zfs_dirty_data_max downward by a large amount.  In that case dirty is
> >     close enough to the new max that min_tx_time is initially around 3s.
> >     But then when it is truncated to 100ms, wakeup gets a timestamp in the
> >     past.  And because I implemented the kernel delay using msleep(),
> >     which takes a relative time, my system hangs.
> >
> >     This raises another question I've been meaning to ask: why was the
> >     delay implemented using cv_timedwait_hires()?  I'm not familiar with
> >     timer APIs in Illumos, but it seems like this could have been done in
> >     a simpler way that would be easier to emulate on other platforms, such
> >     as a simple delay or sleep call.  Yes, we could rewrite it for Linux,
> >     but I'd prefer to minimize differences in core code like this.
> >
> >
> > What routine would you prefer be used?  There's no msleep() in illumos.
>
> I was thinking "surely there must be a simple sleep()-like interface in
> illumos", but I guess I was wrong. :)

Sure, there's delay(9f) <http://illumos.org/man/9f/delay>, but like
sleep(3c), the granularity is way too coarse.

--matt

> > Perhaps we should create a "sleep until this time" interface to wrap the
> > (admittedly ugly) mutex + CV + timedwait().  Something to consider for
> > the common codebase.
>
> Yes, a clean wrapper would be nice, so we can hide the OS-specific bits.
>
> Thanks,
> Ned
