On 7/22/14, 10:11 PM, Richard Yao via illumos-zfs wrote:
> ZFSOnLinux tends to spend more time in the kernel than other Linux
> filesystems, so opportunities to reduce overhead are desirable for
> ZFSOnLinux. One such opportunity is random_get_pseudo_bytes(), which I
> frequently see in my flame graphs when profiling various workloads on Linux.
> 
> I recently did some profiling when evaluating a patch to disable the
> Linux readahead logic on zvols by running `dd if=/dev/zd0 of=/dev/null
> bs=4096` on my workstation on a zvol with volblocksize=4k. The profiling
> was done by having Linux's perf tool (which is like a crippled dtrace)
> execute it while sampling the entire system at 99 Hz.
> 
> When reviewing the flame graph, I noticed that we spent 3.18% of the
> time generating random numbers. That is because `mm->mm_preferred =
> spa_get_random(c);` in vdev_mirror_map_alloc() calls
> random_get_pseudo_bytes(), which hooks into /dev/urandom. ZoL 0.6.3 and
> earlier correctly map this to the Linux equivalent to
> random_get_pseudo_bytes(), which is get_random_bytes().
> 
> I wrote a patch to replace our compatibility layer's implementation of
> random_get_pseudo_bytes() with a fast PRNG that we seed using
> /dev/urandom to see if I could make this better:
> 
> https://github.com/ryao/spl/commit/8a6998f97bf45c4364effae9d1f649f55b71943a
> 
> It caused the time spent in random_get_pseudo_bytes() under my benchmark
> designed to stress readahead to decrease from 3.18% to 0.06%. That is
> roughly a factor of 50 improvement.
> 
> It would have been better to put the new generator into ZFS as the
> implementation of spa_get_random() so that other consumers of
> random_get_pseudo_bytes() are unaffected. The only place where the new
> PRNG might cause a problem is zil_init_log_chain() because its GUIDs are
> 128-bit and the second 64-bit words from a 64-bit PRNG will always be
> related to the first. The probability of a collision between any two
> numbers is 1 in 2^64 - 1, which seems acceptable for the ZIL log chains.
> 
> The PRNG algorithm that I selected as a replacement for /dev/urandom in
> ZFS is a xorshift generator proposed by the following paper:
> 
> http://arxiv.org/pdf/1402.6246v2.pdf
> 
> It has good statistical properties and I see no technical reason not to
> use it as a replacement for /dev/urandom in ZoL's
> random_get_pseudo_bytes(). My only regret is that this requires either
> diverging from Illumos in what random_get_pseudo_bytes() means or
> modifying the code to keep the existing meaning and diverging anyway.
> 
> I would like to reconcile that difference by porting the new PRNG to
> Illumos for use as the spa_get_random() implementation while
> modifying zil_init_log_chain() to use random_get_pseudo_bytes() and the
> few places where random_get_pseudo_bytes() is used to use spa_get_random().
> 
> Would the Illumos ZFS developers be interested in this change?

Seems neat, I'd like to see this implemented in Illumos as well, though
I'd like to keep the old generator for things like vdev GUIDs.
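
For readers following along, here is a minimal userland sketch of the kind
of generator discussed in the quoted message. It assumes the xorshift64*
variant (shifts 12/25/27 and a 64-bit multiplier) from the linked paper,
which may not be the exact variant used in the linked commit. All names
(xs64_state_t, xs64_seed, xs64_next, xs64_fill, xs64_range) are
hypothetical and are not part of the SPL or Illumos APIs. The seed is
assumed to come from a stronger source such as /dev/urandom, and a kernel
version would additionally need per-CPU state or locking, which is omitted.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state holder; a kernel port would likely keep one per CPU. */
typedef struct {
	uint64_t s;	/* must never be zero, or the generator sticks at zero */
} xs64_state_t;

/* Seed the generator. The seed is assumed to come from a strong source. */
static void
xs64_seed(xs64_state_t *st, uint64_t seed)
{
	/* Substitute an arbitrary nonzero constant if the seed is zero. */
	st->s = seed ? seed : 0x9E3779B97F4A7C15ULL;
}

/* One xorshift64* step: three shift/xor operations, then a multiply. */
static uint64_t
xs64_next(xs64_state_t *st)
{
	uint64_t x = st->s;

	x ^= x >> 12;
	x ^= x << 25;
	x ^= x >> 27;
	st->s = x;
	return (x * 2685821657736338717ULL);
}

/* Fill a buffer, in the spirit of random_get_pseudo_bytes(). */
static void
xs64_fill(xs64_state_t *st, void *buf, size_t len)
{
	uint8_t *p = buf;

	while (len > 0) {
		uint64_t r = xs64_next(st);
		size_t n = len < sizeof (r) ? len : sizeof (r);

		memcpy(p, &r, n);
		p += n;
		len -= n;
	}
}

/* Bounded value in [0, range), e.g. for picking a preferred mirror child. */
static uint64_t
xs64_range(xs64_state_t *st, uint64_t range)
{
	return (range ? xs64_next(st) % range : 0);
}
```

Because each output is a deterministic function of the previous state,
filling a 128-bit value with two consecutive outputs adds no entropy
beyond the first 64 bits, which is why keeping the old generator for GUIDs
is attractive.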

Cheers,
-- 
Saso
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
