On Mon, Oct 14, 2013 at 2:27 PM, Saso Kiselkov <[email protected]> wrote:

> On 10/14/13 9:35 PM, Matthew Ahrens wrote:
> > Are these changes linked with the persistent L2ARC work?  Could we first
> > integrate the hash table size changes and evaluate persistence
> > separately?
>
> They're not strictly linked and I can separate them out, but it helps
> L2ARC rebuild performance on pools with lots of zvols or small blocks
> significantly (otherwise rebuild times just plain suck). Pools with
> large block sizes don't need this per se.
>

Presumably it also helps pools with small blocks, even if they don't use
L2ARC at all.


>
> >> 3. What happens when the hash table size is excessively large, as happens
> >>  on bigmem systems?
> >
> > Indeed, what happens?  I would guess that it works just fine, especially
> > given that this will be executed early in boot, when nearly all of the
> > address space is available.  Do you have evidence that kmem_alloc(1GB)
> > does not work?
>
> At present the code resolves this by allocating with KM_NOSLEEP and
> halving the hash table size on each failed allocation request.


Oh right, we wrote that back when you could reasonably boot from UFS and
then load the zfs kernel module afterward, when you might not have much
free memory.  We might want to keep something similar even if we reduce the
contiguous address space requirement.  Or not -- maybe it's reasonable to
fail if there's < 0.1% free memory.

> This kind
> of shotgun method is problematic for several reasons:
>
> A) It's ugly.
> B) Hash table size (and thus performance) becomes kind of unpredictable.
> C) It's not safe to assume that this code will only be loaded
>    at boot. What about other systems where these assumptions may not
>    strictly apply? (Especially Linux, where there is a plethora of
>    filesystems - users can use some other FS during the normal course
>    of operations and only load ZFS much later after boot.)
>
> In short, I'd say why not nip this in the bud before it becomes a problem?
>

1. Because the code is nontrivial.  I'm asking that you show an actual
problem that this solves.  E.g. failure to allocate virtual address space
on Linux.  The code isn't *that* complicated, so it's OK if the problem it
solves is a relatively minor one.

2. Because testing will reveal if the changes actually fix the problem.  If
it's possible to run out of address space, you'll have problems in
dbuf_init() too; it's allocating 0.2% of RAM.

While we're at it, how about adding a tunable for sizing the
dbuf_hash_table too?

--matt
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
