OK, thanks. My confusion was that when I saw "FYI, this change will be made
completely redundant and obsolete by #1696" I read it as "#1696 will change
the write code such that it won't need spa_get_asize()".

Based on what you told me, I now read it as "#1696 contains basically the
same fix as mine, so mine isn't really needed," and it all makes sense to me
now.

Thank you for all the clarification.


2013/12/3 Matthew Ahrens <[email protected]>

>
>
>
> On Tue, Dec 3, 2013 at 5:14 PM, Kohsuke Kawaguchi <[email protected]> wrote:
>
>> Thanks for the comments. I really appreciate that.
>>
>> Now, I subsequently proposed a change to make spa_asize_inflation a
>> tunable in ZFS on Linux (https://github.com/zfsonlinux/zfs/pull/1913),
>> and a comment was made that a separate effort
>> https://github.com/zfsonlinux/zfs/pull/1696 (which is porting of
>> https://www.illumos.org/issues/4045) would make this irrelevant.
>>
>> Now, if I understand correctly, illumos #4045 is already merged, and at
>> the same time you just gave me the source code pointer showing that
>> spa_asize_inflation is still a tunable in illumos. Does that mean this
>> tunable is already stale in illumos? Or did #4045 not fully eliminate the
>> need for spa_get_asize()?
>>
>
>  I don't understand the questions.  The fix for illumos #4045 introduced
> the spa_asize_inflation tunable.  AFAIK there is no intention of
> eliminating that tunable or spa_get_asize.  Your pull request #1913 is a
> subset of illumos #4045.  You should probably just pull the entire #4045
> into linux, which is what pull request #1696 is.
>
> --matt
>
>
>>
>> 2013/12/3 Matthew Ahrens <[email protected]>
>>
>>>
>>>> So now I'm trying to see if there's any way to improve this estimate.
>>>> But I'm new to ZFS codebase, and so I'm looking for some help.
>>>>
>>>> - I'm not using raidz, so I should be able to knock off the x4 right off
>>>> the bat. Shouldn't there be a way to tell whether the pool is using raidz by
>>>> looking at spa->spa_root_vdev and to calculate the multiplication factor based
>>>> on the vdev tree? (And since the vdev tree shape won't change that often,
>>>> hopefully this value could be pre-calculated and stored in vdev_t.)
>>>>
>>>
>>> Yes.  You will need to look at all top-level vdevs (i.e.
>>> spa_root_vdev->vdev_children[]) and see if they are using RAID-Z.  You
>>> would probably want to cache this in the spa_t.  You will need to
>>> re-evaluate when a new top-level vdev is added.
>>>
>>
>> I see, so the root vdev itself is a dummy placeholder, and its children are
>> what I normally see in "zpool status" and the like.
>>
>>
>>>
>>>
>>>>
>>>> - My 100MB write consists of write system calls to a file, and my
>>>> understanding is that for those ZFS wouldn't replicate the blocks. So I'm
>>>> wasting the x3, too. What if we passed in more contextual parameters to
>>>> determine the proper block replication factor? If so, where can I learn
>>>> about the block replication policy in ZFS?
>>>>
>>>
>>> This will be trickier, because some blocks (metadata) will be dittoed,
>>> and others will not.  The dmu_tx_hold_* code does not currently make this
>>> distinction.  The ditto policy is implemented in dmu_write_policy(), which
>>> is called way after you would need this information.
>>>
>>
>> OK, I'll skip that one then. Too hard for me.
>>
>>
>>>
>>>
>>>>
>>>> - On the last x2 factor, it appears that "ddt_sync" is dedup related?
>>>> If so, again is there any way to tell that dedup is not on and skip this? I
>>>> suppose this is not a property of spa but of dsl_dataset (?). Is something
>>>> like that feasible?
>>>>
>>>
>>> That's right.  You could either see if dedup is used anywhere in the
>>> pool, or you could see if the property is set on this particular dataset.
>>>  The dataset is readily available to all callers of spa_get_asize.  The
>>> logic for determining if dedup is used is also in dmu_write_policy().  You
>>> would need to replicate this logic in callers of spa_get_asize(), but you
>>> can use a simplified version, probably just "os_dedup_checksum !=
>>> ZIO_CHECKSUM_OFF".
>>>
>>
>> Thanks. I'll look at the code to see if this is something approachable
>> for me.
>>
>>
>>>
>>>
>>>>
>>>> Any insights/thoughts into this would be highly appreciated.
>>>>
>>>> --
>>>> Kohsuke Kawaguchi
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Kohsuke Kawaguchi
>>
>
>


-- 
Kohsuke Kawaguchi
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
