On Tue, Dec 3, 2013 at 5:14 PM, Kohsuke Kawaguchi <[email protected]> wrote:
> Thanks for the comments. I really appreciate that.
>
> Now, I subsequently proposed a change to make spa_asize_inflation a
> tunable in ZFS on Linux (https://github.com/zfsonlinux/zfs/pull/1913),
> and a comment was made that a separate effort
> https://github.com/zfsonlinux/zfs/pull/1696 (which is porting of
> https://www.illumos.org/issues/4045) would make this irrelevant.
>
> Now, if I understand correctly, IllumOS #4045 is already merged, and at
> the same time you just gave me the source code pointer where
> spa_asize_inflation is still a tunable in IllumOS. Does that mean this
> tunable is stale in IllumOS already? Or did #4045 not fully eliminate the
> need for spa_get_asize?
>

I don't understand the questions. The fix for illumos #4045 introduced the
spa_asize_inflation tunable. AFAIK there is no intention of eliminating
that tunable or spa_get_asize.

Your pull request #1913 is a subset of illumos #4045. You should probably
just pull the entire #4045 into linux, which is what pull request #1696 is.

--matt

>
> 2013/12/3 Matthew Ahrens <[email protected]>
>
>>> So now I'm trying to see if there's any way to improve this estimate.
>>> But I'm new to ZFS codebase, and so I'm looking for some help.
>>>
>>> - I'm not using raidz, so I should be able to knock off x4 right off the
>>> bat. Shouldn't there be a way to tell if the pool is using raidz by looking
>>> at spa->spa_root_vdev and calculate multiplication factors based on vdev
>>> tree? (And since vdev tree shape won't change that much, hopefully
>>> pre-calculate this value and store it in vdev_t)
>>
>> Yes. You will need to look at all top-level vdevs (i.e.
>> spa_root_vdev->vdev_children[]) and see if they are using RAID-Z. You
>> would probably want to cache this in the spa_t. You will need to
>> re-evaluate when a new top-level vdev is added.
>
> I see, so root vdev itself is a dummy placeholder and its children are
> what I normally see in "zpool status" and the like.
>
>>> - My 100MB write is write system calls to a file, and my understanding
>>> is that for those ZFS wouldn't replicate blocks. So I'm wasting x3, too.
>>> What if we pass in more contextual parameters to determine proper block
>>> replication factor? If so, where can I learn the block replication policy
>>> in ZFS?
>>
>> This will be trickier, because some blocks (metadata) will be dittoed,
>> and others will not. The dmu_tx_hold_* code does not currently make this
>> distinction. The ditto policy is implemented in dmu_write_policy(), which
>> is called way after you would need this information.
>
> OK, I'll skip that one then. Too hard for me.
>
>>> - On the last x2 factor, it appears that "ddt_sync" is dedup related? If
>>> so, again is there any way to tell that dedup is not on and skip this? I
>>> suppose this is not a property of spa but of dsl_dataset (?). Is something
>>> like that feasible?
>>
>> That's right. You could either see if dedup is used anywhere in the
>> pool, or you could see if the property is set on this particular dataset.
>> The dataset is readily available to all callers of spa_get_asize. The
>> logic for determining if dedup is used is also in dmu_write_policy(). You
>> would need to replicate this logic in callers of spa_get_asize(), but you
>> can use a simplified version, probably just "os_dedup_checksum !=
>> ZIO_CHECKSUM_OFF".
>
> Thanks. I'll look at the code to see if this is something approachable for
> me.
>
>>> Any insights/thoughts into this would be highly appreciated.
>>>
>>> --
>>> Kohsuke Kawaguchi
>>>
>>> _______________________________________________
>>> developer mailing list
>>> [email protected]
>>> http://lists.open-zfs.org/mailman/listinfo/developer
>
> --
> Kohsuke Kawaguchi
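
For reference, the two checks discussed in this thread (RAID-Z among the
top-level vdevs, and dedup on or off) could be combined roughly as in the
sketch below. This is only an illustration under stated assumptions: every
sketch_* type and function is a hypothetical, simplified stand-in and not
the real vdev_t, spa_t, or spa_get_asize(). The dedup_enabled flag stands in
for a test like "os_dedup_checksum != ZIO_CHECKSUM_OFF". The x3 replication
factor is always kept, since the ditto policy is not known at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the real ZFS types. */
typedef enum {
	SKETCH_VDEV_ROOT,
	SKETCH_VDEV_DISK,
	SKETCH_VDEV_MIRROR,
	SKETCH_VDEV_RAIDZ
} sketch_vdev_type_t;

typedef struct sketch_vdev {
	sketch_vdev_type_t type;
	struct sketch_vdev **children;	/* top-level vdevs when type is ROOT */
	size_t nchildren;
} sketch_vdev_t;

/* True if any top-level vdev (direct child of the root) is RAID-Z. */
static bool
sketch_pool_uses_raidz(const sketch_vdev_t *root)
{
	assert(root != NULL);
	for (size_t i = 0; i < root->nchildren; i++) {
		if (root->children[i]->type == SKETCH_VDEV_RAIDZ)
			return (true);
	}
	return (false);
}

/*
 * Worst-case multiplier built from the three factors named in the thread:
 * x4 for RAID-Z, x3 for block replication, x2 for ddt_sync (dedup).
 * Only the RAID-Z and dedup factors are made conditional here.
 */
static uint64_t
sketch_asize_inflation(const sketch_vdev_t *root, bool dedup_enabled)
{
	uint64_t factor = 3;	/* replication factor: always assumed */

	if (sketch_pool_uses_raidz(root))
		factor *= 4;
	if (dedup_enabled)
		factor *= 2;
	return (factor);
}

/* Estimated allocated size for a logical write of lsize bytes. */
static uint64_t
sketch_get_asize(uint64_t lsize, const sketch_vdev_t *root,
    bool dedup_enabled)
{
	return (lsize * sketch_asize_inflation(root, dedup_enabled));
}
```

With RAID-Z present and dedup on, the multiplier is 4 * 3 * 2 = 24; with
neither, it drops to 3. In real code the RAID-Z result would presumably be
cached in the spa_t and re-evaluated whenever a top-level vdev is added,
rather than recomputed on every call.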
