On Mar 18, 2012, at 11:16 AM, Jim Klimov wrote:

> Hello all,
>
> I was asked if it is possible to convert a ZFS pool created
> explicitly with ashift=12 (via the tweaked binary) and filled
> with data back into ashift=9 so as to use the slack space
> from small blocks (BP's, file tails, etc.)

copy out, copy in. Whether this is easy or not depends on how well you
plan your storage use ...

> The user's HDD marketing text says that it "efficiently"
> emulates 512b sectors while using 4Kb ones natively (that's
> why ashift=12 was enforced in the first place).

Marketing: 2 drink minimum

>
> Questions are:
> 1) How bad would a performance hit be with 512b blocks used
> on a 4kb drive with such "efficient emulation"?

Depends almost exclusively on the workload and hardware. In my
experience, most folks who bite the 4KB bullet have low-cost HDDs
where one cannot reasonably expect high performance.

> Is it
> possible to model/emulate the situation somehow in advance
> to see if it's worth that change at all?

It will be far more cost effective to just make the change.

> 2) Is it possible to easily estimate the amount of "wasted"
> disk space in slack areas of the currently active ZFS
> allocation (unused portions of 4kb blocks that might
> become available if the disks were reused with ashift=9)?

Detailed space use is available from the zfs_blkstats mdb macro as
previously described in such threads.

> 3) How many parts of ZFS pool are actually affected by the
> ashift setting?

Everything is impacted. But that isn't a useful answer.

> From what I gather, it is applied at the top-level vdev
> level (I read that one can mix ashift=9 and ashift=12
> TLVDEVs in one pool spanning several TLVDEVs). Is that
> a correct impression?

Yes

> If yes, how does ashift size influence the amount of
> slots in uberblock ring (128 vs. 32 entries) which is
> applied at the leaf vdev level (right?) but should be
> consistent across the pool?

It should be consistent across the top-level vdev. There is 128KB of
space available for the uberblock list. The minimum size of an
uberblock entry is 1KB. Obviously, a 4KB disk can't write only 1KB,
so for 4KB sectors, there are 32 entries in the uberblock list.
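
The slot arithmetic above can be checked with a few lines of Python. This is only a sketch of the numbers quoted in this message (128KB ring, 1KB minimum entry, entry size no smaller than one sector); the constant and function names are made up for illustration and are not taken from the ZFS source.

```python
# Sketch of the uberblock-ring arithmetic described above.
# Names are hypothetical; the figures come from the message text.

UBERBLOCK_RING_BYTES = 128 * 1024  # space available for the uberblock list
MIN_UBERBLOCK_BYTES = 1024         # minimum size of one uberblock entry


def uberblock_slots(ashift: int) -> int:
    """Return how many uberblock entries fit in the ring for a given ashift."""
    sector_bytes = 1 << ashift
    # An entry cannot be smaller than one sector, nor smaller than 1KB.
    entry_bytes = max(MIN_UBERBLOCK_BYTES, sector_bytes)
    return UBERBLOCK_RING_BYTES // entry_bytes


if __name__ == "__main__":
    for ashift in (9, 12):
        print(f"ashift={ashift}: {uberblock_slots(ashift)} entries")
```

With ashift=9 the 1KB minimum applies and the ring holds 128 entries; with ashift=12 each entry takes a full 4KB sector and the ring holds 32.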

> As far as I see in ZFS on-disk format, all sizes and
> offsets are in either bytes or 512b blocks, and the
> ashift'ed block size is not actually used anywhere
> except to set the minimal block size and its implicit
> alignment during writes.

The on-disk format doc is somewhat dated and unclear here. UTSL.

> Is it wrong to think that it's enough to forge an
> uberblock with ashift=9 and a matching self-checksum
> and place that into the pool (leaf vdev labels), and
> magically have all old data 4kb-aligned still available,
> while new writes would be 512b-aligned?

Yes, it is wrong to think that.

>
> Thanks for helping me grasp the theory,
> //Jim

 -- richard

--
DTrace Conference, April 3, 2012, http://wiki.smartos.org/display/DOC/dtrace.conf
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss