On 13.04.2016 at 04:05, Richard Elling <[email protected]> wrote:
> 
>> On Apr 12, 2016, at 3:21 PM, Dirk Steinberg <[email protected]> wrote:
>> 
>> Hi,
>> 
>> in order to improve long-term performance on consumer-grade SSDs,
>> I would like to reserve a certain range of LBA addresses on a freshly
>> TRIMmed SSD to never be written to. That can be done by slicing the
>> disk and leaving one slice of the disk unused.
> 
> I do not believe this is the case. Overprovisioning is managed by changing the
> capacity of the disk. The method depends on the drive. For SCSI drives we use
> sg_format.
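(For reference, my understanding is that a capacity change of that kind
looks roughly like the commands below. The device names and counts are
only placeholders, I have not checked what is actually available from
SmartOS, and both are the usual Linux-side tools:

  sg_format --resize --count=<new_block_count> /dev/sdX
  hdparm -N p<new_max_sectors> /dev/sdX

The first shrinks the logical block count that a SCSI/SAS drive advertises
without reformatting it; the second sets a permanent Host Protected Area on
a SATA drive, so sectors above the new maximum are never addressed by the
host.)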
For modern SSDs with intelligent controllers, not using a certain part of the
SSD (after secure erase or TRIM) should be very similar to increasing the HPA
(host protected area). See:

http://www.anandtech.com/show/6489/playing-with-op

It can be argued that using the HPA is the correct way of doing this, but even
the HPA method is not without problems, as some Linux distributions
automatically turn off the HPA. I dislike the idea of having to boot another
OS to do the adjustment. Maybe someone can tell me how to manage the HPA from
SmartOS, and also maybe make changes without destroying all data on the disk.

>> OTOH I really like to use whole-disk vdev pools and putting pools
>> on slices is unnecessarily complex and error-prone.
> 
> A "whole-disk" vdev is one where slice 0 is set to a size between
> LBA 256 and end - size of slice 9. The convenience is that the sysadmin
> doesn't have to run format or fmthard

Exactly. ZFS should eliminate the need to run format, fdisk, fmthard and to
play with GPT disklabels and LVMs. If one can demonstrate a valid use case
where these legacy tools are still needed in spite of running ZFS on top,
then some functionality may be missing in ZFS. In this case, preallocated
zvols. In reality they are already there, but limited to one specific use
case only: dumping. All I am saying is that the scope of permissible use
cases should be broadened to allow other uses besides dumping.

>> Also, should one later on decide that setting aside, say, 30% of the
>> disk capacity as spare was too much, changing this to a smaller
>> number afterwards in a slicing setup is a pain.
>> 
>> Therefore my idea is to just reserve a certain amount of SSD blocks
>> in the zpool and never use them. They must be nailed to specific
>> block addresses but must never be written to.
> 
> This is not the same as adjusting the drive's overprovisioning.
>  — richard

It may not be the same, but the effect will be very similar for most modern
SSDs (maybe not all).

/ Dirk

>> A sparsely provisioned zvol does not do the trick, and neither does
>> filling it thickly with zeros (then the blocks would have been written to,
>> ignoring for the moment compression etc.).
>> 
>> What I need is more or less exactly what Joyent implemented for the
>> multi_vdev_crash_dump support: just preallocate a range of blocks
>> for a zvol. So I tried:
>> 
>> zfs create -V 50G -o checksum=noparity zones/SPARE
>> 
>> Looking at "zpool iostat" I see that nothing much happens at all.
>> Also, I can see that no actual blocks are allocated:
>> 
>> [root@nuc6 ~]# zfs get referenced zones/SPARE
>> NAME         PROPERTY    VALUE  SOURCE
>> zones/SPARE  referenced  9K     -
>> 
>> So the magic apparently only happens when you actually
>> activate dumping to that zvol:
>> 
>> [root@nuc6 ~]# dumpadm -d /dev/zvol/dsk/zones/SPARE
>>       Dump content: kernel pages
>>        Dump device: /dev/zvol/dsk/zones/SPARE (dedicated)
>> 
>> In zpool iostat 1 I can see that about 200 MB of data is written:
>> 
>> zones       22.6G   453G      0      0      0      0
>> zones       27.3G   449G      0  7.54K      0  18.9M
>> zones       49.5G   427G      0  33.6K      0  89.3M
>> zones       71.7G   404G      0  34.6K      0  86.2M
>> zones       72.6G   403G      0  1.39K      0  3.75M
>> zones       72.6G   403G      0      0      0      0
>> 
>> That must be the allocation metadata only, since this is much less than
>> the 50G, but still a noticeable amount of data.
>> And we can actually see that the full 50G have been pre-allocated:
>> 
>> [root@nuc6 ~]# zfs get referenced zones/SPARE
>> NAME         PROPERTY    VALUE  SOURCE
>> zones/SPARE  referenced  50.0G  -
>> 
>> Now I have exactly what I want: a nailed-down allocation of
>> 50G of blocks that never have been written to.
>> I'd like to keep that zvol in this state indefinitely.
>> Only problem: as soon as I change dumpadm to dump
>> to another device (or none), this goes away again.
>> 
>> [root@nuc6 ~]# dumpadm -d none
>> [root@nuc6 ~]# zfs get referenced zones/SPARE
>> NAME         PROPERTY    VALUE  SOURCE
>> zones/SPARE  referenced  9K     -
>> 
>> Back to square one! BTW, the amount of data written for the de-allocation
>> is much less:
>> 
>> zones       72.6G   403G      0      0      0      0
>> zones       72.6G   403G      0    101      0   149K
>> zones       22.6G   453G      0    529      0  3.03M
>> zones       22.6G   453G      0      0      0      0
>> 
>> So my question is: can I somehow keep the zvol in the pre-allocated state,
>> even when I do not use it as a dump device?
>> 
>> While we are at it: if I DID use it as a dump device, will a de-allocation
>> and re-allocation occur on each reboot, or will the allocation remain intact?
>> Can I somehow get a list of blocks allocated for the zvol via zdb?
>> 
>> Thanks.
>> 
>> Cheers
>> Dirk
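
P.S. Partly answering my own zdb question above: my understanding is that
something like the following should print the block pointers (and thus the
DVAs, i.e. the on-disk addresses) of every block currently allocated to the
zvol while it is in the pre-allocated state. This is only a sketch from
memory: I am assuming the zvol's data lives in object 1 of the dataset and
that five -d's are enough verbosity to show the individual block pointers.

  zdb -ddddd zones/SPARE 1

Leaving the object number off (zdb -ddddd zones/SPARE) dumps every object
in the dataset, which would also confirm which object actually holds the
data.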
