[zfs-discuss] question about zpool iostat output
I was just wondering: I added a SLOG/ZIL to my new system today. I noticed that the L2ARC shows up under its own heading, but the SLOG/ZIL doesn't. Is this correct? See:

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       15.3G  44.2G      0      0      0      0
  c6t4d0s0  15.3G  44.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
tank        10.9T  7.22T      0  2.43K      0   300M
  raidz2    10.9T  7.22T      0  2.43K      0   300M
    c4t6d0      -      -      0    349      0  37.6M
    c4t5d0      -      -      0    350      0  37.6M
    c5t7d0      -      -      0    350      0  37.6M
    c5t3d0      -      -      0    350      0  37.6M
    c8t0d0      -      -      0    354      0  37.6M
    c4t7d0      -      -      0    351      0  37.6M
    c4t3d0      -      -      0    350      0  37.6M
    c5t8d0      -      -      0    349      0  37.6M
    c5t0d0      -      -      0    348      0  37.6M
    c8t1d0      -      -      0    353      0  37.6M
  c6t5d0s0        0  8.94G      0      0      0      0
cache           -      -      -      -      -      -
  c6t5d0s1  37.5G      0      0    158      0  19.6M

It seems sort of strange to me that it doesn't look like this instead:

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       15.3G  44.2G      0      0      0      0
  c6t4d0s0  15.3G  44.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
tank        10.9T  7.22T      0  2.43K      0   300M
  raidz2    10.9T  7.22T      0  2.43K      0   300M
    c4t6d0      -      -      0    349      0  37.6M
    c4t5d0      -      -      0    350      0  37.6M
    c5t7d0      -      -      0    350      0  37.6M
    c5t3d0      -      -      0    350      0  37.6M
    c8t0d0      -      -      0    354      0  37.6M
    c4t7d0      -      -      0    351      0  37.6M
    c4t3d0      -      -      0    350      0  37.6M
    c5t8d0      -      -      0    349      0  37.6M
    c5t0d0      -      -      0    348      0  37.6M
    c8t1d0      -      -      0    353      0  37.6M
log             -      -      -      -      -      -
  c6t5d0s0        0  8.94G      0      0      0      0
cache           -      -      -      -      -      -
  c6t5d0s1  37.5G      0      0    158      0  19.6M

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
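(For comparison, zpool status does give the log device its own heading - an illustrative, trimmed sketch rather than literal output from this box:

  pool: tank
 state: ONLINE
config:
        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            ...
        logs
          c6t5d0s0    ONLINE       0     0     0
        cache
          c6t5d0s1    ONLINE       0     0     0

so the oddity seems specific to the iostat listing.)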
[zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
Greetings - I am migrating a pool from FreeBSD 8.0 to OpenSolaris (Nexenta 3.0 RC1). I am in what seems to be a weird situation regarding this pool. Maybe someone can help. I used to boot off of this pool in FreeBSD, so the bootfs property got set:

r...@nexenta:~# zpool get bootfs tank
NAME  PROPERTY  VALUE  SOURCE
tank  bootfs    tank   local

The presence of this property seems to be causing me all sorts of headaches. I cannot replace a disk or add an L2ARC, because the presence of this flag is how the ZFS code (libzfs_pool.c: zpool_vdev_attach and zpool_label_disk) determines whether a pool is allegedly a root pool.

r...@nexenta:~# zpool add tank cache c1d0
cannot label 'c1d0': EFI labeled devices are not supported on root pools.

To replace disks, I was able to hack up libzfs_pool.c and build a custom version of the zpool command. That works, but it is a poor solution going forward because I have to be sure I use my customized version every time I replace a bad disk. Ultimately, I would like to just set the bootfs property back to its default, but this seems to be beyond my ability. There are some checks in libzfs_pool.c that I can bypass in order to set the value back to its default of "-", but ultimately I am stopped because there is code in zfs_ioctl.c, which I believe is kernel code, that checks whether the bootfs value supplied is actually an existing dataset. I'd compile my own kernel, but hey, this is only my first day using OpenSolaris - it was a big enough feat just learning how to compile stuff in the ON source tree :D

What should I do here? Is there some obvious solution I'm missing? I'd like to be able to get my pool back to a state where I can use the *stock* zpool command to maintain it. I don't boot off of this pool anymore, so I just want to unset bootfs somehow. BTW, for reference, here is the output of zpool status (after I hacked up zpool to let me add an L2ARC):

  pool: tank
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
  scan: resilvered 351G in 2h44m with 0 errors on Tue May 25 23:33:38 2010
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c2t5d0p0  ONLINE       0     0     0
            c2t4d0p0  ONLINE       0     0     0
            c2t3d0p0  ONLINE       0     0     0
            c2t2d0p0  ONLINE       0     0     0
            c2t1d0p0  ONLINE       0     0     0
        cache
          c1d0        ONLINE       0     0     0

errors: No known data errors

Thanks, Darren -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?
--On 24 May 2010 23:41 -0400 rwali...@washdcmail.com wrote: I haven't seen where anyone has tested this, but the MemoRight SSD (sold by RocketDisk in the US) seems to claim all the right things: http://www.rocketdisk.com/vProduct.aspx?ID=1 pdf specs: http://www.rocketdisk.com/Local/Files/Product-PdfDataSheet-1_MemoRight%20SSD%20GT%20Specification.pdf They claim to support the cache flush command, and with respect to DRAM cache backup they say (p. 14/section 3.9 in that pdf):

At the risk of this getting a little off-topic (but hey, we're all looking for ZFS ZILs ;) We've had similar issues when looking at SSDs recently (lack of cache protection during power failure). The above SSDs look interesting [finally someone's noted you need to protect the cache] - but from what I've read about the Intel X25-E performance, the Intel drive with write cache turned off appears to be as fast, if not faster, than those drives anyway... I've tried contacting Intel to find out if it's true their enterprise SSD has no cache protection on it, and what effect turning the write cache off would have on both performance and write endurance, but I have not heard anything back yet. Picking apart the Intel benchmarks published - they always have the write cache enabled, which probably speaks volumes... -Karl
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] get parent dataset
Is there any way to display the parent of a dataset with the zfs (get/list) command? I do not need to list all of a dataset's children by using -r; I just want to get the parent of a child. There are ways of grepping and doing some regex matches, but I was wondering if there is any way of doing this directly. Thanks. -- ing. Vadim Comanescu S.C. Syneto S.R.L. str. Vasile Alecsandri nr 2, Timisoara Timis, Romania
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] USB Flashdrive as SLOG?
Hi, I know the general discussion is about flash SSDs connected through SATA/SAS or possibly PCI-E these days, so excuse me if I'm asking something that makes no sense... I have a server that can hold 6 U320 SCSI disks. Right now I have put in 5 300GB disks for a data pool, and 1 18GB disk for the root pool. I've been thinking lately that I'm not sure I like the root pool being unprotected, but I can't afford to give up another drive bay. So recently the idea occurred to me to go the other way. If I were to get 2 USB flash thumb drives, say 16 or 32 GB each, not only would I be able to mirror the root pool, but I'd also be able to put a 6th 300GB drive into the data pool. That led me to wonder whether partitioning out 8 or 12 GB on a 32GB thumb drive would be beneficial as a slog. I bet the USB bus won't be as good as SATA or SAS, but will it be better than the internal ZIL on the U320 drives? This seems like at least a win-win, and possibly a win-win-win. Is there some other reason I'm insane to consider this? -Kyle
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?
On Tue, May 25, 2010 at 10:08:57AM +0100, Karl Pielorz wrote: --On 24 May 2010 23:41 -0400 rwali...@washdcmail.com wrote: I haven't seen where anyone has tested this, but the MemoRight SSD (sold by RocketDisk in the US) seems to claim all the right things: http://www.rocketdisk.com/vProduct.aspx?ID=1 pdf specs: http://www.rocketdisk.com/Local/Files/Product-PdfDataSheet-1_MemoRight%20 SSD%20GT%20Specification.pdf They claim to support the cache flush command, and with respect to DRAM cache backup they say (p. 14/section 3.9 in that pdf): At the risk of this getting a little off-topic (but hey, we're all looking for ZFS ZIL's ;) We've had similar issues when looking at SSD's recently (lack of cache protection during power failure) - the above SSD's look interesting [finally someone's noted you need to protect the cache] - but from what I've read about the Intel X25-E performance - the Intel drive with write cache turned off appears to be as fast, if not faster than those drives anyway... I've tried contacting Intel to find out if it's true their enterprise SSD has no cache protection on it, and what the effect of turning the write cache off would have on both performance and write endurance, but not heard anything back yet. I guess the problem is not the cache by itself, but the fact that they ignore the CACHE FLUSH command.. and thus the non-battery-backed cache becomes a problem. -- Pasi Picking apart the Intel benchmarks published - they always have the write-cache enabled, which probably speaks volumes... -Karl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] USB Flashdrive as SLOG?
The last couple of times I've read this question, people normally responded with: it depends - you might not even NEED a slog, and there is a script floating around which can help determine that... If you could benefit from one, it's going to be IOPS which help you, so if the USB drive has more IOPS than your pool configuration does, then it might give some benefit. But then again, USB might not be as safe either, and on an older pool version you may want to mirror it.

On Tue, May 25, 2010 at 8:11 AM, Kyle McDonald kmcdon...@egenera.com wrote: Hi, I know the general discussion is about flash SSDs connected through SATA/SAS or possibly PCI-E these days, so excuse me if I'm asking something that makes no sense... I have a server that can hold 6 U320 SCSI disks. Right now I have put in 5 300GB disks for a data pool, and 1 18GB disk for the root pool. I've been thinking lately that I'm not sure I like the root pool being unprotected, but I can't afford to give up another drive bay. So recently the idea occurred to me to go the other way. If I were to get 2 USB flash thumb drives, say 16 or 32 GB each, not only would I be able to mirror the root pool, but I'd also be able to put a 6th 300GB drive into the data pool. That led me to wonder whether partitioning out 8 or 12 GB on a 32GB thumb drive would be beneficial as a slog. I bet the USB bus won't be as good as SATA or SAS, but will it be better than the internal ZIL on the U320 drives? This seems like at least a win-win, and possibly a win-win-win. Is there some other reason I'm insane to consider this? -Kyle
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
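(If you want a rough feel for how much synchronous/ZIL traffic the pool actually sees before buying anything - the script usually referred to here is Richard Elling's zilstat - a crude DTrace one-liner along the same lines, assuming the fbt probe on zil_commit is available on your build, is:

# dtrace -n 'fbt::zil_commit:entry { @ = count(); } tick-1sec { printa(@); trunc(@); }'

That just prints the number of zil_commit() calls per second; numbers near zero suggest a slog won't buy you much.)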
Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?
--On 25 May 2010 15:28 +0300 Pasi Kärkkäinen pa...@iki.fi wrote: I've tried contacting Intel to find out if it's true their enterprise SSD has no cache protection on it, and what the effect of turning the write cache off would have on both performance and write endurance, but not heard anything back yet. I guess the problem is not the cache by itself, but the fact that they ignore the CACHE FLUSH command.. and thus the non-battery-backed cache becomes a problem. The X25-E's do apparently honour the 'Disable Write Cache' command - without write cache, there is no cache to flush - all data is written to flash immediately - presumably before it's ACK'd to the host. I've seen a number of other sites do some testing with this - and found that it 'works' (i.e. with write-cache enabled, you get nasty data loss if the power is lost - with it disabled, it closes that window). But you obviously take quite a sizeable performance hit. We've got an X25-E here which we intend to test for ourselves (wisely ;) - to make sure that is the case... -Karl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
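(For reference, turning the write cache off on Solaris is normally done through format in expert mode; from memory the menu path is roughly the following - treat it as a sketch rather than an exact transcript:

# format -e
format> cache
cache> write_cache
write_cache> disable
write_cache> display

with display confirming the new setting afterwards.)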
Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?
On Tue, May 25, 2010 at 01:52:47PM +0100, Karl Pielorz wrote: --On 25 May 2010 15:28 +0300 Pasi Kärkkäinen pa...@iki.fi wrote: I've tried contacting Intel to find out if it's true their enterprise SSD has no cache protection on it, and what the effect of turning the write cache off would have on both performance and write endurance, but not heard anything back yet. I guess the problem is not the cache by itself, but the fact that they ignore the CACHE FLUSH command.. and thus the non-battery-backed cache becomes a problem. The X25-E's do apparently honour the 'Disable Write Cache' command - without write cache, there is no cache to flush - all data is written to flash immediately - presumably before it's ACK'd to the host. I've seen a number of other sites do some testing with this - and found that it 'works' (i.e. with write-cache enabled, you get nasty data loss if the power is lost - with it disabled, it closes that window). But you obviously take quite a sizeable performance hit. Yeah.. what I meant is: if you have write cache enabled, and the ssd drive honours 'CACHE FLUSH' command, then you should be safe.. Based on what I've understood the Intel SSDs ignore the CACHE FLUSH command, and thus it's not safe to run them with caches enabled.. We've got an X25-E here which we intend to test for ourselves (wisely ;) - to make sure that is the case... Please let us know how it goes :) -- Pasi ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reconstruct recovery of rpool zpool and zfs file system with bad sectors
Roy, Thanks for your reply. I did get a new drive and attempted the approach (as you suggested prior to your reply), however once booted off the OpenSolaris Live CD (or the rebuilt new drive), I was not able to import the rpool (which I had established had sector errors). I expect I would have had some success if the vdev labels were intact (I currently suspect some critical boot files are affected by bad sectors, resulting in failed boot attempts from that partition slice). Unfortunately, I didn't keep a copy of the messages (if any - I have tried many permutations since). At my last attempt, I installed Knoppix (Debian) on one of the partitions (this also allowed access to smartctl and hdparm - I was hoping to reduce the read timeout to speed up the exercise), then added zfs-fuse (to access the space I will use to stage the recovery file) and the dd_rescue and GNU ddrescue packages. smartctl appears not to be able to manage the disk while it is attached via USB (but I am guessing, because I don't have much experience with it). At this point I attempted dd_rescue to create an image of the partition with bad sectors (hoping for efficiencies beyond normal dd), but it was at 5.6GB in 36 hours, so again I needed to abort; however, it does log the blocks attempted so far, so hopefully I can skip past them when I next get an opportunity. It does now appear that GNU ddrescue is the preferred of the two utilities, and I may opt to use it to create an image of the partition before attempting recovery of the slice (rpool). As an aside, I noticed that the Knoppix 'dmesg | grep sd' output, which reflects the primary partition devices, no longer appears to reflect the Solaris partition (p2) slice devices (as it does the extended p4 partition's logical partition devices). I suspect that because of this, the rpool (on one of the Solaris partition slices) is not detected by the Knoppix zfs-fuse 'zpool import' (although I can access the zpool which exists on partition p3). I wonder if this is related to the transition from ufs to zfs? -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
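(In case it helps anyone following along, the sequence I'd expect to end up using - a sketch only, with placeholder device names and paths since the real ones depend on how Knoppix enumerates the disk - is roughly:

# first pass: grab everything that reads cleanly, skip the bad areas
ddrescue -n /dev/sdb2 /recovery/rpool-slice.img /recovery/rpool-slice.map
# second pass: retry the bad areas a few times with direct access
ddrescue -d -r3 /dev/sdb2 /recovery/rpool-slice.img /recovery/rpool-slice.map
# then point zfs-fuse at the directory holding the image
zpool import -d /recovery

The map/log file is what lets the second run skip straight to the previously failed blocks.)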
Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!
eon:1:~# zdb -l /dev/rdsk/c1d0
LABEL 0
failed to unpack label 0
LABEL 1
failed to unpack label 1
LABEL 2
failed to unpack label 2
LABEL 3
failed to unpack label 3

Same for the other five drives in the pool. What now? -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] question about zpool iostat output
Hi Thomas, This looks like a display bug. I'm seeing it too. Let me know which Solaris release you are running and I will file a bug. Thanks, Cindy

On 05/25/10 01:42, Thomas Burgess wrote: I was just wondering: I added a SLOG/ZIL to my new system today. I noticed that the L2ARC shows up under its own heading, but the SLOG/ZIL doesn't. Is this correct? See:

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       15.3G  44.2G      0      0      0      0
  c6t4d0s0  15.3G  44.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
tank        10.9T  7.22T      0  2.43K      0   300M
  raidz2    10.9T  7.22T      0  2.43K      0   300M
    c4t6d0      -      -      0    349      0  37.6M
    c4t5d0      -      -      0    350      0  37.6M
    c5t7d0      -      -      0    350      0  37.6M
    c5t3d0      -      -      0    350      0  37.6M
    c8t0d0      -      -      0    354      0  37.6M
    c4t7d0      -      -      0    351      0  37.6M
    c4t3d0      -      -      0    350      0  37.6M
    c5t8d0      -      -      0    349      0  37.6M
    c5t0d0      -      -      0    348      0  37.6M
    c8t1d0      -      -      0    353      0  37.6M
  c6t5d0s0        0  8.94G      0      0      0      0
cache           -      -      -      -      -      -
  c6t5d0s1  37.5G      0      0    158      0  19.6M

It seems sort of strange to me that it doesn't look like this instead:

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       15.3G  44.2G      0      0      0      0
  c6t4d0s0  15.3G  44.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
tank        10.9T  7.22T      0  2.43K      0   300M
  raidz2    10.9T  7.22T      0  2.43K      0   300M
    c4t6d0      -      -      0    349      0  37.6M
    c4t5d0      -      -      0    350      0  37.6M
    c5t7d0      -      -      0    350      0  37.6M
    c5t3d0      -      -      0    350      0  37.6M
    c8t0d0      -      -      0    354      0  37.6M
    c4t7d0      -      -      0    351      0  37.6M
    c4t3d0      -      -      0    350      0  37.6M
    c5t8d0      -      -      0    349      0  37.6M
    c5t0d0      -      -      0    348      0  37.6M
    c8t1d0      -      -      0    353      0  37.6M
log             -      -      -      -      -      -
  c6t5d0s0        0  8.94G      0      0      0      0
cache           -      -      -      -      -      -
  c6t5d0s1  37.5G      0      0    158      0  19.6M

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!
try to zdb -l /dev/rdsk/c1d0s0 2010/5/25 h bajsadb...@pleasespam.me eon:1:~#zdb -l /dev/rdsk/c1d0 LABEL 0 failed to unpack label 0 LABEL 1 failed to unpack label 1 LABEL 2 failed to unpack label 2 LABEL 3 failed to unpack label 3 same for the other five drives in the pool what now? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
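(If s0 also comes up empty, it may be worth walking all the slices to see where the labels actually live - a quick sketch, assuming the same c1d0 device naming and run as root:

for s in 0 1 2 3 4 5 6 7; do
  echo "=== c1d0s$s ==="
  zdb -l /dev/rdsk/c1d0s$s | egrep 'name:|guid:' | head -4
done

Any slice that holds a valid label should print the pool name and guids.)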
Re: [zfs-discuss] can you recover a pool if you lose the zil (b134+)
Is there a best practice on keeping a backup of the zpool.cache file? Is it possible? Does it change with changes to vdevs? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] get parent dataset
On 5/25/2010 2:55 AM, Vadim Comanescu wrote: Is there any way you can display the parent of a dataset by zfs (get/list) command ? I do not need to list for example for a dataset all it's children by using -r just to get the parent on a child. There are way's of grepping and doing some preg matches but i was wondering if there is any way by doing this directly. Thanks. I'm not aware of any, but it seems like that would be a straight-forward thing to add to the zfs command... maybe file an RFE. - Garrett -- ing. Vadim Comanescu S.C. Syneto S.R.L. str. Vasile Alecsandri nr 2, Timisoara Timis, Romania ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
Hi Reshekel, You might review these resources for information on using ZFS without having to hack the code:

http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs (ZFS Administration Guide)
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

I will add a section on migrating from FreeBSD because this problem comes up often enough. You might search the list archive for this problem to see how others have resolved the partition issues. Moving ZFS storage pools from a FreeBSD system to a Solaris system is difficult because it looks like FreeBSD uses the disk's p0 partition, while in Solaris releases ZFS storage pools are either created with whole disks, using the d0 identifier, or are root pools, which are created using the disk slice identifier (s0). This is an existing boot limitation. For example, see the difference in the two pools:

# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

  pool: dozer
 state: ONLINE
 scrub: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        dozer     ONLINE       0     0     0
          c2t5d0  ONLINE       0     0     0
          c2t6d0  ONLINE       0     0     0

errors: No known data errors

If you want to boot from a ZFS storage pool, then you must create the pool with slices. This is why you see the message about EFI labels: pools that are created with whole disks use an EFI label, and Solaris doesn't boot from an EFI label. You can add a cache device to a pool reserved for booting, but you must create a disk slice and then add the cache device like this:

# zpool add rpool cache c1t2d0s0
# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0
        cache
          c1t2d0s0    ONLINE       0     0     0

I suggest creating two pools, one small pool for booting and one larger pool for data storage. Thanks, Cindy

On 05/25/10 02:58, Reshekel Shedwitz wrote: Greetings - I am migrating a pool from FreeBSD 8.0 to OpenSolaris (Nexenta 3.0 RC1). I am in what seems to be a weird situation regarding this pool. Maybe someone can help. I used to boot off of this pool in FreeBSD, so the bootfs property got set:

r...@nexenta:~# zpool get bootfs tank
NAME  PROPERTY  VALUE  SOURCE
tank  bootfs    tank   local

The presence of this property seems to be causing me all sorts of headaches. I cannot replace a disk or add an L2ARC, because the presence of this flag is how the ZFS code (libzfs_pool.c: zpool_vdev_attach and zpool_label_disk) determines whether a pool is allegedly a root pool.

r...@nexenta:~# zpool add tank cache c1d0
cannot label 'c1d0': EFI labeled devices are not supported on root pools.

To replace disks, I was able to hack up libzfs_pool.c and build a custom version of the zpool command. That works, but it is a poor solution going forward because I have to be sure I use my customized version every time I replace a bad disk. Ultimately, I would like to just set the bootfs property back to its default, but this seems to be beyond my ability. There are some checks in libzfs_pool.c that I can bypass in order to set the value back to its default of "-", but ultimately I am stopped because there is code in zfs_ioctl.c, which I believe is kernel code, that checks whether the bootfs value supplied is actually an existing dataset. I'd compile my own kernel, but hey, this is only my first day using OpenSolaris - it was a big enough feat just learning how to compile stuff in the ON source tree :D

What should I do here? Is there some obvious solution I'm missing? I'd like to be able to get my pool back to a state where I can use the *stock* zpool command to maintain it. I don't boot off of this pool anymore, so I just want to unset bootfs somehow. BTW, for reference, here is the output of zpool status (after I hacked up zpool to let me add an L2ARC):

  pool: tank
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
  scan: resilvered 351G in 2h44m with 0 errors on Tue May 25 23:33:38 2010
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c2t5d0p0  ONLINE       0     0     0
            c2t4d0p0  ONLINE       0     0     0
            c2t3d0p0  ONLINE       0     0     0
            c2t2d0p0  ONLINE       0     0     0
Re: [zfs-discuss] question about zpool iostat output
I am running the last release from the genunix page. uname -a output:

SunOS wonslung-raidz2 5.11 snv_134 i86pc i386 i86pc Solaris

On Tue, May 25, 2010 at 10:33 AM, Cindy Swearingen cindy.swearin...@oracle.com wrote: Hi Thomas, This looks like a display bug. I'm seeing it too. Let me know which Solaris release you are running and I will file a bug. Thanks, Cindy

On 05/25/10 01:42, Thomas Burgess wrote: I was just wondering: I added a SLOG/ZIL to my new system today. I noticed that the L2ARC shows up under its own heading, but the SLOG/ZIL doesn't. Is this correct? See:

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       15.3G  44.2G      0      0      0      0
  c6t4d0s0  15.3G  44.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
tank        10.9T  7.22T      0  2.43K      0   300M
  raidz2    10.9T  7.22T      0  2.43K      0   300M
    c4t6d0      -      -      0    349      0  37.6M
    c4t5d0      -      -      0    350      0  37.6M
    c5t7d0      -      -      0    350      0  37.6M
    c5t3d0      -      -      0    350      0  37.6M
    c8t0d0      -      -      0    354      0  37.6M
    c4t7d0      -      -      0    351      0  37.6M
    c4t3d0      -      -      0    350      0  37.6M
    c5t8d0      -      -      0    349      0  37.6M
    c5t0d0      -      -      0    348      0  37.6M
    c8t1d0      -      -      0    353      0  37.6M
  c6t5d0s0        0  8.94G      0      0      0      0
cache           -      -      -      -      -      -
  c6t5d0s1  37.5G      0      0    158      0  19.6M

It seems sort of strange to me that it doesn't look like this instead:

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       15.3G  44.2G      0      0      0      0
  c6t4d0s0  15.3G  44.2G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
tank        10.9T  7.22T      0  2.43K      0   300M
  raidz2    10.9T  7.22T      0  2.43K      0   300M
    c4t6d0      -      -      0    349      0  37.6M
    c4t5d0      -      -      0    350      0  37.6M
    c5t7d0      -      -      0    350      0  37.6M
    c5t3d0      -      -      0    350      0  37.6M
    c8t0d0      -      -      0    354      0  37.6M
    c4t7d0      -      -      0    351      0  37.6M
    c4t3d0      -      -      0    350      0  37.6M
    c5t8d0      -      -      0    349      0  37.6M
    c5t0d0      -      -      0    348      0  37.6M
    c8t1d0      -      -      0    353      0  37.6M
log             -      -      -      -      -      -
  c6t5d0s0        0  8.94G      0      0      0      0
cache           -      -      -      -      -      -
  c6t5d0s1  37.5G      0      0    158      0  19.6M

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] get parent dataset
On Tue, May 25, 2010 at 2:55 AM, Vadim Comanescu va...@syneto.net wrote: Is there any way you can display the parent of a dataset by zfs (get/list) command ? I do not need to list for example for a dataset all it's children by using -r just to get the parent on a child. There are way's of grepping and doing some preg matches but i was wondering if there is any way by doing this directly. Thanks. If you know the current dataset name, there's no real need to have the parent as a property. Just subtract the last / and any trailing characters. Heck, you could use 'dirname' in scripts to do it for you. bh...@basestar:~$ zfs list -o name rpool/ROOT/snv_134 NAME rpool/ROOT/snv_134 bh...@basestar:~$ dirname rpool/ROOT/snv_134 rpool/ROOT -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
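(A quick sketch for doing the same across a whole tree - dataset names are examples, and note that dirname of a top-level pool name just returns ".", so that case needs special handling:

for ds in $(zfs list -H -o name -r tank); do
  printf '%s -> parent: %s\n' "$ds" "$(dirname "$ds")"
done
)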
Re: [zfs-discuss] questions about zil
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Nicolas Williams: "I recently got a new SSD (ocz vertex LE 50gb). It seems to work really well as a ZIL performance wise. I know it doesn't have a supercap, so let's say dataloss occurs... is it just dataloss or is it pool loss?" "Just dataloss."

WRONG! The correct answer depends on your version of solaris/opensolaris. More specifically, it depends on the zpool version. The latest fully updated sol10 and the latest opensolaris release (2009.06) only go up to zpool version 14 or 15, but zpool version 19 is when a ZIL loss stopped permanently offlining the whole pool. I know this is available in the developer builds. The best answer to this, I think, is in the ZFS Best Practices Guide (uggh, it's down right now, so I can't paste the link). If you have zpool < 19 and you lose an unmirrored ZIL, then you lose your pool. Also, as a configurable option apparently; I know on my systems it also meant I needed to power cycle. If you have zpool >= 19 and you lose an unmirrored ZIL, then performance will be degraded, but everything continues to work as normal. Apparently the most common mode of failure for SSDs is also failure to read. To make it worse, a ZIL is only read after a system crash, which means the possibility of having a failed SSD go undetected must be taken into consideration. If you do discover a failed ZIL after a crash, with zpool < 19 your pool is lost, but with zpool >= 19 only the unreplayed writes are lost. With zpool >= 19, your pool will be intact, but you could lose up to 30 seconds of writes that occurred just before the crash.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
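(To check where a given system actually sits, something like the following works - the pool name is an example:

$ zpool upgrade -v          # lists the pool versions this build supports
$ zpool get version tank    # shows the version the pool itself is at

A pool only picks up the version 19 behaviour after a 'zpool upgrade' on a build that supports it.)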
Re: [zfs-discuss] get parent dataset
On 5/25/2010 8:24 AM, Brandon High wrote: On Tue, May 25, 2010 at 2:55 AM, Vadim Comanescuva...@syneto.net wrote: Is there any way you can display the parent of a dataset by zfs (get/list) command ? I do not need to list for example for a dataset all it's children by using -r just to get the parent on a child. There are way's of grepping and doing some preg matches but i was wondering if there is any way by doing this directly. Thanks. If you know the current dataset name, there's no real need to have the parent as a property. Just subtract the last / and any trailing characters. Heck, you could use 'dirname' in scripts to do it for you. bh...@basestar:~$ zfs list -o name rpool/ROOT/snv_134 NAME rpool/ROOT/snv_134 bh...@basestar:~$ dirname rpool/ROOT/snv_134 rpool/ROOT -B Good point. :-) I was thinking of mount point names, which are different from data set names. -- Garrett ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
On Tue, May 25, 2010 at 11:27 AM, Edward Ned Harvey solar...@nedharvey.com wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Nicolas Williams: "I recently got a new SSD (ocz vertex LE 50gb). It seems to work really well as a ZIL performance wise. I know it doesn't have a supercap, so let's say dataloss occurs... is it just dataloss or is it pool loss?" "Just dataloss." WRONG! The correct answer depends on your version of solaris/opensolaris. More specifically, it depends on the zpool version. The latest fully updated sol10 and the latest opensolaris release (2009.06) only go up to zpool version 14 or 15, but zpool version 19 is when a ZIL loss stopped permanently offlining the whole pool. I know this is available in the developer builds. The best answer to this, I think, is in the ZFS Best Practices Guide (uggh, it's down right now, so I can't paste the link). If you have zpool < 19 and you lose an unmirrored ZIL, then you lose your pool. Also, as a configurable option apparently; I know on my systems it also meant I needed to power cycle. If you have zpool >= 19 and you lose an unmirrored ZIL, then performance will be degraded, but everything continues to work as normal. Apparently the most common mode of failure for SSDs is also failure to read. To make it worse, a ZIL is only read after a system crash, which means the possibility of having a failed SSD go undetected must be taken into consideration. If you do discover a failed ZIL after a crash, with zpool < 19 your pool is lost, but with zpool >= 19 only the unreplayed writes are lost. With zpool >= 19, your pool will be intact, but you could lose up to 30 seconds of writes that occurred just before the crash.

I didn't ask about losing my ZIL. I asked about power loss taking out my pool.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
On Tue, May 25, 2010 at 1:58 AM, Reshekel Shedwitz reshe...@spam.la wrote: Ultimately, I would like to just set the bootfs property back to default, but this seems to be beyond my ability. There are some checks in libzfs_pool.c that I can bypass in order to set the value back to its default of -, but ultimately I am stopped because there is code in zfs_ioctl.c, which I believe is kernel code, that checks to see if the bootfs value supplied is actually an existing dataset.

I'm fairly certain that I've been able to set and unset the bootfs property on my rpool in snv_133 and snv_134. Just use an empty value when setting it. In fact:

bh...@basestar:~$ zpool get bootfs rpool
NAME   PROPERTY  VALUE               SOURCE
rpool  bootfs    rpool/ROOT/snv_134  local
bh...@basestar:~$ pfexec zpool set bootfs= rpool
bh...@basestar:~$ zpool get bootfs rpool
NAME   PROPERTY  VALUE               SOURCE
rpool  bootfs    -                   default
bh...@basestar:~$ pfexec zpool set bootfs=rpool/ROOT/snv_134 rpool
bh...@basestar:~$ zpool get bootfs rpool
NAME   PROPERTY  VALUE               SOURCE
rpool  bootfs    rpool/ROOT/snv_134  local

-- Brandon High : bh...@freaks.com
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] USB Flashdrive as SLOG?
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Kyle McDonald I've been thinking lately that I'm not sure I like the root pool being unprotected, but I can't afford to give up another drive bay. I'm guessing you won't be able to use the USB thumbs as a boot device. But that's just a guess. However, I see nothing wrong with mirroring your primary boot device to the USB. At least in this case, if the OS drive fails, your system doesn't crash. You're able to swap the OS drive and restore your OS mirror. That led me to wonder whether partitioning out 8 or 12 GB on a 32GB thumb drive would be beneficial as an slog?? I think the only way to find out is to measure it. I do have an educated guess though. I don't think, even the fastest USB flash drives are able to work quickly, with significantly low latency. Based on measurements I made years ago, so again I emphasize, only way to find out is to test it. One thing you could check, which does get you a lot of mileage for free is: Make sure your HBA has a BBU, and enable the WriteBack. In my measurements, this gains about 75% of the benefit that log devices would give you. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
Cindy, Thanks for your reply. The important details may have been buried in my post, I will repeat them again to make it more clear: (1) This was my boot pool in FreeBSD, but I do not think the partitioning differences are really the issue. I can import the pool to nexenta/opensolaris just fine. Furthermore, this is *no longer* being used as a root pool in nexenta. I purchased an SSD for the purpose of booting nexenta. This pool is used purely for data storage - no booting. (2) I had to hack the code because zpool is forbidding me from adding or replacing devices - please see my logs in the previous post. zpool thinks this pool is a boot pool due to the bootfs flag being set, and zpool will not let me unset the bootfs property. So I'm stuck in a situation where zpool thinks my pool is a boot pool because of the bootfs property, and zpool will not let me unset the bootfs property. Because zpool thinks this pool is the boot pool, it is trying to forbid me from creating a configuration that isn't compatible with booting. In this situation, I am unable to add or replace devices without using my hacked version of zpool. I was able to hack the code to allow zpool to replace and add devices, but I was not able to figure out how to set the bootfs property back to the default value. Does this help explain my situation better? I think this is a bug, or maybe I'm missing something totally obvious. Thanks! -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
On Tue, May 25, 2010 at 1:58 AM, Reshekel Shedwitz reshe...@spam.la wrote: Ultimately, I would like to just set the bootfs property back to default, but this seems to be beyond my ability. There are some checks in libzfs_pool.c that I can bypass in order to set the value back to its default of -, but ultimately I am stopped because there is code in zfs_ioctl.c, which I believe is kernel code, that checks to see if the bootfs value supplied is actually an existing dataset.

I'm fairly certain that I've been able to set and unset the bootfs property on my rpool in snv_133 and snv_134. Just use an empty value when setting it. In fact:

bh...@basestar:~$ zpool get bootfs rpool
NAME   PROPERTY  VALUE               SOURCE
rpool  bootfs    rpool/ROOT/snv_134  local
bh...@basestar:~$ pfexec zpool set bootfs= rpool
bh...@basestar:~$ zpool get bootfs rpool
NAME   PROPERTY  VALUE               SOURCE
rpool  bootfs    -                   default
bh...@basestar:~$ pfexec zpool set bootfs=rpool/ROOT/snv_134 rpool
bh...@basestar:~$ zpool get bootfs rpool
NAME   PROPERTY  VALUE               SOURCE
rpool  bootfs    rpool/ROOT/snv_134  local

-- Brandon High : bh...@freaks.com
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

That doesn't work for me:

r...@nexenta:~# zpool set bootfs= tank
cannot set property for 'tank': property 'bootfs' not supported on EFI labeled devices
r...@nexenta:~# zpool get bootfs tank
NAME  PROPERTY  VALUE  SOURCE
tank  bootfs    tank   local

Could this be related to the way FreeBSD's zfs partitioned my disk? I thought ZFS used EFI by default though (except for boot pools). Thanks. -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
From: Thomas Burgess [mailto:wonsl...@gmail.com]: "Just dataloss." "WRONG!" "I didn't ask about losing my ZIL. I asked about power loss taking out my pool."

As I recall: "I recently got a new SSD (ocz vertex LE 50gb). It seems to work really well as a ZIL performance wise. My question is, how safe is it? I know it doesn't have a supercap, so let's say dataloss occurs... is it just dataloss or is it pool loss?"

At least to me, this was not clearly asking about losing the ZIL, and was not clearly asking about power loss. Sorry for answering the question you thought you didn't ask. I would suggest clarifying your question by saying instead: "so let's say *power* loss occurs." Then it would have been clear what you were asking. Since this is an SSD you're talking about, unless you have enabled the volatile write cache on that disk (which you should never do) and the disk incorrectly handles cache flush commands (which it should never do), the supercap is irrelevant. All ZIL writes are to be done synchronously. If you have a power loss, you don't lose your pool, and you also don't lose any writes in the ZIL. You do, however, lose any async writes that were not yet flushed to disk. There is no way to prevent that, regardless of ZIL configuration.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] USB Flashdrive as SLOG?
The USB stack in OpenSolaris is ... complex (STREAMs based!), and probably not the most performant or reliable portion of the system. Furthermore, the mass storage layer, which encapsulates SCSI, is not tuned for a high number of IOPS or low latencies, and the stack makes different assumptions about USB media than it makes for SCSI. Further, you will not be able to get direct DMA through this stack either, so you wind up sucking extra CPU doing data copies. I would think long and hard before I put too many eggs in that particular basket. Additionally, USB has the tendency to run at high interrupt rates (1000 Hz), which can have a detrimental impact on system performance and power consumption. Its possible that mass storage devices don't have this attribute -- I'm not sure, I've not tried to investigate it directly. One attribute that you can rest assured of though, is that the average latency for USB operations cannot be less than 1 ms -- which is driven by that 1000 Hz, because USB doesn't have a true interrupt mechanism (it polls). I believe that this is considerably higher than the lowest latency achievable with PCI and SATA or SAS devices. Generally, eSATA flash drives would be preferable for external flash media, I think. Additionally, the SATA framework has quite recently inherited FMA support, so you'll benefit from closer integration of FMA and ZFS when using SATA. - Garrett On 5/25/2010 8:39 AM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Kyle McDonald I've been thinking lately that I'm not sure I like the root pool being unprotected, but I can't afford to give up another drive bay. I'm guessing you won't be able to use the USB thumbs as a boot device. But that's just a guess. However, I see nothing wrong with mirroring your primary boot device to the USB. At least in this case, if the OS drive fails, your system doesn't crash. You're able to swap the OS drive and restore your OS mirror. That led me to wonder whether partitioning out 8 or 12 GB on a 32GB thumb drive would be beneficial as an slog?? I think the only way to find out is to measure it. I do have an educated guess though. I don't think, even the fastest USB flash drives are able to work quickly, with significantly low latency. Based on measurements I made years ago, so again I emphasize, only way to find out is to test it. One thing you could check, which does get you a lot of mileage for free is: Make sure your HBA has a BBU, and enable the WriteBack. In my measurements, this gains about 75% of the benefit that log devices would give you. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
Reshekel Shedwitz wrote:

r...@nexenta:~# zpool set bootfs= tank
cannot set property for 'tank': property 'bootfs' not supported on EFI labeled devices
r...@nexenta:~# zpool get bootfs tank
NAME  PROPERTY  VALUE  SOURCE
tank  bootfs    tank   local

Could this be related to the way FreeBSD's zfs partitioned my disk? I thought ZFS used EFI by default though (except for boot pools).

Looks like this bit of code to me: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_pool.c#473

473                     /*
474                      * bootfs property cannot be set on a disk which has
475                      * been EFI labeled.
476                      */
477                     if (pool_uses_efi(nvroot)) {
478                             zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
479                                 "property '%s' not supported on "
480                                 "EFI labeled devices"), propname);
481                             (void) zfs_error(hdl, EZFS_POOL_NOTSUP, errbuf);
482                             zpool_close(zhp);
483                             goto error;
484                     }
485                     zpool_close(zhp);
486                     break;

It's not checking if you're clearing the property before bailing out with the error about setting it. A few lines above, another test (for a valid bootfs name) does get bypassed in the case of clearing the property. Don't know if that alone would fix it. -- Andrew Gabriel | Solaris Systems Architect Email: andrew.gabr...@oracle.com Mobile: +44 7720 598213 Oracle Pre-Sales Guillemont Park | Minley Road | Camberley | GU17 9QG | United Kingdom ORACLE Corporation UK Ltd is a company incorporated in England Wales | Company Reg. No. 1782505 | Reg. office: Oracle Parkway, Thames Valley Park, Reading RG6 1RA Oracle is committed to developing practices and products that help protect the environment
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
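(If someone wanted to experiment, a hypothetical sketch of the kind of change being described - not the actual fix, and assuming the local variable holding the property's string value in that function is called strval, as it is elsewhere in libzfs_pool.c - might look like:

                        /*
                         * Hypothetical sketch only: skip the EFI check when
                         * the property is being cleared (empty string),
                         * mirroring how the bootfs-name validation above is
                         * bypassed when clearing.
                         */
                        if (strval != NULL && strval[0] != '\0' &&
                            pool_uses_efi(nvroot)) {
                                zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
                                    "property '%s' not supported on "
                                    "EFI labeled devices"), propname);
                                (void) zfs_error(hdl, EZFS_POOL_NOTSUP, errbuf);
                                zpool_close(zhp);
                                goto error;
                        }
)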
Re: [zfs-discuss] USB Flashdrive as SLOG?
On 5/25/2010 11:39 AM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Kyle McDonald: "I've been thinking lately that I'm not sure I like the root pool being unprotected, but I can't afford to give up another drive bay." I'm guessing you won't be able to use the USB thumbs as a boot device. But that's just a guess.

No, I've installed to an 8GB one on my laptop and booted from it. And this server offers USB drives as a boot option, so I don't see why it wouldn't work - but I won't know till I try it.

However, I see nothing wrong with mirroring your primary boot device to the USB. At least in this case, if the OS drive fails, your system doesn't crash. You're able to swap the OS drive and restore your OS mirror.

True. If nothing else I may do at least that.

"That led me to wonder whether partitioning out 8 or 12 GB on a 32GB thumb drive would be beneficial as a slog?" I think the only way to find out is to measure it. I do have an educated guess though. I don't think even the fastest USB flash drives are able to work quickly, with significantly low latency. That's based on measurements I made years ago, so again I emphasize, the only way to find out is to test it.

Yes, I guess I'll have to try some benchmarks. The thing that got me thinking was that many of these drives support a Windows feature called 'ReadyBoost' - which I think is just Windows swapping to the USB drive instead of the HD - but Windows does a performance test on the device to see if it's fast enough. I thought maybe if it's faster to swap to than an HD, it might be faster for a slog too. But you're right, the only way to know is to measure it.

One thing you could check, which does get you a lot of mileage for free is: Make sure your HBA has a BBU, and enable the WriteBack. In my measurements, this gains about 75% of the benefit that log devices would give you.

My HBAs have 256MB of BBC, and it's enabled on all 6 drives, so that should help. However, I may have hit a bug in the 'isp' driver (I still have to debug and see if that's the root cause) and I may need to yank the RAID enabler and go back to straight SCSI. -Kyle
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
On Mon, 24 May 2010, Thomas Burgess wrote: It's a sandforce sf-1500 model but without a supercap. Here's some info on it: Maximum Performance * Max Read: up to 270MB/s * Max Write: up to 250MB/s * Sustained Write: up to 235MB/s * Random Write 4k: 15,000 IOPS * Max 4k IOPS: 50,000

Isn't there a serious problem with these specifications? It seems that the minimum assured performance values (and the median) are much more interesting than some maximum performance value which might only be reached during a brief instant of the device's lifetime under extremely ideal circumstances. It seems that toilet paper may be of much more practical use than these specifications. In fact, I reject them as being specifications at all. The Apollo reentry vehicle was able to reach amazing speeds, but only for a single use. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] can you recover a pool if you lose the zil (b134+)
On May 25, 2010, at 7:46 AM, thomas wrote: Is there a best practice on keeping a backup of the zpool.cache file? Same as anything else, but a little bit easier because you can snapshot the root pool. Thus far, the only real use for the backups is for a manual recovery of missing top-level vdevs -- a rare event. Is it possible? Yes Does it change with changes to vdevs? Yes -- richard -- ZFS and NexentaStor training, Rotterdam, July 13-15, 2010 http://nexenta-rotterdam.eventbrite.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
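(A minimal sketch of what that can look like in practice - paths and the dataset name are examples only:

# cp -p /etc/zfs/zpool.cache /root/zpool.cache.$(date +%Y%m%d)
# zfs snapshot rpool/ROOT/snv_134@zpool-cache-backup

Since the file changes whenever top-level vdevs are added or removed, re-copy it after any zpool add/attach/remove.)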
Re: [zfs-discuss] questions about zil
"At least to me, this was not clearly asking about losing the ZIL, and was not clearly asking about power loss. Sorry for answering the question you thought you didn't ask."

I was only responding to your response of "WRONG!" The guy wasn't wrong in regards to my questions. I'm sorry for not making THAT more clear in my post.

"I would suggest clarifying your question by saying instead: so let's say *power* loss occurs. Then it would have been clear what you were asking."

I'm pretty sure I did ask about power loss... or at least it was implied by my point about the UPS. You're right, I probably should have been a little more clear.

"Since this is an SSD you're talking about, unless you have enabled the volatile write cache on that disk (which you should never do) and the disk incorrectly handles cache flush commands (which it should never do), the supercap is irrelevant. All ZIL writes are to be done synchronously."

This SSD doesn't use a volatile write cache (at least I don't think it does, it's an SF-1500 based SSD). I might be wrong about this, but I thought one of the biggest things about the SandForce was that it doesn't use DRAM.

"If you have a power loss, you don't lose your pool, and you also don't lose any writes in the ZIL. You do, however, lose any async writes that were not yet flushed to disk. There is no way to prevent that, regardless of ZIL configuration."

Yes, I know that I lose async writes; I just wasn't sure if that could result in a bigger issue. I might be somewhat confused as to how the ZIL works, but I thought the point of the ZIL was to pretend a write actually happened when it may not have actually been flushed to disk yet. In this case, a write to the ZIL might not make it to disk; I just didn't know if this could result in the loss of a pool due to some sort of corruption of the uberblock or something. I'm not entirely up to speed on the voodoo that is ZFS. I wasn't trying to be rude, sorry if it came off like that. I am aware of the issue regarding removing the ZIL on non-dev versions of opensolaris; I am on b134 so that doesn't apply to me. Thanks
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
On Tue, May 25, 2010 at 12:38 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Mon, 24 May 2010, Thomas Burgess wrote: It's a sandforce sf-1500 model but without a supercap. Here's some info on it: Maximum Performance * Max Read: up to 270MB/s * Max Write: up to 250MB/s * Sustained Write: up to 235MB/s * Random Write 4k: 15,000 IOPS * Max 4k IOPS: 50,000

Isn't there a serious problem with these specifications? It seems that the minimum assured performance values (and the median) are much more interesting than some maximum performance value which might only be reached during a brief instant of the device's lifetime under extremely ideal circumstances. It seems that toilet paper may be of much more practical use than these specifications. In fact, I reject them as being specifications at all. The Apollo reentry vehicle was able to reach amazing speeds, but only for a single use. Bob

What exactly do you mean? Every review I've read about this device has been great. Every review I've read about the SandForce controllers has been good too... are you saying they have shorter lifetimes? Everything I've read has made them sound like they should last longer than typical SSDs because they write less actual data. -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
Also, let me note, it came with a 3 year warranty so I expect it to last at least 3 years...but if it doesn't, i'll just return it under the warranty. On Tue, May 25, 2010 at 1:26 PM, Thomas Burgess wonsl...@gmail.com wrote: On Tue, May 25, 2010 at 12:38 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Mon, 24 May 2010, Thomas Burgess wrote: It's a sandforce sf-1500 model but without a supercapheres some info on it: Maximum Performance * Max Read: up to 270MB/s * Max Write: up to 250MB/s * Sustained Write: up to 235MB/s * Random Write 4k: 15,000 IOPS * Max 4k IOPS: 50,000 Isn't there a serious problem with these specifications? It seems that the minimum assured performance values (and the median) are much more interesting than some maximum performance value which might only be reached during a brief instant of the device lifetime under extremely ideal circumstances. It seems that toilet paper may of much more practical use than these specifications. In fact, I reject them as being specifications at all. The Apollo reentry vehicle was able to reach amazing speeds, but only for a single use. Bob What exactly do you mean? Every review i've read about this device has been great. Every review i've read about the sandforce controllers has been good toare you saying they have shorter lifetimes? Everything i've read has made them sound like they should last longer than typical ssds because they write less actual data -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
On Tue, 25 May 2010, Thomas Burgess wrote: The Apollo reentry vehicle was able to reach amazing speeds, but only for a single use. What exactly do you mean? What I mean is what I said. A set of specifications which are all written as maximums (i.e. peak) is pretty useless. Perhaps if you were talking about a maximum ambient temperature specification or maximum allowed elevation, then a maximum specification makes sense. Perhaps the device is fine (I have no idea) but these posted specifications are virtually useless. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] questions about zil
From: Thomas Burgess [mailto:wonsl...@gmail.com] I might be somewhat confused to how the ZIL works but i thought the point of the ZIL was to pretend a write actually happened when it may not have actually been flushed to disk yet... No. How the ZIL works is like this: Whenever a process issues a sync write, the process blocks until the OS acknowledges the write has been committed to nonvolatile storage. Assuming you have a dedicated log device, the OS immediately commits this data to the log device, and unblocks the process. Then, the data is able to float around in RAM with all the async write requests, getting optimized for disk performance and so forth. The OS might aggregate up to 30 secs of small writes into a single larger sequential transaction for the primary storage devices. If there's an unfortunate event such as system crash during the meantime, then upon the next bootup, the OS will notice data in the ZIL log, which was intended for a TXG, which never made its way out to primary storage. Therefore, the OS replays the log, and commits those writes now. All the async writes that were still in RAM were lost, but the sync writes were not. The dedicated log device helps sync writes approach the performance of async writes. Nothing beats the performance of async writes. If you disable ZIL, then sync writes are handled the same as async writes. Both in terms of performance, and risk. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
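For what it's worth, on builds of that era the way people disabled the ZIL globally (not recommended, and mentioned only to illustrate that it is a separate knob from the slog device) was the zil_disable tunable - a sketch, assuming the tunable is still present on your build:

# echo 'set zfs:zil_disable = 1' >> /etc/system     (takes effect at next boot)
# echo zil_disable/W0t1 | mdb -kw                   (takes effect immediately)

Later builds replaced this with a per-dataset 'sync' property.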
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
On Tue, May 25, 2010 at 8:47 AM, Reshekel Shedwitz reshe...@gmail.com wrote: Could this be related to the way FreeBSD's zfs partitioned my disk? I thought ZFS used EFI by default though (except for boot pools). Looks like it. Solaris thinks that it's EFI partitioned. By default, Solaris uses SMI for boot volumes, EFI for non-boot volumes. You could create a new pool (or use space on another existing pool) and move your data to it, then re-create your old pool. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?
On Tue, May 25, 2010 at 2:08 AM, Karl Pielorz kpielorz_...@tdx.co.uk wrote: I've tried contacting Intel to find out if it's true their enterprise SSD has no cache protection on it, and what the effect of turning the write The E in X25-E does not mean enterprise. It means extreme. Like the EE series CPUs that Intel offers. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Can I recover filesystem from an offline pool?
Hi All, is there any procedure to recover a filesystem from an offline pool, or to bring a pool online quickly? Here is my issue:
* One 700GB zpool.
* One filesystem with compression turned on (only using a few MB).
* Tried to migrate another filesystem from a different pool with a dedup stream, using zfs send -D | zfs receive.
* The system hung.
* Rebooted the system; the system would hang trying to recover or remove the snapshot on the 700GB zpool. The HD light would flash for hours, then go quiet, and the whole system would hang.
* Rebooted the system with the 700GB zpool's disk detached; the system booted up just fine. Attached the disk and ran zpool clear (-F) on the pool; the HD light would again flash for hours, then go quiet, and the whole system would hang.
I am not interested in the filesystem that is having problems. I would like to copy the data out of the first filesystem, which is only a few MB. Is there any way I can copy the data out, or remove the problem filesystem while the zpool is offline, or bring the pool online without running the recover/remove process on the problem filesystem? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
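One thing that may be worth trying before anything destructive, assuming the system is on build 128 or later where pool recovery support exists: import the pool in recovery mode, which rewinds to an earlier txg and discards the last few transactions (with luck, including the problematic receive). A sketch, with the pool name assumed to be tank:

# zpool import -nF tank     (dry run: reports what a rewind would discard, changes nothing)
# zpool import -F tank      (performs the rewind and imports the pool)

If the import succeeds, the small filesystem should be readable and the damaged one can then be destroyed.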
[zfs-discuss] multiple crashes upon boot after upgrading build 134 to 138, 139 or 140
Greetings, I see repeatable crashes on some systems after upgrading. The signature is always the same:

operating system: 5.11 snv_139 (i86pc)
panic message: BAD TRAP: type=e (#pf Page fault) rp=ff00175f88c0 addr=0 occurred in module genunix due to a NULL pointer dereference

list_remove+0x1b(ff03e19339f0, ff03e0814640)
zfs_acl_release_nodes+0x34(ff03e19339c0)
zfs_acl_free+0x16(ff03e19339c0)
zfs_znode_free+0x5e(ff03e17fa600)
zfs_zinactive+0x9b(ff03e17fa600)
zfs_inactive+0x11c(ff03e17f8500, ff03ee867528, 0)
fop_inactive+0xaf(ff03e17f8500, ff03ee867528, 0)
vn_rele_dnlc+0x6c(ff03e17f8500)
dnlc_purge+0x175()
nfs_idmap_args+0x5e(ff00175f8c38)
nfssys+0x1e1(12, 8047dd8)

The stack always looks like the above; the vnode involved is sometimes a file, sometimes a directory. For example, I have seen the /boot/acpi directory and the /kernel/drv/amd64/acpi_driver file in the vnode's path field. Looking at the data, I notice that z_acl.list_head indicates a single member in the list (presuming that is the case, because list_prev and list_next point to the same address):

(ff03e19339c0)::print zfs_acl_t
{
    z_acl_count = 0x6
    z_acl_bytes = 0x30
    z_version = 0x1
    z_next_ace = 0xff03e171d210
    z_hints = 0
    z_curr_node = 0xff03e0814640
    z_acl = {
        list_size = 0x40
        list_offset = 0
        list_head = {
            list_next = 0xff03e0814640
            list_prev = 0xff03e0814640
        }
    }
}

This member's next pointer is bad (sometimes zero, sometimes a low number, e.g. 0x10). The null pointer crash happens trying to follow the list_prev pointer:

0xff03e0814640::print zfs_acl_node_t
{
    z_next = {
        list_next = 0
        list_prev = 0
    }
    z_acldata = 0xff03e10b6230
    z_allocdata = 0xff03e171d200
    z_allocsize = 0x30
    z_size = 0x30
    z_ace_count = 0x6
    z_ace_idx = 0x2
}

This is a repeating pattern; it seems to me there is always a single zfs_acl_node in the list, with null / garbaged-out list_next and list_prev pointers. For example, in another instance of this crash, the zfs_acl_node looks like this:

::stack
list_remove+0x1b(ff03e10d24f0, ff03e0fc9a00)
zfs_acl_release_nodes+0x34(ff03e10d24c0)
zfs_acl_free+0x16(ff03e10d24c0)
zfs_znode_free+0x5e(ff03e10cc200)
zfs_zinactive+0x9b(ff03e10cc200)
zfs_inactive+0x11c(ff03e1281840, ff03ea5c7010, 0)
fop_inactive+0xaf(ff03e1281840, ff03ea5c7010, 0)
vn_rele_dnlc+0x6c(ff03e1281840)
dnlc_purge+0x175()
nfs_idmap_args+0x5e(ff001811ac38)
nfssys+0x1e1(12, 8047dd8)
_sys_sysenter_post_swapgs+0x149()

::status
... panic message: BAD TRAP: type=e (#pf Page fault) rp=ff001811a8c0 addr=10 occurred in module genunix due to a NULL pointer dereference

ff03e0fc9a00::print zfs_acl_node_t
{
    z_next = {
        list_next = 0xff03e10e1cd9
        list_prev = 0x10
    }
    z_acldata = 0
    z_allocdata = 0xff03e10cb5d0
    z_allocsize = 0x30
    z_size = 0x30
    z_ace_count = 0x6
    z_ace_idx = 0x2
}

Looks to me the crash here is the same, and list_next / list_prev are garbage. Has anybody seen this? Am I skipping too many versions when I am image-updating? I am hoping someone who knows this code will chime in. Steve -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
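For anyone wanting to reproduce this kind of analysis, the structures above were presumably examined with mdb against the saved crash dump. A minimal sketch, assuming savecore left the usual unix.N/vmcore.N pair under /var/crash/<hostname>; the addresses are of course specific to each dump:

# cd /var/crash/`hostname`
# mdb unix.0 vmcore.0
> ::status                           (panic string and dump metadata)
> ::stack                            (the panicking thread's stack, as shown above)
> ff03e19339c0::print zfs_acl_t      (walk the suspect structures by address)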
Re: [zfs-discuss] can you recover a pool if you lose the zil (b134+)
a manual recovery of missing top-level vdevs -- a rare event. Yes, but so rare that I never thought it would trouble me. In my mind it was only the slog, and losing the last few seconds wouldn't matter. So I don't have a backup, a snapshot, or the original zpool.cache file. Is there any solution for my problem? Thanks Ron -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
Hi-- I apologize for misunderstanding your original issue. Regardless of the original issues and the fact that current Solaris releases do not let you set the bootfs property on a pool that has a disk with an EFI label, the secondary bug here is not being able to remove a bootfs property on a pool that has a disk with an EFI label. If this helps with the migration of pools, then we should allow you to remove the bootfs property. I will file this bug on your behalf. In the meantime, I don't see how you can resolve the problem on this pool. Thanks, Cindy On 05/25/10 09:42, Reshekel Shedwitz wrote: Cindy, Thanks for your reply. The important details may have been buried in my post, so I will repeat them again to make it more clear: (1) This was my boot pool in FreeBSD, but I do not think the partitioning differences are really the issue. I can import the pool to nexenta/opensolaris just fine. Furthermore, this is *no longer* being used as a root pool in nexenta. I purchased an SSD for the purpose of booting nexenta. This pool is used purely for data storage - no booting. (2) I had to hack the code because zpool is forbidding me from adding or replacing devices - please see my logs in the previous post. zpool thinks this pool is a boot pool due to the bootfs flag being set, and zpool will not let me unset the bootfs property. So I'm stuck in a situation where zpool thinks my pool is a boot pool because of the bootfs property, and zpool will not let me unset the bootfs property. Because zpool thinks this pool is the boot pool, it is trying to forbid me from creating a configuration that isn't compatible with booting. In this situation, I am unable to add or replace devices without using my hacked version of zpool. I was able to hack the code to allow zpool to replace and add devices, but I was not able to figure out how to set the bootfs property back to the default value. Does this help explain my situation better? I think this is a bug, or maybe I'm missing something totally obvious. Thanks! ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!
eon:6:~#zdb -l /dev/rdsk/c1d0s0 LABEL 0 version: 22 name: 'videodrome' state: 0 txg: 55561 pool_guid: 5063071388564101079 hostid: 919514 hostname: 'Videodrome' top_guid: 15080595385902860350 guid: 12602499757569516679 vdev_children: 1 vdev_tree: type: 'raidz' id: 0 guid: 15080595385902860350 nparity: 1 metaslab_array: 23 metaslab_shift: 35 ashift: 9 asize: 6001149345792 is_log: 0 children[0]: type: 'disk' id: 0 guid: 5800353223031346021 path: '/dev/dsk/c1t0d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1123096/a' phys_path: '/p...@0,0/pci1043,8...@5/d...@0,0:a' whole_disk: 1 DTL: 30 children[1]: type: 'disk' id: 1 guid: 11924500712739180074 path: '/dev/dsk/c1t1d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1089951/a' phys_path: '/p...@0,0/pci1043,8...@5/d...@1,0:a' whole_disk: 1 DTL: 31 children[2]: type: 'disk' id: 2 guid: 6297108650128259181 path: '/dev/dsk/c10t0d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1089667/a' phys_path: '/p...@0,0/pci1043,8...@5,1/d...@0,0:a' whole_disk: 1 DTL: 32 children[3]: type: 'disk' id: 3 guid: 828343558065682349 path: '/dev/dsk/c0t1d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1098856/a' phys_path: '/p...@0,0/pci1043,8...@5,1/d...@1,0:a' whole_disk: 1 DTL: 33 children[4]: type: 'disk' id: 4 guid: 16604516587932073210 path: '/dev/dsk/c11t0d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1117911/a' phys_path: '/p...@0,0/pci1043,8...@5,2/d...@0,0:a' whole_disk: 1 DTL: 34 children[5]: type: 'disk' id: 5 guid: 12602499757569516679 path: '/dev/dsk/c11t1d0s0' devid: 'id1,s...@asamsung_hd103uj=s13pjdws256953/a' phys_path: '/p...@0,0/pci1043,8...@5,2/d...@1,0:a' whole_disk: 1 DTL: 57 LABEL 1 version: 22 name: 'videodrome' state: 0 txg: 55561 pool_guid: 5063071388564101079 hostid: 919514 hostname: 'Videodrome' top_guid: 15080595385902860350 guid: 12602499757569516679 vdev_children: 1 vdev_tree: type: 'raidz' id: 0 guid: 15080595385902860350 nparity: 1 metaslab_array: 23 metaslab_shift: 35 ashift: 9 asize: 6001149345792 is_log: 0 children[0]: type: 'disk' id: 0 guid: 5800353223031346021 path: '/dev/dsk/c1t0d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1123096/a' phys_path: '/p...@0,0/pci1043,8...@5/d...@0,0:a' whole_disk: 1 DTL: 30 children[1]: type: 'disk' id: 1 guid: 11924500712739180074 path: '/dev/dsk/c1t1d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1089951/a' phys_path: '/p...@0,0/pci1043,8...@5/d...@1,0:a' whole_disk: 1 DTL: 31 children[2]: type: 'disk' id: 2 guid: 6297108650128259181 path: '/dev/dsk/c10t0d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1089667/a' phys_path: '/p...@0,0/pci1043,8...@5,1/d...@0,0:a' whole_disk: 1 DTL: 32 children[3]: type: 'disk' id: 3 guid: 828343558065682349 path: '/dev/dsk/c0t1d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1098856/a' phys_path: '/p...@0,0/pci1043,8...@5,1/d...@1,0:a' whole_disk: 1 DTL: 33 children[4]: type: 'disk' id: 4 guid: 16604516587932073210 path: '/dev/dsk/c11t0d0s0' devid: 'id1,s...@awdc_wd20eads-00s2b0=_wd-wcavy1117911/a' phys_path: '/p...@0,0/pci1043,8...@5,2/d...@0,0:a' whole_disk: 1 DTL: 34 children[5]: type: 'disk' id: 5 guid: 12602499757569516679 path: '/dev/dsk/c11t1d0s0' devid: 'id1,s...@asamsung_hd103uj=s13pjdws256953/a' phys_path: '/p...@0,0/pci1043,8...@5,2/d...@1,0:a' whole_disk: 1
Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?
--On 25 May 2010 11:15 -0700 Brandon High bh...@freaks.com wrote: On Tue, May 25, 2010 at 2:08 AM, Karl Pielorz kpielorz_...@tdx.co.uk wrote: I've tried contacting Intel to find out if it's true their enterprise SSD has no cache protection on it, and what the effect of turning the write The E in X25-E does not mean enterprise. It means extreme. Like the EE series CPUs that Intel offers. Yet most of their web site seems to aim it quite firmly at the 'Enterprise' market: "Imagine replacing up to 50 high-RPM hard disk drives with one Intel® X25-E Extreme SATA Solid-State Drive in your servers" or "Enterprise applications place a premium on performance, reliability, power consumption and space." If you don't mind a little data loss risk? :) I'll post back when we've had a chance to try one in the 'real world' for our applications - with and without caching, especially when the plug gets pulled :) Otherwise, at least on the surface, the quest for the 'perfect' (performance, safety, price, size) ZIL continues... -Karl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] multiple crashes upon boot after upgrading build 134 to 138, 139 or 140
As I look at this further, I have convinced myself this should really be an assert. (I am running release builds, so asserts do not fire.) I think in a debug build, I should be seeing the !list_empty() assert in:

list_remove(list_t *list, void *object)
{
	list_node_t *lold = list_d2l(list, object);
	ASSERT(!list_empty(list));
	ASSERT(lold->list_next != NULL);
	list_remove_node(lold);
}

I suspect maybe this is a race; assuming there is no other interfering thread, this crash could never happen:

static void
zfs_acl_release_nodes(zfs_acl_t *aclp)
{
	zfs_acl_node_t *aclnode;

	while (aclnode = list_head(&aclp->z_acl)) {
		list_remove(&aclp->z_acl, aclnode);
		zfs_acl_node_free(aclnode);
	}
	aclp->z_acl_count = 0;
	aclp->z_acl_bytes = 0;
}

list_head() does a list_empty() check, and returns NULL on empty. So if we got past that, list_remove() should never find an empty list; perhaps there is interference from another thread. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] can you recover a pool if you lose the zil (b134+)
On May 25, 2010, at 12:33 PM, R. Eulenberg wrote: a manual recovery of missing top-level vdevs -- a rare event. Yes, but so rare that I never thought troubling me. In my mind it was only the slog and loosing the last few seconds doesn't wrong. So I don't have a backup, a snapshot neither the original zpool.cache file. Is there any solution for my problem? The description that Peter Woodman put together is a good reference. http://github.com/pjjw/logfix If you don't know the GUID, then it is a rather long trial-and-error process. Or you might recompile the source and rip out the parts looking for the separate log. -- richard -- ZFS and NexentaStor training, Rotterdam, July 13-15, 2010 http://nexenta-rotterdam.eventbrite.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
On Tue, May 25, 2010 at 1:58 AM, Reshekel Shedwitz reshe...@spam.la wrote: I am migrating a pool from FreeBSD 8.0 to OpenSolaris (Nexenta 3.0 RC1). I am in what seems to be a weird situation regarding this pool. Maybe someone can help. I used to boot off of this pool in FreeBSD, so the bootfs property got set: I think everyone missed the completely obvious implication: FreeBSD allows bootfs to be set on EFI partitioned disks. It might allow the property to be unset as well. Can you boot a FreeBSD live cd and unset the zpool property? If not, you could comment out the check that Andrew Gabriel identified and rebuild the zpool command. Once unset, you should be able to use the distro-supplied binary. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
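For reference, FreeBSD's zpool lets the property be cleared by setting it to an empty value, so from a FreeBSD Fixit shell the sequence should be roughly as follows (the pool name comes from this thread; everything else, including the -f import and the empty-value reset, is an assumption about this particular setup):

# zpool import -f tank
# zpool set bootfs="" tank     (an empty value resets bootfs to its default)
# zpool export tank

after which the pool can be re-imported under Nexenta with the stock tools.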
Re: [zfs-discuss] unsetting the bootfs property possible? imported a FreeBSD pool
Cindy, Thanks. Same goes to everyone else on this thread. I actually solved the issue - I booted back into FreeBSD's Fixit mode and was still able to import the pool (wouldn't have been able to if I upgraded the pool version!). FreeBSD's zpool command allowed me to unset the bootfs property. I guess that should have been more obvious to me. At least now I'm in good shape as far as this pool goes - zpool won't complain when I try to replace disks or add cache. Might be worth documenting this somewhere as a gotcha when migrating from FreeBSD to OpenSolaris. Thanks! -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] USB Flashdrive as SLOG?
Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Kyle McDonald I've been thinking lately that I'm not sure I like the root pool being unprotected, but I can't afford to give up another drive bay. I'm guessing you won't be able to use the USB thumb drives as a boot device. But that's just a guess. However, I see nothing wrong with mirroring your primary boot device to the USB. At least in this case, if the OS drive fails, your system doesn't crash. You're able to swap the OS drive and restore your OS mirror. That led me to wonder whether partitioning out 8 or 12 GB on a 32GB thumb drive would be beneficial as a slog? I think the only way to find out is to measure it. I do have an educated guess, though: I don't think even the fastest USB flash drives are able to work with significantly low latency. That's based on measurements I made years ago, so again I emphasize, the only way to find out is to test it. One thing you could check, which does get you a lot of mileage for free: make sure your HBA has a BBU, and enable WriteBack. In my measurements, this gains about 75% of the benefit that log devices would give you. There are, or at least have been, some issues with ZFS and USB devices. Here's one that is still open: Bug 4755 - ZFS boot does not work with removable media (usb flash memory) http://defect.opensolaris.org/bz/show_bug.cgi?id=4755 Regarding performance... USB flash drives vary significantly in performance between brands and models. Some get close to USB 2.0 theoretical limits, others just barely exceed USB 1.1. Vista and Windows 7 support the use of USB flash drives for ReadyBoost, a caching system to reduce application load times. Windows tests have shown that with enough RAM, ReadyBoost caching offers little additional performance (as Windows does make use of system RAM for file caching too). I think using good USB flash drives has the potential to improve performance, and if you can keep mirrored flash drives on different, dedicated USB controllers, that will help performance the most. If USB support in OpenSolaris is poor and has weak performance, I wonder if an iSCSI target created out of the USB device on a Linux or Windows system on the same network might be able to offer better performance. Even if latency goes to 2-3ms, that's still much better than the 8.5 ms random seek times on a 7200 rpm hard disk. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
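On the root-mirror idea: attaching a USB device to the root pool is straightforward as long as the stick carries an SMI label with a slice 0 at least as large as the existing root slice. A rough sketch, assuming the root pool lives on c6t4d0s0 and the thumb drive shows up as c7t0d0 (both device names are placeholders, not taken from this thread):

# zpool attach rpool c6t4d0s0 c7t0d0s0
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c7t0d0s0

Whether the BIOS (or OpenSolaris itself, per bug 4755 above) will actually boot from the USB half of the mirror is another matter, but the mirror at least keeps a second copy of the root pool on hand.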
Re: [zfs-discuss] [indiana-discuss] image-update doesn't work anymore (bootfs not supported on EFI)
On Wed, 2010-05-05 at 10:35 -0600, Evan Layton wrote: Do you have any of the older BEs like build 134 that you can boot back to and see if those will allow you to set the bootfs property on the root pool? It's just really strange that out of nowhere it started thinking that the device is EFI labeled. I have a couple of BEs I could boot to:

$ beadm list
BE              Active Mountpoint Space  Policy Created
--              ------ ---------- -----  ------ -------
opensolaris     -      -          1.00G  static 2009-10-01 08:00
opensolaris-124 -      -          20.95M static 2009-10-03 13:30
opensolaris-125 -      -          30.00M static 2009-10-17 15:18
opensolaris-126 -      -          25.33M static 2009-10-29 20:18
opensolaris-127 -      -          1.37G  static 2009-11-14 13:20
opensolaris-128 -      -          1.91G  static 2009-12-04 14:28
opensolaris-129 -      -          22.49M static 2009-12-12 11:31
opensolaris-130 -      -          21.64M static 2009-12-26 19:46
opensolaris-131 -      -          24.72M static 2010-01-22 22:51
opensolaris-132 -      -          57.32M static 2010-02-09 23:05
opensolaris-133 -      -          1.07G  static 2010-02-20 12:55
opensolaris-134 N      /          43.17G static 2010-03-08 21:58
opensolaris-138 R      -          1.81G  static 2010-05-04 12:03

I will try on 132 or 133. Get back to you later. Thanks! Sorry, I kind of forgot :-)

r...@macbook:~# uname -a
SunOS macbook 5.11 snv_132 i86pc i386 i86pc Solaris
r...@macbook:~# zpool set bootfs=rpool/ROOT/opensolaris-132 rpool
cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices

-- Christian ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hybrid drive: flash and platters
Hello, As an avid fan of the application of flash technologies to the storage stratum, I researched the DmCache project (maintained here). It appears that the DmCache project is quite a bit behind L2ARC but headed in the right direction. I found the LWN article very interesting as it is effectively a Linux application of L2ARC to improve MySQL performance. I had proposed the same idea in my blog post titled Filesystem Cache Optimization Strategies. The net there is that if you can cache the data in the filesystem cache, you can improve overall performance by reducing the I/O to disk. I had hoped to have someone do some benchmarking of MySQL in a cache-optimized server with F20 PCIe flash cards but never got around to it. So, if you want to get all of the caching benefits of DmCache, just run your app on Solaris 10 today. ;-) Have a great day! Brad -- Brad Diggs | Principal Security Sales Consultant | +1.972.814.3698 | Oracle North America Technology Organization, 16000 Dallas Parkway, Dallas, TX 75248 | eMail: brad.di...@oracle.com | Tech Blog: http://TheZoneManager.com | LinkedIn: http://www.linkedin.com/in/braddiggs On May 21, 2010, at 8:00 PM, David Magda wrote: Seagate is planning on releasing a disk that's part spinning rust and part flash: http://www.theregister.co.uk/2010/05/21/seagate_momentus_xt/ The design will have the flash be transparent to the operating system, but I wish they would have some way to access the two components separately. ZFS could certainly make use of it, and Linux is also working on a capability: http://kernelnewbies.org/KernelProjects/DmCache http://lwn.net/Articles/385442/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss