Re: [zfs-discuss] Setting up ZFS on AHCI disks
Hi, are the drives properly configured in cfgadm?

Cheers, Tonmaus
Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd
On Fri, Apr 16, 2010 at 12:21 AM, george b...@otenet.gr wrote:
> hi all, i'm brand new to opensolaris ... feel free to call me noob :) i need to build a home server for media and general storage, and zfs sounds like the perfect solution, but i need to buy an 8-port (or more) SATA controller. any suggestions for OpenSolaris-compatible products will be really appreciated. for the moment i've made a silly purchase, spending 250 euros on an LSI MegaRAID 8208ELP, which is SR and not compatible with OpenSolaris... thanx in advance, G

Depends on what sort of interface you're looking for. The Supermicro AOC-SAT2-MV8s work great. They're PCI-X based, 8-port, come with SATA cables, and are relatively cheap ($150 most places).

--Tim
Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd
Hello. If you are looking for PCI-e (8x), I would recommend a SAS/SATA controller with the LSI 1068E SAS chip; they are nearly perfect with OpenSolaris. You must look for a controller with IT firmware (JBOD mode), not those with RAID enabled (IR mode). Normally the cheaper variants are the right ones.

One of the cheapest of all, and my favourite, is the Supermicro USAS-L8i: http://www.supermicro.com/products/accessories/addon/AOC-USAS-L8i.cfm (about 100 euro in Germany). Although it is UIO (mounted on the wrong side, for special Supermicro cases), that's normally not a problem because it's internal only (you will normally lose one slot).

See my hardware: http://www.napp-it.org/hardware/

Gea
Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd
On Fri, Apr 16, 2010 at 1:57 AM, Günther a...@hfg-gmuend.de wrote:
> if you are looking for pci-e (8x), i would recommend a sas/sata controller with the lsi 1068E sas chip. they are nearly perfect with opensolaris. you must look for a controller with it firmware (jbod mode), not those with raid enabled (ir mode). [...] one of the cheapest of all, and my favourite, is the supermicro usas-l8i [...]

The firmware can be flashed regardless of what the card came with. Why would you buy the UIO card when you can get the Intel SASUC8i for the same price or cheaper, and it comes in a standard form factor? The (potential) problem with the 1068 cards is that they don't support AHCI with SATA.

--Tim
Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd
On Thu, Apr 15 at 23:57, Günther wrote:
> if you are looking for pci-e (8x), i would recommend a sas/sata controller with the lsi 1068E sas chip. they are nearly perfect with opensolaris.

For just a bit more, you can get the LSI SAS 9211-8i card, which is 6 Gbit/s. It works fine for us, and does JBOD no problem.

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] crypted zvol bandwith = lofidevice=`pfexec lofiadm -a /dev/zvol/rdsk/$volumepath -c aes-256-cbc`
On 16/04/2010 10:19, Mickael Lambert wrote:
> First, great thanks for this great technology that is ZFS! Now I need some advice about a weird thing I just found out: it seems the I/O on my encrypted zvol is three times more than the corresponding I/O off the pool on the lofi device. I have attached a file containing all the information I know about; it should be very easy to reproduce. Could someone explain that behavior to me?

Encryption costs, and in this setup you have multiple layers too: a ZFS pool on top of lofi (doing the encryption) on top of a ZVOL which is in your rpool. So any write to the filesystems in apool has to go to ZFS, then to lofi (and be encrypted), then to the ZVOL that is in your rpool; that alone, without the encryption, adds to the I/O requirements.

A fair comparison would be to do the same setup, with the multiple pools and lofi, but without having lofi do encryption. That would tell you the overhead of the encryption that lofi does.

-- Darren J Moffat
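For the record, a minimal sketch of the control setup Darren suggests. Pool and volume names here are hypothetical; the only difference between the two stacks is the -c option to lofiadm:

  # encrypted stack, as in the original setup:
  pfexec zfs create -V 10g rpool/cryptovol
  cryptodev=`pfexec lofiadm -a /dev/zvol/rdsk/rpool/cryptovol -c aes-256-cbc`
  pfexec zpool create cryptpool $cryptodev

  # control stack: identical layering, no encryption:
  pfexec zfs create -V 10g rpool/plainvol
  plaindev=`pfexec lofiadm -a /dev/zvol/rdsk/rpool/plainvol`
  pfexec zpool create plainpool $plaindev

Running the same workload against cryptpool and plainpool separates the cost of aes-256-cbc from the cost of the pool-on-lofi-on-zvol stacking.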
Re: [zfs-discuss] Snapshots and Data Loss
Richard,

> Applications can take advantage of this and there are services available to integrate ZFS snapshots with Oracle databases, Windows clients, etc.

Which services are you referring to?

Best regards, Maurilio.
Re: [zfs-discuss] ZFS for ISCSI ntfs backing store.
For ease of administration with everyone in the department, I'd prefer to keep everything consistent in the Windows world.
Re: [zfs-discuss] Setting up ZFS on AHCI disks
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Tonmaus
> are the drives properly configured in cfgadm?

I agree. You need to run these:

  devfsadm -Cv
  cfgadm -al
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
Hi Richard, thanks for your time, I really appreciate it, but I'm still unclear on how this works. So uberblocks point to the MOS. Why do you then require multiple uberblocks? Or are there actually multiple MOSes? Or is there one MOS and multiple deltas to it (and its predecessors), and do the uberblocks then point to the latest delta? In the latter case I can understand why nullifying the latest uberblocks reverts to a previous situation; otherwise I don't see the difference between nullifying the first uberblocks and nullifying the last uberblocks.

Thanks, Fred
Re: [zfs-discuss] Setting up ZFS on AHCI disks
devfsadm -Cv gave a lot of "removing file" messages, apparently for items that were not relevant. cfgadm -al says, about the disks:

  sata0/0::dsk/c13t0d0  disk  connected  configured  ok
  sata0/1::dsk/c13t1d0  disk  connected  configured  ok
  sata0/2::dsk/c13t2d0  disk  connected  configured  ok
  sata0/3::dsk/c13t3d0  disk  connected  configured  ok

I still get the same error message, but I'm guessing now that means I have to create a partition on the device. However, I am still stymied for the time being: fdisk can't open any of the /dev/rdsk/c13t*d0p0 devices. I tried running format, and get this:

  AVAILABLE DISK SELECTIONS:
    0. c12d1 DEFAULT cyl 19454 alt 2 hd 255 sec 63
       /p...@0,0/pci-...@1f,1/i...@0/c...@1,0
    1. c13t0d0 drive type unknown
       /p...@0,0/pci1043,8...@1f,2/d...@0,0
    2. c13t1d0 drive type unknown
       /p...@0,0/pci1043,8...@1f,2/d...@1,0
    3. c13t2d0 drive type unknown
       /p...@0,0/pci1043,8...@1f,2/d...@2,0
    4. c13t3d0 drive type unknown
       /p...@0,0/pci1043,8...@1f,2/d...@3,0
  Specify disk (enter its number): 1
  Error: can't open disk '/dev/rdsk/c13t0d0p0'.
  AVAILABLE DRIVE TYPES:
    0. Auto configure
    1. other
  Specify disk type (enter its number): 0
  Auto configure failed
  No Solaris fdisk partition found.

At this point, I'm not sure whether to run fdisk, format, or something else. I tried fdisk, partition and label, but got the message "Current Disk Type is not set". I expect this is a problem because of the "drive type unknown" appearing on the drives. I gather from another thread that I need to run fdisk, but I haven't been able to do it.
Re: [zfs-discuss] Setting up ZFS on AHCI disks
Your adapter read-outs look quite different from mine. I am on ICH-9, snv_133. Maybe that's why. But I thought I should ask on that occasion:

- build?
- do the drives currently support the SATA-2 standard (by model, by jumper settings)?
- could it be that the Areca controller has done something to them partition-wise?

Regards, Tonmaus
Re: [zfs-discuss] Secure delete?
On Thu, 15 Apr 2010, Eric D. Mudama wrote:
> The purpose of TRIM is to tell the drive that some # of sectors are no longer important so that it doesn't have to work as hard in its internal garbage collection.

The sector size does not typically match the FLASH page size, so the SSD still has to do some heavy lifting. It has to keep track of many small holes in the FLASH pages. This seems pretty complicated, since all of this information needs to be well-preserved in non-volatile storage.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
Re: [zfs-discuss] Secure delete?
On 4/16/2010 10:30 AM, Bob Friesenhahn wrote:
> The sector size does not typically match the FLASH page size so the SSD still has to do some heavy lifting. It has to keep track of many small holes in the FLASH pages. This seems pretty complicated since all of this information needs to be well-preserved in non-volatile storage.

But doesn't the TRIM command help here? If, as the OS goes along, it marks sectors as unused, then the SSD has a lighter weight lift: it only needs to read, for example, 1 sector out of 8 (assuming sectors of 512 bytes and 4K FLASH pages) before writing a new page with that 1 sector and 7 new ones. Additionally, in the background I would think it would be able to find a page with 3 in-use sectors and another with 5, for example, write all 8 to a new page, remap those sectors to the new location, and then pre-erase the 2 pages just freed up. How doesn't that help?

-Kyle
Re: [zfs-discuss] Secure delete?
On Fri, 16 Apr 2010, Kyle McDonald wrote:
> But doesn't the TRIM command help here? [...] Additionally in the background I would think it would be able to find a page with 3 in-use sectors and another with 5, for example, write all 8 to a new page, remap those sectors to the new location, and then pre-erase the 2 pages just freed up.

While I am not an SSD designer, I agree with you that a smart SSD designer would include a dereferencing table which maps sectors to pages so that FLASH pages can be completely filled, even if the stored sectors are not contiguous. It would allow sectors to be migrated to different pages in the background in order to support wear leveling and compaction. This is obviously challenging to do if FLASH is used to store this dereferencing table and the device does not at least include a super-capacitor which assures that the table will be fully written on power fail. If the table is corrupted, then the device is bricked.

It is much more efficient (from a housekeeping perspective) if filesystem sectors map directly to SSD pages, but we are not there yet.

As a devil's advocate, I am still waiting for someone to post a URL to a serious study which proves the long-term performance advantages of TRIM.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
Re: [zfs-discuss] ZFS for ISCSI ntfs backing store.
I have used build 124 in this capacity, although I did zero tuning. I had about 4T of data on a single 5T iSCSI volume over gigabit. The Windows server was a VM, and the OpenSolaris box is a Dell 2950, 16G of RAM, an X25-E for the ZIL, no L2ARC cache device. I used COMSTAR. It was being used as a target for Doubletake, so it only saw write I/O, with very little read. My load testing using iometer was very positive, and I would not have hesitated to use it as the primary node serving about 1000 users, maybe 200-300 active at a time.

Scott
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
Hi Fred,

Have you read the ZFS On-Disk Format Specification paper at http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf ?

Fred wrote:
> So uberblocks point to the MOS. Why do you then require multiple uberblocks? Or are there actually multiple MOSes? [...] otherwise I don't see the difference between nullifying the first uberblocks and nullifying the last uberblocks.

One reason for multiple uberblocks is that uberblocks, like everything else, are copy-on-write. The reason you have 4 copies (2 labels at the front and 2 labels at the end of every disk) is redundancy. No, there are not multiple MOSes in one pool (though there may be multiple copies of the MOS via ditto blocks). The current (or active) uberblock is the one with the highest transaction id and a valid checksum. Transaction ids are basically monotonically increasing, so nullifying the last uberblock can revert you to a previous state.

max
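A quick way to see this yourself: zdb can display the active uberblock (its transaction id and timestamp) and dump the on-disk labels that carry the uberblock arrays. Pool name and device path below are hypothetical:

  # show the active uberblock (highest valid txg) of a pool:
  zdb -u tank

  # dump a vdev's four labels (each label also contains an uberblock array):
  zdb -l /dev/rdsk/c13t0d0s0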
[zfs-discuss] cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices
I am getting the following error, even though, as you can see below, this is an SMI label:

  cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices

  # zpool get bootfs rpool
  NAME   PROPERTY  VALUE   SOURCE
  rpool  bootfs    -       default
  # zpool set bootfs=rpool/ROOT/s10s_u8wos_08a rpool
  cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices

  partition> pri
  Current partition table (original):
  Total disk cylinders available: 1989 + 2 (reserved cylinders)

  Part      Tag    Flag     Cylinders      Size         Blocks
    0       root    wm      0 - 1988      69.93GB    (1989/0/0) 146644992
    1 unassigned    wm      0                  0     (0/0/0)            0
    2     backup    wm      0 - 1988      69.93GB    (1989/0/0) 146644992
    3 unassigned    wm      0                  0     (0/0/0)            0
    4 unassigned    wm      0                  0     (0/0/0)            0
    5 unassigned    wm      0                  0     (0/0/0)            0
    6 unassigned    wm      0                  0     (0/0/0)            0
    7 unassigned    wm      0                  0     (0/0/0)            0

Any ideas as to why?

Thanks
Re: [zfs-discuss] cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices
Hi Tony,

Is this on an x86 system? If so, you might also check whether this disk has a Solaris fdisk partition or an EFI fdisk partition. If it has an EFI fdisk partition, then you'll need to change it to a Solaris fdisk partition. See the pointers below.

Thanks, Cindy

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide
Replacing/Relabeling the Root Pool Disk

  # fdisk /dev/rdsk/c1t1d0p0
  selecting c1t1d0p0
  Total disk size is 8924 cylinders
  Cylinder size is 16065 (512 byte) blocks

                              Cylinders
  Partition  Status  Type     Start  End   Length   %
  =========  ======  =======  =====  ====  ======  ===
      1              EFI          0  8924    8925  100
  . . .
  Enter Selection:

On 04/16/10 11:13, Tony MacDoodle wrote:
> I am getting the following error, however as you can see below this is a SMI label... cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices [...] Any ideas as to why?
[zfs-discuss] Making ZFS better: file/directory granularity in-place rollback
AFAIK, if you want to restore a snapshot version of a file or directory, you need to use cp or some such command to copy the snapshot version into the present. This is not done in-place, meaning the cp or whatever tool must read the old version of the objects and write new copies of them. You may avoid the new disk space consumption if you have dedup enabled, but you will not avoid the performance hit of requiring a complete read and write of all the bytes of all the objects. Performance is nothing like a simple re-link of the snapshot-restored object, and the newly restored objects are not guaranteed identical to the old version, because cp or whatever could be changing permissions and timestamps and such, according to the behavior of whatever tool is being used.

So the suggestion, or question, is: is it possible, or planned, to implement a rollback command that works as fast as a link or re-link operation, implemented at a file or directory level, instead of for the entire filesystem?
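For contrast, a sketch of the copy-based restore described above, with a hypothetical pool and snapshot name. Every byte is read from the snapshot and rewritten into the live filesystem, and even with -p the result is a new object, not a relink:

  cp -p /tank/.zfs/snapshot/hourly-2010-04-16-12-00-00/home/user/somefile \
        /tank/home/user/somefile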
[zfs-discuss] Making ZFS better: zfshistory
If you've got nested ZFS filesystems, and you're in some subdirectory where there's a file or something you want to roll back, it's presently difficult to know how far back up the tree you need to go to find the correct .zfs subdirectory; then you need to figure out the names of the snapshots available, and then you need to perform the restore, even after you figure all that out. This is pretty good, but it could be better. The solution should be cross-platform compatible, because the user who wishes to perform such operations may be working across an NFS or CIFS mountpoint.

If something like this already exists, please let me know. Otherwise, I plan to create a zfshistory command, written in python (open source, public, free). zfshistory would have the following subcommands:

  ls        list snapshots that contain the specified file or directory
  cp        copy a file or directory from a past snapshot to a new name in the current version of the filesystem
  rollback  delete the present version of a named file or directory, and replace it with the specified past snapshot version
  locate    list complete paths to all the snapshot versions of the specified file or directory

Example usage:

  rm somefile    (Whoops!)
  zfshistory ls somefile
    somefile@frequent-2010-04-16-12-45-00
    somefile@frequent-2010-04-16-12-30-00
    somefile@frequent-2010-04-16-12-15-00
    somefile@frequent-2010-04-16-12-00-00
    somefile@hourly-2010-04-16-12-00-00
  zfshistory cp somefile@frequent-2010-04-16-12-45-00 ./mynewfilename
  zfshistory rollback somefile
    (restores somefile to the latest snapshot available)
  zfshistory rollback somefile@frequent-2010-04-16-12-00-00
    (restores somefile to the specified snapshot)
  zfshistory locate somefile
    /tank/.zfs/snapshot/frequent-2010-04-16-12-45-00/home/username/somefile
    /tank/.zfs/snapshot/frequent-2010-04-16-12-30-00/home/username/somefile
    /tank/.zfs/snapshot/frequent-2010-04-16-12-15-00/home/username/somefile
    /tank/.zfs/snapshot/frequent-2010-04-16-12-00-00/home/username/somefile
    /tank/.zfs/snapshot/hourly-2010-04-16-12-00-00/home/username/somefile
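The "how far back up the tree" part is the only genuinely fiddly piece. A minimal, untested shell sketch of that step (the function name is made up; real code would also canonicalize symlinks and '..' components):

  # walk up from a path until a directory containing .zfs/snapshot is found;
  # with nested filesystems, the first hit is the dataset controlling the path
  find_dotzfs() {
      dir=$(cd "$(dirname "$1")" && pwd)
      while [ "$dir" != / ]; do
          [ -d "$dir/.zfs/snapshot" ] && { echo "$dir/.zfs"; return 0; }
          dir=$(dirname "$dir")
      done
      [ -d /.zfs/snapshot ] && { echo /.zfs; return 0; }
      return 1
  }

  find_dotzfs /tank/home/username/somefile    # would print /tank/.zfs here

The explicit -d test works even when snapdir=hidden, because .zfs is always reachable by name, including over NFS.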
[zfs-discuss] Making ZFS better: rm files/directories from snapshots
The typical problem scenario is: some user or users fill up the filesystem. They rm some files, but disk space is not freed. You need to destroy all the snapshots that contain the deleted files before the disk space is available again.

It would be nice if you could rm files from snapshots without needing to destroy the whole snapshot. Is there any existing work or solution for this?
Re: [zfs-discuss] Setting up ZFS on AHCI disks
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Willard Korfhage
> devfsadm -Cv gave a lot of removing file messages, apparently for items that were not relevant.

That's good. If there were no necessary changes, devfsadm would say nothing.

> I still get the same error message, but I'm guessing now that means I have to create a partition on the device. However, I am still stymied

There should be no need to create partitions. Something simple like this should work:

  zpool create junkfooblah c13t0d0

And if it doesn't work, try zpool status just to verify, for certain, that the device is not already part of any pool.

> for the time being. fdisk can't open any of the /dev/rdsk/c13t*d0p0 devices. I tried running format, and get this

There may be something weird happening in your system. I can't think of any reason for that behavior, unless you simply have a SATA card that has no proper driver support from OpenSolaris while in AHCI mode.

> Error: can't open disk '/dev/rdsk/c13t0d0p0'.

Yeah. Weird.

> AVAILABLE DRIVE TYPES: 0. Auto configure 1. other Specify disk type (enter its number): 0 Auto configure failed No Solaris fdisk partition found.

Yeah. Weird.
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
On Fri, Apr 16 at 13:56, Edward Ned Harvey wrote:
> The typical problem scenario is: Some user or users fill up the filesystem. They rm some files, but disk space is not freed. You need to destroy all the snapshots that contain the deleted files, before disk space is available again. It would be nice if you could rm files from snapshots, without needing to destroy the whole snapshot. Is there any existing work or solution for this?

Doesn't that defeat the purpose of a snapshot?

If this is a real problem, I think it calls for putting that user's files in a separate filesystem that can have its snapshots managed with a specific policy for addressing the usage model.

--eric

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] Secure delete?
On Fri, Apr 16 at 10:05, Bob Friesenhahn wrote:
> It is much more efficient (from a housekeeping perspective) if filesystem sectors map directly to SSD pages, but we are not there yet.

How would you stripe or manage a dataset across a mix of devices with different geometries? That would break many of the assumptions made by filesystems today. I would argue it's easier to let the device virtualize this mapping and present a consistent interface, regardless of the underlying geometry.

> As a devil's advocate, I am still waiting for someone to post a URL to a serious study which proves the long-term performance advantages of TRIM.

I am absolutely sure these studies exist, but as to some entity publishing a long-term analysis that cost real money (many thousands of dollars) to create, I have no idea if data like that exists in the public domain where anyone can see it. I can virtually guarantee every storage, SSD, and OS vendor is generating that data internally, however.

--eric

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] Secure delete?
edm == Eric D Mudama edmud...@bounceswoosh.org writes:

  edm> How would you stripe or manage a dataset across a mix of
  edm> devices with different geometries?

the ``geometry'' discussed is 1-dimensional: sector size. The way that you do it is to align all writes, and never write anything smaller than the sector size. The rule is very simple, and you can also start or stop following it at any moment without rewriting any of the dataset, and still get the full benefit.
Re: [zfs-discuss] Setting up ZFS on AHCI disks
No Areca controller on this machine. It is a different box, and the drives are just plugged into the SATA ports on the motherboard. I'm running build snv_133, too. The drives are recent 1.5TB drives, 3 Western Digital and 1 Seagate, if I recall correctly. They ought to support SATA-2. They are brand new, and haven't been used before. I have the feeling I'm missing some simple, obvious step because I'm still pretty new to OpenSolaris.
Re: [zfs-discuss] Secure delete?
On Fri, 16 Apr 2010, Eric D. Mudama wrote:
> How would you stripe or manage a dataset across a mix of devices with different geometries? That would break many of the assumptions made by filesystems today. I would argue it's easier to let the device virtualize this mapping and present a consistent interface, regardless of the underlying geometry.

You must have misunderstood me. I was talking about functionality built into the device. As far as filesystems go, filesystems typically allocate much larger blocks than the sector size.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
Re: [zfs-discuss] Secure delete?
On Fri, Apr 16 at 14:42, Miles Nordin wrote:
> the ``geometry'' discussed is 1-dimensional: sector size. The way that you do it is to align all writes, and never write anything smaller than the sector size. The rule is very simple, and you can also start or stop following it at any moment without rewriting any of the dataset and still get the full benefit.

The response was regarding a filesystem with knowledge of the NAND geometry, to align writes to exact page granularity. My question was how to implement that if not all devices in a stripe set have the same page size.

What you're suggesting is exactly what SSD vendors already do. They present a 512B standard host interface sector size, and perform their own translations and management inside the device.

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] Secure delete?
On 16 apr 2010, at 17.05, Bob Friesenhahn wrote:
> While I am not a SSD designer, I agree with you that a smart SSD designer would include a dereferencing table which maps sectors to pages so that FLASH pages can be completely filled, even if the stored sectors are not contiguous. It would allow sectors to be migrated to different pages in the background in order to support wear leveling and compaction. This is obviously challenging to do if FLASH is used to store this dereferencing table and the device does not at least include a super-capacitor which assures that the table will be fully written on power fail. If the table is corrupted, then the device is bricked.

This is exactly how they work, at least most of them. If they had to erase and reprogram each flash block (say, 128 KB blocks times the parallelism of the drive) for each 512 B block written, they would wear out in no time and performance would be horrible.

Eventually they have to gc because they are out of erased blocks, and then they have to copy the data in use to new places. In that process, it of course helps if some of the data is tagged as not needed anymore; the drive can then compact the data much more efficiently, and it doesn't have to copy around a lot of data that won't be used. It should also help save copy/erase cycles in the drive, since the data it moves is much more likely to actually be in use and probably won't be overwritten as fast as a non-used block; it will in effect pack data actually in use into flash blocks. If the disk is nearly full, TRIM likely doesn't make much difference. I'd guess TRIM should be very useful on a slog device.

> It is much more efficient (from a housekeeping perspective) if filesystem sectors map directly to SSD pages, but we are not there yet.

I agree with Eric that it could very well be better to let the device virtualize the thing and have control of, and knowledge of, all the hardware-specific implementation details. Flash chips, controllers, bus drivers and configurations don't all come equal.

> As a devil's advocate, I am still waiting for someone to post a URL to a serious study which proves the long-term performance advantages of TRIM.

That would sure be interesting!

/ragge
Re: [zfs-discuss] Making ZFS better: zfshistory
On Fri, Apr 16, 2010 at 01:54:45PM -0400, Edward Ned Harvey wrote:
> If you've got nested zfs filesystems, and you're in some subdirectory where there's a file or something you want to rollback, it's presently difficult to know how far back up the tree you need to go, to find the correct .zfs subdirectory, and then you need to figure out the name of the snapshots available, and then you need to perform the restore, even after you figure all that out.

I've a ksh93 script that lists all the snapshotted versions of a file... Works over NFS too.

  % zfshist /usr/bin/ls
  History for /usr/bin/ls (/.zfs/snapshot/*/usr/bin/ls):
  -r-xr-xr-x 1 root bin 33416 Jul  9  2008 /.zfs/snapshot/install/usr/bin/ls
  -r-xr-xr-x 1 root bin 37612 Nov 21  2008 /.zfs/snapshot/2009-12-07-20:47:58/usr/bin/ls
  -r-xr-xr-x 1 root bin 37612 Nov 21  2008 /.zfs/snapshot/2009-12-01-00:42:30/usr/bin/ls
  -r-xr-xr-x 1 root bin 37612 Nov 21  2008 /.zfs/snapshot/2009-07-17-21:08:45/usr/bin/ls
  -r-xr-xr-x 1 root bin 37612 Nov 21  2008 /.zfs/snapshot/2009-06-03-03:44:34/usr/bin/ls
  %

It's not perfect (e.g., it doesn't properly canonicalize its arguments, so it doesn't handle symlinks and '..'s in paths), but it's a start.

Nico
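The script itself wasn't posted; for readers who want to try the idea, here is a minimal sketch in the same spirit (not Nicolas's code; it assumes the file lives in the pool's root filesystem, whose snapshots sit under /.zfs, and it skips the canonicalization he mentions):

  zfshist() {
      typeset f=${1#/}    # path relative to the filesystem root
      echo "History for /$f (/.zfs/snapshot/*/$f):"
      ls -ltr /.zfs/snapshot/*/"$f" 2>/dev/null
  }

The ls -ltr matches the behavior discussed downthread: versions come out sorted by mtime, oldest first, rather than by snapshot name.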
Re: [zfs-discuss] Making ZFS better: zfshistory
On Apr 16, 2010, at 1:37 PM, Nicolas Williams wrote:
> I've a ksh93 script that lists all the snapshotted versions of a file... Works over NFS too. [...] It's not perfect (e.g., it doesn't properly canonicalize its arguments, so it doesn't handle symlinks and '..'s in paths), but it's a start.

There are some interesting design challenges here. For the general case, you can't rely on the snapshot name to be in time order, so you need to sort by the mtime of the destination. It would be cool to only list files which are different. If you mv a file to another directory, you might want to search by filename or a partial directory+filename. Regexp :-)

Or maybe you just set up your tracker.cfg and be happy?

-- richard
Re: [zfs-discuss] Making ZFS better: zfshistory
On Fri, Apr 16, 2010 at 02:19:47PM -0700, Richard Elling wrote:
> There are some interesting design challenges here. For the general case, you can't rely on the snapshot name to be in time order, so you need to sort by the mtime of the destination.

I'm using ls -ltr.

> It would be cool to only list files which are different.

True. That'd not be hard.

> If you mv a file to another directory, you might want to search by filename or a partial directory+filename.

Or even inode number.

> Or maybe you just setup your tracker.cfg and be happy?

Exactly.

Nico
[zfs-discuss] ZFS mirror
I have a question. I have a disk on which Solaris 10 ZFS is installed. I want to add other disks and replace this disk with another (three others in total). If I do this and add some other disks, will the data be written immediately? Or is only the new data mirrored? Or should I use snapshots to do the replacement?

Thanks!
Re: [zfs-discuss] ZFS mirror
On 04/17/10 09:34 AM, MstAsg wrote:
> I have a question. I have a disk on which Solaris 10 ZFS is installed. I want to add other disks and replace this disk with another (three others in total). If I do this and add some other disks, will the data be written immediately? Or is only the new data mirrored? Or should I use snapshots to do the replacement?

If you add a disk as a mirror, it will be resilvered as an exact copy (mirror!) of the original.

-- Ian.
Re: [zfs-discuss] Making ZFS better: zfshistory
On Fri, Apr 16, 2010 at 2:19 PM, Richard Elling richard.ell...@gmail.com wrote:
> Or maybe you just setup your tracker.cfg and be happy?

What's a tracker.cfg, and how would it help ZFS users on non-Solaris systems? ;)

-- Freddie Cash fjwc...@gmail.com
Re: [zfs-discuss] Setting up ZFS on AHCI disks
On Fri, Apr 16, 2010 at 11:46:01AM -0700, Willard Korfhage wrote:
> The drives are recent - 1.5TB drives

I'm going to bet this is a 32-bit system, and you're getting screwed by the 1TB limit that applies there. If so, you will find clues hidden in dmesg from boot time about this, as the drives are probed.

-- Dan.
Re: [zfs-discuss] ZFS mirror
On Apr 16, 2010, at 2:49 PM, Ian Collins wrote:
> If you add a disk as a mirror, it will be resilvered as an exact copy (mirror!) of the original.

I'm sure Ian meant "attach" a disk as a mirror rather than "add" :-). In ZFS terminology, "attach" is used for mirrors (RAID-1) and "add" is used for stripes (RAID-0).

-- richard
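To make the distinction concrete (pool and device names are hypothetical):

  # attach: c0t1d0 becomes a mirror of the existing c0t0d0 vdev (RAID-1)
  zpool attach tank c0t0d0 c0t1d0

  # add: c0t2d0 becomes a new top-level vdev striped into the pool (RAID-0)
  zpool add tank c0t2d0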
Re: [zfs-discuss] ZFS mirror
MstAsg,

Is this the root pool disk? I'm not sure I'm following what you want to do, but I think you want to attach a disk to create a mirrored configuration, then detach the original disk. If this is a ZFS root pool that contains the Solaris OS, then follow these steps:

1. Attach disk-2:

   # zpool attach rpool disk-1 disk-2

2. Use the zpool status command to make sure the disk has resilvered completely.

3. Apply the bootblocks to disk-2:

   # installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/disk-2

4. Test that you can boot from disk-2.

5. Detach disk-1:

   # zpool detach rpool disk-1

If this isn't the root pool, then you can skip steps 2-4.

Thanks, Cindy
Re: [zfs-discuss] Setting up ZFS on AHCI disks
isainfo -k returns amd64, so I don't think that is the answer.
Re: [zfs-discuss] ZFS mirror
If this isn't a root pool disk, then skip steps 3-4. Letting the replacement disk resilver before removing the original disk is good advice for any configuration.

cs

On 04/16/10 16:15, Cindy Swearingen wrote:
> [...] If this isn't the root pool, then you can skip steps 2-4.
Re: [zfs-discuss] Secure delete?
edm == Eric D Mudama edmud...@bounceswoosh.org writes:

  edm> What you're suggesting is exactly what SSD vendors already do.

no, it's not. You have to do it for them.

  edm> They present a 512B standard host interface sector size, and
  edm> perform their own translations and management inside the device.

It is not nearly so magical! The pages are 2 - 4kB. They are this size for nothing to do with the erase block size or the secret blackbox filesystem running on the SSD. It's because of the ECC: the Reed-Solomon code for the entire page must be recalculated if any of the page is changed. Therefore, changing 0.5kB means:

for a 4kB page device:
 * read 4kB
 * write 4kB

for a 2kB page device:
 * read 2kB
 * write 2kB

and changing 4kB at offset (integer * 4kB) means:

for a 4kB device:
 * write 4kB

for a 2kB device:
 * write 4kB

It does not matter if all devices have the same page size or not. Just write at the biggest size, or write at the appropriate size if you can. The important thing is that you write a whole page, even if you just pad with zeroes, so the controller does not have to do any reading. Simple.

The problem with big-sector spinning hard drives and alignment/blocksize is exactly the same problem. non-ZFS people discuss it a lot because ZFS filesystems start at an (integer * rather large block) offset, thanks to all the disk label hokus pocus, but NTFS filesystems often start at 16065 * 0.5kB.
Re: [zfs-discuss] ZFS mirror
On 04/17/10 10:09 AM, Richard Elling wrote:
> I'm sure Ian meant "attach" a disk as a mirror rather than "add" :-). In ZFS terminology, "attach" is used for mirrors (RAID-1) and "add" is used for stripes (RAID-0).

Good catch Richard!

-- Ian.
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
Eric D. Mudama wrote:
> Doesn't that defeat the purpose of a snapshot? If this is a real problem, I think that it calls for putting that user's files in a separate filesystem that can have its snapshots managed with a specific policy for addressing the usage model.

There was a discussion on this a couple of months ago, and Eric hits the nail right on the head: you *don't* want to support such a feature, as it breaks the fundamental assumption about what a snapshot is (and represents). AFAIK, snapshots will remain forever read-only, with any action acting on the snapshot as a whole (delete/promote/clone, et al.), and there is no plan to ever change this design.

It's unfortunate that people can get themselves into a situation where they might need this ability, but, honestly, it's a self-inflicted wound, and there's only so much designers can do to prevent people from shooting themselves in the foot. It's also not a fatal problem - deleting snapshots is still possible, it's just not the simplest method of recovery.

-- Erik Trimble
Re: [zfs-discuss] Making ZFS better: file/directory granularity in-place rollback
Edward Ned Harvey wrote:
> So the suggestion, or question, is: is it possible, or planned, to implement a rollback command that works as fast as a link or re-link operation, implemented at a file or directory level, instead of for the entire filesystem?

Not to be a contrary person, but the job you describe above is properly the duty of a BACKUP system. Snapshots *aren't* traditional backups, though some people use them as such. While I see no technical reason why snapshots couldn't support some form of partial rollback, there's a whole bunch of other features that backup software provides that can't be shoehorned directly into snapshots, so why bother trying to add a feature that should properly reside elsewhere in the system?

-- Erik Trimble
[zfs-discuss] Making an rpool smaller?
When I set up my opensolaris system at home, I just grabbed a 160 GB drive that I had sitting around to use for the rpool. Now I'm thinking of moving the rpool to another disk, probably ssd, and I don't really want to shell out the money for two 160 GB drives. I'm currently using ~ 18GB in the rpool, so any of the ssd 'boot drives' being sold are large enough. I know I can't attach a device that much smaller to the rpool, however. Would it be possible to do the following?

1. Attach the new drives.
2. Reboot from LiveCD.
3. zpool create new_rpool on the ssd
4. zfs send all datasets from rpool to new_rpool
5. installgrub /boot/grub/stage1 /boot/grub/stage2 on the ssd
6. zpool export the rpool and new_rpool
7. 'zpool import new_rpool rpool' (This should rename it to rpool, right?)
8. Shut down and disconnect the old rpool drive.

This should work, right? I plan to test it on a VirtualBox instance first, but does anyone see a problem with the general steps I've laid out?

-B

-- Brandon High : bh...@freaks.com
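A sketch of steps 3-7 above as concrete commands (the device name c1t0d0s0 and the snapshot name are hypothetical, and an SMI-labeled slice on the ssd is assumed):

  zpool create -f new_rpool c1t0d0s0
  zfs snapshot -r rpool@migrate
  zfs send -R rpool@migrate | zfs recv -Fdu new_rpool
  installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0
  zpool export rpool
  zpool export new_rpool
  zpool import new_rpool rpool   # imports the new pool under the name rpool

As Frank points out later in this thread, you would also want to set the bootfs property on the new pool and check /etc/zfs/zpool.cache and vfstab before booting from it.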
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
On Fri, Apr 16, 2010 at 01:56:07PM -0400, Edward Ned Harvey wrote:
> It would be nice if you could rm files from snapshots, without needing to destroy the whole snapshot. Is there any existing work or solution for this?

See the archives. See the other replies to you already. Short version: no.

However, a script to find all the snapshots that you'd have to delete in order to delete some file might be useful. But really, only marginally so: you should send your snapshots to backup and clean them out from time to time anyway.

Nico
Re: [zfs-discuss] Making an rpool smaller?
On 04/17/10 11:41 AM, Brandon High wrote:
> Would it be possible to do the following? [steps 1-8 above] This should work, right? I plan to test it on a VirtualBox instance first, but does anyone see a problem with the general steps I've laid out?

It should work. You aren't changing your current rpool (and you could probably import it read-only for the copy), so it's there if things go tits up.

-- Ian.
Re: [zfs-discuss] Making ZFS better: zfshistory
On Apr 16, 2010, at 2:58 PM, Freddie Cash wrote:
> What's a tracker.cfg, and how would it help ZFS users on non-Solaris systems? ;)

Tracker is the GNOME answer to Spotlight: http://projects.gnome.org/tracker/

-- richard
Re: [zfs-discuss] Making ZFS better: zfshistory
From: Richard Elling [mailto:richard.ell...@gmail.com]
> There are some interesting design challenges here. For the general case, you can't rely on the snapshot name to be in time order, so you need to sort by the mtime of the destination.

Actually ...

  drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-01-00-00-00/
  drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-02-00-00-00/
  drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-03-00-00-00/
  drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-04-00-00-00/
  drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-05-00-00-00/
  drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-06-00-00-00/

Um ... all the same time. Even if I stat those directories, Access, Modify, and Change are all useless. How in the heck can you identify when a snapshot was taken, if you're not relying on the name of the snapshot?

> It would be cool to only list files which are different.

Know of any way to do that?
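One answer that depends on neither directory timestamps nor snapshot names: ZFS records a creation property for every snapshot, and zfs list can sort on it. A sketch, with a hypothetical dataset name:

  # creation time of one snapshot:
  zfs get creation tank/home@daily-2010-04-01-00-00-00

  # all snapshots of a dataset, oldest first:
  zfs list -t snapshot -o name,creation -s creation -r tank/home

This only works where you can run zfs(1M), though; over an NFS or CIFS mount of the .zfs directory, the snapshot name may really be all you have.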
Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd
Eric D. Mudama edmud...@bounceswoosh.org writes:
> For just a bit more, you can get the LSI SAS 9211-8i card which is 6 Gbit/s. It works fine for us, and does JBOD no problem.

I can't resist getting in a similar question here. It's not so easy to really get good info about this subject... there is a lot of info on the subject, but when you remove all the PCI-e info... maybe not so much. I will be needing a 4-or-more-port plain-PCI SATA controller soon, and would like to get one that can make use of the newest SATA (alleged) 3 Gbit/s transfer rates.

It's older base hardware... Athlon64 3400+ 2.2 GHz, 3GB RAM, with an A-Open AK86-L motherboard. So what do any of you know about a PCI card that fills the bill?
Re: [zfs-discuss] ZFS Performance
On Apr 14, 2010, at 11:10 PM, Daniel Carosone wrote:
> On Wed, Apr 14, 2010 at 09:58:50AM -0700, Richard Elling wrote:
> > YMMV. I have routinely completely filled zpools. There have been some improvements in performance of allocations when free space gets low in the past 6-9 months, so later releases are more efficient.
>
> Some weeks ago, I read with interest an excellent discussion of changes resulting in performance benefits for the fishworks platform, from Roch Bourbonnais. After all the analysis, three key changes are described in the penultimate paragraph. The first two of these basically adjust thresholds for existing behavioural changes (e.g. the switch from first-fit to best-fit); the last is an actual code change. I meant to ask at the time, and never followed up to do so, whether:
>
> - these changes are also/yet in onnv-gate zfs
> - which builds, if so
> - whether the altered thresholds are accessible as tunables, for older builds/in the meantime.

There are several:

  b114: 6596237 Stop looking and start ganging
  b129: 6869229 zfs should switch to shiny new metaslabs more frequently
  b138: 6917066 zfs block picking can be improved

There are probably a few more.

-- richard

> I've just added the above as a comment on the blog post, in the hopes of attracting Roch's attention there. There have been recent commits go by (b134) that seem promising too.
>
> -- Dan.
Re: [zfs-discuss] Making ZFS better: file/directory granularity in-place rollback
From: Erik Trimble [mailto:erik.trim...@oracle.com] Not to be a contrary person, but the job you describe above is properly the duty of a BACKUP system. Snapshots *aren't* traditional backups, though some people use them as such. While I see no technical reason why snapshots couldn't support some form of partial rollback, there's a whole bunch of other features that Backup software provides that can't be shoehorned directly into snapshots, so why bother trying to add a feature that should properly reside elsewhere in the system? I don't get what you're talking about. The disconnect is: I don't know why you would say this is a task for a backup system, when it's ideally suited to the snapshot system, provided snapshots are available. Allow me to rephrase, and see if that changes anything... One of the most valuable reasons to have snapshots is the ability for users to restore or examine past versions of files instantly, without the need for sysadmin assistance, or lag time waiting for tapes. Snapshots augment the backup system, but do not replace or eliminate the need for backups. Given a filesystem, and some snapshots of that filesystem, all the data for all the versions of all the files already exists on disk. It required essentially no time to link all the files to their respective snapshots. It would be nice to be able to link an old version of a file into the present filesystem, also in zero time. This is not in any way a suggestion of eliminating or replacing your backup system. Backups are always needed in *addition* to snapshots, just in case anything ever destroys the pool somehow. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
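(For reference, the per-file restore that already works today goes through the hidden .zfs directory; everything under .zfs/snapshot is read-only. The paths here are invented:

# cp /tank/home/.zfs/snapshot/daily-2010-04-01-00-00-00/report.odt /tank/home/report.odt

The point of the proposal above is that this cp rewrites the blocks, whereas an in-place relink would take neither extra time nor extra space.)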
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: Friday, April 16, 2010 7:35 PM Doesn't that defeat the purpose of a snapshot? Erik hits the nail right on the head: you *don't* want to support such a feature, as it breaks the fundamental assumption about what a snapshot is (and represents). Ok, point taken, but what you've stated is just an opinion. It's not a fundamental or mathematical necessity that a snapshot is, and always will be, 100% immutable. IMHO, for some people in some situations the assumption that a snapshot is identical to the way the FS was at some given time can be valuable. However: if your only option is to destroy the whole snapshot in order to free up disk space occupied by some files in the snapshot ... destroying the whole snapshot can, for some people in some situations, be even less desirable than destroying just the subset of files you want freed from disk. There could be value in the ability to destroy some subset of files without destroying the whole snapshot. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
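(Whichever side of the argument you take, it helps to see where the space actually sits before destroying anything. Both of these are stock ZFS queries; the dataset name is a placeholder:

# zfs get usedbysnapshots tank/home
# zfs list -t snapshot -o name,used,referenced -r tank/home

Note that "used" on a snapshot only counts blocks unique to that snapshot, so the space freed by destroying several overlapping snapshots can be larger than the sum of their used values.)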
Re: [zfs-discuss] Making an rpool smaller?
On 04/16/10 07:41 PM, Brandon High wrote:
1. Attach the new drives.
2. Reboot from LiveCD.
3. zpool create new_rpool on the ssd
Is step 2 actually necessary? Couldn't you create a new BE:
# beadm create old_rpool
# beadm activate old_rpool
# reboot
# beadm delete rpool
It's the same number of steps but saves the bother of making a zpool-version-compatible live CD. Also, how attached are you to the pool name rpool? I have systems with root pools called spool, tpool, etc., even one rpool-1 (because the text installer detected an earlier rpool on an iSCSI volume I was overwriting) and they all seem to work fine. Actually, my preferred method (if you really want the new pool to be called rpool) would be to do the 4-step rename on the ssd after all the other steps are done and you've successfully booted it. Then you always have the untouched old disk in case you mess up. Also (gurus please correct here), you might need to change step 3 to something like
# zpool create -f -o failmode=continue -R /mnt -m legacy rpool ssd
in which case you can recv to it without rebooting at all, and
# zpool set bootfs=...
You might also consider where you want swap to be and make sure that vfstab is correct on the old disk now that the root pool has a different name. There was detailed documentation on how to zfs send/recv root pools on the Sun ZFS documentation site, but right now it doesn't seem to be Googleable. I'm not sure your original set of steps will work without at least doing the above two. You might need to check to be sure the ssd has an SMI label. AFAIK the official syntax for installing the MBR is
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/ssd
Finally, you should check or delete /etc/zfs/zpool.cache because it will likely be incorrect on the ssd after recv'ing the snapshot. HTH -- Frank ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
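(Pulling Frank's pieces into one untested sketch; the device names, the pool name rpool2, and the BE name snv_133 are all placeholders, so check each step against the ZFS Administration Guide before relying on it:

# zpool create -f -o failmode=continue -R /mnt -m legacy rpool2 c1t1d0s0
# zfs snapshot -r rpool@migrate
# zfs send -R rpool@migrate | zfs recv -Fd rpool2
# zpool set bootfs=rpool2/ROOT/snv_133 rpool2
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t1d0s0

Then fix the vfstab, swap, and dump device references on the new pool before rebooting onto it.)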
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
From: Nicolas Williams [mailto:nicolas.willi...@oracle.com] you should send your snapshots to backup and clean them out from time to time anyways. When using ZFS as a filesystem in a fileserver, the desired auto-snapshot configuration is something like:
Every 15 mins for the most recent hour
Every hour for the most recent 24 hours
Every day for the most recent 30 days
Or something like that, with all snapshots older than 30 days automatically destroyed. A good sysadmin in this situation will send snaps to backup, and also allow the snaps to rotate automatically. There need not be an absolute rule that snapshots exist for the sole purpose of staging backups to removable media. Snapshots, in and of themselves, are useful as an augmentation to regular backups. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
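(On OpenSolaris, the time-slider auto-snapshot services already implement roughly that rotation. A sketch of wiring a dataset into them; the dataset name is an example:

# zfs set com.sun:auto-snapshot=true tank/home
# svcadm enable svc:/system/filesystem/zfs/auto-snapshot:frequent
# svcadm enable svc:/system/filesystem/zfs/auto-snapshot:hourly
# svcadm enable svc:/system/filesystem/zfs/auto-snapshot:daily

Each instance keeps its own retention count, so old snapshots rotate out automatically.)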
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
On 04/17/10 12:56 PM, Edward Ned Harvey wrote: From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: Friday, April 16, 2010 7:35 PM Doesn't that defeat the purpose of a snapshot? Erik hits the nail right on the head: you *don't* want to support such a feature, as it breaks the fundamental assumption about what a snapshot is (and represents). Ok, point taken, but what you've stated is just an opinion. It's not a fundamental or mathematical necessity that a snapshot is, and always will be, 100% immutable. But it is fundamental to ZFS: snapshot: A read-only version of a file system or volume at a given point in time. It is specified as filesystem@name or volume@name. IMHO, for some people in some situations the assumption that a snapshot is identical to the way the FS was at some given time can be valuable. However: if your only option is to destroy the whole snapshot in order to free up disk space occupied by some files in the snapshot ... destroying the whole snapshot can, for some people in some situations, be even less desirable than destroying just the subset of files you want freed from disk. There could be value in the ability to destroy some subset of files without destroying the whole snapshot. I can see your point; it can be really annoying when a very large file you want to delete to free space is locked up in snapshots. I've been there and it was a pain. Now I use nested filesystems for storing media files, so removing snapshots is more manageable, as in the sketch below. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
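(The nested-filesystem arrangement Ian mentions looks something like this, with invented names; each child dataset snapshots independently, so one huge file only pins the snapshots of its own dataset:

# zfs create tank/media
# zfs create tank/media/movies
# zfs create tank/media/music
# zfs set com.sun:auto-snapshot=false tank/media/movies)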
Re: [zfs-discuss] Making an rpool smaller?
On 04/16/10 08:57 PM, Frank Middleton wrote: AFAIK the official syntax for installing the MBR is # installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/ssd Sorry, that's for SPARC. You had the installgrub down correctly... ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
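(For the x86 case, the equivalent is installgrub against the slice of the new root pool; the device name is an example:

# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0)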
Re: [zfs-discuss] Making ZFS better: zfshistory
On Apr 16, 2010, at 5:33 PM, Edward Ned Harvey wrote: From: Richard Elling [mailto:richard.ell...@gmail.com] There are some interesting design challenges here. For the general case, you can't rely on the snapshot name to be in time order, so you need to sort by the mtime of the destination. Actually ...
drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-01-00-00-00/
drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-02-00-00-00/
drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-03-00-00-00/
drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-04-00-00-00/
drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-05-00-00-00/
drwxr-xr-x 16 root root 20 Mar 29 12:07 daily-2010-04-06-00-00-00/
Um ... all the same time. Even if I stat those directories, Access, Modify, and Change are all useless. which is why you need to stat the destination :-) How in the heck can you identify when a snapshot was taken, if you're not relying on the name of the snapshot? zfs list -t snapshot lists in time order. It would be cool to only list files which are different. Know of any way to do that? cmp. But since the world has moved on to time machine, time slider, and whatever Windows users use, is this tool relegated to the CLI dinosaurs? ;-) -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
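(For "only list files which are different", brute force over the snapshot directories works, if slowly on large filesystems; the paths are examples, and this assumes the GNU diff shipped in /usr/gnu/bin:

# /usr/gnu/bin/diff -rq /tank/home/.zfs/snapshot/daily-2010-04-01-00-00-00 /tank/home/.zfs/snapshot/daily-2010-04-02-00-00-00)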
Re: [zfs-discuss] Making ZFS better: rm files/directories from snapshots
Ian Collins wrote: On 04/17/10 12:56 PM, Edward Ned Harvey wrote: From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: Friday, April 16, 2010 7:35 PM Doesn't that defeat the purpose of a snapshot? Erik hits the nail right on the head: you *don't* want to support such a feature, as it breaks the fundamental assumption about what a snapshot is (and represents). Ok, point taken, but what you've stated is just an opinion. It's not a fundamental or mathematical necessity that a snapshot is, and always will be, 100% immutable. But it is fundamental to ZFS: snapshot: A read-only version of a file system or volume at a given point in time. It is specified as filesystem@name or volume@name. As Ian said. Also, take a look at the old conversation about writeable snapshots - sure, it's my /opinion/ that snapshots should be read-only, but it's also a fundamental design decision made when ZFS was started. There are actually non-trivial assumptions in the code about the immutability of a snapshot, and I'd not want to go down the path of changing a fundamental assumption of the whole system. The problem with mutable snapshots is that you've moved over into the realm of a versioning filesystem, and that's a whole 'nother rat's nest of complicated issues. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Making an rpool smaller?
On Fri, Apr 16, 2010 at 5:57 PM, Frank Middleton f.middle...@apogeect.com wrote: Is step 2 actually necessary? Couldn't you create a new BE:
# beadm create old_rpool
# beadm activate old_rpool
# reboot
# beadm delete rpool
Right now, my boot environments are named after the build each is running. I'm guessing that by 'rpool' you mean the current BE above.
bh...@basestar:~$ beadm list
BE      Active Mountpoint Space  Policy Created
------- ------ ---------- ------ ------ ----------------
snv_129 -      -          1.47M  static 2009-12-14 13:33
snv_133 NR     /          16.37G static 2010-02-23 18:54
So what you're suggesting is creating a new BE and booting to that to do the send | recv? Why would I want to destroy my current BE? You might also consider where you want swap to be and make sure that vfstab is correct on the old disk now that the root pool has a different name. That's the main reason for making sure the new pool is named rpool, so I don't have to chase down any references to the old name. There was detailed documentation on how to zfs send/recv root pools on the Sun ZFS documentation site, but right now it doesn't seem to be Googleable. I'm not sure your original set of steps will work without at least doing the above two. I figure that by booting to a live cd / live usb, the pool will not be in use, so there shouldn't be any special steps involved. I'll try out a few variations on a VM and see how it goes. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] recomend sata controller 4 Home server with zfs raidz2 and 8x1tb hd
On Fri, Apr 16, 2010 at 7:35 PM, Harry Putnam rea...@newsguy.com wrote: Eric D. Mudama edmud...@bounceswoosh.org writes: On Thu, Apr 15 at 23:57, Günther wrote: hello if you are looking for pci-e (8x), i would recommend sas/sata controller with lsi 1068E sas chip. they are nearly perfect with opensolaris. For just a bit more, you can get the LSI SAS 9211-8i card which is 6Gbit/s. It works fine for us, and does JBOD no problem. I can't resist getting in a similar question here. It's not so easy to really get good info about this subject... there is a lot of info on the subject, but once you remove all the PCI-e info ... maybe not so much. I will be needing a 4-or-more-port PCI SATA controller soon and would like to get one that can make use of the newest SATA's (alleged) 3 Gbit/s transfer rate. It's older base hardware... Athlon64 3400+ 2.2 GHz, 3 GB RAM, with an AOpen AK86-L motherboard. So what do any of you know about a PCI card that fills the bill? If you're talking about standard PCI, and not PCI-e or PCI-X, there's no reason to try to get a faster controller. A standard PCI slot can't even max out the first revision of SATA. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
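(The arithmetic, roughly: classic 32-bit/33 MHz PCI peaks at 4 bytes x 33 MHz, about 133 MB/s, shared by every device on the bus. SATA I signals at 1.5 Gbit/s, which after 8b/10b encoding is 150 MB/s per port, and 3 Gbit/s SATA is 300 MB/s per port. A single first-generation SATA port can already outrun the entire PCI bus.)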
[zfs-discuss] Is it safe/possible to idle HD's in a ZFS Vdev to save wear/power?
Hi all, I'm still an OpenSolaris noob, so please be gentle... I was just wondering if it is possible to spin down/idle/sleep hard disks that are part of a vdev pool SAFELY? My objective is to sleep the drives after X time when they're not being used, and spin them back up if required. This is for a home system, so I'm not fussed by a minor delay before data is accessible again. I realise it's not directly applicable, but in the Linux world this is possible using hdparm on mdadm devices; I can't find any info on doing anything similar in OpenSolaris, safely or otherwise. Could anyone shed some light here? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
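(One untested avenue is Solaris power management via power.conf(4); the device path below is an example, and be aware that anything which periodically touches the pool, such as atime updates or monitoring daemons, will keep spinning the disks back up. Add to /etc/power.conf:

device-thresholds /dev/dsk/c1t2d0 900s
autopm enable

then apply it with:

# pmconfig)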
Re: [zfs-discuss] Making an rpool smaller?
On 04/16/10 09:53 PM, Brandon High wrote: Right now, my boot environments are named after the build it's running. I'm guessing that by 'rpool' you mean the current BE above. No, I didn't :-(. Please ignore that part - too much caffeine :-). I figure that by booting to a live cd / live usb, the pool will not be in use, so there shouldn't be any special steps involved. Might be the easiest way. But I've never found having a different name for the root pool to be a problem. The lack, until recently, of a bootable CD for SPARC may have something to do with living with different names. It makes it easier to recv snapshots from different hosts and architectures, too. I'll try out a few variations on a VM and see how it goes. You'll need to do the zpool create with the legacy mount option, and set the bootfs property. Otherwise it looks like you are on the right path. Cheers -- Frank ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss