Re: [zfs-macos] pros/cons of multiple zfs filesystems
On 17 March 2014 at 05:00:25, roemer (uwe.ro...@gmail.com) wrote:
> Thanks for the detailed example!
>
> On Monday, 17 March 2014 07:34:45 UTC+11, dch wrote:
>> I've been a happy maczfs and also zfsosx user for several years now. [...] zfs send is a very easy way to do a very trustable backup, once you get past the first potentially large transfers.
>
> Can this happen bi-directionally? Or is it only applicable for creating 'read-only' replicas of a master filesystem onto some clients? I mean, what happens once you cloned one file system, sent it to your laptop, then edit on both the laptop and your ZFS server?

Then you’re screwed :-). It’s not duplicity or some other low-level sync tool, and it can’t merge changes made on both sides. I find it works best when you have a known master that you’re working off.

Slightly OT, but in FreeBSD with HAST you can do some gonzo crazy stuff: http://www.aisecure.net/2012/02/07/hast-freebsd-zfs-with-carp-failover/

>> All my source code work lives in a zfs case sensitive noatime copies=2 filesystem, and I replicate that regularly to my other boxes as required.
>
> How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even RAIDZ2) pool? RAIDZ would have all data stored redundantly already, so would 'copies=2' not end up in quadrupling the storage requirement if used on a raidz pool?

Yes, but in this case, the laptop isn’t redundant, and my data is precious. IIRC the whole repos dataset, even with history, is 40 GB, so that’s reasonable IMO.

>> For most customer projects I will have 3 or more VMs running different configs or operating systems under VMware Fusion. These each live in their own zfs filesystem, compressed lz4 noatime case sensitive. I snapshot these after creation using vagrant install, again after config, and the changes are replicated using zfs snapshots again to the other OSX system, and also to the remote FreeBSD box.
>
> I can see that zfs is really good for handling multiple virtual machines.

Yup, zfs rollback for testing deployments or upgrades is simply bliss.

>> In summary, I'm more than happy with the performance once I used ashift=12 and moved past 8GB ram. Datasets once you get used to them are extraordinarily useful -- snapshot your config just before a critical upgrade.
>
> I start seeing the potential in snapshots. In fact, I just realised that I do manual 'snapshots' on some of my repeating projects already for quite some time with annual clones of the previous directory structure. So ZFS snapshots would be a natural fit here.
>
> But regarding the memory consumption: What makes ZFS so memory hungry in your case?

I don’t think it’s very hungry actually. 4GB (under the old MacZFS 74.1) simply wasn’t enough and I’d get crashes. With 8GB that went away. Bearing in mind that with 16GB RAM I can run a web browser (oink, at least 1GB), a 20GB VM that’s been compressed into a 10GB RAMdisk, plus 1GB RAM for the VM, that seems pretty reasonable. That would leave roughly 4GB for ZFS and the normal OSX baseline stuff. I’m happy to report back with RAM usage if somebody tells me what z* incantation is needed.

> Do you use deduplication?

Never. But I do use cloned datasets a fair bit, which probably helps the situation a bit. The 2nd law of ZFS is not to use deduplication, even if you think you need it. IIRC the rough numbers are 1GB RAM per TB of storage, and I’d want ECC RAM for that.

BTW pretty sure the 1st law of ZFS is not to trust USB devices with your data.

--
Dave Cottlehuber
Sent from my PDP11

--
---
You received this message because you are subscribed to the Google Groups zfs-macos group.
To unsubscribe from this group and stop receiving emails from it, send an email to zfs-macos+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
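The snapshot-then-replicate workflow described in this message could be scripted roughly as follows. This is only a sketch under assumptions: the dataset `tank/repos`, the snapshot names, the host `backuphost` and the helper function names are all made up for illustration, and the functions merely build the command lines for review (a dry run) instead of executing anything.

```python
import shlex


def snapshot_cmd(dataset, snap):
    """Argument list for `zfs snapshot dataset@snap`."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def incremental_send_cmd(dataset, prev_snap, new_snap):
    """Argument list for an incremental stream between two snapshots."""
    return ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]


def remote_receive_cmd(host, target_dataset):
    """Argument list that receives the stream on another machine over ssh."""
    return ["ssh", host, "zfs", "receive", target_dataset]


def replication_pipeline(dataset, prev_snap, new_snap, host, target_dataset):
    """Return the shell pipeline as a string, for review before running it."""
    send = incremental_send_cmd(dataset, prev_snap, new_snap)
    recv = remote_receive_cmd(host, target_dataset)
    return f"{shlex.join(send)} | {shlex.join(recv)}"


if __name__ == "__main__":
    # Hypothetical names; adjust to your own pool layout.
    print(shlex.join(snapshot_cmd("tank/repos", "2014-03-17")))
    print(replication_pipeline("tank/repos", "2014-03-16", "2014-03-17",
                               "backuphost", "backup/repos"))
```

An incremental stream only applies cleanly if the receiving dataset has not been changed since the previous snapshot, which is why this works as one-way replication from a known master and not as a two-way sync.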
Re: [zfs-macos] pros/cons of multiple zfs filesystems
On Mon, Mar 17, 2014 at 3:35 AM, Dave Cottlehuber d...@jsonified.com wrote:
> On 17 March 2014 at 05:00:25, roemer (uwe.ro...@gmail.com) wrote:
>> How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even RAIDZ2) pool? RAIDZ would have all data stored redundantly already, so would 'copies=2' not end up in quadrupling the storage requirement if used on a raidz pool?
>
> Yes

No, RAIDZ does not store a second copy of your data. It splits your data across multiple drives and uses space equivalent to one drive to store parity information about the data, so that it can be mathematically made whole if one drive goes missing. RAIDZ2 or RAIDZ3 just raise the level of parity, i.e. the number of disk failures that can happen before data is lost, to two or three respectively. So the amount of space lost to parity is a constant: disk size x parity level. Thus, if you're using copies, the capacity available to that dataset is just (pool size - parity) / copies.

One of the nice things about using copies as opposed to mirroring is that you can set it per file system (i.e. dataset), whereas mirroring affects the entire vdev.

On the other hand, if you're using mirroring, then yes, turning on copies=2 does cut your storage space to pool size / 4. (Assuming two-way mirrors and that all datasets in the pool have this set.)

RAIDZ vs mirroring vs copies all comes down to trading off performance vs Reliability, Availability and Serviceability vs space. There are formulas for figuring all of this out. Start at Serve the Home's RAID Reliability Calculator* (http://www.servethehome.com/raid-calculator/raid-reliability-calculator-simple-mttdl-model/), which takes into account everything except increased file redundancy. For that there's this article: ZFS, Copies, and Data Protection (https://blogs.oracle.com/relling/entry/zfs_copies_and_data_protection). And for RAIDZ vs mirroring performance see When To (And Not To) Use RAID-Z (https://blogs.oracle.com/roch/entry/when_to_and_not_to).

Phil

* Note that the Mean Time to Data Loss calculated at this site, while being an industry standard, is essentially useless other than for getting a relative comparison of different configurations. For details see: Mean time to meaningless: MTTDL, Markov models, and storage system reliability (https://www.usenix.org/legacy/event/hotstorage10/tech/full_papers/Greenan.pdf).
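To make the space arithmetic in this message concrete, here is a rough sketch of the usable-capacity estimate. It deliberately ignores metadata, padding and allocation overhead, so the numbers are only good for comparing layouts; the function names and the example of four 2 TB disks are assumptions for illustration, not anything from the thread.

```python
def raidz_usable(disks, disk_size, parity):
    """Approximate usable space of one RAIDZ vdev: parity costs `parity` disks."""
    if not 1 <= parity <= 3:
        raise ValueError("RAIDZ parity level must be 1, 2 or 3")
    if disks <= parity:
        raise ValueError("need more disks than the parity level")
    return (disks - parity) * disk_size


def mirror_usable(disks, disk_size, ways=2):
    """Approximate usable space of a pool built from `ways`-way mirrors."""
    if disks % ways != 0:
        raise ValueError("disk count must be a multiple of the mirror width")
    return (disks // ways) * disk_size


def with_copies(usable, copies):
    """Capacity left for a dataset that stores every block `copies` times."""
    if copies not in (1, 2, 3):
        raise ValueError("copies must be 1, 2 or 3")
    return usable / copies


if __name__ == "__main__":
    raw = 4 * 2.0  # four hypothetical 2 TB disks, 8 TB raw
    z1 = raidz_usable(4, 2.0, parity=1)
    mir = mirror_usable(4, 2.0)
    print(f"RAIDZ1: {z1} TB, with copies=2: {with_copies(z1, 2)} TB")
    print(f"mirror: {mir} TB, with copies=2: {with_copies(mir, 2)} TB (raw/4 = {raw / 4})")
```

With these example numbers, copies=2 halves whatever is left after parity or mirroring: the RAIDZ1 pool drops from 6 TB to 3 TB for that dataset, while the two-way mirror ends up at a quarter of the raw capacity.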
Re: [zfs-macos] pros/cons of multiple zfs filesystems
Thanks for the response, Björn. The hint regarding dataset-specific snapshots is good, though I have to first think about how I would best make use of them. However, another point that you raised is interesting:

On Sunday, 16 March 2014 10:34:52 UTC+11, Bjoern Kahl wrote:
> [...] Under Mac OSX, a mounted file system comes at higher costs than on other Unix like operating systems, due to the Finder and MDS services, so I would not suggest to really try to have hundreds of file systems mounted at the same time. But any reasonable number (some 10) go without noticeable performance impact.

I would need about 10 separate mount points / data sets, so I guess this would be fine. MDS services however means Spotlight, but the MacZFS Wiki as well as several other posts on the web give the advice to switch off Spotlight for ZFS with

    mdutil -i off mountPoint

Why is Spotlight thought to be evil for ZFS? Or does your comment imply that this advice is outdated, and mds-indexing for ZFS mount points is ok nowadays? Note that I am mainly aiming to store static 'archival' data and documents on ZFS, not my main user directory.

> [...] Snapshots can also easily be used for real off-site backups by the zfs send / receive mechanism.

Haven't looked at send/receive yet, but if they require network connections, I am afraid classical ADSL speeds with max 1 MBit/s upload will not be much fun... And for periodic backup to an external HDD I was thinking about ChronoSync or simply rsync.

roemer
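The concern about a 1 MBit/s upload can be put into numbers with a quick back-of-the-envelope calculation. The sketch below ignores protocol overhead and compression, and the 40 GB and 200 MB stream sizes are purely illustrative assumptions, as are the function names.

```python
def upload_seconds(size_bytes, link_bits_per_second):
    """Time to push `size_bytes` through a link, ignoring protocol overhead."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * 8 / link_bits_per_second


def as_days(seconds):
    """Convert seconds to days."""
    return seconds / 86_400


if __name__ == "__main__":
    one_mbit = 1_000_000  # 1 Mbit/s ADSL upload
    full = upload_seconds(40 * 10**9, one_mbit)     # hypothetical 40 GB initial stream
    delta = upload_seconds(200 * 10**6, one_mbit)   # hypothetical 200 MB incremental
    print(f"initial: {as_days(full):.1f} days")
    print(f"incremental: {delta / 60:.0f} minutes")
```

Under these assumptions the initial full stream takes days while a modest incremental takes well under an hour, which fits the remark elsewhere in the thread that the first transfers are the potentially large ones.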
Re: [zfs-macos] pros/cons of multiple zfs filesystems
An advantage of snapshots shows up with active filesystems, such as those used by a database. For a consistent database backup you of course need to stop the program, then back up, then restart (or use some database tool if available). The time to create a snapshot is essentially zero, so stopping the database just long enough to take a snapshot is actually practical. Then you use your backup software of choice on the snapshot, not the active file system.

On Sun, Mar 16, 2014 at 7:16 AM, roemer uwe.ro...@gmail.com wrote:
> Thanks for the response, Björn. The hint regarding dataset-specific snapshots is good, though I have to first think about how I would best make use of them. However, another point that you raised is interesting:
>
> On Sunday, 16 March 2014 10:34:52 UTC+11, Bjoern Kahl wrote:
>> [...] Under Mac OSX, a mounted file system comes at higher costs than on other Unix like operating systems, due to the Finder and MDS services, so I would not suggest to really try to have hundreds of file systems mounted at the same time. But any reasonable number (some 10) go without noticeable performance impact.
>
> I would need about 10 separate mount points / data sets, so I guess this would be fine. MDS services however means Spotlight, but the MacZFS Wiki as well as several other posts on the web give the advice to switch off Spotlight for ZFS with
>
>     mdutil -i off mountPoint
>
> Why is Spotlight thought to be evil for ZFS? Or does your comment imply that this advice is outdated, and mds-indexing for ZFS mount points is ok nowadays? Note that I am mainly aiming to store static 'archival' data and documents on ZFS, not my main user directory.
>
>> [...] Snapshots can also easily be used for real off-site backups by the zfs send / receive mechanism.
>
> Haven't looked at send/receive yet, but if they require network connections, I am afraid classical ADSL speeds with max 1 MBit/s upload will not be much fun... And for periodic backup to an external HDD I was thinking about ChronoSync or simply rsync.
>
> roemer
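The stop, snapshot, restart sequence described in this message can be sketched as a small helper. This is an illustration under assumptions only: the three callables stand in for whatever actually stops the database, runs `zfs snapshot` and starts the database again, and the name `tank/db@nightly` is hypothetical.

```python
def snapshot_while_stopped(stop, take_snapshot, start):
    """Stop the service, snapshot, and always restart, even if the snapshot fails."""
    stop()
    try:
        return take_snapshot()
    finally:
        # Runs whether or not take_snapshot() raised, so the service comes back up.
        start()


if __name__ == "__main__":
    log = []
    # Stand-ins for real actions; in practice these would stop the database,
    # run `zfs snapshot`, and start the database again.
    name = snapshot_while_stopped(
        stop=lambda: log.append("stop db"),
        take_snapshot=lambda: log.append("snapshot") or "tank/db@nightly",
        start=lambda: log.append("start db"),
    )
    print(log, name)
```

The backup tool is then pointed at the snapshot afterwards, so the slow copy happens while the database is already running again.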
Re: [zfs-macos] pros/cons of multiple zfs filesystems
Thanks for the detailed example!

On Monday, 17 March 2014 07:34:45 UTC+11, dch wrote:
> I've been a happy maczfs and also zfsosx user for several years now. [...] zfs send is a very easy way to do a very trustable backup, once you get past the first potentially large transfers.

Can this happen bi-directionally? Or is it only applicable for creating 'read-only' replicas of a master filesystem onto some clients? I mean, what happens once you cloned one file system, sent it to your laptop, then edit on both the laptop and your ZFS server?

> All my source code work lives in a zfs case sensitive noatime copies=2 filesystem, and I replicate that regularly to my other boxes as required.

How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even RAIDZ2) pool? RAIDZ would have all data stored redundantly already, so would 'copies=2' not end up in quadrupling the storage requirement if used on a raidz pool?

> For most customer projects I will have 3 or more VMs running different configs or operating systems under VMware Fusion. These each live in their own zfs filesystem, compressed lz4 noatime case sensitive. I snapshot these after creation using vagrant install, again after config, and the changes are replicated using zfs snapshots again to the other OSX system, and also to the remote FreeBSD box.

I can see that zfs is really good for handling multiple virtual machines.

> [...] In summary, I'm more than happy with the performance once I used ashift=12 and moved past 8GB ram. Datasets once you get used to them are extraordinarily useful -- snapshot your config just before a critical upgrade.

I start seeing the potential in snapshots. In fact, I just realised that I do manual 'snapshots' on some of my repeating projects already for quite some time with annual clones of the previous directory structure. So ZFS snapshots would be a natural fit here.

But regarding the memory consumption: What makes ZFS so memory hungry in your case? Do you use deduplication?