Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-19 Thread Matt Elliott
Could you share your script?


Matt Elliott
melli...@ncsa.illinois.edu


On Mar 19, 2014, at 11:26 AM, Daniel Jozsef  wrote:

> Finder is one of them.
> 
> When I first migrated my Linux-created ZFS mirror pool over to ZEVO (after 
> tearing down my NAS box, and housing my data in a pair of simple Firewire 800 
> enclosures), I noticed that files that were there in the command line were 
> missing in Finder. Sometimes Finder would flash a folder full of files upon 
> opening, and then just make most of the icons (files) disappear.
> 
> These disappearing files were ones with native (non-ASCII) characters in their 
> names. Since the pool was already created, I had no option of setting 
> normalization at the ZFS level, so I just wrote a script to go through all the 
> files and rename them to FormD (decomposed) names. Finder suddenly started 
> working normally.
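
(For reference - this is not Daniel's actual script - a minimal sketch of such a 
FormD rename pass could look like the following. It assumes the pool is mounted 
at a hypothetical /Volumes/tank and that macOS's iconv provides the decomposed 
UTF-8-MAC encoding; treat it as an illustration, not a tested tool.)

    #!/bin/sh
    # Hedged sketch: rename every entry under the mount point to its
    # decomposed (FormD) equivalent. -depth renames children before parents.
    find /Volumes/tank -depth | while IFS= read -r path; do
        dir=$(dirname "$path"); name=$(basename "$path")
        nfd=$(printf '%s' "$name" | iconv -f UTF-8 -t UTF-8-MAC)
        [ "$name" != "$nfd" ] && mv "$path" "$dir/$nfd"
    done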



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-18 Thread roemer


On Wednesday, 19 March 2014 00:13:27 UTC+11, Daniel Becker wrote:
>
> On Mar 18, 2014, at 4:52 AM, roemer wrote:
>
> I am also not sure now whether the performance is still higher due to 
> parallel I/O (see comment above about constant number of disk ops per 
> RAIDZ)...
> At least it should be high enough to saturate a gigabit ethernet link (i.e. 
> 100 - 110 MB/s).
>
>
> For sequential transfers, RAIDZ with n disks will give you (n-1) times the 
> throughput of a single disk. It’s only IOPS (i.e., performance in 
> seek-dominated workloads / random I/O) where it doesn’t buy you anything, 
> and with those kinds of workloads you typically won’t reach that kind of 
> throughput anyway, at least not with conventional hard drives.
>

That's what I assumed too when reading that ZFS article, but the article didn't 
make it clear. Thanks for the clarification. For my typical SOHO usage pattern, 
I wouldn't expect IOPS to be the limiting factor. 



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-18 Thread Daniel Becker
On Mar 18, 2014, at 4:52 AM, roemer  wrote:

> I am also not sure now whether the performance is still higher due to 
> parallel I/O (see comment above about constant number of disk ops per 
> RAIDZ)...
> At least it should be high enough to saturate a gigabit ethernet link (i.e. 100 - 
> 110 MB/s).

For sequential transfers, RAIDZ with n disks will give you (n-1) times the 
throughput of a single disk. It’s only IOPS (i.e., performance in 
seek-dominated workloads / random I/O) where it doesn’t buy you anything, and 
with those kinds of workloads you typically won’t reach that kind of throughput 
anyway, at least not with conventional hard drives.
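
To put rough numbers on that for the 4-bay case discussed in this thread (assumed 
figures, not measurements):

    # Back-of-the-envelope, assuming ~120 MB/s sequential per conventional disk:
    #   4-disk RAIDZ1 : (4 - 1) x 120 MB/s ~= 360 MB/s sequential
    #   Gigabit link  : ~110 MB/s ceiling
    # So for sequential transfers the single GbE link, not the RAIDZ vdev,
    # would be the bottleneck.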



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-18 Thread roemer
Excellent post, many thanks for the links, especially about RAIDZ and the 
MTTDL metric problem. 

Now back to the question about RAIDZ and/or copies=X:
Both protect against data corruption on disk. RAIDZ does it with parity 
information at the whole-disk level, copies=X via extra in-pool copies of each 
file's blocks.
If the goal is to protect against whole-disk failures in a multi-disk 
setting, I would assume RAIDZ is more natural. 
It doesn't discriminate, though, about which files it protects - everything 
gets stored with additional parity information.

The interesting bit that I read in the 'ZFS, Copies, and Data Protection' 
article is that, in contrast to traditional RAID, RAIDZ seems not to give 
improved performance through striping - at least not in terms of disk-operation 
rate; it 'just' improves fail-safety. The article leaves open, though, whether 
each disk operation handles more data in the RAIDZ case: since one logical 
RAIDZ I/O still touches N-1 disk blocks, I would assume that data throughput 
per file access still increases with the number of disks N (for N>2).
Is my reasoning here correct?

The 'copies=X' parameter of zfs file systems seems to target settings with 
just a single disk, say a ZFS-formatted partition on a laptop drive, where 
RAID is not applicable. This is fine and IMHO makes a lot of sense. But I 
do not see the point in combining the two, as the parity information of 
RAIDZ already protects against data corruption and even disk loss. 

One interesting question is whether copies=X (X>=2) alone could do the same 
as RAIDZ on a purely striped disk pool.
I read in Oracle's zfs documentation that copies=X tries to store copies on 
different disks - but wouldn't a striped disk pool use all disks anyway?
Or am I incorrectly mixing up my understanding of traditional RAID0 
settings with the mechanics of zfs here?
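
To make the comparison concrete, here is a hedged sketch of the two layouts being 
contrasted (pool and disk names are hypothetical):

    # Striped pool relying on copies=2 only (no parity; a whole-disk failure
    # still takes the pool down - copies only guard against bad blocks):
    zpool create -o ashift=12 tank disk1 disk2 disk3 disk4
    zfs set copies=2 tank

    # RAIDZ1 pool: parity lets the pool survive the loss of any one disk:
    zpool create -o ashift=12 tank raidz disk1 disk2 disk3 disk4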

Some background on why I am asking all this:
I am playing with the idea of formatting a 4-disk 'JBOD' enclosure using zfs 
with a RAIDZ or even RAIDZ2 setup to protect against disk failures.
In my understanding this should also protect against single-file corruption 
and the ominous 'bit rot' - especially with RAIDZ2.
I would lose one or two disks' worth of capacity from the start, which I would 
be fine with; still, if I could regain some space with a different tactic, so 
much the better, as the enclosure has only 4 bays.
I am also not sure whether the performance is still higher due to 
parallel I/O (see the comment above about the constant number of disk ops per 
RAIDZ)...
At least it should be high enough to saturate a gigabit ethernet link (i.e. 100 
- 110 MB/s).


On Monday, 17 March 2014 22:23:18 UTC+11, Philip Robar wrote:

>
> On Mon, Mar 17, 2014 at 3:35 AM, Dave Cottlehuber wrote:
>
>> On 17. März 2014 at 05:00:25, roemer (uwe@gmail.com) wrote:
>>
>>> How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even
>>> RAIDZ2) pool? RAIDZ would have all data stored redundantly already, so
>>> would 'copies=2' not end up in quadrupling the storage requirement if used
>>> on a raidz pool?
>>
>> Yes
>
> So the amount of space lost to parity is fixed at roughly disk size x parity 
> level, regardless of how much you store. With copies, by contrast, the space 
> lost is just dataset size / copies. One of the nice things about using copies 
> as opposed to mirroring is that you can set it per file system (i.e. per 
> dataset), whereas mirroring affects the entire vdev.
>

> On the other hand, if you're using mirroring, then yes turning on copies=2 
> does cut your storage space to pool size / 4. (Assuming all datasets in the 
> pool have this set.)
>
> RAIDZ vs mirroring vs copies all comes down to trading off performance vs 
> Reliability, Availability and Serviceability vs space. There are formulas 
> for figuring all of this out. Start at Serve the Home's RAID Reliability 
> calculator*, which takes into account everything but increasing file 
> redundancy. For that there's this article: 'ZFS, Copies, and Data 
> Protection'. And for RAIDZ vs mirroring performance see 'When To (And Not To) 
> Use RAID-Z'.
>
>
> Phil
>
> * Note that the Mean Time to Data Loss calculated at this site, while 
> being an industry standard, is essentially useless other than for getting a 
> relative comparison of different configurations. For details see: 'Mean 
> time to meaningless: MTTDL, Markov models, and storage system reliability'.

Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-17 Thread Jason Belec
My wife and I both have our iTunes libraries on ZFS on the basement server, 
and each of our systems' user data is also on ZFS, backed up every 20 minutes to 
the basement server. This has been running for years under OS X, on both the 
current/stable builds and the old MacZFS. That server then forwards all the 
snapshots to another location, just in case - losing family photos is bad!

Currently, anything that must have HFS+ is being tested in a ZVOL (on the 
development builds), formatted as HFS+ with ZFS underneath. So far this has 
been quite good for Mail and seems to be Spotlight friendly - no guarantees 
yet - for those that want to try it.
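
For those that do, a hedged sketch of the ZVOL approach (names are hypothetical; 
check 'diskutil list' for the device node the ZVOL actually appears as):

    zfs create -V 50G tank/mailvol                 # block device backed by ZFS
    diskutil eraseVolume JHFS+ Mail /dev/disk5     # format it as journaled HFS+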


--
Jason Belec
Sent from my iPad

> On Mar 17, 2014, at 7:46 AM, Jason Belec  wrote:
> 
> Good man.
> 
> 
> --
> Jason Belec
> Sent from my iPad
> 
>>> On Mar 17, 2014, at 3:35 AM, Dave Cottlehuber  wrote:
>>> 
>>> On 17. März 2014 at 05:00:25, roemer (uwe.ro...@gmail.com) wrote:
>>> Thanks for the detailed example!
>>> 
 On Monday, 17 March 2014 07:34:45 UTC+11, dch wrote:
 
 I've been a happy maczfs and also zfsosx user for several years now.
 [...]
 zfs send is a very easy way to do a very trustable
 backup, once you get past the first potentially large transfers.
 
 Can this happen bi-directionally? Or is it only applicable for creating
>>> 'read-only' replicas of a master filesystem onto some clients?
>>> I mean, what happens once you cloned one file system, sent it to your
>>> laptop, then edit on both the laptop and your ZFS server?
>> 
>> Then you’re screwed :-). It’s not duplicity or some other low-level sync
>> tool. I find it works best when you have a known master that you’re working
>> off.
>> 
>> Slightly OT, but in FreeBSD with HAST you can do some gonzo crazy stuff:
>> http://www.aisecure.net/2012/02/07/hast-freebsd-zfs-with-carp-failover/
>> 
 All my source code & work lives in a zfs case sensitive noatime
 copies=2 filesystem, and I replicate that regularly to my other boxes
 as required.
 
 How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even
>>> RAIDZ2) pool?
>>> RAIDZ would have all data stored redundantly already, so would 'copies=2'
>>> not end up in quadrupling the storage requirement if used on a raidz pool?
>> 
>> Yes, but in this case, the laptop isn’t redundant, and my data is precious.
>> IIRC the whole repos dataset, even with history, is < 40 GB, so that’s
>> reasonable IMO.
>> 
 For most customer projects I will have 3 or more VMs running different
 configs or operating systems under VMWare Fusion. These each live in
 their own zfs filesystem, compressed lz4 noatime case sensitive. I
 snapshot these after creation using vagrant install, again after
 config, and the changes are replicated using zfs snapshots again to
 the other OSX system, and also to the remote FreeBSD box.
 
 I can see that zfs is really good for handling multiple virtual machines.
>> 
>> Yup, zfs rollback for testing deployments or upgrades is simply bliss.
>> 
>>> In summary, I'm more than happy with the performance once I used
 ashift=12 and moved past 8GB ram. Datasets once you get used to them
 are extraordinarily useful -- snapshot your config just before a
 critical upgrade.
 
 I'm starting to see the potential of snapshots. In fact, I just realised that I
>>> do manual
>>> 'snapshots' on some of my repeating projects already for quite some time
>>> with annual
>>> clones of the previous directory structure. So ZFS snapshots would be a
>>> natural fit here.
>>> 
>>> But regarding the memory consumption:
>>> What makes ZFS so memory hungry in your case?
>> 
>> I don’t think it’s very hungry actually. 4GB (under the old MacZFS 74.1)
>> simply wasn’t enough and I’d get crashes. With 8GB that went away. Bearing
>> in mind with 16GB RAM I can run a web browser (oink at least 1GB), a 20GB VM
>> that’s been compressed into a 10GB RAMdisk, +1 GB RAM for the VM, that seems
>> pretty reasonable. That would leave 4GB for ZFS and the normal OSX baseline
>> stuff roughly.
>> 
>> I’m happy to report back with RAM usage if somebody tells me what z* 
>> incantation is needed.
>> 
>>> Do you use deduplication?
>> 
>> Never. But I do use cloned datasets a fair bit, which probably helps the
>> situation a bit.
>> 
>> The 2nd law of ZFS is not to use deduplication, even if you think you need 
>> it.
>> IIRC the rough numbers are 1GB RAM / TB storage, and I’d want ECC RAM for 
>> that.
>> 
>> BTW pretty sure the 1st law of ZFS is not to trust USB devices with your 
>> data.
>> 
>> --  
>> Dave Cottlehuber
>> Sent from my PDP11
>> 
>> 
>> 
> 

Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-17 Thread Jason Belec
Good man.


--
Jason Belec
Sent from my iPad

> On Mar 17, 2014, at 3:35 AM, Dave Cottlehuber  wrote:
> 
>> On 17. März 2014 at 05:00:25, roemer (uwe.ro...@gmail.com) wrote:
>> Thanks for the detailed example!
>> 
>>> On Monday, 17 March 2014 07:34:45 UTC+11, dch wrote:
>>> 
>>> I've been a happy maczfs and also zfsosx user for several years now.
>>> [...]
>>> zfs send is a very easy way to do a very trustable
>>> backup, once you get past the first potentially large transfers.
>>> 
>>> Can this happen bi-directionally? Or is it only applicable for creating
>> 'read-only' replicas of a master filesystem onto some clients?
>> I mean, what happens once you cloned one file system, sent it to your
>> laptop, then edit on both the laptop and your ZFS server?
> 
> Then you’re screwed :-). It’s not duplicity or some other low-level sync
> tool. I find it works best when you have a known master that you’re working
> off.
> 
> Slightly OT, but in FreeBSD with HAST you can do some gonzo crazy stuff:
>  http://www.aisecure.net/2012/02/07/hast-freebsd-zfs-with-carp-failover/
> 
>>> All my source code & work lives in a zfs case sensitive noatime
>>> copies=2 filesystem, and I replicate that regularly to my other boxes
>>> as required.
>>> 
>>> How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even
>> RAIDZ2) pool?
>> RAIDZ would have all data stored redundantly already, so would 'copies=2'
>> not end up in quadrupling the storage requirement if used on a raidz pool?
> 
> Yes, but in this case, the laptop isn’t redundant, and my data is precious.
> IIRC the whole repos dataset, even with history, is < 40 GB, so that’s
> reasonable IMO.
> 
>>> For most customer projects I will have 3 or more VMs running different
>>> configs or operating systems under VMWare Fusion. These each live in
>>> their own zfs filesystem, compressed lz4 noatime case sensitive. I
>>> snapshot these after creation using vagrant install, again after
>>> config, and the changes are replicated using zfs snapshots again to
>>> the other OSX system, and also to the remote FreeBSD box.
>>> 
>>> I can see that zfs is really good for handling multiple virtual machines.
> 
> Yup, zfs rollback for testing deployments or upgrades is simply bliss.
> 
>> In summary, I'm more than happy with the performance once I used
>>> ashift=12 and moved past 8GB ram. Datasets once you get used to them
>>> are extraordinarily useful -- snapshot your config just before a
>>> critical upgrade.
>>> 
>>> I'm starting to see the potential of snapshots. In fact, I just realised that I
>> do manual
>> 'snapshots' on some of my repeating projects already for quite some time
>> with annual
>> clones of the previous directory structure. So ZFS snapshots would be a
>> natural fit here.
>> 
>> But regarding the memory consumption:
>> What makes ZFS so memory hungry in your case?
> 
> I don’t think it’s very hungry actually. 4GB (under the old MacZFS 74.1)
> simply wasn’t enough and I’d get crashes. With 8GB that went away. Bearing
> in mind with 16GB RAM I can run a web browser (oink at least 1GB), a 20GB VM
> that’s been compressed into a 10GB RAMdisk, +1 GB RAM for the VM, that seems
> pretty reasonable. That would leave 4GB for ZFS and the normal OSX baseline
> stuff roughly.
> 
> I’m happy to report back with RAM usage if somebody tells me what z* 
> incantation is needed.
> 
>> Do you use deduplication?
> 
> Never. But I do use cloned datasets a fair bit, which probably helps the
>  situation a bit.
> 
> The 2nd law of ZFS is not to use deduplication, even if you think you need it.
> IIRC the rough numbers are 1GB RAM / TB storage, and I’d want ECC RAM for 
> that.
> 
> BTW pretty sure the 1st law of ZFS is not to trust USB devices with your data.
> 
> --  
> Dave Cottlehuber
> Sent from my PDP11
> 
> 
> 



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-17 Thread Philip Robar
On Mon, Mar 17, 2014 at 3:35 AM, Dave Cottlehuber  wrote:

> On 17. März 2014 at 05:00:25, roemer (uwe.ro...@gmail.com) wrote:
>
> > How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even
> > RAIDZ2) pool? RAIDZ would have all data stored redundantly already, so
> > would 'copies=2' not end up in quadrupling the storage requirement if used
> > on a raidz pool?

> Yes


No, RAIDZ does not store your data redundantly. It splits your data across
multiple drives and uses space equivalent to one drive to store parity
information about the data so that it can be mathematically made whole if
one drive goes missing. RAIDZ2 or RAIDZ3 just raise the level of parity,
i.e. the number of disk failures that can be tolerated before data is lost, to
two or three respectively.

So the amount of space lost to parity is fixed at roughly disk size x parity
level, regardless of how much you store. With copies, by contrast, the space
lost is just dataset size / copies. One of the nice things about using copies
as opposed to mirroring is that you can set it per file system (i.e. per
dataset), whereas mirroring affects the entire vdev.

On the other hand, if you're using mirroring, then yes turning on copies=2
does cut your storage space to pool size / 4. (Assuming all datasets in the
pool have this set.)
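
Worked through for a hypothetical 4 x 2 TB pool (8 TB raw), the trade-off looks 
roughly like this:

    #   RAIDZ1                : (4 - 1) x 2 TB = 6 TB usable
    #   RAIDZ2                : (4 - 2) x 2 TB = 4 TB usable
    #   2 x two-way mirrors   : 8 TB / 2       = 4 TB usable
    #   mirrors + copies=2    : 8 TB / 4       = 2 TB usable (for datasets with copies=2)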

RAIDZ vs mirroring vs copies all comes down to trading off performance vs
Reliability, Availability and Serviceability vs space. There are formulas
for figuring all of this out. Start at Serve the Home's RAID Reliability
calculator*, which takes into account everything but increasing file
redundancy. For that there's this article: 'ZFS, Copies, and Data
Protection'. And for RAIDZ vs mirroring performance see 'When To (And Not To)
Use RAID-Z'.


Phil

* Note that the Mean Time to Data Loss calculated at this site, while being
an industry standard, is essentially useless other than for getting a
relative comparison of different configurations. For details see: 'Mean time
to meaningless: MTTDL, Markov models, and storage system reliability'.



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-17 Thread Dave Cottlehuber
On 17. März 2014 at 05:00:25, roemer (uwe.ro...@gmail.com) wrote:
> Thanks for the detailed example!
>  
> On Monday, 17 March 2014 07:34:45 UTC+11, dch wrote:
> >
> > I've been a happy maczfs and also zfsosx user for several years now.
> > [...]
> > zfs send is a very easy way to do a very trustable
> > backup, once you get past the first potentially large transfers.
> >
> > Can this happen bi-directionally? Or is it only applicable for creating
> 'read-only' replicas of a master filesystem onto some clients?
> I mean, what happens once you cloned one file system, sent it to your
> laptop, then edit on both the laptop and your ZFS server?

Then you’re screwed :-). It’s not duplicity or some other low-level sync
tool. I find it works best when you have a known master that you’re working
off.

Slightly OT, but in FreeBSD with HAST you can do some gonzo crazy stuff:
 http://www.aisecure.net/2012/02/07/hast-freebsd-zfs-with-carp-failover/

> > All my source code & work lives in a zfs case sensitive noatime
> > copies=2 filesystem, and I replicate that regularly to my other boxes
> > as required.
> >
> > How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even
> RAIDZ2) pool?
> RAIDZ would have all data stored redundantly already, so would 'copies=2'
> not end up in quadrupling the storage requirement if used on a raidz pool?

Yes, but in this case, the laptop isn’t redundant, and my data is precious.
IIRC the whole repos dataset, even with history, is < 40 GB, so that’s
reasonable IMO.

> > For most customer projects I will have 3 or more VMs running different
> > configs or operating systems under VMWare Fusion. These each live in
> > their own zfs filesystem, compressed lz4 noatime case sensitive. I
> > snapshot these after creation using vagrant install, again after
> > config, and the changes are replicated using zfs snapshots again to
> > the other OSX system, and also to the remote FreeBSD box.
> >
> > I can see that zfs is really good for handling multiple virtual machines.

Yup, zfs rollback for testing deployments or upgrades is simply bliss.

> In summary, I'm more than happy with the performance once I used
> > ashift=12 and moved past 8GB ram. Datasets once you get used to them
> > are extraordinarily useful -- snapshot your config just before a
> > critical upgrade.
> >
> > I'm starting to see the potential of snapshots. In fact, I just realised that I
> do manual
> 'snapshots' on some of my repeating projects already for quite some time
> with annual
> clones of the previous directory structure. So ZFS snapshots would be a
> natural fit here.
>
> But regarding the memory consumption:
> What makes ZFS so memory hungry in your case?

I don’t think it’s very hungry actually. 4GB (under the old MacZFS 74.1)
simply wasn’t enough and I’d get crashes. With 8GB that went away. Bearing
in mind with 16GB RAM I can run a web browser (oink at least 1GB), a 20GB VM
that’s been compressed into a 10GB RAMdisk, +1 GB RAM for the VM, that seems
pretty reasonable. That would leave 4GB for ZFS and the normal OSX baseline
stuff roughly.

I’m happy to report back with RAM usage if somebody tells me what z* 
incantation is needed.
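
(One hedged pointer, in case it helps: on FreeBSD - and, as far as I know, on the 
O3X port as well - the ARC counters are exposed as sysctl kstats, e.g.:)

    sysctl kstat.zfs.misc.arcstats.size    # current ARC size in bytes
    sysctl kstat.zfs.misc.arcstats.c_max   # configured ARC upper limit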

> Do you use deduplication?

Never. But I do use cloned datasets a fair bit, which probably helps the
 situation a bit.

The 2nd law of ZFS is not to use deduplication, even if you think you need it.
IIRC the rough numbers are 1GB RAM / TB storage, and I’d want ECC RAM for that.

BTW pretty sure the 1st law of ZFS is not to trust USB devices with your data.

--  
Dave Cottlehuber
Sent from my PDP11





Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread roemer
Thanks for the detailed example!

On Monday, 17 March 2014 07:34:45 UTC+11, dch wrote:
>
> I've been a happy maczfs and also zfsosx user for several years now. 
> [...]
> zfs send is a very easy way to do a very trustable 
> backup, once you get past the first potentially large transfers. 
>
Can this happen bi-directionally? Or is it only applicable for creating 
'read-only' replicas of a master filesystem onto some clients?
I mean, what happens once you cloned one file system, sent it to your 
laptop, then edit on both the laptop and your ZFS server?
 

> All my source code & work lives in a zfs case sensitive noatime 
> copies=2 filesystem, and I replicate that regularly to my other boxes 
> as required. 
>
How does a 'copies=2' filesystem play together with a 'RAIDZ1' (or even 
RAIDZ2) pool?
RAIDZ would have all data stored redundantly already, so would 'copies=2'
not end up in quadrupling the storage requirement if used on a raidz pool?
 

> For most customer projects I will have 3 or more VMs running different 
> configs or operating systems under VMWare Fusion. These each live in 
> their own zfs filesystem, compressed lz4 noatime case sensitive. I 
> snapshot these after creation using vagrant install, again after 
> config, and the changes are replicated using zfs snapshots again to 
> the other OSX system, and also to the remote FreeBSD box. 
>
I can see that zfs is really good for handling multiple virtual machines.
 

> [...]

> In summary, I'm more than happy with the performance once I used 
> ashift=12 and moved past 8GB ram. Datasets once you get used to them 
> are extraordinarily useful -- snapshot your config just before a 
> critical upgrade. 

I'm starting to see the potential of snapshots. In fact, I just realised that I 
already do manual 'snapshots' on some of my repeating projects, with annual 
clones of the previous directory structure. So ZFS snapshots would be a 
natural fit here.

But regarding the memory consumption:
What makes ZFS so memory hungry in your case?
Do you use deduplication?



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread Jason Belec
Yeah but that's databases! Whole different game. ;)

Jason
Sent from my iPhone 5S

> On Mar 16, 2014, at 8:28 PM, roemer  wrote:
> 
>> On Monday, 17 March 2014 06:40:02 UTC+11, cap wrote:
>> An advantage of snapshots is with active filesystems such as those used by a 
>> database.  For a consistent database backup you of course need to stop the 
>> program, then back up, then restart (or use some database tool if available). 
>>  The time to create a snapshot is essentially zero, so the above start-stop 
>> is actually practical.  Then you use your backup software of choice on the 
>> snapshot, not the active file system.
> This is only fine if your database is read-only or you have control over the 
> update workload.
> Most database systems use a combination of no-force+steal buffering and WAL 
> logging (e.g. MySQL InnoDB or PostgreSQL, and basically all commercial RDBMS).
> Taking a file-system level snapshot underneath does not guarantee that you 
> get a consistent snapshot of the database log and data pages.
> Together with high update rates, this can be dangerous. Better use the 
> database system's snapshot facility too before you take the ZFS snapshot. 
> Granted, open source systems are a bit weak in that regard...
>   



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread roemer
On Monday, 17 March 2014 06:40:02 UTC+11, cap wrote:

> An advantage of snapshots is with active filesystems such as those used by 
> a database.  For a consistent database backup you of course need to stop 
> the program, then back up, then restart (or use some database tool if 
> available).  The time to create a snapshot is essentially zero, so the 
> above start-stop is actually practical.  Then you use your backup 
> software of choice on the snapshot, not the active file system.
>
This is only fine if your database is read-only or you have control over the 
update workload.
Most database systems use a combination of no-force+steal buffering and WAL 
logging (e.g. MySQL InnoDB or PostgreSQL, and basically all commercial 
RDBMS).
Taking a file-system level snapshot underneath does not guarantee that you 
get a consistent snapshot of the database log and data pages.
Together with high update rates, this can be dangerous. Better use the 
database system's snapshot facility too before you take the ZFS snapshot. 
Granted, open source systems are a bit weak in that regard...
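
For example, a hedged sketch of coordinating the two with PostgreSQL's built-in 
backup mode (dataset name hypothetical):

    psql -c "SELECT pg_start_backup('zfs-snapshot', true);"
    zfs snapshot tank/pgdata@nightly
    psql -c "SELECT pg_stop_backup();"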
  



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread Dave Cottlehuber
I've been a happy maczfs and also zfsosx user for several years now.

TL;DR
- the team are amazing at nailing fixes when I've reported issues. I
use zfs 100% of the time for my work and sanity
- while I can't get a clean shutdown atm, it's rare that I need to
anyway, and zfs already has my data safe once sync has completed
- the zfs compatibility across OS is a huge win
- performance is not a constraint for me, and I'm a very heavy user
- datasets and snapshots are almost as nice as openafs vols for management

I'm a heavy user of snapshots and pools; for some inspiration, here's the long version:

3 main systems, 2x OSX, 1x large FreeBSD physical hosted server.

My main work laptop is a 16GB early 2011 MBP with a small 256GB SSD
for the OS, one partition for each of 4 OSes, and a large native ZFS 512GB
SSD. Now that I've been using this for a while, I could have survived
with a 64GB OS disk and a 256GB zfs SSD, but hey. If I could fit more
ram in, I would. The other boxes are bigger (32GB iMac, 64GB FreeBSD
box with ECC RAM, dual disks mirrored ZFS). I use an ashift-adjusted zpool,
which has made a noticeable difference in performance on all the
systems I've implemented.

I keep my itunes collection (in a zfs filesystem, formD normalisation,
noatime) and use the snapshots to keep an up-to-date read-only zfs
mirror on the other 2 systems. movies are the reverse, after watching
one on the laptop it gets shuffled off to the larger boxes for
permanent storage. zfs send is a very easy way to do a very trustable
backup, once you get past the first potentially large transfers.
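
A hedged sketch of that replication pattern (pool, dataset and host names are 
hypothetical):

    zfs snapshot tank/itunes@2014-03-16
    zfs send -i tank/itunes@2014-03-15 tank/itunes@2014-03-16 | \
        ssh backuphost zfs receive -F backup/itunes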

All my source code & work lives in a zfs case sensitive noatime
copies=2 filesystem, and I replicate that regularly to my other boxes
as required.
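
Such a dataset might be created roughly like this (sketch only, name hypothetical; 
note that casesensitivity can only be set at creation time):

    zfs create -o casesensitivity=sensitive -o atime=off -o copies=2 tank/src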

For most customer projects I will have 3 or more VMs running different
configs or operating systems under VMWare Fusion. These each live in
their own zfs filesystem, compressed lz4 noatime case sensitive. I
snapshot these after creation using vagrant install, again after
config, and the changes are replicated using zfs snapshots again to
the other OSX system, and also to the remote FreeBSD box.

Where I can, I spin up these VMs in a zpool-backed ramdisk (with
compression), which means I can fit a 20GB disk image into 16GB of RAM
and still work effectively. I can't claim to know how that
actually works, but it does. And it's very, very fast. The specific
config for that image is stored on the main SSD, and as I'm not writing
continuously to it while running the VM, things are peachy.

At the end of the project, I can remove the local snapshots as
required, and I archive them onto 32GB SD cards (yup) with a zpool and
copies=2. They're a nice easy archival format, so long as you have
another copy stashed safely too.

A couple of months ago, I had a number of hardware failures on the
MBP, and each time I was able to guarantee that my data was intact,
with full integrity, despite the travesties worked upon it each time
it went to the factory for repair. I'd never have been certain with
HFS+.

I don't have my ~ homedir in zfs just yet, but I've no particular
reason not to move it now other than time constraints. with
normalisation and case insensitivity I don't think I will see the
issues I did under prior versions with less support. Spotlight is not
important to me, and finder behaves itself now under Mavericks and the
new ZFSOSX builds.

In summary, I'm more than happy with the performance once I used
ashift=12 and moved past 8GB ram. Datasets once you get used to them
are extraordinarily useful -- snapshot your config just before a
critical upgrade.

A+
Dave



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread Jason Belec
Snapshots also store only the differences from the previous snapshot, and when 
combined with send/receive they are very efficient for replication to remote 
servers.


--
Jason Belec
Sent from my iPad

> On Mar 16, 2014, at 3:40 PM, Simon Casady  wrote:
> 
> An advantage of snapshots is with active filesystems such as those used by a 
> database.  For a consistent database backup you of course need to stop the 
> program, then back up, then restart (or use some database tool if available).  
> The time to create a snapshot is essentially zero, so the above start-stop 
> is actually practical.  Then you use your backup software of choice on the 
> snapshot, not the active file system.
> 
> 
>> On Sun, Mar 16, 2014 at 7:16 AM, roemer  wrote:
>> Thanks for the response, Björn.
>> The hint regarding dataset-specific snapshots is good, though I have to 
>> first think about how I would best make use of them.
>> 
>> However another point that you raised is interesting:
>> 
>>> On Sunday, 16 March 2014 10:34:52 UTC+11, Bjoern Kahl wrote:
>>> [...]
>>> 
>>>  Under Mac OSX, a mounted file system comes at higher costs than on 
>>>  other Unix like operating systems, due to the Finder and MDS services, 
>>>  so I would not suggest to really try to have hundreds of file systems 
>>>  mounted at the same time.  But any reasonable number (some 10) go 
>>>  without noticeable performance impact. 
>> 
>> I would need about 10 separate mount points / data sets, so I guess this 
>> would be fine.
>> MDS services however means Spotlight, but the MacZFS Wiki as well as several 
>> other posts on the web give the advice to switch off spotlight for ZFS with
>> mdutil -i off mountPoint
>> 
>> Why is Spotlight thought to be evil for ZFS? 
>> Or does your comment imply that these advices are outdated, and mds-indexing 
>> for ZFS mount points is ok nowadays?
>> Note that I am mainly aiming to store static 'archival' data and documents 
>> on ZFS, not my main user directory.
>>  
>>> [...] Snapshots can also easily be used for real 
>>>  off-site backups by the zfs send / receive mechanism. 
>>> 
>> Haven't looked at send/receive yet, but if they require network connections, 
>> I am afraid classical ADSL speeds with at most 1 MBit/s upload will not be much 
>> fun...
>> And for periodic backup to an external HDD I was thinking about ChronoSync 
>> or simply rsync
>> 
>> roemer
> 



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread Simon Casady
An advantage of snapshots is with active filesystems such as those used by
a database.  For a consistent database backup you of course need to stop
the program, then back up, then restart (or use some database tool if
available).  The time to create a snapshot is essentially zero, so the
above start-stop is actually practical.  Then you use your backup
software of choice on the snapshot, not the active file system.


On Sun, Mar 16, 2014 at 7:16 AM, roemer  wrote:

> Thanks for the response, Björn.
> The hint regarding dataset-specific snapshots is good, though I have to
> first think about how I would best make use of them.
>
> However another point that you raised is interesting:
>
> On Sunday, 16 March 2014 10:34:52 UTC+11, Bjoern Kahl wrote:
>>
>> [...]
>>
>>  Under Mac OSX, a mounted file system comes at higher costs than on
>>  other Unix like operating systems, due to the Finder and MDS services,
>>  so I would not suggest to really try to have hundreds of file systems
>>  mounted at the same time.  But any reasonable number (some 10) go
>>  without noticeable performance impact.
>>
>
> I would need about 10 separate mount points / data sets, so I guess this
> would be fine.
> MDS services however means Spotlight, but the MacZFS Wiki as well as
> several other posts on the web give the advice to switch off spotlight for
> ZFS with
> mdutil -i off mountPoint
>
> Why is Spotlight thought to be evil for ZFS?
> Or does your comment imply that these advices are outdated, and
> mds-indexing for ZFS mount points is ok nowadays?
> Note that I am mainly aiming to store static 'archival' data and documents
> on ZFS, not my main user directory.
>
>
>> [...] Snapshots can also easily be used for real
>>  off-site backups by the zfs send / receive mechanism.
>>
>> Haven't looked at send/receive yet, but if they require network
> connections, I am afraid classical ADSL speeds with at most 1 MBit/s upload will
> not be much fun...
> And for periodic backup to an external HDD I was thinking about ChronoSync
> or simply rsync
>
> roemer
>
>



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-16 Thread roemer
Thanks for the response, Björn.
The hint regarding dataset-specific snapshots is good, though I have to 
first think about how I would best make use of them.

However another point that you raised is interesting:

On Sunday, 16 March 2014 10:34:52 UTC+11, Bjoern Kahl wrote:
>
> [...]
>  Under Mac OSX, a mounted file system comes at higher costs than on 
>  other Unix like operating systems, due to the Finder and MDS services, 
>  so I would not suggest to really try to have hundreds of file systems 
>  mounted at the same time.  But any reasonable number (some 10) go 
>  without noticeable performance impact. 
>

I would need about 10 separate mount points / data sets, so I guess this 
would be fine.
MDS services however means Spotlight, but the MacZFS Wiki as well as 
several other posts on the web give the advice to switch off spotlight for 
ZFS with
mdutil -i off mountPoint

Why is Spotlight thought to be evil for ZFS? 
Or does your comment imply that these advices are outdated, and 
mds-indexing for ZFS mount points is ok nowadays?
Note that I am mainly aiming to store static 'archival' data and documents 
on ZFS, not my main user directory.
 

> [...] Snapshots can also easily be used for real 
>  off-site backups by the zfs send / receive mechanism. 
>
Haven't looked at send/receive yet, but if they require network 
connections, I am afraid classical ADSL speeds with at most 1 MBit/s upload will 
not be much fun...
And for periodic backup to an external HDD I was thinking about ChronoSync 
or simply rsync.
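
(Worth noting, as a hedged aside: zfs send does not require a network - it writes 
to any pipe, so a pool on a locally attached external disk works too, e.g. with 
hypothetical names:)

    zfs snapshot tank/docs@weekly
    zfs send tank/docs@weekly | zfs receive -F extbackup/docs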

roemer



Re: [zfs-macos] pros/cons of multiple zfs filesystems

2014-03-15 Thread Bjoern Kahl


 Hi roemer,

Am 15.03.14 23:52, schrieb roemer:
> When one creates a new zpool, this automatically creates a root 
> filesystem too - and even mounts this. What is now the advantage 
> (or disadvantage) of creating further sub-filesystems inside the 
> pool using zfs? And what is the difference to simply create 
> sub-directories under the zpool root?
> 
> Two advantages, that I can see, are separate compression and quota 
> settings. But what about general performance? Is there a 
> performance penalty for having multiple zfs filesystems inside one 
> pool, perhaps even with different settings?

 Not really.

 Technically, a file system (or, in ZFS language: a dataset) is very
 similar to a directory and one can have thousands of these without
 noticeable performance impacts as far as the ZFS core is concerned.

 Under Mac OSX, a mounted file system comes at higher costs than on
 other Unix like operating systems, due to the Finder and MDS services,
 so I would not suggest to really try to have hundreds of file systems
 mounted at the same time.  But any reasonable number (some 10) go
 without noticeable performance impact.


 One additional advantage not in your list is the ability to make
 snapshots, including cloning them into new, then (almost) independent,
 read-write file systems, or to use the snapshots as lightweight
 backups against user error / application misbehavior.  Of course,
 these cannot replace a true off-site backup, but are nevertheless
 useful.  For example, I used to have parts of my User directory on ZFS
 and have it automatically snapshotted every 15 minutes as a cheap
 versioning solution.  Snapshots can also easily be used for real
 off-site backups via the zfs send / receive mechanism.
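
 Each such snapshot is just a one-liner that a cron or launchd job can fire; a
 hedged sketch with a hypothetical dataset name:

    zfs snapshot "tank/Users/bjoern@auto-$(date +%Y-%m-%d-%H%M)"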


 Best regards

Björn

-- 
| Bjoern Kahl   +++   Siegburg   +++Germany |
| "googlelogin@-my-domain-"   +++   www.bjoern-kahl.de  |
| Languages: German, English, Ancient Latin (a bit :-)) |



[zfs-macos] pros/cons of multiple zfs filesystems

2014-03-15 Thread roemer
When one creates a new zpool, this automatically creates a root filesystem 
too - and even mounts this.
What, then, is the advantage (or disadvantage) of creating further 
sub-filesystems inside the pool using zfs?
And how does that differ from simply creating sub-directories under the zpool 
root?

Two advantages that I can see are separate compression and quota settings.
But what about general performance? Is there a performance penalty for 
having multiple zfs filesystems inside one pool, perhaps even with 
different settings?
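
As a hedged illustration (names hypothetical), per-dataset properties are exactly 
what a plain sub-directory cannot give you:

    zfs create -o compression=lz4 tank/photos
    zfs create -o quota=50G -o compression=off tank/scratch
    zfs get compression,quota tank/photos tank/scratch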

