On March 13, 2014 at 10:35:19 AM, Francois Gaudreault 
([email protected]) wrote:

Hello ZFS people :) 

As a cloud provider, we are trying to see how ZFS would help us 
optimize how we store VM data. However, we have some blind spots, and 
we would like to get some answers :) Hopefully, I am at the right place 
for that. Yes, I could spend some days searching the web, and yes, the 
answer is probably there somewhere, but asking experts is much faster :P 

Basically, let's say we have a Linux VM running on EXT or XFS as the 
local filesystem: how is XFS/EXT consistency helped by ZFS features? 
(i.e., is it true to say fsck would be useless if we use ZFS?) And the 
other question would be, how do we make sure we get consistent snapshots 
when using c-o-w? 


If you’re running EXT on top of a volume on ZFS, then you should still use 
fsck.  The problem is that your VM (or the OS running on it) can wind up 
crashing with its data still in OS buffers.  While the underlying ZFS will be 
resilient, unless you have absolute confidence in both the client OS *and* the 
hardware (power!), you have to protect against such losses.  Using ZFS in this 
way doesn’t offer much in the way of data integrity protection over, say, a 
hardware RAID.  (Yes, there *are* benefits to using ZFS even so, as you get 
self-healing against bitrot, data portability, and a bunch of other benefits 
besides.  But it’s best to think of ZFS as a really nice RAID appliance in 
this regard.)

If, *however*, you can use ZFS directly in the guest VM, then you start to see a 
lot of other benefits, including being able to dispense with fsck. ;-)

Snapshot “consistency” is guaranteed *from ZFS’ point of view*.  I.e. your 
approach is going to ensure that ZFS is happy.  But if you don’t run sync() (or 
whatever causes your guest kernel to flush its in-memory buffers to the virtual 
disk) and do a (virtual) sync before the snapshot, you won’t necessarily have a 
complete transaction group at the ZFS layer, and any data still buffered in the 
guest may be lost.

The VM/ZFS coordination products generally plug into some kind of snapshot 
callback in the client OS or database, which quiesces output and performs any 
flushes of disk buffers there first.  They then take the ZFS snapshot and 
resume the quiesced OS or database app.
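
As a rough illustration of that quiesce / snapshot / resume sequence (a sketch 
only, not how any particular product does it): assuming a libvirt/KVM host with 
the QEMU guest agent running in the guest, `virsh domfsfreeze` and 
`virsh domfsthaw` can stand in for the quiesce and resume steps around 
`zfs snapshot`.  The function name and the domain, dataset, and snapshot names 
are made up for the example.

```python
import subprocess


def snapshot_quiesced(domain, dataset, snap_name, run=None):
    """Freeze guest filesystems, snapshot the backing dataset, then thaw.

    `run` is a hypothetical hook taking a command list; it defaults to
    executing the command and raising if it exits non-zero.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd, check=True)

    # Assumption: libvirt + QEMU guest agent. This asks the guest to flush
    # its buffers and freeze its mounted filesystems.
    run(["virsh", "domfsfreeze", domain])
    try:
        # The snapshot is taken on the host while the guest is quiesced.
        run(["zfs", "snapshot", f"{dataset}@{snap_name}"])
    finally:
        # Always thaw, even if the snapshot failed, so the guest is not
        # left frozen.
        run(["virsh", "domfsthaw", domain])
```

Something like `snapshot_quiesced("vm1", "tank/vms/vm1", "nightly")` would then 
run the three commands in order.  The try/finally is there because a guest left 
frozen after a failed snapshot is worse than a missed snapshot.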

The good news is that the ZFS snapshot with cow is *really* cheap.  (Just an 
update to copy the uberblock, really.)  Compared to some other products, this 
is orders of magnitude less expensive.

Snapshots (and dedup) also offer myriad other advantages for deployments with 
client VMs.  Cheap provisioning, reduced data cost, increased memory 
efficiencies, etc.  ZFS is a good solution here.  But as the volume interface 
it exposes to VMs is usually a logical disk, you should take the same precautions 
in the client OS that you would with a real disk.  (sync, fsck, etc.)
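
To make the cheap-provisioning point a bit more concrete, here is a minimal 
sketch that builds the `zfs snapshot` and `zfs clone` command lines for stamping 
out several VM volumes from one golden image.  The helper name and all dataset 
names are hypothetical; the helper only builds the commands and leaves running 
them to you.

```python
def clone_commands(golden, snap, targets):
    """Return zfs command lines that snapshot `golden` once and clone it per target."""
    origin = f"{golden}@{snap}"
    # One snapshot of the golden image serves as the origin for every clone.
    cmds = [["zfs", "snapshot", origin]]
    for target in targets:
        # Each clone shares blocks with the origin until the VM writes to it.
        cmds.append(["zfs", "clone", origin, target])
    return cmds


# Hypothetical dataset names, printed rather than executed.
for cmd in clone_commands("tank/images/golden", "base",
                          ["tank/vms/vm1", "tank/vms/vm2"]):
    print(" ".join(cmd))
```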

        - Garrett



Thanks in advance! 

-- 
Francois Gaudreault 
Architecte de Solution Cloud | Cloud Solutions Architect 
[email protected] 
514-629-6775 
- - - 
CloudOps 
420 rue Guy 
Montréal QC H3J 1S6 
www.cloudops.com 
@CloudOps_ 

_______________________________________________ 
developer mailing list 
[email protected] 
http://lists.open-zfs.org/mailman/listinfo/developer 
