[storage-discuss] LUN write cache behaviour

Tore Anderson Thu, 12 Nov 2009 02:02:43 -0800

Hello list,

my Unified Storage 7310 cluster has a little tickbox that allows me to
enable/disable write caching on a LUN.  It also has a warning saying not
to enable it if you don't know what you're doing, but doesn't go into
any detail about how it actually works.  I'd like to know what I'm
doing, so I hope someone on this list can enlighten me.


As I've understood it so far, when the write cache is disabled, all
writes go straight to the ZIL, and are asynchronously migrated to the
data disks.  The behaviour is the same as if you're writing
synchronously to a file on an NFS export.  Correct me if I'm wrong here.

If I enable the write cache, do all writes then go straight to
DRAM/ARC (are those the same thing)?  Does the data stop by the ZIL on
the way to the data disks, or is it migrated straight there?

When using write caching, is it safe to do cluster takeovers/failbacks,
will the volatile cached data be flushed to the ZIL or the data disks
before the other cluster node takes over or not?

I'm using the 7310 as backing for a private cloud with virtual machines
running Linux, mostly - one LUN per machine.  The filesystem is designed
to work well with write-caching block devices, as it's journalling all
writes and can insert barriers after the journal commit blocks.  I can
see a potential problem with surviving an (uncontrolled only?) cluster
failover on the 7310 without a reboot and a journal replay on the VM
though - DRAM-cached writes will just vanish and operation will continue
as if they never happened, right?  But other than that, would it be safe
for me to enable write caching?
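
To make the scenario I'm worried about concrete, here is a toy model in
Python.  It is only a sketch under my own assumptions: the class and
function names are made up for illustration, and it does not describe
how the 7310 or any real filesystem is implemented.  It shows a journal
record made stable by a barrier, an in-place write that is still only in
the volatile cache when the cache is lost, and the difference between
carrying on without a replay and replaying the journal.

```python
class VolatileCacheDisk:
    """Hypothetical block device with a volatile write cache."""

    def __init__(self):
        self.stable = {}  # blocks that have reached stable storage
        self.cache = {}   # acknowledged writes held only in volatile cache

    def write(self, block, data):
        # Acknowledged at once; the data lives only in the cache for now.
        self.cache[block] = data

    def flush(self):
        # Barrier / cache flush: everything cached so far becomes stable.
        self.stable.update(self.cache)
        self.cache.clear()

    def read(self, block):
        # Cached data wins over stable data, as on a write-back cache.
        return self.cache.get(block, self.stable.get(block))

    def lose_cache(self):
        # Models an uncontrolled failover: cached writes silently vanish.
        self.cache.clear()


def journalled_write(disk, block, data):
    disk.write(("journal", block), data)  # journal record and commit block
    disk.flush()                          # barrier after the commit block
    disk.write(block, data)               # in-place write, may stay cached


def replay_journal(disk):
    # What the VM would do after a reboot: reapply stable journal records.
    for key, data in list(disk.stable.items()):
        if isinstance(key, tuple) and key[0] == "journal":
            disk.stable[key[1]] = data


disk = VolatileCacheDisk()
journalled_write(disk, 7, "new")
disk.lose_cache()                # failover while block 7 is only cached

after_failover = disk.read(7)    # the in-place write is gone
replay_journal(disk)
after_replay = disk.read(7)      # recovered from the journal
```

In this model the VM that keeps running without a replay reads back
nothing for block 7, while a replay restores it from the journal.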

Does enabling/disabling write cache affect pre-caching of reads of the
recently-written data?  If the writes are cached in the ZIL only (i.e.
cache disabled, if I've understood correctly), will a read of that data
have to wait for the writes to be flushed to the data disks before the
data can be read back and inserted into the ARC/L2ARC?  If that's the
case, will this situation improve if I enable write caching?

If a block that was recently-written (and has therefore not yet reached
the data disks) is overwritten, will the first write still be flushed to
the data disks at some point or not?

The reason I'm asking is that I recently had a situation where a virtual
machine ran out of memory and started thrashing its swap partition, i.e.
writing lots of data just to re-read it shortly after. This basically
killed the performance of the 7310 completely - the disks were all
pegged at ~250 IOPS.  The swap partition on the VM was small enough to
have fitted in any of the DRAM, ARC, L2ARC, or ZIL, so I would very much
like to understand why that happened and if enabling write caching would
have prevented it.

Best regards,
-- 
Tore Anderson
Redpill Linpro AS - http://www.redpill-linpro.com/
Tel: +47 21 54 41 27
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
