Well, the other semi-unique thing about WAFL is that all metadata is
stored in files, rather than in custom structures.  So, the "tree" I was
describing is actually three files - the free inode map, the free space map,
and the inode file.  The most important one for our purposes here is the
inode file, which describes the actual filesystem structure.

So, what happens is that the inode file is modified every time the
structure of the filesystem changes.  Practically speaking, this means it
is cache-resident all the time.  But, since WAFL's architecture is to never
update blocks in place, the inode file is never overwritten on disk - a new
copy is simply written.
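
To make the never-overwrite idea concrete, here is a minimal Python sketch.
It assumes a toy model in which the disk is a dict keyed by block number;
BlockStore and write_new are hypothetical names for illustration, not WAFL
internals.

```python
class BlockStore:
    """Toy disk model: a block is written once and never modified."""

    def __init__(self):
        self.blocks = {}      # block number -> contents
        self.next_free = 0    # crude stand-in for the free space map

    def write_new(self, data):
        """Store data in a fresh block and return its block number."""
        blkno = self.next_free
        self.blocks[blkno] = data
        self.next_free += 1
        return blkno


store = BlockStore()
old_inode_file = store.write_new(("inode file", "v1"))
# "Updating" the inode file writes a new copy; the old block is untouched.
new_inode_file = store.write_new(("inode file", "v2"))
```

Both versions exist side by side after the second write, which is what
makes it cheap to keep pointing at an older one.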

This is how NetApp snapshots work - basically there is a "root inode" that
is special and points to the inode file.  When you want to make a snapshot,
you make a new "root inode" and statically point it at the current inode
file.  Since that inode file describes the view of the filesystem _at that
point in time_, you end up with a read-only virtual filesystem.
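
The snapshot mechanism can be sketched roughly as follows.  This is a toy
model under stated assumptions: the inode file is reduced to a read-only
mapping of path to contents, and SnapshotFiler, write, and snapshot are
made-up names, not NetApp's API.

```python
from types import MappingProxyType


class SnapshotFiler:
    def __init__(self):
        # The "inode file": a read-only mapping of path -> contents.
        self.inode_file = MappingProxyType({})
        self.snapshots = {}   # snapshot name -> inode file it points at

    def write(self, path, data):
        # Never modify the current inode file; build a new one instead.
        updated = dict(self.inode_file)
        updated[path] = data
        self.inode_file = MappingProxyType(updated)

    def snapshot(self, name):
        # A snapshot is just another root pointing at the current inode file.
        self.snapshots[name] = self.inode_file


filer = SnapshotFiler()
filer.write("/a", "one")
filer.snapshot("hourly.0")
filer.write("/a", "two")   # the snapshot still sees "one"
```

Taking the snapshot copies nothing; it only records a second pointer to an
inode file that will never change.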

This same logic is applied to ensuring on-disk consistency.  Every few
seconds, a new snapshot is created that points at the current inode file.
The NetApp continues processing requests, but the on-disk filesystem
structure is "fixed" at that snapshot.  When the consistency timer expires,
the old snapshot is deleted and a new one created that points at the current
inode file - so the entire filesystem view on-disk updates atomically to
represent what the filer had already been representing in memory.
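
A rough Python sketch of a consistency point, assuming the same kind of toy
model (CpFiler, handle_write, and consistency_point are hypothetical names;
the real filer does far more work when flushing):

```python
class CpFiler:
    def __init__(self):
        self.memory_view = {}    # what clients currently see
        self.on_disk_root = {}   # view fixed at the last consistency point

    def handle_write(self, path, data):
        # Build a new view rather than modifying one the disk root may share.
        self.memory_view = {**self.memory_view, path: data}

    def consistency_point(self):
        # One pointer swap: the on-disk view jumps to the in-memory view.
        self.on_disk_root = self.memory_view


cp_filer = CpFiler()
cp_filer.handle_write("/a", "one")
before_cp = cp_filer.on_disk_root      # still the empty view
cp_filer.consistency_point()
cp_filer.handle_write("/b", "two")     # not on disk until the next CP
```

The on-disk view only ever moves from one complete state to another, never
through a half-updated one.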

The battery-backed cache stores all of the transactions between the last
consistency point and the present moment (another unusual detail: it
actually caches the NFS operation itself, not the low-level I/O).  This
gives it a pool of marked-as-completed writes to work with, which helps it
make more intelligent decisions about write layout.

So, the situation is not so grim as "lose your cache, lose your filesystem"
- in a truly tragic scenario with power failure plus cache-battery failure,
the worst case is that you would recover your filer to discover that it was
at a consistent state from 10 seconds before the power failure (10 seconds
is the longest time a filer will go between consistency points).  
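
The two recovery cases can be sketched like this.  It is a simplified
illustration only: the log format (op, path, data) and the recover function
are assumptions made for the example, not the filer's actual NVRAM layout.

```python
def recover(on_disk_view, nvram_log, nvram_intact):
    """Return the filesystem view after a crash."""
    # Always start from the last consistency point on disk.
    view = dict(on_disk_view)
    if nvram_intact:
        # Replay the logged operations in the order they were acknowledged.
        for op, path, data in nvram_log:
            if op == "write":
                view[path] = data
            elif op == "remove":
                view.pop(path, None)
    return view


last_cp = {"/a": "one"}
log = [("write", "/a", "two"), ("write", "/b", "new"), ("remove", "/a", None)]

with_battery = recover(last_cp, log, nvram_intact=True)
without_battery = recover(last_cp, log, nvram_intact=False)
```

With the log intact nothing acknowledged is lost; without it, you get the
last consistency point, which is older but still self-consistent.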

Thanks,
Matt
*still pleased with netapp's craftiness*

--
Matthew Zito
GridApp Systems
Email: [EMAIL PROTECTED]
Cell: 646-220-3551
Phone: 212-358-8211 x 359
http://www.gridapp.com

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On 
> Behalf Of Tanel Poder
> Sent: Friday, September 19, 2003 12:35 PM
> To: Multiple recipients of list ORACLE-L
> Subject: Re: asynch I/O
> 
> 
> > available raid stripe that's free and writes the block there, then
> > updates the tree.  Besides being rather crafty, it creates a situation
> > where
> 
> And the tree is living in battery-backed cache?
> 
> Tanel.
> 
> 
> -- 
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> -- 
> Author: Tanel Poder
>   INET: [EMAIL PROTECTED]
> 
> Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
> San Diego, California        -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') 
> and in the message BODY, include a line containing: UNSUB 
> ORACLE-L (or the name of mailing list you want to be removed 
> from).  You may also send the HELP command for other 
> information (like subscribing).
> 

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Matthew Zito
  INET: [EMAIL PROTECTED]
