A recent increase in email about ZFS and SNDR (the replication component of Availability Suite) has given me reason to post one of my replies.

Well, now I'm confused! A colleague just pointed me toward your blog entry about SNDR and ZFS, which until now I thought was not a supported configuration. So, could you confirm that for me, one way or the other?

ZFS is supported with SNDR, because SNDR is filesystem agnostic. That said, ZFS is a very different beast from other Solaris filesystems.

The two golden rules of ZFS replication are:

1). All volumes in a ZFS storage pool (see the output of zpool status) must be placed in a single SNDR I/O consistency group. ZFS is the first Solaris filesystem to validate consistency at all levels, so all vdevs in a single storage pool must be replicated in a write-order consistent manner, and an I/O consistency group is the means to accomplish this.
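As a sketch of what this looks like in practice (the host names, device paths, bitmap volumes, and group name below are all hypothetical placeholders, not from any particular configuration), each vdev in the pool gets its own SNDR set, and every set is enabled into the same consistency group:

```shell
# Hypothetical example: pool "tank" has two vdevs. Enable one SNDR set per
# vdev, each with its own bitmap volume, all in consistency group "tank-grp".
sndradm -e primhost /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 \
           sechost  /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 \
           ip async g tank-grp

sndradm -e primhost /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t1d0s1 \
           sechost  /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t1d0s1 \
           ip async g tank-grp

# Kick off a full synchronization of every set in the group together.
sndradm -g tank-grp -m
```

Because all sets share the group, SNDR preserves write ordering across the whole pool rather than per volume, which is exactly what ZFS requires.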

2). While SNDR replication is active, do not attempt to zpool import the SNDR secondary volumes, and while the ZFS storage pool is imported on the SNDR secondary node, do not resume replication. This is truly a double-edged sword: the instance of ZFS running on the SNDR secondary node will see replicated writes from ZFS on the SNDR primary node, treat these unexpected checksums as a form of data corruption, and panic Solaris. This is the same reason two or more Solaris hosts can't access the same ZFS storage pool in a SAN.
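To make the safe sequence concrete (again using the hypothetical group and pool names from above), the key is to put the whole group into logging mode before touching the secondary, and to resynchronize only after the pool is exported again:

```shell
# On the secondary, stop replicated writes from arriving by dropping the
# entire consistency group into logging mode.
sndradm -g tank-grp -l

# Only now is it safe to import the pool on the secondary node.
zpool import -f tank

# ... use the pool ...

# Export the pool BEFORE resuming replication,
zpool export tank

# then resynchronize the group from the primary (update sync).
sndradm -g tank-grp -u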

There is a slight safety net here, in that zpool import will report that the ZFS storage pool is active on another node. Unfortunately, stopping replication does not change this state, so you will still need to use the -f (force) option, unless the zpool is in the exported state on the SNDR primary node, since the exported state will be replicated to the SNDR secondary node.
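This suggests a cleaner sequence for a planned migration (hypothetical names as before): export the pool on the primary first, so the "exported" state itself replicates, and the secondary can then import without force:

```shell
# On the SNDR primary: cleanly export the pool, letting the exported
# on-disk state replicate to the secondary.
zpool export tank

# Put the group into logging mode so no further writes replicate.
sndradm -g tank-grp -l

# On the SNDR secondary: the pool was cleanly exported, so no -f is needed.
zpool import tank
```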

Of course I know that AVS only cares about blocks, so in principle the FS is irrelevant. However, the last time I researched this, I found a doc explaining that the lack of support was due to the unpredictable nature of ZFS background processes (resilvering, etc.), and therefore to not being guaranteed a truly quiesced FS.

ZFS the filesystem is always consistent on disk, and ZFS maintains filesystem consistency through coordination between the ZPL (ZFS POSIX Layer) and the ZIL (ZFS Intent Log). Unfortunately for SNDR, ZFS caches a lot of an application's filesystem data in the ZIL; because that data is in memory, not yet written to disk, SNDR does not know it exists. ZIL flushes to disk can be seconds behind the actual application writes completing, and if SNDR is running asynchronously, the replicated writes on the SNDR secondary can be additional seconds behind the actual application writes.

Unlike UFS filesystems with lockfs -f or lockfs -w, there is no 'supported' way to get ZFS to empty the ZIL to disk on demand. So even though one will get both ZFS and application filesystem consistency within the SNDR secondary volumes, there can be many seconds' worth of lost data, since SNDR can't replicate what it does not see.
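For contrast, here is the UFS quiesce idiom the paragraph refers to (the mount point is a hypothetical example); this is what has no ZFS equivalent:

```shell
# Flush all dirty UFS data and metadata for this mount to disk right now.
lockfs -f /export/home

# Or, for a hard quiesce: write-lock the filesystem (flushes, then blocks
# new writes) while a point-in-time copy is taken or SNDR catches up...
lockfs -w /export/home

# ...then release the write lock.
lockfs -u /export/home
```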
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
