Roman,
OK, thanks. That's what I was thinking about. I just wasn't sure whether it's
OK, from the AVS point of view, to add volumes to existing groups.
Just one question about the commands you suggested. Is this about adding
a disk queue?
sndradm -g -q a :
Is this compulsory?
It will be for any filesystem such as ZFS or QFS, and also for volume
managers or databases, each of which uses multiple physical disks to
represent a single consistent collection of pooled storage. Failing to do
this allows the replicated volumes to be updated inconsistently. Grouping
two or more replicated volumes into an I/O consistency group is compulsory
if the data on the secondary node is deemed valuable at any point in time.
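For illustration only, here is a rough sketch of what that can look like.
The hostnames, device paths, and group name below are hypothetical, and the
exact options should be checked against sndradm(1M):

# Enable two related sets directly into one I/O consistency group "zpool-grp";
# disk queues only apply to async sets:
sndradm -e primhost /dev/rdsk/c1t0d0s0 /dev/rdsk/c5t0d0s0 \
        sechost /dev/rdsk/c1t0d0s0 /dev/rdsk/c5t0d0s0 ip async g zpool-grp
sndradm -e primhost /dev/rdsk/c1t1d0s0 /dev/rdsk/c5t0d0s1 \
        sechost /dev/rdsk/c1t1d0s0 /dev/rdsk/c5t0d0s1 ip async g zpool-grp

# Attach a single disk queue volume to the whole group (the "-q a" you asked about):
sndradm -g zpool-grp -q a /dev/rdsk/c5t0d0s3

# Verify grouping and queue assignment:
sndradm -P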
I mean I don't use disk queues for this particular installation.
Since ZFS does a very good job of distributing I/Os across all constituent
vdevs in a single ZFS storage pool, you have simply been fortunate that ZFS
has not seen inconsistencies, particularly since ZFS is very good at
detecting them. Based on this, it is likely that the ZFS storage pool was
idle prior to accessing the replicated ZFS storage pool.
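To make the "constituent vdevs" point concrete, consider a hypothetical pool
like the one below: any write transaction may touch all three mirrors at
once, so every underlying device needs its own SNDR set, and all of those
sets belong in the same I/O consistency group.

# Hypothetical pool layout; device names are made up for this example:
zpool create tank mirror c1t0d0 c2t0d0 mirror c1t1d0 c2t1d0 mirror c1t2d0 c2t2d0
zpool status tank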
And let me ask you, Jim, about the famous post "AVS and ZFS,
seamless". It was yours, right? Even if not, would you mind
commenting on it?
It is my posting.
In hindsight, I should have described the "goals" of this write-up, as it is
too complex. The goals were to take the 46 available disks of a Sun Fire
X4500 (Thumper) and replicate them to another Sun Fire X4500. The complexity
came about because there were additional requirements for the bitmap
volumes:
- On RAID-1, not RAIDz, storage, allowing for redundancy, but not at the
cost of a RAIDz write I/O when SNDR is just flipping one or more bits in a
bitmap volume.
- Distributed across as many disks as possible, so that a distributed ZFS
write across many vdevs would also incur a distributed bitmap write across
many bitmap volumes. Having a multi-disk ZFS write result in a single-disk
update of multiple bitmap volumes would create an I/O bottleneck. (A sketch
of one possible mirrored-bitmap layout follows below.)
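For example, with Solaris Volume Manager the bitmaps could be built as small
RAID-1 metadevices whose submirrors sit on different physical disks. The
metadevice names and slices below are hypothetical, not the exact layout
from the blog:

# One small mirrored metadevice per bitmap, submirror slices on different disks:
metainit d101 1 1 c1t0d0s6      # first submirror
metainit d102 1 1 c2t0d0s6      # second submirror, on another disk
metainit d100 -m d101           # create a one-way mirror
metattach d100 d102             # attach the second submirror
# /dev/md/rdsk/d100 would then serve as the SNDR bitmap volume for one set,
# with the pattern repeated so bitmap I/O is spread across many disks.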
FWIW: this article was written before the general availability of SSDs.
Placing bitmap volumes on write-optimized SSDs is highly optimal, as there
is no seek cost, which is exactly what the multi-disk bitmap layout was
trying to avoid. Years ago, AVS had an optional SPARC-only feature called
Fast Write Cache (FWC). It was software that utilized an SBus-based
non-volatile memory module. Due to its high cost and its ties to the SBus
architecture, this technology became obsolete and was dropped from AVS.
Bitmaps on SSDs replace this functionality very well.
The first question is about the issue with RDC timeouts and disk queues
that I asked about some time ago.
http://www.mail-archive.com/[email protected]/msg05497.html
I wonder whether the configuration described in the blog was successful?
I recall that there was a cut-and-paste error from my notes to the blog,
but my notes were based on a working configuration.
What did the link look like?
This was a 1 GigE link with less than 5 ms latency between sites. Measured
link latency is the #1 issue in data replication, in that the
local-to-remote plus remote-to-local round-trip time is what matters. SNDR
replication is done using RPC over TCP/IP, so the amount of time it takes
to move data, even bitmap data, plays a key role in replication. The RPC
timeout value, a feature of RPC, not SNDR, has become a problem as volume
sizes, and thus bitmap sizes, have increased.
Was it something like a dedicated circuit with typical latency?
Link latency averaged about 5 ms, with spikes of no more than 10 ms. If
link latency is higher than this, RPC timeouts will be common and
problematic.
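A simple way to sanity-check this before enabling replication is to measure
the round-trip time between the two replication hosts (hostname
hypothetical):

# Solaris ping in statistics mode; interrupt with Ctrl-C to see the
# min/avg/max round-trip summary. Averages well above ~5 ms are a warning
# sign for RPC timeouts.
ping -s remote-sndr-host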
How did it work out eventually?
Well, it was an experiment, not a production environment, as the
equipment was on loan for the purposes of SNDR and ZFS testing.
And the second question is about slices for the bitmaps. What was the
reason for not putting them on a dedicated pair of disks, as the
documentation suggests?
If ZFS does a 46-volume write I/O, one would not want 46 bitmap updates to
happen to a single disk. The example had mirrored bitmap volumes for each
replica, minimizing the chance of any single bitmap becoming an I/O
bottleneck.
(AVS administration guide: raw devices must be stored on a disk separate
from the disk that contains the data from the replicated volumes. ... The
bitmap must not be stored on the same disk as the replicated volumes.)
From the simplistic point of view of the administration guide, if one had a
single data disk and was doing sequential write I/O, those writes plus the
bitmap updates would turn an optimal sequential write workload into random
write I/O if both the replica and its bitmap were on the same disk. Placing
them on separate disks allows the sequential data writes and the sequential
bitmap writes to each remain sequential. Ideally, one would carry this theme
across a replication configuration of any size, but maximizing a large ZFS
configuration of 46 disks requires 46 bitmaps, and some tradeoffs were made.
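As a hypothetical single-pair illustration of that guideline, the bitmap
slice simply lives on a different spindle than the data volume it tracks
(hostnames and devices are made up):

# Data volume on c1t0d0, its bitmap on a slice of a different disk (c3t0d0),
# so sequential data writes and sequential bitmap updates each stay sequential:
sndradm -e primhost /dev/rdsk/c1t0d0s0 /dev/rdsk/c3t0d0s1 \
        sechost /dev/rdsk/c1t0d0s0 /dev/rdsk/c3t0d0s1 ip sync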
- Jim Dunham
Roman Naumenko
Frontline Technologies
--
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss