On Mon, Oct 04, 1999 at 04:55:04PM -0600, Brian Grossman wrote:
> 
> Has anyone tried running sw-raid1 over linux's network block device?  I've
> been tempted, but not quite tempted enough yet.

I think I saw that discussed on lkml.  There were some problems (mainly
performance under heavy load, IIRC), but it should be doable.  Especially if
anyone needs the feature ;)

> Does sw-raid1 require synchronous data xfer for the block devices involved?
> If so, can sw-raid1 be told to not require synchronous data xfer?  Does it
> make any sense to do so?

The RAID layer has nothing to go on but the block device's word: when the
device says a write is done, RAID treats it as done.  Whether the NBD code says
it's done when the data has reached its destination, or already when it has
reached some cache (or the wire), I don't know.

> Could sw-raid1 do reads only from one device, but do writes to both (for
> performance on nwb)?

That would require tweaking in the RAID code.
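
To make the idea concrete, here is a toy sketch of the policy you describe:
every write goes to both members, reads are served from one preferred member
only.  This is not the kernel RAID code; the class name and the use of plain
dicts as stand-ins for block devices are my own assumptions for illustration.

```python
class ReadLocalMirror:
    """Toy RAID-1 policy: write to every member, read from one preferred member.

    Hypothetical illustration only; block devices are modelled as dicts
    mapping block number -> bytes.
    """

    def __init__(self, local, remote):
        self.members = [local, remote]  # e.g. a local disk and an NBD import
        self.read_from = local          # reads never touch the network

    def write(self, block_no, data):
        # A write only counts as mirrored once every member has it.
        for dev in self.members:
            dev[block_no] = data

    def read(self, block_no):
        return self.read_from[block_no]


local_disk, nbd_import = {}, {}
mirror = ReadLocalMirror(local_disk, nbd_import)
mirror.write(0, b"hello")
```

After the write, both `local_disk` and `nbd_import` hold block 0, while
`mirror.read(0)` only ever looks at `local_disk`.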

But if you plug a Fast Ethernet (FE) card into each machine, dedicated to the
NBD (or one card for each NBD), you should get performance similar to that of
normal disks: around 12 MB/s read/write (simultaneously - that's even better
than most disks - but I'm not sure that matters a lot).
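
For what it's worth, the 12 MB/s figure is about what the wire allows.  Fast
Ethernet is nominally 100 Mbit/s, and assuming a full-duplex link that rate is
available in each direction at once.  A quick sanity check, counting the raw
line rate only and ignoring protocol overhead (my simplification):

```python
# Nominal Fast Ethernet line rate; protocol overhead is deliberately ignored.
LINK_MBIT_PER_S = 100


def raw_mbyte_per_s(mbit_per_s):
    """Convert a link rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8


print(raw_mbyte_per_s(LINK_MBIT_PER_S))  # 12.5
```

So roughly 12 MB/s after overhead is what a dedicated link tops out at.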

> Something else I've been thinking about, but haven't had time to
> investigate properly, is intercepting filesystem calls and shipping them
> off to another machine, where they are duplicated.  Anybody here have an
> idea how difficult that would be?  Could it be done in a fs-independent way?

Ouch.  That could get really really nasty.  RPC in the kernel is nasty already.
Doing RAID over NBDs is pretty close to your approach, as I see it, and it's
fairly clean.

There are still implications though...  You must ensure that only one machine
at a time has the filesystem mounted, or you'll see some strange stuff
happening to your data :)
Besides, if the machine that had the filesystem mounted crashes, the filesystem
will need a (argh, don't say it!) fsck!  I guess this is the real showstopper.

Something like PVFS (http://ece.clemson.edu/parl/pvfs/index.html) seems like
the right thing to use.  Unfortunately, it does not seem to implement
redundancy :(

It should be doable to write a user-space daemon that monitors the other hosts.
When one of them stops responding, the daemon marks the NBD imported from that
host as ``failed'' in the RAID, and our new host fscks and mounts the
filesystem.
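
A minimal sketch of the monitoring half of such a daemon, under my own
assumptions: the peer counts as dead after a few consecutive failed TCP
connection attempts, and everything that has to happen then (marking the NBD
as failed, fsck, mount) is left to a caller-supplied on_failure callback.
All function and parameter names here are hypothetical.

```python
import socket
import time


def host_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitor(host, port, on_failure, probe=host_alive,
            max_misses=3, interval=5.0, sleep=time.sleep):
    """Poll the peer until it misses max_misses probes in a row.

    on_failure(host) is then called exactly once and the function returns.
    The callback is where the failover steps would go; they are not
    implemented here.
    """
    misses = 0
    while True:
        if probe(host, port):
            misses = 0  # any successful probe resets the counter
        else:
            misses += 1
            if misses >= max_misses:
                on_failure(host)
                return
        sleep(interval)
```

Requiring several misses in a row keeps a single dropped packet from
triggering a takeover; picking the port, interval and threshold is a
site-specific decision.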

It would probably be wise to look into ReiserFS or ext3fs, or any other
filesystem with journalling support.  The fsck after a crash is the bad part,
unless I've overlooked something.

................................................................
: [EMAIL PROTECTED]  : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
