On 18/03/07, Dan Bar Dov <[EMAIL PROTECTED]> wrote:

That's a good question to which I have no answer.
I don't know how Google does it. I can think of:
1. special file system
2. some kind of "scrubber" - a daemon scanning for FS changes and copying
whatever changed
3. use a sync tool (rsync?) on a daily (hourly?) basis

I doubt Google has (1). This is some good startup material.
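For what it's worth, option (3) can be as simple as a cron-driven one-way rsync to a partner node; the schedule, paths, and hostname below are placeholders, not anything from a real deployment:

```
# Hypothetical crontab entry: hourly one-way sync of a data directory
# to a partner node.  -a preserves metadata, -z compresses over the
# wire, --delete drops files that vanished from the source.
0 * * * *  rsync -az --delete /srv/data/ partner-node:/srv/data/
```

The obvious drawback is the window between runs: anything written since the last sync is lost if the node dies.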


Actually they have (1): [1], and they rely on it heavily. But apart from
describing it in that paper (and mentioning it in other publications, e.g.
[2] is the latest I have read, among tons of others) they don't provide the
actual code, which I suppose I understand, since it could be viewed as part
of their core business ("organizing the world's information").

Another simple idea I had overnight - cross-mirror disks among a couple
of nodes (preferably on different racks, once we reach such a stage), either
as database replication or through DRBD-style disk replication (a sort of
"RAID 1 over the net"); then, if one of the nodes goes down, its partner can
take over handling of its queue of jobs.
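A DRBD pairing along those lines would be configured roughly as below; this is a hedged sketch, and the resource name, devices, and addresses are all made up for illustration:

```
# Excerpt of a hypothetical /etc/drbd.conf for a two-node mirror.
# protocol A = asynchronous replication: a write is acknowledged as
# soon as it hits the local disk, matching the "async mirroring" idea.
resource jobdata {
  protocol A;
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

On failure of node1, node2 promotes the resource to primary and mounts the mirrored device to pick up the dead node's queue.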

If the rest of the system is architected to use local data as much as
possible (i.e. all the stages which process the same piece of info run on
the same node), this might be enough to achieve both redundancy and
reliability while still minimizing network traffic (the second conclusion in
[2]), since the mirroring over the net can be done asynchronously - losing a
transaction during a node's crash should have negligible effect.
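To make the "negligible effect" claim concrete, here is a toy model of asynchronous mirroring; everything in it (the Node class, the job names, the batch shipping) is illustrative, not any real system's API:

```python
# Toy model of async mirroring: the primary acknowledges a job locally
# first and ships its log to the partner in batches.  A crash between
# batches loses at most the unshipped tail - the negligible loss
# assumed above.

class Node:
    def __init__(self):
        self.local_log = []    # jobs acknowledged on this node
        self.mirror_log = []   # partner's asynchronously shipped copy

    def submit(self, job):
        # Acknowledged immediately after the local write; no network wait.
        self.local_log.append(job)

    def ship_batch(self, partner):
        # Asynchronous replication: push everything logged so far.
        partner.mirror_log = list(self.local_log)

a, b = Node(), Node()
a.submit("crawl:page1")
a.submit("index:page1")
a.ship_batch(b)            # batch reaches the partner
a.submit("rank:page1")     # logged locally, not yet shipped...
# ...node A crashes here; B takes over from its mirror copy.
print(b.mirror_log)        # only the last, unshipped job is lost
```

The trade-off is exactly the one stated above: writes never block on the network, at the cost of possibly losing the last batch when a node dies.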

[1] http://labs.google.com/papers/gfs.html
[2] http://labs.google.com/papers/mapreduce.html

--Amos
