Hi,

We run with 2-way replication. The wonderful folks at Yahoo! worked through most of the bugs during 0.19.x, IIRC. There were never any bugs with 2-way replication per se, but running a cluster with 2 replicas exposed other bugs at a 100x rate compared to running with 3 replicas (because a silent corruption plus the loss of a single data node means file loss).
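
To make the failure mode in the parenthesis concrete, here is a toy sketch. It is not HDFS code; the function name and the two-list representation are assumptions made up for illustration. It only models the rule that a block is gone once every replica is either silently corrupt or sitting on a dead node.

```python
def block_lost(replica_corrupt, replica_on_dead_node):
    """Toy model (hypothetical, not an HDFS API).

    Each list has one boolean per replica. A block is lost when every
    replica is either silently corrupt or on a failed data node.
    """
    return all(c or d for c, d in zip(replica_corrupt, replica_on_dead_node))


# 2 replicas: one is silently corrupt, the other is on the node that died.
two_way = block_lost([True, False], [False, True])

# 3 replicas: the same two faults, but the third replica is still healthy.
three_way = block_lost([True, False, False], [False, True, False])

print(two_way, three_way)
```

With the same pair of faults, the 2-replica case loses the block and the 3-replica case does not, which is the asymmetry described above.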

I'd estimate we lose files at a rate of about 1 per month for 200TB of actual data. That number would probably go down by an order of magnitude or more if we were running with 3 replicas.

Hope this helps.

Brian

On Jan 10, 2010, at 3:55 AM, Eli Collins wrote:

>> data.replication = 2
>>
>> A bit off topic - is it safe to have such a number? About a year ago I heard
>> only 3 way replication was fully tested, while 2 way had some issues - was
>> it fixed in subsequent versions?
>
> I think that's still a relatively untested configuration, though I'm
> not aware of any known bugs with it. I know of at least one cluster
> that uses 2-way replication. Note that 3-way replication is used both
> for availability and performance, though in a write benchmark 2-way
> replication should be faster than 3-way.
>
> Thanks,
> Eli
