Hi,

We run with 2-way replication.  The wonderful folks at Yahoo! worked through 
most of the bugs during 0.19.x, IIRC.  There were never any bugs with 2-way 
replication per se, but running a cluster with 2 replicas exposed other bugs at 
a 100x rate compared to running with 3 replicas (because a silent corruption 
plus the loss of a single data node = file loss).

I'd estimate we lose files at a rate of about 1 per month for 200TB of actual 
data.  That number would probably go down an order of magnitude or more if we 
were running with 3 replicas.
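
To put those numbers on a yearly scale, here is a rough back-of-the-envelope 
sketch.  The function name and the 10x reduction factor are only illustrative 
assumptions standing in for "an order of magnitude"; this is not a measurement 
and not a model of HDFS internals.

```python
# Back-of-the-envelope arithmetic only; not a model of HDFS behaviour.
# ASSUMPTION: reduction_factor=10.0 is a stand-in for "an order of magnitude";
# the real improvement from a third replica is an estimate, not a measurement.

def files_lost_per_year(files_per_month, reduction_factor=1.0):
    """Scale a monthly file-loss rate to a yearly one, optionally reduced."""
    return files_per_month * 12 / reduction_factor

DATA_TB = 200.0  # actual data stored, from the estimate above

observed_2x = files_lost_per_year(1.0)                          # 2 replicas
projected_3x = files_lost_per_year(1.0, reduction_factor=10.0)  # hypothetical

print("2 replicas: about", observed_2x, "files/year")
print("3 replicas (assumed 10x better): about", projected_3x, "files/year")
print("2 replicas, per TB:", observed_2x / DATA_TB, "files/year")
```

So roughly a dozen files a year at our size, and on the order of one a year 
if the 10x guess holds.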

Hope this helps.

Brian

On Jan 10, 2010, at 3:55 AM, Eli Collins wrote:

>> data.replication = 2
>> 
>> A bit off topic - is it safe to have such a number? About a year ago I heard
>> only 3 way replication was fully tested, while 2 way had some issues - was
>> it fixed in subsequent versions?
> 
> I think that's still a relatively untested configuration, though I'm
> not aware of any known bugs with it. I know of at least one cluster
> that uses 2-way replication.  Note that 3-way replication is used both
> for availability and performance, though in a write benchmark 2-way
> replication should be faster than 3-way.
> 
> Thanks,
> Eli
