Greetings Riak Users-

I googled, checked the wiki, and searched the past year of the mailing list 
archives to see if this had come up before. I'm planning to use N=3 
replication for our data in the cluster we'll be building soon. It's going to 
be a small cluster, and we're a bootstrapped startup with only a little funding, 
so we're looking at using Hetzner 4S dedicated servers (32GB RAM, 3TB disks, 8 
cores, for about $50 a month).

However, I've been seeing advice that we shouldn't use this because it's not ECC 
memory: that "memory errors are a lot bigger than people think", and that this 
can cause data to rot over time.

So, my question is this: when you issue a read request to Riak, and the data 
is stored on 3 nodes, does any kind of error-checking code ever get generated 
and compared? 

Suppose I had an address record on three nodes, but at the moment the record 
was being written to one of the nodes a cosmic ray flipped a bit, and instead of 
123 Main Street, the address on that node read 223 Main Street. 

When I read that record, and all three nodes respond, will I simply get the 
result of whichever node is fastest?

If I read with R=2, so that two nodes have to respond, are the results 
from those two nodes compared? 

I know the vector clock will be compared to make sure to return the latest 
record, but in this situation the vector clocks would be the same even though 
the data isn't. 
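
To make the question concrete, here is a minimal Python sketch of the kind of 
comparison I have in mind. It is not Riak's API and I am not claiming Riak 
does this; majority_value is a hypothetical helper, and the three byte strings 
stand in for the values returned by the three replicas.

```python
import hashlib
from collections import Counter


def majority_value(replica_values):
    """Return the value held by a strict majority of replicas, else None.

    replica_values is a list of bytes, one entry per replica that answered.
    (Hypothetical helper for illustration only; not part of any Riak client.)
    """
    if not replica_values:
        return None
    # Hash each replica's value so replicas can be compared by digest.
    digests = [hashlib.sha256(value).hexdigest() for value in replica_values]
    digest, count = Counter(digests).most_common(1)[0]
    # Require a strict majority; a 1-1 split between two replicas is undecidable.
    if count * 2 <= len(replica_values):
        return None
    return replica_values[digests.index(digest)]


replicas = [b"123 Main street", b"223 Main street", b"123 Main street"]
winner = majority_value(replicas)
```

With three replicas the corrupted copy is outvoted. With only two responses 
(R=2) a mismatch can be detected but not resolved, which is part of why I'm 
asking what Riak actually does.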

Assuming there's no hash generated from the data that would catch or correct 
this type of error, I'm interested in hearing from people with largish clusters 
and knowing whether you use ECC RAM on them or not. 
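
If nothing like that exists, one workaround I can imagine is storing a 
checksum alongside each record in the application itself. A rough Python 
sketch, under my own assumptions: wrap and unwrap are hypothetical names, the 
envelope layout is made up, and nothing here comes from Riak or its clients.

```python
import hashlib
import json


def wrap(record):
    """Serialize a dict and attach a SHA-256 digest of the serialized body."""
    body = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return json.dumps({"body": body, "sha256": digest}).encode("utf-8")


def unwrap(blob):
    """Verify the digest and return the original dict, or raise ValueError."""
    envelope = json.loads(blob.decode("utf-8"))
    body = envelope["body"]
    if hashlib.sha256(body.encode("utf-8")).hexdigest() != envelope["sha256"]:
        raise ValueError("checksum mismatch: stored record is corrupted")
    return json.loads(body)


blob = wrap({"address": "123 Main street"})
record = unwrap(blob)
```

This would only catch corruption that happens after the checksum is computed. 
A bit flipped in my application's memory before wrap runs would be 
checksummed as if it were correct.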

Basically looking for some advice from people with more experience, as the ones 
advocating ECC are pretty fervent but the cost difference is significant. 

Thanks in advance!

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
