On Thu, Feb 2, 2012 at 2:58 PM, Gary Smith <[email protected]> wrote:
>
> Personally, I think that the servers that you are looking at are more than
> fine. Others may disagree and cite things like "production data", etc.
> Those who believe that ECC is the cure-all also forget to tell you that even
> data on disk can, in theory, change over time. In a contrived scenario, two
> cosmic rays could hit two memory chips (the primary chip and the ECC-specific
> chip) and flip a bit in each. That would not generate an error, since the
> checksum would still match.

I've actually seen a scenario where intermittent RAM problems caused
mostly hidden differences between the two disk instances of a software RAID
mirror. It wasn't fun to diagnose and fix, since the disk reads would
randomly return good or bad data even after the RAM was fixed. I'm
sure that is a rare thing, but...
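
To illustrate the kind of divergence described above, here is a rough
sketch of a block-by-block comparison between two copies that are supposed
to be identical. It is not how any particular RAID implementation checks
its mirrors; the file names, the 4096-byte block size, and the helper
names (find_mismatched_blocks, demo) are all assumptions made up for the
example, and it works on ordinary files standing in for the two mirror
halves.

```python
import os
import tempfile

# Assumed comparison granularity; not tied to any real RAID chunk size.
BLOCK_SIZE = 4096


def find_mismatched_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return the byte offsets of blocks that differ between two files."""
    mismatches = []
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both copies exhausted
            if block_a != block_b:
                mismatches.append(offset)
            offset += block_size
    return mismatches


def demo():
    """Write two hypothetical mirror halves, flip one bit, and locate it."""
    data = bytearray(os.urandom(4 * BLOCK_SIZE))
    with tempfile.TemporaryDirectory() as tmp:
        half_a = os.path.join(tmp, "mirror_a.img")  # hypothetical file name
        half_b = os.path.join(tmp, "mirror_b.img")  # hypothetical file name
        with open(half_a, "wb") as f:
            f.write(data)
        # Simulate a bad write: flip a single bit in the third block.
        data[2 * BLOCK_SIZE + 10] ^= 0x01
        with open(half_b, "wb") as f:
            f.write(data)
        return find_mismatched_blocks(half_a, half_b)


if __name__ == "__main__":
    print(demo())
```

Note that a comparison like this only shows where the two copies differ.
It cannot say which copy is the good one, which is a large part of why
this sort of problem is painful to clean up.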
--
Les Mikesell
[email protected]
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com