On Thu, Feb 26, 2009 at 3:28 AM, john_re john_re-at-fastmail.us |PDX Linux| <...> wrote:
> Do you use ECC RAM? Do you have any data about failure rates?
>
> I'm evaluating this for a system with 8GB DRAM, &
> http://en.wikipedia.org/wiki/Dynamic_random_access_memory#Errors_and_error_correction
> says
> "Tests [ecc] give widely varying error rates, but about 10^-12 upset/bit-hr
> is typical, roughly one bit error, per month, per gigabyte of memory.
>
> In most computers used for serious scientific or financial computing and
> as servers, ECC is the rule rather than the exception, as can be seen by
> examining manufacturers' specifications."
>
>
> So, for that data 8GB DRAM is about 8 errors per month, ie about
> one per 3-4 days.
>
> What rates do you have?

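For reference, the arithmetic in the quoted question can be sketched roughly as follows. This is only a back-of-the-envelope check, not a measurement: the function names are made up for illustration, and the 30-day month and 1 GB = 2**30 bytes are assumptions that do not come from the post or from the Wikipedia text.

```python
# Back-of-the-envelope check of the quoted soft-error figures.
# Assumptions (not from the original post): a 30-day month and 1 GB = 2**30 bytes.

DAYS_PER_MONTH = 30
HOURS_PER_MONTH = 24 * DAYS_PER_MONTH
BITS_PER_GB = 8 * 2**30


def errors_per_month(gigabytes, rate_per_gb_month=1.0):
    """Expected bit errors per month, given a per-gigabyte monthly rate."""
    return gigabytes * rate_per_gb_month


def days_between_errors(monthly_errors):
    """Average number of days between errors for a given monthly error count."""
    return DAYS_PER_MONTH / monthly_errors


def per_bit_hour_to_per_gb_month(upsets_per_bit_hour):
    """Convert an upsets/bit-hr rate into expected errors per gigabyte per month."""
    return upsets_per_bit_hour * BITS_PER_GB * HOURS_PER_MONTH


if __name__ == "__main__":
    monthly = errors_per_month(8)  # 8 GB at "one error per month per gigabyte"
    print("errors per month for 8 GB:", monthly)
    print("days between errors:", days_between_errors(monthly))
    print("errors per GB-month at 1e-12 upset/bit-hr:",
          per_bit_hour_to_per_gb_month(1e-12))
```

With the "one error per month per gigabyte" figure, 8 GB gives 8 errors per month, or one every 3.75 days, which matches the "one per 3-4 days" estimate. Taking 10^-12 upset/bit-hr literally under the same assumptions gives roughly 6 errors per gigabyte per month, so the two quoted figures agree only to within an order of magnitude.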
Under normal operation in a data center environment on high quality server-class hardware, I'll see one or two ECC corrected single bit errors per quarter on hosts with 24GB+ of RAM. When RAM is on its way out, those rates go WAY up even though it's still functional.

Under harsher circumstances (e.g. non-data center) the rates are probably higher, but I have no hard data on them.

After corrupting many TB of data on one of my home systems (yes, including backups) due to creeping memory problems, I think it will be a while before I skip ECC for cost reasons again.

-- Steve
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug
