Re: Weird file corruption?

1999-01-15 Thread Zach Heilig
On Wed, Jan 13, 1999 at 05:26:40PM -0500, Brian Feldman wrote:
...
 How could it be memory when it's written to disk, extracted, then after a
 nearly full build read again? Why would it extract completely the first time
 with no errors?

I had this exact same problem when evaluating pentium motherboards a
while back.  One of my (parity!) simms was marginal [later verified
with a simm checker], but it was not noticed until after I made one
corrupt archive [and deleted the originals].  Luckily, both the simm
and the motherboard were still returnable.
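One way to avoid the "deleted the originals" trap is to compare the archive
against the source tree before removing anything. The following is only a
sketch in Python; the function names (sha256_of_stream, mismatched_members)
are made up for illustration. Note that a clean result is not proof: the
re-read may be served from the buffer cache, which lives in the same suspect
memory.

```python
import hashlib
import os
import tarfile


def sha256_of_stream(stream, chunk_size=1 << 16):
    """Hash a file-like object in chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def mismatched_members(archive_path, source_root):
    """Return names of regular files whose archived bytes differ from disk."""
    bad = []
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, devices
            original = os.path.join(source_root, member.name)
            archived = archive.extractfile(member)
            try:
                with open(original, "rb") as on_disk:
                    same = sha256_of_stream(archived) == sha256_of_stream(on_disk)
            except OSError:
                same = False  # original missing or unreadable
            if not same:
                bad.append(member.name)
    return bad
```

An empty list from mismatched_members() means every archived file matched its
original at the time of the check; anything else names the files to look at
before deleting the source tree.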

This also verified for me that most Intel chipsets for pentium do
not use parity even if available.

-- 
Zach Heilig z...@uffdaonline.net / Zach Heilig z...@gaffaneys.com

To Unsubscribe: send mail to majord...@freebsd.org
with unsubscribe freebsd-current in the body of the message


Re: Weird file corruption?

1999-01-15 Thread Brian Feldman
On Fri, 15 Jan 1999, Zach Heilig wrote:

 On Wed, Jan 13, 1999 at 05:26:40PM -0500, Brian Feldman wrote:
 ...
  How could it be memory when it's written to disk, extracted, then after a
  nearly full build read again? Why would it extract completely the first
  time with no errors?
 
 I had this exact same problem when evaluating pentium motherboards a
 while back.  One of my (parity!) simms was marginal [later verified
 with a simm checker], but it was not noticed until after I made one
 corrupt archive [and deleted the originals].  Luckily, both the simm
 and the motherboard were still returnable.

Good to know I am looking in the right place.
I switched my timings from Turbo to Normal (I have 2 EDO/2 FP), and now it
seems to pass tests, but I think I did see a few bytes get corrupted in an image
in netscape... ah well, so you'd recommend finding someone with a SIMM checker?

 
 This also verified for me that most Intel chipsets for pentium do
 not use parity even if available.
 
 -- 
 Zach Heilig z...@uffdaonline.net / Zach Heilig z...@gaffaneys.com
 

 Brian Feldman_ __  ___ ___ ___  
 gr...@unixhelp.org   _ __ ___ | _ ) __|   \ 
 http://www.freebsd.org/ _ __ ___  | _ \__ \ |) |
 FreeBSD: The Power to Serve!  _ __ ___  _ |___/___/___/ 




Re: Weird file corruption?

1999-01-15 Thread Zach Heilig
On Fri, Jan 15, 1999 at 02:51:01PM -0500, Brian Feldman wrote:
 Good to know I am looking in the right place.
 I switched my timings from Turbo to Normal (I have 2 EDO/2 FP), and now it
 seems to pass tests, but I think I did see a few bytes get corrupted in an
 image in netscape... ah well, so you'd recommend finding someone with a SIMM
 checker?

Except simm checkers don't always catch errors, so if the simm passes,
there still is no guarantee (but simm checkers do weed out obvious
duds quicker than trying in a system).  Unfortunately, there is no
conclusive test [that I know about] to prove a simm is good.  The
best test I know is to use empirical evidence based on how the system
acts with or without the suspect simm.

Even better is to only use motherboards that support parity and/or ECC,
with parity/ecc simms/dimms.  Then you catch most problems right away,
rather than scratching your head (and doing the "if I twiddle this knob,
it seems to reduce the problem... I think" routine).

-- 
Zach Heilig z...@uffdaonline.net / Zach Heilig z...@gaffaneys.com



Re: Weird file corruption?

1999-01-15 Thread Peter Jeremy
Zach Heilig z...@uffdaonline.net wrote:
 Except simm checkers don't always catch errors, so if the simm passes,
 there still is no guarantee (but simm checkers do weed out obvious
 duds quicker than trying in a system).  Unfortunately, there is no
 conclusive test [that I know about] to prove a simm is good.

I'd agree with that.  The guts of a DRAM (of any type) is high-speed
analog circuitry with delicate multi-phase clocking (the external
clocks and selects are internally subdivided into maybe 20 phases).
There can be pattern-sensitive crosstalk between rows or columns that
depends on inter-access timings - or even how long since a particular
row was refreshed.  I suspect it's impossible to prove that a SIMM
is good - there are too many combinations to test.
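To put a number on "too many combinations": even 1 KiB of memory is 8192
bits, so it has 2^8192 possible states, before access order and timing are
considered. A pattern test therefore only samples that space. The sketch
below, in Python, shows the general shape of such a test; the names
(walking_ones, check_buffer, PATTERNS) are illustrative only. Run from
userspace, it has no control over physical addresses, refresh or timing, so
it can find a fault but never demonstrate that there is none.

```python
def walking_ones(width=8):
    """Yield byte patterns with a single 1 bit in each position."""
    for bit in range(width):
        yield 1 << bit


def check_buffer(size, patterns):
    """Fill a buffer with each pattern, read it back, and collect mismatches.

    Returns a list of (offset, expected_byte, actual_byte) tuples.
    """
    failures = []
    buf = bytearray(size)
    for pattern in patterns:
        expected = bytes([pattern]) * size
        buf[:] = expected  # write the pattern over the whole buffer
        if buf != expected:  # read back and compare
            failures.extend(
                (offset, pattern, actual)
                for offset, actual in enumerate(buf)
                if actual != pattern
            )
    return failures


# All zeros, all ones, two alternating-bit patterns, then walking ones.
PATTERNS = [0x00, 0xFF, 0x55, 0xAA, *walking_ones()]
```

Calling check_buffer(1 << 20, PATTERNS) exercises 1 MiB with twelve patterns,
which is a vanishingly small fraction of the states described above.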

 Even better is to only use motherboards that support parity and/or ECC,
 with parity/ecc simms/dimms.

This is the only practical way to detect memory problems.

A subsidiary problem is that, unlike say Solaris, FreeBSD doesn't
automatically report ECC errors.  Without this, your memory controller
can be furiously correcting a hard single-bit error and die when
it glitches to a double-bit error.  Someone did post a script that
checked and cleared the relevant register in an i440 several months
ago, but I seem to have mislaid both the script and reference :-(.
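To illustrate why silent correction is dangerous, here is a toy
single-error-correct, double-error-detect code in Python: a Hamming(7,4) code
plus one overall parity bit. It is a sketch of the principle only, not what
the i440 does internally; real memory controllers use much wider codewords,
and the names encode and decode are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1-7 are the Hamming
    positions, with check bits at 1, 2 and 4 and data at 3, 5, 6 and 7.
    """
    data = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8
    word[3], word[5], word[6], word[7] = data
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = sum(word[1:]) % 2  # make the total number of 1 bits even
    return word


def decode(word):
    """Return (nibble, status): status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for position in range(1, 8):
        if word[position]:
            syndrome ^= position  # XOR of set positions is 0 for a valid word
    overall = sum(word) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as a single flipped bit. The syndrome is its
        # position (0 means the overall parity bit itself).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but nonzero syndrome: two bits flipped, cannot repair.
        return None, "uncorrectable"
    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return nibble, status
```

The "corrected" status is the event that needs reporting: a stuck bit yields
"corrected" on every read and the data still looks fine, until a second bit
in the same word flips and the result becomes "uncorrectable".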

Peter
