RE: Disk errors when copying

2007-09-10 Thread Ted Mittelstaedt


 -----Original Message-----
 From: Lars Eighner [mailto:[EMAIL PROTECTED]
 Sent: Sunday, September 09, 2007 11:17 AM
 To: Ted Mittelstaedt
 Cc: Richard Tobin; freebsd-questions@freebsd.org
 Subject: RE: Disk errors when copying


 On Fri, 7 Sep 2007, Ted Mittelstaedt wrote:

 
 
  Subject: Disk errors when copying
 
 
  When copying between disks (ad10 -> ad8), I get errors:
 
  ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
  LBA=435128800
  ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
  error=10<NID_NOT_FOUND> LBA=435128800
  g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5
 
  I don't get these errors just reading the data from ad10.  Is this
  some kind of system error rather than a bad disk?  Is it a known problem?
 
 
  Yes it is a known problem.  It does not happen with most combinations
  of drives and controllers.  You need to exhaustively document the
  motherboard/controller/hard disk and put it into a PR and file it
  so that the developer can add your combo to his database.  The more
  of these that are documented, the sooner a correlation is going
  to show up and the problem get fixed.

 I wish I'd known that before I trashed my disc and spent a couple of weeks
 and hundreds of bucks building a new system.


One of the rules of thumb when you have hardware problems with a new
system (I'm assuming of course that these UDMA errors have been
happening since the system was built) is to search both the FreeBSD
questions mailing list archives, and the PR database - both closed and
open PRs.  Particularly closed PRs are a wealth of information because
so many of them are closed for lack of followup.

A typical scenario: someone reports a problem like the one you're
having, and 3 months later the developer makes a change in the code
and asks the reporter to test the change and see if it fixed the
problem.  By then the original reporter has gone on to something else
and won't respond.  The developer then closes the PR and assumes
whatever he did fixed the problem.

If you do find closed PRs that are the same problem and same hardware
as yours, definitely refer to their numbers in your PR.

Ted

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


RE: Disk errors when copying

2007-09-10 Thread Lars Eighner

On Sun, 9 Sep 2007, Ted Mittelstaedt wrote:


From: Lars Eighner [mailto:[EMAIL PROTECTED]



I wish I'd known that before I trashed my disc and spent a couple of weeks
and hundreds of bucks building a new system.



One of the rules of thumb when you have hardware problems with a new
system (I'm assuming of course that these UDMA errors have been
happening since the system was built) is to search both the FreeBSD
questions mailing list archives, and the PR database - both closed and
open PRs.  Particularly closed PRs are a wealth of information because
so many of them are closed for lack of followup.


I got the (disc) manufacturer's utilities (which run on a bootable
FreeDOS CD) and ran every test over and over.  It kept telling me
the disc was fine.  I should have believed it.

I always feel a little weird about discs because although the manufacturer
and the BIOS agree on the geometry, FreeBSD always (over three or four boxes
with a half-dozen different discs) tells me the geometry is wrong.  It seems
so confident about it, I generally let it do what it wants.  But what does
FreeBSD know about the disc that the manufacturer and the BIOS don't?


--
Lars Eighner
http://www.larseighner.com/index.html
8800 N IH35 APT 1191 AUSTIN TX 78753-5266



RE: Disk errors when copying

2007-09-10 Thread Richard Tobin
   ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
   LBA=435128800
   ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
   error=10<NID_NOT_FOUND> LBA=435128800
   g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

 One of the rules of thumb when you have hardware problems with a new
 system (I'm assuming of course that these UDMA errors have been
 happening since the system was built)

In my case it happened once and did not recur.  But looking at the SMART
log on the disk it appears that it might have happened before without
my noticing.  I was copying the disk before moving it to a different
machine, so I probably won't be able to test it further.

I'm sending a PR.

-- Richard


RE: Disk errors when copying

2007-09-10 Thread Ted Mittelstaedt

Geometry is meaningless in LBA mode.

The drive and BIOS manufacturers agree on a convenient fiction to
reduce support calls.

Don't forget that under FreeDOS you're running in real mode, not
protected mode.  In real mode the segmented BIOS functions are
actually used, possibly even for addressing the disk, with the disk
controller chipset (essentially) emulating an MFM controller.

In the protected mode UNIX runs in, most of that BIOS code is
useless; the disk driver talks directly to the disk controller
chipset.  There is probably some undocumented misbehavior that
Microsoft got told about and worked around in their disk driver
code, but that the FreeBSD developers didn't get told about.
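
To illustrate why geometry is a fiction in LBA mode, here is a
minimal sketch of the conventional CHS/LBA mapping.  The two
geometries used below (16 heads/63 sectors and 255 heads/63 sectors)
are common translation values chosen for illustration, not values
reported anywhere in this thread; the function names are made up.

```python
def chs_to_lba(cyl, head, sector, heads_per_cyl, sectors_per_track):
    """Conventional CHS -> LBA mapping (sector numbers start at 1)."""
    if cyl < 0 or not 0 <= head < heads_per_cyl:
        raise ValueError("cylinder/head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    return (cyl * heads_per_cyl + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cyl, sectors_per_track):
    """Inverse mapping: LBA -> (cylinder, head, sector)."""
    cyl, rem = divmod(lba, heads_per_cyl * sectors_per_track)
    head, sector0 = divmod(rem, sectors_per_track)
    return cyl, head, sector0 + 1


if __name__ == "__main__":
    lba = 435128800  # the failing LBA from the error messages
    # The same block gets a different "address" under each assumed geometry.
    for heads, spt in ((16, 63), (255, 63)):
        print(heads, spt, lba_to_chs(lba, heads, spt))
```

The same LBA comes out as a different cylinder/head/sector triple
under each assumed geometry, and the drive only ever sees the LBA.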

Ted

 -----Original Message-----
 From: Lars Eighner [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 10, 2007 2:30 AM
 To: Ted Mittelstaedt
 Cc: freebsd-questions@freebsd.org
 Subject: RE: Disk errors when copying


 On Sun, 9 Sep 2007, Ted Mittelstaedt wrote:

  From: Lars Eighner [mailto:[EMAIL PROTECTED]

  I wish I'd known that before I trashed my disc and spent a couple of
  weeks and hundreds of bucks building a new system.
 
 
  One of the rules of thumb when you have hardware problems with a new
  system (I'm assuming of course that these UDMA errors have been
  happening since the system was built) is to search both the FreeBSD
  questions mailing list archives, and the PR database - both closed and
  open PRs.  Particularly closed PRs are a wealth of information because
  so many of them are closed for lack of followup.

 I got the (disc) manufacturer's utilities (which run on a bootable
 FreeDOS CD) and ran every test over and over.  It kept telling me
 the disc was fine.  I should have believed it.

 I always feel a little weird about discs because although the
 manufacturer and the BIOS agree on the geometry, FreeBSD always (over
 three or four boxes with a half-dozen different discs) tells me the
 geometry is wrong.  It seems so confident about it, I generally let it
 do what it wants.  But what does FreeBSD know about the disc that the
 manufacturer and the BIOS don't?


 --
 Lars Eighner
 http://www.larseighner.com/index.html
 8800 N IH35 APT 1191 AUSTIN TX 78753-5266





RE: Disk errors when copying

2007-09-09 Thread Lars Eighner

On Fri, 7 Sep 2007, Ted Mittelstaedt wrote:





Subject: Disk errors when copying


When copying between disks (ad10 -> ad8), I get errors:

ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
LBA=435128800
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
error=10<NID_NOT_FOUND> LBA=435128800
g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

I don't get these errors just reading the data from ad10.  Is this
some kind of system error rather than a bad disk?  Is it a known problem?



Yes it is a known problem.  It does not happen with most combinations
of drives and controllers.  You need to exhaustively document the
motherboard/controller/hard disk and put it into a PR and file it
so that the developer can add your combo to his database.  The more
of these that are documented, the sooner a correlation is going
to show up and the problem get fixed.


I wish I'd known that before I trashed my disc and spent a couple of weeks
and hundreds of bucks building a new system.

--
Lars Eighner
http://www.larseighner.com/index.html
8800 N IH35 APT 1191 AUSTIN TX 78753-5266



RE: Disk errors when copying

2007-09-07 Thread Ted Mittelstaedt


 -----Original Message-----
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Richard Tobin
 Sent: Wednesday, September 05, 2007 3:03 PM
 To: freebsd-questions@freebsd.org
 Subject: Disk errors when copying
 
 
 When copying between disks (ad10 -> ad8), I get errors:
 
 ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
 LBA=435128800
 ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
 error=10<NID_NOT_FOUND> LBA=435128800
 g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5
 
 I don't get these errors just reading the data from ad10.  Is this
 some kind of system error rather than a bad disk?  Is it a known problem?
 

Yes it is a known problem.  It does not happen with most combinations
of drives and controllers.  You need to exhaustively document the
motherboard/controller/hard disk and put it into a PR and file it
so that the developer can add your combo to his database.  The more
of these that are documented, the sooner a correlation is going
to show up and the problem get fixed.
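
When writing up the PR it may help to decode the register values in
the messages quoted above.  The following is only a sketch: the bit
positions follow the ATA status and error register layout, the names
READY, DSC, ERROR, ICRC and NID_NOT_FOUND are the ones that appear
in this thread, and the remaining names are my assumption of what
the driver prints.  The helper names and the 512-byte sector size
are likewise assumptions.

```python
import re

# (mask, name) pairs, highest bit first.  Names not seen in the thread's
# log lines are assumptions.
STATUS_BITS = [(0x80, "BUSY"), (0x40, "READY"), (0x20, "DMA_READY"),
               (0x10, "DSC"), (0x08, "DRQ"), (0x04, "CORRECTABLE"),
               (0x02, "INDEX"), (0x01, "ERROR")]
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNCORRECTABLE"),
              (0x20, "MEDIA_CHANGED"), (0x10, "NID_NOT_FOUND"),
              (0x08, "MEDIA_CHANGE_REQUEST"), (0x04, "ABORTED"),
              (0x02, "NO_MEDIA"), (0x01, "ILLEGAL_LENGTH")]


def decode_bits(value, table):
    """Return the names of all bits set in value."""
    return [name for mask, name in table if value & mask]


def parse_ata_failure(line):
    """Pull the status, error and LBA fields out of one FAILURE line."""
    status = re.search(r"status=([0-9a-fA-F]+)<", line)
    error = re.search(r"error=([0-9a-fA-F]+)<", line)
    lba = re.search(r"LBA=(\d+)", line)
    if not (status and error and lba):
        return None
    return {
        "status": decode_bits(int(status.group(1), 16), STATUS_BITS),
        "error": decode_bits(int(error.group(1), 16), ERROR_BITS),
        "lba": int(lba.group(1)),
    }


def partition_sector(offset_bytes, sector_size=512):
    """Convert a g_vfs_done() byte offset to a sector within the partition."""
    return offset_bytes // sector_size


if __name__ == "__main__":
    line = ("ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> "
            "error=10<NID_NOT_FOUND> LBA=435128800")
    print(parse_ata_failure(line))
    print(partition_sector(175562145792))
```

As far as I know the g_vfs_done() offset is relative to the start of
ad10s2g while the LBA is relative to the whole disk, so the two
numbers are not expected to match.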

Ted


Re: Disk errors when copying

2007-09-06 Thread Ivan Voras

Richard Tobin wrote:

When copying between disks (ad10 -> ad8), I get errors:

ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=435128800
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=435128800
g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

I don't get these errors just reading the data from ad10.  Is this
some kind of system error rather than a bad disk?  Is it a known problem?


It doesn't match any recent known problem - it looks like a disk error. 
You might want to pinpoint the file which causes it and skip that file. 
Use sysutils/smartmontools to test and monitor the drive.
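
One rough way to pinpoint the file is to read every file on the
filesystem and note which ones fail.  The sketch below assumes the
filesystem is mounted and that a bad block shows up as an I/O error
(errno 5, as in the g_vfs_done message); the function name and the
mount point in the example are made up.

```python
import os


def find_unreadable_files(root, chunk_size=128 * 1024):
    """Read every regular file under root; return (path, errno) for failures."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (a FIFO would block forever).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                bad.append((path, exc.errno))
    return bad


if __name__ == "__main__":
    # "/mnt/ad10s2g" is a hypothetical mount point.
    for path, err in find_unreadable_files("/mnt/ad10s2g"):
        print(err, path)
```

Files already in the buffer cache may be read without touching the
disk, so a clean pass right after a copy is not conclusive.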



