Disk Errors

2012-07-24 Thread dweimer
Just curious: I am sure the likely issue is a bad disk, but I thought 
there might be a chance this could be caused by something else.


I have three 1TB disks I use for backup. Two of them are Western 
Digital drives I bought specifically for this purpose. The third is a 
Seagate drive that came out of a barebones PC, where I replaced it with 
a couple of smaller drives in a stripe to gain performance. I use the 
drives in an external SATA dock with GEOM ELI (geli) encryption. The 
Western Digital drives give me no problems, but the Seagate drive gives 
me a lot of the following errors under load.


ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296

dmesg info about the drive at connection time:
ad4: 953869MB Seagate ST31000528AS CC46 at ata2-master UDMA100 SATA 3Gb/s


dmesg info about one of the Western Digital drives:
ad4: 953869MB WDC WD10EARS-00Y5B1 80.00A80 at ata2-master UDMA100 SATA 3Gb/s


Before I scrap the drive, I wanted to see if anyone could either say 
for sure that it's hardware, or whether something else could cause 
this. I don't suspect the controller, cable, or dock, as the problems 
would likely occur with the Western Digital drives as well if one of 
them were involved.
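
When comparing drives or before/after a change, it can help to tally the messages rather than eyeball them. The following is a minimal sketch, assuming the console lines keep the shape shown above; the regular expression and the `summarize` helper are illustrative names, not part of any FreeBSD tool. ICRC stands for an interface CRC error on the transfer, so it is counted separately from plain timeouts.

```python
import re
from collections import Counter

# Matches messages shaped like the ones posted above, e.g.
#   ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
LINE_RE = re.compile(
    r"^(?P<dev>\w+): (?P<level>TIMEOUT|WARNING|FAILURE) - "
    r"(?P<cmd>\w+) (?P<detail>.*?) LBA=(?P<lba>\d+)$"
)


def summarize(log_text):
    """Return (Counter keyed by (device, kind), sorted list of LBAs)."""
    counts = Counter()
    lbas = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore anything that is not an ata error line
        # Report interface CRC errors as their own kind.
        kind = "ICRC" if "ICRC" in m.group("detail") else m.group("level")
        counts[(m.group("dev"), kind)] += 1
        lbas.append(int(m.group("lba")))
    return counts, sorted(lbas)


SAMPLE = """\
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296
"""

if __name__ == "__main__":
    counts, lbas = summarize(SAMPLE)
    for (dev, kind), n in sorted(counts.items()):
        print(dev, kind, n)
    print("LBA range:", lbas[0], "-", lbas[-1])
```

Feeding it the saved output of dmesg for each drive gives comparable counts per device.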



--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: Disk Errors

2012-07-24 Thread Wojciech Puchar

Just curious, I am sure the likely issue is a bad disk, but I thought there


actually not that likely.

I had such problems, occurring randomly on many drives, and all the 
problems disappeared after changing the computer, with the same disk.


BTW, I would recommend turning on the AHCI driver.




Re: Disk Errors

2012-07-24 Thread Dan Nelson
In the last episode (Jul 24), dweimer said:
 I have three 1TB disks I use for backup, two of them are Western Digital
 drives I bought specifically for this purpose.  One is a Seagate drive
 that came out of a barebones PC that I replaced with a couple smaller
 drives in a stripe to gain performance.  I use the drives in an external
 SATA dock, using geom eli encryption, the western digital drives give me
 no problems, but the seagate drive gives me a lot of the following errors
 under load.
 
 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296

If you install the sysutils/smartmontools port, you can run smartctl -x
/dev/ad4 to dump the drive's SMART attribute table and error logs.  Those
should give you an indication of whether the drive is going bad.  If the
drive is logging those write errors in its internal log, then you know it's
not a cabling issue.  If it's not logging errors, I suppose you might have a
loose SATA plug on the drive itself, which would explain why the problem
follows the drive around.
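
For anyone who wants to check the attribute table programmatically rather than by eye, here is a minimal sketch. It assumes the eight-column layout that smartctl -x prints (ID#, ATTRIBUTE_NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE); `SmartAttr`, `parse_attr_line`, and `needs_attention` are hypothetical helper names, and the sample line uses values for this drive posted later in the thread.

```python
from typing import NamedTuple, Optional


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int
    worst: int
    thresh: int
    fail: str
    raw: str


def parse_attr_line(line: str) -> Optional[SmartAttr]:
    """Parse one attribute row; return None for headers and other lines."""
    parts = line.split(None, 7)
    if len(parts) < 8 or not parts[0].isdigit():
        return None
    try:
        value, worst, thresh = int(parts[3]), int(parts[4]), int(parts[5])
    except ValueError:
        return None
    return SmartAttr(int(parts[0]), parts[1], parts[2],
                     value, worst, thresh, parts[6], parts[7].strip())


def needs_attention(attr: SmartAttr) -> bool:
    """Flag rows whose FAIL column is set or whose VALUE reached THRESH."""
    return attr.fail != "-" or (attr.thresh > 0 and attr.value <= attr.thresh)


if __name__ == "__main__":
    row = "  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0"
    attr = parse_attr_line(row)
    print(attr.name, attr.value, attr.thresh, needs_attention(attr))
```

This only looks at the normalized columns; the raw values are vendor-specific and need separate interpretation.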

 dmesg info about the drive at connection time:
 ad4: 953869MB Seagate ST31000528AS CC46 at ata2-master UDMA100 SATA 
 3Gb/s
 
 dmesg info about one of the western digital drives:
 ad4: 953869MB WDC WD10EARS-00Y5B1 80.00A80 at ata2-master UDMA100 
 SATA 3Gb/s
 
 Before I scrap the drive I just wanted to see if anyone could either 
 say for sure its hardware, or if something else could possibly cause 
 this.  I don't suspect the controller, cable or dock as the problems 
 would likely occur with the western Digital drives as well if one of 
 them were involved.

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Disk Errors

2012-07-24 Thread jb
dweimer dweimer at dweimer.net writes:

 ... 
 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) 
 LBA=837397120
 ... 

There is a story about it:
http://linux-bsd-sharing.blogspot.com/2009/03/howto-fix-sata-dma-timeout-issues-on.html

But do not rush, read the comments as well:
...
Tony Schwartz said...

Thing is though, I have a secondary issue. This second issue is probably
what caused the first issue (DMA TIMEOUTS) to begin with. My disks keep
spinning down then up, every 20 seconds or so. I have no idea why this is 
happening, but it's not just one disk. I think that it was timing out because 
the disk goes to spin up and that takes too long. Any ideas here? I've used 
atacontrol and it's not configured to spindown. Thanks.
...

Benjamin said...

LoL, found the solution and feeling a little embarrassed by it. Good thing
I got a GURU in the forums to look at it.

It was just the power supply and my disk was spinning down cos the power
wasn't sufficient to run 6 HDs and 9 fans for cooling ha ha ha.
...

CyberRax said...

Just for information: while this hasn't been fixed as elegantly as in the
patch, FreeBSD has incorporated a solution for the problem since 8-STABLE
r199158: the ATA_REQUEST_TIMEOUT kernel option, which can be set higher
than the default of 5.
What is needed is adding options ATA_REQUEST_TIMEOUT=X (where X is the
timeout in seconds) to the kernel configuration file.
Changing the timeout requires rebuilding and installing the kernel, but
it's still better than nothing.

jb




Re: Disk Errors

2012-07-24 Thread dweimer

On 2012-07-24 12:50, Wojciech Puchar wrote:
Just curious, I am sure the likely issue is a bad disk, but I 
thought there


actually not that likely.

i had such problems, occurring randomly on many drives, and all
problems disappeared after changing computer, with the same disk.

BTW i would recommend you to turn on AHCI driver



That made me notice something interesting: the software mirror on the 
internal SATA disks that hold the operating system on this server is 
using the ahci driver, but the external SATA drive isn't. I am going to 
have to reboot tonight and check whether something is set in the 
controller's BIOS that keeps it from running AHCI.


Just an FYI, the server is running entirely on commodity PC hardware, 
as this is my home web server, though it's all well-known major brands. 
It is running FreeBSD 9.0-RELEASE-p3, upgraded a few times via source 
from an original install of 8.2 on this hardware.


--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Disk Errors

2012-07-24 Thread dweimer

On 2012-07-24 13:04, Dan Nelson wrote:

 In the last episode (Jul 24), dweimer said:
  I have three 1TB disks I use for backup, two of them are Western Digital
  drives I bought specifically for this purpose.  One is a Seagate drive
  that came out of a barebones PC that I replaced with a couple smaller
  drives in a stripe to gain performance.  I use the drives in an external
  SATA dock, using geom eli encryption, the western digital drives give me
  no problems, but the seagate drive gives me a lot of the following errors
  under load.

  ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
  ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
  ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296

 If you install the sysutils/smartmontools port, you can run smartctl -x
 /dev/ad4 to dump the drive's SMART attribute table and error logs.  Those
 should give you an indication of whether the drive is going bad.  If the
 drive is logging those write errors in its internal log, then you know it's
 not a cabling issue.  If it's not logging errors, I suppose you might have a
 loose SATA plug on the drive itself, which would explain why the problem
 follows the drive around.



Running a long test on the drive now, doesn't seem to show anything 
that sticks out at me as failing right now.


smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.0-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.12
Device Model:     ST31000528AS
Serial Number:    5VP7ST1C
LU WWN Device Id: 5 000c50 02f7a3bb4
Firmware Version: CC46
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Jul 24 14:29:08 2012 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM level is:     208 (intermediate), recommended: 208
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 248) Self-test routine in progress...
                                        80% of test remaining.
Total time to complete Offline
data collection:                (  600) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 173) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   117   099   006    -    145191418
  3 Spin_Up_Time            PO----   095   095   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    114
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate 

Re: Disk Errors

2012-07-24 Thread dweimer

On 2012-07-24 13:37, jb wrote:

dweimer dweimer at dweimer.net writes:


...
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request)
LBA=837397120
...


There is a story about it:

http://linux-bsd-sharing.blogspot.com/2009/03/howto-fix-sata-dma-timeout-issues-on.html

But do not rush, read the comments as well:
...
Tony Schwartz said...

Thing is though, I have a secondary issue. This second issue is probably
what caused the first issue (DMA TIMEOUTS) to begin with. My disks keep
spinning down then up, every 20 seconds or so. I have no idea why this is
happening, but it's not just one disk. I think that it was timing out because
the disk goes to spin up and that takes too long. Any ideas here? I've used
atacontrol and it's not configured to spindown. Thanks.
...

Benjamin said...

LoL, found the solution and feeling a little embarrassed by it. Good thing
I got a GURU in the forums to look at it.

It was just the power supply and my disk was spinning down cos the power
wasn't sufficient to run 6 HDs and 9 fans for cooling ha ha ha.
...



I wouldn't expect power to be the issue, as the external dock has its 
own power supply, and I would expect the problem to occur on the other 
drives as well. It's possible, though, that the Seagate drive requires 
more power than the Western Digital drives. I think I will look up the 
specs on that tonight, and do some searching on the eSATA dock to verify 
that there haven't been any problems with it and Seagate drives.



CyberRax said...

Just for information: while this hasn't been fixed as elegantly as in the
patch, FreeBSD has incorporated a solution for the problem since 8-STABLE
r199158: the ATA_REQUEST_TIMEOUT kernel option, which can be set higher
than the default of 5.
What is needed is adding options ATA_REQUEST_TIMEOUT=X (where X is the
timeout in seconds) to the kernel configuration file.
Changing the timeout requires rebuilding and installing the kernel, but
it's still better than nothing.

jb




--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Disk Errors

2012-07-24 Thread jb
dweimer dweimer at dweimer.net writes:

 ... 
 188 Command_Timeout         -O--CK   100   098   000    -    21475164202
 ...

I can not find it for Seagate;
http://sourceforge.net/apps/trac/smartmontools/wiki/AttributesSeagate

but for Western-Digital:
http://sourceforge.net/apps/trac/smartmontools/wiki/AttributesWestern-Digital
...
188 Command Time Out: A number of aborted operations due to HDD timeout.
Normally this attribute value should be equal to zero, and if you have values
far above zero, then most likely you have some serious problems with your power
supply or you have an oxidized data cable.
...
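
A side note on the size of that raw value: SMART raw values are 48-bit fields, and vendors sometimes pack several counters into one. The sketch below only does the arithmetic of splitting the value into three 16-bit words. That Seagate uses this packing for attribute 188, with the low word as the total timeout count, is commonly reported but is an assumption here, not something confirmed in this thread; `split_raw48` is an illustrative name.

```python
def split_raw48(raw: int):
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    if not 0 <= raw < 1 << 48:
        raise ValueError("raw value does not fit in 48 bits")
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


if __name__ == "__main__":
    # Raw value of attribute 188 as posted above.
    print(split_raw48(21475164202))  # (5, 5, 42)
```

Read that way, 21475164202 is 5, 5, and 42 rather than twenty-one billion of anything, since 5 * 2^32 + 5 * 2^16 + 42 = 21475164202.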

jb







Re: Disk Errors

2012-07-24 Thread Michael Powell
dweimer wrote:

[snip]
 
 SMART Attributes Data Structure revision number: 10
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
   1 Raw_Read_Error_Rate     POSR--   117   099   006    -    145191418
 [...]
   7 Seek_Error_Rate         POSR--   078   060   030    -    77590473
 [...]
 195 Hardware_ECC_Recovered  -O-RC-   025   023   000    -    145191418
 [...]
 241 Total_LBAs_Written      ------   100   253   000    -    1480696469
 242 Total_LBAs_Read         ------   100   253   000    -    922627427
[snip]

Really, most of the numbers don't look bad, but I'd cast a leery eye 
towards the way three of them correlate: Raw_Read_Error_Rate, 
Seek_Error_Rate, and Hardware_ECC_Recovered. Read errors from bad spots 
in the magnetic media are one thing, but notice how the drive is 
recovering data with its built-in ECC routines. Then notice that the 
seek error rate is moving along at a similar pace. There is a 
possibility that this is a purely mechanical weakness in the head 
positioning function, still mild enough that the drive can hide it 
through ECC.
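
One way to put numbers on how close these attributes are to tripping is to compare the normalized VALUE and WORST columns with THRESH, since an attribute is reported as failing once VALUE drops to THRESH or below. This is a rough sketch using only the figures quoted above; `headroom` and `report` are illustrative names, and the raw values are left out because their meaning is vendor-specific.

```python
# (name, VALUE, WORST, THRESH) as quoted in the table above
ATTRS = [
    ("Raw_Read_Error_Rate", 117, 99, 6),
    ("Seek_Error_Rate", 78, 60, 30),
    ("Hardware_ECC_Recovered", 25, 23, 0),
]


def headroom(value, worst, thresh):
    """Return (current margin, worst-ever margin) above the threshold."""
    return value - thresh, worst - thresh


def report(attrs):
    """Build a list of (name, margin now, margin at worst) rows."""
    rows = []
    for name, value, worst, thresh in attrs:
        now, low = headroom(value, worst, thresh)
        rows.append((name, now, low))
    return rows


if __name__ == "__main__":
    for name, now, low in report(ATTRS):
        print(f"{name:24s} margin now={now:3d} worst={low:3d}")
```

Seek_Error_Rate has the smallest historical margin of the three (30 points at its worst), which fits the attention it gets here.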

When I suspect media failure I generally use the manufacturer's diagnostic 
utility to scan for defective media. I haven't used many Seagates in a long 
time, so mostly this means WD's wddiags, which can be downloaded as a 
bootable CD .iso image. Seagate will have something similar. The quick scan 
is meant to be non-destructive, while the long scan usually is destructive. 
(I just had an old Raptor drive grow 5 bad spots recently, and the long scan 
fixed it without destroying any data, which was a first for me.)

As long as the remap space on the drive is not full, these diagnostics 
usually have a good chance of fixing bad spots. If it's an infrequent affair, 
then one may just continue to use the drive. If I see new bad sectors a week 
later, it is an indication that the drive has outlived its usefulness and I 
replace it. If it's another year before I get a small handful of bad spots, I 
may just let the diags fix it and continue to use the drive. That is, as long 
as the remap space is not full. Once that happens, any new bad spots are 
permanent and nothing can be done about them. Time to replace the drive.

The difference here is between bad spots developing in the media on the 
platter(s) and problems that actually stem from head seek positioning. 
None of the diags can do anything about head seek troubles; they can only 
identify whether the problem is related to the media on the platter(s).

-Mike




Re: Disk Errors

2012-07-24 Thread Warren Block

On Tue, 24 Jul 2012, dweimer wrote:

Just curious, I am sure the likely issue is a bad disk, but I thought there 
might be a chance this could be caused by possibly by something else.


I have three 1TB disks I use for backup, two of them are Western Digital 
drives I bought specifically for this purpose.  One is a Seagate drive that 
came out of a barebones PC that I replaced with a couple smaller drives in a 
stripe to gain performance.  I use the drives in an external SATA dock, using 
geom eli encryption, the western digital drives give me no problems, but the 
seagate drive gives me a lot of the following errors under load.


ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296

dmesg info about the drive at connection time:
ad4: 953869MB Seagate ST31000528AS CC46 at ata2-master UDMA100 SATA 3Gb/s


There are more than a few problem reports on the net concerning that 
drive, even on Seagate's own forums.  Both hardware problems and 
firmware problems.  Your later post says you have firmware version CC46, 
and Seagate has an update to CC49.  That's worth a try.

http://knowledge.seagate.com/articles/en_US/FAQ/213891en?language=en_US


Re: Disk Errors

2012-07-24 Thread dweimer

On 2012-07-24 16:10, Warren Block wrote:

On Tue, 24 Jul 2012, dweimer wrote:

Just curious, I am sure the likely issue is a bad disk, but I 
thought there might be a chance this could be caused by possibly by 
something else.


I have three 1TB disks I use for backup, two of them are Western 
Digital drives I bought specifically for this purpose.  One is a 
Seagate drive that came out of a barebones PC that I replaced with a 
couple smaller drives in a stripe to gain performance.  I use the 
drives in an external SATA dock, using geom eli encryption, the 
western digital drives give me no problems, but the seagate drive 
gives me a lot of the following errors under load.


ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296

dmesg info about the drive at connection time:
ad4: 953869MB Seagate ST31000528AS CC46 at ata2-master UDMA100 SATA 3Gb/s


There are more than a few problem reports on the net concerning that
drive, even on Seagate's own forums.  Both hardware problems and
firmware problems.  Your later post says you have firmware version
CC46, and Seagate has an update to CC49.  That's worth a try.

http://knowledge.seagate.com/articles/en_US/FAQ/213891en?language=en_US


Definitely going to try this firmware update, if only it would see the 
disk through the eSATA controller. Unfortunately, the controller presents 
it as a JBOD RAID instead of giving straight access to the disk. So this 
will have to wait until I put my puppy to bed for the night, as she keeps 
trying to eat the pillow from my bed while I am working on this.


--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Disk Errors

2012-07-24 Thread Adam Vande More
On Tue, Jul 24, 2012 at 11:40 AM, dweimer dwei...@dweimer.net wrote:

 Just curious, I am sure the likely issue is a bad disk, but I thought
 there might be a chance this could be caused by possibly by something else.

 I have three 1TB disks I use for backup, two of them are Western Digital
 drives I bought specifically for this purpose.  One is a Seagate drive that
 came out of a barebones PC that I replaced with a couple smaller drives in
 a stripe to gain performance.  I use the drives in an external SATA dock,
 using geom eli encryption, the western digital drives give me no problems,
 but the seagate drive gives me a lot of the following errors under load.

 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
 ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
 ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296


This type of problem was common on FreeBSD until the mid-8.x range.
Try upgrading your system to something a little more modern.

-- 
Adam Vande More


Re: Disk Errors

2012-07-24 Thread dweimer
 

On 2012-07-24 21:29, Adam Vande More wrote: 

 On Tue, Jul 24, 2012 at 11:40 AM, dweimer dwei...@dweimer.net wrote:

  Just curious, I am sure the likely issue is a bad disk, but I thought
  there might be a chance this could be caused by possibly by something else.

  I have three 1TB disks I use for backup, two of them are Western Digital
  drives I bought specifically for this purpose.  One is a Seagate drive that
  came out of a barebones PC that I replaced with a couple smaller drives in
  a stripe to gain performance.  I use the drives in an external SATA dock,
  using geom eli encryption, the western digital drives give me no problems,
  but the seagate drive gives me a lot of the following errors under load.

  ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=817755328
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=837397120
  ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=879786112
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882931200
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=890542016
  ad4: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=902767296
  ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=904071296

 This type of problem has been a consistent problem on FreeBSD until mid 8.x
 range.  Try upgrading your system to something a little more modern.

 --
 Adam Vande More

It's running 9.0-RELEASE-p3, updated from source from an original 
install of 8.2 on this hardware.

I have done the firmware update on the drive, so hopefully I will see an 
improvement in about 2 hours when tonight's backups kick off.

--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Disk Errors

2008-12-15 Thread Al Plant

Wojciech Puchar wrote:

ad2: FAILURE - READ_DMA timed out LBA=0
ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=1
ad2: TIMEOUT - READ_DMA retrying (0 retries left) LBA=1

The flash drive is detected with 3940272 sectors.  Is there a way to 
control the LBA= parameter?  Does it matter if I try?

no.



How can I control the number of retries?

I read that FreeBSD doesn't use the BIOS at least for CHS.  Does 
FreeBSD use the BIOS for PIO and UDMA modes?

no.

try disabling dma with

set hw.ata.ata_dma=0

bootloader command



Aloha,

Wojciech could be on the right track. I have recently had to do this on 
several different FreeBSD server boxes to stop these errors. Both 
current FreeBSD 7 and 8 have done this. Hardware didn't seem to matter.



~Al Plant - Honolulu, Hawaii -  Phone:  808-284-2740
  + http://hawaiidakine.com + http://freebsdinfo.org +
  + http://aloha50.net   - Supporting - FreeBSD 6.* - 7.* - 8.* +
   email: n...@hdk5.net 
All that's really worth doing is what we do for others.- Lewis Carrol



Disk Errors

2008-12-14 Thread Jason C. Wells
I am working on installing 6.4-RELEASE on a Motorola CPN5360 which is an 
industrial CompactPCI computer.  The system boots via PXE.  That much is 
good.  The host has two storage devices.


This is a 16MB boot flash device that is soldered to the board.

ad0: FAILURE - SETFEATURES SET TRANSFER MODE status=51<READY,DSC,ERROR> error=4<ABORTED>
ad0: 15MB SunDisk SDTB-128 vcb 1.45 at ata0-master BIOSPIO

This is a standard compact flash from Kingston. Many repetitive errors 
are snipped here.


ad2: 1923MB CF CARD 2GB Ver2.19K at ata1-master UDMA33
ad2: FAILURE - READ_DMA timed out LBA=3940269
ad2: FAILURE - READ_DMA timed out LBA=3940209
ad2: TIMEOUT - READ_DMA retrying (0 retries left) LBA=0
ad2: FAILURE - READ_DMA timed out LBA=0
ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=1
ad2: TIMEOUT - READ_DMA retrying (0 retries left) LBA=1

The flash drive is detected with 3940272 sectors.  Is there a way to 
control the LBA= parameter?  Does it matter if I try?
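
For reference, the LBA= value in these messages is the address of the sector being read when the request timed out. A small sketch of the arithmetic, assuming 512-byte sectors (the helper names are illustrative): the detected sector count matches the 1923MB that dmesg reports, and the failing reads near 3940269 sit just below the last addressable sector.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def size_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Capacity in whole MiB, which is what the dmesg 'MB' figure matches."""
    return sectors * sector_bytes // (1024 * 1024)


def last_lba(sectors):
    """Highest valid LBA on a device with the given sector count."""
    return sectors - 1


if __name__ == "__main__":
    cf_sectors = 3940272
    print(size_mib(cf_sectors))             # 1923
    print(last_lba(cf_sectors))             # 3940271
    print(last_lba(cf_sectors) - 3940269)   # 2
```

So the timeouts cluster at the very start (LBA 0 and 1) and the very end of the device. That pattern would be consistent with the system probing the first and last sectors when the device attaches, though nothing in this thread confirms it.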


How can I control the number of retries?

I read that FreeBSD doesn't use the BIOS at least for CHS.  Does FreeBSD 
use the BIOS for PIO and UDMA modes?


Any thoughts on how to get these storage devices working?

Thanks,
Jason



Re: Disk Errors

2008-12-14 Thread Wojciech Puchar

ad2: FAILURE - READ_DMA timed out LBA=0
ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=1
ad2: TIMEOUT - READ_DMA retrying (0 retries left) LBA=1

The flash drive is detected with 3940272 sectors.  Is there a way to control 
the LBA= parameter?  Does it matter if I try?

no.



How can I control the number of retries?

I read that FreeBSD doesn't use the BIOS at least for CHS.  Does FreeBSD use 
the BIOS for PIO and UDMA modes?

no.

Try disabling DMA by entering this at the boot loader prompt:

set hw.ata.ata_dma=0


Re: Disk errors on installing FreeBSD 7.0

2008-08-08 Thread Julien Cigar
Same problems for me with ATAPI CD/DVD drives (READ_BIG timeouts,
etc.). It works a bit better when DMA is turned off, but then
performance is very poor.

On Thu, 2008-08-07 at 14:17 -1000, Al Plant wrote:
 N.J. Thomas wrote:
  * Snorre D. Øverbø [EMAIL PROTECTED] [2008-08-07 15:29:11+]:
  When I boot up with the installation DVD these error messages appear
  on the screen.
 
  ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0055347
  ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0
  etc
  
  I got the same exact errors trying to install 7.0-RELEASE on two
  different Dell boxes. One was 4 years old, the other was brand new (3
  months ago).
  
  Never was able to fix the problem. For the older one, I plugged in an
  external DVD drive and installed via that. For the other one, I
  installed via a mini-install disk, and then did a minimal network
  install.
  
  For the record, they both had SATA drives and the disks worked (and
  still work) fine after the OS was installed. It was just copying the
  base system off the CD that was causing errors.
  
  Thomas
 
 888
   Aloha,
 
 I am getting the same errors as you guys, with an intermittent BIG_read 
 one occasionally. I've tried to install FreeBSD CURRENT 8 and 7 release.
 
 This is on a no-name box with a bio board and 1100 cpu. I've had this on 
 other boxes too. If I load the IDE drives on a box that works with them and 
 then put them in the box with errors, they work just fine.
 
 Everything gets recognized normally at install time, but the size of 
 the IDE drive, a Fujitsu 20 gig, shows twice what it should be every time.
 
 I don't know if this has anything to do with it, except that if you change 
 the size in the installer it won't load anything.
 
 Maybe one of the top-level gurus on the list can help.
 
 
 
-- 
Julien Cigar
Belgian Biodiversity Platform
http://www.biodiversity.be
Université Libre de Bruxelles (ULB)
Campus de la Plaine CP 257
Bâtiment NO, Bureau 4 N4 115C (Niveau 4)
Boulevard du Triomphe, entrée ULB 2
B-1050 Bruxelles
Mail: [EMAIL PROTECTED]
@biobel: http://biobel.biodiversity.be/person/show/471
Tel : 02 650 57 52



Re: Disk errors on installing FreeBSD 8.0 (solved)

2008-08-08 Thread Al Plant

Julien Cigar wrote:

Same problems for me with atapi CD/DVD drives (READ_BIG timeouts,
etc) .. it works a bit better when dma is turned off, but then
performances are very poor.

On Thu, 2008-08-07 at 14:17 -1000, Al Plant wrote:

N.J. Thomas wrote:

* Snorre D. Øverbø [EMAIL PROTECTED] [2008-08-07 15:29:11+]:

When I boot up with the installation DVD these error messages appear
on the screen.

ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0055347
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0
etc

I got the same exact errors trying to install 7.0-RELEASE on two
different Dell boxes. One was 4 years old, the other was brand new (3
months ago).

Never was able to fix the problem. For the older one, I plugged in an
external DVD drive and installed via that. For the other one, I
installed via a mini-install disk, and then did a minimal network
install.

For the record, they both had SATA drives and the disks worked (and
still work) fine after the OS was installed. It was just copying the
base system off the CD that was causing errors.

Thomas

888
  Aloha,

I am getting the same errors as you guys with an intermittient BIG_read 
one occasionally. I've tried to install FreeeBSD CURRENT 8 and 7 release.


This is on a no name box with a bio board and 1100 cpu. I've had this on 
other boxes too and load IDE drives on a box that works with them and 
then put them in the box with errors and they work just fine.


Every thing gets recognized normally at  install time, but the size of 
the IDE drive a Fujutsu 20 gig. shows twice what it should be every time.


Dont know if this has anything to do with it, except if you change the 
size in installer it wont load anything.


Maybe one of the top level gurus on the list can help.




Aloha,

The suggestion to add the following worked to clear my DMA error.

In: /boot/loader.conf
Put: hw.ata.ata_dma=0 #disable IDE DMA

This allowed an uninterrupted boot.
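
For anyone repeating this, here is a minimal sketch of making the setting persistent. It assumes the tunable lives in /boot/loader.conf, the file the loader reads at boot. The helper name set_loader_tunable is made up for illustration, and the file path is an argument so the function can be tried on a scratch copy first.

```sh
#!/bin/sh
# Hypothetical helper: add or update NAME="VALUE" in a loader.conf-style file.
# Usage: set_loader_tunable FILE NAME VALUE
set_loader_tunable() (
    file=$1
    name=$2
    value=$3
    touch "$file" || exit 1
    tmp="$file.tmp.$$"
    # Keep every line that does not already start with NAME= ...
    awk -v n="$name" 'index($0, n "=") != 1' "$file" > "$tmp"
    # ... then append the new assignment.
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
)

# Example (as root, on the real file):
#   set_loader_tunable /boot/loader.conf hw.ata.ata_dma 0
```

As noted earlier in the thread, turning DMA off usually costs a lot of throughput, so this is better treated as a workaround than as a fix.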

Thanks for the suggestion.

~Al Plant - Honolulu, Hawaii -  Phone:  808-284-2740
  + http://hawaiidakine.com + http://freebsdinfo.org +
  + http://aloha50.net   - Supporting - FreeBSD 6.* - 7.* - 8.* +
   email: [EMAIL PROTECTED] 
All that's really worth doing is what we do for others.- Lewis Carrol



Disk errors on installing FreeBSD 7.0

2008-08-07 Thread Snorre D. Øverbø
Hi all,

I'm trying to install FreeBSD 7.0 release on a box at home.

When I boot up with the installation DVD these error messages appear on
the screen.
[Written down by hand:]

ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
error=84<ICRC,ABORTED> LBA=0055347
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
error=84<ICRC,ABORTED> LBA=0
etc
etc



FreeBSD sees these hard disks during boot up:

ad0: 78533MB <Hitachi HDS728080PLAT20 PF20A21B> at ata0-master UDMA133
ad1: 32253MB <Samsung SP2014N VC100-33> at ata0-slave UDMA100 [This is
wrong; the disk has a capacity of about 185 GB]


So, when I go on with the installation program and try to write to disk
the new slices and labels in fdisk I get an error message saying
something like not able to write to the disk.

I don't think there is anything wrong with my hardware, because I have
just recently installed both Windows XP and Slackware Linux on the same
box without any problems.

Does anybody have any idea how to fix these problems?


regards,

Snorre D. Øverbø





Re: Disk errors on installing FreeBSD 7.0

2008-08-07 Thread larin

Snorre D. Øverbø wrote:

Hi all,

I'm trying to install FreeBSD 7.0 release on a box at home.

When I boot up with the installation DVD these error messages appear on
the screen.
[Written down by hand:]

ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
error=84<ICRC,ABORTED> LBA=0055347
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
error=84<ICRC,ABORTED> LBA=0
etc
etc



FreeBSD sees these hard disks during boot up:

ad0: 78533MB Hitachi HDS728080PLAT20 PF20A21B at ata0-master UDMA133
ad1: 32253MB Samsung SP2014N VC100-33 at ata0-slave UDMA100 [This is
wrong, the disk have a capacity of about 185 GB]


So, when I go on with the installation program and try to write to disk
the new slices and labels in fdisk I get an error message saying
something like not able to write to the disk.

I don't think there is something wrong with my hardware, because I just
have recently installed both Windows XP and Slackware Linux on the same
box, without any problems.

Does anybody have any idea to fix these problems?


regards,

Snorre D. Øverbø

  

   Try to replace the IDE ribbon cable.


Re: Disk errors on installing FreeBSD 7.0

2008-08-07 Thread Lokadamus

Snorre D. Øverbø wrote:

Hi all,

I'm trying to install FreeBSD 7.0 release on a box at home.

When I boot up with the installation DVD these error messages appear on
the screen.
[Written down by hand:]

ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
error=84<ICRC,ABORTED> LBA=0055347
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR>
error=84<ICRC,ABORTED> LBA=0
etc
etc



FreeBSD sees these hard disks during boot up:

ad0: 78533MB Hitachi HDS728080PLAT20 PF20A21B at ata0-master UDMA133
ad1: 32253MB Samsung SP2014N VC100-33 at ata0-slave UDMA100 [This is
wrong, the disk have a capacity of about 185 GB]


So, when I go on with the installation program and try to write to disk
the new slices and labels in fdisk I get an error message saying
something like not able to write to the disk.

I don't think there is something wrong with my hardware, because I just
have recently installed both Windows XP and Slackware Linux on the same
box, without any problems.

Does anybody have any idea to fix these problems?


regards,

Snorre D. Øverbø

  

Do you get the same error with FreeBSD 8.0 Current?
ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/

regards


Re: Disk errors on installing FreeBSD 7.0

2008-08-07 Thread N.J. Thomas
* Snorre D. Øverbø [EMAIL PROTECTED] [2008-08-07 15:29:11+]:
 When I boot up with the installation DVD these error messages appear
 on the screen.
 
 ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0055347
 ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0
 etc

I got the same exact errors trying to install 7.0-RELEASE on two
different Dell boxes. One was 4 years old, the other was brand new (3
months ago).

Never was able to fix the problem. For the older one, I plugged in an
external DVD drive and installed via that. For the other one, I
installed via a mini-install disk, and then did a minimal network
install.

For the record, they both had SATA drives and the disks worked (and
still work) fine after the OS was installed. It was just copying the
base system off the CD that was causing errors.

Thomas


Re: Disk errors on installing FreeBSD 7.0

2008-08-07 Thread Al Plant

N.J. Thomas wrote:

* Snorre D. Øverbø [EMAIL PROTECTED] [2008-08-07 15:29:11+]:

When I boot up with the installation DVD these error messages appear
on the screen.

ad1: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0055347
ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=0
etc


I got the same exact errors trying to install 7.0-RELEASE on two
different Dell boxes. One was 4 years old, the other was brand new (3
months ago).

Never was able to fix the problem. For the older one, I plugged in an
external DVD drive and installed via that. For the other one, I
installed via a mini-install disk, and then did a minimal network
install.

For the record, they both had SATA drives and the disks worked (and
still work) fine after the OS was installed. It was just copying the
base system off the CD that was causing errors.

Thomas


Aloha,

I am getting the same errors as you guys, with an intermittent BIG_read 
one occasionally. I've tried to install FreeBSD CURRENT 8 and 7 release.

This is on a no name box with a bio board and 1100 cpu. I've had this on 
other boxes too: I load the IDE drives on a box that works with them, 
then put them in the box with errors, and they work just fine.

Everything gets recognized normally at install time, but the size of 
the IDE drive, a Fujitsu 20 gig, shows twice what it should be every time.

Don't know if this has anything to do with it, except that if you change 
the size in the installer it won't load anything.

Maybe one of the top level gurus on the list can help.



--

~Al Plant - Honolulu, Hawaii -  Phone:  808-284-2740
  + http://hawaiidakine.com + http://freebsdinfo.org +
  + http://aloha50.net   - Supporting - FreeBSD 6.* - 7.* - 8.* +
   email: [EMAIL PROTECTED] 
All that's really worth doing is what we do for others.- Lewis Carrol



RE: Disk errors when copying

2007-09-10 Thread Ted Mittelstaedt


 -Original Message-
 From: Lars Eighner [mailto:[EMAIL PROTECTED]
 Sent: Sunday, September 09, 2007 11:17 AM
 To: Ted Mittelstaedt
 Cc: Richard Tobin; freebsd-questions@freebsd.org
 Subject: RE: Disk errors when copying


 On Fri, 7 Sep 2007, Ted Mittelstaedt wrote:

 
 
  Subject: Disk errors when copying
 
 
  When copy between disks (ad10 -> ad8), I get errors:
 
  ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
  LBA=435128800
  ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
  error=10<NID_NOT_FOUND> LBA=435128800
  g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5
 
  I don't get these errors just reading the data from ad10.  Is this
  some kind of system error rather than a bad disk?  Is it a
 known problem?
 
 
  Yes it is a known problem.  It does not happen with most combinations
  of drives and controllers.  You need to exhaustively document the
  motherboard/controller/hard disk and put it into a PR and file it
  so that the developer can add your combo into his database.  The more
  of these that are documented the quicker that a correlation is going
  to show up and get fixed.

 I wish I'd known that before I trashed my disc and spent a couple of weeks
 and hundreds of bucks building a new system.


One of the rules of thumb when you have hardware problems with a new
system (I'm assuming of course that these UDMA errors have been
happening since the system was built) is to search both the FreeBSD
questions mailing list archives, and the PR database - both closed and
open PRs.  Particularly closed PRs are a wealth of information because
so many of them are closed for lack of followup.

A typical scenario is someone will report a problem like you're having
and 3 months later the developer will make a change in the code and
then ask the reporter to test the change and see if it fixed the
problem.  By then the original reporter has gone on to something else
and won't respond.  The developer then closes the PR and assumes whatever
he did fixed the problem.

If you do find closed PRs that are the same problem and same hardware
as yours, definitely refer to their numbers in your PR.

Ted



RE: Disk errors when copying

2007-09-10 Thread Lars Eighner

On Sun, 9 Sep 2007, Ted Mittelstaedt wrote:


From: Lars Eighner [mailto:[EMAIL PROTECTED]



I wish I'd known that before I trashed my disc and spent a couple of weeks
and hundreds of bucks building a new system.



One of the rules of thumb when you have hardware problems with a new
system (I'm assuming of course that these UDMA errors have been
happening since the system was built) is to search both the FreeBSD
questions mailing list archives, and the PR database - both closed and
open PRs.  Particularly closed PRs are a wealth of information because
so many of them are closed for lack of followup.


I got the (disc) manufacturer's utilities (which run on a bootable
FreeDOS CD) and ran every test over and over.  It kept telling me
the disc was fine.  I should have believed it.

I always feel a little weird about discs because although the manufacturer
and the BIOS agree on the geometry, FreeBSD always (over three or four boxes
with a half-dozen different discs) tells me the geometry is wrong.  It seems
so confident about it, I generally let it do what it wants.  But what does
FreeBSD know about the disc that the manufacturer and the BIOS don't?


--
Lars Eighner
http://www.larseighner.com/index.html
8800 N IH35 APT 1191 AUSTIN TX 78753-5266



RE: Disk errors when copying

2007-09-10 Thread Richard Tobin
   ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
   LBA=435128800
   ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
   error=10<NID_NOT_FOUND> LBA=435128800
   g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

 One of the rules of thumb when you have hardware problems with a new
 system (I'm assuming of course that these UDMA errors have been
 happening since the system was built)

In my case it happened once and did not recur.  But looking at the SMART
log on the disk it appears that it might have happened before without
my noticing.  I was copying the disk before moving it to a different
machine, so I probably won't be able to test it further.

I'm sending a PR.

-- Richard


RE: Disk errors when copying

2007-09-10 Thread Ted Mittelstaedt

Geometry is meaningless in LBA mode.

The drive and BIOS mfgr agree on a convenient fiction to
reduce support calls.
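
To illustrate why the geometry is only a convention: the classic CHS-to-LBA mapping is plain arithmetic on whatever head and sector counts both sides agree to pretend exist. The sketch below uses the standard formula LBA = (C * heads + H) * sectors + (S - 1). The function name and the 16/63 and 255/63 geometries are illustrative assumptions, not values taken from any drive in this thread.

```sh
#!/bin/sh
# chs_to_lba CYL HEAD SECTOR HEADS_PER_CYL SECTORS_PER_TRACK
# SECTOR is 1-based, as in the CHS convention; the result is a 0-based LBA.
chs_to_lba() {
    echo $(( ($1 * $4 + $2) * $5 + ($3 - 1) ))
}

chs_to_lba 0 0 1 16 63     # first sector of the disk: prints 0
chs_to_lba 1 0 1 16 63     # first sector of cylinder 1: prints 1008
chs_to_lba 1 0 1 255 63    # same CHS under another fictional geometry: prints 16065
```

The same CHS triple lands on a different sector as soon as the assumed geometry changes, which is why two tools can disagree about geometry while the drive itself, addressed by LBA, is fine.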

Don't forget that running under FreeDOS you're running in
real mode, not protected mode.  In real mode the segmented
BIOS functions are actually used, and it could be that they are
even used for addressing the disk, and the disk controller
chipset emulates an MFM controller (essentially).

In the protected mode UNIX runs in, most of that BIOS code
is useless, the disk driver talks directly to the disk
controller chipset.  There is probably some undocumented
misbehavior that Microsoft got told about and so put it in
their disk driver code, but that the FreeBSD developers didn't
get told about.

Ted

 -Original Message-
 From: Lars Eighner [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 10, 2007 2:30 AM
 To: Ted Mittelstaedt
 Cc: freebsd-questions@freebsd.org
 Subject: RE: Disk errors when copying


 On Sun, 9 Sep 2007, Ted Mittelstaedt wrote:

  From: Lars Eighner [mailto:[EMAIL PROTECTED]

  I wish I'd known that before I trashed my disc and spent a
 couple of weeks
  and hundreds of bucks building a new system.
 
 
  One of the rules of thumb when you have hardware problems with a new
  system (I'm assuming of course that these UDMA errors have been
  happening since the system was built) is to search both the FreeBSD
  questions mailing list archives, and the PR database - both closed and
  open PRs.  Particularly closed PRs are a wealth of information because
  so many of them are closed for lack of followup.

 I got the (disc) manufacture's utilities (which run on a bootable
 FreeDOS CD) and ran every test over and over.  It kept telling me
 the disc was fine.  I should have believed.

 I always feel a little weird about discs because although the manufacture
 and the BIOS agree on the geometry, FreeBSD always (over three or
 four boxes
 with a half-dozen different discs) tells me the geometry is
 wrong.  It seems
 so confident about it, I generally let it do what it wants.  But what does
 FreeBSD know about the disc that the manufacture and the BIOS don't?


 --
 Lars Eighner
 http://www.larseighner.com/index.html
 8800 N IH35 APT 1191 AUSTIN TX 78753-5266





RE: Disk errors when copying

2007-09-09 Thread Lars Eighner

On Fri, 7 Sep 2007, Ted Mittelstaedt wrote:





Subject: Disk errors when copying


When copy between disks (ad10 -> ad8), I get errors:

ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
LBA=435128800
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
error=10<NID_NOT_FOUND> LBA=435128800
g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

I don't get these errors just reading the data from ad10.  Is this
some kind of system error rather than a bad disk?  Is it a known problem?



Yes it is a known problem.  It does not happen with most combinations
of drives and controllers.  You need to exhaustively document the
motherboard/controller/hard disk and put it into a PR and file it
so that the developer can add your combo into his database.  The more
of these that are documented the quicker that a correlation is going
to show up and get fixed.


I wish I'd known that before I trashed my disc and spent a couple of weeks
and hundreds of bucks building a new system.

--
Lars Eighner
http://www.larseighner.com/index.html
8800 N IH35 APT 1191 AUSTIN TX 78753-5266



RE: Disk errors when copying

2007-09-07 Thread Ted Mittelstaedt


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Richard Tobin
 Sent: Wednesday, September 05, 2007 3:03 PM
 To: freebsd-questions@freebsd.org
 Subject: Disk errors when copying
 
 
 When copy between disks (ad10 -> ad8), I get errors:
 
 ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request) 
 LBA=435128800
 ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> 
 error=10<NID_NOT_FOUND> LBA=435128800
 g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5
 
 I don't get these errors just reading the data from ad10.  Is this
 some kind of system error rather than a bad disk?  Is it a known problem?
 

Yes it is a known problem.  It does not happen with most combinations
of drives and controllers.  You need to exhaustively document the
motherboard/controller/hard disk and put it into a PR and file it
so that the developer can add your combo into his database.  The more
of these that are documented the quicker that a correlation is going
to show up and get fixed.

Ted


Re: Disk errors when copying

2007-09-06 Thread Ivan Voras

Richard Tobin wrote:

When copy between disks (ad10 -> ad8), I get errors:

ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=435128800
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> 
LBA=435128800
g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

I don't get these errors just reading the data from ad10.  Is this
some kind of system error rather than a bad disk?  Is it a known problem?


It doesn't match any recent known problem - it looks like a disk error. 
You might want to pinpoint the file which causes it and skip that file. 
Use sysutils/smartmontools to test and monitor the drive.
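
One way to pinpoint the file, sketched on the assumption that reading every file on the affected filesystem once is acceptable: read each file to /dev/null and print the ones that fail. The function names and the mount point in the example are placeholders, not anything from the original report.

```sh
#!/bin/sh
# Hypothetical helper: print every regular file under DIR that cannot be
# read in full. -xdev keeps find on the one filesystem being checked.
find_unreadable() {
    find "$1" -xdev -type f -exec sh -c '
        for f in "$@"; do
            cat "$f" > /dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}

# The byte offset in the g_vfs_done() line can be turned into a 512-byte
# sector number relative to the start of the partition (ad10s2g here).
offset_to_sector() {
    echo $(( $1 / 512 ))
}

offset_to_sector 175562145792   # prints 342894816

# Example (mount point is a placeholder):
#   find_unreadable /mnt/data
# Afterwards, smartctl -a /dev/ad10 shows the drive's SMART attributes
# and error log.
```

Files listed by find_unreadable are the ones to skip or restore from elsewhere; an empty result suggests the bad block is not currently inside any file's data.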






Disk errors when copying

2007-09-05 Thread Richard Tobin
When I copy between disks (ad10 -> ad8), I get errors:

ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=435128800
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> 
LBA=435128800
g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5

I don't get these errors just reading the data from ad10.  Is this
some kind of system error rather than a bad disk?  Is it a known problem?

(6.2 stable, SATA disks)

-- Richard



How To Monitor Disk Errors?

2005-11-22 Thread Drew Tomlinson
I have an old machine running 4.11.  It died sometime last night from 
what I think was a disk problem.  The machine was still running and 
still passing packets (it is my firewall) but I could not log in via the 
console, ssh, or telnet.  I powered the machine off/on and heard the 
click of death coming from one of the internal IDE drives.  By some 
miracle, the machine did finally boot and is running again.


I'm sure I'm on borrowed time here.  However, I would like to find some 
way to monitor drive errors so I know which drive is failing and can replace 
the correct drive.  I have two in the machine.  I've checked 
/var/log/messages but see no entries there regarding the drive.  Is 
there some utility that will let me see the current number of errors 
since boot?


Thanks,

Drew

--
Visit The Alchemist's Warehouse
Magic Tricks, DVDs, Videos, Books,  More!

http://www.alchemistswarehouse.com



Re: How To Monitor Disk Errors?

2005-11-22 Thread Nicolas Blais
On November 22, 2005 09:05 pm, Drew Tomlinson wrote:
 I have an old machine running 4.11.  It died sometime last night from
 what I think was a disk problem.  The machine was still running and
 still passing packets (it is my firewall) but I could not log in via the
 console, ssh, or telnet.  I powered the machine off/on and heard the
 click of death coming from one of the internal IDE drives.  By some
 miracle, the machine did finally boot and is running again.

 I'm sure I'm on borrowed time here.  However I would like to find some
 way to monitor drive errors so I know which drive is failing so replace
 the correct drive.  I have two in the machine.  I've checked
 /var/log/messages but see no entries there regarding the drive.  Is
 there some utility that will let me see the current number of errors
 since boot?

 Thanks,

 Drew

Check out /sysutils/smartmontools in the ports.  It could be what you need.
When installed, a simple smartctl -H /dev/*disk* like :
smartctl -H /dev/ad0 will tell you if your drive is healthy or not.  

You can also set up the smartd which will check your drives at certain 
intervals.
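
A small sketch of checking several drives in one go. It assumes smartctl prints its usual "SMART overall-health self-assessment test result: PASSED" line for a healthy ATA drive. The function names and the device list in the example are placeholders.

```sh
#!/bin/sh
# Succeeds if the "smartctl -H" output arriving on stdin reports PASSED.
smart_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Check each device named on the command line (run as root).
check_drives() {
    for dev in "$@"; do
        if smartctl -H "$dev" | smart_passed; then
            echo "$dev: healthy"
        else
            echo "$dev: FAILING, or not readable by smartctl"
        fi
    done
}

# Example:
#   check_drives /dev/ad0 /dev/ad1
```

A PASSED result only means the drive's own thresholds have not been crossed; a drive that clicks or logs timeouts deserves suspicion regardless.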

Nicolas
-- 
FreeBSD 7.0-CURRENT #1: Sat Nov 19 12:36:29 EST 2005 
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/CLK01A 
PGP? (updated 16 Nov 05) : http://www.clkroot.net/security/nb_root.asc




Re: How To Monitor Disk Errors?

2005-11-22 Thread Joao Barros
Try smartmontools: http://www.freshports.org/sysutils/smartmontools/

If you have IBM Deskstars see if there is newer firmware available.

On 11/23/05, Drew Tomlinson [EMAIL PROTECTED] wrote:
 I have an old machine running 4.11.  It died sometime last night from
 what I think was a disk problem.  The machine was still running and
 still passing packets (it is my firewall) but I could not log in via the
 console, ssh, or telnet.  I powered the machine off/on and heard the
 click of death coming from one of the internal IDE drives.  By some
 miracle, the machine did finally boot and is running again.

 I'm sure I'm on borrowed time here.  However I would like to find some
 way to monitor drive errors so I know which drive is failing so replace
 the correct drive.  I have two in the machine.  I've checked
 /var/log/messages but see no entries there regarding the drive.  Is
 there some utility that will let me see the current number of errors
 since boot?

 Thanks,

 Drew

 --
 Visit The Alchemist's Warehouse
 Magic Tricks, DVDs, Videos, Books,  More!

 http://www.alchemistswarehouse.com




--
Joao Barros


Re: How To Monitor Disk Errors?

2005-11-22 Thread Drew Tomlinson

On 11/22/2005 6:18 PM Nicolas Blais wrote:


On November 22, 2005 09:05 pm, Drew Tomlinson wrote:
 


I have an old machine running 4.11.  It died sometime last night from
what I think was a disk problem.  The machine was still running and
still passing packets (it is my firewall) but I could not log in via the
console, ssh, or telnet.  I powered the machine off/on and heard the
click of death coming from one of the internal IDE drives.  By some
miracle, the machine did finally boot and is running again.

I'm sure I'm on borrowed time here.  However I would like to find some
way to monitor drive errors so I know which drive is failing so replace
the correct drive.  I have two in the machine.  I've checked
/var/log/messages but see no entries there regarding the drive.  Is
there some utility that will let me see the current number of errors
since boot?

Thanks,

Drew
   



Check out /sysutils/smartmontools in the ports.  It could be what you need.
When installed, a simple smartctl -H /dev/*disk* like :
smartctl -H /dev/ad0 will tell you if your drive is healthy or not.  

You can also set up the smartd which will check your drives at certain 
intervals.


Nicolas
 




Thanks for your reply.  However it appears I need to be at FBSD ver 5.0 
or higher to use this tool.  Here's the output:


blacksheep# smartctl -H /dev/ad0
smartctl version 5.33 [i386-portbld-freebsd4.11] Copyright (C) 2002-4 
Bruce Allen

Home page is http://smartmontools.sourceforge.net/

ATA support is not provided for this kernel version. Please ugrade to a 
recent 5-CURRENT kernel (post 09/01/2003 or so)

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more 
'-T permissive' options.


Cheers,

Drew

--
Visit The Alchemist's Warehouse
Magic Tricks, DVDs, Videos, Books,  More!

http://www.alchemistswarehouse.com



Re: disk errors help!!

2005-10-07 Thread Danny Pansters
On Friday 7 October 2005 01:55, RYAN vAN GINNEKEN wrote:
 My freebsd 4.11 system has been subjected to a power failure and seems
 to have many disk errors something about soft updates and not being able
 to read certain sectors it comes up with the standard single user pick
 your shell command and tells me to run fsck manually.  I have run it
 several times but the system keeps comming with ad0s4 marked dirty which
 is my /usr partition.  

Doesn't it go all the way through the list? (Everything should be, if 
recoverable, in lost+found after you answered Y to everything that you 
think would be important.) If you have to reboot all the time at this 
stage already, uhm, you're doing badly.

 Sometimes i get a resetting device ata0 timeout 
 error.  

Most important info. Your drive is dying. Might be fast or slow but it is 
going. Switch to mayhem mode.

 What should i do of course i have no current backups for this 
 system and the most important thing is retrieving my users data it would
 be great if i could get the system to boot up but if i have to reinstall
 no biggy as long as i can get most of my data back.

Stop rebooting, save the drive or what's left of it, put it into another box 
and try to salvage your data from there. Get it out of that box. You never 
know if its solely a disk issue or perhaps a south bridge that cracked or 
what have you. Stop booting it.

 ps  i promise to do remote backups nightly for the rest of my life.

Oh well. Real men and all that :)

Just my NSHO,

Dan


disk errors help!!

2005-10-06 Thread RYAN vAN GINNEKEN
My FreeBSD 4.11 system was subjected to a power failure and seems to 
have many disk errors: something about soft updates and not being able 
to read certain sectors.  It comes up with the standard single-user 
"pick your shell" prompt and tells me to run fsck manually.  I have run 
it several times, but the system keeps coming up with ad0s4, which is 
my /usr partition, marked dirty.  Sometimes I get a "resetting device 
ata0" timeout error.

What should I do?  Of course I have no current backups for this system, 
and the most important thing is retrieving my users' data.  It would be 
great if I could get the system to boot up, but if I have to reinstall, 
no biggie, as long as I can get most of my data back.


ps  i promise to do remote backups nightly for the rest of my life.


--
Computer King/CaNMail

http://www.computerking.ca http://www.canmail.org

Sales, Service, and Hosting
Email, Data, and Web Packages
Ask about web design specials

Affiliates
http://www.computerking.ca/pages/links/affiliates/affiliates.htm

--

If you eat a live frog in the morning, nothing worse will happen to either of 
you for the rest of the day.



Re: disk errors help!!

2005-10-06 Thread Glenn Dawson

At 06:55 PM 10/6/2005, RYAN vAN GINNEKEN wrote:
> My FreeBSD 4.11 system suffered a power failure and now seems to have
> many disk errors: something about soft updates and not being able to
> read certain sectors.  It drops to the standard single-user "pick your
> shell" prompt and tells me to run fsck manually.  I have run it several
> times, but the system keeps coming up with ad0s4, which is my /usr
> partition, marked dirty.  Sometimes I get a "resetting device ata0"
> timeout error.  What should I do?  Of course I have no current backups
> for this system, and the most important thing is retrieving my users'
> data.  It would be great if I could get the system to boot up, but if I
> have to reinstall, no biggie, as long as I can get most of my data back.
>
> P.S.  I promise to do remote backups nightly for the rest of my life.


Sounds like you definitely have some problems.

If you have somewhere to put it, you can use dd with 
conv=noerror,sync to get an image of the drive, less the damaged 
areas (which get filled in with NULs by the sync option).  Once you 
have the image, you can use it as a backing store for md(4), fsck 
that, mount it, and get whatever you can from what's left.
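
What conv=noerror,sync does to the image can be illustrated in Python. This is a minimal sketch and not a replacement for dd: `rescue_copy` and `FlakyDisk` are hypothetical names introduced here, the 512-byte block size is an assumption, and the caller is assumed to know the source size. The copy carries on after a read error (the noerror part) and pads unreadable or short blocks with NUL bytes (the sync part), so offsets in the image still line up with offsets on the disk.

```python
import io


def rescue_copy(src, dst, total_bytes, block_size=512):
    """Copy src to dst block by block; unreadable blocks become NULs.

    Returns the indexes of the blocks that could not be read.
    """
    bad_blocks = []
    block_count = (total_bytes + block_size - 1) // block_size
    for index in range(block_count):
        try:
            src.seek(index * block_size)
            data = src.read(block_size)
        except OSError:
            # "noerror": note the failure and keep going.
            data = b""
            bad_blocks.append(index)
        # "sync": pad short or missing input to a full block of NULs.
        dst.write(data.ljust(block_size, b"\0"))
    return bad_blocks


class FlakyDisk(io.BytesIO):
    """Hypothetical stand-in for a failing drive (illustration only)."""

    def __init__(self, data, bad_blocks, block_size=512):
        super().__init__(data)
        self.bad_blocks = set(bad_blocks)
        self.block_size = block_size

    def read(self, size=-1):
        # Raise an I/O error whenever a read starts inside a bad block.
        if self.tell() // self.block_size in self.bad_blocks:
            raise OSError(5, "Input/output error")
        return super().read(size)


# Three blocks of data; the middle one is unreadable, the last is short.
disk = FlakyDisk(b"A" * 512 + b"B" * 512 + b"C" * 100, bad_blocks=[1])
image = io.BytesIO()
bad = rescue_copy(disk, image, total_bytes=1124)
print("unreadable blocks:", bad)
print("image size:", len(image.getvalue()))
```

The image comes out as a whole number of blocks with the damaged region zeroed, which is what lets fsck work on the copy instead of the failing drive.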


-Glenn






Re: Disk errors

2005-10-03 Thread Joe S

Mike Jeays wrote:
> I am getting one or two of these a day on a Western Digital 80GB disk.


Most drive manufacturers provide diagnostic tools for the drives they 
produce. In this case, Western Digital provides a bootable diagnostic tool:


* Data Lifeguard Diagnostic for DOS (CD)
  http://support.wdc.com/download/index.asp?swid=30

Burn the ISO to a CD and boot up the system with the CD.
Run the diagnostics.

It's best to use the tool provided by the manufacturer of the drive you 
have. UltimateBootCD includes most popular drive tools.




Re: Disk errors

2005-10-03 Thread Mike Jeays
On Mon, 2005-10-03 at 13:48, Joe S wrote:
> Mike Jeays wrote:
> > I am getting one or two of these a day on a Western Digital 80GB disk.
>
> Most drive manufacturers provide diagnostic tools for the drives they
> produce. In this case, Western Digital provides a bootable diagnostic tool:
>
> * Data Lifeguard Diagnostic for DOS (CD)
>   http://support.wdc.com/download/index.asp?swid=30
>
> Burn the ISO to a CD and boot up the system with the CD.
> Run the diagnostics.
>
> It's best to use the tool provided by the manufacturer of the drive you
> have. UltimateBootCD includes most popular drive tools.

Thanks for the suggestion.  I downloaded the ISO image, but it won't
boot properly on my machine; it just says something about Caldera
DR-DOS, and the screen goes blank.  Maybe SCO has had its finger in the
pie...



Re[2]: Disk errors

2005-10-01 Thread Gerard Seibert
On Fri, 30 Sep 2005 20:08:15 -0400, Mike Jeays [EMAIL PROTECTED]
Subject: Re: Disk errors
Wrote these words of wisdom:

> On Fri, 2005-09-30 at 16:48, Gerard Seibert wrote:
> > On Fri, 30 Sep 2005 16:31:58 -0400, Mike Jeays [EMAIL PROTECTED]
> > Subject: Disk errors
> > Wrote these words of wisdom:
> >
> > > I am getting one or two of these a day on a Western Digital 80GB disk.
> > > How concerned should I be?  The machine seems reliable otherwise.
> > >
> > > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
> > > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61314367
> > > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=7139615
> >
> > * REPLY SEPARATOR *
> > On 9/30/2005 4:43:22 PM, Gerard Seibert Replied:
> >
> > I am not sure what free disk diagnostic programs are available, but I
> > have used Steve Gibson's 'SpinRite'  http://grc.com  with great
> > success in the past. If there is something wrong with the disk or
> > controller, it will find it. Just run it at the highest level, level 5
> > I believe.
>
> Thanks for the suggestion.  I didn't fancy spending $89 US on software
> to test a single 80GB disk, because I can buy a new one for about the
> same price.
>
> I googled for free tools, and found DFT by Hitachi.  I downloaded the
> bootable CD version, and tested my disk with it.  DFT didn't report any
> errors, and so I will carry on using the disk, and make sure I have good
> backups.  The tool was easy to download and use, but I don't have any
> evidence about how good it is at finding errors - so far.
>
> DFT can be found at:
> http://www.hgst.com/hdd/support/download.htm

* REPLY SEPARATOR *
On 10/1/2005 6:07:18 AM, Gerard Seibert Replied:

Yes, it has gone up in price. I remember when version one first appeared.
It was only $19, if memory serves me correctly. I had not realized that
it had increased in price so dramatically, since as a registered owner
of a previous version I receive a discount on updates.

I have no knowledge regarding DFT.

BTW, does your drive support 'SMART', and is it enabled?

HTH


Disk errors

2005-09-30 Thread Mike Jeays
I am getting one or two of these a day on a Western Digital 80GB disk.
How concerned should I be?  The machine seems reliable otherwise.

ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61314367
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=7139615
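
When judging messages like these, it can help to tally them by device and command and to see whether the LBAs cluster. ICRC here refers to a CRC error on the UDMA transfer between drive and controller, which often (though not always) points at the cable or connector instead of the disk surface. The following Python sketch parses lines of the form shown above. The regular expression and the name `parse_ata_messages` are assumptions made for illustration and only cover this one message layout.

```python
import re
from collections import Counter

# Matches lines such as:
#   ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
ATA_LINE = re.compile(
    r"^(?P<dev>ad\d+): (?P<kind>[A-Z]+) - (?P<cmd>\S+) "
    r"(?P<detail>.*?)\s*LBA=(?P<lba>\d+)$"
)


def parse_ata_messages(text):
    """Return one dict per recognised ATA message in text."""
    events = []
    for line in text.splitlines():
        match = ATA_LINE.match(line.strip())
        if match:
            events.append({
                "dev": match.group("dev"),
                "kind": match.group("kind"),
                "cmd": match.group("cmd"),
                "icrc": "ICRC" in match.group("detail"),
                "lba": int(match.group("lba")),
            })
    return events


log = """\
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61314367
ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=7139615
"""

events = parse_ata_messages(log)
per_device = Counter((e["dev"], e["cmd"], e["icrc"]) for e in events)
for (dev, cmd, icrc), count in per_device.items():
    print(dev, cmd, "ICRC" if icrc else "other", count)
print("LBAs:", sorted(e["lba"] for e in events))
```

Widely scattered LBAs, as in this sample, would be consistent with a transfer problem more than with one damaged region, but three lines are too few to conclude anything firmly.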




Re: Disk errors

2005-09-30 Thread Gerard Seibert
On Fri, 30 Sep 2005 16:31:58 -0400, Mike Jeays [EMAIL PROTECTED]
Subject: Disk errors
Wrote these words of wisdom:

> I am getting one or two of these a day on a Western Digital 80GB disk.
> How concerned should I be?  The machine seems reliable otherwise.
>
> ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
> ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61314367
> ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=7139615

* REPLY SEPARATOR *
On 9/30/2005 4:43:22 PM, Gerard Seibert Replied:

I am not sure what free disk diagnostic programs are available, but I
have used Steve Gibson's 'SpinRite'  http://grc.com  with great
success in the past. If there is something wrong with the disk or
controller, it will find it. Just run it at the highest level, level 5
I believe.


-- 
Gerard Seibert
[EMAIL PROTECTED]


Re: Disk errors

2005-09-30 Thread Mike Jeays
On Fri, 2005-09-30 at 16:48, Gerard Seibert wrote:
> On Fri, 30 Sep 2005 16:31:58 -0400, Mike Jeays [EMAIL PROTECTED]
> Subject: Disk errors
> Wrote these words of wisdom:
>
> > I am getting one or two of these a day on a Western Digital 80GB disk.
> > How concerned should I be?  The machine seems reliable otherwise.
> >
> > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
> > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61314367
> > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=7139615
>
> * REPLY SEPARATOR *
> On 9/30/2005 4:43:22 PM, Gerard Seibert Replied:
>
> I am not sure what free disk diagnostic programs are available, but I
> have used Steve Gibson's 'SpinRite'  http://grc.com  with great
> success in the past. If there is something wrong with the disk or
> controller, it will find it. Just run it at the highest level, level 5
> I believe.

Thanks for the suggestion.  I didn't fancy spending $89 US on software
to test a single 80GB disk, because I can buy a new one for about the
same price.

I googled for free tools, and found DFT by Hitachi.  I downloaded the
bootable CD version, and tested my disk with it.  DFT didn't report any
errors, and so I will carry on using the disk, and make sure I have good
backups.  The tool was easy to download and use, but I don't have any
evidence about how good it is at finding errors - so far.

DFT can be found at:
http://www.hgst.com/hdd/support/download.htm



Re: Disk errors

2005-09-30 Thread Micah



Mike Jeays wrote:
> On Fri, 2005-09-30 at 16:48, Gerard Seibert wrote:
> > On Fri, 30 Sep 2005 16:31:58 -0400, Mike Jeays [EMAIL PROTECTED]
> > Subject: Disk errors
> > Wrote these words of wisdom:
> >
> > > I am getting one or two of these a day on a Western Digital 80GB disk.
> > > How concerned should I be?  The machine seems reliable otherwise.
> > >
> > > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61092255
> > > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=61314367
> > > ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=7139615
> >
> > * REPLY SEPARATOR *
> > On 9/30/2005 4:43:22 PM, Gerard Seibert Replied:
> >
> > I am not sure what free disk diagnostic programs are available, but I
> > have used Steve Gibson's 'SpinRite'  http://grc.com  with great
> > success in the past. If there is something wrong with the disk or
> > controller, it will find it. Just run it at the highest level, level 5
> > I believe.
>
> Thanks for the suggestion.  I didn't fancy spending $89 US on software
> to test a single 80GB disk, because I can buy a new one for about the
> same price.
>
> I googled for free tools, and found DFT by Hitachi.  I downloaded the
> bootable CD version, and tested my disk with it.  DFT didn't report any
> errors, and so I will carry on using the disk, and make sure I have good
> backups.  The tool was easy to download and use, but I don't have any
> evidence about how good it is at finding errors - so far.
>
> DFT can be found at:
> http://www.hgst.com/hdd/support/download.htm



For a good collection of free tools that includes several hard disk 
diagnostic apps, try the Ultimate Boot CD: 
http://www.ultimatebootcd.com/ It's come in handy once or twice.


Later,
Micah


Re: Any way to lock down disk errors?

2003-11-12 Thread Lowell Gilbert
Jim Hatfield [EMAIL PROTECTED] writes:

> Strictly speaking OT but the machine is running FreeBSD.
>
> While copying a file I got I/O errors. The console shows:
>
> ad0: hard error cmd=read fsbn 31891359 of 31891359-31891486 status=59 error=40
> ad0: hard error cmd=read fsbn 31891231 of 31891231-31891486 status=59 error=40
>
> Given that the disk is just under three months old, is it worth doing
> anything other than getting it replaced? I have no other disk big
> enough to hold the data on it so unless the supplier sends me a
> replacement ahead of me returning the faulty one it will be a pain.
>
> I have enough space to empty the partition with the error in, but I
> couldn't find anything in newfs or fsck which would let me map out
> selected blocks or to do a full write test of each block and map out
> bad ones. Is there such a beast?

Unfortunately, mapping out blocks yourself doesn't really do any good
any more.  Disks remap bad sectors internally before even reporting
errors back to you.  If you're getting a lot of problems, it's possible
(but rare) that a manufacturer's maintenance tool will straighten out
the trouble, but even then you'd need to back up everything off of it
first...

If you want to try badsect(8), you can, but it isn't for the
faint-hearted.


Any way to lock down disk errors?

2003-11-11 Thread Jim Hatfield
Strictly speaking OT but the machine is running FreeBSD.

While copying a file I got I/O errors. The console shows:

ad0: hard error cmd=read fsbn 31891359 of 31891359-31891486 status=59 error=40
ad0: hard error cmd=read fsbn 31891231 of 31891231-31891486 status=59 error=40

Given that the disk is just under three months old, is it worth doing
anything other than getting it replaced? I have no other disk big
enough to hold the data on it so unless the supplier sends me a
replacement ahead of me returning the faulty one it will be a pain.

I have enough space to empty the partition with the error in, but I
couldn't find anything in newfs or fsck which would let me map out
selected blocks or to do a full write test of each block and map out
bad ones. Is there such a beast?
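
The status= and error= fields in the console lines above can be decoded bit by bit. The sketch below assumes that the driver prints both values in hexadecimal and that they are the raw ATA status and error registers. The bit names follow common ATA register naming, and `decode_bits` and `parse_hard_error` are hypothetical helpers written for this illustration. Under those assumptions, error=40 corresponds to UNC, an uncorrectable data error reported by the drive itself.

```python
import re

# ATA status and error register bits, highest bit first.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
               (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR")]
ERROR_BITS = [(0x80, "ICRC/BBK"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
              (0x08, "MCR"), (0x04, "ABRT"), (0x02, "TK0NF"), (0x01, "AMNF")]

HARD_ERROR = re.compile(
    r"(?P<dev>ad\d+): hard error cmd=(?P<cmd>\w+) fsbn (?P<fsbn>\d+) "
    r"of (?P<first>\d+)-(?P<last>\d+) "
    r"status=(?P<status>[0-9a-f]+) error=(?P<error>[0-9a-f]+)"
)


def decode_bits(value, table):
    """Return the names of all bits in table that are set in value."""
    return [name for mask, name in table if value & mask]


def parse_hard_error(line):
    """Parse one 'hard error' console line; return None if it does not match."""
    match = HARD_ERROR.search(line)
    if match is None:
        return None
    first, last = int(match.group("first")), int(match.group("last"))
    return {
        "dev": match.group("dev"),
        "cmd": match.group("cmd"),
        "fsbn": int(match.group("fsbn")),
        "blocks_in_request": last - first + 1,
        # Assumption: both register values are printed in hexadecimal.
        "status": decode_bits(int(match.group("status"), 16), STATUS_BITS),
        "error": decode_bits(int(match.group("error"), 16), ERROR_BITS),
    }


line = ("ad0: hard error cmd=read fsbn 31891359 of 31891359-31891486 "
        "status=59 error=40")
info = parse_hard_error(line)
print(info["dev"], info["cmd"], "failed at block", info["fsbn"])
print("request covered", info["blocks_in_request"], "blocks")
print("status bits:", info["status"])
print("error bits:", info["error"])
```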




swap_pager: indefinite wait buffer but no disk errors

2003-01-07 Thread Bruce Campbell

This document:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#INDEFINITE-WAIT-BUFFER

includes:

5.30. What does the error ``swap_pager: indefinite wait buffer:'' mean?

This means that a process is trying to page memory to disk, and the page 
attempt has hung trying to access the disk for more than 20 seconds. It might 
be caused by bad blocks on the disk drive, disk wiring, cables, or any other 
disk I/O-related hardware. If the drive itself is actually bad, you will also 
see disk errors in /var/log/messages and in the output of dmesg. Otherwise, 
check your cables and connections.

I am seeing occasional "swap_pager: indefinite wait buffer" messages on 4
systems under a heavy simultaneous sequential I/O test on ATA devices ad0
and ad1 (on the same channel).  Swap is on ad0.  ad1 is mounted as /test
and is filled to capacity and read back repeatedly.  ad0 is filled to
maybe 50% capacity and read back repeatedly.

System hardware is:

http://www.freebsd.uwaterloo.ca/twiki/bin/view/Freebsd/IntelP4

o/s is:

FreeBSD ecserv1.uwaterloo.ca 4.7-RELEASE FreeBSD 4.7-RELEASE #0: Wed Oct  9 15:08:34 GMT 2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386

I am not using tagged queueing.  Both disks are reported as UDMA33.

Logs are below.  There are no disk errors.  During the test, ad0 and ad1
are each reading or writing around 25 megabytes per second.  Load
average is around 0.25, but system is very slow to log in to, or
to respond to keyboard, during the test.

Same test on 4 dual processor AMD systems (with the same disks) does not
yield this particular problem. 

Same test with just one disk under test does not yield this problem.

Under normal type usage, the problem never happens.  I'm just reporting
this to indicate that there appears to be some other cause than disk
errors for this problem.
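
To see how often the message occurs and which swap blocks are involved, the syslog entries can be tallied with a short script. This Python sketch is an illustration only: the pattern and the name `parse_swap_waits` are assumptions, and the regular expression tolerates the line wrap after "device:" that mail clients introduce. "last message repeated N times" lines are not counted.

```python
import re
from collections import Counter

SWAP_WAIT = re.compile(
    r"^(?P<when>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) /kernel: "
    r"swap_pager: indefinite wait buffer: device:\s*(?P<device>[^,\s]+), "
    r"blkno: (?P<blkno>\d+), size: (?P<size>\d+)",
    re.MULTILINE,
)


def parse_swap_waits(text):
    """Return one dict per 'indefinite wait buffer' message found in text."""
    return [
        {
            "when": m.group("when"),
            "host": m.group("host"),
            "device": m.group("device"),
            "blkno": int(m.group("blkno")),
            "size": int(m.group("size")),
        }
        for m in SWAP_WAIT.finditer(text)
    ]


# Sample entries in the wrapped form in which they appear in mail.
log = """\
Jan  5 21:33:07 ecserv2 /kernel: swap_pager: indefinite wait buffer: device:
#ad/0x20001, blkno: 640, size: 4096
Jan  5 21:33:37 ecserv2 /kernel: swap_pager: indefinite wait buffer: device:
#ad/0x20001, blkno: 640, size: 4096
Jan  5 22:45:21 ecserv2 /kernel: swap_pager: indefinite wait buffer: device:
#ad/0x20001, blkno: 432, size: 4096
Jan  6 02:40:37 ecserv2 /kernel: pid 9894 (file1), uid 0 on /test: file system
full
"""

waits = parse_swap_waits(log)
by_block = Counter(w["blkno"] for w in waits)
print(len(waits), "swap_pager waits on", {w["host"] for w in waits})
for blkno, count in by_block.most_common():
    print("blkno", blkno, "seen", count, "times")
```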

Logs of 4 systems are:

Jan  5 00:00:00 ecserv2 newsyslog[1784]: logfile turned over
Jan  5 21:33:07 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 640, size: 4096
Jan  5 21:33:37 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 640, size: 4096
Jan  5 22:45:21 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  5 22:46:18 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  5 22:46:18 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  6 00:00:00 ecserv2 newsyslog[9854]: logfile turned over
Jan  6 00:00:00 ecserv2 newsyslog[9854]: logfile turned over
Jan  6 01:50:20 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 312, size: 4096
Jan  6 01:51:18 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 312, size: 4096
Jan  6 02:40:37 ecserv2 /kernel: pid 9894 (file1), uid 0 on /test: file system 
full
Jan  6 07:56:50 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 264, size: 4096
Jan  6 07:57:02 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 272, size: 4096
Jan  6 08:56:55 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 328, size: 4096
Jan  6 09:40:33 ecserv2 /kernel: pid 10624 (file1), uid 0 on /test: file system 
full
Jan  6 10:46:50 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  6 10:46:50 ecserv2 last message repeated 4 times
Jan  6 12:55:21 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  6 12:56:40 ecserv2 last message repeated 3 times
Jan  6 16:39:44 ecserv2 /kernel: pid 2 (file1), uid 0 on /test: file system 
full
Jan  6 19:45:20 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  6 20:56:51 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 288, size: 4096
Jan  6 20:58:15 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 312, size: 4096
Jan  6 20:58:45 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 312, size: 4096
Jan  6 20:58:45 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 328, size: 4096
Jan  6 22:56:51 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 344, size: 4096
Jan  6 23:05:40 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 432, size: 4096
Jan  6 23:38:43 ecserv2 /kernel: pid 11570 (file1), uid 0 on /test: file system 
full
Jan  7 00:00:00 ecserv2 newsyslog[11731]: logfile turned over
Jan  7 00:00:00 ecserv2 newsyslog[11731]: logfile turned over
Jan  7 01:46:23 ecserv2 /kernel: swap_pager: indefinite wait buffer: device: 
#ad/0x20001, blkno: 528, size: 4096
Jan  7 02:56:51 ecserv2 /kernel: swap_pager: indefinite wait buffer: device