Re: [CentOS] Kernel Errors Present...

2011-02-01 Thread Keith Roberts
On Wed, 12 Jan 2011, John R Pierce wrote:

 To: centos@centos.org
 From: John R Pierce pie...@hogranch.com
 Subject: Re: [CentOS] Kernel Errors Present...


 In the BIOS I turn DMA off for /dev/hda and /dev/hdc,
 but they still show up in /proc/ide/.../settings as
 using_dma 1.


 say HUH? IDE PIO modes are like 3-7 MBytes/sec and require 100% CPU
 utilization during the transfer phase. why in dogs name would you be
 doing this in 2011 ?

...snip...

 January 12, 2011 06:26PM
 Use the modern, 80 wire cables, and trust the technology - 
 it's come a long way.

Thanks for all the replies concerning this.

I have bought an off-the-shelf 24-inch round ATA 133 IDE cable, 
and installed that in place of the 40-wire cable.

The problem now appears to be fixed. Been keeping an eye on 
my logwatch emails, and they are no longer reporting this 
problem.

Kind Regards,

Keith Roberts

-
Websites:
http://www.karsites.net
http://www.php-debuggers.net
http://www.raised-from-the-dead.org.uk

All email addresses are challenge-response protected with
TMDA [http://tmda.net]
-
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Kernel Errors Present

2011-01-17 Thread Shade.GE
Same problem here: the hard drive (2.5-inch Samsung HM121HC) running with
kernel 2.6.18-194.32.1.el5 (x86_64) produces errors under high load.
With the previous kernel the errors are gone. I have already replaced the
hard drive with a new one; same errors on the newest kernel.

dmesg output:

hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
attempt to access beyond end of device
hdc3: rw=0, want=25863980832, limit=225841770
attempt to access beyond end of device
hdc3: rw=0, want=7830939224, limit=225841770
attempt to access beyond end of device
hdc3: rw=0, want=31645262224, limit=225841770
attempt to access beyond end of device
hdc3: rw=0, want=25863980832, limit=225841770
attempt to access beyond end of device
hdc3: rw=0, want=25863980832, limit=225841770
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command

There are no errors logged in SMART; I already ran smartctl -t long
with no errors. I also ran a block test on this drive.
The next step is to change the cables, but I don't think that will be
the solution; I think it's a kernel IDE/DMA problem.
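
As a rough sanity check on the "attempt to access beyond end of device"
lines above, the requested sector ("want") can be compared with the
partition size ("limit"). The sketch below only does that arithmetic;
the function name overshoot_factor is hypothetical, and the result does
not by itself say whether the kernel, the cable, or the drive is at fault.

```shell
#!/bin/sh
# Sketch: compare the "want" and "limit" sector numbers from an
# "attempt to access beyond end of device" kernel log line.
# overshoot_factor is a hypothetical helper name.
overshoot_factor() {
    # $1: a log line such as "hdc3: rw=0, want=..., limit=..."
    want=$(printf '%s\n' "$1" | sed -n 's/.*want=\([0-9][0-9]*\).*/\1/p')
    limit=$(printf '%s\n' "$1" | sed -n 's/.*limit=\([0-9][0-9]*\).*/\1/p')
    # Bail out if either number is missing or the limit is zero.
    [ -n "$want" ] && [ -n "$limit" ] && [ "$limit" -gt 0 ] || return 1
    # Integer division: how many times larger the request is than the partition.
    echo $(( want / limit ))
}

overshoot_factor "hdc3: rw=0, want=25863980832, limit=225841770"
```

A request that is many times larger than the whole partition suggests
the block numbers themselves are bogus, rather than the partition being
slightly too small.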

Wolfgang

On 15.01.11 14:50, Ryan Wagoner wrote:
 On Sat, Jan 15, 2011 at 7:57 AM, Keith Roberts ke...@karsites.net wrote:
 I hope to be be getting some custom made 80 wire UDMA IDE
 cables sorted ASAP. That should squeeze extra speed from all
 the drives on the machine.
 You shouldn't need custom cables. IDE 80 pin cables can be sourced all
 over the Internet for around $5 a cable. Prices have gone up since
 they are now not common. You might even post a wanted ad on craigslist
 and see if you can get a handful for a few bucks.

 Ryan


Re: [CentOS] Kernel Errors Present

2011-01-17 Thread Keith Roberts
On Mon, 17 Jan 2011, Shade.GE wrote:

 To: CentOS mailing list centos@centos.org
 From: Shade.GE shade...@gmail.com
 Subject: Re: [CentOS] Kernel Errors Present
 
 Same Problem here, the harddrive (2.5 Samsung HM121HC) running with
 Kernel 2.6.18-194.32.1.el5 (x86_64) produces errors on high load.
 With one step back kernel the errors are gone. Im already changed the
 harddrive with a new one, same errors on the newest kernel.

 dmesg output:

 hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
 ide: failed opcode was: unknown
 hdc: drive not ready for command
 attempt to access beyond end of device
 hdc3: rw=0, want=25863980832, limit=225841770
 attempt to access beyond end of device
 hdc3: rw=0, want=7830939224, limit=225841770
 attempt to access beyond end of device
 hdc3: rw=0, want=31645262224, limit=225841770
 attempt to access beyond end of device
 hdc3: rw=0, want=25863980832, limit=225841770
 attempt to access beyond end of device
 hdc3: rw=0, want=25863980832, limit=225841770
 hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
 ide: failed opcode was: unknown
 hdc: drive not ready for command

 There are no errors logged in smart, i already try'd with smartctl -t
 long  no errors. I also did a blocktest on this drive.
 Next step is to change the cables, but i don't think this would be a
 solution, i think it's a kernel IDE / DMA problem.

Check that your BIOS settings are correct. Have you enabled 
LBA for this drive?

You might need to enter the C/H/S values by hand, if these 
are not being detected properly. Is the drive jumpered 
properly?

I have got rid of the errors now from my WD 40GB drive, on 
the latest 32 bit kernel.

Also look in /var/log/messages to see how the kernel 
initialises the drive.

As I mentioned in an earlier post, I now use hdparm from the 
rc.local script to reset my drive to UDMA 2. Please check 
the posts I made last week regarding this.

Please also read the man page for hdparm. You can use that 
to get a lot of information about your drive, and its 
current (U)DMA settings.

EG:

[root@karsites ~]# hdparm /dev/hde

/dev/hde:
 multcount    =  2 (on)
 IO_support   =  3 (32-bit w/sync)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    = 256 (on)
 geometry     = 65535/16/63, sectors = 78165360, start = 0
[root@karsites ~]#
[root@karsites ~]# hdparm -I /dev/hde

/dev/hde:

ATA device, with non-removable media
 Model Number:   WDC WD400BB-00GFA0
 Serial Number:  WD-WMAKA1241735
 Firmware Revision:  09.01B09
Standards:
 Supported: 5 4 3
 Likely used: 6
Configuration:
 Logical         max     current
 cylinders       16383   16383
 heads           16      16
 sectors/track   63      63
 --
 CHS current addressable sectors:   16514064
 LBA    user addressable sectors:   78165360
 device size with M = 1024*1024:       38166 MBytes
 device size with M = 1000*1000:       40020 MBytes (40 GB)
Capabilities:
 LBA, IORDY(can be disabled)
 bytes avail on r/w long: 40
 Standby timer values: spec'd by Standard, with device specific minimum
 R/W multiple sector transfer: Max = 16  Current = 2
 Recommended acoustic management value: 128, current value: 254
 DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5
      Cycle time: min=120ns recommended=120ns
 PIO: pio0 pio1 pio2 pio3 pio4
      Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
 Enabled Supported:
*SMART feature set
 Security Mode feature set
*Power Management feature set
*Write cache
*Look-ahead
*Host Protected Area feature set
*WRITE_BUFFER command
*READ_BUFFER command
*DOWNLOAD_MICROCODE
 SET_MAX security extension
*Automatic Acoustic Management feature set
*Device Configuration Overlay feature set
*SMART error logging
*SMART self-test
Security:
 supported
 not enabled
 not locked
 not frozen
 not expired: security count
 not supported: enhanced erase
HW reset results:
 CBLID- below Vih
 Device num = 0 determined by the jumper
Checksum: correct
[root@karsites ~]#
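
In the hdparm -I output above, the mode currently in use is the one
marked with "*" on the "DMA:" line (udma2 here). A minimal sketch for
pulling that out in a script follows; the function name active_dma_mode
is hypothetical, and it assumes grep supports the -o option.

```shell
#!/bin/sh
# Sketch: print the transfer mode that `hdparm -I` marks with "*".
# active_dma_mode is a hypothetical helper name.
active_dma_mode() {
    # Reads `hdparm -I` output on stdin and prints e.g. "udma2".
    grep -o '\*[um]dma[0-9]' | head -n 1 | tr -d '*'
}

# Usage on a live system (needs root):
#   hdparm -I /dev/hde | active_dma_mode
```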

HTH

Keith Roberts


Re: [CentOS] Kernel Errors Present

2011-01-17 Thread Keith Roberts
On Sat, 15 Jan 2011, Ryan Wagoner wrote:

 To: CentOS mailing list centos@centos.org
 From: Ryan Wagoner rswago...@gmail.com
 Subject: Re: [CentOS] Kernel Errors Present
 
 On Sat, Jan 15, 2011 at 7:57 AM, Keith Roberts ke...@karsites.net wrote:
 I hope to be be getting some custom made 80 wire UDMA IDE
 cables sorted ASAP. That should squeeze extra speed from all
 the drives on the machine.

 You shouldn't need custom cables. IDE 80 pin cables can be sourced all
 over the Internet for around $5 a cable. Prices have gone up since
 they are now not common. You might even post a wanted ad on craigslist
 and see if you can get a handful for a few bucks.

Hi Ryan.

I have a tall tower case with six 5.25-inch drive bays. The 
standard off-the-shelf IDE cables tend to have the slave 
connector about half way up the cable. So this means I can 
only use the master connector on standard cables. If I try 
to use the slave connector, that makes the cable too short to 
reach the m/b.

These people here:

http://estore.circuitassembly.com/products/Custom-ATA-66-100-133-IDE-Cable-Ultra-DMA-EIDE-Two-Devices-.html

can make custom cables for not much more than an 
off-the-shelf version. The maximum length is 40 inches, and I 
can choose where I want the slave connector to be. Anything 
from 3 to 13 inches away from the master connector. Now that's 
what I call service!

Kind Regards,

Keith Roberts



Re: [CentOS] Kernel Errors Present

2011-01-15 Thread Keith Roberts
On Sat, 15 Jan 2011, Leonard den Ottolander wrote:

 To: CentOS mailing list centos@centos.org
 From: Leonard den Ottolander leon...@den.ottolander.nl
 Subject: Re: [CentOS] Kernel Errors Present
 
 Hi Keith,

 On Thu, 2011-01-13 at 19:03 +, Keith Roberts wrote:
 Well it seems likely it's because the drive is on a
 40-wire cable. But the kernel wants to do UDMA at 100 MB/s.

 See hdparm's -X switch to override the (U)DMA mode used for the drive.

Hi Leonard.

Yes, I've added the following into /etc/rc.d/rc.local:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# turn off DMA for hde WD drive
# -d0 = off
# -d1 = on
# hdparm -d0 /dev/hde

# hdparm -d1 -Xudma2 /dev/hde
#
# /dev/hde:
#  setting using_dma to 1 (on)
#  setting xfermode to 66 (UltraDMA mode2)
#  using_dma=  1 (on)
#
# set WD drive to use UDMA2 - 33 MB/s
hdparm -d1 -Xudma2 /dev/hde

# set sector count for multiple sector I/O
# WD drives like a low setting
# to prevent I/O data errors.
hdparm -m2 /dev/hde

# enable 32-bit data transfers with a special sync sequence
# required by many chipsets
# /dev/hde:
# setting 32-bit IO_support flag to 3
# IO_support   =  3 (32-bit w/sync)
hdparm -c3 /dev/hde

sleep 10

# end of rc.local

At reset/power on time the IT8212 controller spotted the WD 
drive was on a 40 wire cable, and set the UDMA transfer rate 
to UDMA 2 (33 MB/s).

However for some reason the kernel decided to set the 
transfer rate for the drive to UDMA 5 (100 MB/s).

There were thousands of CRC errors in the SMART data for 
this drive, which would indicate crosstalk problems on the 
40 wire cable being run at too high a speed.

Now that I'm using hdparm to reset the drive to UDMA2 (33 MB/s), 
there are no more dma_intr errors occurring, or being 
reported by logwatch.
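
For reference, the nominal UDMA rates and the cable each mode needs can
be written down as a small lookup. This is only a sketch: the function
name udma_info is hypothetical, and the rates are the usual nominal ATA
figures (UDMA2 = 33 MB/s and UDMA5 = 100 MB/s match the numbers above).
Modes above UDMA2 are the ones that need an 80-conductor cable.

```shell
#!/bin/sh
# Sketch: nominal transfer rate and cable requirement per UDMA mode.
# udma_info is a hypothetical helper name.
udma_info() {
    # $1: UDMA mode number, 0 to 6
    case "$1" in
        0) rate=16.7 ;;
        1) rate=25 ;;
        2) rate=33.3 ;;
        3) rate=44.4 ;;
        4) rate=66.7 ;;
        5) rate=100 ;;
        6) rate=133 ;;
        *) echo "unknown UDMA mode: $1" >&2; return 1 ;;
    esac
    # Modes 0-2 work on a 40-conductor cable; 3 and up need 80 conductors.
    if [ "$1" -le 2 ]; then
        cable="40 or 80 conductors"
    else
        cable="80 conductors"
    fi
    echo "udma$1: $rate MB/s, cable: $cable"
}

udma_info 2
udma_info 5
```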

Thanks for all the feedback on this.

The drive is now working as desired.

I hope to be getting some custom made 80 wire UDMA IDE 
cables sorted ASAP. That should squeeze extra speed from all 
the drives on the machine.

The 40 wire IDE cables will be packed away safely!

Kind Regards,

Keith Roberts



Re: [CentOS] Kernel Errors Present

2011-01-15 Thread Ryan Wagoner
On Sat, Jan 15, 2011 at 7:57 AM, Keith Roberts ke...@karsites.net wrote:
 I hope to be be getting some custom made 80 wire UDMA IDE
 cables sorted ASAP. That should squeeze extra speed from all
 the drives on the machine.

You shouldn't need custom cables. IDE 80 pin cables can be sourced all
over the Internet for around $5 a cable. Prices have gone up since
they are now not common. You might even post a wanted ad on craigslist
and see if you can get a handful for a few bucks.

Ryan


Re: [CentOS] Kernel Errors Present

2011-01-14 Thread Leonard den Ottolander
Hi Keith,

On Thu, 2011-01-13 at 19:03 +, Keith Roberts wrote:
 Well it seems likely it's because the drive is on a 
 40-wire cable. But the kernel wants to do UDMA at 100 MB/s.

See hdparm's -X switch to override the (U)DMA mode used for the drive.

Regards,
Leonard.

-- 
mount -t life -o ro /dev/dna /genetic/research




Re: [CentOS] Kernel Errors Present

2011-01-13 Thread Keith Roberts
On Thu, 13 Jan 2011, Tsuyoshi Nagata wrote:

 To: CentOS mailing list centos@centos.org
 From: Tsuyoshi Nagata nagata...@jp.fujitsu.com
 Subject: Re: [CentOS] Kernel Errors Present
 
 Hi Keith
 (2011/01/13 6:39), Keith Roberts wrote:
 hde:  dma_intr: error=0x84 { DriveStat ...:  12 Time(s)
 hde:  dma_intr: status=0x51 { DriveReady SeekComplete
 The first error is data transmitting error. Your HARD DRIVE have
 a data transmitting error or malfunction on transmitting path without 
 disk.
 (The trouble is on memory, chip set, IDE-cable, HDD-Circuit(DMA). HDD 
 dish is OK.)
 DMA I/O was designed with 2 separated unit (control-unit and data-unit)
 The trouble is on control-unit part.

 Vivard/smartctl only explains your data-unit is OK.

  dma_intr: error=0x84 { DriveStatusError BadCRC }
  http://www.mail-archive.com/debian-user@lists.debian.org/msg128610.html

Thanks for all that information Tsuyoshi.

I have turned off dma for this drive with:

[root@karsites hde]# hdparm -d0 /dev/hde

/dev/hde:
 setting using_dma to 0 (off)
 using_dma    =  0 (off)

[root@karsites hde]# hdparm -d /dev/hde

/dev/hde:
 using_dma    =  0 (off)

I'll watch and see how things go now.

Kind Regards,

Keith



Re: [CentOS] Kernel Errors Present

2011-01-13 Thread Keith Roberts
On Thu, 13 Jan 2011, Tsuyoshi Nagata wrote:

 To: CentOS mailing list centos@centos.org
 From: Tsuyoshi Nagata nagata...@jp.fujitsu.com
 Subject: Re: [CentOS] Kernel Errors Present
 
 Hi Keith
 (2011/01/13 6:39), Keith Roberts wrote:
 hde:  dma_intr: error=0x84 { DriveStat ...:  12 Time(s)
 hde:  dma_intr: status=0x51 { DriveReady SeekComplete
 The first error is data transmitting error. Your HARD DRIVE have
 a data transmitting error or malfunction on transmitting path without 
 disk.
 (The trouble is on memory, chip set, IDE-cable, HDD-Circuit(DMA). HDD 
 dish is OK.)
 DMA I/O was designed with 2 separated unit (control-unit and data-unit)
 The trouble is on control-unit part.

 Vivard/smartctl only explains your data-unit is OK.

Well it seems likely it's because the drive is on a 
40-wire cable. But the kernel wants to do UDMA at 100 MB/s.

The ITE8212 PCI controller card spots the 40 wire cable, and 
sets the transfer mode down from UDMA5 (100 MB/s) to UDMA2, 
33MB/s

I have found this in /var/log/messages:

*snip*
Jan 13 18:35:16 karsites kernel: hde: 78165360 sectors (40020 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
Jan 13 18:35:16 karsites kernel: hde: cache flushes not supported
Jan 13 18:35:16 karsites kernel:  hde: hde1 hde2  hde5 hde6 hde7 
*snip*
Jan 13 18:35:16 karsites kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 13 18:35:16 karsites kernel: hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jan 13 18:35:16 karsites kernel: ide: failed opcode was: unknown
Jan 13 18:35:16 karsites kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 13 18:35:16 karsites kernel: hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jan 13 18:35:16 karsites kernel: ide: failed opcode was: unknown
Jan 13 18:35:16 karsites kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 13 18:35:16 karsites kernel: hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jan 13 18:35:16 karsites kernel: ide: failed opcode was: unknown
Jan 13 18:35:16 karsites kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 13 18:35:16 karsites kernel: hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jan 13 18:35:16 karsites kernel: ide: failed opcode was: unknown
Jan 13 18:35:16 karsites kernel: ide2: reset: success
*snip*

Is the kernel probing the drive directly, and using the 
drive maximum UDMA rate, instead of getting this from the 
ITE8212 PCI card?

I have been reading up about hdparm, and set the drive's UDMA 
mode in /etc/init.d/rc.local with:

[root@karsites ~]# hdparm -d1 -Xudma2 /dev/hde

/dev/hde:
 setting using_dma to 1 (on)
 setting xfermode to 66 (UltraDMA mode2)
 using_dma    =  1 (on)

Is that why the ide2 reset was successful?

I shall monitor this and see if I get those errors again.
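
The "xfermode 66" in the hdparm output above follows the numbering
described in the hdparm man page for -X: 64 plus the mode number for
UltraDMA, 32 plus the mode number for multiword DMA, and 8 plus the mode
number for PIO. A small sketch of that arithmetic is below; the function
name xfermode_number is hypothetical, and the "pioN" spelling is just
this sketch's own notation for its input.

```shell
#!/bin/sh
# Sketch: numeric -X value for a named transfer mode, per the
# numbering in the hdparm man page. xfermode_number is a
# hypothetical helper name.
xfermode_number() {
    # $1: mode name such as udma2, mdma1 or pio4
    n=${1##*[!0-9]}    # keep only the trailing digit
    case "$1" in
        udma[0-9]) echo $(( 64 + n )) ;;
        mdma[0-9]) echo $(( 32 + n )) ;;
        pio[0-9])  echo $(( 8 + n )) ;;
        *) echo "unrecognised mode: $1" >&2; return 1 ;;
    esac
}

# Example: xfermode_number udma2 prints the value hdparm reported above.
xfermode_number udma2
```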

Also, using hdparm from the command line, allows me to test 
the data transfer rates, with or without DMA enabled.

Looks good, and I guess I will find some 80 conductor 
IDE cables for all my IDE drives, and enable UDMA to get 
the maximum transfer rate.

40 wire IDE cables are not worth the hassle any more,
now UDMA is so stable.

Kind Regards,

Keith Roberts





Re: [CentOS] Kernel Errors Present

2011-01-13 Thread compdoc
Is it the ITE IT8212 ATA RAID Controller ?

I would suspect that raid card - the few I've tried didn't work well even
with the manufacturer's supplied windows drivers. The linux drivers might
not be any better.

I'm not sure why you distrust DMA, or if it's just on this one card that you
have problems with it. ATA is being phased out now, but it became a very
mature and reliable technology in its final years.

Motherboards with onboard ATA66/100/133 ports became extremely reliable. As
long as you used the 80 wire cables.

Unfortunately, a good controller is hard to find. I used to like Promise as
a windows controller. They were very reliable if you had the latest
firmware and drivers. But I don't know if they work well in linux, or even
how well their current models work. (been a while since I've used one)






Re: [CentOS] Kernel Errors Present

2011-01-13 Thread Keith Roberts
On Thu, 13 Jan 2011, compdoc wrote:

 To: centos@centos.org
 From: compdoc comp...@hotrodpc.com
 Subject: Re: [CentOS] Kernel Errors Present
 
 Is it the ITE IT8212 ATA RAID Controller ?

Yes it is a card with that ITE8212 chip on it.

I had to reflash the BIOS on the card, to make it work in 
standard IDE mode. There's one BIOS download for RAID 
functionality, and another for 100% ATAPI functions. So it's 
either a RAID card, or a plain IDE card. You cannot switch 
between RAID functionality and standard IDE functionality 
without flashing the BIOS on the PCI card.

I'm monitoring the drive's performance, and waiting to see if 
there are any more DMA errors reported by logwatch. Nothing 
so far in /var/log/messages :)

Keith



[CentOS] Kernel Errors Present

2011-01-12 Thread Keith Roberts
I'm getting this message in my logwatch email notification:

 --------------------- Kernel Begin ------------------------

 WARNING:  Kernel Errors Present
    sdb:3Buffer I/O error on device sdb, l ...:  2 Time(s)
    Buffer I/O error on device sdb, l ...:  12 Time(s)
    hde: dma_intr: error=0x84 { DriveStat ...:  12 Time(s)
    hde: dma_intr: status=0x51 { DriveReady SeekComplete Error } ...:  12 Time(s)

 ---------------------- Kernel End -------------------------

It's an old drive I'm using for swap space, /var, and /tmp.
(It's on a PCI IDE controller, that's why it comes up as 
hde.)

If I test it for bad sectors using Vivard, there are no bad 
sectors found or remapped.

I'm just trying to move a lot of regular disk I/O from my 
main drive with the root installation on it, to a replaceable 
spare.

I cannot find which log file these messages are going to.

Nothing in /var/log/dmesg or messages.

Where does logwatch get these messages from?

Kind Regards,

Keith Roberts



Re: [CentOS] Kernel Errors Present

2011-01-12 Thread Kwan Lowe
On Wed, Jan 12, 2011 at 4:45 AM, Keith Roberts ke...@karsites.net wrote:


 It's an old drive I'm using for swap space, /var, and /tmp.
 (It's on a PCI IDE controller, that's why it comes up as
 hde.)

 If I test it for bad sectors using Vivard, there are no bad
 sectors found or remapped.

 I'm just trying to move a lot of regular disk I/O from my
 main drive with the root installtion on it, to a replaceable
 spare.

 I cannot find which log file these messages are going to.

 Nothing in /var/log/dmesg or messages.

 Where does logwatch get these messages from?


Take a look at your /etc/syslog.conf and /etc/sysconfig/syslog and see
where the kernel messages are being logged.

There's a klogd service that logs these particular messages.. it's
started from the same runscript as syslog (/etc/rc.d/init.d/syslog).
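
As a rough way to see where kernel messages end up, the selector column
of /etc/syslog.conf can be filtered for rules that cover the kern
facility. This is only a sketch: the function name kernel_log_targets is
hypothetical, and the filter is deliberately crude (it does not
understand "kern.none" exclusions or priority levels).

```shell
#!/bin/sh
# Sketch: list the targets of syslog.conf rules whose selector
# mentions the kern facility or the "*" wildcard facility.
# kernel_log_targets is a hypothetical helper name.
kernel_log_targets() {
    # stdin: contents of a syslog.conf-style file
    awk '
        /^[ \t]*#/ { next }
        NF >= 2 && $1 ~ /(^|;)(kern|\*)\./ { print $NF }
    '
}

# Usage on a live system:
#   kernel_log_targets < /etc/syslog.conf
```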


Re: [CentOS] Kernel Errors Present

2011-01-12 Thread compdoc
Bad sectors get reallocated automatically, so you might not find any with
testing. You need to see how many have been reallocated.

SMART should already be enabled, so maximize your term window and type:

smartctl -a /dev/sdb

That will show the reallocated sector count, as well as power on hours, and
temps, etc. Do that for each drive.

If it's attached to a raid controller, you have to take additional steps as
found on google.

If there are any reallocated sectors, you might want to think about
replacing it. I have a customer with a failing drive in a server that causes
it to freeze from time to time as it develops new bad sectors. I'm replacing
it this weekend...
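
To check a single attribute without reading the whole report, the raw
value can be picked out of the smartctl attribute table. A minimal
sketch, assuming the usual ten-column table layout (ID#, ATTRIBUTE_NAME,
..., RAW_VALUE); the function name smart_raw_value is hypothetical.

```shell
#!/bin/sh
# Sketch: print the RAW_VALUE column for one SMART attribute.
# smart_raw_value is a hypothetical helper name.
smart_raw_value() {
    # $1: attribute name, e.g. Reallocated_Sector_Ct
    # stdin: output of `smartctl -A` (or `smartctl -a`)
    # Exit status is non-zero if the attribute is not listed.
    awk -v name="$1" '
        $2 == name { print $10; found = 1 }
        END { exit !found }
    '
}

# Usage on a live system (needs root):
#   smartctl -A /dev/sdb | smart_raw_value Reallocated_Sector_Ct
```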





Re: [CentOS] Kernel Errors Present

2011-01-12 Thread Keith Roberts
On Wed, 12 Jan 2011, compdoc wrote:

 To: 'CentOS mailing list' centos@centos.org
 From: compdoc comp...@hotrodpc.com
 Subject: Re: [CentOS] Kernel Errors Present
 
 Bad sectors get reallocated automatically, so you might 
 not find any with testing. You need to see how many have 
 been reallocated.

The Vivard disk diagnostic tool lists any sector read 
errors, and a count of remapped sectors, if there are any 
remapped.

 SMART should already be enabled, so maximize your term 
 window and type:

 smartctl -a /dev/sdb

 That will show the reallocated sector count, as well as 
 power on hours, and temps, etc. Do that for each drive.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   093   090   021    Pre-fail  Always       -       2741
  4 Start_Stop_Count        0x0032   099   099   040    Old_age   Always       -       1611
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always       -       0

No re-allocated sectors found.

 If its attached to a raid controller, you have to take 
 additional steps as found on google.

No it's a standard IDE controller.

Looking in /proc/ide/ide2/hde/settings I find this:

name            value           min     max     mode
----            -----           ---     ---     ----
pio_mode        write-only      0       255     w
using_dma       1               0       1       rw
wcache          1               0       1       rw


I have tried to turn DMA off for this drive, using the 
libata.dma=0 kernel boot parameter.

But it's still coming up as using_dma 1.

If I can turn DMA off for this drive, that might get rid of 
the DMA error messages.

  hde: dma_intr: error=0x84 { DriveStat ...:  12 Time(s)
  hde: dma_intr: status=0x51 { DriveReady SeekComplete

Kind Regards,

Keith Roberts



Re: [CentOS] Kernel Errors Present...

2011-01-12 Thread m . roth
Keith Roberts wrote:
 On Wed, 12 Jan 2011, compdoc wrote:

 To: 'CentOS mailing list' centos@centos.org
 From: compdoc comp...@hotrodpc.com
 Subject: Re: [CentOS] Kernel Errors Present

 Bad sectors get reallocated automatically, so you might
 not find any with testing. You need to see how many have
 been reallocated.
snip
Maybe, but I'd fsck -C -c /dev/
   ^^ - check for bad blocks, put them in the table

   mark





Re: [CentOS] Kernel Errors Present...

2011-01-12 Thread Keith Roberts
On Wed, 12 Jan 2011, m.r...@5-cent.us wrote:

 To: CentOS mailing list centos@centos.org
 From: m.r...@5-cent.us
 Subject: Re: [CentOS] Kernel Errors Present...
 
 Keith Roberts wrote:
 On Wed, 12 Jan 2011, compdoc wrote:

 To: 'CentOS mailing list' centos@centos.org
 From: compdoc comp...@hotrodpc.com
 Subject: Re: [CentOS] Kernel Errors Present

 Bad sectors get reallocated automatically, so you might
 not find any with testing. You need to see how many have
 been reallocated.
 snip
 Maybe, but I'd fsck -C -c /dev/
   ^^ - check for bad blocks, put them in the table

I could do that soon

But I don't want to use DMA on this drive (/dev/hde) anyway.

In the BIOS I turn DMA off for /dev/hda and /dev/hdc,
but they still show up in /proc/ide/.../settings as 
using_dma 1.

So is the kernel ignoring the BIOS DMA settings?

32-bit transfer mode is on in the BIOS though.

Keith



Re: [CentOS] Kernel Errors Present...

2011-01-12 Thread John R Pierce

 In the BIOS I turn DMA off for /dev/hda and /dev/hdc,
 but they still show up in /proc/ide/.../settings as
 using_dma 1.


say HUH? IDE PIO modes are like 3-7 MBytes/sec and require 100% CPU 
utilization during the transfer phase. why in dogs name would you be 
doing this in 2011 ?






Re: [CentOS] Kernel Errors Present...

2011-01-12 Thread compdoc
What model is the drive?





Re: [CentOS] Kernel Errors Present...

2011-01-12 Thread Keith Roberts
On Wed, 12 Jan 2011, compdoc wrote:

 What model is the drive?

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar family
Device Model:     WDC WD400BB-00GFA0
Serial Number:    WD-WMAKA1241735
Firmware Version: 09.01B09
User Capacity:    40,020,664,320 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   5
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jan 12 22:44:01 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Keith



Re: [CentOS] Kernel Errors Present

2011-01-12 Thread Tsuyoshi Nagata
Hi Keith
(2011/01/13 6:39), Keith Roberts wrote:
hde: dma_intr: error=0x84 { DriveStat ...:  12 Time(s)
hde: dma_intr: status=0x51 { DriveReady SeekComplete
The first error is a data transmission error. Either your hard drive has
a data transmission error, or there is a malfunction somewhere on the
transmission path other than the disk itself.
(The trouble is in memory, the chipset, the IDE cable, or the HDD's DMA
circuit. The HDD platter is OK.)
DMA I/O is designed with two separate units (a control unit and a data unit).
The trouble is in the control unit part.

Vivard/smartctl only tells you that your data unit is OK.

 dma_intr: error=0x84 { DriveStatusError BadCRC }
 http://www.mail-archive.com/debian-user@lists.debian.org/msg128610.html
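
The hex values in those log lines are bit masks, and the names in braces
are just the bits that are set. The sketch below decodes them the same
way, so status=0x51 and error=0x84 can be checked by hand. Helper names
are hypothetical. The flag names for bits that do not appear in this
thread follow the old IDE driver's wording as I remember it and should
be treated as assumptions, and only the two error bits seen here are
named.

```shell
#!/bin/sh
# Sketch: turn IDE status/error register values into flag names.
# decode_bits, decode_status and decode_error are hypothetical helpers.

decode_bits() {
    # $1: register value (decimal or 0x-prefixed hex)
    # remaining arguments: MASK:Name pairs, checked in the order given
    value=$(( $1 ))
    shift
    out=""
    for pair in "$@"; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $(( value & $mask )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    # Strip the leading space left by the first appended name.
    echo "${out# }"
}

decode_status() {
    decode_bits "$1" 0x80:Busy 0x40:DriveReady 0x20:DeviceFault \
        0x10:SeekComplete 0x08:DataRequest 0x04:CorrectedError \
        0x02:Index 0x01:Error
}

decode_error() {
    # Only the two bits that appear in this thread are named here.
    decode_bits "$1" 0x04:DriveStatusError 0x80:BadCRC
}

decode_status 0x51
decode_error 0x84
```

With error=0x84 the BadCRC bit is set, which points at the interface
(cable, controller, transfer mode) rather than at the platters.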

-Tsuyoshi.

