Re: ATA driver problem?? (lost disk contact)

1999-12-19 Thread Soren Schmidt

It seems Allen Pulsifer wrote:
 According to the DPTA-3x spec from IBM, if the drive has fully entered
 Standby mode, it can take up to 31 seconds for it to spin back up.
 (See sections 3.3.6.1 and 13.0).  Other drive models may take even
 longer, and even after the drive is back up, it may take a few seconds
 to respond to the command.
 
 You might have to set the timeout value as high as 45-60 seconds in
 order to get reliable operation.
 
 One possibility: the Check Power Mode command (sections 10.5.2 and 12.1)
 allows you to determine if the drive is in Standby mode.  You might
 be able to timeout after 5-10 seconds, abort the read/write command,
 do a Check Power Mode command, and if the drive is in the process
 of spinning back up, then wait patiently for it to come to life
 before retrying the original read/write command.
 
 It looks to me like you would have to do a soft reset (sections 11.0,
 9.6 and 10.1) in order to abort the read/write command.  A soft
 reset would also cause the drive to come back to life if it were
 in Sleep mode (sections 3.3.6, 10.5.1 and 12.31).
 
 Note that section 13.0 (page 190) is explicit about this procedure:
 "We recommend that the host system executes Soft reset and then
 retry to issue the command if the host system would occur timeout
 for the device."

This is more or less what is done now; I just don't do the
check power mode after the reset. There is not much point, since I
know the disk is coming up.


-Søren


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: ATA driver problem?? (lost disk contact)

1999-12-18 Thread Martin Blapp


hi,

I've bought two new 16GB ATA disks and am not able to boot
anymore since wd0 has been retired:

Fresh current from today:

[...]
ad0: ad_timeout: lost disk contact
ata0: resetting devices

and after it hangs forever.

I tried IDE_DELAY=1 and 15000 but it did not change anything.

Break into DDB is not possible. This happens with/without
ATA DMA support in the kernel.

fuchur# pciconf -l
chip0@pci0:0:0: class=0x06 card=0x chip=0x70061022 rev=0x23 hdr=0x00
pcib1@pci0:1:0: class=0x060400 card=0x chip=0x70071022 rev=0x01 hdr=0x01
isab0@pci0:7:0: class=0x060100 card=0x chip=0x74081022 rev=0x01 hdr=0x00
ide_pci0@pci0:7:1:  class=0x01018a card=0x chip=0x74091022 rev=0x03 hdr=0x00
chip1@pci0:7:3: class=0x068000 card=0x chip=0x740b1022 rev=0x03 hdr=0x00
none0@pci0:7:4: class=0x0c0310 card=0x chip=0x740c1022 rev=0x06 hdr=0x00
de0@pci0:10:0:  class=0x02 card=0x chip=0x00091011 rev=0x20 hdr=0x00
none1@pci0:11:0:class=0x01 card=0x10001000 chip=0x000f1000 rev=0x26 hdr=0x00
vga-pci0@pci1:5:0:  class=0x03 card=0x2179102b chip=0x0525102b rev=0x04 hdr=0x00

and the output from the old wd driver:

wdc0 at port 0x1f0-0x1f7 irq 14 on isa0
wdc0: unit 0 (wd0): IBM-DJNA-351520
wd0: 14664MB (30033360 sectors), 29795 cyls, 16 heads, 63 S/T, 512 B/S
wdc0: unit 1 (atapi): TOSHIBA CD-ROM XM-6202B/1110, removable, accel, ovlap, dma, iordis
Device wcd0a: name slot allocation failed (Errno=17)
Device wcd0c: name slot allocation failed (Errno=17) 
wcd0: drive speed 5512KB/sec, 256KB cache
wcd0: supported read types: CD-R, CD-RW, CD-DA
wcd0: Audio: play, 255 volume levels
wcd0: Mechanism: ejectable tray
wcd0: Medium: no/blank disc inside, unlocked
wdc1 at port 0x170-0x177 irq 15 on isa0
wdc1: unit 0 (wd2): IBM-DJNA-351520
wd2: 14664MB (30033360 sectors), 29795 cyls, 16 heads, 63 S/T, 512 B/S
wdc1: unit 1 (atapi): IOMEGA  ZIP 100   ATAPI/23.D, removable, intr, iordis
wfd0: medium type unknown (no disk)
wfd0: buggy Zip drive, 64-block transfer limit set 
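
As a sanity check on the probe output above, the reported sector count and size can be recomputed from the CHS geometry. This is only a sketch: the function names are made up for illustration, and it assumes the driver reports binary megabytes, rounded down.

```python
SECTOR_BYTES = 512  # "512 B/S" in the probe line


def chs_sectors(cyls, heads, sectors_per_track):
    """Total sectors implied by a cylinders/heads/sectors geometry."""
    return cyls * heads * sectors_per_track


def size_mb(sectors):
    """Capacity in MB, assuming 1 MB = 1024 * 1024 bytes, rounded down."""
    return sectors * SECTOR_BYTES // (1024 * 1024)


# Geometry reported for the IBM-DJNA-351520 (wd0 and wd2).
sectors = chs_sectors(29795, 16, 63)
print(sectors, "sectors,", size_mb(sectors), "MB")
```

Both numbers agree with the wd0 line, so the two drivers at least see the same geometry.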

Martin

PS: I can give you access to the machine if you like.

Martin Blapp, [EMAIL PROTECTED]

Improware AG, UNIX solution and service provider
Zurlindenstrasse 29, 4133 Pratteln, Switzerland
Phone: +41 79 370 26 05, Fax: +41 61 826 93 01







Re: ATA driver problem?? (lost disk contact)

1999-12-18 Thread Dave J. Boers

On Sat, Dec 18, 1999 at 08:44:42PM +0100, Soren Schmidt wrote:
 There is no way to see if the disk was in suspend mode, you can
 give it a command and see how long it takes before it comes back :)
 
 The problem here is that it takes the command and OK's it, but it
 takes the spinuptime + overhead before the answer comes, and then
 the driver already timed out. 

I am under the impression that the drive does not need to do ADM if it is
shut down once every six days. So can't we go with phk's solution: make a
cron job that shuts down and powers up the drive once every six days? 

Regards,

Dave. 

-- 
  God, root, what's the difference?
  [EMAIL PROTECTED]





Re: ATA driver problem?? (lost disk contact)

1999-12-18 Thread Martin Blapp


Sorry,

I found a rather easy workaround. Disable DMA for
the disks in the BIOS ... But I still wonder why
enable/disable ATA DMA in kernel has no effect for
this crash. Why does only the BIOS disable help ?

ata-pci0: Unknown PCI ATA controller (generic mode) at device 7.1 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
chip1: PCI to Other bridge (vendor=1022 device=740b) at device 7.3 on pci0 

Strange thing is that the two disks report themselves
different (The disks are identical) and the settings in the
BIOS for ata0 and ata1 too ...

ad0: IBM-DJNA-351520/J56OA30K ATA-4 disk at ata0 as master
ad0: 14664MB (30033360 sectors), 29795 cyls, 16 heads, 63 S/T, 512 B/S
ad0: 16 secs/int, 32 depth queue, DMA
                                  ^^^
                                  why that ?

ad2: IBM-DJNA-351520/J56OA30K ATA-4 disk at ata1 as master
ad2: 14664MB (30033360 sectors), 29795 cyls, 16 heads, 63 S/T, 512 B/S
ad2: 16 secs/int, 32 depth queue, PIO   

Anyway, so DMA on K7 boards is not supported. Is someone working on this ?

Martin

Martin Blapp, [EMAIL PROTECTED]

Improware AG, UNIX solution and service provider
Zurlindenstrasse 29, 4133 Pratteln, Switzerland
Phone: +41 79 370 26 05, Fax: +41 61 826 93 01








Re: ATA driver problem?? (lost disk contact)

1999-12-18 Thread Soren Schmidt

It seems Martin Blapp wrote:
 
 Sorry,
 
 I found a rather easy workaround. Disable DMA for
 the disks in the BIOS ... But I still wonder why
 enable/disable ATA DMA in kernel has no effect for
 this crash. Why does only the BIOS disable help ?

No idea, I have to study AMD's southbridge first..

 ata-pci0: Unknown PCI ATA controller (generic mode) at device 7.1 on pci0
 ata-pci0: Busmastering DMA supported
 ata0 at 0x01f0 irq 14 on ata-pci0
 ata1 at 0x0170 irq 15 on ata-pci0
 chip1: PCI to Other bridge (vendor=1022 device=740b) at device 7.3 on pci0 
 
 Strange thing is that the two disks report themselves
 different (The disks are identical) and the settings in the
 BIOS for ata0 and ata1 too ...
 
 ad0: IBM-DJNA-351520/J56OA30K ATA-4 disk at ata0 as master
 ad0: 14664MB (30033360 sectors), 29795 cyls, 16 heads, 63 S/T, 512 B/S
 ad0: 16 secs/int, 32 depth queue, DMA
                                   ^^^
                                   why that ?
 
 ad2: IBM-DJNA-351520/J56OA30K ATA-4 disk at ata1 as master
 ad2: 14664MB (30033360 sectors), 29795 cyls, 16 heads, 63 S/T, 512 B/S
 ad2: 16 secs/int, 32 depth queue, PIO   
 
 Anyway, so DMA on K7 boards is not supported. Is someone working on this ?

It's not all K7 boards; those that have the VIA southbridge are supported,
i.e. most K7 boards. It's just that nobody has written support for
the AMD southbridge yet. It should work in generic mode, as the above
suggests, in PIO or DMA mode, just no UDMA.
BTW I need full dmesg's from verbose boots; these snippets are not enough.
I'll try to get docs on the AMD southbridge; if I do, it should be pretty
easy to add support for it...

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-18 Thread Richard Seaman, Jr.

On Sat, Dec 18, 1999 at 11:20:51PM +0100, Martin Blapp wrote:
 
 Sorry,
 
 I found a rather easy workaround. Disable DMA for
 the disks in the BIOS ... But I still wonder why
 enable/disable ATA DMA in kernel has no effect for
 this crash. Why does only the BIOS disable help ?

Purely a wild guess on my part:

If the BIOS is set to enable UDMA, then the BIOS sets both
the controller and the disk for UDMA.  But the ata driver
tries to set the disk to WDMA2 mode for "generic drivers".
If the controller is set for UDMA and the disk for WDMA2, they
might have problems communicating (the "generic driver" doesn't
try to mess with the controller settings, I don't think).

However, if the BIOS sets the disk and the controller to
PIO, then when the ata driver uses the "generic" treatment
to set the disk to WDMA2, this works since PIO and WDMA2
have similar timings.

As I said, this is purely a wild guess from someone who
understands all this poorly.


-- 
Richard Seaman, Jr.   email: [EMAIL PROTECTED]
5182 N. Maple Lane    phone: 262-367-5450
Chenequa WI 53058     fax:   262-367-5852





RE: ATA driver problem?? (lost disk contact)

1999-12-18 Thread Allen Pulsifer

According to the DPTA-3x spec from IBM, if the drive has fully entered
Standby mode, it can take up to 31 seconds for it to spin back up.
(See sections 3.3.6.1 and 13.0).  Other drive models may take even
longer, and even after the drive is back up, it may take a few seconds
to respond to the command.

You might have to set the timeout value as high as 45-60 seconds in
order to get reliable operation.

One possibility: the Check Power Mode command (sections 10.5.2 and 12.1)
allows you to determine if the drive is in Standby mode.  You might
be able to timeout after 5-10 seconds, abort the read/write command,
do a Check Power Mode command, and if the drive is in the process
of spinning back up, then wait patiently for it to come to life
before retrying the original read/write command.

It looks to me like you would have to do a soft reset (sections 11.0,
9.6 and 10.1) in order to abort the read/write command.  A soft
reset would also cause the drive to come back to life if it were
in Sleep mode (sections 3.3.6, 10.5.1 and 12.31).

Note that section 13.0 (page 190) is explicit about this procedure:
"We recommend that the host system executes Soft reset and then
retry to issue the command if the host system would occur timeout
for the device."
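
The sequence described above (short timeout, soft reset, Check Power Mode, patient wait, retry) can be sketched as a toy model. This is not driver code: the function name, the action labels and the 60 second upper bound are assumptions made purely for illustration, and the "drive" is reduced to a single spin-up time.

```python
def read_with_recovery(spinup_s, short_timeout_s=5, max_wait_s=60):
    """Model one read against a drive that needs `spinup_s` seconds to respond.

    Returns (seconds until the command completed, list of actions taken).
    Raises TimeoutError if the drive needs longer than `max_wait_s`.
    """
    actions = ["issue command"]

    # Fast path: the drive answers before the short timeout expires.
    if spinup_s <= short_timeout_s:
        actions.append("command completed")
        return spinup_s, actions

    # Short timeout expired: abort via soft reset, then ask the drive
    # for its power state instead of declaring it lost.
    actions += ["timeout", "soft reset", "check power mode"]

    if spinup_s > max_wait_s:
        raise TimeoutError("drive did not come back within %d s" % max_wait_s)

    # Drive is still spinning up: wait for it, then retry the original command.
    actions += ["wait for spin-up", "retry command", "command completed"]
    return spinup_s, actions


# Worst case quoted from the spec: 31 seconds from Standby.
print(read_with_recovery(31))
```

The point of the split is that the common case keeps a short timeout, and only a drive that is known to be spinning up gets the long one.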

Hope this helps.

Allen

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Soren Schmidt
 Sent: Saturday, December 18, 1999 4:13 PM
 To: [EMAIL PROTECTED]
 Cc: Richard Seaman Jr.; [EMAIL PROTECTED]
 Subject: Re: ATA driver problem?? (lost disk contact)


 It seems Dave J. Boers wrote:
  On Sat, Dec 18, 1999 at 08:44:42PM +0100, Soren Schmidt wrote:
   There is no way to see if the disk was in suspend mode, you can
   give it a command and see how long it takes before it comes back :)
  
   The problem here is that it takes the command and OK's it, but it
   takes the spinuptime + overhead before the answer comes, and then
   the driver already timed out.
 
  I am under the impression that the drive does not need to do ADM if it is
  shut down once every six days. So can't we go with phk's solution: make a
  cron job that shuts down and powers up the drive once every six days?

 I'd rather just up the timeout to 10s like the old wd driver, that way
 it apparently isn't a problem anymore, we just wait for the sucker to
 spin up if needed.

 -Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-17 Thread Poul-Henning Kamp

In message [EMAIL PROTECTED], Soren Schmidt writes:
 It seems Richard Seaman, Jr. wrote:
 
 Yup, sounds like the problem some are seeing, now I wonder why I
 haven't seen it on any of the IBM disks I've access to, hmm...
 
 It apparently can't be disabled, but I'll try to figure out if
 I can detect when the drive is in this mode, or put it in
 standby mode and back again when there is nothing else to do,
 that should reset the timer...

Probably the best thing to do would be to write a "atamaint" program
and schedule a cronjob to run it at 05:00 every morning or something.

--
Poul-Henning Kamp FreeBSD coreteam member
[EMAIL PROTECTED]   "Real hackers run -current on their laptop."
FreeBSD -- It will take a long time before progress goes too far!





Re: ATA driver problem?? (lost disk contact)

1999-12-17 Thread Richard Seaman, Jr.

On Fri, Dec 17, 1999 at 08:22:03AM +0100, Soren Schmidt wrote:

 Yup, sounds like the problem some are seeing, now I wonder why I
 haven't seen it on any of the IBM disks I've access to, hmm...
 
 It apparently can't be disabled, but I'll try to figure out if
 I can detect when the drive is in this mode, or put it in
 standby mode and back again when there is nothing else to do,
 that should reset the timer...

Note that the wd driver doesn't "report" any problems.  Don't
know if that is because the wd driver handles this differently,
or because the reporting is different.  

The machine that reports these problems runs 7/24, and has for
over a year and a half.  The IBM disk has been in for quite a
while (maybe 6 months or more).  Only ata "reports" the problem.

Note that the IBM specs say that spinup from standby to idle is
13 secs "typical" and 31 secs max for this drive.  I'm assuming
that what we're seeing is that the ata driver "lost contact"
because the timeout is less than the time it takes to spinup
from standby to idle (or to spinup from an interrupted switch
from idle to standby)?
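
A rough way to see how the spin-up figures interact with the driver timeout is to count how many timeouts fire before the drive answers. This is a back-of-the-envelope sketch only: the helper name is made up, and it assumes a reset does not restart or delay the spin-up.

```python
import math


def timeouts_before_ready(spinup_s, timeout_s):
    """Number of driver timeouts that expire before the drive responds.

    Assumes the drive answers exactly `spinup_s` seconds after the first
    command and that each timeout is followed by an immediate retry.
    """
    return max(0, math.ceil(spinup_s / timeout_s) - 1)


# 13 s "typical" and 31 s max spin-up, against a few candidate timeouts.
for timeout in (5, 10, 45):
    for spinup in (13, 31):
        n = timeouts_before_ready(spinup, timeout)
        print("timeout %2ds, spin-up %2ds: %d timeouts" % (timeout, spinup, n))
```

Under these assumptions a 5 second timeout expires several times during a single spin-up, which would look like a burst of "lost disk contact" messages a few seconds apart.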

-- 
Richard Seaman, Jr.   email: [EMAIL PROTECTED]
5182 N. Maple Lane    phone: 262-367-5450
Chenequa WI 53058     fax:   262-367-5852





Re: ATA driver problem?? (lost disk contact)

1999-12-17 Thread Soren Schmidt

It seems Richard Seaman, Jr. wrote:
 
  Yup, sounds like the problem some are seeing, now I wonder why I
  haven't seen it on any of the IBM disks I've access to, hmm...
  
  It apparently can't be disabled, but I'll try to figure out if
  I can detect when the drive is in this mode, or put it in
  standby mode and back again when there is nothing else to do,
  that should reset the timer...
 
 Note that the wd driver doesn't "report" any problems.  Don't
 know if that is because the wd driver handles this differently,
 or because the reporting is different.  

Because the wd driver has a 10 secs timeout, and ata has 5 secs.
I think the easiest way to "solve" this is to increase the 
timeout to 10-15 secs, as little as I want to do that...

 The machine that reports these problems runs 7/24, and has for
 over a year and a half.  The IBM disk has been in for quite a
 while (maybe 6 months or more).  Only ata "reports" the problem.

See above..

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-17 Thread Richard Seaman, Jr.

On Fri, Dec 17, 1999 at 02:28:29PM +0100, Soren Schmidt wrote:

 Because the wd driver has a 10 secs timeout, and ata has 5 secs.
 I think the easiest way to "solve" this is to increase the 
 timeout to 10-15 secs, as little as I want to do that...

I don't really understand disk drivers, so if I'm off base,
I apologize.  I'm under the impression that you can query the
disk to see if it's in idle mode, or if not, if it's in standby
mode.  If you leave the timeout at 5 secs, and you actually
timeout, perhaps you can check the disk to see if it's in
standby mode, or in the process of spinning up.  If so, for
just this case, perhaps you can adjust the timeout to a greater
value before retrying the command?  Also, perhaps you want to
skip printing the diagnostic if the timeout was due to 
standby/spinup, unless it also fails on retry?

-- 
Richard Seaman, Jr.   email: [EMAIL PROTECTED]
5182 N. Maple Lane    phone: 262-367-5450
Chenequa WI 53058     fax:   262-367-5852





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Soren Schmidt

It seems Dave J. Boers wrote:
 
 I am still having "disc contact lost messages" regularly too. I've been
 posting about them on several occasions some time ago.  I haven't been able
 to pin it down, however. IF they occur, they occur somewhere between 9:15
 and 9:20 a.m. OR p.m. But they don't always. This used to be 10:15, but
 that changed _some weeks after_ the change of daylight saving time. I can't
 seem to relate it to anything. It is unlikely that it's a power glitch,
 because the system has been displaying the problem with two different
 UPS's. 
 
 The machine is running current current's which are regularly updated. It's
 an ABIT BP6 and the disk causing problems is a WD 7200 RPM 18,2 Gb disk
 running UDMA33. It's the only IDE disk in the system; the other disks are
 all SCSI. The system is running 24/7. Other details were posted earlier. 

There is this thing with the IBM's doing some headcleaning stuff once
a day/week, but I've never seen any of my IBM's do that (I got plenty
of them). I'll try to get more info on that from IBM...

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Soren Schmidt

It seems Soren Schmidt wrote:
 It seems Dave J. Boers wrote:
  
  I am still having "disc contact lost messages" regularly too. I've been
  posting about them on several occasions some time ago.  I haven't been able
  to pin it down, however. IF they occur, they occur somewhere between 9:15
  and 9:20 a.m. OR p.m. But they don't always. This used to be 10:15, but
  that changed _some weeks after_ the change of daylight saving time. I can't
  seem to relate it to anything. It is unlikely that it's a power glitch,
  because the system has been displaying the problem with two different
  UPS's. 
  
  The machine is running current current's which are regularly updated. It's
  an ABIT BP6 and the disk causing problems is a WD 7200 RPM 18,2 Gb disk
  running UDMA33. It's the only IDE disk in the system; the other disks are
  all SCSI. The system is running 24/7. Other details were posted earlier. 
 
 There is this thing with the IBM's doing some headcleaning stuff once
 a day/week, but I've never seen any of my IBM's do that (I got plenty
 of them). I'll try to get more info on that from IBM...

One more thing: do you have SMART enabled in your BIOS? If so,
turn it off and see if that changes anything...

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Dave J. Boers

On Thu, Dec 16, 1999 at 09:36:42AM +0100, Soren Schmidt wrote:
 One more thing: do you have SMART enabled in your BIOS? If so,
 turn it off and see if that changes anything...

I don't recall having it enabled, but I will check to make sure as soon as
I get home from work (which is still some 10 hours away, sigh).

Regards,

Dave Boers. 

-- 
  God, root, what's the difference?
  [EMAIL PROTECTED]





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Soren Schmidt

It seems Devin Butterfield wrote:
 That's interesting...In my case it is quite easily reproduced (very
 predictable). All I have to do is reboot and then run sysinstall and
 when it probes the devices the disks time out. So far I have not been
 able to get this behavior at any other time.
 
 I should also note that it repeatedly tries "resetting devices...done."
 many times (the number of times varies).
 
 Soren, since the problem is reproducible in my case, can you think of
 anything else I can try to help shed some light on what might be causing
 these time-outs we are having?

Hmm, does the problem persist if you increase the timeout in ata-disk.c
to some overly large value, like 100 secs or so?
If so, something is causing the timeout function to be called
without a real timeout. This could be the problem; I just don't see how
that would be possible...

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Dave J. Boers

On Thu, Dec 16, 1999 at 03:29:30PM +0600, Max Khon wrote:
 hi, there!
 same here, dmesg output:
 
[snip]
 ata_command: timeout waiting for interrupt
 Mounting root from ufs:/dev/ad0s2a
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done
 ata0-master: ad_timeout: lost disk contact - resetting
 ata0: resetting devices .. done

Could you tell me the exact time at which these messages occurred?
Anywhere near 10:15 or 9:15 ? 

Regards,

Dave Boers. 

-- 

  God, root, what's the difference?
  [djb,bofh,coredump,root]@relativity.student.utwente.nl





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Max Khon

hi, there!

On Thu, 16 Dec 1999, Dave J. Boers wrote:

 Could you tell me the exact time at which these messages occurred?
 Anywhere near 10:15 or 9:15 ? 

nope. the time is unpredictable.
sometimes it can work more than a day without spilling out those messages

/fjoe






Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Richard Seaman, Jr.

On Thu, Dec 16, 1999 at 09:25:11AM +0100, Soren Schmidt wrote:
 It seems Dave J. Boers wrote:
  
  I am still having "disc contact lost messages" regularly too. I've been
  posting about them on several occasions some time ago.  I haven't been able
  to pin it down, however. IF they occur, they occur somewhere between 9:15
  and 9:20 a.m. OR p.m. But they don't always. This used to be 10:15, but
  that changed _some weeks after_ the change of daylight saving time. I can't
  seem to relate it to anything. It is unlikely that it's a power glitch,
  because the system has been displaying the problem with two different
  UPS's. 
  
  The machine is running current current's which are regularly updated. It's
  an ABIT BP6 and the disk causing problems is a WD 7200 RPM 18,2 Gb disk
  running UDMA33. It's the only IDE disk in the system; the other disks are
  all SCSI. The system is running 24/7. Other details were posted earlier. 
 
 There is this thing with the IBM's doing some headcleaning stuff once
 a day/week, but I've never seen any of my IBM's do that (I got plenty
 of them). I'll try to get more info on that from IBM...

I've been running the ata driver for about a week now.  Yesterday,
for the first time, I got the messages posted below, and now again
this morning.

Note the fallback to PIO. Also note that Dec 15 is exactly 1 week
from the first time I ran with the ATA drivers, though there have
been several reboots in the interim. The times are CST (-600).  The
machine's time is synched using ntp.

Dec 15 07:00:44 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 07:00:45 test /kernel: ata0: resetting devices .. done

[snip]

Dec 15 19:01:02 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:02 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:07 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:07 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:07 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:07 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:12 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:12 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:12 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:12 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:17 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:17 test /kernel: ata0-master: ad_timeout: trying fallback to PIO mode
Dec 15 19:01:17 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:17 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:17 test /kernel: ata0: resetting devices .. done
Dec 15 19:01:22 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 15 19:01:22 test /kernel: ata0: resetting devices .. done

[snip]

Dec 16 07:01:24 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 16 07:01:24 test /kernel: ata0: resetting devices .. done
Dec 16 07:01:29 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 16 07:01:29 test /kernel: ata0: resetting devices .. done
Dec 16 07:01:34 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 16 07:01:34 test /kernel: ata0: resetting devices .. done
Dec 16 07:01:39 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
Dec 16 07:01:39 test /kernel: ata0: resetting devices .. done


Setup:

Dec 11 11:31:02 test /kernel: ata-pci0: SiS 5591 ATA controller irq 14 at device 0.1 on pci0
Dec 11 11:31:02 test /kernel: ata-pci0: Busmastering DMA supported
Dec 11 11:31:02 test /kernel: ata0 at 0x01f0 irq 14 on ata-pci0
Dec 11 11:31:02 test /kernel: ata1 at 0x0170 irq 15 on ata-pci0

[snip]

Dec 11 11:31:02 test /kernel: ata-isa0: already registered as ata0
Dec 11 11:31:02 test /kernel: ata-isa1: already registered as ata1

[snip]

Dec 11 11:31:02 test /kernel: ad0: IBM-DJNA-371800/J78OA30K ATA-4 disk at ata0 as master
Dec 11 11:31:02 test /kernel: ad0: 17206MB (35239680 sectors), 34960 cyls, 16 heads, 63 S/T, 512 B/S
Dec 11 11:31:02 test /kernel: ad0: 16 secs/int, 32 depth queue, UDMA33
Dec 11 11:31:02 test /kernel: ad1: Maxtor 85120A8/AA8Z2726 ATA-3 disk at ata0 as slave
Dec 11 11:31:02 test /kernel: ad1: 4884MB (10003392 sectors), 9924 cyls, 16 heads, 63 S/T, 512 B/S
Dec 11 11:31:02 test /kernel: ad1: 16 secs/int, 1 depth queue, DMA
Dec 11 11:31:02 test /kernel: ad2: Maxtor 91152D8/WAS82739 ATA-4 disk at ata1 as master
Dec 11 11:31:02 test /kernel: ad2: 10991MB (22510656 sectors), 22332 cyls, 16 heads, 63 S/T, 512 B/S
Dec 11 11:31:02 test /kernel: ad2: 16 secs/int, 1 depth queue, UDMA33


-- 
Richard Seaman, Jr.   email: [EMAIL PROTECTED]
5182 N. Maple Lane    phone: 262-367-5450
Chenequa WI 53058     fax:   262-367-5852



Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Dave J. Boers

On Thu, Dec 16, 1999 at 07:10:46AM -0600, Richard Seaman, Jr. wrote:

 Dec 15 19:01:02 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
 Dec 15 19:01:02 test /kernel: ata0: resetting devices .. done
[snip]
 Dec 16 07:01:24 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
 Dec 16 07:01:24 test /kernel: ata0: resetting devices .. done

...and again there is almost precisely 12 hours in between...
That's the same as I find time and again.
I noticed that you are using IBM disks, while my disk is a WD. The only
common denominator seems to be the fact that we are both using -current
with ATA drivers and that we are both running UDMA33. 
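
For anyone who wants to check the spacing on their own logs, the gap between two syslog timestamps can be computed as below. This is just a sketch: the helper name is made up, and the year has to be supplied by hand since syslog lines do not carry one.

```python
from datetime import datetime


def syslog_gap(first, second, year=1999):
    """Time between two syslog-style timestamps such as 'Dec 15 19:01:02'."""
    fmt = "%Y %b %d %H:%M:%S"
    t1 = datetime.strptime("%d %s" % (year, first), fmt)
    t2 = datetime.strptime("%d %s" % (year, second), fmt)
    return t2 - t1


# First message of the evening burst vs. first message of the next morning's.
gap = syslog_gap("Dec 15 19:01:02", "Dec 16 07:01:24")
print(gap)  # 12:00:22
```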

Regards,

Dave Boers

-- 
  God, root, what's the difference?
  [EMAIL PROTECTED]





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Soren Schmidt

It seems Dave J. Boers wrote:
 On Thu, Dec 16, 1999 at 07:10:46AM -0600, Richard Seaman, Jr. wrote:
 
  Dec 15 19:01:02 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
  Dec 15 19:01:02 test /kernel: ata0: resetting devices .. done
 [snip]
  Dec 16 07:01:24 test /kernel: ata0-master: ad_timeout: lost disk contact - resetting
  Dec 16 07:01:24 test /kernel: ata0: resetting devices .. done
 
 ...and again there is almost precisely 12 hours in between...
 That's the same as I find time and again.
 I noticed that you are using IBM disks, while my disk is a WD. The only
 common denominator seems to be the fact that we are both using -current
 with ATA drivers and that we are both running UDMA33. 

Uhm, that won't be new WD drives, as they are exactly the same as
IBM drives give or take the label :)

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Dave J. Boers

On Thu, Dec 16, 1999 at 03:02:24PM +0100, Soren Schmidt wrote:
 Uhm, that won't be new WD drives, as they are exactly the same as
 IBM drives give or take the label :)

Huh? That I didn't know. So you're saying that WD and IBM 18 Gb disks are
the same hardware? 

My disk: 

ad0: WDC AC418000D/J78OA30K ATA-4 disk at ata0 as master
ad0: 17206MB (35239680 sectors), 34960 cyls, 16 heads, 63 S/T, 512 B/S
ad0: 16 secs/int, 32 depth queue, UDMA33

I would *love* to hear more about that. Can you point me to some info? 

Regards,

Dave Boers. 

-- 
  God, root, what's the difference?
  [EMAIL PROTECTED]





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Soren Schmidt

It seems Dave J. Boers wrote:
 On Thu, Dec 16, 1999 at 03:02:24PM +0100, Soren Schmidt wrote:
  Uhm, that won't be new WD drives, as they are exactly the same as
  IBM drives give or take the label :)
 
 Huh? That I didn't know. So you're saying that WD and IBM 18 Gb disks are
 the same hardware? 
 
 My disk: 
 
 ad0: WDC AC418000D/J78OA30K ATA-4 disk at ata0 as master
 ad0: 17206MB (35239680 sectors), 34960 cyls, 16 heads, 63 S/T, 512 B/S
 ad0: 16 secs/int, 32 depth queue, UDMA33
 
 I would *love* to hear more about that. Can you point me to some info? 

I read somewhere that IBM & WD have joined forces on their newer
disks; funny enough, WD disks now look exactly like IBM disks :)

I think this only applies to WD's expert series, but that I'm not
sure of. At least the 9G AC29100D I've got is physically identical
to the IBM drives I've got.

-Søren





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Richard Seaman, Jr.

On Thu, Dec 16, 1999 at 09:25:11AM +0100, Soren Schmidt wrote:

 There is this thing with the IBM's doing some headcleaning stuff once
 a day/week, but I've never seen any of my IBM's do that (I got plenty
 of them). I'll try to get more info on that from IBM...

Check http://www.storage.ibm.com/techsup/hddtech/prodspec/djna_spw.pdf

On page 99 it says:

10.12 Automatic Drive Maintenance (ADM)

ADM function is equipped to maintain the reliability even in continuous usage.
ADM function is to go into Standby mode automatically after detecting idle mode
at intervals of 6 days.

This function is always enabled regardless of Standby Timer value. The detail
of Standby Timer is described in 12.6, "Idle (E3h/97h)" on page 122, and 12.32,
"Standby (E2h/96h)" on page 171.

The 6 days counter is reset at the following.
Power on Ready
Entering Standby mode by Standby Command
Entering Standby mode by Standby Timer

Both Soft Reset and Hard Reset do not disturb the spin down of ADM.

If a command is received during spin down of ADM, the drive quits the spin down
and tries to complete the command as soon as possible.
In case the spin down of ADM is disturbed by a command, it is retried 12 hours
later. For timeout concern, refer to 13.0, "Timeout Values" on page 185.
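
[Editor's sketch] The timer rules quoted above can be modelled in a few lines. This is only a simplified reading of the quoted spec text, not drive firmware and not driver code: the class and method names (`AdmTimer`, `standby_entered`, `adm_disturbed`) are hypothetical, and the model ignores the "after detecting idle mode" condition. It shows why a host that always interrupts the spin-down would see an ADM attempt every 12 hours once the first 6 days have passed.

```python
HOUR = 3600
DAY = 24 * HOUR

ADM_INTERVAL = 6 * DAY    # "at intervals of 6 days"
ADM_RETRY = 12 * HOUR     # "it is retried 12 hours later"


class AdmTimer:
    """Toy model of the ADM counter described in the quoted spec text."""

    def __init__(self, power_on=0):
        # "Power on Ready" resets the 6-day counter.
        self.next_adm = power_on + ADM_INTERVAL

    def standby_entered(self, now):
        # Standby by Standby command or by Standby Timer resets the counter.
        self.next_adm = now + ADM_INTERVAL

    def adm_disturbed(self, now):
        # A command arrived during the ADM spin-down: retry 12 hours later.
        self.next_adm = now + ADM_RETRY


# A host that interrupts every ADM spin-down with a command.
timer = AdmTimer(power_on=0)
attempts = []
for _ in range(4):
    now = timer.next_adm
    attempts.append(now / DAY)   # days since power-on
    timer.adm_disturbed(now)

print(attempts)
```

Under these assumptions the attempts fall at 6, 6.5, 7 and 7.5 days after power-on, that is, every 12 hours for as long as each spin-down is interrupted.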


-- 
Richard Seaman, Jr.   email: [EMAIL PROTECTED]
5182 N. Maple Lane    phone: 262-367-5450
Chenequa WI 53058     fax:   262-367-5852





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Dave J. Boers

On Thu, Dec 16, 1999 at 04:50:55PM -0600, Richard Seaman, Jr. wrote:
 Check http://www.storage.ibm.com/techsup/hddtech/prodspec/djna_spw.pdf
[snip]
 If a command is received during spin down of ADM, the drive quits the spin down
 and tries to complete the command as soon as possible.
 In case the spin down of ADM is disturbed by a command, it is retried 12 hours
 later.

That sure sounds like my 12 hours. I guess this more or less solves the mystery.
There is still one thing which keeps me wondering, though. How exactly does
the ata driver react to the drive doing ADM? Whenever I hear it spinning
down, I immediately hear it spinning up again. Does this mean that the ATA
driver won't allow the drive to do _any_ ADM at all? Is that a bad thing? 

Regards,

Dave Boers

-- 
  God, root, what's the difference?
  [EMAIL PROTECTED]





Re: ATA driver problem?? (lost disk contact)

1999-12-16 Thread Soren Schmidt

It seems Richard Seaman, Jr. wrote:

Yup, sounds like the problem some are seeing, now I wonder why I
haven't seen it on any of the IBM disks I've access to, hmm...

It apparently can't be disabled, but I'll try to figure out if
I can detect when the drive is in this mode, or put it in
standby mode and back again when there is nothing else to do,
that should reset the timer...
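
[Editor's sketch] The two ideas above can be illustrated as follows. This is not the FreeBSD ata driver's code. The Standby (E2h) and Idle (E3h) opcodes come from the spec excerpt quoted below. The Check Power Mode opcode (E5h) and its Sector Count result values are taken from the ATA standard and should be checked against the drive's own spec. The helper names are hypothetical, and whether a Standby/Idle round trip really resets the ADM counter is only the expectation stated above, not something verified here.

```python
# ATA power-management opcodes used in this sketch.
ATA_STANDBY = 0xE2            # "Standby (E2h/96h)" in the quoted spec
ATA_IDLE = 0xE3               # "Idle (E3h/97h)" in the quoted spec
ATA_CHECK_POWER_MODE = 0xE5   # per the ATA standard (assumption for this drive)

# Value returned in the Sector Count register by Check Power Mode.
POWER_MODES = {
    0x00: "standby",
    0x80: "idle",
    0xFF: "active or idle",
}


def decode_power_mode(sector_count):
    """Translate the Check Power Mode result into a readable state."""
    return POWER_MODES.get(sector_count, "unknown (0x%02x)" % sector_count)


def adm_reset_sequence():
    """Hypothetical command sequence for an otherwise idle drive:
    enter Standby by command, which the spec lists as a counter reset,
    then return to Idle so the next request does not wait for spin-up."""
    return [ATA_STANDBY, ATA_IDLE]


print(decode_power_mode(0x00))
print(["%02Xh" % op for op in adm_reset_sequence()])
```

A driver could, for instance, issue Check Power Mode after a timeout and keep waiting only if the drive reports standby; the sketch covers just the decoding step.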

 On Thu, Dec 16, 1999 at 09:25:11AM +0100, Soren Schmidt wrote:
 
  There is this thing with the IBM's doing some headcleaning stuff once
  a day/week, but I've never seen any of my IBM's do that (I got plenty
  of them). I'll try to get more info on that from IBM...
 
 Check http://www.storage.ibm.com/techsup/hddtech/prodspec/djna_spw.pdf
 
 On page 99 it says:
 
 10.12 Automatic Drive Maintenance (ADM)
 
 ADM function is equipped to maintain the reliability even in continuous usage.
 ADM function is to go into Standby mode automatically after detecting idle mode
 at intervals of 6 days.
 
 This function is always enabled regardless of Standby Timer value. The detail
 of Standby Timer is described in 12.6, "Idle (E3h/97h)" on page 122, and 12.32,
 "Standby (E2h/96h)" on page 171.
 
 The 6 days counter is reset at the following.
 Power on Ready
 Entering Standby mode by Standby Command
 Entering Standby mode by Standby Timer
 
 Both Soft Reset and Hard Reset do not disturb the spin down of ADM.
 
 If a command is received during spin down of ADM, the drive quits the spin down
 and tries to complete the command as soon as possible.
 In case the spin down of ADM is disturbed by a command, it is retried 12 hours
 later. For timeout concern, refer to 13.0, "Timeout Values" on page 185.

-Søren





ATA driver problem?? (lost disk contact)

1999-12-15 Thread Devin Butterfield

Hi,

I just recently compiled a kernel with the new ATA driver and have
discovered a problem: if I run sysinstall, right when it says "probing
devices, please wait (this can be a while)" I get error messages saying...

Dec 15 21:20:05 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:05 dbm /kernel: ata0: resetting devices .. done
Dec 15 21:20:15 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:15 dbm /kernel: ata0: resetting devices .. done
Dec 15 21:20:25 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:25 dbm /kernel: ata0: resetting devices .. done
Dec 15 21:20:35 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:35 dbm /kernel: ata0-slave: ad_timeout: trying fallback to PIO mode
Dec 15 21:20:35 dbm /kernel: ata0: resetting devices .. done

and after printing these messages a number of times, sysinstall will
finally come up. If I quit sysinstall and then run it again, probing
goes well and there are no timeouts. The interesting thing is that I can
reproduce this problem by rebooting and running sysinstall. So, this
only happens when running sysinstall for the first time after a boot.
:-/
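
[Editor's sketch] The spacing of the timeouts in the log above can be checked with a short script. The log text is copied from this message with the mail-wrapped lines rejoined; the regular expression and the function name `timeout_events` are illustrative and not part of any FreeBSD tool.

```python
import re

LOG = """\
Dec 15 21:20:05 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:05 dbm /kernel: ata0: resetting devices .. done
Dec 15 21:20:15 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:15 dbm /kernel: ata0: resetting devices .. done
Dec 15 21:20:25 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:25 dbm /kernel: ata0: resetting devices .. done
Dec 15 21:20:35 dbm /kernel: ata0-slave: ad_timeout: lost disk contact - resetting
Dec 15 21:20:35 dbm /kernel: ata0-slave: ad_timeout: trying fallback to PIO mode
Dec 15 21:20:35 dbm /kernel: ata0: resetting devices .. done
"""

# Time of day, host name, then the device that reported the timeout.
EVENT = re.compile(
    r"(\d\d):(\d\d):(\d\d) \S+ /kernel: (\S+): ad_timeout: lost disk contact"
)


def timeout_events(log):
    """Return (device, seconds since midnight) for each lost-contact line."""
    events = []
    for match in EVENT.finditer(log):
        hours, minutes, seconds = (int(x) for x in match.group(1, 2, 3))
        events.append((match.group(4), hours * 3600 + minutes * 60 + seconds))
    return events


events = timeout_events(LOG)
gaps = [later[1] - earlier[1] for earlier, later in zip(events, events[1:])]
print(sorted({device for device, _ in events}), gaps)
```

All four timeouts are reported by ata0-slave and are 10 seconds apart, which matches the 10-second timeout mentioned in this message; that suggests, without proving it, that each request simply ran into the full timeout.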

I've read through all the previous messages regarding these timeout
problems and have even increased the timeout in ata-disk.c to 10 secs
but no luck.

Anybody have any ideas?? Below is the usual info...
--
Regards, Devin.

Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 4.0-19991214-CURRENT #1: Wed Dec 15 21:05:38 PST 1999
[EMAIL PROTECTED]:/usr/src/sys/compile/DBM
Timecounter "i8254"  frequency 1193182 Hz
CPU: AMD-K6(tm) 3D processor (501.14-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x58c  Stepping = 12
  Features=0x8021bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,PGE,MMX>
  AMD Features=0x8800<SYSCALL,3DNow!>
real memory  = 134152192 (131008K bytes)
avail memory = 126500864 (123536K bytes)
Preloaded elf kernel "kernel" at 0xc0378000.
md0: Malloc disk
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: AcerLabs M1541 (Aladdin-V) PCI host bridge on motherboard
pci0: PCI bus on pcib0
pcib1: AcerLabs M5243 PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
isab0: AcerLabs M1533 portable PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
ata-pci0: AcerLabs Aladdin ATA controller at device 15.0 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0
dc0: 82c169 PNIC 10/100BaseTX irq 10 at device 16.0 on pci0
dc0: Ethernet address: 00:a0:cc:27:48:ec
miibus0: MII bus on dc0
ukphy0: Generic IEEE 802.3u media interface on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
[snip]
ad0: Maxtor 90871U2/FA570480 ATA-5 disk at ata0 as master
ad0: 8297MB (16992864 sectors), 16858 cyls, 16 heads, 63 S/T, 512 B/S
ad0: 16 secs/int, 1 depth queue, UDMA33
ad1: WDC AC26400B/32.02S32 ATA-4 disk at ata0 as slave 
ad1: 6149MB (12594960 sectors), 13328 cyls, 15 heads, 63 S/T, 512 B/S
ad1: 16 secs/int, 1 depth queue, UDMA33
acd0: HITACHI CDR-7930/1022 CDROM drive at ata1 as master
acd0: read 1377KB/s (1377KB/s), 128KB buffer, PIO
acd0: supported read types: CD-DA
acd0: Audio: play, 255 volume levels
acd0: Mechanism: ejectable tray
acd0: Medium: no/blank disc inside, unlocked
Waiting 15 seconds for SCSI devices to settle
Mounting root from ufs:/dev/ad0s2a
cd0 at adv0 bus 0 target 3 lun 0
cd0: YAMAHA CRW4260 1.0h Removable CD-ROM SCSI-2 device 
cd0: 3.300MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
- tray closed
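
[Editor's sketch] The sector counts and sizes in the ad0/ad1 probe lines above follow directly from the reported geometry. The function name `capacity` is illustrative; the MB figure is assumed to be bytes divided by 2^20 and rounded down, which reproduces the printed values.

```python
SECTOR_BYTES = 512  # "512 B/S" in the probe lines


def capacity(cyls, heads, sectors_per_track):
    """Return (total sectors, size in MB) for a CHS geometry."""
    sectors = cyls * heads * sectors_per_track
    megabytes = sectors * SECTOR_BYTES // (1024 * 1024)
    return sectors, megabytes


# ad0: 8297MB (16992864 sectors), 16858 cyls, 16 heads, 63 S/T
print(capacity(16858, 16, 63))
# ad1: 6149MB (12594960 sectors), 13328 cyls, 15 heads, 63 S/T
print(capacity(13328, 15, 63))
```

Both results agree with the values the driver printed, so the geometry and the capacity lines are consistent for both disks.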





Re: ATA driver problem?? (lost disk contact)

1999-12-15 Thread Soren Schmidt

It seems Devin Butterfield wrote:
 Hi,
 
 I just recently compiled a kernel with the new ATA driver and have
 discovered a problem: if I run sysinstall, right when it says "probing
 devices, please wait (this can be a while)" I get error messages saying...
[snip]
 and after printing these messages a number of times, sysinstall will
 finally come up. If I quit sysinstall and then run it again, probing
 goes well and there are no timeouts. The interesting thing is that I can
 reproduce this problem by rebooting and running sysinstall. So, this
 only happens when running sysinstall for the first time after a boot.
 :-/
 
 I've read through all the previous messages regarding these timeout
 problems and have even increased the timeout in ata-disk.c to 10 secs
 but no luck.

Hmm, I'd put my disks on different channels, but that's just for
performance's sake. I'm currently trying every weird setup I can
imagine with the HW I have for testing, but I haven't been able
to get any of my test setups to exhibit this behavior...

But I'm working on it...

-Soren

