Re: ATA weird message?

1999-12-13 Thread Soren Schmidt

It seems Mike Smith wrote:
> > It seems Munehiro Matsuda wrote:
> > > Hi all,
> > >
> > > I am using -current as of December 9 (CTM:src-cur.4130.gz), and
> > > got following weird ATA related messages while 'make -j4 buildworld'.
> > > I never had this kind of message when using wd drivers.
> > >
> > > ata0-master: ad_timeout: lost disk contact - resetting
> > > ata0: resetting devices .. done
> >
> > Hmm, maybe the timeout in ata-disk.c is too short, try increasing
> > the 5*hz to say 10*hz in line 436, and see if that changes anything..
>
> Ugh.  Are we still using these short timeouts for I/O transaction
> completion?  Much pain some time back established that we need at least
> 30 seconds for some drives doing internal error recovery, and as long as
> the drive is returning something sensible (ie. still busy), we should
> give it at least that long.

Not really. A request to a drive that said it is ready _should_
complete in much less than 5 secs, but a reset of a hung drive
should be waited on for up to 31 secs, which we already do in
ata_reset()...

I'm afraid that the spurious timeouts that some are seeing are an
artifact of another problem...
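
As a rough illustration of the distinction drawn above (a short limit
for an ordinary request, a much longer one after a reset), here is a
minimal user-space sketch. It is not the ata-disk.c or ata_reset() code:
all names are hypothetical, the drive is simulated, and only the 5 and
31 second figures come from this thread.

```c
/* Hypothetical names; the numbers are the ones quoted in this thread. */
#define REQUEST_TIMEOUT_SECS 5   /* per-request completion timeout (5*hz) */
#define RESET_WAIT_SECS      31  /* how long to wait for a drive after reset */

enum wait_kind { WAIT_REQUEST, WAIT_RESET };

/* Longest wait, in seconds, allowed for each kind of operation. */
int wait_limit_secs(enum wait_kind kind)
{
    return kind == WAIT_RESET ? RESET_WAIT_SECS : REQUEST_TIMEOUT_SECS;
}

/*
 * Simulated poll, one step per second: the drive becomes ready after
 * ready_after seconds. Returns 1 if that happens within the limit for
 * this kind of wait, 0 if the wait times out first.
 */
int wait_ready(enum wait_kind kind, int ready_after)
{
    int limit = wait_limit_secs(kind);

    for (int elapsed = 0; elapsed <= limit; elapsed++) {
        if (elapsed >= ready_after)
            return 1;
    }
    return 0;
}
```

For example, a simulated drive that needs 10 seconds times out when
treated as an ordinary request, but is still waited for when the same
delay follows a reset.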

-Søren


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: ATA weird message?

1999-12-13 Thread Munehiro Matsuda

From: Soren Schmidt [EMAIL PROTECTED]
Date: Mon, 13 Dec 1999 08:33:43 +0100 (CET)
::It seems Munehiro Matsuda wrote:
:: Hi all,
:: 
:: I am using -current as of December 9 (CTM:src-cur.4130.gz), and
:: got following weird ATA related messages while 'make -j4 buildworld'.
:: I never had this kind of message when using wd drivers.
:: 
:: ata0-master: ad_timeout: lost disk contact - resetting
:: ata0: resetting devices .. done
::
::Hmm, maybe the timeout in ata-disk.c is too short, try increasing
::the 5*hz to say 10*hz in line 436, and see if that changes anything..
::
::-Søren
::

I tried that, but it didn't seem to help much.
I ran 'make -j4 buildworld' twice, just in case, and got the same
warning on the first round, but not on the second.

One thing I noticed is that the first warning occurs on an idle disk,
not the one running 'buildworld'!

  FYI, my system configuration:
   ata0 -- master: ad0s1 (Win95), ad0s2* (-STABLE)   slave: none
   ata1 -- master: ad2s1 (DOS),   ad2s2* (-CURRENT)  slave: none

I was running 'buildworld' on -CURRENT (ad2s2); at that time the -STABLE
and Win95 partitions were mounted, but basically idle.
IIRC, buildworld was working on lib/libc when the warning occurred.

So, it may be that the system was too busy talking to the running disk
with loads of small-file transfers, and timed itself out on the other?

  Thank you,
   Haro

=--
   _ _Munehiro (haro) Matsuda
 -|- /_\  |_|_|   Office of Business Planning  Developement, Kubota Corp.
 /|\ |_|  |_|_|   1-3 Nihonbashi-Muromachi 3-Chome
  Chuo-ku Tokyo 103, Japan
  Tel: +81-3-3245-3318  Fax: +81-3-32454-3315
  Email: [EMAIL PROTECTED]





Re: ATA weird message?

1999-12-12 Thread Soren Schmidt

It seems Munehiro Matsuda wrote:
> Hi all,
>
> I am using -current as of December 9 (CTM:src-cur.4130.gz), and
> got following weird ATA related messages while 'make -j4 buildworld'.
> I never had this kind of message when using wd drivers.
>
>   ata0-master: ad_timeout: lost disk contact - resetting
>   ata0: resetting devices .. done

Hmm, maybe the timeout in ata-disk.c is too short, try increasing
the 5*hz to say 10*hz in line 436, and see if that changes anything..
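
To make the suggested change concrete: a timeout written as N*hz is a
count of clock ticks, where hz is the kernel's ticks-per-second rate.
The sketch below only shows that arithmetic. The helper names are
hypothetical, and this is not the code at line 436 of ata-disk.c.

```c
/* Hypothetical helpers; hz is the kernel tick rate (ticks per second). */

/* Timeout in ticks for a given number of seconds, i.e. the "N*hz" form. */
int timeout_ticks(int seconds, int hz)
{
    return seconds * hz;
}

/* Convert a tick count back to milliseconds for a given hz. */
int ticks_to_ms(int ticks, int hz)
{
    return ticks * 1000 / hz;
}
```

Assuming hz = 100 (an assumed value, not stated in this thread),
timeout_ticks(5, 100) is 500 ticks and timeout_ticks(10, 100) is 1000
ticks, so the change doubles the wait from 5000 ms to 10000 ms.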

-Søren





Re: ATA weird message?

1999-12-12 Thread Mike Smith

> It seems Munehiro Matsuda wrote:
> > Hi all,
> >
> > I am using -current as of December 9 (CTM:src-cur.4130.gz), and
> > got following weird ATA related messages while 'make -j4 buildworld'.
> > I never had this kind of message when using wd drivers.
> >
> > ata0-master: ad_timeout: lost disk contact - resetting
> > ata0: resetting devices .. done
>
> Hmm, maybe the timeout in ata-disk.c is too short, try increasing
> the 5*hz to say 10*hz in line 436, and see if that changes anything..

Ugh.  Are we still using these short timeouts for I/O transaction 
completion?  Much pain some time back established that we need at least 
30 seconds for some drives doing internal error recovery, and as long as 
the drive is returning something sensible (ie. still busy), we should 
give it at least that long.
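
The policy described above could be sketched roughly as follows: keep
waiting while the drive still reports busy, up to a 30 second ceiling.
This is a user-space simulation under stated assumptions, not driver
code. The status values are supplied as an array sampled once per
second, and wait_while_busy and MAX_BUSY_SECS are hypothetical names.
The only hardware detail relied on is that bit 7 (0x80) of the ATA
status register is BSY.

```c
#include <stddef.h>

#define ATA_STATUS_BSY 0x80  /* BSY bit of the ATA status register */
#define MAX_BUSY_SECS  30    /* hypothetical ceiling, from the 30 s figure */

/*
 * samples[i] is the status byte read at second i (simulated input).
 * Returns the second at which BSY was first seen clear, or -1 if the
 * drive was still busy through second MAX_BUSY_SECS or the samples
 * ran out before it became ready.
 */
int wait_while_busy(const unsigned char *samples, size_t n)
{
    for (size_t i = 0; i < n && i <= MAX_BUSY_SECS; i++) {
        if ((samples[i] & ATA_STATUS_BSY) == 0)
            return (int)i;
    }
    return -1;
}
```

A drive that stays busy for two samples and then clears BSY is reported
ready at second 2, while one that never clears BSY is given up on once
the ceiling is reached.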


-- 
\\ Give a man a fish, and you feed him for a day. \\  Mike Smith
\\ Tell him he should learn how to fish himself,  \\  [EMAIL PROTECTED]
\\ and he'll hate you for a lifetime. \\  [EMAIL PROTECTED]







ATA weird message?

1999-12-10 Thread Munehiro Matsuda

Hi all,

I am using -current as of December 9 (CTM:src-cur.4130.gz), and
got the following weird ATA-related messages while running
'make -j4 buildworld'.
I never had this kind of message when using the wd drivers.

ata0-master: ad_timeout: lost disk contact - resetting
ata0: resetting devices .. done
ata1-master: ad_timeout: lost disk contact - resetting
ata1: resetting devices .. done
ata0-master: ad_timeout: lost disk contact - resetting
ata0: resetting devices .. done
ata1-master: ad_timeout: lost disk contact - resetting
ata1: resetting devices .. done
ata0-master: ad_timeout: lost disk contact - resetting
ata0: resetting devices .. done
ata1-master: ad_timeout: lost disk contact - resetting
ata1: resetting devices .. done
ata0-master: ad_timeout: lost disk contact - resetting
ata0-master: ad_timeout: trying fallback to PIO mode
ata0: resetting devices .. done
ata1-master: ad_timeout: lost disk contact - resetting
ata1-master: ad_timeout: trying fallback to PIO mode
ata1: resetting devices .. done

What's happening here?

System configuration follows:
--8888888--

FreeBSD 4.0-CURRENT #0: Thu Dec  9 22:17:58 JST 1999
[EMAIL PROTECTED]:/usr/src/sys/compile/JKPC15
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Xeon/Celeron (266.62-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x650  Stepping = 0
  
  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory  = 67043328 (65472K bytes)
avail memory = 61542400 (60100K bytes)
Preloaded elf kernel "kernel" at 0xc0346000.
Pentium Pro MTRR support enabled

--8888888--

ata-pci0: Intel PIIX4 ATA controller at device 7.1 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
ata1 at 0x0170 irq 15 on ata-pci0

ad0: TOSHIBA MK1011GAV/H0.07 C ATA-4 disk at ata0 as master
ad0: 9590MB (19640880 sectors), 19485 cyls, 16 heads, 63 S/T, 512 B/S
ad0: 16 secs/int, 1 depth queue, UDMA33
ad2: HITACHI_DK227A-50/00L0A0A3 ATA-3 disk at ata1 as master
ad2: 4789MB (9809100 sectors), 10380 cyls, 15 heads, 63 S/T, 512 B/S
ad2: 16 secs/int, 1 depth queue, UDMA33

--8888888--

# ATA and ATAPI devices
controller  ata0
controller  ata1
#controller ata2
device      atadisk0    # ATA disk drives
device      atapicd0    # ATAPI CDROM drives
device      atapifd0    # ATAPI floppy drives
#device     atapist0    # ATAPI tape drives
options     ATA_STATIC_ID           #Static device numbering
#options    ATA_ENABLE_ATAPI_DMA    #Enable DMA on ATAPI devices

---
  _ _Munehiro (haro) Matsuda
-|- /_\  |_|_|   Office of Business Planning  Developement, Kubota Corp.
/|\ |_|  |_|_|   1-3 Nihonbashi-Muromachi 3-Chome
 Chuo-ku Tokyo 103, Japan
 Tel: +81-3-3245-3318  Fax: +81-3-32454-3315
 Email: [EMAIL PROTECTED]





Re: ATA weird message?

1999-12-10 Thread patrick

On 11 Dec, Munehiro Matsuda wrote:
> Hi all,
>
> I am using -current as of December 9 (CTM:src-cur.4130.gz), and
> got following weird ATA related messages while 'make -j4 buildworld'.
> I never had this kind of message when using wd drivers.
>
>   ata0-master: ad_timeout: lost disk contact - resetting
>   ata0: resetting devices .. done
>   ata1-master: ad_timeout: lost disk contact - resetting
>   ata1: resetting devices .. done

I too am seeing this.  I decided tonight to go from -STABLE (3.4-RC) to
-CURRENT.  After I built a 4.0-CURRENT kernel and rebooted into single
user mode, I started to "make -j10 buildworld" with /usr mounted async. 
In about 10 minutes, I got a similar error to the above error:

ata1-master: ad_timeout: lost disk contact - resetting
ata1: resetting devices .. done
ad_transfer: timeout waiting for DRQ

At that point, my system freezes up.  I have to reboot.

I thought it was the "-j10", so when I rebooted back into -CURRENT
kernel single user mode, I did just a "make buildworld".  After some
time, it again gave me the same error.  

I booted into -STABLE and verified that everything was working and it
was.  So I recompiled the -CURRENT kernel without SOFTUPDATES (and all
other "extras"), just in case.  It went further this time in the build,
but still froze with the above error.  

Summary:
The error occurred no matter if I was using softupdates, async mount,
or normal mount.  I cannot build -CURRENT.  In -STABLE, I am fine, and
have been running -STABLE for over a year now on this machine.

Information on System:
Dual Pentium 200 on  ASUS P/E-P55T2P4D motherboard with 128Megs EDO RAM.

dmesg from -STABLE:
FreeBSD 3.4-RC #1: Thu Dec  9 19:45:30 EST 1999
[EMAIL PROTECTED]:/usr/src/sys/compile/PATRICK
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium/P54C (586-class CPU)
  Origin = "GenuineIntel"  Id = 0x52c  Stepping = 12
  Features=0x3bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,APIC>
real memory  = 134217728 (131072K bytes)
avail memory = 128000000 (125000K bytes)
Programming 16 pins in IOAPIC #0
EISA INTCONTROL = 0c00
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00030010, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00030010, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x000f0011, at 0xfec00000
Preloaded elf kernel "kernel.stable" at 0xc025e000.
eisa0: ASU5201 (System Board)
Probing for devices on the EISA bus
Probing for devices on PCI bus 0:
chip0: Intel 82439HX PCI cache memory controller rev 0x01 on pci0.0.0
chip1: Intel 82375EB PCI-EISA bridge rev 0x05 on pci0.7.0
vga0: ATI model 4750 graphics accelerator rev 0x5c int a irq 10 on pci0.10.0
fxp0: Intel EtherExpress Pro 10/100B Ethernet rev 0x08 int a irq 11 on pci0.11.0
fxp0: Ethernet address 00:90:27:cb:0f:32
ide_pci0: PCI IDE controller (busmaster capable) rev 0x01 int a irq 14 on pci0.13.0
Probing for PnP devices:
CSN 1 Vendor ID: ALS0120 [0x20019305] Serial 0x Comp ID: @@@ [0x0000]
Probing for devices on the ISA bus:
sc0 on isa
sc0: VGA color 16 virtual consoles, flags=0x0
ed0 at 0x280-0x29f irq 10 on isa
ed0: address 00:e0:29:17:73:26, type NE2000 (16 bit)
atkbdc0 at 0x60-0x6f on motherboard
atkbd0 irq 1 on isa
psm0 not found
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A   
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 at 0x1f0-0x1f7 irq 14 flags 0x90ff on isa
wdc0: unit 0 (wd0): ST34321A, LBA, 32-bit, multi-block-32
wd0: 4103MB (8404830 sectors), 523 cyls, 255 heads, 63 S/T, 512 B/S
wdc1 at 0x170-0x177 irq 15 flags 0x90ff90ff on isa
wdc1: unit 0 (wd2): ST34321A, LBA, 32-bit, multi-block-32
wd2: 4103MB (8404830 sectors), 523 cyls, 255 heads, 63 S/T, 512 B/S
wdc1: unit 1 (atapi): CD-524EA/1.0A, removable, accel, ovlap, dma, iordis
acd0: drive speed 4134KB/sec, 128KB cache
acd0: supported read types: CD-DA
acd0: Audio: play, 16 volume levels
acd0: Mechanism: ejectable tray
acd0: Medium: CD-ROM 120mm data disc loaded, unlocked
ppc0 at 0x378 irq 7 flags 0x40 on isa
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/15 bytes threshold
lpt0: generic printer on ppbus 0
lpt0: Interrupt-driven port
ppi0: generic parallel i/o on ppbus 0
plip0: PLIP network interface on ppbus 0
vga0 at 0x3b0-0x3df maddr 0xa0000 msize 131072 on isa
npx0 on motherboard
npx0: INT 16 interface
joy0 at 0x201 on isa
joy0: joystick
Intel Pentium detected, installing workaround for F00F bug
APIC_IO: routing 8254 via pin 2
changing root device to wd2s1a
SMP: AP CPU #1 Launched!

Some probes from -CURRENT:
mainboard0:ASU5201 (System Board) on eisa 0 slot 0
ata-pci0: CMD 646 ATA controller (generic mode) irq 14 at device 13.0 on pci0
ata-pci0: Busmastering DMA Supported

If someone needs more information or for me to do any testing, please
let me know.  I'm not a developer, but I'm