Re: Strange SCSI related system hang

2000-01-10 Thread Justin T. Gibbs

In article [EMAIL PROTECTED] you wrote:
 Hi all, 
 
 This morning I had a very strange (at least I've never seen it before) SCSI
 related system hang. The system simply stopped responding at 9:30:03 am
 this morning. I found it in this state at 13:20. It had been hanging
 _hard_. No response to console, serial terminal or network. After a hard
 reset the system came back online normally and is working normally again. 
 
 Note that the machine had an uptime of 4 days, 14 hours before the problem
 occurred and it never happened before. 
 
 Could this be a hardware problem? 

Perhaps.  Is your WD drive getting hot?  The ahc driver believes that,
during a message out phase, the target simply dropped off the bus.
It may be that the ahc driver did something to provoke that, but without
a bus analyzer on the drive, it is hard to know.  According to the
program counter, we are waiting for the target to request the next
byte at the time this occurs, but that request never comes.
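
For readers following along, the LASTPHASE value in the log can be decoded by hand. The sketch below is only an illustration: the bit positions (CDI, IOI, MSGI) are an assumption modelled on the aic7xxx register layout, and decode_lastphase is a hypothetical helper, not part of the driver. The mapping of signal combinations to phases follows the SCSI bus phase definitions, and it agrees with the reading above that 0xa0 is a message out phase.

```python
# Assumed bit layout of the latched SCSI phase signals (modelled on the
# aic7xxx register definitions; treat these values as an assumption).
CDI = 0x80   # C/D signal: control (command/status/message) vs. data
IOI = 0x40   # I/O signal: set when the target drives data toward the initiator
MSGI = 0x20  # MSG signal: set during message phases
PHASE_MASK = CDI | IOI | MSGI

# SCSI bus phases expressed as combinations of the three signals.
PHASES = {
    0x00: "data out",
    IOI: "data in",
    CDI: "command",
    CDI | MSGI: "message out",
    CDI | IOI: "status",
    CDI | IOI | MSGI: "message in",
}


def decode_lastphase(value: int) -> str:
    """Return a readable phase name for a LASTPHASE value (hypothetical helper)."""
    return PHASES.get(value & PHASE_MASK, "unknown")


print(decode_lastphase(0xA0))
```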

--
Justin


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Strange SCSI related system hang

2000-01-09 Thread Dave J. Boers

Hi all, 

This morning I had a very strange (at least I've never seen it before) SCSI
related system hang. The system simply stopped responding at 9:30:03 am
this morning. I found it in this state at 13:20. It had been hanging
_hard_. No response to console, serial terminal or network. After a hard
reset the system came back online normally and is working normally again. 

Note that the machine had an uptime of 4 days, 14 hours before the problem
occurred and it never happened before. 

Could this be a hardware problem? 

 uname -a: 

FreeBSD relativity.student.utwente.nl 4.0-CURRENT FreeBSD 4.0-CURRENT #0:
Thu Dec 30 21:42:21 CET 1999
[EMAIL PROTECTED]:/usr/src/sys/compile/RELATIVITY3  i386

 Here are the relevant system log messages: 

Note that all these messages occur at xx:30:00 or xx:30:01. That's probably
related to a cron job which runs every half hour and copies some files
(about 20 MB) from my IDE disk to my SCSI disk. When the system is idle,
there's usually no other SCSI activity. 

Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 04:30:01 relativity /kernel: ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET

Jan  9 06:30:00 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 06:30:00 relativity /kernel: SEQADDR == 0x151

Jan  9 07:30:00 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 07:30:00 relativity /kernel: SEQADDR == 0x151
Jan  9 07:30:00 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 07:30:00 relativity /kernel: SEQADDR == 0x151
Jan  9 07:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 07:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 07:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 07:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 07:30:01 relativity /kernel: ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
Jan  9 07:30:01 relativity /kernel: SAVED_TCL == 0x0, ARG_1 == 0x9, SEQ_FLAGS == 0x0
Jan  9 07:30:01 relativity /kernel: ahc0: Bus Device Reset on A:0. 1 SCBs aborted

Jan  9 08:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 08:30:01 relativity /kernel: SEQADDR == 0x151
Jan  9 08:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 08:30:01 relativity /kernel: SEQADDR == 0x151

Jan  9 09:30:03 relativity /kernel: (da0:ahc0:0:0:0): Invalidating pack
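
The claim that every message lands on the half hour can be checked mechanically. The following is a minimal sketch, not a tool used in this thread: it takes a few lines copied from the log above and reports how many seconds past the nearest half-hour mark each one was logged. The helper names (seconds_past_half_hour, offsets) are made up for illustration.

```python
import re
from collections import Counter

# A few lines copied from the log above.
LOG = """\
Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 06:30:00 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 07:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
Jan  9 09:30:03 relativity /kernel: (da0:ahc0:0:0:0): Invalidating pack
"""

# Syslog prefix: month, day, then HH:MM:SS.
STAMP = re.compile(r"^\w{3}\s+\d+\s+(\d{2}):(\d{2}):(\d{2})\s")


def seconds_past_half_hour(line):
    """Seconds elapsed since the last xx:00 or xx:30 mark, or None if no timestamp."""
    match = STAMP.match(line)
    if match is None:
        return None
    _hour, minute, second = (int(group) for group in match.groups())
    return (minute % 30) * 60 + second


def offsets(log_text):
    """Offsets for every timestamped line in the log text."""
    values = (seconds_past_half_hour(line) for line in log_text.splitlines())
    return [value for value in values if value is not None]


# Small offsets for every line support the link to the half-hourly cron job.
print(Counter(offsets(LOG)))
```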

 Here's my complete dmesg output: 

Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #0: Thu Dec 30 21:42:21 CET 1999
[EMAIL PROTECTED]:/usr/src/sys/compile/RELATIVITY3
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Celeron (450.00-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x665  Stepping = 5
  Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory  = 134217728 (131072K bytes)
avail memory = 127238144 (124256K bytes)
Programming 24 pins in IOAPIC #0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec0
Preloaded elf kernel "kernel" at 0xc02d2000.
Preloaded elf module "vesa.ko" at 0xc02d209c.
VESA: v3.0, 7936k memory, flags:0x1, mode table:0xc02cf102 (122)
VESA: NVidia
Pentium Pro MTRR support enabled
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Intel 82443BX (440 BX) host to PCI bridge on motherboard
pci0: PCI bus on pcib0
pcib1: Intel 82443BX (440 BX) PCI-PCI (AGP) bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
vga-pci0: NVidia Riva TNT graphics accelerator irq 16 at device 0.0 on pci1
isab0: Intel 82371AB PCI to ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
ata-pci0: Intel PIIX4 ATA controller at device 7.1 on pci0
ata-pci0: Busmastering DMA supported
pci0: Intel 82371AB/EB (PIIX4) USB controller (vendor=0x8086, dev=0x7112) at 7.2
intpm0: Intel 82371AB Power management controller at device 7.3 on pci0
intpm0: I/O mapped 5000
intpm0: intr IRQ 9 enabled revision 0
smbus0: System Management Bus on intsmb0
smb0: SMBus general purpose I/O on smbus0
intpm0: PM I/O mapped 4000 
ed0: NE2000 PCI Ethernet (RealTek 8029) irq 19 at 

Re: Strange SCSI related system hang

2000-01-09 Thread Matthew Dillon

:Hi all, 
:
:This morning I had a very strange (at least I've never seen it before) SCSI
:related system hang. The system simply stopped responding at 9:30:03 am
:this morning. I found it in this state at 13:20. It had been hanging
:_hard_. No response to console, serial terminal or network. After a hard
:reset the system came back online normally and is working normally again. 
:
:Note that the machine had an uptime of 4 days, 14 hours before the problem
:occurred and it never happened before. 
:
:Could this be a hardware problem? 

Yes.  The problem is probably termination.  You have three devices
on your SCSI bus -- your hard drive should be the terminating device
and should have termination enabled.  Neither of the other two devices
should have termination enabled.  Alternatively, none of the devices
should have termination enabled and you should have an external active
terminator (active terminators have LEDs).

Never use a CD-ROM or ZIP drive to terminate a SCSI bus; they're generally
too cheap to do it right.

I am presuming that these are all internally mounted.  If you have
any externally mounted SCSI devices or a combination of the two then both
ends must be terminated.
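
The rules above for an internal-only chain can be written down as a small check. This is a sketch under stated assumptions: the Device type, the termination_problems helper and above all the cable positions are hypothetical, since the thread does not say in which order the three devices sit on the cable. It assumes the host adapter terminates the near end of the bus.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    position: int     # order along the cable, 0 = nearest the host adapter (assumed)
    terminated: bool  # True if the device's own termination is enabled


def termination_problems(devices, external_terminator=False):
    """List rule violations for an internal-only chain (hypothetical helper)."""
    if not devices:
        return []
    ordered = sorted(devices, key=lambda dev: dev.position)
    last = ordered[-1]
    problems = []
    # No device in the middle of the chain may have termination enabled.
    for dev in ordered[:-1]:
        if dev.terminated:
            problems.append(f"{dev.name}: termination enabled mid-chain")
    # The far end needs exactly one terminator: the last device or an external one.
    if external_terminator:
        if last.terminated:
            problems.append(f"{last.name}: terminated although an external terminator is fitted")
    elif not last.terminated:
        problems.append(f"{last.name}: end of chain is not terminated")
    return problems


# Hypothetical layout: the hard drive sits last on the cable and terminates it.
chain = [
    Device("cd0", 0, False),
    Device("da1", 1, False),
    Device("da0", 2, True),
]
print(termination_problems(chain))
```

The check does not cover the advice against terminating with a CD-ROM or ZIP drive; that would need a device-type field.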

-Matt

:...
:Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
:Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
:Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
:Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
:Jan  9 04:30:01 relativity /kernel: Unexpected busfree.  LASTPHASE == 0xa0
:...
:Jan  9 04:30:01 relativity /kernel: SEQADDR == 0x151
:Jan  9 04:30:01 relativity /kernel: ahc0:A:0: no active SCB for reconnecting target - 
:issuing BUS DEVICE RESET
:
:...
:cd0 at ahc0 bus 0 target 1 lun 0
:cd0: PLEXTOR CD-R   PX-R412C 1.05 Removable CD-ROM SCSI-2 device 
:cd0: 10.000MB/s transfers (10.000MHz, offset 8)
:cd0: Attempt to query device size failed: NOT READY, Medium not present
:WARNING: / was not properly dismounted
:da0 at ahc0 bus 0 target 0 lun 0
:da0: WDIGTL ENTERPRISE 1.70 Fixed Direct Access SCSI-2 device 
:da0: 20.000MB/s transfers (20.000MHz, offset 15)
:da0: 4157MB (8515173 512 byte sectors: 255H 63S/T 530C)
:da1 at ahc0 bus 0 target 5 lun 0
:da1: IOMEGA ZIP 100 J.02 Removable Direct Access SCSI-2 device 
:da1: 3.300MB/s transfers
:da1: 96MB (196608 512 byte sectors: 64H 32S/T 96C)



