So we're sure it's not the hardware.
Now something strange: almost the same thing happened some days ago on my
workstation.
I had two SCSI disks attached to it. Then something strange happened
(the disk seemed to be very busy) and I couldn't get any information from
it. After a reboot I found out that the partition table was damaged.
My workstation is an x86-based system with an Adaptec 2940 SCSI
controller. The failing disk was a 20G Seagate with SCSI ID 12 (the
other disk has SCSI ID 10).
I haven't checked the drive yet, but I suspect it is OK (smartctl also
told me so).
Could it be the same problem?
Are all disks attached to the same SCSI channel?
Which kernels are those systems using? Which filesystems?
(I'm using 2.4.19 with ext3)
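To answer the channel question quickly, something like the sketch below could group the SCSI IDs by host and channel. It assumes the usual "Host: ... Channel: ... Id: ... Lun: ..." layout of /proc/scsi/scsi. The function name and the sample text are made up for illustration and are not output from any of the machines in this thread.

```python
import re

# Matches the "Host:" line that starts each device entry in /proc/scsi/scsi.
HOST_LINE = re.compile(
    r"Host:\s*(?P<host>\S+)\s+Channel:\s*(?P<channel>\d+)\s+"
    r"Id:\s*(?P<id>\d+)\s+Lun:\s*(?P<lun>\d+)"
)


def devices_by_channel(text):
    """Group SCSI IDs by (host, channel) from /proc/scsi/scsi-style text."""
    groups = {}
    for line in text.splitlines():
        match = HOST_LINE.search(line)
        if match:
            key = (match.group("host"), int(match.group("channel")))
            groups.setdefault(key, []).append(int(match.group("id")))
    return groups


# Made-up sample for illustration only; not taken from a real machine.
SAMPLE = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 10 Lun: 00
  Vendor: EXAMPLE  Model: DISK-A           Rev: 0001
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 12 Lun: 00
  Vendor: EXAMPLE  Model: DISK-B           Rev: 0001
  Type:   Direct-Access                    ANSI SCSI revision: 02
"""

if __name__ == "__main__":
    try:
        with open("/proc/scsi/scsi") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample above
    for (host, channel), ids in sorted(devices_by_channel(text).items()):
        print(f"{host} channel {channel}: SCSI IDs {ids}")
```

If every disk shows up under the same host and channel on both machines, that would be one more thing they have in common.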
I've only got an ss4-110. Can I simulate the problem with that one?
There are more things in common between the u2 and e3000... they're both
64-bit, aren't they?
Maybe the kernel, libc or some other software with similar versions?
Daniel van Eeden <[EMAIL PROTECTED]>
Andreas Loong wrote:
[note, this is for the ultra2, which displays the same problem]
try these:
badblocks (read the manpage)
Read the manpage, tried the program; it didn't find anything.
This doesn't seem to be the problem, from what I've experienced.
cat /proc/scsi/<controller>/0
Sparc ESP Host Adapter:
PROM node f006347c
PROM name SUNW,fas
ESP Model Happy Meal FAS
DMA Revision Rev HME/FAS
cat /proc/scsi/scsi
nothing unusual here.
scsi-config <dev> (X frontend for scsiinfo)
Well, it finds the disks etc., and I can't find any strange values.
and if everything fails this could be a (dirty) solution:
scsiadd -r <scsi_id>
scsiadd -a <scsi_id>
This is not really what I want to do. I'll try to explain the problem
better:
Sometimes, after the system has been up and running for a while with a
couple of disks attached, I get "Live target 0 not responding" plastered
over the console. The disks become totally non-responsive and the LED
is lit constantly. Nothing gets written to the logs at all. This
happened with one disk, and it managed to corrupt my partition table on
that disk. I assumed the disk was faulty, so I reinstalled on another disk,
hoping not to encounter this kind of problem again. Now, with a different
disk and woody installed on it, I got the latest SMP kernel from the stable
tree and started to construct a mirror of two different disks. Then I got
hit with the same error message again. A bit odd; I had to reboot. I got
the mirrors up and running, and today I was just about to copy the contents
of the root over to the mirror so that I could quietly sit and work on the
files that needed some work in order to reflect the changes. While copying,
it hung again.
I do not think this is a hardware issue, as it always messes with target
0, no matter what drive is there. The feeling I get after encountering
this problem on two different machines is that it is either kernel-based
or Debian-based. The Ultra2 and the Enterprise 3000 have a few things in
common, although one is high-end and the other is rather low-end.
1) Both are SBUS based.
2) Same SCSI chip? I'll check this.
3) Anything else?
Hope this clears up any misunderstandings.
Wbr
Andreas Loong
--
+---------------------------------------------+
| Daniel van Eeden <[EMAIL PROTECTED]> |
| icq: 36952189 |
| aim: Compukid128 |
| jabber: [EMAIL PROTECTED] |
| msn: [EMAIL PROTECTED] |
| phone: +31 343 522622 |
| http://compukid.no-ip.org/about_me.html |
+---------------------------------------------+