Re: Boot-time hard drive errors

2013-02-25 Thread b w
This is not very helpful, but to read the messages you can try Pause or
Scroll Lock, filming the screen at a high frame rate and pausing the
playback, or taking pictures with a short exposure.


On Mon, Feb 25, 2013 at 6:20 AM, Martin Alejandro Paredes Sanchez <
mapsw...@prodigy.net.mx> wrote:

> On Sunday 24 February 2013 14:33:06 Ronald F. Guilmette wrote:
> > I have a somewhat eclectic system, currently running (or at any rate,
> > trying to run) 9.1-RELEASE.  The system in question contains three
> > drives, to wit:
> >
> > ATA-8 SATA 3.x device
> > ATA-8 SATA 1.x device
> > ATA-8 SATA 3.x device
> >
> > Previously, I had the ST3500320AS in this system, along with one other
> > entirely different Seagate drive, i.e. one not shown in the list above.
> > (Also, I was previously running 8.3-RELEASE and only recently updated
> > to 9.1-RELEASE.)
> >
> > Since I reconfigured the system to its current state, i.e. with the set
> > of three drives listed above, whenever I reboot the system, about 50%
> > of the time, when the boot process gets down to the point where it
> > would ordinarily be printing out the messages relating to ada0, ada1,
> > etc. suddenly I start to get a massive and apparently endless stream
> > of error messages, apparently relating to one of the drives listed
> > above, but the stream actually alternates between two consecutive
> > error messages, both undoubtedly related to each other.
> >
>
> Is your HDD controller SATA 3?
>
> I had a similar problem (sometimes the system could not boot), and it was
> caused by my HDD controller being SATA 1:
>
> Intel ICH5 SATA150 controller
>
> while my hard disk is SATA 2:
>
> WDC WD2500AVVS-00L2B0 01.03A01
>
> The problem disappeared when I locked the HDD at 150 MB/s (by setting the
> jumper on the HDD to SATA 1).
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"


Re: Boot-time hard drive errors

2013-02-24 Thread Martin Alejandro Paredes Sanchez
On Sunday 24 February 2013 14:33:06 Ronald F. Guilmette wrote:
> I have a somewhat eclectic system, currently running (or at any rate,
> trying to run) 9.1-RELEASE.  The system in question contains three
> drives, to wit:
>
> ATA-8 SATA 3.x device
> ATA-8 SATA 1.x device
> ATA-8 SATA 3.x device
>
> Previously, I had the ST3500320AS in this system, along with one other
> entirely different Seagate drive, i.e. one not shown in the list above.
> (Also, I was previously running 8.3-RELEASE and only recently updated
> to 9.1-RELEASE.)
>
> Since I reconfigured the system to its current state, i.e. with the set
> of three drives listed above, whenever I reboot the system, about 50%
> of the time, when the boot process gets down to the point where it
> would ordinarily be printing out the messages relating to ada0, ada1,
> etc. suddenly I start to get a massive and apparently endless stream
> of error messages, apparently relating to one of the drives listed
> above, but the stream actually alternates between two consecutive
> error messages, both undoubtedly related to each other.
>

Is your HDD controller SATA 3?

I had a similar problem (sometimes the system could not boot), and it was
caused by my HDD controller being SATA 1:

Intel ICH5 SATA150 controller

while my hard disk is SATA 2:

WDC WD2500AVVS-00L2B0 01.03A01

The problem disappeared when I locked the HDD at 150 MB/s (by setting the
jumper on the HDD to SATA 1).
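
As a rough check of the numbers above: SATA 1 signals at 1.5 Gbit/s and uses
8b/10b encoding, so 10 bits on the wire carry one byte of data, which is where
the 150 MB/s figure comes from. A minimal Python sketch of that arithmetic
(the function name is made up for illustration):

```python
# Payload rate of a SATA link: 8b/10b encoding sends 10 line bits
# for every 8 data bits, so only 8/10 of the line rate is payload.
def sata_payload_mb_per_s(line_rate_mbit):
    data_mbit = line_rate_mbit * 8 // 10   # strip the encoding overhead
    return data_mbit // 8                  # convert bits to bytes


# Line rates in Mbit/s for the three SATA generations.
for name, rate in [("SATA 1", 1500), ("SATA 2", 3000), ("SATA 3", 6000)]:
    print(name, sata_payload_mb_per_s(rate), "MB/s")
```

These are link ceilings only; a jumper that limits a SATA 2 drive to SATA 1
just makes it negotiate the lower rate with the controller.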


Re: Boot-time hard drive errors

2013-02-24 Thread Simon


Have you tried Pause/Break to see if you could freeze the screen to read the
error message?

I would stress test all three drives to see if they pass with flying colors. One
or more of your drives could indeed be flaky; being new means little. Also,
something could be conflicting from time to time, and that could also
show up under stress testing.

Make a backup if you have important data before stress testing.

-Simon

On Sun, 24 Feb 2013 13:33:06 -0800, Ronald F. Guilmette wrote:



>I have a somewhat eclectic system, currently running (or at any rate,
>trying to run) 9.1-RELEASE.  The system in question contains three
>drives, to wit:

>ATA-8 SATA 3.x device
>ATA-8 SATA 1.x device
>ATA-8 SATA 3.x device

>Previously, I had the ST3500320AS in this system, along with one other
>entirely different Seagate drive, i.e. one not shown in the list above.
>(Also, I was previously running 8.3-RELEASE and only recently updated
>to 9.1-RELEASE.)

>Since I reconfigured the system to its current state, i.e. with the set
>of three drives listed above, whenever I reboot the system, about 50%
>of the time, when the boot process gets down to the point where it
>would ordinarily be printing out the messages relating to ada0, ada1,
>etc. suddenly I start to get a massive and apparently endless stream
>of error messages, apparently relating to one of the drives listed
>above, but the stream actually alternates between two consecutive
>error messages, both undoubtedly related to each other.

>The boot process never completes, and I am just left staring at a
>screen that's displaying, in very rapid succession, first the one
>error message and then the other, and then the first one again, and
>then the second one again, and on and on like that.

>Unfortunately, the two error messages are being printed on the screen
>so fast (and alternating, as described above) that I cannot even read
>them, but I could just barely make out that they seem to relate to ada2...
>well, anyway, one or another of the hard drives.

>I do not know the proper way to rectify whatever is causing these "flaky"
>errors.  I use the term "flaky" because, as I have said, this boot-time
>problem only seems to occur maybe about 50% of the time, and the rest
>of the time when I boot up there is no problem whatsoever.

>Because I am able to boot up successfully, with no problems whatsoever,
>a significant fraction of the time, I am inclined to think that whatever
>is causing the failure is not actually a hardware fault.  (And by the way,
>the WDC drive and the Hitachi drive are both practically brand new.  That
>doesn't prove anything, of course, but it does make me think that they
>are unlikely to have serious hardware faults.)

>I would report this problem by filing a standard PR, but as I've said
>above, I can't even read the error messages, because they are being
>printed in such rapid succession, so I'm not sure that filing a PR
>would be useful to anybody.  I mean what would it say?  That I'm getting
>some unspecified failure at boot time that seems to relate to the hard
>drives in this system?  That kind of PR would clearly not be very helpful.

>Has anyone else ever encountered symptoms like those I have listed
>above, either with 9.1-RELEASE or with any other version of FreeBSD?


>Regards,
>rfg






Boot-time hard drive errors

2013-02-24 Thread Ronald F. Guilmette


I have a somewhat eclectic system, currently running (or at any rate,
trying to run) 9.1-RELEASE.  The system in question contains three
drives, to wit:

ATA-8 SATA 3.x device
ATA-8 SATA 1.x device
ATA-8 SATA 3.x device

Previously, I had the ST3500320AS in this system, along with one other
entirely different Seagate drive, i.e. one not shown in the list above.
(Also, I was previously running 8.3-RELEASE and only recently updated
to 9.1-RELEASE.)

Since I reconfigured the system to its current state, i.e. with the set
of three drives listed above, whenever I reboot the system, about 50%
of the time, when the boot process gets down to the point where it
would ordinarily be printing out the messages relating to ada0, ada1,
etc. suddenly I start to get a massive and apparently endless stream
of error messages, apparently relating to one of the drives listed
above, but the stream actually alternates between two consecutive
error messages, both undoubtedly related to each other.

The boot process never completes, and I am just left staring at a
screen that's displaying, in very rapid succession, first the one
error message and then the other, and then the first one again, and
then the second one again, and on and on like that.

Unfortunately, the two error messages are being printed on the screen
so fast (and alternating, as described above) that I cannot even read
them, but I could just barely make out that they seem to relate to ada2...
well, anyway, one or another of the hard drives.

I do not know the proper way to rectify whatever is causing these "flaky"
errors.  I use the term "flaky" because, as I have said, this boot-time
problem only seems to occur maybe about 50% of the time, and the rest
of the time when I boot up there is no problem whatsoever.

Because I am able to boot up successfully, with no problems whatsoever,
a significant fraction of the time, I am inclined to think that whatever
is causing the failure is not actually a hardware fault.  (And by the way,
the WDC drive and the Hitachi drive are both practically brand new.  That
doesn't prove anything, of course, but it does make me think that they
are unlikely to have serious hardware faults.)

I would report this problem by filing a standard PR, but as I've said
above, I can't even read the error messages, because they are being
printed in such rapid succession, so I'm not sure that filing a PR
would be useful to anybody.  I mean what would it say?  That I'm getting
some unspecified failure at boot time that seems to relate to the hard
drives in this system?  That kind of PR would clearly not be very helpful.

Has anyone else ever encountered symptoms like those I have listed
above, either with 9.1-RELEASE or with any other version of FreeBSD?


Regards,
rfg


Re: Drive errors in raidz array

2010-01-23 Thread krad
On 22 January 2010 21:31, Dan Naumov  wrote:

> >> I have a system with 24 drives in raidz2.
>
> Congrats, you answered your own question within the first sentence :)
>
> ANSWER: As per the ZFS documentation, don't build raidz/raidz2 vdevs
> wider than 9 disks per vdev or bad things (tm) will happen.
> Google will tell you more.
>
> - Sincerely,
> Dan Naumov


He didn't actually say that; you have inferred it. However, you are correct
about the vdev size.

The best configuration would probably be 2x raidz2 vdevs of 12 drives, or 3x
of 8.

You could also go for 3x raidz of 7 drives with 3 hot spares. It really
depends on what redundancy/capacity ratio you want.

Having said all this, I'm not convinced the errors you are seeing are
definitely due to having 24 drives in a vdev. I would expect some write
performance issues and slow rebuild times, but not device errors.
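
To compare the layouts mentioned above, here is a small Python sketch that
counts data, parity and spare drives for each option. It assumes 24 identical
drives and ignores ZFS metadata and padding overhead, so the figures are only
drive counts, not usable bytes; the helper name is hypothetical.

```python
def layout(vdevs, width, parity, spares=0):
    """Drive counts for `vdevs` raidz groups of `width` drives each."""
    return {
        "total": vdevs * width + spares,
        "data": vdevs * (width - parity),   # drives' worth of user data
        "parity": vdevs * parity,           # drives' worth of parity
        "spares": spares,
    }


options = {
    "2x raidz2 of 12": layout(2, 12, 2),
    "3x raidz2 of 8": layout(3, 8, 2),
    "3x raidz of 7 + 3 spares": layout(3, 7, 1, spares=3),
}
for name, counts in options.items():
    print(name, counts)
```

Under these assumptions the first option leaves 20 drives' worth of data
capacity and the other two leave 18; what differs is how the remaining drives
are split between parity and hot spares.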


RE: Drive errors in raidz array

2010-01-22 Thread Dan Naumov
>> I have a system with 24 drives in raidz2.

Congrats, you answered your own question within the first sentence :)

ANSWER: As per the ZFS documentation, don't build raidz/raidz2 vdevs
wider than 9 disks per vdev or bad things (tm) will happen.
Google will tell you more.

- Sincerely,
Dan Naumov


Drive errors in raidz array

2010-01-22 Thread Toby Burress
I have a system with 24 drives in raidz2.  When testing with bonnie++
it seemed to work fine (although I had to raise the arc_max to
prevent kernel panics).  However, now we're copying data to it and
dmesg is showing many errors like:

mpt0: mpt_cam_event: 0x16
mpt0: request 0xff80005f3840:63495 timed out for ccb 0xff000988f800 (req->ccb 0xff000988f800)
mpt0: request 0xff80005f1f80:63496 timed out for ccb 0xff00098d0800 (req->ccb 0xff00098d0800)
mpt0: attempting to abort req 0xff80005f3840:63495 function 0
mpt0: request 0xff8000601ee0:63497 timed out for ccb 0xff011edaa800 (req->ccb 0xff011edaa800)
mpt0: request 0xff80005f4ec0:63498 timed out for ccb 0xff011eda5800 (req->ccb 0xff011eda5800)
mpt0: mpt_wait_req(1) timed out
mpt0: mpt_recover_commands: abort timed-out. Resetting controller
mpt0: mpt_cam_event: 0x0
mpt0: completing timedout/aborted req 0xff80005f3840:63495
mpt0: completing timedout/aborted req 0xff80005f1f80:63496
mpt0: completing timedout/aborted req 0xff8000601ee0:63497
mpt0: completing timedout/aborted req 0xff80005f4ec0:63498

followed by

(da0:mpt0:0:1:0): READ(10). CDB: 28 0 1 23 81 6f 0 0 2b 0 
(da0:mpt0:0:1:0): CAM Status: SCSI Status Error
(da0:mpt0:0:1:0): SCSI Status: Check Condition
(da0:mpt0:0:1:0): UNIT ATTENTION asc:29,0
(da0:mpt0:0:1:0): Power on, reset, or bus device reset occurred
(da0:mpt0:0:1:0): Retrying Command (per Sense Data)

for every drive in the array.  Additionally, zpool scrub says:

 pool: backups
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Thu Jan 21 23:15:36 2010

I'm using 8.0-RELEASE-p2 on amd64.  One other thing that changed
between testing with bonnie++ and now is that I used glabel to label
the drives before I put them in the raidz array.

There is no raid controller.

Is this something anyone has seen before?  Googling around shows
some similar errors but no solutions.


Re: Drive errors on boot

2006-04-13 Thread Bryan Curl
On 4/10/06, David J Brooks <[EMAIL PROTECTED]> wrote:
>
> On Monday 10 April 2006 09:26, Bryan Curl wrote:
> > --- Lowell Gilbert <[EMAIL PROTECTED]> wrote:
> > > Bryan Curl <[EMAIL PROTECTED]> writes:
> > > > My apologies if this is a repost. It seems either I
> > > > had a gmail problem or list never posted the question.
> > > > I have subscribed with another address to monitor
> > > > problem.
> > > >
> > > > Anyway, here is my question again.
> > > >
> > > > I get the following errors from dmesg on one of my ide
> > > > drives on boot.
> > > > Other similar drives dont error and are setup the same
> > > > in bios (except cylinder & block config of course)
> > > > System and this drive seem to work fine otherwise. I
> > > > re-fdisk this one but it still does this error.
> > > >
> > > > FreeBSD 6.0-RELEASE-p6 #0: Tue Apr  4 09:43:53 MDT 2006
> > > >
> > > > ad1: 1916MB  at ata0-slave WDMA2
> > > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
> > > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924343
> > > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924356
> > > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
> > >
> > > This is probably a hardware problem.  My first guess would be
> > > cabling.  Try swapping the cable.  And make sure there is a master on
> > > the bus if this one is probing as a slave.
> >
> > This is the primary slave drive. Primary master is the
> > boot drive where OS lives. The master is cabled on the
> > end connector and the slave is connected to the middle
> > connector on the cable.
> >
> > The supplied documentation on the drive jumpers is
> > vague at best. It only makes mention of one jumper
> > (master or slave positions) There are 3 other jumpers
> > on the drive that are not mentioned.
> >
> > Looks to me like DMA feature isn't working but I dont
> > know if this is activated by a jumper or by firmware
> > somehow.
> >
> > > > I  dont know what causes these errors either.
> > > >
> > > > dc0: failed to force tx and rx to idle state
> > > > dc0: failed to force tx and rx to idle state
> > >
> > > The driver tried to force the transmitter and receiver to be "idle"
> > > temporarily, and failed.  There are a number of different cases where
> > > the driver tries to do this, so it's hard to guess exactly what's
> > > happening this time.  Some of the relevant variables are: whether this
> > > happens at boot time, whether it happens after an underrun or overrun,
> > > and which real controller chip you have.
> >
> > I have seen this error on every FreeBSD installation I
> > have ever had. To my knowledge, it never seemed to
> > bother anything. I just hate watching errors scroll by.
>
> I solved this same error on my machine by adding the tunable
> hw.ata.ata_dma="0"
> to /boot/loader.conf
>
> That slows down drive access something fierce, but it worked for me. Once
> the
> machine has booted you may be able to turn DMA access back on with
> atacontrol(8).
>
> The problem was ultimately solved for me by upgrading to 6.1-PRERELEASE.
>
> HTH,
> David
> --
> Sure God created the world in only six days,
> but He didn't have an established user-base.

While I was writing a file to the drive the other day, it core dumped.
So I pulled the drive, booted from an old Win98 floppy, and partitioned and
formatted the drive just to start from scratch; no problems were indicated in
that process.
Then I CVSUPed to RELENG_6_1. No problems upgrading at all.
I reinstalled the drive with a different cable in the secondary
slave position, then ran fdisk and label from sysinstall (full use, no MBR
changes).
I tried different BIOS settings, such as auto recognition and user defined, with
LBA, Normal and Large modes. No change.

fsck /dev/ad3s1d looks good this time, but the same dmesg errors still appear
on boot.
This is an older Maxtor 72004 AP 2 GB drive.

Could be time for the trashcan to take ownership.
Any more ideas?


--
--
Bryan
bc3910 'at' gmail 'dot' com


Re: Drive errors on boot

2006-04-10 Thread David J Brooks
On Monday 10 April 2006 09:26, Bryan Curl wrote:
> --- Lowell Gilbert <[EMAIL PROTECTED]> wrote:
> > Bryan Curl <[EMAIL PROTECTED]> writes:
> > > My apologies if this is a repost. It seems either I
> > > had a gmail problem or list never posted the question.
> > > I have subscribed with another address to monitor
> > > problem.
> > >
> > > Anyway, here is my question again.
> > >
> > > I get the following errors from dmesg on one of my ide
> > > drives on boot.
> > > Other similar drives dont error and are setup the same
> > > in bios (except cylinder & block config of course)
> > > System and this drive seem to work fine otherwise. I
> > > re-fdisk this one but it still does this error.
> > >
> > > FreeBSD 6.0-RELEASE-p6 #0: Tue Apr  4 09:43:53 MDT 2006
> > >
> > > ad1: 1916MB  at ata0-slave WDMA2
> > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
> > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924343
> > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924356
> > > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
> >
> > This is probably a hardware problem.  My first guess would be
> > cabling.  Try swapping the cable.  And make sure there is a master on
> > the bus if this one is probing as a slave.
>
> This is the primary slave drive. Primary master is the
> boot drive where OS lives. The master is cabled on the
> end connector and the slave is connected to the middle
> connector on the cable.
>
> The supplied documentation on the drive jumpers is
> vague at best. It only makes mention of one jumper
> (master or slave positions) There are 3 other jumpers
> on the drive that are not mentioned.
>
> Looks to me like DMA feature isn't working but I dont
> know if this is activated by a jumper or by firmware
> somehow.
>
> > > I  dont know what causes these errors either.
> > >
> > > dc0: failed to force tx and rx to idle state
> > > dc0: failed to force tx and rx to idle state
> >
> > The driver tried to force the transmitter and receiver to be "idle"
> > temporarily, and failed.  There are a number of different cases where
> > the driver tries to do this, so it's hard to guess exactly what's
> > happening this time.  Some of the relevant variables are: whether this
> > happens at boot time, whether it happens after an underrun or overrun,
> > and which real controller chip you have.
>
> I have seen this error on every FreeBSD installation I
> have ever had. To my knowledge, it never seemed to
> bother anything. I just hate watching errors scroll by.

I solved this same error on my machine by adding the tunable
hw.ata.ata_dma="0"
to /boot/loader.conf

That slows down drive access something fierce, but it worked for me. Once the 
machine has booted you may be able to turn DMA access back on with 
atacontrol(8).

The problem was ultimately solved for me by upgrading to 6.1-PRERELEASE.

HTH,
David
-- 
Sure God created the world in only six days,
but He didn't have an established user-base.


Re: Drive errors on boot

2006-04-10 Thread Bryan Curl


--- Lowell Gilbert <[EMAIL PROTECTED]> wrote:

> Bryan Curl <[EMAIL PROTECTED]> writes:
>
> > My apologies if this is a repost. It seems either I
> > had a gmail problem or list never posted the question.
> > I have subscribed with another address to monitor
> > problem.
> >
> > Anyway, here is my question again.
> >
> > I get the following errors from dmesg on one of my ide
> > drives on boot.
> > Other similar drives dont error and are setup the same
> > in bios (except cylinder & block config of course)
> > System and this drive seem to work fine otherwise. I
> > re-fdisk this one but it still does this error.
> >
> > FreeBSD 6.0-RELEASE-p6 #0: Tue Apr  4 09:43:53 MDT 2006
> >
> > ad1: 1916MB  at ata0-slave WDMA2
> > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
> > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924343
> > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924356
> > ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
>
> This is probably a hardware problem.  My first guess would be
> cabling.  Try swapping the cable.  And make sure there is a master on
> the bus if this one is probing as a slave.

This is the primary slave drive. The primary master is the
boot drive where the OS lives. The master is cabled on the
end connector and the slave is connected to the middle
connector on the cable.

The supplied documentation on the drive jumpers is
vague at best. It only mentions one jumper
(master or slave positions). There are 3 other jumpers
on the drive that are not mentioned.

It looks to me like the DMA feature isn't working, but I don't
know if this is activated by a jumper or by firmware
somehow.

> > I  dont know what causes these errors either.
> >
> > dc0: failed to force tx and rx to idle state
> > dc0: failed to force tx and rx to idle state
>
> The driver tried to force the transmitter and receiver to be "idle"
> temporarily, and failed.  There are a number of different cases where
> the driver tries to do this, so it's hard to guess exactly what's
> happening this time.  Some of the relevant variables are: whether this
> happens at boot time, whether it happens after an underrun or overrun,
> and which real controller chip you have.

I have seen this error on every FreeBSD installation I
have ever had. To my knowledge, it never seemed to
bother anything. I just hate watching errors scroll by.





Re: Drive errors on boot

2006-04-10 Thread Lowell Gilbert
Bryan Curl <[EMAIL PROTECTED]> writes:

> My apologies if this is a repost. It seems either I
> had a gmail problem or list never posted the question.
> I have subscribed with another address to monitor
> problem.
>
> Anyway, here is my question again.
>
> I get the following errors from dmesg on one of my ide
> drives on boot.
> Other similar drives dont error and are setup the same
> in bios (except cylinder & block config of course)
> System and this drive seem to work fine otherwise. I
> re-fdisk this one but it still does this error.
>
> FreeBSD 6.0-RELEASE-p6 #0: Tue Apr  4 09:43:53 MDT 2006
>
> ad1: 1916MB  at ata0-slave WDMA2
> ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
> ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924343
> ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924356
> ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359

This is probably a hardware problem.  My first guess would be
cabling.  Try swapping the cable.  And make sure there is a master on
the bus if this one is probing as a slave.

> I  dont know what causes these errors either.
> 
> dc0: failed to force tx and rx to idle state
> dc0: failed to force tx and rx to idle state


The driver tried to force the transmitter and receiver to be "idle"
temporarily, and failed.  There are a number of different cases where
the driver tries to do this, so it's hard to guess exactly what's
happening this time.  Some of the relevant variables are: whether this
happens at boot time, whether it happens after an underrun or overrun,
and which real controller chip you have.
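
For reference, the status= and error= values in the READ_DMA lines are
hexadecimal copies of the ATA status and error registers. A small Python
sketch that decodes them, assuming the classic ATA bit layout (bit names
differ slightly between spec revisions and drivers, so treat the labels as
approximate; the function name is made up for illustration):

```python
# (mask, name) pairs, highest bit first. Names are the common ATA
# mnemonics and are an assumption; drivers may print other labels.
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "AMNF"),
]


def decode(hex_value, bit_names):
    """Return the names of all bits set in a hex register value."""
    value = int(hex_value, 16)
    return [name for mask, name in bit_names if value & mask]


print("status=51 ->", decode("51", STATUS_BITS))
print("error=10  ->", decode("10", ERROR_BITS))
```

With the values from the log, status=51 decodes to DRDY, DSC and ERR, and
error=10 to IDNF (sector ID not found), i.e. the drive reports that it could
not find the requested address.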


Drive errors on boot

2006-04-10 Thread Bryan Curl
My apologies if this is a repost. It seems either I
had a gmail problem or the list never posted the question.
I have subscribed with another address to monitor the
problem.

Anyway, here is my question again.

I get the following errors from dmesg on one of my IDE
drives on boot.
Other similar drives don't error and are set up the same
in the BIOS (except cylinder & block config, of course).
The system and this drive seem to work fine otherwise. I
re-ran fdisk on this one but it still gives this error.

FreeBSD 6.0-RELEASE-p6 #0: Tue Apr  4 09:43:53 MDT 2006

ad1: 1916MB  at ata0-slave WDMA2
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924343
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924356
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359

I don't know what causes these errors either.

dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state

Any ideas / suggested reading somewhere on this?
Thank You.






Drive errors on boot

2006-04-08 Thread Bryan Curl
Hello group,

I get these errors on one of my IDE drives on boot.
Other similar drives don't error and are set up the same in the BIOS (except
cylinder & block config, of course).
The system and this drive seem to work fine otherwise. I re-ran fdisk on this
one but it still gives this error.

FreeBSD 6.0-RELEASE-p6 #0: Tue Apr  4 09:43:53 MDT 2006

ad1: 1916MB  at ata0-slave WDMA2
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924343
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924356
ad1: FAILURE - READ_DMA status=51 error=10 LBA=3924359

I don't know what causes these errors either.

dc0: failed to force tx and rx to idle state
dc0: failed to force tx and rx to idle state

Any ideas / suggested reading somewhere on this?
Thank You.


---
Bryan
bc3910 'at' gmail 'dot' com


USB drive errors (USB 2.0 and memory drive)

2005-07-06 Thread Louis LeBlanc
Well, I've been in a config debug mood lately, so I'm going to go
after one more issue.  Rather, I'm gonna ask for help here, since I
can't find the solution online.

This has been happening since I managed to get ehci working without
causing the kernel to freak out.  I'm running 5.4 RELEASE p1
(upgrading to p4 later on tonight).  The problem happens when I plug
in either of my USB key devices (one being a PNY USB Disk, the other a
little iPod Shuffle).

I am using both ehci and ohci drivers, and AFAICT, the problem only
happens with USB2.0 devices.

The problem shows up in /var/log/messages as follows:

Jul  6 19:41:31 keyslapper kernel: umass0: PNY USB DISK 20X, rev 2.00/1.00, addr 2
Jul  6 19:41:32 keyslapper kernel: da0 at umass-sim0 bus 0 target 0 lun 0
Jul  6 19:41:32 keyslapper kernel: da0: < USB DISK 20X PMAP> Removable Direct Access SCSI-0 device
Jul  6 19:41:32 keyslapper kernel: da0: 40.000MB/s transfers
Jul  6 19:41:32 keyslapper kernel: da0: 962MB (1970176 512 byte sectors: 64H 32S/T 962C)
Jul  6 19:41:32 keyslapper kernel: umass0: Phase Error, residue = 0
Jul  6 19:41:32 keyslapper kernel: (da0:umass-sim0:0:0:0): Synchronize cache failed, status == 0x4, scsi status == 0x0

The last two lines repeat 16 or 17 times.

When I try to mount these, I have no problems.  No errors, and
everything appears to work fine.  I can move files, edit directly on
the disk, whatever.

So, these messages are an indication of something weird somewhere; I
just don't know if it's purely cosmetic, or if there's really
something wrong and my resume is going to get eaten one of these days.

I've googled for these messages, and found a lot of reports of the
same problem (with few variations), but no solutions or suggestions.

I tracked this error message to /usr/src/sys/cam/scsi/scsi_da.c, but
I'm not sure exactly where this happens in the device initialization
yet (just a quick skim through the code and no understanding of the
underlying architecture or USB specs yet).

Anyone have any idea?

TIA
Lou
-- 
Louis LeBlanc  FreeBSD-at-keyslapper-DOT-net
Fully Funded Hobbyist,   KeySlapper Extrordinaire :)
Please send off-list email to: leblanc at keyslapper d.t net
Key fingerprint = C5E7 4762 F071 CE3B ED51  4FB8 AF85 A2FE 80C8 D9A2

BASIC, n.:
  A programming language.  Related to certain social diseases in
  that those who have it will not admit it in polite company.




Re: hard drive errors

2005-01-20 Thread Jason Henson
On 01/20/05 19:21:13, David Bear wrote:
> I am receiving the following errors on my hard drive. This appears to
> affect some file in /var/log. My question is twofold. 1) shouldn't ufs
> notice this sector as being unusable and mark it off-limits? 2) if
> not, is there a way to mark it so manually?
>
> ad0s1g: hard error reading fsbn 19674311 of 6765124-6765135 (ad0s1 bn
> 19674311; cn 1618 tn 16 sn 41) status=59 error=40
> ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn
> 74 sn 58) status=59 error=40
> ad0s1h: hard error reading fsbn 88412159 of 35809248-35809251 (ad0s1
> bn 88412159; cn 7271 tn 64 sn 38) status=59 error=40
> ad0s1h: hard error reading fsbn 35809251 (ad0s1 bn 35809251; cn 2945
> tn 15 sn 51) status=59 error=40
> ad0s1g: hard error reading fsbn 19674303 of 6765120-6765133 (ad0s1 bn
> 19674303; cn 1618 tn 16 sn 33) status=59 error=40
> ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn
> 74 sn 58) status=59 error=40


Check out ports/sysutils/smartmontools/ if your drive supports SMART.
It will give you the health and a lot of info about the drive and any
problems it is having.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: hard drive errors

2005-01-20 Thread John
On Thu, Jan 20, 2005 at 05:21:13PM -0700, David Bear wrote:
> I am receiving the following errors on my hard drive. This appears to
> affect some file in /var/log. My question is twofold. 1) shouldn't ufs
> notice this sector as being unusable and mark it off-limits? 2) if
> not, is there a way to mark it so manually?
> 
> 
> ad0s1g: hard error reading fsbn 19674311 of 6765124-6765135 (ad0s1 bn
> 19674311; cn 1618 tn 16 sn 41) status=59 error=40
> ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn
> 74 sn 58) status=59 error=40
> ad0s1h: hard error reading fsbn 88412159 of 35809248-35809251 (ad0s1
> bn 88412159; cn 7271 tn 64 sn 38) status=59 error=40
> ad0s1h: hard error reading fsbn 35809251 (ad0s1 bn 35809251; cn 2945
> tn 15 sn 51) status=59 error=40
> ad0s1g: hard error reading fsbn 19674303 of 6765120-6765133 (ad0s1 bn
> 19674303; cn 1618 tn 16 sn 33) status=59 error=40
> ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn
> 74 sn 58) status=59 error=40

Modern disk drives do a lot to manage errors, but things can still
happen that they cannot protect against - this is part of the reason
various RAID schemes are used.

If the drive gets a lot of recoverable (soft) errors, that means that
it can reconstruct the data, even though it was damaged.  Having
reconstructed the data, it can remap the sector.

A hard error means that, by the time the problem was noticed, data
were already unrecoverable.  It can't simply remap the sector
somewhere else, because the data are already gone!  If it were to
map it somewhere else - what would it put there?  It doesn't know,
and neither do I.
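
The status=59 error=40 pair in the quoted messages can be decoded by
hand. A minimal sketch, assuming the driver prints both values in
hexadecimal and that they follow the classic ATA status and error
register bit layout (the names STATUS_BITS, ERROR_BITS and decode are
made up for this illustration):

```python
# Bit names follow the classic ATA task-file register layout.
# Treating the logged values as hexadecimal is an assumption.
STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error, details are in the error register
}
ERROR_BITS = {
    0x80: "BBK/ICRC",  # bad block, or interface CRC error on UDMA
    0x40: "UNC",       # uncorrectable data
    0x20: "MC",        # media changed
    0x10: "IDNF",      # sector ID not found
    0x08: "MCR",       # media change requested
    0x04: "ABRT",      # command aborted
    0x02: "TK0NF",     # track 0 not found
    0x01: "AMNF",      # address mark not found
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


status = decode(int("59", 16), STATUS_BITS)
error = decode(int("40", 16), ERROR_BITS)
print("status:", status)
print("error: ", error)
```

Under those assumptions, error=40 is the UNC (uncorrectable data) bit,
which fits the description of a hard read error.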

You really, really need to back up your data somewhere.  You may
already have lost data which are valuable to you, but that's no
reason to lose more.  After that, go into the BIOS and do a surface
scan of the drive.  That will cause it to remap all the sectors
that are unrecoverable.  Then, remake the affected filesystem, and
restore your data.  If the drive is basically a good drive, you
should be fine again.  If the drive is failing, more hard (and
soft) errors will pop up, and your data are at greater risk.

Fortunately, you say the errors seem to be in /var/log.  Maybe
remaking the /var subsystem and losing some log files won't
really cause you any pain.  I hope that that is the case.

There used to be filesystem-level code to manage bad sectors.  This
was bad, because when you went to do unit copies (rarely done
anymore), you'd still hit the bad spots.  The ability to manage
disk defects was then pushed down into the driver (bad144 disk
defect management), and then down into the drives themselves.

NONE of those methods can protect you from the sudden and seemingly
spontaneous loss of data!  If you move your system, or it is subject
to shock and vibration, and the heads go bouncing across the surface
- data may be lost.  Sometimes I swear cosmic rays just blast out
some bits (well, it SEEMS like it), and, ultimately, thermodynamics
cannot be beaten - any image, magnetic or otherwise, fades with
time.  The signal-to-noise ratio of the heads and electronics also
changes over the life of the product, and tiny flecks can come off,
be deposited on, or moved around the disk surface.  All of this
can cause data problems.

Though almost no one does it, back up your data.  Back up your
data.  Back up your data.  Like the old joke about real estate that
the three most important features are location, location, and
location, the three most important steps in preserving and protecting
data (short of hardware RAID protection and remote and local
subsystem based replication) are backup, backup, and backup.

I actually have an arrangement with a friend of mine that the most
important data on my system are rolled up into a tarball and an
expect script FTPs it to one of his servers every night.  A little
kludgy, but it works as poor-man's remote data replication.
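
A rough sketch of that kind of nightly job, written in Python rather
than expect and meant only as an illustration: the function names,
paths, host name and credentials are placeholders, not the actual
setup described above.

```python
import ftplib
import tarfile
from pathlib import Path


def make_tarball(sources, dest):
    """Pack each path in sources into a gzip-compressed tarball at dest."""
    with tarfile.open(dest, "w:gz") as tar:
        for src in sources:
            src = Path(src)
            # Store entries under their base name, not the absolute path.
            tar.add(src, arcname=src.name)
    return Path(dest)


def upload(tarball, host, user, password):
    """Send the tarball to a remote host over plain FTP."""
    with ftplib.FTP(host, user, password) as ftp, open(tarball, "rb") as fh:
        ftp.storbinary(f"STOR {Path(tarball).name}", fh)


# Hypothetical usage, e.g. from a nightly cron job:
#   archive = make_tarball(["/home/me/important"], "/tmp/nightly.tar.gz")
#   upload(archive, "backup.example.org", "me", "secret")
```

Plain FTP sends the password in the clear, so anything similar done
today would more likely go over scp or rsync; the structure of the job
stays the same.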
-- 

John Lind
[EMAIL PROTECTED]


Re: hard drive errors

2005-01-20 Thread Chuck Swiger
David Bear wrote:
> I am receiving the following errors on my hard drive. This appears to
> affect some file in /var/log. My question is twofold. 1) shouldn't ufs
> notice this sector as being unusable and mark it off-limits? 2) if
> not, is there a way to mark it so manually?

Sure, by default, modern drives will notice and replace failing sectors using
spare ones.  The error message you are seeing very probably indicates that the
drive has enough bad sectors that it has run out of spares and is going to
completely fail very soon.

Back up your data ASAP.
--
-Chuck


hard drive errors

2005-01-20 Thread David Bear
I am receiving the following errors on my hard drive. This appears to
affect some file in /var/log. My question is twofold. 1) shouldn't ufs
notice this sector as being unusable and mark it off-limits? 2) if
not, is there a way to mark it so manually?


ad0s1g: hard error reading fsbn 19674311 of 6765124-6765135 (ad0s1 bn
19674311; cn 1618 tn 16 sn 41) status=59 error=40
ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn
74 sn 58) status=59 error=40
ad0s1h: hard error reading fsbn 88412159 of 35809248-35809251 (ad0s1
bn 88412159; cn 7271 tn 64 sn 38) status=59 error=40
ad0s1h: hard error reading fsbn 35809251 (ad0s1 bn 35809251; cn 2945
tn 15 sn 51) status=59 error=40
ad0s1g: hard error reading fsbn 19674303 of 6765120-6765133 (ad0s1 bn
19674303; cn 1618 tn 16 sn 33) status=59 error=40
ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn
74 sn 58) status=59 error=40
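
To see how many distinct blocks are involved, the messages can be
grouped per partition. A minimal sketch, assuming each message has been
rejoined onto a single line; the regular expression and the name
failing_blocks are illustrative, not taken from any FreeBSD tool:

```python
import re
from collections import defaultdict

# The six messages above, with the mail client's line wrapping undone.
LOG = """\
ad0s1g: hard error reading fsbn 19674311 of 6765124-6765135 (ad0s1 bn 19674311; cn 1618 tn 16 sn 41) status=59 error=40
ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn 74 sn 58) status=59 error=40
ad0s1h: hard error reading fsbn 88412159 of 35809248-35809251 (ad0s1 bn 88412159; cn 7271 tn 64 sn 38) status=59 error=40
ad0s1h: hard error reading fsbn 35809251 (ad0s1 bn 35809251; cn 2945 tn 15 sn 51) status=59 error=40
ad0s1g: hard error reading fsbn 19674303 of 6765120-6765133 (ad0s1 bn 19674303; cn 1618 tn 16 sn 33) status=59 error=40
ad0s1g: hard error reading fsbn 6765124 (ad0s1 bn 6765124; cn 556 tn 74 sn 58) status=59 error=40
"""

# Capture the partition name and the "bn" value inside the parentheses.
PATTERN = re.compile(r"^(\w+): hard error reading fsbn \d+.*?\(\w+ bn (\d+);")


def failing_blocks(text):
    """Map each partition to the sorted list of distinct bn values reported."""
    blocks = defaultdict(set)
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            blocks[m.group(1)].add(int(m.group(2)))
    return {part: sorted(bns) for part, bns in blocks.items()}


result = failing_blocks(LOG)
for part, bns in result.items():
    print(part, bns)
```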

-- 
David Bear
phone:  480-965-8257
fax:480-965-9189
College of Public Programs/ASU
Wilson Hall 232
Tempe, AZ 85287-0803
 "Beware the IP portfolio, everyone will be suspect of trespassing"


Re: Drive errors?

2003-11-16 Thread stan
On Fri, Nov 14, 2003 at 06:24:11PM -0500, stan wrote:
> On Fri, Nov 14, 2003 at 05:30:29PM +0300, Joseph Begumisa wrote:
> > > atapci0:  port 0xa000-0xa00f at device 7.1 on pci0
> > > atapci0: Correcting VIA config for southbridge data corruption bug
> > > ata0: at 0x1f0 irq 14 on atapci0
> > > ata1: at 0x170 irq 15 on atapci0
> > > ad3: 39266MB  [79780/16/63] at ata1-slave UDMA100
> > 
> > Well the OS sees an ATA100 Controller which is good.  So I guess you could
> > first look into the issue of the ribbon cable as mentioned below so that
> > we can eliminate that.
> >
>
> OK, I have confirmed that all IDE cables in the machine, with the exception
> of the one going to the CD, which is on its own controller, are 80 wire
> ones.
>
Here's the final disposition on this issue.

I replaced the drive in the running machine this weekend, and put it in a
test machine. Then I ran IBM/Hitachi's DFT software; it passed the advanced
test with flying colors (again). So I decided to try the "exercise"
functionality (which is a bit hidden in their menu). It failed this test
within 5 minutes.

So it's on its way to be replaced under warranty.

Thanks for the help on this!

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin


Re: Drive errors?

2003-11-14 Thread stan
On Fri, Nov 14, 2003 at 05:30:29PM +0300, Joseph Begumisa wrote:
> > atapci0:  port 0xa000-0xa00f at device 7.1 on pci0
> > atapci0: Correcting VIA config for southbridge data corruption bug
> > ata0: at 0x1f0 irq 14 on atapci0
> > ata1: at 0x170 irq 15 on atapci0
> > ad3: 39266MB  [79780/16/63] at ata1-slave UDMA100
> 
> Well the OS sees an ATA100 Controller which is good.  So I guess you could
> first look into the issue of the ribbon cable as mentioned below so that
> we can eliminate that.
> 

OK, I have confirmed that all IDE cables in the machine, with the exception
of the one going to the CD, which is on its own controller, are 80 wire
ones.

What should I do next?
-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin


Re: Drive errors?

2003-11-14 Thread Joseph Begumisa
> atapci0:  port 0xa000-0xa00f at device 7.1 on pci0
> atapci0: Correcting VIA config for southbridge data corruption bug
> ata0: at 0x1f0 irq 14 on atapci0
> ata1: at 0x170 irq 15 on atapci0
> ad3: 39266MB  [79780/16/63] at ata1-slave UDMA100

Well, the OS sees an ATA100 controller, which is good.  So I guess you could
first look into the issue of the ribbon cable as mentioned below so that
we can eliminate that.


> >
> > The last time I had such a problem, I solved it by using a 40 pin
> > 80 wire ribbon cable in place of the 40 pin 40 wire ribbon cable.
>
> I'll check that, but given that its a new drive I suspect it already has an
> 80 wire cable.
>
> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
>   -- Benjamin Franklin
>

Joseph.


Re: Drive errors?

2003-11-14 Thread stan
On Fri, Nov 14, 2003 at 03:48:14PM +0300, Joseph Begumisa wrote:
> 
> On Thu, 13 Nov 2003, stan wrote:
> 
> > I've got a relatively recent STABLE machine that I'm getting errors like this
> > on:
> >
> >
> > Nov 12 20:00:01 black newsyslog[33912]: logfile turned over due to size>100K
> > Nov 12 20:00:07 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 113202879 of 
> > 56601408-56601455 (ad3s1 bn 113202879; cn 7046 tn 141 sn 6) retrying
> > Nov 12 20:00:15 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 114997439 of 
> > 57498688-57498943 (ad3s1 bn 114997439; cn 7158 tn 66 sn 11) retrying
> > Nov 12 20:00:16 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115000927 of 
> > 57500432-57500687 (ad3s1 bn 115000927; cn 7158 tn 121 sn 34) retrying
> > Nov 12 20:00:20 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115172415 of 
> > 57586176-57586335 (ad3s1 bn 115172415; cn 7169 tn 38 sn 36) retrying
> > Nov 12 20:00:22 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115188431 of 
> > 57594184-57594191 (ad3s1 bn 115188431; cn 7170 tn 37 sn 50) retrying
> >
> > The drive in question is an IBM/Hitachi 40H unit detected as:
> >
> > Nov 13 19:11:18 black /kernel: ad3: 39266MB  [79780/16/63] at 
> > ata1-slave UDMA100
> >
> 
> Pasting the output from /var/run/dmesg.boot would be useful so that we can
> know what the controller is detected as and also see whether there is any
> other useful information concerning this problem.

OK, thanks for the suggestion.


Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.9-RC #13: Sun Oct 19 11:58:37 EDT 2003
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/BLACK
Timecounter "i8254"  frequency 1193182 Hz
CPU: AMD Athlon(tm) processor (900.04-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x644  Stepping = 4
  Features=0x183f9ff
  AMD Features=0xc044
real memory  = 805240832 (786368K bytes)
avail memory = 779059200 (760800K bytes)
Preloaded elf kernel "kernel" at 0xc03da000.
Preloaded elf module "agp.ko" at 0xc03da09c.
VESA: v2.0, 32768k memory, flags:0x1, mode table:0xc03573c2 (122)
VESA: ATI RADEON
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 9 entries at 0xc00fdd90
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  on motherboard
pci0:  on pcib0
agp0:  mem 0xc000-0xcfff at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci1:  at 0.0 irq 9
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0xa000-0xa00f at device 7.1 on pci0
atapci0: Correcting VIA config for southbridge data corruption bug
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0:  port 0xa400-0xa41f irq 10 at device 7.2 on pci0
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1:  port 0xa800-0xa81f irq 10 at device 7.3 on pci0
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0:  (vendor=0x1106, dev=0x3057) at 7.4
pcm0:  port 0xb400-0xb403,0xb000-0xb003,0xac00-0xacff irq 11 at device 7.5 on pci0
pcm0: 
ahc0:  port 0xb800-0xb8ff mem 0xdb001000-0xdb001fff irq 11 at device 9.0 on pci0
aic7850: Single Channel A, SCSI Id=7, 3/253 SCBs
ahc1:  port 0xbc00-0xbcff mem 0xdb00-0xdb000fff irq 10 at device 11.0 on pci0
aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs
ed0:  port 0xc000-0xc01f irq 10 at device 13.0 on pci0
ed0: address 00:50:ba:52:69:f1, type NE2000 (16 bit) 
orm0:  at iomem 0xc-0xc7fff,0xcc000-0xcc7ff on isa0
pmtimer0 on isa0
fdc0:  at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
fdc0: ready for input in output
fdc0: cmd 3 failed at out byte 1 of 3
fd1: <1200-KB 5.25" drive> on fdc0 drive 1
atkbdc0:  at port 0x60,0x64 on isa0
atkbd0:  flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0:  irq 12 on atkbdc0
psm0: model MouseMan+, device ID 0
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0:  at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0:  at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
plip0:  on ppbus0
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
ad0: 39266MB  [79780/16/63] at ata0-master UDMA100
ad1: 39266MB  [79780/16/63] at ata0-slave UDMA100
ad2: 39266MB  [79780/16/63] at ata1-master UDMA100
ad3: 39266MB  [79780/16/63] at ata1-slave UDMA100
Waiting 5 seconds for SCSI devices to settle
sa0 at ahc1 bus 0 target 4 lun 0
sa0:  Removable Sequential Access SCSI-2 device 
sa0: 10.000MB/s transfers (10.000MHz, offset 15)
sa1 at ahc1 bus 0 target 5 lun 0
sa1:  Removable 

Re: Drive errors?

2003-11-14 Thread Joseph Begumisa

On Thu, 13 Nov 2003, stan wrote:

> I've got a relatively recent STABLE machine that I'm getting errors like this
> on:
>
>
> Nov 12 20:00:01 black newsyslog[33912]: logfile turned over due to size>100K
> Nov 12 20:00:07 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 113202879 of 
> 56601408-56601455 (ad3s1 bn 113202879; cn 7046 tn 141 sn 6) retrying
> Nov 12 20:00:15 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 114997439 of 
> 57498688-57498943 (ad3s1 bn 114997439; cn 7158 tn 66 sn 11) retrying
> Nov 12 20:00:16 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115000927 of 
> 57500432-57500687 (ad3s1 bn 115000927; cn 7158 tn 121 sn 34) retrying
> Nov 12 20:00:20 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115172415 of 
> 57586176-57586335 (ad3s1 bn 115172415; cn 7169 tn 38 sn 36) retrying
> Nov 12 20:00:22 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115188431 of 
> 57594184-57594191 (ad3s1 bn 115188431; cn 7170 tn 37 sn 50) retrying
>
> The drive in question is an IBM/Hitachi 40H unit detected as:
>
> Nov 13 19:11:18 black /kernel: ad3: 39266MB  [79780/16/63] at 
> ata1-slave UDMA100
>

Pasting the output from /var/run/dmesg.boot would be useful so that we can
know what the controller is detected as and also see whether there is any
other useful information concerning this problem.

The last time I had such a problem, I solved it by using a 40 pin
80 wire ribbon cable in place of the 40 pin 40 wire ribbon cable.

cheers,

Joseph.


Re: Drive errors?

2003-11-13 Thread stan
On Fri, Nov 14, 2003 at 02:31:29AM +0100, Martin wrote:
> On Fri, 14.11.2003 at 01:20, stan wrote:
> > Nov 12 20:00:07 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 
> > 113202879 of 56601408-56601455 (ad3s1 bn 113202879; cn 7046 tn 141 sn 6)
> > retrying
> 
> I had a similar effect here yesterday (but on CURRENT) and I
> noticed that something is wrong with the allocation of interrupts
> for the PCI bus (BIOS settings). Perhaps you have the same problem as
> I do.
>
> I figured out that my Abit BE6 mainboard always assigns the same
> IRQ to the IDE controller and the card in PCI slot 3. The effect depended
> on how busy the card in slot 3 was. My TV capture card in PCI slot 3 even
> froze my PC completely after I started xawtv.
>
> Check your device listing, which comes right before the loader is
> started. Perhaps you will discover a second device using the same IRQ or
> DMA. Also check your BIOS settings.
> 
That's an interesting possibility. The machine in question has all 4 onboard
IDE places used, and 2 more on an add-in IDE card, as well as 2 SCSI
controllers.

I'll check it out, and report back.

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin


Re: Drive errors?

2003-11-13 Thread Martin
On Fri, 14.11.2003 at 01:20, stan wrote:
> Nov 12 20:00:07 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 
> 113202879 of 56601408-56601455 (ad3s1 bn 113202879; cn 7046 tn 141 sn 6)
> retrying

I had a similar effect here yesterday (but on CURRENT) and I
noticed that something is wrong with the allocation of interrupts
for the PCI bus (BIOS settings). Perhaps you have the same problem as
I do.

I figured out that my Abit BE6 mainboard always assigns the same
IRQ to the IDE controller and the card in PCI slot 3. The effect depended
on how busy the card in slot 3 was. My TV capture card in PCI slot 3 even
froze my PC completely after I started xawtv.

Check your device listing, which comes right before the loader is
started. Perhaps you will discover a second device using the same IRQ or
DMA. Also check your BIOS settings.
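
One way to spot shared interrupts once the system is up is to group the
boot messages by IRQ. A minimal sketch, using a few lines from the
dmesg output posted elsewhere in this thread (with wrapped lines
rejoined); the function name devices_by_irq is made up for the example:

```python
import re
from collections import defaultdict

# Sample lines from the dmesg output posted in this thread.
DMESG = """\
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0:  port 0xa400-0xa41f irq 10 at device 7.2 on pci0
uhci1:  port 0xa800-0xa81f irq 10 at device 7.3 on pci0
ed0:  port 0xc000-0xc01f irq 10 at device 13.0 on pci0
"""


def devices_by_irq(text):
    """Map each IRQ number to the devices that report using it."""
    shared = defaultdict(list)
    for line in text.splitlines():
        m = re.match(r"(\w+):.*\birq (\d+)\b", line)
        if m:
            shared[int(m.group(2))].append(m.group(1))
    return dict(shared)


by_irq = devices_by_irq(DMESG)
for irq, devs in sorted(by_irq.items()):
    flag = "  <-- shared" if len(devs) > 1 else ""
    print(f"irq {irq}: {', '.join(devs)}{flag}")
```

Sharing an IRQ between PCI devices is normal in itself, so a shared line
here is only a hint about where to look, not proof of a conflict.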

I had to set my IRQ-numbers directly in my BIOS, because I need
to force it to use free interrupts that the BIOS doesn't want
to use. (Seems to be a bug in the BIOS program.)

I know I can choose PNP-OS in my BIOS settings, but I don't like
FreeBSD allocating 4 or 5 devices on IRQ 10.

If someone has got an idea how I can solve it, please tell me.
(BIOS is flashed to the most recent version.) I'm already
thinking about getting a new mainboard.

Martin




Drive errors?

2003-11-13 Thread stan
I've got a relatively recent STABLE machine that I'm getting errors like this
on:


Nov 12 20:00:01 black newsyslog[33912]: logfile turned over due to size>100K
Nov 12 20:00:07 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 113202879 of 
56601408-56601455 (ad3s1 bn 113202879; cn 7046 tn 141 sn 6) retrying
Nov 12 20:00:15 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 114997439 of 
57498688-57498943 (ad3s1 bn 114997439; cn 7158 tn 66 sn 11) retrying
Nov 12 20:00:16 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115000927 of 
57500432-57500687 (ad3s1 bn 115000927; cn 7158 tn 121 sn 34) retrying
Nov 12 20:00:20 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115172415 of 
57586176-57586335 (ad3s1 bn 115172415; cn 7169 tn 38 sn 36) retrying
Nov 12 20:00:22 black /kernel: ad3s1e: UDMA ICRC error writing fsbn 115188431 of 
57594184-57594191 (ad3s1 bn 115188431; cn 7170 tn 37 sn 50) retrying
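
For reading these messages: the cn/tn/sn triple appears to be the same
block number expressed as cylinder/track/sector. A small check,
assuming a geometry of 255 heads and 63 sectors per track (inferred
from the numbers themselves, not stated anywhere in the log) and a
zero-based sector number:

```python
import re

HEADS, SECTORS = 255, 63  # assumed geometry, inferred from the numbers

# First ICRC message from the log above, rejoined onto one line.
LINE = ("ad3s1e: UDMA ICRC error writing fsbn 113202879 of "
        "56601408-56601455 (ad3s1 bn 113202879; cn 7046 tn 141 sn 6) retrying")


def chs_to_block(cn, tn, sn, heads=HEADS, sectors=SECTORS):
    """Convert cylinder/track/sector to a linear block number."""
    return (cn * heads + tn) * sectors + sn


m = re.search(r"bn (\d+); cn (\d+) tn (\d+) sn (\d+)", LINE)
bn, cn, tn, sn = map(int, m.groups())
computed = chs_to_block(cn, tn, sn)
print(bn, computed, bn == computed)
```

The second message (cn 7158 tn 66 sn 11) works out the same way to
114997439, so under this assumed geometry the triple carries no
information beyond bn.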

The drive in question is an IBM/Hitachi 40H unit detected as:

Nov 13 19:11:18 black /kernel: ad3: 39266MB  [79780/16/63] at 
ata1-slave UDMA100

At first I thought this might be a drive problem, so I downloaded IBM's
Drive Fitness Test. I've run it on this machine, in the advanced mode, and
it passes with flying colors.

So, is the disk bad anyway? Or is this some other issue? What issue could
it be? I ran the test in the exact same computer, same cables etc. to
eliminate changing them, BTW.
-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin


drive errors?

2003-01-03 Thread Bryce Newall
Hi all,

Just looking for a second opinion... does this look like my hard drive is
on its way to a slow (and possibly painful) death?

Thanks!

Jan  3 10:52:39 cosmos /kernel: swap_pager: indefinite wait buffer: device: 
#da/0x20001, blkno: 648, size: 4096
Jan  3 10:53:21 cosmos /kernel: swap_pager: indefinite wait buffer: device: 
#da/0x20001, blkno: 648, size: 4096
Jan  3 10:53:21 cosmos /kernel: (da0:ahc0:0:0:0): SCB 0x7e - timed out
Jan  3 10:53:21 cosmos /kernel: ahc0: Dumping Card State in Command phase, at SEQADDR 
0x15c
Jan  3 10:53:21 cosmos /kernel: ACCUM = 0x80, SINDEX = 0xac, DINDEX = 0xc0, ARG_2 = 
0x27
Jan  3 10:53:21 cosmos /kernel: HCNT = 0x0 SCBPTR = 0x7
Jan  3 10:53:21 cosmos /kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
Jan  3 10:53:21 cosmos /kernel: DFCNTRL = 0x4, DFSTATUS = 0x6d
Jan  3 10:53:22 cosmos /kernel: LASTPHASE = 0x80, SCSISIGI = 0x84, SXFRCTL0 = 0x88
Jan  3 10:53:22 cosmos /kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
Jan  3 10:53:22 cosmos /kernel: STACK == 0x186, 0x156, 0x0, 0x35
Jan  3 10:53:22 cosmos /kernel: SCB count = 230
Jan  3 10:53:22 cosmos /kernel: Kernel NEXTQSCB = 202
Jan  3 10:53:22 cosmos /kernel: Card NEXTQSCB = 126
Jan  3 10:53:22 cosmos /kernel: QINFIFO entries: 126 123 3 182 4
Jan  3 10:53:22 cosmos /kernel: Waiting Queue entries:
Jan  3 10:53:22 cosmos /kernel: Disconnected Queue entries:
Jan  3 10:53:22 cosmos /kernel: QOUTFIFO entries:
Jan  3 10:53:22 cosmos /kernel: Sequencer Free SCB List: 5 13 10 4 2 3 15 12 1 0 8 14 
6 9 11
Jan  3 10:53:22 cosmos /kernel: Sequencer SCB Info: 0(c 0x60, s 0x7, l 0, t 0xff) 1(c 
0x60, s 0x7, l 0, t 0xff) 2(c 0x40, s 0x57, l 0, t 0xff) 3(c 0x40, s 0x57, l 0, t 
0xff) 4(c 0x40, s 0x57, l 0, t 0xff) 5(c 0x40, s 0x57, l 0, t 0xff) 6(c 0x60, s 0x7, l 
0, t 0xff) 7(c 0x40, s 0x57, l 0, t 0x9c) 8(c 0x60, s 0x7, l 0, t 0xff) 9(c 0x60, s 
0x7, l 0, t 0xff) 10(c 0x60, s 0x7, l 0, t 0xff) 11(c 0x60, s 0x7, l 0, t 0xff) 12(c 
0x60, s 0x7, l 0, t 0xff) 13(c 0x60, s 0x7, l 0, t 0xff) 14(c 0x60, s 0x7, l 0, t 
0xff) 15(c 0x60, s 0x7, l 0, t 0xff)
Jan  3 10:53:22 cosmos /kernel: Pending list: 4(c 0x62, s 0x7, l 0), 182(c 0x60, s 
0x7, l 0), 3(c 0x60, s 0x7, l 0), 123(c 0x60, s 0x7, l 0), 126(c 0x62, s 0x7, l 0), 
156(c 0x40, s 0x57, l 0)
Jan  3 10:53:22 cosmos /kernel: Kernel Free SCB list: 93 228 80 133 18 169 196 50 87 
217 90 20 165 22 173 114 209 159 189 41 172 214 94 76 68 178 82 131 215 30 49 108 216 
86 16 84 75 11 77 14 116 89 183 45 42 211 153 121 21 63 24 125 44 70 175 103 185 229 
177 187 195 19 193 197 119 188 181 145 194 85 179 60 37 161 25 147 39 97 32 65 71 132 
34 109 141 192 67 152 128 143 99 200 171 100 226 73 204 9 227 51 206 111 218 138 31 
198 48 146 78 112 57 98 29 91 13 191 17 168 95 5 54 207 201 47 180 36 129 38 212 122 
43 117 83 61 127 115 176 55 102 144 113 139 74 52 15 58 27 213 56 79 160 69 150 96 7 
174 118 8 205 154 107 210 134 28 203 208 120 10 53 199 190 148 1 186 59 62 81 137 40 
64 104 170 135 219 6 140 155 26 184 166 157 0 142 158 2 164 106 72 162 110 105 66 167 
101 124 12 35 149 136 46 151 130 23 88 163 92 33 225 224 223 222 221 220
Jan  3 10:53:22 cosmos /kernel: Untagged Q(5): 156
Jan  3 10:53:22 cosmos /kernel: sg[0] - Addr 0x562f000 : Length 4096
Jan  3 10:53:22 cosmos /kernel: sg[1] - Addr 0x20d : Length 4096
Jan  3 10:53:22 cosmos /kernel: (da0:ahc0:0:0:0): Other SCB Timeout
Jan  3 10:53:22 cosmos /kernel: (da0:ahc0:0:0:0): SCB 0x7b - timed out
Jan  3 10:53:22 cosmos /kernel: ahc0: Dumping Card State in Command phase, at SEQADDR 
0x15c
Jan  3 10:53:22 cosmos /kernel: ACCUM = 0x80, SINDEX = 0xac, DINDEX = 0xc0, ARG_2 = 
0x27
Jan  3 10:53:22 cosmos /kernel: HCNT = 0x0 SCBPTR = 0x7
Jan  3 10:53:22 cosmos /kernel: SCSISEQ = 0x12, SBLKCTL = 0x0
Jan  3 10:53:22 cosmos /kernel: DFCNTRL = 0x4, DFSTATUS = 0x6d
Jan  3 10:53:22 cosmos /kernel: LASTPHASE = 0x80, SCSISIGI = 0x84, SXFRCTL0 = 0x88
Jan  3 10:53:22 cosmos /kernel: SSTAT0 = 0x7, SSTAT1 = 0x2
Jan  3 10:53:23 cosmos /kernel: STACK == 0x186, 0x156, 0x0, 0x35
Jan  3 10:53:23 cosmos /kernel: SCB count = 230
Jan  3 10:53:23 cosmos /kernel: Kernel NEXTQSCB = 202
Jan  3 10:53:23 cosmos /kernel: Card NEXTQSCB = 126
Jan  3 10:53:23 cosmos /kernel: QINFIFO entries: 126 123 3 182 4
Jan  3 10:53:23 cosmos /kernel: Waiting Queue entries:
Jan  3 10:53:23 cosmos /kernel: Disconnected Queue entries:
Jan  3 10:53:23 cosmos /kernel: QOUTFIFO entries:
Jan  3 10:53:23 cosmos /kernel: Sequencer Free SCB List: 5 13 10 4 2 3 15 12 1 0 8 14 
6 9 11
Jan  3 10:53:23 cosmos /kernel: Sequencer SCB Info: 0(c 0x60, s 0x7, l 0, t 0xff) 1(c 
0x60, s 0x7, l 0, t 0xff) 2(c 0x40, s 0x57, l 0, t 0xff) 3(c 0x40, s 0x57, l 0, t 
0xff) 4(c 0x40, s 0x57, l 0, t 0xff) 5(c 0x40, s 0x57, l 0, t 0xff) 6(c 0x60, s 0x7, l 
0, t 0xff) 7(c 0x40, s 0x57, l 0, t 0x9c) 8(c 0x60, s 0x7, l 0, t 0xff) 9(c 0x60, s 
0x7, l 0, t 0xff) 10(c 0x60, s 0x7, l 0, t 0xff) 11(c 0x60, s 0x7, l 0, t 0xff) 12(c 
0x60, s 0x7, l 0, t 0xff) 13(c 0x60, s 0x7, l 0, t 0xff) 14(c 0x60, s 0x7, l 0, t 
0xff) 15(c 0x60, s 0x7, l 0,