Re: scsi tape errors

2004-03-17 Thread David Bear
Thanks for the advice.  I appreciate all the insights.

However, my experience suggests that either I'm doing things very
wrong, or I'm using the wrong hardware for BSD.

Since the kernel dumps so much information when the scsi bus behaves
strangely, isn't there some way to decode from the kernel messages
what could really be happening?  I ran a full system dump to a tape
yesterday.  The dump proceeded without error until it filled the tape,
35 GB.  When I put in a new tape, things began to act strangely.

After a full power down and reboot, things are still misbehaving.
When I attempt to read the tape the dump wrote without error, I get
i/o errors.  Even commands like "mt status" and "mt offline" cause the
input/output error message... and lots of kernel messages similar to
what I posted.

I'm beginning to wonder if my scsi card has something wrong with it.
There's got to be a better way to troubleshoot and diagnose this.

btw, I am using an Adaptec 29160 wide card.  I am using a wide cable
to connect the external tape unit, and I am using an active terminator
on the tape unit.  (It has an LED which, I guess, signifies that it is
powered.)  Are there any additional tools that I can use to check
whether this is really a tape device issue, a SCSI device issue,
etc.?

On Tue, Mar 16, 2004 at 09:18:03PM -0600, Dan Nelson wrote:
> In the last episode (Mar 16), David Bear said:
> > I am getting error messages that don't make much sense. They would
> > lead me to believe that the tape is bad... I guess. Yet, I have a
> > hard time believing that two out of four tapes are bad.
> > 
> > Issuing an 'mt erase', I get an input/output error.
> > 
> > below are the kernel messages.
> > 
> > could two tapes suddenly just become 'bad'?
> > 
> > Since these are ait tapes and have a 64k ram buffer, I'm wondering if
> > there may be some bad data there and if there is a way to clear it...
> > 
> > The tape unit is a Sony SDX-300C.  I've updated it to the latest
> > firmware. It's attached to an Adaptec 2940wide.
> 
> Is it possible that erasing an AIT tape takes more than 4 minutes? 
> That's how long the cam layer will wait for an erase command to
> complete.  Try adding
> 
> options   SA_ERASE_TIMEOUT=10*60
> 
> and rebuilding your kernel.
> 
> -- 
>   Dan Nelson
>   [EMAIL PROTECTED]

-- 
David Bear
phone:  480-965-8257
fax:    480-965-9189
College of Public Programs/ASU
Wilson Hall 232
Tempe, AZ 85287-0803
 "Beware the IP portfolio, everyone will be suspect of trespassing"
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: scsi tape errors

2004-03-16 Thread Dan Nelson
In the last episode (Mar 16), David Bear said:
> I am getting error messages that don't make much sense. They would
> lead me to believe that the tape is bad... I guess. Yet, I have a
> hard time believing that two out of four tapes are bad.
> 
> Issuing an 'mt erase', I get an input/output error.
> 
> below are the kernel messages.
> 
> could two tapes suddenly just become 'bad'?
> 
> Since these are ait tapes and have a 64k ram buffer, I'm wondering if
> there may be some bad data there and if there is a way to clear it...
> 
> The tape unit is a Sony SDX-300C.  I've updated it to the latest
> firmware. It's attached to an Adaptec 2940wide.

Is it possible that erasing an AIT tape takes more than 4 minutes? 
That's how long the cam layer will wait for an erase command to
complete.  Try adding

options SA_ERASE_TIMEOUT=10*60

and rebuilding your kernel.

-- 
Dan Nelson
[EMAIL PROTECTED]


scsi tape errors

2004-03-16 Thread David Bear
I am getting error messages that don't make much sense. They would
lead me to believe that the tape is bad... I guess. Yet, I have a hard
time believing that two out of four tapes are bad.

Issuing an 'mt erase', I get an input/output error.

below are the kernel messages.

could two tapes suddenly just become 'bad'?

Since these are ait tapes and have a 64k ram buffer, I'm wondering if
there may be some bad data there and if there is a way to clear it...

The tape unit is a Sony SDX-300C.  I've updated it to the latest
firmware. It's attached to an Adaptec 2940wide.

Any advice would be greatly appreciated.
== dmesg ==
 
... lots of stuff deleted ...

Disconnected Queue entries: 0:0 
QOUTFIFO entries: 
Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Sequencer SCB Info:
0(c 0x44, s 0xa7, l 0, t 0x0) 1(c 0x0, s 0xff, l 255, t 0xff)
2(c 0x0, s 0xff, l 255, t 0xff) 3(c 0x0, s 0xff, l 255, t 0xff)
4(c 0x0, s 0xff, l 255, t 0xff) 5(c 0x0, s 0xff, l 255, t 0xff)
6(c 0x0, s 0xff, l 255, t 0xff) 7(c 0x0, s 0xff, l 255, t 0xff)
8(c 0x0, s 0xff, l 255, t 0xff) 9(c 0x0, s 0xff, l 255, t 0xff)
10(c 0x0, s 0xff, l 255, t 0xff) 11(c 0x0, s 0xff, l 255, t 0xff)
12(c 0x0, s 0xff, l 255, t 0xff) 13(c 0x0, s 0xff, l 255, t 0xff)
14(c 0x0, s 0xff, l 255, t 0xff) 15(c 0x0, s 0xff, l 255, t 0xff)
16(c 0x0, s 0xff, l 255, t 0xff) 17(c 0x0, s 0xff, l 255, t 0xff)
18(c 0x0, s 0xff, l 255, t 0xff) 19(c 0x0, s 0xff, l 255, t 0xff)
20(c 0x0, s 0xff, l 255, t 0xff) 21(c 0x0, s 0xff, l 255, t 0xff)
22(c 0x0, s 0xff, l 255, t 0xff) 23(c 0x0, s 0xff, l 255, t 0xff)
24(c 0x0, s 0xff, l 255, t 0xff) 25(c 0x0, s 0xff, l 255, t 0xff)
26(c 0x0, s 0xff, l 255, t 0xff) 27(c 0x0, s 0xff, l 255, t 0xff)
28(c 0x0, s 0xff, l 255, t 0xff) 29(c 0x0, s 0xff, l 255, t 0xff)
30(c 0x0, s 0xff, l 255, t 0xff) 31(c 0x0, s 0xff, l 255, t 0xff)
Pending list: 0(c 0x40, s 0xa7, l 0)
Kernel Free SCB list: 15 16 17 18 19 1 2 3 4 5 6 7 8 9 13 12 11 10 
Untagged Q(10): 0 
sg[0] - Addr 0x4366000 : Length 4096
sg[1] - Addr 0x3c27000 : Length 4096
(sa0:ahc1:0:10:0): Queuing a BDR SCB
(sa0:ahc1:0:10:0): Bus Device Reset Message Sent
(sa0:ahc1:0:10:0): no longer in timeout, status = 34b
ahc1: Bus Device Reset on A:10. 1 SCBs aborted
(sa0:ahc1:0:10:0): unable to rewind after test read
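
A rough sketch of how a dump like the one above could be picked apart
mechanically, in case it helps anyone reading along.  It is not a
definitive decoder.  Two things in it are assumptions, not facts taken
from this thread: that the c/s/l/t fields in "Sequencer SCB Info" stand
for control, SCSI ID, LUN and tag, and that the SCSI ID byte carries the
target in its high nibble and the controller's own ID in its low nibble.
The function names and the SAMPLE text are made up for illustration;
SAMPLE only reuses a few entries from the dump.

```python
import re

# One "Sequencer SCB Info" entry, e.g. "0(c 0x44, s 0xa7, l 0, t 0x0)".
# ASSUMPTION: c = control, s = SCSI ID, l = LUN, t = tag.
SCB_ENTRY = re.compile(
    r"(\d+)\(c (0x[0-9a-f]+), s (0x[0-9a-f]+), l (\d+), t (0x[0-9a-f]+)\)"
)

# A CAM path prefix, e.g. "(sa0:ahc1:0:10:0)":
# peripheral driver, controller, bus, target, LUN.
CAM_PATH = re.compile(r"\((\w+):(\w+):(\d+):(\d+):(\d+)\)")

# A few lines copied from the dump, including one mail-wrapped entry.
SAMPLE = """\
Sequencer SCB Info: 0(c 0x44, s 0xa7, l 0, t 0x0) 1(c 0x0, s 0xff, l 255, t 0xff) 2(c
0x0, s 0xff, l 255, t 0xff)
(sa0:ahc1:0:10:0): Queuing a BDR SCB
"""


def active_scbs(text):
    """Return the SCB entries that are not in the all-0xff idle pattern."""
    flat = " ".join(text.split())  # undo line wrapping added by the mailer
    found = []
    for m in SCB_ENTRY.finditer(flat):
        scsiid = int(m.group(3), 16)
        tag = int(m.group(5), 16)
        if scsiid == 0xFF and tag == 0xFF:
            continue  # looks like an unused slot
        found.append({
            "scb": int(m.group(1)),
            "control": int(m.group(2), 16),
            "target": scsiid >> 4,       # ASSUMPTION: high nibble = target
            "initiator": scsiid & 0x0F,  # ASSUMPTION: low nibble = our ID
            "lun": int(m.group(4)),
            "tag": tag,
        })
    return found


def cam_paths(text):
    """Return every CAM path found in the text as a dict."""
    return [
        {"periph": m.group(1), "sim": m.group(2), "bus": int(m.group(3)),
         "target": int(m.group(4)), "lun": int(m.group(5))}
        for m in CAM_PATH.finditer(text)
    ]


if __name__ == "__main__":
    for scb in active_scbs(SAMPLE):
        print("busy SCB:", scb)
    for path in cam_paths(SAMPLE):
        print("CAM path:", path)
```

Under those assumptions, the only busy entry (s 0xa7) decodes to target
10, which agrees with the 10 in (sa0:ahc1:0:10:0).  That agreement is
consistent with the assumed field layout but does not prove it, and it
says nothing about whether the tape, the drive, the cable or the card is
at fault.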



-- 
David Bear
phone:  480-965-8257
fax:    480-965-9189
College of Public Programs/ASU
Wilson Hall 232
Tempe, AZ 85287-0803
 "Beware the IP portfolio, everyone will be suspect of trespassing"