>Submitter-Id: net
>Originator: Jan van den Bosch
>Organization: net
>Confidential: no
>Synopsis: Archive QIC02 tape-unit device randomly halts.
>Severity: serious
>Priority: low
>Category: cvs
>Class: sw-bug
>Release: cvs-1.10
>Environment:
System: FreeBSD jvdbgw.icts.tue.nl 3.2-RELEASE FreeBSD 3.2-RELEASE #24: Wed Mar 8
23:00:02 CET 2000 [EMAIL PROTECTED]:/usr/src/sys/compile/XANTOS i386
>Description:
The wt QIC02 tape driver in FreeBSD 3.x has a nice bug (it may also
be present in NetBSD and other BSDs).
The wt driver sometimes ends up in a deadlock state (waiting for a
wake-up event that never occurs). It may happen when you are doing
longer backups on, I assume, some "slow" PCs like mine.
The backup does not finish correctly; it "hangs".
The code is in /sys/i386/isa/wt.c.
The solution is simple. After applying it,
the bug never occurred (for me) again.
Reason:
In the interrupt routine wtintr() there is no provision for
stopping the timeout() once all i/o (one block) is completed
(state 'i/o finished'). If an interrupt is generated at this moment,
it arrives in an unexpected state ('continue i/o'). In this state
there is of course no wakeup, so the process keeps sleeping ...
Patch:
*Always turn the timer off* when i/o is finished.
Add this line to the function body of wtintr(), before each wakeup():
untimeout(wtimer, (caddr_t)t, t->co_handler);
Remark:
In NetBSD there is no callout handle, so the patch is even
simpler.
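Sketch:
The idea behind the patch can be tried outside the kernel. What follows
is only a toy user-space model, not code from wt.c: every toy_* name is
made up for illustration, and the one-shot timer merely stands in for
the kernel's timeout()/untimeout() pair. It shows one thing: if the
timer is not cancelled when i/o finishes, a tick still arrives after
completion; if it is cancelled, no tick arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot timer, standing in for a kernel callout. */
typedef void (*toy_fn)(void *arg);

struct toy_timer {
    toy_fn fn;     /* callback to run when the timer expires */
    void  *arg;    /* argument handed to the callback */
    bool   armed;  /* true while an expiry is still pending */
};

/* Hypothetical device state; only what this sketch needs. */
struct toy_tape {
    struct toy_timer timer; /* saved handle, like co_handler in the patch */
    bool io_active;         /* a transfer is in progress */
    int  late_ticks;        /* expiries seen after i/o finished */
};

/* Arm the timer and remember it so it can be cancelled later. */
static void toy_arm(struct toy_timer *tm, toy_fn fn, void *arg)
{
    tm->fn = fn;
    tm->arg = arg;
    tm->armed = true;
}

/* Cancel a pending expiry; harmless if the timer is not armed. */
static void toy_cancel(struct toy_timer *tm)
{
    tm->armed = false;
}

/* Let time pass: run the callback once if an expiry is still pending. */
static void toy_expire(struct toy_timer *tm)
{
    if (tm->armed) {
        tm->armed = false;
        tm->fn(tm->arg);
    }
}

/* Timer callback: count expiries that arrive when no i/o is active. */
static void toy_tick(void *arg)
{
    struct toy_tape *t = arg;
    if (!t->io_active)
        t->late_ticks++;
}

/* Completion path; cancel_timer selects patched (true) or unpatched. */
static void toy_io_finished(struct toy_tape *t, bool cancel_timer)
{
    assert(t->io_active);
    t->io_active = false;
    if (cancel_timer)
        toy_cancel(&t->timer); /* the step the patch adds */
    /* the driver would call wakeup() here */
}

/* Run one transfer and return how many late ticks were observed. */
int toy_run(bool cancel_timer)
{
    struct toy_tape t = { .io_active = true, .late_ticks = 0 };
    toy_arm(&t.timer, toy_tick, &t);   /* start i/o, arm the timer */
    toy_io_finished(&t, cancel_timer); /* i/o completes first */
    toy_expire(&t.timer);              /* then the timer period elapses */
    return t.late_ticks;
}
```

In this model toy_run(false) returns 1 (a tick arrives after the i/o
has finished) and toy_run(true) returns 0. What the real driver does
with such a late tick depends on the hardware state and is not modelled
here.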
>How-To-Repeat:
The problem showed up every time I made a longer backup
on my old 486 PC system.
(It may not occur on a fast system, because it is a timing problem.)
>Fix:
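----context diff, made in the /sys/i386/isa directory. The modified wt.c
is listed first and wt.c.orig second, so the lines marked '-' are the
ones the fix adds; apply it in reverse (patch -R):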
*** wt.c Wed Mar 8 22:57:56 2000
--- wt.c.orig Fri Sep 10 10:29:44 1999
***************
*** 165,174 ****
#ifdef DEVFS
void *devfs_token_r;
#endif
- /* JVDB added 03082000 */
- struct callout_handle co_handler;
- /* Callout handler for receiving & UNSETTING timeouts */
- /**/
} wtinfo_t;
static wtinfo_t wttab[NWT]; /* tape info by unit number */
--- 165,170 ----
***************
*** 259,267 ****
{
wtinfo_t *t = wttab + id->id_unit;
- /* JVDB inserted 03082000 */
- callout_handle_init(&t->co_handler); /* installing callout */
- /**/
id->id_ointr = wtintr;
if (t->type == ARCHIVE) {
printf ("wt%d: type <Archive>\n", t->unit);
--- 255,260 ----
***************
*** 272,278 ****
t->dens = -1; /* unknown density */
isa_dmainit(t->chan, 1024);
-
#ifdef DEVFS
t->devfs_token_r =
devfs_add_devswf(&wt_cdevsw, id->id_unit, DV_CHR, 0, 0,
--- 265,270 ----
***************
*** 440,448 ****
((struct mtget*)arg)->mt_resid = 0;
((struct mtget*)arg)->mt_fileno = 0; /* file */
((struct mtget*)arg)->mt_blkno = 0; /* block */
- /* JVDB added 231294 */
- ((struct mtget*)arg)->mt_density = t->dens; /* density */
- /**/
return (0);
case MTIOCTOP:
break;
--- 432,437 ----
***************
*** 458,474 ****
case MTNOCACHE: /* disable controller cache */
return (0);
case MTREW: /* rewind */
if (t->flags & TPREW) /* rewind is running */
return (0);
if (error = wtwait (t, PCATCH, "wtorew"))
return (error);
wtrewind (t);
return (0);
- case MTOFFL: /* JVDB put the drive offline */
- /* actually do a reset ...*/
- wtreset(t);
- return 0;
- /**/
case MTFSF: /* forward space file */
for (count=((struct mtop*)arg)->mt_count; count>0; --count) {
if (error = wtwait (t, PCATCH, "wtorfm"))
--- 447,459 ----
case MTNOCACHE: /* disable controller cache */
return (0);
case MTREW: /* rewind */
+ case MTOFFL: /* rewind and put the drive offline */
if (t->flags & TPREW) /* rewind is running */
return (0);
if (error = wtwait (t, PCATCH, "wtorew"))
return (error);
wtrewind (t);
return (0);
case MTFSF: /* forward space file */
for (count=((struct mtop*)arg)->mt_count; count>0; --count) {
if (error = wtwait (t, PCATCH, "wtorfm"))
***************
*** 680,689 ****
t->flags |= TPVOL; /* end of file */
else
t->flags |= TPEXCEP; /* i/o error */
- /* JVDB added 03082000 */
- /* stop timer going off !!!!*/
- untimeout(wtimer, (caddr_t)t, t->co_handler);
- /**/
wakeup ((caddr_t)t);
return;
}
--- 665,670 ----
***************
*** 695,704 ****
}
if (t->dmacount > t->dmatotal) /* short last block */
t->dmacount = t->dmatotal;
- /* JVDB added 03082000 */
- /* stop timer going off !!!!*/
- untimeout(wtimer, (caddr_t)t, t->co_handler);
- /**/
wakeup ((caddr_t)t); /* wake up user level */
TRACE (("i/o finished, %d\n", t->dmacount));
}
--- 676,681 ----
***************
*** 867,875 ****
t->flags |= TPTIMER;
/* Some controllers seem to lose dma interrupts too often.
* To make the tape stream we need 1 tick timeout. */
- /** JVDB added 03082000; set the handler */
- t->co_handler =
- /**/
timeout (wtimer, (caddr_t)t, (t->flags & TPACTIVE) ? 1 : hz);
}
}
--- 844,849 ----