>Submitter-Id:   net
>Originator:     Jan van den Bosch
>Organization:   net
>Confidential:  no
>Synopsis:      Archive QIC02 tape-unit device randomly halts.
>Severity:      serious
>Priority:      low 
>Category:      cvs
>Class:         sw-bug
>Release:       cvs-1.10
>Environment:
        
System: FreeBSD jvdbgw.icts.tue.nl 3.2-RELEASE FreeBSD 3.2-RELEASE #24: Wed Mar 8 23:00:02 CET 2000 [EMAIL PROTECTED]:/usr/src/sys/compile/XANTOS i386


>Description:

The wt QIC02 tape driver in FreeBSD 3.x has a bug (it may also be
present in NetBSD and other BSDs).
The wt driver sometimes ends up in a deadlock state (waiting for a
wake-up event that never occurs). It may happen when you are doing
longer backups on, I assume, some "slow" PCs like mine.
The backup does not finish correctly; it hangs.
The code is in /sys/i386/isa/wt.c.
The solution is simple. After applying it,
the bug never occurred (for me) again.

Reason:
In the interrupt routine wtintr() there is no provision for
stopping the timeout() once all i/o (one block) is completed
(state 'i/o finished'). If an interrupt is generated at this moment,
it arrives in an unexpected state ('continue i/o'). In this state
there is no wakeup, so the process keeps sleeping.

Patch:
*Always turn the timer off* when i/o is finished.
Add this line to the function body of wtintr(sc):
        untimeout(wtimer, (caddr_t)t, t->co_handler);
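
The failure mode and the fix can be illustrated with a small user-space
model. This is only a sketch under my own assumptions, not driver code:
every name in it (sim_dev, sim_start_io, sim_intr, sim_tick, sim_run) is
hypothetical, and the "timer" is a plain flag rather than the kernel's
timeout()/untimeout() mechanism. It shows that a timer left armed after
the i/o has completed re-enters the handler in a state it does not
expect, and that disarming the timer on completion avoids this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device with one pending timer. */
struct sim_dev {
    void (*timer_fn)(void *);  /* callback run when the timer expires */
    bool timer_armed;          /* true while a timeout is pending */
    bool cancel_on_done;       /* true = behave like the patched driver */
    bool io_active;            /* a block transfer is in progress */
    int  wakeups;              /* wakeups delivered to the sleeper */
    int  stale_intrs;          /* handler entered with no i/o in progress */
};

/* Interrupt handler: completes the current block, if there is one. */
static void sim_intr(struct sim_dev *d)
{
    if (!d->io_active) {
        d->stale_intrs++;      /* unexpected state: nothing to wake up */
        return;
    }
    d->io_active = false;
    if (d->cancel_on_done)
        d->timer_armed = false; /* the fix: stop the timer when i/o is done */
    d->wakeups++;
}

/* Timer callback: simulates an interrupt, as a lost-interrupt guard would. */
static void sim_timer_expired(void *arg)
{
    sim_intr((struct sim_dev *)arg);
}

/* Start one block of i/o and arm the guard timer. */
static void sim_start_io(struct sim_dev *d)
{
    d->io_active = true;
    d->timer_fn = sim_timer_expired;
    d->timer_armed = true;
}

/* Advance time far enough for a pending timer to expire. */
static void sim_tick(struct sim_dev *d)
{
    if (d->timer_armed) {
        assert(d->timer_fn != NULL);
        d->timer_armed = false;
        d->timer_fn(d);
    }
}

/* Run one i/o cycle; return how often the handler ran in a stale state. */
int sim_run(bool cancel_on_done)
{
    struct sim_dev d = {0};

    d.cancel_on_done = cancel_on_done;
    sim_start_io(&d);
    sim_intr(&d);   /* the real interrupt completes the block */
    sim_tick(&d);   /* later, the guard timer would expire */
    return d.stale_intrs;
}
```

In this model sim_run(false) reports one stale handler entry and
sim_run(true) reports none. The model deliberately leaves out the sleep
itself; it only demonstrates why the timer has to be cancelled at the
point where the i/o is finished.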

Remark:
In NetBSD there is no call-out handler, so the patch is even
simpler.

>How-To-Repeat:

The problem showed up every time I made a longer backup
on my old 486 PC system.
(On a fast system it may not occur, because it is a timing problem.)

>Fix:

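----context diff, patch in /sys/i386/isa directory.
----Note: the diff runs from wt.c (patched) to wt.c.orig, so the lines
----marked '-' are the additions; apply it reversed (patch -R):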

*** wt.c        Wed Mar  8 22:57:56 2000
--- wt.c.orig   Fri Sep 10 10:29:44 1999
***************
*** 165,174 ****
  #ifdef        DEVFS
        void    *devfs_token_r;
  #endif
-       /* JVDB added 03082000 */
-       struct callout_handle co_handler;
-               /* Callout handler for receiving & UNSETTING timeouts */
-       /**/
  } wtinfo_t;
  
  static wtinfo_t wttab[NWT];                    /* tape info by unit number */
--- 165,170 ----
***************
*** 259,267 ****
  {
        wtinfo_t *t = wttab + id->id_unit;
  
- /* JVDB inserted 03082000  */
-       callout_handle_init(&t->co_handler);    /* installing callout */
- /**/
        id->id_ointr = wtintr;
        if (t->type == ARCHIVE) {
                printf ("wt%d: type <Archive>\n", t->unit);
--- 255,260 ----
***************
*** 272,278 ****
        t->dens = -1;                           /* unknown density */
        isa_dmainit(t->chan, 1024);
  
- 
  #ifdef DEVFS
        t->devfs_token_r = 
                devfs_add_devswf(&wt_cdevsw, id->id_unit, DV_CHR, 0, 0, 
--- 265,270 ----
***************
*** 440,448 ****
                ((struct mtget*)arg)->mt_resid = 0;
                ((struct mtget*)arg)->mt_fileno = 0;            /* file */
                ((struct mtget*)arg)->mt_blkno = 0;             /* block */
-               /* JVDB added 231294 */
-               ((struct mtget*)arg)->mt_density = t->dens;    /* density */
-               /**/
                return (0);
        case MTIOCTOP:
                break;
--- 432,437 ----
***************
*** 458,474 ****
        case MTNOCACHE:         /* disable controller cache */
                return (0);
        case MTREW:             /* rewind */
                if (t->flags & TPREW)   /* rewind is running */
                        return (0);
                if (error = wtwait (t, PCATCH, "wtorew"))
                        return (error);
                wtrewind (t);
                return (0);
-       case MTOFFL:            /* JVDB  put the drive offline */
-                               /* actually do a reset ...*/
-               wtreset(t);
-               return 0;
-               /**/
        case MTFSF:             /* forward space file */
                for (count=((struct mtop*)arg)->mt_count; count>0; --count) {
                        if (error = wtwait (t, PCATCH, "wtorfm"))
--- 447,459 ----
        case MTNOCACHE:         /* disable controller cache */
                return (0);
        case MTREW:             /* rewind */
+       case MTOFFL:            /* rewind and put the drive offline */
                if (t->flags & TPREW)   /* rewind is running */
                        return (0);
                if (error = wtwait (t, PCATCH, "wtorew"))
                        return (error);
                wtrewind (t);
                return (0);
        case MTFSF:             /* forward space file */
                for (count=((struct mtop*)arg)->mt_count; count>0; --count) {
                        if (error = wtwait (t, PCATCH, "wtorfm"))
***************
*** 680,689 ****
                        t->flags |= TPVOL;      /* end of file */
                else
                        t->flags |= TPEXCEP;    /* i/o error */
-       /* JVDB added 03082000 */
-               /* stop timer going off !!!!*/
-               untimeout(wtimer, (caddr_t)t, t->co_handler);
-       /**/
                wakeup ((caddr_t)t);
                return;
        }
--- 665,670 ----
***************
*** 695,704 ****
        }
        if (t->dmacount > t->dmatotal)          /* short last block */
                t->dmacount = t->dmatotal;
-       /* JVDB added 03082000 */
-               /* stop time going off !!!!*/
-               untimeout(wtimer, t, t->co_handler);
-       /**/
        wakeup ((caddr_t)t);    /* wake up user level */
        TRACE (("i/o finished, %d\n", t->dmacount));
  }
--- 676,681 ----
***************
*** 867,875 ****
                t->flags |= TPTIMER;
                /* Some controllers seem to lose dma interrupts too often.
                 * To make the tape stream we need 1 tick timeout. */
-               /** JVDB added 03082000; set the handler */
-               t->co_handler = 
-               /**/
                timeout (wtimer, (caddr_t)t, (t->flags & TPACTIVE) ? 1 : hz);
        }
  }
--- 844,849 ----
