Re: Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Jean Tourrilhes  <[EMAIL PROTECTED]> wrote:
>
>   I agree that the IrDA stack is full of irq/locking bugs (there
>is a patch of mine waiting in Dag's mailbox), but this one is not a
>bug, it's a false positive.
>   The restore_flags(flags); will restore the state of the
>interrupt register before the cli happened, so will automatically
>reenable interrupts.

Look closer. The error report is a big bogus, because it points out as
an error the "return" that is _correct_, not the "return" that is buggy.

Their checkers verify that all exists out of a function have the same
characteristics, and they found a case where one exit exists with
interrupts still disabled, while another one exists after having done a
"restore_flags()". 

So it looks like a real bug, it's just that the error is the _earlier_
return value, not the one pointed at.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Jean Tourrilhes

On Thu, Mar 22, 2001 at 03:49:31PM -0800, Junfeng Yang wrote:
> 
> Sometimes the line number reported by the checker is not correct.
> But if you go into the function, you can find the bug.

Gotcha. It in fact indicate the error at the end of the
function instead of the place where the error is. Very confusing.
So, mea culpa, I was wrong...

Jean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Jean Tourrilhes

Junfeng Yang wrote :
> Hi,
> 
> Here are yet more results from the MC project.  This checker looks for
> inconsistent usage of interrupt functions.
[...]
> -
> [BUG] error path
> 
> /u2/acc/oses/linux/2.4.1/drivers/net/irda/irport.c:943:irport_net_ioctl: 
>ERROR:INTR:947:997: Interrupts inconsistent, severity `20':997
> 
> /* Disable interrupts & save flags */
> save_flags(flags);
> Start --->
> cli();
> 
> switch (cmd) {
> case SIOCSBANDWIDTH: /* Set bandwidth */
> if (!capable(CAP_NET_ADMIN))
> return -EPERM;
> irda_task_execute(self, __irport_change_speed, NULL, NULL,
> 
> ... DELETED 40 lines ...
> 
> }
> 
> restore_flags(flags);
> 
> Error --->
> return ret;
> }
> 
> static struct net_device_stats *irport_net_get_stats(struct net_device *dev)
> {
> -

I agree that the IrDA stack is full of irq/locking bugs (there
is a patch of mine waiting in Dag's mailbox), but this one is not a
bug, it's a false positive.
The restore_flags(flags); will restore the state of the
interrupt register before the cli happened, so will automatically
reenable interrupts. The exact same code was used all over the kernel
before spinlock were introduced.

So, if you see :
save_flags(flags);
cli();
...
restore_flags(flags);
It's correct (but a bit outdated).


> -
> [BUG] error path. this bug is interesting
> 
> 
>/u2/acc/oses/linux/2.4.1/drivers/net/pcmcia/wavelan_cs.c:2561:wavelan_get_wireless_stats:
> ERROR:INTR:2528:2561: Interrupts inconsistent, severity `20':2561
> 
> 
>   /* Disable interrupts & save flags */
> Start --->
>   spin_lock_irqsave (>lock, flags);
> 
>   if(lp == (net_local *) NULL)
> return (iw_stats *) NULL;
>   wstats = >wstats;
> 
>   /* Get data from the mmc */
> 
> ... DELETED 23 lines ...
> 
> 
> #ifdef DEBUG_IOCTL_TRACE
>   printk(KERN_DEBUG "%s: <-wavelan_get_wireless_stats()\n", dev->name);
> #endif
> Error --->
>   return >wstats;
> }
> #endif  /* WIRELESS_EXT */
> 
> -

Didn't look into 2.4.1, but in 2.4.2 the irq_restore is just
above the printk, in the part that is "DELETED". It even has a nice
comments to that effect. Check the code by yourself.
So, I guess it's another false positive and a bug in your
parser. That's why it's so "interesting" ;-)

Good luck...

Jean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Jean Tourrilhes

On Thu, Mar 22, 2001 at 03:49:31PM -0800, Junfeng Yang wrote:
 
 Sometimes the line number reported by the checker is not correct.
 But if you go into the function, you can find the bug.

Gotcha. It in fact indicate the error at the end of the
function instead of the place where the error is. Very confusing.
So, mea culpa, I was wrong...

Jean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Re : [CHECKER] 28 potential interrupt errors

2001-03-22 Thread Linus Torvalds

In article [EMAIL PROTECTED],
Jean Tourrilhes  [EMAIL PROTECTED] wrote:

   I agree that the IrDA stack is full of irq/locking bugs (there
is a patch of mine waiting in Dag's mailbox), but this one is not a
bug, it's a false positive.
   The restore_flags(flags); will restore the state of the
interrupt register before the cli happened, so will automatically
reenable interrupts.

Look closer. The error report is a big bogus, because it points out as
an error the "return" that is _correct_, not the "return" that is buggy.

Their checkers verify that all exists out of a function have the same
characteristics, and they found a case where one exit exists with
interrupts still disabled, while another one exists after having done a
"restore_flags()". 

So it looks like a real bug, it's just that the error is the _earlier_
return value, not the one pointed at.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread Dawson Engler

> Request:  can the checker check for skb's being freed correctly?  The
> rules:
> 
> If an skb is in interrupt context, call dev_kfree_skb_irq.
> If an skb might be in interrupt context, call dev_kfree_skb_any.
> If an skb is not in interrupt context, call dev_kfree_skb.

It shouldn't be hard to check this.  The only thing interesting will be
deriving when you're in an interrupt context.  Thanks for the pointer.
Are there other context-sensitive rules that we could go after as well?

> I dunno WTF the programmer was thinking here...  Your de-ref checker
> should have caught this too:  check 'lp' for NULL after de-referencing
> lp->lock.

These reports are for an older version of the checker --- we've fixed
some bugs in the system, which should catch more of these errors.

Dawson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread Kai Germaschewski

On Sun, 18 Mar 2001, Andrew Morton wrote:

> I'm planning on poking through everything which has been
> identified as a posible problem.  But I won't start for
> several weeks - give the maintainers (if any) time to
> address these things.

I took a look at the ISDN issues, here's a patch which should fix most of 
it (sleeping in IRQ, and unchecked user access still missing).

I'ld appreciate it if somebody felt like looking through it. The patch is
against current CVS, but applies against current 2.4.3-pre.

--Kai

diff -ur isdn-HEAD/drivers/isdn/avmb1/capi.c isdn-h/drivers/isdn/avmb1/capi.c
--- isdn-HEAD/drivers/isdn/avmb1/capi.c Thu Mar 15 22:19:21 2001
+++ isdn-h/drivers/isdn/avmb1/capi.cSat Mar 17 18:21:23 2001
@@ -1082,6 +1082,8 @@
return -ENODEV;
 
skb = alloc_skb(count, GFP_USER);
+   if (!skb)
+   return -ENOMEM;
 
if ((retval = copy_from_user(skb_put(skb, count), buf, count))) {
kfree_skb(skb);
@@ -1501,6 +1503,8 @@
return -EINVAL;
 
skb = alloc_skb(CAPI_DATA_B3_REQ_LEN+count, GFP_USER);
+   if (!skb)
+   return -ENOMEM;
 
skb_reserve(skb, CAPI_DATA_B3_REQ_LEN);
if ((retval = copy_from_user(skb_put(skb, count), buf, count))) {
Only in isdn-h/drivers/isdn/avmb1: capi.c%
diff -ur isdn-HEAD/drivers/isdn/avmb1/capidrv.c isdn-h/drivers/isdn/avmb1/capidrv.c
--- isdn-HEAD/drivers/isdn/avmb1/capidrv.c  Thu Mar 15 17:05:56 2001
+++ isdn-h/drivers/isdn/avmb1/capidrv.c Sat Mar 17 18:23:33 2001
@@ -2065,8 +2065,8 @@
__u16 datahandle;
 
if (!card) {
-   printk(KERN_ERR "capidrv-%d: if_sendbuf called with invalid driverId 
%d!\n",
-  card->contrnr, id);
+   printk(KERN_ERR "capidrv: if_sendbuf called with invalid driverId 
+%d!\n",
+  id);
return 0;
}
if (debugmode > 1)
@@ -2137,8 +2137,8 @@
__u8 *p;
 
if (!card) {
-   printk(KERN_ERR "capidrv-%d: if_readstat called with invalid driverId 
%d!\n",
-  card->contrnr, id);
+   printk(KERN_ERR "capidrv: if_readstat called with invalid driverId 
+%d!\n",
+  id);
return -ENODEV;
}
 
Only in isdn-h/drivers/isdn/avmb1: capidrv.c%
diff -ur isdn-HEAD/drivers/isdn/hisax/config.c isdn-h/drivers/isdn/hisax/config.c
--- isdn-HEAD/drivers/isdn/hisax/config.c   Thu Mar 15 22:19:21 2001
+++ isdn-h/drivers/isdn/hisax/config.c  Sat Mar 17 18:07:41 2001
@@ -925,13 +925,12 @@
 
save_flags(flags);
cli();
-   if (!(cs = (struct IsdnCardState *)
-   kmalloc(sizeof(struct IsdnCardState), GFP_ATOMIC))) {
+   cs = kmalloc(sizeof(struct IsdnCardState), GFP_ATOMIC)
+   if (!cs) {
printk(KERN_WARNING
   "HiSax: No memory for IsdnCardState(card %d)\n",
   cardnr + 1);
-   restore_flags(flags);
-   return (0);
+   goto out;
}
memset(cs, 0, sizeof(struct IsdnCardState));
card->cs = cs;
@@ -950,241 +949,235 @@
 #endif
cs->protocol = card->protocol;
 
-   if ((card->typ > 0) && (card->typ <= ISDN_CTYPE_COUNT)) {
-   if (!(cs->dlog = kmalloc(MAX_DLOG_SPACE, GFP_ATOMIC))) {
-   printk(KERN_WARNING
-   "HiSax: No memory for dlog(card %d)\n",
-   cardnr + 1);
-   restore_flags(flags);
-   return (0);
-   }
-   if (!(cs->status_buf = kmalloc(HISAX_STATUS_BUFSIZE, GFP_ATOMIC))) {
-   printk(KERN_WARNING
-   "HiSax: No memory for status_buf(card %d)\n",
-   cardnr + 1);
-   kfree(cs->dlog);
-   restore_flags(flags);
-   return (0);
-   }
-   cs->stlist = NULL;
-   cs->status_read = cs->status_buf;
-   cs->status_write = cs->status_buf;
-   cs->status_end = cs->status_buf + HISAX_STATUS_BUFSIZE - 1;
-   cs->typ = card->typ;
-   strcpy(cs->iif.id, id);
-   cs->iif.channels = 2;
-   cs->iif.maxbufsize = MAX_DATA_SIZE;
-   cs->iif.hl_hdrlen = MAX_HEADER_LEN;
-   cs->iif.features =
-   ISDN_FEATURE_L2_X75I |
-   ISDN_FEATURE_L2_HDLC |
-   ISDN_FEATURE_L2_HDLC_56K |
-   ISDN_FEATURE_L2_TRANS |
-   ISDN_FEATURE_L3_TRANS |
+   if (card->typ <= 0 || card->typ > ISDN_CTYPE_COUNT) {
+   printk(KERN_WARNING
+  "HiSax: Card Type %d out of range\n",
+  card->typ);
+   goto outf_cs;
+   }
+   if (!(cs->dlog = 

Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread David Woodhouse

On Sun, 18 Mar 2001, Andrew Morton wrote:

> There's another thing which needs doing to n_r3964.c, BTW - the
> abuse of task queues in r3964_close().  This is, I think, the
> only client of task queues which needs to poke so deeply into
> the implementation internals and Linus has mentioned something about
> needing to redesign the task queues in 2.5.  So n_r3964 needs
> somehow to be redesigned so that it can use standard APIs.

Hmmm. That particular piece of ugliness seems to have been particularly 
gratuitous. We had two task queues, each of which would decrement a 
countdown value, calling on_timeout() if it reached zero, and stick the 
other on the tq_timer list.

Is there anyone out there that can test this and save me the trouble of 
trying to remember how to drive it?

Index: drivers/char/n_r3964.c
===
RCS file: /inst/cvs/linux/drivers/char/Attic/n_r3964.c,v
retrieving revision 1.1.2.7
diff -u -r1.1.2.7 n_r3964.c
--- drivers/char/n_r3964.c  2001/02/24 19:01:19 1.1.2.7
+++ drivers/char/n_r3964.c  2001/03/18 13:02:49
@@ -13,6 +13,12 @@
  * L. Haag
  *
  * $Log: n_r3964.c,v $
+ * Revision 1.10  2001/03/18 13:02:24  dwmw2
+ * Fix timer usage, use spinlocks properly.
+ *
+ * Revision 1.9  2001/03/18 12:52:14  dwmw2
+ * Merge changes in 2.4.2
+ *
  * Revision 1.8  2000/03/23 14:14:54  dwmw2
  * Fix race in sleeping in r3964_read()
  *
@@ -110,8 +116,6 @@
 #define TRACE_Q(fmt, arg...) /**/
 #endif
 
-static void on_timer_1(void*);
-static void on_timer_2(void*);
 static void add_tx_queue(struct r3964_info *, struct r3964_block_header *);
 static void remove_from_tx_queue(struct r3964_info *pInfo, int error_code);
 static void put_char(struct r3964_info *pInfo, unsigned char ch);
@@ -120,7 +124,7 @@
 static void transmit_block(struct r3964_info *pInfo);
 static void receive_char(struct r3964_info *pInfo, const unsigned char c);
 static void receive_error(struct r3964_info *pInfo, const char flag);
-static void on_timeout(struct r3964_info *pInfo);
+static void on_timeout(unsigned long priv);
 static int enable_signals(struct r3964_info *pInfo, pid_t pid, int arg);
 static int read_telegram(struct r3964_info *pInfo, pid_t pid, unsigned char *buf);
 static void add_msg(struct r3964_client_info *pClient, int msg_id, int arg,
@@ -217,7 +221,7 @@
 {
int status;

-   printk ("r3964: Philips r3964 Driver $Revision: 1.8 $\n");
+   printk ("r3964: Philips r3964 Driver $Revision: 1.10 $\n");
 
/*
 * Register the tty line discipline
@@ -247,40 +251,11 @@
  * Protocol implementation routines
  */
 
-static void on_timer_1(void *arg)
-{
-   struct r3964_info *pInfo = (struct r3964_info *)arg;
-  
-   if(pInfo->count_down)
-   {
-  if(!--pInfo->count_down)
-  {
- on_timeout(pInfo);
-  }
-   }
-   queue_task(>bh_2, _timer);
-}
-
-static void on_timer_2(void *arg)
-{
-   struct r3964_info *pInfo = (struct r3964_info *)arg;
-  
-   if(pInfo->count_down)
-   {
-  if(!--pInfo->count_down)
-  {
- on_timeout(pInfo);
-  }
-   }
-   queue_task(>bh_1, _timer);
-}
-
 static void add_tx_queue(struct r3964_info *pInfo, struct r3964_block_header *pHeader)
 {
unsigned long flags;

-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(>lock, flags);
 
pHeader->next = NULL;
 
@@ -294,7 +269,7 @@
   pInfo->tx_last = pHeader;
}

-   restore_flags(flags);
+   spin_unlock_irqrestore(>lock, flags);
 
TRACE_Q("add_tx_queue %x, length %d, tx_first = %x", 
   (int)pHeader, pHeader->length, (int)pInfo->tx_first );
@@ -337,8 +312,7 @@
   wake_up_interruptible (>read_wait);
}
 
-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(>lock, flags);
 
pInfo->tx_first = pHeader->next;
if(pInfo->tx_first==NULL)
@@ -346,7 +320,7 @@
   pInfo->tx_last = NULL;
}
 
-   restore_flags(flags);
+   spin_unlock_irqrestore(>lock, flags);
 
kfree(pHeader);
TRACE_M("remove_from_tx_queue - kfree %x",(int)pHeader);
@@ -359,8 +333,7 @@
 {
unsigned long flags;

-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(>lock, flags);
 
pHeader->next = NULL;
 
@@ -375,7 +348,7 @@
}
pInfo->blocks_in_rx_queue++;

-   restore_flags(flags);
+   spin_unlock_irqrestore(>lock, flags);
 
TRACE_Q("add_rx_queue: %x, length = %d, rx_first = %x, count = %d",
   (int)pHeader, pHeader->length,
@@ -396,8 +369,7 @@
TRACE_Q("remove_from_rx_queue: %x, length %d",
   (int)pHeader, (int)pHeader->length );
 
-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(>lock, flags);
 
if(pInfo->rx_first == pHeader)
{
@@ -430,7 +402,7 @@
   }
}
 
-   restore_flags(flags);
+   spin_unlock_irqrestore(>lock, flags);
 
kfree(pHeader);
TRACE_M("remove_from_rx_queue - kfree %x",(int)pHeader);
@@ -471,17 +443,16 @@
unsigned long flags;

 
-   save_flags(flags);
-   

Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread Jeff Garzik

Junfeng Yang wrote:
> [BUG] error path
> 
> /u2/acc/oses/linux/2.4.1/drivers/net/appletalk/cops.c:776:cops_rx: 
>ERROR:INTR:763:804: Interrupts inconsistent, severity `20':804

Fixed.

Request:  can the checker check for skb's being freed correctly?  The
rules:

If an skb is in interrupt context, call dev_kfree_skb_irq.
If an skb might be in interrupt context, call dev_kfree_skb_any.
If an skb is not in interrupt context, call dev_kfree_skb.

I also found and fixed an error of this sort on cops.c as well.

> [BUG] error path. this bug is interesting
> 
> 
>/u2/acc/oses/linux/2.4.1/drivers/net/pcmcia/wavelan_cs.c:2561:wavelan_get_wireless_stats:
> ERROR:INTR:2528:2561: Interrupts inconsistent, severity `20':2561
> 
>   /* Disable interrupts & save flags */
> Start --->
>   spin_lock_irqsave (>lock, flags);
> 
>   if(lp == (net_local *) NULL)
> return (iw_stats *) NULL;

Fixed.

I dunno WTF the programmer was thinking here...  Your de-ref checker
should have caught this too:  check 'lp' for NULL after de-referencing
lp->lock.

> [BUG] error path
> 
> /u2/acc/oses/linux/2.4.1/drivers/net/tokenring/smctr.c:3655:smctr_open_tr: 
>ERROR:INTR:3594:3661: Interrupts inconsistent, severity `20':3661

Seems to be fixed already.

-- 
Jeff Garzik   | May you have warm words on a cold evening,
Building 1024 | a full mooon on a dark night,
MandrakeSoft  | and a smooth road all the way to your door.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread Jeff Garzik

Junfeng Yang wrote:
 [BUG] error path
 
 /u2/acc/oses/linux/2.4.1/drivers/net/appletalk/cops.c:776:cops_rx: 
ERROR:INTR:763:804: Interrupts inconsistent, severity `20':804

Fixed.

Request:  can the checker check for skb's being freed correctly?  The
rules:

If an skb is in interrupt context, call dev_kfree_skb_irq.
If an skb might be in interrupt context, call dev_kfree_skb_any.
If an skb is not in interrupt context, call dev_kfree_skb.

I also found and fixed an error of this sort on cops.c as well.

 [BUG] error path. this bug is interesting
 
 
/u2/acc/oses/linux/2.4.1/drivers/net/pcmcia/wavelan_cs.c:2561:wavelan_get_wireless_stats:
 ERROR:INTR:2528:2561: Interrupts inconsistent, severity `20':2561
 
   /* Disable interrupts  save flags */
 Start ---
   spin_lock_irqsave (lp-lock, flags);
 
   if(lp == (net_local *) NULL)
 return (iw_stats *) NULL;

Fixed.

I dunno WTF the programmer was thinking here...  Your de-ref checker
should have caught this too:  check 'lp' for NULL after de-referencing
lp-lock.

 [BUG] error path
 
 /u2/acc/oses/linux/2.4.1/drivers/net/tokenring/smctr.c:3655:smctr_open_tr: 
ERROR:INTR:3594:3661: Interrupts inconsistent, severity `20':3661

Seems to be fixed already.

-- 
Jeff Garzik   | May you have warm words on a cold evening,
Building 1024 | a full mooon on a dark night,
MandrakeSoft  | and a smooth road all the way to your door.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread David Woodhouse

On Sun, 18 Mar 2001, Andrew Morton wrote:

 There's another thing which needs doing to n_r3964.c, BTW - the
 abuse of task queues in r3964_close().  This is, I think, the
 only client of task queues which needs to poke so deeply into
 the implementation internals and Linus has mentioned something about
 needing to redesign the task queues in 2.5.  So n_r3964 needs
 somehow to be redesigned so that it can use standard APIs.

Hmmm. That particular piece of ugliness seems to have been particularly 
gratuitous. We had two task queues, each of which would decrement a 
countdown value, calling on_timeout() if it reached zero, and stick the 
other on the tq_timer list.

Is there anyone out there that can test this and save me the trouble of 
trying to remember how to drive it?

Index: drivers/char/n_r3964.c
===
RCS file: /inst/cvs/linux/drivers/char/Attic/n_r3964.c,v
retrieving revision 1.1.2.7
diff -u -r1.1.2.7 n_r3964.c
--- drivers/char/n_r3964.c  2001/02/24 19:01:19 1.1.2.7
+++ drivers/char/n_r3964.c  2001/03/18 13:02:49
@@ -13,6 +13,12 @@
  * L. Haag
  *
  * $Log: n_r3964.c,v $
+ * Revision 1.10  2001/03/18 13:02:24  dwmw2
+ * Fix timer usage, use spinlocks properly.
+ *
+ * Revision 1.9  2001/03/18 12:52:14  dwmw2
+ * Merge changes in 2.4.2
+ *
  * Revision 1.8  2000/03/23 14:14:54  dwmw2
  * Fix race in sleeping in r3964_read()
  *
@@ -110,8 +116,6 @@
 #define TRACE_Q(fmt, arg...) /**/
 #endif
 
-static void on_timer_1(void*);
-static void on_timer_2(void*);
 static void add_tx_queue(struct r3964_info *, struct r3964_block_header *);
 static void remove_from_tx_queue(struct r3964_info *pInfo, int error_code);
 static void put_char(struct r3964_info *pInfo, unsigned char ch);
@@ -120,7 +124,7 @@
 static void transmit_block(struct r3964_info *pInfo);
 static void receive_char(struct r3964_info *pInfo, const unsigned char c);
 static void receive_error(struct r3964_info *pInfo, const char flag);
-static void on_timeout(struct r3964_info *pInfo);
+static void on_timeout(unsigned long priv);
 static int enable_signals(struct r3964_info *pInfo, pid_t pid, int arg);
 static int read_telegram(struct r3964_info *pInfo, pid_t pid, unsigned char *buf);
 static void add_msg(struct r3964_client_info *pClient, int msg_id, int arg,
@@ -217,7 +221,7 @@
 {
int status;

-   printk ("r3964: Philips r3964 Driver $Revision: 1.8 $\n");
+   printk ("r3964: Philips r3964 Driver $Revision: 1.10 $\n");
 
/*
 * Register the tty line discipline
@@ -247,40 +251,11 @@
  * Protocol implementation routines
  */
 
-static void on_timer_1(void *arg)
-{
-   struct r3964_info *pInfo = (struct r3964_info *)arg;
-  
-   if(pInfo-count_down)
-   {
-  if(!--pInfo-count_down)
-  {
- on_timeout(pInfo);
-  }
-   }
-   queue_task(pInfo-bh_2, tq_timer);
-}
-
-static void on_timer_2(void *arg)
-{
-   struct r3964_info *pInfo = (struct r3964_info *)arg;
-  
-   if(pInfo-count_down)
-   {
-  if(!--pInfo-count_down)
-  {
- on_timeout(pInfo);
-  }
-   }
-   queue_task(pInfo-bh_1, tq_timer);
-}
-
 static void add_tx_queue(struct r3964_info *pInfo, struct r3964_block_header *pHeader)
 {
unsigned long flags;

-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(pInfo-lock, flags);
 
pHeader-next = NULL;
 
@@ -294,7 +269,7 @@
   pInfo-tx_last = pHeader;
}

-   restore_flags(flags);
+   spin_unlock_irqrestore(pInfo-lock, flags);
 
TRACE_Q("add_tx_queue %x, length %d, tx_first = %x", 
   (int)pHeader, pHeader-length, (int)pInfo-tx_first );
@@ -337,8 +312,7 @@
   wake_up_interruptible (pInfo-read_wait);
}
 
-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(pInfo-lock, flags);
 
pInfo-tx_first = pHeader-next;
if(pInfo-tx_first==NULL)
@@ -346,7 +320,7 @@
   pInfo-tx_last = NULL;
}
 
-   restore_flags(flags);
+   spin_unlock_irqrestore(pInfo-lock, flags);
 
kfree(pHeader);
TRACE_M("remove_from_tx_queue - kfree %x",(int)pHeader);
@@ -359,8 +333,7 @@
 {
unsigned long flags;

-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(pInfo-lock, flags);
 
pHeader-next = NULL;
 
@@ -375,7 +348,7 @@
}
pInfo-blocks_in_rx_queue++;

-   restore_flags(flags);
+   spin_unlock_irqrestore(pInfo-lock, flags);
 
TRACE_Q("add_rx_queue: %x, length = %d, rx_first = %x, count = %d",
   (int)pHeader, pHeader-length,
@@ -396,8 +369,7 @@
TRACE_Q("remove_from_rx_queue: %x, length %d",
   (int)pHeader, (int)pHeader-length );
 
-   save_flags(flags);
-   cli();
+   spin_lock_irqsave(pInfo-lock, flags);
 
if(pInfo-rx_first == pHeader)
{
@@ -430,7 +402,7 @@
   }
}
 
-   restore_flags(flags);
+   spin_unlock_irqrestore(pInfo-lock, flags);
 
kfree(pHeader);
TRACE_M("remove_from_rx_queue - kfree %x",(int)pHeader);
@@ -471,17 +443,16 @@
unsigned long flags;
 

Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread Kai Germaschewski

On Sun, 18 Mar 2001, Andrew Morton wrote:

 I'm planning on poking through everything which has been
 identified as a posible problem.  But I won't start for
 several weeks - give the maintainers (if any) time to
 address these things.

I took a look at the ISDN issues, here's a patch which should fix most of 
it (sleeping in IRQ, and unchecked user access still missing).

I'ld appreciate it if somebody felt like looking through it. The patch is
against current CVS, but applies against current 2.4.3-pre.

--Kai

diff -ur isdn-HEAD/drivers/isdn/avmb1/capi.c isdn-h/drivers/isdn/avmb1/capi.c
--- isdn-HEAD/drivers/isdn/avmb1/capi.c Thu Mar 15 22:19:21 2001
+++ isdn-h/drivers/isdn/avmb1/capi.cSat Mar 17 18:21:23 2001
@@ -1082,6 +1082,8 @@
return -ENODEV;
 
skb = alloc_skb(count, GFP_USER);
+   if (!skb)
+   return -ENOMEM;
 
if ((retval = copy_from_user(skb_put(skb, count), buf, count))) {
kfree_skb(skb);
@@ -1501,6 +1503,8 @@
return -EINVAL;
 
skb = alloc_skb(CAPI_DATA_B3_REQ_LEN+count, GFP_USER);
+   if (!skb)
+   return -ENOMEM;
 
skb_reserve(skb, CAPI_DATA_B3_REQ_LEN);
if ((retval = copy_from_user(skb_put(skb, count), buf, count))) {
Only in isdn-h/drivers/isdn/avmb1: capi.c%
diff -ur isdn-HEAD/drivers/isdn/avmb1/capidrv.c isdn-h/drivers/isdn/avmb1/capidrv.c
--- isdn-HEAD/drivers/isdn/avmb1/capidrv.c  Thu Mar 15 17:05:56 2001
+++ isdn-h/drivers/isdn/avmb1/capidrv.c Sat Mar 17 18:23:33 2001
@@ -2065,8 +2065,8 @@
__u16 datahandle;
 
if (!card) {
-   printk(KERN_ERR "capidrv-%d: if_sendbuf called with invalid driverId 
%d!\n",
-  card-contrnr, id);
+   printk(KERN_ERR "capidrv: if_sendbuf called with invalid driverId 
+%d!\n",
+  id);
return 0;
}
if (debugmode  1)
@@ -2137,8 +2137,8 @@
__u8 *p;
 
if (!card) {
-   printk(KERN_ERR "capidrv-%d: if_readstat called with invalid driverId 
%d!\n",
-  card-contrnr, id);
+   printk(KERN_ERR "capidrv: if_readstat called with invalid driverId 
+%d!\n",
+  id);
return -ENODEV;
}
 
Only in isdn-h/drivers/isdn/avmb1: capidrv.c%
diff -ur isdn-HEAD/drivers/isdn/hisax/config.c isdn-h/drivers/isdn/hisax/config.c
--- isdn-HEAD/drivers/isdn/hisax/config.c   Thu Mar 15 22:19:21 2001
+++ isdn-h/drivers/isdn/hisax/config.c  Sat Mar 17 18:07:41 2001
@@ -925,13 +925,12 @@
 
save_flags(flags);
cli();
-   if (!(cs = (struct IsdnCardState *)
-   kmalloc(sizeof(struct IsdnCardState), GFP_ATOMIC))) {
+   cs = kmalloc(sizeof(struct IsdnCardState), GFP_ATOMIC)
+   if (!cs) {
printk(KERN_WARNING
   "HiSax: No memory for IsdnCardState(card %d)\n",
   cardnr + 1);
-   restore_flags(flags);
-   return (0);
+   goto out;
}
memset(cs, 0, sizeof(struct IsdnCardState));
card-cs = cs;
@@ -950,241 +949,235 @@
 #endif
cs-protocol = card-protocol;
 
-   if ((card-typ  0)  (card-typ = ISDN_CTYPE_COUNT)) {
-   if (!(cs-dlog = kmalloc(MAX_DLOG_SPACE, GFP_ATOMIC))) {
-   printk(KERN_WARNING
-   "HiSax: No memory for dlog(card %d)\n",
-   cardnr + 1);
-   restore_flags(flags);
-   return (0);
-   }
-   if (!(cs-status_buf = kmalloc(HISAX_STATUS_BUFSIZE, GFP_ATOMIC))) {
-   printk(KERN_WARNING
-   "HiSax: No memory for status_buf(card %d)\n",
-   cardnr + 1);
-   kfree(cs-dlog);
-   restore_flags(flags);
-   return (0);
-   }
-   cs-stlist = NULL;
-   cs-status_read = cs-status_buf;
-   cs-status_write = cs-status_buf;
-   cs-status_end = cs-status_buf + HISAX_STATUS_BUFSIZE - 1;
-   cs-typ = card-typ;
-   strcpy(cs-iif.id, id);
-   cs-iif.channels = 2;
-   cs-iif.maxbufsize = MAX_DATA_SIZE;
-   cs-iif.hl_hdrlen = MAX_HEADER_LEN;
-   cs-iif.features =
-   ISDN_FEATURE_L2_X75I |
-   ISDN_FEATURE_L2_HDLC |
-   ISDN_FEATURE_L2_HDLC_56K |
-   ISDN_FEATURE_L2_TRANS |
-   ISDN_FEATURE_L3_TRANS |
+   if (card-typ = 0 || card-typ  ISDN_CTYPE_COUNT) {
+   printk(KERN_WARNING
+  "HiSax: Card Type %d out of range\n",
+  card-typ);
+   goto outf_cs;
+   }
+   if (!(cs-dlog = kmalloc(MAX_DLOG_SPACE, GFP_ATOMIC))) {
+  

Re: [CHECKER] 28 potential interrupt errors

2001-03-18 Thread Dawson Engler

 Request:  can the checker check for skb's being freed correctly?  The
 rules:
 
 If an skb is in interrupt context, call dev_kfree_skb_irq.
 If an skb might be in interrupt context, call dev_kfree_skb_any.
 If an skb is not in interrupt context, call dev_kfree_skb.

It shouldn't be hard to check this.  The only thing interesting will be
deriving when you're in an interrupt context.  Thanks for the pointer.
Are there other context-sensitive rules that we could go after as well?

 I dunno WTF the programmer was thinking here...  Your de-ref checker
 should have caught this too:  check 'lp' for NULL after de-referencing
 lp-lock.

These reports are for an older version of the checker --- we've fixed
some bugs in the system, which should catch more of these errors.

Dawson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread Andrew Morton

David Woodhouse wrote:
> 
> [ n_r3964.c stuff ]
> ...
> akpm, were you looking at this?

I'm planning on poking through everything which has been
identified as a posible problem.  But I won't start for
several weeks - give the maintainers (if any) time to
address these things.

So.. please go ahead :)

There's another thing which needs doing to n_r3964.c, BTW - the
abuse of task queues in r3964_close().  This is, I think, the
only client of task queues which needs to poke so deeply into
the implementation internals and Linus has mentioned something about
needing to redesign the task queues in 2.5.  So n_r3964 needs
somehow to be redesigned so that it can use standard APIs.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread David Woodhouse

On Fri, 16 Mar 2001, Junfeng Yang wrote:

> -
> [BUG] return with int disabled in an error path
> 
> /u2/acc/oses/linux/2.4.1/drivers/char/n_r3964.c:1036:add_msg: ERROR:INTR:990:995: 
>Interrupts inconsistent, severity `20':995
> 
> 
>   save_flags(flags);
> Start --->
>   cli();
> 
>   pMsg = kmalloc(sizeof(struct r3964_message), GFP_KERNEL);
>   TRACE_M("add_msg - kmalloc %x",(int)pMsg);
> Return without
> enabling interrupt
>   --->
>   if(pMsg==NULL)
>  return;
> 
> 
>   ... DELETED 44 lines ...
> 
>if(pClient->sig_flags & R3964_USE_SIGIO)
>{
>   kill_proc(pClient->pid, SIGIO, 1);
>}
> Error --->
> }
> 
> static struct r3964_message *remove_msg(struct r3964_info *pInfo,
>struct r3964_client_info *pClient)
> {
> -


The simple 'fix' is to move the offending kmalloc() up above the cli(). 
That might actually be something else to make an automated test for - 
anything which schedules can re-enable interrupts. In general, it's bad to 
call anything which can schedule when interrupts are disabled.

But actually, the cli() is there because of the fundamentally flawed 
assumption that it provides sufficient locking. I've converted the whole 
thing to use spinlocks instead, but haven't got a test setup ATM. I'll 
poke at it more on Monday.

akpm, were you looking at this?

-- 
dwmw2



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread Junfeng Yang

> Your reporting is a little misleading here.

Thanks for verifying these bugs ;)

The interrupt checker checks for inconsistent interrupt states. For
example, if a function has one exit point with interrupt disabled, and
another exit point with interrupt enabled, the checker will report an
error at the second exit point. The code snippets are automatically culled
from the source based on the line number in the error report. So the
reporting is sometimes misleading. I'll put the actuall line number in the
comments.

>
> Yes, there's a bug in this function - the `return -EPERM'
> doesn't do a `restore_flags()'.  But there is no bug
> in the place you've reported.
>
> (Personally, I think *any* C function which has more than
>  one `return' statement is a bug, and we see a classic
>  instance here of one of the problems which this practice
>  can cause.  Religious issue. )


It may be better to have two exit points, one for normal path and one for
error path. All error handling code can be put at the end of the function.

>
>
> > ...
>
> > [BUG] error path
> >
> > /u2/acc/oses/linux/2.4.1/drivers/net/wan/comx-hw-mixcom.c:505:MIXCOM_open: 
>ERROR:INTR:514:562: Interrupts inconsistent, severity `20':562
> >
> > }
> >
> > Start --->
> > save_flags(flags); cli();
> >
> > if(hw->channel==1) {
> > request_region(dev->base_addr, MIXCOM_IO_EXTENT, dev->name);
> > }
> >
> > if(hw->channel==0 && !(ch->init_status & IRQ_ALLOCATED)) {
> >
> > ... DELETED 38 lines ...
> >
> > procfile->mode = S_IFREG |  0444;
> > }
> > }
> >
> > Error --->
> > return 0;
> > }
>
> There's another problem here.  We're calling request_region()
> inside cli().  request_region() can sleep.
>
> On SMP, cli() does all sorts of bizarre things which are
> quite different between different architectures.  I don't
> know if this practice is actually unsafe on any architectures,
> but it is really bad practice.  It's certainly the case that
> schedue() will enable interrupts for you, so whatever you're
> protecting won't be protected!
>
> So I'd add `sleep inside cli()' to your list of things to
> look out for.
>
> Does your tool have the ability to track which functions
> can and can't sleep?  This is a very common source of bug.
> Grab a spinlock, then call a function which calls a function
> which calls a function which calls kmalloc(GFP_KERNEL).  Unless
> the spinlock is always protected by a semaphore, this can deadlock.
>
> >
> > /u2/acc/oses/linux/2.4.1/drivers/scsi/eata_dma.c:490:eata_queue: 
>ERROR:INTR:464:506: Interrupts inconsistent, severity `20':506
> >
> > save_flags(flags);
> > Start --->
> > cli();
> >
> > #if 0
> > for (x = 1, sh = first_HBA; x <= registered_HBAs; x++, sh = SD(sh)->next) {
> >   if(inb((uint)sh->base + HA_RAUXSTAT) & HA_AIRQ) {
> > printk("eata_dma: scsi%d interrupt pending in eata_queue.\n"
> >"  Calling interrupt handler.\n", sh->host_no);
> >
> > ... DELETED 32 lines ...
> >
> > printk(KERN_CRIT "eata_queue pid %ld, HBA QUEUE FULL..., "
> >"returning DID_BUS_BUSY\n", cmd->pid));
> > done(cmd);
> > restore_flags(flags);
> > Error --->
> > return(0);
> > }
> > ccb = >ccb[y];
>
> Something's gone a little wrong here.  The bug is in fact about
> 20 lines higher up.
>
>
> Generally: yes, everything you've found needs fixing.
>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread Andrew Morton

Junfeng Yang wrote:
> 
> ...
> 
> [BUG] sleep_or_timeout will call interruptible_sleep_on, which will save disabled 
>flags and then restore them.
> 
> /u2/acc/oses/linux/2.4.1/drivers/cdrom/cm206.c:474:send_command: ERROR:INTR:462:474: 
>Interrupts inconsistent, severity `20':474
> 
>   if (!(inw(r_line_status) & ls_transmitter_buffer_empty)) {
> cd->command = command;
> Start --->
> cli();  /* don't interrupt before sleep */
> outw(dc_mask_sync_error | dc_no_stop_on_error |
>  (inw(r_data_status) & 0x7f), r_data_control);
> /* interrupt routine sends command */
> 
> Save & Restore
> flags here --->
> if (sleep_or_timeout(>uart, UART_TIMEOUT)) {
>   debug(("Time out on write-buffer\n"));
>   stats(write_timeout);
> 
> ... DELETED 2 lines ...
> 
> }
> debug(("Write commmand delayed\n"));
>   }
>   else outw(command, r_uart_transmit);
> Error --->
> }

Yes, this is a bug.

> ...

> /u2/acc/oses/linux/2.4.1/drivers/net/irda/irport.c:943:irport_net_ioctl: 
>ERROR:INTR:947:997: Interrupts inconsistent, severity `20':997
> 
> /* Disable interrupts & save flags */
> save_flags(flags);
> Start --->
> cli();
> 
> switch (cmd) {
> case SIOCSBANDWIDTH: /* Set bandwidth */
> if (!capable(CAP_NET_ADMIN))
> return -EPERM;
> irda_task_execute(self, __irport_change_speed, NULL, NULL,
> 
> ... DELETED 40 lines ...
> 
> }
> 
> restore_flags(flags);
> 
> Error --->
> return ret;
> }

Your reporting is a little misleading here.

Yes, there's a bug in this function - the `return -EPERM'
doesn't do a `restore_flags()'.  But there is no bug
in the place you've reported.

(Personally, I think *any* C function which has more than
 one `return' statement is a bug, and we see a classic
 instance here of one of the problems which this practice
 can cause.  Religious issue. )


> ...

> [BUG] error path
> 
> /u2/acc/oses/linux/2.4.1/drivers/net/wan/comx-hw-mixcom.c:505:MIXCOM_open: 
>ERROR:INTR:514:562: Interrupts inconsistent, severity `20':562
> 
> }
> 
> Start --->
> save_flags(flags); cli();
> 
> if(hw->channel==1) {
> request_region(dev->base_addr, MIXCOM_IO_EXTENT, dev->name);
> }
> 
> if(hw->channel==0 && !(ch->init_status & IRQ_ALLOCATED)) {
> 
> ... DELETED 38 lines ...
> 
> procfile->mode = S_IFREG |  0444;
> }
> }
> 
> Error --->
> return 0;
> }

There's another problem here.  We're calling request_region()
inside cli().  request_region() can sleep.

On SMP, cli() does all sorts of bizarre things which are
quite different between different architectures.  I don't
know if this practice is actually unsafe on any architectures,
but it is really bad practice.  It's certainly the case that
schedue() will enable interrupts for you, so whatever you're
protecting won't be protected!

So I'd add `sleep inside cli()' to your list of things to
look out for.

Does your tool have the ability to track which functions
can and can't sleep?  This is a very common source of bug.
Grab a spinlock, then call a function which calls a function
which calls a function which calls kmalloc(GFP_KERNEL).  Unless
the spinlock is always protected by a semaphore, this can deadlock.

> 
> /u2/acc/oses/linux/2.4.1/drivers/scsi/eata_dma.c:490:eata_queue: ERROR:INTR:464:506: 
>Interrupts inconsistent, severity `20':506
> 
> save_flags(flags);
> Start --->
> cli();
> 
> #if 0
> for (x = 1, sh = first_HBA; x <= registered_HBAs; x++, sh = SD(sh)->next) {
>   if(inb((uint)sh->base + HA_RAUXSTAT) & HA_AIRQ) {
> printk("eata_dma: scsi%d interrupt pending in eata_queue.\n"
>"  Calling interrupt handler.\n", sh->host_no);
> 
> ... DELETED 32 lines ...
> 
> printk(KERN_CRIT "eata_queue pid %ld, HBA QUEUE FULL..., "
>"returning DID_BUS_BUSY\n", cmd->pid));
> done(cmd);
> restore_flags(flags);
> Error --->
> return(0);
> }
> ccb = >ccb[y];

Something's gone a little wrong here.  The bug is in fact about
20 lines higher up.


Generally: yes, everything you've found needs fixing.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread Andrew Morton

Junfeng Yang wrote:
 
 ...
 
 [BUG] sleep_or_timeout will call interruptible_sleep_on, which will save disabled 
flags and then restore them.
 
 /u2/acc/oses/linux/2.4.1/drivers/cdrom/cm206.c:474:send_command: ERROR:INTR:462:474: 
Interrupts inconsistent, severity `20':474
 
   if (!(inw(r_line_status)  ls_transmitter_buffer_empty)) {
 cd-command = command;
 Start ---
 cli();  /* don't interrupt before sleep */
 outw(dc_mask_sync_error | dc_no_stop_on_error |
  (inw(r_data_status)  0x7f), r_data_control);
 /* interrupt routine sends command */
 
 Save  Restore
 flags here ---
 if (sleep_or_timeout(cd-uart, UART_TIMEOUT)) {
   debug(("Time out on write-buffer\n"));
   stats(write_timeout);
 
 ... DELETED 2 lines ...
 
 }
 debug(("Write commmand delayed\n"));
   }
   else outw(command, r_uart_transmit);
 Error ---
 }

Yes, this is a bug.

 ...

 /u2/acc/oses/linux/2.4.1/drivers/net/irda/irport.c:943:irport_net_ioctl: 
ERROR:INTR:947:997: Interrupts inconsistent, severity `20':997
 
 /* Disable interrupts  save flags */
 save_flags(flags);
 Start ---
 cli();
 
 switch (cmd) {
 case SIOCSBANDWIDTH: /* Set bandwidth */
 if (!capable(CAP_NET_ADMIN))
 return -EPERM;
 irda_task_execute(self, __irport_change_speed, NULL, NULL,
 
 ... DELETED 40 lines ...
 
 }
 
 restore_flags(flags);
 
 Error ---
 return ret;
 }

Your reporting is a little misleading here.

Yes, there's a bug in this function - the `return -EPERM'
doesn't do a `restore_flags()'.  But there is no bug
in the place you've reported.

(Personally, I think *any* C function which has more than
 one `return' statement is a bug, and we see a classic
 instance here of one of the problems which this practice
 can cause.  Religious issue. )


 ...

 [BUG] error path
 
 /u2/acc/oses/linux/2.4.1/drivers/net/wan/comx-hw-mixcom.c:505:MIXCOM_open: 
ERROR:INTR:514:562: Interrupts inconsistent, severity `20':562
 
 }
 
 Start ---
 save_flags(flags); cli();
 
 if(hw-channel==1) {
 request_region(dev-base_addr, MIXCOM_IO_EXTENT, dev-name);
 }
 
 if(hw-channel==0  !(ch-init_status  IRQ_ALLOCATED)) {
 
 ... DELETED 38 lines ...
 
 procfile-mode = S_IFREG |  0444;
 }
 }
 
 Error ---
 return 0;
 }

There's another problem here.  We're calling request_region()
inside cli().  request_region() can sleep.

On SMP, cli() does all sorts of bizarre things which are
quite different between different architectures.  I don't
know if this practice is actually unsafe on any architectures,
but it is really bad practice.  It's certainly the case that
schedue() will enable interrupts for you, so whatever you're
protecting won't be protected!

So I'd add `sleep inside cli()' to your list of things to
look out for.

Does your tool have the ability to track which functions
can and can't sleep?  This is a very common source of bug.
Grab a spinlock, then call a function which calls a function
which calls a function which calls kmalloc(GFP_KERNEL).  Unless
the spinlock is always protected by a semaphore, this can deadlock.

 
 /u2/acc/oses/linux/2.4.1/drivers/scsi/eata_dma.c:490:eata_queue: ERROR:INTR:464:506: 
Interrupts inconsistent, severity `20':506
 
 save_flags(flags);
 Start ---
 cli();
 
 #if 0
 for (x = 1, sh = first_HBA; x = registered_HBAs; x++, sh = SD(sh)-next) {
   if(inb((uint)sh-base + HA_RAUXSTAT)  HA_AIRQ) {
 printk("eata_dma: scsi%d interrupt pending in eata_queue.\n"
"  Calling interrupt handler.\n", sh-host_no);
 
 ... DELETED 32 lines ...
 
 printk(KERN_CRIT "eata_queue pid %ld, HBA QUEUE FULL..., "
"returning DID_BUS_BUSY\n", cmd-pid));
 done(cmd);
 restore_flags(flags);
 Error ---
 return(0);
 }
 ccb = hd-ccb[y];

Something's gone a little wrong here.  The bug is in fact about
20 lines higher up.


Generally: yes, everything you've found needs fixing.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread Junfeng Yang

 Your reporting is a little misleading here.

Thanks for verifying these bugs ;)

The interrupt checker checks for inconsistent interrupt states. For
example, if a function has one exit point with interrupt disabled, and
another exit point with interrupt enabled, the checker will report an
error at the second exit point. The code snippets are automatically culled
from the source based on the line number in the error report. So the
reporting is sometimes misleading. I'll put the actuall line number in the
comments.


 Yes, there's a bug in this function - the `return -EPERM'
 doesn't do a `restore_flags()'.  But there is no bug
 in the place you've reported.

 (Personally, I think *any* C function which has more than
  one `return' statement is a bug, and we see a classic
  instance here of one of the problems which this practice
  can cause.  Religious issue. )


It may be better to have two exit points, one for normal path and one for
error path. All error handling code can be put at the end of the function.



  ...

  [BUG] error path
 
  /u2/acc/oses/linux/2.4.1/drivers/net/wan/comx-hw-mixcom.c:505:MIXCOM_open: 
ERROR:INTR:514:562: Interrupts inconsistent, severity `20':562
 
  }
 
  Start ---
  save_flags(flags); cli();
 
  if(hw-channel==1) {
  request_region(dev-base_addr, MIXCOM_IO_EXTENT, dev-name);
  }
 
  if(hw-channel==0  !(ch-init_status  IRQ_ALLOCATED)) {
 
  ... DELETED 38 lines ...
 
  procfile-mode = S_IFREG |  0444;
  }
  }
 
  Error ---
  return 0;
  }

 There's another problem here.  We're calling request_region()
 inside cli().  request_region() can sleep.

 On SMP, cli() does all sorts of bizarre things which are
 quite different between different architectures.  I don't
 know if this practice is actually unsafe on any architectures,
 but it is really bad practice.  It's certainly the case that
 schedue() will enable interrupts for you, so whatever you're
 protecting won't be protected!

 So I'd add `sleep inside cli()' to your list of things to
 look out for.

 Does your tool have the ability to track which functions
 can and can't sleep?  This is a very common source of bug.
 Grab a spinlock, then call a function which calls a function
 which calls a function which calls kmalloc(GFP_KERNEL).  Unless
 the spinlock is always protected by a semaphore, this can deadlock.

 
  /u2/acc/oses/linux/2.4.1/drivers/scsi/eata_dma.c:490:eata_queue: 
ERROR:INTR:464:506: Interrupts inconsistent, severity `20':506
 
  save_flags(flags);
  Start ---
  cli();
 
  #if 0
  for (x = 1, sh = first_HBA; x = registered_HBAs; x++, sh = SD(sh)-next) {
if(inb((uint)sh-base + HA_RAUXSTAT)  HA_AIRQ) {
  printk("eata_dma: scsi%d interrupt pending in eata_queue.\n"
 "  Calling interrupt handler.\n", sh-host_no);
 
  ... DELETED 32 lines ...
 
  printk(KERN_CRIT "eata_queue pid %ld, HBA QUEUE FULL..., "
 "returning DID_BUS_BUSY\n", cmd-pid));
  done(cmd);
  restore_flags(flags);
  Error ---
  return(0);
  }
  ccb = hd-ccb[y];

 Something's gone a little wrong here.  The bug is in fact about
 20 lines higher up.


 Generally: yes, everything you've found needs fixing.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread David Woodhouse

On Fri, 16 Mar 2001, Junfeng Yang wrote:

 -
 [BUG] return with int disabled in an error path
 
 /u2/acc/oses/linux/2.4.1/drivers/char/n_r3964.c:1036:add_msg: ERROR:INTR:990:995: 
Interrupts inconsistent, severity `20':995
 
 
   save_flags(flags);
 Start ---
   cli();
 
   pMsg = kmalloc(sizeof(struct r3964_message), GFP_KERNEL);
   TRACE_M("add_msg - kmalloc %x",(int)pMsg);
 Return without
 enabling interrupt
   ---
   if(pMsg==NULL)
  return;
 
 
   ... DELETED 44 lines ...
 
if(pClient-sig_flags  R3964_USE_SIGIO)
{
   kill_proc(pClient-pid, SIGIO, 1);
}
 Error ---
 }
 
 static struct r3964_message *remove_msg(struct r3964_info *pInfo,
struct r3964_client_info *pClient)
 {
 -


The simple 'fix' is to move the offending kmalloc() up above the cli(). 
That might actually be something else to make an automated test for - 
anything which schedules can re-enable interrupts. In general, it's bad to 
call anything which can schedule when interrupts are disabled.

But actually, the cli() is there because of the fundamentally flawed 
assumption that it provides sufficient locking. I've converted the whole 
thing to use spinlocks instead, but haven't got a test setup ATM. I'll 
poke at it more on Monday.

akpm, were you looking at this?

-- 
dwmw2



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [CHECKER] 28 potential interrupt errors

2001-03-17 Thread Andrew Morton

David Woodhouse wrote:
 
 [ n_r3964.c stuff ]
 ...
 akpm, were you looking at this?

I'm planning on poking through everything which has been
identified as a posible problem.  But I won't start for
several weeks - give the maintainers (if any) time to
address these things.

So.. please go ahead :)

There's another thing which needs doing to n_r3964.c, BTW - the
abuse of task queues in r3964_close().  This is, I think, the
only client of task queues which needs to poke so deeply into
the implementation internals and Linus has mentioned something about
needing to redesign the task queues in 2.5.  So n_r3964 needs
somehow to be redesigned so that it can use standard APIs.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



[CHECKER] 28 potential interrupt errors

2001-03-16 Thread Junfeng Yang

Hi,

Here are yet more results from the MC project.  This checker looks for
inconsistent usage of interrupt functions.  For example, it notices when
interrupts can be either on or off when a function exits.  It tracks
cli(), sti(), save_flags() and so forth.  We've hand-inspected the results
to ensure that the ones you see here are likely to be errors.

As usual, please CC us at [EMAIL PROTECTED] if you can verify these
potential errors or show that these bugs are false positives.

Important: The code snippets with each bug here were automatically culled
from the source, not manually selected, so they are sometimes inaccurate
as to the actual location of the bug.  We've included a comment before
each bug to help in understanding what the checker found, but the only way
to know for sure is to inspect the source.

-Junfeng & Andy


Where the errors are:

+--++
| file | fn |
+--++
| drivers/cdrom/cm206.c| receive_byte   |
| drivers/cdrom/cm206.c| send_command   |
| drivers/char/ip2/i2lib.c | i2QueueCommands|
| drivers/char/n_r3964.c   | add_msg|
| drivers/char/rio/rioroute.c  | RIOFixPhbs |
| drivers/char/rio/riotable.c  | RIODeleteRta   |
| drivers/i2o/i2o_block.c  | i2ob_del_device|
| drivers/ide/ide.c| ide_spin_wait_hwgroup  |
| drivers/isdn/hisax/config.c  | checkcard  |
| drivers/isdn/hisax/isar.c| isar_load_firmware |
| drivers/isdn/isdn_ppp.c  | isdn_ppp_bind  |
| drivers/net/appletalk/cops.c | cops_rx|
| drivers/net/hamradio/soundmodem/sm_wss.c | wss_set_codec_fmt  |
| drivers/net/irda/irport.c| irport_net_ioctl   |
| drivers/net/irda/irtty.c | irtty_net_ioctl|
| drivers/net/irda/nsc-ircc.c  | nsc_ircc_net_ioctl |
| drivers/net/irda/toshoboe.c  | toshoboe_net_ioctl |
| drivers/net/irda/w83977af_ir.c   | w83977af_net_ioctl |
| drivers/net/pcmcia/wavelan_cs.c  | wavelan_get_wireless_stats |
| drivers/net/tokenring/smctr.c| smctr_open_tr  |
| drivers/net/wan/comx-hw-mixcom.c | MIXCOM_open|
| drivers/net/wan/lmc/lmc_main.c   | lmc_watchdog   |
| drivers/scsi/eata_dma.c  | eata_queue |
| drivers/scsi/qla1280.c   | qla1280_intr_handler   |
| drivers/sound/ad1848.c   | ad1848_resume  |
| drivers/sound/emu10k1/midi.c | emu10k1_midi_callback  |
| drivers/sound/sscape.c   | sscape_pnp_upload_file |
| net/irda/irttp.c | irttp_proc_read|
+--++

Listing:

[BUG] sleep_or_timeout will call interruptible_sleep_on, which will save disabled 
flags and then restore them.

/u2/acc/oses/linux/2.4.1/drivers/cdrom/cm206.c:474:send_command: ERROR:INTR:462:474: 
Interrupts inconsistent, severity `20':474

  if (!(inw(r_line_status) & ls_transmitter_buffer_empty)) {
cd->command = command;
Start --->
cli();  /* don't interrupt before sleep */
outw(dc_mask_sync_error | dc_no_stop_on_error |
 (inw(r_data_status) & 0x7f), r_data_control);
/* interrupt routine sends command */

Save & Restore
flags here --->
if (sleep_or_timeout(>uart, UART_TIMEOUT)) {
  debug(("Time out on write-buffer\n"));
  stats(write_timeout);

... DELETED 2 lines ...

}
debug(("Write commmand delayed\n"));
  }
  else outw(command, r_uart_transmit);
Error --->
}

uch receive_byte(int timeout)
{
  uch ret;
-
[BUG] sleep_or_timeout will call interruptible_sleep_on, which will save disabled 
flags and then restore them.

/u2/acc/oses/linux/2.4.1/drivers/cdrom/cm206.c:499:receive_byte: ERROR:INTR:479:494: 
Interrupts inconsistent, severity `30':494

{
  uch ret;
Start --->
  cli();
  debug(("cli\n"));
  ret = cd->ur[cd->ur_r];
  if (cd->ur_r != cd->ur_w) {
sti();
debug(("returning #%d: 0x%x\n", cd->ur_r, cd->ur[cd->ur_r]));
cd->ur_r++; cd->ur_r %= UR_SIZE;

... DELETED 5 lines ...

#ifdef STATISTICS
if (timeout==UART_TIMEOUT) stats(receive_timeout) /* no `;'! */
else stats(dsb_timeout);
#endif
Error --->
return 0xda;
  }
  ret = cd->ur[cd->ur_r];
  debug(("slept; returning #%d: 0x%x\n", cd->ur_r, cd->ur[cd->ur_r]));
  cd->ur_r++; cd->ur_r %= UR_SIZE;