I think I found the culprit

Olaf Meyer Tue, 17 Nov 1998 06:05:59 -0500
Kaz Kylheku writes:
 > 
 > On Mon, 16 Nov 1998, Olaf Meyer wrote:
 > 
 > > I made substantial changes/addons to the WaveLAN driver. Reversing all this
 > > I could start from the beginning. Maybe it would be faster, but I think
 > > I'm first going to try to trace the bug. Essentially all my code is 
 > > interrupt driven, so this can basically happen any where ...
 > 
 > Yeah, you really have to watch what you do in interrupt-driven code. The best
 > thing is to do as little as possible and do it in a reentrant fashion.  Use
 > only operations that are certified as being atomic. You can't make any system
 > calls or call any code that assumes that you are running in a process, and
 > tries to block you.
 > 
 > 

I think I found the culprit for the "Aiee: scheduling in interrupt 00124b39"
message :-)

If I sort System.map by address, 00124b39 falls between
__wait_on_buffer and sync_buffers, i.e. the schedule call that
causes my freeze-up has to be the one in __wait_on_buffer, correct?

  00124abc T __wait_on_buffer
  00124b74 T sync_buffers
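
The lookup above can be sketched as a small user-space C program. This is only
an illustration, assuming System.map lines of the form "address type name";
the names struct sym, parse_sym and find_enclosing are made up for this example
and are not part of the kernel or the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One System.map entry: address, type letter, symbol name. */
struct sym {
    unsigned long addr;
    char type;
    char name[64];
};

/* Parse one line such as "00124abc T __wait_on_buffer".
 * Returns 1 on success, 0 if the line does not have three fields. */
static int parse_sym(const char *line, struct sym *out)
{
    return sscanf(line, "%lx %c %63s",
                  &out->addr, &out->type, out->name) == 3;
}

/* Return the entry with the highest address that is <= addr, or NULL
 * if addr lies below every entry. The table need not be sorted. */
static const struct sym *find_enclosing(const struct sym *tab, size_t n,
                                        unsigned long addr)
{
    const struct sym *best = NULL;
    size_t i;

    assert(tab != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (tab[i].addr <= addr &&
            (best == NULL || tab[i].addr > best->addr))
            best = &tab[i];
    }
    return best;
}
```

Fed the two lines above and the address 0x00124b39, find_enclosing returns the
__wait_on_buffer entry, and the offset into the function is 0x00124b39 minus
that entry's address.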

Now __wait_on_buffer only gets called in wait_on_buffer and lock_buffer (I'm
using a 2.0.33 kernel). These in turn mainly get called from filesystem and
char/block device related code. I don't do any file related stuff. In fact I
only execute my own code and some socket buffer related code (like alloc and
free). I included a small portion of the code which shows where the freeze
occurs.  I actually run through this section without any problems over
25,000 times before the freeze occurs ...

I currently don't really know where to continue the bug hunt ...
If somebody is a little bit more familiar with the above calls,
I'd like to know how the heck I can get into one of the above functions
from my code.

Many thanks,

  Olaf

  /* this code section gets called from the receive routine of the WaveLAN
   * driver, i.e. it is part of an interrupt
   */
 
  if (!node->rtData) {
    node->currNRT = node->currNRT->next;

    // my wrapper for printk. It dumps the string onto the network, so I can
    // debug remotely on a machine that doesn't crash :-)
    // printks only to the console are not very useful since everything
    // works fine over 25,000 times and the messages scroll through much too
    // fast. Dumping the stuff on the network I can read the stuff to a file
    // using tcpdump and look at it after the crash ...

    print_wl(node, DEBUG_INFO, "go ahead for NRT data to %s\n", node->currNRT->name);

    /**** I get to this point, since this is dumped onto the network. */
    /**** According to the output the if statement is true, so I enter */
    /**** and call wl_process_NRT_TOKEN (see below). */

    if (!memcmp(node->currNRT->macAddr, node->dev->dev_addr, ETH_ALEN)) {
      // it's my turn, I don't send the token to myself!
      // the interface is only simplex!
      wl_process_NRT_TOKEN(node);  // see below
    }
  }


  void
  wl_process_NRT_TOKEN(rtnode_t *node)
  {
    int num = 0;

    // the next text does NOT get dumped to the network anymore
    // I only see the Aiee: scheduling in interrupt error message flashing
    // over my console

    print_wl(node, DEBUG_INFO, "received NRT token\n");

    ....


-
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to [EMAIL PROTECTED]