On Fri, 13 Nov 1998, Olaf Meyer wrote:

> 
> Why does the kernel crash when the hard_start_xmit routine of the device
> driver drops the packets for a prolonged time (like several minutes)? Is this
> maybe a bug (I'm doing this with 2.0.33), or is this related to how the
> higher protocol handlers deal with this problem? I thought that dropped
> packets get re-queued in the "do_dev_queue_xmit" routine. The crash occurs
> when I block the hard_start_xmit routine on purpose by returning 1
> immediately. The network traffic is VERY low, i.e. not more than 10 packets
> get dropped.  This behavior seems independent of the device driver that I'm
> using (it's an ISA Wavelan card).
> 
> Any ideas welcome :-)

Are you getting a lock-up or some other type of crash?

Here is a tip: if you are gonna return 1 from hard_start_xmit, you must
set the dev->tbusy flag to 1. That's part of the undocumented network
device protocol. I found this out the hard way, by experimenting
and reverse engineering, when I started work on the Mobitex radio modem
driver.

The idea is that when your driver cannot take more packets, because its
transmit buffers are all full, it must mark itself as ``busy transmitting''.
That way, the device layer won't bother it with more packets. Then when you
become un-busy, you must do mark_bh(NET_BH); to indicate that network bottom
half processing should take place in order to kick down more packets. And of
course, you must set your tbusy flag to zero.
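
To make the handshake concrete, here is a rough user-space model of it. This
is only a sketch, not kernel code: struct mock_device, mock_hard_start_xmit(),
mock_tx_done() and the bh_marked field are made-up names for illustration.
In a real 2.0 driver the flag lives in dev->tbusy and the wake-up is
mark_bh(NET_BH), as described above.

```c
/* User-space model of the busy/un-busy handshake. All names here are
 * hypothetical stand-ins; only the fields discussed above are modelled. */
struct mock_device {
    int tbusy;      /* 1 = "busy transmitting": do not hand us packets */
    int tx_free;    /* free transmit buffers in the imaginary hardware */
    int bh_marked;  /* stands in for mark_bh(NET_BH) having been called */
};

/* Models hard_start_xmit: returns 0 if the packet was accepted,
 * 1 if it was rejected. A rejection always marks the device busy. */
static int mock_hard_start_xmit(struct mock_device *dev)
{
    if (dev->tx_free == 0) {
        dev->tbusy = 1;     /* mark busy BEFORE returning 1 */
        return 1;
    }
    dev->tx_free--;         /* packet now occupies one transmit buffer */
    return 0;
}

/* Models the transmit-complete path (typically the interrupt handler):
 * a buffer is free again, so clear the busy flag and ask for the network
 * bottom half to run so that queued packets get pushed down. */
static void mock_tx_done(struct mock_device *dev)
{
    dev->tx_free++;
    dev->tbusy = 0;
    dev->bh_marked = 1;     /* real driver: mark_bh(NET_BH); */
}
```

With a single transmit buffer, the first call is accepted, the second is
rejected with tbusy set, and after mock_tx_done() the device takes packets
again.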

See, if you reject packets without marking your device as busy, the device
layer will try to give you more packets. If I recall correctly, this leads to
an infinite loop between the device layer and your device, which causes your machine 
to lock up.  
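
Here is a toy model of that loop, again in user space. It is an assumption
about the shape of the problem based on my recollection above, not a copy of
the 2.0.33 device layer; sim_dev, sim_xmit(), sim_drain_queue() and SPIN_LIMIT
are invented for the illustration. The cap exists only so the model
terminates; the situation described above has no such cap.

```c
#define SPIN_LIMIT 1000   /* arbitrary cap so this model terminates */

/* Hypothetical device model: sets_tbusy selects between a well-behaved
 * driver (1) and one that rejects packets without marking itself busy (0). */
struct sim_dev {
    int tbusy;
    int sets_tbusy;
};

/* A driver that always rejects, like the experiment in the question. */
static int sim_xmit(struct sim_dev *dev)
{
    if (dev->sets_tbusy)
        dev->tbusy = 1;
    return 1;             /* packet not accepted */
}

/* Simplified device layer: keep offering queued packets while the device
 * does not claim to be busy. Returns how many times the driver was called. */
static int sim_drain_queue(struct sim_dev *dev, int queued)
{
    int calls = 0;

    while (queued > 0 && !dev->tbusy && calls < SPIN_LIMIT) {
        calls++;
        if (sim_xmit(dev) == 0)
            queued--;     /* accepted: one packet fewer in the queue */
        /* rejected: the packet stays queued and is offered again */
    }
    return calls;
}
```

With sets_tbusy = 1 the driver is called once and the loop stops. With
sets_tbusy = 0 the loop only ends because of SPIN_LIMIT.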

Also, you are probably aware that if you return 1 from the transmit function,
you must leave the packet buffer untouched, because the layer above may
requeue it (or free it if the transmit queue is backlogged too far).
If you return 0, you have accepted the packet, and are responsible for
freeing it when you are done with it.
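
The ownership rule can be sketched like this. It is a user-space
illustration with invented names (demo_skb, demo_xmit(), demo_send()); the
freed field merely records where a real driver would release the buffer with
whatever skb-freeing call its kernel version provides.

```c
/* Hypothetical stand-in for a packet buffer. */
struct demo_skb {
    int len;
    int freed;   /* set to 1 where a real driver would free the buffer */
};

/* Models the transmit function. can_send stands in for "the hardware has
 * room". On return 1 the buffer is left exactly as it was received. */
static int demo_xmit(struct demo_skb *skb, int can_send)
{
    if (!can_send)
        return 1;        /* rejected: do not touch or free skb */

    /* accepted: the driver owns skb from here on and must free it
     * once the data has been handed to the hardware */
    skb->freed = 1;
    return 0;
}

/* Models the caller: after a rejection it still owns the buffer and is
 * free to requeue it, here by storing it in *requeue_slot. */
static int demo_send(struct demo_skb *skb, int can_send,
                     struct demo_skb **requeue_slot)
{
    if (demo_xmit(skb, can_send) != 0) {
        *requeue_slot = skb;   /* still valid, still ours */
        return 1;
    }
    return 0;
}
```

If the driver freed or modified the buffer and then returned 1, the caller
would requeue a buffer it can no longer trust.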

Here is how to detect a loop: put in printk's and enable their printing to the
console. If a printk() becomes part of an infinite loop, it will be painfully
obvious as a stream of messages. You will still have to reboot.

Note that printk's that are going only to the logger daemons but not to the
console will not cause any output if an infinite loop happens. That's because
the daemons can't get any CPU time to do their work; the loop can't be
preempted. Only console output works.

Good luck...
