Ira W. Snyder wrote: [...]
>>>> But we want so solve the TX problem. You only have to stop the
>>>> netif_queue if you TX ring is full. Then you have to deal with calling
>>>> netif_wake_queue() eventually. But we obviously don't want to poll.
>>>>
>>>> The solution might be to look at the TX queue in the interrupt handler
>>>> and/or the NAPI loop. And now we get back to my first question. If the
>>>> controller does a loopback in hardware, i.e. you receive a frame for
>>>> each frame send, you don't need to poll the TX queue length.
>>>>
>>>> If you receive a CAN frame it might be on of your's, which means the TX
>>>> queue might have at least room for one frame. In pseudo code it might
>>>> look like this:
>>>>
>>>> xmit()
>>>> {
>>>> send_frame_to_hardware();
>>>> if (tx_ring_full) {
>>>> netif_stop_queue();
>>>> priv->flags |= TX_RING_FULL;
>>>> }
>>>> }
>>>>
>>>> irq() and netif_poll()
>>>> {
>>>> if (priv->flags & TX_RING_FULL) {
>>>> if (room_in_ring()) {
>>>> priv->flags &= ~TX_RING_FULL;
>>>> netif_wake_queue();
>>>> }
>>>> }
>>>> }
>>>>
>>>> Should be implemented race-free, i.e. you might have to use atomic_*
>>>> operations and/or don't use a shared flag variable.
>>> Cool, that should work. netif_queue_stopped() could be used instead of
>>> the TX_RING_FULL flag.
>> right...good point
>>
>> I should have looked at my own driver :)
>>
>> If room_in_ring() is cheap, it boils down to:
>>
>>>> irq() and netif_poll()
>>>> {
>>>> if (room_in_ring()) {
>>>> netif_wake_queue();
>>>> }
>> Otherwise:
>>
>>>> irq() and netif_poll()
>>>> {
>>>> if (netif_queue_stopped() && room_in_ring()) {
>>>> netif_wake_queue();
>>>> }
>
> Hmm, this *almost* works. Running "cangen -g 0 can0" quickly gets the
> queue into a hung state, though.
what is a "hung state"?
> I added printk() to the sites where I stop/wake the queue. It is always
> woken up correctly, but cangen always exits with "write: No buffer space
> available" after a few stop/wake cycles. Immediately running cangen
> again works for a few more stop/wake cycles.
let's look the the cangen source if there's propper error handling...
http://svn.berlios.de/viewvc/socketcan/trunk/can-utils/cangen.c?revision=787&view=markup
nope, there isn't
There's the option "-i" to ignore ENOBUFS, which is....we don't want to
do that....
Better improbe the source: (from the pengutronix' canutils...)
again:
len = write(s, &frame, sizeof(frame));
if (len == -1) {
switch (errno) {
case ENOBUFS: {
int err;
struct pollfd fds[] = {
{
.fd = s,
.events = POLLOUT,
},
};
err = poll(fds, 1, 1000);
if (err == -1 && errno != -EINTR) {
perror("poll()");
exit(EXIT_FAILURE);
} else if (err == 0) {
printf("a timeout occured\n");
exit(EXIT_FAILURE);
}
}
case EINTR: /* fallthrough */
goto again;
default:
perror("write");
exit(EXIT_FAILURE);
}
}
If you hit a timeout, which means you cannot send packages for 1000 ms,
your send queue probably didn't wake up...
>
> Here is the output of ifconfig:
> can0 Link encap:UNSPEC HWaddr
> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> UP RUNNING NOARP MTU:16 Metric:1
> RX packets:1902 errors:0 dropped:0 overruns:0 frame:0
> TX packets:1902 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:10
> RX bytes:10767 (10.5 KiB) TX bytes:10767 (10.5 KiB)
> Interrupt:22
>
> can1 Link encap:UNSPEC HWaddr
> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> UP RUNNING NOARP MTU:16 Metric:1
> RX packets:1902 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:10
> RX bytes:10767 (10.5 KiB) TX bytes:0 (0.0 b)
> Interrupt:22
>
> I'm stopping the queue correctly: the xmit() routine never complains that it
> tried to transmit but had no buffer space. You'll see that we didn't drop any
> packets in this run. In another run, we did drop packets. It is like the
> firmware didn't echo the packets back to me. Also note that the TX of
> can0 always matches the RX of can1. They're cabled together, so this is
> expected.
Hmm...it looks like this.
Double check your RX path, so that you don't loose any frames in
software. Is it possible that the RX and TX collide in your driver
and/or in hardware?
Another improvement for the cangen utility is to exit after a certain
number of messages. So you can match the number of generated frames with
the rx/tx counters of the CAN cards and interrupts (cat /proc/interrupts)
> can0 Link encap:UNSPEC HWaddr
> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> UP RUNNING NOARP MTU:16 Metric:1
> RX packets:461 errors:0 dropped:0 overruns:0 frame:0
> TX packets:608 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:10
> RX bytes:2616 (2.5 KiB) TX bytes:3434 (3.3 KiB)
> Interrupt:22
>
> can1 Link encap:UNSPEC HWaddr
> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> UP RUNNING NOARP MTU:16 Metric:1
> RX packets:608 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:10
> RX bytes:3434 (3.3 KiB) TX bytes:0 (0.0 b)
> Interrupt:22
>
> Any ideas what I am doing wrong?
cheers, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Socketcan-core mailing list [email protected] https://lists.berlios.de/mailman/listinfo/socketcan-core
