On Wed, Feb 24, 2010 at 12:06:43PM +0100, Marc Kleine-Budde wrote:
> Wolfgang Grandegger wrote:
> > Marc Kleine-Budde wrote:
> >> Ira W. Snyder wrote:
> >>> On Tue, Feb 23, 2010 at 08:41:24PM +0100, Oliver Hartkopp wrote:
> >>>> Ira W. Snyder wrote:
> >>>>> On Mon, Feb 22, 2010 at 08:42:56PM +0100, Wolfgang Grandegger wrote:
> >>>>>> - Don't include sja1000/sja1000.h. It's only for drivers in sja1000.
> >>>>>>   I know that some other drivers use SJA1000 definitions as well, but
> >>>>>>   that requires a general solution.
> >>>>>>
> >>>>> Why not? I need some of the definitions for the SJA1000 error registers.
> >>>>> Is there any reason why it can't be include/linux/can/sja1000.h instead?
> >>>>>
> >>>>> It seems stupid to duplicate the register definitions in each new driver
> >>>>> that comes along.
> >>>> Register definitions are always hardware specific.
> >>>>
> >>>> The fact that the Janz card has some registers in SJA1000 layout does not
> >>>> justify to push sja1000.h to a public place like include/linux/can/* IMO.
> >>>>
> >>>> The ESD USB and the EMS USB drivers also have some registers in SJA1000
> >>>> layout. I would prefer following their approach.
> >>>>
> >>>>>> - About xmit flow control. What happens if you send messages quickly at
> >>>>>>   125 KB/s. You could use "cangen -g 0 can0" for that test. How many
> >>>>>>   messages get dropped?
> >>>>>>
> >>>>> I let the cangen command run for a while:
> >>>>> $ ifconfig -a
> >>>>> can0      Link encap:UNSPEC  HWaddr 
> >>>>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
> >>>>>           UP RUNNING NOARP  MTU:16  Metric:1
> >>>>>           RX packets:473455 errors:0 dropped:0 overruns:0 frame:0
> >>>>>           TX packets:473455 errors:0 dropped:1831983 overruns:0 
> >>>>> carrier:0
> >>>> Oops!
> >>>>
> >>>> This looks weird in janz_xmit() !
> >>>>
> >>>> +        /* check that we can actually transmit */
> >>>> +        if (!(desc.control & DESC_VALID)) {
> >>>> +                dev_err(mod->dev, "%s: no buffers\n", __func__);
> >>>> +                stats->tx_dropped++;
> >>>> +                kfree_skb(skb);
> >>>> +                goto out_unlock;
> >>>> +        }
> >>>>
> >>>> You should enable/wake the tx_queue only when your hardware is 
> >>>> potentially
> >>>> able to process new CAN frames. You are getting lot's of trashed frames 
> >>>> with
> >>>> your implementation.
> >>>>
> >>> Yep. I know how a network driver is supposed to work. Unfortunately,
> >>> AFAICT from the datasheet, this hardware has no notification mechanism
> >>> for completed packets.
> >> Have you asked the vendor? Sometimes the guys at the support have good
> >> ideas.
> >>
> >>> The only way I can think of to track TX ring usage is to have the TX
> >>> thread keep an index of where the last used packet was, and then check
> >>> to see how many are left in the ring, for each TX.
> >>>
> >>> That text description is hard to understand, I mean something like this:
> >>>
> >>> start up (first TX packet)
> >>> index     buffer          state
> >>> =======================================
> >>>           0       empty
> >>>           1       empty
> >>> last_processed    2       empty
> >>>           3       full (waiting to be sent by firmware)
> >>>           4       full (waiting to be sent by firmware)
> >>> fasttx_num        5       empty (this is where the next packet will go)
> >>>           6       empty
> >>>           7       empty
> >>>           8       empty
> >>>
> >>> So, when we call xmit, we first start at the last_processed index, and
> >>> iterate until we find a packet waiting to be sent by the firmware. Then
> >>> we do the correct subtraction to know if we have free space left.
> >>>
> >>> Of course, we still must poll to call netif_wake_queue(). :( This is a
> >>> pretty horrible solution. Thoughts?
> >> Er...I have only followed this thread with one eye....
> >> How do you do the ECHO of tx'ed CAN frames?
> > 
> > The hardware does it.
> > 
> >> But we want so solve the TX problem. You only have to stop the
> >> netif_queue if you TX ring is full. Then you have to deal with calling
> >> netif_wake_queue() eventually. But we obviously don't want to poll.
> >>
> >> The solution might be to look at the TX queue in the interrupt handler
> >> and/or the NAPI loop. And now we get back to my first question. If the
> >> controller does a loopback in hardware, i.e. you receive a frame for
> >> each frame send, you don't need to poll the TX queue length.
> >>
> >> If you receive a CAN frame it might be on of your's, which means the TX
> >> queue might have at least room for one frame. In pseudo code it might
> >> look like this:
> >>
> >> xmit()
> >> {
> >>    send_frame_to_hardware();
> >>    if (tx_ring_full) {
> >>            netif_stop_queue();
> >>            priv->flags |= TX_RING_FULL;
> >>    }
> >> }
> >>
> >> irq() and netif_poll()
> >> {
> >>    if (priv->flags & TX_RING_FULL) {
> >>            if (room_in_ring()) {
> >>                    priv->flags &= ~TX_RING_FULL;
> >>                    netif_wake_queue();
> >>            }
> >>    }
> >> }
> >>
> >> Should be implemented race-free, i.e. you might have to use atomic_*
> >> operations and/or don't use a shared flag variable.
> > 
> > Cool, that should work. netif_queue_stopped() could be used instead of
> > the TX_RING_FULL flag.
> 
> right...good point
> 
> I should have looked at my own driver :)
> 
> If room_in_ring() is cheap, it boils down to:
> 
> >> irq() and netif_poll()
> >> {
> >>    if (room_in_ring()) {
> >>            netif_wake_queue();
> >> }
> 
> Otherwise:
> 
> >> irq() and netif_poll()
> >> {
> >>    if (netif_queue_stopped() && room_in_ring()) {
> >>            netif_wake_queue();
> >> }

Hmm, this *almost* works. Running "cangen -g 0 can0" quickly gets the
queue into a hung state, though.

I added printk() to the sites where I stop/wake the queue. It is always
woken up correctly, but cangen always exits with "write: No buffer space
available" after a few stop/wake cycles. Immediately running cangen
again works for a few more stop/wake cycles.

Here is the output of ifconfig:
can0      Link encap:UNSPEC  HWaddr 
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:1902 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1902 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:10767 (10.5 KiB)  TX bytes:10767 (10.5 KiB)
          Interrupt:22 

can1      Link encap:UNSPEC  HWaddr 
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:1902 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:10767 (10.5 KiB)  TX bytes:0 (0.0 b)
          Interrupt:22 

I'm stopping the queue correctly: the xmit() routine never complains that it
tried to transmit but had no buffer space. You'll see that we didn't drop any
packets in this run. In another run, we did drop packets. It is like the
firmware didn't echo the packets back to me. Also note that the TX of
can0 always matches the RX of can1. They're cabled together, so this is
expected.

can0      Link encap:UNSPEC  HWaddr 
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:461 errors:0 dropped:0 overruns:0 frame:0
          TX packets:608 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:2616 (2.5 KiB)  TX bytes:3434 (3.3 KiB)
          Interrupt:22 

can1      Link encap:UNSPEC  HWaddr 
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:608 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:3434 (3.3 KiB)  TX bytes:0 (0.0 b)
          Interrupt:22 

Any ideas what I am doing wrong?

Thanks for the help,
Ira
_______________________________________________
Socketcan-core mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/socketcan-core

Reply via email to