Hi Günter,

[email protected] wrote:
> Hi,
> 
> I am the maintainer of a MPC5200 port of eCos at Elektrobit.
> I found a race-condition in the implementation of our MSCAN
> CAN-driver and it seems your driver has the same problem
> (I reviewed your code for reference reasons):
> http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git;a=blob_plain;f=drivers/net/can/mscan/mscan.c;hb=HEAD
> 
> I first contacted Wolfram Sang personally and he pointed me
> to the right sources and to this mailing list.

Thanks for double checking the Socket-CAN driver.

> Problem is:
> MSCAN has three Tx-Buffers. But only one buffer-interface.
> So to select one buffer you must use a buffer selection (cantbsel)
> register.
> 
> Your driver implementation selects the buffer in the 
> mscan_start_xmit() function and then fills the buffer with
> payload.
> 
> But in the ISR mscan_isr() the buffer is also selected
> to maintain the tx-stats.
> 
> Race-condition occurs if mscan_isr() interrupts mscan_start_xmit()
> after it selects a buffer by writing to the tbsel register.
> After ISR returns mscan_start_xmit will continue to write payload
> into the wrong buffer.

TX interrupts are disabled in mscan_start_xmit() while accessing the
tbsel register and writing the payload:

static netdev_tx_t mscan_start_xmit(struct sk_buff *skb, ...)
{
        out_8(&regs->cantier, 0);
        ...
        out_8(&regs->cantier, priv->tx_active);

        return NETDEV_TX_OK;
}

> A second race condition occurs with the cantier register.
> The documentation says its an interrupt-enable register.
> But in-fact it is a sort of acknowledgement register which
> is accessed in both ISR and xmit function with load-and-store
> operations.

I don't see that problem yet in the MSCAN Socketcan driver.

> The fix in my eCos driver was to lock the ISR from occuring during
> cantbsel and cantier register are accessed inside the transmit function.
> 
> If you need a reference-implementation I can send you my
> implementation of the MSCAN driver for eCos operating system.
> It differs with your implementation because it is able to
> abort buffers whenever a Can-Message arrives at the Tx-API
> with has higher priority than the messages currently in the
> buffers (to avoid starvation on the bus).
> 
> Perhaps the following lines could be  written simpler
> in function mscan_isr():
> 
>    cantier = in_8(&regs->cantier) & MSCAN_TXE;
>    cantflg = in_8(&regs->cantflg) & cantier;
> 
>    if (cantier && cantflg) {
>       ...
> Instead of this i would write (its the same):
>    if (cantflg) {

This will break the synchronization with mscan_start_xmit(), because the
ISR can be entered by to RX or error interrupts as well.

Wolfgang.

_______________________________________________
Socketcan-core mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/socketcan-core

Reply via email to