On Wed, Jan 8, 2014 at 10:13 PM, Geert Uytterhoeven
<[email protected]> wrote:
> On Wed, Jan 8, 2014 at 9:41 AM, Laurent Pinchart
> <[email protected]> wrote:
>> On Wednesday 08 January 2014 09:28:59 Geert Uytterhoeven wrote:
>>> On Wed, Jan 8, 2014 at 1:27 AM, Laurent Pinchart wrote:
>>> > On Tuesday 07 January 2014 21:27:18 Geert Uytterhoeven wrote:
>>> >> I was regularly getting I/O errors when using the Renesas RSPI/QSPI
>>> >>
>>> >> driver on r8a7791:
>>> >>     m25p80 spi0.0: error -110 reading SR
>>> >>
>>> >> Until I applied the following patch, which re-reads RSPI_SPSR on a
>>> >> time-out, and continues if the condition has become true:
>>> >>
>>> >> diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
>>> >> index 4b31d89e8568..e63e30c500da 100644
>>> >> --- a/drivers/spi/spi-rspi.c
>>> >> +++ b/drivers/spi/spi-rspi.c
>>> >> @@ -442,8 +442,13 @@ static int rspi_wait_for_interrupt(struct rspi_data
>>> >> *rspi, u8 wait_mask, rspi->spsr = rspi_read8(rspi, RSPI_SPSR);
>>> >>
>>> >>       rspi_enable_irq(rspi, enable_bit);
>>> >>       ret = wait_event_timeout(rspi->wait, rspi->spsr & wait_mask, HZ);
>>> >>
>>> >> -     if (ret == 0 && !(rspi->spsr & wait_mask))
>>> >> -             return -ETIMEDOUT;
>>> >> +     if (ret == 0 && !(rspi->spsr & wait_mask)) {
>>> >> +             u8 spsr = rspi_read8(rspi, RSPI_SPSR);
>>> >> +             printk("*** rspi->spsr = 0x%02x, real spsr = 0x%02x,
>>> >> wait_mask =>
>>> > 0x%02x
>>> >
>>> >> ***\n",
>>> >> +                    rspi->spsr, spsr, wait_mask);
>>> >> +             if (!(spsr & wait_mask))
>>> >> +                     return -ETIMEDOUT;
>>> >> +     }
>>> >>
>>> >>       return 0;
>>> >>
>>> >>  }
>>> >>
>>> >> Now it prints from time to time:
>>> >>     *** rspi->spsr = 0x20, real spsr = 0xa0, wait_mask = 0x80 ***
>>> >>
>>> >> which shows that rspi->spsr (as set from the interrupt handler) didn't
>>> >> have bit 7 set, while RSPI_SPSR does have bit 7 set.
>>> >>
>>> >> So this looks like a race condition in the interrupt handling.
>>> >
>>> > What happens if you print rspi->spsr in the interrupt handler ? Does it
>>> > have bit 7 set ?
>>>
>>> I haven't tried that yet. The driver is extremely interrupt-heavy (O(n),
>>> with n the number of bytes transfered), so adding a printk() as-is won't be
>>> a good idea.
>>>
>>> Will think a bit more about a better approach...
>>
>> Just limit it to the first 10 interrupts then :-)
>
> That assumes the issue happens during the first 10 interrupts, which is
> usually not the case ;-)
>
>> What I'd like to know is whether the interrupt is trigerred before bit 7 gets
>> set by the hardware, or if the rspi_wait_for_interrupt() function gets a 
>> stale
>> value of rspi->spsr.
>
> It seems to get a stale value. I ended up doing the printk() anyway, and
> when reading using small buffer sizes, the issues is more likely to happen.
>
> If I add a second copy of rspi->spsr, and make that copy volatile OR add
> a few calls to smp_mb() around the places where it's written and read,
> the copy has the right value inside rspi_wait_for_interrupt(), BUT only if
> I also print the spsr value inside rspi_irq(). Note that the original 
> rspi->spsr
> still has the stale value.
>
> I'd expect wake_up() and wait_event_timeout() to have the right memory
> barriers (cfr. Documentation/memory-barriers.txt), so I don't have to add
> my own. And why does it need the extra printk()? What extra
> synchronization does that give?
>
> Note that the interrupt handler always runs on CPU core 0,
> while rspi_wait_for_interrupt() can run on core 0 or 1.
> Is there a cache-coherency issue between the two CPU cores?
>
> Am I missing something?

What happens if you run on a uniprocessor system or in case of R-Car
disable SMP and run with a single CPU core?

Cheers,

/ magnus
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to