On Fri, Mar 13, 2015 at 09:03:33AM +0100, Alexander Sverdlin wrote:
> Hello!
> 
> On 12/03/15 14:16, ext [email protected] wrote:
> >> There is one big problem in the current design: ISR accesses the controller
> >> > registers in parallel with i2c_davinci_xfer_msg() in process context. 
> >> > The whole
> >> > logic is not obvious, many operations are performed in process context 
> >> > while
> >> > ISR is always enabled and does something asynchronous even while it's not
> >> > expected. We have faced these races on 4-cores Keystone chip. Some 
> >> > examples:
> >> > 
> >> > - when non-existing device is accessed first comes NAK IRQ, then ARDY 
> >> > IRQ. After
> >> >    NAK we already jump out of wait_for_completion_timeout() and 
> >> > depending on how
> >> >    lucky we are ARDY IRQ will access MDR register in the middle of some 
> >> > other
> >> >    operation in process context;
> >> > 
> >> > - STOP condition is triggered in many places in the driver, in ISR, in
> >> >    i2c_davinci_xfer_msg(), but there is no code which guarantees that 
> >> > STOP will
> >> >    be really completed. We have seen many STOP conditions simply missing 
> >> > in
> >> >    back-to-back transfers, when next i2c_davinci_xfer_msg() call simply 
> >> > overwrites
> >> >    MDR register while STOP is still not generated.
> >> > 
> >> > So, make the design more robust and obvious:
> >> > - leave hot path (buffers management) in ISR, remove other registers 
> >> > access from
> >> >    ISR;
> >> > - introduce second synchronization point, to make sure that STOP 
> >> > condition is
> >> >    really generated and it's safe to begin next transfer;
> >> > - simplify the state machine;
> >> > - enable IRQs only when they are expected, disable them in ISR when 
> >> > transfer is
> >> >    completed/failed;
> >> > - STOP is normally set simultaneously with START condition (in case of 
> >> > last
> >> >    message); only special case when STOP is additionally generated is 
> >> > received NAK
> >> >    -- this case is handled separately.
> > I'm not sure about this change (- It's too significant and definitely will 
> > need more review & Tested-by.
> 
> Maybe you can offer this patch the customers who suddenly cannot access the 
> devices on the bus until reboot...
> Because it's not a "bus lockup". 
> 
> > We need to be careful with it, especially taking into account 
> > DAVINCI_I2C_MDR_RM mode and future
> > changes like https://lkml.org/lkml/2014/5/1/348.
> > 
> > May be you can split it?
> 
> I can may be split it into two patches, but I'm not sure about the result. 
> Each of them will only solve
> 50% of the problem and then nobody will see a clear benefit applying only 
> one. But what I can offer you is
> to spend the same effort on rebasing the "DAVINCI_I2C_MDR_RM mode" patch you 
> are referring to. I can rebase
> it and take it into my series if you wish.

So, shall I take this into i2c/for-next?

Attachment: signature.asc
Description: Digital signature

Reply via email to