Re: [PATCH v2 2/2] mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning

2019-03-12 Thread Rizvi, Mohammad Faiz Abbas
Hi Adrian,

On 3/6/2019 5:39 PM, Adrian Hunter wrote:
> On 1/03/19 10:38 AM, Faiz Abbas wrote:
>> commit 5b0d62108b46 ("mmc: sdhci-omap: Add platform specific reset
>> callback") skips data resets during tuning operation. Because of this,
>> a data error or data finish interrupt might still arrive after a command
>> error has been handled and the mrq ended. This ends up with a "mmc0: Got
>> data interrupt 0x0002 even though no data operation was in progress"
>> error message.
>>
>> Fix this by adding a platform specific callback for command errors. Mark
>> the mrq as a failure but wait for a data interrupt instead of calling
>> finish_mrq().
>>
>> Fixes: 5b0d62108b46 ("mmc: sdhci-omap: Add platform specific reset
>> callback")
>> Signed-off-by: Faiz Abbas 
>> ---
>>  drivers/mmc/host/sdhci-omap.c | 24 
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
>> index b1a66ca3821a..67b361a403bc 100644
>> --- a/drivers/mmc/host/sdhci-omap.c
>> +++ b/drivers/mmc/host/sdhci-omap.c
>> @@ -797,6 +797,29 @@ void sdhci_omap_reset(struct sdhci_host *host, u8 mask)
>>  sdhci_reset(host, mask);
>>  }
>>  
>> +void sdhci_omap_cmd_err(struct sdhci_host *host, u32 intmask, u32 
>> *intmask_p)
>> +{
>> +struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>> +struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
>> +
>> +if (omap_host->is_tuning) {
>> +/*
>> + * Since we are not resetting data lines during tuning
>> + * operation, data error or data complete interrupts
>> + * might still arrive. Mark this request as a failure
>> + * but still wait for the data interrupt
>> + */
>> +if (intmask & SDHCI_INT_TIMEOUT)
>> +host->cmd->error = -ETIMEDOUT;
>> +else
>> +host->cmd->error = -EILSEQ;
>> +
>> +sdhci_finish_command(host);
>> +} else {
>> +sdhci_cmd_err(host, intmask, intmask_p);
>> +}
>> +}
> 
> Could this be done by the existing ->irq() callback? i.e. mask the error
> bits in intmask (have to write them back to SDHCI_INT_STATUS also) but set
> cmd->error.
> 

It is possible but I really don't want the callback to be unnecessarily
called for every single interrupt that happens. I think we should only
use the callback in the case of an actual error interrupt :-)

Thanks,
Faiz


Re: [PATCH v2 1/8] mmc: sdhci: Get rid of finish_tasklet

2019-03-12 Thread Rizvi, Mohammad Faiz Abbas
Hi Adrian,

On 3/8/2019 7:06 PM, Adrian Hunter wrote:
> On 6/03/19 12:00 PM, Faiz Abbas wrote:
>> Adrian,
>>
>> On 25/02/19 1:47 PM, Adrian Hunter wrote:
>>> On 15/02/19 9:20 PM, Faiz Abbas wrote:
 sdhci.c has two bottom halves implemented. A threaded_irq for handling
 card insert/remove operations and a tasklet for finishing mmc requests.
 With the addition of external dma support, dmaengine APIs need to
 terminate in non-atomic context before unmapping the dma buffers.

 To facilitate this, remove the finish_tasklet and move the call of
 sdhci_request_done() to the threaded_irq() callback.
>>>
>>> The irq thread has a higher latency than the tasklet.  The performance drop
>>> is measurable on the system I tried:
>>>
>>> Before:
>>>
>>> # dd if=/dev/mmcblk1 of=/dev/null bs=1G count=1 &
>>> 1+0 records in
>>> 1+0 records out
>>> 1073741824 bytes (1.1 GB) copied, 4.44502 s, 242 MB/s
>>>
>>> After:
>>>
>>> # dd if=/dev/mmcblk1 of=/dev/null bs=1G count=1 &
>>> 1+0 records in
>>> 1+0 records out
>>> 1073741824 bytes (1.1 GB) copied, 4.50898 s, 238 MB/s
>>>
>>> So we only want to resort to the thread for the error case.
>>>
>>
>> Sorry for the late response here, but this is about 1.6% decrease. I
>> tried out the same commands on a dra7xx board here (with about 5
>> consecutive dd of 1GB) and the average decrease was 0.3%. I believe you
>> will also find a lesser percentage change if you average over multiple
>> dd commands.
>>
>> Is this really so significant that we have to maintain two different
>> bottom halves and keep having difficulty with adding APIs that can sleep?
> 
> It is a performance drop that can be avoided, so it might as well be.
> Splitting the success path from the failure path is common for I/O drivers
> for similar reasons as here: the success path can be optimized whereas the
> failure path potentially needs to sleep.

Understood. You wanna keep the success path as fast as possible.
> 
>>
>> Also I am not sure how to implement only the error handling part in the
>> threaded_irq. We need to enter sdhci_request_done() and get the current
>> mrq before we can check for error conditions like I've done in patch 2:
>>
>> /* Terminate and synchronize dma in case of an error */
>> if (data && (mrq->cmd->error || data->error) &&
>> host->use_external_dma) {
>>  struct dma_chan *chan = sdhci_external_dma_channel(host, data);
>>  dmaengine_terminate_sync(chan);
>> }
>>
>> On a related note, do we really need to protect everything in
>> sdhci_request_done() with spinlocks?
>>  In patch 2 I have only removed lock
>> for the terminate_sync() parts that I added but the whole
>> dma_unmap/dma_sync parts should be left unprotected IMO.
> 
> As it is written, synchronization is needed to stop the same mrq being
> finished twice.
> 
> I suggest doing the dmaengine_terminate_sync() before calling
> sdhci_request_done().  Perhaps you could record the channel that needs to be
> sync'd and then do:
> 
>   struct dma_chan *chan = READ_ONCE(host->sync_chan);
> 
>   if (chan) {
>   dmaengine_terminate_sync(chan);
>   host->sync_chan = NULL;
>   }
> 

Will try this out and send next version.

Thanks,
Faiz


Re: [RFC PATCH 1/3] can: m_can: Create m_can core to leverage common code

2019-01-09 Thread Rizvi, Mohammad Faiz Abbas

Hi Dan, Wolfgang,

On 1/10/2019 1:14 PM, Wolfgang Grandegger wrote:

Hello Dan,

sorry for my late response on that topic...

Am 09.01.19 um 21:58 schrieb Dan Murphy:

Wolfgang

On 11/3/18 5:45 AM, Wolfgang Grandegger wrote:

Hello Dan,

Am 31.10.2018 um 21:15 schrieb Dan Murphy:

Wolfgang

Thanks for the review

On 10/27/2018 09:19 AM, Wolfgang Grandegger wrote:

Hello Dan,

for the RFC, could you please just do the necessary changes to the
existing code. We can discuss about better names, etc. later. For
the review if the common code I quickly did:

   mv m_can.c m_can_platform.c
   mv m_can_core.c m_can.c

The file names are similar to what we have for the C_CAN driver.

   s/classdev/priv/
   variable name s/m_can_dev/priv/

Then your patch 1/3 looks as shown below. I'm going to comment on that
one. The comments start with "***"



So you would like me to align the names with the c_can driver?


That would be the obvious choice.




*** I didn't review the rest of the patch for now.



snipped the code to reply to the comment.


Looking to the generic code, you didn't really change the way
the driver is accessing the registers. Also the interrupt handling
and rx polling is as it was before. Does that work properly using
the SPI interface of the TCAN4x5x?


I don't want to change any of that yet.  Maybe my cover letter was not clear
or did not go through.

But the intention was just to break out the functionality to create a MCAN 
framework
that can be used by devices that contain the Bosch MCAN core and provider their 
own protocal to access
the registers in the device.

I don't want to do any functional changes at this time on the IP code itself 
until we have a framework.
There should be no regression in the io mapped code.

I did comment on the interrupt handling and asked if a threaded work queue 
would affect CAN timing.
For the original TCAN driver this was the way it was implemented.


Do threaded interrupts with RX polling make sense? I think we need a
common interface allowing to select hard-irqs+napi or threaded-irqs.



I have been working on this code for about a month now and I am *not happy* 
with the amount of change that needs
to be done to make the m_can a framework.

I can tx/rx frames from another CAN device to the TCAN part but I have not even 
touched the iomapped code.

The challenging part is that the m_can code that is currently available does 
not have to worry about atomic context because
there is no peripheral waiting.  Since the TCAN is a peripheral device we need 
to take into about the hard waits in IRQ context
as well as the atomic context.  Doing this creates many deltas in the base code 
that may break iomapped devices.  I have had to
add the thread_irqs and now I am in the midst of the issue you brought up with 
napi.  I would have to schedule a queue for perp devices
and leave the non-threaded iomapped irq.

At this point I think it may be wise to leave the m_can code alone as it is 
working and stable and just work on the TCAN driver as
a standalone driver.  A framework would be nice but I think it would destablize 
the m_can driver which is embedded in many SoC's and
we cannot possibly test everyone of them.


Unfortunately, I do not have m_can hardware at hand.


There are exactly 3 platforms in mainline that use the m_can driver. I 
can help Dan test it on a dra76x. I haven't had a chance to look at the 
changes in depth, but just testing for regressions on existing platforms 
shouldn't be too hard once we have it working on one.


Thanks,
Faiz




What are your thoughts?


What we need is a common set of functions doing tx, rx, error and state
handling. This will requires substantial changes to the existing
io-mapped m_can driver, of course. I still believe it's worth the
effort, but I agree that it's difficult for you to re-write and test the
existing m_can driver.

What about implementing such a set of common functions plus the SPI
specific part for your TCAN device. What do you/others think?

Wolfgang.



Re: [PATCH v2 0/3] Add support for using external dma in SDHCI

2018-11-28 Thread Rizvi, Mohammad Faiz Abbas

+ Mark Brown

Chunyan,

On 11/21/2018 5:17 PM, Faiz Abbas wrote:

Hi Chunyan,

On 12/11/18 12:56 PM, Chunyan Zhang wrote:

Currently the generic SDHCI code in the Linux kernel supports the SD
standard DMA integrated into the host controller but does not have any
support for external DMA controllers implemented using dmaengine meaning
that custom code is needed for any systems that use a generic DMA
controller with SDHCI which in practice means any SDHCI controller that
doesn't have an integrated DMA controller so we should have this as a
generic feature.

There are already a number of controller specific drivers that have dmaengine
code, and some could use sdhci.c actually, but needed to implement 
mmc_ops->request()
in their specific driver for sending command with external dma using dmaengine
framework, with this patchset, them will take advantage of the generic support.
TI's omap controller is the case as an example.

Any comments are very appreciated.



This is great. It helps us move am335x and am43xx platforms to
sdhci-omap. What platforms have you tested this on?



Gentle ping on this. I tried testing these with an am335x-evm board. In 
their current condition, the card fails to enumerate altogether. The 
changes suggested by Adrian should fix this. Let me know when you post a v3.


Thanks,
Faiz


Re: [PATCH v2 0/3] Add support for using external dma in SDHCI

2018-11-28 Thread Rizvi, Mohammad Faiz Abbas

+ Mark Brown

Chunyan,

On 11/21/2018 5:17 PM, Faiz Abbas wrote:

Hi Chunyan,

On 12/11/18 12:56 PM, Chunyan Zhang wrote:

Currently the generic SDHCI code in the Linux kernel supports the SD
standard DMA integrated into the host controller but does not have any
support for external DMA controllers implemented using dmaengine meaning
that custom code is needed for any systems that use a generic DMA
controller with SDHCI which in practice means any SDHCI controller that
doesn't have an integrated DMA controller so we should have this as a
generic feature.

There are already a number of controller specific drivers that have dmaengine
code, and some could use sdhci.c actually, but needed to implement 
mmc_ops->request()
in their specific driver for sending command with external dma using dmaengine
framework, with this patchset, them will take advantage of the generic support.
TI's omap controller is the case as an example.

Any comments are very appreciated.



This is great. It helps us move am335x and am43xx platforms to
sdhci-omap. What platforms have you tested this on?



Gentle ping on this. I tried testing these with an am335x-evm board. In 
their current condition, the card fails to enumerate altogether. The 
changes suggested by Adrian should fix this. Let me know when you post a v3.


Thanks,
Faiz