Re: [PATCH 4/9] dma: edma: Find missed events and issue them
On 08/01/2013 01:13 AM, Sekhar Nori wrote:
> On Thursday 01 August 2013 07:57 AM, Joel Fernandes wrote:
>> On 07/31/2013 04:18 AM, Sekhar Nori wrote:
>>> On Wednesday 31 July 2013 10:19 AM, Joel Fernandes wrote:
>>>> Hi Sekhar,
>>>>
>>>> On 07/30/2013 02:05 AM, Sekhar Nori wrote:
>>>>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote:
>>>>>> In an effort to move to using scatter-gather lists of any size with
>>>>>> EDMA as discussed at [1] instead of placing limitations on the driver,
>>>>>> we work through the limitations of the EDMAC hardware to find missed
>>>>>> events and issue them.
>>>>>>
>>>>>> The sequence of events that requires this is:
>>>>>>
>>>>>> For the scenario where MAX slots for an EDMA channel is 3:
>>>>>>
>>>>>> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null
>>>>>>
>>>>>> The above SG list will have to be DMA'd in 2 sets:
>>>>>>
>>>>>> (1) SG1 -> SG2 -> SG3 -> Null
>>>>>> (2) SG4 -> SG5 -> SG6 -> Null
>>>>>>
>>>>>> After (1) is successfully transferred, the events from the MMC controller
>>>>>> do not stop coming and are missed by the time we have set up the transfer
>>>>>> for (2). So here, we catch the missed events as an error condition and
>>>>>> issue them manually.
>>>>>
>>>>> Are you sure there won't be any effect of these missed events on the
>>>>> peripheral side? For example, won't McASP get into an underrun condition
>>>>> when it encounters a null PaRAM set? Even UART has to transmit at a
>>>>
>>>> But it will not encounter a null PaRAM set, because McASP uses contiguous
>>>> buffers for transfer which are not scattered across physical memory.
>>>> This can be accomplished with an SG of size 1. For such SGs, this patch
>>>> series leaves it linked to the Dummy set and does not link to the Null
>>>> set. The Null set is only used for SG lists that are > MAX_NR_SG in size,
>>>> such as those created, for example, by MMC and Crypto.
>>>>
>>>>> particular baud so I guess it cannot wait like the way MMC/SD can.
>>>>
>>>> Existing drivers have to wait anyway if they hit the MAX SG limit today.
>>>> If they don't want to wait, they would have allocated a contiguous block
>>>> of memory and DMA'd that in one stretch so they don't lose any events,
>>>> and in such cases we are not linking to Null.
>>>
>>> As long as the DMA driver can advertise its MAX SG limit, peripherals can
>>> always work around that by limiting the number of sync events they
>>> generate so as to not have any of the events getting missed. With this
>>> series, I am worried that the EDMA driver is advertising that it can
>>> handle any length SG list while not taking care of missing any events
>>> while doing so. This will break the assumptions that driver writers make.
>>
>> This is already being done by some other DMA engine drivers ;). We can
>> advertise more than we can handle at a time; that's the basis of this
>> whole idea.
>>
>> I understand what you're saying, but events are not something that have
>> to be serviced immediately; they can be queued etc. and the actual
>> transfer from the DMA controller can be delayed. As long as we don't
>> miss the event we are fine, which my series takes care of.
>>
>> So far I have tested this series on the following modules in various
>> configurations and have seen no issues:
>> - Crypto AES
>> - MMC/SD
>> - SPI (128x160 display)
>
> Notice how in each of these cases the peripheral is in control of when
> data is driven out? Please test with McASP in a configuration where the
> codec drives the frame-sync/bit-clock or with UART under a high baud rate.

McASP allocates a contiguous buffer. For this case there is always an SG of
size 1 and this patch series doesn't affect it at all; there is no stalling.
Further, the McASP audio driver is still awaiting conversion to use the DMA
engine, so there's no way yet to test it.

>>>>> Also, won't this lead to under-utilization of the peripheral bandwidth?
>>>>> Meaning, MMC/SD is ready with data but cannot transfer because the DMA
>>>>> is waiting to be set up.
>>>>
>>>> But it is waiting anyway even today. Currently based on MAX segs, MMC >>
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
Just some corrections here..

On 08/01/2013 03:28 PM, Joel Fernandes wrote:
>>> 2. If the interrupt handler for some reason doesn't complete or get
>>> serviced in time, we will end up DMA'ing incorrect data as events
>>> wouldn't stop coming in even if the interrupt is not yet handled (in your
>>> example linked sets P1 or P2 would be old ones being repeated). Whereas
>>> with my method, we are not doing any DMA once we finish the current
>>> MAX_NR_SG set even if events continue to come.
>>
>> Where is the repetition and possibility of wrong data being transferred?
>> We have a linear list of PaRAM sets - not a loop. You would link the end
>> of the PaRAM set chain to the dummy PaRAM set which BTW will not cause
>> missed events. The more PaRAM sets you add to the chain, the more
>
> There would have to be a loop, how else would you ensure continuity and
> uninterrupted DMA?
>
> Consider if you have 2 sets of linked sets:
> L1 is the first set of linked sets and L2 is the second.
>
> When L1 is done, EDMA continues with L2 (due to the link) while the
> interrupt handler prepares L1. The continuity depends on L1 being linked
> to L2. Only the absolute last break-up of the MAX_NR_SG linked set will
> be linked to Dummy.
>
> So consider MAX_NR_SG=10, and sg_len = 35
>
> L1 - L2 - L1 - L1 - Dummy

Should be, L1 - L2 - L1 - L2 - Dummy

>
> The split would be in number of slots,
> 10 - 10 - 10 - 5 - Dummy
>
>> time CPU gets to intervene before DMA eventually stalls. This is a
>> tradeoff system designers can manage.
>
> Consider what happens in the case where MAX_SG_NR=1 or 2. In that case,
> there's a chance we might not get enough time for the interrupt handler
> to set up the next series of linked sets.
>
> Somehow this limitation has to be overcome by advising in comments than
> MAX_SG_NR should always be greater than a certain number to ensure
> proper operation.
s/than/that/

Thanks,

-Joel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
Hi Sekhar,

Thanks for your detailed illustrations.

On 08/02/2013 08:26 AM, Sekhar Nori wrote:
[..]
>> This can be used only for buffers that are contiguous in memory, not
>> those that are scattered across memory.
>
> I was hinting at using the linking facility of EDMA to achieve this.
> Each PaRAM set has full 32-bit source and destination pointers so I see
> no reason why the non-contiguous case cannot be handled.
>
> Lets say you need to transfer SG[0..6] on channel C. Now, PaRAM sets are
> typically 4 times the number of channels. In this case we use one DMA
> PaRAM set and two Link PaRAM sets per channel. P0 is the DMA PaRAM set
> and P1 and P2 are the Link sets.
>
> Initial setup:
>
> SG0 -> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>  ^      ^      ^
>  |      |      |
>  P0 ->  P1 ->  P2 -> NULL
>
> P[0..2].TCINTEN = 1, so get an interrupt after each SG element
> completion. On each completion interrupt, hardware automatically copies
> the linked PaRAM set into the DMA PaRAM set so after SG0 is transferred
> out, the state of hardware is:
>
> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>  ^      ^
>  |      |
>  P0,1   P2 -> NULL
>  |      ^
>  |      |
>  --------
>
> SG1 transfer has already started by the time the TC interrupt is
> handled. As you can see P1 is now redundant and ready to be recycled. So
> in the interrupt handler, software recycles P1. Thus:
>
> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>  ^      ^      ^
>  |      |      |
>  P0 ->  P2 ->  P1 -> NULL
>
> Now, on the next interrupt, P2 gets copied and thus can get recycled.
> Hardware state:
>
> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>  ^      ^
>  |      |
>  P0,2   P1 -> NULL
>  |      ^
>  |      |
>  --------
>
> As part of TC completion interrupt handling:
>
> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>  ^      ^      ^
>  |      |      |
>  P0 ->  P1 ->  P2 -> NULL
>
> This goes on until the SG list is exhausted. If you use more PaRAM sets,
> the interrupt handler gets more time to recycle the PaRAM set. At no
> point do we touch P0 as it is always under active transfer. Thus the
> peripheral is always kept busy.
>
> Do you see any reason why such a mechanism cannot be implemented?

This is possible and looks like another way to do it, but there are 2
problems I can see with it.

1. It's inefficient because of too many interrupts:

Imagine a case where we have an SG list of size 30 and MAX_NR_SG size is
10. This method will always trigger 30 interrupts, whereas with my patch
series you'd get only 3 interrupts. If you increase MAX_SG_NR, you'd get
even fewer interrupts.

>>>
>>> Yes, but you are seeing only one side of the inefficiency. In your
>>> design DMA *always* stalls waiting for the CPU to intervene. The whole
>>> point of DMA is to keep it going while the CPU does bookkeeping in the
>>> background. This is simply not going to scale with fast peripherals.
>>
>> Agreed. So far though, I've no way to reproduce a fast peripheral that
>> scatters data across physical memory and suffers from any stall.
>>
>>> Besides, missed events are error conditions as far as EDMA and the
>>> peripheral are concerned. You are handling an error interrupt to
>>> support a successful transaction. Think about why EDMA considers missed
>>> events an error condition.
>>
>> I agree with this, it's not the best way to do it. I have been working
>> on a different approach.
>>
>> However, in support of the series:
>> 1. It doesn't break any existing code
>> 2. It works for all current DMA users (performance and correctness)
>> 3. It removes the SG limitations on DMA users.
>
> Right, all of this should be true even with the approach I am suggesting.
>
>> So what you suggested would be more of a feature addition than a
>> limitation of this series. It is at least better than what's being done
>> now - forcing the limit to the total number of SGs, so it is a step in
>> the right direction.
>
> No, I do not see my approach as a feature addition to what you are
> doing. They are both very contrasting ways. For example, you would not
> need the manual (re)trigger in the CC error condition in what I am
> proposing.
>
>> 2.
If the interrupt handler for some reason doesn't complete or get serviced
in time, we will end up DMA'ing incorrect data, as events wouldn't stop
coming in even if the interrupt is not yet handled (in your example linked
sets P1 or P2 would be old ones being repeated). Whereas with my method, we
are not doing any DMA once we finish the current MAX_NR_SG set even if
events continue to come.
>>>
>>> Where is the repetition and possibility of wrong data being
>>> transferred? We have a linear list of PaRAM sets - not a loop. You
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
Hi Sekhar,

Considering you agree with my understanding of the approach you proposed, I
worked on some code to quickly try the different approach (ping-pong
between sets); here is a hack patch:
https://github.com/joelagnel/linux-kernel/commits/dma/edma-no-sg-limits-interleaved

As I suspected, it also has problems with missing interrupts, coming back
to my other point about getting errors if the ISR doesn't get enough time
to set up the next transfer. If you use < 5 MAX_NR slots you start seeing
EDMA errors. For > 5 slots, I don't see errors, but there is stalling
because of missed interrupts.

I observe that for an SG-list of size 10, it takes at least 7 ms before the
interrupt handler (ISR) gets a chance to execute. This I feel is quite
long; what is your opinion about this?

Describing my approach here: if MAX slots is 10, for example, we split it
into 2 cyclically linked sets of size 5 each. Interrupts are set up to
trigger for every 5 PaRAM set transfers. After the first 5 transfers, the
ISR recycles them for the next 5 entries in the SG-list. This happens in
parallel/simultaneously as the second set of 5 is being transferred.

Thanks,

-Joel

On 08/02/2013 01:15 PM, Joel Fernandes wrote:
[..]
> Even in your diagrams you are actually showing such a cyclic link
>
>>>>>>
>>>>>> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>>>>>>  ^      ^      ^
>>>>>>  |      |      |
>>>>>>  P0 ->  P2 ->  P1 -> NULL
>
> Comparing this..
>
>>>>>>
>>>>>> Now, on next interrupt, P2 gets copied and thus can get recycled.
>>>>>> Hardware state:
>>>>>>
>>>>>> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>>>>>>  ^      ^
>>>>>>  |      |
>>>>>>  P0,2   P1 -> NULL
>>>>>>  |      ^
>>>>>>  |      |
>>>>>>  --------
>>>>>>
>>>>>> As part of TC completion interrupt handling:
>>>>>>
>>>>>> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>>>>>>  ^      ^      ^
>>>>>>  |      |      |
>>>>>>  P0 ->  P1 ->  P2 -> NULL
>
> .. with this.
Notice that P2 -> P1 became P1 -> P2.
>
> The next logical diagram would look like:
>
>>>>>>
>>>>>> Now, on next interrupt, P1 gets copied and thus can get recycled.
>>>>>> Hardware state:
>>>>>>
>>>>>> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>>>>>>  ^      ^
>>>>>>  |      |
>>>>>>  P0,1   P2 -> NULL
>>>>>>  |      ^
>>>>>>  |      |
>>>>>>  --------
>>>>>>
>>>>>> As part of TC completion interrupt handling:
>>>>>>
>>>>>> SG3 -> SG4 -> SG5 -> SG6 -> NULL
>>>>>>  ^      ^      ^
>>>>>>  |      |      |
>>>>>>  P0 ->  P2 ->  P1 -> NULL
>
> "P1 gets copied" happens only because of the cyclic link from P2 to P1;
> it wouldn't have happened if P2 was linked to Dummy as you described.
>
> Now coming to 2 linked sets vs 1, I meant the same thing: to give the
> interrupt handler more time, we could have something like:
>
>>>>>> As part of TC completion interrupt handling:
>>>>>>
>>>>>> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> NULL
>>>>>>  ^      ^      ^
>>>>>>  |      |      |
>>>>>>  P0 ->  P1 ->  P2 -> P3 -> P4 -> Null
>
> So what I was describing as 2 sets of linked sets is P1 and P2 being 1
> set, and P3 and P4 being another set. We would then recycle a complete
> set at the same time. That way the interrupt handler could do more at
> once and get more time to recycle. So we would set up TC interrupts only
> for P2 and P4 in the above diagrams.
>
> Thanks,
>
> -Joel
[PATCH v3 08/12] dma: edma: Add one more required slot to MAX slots
We'd now need a separate slot just for the channel and separate ones for
the 2 linked sets, so we make adjustments to allocate an extra slot
accordingly.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index a242269..df50a04 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -48,7 +48,7 @@

 /* Max of 16 segments per channel to conserve PaRAM slots */
 #define MAX_NR_SG		16
-#define EDMA_MAX_SLOTS		MAX_NR_SG
+#define EDMA_MAX_SLOTS		(MAX_NR_SG+1)
 #define EDMA_DESCRIPTORS	16

 struct edma_desc {
@@ -311,6 +311,9 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(

 	num_slots_needed = sg_len > MAX_NR_SG ? MAX_NR_SG : sg_len;

+	/* Allocate one extra to account for the channel itself */
+	num_slots_needed++;
+
 	for (i = 0; i < num_slots_needed; i++) {
 		if (echan->slot[i] < 0) {
 			echan->slot[i] =
--
1.7.9.5
[PATCH v3 05/12] ARM: edma: Add function to enable interrupt for a PaRAM slot
To prevent common programming errors, add a function to enable interrupts
correctly. It also keeps calling code more readable.

Signed-off-by: Joel Fernandes
---
 arch/arm/common/edma.c             | 15 +++++++++++++++
 include/linux/platform_data/edma.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 6433b6c..aa43c49 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1095,6 +1095,21 @@ void edma_set_transfer_params(unsigned slot,
 EXPORT_SYMBOL(edma_set_transfer_params);

 /**
+ * edma_enable_interrupt - enable completion interrupt on a parameter RAM slot
+ * @slot: parameter RAM slot to enable the interrupt for
+ *
+ * Sets TCINTEN in the slot's OPT field.
+ */
+void edma_enable_interrupt(unsigned slot)
+{
+	pr_debug("Enabling interrupt on slot %d\n", slot);
+
+	edma_parm_modify(EDMA_CTLR(slot), PARM_OPT, EDMA_CHAN_SLOT(slot),
+			 ~TCINTEN, TCINTEN);
+}
+EXPORT_SYMBOL(edma_enable_interrupt);
+
+/**
  * edma_link - link one parameter RAM slot to another
  * @from: parameter RAM slot originating the link
  * @to: parameter RAM slot which is the link target
diff --git a/include/linux/platform_data/edma.h b/include/linux/platform_data/edma.h
index 57300fd..184ff5f 100644
--- a/include/linux/platform_data/edma.h
+++ b/include/linux/platform_data/edma.h
@@ -136,6 +136,7 @@ void edma_set_dest_index(unsigned slot, s16 dest_bidx, s16 dest_cidx);
 void edma_set_transfer_params(unsigned slot, u16 acnt, u16 bcnt, u16 ccnt,
 		u16 bcnt_rld, enum sync_dimension sync_mode);
 void edma_link(unsigned from, unsigned to);
+void edma_enable_interrupt(unsigned slot);
 void edma_unlink(unsigned from);

 /* calls that operate on an entire parameter RAM slot */
--
1.7.9.5
[PATCH v3 06/12] ARM: edma: Add pr_debug in edma_link
Useful for visualizing linking of PaRAM slots.

Signed-off-by: Joel Fernandes
---
 arch/arm/common/edma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index aa43c49..34d3fc9 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1125,6 +1125,8 @@ void edma_link(unsigned from, unsigned to)
 	ctlr_to = EDMA_CTLR(to);
 	to = EDMA_CHAN_SLOT(to);

+	pr_debug("Setting up link %d -> %d\n", from, to);
+
 	if (from >= edma_cc[ctlr_from]->num_slots)
 		return;
 	if (to >= edma_cc[ctlr_to]->num_slots)
--
1.7.9.5
[PATCH v3 03/12] dma: edma: remove limits on number of slots
With this series, this check is no longer required and we shouldn't need to
reject drivers DMA'ing more than the MAX number of slots.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 7b0853c..b6d609c 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -247,12 +247,6 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 		return NULL;
 	}

-	if (sg_len > MAX_NR_SG) {
-		dev_err(dev, "Exceeded max SG segments %d > %d\n",
-			sg_len, MAX_NR_SG);
-		return NULL;
-	}
-
 	edesc = kzalloc(sizeof(*edesc) + sg_len * sizeof(edesc->pset[0]),
 			GFP_ATOMIC);
 	if (!edesc) {
--
1.7.9.5
[PATCH v3 10/12] dma: edma: Check if MAX_NR_SG is even in prep function
Splitting the MAX available slots into 2 sets of size MAX_NR_LS requires
MAX_NR_SG to be even. We ensure the same in the prep function.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 70923a2..061f0cf 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -354,6 +354,9 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 	int src_bidx, dst_bidx, src_cidx, dst_cidx;
 	int i, num_slots_needed;

+	if (MAX_NR_SG & 1)
+		dev_err(dev, "%s: MAX_NR_SG must be even\n", __func__);
+
 	if (unlikely(!echan || !sgl || !sg_len))
 		return NULL;
--
1.7.9.5
[PATCH v3 04/12] dma: edma: Write out and handle MAX_NR_SG at a given time
Process SG-elements in batches of MAX_NR_SG if they are greater than
MAX_NR_SG. Due to this, at any given time only those many slots will be
used in the given channel no matter how long the scatter list is. We keep
track of how much has been written in order to process the next batch of
elements in the scatter-list and detect completion.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 76 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 50 insertions(+), 26 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index b6d609c..080d669 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -56,6 +56,7 @@ struct edma_desc {
 	struct list_head		node;
 	int				absync;
 	int				pset_nr;
+	int				total_processed;
 	struct edmacc_param		pset[0];
 };

@@ -104,22 +105,36 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
 /* Dispatch a queued descriptor to the controller (caller holds lock) */
 static void edma_execute(struct edma_chan *echan)
 {
-	struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan);
+	struct virt_dma_desc *vdesc;
 	struct edma_desc *edesc;
-	int i;
+	struct device *dev = echan->vchan.chan.device->dev;
+	int i, j, total_left, total_process;

-	if (!vdesc) {
-		echan->edesc = NULL;
-		return;
+	/* If either we processed all psets or we're still not started */
+	if (!echan->edesc ||
+	    echan->edesc->pset_nr == echan->edesc->total_processed) {
+		/* Get next vdesc */
+		vdesc = vchan_next_desc(&echan->vchan);
+		if (!vdesc) {
+			echan->edesc = NULL;
+			return;
+		}
+		list_del(&vdesc->node);
+		echan->edesc = to_edma_desc(&vdesc->tx);
 	}
-	list_del(&vdesc->node);

+	edesc = echan->edesc;
+
+	/* Find out how many left */
+	total_left = edesc->pset_nr - edesc->total_processed;
+	total_process = total_left > MAX_NR_SG ? MAX_NR_SG : total_left;

-	echan->edesc = edesc = to_edma_desc(&vdesc->tx);
 	/* Write descriptor PaRAM set(s) */
-	for (i = 0; i < edesc->pset_nr; i++) {
-		edma_write_slot(echan->slot[i], &edesc->pset[i]);
+	for (i = 0; i < total_process; i++) {
+		j = i + edesc->total_processed;
+		edma_write_slot(echan->slot[i], &edesc->pset[j]);
 		dev_dbg(echan->vchan.chan.device->dev,
			"\n pset[%d]:\n"
			"  chnum\t%d\n"
			"  slot\t%d\n"
			"  opt\t%08x\n"
@@ -132,24 +147,29 @@ static void edma_execute(struct edma_chan *echan)
			"  bidx\t%08x\n"
			"  cidx\t%08x\n"
			"  lkrld\t%08x\n",
-			i, echan->ch_num, echan->slot[i],
-			edesc->pset[i].opt,
-			edesc->pset[i].src,
-			edesc->pset[i].dst,
-			edesc->pset[i].a_b_cnt,
-			edesc->pset[i].ccnt,
-			edesc->pset[i].src_dst_bidx,
-			edesc->pset[i].src_dst_cidx,
-			edesc->pset[i].link_bcntrld);
+			j, echan->ch_num, echan->slot[i],
+			edesc->pset[j].opt,
+			edesc->pset[j].src,
+			edesc->pset[j].dst,
+			edesc->pset[j].a_b_cnt,
+			edesc->pset[j].ccnt,
+			edesc->pset[j].src_dst_bidx,
+			edesc->pset[j].src_dst_cidx,
+			edesc->pset[j].link_bcntrld);
 		/* Link to the previous slot if not the last set */
-		if (i != (edesc->pset_nr - 1))
+		if (i != (total_process - 1))
 			edma_link(echan->slot[i], echan->slot[i+1]);
 		/* Final pset links to the dummy pset */
 		else
 			edma_link(echan->slot[i], echan->ecc->dummy_slot);
 	}

-	edma_start(echan->ch_num);
+	edesc->total_processed += total_process;
+
+	if (edesc->total_processed <= MAX_NR_SG) {
+		dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
+		edma_start(echan->ch_num);
+	}
 }

 static int edma_terminate_all(struct edma_chan *echan)
@@ -358,19 +378,23 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
 	struct edma_desc *edesc;
 	unsigned long flags;

-	/* Stop the channel */
-	edma_stop(echan->ch_num);
-
[PATCH v3 11/12] dma: edma: Keep track of pending interrupts (pending_acks)
The total_processed variable was being used to keep track of when DMA is
completed for a particular descriptor. This doesn't work anymore, as the
completion interrupt can come in much later than when the total_processed
variable is updated to reflect that all SG entries have been issued.
Introduce another variable that reflects the actual value. This will serve
a purpose different from what total_processed was required to help with.

Also add code that gives special treatment to the first linked set
interrupt, as pending_acks is handled a bit differently for that case.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 061f0cf..eca1b47 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -59,6 +59,10 @@ struct edma_desc {
 	int				pset_nr;
 	int				total_processed;
 	int				next_setup_linkset;
+
+	int				no_first_ls_ack;
+	int				pending_acks;
+
 	struct edmacc_param		pset[0];
 };

@@ -147,8 +151,7 @@ static void edma_execute(struct edma_chan *echan)
 	struct edmacc_param tmp_param;

 	/* If either we processed all psets or we're still not started */
-	if (!echan->edesc ||
-	    echan->edesc->pset_nr == echan->edesc->total_processed) {
+	if (!echan->edesc || !echan->edesc->pending_acks) {
 		/* Get next vdesc */
 		vdesc = vchan_next_desc(&echan->vchan);
 		if (!vdesc) {
@@ -180,6 +183,8 @@ static void edma_execute(struct edma_chan *echan)

 	edesc->total_processed += total_link_set;

+	edesc->pending_acks += total_link_set;
+
 	total_left = edesc->pset_nr - edesc->total_processed;
 	total_link_set = total_left > MAX_NR_LS ?
@@ -190,6 +195,8 @@ static void edma_execute(struct edma_chan *echan)
 	   where total pset_nr is strictly within MAX_NR size */
 	if (total_left > total_link_set)
 		edma_enable_interrupt(echan->slot[i]);
+	else
+		edesc->no_first_ls_ack = 1;

 	/* Setup link between linked set 0 to set 1 */
 	edma_link(echan->slot[i], echan->slot[i+1]);
@@ -210,6 +217,9 @@ static void edma_execute(struct edma_chan *echan)
 	}

 	edesc->total_processed += total_link_set;
+
+	edesc->pending_acks += total_link_set;
+
 	total_left = edesc->pset_nr - edesc->total_processed;

 	if (total_left)
@@ -267,6 +277,8 @@ static void edma_execute(struct edma_chan *echan)

 	edesc->total_processed += total_link_set;

+	edesc->pending_acks += total_link_set;
+
 	slot_off = total_link_set + ls_cur_off - 1;

 	if (edesc->total_processed == edesc->pset_nr)
@@ -498,8 +510,22 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)

 	edesc = echan->edesc;

+	if (edesc->pending_acks < MAX_NR_LS)
+		edesc->pending_acks = 0;
+	else
+		edesc->pending_acks -= MAX_NR_LS;
+
+	if (edesc->no_first_ls_ack) {
+		if (edesc->pending_acks < MAX_NR_LS)
+			edesc->pending_acks = 0;
+		else
+			edesc->pending_acks -= MAX_NR_LS;
+
+		edesc->no_first_ls_ack = 0;
+	}
+
 	if (edesc) {
-		if (edesc->total_processed == edesc->pset_nr) {
+		if (!edesc->pending_acks) {
 			dev_dbg(dev, "Transfer complete,"
				" stopping channel %d\n", ch_num);
 			edma_stop(echan->ch_num);
--
1.7.9.5
[PATCH v3 12/12] dma: edma: Return if nothing left to do in edma_execute
It is possible that edma_execute is called even when all the SG elements
have been submitted for transmission; add a check for this and avoid
executing the rest of the function.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index eca1b47..62987fc 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -168,6 +168,9 @@ static void edma_execute(struct edma_chan *echan)
 	total_left = edesc->pset_nr - edesc->total_processed;
 	total_link_set = total_left > MAX_NR_LS ? MAX_NR_LS : total_left;

+	if (!total_left)
+		return;
+
 	/* First time, setup 2 cyclically linked sets, each containing
 	   half the slots allocated for this channel */
 	if (edesc->total_processed == 0) {
--
1.7.9.5
[PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop
We certainly don't want error conditions to be cleared anywhere but the
EDMA error handler, as clearing them elsewhere makes us 'forget' missed
events that we need to know about to handle the errors that have occurred.
This fixes a race condition where the EMR was being cleared by the transfer
completion interrupt handler.

Basically, what was happening was:

             Missed event
                  |
                  |
                  V
       SG1 - SG2 - SG3 - Null
                 \
                  \__ TC Interrupt

(At almost the same time as the ARM is executing the TC interrupt handler,
an event got missed and was also forgotten by clearing the EMR.)

This causes the following problems:

1. If an error interrupt is also pending and the TC interrupt clears the
EMR by calling edma_stop, as has been observed in the edma_callback
function, the ARM will execute the error interrupt handler even though the
EMR is clear. As a result, dma_ccerr_handler returns IRQ_NONE. If this
happens enough times, the IRQ subsystem disables the interrupt thinking
it's spurious, which makes the error handler never execute again.

2. Even if the error handler doesn't return IRQ_NONE, clearing the EMR
removes the knowledge of which channel had a missed event, and thus a
manual trigger on such channels cannot be performed.

The EMR is ultimately cleared by the error interrupt handler once the error
is handled, so we remove the code that does it in edma_stop and allow it to
happen there.

Signed-off-by: Joel Fernandes
---
 arch/arm/common/edma.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 3567ba1..6433b6c 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1307,7 +1307,6 @@ void edma_stop(unsigned channel)
 		edma_shadow0_write_array(ctlr, SH_EECR, j, mask);
 		edma_shadow0_write_array(ctlr, SH_ECR, j, mask);
 		edma_shadow0_write_array(ctlr, SH_SECR, j, mask);
-		edma_write_array(ctlr, EDMA_EMCR, j, mask);

 		pr_debug("EDMA: EER%d %08x\n", j,
			 edma_shadow0_read_array(ctlr, SH_EER, j));
--
1.7.9.5
[PATCH v3 07/12] dma: edma: Add function to dump a PaRAM set from PaRAM
Previously such a dump function was used, but it wasn't reading from the
PaRAM - rather just from an edmacc_param structure. We add a helpful
function for debugging that reads directly from the PaRAM and gives the
current state correctly (including links and interrupt information).

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 080d669..a242269 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -102,6 +102,37 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
 	kfree(container_of(vdesc, struct edma_desc, vdesc));
 }

+static inline void dump_pset(struct edma_chan *echan, int slot,
+			     struct edmacc_param *pset_unused, int pset_idx)
+{
+	struct edmacc_param pset_local, *pset;
+	pset = &pset_local;
+
+	edma_read_slot(slot, pset);
+
+	dev_dbg(echan->vchan.chan.device->dev,
+		"\n pset[%d]:\n"
+		"  chnum\t%d\n"
+		"  slot\t%d\n"
+		"  opt\t%08x\n"
+		"  src\t%08x\n"
+		"  dst\t%08x\n"
+		"  abcnt\t%08x\n"
+		"  ccnt\t%08x\n"
+		"  bidx\t%08x\n"
+		"  cidx\t%08x\n"
+		"  lkrld\t%08x\n",
+		pset_idx, echan->ch_num, slot,
+		pset[0].opt,
+		pset[0].src,
+		pset[0].dst,
+		pset[0].a_b_cnt,
+		pset[0].ccnt,
+		pset[0].src_dst_bidx,
+		pset[0].src_dst_cidx,
+		pset[0].link_bcntrld);
+}
+
 /* Dispatch a queued descriptor to the controller (caller holds lock) */
 static void edma_execute(struct edma_chan *echan)
 {
--
1.7.9.5
[PATCH v3 00/12] edma: Add support for SG lists of any length
Here is an improved approach for DMA support of SG lists of any length in the EDMA DMA Engine driver. In the previous approach [1] we depended on error interrupts to detect missed events and manually retrigger them; however, as discussed in [2], there are concerns that this can be troublesome for high-speed peripherals which may need a more real-time response from the DMA controller. In this approach, we divide the total number of MAX slots per channel into 2 linked sets which are cyclically linked to each other (the cyclic link between the 2 sets makes sure that the DMA is continuous until the whole SG list has been exhausted). We then enable completion interrupts on both linked sets, which results in recycling/preparing the respective linked set with the next set of SG entries. The interrupt handler executes in parallel while the EDMA controller DMA's the next list. This results in no interruption. Special handling is done for the first linked set (as we set up both linked sets initially before starting the DMA), and for the last one, where the cyclic link has to be broken and a link to the Dummy slot has to be created. We also keep track of whether all pending DMA operations have completed before we can mark the transfer as complete. 
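The two-linked-set scheme described above can be modeled in plain C. This is an illustrative sketch only: MAX_NR_SG mirrors the driver's constant, but count_linked_set_refills() is a hypothetical helper (not driver code) showing how many interrupt-driven refills a given SG list needs when each completion recycles one half-sized linked set.

```c
#include <assert.h>

/* Mirrors the driver's per-channel slot cap; MAX_NR_LS is half of it,
 * one linked set's worth, as in the series. Illustration only. */
#define MAX_NR_SG 16
#define MAX_NR_LS (MAX_NR_SG >> 1)

/* Walk an SG list of sg_len entries in chunks of at most MAX_NR_LS,
 * the way each completion interrupt recycles one linked set.
 * Returns the number of linked-set setups/refills needed. */
static int count_linked_set_refills(int sg_len)
{
	int processed = 0, refills = 0;

	while (processed < sg_len) {
		int left = sg_len - processed;
		int batch = left > MAX_NR_LS ? MAX_NR_LS : left;

		processed += batch;
		refills++;
	}
	return refills;
}
```

So a 40-entry list is consumed in 5 batches of 8 slots each, with the two sets ping-ponging and no gap between batches.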
[1] https://lkml.org/lkml/2013/7/29/312 [2] https://lkml.org/lkml/2013/7/30/54 Joel Fernandes (12): dma: edma: Setup parameters to DMA MAX_NR_SG at a time ARM: edma: Don't clear EMR of channel in edma_stop dma: edma: remove limits on number of slots dma: edma: Write out and handle MAX_NR_SG at a given time ARM: edma: Add function to enable interrupt for a PaRAM slot ARM: edma: Add pr_debug in edma_link dma: edma: Add function to dump a PaRAM set from PaRAM dma: edma: Add one more required slot to MAX slots dma: edma: Implement multiple linked sets for continuity dma: edma: Check if MAX_NR_SG is even in prep function dma: edma: Keep tracking of Pending interrupts (pending_acks) dma: edma: Return if nothing left todo in edma_execute arch/arm/common/edma.c | 18 ++- drivers/dma/edma.c | 279 +--- include/linux/platform_data/edma.h |1 + 3 files changed, 243 insertions(+), 55 deletions(-) -- 1.7.9.5
[PATCH v3 09/12] dma: edma: Implement multiple linked sets for continuity
Here we split the total MAX number of slots available for a channel into 2 cyclically linked sets. Transfer completion interrupts are enabled on both linked sets, and the respective handler recycles each set on completion to process the next linked set. Both linked sets are cyclically linked to each other to ensure continuity of DMA operations. The interrupt handlers execute asynchronously to the EDMA events and recycle the linked sets at the right time; as a result, the EDMA is not blocked on or dependent on interrupts, and DMA continues till the end of the SG lists without any interruption. Suggested-by: Sekhar Nori Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 157 +++- 1 file changed, 118 insertions(+), 39 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index df50a04..70923a2 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -48,6 +48,7 @@ /* Max of 16 segments per channel to conserve PaRAM slots */ #define MAX_NR_SG 16 +#define MAX_NR_LS (MAX_NR_SG >> 1) #define EDMA_MAX_SLOTS (MAX_NR_SG+1) #define EDMA_DESCRIPTORS 16 @@ -57,6 +58,7 @@ struct edma_desc { int absync; int pset_nr; int total_processed; + int next_setup_linkset; struct edmacc_param pset[0]; }; @@ -140,7 +142,9 @@ static void edma_execute(struct edma_chan *echan) struct edma_desc *edesc; struct device *dev = echan->vchan.chan.device->dev; - int i, j, total_left, total_process; + int i, total_left, total_link_set; + int ls_cur_off, ls_next_off, slot_off; + struct edmacc_param tmp_param; /* If either we processed all psets or we're still not started */ if (!echan->edesc || @@ -159,48 +163,121 @@ static void edma_execute(struct edma_chan *echan) /* Find out how many left */ total_left = edesc->pset_nr - edesc->total_processed; - total_process = total_left > 
MAX_NR_SG : total_left; - - - /* Write descriptor PaRAM set(s) */ - for (i = 0; i < total_process; i++) { - j = i + edesc->total_processed; - edma_write_slot(echan->slot[i], &edesc->pset[j]); - dev_dbg(echan->vchan.chan.device->dev, - "\n pset[%d]:\n" - " chnum\t%d\n" - " slot\t%d\n" - " opt\t%08x\n" - " src\t%08x\n" - " dst\t%08x\n" - " abcnt\t%08x\n" - " ccnt\t%08x\n" - " bidx\t%08x\n" - " cidx\t%08x\n" - " lkrld\t%08x\n", - j, echan->ch_num, echan->slot[i], - edesc->pset[j].opt, - edesc->pset[j].src, - edesc->pset[j].dst, - edesc->pset[j].a_b_cnt, - edesc->pset[j].ccnt, - edesc->pset[j].src_dst_bidx, - edesc->pset[j].src_dst_cidx, - edesc->pset[j].link_bcntrld); - /* Link to the previous slot if not the last set */ - if (i != (total_process - 1)) + total_link_set = total_left > MAX_NR_LS ? MAX_NR_LS : total_left; + + /* First time, setup 2 cyclically linked sets, each containing half + the slots allocated for this channel */ + if (edesc->total_processed == 0) { + for (i = 0; i < total_link_set; i++) { + edma_write_slot(echan->slot[i+1], &edesc->pset[i]); + + if (i != total_link_set - 1) { + edma_link(echan->slot[i+1], echan->slot[i+2]); + dump_pset(echan, echan->slot[i+1], + edesc->pset, i); + } + } + + edesc->total_processed += total_link_set; + + total_left = edesc->pset_nr - edesc->total_processed; + + total_link_set = total_left > MAX_NR_LS ? +MAX_NR_LS : total_left; + + if (total_link_set) { + /* Don't setup interrupt for first linked set for cases + where total pset_nr is strictly within MAX_NR size */ + if (total_left > total_link_set) + edma_enable_interrupt(echan->slot[i]); + + /* Setup link between linked set 0 to set
[PATCH v3 01/12] dma: edma: Setup parameters to DMA MAX_NR_SG at a time
Changes are made here to configure the existing parameters to support DMA'ing them out in batches as needed. Also, allocate only as many slots as needed by the SG list, but not more than MAX_NR_SG; these slots will then be reused accordingly. For example, if MAX_NR_SG=10 and the number of SG entries is 40, still only 10 slots will be allocated to DMA the entire SG list of size 40. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index 5f3e532..7b0853c 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -222,9 +222,9 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( enum dma_slave_buswidth dev_width; u32 burst; struct scatterlist *sg; - int i; int acnt, bcnt, ccnt, src, dst, cidx; int src_bidx, dst_bidx, src_cidx, dst_cidx; + int i, num_slots_needed; if (unlikely(!echan || !sgl || !sg_len)) return NULL; @@ -262,8 +262,11 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( edesc->pset_nr = sg_len; - for_each_sg(sgl, sg, sg_len, i) { - /* Allocate a PaRAM slot, if needed */ + /* Allocate a PaRAM slot, if needed */ + + num_slots_needed = sg_len > MAX_NR_SG ? 
MAX_NR_SG : sg_len; + + for (i = 0; i < num_slots_needed; i++) { if (echan->slot[i] < 0) { echan->slot[i] = edma_alloc_slot(EDMA_CTLR(echan->ch_num), @@ -273,6 +276,10 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( return NULL; } } + } + + /* Configure PaRAM sets for each SG */ + for_each_sg(sgl, sg, sg_len, i) { acnt = dev_width; @@ -330,6 +337,7 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( /* Configure A or AB synchronized transfers */ if (edesc->absync) edesc->pset[i].opt |= SYNCDIM; + /* If this is the last set, enable completion interrupt flag */ if (i == sg_len - 1) edesc->pset[i].opt |= TCINTEN; -- 1.7.9.5
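The slot-reuse rule in the commit message above reduces to a one-line clamp. The helper below is a hypothetical stand-alone version for illustration (the driver open-codes the same ternary with its own MAX_NR_SG constant):

```c
#include <assert.h>

/* Slots actually allocated for an SG list: never more than the
 * per-channel cap. With a cap of 10 and 40 SG entries, only 10 slots
 * are allocated and reused for every batch, as described above. */
static int num_slots_needed(int sg_len, int max_nr_sg)
{
	return sg_len > max_nr_sg ? max_nr_sg : sg_len;
}
```

The allocation loop then runs over num_slots_needed() instead of the full SG length, so memory cost stays bounded regardless of list size.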
Re: [PATCH v3 00/12] edma: Add support for SG lists of any length
In offline discussions, Sekhar and I went over some ideas for changing a few things in this patch series to make it fail-safe. As such, the only functional change for v4 will be to not link the sets cyclically up front, but to do so only once the ISR has finished its setup (apart from other style cleanups). Any case where we cannot modify the link in time (on a heavily loaded system) will be detected and reported through a link to a NULL set. The new approach will be fast because there is no requirement to stall or wait, and any DMA issues on heavily loaded systems can be detected as error conditions. This should architecturally be the final version of the patch series to add DMA support for SG lists of any length. Thanks, -Joel On 08/05/2013 11:14 AM, Joel Fernandes wrote: > Here is a more improved approach for DMA support of SG lists of any length > in the EDMA DMA Engine driver. > > In the previous approach [1] we depended on error interrupts to detect > missed events and manually retrigger them, however as discussed in [2], > there are concerns this can be trouble some for high speed peripherals > which may need a more real-time response from the DMA controller. > > In this approach, we divide the total no of MAX slots per channel, into > 2 linked sets which are cyclically linked to each other (the cyclic > link between the 2 sets make sure that the DMA is continuous till the whole > SG list has exhausted). We then enable completion interrupts on both linked > sets which results in recyling/preparing respective linked set with the > next set of SG entries. The interrupt handler executes in parallel while > the EDMA controller DMA's the next list. This results in no interruption. > > Special handling is done for first linked set (as we set up both linked > sets initially before starting with the DMA), and last one where the cyclic > link has to be broken and a link to the Dummy slot has to be created. 
> Also we keep track of whether all pending DMA operations have completed > before we can mark it as complete. > > [1] https://lkml.org/lkml/2013/7/29/312 > [2] https://lkml.org/lkml/2013/7/30/54 > > Joel Fernandes (12): > dma: edma: Setup parameters to DMA MAX_NR_SG at a time > ARM: edma: Don't clear EMR of channel in edma_stop > dma: edma: remove limits on number of slots > dma: edma: Write out and handle MAX_NR_SG at a given time > ARM: edma: Add function to enable interrupt for a PaRAM slot > ARM: edma: Add pr_debug in edma_link > dma: edma: Add function to dump a PaRAM set from PaRAM > dma: edma: Add one more required slot to MAX slots > dma: edma: Implement multiple linked sets for continuity > dma: edma: Check if MAX_NR_SG is even in prep function > dma: edma: Keep tracking of Pending interrupts (pending_acks) > dma: edma: Return if nothing left todo in edma_execute > > arch/arm/common/edma.c | 18 ++- > drivers/dma/edma.c | 279 > +--- > include/linux/platform_data/edma.h |1 + > 3 files changed, 243 insertions(+), 55 deletions(-) >
[PATCH v2 06/14] crypto: omap-aes: Remove previously used intermediate buffers
Intermediate buffers were allocated, mapped and used for DMA. These are no longer required, as previous commits in the series made us use the SGs from the crypto layer directly. Along with this, remove the now-unused logic for copying SGs and all the associated variables in omap_aes_dev. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 90 - 1 file changed, 90 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 5e034a1..bbdd1c3 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -149,25 +149,13 @@ struct omap_aes_dev { struct ablkcipher_request *req; size_t total; struct scatterlist *in_sg; - struct scatterlist in_sgl; - size_t in_offset; struct scatterlist *out_sg; - struct scatterlist out_sgl; - size_t out_offset; - - size_t buflen; - void*buf_in; - size_t dma_size; int dma_in; struct dma_chan *dma_lch_in; - dma_addr_t dma_addr_in; - void*buf_out; int dma_out; struct dma_chan *dma_lch_out; int in_sg_len; int out_sg_len; - dma_addr_t dma_addr_out; - const struct omap_aes_pdata *pdata; }; @@ -352,33 +340,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd) dd->dma_lch_out = NULL; dd->dma_lch_in = NULL; - dd->buf_in = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE); - dd->buf_out = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE); - dd->buflen = PAGE_SIZE << OMAP_AES_CACHE_SIZE; - dd->buflen &= ~(AES_BLOCK_SIZE - 1); - - if (!dd->buf_in || !dd->buf_out) { - dev_err(dd->dev, "unable to alloc pages.\n"); - goto err_alloc; - } - - /* MAP here */ - dd->dma_addr_in = dma_map_single(dd->dev, dd->buf_in, dd->buflen, -DMA_TO_DEVICE); - if (dma_mapping_error(dd->dev, dd->dma_addr_in)) { - dev_err(dd->dev, "dma %d bytes error\n", dd->buflen); - err = -EINVAL; - goto err_map_in; - } - - dd->dma_addr_out = dma_map_single(dd->dev, dd->buf_out, dd->buflen, - DMA_FROM_DEVICE); - if (dma_mapping_error(dd->dev, dd->dma_addr_out)) { - dev_err(dd->dev, "dma %d bytes 
error\n", dd->buflen); - err = -EINVAL; - goto err_map_out; - } - dma_cap_zero(mask); dma_cap_set(DMA_SLAVE, mask); @@ -405,14 +366,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd) err_dma_out: dma_release_channel(dd->dma_lch_in); err_dma_in: - dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen, -DMA_FROM_DEVICE); -err_map_out: - dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE); -err_map_in: - free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE); - free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE); -err_alloc: if (err) pr_err("error: %d\n", err); return err; @@ -422,11 +375,6 @@ static void omap_aes_dma_cleanup(struct omap_aes_dev *dd) { dma_release_channel(dd->dma_lch_out); dma_release_channel(dd->dma_lch_in); - dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen, -DMA_FROM_DEVICE); - dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE); - free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE); - free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE); } static void sg_copy_buf(void *buf, struct scatterlist *sg, @@ -443,42 +391,6 @@ static void sg_copy_buf(void *buf, struct scatterlist *sg, scatterwalk_done(&walk, out, 0); } -static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf, - size_t buflen, size_t total, int out) - { - unsigned int count, off = 0; - - while (buflen && total) { - count = min((*sg)->length - *offset, total); - count = min(count, buflen); - - if (!count) - return off; - - /* -* buflen and total are AES_BLOCK_SIZE size aligned, -* so count should be also aligned -*/ - - sg_copy_buf(buf + off, *s
[PATCH v2 00/14] crypto: omap-aes: Improve DMA, add PIO mode and support for AM437x
This patch series rewrites the DMA code to be cleaner and faster. Earlier, only a single SG was used for DMA, and the SG-list passed from the crypto layer was copied and DMA'd one entry at a time. This turned out to be quite inefficient and a lot of code; we replace it with a much simpler approach that directly passes the SG-list from the crypto layer to the DMA layer wherever possible. For the cases where such direct passing of the SG list is not possible, we create a new SG-list and do the copying. This is still better than before, as we create an SG list as big as needed rather than a 1-element list. We also add PIO mode support to the driver, and switch to it whenever DMA channel allocation is not available. This has also been shown to give good performance for small blocks, as seen below. Tests have been performed on AM335x, OMAP4 and AM437x SoCs. Below is a sample run on an AM335x SoC (beaglebone board), showing the performance improvement (20% for 8K blocks): With DMA rewrite (key size = 128-bit) 16 byte blocks: 4318 operations in 1 seconds (69088 bytes) 64 byte blocks: 4360 operations in 1 seconds (279040 bytes) 256 byte blocks: 3609 operations in 1 seconds (923904 bytes) 1024 byte blocks: 3418 operations in 1 seconds (3500032 bytes) 8192 byte blocks: 1766 operations in 1 seconds (14467072 bytes) Without DMA rewrite: 16 byte blocks: 4417 operations in 1 seconds (70672 bytes) 64 byte blocks: 4221 operations in 1 seconds (270144 bytes) 256 byte blocks: 3528 operations in 1 seconds (903168 bytes) 1024 byte blocks: 3281 operations in 1 seconds (3359744 bytes) 8192 byte blocks: 1460 operations in 1 seconds (11960320 bytes) With PIO mode, good performance is observed for small blocks: 16 byte blocks: 20585 operations in 1 seconds (329360 bytes) 64 byte blocks: 8106 operations in 1 seconds (518784 bytes) 256 byte blocks: 2359 operations in 1 seconds (603904 bytes) 1024 byte blocks: 605 operations in 1 seconds (619520 bytes) 8192 byte blocks: 79 operations in 1 
seconds (647168 bytes) Future work in this direction would be to dynamically change between PIO/DMA mode based on the block size. Changes since last series: * Unaligned cases for omap-aes are handled with patch: "Add support for cases of unaligned lengths" * Support for am437x SoC is added and tested. * Changes following review comments on debug patch Note: The debug patch: "crypto: omap-aes: Add useful debug macros" will generate a checkpatch error, which cannot be fixed. Refer to the patch for the error message and the reasons why it cannot be fixed, thanks. Joel Fernandes (14): crypto: scatterwalk: Add support for calculating number of SG elements crypto: omap-aes: Add useful debug macros crypto: omap-aes: Populate number of SG elements crypto: omap-aes: Simplify DMA usage by using direct SGs crypto: omap-aes: Sync SG before DMA operation crypto: omap-aes: Remove previously used intermediate buffers crypto: omap-aes: Add IRQ info and helper macros crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs crypto: omap-aes: PIO mode: platform data for OMAP4/AM437x and trigger crypto: omap-aes: Switch to PIO mode during probe crypto: omap-aes: Add support for cases of unaligned lengths crypto: omap-aes: Convert kzalloc to devm_kzalloc crypto: omap-aes: Convert request_irq to devm_request_irq crypto: omap-aes: Kconfig: Add build support for AM437x crypto/scatterwalk.c | 22 ++ drivers/crypto/Kconfig |2 +- drivers/crypto/omap-aes.c| 466 +++--- include/crypto/scatterwalk.h |2 + 4 files changed, 284 insertions(+), 208 deletions(-) -- 1.7.9.5
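The throughput figures quoted above are internally consistent: bytes per second is simply operations per second times block size. A quick sanity check of that arithmetic (the helper is purely illustrative):

```c
#include <assert.h>

/* Benchmark arithmetic from the cover letter: ops/sec * block size
 * reproduces the quoted bytes/sec figures, e.g. 1766 ops of 8192-byte
 * blocks = 14467072 bytes in one second. */
static long bytes_per_second(long ops, long block_size)
{
	return ops * block_size;
}
```

This also makes the trade-off visible: PIO wins at 16-byte blocks (329360 vs 69088 bytes/sec) but collapses at 8K, which is why switching modes by block size is listed as future work.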
[PATCH v2 08/14] crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
We add an IRQ handler that implements a state machine for PIO mode, along with data structures for walking the scatter-gather list. The IRQ handler is called in succession both when data is available to read and when the next data can be sent for processing. This process continues till the entire in/out SG lists have been walked. Once the SG list has been completely walked, the IRQ handler schedules the done_task tasklet. Also add a useful macro that is used throughout the IRQ code for a common pattern: calculating how far an SG list has been walked. This improves code readability and avoids checkpatch errors. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 90 + 1 file changed, 90 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 9a964e8..889dc99 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -39,6 +39,8 @@ #define DST_MAXBURST 4 #define DMA_MIN(DST_MAXBURST * sizeof(u32)) +#define _calc_walked(inout) (dd->inout##_walk.offset - dd->inout##_sg->offset) + /* OMAP TRM gives bitfields as start:end, where start is the higher bit number. 
For example 7:0 */ #define FLD_MASK(start, end) (((1 << ((start) - (end) + 1)) - 1) << (end)) @@ -91,6 +93,8 @@ #define FLAGS_FAST BIT(5) #define FLAGS_BUSY BIT(6) +#define AES_BLOCK_WORDS(AES_BLOCK_SIZE >> 2) + struct omap_aes_ctx { struct omap_aes_dev *dd; @@ -156,6 +160,8 @@ struct omap_aes_dev { size_t total; struct scatterlist *in_sg; struct scatterlist *out_sg; + struct scatter_walk in_walk; + struct scatter_walk out_walk; int dma_in; struct dma_chan *dma_lch_in; int dma_out; @@ -852,6 +858,90 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = { .minor_shift= 0, }; +static irqreturn_t omap_aes_irq(int irq, void *dev_id) +{ + struct omap_aes_dev *dd = dev_id; + u32 status, i; + u32 *src, *dst; + + status = omap_aes_read(dd, AES_REG_IRQ_STATUS(dd)); + if (status & AES_REG_IRQ_DATA_IN) { + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0); + + BUG_ON(!dd->in_sg); + + BUG_ON(_calc_walked(in) > dd->in_sg->length); + + src = sg_virt(dd->in_sg) + _calc_walked(in); + + for (i = 0; i < AES_BLOCK_WORDS; i++) { + omap_aes_write(dd, AES_REG_DATA_N(dd, i), *src); + + scatterwalk_advance(&dd->in_walk, 4); + if (dd->in_sg->length == _calc_walked(in)) { + dd->in_sg = scatterwalk_sg_next(dd->in_sg); + if (dd->in_sg) { + scatterwalk_start(&dd->in_walk, + dd->in_sg); + src = sg_virt(dd->in_sg) + + _calc_walked(in); + } + } else { + src++; + } + } + + /* Clear IRQ status */ + status &= ~AES_REG_IRQ_DATA_IN; + omap_aes_write(dd, AES_REG_IRQ_STATUS(dd), status); + + /* Enable DATA_OUT interrupt */ + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x4); + + } else if (status & AES_REG_IRQ_DATA_OUT) { + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0); + + BUG_ON(!dd->out_sg); + + BUG_ON(_calc_walked(out) > dd->out_sg->length); + + dst = sg_virt(dd->out_sg) + _calc_walked(out); + + for (i = 0; i < AES_BLOCK_WORDS; i++) { + *dst = omap_aes_read(dd, AES_REG_DATA_N(dd, i)); + scatterwalk_advance(&dd->out_walk, 4); + if (dd->out_sg->length == _calc_walked(out)) { + dd->out_sg = 
scatterwalk_sg_next(dd->out_sg); + if (dd->out_sg) { + scatterwalk_start(&dd->out_walk, + dd->out_sg); + dst = sg_virt(dd->out_sg) + + _calc_walked(out); + } + } else { + dst++; + } + } + + dd->total -= AES_BLOCK_SIZE; + + BUG_ON(dd->total < 0); +
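The word-by-word walk in the IRQ handler above can be modeled without kernel infrastructure. The sketch below is an assumption-laden stand-in: fake_sg and walk_copy() are hypothetical analogues of the scatterlist plus scatterwalk_advance()/scatterwalk_sg_next() pattern, showing how a 4-word AES block is gathered across segment boundaries.

```c
#include <assert.h>
#include <stddef.h>

#define AES_BLOCK_SIZE	16
#define AES_BLOCK_WORDS	(AES_BLOCK_SIZE >> 2)

/* Toy "scatterlist": an array of word buffers. Illustration only. */
struct fake_sg {
	const unsigned int *buf;
	size_t words;	/* segment length in 32-bit words */
};

/* Copy nwords words out of the segment list starting at *seg/*off,
 * crossing segment boundaries the way the handler advances its
 * scatter_walk and moves to the next sg entry. */
static void walk_copy(const struct fake_sg *sgl, size_t *seg, size_t *off,
		      unsigned int *dst, size_t nwords)
{
	for (size_t i = 0; i < nwords; i++) {
		dst[i] = sgl[*seg].buf[*off];
		(*off)++;
		if (*off == sgl[*seg].words) {	/* segment exhausted */
			(*seg)++;
			*off = 0;
		}
	}
}

/* Demo: two 2-word segments read out as one 4-word AES block;
 * packs the result digits into one int for easy checking. */
static unsigned int walk_demo(void)
{
	static const unsigned int a[2] = { 1, 2 }, b[2] = { 3, 4 };
	const struct fake_sg sgl[2] = { { a, 2 }, { b, 2 } };
	unsigned int out[AES_BLOCK_WORDS];
	size_t seg = 0, off = 0;

	walk_copy(sgl, &seg, &off, out, AES_BLOCK_WORDS);
	return out[0] * 1000 + out[1] * 100 + out[2] * 10 + out[3];
}
```

Keeping *seg/*off as persistent state mirrors why the real handler can resume mid-segment on the next interrupt.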
[PATCH v2 07/14] crypto: omap-aes: Add IRQ info and helper macros
Add IRQ information to pdata and helper macros. These are required for PIO-mode support. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c |8 1 file changed, 8 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index bbdd1c3..9a964e8 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -75,6 +75,10 @@ #define AES_REG_LENGTH_N(x)(0x54 + ((x) * 0x04)) +#define AES_REG_IRQ_STATUS(dd) ((dd)->pdata->irq_status_ofs) +#define AES_REG_IRQ_ENABLE(dd) ((dd)->pdata->irq_enable_ofs) +#define AES_REG_IRQ_DATA_INBIT(1) +#define AES_REG_IRQ_DATA_OUT BIT(2) #define DEFAULT_TIMEOUT(5*HZ) #define FLAGS_MODE_MASK0x000f @@ -120,6 +124,8 @@ struct omap_aes_pdata { u32 data_ofs; u32 rev_ofs; u32 mask_ofs; + u32 irq_enable_ofs; + u32 irq_status_ofs; u32 dma_enable_in; u32 dma_enable_out; @@ -836,6 +842,8 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = { .data_ofs = 0x60, .rev_ofs= 0x80, .mask_ofs = 0x84, + .irq_status_ofs = 0x8c, + .irq_enable_ofs = 0x90, .dma_enable_in = BIT(5), .dma_enable_out = BIT(6), .major_mask = 0x0700, -- 1.7.9.5
[PATCH v2 12/14] crypto: omap-aes: Convert kzalloc to devm_kzalloc
Use devm_kzalloc instead of kzalloc. With this change, there is no need to call kfree in error/exit paths. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c |4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index cbeaaf4..6a4ac4a 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -1147,7 +1147,7 @@ static int omap_aes_probe(struct platform_device *pdev) int err = -ENOMEM, i, j, irq = -1; u32 reg; - dd = kzalloc(sizeof(struct omap_aes_dev), GFP_KERNEL); + dd = devm_kzalloc(dev, sizeof(struct omap_aes_dev), GFP_KERNEL); if (dd == NULL) { dev_err(dev, "unable to alloc data struct.\n"); goto err_data; @@ -1241,7 +1241,6 @@ err_irq: tasklet_kill(&dd->queue_task); pm_runtime_disable(dev); err_res: - kfree(dd); dd = NULL; err_data: dev_err(dev, "initialization failed.\n"); @@ -1269,7 +1268,6 @@ static int omap_aes_remove(struct platform_device *pdev) tasklet_kill(&dd->queue_task); omap_aes_dma_cleanup(dd); pm_runtime_disable(dd->dev); - kfree(dd); dd = NULL; return 0; -- 1.7.9.5
[PATCH v2 02/14] crypto: omap-aes: Add useful debug macros
When DEBUG is enabled, these macros can be used to print variables in integer and hex format, and to clearly display which registers, offsets and values are being read/written, including the names of the offsets and their values. Note: This patch results in a checkpatch error that cannot be fixed. ERROR: Macros with multiple statements should be enclosed in a do - while loop +#define omap_aes_read(dd, offset) \ + __raw_readl(dd->io_base + offset); \ + pr_debug("omap_aes_read(" #offset ")\n"); Using a do-while loop would break a lot of code such as: ret = omap_aes_read(..); On the other hand, not using a do-while loop will only result in a spurious debug print message when DEBUG is enabled; any other issues would be caught at compile time. As such, there is no code in the driver as of now that requires a do-while loop, but there is code that will break if a do-while loop is used in the macro, so we ignore the checkpatch error in this case. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index ee15b0f..26b802b 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -13,7 +13,8 @@ * */ -#define pr_fmt(fmt) "%s: " fmt, __func__ +#define prn(num) pr_debug(#num "=%d\n", num) +#define prx(num) pr_debug(#num "=%x\n", num) #include #include @@ -172,16 +173,35 @@ struct omap_aes_dev { static LIST_HEAD(dev_list); static DEFINE_SPINLOCK(list_lock); +#ifdef DEBUG +/* + * Note: In DEBUG mode, when using conditionals, omap_aes_read _must_ + * be surrounded by braces, otherwise you may see spurious prints. 
+ */ +#define omap_aes_read(dd, offset) \ + __raw_readl(dd->io_base + offset); \ + pr_debug("omap_aes_read(" #offset ")\n"); +#else static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) { return __raw_readl(dd->io_base + offset); } +#endif +#ifdef DEBUG +#define omap_aes_write(dd, offset, value) \ + do {\ + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \ +offset, value);\ + __raw_writel(value, dd->io_base + offset); \ + } while (0) +#else static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, u32 value) { __raw_writel(value, dd->io_base + offset); } +#endif static inline void omap_aes_write_mask(struct omap_aes_dev *dd, u32 offset, u32 value, u32 mask) -- 1.7.9.5
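The value-returning problem described above (why do-while cannot be used for omap_aes_read) is exactly what GCC statement expressions solve, and the kernel uses them elsewhere. A user-space sketch with a fake register file, as an alternative the patch could have taken; all names here are hypothetical:

```c
#include <assert.h>

/* Stand-in for dd->io_base MMIO in this user-space sketch. */
static unsigned int fake_regs[4] = { 0u, 10u, 20u, 30u };

/* A GNU C statement expression groups the read with the (elided)
 * debug print while still yielding a value, so `ret = fake_read(i);`
 * works -- the property a do { ... } while (0) wrapper would break. */
#define fake_read(idx) ({				\
	unsigned int __v = fake_regs[idx];		\
	/* a pr_debug-style print would go here */	\
	__v;						\
})
```

Statement expressions are a GCC extension, but the kernel already requires GNU C, so this would not reduce portability there.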
[PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs
In early versions of this driver, assumptions were made such as that the DMA layer requires contiguous buffers. Because of this, new buffers were allocated, mapped and used for DMA. These assumptions are no longer true, and DMAEngine scatter-gather DMA has no such requirements. We simplify the DMA operations by directly using the scatter-gather buffers provided by the crypto layer instead of creating our own. A lot of logic that handled DMA'ing only X number of bytes of the total, or as much as fitted into a third-party buffer, is removed as it is no longer required. Also, a good performance improvement of at least ~20% is seen when encrypting a buffer of size 8K (1800 ops/sec vs 1400 ops/sec). The improvement will be higher for much larger blocks, though such benchmarking is left as an exercise for the reader. DMA usage is also much more simplified and coherent with the rest of the code. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 147 - 1 file changed, 25 insertions(+), 122 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index e369e6e..64dd5c1 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -480,22 +480,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf, } static int omap_aes_crypt_dma(struct crypto_tfm *tfm, - struct scatterlist *in_sg, struct scatterlist *out_sg) + struct scatterlist *in_sg, struct scatterlist *out_sg, + int in_sg_len, int out_sg_len) { struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm); struct omap_aes_dev *dd = ctx->dd; struct dma_async_tx_descriptor *tx_in, *tx_out; struct dma_slave_config cfg; - dma_addr_t dma_addr_in = sg_dma_address(in_sg); - int ret, length = sg_dma_len(in_sg); - - pr_debug("len: %d\n", length); - - dd->dma_size = length; - - if (!(dd->flags & FLAGS_FAST)) - dma_sync_single_for_device(dd->dev, dma_addr_in, length, - DMA_TO_DEVICE); + int ret; memset(&cfg, 0, sizeof(cfg)); @@ -514,7 +506,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, return ret; } 
- tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, 1, + tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, in_sg_len, DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!tx_in) { @@ -533,7 +525,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, return ret; } - tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, 1, + tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, out_sg_len, DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!tx_out) { @@ -551,7 +543,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, dma_async_issue_pending(dd->dma_lch_out); /* start DMA */ - dd->pdata->trigger(dd, length); + dd->pdata->trigger(dd, dd->total); return 0; } @@ -560,93 +552,28 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd) { struct crypto_tfm *tfm = crypto_ablkcipher_tfm( crypto_ablkcipher_reqtfm(dd->req)); - int err, fast = 0, in, out; - size_t count; - dma_addr_t addr_in, addr_out; - struct scatterlist *in_sg, *out_sg; - int len32; + int err; pr_debug("total: %d\n", dd->total); - if (sg_is_last(dd->in_sg) && sg_is_last(dd->out_sg)) { - /* check for alignment */ - in = IS_ALIGNED((u32)dd->in_sg->offset, sizeof(u32)); - out = IS_ALIGNED((u32)dd->out_sg->offset, sizeof(u32)); - - fast = in && out; + err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); + if (!err) { + dev_err(dd->dev, "dma_map_sg() error\n"); + return -EINVAL; } - if (fast) { - count = min(dd->total, sg_dma_len(dd->in_sg)); - count = min(count, sg_dma_len(dd->out_sg)); - - if (count != dd->total) { - pr_err("request length != buffer length\n"); - return -EINVAL; - } - - pr_debug("fast\n"); - - err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() error\n"); - return -EINVAL; - } - - err = dma_map_sg(dd->dev, dd->out_sg, 1,
[PATCH v2 10/14] crypto: omap-aes: Switch to PIO mode during probe
In cases where requesting DMA channels fails for some reason, or channel numbers are not provided in DT or platform data, we switch to PIO-only mode, provided the platform supplies IRQ numbers and interrupt register offsets in DT or platform data. All DMA-only paths are avoided in this mode. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 28 +++- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 62b3260..d214dbf 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -1064,7 +1064,7 @@ static int omap_aes_probe(struct platform_device *pdev) struct omap_aes_dev *dd; struct crypto_alg *algp; struct resource res; - int err = -ENOMEM, i, j; + int err = -ENOMEM, i, j, irq = -1; u32 reg; dd = kzalloc(sizeof(struct omap_aes_dev), GFP_KERNEL); @@ -1108,8 +1108,23 @@ static int omap_aes_probe(struct platform_device *pdev) tasklet_init(&dd->queue_task, omap_aes_queue_task, (unsigned long)dd); err = omap_aes_dma_init(dd); - if (err) - goto err_dma; + if (err && AES_REG_IRQ_STATUS(dd) && AES_REG_IRQ_ENABLE(dd)) { + dd->pio_only = 1; + + irq = platform_get_irq(pdev, 0); + if (irq < 0) { + dev_err(dev, "can't get IRQ resource\n"); + goto err_irq; + } + + err = request_irq(irq, omap_aes_irq, 0, + dev_name(dev), dd); + if (err) { + dev_err(dev, "Unable to grab omap-aes IRQ\n"); + goto err_irq; + } + } + INIT_LIST_HEAD(&dd->list); spin_lock(&list_lock); @@ -1137,8 +1152,11 @@ err_algs: for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--) crypto_unregister_alg( &dd->pdata->algs_info[i].algs_list[j]); - omap_aes_dma_cleanup(dd); -err_dma: + if (dd->pio_only) + free_irq(irq, dd); + else + omap_aes_dma_cleanup(dd); +err_irq: tasklet_kill(&dd->done_task); tasklet_kill(&dd->queue_task); pm_runtime_disable(dev); -- 1.7.9.5
http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
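The DMA-or-PIO decision in the probe path above can be sketched as a toy model. Note that `probe_mode()`, `dma_init_ok` and `irq_avail` are illustrative stand-ins for `omap_aes_dma_init()` and `platform_get_irq()`, not the driver's actual code:

```c
#include <assert.h>

/* Illustrative stand-ins: nonzero means the corresponding driver call
 * (omap_aes_dma_init() / platform_get_irq()) would have succeeded. */
static int dma_init_ok;
static int irq_avail;

/* Toy model of the probe logic: prefer DMA, fall back to PIO-only mode
 * when DMA setup fails but an IRQ resource is available, and fail the
 * probe when neither is usable. */
static int probe_mode(int *pio_only)
{
	*pio_only = 0;
	if (dma_init_ok)
		return 0;	/* DMA mode */
	if (!irq_avail)
		return -1;	/* no DMA channels and no IRQ: probe fails */
	*pio_only = 1;		/* PIO-only mode */
	return 0;
}
```

This mirrors the shape of the hunk above: the error path only falls back when the platform actually exposes the interrupt registers.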
[PATCH v2 09/14] crypto: omap-aes: PIO mode: platform data for OMAP4/AM437x and trigger
We initialize the scatter-gather walk lists needed for PIO mode and avoid all DMA paths, such as mapping/unmapping buffers, by checking the pio_only flag. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 43 ++- 1 file changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 889dc99..62b3260 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -168,6 +168,7 @@ struct omap_aes_dev { struct dma_chan *dma_lch_out; int in_sg_len; int out_sg_len; + int pio_only; const struct omap_aes_pdata *pdata; }; @@ -413,6 +414,16 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, struct dma_slave_config cfg; int ret; + if (dd->pio_only) { + scatterwalk_start(&dd->in_walk, dd->in_sg); + scatterwalk_start(&dd->out_walk, dd->out_sg); + + /* Enable DATAIN interrupt and let it take care of the rest */ + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x2); + return 0; + } + dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE); memset(&cfg, 0, sizeof(cfg)); @@ -482,21 +493,25 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd) pr_debug("total: %d\n", dd->total); - err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() error\n"); - return -EINVAL; - } + if (!dd->pio_only) { + err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, + DMA_TO_DEVICE); + if (!err) { + dev_err(dd->dev, "dma_map_sg() error\n"); + return -EINVAL; + } - err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() error\n"); - return -EINVAL; + err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, + DMA_FROM_DEVICE); + if (!err) { + dev_err(dd->dev, "dma_map_sg() error\n"); + return -EINVAL; + } } err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len, dd->out_sg_len); - if (err) { + if (err && !dd->pio_only) { dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); 
dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE); @@ -600,9 +615,11 @@ static void omap_aes_done_task(unsigned long data) pr_debug("enter done_task\n"); - dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE); - - omap_aes_crypt_dma_stop(dd); + if (!dd->pio_only) { + dma_sync_sg_for_device(dd->dev, dd->out_sg, dd->out_sg_len, + DMA_FROM_DEVICE); + omap_aes_crypt_dma_stop(dd); + } omap_aes_finish_req(dd, 0); omap_aes_handle_queue(dd, NULL); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 05/14] crypto: omap-aes: Sync SG before DMA operation
Earlier functions that did a similar sync are replaced by dma_sync_sg_*, which can operate on an entire SG list. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 64dd5c1..5e034a1 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -489,6 +489,8 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, struct dma_slave_config cfg; int ret; + dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE); + memset(&cfg, 0, sizeof(cfg)); @@ -676,6 +678,8 @@ static void omap_aes_done_task(unsigned long data) pr_debug("enter done_task\n"); + dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE); + omap_aes_crypt_dma_stop(dd); omap_aes_finish_req(dd, 0); omap_aes_handle_queue(dd, NULL); -- 1.7.9.5
[PATCH v2 13/14] crypto: omap-aes: Convert request_irq to devm_request_irq
Keeps the request_irq exit/error code paths simpler. Suggested-by: Lokesh Vutla Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 6a4ac4a..ab449b5 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -1197,7 +1197,7 @@ static int omap_aes_probe(struct platform_device *pdev) goto err_irq; } - err = request_irq(irq, omap_aes_irq, 0, + err = devm_request_irq(dev, irq, omap_aes_irq, 0, dev_name(dev), dd); if (err) { dev_err(dev, "Unable to grab omap-aes IRQ\n"); @@ -1232,9 +1232,7 @@ err_algs: for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--) crypto_unregister_alg(&dd->pdata->algs_info[i].algs_list[j]); - if (dd->pio_only) - free_irq(irq, dd); - else + if (!dd->pio_only) omap_aes_dma_cleanup(dd); err_irq: tasklet_kill(&dd->done_task); -- 1.7.9.5
[PATCH v2 11/14] crypto: omap-aes: Add support for cases of unaligned lengths
For cases where the offset/length on any page of the input SG is not aligned to AES_BLOCK_SIZE, we copy all the pages from the input SG list into a contiguous buffer and prepare a single-element SG list for this buffer with length as the total bytes to crypt. This is required for cases such as when an SG list of 16 bytes total size contains 16 pages each containing 1 byte. DMA using the direct buffers of such instances is not possible. For this purpose, we first detect the unaligned case and allocate enough pages to satisfy the request and prepare SG lists. We then copy data into the buffer, and copy data out of it on completion. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 86 +++-- 1 file changed, 83 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index d214dbf..cbeaaf4 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -157,9 +157,23 @@ struct omap_aes_dev { struct tasklet_struct queue_task; struct ablkcipher_request *req; + + /* + * total is used by PIO mode for book keeping, so introduce + * total_save as we need it to calculate page_order + */ size_t total; + size_t total_save; + struct scatterlist *in_sg; struct scatterlist *out_sg; + + /* Buffers for copying for unaligned cases */ + struct scatterlist in_sgl; + struct scatterlist out_sgl; + struct scatterlist *orig_out; + int sgs_copied; + struct scatter_walk in_walk; struct scatter_walk out_walk; int dma_in; @@ -543,12 +557,51 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd) dmaengine_terminate_all(dd->dma_lch_in); dmaengine_terminate_all(dd->dma_lch_out); - dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); - dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE); - return err; } +int omap_aes_check_aligned(struct scatterlist *sg) +{ + while (sg) { + if (!IS_ALIGNED(sg->offset, 4)) + return -1; + if (!IS_ALIGNED(sg->length, AES_BLOCK_SIZE)) + return -1; + sg = 
sg_next(sg); + } + return 0; +} + +int omap_aes_copy_sgs(struct omap_aes_dev *dd) +{ + void *buf_in, *buf_out; + int pages; + + pages = get_order(dd->total); + + buf_in = (void *)__get_free_pages(GFP_ATOMIC, pages); + buf_out = (void *)__get_free_pages(GFP_ATOMIC, pages); + + if (!buf_in || !buf_out) { + pr_err("Couldn't allocated pages for unaligned cases.\n"); + return -1; + } + + dd->orig_out = dd->out_sg; + + sg_copy_buf(buf_in, dd->in_sg, 0, dd->total, 0); + + sg_init_table(>in_sgl, 1); + sg_set_buf(>in_sgl, buf_in, dd->total); + dd->in_sg = >in_sgl; + + sg_init_table(>out_sgl, 1); + sg_set_buf(>out_sgl, buf_out, dd->total); + dd->out_sg = >out_sgl; + + return 0; +} + static int omap_aes_handle_queue(struct omap_aes_dev *dd, struct ablkcipher_request *req) { @@ -582,9 +635,19 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd, /* assign new request to device */ dd->req = req; dd->total = req->nbytes; + dd->total_save = req->nbytes; dd->in_sg = req->src; dd->out_sg = req->dst; + if (omap_aes_check_aligned(dd->in_sg) || + omap_aes_check_aligned(dd->out_sg)) { + if (omap_aes_copy_sgs(dd)) + pr_err("Failed to copy SGs for unaligned cases\n"); + dd->sgs_copied = 1; + } else { + dd->sgs_copied = 0; + } + dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total); dd->out_sg_len = scatterwalk_bytes_sglen(dd->out_sg, dd->total); BUG_ON(dd->in_sg_len < 0 || dd->out_sg_len < 0); @@ -612,14 +675,31 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd, static void omap_aes_done_task(unsigned long data) { struct omap_aes_dev *dd = (struct omap_aes_dev *)data; + void *buf_in, *buf_out; + int pages; pr_debug("enter done_task\n"); if (!dd->pio_only) { dma_sync_sg_for_device(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE); + dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); +
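The alignment rule that omap_aes_check_aligned() enforces can be exercised outside the kernel with a small model. `struct seg` and `segs_aligned()` are hypothetical names, and `IS_ALIGNED` is re-defined locally to match the kernel macro's behavior for power-of-two alignments:

```c
#include <assert.h>

#define AES_BLOCK_SIZE	16
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

/* A (offset, length) pair standing in for one scatterlist entry. */
struct seg {
	unsigned int offset;
	unsigned int length;
};

/* Mirrors the check above: every segment must start on a 4-byte
 * boundary and span a whole number of AES blocks; otherwise the
 * driver falls back to copying through a contiguous bounce buffer. */
static int segs_aligned(const struct seg *s, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (!IS_ALIGNED(s[i].offset, 4))
			return -1;
		if (!IS_ALIGNED(s[i].length, AES_BLOCK_SIZE))
			return -1;
	}
	return 0;
}
```

The 16-pages-of-1-byte example from the commit message fails this check on length, which is exactly what routes it through omap_aes_copy_sgs().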
[PATCH v2 01/14] crypto: scatterwalk: Add support for calculating number of SG elements
The crypto layer only passes nbytes to encrypt, but in the omap-aes driver we need to know the number of SG elements to pass to the dmaengine slave API. We add a function for this to the scatterwalk library. Signed-off-by: Joel Fernandes --- crypto/scatterwalk.c | 22 ++ include/crypto/scatterwalk.h | 2 ++ 2 files changed, 24 insertions(+) diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c index 7281b8a..79ca227 100644 --- a/crypto/scatterwalk.c +++ b/crypto/scatterwalk.c @@ -124,3 +124,25 @@ void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg, scatterwalk_done(&walk, out, 0); } EXPORT_SYMBOL_GPL(scatterwalk_map_and_copy); + +int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes) +{ + int offset = 0, n = 0; + + /* num_bytes is too small */ + if (num_bytes < sg->length) + return -1; + + do { + offset += sg->length; + n++; + sg = scatterwalk_sg_next(sg); + + /* num_bytes is too large */ + if (unlikely(!sg && (num_bytes > offset))) + return -1; + } while (sg && (num_bytes > offset)); + + return n; +} +EXPORT_SYMBOL_GPL(scatterwalk_bytes_sglen); diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h index 3744d2a..13621cc 100644 --- a/include/crypto/scatterwalk.h +++ b/include/crypto/scatterwalk.h @@ -113,4 +113,6 @@ void scatterwalk_done(struct scatter_walk *walk, int out, int more); void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg, unsigned int start, unsigned int nbytes, int out); +int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes); + #endif /* _CRYPTO_SCATTERWALK_H */ -- 1.7.9.5
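The walk in scatterwalk_bytes_sglen() can be modeled with an array of segment lengths standing in for the chained scatterlist. `sg_elems_for_bytes()` is a hypothetical user-space analogue, simplified so that a zero length terminates the list:

```c
#include <assert.h>

/* User-space analogue of scatterwalk_bytes_sglen(): lens[] holds the
 * byte length of each SG entry, terminated by a 0. Returns the number
 * of entries needed to cover num_bytes, or -1 when num_bytes is
 * smaller than the first entry or larger than the whole list. */
static int sg_elems_for_bytes(const int *lens, int num_bytes)
{
	int offset = 0, n = 0;

	if (num_bytes < lens[0])
		return -1;	/* num_bytes is too small */

	do {
		offset += lens[n];
		n++;
		if (!lens[n] && num_bytes > offset)
			return -1;	/* num_bytes is too large */
	} while (lens[n] && num_bytes > offset);

	return n;
}
```

An entry that is only partially covered still counts, matching how the driver maps whole SG entries for DMA.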
[PATCH v2 03/14] crypto: omap-aes: Populate number of SG elements
Crypto layer only passes nbytes but number of SG elements is needed for mapping or unmapping SGs at one time using dma_map* API and also needed to pass in for dmaengine prep function. We call function added to scatterwalk for this purpose in omap_aes_handle_queue to populate the values which are used later. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 26b802b..e369e6e 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -164,6 +164,8 @@ struct omap_aes_dev { void*buf_out; int dma_out; struct dma_chan *dma_lch_out; + int in_sg_len; + int out_sg_len; dma_addr_t dma_addr_out; const struct omap_aes_pdata *pdata; @@ -731,6 +733,10 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd, dd->out_offset = 0; dd->out_sg = req->dst; + dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total); + dd->out_sg_len = scatterwalk_bytes_sglen(dd->out_sg, dd->total); + BUG_ON(dd->in_sg_len < 0 || dd->out_sg_len < 0); + rctx = ablkcipher_request_ctx(req); ctx = crypto_ablkcipher_ctx(crypto_ablkcipher_reqtfm(req)); rctx->mode &= FLAGS_MODE_MASK; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 14/14] crypto: omap-aes: Kconfig: Add build support for AM437x
For the AM437x SoC, ARCH_OMAP2 and ARCH_OMAP3 are not enabled in the defconfig. We follow the SHA driver and add a dependency on ARCH_OMAP2PLUS so that the config is selectable for the AES driver on AM437x SoC builds. Signed-off-by: Joel Fernandes --- drivers/crypto/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index dffb855..e289afa 100644 --- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -252,7 +252,7 @@ config CRYPTO_DEV_OMAP_SHAM config CRYPTO_DEV_OMAP_AES tristate "Support for OMAP AES hw engine" - depends on ARCH_OMAP2 || ARCH_OMAP3 + depends on ARCH_OMAP2 || ARCH_OMAP3 || ARCH_OMAP2PLUS select CRYPTO_AES select CRYPTO_BLKCIPHER2 help -- 1.7.9.5
Re: [PATCH v2 02/14] crypto: omap-aes: Add useful debug macros
On 08/17/2013 11:22 PM, Joe Perches wrote: > On Sat, 2013-08-17 at 21:42 -0500, Joel Fernandes wrote: >> When DEBUG is enabled, these macros can be used to print variables in integer >> and hex format, and clearly display which registers, offsets and values are >> being read/written, including printing the names of the offsets and their >> values. >> >> Note: >> This patch results in a checkpatch error that cannot be fixed. >> ERROR: Macros with multiple statements should be enclosed in a do - while >> loop >> +#define omap_aes_read(dd, offset) \ >> + __raw_readl(dd->io_base + offset); \ >> + pr_debug("omap_aes_read(" #offset ")\n"); >> >> Using a do-while loop will break a lot of code such as: >> ret = omap_aes_read(..); > > That's where you use a statement expression macro > > #define omap_aes_read(dd, offset) \ > ({\ > pr_debug("omap_aes_read(" #offset ")\n");\ > __raw_readl((dd)->io_base + offset); \ > }) > That made things a lot simpler, thanks. Re-spinning just this patch as below: --->8--- From: Joel Fernandes Subject: [PATCH] crypto: omap-aes: Add useful debug macros When DEBUG is enabled, these macros can be used to print variables in integer and hex format, and clearly display which registers, offsets and values are being read/written, including printing the names of the offsets and their values. 
Using statement expression macros in read path as, Suggested-by: Joe Perches Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index ee15b0f..e26d4d4 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -13,7 +13,9 @@ * */ -#define pr_fmt(fmt) "%s: " fmt, __func__ +#define pr_fmt(fmt) "%20s: " fmt, __func__ +#define prn(num) pr_debug(#num "=%d\n", num) +#define prx(num) pr_debug(#num "=%x\n", num) #include #include @@ -172,16 +174,36 @@ struct omap_aes_dev { static LIST_HEAD(dev_list); static DEFINE_SPINLOCK(list_lock); +#ifdef DEBUG +#define omap_aes_read(dd, offset) \ +({ \ + int _read_ret; \ + _read_ret = __raw_readl(dd->io_base + offset); \ + pr_debug("omap_aes_read(" #offset "=%#x)= %#x\n", \ +offset, _read_ret);\ + _read_ret; \ +}) +#else static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) { return __raw_readl(dd->io_base + offset); } +#endif +#ifdef DEBUG +#define omap_aes_write(dd, offset, value) \ + do {\ + pr_debug("omap_aes_write(" #offset "=%#x) value=%#x\n", \ +offset, value);\ + __raw_writel(value, dd->io_base + offset); \ + } while (0) +#else static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, u32 value) { __raw_writel(value, dd->io_base + offset); } +#endif static inline void omap_aes_write_mask(struct omap_aes_dev *dd, u32 offset, u32 value, u32 mask) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
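Joe's suggestion relies on GCC/Clang statement expressions: the `({ ... })` block evaluates to its last expression, so a debug print can be folded into the read path without breaking callers that do `ret = omap_aes_read(...)`. A minimal stand-alone sketch, with `regs[]` and `reg_read()` as hypothetical stand-ins for the MMIO access:

```c
#include <assert.h>
#include <stdio.h>

/* Fake register file standing in for dd->io_base MMIO space. */
static unsigned int regs[4] = { 0x0, 0xdeadbeef, 0x0, 0x0 };

/* Statement-expression macro (GNU extension): the final expression _v
 * is the value of the whole macro, so it works on the right-hand side
 * of an assignment while still logging every access. */
#define reg_read(idx)						\
({								\
	unsigned int _v = regs[idx];				\
	fprintf(stderr, "reg_read(" #idx ") = %#x\n", _v);	\
	_v;							\
})
```

This is the same shape as the DEBUG variant of omap_aes_read() in the re-spun patch above, minus the kernel plumbing.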
Re: [PATCH v2 04/14] crypto: omap-aes: Simplify DMA usage by using direct SGs
On 08/20/2013 07:57 AM, Lokesh Vutla wrote: > Hi Joel, > > On Sunday 18 August 2013 08:12 AM, Joel Fernandes wrote: >> In early versions of this driver, assumptions were made such as the DMA layer >> requiring contiguous buffers, etc. Due to this, new buffers were allocated, >> mapped and used for DMA. These assumptions are no longer true and DMAEngine >> scatter-gather DMA doesn't have such requirements. We simplify the DMA operations >> by directly using the scatter-gather buffers provided by the crypto layer >> instead of creating our own. >> >> A lot of logic that handled DMA'ing only X number of bytes of the total, or as >> much as fitted into a 3rd party buffer, is removed and is no longer required. >> >> Also, a good performance improvement of at least ~20% is seen with encrypting a >> buffer size of 8K (1800 ops/sec vs 1400 ops/sec). Improvement will be higher >> for much larger blocks, though such benchmarking is left as an exercise for the >> reader. Also DMA usage is much more simplified and coherent with the rest of the >> code. 
>> >> Signed-off-by: Joel Fernandes >> --- >> drivers/crypto/omap-aes.c | 147 >> - >> 1 file changed, 25 insertions(+), 122 deletions(-) >> >> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c >> index e369e6e..64dd5c1 100644 >> --- a/drivers/crypto/omap-aes.c >> +++ b/drivers/crypto/omap-aes.c >> @@ -480,22 +480,14 @@ static int sg_copy(struct scatterlist **sg, size_t >> *offset, void *buf, >> } >> >> static int omap_aes_crypt_dma(struct crypto_tfm *tfm, >> -struct scatterlist *in_sg, struct scatterlist *out_sg) >> +struct scatterlist *in_sg, struct scatterlist *out_sg, >> +int in_sg_len, int out_sg_len) >> { >> struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm); >> struct omap_aes_dev *dd = ctx->dd; >> struct dma_async_tx_descriptor *tx_in, *tx_out; >> struct dma_slave_config cfg; >> -dma_addr_t dma_addr_in = sg_dma_address(in_sg); >> -int ret, length = sg_dma_len(in_sg); >> - >> -pr_debug("len: %d\n", length); >> - >> -dd->dma_size = length; >> - >> -if (!(dd->flags & FLAGS_FAST)) >> -dma_sync_single_for_device(dd->dev, dma_addr_in, length, >> - DMA_TO_DEVICE); >> +int ret; > By this change FLAGS_FAST is unsed, it can be cleaned right? > or Am I missing something? Yes, FLAGS_FAST would be unused now and can go away. Since it is very trivial change, I will make this change in the not-immediate future and submit. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop
On Sun, Aug 11, 2013 at 11:25 PM, Sekhar Nori wrote: > On 8/8/2013 5:19 PM, Sekhar Nori wrote: >> On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote: >>> We certainly don't want error conditions to be cleared any other >>> place but the EDMA error handler, as this will make us 'forget' >>> about missed events we might need to know errors have occurred. >>> >>> This fixes a race condition where the EMR was being cleared >>> by the transfer completion interrupt handler. >>> >>> Basically, what was happening was: >>> >>> Missed event >>> | >>> | >>> V >>> SG1-SG2-SG3-Null >>> \ >>> \__TC Interrupt (Almost same time as ARM is executing >>> TC interrupt handler, an event got missed and also forgotten >>> by clearing the EMR). >>> >>> This causes the following problems: >>> >>> 1. >>> If error interrupt is also pending and TC interrupt clears the EMR >>> by calling edma_stop as has been observed in the edma_callback function, >>> the ARM will execute the error interrupt even though the EMR is clear. >>> As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens >>> enough number of times, IRQ subsystem disables the interrupt thinking >>> its spurious which makes error handler never execute again. >>> >>> 2. >>> Also even if error handler doesn't return IRQ_NONE, the removing of EMR >>> removes the knowledge about which channel had a missed event, and thus >>> a manual trigger on such channels cannot be performed. >>> >>> The EMR is ultimately being cleared by the Error interrupt handler >>> once it is handled so we remove code that does it in edma_stop and >>> allow it to happen there. >>> >>> Signed-off-by: Joel Fernandes >> >> Queuing this for v3.11 fixes. While committing, I changed the headline >> to remove capitalization and made it more readable by removing register >> level details. The new headline is: >> >> ARM: edma: don't clear missed events in edma_stop() > > Forgot to ask, should this be tagged for stable? 
IOW, how serious is > this race in current kernel (without the entire series applied)? I have > never observed it myself - so please provide details how easy/difficult > it is to hit this condition. The race was uncovered by recent EDMA patch series, So this patch can go in for next kernel release as such, I am not aware of any other DMA user that maybe uncovering the race condition. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 08/12] dma: edma: Add one more required slot to MAX slots
On 08/12/2013 08:31 AM, Mark Brown wrote: > On Mon, Aug 12, 2013 at 11:26:49AM +0530, Sekhar Nori wrote: > >> No need for a separate patch for this, just do this in the patch where >> you include the two linked sets. > > Can you guys please think about the CC lists you're using for these > patch serieses? I've certainly no interest in random patches to the > DaVinci DMA controllers, and I suspect the same is true for most of the > people on the CC list. > Sorry. I was actually getting a sense of this. What I did initially was I copied the CC list from last years work on the same series hoping that it would be a complete mail chain. Bad judgement on my part. I should have paid more attention to Mark's rants about his inbox getting spammed with this series in those old threads ;) I guess moving forward I will use get_maintainer.pl and refine my git-send scripts with only interested folks. Any words of advice on when to and when not to CC someone remotely connected to a series is welcome. @Sekhar, let's keep only linux-omap, linux-davinci and few other interested folks on the CC chain for the rest of review on the series. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 01/12] dma: edma: Setup parameters to DMA MAX_NR_SG at a time
Dropped quite a few from the CC list... On 08/12/2013 02:15 AM, Sekhar Nori wrote: > On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote: >> Changes are made here for configuring existing parameters to support >> DMA'ing them out in batches as needed. >> >> Also allocate as many as slots as needed by the SG list, but not more >> than MAX_NR_SG. Then these slots will be reused accordingly. >> For ex, if MAX_NR_SG=10, and number of SG entries is 40, still only >> 10 slots will be allocated to DMA the entire SG list of size 40. >> >> Signed-off-by: Joel Fernandes >> --- >> drivers/dma/edma.c | 14 +++--- >> 1 file changed, 11 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c >> index 5f3e532..7b0853c 100644 >> --- a/drivers/dma/edma.c >> +++ b/drivers/dma/edma.c >> @@ -222,9 +222,9 @@ static struct dma_async_tx_descriptor >> *edma_prep_slave_sg( >> enum dma_slave_buswidth dev_width; >> u32 burst; >> struct scatterlist *sg; >> -int i; >> int acnt, bcnt, ccnt, src, dst, cidx; >> int src_bidx, dst_bidx, src_cidx, dst_cidx; >> +int i, num_slots_needed; > > 'nslots' is more to my liking. Better keep variable names short. > >> >> if (unlikely(!echan || !sgl || !sg_len)) >> return NULL; >> @@ -262,8 +262,11 @@ static struct dma_async_tx_descriptor >> *edma_prep_slave_sg( >> >> edesc->pset_nr = sg_len; >> >> -for_each_sg(sgl, sg, sg_len, i) { >> -/* Allocate a PaRAM slot, if needed */ >> +/* Allocate a PaRAM slot, if needed */ >> + >> +num_slots_needed = sg_len > MAX_NR_SG ? MAX_NR_SG : sg_len; > > nslots = min(MAX_NR_SG, sg_len); I agree the original naming was quite long. I would rather using something more descriptive though than nslots. How does slots_needed sound? Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 01/12] dma: edma: Setup parameters to DMA MAX_NR_SG at a time
Responding to other comments in this post, On 08/12/2013 02:15 AM, Sekhar Nori wrote: [..] >> if (unlikely(!echan || !sgl || !sg_len)) >> return NULL; >> @@ -262,8 +262,11 @@ static struct dma_async_tx_descriptor >> *edma_prep_slave_sg( >> >> edesc->pset_nr = sg_len; >> >> -for_each_sg(sgl, sg, sg_len, i) { >> -/* Allocate a PaRAM slot, if needed */ >> +/* Allocate a PaRAM slot, if needed */ >> + >> +num_slots_needed = sg_len > MAX_NR_SG ? MAX_NR_SG : sg_len; > > nslots = min(MAX_NR_SG, sg_len); Changed to this, with the +1 > >> + >> +for (i = 0; i < num_slots_needed; i++) { >> if (echan->slot[i] < 0) { >> echan->slot[i] = >> edma_alloc_slot(EDMA_CTLR(echan->ch_num), >> @@ -273,6 +276,10 @@ static struct dma_async_tx_descriptor >> *edma_prep_slave_sg( >> return NULL; >> } >> } >> +} >> + >> +/* Configure PaRAM sets for each SG */ >> +for_each_sg(sgl, sg, sg_len, i) { >> >> acnt = dev_width; >> >> @@ -330,6 +337,7 @@ static struct dma_async_tx_descriptor >> *edma_prep_slave_sg( >> /* Configure A or AB synchronized transfers */ >> if (edesc->absync) >> edesc->pset[i].opt |= SYNCDIM; >> + > > Random extra newline. Removing.. > > The patch as such is fine, but I dont think it makes lot of sense > standalone. This needs to be merged into the patch where you actually > handle the entire SG list in batches. I think it does actually, this patch just takes care of preparing the param set list correctly and allocating slots. It doesn't the actual DMA or take part in the algorithm. As a result, the patch can be reused incase in future the main algorithm is rewritten in a subsequent series. Further this patch was reused straight from old implementation so it proved to be useful being a separate patch last time. I also plan to rewrite just this functionality in the future. 
Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 01/12] dma: edma: Setup parameters to DMA MAX_NR_SG at a time
On 08/12/2013 06:55 PM, Joel Fernandes wrote: > Dropped quite a few from the CC list... > > On 08/12/2013 02:15 AM, Sekhar Nori wrote: >> On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote: >>> Changes are made here for configuring existing parameters to support >>> DMA'ing them out in batches as needed. >>> >>> Also allocate as many as slots as needed by the SG list, but not more >>> than MAX_NR_SG. Then these slots will be reused accordingly. >>> For ex, if MAX_NR_SG=10, and number of SG entries is 40, still only >>> 10 slots will be allocated to DMA the entire SG list of size 40. >>> >>> Signed-off-by: Joel Fernandes >>> --- >>> drivers/dma/edma.c | 14 +++--- >>> 1 file changed, 11 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c >>> index 5f3e532..7b0853c 100644 >>> --- a/drivers/dma/edma.c >>> +++ b/drivers/dma/edma.c >>> @@ -222,9 +222,9 @@ static struct dma_async_tx_descriptor >>> *edma_prep_slave_sg( >>> enum dma_slave_buswidth dev_width; >>> u32 burst; >>> struct scatterlist *sg; >>> - int i; >>> int acnt, bcnt, ccnt, src, dst, cidx; >>> int src_bidx, dst_bidx, src_cidx, dst_cidx; >>> + int i, num_slots_needed; >> >> 'nslots' is more to my liking. Better keep variable names short. >> >>> >>> if (unlikely(!echan || !sgl || !sg_len)) >>> return NULL; >>> @@ -262,8 +262,11 @@ static struct dma_async_tx_descriptor >>> *edma_prep_slave_sg( >>> >>> edesc->pset_nr = sg_len; >>> >>> - for_each_sg(sgl, sg, sg_len, i) { >>> - /* Allocate a PaRAM slot, if needed */ >>> + /* Allocate a PaRAM slot, if needed */ >>> + >>> + num_slots_needed = sg_len > MAX_NR_SG ? MAX_NR_SG : sg_len; >> >> nslots = min(MAX_NR_SG, sg_len); > > I agree the original naming was quite long. I would rather using > something more descriptive though than nslots. How does slots_needed sound? Sorry for the noise, nslots is fine and I've changed it to the same. 
-Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
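Whatever the variable ends up being called, the arithmetic under discussion is: allocate only min(MAX_NR_SG, sg_len) PaRAM slots and push the whole list through them in batches. A sketch, where `batches_needed()` is an illustrative helper and not code from the patch:

```c
#include <assert.h>

#define MAX_NR_SG 10

/* Only min(MAX_NR_SG, sg_len) PaRAM slots are allocated; a longer SG
 * list is recycled through those same slots in ceil(sg_len / nslots)
 * batches, as described in the commit message. */
static int batches_needed(int sg_len, int *nslots)
{
	*nslots = sg_len > MAX_NR_SG ? MAX_NR_SG : sg_len;
	return (sg_len + *nslots - 1) / *nslots;
}
```

With the commit-message example of MAX_NR_SG=10 and 40 SG entries, 10 slots carry the list in 4 batches.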
Re: [PATCH v3 09/12] dma: edma: Implement multiple linked sets for continuity
On 08/12/2013 01:56 PM, Sekhar Nori wrote: > On Monday 05 August 2013 04:14 PM, Joel Fernandes wrote: >> Here we implement splitting up of the total MAX number of slots >> available for a channel into 2 cyclically linked sets. Transfer >> completion Interrupts are enabled on both linked sets and respective >> handler recycles them on completion to process the next linked set. >> Both linked sets are cyclically linked to each other to ensure >> continuity of DMA operations. Interrupt handlers execute asynchronously >> to the EDMA events and recycles the linked sets at the right time, >> as a result EDMA is not blocked or dependent on interrupts and DMA >> continues till the end of the SG-lists without any interruption. >> >> Suggested-by: Sekhar Nori >> Signed-off-by: Joel Fernandes >> --- >> drivers/dma/edma.c | 157 >> +++- >> 1 file changed, 118 insertions(+), 39 deletions(-) >> >> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c >> index df50a04..70923a2 100644 >> --- a/drivers/dma/edma.c >> +++ b/drivers/dma/edma.c >> @@ -48,6 +48,7 @@ >> >> /* Max of 16 segments per channel to conserve PaRAM slots */ >> #define MAX_NR_SG 16 >> +#define MAX_NR_LS (MAX_NR_SG >> 1) >> #define EDMA_MAX_SLOTS (MAX_NR_SG+1) >> #define EDMA_DESCRIPTORS16 >> >> @@ -57,6 +58,7 @@ struct edma_desc { >> int absync; >> int pset_nr; >> int total_processed; >> +int next_setup_linkset; >> struct edmacc_param pset[0]; >> }; >> >> @@ -140,7 +142,9 @@ static void edma_execute(struct edma_chan *echan) >> struct edma_desc *edesc; >> struct device *dev = echan->vchan.chan.device->dev; >> >> -int i, j, total_left, total_process; >> +int i, total_left, total_link_set; >> +int ls_cur_off, ls_next_off, slot_off; >> +struct edmacc_param tmp_param; >> >> /* If either we processed all psets or we're still not started */ >> if (!echan->edesc || >> @@ -159,48 +163,121 @@ static void edma_execute(struct edma_chan *echan) >> >> /* Find out how many left */ >> total_left = edesc->pset_nr - 
edesc->total_processed; >> -total_process = total_left > MAX_NR_SG ? MAX_NR_SG : total_left; >> - >> - >> -/* Write descriptor PaRAM set(s) */ >> -for (i = 0; i < total_process; i++) { >> -j = i + edesc->total_processed; >> -edma_write_slot(echan->slot[i], &edesc->pset[j]); >> -dev_dbg(echan->vchan.chan.device->dev, >> -"\n pset[%d]:\n" >> -" chnum\t%d\n" >> -" slot\t%d\n" >> -" opt\t%08x\n" >> -" src\t%08x\n" >> -" dst\t%08x\n" >> -" abcnt\t%08x\n" >> -" ccnt\t%08x\n" >> -" bidx\t%08x\n" >> -" cidx\t%08x\n" >> -" lkrld\t%08x\n", >> -j, echan->ch_num, echan->slot[i], >> -edesc->pset[j].opt, >> -edesc->pset[j].src, >> -edesc->pset[j].dst, >> -edesc->pset[j].a_b_cnt, >> -edesc->pset[j].ccnt, >> -edesc->pset[j].src_dst_bidx, >> -edesc->pset[j].src_dst_cidx, >> -edesc->pset[j].link_bcntrld); >> -/* Link to the previous slot if not the last set */ >> -if (i != (total_process - 1)) > >> +total_link_set = total_left > MAX_NR_LS ? MAX_NR_LS : total_left; > > The name you gave here sounds like this is defining total number of > linked PaRAM sets. Rather this is actually tracking the number of PaRAM > sets (slots) in current linked set, correct? Then maybe just call it > 'nslots' or even 'num_slots'? There are just too many variables with > "total" prefix to keep track of in this function! I would rather just leave this naming alone. The code is quite self-documenting: total_link_set means "Calculate what's the total size of a Linkset, or total no. of slots in a linkset we n
Re: [PATCH v3 07/12] dma: edma: Add function to dump a PaRAM set from PaRAM
On 08/12/2013 12:52 AM, Sekhar Nori wrote: > On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote: >> Previously, such a dump function was used but it wasn't reading >> from the PaRAM, rather just from an edmacc_param structure, we >> add a helpful function for debugging that directly reads from >> the PaRAM and gives the current state correctly (including links >> and interrupt information). >> >> Signed-off-by: Joel Fernandes > > You should convert existing instances of PaRAM set dump to use this new > function along with introducing it. Well, the old dump callers were thrown out with completely rewritten code. This rewritten code already contains the dump calls. Are you saying pull out those dump pset calls into a separate patch? Thanks, -Joel > Sekhar > >> --- >> drivers/dma/edma.c | 31 +++ >> 1 file changed, 31 insertions(+) >> >> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c >> index 080d669..a242269 100644 >> --- a/drivers/dma/edma.c >> +++ b/drivers/dma/edma.c >> @@ -102,6 +102,37 @@ static void edma_desc_free(struct virt_dma_desc *vdesc) >> kfree(container_of(vdesc, struct edma_desc, vdesc)); >> } >> >> +static inline void dump_pset(struct edma_chan *echan, int slot, >> + struct edmacc_param *pset_unused, int pset_idx) >> +{ >> +struct edmacc_param pset_local, *pset; >> +pset = &pset_local; >> + >> +edma_read_slot(slot, pset); >> + >> +dev_dbg(echan->vchan.chan.device->dev, >> +"\n pset[%d]:\n" >> +" chnum\t%d\n" >> +" slot\t%d\n" >> +" opt\t%08x\n" >> +" src\t%08x\n" >> +" dst\t%08x\n" >> +" abcnt\t%08x\n" >> +" ccnt\t%08x\n" >> +" bidx\t%08x\n" >> +" cidx\t%08x\n" >> +" lkrld\t%08x\n", >> +pset_idx, echan->ch_num, slot, >> +pset[0].opt, >> +pset[0].src, >> +pset[0].dst, >> +pset[0].a_b_cnt, >> +pset[0].ccnt, >> +pset[0].src_dst_bidx, >> +pset[0].src_dst_cidx, >> +pset[0].link_bcntrld); >> +} >> + >> /* Dispatch a queued descriptor to the controller (caller holds lock) */ >> static void edma_execute(struct edma_chan *echan) >> { >> > > -- > To 
unsubscribe from this list: send the line "unsubscribe linux-omap" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 08/12] dma: edma: Add one more required slot to MAX slots
On 08/12/2013 12:56 AM, Sekhar Nori wrote: > On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote: >> We'd now need a separate slot just for the channel and separate >> ones for the 2 linked sets, so we make adjustments to allocate >> an extra channel accordingly. >> >> Signed-off-by: Joel Fernandes > > No need for a separate patch for this, just do this in the patch where > you include the two linked sets. Ok, I dropped this patch. Anyway, most of this patch has gone away now because of other changes you suggested. Thanks, -Joel
[PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements
This patch series is a rewrite of the DMA portion of the omap-aes driver and also adds support for PIO mode. Both these modes give better performance than before. Earlier, only a single SG was used for DMA purpose, and the SG-list passed from the crypto layer was being copied and DMA'd one entry at a time. This turns out to be quite inefficient, so we replace it with much simpler code that directly passes the SG-list from crypto to the DMA layer. We also add PIO mode support to the driver, and switch to PIO mode whenever the DMA channel allocation is not available. This is currently enabled only for the OMAP4 platform, but will work on any platform on which IRQ information is populated. Tests performed on am33xx and omap4 SoCs; notice the 50% perf improvement for large 8K blocks: Sample run on am33xx (beaglebone): With DMA rewrite: [ 26.410052] test 0 (128 bit key, 16 byte blocks): 4318 operations in 1 seconds (69088 bytes) [ 27.414314] test 1 (128 bit key, 64 byte blocks): 4360 operations in 1 seconds (279040 bytes) [ 28.414406] test 2 (128 bit key, 256 byte blocks): 3609 operations in 1 seconds (923904 bytes) [ 29.414410] test 3 (128 bit key, 1024 byte blocks): 3418 operations in 1 seconds (3500032 bytes) [ 30.414510] test 4 (128 bit key, 8192 byte blocks): 1766 operations in 1 seconds (14467072 bytes) Without DMA rewrite: [ 31.920519] test 0 (128 bit key, 16 byte blocks): 4417 operations in 1 seconds (70672 bytes) [ 32.925997] test 1 (128 bit key, 64 byte blocks): 4221 operations in 1 seconds (270144 bytes) [ 33.926194] test 2 (128 bit key, 256 byte blocks): 3528 operations in 1 seconds (903168 bytes) [ 34.926225] test 3 (128 bit key, 1024 byte blocks): 3281 operations in 1 seconds (3359744 bytes) [ 35.926385] test 4 (128 bit key, 8192 byte blocks): 1460 operations in 1 seconds (11960320 bytes) With PIO mode, note the tremendous boost in performance for small blocks there: [ 27.294905] test 0 (128 bit key, 16 byte blocks): 20585 operations in 1 seconds (329360 bytes) [ 28.302282] test 1 (128 
bit key, 64 byte blocks): 8106 operations in 1 seconds (518784 bytes) [ 29.302374] test 2 (128 bit key, 256 byte blocks): 2359 operations in 1 seconds (603904 bytes) [ 30.302575] test 3 (128 bit key, 1024 byte blocks): 605 operations in 1 seconds (619520 bytes) [ 31.303781] test 4 (128 bit key, 8192 byte blocks): 79 operations in 1 seconds (647168 bytes) Future work in this direction would be to dynamically change between PIO/DMA mode based on the block size. Joel Fernandes (10): crypto: scatterwalk: Add support for calculating number of SG elements crypto: omap-aes: Add useful debug macros crypto: omap-aes: Populate number of SG elements crypto: omap-aes: Simplify DMA usage by using direct SGs crypto: omap-aes: Sync SG before DMA operation crypto: omap-aes: Remove previously used intermediate buffers crypto: omap-aes: Add IRQ info and helper macros crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it crypto: omap-aes: Switch to PIO mode in probe function crypto/scatterwalk.c | 22 +++ drivers/crypto/omap-aes.c| 400 -- include/crypto/scatterwalk.h |2 + 3 files changed, 217 insertions(+), 207 deletions(-) -- 1.7.9.5
[PATCH 01/10] crypto: scatterwalk: Add support for calculating number of SG elements
Crypto layer only passes nbytes to encrypt but in the omap-aes driver we need to know the number of SG elements to pass to the dmaengine slave API. We add a function for the same to the scatterwalk library. Signed-off-by: Joel Fernandes --- crypto/scatterwalk.c | 22 ++ include/crypto/scatterwalk.h |2 ++ 2 files changed, 24 insertions(+) diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c index 7281b8a..79ca227 100644 --- a/crypto/scatterwalk.c +++ b/crypto/scatterwalk.c @@ -124,3 +124,25 @@ void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg, scatterwalk_done(&walk, out, 0); } EXPORT_SYMBOL_GPL(scatterwalk_map_and_copy); + +int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes) +{ + int offset = 0, n = 0; + + /* num_bytes is too small */ + if (num_bytes < sg->length) + return -1; + + do { + offset += sg->length; + n++; + sg = scatterwalk_sg_next(sg); + + /* num_bytes is too large */ + if (unlikely(!sg && (num_bytes < offset))) + return -1; + } while (sg && (num_bytes > offset)); + + return n; +} +EXPORT_SYMBOL_GPL(scatterwalk_bytes_sglen); diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h index 3744d2a..13621cc 100644 --- a/include/crypto/scatterwalk.h +++ b/include/crypto/scatterwalk.h @@ -113,4 +113,6 @@ void scatterwalk_done(struct scatter_walk *walk, int out, int more); void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg, unsigned int start, unsigned int nbytes, int out); +int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes); + #endif /* _CRYPTO_SCATTERWALK_H */ -- 1.7.9.5
[PATCH 06/10] crypto: omap-aes: Remove previously used intermediate buffers
Intermediate buffers were allocated, mapped and used for DMA. These are no longer required as we use the SGs from crypto layer directly in previous commits in the series. Also along with it, remove the logic for copying SGs etc as they are no longer used, and all the associated variables in omap_aes_device. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 90 - 1 file changed, 90 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index df2f867..71da52b 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -156,25 +156,13 @@ struct omap_aes_dev { struct ablkcipher_request *req; size_t total; struct scatterlist *in_sg; - struct scatterlist in_sgl; - size_t in_offset; struct scatterlist *out_sg; - struct scatterlist out_sgl; - size_t out_offset; - - size_t buflen; - void*buf_in; - size_t dma_size; int dma_in; struct dma_chan *dma_lch_in; - dma_addr_t dma_addr_in; - void*buf_out; int dma_out; struct dma_chan *dma_lch_out; int in_sg_len; int out_sg_len; - dma_addr_t dma_addr_out; - const struct omap_aes_pdata *pdata; }; @@ -362,33 +350,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd) dd->dma_lch_out = NULL; dd->dma_lch_in = NULL; - dd->buf_in = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE); - dd->buf_out = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE); - dd->buflen = PAGE_SIZE << OMAP_AES_CACHE_SIZE; - dd->buflen &= ~(AES_BLOCK_SIZE - 1); - - if (!dd->buf_in || !dd->buf_out) { - dev_err(dd->dev, "unable to alloc pages.\n"); - goto err_alloc; - } - - /* MAP here */ - dd->dma_addr_in = dma_map_single(dd->dev, dd->buf_in, dd->buflen, -DMA_TO_DEVICE); - if (dma_mapping_error(dd->dev, dd->dma_addr_in)) { - dev_err(dd->dev, "dma %d bytes error\n", dd->buflen); - err = -EINVAL; - goto err_map_in; - } - - dd->dma_addr_out = dma_map_single(dd->dev, dd->buf_out, dd->buflen, - DMA_FROM_DEVICE); - if (dma_mapping_error(dd->dev, dd->dma_addr_out)) { - dev_err(dd->dev, "dma %d bytes 
error\n", dd->buflen); - err = -EINVAL; - goto err_map_out; - } - dma_cap_zero(mask); dma_cap_set(DMA_SLAVE, mask); @@ -415,14 +376,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd) err_dma_out: dma_release_channel(dd->dma_lch_in); err_dma_in: - dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen, -DMA_FROM_DEVICE); -err_map_out: - dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE); -err_map_in: - free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE); - free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE); -err_alloc: if (err) pr_err("error: %d\n", err); return err; @@ -432,11 +385,6 @@ static void omap_aes_dma_cleanup(struct omap_aes_dev *dd) { dma_release_channel(dd->dma_lch_out); dma_release_channel(dd->dma_lch_in); - dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen, -DMA_FROM_DEVICE); - dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE); - free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE); - free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE); } static void sg_copy_buf(void *buf, struct scatterlist *sg, @@ -453,42 +401,6 @@ static void sg_copy_buf(void *buf, struct scatterlist *sg, scatterwalk_done(&walk, out, 0); } -static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf, - size_t buflen, size_t total, int out) -{ - unsigned int count, off = 0; - - while (buflen && total) { - count = min((*sg)->length - *offset, total); - count = min(count, buflen); - - if (!count) - return off; - - /* -* buflen and total are AES_BLOCK_SIZE size aligned, -* so count should be also aligned -*/ - - sg_copy_buf(buf + off, *s
[PATCH 03/10] crypto: omap-aes: Populate number of SG elements
Crypto layer only passes nbytes but number of SG elements is needed for mapping/unmapping SGs at one time using dma_map* API and also needed to pass in for dmaengine prep function. We call function added to scatterwalk for this purpose in omap_aes_handle_queue to populate the values which are used later. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 3838e0a..b7189a8 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -171,6 +171,8 @@ struct omap_aes_dev { void*buf_out; int dma_out; struct dma_chan *dma_lch_out; + int in_sg_len; + int out_sg_len; dma_addr_t dma_addr_out; const struct omap_aes_pdata *pdata; @@ -741,6 +743,10 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd, dd->out_offset = 0; dd->out_sg = req->dst; + dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total); + dd->out_sg_len = scatterwalk_bytes_sglen(dd->out_sg, dd->total); + BUG_ON(dd->in_sg_len < 0 || dd->out_sg_len < 0); + rctx = ablkcipher_request_ctx(req); ctx = crypto_ablkcipher_ctx(crypto_ablkcipher_reqtfm(req)); rctx->mode &= FLAGS_MODE_MASK; -- 1.7.9.5
[PATCH 05/10] crypto: omap-aes: Sync SG before DMA operation
Earlier functions that did a similar sync are replaced by the dma_sync_sg_* API which can operate on an entire SG list. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c |4 1 file changed, 4 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 9104a7f..df2f867 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -499,6 +499,8 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, struct dma_slave_config cfg; int ret; + dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE); + memset(&cfg, 0, sizeof(cfg)); cfg.src_addr = dd->phys_base + AES_REG_DATA_N(dd, 0); @@ -686,6 +688,8 @@ static void omap_aes_done_task(unsigned long data) pr_debug("enter done_task\n"); + dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE); + omap_aes_crypt_dma_stop(dd); omap_aes_finish_req(dd, 0); omap_aes_handle_queue(dd, NULL); -- 1.7.9.5
[PATCH 04/10] crypto: omap-aes: Simplify DMA usage by using direct SGs
In early versions of this driver, assumptions were made such as the DMA layer requiring contiguous buffers etc. Due to this, new buffers were allocated, mapped and used for DMA. These assumptions are no longer true and DMAEngine scatter-gather DMA doesn't have such requirements. We simplify the DMA operations by directly using the scatter-gather buffers provided by the crypto layer instead of creating our own. A lot of logic that handled DMA'ing only X number of bytes of the total, or as much as fitted into a 3rd party buffer, is removed and is no longer required. Also, this drastically improves performance (at least 35% seen when encrypting a buffer size of 8K: 2800 ops/sec vs 1800 ops/sec). Also DMA usage is much more simplified and coherent with the rest of the code. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 147 - 1 file changed, 25 insertions(+), 122 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index b7189a8..9104a7f 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -490,22 +490,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf, } static int omap_aes_crypt_dma(struct crypto_tfm *tfm, - struct scatterlist *in_sg, struct scatterlist *out_sg) + struct scatterlist *in_sg, struct scatterlist *out_sg, + int in_sg_len, int out_sg_len) { struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm); struct omap_aes_dev *dd = ctx->dd; struct dma_async_tx_descriptor *tx_in, *tx_out; struct dma_slave_config cfg; - dma_addr_t dma_addr_in = sg_dma_address(in_sg); - int ret, length = sg_dma_len(in_sg); - - pr_debug("len: %d\n", length); - - dd->dma_size = length; - - if (!(dd->flags & FLAGS_FAST)) - dma_sync_single_for_device(dd->dev, dma_addr_in, length, - DMA_TO_DEVICE); + int ret; memset(&cfg, 0, sizeof(cfg)); @@ -524,7 +516,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, return ret; } - tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, 1, + tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, 
in_sg, in_sg_len, DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!tx_in) { @@ -543,7 +535,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, return ret; } - tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, 1, + tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, out_sg_len, DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT | DMA_CTRL_ACK); if (!tx_out) { @@ -561,7 +553,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, dma_async_issue_pending(dd->dma_lch_out); /* start DMA */ - dd->pdata->trigger(dd, length); + dd->pdata->trigger(dd, dd->total); return 0; } @@ -570,93 +562,28 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd) { struct crypto_tfm *tfm = crypto_ablkcipher_tfm( crypto_ablkcipher_reqtfm(dd->req)); - int err, fast = 0, in, out; - size_t count; - dma_addr_t addr_in, addr_out; - struct scatterlist *in_sg, *out_sg; - int len32; + int err; pr_debug("total: %d\n", dd->total); - if (sg_is_last(dd->in_sg) && sg_is_last(dd->out_sg)) { - /* check for alignment */ - in = IS_ALIGNED((u32)dd->in_sg->offset, sizeof(u32)); - out = IS_ALIGNED((u32)dd->out_sg->offset, sizeof(u32)); - - fast = in && out; + err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); + if (!err) { + dev_err(dd->dev, "dma_map_sg() error\n"); + return -EINVAL; } - if (fast) { - count = min(dd->total, sg_dma_len(dd->in_sg)); - count = min(count, sg_dma_len(dd->out_sg)); - - if (count != dd->total) { - pr_err("request length != buffer length\n"); - return -EINVAL; - } - - pr_debug("fast\n"); - - err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() error\n"); - return -EINVAL; - } - - err = dma_map_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() erro
[PATCH 08/10] crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
We add an IRQ handler that implements a state-machine for PIO-mode and data structures for walking the scatter-gather list. The IRQ handler is called in succession both when data is available to read and when the next data can be sent for processing. This process continues till the entire in/out SG lists have been walked. Once the SG-list has been completely walked, the IRQ handler schedules the done_task tasklet. Also add a useful macro that is used throughout the IRQ code for a common pattern of calculating how much an SG list has been walked. This improves code readability and avoids checkpatch errors. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 90 + 1 file changed, 90 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index c057eac..891455b 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -46,6 +46,8 @@ #define DST_MAXBURST 4 #define DMA_MIN(DST_MAXBURST * sizeof(u32)) +#define _calc_walked(inout) (dd->inout##_walk.offset - dd->inout##_sg->offset) + /* OMAP TRM gives bitfields as start:end, where start is the higher bit number. 
For example 7:0 */ #define FLD_MASK(start, end) (((1 << ((start) - (end) + 1)) - 1) << (end)) @@ -98,6 +100,8 @@ #define FLAGS_FAST BIT(5) #define FLAGS_BUSY BIT(6) +#define AES_BLOCK_WORDS(AES_BLOCK_SIZE >> 2) + struct omap_aes_ctx { struct omap_aes_dev *dd; @@ -163,6 +167,8 @@ struct omap_aes_dev { size_t total; struct scatterlist *in_sg; struct scatterlist *out_sg; + struct scatter_walk in_walk; + struct scatter_walk out_walk; int dma_in; struct dma_chan *dma_lch_in; int dma_out; @@ -862,6 +868,90 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = { .minor_shift= 0, }; +static irqreturn_t omap_aes_irq(int irq, void *dev_id) +{ + struct omap_aes_dev *dd = dev_id; + u32 status, i; + u32 *src, *dst; + + status = omap_aes_read(dd, AES_REG_IRQ_STATUS(dd)); + if (status & AES_REG_IRQ_DATA_IN) { + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0); + + BUG_ON(!dd->in_sg); + + BUG_ON(_calc_walked(in) > dd->in_sg->length); + + src = sg_virt(dd->in_sg) + _calc_walked(in); + + for (i = 0; i < AES_BLOCK_WORDS; i++) { + omap_aes_write(dd, AES_REG_DATA_N(dd, i), *src); + + scatterwalk_advance(&dd->in_walk, 4); + if (dd->in_sg->length == _calc_walked(in)) { + dd->in_sg = scatterwalk_sg_next(dd->in_sg); + if (dd->in_sg) { + scatterwalk_start(&dd->in_walk, + dd->in_sg); + src = sg_virt(dd->in_sg) + + _calc_walked(in); + } + } else { + src++; + } + } + + /* Clear IRQ status */ + status &= ~AES_REG_IRQ_DATA_IN; + omap_aes_write(dd, AES_REG_IRQ_STATUS(dd), status); + + /* Enable DATA_OUT interrupt */ + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x4); + + } else if (status & AES_REG_IRQ_DATA_OUT) { + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0); + + BUG_ON(!dd->out_sg); + + BUG_ON(_calc_walked(out) > dd->out_sg->length); + + dst = sg_virt(dd->out_sg) + _calc_walked(out); + + for (i = 0; i < AES_BLOCK_WORDS; i++) { + *dst = omap_aes_read(dd, AES_REG_DATA_N(dd, i)); + scatterwalk_advance(&dd->out_walk, 4); + if (dd->out_sg->length == _calc_walked(out)) { + dd->out_sg = 
scatterwalk_sg_next(dd->out_sg); + if (dd->out_sg) { + scatterwalk_start(&dd->out_walk, + dd->out_sg); + dst = sg_virt(dd->out_sg) + + _calc_walked(out); + } + } else { + dst++; + } + } + + dd->total -= AES_BLOCK_SIZE; + + BUG_ON(dd->total < 0); + +
[PATCH 10/10] crypto: omap-aes: Switch to PIO mode in probe function
In cases where requesting for DMA channels fails for some reason, or channel numbers are not provided in DT or platform data, we switch to PIO-only mode, also checking if the platform provides IRQ numbers and interrupt register offsets in DT and platform data. All dma-only paths are avoided in this mode. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 28 +++- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 54f2729..34e3b77 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -1074,7 +1074,7 @@ static int omap_aes_probe(struct platform_device *pdev) struct omap_aes_dev *dd; struct crypto_alg *algp; struct resource res; - int err = -ENOMEM, i, j; + int err = -ENOMEM, i, j, irq = -1; u32 reg; dd = kzalloc(sizeof(struct omap_aes_dev), GFP_KERNEL); @@ -1118,8 +1118,23 @@ static int omap_aes_probe(struct platform_device *pdev) tasklet_init(&dd->queue_task, omap_aes_queue_task, (unsigned long)dd); err = omap_aes_dma_init(dd); - if (err) - goto err_dma; + if (err && AES_REG_IRQ_STATUS(dd) && AES_REG_IRQ_ENABLE(dd)) { + dd->pio_only = 1; + + irq = platform_get_irq(pdev, 0); + if (irq < 0) { + dev_err(dev, "can't get IRQ resource\n"); + goto err_irq; + } + + err = request_irq(irq, omap_aes_irq, 0, + dev_name(dev), dd); + if (err) { + dev_err(dev, "Unable to grab omap-aes IRQ\n"); + goto err_irq; + } + } + INIT_LIST_HEAD(&dd->list); spin_lock(&list_lock); @@ -1147,8 +1162,11 @@ err_algs: for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--) crypto_unregister_alg( &dd->pdata->algs_info[i].algs_list[j]); - omap_aes_dma_cleanup(dd); -err_dma: + if (dd->pio_only) + free_irq(irq, dd); + else + omap_aes_dma_cleanup(dd); +err_irq: tasklet_kill(&dd->done_task); tasklet_kill(&dd->queue_task); pm_runtime_disable(dev); -- 1.7.9.5
[PATCH 09/10] crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it
We initialize the scatter gather walk lists needed for PIO mode and avoid all DMA paths such as mapping/unmapping buffers by checking for the pio_only flag. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 43 ++- 1 file changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 891455b..54f2729 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -175,6 +175,7 @@ struct omap_aes_dev { struct dma_chan *dma_lch_out; int in_sg_len; int out_sg_len; + int pio_only; const struct omap_aes_pdata *pdata; }; @@ -423,6 +424,16 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm, struct dma_slave_config cfg; int ret; + if (dd->pio_only) { + scatterwalk_start(&dd->in_walk, dd->in_sg); + scatterwalk_start(&dd->out_walk, dd->out_sg); + + /* Enable DATAIN interrupt and let it take + care of the rest */ + omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x2); + return 0; + } + dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE); memset(&cfg, 0, sizeof(cfg)); @@ -492,21 +503,25 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd) pr_debug("total: %d\n", dd->total); - err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() error\n"); - return -EINVAL; - } + if (!dd->pio_only) { + err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, +DMA_TO_DEVICE); + if (!err) { + dev_err(dd->dev, "dma_map_sg() error\n"); + return -EINVAL; + } - err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE); - if (!err) { - dev_err(dd->dev, "dma_map_sg() error\n"); - return -EINVAL; + err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, +DMA_FROM_DEVICE); + if (!err) { + dev_err(dd->dev, "dma_map_sg() error\n"); + return -EINVAL; + } } err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len, dd->out_sg_len); - if (err) { + if (err && !dd->pio_only) { dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE); 
dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE); @@ -610,9 +625,11 @@ static void omap_aes_done_task(unsigned long data) pr_debug("enter done_task\n"); - dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE); - - omap_aes_crypt_dma_stop(dd); + if (!dd->pio_only) { + dma_sync_sg_for_device(dd->dev, dd->out_sg, dd->out_sg_len, + DMA_FROM_DEVICE); + omap_aes_crypt_dma_stop(dd); + } omap_aes_finish_req(dd, 0); omap_aes_handle_queue(dd, NULL); -- 1.7.9.5
[PATCH 07/10] crypto: omap-aes: Add IRQ info and helper macros
Add IRQ information to pdata and helper macros. These are required for PIO-mode support. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c |8 1 file changed, 8 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index 71da52b..c057eac 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -82,6 +82,10 @@ #define AES_REG_LENGTH_N(x)(0x54 + ((x) * 0x04)) +#define AES_REG_IRQ_STATUS(dd) ((dd)->pdata->irq_status_ofs) +#define AES_REG_IRQ_ENABLE(dd) ((dd)->pdata->irq_enable_ofs) +#define AES_REG_IRQ_DATA_INBIT(1) +#define AES_REG_IRQ_DATA_OUT BIT(2) #define DEFAULT_TIMEOUT(5*HZ) #define FLAGS_MODE_MASK0x000f @@ -127,6 +131,8 @@ struct omap_aes_pdata { u32 data_ofs; u32 rev_ofs; u32 mask_ofs; + u32 irq_enable_ofs; + u32 irq_status_ofs; u32 dma_enable_in; u32 dma_enable_out; @@ -846,6 +852,8 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = { .data_ofs = 0x60, .rev_ofs= 0x80, .mask_ofs = 0x84, + .irq_status_ofs = 0x8c, + .irq_enable_ofs = 0x90, .dma_enable_in = BIT(5), .dma_enable_out = BIT(6), .major_mask = 0x0700, -- 1.7.9.5
[PATCH 02/10] crypto: omap-aes: Add useful debug macros
When DEBUG is enabled, these macros can be used to print variables in integer and hex format, and clearly display which registers, offsets and values are being read/written, including printing the names of the offsets and their values. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-aes.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c index ee15b0f..3838e0a 100644 --- a/drivers/crypto/omap-aes.c +++ b/drivers/crypto/omap-aes.c @@ -15,6 +15,14 @@ #define pr_fmt(fmt) "%s: " fmt, __func__ +#ifdef DEBUG +#define prn(num) printk(#num "=%d\n", num) +#define prx(num) printk(#num "=%x\n", num) +#else +#define prn(num) do { } while (0) +#define prx(num) do { } while (0) +#endif + #include #include #include @@ -172,13 +180,35 @@ struct omap_aes_dev { static LIST_HEAD(dev_list); static DEFINE_SPINLOCK(list_lock); +#ifdef DEBUG +#define omap_aes_read(dd, offset) \ + do {\ + omap_aes_read_1(dd, offset);\ + pr_debug("omap_aes_read(" #offset ")\n"); \ + } while (0) + +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset) +#else static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) +#endif { return __raw_readl(dd->io_base + offset); } +#ifdef DEBUG +#define omap_aes_write(dd, offset, value) \ + do {\ + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \ +offset, value);\ + omap_aes_write_1(dd, offset, value);\ + } while (0) + +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset, + u32 value) +#else static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, u32 value) +#endif { __raw_writel(value, dd->io_base + offset); } -- 1.7.9.5
Re: [PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements
On 08/14/2013 06:12 PM, Joel Fernandes wrote:
> This patch series is a rewrite of the DMA portion of the omap-aes driver
> and also adds support for PIO mode. Both these modes give better
> performance than before.
>
> Earlier, only a single SG was used for DMA purposes, and the SG-list
> passed from the crypto layer was being copied and DMA'd one entry at
> a time. This turns out to be quite inefficient, so we replace it with
> much simpler code that directly passes the SG-list from crypto to the
> DMA layer.
>
> We also add PIO mode support to the driver, and switch to PIO mode
> whenever DMA channel allocation is not available. This is not only for
> the OMAP4 platform; it will work on any platform on which IRQ
> information is populated.
>
> Tests performed on am33xx and omap4 SoCs; notice the 50% perf improvement

Just correcting, this is more like 35%, not 50%, when using DMA.

Thanks,
-Joel

> for large 8K blocks:
>
> Sample run on am33xx (beaglebone):
> With DMA rewrite:
> [   26.410052] test 0 (128 bit key, 16 byte blocks): 4318 operations in 1
> seconds (69088 bytes)
> [   27.414314] test 1 (128 bit key, 64 byte blocks): 4360 operations in 1
> seconds (279040 bytes)
> [   28.414406] test 2 (128 bit key, 256 byte blocks): 3609 operations in 1
> seconds (923904 bytes)
> [   29.414410] test 3 (128 bit key, 1024 byte blocks): 3418 operations in 1
> seconds (3500032 bytes)
> [   30.414510] test 4 (128 bit key, 8192 byte blocks): 1766 operations in 1
> seconds (14467072 bytes)
>
> Without DMA rewrite:
> [   31.920519] test 0 (128 bit key, 16 byte blocks): 4417 operations in 1
> seconds (70672 bytes)
> [   32.925997] test 1 (128 bit key, 64 byte blocks): 4221 operations in 1
> seconds (270144 bytes)
> [   33.926194] test 2 (128 bit key, 256 byte blocks): 3528 operations in 1
> seconds (903168 bytes)
> [   34.926225] test 3 (128 bit key, 1024 byte blocks): 3281 operations in 1
> seconds (3359744 bytes)
> [   35.926385] test 4 (128 bit key, 8192 byte blocks): 1460 operations in 1
> seconds (11960320 bytes)
>
> With PIO mode, note the tremendous boost in performance for small blocks there:
> [   27.294905] test 0 (128 bit key, 16 byte blocks): 20585 operations in 1
> seconds (329360 bytes)
> [   28.302282] test 1 (128 bit key, 64 byte blocks): 8106 operations in 1
> seconds (518784 bytes)
> [   29.302374] test 2 (128 bit key, 256 byte blocks): 2359 operations in 1
> seconds (603904 bytes)
> [   30.302575] test 3 (128 bit key, 1024 byte blocks): 605 operations in 1
> seconds (619520 bytes)
> [   31.303781] test 4 (128 bit key, 8192 byte blocks): 79 operations in 1
> seconds (647168 bytes)
>
> Future work in this direction would be to dynamically change between
> PIO/DMA mode based on the block size.
>
> Joel Fernandes (10):
>   crypto: scatterwalk: Add support for calculating number of SG elements
>   crypto: omap-aes: Add useful debug macros
>   crypto: omap-aes: Populate number of SG elements
>   crypto: omap-aes: Simplify DMA usage by using direct SGs
>   crypto: omap-aes: Sync SG before DMA operation
>   crypto: omap-aes: Remove previously used intermediate buffers
>   crypto: omap-aes: Add IRQ info and helper macros
>   crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
>   crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it
>   crypto: omap-aes: Switch to PIO mode in probe function
>
>  crypto/scatterwalk.c         | 22 +++
>  drivers/crypto/omap-aes.c    | 400 --
>  include/crypto/scatterwalk.h |  2 +
>  3 files changed, 217 insertions(+), 207 deletions(-)
Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros
On 08/14/2013 06:29 PM, Joe Perches wrote: > On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote: >> When DEBUG is enabled, these macros can be used to print variables >> in integer and hex format, and clearly display which registers, >> offsets and values are being read/written , including printing the >> names of the offsets and their values. > [] >> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c > [] >> @@ -15,6 +15,14 @@ >> >> #define pr_fmt(fmt) "%s: " fmt, __func__ >> >> +#ifdef DEBUG >> +#define prn(num) printk(#num "=%d\n", num) >> +#define prx(num) printk(#num "=%x\n", num) > > pr_debug? Sure, can change that. >> +#else >> +#define prn(num) do { } while (0) >> +#define prx(num) do { } while (0) >> +#endif > [] >> @@ -172,13 +180,35 @@ struct omap_aes_dev { >> static LIST_HEAD(dev_list); >> static DEFINE_SPINLOCK(list_lock); >> >> +#ifdef DEBUG >> +#define omap_aes_read(dd, offset) \ >> +do {\ >> +omap_aes_read_1(dd, offset);\ >> +pr_debug("omap_aes_read(" #offset ")\n"); \ >> +} while (0) >> + >> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset) >> +#else >> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) >> +#endif >> { >> return __raw_readl(dd->io_base + offset); >> } >> >> +#ifdef DEBUG >> +#define omap_aes_write(dd, offset, value) \ >> +do {\ >> +pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \ >> + offset, value);\ >> +omap_aes_write_1(dd, offset, value);\ >> +} while (0) >> + >> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset, >> + u32 value) >> +#else >> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, >>u32 value) >> +#endif >> { >> __raw_writel(value, dd->io_base + offset); >> } > > Umm, yuck? > > Is there any real value in read_1 and write_1? Can you be more descriptive? There is a lot of value in them for debug to show clearly sequence of read/writes. Moreover, they are no-op'd when DEBUG is disabled. 
Other than the obvious use, I guess it could be rewritten to be a bit
cleaner. I can do that.

-Joel
Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros
On 08/14/2013 07:47 PM, Joe Perches wrote: > On Wed, 2013-08-14 at 18:40 -0500, Joel Fernandes wrote: >> On 08/14/2013 06:29 PM, Joe Perches wrote: >>> On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote: >>>> When DEBUG is enabled, these macros can be used to print variables >>>> in integer and hex format, and clearly display which registers, >>>> offsets and values are being read/written , including printing the >>>> names of the offsets and their values. >>> [] >>>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c >>> [] >>>> @@ -15,6 +15,14 @@ >>>> >>>> #define pr_fmt(fmt) "%s: " fmt, __func__ >>>> >>>> +#ifdef DEBUG >>>> +#define prn(num) printk(#num "=%d\n", num) >>>> +#define prx(num) printk(#num "=%x\n", num) >>> >>> pr_debug? >> >> Sure, can change that. >> >>>> +#else >>>> +#define prn(num) do { } while (0) >>>> +#define prx(num) do { } while (0) >>>> +#endif >>> [] >>>> @@ -172,13 +180,35 @@ struct omap_aes_dev { >>>> static LIST_HEAD(dev_list); >>>> static DEFINE_SPINLOCK(list_lock); >>>> >>>> +#ifdef DEBUG >>>> +#define omap_aes_read(dd, offset) \ >>>> + do {\ >>>> + omap_aes_read_1(dd, offset);\ >>>> + pr_debug("omap_aes_read(" #offset ")\n"); \ >>>> + } while (0) >>>> + >>>> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset) >>>> +#else >>>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) >>>> +#endif >>>> { >>>>return __raw_readl(dd->io_base + offset); >>>> } >>>> >>>> +#ifdef DEBUG >>>> +#define omap_aes_write(dd, offset, value) \ >>>> + do {\ >>>> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \ >>>> + offset, value);\ >>>> + omap_aes_write_1(dd, offset, value);\ >>>> + } while (0) >>>> + >>>> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset, >>>> +u32 value) >>>> +#else >>>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, >>>> u32 value) >>>> +#endif >>>> { >>>>__raw_writel(value, dd->io_base + offset); >>>> } >>> >>> Umm, yuck? 
>>> >>> Is there any real value in read_1 and write_1? >> >> Can you be more descriptive? There is a lot of value in them for debug >> to show clearly sequence of read/writes. Moreover, they are no-op'd when >> DEBUG is disabled. > > pr_debug is no-op'd when DEBUG is not #defined. > so just use a single > > omap_aes_write(...) > { > pr_debug(...) > __raw_writel(...); > } Actually this doesn't work, as the pr_debug cannot print the name of the offset as my original patch set does using "#offset". There are many places where named offsets are used to pass to read/write, and this macro helps to visually see which offset is being written to by name. So the original patch would stand in its current form except for a small rewrite of the write debug part of it as follows to be cleaner getting rid of the _1. For the read , we still need it as we need to return the value from a function which cannot be done in a macro. So the new patch would look something like this: #ifdef DEBUG #define omap_aes_read(dd, offset) \ omap_aes_read_1(dd, offset); \ pr_debug("omap_aes_read(" #offset ")\n"); static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset) #else static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) #endif { return __raw_readl(dd->io_base + offset); } #ifdef DEBUG #define omap_aes_write(dd, offset, value) \ do { \ pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \ offset, value); \ __raw_writel(value, dd->io_base + offset); \ } while (0) #else static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, u32 value) { __raw_writel(value, dd->io_base + offset); } #endif Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements
On 08/15/2013 12:58 AM, Dmitry Kasatkin wrote: > On 15/08/13 02:30, Joel Fernandes wrote: >> On 08/14/2013 06:12 PM, Joel Fernandes wrote: >>> This patch series is a rewrite of the DMA portion of omap-aes driver >>> and also adds support for PIO mode. Both these modes, give better >>> performance than before. >>> >>> Earlier, only a single SG was used for DMA purpose, and the SG-list >>> passed from the crypto layer was being copied and DMA'd one entry at >>> a time. This turns out to be quite inefficient, so we replace it with >>> much simpler code that directly passes the SG-list from crypto to the >>> DMA layer. >>> >>> We also add PIO mode support to the driver, and switch to PIO mode >>> whenever the DMA channel allocation is not available. This is only for >>> OMAP4 platform will work on any platform on which IRQ information is >>> populated. >>> >>> Tests performed on am33xx and omap4 SoCs , notice the 50% perf improvement >> Just correcting, this is more like 35% not 50% when using DMA. > > Hmm :) > > 1766/1460 = ~20% Yes sorry, I messed the cover letter up. If I resend the series again, I'll update this number. On OMAP4 though, I saw 2800 ops/sec vs 1800 ops/sec which is around 50% so it depends on SoC and DMA controller. OMAP4 uses SDMA while AM335x uses EDMA. Also with very large blocks, this improvement will be much higher as we will not be doing all the intermediate copy but I haven't tested with such large blocks. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros
On 08/15/2013 01:23 AM, Dmitry Kasatkin wrote: > On 15/08/13 06:12, Joel Fernandes wrote: >> On 08/14/2013 07:47 PM, Joe Perches wrote: >>> On Wed, 2013-08-14 at 18:40 -0500, Joel Fernandes wrote: >>>> On 08/14/2013 06:29 PM, Joe Perches wrote: >>>>> On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote: >>>>>> When DEBUG is enabled, these macros can be used to print variables >>>>>> in integer and hex format, and clearly display which registers, >>>>>> offsets and values are being read/written , including printing the >>>>>> names of the offsets and their values. >>>>> [] >>>>>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c >>>>> [] >>>>>> @@ -15,6 +15,14 @@ >>>>>> >>>>>> #define pr_fmt(fmt) "%s: " fmt, __func__ >>>>>> >>>>>> +#ifdef DEBUG >>>>>> +#define prn(num) printk(#num "=%d\n", num) >>>>>> +#define prx(num) printk(#num "=%x\n", num) >>>>> pr_debug? >>>> Sure, can change that. >>>> >>>>>> +#else >>>>>> +#define prn(num) do { } while (0) >>>>>> +#define prx(num) do { } while (0) >>>>>> +#endif >>>>> [] >>>>>> @@ -172,13 +180,35 @@ struct omap_aes_dev { >>>>>> static LIST_HEAD(dev_list); >>>>>> static DEFINE_SPINLOCK(list_lock); >>>>>> >>>>>> +#ifdef DEBUG >>>>>> +#define omap_aes_read(dd, offset) >>>>>> \ >>>>>> +do { >>>>>> \ >>>>>> +omap_aes_read_1(dd, offset); >>>>>> \ >>>>>> +pr_debug("omap_aes_read(" #offset ")\n"); >>>>>> \ >>>>>> +} while (0) >>>>>> + >>>>>> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset) >>>>>> +#else >>>>>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset) >>>>>> +#endif >>>>>> { >>>>>> return __raw_readl(dd->io_base + offset); >>>>>> } >>>>>> >>>>>> +#ifdef DEBUG >>>>>> +#define omap_aes_write(dd, offset, value) >>>>>> \ >>>>>> +do { >>>>>> \ >>>>>> +pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", >>>>>> \ >>>>>> + offset, value); >>>>>> \ >>>>>> +omap_aes_write_1(dd, offset, value); >>>>>> \ >>>>>> +} while (0) >>>>>> + >>>>>> +static inline void 
omap_aes_write_1(struct omap_aes_dev *dd, u32 offset, >>>>>> + u32 value) >>>>>> +#else >>>>>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset, >>>>>>u32 value) >>>>>> +#endif >>>>>> { >>>>>> __raw_writel(value, dd->io_base + offset); >>>>>> } >>>>> Umm, yuck? >>>>> >>>>> Is there any real value in read_1 and write_1? >>>> Can you be more descriptive? There is a lot of value in them for debug >>>> to show clearly sequence of read/writes. Moreover, they are no-op'd when >>>> DEBUG is disabled. >>> pr_debug is no-op'd when DEBUG is not #defined. >>> so just use a single >>> >>> omap_aes_write(...) >>> { >>> pr_debug(...) >>> __raw_writel(...); >>> } >> Actually this doesn't work, as the pr_debug cannot print the name of the >> offset as my original patch set does using "#offset". >> >> There are many places where named offsets are used to pass to >> read/write, and this macro helps to visually see which offset is being >> written to by name. >> >
[PATCH v4 5/6] dma: edma: Leave linked to Null slot instead of DUMMY slot
The Dummy slot has been used as a way for missed events not to be
reported as missing. This has been particularly troublesome for cases
where we might want to temporarily pause all incoming events. For the
EDMA DMAC, there is no way to do any such pausing of events, as the
occurrence of the "next" event is not software controlled. Using
"edma_pause" in IRQ handlers doesn't help, as by then the event in
question from the slave has already been missed. Linking a dummy slot
is seen to absorb these events, which we didn't want to miss. So we
don't link to the dummy slot, but instead leave the channel linked to
the NULL set, allow an error condition, and detect the channel that
missed it.

Signed-off-by: Joel Fernandes

dma: edma: Link to dummy slot only for last SG list split

Consider the case where we have a scatter-list like:

SG1->SG2->SG3->SG4->SG5->SG6->Null

For example, for a MAX_NR_SG of 2, earlier we were splitting this as:

SG1->SG2->Null
SG3->SG4->Null
SG5->SG6->Null

Now we split it as:

SG1->SG2->Null
SG3->SG4->Null
SG5->SG6->Dummy

This approach results in fewer unwanted interrupts occurring for the
last list split. The Dummy slot has the property of not raising an
error condition if events are missed, unlike the Null slot. We are OK
with this, as we're done with processing the whole list once we reach
Dummy.
Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 0e77b57..25e47d4 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -158,13 +158,18 @@ static void edma_execute(struct edma_chan *echan)
 		/* Link to the previous slot if not the last set */
 		if (i != (nslots - 1))
 			edma_link(echan->slot[i], echan->slot[i+1]);
-		/* Final pset links to the dummy pset */
-		else
-			edma_link(echan->slot[i], echan->ecc->dummy_slot);
 	}

 	edesc->processed += nslots;

+	/*
+	 * If this is the last set in the SG-list transaction, then set up
+	 * a link to the dummy slot; this results in all future events
+	 * being absorbed, and that's OK because we're done
+	 */
+	if (edesc->processed == edesc->pset_nr)
+		edma_link(echan->slot[nslots-1], echan->ecc->dummy_slot);
+
 	edma_resume(echan->ch_num);

 	if (edesc->processed <= MAX_NR_SG) {
--
1.8.1.2
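The link-target decision the patch makes can be stated in one line: intermediate batches stay linked to the NULL set (so a missed event raises a catchable error), and only the final batch links to the Dummy set (events are silently absorbed, which is fine once the list is done). A user-space sketch of that decision, with illustrative names:

```c
#include <stdbool.h>

/* Possible link targets for the last slot of a programmed batch */
enum link_target { LINK_NULL_SET, LINK_DUMMY_SET };

/*
 * Mirrors the "edesc->processed == edesc->pset_nr" test in the patch:
 * only when every pset of the descriptor has been programmed do we
 * link to the Dummy set; otherwise we stay on the NULL set so a
 * missed event is reported.
 */
static enum link_target last_slot_link(int processed, int pset_nr)
{
	return processed == pset_nr ? LINK_DUMMY_SET : LINK_NULL_SET;
}
```

So for the SG1..SG6 example with MAX_NR_SG=2, the first two batches get `LINK_NULL_SET` and the third gets `LINK_DUMMY_SET`.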
[PATCH v4 3/6] ARM: edma: Add function to manually trigger an EDMA channel
Manual trigger for events missed as a result of splitting a scatter
gather list and DMA'ing it in batches. Add a helper function to trigger
a channel in case any such events are missed.

Signed-off-by: Joel Fernandes
---
 arch/arm/common/edma.c             | 17 +++++++++++++++++
 include/linux/platform_data/edma.h |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 3567ba1..67e8cf5 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1236,6 +1236,23 @@ void edma_resume(unsigned channel)
 }
 EXPORT_SYMBOL(edma_resume);

+int edma_trigger_channel(unsigned channel)
+{
+	unsigned ctlr;
+	unsigned int mask;
+
+	ctlr = EDMA_CTLR(channel);
+	channel = EDMA_CHAN_SLOT(channel);
+	mask = BIT(channel & 0x1f);
+
+	edma_shadow0_write_array(ctlr, SH_ESR, (channel >> 5), mask);
+
+	pr_debug("EDMA: ESR%d %08x\n", (channel >> 5),
+		 edma_shadow0_read_array(ctlr, SH_ESR, (channel >> 5)));
+	return 0;
+}
+EXPORT_SYMBOL(edma_trigger_channel);
+
 /**
  * edma_start - start dma on a channel
  * @channel: channel being activated
diff --git a/include/linux/platform_data/edma.h b/include/linux/platform_data/edma.h
index 57300fd..179fb91 100644
--- a/include/linux/platform_data/edma.h
+++ b/include/linux/platform_data/edma.h
@@ -180,4 +180,6 @@ struct edma_soc_info {
 	const s16	(*xbar_chans)[2];
 };

+int edma_trigger_channel(unsigned);
+
 #endif
--
1.8.1.2
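The index/mask arithmetic in `edma_trigger_channel()` is worth spelling out: the low five bits of the channel number select a bit within a 32-bit ESR word, and the remaining bits (`channel >> 5`) select which ESR word (ESR vs ESRH) to write. A runnable user-space sketch of just that split, with names that are illustrative rather than the kernel's:

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Which event-set register word and which bit a channel maps to */
struct esr_slot {
	unsigned word;   /* ESR register index: channel >> 5 */
	uint32_t mask;   /* bit within that register */
};

/* Mirror of the split used when writing SH_ESR in the patch above:
 * channels 0..31 land in word 0 (ESR), channels 32..63 in word 1 (ESRH). */
static struct esr_slot esr_slot_for(unsigned channel)
{
	struct esr_slot s;

	s.word = channel >> 5;
	s.mask = BIT(channel & 0x1f);
	return s;
}
```

For instance, channel 37 maps to word 1, bit 5, which is exactly what the `edma_shadow0_write_array(ctlr, SH_ESR, channel >> 5, mask)` call expresses.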
[PATCH v4 1/6] dma: edma: Setup parameters to DMA MAX_NR_SG at a time
Changes are made here for configuring existing parameters to support
DMA'ing them out in batches as needed.

Also allocate as many slots as needed by the SG list, but not more than
MAX_NR_SG. These slots will then be reused accordingly. For example, if
MAX_NR_SG=10 and the number of SG entries is 40, still only 10 slots
will be allocated to DMA the entire SG list of size 40.

Also enable TC interrupts for slots that are last in the current
iteration, or that fall on a MAX_NR_SG boundary.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 5f3e532..e522ad5 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -222,9 +222,9 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 	enum dma_slave_buswidth dev_width;
 	u32 burst;
 	struct scatterlist *sg;
-	int i;
 	int acnt, bcnt, ccnt, src, dst, cidx;
 	int src_bidx, dst_bidx, src_cidx, dst_cidx;
+	int i, nslots;

 	if (unlikely(!echan || !sgl || !sg_len))
 		return NULL;
@@ -262,8 +262,10 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(

 	edesc->pset_nr = sg_len;

-	for_each_sg(sgl, sg, sg_len, i) {
-		/* Allocate a PaRAM slot, if needed */
+	/* Allocate a PaRAM slot, if needed */
+	nslots = min_t(unsigned, MAX_NR_SG, sg_len);
+
+	for (i = 0; i < nslots; i++) {
 		if (echan->slot[i] < 0) {
 			echan->slot[i] =
 				edma_alloc_slot(EDMA_CTLR(echan->ch_num),
@@ -273,6 +275,10 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 				return NULL;
 			}
 		}
+	}
+
+	/* Configure PaRAM sets for each SG */
+	for_each_sg(sgl, sg, sg_len, i) {

 		acnt = dev_width;
@@ -330,6 +336,12 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 		/* Configure A or AB synchronized transfers */
 		if (edesc->absync)
 			edesc->pset[i].opt |= SYNCDIM;
+
+		/* If this is the last in the current set of MAX_NR_SG
+		   transactions, enable interrupts so that the next set
+		   is processed */
+		if (!((i+1) % MAX_NR_SG))
+			edesc->pset[i].opt |= TCINTEN;
+
 		/* If this is the last set, enable completion interrupt flag */
 		if (i == sg_len - 1)
 			edesc->pset[i].opt |= TCINTEN;
--
1.8.1.2
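The two TCINTEN conditions above pick out exactly the PaRAM sets where an interrupt is needed: every MAX_NR_SG'th set (a batch boundary, so the next batch can be programmed) and the final set (overall completion). A small user-space sketch of that predicate — `max_nr_sg` is a parameter here, whereas in the driver it is a compile-time constant:

```c
#include <stdbool.h>

/*
 * True when pset i of an sg_len-entry list should have TCINTEN set,
 * following the two conditions in the patch: the end of a MAX_NR_SG
 * batch, or the very last entry of the list.
 */
static bool wants_tc_interrupt(int i, int sg_len, int max_nr_sg)
{
	if (((i + 1) % max_nr_sg) == 0)   /* batch boundary */
		return true;
	if (i == sg_len - 1)              /* completion interrupt */
		return true;
	return false;
}
```

For sg_len=7 and max_nr_sg=3, only entries 2, 5 and 6 would get TCINTEN, i.e. one interrupt per batch plus the completion interrupt.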
[PATCH v4 0/6] dma: edma: Support scatter-lists of any length
The following series adds support to EDMA driver to enable DMA of scatter-gather lists of arbitrary length, but still make use of only a certain MAX number of slots at a time for a given channel. Thus free-ing up the rest of the slots to other slaves/channels. With this there is no need for slave drivers to query the EDMA driver about how much is the MAX it can send at a time as done in [1]. Drivers can send SG lists of any number of entries to DMA. Reference discussion at [2]. With this, all the patches for MMC and EDMA related to "sg limits" can be dropped. Tested omap-aes and omap_hsmmc drivers with different MAX number of slots, even just 1. In the case where it is 1, only 1-slot is used to DMA an entire scatter list of arbitrary length. Since this series touches EDMA private API code also shared with davinci-pcm, playback of a 16-bit 44.1KHz audio file with davinci-pcm has been tested. Sample test run with 1 vs 16 (MAX number of slots/SG) in omap-aes driver: MAX slots = 1: (128 bit key, 8192 byte blocks): 1266 operations in 1 seconds (10371072 bytes) MAX slots = 16: (128 bit key, 8192 byte blocks): 1601 operations in 1 seconds (13115392 bytes) Note: For the above test, 8K buffer is mapped into SG list of size 2 so only 2 slots are required. So beyond size 2, there will not be any noticeable performance improvement. But above experiment just shows as proof of concept that even using 1 slot is managed by just DMA'ing 1 SG entry at a time. Patch series history: v1: The initial patch series v2: Several style related cleanups. v3: Rewrote major portions of the series to avoid usage of error interrupts. v4: Go back to v1: After performing tests, it is determined that the v1 series is a good enough approach and that v3 triggers problems in certain cases. Tests have shown that there is not any noticeable performance difference between using vs not using error interrupts. v4 also squashes some patches in v1. 
At this point, we are freezing the overall architecture of the patch series to v4. [1] https://lkml.org/lkml/2013/7/18/432 [2] http://marc.info/?l=linux-omap=137416733628831=2 Joel Fernandes (6): dma: edma: Setup parameters to DMA MAX_NR_SG at a time dma: edma: Write out and handle MAX_NR_SG at a given time ARM: edma: Add function to manually trigger an EDMA channel dma: edma: Find missed events and issue them dma: edma: Leave linked to Null slot instead of DUMMY slot dma: edma: Remove limits on number of slots arch/arm/common/edma.c | 17 drivers/dma/edma.c | 164 - include/linux/platform_data/edma.h | 2 + 3 files changed, 144 insertions(+), 39 deletions(-) -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v4 6/6] dma: edma: Remove limits on number of slots
With this series, this check is no longer required and we shouldn't
need to reject drivers DMA'ing more than the MAX number of slots.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 25e47d4..d966413 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -287,12 +287,6 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
 		return NULL;
 	}

-	if (sg_len > MAX_NR_SG) {
-		dev_err(dev, "Exceeded max SG segments %d > %d\n",
-			sg_len, MAX_NR_SG);
-		return NULL;
-	}
-
 	edesc = kzalloc(sizeof(*edesc) + sg_len * sizeof(edesc->pset[0]),
 			GFP_ATOMIC);
 	if (!edesc) {
--
1.8.1.2
[PATCH v4 2/6] dma: edma: Write out and handle MAX_NR_SG at a given time
Process SG elements in batches of MAX_NR_SG if they are greater than
MAX_NR_SG. Due to this, at any given time only that many slots will be
used in the given channel no matter how long the scatter list is. We
keep track of how much has been written in order to process the next
batch of elements in the scatter-list and to detect completion.

For such intermediate transfer completions (one batch of MAX_NR_SG),
make use of the pause and resume functions instead of start and stop
when such an intermediate transfer is in progress or completed, as we
do not want to clear any pending events.

Signed-off-by: Joel Fernandes
---
 drivers/dma/edma.c | 79 ++++++++++++++++++++++++++++++----------------
 1 file changed, 53 insertions(+), 26 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index e522ad5..732829b 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -56,6 +56,7 @@ struct edma_desc {
 	struct list_head		node;
 	int				absync;
 	int				pset_nr;
+	int				processed;
 	struct edmacc_param		pset[0];
 };

@@ -104,22 +105,34 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
 /* Dispatch a queued descriptor to the controller (caller holds lock) */
 static void edma_execute(struct edma_chan *echan)
 {
-	struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan);
+	struct virt_dma_desc *vdesc;
 	struct edma_desc *edesc;
-	int i;
-
-	if (!vdesc) {
-		echan->edesc = NULL;
-		return;
+	struct device *dev = echan->vchan.chan.device->dev;
+	int i, j, left, nslots;
+
+	/* If either we processed all psets or we're still not started */
+	if (!echan->edesc ||
+	    echan->edesc->pset_nr == echan->edesc->processed) {
+		/* Get next vdesc */
+		vdesc = vchan_next_desc(&echan->vchan);
+		if (!vdesc) {
+			echan->edesc = NULL;
+			return;
+		}
+		list_del(&vdesc->node);
+		echan->edesc = to_edma_desc(&vdesc->tx);
 	}
-	list_del(&vdesc->node);
+	edesc = echan->edesc;
-	echan->edesc = edesc = to_edma_desc(&vdesc->tx);
+	/* Find out how many left */
+	left = edesc->pset_nr - edesc->processed;
+	nslots = min(MAX_NR_SG, left);

 	/* Write descriptor PaRAM set(s) */
-	for (i = 0; i < edesc->pset_nr; i++) {
- edma_write_slot(echan->slot[i], >pset[i]); + for (i = 0; i < nslots; i++) { + j = i + edesc->processed; + edma_write_slot(echan->slot[i], >pset[j]); dev_dbg(echan->vchan.chan.device->dev, "\n pset[%d]:\n" " chnum\t%d\n" @@ -132,24 +145,31 @@ static void edma_execute(struct edma_chan *echan) " bidx\t%08x\n" " cidx\t%08x\n" " lkrld\t%08x\n", - i, echan->ch_num, echan->slot[i], - edesc->pset[i].opt, - edesc->pset[i].src, - edesc->pset[i].dst, - edesc->pset[i].a_b_cnt, - edesc->pset[i].ccnt, - edesc->pset[i].src_dst_bidx, - edesc->pset[i].src_dst_cidx, - edesc->pset[i].link_bcntrld); + j, echan->ch_num, echan->slot[i], + edesc->pset[j].opt, + edesc->pset[j].src, + edesc->pset[j].dst, + edesc->pset[j].a_b_cnt, + edesc->pset[j].ccnt, + edesc->pset[j].src_dst_bidx, + edesc->pset[j].src_dst_cidx, + edesc->pset[j].link_bcntrld); /* Link to the previous slot if not the last set */ - if (i != (edesc->pset_nr - 1)) + if (i != (nslots - 1)) edma_link(echan->slot[i], echan->slot[i+1]); /* Final pset links to the dummy pset */ else edma_link(echan->slot[i], echan->ecc->dummy_slot); } - edma_start(echan->ch_num); + edesc->processed += nslots; + + edma_resume(echan->ch_num); + + if (edesc->processed <= MAX_NR_SG) { + dev_dbg(dev, "first transfer starting %d\n", echan->ch_num); + edma_start(echan->ch_num); + } } static int edma_terminate_all(struct edma_chan *echan) @@ -368,19 +388,26 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
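The batching arithmetic in the rewritten `edma_execute()` — `left = pset_nr - processed; nslots = min(MAX_NR_SG, left)` — is easy to check in isolation. Below is a user-space simulation of just that bookkeeping (struct and function names are illustrative, not the driver's), showing that an arbitrarily long list is consumed in ceil(sg_len / MAX_NR_SG) batches:

```c
/* Stand-in for the processed/pset_nr fields of struct edma_desc */
struct fake_desc {
	int pset_nr;    /* total PaRAM sets for this descriptor */
	int processed;  /* how many have been written out so far */
};

/* One pass of the batching logic: consume min(max_nr_sg, remaining) */
static int execute_one_batch(struct fake_desc *d, int max_nr_sg)
{
	int left = d->pset_nr - d->processed;
	int nslots = left < max_nr_sg ? left : max_nr_sg;

	d->processed += nslots;
	return nslots;
}

/* Count how many batches a list of sg_len entries needs in total */
static int batches_needed(int sg_len, int max_nr_sg)
{
	struct fake_desc d = { .pset_nr = sg_len, .processed = 0 };
	int n = 0;

	while (d.processed < d.pset_nr) {
		execute_one_batch(&d, max_nr_sg);
		n++;
	}
	return n;
}
```

This matches the cover-letter example: 40 SG entries with MAX_NR_SG=10 are transferred in 4 batches while only 10 slots are ever allocated.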
[PATCH v4 4/6] dma: edma: Find missed events and issue them
In an effort to move to using scatter-gather lists of any size with
EDMA, as discussed at [1], instead of placing limitations on the
driver, we work through the limitations of the EDMAC hardware to find
missed events and issue them.

The sequence of events that requires this is:

For the scenario where MAX slots for an EDMA channel is 3:

SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null

The above SG list will have to be DMA'd in 2 sets:

(1) SG1 -> SG2 -> SG3 -> Null
(2) SG4 -> SG5 -> SG6 -> Null

After (1) is successfully transferred, the events from the MMC
controller do not stop coming and are missed by the time we have set up
the transfer for (2). So here, we catch the missed events as an error
condition and issue them manually.

In the second part of the patch, we handle the NULL slot cases: for the
crypto IP, we continue to receive events continuously even in the NULL
slot, and the setup of the next set of SG elements happens after the
error handler executes. This results in recursion problems. Due to
this, we continuously receive error interrupts when we manually trigger
an event from the error handler.

We fix this by first detecting whether the channel is currently
transferring from a NULL slot or not; that's where the edma_read_slot
in the error callback from the interrupt handler comes in. With this we
can determine whether the setup of the next SG list has completed, and
we manually trigger only in this case. If the setup has _not_
completed, we are still in NULL, so we just set a missed flag and allow
the manual triggering to happen in edma_execute, which will eventually
be called. This fixes the above-mentioned race conditions seen with the
crypto drivers.
[1] http://marc.info/?l=linux-omap=137416733628831=2 Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 50 +- 1 file changed, 49 insertions(+), 1 deletion(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index 732829b..0e77b57 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -70,6 +70,7 @@ struct edma_chan { int ch_num; boolalloced; int slot[EDMA_MAX_SLOTS]; + int missed; struct dma_slave_config cfg; }; @@ -170,6 +171,20 @@ static void edma_execute(struct edma_chan *echan) dev_dbg(dev, "first transfer starting %d\n", echan->ch_num); edma_start(echan->ch_num); } + + /* +* This happens due to setup times between intermediate transfers +* in long SG lists which have to be broken up into transfers of +* MAX_NR_SG +*/ + if (echan->missed) { + dev_dbg(dev, "missed event in execute detected\n"); + edma_clean_channel(echan->ch_num); + edma_stop(echan->ch_num); + edma_start(echan->ch_num); + edma_trigger_channel(echan->ch_num); + echan->missed = 0; + } } static int edma_terminate_all(struct edma_chan *echan) @@ -387,6 +402,7 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data) struct device *dev = echan->vchan.chan.device->dev; struct edma_desc *edesc; unsigned long flags; + struct edmacc_param p; /* Pause the channel */ edma_pause(echan->ch_num); @@ -414,7 +430,39 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data) break; case DMA_CC_ERROR: - dev_dbg(dev, "transfer error on channel %d\n", ch_num); + spin_lock_irqsave(>vchan.lock, flags); + + edma_read_slot(EDMA_CHAN_SLOT(echan->slot[0]), ); + + /* +* Issue later based on missed flag which will be sure +* to happen as: +* (1) we finished transmitting an intermediate slot and +* edma_execute is coming up. +* (2) or we finished current transfer and issue will +* call edma_execute. +* +* Important note: issuing can be dangerous here and +* lead to some nasty recursion when we are in a NULL +* slot. So we avoid doing so and set the missed flag. 
+ */ + if (p.a_b_cnt == 0 && p.ccnt == 0) { + dev_dbg(dev, "Error occurred, looks like slot is null, just setting miss\n"); + echan->missed = 1; + } else { + /* + * The slot is already programmed but the event got + * missed, so it's safe to issue it here. + */ + dev_dbg(dev, "Error occurred but slot is non-null, triggering\n");
[PATCH 2/3] crypto: omap-des: Add config and build options
Add config and build options for the newly added omap-des driver. Signed-off-by: Joel Fernandes --- drivers/crypto/Kconfig | 11 +++ drivers/crypto/Makefile | 1 + 2 files changed, 12 insertions(+) diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index e289afa..119a8e5 100644 --- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -259,6 +259,17 @@ config CRYPTO_DEV_OMAP_AES OMAP processors have AES module accelerator. Select this if you want to use the OMAP module for AES algorithms. +config CRYPTO_DEV_OMAP_DES + tristate "Support for OMAP DES3DES hw engine" + depends on ARCH_OMAP2PLUS + select CRYPTO_DES + select CRYPTO_BLKCIPHER2 + help + OMAP processors have DES/3DES module accelerator. Select this if you + want to use the OMAP module for DES and 3DES algorithms. Currently + the ECB and CBC modes of operation are supported by the driver. + Accesses made on unaligned boundaries are also supported. + config CRYPTO_DEV_PICOXCELL tristate "Support for picoXcell IPSEC and Layer2 crypto engines" depends on ARCH_PICOXCELL && HAVE_CLK diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index 38ce13d..ada440f 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -11,6 +11,7 @@ obj-$(CONFIG_CRYPTO_DEV_IXP4XX) += ixp4xx_crypto.o obj-$(CONFIG_CRYPTO_DEV_PPC4XX) += amcc/ obj-$(CONFIG_CRYPTO_DEV_OMAP_SHAM) += omap-sham.o obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o +obj-$(CONFIG_CRYPTO_DEV_OMAP_DES) += omap-des.o obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o obj-$(CONFIG_CRYPTO_DEV_SAHARA) += sahara.o obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o -- 1.8.1.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/3] crypto: omap-des: Add driver for OMAP DES module
OMAP4 and AM437x SoCs have a DES3DES module which is capable of performing DES encrypt/decrypt and 3DES ede encrypt/decrypt operations. Following patch series adds support for the same. Tests have been performed with crypto/testmgr and all tests are successful. Joel Fernandes (3): crypto: omap-des: Add omap-des driver for OMAP4/AM43xx crypto: omap-des: Add config and build options crypto: omap-des: Add triple DES (des3_ede) support to driver drivers/crypto/Kconfig| 11 + drivers/crypto/Makefile |1 + drivers/crypto/omap-des.c | 1239 + 3 files changed, 1251 insertions(+) create mode 100644 drivers/crypto/omap-des.c -- 1.8.1.2
[PATCH 3/3] crypto: omap-des: Add triple DES (des3_ede) support to driver
OMAP DES module supports 3DES operation where 3 64-bit keys are used to perform a DES encrypt-decrypt-encrypt (ede) operation on a buffer. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-des.c | 53 --- 1 file changed, 50 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/omap-des.c b/drivers/crypto/omap-des.c index 6a9a25f..0df60cb 100644 --- a/drivers/crypto/omap-des.c +++ b/drivers/crypto/omap-des.c @@ -83,7 +83,7 @@ struct omap_des_ctx { struct omap_des_dev *dd; int keylen; - u32 key[DES_KEY_SIZE / sizeof(u32)]; + u32 key[(3 * DES_KEY_SIZE) / sizeof(u32)]; unsigned long flags; }; @@ -265,8 +265,10 @@ static int omap_des_write_ctrl(struct omap_des_dev *dd) val |= DES_REG_CTRL_CBC; if (dd->flags & FLAGS_ENCRYPT) val |= DES_REG_CTRL_DIRECTION; + if (key32 == 6) + val |= DES_REG_CTRL_TDES; - mask |= DES_REG_CTRL_CBC | DES_REG_CTRL_DIRECTION; + mask |= DES_REG_CTRL_CBC | DES_REG_CTRL_DIRECTION | DES_REG_CTRL_TDES; omap_des_write_mask(dd, DES_REG_CTRL(dd), val, mask); @@ -725,7 +727,7 @@ static int omap_des_setkey(struct crypto_ablkcipher *tfm, const u8 *key, { struct omap_des_ctx *ctx = crypto_ablkcipher_ctx(tfm); - if (keylen != DES_KEY_SIZE) + if (keylen != DES_KEY_SIZE && keylen != (3*DES_KEY_SIZE)) return -EINVAL; pr_debug("enter, keylen: %d\n", keylen); @@ -817,6 +819,51 @@ static struct crypto_alg algs_ecb_cbc[] = { .encrypt= omap_des_cbc_encrypt, .decrypt= omap_des_cbc_decrypt, } +}, +{ + .cra_name = "ecb(des3_ede)", + .cra_driver_name= "ecb-des3-omap", + .cra_priority = 100, + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | + CRYPTO_ALG_KERN_DRIVER_ONLY | + CRYPTO_ALG_ASYNC, + .cra_blocksize = DES_BLOCK_SIZE, + .cra_ctxsize= sizeof(struct omap_des_ctx), + .cra_alignmask = 0, + .cra_type = &crypto_ablkcipher_type, + .cra_module = THIS_MODULE, + .cra_init = omap_des_cra_init, + .cra_exit = omap_des_cra_exit, + .cra_u.ablkcipher = { + .min_keysize= 3*DES_KEY_SIZE, + .max_keysize= 3*DES_KEY_SIZE, + .setkey = omap_des_setkey, + .encrypt= omap_des_ecb_encrypt, + 
.decrypt= omap_des_ecb_decrypt, + } +}, +{ + .cra_name = "cbc(des3_ede)", + .cra_driver_name= "cbc-des3-omap", + .cra_priority = 100, + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | + CRYPTO_ALG_KERN_DRIVER_ONLY | + CRYPTO_ALG_ASYNC, + .cra_blocksize = DES_BLOCK_SIZE, + .cra_ctxsize= sizeof(struct omap_des_ctx), + .cra_alignmask = 0, + .cra_type = &crypto_ablkcipher_type, + .cra_module = THIS_MODULE, + .cra_init = omap_des_cra_init, + .cra_exit = omap_des_cra_exit, + .cra_u.ablkcipher = { + .min_keysize= 3*DES_KEY_SIZE, + .max_keysize= 3*DES_KEY_SIZE, + .ivsize = DES_BLOCK_SIZE, + .setkey = omap_des_setkey, + .encrypt= omap_des_cbc_encrypt, + .decrypt= omap_des_cbc_decrypt, + } } }; -- 1.8.1.2
[PATCH 1/3] crypto: omap-des: Add omap-des driver for OMAP4/AM43xx
Add omap-des driver with platform data for OMAP4. Support added for DES ECB and CBC modes. Where possible, code is reused from omap-aes driver with changes made for adjusting key size, block size, removing non-existent encryption modes and adding support for OMAP4 platform data and offsets. Tests have been conducted with the CRYPTO test manager, and functionality is verified at different page length alignments. Signed-off-by: Joel Fernandes --- drivers/crypto/omap-des.c | 1192 + 1 file changed, 1192 insertions(+) create mode 100644 drivers/crypto/omap-des.c diff --git a/drivers/crypto/omap-des.c b/drivers/crypto/omap-des.c new file mode 100644 index 000..6a9a25f --- /dev/null +++ b/drivers/crypto/omap-des.c @@ -0,0 +1,1192 @@ +/* + * Cryptographic API. + * + * Support for OMAP DES and Triple DES HW acceleration. + * + * Copyright (c) 2012 Texas Instruments Incorporated + * Author: Joel Fernandes + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. 
+ * + */ + +#define pr_fmt(fmt) "%s: " fmt, __func__ + +#ifdef DEBUG +#define prn(num) printk(#num "=%d\n", num) +#define prx(num) printk(#num "=%x\n", num) +#else +#define prn(num) do { } while (0) +#define prx(num) do { } while (0) +#endif + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define DST_MAXBURST 2 + +#define DES_BLOCK_WORDS (DES_BLOCK_SIZE >> 2) + +#define _calc_walked(inout) (dd->inout##_walk.offset - dd->inout##_sg->offset) + +#define DES_REG_KEY(dd, x) ((dd)->pdata->key_ofs - \ + ((x ^ 0x01) * 0x04)) + +#define DES_REG_IV(dd, x) ((dd)->pdata->iv_ofs + ((x) * 0x04)) + +#define DES_REG_CTRL(dd) ((dd)->pdata->ctrl_ofs) +#define DES_REG_CTRL_CBC (1 << 4) +#define DES_REG_CTRL_TDES (1 << 3) +#define DES_REG_CTRL_DIRECTION (1 << 2) +#define DES_REG_CTRL_INPUT_READY (1 << 1) +#define DES_REG_CTRL_OUTPUT_READY (1 << 0) + +#define DES_REG_DATA_N(dd, x) ((dd)->pdata->data_ofs + ((x) * 0x04)) + +#define DES_REG_REV(dd) ((dd)->pdata->rev_ofs) + +#define DES_REG_MASK(dd) ((dd)->pdata->mask_ofs) + +#define DES_REG_LENGTH_N(x) (0x24 + ((x) * 0x04)) + +#define DES_REG_IRQ_STATUS(dd) ((dd)->pdata->irq_status_ofs) +#define DES_REG_IRQ_ENABLE(dd) ((dd)->pdata->irq_enable_ofs) +#define DES_REG_IRQ_DATA_IN BIT(1) +#define DES_REG_IRQ_DATA_OUT BIT(2) + +#define FLAGS_MODE_MASK 0x000f +#define FLAGS_ENCRYPT BIT(0) +#define FLAGS_CBC BIT(1) +#define FLAGS_INIT BIT(4) +#define FLAGS_BUSY BIT(6) + +struct omap_des_ctx { + struct omap_des_dev *dd; + + int keylen; + u32 key[DES_KEY_SIZE / sizeof(u32)]; + unsigned long flags; +}; + +struct omap_des_reqctx { + unsigned long mode; +}; + +#define OMAP_DES_QUEUE_LENGTH 1 +#define OMAP_DES_CACHE_SIZE 0 + +struct omap_des_algs_info { + struct crypto_alg *algs_list; + unsigned int size; + unsigned int registered; +}; + +struct omap_des_pdata { + struct omap_des_algs_info *algs_info; + unsigned 
int algs_info_size; + + void (*trigger)(struct omap_des_dev *dd, int length); + + u32 key_ofs; + u32 iv_ofs; + u32 ctrl_ofs; + u32 data_ofs; + u32 rev_ofs; + u32 mask_ofs; + u32 irq_enable_ofs; + u32 irq_status_ofs; + + u32 dma_enable_in; + u32 dma_enable_out; + u32 dma_start; + + u32 major_mask; + u32 major_shift; + u32 minor_mask; + u32 minor_shift; +}; + +struct omap_des_dev { + struct list_head list; + unsigned long phys_base; + void __iomem *io_base; + struct omap_des_ctx *ctx; + struct device *dev; + unsigned long flags; + int err; + + /* spinlock used for queues */ + spinlock_t lock; + struct crypto_queue queue; + + struct tasklet_struct done_task; + struct tasklet_struct queue_task; + + struct ablkcipher_request *req; + /* + * total is used by PIO
Re: [PATCH 1/3] crypto: omap-des: Add omap-des driver for OMAP4/AM43xx
On 08/29/2013 06:27 PM, Joel Fernandes wrote: > Add omap-des driver with platform data for OMAP4. Support added for DES > ECB and CBC modes. > > Where possible, code is reused from omap-aes driver with changes made for > adjusting key size, block size, removing non-existent encryption modes > and adding support for OMAP4 platform data and offsets. > > Tests have been conducted with the CRYPTO test manager, and functionality > is verified at different page length alignments. > > Signed-off-by: Joel Fernandes > --- > drivers/crypto/omap-des.c | 1192 > + > 1 file changed, 1192 insertions(+) > create mode 100644 drivers/crypto/omap-des.c > > diff --git a/drivers/crypto/omap-des.c b/drivers/crypto/omap-des.c > new file mode 100644 > index 000..6a9a25f > --- /dev/null > +++ b/drivers/crypto/omap-des.c > @@ -0,0 +1,1192 @@ > +/* > + * Cryptographic API. > + * > + * Support for OMAP DES and Triple DES HW acceleration. > + * > + * Copyright (c) 2012 Texas Instruments Incorporated > + * Author: Joel Fernandes Have to change this to 2013. Will do so if there are no other comments. Regards, -Joel
Re: [PATCH v2] ARM: dts: add AM33XX EDMA support
Hi Benoit, On 08/26/2013 03:36 AM, Benoit Cousson wrote: > - minus all the TI emails which are not working anymore :-( > > I've just sent my previous email too soon... > > Now the patch is different :-) I'll take that one. Unfortunately this patch is still missing from your latest pull request: Subject "[GIT PULL] ARM: OMAP: Device Tree for 3.12 take #2" git://git.kernel.org/pub/scm/linux/kernel/git/bcousson/linux-omap-dt.git tags/for_3.12/dts_signed (commit 4843be165c10f9886c87eeb20acf19a3ddec6653) Below is a scissor patch that cleanly applies on above branch. Thanks, -Joel --->8 From: Matt Porter Subject: [PATCH] ARM: dts: add AM33XX EDMA support Adds AM33XX EDMA support to the am33xx.dtsi as documented in Documentation/devicetree/bindings/dma/ti-edma.txt v2 changes: Drop DT entries that are non-hardware-description Joel Fernandes Discussion in [1]. v3 changes: Changed node name from "edma: edma@" to "edma: dma-controller@" by Sebastian Andrzej Siewior [1] https://patchwork.kernel.org/patch/2226761/ Signed-off-by: Matt Porter Signed-off-by: Joel A Fernandes --- arch/arm/boot/dts/am33xx.dtsi | 12 1 file changed, 12 insertions(+) diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi index 5996d63..f5869ed 100644 --- a/arch/arm/boot/dts/am33xx.dtsi +++ b/arch/arm/boot/dts/am33xx.dtsi @@ -101,6 +101,18 @@ reg = <0x4820 0x1000>; }; + edma: dma-controller@4900 { + compatible = "ti,edma3"; + ti,hwmods = "tpcc", "tptc0", "tptc1", "tptc2"; + reg = <0x4900 0x1>, + <0x44e10f90 0x10>; + interrupts = <12 13 14>; + #dma-cells = <1>; + dma-channels = <64>; + ti,edma-regions = <4>; + ti,edma-slots = <256>; + }; + gpio0: gpio@44e07000 { compatible = "ti,omap4-gpio"; ti,hwmods = "gpio1"; -- 1.8.1.2
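For reference, a consumer node would then request channels with the controller's one-cell specifier (per `#dma-cells = <1>` above). This fragment is illustrative only and not part of the patch — the node name, unit address, and channel numbers are assumptions:

```dts
mmc1: mmc@48060000 {
	/* ... */
	dmas = <&edma 24	/* TX event (hypothetical) */
		&edma 25>;	/* RX event (hypothetical) */
	dma-names = "tx", "rx";
};
```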
Re: [PATCH v2] ARM: EDMA: Fix clearing of unused list for DT DMA resources
Hi Tony or Sekhar, If this patch looks ok, could you pick it up for the -rc cycle? It fixes DMA breakages after the merge window for devices for which DMA resources are being populated in the device tree instead of in pdev resources. Thanks, -Joel On 07/22/2013 12:59 PM, Joel Fernandes wrote: > HWMOD removal for MMC is breaking edma_start as the events are being manually > triggered due to the unused channel list not being cleared. Thanks to Balaji TK for > finding this issue. > > This patch fixes the issue, by reading the "dmas" property from the DT node if > it exists and clearing the bits in the unused channel list. > > Cc: Balaji T K > Cc: Pantel Antoniou > Signed-off-by: Joel Fernandes > --- > v2 changes: Fixed compiler warning > > arch/arm/common/edma.c | 31 +++ > 1 file changed, 23 insertions(+), 8 deletions(-) > > diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c > index a432e6c..765d578 100644 > --- a/arch/arm/common/edma.c > +++ b/arch/arm/common/edma.c > @@ -561,14 +561,29 @@ static int reserve_contiguous_slots(int ctlr, unsigned > int id, > static int prepare_unused_channel_list(struct device *dev, void *data) > { > struct platform_device *pdev = to_platform_device(dev); > - int i, ctlr; > - > - for (i = 0; i < pdev->num_resources; i++) { > - if ((pdev->resource[i].flags & IORESOURCE_DMA) && > - (int)pdev->resource[i].start >= 0) { > - ctlr = EDMA_CTLR(pdev->resource[i].start); > - clear_bit(EDMA_CHAN_SLOT(pdev->resource[i].start), > - edma_cc[ctlr]->edma_unused); > + int i = 0, ctlr; > + u32 dma_chan; > + const __be32 *dma_chan_p; > + struct property *prop; > + > + if (dev->of_node) { > + of_property_for_each_u32(dev->of_node, "dmas", prop, \ > + dma_chan_p, dma_chan) { > + if (i++ & 1) { > + ctlr = EDMA_CTLR(dma_chan); > + clear_bit(EDMA_CHAN_SLOT(dma_chan), > + edma_cc[ctlr]->edma_unused); > + } > + } > + } else { > + for (; i < pdev->num_resources; i++) { > + if ((pdev->resource[i].flags & IORESOURCE_DMA) && > + (int)pdev->resource[i].start >= 0) { > + ctlr = 
EDMA_CTLR(pdev->resource[i].start); > + clear_bit(EDMA_CHAN_SLOT( > + pdev->resource[i].start), > + edma_cc[ctlr]->edma_unused); > + } > } > } > >
[PATCH 3/9] ARM: edma: Add function to manually trigger an EDMA channel
Manual trigger for events missed as a result of splitting a scatter-gather list and DMA'ing it in batches. Add a helper function to trigger a channel in case any such events are missed. Signed-off-by: Joel Fernandes --- arch/arm/common/edma.c | 21 + include/linux/platform_data/edma.h |2 ++ 2 files changed, 23 insertions(+) diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c index 3567ba1..10995b2 100644 --- a/arch/arm/common/edma.c +++ b/arch/arm/common/edma.c @@ -1236,6 +1236,27 @@ void edma_resume(unsigned channel) } EXPORT_SYMBOL(edma_resume); +int edma_manual_trigger(unsigned channel) +{ + unsigned ctlr; + int j; + unsigned int mask; + + ctlr = EDMA_CTLR(channel); + channel = EDMA_CHAN_SLOT(channel); + mask = BIT(channel & 0x1f); + + j = channel >> 5; + + /* EDMA channels without event association */ + edma_shadow0_write_array(ctlr, SH_ESR, j, mask); + + pr_debug("EDMA: ESR%d %08x\n", j, +edma_shadow0_read_array(ctlr, SH_ESR, j)); + return 0; +} +EXPORT_SYMBOL(edma_manual_trigger); + /** * edma_start - start dma on a channel * @channel: channel being activated diff --git a/include/linux/platform_data/edma.h b/include/linux/platform_data/edma.h index 57300fd..0e11eca 100644 --- a/include/linux/platform_data/edma.h +++ b/include/linux/platform_data/edma.h @@ -180,4 +180,6 @@ struct edma_soc_info { const s16 (*xbar_chans)[2]; }; +int edma_manual_trigger(unsigned); + #endif -- 1.7.9.5
[PATCH 5/9] dma: edma: Leave linked to Null slot instead of DUMMY slot
Dummy slot has been used as a way for missed events not to be reported as missing. This has been particularly troublesome for cases where we might want to temporarily pause all incoming events. For the EDMA DMAC, there is no way to do any such pausing of events as the occurrence of the "next" event is not software controlled. Using "edma_pause" in IRQ handlers doesn't help as by then the event of concern from the slave is already missed. Linking a dummy slot is seen to absorb these events, which we didn't want to miss. So we don't link to dummy, but instead leave it linked to the NULL set, allow an error condition and detect the channel that missed it. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c |3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index aa4989f..1eda5cc 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -159,9 +159,6 @@ static void edma_execute(struct edma_chan *echan) /* Link to the previous slot if not the last set */ if (i != (total_process - 1)) edma_link(echan->slot[i], echan->slot[i+1]); - /* Final pset links to the dummy pset */ - else - edma_link(echan->slot[i], echan->ecc->dummy_slot); } edesc->total_processed += total_process; -- 1.7.9.5
[PATCH 8/9] dma: edma: Link to dummy slot only for last SG list split
Consider the case where we have a scatter-list like: SG1->SG2->SG3->SG4->SG5->SG6->Null For example, for a MAX_NR_SG of 2, earlier we were splitting this as: SG1->SG2->Null SG3->SG4->Null SG5->SG6->Null Now we split it as: SG1->SG2->Null SG3->SG4->Null SG5->SG6->Dummy This approach results in fewer unwanted interrupts occurring for the last list split. The Dummy slot has the property of not raising an error condition if events are missed, unlike the Null slot. We are OK with this as we're done with processing the whole list once we reach Dummy. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index c72e8c9..0d3ebde 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -164,6 +164,12 @@ static void edma_execute(struct edma_chan *echan) edesc->total_processed += total_process; + /* If this is either the last set in a set of SG-list transactions + then setup a link to the dummy slot, this results in all future + events being absorbed and that's OK because we're done */ + if (edesc->total_processed == edesc->pset_nr) + edma_link(echan->slot[total_process-1], echan->ecc->dummy_slot); + edma_resume(echan->ch_num); if (edesc->total_processed <= MAX_NR_SG) { -- 1.7.9.5
[PATCH 4/9] dma: edma: Find missed events and issue them
In an effort to move to using scatter-gather lists of any size with EDMA as discussed at [1] instead of placing limitations on the driver, we work through the limitations of the EDMAC hardware to find missed events and issue them. The sequence of events that requires this is as follows. For the scenario where the MAX number of slots for an EDMA channel is 3: SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null The above SG list will have to be DMA'd in 2 sets: (1) SG1 -> SG2 -> SG3 -> Null (2) SG4 -> SG5 -> SG6 -> Null After (1) is successfully transferred, the events from the MMC controller do not stop coming and are missed by the time we have set up the transfer for (2). So here, we catch the missed events as an error condition and issue them manually. [1] http://marc.info/?l=linux-omap=137416733628831=2 Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index d9a151b..aa4989f 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -417,7 +417,15 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data) break; case DMA_CC_ERROR: - dev_dbg(dev, "transfer error on channel %d\n", ch_num); + if (echan->edesc) { + dev_dbg(dev, "Missed event on %d, retrying\n", + ch_num); + edma_clean_channel(echan->ch_num); + edma_stop(echan->ch_num); + edma_start(echan->ch_num); + edma_manual_trigger(echan->ch_num); + } + dev_dbg(dev, "handled error on channel %d\n", ch_num); break; default: break; -- 1.7.9.5
[PATCH 1/9] dma: edma: Setup parameters to DMA MAX_NR_SG at a time
Changes are made here for configuring existing parameters to support DMA'ing them out in batches as needed. Also allocate as many slots as needed by the SG list, but not more than MAX_NR_SG. These slots will then be reused accordingly. For example, if MAX_NR_SG=10 and the number of SG entries is 40, still only 10 slots will be allocated to DMA the entire SG list of size 40. Also enable TC interrupts for slots that are the last in a current iteration, or that fall on a MAX_NR_SG boundary. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index 5f3e532..0b68f94 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -222,9 +222,9 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( enum dma_slave_buswidth dev_width; u32 burst; struct scatterlist *sg; - int i; int acnt, bcnt, ccnt, src, dst, cidx; int src_bidx, dst_bidx, src_cidx, dst_cidx; + int i, num_slots_needed; if (unlikely(!echan || !sgl || !sg_len)) return NULL; @@ -262,8 +262,11 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( edesc->pset_nr = sg_len; - for_each_sg(sgl, sg, sg_len, i) { - /* Allocate a PaRAM slot, if needed */ + /* Allocate a PaRAM slot, if needed */ + + num_slots_needed = sg_len > MAX_NR_SG ? 
MAX_NR_SG : sg_len; + + for (i = 0; i < num_slots_needed; i++) { if (echan->slot[i] < 0) { echan->slot[i] = edma_alloc_slot(EDMA_CTLR(echan->ch_num), @@ -273,6 +276,10 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( return NULL; } } + } + + /* Configure PaRAM sets for each SG */ + for_each_sg(sgl, sg, sg_len, i) { acnt = dev_width; @@ -330,6 +337,12 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( /* Configure A or AB synchronized transfers */ if (edesc->absync) edesc->pset[i].opt |= SYNCDIM; + + /* If this is the last in a current SG set of transactions, + enable interrupts so that next set is processed */ + if (!((i+1) % MAX_NR_SG)) + edesc->pset[i].opt |= TCINTEN; + /* If this is the last set, enable completion interrupt flag */ if (i == sg_len - 1) edesc->pset[i].opt |= TCINTEN; -- 1.7.9.5
[PATCH 9/9] dma: edma: remove limits on number of slots
With this series, this check is no longer required and we shouldn't need to reject drivers DMA'ing more than the MAX number of slots. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c |6 -- 1 file changed, 6 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index 0d3ebde..abf2e87 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -285,12 +285,6 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg( return NULL; } - if (sg_len > MAX_NR_SG) { - dev_err(dev, "Exceeded max SG segments %d > %d\n", - sg_len, MAX_NR_SG); - return NULL; - } - edesc = kzalloc(sizeof(*edesc) + sg_len * sizeof(edesc->pset[0]), GFP_ATOMIC); if (!edesc) { -- 1.7.9.5
[PATCH 0/9] dma: edma: Support scatter-lists of any length
The following series adds support to the EDMA driver to enable DMA of scatter-gather lists of arbitrary length, while still making use of only a certain MAX number of slots at a time for a given channel, thus freeing up the rest of the slots for other slaves/channels. With this there is no need for slave drivers to query the EDMA driver about the MAX it can send at a time, as done in [1]. Drivers can send SG lists of any number of entries to DMA. Reference discussion at [2]. Tested omap-aes and omap_hsmmc drivers with different MAX numbers of slots, even just 1. In the case where it is 1, only 1 slot is used to DMA an entire scatter list of arbitrary length. Since this series touches EDMA private API code also shared with davinci-pcm, playback of a 16-bit 44.1KHz audio file with davinci-pcm has been tested. Sample test run with 1 vs 16 (MAX number of slots/SG) in the omap-aes driver: MAX slots = 1: (128 bit key, 8192 byte blocks): 1266 operations in 1 seconds (10371072 bytes) MAX slots = 16: (128 bit key, 8192 byte blocks): 1601 operations in 1 seconds (13115392 bytes) Note: For the above test, the 8K buffer is mapped into an SG list of size 2, so only 2 slots are required. Beyond size 2, there will not be any noticeable performance improvement. But even with 1 slot, the transfer is managed by just DMA'ing 1 SG entry at a time.
[1] https://lkml.org/lkml/2013/7/18/432 [2] http://marc.info/?l=linux-omap=137416733628831=2 Joel Fernandes (9): dma: edma: Setup parameters to DMA MAX_NR_SG at a time dma: edma: Write out and handle MAX_NR_SG at a given time ARM: edma: Add function to manually trigger an EDMA channel dma: edma: Find missed events and issue them dma: edma: Leave linked to Null slot instead of DUMMY slot dma: edma: Detect null slot errors and handle them correctly ARM: edma: Don't clear EMR of channel in edma_stop dma: edma: Link to dummy slot only for last SG list split dma: edma: remove limits on number of slots arch/arm/common/edma.c | 22 - drivers/dma/edma.c | 157 +++- include/linux/platform_data/edma.h |2 + 3 files changed, 142 insertions(+), 39 deletions(-) -- 1.7.9.5
[PATCH 7/9] ARM: edma: Don't clear EMR of channel in edma_stop
We certainly don't want error conditions to be cleared anywhere as this will make us 'forget' about missed events. We depend on knowing which events were missed in order to be able to reissue them. This fixes a race condition where the EMR was being cleared by the transfer completion interrupt handler. Basically, what was happening was:

     Missed event
          |
          |
          V
SG1 -> SG2 -> SG3 -> Null
          \
           \__ TC Interrupt

(At almost the same time as the ARM is executing the TC interrupt handler, an event got missed and also forgotten by clearing the EMR.)

The EMR is ultimately cleared by the error interrupt handler once it is handled, so we don't have to do it in edma_stop. Signed-off-by: Joel Fernandes --- arch/arm/common/edma.c |1 - 1 file changed, 1 deletion(-) diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c index 10995b2..dec772e 100644 --- a/arch/arm/common/edma.c +++ b/arch/arm/common/edma.c @@ -1328,7 +1328,6 @@ void edma_stop(unsigned channel) edma_shadow0_write_array(ctlr, SH_EECR, j, mask); edma_shadow0_write_array(ctlr, SH_ECR, j, mask); edma_shadow0_write_array(ctlr, SH_SECR, j, mask); - edma_write_array(ctlr, EDMA_EMCR, j, mask); pr_debug("EDMA: EER%d %08x\n", j, edma_shadow0_read_array(ctlr, SH_EER, j)); -- 1.7.9.5
[PATCH 2/9] dma: edma: Write out and handle MAX_NR_SG at a given time
Process SG elements in batches of MAX_NR_SG if they are greater than MAX_NR_SG. Due to this, at any given time only that many slots will be used in the given channel no matter how long the scatter list is. We keep track of how much has been written in order to process the next batch of elements in the scatter-list and detect completion. For such intermediate transfer completions (one batch of MAX_NR_SG), make use of the pause and resume functions instead of start and stop when such an intermediate transfer is in progress or completed, as we do not want to clear any pending events. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 79 +++- 1 file changed, 54 insertions(+), 25 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index 0b68f94..d9a151b 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -56,6 +56,7 @@ struct edma_desc { struct list_head node; int absync; int pset_nr; + int total_processed; struct edmacc_param pset[0]; }; @@ -104,22 +105,36 @@ static void edma_desc_free(struct virt_dma_desc *vdesc) /* Dispatch a queued descriptor to the controller (caller holds lock) */ static void edma_execute(struct edma_chan *echan) { - struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan); + struct virt_dma_desc *vdesc; struct edma_desc *edesc; - int i; + struct device *dev = echan->vchan.chan.device->dev; - if (!vdesc) { - echan->edesc = NULL; - return; + int i, j, total_left, total_process; + + /* If either we processed all psets or we're still not started */ + if (!echan->edesc || + echan->edesc->pset_nr == echan->edesc->total_processed) { + /* Get next vdesc */ + vdesc = vchan_next_desc(&echan->vchan); + if (!vdesc) { + echan->edesc = NULL; + return; + } + list_del(&vdesc->node); + echan->edesc = to_edma_desc(&vdesc->tx); } - list_del(&vdesc->node); + edesc = echan->edesc; + + /* Find out how many left */ + total_left = edesc->pset_nr - edesc->total_processed; + total_process = total_left > MAX_NR_SG ? 
MAX_NR_SG : total_left; - echan->edesc = edesc = to_edma_desc(>tx); /* Write descriptor PaRAM set(s) */ - for (i = 0; i < edesc->pset_nr; i++) { - edma_write_slot(echan->slot[i], >pset[i]); + for (i = 0; i < total_process; i++) { + j = i + edesc->total_processed; + edma_write_slot(echan->slot[i], >pset[j]); dev_dbg(echan->vchan.chan.device->dev, "\n pset[%d]:\n" " chnum\t%d\n" @@ -132,24 +147,31 @@ static void edma_execute(struct edma_chan *echan) " bidx\t%08x\n" " cidx\t%08x\n" " lkrld\t%08x\n", - i, echan->ch_num, echan->slot[i], - edesc->pset[i].opt, - edesc->pset[i].src, - edesc->pset[i].dst, - edesc->pset[i].a_b_cnt, - edesc->pset[i].ccnt, - edesc->pset[i].src_dst_bidx, - edesc->pset[i].src_dst_cidx, - edesc->pset[i].link_bcntrld); + j, echan->ch_num, echan->slot[i], + edesc->pset[j].opt, + edesc->pset[j].src, + edesc->pset[j].dst, + edesc->pset[j].a_b_cnt, + edesc->pset[j].ccnt, + edesc->pset[j].src_dst_bidx, + edesc->pset[j].src_dst_cidx, + edesc->pset[j].link_bcntrld); /* Link to the previous slot if not the last set */ - if (i != (edesc->pset_nr - 1)) + if (i != (total_process - 1)) edma_link(echan->slot[i], echan->slot[i+1]); /* Final pset links to the dummy pset */ else edma_link(echan->slot[i], echan->ecc->dummy_slot); } - edma_start(echan->ch_num); + edesc->total_processed += total_process; + + edma_resume(echan->ch_num); + + if (edesc->total_processed <= MAX_NR_SG) { + dev_dbg(dev, "first transfer starting %d\n", echan->ch_num); + edma_start(echan->ch_num); + } } static int edma_termin
[PATCH 6/9] dma: edma: Detect null slot errors and handle them correctly
For crypto IP, we continue to receive events even continuously in NULL slot, and request lines don't get de-asserted unlike omap_hsmmc. Due to this, we continously receive error interrupts when we manually trigger an event. We fix this, by first detecting if the Channel is currently transferring from a NULL slot or not, that's where the edma_read_slot in the error callback from interrupt handler comes in. Second thing we do is, if we detect if we are on a NULL slot, we don't forcefully trigger as this will only result in more error conditions. Instead we set a missed flag and allow the manual triggerring to happen in edma_execute which will eventually be called. This fixes the issue where we are on a NULL slot and continue to receive events from modules like crypto that don't stop their request events after a transfer is completed. Signed-off-by: Joel Fernandes --- drivers/dma/edma.c | 42 ++ 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c index 1eda5cc..c72e8c9 100644 --- a/drivers/dma/edma.c +++ b/drivers/dma/edma.c @@ -70,6 +70,7 @@ struct edma_chan { int ch_num; boolalloced; int slot[EDMA_MAX_SLOTS]; + int missed; struct dma_slave_config cfg; }; @@ -169,6 +170,18 @@ static void edma_execute(struct edma_chan *echan) dev_dbg(dev, "first transfer starting %d\n", echan->ch_num); edma_start(echan->ch_num); } + + /* This happens due to setup times between intermediate transfers + in long SG lists which have to be broken up into transfers of + MAX_NR_SG */ + if (echan->missed) { + dev_dbg(dev, "missed event in execute detected\n"); + edma_clean_channel(echan->ch_num); + edma_stop(echan->ch_num); + edma_start(echan->ch_num); + edma_manual_trigger(echan->ch_num); + echan->missed = 0; + } } static int edma_terminate_all(struct edma_chan *echan) @@ -387,6 +400,7 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data) struct device *dev = echan->vchan.chan.device->dev; struct edma_desc *edesc; unsigned 
long flags; + struct edmacc_param p; /* Pause the channel */ edma_pause(echan->ch_num); @@ -414,15 +428,35 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data) break; case DMA_CC_ERROR: - if (echan->edesc) { - dev_dbg(dev, "Missed event on %d, retrying\n", - ch_num); + spin_lock_irqsave(>vchan.lock, flags); + + edma_read_slot(EDMA_CHAN_SLOT(echan->slot[0]), ); + + if (p.a_b_cnt == 0 && p.ccnt == 0) { + dev_dbg(dev, "Error occurred, looks like slot is null, just setting miss\n"); + /* + Issue later based on missed flag which will be sure + to happen as: + (1) we finished transmitting an intermediate slot and + edma_execute is coming up. + (2) or we finished current transfer and issue will + call edma_execute. + + Important note: issuing can be dangerous here and + lead to some nasty recursion as this is a NULL slot + at this point. + */ + echan->missed = 1; + } else { + dev_dbg(dev, "Error occurred but slot is non-null, TRIGGERING\n"); edma_clean_channel(echan->ch_num); edma_stop(echan->ch_num); edma_start(echan->ch_num); edma_manual_trigger(echan->ch_num); } - dev_dbg(dev, "handled error on channel %d\n", ch_num); + + spin_unlock_irqrestore(>vchan.lock, flags); + break; default: break; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ARM: EDMA: Fix clearing of unused list for DT DMA resources
On 07/29/2013 02:01 AM, Sekhar Nori wrote:> On Monday 22 July 2013 11:29 PM, Joel Fernandes wrote: >> HWMOD removal for MMC is breaking edma_start as the events are being manually >> triggered due to unused channel list not being clear, Thanks to Balaji TK for >> finding this issue. > > So, Reported-by: Balaji T K > > ? Sure, I'll add that as well. > >> >> This patch fixes the issue, by reading the "dmas" property from the DT node if >> it exists and clearing the bits in the unused channel list. >> >> Cc: Balaji T K >> Cc: Pantel Antoniou >> Signed-off-by: Joel Fernandes >> --- >> v2 changes: Fixed compiler warning >> >> arch/arm/common/edma.c | 31 +++ >> 1 file changed, 23 insertions(+), 8 deletions(-) >> >> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c >> index a432e6c..765d578 100644 >> --- a/arch/arm/common/edma.c >> +++ b/arch/arm/common/edma.c >> @@ -561,14 +561,29 @@ static int reserve_contiguous_slots(int ctlr, unsigned int id, >> static int prepare_unused_channel_list(struct device *dev, void *data) >> { >> struct platform_device *pdev = to_platform_device(dev); >> -int i, ctlr; >> - >> -for (i = 0; i < pdev->num_resources; i++) { >> -if ((pdev->resource[i].flags & IORESOURCE_DMA) && >> -(int)pdev->resource[i].start >= 0) { >> -ctlr = EDMA_CTLR(pdev->resource[i].start); >> -clear_bit(EDMA_CHAN_SLOT(pdev->resource[i].start), >> -edma_cc[ctlr]->edma_unused); >> +int i = 0, ctlr; >> +u32 dma_chan; >> +const __be32 *dma_chan_p; >> +struct property *prop; >> + >> +if (dev->of_node) { >> +of_property_for_each_u32(dev->of_node, "dmas", prop, \ >> + dma_chan_p, dma_chan) { >> +if (i++ & 1) { >> +ctlr = EDMA_CTLR(dma_chan); >> +clear_bit(EDMA_CHAN_SLOT(dma_chan), >> +edma_cc[ctlr]->edma_unused); >> +} > > I think it will be cleaner if you use of_property_count_strings() on > dma-names to get the count of DMA channels for this device and then use > of_parse_phandle_with_args() to get access to the args. 
Honestly, I have > not used these APIs myself, so if this doesn’t work the way I think it > should then let me know and I will try to do some experiments myself. > > The current way of skipping over even fields really depends on > #dma-cells for EDMA to be 1. That in itself is not such a bad > assumption, but if you have APIs which already take care of this for > you, why not use them? Sure, looks like of_dma_request_slave_channel does same thing in drive/dma/of-dma.c. Let me follow up with a patch that does it this way. I agree its cleaner. >> +} >> +} else { >> +for (; i < pdev->num_resources; i++) { >> +if ((pdev->resource[i].flags & IORESOURCE_DMA) && >> +(int)pdev->resource[i].start >= 0) { >> +ctlr = EDMA_CTLR(pdev->resource[i].start); >> +clear_bit(EDMA_CHAN_SLOT( >> + pdev->resource[i].start), >> + edma_cc[ctlr]->edma_unused); >> +} > > So there is very little in common between OF and non-OF versions of this > function. Why not have two different versions of this function for the > two cases? The OF version can reside under the CONFIG_OF conditional > already in use in the file. This will also save you the ugly line breaks > you had to resort to due to too deep indentation. Actually those line breaks are not necessary and wouldn't result in compilation errors. I was planning to drop them. I'll go ahead and split it out anyway, now that also the OF version of the function is going to be bit longer if we use the of_parse functions. Thanks for your review, Regards, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ARM: EDMA: Fix clearing of unused list for DT DMA resources
On 07/29/2013 02:04 AM, Sekhar Nori wrote:
> On Sunday 28 July 2013 05:02 AM, Joel Fernandes wrote:
>> Hi Tony or Sekhar,
>>
>> If this patch looks ok, could you pick it up for the -rc cycle?
>>
>> It fixes DMA breakage after the merge window for devices whose DMA
>> resources are populated from the device tree instead of pdev.
>
> But which DT-enabled platform in the kernel is using EDMA? A grep for
> edma over arch/arm/boot/dts/* brings up nothing.
> Does this really have to go into the -rc cycle or can it wait till
> v3.12?

But the unused channel list is also populated for private EDMA callers like davinci-pcm, no? IIRC, the HWMOD data has also been removed for McASP, so EDMA for those may break.

Thanks,
-Joel
Re: [PATCH] dma: edma: add device_slave_caps() support
Hi Vinod, On 07/29/2013 04:45 AM, Vinod Koul wrote: > On Thu, Jul 25, 2013 at 12:53:51PM +0530, Vinod Koul wrote: >> On Wed, Jul 24, 2013 at 02:36:26PM -0500, Joel Fernandes wrote: >>>> Also another point worth considering is the approach Russell suggested, I >>>> havent >>>> gotten a chance to dig deeper but if I understood it correctly then >>>> programming >>>> the device_dma_parameters should be the right thing to do. Again I need to >>>> look >>>> deeper and esp wrt edma >>> >>> OK. I have some patches sitting in my tree too that I'm working on. With >>> that I don't need to know about maximum number of allowed segments and >>> can send along any number of segment. I will rework them and post them. >>> fwiw, I will also implement caps API incase like Lars did populating the >>> other fields though these will not be unused. >>> >>> For segment size, at this time I don't know any driver that uses it >>> other than davinci-pcm. For this reason the calculations can be done as >>> Lars suggested (for minimum of maximum). Do you know in advance if >>> you're going to amend to drop segment size if we go with what Russell >>> suggested, or are you going to leave the seg-size in the caps API anyway. >> I am just back and havent really done my work on this. Let me check and as I >> said if my understanding is right I would be inclined to remove these >> fields... > Okay so the max sg_len should be done by setting the device_dma_parameters. > > See the example in MXS DMA and MMC drivers Ah, I see those examples. Thanks. > > Now for second parameter why do you need that to be passed, I thought in edma > the > clients needs this value somehow to program the channel. It would be good that > if we can delink from this and let edma derive. Do you mean the burst width and device width parameters? On second thought, I am thinking do we really need to set this particular parameter if we can rearchitect the driver a bit. EDMA theoretically can transmit very large segment sizes. 
There are 3 internal counters (ACNT, BCNT, CCNT) that are 16-bit, total size is quite large (product of counters). EDMA driver however currently assumes that ACNT is device width and BCNT is max burst in units of device width. These parameters are again client configured so these have to be passed to the EDMA driver by configuration or passed through API to calculate maximum segment size. This is required because client may be unaware that CCNT is only 16 bit, so following calculation is being done in the SG segment size patch: +*edma_get_slave_sg_limits(struct dma_chan *chan, + enum dma_slave_buswidth addr_width, + u32 maxburst) +{ [..] + echan->sg_limits.max_seg_len = + (SZ_64K - 1) * addr_width * maxburst; The other down side of assigning ACNT device width and BCNT as burst size it seems to be wasteful of the range allowed by 16-bit width. I am thinking if we can break free from these assumptions, then we can transmit segments of literally any length and not have to set any segment size limits at all. I'll try to work on some patches to see if we can overcome segment size limits. > Also there should be no such thing as max_sg, the driver should be able to > perform any size of sg_list. Internally it can be breaking the dma > into multiple trasnactions. Sure, I just submitted a patch series that does exactly that: http://marc.info/?l=linux-omap=137510483230005=2 Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] ARM: EDMA: Fix clearing of unused list for DT DMA resources
On 07/29/2013 11:52 PM, Sekhar Nori wrote:
> On Tuesday 30 July 2013 09:23 AM, Joel Fernandes wrote:
>> On 07/29/2013 02:04 AM, Sekhar Nori wrote:
>>> On Sunday 28 July 2013 05:02 AM, Joel Fernandes wrote:
>>>> Hi Tony or Sekhar,
>>>>
>>>> If this patch looks ok, could you pick it up for the -rc cycle?
>>>>
>>>> It fixes DMA breakage after the merge window for devices whose DMA
>>>> resources are populated from the device tree instead of pdev.
>>>
>>> But which DT-enabled platform in the kernel is using EDMA? A grep for
>>> edma over arch/arm/boot/dts/* brings up nothing.
>>> Does this really have to go into the -rc cycle or can it wait till
>>> v3.12?
>>
>> But the unused channel list is also populated for private EDMA callers
>> like davinci-pcm, no? IIRC, the HWMOD data has also been removed for
>> McASP, so EDMA for those may break.
>
> Yes, but which DT-enabled platform has davinci-pcm working today?
> AFAICS, this bug affects DT-enabled platforms only.

Yes, you're right. I didn't realize nothing else is working yet. I guess it's fine then for the next kernel release.

Thanks,
-Joel
Re: [PATCH v2] ARM: EDMA: Fix clearing of unused list for DT DMA resources
On 07/30/2013 11:29 AM, Sekhar Nori wrote: > On 7/30/2013 9:17 AM, Joel Fernandes wrote: > >>>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c >>>> index a432e6c..765d578 100644 >>>> --- a/arch/arm/common/edma.c >>>> +++ b/arch/arm/common/edma.c > >>>> + } else { >>>> + for (; i < pdev->num_resources; i++) { >>>> + if ((pdev->resource[i].flags & IORESOURCE_DMA) && >>>> + (int)pdev->resource[i].start >= 0) { >>>> + ctlr = EDMA_CTLR(pdev->resource[i].start); >>>> + clear_bit(EDMA_CHAN_SLOT( >>>> +pdev->resource[i].start), >>>> +edma_cc[ctlr]->edma_unused); >>>> + } >>> >>> So there is very little in common between OF and non-OF versions of this >>> function. Why not have two different versions of this function for the >>> two cases? The OF version can reside under the CONFIG_OF conditional >>> already in use in the file. This will also save you the ugly line breaks >>> you had to resort to due to too deep indentation. >> >> Actually those line breaks are not necessary and wouldn't result in >> compilation errors. I was planning to drop them. I'll go ahead and split >> it out anyway, now that also the OF version of the function is going to >> be bit longer if we use the of_parse functions. >> >> Thanks for your review, > > It turns out, I gave a bad idea. What I suggested will break the case of > non-DT boot with CONFIG_OF enabled. So what you had was fine. May be > just return from "if (dev->of_node)" so you don't need to have an else > block and can save on the indentation.> Ok, sure. I will go ahead and return from the if block. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
Hi Sekhar, On 07/30/2013 02:05 AM, Sekhar Nori wrote: > On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >> In an effort to move to using Scatter gather lists of any size with >> EDMA as discussed at [1] instead of placing limitations on the driver, >> we work through the limitations of the EDMAC hardware to find missed >> events and issue them. >> >> The sequence of events that require this are: >> >> For the scenario where MAX slots for an EDMA channel is 3: >> >> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null >> >> The above SG list will have to be DMA'd in 2 sets: >> >> (1) SG1 -> SG2 -> SG3 -> Null >> (2) SG4 -> SG5 -> SG6 -> Null >> >> After (1) is succesfully transferred, the events from the MMC controller >> donot stop coming and are missed by the time we have setup the transfer >> for (2). So here, we catch the events missed as an error condition and >> issue them manually. > > Are you sure there wont be any effect of these missed events on the > peripheral side. For example, wont McASP get into an underrun condition > when it encounters a null PaRAM set? Even UART has to transmit to a But it will not encounter null PaRAM set because McASP uses contiguous buffers for transfer which are not scattered across physical memory. This can be accomplished with an SG of size 1. For such SGs, this patch series leaves it linked Dummy and does not link to Null set. Null set is only used for SG lists that are > MAX_NR_SG in size such as those created for example by MMC and Crypto. > particular baud so I guess it cannot wait like the way MMC/SD can. Existing driver have to wait anyway if they hit MAX SG limit today. If they don't want to wait, they would have allocated a contiguous block of memory and DMA that in one stretch so they don't lose any events, and in such cases we are not linking to Null. > Also, wont this lead to under-utilization of the peripheral bandwith? 
> Meaning, MMC/SD is ready with data but cannot transfer because the DMA > is waiting to be set-up. But it is waiting anyway even today. Currently based on MAX segs, MMC driver/subsystem will make SG list of size max_segs. Between these sessions of creating such smaller SG-lists, if for some reason the MMC controller is sending events, these will be lost anyway. What will happen now with this patch series is we are simply accepting a bigger list than this, and handling all the max_segs stuff within the EDMA driver itself without outside world knowing. This is actually more efficient as for long transfers, we are not going back and forth much between the client and EDMA driver. > Did you consider a ping-pong scheme with say three PaRAM sets per > channel? That way you can keep a continuous transfer going on from the > peripheral over the complete SG list. Do you mean ping-pong scheme as used in the davinci-pcm driver today? This can be used only for buffers that are contiguous in memory, not those that are scattered across memory. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/9] ARM: edma: Don't clear EMR of channel in edma_stop
On 07/30/2013 03:29 AM, Sekhar Nori wrote: > On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >> We certainly don't want error conditions to be cleared anywhere > > 'anywhere' is a really loaded term. > >> as this will make us 'forget' about missed events. We depend on >> knowing which events were missed in order to be able to reissue them. > >> This fixes a race condition where the EMR was being cleared >> by the transfer completion interrupt handler. >> >> Basically, what was happening was: >> >> Missed event >> | >> | >> V >> SG1-SG2-SG3-Null >> \ >> \__TC Interrupt (Almost same time as ARM is executing >> TC interrupt handler, an event got missed and also forgotten >> by clearing the EMR). > > Sorry, but I dont see how edma_stop() is coming into picture in the race > you describe? In edma_callback function, for the case of DMA_COMPLETE (Transfer completion interrupt), edma_stop() is called when all sets have been processed. This had the effect of clearing the EMR. This has 2 problems: 1. If error interrupt is also pending and TC interrupt clears the EMR. Due to this the ARM will execute the error interrupt even though the EMR is clear. As a result, the following if condition in dma_ccerr_handler will be true and IRQ_NONE is returned. if ((edma_read_array(ctlr, EDMA_EMR, 0) == 0) && (edma_read_array(ctlr, EDMA_EMR, 1) == 0) && (edma_read(ctlr, EDMA_QEMR) == 0) && (edma_read(ctlr, EDMA_CCERR) == 0)) return IRQ_NONE; If this happens enough number of times, IRQ subsystem disables the interrupt thinking its spurious which creates serious problems. 2. If the above if statement condition is removed, then EMR is 0 so the callback function will not be called in dma_ccerr_handler thus the event is forgotten, never triggered manually or never sets missed flag of the channel. So about the race: TC interrupt handler executing before the error interrupt handler can result in clearing the EMR and creates these problems. 
>> The EMR is ultimately being cleared by the Error interrupt
>> handler once it is handled, so we don't have to do it in edma_stop.
>
> This, I agree with. edma_clean_channel() is also there to re-initialize
> the channel, so doing it in edma_stop() certainly seems superfluous.

Sure.

Thanks,
-Joel
Re: [PATCH 3/9] ARM: edma: Add function to manually trigger an EDMA channel
On 07/30/2013 12:18 AM, Sekhar Nori wrote: > On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >> Manual trigger for events missed as a result of splitting a >> scatter gather list and DMA'ing it in batches. Add a helper >> function to trigger a channel incase any such events are missed. >> >> Signed-off-by: Joel Fernandes >> --- >> arch/arm/common/edma.c | 21 + >> include/linux/platform_data/edma.h |2 ++ >> 2 files changed, 23 insertions(+) >> >> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c >> index 3567ba1..10995b2 100644 >> --- a/arch/arm/common/edma.c >> +++ b/arch/arm/common/edma.c >> @@ -1236,6 +1236,27 @@ void edma_resume(unsigned channel) >> } >> EXPORT_SYMBOL(edma_resume); >> >> +int edma_manual_trigger(unsigned channel) > > edma_trigger_channel() maybe? Brings consistency with > edma_alloc_channel() edma_free_channel() etc. Ok, sure. > >> +{ >> +unsigned ctlr; >> +int j; >> +unsigned int mask; >> + >> +ctlr = EDMA_CTLR(channel); >> +channel = EDMA_CHAN_SLOT(channel); >> +mask = BIT(channel & 0x1f); >> + >> +j = channel >> 5; >> + >> +/* EDMA channels without event association */ > > May be actually check for no-event association before you trigger in > software? You can do that by looking at unused channel list, no? But, we want to trigger whether there is event association or not in this function. For ex, MMC has event associated but still this function is used to trigger event for it. > >> +edma_shadow0_write_array(ctlr, SH_ESR, j, mask); > > edma_shadow0_write_array(ctlr, SH_ESR, channel >> 5, mask) is no less > readable, but I leave it to you. Sure that's more readable, will changed it to that. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/9] ARM: edma: Don't clear EMR of channel in edma_stop
On 07/31/2013 04:35 AM, Sekhar Nori wrote: > On Wednesday 31 July 2013 10:35 AM, Joel Fernandes wrote: >> On 07/30/2013 03:29 AM, Sekhar Nori wrote: >>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >>>> We certainly don't want error conditions to be cleared anywhere >>> >>> 'anywhere' is a really loaded term. >>> >>>> as this will make us 'forget' about missed events. We depend on >>>> knowing which events were missed in order to be able to reissue them. >>> >>>> This fixes a race condition where the EMR was being cleared >>>> by the transfer completion interrupt handler. >>>> >>>> Basically, what was happening was: >>>> >>>> Missed event >>>> | >>>> | >>>> V >>>> SG1-SG2-SG3-Null >>>> \ >>>> \__TC Interrupt (Almost same time as ARM is executing >>>> TC interrupt handler, an event got missed and also forgotten >>>> by clearing the EMR). >>> >>> Sorry, but I dont see how edma_stop() is coming into picture in the race >>> you describe? >> >> In edma_callback function, for the case of DMA_COMPLETE (Transfer >> completion interrupt), edma_stop() is called when all sets have been >> processed. This had the effect of clearing the EMR. > > Ah, thanks. I was missing the fact that the race comes into picture only > when using the DMA engine driver. I guess that should be mentioned > somewhere since it is not immediately obvious. > > The patch looks good to me. So if you respin just this one with some > updated explanation based on what you wrote below, I will take it. Sure I'll do that. Also the trigger_channel patch, will you be taking that one too? 
I can send these 2 in a series as they touch arch/arm/common/edma.c.

Thanks,
-Joel

> Thanks,
> Sekhar
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
On 07/31/2013 04:18 AM, Sekhar Nori wrote: > On Wednesday 31 July 2013 10:19 AM, Joel Fernandes wrote: >> Hi Sekhar, >> >> On 07/30/2013 02:05 AM, Sekhar Nori wrote: >>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >>>> In an effort to move to using Scatter gather lists of any size with >>>> EDMA as discussed at [1] instead of placing limitations on the driver, >>>> we work through the limitations of the EDMAC hardware to find missed >>>> events and issue them. >>>> >>>> The sequence of events that require this are: >>>> >>>> For the scenario where MAX slots for an EDMA channel is 3: >>>> >>>> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null >>>> >>>> The above SG list will have to be DMA'd in 2 sets: >>>> >>>> (1) SG1 -> SG2 -> SG3 -> Null >>>> (2) SG4 -> SG5 -> SG6 -> Null >>>> >>>> After (1) is succesfully transferred, the events from the MMC controller >>>> donot stop coming and are missed by the time we have setup the transfer >>>> for (2). So here, we catch the events missed as an error condition and >>>> issue them manually. >>> >>> Are you sure there wont be any effect of these missed events on the >>> peripheral side. For example, wont McASP get into an underrun condition >>> when it encounters a null PaRAM set? Even UART has to transmit to a >> >> But it will not encounter null PaRAM set because McASP uses contiguous >> buffers for transfer which are not scattered across physical memory. >> This can be accomplished with an SG of size 1. For such SGs, this patch >> series leaves it linked Dummy and does not link to Null set. Null set is >> only used for SG lists that are > MAX_NR_SG in size such as those >> created for example by MMC and Crypto. >> >>> particular baud so I guess it cannot wait like the way MMC/SD can. >> >> Existing driver have to wait anyway if they hit MAX SG limit today. 
If >> they don't want to wait, they would have allocated a contiguous block of >> memory and DMA that in one stretch so they don't lose any events, and in >> such cases we are not linking to Null. > > As long as DMA driver can advertize its MAX SG limit, peripherals can > always work around that by limiting the number of sync events they > generate so as to not having any of the events getting missed. With this > series, I am worried that EDMA drivers is advertizing that it can handle > any length SG list while not taking care of missing any events while > doing so. This will break the assumptions that driver writers make. This is already being done by some other DMA engine drivers ;). We can advertise more than we can handle at a time, that's the basis of this whole idea. I understand what you're saying but events are not something that have be serviced immediately, they can be queued etc and the actually transfer from the DMA controller can be delayed. As long as we don't miss the event we are fine which my series takes care off. So far I have tested this series on following modules in various configurations and have seen no issues: - Crypto AES - MMC/SD - SPI (128x160 display) >>> Also, wont this lead to under-utilization of the peripheral bandwith? >>> Meaning, MMC/SD is ready with data but cannot transfer because the DMA >>> is waiting to be set-up. >> >> But it is waiting anyway even today. Currently based on MAX segs, MMC >> driver/subsystem will make SG list of size max_segs. Between these >> sessions of creating such smaller SG-lists, if for some reason the MMC >> controller is sending events, these will be lost anyway. > > But if MMC/SD driver knows how many events it should generate if it > knows the MAX SG limit. So there should not be any missed events in > current code. And I am not claiming that your solution is making matters > worse. But its not making it much better as well. 
This is not true for crypto, the events are not deasserted and crypto continues to send events. This is what led to the "don't trigger in Null" patch where I'm setting the missed flag to avoid recursion. >> This can be used only for buffers that are contiguous in memory, not >> those that are scattered across memory. > > I was hinting at using the linking facility of EDMA to achieve this. > Each PaRAM set has full 32-bit source and destination pointers so I see > no reason why non-contiguous case cannot be handled. > > Lets say you need to transfer SG[0..6] on channel C. Now, PaRAM sets are > typically 4 times the number of channels. In this
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
On 07/31/2013 09:27 PM, Joel Fernandes wrote: > On 07/31/2013 04:18 AM, Sekhar Nori wrote: >> On Wednesday 31 July 2013 10:19 AM, Joel Fernandes wrote: >>> Hi Sekhar, >>> >>> On 07/30/2013 02:05 AM, Sekhar Nori wrote: >>>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >>>>> In an effort to move to using Scatter gather lists of any size with >>>>> EDMA as discussed at [1] instead of placing limitations on the driver, >>>>> we work through the limitations of the EDMAC hardware to find missed >>>>> events and issue them. >>>>> >>>>> The sequence of events that require this are: >>>>> >>>>> For the scenario where MAX slots for an EDMA channel is 3: >>>>> >>>>> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null >>>>> >>>>> The above SG list will have to be DMA'd in 2 sets: >>>>> >>>>> (1) SG1 -> SG2 -> SG3 -> Null >>>>> (2) SG4 -> SG5 -> SG6 -> Null >>>>> >>>>> After (1) is succesfully transferred, the events from the MMC controller >>>>> donot stop coming and are missed by the time we have setup the transfer >>>>> for (2). So here, we catch the events missed as an error condition and >>>>> issue them manually. >>>> >>>> Are you sure there wont be any effect of these missed events on the >>>> peripheral side. For example, wont McASP get into an underrun condition >>>> when it encounters a null PaRAM set? Even UART has to transmit to a >>> >>> But it will not encounter null PaRAM set because McASP uses contiguous >>> buffers for transfer which are not scattered across physical memory. >>> This can be accomplished with an SG of size 1. For such SGs, this patch >>> series leaves it linked Dummy and does not link to Null set. Null set is >>> only used for SG lists that are > MAX_NR_SG in size such as those >>> created for example by MMC and Crypto. >>> >>>> particular baud so I guess it cannot wait like the way MMC/SD can. >>> >>> Existing driver have to wait anyway if they hit MAX SG limit today. 
If >>> they don't want to wait, they would have allocated a contiguous block of >>> memory and DMA'd that in one stretch so they don't lose any events, and in >>> such cases we are not linking to Null. >> >> As long as the DMA driver can advertise its MAX SG limit, peripherals can >> always work around that by limiting the number of sync events they >> generate so as to not have any of the events getting missed. With this >> series, I am worried that the EDMA driver is advertising that it can handle >> any length SG list while not taking care of missing any events while >> doing so. This will break the assumptions that driver writers make. Sorry, I just forgot to respond to "not taking care of missing any events while doing so". Can you clarify this? The DMA engine driver is taking care of missed events. Also, missing events doesn't result in feedback to the peripheral. The peripheral sends an event to the DMA controller and the event is missed. The peripheral doesn't know anything about what happened and is waiting for a transfer from the DMA controller. Thanks, -Joel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/9] dma: edma: Find missed events and issue them
On 07/31/2013 09:27 PM, Joel Fernandes wrote: > On 07/31/2013 04:18 AM, Sekhar Nori wrote: >> On Wednesday 31 July 2013 10:19 AM, Joel Fernandes wrote: >>> Hi Sekhar, >>> >>> On 07/30/2013 02:05 AM, Sekhar Nori wrote: >>>> On Monday 29 July 2013 06:59 PM, Joel Fernandes wrote: >>>>> In an effort to move to using scatter-gather lists of any size with >>>>> EDMA as discussed at [1] instead of placing limitations on the driver, >>>>> we work through the limitations of the EDMAC hardware to find missed >>>>> events and issue them. >>>>> >>>>> The sequence of events that requires this is: >>>>> >>>>> For the scenario where MAX slots for an EDMA channel is 3: >>>>> >>>>> SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null >>>>> >>>>> The above SG list will have to be DMA'd in 2 sets: >>>>> >>>>> (1) SG1 -> SG2 -> SG3 -> Null >>>>> (2) SG4 -> SG5 -> SG6 -> Null >>>>> >>>>> After (1) is successfully transferred, the events from the MMC controller >>>>> do not stop coming and are missed by the time we have set up the transfer >>>>> for (2). So here, we catch the missed events as an error condition and >>>>> issue them manually. >>>> >>>> Are you sure there won't be any effect of these missed events on the >>>> peripheral side. For example, won't McASP get into an underrun condition >>>> when it encounters a null PaRAM set? Even UART has to transmit at a >>> >>> But it will not encounter a null PaRAM set because McASP uses contiguous >>> buffers for transfer which are not scattered across physical memory. >>> This can be accomplished with an SG of size 1. For such SGs, this patch >>> series leaves it linked to the Dummy set and does not link to the Null set. The Null set is >>> only used for SG lists that are > MAX_NR_SG in size such as those >>> created for example by MMC and Crypto. >>> >>>> particular baud so I guess it cannot wait like the way MMC/SD can. >>> >>> Existing drivers have to wait anyway if they hit the MAX SG limit today. 
If >>> they don't want to wait, they would have allocated a contiguous block of >>> memory and DMA'd that in one stretch so they don't lose any events, and in >>> such cases we are not linking to Null. >> >> As long as the DMA driver can advertise its MAX SG limit, peripherals can >> always work around that by limiting the number of sync events they >> generate so as to not have any of the events getting missed. With this >> series, I am worried that the EDMA driver is advertising that it can handle >> any length SG list while not taking care of missing any events while >> doing so. This will break the assumptions that driver writers make. > > This is already being done by some other DMA engine drivers ;). We can > advertise more than we can handle at a time; that's the basis of this > whole idea. > > I understand what you're saying, but events are not something that have > to be serviced immediately; they can be queued etc. and the actual > transfer from the DMA controller can be delayed. As long as we don't > miss the event we are fine, which my series takes care of. > > So far I have tested this series on the following modules in various > configurations and have seen no issues: > - Crypto AES > - MMC/SD > - SPI (128x160 display) > >>>> Also, won't this lead to under-utilization of the peripheral bandwidth? >>>> Meaning, MMC/SD is ready with data but cannot transfer because the DMA >>>> is waiting to be set up. >>> >>> But it is waiting anyway even today. Currently, based on MAX segs, the MMC >>> driver/subsystem will make an SG list of size max_segs. Between these >>> sessions of creating such smaller SG lists, if for some reason the MMC >>> controller is sending events, these will be lost anyway. >> >> But the MMC/SD driver knows how many events it should generate if it >> knows the MAX SG limit. So there should not be any missed events in >> current code. And I am not claiming that your solution is making matters >> worse. But it's not making it much better either. 
> > This is not true for crypto; the events are not deasserted and crypto > continues to send events. This is what led to the "don't trigger in > Null" patch where I'm setting the missed flag to avoid recursion. > >>> This can be used only for buffers that are contiguous
Re: [PATCH] dma: edma: add device_slave_caps() support
On 07/24/2013 03:03 AM, Lars-Peter Clausen wrote: > On 07/23/2013 06:43 PM, Joel Fernandes wrote: >> Implement device_slave_caps(). EDMA has a limited number of slots. >> Slave drivers such as omap_hsmmc will query the driver to make >> sure they don't pass in more than this many scatter segments. >> >> Signed-off-by: Joel Fernandes >> --- >> Vinod, or Dan- If this patch looks ok, can you please merge it in for the >> -rc cycle? This patch is required to fix MMC support on AM33xx. This >> patch is blocking 3 other patches which fix various MMC things. Thanks! >> >> Notes: >> (1) this approach is temporary and only for the -rc cycle to fix MMC on >> AM335x. It will be replaced by the RFC series in future kernels: >> http://www.spinics.net/lists/arm-kernel/msg260094.html >> >> (2) Patch depends on Vinod's patch at: >> http://permalink.gmane.org/gmane.linux.kernel/1525112 >> >> drivers/dma/edma.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c >> index 7222cbe..81d5429 100644 >> --- a/drivers/dma/edma.c >> +++ b/drivers/dma/edma.c >> @@ -517,6 +517,14 @@ static void edma_issue_pending(struct dma_chan *chan) >> spin_unlock_irqrestore(&echan->vchan.lock, flags); >> } >> >> +static inline int edma_slave_caps(struct dma_chan *chan, >> +struct dma_slave_caps *caps) >> +{ >> +caps->max_sg_nr = MAX_NR_SG; > > Hm, what about the other fields? > Other fields are unused; the max segment size is supposed to be calculated "given" the address width and burst size. Since these can't be provided to get_caps, I have left it out for now. See: https://lkml.org/lkml/2013/3/6/464 Even if it did, the "segment size" itself is unused in the MMC driver that this is supposed to fix, unlike the "number of segments" which I'm populating above. If you know of a better way to populate max segment size, let me know. 
Thanks, -Joel
Re: [PATCH] dma: edma: add device_slave_caps() support
On 07/24/2013 03:40 AM, Lars-Peter Clausen wrote: > On 07/24/2013 10:28 AM, Fernandes, Joel wrote: >> >> On Jul 24, 2013, at 3:23 AM, "Lars-Peter Clausen" wrote: >> >>> On 07/24/2013 10:11 AM, Joel Fernandes wrote: >>>> On 07/24/2013 03:03 AM, Lars-Peter Clausen wrote: >>>>> On 07/23/2013 06:43 PM, Joel Fernandes wrote: >>>>>> Implement device_slave_caps(). EDMA has a limited number of slots. >>>>>> Slave drivers such as omap_hsmmc will query the driver to make >>>>>> sure they don't pass in more than this many scatter segments. >>>>>> >>>>>> Signed-off-by: Joel Fernandes >>>>>> --- >>>>>> Vinod, or Dan- If this patch looks ok, can you please merge it in for the >>>>>> -rc cycle? This patch is required to fix MMC support on AM33xx. This >>>>>> patch is blocking 3 other patches which fix various MMC things. Thanks! >>>>>> >>>>>> Notes: >>>>>> (1) this approach is temporary and only for the -rc cycle to fix MMC on >>>>>> AM335x. It will be replaced by the RFC series in future kernels: >>>>>> http://www.spinics.net/lists/arm-kernel/msg260094.html >>>>>> >>>>>> (2) Patch depends on Vinod's patch at: >>>>>> http://permalink.gmane.org/gmane.linux.kernel/1525112 >>>>>> >>>>>> drivers/dma/edma.c | 9 +++++++++ >>>>>> 1 file changed, 9 insertions(+) >>>>>> >>>>>> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c >>>>>> index 7222cbe..81d5429 100644 >>>>>> --- a/drivers/dma/edma.c >>>>>> +++ b/drivers/dma/edma.c >>>>>> @@ -517,6 +517,14 @@ static void edma_issue_pending(struct dma_chan >>>>>> *chan) >>>>>> spin_unlock_irqrestore(&echan->vchan.lock, flags); >>>>>> } >>>>>> >>>>>> +static inline int edma_slave_caps(struct dma_chan *chan, >>>>>> +struct dma_slave_caps *caps) >>>>>> +{ >>>>>> +caps->max_sg_nr = MAX_NR_SG; >>>>> >>>>> Hm, what about the other fields? >>>> >>>> Other fields are unused; the max segment size is supposed to be >>>> calculated "given" the address width and burst size. Since these >>>> can't be provided to get_caps, I have left it out for now. 
>>>> See: https://lkml.org/lkml/2013/3/6/464 >>> >>> The PL330 driver is similar in this regard; the maximum segment size also >>> depends on address width and burst width. What I did for the get_slave_caps >>> implementation is to set it to the minimum maximum size. E.g. in your case >>> that should be SZ_64K - 1 (burst size and addr width both set to 1). >> >> So you're setting max to the minimum maximum size? Isn't that like telling the >> driver that its segments can't be bigger than this... Unless I'm missing >> something.. > > Yes. This is a limitation of the current slave_caps API. The maximum needs > to be the maximum for all possible configurations. A specific configuration > may allow a larger maximum. So we may have to extend the API to be able to > query the limits for a certain configuration. Not sure what the best way > would be to do that: either adding a config parameter to get_slave_caps or > breaking it into two functions like you proposed, one for the static > capabilities and one for the SG limits. I am OK with either approach as long as a decision can be made quickly by the maintainers. Right now a lot of back and forth has happened and 3 different versions of the same thing have been posted since January. Since this is such a trivial change, it doesn't make sense to spend so much time on it, IMO. The sad part is that though this change is trivial, other drivers such as MMC are broken and cannot be enabled due to this. We cannot afford to leave them broken. If Vinod is not available, can Dan please respond on how to proceed on this? We really need this trivial change to go into this -rc cycle and not delay it by another kernel release. Thank you. -Joel
Re: [PATCH] dma: edma: add device_slave_caps() support
On 07/24/2013 01:33 PM, Vinod Koul wrote: > On Wed, Jul 24, 2013 at 01:55:24PM -0500, Joel Fernandes wrote: >> On 07/24/2013 03:40 AM, Lars-Peter Clausen wrote: >>> On 07/24/2013 10:28 AM, Fernandes, Joel wrote: >>>>>>>> Vinod, or Dan- If this patch looks ok, can you please merge it in for the >>>>>>>> -rc cycle? This patch is required to fix MMC support on AM33xx. This >>>>>>>> patch is blocking 3 other patches which fix various MMC things. Thanks! > Sorry, I was travelling so not really on top of email for the last few days... > > Now I am not sure we can send this to -rc. OK. > If it were just this one, we could have pushed it, but it also depends on a new > API > which is sitting in my -next. I am not super comfortable sending that to Linus > for the -rcs. Sure, he would scream at me! OK. > Also, another point worth considering is the approach Russell suggested. I haven't > gotten a chance to dig deeper, but if I understood it correctly then > programming > the device_dma_parameters should be the right thing to do. Again I need to > look > deeper, esp. wrt EDMA OK. I have some patches sitting in my tree too that I'm working on. With those I don't need to know about the maximum number of allowed segments and can send along any number of segments. I will rework them and post them. FWIW, I will also implement the caps API like Lars did, populating the other fields, though these will be unused. For segment size, at this time I don't know of any driver that uses it other than davinci-pcm. For this reason the calculations can be done as Lars suggested (for the minimum maximum). Do you know in advance if you're going to amend the API to drop segment size if we go with what Russell suggested, or are you going to leave the seg-size in the caps API anyway? 
Thanks, -Joel
Re: Regression 3.11-rc1: omap4panda: no usb and consequently no ethernet
Hi Arend, On 07/25/2013 08:06 AM, Arend van Spriel wrote: > On 07/18/2013 02:42 PM, Roger Quadros wrote: >> On 07/18/2013 03:38 PM, Arend van Spriel wrote: >>> On 07/18/2013 01:30 PM, Roger Quadros wrote: On 07/18/2013 02:24 PM, Arend van Spriel wrote: > On 07/18/2013 01:18 PM, Roger Quadros wrote: >> Hi Arend, >> >> On 07/18/2013 11:41 AM, Arend van Spriel wrote: >>> Hi Tony, >>> >>> >>> We are using the panda board (es variant) for testing our SDIO >>> based chips. For this we have an adapter card connection to >>> expansion connector A. As this adapter is not publicly available >>> we had internally patched board-omap4panda.c. Also we follow the >>> -rc releases and use TFTP to boot the kernel image which requires >>> USB. >>> >>> Moving to 3.11-rc1 I found that the mentioned board file was >>> removed by your commit: >>> >>> commit b42b918194c4791510ac049e3d507169a7de8544 >>> Author: Tony Lindgren >>> Date: Thu May 30 12:53:05 2013 -0700 >>> >>>ARM: OMAP2+: Remove board-omap4panda.c >>> >>> As our patches on that file are internal I do not hold it against >>> you. This is no regression and we need to fix it. >>> >>> So my first step was to follow the recipe given in that commit. >>> Beside that I noticed a thread about USB issue on LKML so I also >>> applied the following commit: >>> >>> commit 352f573e59050c7a604c35c58b4bbfc51edebbee >>> Author: Roger Quadros >>> Date: Tue Jun 18 19:04:47 2013 +0300 >>> >>>ARM: OMAP2+: Provide alias to USB PHY clock >>> >>> The attached kernel log seems to suggest that the device tree is >>> picked up by the kernel, but the USB does not seem very active. >>> No ethernet connectivity. This does seem a regression to me. Is >>> there some other patch that I need to get it going again? >>> >> >> I tried with your config and 3.11-rc1 kernel with the above commit >> on top and ethernet seems to >> work for me. My boot log is attached. 
>> >> Are you sure that you are building the DTB for panda-es and the >> bootloader is picking up the right file and >> not an outdated one? > I appended the DTB to the kernel image thus avoiding the need to > update u-boot (at least that is what I understood from Tony's commit). > I understand this can be a little tedious at first. This is my u-boot boot.txt: fatload mmc 0:1 0x825f omap4-panda-es.dtb fatload mmc 0:1 0x8030 uImage set fdt_high 0x setenv bootargs console=ttyO2,115200n8 mem=1G@0x8000 root=/dev/mmcblk0p2 rootwait bootm 0x8030 - 0x825f You need to generate boot.scr from the above boot.txt and place it in the SD card boot partition: mkimage -A arm -T script -C none -n "Boot Image" -d boot.txt boot.scr And of course copy the omap4-panda-es.dtb to the SD card boot partition. Hope this helps. >>> >>> Thanks for sharing this. >>> >> Why is the version of the SPL loader and u-boot different in your log? >> You need to use the MLO file generated by the u-boot build along >> with the uImage. >> >> Just to be sure we are on the same setup, could you please try with the >> latest u-boot release (2013-04). Thanks. > I recall having difficulty with TFTP boot using a 2013 u-boot > release, but that hurdle is for later. I will try. > OK. We can figure this out as well. >>> >>> I tried with the same SPL and u-boot version, but that did not work out. >>> So I moved to v2013.04 and the log looks better. I was especially >>> interested in this: >>> >>> [2.807434] usb 1-1.1: new high-speed USB device number 3 using >>> ehci-omap >>> [2.932495] usb 1-1.1: New USB device found, idVendor=0424, >>> idProduct=ec00 >>> [2.932495] usb 1-1.1: New USB device strings: Mfr=0, Product=0, >>> SerialNumber=0 >>> [2.958770] smsc95xx v1.0.4 >>> Starting logging: OK >>> Initializing random number generator... [3.045806] smsc95xx >>> 1-1.1:1.0 eth0 >> >> Cool! :). >> >> FYI. I also tested tftpboot from u-boot and NFS root file system and >> it works fine. > > Hi Roger, > > Can I get back on this topic. 
When USB and ethernet was working for me > as stated above, I was not doing tftpboot. When I use tftpboot the > images are obtained from the tftp server, but after kernel has started > there is nothing in /sys/bus/usb/devices/. I quickly tried 3.11-rc2 + Roger's USB PHY clock patch on omap4-panda-es and enabled following USB options: CONFIG_USB_MUSB_HDRC=y CONFIG_USB_PHY=y CONFIG_OMAP_USB2=y CONFIG_OMAP_CONTROL_USB=y CONFIG_USB_MUSB_OMAP2PLUS=y CONFIG_MFD_OMAP_USB_HOST=y CONFIG_USB_EHCI_HCD=y CONFIG_USB_OHCI_HCD=y CONFIG_USB_EHCI_HCD_OMAP=y With this I see both USB ports, and ethernet: bash-4.2# ifconfig eth0 eth0 Link encap:Ethernet HWaddr
Re: Regression 3.11-rc1: omap4panda: no usb and consequently no ethernet
On 07/25/2013 09:49 PM, Joel Fernandes wrote: [..] >> Can I get back on this topic. When USB and ethernet was working for me >> as stated above, I was not doing tftpboot. When I use tftpboot the >> images are obtained from the tftp server, but after the kernel has started >> there is nothing in /sys/bus/usb/devices/. > > I quickly tried 3.11-rc2 + Roger's USB PHY clock patch on omap4-panda-es > and enabled the following USB options: > > CONFIG_USB_MUSB_HDRC=y > CONFIG_USB_PHY=y > CONFIG_OMAP_USB2=y > CONFIG_OMAP_CONTROL_USB=y > CONFIG_USB_MUSB_OMAP2PLUS=y > CONFIG_MFD_OMAP_USB_HOST=y > CONFIG_USB_EHCI_HCD=y > CONFIG_USB_OHCI_HCD=y > CONFIG_USB_EHCI_HCD_OMAP=y Sorry, I missed saying I also had to set CONFIG_NOP_USB_XCEIV=y Thanks, -Joel