Re: dma_declare_coherent_memory on main memory

2018-12-08 Thread Rob Landley
On 12/7/18 9:34 AM, Christoph Hellwig wrote:
> Hi all,
> 
> the ARM imx27/31 ports and various sh boards use
> dma_declare_coherent_memory on main memory taken from the memblock
> allocator.
> 
> Is there any good reason these couldn't be switched to CMA areas?
> Getting rid of these magic dma_declare_coherent_memory areas would
> help make the dma allocator a lot simpler.

Not that I know of?

Rob

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmaengine for sh7760 (was Re: use the generic dma-noncoherent code for sh V2)

2018-08-18 Thread Rob Landley
On 08/17/2018 03:23 PM, Arnd Bergmann wrote:
> On Fri, Aug 17, 2018 at 7:04 PM Rob Landley wrote:
>> On 07/31/2018 07:56 AM, Arnd Bergmann wrote:
>>> On Fri, Jul 27, 2018 at 6:20 PM, Rob Landley wrote:
>>>> On 07/24/2018 03:21 PM, Christoph Hellwig wrote:
>>>>> On Tue, Jul 24, 2018 at 02:01:42PM +0200, Christoph Hellwig wrote:
>>>>>> Hi all,
>>> If you hack on it, please convert the dmaengine platform data to use
>>> a dma_slave_map array to pass the data into the dmaengine driver,
>>
>> The dmatest module didn't need it? I don't see why the ethernet driver would?
>> (Isn't the point of an allocator to allocate from a request?)
> 
> I guess you have hit two of the special cases here:
> 
> - dmatest uses the memory-to-memory DMA engine interface, not the slave
>   API, so you don't have to configure a slave at all

I've read through
https://www.kernel.org/doc/Documentation/driver-api/dmaengine/client.rst twice
and am still very unclear on the slave API.

> - smc91x (and its smc911x.c relative) are apparently special in that they
>   use the DMA slave API

Only sort of. In 4.14 at least it's under #ifdef ARCH_PXA and full of PXA
constants (PXAD_PRIO_LOWEST and such).

> but (AFAICT) require programming
>   the dmaengine hardware into a memory-to-memory transfer with no
>   DMA slave request signal and completely synchronous operation
>   (the IRQ handler polls for the DMA descriptor to be complete),
>   see also https://lkml.org/lkml/2018/4/3/464 for the discussion about
>   the recent rework of that driver's implementation.

Bookmarked, thanks.

(Being able to just upgrade to a 4.19 kernel or something and have DMA work in
this driver if I've got dmaengine set up for the platform would be lovely.)

>>> mapping the settings from a (pdev-name, channel-id) tuple to a pointer
>>> that describes the channel configuration rather than having the
>>> mapping from an numerical slave_id to a struct sh_dmae_slave_config
>>> in the setup files. It should be a fairly mechanical conversion.
>>
>> I think all 8 channels are generic. Drivers should be able to grab them and
>> release them at will, why does it need a table?
>>
>> (I say this not having made the smc91x.c driver use this yet, its "conversion"
>> to device tree left it full of PXA #ifdefs and constants, and I've tried the
> 
> Another point about smc91x is that it only uses DMA on the PXA platform,
> which is not part of the "multiplatform" ARM setup. It's likely that no
> other platform actually has a DMA engine that can talk to this device in
> the absence of a request signal, or that on more modern CPU cores,
> a readsl() is actually just as fast, but it avoids the setup cost of talking
> to the dma engine. Possibly both of the above.

The sh7760 has the CPU pegged at 100% trying to keep up with ethernet traffic.
Being able to use DMA on this would be very nice.

>> last half-dozen kernel releases and qemu releases and have yet to find an arm
>> mainstone board under qemu that _doesn't_ time out trying to use DMA with this
>> card. But that's another post...)
> 
> Is smc91x the only driver that you want to make use of the DMA engine?

This driver's the low-hanging fruit, yeah. Copying NOR flash jffs2 data into
page cache would be nice but there's a decompression step so I'm not sure that's
a win.

> I suspect that every other one currently relies on passing a slave ID
> shdma_chan_filter into dma_request_slave_channel_compat() or
> dma_request_channel() , which are some of the interfaces we want to
> remove in the future, to make everything work the same across
> all platforms.

What are "all platforms" in this context? I tried to find an x86 variant that
uses DMAEngine but came up empty. Can I use DMAEngine on a raspberry pi perhaps?
Is there a QEMU target I can play with DMAEngine under?

I built a mainstone kernel with dmaengine and smc91x both enabled, and booted it
on qemu-system-arm -M mainstone, and it works fine until I try to ping the host
(via the 10.0.2.2 redirect), at which point no packets are received and a timer
expires all over the console a few seconds later. I.E. the DMA claims to be
there, but the transfer never occurs.

I built and tested every Linux version back to 4.2 (before the smc91x was
converted from PXA dma to use DMAEngine, albeit in a very PXA specific manner).
I also tested each qemu release back to 2.3.0, with no obvious behavioral
difference.

I can dig further back in qemu history maybe? Ask on the qemu list? (Did this
ever work for anyone? I can post my kernel config and qemu command line if you
think it would help?)

> shdma_chan_filter() is one of those that expect its pointer argument to

dmaengine for sh7760 (was Re: use the generic dma-noncoherent code for sh V2)

2018-08-17 Thread Rob Landley



On 07/31/2018 07:56 AM, Arnd Bergmann wrote:
> On Fri, Jul 27, 2018 at 6:20 PM, Rob Landley wrote:
>> On 07/24/2018 03:21 PM, Christoph Hellwig wrote:
>>> On Tue, Jul 24, 2018 at 02:01:42PM +0200, Christoph Hellwig wrote:
>>>> Hi all,
>>>>
>>>> can you review these patches to switch sh to use the generic
>>>> dma-noncoherent code?  All the requirements are in mainline already
>>>> and we've switched various architectures over to it already.
>>>
>>> Ok, there is one more issue with this version.   Wait for a new one
>>> tomorrow.
>>
>> Speaking of DMA:
>>
>> I'm trying to wire up DMAEngine to an sh7760 board that uses platform data (and
>> fix the smc91x.c driver to use DMAEngine without #ifdef arm), so I've been
>> reading through all that stuff, but the docs seem kinda... thin?
>>
>> Is there something I should have read other than
>> Documentation/driver-model/platform.txt,
>> Documentation/dmaengine/{provider,client}.txt, then trying to pick through the
>> source code and the sh7760 hardware pdf? (And watching the youtube video of
>> Laurent Pinchart's 2014 ELC talk on DMA, Maxime Ripard's 2015 ELC overview of
>> DMAEngine, the Xilinx video on DMAEngine...)
>>
>> At first I thought the SH_DMAE could initialize itself, but the probe function
>> needs platform data, and although arch/sh/kernel/cpu/sh4a/setup-sh7722.c looks
>> _kind_ of like a model I can crib from:
> 
>> B) That platform data is supplying sh_dmae_slave_config preallocating slave
>> channels to devices? (Does it have to? The docs gave me the impression the
>> driver would dynamically request them and devices could even share. Wasn't that
>> sort of the point of DMAEngine? Can my new board data _not_ do that? What's the
>> minimum amount of micromanaging I have to do?)
> 
> The thing here is that arch/sh is way behind on the API use, and it
> has prevented us from cleaning up drivers as well. A slave driver
> should have to just call dma_request_chan() with a constant
> string to identify its channel rather than going two different ways
> depending on whether it's used with DT or platform data.

I got the DMA controller hooked up to DMAEngine and the dmatest module is happy
with the result on all 8 channels. (Finding
arch/sh/kernel/cpu/sh4a/setup-sh7722.c helped a lot, finding it earlier would
have helped more. :)

The config symbols are:

CONFIG_SH_DMA=y
CONFIG_DMADEVICES=y
CONFIG_SH_DMAE_BASE=y
CONFIG_SH_DMAE=y
CONFIG_DMATEST=y #optional

The platform data is:

#include 
#include 
#include 

/* DMA */
static struct resource sh7760_dma_resources[] = {
	{
		.start	= SH_DMAC_BASE0,
		.end	= SH_DMAC_BASE0 + 9*16 - 1,
		.flags	= IORESOURCE_MEM,
	}, {
		.start	= DMTE0_IRQ,
		.end	= DMTE0_IRQ,
		.flags	= IORESOURCE_IRQ,
	}
};

static struct sh_dmae_channel dma_chan[] = {
	{
		.offset = 0,
		.dmars = 0,
		.dmars_bit = 0,
	}, {
		.offset = 0x10,
		.dmars = 0,
		.dmars_bit = 8,
	}, {
		.offset = 0x20,
		.dmars = 4,
		.dmars_bit = 0,
	}, {
		.offset = 0x30,
		.dmars = 4,
		.dmars_bit = 8,
	}, {
		.offset = 0x50,
		.dmars = 8,
		.dmars_bit = 0,
	}, {
		.offset = 0x60,
		.dmars = 8,
		.dmars_bit = 8,
	}, {
		.offset = 0x70,
		.dmars = 12,
		.dmars_bit = 0,
	}, {
		.offset = 0x80,
		.dmars = 12,
		.dmars_bit = 8,
	}
};

static const unsigned int ts_shift[] = TS_SHIFT;

static struct sh_dmae_pdata sh7760_dma_pdata = {
	.channel	= dma_chan,
	.channel_num	= ARRAY_SIZE(dma_chan),
	.ts_low_shift	= CHCR_TS_LOW_SHIFT,
	.ts_low_mask	= CHCR_TS_LOW_MASK,
	.ts_high_shift	= CHCR_TS_HIGH_SHIFT,
	.ts_high_mask	= CHCR_TS_HIGH_MASK,
	.ts_shift	= ts_shift,
	.ts_shift_num	= ARRAY_SIZE(ts_shift),
	.dmaor_init	= DMAOR_INIT,
	.dmaor_is_32bit	= 1,
};

struct platform_device sh7760_dma_device = {
	.name		= "sh-dma-engine",
	.id		= -1,
	.num_resources	= ARRAY_SIZE(sh7760_dma_resources),
	.resource	= sh7760_dma_resources,
	.dev = { .platform_data = &sh7760_dma_pdata },
};


And then add sh7760_dma_device to your struct platform_device array.

> If you hack on it, please convert the dmaeng

Re: use the generic dma-noncoherent code for sh V2

2018-07-27 Thread Rob Landley
On 07/24/2018 03:21 PM, Christoph Hellwig wrote:
> On Tue, Jul 24, 2018 at 02:01:42PM +0200, Christoph Hellwig wrote:
>> Hi all,
>>
>> can you review these patches to switch sh to use the generic
>> dma-noncoherent code?  All the requirements are in mainline already
>> and we've switched various architectures over to it already.
>
> Ok, there is one more issue with this version.   Wait for a new one
> tomorrow.

Speaking of DMA:

I'm trying to wire up DMAEngine to an sh7760 board that uses platform data (and
fix the smc91x.c driver to use DMAEngine without #ifdef arm), so I've been
reading through all that stuff, but the docs seem kinda... thin?

Is there something I should have read other than
Documentation/driver-model/platform.txt,
Documentation/dmaengine/{provider,client}.txt, then trying to pick through the
source code and the sh7760 hardware pdf? (And watching the youtube video of
Laurent Pinchart's 2014 ELC talk on DMA, Maxime Ripard's 2015 ELC overview of
DMAEngine, the Xilinx video on DMAEngine...)

At first I thought the SH_DMAE could initialize itself, but the probe function
needs platform data, and although arch/sh/kernel/cpu/sh4a/setup-sh7722.c looks
_kind_ of like a model I can crib from:

A) "make ARCH=sh se7722_defconfig" results in a config with SH_DMA disabled??!?
(This is why I use miniconfig instead of defconfig format, I'm assuming that's
bit rot?)

B) That platform data is supplying sh_dmae_slave_config preallocating slave
channels to devices? (Does it have to? The docs gave me the impression the
driver would dynamically request them and devices could even share. Wasn't that
sort of the point of DMAEngine? Can my new board data _not_ do that? What's the
minimum amount of micromanaging I have to do?)

C) It's full of stuff like setting ts_low_shift to CHCR_TS_LOW_SHIFT where both
grepping Documentation and Google "dmaengine ts_low_shift" are unhelpful.

What I'd really like is a "hello world" version of DMAEngine somewhere I can
build and run on a supported qemu target, to set up _one_ channel with a block
device or something using it. I can't tell what's optional, or what the minimal
version of this looks like.

(Currently I've only managed to update this kernel to 4.14 because 4.15
introduced an intermittent data corruption bug in the flash, which takes long
enough to reproduce bisecting it is fiddly and ship deadlines are all blinky and
red. But next release should be current, _and_ with at least the 4.14 source
published so I can point people at it. Heck, maybe I can convince them to let me
port it to device tree next cycle, but I need to get it to _work_ first. And
doing PIO on a 100baseT controller, I.E. a ~200 MHz embedded CPU trying to copy
11 megabytes/second across a 16 bit bus with a for(;;) loop... bit of a
performance bottleneck even before you add https.)

Thanks,

Rob

>>
>> Changes since V1:
>>  - fixed two stupid compile errors and verified them using a local
>>cross toolchain instead of the 0day buildbot