Re: [PATCH 00/14] DMA-mapping framework redesign preparation

2012-02-24 Thread Arnd Bergmann
On Friday 23 December 2011, Marek Szyprowski wrote:
 The solution we found is to introduce new public DMA mapping functions
 with an additional attributes argument: dma_alloc_attrs() and
 dma_free_attrs(). This way all the different kinds of architecture-specific
 buffer mappings can be hidden behind the attributes without the need to
 create several versions of the dma_alloc_* function.

Since the patches are now in linux-next, we should make sure that they
can actually get merged into 3.4.

I've looked at all the patches again and found them to be straightforward
and helpful. I hope we can get them merged next time. Please add my

Reviewed-by: Arnd Bergmann a...@arndb.de
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


RE: [PATCH 00/14] DMA-mapping framework redesign preparation

2012-01-10 Thread Marek Szyprowski
Hello,

To help everyone test our patches and adapt them to their hardware
platforms, I've rebased our patches onto the latest v3.2 Linux kernel and
prepared a few GIT branches in our public repository. These branches
contain our memory management related patches posted in the following
threads:

[PATCHv18 0/11] Contiguous Memory Allocator:
http://www.spinics.net/lists/linux-mm/msg28125.html
later called CMAv18,

[PATCH 00/14] DMA-mapping framework redesign preparation:
http://www.spinics.net/lists/linux-sh/msg09777.html
and
[PATCH 0/8 v4] ARM: DMA-mapping framework redesign:
http://www.spinics.net/lists/arm-kernel/msg151147.html
with the following update:
http://www.spinics.net/lists/arm-kernel/msg154889.html
later called DMAv5.

These branches are available in our public GIT repository:

git://git.infradead.org/users/kmpark/linux-samsung
http://git.infradead.org/users/kmpark/linux-samsung/

The following branches are available:

1) 3.2-cma-v18
Vanilla Linux v3.2 with fixed CMA v18 patches (first patch replaced
with the one from v17 to fix SMP issues, see the respective thread).

2) 3.2-dma-v5
Vanilla Linux v3.2 + iommu/next (IOMMU maintainer's patches) branch
with DMA-preparation and DMA-mapping framework redesign patches.

3) 3.2-cma-v18-dma-v5
Previous two branches merged together (DMA-mapping on top of CMA)

4) 3.2-cma-v18-dma-v5-exynos
Previous branch rebased on top of iommu/next + kgene/for-next (Samsung
SoC platform maintainer's patches) with new Exynos4 IOMMU driver by 
KyongHo Cho and relevant glue code.

5) 3.2-dma-v5-exynos
Branch from point 2 rebased on top of iommu/next + kgene/for-next 
(Samsung SoC maintainer's patches) with new Exynos4 IOMMU driver by 
KyongHo Cho and relevant glue code.

I hope everyone will find a branch that suits their needs. :)

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center





RE: [PATCH 00/14] DMA-mapping framework redesign preparation

2011-12-28 Thread Marek Szyprowski
Hello,

On Tuesday, December 27, 2011 6:53 PM James Bottomley wrote:

 On Tue, 2011-12-27 at 09:25 +0100, Marek Szyprowski wrote:
 [...]
    Usually these drivers don't touch the buffer data at all, so the mapping
    in kernel virtual address space is not needed. We can introduce a
    DMA_ATTRIB_NO_KERNEL_MAPPING attribute which lets the kernel skip
    creation of the kernel virtual mapping. This way we can save precious
    vmalloc area and simplify some mapping operations on a few architectures.
  
   I really think this wants to be a separate function.  dma_alloc_coherent
   is for allocating memory to be shared between the kernel and a driver;
   we already have dma_map_sg for mapping userspace I/O as an alternative
   interface.  This feels like it's something different again rather than
   an option to dma_alloc_coherent.
 
  That is just a starting point for the discussion.
 
  I thought about this API a bit and came to the conclusion that there is not
  much difference between a dma_alloc_coherent which creates a mapping in
  kernel virtual space and one that does not. It is just a hint from the
  driver that it will not use that mapping at all. Of course this attribute
  makes sense only together with adding a dma_mmap_attrs() call, because
  otherwise drivers won't be able to get access to the buffer data.
 
 This depends.  On virtually indexed systems like PA-RISC, there are two
 ways of making a DMA range coherent.  One is to make the range uncached.
 This is incredibly slow and not what we do by default, but it can be
 used to make multiple mappings coherent.  The other is to load the
 virtual address up as a coherence index into the IOMMU.  This makes it a
 full peer in the coherence process, but means we can only designate a
 single virtual range to be coherent (not multiple mappings unless they
 happen to be congruent).  Perhaps it doesn't matter that much, since I
 don't see a use for this on PA, but if any other architecture works the
 same, you'd have to designate a single mapping as the coherent one and
 essentially promise not to use the other mapping if we followed our
 normal coherence protocols.
 
 Obviously, the usual range we currently make coherent is the kernel
 mapping (that's actually the only virtual address we have by the time
 we're deep in the iommu code), so designating a different virtual
 address would need some surgery to the guts of the iommu code.

I see, in this case not much can be achieved by dropping the kernel
mapping for the allocated buffer. I'm also not sure how to mmap the buffer
into userspace in a way that meets the CPU requirements. Is it possible to
use a non-cached mapping in userspace together with a coherent mapping in
kernel virtual space?

However, on some other architectures this attribute allows using HIGH_MEM
for the allocated coherent buffer. The other possibility is to allocate it
in chunks and map them contiguously into DMA address space. With the
NO_KERNEL_MAPPING attribute we avoid consuming vmalloc range for the newly
allocated buffer, for which we cannot use the linear mapping (because it is
scattered).
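
The "allocate in chunks and map them contiguously into DMA address space"
idea can be illustrated with a toy IOMMU page table in userspace C. This is
only a sketch under stated assumptions: the `iov_*` names, the 4 KiB page
size and the fixed-size table are hypothetical and do not correspond to any
real IOMMU driver or to the code in this patch series.

```c
#include <stddef.h>
#include <stdint.h>

#define IOV_PAGE_SHIFT  12                      /* assumed 4 KiB pages */
#define IOV_PAGE_SIZE   (1u << IOV_PAGE_SHIFT)
#define IOV_TABLE_PAGES 16                      /* toy table size */

/* Toy IOMMU table: one physical page address per DMA-address page. */
struct iov_table {
	uint64_t phys[IOV_TABLE_PAGES];
	size_t next_free;                       /* first unused table slot */
};

/*
 * Map n physically scattered, page-aligned pages at consecutive DMA
 * addresses. Returns the base DMA address, or UINT64_MAX if the table
 * has no room.
 */
uint64_t iov_map_pages(struct iov_table *t, const uint64_t *phys_pages,
		       size_t n)
{
	size_t first = t->next_free;
	size_t i;

	if (n == 0 || n > IOV_TABLE_PAGES - first)
		return UINT64_MAX;
	for (i = 0; i < n; i++)
		t->phys[first + i] = phys_pages[i];
	t->next_free = first + n;
	return (uint64_t)first << IOV_PAGE_SHIFT;
}

/* Translate a DMA address the way the device would see it. */
uint64_t iov_translate(const struct iov_table *t, uint64_t iova)
{
	size_t idx = (size_t)(iova >> IOV_PAGE_SHIFT);

	if (idx >= t->next_free)
		return UINT64_MAX;              /* unmapped */
	return t->phys[idx] | (iova & (IOV_PAGE_SIZE - 1));
}
```

The device sees one contiguous range starting at the returned base address,
while the CPU side has no single linear mapping covering the scattered
pages. That is why a kernel virtual mapping for such a buffer would have to
come from the vmalloc range.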

Of course this attribute will be implemented only by the architectures where
it gives some benefit. All others can simply ignore it and return a plain
coherent buffer with an ordinary kernel virtual mapping, which the driver
will just ignore.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center





RE: [PATCH 00/14] DMA-mapping framework redesign preparation

2011-12-27 Thread Marek Szyprowski
Hello,

On Friday, December 23, 2011 5:35 PM Matthew Wilcox wrote:

 On Fri, Dec 23, 2011 at 01:27:19PM +0100, Marek Szyprowski wrote:
  The first issue we identified is the fact that on some platforms (again,
  mainly ARM) there are several functions for allocating DMA buffers:
  dma_alloc_coherent, dma_alloc_writecombine and dma_alloc_noncoherent
 
 Is this write-combining from the point of view of the device (ie iommu),
 or from the point of view of the CPU, or both?

It is about write-combining from the CPU point of view. Right now there are
no devices with a memory interface advanced enough to do write combining on
the DMA side, but I believe that they might appear at some point in the
future as well.

  The next step in the DMA mapping framework update is the introduction of
  the dma_mmap/dma_mmap_attrs() functions. There are a number of drivers
  (mainly V4L2 and ALSA) that only export the DMA buffers to user space.
  Creating a userspace mapping with correct page attributes is not an easy
  task for the driver. Also the DMA-mapping framework is the only place
  where the complete information about the allocated pages is available,
  especially if the implementation uses an IOMMU controller to provide a
  contiguous buffer in DMA address space which is scattered in physical
  memory space.
 
 Surely we only need a helper which drivers can call from their mmap routine
 to solve this?

On the ARM architecture it is already implemented this way and a bunch of
drivers use the dma_mmap_coherent/dma_mmap_writecombine calls. We would like
to standardize these calls across all architectures.

  Usually these drivers don't touch the buffer data at all, so the mapping
  in kernel virtual address space is not needed. We can introduce a
  DMA_ATTRIB_NO_KERNEL_MAPPING attribute which lets the kernel skip
  creation of the kernel virtual mapping. This way we can save precious
  vmalloc area and simplify some mapping operations on a few architectures.
 
 I really think this wants to be a separate function.  dma_alloc_coherent
 is for allocating memory to be shared between the kernel and a driver;
 we already have dma_map_sg for mapping userspace I/O as an alternative
 interface.  This feels like it's something different again rather than
 an option to dma_alloc_coherent.

That is just a starting point for the discussion. 

I thought about this API a bit and came to the conclusion that there is not
much difference between a dma_alloc_coherent which creates a mapping in
kernel virtual space and one that does not. It is just a hint from the driver
that it will not use that mapping at all. Of course this attribute makes sense
only together with adding a dma_mmap_attrs() call, because otherwise drivers
won't be able to get access to the buffer data.

On coherent architectures where dma_alloc_coherent is just a simple wrapper
around alloc_pages_exact(), such an attribute can simply be ignored without
any impact on the drivers (that's the main idea behind dma attributes!).
However, such a hint will help a lot on non-coherent architectures where
additional work needs to be done to provide a coherent mapping in kernel
address space. It also saves some precious kernel resources like vmalloc
address range.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center





[PATCH 00/14] DMA-mapping framework redesign preparation

2011-12-23 Thread Marek Szyprowski
Hello everyone,

At the Linaro Memory Management meeting in Budapest (May 2011) we
discussed the design of the DMA mapping framework. We tried to
identify its drawbacks and limitations as well as to provide some
solutions for them. The discussion was mainly about the ARM architecture,
but some of the conclusions need to be applied to cross-architecture code.

The first issue we identified is the fact that on some platforms (again,
mainly ARM) there are several functions for allocating DMA buffers:
dma_alloc_coherent, dma_alloc_writecombine and dma_alloc_noncoherent
(not functional now). For each of them there is a matching dma_free_*
function. This gives us quite a lot of functions in the public API and
complicates things when we need to have several different
implementations for different devices selected at runtime (if an IOMMU
controller is available only for a few devices in the system). Also the
drivers which use less common variants are less portable because of the
lack of dma_alloc_writecombine on other architectures.

The solution we found is to introduce new public DMA mapping functions
with an additional attributes argument: dma_alloc_attrs() and
dma_free_attrs(). This way all the different kinds of architecture-specific
buffer mappings can be hidden behind the attributes without the need to
create several versions of the dma_alloc_* function.

dma_alloc_coherent() can be wrapped on top of the new dma_alloc_attrs() with
a NULL attrs parameter. dma_alloc_writecombine and dma_alloc_noncoherent
can be implemented as simple wrappers which set the attributes to
DMA_ATTRS_WRITECOMBINE or DMA_ATTRS_NON_CONSISTENT respectively. These
new attributes will be implemented only on the architectures that really
support them; the others will simply ignore them, defaulting to the
dma_alloc_coherent equivalent.
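
The wrapper scheme described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: every `sketch_*`
name, the bitmask representation of the attributes and the malloc()-backed
allocator are hypothetical. They are not the kernel's actual definitions nor
the code from this patch series.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical attribute bits, named after the proposal above. */
enum sketch_attr {
	SKETCH_ATTR_WRITECOMBINE   = 1 << 0,
	SKETCH_ATTR_NON_CONSISTENT = 1 << 1,
};

/* Kind of mapping the mock backend ended up providing. */
enum sketch_mapping {
	SKETCH_MAP_COHERENT,
	SKETCH_MAP_WRITECOMBINE,
	SKETCH_MAP_NON_CONSISTENT,
};

struct sketch_attrs { unsigned int flags; };

struct sketch_buf {
	void *cpu_addr;
	enum sketch_mapping kind;
};

/* Attributes this mock "architecture" implements (assumption). */
unsigned int sketch_arch_supported = SKETCH_ATTR_WRITECOMBINE;

/* Single entry point: attributes select the mapping kind. */
int sketch_alloc_attrs(struct sketch_buf *buf, size_t size,
		       const struct sketch_attrs *attrs)
{
	unsigned int flags = attrs ? attrs->flags : 0;

	/* Unsupported attributes are silently ignored. */
	flags &= sketch_arch_supported;

	buf->cpu_addr = malloc(size);
	if (!buf->cpu_addr)
		return -1;

	if (flags & SKETCH_ATTR_WRITECOMBINE)
		buf->kind = SKETCH_MAP_WRITECOMBINE;
	else if (flags & SKETCH_ATTR_NON_CONSISTENT)
		buf->kind = SKETCH_MAP_NON_CONSISTENT;
	else
		buf->kind = SKETCH_MAP_COHERENT;  /* default behaviour */
	return 0;
}

void sketch_free_attrs(struct sketch_buf *buf,
		       const struct sketch_attrs *attrs)
{
	(void)attrs;  /* the mock needs no attribute-specific teardown */
	free(buf->cpu_addr);
	buf->cpu_addr = NULL;
}

/* The older entry points become thin wrappers. */
int sketch_alloc_coherent(struct sketch_buf *buf, size_t size)
{
	return sketch_alloc_attrs(buf, size, NULL);
}

int sketch_alloc_writecombine(struct sketch_buf *buf, size_t size)
{
	struct sketch_attrs attrs = { SKETCH_ATTR_WRITECOMBINE };

	return sketch_alloc_attrs(buf, size, &attrs);
}

int sketch_alloc_noncoherent(struct sketch_buf *buf, size_t size)
{
	struct sketch_attrs attrs = { SKETCH_ATTR_NON_CONSISTENT };

	return sketch_alloc_attrs(buf, size, &attrs);
}
```

In this mock the "architecture" supports only the write-combine attribute,
so sketch_alloc_noncoherent() falls back to a coherent buffer, which is the
"ignore and default to the dma_alloc_coherent equivalent" behaviour.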

The next step in the DMA mapping framework update is the introduction of
the dma_mmap/dma_mmap_attrs() functions. There are a number of drivers
(mainly V4L2 and ALSA) that only export the DMA buffers to user space.
Creating a userspace mapping with correct page attributes is not an easy
task for the driver. Also the DMA-mapping framework is the only place
where the complete information about the allocated pages is available,
especially if the implementation uses an IOMMU controller to provide a
contiguous buffer in DMA address space which is scattered in physical
memory space.

Usually these drivers don't touch the buffer data at all, so the mapping
in kernel virtual address space is not needed. We can introduce a
DMA_ATTRIB_NO_KERNEL_MAPPING attribute which lets the kernel skip
creation of the kernel virtual mapping. This way we can save precious
vmalloc area and simplify some mapping operations on a few architectures.
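
The effect of such a hint can be modelled with a small userspace mock. This
is a sketch under stated assumptions only: the `nkm_*` names, the flag value
and the byte counter standing in for consumed kernel virtual address space
are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value standing in for the proposed attribute. */
#define NKM_ATTR_NO_KERNEL_MAPPING (1u << 0)

/* Mock architecture backend. */
struct nkm_arch {
	bool honours_no_kernel_mapping; /* does this arch act on the hint? */
	size_t kernel_va_used;          /* bytes of kernel VA consumed */
};

struct nkm_buf {
	size_t size;
	bool has_kernel_mapping;
};

/*
 * Allocate a mock buffer. The kernel mapping is skipped only when the
 * driver passes the hint AND the architecture chooses to honour it.
 */
void nkm_alloc(struct nkm_arch *arch, struct nkm_buf *buf, size_t size,
	       unsigned int attrs)
{
	bool skip = (attrs & NKM_ATTR_NO_KERNEL_MAPPING) &&
		    arch->honours_no_kernel_mapping;

	buf->size = size;
	buf->has_kernel_mapping = !skip;
	if (!skip)
		arch->kernel_va_used += size;
}
```

A driver that passes the hint must not rely on a CPU-side address for the
buffer, because whether the mapping exists depends on the architecture.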

This patch series is a preparation for the above changes in the public
DMA mapping API. The main goal is to modify the dma_map_ops structure and
let all users use it to implement the new public functions.
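
The idea of routing the public calls through a per-device operations table
can be sketched as follows. This is a userspace illustration under stated
assumptions: the `ops_*` names, the two mock backends and the simplified
signatures are hypothetical and much smaller than the kernel's real
dma_map_ops structure.

```c
#include <stddef.h>
#include <stdlib.h>

struct ops_device;

enum ops_backend { OPS_BACKEND_NONE, OPS_BACKEND_LINEAR, OPS_BACKEND_IOMMU };

/* Simplified operations table with attribute-aware alloc/free methods. */
struct ops_dma_map_ops {
	void *(*alloc_attrs)(struct ops_device *dev, size_t size,
			     unsigned int attrs);
	void (*free_attrs)(struct ops_device *dev, void *cpu_addr,
			   unsigned int attrs);
};

struct ops_device {
	const struct ops_dma_map_ops *dma_ops; /* chosen per device */
	enum ops_backend served_by;            /* backend that ran last */
};

static void *ops_linear_alloc(struct ops_device *dev, size_t size,
			      unsigned int attrs)
{
	(void)attrs;
	dev->served_by = OPS_BACKEND_LINEAR;
	return malloc(size);
}

static void *ops_iommu_alloc(struct ops_device *dev, size_t size,
			     unsigned int attrs)
{
	(void)attrs;
	dev->served_by = OPS_BACKEND_IOMMU;
	return malloc(size);
}

static void ops_common_free(struct ops_device *dev, void *cpu_addr,
			    unsigned int attrs)
{
	(void)dev;
	(void)attrs;
	free(cpu_addr);
}

const struct ops_dma_map_ops ops_linear = { ops_linear_alloc, ops_common_free };
const struct ops_dma_map_ops ops_iommu  = { ops_iommu_alloc, ops_common_free };

/* Public entry points only dispatch through the device's table. */
void *ops_alloc_attrs(struct ops_device *dev, size_t size, unsigned int attrs)
{
	return dev->dma_ops->alloc_attrs(dev, size, attrs);
}

void ops_free_attrs(struct ops_device *dev, void *cpu_addr, unsigned int attrs)
{
	dev->dma_ops->free_attrs(dev, cpu_addr, attrs);
}
```

Because the table is selected per device, a system where only a few devices
sit behind an IOMMU can mix both implementations at runtime without adding
new public functions.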

The proof-of-concept patches for the ARM architecture have already been
posted a few times and now they are working reasonably well. They perform
the conversion to a dma_map_ops based implementation and add support for
a generic IOMMU-based DMA mapping implementation. To get them merged we
first need to get acceptance for the changes in the common,
cross-architecture structures. More information about these patches can
be found in the following threads:

http://www.spinics.net/lists/linux-mm/msg19856.html
http://www.spinics.net/lists/linux-mm/msg21241.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000571.html
http://lists.linaro.org/pipermail/linaro-mm-sig/2011-September/000577.html
http://www.spinics.net/lists/linux-mm/msg25490.html

The patches are prepared on top of Linux Kernel v3.2-rc6. I would
appreciate any comments and help with getting this patch series into
the linux-next tree.

The ideas applied in this patch set have also been presented during the
Kernel Summit 2011 and ELC-E 2011 in Prague, in the presentation 'ARM
DMA-Mapping Framework Redesign and IOMMU integration'.

I'm really sorry if I missed any of the relevant architecture mailing
lists. I did my best to include everyone. Feel free to forward this
patchset to all interested developers and maintainers. I already feel
like a nasty spammer.

Best regards
Marek Szyprowski
Samsung Poland R&D Center


Patch summary:

Andrzej Pietrasiewicz (9):
  X86: adapt for dma_map_ops changes
  MIPS: adapt for dma_map_ops changes
  PowerPC: adapt for dma_map_ops changes
  IA64: adapt for dma_map_ops changes
  SPARC: adapt for dma_map_ops changes
  Alpha: adapt for dma_map_ops changes
  SH: adapt for dma_map_ops changes
  Microblaze: adapt for dma_map_ops changes
  Unicore32: adapt for dma_map_ops changes

Marek Szyprowski (5):
  common: dma-mapping: introduce alloc_attrs and free_attrs methods
  common: dma-mapping: remove old alloc_coherent and free_coherent
    methods
  common: dma-mapping: introduce mmap method
  common: DMA-mapping: add WRITE_COMBINE 

Re: [PATCH 00/14] DMA-mapping framework redesign preparation

2011-12-23 Thread Matthew Wilcox
On Fri, Dec 23, 2011 at 01:27:19PM +0100, Marek Szyprowski wrote:
 The first issue we identified is the fact that on some platforms (again,
 mainly ARM) there are several functions for allocating DMA buffers:
 dma_alloc_coherent, dma_alloc_writecombine and dma_alloc_noncoherent

Is this write-combining from the point of view of the device (ie iommu),
or from the point of view of the CPU, or both?

 The next step in the DMA mapping framework update is the introduction of
 the dma_mmap/dma_mmap_attrs() functions. There are a number of drivers
 (mainly V4L2 and ALSA) that only export the DMA buffers to user space.
 Creating a userspace mapping with correct page attributes is not an easy
 task for the driver. Also the DMA-mapping framework is the only place
 where the complete information about the allocated pages is available,
 especially if the implementation uses an IOMMU controller to provide a
 contiguous buffer in DMA address space which is scattered in physical
 memory space.

Surely we only need a helper which drivers can call from their mmap routine
to solve this?

 Usually these drivers don't touch the buffer data at all, so the mapping
 in kernel virtual address space is not needed. We can introduce a
 DMA_ATTRIB_NO_KERNEL_MAPPING attribute which lets the kernel skip
 creation of the kernel virtual mapping. This way we can save precious
 vmalloc area and simplify some mapping operations on a few architectures.

I really think this wants to be a separate function.  dma_alloc_coherent
is for allocating memory to be shared between the kernel and a driver;
we already have dma_map_sg for mapping userspace I/O as an alternative
interface.  This feels like it's something different again rather than
an option to dma_alloc_coherent.

-- 
Matthew Wilcox  Intel Open Source Technology Centre
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.


Re: [PATCH 00/14] DMA-mapping framework redesign preparation

2011-12-23 Thread Benjamin Herrenschmidt
On Fri, 2011-12-23 at 09:35 -0700, Matthew Wilcox wrote:
 I really think this wants to be a separate function.  dma_alloc_coherent
 is for allocating memory to be shared between the kernel and a driver;
 we already have dma_map_sg for mapping userspace I/O as an alternative
 interface.  This feels like it's something different again rather than
 an option to dma_alloc_coherent.

Depends. There can be some interesting issues with some of the ARM stuff
out there (and to a lesser extent older ppc embedded stuff).

For example, some devices really want a physically contiguous chunk, and
are not cache coherent. In that case, you can't keep the linear mapping
around. But you also don't want to waste your precious kernel virtual space
creating a separate non-cacheable mapping for those.

In general, DMA mapping attributes as a generic feature make sense,
whether or not this specific attribute does. And we probably want
space for platform-specific attributes; for example, FSL embedded
IOMMUs have interesting features for directing data toward a specific
core cache etc. that we might want to expose using such attributes.

Cheers,
Ben.

