Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-22 Thread Zach Pfeffer
On Thu, Jul 22, 2010 at 08:51:51AM +0100, Russell King - ARM Linux wrote:
 On Wed, Jul 21, 2010 at 08:50:26PM -0700, Zach Pfeffer wrote:
  On Wed, Jul 14, 2010 at 10:59:43AM +0900, FUJITA Tomonori wrote:
   On Tue, 13 Jul 2010 10:02:23 +0100
   
   Zach Pfeffer said this new VCM infrastructure can be useful for
   video4linux. However, I don't think we need another 3,000-line
   abstraction layer to solve video4linux's issue nicely.
  
  It's only 3,000 lines because I haven't converted the code to use
  function pointers.
 
 I don't understand - you've made this claim a couple of times.  I
 can't see how converting the code to use function pointers (presumably
 to eliminate those switch statements) would reduce the number of lines
 of code.
 
 Please explain (or show via new patches) how converting this to
 function pointers significantly reduces the number of lines of code.
 
 We might then be able to put just _one_ of these issues to bed.

Aye. It's getting worked on. Once it's done I'll push it.

 
  Getting back to the point. There is no API that can handle large
  buffer allocation and sharing with low-level attribute control for
  virtual address spaces outside the CPU.
 
 I think we've dealt with the attribute issue to death now.  Shall we
 repeat it again?

I think the only point of agreement is that all mappings must have
compatible attributes; the issue of multiple mappings is still
outstanding, as is the need for more fine-grained control over the
attributes of a set of compatible mappings (I still need to digest your
examples a little).

 
  The DMA API et al. take a CPU-centric view of virtual space
  management: sharing has to be explicitly written and external virtual
  space management is left up to device driver writers.
 
 I think I've also shown that not to be the case with example code.
 
 The code behind the DMA API can be changed on a per-device basis
 (currently on ARM we haven't supported that because no one's asked
 for it yet) so that it can support multiple IOMMUs even of multiple
 different types.

I'm seeing that now. As I become more familiar with the DMA API the
way forward may become clearer to me. I certainly appreciate the
time you've spent discussing things and the code examples you've
listed. For example, it's fairly clear how I can use a scatterlist to
describe a mapping of big buffers. I can start down this path and see
what shakes out.
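
For instance, something along these lines (a minimal sketch using the
existing scatterlist and DMA API calls; map_big_buffer() and its
arguments are placeholder names for whatever the driver already has,
not part of any existing interface):

  #include <linux/scatterlist.h>
  #include <linux/dma-mapping.h>

  /* Describe a big, physically scattered buffer with a scatterlist and
   * let the DMA API deal with CPU caches and any IOMMU behind the device. */
  static int map_big_buffer(struct device *dev, struct page **pages,
                            unsigned int nr_pages, struct sg_table *table)
  {
          struct scatterlist *sg;
          int i, nents;

          if (sg_alloc_table(table, nr_pages, GFP_KERNEL))
                  return -ENOMEM;

          for_each_sg(table->sgl, sg, nr_pages, i)
                  sg_set_page(sg, pages[i], PAGE_SIZE, 0);

          nents = dma_map_sg(dev, table->sgl, table->nents, DMA_BIDIRECTIONAL);
          if (!nents)
                  return -EIO;

          /* Each of the nents entries now carries a device-visible address
           * (sg_dma_address()); adjacent entries may have been coalesced if
           * an IOMMU could merge them. */
          return nents;
  }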

 


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-22 Thread Zach Pfeffer
On Thu, Jul 22, 2010 at 08:34:55AM +0100, Russell King - ARM Linux wrote:
 On Wed, Jul 21, 2010 at 09:25:28PM -0700, Zach Pfeffer wrote:
  Yes it is a problem, as Russell has brought up, but there's something
  I probably haven't communicated well. I'll use the following example:
  
  There are 3 devices: A CPU, a decoder and a video output device. All 3
  devices need to map the same 12 MB buffer at the same time.
 
 Why do you need the same buffer mapped by the CPU?
 
 Let's take your example of a video decoder and video output device.
 Surely the CPU doesn't want to be writing to the same memory region
 used for the output picture as the decoder is writing to.  So what's
 the point of mapping that memory into the CPU's address space?

It may, especially if you're doing some software post-processing. Also,
by mapping all the buffers it's extremely fast to pass the buffers
around in this scenario - the buffer passing becomes a simple signal.

 
 Surely the video output device doesn't need to see the input data to
 the decoder either?

No, but other devices may (like the CPU).

 
 Surely, all you need is:
 
 1. a mapping for the CPU for a chunk of memory to pass data to the
decoder.
 2. a mapping for the decoder to see the chunk of memory to receive data
from the CPU.
 3. a mapping for the decoder to see a chunk of memory used for the output
video buffer.
 4. a mapping for the output device to see the video buffer.
 
 So I don't see why everything needs to be mapped by everything else.

That's fair, but we do share buffers, we do have many very large
mappings, and we do need to pull these from separate pools because
they need to exhibit a particular allocation profile. I agree with you
that things should work the way you've listed, but with Qualcomm's ARM
multimedia engines we're seeing some different usage scenarios. It's
the giant buffers, the need to use our own buffer allocator, the need
to share and the need to swap out virtual IOMMU space (which we
haven't talked about much) that make the DMA API seem like a
mismatch. (We haven't even talked about graphics usage ;) ).


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-22 Thread Zach Pfeffer
On Thu, Jul 22, 2010 at 08:39:17AM +0100, Russell King - ARM Linux wrote:
 On Wed, Jul 21, 2010 at 09:30:34PM -0700, Zach Pfeffer wrote:
  This goes to the nub of the issue. We need a lot of 1 MB physically
  contiguous chunks. The system is going to fragment and we'll never get
  the 12 1 MB chunks we'll need; since the DMA API allocator uses
  the system pool, it will never succeed.
 
 By the DMA API allocator I assume you mean the coherent DMA interface.
 The DMA coherent API and DMA streaming APIs are two separate sub-interfaces
 of the DMA API and are not dependent on each other.

I didn't know that, but yes. As far as I can tell they both allocate
memory from the VM. We'd need a way to hook in our own minimized
mapping allocator.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-22 Thread Zach Pfeffer
On Thu, Jul 22, 2010 at 01:47:36PM +0900, FUJITA Tomonori wrote:
 On Wed, 21 Jul 2010 20:50:26 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
  On Wed, Jul 14, 2010 at 10:59:43AM +0900, FUJITA Tomonori wrote:
   On Tue, 13 Jul 2010 10:02:23 +0100
   
   Zach Pfeffer said this new VCM infrastructure can be useful for
   video4linux. However, I don't think we need another 3,000-line
   abstraction layer to solve video4linux's issue nicely.
  
  It's only 3,000 lines because I haven't converted the code to use
  function pointers.
 
 The main point is that adding a new abstraction doesn't provide a huge
 benefit.

I disagree. In its current form the API may not be appropriate for
inclusion in the kernel, but it provides a common framework for
handling a class of problems that have been solved many times in the
kernel: large buffer management, IOMMU interoperation and fine-grained
mapping control.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-22 Thread Zach Pfeffer
On Thu, Jul 22, 2010 at 01:43:26PM +0900, FUJITA Tomonori wrote:
 On Wed, 21 Jul 2010 21:30:34 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
  On Wed, Jul 21, 2010 at 10:44:37AM +0900, FUJITA Tomonori wrote:
   On Tue, 20 Jul 2010 15:20:01 -0700
   Zach Pfeffer zpfef...@codeaurora.org wrote:
   
 I'm not saying that it's reasonable to pass (or even allocate) a 1MB
 buffer via the DMA API.

But given a bunch of large chunks of memory, is there any API that can
manage them (asked this on the other thread as well)?
   
   What is the problem about mapping a 1MB buffer with the DMA API?
   
   Possibly, an IOMMU can't find space for 1MB but it's not the problem
   of the DMA API.
  
  This goes to the nub of the issue. We need a lot of 1 MB physically
  contiguous chunks. The system is going to fragment and we'll never get
  the 12 1 MB chunks we'll need; since the DMA API allocator uses
  the system pool, it will never succeed. For this reason we reserve a
  pool of 1 MB chunks (and 16 MB, 64 KB, etc.) to satisfy our
  requests. This same use case is seen on most embedded media engines
  that are getting built today.
 
 We don't need a new abstraction to reserve some memory.
 
 If you want a pre-allocated memory pool per device (and to share it with
 some), the DMA API can do that for coherent memory (see
 dma_alloc_from_coherent). You can extend the DMA API if necessary.

That function won't work for us. We can't use
bitmap_find_free_region(); we need to use our own allocator. If
anything, we need a dma_alloc_from_custom(my_allocator). Take a look
at:

mm: iommu: A physical allocator for the VCMM
vcm_alloc_max_munch() 
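
To make that concrete, here's the rough shape of the hook I mean;
dma_alloc_from_custom() and struct dma_region_allocator are hypothetical
names used for illustration, not existing kernel interfaces:

  /* Hypothetical sketch only: the idea is that a device registers its
   * own physical allocator (e.g. a max-munch allocator over reserved
   * 1 MB/64 KB/4 KB pools) and the coherent allocation path calls it
   * instead of bitmap_find_free_region(). */
  struct dma_region_allocator {
          void *(*alloc)(struct device *dev, size_t size,
                         dma_addr_t *dma_handle, gfp_t gfp);
          void  (*free)(struct device *dev, size_t size,
                        void *vaddr, dma_addr_t dma_handle);
          void  *pools;   /* e.g. the reserved big-chunk pools */
  };

  /* Hypothetical entry point, by analogy with dma_alloc_from_coherent() */
  static void *dma_alloc_from_custom(struct device *dev,
                                     struct dma_region_allocator *a,
                                     size_t size, dma_addr_t *dma_handle)
  {
          return a->alloc(dev, size, dma_handle, GFP_KERNEL);
  }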


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-21 Thread Zach Pfeffer
On Wed, Jul 14, 2010 at 10:59:43AM +0900, FUJITA Tomonori wrote:
 On Tue, 13 Jul 2010 10:02:23 +0100
 
 Zach Pfeffer said this new VCM infrastructure can be useful for
 video4linux. However, I don't think we need another 3,000-line
 abstraction layer to solve video4linux's issue nicely.

It's only 3,000 lines because I haven't converted the code to use
function pointers.

 I can't find any reasonable reasons that we need to merge VCM; seems
 that the combination of the current APIs (or with some small
 extensions) can work for the issues that VCM tries to solve.

Getting back to the point. There is no API that can handle large
buffer allocation and sharing with low-level attribute control for
virtual address spaces outside the CPU. At this point, if you need to
work with big buffers (1 MB, 16 MB, etc.) and map those big buffers
to non-CPU virtual spaces, you need to explicitly carve them out and
set up the mappings and sharing by hand. It's reasonable to have an API
that can do this, especially since IOMMUs are going to become more
prevalent. The DMA API et al. take a CPU-centric view of virtual space
management: sharing has to be explicitly written and external virtual
space management is left up to device driver writers. Given a system
where each device has an IOMMU or an MMU, the whole concept of a
scatterlist goes away. The VCM API gets a jump on it.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-21 Thread Zach Pfeffer
On Tue, Jul 20, 2010 at 09:44:12PM -0400, Timothy Meade wrote:
 On Tue, Jul 20, 2010 at 8:44 PM, Zach Pfeffer zpfef...@codeaurora.org wrote:
  On Mon, Jul 19, 2010 at 05:21:35AM -0400, Tim HRM wrote:
  On Fri, Jul 16, 2010 at 8:01 PM, Larry Bassel lbas...@codeaurora.org 
  wrote:
   On 16 Jul 10 08:58, Russell King - ARM Linux wrote:
   On Thu, Jul 15, 2010 at 08:48:36PM -0400, Tim HRM wrote:
Interesting, since I seem to remember the MSM devices mostly conduct
IO through regions of normal RAM, largely accomplished through
ioremap() calls.
   
Without more public domain documentation of the MSM chips and AMSS
interfaces I wouldn't know how to avoid this, but I can imagine it
creates a bit of urgency for Qualcomm developers as they attempt to
upstream support for this most interesting SoC.
  
   As the patch has been out for RFC since early April on the 
   linux-arm-kernel
   mailing list (Subject: [RFC] Prohibit ioremap() on kernel managed RAM),
   and no comments have come back from Qualcomm folk.
  
   We are investigating the impact of this change on us, and I
   will send out more detailed comments next week.
  
  
   The restriction on creation of multiple V:P mappings with differing
   attributes is also fairly hard to miss in the ARM architecture
   specification when reading the sections about caches.
  
  
   Larry Bassel
  
   --
   Sent by an employee of the Qualcomm Innovation Center, Inc.
   The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
   Forum.
  
 
  Hi Larry and Qualcomm people.
  I'm curious what your reason for introducing this new api (or adding
  to dma) is. Specifically how this would be used to make the memory
  mapping of the MSM chip dynamic in contrast to the fixed _PHYS defines
  in the Android and Codeaurora trees.
 
  The MSM has many integrated engines that allow offloading a variety of
  workloads. These engines have always addressed memory using physical
  addresses; because of this, we had to reserve large (tens of MB) buffers
  at boot. These buffers are never freed regardless of whether an engine
  is actually using them. As you can imagine, needing to reserve memory
  for all time on a device that doesn't have a lot of memory in the
  first place is not ideal because that memory could be used for other
  things, running apps, etc.
 
  To solve this problem we put IOMMUs in front of a lot of the
  engines. IOMMUs allow us to map physically discontiguous memory into a
  virtually contiguous address range. This means that we could ask the
  OS for 10 MB of pages and map all of these into our IOMMU space and
  the engine would still see a contiguous range.
 
 
 
 I see. Much like I suspected, this is used to replace the static
 regime of the earliest Android kernel.  You mention placing IOMMUs in
 front of the A11 engines, you are involved in this architecture as an
 engineer or similar?  

I'm involved to the extent of designing and implementing VCM and,
finding it useful for this class of problems, trying to push it upstream.

 Is there a reason a cooperative approach using
 RPC or another mechanism is not used for memory reservation, this is
 something that can be accomplished fully on APPS side?

It can be accomplished a few ways. At this point we let the
application processor manage the buffers. Other cooperative approaches
have been talked about. As you can see in the short but voluminous
canon of MSM Linux support, there is a degree of RPC used to
communicate with other nodes in the system. As time progresses the
canon of code shows this usage going down.

 
  In reality, limitations in the hardware meant that we needed to map
  memory using larger mappings to minimize the number of TLB
  misses. This, plus the number of IOMMUs and the extreme use cases we
  needed to design for led us to a generic design.
 
  This generic design solved our problem and the general mapping
  problem. We thought other people, who had this same big-buffer
  interoperation problem would also appreciate a common API that was
  built with their needs in mind so we pushed our idea up.
 
 
  I'm also interested in how this ability to map memory regions as files
  for devices like KGSL/DRI or PMEM might work and why this is better
  suited to that purpose than existing methods, where this fits into
  camera preview and other issues that have been dealt with in these
  trees in novel ways (from my perspective).
 
  The file based approach was driven by Android's buffer passing scheme
  and the need to write userspace drivers for multimedia, etc...
 
 
 So the Android file-backed approach is obviated by GEM and other mechanisms?

Aye.

 
 Thanks you for you help,
 Timothy Meade
 -tmzt #htc-linux (facebook.com/HTCLinux)


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-21 Thread Zach Pfeffer
On Mon, Jul 19, 2010 at 12:44:49AM -0700, Eric W. Biederman wrote:
 Zach Pfeffer zpfef...@codeaurora.org writes:
 
  On Thu, Jul 15, 2010 at 09:55:35AM +0100, Russell King - ARM Linux wrote:
  On Wed, Jul 14, 2010 at 06:29:58PM -0700, Zach Pfeffer wrote:
   The VCM ensures that all mappings that map a given physical buffer:
   IOMMU mappings, CPU mappings and one-to-one device mappings all map
   that buffer using the same (or compatible) attributes. At this point
   the only attribute that users can pass is CACHED. In the absence of
   CACHED all accesses go straight through to the physical memory.
  
  So what you're saying is that if I have a buffer in kernel space
  which I already have its virtual address, I can pass this to VCM and
  tell it !CACHED, and it'll setup another mapping which is not cached
  for me?
 
  Not quite. The existing mapping will be represented by a reservation
  from the prebuilt VCM of the VM. This reservation has been marked
  non-cached. Another reservation on an IOMMU VCM, also marked non-cached
  will be backed with the same physical memory. This is legal in ARM,
  allowing the vcm_back call to succeed. If you instead passed cached on
  the second mapping, the first mapping would be non-cached and the
  second would be cached. If the underlying architecture supported this
  then the vcm_back would go through.
 
 How does this compare with the x86 pat code?

First, thanks for asking this question. I wasn't aware of the x86 PAT
code, so I read up on it. From my initial read, the VCM differs in two ways:

1. The attributes are explicitly set on virtual address ranges. These
reservations can then map physical memory with these attributes.

2. We explicitly allow multiple mappings (as long as the attributes are
compatible). One such mapping may come from an IOMMU's virtual address
space while another comes from the CPU's virtual address space. These
mappings may exist at the same time.

 
  You are aware that multiple V:P mappings for the same physical page
  with different attributes are being outlawed with ARMv6 and ARMv7
  due to speculative prefetching.  The cache can be searched even for
  a mapping specified as 'normal, uncached' and you can get cache hits
  because the data has been speculatively loaded through a separate
  cached mapping of the same physical page.
 
  I didn't know that. Thanks for the heads up.
 
  FYI, during the next merge window, I will be pushing a patch which makes
  ioremap() of system RAM fail, which should be the last core code creator
  of mappings with different memory types.  This behaviour has been outlawed
  (as unpredictable) in the architecture specification and does cause
  problems on some CPUs.
 
  That's fair enough, but it seems like it should only be outlawed for
  those processors on which it breaks.
 
 To my knowledge mismatch of mapping attributes is a problem on most
 cpus on every architecture.  I don't see it making sense to encourage
 coding constructs that will fail in the strangest most difficult to
 debug ways.

Yes it is a problem, as Russell has brought up, but there's something
I probably haven't communicated well. I'll use the following example:

There are 3 devices: a CPU, a decoder and a video output device. All 3
devices need to map the same 12 MB buffer at the same time. Once this
buffer has served its purpose it gets freed and goes back into the
pool of big buffers. When the same use case comes up again, the buffer
needs to be reallocated and the same devices need to map it.

This use case does exist, not only for Qualcomm but for all of these
SoC media engines that have started running Linux. The VCM API
attempts to cover this case for the Linux kernel.
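
Roughly, the flow I have in mind for that example looks like this (the
type and function names paraphrase the physmem/reservation/vcm_back
terminology used in this thread; the signatures, the pool/vcm handles
and the VCM_CACHED flag are illustrative rather than copied from the
posted patches):

  /* Illustrative only. One 12 MB physical allocation, built from the
   * big-chunk pools, is mapped into three virtual spaces at once: the
   * CPU's, the decoder IOMMU's and the video output IOMMU's. */
  struct physmem *buf;
  struct reservation *cpu_res, *dec_res, *out_res;

  buf = vcm_phys_alloc(pool, 12 * SZ_1M, 0);

  cpu_res = vcm_reserve(cpu_vcm, 12 * SZ_1M, VCM_CACHED); /* kernel map */
  dec_res = vcm_reserve(dec_vcm, 12 * SZ_1M, 0);          /* decoder    */
  out_res = vcm_reserve(out_vcm, 12 * SZ_1M, 0);          /* video out  */

  vcm_back(cpu_res, buf);    /* attribute compatibility is checked here */
  vcm_back(dec_res, buf);
  vcm_back(out_res, buf);

  /* ... decode, post-process on the CPU, display; handing the buffer
   * from one device to the next is just a signal, no remapping ... */

  vcm_unback(out_res);
  vcm_unback(dec_res);
  vcm_unback(cpu_res);
  vcm_phys_free(buf);        /* chunks return to the big-buffer pool */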


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-21 Thread Zach Pfeffer
On Wed, Jul 21, 2010 at 10:44:37AM +0900, FUJITA Tomonori wrote:
 On Tue, 20 Jul 2010 15:20:01 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
   I'm not saying that it's reasonable to pass (or even allocate) a 1MB
   buffer via the DMA API.
  
  But given a bunch of large chunks of memory, is there any API that can
  manage them (asked this on the other thread as well)?
 
 What is the problem about mapping a 1MB buffer with the DMA API?
 
 Possibly, an IOMMU can't find space for 1MB but it's not the problem
 of the DMA API.

This goes to the nub of the issue. We need a lot of 1 MB physically
contiguous chunks. The system is going to fragment and we'll never get
the 12 1 MB chunks we'll need; since the DMA API allocator uses
the system pool, it will never succeed. For this reason we reserve a
pool of 1 MB chunks (and 16 MB, 64 KB, etc.) to satisfy our
requests. This same use case is seen on most embedded media engines
that are getting built today.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-20 Thread Zach Pfeffer
On Tue, Jul 20, 2010 at 09:54:33PM +0100, Russell King - ARM Linux wrote:
 On Tue, Jul 20, 2010 at 01:45:17PM -0700, Zach Pfeffer wrote:
  You can also conflict in access permissions which can and do conflict
  (which are what multiple mappings are all about...some buffer can get
  some access, while others get different access).
 
 Access permissions don't conflict between mappings - each mapping has
 unique access permissions.

Yes. Bad choice of words.

  The VCM API allows the same memory to be mapped as long as it makes
  sense and allows those attributes that can change to be specified. It
  could be the alternative, globally applicable approach, you're looking
  for and request in your patch.
 
 I very much doubt it - there's virtually no call for creating an
 additional mapping of existing kernel memory with different permissions.
 The only time kernel memory gets remapped is with vmalloc(), where we
 want to create a virtually contiguous mapping from a collection of
 (possibly) non-contiguous pages.  Such allocations are always created
 with R/W permissions.
 
 There are some cases where the vmalloc APIs are used to create mappings
 with different memory properties, but as already covered, this has become
 illegal with ARMv6 and v7 architectures.
 
 So no, VCM doesn't help because there's nothing that could be solved here.
 Creating read-only mappings is pointless, and creating mappings with
 different memory type, sharability or cache attributes is illegal.

I don't think it's pointless; it may have limited utility but things
like read-only mappings can be useful.

  Without the VCM API (or something like it) there will just be a bunch
  of duplicated code that's basically doing ioremap. This code will
  probably fail to configure its mappings correctly, in which case your
  patch is a bad idea because it'll spawn bugs all over the place
  instead of at a known location. We could instead change ioremap to
  match the attributes of System RAM if that's what it's mapping.
 
 And as I say, what is the point of creating another identical mapping to
 the one we already have?

As you say, probably not much. We do still have a problem (and other
people have it as well): we need to map in large contiguous buffers
with various attributes and point the kernel and various engines at
them. This seems like something that would be globally useful. The
feedback I've gotten is that we should just keep our usage private to
our mach-msm branch.

I've got a couple of questions:

Do you think a global solution to this problem is appropriate?

What would that solution need to look like, transparent huge pages?

How should people change various mapping attributes for these large
sections of memory?


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-20 Thread Zach Pfeffer
On Mon, Jul 19, 2010 at 05:21:35AM -0400, Tim HRM wrote:
 On Fri, Jul 16, 2010 at 8:01 PM, Larry Bassel lbas...@codeaurora.org wrote:
  On 16 Jul 10 08:58, Russell King - ARM Linux wrote:
  On Thu, Jul 15, 2010 at 08:48:36PM -0400, Tim HRM wrote:
   Interesting, since I seem to remember the MSM devices mostly conduct
   IO through regions of normal RAM, largely accomplished through
   ioremap() calls.
  
   Without more public domain documentation of the MSM chips and AMSS
   interfaces I wouldn't know how to avoid this, but I can imagine it
   creates a bit of urgency for Qualcomm developers as they attempt to
   upstream support for this most interesting SoC.
 
  As the patch has been out for RFC since early April on the linux-arm-kernel
  mailing list (Subject: [RFC] Prohibit ioremap() on kernel managed RAM),
  and no comments have come back from Qualcomm folk.
 
  We are investigating the impact of this change on us, and I
  will send out more detailed comments next week.
 
 
  The restriction on creation of multiple V:P mappings with differing
  attributes is also fairly hard to miss in the ARM architecture
  specification when reading the sections about caches.
 
 
  Larry Bassel
 
  --
  Sent by an employee of the Qualcomm Innovation Center, Inc.
  The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
 
 
 Hi Larry and Qualcomm people.
 I'm curious what your reason for introducing this new api (or adding
 to dma) is.  Specifically how this would be used to make the memory
 mapping of the MSM chip dynamic in contrast to the fixed _PHYS defines
 in the Android and Codeaurora trees.

The MSM has many integrated engines that allow offloading a variety of
workloads. These engines have always addressed memory using physical
addresses; because of this, we had to reserve large (tens of MB) buffers
at boot. These buffers are never freed regardless of whether an engine
is actually using them. As you can imagine, needing to reserve memory
for all time on a device that doesn't have a lot of memory in the
first place is not ideal because that memory could be used for other
things, running apps, etc.

To solve this problem we put IOMMUs in front of a lot of the
engines. IOMMUs allow us to map physically discontiguous memory into a
virtually contiguous address range. This means that we could ask the
OS for 10 MB of pages and map all of these into our IOMMU space and
the engine would still see a contiguous range.
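
In terms of the kernel's generic IOMMU interface, that boils down to
something like the following sketch (the exact iommu_map() signature
has changed across kernel versions, so treat the arguments as
indicative; ENGINE_IOVA_BASE, engine_dev and pages[] are placeholders):

  /* Map a scattered set of order-0 pages at one contiguous device
   * (IOVA) address so the engine sees a single linear buffer. */
  struct iommu_domain *dom = iommu_domain_alloc();
  unsigned long iova = ENGINE_IOVA_BASE;
  int i;

  iommu_attach_device(dom, engine_dev);

  for (i = 0; i < nr_pages; i++)
          iommu_map(dom, iova + i * PAGE_SIZE, page_to_phys(pages[i]),
                    PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);

  /* The engine now sees [iova, iova + nr_pages * PAGE_SIZE) as one
   * contiguous range even though the backing pages are scattered. */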

In reality, limitations in the hardware meant that we needed to map
memory using larger mappings to minimize the number of TLB
misses. This, plus the number of IOMMUs and the extreme use cases we
needed to design for led us to a generic design.

This generic design solved our problem and the general mapping
problem. We thought other people who had this same big-buffer
interoperation problem would also appreciate a common API built with
their needs in mind, so we pushed our idea up.

 
 I'm also interested in how this ability to map memory regions as files
 for devices like KGSL/DRI or PMEM might work and why this is better
 suited to that purpose than existing methods, where this fits into
 camera preview and other issues that have been dealt with in these
 trees in novel ways (from my perspective).

The file based approach was driven by Android's buffer passing scheme
and the need to write userspace drivers for multimedia, etc...



Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-19 Thread Zach Pfeffer
On Thu, Jul 15, 2010 at 09:55:35AM +0100, Russell King - ARM Linux wrote:
 On Wed, Jul 14, 2010 at 06:29:58PM -0700, Zach Pfeffer wrote:
  The VCM ensures that all mappings that map a given physical buffer:
  IOMMU mappings, CPU mappings and one-to-one device mappings all map
  that buffer using the same (or compatible) attributes. At this point
  the only attribute that users can pass is CACHED. In the absence of
  CACHED all accesses go straight through to the physical memory.
 
 So what you're saying is that if I have a buffer in kernel space
 which I already have its virtual address, I can pass this to VCM and
 tell it !CACHED, and it'll setup another mapping which is not cached
 for me?

Not quite. The existing mapping will be represented by a reservation
from the prebuilt VCM of the VM. This reservation has been marked
non-cached. Another reservation on an IOMMU VCM, also marked non-cached
will be backed with the same physical memory. This is legal in ARM,
allowing the vcm_back call to succeed. If you instead passed cached on
the second mapping, the first mapping would be non-cached and the
second would be cached. If the underlying architecture supported this
then the vcm_back would go through.

 
 You are aware that multiple V:P mappings for the same physical page
 with different attributes are being outlawed with ARMv6 and ARMv7
 due to speculative prefetching.  The cache can be searched even for
 a mapping specified as 'normal, uncached' and you can get cache hits
 because the data has been speculatively loaded through a separate
 cached mapping of the same physical page.

I didn't know that. Thanks for the heads up.

 FYI, during the next merge window, I will be pushing a patch which makes
 ioremap() of system RAM fail, which should be the last core code creator
 of mappings with different memory types.  This behaviour has been outlawed
 (as unpredictable) in the architecture specification and does cause
 problems on some CPUs.

That's fair enough, but it seems like it should only be outlawed for
those processors on which it breaks.

 
 We've also the issue of multiple mappings with differing cache attributes
 which needs addressing too...

The VCM has been architected to handle these things.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-14 Thread Zach Pfeffer
On Wed, Jul 14, 2010 at 10:59:38AM +0900, FUJITA Tomonori wrote:
 On Tue, 13 Jul 2010 05:14:21 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
   You mean that you want to specify this alignment attribute every time
   you create an IOMMU mapping? Then you can set segment_boundary_mask
   every time you create an IOMMU mapping. It's odd but it should work.
  
  Kinda. I want to forget about IOMMUs, devices and CPUs. I just want to
  create a mapping that has the alignment I specify, regardless of the
  mapper. The mapping is created on a VCM and the VCM is associated with
  a mapper: a CPU, an IOMMU'd device or a direct mapped device.
 
 Sounds like you can do the above with the combination of the current
 APIs, create a virtual address and then an I/O address.
 

Yes, and that's what the implementation does - as do all the other
implementations that need to do this same thing. Why not solve the
problem once?

 The above can't be a reason to add a new infrastructure includes more
 than 3,000 lines.

Right now it's 3,000 lines because I haven't converted to a
function-pointer-based implementation. Once I do that, the size of the
implementation will shrink and the code will act as a lib. Users pass
in buffer mappers and the lib will ease the management of those
buffers.

  
 
   Another possible solution is extending struct dma_attrs. We could add
   the alignment attribute to it.
  
  That may be useful, but in the current DMA-API it may be seen as
  redundant info.
 
 If there is real requirement, we can extend the DMA-API.

If the DMA-API contained functions to allocate virtual space separate
from physical space and reworked how chained buffers functioned it
would probably work - but then things start to look like the VCM API
which does graph based map management.

 


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-14 Thread Zach Pfeffer
On Wed, Jul 14, 2010 at 09:34:03PM +0200, Joerg Roedel wrote:
 On Mon, Jul 12, 2010 at 10:21:05PM -0700, Zach Pfeffer wrote:
  Joerg Roedel wrote:
 
   The DMA-API already does this with the help of IOMMUs if they are
   present. What is the benefit of your approach over that?
  
  The grist to the DMA-API mill is the opaque scatterlist. Each
  scatterlist element brings together a physical address and a bus
  address that may be different. The set of scatterlist elements
  constitute both the set of physical buffers and the mappings to those
  buffers. My approach separates these two things into a struct physmem
  which contains the set of physical buffers and a struct reservation
  which contains the set of bus addresses (or device addresses). Each
  element in the struct physmem may be of various lengths (without
  resorting to chaining). A map call maps the one set to the other.
 
 Okay, that's a different concept; where is the benefit?

The benefit is that virtual address space and physical address space
are managed independently. This may be useful if you want to reuse the
same set of physical buffers: a user simply maps them when they're
needed. It also means that different physical memories could be
targeted and a virtual allocation could map those memories without
worrying about where they were.

This whole concept is just a logical extension of the existing
separation between pages and page frames... in fact the separation
between physical memory and what is mapped to that memory is
fundamental to the Linux kernel. This approach just says that
arbitrarily long buffers should work the same way.
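
A minimal sketch of that separation (the struct layouts here are
illustrative, not the ones in the RFC patches):

  /* A physmem is just "what memory": a list of physical chunks of
   * possibly mixed sizes. A reservation is just "where it appears": a
   * range in some virtual space (CPU or IOMMU). Neither knows about
   * the other until a vcm_back()-style map call ties them together. */
  struct phys_chunk {
          phys_addr_t      pa;
          size_t           len;          /* 1 MB, 64 KB, 4 KB, ... */
          struct list_head list;
  };

  struct physmem {
          struct list_head chunks;
          size_t           total_len;
  };

  struct reservation {
          struct vcm      *vcm;          /* which virtual space */
          unsigned long    dev_addr;     /* start address in that space */
          size_t           len;
          u32              attr;         /* e.g. cached/non-cached */
  };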


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-14 Thread Zach Pfeffer
On Wed, Jul 14, 2010 at 11:05:36PM +0100, Russell King - ARM Linux wrote:
 On Wed, Jul 14, 2010 at 01:11:49PM -0700, Zach Pfeffer wrote:
  If the DMA-API contained functions to allocate virtual space separate
  from physical space and reworked how chained buffers functioned it
  would probably work - but then things start to look like the VCM API
  which does graph based map management.
 
 Every additional virtual mapping of a physical buffer results in
 additional cache aliases on aliasing caches, and more workload for
 developers to sort out the cache aliasing issues.
 
 What does VCM to do mitigate that?

The VCM ensures that all mappings that map a given physical buffer:
IOMMU mappings, CPU mappings and one-to-one device mappings all map
that buffer using the same (or compatible) attributes. At this point
the only attribute that users can pass is CACHED. In the absence of
CACHED all accesses go straight through to the physical memory.

The architecture of the VCM allows these sorts of consistency checks
to be made since all mappers of a given physical resource are
tracked. This is feasible because the physical resources we're
tracking are typically large.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-14 Thread Zach Pfeffer
On Wed, Jul 14, 2010 at 06:29:58PM -0700, Zach Pfeffer wrote:
 On Wed, Jul 14, 2010 at 11:05:36PM +0100, Russell King - ARM Linux wrote:
  On Wed, Jul 14, 2010 at 01:11:49PM -0700, Zach Pfeffer wrote:
   If the DMA-API contained functions to allocate virtual space separate
   from physical space and reworked how chained buffers functioned it
   would probably work - but then things start to look like the VCM API
   which does graph based map management.
  
  Every additional virtual mapping of a physical buffer results in
  additional cache aliases on aliasing caches, and more workload for
  developers to sort out the cache aliasing issues.
  
  What does VCM to do mitigate that?
 
 The VCM ensures that all mappings that map a given physical buffer:
 IOMMU mappings, CPU mappings and one-to-one device mappings all map
 that buffer using the same (or compatible) attributes. At this point
 the only attribute that users can pass is CACHED. In the absence of
 CACHED all accesses go straight through to the physical memory.
 
 The architecture of the VCM allows these sorts of consistency checks
 to be made since all mappers of a given physical resource are
 tracked. This is feasible because the physical resources we're
 tracking are typically large.

A few more things...

In addition to CACHED, the VCMM can support different cache policies
as long as the architecture can support them - they get passed down
through the device map call.

In addition, handling physical mappings in the VCMM enables it to
perform refcounting on the physical chunks (i.e., to see how many
virtual spaces each chunk has been mapped to, including the kernel's).
This allows it to turn on any coherency protocols that are available in
hardware (e.g., setting the shareable bit on something that is mapped to
more than one virtual space). That same attribute can be left off on a
buffer that has only one virtual mapping (e.g., scratch buffers used by
one device only). It is then up to the underlying system to deal with
that shared attribute - to enable redirection if it's supported, or to
force something to be non-cacheable, etc. Doing it all through the
VCMM allows all these mechanisms to be worked out once per architecture
and then reused.
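
As a sketch of the refcount-driven choice described above (the
map_count field and VCM_ATTR_SHAREABLE are illustrative names, not
from the posted patches):

  /* Pick mapping attributes for a physical chunk based on how many
   * virtual spaces currently map it. */
  static u32 vcm_chunk_attrs(struct phys_chunk *chunk, u32 requested)
  {
          u32 attr = requested;

          if (atomic_inc_return(&chunk->map_count) > 1)
                  /* Mapped into more than one virtual space: mark it
                   * shareable so hardware coherency (if present) kicks
                   * in; the architecture layer may instead force it
                   * non-cacheable if it can't honour that. */
                  attr |= VCM_ATTR_SHAREABLE;

          return attr;
  }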



Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-14 Thread Zach Pfeffer
On Wed, Jul 14, 2010 at 06:47:34PM -0700, Eric W. Biederman wrote:
 Zach Pfeffer zpfef...@codeaurora.org writes:
 
  On Wed, Jul 14, 2010 at 11:05:36PM +0100, Russell King - ARM Linux wrote:
  On Wed, Jul 14, 2010 at 01:11:49PM -0700, Zach Pfeffer wrote:
   If the DMA-API contained functions to allocate virtual space separate
   from physical space and reworked how chained buffers functioned it
   would probably work - but then things start to look like the VCM API
   which does graph based map management.
  
  Every additional virtual mapping of a physical buffer results in
  additional cache aliases on aliasing caches, and more workload for
  developers to sort out the cache aliasing issues.
  
  What does VCM to do mitigate that?
 
  The VCM ensures that all mappings that map a given physical buffer:
  IOMMU mappings, CPU mappings and one-to-one device mappings all map
  that buffer using the same (or compatible) attributes. At this point
  the only attribute that users can pass is CACHED. In the absence of
  CACHED all accesses go straight through to the physical memory.
 
  The architecture of the VCM allows these sorts of consistency checks
  to be made since all mappers of a given physical resource are
  tracked. This is feasible because the physical resources we're
  tracking are typically large.
 
 On x86 this is implemented in the pat code, and could reasonably be
 generalized to be cross platform.
 
 This is controlled by HAVE_PFNMAP_TRACKING and with entry points
 like track_pfn_vma_new.
 
 Given that we already have an implementation that tracks the cached
 vs non-cached attribute using the dma api.  I don't see that the
 API has to change.  An implementation of the cached vs non-cached
 status for arm and other architectures is probably appropriate.
 
 It is definitely true that getting your mapping caching attributes
 out of sync can be a problem.

Sure, but we're still stuck with needing lots of scatterlist
elements and needing to copy them to share physical buffers.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-13 Thread Zach Pfeffer
On Tue, Jul 13, 2010 at 02:59:08PM +0900, FUJITA Tomonori wrote:
 On Mon, 12 Jul 2010 22:46:59 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
  Joerg Roedel wrote:
   On Fri, Jul 02, 2010 at 12:33:51AM -0700, Zach Pfeffer wrote:
   Daniel Walker wrote:
   
   So if we include this code, which map implementations could you
   collapse into this implementation? Generally, what currently existing
   code can VCMM help to eliminate?
   In theory, it can eliminate all code that interoperates between IOMMU,
   CPU and non-IOMMU based devices and all the mapping code, alignment,
   mapping attribute and special block size support that's been
   implemented.
   
   That's a very abstract statement. Can you point to particular code files
   and give a rough sketch of how they could be improved using VCMM?
  
  I can. Not to single out a particular subsystem, but the video4linux
  code contains interoperation code to abstract the difference between
  sg buffers, vmalloc buffers and physically contiguous buffers. The
  VCMM is an attempt to provide a framework where these and all the
  other buffer types can be unified.
 
 Why can't video4linux use the DMA API? Doing DMA with vmalloc'ed
 buffers is a thing that we should avoid (there are some exceptions
 like xfs though).

I'm not sure, but I know that it makes the distinction. From
video4linux/videobuf:

media/videobuf-dma-sg.h   /* Physically scattered */  
media/videobuf-vmalloc.h  /* vmalloc() buffers*/  
media/videobuf-dma-contig.h   /* Physically contiguous */


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-13 Thread Zach Pfeffer
On Tue, Jul 13, 2010 at 03:03:25PM +0900, FUJITA Tomonori wrote:
 On Mon, 12 Jul 2010 22:57:06 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
  FUJITA Tomonori wrote:
   On Thu, 08 Jul 2010 16:59:52 -0700
   Zach Pfeffer zpfef...@codeaurora.org wrote:
   
   The problem I'm trying to solve boils down to this: map a set of
   contiguous physical buffers to an aligned IOMMU address. I need to
   allocate the set of physical buffers in a particular way: use 1 MB
   contiguous physical memory, then 64 KB, then 4 KB, etc. and I need to
   align the IOMMU address in a particular way.
   
   Sounds like the DMA API already supports what you want.
   
   You can set segment_boundary_mask in struct device_dma_parameters if
   you want to align the IOMMU address. See IOMMU implementations that
   support dma_get_seg_boundary() properly.
  
  That function takes the wrong argument in a VCM world:
  
  unsigned long dma_get_seg_boundary(struct device *dev);
  
  The boundary should be an attribute of the device side mapping,
  independent of the device. This would allow better code reuse.
 
 You mean that you want to specify this alignment attribute every time
 you create an IOMMU mapping? Then you can set segment_boundary_mask
 every time you create an IOMMU mapping. It's odd but it should work.

Kinda. I want to forget about IOMMUs, devices and CPUs. I just want to
create a mapping that has the alignment I specify, regardless of the
mapper. The mapping is created on a VCM and the VCM is associated with
a mapper: a CPU, an IOMMU'd device or a direct mapped device.
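
In other words, something like this hypothetical call, where the
alignment rides along with the reservation rather than with the device
(VCM_ALIGN_1M and VCM_CACHED are illustrative flags, not part of the
posted API):

  /* Hypothetical: ask whatever mapper sits behind 'vcm' (a CPU, an
   * IOMMU'd device or a direct-mapped device) for a 1 MB aligned,
   * cached reservation of 'len' bytes. */
  struct reservation *r = vcm_reserve(vcm, len, VCM_CACHED | VCM_ALIGN_1M);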

 
 Another possible solution is extending struct dma_attrs. We could add
 the alignment attribute to it.

That may be useful, but in the current DMA-API it may be seen as
redundant info.

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-12 Thread Zach Pfeffer
Joerg Roedel wrote:
 On Fri, Jul 02, 2010 at 12:09:02AM -0700, Zach Pfeffer wrote:
 Hari Kanigeri wrote:
 He demonstrated the usage of his code in one of the emails he sent out
 initially. Did you go over that, and what (or how many) step would you
 use with the current code to do the same thing?
 -- So is this patch set adding layers and abstractions to help the User ?

 If the idea is to share some memory across multiple devices, I guess
 you can achieve the same by calling the map function provided by iommu
 module and sharing the mapped address to the 10's or 100's of devices
 to access the buffers. You would only need a dedicated virtual pool
 per IOMMU device to manage its virtual memory allocations.
 Yeah, you can do that. My idea is to get away from explicit addressing
 and encapsulate the device address to physical address link into a
 mapping.
 
 The DMA-API already does this with the help of IOMMUs if they are
 present. What is the benefit of your approach over that?

The grist to the DMA-API mill is the opaque scatterlist. Each
scatterlist element brings together a physical address and a bus
address that may be different. The set of scatterlist elements
constitute both the set of physical buffers and the mappings to those
buffers. My approach separates these two things into a struct physmem
which contains the set of physical buffers and a struct reservation
which contains the set of bus addresses (or device addresses). Each
element in the struct physmem may be of various lengths (without
resorting to chaining). A map call maps the one set to the other. 

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-12 Thread Zach Pfeffer
Joerg Roedel wrote:
 On Thu, Jul 01, 2010 at 03:00:17PM -0700, Zach Pfeffer wrote:
 Additionally, the current IOMMU interface does not allow users to
 associate one page table with multiple IOMMUs [...]
 
 That's not true. Multiple IOMMUs are completely handled by the IOMMU
 drivers. In the case of the IOMMU-API backend drivers this also includes
 the ability to use page-tables on multiple IOMMUs.

Yeah. I see that now.

 
 Since the particular topology is run-time configurable all of these
 use-cases and more can be expressed without pushing the topology into
 the low-level IOMMU driver.
 
 The IOMMU driver has to know about the topology anyway because it needs
 to know which IOMMU it needs to program for a particular device.

Perhaps, but why not create a VCM which can be shared across all
mappers in the system? Why bury it in a device driver and make all
IOMMU device drivers manage their own virtual spaces? Practically
this would entail a minor refactor of the fledgling IOMMU interface:
adding associate and activate ops.

 
 Already, there are ~20 different IOMMU map implementations in the
 kernel. Had the Linux kernel had the VCMM, many of those
 implementations could have leveraged the mapping and topology
 management of a VCMM, while focusing on a few key hardware specific
 functions (map this physical address, program the page table base
 register).
 
 I partially agree here. All the IOMMU implementations in the Linux
 kernel have a lot of functionality in common where code could be
 shared. Work to share code has been done in the past by Fujita Tomonori
 but there are more places to work on. I am just not sure if a new
 front-end API is the right way to do this.

I don't really think it's a new front-end API. It's just an API that
allows easier mapping manipulation than the current APIs.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-12 Thread Zach Pfeffer
Joerg Roedel wrote:
 On Fri, Jul 02, 2010 at 12:33:51AM -0700, Zach Pfeffer wrote:
 Daniel Walker wrote:
 
 So if we include this code, which map implementations could you
 collapse into this implementation? Generally, what currently existing
 code can VCMM help to eliminate?
 In theory, it can eliminate all code that interoperates between IOMMU,
 CPU and non-IOMMU based devices and all the mapping code, alignment,
 mapping attribute and special block size support that's been
 implemented.
 
 That's a very abstract statement. Can you point to particular code files
 and give a rough sketch of how they could be improved using VCMM?

I can. Not to single out a particular subsystem, but the video4linux
code contains interoperation code to abstract the difference between
sg buffers, vmalloc buffers and physically contiguous buffers. The
VCMM is an attempt to provide a framework where these and all the
other buffer types can be unified.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-12 Thread Zach Pfeffer
Joerg Roedel wrote:
 On Thu, Jul 01, 2010 at 11:17:34PM -0700, Zach Pfeffer wrote:
 Andi Kleen wrote:
 
 Hmm? dma_map_* does not change any CPU mappings. It only sets up
 DMA mapping(s).
 Sure, but I was saying that iommu_map() doesn't just set up the IOMMU
 mappings; it sets up both the IOMMU and kernel buffer mappings.
 
 What do you mean by kernel buffer mappings?

In-kernel mappings whose addresses can be dereferenced. 

 
 
 That assumes that all the IOMMUs on the system support the same page table
 format, right?
 Actually no. Since the VCMM abstracts a page-table as a Virtual
 Contiguous Region (VCM) a VCM can be associated with any device,
 regardless of their individual page table format.
 
 The IOMMU-API abstracts a page-table as a domain which can also be
 associated with any device (behind an iommu).

It does, but only by convention. The domain member is just a big
catchall void *. It would be more useful to factor out a VCM
abstraction, with associated ops. As it stands, all IOMMU device driver
writers have to reinvent IOMMU virtual address management.
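
A sketch of what factoring that out might look like; the ops structure
and field names below are illustrative (prompted by the associate and
activate ops mentioned elsewhere in this thread), not an existing
interface:

  /* A virtual-space abstraction with its own ops: IOMMU drivers fill
   * in the hardware programming, while the address management lives
   * in one shared place instead of being reinvented per driver. */
  struct vcm_ops {
          int  (*associate)(struct vcm *vcm, struct device *dev);
          int  (*activate)(struct vcm *vcm, struct device *dev);
          int  (*map)(struct vcm *vcm, unsigned long dev_addr,
                      phys_addr_t pa, size_t len, u32 attr);
          void (*unmap)(struct vcm *vcm, unsigned long dev_addr,
                        size_t len);
  };

  struct vcm {
          unsigned long         start;   /* managed device-address range */
          unsigned long         len;
          const struct vcm_ops *ops;     /* IOMMU, CPU or one-to-one backend */
          void                 *priv;    /* e.g. page-table base */
  };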

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-12 Thread Zach Pfeffer
FUJITA Tomonori wrote:
 On Thu, 08 Jul 2010 16:59:52 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
 The problem I'm trying to solve boils down to this: map a set of
 contiguous physical buffers to an aligned IOMMU address. I need to
 allocate the set of physical buffers in a particular way: use 1 MB
 contiguous physical memory, then 64 KB, then 4 KB, etc. and I need to
 align the IOMMU address in a particular way.
 
 Sounds like the DMA API already supports what you want.
 
 You can set segment_boundary_mask in struct device_dma_parameters if
 you want to align the IOMMU address. See IOMMU implementations that
 support dma_get_seg_boundary() properly.

That function takes the wrong argument in a VCM world:

unsigned long dma_get_seg_boundary(struct device *dev);

The boundary should be an attribute of the device side mapping,
independent of the device. This would allow better code reuse.
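
For comparison, this is roughly how the per-device parameter gets set
today; dma_set_seg_boundary() operates on struct device_dma_parameters,
which is what makes the constraint a property of the device rather than
of an individual mapping (my_driver_probe() is just a placeholder):

  #include <linux/dma-mapping.h>

  static int my_driver_probe(struct device *dev)
  {
          /* Mappings for this device must not cross a 1 MB boundary
           * (the mask is size - 1); this then applies to every mapping
           * made for the device. Assumes dev->dma_parms has been set
           * up by the bus code. */
          return dma_set_seg_boundary(dev, 0x100000 - 1);
  }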

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-08 Thread Zach Pfeffer
Russell King - ARM Linux wrote:
 On Wed, Jul 07, 2010 at 03:44:27PM -0700, Zach Pfeffer wrote:
 The DMA API handles the allocation and use of DMA channels. It can
 configure physical transfer settings, manage scatter-gather lists,
 etc. 
 
 You're confused about what the DMA API is.  You're talking about
 the DMA engine subsystem (drivers/dma) not the DMA API (see
 Documentation/DMA-API.txt, include/linux/dma-mapping.h, and
 arch/arm/include/asm/dma-mapping.h)

Thanks for the clarification. 

 
 The VCM allows all device buffers to be passed between all devices in
 the system without passing those buffers through each domain's
 API. This means that instead of writing code to interoperate between
 DMA engines, IOMMU mapped spaces, CPUs and physically addressed
 devices the user can simply target a device with a buffer using the
 same API regardless of how that device maps or otherwise accesses the
 buffer.
 
 With the DMA API, if we have a SG list which refers to the physical
 pages (as a struct page, offset, length tuple), the DMA API takes
 care of dealing with CPU caches and IOMMUs to make the data in the
 buffer visible to the target device.  It provides you with a set of
 cookies referring to the SG lists, which may be coalesced if the
 IOMMU can do so.
 
 If you have a kernel virtual address, the DMA API has single buffer
 mapping/unmapping functions to do the same thing, and provide you
 with a cookie to pass to the device to refer to that buffer.
 
 These cookies are whatever the device needs to be able to access
 the buffer - for instance, if system SDRAM is located at 0xc0000000
 virtual, 0x80000000 physical and 0x40000000 as far as the DMA device
 is concerned, then the cookie for a buffer at 0xc0000000 virtual will
 be 0x40000000 and not 0x80000000.

It sounds like I've got some work to do. I appreciate the feedback.

The problem I'm trying to solve boils down to this: map a set of
contiguous physical buffers to an aligned IOMMU address. I need to
allocate the set of physical buffers in a particular way: use 1 MB
contiguous physical memory, then 64 KB, then 4 KB, etc. and I need to
align the IOMMU address in a particular way. I also need to swap out the
IOMMU address spaces and map the buffers into the kernel.

I have this all solved, but it sounds like I'll need to migrate to the DMA
API to upstream it.
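
For reference, the allocation pattern described above is what the posted
patches call max munch; a simplified sketch of the idea follows
(pool_take() and struct phys_chunk are hypothetical helpers standing in
for the real pool code):

  /* Satisfy a request greedily from the largest available block size
   * downwards, so big buffers are built mostly from 1 MB chunks (fewer
   * TLB entries) with smaller chunks filling the remainder. */
  static const size_t block_sizes[] = { SZ_1M, SZ_64K, SZ_4K };

  static int alloc_max_munch(size_t len, struct list_head *chunks)
  {
          int i;

          for (i = 0; i < ARRAY_SIZE(block_sizes) && len; i++) {
                  while (len >= block_sizes[i]) {
                          struct phys_chunk *c = pool_take(block_sizes[i]);

                          if (!c)
                                  break;  /* fall back to smaller blocks */
                          list_add_tail(&c->list, chunks);
                          len -= block_sizes[i];
                  }
          }
          return len ? -ENOMEM : 0;
  }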

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


[RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-06 Thread Zach Pfeffer
This patch contains the documentation for the API, termed the Virtual
Contiguous Memory Manager. Its use would allow all of the IOMMU to VM,
VM to device and device to IOMMU interoperation code to be refactored
into platform independent code.

Comments, suggestions and criticisms are welcome and wanted.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 Documentation/vcm.txt |  587 +
 1 files changed, 587 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/vcm.txt

diff --git a/Documentation/vcm.txt b/Documentation/vcm.txt
new file mode 100644
index 000..1c6a8be
--- /dev/null
+++ b/Documentation/vcm.txt
@@ -0,0 +1,587 @@
+What is this document about?
+
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the first implementation works with a specific low-level
+Input/Output Memory Management Unit (IOMMU) and the way the VCMM is used
+from user-space. It also contains a section that describes why something
+like the VCMM is needed in the kernel.
+
+If anything in this document is wrong, please send patches to the
+maintainer of this file, listed at the bottom of the document.
+
+
+The Virtual Contiguous Memory Manager
+=
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses. It also insulates
+the system from spurious or malicious device bus transactions and allows
+fine-grained mapping attribute control. The Linux kernel core does not
+contain a generic API to handle IOMMU mapped memory; device driver writers
+must implement device specific code to interoperate with the Linux kernel
+core. As the number of IOMMUs increases, coordinating the many address
+spaces mapped by all discrete IOMMUs becomes difficult without in-kernel
+support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device interoperation
+by treating devices with or without IOMMUs and all CPUs with or without
+MMUs, their mapping contexts and their mappings using common
+abstractions. Physical hardware is given a generic device type and mapping
+contexts are abstracted into Virtual Contiguous Memory (VCM)
+regions. Users reserve memory from VCMs and back their reservations
+with physical memory.
+
+Why the VCMM is Needed
+--
+
+Driver writers who control devices with IOMMUs must contend with device
+control and memory management. Driver writers have a large device driver
+API that they can leverage to control their devices, but they are lacking
+a unified API to help them program mappings into IOMMUs and share those
+mappings with other devices and CPUs in the system.
+
+Sharing is complicated by Linux's CPU-centric VMM. The CPU-centric model
+generally makes sense because average hardware only contains a MMU for the
+CPU and possibly a graphics MMU. If every device in the system has one or
+more MMUs the CPU-centric memory management (MM) programming model breaks
+down.
+
+Abstracting IOMMU device programming into a common API has already begun
+in the Linux kernel. It was built to abstract the difference between AMD
+and Intel IOMMUs to support x86 virtualization on both platforms. The
+interface is listed in include/linux/iommu.h. It contains
+interfaces for mapping and unmapping as well as domain management. This
+interface has not gained widespread use outside the x86; PA-RISC, Alpha
+and SPARC architectures and ARM and PowerPC platforms all use their own
+mapping modules to control their IOMMUs. The VCMM contains an IOMMU
+programming layer, but since its abstraction supports map management
+independent of device control, the layer is not used directly. This
+higher-level view enables a new kernel service, not just an IOMMU
+interoperation layer.
+
+The General Idea: Map Management using Graphs
+-
+
+Looking at mapping from a system-wide perspective reveals a general graph
+problem. The VCMM's API is built to manage the general mapping graph. Each
+node that talks to memory, either through an MMU or directly (physically
+mapped) can be thought of as the device-end of a mapping edge. The other
+edge is the physical memory (or intermediate virtual space) that is
+mapped.
+
+In the direct-mapped case the device is assigned a one-to-one MMU. This
+scheme allows direct mapped devices to participate in general graph
+management.
+
+The CPU nodes can also be brought under the same mapping abstraction with
+the use of a light overlay on the existing VMM. This light overlay allows
+VMM-managed mappings to interoperate with the common API. The light
+overlay enables this without substantial modifications to the existing
+VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote CPU
+nodes that may

[RFC 2/3] mm: iommu: A physical allocator for the VCMM

2010-07-06 Thread Zach Pfeffer
The Virtual Contiguous Memory Manager (VCMM) needs a physical pool to
allocate from. It breaks up the pool into sub-pools of same-sized
chunks. In particular, it breaks the pool it manages into sub-pools of
1 MB, 64 KB and 4 KB chunks.

When a user makes a request, this allocator satisfies that request
from the sub-pools using a maximum-munch strategy. This strategy
attempts to satisfy a request using the largest chunk-size without
over-allocating, then moving on to the next smallest size without
over-allocating and finally completing the request with the smallest
sized chunk, over-allocating if necessary. For example, a 1,092 KB
request is satisfied with one 1 MB chunk, one 64 KB chunk and one
4 KB chunk, with no over-allocation.

The maximum-munch strategy allows physical page allocation for small
TLBs that need to map a given range using the minimum number of mappings.

Although the allocator has been configured for 1 MB, 64 KB and 4 KB
chunks, it can be easily extended to other chunk sizes.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 arch/arm/mm/vcm_alloc.c   |  425 +
 include/linux/vcm_alloc.h |   70 
 2 files changed, 495 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/mm/vcm_alloc.c
 create mode 100644 include/linux/vcm_alloc.h

diff --git a/arch/arm/mm/vcm_alloc.c b/arch/arm/mm/vcm_alloc.c
new file mode 100644
index 000..e592e71
--- /dev/null
+++ b/arch/arm/mm/vcm_alloc.c
@@ -0,0 +1,425 @@
+/* Copyright (c) 2010, Code Aurora Forum. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/vcm_alloc.h>
+#include <linux/string.h>
+#include <asm/sizes.h>
+
+/* Amount of memory managed by VCM */
+#define TOTAL_MEM_SIZE SZ_32M
+
+static unsigned int base_pa = 0x8000;
+int basicalloc_init;
+
+int chunk_sizes[NUM_CHUNK_SIZES] = {SZ_1M, SZ_64K, SZ_4K};
+int init_num_chunks[] = {
+   (TOTAL_MEM_SIZE/2) / SZ_1M,
+   (TOTAL_MEM_SIZE/4) / SZ_64K,
+   (TOTAL_MEM_SIZE/4) / SZ_4K
+};
+#define LAST_SZ() (ARRAY_SIZE(chunk_sizes) - 1)
+
+#define vcm_alloc_err(a, ...)  \
+   pr_err("ERROR %s %i " a, __func__, __LINE__, ##__VA_ARGS__)
+
+struct phys_chunk_head {
+   struct list_head head;
+   int num;
+};
+
+struct phys_mem {
+   struct phys_chunk_head heads[ARRAY_SIZE(chunk_sizes)];
+} phys_mem;
+
+static int is_allocated(struct list_head *allocated)
+{
+   /* This should not happen under normal conditions */
+   if (!allocated) {
+   vcm_alloc_err("no allocated\n");
+   return 0;
+   }
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+   return !list_empty(allocated);
+}
+
+static int count_allocated_size(enum chunk_size_idx idx)
+{
+   int cnt = 0;
+   struct phys_chunk *chunk, *tmp;
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   list_for_each_entry_safe(chunk, tmp,
+&phys_mem.heads[idx].head, list) {
+   if (is_allocated(&chunk->allocated))
+   cnt++;
+   }
+
+   return cnt;
+}
+
+
+int vcm_alloc_get_mem_size(void)
+{
+   return TOTAL_MEM_SIZE;
+}
+EXPORT_SYMBOL(vcm_alloc_get_mem_size);
+
+
+int vcm_alloc_blocks_avail(enum chunk_size_idx idx)
+{
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   return phys_mem.heads[idx].num;
+}
+EXPORT_SYMBOL(vcm_alloc_blocks_avail);
+
+
+int vcm_alloc_get_num_chunks(void)
+{
+   return ARRAY_SIZE(chunk_sizes);
+}
+EXPORT_SYMBOL(vcm_alloc_get_num_chunks);
+
+
+int vcm_alloc_all_blocks_avail(void)
+{
+   int i;
+   int cnt = 0;
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(chunk_sizes); ++i)
+   cnt += vcm_alloc_blocks_avail(i);
+   return cnt;
+}
+EXPORT_SYMBOL(vcm_alloc_all_blocks_avail);
+
+
+int vcm_alloc_count_allocated(void)
+{
+   int i;
+   int cnt = 0;
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   for (i = 0; i < ARRAY_SIZE

Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-03 Thread Zach Pfeffer
Andi Kleen wrote:
 The standard Linux approach to such a problem is to write
 a library that drivers can use for common functionality, not put a middle 
 layer in between. Libraries are much more flexible than layers.

I've been thinking about this statement. It's very true. I use the
genalloc lib, which is a great piece of software, to manage VCMs
(domains in linux/iommu.h parlance?).

On our hardware we have 3 things we have to do, use the minimum set of
mappings to map a buffer because of the extremely small TLBs in all the
IOMMUs we have to support, use special virtual alignments and direct
various multimedia flows through certain IOMMUs. To support this we:

1. Use the genalloc lib to allocate virtual space for our IOMMUs,
allowing virtual alignment to be specified.

2. Have a maxmunch allocator that manages our own physical pool.

I think I may be able to support this using the iommu interface and
some util functions. The big thing that's lost is the unified topology
management, but as demonstrated that may fall out from a refactor.
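A minimal sketch of point 1 above using the stock genalloc calls; the
device-virtual window base and size are made-up values, and alignment is
only approximated by the pool's 1 MB minimum allocation order, since the
stock allocator takes no per-allocation alignment argument:

#include <linux/genalloc.h>
#include <linux/errno.h>
#include <asm/sizes.h>

static struct gen_pool *iommu_va_pool;

/* Carve out a 256 MB device-virtual window and hand out regions from it. */
static int init_iommu_va_space(void)
{
    iommu_va_pool = gen_pool_create(20, -1);    /* 2^20 = 1 MB granules */
    if (!iommu_va_pool)
        return -ENOMEM;
    return gen_pool_add(iommu_va_pool, 0x40000000UL, SZ_256M, -1);
}

static unsigned long alloc_iommu_va(size_t len)
{
    return gen_pool_alloc(iommu_va_pool, len);  /* returns 0 on failure */
}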

Anyhow, sounds like a few things to try. Thanks for the feedback so
far. I'll do some refactoring and see what's missing.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-02 Thread Zach Pfeffer
Andi Kleen wrote:
 The VCMM provides a more abstract, global view with finer-grained
 control of each mapping a user wants to create. For instance, the
 semantics of iommu_map preclude its use in setting up just the IOMMU
 side of a mapping. With a one-sided map, two IOMMU devices can be
 
 Hmm? dma_map_* does not change any CPU mappings. It only sets up
 DMA mapping(s).

Sure, but I was saying that iommu_map() doesn't just set up the IOMMU
mappings; it sets up both the IOMMU and kernel buffer mappings.

 
 Additionally, the current IOMMU interface does not allow users to
 associate one page table with multiple IOMMUs unless the user explicitly
 
 That assumes that all the IOMMUs on the system support the same page table
 format, right?

Actually no. Since the VCMM abstracts a page-table as a Virtual
Contiguous Memory (VCM) region, a VCM can be associated with any device,
regardless of its individual page-table format.
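For illustration, reusing the calls from the example posted elsewhere in
this thread, with dev_a and dev_b standing in for two bus masters whose
IOMMUs use different page-table formats:

vcm_shared = vcm_create(0x1000, SZ_128M);

avcm_a = vcm_assoc(vcm_shared, dev_a, 0);
avcm_b = vcm_assoc(vcm_shared, dev_b, 0);

vcm_activate(avcm_a);
vcm_activate(avcm_b);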

 
 As I understand your approach would help if you have different
 IOMMus with an different low level interface, which just 
 happen to have the same pte format. Is that very likely?
 
 I would assume if you have lots of copies of the same IOMMU
 in the system then you could just use a single driver with multiple
 instances that share some state for all of them.  That model
 would fit in the current interfaces. There's no reason multiple
 instances couldn't share the same allocation data structure.
 
 And if you have lots of truly different IOMMUs then they likely
 won't be able to share PTEs at the hardware level anyways, because
 the formats are too different.

See VCM's above.

 
 The VCMM takes the long view. Its designed for a future in which the
 number of IOMMUs will go up and the ways in which these IOMMUs are
 composed will vary from system to system, and may vary at
 runtime. Already, there are ~20 different IOMMU map implementations in
 the kernel. Had the Linux kernel had the VCMM, many of those
 implementations could have leveraged the mapping and topology management
 of a VCMM, while focusing on a few key hardware specific functions (map
 this physical address, program the page table base register).
 
 The standard Linux approach to such a problem is to write
 a library that drivers can use for common functionality, not put a middle 
 layer in between. Libraries are much more flexible than layers.

That's true up to the "is this middle layer so useful that it's worth
it" point. The VM is a middle layer; you could make the same argument
about it: the mapping code isn't too hard, just map in the memory
that you need and be done with it. But the VM middle layer provides a
clean separation between page frames and pages which turns out to be
infinitely useful. The VCMM is built in the same spirit. It says
things like: mapping is a global problem, I'm going to abstract
entire virtual spaces and allow people arbitrary chunk-size
allocation, and I'm not going to care that my device is physically
mapping this buffer while this other device is a virtual, virtual
device.

 
 That said I'm not sure there's all that much duplicated code anyways.
 A lot of the code is always IOMMU specific. The only piece
 which might be shareable is the mapping allocation, but I don't
 think that's very much of a typical driver
 
 In my old pci-gart driver the allocation was all only a few lines of code, 
 although given it was somewhat dumb in this regard because it only managed a 
 small remapping window.

I agree that it's not a lot of code, and that this layer may be a bit heavy,
but what I'd like to focus on is whether a global mapping view is useful and,
if so, whether something like the graph management that the VCMM provides is
generally useful.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-02 Thread Zach Pfeffer
Hari Kanigeri wrote:
 He demonstrated the usage of his code in one of the emails he sent out
 initially. Did you go over that, and what (or how many) step would you
 use with the current code to do the same thing?
 
 -- So is this patch set adding layers and abstractions to help the User ?
 
 If the idea is to share some memory across multiple devices, I guess
 you can achieve the same by calling the map function provided by iommu
 module and sharing the mapped address to the 10's or 100's of devices
 to access the buffers. You would only need a dedicated virtual pool
 per IOMMU device to manage its virtual memory allocations.

Yeah, you can do that. My idea is to get away from explicit addressing
and encapsulate the device address to physical address link into a
mapping.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-02 Thread Zach Pfeffer
Hari Kanigeri wrote:
 The VCMM takes the long view. Its designed for a future in which the
 number of IOMMUs will go up and the ways in which these IOMMUs are
 composed will vary from system to system, and may vary at
 runtime. Already, there are ~20 different IOMMU map implementations in
 the kernel. Had the Linux kernel had the VCMM, many of those
 implementations could have leveraged the mapping and topology management
 of a VCMM, while focusing on a few key hardware specific functions (map
 this physical address, program the page table base register).

 
 -- Sounds good.
 Did you think of a way to handle the cases where one of the Device
 that is using the mapped address crashed ?
 How is the physical address unbacked in this case ?

Actually the API takes care of that by design. Since the physical
space is managed apart from the mapper, the mapper can crash without
affecting the physical memory allocation.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-02 Thread Zach Pfeffer
Daniel Walker wrote:
 On Thu, 2010-07-01 at 15:00 -0700, Zach Pfeffer wrote:
 
 
 Additionally, the current IOMMU interface does not allow users to
 associate one page table with multiple IOMMUs unless the user explicitly
 wrote a muxed device underneath the IOMMU interface. This also could be
 done, but would have to be done for every such use case. Since the
 particular topology is run-time configurable all of these use-cases and
 more can be expressed without pushing the topology into the low-level
 IOMMU driver.

 The VCMM takes the long view. Its designed for a future in which the
 number of IOMMUs will go up and the ways in which these IOMMUs are
 composed will vary from system to system, and may vary at
 runtime. Already, there are ~20 different IOMMU map implementations in
 the kernel. Had the Linux kernel had the VCMM, many of those
 implementations could have leveraged the mapping and topology management
 of a VCMM, while focusing on a few key hardware specific functions (map
 this physical address, program the page table base register).
 
 So if we include this code, which map implementations could you
 collapse into this implementation? Generally, what currently existing
 code can VCMM help to eliminate?

In theory, it can eliminate all code that interoperates between IOMMU,
CPU and non-IOMMU based devices, along with all of the mapping, alignment,
mapping attribute and special block size support that's been
implemented.


-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-02 Thread Zach Pfeffer
Andi Kleen wrote:
 On Thu, Jul 01, 2010 at 11:17:34PM -0700, Zach Pfeffer wrote:
 Andi Kleen wrote:
 The VCMM provides a more abstract, global view with finer-grained
 control of each mapping a user wants to create. For instance, the
 semantics of iommu_map preclude its use in setting up just the IOMMU
 side of a mapping. With a one-sided map, two IOMMU devices can be
 Hmm? dma_map_* does not change any CPU mappings. It only sets up
 DMA mapping(s).
 Sure, but I was saying that iommu_map() doesn't just set up the IOMMU
 mappings, its sets up both the iommu and kernel buffer mappings.
 
 Normally the data is already in the kernel or mappings, so why
 would you need another CPU mapping too? Sometimes the CPU
 code has to scatter-gather, but that is considered acceptable
 (and if it really cannot be rewritten to support sg it's better
 to have an explicit vmap operation) 
 
 In general on larger systems with many CPUs changing CPU mappings
 also gets expensive (because you have to communicate with all cores), 
 and is not a good idea on frequent IO paths.

That's all true, but what a VCMM allows is for these trade-offs to be
made by the user for future systems. It may not be too expensive to
change the IO path around on future chips or the user may be okay with
the performance penalty. A VCMM doesn't enforce a policy on the user,
it lets the user make their own policy.


 Additionally, the current IOMMU interface does not allow users to
 associate one page table with multiple IOMMUs unless the user explicitly
 That assumes that all the IOMMUs on the system support the same page table
 format, right?
 Actually no. Since the VCMM abstracts a page-table as a Virtual
 Contiguous Region (VCM) a VCM can be associated with any device,
 regardless of their individual page table format.
 
 But then there is no real page table sharing, isn't it? 
 The real information should be in the page tables, nowhere else.

Yeah, and the implementation ensures that. The VCMM just adds a few
fields like start_addr, len and the device. The device still manages
its own page-tables.

 The standard Linux approach to such a problem is to write
 a library that drivers can use for common functionality, not put a middle 
 layer in between. Libraries are much more flexible than layers.
 That's true up to the, is this middle layer so useful that its worth
 it point. The VM is a middle layer, you could make the same argument
 about it, the mapping code isn't too hard, just map in the memory
 that you need and be done with it. But the VM middle layer provides a
 clean separation between page frames and pages which turns out to be
 
 Actually we use both PFNs and struct page *s in many layers up
 and down, there's not really any layering in that.

Sure, but the PFNs and the struct page *s are the middle layer. It's
just that things haven't been layered on top of them. A VCMM is the
higher-level abstraction, since it allows the size of the page frames
to vary and the consumers of the VCMs to be determined at run-time.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


[RFC 1/3 v3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-02 Thread Zach Pfeffer
This patch contains the documentation for the API, termed the Virtual
Contiguous Memory Manager. Its use would allow all of the IOMMU to VM,
VM to device and device to IOMMU interoperation code to be refactored
into platform independent code.

Comments, suggestions and criticisms are welcome and wanted.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 Documentation/vcm.txt |  587 +
 1 files changed, 587 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/vcm.txt

diff --git a/Documentation/vcm.txt b/Documentation/vcm.txt
new file mode 100644
index 000..1c6a8be
--- /dev/null
+++ b/Documentation/vcm.txt
@@ -0,0 +1,587 @@
+What is this document about?
+
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the first implementation works with a specific low-level
+Input/Output Memory Management Unit (IOMMU) and the way the VCMM is used
+from user-space. It also contains a section that describes why something
+like the VCMM is needed in the kernel.
+
+If anything in this document is wrong, please send patches to the
+maintainer of this file, listed at the bottom of the document.
+
+
+The Virtual Contiguous Memory Manager
+=
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses. It also insulates
+the system from spurious or malicious device bus transactions and allows
+fine-grained mapping attribute control. The Linux kernel core does not
+contain a generic API to handle IOMMU mapped memory; device driver writers
+must implement device specific code to interoperate with the Linux kernel
+core. As the number of IOMMUs increases, coordinating the many address
+spaces mapped by all discrete IOMMUs becomes difficult without in-kernel
+support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device interoperation
+by treating devices with or without IOMMUs and all CPUs with or without
+MMUs, their mapping contexts and their mappings using common
+abstractions. Physical hardware is given a generic device type and mapping
+contexts are abstracted into Virtual Contiguous Memory (VCM)
+regions. Users reserve memory from VCMs and back their reservations
+with physical memory.
+
+Why the VCMM is Needed
+--
+
+Driver writers who control devices with IOMMUs must contend with device
+control and memory management. Driver writers have a large device driver
+API that they can leverage to control their devices, but they are lacking
+a unified API to help them program mappings into IOMMUs and share those
+mappings with other devices and CPUs in the system.
+
+Sharing is complicated by Linux's CPU-centric VMM. The CPU-centric model
+generally makes sense because average hardware only contains a MMU for the
+CPU and possibly a graphics MMU. If every device in the system has one or
+more MMUs the CPU-centric memory management (MM) programming model breaks
+down.
+
+Abstracting IOMMU device programming into a common API has already begun
+in the Linux kernel. It was built to abstract the difference between AMD
+and Intel IOMMUs to support x86 virtualization on both platforms. The
+interface is listed in include/linux/iommu.h. It contains
+interfaces for mapping and unmapping as well as domain management. This
+interface has not gained widespread use outside the x86; PA-RISC, Alpha
+and SPARC architectures and ARM and PowerPC platforms all use their own
+mapping modules to control their IOMMUs. The VCMM contains an IOMMU
+programming layer, but since its abstraction supports map management
+independent of device control, the layer is not used directly. This
+higher-level view enables a new kernel service, not just an IOMMU
+interoperation layer.
+
+The General Idea: Map Management using Graphs
+-
+
+Looking at mapping from a system-wide perspective reveals a general graph
+problem. The VCMM's API is built to manage the general mapping graph. Each
+node that talks to memory, either through an MMU or directly (physically
+mapped) can be thought of as the device-end of a mapping edge. The other
+edge is the physical memory (or intermediate virtual space) that is
+mapped.
+
+In the direct-mapped case the device is assigned a one-to-one MMU. This
+scheme allows direct mapped devices to participate in general graph
+management.
+
+The CPU nodes can also be brought under the same mapping abstraction with
+the use of a light overlay on the existing VMM. This light overlay allows
+VMM-managed mappings to interoperate with the common API. The light
+overlay enables this without substantial modifications to the existing
+VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote CPU
+nodes that may

Re: [RFC 1/3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-01 Thread Zach Pfeffer
Thank you for the corrections. I'm correcting them now. Some responses:

Randy Dunlap wrote:
 +struct vcm *vcm_create(size_t start_addr, size_t len);
 
 Seems odd to use size_t for start_addr.

I used size_t because I wanted to allow the start_addr the same range
as len. Is there a better type to use? I see 'unsigned long' used
throughout the mm code. Perhaps that's better for both the start_addr
and len.
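For concreteness, the alternative under discussion would read (just a
possibility, not the posted API):

struct vcm *vcm_create(unsigned long start_addr, unsigned long len);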


 +A Reservation is created and destroyed with:
 +
 +struct res *vcm_reserve(struct vcm *vcm, size_t len, uint32_t attr);
 
 s/uint32_t/u32/ ?

Sure.


 +Associate and activate all three to their respective devices:
 +
 +avcm_iommu = vcm_assoc(vcm_iommu, dev_iommu, attr0);
 +avcm_onetoone = vcm_assoc(vcm_onetoone, dev_onetoone, attr1);
 +avcm_vmm = vcm_assoc(vcm_vmm, dev_cpu, attr2);
 
 error handling on vcm_assoc() failures?

I'll add the deassociate call to the example.


 +res_iommu = vcm_reserve(vcm_iommu, SZ_2MB + SZ_4K, attr);
 +res_onetoone = vcm_reserve(vcm_onetoone, SZ_2MB + SZ_4K, attr);
 +res_vmm = vcm_reserve(vcm_vmm, SZ_2MB + SZ_4K, attr);
 
 error handling?

I'll add it here too.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


[RFC 1v2/3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-07-01 Thread Zach Pfeffer
This patch contains the documentation for the API, termed the Virtual
Contiguous Memory Manager. Its use would allow all of the IOMMU to VM,
VM to device and device to IOMMU interoperation code to be refactored
into platform independent code.

Comments, suggestions and criticisms are welcome and wanted.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 Documentation/vcm.txt |  587 +
 1 files changed, 587 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/vcm.txt

diff --git a/Documentation/vcm.txt b/Documentation/vcm.txt
new file mode 100644
index 000..b9029db
--- /dev/null
+++ b/Documentation/vcm.txt
@@ -0,0 +1,587 @@
+What is this document about?
+
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the first implementation works with a specific low-level
+Input/Output Memory Management Unit (IOMMU) and the way the VCMM is used
+from user-space. It also contains a section that describes why something
+like the VCMM is needed in the kernel.
+
+If anything in this document is wrong, please send patches to the
+maintainer of this file, listed at the bottom of the document.
+
+
+The Virtual Contiguous Memory Manager
+=
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses. It also insulates
+the system from spurious or malicious device bus transactions and allows
+fine-grained mapping attribute control. The Linux kernel core does not
+contain a generic API to handle IOMMU mapped memory; device driver writers
+must implement device specific code to interoperate with the Linux kernel
+core. As the number of IOMMUs increases, coordinating the many address
+spaces mapped by all discrete IOMMUs becomes difficult without in-kernel
+support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device interoperation
+by treating devices with or without IOMMUs and all CPUs with or without
+MMUs, their mapping contexts and their mappings using common
+abstractions. Physical hardware is given a generic device type and mapping
+contexts are abstracted into Virtual Contiguous Memory (VCM)
+regions. Users reserve memory from VCMs and back their reservations
+with physical memory.
+
+Why the VCMM is Needed
+--
+
+Driver writers who control devices with IOMMUs must contend with device
+control and memory management. Driver writers have a large device driver
+API that they can leverage to control their devices, but they are lacking
+a unified API to help them program mappings into IOMMUs and share those
+mappings with other devices and CPUs in the system.
+
+Sharing is complicated by Linux's CPU-centric VMM. The CPU-centric model
+generally makes sense because average hardware only contains a MMU for the
+CPU and possibly a graphics MMU. If every device in the system has one or
+more MMUs the CPU-centric memory management (MM) programming model breaks
+down.
+
+Abstracting IOMMU device programming into a common API has already begun
+in the Linux kernel. It was built to abstract the difference between AMD
+and Intel IOMMUs to support x86 virtualization on both platforms. The
+interface is listed in include/linux/iommu.h. It contains
+interfaces for mapping and unmapping as well as domain management. This
+interface has not gained widespread use outside the x86; PA-RISC, Alpha
+and SPARC architectures and ARM and PowerPC platforms all use their own
+mapping modules to control their IOMMUs. The VCMM contains an IOMMU
+programming layer, but since its abstraction supports map management
+independent of device control, the layer is not used directly. This
+higher-level view enables a new kernel service, not just an IOMMU
+interoperation layer.
+
+The General Idea: Map Management using Graphs
+-
+
+Looking at mapping from a system-wide perspective reveals a general graph
+problem. The VCMM's API is built to manage the general mapping graph. Each
+node that talks to memory, either through an MMU or directly (physically
+mapped) can be thought of as the device-end of a mapping edge. The other
+edge is the physical memory (or intermediate virtual space) that is
+mapped.
+
+In the direct-mapped case the device is assigned a one-to-one MMU. This
+scheme allows direct mapped devices to participate in general graph
+management.
+
+The CPU nodes can also be brought under the same mapping abstraction with
+the use of a light overlay on the existing VMM. This light overlay allows
+VMM-managed mappings to interoperate with the common API. The light
+overlay enables this without substantial modifications to the existing
+VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote CPU
+nodes that may

Re: [RFC 3/3] mm: iommu: The Virtual Contiguous Memory Manager

2010-07-01 Thread Zach Pfeffer
Andi Kleen wrote:

 Also for me it's still quite unclear why we would want this code at all...
 It doesn't seem to do anything you couldn't do with the existing interfaces.
 I don't know all that much about what Zach's done here, but from what
 he's said so far it looks like this help to manage lots of IOMMUs on a
 single system.. On x86 it seems like there's not all that many IOMMUs in
 comparison .. Zach mentioned 10 to 100 IOMMUs ..
 
 The current code can manage multiple IOMMUs fine.

That's fair. The current code does manage multiple IOMMUs without issue
for a static map topology. Its core function 'map' maps a physical chunk
of some size into an IOMMU's address space and the kernel's address
space for some domain.

The VCMM provides a more abstract, global view with finer-grained
control of each mapping a user wants to create. For instance, the
semantics of iommu_map preclude its use in setting up just the IOMMU
side of a mapping. With a one-sided map, two IOMMU devices can be
pointed to the same physical memory without mapping that same memory
into the kernel's address space.

Additionally, the current IOMMU interface does not allow users to
associate one page table with multiple IOMMUs unless the user explicitly
wrote a muxed device underneath the IOMMU interface. This also could be
done, but would have to be done for every such use case. Since the
particular topology is run-time configurable all of these use-cases and
more can be expressed without pushing the topology into the low-level
IOMMU driver.

The VCMM takes the long view. It's designed for a future in which the
number of IOMMUs will go up and the ways in which these IOMMUs are
composed will vary from system to system, and may vary at
runtime. Already, there are ~20 different IOMMU map implementations in
the kernel. Had the Linux kernel had the VCMM, many of those
implementations could have leveraged the mapping and topology management
of a VCMM, while focusing on a few key hardware specific functions (map
this physical address, program the page table base register).
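To make that split concrete, the hardware specific remainder might be
little more than a small ops structure along these lines; the names and
signatures are purely illustrative and not part of the posted patches:

struct vcm_iommu_ops {
    /* map one physical extent at a device virtual address */
    int  (*map_phys)(void *ctx, unsigned long dev_addr,
                     phys_addr_t pa, size_t len, u32 attr);
    /* program the page table base register */
    void (*set_pt_base)(void *ctx, phys_addr_t pt_base);
};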

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


[RFC 1/3] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-06-29 Thread Zach Pfeffer
This patch contains the documentation for the API, termed the Virtual
Contiguous Memory Manager. Its use would allow all of the IOMMU to VM,
VM to device and device to IOMMU interoperation code to be refactored
into platform independent code.

Comments, suggestions and criticisms are welcome and wanted.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 Documentation/vcm.txt |  583 +
 1 files changed, 583 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/vcm.txt

diff --git a/Documentation/vcm.txt b/Documentation/vcm.txt
new file mode 100644
index 000..d29c757
--- /dev/null
+++ b/Documentation/vcm.txt
@@ -0,0 +1,583 @@
+What is this document about?
+
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the first implementation works with a specific low-level
+Input/Output Memory Management Unit (IOMMU) and the way the VCMM is used
+from user-space. It also contains a section that describes why something
+like the VCMM is needed in the kernel.
+
+If anything in this document is wrong, please send patches to the
+maintainer of this file, listed at the bottom of the document.
+
+
+The Virtual Contiguous Memory Manager
+=
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses. It also insulates
+the system from spurious or malicious device bus transactions and allows
+fine-grained mapping attribute control. The Linux kernel core does not
+contain a generic API to handle IOMMU mapped memory; device driver writers
+must implement device specific code to interoperate with the Linux kernel
+core. As the number of IOMMUs increases, coordinating the many address
+spaces mapped by all discrete IOMMUs becomes difficult without in-kernel
+support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device interoperation
+by treating devices with or without IOMMUs and all CPUs with or without
+MMUs, their mapping contexts and their mappings using common
+abstractions. Physical hardware is given a generic device type and mapping
+contexts are abstracted into Virtual Contiguous Memory (VCM)
+regions. Users reserve memory from VCMs and back their reservations
+with physical memory.
+
+Why the VCMM is Needed
+--
+
+Driver writers who control devices with IOMMUs must contend with device
+control and memory management. Driver writers have a large device driver
+API that they can leverage to control their devices, but they are lacking
+a unified API to help them program mappings into IOMMUs and share those
+mappings with other devices and CPUs in the system.
+
+Sharing is complicated by Linux's CPU centric VMM. The CPU centric model
+generally makes sense because average hardware only contains a MMU for the
+CPU and possibly a graphics MMU. If every device in the system has one or
+more MMUs the CPU centric memory management (MM) programming model breaks
+down.
+
+Abstracting IOMMU device programming into a common API has already begun
+in the Linux kernel. It was built to abstract the difference between AMDs
+and Intels IOMMUs to support x86 virtualization on both platforms. The
+interface is listed in kernel/include/linux/iommu.h. It contains
+interfaces for mapping and unmapping as well as domain management. This
+interface has not gained widespread use outside the x86; PA-RISC, Alpha
+and SPARC architectures and ARM and PowerPC platforms all use their own
+mapping modules to control their IOMMUs. The VCMM contains an IOMMU
+programming layer, but since its abstraction supports map management
+independent of device control, the layer is not used directly. This
+higher-level view enables a new kernel service, not just an IOMMU
+interoperation layer.
+
+The General Idea: Map Management using Graphs
+-
+
+Looking at mapping from a system-wide perspective reveals a general graph
+problem. The VCMM's API is built to manage the general mapping graph. Each
+node that talks to memory, either through an MMU or directly (physically
+mapped) can be thought of as the device-end of a mapping edge. The other
+edge is the physical memory (or intermediate virtual space) that is
+mapped.
+
+In the direct mapped case the device is assigned a one-to-one MMU. This
+scheme allows direct mapped devices to participate in general graph
+management.
+
+The CPU nodes can also be brought under the same mapping abstraction with
+the use of a light overlay on the existing VMM. This light overlay allows
+VMM managed mappings to interoperate with the common API. The light
+overlay enables this without substantial modifications to the existing
+VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote CPU
+nodes that may

[RFC 2/3] mm: iommu: A physical allocator for the VCMM

2010-06-29 Thread Zach Pfeffer
The Virtual Contiguous Memory Manager (VCMM) needs a physical pool to
allocate from. It breaks up the pool into sub-pools of same-sized
chunks. In particular, it breaks the pool it manages into sub-pools of
1 MB, 64 KB and 4 KB chunks.

When a user makes a request, this allocator satisfies that request
from the sub-pools using a maximum-munch strategy. This strategy
attempts to satisfy a request using the largest chunk-size without
over-allocating, then moving on to the next smallest size without
over-allocating and finally completing the request with the smallest
sized chunk, over-allocating if necessary.

The maximum-munch strategy allows physical page allocation for small
TLBs that need to map a given range using the minimum number of mappings.

Although the allocator has been configured for 1 MB, 64 KB and 4 KB
chunks, it can be easily extended to other chunk sizes.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 arch/arm/mm/vcm_alloc.c   |  425 +
 include/linux/vcm_alloc.h |   70 
 2 files changed, 495 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/mm/vcm_alloc.c
 create mode 100644 include/linux/vcm_alloc.h

diff --git a/arch/arm/mm/vcm_alloc.c b/arch/arm/mm/vcm_alloc.c
new file mode 100644
index 000..e592e71
--- /dev/null
+++ b/arch/arm/mm/vcm_alloc.c
@@ -0,0 +1,425 @@
+/* Copyright (c) 2010, Code Aurora Forum. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA.
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/vcm_alloc.h>
+#include <linux/string.h>
+#include <asm/sizes.h>
+
+/* Amount of memory managed by VCM */
+#define TOTAL_MEM_SIZE SZ_32M
+
+static unsigned int base_pa = 0x8000;
+int basicalloc_init;
+
+int chunk_sizes[NUM_CHUNK_SIZES] = {SZ_1M, SZ_64K, SZ_4K};
+int init_num_chunks[] = {
+   (TOTAL_MEM_SIZE/2) / SZ_1M,
+   (TOTAL_MEM_SIZE/4) / SZ_64K,
+   (TOTAL_MEM_SIZE/4) / SZ_4K
+};
+#define LAST_SZ() (ARRAY_SIZE(chunk_sizes) - 1)
+
+#define vcm_alloc_err(a, ...)  \
+   pr_err("ERROR %s %i " a, __func__, __LINE__, ##__VA_ARGS__)
+
+struct phys_chunk_head {
+   struct list_head head;
+   int num;
+};
+
+struct phys_mem {
+   struct phys_chunk_head heads[ARRAY_SIZE(chunk_sizes)];
+} phys_mem;
+
+static int is_allocated(struct list_head *allocated)
+{
+   /* This should not happen under normal conditions */
+   if (!allocated) {
+   vcm_alloc_err("no allocated\n");
+   return 0;
+   }
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+   return !list_empty(allocated);
+}
+
+static int count_allocated_size(enum chunk_size_idx idx)
+{
+   int cnt = 0;
+   struct phys_chunk *chunk, *tmp;
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   list_for_each_entry_safe(chunk, tmp,
+&phys_mem.heads[idx].head, list) {
+   if (is_allocated(&chunk->allocated))
+   cnt++;
+   }
+
+   return cnt;
+}
+
+
+int vcm_alloc_get_mem_size(void)
+{
+   return TOTAL_MEM_SIZE;
+}
+EXPORT_SYMBOL(vcm_alloc_get_mem_size);
+
+
+int vcm_alloc_blocks_avail(enum chunk_size_idx idx)
+{
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   return phys_mem.heads[idx].num;
+}
+EXPORT_SYMBOL(vcm_alloc_blocks_avail);
+
+
+int vcm_alloc_get_num_chunks(void)
+{
+   return ARRAY_SIZE(chunk_sizes);
+}
+EXPORT_SYMBOL(vcm_alloc_get_num_chunks);
+
+
+int vcm_alloc_all_blocks_avail(void)
+{
+   int i;
+   int cnt = 0;
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(chunk_sizes); ++i)
+   cnt += vcm_alloc_blocks_avail(i);
+   return cnt;
+}
+EXPORT_SYMBOL(vcm_alloc_all_blocks_avail);
+
+
+int vcm_alloc_count_allocated(void)
+{
+   int i;
+   int cnt = 0;
+
+   if (!basicalloc_init) {
+   vcm_alloc_err("no basicalloc_init\n");
+   return 0;
+   }
+
+   for (i = 0; i < ARRAY_SIZE

Re: [RFC] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-06-28 Thread Zach Pfeffer
FUJITA Tomonori wrote:
 On Thu, 24 Jun 2010 23:48:50 -0700
 Zach Pfeffer zpfef...@codeaurora.org wrote:
 
 Andi Kleen wrote:
 Zach Pfeffer zpfef...@codeaurora.org writes:

 This patch contains the documentation for and the main header file of
 the API, termed the Virtual Contiguous Memory Manager. Its use would
 allow all of the IOMMU to VM, VM to device and device to IOMMU
 interoperation code to be refactored into platform independent code.
 I read all the description and it's still unclear what advantage
 this all has over the current architecture? 

 At least all the benefits mentioned seem to be rather nebulous.

 Can you describe a concrete use case that is improved by this code
 directly?
 Sure. On a SoC with many IOMMUs (10-100), where each IOMMU may have
 its own set of page-tables or share page-tables, and where devices
 with and without IOMMUs and CPUs with or without MMUs want to
 communicate, an abstraction like the VCM helps manage all conceivable
 mapping topologies. In the same way that the Linux MM manages pages
 apart from page-frames, the VCMM allows the Linux MM to manage ideal
 memory regions, VCMs, apart from the actual memory region.

 One real scenario would be video playback from a file on a memory
 card. To read and display the video, a DMA engine would read blocks of
 data from the memory card controller into memory. These would
 typically be managed using a scatter-gather list. This list would be
 mapped into a contiguous buffer of the video decoder's IOMMU. The
 video decoder would write into a buffer mapped by the display engine's
 IOMMU as well as the CPU (if the kernel needed to intercept the
 buffers). In this instance, the video decoder's IOMMU and the display
 engine's IOMMU use different page-table formats.

 Using the VCM API, this topology can be created without worrying about
 the device's IOMMUs or how to map the buffers into the kernel, or how
 to interoperate with the scatter-gather list. The call flow would go:
 
 Can you explain how you can't do the above with the existing API?

Sure. You can do the same thing with the current API, but the VCM takes a
wider view; the mapper is a parameter.

Taking include/linux/iommu.h as a common interface, the key function
is iommu_map(). This function maps a physical memory region, paddr, of
gfp_order, to a virtual region starting at iova:

extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 phys_addr_t paddr, int gfp_order, int prot);

Users who call this, kvm_iommu_map_pages() for instance, run similar
loops:

foreach page 
iommu_map(domain, va(page), ...)
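Spelled out against the iommu_map() prototype above, such a loop looks
roughly like this (a sketch, not the actual kvm code; the helper and its
arguments are illustrative):

#include <linux/iommu.h>
#include <linux/mm.h>

static int map_buffer_pages(struct iommu_domain *domain, unsigned long iova,
                            struct page **pages, int npages)
{
    phys_addr_t pa;
    int i, ret;

    for (i = 0; i < npages; i++) {
        pa = (phys_addr_t)page_to_pfn(pages[i]) << PAGE_SHIFT;
        /* gfp_order 0 maps one 4 KB page per call */
        ret = iommu_map(domain, iova + i * PAGE_SIZE, pa, 0,
                        IOMMU_READ | IOMMU_WRITE);
        if (ret)
            return ret;
    }
    return 0;
}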

The VCM encapsulates this as vcm_back(). This function iterates over a
set of physical regions and maps those physical regions to a virtual
address space that has been associated with a mapper at run-time. The
loop above, and the other loops (and other associated IOMMU software)
that don't use the common interface like arch/powerpc/kernel/vio.c all
do similar work.

In the end the VCM's dynamic virtual region association mechanism (and
multihomed physical memory targeting) allows all IOMMU mapping code in
the system to use the same API.

This may seem like syntactic sugar, but treating devices with IOMMUs
(bus-masters), devices with MMUs (CPUs) and devices without MMUs (DMA
engines) as endpoints in a mapping graph allows new features to be
developed. One such feature is system-wide memory migration (including
memory that devices map). With a common API a loop like this can be
written one place:

foreach mapper of pa_region
remap(mapper, new_pa_region)

It could also be used for better power-management:

foreach mapper of soon_to_be_powered_off_pa_region
ask(mapper, soon_to_be_powered_off_pa_region)

The VCM is just the first step.

More concretely, the way the VCM works allows the transparent use and
interoperation of different mapping chunk sizes. This is important in
multimedia devices because IOMMU TLB misses may cause multimedia
devices to miss their performance goals. Multi-chunk size support has
been added for IOMMU mappers and wouldn't be hard to add to CPU
mappers (CPU mappers still use 4KB).

 The general point of the VCMM is to give users a higher-level API
 than the current IOMMU abstraction provides, one that solves the general
 mapping problem. This means that all of the common mapping code would
 be written once. In addition, the API allows all the low level details
 of IOMMU programing and VM interoperation to be handled at the right
 level.

 Eventually the following functions could all be reworked and their
 users could call VCM functions.
 
 There are more IOMMUs (e.g. x86 has calgary, gart too). And what is
 the point of converting old IOMMUs (the majority of the below)? are
 there any potential users of your API for such old IOMMUs?

That's a good question. I gave the list of the current IOMMU mapping
functions to bring awareness to the fact that the general system-wide
mapping

Re: [RFC] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-06-25 Thread Zach Pfeffer
Andi Kleen wrote:
 Zach Pfeffer zpfef...@codeaurora.org writes:
 
 This patch contains the documentation for and the main header file of
 the API, termed the Virtual Contiguous Memory Manager. Its use would
 allow all of the IOMMU to VM, VM to device and device to IOMMU
 interoperation code to be refactored into platform independent code.
 
 I read all the description and it's still unclear what advantage
 this all has over the current architecture? 
 
 At least all the benefits mentioned seem to be rather nebulous.
 
 Can you describe a concrete use case that is improved by this code
 directly?

Sure. On a SoC with many IOMMUs (10-100), where each IOMMU may have
its own set of page-tables or share page-tables, and where devices
with and without IOMMUs and CPUs with or without MMUs want to
communicate, an abstraction like the VCM helps manage all conceivable
mapping topologies. In the same way that the Linux MM manages pages
apart from page-frames, the VCMM allows the Linux MM to manage ideal
memory regions, VCMs, apart from the actual memory region.

One real scenario would be video playback from a file on a memory
card. To read and display the video, a DMA engine would read blocks of
data from the memory card controller into memory. These would
typically be managed using a scatter-gather list. This list would be
mapped into a contiguous buffer of the video decoder's IOMMU. The
video decoder would write into a buffer mapped by the display engine's
IOMMU as well as the CPU (if the kernel needed to intercept the
buffers). In this instance, the video decoder's IOMMU and the display
engine's IOMMU use different page-table formats.

Using the VCM API, this topology can be created without worrying about
the device's IOMMUs or how to map the buffers into the kernel, or how
to interoperate with the scatter-gather list. The call flow would go:

1. Establish a memory region for the video decoder and the display engine
that's 128 MB and starts at 0x1000.

vcm_out = vcm_create(0x1000, SZ_128M);


2. Associate the memory region with the video decoder's IOMMU and the
display engine's IOMMU.

avcm_dec = vcm_assoc(vcm_out, video_dec_dev, 0);
avcm_disp = vcm_assoc(vcm_out, disp_dev, 0);

The 2 dev_ids, video_dec_dev and disp_dev allow the right IOMMU
low-level functions to be called underneath.


3. Actually program the underlying IOMMUs.

vcm_activate(avcm_dec);
vcm_activate(avcm_disp);


4. Allocate 2 physical buffers that the DMA engine and video decoder will
use. Make sure each buffer is 64 KB contiguous.

buf_64k = vcm_phys_alloc(MT0, 2*SZ_64K, VCM_64KB);


5. Allocate a 16 MB buffer for the output of the video decoder and the
input of the display engine. Use 1MB, 64KB and 4KB blocks to map the
buffer.

buf_frame = vcm_phys_alloc(MT0, SZ_16M);


6. Program the DMA controller.

buf = vcm_get_next_phys_addr(buf_64k, NULL, len);
while (buf) {
   dma_prg(buf);
   buf = vcm_get_next_phys_addr(buf_64k, NULL, len);
}


7. Create virtual memory regions for the DMA buffers and the video
decoder output from the vcm_out region. Make sure the buffers are
aligned to the buffer size.

res_64k = vcm_reserve(vcm_out, 8*SZ_64K, VCM_ALIGN_64K);
res_16M = vcm_reserve(vcm_out, SZ_16M, VCM_ALIGN_16M);


8. Connect the virtual reservations with the physical allocations.

vcm_back(res_64k, buf_64k);
vcm_back(res_16M, buf_frame);


9. Program the decoder and the display engine with addresses from the
 IOMMU side of the mapping:

base_64k = vcm_get_dev_addr(res_64k);
base_16M = vcm_get_dev_addr(res_16M);


10. Create a kernel mapping to read and write the 16M buffer.

cpu_vcm = vcm_create_from_prebuilt(VCM_PREBUILT_KERNEL);


11. Create a reservation on that prebuilt VCM. Use any alignment.

res_cpu_16M = vcm_reserve(cpu_vcm, SZ_16M, 0);


12. Back the reservation using the same physical memory that the
decoder and the display engine are looking at.

vcm_back(res_cpu_16M, buf_frame);


13. Get a pointer that kernel can dereference.

base_cpu_16M = vcm_get_dev_addr(res_cpu_16M);


The general point of the VCMM is to give users a higher-level API
than the current IOMMU abstraction provides, one that solves the general
mapping problem. This means that all of the common mapping code would
be written once. In addition, the API allows all the low level details
of IOMMU programing and VM interoperation to be handled at the right
level.

Eventually the following functions could all be reworked and their
users could call VCM functions.

arch/arm/plat-omap/iovmm.c
map_iovm_area()

arch/m68k/sun3/sun3dvma.c
dvma_map_align()

arch/alpha/kernel/pci_iommu.c
pci_map_single_1()

arch/powerpc/platforms/pasemi/iommu.c
iobmap_build()

arch/powerpc/kernel/iommu.c
iommu_map_page()

arch/sparc/mm/iommu.c
iommu_map_dma_area()

arch/sparc/kernel/pci_sun4v_asm.S
ENTRY(pci_sun4v_iommu_map)

arch/ia64/hp/common/sba_iommu.c
sba_map_page()

arch/arm/mach-omap2/iommu2.c
omap2_iommu_init()

arch/arm/plat-omap/iovmm.c

[RFC] mm: iommu: An API to unify IOMMU, CPU and device memory management

2010-06-23 Thread Zach Pfeffer
This patch contains the documentation for and the main header file of
the API, termed the Virtual Contiguous Memory Manager. Its use would
allow all of the IOMMU to VM, VM to device and device to IOMMU
interoperation code to be refactored into platform independent code.

Comments, suggestions and criticisms are welcome and wanted.

Signed-off-by: Zach Pfeffer zpfef...@codeaurora.org
---
 Documentation/vcm.txt |  583 
 include/linux/vcm.h   | 1017 +
 2 files changed, 1600 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/vcm.txt
 create mode 100644 include/linux/vcm.h

diff --git a/Documentation/vcm.txt b/Documentation/vcm.txt
new file mode 100644
index 000..d29c757
--- /dev/null
+++ b/Documentation/vcm.txt
@@ -0,0 +1,583 @@
+What is this document about?
+============================
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the first implementation works with a specific low-level
+Input/Output Memory Management Unit (IOMMU) and the way the VCMM is used
+from user-space. It also contains a section that describes why something
+like the VCMM is needed in the kernel.
+
+If anything in this document is wrong please send patches to the
+maintainer of this file, listed at the bottom of the document.
+
+
+The Virtual Contiguous Memory Manager
+=====================================
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses. It also insulates
+the system from spurious or malicious device bus transactions and allows
+fine-grained mapping attribute control. The Linux kernel core does not
+contain a generic API to handle IOMMU mapped memory; device driver writers
+must implement device specific code to interoperate with the Linux kernel
+core. As the number of IOMMUs increases, coordinating the many address
+spaces mapped by all discrete IOMMUs becomes difficult without in-kernel
+support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device interoperation
+by treating devices with or without IOMMUs and all CPUs with or without
+MMUs, their mapping contexts and their mappings using common
+abstractions. Physical hardware is given a generic device type and mapping
+contexts are abstracted into Virtual Contiguous Memory (VCM)
+regions. Users reserve memory from VCMs and back their reservations
+with physical memory.
+
+Why the VCMM is Needed
+----------------------
+
+Driver writers who control devices with IOMMUs must contend with device
+control and memory management. Driver writers have a large device driver
+API that they can leverage to control their devices, but they are lacking
+a unified API to help them program mappings into IOMMUs and share those
+mappings with other devices and CPUs in the system.
+
+Sharing is complicated by Linux's CPU-centric VMM. The CPU-centric model
+generally makes sense because average hardware only contains an MMU for the
+CPU and possibly a graphics MMU. If every device in the system has one or
+more MMUs, the CPU-centric memory management (MM) programming model breaks
+down.
+
+Abstracting IOMMU device programming into a common API has already begun
+in the Linux kernel. It was built to abstract the difference between AMD's
+and Intel's IOMMUs to support x86 virtualization on both platforms. The
+interface is listed in include/linux/iommu.h. It contains
+interfaces for mapping and unmapping as well as domain management. This
+interface has not gained widespread use outside of x86: the PA-RISC, Alpha
+and SPARC architectures and the ARM and PowerPC platforms all use their own
+mapping modules to control their IOMMUs. The VCMM contains an IOMMU
+programming layer, but since its abstraction supports map management
+independent of device control, the layer is not used directly. This
+higher-level view enables a new kernel service, not just an IOMMU
+interoperation layer.
+
+The General Idea: Map Management using Graphs
+---------------------------------------------
+
+Looking at mapping from a system-wide perspective reveals a general graph
+problem. The VCMM's API is built to manage the general mapping graph. Each
+node that talks to memory, either through an MMU or directly (physically
+mapped), can be thought of as the device end of a mapping edge. The other
+end of the edge is the physical memory (or intermediate virtual space) that
+is mapped.
+
+In the direct mapped case the device is assigned a one-to-one MMU. This
+scheme allows direct mapped devices to participate in general graph
+management.
+
+The CPU nodes can also be brought under the same mapping abstraction with
+the use of a light overlay on the existing VMM. This light overlay allows
+VMM managed mappings to interoperate with the common API. The light
+overlay enables this without substantial