Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-03-20 Thread Liam R. Howlett
* Danilo Krummrich  [230313 19:46]:
> On 3/7/23 23:43, Liam R. Howlett wrote:
> > * Danilo Krummrich  [230306 10:46]:
> > > On 3/2/23 03:38, Liam R. Howlett wrote:
> > > > * Danilo Krummrich  [230227 08:17]:
> > > > 
> > > > ...
> > > > > > > Would this variant be significantly more efficient?
> > > > > > 
> > > > > > Well, what you are doing is walking the tree to see if there's 
> > > > > > anything
> > > > > > there... then re-walking the tree to store it.  So, yes, it's much 
> > > > > > more
> > > > > > efficient..  However, writing is heavier.  How much of the time is 
> > > > > > spent
> > > > > > walking vs writing depends on the size of the tree, but it's rather 
> > > > > > easy
> > > > > > to do this in a single walk of the tree so why wouldn't you?
> > > > > 
> > > > > I will, I was just curious about how much of an impact it has.
> > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > Also, would this also work while already walking the tree?
> > > > > > 
> > > > > > Yes, to an extent.  If you are at the correct location in the tree, 
> > > > > > you
> > > > > > can write to that location.  If you are not in the correct location 
> > > > > > and
> > > > > > try to write to the tree then things will go poorly..  In this 
> > > > > > scenario,
> > > > > > we are very much walking the tree and writing to it in two steps.
> > > > > > 
> > > > > > > 
> > > > > > > To remove an entry while walking the tree I have a separate 
> > > > > > > function
> > > > > > > drm_gpuva_iter_remove(). Would I need something similar for 
> > > > > > > inserting
> > > > > > > entries?
> > > > > > 
> > > > > > I saw that.  Your remove function uses the erase operation which is
> > > > > > implemented as a walk to that location and a store of a null over 
> > > > > > the
> > > > > > range that is returned.  You do not need a function to insert an 
> > > > > > entry
> > > > > > if the maple state is at the correct location, and that doesn't just
> > > > > > mean setting mas.index/mas.last to the correct value.  There is a 
> > > > > > node &
> > > > > > offset saved in the maple state that needs to be in the correct
> > > > > > location.  If you store to that node then the node may be replaced, 
> > > > > > so
> > > > > > other iterators that you have may become stale, but the one you used to
> > > > > > execute the store operation will now point to the new node with the 
> > > > > > new
> > > > > > entry.
> > > > > > 
> > > > > > > 
> > > > > > > I already provided this example in a separate mail thread, but it 
> > > > > > > may make
> > > > > > > sense to move this to the mailing list:
> > > > > > > 
> > > > > > > In __drm_gpuva_sm_map() we're iterating a given range of the 
> > > > > > > tree, where the
> > > > > > > given range is the size of the newly requested mapping. 
> > > > > > > __drm_gpuva_sm_map()
> > > > > > > invokes a callback for each sub-operation that needs to be taken 
> > > > > > > in order to
> > > > > > > fulfill this mapping request. In most cases such a callback just 
> > > > > > > creates a
> > > > > > > drm_gpuva_op object and stores it in a list.
> > > > > > > 
> > > > > > > However, drivers can also implement the callback, such that they 
> > > > > > > directly
> > > > > > > execute this operation within the callback.
> > > > > > > 
> > > > > > > Let's have a look at the following example:
> > > > > > > 
> > > > > > >        0     a     2
> > > > > > > old:   |-----------|       (bo_offset=n)
> > > > > > > 
> > > > > > >              1     b     3
> > > > > > > req:         |-----------| (bo_offset=m)
> > > > > > > 
> > > > > > >        0  a' 1     b     3
> > > > > > > new:   |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
> > > > > > > 
> > > > > > > This would result in the following operations.
> > > > > > > 
> > > > > > > __drm_gpuva_sm_map() finds entry "a" and calls back into the 
> > > > > > > driver
> > > > > > > suggesting to re-map "a" with the new size. The driver removes 
> > > > > > > entry "a"
> > > > > > > from the tree and adds "a'"
> > > > > > 
> > > > > > What you have here won't work.  The driver will cause your iterator's
> > > > > > maple state to point to memory that is freed.  You will either need 
> > > > > > to
> > > > > > pass through your iterator so that the modifications can occur with 
> > > > > > that
> > > > > > maple state so it remains valid, or you will need to invalidate the
> > > > > > iterator on every modification by the driver.
> > > > > > 
> > > > > > I'm sure the first idea you have will be to invalidate the 
> > > > > > iterator, but
> > > > > > that is probably not the way to proceed.  Even ignoring the unclear
> > > > > > locking of two maple states trying to modify the tree, this is 
> > > > > > rather
> > > > > > inefficient - each invalidation means a re-walk of the tree.  You 
> > > > > > may as
> > > > > > well not use an iterator in this case.
> > > > > > 
> > > > > > Depending on how/when the lookups occur, you could still iterate 
> > > > > > 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-03-13 Thread Danilo Krummrich

On 3/7/23 23:43, Liam R. Howlett wrote:

* Danilo Krummrich  [230306 10:46]:

On 3/2/23 03:38, Liam R. Howlett wrote:

* Danilo Krummrich  [230227 08:17]:

...

Would this variant be significantly more efficient?


Well, what you are doing is walking the tree to see if there's anything
there... then re-walking the tree to store it.  So, yes, it's much more
efficient..  However, writing is heavier.  How much of the time is spent
walking vs writing depends on the size of the tree, but it's rather easy
to do this in a single walk of the tree so why wouldn't you?


I will, I was just curious about how much of an impact it has.
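The single-walk pattern described above can be sketched roughly as follows (a hedged kernel-side sketch only: names follow the patch under discussion, locking and error paths are elided, and the lower-bound clamp anticipates the maple-state behaviour discussed later in the thread):

```c
/* Check for an existing entry and store the new one in a single walk.
 * Sketch only; assumes the maple tree field and GFP context used by
 * the patch under review. */
MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);

if (mas_walk(&mas))             /* one walk down to addr */
        return -EEXIST;         /* range occupied, no second walk needed */

/* mas now covers the NULL gap around addr; clamp it to the requested
 * interval so the store does not widen beyond [addr, addr + range - 1]. */
if (mas.index < addr)
        mas.index = addr;
mas.last = addr + range - 1;

return mas_store_gfp(&mas, va, GFP_KERNEL);
```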





Also, would this also work while already walking the tree?


Yes, to an extent.  If you are at the correct location in the tree, you
can write to that location.  If you are not in the correct location and
try to write to the tree then things will go poorly..  In this scenario,
we are very much walking the tree and writing to it in two steps.



To remove an entry while walking the tree I have a separate function
drm_gpuva_iter_remove(). Would I need something similar for inserting
entries?


I saw that.  Your remove function uses the erase operation which is
implemented as a walk to that location and a store of a null over the
range that is returned.  You do not need a function to insert an entry
if the maple state is at the correct location, and that doesn't just
mean setting mas.index/mas.last to the correct value.  There is a node &
offset saved in the maple state that needs to be in the correct
location.  If you store to that node then the node may be replaced, so
other iterators that you have may become stale, but the one you used to
execute the store operation will now point to the new node with the new
entry.



I already provided this example in a separate mail thread, but it may make
sense to move this to the mailing list:

In __drm_gpuva_sm_map() we're iterating a given range of the tree, where the
given range is the size of the newly requested mapping. __drm_gpuva_sm_map()
invokes a callback for each sub-operation that needs to be taken in order to
fulfill this mapping request. In most cases such a callback just creates a
drm_gpuva_op object and stores it in a list.

However, drivers can also implement the callback, such that they directly
execute this operation within the callback.

Let's have a look at the following example:

       0     a     2
old:   |-----------|       (bo_offset=n)

             1     b     3
req:         |-----------| (bo_offset=m)

       0  a' 1     b     3
new:   |-----|-----------| (a.bo_offset=n,b.bo_offset=m)

This would result in the following operations.

__drm_gpuva_sm_map() finds entry "a" and calls back into the driver
suggesting to re-map "a" with the new size. The driver removes entry "a"
from the tree and adds "a'"


What you have here won't work.  The driver will cause your iterator's
maple state to point to memory that is freed.  You will either need to
pass through your iterator so that the modifications can occur with that
maple state so it remains valid, or you will need to invalidate the
iterator on every modification by the driver.

I'm sure the first idea you have will be to invalidate the iterator, but
that is probably not the way to proceed.  Even ignoring the unclear
locking of two maple states trying to modify the tree, this is rather
inefficient - each invalidation means a re-walk of the tree.  You may as
well not use an iterator in this case.

Depending on how/when the lookups occur, you could still iterate over
the tree and let the driver modify the ending of "a", but leave the tree
alone and just store b over whatever - but the failure scenarios may
cause you grief.

If you pass the iterator through, then you can just use it to do your
writes and keep iterating as if nothing changed.
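"Passing the iterator through" might look roughly like the following kernel-side sketch (illustrative only: the callback name and signature are assumptions for this discussion, not the patch's actual API):

```c
/* Hedged sketch: the sm_map step hands its own maple state to the
 * driver callback, so a driver-side store updates that state instead
 * of leaving it pointing at freed nodes. */
static int sm_step_remap(struct ma_state *mas, struct drm_gpuva *new_va)
{
	/* mas is already positioned on the range found by the outer
	 * iteration; storing through it keeps this iterator valid. */
	return mas_store_gfp(mas, new_va, GFP_KERNEL);
}
```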


Passing through the iterator clearly seems to be the way to go.

I assume that if the entry to insert isn't at the location of the iterator
(as in the following example) we can just keep walking to this location by
changing the index of the mas and calling mas_walk()?


no.  You have to mas_set() to the value and walk from the top of the
tree.  mas_walk() walks down, not from side to side - well, it does go
forward within a node (increasing offset), but if you hit the node limit
then you have gotten yourself in trouble.
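In code, the re-positioning described above would be (sketch; `mas` and `new_index` are illustrative names):

```c
/* Jump to an arbitrary index: reset the state and re-walk from the
 * root, rather than trying to move sideways with mas_walk() alone. */
mas_set(&mas, new_index);       /* back to the top of the tree */
entry = mas_walk(&mas);         /* walk down to new_index again */
```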


This would also imply
that the "outer" tree walk continues after the entry we just inserted,
right?


I don't understand the "outer" tree walk statement.


I think I could have phrased this better. I just mean "my" iterator walking
each tree entry rather than an internal tree walk, as it happens in e.g.
mas_walk() or mas_find().





             1     a     3
old:         |-----------| (bo_offset=n)

       0     b     2
req:   |-----------|       (bo_offset=m)

       0     b     2  a' 3
new:   |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)

Again, after finding "a", we want to remove it and insert "a'" instead.


Ah, so 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-03-07 Thread Liam R. Howlett
* Danilo Krummrich  [230306 10:46]:
> On 3/2/23 03:38, Liam R. Howlett wrote:
> > * Danilo Krummrich  [230227 08:17]:
> > 
> > ...
> > > > > Would this variant be significantly more efficient?
> > > > 
> > > > Well, what you are doing is walking the tree to see if there's anything
> > > > there... then re-walking the tree to store it.  So, yes, it's much more
> > > > efficient..  However, writing is heavier.  How much of the time is spent
> > > > walking vs writing depends on the size of the tree, but it's rather easy
> > > > to do this in a single walk of the tree so why wouldn't you?
> > > 
> > > I will, I was just curious about how much of an impact it has.
> > > 
> > > > 
> > > > > 
> > > > > Also, would this also work while already walking the tree?
> > > > 
> > > > Yes, to an extent.  If you are at the correct location in the tree, you
> > > > can write to that location.  If you are not in the correct location and
> > > > try to write to the tree then things will go poorly..  In this scenario,
> > > > we are very much walking the tree and writing to it in two steps.
> > > > 
> > > > > 
> > > > > To remove an entry while walking the tree I have a separate function
> > > > > drm_gpuva_iter_remove(). Would I need something similar for inserting
> > > > > entries?
> > > > 
> > > > I saw that.  Your remove function uses the erase operation which is
> > > > implemented as a walk to that location and a store of a null over the
> > > > range that is returned.  You do not need a function to insert an entry
> > > > if the maple state is at the correct location, and that doesn't just
> > > > mean setting mas.index/mas.last to the correct value.  There is a node &
> > > > offset saved in the maple state that needs to be in the correct
> > > > location.  If you store to that node then the node may be replaced, so
> > > > other iterators that you have may become stale, but the one you used to
> > > > execute the store operation will now point to the new node with the new
> > > > entry.
> > > > 
> > > > > 
> > > > > I already provided this example in a separate mail thread, but it may 
> > > > > make
> > > > > sense to move this to the mailing list:
> > > > > 
> > > > > In __drm_gpuva_sm_map() we're iterating a given range of the tree, 
> > > > > where the
> > > > > given range is the size of the newly requested mapping. 
> > > > > __drm_gpuva_sm_map()
> > > > > invokes a callback for each sub-operation that needs to be taken in 
> > > > > order to
> > > > > fulfill this mapping request. In most cases such a callback just 
> > > > > creates a
> > > > > drm_gpuva_op object and stores it in a list.
> > > > > 
> > > > > However, drivers can also implement the callback, such that they 
> > > > > directly
> > > > > execute this operation within the callback.
> > > > > 
> > > > > Let's have a look at the following example:
> > > > > 
> > > > >        0     a     2
> > > > > old:   |-----------|       (bo_offset=n)
> > > > > 
> > > > >              1     b     3
> > > > > req:         |-----------| (bo_offset=m)
> > > > > 
> > > > >        0  a' 1     b     3
> > > > > new:   |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
> > > > > 
> > > > > This would result in the following operations.
> > > > > 
> > > > > __drm_gpuva_sm_map() finds entry "a" and calls back into the driver
> > > > > suggesting to re-map "a" with the new size. The driver removes entry 
> > > > > "a"
> > > > > from the tree and adds "a'"
> > > > 
> > > > What you have here won't work.  The driver will cause your iterator's
> > > > maple state to point to memory that is freed.  You will either need to
> > > > pass through your iterator so that the modifications can occur with that
> > > > maple state so it remains valid, or you will need to invalidate the
> > > > iterator on every modification by the driver.
> > > > 
> > > > I'm sure the first idea you have will be to invalidate the iterator, but
> > > > that is probably not the way to proceed.  Even ignoring the unclear
> > > > locking of two maple states trying to modify the tree, this is rather
> > > > inefficient - each invalidation means a re-walk of the tree.  You may as
> > > > well not use an iterator in this case.
> > > > 
> > > > Depending on how/when the lookups occur, you could still iterate over
> > > > the tree and let the driver modify the ending of "a", but leave the tree
> > > > alone and just store b over whatever - but the failure scenarios may
> > > > cause you grief.
> > > > 
> > > > If you pass the iterator through, then you can just use it to do your
> > > > writes and keep iterating as if nothing changed.
> > > 
> > > Passing through the iterator clearly seems to be the way to go.
> > > 
> > > I assume that if the entry to insert isn't at the location of the iterator
> > > (as in the following example) we can just keep walking to this location by
> > > changing the index of the mas and calling mas_walk()?
> > 
> > no.  You have to mas_set() to the value and walk from the top of the
> > 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-03-06 Thread Danilo Krummrich

On 3/2/23 03:38, Liam R. Howlett wrote:

* Danilo Krummrich  [230227 08:17]:

...

Would this variant be significantly more efficient?


Well, what you are doing is walking the tree to see if there's anything
there... then re-walking the tree to store it.  So, yes, it's much more
efficient..  However, writing is heavier.  How much of the time is spent
walking vs writing depends on the size of the tree, but it's rather easy
to do this in a single walk of the tree so why wouldn't you?


I will, I was just curious about how much of an impact it has.





Also, would this also work while already walking the tree?


Yes, to an extent.  If you are at the correct location in the tree, you
can write to that location.  If you are not in the correct location and
try to write to the tree then things will go poorly..  In this scenario,
we are very much walking the tree and writing to it in two steps.



To remove an entry while walking the tree I have a separate function
drm_gpuva_iter_remove(). Would I need something similar for inserting
entries?


I saw that.  Your remove function uses the erase operation which is
implemented as a walk to that location and a store of a null over the
range that is returned.  You do not need a function to insert an entry
if the maple state is at the correct location, and that doesn't just
mean setting mas.index/mas.last to the correct value.  There is a node &
offset saved in the maple state that needs to be in the correct
location.  If you store to that node then the node may be replaced, so
other iterators that you have may become stale, but the one you used to
execute the store operation will now point to the new node with the new
entry.



I already provided this example in a separate mail thread, but it may make
sense to move this to the mailing list:

In __drm_gpuva_sm_map() we're iterating a given range of the tree, where the
given range is the size of the newly requested mapping. __drm_gpuva_sm_map()
invokes a callback for each sub-operation that needs to be taken in order to
fulfill this mapping request. In most cases such a callback just creates a
drm_gpuva_op object and stores it in a list.

However, drivers can also implement the callback, such that they directly
execute this operation within the callback.

Let's have a look at the following example:

       0     a     2
old:   |-----------|       (bo_offset=n)

             1     b     3
req:         |-----------| (bo_offset=m)

       0  a' 1     b     3
new:   |-----|-----------| (a.bo_offset=n,b.bo_offset=m)

This would result in the following operations.

__drm_gpuva_sm_map() finds entry "a" and calls back into the driver
suggesting to re-map "a" with the new size. The driver removes entry "a"
from the tree and adds "a'"


What you have here won't work.  The driver will cause your iterator's
maple state to point to memory that is freed.  You will either need to
pass through your iterator so that the modifications can occur with that
maple state so it remains valid, or you will need to invalidate the
iterator on every modification by the driver.

I'm sure the first idea you have will be to invalidate the iterator, but
that is probably not the way to proceed.  Even ignoring the unclear
locking of two maple states trying to modify the tree, this is rather
inefficient - each invalidation means a re-walk of the tree.  You may as
well not use an iterator in this case.

Depending on how/when the lookups occur, you could still iterate over
the tree and let the driver modify the ending of "a", but leave the tree
alone and just store b over whatever - but the failure scenarios may
cause you grief.

If you pass the iterator through, then you can just use it to do your
writes and keep iterating as if nothing changed.


Passing through the iterator clearly seems to be the way to go.

I assume that if the entry to insert isn't at the location of the iterator
(as in the following example) we can just keep walking to this location by
changing the index of the mas and calling mas_walk()?


no.  You have to mas_set() to the value and walk from the top of the
tree.  mas_walk() walks down, not from side to side - well, it does go
forward within a node (increasing offset), but if you hit the node limit
then you have gotten yourself in trouble.


This would also imply
that the "outer" tree walk continues after the entry we just inserted,
right?


I don't understand the "outer" tree walk statement.


I think I could have phrased this better. I just mean "my" iterator 
walking each tree entry rather than an internal tree walk, as it happens 
in e.g. mas_walk() or mas_find().






             1     a     3
old:         |-----------| (bo_offset=n)

       0     b     2
req:   |-----------|       (bo_offset=m)

       0     b     2  a' 3
new:   |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)

Again, after finding "a", we want to remove it and insert "a'" instead.


Ah, so you could walk to 0, see that it's NULL from 0 - 1, call
mas_next() and get "a" 
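A sketch of that walk-then-step sequence (illustrative, using the example's addresses):

```c
/* Walk to index 0: returns NULL, and mas now spans the empty range
 * below "a".  mas_next() then advances to the adjacent entry "a". */
mas_set(&mas, 0);
entry = mas_walk(&mas);                     /* NULL gap before "a" */
if (!entry)
        entry = mas_next(&mas, ULONG_MAX);  /* now the entry "a" */
```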

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-03-06 Thread Danilo Krummrich

On 2/28/23 17:24, Liam R. Howlett wrote:

* Danilo Krummrich  [230227 21:17]:

On Tue, Feb 21, 2023 at 01:20:50PM -0500, Liam R. Howlett wrote:

* Danilo Krummrich  [230217 08:45]:

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by the Vulkan sparse memory bindings that graphics
drivers start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.

Suggested-by: Dave Airlie 
Signed-off-by: Danilo Krummrich 
---
  Documentation/gpu/drm-mm.rst|   31 +
  drivers/gpu/drm/Makefile|1 +
  drivers/gpu/drm/drm_gem.c   |3 +
  drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
  include/drm/drm_drv.h   |6 +
  include/drm/drm_gem.h   |   75 ++
  include/drm/drm_gpuva_mgr.h |  714 +
  7 files changed, 2534 insertions(+)
  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
  create mode 100644 include/drm/drm_gpuva_mgr.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a52e6f4117d6..c9f120cfe730 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
 :export:
  

...


+
+/**
+ * drm_gpuva_iter_remove - removes the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ *
+ * This removes the element the iterator currently points to.
+ */
+void
+drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
+{
+   mas_erase(&it->mas);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_remove);
+
+/**
+ * drm_gpuva_insert - insert a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @va: the &drm_gpuva to insert
+ * @addr: the start address of the GPU VA
+ * @range: the range of the GPU VA
+ *
+ * Insert a &drm_gpuva with a given address and range into a
+ * &drm_gpuva_manager.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+                struct drm_gpuva *va)
+{
+   u64 addr = va->va.addr;
+   u64 range = va->va.range;
+   MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
+   struct drm_gpuva_region *reg = NULL;
+   int ret;
+
+   if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
+   return -EINVAL;
+
+   if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
+   return -EINVAL;
+
+   if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
+   reg = drm_gpuva_in_region(mgr, addr, range);
+   if (unlikely(!reg))
+   return -EINVAL;
+   }
+


-


+   if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
+   return -EEXIST;
+
+   ret = mas_store_gfp(&mas, va, GFP_KERNEL);


mas_walk() will set the internal maple state to the limits of what it
finds.  So, instead of an iterator, you can use the walk function and
ensure there is a large enough area in the existing NULL:

/*
  * Nothing at addr, mas now points to the location where the store would
  * happen
  */
if (mas_walk(&mas))
return -EEXIST;



For some reason mas_walk() finds an entry and hence this function returns
-EEXIST for the following sequence of insertions.

A = [0xc - 0xf]
B = [0x0 - 0xb]

Interestingly, inserting B before A works fine.

I attached a test module that reproduces the issue. I hope it's just a stupid
mistake I just can't spot though.


This is probably my fault in how I explained things, I seem to have had
a bug in my code.

Let me try again.

mas_walk() will go to the range of mas.index
It will set mas.index = range_start
It will set mas.last = range_end
It will return entry in that range.

Your code is walking to addr (0xc, say)
You get NULL
and the range is now: mas.index = 0, mas.last = ULONG_MAX

You set mas.last = 0xc + 0x4 -1
You store your va in the range of 0 - 0xf - This isn't what you want
to do and this is why you are seeing it exists when done in this order.

In the reverse order, your lower limit is fine so it works out.

Try adding a check to ensure the lower range is still accurate as well:
 if (mas.index < addr)
 mas.index = addr;



Thanks for the clarification, this indeed fixes the issue.


If you compile with CONFIG_DEBUG_MAPLE_TREE, you can use mt_dump() to
dump the tree for 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-03-01 Thread Liam R. Howlett
* Danilo Krummrich  [230227 08:17]:

...
> > > Would this variant be significantly more efficient?
> > 
> > Well, what you are doing is walking the tree to see if there's anything
> > there... then re-walking the tree to store it.  So, yes, it's much more
> > efficient..  However, writing is heavier.  How much of the time is spent
> > walking vs writing depends on the size of the tree, but it's rather easy
> > to do this in a single walk of the tree so why wouldn't you?
> 
> I will, I was just curious about how much of an impact it has.
> 
> > 
> > > 
> > > Also, would this also work while already walking the tree?
> > 
> > Yes, to an extent.  If you are at the correct location in the tree, you
> > can write to that location.  If you are not in the correct location and
> > try to write to the tree then things will go poorly..  In this scenario,
> > we are very much walking the tree and writing to it in two steps.
> > 
> > > 
> > > To remove an entry while walking the tree I have a separate function
> > > drm_gpuva_iter_remove(). Would I need something similar for inserting
> > > entries?
> > 
> > I saw that.  Your remove function uses the erase operation which is
> > implemented as a walk to that location and a store of a null over the
> > range that is returned.  You do not need a function to insert an entry
> > if the maple state is at the correct location, and that doesn't just
> > mean setting mas.index/mas.last to the correct value.  There is a node &
> > offset saved in the maple state that needs to be in the correct
> > location.  If you store to that node then the node may be replaced, so
> > other iterators that you have may become stale, but the one you used to
> > execute the store operation will now point to the new node with the new
> > entry.
> > 
> > > 
> > > I already provided this example in a separate mail thread, but it may 
> > > make
> > > sense to move this to the mailing list:
> > > 
> > > In __drm_gpuva_sm_map() we're iterating a given range of the tree, where 
> > > the
> > > given range is the size of the newly requested mapping. 
> > > __drm_gpuva_sm_map()
> > > invokes a callback for each sub-operation that needs to be taken in order 
> > > to
> > > fulfill this mapping request. In most cases such a callback just creates a
> > > drm_gpuva_op object and stores it in a list.
> > > 
> > > However, drivers can also implement the callback, such that they directly
> > > execute this operation within the callback.
> > > 
> > > Let's have a look at the following example:
> > > 
> > >        0     a     2
> > > old:   |-----------|       (bo_offset=n)
> > > 
> > >              1     b     3
> > > req:         |-----------| (bo_offset=m)
> > > 
> > >        0  a' 1     b     3
> > > new:   |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
> > > 
> > > This would result in the following operations.
> > > 
> > > __drm_gpuva_sm_map() finds entry "a" and calls back into the driver
> > > suggesting to re-map "a" with the new size. The driver removes entry "a"
> > > from the tree and adds "a'"
> > 
> > What you have here won't work.  The driver will cause your iterator's
> > maple state to point to memory that is freed.  You will either need to
> > pass through your iterator so that the modifications can occur with that
> > maple state so it remains valid, or you will need to invalidate the
> > iterator on every modification by the driver.
> > 
> > I'm sure the first idea you have will be to invalidate the iterator, but
> > that is probably not the way to proceed.  Even ignoring the unclear
> > locking of two maple states trying to modify the tree, this is rather
> > inefficient - each invalidation means a re-walk of the tree.  You may as
> > well not use an iterator in this case.
> > 
> > Depending on how/when the lookups occur, you could still iterate over
> > the tree and let the driver modify the ending of "a", but leave the tree
> > alone and just store b over whatever - but the failure scenarios may
> > cause you grief.
> > 
> > If you pass the iterator through, then you can just use it to do your
> > writes and keep iterating as if nothing changed.
> 
> Passing through the iterator clearly seems to be the way to go.
> 
> I assume that if the entry to insert isn't at the location of the iterator
> (as in the following example) we can just keep walking to this location by
> changing the index of the mas and calling mas_walk()?

no.  You have to mas_set() to the value and walk from the top of the
tree.  mas_walk() walks down, not from side to side - well, it does go
forward within a node (increasing offset), but if you hit the node limit
then you have gotten yourself in trouble.

> This would also imply
> that the "outer" tree walk continues after the entry we just inserted,
> right?

I don't understand the "outer" tree walk statement.

> 
>              1     a     3
> old:         |-----------| (bo_offset=n)
> 
>        0     b     2
> req:   |-----------|       (bo_offset=m)
> 
>        0     b     2  a' 3
> 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-28 Thread Liam R. Howlett
* Danilo Krummrich  [230227 21:17]:
> On Tue, Feb 21, 2023 at 01:20:50PM -0500, Liam R. Howlett wrote:
> > * Danilo Krummrich  [230217 08:45]:
> > > Add infrastructure to keep track of GPU virtual address (VA) mappings
> > > with a dedicated VA space manager implementation.
> > > 
> > > New UAPIs, motivated by the Vulkan sparse memory bindings that graphics
> > > drivers start implementing, allow userspace applications to request multiple and
> > > arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
> > > intended to serve the following purposes in this context.
> > > 
> > > 1) Provide infrastructure to track GPU VA allocations and mappings,
> > >making use of the maple_tree.
> > > 
> > > 2) Generically connect GPU VA mappings to their backing buffers, in
> > >particular DRM GEM objects.
> > > 
> > > 3) Provide a common implementation to perform more complex mapping
> > >operations on the GPU VA space. In particular splitting and merging
> > >of GPU VA mappings, e.g. for intersecting mapping requests or partial
> > >unmap requests.
> > > 
> > > Suggested-by: Dave Airlie 
> > > Signed-off-by: Danilo Krummrich 
> > > ---
> > >  Documentation/gpu/drm-mm.rst|   31 +
> > >  drivers/gpu/drm/Makefile|1 +
> > >  drivers/gpu/drm/drm_gem.c   |3 +
> > >  drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
> > >  include/drm/drm_drv.h   |6 +
> > >  include/drm/drm_gem.h   |   75 ++
> > >  include/drm/drm_gpuva_mgr.h |  714 +
> > >  7 files changed, 2534 insertions(+)
> > >  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> > >  create mode 100644 include/drm/drm_gpuva_mgr.h
> > > 
> > > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> > > index a52e6f4117d6..c9f120cfe730 100644
> > > --- a/Documentation/gpu/drm-mm.rst
> > > +++ b/Documentation/gpu/drm-mm.rst
> > > @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
> > >  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> > > :export:
> > >  
> > ...
> > 
> > > +
> > > +/**
> > > + * drm_gpuva_iter_remove - removes the iterator's current element
> > > + * @it: the &drm_gpuva_iterator
> > > + *
> > > + * This removes the element the iterator currently points to.
> > > + */
> > > +void
> > > +drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
> > > +{
> > > + mas_erase(&it->mas);
> > > +}
> > > +EXPORT_SYMBOL(drm_gpuva_iter_remove);
> > > +
> > > +/**
> > > + * drm_gpuva_insert - insert a &drm_gpuva
> > > + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> > > + * @va: the &drm_gpuva to insert
> > > + * @addr: the start address of the GPU VA
> > > + * @range: the range of the GPU VA
> > > + *
> > > + * Insert a &drm_gpuva with a given address and range into a
> > > + * &drm_gpuva_manager.
> > > + *
> > > + * Returns: 0 on success, negative error code on failure.
> > > + */
> > > +int
> > > +drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > > +  struct drm_gpuva *va)
> > > +{
> > > + u64 addr = va->va.addr;
> > > + u64 range = va->va.range;
> > > + MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
> > > + struct drm_gpuva_region *reg = NULL;
> > > + int ret;
> > > +
> > > + if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
> > > + return -EINVAL;
> > > +
> > > + if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
> > > + return -EINVAL;
> > > +
> > > + if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
> > > + reg = drm_gpuva_in_region(mgr, addr, range);
> > > + if (unlikely(!reg))
> > > + return -EINVAL;
> > > + }
> > > +
> > 
> > -
> > 
> > > + if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
> > > + return -EEXIST;
> > > +
> > > + ret = mas_store_gfp(&mas, va, GFP_KERNEL);
> > 
> > mas_walk() will set the internal maple state to the limits to what it
> > finds.  So, instead of an iterator, you can use the walk function and
> > ensure there is a large enough area in the existing NULL:
> > 
> > /*
> >  * Nothing at addr, mas now points to the location where the store would
> >  * happen
> >  */
> > if (mas_walk(&mas))
> > return -EEXIST;
> > 
> 
> For some reason mas_walk() finds an entry and hence this function returns
> -EEXIST for the following sequence of insertions.
> 
> A = [0xc - 0xf]
> B = [0x0 - 0xb]
> 
> Interestingly, inserting B before A works fine.
> 
> I attached a test module that reproduces the issue. I hope it's just a stupid
> mistake I just can't spot though.

This is probably my fault in how I explained things, I seem to have had
a bug in my code.

Let me try again.

mas_walk() will go to the range of mas.index
It will set mas.index = range_start
It will set mas.last = range_end
It will return entry in that range.

Your code is walking to addr (0xc, say)
You get NULL
and the range is now: mas.index = 0, mas.last = ULONG_MAX

You set mas.last = 0xc + 0x4 - 1
You store your va in the range of 0 - 0xf.

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-27 Thread Danilo Krummrich
On Tue, Feb 21, 2023 at 01:20:50PM -0500, Liam R. Howlett wrote:
> * Danilo Krummrich  [230217 08:45]:
> > Add infrastructure to keep track of GPU virtual address (VA) mappings
> > with a dedicated VA space manager implementation.
> > 
> > New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
> > start implementing, allow userspace applications to request multiple and
> > arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
> > intended to serve the following purposes in this context.
> > 
> > 1) Provide infrastructure to track GPU VA allocations and mappings,
> >making use of the maple_tree.
> > 
> > 2) Generically connect GPU VA mappings to their backing buffers, in
> >particular DRM GEM objects.
> > 
> > 3) Provide a common implementation to perform more complex mapping
> >operations on the GPU VA space. In particular splitting and merging
> >of GPU VA mappings, e.g. for intersecting mapping requests or partial
> >unmap requests.
> > 
> > Suggested-by: Dave Airlie 
> > Signed-off-by: Danilo Krummrich 
> > ---
> >  Documentation/gpu/drm-mm.rst|   31 +
> >  drivers/gpu/drm/Makefile|1 +
> >  drivers/gpu/drm/drm_gem.c   |3 +
> >  drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
> >  include/drm/drm_drv.h   |6 +
> >  include/drm/drm_gem.h   |   75 ++
> >  include/drm/drm_gpuva_mgr.h |  714 +
> >  7 files changed, 2534 insertions(+)
> >  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> >  create mode 100644 include/drm/drm_gpuva_mgr.h
> > 
> > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> > index a52e6f4117d6..c9f120cfe730 100644
> > --- a/Documentation/gpu/drm-mm.rst
> > +++ b/Documentation/gpu/drm-mm.rst
> > @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
> >  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> > :export:
> >  
> ...
> 
> > +
> > +/**
> > + * drm_gpuva_iter_remove - removes the iterator's current element
> > + * @it: the &drm_gpuva_iterator
> > + *
> > + * This removes the element the iterator currently points to.
> > + */
> > +void
> > +drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
> > +{
> > +   mas_erase(&it->mas);
> > +}
> > +EXPORT_SYMBOL(drm_gpuva_iter_remove);
> > +
> > +/**
> > + * drm_gpuva_insert - insert a &drm_gpuva
> > + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> > + * @va: the &drm_gpuva to insert
> > + * @addr: the start address of the GPU VA
> > + * @range: the range of the GPU VA
> > + *
> > + * Insert a &drm_gpuva with a given address and range into a
> > + * &drm_gpuva_manager.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > +struct drm_gpuva *va)
> > +{
> > +   u64 addr = va->va.addr;
> > +   u64 range = va->va.range;
> > +   MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
> > +   struct drm_gpuva_region *reg = NULL;
> > +   int ret;
> > +
> > +   if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
> > +   return -EINVAL;
> > +
> > +   if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
> > +   return -EINVAL;
> > +
> > +   if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
> > +   reg = drm_gpuva_in_region(mgr, addr, range);
> > +   if (unlikely(!reg))
> > +   return -EINVAL;
> > +   }
> > +
> 
> -
> 
> > +   if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
> > +   return -EEXIST;
> > +
> > +   ret = mas_store_gfp(&mas, va, GFP_KERNEL);
> 
> mas_walk() will set the internal maple state to the limits to what it
> finds.  So, instead of an iterator, you can use the walk function and
> ensure there is a large enough area in the existing NULL:
> 
> /*
>  * Nothing at addr, mas now points to the location where the store would
>  * happen
>  */
> if (mas_walk(&mas))
>   return -EEXIST;
> 

For some reason mas_walk() finds an entry and hence this function returns
-EEXIST for the following sequence of insertions.

A = [0xc - 0xf]
B = [0x0 - 0xb]

Interestingly, inserting B before A works fine.

I attached a test module that reproduces the issue. I hope its just a stupid
mistake I just can't spot though.

> /* The NULL entry ends at mas.last, make sure there is room */
> if (mas.last < (addr + range - 1))
>   return -EEXIST;
> 
> /* Limit the store size to the correct end address, and store */
>  mas.last = addr + range - 1;
>  ret = mas_store_gfp(&mas, va, GFP_KERNEL);
> 
/* SPDX-License-Identifier: GPL-2.0 */
#if 1
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#endif

struct maple_tree mt;

struct va {
	u64 addr;
	u64 range;
};

static int va_store(struct va *va)
{
	void *entry = NULL;
	u64 addr = va->addr;
	u64 range = va->range;
	u64 last = addr + range - 1;
	MA_STATE(mas, &mt, addr, addr);
	int ret;

	mas_lock(&mas);

	if ((entry = 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-27 Thread Danilo Krummrich

On 2/23/23 20:09, Liam R. Howlett wrote:

* Danilo Krummrich  [230222 13:13]:

On 2/21/23 19:20, Liam R. Howlett wrote:

* Danilo Krummrich  [230217 08:45]:

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
 making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
 particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
 operations on the GPU VA space. In particular splitting and merging
 of GPU VA mappings, e.g. for intersecting mapping requests or partial
 unmap requests.

Suggested-by: Dave Airlie 
Signed-off-by: Danilo Krummrich 
---
   Documentation/gpu/drm-mm.rst|   31 +
   drivers/gpu/drm/Makefile|1 +
   drivers/gpu/drm/drm_gem.c   |3 +
   drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
   include/drm/drm_drv.h   |6 +
   include/drm/drm_gem.h   |   75 ++
   include/drm/drm_gpuva_mgr.h |  714 +
   7 files changed, 2534 insertions(+)
   create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
   create mode 100644 include/drm/drm_gpuva_mgr.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a52e6f4117d6..c9f120cfe730 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
   .. kernel-doc:: drivers/gpu/drm/drm_mm.c
  :export:

...


+
+/**
+ * drm_gpuva_iter_remove - removes the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ *
+ * This removes the element the iterator currently points to.
+ */
+void
+drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
+{
+   mas_erase(&it->mas);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_remove);
+
+/**
+ * drm_gpuva_insert - insert a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @va: the &drm_gpuva to insert
+ * @addr: the start address of the GPU VA
+ * @range: the range of the GPU VA
+ *
+ * Insert a &drm_gpuva with a given address and range into a
+ * &drm_gpuva_manager.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+struct drm_gpuva *va)
+{
+   u64 addr = va->va.addr;
+   u64 range = va->va.range;
+   MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
+   struct drm_gpuva_region *reg = NULL;
+   int ret;
+
+   if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
+   return -EINVAL;
+
+   if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
+   return -EINVAL;
+
+   if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
+   reg = drm_gpuva_in_region(mgr, addr, range);
+   if (unlikely(!reg))
+   return -EINVAL;
+   }
+


-


+   if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
+   return -EEXIST;
+
+   ret = mas_store_gfp(&mas, va, GFP_KERNEL);


mas_walk() will set the internal maple state to the limits to what it
finds.  So, instead of an iterator, you can use the walk function and
ensure there is a large enough area in the existing NULL:

/*
   * Nothing at addr, mas now points to the location where the store would
   * happen
   */
if (mas_walk(&mas))
return -EEXIST;

/* The NULL entry ends at mas.last, make sure there is room */
if (mas.last < (addr + range - 1))
return -EEXIST;

/* Limit the store size to the correct end address, and store */
   mas.last = addr + range - 1;
   ret = mas_store_gfp(&mas, va, GFP_KERNEL);



Would this variant be significantly more efficient?


Well, what you are doing is walking the tree to see if there's anything
there... then re-walking the tree to store it.  So, yes, it's much more
efficient..  However, writing is heavier.  How much of the time is spent
walking vs writing depends on the size of the tree, but it's rather easy
to do this in a single walk of the tree so why wouldn't you?


I will, I was just curious about how much of an impact it has.





Also, would this also work while already walking the tree?


Yes, to an extent.  If you are at the correct location in the tree, you
can write to that location.  If you are not in the correct location and
try to write to the tree then things will go poorly..  In this scenario,
we are very much walking the tree and writing to it in two steps.



To remove an entry while walking the tree I have a separate function
drm_gpuva_iter_remove(). Would I need something similar for inserting
entries?


I saw that.  Your remove function uses the erase 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-23 Thread Liam R. Howlett
* Danilo Krummrich  [230222 13:13]:
> On 2/21/23 19:20, Liam R. Howlett wrote:
> > * Danilo Krummrich  [230217 08:45]:
> > > Add infrastructure to keep track of GPU virtual address (VA) mappings
> > > with a dedicated VA space manager implementation.
> > > 
> > > New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
> > > start implementing, allow userspace applications to request multiple and
> > > arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
> > > intended to serve the following purposes in this context.
> > > 
> > > 1) Provide infrastructure to track GPU VA allocations and mappings,
> > > making use of the maple_tree.
> > > 
> > > 2) Generically connect GPU VA mappings to their backing buffers, in
> > > particular DRM GEM objects.
> > > 
> > > 3) Provide a common implementation to perform more complex mapping
> > > operations on the GPU VA space. In particular splitting and merging
> > > of GPU VA mappings, e.g. for intersecting mapping requests or partial
> > > unmap requests.
> > > 
> > > Suggested-by: Dave Airlie 
> > > Signed-off-by: Danilo Krummrich 
> > > ---
> > >   Documentation/gpu/drm-mm.rst|   31 +
> > >   drivers/gpu/drm/Makefile|1 +
> > >   drivers/gpu/drm/drm_gem.c   |3 +
> > >   drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
> > >   include/drm/drm_drv.h   |6 +
> > >   include/drm/drm_gem.h   |   75 ++
> > >   include/drm/drm_gpuva_mgr.h |  714 +
> > >   7 files changed, 2534 insertions(+)
> > >   create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> > >   create mode 100644 include/drm/drm_gpuva_mgr.h
> > > 
> > > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> > > index a52e6f4117d6..c9f120cfe730 100644
> > > --- a/Documentation/gpu/drm-mm.rst
> > > +++ b/Documentation/gpu/drm-mm.rst
> > > @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
> > >   .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> > >  :export:
> > ...
> > 
> > > +
> > > +/**
> > > + * drm_gpuva_iter_remove - removes the iterator's current element
> > > + * @it: the &drm_gpuva_iterator
> > > + *
> > > + * This removes the element the iterator currently points to.
> > > + */
> > > +void
> > > +drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
> > > +{
> > > + mas_erase(&it->mas);
> > > +}
> > > +EXPORT_SYMBOL(drm_gpuva_iter_remove);
> > > +
> > > +/**
> > > + * drm_gpuva_insert - insert a &drm_gpuva
> > > + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> > > + * @va: the &drm_gpuva to insert
> > > + * @addr: the start address of the GPU VA
> > > + * @range: the range of the GPU VA
> > > + *
> > > + * Insert a &drm_gpuva with a given address and range into a
> > > + * &drm_gpuva_manager.
> > > + *
> > > + * Returns: 0 on success, negative error code on failure.
> > > + */
> > > +int
> > > +drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> > > +  struct drm_gpuva *va)
> > > +{
> > > + u64 addr = va->va.addr;
> > > + u64 range = va->va.range;
> > > + MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
> > > + struct drm_gpuva_region *reg = NULL;
> > > + int ret;
> > > +
> > > + if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
> > > + return -EINVAL;
> > > +
> > > + if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
> > > + return -EINVAL;
> > > +
> > > + if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
> > > + reg = drm_gpuva_in_region(mgr, addr, range);
> > > + if (unlikely(!reg))
> > > + return -EINVAL;
> > > + }
> > > +
> > 
> > -
> > 
> > > + if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
> > > + return -EEXIST;
> > > +
> > > + ret = mas_store_gfp(&mas, va, GFP_KERNEL);
> > 
> > mas_walk() will set the internal maple state to the limits to what it
> > finds.  So, instead of an iterator, you can use the walk function and
> > ensure there is a large enough area in the existing NULL:
> > 
> > /*
> >   * Nothing at addr, mas now points to the location where the store would
> >   * happen
> >   */
> > if (mas_walk(&mas))
> > return -EEXIST;
> > 
> > /* The NULL entry ends at mas.last, make sure there is room */
> > if (mas.last < (addr + range - 1))
> > return -EEXIST;
> > 
> > /* Limit the store size to the correct end address, and store */
> >   mas.last = addr + range - 1;
> >   ret = mas_store_gfp(, va, GFP_KERNEL);
> > 
> 
> Would this variant be significantly more efficient?

Well, what you are doing is walking the tree to see if there's anything
there... then re-walking the tree to store it.  So, yes, it's much more
efficient..  However, writing is heavier.  How much of the time is spent
walking vs writing depends on the size of the tree, but it's rather easy
to do this in a single walk of the tree so why wouldn't you?

> 
> Also, would this also work while already walking the tree?

Yes, to an extent.  If you are at the correct location in the tree, you
can write to that 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-23 Thread Danilo Krummrich

On 2/23/23 08:06, Christian König wrote:

Am 22.02.23 um 17:40 schrieb Danilo Krummrich:

On 2/22/23 16:14, Christian König wrote:

Am 22.02.23 um 16:07 schrieb Danilo Krummrich:

On 2/22/23 11:25, Christian König wrote:

Am 17.02.23 um 14:44 schrieb Danilo Krummrich:





+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by &struct drm_gpuva_manager keeps track
+ * of a GPU's virtual address (VA) space and manages the corresponding virtual
+ * mappings represented by &drm_gpuva objects. It also keeps track of the
+ * mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
+ * &drm_gpuva objects representing all existent GPU VA mappings using this
+ * &drm_gem_object as backing buffer.
+ *
+ * If the &DRM_GPUVA_MANAGER_REGIONS feature is enabled, a GPU VA mapping can
+ * only be created within a previously allocated &drm_gpuva_region, which
+ * represents a reserved portion of the GPU VA space. GPU VA mappings are not
+ * allowed to span over a &drm_gpuva_region's boundary.
+ *
+ * GPU VA regions can also be flagged as sparse, which allows drivers to create
+ * sparse mappings for a whole GPU VA region in order to support Vulkan
+ * 'Sparse Resources'.


Well since we have now found that there is absolutely no technical 
reason for having those regions could we please drop them?


I disagree this was the outcome of our previous discussion.

In nouveau I still need them to track the separate sparse page 
tables and, as you confirmed previously, Nvidia cards are not the 
only cards supporting this feature.


The second reason is that with regions we can avoid merging between 
buffers, which saves some effort. However, I agree that this 
argument by itself probably doesn't hold too much, since you've 
pointed out in a previous mail that:



1) If we merge and decide to only do that inside certain boundaries 
then those boundaries need to be provided and checked against. This 
burns quite some CPU cycles


2) If we just merge what we can we might have extra page table 
updates which cost time and could result in undesired side effects.


3) If we don't merge at all we have additional housekeeping for the 
mappings and maybe hw restrictions.



However, if a driver uses regions to track its separate sparse page 
tables anyway it gets 1) for free, which is a nice synergy.


I totally agree that regions aren't for everyone though. Hence, I 
made them an optional feature and by default regions are disabled. 
In order to use them drm_gpuva_manager_init() must be called with 
the DRM_GPUVA_MANAGER_REGIONS feature flag.


I really would not want to open code regions or have two GPUVA 
manager instances in nouveau to track sparse page tables. That would 
be really messy, hence I hope we can agree on this to be an optional 
feature.


I absolutely don't think that this is a good idea then. This separate 
handling of sparse page tables is completely Nouveau specific.


Actually, I rely on what you said in a previous mail when I say it's, 
potentially, not specific to nouveau.



This sounds similar to what AMD hw used to have up until gfx8 (I 
think), basically sparse resources were defined through a separate 
mechanism to the address resolution of the page tables. I won't rule 
out that other hardware has similar approaches.




Ok, sounds like I didn't make my point here clear: AMD does have that 
same mechanism for older hw you try to implement here for Nouveau, but 
we have *abandoned* it because it is too much trouble and especially 
overhead to support! In other words we have said "Ok we would need two 
separate components to cleanly handle that, one for newer hw and one for 
older hw.".


My point was more about the potential existence of other hardware having 
similar concepts.


I, personally, can't judge whether actually making use of having 
separate sparse page tables (or similar concepts) makes sense for other 
drivers or not. I think it depends on how the hardware works, which 
limitations it has in handling page tables, etc.


I definitely recognize your experience and that for AMD you decided its 
not worth using a similar mechanism. I would definitely be interested in 
the details. Do you mind sharing them?


However, I think we need to differentiate between whether for AMD 
hardware you just found an approach that worked out better for your 
specific hardware or whether something is fundamentally broken with 
separate sparse page tables (or similar concepts) in general.


Do you think there is something fundamentally broken with such an 
approach? And if so, why?




What you now try to do is to write one component which works for both. 
We have already exercised this idea and came to the conclusion that it's 
not a good path to go down. So you're basically just repeating our mistake.


I mean if it's just for Nouveau then I would say feel free to do 
whatever you want, but since this component is supposed to be used by 
more 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-22 Thread Christian König

Am 22.02.23 um 17:40 schrieb Danilo Krummrich:

On 2/22/23 16:14, Christian König wrote:

Am 22.02.23 um 16:07 schrieb Danilo Krummrich:

On 2/22/23 11:25, Christian König wrote:

Am 17.02.23 um 14:44 schrieb Danilo Krummrich:





+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by &struct drm_gpuva_manager keeps track
+ * of a GPU's virtual address (VA) space and manages the corresponding virtual
+ * mappings represented by &drm_gpuva objects. It also keeps track of the
+ * mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
+ * &drm_gpuva objects representing all existent GPU VA mappings using this
+ * &drm_gem_object as backing buffer.
+ *
+ * If the &DRM_GPUVA_MANAGER_REGIONS feature is enabled, a GPU VA mapping can
+ * only be created within a previously allocated &drm_gpuva_region, which
+ * represents a reserved portion of the GPU VA space. GPU VA mappings are not
+ * allowed to span over a &drm_gpuva_region's boundary.
+ *
+ * GPU VA regions can also be flagged as sparse, which allows drivers to create
+ * sparse mappings for a whole GPU VA region in order to support Vulkan
+ * 'Sparse Resources'.


Well since we have now found that there is absolutely no technical 
reason for having those regions could we please drop them?


I disagree this was the outcome of our previous discussion.

In nouveau I still need them to track the separate sparse page 
tables and, as you confirmed previously, Nvidia cards are not the 
only cards supporting this feature.


The second reason is that with regions we can avoid merging between 
buffers, which saves some effort. However, I agree that this 
argument by itself probably doesn't hold too much, since you've 
pointed out in a previous mail that:



1) If we merge and decide to only do that inside certain boundaries 
then those boundaries need to be provided and checked against. This 
burns quite some CPU cycles


2) If we just merge what we can we might have extra page table 
updates which cost time and could result in undesired side effects.


3) If we don't merge at all we have additional housekeeping for the 
mappings and maybe hw restrictions.



However, if a driver uses regions to track its separate sparse page 
tables anyway it gets 1) for free, which is a nice synergy.


I totally agree that regions aren't for everyone though. Hence, I 
made them an optional feature and by default regions are disabled. 
In order to use them drm_gpuva_manager_init() must be called with 
the DRM_GPUVA_MANAGER_REGIONS feature flag.


I really would not want to open code regions or have two GPUVA 
manager instances in nouveau to track sparse page tables. That would 
be really messy, hence I hope we can agree on this to be an optional 
feature.


I absolutely don't think that this is a good idea then. This separate 
handling of sparse page tables is completely Nouveau specific.


Actually, I rely on what you said in a previous mail when I say it's, 
potentially, not specific to nouveau.



This sounds similar to what AMD hw used to have up until gfx8 (I 
think), basically sparse resources were defined through a separate 
mechanism to the address resolution of the page tables. I won't rule 
out that other hardware has similar approaches.




Ok, sounds like I didn't make my point here clear: AMD does have that 
same mechanism for older hw you try to implement here for Nouveau, but 
we have *abandoned* it because it is too much trouble and especially 
overhead to support! In other words we have said "Ok we would need two 
separate components to cleanly handle that, one for newer hw and one for 
older hw.".


What you now try to do is to write one component which works for both. 
We have already exercised this idea and came to the conclusion that it's 
not a good path to go down. So you're basically just repeating our mistake.


I mean if it's just for Nouveau then I would say feel free to do 
whatever you want, but since this component is supposed to be used by 
more drivers then I strongly think we need to tackle this from a 
different side.


Even when it's optional feature mixing this into the common handling 
is exactly what I pointed out as not properly separating between 
hardware specific and hardware agnostic functionality.


Optionally having regions is *not* a hardware specific concept, 
drivers might use it for a hardware specific purpose though. Which 
potentially is the case for almost every DRM helper.


Drivers can use regions only for the sake of not merging between 
buffer boundaries as well. Some drivers might prefer this over "never 
merge" or "always merge", depending on the cost of re-organizing page 
tables for unnecessary splits/merges, without having the need of 
tracking separate sparse page tables.


Its just that I think *if* a driver needs to track separate sparse 
page tables anyways its a nice synergy since then there is no extra 
cost for getting this optimization.



Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-22 Thread Danilo Krummrich

On 2/21/23 19:20, Liam R. Howlett wrote:

* Danilo Krummrich  [230217 08:45]:

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.

Suggested-by: Dave Airlie 
Signed-off-by: Danilo Krummrich 
---
  Documentation/gpu/drm-mm.rst|   31 +
  drivers/gpu/drm/Makefile|1 +
  drivers/gpu/drm/drm_gem.c   |3 +
  drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
  include/drm/drm_drv.h   |6 +
  include/drm/drm_gem.h   |   75 ++
  include/drm/drm_gpuva_mgr.h |  714 +
  7 files changed, 2534 insertions(+)
  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
  create mode 100644 include/drm/drm_gpuva_mgr.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a52e6f4117d6..c9f120cfe730 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
 :export:
  

...


+
+/**
+ * drm_gpuva_iter_remove - removes the iterator's current element
+ * @it: the &drm_gpuva_iterator
+ *
+ * This removes the element the iterator currently points to.
+ */
+void
+drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
+{
+   mas_erase(&it->mas);
+}
+EXPORT_SYMBOL(drm_gpuva_iter_remove);
+
+/**
+ * drm_gpuva_insert - insert a &drm_gpuva
+ * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @va: the &drm_gpuva to insert
+ * @addr: the start address of the GPU VA
+ * @range: the range of the GPU VA
+ *
+ * Insert a &drm_gpuva with a given address and range into a
+ * &drm_gpuva_manager.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+struct drm_gpuva *va)
+{
+   u64 addr = va->va.addr;
+   u64 range = va->va.range;
+   MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
+   struct drm_gpuva_region *reg = NULL;
+   int ret;
+
+   if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
+   return -EINVAL;
+
+   if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
+   return -EINVAL;
+
+   if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
+   reg = drm_gpuva_in_region(mgr, addr, range);
+   if (unlikely(!reg))
+   return -EINVAL;
+   }
+


-


+   if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
+   return -EEXIST;
+
+   ret = mas_store_gfp(&mas, va, GFP_KERNEL);


mas_walk() will set the internal maple state to the limits to what it
finds.  So, instead of an iterator, you can use the walk function and
ensure there is a large enough area in the existing NULL:

/*
  * Nothing at addr, mas now points to the location where the store would
  * happen
  */
if (mas_walk(&mas))
return -EEXIST;

/* The NULL entry ends at mas.last, make sure there is room */
if (mas.last < (addr + range - 1))
return -EEXIST;

/* Limit the store size to the correct end address, and store */
  mas.last = addr + range - 1;
  ret = mas_store_gfp(&mas, va, GFP_KERNEL);



Would this variant be significantly more efficient?

Also, would this also work while already walking the tree?

To remove an entry while walking the tree I have a separate function 
drm_gpuva_iter_remove(). Would I need something similar for inserting 
entries?


I already provided this example in a separate mail thread, but it may 
make sense to move this to the mailing list:


In __drm_gpuva_sm_map() we're iterating a given range of the tree, where 
the given range is the size of the newly requested mapping. 
__drm_gpuva_sm_map() invokes a callback for each sub-operation that 
needs to be taken in order to fulfill this mapping request. In most 
cases such a callback just creates a drm_gpuva_op object and stores it 
in a list.


However, drivers can also implement the callback, such that they 
directly execute this operation within the callback.


Let's have a look at the following example:

      0     a     2
old:  |-----------|       (bo_offset=n)

            1     b     3
req:        |-----------| (bo_offset=m)

      0  a' 1     b     3
new:  |-----|-----------| (a.bo_offset=n,b.bo_offset=m)

This would 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-22 Thread Danilo Krummrich

On 2/22/23 16:14, Christian König wrote:

Am 22.02.23 um 16:07 schrieb Danilo Krummrich:

On 2/22/23 11:25, Christian König wrote:

Am 17.02.23 um 14:44 schrieb Danilo Krummrich:





+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by struct drm_gpuva_manager keeps
+ * track of a GPU's virtual address (VA) space and manages the
+ * corresponding virtual mappings represented by &drm_gpuva objects. It
+ * also keeps track of the mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a list (and a corresponding list
+ * lock) of &drm_gpuva objects representing all existent GPU VA mappings
+ * using this &drm_gem_object as backing buffer.
+ *
+ * If the &DRM_GPUVA_MANAGER_REGIONS feature is enabled, a GPU VA
+ * mapping can only be created within a previously allocated
+ * &drm_gpuva_region, which represents a reserved portion of the GPU VA
+ * space. GPU VA mappings are not allowed to span over a
+ * &drm_gpuva_region's boundary.
+ *
+ * GPU VA regions can also be flagged as sparse, which allows drivers to
+ * create sparse mappings for a whole GPU VA region in order to support
+ * Vulkan 'Sparse Resources'.


Well since we have now found that there is absolutely no technical 
reason for having those regions could we please drop them?


I disagree that this was the outcome of our previous discussion.

In nouveau I still need them to track the separate sparse page tables 
and, as you confirmed previously, Nvidia cards are not the only cards 
supporting this feature.


The second reason is that with regions we can avoid merging between 
buffers, which saves some effort. However, I agree that this argument 
by itself probably doesn't hold too much, since you've pointed out in 
a previous mail that:



1) If we merge and decide to only do that inside certain boundaries 
then those boundaries need to be provided and checked against. This 
burns quite some CPU cycles


2) If we just merge what we can we might have extra page table updates 
which cost time and could result in undesired side effects.


3) If we don't merge at all we have additional housekeeping for the 
mappings and maybe hw restrictions.



However, if a driver uses regions to track its separate sparse page 
tables anyway it gets 1) for free, which is a nice synergy.


I totally agree that regions aren't for everyone though. Hence, I made 
them an optional feature and by default regions are disabled. In order 
to use them drm_gpuva_manager_init() must be called with the 
DRM_GPUVA_MANAGER_REGIONS feature flag.


I really would not want to open code regions or have two GPUVA manager 
instances in nouveau to track sparse page tables. That would be really 
messy, hence I hope we can agree on this to be an optional feature.


I absolutely don't think that this is a good idea then. This separate 
handling of sparse page tables is completely Nouveau specific.


Actually, I rely on what you said in a previous mail when I say it's, 
potentially, not specific to nouveau.



This sounds similar to what AMD hw used to have up until gfx8 (I think), 
basically sparse resources where defined through a separate mechanism to 
the address resolution of the page tables. I won't rule out that other 
hardware has similar approaches.





Even when it's an optional feature, mixing this into the common handling 
is exactly what I pointed out as not properly separating between 
hardware specific and hardware agnostic functionality.


Optionally having regions is *not* a hardware specific concept, drivers 
might use it for a hardware specific purpose though, which potentially 
is the case for almost every DRM helper.


Drivers can use regions only for the sake of not merging between buffer 
boundaries as well. Some drivers might prefer this over "never merge" or 
"always merge", depending on the cost of re-organizing page tables for 
unnecessary splits/merges, without having the need of tracking separate 
sparse page tables.


Its just that I think *if* a driver needs to track separate sparse page 
tables anyways its a nice synergy since then there is no extra cost for 
getting this optimization.




This is exactly the problem we ran into with TTM as well and I've spent 
a massive amount of time to clean that up again.


Admittedly, I don't know what problems you are referring to. However, I 
don't see what kind of trouble allowing drivers to optionally track 
regions could cause.



Regards,
Christian.





I don't really see a need for them any more.

Regards,
Christian.









Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-22 Thread Christian König

Am 17.02.23 um 14:44 schrieb Danilo Krummrich:

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.

Suggested-by: Dave Airlie 
Signed-off-by: Danilo Krummrich 
---
  Documentation/gpu/drm-mm.rst|   31 +
  drivers/gpu/drm/Makefile|1 +
  drivers/gpu/drm/drm_gem.c   |3 +
  drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
  include/drm/drm_drv.h   |6 +
  include/drm/drm_gem.h   |   75 ++
  include/drm/drm_gpuva_mgr.h |  714 +
  7 files changed, 2534 insertions(+)
  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
  create mode 100644 include/drm/drm_gpuva_mgr.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a52e6f4117d6..c9f120cfe730 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
 :export:
  
+DRM GPU VA Manager
+==
+
+Overview
+
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Overview
+
+Split and Merge
+---
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Split and Merge
+
+Locking
+---
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Locking
+
+
+DRM GPU VA Manager Function References
+--
+
+.. kernel-doc:: include/drm/drm_gpuva_mgr.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :export:
+
  DRM Buddy Allocator
  ===
  
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index d40defbb0347..4d098efffb98 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -45,6 +45,7 @@ drm-y := \
drm_vblank.o \
drm_vblank_work.o \
drm_vma_manager.o \
+   drm_gpuva_mgr.o \
drm_writeback.o
  drm-$(CONFIG_DRM_LEGACY) += \
drm_agpsupport.o \
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 59a0bb5ebd85..65115fe88627 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
	if (!obj->resv)
		obj->resv = &obj->_resv;
 
+	if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
+		drm_gem_gpuva_init(obj);
+
	drm_vma_node_reset(&obj->vma_node);
	INIT_LIST_HEAD(&obj->lru_node);
 }
diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
new file mode 100644
index ..19f583704562
--- /dev/null
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -0,0 +1,1704 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ * Danilo Krummrich 
+ *
+ */
+
+#include 
+#include 
+
+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by struct drm_gpuva_manager keeps track
+ * of a GPU's virtual address (VA) space and manages the corresponding virtual
+ * mappings represented by &drm_gpuva objects. It also keeps track of the
+ * mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-21 Thread Liam R. Howlett
* Danilo Krummrich  [230217 08:45]:
> Add infrastructure to keep track of GPU virtual address (VA) mappings
> with a dedicated VA space manager implementation.
> 
> New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
> start implementing, allow userspace applications to request multiple and
> arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
> intended to serve the following purposes in this context.
> 
> 1) Provide infrastructure to track GPU VA allocations and mappings,
>making use of the maple_tree.
> 
> 2) Generically connect GPU VA mappings to their backing buffers, in
>particular DRM GEM objects.
> 
> 3) Provide a common implementation to perform more complex mapping
>operations on the GPU VA space. In particular splitting and merging
>of GPU VA mappings, e.g. for intersecting mapping requests or partial
>unmap requests.
> 
> Suggested-by: Dave Airlie 
> Signed-off-by: Danilo Krummrich 
> ---
>  Documentation/gpu/drm-mm.rst|   31 +
>  drivers/gpu/drm/Makefile|1 +
>  drivers/gpu/drm/drm_gem.c   |3 +
>  drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
>  include/drm/drm_drv.h   |6 +
>  include/drm/drm_gem.h   |   75 ++
>  include/drm/drm_gpuva_mgr.h |  714 +
>  7 files changed, 2534 insertions(+)
>  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
>  create mode 100644 include/drm/drm_gpuva_mgr.h
> 
> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> index a52e6f4117d6..c9f120cfe730 100644
> --- a/Documentation/gpu/drm-mm.rst
> +++ b/Documentation/gpu/drm-mm.rst
> @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
>  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> :export:
>  
...

> +
> +/**
> + * drm_gpuva_remove_iter - removes the iterator's current element
> + * @it: the &drm_gpuva_iterator
> + *
> + * This removes the element the iterator currently points to.
> + */
> +void
> +drm_gpuva_iter_remove(struct drm_gpuva_iterator *it)
> +{
> + mas_erase(&it->mas);
> +}
> +EXPORT_SYMBOL(drm_gpuva_iter_remove);
> +
> +/**
> + * drm_gpuva_insert - insert a &drm_gpuva
> + * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
> + * @va: the &drm_gpuva to insert
> + * @addr: the start address of the GPU VA
> + * @range: the range of the GPU VA
> + *
> + * Insert a &drm_gpuva with a given address and range into a
> + * &drm_gpuva_manager.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuva_insert(struct drm_gpuva_manager *mgr,
> +  struct drm_gpuva *va)
> +{
> + u64 addr = va->va.addr;
> + u64 range = va->va.range;
> + MA_STATE(mas, &mgr->va_mt, addr, addr + range - 1);
> + struct drm_gpuva_region *reg = NULL;
> + int ret;
> +
> + if (unlikely(!drm_gpuva_in_mm_range(mgr, addr, range)))
> + return -EINVAL;
> +
> + if (unlikely(drm_gpuva_in_kernel_region(mgr, addr, range)))
> + return -EINVAL;
> +
> + if (mgr->flags & DRM_GPUVA_MANAGER_REGIONS) {
> + reg = drm_gpuva_in_region(mgr, addr, range);
> + if (unlikely(!reg))
> + return -EINVAL;
> + }
> +

-

> + if (unlikely(drm_gpuva_find_first(mgr, addr, range)))
> + return -EEXIST;
> +
> + ret = mas_store_gfp(&mas, va, GFP_KERNEL);

mas_walk() will set the internal maple state to the limits of what it
finds.  So, instead of an iterator, you can use the walk function and
ensure there is a large enough area in the existing NULL:

/*
 * Nothing at addr, mas now points to the location where the store would
 * happen
 */
if (mas_walk(&mas))
	return -EEXIST;

/* The NULL entry ends at mas.last, make sure there is room */
if (mas.last < (addr + range - 1))
	return -EEXIST;

/* Limit the store size to the correct end address, and store */
mas.last = addr + range - 1;
ret = mas_store_gfp(&mas, va, GFP_KERNEL);

> + if (unlikely(ret))
> + return ret;
> +
> + va->mgr = mgr;
> + va->region = reg;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_gpuva_insert);
> +
> +/**
> + * drm_gpuva_remove - remove a &drm_gpuva
> + * @va: the &drm_gpuva to remove
> + *
> + * This removes the given &drm_gpuva from the underlying tree.
> + */
> +void
> +drm_gpuva_remove(struct drm_gpuva *va)
> +{
> + MA_STATE(mas, &va->mgr->va_mt, va->va.addr, 0);
> +
> + mas_erase(&mas);
> +}
> +EXPORT_SYMBOL(drm_gpuva_remove);
> +
...

> +/**
> + * drm_gpuva_find_first - find the first &drm_gpuva in the given range
> + * @mgr: the &drm_gpuva_manager to search in
> + * @addr: the &drm_gpuvas address
> + * @range: the &drm_gpuvas range
> + *
> + * Returns: the first &drm_gpuva within the given range
> + */
> +struct drm_gpuva *
> +drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
> +  u64 addr, u64 range)
> +{
> + MA_STATE(mas, &mgr->va_mt, addr, 0);
> +
> + return mas_find(&mas, addr + range - 1);
> +}
> +EXPORT_SYMBOL(drm_gpuva_find_first);
> +
> +/**
> + * drm_gpuva_find 

Re: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-17 Thread kernel test robot
Hi Danilo,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on 48075a66fca613477ac1969b576a93ef5db0164f]

url:
https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-execution-context-for-GEM-buffers/20230217-215101
base:   48075a66fca613477ac1969b576a93ef5db0164f
patch link:
https://lore.kernel.org/r/20230217134422.14116-6-dakr%40redhat.com
patch subject: [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA 
mappings
config: mips-allyesconfig 
(https://download.01.org/0day-ci/archive/20230218/202302180805.b0ab40v5-...@intel.com/config)
compiler: mips-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/00132cc92b6745cfd51c0d5df4c246a848f2ceaa
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Danilo-Krummrich/drm-execution-context-for-GEM-buffers/20230217-215101
git checkout 00132cc92b6745cfd51c0d5df4c246a848f2ceaa
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=mips olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=mips SHELL=/bin/bash drivers/gpu/drm/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 
| Link: 
https://lore.kernel.org/oe-kbuild-all/202302180805.b0ab40v5-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_gpuva_mgr.c:1383:5: warning: no previous prototype for 
>> 'drm_gpuva_sm_step' [-Wmissing-prototypes]
1383 | int drm_gpuva_sm_step(struct drm_gpuva_op *__op, void *priv)
 | ^
--
>> drivers/gpu/drm/drm_gpuva_mgr.c:529: warning: expecting prototype for 
>> drm_gpuva_remove_iter(). Prototype was for drm_gpuva_iter_remove() instead
   drivers/gpu/drm/drm_gpuva_mgr.c:549: warning: Excess function parameter 
'addr' description in 'drm_gpuva_insert'
   drivers/gpu/drm/drm_gpuva_mgr.c:549: warning: Excess function parameter 
'range' description in 'drm_gpuva_insert'
   drivers/gpu/drm/drm_gpuva_mgr.c:765: warning: Excess function parameter 
'addr' description in 'drm_gpuva_region_insert'
   drivers/gpu/drm/drm_gpuva_mgr.c:765: warning: Excess function parameter 
'range' description in 'drm_gpuva_region_insert'
   drivers/gpu/drm/drm_gpuva_mgr.c:1345: warning: Excess function parameter 
'ops' description in 'drm_gpuva_sm_unmap'
   drivers/gpu/drm/drm_gpuva_mgr.c:1589: warning: Function parameter or member 
'addr' not described in 'drm_gpuva_prefetch_ops_create'
   drivers/gpu/drm/drm_gpuva_mgr.c:1589: warning: Function parameter or member 
'range' not described in 'drm_gpuva_prefetch_ops_create'
   drivers/gpu/drm/drm_gpuva_mgr.c:1589: warning: Excess function parameter 
'req_addr' description in 'drm_gpuva_prefetch_ops_create'
   drivers/gpu/drm/drm_gpuva_mgr.c:1589: warning: Excess function parameter 
'req_range' description in 'drm_gpuva_prefetch_ops_create'


vim +/drm_gpuva_sm_step +1383 drivers/gpu/drm/drm_gpuva_mgr.c

  1382  
> 1383  int drm_gpuva_sm_step(struct drm_gpuva_op *__op, void *priv)
  1384  {
  1385  struct {
  1386  struct drm_gpuva_manager *mgr;
  1387  struct drm_gpuva_ops *ops;
  1388  } *args = priv;
  1389  struct drm_gpuva_manager *mgr = args->mgr;
  1390  struct drm_gpuva_ops *ops = args->ops;
  1391  struct drm_gpuva_op *op;
  1392  
  1393  op = gpuva_op_alloc(mgr);
  1394  if (unlikely(!op))
  1395  goto err;
  1396  
  1397  memcpy(op, __op, sizeof(*op));
  1398  
  1399  if (op->op == DRM_GPUVA_OP_REMAP) {
  1400  struct drm_gpuva_op_remap *__r = &__op->remap;
  1401  struct drm_gpuva_op_remap *r = &op->remap;
  1402  
  1403  r->unmap = kmemdup(__r->unmap, sizeof(*r->unmap),
  1404 GFP_KERNEL);
  1405  if (unlikely(!r->unmap))
  1406  goto err_free_op;
  1407  
  1408  if (__r->prev) {
  1409  r->prev = kmemdup(__r->prev, sizeof(*r->prev),
  1410GFP_KERNEL);
  1411  if (unlikely(!r->prev))
  1412  goto err_free_unmap;
  1413  }
  1414  
  1415  if (__r->next) {
  1416  r->next = kmemdup(__r->next, sizeof(*r->next),
  1417GFP_KERNEL);
  1418 

[PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings

2023-02-17 Thread Danilo Krummrich
Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
   making use of the maple_tree.

2) Generically connect GPU VA mappings to their backing buffers, in
   particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
   operations on the GPU VA space. In particular splitting and merging
   of GPU VA mappings, e.g. for intersecting mapping requests or partial
   unmap requests.

Suggested-by: Dave Airlie 
Signed-off-by: Danilo Krummrich 
---
 Documentation/gpu/drm-mm.rst|   31 +
 drivers/gpu/drm/Makefile|1 +
 drivers/gpu/drm/drm_gem.c   |3 +
 drivers/gpu/drm/drm_gpuva_mgr.c | 1704 +++
 include/drm/drm_drv.h   |6 +
 include/drm/drm_gem.h   |   75 ++
 include/drm/drm_gpuva_mgr.h |  714 +
 7 files changed, 2534 insertions(+)
 create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
 create mode 100644 include/drm/drm_gpuva_mgr.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a52e6f4117d6..c9f120cfe730 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
 .. kernel-doc:: drivers/gpu/drm/drm_mm.c
:export:
 
+DRM GPU VA Manager
+==
+
+Overview
+
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Overview
+
+Split and Merge
+---
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Split and Merge
+
+Locking
+---
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :doc: Locking
+
+
+DRM GPU VA Manager Function References
+--
+
+.. kernel-doc:: include/drm/drm_gpuva_mgr.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
+   :export:
+
 DRM Buddy Allocator
 ===
 
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index d40defbb0347..4d098efffb98 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -45,6 +45,7 @@ drm-y := \
drm_vblank.o \
drm_vblank_work.o \
drm_vma_manager.o \
+   drm_gpuva_mgr.o \
drm_writeback.o
 drm-$(CONFIG_DRM_LEGACY) += \
drm_agpsupport.o \
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 59a0bb5ebd85..65115fe88627 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
	if (!obj->resv)
		obj->resv = &obj->_resv;
 
+	if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
+		drm_gem_gpuva_init(obj);
+
	drm_vma_node_reset(&obj->vma_node);
	INIT_LIST_HEAD(&obj->lru_node);
 }
diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
new file mode 100644
index ..19f583704562
--- /dev/null
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -0,0 +1,1704 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Red Hat.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ * Danilo Krummrich 
+ *
+ */
+
+#include 
+#include 
+
+/**
+ * DOC: Overview
+ *
+ * The DRM GPU VA Manager, represented by struct drm_gpuva_manager keeps track
+ * of a GPU's virtual address (VA) space and manages the corresponding virtual
+ * mappings represented by &drm_gpuva objects. It also keeps track of the
+ * mapping's backing &drm_gem_object buffers.
+ *
+ * &drm_gem_object buffers maintain a list (and a corresponding list lock) of
+ * &drm_gpuva objects representing