Re: [Intel-gfx] [RFC PATCH 8/9] drm/gem: Associate GEM objects with drm cgroup

2021-03-18 Thread Brian Welty


On 3/18/2021 3:16 AM, Daniel Vetter wrote:
> On Sat, Mar 6, 2021 at 1:44 AM Brian Welty  wrote:
>>
>>
>> On 2/11/2021 7:34 AM, Daniel Vetter wrote:
>>> On Wed, Feb 10, 2021 at 02:00:57PM -0800, Brian Welty wrote:
>>>>
>>>> On 2/9/2021 2:54 AM, Daniel Vetter wrote:
>>>>> On Tue, Jan 26, 2021 at 01:46:25PM -0800, Brian Welty wrote:
>>>>>> This patch adds tracking of which cgroup to make charges against for a
>>>>>> given GEM object.  We associate the current task's cgroup with GEM objects
>>>>>> as they are created.  First user of this is for charging DRM cgroup for
>>>>>> device memory allocations.  The intended behavior is for device drivers to
>>>>>> make the cgroup charging calls at the time that backing store is allocated
>>>>>> or deallocated for the object.
>>>>>>
>>>>>> Exported functions are provided for charging memory allocations for a
>>>>>> GEM object to DRM cgroup. To aid in debugging, we store how many bytes
>>>>>> have been charged inside the GEM object.  Add helpers for setting and
>>>>>> clearing the object's associated cgroup which will check that charges are
>>>>>> not being leaked.
>>>>>>
>>>>>> For shared objects, this may make the charge against a cgroup that is
>>>>>> potentially not the same cgroup as the process using the memory.  Based
>>>>>> on the memory cgroup's discussion of "memory ownership", this seems
>>>>>> acceptable [1].
>>>>>>
>>>>>> [1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt, "Memory Ownership"
>>>>>>
>>>>>> Signed-off-by: Brian Welty 
>>>>>
>>>>> Since for now we only have the generic gpu/xpu/bikeshed.memory bucket that
>>>>> counts everything, why don't we also charge in these gem functions?
>>>>
>>>> I'm not sure what you mean exactly.  You want to merge/move the charging logic
>>>> proposed in patch #5 (drm_cgroup_try_charge in kernel/cgroup/drm.c) into
>>>> drm_gem_object_charge_mem() ?
>>>>
>>>> Or reading below, I think you are okay keeping the logic separated as is, but
>>>> you want much of the code in kernel/cgroup/drm.c moved to drivers/gpu/cgroup ?
>>>> Yes, I see that should allow us to reduce the number of exported functions.
>>>
>>> Both. I mean we'd need to look at the code again when it's shuffled, but
>>> I'd say:
>>>
>>> - cgroup code and the charging for general gpu memory moves to
>>>   drivers/gpu/cgroup, so dma-buf heaps can use it too.
>>>
>>> - the charging for gem buffers moves into core gem code, so it happens for
>>>   all gpu drivers and all gem buffer allocations.
>>
>> Daniel, I'm not sure we're in sync on what 'charging for general gpu memory'
>> means.  Thus far, I have been proposing to charge/uncharge when backing store is
>> allocated/freed.  And thus, this would be done in the DRM driver (so then also in
>> the dma-buf exporter).
>> I can't see how we'd hoist this part into drm gem code.
>> The memory limit in this series is for VRAM usage/limit, not GEM buffers...
> 
> Yes this would be at gem buffer creation time. And just to get cgroups
> for drm up

Okay, but it's not one of the ones in Tejun's list to start with:
   https://lists.freedesktop.org/archives/dri-devel/2020-April/262141.html
I hoped we would start by pursuing those (gpu.weight and gpu.memory.high)
as a first step.

Limiting GEM buffers is essentially controlling virtual memory size, which
tends to just always get set to unlimited.
It would be nice to get consensus from maintainers before proceeding to
implement this.


> 
>> Unless you are talking about charging for GEM buffer creation?  But this is
>> more of a 'soft resource', more along the lines of Kenny's earlier GEM buffer
>> limit control.
>> I raised an issue with this then, and at the time, Tejun agreed we should keep
>> to 'hard resource' controls, see [1] and [2].
>>
>> [1] https://lists.freedesktop.org/archives/dri-devel/2019-May/218071.html
>> [2] https://lists.freedesktop.org/archives/dri-devel/2020-April/262141.html
>>
>>>
>>> - this might or might not mean a bunch less exported stuff from the
>>>   cgroups files (si

Re: [Intel-gfx] [RFC PATCH 8/9] drm/gem: Associate GEM objects with drm cgroup

2021-03-05 Thread Brian Welty


On 2/11/2021 7:34 AM, Daniel Vetter wrote:
> On Wed, Feb 10, 2021 at 02:00:57PM -0800, Brian Welty wrote:
>>
>> On 2/9/2021 2:54 AM, Daniel Vetter wrote:
>>> On Tue, Jan 26, 2021 at 01:46:25PM -0800, Brian Welty wrote:
>>>> This patch adds tracking of which cgroup to make charges against for a
>>>> given GEM object.  We associate the current task's cgroup with GEM objects
>>>> as they are created.  First user of this is for charging DRM cgroup for
>>>> device memory allocations.  The intended behavior is for device drivers to
>>>> make the cgroup charging calls at the time that backing store is allocated
>>>> or deallocated for the object.
>>>>
>>>> Exported functions are provided for charging memory allocations for a
>>>> GEM object to DRM cgroup. To aid in debugging, we store how many bytes
>>>> have been charged inside the GEM object.  Add helpers for setting and
>>>> clearing the object's associated cgroup which will check that charges are
>>>> not being leaked.
>>>>
>>>> For shared objects, this may make the charge against a cgroup that is
>>>> potentially not the same cgroup as the process using the memory.  Based
>>>> on the memory cgroup's discussion of "memory ownership", this seems
>>>> acceptable [1].
>>>>
>>>> [1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt, "Memory Ownership"
>>>>
>>>> Signed-off-by: Brian Welty 
>>>
>>> Since for now we only have the generic gpu/xpu/bikeshed.memory bucket that
>>> counts everything, why don't we also charge in these gem functions?
>>
>> I'm not sure what you mean exactly.  You want to merge/move the charging logic
>> proposed in patch #5 (drm_cgroup_try_charge in kernel/cgroup/drm.c) into
>> drm_gem_object_charge_mem() ?
>>
>> Or reading below, I think you are okay keeping the logic separated as is, but
>> you want much of the code in kernel/cgroup/drm.c moved to drivers/gpu/cgroup ?
>> Yes, I see that should allow us to reduce the number of exported functions.
> 
> Both. I mean we'd need to look at the code again when it's shuffled, but
> I'd say:
> 
> - cgroup code and the charging for general gpu memory moves to
>   drivers/gpu/cgroup, so dma-buf heaps can use it too.
> 
> - the charging for gem buffers moves into core gem code, so it happens for
>   all gpu drivers and all gem buffer allocations.

Daniel, I'm not sure we're in sync on what 'charging for general gpu memory'
means.  Thus far, I have been proposing to charge/uncharge when backing store is
allocated/freed.  And thus, this would be done in the DRM driver (so then also in
the dma-buf exporter).
I can't see how we'd hoist this part into drm gem code.
The memory limit in this series is for VRAM usage/limit, not GEM buffers...

Unless you are talking about charging for GEM buffer creation?  But this is
more of a 'soft resource', more along the lines of Kenny's earlier GEM buffer limit
control.
I raised an issue with this then, and at the time, Tejun agreed we should keep to
'hard resource' controls, see [1] and [2].

[1] https://lists.freedesktop.org/archives/dri-devel/2019-May/218071.html
[2] https://lists.freedesktop.org/archives/dri-devel/2020-April/262141.html

> 
> - this might or might not mean a bunch less exported stuff from the
>   cgroups files (since you don't need separate steps for linking a gem
>   object to a cgroup from the actual charging), and probably no exports
>   anymore for drivers (since they charge nothing). That will change
>   when we add charging for specific memory pools I guess, but we add that
>   when we add that functionality.

... so considering VRAM charging, then yes, we very much need to have exported
functions for drivers to do the charging.
But these can be exported from drm.ko (or a new .ko?) instead of the kernel.  Is
that still the preference?   Also, if the number of exported functions is a concern,
we can replace some of them with the use of function pointers.
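As a rough sketch of the function-pointer idea (the names here are hypothetical
and not part of this series), the cgroup core could hand drm.ko a single ops
structure at drmcg_bind() time instead of exporting several symbols:

  /* Hypothetical sketch only; not part of this series. */
  struct drmcg_charge_ops {
          int  (*try_charge)(struct drmcg *drmcg, struct drm_device *dev,
                             enum drmcg_res_type type, u64 usage);
          void (*uncharge)(struct drmcg *drmcg, struct drm_device *dev,
                           enum drmcg_res_type type, u64 usage);
  };

  /* Filled in by the cgroup controller; drivers would call through this
   * instead of the individual exported drm_cgroup_* functions. */
  extern const struct drmcg_charge_ops *drmcg_charge_ops;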

So then returning to this comment of yours:

> - cgroup code and the charging for general gpu memory moves to
>   drivers/gpu/cgroup, so dma-buf heaps can use it too.

If you agree that we are charging just at the backing-store level, then I think
the logic belongs in drivers/gpu/drm/cgroup, since charging is done in the DRM
driver (also the dma-buf exporter).  In other words, it is part of drm.
If I understand correctly, dma-buf heaps are exporters of system memory and don't
need to charge against the gpu controller?
I will need some help to understand the dma-buf heap use case a bit more.


Thanks,
-Brian

> 
>>> Also, that would remove the need for all these functions exported to
>>> 

Re: [Intel-gfx] [RFC PATCH 8/9] drm/gem: Associate GEM objects with drm cgroup

2021-02-10 Thread Brian Welty


On 2/9/2021 2:54 AM, Daniel Vetter wrote:
> On Tue, Jan 26, 2021 at 01:46:25PM -0800, Brian Welty wrote:
>> This patch adds tracking of which cgroup to make charges against for a
>> given GEM object.  We associate the current task's cgroup with GEM objects
>> as they are created.  First user of this is for charging DRM cgroup for
>> device memory allocations.  The intended behavior is for device drivers to
>> make the cgroup charging calls at the time that backing store is allocated
>> or deallocated for the object.
>>
>> Exported functions are provided for charging memory allocations for a
>> GEM object to DRM cgroup. To aid in debugging, we store how many bytes
>> have been charged inside the GEM object.  Add helpers for setting and
>> clearing the object's associated cgroup which will check that charges are
>> not being leaked.
>>
>> For shared objects, this may make the charge against a cgroup that is
>> potentially not the same cgroup as the process using the memory.  Based
>> on the memory cgroup's discussion of "memory ownership", this seems
>> acceptable [1].
>>
>> [1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt, "Memory Ownership"
>>
>> Signed-off-by: Brian Welty 
> 
> Since for now we only have the generic gpu/xpu/bikeshed.memory bucket that
> counts everything, why don't we also charge in these gem functions?

I'm not sure what you mean exactly.  You want to merge/move the charging logic
proposed in patch #5 (drm_cgroup_try_charge in kernel/cgroup/drm.c) into
drm_gem_object_charge_mem() ?

Or reading below, I think you are okay keeping the logic separated as is, but
you want much of the code in kernel/cgroup/drm.c moved to drivers/gpu/cgroup ?
Yes, I see that should allow us to reduce the number of exported functions.

> 
> Also, that would remove the need for all these functions exported to
> drivers. Plus the cgroups setup could also move fully into drm core code,
> since all drivers (*) support it.
> That way this would really be a fully
> generic cgroups controller, and we could land it.


Patch #2 proposed to have a few setup functions called during drm device
registration.
You are suggesting to have this more tightly integrated?
Okay, I can see what that looks like.  It's true most of the exported functions
from kernel/cgroup/drm.c were taking a struct drm_device pointer, so it seems it
can be restructured as you suggest.  But I guess we will always need some logic in
kernel/cgroup/drm.c to encapsulate the allocation of the 'struct cgroup_subsys_state'
for this new controller.
But I'm not sure I see how this makes the controller 'fully generic' as you
describe.


> 
> The other things I'd do:
> - drop gpu scheduling controller from the initial patch series. Yes we'll
>   need it, but we also need vram limits and all these things for full
>   featured controller. Having the minimal viable cgroup controller in
>   upstream would unblock all these other things, and we could discuss them
>   in separate patch series, instead of one big bikeshed that never reaches
>   full consensus.
> 
> - the xpu thing is probably real, I just chatted with Android people for
>   their gpu memory accounting needs, and cgroups sounds like a solution
>   for them too. But unlike on desktop/server linux, on Android all shared
>   buffers are allocated from dma-buf heaps, so outside of drm, and hence a
>   cgroup controller that's tightly tied to drm isn't that useful. So I
>   think we should move the controller/charge functions up one level into
>   drivers/gpu/cgroups.

Hmm, so for this, you are asking for the cgroup logic to not directly use
DRM data structures?  Okay, that's why you suggest drivers/gpu/cgroups and
not drivers/gpu/drm/cgroups.  So this is your angle to make it 'fully
generic' then.

> 
>   On the naming bikeshed I think gpu is perfectly fine, just explain in
>   the docs that the G stands for "general" :-) Otherwise we might need to
>   rename drivers/gpu to drivers/xpu too, and that's maybe one bikeshed too
>   far. Plus, right now it really is the controller for gpu related memory,
>   even if we extend it to Android (where it would also include
>   video/camera allocations). Extending this cgroup controller to
>   accelerators in general is maybe a bit too much.
>  
> - The other disambiguation is how we account dma-buf (well, buffer based)
>   gpu allocations vs HMM gpu memory allocations, that might be worth
>   clarifying in the docs.
> 
> - Finally to accelerate this further, I think it'd be good to pull out the
>   cgroup spec for this more minimized series into patch 1, as a draft.
>   That way we could get all stakeholders to ack on that ack, so hopefully
>   we're building somet

Re: [Intel-gfx] [RFC PATCH 7/9] drmcg: Add initial support for tracking gpu time usage

2021-02-03 Thread Brian Welty


On 2/3/2021 5:25 AM, Joonas Lahtinen wrote:
> Quoting Brian Welty (2021-01-26 23:46:24)
>> A single control, below, is added to the DRM cgroup controller in order to track
>> user execution time for GPU devices.  It is up to device drivers to
>> charge execution time to the cgroup via drm_cgroup_try_charge().
>>
>>   sched.runtime
>>   Read-only value, displays current user execution time for each DRM
>>   device. The expectation is that this is incremented by DRM device
>>   driver's scheduler upon user context completion or context switch.
>>   Units of time are in microseconds for consistency with cpu.stats.
> 
> Were not we also planning for a percentage style budgeting?

Yes, that's right.  The above is to report accumulated time usage.
I can include controls for time sharing in the next submission,
but not using a percentage.
Relative time share can be implemented with weights, as described in the cgroups
documentation for resource distribution models.
This was also the prior feedback from Tejun [1], and so it will look very much
like the existing cpu.weight or io.weight.
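For example, following the weight semantics used by cpu.weight, two sibling
cgroups with gpu.weight values of 100 and 300 would, under contention, receive
roughly 25% and 75% of GPU time respectively (each weight divided by the sum of
the siblings' active weights), and an idle sibling's share would be redistributed
to the others.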

> 
> Capping the maximum runtime is definitely useful, but in order to
> configure a system for peaceful co-existence of two or more workloads we
> must also impose a limit on how big portion of the instantaneous
> capacity can be used.

Agreed.  This is also included with the CPU and IO controls (cpu.max and io.max),
so we should also plan to have the same.

-Brian

[1]  https://lists.freedesktop.org/archives/dri-devel/2020-April/262141.html

> 
> Regards, Joonas
> 
>> Signed-off-by: Brian Welty 
>> ---
>>  Documentation/admin-guide/cgroup-v2.rst |  9 +
>>  include/drm/drm_cgroup.h|  2 ++
>>  include/linux/cgroup_drm.h  |  2 ++
>>  kernel/cgroup/drm.c | 20 
>>  4 files changed, 33 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst 
>> b/Documentation/admin-guide/cgroup-v2.rst
>> index ccc25f03a898..f1d0f333a49e 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -2205,6 +2205,15 @@ thresholds are hit, this would then allow the DRM 
>> device driver to invoke
>>  some equivalent to OOM-killer or forced memory eviction for the device
>>  backed memory in order to attempt to free additional space.
>>  
>> +The below set of control files are for time accounting of DRM devices. Units
>> +of time are in microseconds.
>> +
>> +  sched.runtime
>> +Read-only value, displays current user execution time for each DRM
>> +device. The expectation is that this is incremented by DRM device
>> +driver's scheduler upon user context completion or context switch.
>> +
>> +
>>  Misc
>>  
>>  
>> diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
>> index 9ba0e372..315dab8a93b8 100644
>> --- a/include/drm/drm_cgroup.h
>> +++ b/include/drm/drm_cgroup.h
>> @@ -22,6 +22,7 @@ enum drmcg_res_type {
>> DRMCG_TYPE_MEM_CURRENT,
>> DRMCG_TYPE_MEM_MAX,
>> DRMCG_TYPE_MEM_TOTAL,
>> +   DRMCG_TYPE_SCHED_RUNTIME,
>> __DRMCG_TYPE_LAST,
>>  };
>>  
>> @@ -79,5 +80,6 @@ void drm_cgroup_uncharge(struct drmcg *drmcg,struct 
>> drm_device *dev,
>>  enum drmcg_res_type type, u64 usage)
>>  {
>>  }
>> +
>>  #endif /* CONFIG_CGROUP_DRM */
>>  #endif /* __DRM_CGROUP_H__ */
>> diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
>> index 3570636473cf..0fafa663321e 100644
>> --- a/include/linux/cgroup_drm.h
>> +++ b/include/linux/cgroup_drm.h
>> @@ -19,6 +19,8 @@
>>   */
>>  struct drmcg_device_resource {
>> struct page_counter memory;
>> +   seqlock_t sched_lock;
>> +   u64 exec_runtime;
>>  };
>>  
>>  /**
>> diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
>> index 08e75eb67593..64e9d0dbe8c8 100644
>> --- a/kernel/cgroup/drm.c
>> +++ b/kernel/cgroup/drm.c
>> @@ -81,6 +81,7 @@ static inline int init_drmcg_single(struct drmcg *drmcg, 
>> struct drm_device *dev)
>> /* set defaults here */
>> page_counter_init(&ddr->memory,
>>   parent_ddr ? &parent_ddr->memory : NULL);
>> +   seqlock_init(&ddr->sched_lock);
>> drmcg->dev_resources[minor] = ddr;
>>  
>> return 0;
>> @@ -287,6 +288,10 @@ static int drmcg_seq_show_fn(int id, void *ptr, void 
>> *data)
>> s

Re: [Intel-gfx] [RFC PATCH 0/9] cgroup support for GPU devices

2021-02-01 Thread Brian Welty

On 1/28/2021 7:00 PM, Xingyou Chen wrote:
> On 2021/1/27 5:46 AM, Brian Welty wrote:
> 
>> We'd like to revisit the proposal of a GPU cgroup controller for managing
>> GPU devices but with just a basic set of controls.  This series is based on 
>> the prior patch series from Kenny Ho [1].  We take Kenny's base patches
>> which implement the basic framework for the controller, but we propose an
>> alternate set of control files.  Here we've taken a subset of the controls
>> proposed in earlier discussion on ML here [2]. 
>>
>> This series proposes a set of device memory controls (gpu.memory.current,
>> gpu.memory.max, and gpu.memory.total) and accounting of GPU time usage
>> (gpu.sched.runtime).  GPU time sharing controls are left as future work.
>> These are implemented within the GPU controller along with integration/usage
>> of the device memory controls by the i915 device driver.
>>
>> As an accelerator or GPU device is similar in many respects to a CPU with
>> (or without) attached system memory, the basic principle here is try to
>> copy the semantics of existing controls from other controllers when possible
>> and where these controls serve the same underlying purpose.
>> For example, the memory.max and memory.current controls are based on
>> same controls from MEMCG controller.
> 
> It seems not to be DRM specific, or even GPU specific. Would we have a universal
> control group for any accelerator, GPGPU device etc., that holds shareable
> resources like device memory, compute utility, bandwidth, with an extra control
> file to select between devices (or vendors)?
> 
> e.g. /cgname.device that stores PCI BDF, or enum(intel, amdgpu, nvidia, ...),
> defaulting to none, meaning not enabled.
> 

Hi, thanks for the feedback.  Yes, I tend to agree.  I've asked about this in
earlier work; my suggestion is to name the controller something like 'XPU' to
be clear that these controls could apply to more than GPUs.

But at least for now, based on Tejun's reply [1], the feedback is to try and keep
this controller as small and focused as possible on just GPU.  At least until
we get some consensus on a set of controls for GPU.  But for this we need more
active input from the community.

-Brian

[1] https://lists.freedesktop.org/archives/dri-devel/2019-November/243167.html


[Intel-gfx] [RFC PATCH 9/9] drm/i915: Use memory cgroup for enforcing device memory limit

2021-01-26 Thread Brian Welty
To charge device memory allocations, we need to (1) identify the appropriate
cgroup to charge (currently decided at object creation time), and (2)
make the charging call at the time that memory pages are being allocated.

For (1), see the prior DRM patch, which associates the current task's cgroup with
GEM objects as they are created.  That cgroup will be charged/uncharged
for all paging activity against the GEM object.

For (2), we call drm_gem_object_charge_mem() in the .get_pages callback
for the GEM object type.  Uncharging is done in .put_pages when the
memory is marked such that it can be evicted.  The try_charge() call will
fail with -ENOMEM if the current memory allocation would exceed the cgroup
device memory maximum, allowing the driver to perform memory reclaim.
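The resulting call pattern in the .get_pages path is roughly the following
sketch (simplified from the diff below; allocate_backing_store() is only a
placeholder for the driver's actual buddy allocation):

  static int get_pages_sketch(struct drm_i915_gem_object *obj)
  {
          u64 charged = obj->base.size;
          int ret;

          /* Charge before allocating backing store; fails with -ENOMEM
           * if it would exceed the cgroup's device memory maximum. */
          ret = drm_gem_object_charge_mem(&obj->base, charged);
          if (ret)
                  return ret;

          ret = allocate_backing_store(obj);      /* placeholder */
          if (ret)
                  drm_gem_object_uncharge_mem(&obj->base, charged);
          return ret;
  }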

We also set the total device memory reported by the DRM cgroup by storing it
in drm_device.drmcg_props after initializing the LMEM memory regions.

FIXME: to release drm cgroup reference requires this additional patch:
  https://patchwork.freedesktop.org/patch/349029

Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 23 ++
 drivers/gpu/drm/i915/intel_memory_region.c | 13 ++--
 drivers/gpu/drm/i915/intel_memory_region.h |  2 +-
 4 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index ec28a6cde49b..9fbe91d4d2f1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -16,6 +16,7 @@
 #include "i915_gem_gtt.h"
 #include "i915_gem_ioctls.h"
 #include "i915_gem_object.h"
+#include "i915_gem_lmem.h"
 #include "i915_gem_mman.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c 
b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 3e3dad22a683..690b36b25984 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -12,11 +12,14 @@ void
 i915_gem_object_put_pages_buddy(struct drm_i915_gem_object *obj,
struct sg_table *pages)
 {
-   __intel_memory_region_put_pages_buddy(obj->mm.region, &obj->mm.blocks);
+   u64 freed;
 
+   freed = __intel_memory_region_put_pages_buddy(obj->mm.region,
+ &obj->mm.blocks);
obj->mm.dirty = false;
sg_free_table(pages);
kfree(pages);
+   drm_gem_object_uncharge_mem(&obj->base, freed);
 }
 
 int
@@ -25,7 +28,7 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object 
*obj)
const u64 max_segment = i915_sg_segment_size();
struct intel_memory_region *mem = obj->mm.region;
struct list_head *blocks = &obj->mm.blocks;
-   resource_size_t size = obj->base.size;
+   resource_size_t charged, size = obj->base.size;
resource_size_t prev_end;
struct i915_buddy_block *block;
unsigned int flags;
@@ -44,12 +47,22 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object 
*obj)
}
 
flags = I915_ALLOC_MIN_PAGE_SIZE;
-   if (obj->flags & I915_BO_ALLOC_CONTIGUOUS)
+   if (obj->flags & I915_BO_ALLOC_CONTIGUOUS) {
flags |= I915_ALLOC_CONTIGUOUS;
+   charged = roundup_pow_of_two(size);
+   } else {
+   charged = size;
+   }
+
+   ret = drm_gem_object_charge_mem(&obj->base, charged);
+   if (ret) {
+   DRM_DEBUG("DRMCG: charge_mem failed for %lld\n", charged);
+   goto err_free_sg;
+   }
 
ret = __intel_memory_region_get_pages_buddy(mem, size, flags, blocks);
if (ret)
-   goto err_free_sg;
+   goto err_uncharge;
 
GEM_BUG_ON(list_empty(blocks));
 
@@ -99,6 +112,8 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object 
*obj)
 
return 0;
 
+err_uncharge:
+   drm_gem_object_uncharge_mem(&obj->base, charged);
 err_free_sg:
sg_free_table(st);
kfree(st);
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c 
b/drivers/gpu/drm/i915/intel_memory_region.c
index 1bfcdd89b241..9b1edbf4361c 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -46,13 +46,18 @@ intel_memory_region_free_pages(struct intel_memory_region 
*mem,
return size;
 }
 
-void
+u64
 __intel_memory_region_put_pages_buddy(struct intel_memory_region *mem,
  struct list_head *blocks)
 {
+   u64 freed;
+
mutex_lock(&mem->mm_lock);
-   mem->avail += intel_memory_region_free_pages(mem, blocks);
+   freed = intel_memory_region_free_pages(mem, blocks);
+   mem->avail += freed;
mutex_unlock(&mem->mm_lock);
+
+   return freed;
 }
 
 void
@@ -241

[Intel-gfx] [RFC PATCH 8/9] drm/gem: Associate GEM objects with drm cgroup

2021-01-26 Thread Brian Welty
This patch adds tracking of which cgroup to make charges against for a
given GEM object.  We associate the current task's cgroup with GEM objects
as they are created.  First user of this is for charging DRM cgroup for
device memory allocations.  The intended behavior is for device drivers to
make the cgroup charging calls at the time that backing store is allocated
or deallocated for the object.

Exported functions are provided for charging memory allocations for a
GEM object to DRM cgroup. To aid in debugging, we store how many bytes
have been charged inside the GEM object.  Add helpers for setting and
clearing the object's associated cgroup which will check that charges are
not being leaked.
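In outline, the intended object lifecycle with these helpers looks like the
following sketch (the exact call sites are in the diff below):

  /* GEM core, at object init: take a reference on the task's cgroup */
  drm_gem_object_set_cgroup(obj, current);

  /* driver, when backing store is allocated (e.g. .get_pages) */
  ret = drm_gem_object_charge_mem(obj, size);

  /* driver, when backing store is freed or evicted (e.g. .put_pages) */
  drm_gem_object_uncharge_mem(obj, size);

  /* GEM core, at object release: WARNs if bytes are still charged */
  drm_gem_object_unset_cgroup(obj);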

For shared objects, this may make the charge against a cgroup that is
potentially not the same cgroup as the process using the memory.  Based
on the memory cgroup's discussion of "memory ownership", this seems
acceptable [1].

[1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt, "Memory Ownership"

Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/drm_gem.c | 89 +++
 include/drm/drm_gem.h | 17 
 2 files changed, 106 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index c2ce78c4edc3..a12da41eaafe 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -112,6 +113,89 @@ drm_gem_init(struct drm_device *dev)
return drmm_add_action(dev, drm_gem_init_release, NULL);
 }
 
+/**
+ * drm_gem_object_set_cgroup - associate GEM object with a cgroup
+ * @obj: GEM object which is being associated with a cgroup
+ * @task: task associated with process control group to use
+ *
+ * This will acquire a reference on cgroup and use for charging GEM
+ * memory allocations.
+ * This helper could be extended in future to migrate charges to another
+ * cgroup, print warning if this usage occurs.
+ */
+void drm_gem_object_set_cgroup(struct drm_gem_object *obj,
+  struct task_struct *task)
+{
+   /* if object has existing cgroup, we migrate the charge... */
+   if (obj->drmcg) {
+   pr_warn("DRM: need to migrate cgroup charge of %lld\n",
+   atomic64_read(&obj->drmcg_bytes_charged));
+   }
+   obj->drmcg = drmcg_get(task);
+}
+EXPORT_SYMBOL(drm_gem_object_set_cgroup);
+
+/**
+ * drm_gem_object_unset_cgroup - clear GEM object's associated cgroup
+ * @obj: GEM object
+ *
+ * This will release a reference on cgroup.
+ */
+void drm_gem_object_unset_cgroup(struct drm_gem_object *obj)
+{
+   WARN_ON(atomic64_read(&obj->drmcg_bytes_charged));
+   drmcg_put(obj->drmcg);
+}
+EXPORT_SYMBOL(drm_gem_object_unset_cgroup);
+
+/**
+ * drm_gem_object_charge_mem - try charging size bytes to DRM cgroup
+ * @obj: GEM object which is being charged
+ * @size: number of bytes to charge
+ *
+ * Try to charge @size bytes to GEM object's associated DRM cgroup.  This
+ * will fail if a successful charge would cause the current device memory
+ * usage to go above the cgroup's GPU memory maximum limit.
+ *
+ * Returns 0 on success.  Otherwise, an error code is returned.
+ */
+int drm_gem_object_charge_mem(struct drm_gem_object *obj, u64 size)
+{
+   int ret;
+
+   ret = drm_cgroup_try_charge(obj->drmcg, obj->dev,
+   DRMCG_TYPE_MEM_CURRENT, size);
+   if (!ret)
+   atomic64_add(size, &obj->drmcg_bytes_charged);
+   return ret;
+}
+EXPORT_SYMBOL(drm_gem_object_charge_mem);
+
+/**
+ * drm_gem_object_uncharge_mem - uncharge size bytes from DRM cgroup
+ * @obj: GEM object which is being uncharged
+ * @size: number of bytes to uncharge
+ *
+ * Uncharge @size bytes from the DRM cgroup associated with specified
+ * GEM object.
+ *
+ * Returns 0 on success.  Otherwise, an error code is returned.
+ */
+void drm_gem_object_uncharge_mem(struct drm_gem_object *obj, u64 size)
+{
+   u64 charged = atomic64_read(&obj->drmcg_bytes_charged);
+
+   if (WARN_ON(!charged))
+   return;
+   if (WARN_ON(size > charged))
+   size = charged;
+
+   atomic64_sub(size, &obj->drmcg_bytes_charged);
+   drm_cgroup_uncharge(obj->drmcg, obj->dev, DRMCG_TYPE_MEM_CURRENT,
+   size);
+}
+EXPORT_SYMBOL(drm_gem_object_uncharge_mem);
+
 /**
  * drm_gem_object_init - initialize an allocated shmem-backed GEM object
  * @dev: drm_device the object should be initialized for
@@ -156,6 +240,8 @@ void drm_gem_private_object_init(struct drm_device *dev,
obj->dev = dev;
obj->filp = NULL;
 
+   drm_gem_object_set_cgroup(obj, current);
+
kref_init(&obj->refcount);
obj->handle_count = 0;
obj->size = size;
@@ -950,6 +1036,9 @@ drm_gem_object_release(struct drm_gem_object *obj)
 
dma_resv_fini(&obj->_resv

[Intel-gfx] [RFC PATCH 7/9] drmcg: Add initial support for tracking gpu time usage

2021-01-26 Thread Brian Welty
A single control, below, is added to the DRM cgroup controller in order to track
user execution time for GPU devices.  It is up to device drivers to
charge execution time to the cgroup via drm_cgroup_try_charge().

  sched.runtime
  Read-only value, displays current user execution time for each DRM
  device. The expectation is that this is incremented by DRM device
  driver's scheduler upon user context completion or context switch.
  Units of time are in microseconds for consistency with cpu.stats.
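For example, a driver's scheduler would account completed runtime roughly as in
this sketch ('drmcg', 'dev' and 'runtime' stand for whatever the driver already
tracks for the context; the runtime measurement itself is driver-specific):

  /* On user context completion or context switch: the value is
   * accumulated into exec_runtime (via ktime_add) and reported in
   * microseconds through sched.runtime. */
  drm_cgroup_try_charge(drmcg, dev, DRMCG_TYPE_SCHED_RUNTIME, runtime);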

Signed-off-by: Brian Welty 
---
 Documentation/admin-guide/cgroup-v2.rst |  9 +
 include/drm/drm_cgroup.h|  2 ++
 include/linux/cgroup_drm.h  |  2 ++
 kernel/cgroup/drm.c | 20 
 4 files changed, 33 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst 
b/Documentation/admin-guide/cgroup-v2.rst
index ccc25f03a898..f1d0f333a49e 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2205,6 +2205,15 @@ thresholds are hit, this would then allow the DRM device 
driver to invoke
 some equivalent to OOM-killer or forced memory eviction for the device
 backed memory in order to attempt to free additional space.
 
+The below set of control files are for time accounting of DRM devices. Units
+of time are in microseconds.
+
+  sched.runtime
+Read-only value, displays current user execution time for each DRM
+device. The expectation is that this is incremented by DRM device
+driver's scheduler upon user context completion or context switch.
+
+
 Misc
 
 
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index 9ba0e372..315dab8a93b8 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -22,6 +22,7 @@ enum drmcg_res_type {
DRMCG_TYPE_MEM_CURRENT,
DRMCG_TYPE_MEM_MAX,
DRMCG_TYPE_MEM_TOTAL,
+   DRMCG_TYPE_SCHED_RUNTIME,
__DRMCG_TYPE_LAST,
 };
 
@@ -79,5 +80,6 @@ void drm_cgroup_uncharge(struct drmcg *drmcg,struct 
drm_device *dev,
 enum drmcg_res_type type, u64 usage)
 {
 }
+
 #endif /* CONFIG_CGROUP_DRM */
 #endif /* __DRM_CGROUP_H__ */
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 3570636473cf..0fafa663321e 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -19,6 +19,8 @@
  */
 struct drmcg_device_resource {
struct page_counter memory;
+   seqlock_t sched_lock;
+   u64 exec_runtime;
 };
 
 /**
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index 08e75eb67593..64e9d0dbe8c8 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -81,6 +81,7 @@ static inline int init_drmcg_single(struct drmcg *drmcg, 
struct drm_device *dev)
/* set defaults here */
page_counter_init(&ddr->memory,
  parent_ddr ? &parent_ddr->memory : NULL);
+   seqlock_init(&ddr->sched_lock);
drmcg->dev_resources[minor] = ddr;
 
return 0;
@@ -287,6 +288,10 @@ static int drmcg_seq_show_fn(int id, void *ptr, void *data)
seq_printf(sf, "%d:%d %llu\n", DRM_MAJOR, minor->index,
   minor->dev->drmcg_props.memory_total);
break;
+   case DRMCG_TYPE_SCHED_RUNTIME:
+   seq_printf(sf, "%d:%d %llu\n", DRM_MAJOR, minor->index,
+  ktime_to_us(ddr->exec_runtime));
+   break;
default:
seq_printf(sf, "%d:%d\n", DRM_MAJOR, minor->index);
break;
@@ -384,6 +389,12 @@ struct cftype files[] = {
.private = DRMCG_TYPE_MEM_TOTAL,
.flags = CFTYPE_ONLY_ON_ROOT,
},
+   {
+   .name = "sched.runtime",
+   .seq_show = drmcg_seq_show,
+   .private = DRMCG_TYPE_SCHED_RUNTIME,
+   .flags = CFTYPE_NOT_ON_ROOT,
+   },
{ } /* terminate */
 };
 
@@ -440,6 +451,10 @@ EXPORT_SYMBOL(drmcg_device_early_init);
  * choose to enact some form of memory reclaim, but the exact behavior is left
  * to the DRM device driver to define.
  *
+ * For @res type of DRMCG_TYPE_SCHED_RUNTIME:
+ * For GPU time accounting, add @usage amount of GPU time to @drmcg for
+ * the given device.
+ *
  * Returns 0 on success.  Otherwise, an error code is returned.
  */
 int drm_cgroup_try_charge(struct drmcg *drmcg, struct drm_device *dev,
@@ -466,6 +481,11 @@ int drm_cgroup_try_charge(struct drmcg *drmcg, struct 
drm_device *dev,
err = 0;
}
break;
+   case DRMCG_TYPE_SCHED_RUNTIME:
+   write_seqlock(&res->sched_lock);
+   res->exec_runtime = ktime_add(res->exec_runtime, usage);
+   write_sequnlock(&res->sched_lock);
+   break;
default:
err = -EINVAL;
break;
-- 
2.20.1

[Intel-gfx] [RFC PATCH 6/9] drmcg: Add memory.total file

2021-01-26 Thread Brian Welty
The following control is introduced in order to display the total memory that
exists for the DRM device. DRM drivers can advertise this value by
writing to drm_device.drmcg_props. This is needed in order to effectively
use the other memory controls.
Normally, for system memory, this is available to the user via procfs.

   memory.total
  Read-only value, displays total memory for a device, shown
  only in the root cgroup.
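For illustration, a driver would advertise the value once during device setup,
roughly as in this sketch (total_device_memory is only a placeholder for however
the driver sizes its device-local memory):

  /* e.g. after the driver's memory regions are initialized */
  dev->drmcg_props.memory_total = total_device_memory;   /* bytes */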

Signed-off-by: Brian Welty 
---
 Documentation/admin-guide/cgroup-v2.rst |  4 
 include/drm/drm_cgroup.h|  2 ++
 kernel/cgroup/drm.c | 10 ++
 3 files changed, 16 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst 
b/Documentation/admin-guide/cgroup-v2.rst
index 405912710b3a..ccc25f03a898 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2189,6 +2189,10 @@ MEM controller.  All memory amounts are in bytes.
 DRM device memory.  If memory usage reaches this limit,
 subsequent device memory allocations will fail.
 
+  memory.total
+Read-only value, displays total memory for a device, shown only in
+root cgroup.
+
While some DRM devices may be capable of presenting multiple memory segments
 to the user, the intent with above controls is to aggregate all user
 allocatable backing store.  Any support for multiple memory segments is
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index 8b4c4e798b11..9ba0e372 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -15,11 +15,13 @@ struct drmcg;
  * of storing per device defaults
  */
 struct drmcg_props {
+   u64 memory_total;
 };
 
 enum drmcg_res_type {
DRMCG_TYPE_MEM_CURRENT,
DRMCG_TYPE_MEM_MAX,
+   DRMCG_TYPE_MEM_TOTAL,
__DRMCG_TYPE_LAST,
 };
 
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index bec41f343208..08e75eb67593 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -283,6 +283,10 @@ static int drmcg_seq_show_fn(int id, void *ptr, void *data)
else
seq_printf(sf, "%ld\n", ddr->memory.max * PAGE_SIZE);
break;
+   case DRMCG_TYPE_MEM_TOTAL:
+   seq_printf(sf, "%d:%d %llu\n", DRM_MAJOR, minor->index,
+  minor->dev->drmcg_props.memory_total);
+   break;
default:
seq_printf(sf, "%d:%d\n", DRM_MAJOR, minor->index);
break;
@@ -374,6 +378,12 @@ struct cftype files[] = {
.private = DRMCG_TYPE_MEM_MAX,
.flags = CFTYPE_NOT_ON_ROOT
},
+   {
+   .name = "memory.total",
+   .seq_show = drmcg_seq_show,
+   .private = DRMCG_TYPE_MEM_TOTAL,
+   .flags = CFTYPE_ONLY_ON_ROOT,
+   },
{ } /* terminate */
 };
 
-- 
2.20.1



[Intel-gfx] [RFC PATCH 5/9] drmcg: Add support for device memory accounting via page counter

2021-01-26 Thread Brian Welty
Here we introduce a general purpose drm_cgroup_try_charge and uncharge
pair of functions. This is modelled after the existing RDMA cgroup controller,
and the idea is for these functions to be used for charging/uncharging all
current and future DRM resource controls.

Two new controls are added in order to allow DRM device drivers to expose
memories attached to the device for purposes of memory accounting and
enforcing allocation limit.

The following controls are introduced and are intended to provide behavior and
functionality equivalent, where possible, to the MEM controller:

  memory.current
  Read-only value, displays current device memory usage for all DRM
  device memory in terms of allocated physical backing store.

  memory.max
  Read-write value, displays the maximum memory usage limit of device
  memory. If memory usage reaches this limit, then subsequent memory
  allocations will fail.

The expectation is that the DRM device driver simply lets allocations fail
when memory.max is reached.  Future work might be to introduce the device
memory concept of memory.high or memory.low controls, such that the DRM device
might be able to invoke memory eviction when these lower thresholds are hit.

With the provided charging functions, support for memory accounting/charging
is functional.  The intent is for DRM device drivers to charge against
memory.current as device memory is physically allocated.   Implementation is
simplified by making use of page counters underneath.  Nested cgroups will
correctly maintain the parent for the memory page counters, such that
hierarchical charging to the parent's memory.current is supported.

Note, this is only for tracking of allocations from device-backed memory.
Memory charging for objects that are backed by system memory is already
handled by the mm subsystem and existing memory accounting controls.
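For illustration, the intended driver usage pairs the calls around physical
allocation of device memory, roughly as in this sketch ('drmcg' and 'dev' stand
for the object's associated cgroup and DRM device):

  /* charge when device-backed memory is physically allocated */
  ret = drm_cgroup_try_charge(drmcg, dev, DRMCG_TYPE_MEM_CURRENT, size);
  if (ret)
          return ret;     /* -ENOMEM: would exceed this cgroup's memory.max */

  /* ... backing store in use ... */

  /* uncharge when it is freed or evicted */
  drm_cgroup_uncharge(drmcg, dev, DRMCG_TYPE_MEM_CURRENT, size);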

Signed-off-by: Brian Welty 
---
 Documentation/admin-guide/cgroup-v2.rst |  29 -
 include/drm/drm_cgroup.h|  20 
 include/linux/cgroup_drm.h  |   4 +-
 kernel/cgroup/drm.c | 145 +++-
 4 files changed, 191 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst 
b/Documentation/admin-guide/cgroup-v2.rst
index b099e1d71098..405912710b3a 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2171,8 +2171,35 @@ of GPU-related resources.
 GPU Interface Files
 
 
-TODO
+The GPU controller supports usage of multiple DRM devices within a cgroup.
+As such, for all interface files, output is keyed by device major:minor
+and is not ordered.
 
+The first set of control files are for regulating the accounting and
+allocation of memories attached to DRM devices.  The controls are intended
+to provide equivalent behavior and functionality where possible as the
+MEM controller.  All memory amounts are in bytes.
+
+  memory.current
+Read-only value, displays current device memory usage for the DRM
+device memory in terms of allocated physical backing store.
+
+  memory.max
+Read-write value, displays the maximum memory usage limit for the
+DRM device memory.  If memory usage reaches this limit,
+subsequent device memory allocations will fail.
+
+While some DRM devices may be capable of presenting multiple memory segments
+to the user, the intent with above controls is to aggregate all user
+allocatable backing store.  Any support for multiple memory segments is
+left as future work.
+
+The expectation is that the DRM device driver simply lets allocations fail
+when memory.max is reached.  Future work might be to introduce the device
+memory concept of memory.high or memory.low controls.  When these lower
+thresholds are hit, this would then allow the DRM device driver to invoke
+some equivalent to OOM-killer or forced memory eviction for the device
+backed memory in order to attempt to free additional space.
 
 Misc
 
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index 12526def9fbf..8b4c4e798b11 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -8,6 +8,7 @@
 #include 
 
 struct drm_device;
+struct drmcg;
 
 /**
  * Per DRM device properties for DRM cgroup controller for the purpose
@@ -17,6 +18,8 @@ struct drmcg_props {
 };
 
 enum drmcg_res_type {
+   DRMCG_TYPE_MEM_CURRENT,
+   DRMCG_TYPE_MEM_MAX,
__DRMCG_TYPE_LAST,
 };
 
@@ -33,6 +36,11 @@ void drmcg_unregister_dev(struct drm_device *dev);
 
 void drmcg_device_early_init(struct drm_device *device);
 
+int drm_cgroup_try_charge(struct drmcg *drmcg, struct drm_device *dev,
+ enum drmcg_res_type type, u64 usage);
+void drm_cgroup_uncharge(struct drmcg *drmcg, struct drm_device *dev,
+enum drmcg_res_type type, u64 usage);
+
 #else
 
 static inline void drmcg_bind(
@@ -57,5 +65,17 @@ static inline void drmcg_device_early_init

[Intel-gfx] [RFC PATCH 3/9] drm, cgroup: Initialize drmcg properties

2021-01-26 Thread Brian Welty
From: Kenny Ho 

drmcg initialization involves allocating a per cgroup, per device data
structure and setting the defaults.  There are two entry points for
drmcg init:

1) When struct drmcg is created via css_alloc, initialization is done
  for each device

2) When DRM devices are created after drmcgs are created, per
  device drmcg data structure is allocated at the beginning of
  DRM device creation such that drmcg can begin tracking usage
  statistics

Entry point #2 usually applies to the root cgroup since it can be
created before DRM devices are available.  The drmcg controller will go
through all existing drm cgroups and initialize them with the new device
accordingly.

Extending Kenny's original work, this has been simplified somewhat and
the custom_init callback has been removed. (Brian)

Signed-off-by: Kenny Ho 
Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/drm_drv.c  |  3 ++
 include/drm/drm_cgroup.h   | 17 +++
 include/drm/drm_device.h   |  7 +++
 include/linux/cgroup_drm.h | 13 ++
 kernel/cgroup/drm.c| 95 +-
 5 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 3b940926d672..dac742445b38 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -570,6 +570,7 @@ static void drm_dev_init_release(struct drm_device *dev, 
void *res)
/* Prevent use-after-free in drm_managed_release when debugging is
 * enabled. Slightly awkward, but can't really be helped. */
dev->dev = NULL;
+   mutex_destroy(&dev->drmcg_mutex);
mutex_destroy(&dev->master_mutex);
mutex_destroy(&dev->clientlist_mutex);
mutex_destroy(&dev->filelist_mutex);
@@ -612,6 +613,7 @@ static int drm_dev_init(struct drm_device *dev,
mutex_init(&dev->filelist_mutex);
mutex_init(&dev->clientlist_mutex);
mutex_init(&dev->master_mutex);
+   mutex_init(&dev->drmcg_mutex);
 
ret = drmm_add_action(dev, drm_dev_init_release, NULL);
if (ret)
@@ -652,6 +654,7 @@ static int drm_dev_init(struct drm_device *dev,
if (ret)
goto err;
 
+   drmcg_device_early_init(dev);
return 0;
 
 err:
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index 530c9a0b3238..43caf1b6a0de 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -4,6 +4,17 @@
 #ifndef __DRM_CGROUP_H__
 #define __DRM_CGROUP_H__
 
+#include 
+
+struct drm_device;
+
+/**
+ * Per DRM device properties for DRM cgroup controller for the purpose
+ * of storing per device defaults
+ */
+struct drmcg_props {
+};
+
 #ifdef CONFIG_CGROUP_DRM
 
 void drmcg_bind(struct drm_minor (*(*acq_dm)(unsigned int minor_id)),
@@ -15,6 +26,8 @@ void drmcg_register_dev(struct drm_device *dev);
 
 void drmcg_unregister_dev(struct drm_device *dev);
 
+void drmcg_device_early_init(struct drm_device *device);
+
 #else
 
 static inline void drmcg_bind(
@@ -35,5 +48,9 @@ static inline void drmcg_unregister_dev(struct drm_device 
*dev)
 {
 }
 
+static inline void drmcg_device_early_init(struct drm_device *device)
+{
+}
+
 #endif /* CONFIG_CGROUP_DRM */
 #endif /* __DRM_CGROUP_H__ */
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index d647223e8390..1cdccc9a653c 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 
 struct drm_driver;
 struct drm_minor;
@@ -318,6 +319,12 @@ struct drm_device {
 */
struct drm_fb_helper *fb_helper;
 
+/** \name DRM Cgroup */
+   /*@{ */
+   struct mutex drmcg_mutex;
+   struct drmcg_props drmcg_props;
+   /*@} */
+
/* Everything below here is for legacy driver, never use! */
/* private: */
 #if IS_ENABLED(CONFIG_DRM_LEGACY)
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 307bb75db248..50f055804400 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -12,11 +12,19 @@
 
 #ifdef CONFIG_CGROUP_DRM
 
+/**
+ * Per DRM cgroup, per device resources (such as statistics and limits)
+ */
+struct drmcg_device_resource {
+   /* for per device stats */
+};
+
 /**
  * The DRM cgroup controller data structure.
  */
 struct drmcg {
struct cgroup_subsys_state  css;
+   struct drmcg_device_resource*dev_resources[MAX_DRM_DEV];
 };
 
 /**
@@ -40,6 +48,8 @@ static inline struct drmcg *css_to_drmcg(struct 
cgroup_subsys_state *css)
  */
 static inline struct drmcg *drmcg_get(struct task_struct *task)
 {
+   if (!cgroup_subsys_enabled(gpu_cgrp_subsys))
+   return NULL;
return css_to_drmcg(task_get_css(task, gpu_cgrp_id));
 }
 
@@ -70,6 +80,9 @@ static inline struct drmcg *drmcg_parent(struct drmcg *cg)
 
 #else /* CONFIG_CGROUP_DRM */
 
+struct drmcg_device_resource {
+};
+
 struct drmcg {
 };
 
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index 061bb9c458e4..836929c27de8 100644
--- a/kerne

[Intel-gfx] [RFC PATCH 4/9] drmcg: Add skeleton seq_show and write for drmcg files

2021-01-26 Thread Brian Welty
Add basic .seq_show and .write functions for use with DRM cgroup control
files.  This is based on original work from Kenny Ho and extracted from
patches [1] and [2].  It has been simplified to remove having different file
types and functions for each.

[1] https://lists.freedesktop.org/archives/dri-devel/2020-February/254986.html
[2] https://lists.freedesktop.org/archives/dri-devel/2020-February/254990.html
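For illustration, a control file added on top of this skeleton is wired up
roughly as below (a sketch; the concrete entries are added by later patches in
this series):

  /* example entry for the files[] table, reusing the common handlers */
  {
          .name = "memory.max",
          .seq_show = drmcg_seq_show,
          .write = drmcg_write,
          .private = DRMCG_TYPE_MEM_MAX,
          .flags = CFTYPE_NOT_ON_ROOT,
  },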

Co-developed-by: Kenny Ho 
Signed-off-by: Brian Welty 
---
 include/drm/drm_cgroup.h |   5 ++
 kernel/cgroup/drm.c  | 102 +++
 2 files changed, 107 insertions(+)

diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
index 43caf1b6a0de..12526def9fbf 100644
--- a/include/drm/drm_cgroup.h
+++ b/include/drm/drm_cgroup.h
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: MIT
  * Copyright 2019 Advanced Micro Devices, Inc.
+ * Copyright © 2021 Intel Corporation
  */
 #ifndef __DRM_CGROUP_H__
 #define __DRM_CGROUP_H__
@@ -15,6 +16,10 @@ struct drm_device;
 struct drmcg_props {
 };
 
+enum drmcg_res_type {
+   __DRMCG_TYPE_LAST,
+};
+
 #ifdef CONFIG_CGROUP_DRM
 
 void drmcg_bind(struct drm_minor (*(*acq_dm)(unsigned int minor_id)),
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index 836929c27de8..aece11faa0bc 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -5,6 +5,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -12,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static struct drmcg *root_drmcg __read_mostly;
 
@@ -222,6 +224,106 @@ drmcg_css_alloc(struct cgroup_subsys_state *parent_css)
return &drmcg->css;
 }
 
+static int drmcg_apply_value(struct drmcg_device_resource *ddr,
+enum drmcg_res_type type, char *buf)
+{
+   int ret = 0;
+   unsigned long val;
+
+   switch (type) {
+   default:
+   break;
+   }
+
+   return ret;
+}
+
+static int drmcg_seq_show_fn(int id, void *ptr, void *data)
+{
+   struct drm_minor *minor = ptr;
+   struct seq_file *sf = data;
+   struct drmcg *drmcg = css_to_drmcg(seq_css(sf));
+   enum drmcg_res_type type = seq_cft(sf)->private;
+   struct drmcg_device_resource *ddr;
+
+   if (minor->type != DRM_MINOR_PRIMARY)
+   return 0;
+
+   ddr = drmcg->dev_resources[minor->index];
+   if (ddr == NULL)
+   return 0;
+
+   seq_printf(sf, "%d:%d ", DRM_MAJOR, minor->index);
+   switch (type) {
+   default:
+   seq_puts(sf, "\n");
+   break;
+   }
+
+   return 0;
+}
+
+static int drmcg_seq_show(struct seq_file *sf, void *v)
+{
+   return drm_minor_for_each(&drmcg_seq_show_fn, sf);
+}
+
+static ssize_t drmcg_write(struct kernfs_open_file *of, char *buf,
+  size_t nbytes, loff_t off)
+{
+   struct drmcg *drmcg = css_to_drmcg(of_css(of));
+   enum drmcg_res_type type = of_cft(of)->private;
+   char *cft_name = of_cft(of)->name;
+   char *limits = strstrip(buf);
+   struct drmcg_device_resource *ddr;
+   struct drm_minor *dm;
+   char *line;
+   char sattr[256];
+   int minor, ret = 0;
+
+   while (!ret && limits != NULL) {
+   line =  strsep(&limits, "\n");
+
+   if (sscanf(line,
+   __stringify(DRM_MAJOR)":%u %255[^\t\n]",
+   &minor, sattr) != 2) {
+   pr_err("drmcg: error parsing %s ", cft_name);
+   pr_cont_cgroup_name(drmcg->css.cgroup);
+   pr_cont("\n");
+
+   continue;
+   }
+
+   mutex_lock(&drmcg_mutex);
+   if (acquire_drm_minor)
+   dm = acquire_drm_minor(minor);
+   else
+   dm = NULL;
+   mutex_unlock(&drmcg_mutex);
+
+   if (IS_ERR_OR_NULL(dm)) {
+   pr_err("drmcg: invalid minor %d for %s ",
+   minor, cft_name);
+   pr_cont_cgroup_name(drmcg->css.cgroup);
+   pr_cont("\n");
+
+   continue;
+   }
+
+   mutex_lock(&dm->dev->drmcg_mutex);
+   ddr = drmcg->dev_resources[minor];
+   ret = drmcg_apply_value(ddr, type, sattr);
+   mutex_unlock(&dm->dev->drmcg_mutex);
+
+   mutex_lock(&drmcg_mutex);
+   if (put_drm_dev)
+   put_drm_dev(dm->dev); /* release from acquire_drm_minor */
+   mutex_unlock(&drmcg_mutex);
+   }
+
+   return ret ?: nbytes;
+}
+
 struct cftype files[] = {
{ } /* terminate */
 };
-- 
2.20.1



[Intel-gfx] [RFC PATCH 2/9] drm, cgroup: Bind drm and cgroup subsystem

2021-01-26 Thread Brian Welty
From: Kenny Ho 

Since the drm subsystem can be compiled as a module and drm devices can
be added and removed during run time, add several functions to bind the
drm subsystem as well as drm devices with drmcg.

Two pairs of functions:
drmcg_bind/drmcg_unbind - used to bind/unbind the drm subsystem to the
cgroup subsystem as the drm core initializes/exits.

drmcg_register_dev/drmcg_unregister_dev - used to register/unregister
drm devices with the cgroup subsystem as the devices are presented to /
removed from userspace.
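For illustration, the controller side then uses the bound callbacks roughly as
in this sketch (compare the write handler added later in the series; 'minor_id'
is a placeholder for the parsed device minor):

  struct drm_minor *dm;

  mutex_lock(&drmcg_mutex);
  dm = acquire_drm_minor ? acquire_drm_minor(minor_id) : NULL;
  mutex_unlock(&drmcg_mutex);

  if (!IS_ERR_OR_NULL(dm)) {
          /* ... operate on dm->dev ... */
          mutex_lock(&drmcg_mutex);
          if (put_drm_dev)
                  put_drm_dev(dm->dev);   /* drop ref from acquire */
          mutex_unlock(&drmcg_mutex);
  }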

Signed-off-by: Kenny Ho 
---
 drivers/gpu/drm/drm_drv.c  |   8 +++
 include/drm/drm_cgroup.h   |  39 +++
 include/linux/cgroup_drm.h |   4 ++
 kernel/cgroup/drm.c| 131 +
 4 files changed, 182 insertions(+)
 create mode 100644 include/drm/drm_cgroup.h

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 20d22e41d7ce..3b940926d672 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "drm_crtc_internal.h"
 #include "drm_internal.h"
@@ -893,6 +894,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long 
flags)
if (drm_core_check_feature(dev, DRIVER_MODESET))
drm_modeset_register_all(dev);
 
+   drmcg_register_dev(dev);
+
DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
 driver->name, driver->major, driver->minor,
 driver->patchlevel, driver->date,
@@ -928,6 +931,8 @@ EXPORT_SYMBOL(drm_dev_register);
  */
 void drm_dev_unregister(struct drm_device *dev)
 {
+   drmcg_unregister_dev(dev);
+
if (drm_core_check_feature(dev, DRIVER_LEGACY))
drm_lastclose(dev);
 
@@ -1030,6 +1035,7 @@ static const struct file_operations drm_stub_fops = {
 
 static void drm_core_exit(void)
 {
+   drmcg_unbind();
unregister_chrdev(DRM_MAJOR, "drm");
debugfs_remove(drm_debugfs_root);
drm_sysfs_destroy();
@@ -1056,6 +1062,8 @@ static int __init drm_core_init(void)
if (ret < 0)
goto error;
 
+   drmcg_bind(&drm_minor_acquire, &drm_dev_put);
+
drm_core_init_complete = true;
 
DRM_DEBUG("Initialized\n");
diff --git a/include/drm/drm_cgroup.h b/include/drm/drm_cgroup.h
new file mode 100644
index ..530c9a0b3238
--- /dev/null
+++ b/include/drm/drm_cgroup.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+#ifndef __DRM_CGROUP_H__
+#define __DRM_CGROUP_H__
+
+#ifdef CONFIG_CGROUP_DRM
+
+void drmcg_bind(struct drm_minor (*(*acq_dm)(unsigned int minor_id)),
+   void (*put_ddev)(struct drm_device *dev));
+
+void drmcg_unbind(void);
+
+void drmcg_register_dev(struct drm_device *dev);
+
+void drmcg_unregister_dev(struct drm_device *dev);
+
+#else
+
+static inline void drmcg_bind(
+   struct drm_minor (*(*acq_dm)(unsigned int minor_id)),
+   void (*put_ddev)(struct drm_device *dev))
+{
+}
+
+static inline void drmcg_unbind(void)
+{
+}
+
+static inline void drmcg_register_dev(struct drm_device *dev)
+{
+}
+
+static inline void drmcg_unregister_dev(struct drm_device *dev)
+{
+}
+
+#endif /* CONFIG_CGROUP_DRM */
+#endif /* __DRM_CGROUP_H__ */
diff --git a/include/linux/cgroup_drm.h b/include/linux/cgroup_drm.h
index 345af54a5d41..307bb75db248 100644
--- a/include/linux/cgroup_drm.h
+++ b/include/linux/cgroup_drm.h
@@ -5,6 +5,10 @@
 #define _CGROUP_DRM_H
 
 #include 
+#include 
+
+/* limit defined per the way drm_minor_alloc operates */
+#define MAX_DRM_DEV (64 * DRM_MINOR_RENDER)
 
 #ifdef CONFIG_CGROUP_DRM
 
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
index 5e38a8230922..061bb9c458e4 100644
--- a/kernel/cgroup/drm.c
+++ b/kernel/cgroup/drm.c
@@ -1,11 +1,142 @@
 // SPDX-License-Identifier: MIT
 // Copyright 2019 Advanced Micro Devices, Inc.
+#include 
+#include 
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 static struct drmcg *root_drmcg __read_mostly;
 
+/* global mutex for drmcg across all devices */
+static DEFINE_MUTEX(drmcg_mutex);
+
+static DECLARE_BITMAP(known_devs, MAX_DRM_DEV);
+
+static struct drm_minor (*(*acquire_drm_minor)(unsigned int minor_id));
+
+static void (*put_drm_dev)(struct drm_device *dev);
+
+/**
+ * drmcg_bind - Bind DRM subsystem to cgroup subsystem
+ * @acq_dm: function pointer to the drm_minor_acquire function
+ * @put_ddev: function pointer to the drm_dev_put function
+ *
+ * This function binds some functions from the DRM subsystem and makes
+ * them available to the drmcg subsystem.
+ *
+ * drmcg_unbind does the opposite of this function
+ */
+void drmcg_bind(struct drm_minor (*(*acq_dm)(unsigned int minor_id)),
+   void (*put_ddev)(struct drm_device *dev))
+{
+   mutex_lock(&drmcg_mutex);
+   acquire_drm_minor = acq_dm;
+   put_drm_dev = put_ddev;
+   mutex_unlock(&drmcg_mutex);
+}
+EXPORT_SYMBOL(drmcg_bind);
+
+/**
+ * 

[Intel-gfx] [RFC PATCH 1/9] cgroup: Introduce cgroup for drm subsystem

2021-01-26 Thread Brian Welty
From: Kenny Ho 

With the increased importance of machine learning, data science and
other cloud-based applications, GPUs are already in production use in
data centers today.  Existing GPU resource management is very coarse-grained,
however, as sysadmins are only able to distribute workload on a
per-GPU basis.  An alternative is to use GPU virtualization (with or
without SRIOV) but it generally acts on the entire GPU instead of the
specific resources in a GPU.  With a drm cgroup controller, we can
enable alternate, fine-grained, sub-GPU resource management (in addition
to what may be available via GPU virtualization.)

Signed-off-by: Kenny Ho 
---
 Documentation/admin-guide/cgroup-v2.rst | 18 -
 Documentation/cgroup-v1/drm.rst |  1 +
 include/linux/cgroup_drm.h  | 92 +
 include/linux/cgroup_subsys.h   |  4 ++
 init/Kconfig|  5 ++
 kernel/cgroup/Makefile  |  1 +
 kernel/cgroup/drm.c | 42 +++
 7 files changed, 161 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/cgroup-v1/drm.rst
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

diff --git a/Documentation/admin-guide/cgroup-v2.rst 
b/Documentation/admin-guide/cgroup-v2.rst
index 63521cd36ce5..b099e1d71098 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -63,8 +63,10 @@ v1 is available under 
:ref:`Documentation/admin-guide/cgroup-v1/index.rst 
+
+#ifdef CONFIG_CGROUP_DRM
+
+/**
+ * The DRM cgroup controller data structure.
+ */
+struct drmcg {
+   struct cgroup_subsys_state  css;
+};
+
+/**
+ * css_to_drmcg - get the corresponding drmcg ref from a cgroup_subsys_state
+ * @css: the target cgroup_subsys_state
+ *
+ * Return: DRM cgroup that contains the @css
+ */
+static inline struct drmcg *css_to_drmcg(struct cgroup_subsys_state *css)
+{
+   return css ? container_of(css, struct drmcg, css) : NULL;
+}
+
+/**
+ * drmcg_get - get the drmcg reference that a task belongs to
+ * @task: the target task
+ *
+ * This increases the reference count of the css that the @task belongs to
+ *
+ * Return: reference to the DRM cgroup the task belongs to
+ */
+static inline struct drmcg *drmcg_get(struct task_struct *task)
+{
+   return css_to_drmcg(task_get_css(task, gpu_cgrp_id));
+}
+
+/**
+ * drmcg_put - put a drmcg reference
+ * @drmcg: the target drmcg
+ *
+ * Put a reference obtained via drmcg_get
+ */
+static inline void drmcg_put(struct drmcg *drmcg)
+{
+   if (drmcg)
+   css_put(&drmcg->css);
+}
+
+/**
+ * drmcg_parent - find the parent of a drm cgroup
+ * @cg: the target drmcg
+ *
+ * This does not increase the reference count of the parent cgroup
+ *
+ * Return: parent DRM cgroup of @cg
+ */
+static inline struct drmcg *drmcg_parent(struct drmcg *cg)
+{
+   return css_to_drmcg(cg->css.parent);
+}
+
+#else /* CONFIG_CGROUP_DRM */
+
+struct drmcg {
+};
+
+static inline struct drmcg *css_to_drmcg(struct cgroup_subsys_state *css)
+{
+   return NULL;
+}
+
+static inline struct drmcg *drmcg_get(struct task_struct *task)
+{
+   return NULL;
+}
+
+static inline void drmcg_put(struct drmcg *drmcg)
+{
+}
+
+static inline struct drmcg *drmcg_parent(struct drmcg *cg)
+{
+   return NULL;
+}
+
+#endif /* CONFIG_CGROUP_DRM */
+#endif /* _CGROUP_DRM_H */
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index acb77dcff3b4..f4e627942115 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -61,6 +61,10 @@ SUBSYS(pids)
 SUBSYS(rdma)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_DRM)
+SUBSYS(gpu)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..bee29f51e380 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1027,6 +1027,11 @@ config CGROUP_RDMA
  Attaching processes with active RDMA resources to the cgroup
  hierarchy is allowed even if can cross the hierarchy's limit.
 
+config CGROUP_DRM
+   bool "DRM controller (EXPERIMENTAL)"
+   help
+ Provides accounting and enforcement of resources in the DRM subsystem.
+
 config CGROUP_FREEZER
bool "Freezer controller"
help
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index 5d7a76bfbbb7..31f186f58121 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -4,5 +4,6 @@ obj-y := cgroup.o rstat.o namespace.o cgroup-v1.o freezer.o
 obj-$(CONFIG_CGROUP_FREEZER) += legacy_freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
+obj-$(CONFIG_CGROUP_DRM) += drm.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
diff --git a/kernel/cgroup/drm.c b/kernel/cgroup/drm.c
new file mode 100644
index ..5e38a8230922
--- /dev/null
+++ b/kernel/cgroup/drm.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: MIT
+// 

[Intel-gfx] [RFC PATCH 0/9] cgroup support for GPU devices

2021-01-26 Thread Brian Welty
We'd like to revisit the proposal of a GPU cgroup controller for managing
GPU devices but with just a basic set of controls.  This series is based on 
the prior patch series from Kenny Ho [1].  We take Kenny's base patches
which implement the basic framework for the controller, but we propose an
alternate set of control files.  Here we've taken a subset of the controls
proposed in earlier discussion on ML here [2]. 

This series proposes a set of device memory controls (gpu.memory.current,
gpu.memory.max, and gpu.memory.total) and accounting of GPU time usage
(gpu.sched.runtime).  GPU time sharing controls are left as future work.
These are implemented within the GPU controller along with integration/usage
of the device memory controls by the i915 device driver.
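
As a concrete illustration of how a sysadmin or container runtime might
consume these files, here is a minimal userspace sketch that sets a 256 MiB
device memory limit by writing gpu.memory.max.  The cgroup path and the exact
value format accepted by the file are assumptions for illustration only; the
series itself defines the final interface.

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		/* assumed cgroup path; gpu.memory.max is the control proposed above */
		int fd = open("/sys/fs/cgroup/gpu-jobs/gpu.memory.max", O_WRONLY);

		if (fd < 0)
			return 1;
		/* assumed value format: a plain byte count (256 MiB), as with memcg */
		if (write(fd, "268435456\n", 10) != 10) {
			close(fd);
			return 1;
		}
		close(fd);
		return 0;
	}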

As an accelerator or GPU device is similar in many respects to a CPU with
(or without) attached system memory, the basic principle here is to try to
copy the semantics of existing controls from other controllers when possible
and where these controls serve the same underlying purpose.
For example, the memory.max and memory.current controls are based on the
same controls from the MEMCG controller.

Following the implementation used by the existing RDMA controller,
here we introduce a general-purpose drm_cgroup_try_charge and uncharge
pair of exported functions.  These functions are to be used for
charging and uncharging all current and future DRM resource controls.
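
To make the intended call flow concrete, a driver-side sketch might look
roughly like the following; the exact signatures and the resource-type
argument are placeholders here, not the final exported API:

	/* hypothetical driver helper charging device memory to a cgroup */
	static int mydrv_charge_vram(struct drm_device *dev, struct drmcg *cg,
				     u64 size)
	{
		int ret;

		/* DRMCG_TYPE_MEM is a placeholder for the series' resource enum */
		ret = drm_cgroup_try_charge(cg, dev, DRMCG_TYPE_MEM, size);
		if (ret)
			return ret;	/* cgroup is over its gpu.memory.max */

		/* ... allocate backing store; on failure, undo the charge: */
		/* drm_cgroup_uncharge(cg, dev, DRMCG_TYPE_MEM, size); */

		return 0;
	}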

Patches 1 - 4 are part original work and part refactoring of the prior
work from Kenny Ho's series for a GPU / DRM controller v2 [1].

Patches 5 - 7 introduce new controls to the GPU / DRM controller for device
memory accounting and GPU time tracking.

Patch 8 introduces DRM support for associating GEM objects with a cgroup.

Patch 9 implements i915 changes to use cgroups for device memory charging
and enforcing device memory allocation limit.

[1] https://lists.freedesktop.org/archives/dri-devel/2020-February/257052.html
[2] https://lists.freedesktop.org/archives/dri-devel/2019-November/242599.html

Brian Welty (6):
  drmcg: Add skeleton seq_show and write for drmcg files
  drmcg: Add support for device memory accounting via page counter
  drmcg: Add memory.total file
  drmcg: Add initial support for tracking gpu time usage
  drm/gem: Associate GEM objects with drm cgroup
  drm/i915: Use memory cgroup for enforcing device memory limit

Kenny Ho (3):
  cgroup: Introduce cgroup for drm subsystem
  drm, cgroup: Bind drm and cgroup subsystem
  drm, cgroup: Initialize drmcg properties

 Documentation/admin-guide/cgroup-v2.rst|  58 ++-
 Documentation/cgroup-v1/drm.rst|   1 +
 drivers/gpu/drm/drm_drv.c  |  11 +
 drivers/gpu/drm/drm_gem.c  |  89 
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_region.c |  23 +-
 drivers/gpu/drm/i915/intel_memory_region.c |  13 +-
 drivers/gpu/drm/i915/intel_memory_region.h |   2 +-
 include/drm/drm_cgroup.h   |  85 
 include/drm/drm_device.h   |   7 +
 include/drm/drm_gem.h  |  17 +
 include/linux/cgroup_drm.h | 113 +
 include/linux/cgroup_subsys.h  |   4 +
 init/Kconfig   |   5 +
 kernel/cgroup/Makefile |   1 +
 kernel/cgroup/drm.c| 533 +
 16 files changed, 954 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/cgroup-v1/drm.rst
 create mode 100644 include/drm/drm_cgroup.h
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c

-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t] tests/i915/query: Do not assert engine info rsvd being zero

2020-03-04 Thread Brian Welty


On 3/4/2020 1:29 AM, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> These are not input fields and i915 currently leaves them untouched.
> 
> In the spirit of trusting the query as the authoritative source of
> information, stop asserting these output only reserved fields are zero.
> 
> This should prevent the test from auto-failing if we extend the data in
> the future.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Brian Welty 

Reviewed-by: Brian Welty 

> ---
>  tests/i915/i915_query.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/tests/i915/i915_query.c b/tests/i915/i915_query.c
> index 71807425fbc4..e7c6fc91e32b 100644
> --- a/tests/i915/i915_query.c
> +++ b/tests/i915/i915_query.c
> @@ -690,11 +690,6 @@ static void engines(int fd)
> engine->flags,
> engine->capabilities);
>  
> - /* MBZ fields. */
> - igt_assert_eq(engine->rsvd0, 0);
> - igt_assert_eq(engine->rsvd1[0], 0);
> - igt_assert_eq(engine->rsvd1[1], 0);
> -
>   switch (engine->engine.engine_class) {
>   case I915_ENGINE_CLASS_RENDER:
>   /* Will be tested later. */
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 10/12] drm/i915/gt: Refactor l3cc/mocs availability

2020-02-18 Thread Brian Welty


On 2/18/2020 8:21 AM, Chris Wilson wrote:
> On dgfx, we only use l3cc and not mocs, but we share the table of
> register definitions with Tigerlake (which includes the mocs). This
> confuses our selftest that verifies that the registers do contain the
> values in our tables after various events (idling, reset, activity etc).
> 
> When constructing the table of register definitions, also include the
> flags for which registers are valid so that information is computed
> centrally and available to all callers.
> 
> Signed-off-by: Chris Wilson 
> Cc: Brian Welty 
> Cc: Daniele Ceraolo Spurio 
> ---
>  drivers/gpu/drm/i915/gt/intel_mocs.c| 72 +
>  drivers/gpu/drm/i915/gt/selftest_mocs.c | 24 ++---
>  2 files changed, 67 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c 
> b/drivers/gpu/drm/i915/gt/intel_mocs.c
> index 0afc1eb3c20f..632e08a4592b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> @@ -280,9 +280,32 @@ static const struct drm_i915_mocs_entry icl_mocs_table[] 
> = {
>   GEN11_MOCS_ENTRIES
>  };
>  
> -static bool get_mocs_settings(const struct drm_i915_private *i915,
> -   struct drm_i915_mocs_table *table)
> +enum {
> + HAS_GLOBAL_MOCS = BIT(0),
> + HAS_ENGINE_MOCS = BIT(1),
> + HAS_RENDER_L3CC = BIT(2),
> +};
> +
> +static bool has_l3cc(const struct drm_i915_private *i915)
>  {
> + return true;
> +}
> +
> +static bool has_global_mocs(const struct drm_i915_private *i915)
> +{
> + return HAS_GLOBAL_MOCS_REGISTERS(i915);
> +}
> +
> +static bool has_mocs(const struct drm_i915_private *i915)
> +{
> + return !IS_DGFX(i915);
> +}
> +
> +static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
> +   struct drm_i915_mocs_table *table)
> +{
> + unsigned int flags;
> +
>   if (INTEL_GEN(i915) >= 12) {
>   table->size  = ARRAY_SIZE(tgl_mocs_table);
>   table->table = tgl_mocs_table;
> @@ -302,11 +325,11 @@ static bool get_mocs_settings(const struct 
> drm_i915_private *i915,
>   } else {
>   drm_WARN_ONCE(&i915->drm, INTEL_GEN(i915) >= 9,
> "Platform that should have a MOCS table does 
> not.\n");
> - return false;
> + return 0;
>   }
>  
>   if (GEM_DEBUG_WARN_ON(table->size > table->n_entries))
> - return false;
> + return 0;
>  
>   /* WaDisableSkipCaching:skl,bxt,kbl,glk */
>   if (IS_GEN(i915, 9)) {
> @@ -315,10 +338,20 @@ static bool get_mocs_settings(const struct 
> drm_i915_private *i915,
>   for (i = 0; i < table->size; i++)
>   if (GEM_DEBUG_WARN_ON(table->table[i].l3cc_value &
> (L3_ESC(1) | L3_SCC(0x7
> - return false;
> + return 0;
>   }
>  
> - return true;
> + flags = 0;
> + if (has_mocs(i915)) {
> + if (has_global_mocs(i915))
> + flags |= HAS_GLOBAL_MOCS;
> + else
> + flags |= HAS_ENGINE_MOCS;
> + }
> + if (has_l3cc(i915))
> + flags |= HAS_RENDER_L3CC;
> +
> + return flags;
>  }
>  
>  /*
> @@ -411,18 +444,20 @@ static void init_l3cc_table(struct intel_engine_cs 
> *engine,
>  void intel_mocs_init_engine(struct intel_engine_cs *engine)
>  {
>   struct drm_i915_mocs_table table;
> + unsigned int flags;
>  
>   /* Called under a blanket forcewake */
>   assert_forcewakes_active(engine->uncore, FORCEWAKE_ALL);
>  
> - if (!get_mocs_settings(engine->i915, &table))
> + flags = get_mocs_settings(engine->i915, &table);
> + if (!flags)
>   return;
>  
>   /* Platforms with global MOCS do not need per-engine initialization. */
> - if (!HAS_GLOBAL_MOCS_REGISTERS(engine->i915))
> + if (flags & HAS_ENGINE_MOCS)
>   init_mocs_table(engine, );
>  
> - if (engine->class == RENDER_CLASS)
> + if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
>   init_l3cc_table(engine, );
>  }
>  
> @@ -431,26 +466,17 @@ static u32 global_mocs_offset(void)
>   return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
>  }
>  
> -static void init_global_mocs(struct intel_gt *gt)
> +void intel_mocs_init(struct intel_gt *gt)
>  {
>   struct drm_i915_mocs_table table;
> + unsigned int flags;

Re: [Intel-gfx] [PATCH] drm/i915/selftests: Fix selftest_mocs for DGFX

2020-02-12 Thread Brian Welty


On 2/12/2020 4:34 PM, Chris Wilson wrote:
> Quoting Brian Welty (2020-02-13 00:14:18)
>> For DGFX devices, the MOCS control value is not initialized or used.
> 
> Then why is the table populated?
> -Chris
> 

The format has changed (been reduced?) for DGFX.
drm_i915_mocs_entry.l3cc_value is what is still initialized/used.
The patch that defines the table entries for DGFX is probably needed first.
Ugh, I didn't notice it hasn't been applied yet.  Let me ask about this.

-Brian

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/selftests: Fix selftest_mocs for DGFX

2020-02-12 Thread Brian Welty
For DGFX devices, the MOCS control value is not initialized or used.
Update the selftest to skip reading and checking control values
for these devices.

References: e6e2ac07118b ("drm/i915: do not set MOCS control values on dgfx")
Fixes: 3fb33cd32ffd ("drm/i915/selftests: Add coverage of mocs registers")
Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/i915/gt/selftest_mocs.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c 
b/drivers/gpu/drm/i915/gt/selftest_mocs.c
index de1f83100fb6..8a94a546d580 100644
--- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
@@ -199,7 +199,7 @@ static int check_l3cc_table(struct intel_engine_cs *engine,
return 0;
 }
 
-static int check_mocs_engine(struct live_mocs *arg,
+static int check_mocs_engine(struct intel_gt *gt, struct live_mocs *arg,
 struct intel_context *ce)
 {
struct i915_vma *vma = arg->scratch;
@@ -222,7 +222,7 @@ static int check_mocs_engine(struct live_mocs *arg,
 
/* Read the mocs tables back using SRM */
offset = i915_ggtt_offset(vma);
-   if (!err)
+   if (!err && !IS_DGFX(gt->i915))
err = read_mocs_table(rq, &arg->table, &offset);
if (!err && ce->engine->class == RENDER_CLASS)
err = read_l3cc_table(rq, &arg->table, &offset);
@@ -235,7 +235,7 @@ static int check_mocs_engine(struct live_mocs *arg,
 
/* Compare the results against the expected tables */
vaddr = arg->vaddr;
-   if (!err)
+   if (!err && !IS_DGFX(gt->i915))
err = check_mocs_table(ce->engine, &arg->table, &vaddr);
if (!err && ce->engine->class == RENDER_CLASS)
err = check_l3cc_table(ce->engine, &arg->table, &vaddr);
@@ -262,7 +262,7 @@ static int live_mocs_kernel(void *arg)
 
for_each_engine(engine, gt, id) {
intel_engine_pm_get(engine);
-   err = check_mocs_engine(&mocs, engine->kernel_context);
+   err = check_mocs_engine(gt, &mocs, engine->kernel_context);
intel_engine_pm_put(engine);
if (err)
break;
@@ -295,7 +295,7 @@ static int live_mocs_clean(void *arg)
break;
}
 
-   err = check_mocs_engine(&mocs, ce);
+   err = check_mocs_engine(gt, &mocs, ce);
intel_context_put(ce);
if (err)
break;
@@ -332,7 +332,7 @@ static int active_engine_reset(struct intel_context *ce,
return err;
 }
 
-static int __live_mocs_reset(struct live_mocs *mocs,
+static int __live_mocs_reset(struct intel_gt *gt, struct live_mocs *mocs,
 struct intel_context *ce)
 {
int err;
@@ -341,7 +341,7 @@ static int __live_mocs_reset(struct live_mocs *mocs,
if (err)
return err;
 
-   err = check_mocs_engine(mocs, ce);
+   err = check_mocs_engine(gt, mocs, ce);
if (err)
return err;
 
@@ -349,13 +349,13 @@ static int __live_mocs_reset(struct live_mocs *mocs,
if (err)
return err;
 
-   err = check_mocs_engine(mocs, ce);
+   err = check_mocs_engine(gt, mocs, ce);
if (err)
return err;
 
-   intel_gt_reset(ce->engine->gt, ce->engine->mask, "mocs");
+   intel_gt_reset(gt, ce->engine->mask, "mocs");
 
-   err = check_mocs_engine(mocs, ce);
+   err = check_mocs_engine(gt, mocs, ce);
if (err)
return err;
 
@@ -390,7 +390,7 @@ static int live_mocs_reset(void *arg)
}
 
intel_engine_pm_get(engine);
-   err = __live_mocs_reset(&mocs, ce);
+   err = __live_mocs_reset(gt, &mocs, ce);
intel_engine_pm_put(engine);
 
intel_context_put(ce);
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Make use of drm_gem_object_release

2020-01-16 Thread Brian Welty


On 1/16/2020 11:30 AM, Chris Wilson wrote:
> Quoting Brian Welty (2020-01-16 19:20:47)
>> As i915 is using drm_gem_private_object_init, it is best to
>> use the inverse function for cleanup: drm_gem_object_release.
>> This removes need for a shmem_release and phys_release.
>>
>> Signed-off-by: Brian Welty 
>> ---
>> Chris, the cleanup sequence in drm_gem_object_release() vs the replaced
>> i915 code is different, but should be okay?  Light testing didn't find
>> any issues.
> 
> commit 0c159ffef628fa94d0f4f9128e7f2b6f2b5e86ef
> Author: Chris Wilson 
> Date:   Wed Jul 3 19:06:01 2019 +0100
> 
> drm/i915/gem: Defer obj->base.resv fini until RCU callback
> 
> Since reservation_object_fini() does an immediate free, rather than
> kfree_rcu as normal, we have to delay the release until after the RCU
> grace period has elapsed (i.e. from the rcu cleanup callback) so that we
> can rely on the RCU protected access to the fences while the object is a
> zombie.
> 
> i915_gem_busy_ioctl relies on having an RCU barrier to protect the
> reservation in order to avoid having to take a reference and strong
> memory barriers.
> 
> v2: Order is important; only release after putting the pages!
> 
> Fixes: c03467ba40f7 ("drm/i915/gem: Free pages before rcu-freeing the 
> object
> ")
> Testcase: igt/gem_busy/close-race
> Signed-off-by: Chris Wilson 
> Cc: Matthew Auld 
> Reviewed-by: Mika Kuoppala 
> Link: 
> https://patchwork.freedesktop.org/patch/msgid/20190703180601.10950-1-c
> h...@chris-wilson.co.uk
> 

Thanks, I didn't check the history to see that this was using drm_gem_object_release
in the past.
But it looks to be using kfree_rcu now for the free.

Are we okay now, given that this patch has gone in since?:

commit 96e95496b02dbf1b19a2d4ce238810572e149606
Author: Christian König 
Date:   Tue Aug 6 13:33:12 2019 +0200

dma-buf: fix shared fence list handling in reservation_object_copy_fences




___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Make use of drm_gem_object_release

2020-01-16 Thread Brian Welty
As i915 is using drm_gem_private_object_init, it is best to
use the inverse function for cleanup: drm_gem_object_release.
This removes the need for shmem_release and phys_release.

Signed-off-by: Brian Welty 
---
Chris, the cleanup sequence in drm_gem_object_release() vs the replaced
i915 code is different, but should be okay?  Light testing didn't find
any issues.
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 4 +---
 drivers/gpu/drm/i915/gem/i915_gem_phys.c   | 7 ---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 9 +
 3 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 46bacc82ddc4..d51838d7d2ec 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -159,7 +159,6 @@ static void __i915_gem_free_object_rcu(struct rcu_head 
*head)
container_of(head, typeof(*obj), rcu);
struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
-   dma_resv_fini(&obj->base._resv);
i915_gem_object_free(obj);

GEM_BUG_ON(!atomic_read(&obj->mm.free_count));
@@ -222,8 +221,7 @@ static void __i915_gem_free_objects(struct drm_i915_private 
*i915,
if (obj->base.import_attach)
drm_prime_gem_destroy(&obj->base, NULL);

-   drm_gem_free_mmap_offset(&obj->base);
-
+   drm_gem_object_release(&obj->base);
if (obj->ops->release)
obj->ops->release(obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c 
b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index b1b7c1b3038a..7c19f92f256b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -134,16 +134,9 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object 
*obj,
drm_pci_free(obj->base.dev, obj->phys_handle);
 }
 
-static void phys_release(struct drm_i915_gem_object *obj)
-{
-   fput(obj->base.filp);
-}
-
 static const struct drm_i915_gem_object_ops i915_gem_phys_ops = {
.get_pages = i915_gem_object_get_pages_phys,
.put_pages = i915_gem_object_put_pages_phys,
-
-   .release = phys_release,
 };
 
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index a2a980d9d241..4004cfe1e28a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -418,13 +418,6 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
return 0;
 }
 
-static void shmem_release(struct drm_i915_gem_object *obj)
-{
-   i915_gem_object_release_memory_region(obj);
-
-   fput(obj->base.filp);
-}
-
 const struct drm_i915_gem_object_ops i915_gem_shmem_ops = {
.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE |
 I915_GEM_OBJECT_IS_SHRINKABLE,
@@ -436,7 +429,7 @@ const struct drm_i915_gem_object_ops i915_gem_shmem_ops = {
 
.pwrite = shmem_pwrite,
 
-   .release = shmem_release,
+   .release = i915_gem_object_release_memory_region,
 };
 
 static int __create_shmem(struct drm_i915_private *i915,
-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Fix a bug calling sleep function in atomic context

2019-11-13 Thread Brian Welty


On 11/12/2019 4:28 PM, Bruce Chang wrote:
> There are quite a few reports regarding "BUG: sleeping function called from
> invalid context at mm/page_alloc.c"
> 
> Basically after the io_mapping_map_atomic_wc/kmap_atomic, it enters atomic
> context, but compress_page cannot be called in atomic context as it will
> call pool_alloc with GFP_KERNEL flag which can go to sleep. This is why
> the bug got reported.
> 
> So, changed to non atomic version instead.

The atomic functions were recently added, so it seems worth a note that
you are fixing that patch by adding:
Fixes: 895d8ebeaa924 ("drm/i915: error capture with no ggtt slot")

And your fix here looks correct to me, so:
Reviewed-by: Brian Welty 


> 
> Signed-off-by: Bruce Chang 
> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 1f2f266f26af..7118ecb7f144 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1007,67 +1007,67 @@ i915_error_object_create(struct drm_i915_private 
> *i915,
>   compress->wc = i915_gem_object_is_lmem(vma->obj) ||
>  drm_mm_node_allocated(>error_capture);
>  
>   ret = -EINVAL;
>   if (drm_mm_node_allocated(>error_capture)) {
>   void __iomem *s;
>   dma_addr_t dma;
>  
>   for_each_sgt_daddr(dma, iter, vma->pages) {
>   ggtt->vm.insert_page(&ggtt->vm, dma, slot,
>I915_CACHE_NONE, 0);
>  
>   s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);
>   ret = compress_page(compress, (void  __force *)s, dst);
>   io_mapping_unmap(s);
>   if (ret)
>   break;
>   }
>   } else if (i915_gem_object_is_lmem(vma->obj)) {
>   struct intel_memory_region *mem = vma->obj->mm.region;
>   dma_addr_t dma;
>  
>   for_each_sgt_daddr(dma, iter, vma->pages) {
>   void __iomem *s;
>  
> - s = io_mapping_map_atomic_wc(&mem->iomap, dma);
> + s = io_mapping_map_wc(&mem->iomap, dma, PAGE_SIZE);
>   ret = compress_page(compress, (void __force *)s, dst);
> - io_mapping_unmap_atomic(s);
> + io_mapping_unmap(s);
>   if (ret)
>   break;
>   }
>   } else {
>   struct page *page;
>  
>   for_each_sgt_page(page, iter, vma->pages) {
>   void *s;
>  
>   drm_clflush_pages(&page, 1);
>  
> - s = kmap_atomic(page);
> + s = kmap(page);
>   ret = compress_page(compress, s, dst);
> - kunmap_atomic(s);
> + kunmap(s);
>  
>   drm_clflush_pages(&page, 1);
>  
>   if (ret)
>   break;
>   }
>   }
>  
>   if (ret || compress_flush(compress, dst)) {
>   while (dst->page_count--)
>   pool_free(&compress->pool, dst->pages[dst->page_count]);
>   kfree(dst);
>   dst = NULL;
>   }
>   compress_finish(compress);
>  
>   return dst;
>  }
>  
>  /*
>   * Generate a semi-unique error code. The code is not meant to have meaning, 
> The
>   * code's only purpose is to try to prevent false duplicated bug reports by
>   * grossly estimating a GPU error state.
>   *
>   * TODO Ideally, hashing the batchbuffer would be a very nice way to 
> determine
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: disable set/get_tiling ioctl on gen12+

2019-11-07 Thread Brian Welty


On 8/28/2019 11:50 PM, Daniel Vetter wrote:
> On Wed, Aug 28, 2019 at 08:31:27PM +, Souza, Jose wrote:
>> On Wed, 2019-08-28 at 21:13 +0100, Chris Wilson wrote:
>>> Quoting Souza, Jose (2019-08-28 21:11:53)
 Reviewed-by: José Roberto de Souza 
>>>
>>> It's using a non-standard for i915 error code, and get_tiling is not
>>
>> Huum should it use ENOTSUPP then?!
> 
> https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#recommended-ioctl-return-values
> 
> Per above "feature not supported" -> EOPNOTSUPP.
> 
>>> affected, it will always return LINEAR. You cannot set tiling as 
>>
>> Following this set_tiling() LINEAR should be allowed too.
>> I prefer the current approach of returning error.
> 
> I'm not seeing the value in keeping get_tiling supported. Either userspace
> still uses the legacy backhannel and dri2, in which case it needs to be
> fixed no matter what. Or it's using modifiers, in which case this is dead
> code. Only other user I can think of is takeover for fastboot, but if you
> get anything else than untiled it's also broken (we don't have an ioctl to
> read out the modifiers, heck even all the planes, there's no getfb2).
> 
> So really not seeing the point in keeping that working.

Daniel, I came across usage of GET_TILING in libdrm.
It is used in drm_intel_bo_gem_create_from_name() and
drm_intel_bo_gem_create_from_prime().
Should these be updated so they don't fail when EOPNOTSUPP is returned on
gen12+?
Maybe libdrm should just set tiling_mode to 0 on an EOPNOTSUPP error instead of
those calls failing?
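
Something along these lines is what I have in mind (just a sketch against
libdrm's intel backend; the surrounding variable names and the error label
are assumptions taken from context, not a tested patch):

	struct drm_i915_gem_get_tiling get_tiling = { .handle = bo_gem->gem_handle };
	int ret;

	ret = drmIoctl(bufmgr_gem->fd, DRM_IOCTL_I915_GEM_GET_TILING, &get_tiling);
	if (ret != 0 && errno == EOPNOTSUPP) {
		/* gen12+ removed the ioctl: treat the bo as linear instead of failing */
		get_tiling.tiling_mode = I915_TILING_NONE;
		get_tiling.swizzle_mode = I915_BIT_6_SWIZZLE_NONE;
		ret = 0;
	}
	if (ret != 0)
		goto err_unref;
	bo_gem->tiling_mode = get_tiling.tiling_mode;
	bo_gem->swizzle_mode = get_tiling.swizzle_mode;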

-Brian

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread

2019-09-26 Thread Brian Welty

On 9/26/2019 7:25 AM, Chris Wilson wrote:
> Moving our primary irq handler to a RT thread incurs an extra 1us delay
> in process interrupts. This is most notice in waking up client threads,
> where it adds about 20% of extra latency. It also imposes a delay in
> feeding the GPU, an extra 1us before signaling secondary engines and
> extra latency in resubmitting work to keep the GPU busy. The latter case
> is insignificant as the latency hidden by the active GPU, and
> preempt-to-busy ensures that no extra latency is incurred for
> preemption.
> 
> The benefit is that we reduced the impact on the rest of the system, the
> cycletest shows a reduction from 5us mean latency to 2us, with the
> maximum observed latency (in a 2 minute window) reduced by over 160us.

Hi Chris,

I'm curious to understand this a little better.
If the only benefit of this patch is improving overall system performance,
can you say why i915 interrupt handling doesn't fit into what I thought was
the common usage of threaded interrupts...

Usually with request_threaded_irq(), I thought both handlers are typically
specified, so you'd only fall back to the threaded handler when the
interrupt-context handler is overwhelmed?
Like so:
while (HW events need some action) {
	do something ...
	if (too overwhelmed)	/* i.e. reduce load on system */
		return IRQ_WAKE_THREAD;
}
return IRQ_HANDLED;

Likely you considered something like the above.
I'm just looking to understand why using only the threaded interrupt handler is
best in this case.
(Also, maybe discuss this a little bit further in the commit message...?)
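
For reference, a minimal sketch of the split registration I mean is below;
the primary/thread handler bodies and the "overwhelmed" test are hypothetical
placeholders, not existing i915 code:

	static irqreturn_t intel_irq_primary(int irq, void *arg)
	{
		struct drm_i915_private *i915 = arg;

		if (!mydrv_irq_pending(i915))		/* placeholder check */
			return IRQ_NONE;
		if (mydrv_irq_overwhelmed(i915))	/* too much for hardirq context */
			return IRQ_WAKE_THREAD;		/* defer to the threaded handler */
		mydrv_irq_handle_fast(i915);		/* placeholder fast path */
		return IRQ_HANDLED;
	}

	static irqreturn_t intel_irq_thread(int irq, void *arg)
	{
		mydrv_irq_handle_slow(arg);		/* placeholder slow path */
		return IRQ_HANDLED;
	}

	/* in intel_irq_install(), instead of passing NULL as the primary handler: */
	ret = request_threaded_irq(irq, intel_irq_primary, intel_irq_thread,
				   IRQF_SHARED, DRIVER_NAME, dev_priv);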

Sorry if this was perhaps discussed in response to Tvrtko's question.
I didn't quite follow the fast/slow separation mentioned in that thread.

Thanks,
-Brian


> 
> Signed-off-by: Chris Wilson 
> Cc: Joonas Lahtinen 
> Cc: Tvrtko Ursulin 
> Cc: Clark Williams 
> Cc: Sebastian Andrzej Siewior 
> ---
> Note this should need the fixes in
> 20190926105644.16703-2-bige...@linutronix.de, by itself this should be a
> test vehicle to exercise that patch!
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index bc83f094065a..f3df7714a3f3 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
>  
>   intel_irq_reset(dev_priv);
>  
> - ret = request_irq(irq, intel_irq_handler(dev_priv),
> -   IRQF_SHARED, DRIVER_NAME, dev_priv);
> + ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> +IRQF_SHARED, DRIVER_NAME, dev_priv);
>   if (ret < 0) {
>   dev_priv->drm.irq_enabled = false;
>   return ret;
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC PATCH 0/3] Propose new struct drm_mem_region

2019-07-30 Thread Brian Welty

On 7/30/2019 2:34 AM, Daniel Vetter wrote:
> On Tue, Jul 30, 2019 at 08:45:57AM +, Koenig, Christian wrote:
>> Yeah, that looks like a good start. Just a couple of random design 
>> comments/requirements.
>>
>> First of all please restructure the changes so that you more or less 
>> have the following:
>> 1. Adding of the new structures and functionality without any change to 
>> existing code.
>> 2. Replacing the existing functionality in TTM and all of its drivers.
>> 3. Replacing the existing functionality in i915.
>>
>> This should make it much easier to review the new functionality when it 
>> is not mixed with existing TTM stuff.

Sure, understood.  But I hope it's fair that I wouldn't be updating all
drivers in an RFC series until there is a bit of clarity/agreement on any
path forward.  But I can include the amdgpu patch next time.

>>
>>
>> Second please completely drop the concept of gpu_offset or start of the 
>> memory region like here:
>>> drm_printf(p, "gpu_offset: 0x%08llX\n", man->region.start);
>> At least on AMD hardware we have the following address spaces which are 
>> sometimes even partially overlapping: VM, MC, SYSTEM, FB, AGP, XGMI, bus 
>> addresses and physical addresses.
>>
>> Pushing a concept of a general GPU address space into the memory 
>> management was a rather bad design mistake in TTM and we should not 
>> repeat that here.
>>
>> A region should only consists of a size in bytes and (internal to the 
>> region manager) allocations in that region.

Got it. I was trying to include fields that seemed relevant to a base
structure and could then optionally be leveraged at the discretion of the
device driver.  But I see your point.

>>
>>
>> Third please don't use any CPU or architecture specific types in any 
>> data structures:
>>> +struct drm_mem_region {
>>> +   resource_size_t start; /* within GPU physical address space */
>>> +   resource_size_t io_start; /* BAR address (CPU accessible) */
>>> +   resource_size_t size;
>>
>> I knew that resource_size is mostly 64bit on modern architectures, but 
>> dGPUs are completely separate to the architecture and we always need 
>> 64bits here at least for AMD hardware.
>>
>> So this should either be always uint64_t, or something like 
>> gpu_resource_size which depends on what the compiled in drivers require 
>> if we really need that.
>>
>> And by the way: Please always use bytes for things like sizes and not 
>> number of pages, cause page size is again CPU/architecture specific and 
>> GPU drivers don't necessary care about that.

Makes sense,  will fix.

Hmm, I did hope that at least the DRM cgroup controller could leverage
struct page_counter.  It nicely encapsulates many of the fields needed for
managing a memory limit.  But well, this is off topic.

>>
>>
>> And here also a few direct comments on the code:
>>> +   union {
>>> +   struct drm_mm *mm;
>>> +   /* FIXME (for i915): struct drm_buddy_mm *buddy_mm; */
>>> +   void *priv;
>>> +   };
>> Maybe just always use void *mm here.
> 
> I'd say lets drop this outright, and handle private data by embedding this
> structure in the right place. That's how we handle everything in drm now
> as much as possible, and it's so much cleaner. I think ttm still loves
> priv pointers a bit too much in some places.

Okay, I'll drop it until I can show it might be useful later.

>
>>> +   spinlock_t move_lock;
>>> +   struct dma_fence *move;
>>
>> That is TTM specific and I'm not sure if we want it in the common memory 
>> management handling.
>>
>> If we want that here we should probably replace the lock with some rcu 
>> and atomic fence pointer exchange first.
> 
> Yeah  not sure we want any of these details in this shared structure
> either.
> 

Thanks for the feedback. I can remove it too.
I was unsure if there might be a case for having it in the future.

Well, struct drm_mem_region will be quite small then if it only has a
size and type field.
It hardly seems worth introducing a new structure if these are the only fields.
I know we thought it might benefit the cgroups controller, but I still hope to
find an earlier purpose it could serve.
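
For discussion, the stripped-down version would be little more than this
(field names and types here are only a sketch, nothing settled):

	struct drm_mem_region {
		u64 size;	/* always bytes, per Christian's comment */
		u8 type;	/* DRM_MEM_* memory type */
	};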

-Brian


[snip]

>>
>> Am 30.07.19 um 02:32 schrieb Brian Welty:
>>> [ By request, resending to include amd-gfx + intel-gfx.  Since resending,
>>>I fixed the nit with ordering of header includes that Sam noted. ]
>>>
>>> This RFC series is first implementation of some ideas expressed
>>> earlier on dri-

Re: [Intel-gfx] [RFC PATCH 0/3] Propose new struct drm_mem_region

2019-07-30 Thread Brian Welty

On 7/30/2019 3:45 AM, Daniel Vetter wrote:
> On Tue, Jul 30, 2019 at 12:24 PM Koenig, Christian
>  wrote:
>>
>> Am 30.07.19 um 11:38 schrieb Daniel Vetter:
>>> On Tue, Jul 30, 2019 at 08:45:57AM +, Koenig, Christian wrote:

Snipped the feedback on struct drm_mem_region.
Will be easier to have separate thread.

>>>>
>>>>> +/*
>>>>> + * Memory types for drm_mem_region
>>>>> + */
>>>> #define DRM_MEM_SWAP?
>>> btw what did you have in mind for this? Since we use shmem we kinda don't
>>> know whether the BO is actually swapped out or not, at least on the i915
>>> side. So this would be more NOT_CURRENTLY_PINNED_AND_POSSIBLY_SWAPPED_OUT.
>>
>> Yeah, the problem is not everybody can use shmem. For some use cases you
>> have to use memory allocated through dma_alloc_coherent().
>>
>> So to be able to swap this out you need a separate domain to copy it
>> from whatever is backing it currently to shmem.
>>
>> So we essentially have:
>> DRM_MEM_SYS_SWAPABLE
>> DRM_MEM_SYS_NOT_GPU_MAPPED
>> DRM_MEM_SYS_GPU_MAPPED
>>
>> Or something like that.
> 
> Yeah i915-gem is similar. We oportunistically keep the pages pinned
> sometimes even if not currently mapped into the (what ttm calls) TT.
> So I think these three for system memory make sense for us too. I
> think that's similar (at least in spirit) to the dma_alloc cache you
> have going on. Mabye instead of the somewhat cumbersome NOT_GPU_MAPPED
> we could have something like PINNED or so. Although it's not
> permanently pinned, so maybe that's confusing too.
> 

Okay, I see now I was far off the mark with what I thought TTM_PL_SYSTEM
was.  The discussion helped clear up several bits of confusion on my part.
From the proposed names, I find MAPPED and PINNED slightly confusing.
In terms of backing store description, maybe these are a little better:
  DRM_MEM_SYS_UNTRANSLATED  (TTM_PL_SYSTEM)
  DRM_MEM_SYS_TRANSLATED    (TTM_PL_TT or i915's SYSTEM)

Are these allowed to overlap, or must they be non-overlapping (partitioned)?
Per Christian's point about removing .start, it seems it doesn't need to
matter.

Whatever we define for these sub-types, does it make sense for SYSTEM and
VRAM to each have them defined?

I'm unclear how DRM_MEM_SWAP (or DRM_MEM_SYS_SWAPABLE) would get
configured by the driver...  Is this a fixed-size partition of host memory?
Or is it a kind of dummy memory region just for the swap implementation?
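
To make the naming question concrete, the system-memory sub-types above might
be spelled something like the following (just a sketch; the names and values
are placeholders, not something anyone has agreed on):

	/* backing-store sub-types for DRM_MEM_SYSTEM (sketch only) */
	#define DRM_MEM_SYS_UNTRANSLATED	0	/* TTM_PL_SYSTEM: not GPU mapped */
	#define DRM_MEM_SYS_TRANSLATED		1	/* TTM_PL_TT / i915 SYSTEM: GPU mapped */
	#define DRM_MEM_SYS_SWAPABLE		2	/* pages may be swapped or shrunk */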


>>>> TTM was clearly missing that resulting in a whole bunch of extra
>>>> handling and rather complicated handling.
>>>>
>>>>> +#define DRM_MEM_SYSTEM 0
>>>>> +#define DRM_MEM_STOLEN 1
>>>> I think we need a better naming for that.
>>>>
>>>> STOLEN sounds way to much like stolen VRAM for integrated GPUs, but at
>>>> least for TTM this is the system memory currently GPU accessible.
>>> Yup this is wrong, for i915 we use this as stolen, for ttm it's the gpu
>>> translation table window into system memory. Not the same thing at all.
>>
>> Thought so. The closest I have in mind is GTT, but everything else works
>> as well.
> 
> Would your GPU_MAPPED above work for TT? I think we'll also need
> STOLEN, I'm even hearing noises that there's going to be stolen for
> discrete vram for us ... Also if we expand I guess we need to teach
> ttm to cope with more, or maybe treat the DRM one as some kind of
> sub-flavour.
Daniel, maybe what i915 calls stolen could just be DRM_MEM_RESERVED or
DRM_MEM_PRIV.  Or maybe one can argue it falls into the UNTRANSLATED type that
I suggested above; I'm not sure.

-Brian


> -Daniel
> 
>>
>> Christian.
>>
>>> -Daniel
>>>
>>>>
>>>> Thanks for looking into that,
>>>> Christian.
>>>>
>>>> Am 30.07.19 um 02:32 schrieb Brian Welty:
>>>>> [ By request, resending to include amd-gfx + intel-gfx.  Since resending,
>>>>> I fixed the nit with ordering of header includes that Sam noted. ]
>>>>>
>>>>> This RFC series is first implementation of some ideas expressed
>>>>> earlier on dri-devel [1].
>>>>>
>>>>> Some of the goals (open for much debate) are:
>>>>> - Create common base structure (subclass) for memory regions (patch 
>>>>> #1)
>>>>> - Create common memory region types (patch #2)
>>>>> - Create common set of memory_region function callbacks (based on
>>>>>   ttm_mem_type_manager_funcs and intel_memory_regions_ops)
>>>>> - Create comm

[Intel-gfx] [RFC PATCH 0/3] Propose new struct drm_mem_region

2019-07-29 Thread Brian Welty
[ By request, resending to include amd-gfx + intel-gfx.  Since resending,
  I fixed the nit with ordering of header includes that Sam noted. ]

This RFC series is first implementation of some ideas expressed
earlier on dri-devel [1].

Some of the goals (open for much debate) are:
  - Create common base structure (subclass) for memory regions (patch #1)
  - Create common memory region types (patch #2)
  - Create common set of memory_region function callbacks (based on
ttm_mem_type_manager_funcs and intel_memory_regions_ops); a rough sketch
of what this might look like follows this list
  - Create common helpers that operate on drm_mem_region to be leveraged
by both TTM drivers and i915, reducing code duplication
  - Above might start with refactoring ttm_bo_manager.c as these are
helpers for using drm_mm's range allocator and could be made to
operate on DRM structures instead of TTM ones.
  - Larger goal might be to make LRU management of GEM objects common, and
migrate those fields into drm_mem_region and drm_gem_object structures.
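
As a rough sketch of the "common set of memory_region function callbacks"
bullet above (names and signatures are placeholders, loosely modeled on
ttm_mem_type_manager_funcs; nothing here is final):

	struct drm_mem_region;

	struct drm_mem_region_ops {
		int  (*init)(struct drm_mem_region *region);
		void (*release)(struct drm_mem_region *region);
		int  (*alloc)(struct drm_mem_region *region, u64 size,
			      u64 alignment, struct drm_mm_node *node);
		void (*free)(struct drm_mem_region *region,
			     struct drm_mm_node *node);
	};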

Patches 1-2 implement the proposed struct drm_mem_region and add an
associated common set of definitions for memory region types.

Patch #3 is an update to i915 and is based upon another series which is
in progress to add vram support to i915 [2].

[1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
[2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html

Brian Welty (3):
  drm: introduce new struct drm_mem_region
  drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
  drm/i915: Update intel_memory_region to use nested drm_mem_region

 drivers/gpu/drm/i915/gem/i915_gem_object.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 10 ++---
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
 drivers/gpu/drm/i915/i915_query.c |  2 +-
 drivers/gpu/drm/i915/intel_memory_region.c| 10 +++--
 drivers/gpu/drm/i915/intel_memory_region.h| 19 +++--
 drivers/gpu/drm/i915/intel_region_lmem.c  | 26 ++---
 .../drm/i915/selftests/intel_memory_region.c  |  8 ++--
 drivers/gpu/drm/ttm/ttm_bo.c  | 34 +---
 drivers/gpu/drm/ttm/ttm_bo_manager.c  | 14 +++
 drivers/gpu/drm/ttm/ttm_bo_util.c | 11 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c|  4 +-
 include/drm/drm_mm.h  | 39 ++-
 include/drm/ttm/ttm_bo_api.h  |  2 +-
 include/drm/ttm/ttm_bo_driver.h   | 16 
 include/drm/ttm/ttm_placement.h   |  8 ++--
 18 files changed, 124 insertions(+), 93 deletions(-)

-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC PATCH 1/3] drm: introduce new struct drm_mem_region

2019-07-29 Thread Brian Welty
Move basic members of ttm_mem_type_manager into a new DRM memory region
structure.  The idea is for this base structure to be nested inside
the TTM structure and later in Intel's proposed intel_memory_region.

As comments in the code suggest, the following future work can extend
the usefulness of this:
- Create common memory region types (next patch)
- Create common set of memory_region function callbacks (based on
  ttm_mem_type_manager_funcs and intel_memory_regions_ops)
- Create common helpers that operate on drm_mem_region to be leveraged
  by both TTM drivers and i915, reducing code duplication
- Above might start with refactoring ttm_bo_manager.c as these are
  helpers for using drm_mm's range allocator and could be made to
  operate on DRM structures instead of TTM ones.
- Larger goal might be to make LRU management of GEM objects common, and
  migrate those fields into drm_mem_region and drm_gem_object structures.

vmwgfx changes are included here just as an example of what driver updates
will look like, and can be moved later to a separate patch.  Other TTM drivers
need to be updated similarly.
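
For reference, the region structure this patch introduces looks roughly like
the following; field names and layout are inferred from how the series and
the review thread use it, so this sketch may not match the posted patch
exactly:

	struct drm_mem_region {
		resource_size_t start;		/* within GPU physical address space */
		resource_size_t io_start;	/* BAR address (CPU accessible) */
		resource_size_t size;
		u8 type;			/* DRM_MEM_* type (next patch) */

		union {
			struct drm_mm *mm;	/* range allocator state */
			void *priv;		/* driver-private allocator data */
		};

		spinlock_t move_lock;
		struct dma_fence *move;		/* last region move, guarded by move_lock */

		struct io_mapping iomap;	/* CPU mapping of the aperture (used by i915) */
	};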

Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/ttm/ttm_bo.c  | 34 +++
 drivers/gpu/drm/ttm/ttm_bo_manager.c  | 14 
 drivers/gpu/drm/ttm/ttm_bo_util.c | 11 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++---
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c|  4 +--
 include/drm/drm_mm.h  | 31 +++--
 include/drm/ttm/ttm_bo_api.h  |  2 +-
 include/drm/ttm/ttm_bo_driver.h   | 16 -
 8 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 58c403eda04e..45434ea513dd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -84,8 +84,8 @@ static void ttm_mem_type_debug(struct ttm_bo_device *bdev, 
struct drm_printer *p
drm_printf(p, "has_type: %d\n", man->has_type);
drm_printf(p, "use_type: %d\n", man->use_type);
drm_printf(p, "flags: 0x%08X\n", man->flags);
-   drm_printf(p, "gpu_offset: 0x%08llX\n", man->gpu_offset);
-   drm_printf(p, "size: %llu\n", man->size);
+   drm_printf(p, "gpu_offset: 0x%08llX\n", man->region.start);
+   drm_printf(p, "size: %llu\n", man->region.size);
drm_printf(p, "available_caching: 0x%08X\n", 
man->available_caching);
drm_printf(p, "default_caching: 0x%08X\n", man->default_caching);
if (mem_type != TTM_PL_SYSTEM)
@@ -399,7 +399,7 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object 
*bo,
 
if (bo->mem.mm_node)
bo->offset = (bo->mem.start << PAGE_SHIFT) +
-   bdev->man[bo->mem.mem_type].gpu_offset;
+   bdev->man[bo->mem.mem_type].region.start;
else
bo->offset = 0;
 
@@ -926,9 +926,9 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object 
*bo,
struct dma_fence *fence;
int ret;
 
-   spin_lock(&man->move_lock);
-   fence = dma_fence_get(man->move);
-   spin_unlock(&man->move_lock);
+   spin_lock(&man->region.move_lock);
+   fence = dma_fence_get(man->region.move);
+   spin_unlock(&man->region.move_lock);
 
if (fence) {
reservation_object_add_shared_fence(bo->resv, fence);
@@ -1490,9 +1490,9 @@ static int ttm_bo_force_list_clean(struct ttm_bo_device 
*bdev,
}
spin_unlock(>lru_lock);
 
-   spin_lock(&man->move_lock);
-   fence = dma_fence_get(man->move);
-   spin_unlock(&man->move_lock);
+   spin_lock(&man->region.move_lock);
+   fence = dma_fence_get(man->region.move);
+   spin_unlock(&man->region.move_lock);
 
if (fence) {
ret = dma_fence_wait(fence, false);
@@ -1535,8 +1535,8 @@ int ttm_bo_clean_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
ret = (*man->func->takedown)(man);
}
 
-   dma_fence_put(man->move);
-   man->move = NULL;
+   dma_fence_put(man->region.move);
+   man->region.move = NULL;
 
return ret;
 }
@@ -1561,7 +1561,7 @@ int ttm_bo_evict_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
 EXPORT_SYMBOL(ttm_bo_evict_mm);
 
 int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
-   unsigned long p_size)
+  resource_size_t p_size)
 {
int ret;
struct ttm_mem_type_manager *man;
@@ -1570,10 +1570,16 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned 
type,
BUG_ON(type >= TTM_NUM_MEM_TYPES);
man = &bdev->man[type];
BUG_ON(man->has_type);
+
+   /* FIXME: add call to (new) drm_mem_region_init ? */
+   ma

[Intel-gfx] [RFC PATCH 3/3] drm/i915: Update intel_memory_region to use nested drm_mem_region

2019-07-29 Thread Brian Welty
Some fields are deleted from intel_memory_region in favor of using the
new nested drm_mem_region structure instead.

Note, this is based upon unmerged i915 series [1] in order to show how
i915 might begin to integrate the proposed drm_mem_region.

[1] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html

Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 10 +++
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
 drivers/gpu/drm/i915/i915_query.c |  2 +-
 drivers/gpu/drm/i915/intel_memory_region.c| 10 ---
 drivers/gpu/drm/i915/intel_memory_region.h| 19 --
 drivers/gpu/drm/i915/intel_region_lmem.c  | 26 +--
 .../drm/i915/selftests/intel_memory_region.c  |  8 +++---
 9 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 73d2d72adc19..7e56fd89a972 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -606,7 +606,7 @@ static int i915_gem_object_region_select(struct 
drm_i915_private *dev_priv,
ret = i915_gem_object_migrate(obj, ce, id);
if (!ret) {
if (MEMORY_TYPE_FROM_REGION(region) ==
-   INTEL_LMEM) {
+   DRM_MEM_VRAM) {
/*
 * TODO: this should be part of get_pages(),
 * when async get_pages arrives
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index d24f34443c4c..ac18e73665d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -53,7 +53,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 * If there's no chance of allocating enough pages for the whole
 * object, bail early.
 */
-   if (obj->base.size > resource_size(&mem->region))
+   if (obj->base.size > mem->region.size)
return -ENOMEM;
 
st = kmalloc(sizeof(*st), GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2288a55f27f1..f4adc7e397ff 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2737,20 +2737,20 @@ int i915_gem_init_memory_regions(struct 
drm_i915_private *i915)
 
for (i = 0; i < ARRAY_SIZE(intel_region_map); i++) {
struct intel_memory_region *mem = NULL;
-   u32 type;
+   u8 type;
 
if (!HAS_REGION(i915, BIT(i)))
continue;
 
type = MEMORY_TYPE_FROM_REGION(intel_region_map[i]);
switch (type) {
-   case INTEL_SMEM:
+   case DRM_MEM_SYSTEM:
mem = i915_gem_shmem_setup(i915);
break;
-   case INTEL_STOLEN:
+   case DRM_MEM_STOLEN:
mem = i915_gem_stolen_setup(i915);
break;
-   case INTEL_LMEM:
+   case DRM_MEM_VRAM:
mem = i915_gem_setup_fake_lmem(i915);
break;
}
@@ -2762,7 +2762,7 @@ int i915_gem_init_memory_regions(struct drm_i915_private 
*i915)
}
 
mem->id = intel_region_map[i];
-   mem->type = type;
+   mem->region.type = type;
mem->instance = 
MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
 
i915->regions[i] = mem;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9feb597f2b01..908691c3aadb 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1048,7 +1048,7 @@ i915_error_object_create(struct drm_i915_private *i915,
struct intel_memory_region *mem = vma->obj->memory_region;
 
for_each_sgt_dma(dma, iter, vma->pages) {
-   s = io_mapping_map_atomic_wc(&mem->iomap, dma);
+   s = io_mapping_map_atomic_wc(&mem->region.iomap, dma);
ret = compress_page(compress, s, dst);
io_mapping_unmap_atomic(s);
 
diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index 21c4c2592d6c..d16b4a6688e8 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -184,7 +184,7 @@ static int query_memregion_info(struct drm_i915_private 
*dev_priv,
continue;
 
info.id = region->id;
-   info.size = resource_size(>

[Intel-gfx] [RFC PATCH 2/3] drm: Introduce DRM_MEM defines for specifying type of drm_mem_region

2019-07-29 Thread Brian Welty
Introduce DRM memory region types to be common for both drivers using
TTM and for i915.  For now, TTM continues to define its own set but
uses the DRM base definitions.

Signed-off-by: Brian Welty 
---
 include/drm/drm_mm.h| 8 
 include/drm/ttm/ttm_placement.h | 8 
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 465f8d10d863..b78dc9284702 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -59,6 +59,14 @@
 struct drm_device;
 struct drm_mm;
 
+/*
+ * Memory types for drm_mem_region
+ */
+#define DRM_MEM_SYSTEM 0
+#define DRM_MEM_STOLEN 1
+#define DRM_MEM_VRAM   2
+#define DRM_MEM_PRIV   3
+
 /**
  * struct drm_mem_region
  *
diff --git a/include/drm/ttm/ttm_placement.h b/include/drm/ttm/ttm_placement.h
index e88a8e39767b..976cf8d2f899 100644
--- a/include/drm/ttm/ttm_placement.h
+++ b/include/drm/ttm/ttm_placement.h
@@ -37,10 +37,10 @@
  * Memory regions for data placement.
  */
 
-#define TTM_PL_SYSTEM   0
-#define TTM_PL_TT   1
-#define TTM_PL_VRAM 2
-#define TTM_PL_PRIV 3
+#define TTM_PL_SYSTEM   DRM_MEM_SYSTEM
+#define TTM_PL_TT   DRM_MEM_STOLEN
+#define TTM_PL_VRAM DRM_MEM_VRAM
+#define TTM_PL_PRIV DRM_MEM_PRIV
 
 #define TTM_PL_FLAG_SYSTEM  (1 << TTM_PL_SYSTEM)
 #define TTM_PL_FLAG_TT  (1 << TTM_PL_TT)
-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC PATCH 2/3] drm: Introduce DRM_MEM defines for specifying type of drm_mem_region

2019-07-29 Thread Brian Welty
Introduce DRM memory region types to be common for both drivers using
TTM and for i915.  For now, TTM continues to define its own set but
uses the DRM base definitions.

Signed-off-by: Brian Welty 
---
 include/drm/drm_mm.h| 8 
 include/drm/ttm/ttm_placement.h | 8 
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 3d123eb10d62..8178d13384bc 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -59,6 +59,14 @@
 struct drm_device;
 struct drm_mm;
 
+/*
+ * Memory types for drm_mem_region
+ */
+#define DRM_MEM_SYSTEM 0
+#define DRM_MEM_STOLEN 1
+#define DRM_MEM_VRAM   2
+#define DRM_MEM_PRIV   3
+
 /**
  * struct drm_mem_region
  *
diff --git a/include/drm/ttm/ttm_placement.h b/include/drm/ttm/ttm_placement.h
index e88a8e39767b..976cf8d2f899 100644
--- a/include/drm/ttm/ttm_placement.h
+++ b/include/drm/ttm/ttm_placement.h
@@ -37,10 +37,10 @@
  * Memory regions for data placement.
  */
 
-#define TTM_PL_SYSTEM   0
-#define TTM_PL_TT   1
-#define TTM_PL_VRAM 2
-#define TTM_PL_PRIV 3
+#define TTM_PL_SYSTEM   DRM_MEM_SYSTEM
+#define TTM_PL_TT   DRM_MEM_STOLEN
+#define TTM_PL_VRAM DRM_MEM_VRAM
+#define TTM_PL_PRIV DRM_MEM_PRIV
 
 #define TTM_PL_FLAG_SYSTEM  (1 << TTM_PL_SYSTEM)
 #define TTM_PL_FLAG_TT  (1 << TTM_PL_TT)
-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC PATCH 3/3] drm/i915: Update intel_memory_region to use nested drm_mem_region

2019-07-29 Thread Brian Welty
Some fields are deleted from intel_memory_region in favor of using the
new nested drm_mem_region structure instead.

Note, this is based upon unmerged i915 series [1] in order to show how
i915 might begin to integrate the proposed drm_mem_region.

[1] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html

Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 10 +++
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
 drivers/gpu/drm/i915/i915_query.c |  2 +-
 drivers/gpu/drm/i915/intel_memory_region.c| 10 ---
 drivers/gpu/drm/i915/intel_memory_region.h| 19 --
 drivers/gpu/drm/i915/intel_region_lmem.c  | 26 +--
 .../drm/i915/selftests/intel_memory_region.c  |  8 +++---
 9 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 73d2d72adc19..7e56fd89a972 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -606,7 +606,7 @@ static int i915_gem_object_region_select(struct 
drm_i915_private *dev_priv,
ret = i915_gem_object_migrate(obj, ce, id);
if (!ret) {
if (MEMORY_TYPE_FROM_REGION(region) ==
-   INTEL_LMEM) {
+   DRM_MEM_VRAM) {
/*
 * TODO: this should be part of get_pages(),
 * when async get_pages arrives
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index d24f34443c4c..ac18e73665d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -53,7 +53,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 * If there's no chance of allocating enough pages for the whole
 * object, bail early.
 */
-   if (obj->base.size > resource_size(&mem->region))
+   if (obj->base.size > mem->region.size)
return -ENOMEM;
 
st = kmalloc(sizeof(*st), GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2288a55f27f1..f4adc7e397ff 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2737,20 +2737,20 @@ int i915_gem_init_memory_regions(struct 
drm_i915_private *i915)
 
for (i = 0; i < ARRAY_SIZE(intel_region_map); i++) {
struct intel_memory_region *mem = NULL;
-   u32 type;
+   u8 type;
 
if (!HAS_REGION(i915, BIT(i)))
continue;
 
type = MEMORY_TYPE_FROM_REGION(intel_region_map[i]);
switch (type) {
-   case INTEL_SMEM:
+   case DRM_MEM_SYSTEM:
mem = i915_gem_shmem_setup(i915);
break;
-   case INTEL_STOLEN:
+   case DRM_MEM_STOLEN:
mem = i915_gem_stolen_setup(i915);
break;
-   case INTEL_LMEM:
+   case DRM_MEM_VRAM:
mem = i915_gem_setup_fake_lmem(i915);
break;
}
@@ -2762,7 +2762,7 @@ int i915_gem_init_memory_regions(struct drm_i915_private 
*i915)
}
 
mem->id = intel_region_map[i];
-   mem->type = type;
+   mem->region.type = type;
mem->instance = 
MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
 
i915->regions[i] = mem;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 9feb597f2b01..908691c3aadb 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1048,7 +1048,7 @@ i915_error_object_create(struct drm_i915_private *i915,
struct intel_memory_region *mem = vma->obj->memory_region;
 
for_each_sgt_dma(dma, iter, vma->pages) {
-   s = io_mapping_map_atomic_wc(&mem->iomap, dma);
+   s = io_mapping_map_atomic_wc(&mem->region.iomap, dma);
ret = compress_page(compress, s, dst);
io_mapping_unmap_atomic(s);
 
diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index 21c4c2592d6c..d16b4a6688e8 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -184,7 +184,7 @@ static int query_memregion_info(struct drm_i915_private 
*dev_priv,
continue;
 
info.id = region->id;
-   info.size = resource_size(>

[Intel-gfx] [RFC PATCH 1/3] drm: introduce new struct drm_mem_region

2019-07-29 Thread Brian Welty
Move basic members of ttm_mem_type_manager into a new DRM memory region
structure.  The idea is for this base structure to be nested inside
the TTM structure and later in Intel's proposed intel_memory_region.
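
To illustrate the intent, here is a rough sketch of the base structure.  This
is not the authoritative definition (that is in the include/drm/drm_mm.h hunk
below); the field names are inferred from how the diffs in this patch use them:

#include <linux/dma-fence.h>
#include <linux/io-mapping.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct drm_mem_region {
	resource_size_t start;    /* offset within GPU physical address space */
	resource_size_t size;
	u8 type;                  /* common DRM_MEM_* identifier (next patch) */

	struct io_mapping iomap;  /* CPU mapping of device memory, if any */

	spinlock_t move_lock;     /* protects @move */
	struct dma_fence *move;   /* fence for in-flight pipelined moves */
};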

As comments in the code suggest, the following future work can extend
the usefulness of this:
- Create common memory region types (next patch)
- Create common set of memory_region function callbacks (based on
  ttm_mem_type_manager_funcs and intel_memory_regions_ops)
- Create common helpers that operate on drm_mem_region to be leveraged
  by both TTM drivers and i915, reducing code duplication
- Above might start with refactoring ttm_bo_manager.c as these are
  helpers for using drm_mm's range allocator and could be made to
  operate on DRM structures instead of TTM ones.
- Larger goal might be to make LRU management of GEM objects common, and
  migrate those fields into drm_mem_region and drm_gem_object structures.

The vmwgfx changes are included here just as an example of what driver updates
will look like, and can be moved later to a separate patch.  Other TTM drivers
will need to be updated similarly.

Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/ttm/ttm_bo.c  | 34 +++
 drivers/gpu/drm/ttm/ttm_bo_manager.c  | 14 
 drivers/gpu/drm/ttm/ttm_bo_util.c | 11 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++---
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c|  4 +--
 include/drm/drm_mm.h  | 27 +++
 include/drm/ttm/ttm_bo_api.h  |  2 +-
 include/drm/ttm/ttm_bo_driver.h   | 16 -
 8 files changed, 73 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 58c403eda04e..45434ea513dd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -84,8 +84,8 @@ static void ttm_mem_type_debug(struct ttm_bo_device *bdev, 
struct drm_printer *p
drm_printf(p, "has_type: %d\n", man->has_type);
drm_printf(p, "use_type: %d\n", man->use_type);
drm_printf(p, "flags: 0x%08X\n", man->flags);
-   drm_printf(p, "gpu_offset: 0x%08llX\n", man->gpu_offset);
-   drm_printf(p, "size: %llu\n", man->size);
+   drm_printf(p, "gpu_offset: 0x%08llX\n", man->region.start);
+   drm_printf(p, "size: %llu\n", man->region.size);
drm_printf(p, "available_caching: 0x%08X\n", 
man->available_caching);
drm_printf(p, "default_caching: 0x%08X\n", man->default_caching);
if (mem_type != TTM_PL_SYSTEM)
@@ -399,7 +399,7 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object 
*bo,
 
if (bo->mem.mm_node)
bo->offset = (bo->mem.start << PAGE_SHIFT) +
-   bdev->man[bo->mem.mem_type].gpu_offset;
+   bdev->man[bo->mem.mem_type].region.start;
else
bo->offset = 0;
 
@@ -926,9 +926,9 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object 
*bo,
struct dma_fence *fence;
int ret;
 
-   spin_lock(&man->move_lock);
-   fence = dma_fence_get(man->move);
-   spin_unlock(&man->move_lock);
+   spin_lock(&man->region.move_lock);
+   fence = dma_fence_get(man->region.move);
+   spin_unlock(&man->region.move_lock);
 
if (fence) {
reservation_object_add_shared_fence(bo->resv, fence);
@@ -1490,9 +1490,9 @@ static int ttm_bo_force_list_clean(struct ttm_bo_device 
*bdev,
}
spin_unlock(&glob->lru_lock);
 
-   spin_lock(&man->move_lock);
-   fence = dma_fence_get(man->move);
-   spin_unlock(&man->move_lock);
+   spin_lock(&man->region.move_lock);
+   fence = dma_fence_get(man->region.move);
+   spin_unlock(&man->region.move_lock);
 
if (fence) {
ret = dma_fence_wait(fence, false);
@@ -1535,8 +1535,8 @@ int ttm_bo_clean_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
ret = (*man->func->takedown)(man);
}
 
-   dma_fence_put(man->move);
-   man->move = NULL;
+   dma_fence_put(man->region.move);
+   man->region.move = NULL;
 
return ret;
 }
@@ -1561,7 +1561,7 @@ int ttm_bo_evict_mm(struct ttm_bo_device *bdev, unsigned 
mem_type)
 EXPORT_SYMBOL(ttm_bo_evict_mm);
 
 int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
-   unsigned long p_size)
+  resource_size_t p_size)
 {
int ret;
struct ttm_mem_type_manager *man;
@@ -1570,10 +1570,16 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned 
type,
BUG_ON(type >= TTM_NUM_MEM_TYPES);
man = &bdev->man[type];
BUG_ON(man->has_type);
+
+   /* FIXME: add call to (new) drm_mem_region_init ? */
+   ma

[Intel-gfx] [RFC PATCH 0/3] Propose new struct drm_mem_region

2019-07-29 Thread Brian Welty
This RFC series is a first implementation of some ideas expressed
earlier on dri-devel [1].

Some of the goals (open for much debate) are:
  - Create common base structure (subclass) for memory regions (patch #1)
  - Create common memory region types (patch #2)
  - Create common set of memory_region function callbacks (based on
ttm_mem_type_manager_funcs and intel_memory_regions_ops)
  - Create common helpers that operate on drm_mem_region to be leveraged
by both TTM drivers and i915, reducing code duplication
  - Above might start with refactoring ttm_bo_manager.c as these are
helpers for using drm_mm's range allocator and could be made to
operate on DRM structures instead of TTM ones.
  - Larger goal might be to make LRU management of GEM objects common, and
migrate those fields into drm_mem_region and drm_gem_object structures.

Patches 1-2 implement the proposed struct drm_mem_region and add an
associated common set of definitions for memory region types.
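
For illustration only, the common type identifiers look roughly like this.
The names are taken from how the i915 conversion in patch #3 uses them; the
exact macros and values are defined in patch #2, so the values below are
placeholders:

#define DRM_MEM_SYSTEM  0   /* ordinary system memory (e.g. shmem) */
#define DRM_MEM_STOLEN  1   /* memory reserved by firmware/BIOS */
#define DRM_MEM_VRAM    2   /* dedicated device-local memory */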

Patch #3 is an update to i915 and is based upon another in-progress series
which adds VRAM support to i915 [2].
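
The nesting being proposed can be pictured as follows (illustrative only; the
real structure changes are in patches 1 and 3, and only the nested member is
shown here):

/* TTM: common base nested in the per-type manager */
struct ttm_mem_type_manager {
	struct drm_mem_region region;   /* start/size/type/move/... */
	/* remaining TTM-specific members stay here */
};

/* i915: the same common base nested in intel_memory_region */
struct intel_memory_region {
	struct drm_mem_region region;
	/* remaining i915-specific members stay here */
};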

[1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
[2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html

Brian Welty (3):
  drm: introduce new struct drm_mem_region
  drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
  drm/i915: Update intel_memory_region to use nested drm_mem_region

 drivers/gpu/drm/i915/gem/i915_gem_object.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 10 +++---
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
 drivers/gpu/drm/i915/i915_query.c |  2 +-
 drivers/gpu/drm/i915/intel_memory_region.c| 10 +++---
 drivers/gpu/drm/i915/intel_memory_region.h| 19 +++---
 drivers/gpu/drm/i915/intel_region_lmem.c  | 26 +++---
 .../drm/i915/selftests/intel_memory_region.c  |  8 ++---
 drivers/gpu/drm/ttm/ttm_bo.c  | 34 ++
 drivers/gpu/drm/ttm/ttm_bo_manager.c  | 14 
 drivers/gpu/drm/ttm/ttm_bo_util.c | 11 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++---
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c|  4 +--
 include/drm/drm_mm.h  | 35 +++
 include/drm/ttm/ttm_bo_api.h  |  2 +-
 include/drm/ttm/ttm_bo_driver.h   | 16 -
 include/drm/ttm/ttm_placement.h   |  8 ++---
 18 files changed, 122 insertions(+), 91 deletions(-)

-- 
2.21.0


[Intel-gfx] [RFC PATCH 5/5] drm/i915: Use memory cgroup for enforcing device memory limit

2019-05-01 Thread Brian Welty
The i915 driver now includes DRIVER_CGROUPS in its feature bits.

To charge device memory allocations, we need to (1) identify appropriate
cgroup to charge (currently decided at object creation time), and (2)
make the charging call at the time that memory pages are being allocated.

For (1), see prior DRM patch which associates current task's cgroup with
GEM objects as they are created.  That cgroup will be charged/uncharged
for all paging activity against the GEM object.

For (2), we call mem_cgroup_try_charge_direct() in the .get_pages callback
for the GEM object type.  Uncharging is done in .put_pages when the
memory is marked such that it can be evicted.  The try_charge() call will
fail with -ENOMEM if the current memory allocation would exceed the cgroup's
device memory maximum, allowing the driver to perform memory reclaim.

Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: dri-de...@lists.freedesktop.org
Cc: Matt Roper 
Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/i915/i915_drv.c|  2 +-
 drivers/gpu/drm/i915/intel_memory_region.c | 24 ++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5a0a59922cb4..4d496c3c3681 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3469,7 +3469,7 @@ static struct drm_driver driver = {
 * deal with them for Intel hardware.
 */
.driver_features =
-   DRIVER_GEM | DRIVER_PRIME |
+   DRIVER_GEM | DRIVER_PRIME | DRIVER_CGROUPS |
DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ,
.release = i915_driver_release,
.open = i915_driver_open,
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c 
b/drivers/gpu/drm/i915/intel_memory_region.c
index 813ff83c132b..e4ac5e4d4857 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -53,6 +53,8 @@ i915_memory_region_put_pages_buddy(struct drm_i915_gem_object 
*obj,
mutex_unlock(&obj->memory_region->mm_lock);
 
obj->mm.dirty = false;
+   mem_cgroup_uncharge_direct(obj->base.memcg,
+  obj->base.size >> PAGE_SHIFT);
 }
 
 int
@@ -65,19 +67,29 @@ i915_memory_region_get_pages_buddy(struct 
drm_i915_gem_object *obj)
struct scatterlist *sg;
unsigned int sg_page_sizes;
unsigned long n_pages;
+   int err;
 
GEM_BUG_ON(!IS_ALIGNED(size, mem->mm.min_size));
GEM_BUG_ON(!list_empty(&obj->blocks));
 
+   err = mem_cgroup_try_charge_direct(obj->base.memcg, size >> PAGE_SHIFT);
+   if (err) {
+   DRM_DEBUG("MEMCG: try_charge failed for %lld\n", size);
+   return err;
+   }
+
st = kmalloc(sizeof(*st), GFP_KERNEL);
-   if (!st)
-   return -ENOMEM;
+   if (!st) {
+   err = -ENOMEM;
+   goto err_uncharge;
+   }
 
n_pages = div64_u64(size, mem->mm.min_size);
 
if (sg_alloc_table(st, n_pages, GFP_KERNEL)) {
kfree(st);
-   return -ENOMEM;
+   err = -ENOMEM;
+   goto err_uncharge;
}
 
sg = st->sgl;
@@ -161,7 +173,11 @@ i915_memory_region_get_pages_buddy(struct 
drm_i915_gem_object *obj)
 err_free_blocks:
memory_region_free_pages(obj, st);
mutex_unlock(&mem->mm_lock);
-   return -ENXIO;
+   err = -ENXIO;
+err_uncharge:
+   mem_cgroup_uncharge_direct(obj->base.memcg,
+  obj->base.size >> PAGE_SHIFT);
+   return err;
 }
 
 int i915_memory_region_init_buddy(struct intel_memory_region *mem)
-- 
2.21.0


[Intel-gfx] [RFC PATCH 2/5] cgroup: Change kernfs_node for directories to store cgroup_subsys_state

2019-05-01 Thread Brian Welty
Change kernfs_node.priv to store the cgroup_subsys_state (CSS) pointer
for directories, instead of storing the cgroup pointer.  This is done in order
to support files within the cgroup associated with devices.  We require
of_css() to return the device-specific CSS pointer for these files.

Cc: cgro...@vger.kernel.org
Signed-off-by: Brian Welty 
---
 kernel/cgroup/cgroup-v1.c | 10 
 kernel/cgroup/cgroup.c| 48 +--
 2 files changed, 27 insertions(+), 31 deletions(-)

diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index c126b34fd4ff..4fa56cc2b99c 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -723,6 +723,7 @@ int proc_cgroupstats_show(struct seq_file *m, void *v)
 int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 {
struct kernfs_node *kn = kernfs_node_from_dentry(dentry);
+   struct cgroup_subsys_state *css;
struct cgroup *cgrp;
struct css_task_iter it;
struct task_struct *tsk;
@@ -740,12 +741,13 @@ int cgroupstats_build(struct cgroupstats *stats, struct 
dentry *dentry)
 * @kn->priv is RCU safe.  Let's do the RCU dancing.
 */
rcu_read_lock();
-   cgrp = rcu_dereference(*(void __rcu __force **)&kn->priv);
-   if (!cgrp || cgroup_is_dead(cgrp)) {
+   css = rcu_dereference(*(void __rcu __force **)&kn->priv);
+   if (!css || cgroup_is_dead(css->cgroup)) {
rcu_read_unlock();
mutex_unlock(&cgroup_mutex);
return -ENOENT;
}
+   cgrp = css->cgroup;
rcu_read_unlock();
 
css_task_iter_start(&cgrp->self, 0, &it);
@@ -851,7 +853,7 @@ void cgroup1_release_agent(struct work_struct *work)
 static int cgroup1_rename(struct kernfs_node *kn, struct kernfs_node 
*new_parent,
  const char *new_name_str)
 {
-   struct cgroup *cgrp = kn->priv;
+   struct cgroup_subsys_state *css = kn->priv;
int ret;
 
if (kernfs_type(kn) != KERNFS_DIR)
@@ -871,7 +873,7 @@ static int cgroup1_rename(struct kernfs_node *kn, struct 
kernfs_node *new_parent
 
ret = kernfs_rename(kn, new_parent, new_name_str);
if (!ret)
-   TRACE_CGROUP_PATH(rename, cgrp);
+   TRACE_CGROUP_PATH(rename, css->cgroup);
 
mutex_unlock(&cgroup_mutex);
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 9b035e728941..1fe4fee502ea 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -595,12 +595,13 @@ static void cgroup_get_live(struct cgroup *cgrp)
 
 struct cgroup_subsys_state *of_css(struct kernfs_open_file *of)
 {
-   struct cgroup *cgrp = of->kn->parent->priv;
+   struct cgroup_subsys_state *css = of->kn->parent->priv;
struct cftype *cft = of_cft(of);
 
-   /* FIXME this needs updating to lookup device-specific CSS */
-
/*
+* If the cft specifies a subsys and this is not a device file,
+* then lookup the css, otherwise it is already correct.
+*
 * This is open and unprotected implementation of cgroup_css().
 * seq_css() is only called from a kernfs file operation which has
 * an active reference on the file.  Because all the subsystem
@@ -608,10 +609,9 @@ struct cgroup_subsys_state *of_css(struct kernfs_open_file 
*of)
 * the matching css from the cgroup's subsys table is guaranteed to
 * be and stay valid until the enclosing operation is complete.
 */
-   if (cft->ss)
-   return rcu_dereference_raw(cgrp->subsys[cft->ss->id]);
-   else
-   return &cgrp->self;
+   if (cft->ss && !css->device)
+   css = rcu_dereference_raw(css->cgroup->subsys[cft->ss->id]);
+   return css;
 }
 EXPORT_SYMBOL_GPL(of_css);
 
@@ -1524,12 +1524,14 @@ static u16 cgroup_calc_subtree_ss_mask(u16 
subtree_control, u16 this_ss_mask)
  */
 void cgroup_kn_unlock(struct kernfs_node *kn)
 {
+   struct cgroup_subsys_state *css;
struct cgroup *cgrp;
 
if (kernfs_type(kn) == KERNFS_DIR)
-   cgrp = kn->priv;
+   css = kn->priv;
else
-   cgrp = kn->parent->priv;
+   css = kn->parent->priv;
+   cgrp = css->cgroup;
 
mutex_unlock(&cgroup_mutex);
 
@@ -1556,12 +1558,14 @@ void cgroup_kn_unlock(struct kernfs_node *kn)
  */
 struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn, bool drain_offline)
 {
+   struct cgroup_subsys_state *css;
struct cgroup *cgrp;
 
if (kernfs_type(kn) == KERNFS_DIR)
-   cgrp = kn->priv;
+   css = kn->priv;
else
-   cgrp = kn->parent->priv;
+   css = kn->parent->priv;
+   cgrp = css->cgroup;
 
/*
 * We're gonna grab cgroup_mutex which nests outside kernfs
@@ -1652,7 

[Intel-gfx] [RFC PATCH 4/5] drm: Add memory cgroup registration and DRIVER_CGROUPS feature bit

2019-05-01 Thread Brian Welty
With the new cgroups per-device framework, registration with the memory
cgroup subsystem allows us to enforce limits on allocation of device memory
against process cgroups.

This patch adds new driver feature bit, DRIVER_CGROUPS, such that DRM
will register the device with cgroups. Doing so allows device drivers to
charge memory allocations to device-specific state within the cgroup.

Note, this is only for GEM objects allocated from device memory.
Memory charging for GEM objects using system memory is already handled
by the mm subsystem charging the normal (non-device) memory cgroup.

To charge device memory allocations, we need to (1) identify appropriate
cgroup to charge (currently decided at object creation time), and (2)
make the charging call at the time that memory pages are being allocated.
The above is one possible policy, and it is open for debate whether this is
the right choice.

For (1), we associate the current task's cgroup with GEM objects as they
are created.  That cgroup will be charged/uncharged for all paging
activity against the GEM object.  Note, if the process is not part of a
memory cgroup, then this returns NULL and no charging will occur.
For shared objects, this may make the charge against a cgroup that is
potentially not the same cgroup as the process using the memory.  Based
on the memory cgroup's discussion of "memory ownership", this seems
acceptable [1].  For (2), this is for device drivers to implement within
appropriate page allocation logic.

[1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt, "Memory Ownership"

Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: dri-de...@lists.freedesktop.org
Cc: Matt Roper 
Signed-off-by: Brian Welty 
---
 drivers/gpu/drm/drm_drv.c | 12 
 drivers/gpu/drm/drm_gem.c |  7 +++
 include/drm/drm_device.h  |  3 +++
 include/drm/drm_drv.h |  8 
 include/drm/drm_gem.h | 11 +++
 5 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 862621494a93..890bd3c0e63e 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -987,6 +988,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long 
flags)
if (ret)
goto err_minors;
 
+   if (dev->dev && drm_core_check_feature(dev, DRIVER_CGROUPS)) {
+   ret = mem_cgroup_device_register(dev->dev, &dev->memcg_id);
+   if (ret)
+   goto err_minors;
+   }
+
dev->registered = true;
 
if (dev->driver->load) {
@@ -1009,6 +1016,8 @@ int drm_dev_register(struct drm_device *dev, unsigned 
long flags)
goto out_unlock;
 
 err_minors:
+   if (dev->memcg_id)
+   mem_cgroup_device_unregister(dev->memcg_id);
remove_compat_control_link(dev);
drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
drm_minor_unregister(dev, DRM_MINOR_RENDER);
@@ -1052,6 +1061,9 @@ void drm_dev_unregister(struct drm_device *dev)
 
drm_legacy_rmmaps(dev);
 
+   if (dev->memcg_id)
+   mem_cgroup_device_unregister(dev->memcg_id);
+
remove_compat_control_link(dev);
drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
drm_minor_unregister(dev, DRM_MINOR_RENDER);
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 50de138c89e0..966fbd701deb 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -281,6 +282,9 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
if (IS_ERR_OR_NULL(obj))
return -EINVAL;
 
+   /* Release reference on cgroup used with GEM object charging */
+   mem_cgroup_put(obj->memcg);
+
/* Release driver's reference and decrement refcount. */
drm_gem_object_release_handle(handle, obj, filp);
 
@@ -410,6 +414,9 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
goto err_revoke;
}
 
+   /* Acquire reference on cgroup for charging GEM memory allocations */
+   obj->memcg = mem_cgroup_device_from_task(dev->memcg_id, current);
+
*handlep = handle;
return 0;
 
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 7f9ef709b2b6..9859f2289066 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -190,6 +190,9 @@ struct drm_device {
 */
int irq;
 
+   /* @memcg_id: cgroup subsys (memcg) index for our device state */
+   unsigned long memcg_id;
+
/**
 * @vblank_disable_immediate:
 *
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 5cc7f728ec73..13b0e0b9527f 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -92,6 +92,14 @@ enum drm_driver_fe

[Intel-gfx] [RFC PATCH 3/5] memcg: Add per-device support to memory cgroup subsystem

2019-05-01 Thread Brian Welty
Here we update the memory cgroup subsystem to enable the newly introduced
per-device framework.  As mentioned in the prior patch, the intent here is to allow
drivers to have their own private cgroup controls (such as memory limit)
to be applied to device resources instead of host system resources.

In summary, to enable device registration for the memory cgroup subsystem:
  *  set .allow_devices to true
  *  add new exported device register and device unregister functions
 to register a device with the cgroup subsystem
  *  implement the .device_css_alloc callback to create device-specific
 cgroup_subsys_state within a cgroup

As cgroups are created, and for currently registered devices, one will see
these additional files in the cgroup filesystem:
  mount/<cgroup>/memory.devices/<device>/

Registration of a new device is performed in device drivers using the new
mem_cgroup_device_register().  This will create the above files in existing
cgroups.

And for runtime charging to the cgroup, we add the following:
  *  add new routine to lookup the device-specific cgroup_subsys_state
 which is within the task's cgroup (mem_cgroup_device_from_task)
  *  add new functions for device specific 'direct' charging

The last point above involves adding new mem_cgroup_try_charge_direct
and mem_cgroup_uncharge_direct functions.  The 'direct' name is to say
that we are charging the specified cgroup state directly and not using
any associated page or mm_struct.  These functions are called within
device-specific memory management routines, where the device driver tracks
which cgroup to charge within its own private data structures.

With this initial submission, support for memory accounting and charging
is functional.  Nested cgroups will correctly maintain the parent for
device-specific state as well, such that hierarchical charging to device
files is supported.
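
To make the intended calling convention concrete, here is a sketch of the
driver-side flow using only the functions declared below; the my_device and
my_buffer structures are hypothetical stand-ins for driver state:

#include <linux/device.h>
#include <linux/memcontrol.h>
#include <linux/sched.h>

struct my_device {
	struct device *dev;
	unsigned long memcg_dev_id;    /* from mem_cgroup_device_register() */
};

struct my_buffer {
	struct mem_cgroup *memcg;      /* cgroup charged for this buffer */
	unsigned long nr_pages;
};

static int my_device_cgroup_init(struct my_device *mydev)
{
	/* Register so each memory cgroup gains device-specific state. */
	return mem_cgroup_device_register(mydev->dev, &mydev->memcg_dev_id);
}

static int my_buffer_alloc_backing(struct my_device *mydev,
				   struct my_buffer *buf)
{
	int err;

	/*
	 * Ownership is decided at allocation time: the calling task's cgroup.
	 * A NULL result means the task is in no memory cgroup and no charge
	 * is made.
	 */
	buf->memcg = mem_cgroup_device_from_task(mydev->memcg_dev_id, current);

	/* Charge directly against the device-specific memcg state. */
	err = mem_cgroup_try_charge_direct(buf->memcg, buf->nr_pages);
	if (err)
		return err;    /* over the limit: driver may reclaim and retry */

	/* ... allocate device pages here ... */
	return 0;
}

static void my_buffer_free_backing(struct my_buffer *buf)
{
	/* ... release device pages here ... */
	mem_cgroup_uncharge_direct(buf->memcg, buf->nr_pages);
}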

Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: dri-de...@lists.freedesktop.org
Cc: Matt Roper 
Signed-off-by: Brian Welty 
---
 include/linux/memcontrol.h |  10 ++
 mm/memcontrol.c| 183 ++---
 2 files changed, 178 insertions(+), 15 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index dbb6118370c1..711669b613dc 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -348,6 +348,11 @@ void mem_cgroup_cancel_charge(struct page *page, struct 
mem_cgroup *memcg,
bool compound);
 void mem_cgroup_uncharge(struct page *page);
 void mem_cgroup_uncharge_list(struct list_head *page_list);
+/* direct charging to mem_cgroup is primarily for device driver usage */
+int mem_cgroup_try_charge_direct(struct mem_cgroup *memcg,
+unsigned long nr_pages);
+void mem_cgroup_uncharge_direct(struct mem_cgroup *memcg,
+   unsigned long nr_pages);
 
 void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
 
@@ -395,6 +400,11 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *, 
struct pglist_data *);
 bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg);
 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
 
+struct mem_cgroup *mem_cgroup_device_from_task(unsigned long id,
+  struct task_struct *p);
+int mem_cgroup_device_register(struct device *dev, unsigned long *dev_id);
+void mem_cgroup_device_unregister(unsigned long dev_id);
+
 struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm);
 
 struct mem_cgroup *get_mem_cgroup_from_page(struct page *page);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 81a0d3914ec9..2c8407aed0f5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -823,6 +823,47 @@ struct mem_cgroup *mem_cgroup_from_task(struct task_struct 
*p)
 }
 EXPORT_SYMBOL(mem_cgroup_from_task);
 
+int mem_cgroup_device_register(struct device *dev, unsigned long *dev_id)
+{
+   return cgroup_device_register(&memory_cgrp_subsys, dev, dev_id);
+}
+EXPORT_SYMBOL(mem_cgroup_device_register);
+
+void mem_cgroup_device_unregister(unsigned long dev_id)
+{
+   cgroup_device_unregister(&memory_cgrp_subsys, dev_id);
+}
+EXPORT_SYMBOL(mem_cgroup_device_unregister);
+
+/**
+ * mem_cgroup_device_from_task: Lookup device-specific memcg
+ * @id: device-specific id returned from mem_cgroup_device_register
+ * @p: task to lookup the memcg
+ *
+ * First use mem_cgroup_from_task to lookup and obtain a reference on
+ * the memcg associated with this task @p.  Within this memcg, find the
+ * device-specific one associated with @id.
+ * However if mem_cgroup is disabled, NULL is returned.
+ */
+struct mem_cgroup *mem_cgroup_device_from_task(unsigned long id,
+  struct task_struct *p)
+{
+   struct mem_cgroup *memcg;
+   struct mem_cgroup *dev_memcg = NULL;
+
+   if (mem_cgroup_disabled())
+   return NULL;
+
+   rcu_read_lock();
+   memcg  = mem_cgroup_from_task(p);
+   if (memcg

[Intel-gfx] [RFC PATCH 0/5] cgroup support for GPU devices

2019-05-01 Thread Brian Welty
In containerized or virtualized environments, there is a desire to have
controls in place for resources that can be consumed by users of a GPU
device.  This RFC patch series proposes a framework for integrating 
use of existing cgroup controllers into device drivers.
The i915 driver is updated in this series as our primary use case to
leverage this framework and to serve as an example for discussion.

The patch series enables device drivers to use cgroups to control the
following resources within a GPU (or other accelerator device):
*  control allocation of device memory (reuse of memcg)
and with future work, we could extend to:
*  track and control share of GPU time (reuse of cpu/cpuacct)
*  apply mask of allowed execution engines (reuse of cpusets)

Instead of introducing a new cgroup subsystem for GPU devices, a new
framework is proposed to allow devices to register with existing cgroup
controllers, which creates per-device cgroup_subsys_state within the
cgroup.  This gives device drivers their own private cgroup controls
(such as memory limits or other parameters) to be applied to device
resources instead of host system resources.
Device drivers (GPU or other) are then able to reuse the existing cgroup
controls, instead of inventing similar ones.

Per-device controls would be exposed in the cgroup filesystem as:
mount/<cgroup>/<subsys>.devices/<device>/
such as (for example):
mount/<cgroup>/memory.devices/<device>/memory.max
mount/<cgroup>/memory.devices/<device>/memory.current
mount/<cgroup>/cpu.devices/<device>/cpu.stat
mount/<cgroup>/cpu.devices/<device>/cpu.weight

The drm/i915 patch in this series is based on top of other RFC work [1]
for i915 device memory support.

AMD [2] and Intel [3] have proposed related work in this area within the
last few years, listed below as reference.  This new RFC reuses existing
cgroup controllers and takes a different approach than prior work.

Finally, some potential discussion points for this series:
* merge the proposed <subsys>.devices directories into a single devices directory?
* allow devices to have multiple registrations for subsets of resources?
* document a 'common charging policy' for device drivers to follow?

[1] https://patchwork.freedesktop.org/series/56683/
[2] https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
[3] https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html


Brian Welty (5):
  cgroup: Add cgroup_subsys per-device registration framework
  cgroup: Change kernfs_node for directories to store
cgroup_subsys_state
  memcg: Add per-device support to memory cgroup subsystem
  drm: Add memory cgroup registration and DRIVER_CGROUPS feature bit
  drm/i915: Use memory cgroup for enforcing device memory limit

 drivers/gpu/drm/drm_drv.c  |  12 +
 drivers/gpu/drm/drm_gem.c  |   7 +
 drivers/gpu/drm/i915/i915_drv.c|   2 +-
 drivers/gpu/drm/i915/intel_memory_region.c |  24 +-
 include/drm/drm_device.h   |   3 +
 include/drm/drm_drv.h  |   8 +
 include/drm/drm_gem.h  |  11 +
 include/linux/cgroup-defs.h|  28 ++
 include/linux/cgroup.h |   3 +
 include/linux/memcontrol.h |  10 +
 kernel/cgroup/cgroup-v1.c  |  10 +-
 kernel/cgroup/cgroup.c | 310 ++---
 mm/memcontrol.c| 183 +++-
 13 files changed, 552 insertions(+), 59 deletions(-)

-- 
2.21.0


[Intel-gfx] [RFC PATCH 1/5] cgroup: Add cgroup_subsys per-device registration framework

2019-05-01 Thread Brian Welty
In containerized or virtualized environments, there is a desire to have
controls in place for resources that can be consumed by users of a GPU
device.  For this purpose, we extend control groups with a mechanism
for device drivers to register with cgroup subsystems.
Device drivers (GPU or other) are then able to reuse the existing cgroup
controls, instead of inventing similar ones.

A new framework is proposed to allow devices to register with existing
cgroup controllers, which creates per-device cgroup_subsys_state within
the cgroup.  This gives device drivers their own private cgroup controls
(such as memory limits or other parameters) to be applied to device
resources instead of host system resources.

It is exposed in the cgroup filesystem as:
  mount/<cgroup>/<subsys>.devices/<device>/
such as (for example):
  mount/<cgroup>/memory.devices/<device>/memory.max
  mount/<cgroup>/memory.devices/<device>/memory.current
  mount/<cgroup>/cpu.devices/<device>/cpu.stat

The creation of the above files is implemented in css_populate_dir() for
cgroup subsystems that have enabled per-device support.
The files are created either at the time of cgroup creation (for already
registered devices) or at the time a device driver registers its device,
during cgroup_device_register().  cgroup_device_unregister() will remove
the files from all current cgroups.
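
For illustration, this is roughly what a controller and a driver would do to
use the framework.  "foo" is a hypothetical subsystem; only the fields and
functions added by this patch are used, and error handling is kept minimal:

#include <linux/cgroup.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/slab.h>

struct foo_device_css {
	struct cgroup_subsys_state css;
	/* per-device, per-cgroup controls would live here */
};

static struct cgroup_subsys_state *
foo_device_css_alloc(struct device *device,
		     struct cgroup_subsys_state *cgroup_css,
		     struct cgroup_subsys_state *parent_device_css)
{
	struct foo_device_css *dcss = kzalloc(sizeof(*dcss), GFP_KERNEL);

	if (!dcss)
		return ERR_PTR(-ENOMEM);
	/* @parent_device_css is this device's state in the parent cgroup,
	 * which allows hierarchical accounting and limits to be maintained. */
	return &dcss->css;
}

struct cgroup_subsys foo_cgrp_subsys = {
	/* .css_alloc and the other usual callbacks as for any controller */
	.device_css_alloc = foo_device_css_alloc,
	.allow_devices    = true,
};

/* A device driver then creates foo.devices/<dev>/ entries in all cgroups: */
static unsigned long foo_dev_id;

static int foo_bind_device(struct device *dev)
{
	return cgroup_device_register(&foo_cgrp_subsys, dev, &foo_dev_id);
}

static void foo_unbind_device(void)
{
	cgroup_device_unregister(&foo_cgrp_subsys, foo_dev_id);
}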

Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: dri-de...@lists.freedesktop.org
Cc: Matt Roper 
Signed-off-by: Brian Welty 
---
 include/linux/cgroup-defs.h |  28 
 include/linux/cgroup.h  |   3 +
 kernel/cgroup/cgroup.c  | 270 ++--
 3 files changed, 289 insertions(+), 12 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 1c70803e9f77..aeaab420e349 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -162,6 +162,17 @@ struct cgroup_subsys_state {
struct work_struct destroy_work;
struct rcu_work destroy_rwork;
 
+   /*
+* Per-device state for devices registered with our subsys.
+* @device_css_idr stores pointer to per-device cgroup_subsys_state,
+* created when devices are associated with this css.
+* @device_kn is for creating .devices sub-directory within this cgroup
+* or for the per-device sub-directory (subsys.devices/).
+*/
+   struct device *device;
+   struct idr device_css_idr;
+   struct kernfs_node *device_kn;
+
/*
 * PI: the parent css.  Placed here for cache proximity to following
 * fields of the containing structure.
@@ -589,6 +600,9 @@ struct cftype {
  */
 struct cgroup_subsys {
struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state 
*parent_css);
+   struct cgroup_subsys_state *(*device_css_alloc)(struct device *device,
+   struct cgroup_subsys_state *cgroup_css,
+   struct cgroup_subsys_state *parent_device_css);
int (*css_online)(struct cgroup_subsys_state *css);
void (*css_offline)(struct cgroup_subsys_state *css);
void (*css_released)(struct cgroup_subsys_state *css);
@@ -636,6 +650,13 @@ struct cgroup_subsys {
 */
bool threaded:1;
 
+   /*
+* If %true, the controller supports device drivers to register
+* with this controller for cloning the cgroup functionality
+* into per-device cgroup state under .dev//.
+*/
+   bool allow_devices:1;
+
/*
 * If %false, this subsystem is properly hierarchical -
 * configuration, resource accounting and restriction on a parent
@@ -664,6 +685,13 @@ struct cgroup_subsys {
/* idr for css->id */
struct idr css_idr;
 
+   /*
+* IDR of registered devices, allows subsys_state to have state
+* for each device. Exposed as per-device entries in filesystem,
+* under .device//.
+*/
+   struct idr device_idr;
+
/*
 * List of cftypes.  Each entry is the first entry of an array
 * terminated by zero length name.
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 81f58b4a5418..3531bf948703 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -116,6 +116,9 @@ int cgroupstats_build(struct cgroupstats *stats, struct 
dentry *dentry);
 int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 struct pid *pid, struct task_struct *tsk);
 
+int cgroup_device_register(struct cgroup_subsys *ss, struct device *dev,
+  unsigned long *dev_id);
+void cgroup_device_unregister(struct cgroup_subsys *ss, unsigned long dev_id);
 void cgroup_fork(struct task_struct *p);
 extern int cgroup_can_fork(struct task_struct *p);
 extern void cgroup_cancel_fork(struct task_struct *p);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 3f2b4bde0f9c..9b035e728941 100644
--- a/kernel/cgroup/cgroup.c
+++ b/ker