Re: [PATCH] drm/ttm: Implement strict NUMA pool allocations

2024-03-22 Thread Bhardwaj, Rajneesh



On 3/22/2024 11:29 AM, Ruhl, Michael J wrote:

-Original Message-
From: dri-devel  On Behalf Of
Rajneesh Bhardwaj
Sent: Friday, March 22, 2024 3:08 AM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: felix.kuehl...@amd.com; alexander.deuc...@amd.com;
christian.koe...@amd.com; Rajneesh Bhardwaj
; Joe Greathouse

Subject: [PATCH] drm/ttm: Implement strict NUMA pool allocations

This change allows TTM the flexibility to honor NUMA-localized
allocations, which can result in significant performance improvement on a
multi-socket NUMA system. On GFXIP 9.4.3 based AMD APUs, we see
manifold benefits of this change, resulting not only in ~10% performance
improvement in certain benchmarks but also in more consistent
and less sporadic results, especially when NUMA balancing is not
explicitly disabled. In certain scenarios, workloads show run-to-run
variability, e.g. HPL would show a ~10x performance drop after running
back to back 4-5 times and would recover later on a subsequent run. This
is seen with other memory-intensive workloads too. It was seen that when
CPU caches were dropped, e.g. sudo sysctl -w vm.drop_caches=1, the
variability reduced but the performance was still well below a good run.

Use of the __GFP_THISNODE flag ensures that during memory allocation, the
kernel prioritizes allocations from the local or closest NUMA node,
thereby reducing memory access latency. When memory is allocated using the
__GFP_THISNODE flag, allocations will predominantly be done on
the local node; consequently, the shrinkers may prioritize reclaiming
memory from caches associated with the local node to maintain memory
locality and minimize latency, thereby providing better shrinker targeting.

Reduced memory pressure on remote nodes can also indirectly influence
shrinker behavior by potentially reducing the frequency and intensity of
memory reclamation operations on remote nodes, and could provide improved
overall system performance.
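
As a rough illustration of the allocation behavior described above (not the
actual TTM pool code; the helper name and the order value are made up for
this example), a node-strict request boils down to:

#include <linux/gfp.h>
#include <linux/mm.h>

/* Illustrative only: request an order-0 page strictly from 'nid'.
 * With __GFP_THISNODE there is no fallback to other nodes' zonelists,
 * so this returns NULL instead of spilling over to a remote node.
 */
static struct page *alloc_local_page(int nid)
{
	return alloc_pages_node(nid, GFP_USER | __GFP_THISNODE, 0);
}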

While this change could be beneficial in general, i.e., without the
use of a module parameter, in the absence of widespread testing limit
it to the AMD GFXIP 9.4.3 based TTM pool initializations only.
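
Since the parameter is registered read-only (0444), the strict policy would
have to be selected at module load time; an illustrative invocation,
assuming the parameter lands as posted, is:

modprobe amdgpu strict_numa_alloc=1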


Cc: Joe Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  8 
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  7 ++-
drivers/gpu/drm/ttm/tests/ttm_pool_test.c | 10 +-
drivers/gpu/drm/ttm/ttm_device.c  |  2 +-
drivers/gpu/drm/ttm/ttm_pool.c|  7 ++-
include/drm/ttm/ttm_pool.h|  4 +++-
7 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 9c62552bec34..96532cfc6230 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -253,6 +253,7 @@ extern int amdgpu_user_partt_mode;
extern int amdgpu_agp;

extern int amdgpu_wbrf;
+extern bool strict_numa_alloc;

#define AMDGPU_VM_MAX_NUM_CTX   4096
#define AMDGPU_SG_THRESHOLD (256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 80b9642f2bc4..a183a6b4493d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -781,6 +781,14 @@ int queue_preemption_timeout_ms = 9000;
module_param(queue_preemption_timeout_ms, int, 0644);
MODULE_PARM_DESC(queue_preemption_timeout_ms, "queue preemption
timeout in ms (1 = Minimum, 9000 = default)");

+/**
+ * DOC: strict_numa_alloc(bool)
+ * Policy to force NUMA allocation requests from the proximity NUMA domain
only.
+ */
+bool strict_numa_alloc;
+module_param(strict_numa_alloc, bool, 0444);
+MODULE_PARM_DESC(strict_numa_alloc, "Force NUMA allocation requests
to be satisfied from the closest node only (false = default)");
+
/**
  * DOC: debug_evictions(bool)
  * Enable extra debug messages to help determine the cause of evictions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index b0ed10f4de60..a9f78f85e28c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1768,6 +1768,7 @@ static int amdgpu_ttm_reserve_tmr(struct
amdgpu_device *adev)

static int amdgpu_ttm_pools_init(struct amdgpu_device *adev)
{
+   bool policy = true;
int i;

if (!adev->gmc.is_app_apu || !adev->gmc.num_mem_partitions)
@@ -1779,11 +1780,15 @@ static int amdgpu_ttm_pools_init(struct
amdgpu_device *adev)
if (!adev->mman.ttm_pools)
return -ENOMEM;

+   /* Policy not only depends on the module param but also on the ASIC
+* setting use_strict_numa_alloc.
+*/
for (i = 0; i < adev->gmc.num_mem_partitions; i++) {
ttm_pool_init(&adev->mman.ttm_pools[i], adev->dev,
  

Re: [PATCH] drm/ttm: Implement strict NUMA pool allocations

2024-03-22 Thread Bhardwaj, Rajneesh





On 3/22/2024 9:15 AM, Christian König wrote:

Am 22.03.24 um 08:07 schrieb Rajneesh Bhardwaj:

This change allows TTM the flexibility to honor NUMA-localized
allocations, which can result in significant performance improvement on a
multi-socket NUMA system. On GFXIP 9.4.3 based AMD APUs, we see
manifold benefits of this change, resulting not only in ~10% performance
improvement in certain benchmarks but also in more consistent
and less sporadic results, especially when NUMA balancing is not
explicitly disabled. In certain scenarios, workloads show run-to-run
variability, e.g. HPL would show a ~10x performance drop after running
back to back 4-5 times and would recover later on a subsequent run. This
is seen with other memory-intensive workloads too. It was seen that when
CPU caches were dropped, e.g. sudo sysctl -w vm.drop_caches=1, the
variability reduced but the performance was still well below a good run.

Use of the __GFP_THISNODE flag ensures that during memory allocation, the
kernel prioritizes allocations from the local or closest NUMA node,
thereby reducing memory access latency.


Exactly that's what it doesn't do.

__GFP_THISNODE just means it enforces allocation from the specified node.


Thanks for the feedback, Christian.

Sure, maybe I should have made it clear in the commit message that there 
is no fallback to the zonelist while allocating the slab cache. And this is 
exactly what we want; we don't want the pages to land on remote nodes.
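
As a quick illustration of that point (a debug-style sketch only, not part
of the patch; the helper name is made up), one can verify which node a page
actually landed on:

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/printk.h>

/* Illustrative only: with __GFP_THISNODE the page is either on 'nid' or
 * the allocation fails; without the flag, page_to_nid() may report a
 * remote node once the local node is under memory pressure.
 */
static void check_locality(int nid)
{
	struct page *page = alloc_pages_node(nid, GFP_USER | __GFP_THISNODE, 0);

	if (page) {
		pr_info("page landed on node %d (requested %d)\n",
			page_to_nid(page), nid);
		__free_pages(page, 0);
	}
}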




In addition to that, there is a mandatory requirement that this flag is 
only used if it has to be used for correctness. And that is simply not 
the case here.



Sorry, I didn't quite understand this. What is the mandatory 
requirement? Could you please clarify? Based on the documentation 
I read and after reading the kernel source code, I didn't 
find any hard requirement discouraging its use.





So as long as nobody can explain why that should help this is an 
absolutely no-go.



Could you please clarify the controversial part here? We have strong 
backing data that proves the usefulness, and besides, this change 
restricts us to a certain HW IP. Also, the possibility of triggering the 
OOM killer actually seems to be lower when we use __GFP_THISNODE.


https://elixir.bootlin.com/linux/latest/source/mm/page_alloc.c#L3439


Another important thing I want to highlight here is that for the AMD 
GFXIP 9.4.3 APU, which has a unified HBM stack (no RAM/VRAM distinction), 
we already get the backing VRAM pages using ttm_pool_alloc_page(nid).






Regards,
Christian.


  When memory is allocated using the
__GFP_THISNODE flag, allocations will predominantly be done on
the local node; consequently, the shrinkers may prioritize reclaiming
memory from caches associated with the local node to maintain memory
locality and minimize latency, thereby providing better shrinker targeting.

Reduced memory pressure on remote nodes can also indirectly influence
shrinker behavior by potentially reducing the frequency and intensity of
memory reclamation operations on remote nodes, and could provide improved
overall system performance.

While this change could be beneficial in general, i.e., without the
use of a module parameter, in the absence of widespread testing limit
it to the AMD GFXIP 9.4.3 based TTM pool initializations only.


Cc: Joe Greathouse 
Signed-off-by: Rajneesh Bhardwaj 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  8 
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  7 ++-
  drivers/gpu/drm/ttm/tests/ttm_pool_test.c | 10 +-
  drivers/gpu/drm/ttm/ttm_device.c  |  2 +-
  drivers/gpu/drm/ttm/ttm_pool.c    |  7 ++-
  include/drm/ttm/ttm_pool.h    |  4 +++-
  7 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 9c62552bec34..96532cfc6230 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -253,6 +253,7 @@ extern int amdgpu_user_partt_mode;
  extern int amdgpu_agp;
    extern int amdgpu_wbrf;
+extern bool strict_numa_alloc;
    #define AMDGPU_VM_MAX_NUM_CTX    4096
  #define AMDGPU_SG_THRESHOLD    (256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 80b9642f2bc4..a183a6b4493d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -781,6 +781,14 @@ int queue_preemption_timeout_ms = 9000;
  module_param(queue_preemption_timeout_ms, int, 0644);
  MODULE_PARM_DESC(queue_preemption_timeout_ms, "queue preemption 
timeout in ms (1 = Minimum, 9000 = default)");

  +/**
+ * DOC: strict_numa_alloc(bool)
+ * Policy to force NUMA allocation requests from the proximity NUMA 
domain only.

+ */
+bool strict_numa_alloc;
+module_param(strict_numa_alloc, bool, 0444);
+MODULE_PARM_DESC(strict_numa_alloc, "Force 

RE: [BUG] BUG: kernel NULL pointer dereference at ttm_device_init+0xb4

2024-01-22 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - General]

-Original Message-
From: Steven Rostedt 
Sent: Monday, January 22, 2024 6:19 PM
To: LKML 
Cc: Linus Torvalds ; Bhardwaj, Rajneesh 
; Kuehling, Felix ; Koenig, 
Christian ; dri-devel@lists.freedesktop.org
Subject: Re: [BUG] BUG: kernel NULL pointer dereference at ttm_device_init+0xb4

On Mon, 22 Jan 2024 18:15:47 -0500
Steven Rostedt  wrote:

> > ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, 
> > use_dma32); <<<--- BUG!
> >
> > Specifically, it appears that dev is NULL and dev_to_node() doesn't
> > like having a NULL pointer passed to it.
> >
>
> Yeah, that qxl_ttm_init() has:
>
>   /* No others user of address space so set it to 0 */
>   r = ttm_device_init(&qdev->mman.bdev, &qxl_bo_driver, NULL,
>   qdev->ddev.anon_inode->i_mapping,
>   qdev->ddev.vma_offset_manager,
>   false, false);
>
> Where that NULL is "dev"!
>
> Thus that will never work here.

Perhaps this is the real fix?

I think the fix might already be applied to drm-misc. Please see 
https://lkml.iu.edu/hypermail/linux/kernel/2401.1/06778.html


-- Steve

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index f5187b384ae9..bc217b4d6b04 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -215,7 +215,8 @@ int ttm_device_init(struct ttm_device *bdev, const struct 
ttm_device_funcs *func

ttm_sys_man_init(bdev);

-   ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, 
use_dma32);
+   ttm_pool_init(&bdev->pool, dev, dev ? dev_to_node(dev) : NUMA_NO_NODE,
+ use_dma_alloc, use_dma32);

bdev->vma_manager = vma_manager;
spin_lock_init(&bdev->lru_lock);


Re: [BUG] BUG: kernel NULL pointer dereference at ttm_device_init+0xb4

2024-01-22 Thread Bhardwaj, Rajneesh


On 1/22/2024 7:43 PM, Linus Torvalds wrote:

On Mon, 22 Jan 2024 at 15:17, Steven Rostedt  wrote:

Perhaps this is the real fix?

If you send a signed-off version, I'll apply it asap.



I think a fix might already be in flight. Please see Linux-Kernel 
Archive: Re: [PATCH] drm/ttm: fix ttm pool initialization for 
no-dma-device drivers (iu.edu) 






Thanks,
  Linus

Re: [BUG] BUG: kernel NULL pointer dereference at ttm_device_init+0xb4

2024-01-22 Thread Bhardwaj, Rajneesh



On 1/22/2024 7:34 PM, Steven Rostedt wrote:

On Mon, 22 Jan 2024 19:29:41 -0500
"Bhardwaj, Rajneesh"  wrote:


In one of my previous revisions of this patch when I was experimenting,
I used something like below. Wonder if that could work in your case
and/or in general.


diff --git a/drivers/gpu/drm/ttm/ttm_device.c
b/drivers/gpu/drm/ttm/ttm_device.c

index 43e27ab77f95..4c3902b94be4 100644

--- a/drivers/gpu/drm/ttm/ttm_device.c

+++ b/drivers/gpu/drm/ttm/ttm_device.c

@@ -195,6 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, struct
ttm_device_funcs *funcs,

bool use_dma_alloc, bool use_dma32){

struct ttm_global *glob = &ttm_glob;

+bool node_has_cpu = false;

int ret;

if (WARN_ON(vma_manager == NULL))

@@ -213,7 +214,12 @@ int ttm_device_init(struct ttm_device *bdev, struct
ttm_device_funcs *funcs,

bdev->funcs = funcs;

ttm_sys_man_init(bdev);

-ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);

+

+node_has_cpu = node_state(dev->numa_node, N_CPU);

Considering that qxl_ttm_init() passes in dev = NULL, the above would blow
up just the same.



I agree, I think we need something like you suggested i.e.

+   ttm_pool_init(&bdev->pool, dev, dev ? dev_to_node(dev) : NUMA_NO_NODE,
+ use_dma_alloc, use_dma32);


I am not quite sure whether the above node_has_cpu change would be a better 
solution in general, along with the NULL pointer check as you suggested. 
If you prefer that, I can send a fix; otherwise, your fix looks good 
to me.





-- Steve



+if (node_has_cpu)

+ttm_pool_init(&bdev->pool, dev, dev->numa_node, use_dma_alloc, use_dma32);

+else

+ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc,

+use_dma32);

bdev->vma_manager = vma_manager;

spin_lock_init(&bdev->lru_lock);



-- Steve


Re: [BUG] BUG: kernel NULL pointer dereference at ttm_device_init+0xb4

2024-01-22 Thread Bhardwaj, Rajneesh


On 1/22/2024 6:06 PM, Steven Rostedt wrote:

I just kicked off testing some patches on top of 6.8-rc1 and triggered this
immediately:

[ note this happened on both my 32 bit an 64 bit test machines, this is
   just the 32 bit output ]

  BUG: kernel NULL pointer dereference, address: 0238
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x) - not-present page
  *pdpt =  *pde = f000ff53f000ff53
  Oops:  [#1] PREEMPT SMP PTI
  CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 
6.8.0-rc1-test-1-g2b44760609e9-dirty #1056
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.16.3-debian-1.16.3-2 04/01/2014
  Workqueue: events work_for_cpu_fn
  EIP: ttm_device_init+0xb4/0x274
  Code: 86 10 09 00 00 83 c4 0c 85 c0 0f 84 96 01 00 00 8b 45 ac 8d 9e 94 00 00 00 89 
46 08 89 f0 e8 27 05 00 00 8b 55 a8 0f b6 45 98 <8b> 8a 38 02 00 00 50 0f b6 45 
9c 50 89 d8 e8 95 ee ff ff 8b 45 a0
  EAX:  EBX: c135a7e4 ECX: c135a7b0 EDX: 
  ESI: c135a750 EDI: 0007bc1d EBP: c11d7e4c ESP: c11d7de4
  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068 EFLAGS: 00010246
  CR0: 80050033 CR2: 0238 CR3: 145c4000 CR4: 06f0
  Call Trace:
   ? show_regs+0x4f/0x58
   ? __die+0x1d/0x58
   ? page_fault_oops+0x171/0x330
   ? lock_acquire+0xa4/0x280
   ? kernelmode_fixup_or_oops.constprop.0+0x7c/0xcc
   ? __bad_area_nosemaphore.constprop.0+0x124/0x1b4
   ? __mutex_lock+0x17f/0xb00
   ? bad_area_nosemaphore+0xf/0x14
   ? do_user_addr_fault+0x140/0x3e4
   ? exc_page_fault+0x5b/0x1d8
   ? pvclock_clocksource_read_nowd+0x130/0x130
   ? handle_exception+0x133/0x133
   ? pvclock_clocksource_read_nowd+0x130/0x130
   ? ttm_device_init+0xb4/0x274
   ? pvclock_clocksource_read_nowd+0x130/0x130
   ? ttm_device_init+0xb4/0x274
   qxl_ttm_init+0x34/0x130
   qxl_bo_init+0xd/0x10
   qxl_device_init+0x52a/0x92c
   qxl_pci_probe+0x91/0x1ac
   local_pci_probe+0x3d/0x84
   work_for_cpu_fn+0x16/0x20
   process_one_work+0x1bc/0x4a0
   worker_thread+0x310/0x3a8
   kthread+0xea/0x110
   ? rescuer_thread+0x2f0/0x2f0
   ? kthread_complete_and_exit+0x1c/0x1c
   ret_from_fork+0x34/0x4c
   ? kthread_complete_and_exit+0x1c/0x1c
   ret_from_fork_asm+0x12/0x18
   entry_INT80_32+0xf0/0xf0
  Modules linked in:
  CR2: 0238
  ---[ end trace  ]---

The crash happened here:

int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs 
*funcs,
struct device *dev, struct address_space *mapping,
struct drm_vma_offset_manager *vma_manager,
bool use_dma_alloc, bool use_dma32)
{
struct ttm_global *glob = &ttm_glob;
int ret;

if (WARN_ON(vma_manager == NULL))
return -EINVAL;

ret = ttm_global_init();
if (ret)
return ret;

bdev->wq = alloc_workqueue("ttm",
   WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 
16);
if (!bdev->wq) {
ttm_global_release();
return -ENOMEM;
}

bdev->funcs = funcs;

ttm_sys_man_init(bdev);

ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32); 
<<<--- BUG!

Specifically, it appears that dev is NULL and dev_to_node() doesn't like
having a NULL pointer passed to it.
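
For context, with CONFIG_NUMA enabled dev_to_node() is roughly the following
(a simplified sketch; see include/linux/device.h for the real definition),
which is why a NULL dev faults instead of returning NUMA_NO_NODE:

/* Simplified: the helper just dereferences the device. */
static inline int dev_to_node(struct device *dev)
{
	return dev->numa_node;
}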

I currently "fixed" this with a:

if (!dev)
return -EINVAL;

at the start of this function just so that I can continue running my tests,
but that is obviously incorrect.



In one of my previous revisions of this patch when I was experimenting, 
I used something like below. Wonder if that could work in your case 
and/or in general.



diff --git a/drivers/gpu/drm/ttm/ttm_device.c 
b/drivers/gpu/drm/ttm/ttm_device.c


index 43e27ab77f95..4c3902b94be4 100644

--- a/drivers/gpu/drm/ttm/ttm_device.c

+++ b/drivers/gpu/drm/ttm/ttm_device.c

@@ -195,6 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, struct 
ttm_device_funcs *funcs,


bool use_dma_alloc, bool use_dma32){

struct ttm_global *glob = &ttm_glob;

+bool node_has_cpu = false;

int ret;

if (WARN_ON(vma_manager == NULL))

@@ -213,7 +214,12 @@ int ttm_device_init(struct ttm_device *bdev, struct 
ttm_device_funcs *funcs,


bdev->funcs = funcs;

ttm_sys_man_init(bdev);

-ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, use_dma32);

+

+node_has_cpu = node_state(dev->numa_node, N_CPU);

+if (node_has_cpu)

+ttm_pool_init(&bdev->pool, dev, dev->numa_node, use_dma_alloc, use_dma32);

+else

+ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc,

+use_dma32);

bdev->vma_manager = vma_manager;

spin_lock_init(&bdev->lru_lock);




-- Steve

RE: [PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

2023-11-27 Thread Bhardwaj, Rajneesh
[AMD Official Use Only - General]

-Original Message-
From: amd-gfx  On Behalf Of Hamza Mahfooz
Sent: Monday, November 27, 2023 10:53 AM
To: Christian König ; 
jani.nik...@linux.intel.com; kher...@redhat.com; d...@redhat.com; 
za...@vmware.com; Olsak, Marek ; 
linux-graphics-maintai...@vmware.com; amd-...@lists.freedesktop.org; 
nouv...@lists.freedesktop.org; intel-...@lists.freedesktop.org; 
virtualizat...@lists.linux.dev; spice-de...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amdgpu: use GTT only as fallback for VRAM|GTT

On 11/27/23 09:54, Christian König wrote:
> Try to fill up VRAM as well by setting the busy flag on GTT allocations.
>
> This fixes the issue that when VRAM was evacuated for suspend it's
> never filled up again unless the application is restarted.

I found the subject description a bit misleading. Maybe use a Fixes tag 
describing that this fixes a suspend/resume regression. Other than that, it 
looks good to me.

Acked-by: Rajneesh Bhardwaj 

>

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2893

> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 ++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index aa0dd6dad068..ddc8fb4db678 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -173,6 +173,12 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo 
> *abo, u32 domain)
>   abo->flags & AMDGPU_GEM_CREATE_PREEMPTIBLE ?
>   AMDGPU_PL_PREEMPT : TTM_PL_TT;
>   places[c].flags = 0;
> + /*
> +  * When GTT is just an alternative to VRAM make sure that we
> +  * only use it as fallback and still try to fill up VRAM first.
> +  */
> + if (domain & AMDGPU_GEM_DOMAIN_VRAM)
> + places[c].flags |= TTM_PL_FLAG_BUSY;
>   c++;
>   }
>
--
Hamza



Re: [Patch v2] drm/ttm: Schedule delayed_delete worker closer

2023-11-08 Thread Bhardwaj, Rajneesh



On 11/8/2023 9:49 AM, Christian König wrote:

Am 08.11.23 um 13:58 schrieb Rajneesh Bhardwaj:

Try to allocate system memory on the NUMA node the device is closest to
and try to run delayed_delete workers on a CPU of this node as well.

To optimize the memory clearing operation when a TTM BO gets freed by
the delayed_delete worker, scheduling it closer to a NUMA node where the
memory was initially allocated helps avoid the cases where the worker
gets randomly scheduled on the CPU cores that are across interconnect
boundaries such as xGMI, PCIe etc.

This change helps USWC GTT allocations on NUMA systems (dGPU) and AMD
APU platforms such as GFXIP9.4.3.
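
In short, the mechanism used for this is queue_work_node(); a condensed
sketch of what the diff below does (the helper name and header paths are
illustrative):

#include <linux/workqueue.h>
#include <drm/ttm/ttm_bo.h>
#include <drm/ttm/ttm_device.h>

/* Queue the delayed-delete work on a CPU of the pool's NUMA node when
 * possible; the workqueue core falls back to an arbitrary CPU if that
 * node has no usable CPUs online.
 */
static void ttm_bo_queue_delayed_delete(struct ttm_buffer_object *bo)
{
	struct ttm_device *bdev = bo->bdev;

	INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
	queue_work_node(bdev->pool.nid, bdev->wq, &bo->delayed_delete);
}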

Acked-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 


Reviewed-by: Christian König 

Going to push this to drm-misc-next.


Thanks Christian, there is a new regression reported and I am checking 
on that. Please don't submit it yet.





Thanks,
Christian.


---

Changes in v2:
  - Absorbed the feedback provided by Christian in the commit message 
and

    the comment.

  drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
  drivers/gpu/drm/ttm/ttm_device.c | 3 ++-
  2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5757b9415e37..6f28a77a565b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -370,7 +370,13 @@ static void ttm_bo_release(struct kref *kref)
  spin_unlock(&bo->bdev->lru_lock);
    INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
-    queue_work(bdev->wq, &bo->delayed_delete);
+
+    /* Schedule the worker on the closest NUMA node. This
+ * improves performance since system memory might be
+ * cleared on free and that is best done on a CPU core
+ * close to it.
+ */
+    queue_work_node(bdev->pool.nid, bdev->wq, 
&bo->delayed_delete);

  return;
  }
  diff --git a/drivers/gpu/drm/ttm/ttm_device.c 
b/drivers/gpu/drm/ttm/ttm_device.c

index 43e27ab77f95..72b81a2ee6c7 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -213,7 +213,8 @@ int ttm_device_init(struct ttm_device *bdev, 
struct ttm_device_funcs *funcs,

  bdev->funcs = funcs;
    ttm_sys_man_init(bdev);
-    ttm_pool_init(&bdev->pool, dev, NUMA_NO_NODE, use_dma_alloc, 
use_dma32);

+
+    ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, 
use_dma32);

    bdev->vma_manager = vma_manager;
  spin_lock_init(&bdev->lru_lock);




RE: [PATCH v2] drm/amdkfd: CRIU export dmabuf handles for GTT BOs

2022-03-09 Thread Bhardwaj, Rajneesh
[AMD Official Use Only]

Please ignore the previous email; that was sent in error. This one has the 
minor version bump, so this looks good.

Reviewed-by: Rajneesh Bhardwaj 

-Original Message-
From: amd-gfx  On Behalf Of David Yat Sin
Sent: Tuesday, March 8, 2022 4:08 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Kuehling, Felix ; Yat Sin, David 

Subject: [PATCH v2] drm/amdkfd: CRIU export dmabuf handles for GTT BOs

Export dmabuf handles for GTT BOs so that their contents can be accessed using 
SDMA during checkpoint/restore.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 
 include/uapi/linux/kfd_ioctl.h   |  3 ++-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 2c7d76e67ddb..e1e2362841f8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1759,7 +1759,8 @@ static int criu_checkpoint_bos(struct kfd_process *p,
goto exit;
}
}
-   if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   if (bo_bucket->alloc_flags
+   & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | 
KFD_IOC_ALLOC_MEM_FLAGS_GTT)) 
+{
ret = 
criu_get_prime_handle(&dumper_bo->tbo.base,
bo_bucket->alloc_flags &

KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? DRM_RDWR : 0,
@@ -1812,7 +1813,8 @@ static int criu_checkpoint_bos(struct kfd_process *p,
 
 exit:
while (ret && bo_index--) {
-   if (bo_buckets[bo_index].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   if (bo_buckets[bo_index].alloc_flags
+   & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | 
KFD_IOC_ALLOC_MEM_FLAGS_GTT))
close_fd(bo_buckets[bo_index].dmabuf_fd);
}
 
@@ -2211,7 +2213,8 @@ static int criu_restore_bo(struct kfd_process *p,
 
pr_debug("map memory was successful for the BO\n");
/* create the dmabuf object and export the bo */
-   if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   if (bo_bucket->alloc_flags
+   & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | KFD_IOC_ALLOC_MEM_FLAGS_GTT)) {
ret = criu_get_prime_handle(&kgd_mem->bo->tbo.base, DRM_RDWR,
&bo_bucket->dmabuf_fd);
if (ret)
@@ -2281,7 +2284,8 @@ static int criu_restore_bos(struct kfd_process *p,
 
 exit:
while (ret && i--) {
-   if (bo_buckets[i].alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   if (bo_buckets[i].alloc_flags
+  & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | 
KFD_IOC_ALLOC_MEM_FLAGS_GTT))
close_fd(bo_buckets[i].dmabuf_fd);
}
kvfree(bo_buckets);
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h 
index b40687bf1014..eb9ff85f8556 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -33,9 +33,10 @@
  * - 1.5 - Add SVM API
  * - 1.6 - Query clear flags in SVM get_attr API
  * - 1.7 - Checkpoint Restore (CRIU) API
+ * - 1.8 - CRIU - Support for SDMA transfers with GTT BOs
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 7
+#define KFD_IOCTL_MINOR_VERSION 8
 
 struct kfd_ioctl_get_version_args {
__u32 major_version;/* from KFD */
--
2.17.1


RE: [PATCH] drm/amdkfd: CRIU export dmabuf handles for GTT BOs

2022-03-09 Thread Bhardwaj, Rajneesh
[AMD Official Use Only]

Reviewed-by: Rajneesh Bhardwaj 

-Original Message-
From: amd-gfx  On Behalf Of David Yat Sin
Sent: Tuesday, March 8, 2022 2:12 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Kuehling, Felix ; Yat Sin, David 

Subject: [PATCH] drm/amdkfd: CRIU export dmabuf handles for GTT BOs

Export dmabuf handles for GTT BOs so that their contents can be accessed using 
SDMA during checkpoint/restore.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 2c7d76e67ddb..e1e2362841f8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1759,7 +1759,8 @@ static int criu_checkpoint_bos(struct kfd_process *p,
goto exit;
}
}
-   if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   if (bo_bucket->alloc_flags
+   & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | 
KFD_IOC_ALLOC_MEM_FLAGS_GTT)) 
+{
ret = 
criu_get_prime_handle(&dumper_bo->tbo.base,
bo_bucket->alloc_flags &

KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? DRM_RDWR : 0,
@@ -1812,7 +1813,8 @@ static int criu_checkpoint_bos(struct kfd_process *p,
 
 exit:
while (ret && bo_index--) {
-   if (bo_buckets[bo_index].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   if (bo_buckets[bo_index].alloc_flags
+   & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | 
KFD_IOC_ALLOC_MEM_FLAGS_GTT))
close_fd(bo_buckets[bo_index].dmabuf_fd);
}
 
@@ -2211,7 +2213,8 @@ static int criu_restore_bo(struct kfd_process *p,
 
pr_debug("map memory was successful for the BO\n");
/* create the dmabuf object and export the bo */
-   if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   if (bo_bucket->alloc_flags
+   & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | KFD_IOC_ALLOC_MEM_FLAGS_GTT)) {
ret = criu_get_prime_handle(&kgd_mem->bo->tbo.base, DRM_RDWR,
&bo_bucket->dmabuf_fd);
if (ret)
@@ -2281,7 +2284,8 @@ static int criu_restore_bos(struct kfd_process *p,
 
 exit:
while (ret && i--) {
-   if (bo_buckets[i].alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   if (bo_buckets[i].alloc_flags
+  & (KFD_IOC_ALLOC_MEM_FLAGS_VRAM | 
KFD_IOC_ALLOC_MEM_FLAGS_GTT))
close_fd(bo_buckets[i].dmabuf_fd);
}
kvfree(bo_buckets);
--
2.17.1


RE: [Patch v5 00/24] CHECKPOINT RESTORE WITH ROCm

2022-02-03 Thread Bhardwaj, Rajneesh
[AMD Official Use Only]

Thank you Felix for the review and your guidance.

-Original Message-
From: Kuehling, Felix  
Sent: Thursday, February 3, 2022 10:22 PM
To: Bhardwaj, Rajneesh ; 
amd-...@lists.freedesktop.org
Cc: Yat Sin, David ; Deucher, Alexander 
; dri-devel@lists.freedesktop.org
Subject: Re: [Patch v5 00/24] CHECKPOINT RESTORE WITH ROCm

The series is

Reviewed-by: Felix Kuehling 


Am 2022-02-03 um 04:08 schrieb Rajneesh Bhardwaj:
> V5: Proposed IOCTL APIs for CRIU with consolidated feedback
>
> CRIU is a user space tool which is very popular for container live 
> migration in datacentres. It can checkpoint a running application, 
> save its complete state, memory contents and all system resources to 
> images on disk which can be migrated to another machine and restored later.
> More information on CRIU can be found at https://criu.org/Main_Page
>
> CRIU currently does not support Checkpoint / Restore with applications 
> that have device files open, so it cannot perform checkpoint and 
> restore on GPU devices, which are very complex and have their own VRAM 
> managed privately. CRIU, however, can support external devices by using 
> a plugin architecture. We feel that we are getting close to finalizing 
> our IOCTL APIs, which were again changed since V3 for an improved modular 
> design.
>
> Our changes to CRIU user space can be obtained from here:
> https://github.com/RadeonOpenCompute/criu/tree/amdgpu_rfc-211222
>
> We have tested the following scenarios:
>   - Checkpoint / Restore of a Pytorch (BERT) workload
>   - kfdtests with queues and events
>   - Gfx9 and Gfx10 based multi GPU test systems
>   - On baremetal and inside a docker container
>   - Restoring on a different system
>
> V1: Initial
> V2: Addressed review comments
> V3: Rebased on latest amd-staging-drm-next (5.15 based)
> v4: New API design and basic support for SVM, however there is an 
> outstanding issue with SVM restore which is currently under debug and 
> hopefully that won't impact the ioctl APIs as SVMs are treated as 
> private data hidden from user space like queues and events with the 
> new approach.
> V5: Fix the SVM related issues and finalize the APIs.
>
> David Yat Sin (9):
>drm/amdkfd: CRIU Implement KFD unpause operation
>drm/amdkfd: CRIU add queues support
>drm/amdkfd: CRIU restore queue ids
>drm/amdkfd: CRIU restore sdma id for queues
>drm/amdkfd: CRIU restore queue doorbell id
>drm/amdkfd: CRIU checkpoint and restore queue mqds
>drm/amdkfd: CRIU checkpoint and restore queue control stack
>drm/amdkfd: CRIU checkpoint and restore events
>drm/amdkfd: CRIU implement gpu_id remapping
>
> Rajneesh Bhardwaj (15):
>x86/configs: CRIU update debug rock defconfig
>drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs
>drm/amdkfd: CRIU Implement KFD process_info ioctl
>drm/amdkfd: CRIU Implement KFD checkpoint ioctl
>drm/amdkfd: CRIU Implement KFD restore ioctl
>drm/amdkfd: CRIU Implement KFD resume ioctl
>drm/amdkfd: CRIU export BOs as prime dmabuf objects
>drm/amdkfd: CRIU checkpoint and restore xnack mode
>drm/amdkfd: CRIU allow external mm for svm ranges
>drm/amdkfd: use user_gpu_id for svm ranges
>drm/amdkfd: CRIU Discover svm ranges
>drm/amdkfd: CRIU Save Shared Virtual Memory ranges
>drm/amdkfd: CRIU prepare for svm resume
>drm/amdkfd: CRIU resume shared virtual memory ranges
>drm/amdkfd: Bump up KFD API version for CRIU
>
>   arch/x86/configs/rock-dbg_defconfig   |   53 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|7 +-
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   64 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   20 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |2 +
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 1471 ++---
>   drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |2 +-
>   .../drm/amd/amdkfd/kfd_device_queue_manager.c |  185 ++-
>   .../drm/amd/amdkfd/kfd_device_queue_manager.h |   16 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  313 +++-
>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   14 +
>   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |   75 +
>   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |   77 +
>   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |   92 ++
>   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |   84 +
>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  160 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   72 +-
>   .../amd/amdkfd/kfd_process_queue_manager.c|  372 -
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  331 +++-
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.h  |   39 +
>   include/uapi/linux/kfd_ioctl.h|   84 +-
>   21 files changed, 3193 insertions(+), 340 deletions(-)
>


Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2022-01-10 Thread Bhardwaj, Rajneesh

Hi Christian


I have reverted the change from amd-staging-drm-next as per the 
discussion. Thank you.



Regards

Rajneesh

On 1/4/2022 1:08 PM, Felix Kuehling wrote:

[+Adrian]

Am 2021-12-23 um 2:05 a.m. schrieb Christian König:


Am 22.12.21 um 21:53 schrieb Daniel Vetter:

On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:

[SNIP]
Still sounds funky. I think minimally we should have an ack from CRIU
developers that this is officially the right way to solve this
problem. I
really don't want to have random one-off hacks that don't work across
the
board, for a problem where we (drm subsystem) really shouldn't be the
only
one with this problem. Where "this problem" means that the mmap space is
per file description, and not per underlying inode or real device or
whatever. That part sounds like a CRIU problem, and I expect CRIU folks
want a consistent solution across the board for this. Hence please
grab an
ack from them.

Unfortunately it's a KFD design problem. AMD used a single device
node, then mmaped different objects from the same offset to different
processes and expected it to work with the rest of the fs subsystem
without churn.

This may be true for mmaps in the KFD device, but not for mmaps in the
DRM render nodes.



So yes, this is indeed because the mmap space is per file descriptor
for the use case here.

No. This is a different problem.

The problem has to do with the way that DRM manages mmap permissions. In
order to be able to mmap an offset in the render node, there needs to be
a BO that was created in the same render node. If you fork a process, it
inherits the VMA. But KFD doesn't know anything about the inherited BOs
from the parent process. Therefore those BOs don't get checkpointed and
restored in the child process. When the CRIU checkpoint is restored, our
CRIU plugin never creates a BO corresponding to the VMA in the child
process' render node FD. We've also lost the relationship between the
parent and child-process' render node FDs. After "fork" the render node
FD points to the same struct file in parent and child. After restoring
the CRIU checkpoint, they are separate struct files, created by separate
"open" system calls. Therefore the mmap call that restores the VMA fails
in the child process.

At least for KFD, there is no point inheriting BOs from a child process,
because the GPU has no way of accessing the BOs in the child process.
The child process has no GPU address space, no user mode queues, no way
to do anything with the GPU before it completely reinitializes its KFD
context.

We can workaround this issue in user mode with madvise(...,
MADV_DONTFORK). In fact we've already done this for some BOs to avoid a
memory leak in the parent process while a child process exists. But it's
slightly racy because there is a short time window where VMA exists
without the VM_DONTCOPY flag. A fork during that time window could still
create a child process with an inherited VMA.

Therefore a safer solution is to set the vm_flags in the VMA in the
driver when the VMA is first created.
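
For reference, the user mode workaround mentioned above is just the standard
madvise call on the mapped range right after mmap (a minimal sketch; the
helper and its parameters are hypothetical):

#include <sys/mman.h>

/* Hypothetical helper: mark a freshly mmapped BO range so that a later
 * fork() does not duplicate the VMA in the child. Note the window
 * between mmap() and madvise() where a fork can still race in.
 */
static int map_bo_no_fork(int drm_fd, size_t size, off_t mmap_offset, void **out)
{
	void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			  MAP_SHARED, drm_fd, mmap_offset);

	if (addr == MAP_FAILED)
		return -1;
	if (madvise(addr, size, MADV_DONTFORK)) {
		munmap(addr, size);
		return -1;
	}
	*out = addr;
	return 0;
}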

Regards,
   Felix



And thanks for pointing this out, this indeed makes the whole change
extremely questionable.

Regards,
Christian.


Cheers, Daniel



Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-22 Thread Bhardwaj, Rajneesh

Sorry for the typo in my previous email. Please read Adrian Reber*

On 12/22/2021 8:49 PM, Bhardwaj, Rajneesh wrote:


Adding Adrian Rebel who is the CRIU maintainer and CRIU list

On 12/22/2021 3:53 PM, Daniel Vetter wrote:

On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:

On 12/20/2021 4:29 AM, Daniel Vetter wrote:

On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote:

Am 09.12.21 um 19:28 schrieb Felix Kuehling:

Am 2021-12-09 um 10:30 a.m. schrieb Christian König:

That still won't work.

But I think we could do this change for the amdgpu mmap callback only.

If graphics user mode has problems with it, we could even make this
specific to KFD BOs in the amdgpu_gem_object_mmap callback.

I think it's fine for the whole amdgpu stack, my concern is more about
radeon, nouveau and the ARM stacks which are using this as well.

That blew up so nicely the last time we tried to change it and I know of at
least one case where radeon was/is used with BOs in a child process.

I'm way late and burried again, but I think it'd be good to be consistent



I had committed this change into our amd-staging-drm-next branch last 
week after I got the ACK and RB from Felix and Christian.




here across drivers. Or at least across drm drivers. And we've had the vma
open/close refcounting to make fork work since forever.

I think if we do this we should really only do this for mmap() where this
applies, but reading through the thread here I'm honestly confused why
this is a problem. If CRIU can't handle forked mmaps it needs to be
thought that, not hacked around. Or at least I'm not understanding why
this shouldn't work ...
-Daniel


Hi Daniel

In the v2
https://lore.kernel.org/all/a1a865f5-ad2c-29c8-cbe4-2635d53eceb6@amd.com/T/
I pretty much limited the scope of the change to KFD BOs on mmap. Regarding
CRIU, I think it's not a CRIU problem, as CRIU on restore only tries to
recreate all the child processes and then mmaps all the VMAs it sees (as per
the checkpoint snapshot) in the new process address space after the VMA
placements are finalized in the position independent code phase. Since the
inherited VMAs don't have access rights, the criu mmap fails.

Still sounds funky. I think minimally we should have an ack from CRIU
developers that this is officially the right way to solve this problem. I
really don't want to have random one-off hacks that don't work across the
board, for a problem where we (drm subsystem) really shouldn't be the only
one with this problem. Where "this problem" means that the mmap space is
per file description, and not per underlying inode or real device or
whatever. That part sounds like a CRIU problem, and I expect CRIU folks
want a consistent solution across the board for this. Hence please grab an
ack from them.

Cheers, Daniel



Maybe Adrian can share his views on this.

Hi Adrian - For the context, on CRIU restore we see mmap failures (in the 
PIE restore phase) due to permission issues on the (render node) VMAs 
that were inherited since the application that checkpointed had 
forked.  The VMAs ideally should not be in the child process but the 
smaps file shows these VMAs in the child address space. We didn't want 
to use madvise to avoid this copy and rather change the kernel mode 
to limit the impact to our user space library thunk. Based on my 
understanding, during the PIE restore phase, after the VMA placements are 
finalized, CRIU does a sys_mmap on all the VMAs it sees in the VmaEntry 
list, and I think it's not an issue as per CRIU design, but do you think 
we could handle this corner case better inside CRIU?




Regards,

Rajneesh


Regards,
Christian.


Regards,
     Felix



Regards,
Christian.

Am 09.12.21 um 16:29 schrieb Bhardwaj, Rajneesh:

Sounds good. I will send a v2 with only ttm_bo_mmap_obj change. Thank
you!

On 12/9/2021 10:27 AM, Christian König wrote:

Hi Rajneesh,

yes, separating this from the drm_gem_mmap_obj() change is certainly
a good idea.


The child cannot access the BOs mapped by the parent anyway with
access restrictions applied

exactly that is not correct. That behavior is actively used by some
userspace stacks as far as I know.

Regards,
Christian.

Am 09.12.21 um 16:23 schrieb Bhardwaj, Rajneesh:

Thanks Christian. Would it make it less intrusive if I just use the
flag for ttm bo mmap and remove the drm_gem_mmap_obj change from
this patch? For our use case, just the ttm_bo_mmap_obj change
should suffice and we don't want to put any more work arounds in
the user space (thunk, in our case).

The child cannot access the BOs mapped by the parent anyway w

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-22 Thread Bhardwaj, Rajneesh

Adding Adrian Rebel who is the CRIU maintainer and CRIU list

On 12/22/2021 3:53 PM, Daniel Vetter wrote:

On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:

On 12/20/2021 4:29 AM, Daniel Vetter wrote:

On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote:

Am 09.12.21 um 19:28 schrieb Felix Kuehling:

Am 2021-12-09 um 10:30 a.m. schrieb Christian König:

That still won't work.

But I think we could do this change for the amdgpu mmap callback only.

If graphics user mode has problems with it, we could even make this
specific to KFD BOs in the amdgpu_gem_object_mmap callback.

I think it's fine for the whole amdgpu stack, my concern is more about
radeon, nouveau and the ARM stacks which are using this as well.

That blew up so nicely the last time we tried to change it and I know of at
least one case where radeon was/is used with BOs in a child process.

I'm way late and burried again, but I think it'd be good to be consistent



I had committed this change into our amd-staging-drm-next branch last 
week after I got the ACK and RB from Felix and Christian.




here across drivers. Or at least across drm drivers. And we've had the vma
open/close refcounting to make fork work since forever.

I think if we do this we should really only do this for mmap() where this
applies, but reading through the thread here I'm honestly confused why
this is a problem. If CRIU can't handle forked mmaps it needs to be
thought that, not hacked around. Or at least I'm not understanding why
this shouldn't work ...
-Daniel


Hi Daniel

In the v2
https://lore.kernel.org/all/a1a865f5-ad2c-29c8-cbe4-2635d53eceb6@amd.com/T/
I pretty much limited the scope of the change to KFD BOs on mmap. Regarding
CRIU, I think it's not a CRIU problem, as CRIU on restore only tries to
recreate all the child processes and then mmaps all the VMAs it sees (as per
the checkpoint snapshot) in the new process address space after the VMA
placements are finalized in the position independent code phase. Since the
inherited VMAs don't have access rights, the criu mmap fails.

Still sounds funky. I think minimally we should have an ack from CRIU
developers that this is officially the right way to solve this problem. I
really don't want to have random one-off hacks that don't work across the
board, for a problem where we (drm subsystem) really shouldn't be the only
one with this problem. Where "this problem" means that the mmap space is
per file description, and not per underlying inode or real device or
whatever. That part sounds like a CRIU problem, and I expect CRIU folks
want a consistent solution across the board for this. Hence please grab an
ack from them.

Cheers, Daniel



Maybe Adrian can share his views on this.

Hi Adrian - For the context, on CRIU restore we see mmap failures (in the 
PIE restore phase) due to permission issues on the (render node) VMAs 
that were inherited since the application that checkpointed had 
forked.  The VMAs ideally should not be in the child process but the 
smaps file shows these VMAs in the child address space. We didn't want 
to use madvise to avoid this copy and rather change the kernel mode 
to limit the impact to our user space library thunk. Based on my 
understanding, during the PIE restore phase, after the VMA placements are 
finalized, CRIU does a sys_mmap on all the VMAs it sees in the VmaEntry 
list, and I think it's not an issue as per CRIU design, but do you think we 
could handle this corner case better inside CRIU?






Regards,

Rajneesh


Regards,
Christian.


Regards,
     Felix



Regards,
Christian.

Am 09.12.21 um 16:29 schrieb Bhardwaj, Rajneesh:

Sounds good. I will send a v2 with only ttm_bo_mmap_obj change. Thank
you!

On 12/9/2021 10:27 AM, Christian König wrote:

Hi Rajneesh,

yes, separating this from the drm_gem_mmap_obj() change is certainly
a good idea.


The child cannot access the BOs mapped by the parent anyway with
access restrictions applied

exactly that is not correct. That behavior is actively used by some
userspace stacks as far as I know.

Regards,
Christian.

Am 09.12.21 um 16:23 schrieb Bhardwaj, Rajneesh:

Thanks Christian. Would it make it less intrusive if I just use the
flag for ttm bo mmap and remove the drm_gem_mmap_obj change from
this patch? For our use case, just the ttm_bo_mmap_obj change
should suffice and we don't want to put any more work arounds in
the user space (thunk, in our case).

The child cannot access the BOs mapped by the parent anyway with
access restrictions applied so I wonder why even inherit the vma?

On 12/9/2021 2:54 AM, Christian König wrote:

Am 08.1

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-20 Thread Bhardwaj, Rajneesh



On 12/20/2021 4:29 AM, Daniel Vetter wrote:

On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote:

Am 09.12.21 um 19:28 schrieb Felix Kuehling:

Am 2021-12-09 um 10:30 a.m. schrieb Christian König:

That still won't work.

But I think we could do this change for the amdgpu mmap callback only.

If graphics user mode has problems with it, we could even make this
specific to KFD BOs in the amdgpu_gem_object_mmap callback.

I think it's fine for the whole amdgpu stack, my concern is more about
radeon, nouveau and the ARM stacks which are using this as well.

That blew up so nicely the last time we tried to change it and I know of at
least one case where radeon was/is used with BOs in a child process.

I'm way late and burried again, but I think it'd be good to be consistent
here across drivers. Or at least across drm drivers. And we've had the vma
open/close refcounting to make fork work since forever.

I think if we do this we should really only do this for mmap() where this
applies, but reading through the thread here I'm honestly confused why
this is a problem. If CRIU can't handle forked mmaps it needs to be
thought that, not hacked around. Or at least I'm not understanding why
this shouldn't work ...
-Daniel



Hi Daniel

In the v2 
https://lore.kernel.org/all/a1a865f5-ad2c-29c8-cbe4-2635d53eceb6@amd.com/T/ 
I pretty much limited the scope of the change to KFD BOs on mmap. 
Regarding CRIU, I think it's not a CRIU problem, as CRIU on restore only 
tries to recreate all the child processes and then mmaps all the VMAs it 
sees (as per the checkpoint snapshot) in the new process address space after 
the VMA placements are finalized in the position independent code phase. 
Since the inherited VMAs don't have access rights, the criu mmap fails.


Regards,

Rajneesh


Regards,
Christian.


Regards,
    Felix



Regards,
Christian.

Am 09.12.21 um 16:29 schrieb Bhardwaj, Rajneesh:

Sounds good. I will send a v2 with only ttm_bo_mmap_obj change. Thank
you!

On 12/9/2021 10:27 AM, Christian König wrote:

Hi Rajneesh,

yes, separating this from the drm_gem_mmap_obj() change is certainly
a good idea.


The child cannot access the BOs mapped by the parent anyway with
access restrictions applied

exactly that is not correct. That behavior is actively used by some
userspace stacks as far as I know.

Regards,
Christian.

Am 09.12.21 um 16:23 schrieb Bhardwaj, Rajneesh:

Thanks Christian. Would it make it less intrusive if I just use the
flag for ttm bo mmap and remove the drm_gem_mmap_obj change from
this patch? For our use case, just the ttm_bo_mmap_obj change
should suffice and we don't want to put any more work arounds in
the user space (thunk, in our case).

The child cannot access the BOs mapped by the parent anyway with
access restrictions applied so I wonder why even inherit the vma?

On 12/9/2021 2:54 AM, Christian König wrote:

Am 08.12.21 um 21:53 schrieb Rajneesh Bhardwaj:

When an application having open file access to a node forks, its
shared
mappings also get reflected in the address space of child process
even
though it cannot access them with the object permissions applied.
With the
existing permission checks on the gem objects, it might be
reasonable to
also create the VMAs with VM_DONTCOPY flag so a user space
application
doesn't need to explicitly call the madvise(addr, len,
MADV_DONTFORK)
system call to prevent the pages in the mapped range from appearing in
the
address space of the child process. It also prevents the memory
leaks
due to additional reference counts on the mapped BOs in the child
process that prevented freeing the memory in the parent for which
we had
worked around earlier in the user space inside the thunk library.

Additionally, we faced this issue when using CRIU to checkpoint
restore
an application that had such inherited mappings in the child which
confuse CRIU when it mmaps on restore. Having this flag set for the
render node VMAs helps. VMAs mapped via KFD already take care of
this so
this is needed only for the render nodes.

Unfortunately that is most likely a NAK. We already tried
something similar.

While it is illegal by the OpenGL specification and doesn't work
for most userspace stacks, we do have some implementations which
call fork() with a GL context open and expect it to work.

Regards,
Christian.


Cc: Felix Kuehling 

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
    drivers/gpu/drm/drm_gem.c   | 3 ++-
    drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
    2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 09c820045859..d9c4149f36dd 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1058,7 +1058,8 @@ int drm_gem_mmap_obj(struct drm_gem_object
*obj, unsigned long obj_size,
    goto err_drm_gem_object_put;
    }
    -    vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND |
VM_DONTDUMP;
+    vma->vm_flags |=

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-09 Thread Bhardwaj, Rajneesh

Sounds good. I will send a v2 with only ttm_bo_mmap_obj change. Thank you!

On 12/9/2021 10:27 AM, Christian König wrote:

Hi Rajneesh,

yes, separating this from the drm_gem_mmap_obj() change is certainly a 
good idea.


The child cannot access the BOs mapped by the parent anyway with 
access restrictions applied


exactly that is not correct. That behavior is actively used by some 
userspace stacks as far as I know.


Regards,
Christian.

Am 09.12.21 um 16:23 schrieb Bhardwaj, Rajneesh:
Thanks Christian. Would it make it less intrusive if I just use the 
flag for ttm bo mmap and remove the drm_gem_mmap_obj change from this 
patch? For our use case, just the ttm_bo_mmap_obj change should 
suffice and we don't want to put any more work arounds in the user 
space (thunk, in our case).


The child cannot access the BOs mapped by the parent anyway with 
access restrictions applied so I wonder why even inherit the vma?


On 12/9/2021 2:54 AM, Christian König wrote:

Am 08.12.21 um 21:53 schrieb Rajneesh Bhardwaj:
When an application having open file access to a node forks, its 
shared

mappings also get reflected in the address space of child process even
though it cannot access them with the object permissions applied. 
With the
existing permission checks on the gem objects, it might be 
reasonable to

also create the VMAs with VM_DONTCOPY flag so a user space application
doesn't need to explicitly call the madvise(addr, len, MADV_DONTFORK)
system call to prevent the pages in the mapped range from appearing in the
address space of the child process. It also prevents the memory leaks
due to additional reference counts on the mapped BOs in the child
process that prevented freeing the memory in the parent for which 
we had

worked around earlier in the user space inside the thunk library.

Additionally, we faced this issue when using CRIU to checkpoint 
restore

an application that had such inherited mappings in the child which
confuse CRIU when it mmaps on restore. Having this flag set for the
render node VMAs helps. VMAs mapped via KFD already take care of 
this so

this is needed only for the render nodes.


Unfortunately that is most likely a NAK. We already tried something 
similar.


While it is illegal by the OpenGL specification and doesn't work for 
most userspace stacks, we do have some implementations which call 
fork() with a GL context open and expect it to work.


Regards,
Christian.



Cc: Felix Kuehling 

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
  drivers/gpu/drm/drm_gem.c   | 3 ++-
  drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 09c820045859..d9c4149f36dd 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1058,7 +1058,8 @@ int drm_gem_mmap_obj(struct drm_gem_object 
*obj, unsigned long obj_size,

  goto err_drm_gem_object_put;
  }
  -    vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | 
VM_DONTDUMP;

+    vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND
+    | VM_DONTDUMP | VM_DONTCOPY;
  vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));

  vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
  }
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c 
b/drivers/gpu/drm/ttm/ttm_bo_vm.c

index 33680c94127c..420a4898fdd2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -566,7 +566,7 @@ int ttm_bo_mmap_obj(struct vm_area_struct *vma, 
struct ttm_buffer_object *bo)

    vma->vm_private_data = bo;
  -    vma->vm_flags |= VM_PFNMAP;
+    vma->vm_flags |= VM_PFNMAP | VM_DONTCOPY;
  vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
  return 0;
  }






Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-09 Thread Bhardwaj, Rajneesh
Thanks Christian. Would it make it less intrusive if I just use the flag 
for ttm bo mmap and remove the drm_gem_mmap_obj change from this patch? 
For our use case, just the ttm_bo_mmap_obj change should suffice and we 
don't want to put any more work arounds in the user space (thunk, in our 
case).


The child cannot access the BOs mapped by the parent anyway with access 
restrictions applied so I wonder why even inherit the vma?


On 12/9/2021 2:54 AM, Christian König wrote:

Am 08.12.21 um 21:53 schrieb Rajneesh Bhardwaj:

When an application having open file access to a node forks, its shared
mappings also get reflected in the address space of child process even
though it cannot access them with the object permissions applied. 
With the

existing permission checks on the gem objects, it might be reasonable to
also create the VMAs with VM_DONTCOPY flag so a user space application
doesn't need to explicitly call the madvise(addr, len, MADV_DONTFORK)
system call to prevent the pages in the mapped range from appearing in the
address space of the child process. It also prevents the memory leaks
due to additional reference counts on the mapped BOs in the child
process that prevented freeing the memory in the parent for which we had
worked around earlier in the user space inside the thunk library.

Additionally, we faced this issue when using CRIU to checkpoint restore
an application that had such inherited mappings in the child which
confuse CRIU when it mmaps on restore. Having this flag set for the
render node VMAs helps. VMAs mapped via KFD already take care of this so
this is needed only for the render nodes.


Unfortunately that is most likely a NAK. We already tried something 
similar.


While it is illegal by the OpenGL specification and doesn't work for 
most userspace stacks, we do have some implementations which call 
fork() with a GL context open and expect it to work.


Regards,
Christian.



Cc: Felix Kuehling 

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
  drivers/gpu/drm/drm_gem.c   | 3 ++-
  drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 09c820045859..d9c4149f36dd 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1058,7 +1058,8 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size,
 			goto err_drm_gem_object_put;
 		}
 
-		vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
+		vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND
+				 | VM_DONTDUMP | VM_DONTCOPY;
 		vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
 		vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
 	}
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 33680c94127c..420a4898fdd2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -566,7 +566,7 @@ int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct ttm_buffer_object *bo)
 
 	vma->vm_private_data = bo;
 
-	vma->vm_flags |= VM_PFNMAP;
+	vma->vm_flags |= VM_PFNMAP | VM_DONTCOPY;
 	vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
 	return 0;
 }




RE: [PATCH 2/2] drm/amdgpu: fix a few compiler warnings

2021-03-11 Thread Bhardwaj, Rajneesh

Reviewed-by: Rajneesh Bhardwaj 

-Original Message-
From: amd-gfx  On Behalf Of Oak Zeng
Sent: Wednesday, March 10, 2021 10:29 PM
To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
Cc: Zeng, Oak 
Subject: [PATCH 2/2] drm/amdgpu: fix a few compiler warnings

1. make the function mmhub_v1_7_setup_vm_pt_regs static 2. fix the indentation of an if statement

Signed-off-by: Oak Zeng 
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c | 4 ++--  
drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
index 3b4193c..8fca72e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.c
@@ -88,14 +88,14 @@ int gfxhub_v1_1_get_xgmi_info(struct amdgpu_device *adev)
adev->gmc.xgmi.num_physical_nodes = max_region + 1;

if (adev->gmc.xgmi.num_physical_nodes > max_num_physical_nodes)
-   return -EINVAL;
+   return -EINVAL;

adev->gmc.xgmi.physical_node_id =
REG_GET_FIELD(xgmi_lfb_cntl, MC_VM_XGMI_LFB_CNTL,
  PF_LFB_REGION);

if (adev->gmc.xgmi.physical_node_id > max_physical_node_id)
-   return -EINVAL;
+   return -EINVAL;

adev->gmc.xgmi.node_segment_size = seg_size;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
index ac74d66..29d7f50 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
@@ -53,7 +53,7 @@ static u64 mmhub_v1_7_get_fb_location(struct amdgpu_device *adev)
return base;
 }

-void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid,
+static void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid,
 				uint64_t page_table_base)
 {
 	struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
--
2.7.4
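
(As an aside, not part of the patch itself: the first fix addresses the usual
W=1 warning about a function that is defined in one file but declared in no
header; marking it static keeps the symbol local to the translation unit and
silences the warning. A tiny illustrative example with a made-up helper name:

/* Without "static", a W=1 build warns "no previous prototype for
 * 'example_local_helper'"; with "static" the definition is file-local
 * and the warning goes away. */
static int example_local_helper(int x)
{
	return x * 2;
}
)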

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/ttm: ioremap buffer according to TTM mem caching setting

2021-03-04 Thread Bhardwaj, Rajneesh


On 3/4/2021 12:31 PM, Christian König wrote:

Am 04.03.21 um 18:01 schrieb Bhardwaj, Rajneesh:

I was wondering if a managed version of such an API exists, but it looks like
there is none. We only have devm_ioremap_wc, but that is valid only for
PAGE_CACHE_MODE_WC, whereas ioremap_cache uses _WB. One more small
comment below.


Acked-by: Rajneesh Bhardwaj 

On 3/4/2021 11:04 AM, Oak Zeng wrote:

If tbo.mem.bus.caching is cached, buffer is intended to be mapped
as cached from CPU. Map it with ioremap_cache.

This wasn't necessary before as device memory was never mapped
as cached from CPU side. It becomes necessary for aldebaran as
device memory is mapped cached from CPU.

Signed-off-by: Oak Zeng 
Reviewed-by: Christian Konig 
---
  drivers/gpu/drm/ttm/ttm_bo_util.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 031e581..7429464 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -91,6 +91,10 @@ static int ttm_resource_ioremap(struct ttm_device *bdev,
 
 		if (mem->bus.caching == ttm_write_combined)
 			addr = ioremap_wc(mem->bus.offset, bus_size);
+#ifdef CONFIG_X86



Please use #if defined (CONFIG_X86)


Actually #ifdef is usually preferred.


Oops, I was referring to IS_ENABLED(CONFIG) and not #if defined.
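
To spell out that suggestion (a sketch only, not what was posted; it assumes
ioremap_cache() is declared for every configuration that builds this file,
and the helper name is made up for illustration):

/* Illustrative IS_ENABLED() form: the branch stays visible to the compiler
 * on every architecture and is constant-folded away where CONFIG_X86 is
 * not set. */
static void __iomem *ttm_io_remap_sketch(struct ttm_resource *mem, size_t bus_size)
{
	if (mem->bus.caching == ttm_write_combined)
		return ioremap_wc(mem->bus.offset, bus_size);
	if (IS_ENABLED(CONFIG_X86) && mem->bus.caching == ttm_cached)
		return ioremap_cache(mem->bus.offset, bus_size);
	return ioremap(mem->bus.offset, bus_size);
}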




Christian.




+		else if (mem->bus.caching == ttm_cached)
+			addr = ioremap_cache(mem->bus.offset, bus_size);
+#endif
 		else
 			addr = ioremap(mem->bus.offset, bus_size);
 		if (!addr) {
@@ -372,6 +376,11 @@ static int ttm_bo_ioremap(struct ttm_buffer_object *bo,
 		if (mem->bus.caching == ttm_write_combined)
 			map->virtual = ioremap_wc(bo->mem.bus.offset + offset,
 						  size);
+#ifdef CONFIG_X86
+		else if (mem->bus.caching == ttm_cached)
+			map->virtual = ioremap_cache(bo->mem.bus.offset + offset,
+						     size);
+#endif
 		else
 			map->virtual = ioremap(bo->mem.bus.offset + offset,
 					       size);
@@ -490,6 +499,11 @@ int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)
 		else if (mem->bus.caching == ttm_write_combined)
 			vaddr_iomem = ioremap_wc(mem->bus.offset,
 						 bo->base.size);
+		else if (mem->bus.caching == ttm_cached)
+#ifdef CONFIG_X86
+			vaddr_iomem = ioremap_cache(mem->bus.offset,
+						    bo->base.size);
+#endif
 		else
 			vaddr_iomem = ioremap(mem->bus.offset, bo->base.size);
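
The same three-way selection is repeated in all three hunks above; purely for
illustration (not part of the posted patch, helper name made up), it could be
factored into something like:

/* Pick an ioremap variant from the TTM caching mode of a resource. */
static void __iomem *ttm_resource_io_remap(phys_addr_t offset, size_t size,
					   enum ttm_caching caching)
{
	if (caching == ttm_write_combined)
		return ioremap_wc(offset, size);
#ifdef CONFIG_X86
	if (caching == ttm_cached)
		return ioremap_cache(offset, size);
#endif
	return ioremap(offset, size);
}

Each call site would then reduce to something like
addr = ttm_resource_io_remap(mem->bus.offset, bus_size, mem->bus.caching);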

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm/ttm: ioremap buffer according to TTM mem caching setting

2021-03-04 Thread Bhardwaj, Rajneesh
I was wondering if a managed version of such an API exists, but it looks like
there is none. We only have devm_ioremap_wc, but that is valid only for
PAGE_CACHE_MODE_WC, whereas ioremap_cache uses _WB. One more small
comment below.
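
(For the record, a managed variant could in principle be open-coded on top of
devm_add_action_or_reset(); a rough sketch, purely illustrative, with made-up
names and not an API that exists in the kernel:

static void devm_iounmap_action(void *vaddr)
{
	iounmap((void __iomem *)vaddr);
}

static void __iomem *devm_ioremap_cache_sketch(struct device *dev,
					       resource_size_t offset,
					       resource_size_t size)
{
	void __iomem *vaddr = ioremap_cache(offset, size);

	if (!vaddr)
		return NULL;
	/* Unmapped automatically on driver unbind, or immediately on error. */
	if (devm_add_action_or_reset(dev, devm_iounmap_action,
				     (void __force *)vaddr))
		return NULL;
	return vaddr;
}
)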



Acked-by: Rajneesh Bhardwaj 

On 3/4/2021 11:04 AM, Oak Zeng wrote:

If tbo.mem.bus.caching is cached, buffer is intended to be mapped
as cached from CPU. Map it with ioremap_cache.

This wasn't necessary before as device memory was never mapped
as cached from CPU side. It becomes necessary for aldebaran as
device memory is mapped cached from CPU.

Signed-off-by: Oak Zeng 
Reviewed-by: Christian Konig 
---
  drivers/gpu/drm/ttm/ttm_bo_util.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 031e581..7429464 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -91,6 +91,10 @@ static int ttm_resource_ioremap(struct ttm_device *bdev,
 
 		if (mem->bus.caching == ttm_write_combined)
 			addr = ioremap_wc(mem->bus.offset, bus_size);
+#ifdef CONFIG_X86



Please use #if defined (CONFIG_X86)


+		else if (mem->bus.caching == ttm_cached)
+			addr = ioremap_cache(mem->bus.offset, bus_size);
+#endif
 		else
 			addr = ioremap(mem->bus.offset, bus_size);
 		if (!addr) {
@@ -372,6 +376,11 @@ static int ttm_bo_ioremap(struct ttm_buffer_object *bo,
 		if (mem->bus.caching == ttm_write_combined)
 			map->virtual = ioremap_wc(bo->mem.bus.offset + offset,
 						  size);
+#ifdef CONFIG_X86
+		else if (mem->bus.caching == ttm_cached)
+			map->virtual = ioremap_cache(bo->mem.bus.offset + offset,
+						     size);
+#endif
 		else
 			map->virtual = ioremap(bo->mem.bus.offset + offset,
 					       size);
@@ -490,6 +499,11 @@ int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)
 		else if (mem->bus.caching == ttm_write_combined)
 			vaddr_iomem = ioremap_wc(mem->bus.offset,
 						 bo->base.size);
+		else if (mem->bus.caching == ttm_cached)
+#ifdef CONFIG_X86
+			vaddr_iomem = ioremap_cache(mem->bus.offset,
+						    bo->base.size);
+#endif
 		else
 			vaddr_iomem = ioremap(mem->bus.offset, bo->base.size);
  

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel