Hi Oak,

yeah, I completely agree with you and Felix. The main problem here is making the memory pressure visible on both sides.

At the moment I have absolutely no idea how to handle that; maybe something like a ttm_resource object shared between TTM and HMM?
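
To make it a tiny bit more concrete, I'm thinking of something vaguely like the sketch below. Nothing of this exists and all the names are made up; it's only meant to illustrate the idea of one descriptor per vram allocation that both managers see on a common LRU:

#include <linux/list.h>
#include <linux/types.h>

/*
 * Hypothetical descriptor, embedded in both the TTM resource path and the
 * HMM allocation path, so both sides see the same LRU and usage counters
 * and can ask each other to give memory back under pressure.
 */
struct vram_pressure_node {
	struct list_head lru;		/* one LRU shared by TTM and HMM */
	u64 size;			/* bytes of vram behind this node */

	/* owner-specific eviction: BO move for TTM, unmap + migrate for HMM */
	int (*make_room)(struct vram_pressure_node *node);
};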

Regards,
Christian.

On 2023-08-16 05:47, Zeng, Oak wrote:
Hi Felix,

It is great to hear from you!

While implementing HMM-based SVM for Intel devices, I ran into this interesting 
problem: HMM uses a struct-page-based memory management scheme, which is 
completely different from the BO/TTM memory management philosophy. Writing SVM 
code on top of the BO/TTM concept seems like overkill and is awkward, so I 
thought we had better make the SVM code BO-less and TTM-less. But on the other 
hand, vram eviction and cgroup memory accounting are currently all hooked into 
the TTM layer, which means a TTM-less SVM driver won't be able to evict vram 
allocated through TTM/gpu_vram_mgr.

Ideally HMM migration should use drm-buddy for vram allocation, but we need to 
solve this TTM/HMM mutual eviction problem, as you pointed out (I am working 
with application engineers to figure out whether mutual eviction can truly 
benefit applications). Maybe we can implement a TTM-less vram management block 
which can be shared between the HMM-based driver and the BO-based driver:
    * allocate/free memory from drm-buddy, buddy-block based
    * memory eviction logic, allowing the driver to specify which allocations 
are evictable
    * memory accounting, cgroup logic

Maybe such a block can be placed at the drm layer (say, call it drm_vram_mgr 
for now), so it can be shared between amd and intel; that is why I involved the 
amd folks. Today both the amd and intel xe drivers implement a TTM-based vram 
manager, which doesn't serve the above design goal. Once drm_vram_mgr is 
implemented, both amd's and intel's BO-based/TTM-based vram managers, as well 
as the HMM-based vram manager, can call into this drm_vram_mgr.
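
To give a rough idea of the shape I have in mind (purely a sketch; none of this 
exists today and all names below are placeholders):

#include <linux/list.h>
#include <linux/types.h>

struct drm_vram_mgr;	/* would wrap a struct drm_buddy plus a shared LRU and cgroup charge state */

struct drm_vram_alloc {
	struct list_head blocks;	/* drm_buddy_block list backing this allocation */
	struct list_head lru_link;	/* shared eviction LRU */
	bool evictable;			/* caller decides whether this allocation may be evicted */

	/* callback into the owner (BO-based or HMM-based) to actually evict */
	int (*evict)(struct drm_vram_alloc *alloc);
};

/* allocate from drm-buddy, charge the cgroup, evict others on pressure */
int drm_vram_mgr_alloc(struct drm_vram_mgr *mgr, u64 size, bool evictable,
		       int (*evict)(struct drm_vram_alloc *alloc),
		       struct drm_vram_alloc **out);

/* return the buddy blocks to drm-buddy and uncharge the cgroup */
void drm_vram_mgr_free(struct drm_vram_mgr *mgr, struct drm_vram_alloc *alloc);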

Thanks again,
Oak

-----Original Message-----
From: Felix Kuehling <felix.kuehl...@amd.com>
Sent: August 15, 2023 6:17 PM
To: Zeng, Oak <oak.z...@intel.com>; Thomas Hellström
<thomas.hellst...@linux.intel.com>; Brost, Matthew
<matthew.br...@intel.com>; Vishwanathapura, Niranjana
<niranjana.vishwanathap...@intel.com>; Welty, Brian <brian.we...@intel.com>;
Christian König <christian.koe...@amd.com>; Philip Yang
<philip.y...@amd.com>; intel...@lists.freedesktop.org; dri-
de...@lists.freedesktop.org
Subject: Re: Implement svm without BO concept in xe driver

Hi Oak,

I'm not sure what you're looking for from AMD. Are we just CC'ed FYI, or
are you looking for comments about

   * Our plans for VRAM management with HMM
   * Our experience with BO-based VRAM management
   * Something else?

IMO, having separate memory pools for HMM and TTM is a non-starter for
AMD. We need access to the full VRAM in either of the APIs for it to be
useful. That also means we need to handle memory pressure in both
directions. That's one of the main reasons we went with the BO-based
approach initially. I think in the long run, using the buddy allocator
or the amdgpu_vram_mgr directly for HMM migrations would be better,
assuming we can handle memory pressure in both directions between HMM
and TTM sharing the same pool of physical memory.

Regards,
    Felix


On 2023-08-15 16:34, Zeng, Oak wrote:
Also + Christian

Thanks,

Oak

*From:*Intel-xe <intel-xe-boun...@lists.freedesktop.org> *On Behalf Of
*Zeng, Oak
*Sent:* August 14, 2023 11:38 PM
*To:* Thomas Hellström <thomas.hellst...@linux.intel.com>; Brost,
Matthew <matthew.br...@intel.com>; Vishwanathapura, Niranjana
<niranjana.vishwanathap...@intel.com>; Welty, Brian
<brian.we...@intel.com>; Felix Kuehling <felix.kuehl...@amd.com>;
Philip Yang <philip.y...@amd.com>; intel...@lists.freedesktop.org;
dri-devel@lists.freedesktop.org
*Subject:* [Intel-xe] Implement svm without BO concept in xe driver

Hi Thomas, Matt and all,

This came up when I ported the i915 svm code to the xe driver. In the i915
implementation, we have i915_buddy managing gpu vram, and the svm code
calls the i915_buddy layer directly to allocate/free vram. There is no
gem_bo/ttm bo concept involved in the svm implementation.

In the xe driver, we have drm_buddy, xe_ttm_vram_mgr and the ttm layer to
manage vram. drm_buddy is initialized during xe_ttm_vram_mgr
initialization. Vram allocation/free is done through xe_ttm_vram_mgr
functions, which call into the drm_buddy layer to allocate vram blocks.
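
For reference, an svm path going straight to drm_buddy (the same interface
xe_ttm_vram_mgr uses underneath) would look roughly like the sketch below.
The drm_buddy_* calls exist today; the svm_* wrappers and the locking are
made up for illustration, and the exact signatures should be double-checked
against the tree:

#include <drm/drm_buddy.h>
#include <linux/mm.h>
#include <linux/mutex.h>

static int svm_alloc_vram(struct drm_buddy *mm, struct mutex *lock,
			  u64 size, struct list_head *blocks)
{
	int err;

	mutex_lock(lock);
	/* allocate anywhere in vram, page granularity, no contiguity required */
	err = drm_buddy_alloc_blocks(mm, 0, mm->size, size, PAGE_SIZE,
				     blocks, 0);
	mutex_unlock(lock);
	return err;
}

static void svm_free_vram(struct drm_buddy *mm, struct mutex *lock,
			  struct list_head *blocks)
{
	mutex_lock(lock);
	drm_buddy_free_list(mm, blocks);
	mutex_unlock(lock);
}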

I plan to implement the xe svm driver the same way as we did in i915,
which means there will be no bo concept in the svm implementation.
drm_buddy will be passed to the svm layer during vram initialization and
svm will allocate/free memory directly from drm_buddy, bypassing the
ttm/xe vram manager. Here are a few considerations/things we are
aware of:

  1. This approach seems to match the hmm design better than the bo
     concept. Our svm implementation will be based on hmm. In the hmm
     design, each vram page is backed by a struct page, which makes it
     very easy to perform page-granularity migrations (between vram and
     system memory). If the BO concept is involved, we will have to
     split/remerge BOs during page-granularity migrations. (A minimal
     sketch of this struct-page backing follows after this list.)

  2. We have a proof of concept of this approach in i915, originally
     implemented by Niranjana. It seems to work, but it only has basic
     functionality for now; we don't have advanced features such as
     memory eviction yet.

  3. With this approach, vram will be divided into two separate pools:
     one for xe_gem-created BOs and one for vram used by svm. Those two
     pools are not connected: memory pressure in one pool won't be able
     to evict vram from the other pool. At this point, we don't know
     whether this aspect is good or not.

  4. Amdkfd svm took a different approach, which is BO based. The
     benefit of that approach is that a lot of existing driver
     facilities (such as memory eviction/cgroup/accounting) can be
     reused.
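
As a side note to point 1 above, the struct page backing would come from
registering vram as MEMORY_DEVICE_PRIVATE memory, roughly as amdkfd and
nouveau already do. A minimal sketch (the svm_* names are placeholders and
the pagemap ops are only stubs here):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/ioport.h>
#include <linux/memremap.h>
#include <linux/mm.h>

static vm_fault_t svm_migrate_to_ram(struct vm_fault *vmf)
{
	/* placeholder: a real driver migrates the faulting page back to system memory */
	return VM_FAULT_SIGBUS;
}

static void svm_page_free(struct page *page)
{
	/* placeholder: return the backing vram to the allocator here */
}

static const struct dev_pagemap_ops svm_pagemap_ops = {
	.page_free	= svm_page_free,
	.migrate_to_ram	= svm_migrate_to_ram,
};

static int svm_init_devmem(struct device *dev, struct dev_pagemap *pgmap,
			   unsigned long vram_size)
{
	struct resource *res;

	/* carve out a physical address range to stand in for device vram */
	res = request_free_mem_region(&iomem_resource, vram_size, "svm-vram");
	if (IS_ERR(res))
		return PTR_ERR(res);

	pgmap->type = MEMORY_DEVICE_PRIVATE;
	pgmap->range.start = res->start;
	pgmap->range.end = res->end;
	pgmap->nr_range = 1;
	pgmap->ops = &svm_pagemap_ops;
	pgmap->owner = dev;	/* opaque cookie used to match pages during migration */

	/*
	 * After this, every vram page has a struct page, so migrations can be
	 * done per page with the migrate_vma_* helpers, no BO split/merge.
	 */
	return PTR_ERR_OR_ZERO(devm_memremap_pages(dev, pgmap));
}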

Do you have any comments on this approach? Should I come back with an
RFC of some POC code?

Thanks,

Oak

