Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-04 Thread Christian König
On 04.12.23 at 00:32, Alistair Popple wrote:
> Christian König writes:
>> On 01.12.23 at 06:48, Zeng, Oak wrote:
>>> [SNIP]
>>> Besides memory eviction/oversubscription, there are a few other pain points when I use hmm:
>>> 1) hmm doesn't support file-back memory, so it is hard to share memory b/t

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-03 Thread Alistair Popple
Christian König writes:
> On 01.12.23 at 06:48, Zeng, Oak wrote:
>> [SNIP]
>> Besides memory eviction/oversubscription, there are a few other pain points
>> when I use hmm:
>>
>> 1) hmm doesn't support file-back memory, so it is hard to share
> memory b/t process in a gpu environment. You

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-01 Thread Philipp Stanner
On Fri, 2023-12-01 at 02:37 +, zhuweixi wrote:
> From your argument on KVM I can see that the biggest miscommunication
> between us is that you believed that GMEM wanted to share the whole
> address space. No, it is not the case. GMEM is only providing
> coordination via certain mmap() calls.

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-01 Thread Christian König
On 01.12.23 at 06:48, Zeng, Oak wrote: [SNIP] 3. MMU notifiers register hooks at certain core MM events, while GMEM declares basic functions and internally invokes them. GMEM requires less from the driver side -- no need to understand how core MM behaves at certain MMU events. GMEM also

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-01 Thread David Hildenbrand
On 01.12.23 03:44, zhuweixi wrote:
> Thanks!
I hope you understood that that was a joke :)
> I am planning to present GMEM in Linux MM Alignment Sessions so I can collect more input from the mm developers.
Sounds good. But please try inviting key HMM/driver developers as well. Most of the

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Alistair Popple
"Zeng, Oak" writes: > See inline comments > >> -Original Message- >> From: dri-devel On Behalf Of >> zhuweixi >> Sent: Thursday, November 30, 2023 5:48 AM >> To: Christian König ; Zeng, Oak >> ; Christian König ; linux- >> m...@kvack.org; linux-ker...@vger.kernel.org;

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Alistair Popple
zhuweixi writes:
> Glad to know that there is a common demand for a new syscall like
> hmadvise(). I expect it would also be useful for homogeneous NUMA
> cases. Credits to cudaMemAdvise() API which brought this idea to
> GMEM's design.
It's not clear to me that this would need to be a new

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Zeng, Oak
See inline comments
> -----Original Message-----
> From: dri-devel On Behalf Of zhuweixi
> Sent: Thursday, November 30, 2023 5:48 AM
> To: Christian König; Zeng, Oak; Christian König; linux-m...@kvack.org; linux-ker...@vger.kernel.org; a...@linux-foundation.org; Danilo Krummrich;

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread zhuweixi
Thanks! I am planning to present GMEM in Linux MM Alignment Sessions so I can collect more input from the mm developers. @Christian @Oak I will also send you invitations once a presentation is scheduled. :)
-Weixi
-----Original Message-----
From: David Hildenbrand
Sent: Thursday, November

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread zhuweixi
From your argument on KVM I can see that the biggest miscommunication between us is that you believed that GMEM wanted to share the whole address space. No, it is not the case. GMEM is only providing coordination via certain mmap() calls. So you are raising a case supporting GMEM again --

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread David Hildenbrand
On 29.11.23 09:27, zhuweixi wrote:
> Glad to hear that more sharable code is desirable. IMHO, for a common MM subsystem, it is more beneficial for GMEM to extend core MM instead of building a separate one.
More core-mm complexity, awesome, we all love that! ;)
--
Cheers,
David / dhildenb

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Christian König
On 30.11.23 at 08:22, zhuweixi wrote:
> Add @Oak to the KFD discussion. I will reply separately elaborating your questions on GMEM's difference from HMM/MMU notifiers. Christian, thanks for pointing me to that AMDKFD discussion. I have read the discussion around the AMDKFD skeleton patch and

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread zhuweixi
Glad to know that there is a common demand for a new syscall like hmadvise(). I expect it would also be useful for homogeneous NUMA cases. Credits to cudaMemAdvise() API which brought this idea to GMEM's design. To answer @Oak's questions about GMEM vs. HMM, here is the major difference:

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-30 Thread Christian König
Hi Oak, yeah, #4 is indeed a really good point and I think Felix will agree to that as well. HMM is basically still missing a way to advise device attributes for the CPU address space. Both migration strategy as well as device specific information (like cache preferences) fall into this

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-29 Thread zhuweixi
Add @Oak to the KFD discussion. I will reply separately elaborating your questions on GMEM's difference from HMM/MMU notifiers. Christian, thanks for pointing me to that AMDKFD discussion. I have read the discussion around the AMDKFD skeleton patch and found the previous discussion in the

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-29 Thread Zeng, Oak
Hi Weixi, Even though Christian has listed reasons for rejecting this proposal (yes, they are very reasonable to me), I would open my mind and further explore the possibility here. Since the current GPU driver uses an HMM-based implementation (AMD and NV have done this; at Intel we are catching up),

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-29 Thread Christian König
On 29.11.23 at 09:27, zhuweixi wrote:
> Glad to hear that more sharable code is desirable. IMHO, for a common MM subsystem, it is more beneficial for GMEM to extend core MM instead of building a separate one. As stated in the beginning of my RFC letter, MM systems are large and similar. Even a

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-29 Thread zhuweixi
Glad to hear that more sharable code is desirable. IMHO, for a common MM subsystem, it is more beneficial for GMEM to extend core MM instead of building a separate one. As stated in the beginning of my RFC letter, MM systems are large and similar. Even a sophisticated one like Linux MM that

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-28 Thread Dave Airlie
On Tue, 28 Nov 2023 at 23:07, Christian König wrote:
>
> On 28.11.23 at 13:50, Weixi Zhu wrote:
> > The problem:
> >
> > Accelerator driver developers are forced to reinvent external MM subsystems
> > case by case, because Linux core MM only considers host memory resources.
> > These reinvented

[Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-28 Thread Weixi Zhu
The problem: Accelerator driver developers are forced to reinvent external MM subsystems case by case, because Linux core MM only considers host memory resources. These reinvented MM subsystems have similar orders of magnitude of LoC as Linux MM (80K), e.g. Nvidia-UVM has 70K, AMD GPU has 14K and

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-28 Thread Christian König
Adding a few missing important people to the explicit to list.
On 28.11.23 at 13:50, Weixi Zhu wrote:
> The problem: Accelerator driver developers are forced to reinvent external MM subsystems case by case, because Linux core MM only considers host memory resources. These reinvented MM

Re: [Intel-gfx] [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-11-28 Thread Christian König
On 28.11.23 at 13:50, Weixi Zhu wrote:
> The problem: Accelerator driver developers are forced to reinvent external MM subsystems case by case, because Linux core MM only considers host memory resources. These reinvented MM subsystems have similar orders of magnitude of LoC as Linux MM (80K),