Re: [PATCH v2 0/2] Initialize backend memory objects in parallel

2024-01-29 Thread Mark Kanda

On 1/29/24 1:11 PM, David Hildenbrand wrote:

On 22.01.24 16:32, Mark Kanda wrote:

v2:
- require MADV_POPULATE_WRITE (simplify the implementation)
- require prealloc context threads to ensure optimal thread placement
- use machine phase 'initialized' to detremine when to allow parallel 
init


QEMU initializes preallocated backend memory when parsing the 
corresponding
objects from the command line. In certain scenarios, such as memory 
being
preallocated across multiple numa nodes, this approach is not optimal 
due to

the unnecessary serialization.

This series addresses this issue by initializing the backend memory 
objects in

parallel.


I just played with the code, some comments:

* I suggest squashing both patches. It doesn't make things clearer if we
  factor out unconditionally adding contexts to a global list.

* Keep the functions MT-capable, at least as long as async=false. That
  is, don't involve the global list if async=false. virtio-mem will
  perform preallocation from other threads at some point, where we could
  see concurrent preallocations for different devices. I made sure that
  qemu_mem_prealloc() can handle that.

* Rename wait_mem_prealloc() to qemu_finish_async_mem_prealloc() and let
  it report the error / return true/false like qemu_prealloc_mem().
  Especially, don't change the existing
  "qemu_prealloc_mem: preallocating memory failed" error message.

* Do the conditional async=false fixup in touch_all_pages(). That means,
  in qemu_prealloc_mem(), only route the async parameter through.


One thing I don't quite like is what happens when multiple threads 
would try

issuing "async=true". It will currently not happen, but we should catch
whenever that happens and require that only one thread at a time can
perform async preallocs. Maybe we can assert in qemu_prealloc_mem()/
qemu_finish_async_mem_prealloc() that we hold the BQL. Hopefully, that
is the case when we start creating memory backends, before starting the
main loop. If not, maybe we should just document that async limitation.

Ideally, we'd have some async_start(), prealloc(), prealloc(),
async_finish() interface, where async_start() would block until
another thread called async_finish(), so we never have a mixture.
But that would currently be over-engineering.


I'll attach the untested, likely broken, code I played with to see
what it could look like. Observe how I only conditionally add the
context to the list at the end of touch_all_pages().



Thank you very much for the feedback David. I'll take a close look at 
this for v3.


Best regards,
-Mark



From fe26cc5252f1284efa8e667310609a22c6166324 Mon Sep 17 00:00:00 2001
From: Mark Kanda 
Date: Mon, 22 Jan 2024 09:32:18 -0600
Subject: [PATCH] oslib-posix: initialize selected backend memory 
objects in

 parallel

QEMU initializes preallocated backend memory as the objects are parsed 
from

the command line. This is not optimal in some cases (e.g. memory spanning
multiple NUMA nodes) because the memory objects are initialized in 
series.


Allow the initialization to occur in parallel. In order to ensure optimal
thread placement, parallel initialization requires prealloc context 
threads

to be in use.

Signed-off-by: Mark Kanda 
Signed-off-by: David Hildenbrand 
---
 backends/hostmem.c |   8 ++-
 hw/virtio/virtio-mem.c |   4 +-
 include/qemu/osdep.h   |  19 +-
 system/vl.c    |   8 +++
 util/oslib-posix.c | 130 +++--
 util/oslib-win32.c |   8 ++-
 6 files changed, 140 insertions(+), 37 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 30f69b2cb5..8f602dc86f 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -20,6 +20,7 @@
 #include "qom/object_interfaces.h"
 #include "qemu/mmap-alloc.h"
 #include "qemu/madvise.h"
+#include "hw/qdev-core.h"

 #ifdef CONFIG_NUMA
 #include 
@@ -235,9 +236,10 @@ static void 
host_memory_backend_set_prealloc(Object *obj, bool value,

 int fd = memory_region_get_fd(>mr);
 void *ptr = memory_region_get_ram_ptr(>mr);
 uint64_t sz = memory_region_size(>mr);
+    bool async = !phase_check(PHASE_MACHINE_INITIALIZED);

 if (!qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
-   backend->prealloc_context, errp)) {
+   backend->prealloc_context, async, 
errp)) {

 return;
 }
 backend->prealloc = true;
@@ -323,6 +325,7 @@ host_memory_backend_memory_complete(UserCreatable 
*uc, Error **errp)

 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
 void *ptr;
 uint64_t sz;
+    bool async = !phase_check(PHASE_MACHINE_INITIALIZED);

 if (!bc->alloc) {
 return;
@@ -398,7 +401,8 @@ host_memory_backend_memory_complete(UserCreatable 
*uc, Error **errp)
 if (backend->prealloc && 
!qemu_prealloc_mem(memory_region_get_fd(>mr),

 ptr, sz,

Re: [PATCH v2 0/2] Initialize backend memory objects in parallel

2024-01-29 Thread David Hildenbrand

On 22.01.24 16:32, Mark Kanda wrote:

v2:
- require MADV_POPULATE_WRITE (simplify the implementation)
- require prealloc context threads to ensure optimal thread placement
- use machine phase 'initialized' to detremine when to allow parallel init

QEMU initializes preallocated backend memory when parsing the corresponding
objects from the command line. In certain scenarios, such as memory being
preallocated across multiple numa nodes, this approach is not optimal due to
the unnecessary serialization.

This series addresses this issue by initializing the backend memory objects in
parallel.


I just played with the code, some comments:

* I suggest squashing both patches. It doesn't make things clearer if we
  factor out unconditionally adding contexts to a global list.

* Keep the functions MT-capable, at least as long as async=false. That
  is, don't involve the global list if async=false. virtio-mem will
  perform preallocation from other threads at some point, where we could
  see concurrent preallocations for different devices. I made sure that
  qemu_mem_prealloc() can handle that.

* Rename wait_mem_prealloc() to qemu_finish_async_mem_prealloc() and let
  it report the error / return true/false like qemu_prealloc_mem().
  Especially, don't change the existing
  "qemu_prealloc_mem: preallocating memory failed" error message.

* Do the conditional async=false fixup in touch_all_pages(). That means,
  in qemu_prealloc_mem(), only route the async parameter through.


One thing I don't quite like is what happens when multiple threads would try
issuing "async=true". It will currently not happen, but we should catch
whenever that happens and require that only one thread at a time can
perform async preallocs. Maybe we can assert in qemu_prealloc_mem()/
qemu_finish_async_mem_prealloc() that we hold the BQL. Hopefully, that
is the case when we start creating memory backends, before starting the
main loop. If not, maybe we should just document that async limitation.

Ideally, we'd have some async_start(), prealloc(), prealloc(),
async_finish() interface, where async_start() would block until
another thread called async_finish(), so we never have a mixture.
But that would currently be over-engineering.


I'll attach the untested, likely broken, code I played with to see
what it could look like. Observe how I only conditionally add the
context to the list at the end of touch_all_pages().


From fe26cc5252f1284efa8e667310609a22c6166324 Mon Sep 17 00:00:00 2001
From: Mark Kanda 
Date: Mon, 22 Jan 2024 09:32:18 -0600
Subject: [PATCH] oslib-posix: initialize selected backend memory objects in
 parallel

QEMU initializes preallocated backend memory as the objects are parsed from
the command line. This is not optimal in some cases (e.g. memory spanning
multiple NUMA nodes) because the memory objects are initialized in series.

Allow the initialization to occur in parallel. In order to ensure optimal
thread placement, parallel initialization requires prealloc context threads
to be in use.

Signed-off-by: Mark Kanda 
Signed-off-by: David Hildenbrand 
---
 backends/hostmem.c |   8 ++-
 hw/virtio/virtio-mem.c |   4 +-
 include/qemu/osdep.h   |  19 +-
 system/vl.c|   8 +++
 util/oslib-posix.c | 130 +++--
 util/oslib-win32.c |   8 ++-
 6 files changed, 140 insertions(+), 37 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 30f69b2cb5..8f602dc86f 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -20,6 +20,7 @@
 #include "qom/object_interfaces.h"
 #include "qemu/mmap-alloc.h"
 #include "qemu/madvise.h"
+#include "hw/qdev-core.h"
 
 #ifdef CONFIG_NUMA

 #include 
@@ -235,9 +236,10 @@ static void host_memory_backend_set_prealloc(Object *obj, 
bool value,
 int fd = memory_region_get_fd(>mr);
 void *ptr = memory_region_get_ram_ptr(>mr);
 uint64_t sz = memory_region_size(>mr);
+bool async = !phase_check(PHASE_MACHINE_INITIALIZED);
 
 if (!qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,

-   backend->prealloc_context, errp)) {
+   backend->prealloc_context, async, errp)) {
 return;
 }
 backend->prealloc = true;
@@ -323,6 +325,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
 void *ptr;
 uint64_t sz;
+bool async = !phase_check(PHASE_MACHINE_INITIALIZED);
 
 if (!bc->alloc) {

 return;
@@ -398,7 +401,8 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 if (backend->prealloc && 
!qemu_prealloc_mem(memory_region_get_fd(>mr),
 ptr, sz,
 backend->prealloc_threads,
-backend->prealloc_context, 
errp)) {
+  

Re: [PATCH v2 0/2] Initialize backend memory objects in parallel

2024-01-29 Thread David Hildenbrand

On 29.01.24 14:39, Mark Kanda wrote:

Ping.

Any comments?



Sorry, fell through the cracks, will review this soonish.

--
Cheers,

David / dhildenb




Re: [PATCH v2 0/2] Initialize backend memory objects in parallel

2024-01-29 Thread Mark Kanda

Ping.

Any comments?

Thanks/regards,
-Mark

On 1/22/24 9:32 AM, Mark Kanda wrote:

v2:
- require MADV_POPULATE_WRITE (simplify the implementation)
- require prealloc context threads to ensure optimal thread placement
- use machine phase 'initialized' to detremine when to allow parallel init

QEMU initializes preallocated backend memory when parsing the corresponding
objects from the command line. In certain scenarios, such as memory being
preallocated across multiple numa nodes, this approach is not optimal due to
the unnecessary serialization.

This series addresses this issue by initializing the backend memory objects in
parallel.

Mark Kanda (2):
   oslib-posix: refactor memory prealloc threads
   oslib-posix: initialize backend memory objects in parallel

  backends/hostmem.c |   8 +++-
  hw/virtio/virtio-mem.c |   4 +-
  include/qemu/osdep.h   |  14 +-
  system/vl.c|   6 +++
  util/oslib-posix.c | 103 -
  util/oslib-win32.c |   8 +++-
  6 files changed, 103 insertions(+), 40 deletions(-)






[PATCH v2 0/2] Initialize backend memory objects in parallel

2024-01-22 Thread Mark Kanda
v2:
- require MADV_POPULATE_WRITE (simplify the implementation)
- require prealloc context threads to ensure optimal thread placement
- use machine phase 'initialized' to detremine when to allow parallel init

QEMU initializes preallocated backend memory when parsing the corresponding
objects from the command line. In certain scenarios, such as memory being
preallocated across multiple numa nodes, this approach is not optimal due to
the unnecessary serialization.

This series addresses this issue by initializing the backend memory objects in
parallel.

Mark Kanda (2):
  oslib-posix: refactor memory prealloc threads
  oslib-posix: initialize backend memory objects in parallel

 backends/hostmem.c |   8 +++-
 hw/virtio/virtio-mem.c |   4 +-
 include/qemu/osdep.h   |  14 +-
 system/vl.c|   6 +++
 util/oslib-posix.c | 103 -
 util/oslib-win32.c |   8 +++-
 6 files changed, 103 insertions(+), 40 deletions(-)

-- 
2.39.3