> > > > Compute the numa information for a virtio_pmem device from the
> > > > memory range of the device. Previously, the target_node was
> > > > always 0 since the ndr_desc.target_node field was never
> > > > explicitly set. The code for computing the numa node is taken
> > > > from cxl_pmem_region_probe in drivers/cxl/pmem.c.
> > > >
> > > > Signed-off-by: Michael Sammler <samm...@google.com>
> > >
> > > Tested-by: Mina Almasry <almasrym...@google.com>
> > >
> > > I don't have much expertise on this driver, but with the help of
> > > this patch I was able to get memory tiering [1] emulation going on
> > > qemu. As far as I know there is no alternative to this emulation,
> > > and so I would love to see this or equivalent merged, if possible.
> > >
> > > This is what I have going to get memory tiering emulation:
> > >
> > > In qemu, added these configs:
> > > -object memory-backend-file,id=m4,share=on,mem-path="$path_to_virtio_pmem_file",size=2G \
> > > -smp 2,sockets=2,maxcpus=2 \
> > > -numa node,nodeid=0,memdev=m0 \
> > > -numa node,nodeid=1,memdev=m1 \
> > > -numa node,nodeid=2,memdev=m2,initiator=0 \
> > > -numa node,nodeid=3,initiator=0 \
> > > -device virtio-pmem-pci,memdev=m4,id=nvdimm1 \
> > >
> > > On boot, ran these commands:
> > > ndctl_static create-namespace -e namespace0.0 -m devdax -f > /dev/null 2>&1
> > > echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
> > > echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
> > > for i in `ls /sys/devices/system/memory/`; do
> > >   state=$(cat "/sys/devices/system/memory/$i/state" 2>/dev/null)
> > >   if [ "$state" == "offline" ]; then
> > >     echo online_movable > "/sys/devices/system/memory/$i/state"
> > >   fi
> > > done
> >
> > Nice to see the way to handle the virtio-pmem device memory through
> > the kmem driver and online the corresponding memory blocks to
> > 'zone_movable'.
> >
> > This also opens a way to use this memory range directly,
> > irrespective of the attached block device. Of course there won't be
> > any persistent data guarantee. But it is a good way to simulate
> > memory tiering inside the guest, as demonstrated below.
> >
> > > Without this CL, I see the memory always onlined in node 0, and it
> > > is not a separate memory tier. With this CL and qemu configs, the
> > > memory is onlined in node 3 and is set as a separate memory tier,
> > > which enables qemu-based development:
> > >
> > > ==> /sys/devices/virtual/memory_tiering/memory_tier22/nodelist <==
> > > 3
> > > ==> /sys/devices/virtual/memory_tiering/memory_tier4/nodelist <==
> > > 0-2
> > >
> > > AFAIK there is no alternative to enabling memory tiering emulation
> > > in qemu, and would love to see this or equivalent merged, if
> > > possible.
> >
> > Just wondering if Qemu vNVDIMM device can also achieve this?
> >
>
> I spent a few minutes on this. Please note I'm really not familiar
> with these drivers, but as far as I can tell the qemu vNVDIMM device
> has the same problem and needs a fix similar to what Michael did
> here. What I did with the vNVDIMM qemu device:
>
> - Added these qemu configs:
> -object memory-backend-file,id=m4,share=on,mem-path=./hello,size=2G,readonly=off \
> -device nvdimm,id=nvdimm1,memdev=m4,unarmed=off \
>
> - Ran the same commands as in my previous email (they seem to apply
> to the vNVDIMM device without modification):
> ndctl_static create-namespace -e namespace0.0 -m devdax -f > /dev/null 2>&1
> echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
> echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
> for i in `ls /sys/devices/system/memory/`; do
>   state=$(cat "/sys/devices/system/memory/$i/state" 2>/dev/null)
>   if [ "$state" == "offline" ]; then
>     echo online_movable > "/sys/devices/system/memory/$i/state"
>   fi
> done
>
> I see the memory from the vNVDIMM device get onlined on node0, and it
> is not detected as a separate memory tier. I suspect that driver
> needs a similar fix to this one.

Thanks for trying. It seems the vNVDIMM device already has an option to
provide the target node [1].

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg827765.html
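
For anyone repeating the experiment, the tier check quoted in this thread can be scripted. The following is only a sketch: the function names show_memory_tiers and node_has_own_tier are made up for illustration, and the only thing taken from the thread is the sysfs layout /sys/devices/virtual/memory_tiering/memory_tier*/nodelist. The optional directory argument is an assumption added so the helpers can be pointed at a fake tree for testing.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any patch in this thread).

# Print "<tier>: <nodelist>" for every memory tier found under the
# given root (defaults to the sysfs path quoted in the thread).
show_memory_tiers() {
    tier_root="${1:-/sys/devices/virtual/memory_tiering}"
    for tier in "$tier_root"/memory_tier*; do
        # Skip the unexpanded glob when no tier directory exists.
        [ -r "$tier/nodelist" ] || continue
        printf '%s: %s\n' "$(basename "$tier")" "$(cat "$tier/nodelist")"
    done
}

# Return 0 only if some tier's nodelist consists of exactly the given
# node, i.e. the node sits in a tier of its own.
node_has_own_tier() {
    node="$1"
    tier_root="${2:-/sys/devices/virtual/memory_tiering}"
    for tier in "$tier_root"/memory_tier*; do
        [ -r "$tier/nodelist" ] || continue
        if [ "$(cat "$tier/nodelist")" = "$node" ]; then
            return 0
        fi
    done
    return 1
}
```

With the qemu configuration quoted above, running show_memory_tiers and then "node_has_own_tier 3 && echo separate tier" in the guest would be one way to tell the patched and unpatched behaviour apart, assuming the pmem memory is expected on node 3 as in Mina's setup.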