Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On 3/17/20 4:23 AM, Stefan Hajnoczi wrote:
>> Code is posted here
>> https://github.com/AndrzejJakowski/qemu/commit/3a7762a1d13ff1543d1da430748eb24e38faab6f
>>
>> QEMU command line:
>>
>> # below are just relevant pieces of configuration, other stuff omitted
>> # tried different setting (e.g. pmem=on and pmem=off)
>>
>> ./x86_64-softmmu/qemu-system-x86_64 ... \
>>   -object memory-backend-file,id=mem1,share=off,pmem=on,mem-path=../nvme_pmr.bin,size=$((1*1024*1024)) \
>
> share=off is MAP_PRIVATE. If persistence is desired then share=on
> should be used.
>
> However, this shouldn't affect "system_reset" behavior since the QEMU
> process still has the same mapped file open.

Hi Stefan,

Thx!! The share=off setting was the problem. I confirmed with my simple
test that persistence is achieved. I didn't find an API to perform a
flush (msync). Any suggestion what function to use?

Given that the host memory backend is working I think my patch is almost
ready for resubmission -- let me know if there are any other comments.

Andrzej

>> -drive file=../nvme.bin,format=raw,if=none,id=nvme_emulated \
>> -device nvme,drive=nvme_emulated,serial="test serial",pmrdev=mem1
>>
>> In VM:
>> My persistent memory region is exposed as PCI BAR
>> Region 2: Memory at fe000000 (64-bit, prefetchable) [size=1M]
>>
>> So I perform reads/writes from/to following address 0xfe000000 (decimal
>> 4261412864)
>>
>> dd if=test.bin of=/dev/mem bs=1 count=30 seek=4261412864
>> dd if=/dev/mem of=test1.bin bs=1 count=30 skip=4261412864
>
> Did you verify that the guest kernel is really accessing the BAR? I
> remember that distro kernels often ship with options that make
> /dev/mem of limited use because it's considered insecure.
>
>> On VMM I didn't observe that backing file has been updated and after power
>> cycling VM I see old junk when reading PMR region.
>
> Did you check that the pmrdev mmap region contains the data the guest
> wrote before power cycling?
>
>> Also from include/qemu/pmem.h it looks like pmem_persist() will cause qemu
>> to exit if libpmem is not installed:
>
> The libpmem support only needs to be used when the pmem=on option was
> given. If there isn't a physical pmem device then it doesn't need to
> be used.
>
> Stefan
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On Mon, Mar 16, 2020 at 5:10 PM Andrzej Jakowski wrote:
> On 3/16/20 4:32 AM, Stefan Hajnoczi wrote:
> > On Wed, Mar 11, 2020 at 11:08:27PM -0700, Klaus Birkelund Jensen wrote:
> >> On Mar 11 15:54, Andrzej Jakowski wrote:
> >>> On 3/11/20 2:20 AM, Stefan Hajnoczi wrote:
> >>>> Please try:
> >>>>
> >>>> $ git grep pmem
> >>>>
> >>>> backends/hostmem-file.c is the backend that can be used and the
> >>>> pmem_persist() API can be used to flush writes.
> >>> I've reworked this patch into hostmem-file type of backend.
> >>> From simple tests in virtual machine: writing to PMR region
> >>> and then reading from it after VM power cycle I have observed that
> >>> there is no persistency.
> > Sounds like an integration bug. QEMU's NVDIMM emulation uses
> > HostMemoryBackend and file contents survive guest reboot.
> >
> > If you would like help debugging this, please post a link to the code
> > and the command-line that you are using.
>
> Code is posted here
> https://github.com/AndrzejJakowski/qemu/commit/3a7762a1d13ff1543d1da430748eb24e38faab6f
>
> QEMU command line:
>
> # below are just relevant pieces of configuration, other stuff omitted
> # tried different setting (e.g. pmem=on and pmem=off)
>
> ./x86_64-softmmu/qemu-system-x86_64 ... \
>   -object memory-backend-file,id=mem1,share=off,pmem=on,mem-path=../nvme_pmr.bin,size=$((1*1024*1024)) \

share=off is MAP_PRIVATE. If persistence is desired then share=on
should be used.

However, this shouldn't affect "system_reset" behavior since the QEMU
process still has the same mapped file open.

> -drive file=../nvme.bin,format=raw,if=none,id=nvme_emulated \
> -device nvme,drive=nvme_emulated,serial="test serial",pmrdev=mem1
>
> In VM:
> My persistent memory region is exposed as PCI BAR
> Region 2: Memory at fe000000 (64-bit, prefetchable) [size=1M]
>
> So I perform reads/writes from/to following address 0xfe000000 (decimal
> 4261412864)
>
> dd if=test.bin of=/dev/mem bs=1 count=30 seek=4261412864
> dd if=/dev/mem of=test1.bin bs=1 count=30 skip=4261412864

Did you verify that the guest kernel is really accessing the BAR? I
remember that distro kernels often ship with options that make
/dev/mem of limited use because it's considered insecure.

> On VMM I didn't observe that backing file has been updated and after power
> cycling VM I see old junk when reading PMR region.

Did you check that the pmrdev mmap region contains the data the guest
wrote before power cycling?

> Also from include/qemu/pmem.h it looks like pmem_persist() will cause qemu
> to exit if libpmem is not installed:

The libpmem support only needs to be used when the pmem=on option was
given. If there isn't a physical pmem device then it doesn't need to
be used.

Stefan
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On 3/16/20 4:32 AM, Stefan Hajnoczi wrote:
> On Wed, Mar 11, 2020 at 11:08:27PM -0700, Klaus Birkelund Jensen wrote:
>> On Mar 11 15:54, Andrzej Jakowski wrote:
>>> On 3/11/20 2:20 AM, Stefan Hajnoczi wrote:
>>>> Please try:
>>>>
>>>> $ git grep pmem
>>>>
>>>> backends/hostmem-file.c is the backend that can be used and the
>>>> pmem_persist() API can be used to flush writes.
>>> I've reworked this patch into hostmem-file type of backend.
>>> From simple tests in virtual machine: writing to PMR region
>>> and then reading from it after VM power cycle I have observed that
>>> there is no persistency.
> Sounds like an integration bug. QEMU's NVDIMM emulation uses
> HostMemoryBackend and file contents survive guest reboot.
>
> If you would like help debugging this, please post a link to the code
> and the command-line that you are using.

Code is posted here
https://github.com/AndrzejJakowski/qemu/commit/3a7762a1d13ff1543d1da430748eb24e38faab6f

QEMU command line:

# below are just relevant pieces of configuration, other stuff omitted
# tried different setting (e.g. pmem=on and pmem=off)

./x86_64-softmmu/qemu-system-x86_64 ... \
  -object memory-backend-file,id=mem1,share=off,pmem=on,mem-path=../nvme_pmr.bin,size=$((1*1024*1024)) \
  -drive file=../nvme.bin,format=raw,if=none,id=nvme_emulated \
  -device nvme,drive=nvme_emulated,serial="test serial",pmrdev=mem1

In VM:
My persistent memory region is exposed as PCI BAR
Region 2: Memory at fe000000 (64-bit, prefetchable) [size=1M]

So I perform reads/writes from/to following address 0xfe000000 (decimal
4261412864)

dd if=test.bin of=/dev/mem bs=1 count=30 seek=4261412864
dd if=/dev/mem of=test1.bin bs=1 count=30 skip=4261412864

On VMM I didn't observe that backing file has been updated and after power
cycling VM I see old junk when reading PMR region.

Also from include/qemu/pmem.h it looks like pmem_persist() will cause qemu
to exit if libpmem is not installed:

#ifndef QEMU_PMEM_H
#define QEMU_PMEM_H

#ifdef CONFIG_LIBPMEM
#include <libpmem.h>
#else  /* !CONFIG_LIBPMEM */

static inline void *
pmem_memcpy_persist(void *pmemdest, const void *src, size_t len)
{
    /* If 'pmem' option is 'on', we should always have libpmem support,
       or qemu will report a error and exit, never come here. */
    g_assert_not_reached();
    return NULL;
}

static inline void
pmem_persist(const void *addr, size_t len)
{
    g_assert_not_reached();
}

#endif /* CONFIG_LIBPMEM */

#endif /* QEMU_PMEM_H */
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On Wed, Mar 11, 2020 at 11:08:27PM -0700, Klaus Birkelund Jensen wrote:
> On Mar 11 15:54, Andrzej Jakowski wrote:
> > On 3/11/20 2:20 AM, Stefan Hajnoczi wrote:
> > > Please try:
> > >
> > > $ git grep pmem
> > >
> > > backends/hostmem-file.c is the backend that can be used and the
> > > pmem_persist() API can be used to flush writes.
> >
> > I've reworked this patch into hostmem-file type of backend.
> > From simple tests in virtual machine: writing to PMR region
> > and then reading from it after VM power cycle I have observed that
> > there is no persistency.

Sounds like an integration bug. QEMU's NVDIMM emulation uses
HostMemoryBackend and file contents survive guest reboot.

If you would like help debugging this, please post a link to the code
and the command-line that you are using.

> > I guess that persistent behavior can be achieved if memory backend file
> > resides on actual persistent memory in VMM. I haven't found mechanism to
> > persist memory backend file when it resides in the file system on block
> > storage. My original mmap + msync based solution worked well there.
> > I believe that main problem with mmap was with "ifdef _WIN32" that made it
> > platform specific and w/o it patchew CI complained.
> > Is there a way that I could rework mmap + msync solution so it would fit
> > into qemu design?
>
> Hi Andrzej,
>
> Thanks for working on this!
>
> FWIW, I have implemented other stuff for the NVMe device that requires
> persistent storage (e.g. LBA allocation tracking for DULBE support). I
> used the approach of adding an additional blockdev and simply use the
> qemu block layer. This would also make it work on WIN32. And if we just
> set bit 0 in PMRWBM and disable the write cache on the blockdev we
> should be good on the durability requirements.
>
> Unfortunately, I do not see (or know, maybe Stefan has an idea?) an easy
> way of using the MemoryRegionOps nicely with async block backend i/o, so
> we either have to use blocking I/O or fire and forget aio. Or, we can
> maybe keep bit 1 set in PMRWBM and force a blocking blk_flush on PMRSTS
> read.

QEMU's block layer does not support persistent memory semantics and
doesn't support mmap. It's fine for storing state from device emulation
code, but if the guest itself requires memory load/store access to the
data then the QEMU block layer does not provide that.

For PMR I think HostMemoryBackend is the best fit.

Stefan
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On Mar 11 15:54, Andrzej Jakowski wrote:
> On 3/11/20 2:20 AM, Stefan Hajnoczi wrote:
> > Please try:
> >
> > $ git grep pmem
> >
> > backends/hostmem-file.c is the backend that can be used and the
> > pmem_persist() API can be used to flush writes.
>
> I've reworked this patch into hostmem-file type of backend.
> From simple tests in virtual machine: writing to PMR region
> and then reading from it after VM power cycle I have observed that
> there is no persistency.
>
> I guess that persistent behavior can be achieved if memory backend file
> resides on actual persistent memory in VMM. I haven't found mechanism to
> persist memory backend file when it resides in the file system on block
> storage. My original mmap + msync based solution worked well there.
> I believe that main problem with mmap was with "ifdef _WIN32" that made it
> platform specific and w/o it patchew CI complained.
> Is there a way that I could rework mmap + msync solution so it would fit
> into qemu design?

Hi Andrzej,

Thanks for working on this!

FWIW, I have implemented other stuff for the NVMe device that requires
persistent storage (e.g. LBA allocation tracking for DULBE support). I
used the approach of adding an additional blockdev and simply use the
qemu block layer. This would also make it work on WIN32. And if we just
set bit 0 in PMRWBM and disable the write cache on the blockdev we
should be good on the durability requirements.

Unfortunately, I do not see (or know, maybe Stefan has an idea?) an easy
way of using the MemoryRegionOps nicely with async block backend i/o, so
we either have to use blocking I/O or fire and forget aio. Or, we can
maybe keep bit 1 set in PMRWBM and force a blocking blk_flush on PMRSTS
read.

Finally, a thing to consider is that this is adding an optional NVMe 1.4
feature to an already frankenstein device that doesn't even implement
mandatory v1.2. I think that bumping the NVMe version to 1.4 is out of
the question until we actually implement it fully wrt. mandatory
features. My patchset brings the device up to v1.3 and I have v1.4 ready
for posting, so I think we can get there.

Klaus
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On 3/11/20 2:20 AM, Stefan Hajnoczi wrote:
> Please try:
>
> $ git grep pmem
>
> backends/hostmem-file.c is the backend that can be used and the
> pmem_persist() API can be used to flush writes.

I've reworked this patch into hostmem-file type of backend.
From simple tests in virtual machine: writing to PMR region
and then reading from it after VM power cycle I have observed that
there is no persistency.

I guess that persistent behavior can be achieved if memory backend file
resides on actual persistent memory in VMM. I haven't found mechanism to
persist memory backend file when it resides in the file system on block
storage. My original mmap + msync based solution worked well there.
I believe that main problem with mmap was with "ifdef _WIN32" that made it
platform specific and w/o it patchew CI complained.
Is there a way that I could rework mmap + msync solution so it would fit
into qemu design?
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On 3/11/20 2:20 AM, Stefan Hajnoczi wrote:
> Oh, I think I see what you mean. That is not how the term
> "preallocated" is usually used in POSIX file systems. File systems
> have sparse files by default and the term preallocation is used in the
> context of fadvise(2) for reserving space.
>
> In this case I think you're saying the file cannot grow. That is
> implicit since the BAR can't grow either so you could drop the comment
> about preallocation.

Yes, there is no need to have the file preallocated in the POSIX meaning.
The actual requirement is to have a file that is a multiple of MiB and a
power of two in size. The user may (but need not) use fallocate/fadvise to
fulfill this requirement.
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On Tue, Mar 10, 2020 at 8:09 PM Andrzej Jakowski wrote:
> On 3/10/20 2:51 AM, Stefan Hajnoczi wrote:
> > On Fri, Mar 06, 2020 at 03:38:53PM -0700, Andrzej Jakowski wrote:
> >> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> >> index d28335cbf3..ff7e74d765 100644
> >> --- a/hw/block/nvme.c
> >> +++ b/hw/block/nvme.c
> >> @@ -19,10 +19,14 @@
> >>  *  -drive file=<file>,if=none,id=<drive_id>
> >>  *  -device nvme,drive=<drive_id>,serial=<serial>,id=<id>, \
> >>  *      cmb_size_mb=<cmb_size_mb>, \
> >> + *      [pmr_file=<pmr_file>,] \
> >>  *      num_queues=<N>
> >>  *
> >>  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
> >>  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
> >> + *
> >> + * Either cmb or pmr - due to limitation in avaialbe BAR indexes.
> >
> > s/avaialbe/available/
> >
> >> + * pmr_file file needs to be preallocated and power of two in size.
> >
> > Why does it need to be preallocated?
>
> PMR file is mmaped into address space. If memory accesses are made outside of
> file then SIGBUS signal is raised. Preallocation requirement was introduced
> to prevent this situation.

Oh, I think I see what you mean. That is not how the term
"preallocated" is usually used in POSIX file systems. File systems
have sparse files by default and the term preallocation is used in the
context of fadvise(2) for reserving space.

In this case I think you're saying the file cannot grow. That is
implicit since the BAR can't grow either so you could drop the comment
about preallocation.

> >>  */
> >>
> >>  #include "qemu/osdep.h"
> >> @@ -1141,6 +1145,28 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
> >>          NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
> >>                         "invalid write to read only CMBSZ, ignored");
> >>          return;
> >> +#ifndef _WIN32
> >
> > This ifdef is a hint that the layering is not right. QEMU device models
> > usually only implement the "frontend" device registers, interrupts, and
> > request processing logic. The platform-specific host "backend"
> > (mmapping files, sending network packets, audio/graphics APIs, etc) is
> > implemented separately.
>
> Agree. I couldn't find QEMU backend ensuring persistence - thus decided to
> go with mmap.

Please try:

$ git grep pmem

backends/hostmem-file.c is the backend that can be used and the
pmem_persist() API can be used to flush writes.
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On 3/10/20 2:51 AM, Stefan Hajnoczi wrote:
> On Fri, Mar 06, 2020 at 03:38:53PM -0700, Andrzej Jakowski wrote:
>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>> index d28335cbf3..ff7e74d765 100644
>> --- a/hw/block/nvme.c
>> +++ b/hw/block/nvme.c
>> @@ -19,10 +19,14 @@
>>  *  -drive file=<file>,if=none,id=<drive_id>
>>  *  -device nvme,drive=<drive_id>,serial=<serial>,id=<id>, \
>>  *      cmb_size_mb=<cmb_size_mb>, \
>> + *      [pmr_file=<pmr_file>,] \
>>  *      num_queues=<N>
>>  *
>>  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
>>  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
>> + *
>> + * Either cmb or pmr - due to limitation in avaialbe BAR indexes.
>
> s/avaialbe/available/
>
>> + * pmr_file file needs to be preallocated and power of two in size.
>
> Why does it need to be preallocated?

PMR file is mmaped into address space. If memory accesses are made outside of
file then SIGBUS signal is raised. Preallocation requirement was introduced
to prevent this situation.

>>  */
>>
>>  #include "qemu/osdep.h"
>> @@ -1141,6 +1145,28 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
>>          NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
>>                         "invalid write to read only CMBSZ, ignored");
>>          return;
>> +#ifndef _WIN32
>
> This ifdef is a hint that the layering is not right. QEMU device models
> usually only implement the "frontend" device registers, interrupts, and
> request processing logic. The platform-specific host "backend"
> (mmapping files, sending network packets, audio/graphics APIs, etc) is
> implemented separately.

Agree. I couldn't find QEMU backend ensuring persistence - thus decided to
go with mmap.

> In the previous version I asked NVDIMM folks to review this patch and
> suggest how to use the same HostMemoryBackend (see
> include/sysemu/hostmem.h) that is already used for NVDIMM emulation.
>
> That seems cleaner than baking platform-specific memory mapped file I/O
> into hw/block/nvme.c, and it will also add a few features that this
> patch does not have.
>
> If NVDIMM folks don't respond to this email, would you be able to
> research backends/hostmem*.c and try to integrate it? If you feel lost
> I can help but it will require me to spend time investigating how that
> stuff works again :).

Yes, I can research this topic. Does HostMemoryBackend provide persistence?
Re: [PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
On Fri, Mar 06, 2020 at 03:38:53PM -0700, Andrzej Jakowski wrote:
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index d28335cbf3..ff7e74d765 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -19,10 +19,14 @@
>  *  -drive file=<file>,if=none,id=<drive_id>
>  *  -device nvme,drive=<drive_id>,serial=<serial>,id=<id>, \
>  *      cmb_size_mb=<cmb_size_mb>, \
> + *      [pmr_file=<pmr_file>,] \
>  *      num_queues=<N>
>  *
>  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
>  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
> + *
> + * Either cmb or pmr - due to limitation in avaialbe BAR indexes.

s/avaialbe/available/

> + * pmr_file file needs to be preallocated and power of two in size.

Why does it need to be preallocated?

>  */
>
>  #include "qemu/osdep.h"
> @@ -1141,6 +1145,28 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
>          NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
>                         "invalid write to read only CMBSZ, ignored");
>          return;
> +#ifndef _WIN32

This ifdef is a hint that the layering is not right. QEMU device models
usually only implement the "frontend" device registers, interrupts, and
request processing logic. The platform-specific host "backend"
(mmapping files, sending network packets, audio/graphics APIs, etc) is
implemented separately.

In the previous version I asked NVDIMM folks to review this patch and
suggest how to use the same HostMemoryBackend (see
include/sysemu/hostmem.h) that is already used for NVDIMM emulation.

That seems cleaner than baking platform-specific memory mapped file I/O
into hw/block/nvme.c, and it will also add a few features that this
patch does not have.

If NVDIMM folks don't respond to this email, would you be able to
research backends/hostmem*.c and try to integrate it? If you feel lost
I can help but it will require me to spend time investigating how that
stuff works again :).
[PATCH RESEND v2] block/nvme: introduce PMR support from NVMe 1.4 spec
This patch introduces support for PMR that has been defined as part of the
NVMe 1.4 spec. The user can now specify a pmr_file which will be mmap'ed
into the qemu address space and subsequently into PCI BAR 2. The guest OS
can perform mmio reads and writes to the PMR region that will stay
persistent across system reboot.

Signed-off-by: Andrzej Jakowski
---
Changes since v1:
 - provided support for Bit 1 from PMRWBM register instead of Bit 0 to
   ensure improved performance in virtualized environment [1] (Stefan)
 - added check if pmr size is power of two in size (David)
 - addressed cross compilation build problems reported by CI environment

[1]: https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf
[2]: https://lore.kernel.org/qemu-devel/20200218224811.30050-1-andrzej.jakow...@linux.intel.com/
---
Persistent Memory Region (PMR) is a new optional feature provided in NVMe
1.4 specification. This patch implements initial support for it in NVMe
driver.

 hw/block/nvme.c       | 165 +++-
 hw/block/nvme.h       |   5 ++
 hw/block/trace-events |   5 ++
 include/block/nvme.h  | 172 ++
 4 files changed, 346 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d28335cbf3..ff7e74d765 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -19,10 +19,14 @@
  *  -drive file=<file>,if=none,id=<drive_id>
  *  -device nvme,drive=<drive_id>,serial=<serial>,id=<id>, \
  *      cmb_size_mb=<cmb_size_mb>, \
+ *      [pmr_file=<pmr_file>,] \
  *      num_queues=<N>
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
+ *
+ * Either cmb or pmr - due to limitation in avaialbe BAR indexes.
+ * pmr_file file needs to be preallocated and power of two in size.
  */

 #include "qemu/osdep.h"
@@ -1141,6 +1145,28 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
         NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
                        "invalid write to read only CMBSZ, ignored");
         return;
+#ifndef _WIN32
+    case 0xE00: /* PMRCAP */
+        NVME_GUEST_ERR(nvme_ub_mmiowr_pmrcap_readonly,
+                       "invalid write to PMRCAP register, ignored");
+        return;
+    case 0xE04: /* TODO PMRCTL */
+        break;
+    case 0xE08: /* PMRSTS */
+        NVME_GUEST_ERR(nvme_ub_mmiowr_pmrsts_readonly,
+                       "invalid write to PMRSTS register, ignored");
+        return;
+    case 0xE0C: /* PMREBS */
+        NVME_GUEST_ERR(nvme_ub_mmiowr_pmrebs_readonly,
+                       "invalid write to PMREBS register, ignored");
+        return;
+    case 0xE10: /* PMRSWTP */
+        NVME_GUEST_ERR(nvme_ub_mmiowr_pmrswtp_readonly,
+                       "invalid write to PMRSWTP register, ignored");
+        return;
+    case 0xE14: /* TODO PMRMSC */
+        break;
+#endif /* !_WIN32 */
     default:
         NVME_GUEST_ERR(nvme_ub_mmiowr_invalid,
                        "invalid MMIO write,"
@@ -1169,6 +1195,22 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
     }

     if (addr < sizeof(n->bar)) {
+#ifndef _WIN32
+        /*
+         * When PMRWBM bit 1 is set then read from
+         * PMRSTS should ensure prior writes
+         * made it to persistent media
+         */
+        if (addr == 0xE08 &&
+            (NVME_PMRCAP_PMRWBM(n->bar.pmrcap) & 0x02) >> 1) {
+            int ret;
+            ret = msync(n->pmrbuf, n->f_pmr_size, MS_SYNC);
+            if (ret) {
+                NVME_GUEST_ERR(nvme_ub_mmiord_pmrread_barrier,
+                               "error while persisting data");
+            }
+        }
+#endif /* !_WIN32 */
         memcpy(&val, ptr + addr, size);
     } else {
         NVME_GUEST_ERR(nvme_ub_mmiord_invalid_ofs,
@@ -1303,6 +1345,31 @@ static const MemoryRegionOps nvme_cmb_ops = {
     },
 };

+#ifndef _WIN32
+static void nvme_pmr_write(void *opaque, hwaddr addr, uint64_t data,
+    unsigned size)
+{
+    NvmeCtrl *n = (NvmeCtrl *)opaque;
+    stn_le_p(&n->pmrbuf[addr], size, data);
+}
+
+static uint64_t nvme_pmr_read(void *opaque, hwaddr addr, unsigned size)
+{
+    NvmeCtrl *n = (NvmeCtrl *)opaque;
+    return ldn_le_p(&n->pmrbuf[addr], size);
+}
+
+static const MemoryRegionOps nvme_pmr_ops = {
+    .read = nvme_pmr_read,
+    .write = nvme_pmr_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+    },
+};
+#endif /* !_WIN32 */
+
 static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 {
     NvmeCtrl *n = NVME(pci_dev);
@@ -1332,6 +1399,39 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
         error_setg(errp, "serial property not set");
         return;
     }
+
+#ifndef _WIN32
+    if (!n->cmb_size_mb && n->pmr_file) {
+        int fd;
+
+        n->f_pmr = fopen(n->pmr_file, "r+b");
+        if (!n->f_pmr) {
+            error_setg(errp, "pmr