Re: [Qemu-devel] [PATCH v2 4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation

2018-02-12 Thread Stefan Hajnoczi
On Fri, Feb 09, 2018 at 10:57:26PM +0800, Haozhong Zhang wrote:
> On 02/09/18 14:27 +, Stefan Hajnoczi wrote:
> > On Wed, Feb 07, 2018 at 03:33:27PM +0800, Haozhong Zhang wrote:
> > > @@ -156,11 +157,17 @@ static void nvdimm_write_label_data(NVDIMMDevice 
> > > *nvdimm, const void *buf,
> > >  {
> > >  MemoryRegion *mr;
> > >  PCDIMMDevice *dimm = PC_DIMM(nvdimm);
> > > +bool is_pmem = object_property_get_bool(OBJECT(dimm->hostmem),
> > > +"pmem", NULL);
> > >  uint64_t backend_offset;
> > >  
> > >  nvdimm_validate_rw_label_data(nvdimm, size, offset);
> > >  
> > > -memcpy(nvdimm->label_data + offset, buf, size);
> > > +if (!is_pmem) {
> > > +memcpy(nvdimm->label_data + offset, buf, size);
> > > +} else {
> > > +pmem_memcpy_persist(nvdimm->label_data + offset, buf, size);
> > > +}
> > 
> > Is this enough to prevent label corruption in case of power failure?
> > 
> > pmem_memcpy_persist() is not atomic.  Power failure can result in a mix
> > of the old and new label data.
> > 
> > If we want this operation to be 100% safe there needs to be some kind of
> > update protocol that makes the change atomic, like a Label A and Label B
> > area with a single Label Index field that can be updated atomically to
> > point to the active Label A/B area.
> 
> All this patch series is to guarantee: if the guest is still alive and
> running, all its previous writes to pmem, which were performed by
> QEMU, will be still persistent on pmem.
> 
> If a power failure happens before QEMU returns to the guest, e.g., in
> the middle of above pmem_memcpy_persist(), yes, the guest label data
> may be in an inconsistent state, but the guest also has no chance to
> progress.  And, that is what could happen in the non-virtualization
> environment as well, and it's the responsibility of the (guest) SW to
> defend such failures, e.g., by the protocol you mentioned.

Thanks for explaining!  I thought the atomic update is done by the ACPI
_LSW DSM method but it turns out the driver must do it.

For anyone else who is interested, a detailed description of the Label
Storage Area is on page 652 of the UEFI Specification version 2.7 and
the Linux driver implements the atomic update algorithm in
drivers/nvdimm/label.c:__pmem_label_update().

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation

2018-02-09 Thread Haozhong Zhang
On 02/09/18 14:27 +, Stefan Hajnoczi wrote:
> On Wed, Feb 07, 2018 at 03:33:27PM +0800, Haozhong Zhang wrote:
> > @@ -156,11 +157,17 @@ static void nvdimm_write_label_data(NVDIMMDevice 
> > *nvdimm, const void *buf,
> >  {
> >  MemoryRegion *mr;
> >  PCDIMMDevice *dimm = PC_DIMM(nvdimm);
> > +bool is_pmem = object_property_get_bool(OBJECT(dimm->hostmem),
> > +"pmem", NULL);
> >  uint64_t backend_offset;
> >  
> >  nvdimm_validate_rw_label_data(nvdimm, size, offset);
> >  
> > -memcpy(nvdimm->label_data + offset, buf, size);
> > +if (!is_pmem) {
> > +memcpy(nvdimm->label_data + offset, buf, size);
> > +} else {
> > +pmem_memcpy_persist(nvdimm->label_data + offset, buf, size);
> > +}
> 
> Is this enough to prevent label corruption in case of power failure?
> 
> pmem_memcpy_persist() is not atomic.  Power failure can result in a mix
> of the old and new label data.
> 
> If we want this operation to be 100% safe there needs to be some kind of
> update protocol that makes the change atomic, like a Label A and Label B
> area with a single Label Index field that can be updated atomically to
> point to the active Label A/B area.

All this patch series is to guarantee: if the guest is still alive and
running, all its previous writes to pmem, which were performed by
QEMU, will be still persistent on pmem.

If a power failure happens before QEMU returns to the guest, e.g., in
the middle of above pmem_memcpy_persist(), yes, the guest label data
may be in an inconsistent state, but the guest also has no chance to
progress.  And, that is what could happen in the non-virtualization
environment as well, and it's the responsibility of the (guest) SW to
defend such failures, e.g., by the protocol you mentioned.

Haozhong



Re: [Qemu-devel] [PATCH v2 4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation

2018-02-09 Thread Stefan Hajnoczi
On Wed, Feb 07, 2018 at 03:33:27PM +0800, Haozhong Zhang wrote:
> @@ -156,11 +157,17 @@ static void nvdimm_write_label_data(NVDIMMDevice 
> *nvdimm, const void *buf,
>  {
>  MemoryRegion *mr;
>  PCDIMMDevice *dimm = PC_DIMM(nvdimm);
> +bool is_pmem = object_property_get_bool(OBJECT(dimm->hostmem),
> +"pmem", NULL);
>  uint64_t backend_offset;
>  
>  nvdimm_validate_rw_label_data(nvdimm, size, offset);
>  
> -memcpy(nvdimm->label_data + offset, buf, size);
> +if (!is_pmem) {
> +memcpy(nvdimm->label_data + offset, buf, size);
> +} else {
> +pmem_memcpy_persist(nvdimm->label_data + offset, buf, size);
> +}

Is this enough to prevent label corruption in case of power failure?

pmem_memcpy_persist() is not atomic.  Power failure can result in a mix
of the old and new label data.

If we want this operation to be 100% safe there needs to be some kind of
update protocol that makes the change atomic, like a Label A and Label B
area with a single Label Index field that can be updated atomically to
point to the active Label A/B area.

Stefan


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH v2 4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation

2018-02-06 Thread Haozhong Zhang
Guest writes to vNVDIMM labels are intercepted and performed on the
backend by QEMU. When the backend is a real persistent memort, QEMU
needs to take proper operations to ensure its write persistence on the
persistent memory. Otherwise, a host power failure may result in the
loss of guest label configurations.

Signed-off-by: Haozhong Zhang 
---
 hw/mem/nvdimm.c |  9 -
 include/qemu/pmem.h | 31 +++
 2 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 include/qemu/pmem.h

diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 61e677f92f..18861d1a7a 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -23,6 +23,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/pmem.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "qapi-visit.h"
@@ -156,11 +157,17 @@ static void nvdimm_write_label_data(NVDIMMDevice *nvdimm, 
const void *buf,
 {
 MemoryRegion *mr;
 PCDIMMDevice *dimm = PC_DIMM(nvdimm);
+bool is_pmem = object_property_get_bool(OBJECT(dimm->hostmem),
+"pmem", NULL);
 uint64_t backend_offset;
 
 nvdimm_validate_rw_label_data(nvdimm, size, offset);
 
-memcpy(nvdimm->label_data + offset, buf, size);
+if (!is_pmem) {
+memcpy(nvdimm->label_data + offset, buf, size);
+} else {
+pmem_memcpy_persist(nvdimm->label_data + offset, buf, size);
+}
 
 mr = host_memory_backend_get_memory(dimm->hostmem, _abort);
 backend_offset = memory_region_size(mr) - nvdimm->label_size + offset;
diff --git a/include/qemu/pmem.h b/include/qemu/pmem.h
new file mode 100644
index 00..9017596ff0
--- /dev/null
+++ b/include/qemu/pmem.h
@@ -0,0 +1,31 @@
+/*
+ * Stub functions for libpmem.
+ *
+ * Copyright (c) 2018 Intel Corporation.
+ *
+ * Author: Haozhong Zhang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_PMEM_H
+#define QEMU_PMEM_H
+
+#ifdef CONFIG_LIBPMEM
+#include 
+#else  /* !CONFIG_LIBPMEM */
+
+#include 
+
+/* Stubs */
+
+static inline void *
+pmem_memcpy_persist(void *pmemdest, const void *src, size_t len)
+{
+return memcpy(pmemdest, src, len);
+}
+
+#endif /* CONFIG_LIBPMEM */
+
+#endif /* !QEMU_PMEM_H */
-- 
2.14.1