Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-14 Thread Haozhong Zhang
On 02/13/18 15:39 +, Roger Pau Monné wrote:
> On Tue, Feb 13, 2018 at 06:40:20AM -0700, Jan Beulich wrote:
> > >>> On 13.02.18 at 12:13,  wrote:
> > > On Tue, Feb 13, 2018 at 04:05:45AM -0700, Jan Beulich wrote:
> > >> >>> On 13.02.18 at 11:29,  wrote:
> > >> > On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
> > >> >> >>> On 12.02.18 at 11:05,  wrote:
> > >> >> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
> > >> >> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
> > >> >> > memory for the page tables (ie: you could just use normal RAM for 
> > >> >> > the
> > >> >> > page tables that map the NVDIMM IMO). Of course that only applies to
> > >> >> > PVH/HVM.
> > >> >> 
> > >> >> But in order to use (part of) it in a RAM-like manner we need struct
> > >> >> page_info for it.
> > >> > 
> > >> > I guess the main use of this would be to grant NVDIMM pages? And
> > >> > without a page_info that's not possible.
> > >> 
> > >> Why grant? Simply giving such a page as RAM to a guest would
> > >> already be a problem without struct page_info (as then we can't
> > >> track the page owner, nor can we refcount the page).
> > > 
> > > My point was to avoid doing that, and always assign the pages as
> > > MMIO, which IIRC doesn't require a struct page_info.
> > 
> > MMIO pages can't be used for things like page tables, because of
> > the refcounting that's needed. The page being like RAM, however,
> > implies that the guest needs to be able to use it as anything a RAM
> > page can be used for.
> 
> OK, I'm quite unsure about what people actually use NVDIMM for, I
> thought it was mostly used as some kind of storage, but if it's
> actually used as plain RAM then yes, we likely need struct page_info
> for them, which is a PITA.
> 
> My worries are that if you boot bare metal Linux and use NVDIMM, and
> then reboot into Xen you won't be able to access the NVDIMM data
> anymore AFAICT because Xen will have taken over it, and already used
> part of it to store its own page tables, which is problematic IMO.
> 

The page tables for the NVDIMM, which are not large, are still kept
in RAM.

This patchset does not let Xen use any NVDIMM at boot time, precisely
because of that concern. Part 2 of this patchset introduces a set of
xl subcommands that let users tell Xen, after boot, which parts of the
NVDIMM it can safely use without corrupting genuinely useful data.
That is, I expect users to pre-partition the NVDIMM (before using it
with Xen) into at least two parts: one for hypervisor management
purposes, whose data does not need to be preserved across power
cycles, and the rest for user data, which may need to be
non-volatile.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-14 Thread Haozhong Zhang
On 02/12/18 10:05 +, Roger Pau Monné wrote:
> On Mon, Feb 12, 2018 at 09:25:42AM +0800, Haozhong Zhang wrote:
> > On 02/09/18 12:33 +, Roger Pau Monné wrote:
> > > Thanks for the series, I'm however wondering whether it's appropriate
> > > to post a v4 as RFC. Ie: at v4 the reviewer expects the submitter to
> > > have a clear picture of what needs to be implemented.
> > > 
> > > On Thu, Dec 07, 2017 at 06:09:49PM +0800, Haozhong Zhang wrote:
> > > > All patches can also be found at
> > > >   Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v4
> > > >   QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v4
> > > > 
> > > > RFC v3 can be found at
> > > >   https://lists.xen.org/archives/html/xen-devel/2017-09/msg00964.html
> > > > 
> > > > Changes in v4:
> > > >   * Move the functionality of management util 'xen-ndctl' to Xen
> > > > management tool 'xl'.
> > > >   * Load QEMU ACPI via QEMU fw_cfg and BIOSLinkerLoader interface.
> > > >   * Other changes are documented in patches separately.
> > > > 
> > > > 
> > > > - Part 0. Bug fix and code cleanup
> > > >   [01/41] x86_64/mm: fix the PDX group check in mem_hotadd_check()
> > > >   [02/41] x86_64/mm: avoid cleaning the unmapped frame table
> > > >   [03/41] hvmloader/util: do not compare characters after '\0' in 
> > > > strncmp
> > > > 
> > > > - Part 1. Detect host PMEM
> > > >   Detect host PMEM via NFIT. No frametable and M2P table for them are
> > > >   created in this part.
> > > > 
> > > >   [04/41] xen/common: add Kconfig item for pmem support
> > > >   [05/41] x86/mm: exclude PMEM regions from initial frametable
> > > >   [06/41] acpi: probe valid PMEM regions via NFIT
> > > >   [07/41] xen/pmem: register valid PMEM regions to Xen hypervisor
> > > >   [08/41] xen/pmem: hide NFIT and deny access to PMEM from Dom0
> > > 
> > > I'm afraid I might ask stupid questions, since I haven't followed the
> > > design discussion of this series very closely.
> > > 
> > > So you basically hide the NVDIMM from Dom0, and only allow guests to
> > > use it?
> > 
> > Yes, though I have some unsent patches (for vNVDIMM label support) to
> > allow QEMU in dom0 to access NVDIMM via DMOP.
> > 
> > > 
> > > What happens when you boot the same system without Xen? Will the
> > > NVDIMM get corrupted because for example Linux will write something to
> > > it?
> > 
> > Bare metal OS without Xen may write to NVDIMM which may or may not
> > corrupt the data, depending on the existing data on NVDIMM and how
> > other OS uses NVDIMM.
> > 
> > If the bare-metal OS uses NVDIMM, for example, as the volatile memory
> > or the fast disk cache, then the random data may be dumped to NVDIMM
> > and corrupt the existing data.
> > 
> > If the bare-metal OS treats NVDIMM as storage, it may probe certain
> > structures (e.g., file systems) on NVDIMM before further operations
> > and stop if such structures are not probed. In such case, the existing
> > data on NVDIMM will not be corrupted.
> 
> OK. I have to admit my knowledge of NVDIMM is very limited. Is it
> expected to for example partition a NVDIMM into several partitions and
> maybe use one as disk cache and others as storage?
> 
> How would that be accomplished, using GPT for example? Or there's some
> NVDIMM specific way to describe the layout?

NVDIMM is mapped into the CPU address space just like regular RAM.
Basically, software can access it with normal memory access
instructions (e.g., mov on x86), plus the necessary cache flush
operations (e.g., clwb/clflushopt/clflush) to guarantee write
persistence. Beyond this basic byte-addressable interface, software
can choose to, for example, use it as ordinary memory, use it as
persistent storage, or even implement a block interface on top of it.
Software can choose its own method to partition the NVDIMM, perhaps
via typical disk partitions and file systems, or via the labels
provided by the NVDIMM itself.

When such software runs in an HVM domain, the primary work of Xen is
to map the host NVDIMM address range into the guest address space in
the EPT as RW, just like normal memory virtualization.

> 
> Would it be conceivable to store Dom0 root filesystem in a NVDIMM
> while also using it to provide storage to the guests?

Yes, that's possible, though it's not allowed in this patchset. We
would need to configure the Xen hypervisor before boot so it knows
which part of the NVDIMM should be mapped to Dom0, and where the
management structures of that part of the NVDIMM are maintained
(e.g., in another part of the NVDIMM, or in RAM).

Haozhong

> 
> > > 
> > > >   [09/41] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
> > > >   [10/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_rgions_nr
> > > >   [11/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions
> > > >   [12/41] tools/xl: add xl command 'pmem-list'
> > > > 
> > > > - Part 2. Setup host PMEM for management and guest data usage
> > > >   Allow users or admins in Dom0 to setup host PMEM pages for
> > > >   management and guest data usages.
> > > >* Management PMEM pages are used to store the frametable and M2P of
> > > >  PMEM pages (including themselves), and never mapped to guest.

Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 06:40:20AM -0700, Jan Beulich wrote:
> >>> On 13.02.18 at 12:13,  wrote:
> > On Tue, Feb 13, 2018 at 04:05:45AM -0700, Jan Beulich wrote:
> >> >>> On 13.02.18 at 11:29,  wrote:
> >> > On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
> >> >> >>> On 12.02.18 at 11:05,  wrote:
> >> >> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
> >> >> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
> >> >> > memory for the page tables (ie: you could just use normal RAM for the
> >> >> > page tables that map the NVDIMM IMO). Of course that only applies to
> >> >> > PVH/HVM.
> >> >> 
> >> >> But in order to use (part of) it in a RAM-like manner we need struct
> >> >> page_info for it.
> >> > 
> >> > I guess the main use of this would be to grant NVDIMM pages? And
> >> > without a page_info that's not possible.
> >> 
> >> Why grant? Simply giving such a page as RAM to a guest would
> >> already be a problem without struct page_info (as then we can't
> >> track the page owner, nor can we refcount the page).
> > 
> > My point was to avoid doing that, and always assign the pages as
> > MMIO, which IIRC doesn't require a struct page_info.
> 
> MMIO pages can't be used for things like page tables, because of
> the refcounting that's needed. The page being like RAM, however,
> implies that the guest needs to be able to use it as anything a RAM
> page can be used for.

OK, I'm quite unsure what people actually use NVDIMM for. I thought
it was mostly used as some kind of storage, but if it's actually used
as plain RAM then yes, we likely need struct page_info for it, which
is a PITA.

My worry is that if you boot bare-metal Linux and use the NVDIMM, and
then reboot into Xen, you won't be able to access the NVDIMM data
anymore AFAICT, because Xen will have taken it over and already used
part of it to store its own page tables, which is problematic IMO.

Roger.


Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 12:13,  wrote:
> On Tue, Feb 13, 2018 at 04:05:45AM -0700, Jan Beulich wrote:
>> >>> On 13.02.18 at 11:29,  wrote:
>> > On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
>> >> >>> On 12.02.18 at 11:05,  wrote:
>> >> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
>> >> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
>> >> > memory for the page tables (ie: you could just use normal RAM for the
>> >> > page tables that map the NVDIMM IMO). Of course that only applies to
>> >> > PVH/HVM.
>> >> 
>> >> But in order to use (part of) it in a RAM-like manner we need struct
>> >> page_info for it.
>> > 
>> > I guess the main use of this would be to grant NVDIMM pages? And
>> > without a page_info that's not possible.
>> 
>> Why grant? Simply giving such a page as RAM to a guest would
>> already be a problem without struct page_info (as then we can't
>> track the page owner, nor can we refcount the page).
> 
> My point was to avoid doing that, and always assign the pages as
> MMIO, which IIRC doesn't require a struct page_info.

MMIO pages can't be used for things like page tables, because of
the refcounting that's needed. The page being like RAM, however,
implies that the guest needs to be able to use it as anything a RAM
page can be used for.

Jan



Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 04:05:45AM -0700, Jan Beulich wrote:
> >>> On 13.02.18 at 11:29,  wrote:
> > On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
> >> >>> On 12.02.18 at 11:05,  wrote:
> >> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
> >> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
> >> > memory for the page tables (ie: you could just use normal RAM for the
> >> > page tables that map the NVDIMM IMO). Of course that only applies to
> >> > PVH/HVM.
> >> 
> >> But in order to use (part of) it in a RAM-like manner we need struct
> >> page_info for it.
> > 
> > I guess the main use of this would be to grant NVDIMM pages? And
> > without a page_info that's not possible.
> 
> Why grant? Simply giving such a page as RAM to a guest would
> already be a problem without struct page_info (as then we can't
> track the page owner, nor can we refcount the page).

My point was to avoid doing that, and always assign the pages as
MMIO, which IIRC doesn't require a struct page_info.

Roger.


Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 11:29,  wrote:
> On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
>> >>> On 12.02.18 at 11:05,  wrote:
>> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
>> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
>> > memory for the page tables (ie: you could just use normal RAM for the
>> > page tables that map the NVDIMM IMO). Of course that only applies to
>> > PVH/HVM.
>> 
>> But in order to use (part of) it in a RAM-like manner we need struct
>> page_info for it.
> 
> I guess the main use of this would be to grant NVDIMM pages? And
> without a page_info that's not possible.

Why grant? Simply giving such a page as RAM to a guest would
already be a problem without struct page_info (as then we can't
track the page owner, nor can we refcount the page).

Jan



Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
> >>> On 12.02.18 at 11:05,  wrote:
> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
> > memory for the page tables (ie: you could just use normal RAM for the
> > page tables that map the NVDIMM IMO). Of course that only applies to
> > PVH/HVM.
> 
> But in order to use (part of) it in a RAM-like manner we need struct
> page_info for it.

I guess the main use of this would be to grant NVDIMM pages? And
without a page_info that's not possible.

Roger.


Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Jan Beulich
>>> On 12.02.18 at 11:05,  wrote:
> If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
> IIRC, and if it's mapped using 1GB pages it shouldn't use that much
> memory for the page tables (ie: you could just use normal RAM for the
> page tables that map the NVDIMM IMO). Of course that only applies to
> PVH/HVM.

But in order to use (part of) it in a RAM-like manner we need struct
page_info for it.

Jan



Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-12 Thread Roger Pau Monné
On Mon, Feb 12, 2018 at 09:25:42AM +0800, Haozhong Zhang wrote:
> On 02/09/18 12:33 +, Roger Pau Monné wrote:
> > Thanks for the series, I'm however wondering whether it's appropriate
> > to post a v4 as RFC. Ie: at v4 the reviewer expects the submitter to
> > have a clear picture of what needs to be implemented.
> > 
> > On Thu, Dec 07, 2017 at 06:09:49PM +0800, Haozhong Zhang wrote:
> > > All patches can also be found at
> > >   Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v4
> > >   QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v4
> > > 
> > > RFC v3 can be found at
> > >   https://lists.xen.org/archives/html/xen-devel/2017-09/msg00964.html
> > > 
> > > Changes in v4:
> > >   * Move the functionality of management util 'xen-ndctl' to Xen
> > > management tool 'xl'.
> > >   * Load QEMU ACPI via QEMU fw_cfg and BIOSLinkerLoader interface.
> > >   * Other changes are documented in patches separately.
> > > 
> > > 
> > > - Part 0. Bug fix and code cleanup
> > >   [01/41] x86_64/mm: fix the PDX group check in mem_hotadd_check()
> > >   [02/41] x86_64/mm: avoid cleaning the unmapped frame table
> > >   [03/41] hvmloader/util: do not compare characters after '\0' in strncmp
> > > 
> > > - Part 1. Detect host PMEM
> > >   Detect host PMEM via NFIT. No frametable and M2P table for them are
> > >   created in this part.
> > > 
> > >   [04/41] xen/common: add Kconfig item for pmem support
> > >   [05/41] x86/mm: exclude PMEM regions from initial frametable
> > >   [06/41] acpi: probe valid PMEM regions via NFIT
> > >   [07/41] xen/pmem: register valid PMEM regions to Xen hypervisor
> > >   [08/41] xen/pmem: hide NFIT and deny access to PMEM from Dom0
> > 
> > I'm afraid I might ask stupid questions, since I haven't followed the
> > design discussion of this series very closely.
> > 
> > So you basically hide the NVDIMM from Dom0, and only allow guests to
> > use it?
> 
> Yes, though I have some unsent patches (for vNVDIMM label support) to
> allow QEMU in dom0 to access NVDIMM via DMOP.
> 
> > 
> > What happens when you boot the same system without Xen? Will the
> > NVDIMM get corrupted because for example Linux will write something to
> > it?
> 
> Bare metal OS without Xen may write to NVDIMM which may or may not
> corrupt the data, depending on the existing data on NVDIMM and how
> other OS uses NVDIMM.
> 
> If the bare-metal OS uses NVDIMM, for example, as the volatile memory
> or the fast disk cache, then the random data may be dumped to NVDIMM
> and corrupt the existing data.
> 
> If the bare-metal OS treats NVDIMM as storage, it may probe certain
> structures (e.g., file systems) on NVDIMM before further operations
> and stop if such structures are not probed. In such case, the existing
> data on NVDIMM will not be corrupted.

OK. I have to admit my knowledge of NVDIMM is very limited. Is it
expected, for example, to partition an NVDIMM into several partitions
and maybe use one as disk cache and the others as storage?

How would that be accomplished? Using GPT, for example, or is there
some NVDIMM-specific way to describe the layout?

Would it be conceivable to store Dom0 root filesystem in a NVDIMM
while also using it to provide storage to the guests?

> > 
> > >   [09/41] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
> > >   [10/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_rgions_nr
> > >   [11/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions
> > >   [12/41] tools/xl: add xl command 'pmem-list'
> > > 
> > > - Part 2. Setup host PMEM for management and guest data usage
> > >   Allow users or admins in Dom0 to setup host PMEM pages for
> > >   management and guest data usages.
> > >* Management PMEM pages are used to store the frametable and M2P of
> > >  PMEM pages (including themselves), and never mapped to guest.
> > >* Guest data PMEM pages can be mapped to guest and used as the
> > >  backend storage of virtual NVDIMM devices.
> > 
> > So this is basically tied to a PV Dom0, but I would like to also think
> > about what would happen with a PVH Dom0. In that case AFAICT Xen could
> > map the full NVDIMM to the Dom0 p2m as MMIO using 1GB pages, at which
> > point Dom0 could manage the NVDIMM as desired? Ie: Dom0 could map
> > parts of the NVDIMM to DomU as it maps other MMIO regions.
> 
> The primary reason I don't want to map NVDIMM to Dom0 (either PV or
> PVH) is the frame table and M2P table of NVDIMM are maintained on
> NVDIMM. Because NVDIMM is non-volatile and Xen has no idea of which
> portion of NVDIMM can be used for frame table and M2P, Xen needs the
> user input for such information (patch 18, 22, 23) after it boots
> up. That is, before Xen boots up, Xen cannot determine which portion
> of NVDIMM for its frame table and M2P that should not map to Dom0.

If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
IIRC, and if it's mapped using 1GB pages it shouldn't use that much
memory for the page tables (ie: you could just use normal RAM for the
page tables that map the NVDIMM IMO). Of course that only applies to
PVH/HVM.

Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-11 Thread Haozhong Zhang
On 02/09/18 12:33 +, Roger Pau Monné wrote:
> Thanks for the series, I'm however wondering whether it's appropriate
> to post a v4 as RFC. Ie: at v4 the reviewer expects the submitter to
> have a clear picture of what needs to be implemented.
> 
> On Thu, Dec 07, 2017 at 06:09:49PM +0800, Haozhong Zhang wrote:
> > All patches can also be found at
> >   Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v4
> >   QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v4
> > 
> > RFC v3 can be found at
> >   https://lists.xen.org/archives/html/xen-devel/2017-09/msg00964.html
> > 
> > Changes in v4:
> >   * Move the functionality of management util 'xen-ndctl' to Xen
> > management tool 'xl'.
> >   * Load QEMU ACPI via QEMU fw_cfg and BIOSLinkerLoader interface.
> >   * Other changes are documented in patches separately.
> > 
> > 
> > - Part 0. Bug fix and code cleanup
> >   [01/41] x86_64/mm: fix the PDX group check in mem_hotadd_check()
> >   [02/41] x86_64/mm: avoid cleaning the unmapped frame table
> >   [03/41] hvmloader/util: do not compare characters after '\0' in strncmp
> > 
> > - Part 1. Detect host PMEM
> >   Detect host PMEM via NFIT. No frametable and M2P table for them are
> >   created in this part.
> > 
> >   [04/41] xen/common: add Kconfig item for pmem support
> >   [05/41] x86/mm: exclude PMEM regions from initial frametable
> >   [06/41] acpi: probe valid PMEM regions via NFIT
> >   [07/41] xen/pmem: register valid PMEM regions to Xen hypervisor
> >   [08/41] xen/pmem: hide NFIT and deny access to PMEM from Dom0
> 
> I'm afraid I might ask stupid questions, since I haven't followed the
> design discussion of this series very closely.
> 
> So you basically hide the NVDIMM from Dom0, and only allow guests to
> use it?

Yes, though I have some unsent patches (for vNVDIMM label support) to
allow QEMU in dom0 to access NVDIMM via DMOP.

> 
> What happens when you boot the same system without Xen? Will the
> NVDIMM get corrupted because for example Linux will write something to
> it?

A bare-metal OS without Xen may write to the NVDIMM, which may or may
not corrupt the data, depending on what is already on the NVDIMM and
how that OS uses it.

If the bare-metal OS uses the NVDIMM as, for example, volatile memory
or a fast disk cache, random data may be dumped onto it and corrupt
the existing data.

If the bare-metal OS treats the NVDIMM as storage, it may probe for
certain structures (e.g., file systems) on the NVDIMM before any
further operations, and stop if no such structures are found. In that
case, the existing data on the NVDIMM will not be corrupted.

> 
> >   [09/41] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
> >   [10/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_rgions_nr
> >   [11/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions
> >   [12/41] tools/xl: add xl command 'pmem-list'
> > 
> > - Part 2. Setup host PMEM for management and guest data usage
> >   Allow users or admins in Dom0 to setup host PMEM pages for
> >   management and guest data usages.
> >* Management PMEM pages are used to store the frametable and M2P of
> >  PMEM pages (including themselves), and never mapped to guest.
> >* Guest data PMEM pages can be mapped to guest and used as the
> >  backend storage of virtual NVDIMM devices.
> 
> So this is basically tied to a PV Dom0, but I would like to also think
> about what would happen with a PVH Dom0. In that case AFAICT Xen could
> map the full NVDIMM to the Dom0 p2m as MMIO using 1GB pages, at which
> point Dom0 could manage the NVDIMM as desired? Ie: Dom0 could map
> parts of the NVDIMM to DomU as it maps other MMIO regions.

The primary reason I don't want to map the NVDIMM to Dom0 (either PV
or PVH) is that the frametable and M2P table of the NVDIMM are
maintained on the NVDIMM itself. Because the NVDIMM is non-volatile
and Xen has no idea which portion of it can be used for the
frametable and M2P, Xen needs user input for that information
(patches 18, 22, 23) after it boots. That is, before Xen boots, it
cannot determine which portion of the NVDIMM holds its frametable and
M2P and thus should not be mapped to Dom0.

Thanks,
Haozhong

> 
> I'm not sure Xen needs to know anything else apart from how to map the
> full NVDIMM to Dom0 as MMIO, which would greatly simplify this series.
> 
> Thanks, Roger.


Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-09 Thread Roger Pau Monné
Thanks for the series. I'm however wondering whether it's appropriate
to post a v4 as an RFC; ie: at v4 the reviewer expects the submitter
to have a clear picture of what needs to be implemented.

On Thu, Dec 07, 2017 at 06:09:49PM +0800, Haozhong Zhang wrote:
> All patches can also be found at
>   Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v4
>   QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v4
> 
> RFC v3 can be found at
>   https://lists.xen.org/archives/html/xen-devel/2017-09/msg00964.html
> 
> Changes in v4:
>   * Move the functionality of management util 'xen-ndctl' to Xen
> management tool 'xl'.
>   * Load QEMU ACPI via QEMU fw_cfg and BIOSLinkerLoader interface.
>   * Other changes are documented in patches separately.
> 
> 
> - Part 0. Bug fix and code cleanup
>   [01/41] x86_64/mm: fix the PDX group check in mem_hotadd_check()
>   [02/41] x86_64/mm: avoid cleaning the unmapped frame table
>   [03/41] hvmloader/util: do not compare characters after '\0' in strncmp
> 
> - Part 1. Detect host PMEM
>   Detect host PMEM via NFIT. No frametable and M2P table for them are
>   created in this part.
> 
>   [04/41] xen/common: add Kconfig item for pmem support
>   [05/41] x86/mm: exclude PMEM regions from initial frametable
>   [06/41] acpi: probe valid PMEM regions via NFIT
>   [07/41] xen/pmem: register valid PMEM regions to Xen hypervisor
>   [08/41] xen/pmem: hide NFIT and deny access to PMEM from Dom0

I'm afraid I might ask stupid questions, since I haven't followed the
design discussion of this series very closely.

So you basically hide the NVDIMM from Dom0, and only allow guests to
use it?

What happens when you boot the same system without Xen? Will the
NVDIMM get corrupted because for example Linux will write something to
it?

>   [09/41] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
>   [10/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_rgions_nr
>   [11/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions
>   [12/41] tools/xl: add xl command 'pmem-list'
> 
> - Part 2. Setup host PMEM for management and guest data usage
>   Allow users or admins in Dom0 to setup host PMEM pages for
>   management and guest data usages.
>* Management PMEM pages are used to store the frametable and M2P of
>  PMEM pages (including themselves), and never mapped to guest.
>* Guest data PMEM pages can be mapped to guest and used as the
>  backend storage of virtual NVDIMM devices.

So this is basically tied to a PV Dom0, but I would like to also think
about what would happen with a PVH Dom0. In that case AFAICT Xen could
map the full NVDIMM to the Dom0 p2m as MMIO using 1GB pages, at which
point Dom0 could manage the NVDIMM as desired? Ie: Dom0 could map
parts of the NVDIMM to DomU as it maps other MMIO regions.

I'm not sure Xen needs to know anything else apart from how to map the
full NVDIMM to Dom0 as MMIO, which would greatly simplify this series.

Thanks, Roger.


[Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2017-12-07 Thread Haozhong Zhang
All patches can also be found at
  Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v4
  QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v4

RFC v3 can be found at
  https://lists.xen.org/archives/html/xen-devel/2017-09/msg00964.html

Changes in v4:
  * Move the functionality of management util 'xen-ndctl' to Xen
management tool 'xl'.
  * Load QEMU ACPI via QEMU fw_cfg and BIOSLinkerLoader interface.
  * Other changes are documented in patches separately.


- Part 0. Bug fix and code cleanup
  [01/41] x86_64/mm: fix the PDX group check in mem_hotadd_check()
  [02/41] x86_64/mm: avoid cleaning the unmapped frame table
  [03/41] hvmloader/util: do not compare characters after '\0' in strncmp

- Part 1. Detect host PMEM
  Detect host PMEM via NFIT. No frametable or M2P table is created
  for them in this part.

  [04/41] xen/common: add Kconfig item for pmem support
  [05/41] x86/mm: exclude PMEM regions from initial frametable
  [06/41] acpi: probe valid PMEM regions via NFIT
  [07/41] xen/pmem: register valid PMEM regions to Xen hypervisor
  [08/41] xen/pmem: hide NFIT and deny access to PMEM from Dom0
  [09/41] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
  [10/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_rgions_nr
  [11/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_get_regions
  [12/41] tools/xl: add xl command 'pmem-list'

- Part 2. Setup host PMEM for management and guest data usage
  Allow users or admins in Dom0 to set up host PMEM pages for
  management and guest data usages.
   * Management PMEM pages are used to store the frametable and M2P of
 PMEM pages (including themselves), and never mapped to guest.
   * Guest data PMEM pages can be mapped to guest and used as the
 backend storage of virtual NVDIMM devices.

  [13/41] x86_64/mm: refactor memory_add()
  [14/41] x86_64/mm: allow customized location of extended frametable and M2P 
table
  [15/41] xen/pmem: add XEN_SYSCTL_nvdimm_pmem_setup to setup management PMEM 
region
  [16/41] tools/xl: accept all bases in parse_ulong()
  [17/41] tools/xl: expose parse_ulong()
  [18/41] tools/xl: add xl command 'pmem-setup'
  [19/41] xen/pmem: support PMEM_REGION_TYPE_MGMT for 
XEN_SYSCTL_nvdimm_pmem_get_regions_nr
  [20/41] xen/pmem: support PMEM_REGION_TYPE_MGMT for 
XEN_SYSCTL_nvdimm_pmem_get_regions
  [21/41] tools/xl: add option '--mgmt | -m' to xl command pmem-list
  [22/41] xen/pmem: support setup PMEM region for guest data usage
  [23/41] tools/xl: add option '--data | -d' to xl command pmem-setup
  [24/41] xen/pmem: support PMEM_REGION_TYPE_DATA for 
XEN_SYSCTL_nvdimm_pmem_get_regions_nr
  [25/41] xen/pmem: support PMEM_REGION_TYPE_DATA for 
XEN_SYSCTL_nvdimm_pmem_get_regions
  [26/41] tools/xl: add option '--data | -d' to xl command pmem-list

- Part 3. Hypervisor support to map host PMEM pages to HVM domain
  [27/41] xen/pmem: add function to map PMEM pages to HVM domain
  [28/41] xen/pmem: release PMEM pages on HVM domain destruction
  [29/41] xen: add hypercall XENMEM_populate_pmem_map

- Part 4. Load QEMU ACPI
  Guest NFIT and NVDIMM namespace devices are built by QEMU. This part
  loads QEMU ACPI via QEMU fw_cfg and BIOSLinkerLoader interface. A
  simple blacklist mechanism is added to reject DM ACPI tables that
  may conflict with those built by Xen itself.

  [30/41] tools: reserve extra guest memory for ACPI from device model
  [31/41] tools/libacpi: add callback to translate GPA to GVA
  [32/41] tools/libacpi: build a DM ACPI signature blacklist
  [33/41] tools/libacpi, hvmloader: detect QEMU fw_cfg interface
  [34/41] tools/libacpi: probe QEMU ACPI ROMs via fw_cfg interface
  [35/41] tools/libacpi: add a QEMU BIOSLinkLoader executor
  [36/41] tools/libacpi: add function to get the data of QEMU RSDP
  [37/41] tools/libacpi: load QEMU ACPI

- Part 5. Remaining tool stack changes
  Add xl domain configuration and generate new QEMU options for vNVDIMM.

  [38/41] tools/xl: add xl domain configuration for virtual NVDIMM devices
  [39/41] tools/libxl: allow aborting domain creation on fatal QMP init errors
  [40/41] tools/libxl: initiate PMEM mapping via QMP callback
  [41/41] tools/libxl: build qemu options from xl vNVDIMM configs


 docs/man/xl.cfg.pod.5.in|  40 ++
 tools/firmware/hvmloader/Makefile   |   4 +-
 tools/firmware/hvmloader/util.c |  16 +
 tools/firmware/hvmloader/util.h |  14 +
 tools/flask/policy/modules/dom0.te  |   2 +-
 tools/flask/policy/modules/xen.if   |   3 +-
 tools/libacpi/acpi2_0.h |   1 +
 tools/libacpi/build.c   | 176 -
 tools/libacpi/libacpi.h |  10 +
 tools/libacpi/qemu.h|  56 +++
 tools/libacpi/qemu_fw_cfg.c |  99 +
 tools/libacpi/qemu_loader.c | 392 +++
 tools/libacpi/qemu_stub.c   |  64 +++
 tools/libxc/include/xenctrl.h   |  88 +
 tools/libxc/xc_domain.c |  15 +
 tools/libxc/xc_misc.c   | 154 
 tools/libxl/Makefile