Re: [Xen-devel] [PATCH 06/13] xen: detect pre-allocated memory interfering with e820 map

2015-03-30 Thread Juergen Gross

On 02/25/2015 05:00 PM, Juergen Gross wrote:

On 02/25/2015 03:24 PM, David Vrabel wrote:

On 24/02/15 06:27, Juergen Gross wrote:

On 02/19/2015 07:07 PM, David Vrabel wrote:

On 18/02/2015 06:51, Juergen Gross wrote:

+{
+unsigned long pfn;
+unsigned long area_start, area_end;
+unsigned i;
+
+for (i = 0; i < XEN_N_RESERVED_AREAS; i++) {
+
+if (!xen_reserved_area[i].size)
+break;
+
+area_start = PFN_DOWN(xen_reserved_area[i].start);
+area_end = PFN_UP(xen_reserved_area[i].start +
+  xen_reserved_area[i].size);
+if (area_start >= end_pfn || area_end <= start_pfn)
+continue;
+
+if (area_start > start_pfn)
+xen_set_identity_and_remap_chunk(start_pfn, area_start,
+ released, remapped);
+
+if (area_end < end_pfn)
+xen_set_identity_and_remap_chunk(area_end, end_pfn,
+ released, remapped);
+
+*remapped += min(area_end, end_pfn) -
+max(area_start, start_pfn);
+
+return;


Why not defer the whole chunk that conflicts?  Or for that matter defer
all this remapping to the last minute.


There are two problems arising from this:

- In the initrd case remapping would be deferred too long: the initrd
   data is still in use when device initialization is running. And we
   really want the remap to have happened before PCI space is being
used.


I'm not sure I understand what you're saying here.


I thought you wanted to defer the remapping to the point where the
initrd memory is no longer being used. But the suggestion below is
more clear.



I'm suggesting:

1. Reserve all holes.

2. Relocate (if necessary) all modules (initrd, etc.) to regions that
are RAM in the e820.

3. Rebuild the p2m in RAM.

4. Relocate frames from E820 holes/reserved to the end, free p2m pages
from the holes and replacing them with the read-only 1:1 page (where
possible).


- Delaying all remapping to the point where the new p2m list is in place
   would either result in a p2m list with all memory holes covered with
   individual entries as the new list is built with those holes still
   populated, ...
   The first option could easily waste significant amounts of memory (on
   my test machine with 1TB RAM this would have been about 1GB), while
   the second option would be performance critical.
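
  A back-of-the-envelope check of that 1GB figure, as a minimal sketch only:
  it assumes 8-byte p2m entries and 4 KiB pages (the 64-bit layout), and the
  ~512 GB of hole plus remap-target address space used below is an
  illustrative number, not a measured one.

  #include <stdio.h>

  /* p2m cost: one 8-byte entry per 4 KiB page of covered address space. */
  static unsigned long long p2m_bytes(unsigned long long covered_bytes)
  {
      return covered_bytes / 4096 * 8;
  }

  int main(void)
  {
      /* Illustrative: ~512 GB of holes/remap target covered individually. */
      unsigned long long covered = 512ULL << 30;

      printf("p2m overhead: %llu MiB\n", p2m_bytes(covered) >> 20);
      return 0;   /* prints: p2m overhead: 1024 MiB */
  }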


I don't understand how this wastes memory.   When you relocate the
frames from the holes you can reclaim the p2m pages for the holes (and
replace them with the r/o mapped identity p2m page).


Okay, this would work, I guess.

I'll have a try with some new patches...


I tried your approach and hit a problem I can't solve without a major
rework of the kernel's init sequence:

dmi_scan_machine() (and possibly other functions like probe_roms())
need the identity mappings of BIOS, ACPI or PCI memory. Otherwise
SMBIOS, DMI and extension ROMs won't be discovered.

This can be solved only by either a complete rework of the sequence of
called init functions (not desirable, I guess) or by doing the unmap
part of the remapping as early as today.

This means, of course, that I was just lucky with my resolution of the
conflict between the p2m table and the E820 map by simply delaying the
remapping of this memory area: had it collided with an area that needs
to be identity mapped early, the machine wouldn't have been able to
boot my kernel. So I really need to relocate the p2m list, even if this
is not as easy as delaying the remapping.


Juergen



Re: [Xen-devel] [PATCH 06/13] xen: detect pre-allocated memory interfering with e820 map

2015-02-25 Thread David Vrabel
On 24/02/15 06:27, Juergen Gross wrote:
 On 02/19/2015 07:07 PM, David Vrabel wrote:
 On 18/02/2015 06:51, Juergen Gross wrote:
 +{
 +unsigned long pfn;
 +unsigned long area_start, area_end;
 +unsigned i;
 +
 +for (i = 0; i < XEN_N_RESERVED_AREAS; i++) {
 +
 +if (!xen_reserved_area[i].size)
 +break;
 +
 +area_start = PFN_DOWN(xen_reserved_area[i].start);
 +area_end = PFN_UP(xen_reserved_area[i].start +
 +  xen_reserved_area[i].size);
 +if (area_start >= end_pfn || area_end <= start_pfn)
 +continue;
 +
 +if (area_start > start_pfn)
 +xen_set_identity_and_remap_chunk(start_pfn, area_start,
 + released, remapped);
 +
 +if (area_end < end_pfn)
 +xen_set_identity_and_remap_chunk(area_end, end_pfn,
 + released, remapped);
 +
 +*remapped += min(area_end, end_pfn) -
 +max(area_start, start_pfn);
 +
 +return;

 Why not defer the whole chunk that conflicts?  Or for that matter defer
 all this remapping to the last minute.
 
 There are two problems arising from this:
 
 - In the initrd case remapping would be deferred too long: the initrd
   data is still in use when device initialization is running. And we
   really want the remap to have happened before PCI space is being used.

I'm not sure I understand what you're saying here.

I'm suggesting:

1. Reserve all holes.

2. Relocate (if necessary) all modules (initrd, etc.) to regions that
are RAM in the e820.

3. Rebuild the p2m in RAM.

4. Relocate frames from E820 holes/reserved to the end, free p2m pages
from the holes and replacing them with the read-only 1:1 page (where
possible).
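
The saving in step 4 comes from sharing a single read-only leaf page. Below is
a minimal user-space sketch of that idea only; the real p2m structures in
arch/x86/xen/p2m.c differ, and the names and sizes here are simplified for
illustration.

#include <stdio.h>

#define P2M_PER_PAGE  512           /* one 4 KiB page of 8-byte entries */
#define IDENTITY_MARK (~0UL)        /* simplified identity marker */

/* Simplified p2m directory: pointers to leaf pages. */
static unsigned long *p2m_dir[1024];
static unsigned long identity_leaf[P2M_PER_PAGE]; /* shared, read-only in the real scheme */

/* Cover a hole by pointing its directory slots at the shared identity
 * leaf instead of keeping a private (writable) leaf per slot. */
static void cover_hole(unsigned long first_slot, unsigned long last_slot)
{
    for (unsigned long i = first_slot; i <= last_slot; i++)
        p2m_dir[i] = identity_leaf;
}

int main(void)
{
    for (int i = 0; i < P2M_PER_PAGE; i++)
        identity_leaf[i] = IDENTITY_MARK;

    cover_hole(8, 15);      /* e.g. a legacy MMIO hole   */
    cover_hole(640, 767);   /* e.g. a reserved BIOS area */

    /* 136 slots now share one 4 KiB leaf; the 136 private leaves
     * (136 * 4 KiB = 544 KiB) they would otherwise need are reclaimed. */
    printf("distinct leaf pages used for holes: 1\n");
    return 0;
}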

 - Delaying all remapping to the point where the new p2m list is in place
   would either result in a p2m list with all memory holes covered with
   individual entries as the new list is built with those holes still
   populated, ...
   The first option could easily waste significant amounts of memory (on
   my test machine with 1TB RAM this would have been about 1GB), while
   the second option would be performance critical.

I don't understand how this wastes memory.   When you relocate the
frames from the holes you can reclaim the p2m pages for the holes (and
replace them with the r/o mapped identity p2m page).

David



Re: [Xen-devel] [PATCH 06/13] xen: detect pre-allocated memory interfering with e820 map

2015-02-25 Thread Juergen Gross

On 02/25/2015 03:24 PM, David Vrabel wrote:

On 24/02/15 06:27, Juergen Gross wrote:

On 02/19/2015 07:07 PM, David Vrabel wrote:

On 18/02/2015 06:51, Juergen Gross wrote:

+{
+unsigned long pfn;
+unsigned long area_start, area_end;
+unsigned i;
+
+for (i = 0; i < XEN_N_RESERVED_AREAS; i++) {
+
+if (!xen_reserved_area[i].size)
+break;
+
+area_start = PFN_DOWN(xen_reserved_area[i].start);
+area_end = PFN_UP(xen_reserved_area[i].start +
+  xen_reserved_area[i].size);
+if (area_start >= end_pfn || area_end <= start_pfn)
+continue;
+
+if (area_start > start_pfn)
+xen_set_identity_and_remap_chunk(start_pfn, area_start,
+ released, remapped);
+
+if (area_end < end_pfn)
+xen_set_identity_and_remap_chunk(area_end, end_pfn,
+ released, remapped);
+
+*remapped += min(area_end, end_pfn) -
+max(area_start, start_pfn);
+
+return;


Why not defer the whole chunk that conflicts?  Or for that matter defer
all this remapping to the last minute.


There are two problems arising from this:

- In the initrd case remapping would be deferred too long: the initrd
   data is still in use when device initialization is running. And we
   really want the remap to have happened before PCI space is being used.


I'm not sure I understand what you're saying here.


I thought you wanted to defer the remapping to the point where the
initrd memory is no longer being used. But the suggestion below is
more clear.



I'm suggesting:

1. Reserve all holes.

2. Relocate (if necessary) all modules (initrd, etc.) to regions that
are RAM in the e820.

3. Rebuild the p2m in RAM.

4. Relocate frames from E820 holes/reserved to the end, free p2m pages
from the holes and replacing them with the read-only 1:1 page (where
possible).


- Delaying all remapping to the point where the new p2m list is in place
   would either result in a p2m list with all memory holes covered with
   individual entries as the new list is built with those holes still
   populated, ...
   The first option could easily waste significant amounts of memory (on
   my test machine with 1TB RAM this would have been about 1GB), while
   the second option would be performance critical.


I don't understand how this wastes memory.   When you relocate the
frames from the holes you can reclaim the p2m pages for the holes (and
replace them with the r/o mapped identity p2m page).


Okay, this would work, I guess.

I'll have a try with some new patches...


Juergen



Re: [Xen-devel] [PATCH 06/13] xen: detect pre-allocated memory interfering with e820 map

2015-02-23 Thread Juergen Gross

On 02/19/2015 07:07 PM, David Vrabel wrote:

On 18/02/2015 06:51, Juergen Gross wrote:

Currently, especially for dom0, guest memory whose guest pfns do not
match host areas populated with RAM is remapped to areas which are
backed by RAM as well. This is done to be able to use identity
mappings (pfn == mfn) for I/O areas.

Up to now it is not checked whether the remapped memory is already
in use. Remapping used memory will probably result in data corruption,
as the remapped memory will no longer be reserved. Any memory
allocation after the remap can claim that memory.

Add an infrastructure to check reserved memory areas for conflicts
and, in case of a conflict, to react via an area-specific function.

This function has 3 options:
- Panic
- Handle the conflict by moving the data to another memory area.
   This is indicated by a return value other than 0.
- Just return 0. This will delay invalidating the conflicting memory
   area to just before doing the remap. This option is usable only for
   cases where the memory will no longer be needed when the remap
   operation is started, e.g. for the p2m list, which has already been
   copied by then.

When doing the remap, check for not remapping a reserved page.

Signed-off-by: Juergen Gross jgr...@suse.com
---
  arch/x86/xen/setup.c   | 185
+++--
  arch/x86/xen/xen-ops.h |   2 +
  2 files changed, 182 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 0dda131..a0af554 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -59,6 +59,20 @@ static unsigned long xen_remap_mfn __initdata =
INVALID_P2M_ENTRY;
  static unsigned long xen_remap_pfn;
  static unsigned long xen_max_pfn;

+/*
+ * Areas with memblock_reserve()d memory to be checked against final
E820 map.
+ * Each area has an associated function to handle conflicts (by either
+ * removing the conflict or by just crashing with an appropriate
message).
+ * The array has a fixed size as there are only few areas of interest
which are
+ * well known: kernel, page tables, p2m, initrd.
+ */
+#define XEN_N_RESERVED_AREAS 4
+static struct {
+phys_addr_t start;
+phys_addr_t size;
+int (*func)(phys_addr_t start, phys_addr_t size);
+} xen_reserved_area[XEN_N_RESERVED_AREAS] __initdata;
+
  /*
   * The maximum amount of extra memory compared to the base size.  The
   * main scaling factor is the size of struct page.  At extreme ratios
@@ -365,10 +379,10 @@ static void __init
xen_set_identity_and_remap_chunk(unsigned long start_pfn,
  unsigned long end_pfn, unsigned long *released, unsigned long
*remapped)
  {
  unsigned long pfn;
-unsigned long i = 0;
+unsigned long i;
  unsigned long n = end_pfn - start_pfn;

-while (i < n) {
+for (i = 0; i < n; ) {
  unsigned long cur_pfn = start_pfn + i;
  unsigned long left = n - i;
  unsigned long size = left;
@@ -411,6 +425,53 @@ static void __init
xen_set_identity_and_remap_chunk(unsigned long start_pfn,
(unsigned long)__va(pfn << PAGE_SHIFT),
  mfn_pte(pfn, PAGE_KERNEL_IO), 0);
  }
+/* Check to be remapped memory area for conflicts with reserved areas.
+ *
+ * Skip regions known to be reserved which are handled later. For these
+ * regions we have to increase the remapped counter in order to reserve
+ * extra memory space.
+ *
+ * In case a memory page already in use is to be remapped, just BUG().
+ */
+static void __init xen_set_identity_and_remap_chunk_chk(unsigned long
start_pfn,
+unsigned long end_pfn, unsigned long *released, unsigned long
*remapped)


...remap_chunk_checked() ?


I just wanted to avoid making the function name even longer. OTOH I
really don't mind using your suggestion. :-)




+{
+unsigned long pfn;
+unsigned long area_start, area_end;
+unsigned i;
+
+for (i = 0; i < XEN_N_RESERVED_AREAS; i++) {
+
+if (!xen_reserved_area[i].size)
+break;
+
+area_start = PFN_DOWN(xen_reserved_area[i].start);
+area_end = PFN_UP(xen_reserved_area[i].start +
+  xen_reserved_area[i].size);
+if (area_start >= end_pfn || area_end <= start_pfn)
+continue;
+
+if (area_start > start_pfn)
+xen_set_identity_and_remap_chunk(start_pfn, area_start,
+ released, remapped);
+
+if (area_end < end_pfn)
+xen_set_identity_and_remap_chunk(area_end, end_pfn,
+ released, remapped);
+
+*remapped += min(area_end, end_pfn) -
+max(area_start, start_pfn);
+
+return;


Why not defer the whole chunk that conflicts?  Or for that matter defer
all this remapping to the last minute.


There are two problems arising from this:

- In the initrd case remapping would be deferred too long: the initrd
  data is still in use when device initialization is running. And we
  really want the remap to have 

Re: [Xen-devel] [PATCH 06/13] xen: detect pre-allocated memory interfering with e820 map

2015-02-19 Thread David Vrabel

On 18/02/2015 06:51, Juergen Gross wrote:

Currently, especially for dom0, guest memory whose guest pfns do not
match host areas populated with RAM is remapped to areas which are
backed by RAM as well. This is done to be able to use identity
mappings (pfn == mfn) for I/O areas.

Up to now it is not checked whether the remapped memory is already
in use. Remapping used memory will probably result in data corruption,
as the remapped memory will no longer be reserved. Any memory
allocation after the remap can claim that memory.

Add an infrastructure to check reserved memory areas for conflicts
and, in case of a conflict, to react via an area-specific function.

This function has 3 options:
- Panic
- Handle the conflict by moving the data to another memory area.
   This is indicated by a return value other than 0.
- Just return 0. This will delay invalidating the conflicting memory
   area to just before doing the remap. This option is usable only for
   cases where the memory will no longer be needed when the remap
   operation is started, e.g. for the p2m list, which has already been
   copied by then.

When doing the remap, check for not remapping a reserved page.
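
As an illustration only (not part of the patch): a conflict handler for, say,
the initrd area, registered in the xen_reserved_area[] table added below,
might look roughly like the sketch that follows. xen_relocate_initrd() is a
made-up helper name standing in for whatever actually moves the data.

/* Sketch only, not part of this patch.
 * xen_relocate_initrd() is a hypothetical helper that copies the initrd
 * to a safe location and updates the pointers to it. */
static int __init xen_initrd_conflict(phys_addr_t start, phys_addr_t size)
{
	/* Option 2: move the data; a non-zero return value tells the
	 * caller that the conflict has been handled. */
	if (xen_relocate_initrd(start, size) == 0)
		return 1;

	/* Option 1: give up if the data cannot be moved. */
	panic("initrd at %pa (size %pa) conflicts with the E820 map",
	      &start, &size);

	/* Option 3 would be "return 0": defer invalidating the area to
	 * just before the remap, suitable only for data that is no longer
	 * needed by then (e.g. the already copied p2m list). */
}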

Signed-off-by: Juergen Gross jgr...@suse.com
---
  arch/x86/xen/setup.c   | 185 +++--
  arch/x86/xen/xen-ops.h |   2 +
  2 files changed, 182 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 0dda131..a0af554 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -59,6 +59,20 @@ static unsigned long xen_remap_mfn __initdata = 
INVALID_P2M_ENTRY;
  static unsigned long xen_remap_pfn;
  static unsigned long xen_max_pfn;

+/*
+ * Areas with memblock_reserve()d memory to be checked against final E820 map.
+ * Each area has an associated function to handle conflicts (by either
+ * removing the conflict or by just crashing with an appropriate message).
+ * The array has a fixed size as there are only few areas of interest which are
+ * well known: kernel, page tables, p2m, initrd.
+ */
+#define XEN_N_RESERVED_AREAS   4
+static struct {
+   phys_addr_t start;
+   phys_addr_t size;
+   int (*func)(phys_addr_t start, phys_addr_t size);
+} xen_reserved_area[XEN_N_RESERVED_AREAS] __initdata;
+
  /*
   * The maximum amount of extra memory compared to the base size.  The
   * main scaling factor is the size of struct page.  At extreme ratios
@@ -365,10 +379,10 @@ static void __init 
xen_set_identity_and_remap_chunk(unsigned long start_pfn,
unsigned long end_pfn, unsigned long *released, unsigned long *remapped)
  {
unsigned long pfn;
-   unsigned long i = 0;
+   unsigned long i;
unsigned long n = end_pfn - start_pfn;

-   while (i < n) {
+   for (i = 0; i < n; ) {
unsigned long cur_pfn = start_pfn + i;
unsigned long left = n - i;
unsigned long size = left;
@@ -411,6 +425,53 @@ static void __init 
xen_set_identity_and_remap_chunk(unsigned long start_pfn,
(unsigned long)__va(pfn << PAGE_SHIFT),
mfn_pte(pfn, PAGE_KERNEL_IO), 0);
  }
+/* Check to be remapped memory area for conflicts with reserved areas.
+ *
+ * Skip regions known to be reserved which are handled later. For these
+ * regions we have to increase the remapped counter in order to reserve
+ * extra memory space.
+ *
+ * In case a memory page already in use is to be remapped, just BUG().
+ */
+static void __init xen_set_identity_and_remap_chunk_chk(unsigned long 
start_pfn,
+   unsigned long end_pfn, unsigned long *released, unsigned long *remapped)


...remap_chunk_checked() ?


+{
+   unsigned long pfn;
+   unsigned long area_start, area_end;
+   unsigned i;
+
+   for (i = 0; i < XEN_N_RESERVED_AREAS; i++) {
+
+   if (!xen_reserved_area[i].size)
+   break;
+
+   area_start = PFN_DOWN(xen_reserved_area[i].start);
+   area_end = PFN_UP(xen_reserved_area[i].start +
+ xen_reserved_area[i].size);
+   if (area_start >= end_pfn || area_end <= start_pfn)
+   continue;
+
+   if (area_start > start_pfn)
+   xen_set_identity_and_remap_chunk(start_pfn, area_start,
+released, remapped);
+
+   if (area_end < end_pfn)
+   xen_set_identity_and_remap_chunk(area_end, end_pfn,
+released, remapped);
+
+   *remapped += min(area_end, end_pfn) -
+   max(area_start, start_pfn);
+
+   return;


Why not defer the whole chunk that conflicts?  Or for that matter defer 
all this remapping to the last minute.


David
