[Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Tiejun Chen
Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
v10:

* Instead of correcting e820, I'd like to correct memory_map.map[]
  and then copy them into e820 directly. I think this can make sure
  hvm_info, memory_map.map[] and e820 are on the same page.

v9:

* Refine that chunk of codes to check/modify highmem

v8:

* define low_mem_end as uint32_t

* Correct those two wrong loops, memory_map.nr_map - nr
  when we're trying to revise low/high memory e820 entries.

* Improve code comments and the patch head description

* Add one check if highmem is just populated by hvmloader itself

v5 ~ v7:

* Nothing is changed.

v4:

* Rename local variable, low_mem_pgend, to low_mem_end.

* Improve some code comments

* Adjust highmem after lowmem is changed.
 
 tools/firmware/hvmloader/e820.c | 108 +++-
 1 file changed, 95 insertions(+), 13 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 7a414ab..ca794ad 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820,
  unsigned int lowmem_reserved_base,
  unsigned int bios_image_base)
 {
-unsigned int nr = 0;
+unsigned int nr = 0, i, j;
+uint32_t low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT;
+uint32_t add_high_mem = 0;
+uint64_t high_mem_end = (uint64_t)hvm_info-high_mem_pgend  PAGE_SHIFT;
+uint64_t map_start, map_size, map_end;
 
 if ( !lowmem_reserved_base )
 lowmem_reserved_base = 0xA;
@@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820,
 e820[nr].type = E820_RESERVED;
 nr++;
 
-/* Low RAM goes here. Reserve space for special pages. */
-BUG_ON((hvm_info-low_mem_pgend  PAGE_SHIFT)  (2u  20));
-e820[nr].addr = 0x10;
-e820[nr].size = (hvm_info-low_mem_pgend  PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
-nr++;
-
 /*
  * Explicitly reserve space for special pages.
  * This space starts at RESERVED_MEMBASE an extends to cover various
@@ -191,16 +188,101 @@ int build_e820_table(struct e820entry *e820,
 nr++;
 }
 
+/* Low RAM goes here. Reserve space for special pages. */
+BUG_ON(low_mem_end  (2u  20));
 
-if ( hvm_info-high_mem_pgend )
+/*
+ * Construct E820 table according to recorded memory map.
+ *
+ * The memory map created by toolstack may include,
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ *
+ * #2. Reserved regions if they exist
+ *
+ * #3. High memory region if it exists
+ *
+ * Note we just have one low memory entry and one high mmeory entry if
+ * exists.
+ *
+ * But we may have relocated RAM to allocate sufficient MMIO previously
+ * so low_mem_pgend would be changed over there. And here memory_map[]
+ * records the original low/high memory, so if low_mem_end is less than
+ * the original we need to revise low/high memory range firstly.
+ */
+for ( i = 0; i  memory_map.nr_map; i++ )
 {
-e820[nr].addr = ((uint64_t)1  32);
-e820[nr].size =
-((uint64_t)hvm_info-high_mem_pgend  PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+/* If we need to adjust lowmem. */
+if ( memory_map.map[i].type == E820_RAM 
+ low_mem_end  map_start  low_mem_end  map_end )
+{
+add_high_mem = map_end - low_mem_end;
+memory_map.map[i].size = low_mem_end - map_start;
+break;
+}
+}
+
+/* If we need to adjust highmem. */
+if ( add_high_mem )
+{
+/* Modify the existing highmem region if it exists. */
+for ( i = 0; i  memory_map.nr_map; i++ )
+{
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+if ( memory_map.map[i].type == E820_RAM 
+ 

Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Jan Beulich
 On 20.07.15 at 08:16, tiejun.c...@intel.com wrote:
 +/* If there was no highmem region, just create one. */
 +if ( i == memory_map.nr_map )
 +{
 +memory_map.map[i].addr = ((uint64_t)1  32);
 +memory_map.map[i].size = add_high_mem;
 +memory_map.map[i].type = E820_RAM;

Don't you need to increment memory_map.nr_map here?

 +}
 +
 +/* A sanity check if high memory is broken. */
 +BUG_ON( high_mem_end !=
 +memory_map.map[i].addr + memory_map.map[i].size);
 +}
 +
 +/* Now fulfill e820. */

s/fulfill/fill/.

 +/* Finally we need to sort all e820 entries. */
 +for ( j = 0; j  nr-1; j++ )
 +{
 +for ( i = j+1; i  nr; i++ )

Blanks around binary operators please.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Chen, Tiejun

v10:

* Instead of correcting e820, I'd like to correct memory_map.map[]
   and then copy them into e820 directly. I think this can make sure
   hvm_info, memory_map.map[] and e820 are on the same page.


Actually, now that you mention it -- this should probably happen
instead when we update hvm_info-{low,high}_mem_pgend.


I also considered this point previously but I thought just right now we 
only update hvm_info-low/high_mem_pgend inside pci_setup(). But you 
can't guarantee this would be a sole place in the future. Instead, 
memory_map.map[] would always be copied into e820 when we build e820 table.


So I think we'd better do this update once at the last minute.

Thanks
Tiejun



I'm happy to leave this where it is for now, so with Jan's comments
addressed (in particular, incrementing nr_map):

Reviewed-by: George Dunlap george.dun...@eu.citrix.com

But if we have time it might be better to pull the loop in pci_setup()
which moves the memory out into a function, and have that function
modify the lowmem and highmem map[] entries at the same time we modify
low_mem_pgend and high_mem_pgend.  Alternately, you could put that on
your list of clean-ups to hvmloader for 4.7.

Thanks,
  -George



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread George Dunlap
On Mon, Jul 20, 2015 at 7:16 AM, Tiejun Chen tiejun.c...@intel.com wrote:
 Now use the hypervisor-supplied memory map to build our final e820 table:
 * Add regions for BIOS ranges and other special mappings not in the
   hypervisor map
 * Add in the hypervisor supplied regions
 * Adjust the lowmem and highmem regions if we've had to relocate
   memory (adding a highmem region if necessary)
 * Sort all the ranges so that they appear in memory order.

 CC: Keir Fraser k...@xen.org
 CC: Jan Beulich jbeul...@suse.com
 CC: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 ---
 v10:

 * Instead of correcting e820, I'd like to correct memory_map.map[]
   and then copy them into e820 directly. I think this can make sure
   hvm_info, memory_map.map[] and e820 are on the same page.

Actually, now that you mention it -- this should probably happen
instead when we update hvm_info-{low,high}_mem_pgend.

I'm happy to leave this where it is for now, so with Jan's comments
addressed (in particular, incrementing nr_map):

Reviewed-by: George Dunlap george.dun...@eu.citrix.com

But if we have time it might be better to pull the loop in pci_setup()
which moves the memory out into a function, and have that function
modify the lowmem and highmem map[] entries at the same time we modify
low_mem_pgend and high_mem_pgend.  Alternately, you could put that on
your list of clean-ups to hvmloader for 4.7.

Thanks,
 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Chen, Tiejun
Looks just a little bit should be changed so I also paste this new 
online to try winning your Acked here,



hvmloader/e820: construct guest e820 table

Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.

CC: Keir Fraser k...@xen.org
CC: Jan Beulich jbeul...@suse.com
CC: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com
Reviewed-by: George Dunlap george.dun...@eu.citrix.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
 tools/firmware/hvmloader/e820.c | 109 
+++-

 1 file changed, 96 insertions(+), 13 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c 
b/tools/firmware/hvmloader/e820.c

index 7a414ab..a6cacdf 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820,
  unsigned int lowmem_reserved_base,
  unsigned int bios_image_base)
 {
-unsigned int nr = 0;
+unsigned int nr = 0, i, j;
+uint32_t low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT;
+uint32_t add_high_mem = 0;
+uint64_t high_mem_end = (uint64_t)hvm_info-high_mem_pgend  
PAGE_SHIFT;

+uint64_t map_start, map_size, map_end;

 if ( !lowmem_reserved_base )
 lowmem_reserved_base = 0xA;
@@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820,
 e820[nr].type = E820_RESERVED;
 nr++;

-/* Low RAM goes here. Reserve space for special pages. */
-BUG_ON((hvm_info-low_mem_pgend  PAGE_SHIFT)  (2u  20));
-e820[nr].addr = 0x10;
-e820[nr].size = (hvm_info-low_mem_pgend  PAGE_SHIFT) - 
e820[nr].addr;

-e820[nr].type = E820_RAM;
-nr++;
-
 /*
  * Explicitly reserve space for special pages.
  * This space starts at RESERVED_MEMBASE an extends to cover various
@@ -191,16 +188,102 @@ int build_e820_table(struct e820entry *e820,
 nr++;
 }

+/* Low RAM goes here. Reserve space for special pages. */
+BUG_ON(low_mem_end  (2u  20));

-if ( hvm_info-high_mem_pgend )
+/*
+ * Construct E820 table according to recorded memory map.
+ *
+ * The memory map created by toolstack may include,
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ *
+ * #2. Reserved regions if they exist
+ *
+ * #3. High memory region if it exists
+ *
+ * Note we just have one low memory entry and one high mmeory entry if
+ * exists.
+ *
+ * But we may have relocated RAM to allocate sufficient MMIO previously
+ * so low_mem_pgend would be changed over there. And here memory_map[]
+ * records the original low/high memory, so if low_mem_end is less than
+ * the original we need to revise low/high memory range firstly.
+ */
+for ( i = 0; i  memory_map.nr_map; i++ )
 {
-e820[nr].addr = ((uint64_t)1  32);
-e820[nr].size =
-((uint64_t)hvm_info-high_mem_pgend  PAGE_SHIFT) - 
e820[nr].addr;

-e820[nr].type = E820_RAM;
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+/* If we need to adjust lowmem. */
+if ( memory_map.map[i].type == E820_RAM 
+ low_mem_end  map_start  low_mem_end  map_end )
+{
+add_high_mem = map_end - low_mem_end;
+memory_map.map[i].size = low_mem_end - map_start;
+break;
+}
+}
+
+/* If we need to adjust highmem. */
+if ( add_high_mem )
+{
+/* Modify the existing highmem region if it exists. */
+for ( i = 0; i  memory_map.nr_map; i++ )
+{
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+if ( memory_map.map[i].type == E820_RAM 
+ map_start == ((uint64_t)1  32))
+{
+memory_map.map[i].size += add_high_mem;
+break;
+}
+}
+
+/* If there was no highmem region, just create one. */
+if ( i == memory_map.nr_map )
+{
+memory_map.map[i].addr = ((uint64_t)1  32);
+memory_map.map[i].size = add_high_mem;
+memory_map.map[i].type = E820_RAM;
+memory_map.nr_map++;
+}
+
+/* A sanity 

Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread George Dunlap
On Mon, Jul 20, 2015 at 2:23 PM, Chen, Tiejun tiejun.c...@intel.com wrote:
 v10:

 * Instead of correcting e820, I'd like to correct memory_map.map[]
and then copy them into e820 directly. I think this can make sure
hvm_info, memory_map.map[] and e820 are on the same page.


 Actually, now that you mention it -- this should probably happen
 instead when we update hvm_info-{low,high}_mem_pgend.


 I also considered this point previously but I thought just right now we only
 update hvm_info-low/high_mem_pgend inside pci_setup(). But you can't
 guarantee this would be a sole place in the future. Instead,
 memory_map.map[] would always be copied into e820 when we build e820 table.

We can guarantee it, if nothing else by making sure that no new
changes are added which change the one but not the other.

But perhaps better would be to put a check in build_e820_map() to
BUG() if hvm_info and memory_map are out of sync.

On the other hand, looking at this now, I think that long-term we
should probably move to have *all* information about the memory layout
passed to hvmloader via the memory map, rather than via hvm_info.
That way we can also get rid of the magic knowledge that hvmloader
has about the memory layout (e.g., the VGA hole).

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Chen, Tiejun

Actually, now that you mention it -- this should probably happen
instead when we update hvm_info-{low,high}_mem_pgend.



I also considered this point previously but I thought just right now we only
update hvm_info-low/high_mem_pgend inside pci_setup(). But you can't
guarantee this would be a sole place in the future. Instead,
memory_map.map[] would always be copied into e820 when we build e820 table.


We can guarantee it, if nothing else by making sure that no new
changes are added which change the one but not the other.


This means you have to syn this again once you change hvm_info so I 
think this may cost a little bit.


Thanks
Tiejun



But perhaps better would be to put a check in build_e820_map() to
BUG() if hvm_info and memory_map are out of sync.

On the other hand, looking at this now, I think that long-term we
should probably move to have *all* information about the memory layout
passed to hvmloader via the memory map, rather than via hvm_info.
That way we can also get rid of the magic knowledge that hvmloader
has about the memory layout (e.g., the VGA hole).

  -George



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Jan Beulich
 On 20.07.15 at 16:35, tiejun.c...@intel.com wrote:
 Looks just a little bit should be changed so I also paste this new 
 online to try winning your Acked here,

Just like the other one, provided it also works,
Reviewed-by: Jan Beulich jbeul...@suse.com


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel