[Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
Now use the hypervisor-supplied memory map to build our final e820 table: * Add regions for BIOS ranges and other special mappings not in the hypervisor map * Add in the hypervisor supplied regions * Adjust the lowmem and highmem regions if we've had to relocate memory (adding a highmem region if necessary) * Sort all the ranges so that they appear in memory order. CC: Keir Fraser k...@xen.org CC: Jan Beulich jbeul...@suse.com CC: Andrew Cooper andrew.coop...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Stefano Stabellini stefano.stabell...@eu.citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei.l...@citrix.com Signed-off-by: Tiejun Chen tiejun.c...@intel.com --- v10: * Instead of correcting e820, I'd like to correct memory_map.map[] and then copy them into e820 directly. I think this can make sure hvm_info, memory_map.map[] and e820 are on the same page. v9: * Refine that chunk of codes to check/modify highmem v8: * define low_mem_end as uint32_t * Correct those two wrong loops, memory_map.nr_map - nr when we're trying to revise low/high memory e820 entries. * Improve code comments and the patch head description * Add one check if highmem is just populated by hvmloader itself v5 ~ v7: * Nothing is changed. v4: * Rename local variable, low_mem_pgend, to low_mem_end. * Improve some code comments * Adjust highmem after lowmem is changed. tools/firmware/hvmloader/e820.c | 108 +++- 1 file changed, 95 insertions(+), 13 deletions(-) diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c index 7a414ab..ca794ad 100644 --- a/tools/firmware/hvmloader/e820.c +++ b/tools/firmware/hvmloader/e820.c @@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820, unsigned int lowmem_reserved_base, unsigned int bios_image_base) { -unsigned int nr = 0; +unsigned int nr = 0, i, j; +uint32_t low_mem_end = hvm_info-low_mem_pgend PAGE_SHIFT; +uint32_t add_high_mem = 0; +uint64_t high_mem_end = (uint64_t)hvm_info-high_mem_pgend PAGE_SHIFT; +uint64_t map_start, map_size, map_end; if ( !lowmem_reserved_base ) lowmem_reserved_base = 0xA; @@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820, e820[nr].type = E820_RESERVED; nr++; -/* Low RAM goes here. Reserve space for special pages. */ -BUG_ON((hvm_info-low_mem_pgend PAGE_SHIFT) (2u 20)); -e820[nr].addr = 0x10; -e820[nr].size = (hvm_info-low_mem_pgend PAGE_SHIFT) - e820[nr].addr; -e820[nr].type = E820_RAM; -nr++; - /* * Explicitly reserve space for special pages. * This space starts at RESERVED_MEMBASE an extends to cover various @@ -191,16 +188,101 @@ int build_e820_table(struct e820entry *e820, nr++; } +/* Low RAM goes here. Reserve space for special pages. */ +BUG_ON(low_mem_end (2u 20)); -if ( hvm_info-high_mem_pgend ) +/* + * Construct E820 table according to recorded memory map. + * + * The memory map created by toolstack may include, + * + * #1. Low memory region + * + * Low RAM starts at least from 1M to make sure all standard regions + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios, + * have enough space. + * + * #2. Reserved regions if they exist + * + * #3. High memory region if it exists + * + * Note we just have one low memory entry and one high mmeory entry if + * exists. + * + * But we may have relocated RAM to allocate sufficient MMIO previously + * so low_mem_pgend would be changed over there. And here memory_map[] + * records the original low/high memory, so if low_mem_end is less than + * the original we need to revise low/high memory range firstly. + */ +for ( i = 0; i memory_map.nr_map; i++ ) { -e820[nr].addr = ((uint64_t)1 32); -e820[nr].size = -((uint64_t)hvm_info-high_mem_pgend PAGE_SHIFT) - e820[nr].addr; -e820[nr].type = E820_RAM; +map_start = memory_map.map[i].addr; +map_size = memory_map.map[i].size; +map_end = map_start + map_size; + +/* If we need to adjust lowmem. */ +if ( memory_map.map[i].type == E820_RAM + low_mem_end map_start low_mem_end map_end ) +{ +add_high_mem = map_end - low_mem_end; +memory_map.map[i].size = low_mem_end - map_start; +break; +} +} + +/* If we need to adjust highmem. */ +if ( add_high_mem ) +{ +/* Modify the existing highmem region if it exists. */ +for ( i = 0; i memory_map.nr_map; i++ ) +{ +map_start = memory_map.map[i].addr; +map_size = memory_map.map[i].size; +map_end = map_start + map_size; + +if ( memory_map.map[i].type == E820_RAM +
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
On 20.07.15 at 08:16, tiejun.c...@intel.com wrote: +/* If there was no highmem region, just create one. */ +if ( i == memory_map.nr_map ) +{ +memory_map.map[i].addr = ((uint64_t)1 32); +memory_map.map[i].size = add_high_mem; +memory_map.map[i].type = E820_RAM; Don't you need to increment memory_map.nr_map here? +} + +/* A sanity check if high memory is broken. */ +BUG_ON( high_mem_end != +memory_map.map[i].addr + memory_map.map[i].size); +} + +/* Now fulfill e820. */ s/fulfill/fill/. +/* Finally we need to sort all e820 entries. */ +for ( j = 0; j nr-1; j++ ) +{ +for ( i = j+1; i nr; i++ ) Blanks around binary operators please. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
v10: * Instead of correcting e820, I'd like to correct memory_map.map[] and then copy them into e820 directly. I think this can make sure hvm_info, memory_map.map[] and e820 are on the same page. Actually, now that you mention it -- this should probably happen instead when we update hvm_info-{low,high}_mem_pgend. I also considered this point previously but I thought just right now we only update hvm_info-low/high_mem_pgend inside pci_setup(). But you can't guarantee this would be a sole place in the future. Instead, memory_map.map[] would always be copied into e820 when we build e820 table. So I think we'd better do this update once at the last minute. Thanks Tiejun I'm happy to leave this where it is for now, so with Jan's comments addressed (in particular, incrementing nr_map): Reviewed-by: George Dunlap george.dun...@eu.citrix.com But if we have time it might be better to pull the loop in pci_setup() which moves the memory out into a function, and have that function modify the lowmem and highmem map[] entries at the same time we modify low_mem_pgend and high_mem_pgend. Alternately, you could put that on your list of clean-ups to hvmloader for 4.7. Thanks, -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
On Mon, Jul 20, 2015 at 7:16 AM, Tiejun Chen tiejun.c...@intel.com wrote: Now use the hypervisor-supplied memory map to build our final e820 table: * Add regions for BIOS ranges and other special mappings not in the hypervisor map * Add in the hypervisor supplied regions * Adjust the lowmem and highmem regions if we've had to relocate memory (adding a highmem region if necessary) * Sort all the ranges so that they appear in memory order. CC: Keir Fraser k...@xen.org CC: Jan Beulich jbeul...@suse.com CC: Andrew Cooper andrew.coop...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Stefano Stabellini stefano.stabell...@eu.citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei.l...@citrix.com Signed-off-by: Tiejun Chen tiejun.c...@intel.com --- v10: * Instead of correcting e820, I'd like to correct memory_map.map[] and then copy them into e820 directly. I think this can make sure hvm_info, memory_map.map[] and e820 are on the same page. Actually, now that you mention it -- this should probably happen instead when we update hvm_info-{low,high}_mem_pgend. I'm happy to leave this where it is for now, so with Jan's comments addressed (in particular, incrementing nr_map): Reviewed-by: George Dunlap george.dun...@eu.citrix.com But if we have time it might be better to pull the loop in pci_setup() which moves the memory out into a function, and have that function modify the lowmem and highmem map[] entries at the same time we modify low_mem_pgend and high_mem_pgend. Alternately, you could put that on your list of clean-ups to hvmloader for 4.7. Thanks, -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
Looks just a little bit should be changed so I also paste this new online to try winning your Acked here, hvmloader/e820: construct guest e820 table Now use the hypervisor-supplied memory map to build our final e820 table: * Add regions for BIOS ranges and other special mappings not in the hypervisor map * Add in the hypervisor supplied regions * Adjust the lowmem and highmem regions if we've had to relocate memory (adding a highmem region if necessary) * Sort all the ranges so that they appear in memory order. CC: Keir Fraser k...@xen.org CC: Jan Beulich jbeul...@suse.com CC: Andrew Cooper andrew.coop...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Stefano Stabellini stefano.stabell...@eu.citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei.l...@citrix.com Reviewed-by: George Dunlap george.dun...@eu.citrix.com Signed-off-by: Tiejun Chen tiejun.c...@intel.com --- tools/firmware/hvmloader/e820.c | 109 +++- 1 file changed, 96 insertions(+), 13 deletions(-) diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c index 7a414ab..a6cacdf 100644 --- a/tools/firmware/hvmloader/e820.c +++ b/tools/firmware/hvmloader/e820.c @@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820, unsigned int lowmem_reserved_base, unsigned int bios_image_base) { -unsigned int nr = 0; +unsigned int nr = 0, i, j; +uint32_t low_mem_end = hvm_info-low_mem_pgend PAGE_SHIFT; +uint32_t add_high_mem = 0; +uint64_t high_mem_end = (uint64_t)hvm_info-high_mem_pgend PAGE_SHIFT; +uint64_t map_start, map_size, map_end; if ( !lowmem_reserved_base ) lowmem_reserved_base = 0xA; @@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820, e820[nr].type = E820_RESERVED; nr++; -/* Low RAM goes here. Reserve space for special pages. */ -BUG_ON((hvm_info-low_mem_pgend PAGE_SHIFT) (2u 20)); -e820[nr].addr = 0x10; -e820[nr].size = (hvm_info-low_mem_pgend PAGE_SHIFT) - e820[nr].addr; -e820[nr].type = E820_RAM; -nr++; - /* * Explicitly reserve space for special pages. * This space starts at RESERVED_MEMBASE an extends to cover various @@ -191,16 +188,102 @@ int build_e820_table(struct e820entry *e820, nr++; } +/* Low RAM goes here. Reserve space for special pages. */ +BUG_ON(low_mem_end (2u 20)); -if ( hvm_info-high_mem_pgend ) +/* + * Construct E820 table according to recorded memory map. + * + * The memory map created by toolstack may include, + * + * #1. Low memory region + * + * Low RAM starts at least from 1M to make sure all standard regions + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios, + * have enough space. + * + * #2. Reserved regions if they exist + * + * #3. High memory region if it exists + * + * Note we just have one low memory entry and one high mmeory entry if + * exists. + * + * But we may have relocated RAM to allocate sufficient MMIO previously + * so low_mem_pgend would be changed over there. And here memory_map[] + * records the original low/high memory, so if low_mem_end is less than + * the original we need to revise low/high memory range firstly. + */ +for ( i = 0; i memory_map.nr_map; i++ ) { -e820[nr].addr = ((uint64_t)1 32); -e820[nr].size = -((uint64_t)hvm_info-high_mem_pgend PAGE_SHIFT) - e820[nr].addr; -e820[nr].type = E820_RAM; +map_start = memory_map.map[i].addr; +map_size = memory_map.map[i].size; +map_end = map_start + map_size; + +/* If we need to adjust lowmem. */ +if ( memory_map.map[i].type == E820_RAM + low_mem_end map_start low_mem_end map_end ) +{ +add_high_mem = map_end - low_mem_end; +memory_map.map[i].size = low_mem_end - map_start; +break; +} +} + +/* If we need to adjust highmem. */ +if ( add_high_mem ) +{ +/* Modify the existing highmem region if it exists. */ +for ( i = 0; i memory_map.nr_map; i++ ) +{ +map_start = memory_map.map[i].addr; +map_size = memory_map.map[i].size; +map_end = map_start + map_size; + +if ( memory_map.map[i].type == E820_RAM + map_start == ((uint64_t)1 32)) +{ +memory_map.map[i].size += add_high_mem; +break; +} +} + +/* If there was no highmem region, just create one. */ +if ( i == memory_map.nr_map ) +{ +memory_map.map[i].addr = ((uint64_t)1 32); +memory_map.map[i].size = add_high_mem; +memory_map.map[i].type = E820_RAM; +memory_map.nr_map++; +} + +/* A sanity
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
On Mon, Jul 20, 2015 at 2:23 PM, Chen, Tiejun tiejun.c...@intel.com wrote: v10: * Instead of correcting e820, I'd like to correct memory_map.map[] and then copy them into e820 directly. I think this can make sure hvm_info, memory_map.map[] and e820 are on the same page. Actually, now that you mention it -- this should probably happen instead when we update hvm_info-{low,high}_mem_pgend. I also considered this point previously but I thought just right now we only update hvm_info-low/high_mem_pgend inside pci_setup(). But you can't guarantee this would be a sole place in the future. Instead, memory_map.map[] would always be copied into e820 when we build e820 table. We can guarantee it, if nothing else by making sure that no new changes are added which change the one but not the other. But perhaps better would be to put a check in build_e820_map() to BUG() if hvm_info and memory_map are out of sync. On the other hand, looking at this now, I think that long-term we should probably move to have *all* information about the memory layout passed to hvmloader via the memory map, rather than via hvm_info. That way we can also get rid of the magic knowledge that hvmloader has about the memory layout (e.g., the VGA hole). -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
Actually, now that you mention it -- this should probably happen instead when we update hvm_info-{low,high}_mem_pgend. I also considered this point previously but I thought just right now we only update hvm_info-low/high_mem_pgend inside pci_setup(). But you can't guarantee this would be a sole place in the future. Instead, memory_map.map[] would always be copied into e820 when we build e820 table. We can guarantee it, if nothing else by making sure that no new changes are added which change the one but not the other. This means you have to syn this again once you change hvm_info so I think this may cost a little bit. Thanks Tiejun But perhaps better would be to put a check in build_e820_map() to BUG() if hvm_info and memory_map are out of sync. On the other hand, looking at this now, I think that long-term we should probably move to have *all* information about the memory layout passed to hvmloader via the memory map, rather than via hvm_info. That way we can also get rid of the magic knowledge that hvmloader has about the memory layout (e.g., the VGA hole). -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table
On 20.07.15 at 16:35, tiejun.c...@intel.com wrote: Looks just a little bit should be changed so I also paste this new online to try winning your Acked here, Just like the other one, provided it also works, Reviewed-by: Jan Beulich jbeul...@suse.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel