James, On Wed, Apr 25, 2018 at 02:22:07PM +0100, James Morse wrote: > Hi Akashi, > > On 25/04/18 10:20, AKASHI Takahiro wrote: > > On Tue, Apr 24, 2018 at 05:08:57PM +0100, James Morse wrote: > >> On 16/04/18 11:08, AKASHI Takahiro wrote: > >>> On Thu, Apr 12, 2018 at 05:01:52PM +0100, James Morse wrote: > >>>> On 05/04/18 03:42, AKASHI Takahiro wrote: > >>>>> On Mon, Apr 02, 2018 at 10:53:32AM +0900, AKASHI Takahiro wrote: > >>>>>> either because > >>>>>> a. new kernel (or initrd/dtb) may have been allocated on a NOMAP region > >>>>>> which are not suitable for usable memory, or > >>>>>> b. new kernel (or initrd/dtb) may have been allocated on a reserved > >>>>>> region > >>>>>> whose contents can be overwritten. > >>>>>> > >>>>>> While we see (b) even today, (a) is a backward compatibility issue. > >>>> > >>>> (a) doesn't happen because request_standard_resources() checks > >>>> memblock_is_nomap(), and reports those regions as 'reserved'. > >>> > >>> I might have confused you. The assumption here was that we adopt format > >>> (D), > >>> where all NOMAP regions are sub nodes of "System RAM", but still use > >>> the current kexec-tools. > >>> As I said above, this will end up an un-expected behavior. > >> > >> I'd like to fix this without having to fix user-space at the same time. It > >> looks > >> like no-one else has second level reserved regions, > > > > This was my assumption when I sent out a patch to kexec-tools. > > But this would still leave user-space that isn't updated broken. > > > >>>>> # I don't know yet whether people are happy with this fix, and also have > >>>>> kernel patches for my other approaches. They are neither not much > >>>>> complicated. > >>>> > >>>> I don't think we should fix this in userspace, exporting all the > >>>> memblock_reserved() regions as 'reserved' in /proc/iomem looks like the > >>>> right > >>>> thing to do. > >>> > >>> Again, if you modify /proc/iomem, you have to update kexec-tools, too. > >> > >> If we squash the memblock_reserved() stuff down so it appears as a top > >> level > >> 'reserved' region too, I don't think we do. > > > > If I correctly understand, you're talking about my format (E). > > As I said, it will fix the issue without modifying user-space, but > > > > || This does not only look quite noisy but also ignores the fact that > > || reserved regions are part of System RAM (or memblock.memory). > > I agree its noisy, there are significantly more 'reserved' areas, but these > are > all either nomap or memblock_reserved(). > > Why does it matter if a reserved-region is nomap or memblock_reserved()? Any > new > kernel will learn the difference from the EFI memory map and make its own > decisions.
Yeah, kernel can do (though kernel won't look though system resources list for this purpose anyway), what about kexec-like user applications? It may want to seek /proc/iomem to identify all the *usable* memory on the system, that is "System RAM", but doesn't care whether some range is reserved or not (for some reason) yet does care !NOMAP. > Kexec-tools only needs to know what it can overwrite without clobbering > important data like the UEFI memory map, or the APCI tables covered by the > linear map. > > > >> This prevents the efi-memory-map > >> being overwritten on kernels since kexec was merged. > >> > >> Its horribly fiddly to do this. The kernel code/data are special reserved > >> regions that we already describe as a subset of system-ram, even though > >> they are > >> both also fragments of a bigger memblock_reserved() block. > > > > Actually, we don't have to avoid kernel code/data regions as copying > > loaded data to the final destinations will be done at the very end of kexec. > > For kexec yes, but that is the existing format of the file, which we shouldn't > change, otherwise we break something else. One trivial downside in this approach is that a secondary kernel will be loaded at an address different from the one of current kernel. While it is sane, it looks a bit odd that, every time kexec'ed, a new kernel (code/data) is located back and forth :) > > >> While we can walk memblock for regions that aren't reserved, allocating > >> memory > >> in the loop changes what is reserved. That one O(N) walk ends up being > >> four... > > > > At most O(n^2)? > > I think for_each_free_mem_range() is smart enough not to do that. Patch > incoming... Yes, my v9 of kexec_file patch makes use of it. Thanks, -Takahiro AKASHI > > Thanks, > > James -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
