On Mon, 27 Nov 2017 10:13:09 -0700 Simon Glass <s...@chromium.org> wrote:
> (Tom - any thoughts about a more expansive cc list on this?)
>
> Hi Masahiro,
>
> On 26 November 2017 at 07:16, Masahiro Yamada
> <yamada.masah...@socionext.com> wrote:
> > 2017-11-26 20:38 GMT+09:00 Simon Glass <s...@chromium.org>:
> >> Hi Philipp,
> >>
> >> On 25 November 2017 at 16:31, Dr. Philipp Tomsich
> >> <philipp.toms...@theobroma-systems.com> wrote:
> >>> Hi,
> >>>
> >>>> On 25 Nov 2017, at 23:34, Simon Glass <s...@chromium.org> wrote:
> >>>>
> >>>> +Tom, Masahiro, Philipp
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 22 November 2017 at 03:27, Wolfgang Denk <w...@denx.de> wrote:
> >>>>> Dear Kever Yang,
> >>>>>
> >>>>> In message
> >>>>> <fd0bb500-80c4-f317-cc18-f7aaf1344...@rock-chips.com> you
> >>>>> wrote:
> >>>>>>
> >>>>>> I can understand this feature, we always do dram_init_banks()
> >>>>>> first, then we relocate to 'known' area, then there will be no
> >>>>>> risk to access memory. I believe there must be some historical
> >>>>>> reason for some kind of device, the relocate feature is a
> >>>>>> wonderful idea for it.
> >>>>>
> >>>>> This is actually not so much a feature needed to support some
> >>>>> specific device (in this case much simpler approaches would be
> >>>>> possible), but to support a whole set of features.
> >>>>> Unfortunately these appear to get forgotten / ignored over time.
> >>>>>
> >>>>>> many other SoCs should be similar.
> >>>>>> - Without relocate we can save many steps, some of our customers
> >>>>>>   really care much about the boot time duration.
> >>>>>>   * no need to relocate everything
> >>>>>>   * no need to copy all the code
> >>>>>>   * no need to init the driver more than once
> >>>>>
> >>>>> Please have a look at the README, section "Memory Management".
> >>>>> The relocation is not done to any _fixed_ address, but the
> >>>>> address is actually computed at runtime, depending on a number
> >>>>> of features enabled (at least this is how it used to be -
> >>>>> apparently little of this is tested on a regular basis, so I
> >>>>> would not be surprised if things are broken today).
> >>>>>
> >>>>> The basic idea was to reserve areas of memory at the top of RAM,
> >>>>> that would not be initialized / modified by U-Boot and Linux,
> >>>>> not even across a reset / warm boot.
> >>>>>
> >>>>> This was used for example for:
> >>>>>
> >>>>> - pRAM (Protected RAM) which could be used to store all kinds of
> >>>>>   data (for example, using a pramfs [Protected and Persistent RAM
> >>>>>   Filesystem]) that could be kept across reboots of the OS.
> >>>>>
> >>>>> - shared frame buffer / video memory. U-Boot and Linux would be
> >>>>>   able to initialize the video memory just once (in U-Boot) and
> >>>>>   then share it, maybe even across reboots. Especially, this
> >>>>>   would allow for a very early splash screen that gets passed
> >>>>>   (flicker free) to Linux until some Linux GUI takes over (much
> >>>>>   more difficult today).
> >>>>>
> >>>>> - shared log buffer: U-Boot and Linux used to use the same
> >>>>>   syslog buffer mechanism, so you could share it between U-Boot
> >>>>>   and Linux. This allows for example to
> >>>>>   * read the Linux kernel panic messages after reset in U-Boot;
> >>>>>     this is very useful when you bring up a new system and Linux
> >>>>>     crashes before it can display the log buffer on the console
> >>>>>   * pass U-Boot POST results on to Linux, so the application code
> >>>>>     can read and process these
> >>>>>   * process the system log of the previous run (especially after
> >>>>>     a panic) in Linux after it rebooted.
> >>>>>
> >>>>> etc.
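
A minimal sketch of the top-of-RAM reservation idea described above, to make the "address computed at runtime" point concrete. Everything here is an assumption for illustration: `struct reservation`, `align_down()` and `compute_reloc_addr()` are hypothetical names, the 4 KiB alignment is arbitrary, and this is not U-Boot's actual board_init_f() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one area reserved at the top of RAM (pRAM, log buffer, ...). */
struct reservation {
	const char *name;
	uint64_t size;		/* bytes; could come from an environment variable */
};

/* Round addr down to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/*
 * Walk down from the top of RAM, carving out each reserved area, then
 * place the boot loader image directly below the last one. The result
 * depends on the sizes passed in, so it is only known at runtime.
 */
static uint64_t compute_reloc_addr(uint64_t ram_base, uint64_t ram_size,
				   const struct reservation *res, size_t n,
				   uint64_t image_size)
{
	uint64_t top = ram_base + ram_size;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Each reservation must still fit into the remaining RAM. */
		assert(top - ram_base >= res[i].size);
		top = align_down(top - res[i].size, 4096);
	}

	assert(top - ram_base >= image_size);
	return align_down(top - image_size, 4096);
}
```

With 1 GiB of RAM at 0x80000000, reservations of 1 MiB, 16 KiB and 3 MiB, and a 512 KiB image, this sketch yields 0xBFB7C000. Changing any of the reservation sizes moves the target address, which is why a fixed link address alone is not enough for this scheme.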
> >>>>>
> >>>>> There are a number of such features which require reserving
> >>>>> room at the top of RAM, the size of which is calculated at
> >>>>> runtime, often depending on user settable environment data.
> >>>>>
> >>>>> All this cannot be done without relocation to a (dynamically
> >>>>> computed) target address.
> >>>>>
> >>>>>
> >>>>> Yes, the code could be simpler and faster without that - but
> >>>>> then, you cut off a number of features.
> >>>>
> >>>> I would be interested in seeing benchmarks showing the cost of
> >>>> relocation in terms of boot time. Last time I did this was on
> >>>> Exynos 5 and it was some years ago. The time was pretty small
> >>>> provided the cache was on for the memory copies associated with
> >>>> relocation itself. Something like 10-20ms but I don't have the
> >>>> numbers handy.
> >>>>
> >>>> I think it is useful to be able to allocate memory in
> >>>> board_init_f() for use by U-Boot for things like the display and
> >>>> the malloc() region.
> >>>>
> >>>> Options we might consider:
> >>>>
> >>>> 1. Don't relocate the code and data. Thus we could avoid the
> >>>> copy and relocation cost. This is already supported with the
> >>>> GD_FLG_SKIP_RELOC used when U-Boot runs as an EFI app
> >>>>
> >>>> 2. Rather than throwing away the old malloc() region, keep it
> >>>> around so existing allocated blocks work. Then new malloc()
> >>>> region would be used for future allocations. We could perhaps
> >>>> ignore free() calls in that region
> >>>>
> >>>> 2a. This would allow us to avoid re-init of driver model in most
> >>>> cases I think. E.g. we could init serial and timer before
> >>>> relocation and leave them inited after relocation. We could just
> >>>> init the 'additional' devices not done before relocation.
> >>>>
> >>>> 2b. I suppose we could even extend this to SPL if we wanted to. I
> >>>> suspect it would just be a pain though, since SPL might use
> >>>> memory that U-Boot wants.
> >>>>
> >>>> 3. We could turn on the cache earlier. This removes most of the
> >>>> boot-time penalty. Ideally this should be turned on in SPL and
> >>>> perhaps redone in U-Boot which has more memory available. If SPL
> >>>> is not used, we could turn on the cache before relocation.
> >>>
> >>> Both turning on the cache and initialising the clocking could be
> >>> of benefit to boot-time.
> >>>
> >>> However, the biggest possible gain will come from utilising
> >>> Falcon mode to skip the full U-Boot stage and directly boot into
> >>> the OS from SPL. This assumes that the drivers involved are
> >>> fully optimised, so loading up the OS image does not take longer
> >>> than necessary.
> >>
> >> I'd like to see numbers on that. From my experience, loading and
> >> running U-Boot does not take very long...
> >>
> >>>
> >>>> 4. Rather than reserving memory in board_init_f() we could
> >>>> have it call malloc() from the expanded region. We could then
> >>>> perhaps move this reserve/allocate code into particular
> >>>> drivers or subsystems, and drop a good chunk of the init
> >>>> sequence. We would need to have a larger malloc() region than is
> >>>> currently the case.
> >>>>
> >>>> There are still some arch-specific bits in board_init_f() which
> >>>> make these sorts of changes a bit tricky to support generically.
> >>>> IMO it would be best to move to 'generic relocation' written in
> >>>> C, where all archs work basically the same way, before
> >>>> attempting any of the above.
> >>>>
> >>>> Still, I can see some benefits and even some simplifications.
> >>>>
> >>>> Regards,
> >>>> Simon
> >>>
> >
> > This discussion should have happened.
> > U-Boot boot sequence is crazily inefficient.
> >
> > When we talk about "relocation", two things are happening.
> >
> > [1] U-Boot proper copies itself to the very end of DRAM
> > [2] Fix-up the global symbols
> >
> > In my opinion, only [2] is useful.
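
A minimal sketch of what step [2] amounts to, assuming a simplified model in which the image carries a table of offsets to 64-bit slots that hold link-time absolute addresses. `struct rel_entry` and `fixup_image()` are hypothetical names for illustration; the real fix-up is architecture-specific and works on the relocation entries emitted by the linker, so treat this as an outline of the idea, not as U-Boot's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical relocation record: byte offset (from the start of the
 * image) of a 64-bit slot that holds a link-time absolute address.
 */
struct rel_entry {
	uint64_t offset;
};

/*
 * Adjust every recorded slot by the difference between the address the
 * image was linked for and the address it was actually loaded at.
 * No copy of the image is involved; only the slots are patched.
 */
static void fixup_image(uint8_t *image, size_t image_size,
			const struct rel_entry *rel, size_t nrel,
			uint64_t link_base, uint64_t load_base)
{
	/* Unsigned wrap-around makes this correct in both directions. */
	uint64_t delta = load_base - link_base;
	size_t i;

	for (i = 0; i < nrel; i++) {
		uint64_t slot;

		/* The slot must lie completely inside the image. */
		assert(image_size >= sizeof(slot));
		assert(rel[i].offset <= image_size - sizeof(slot));

		/* memcpy avoids unaligned access to the slot. */
		memcpy(&slot, image + rel[i].offset, sizeof(slot));
		slot += delta;
		memcpy(image + rel[i].offset, &slot, sizeof(slot));
	}
}
```

In this model, an image linked for 0x1000 but loaded at 0x9000 has each listed slot increased by 0x8000, and everything else is left untouched. That is the part that stays necessary when the loader picks the final address, independent of whether the copy in [1] happens.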
> >
> > SPL initializes the DRAM, so it knows the base and size of DRAM.
> > SPL should be able to load the U-Boot proper to the final
> > destination. So, [1] is unnecessary.
> >
> > [2] is necessary because SPL may load the U-Boot proper
> > to a different place than CONFIG_SYS_TEXT_BASE.
> > This feature is useful for platforms
> > whose DRAM base/size is only known at run-time.
> > (Of course, it should be user-configurable by CONFIG_RELOCATE
> > or something.)
> >
> > Moreover, board_init_f() is unneeded -
> > everything in board_init_f() is already done by SPL.
> > Multiple-time DM initialization is really inefficient and ugly.
> >
> > The following is how the ideal boot loader would work.
> >
> > Requirement for U-Boot proper:
> > U-Boot never changes the location by itself.
> > So, SPL or a vendor loader must load U-Boot proper
> > to the final destination directly.
> > (You can load it to the very end of DRAM if you like,
> > but the actual place does not matter here.)
> >
> > Boot sequence of U-Boot proper:
> > If CONFIG_RELOCATE (or something) is enabled,
> > it fixes the global symbols at the very beginning
> > of the boot.
> > (In this case, CONFIG_SYS_TEXT_BASE can be arbitrary)
> >
> > That's it. Proceed to the rest of init code.
> > (= board_init_r)
> > board_init_f() is unnecessary.
> >
> > This should work for recent platforms.
>
> Yes that sounds reasonable to me.
>
> We could do the symbol fixup/relocation in SPL after loading U-Boot,
> although that would probably push us to using ELF format for U-Boot
> which is a bit limited.
>
> Still I think the biggest performance improvement comes from turning
> on the cache in SPL. So the above is a simplification, not really a
> speed-up.
>
> >
> > We should think about old platforms that boot from a NOR flash or
> > something. There are two solutions:
> >  - execute-in-place: run the code in the flash directly
> >  - use SPL (common/spl/spl_nor.c) if you want to run
> >    it from RAM
>
> This seems like a big regression in functionality. For example for x86
> 32-bit we currently don't have an SPL (we do for 64-bit). So I think
> this means that everything would be forced to have an SPL?
>
> I am wondering who else we should cc on this discussion?

Not all boards use SPL. There are some targets which use an FBL (the
SPL counterpart) from the vendor and only U-Boot proper.

A good example is the Odroid XU3.

And I also agree: for the original post in this discussion we should
have measurements of the boot time improvement.

>
> Regards,
> Simon
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> https://lists.denx.de/listinfo/u-boot

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de