Hello *,

I hope you're all doing well. I recently got seL4 running on my Raspberry
Pi 3, and wanted to report some notes here and ask some questions regarding
the port, and perhaps get some help working on a patch (or twelve, I'm
sure.)

Recently I followed the Data61 blog post here to get a recent copy of seL4
on my RPi, but hit a few errors along the way that burned some time. In
particular I got burned by the U-Boot glitch and getting seL4 building due
to a glitch in the 'repo' setup for sel4test: https://research.csiro.au/
tsblog/sel4-raspberry-pi-3/

I ended up writing my own blog post, with effectively "fully reproducible"
instructions (e.g. one of the codepad.org links on the RPi3 wiki page was
dead) for a recent build. This includes proper setup for uboot.env in order
to `fatload` and `bootm` the seL4 ELF binary off disk, the right U-Boot
version, and using Docker to build the seL4 image:
https://inner-haven.net/posts/2017-09-15-sel4-rpi3.html

That's all fine, but...

I got curious about this U-Boot glitch, as even a recent (git) copy of
U-Boot still suffered from the issue reported in the original Data61 blog
post. I found it strange this regression would exist for nearly 9 months,
unless it was possibly something intentional (e.g. an intended behavioral
change) or it went unreported.

The glitch is this: after booting the seL4 binary, the system immediately
hangs with no output after trying to start the sel4test application. This
glitch did not occur in versions of U-Boot up-to v2017.01. Versions
v2017.03 and later (to now) suffer.

I decided to spend some time bisecting this problem and found the issue.
Versions of U-Boot after v2017.01 include the following patch, which causes
the data cache to be enabled prior to U-Boot loading and executing ELF
binaries:

https://github.com/thoughtpolice/u-boot/commit/
995eab8b5b580b67394312b1621c60a71042cd18

This behavior previously existed as a workaround for QNX, which required
that the dcache be disabled before execution. U-Boot, previously, disabled
the dcache when booting *any* ELF binary at all (the binary can of course
enable it or check to see if it isn't enabled already), as a simplistic
workaround. This patch changes it to only disable the dcache on U-Boot
images (uImages) marked as QNX ELF binaries, so that QNX systems boot
properly still.

It would seem then that seL4 -- or at least sel4test -- requires the dcache
to be disabled upon entry for the RPi3. If the dcache is enabled, as in the
above patch, it otherwise hangs immediately.

In fact, a recent copy of U-Boot *can* execute seL4 properly: you simply
have to use `./tools/mkimage` inside U-Boot to mark the application as
`mkimage -C none -A arm -T kernel -O qnx -d <input> <output>` first -- this
creates a uImage with the appropriate header to mark the ELF as a QNX
binary. U-Boot recognizes this and appropriately disables the dcache before
application entry, once you use `bootm` or something.

I took the time to write a patch to U-Boot in order to add a new OS
definition -- "sel4" -- in order to address this problem more clearly. My
change is here:

https://github.com/thoughtpolice/u-boot/commit/
720caf6c398584658abfa883d4d390f4b6200269

With this change, you can mark any seL4 binary as `-O sel4` to create an
appropriately bootable uImage, with the dcache appropriately disabled.

Assuming this is in place, the following U-Boot commands will execute seL4:

U-Boot> setenv bootfile uImage
U-Boot> fatload mmc 0 ${loadaddr} ${bootfile}
U-Boot> bootm ${loadaddr}
## Booting kernel from Legacy Image at 00200000 ...
   Image Name:
   Image Type:   ARM seL4 Kernel Image (uncompressed)
   Data Size:    3385548 Bytes = 3.2 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
CACHE: Misaligned operation at range [20000000, 200078bc]
CACHE: Misaligned operation at range [200078bc, 2032a720]
CACHE: Misaligned operation at range [2032c000, 2033c820]
## Starting application at 0x20000000 ...

ELF-loader started on CPU: ARM Ltd. Cortex-A53 r0p4
  paddr=[20000000..2033c81f]
ELF-loading image 'kernel'
  paddr=[0..32fff]
  vaddr=[e0000000..e0032fff]
  virt_entry=e0000000
ELF-loading image 'sel4test-driver'
  paddr=[33000..44dfff]
  vaddr=[10000..42afff]
  virt_entry=2a03c
Enabling MMU and paging
Jumping to kernel-image entry point...

----

So with that all done, it's clear there's a change needed *somewhere* but
I'm not sure where or what to do:

  1) Is the above the right approach? Clearly, seL4 on the RPi3 assumed a
disabled dcache during its development, but this has now changed, so for
this platform something will need to be fixed, such as the above.

  2) Or perhaps this is not a U-Boot problem, but should be fixed in seL4.
Or in sel4test. For example, is there an option to assume the dcache is
already enabled in the configuration? Maybe that should be toggled instead.
Or does seL4 *always* assume it is disabled at boot? What about non-RPi3
platforms like other ARM systems? These likely use U-Boot as well, but it's
unclear if the cache should be disabled here, too. In this case, the above
U-Boot patch is (in theory) wrong, but if that's true, it was always wrong
to begin with (as it always disabled the cache prior to ELF loading).

  If something like this was the case, then the rpi3 configurations (or
perhaps some bootinfo code) needs to change.

  3) Alternatively, this is just an outright bug in seL4 that should be
fixed and U-Boot was previously masking it. (I'm not sure if the proofs
cover this one :)

  4) Perhaps it should be a mix of both. For example, perhaps the
sel4/sel4test code should be changed to assume an already-enabled dcache,
or fixed outright. But perhaps a U-Boot patch would still be good to submit
upstream: for example, I could submit the above patch upstream, but
*remove* the dcache code. This would allow U-Boot users to mark seL4
binaries appropriately for clarity and future compatibility and make U-Boot
"officially support" seL4. And in the future, should seL4 need more
patches/workarounds/requirements from U-Boot, it would be easier to add and
maintain them.

---

I'd prefer not to run myself into dead ends trying to get the wrong
solution upstream to either seL4 or U-Boot, and it would be great to fix
this for users. I'd also like to look into more adventurous stuff in the
future (like multicore on the RPi3 which seems unsupported), so any advice
is appreciated to get started. Even then: bisection and the patch creation
only took an hour, so if it's wrong, not much was lost.

-- 
Regards,
Austin - PGP: 4096R/0x91384671
_______________________________________________
Devel mailing list
[email protected]
https://sel4.systems/lists/listinfo/devel

Reply via email to