On 09/03/2021 23.16, Linus Torvalds wrote:
> On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes
> <li...@rasmusvillemoes.dk> wrote:
>>
>> So add an initramfs_async= kernel parameter, allowing the main init
>> process to proceed to handling device_initcall()s without waiting for
>> populate_rootfs() to finish.
>
> Oh, and a completely unrelated second comment about this: some of the
> initramfs population code seems to be actively written to be slow.
>
> For example, I'm not sure why that write_buffer() function uses an
> array of indirect function pointer actions. Even ignoring the whole
> "speculation protections make that really slow" issue that came later,
> it seems to always have been actively (a) slower and (b) more complex.
>
> [...]
>
> Is that likely to be a big part of the costs? No. I assume it's the
> decompression and the actual VFS operations.

Yes, I have been doing some simple measurements, by decompressing the
blob in userspace and comparing that to the time used by
populate_rootfs(). For both the 6M lz4-compressed blob on my ppc target
and the 26M xz-compressed blob on my laptop, the result is that the
decompression itself accounts for the vast majority of the time - and
for ppc in particular, I don't think there's any spectre slowdown.

So I haven't dared look into changing the unpack implementation, since
it doesn't seem it could buy that much.

Rasmus
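
For reference, a userspace measurement of this kind could be sketched roughly as below. This is only an illustration, not the harness actually used: time_ns(), share_percent(), struct blob and checksum_blob() are hypothetical names, and checksum_blob() is a trivial stand-in for the real lz4/xz decompressor, which would be plugged in through the same callback.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() under strict -std= modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type: the step being timed (e.g. a decompressor). */
typedef void (*work_fn)(void *ctx);

/* Return the monotonic wall-clock time, in nanoseconds, spent in fn(ctx). */
static uint64_t time_ns(work_fn fn, void *ctx)
{
	struct timespec t0, t1;
	int64_t ns;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	fn(ctx);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
		+ ((int64_t)t1.tv_nsec - (int64_t)t0.tv_nsec);
	return (uint64_t)ns;
}

/* Share of total_ns taken by part_ns, as a whole percentage (0 if total is 0). */
static unsigned int share_percent(uint64_t part_ns, uint64_t total_ns)
{
	if (total_ns == 0)
		return 0;
	return (unsigned int)(part_ns * 100 / total_ns);
}

/* Stand-in workload (an assumption): sums the bytes of an in-memory blob. */
struct blob {
	const unsigned char *data;
	size_t len;
	unsigned long sum;
};

static void checksum_blob(void *ctx)
{
	struct blob *b = ctx;
	size_t i;

	b->sum = 0;
	for (i = 0; i < b->len; i++)
		b->sum += b->data[i];
}
```

The idea is to call time_ns() around the userspace decompression of the blob, then pass that result together with the time measured for populate_rootfs() to share_percent() to see how much of the total the decompression accounts for.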