Hi Greg,

thx a lot for the feedback and hints. You asked for lots of numbers; I tried to 
add the ones I have available at the moment. Find them inline. I'm additionally 
interested in more details on some of the ideas you outlined. It would be nice 
if you could go into more detail at certain points. I added some questions 
inline as well.

> -----Original Message-----
> From: Greg KH [mailto:gre...@linuxfoundation.org]
> Sent: Sunday, December 21, 2014 6:47 PM
> To: Hoyer, Marko (ADITG/SW2)
> Cc: Umut Tezduyar Lindskog; systemd-devel@lists.freedesktop.org
> Subject: Re: [systemd-devel] Improving module loading
> On Sun, Dec 21, 2014 at 12:31:30PM +0000, Hoyer, Marko (ADITG/SW2)
> wrote:
> > > If you have control over your kernel, why not just build the
> > > modules into the kernel, then all of this isn't an issue at all
> > > and there is no overhead of module loading?
> >
> > It is a question of kernel image size and startup performance.
> > - We are somehow limited in terms of size from where we are loading
> > the kernel.
>
> What do you mean by this?  What is limiting this?  What is your limit?
> How large are these kernel modules that you are having a hard time to
> build into your kernel image?
- As far as I remember, we have special fastboot-aware partitions on the eMMC 
that are available very fast, but those are very limited in size. I'm not 
entirely sure about this point, though; it is something I was told.

- targeted kernel size: 2-3 MB compressed

- Kernel modules:
        - we have heavy graphics drivers (~800 kB, stripped); they are needed 
halfway through startup
        - video processing unit drivers (size unknown); they are needed 
halfway through startup
        - wireless & Bluetooth; needed very late
        - USB subsystem; conventionally needed very late (but this ultimately 
depends on the concrete product)
        - hot-plug mass storage handling; conventionally needed very late (but 
this ultimately depends on the concrete product)
        - audio driver; needed very late in most of our products
        - some drivers for INC communication (partly needed very early -> we 
compiled them in; partly needed later -> we have them as modules)

All in all, I'd guess the image would roughly double in size if we compiled in 
all of this.
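To back up a size estimate like that, one can simply sum the on-disk sizes of the modules in question. A minimal sketch (the directory path in the usage comment is only an example, not our real module layout):

```shell
#!/bin/sh
# Sum the sizes (in kB) of all *.ko files directly under a directory,
# to estimate how much building them in would add to the kernel image.
# Note: built-in code compresses with the image, so this is an upper bound.

total_module_kb() {
    dir="$1"
    total=0
    for mod in "$dir"/*.ko; do
        [ -f "$mod" ] || continue
        kb=$(( $(stat -c %s "$mod") / 1024 ))
        total=$((total + kb))
    done
    echo "$total"
}

# Example usage (path is illustrative):
#   total_module_kb "/lib/modules/$(uname -r)/kernel/drivers/gpu/drm"
```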

> > - Loading the image is a kind of monolithic block in terms of time
> > where you can hardly do things in parallel
>
> How long does loading a tiny kernel image actually take?

I don't know exact numbers, sorry. I'd guess somewhere between 50-200 ms, plus 
the time for unpacking. But this loading and unpacking step is important, since 
it sits directly on the critical path.

> > - We are strongly following the idea from Umut (loading things not
> > before they are actually needed) to get up early services very early
> > (e.g. rendering a camera on a display in less than 2secs after power
> > on)
>
> Ah, IVI, you all have some really strange hardware configurations :(

Yes, IVI. Since we develop our hardware as well as our software (in a different 
department), I'm interested in more details about what is strange about IVI 
hardware configurations in general. Maybe we can improve things to a certain 
extent. Could you go into more detail?

> There is no reason you have to do a "cold reset" to get your boot times
> down, there is the fun "resume from a system image" solution that
> others have done that can get that camera up and displayed in
> milliseconds.

I'm interested in this point:
- Are you talking about suspend-to-RAM, suspend-to-disk, or a hybrid 
combination of both?
- Or do you have something completely different in mind?

I personally thought about such a solution as well. I'm not yet fully 
convinced, since we have really hard timing requirements (partly mandated by 
law). So I see two principal ways to make a "resume" solution work:
- either the resume is robust enough to guarantee that the system comes up 
properly on every boot
        - achieved, for instance, by a static system image that brings the 
system into a static state very fast, from which a kind of conventional boot 
then continues ...
- or the boot after an actual "cold reset" is fast enough to at least meet the 
really important timing requirements in case the resume does not come up 
properly

> > - Some modules do time / CPU consuming things in init(), which would
> > delay the entry time into userspace
>
> Then fix them, that's the best thing about Linux, you have the source
> to not accept problems like this!  And no module should do expensive
> things in init(), we have been fixing issues like that for a while now.

This would probably be the cleanest solution. In the long term we are of course 
going this way, and we are trying to get our suppliers to go this way with us 
as well. But in the end, we have to ship products at a fixed date. So it is 
sometimes easier, and more stable, to work around suboptimal things.

For instance, compare:
- refactoring a driver that does lots of CPU-intensive work in init()
- taking the module as it is and using that time to load other things from the 
eMMC in parallel

> >     -> deferred init calls are not really a solution because they
> > cannot be controlled in the needed granularity
>
> We have loads of granularity there, how much more do you need?
>
> > So finally it is actually a trade-off between compiling things in
> > and spending the overhead of module loading to gain the flexibility
> > to load things later.
>
> That's fine, but you will run into the kernel lock that prevents
> modules loading at the same time for some critical sections, if your
> I/O issues don't limit you already.
>
> There are lots of areas you can work on to speed up boot times other
> than worrying about multithreaded kernel module loading.  I really
> doubt this is going to be the final solution for your problems.

It is of course not. The initial intention in developing something new here on 
top of kmod or systemd-modules-load was not to load kernel modules in parallel. 
We found that for lots of our modules we can actually gain some benefit by 
loading them in parallel, so we decided to include the threaded approach as 
well.

The initial motivation for developing something new here was to get rid of the 
"udevd" / "udevadm trigger" approach during startup. This "set up the system 
hardware completely in one early phase of startup" approach does not go well 
with our timing requirements. So we set up our static hardware piece by piece, 
exactly at the point in time when it is needed, using our tool. Besides the 
actual module loading, this tool provides a mechanism for synchronization and 
does some additional setup on top. The threaded loading is just a feature.
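To illustrate the "load on demand, then synchronize" idea: in the simplest case the synchronization is just waiting for the module's device node to appear before starting the consumer. A minimal sketch, where the module and device names are hypothetical examples (our real tool is more than this):

```shell
#!/bin/sh
# Poll until a path exists, giving up after N tenths of a second.
# This stands in for the synchronization step between "modprobe" and
# "start the service that needs the device".

wait_for_dev() {
    path="$1"
    tries="${2:-50}"   # default: ~5 seconds
    while [ "$tries" -gt 0 ]; do
        [ -e "$path" ] && return 0
        tries=$((tries - 1))
        sleep 0.1
    done
    return 1
}

# Example usage (names are illustrative):
#   modprobe snd-soc-example &&
#   wait_for_dev /dev/snd/pcmC0D0p &&
#   start-audio-service
```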

Some numbers for the arguments above (all measured on an otherwise idle 
system):
- systemd-udev-trigger.service takes about 150-200 ms just to walk the complete 
device tree and re-send the "add" uevents (udevd was deactivated for this 
measurement)
- processing the resulting uevents in udevd takes 1-2 seconds (with the default 
rule set)
- in a general solution, we'd have to wait for udevd to settle completely 
before we can start using the devices
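For reference, timings like these can be taken with a small wall-clock wrapper; the udevadm invocations in the usage comment only illustrate the measurement described above (they need root and a running udevd):

```shell
#!/bin/sh
# Report the wall-clock duration of a command in milliseconds.
# Uses GNU date's %N (nanoseconds), available on typical Linux systems.

time_ms() {
    start=$(date +%s%N)
    "$@"
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Example usage (illustrative):
#   time_ms udevadm trigger --action=add
#   time_ms udevadm settle
```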

> good luck,
> greg k-h

Thx ;)
