Re: Why is the deferred initcall patch not mainline?

2014-11-02 Thread Geert Uytterhoeven
On Sun, Nov 2, 2014 at 3:37 AM, Nicolas Pitre n...@fluxnic.net wrote:
 On Thu, 30 Oct 2014, Geert Uytterhoeven wrote:
 On Thu, Oct 30, 2014 at 12:49 AM, Tim Bird tim.b...@sonymobile.com wrote:
  The way the feature is expressed in the current code is that a
  set of drivers are marked for deferred initialization (I'll refer
  to this as issue 0).  Then, at boot: 1) most drivers are initialized
  normally, 2) user space is started, and then 3) user space indicates
  to the kernel that the deferred drivers should be initialized.

 One (IMHO important) point in the current implementation is that the call
 to free_initmem() is also delayed until after initialization of the
 deferred drivers.

 This is different from modular drivers, which are loaded after 
 free_initmem().

 This is because modules have their __initmem sections freed right after
 each module is initialized.

I know.

But it means _all_ init sections are kept until userspace kicks the deferred
initcalls, and they have completed.

 The deferred initcalls could also have a separate initmem section whose
 freeing is also deferred.  But I don't think it makes such a big
 difference in the end.

Yes, it can be handled.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-embedded" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why is the deferred initcall patch not mainline?

2014-11-01 Thread Nicolas Pitre
On Thu, 30 Oct 2014, Geert Uytterhoeven wrote:

 On Thu, Oct 30, 2014 at 12:49 AM, Tim Bird tim.b...@sonymobile.com wrote:
  The way the feature is expressed in the current code is that a
  set of drivers are marked for deferred initialization (I'll refer
  to this as issue 0).  Then, at boot: 1) most drivers are initialized
  normally, 2) user space is started, and then 3) user space indicates
  to the kernel that the deferred drivers should be initialized.
 
 One (IMHO important) point in the current implementation is that the call
 to free_initmem() is also delayed until after initialization of the
 deferred drivers.
 
 This is different from modular drivers, which are loaded after free_initmem().

This is because modules have their __initmem sections freed right after 
each module is initialized.

The deferred initcalls could also have a separate initmem section whose 
freeing is also deferred.  But I don't think it makes such a big 
difference in the end.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-11-01 Thread Nicolas Pitre
On Wed, 29 Oct 2014, Tim Bird wrote:

 I have been thinking about the points you made previously,
 and have given the problem space some more thought.  I agree
 that as it stands this is a very niche solution, and it would
 be good to think about the broader picture and how things
 might be designed differently to make the feature usable
 more easily and to a broader group.
 
 Taking a step back, the overall goal is to allow user space
 to do stuff while the kernel is still initializing statically
 linked drivers, so the device's primary function can be ready
 as soon as possible (and not wait for secondarily-needed
 functionality to initialize). For things that are able to be
 made into a module (and for situations where the kernel module
 loading is turned on), this feature should not be needed in
 its current form.  In that case, user space already has control
 over module load ordering and timing.
 
 The way the feature is expressed in the current code is that a
 set of drivers are marked for deferred initialization (I'll refer
 to this as issue 0).  Then, at boot: 1) most drivers are initialized
 normally, 2) user space is started, and then 3) user space indicates
 to the kernel that the deferred drivers should be initialized.
 
 This is very coarse, allowing only two categories of drivers: (ignoring
 other boot phases for the moment) - regular drivers and deferred drivers.
 It also requires source code changes to mark the drivers to be deferred.
 Finally, it requires an explicit notification from user-space to complete
 the process.  All of these attributes are undesirable.
 
 There may also be an opportunity here to work out more granular driver
 load ordering, which would benefit other systems (especially those that
 are hitting the EPROBE_DEFER issue).
 
 As it stands now, the ordering of the initcalls within a particular level
 is pretty much arbitrary (determined by link order, usually without oversight
 by the developer).  Just FYI, here are some numbers culled from a recent
 kernel:
 
 initcall macro   number of instances in kernel source
 --------------   ------------------------------------
 early_init       446
 core_init        614
 postcore_init    150
 arch_init        751
 subsys_init      573
 fs_init          1372
 device_init      1211
 late_init        440

Did you count module_init instances which are folded into the 
device_init level when built-in?

 I'm going to rattle off a few ideas - I'm not sure which ones might
 stick,  I just want to bounce these around and see what people think.
 Note that I didn't think of most of these, but I'm just repeating ones
 that have been stated, and adding a few thoughts of my own.
 
 First, if the ordering of initialization is not the default
 provided by the kernel, it needs to be recorded somewhere.  A developer
 needs to express it (or a tool needs to figure it out), but if it is
 going to be reflected in the final kernel behaviour (or image), the
 kernel needs it at boot time (if not compile time).  The current
 initcall system hardcodes a level for each driver initialization
 routine in the source code itself, by putting it in the macro
 name for each init routine.  There can
 only be one such order expressed in the code itself.
 
 For developers who wish to express another order (or priority), a
 new mechanism will need to be used.  If possible, I strongly prefer
 putting this into the KCONFIG system, as that is where other details
 about kernel configuration are stored, and there are pre-existing tools
 for dealing with the format.  I am hesitant to create a special language
 or config format for this (unless it is much simpler than adding something
 to Kconfig).  As Nicolas pointed out, Kconfig already has information
 about dependencies in terms of not allowing a driver to be a module
 if a dependent module is statically linked. Having the tool warn for
 violations of that ordering would be valuable.

I think you're confusing two issues: ordering and dependency.  The 
dependency affects some of the ordering, but only a small portion of it.  
Within an initcall level the ordering is a result of the link order and 
therefore rather arbitrary.

IMHO the current initcall level system is simply too simple for the 
current kernel complexity.  The number of levels, and especially their 
names, are also completely arbitrary.  It probably made sense back when 
initcalls were introduced, but it is just too inflexible now.

Initcalls should instead be turned into targets and prerequisites, just 
like dependencies in a makefile.  This way, the ultimate target "execute 
/sbin/init in userspace" could indicate its prerequisite as "mount root 
fs".  Then "mount root fs" could have "USB storage" as a prerequisite 
depending on the boot args. From "USB storage" you could have two 
prerequisites: "USB stack" and "USB device enumeration".  And so down to 
the very first initcalls with no 
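
[Archive note] The makefile analogy in the message above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: the target table, the enum names and build_target() are all hypothetical, and the graph is assumed to be acyclic. Each target runs once, after all of its prerequisites. A target that nothing on the path to /sbin/init depends on (the hypothetical NET_STACK here) is simply never run on the way to user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PREREQS 4

/* Hypothetical target identifiers, following the example in the message. */
enum { USB_STACK, USB_ENUM, USB_STORAGE, MOUNT_ROOT, RUN_INIT, NET_STACK,
       N_TARGETS };

struct init_target {
    const char *name;
    int prereqs[MAX_PREREQS];   /* indices into targets[] */
    size_t n_prereqs;
    bool visiting;              /* set while prerequisites are being run */
    bool done;
};

struct init_target targets[N_TARGETS] = {
    [USB_STACK]   = { .name = "USB stack" },
    [USB_ENUM]    = { .name = "USB device enumeration" },
    [USB_STORAGE] = { .name = "USB storage",
                      .prereqs = { USB_STACK, USB_ENUM }, .n_prereqs = 2 },
    [MOUNT_ROOT]  = { .name = "mount root fs",
                      .prereqs = { USB_STORAGE }, .n_prereqs = 1 },
    [RUN_INIT]    = { .name = "execute /sbin/init",
                      .prereqs = { MOUNT_ROOT }, .n_prereqs = 1 },
    [NET_STACK]   = { .name = "network stack" },  /* not needed to reach init */
};

/* Order in which targets were actually run. */
int run_order[N_TARGETS];
size_t n_run;

/* Depth first: satisfy every prerequisite, then run the target itself. */
void build_target(int id)
{
    assert(id >= 0 && id < N_TARGETS);
    struct init_target *t = &targets[id];

    if (t->done)
        return;
    assert(!t->visiting);       /* a cycle would be a bug in the table */
    t->visiting = true;
    for (size_t i = 0; i < t->n_prereqs; i++)
        build_target(t->prereqs[i]);
    t->visiting = false;

    run_order[n_run++] = id;    /* a real kernel would call the initcall here */
    t->done = true;
}
```

Calling build_target(RUN_INIT) on this table runs the two USB leaves first, then USB storage, then the root mount, then init. build_target(NET_STACK) could be requested later, which is roughly what a deferred initcall amounts to in this model.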

Re: Why is the deferred initcall patch not mainline?

2014-10-30 Thread Geert Uytterhoeven
On Thu, Oct 30, 2014 at 12:49 AM, Tim Bird tim.b...@sonymobile.com wrote:
 The way the feature is expressed in the current code is that a
 set of drivers are marked for deferred initialization (I'll refer
 to this as issue 0).  Then, at boot: 1) most drivers are initialized
 normally, 2) user space is started, and then 3) user space indicates
 to the kernel that the deferred drivers should be initialized.

One (IMHO important) point in the current implementation is that the call
to free_initmem() is also delayed until after initialization of the
deferred drivers.

This is different from modular drivers, which are loaded after free_initmem().
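
[Archive note] A minimal userspace C model of the flow discussed here, assuming nothing about how the patch actually implements it. All names (struct boot_state, boot(), trigger_deferred(), the example drivers) are hypothetical. It shows the three steps Tim lists and the point made above: init memory is only released after the deferred initcalls have run, because their init code still lives in it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*initcall_t)(void);

/* Hypothetical model of the boot state; no name here comes from the patch. */
struct boot_state {
    const initcall_t *regular;
    size_t n_regular;
    const initcall_t *deferred;
    size_t n_deferred;
    bool userspace_started;
    bool deferred_done;
    bool initmem_freed;         /* stands in for free_initmem() having run */
};

/* Example "drivers" that log the order in which they were initialized. */
char init_log[8];
static size_t init_log_len;

static int log_call(char tag)
{
    assert(init_log_len < sizeof(init_log) - 1);
    init_log[init_log_len++] = tag;
    return 0;
}

int camera_init(void) { return log_call('c'); }
int usb_init(void)    { return log_call('u'); }

static void run_calls(const initcall_t *calls, size_t n)
{
    for (size_t i = 0; i < n; i++)
        calls[i]();             /* return values are ignored in this sketch */
}

/* Steps 1 and 2: run the regular initcalls, then hand over to user space. */
void boot(struct boot_state *s)
{
    run_calls(s->regular, s->n_regular);
    s->userspace_started = true;
    /* Init memory is deliberately not freed yet: deferred init code needs it. */
}

/* Step 3: user space asks for the deferred drivers.
 * Returns 0 when the deferred calls ran now, 1 if they had already run. */
int trigger_deferred(struct boot_state *s)
{
    assert(s->userspace_started);
    if (s->deferred_done)
        return 1;
    run_calls(s->deferred, s->n_deferred);
    s->deferred_done = true;
    s->initmem_freed = true;    /* only now is it safe to drop init memory */
    return 0;
}
```

In this model the cost Geert points out is visible directly: initmem_freed stays false for as long as user space chooses not to call trigger_deferred().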

Gr{oetje,eeting}s,

Geert



Re: Why is the deferred initcall patch not mainline?

2014-10-29 Thread Tim Bird


On 10/27/2014 01:29 PM, Nicolas Pitre wrote:
 On Fri, 24 Oct 2014, Geert Uytterhoeven wrote:
 
 Several patches are linked from
 http://elinux.org/Deferred_Initcalls

 Latest version is
 http://elinux.org/images/5/51/0001-Port-deferred-initcalls-to-3.10.patch
 
 In the hope of providing some constructive and concrete feedback to this 
 thread, here's what I have to say about the patch linked above ( I 
 looked only at the latest version):
 
 - Commented out code is not acceptable for mainline. But everyone knows 
   that already.
 
 - Returning a null byte through the /proc file is dubious.
 
 - The /proc interface is probably not the best. I'd go with an entry in 
   /sys/kernel instead.
 
 - If the deferred_initcall section is empty, this could return 1 upfront 
   and do the free_initmem() earlier as it used to.
 
 - It was mentioned somewhere that the config system could use a 4th 
   state in addition to n, m and y.  That would be required before this 
   goes upstream simply to express all the dependencies between modules.  
   Right now if a core module is configured with m, then all the 
   submodules that depend on it inherit the modular-only restriction.  
   The same would need to be enforced for deferred initcalls.
 
 - Currently all deferred initcalls are lumped together in a single 
   section with no regards to the original initcall level. This is likely 
   to cause trouble if two initcalls are called in a different order than 
   intended. Nothing prevents that from happening right now.
 
 This patch is still not generic enough for mainline inclusion IMHO.  It 
 currently falls in the "you better know what you're doing" category and 
 that is possibly good enough for its actual users.  Trying to make this 
 more generic is going to require some more work.  And this would have to 
 come with serious arguments explaining why simply using modules in the 
 first place is not acceptable.

Sorry to take so long to reply.  This feedback is very welcome,
and I appreciate the time taken to review the patch.  I
apologize in advance for the rather long response...

I have been thinking about the points you made previously,
and have given the problem space some more thought.  I agree
that as it stands this is a very niche solution, and it would
be good to think about the broader picture and how things
might be designed differently to make the feature usable
more easily and to a broader group.

Taking a step back, the overall goal is to allow user space
to do stuff while the kernel is still initializing statically
linked drivers, so the device's primary function can be ready
as soon as possible (and not wait for secondarily-needed
functionality to initialize). For things that are able to be
made into a module (and for situations where the kernel module
loading is turned on), this feature should not be needed in
its current form.  In that case, user space already has control
over module load ordering and timing.

The way the feature is expressed in the current code is that a
set of drivers are marked for deferred initialization (I'll refer
to this as issue 0).  Then, at boot: 1) most drivers are initialized
normally, 2) user space is started, and then 3) user space indicates
to the kernel that the deferred drivers should be initialized.

This is very coarse, allowing only two categories of drivers: (ignoring
other boot phases for the moment) - regular drivers and deferred drivers.
It also requires source code changes to mark the drivers to be deferred.
Finally, it requires an explicit notification from user-space to complete
the process.  All of these attributes are undesirable.

There may also be an opportunity here to work out more granular driver
load ordering, which would benefit other systems (especially those that
are hitting the EPROBE_DEFER issue).

As it stands now, the ordering of the initcalls within a particular level
is pretty much arbitrary (determined by link order, usually without oversight
by the developer).  Just FYI, here are some numbers culled from a recent
kernel:

initcall macro   number of instances in kernel source
--------------   ------------------------------------
early_init       446
core_init        614
postcore_init    150
arch_init        751
subsys_init      573
fs_init          1372
device_init      1211
late_init        440


I'm going to rattle off a few ideas - I'm not sure which ones might
stick,  I just want to bounce these around and see what people think.
Note that I didn't think of most of these, but I'm just repeating ones
that have been stated, and adding a few thoughts of my own.

First, if the ordering of initialization is not the default
provided by the kernel, it needs to be recorded somewhere.  A developer
needs to express it (or a tool needs to figure it out), but if it is
going to be reflected in the final kernel behaviour (or image), the
kernel needs it at boot time (if not compile 

Re: Why is the deferred initcall patch not mainline?

2014-10-27 Thread Nicolas Pitre
On Fri, 24 Oct 2014, Geert Uytterhoeven wrote:

 Several patches are linked from
 http://elinux.org/Deferred_Initcalls
 
 Latest version is
 http://elinux.org/images/5/51/0001-Port-deferred-initcalls-to-3.10.patch

In the hope of providing some constructive and concrete feedback to this 
thread, here's what I have to say about the patch linked above (I 
looked only at the latest version):

- Commented out code is not acceptable for mainline. But everyone knows 
  that already.

- Returning a null byte through the /proc file is dubious.

- The /proc interface is probably not the best. I'd go with an entry in 
  /sys/kernel instead.

- If the deferred_initcall section is empty, this could return 1 upfront 
  and do the free_initmem() earlier as it used to.

- It was mentioned somewhere that the config system could use a 4th 
  state in addition to n, m and y.  That would be required before this 
  goes upstream simply to express all the dependencies between modules.  
  Right now if a core module is configured with m, then all the 
  submodules that depend on it inherit the modular-only restriction.  
  The same would need to be enforced for deferred initcalls.

- Currently all deferred initcalls are lumped together in a single 
  section with no regards to the original initcall level. This is likely 
  to cause trouble if two initcalls are called in a different order than 
  intended. Nothing prevents that from happening right now.
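
[Archive note] The last review point can be illustrated with a small userspace C sketch. It assumes, hypothetically, that each deferred entry remembers the initcall level it was originally declared at, and that the deferred section is then ordered by that level before being run. The enum, the struct and order_deferred() are made up for this illustration and are not part of the patch. The sort is a stable insertion sort, so entries within one level keep their link order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical level tags, in the usual initcall order. */
enum initcall_level {
    LVL_EARLY, LVL_CORE, LVL_POSTCORE, LVL_ARCH,
    LVL_SUBSYS, LVL_FS, LVL_DEVICE, LVL_LATE
};

struct deferred_call {
    const char *name;
    enum initcall_level level;  /* level the initcall was declared at */
};

/* Stable insertion sort: lower levels first, link order kept within a level. */
void order_deferred(struct deferred_call *calls, size_t n)
{
    assert(calls != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        struct deferred_call cur = calls[i];
        size_t j = i;

        while (j > 0 && calls[j - 1].level > cur.level) {
            calls[j] = calls[j - 1];
            j--;
        }
        calls[j] = cur;
    }
}
```

With entries linked in the order usb_storage (device), usb_core (subsys), usb_hid (device), the sketch runs usb_core first and leaves the two device-level entries in their original relative order.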

This patch is still not generic enough for mainline inclusion IMHO.  It 
currently falls in the "you better know what you're doing" category and 
that is possibly good enough for its actual users.  Trying to make this 
more generic is going to require some more work.  And this would have to 
come with serious arguments explaining why simply using modules in the 
first place is not acceptable.
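
[Archive note] The point about a fourth config state can be made concrete with a tiny C sketch. It rests on an assumption that is not stated anywhere in the thread: that the states form a total order n < m < d < y, where 'd' is the hypothetical "built in, initialization deferred" state, and that an option may never be "more built in" than the option it depends on. Whether a module may depend on a deferred built-in at all is itself an open design question. The names kstate and effective_state() are invented for the example.

```c
#include <assert.h>

/* Ordered from "least built in" to "most built in".
 * K_D is the hypothetical fourth state: built in, but init deferred. */
enum kstate { K_N, K_M, K_D, K_Y };

/* Clamp the requested state of an option to the state of its dependency,
 * the same way 'm' on a core option restricts its dependents today. */
enum kstate effective_state(enum kstate requested, enum kstate dependency)
{
    assert(requested <= K_Y);
    assert(dependency <= K_Y);
    return requested < dependency ? requested : dependency;
}
```

Under this assumption a driver requested as 'y' on top of a deferred core ends up deferred as well, and a driver on top of a modular core can be at most modular.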


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-27 Thread Alexandre Belloni
On 27/10/2014 at 16:29:10 -0400, Nicolas Pitre wrote :
 On Fri, 24 Oct 2014, Geert Uytterhoeven wrote:
 
  Several patches are linked from
  http://elinux.org/Deferred_Initcalls
  
  Latest version is
  http://elinux.org/images/5/51/0001-Port-deferred-initcalls-to-3.10.patch
 
 In the hope of providing some constructive and concrete feedback to this 
 thread, here's what I have to say about the patch linked above ( I 
 looked only at the latest version):
 
 - Commented out code is not acceptable for mainline. But everyone knows 
   that already.
 
 - Returning a null byte through the /proc file is dubious.
 
 - The /proc interface is probably not the best. I'd go with an entry in 
   /sys/kernel instead.
 
 - If the deferred_initcall section is empty, this could return 1 upfront 
   and do the free_initmem() earlier as it used to.
 
 - It was mentioned somewhere that the config system could use a 4th 
   state in addition to n, m and y.  That would be required before this 
   goes upstream simply to express all the dependencies between modules.  
   Right now if a core module is configured with m, then all the 
   submodules that depend on it inherit the modular-only restriction.  
   The same would need to be enforced for deferred initcalls.
 
 - Currently all deferred initcalls are lumped together in a single 
   section with no regards to the original initcall level. This is likely 
   to cause trouble if two initcalls are called in a different order than 
   intended. Nothing prevents that from happening right now.
 
 This patch is still not generic enough for mainline inclusion IMHO.  It 
 currently falls in the "you better know what you're doing" category and 
 that is possibly good enough for its actual users.  Trying to make this 
 more generic is going to require some more work.  And this would have to 
 come with serious arguments explaining why simply using modules in the 
 first place is not acceptable.
 

That one is easy, you simply can't compile the network stack as a
module and it is huge.

I completely agree with all your arguments and I'm not sure it is worth
making it foolproof.

-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Re: Why is the deferred initcall patch not mainline?

2014-10-24 Thread Rob Landley
On 10/23/14 19:36, Nicolas Pitre wrote:
 On Thu, 23 Oct 2014, Rob Landley wrote:
 3) You, too, conveniently avoided to define the initial problem so far.
That makes for rather sterile conversations about alternative 
solutions that could score higher on the mainline acceptance scale.

With modules, you can already defer large portions of kernel-side system
bringup until userspace is ready for them. With static linking, you can't.

This patch series sounds like it lets static drivers hold off their
initialization until userspace sends them an insmod-equivalent event
through sysfs, possibly with associated arguments since the module
codepath already implements that so exposing it through the new
mechanism in the static linking case would be trivial.

Seems conceptually fairly straightforward to me, but I'm just guessing
since nobody's yet linked to the patches during this thread (that I've
noticed).

 In some cases, the system may want to defer initialization of some drivers
 until explicit action through the user interface.  So the trigger may not
 be called until well after boot is completed.

 In that case the trigger for initializing those drivers should be the 
 first time they're accessed from user space.

 Which gets us back to one of the big reasons <strike>systemd</strike>
 devfsd failed years ago: you have to probe the hardware in order to know
 which /dev nodes to create, so you can't have accessing the /dev node
 probe the hardware. (There's no /dev node for a usb controller...)
 
 There is /sys/bus/usb/devices that could be accessed in order to trigger 
 the initial setup and probe.  It is most likely that libusb does that, 
 but this could be made to work with a simple 'cat' or 'touch' invocation 
 as well.

Please let me know which devices to trigger to launch an encrypted
ramdisk driver that has nontrivial setup because it needs to generate
keys (and collect enough entropy to do so). Or how about a driver that
programs a set of gpio pins to a specific behavior, obviously that's
triggered by examining the hardware.

A module can produce multiple /dev nodes from one piece of hardware, a
piece of hardware can produce no dev nodes (speaking of usb, the actual
bus-level driver), dev nodes may not have any associated hardware but
still require setup (/dev/urandom if you care about the quality of the
entropy pool)...

This is why devfs didn't work. You're trying to do this at the wrong
level. If you want to defer a module's init, doing so at the _module_
level is the only coherent way to do it.

 That could be the very first time libusb or similar tries to 
 enumerate available USB devices for example.  No special interface 
 needed.

 So now you're requiring libusb enumerating usb devices, when before this
 you could just reach out and open /dev/ttyUSB0 and it would be there.
 
 You can't just reach out with the deferred initcall scheme either, do 
 you?

You can already do this with modules. Just don't insmod until you're
ready.

Right now the implementation ties together "the code is in kernel" with
"the code starts running", so you can't both statically link the module
and control when it starts doing stuff. That really _seems_ like it's
just an implementation detail: decoupling them so the code is in kernel
but doesn't call its init function until userspace tells it to does not
sound like a huge conceptual stretch.
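
[Archive note] The decoupling described above can be sketched in userspace C. This is a hypothetical model, not an existing kernel interface: struct static_module, start_static_module() and the example ramdisk driver are invented here. The code for every module is "in the kernel" (in the table), but an init function only runs when something asks for it by name, optionally with an argument string, much like insmod with parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical descriptor for a statically linked module. */
struct static_module {
    const char *name;
    int (*init)(const char *args);
    bool started;
};

/* Example driver: remembers the arguments it was started with. */
char last_args[32];

int ramdisk_init(const char *args)
{
    snprintf(last_args, sizeof(last_args), "%s", args);
    return 0;
}

/* Start a built-in module on request.
 * Returns -1 if no module has that name, 0 if it is (now) running,
 * or the non-zero result of its init function if that failed. */
int start_static_module(struct static_module *mods, size_t n,
                        const char *name, const char *args)
{
    assert(mods != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(mods[i].name, name) != 0)
            continue;
        if (mods[i].started)
            return 0;           /* already running: nothing to do */
        int ret = mods[i].init(args ? args : "");
        if (ret == 0)
            mods[i].started = true;
        return ret;
    }
    return -1;
}
```

A second request for a module that is already running is a no-op in this sketch, so user space can issue the request without tracking state itself.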

Is there an actual reason to invent a whole new unrelated thing instead?

 This is an embedded solution?

 I'm suggesting that they no longer prevent user space from executing
 earlier.  Why would you then still want an explicit trigger from user
 space?
 Because only the user space knows when it is now OK to initialize those
 drivers, and begin using CPU cycles on them.

 So what?  That is still not a good answer.

 Why?

 I believe Tim's proposal was to take a category of existing device
 probing, one already done on a background thread, and wait to start it
 until userspace says go. That's about as nonintrusive a change as you get.
 
 You might still be able to do better.

We have a mechanism available in one context. Would you rather make that
mechanism available in another context, or design a whole new mechanism
from scratch?

 If you really want to be non intrusive, you could e.g. make those 
 background threads into SIGSTOP and let user space SIGCONT them as it 
 sees fit.  No new special interfaces needed.

We have an existing module namespace, and existing mechanisms that use
it to control this sort of thing. Are you suggesting a lookup mechanism
that says "here's the thread that would be initializing this module if
we hadn't started the thread SIGSTOP"? (With each one in its own thread
so you have the same level of granularity the existing mechanism provides?)

 You're talking about requiring weird arbitrary things to have side effects.
 
 Like if stalling arbitrary initcalls wouldn't have side effects?

You're arguing that modules, as they exist today, couldn't 

Re: Why is the deferred initcall patch not mainline?

2014-10-24 Thread Geert Uytterhoeven
On Fri, Oct 24, 2014 at 9:38 PM, Rob Landley r...@landley.net wrote:
 I'm going to recuse myself from the rest of this thread because I'm
 clearly getting annoyed with us talking past each other. Somebody's got
 an actual patch (which they still haven't linked to). I'll shut up and
 let them show you the code.

Several patches are linked from
http://elinux.org/Deferred_Initcalls

Latest version is
http://elinux.org/images/5/51/0001-Port-deferred-initcalls-to-3.10.patch

Gr{oetje,eeting}s,

Geert



Re: Why is the deferred initcall patch not mainline?

2014-10-24 Thread Nicolas Pitre
On Fri, 24 Oct 2014, Rob Landley wrote:

 On 10/23/14 19:36, Nicolas Pitre wrote:
 
  As you know already, you can do anything you want on your own.  That's 
  granted by the GPL.
 
 I'm pretty sure I could have done anything I wanted on my own with
 System 6 unix in the 1970's (modulo being 7 years old), since the BSD
 guys _did_ and their stuff is still around (and is powering obscure
 things like the iPhone).

Incidentally there is this thing called Linux powering similarly obscure 
curiosities such as Android, and outnumbering iPhones in terms of units 
shipped.

So what's your point again?

 And I learned C in 1989 to apply mod files to the WWIV bulletin 
 board system (an open source development community that didn't even 
 have the patch program).

You needed to pay a license to get the WWIV source code.  At least that 
was the case when I was a sysop in 1993.

 But by all means, credit the GPL for the existence of open source.

Oh my!  Obviously that's exactly what I did, right?

And now you want me to take what you say seriously?

The impression I get from your diatribe is that you might be living in 
the past.  I don't dispute the fact that you had issues with the Linux 
community before, but one has to admit that a _lot_ of people don't. And 
I'm lucky enough to be one of them, and in that context I was trying to 
help.

 Did you notice that there's no such thing as the GPL anymore? Linux
 and Samba implement two ends of the same protocol, each one is GPL, and
 they can't share code. Poor QEMU wants to suck GPL processor definitions
 out of binutils/gdb to emulate processors and GPL driver code out of
 Linux to emulate devices, and there _is_ no license that allows it to
 combine code from both sources. (Making qemu GPLv2 or later means it
 couldn't accept code from _either_ source.)

And now we're far far away from $subject that started this thread.  
This is going nowhere.

 I'm going to recuse myself from the rest of this thread because I'm
 clearly getting annoyed with us talking past each other. Somebody's got
 an actual patch (which they still haven't linked to). I'll shut up and
 let them show you the code.

On that I agree with you.


Nicolas


RE: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Bird, Tim
On Wednesday, October 22, 2014 8:49 AM, Nicolas Pitre [n...@fluxnic.net] wrote:
 On Wed, 22 Oct 2014, Rob Landley wrote:
  On 10/21/14 14:58, Nicolas Pitre wrote:
   On Tue, 21 Oct 2014, Bird, Tim wrote:
  
   I'm going to respond to several comments in this one message (sorry for 
   the likely confusion)
  
   On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] 
   wrote:
  
   On Tue, 21 Oct 2014, Grant Likely wrote:
  
   On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com 
   wrote:
   The answer is pretty easy, I think.  I tried to mainline it once but
   failed, and didn't really try again. If it is being found useful,  we
   should try to mainline it again,  this time with more persistence.  The
   reason it got rejected before IIRC was that you can accomplish a similar
   thing with modules, with no changes to the kernel. But that doesn't cover
   the case where the loadable modules feature of the kernel is turned off,
   which is common in very small systems.
  
   It is a rather clumsy approach though since it requires changes to
   modules and it makes the configuration static per build. Could it
   instead be done by the kernel accepting a list of initcalls that
   should be deferred? It would depend I suppose on the cost of finding
   the initcalls to defer at boot time.
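
[Archive note] A rough userspace C sketch of the boot-time list idea raised above. Everything in it is an assumption made for illustration: the comma-separated format, the imagined defer_initcalls= boot argument it would come from, and the function name. It also speaks to the cost question: the check is a linear scan of the list for each initcall, without allocating or modifying the string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if 'name' appears as a whole entry in the comma-separated
 * 'list', e.g. the value of a hypothetical defer_initcalls= boot argument. */
bool initcall_is_deferred(const char *list, const char *name)
{
    assert(list != NULL && name != NULL);
    size_t len = strlen(name);

    if (len == 0)
        return false;

    while (*list != '\0') {
        const char *end = strchr(list, ',');
        size_t tok = end ? (size_t)(end - list) : strlen(list);

        /* Compare whole entries only, so "usb" does not match "usb_init". */
        if (tok == len && strncmp(list, name, len) == 0)
            return true;
        if (end == NULL)
            break;
        list = end + 1;
    }
    return false;
}
```

The initcall loop would call this once per initcall and move matching entries to a deferred list, which keeps the configuration out of the driver sources and out of the build.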
  
   Yeah, I'm not a big fan of having to change kernel code in order to
   use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
   to add a 'D' option to the config system, so that the record of which
   modules to defer could be stored there.  This is much better than
   hand-altering code.  I don't know how difficult this would be to add
   to the kbuild system, but the mechanism for altering the macro would
   be, IMHO, very straightforward.
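
[Archive note] What "altering the macro" from the config system could look like, as a self-contained C sketch. The CONFIG_*_DEFERRED symbols, the per-driver macros and the table layout are hypothetical. They only illustrate that the deferred/regular choice can come from a generated config symbol instead of a hand edit in the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct initcall_entry {
    const char *name;
    int (*fn)(void);
    bool deferred;
};

#define REGULAR_INITCALL(fn)  { #fn, fn, false }
#define DEFERRED_INITCALL(fn) { #fn, fn, true }

/* Hypothetical config output: USB was answered 'D', the camera plain 'y'. */
#define CONFIG_USB_DEFERRED 1

#ifdef CONFIG_USB_DEFERRED
#define usb_initcall(fn) DEFERRED_INITCALL(fn)
#else
#define usb_initcall(fn) REGULAR_INITCALL(fn)
#endif

#ifdef CONFIG_CAMERA_DEFERRED
#define camera_initcall(fn) DEFERRED_INITCALL(fn)
#else
#define camera_initcall(fn) REGULAR_INITCALL(fn)
#endif

/* The driver sources stay the same whichever way the config is answered. */
int camera_init(void) { return 0; }
int usb_init(void)    { return 0; }

const struct initcall_entry initcalls[] = {
    camera_initcall(camera_init),
    usb_initcall(usb_init),
};

/* Count how many entries of a table ended up deferred. */
size_t count_deferred(const struct initcall_entry *calls, size_t n)
{
    size_t count = 0;

    assert(calls != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (calls[i].deferred)
            count++;
    return count;
}
```

Removing the CONFIG_USB_DEFERRED definition and rebuilding turns usb_init back into a regular initcall with no source change in the driver.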
  
   Straightforward but IMHO rather suboptimal. Sure it might be good
   enough if all you want is to ship products out the door, but for
   mainline something better should be done.
  
   This patch predated Arjan Van de Ven's fastboot work.  I don't
   know if some of his parallelization (asynchronous module loading), and
   optimizations for USB loading made things substantially better than this.
   The USB spec makes it impossible to avoid a certain amount of delay
   in probing the USB busses
  
   USB was the main culprit, but we sometimes deferred other modules, if
   they were not in the fastpath for taking a picture. Sony cameras had a
   goal of booting in .5 seconds, but I think the best we ever achieved was
   about 1.1 seconds, using deferred initcalls and a variety of other
   techniques.
  
   Some initcalls can be executed in parallel, but they currently all have
   to complete before user space is started.  It should be possible to
   still do the parallel initcall thing, and let user space run before they
   are done as well.  Only waiting for the root fs to be available should
   be sufficient.  That would be completely generic, and help embedded as
   well as desktop systems.
 
  What would actually be nice is if initramfs could read something out of
  /proc or /sys to check the status of initcalls. (Or maybe get
  notification through the hotplug netlink mechanism.)
 
  Since initramfs is _already_ up really early, before needing any
  particular drivers and way before the real root filesystem, we can
  trivially punt this sort of synchronization to userspace if userspace
  can just get the information about when kernel deferred processing is done.
 
  Possibly we already have this: /sys/module has directories for all the
  kernel modules including the ones built static, so possibly userspace
  can just wait for /sys/module/zlib_deflate/initstate to say "live". It
  would just be nice to have a way to notice that's happened without
  spinning reading a file.
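
[Archive note] The check described above could look roughly like this from user space. It is a sketch under assumptions: the helper name is made up, and whether a statically built module exposes an initstate file at all is exactly the "possibly" in the message. The function only tests whether the first line of the given file is "live"; a caller would still have to poll it, which is the spinning the message wants to avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if the file at 'path' (for example an initstate attribute
 * under /sys/module/<name>/) can be read and its first line is "live". */
bool initstate_is_live(const char *path)
{
    char buf[32];
    bool live;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return false;           /* missing file: treat as not ready */

    live = fgets(buf, sizeof(buf), f) != NULL && strcmp(buf, "live\n") == 0;
    fclose(f);
    return live;
}
```

A missing or unreadable file is reported as "not live", so the same call works before the attribute has appeared.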

 Again, not generic enough. Instead, the reading of that file could be
 suspended by the kernel until all initcalls have completed and then
 return an appropriate error code if the corresponding resource is
 actually not there.

 Otherwise the standard hotplug notification mechanism is already
 available.

I'm not sure why this attention to reading the status.  The salient feature
here is that the initializations are deferred until user space tells the kernel
to proceed.  It's the initiation of the trigger from user-space that matters.
The whole purpose of this feature is to defer some driver initializations until
the product can get into a state where it is already ready to perform its
primary function.  Only user space knows when that is.

There seems to be a desire to have an automatic mechanism for triggering
the deferred initializations.  I'm OK with this, as long as there's some
reasonable use case for it.  There are lots of possible trigger mechanisms,
including just a simple timer, but I think it's important that the primary
use case of 'trigger-when-user-space-says-to' is still supported.
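
As a rough illustration of the 'trigger-when-user-space-says-to' flow, an
init script could run something like the sketch below once the product's
primary function is up.  The trigger path /proc/deferred_initcalls, the
DEFERRED_TRIGGER variable and the function name are assumptions made for
this example; the thread does not spell out the interface the patch uses.

```shell
#!/bin/sh
# Hypothetical trigger file; the real interface depends on the patch in use.
DEFERRED_TRIGGER="${DEFERRED_TRIGGER:-/proc/deferred_initcalls}"

# Ask the kernel to run the deferred initcalls by reading the trigger file.
# Returns non-zero if the trigger file is missing or unreadable.
trigger_deferred_initcalls() {
    [ -r "$DEFERRED_TRIGGER" ] || return 1
    cat "$DEFERRED_TRIGGER" > /dev/null
}
```

The point of the design is that the call site is chosen by user space: it
could sit at the end of the boot script, or behind a UI action that happens
long after boot.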

This code is really intended for a 

RE: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Nicolas Pitre
On Thu, 23 Oct 2014, Bird, Tim wrote:

 I'm not sure why this attention to reading the status.  The salient feature
 here is that the initializations are deferred until user space tells the 
 kernel
 to proceed.  It's the initiation of the trigger from user-space that matters.
 The whole purpose of this feature is to defer some driver initializations 
 until
 the product can get into a state where it is already ready to perform its
 primary function.  Only user space knows when that is.

This is still a rather restrictive view of the problem IMHO.

Let's step back a bit. Your concern is that some initcalls are taking 
too long and preventing user space from executing early, right?  I'm 
suggesting that they no longer prevent user space from executing 
earlier.  Why would you then still want an explicit trigger from user 
space?

 There seems to be a desire to have an automatic mechanism for triggering
 the deferred initializations.  I'm OK with this, as long as there's some 
 reasonable
 use case for it.  There are lots of possible trigger mechanisms, including 
 just
 a simple timer, but I think it's important that the primary use case of 
 'trigger-when-user-space-says-to' is still supported.

Why a trigger?  I'm suggesting no trigger at all is needed.

Let all initcalls start initializing whenever they can.  Simply that 
they shouldn't prevent user space from running early.

Because initcalls are running in parallel, then they must be using 
separate kernel threads.  It may be possible to adjust their priority so 
if one of them is actually using a lot of CPU cycles then it will run 
only when all the other threads (including user space) are no longer 
running.

 This code is really intended for a very specialized kernel configuration, 
 where all
 the modules are statically linked, and indeed module loading itself is turned 
 off. 
 I think that's a minority of Linux deployments out there.  This configuration
 implies some other attributes, like configuration for very small size and/or 
 very
 fast boot, where KALLSYMS may not be present, and other kernel features may
 not be available as well.  Indeed, in the smallest systems /proc or /sys may 
 not
 be there, so an alternative (maybe a sysctl or even a new syscall) might be
 appropriate. 
 
 Quite frankly, the hacky way this is often done is to make stuff like this a
 one-time side effect of a rarely called syscall (like sync).  Please note I'm 
 not
 recommending this for mainline, just pointing out there are interesting ways
 that embedded developers just make the existing code work for their weird
 cases.

Agreed.  However if you're looking for a solution that may go into 
mainline, it just can't be hackish like that.  There might be generic 
solutions that meet your goal while still being useful to others.  
Focussing on the best way to implement a particular solution while there 
might be other solutions to explore is a bad approach.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Alexandre Belloni
On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
 On Thu, 23 Oct 2014, Bird, Tim wrote:
 
  I'm not sure why this attention to reading the status.  The salient feature
  here is that the initializations are deferred until user space tells the 
  kernel
  to proceed.  It's the initiation of the trigger from user-space that 
  matters.
  The whole purpose of this feature is to defer some driver initializations 
  until
  the product can get into a state where it is already ready to perform its
  primary function.  Only user space knows when that is.
 
 This is still a rather restrictive view of the problem IMHO.
 
 Let's step back a bit. Your concern is that some initcalls are taking 
 too long and preventing user space from executing early, right?  I'm 
 suggesting that they no longer prevent user space from executing 
 earlier.  Why would you then still want an explicit trigger from user 
 space?
 
  There seems to be a desire to have an automatic mechanism for triggering
  the deferred initializations.  I'm OK with this, as long as there's some 
  reasonable
  use case for it.  There are lots of possible trigger mechanisms, including 
  just
  a simple timer, but I think it's important that the primary use case of 
  'trigger-when-user-space-says-to' is still supported.
 
 Why a trigger?  I'm suggesting no trigger at all is needed.
 
 Let all initcalls start initializing whenever they can.  Simply that 
 they shouldn't prevent user space from running early.
 
 Because initcalls are running in parallel, then they must be using 
 separate kernel threads.  It may be possible to adjust their priority so 
 if one of them is actually using a lot of CPU cycles then it will run 
 only when all the other threads (including user space) are no longer 
 running.
 

You probably can't do that without introducing race conditions. A number
of userspace libraries and scripts are actually expecting init and probe
to be synchronous. I will refer to the async probe discussion and the
following thread:

http://thread.gmane.org/gmane.linux.kernel/1781529

Anyway, your userspace will have to have a way to know what has been
initialized. On my side, I was also using that mechanism to delay the
network stack init but I still want to know when my dhcp client can
start for example.

-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Rob Landley


On 10/23/14 12:21, Bird, Tim wrote:
 On Wednesday, October 22, 2014 8:49 AM, Nicolas Pitre [n...@fluxnic.net] 
 wrote:
 On Wed, 22 Oct 2014, Rob Landley wrote:
 Otherwise the standard hotplug notification mechanism is already
 available.
 
 I'm not sure why this attention to reading the status.  The salient feature
 here is that the initializations are deferred until user space tells the 
 kernel
 to proceed.  It's the initiation of the trigger from user-space that matters.
 The whole purpose of this feature is to defer some driver initializations 
 until
 the product can get into a state where it is already ready to perform its
 primary function.  Only user space knows when that is.
 
 There seems to be a desire to have an automatic mechanism for triggering
 the deferred initializations.  I'm OK with this, as long as there's some 
 reasonable
 use case for it.  There are lots of possible trigger mechanisms, including 
 just
 a simple timer, but I think it's important that the primary use case of 
 'trigger-when-user-space-says-to' is still supported.

The patches were referenced but not (re-?)posted. People were talking
about waiting for the real root filesystem to show up, which strikes me
as the wrong approach. Glad to hear the patch series is taking a better one.

 This code is really intended for a very specialized kernel configuration, 
 where all
 the modules are statically linked, and indeed module loading itself is turned 
 off. 
 I think that's a minority of Linux deployments out there.

Yeah, but not as rare as you're implying. That's how I build most of my
systems, for example.

Modules mean you need bits of the kernel to live in the root filesystem
image (and to match it exactly due to stable-api-nonsense.txt), which
complicates both build and upgrade. Unloading modules has never really
been properly supported, so there's no actual size or complexity
advantage to modules: you need it once and the resource is consumed
until next reboot. And of course there's security fun (spraying it down
with cryptography makes it awkward more than safe, and doesn't
change that you now have a multimode kernel that sometimes does one
thing and sometimes does another).

Not Going There with modules is a valid response for embedded systems if
I want to know what I'm deploying.

 This configuration
 implies some other attributes, like configuration for very small size and/or 
 very
 fast boot, where KALLSYMS may not be present, and other kernel features may
 not be available as well.

A new feature can have requirements. Not every existing deployment can
take advantage of any given new feature anyway. (Your _biggest_ blocker
will be that they're using a ${VENDOR:-broadcom} BSP that's stuck on
2.6.32 in 2014 and upgrading to a kernel version less than 5 years old
will never happen as long as you source hardware from vendors that fork
software rather than getting support upstream.)

 Indeed, in the smallest systems /proc or /sys may not
 be there, so an alternative (maybe a sysctl or even a new syscall) might be
 appropriate. 

A) Those don't interest me. As far as I'm concerned, they're not Linux.

B) If you propose a new syscall for this, it will never be merged. The
mechanism they implemented for this sort of thing is sysfs and hotplug.

 Quite frankly, the hacky way this is often done is to make stuff like this a
 one-time side effect of a rarely called syscall (like sync).  Please note I'm 
 not
 recommending this for mainline, just pointing out there are interesting ways
 that embedded developers just make the existing code work for their weird
 cases.
 
 Maybe there are some use cases for doing deferred initializations, 
 particularly
 automatically, for systems that do have modules turned on (i.e. for modules
 that are, in that case, still statically linked to the kernel for whatever 
 reason).
 I would welcome some discussion of these, to select an appropriate trigger
 mechanism for those cases.  But we should not let the primary purpose of this
 feature get lost in that discussion.

I thought it was common to defer at least some device probing until the
/dev node got opened. Which is a chicken and egg problem with regards to
the dev node showing up so you _can_ open them, which screwed up devfs
to the point of unworkability, and the answer to that was sysfs. So
having sysfs trigger deferred init from userspace makes perfect sense,
doing it that way means history is on your side and the kernel guys are
more likely to approve because it smells like what they've already done.

   -- Tim

Rob


Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Nicolas Pitre
On Thu, 23 Oct 2014, Alexandre Belloni wrote:

 On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
  On Thu, 23 Oct 2014, Bird, Tim wrote:
  
   I'm not sure why this attention to reading the status.  The salient 
   feature
   here is that the initializations are deferred until user space tells the 
   kernel
   to proceed.  It's the initiation of the trigger from user-space that 
   matters.
   The whole purpose of this feature is to defer some driver initializations 
   until
   the product can get into a state where it is already ready to perform its
   primary function.  Only user space knows when that is.
  
  This is still a rather restrictive view of the problem IMHO.
  
  Let's step back a bit. Your concern is that some initcalls are taking 
  too long and preventing user space from executing early, right?  I'm 
  suggesting that they no longer prevent user space from executing 
  earlier.  Why would you then still want an explicit trigger from user 
  space?
  
   There seems to be a desire to have an automatic mechanism for triggering
   the deferred initializations.  I'm OK with this, as long as there's some 
   reasonable
   use case for it.  There are lots of possible trigger mechanisms, 
   including just
   a simple timer, but I think it's important that the primary use case of 
   'trigger-when-user-space-says-to' is still supported.
  
  Why a trigger?  I'm suggesting no trigger at all is needed.
  
  Let all initcalls start initializing whenever they can.  Simply that 
  they shouldn't prevent user space from running early.
  
  Because initcalls are running in parallel, then they must be using 
  separate kernel threads.  It may be possible to adjust their priority so 
  if one of them is actually using a lot of CPU cycles then it will run 
  only when all the other threads (including user space) are no longer 
  running.
  
 
 You probably can't do that without introducing race conditions. A number
  of userspace libraries and scripts are actually expecting init and probe
 to be synchronous.

They already have to cope with the fact that most things can be 
available through not-yet-loaded modules, or may never be there at all. 
If not then they should be fixed.

And if you do rely on such a feature for your small embedded 
system then you won't have that many libs and scripts to fix.

 I will refer to the async probe discussion and the
 following thread:
 
 http://thread.gmane.org/gmane.linux.kernel/1781529

I still don't think that is a good idea at all.  This async probe 
concept requires a trigger from user space and that opens many cans of 
worms as user space now has to be aware of specific kernel driver 
modules, their ordering dependencies, etc.

My point is simply not to defer any initialization at all.  This way you 
don't have to select which module or initcall to send a trigger for 
later on.

Once again, what is the actual problem you want to solve?  If it is 
about making sure user space can execute ASAP then _that_ should be the 
topic, not figuring out how to implement a particular solution.

 Anyway, your userspace will have to have a way to know what has been
 initialized.

Hotplug notifications via dbus.

 On my side, I was also using that mechanism to delay the network stack 
 init but I still want to know when my dhcp client can start for 
 example.

Ditto.  And not only do you want to know when the network stack is 
initialized, but you also need to wait for a link to be established 
before DHCP can work.
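
A minimal sketch of the "wait for link before starting DHCP" step, assuming
the interface exposes the usual carrier attribute under /sys/class/net.  The
function names, the NET_CLASS_DIR override and the one-second polling
interval are choices made for this example only.

```shell
#!/bin/sh
# Root of the network class directory; overridable so the functions can be
# exercised against a fake tree.
NET_CLASS_DIR="${NET_CLASS_DIR:-/sys/class/net}"

# Succeeds once interface $1 exists and reports carrier (link established).
link_is_up() {
    carrier_file="$NET_CLASS_DIR/$1/carrier"
    [ -r "$carrier_file" ] || return 1
    # Reading carrier can fail while the interface is administratively down.
    state=$(cat "$carrier_file" 2>/dev/null) || return 1
    [ "$state" = "1" ]
}

# Poll for link on interface $1, giving up after $2 attempts one second
# apart (default 10).
wait_for_link() {
    tries="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        if link_is_up "$1"; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

Polling is the crude option here; a hotplug or netlink listener would avoid
the sleep loop, at the cost of more machinery in a small system.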


Nicolas


RE: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Bird, Tim
On Thursday, October 23, 2014 12:05 PM, Nicolas Pitre wrote:

 On Thu, 23 Oct 2014, Alexandre Belloni wrote:

  On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
   On Thu, 23 Oct 2014, Bird, Tim wrote:
  
    I'm not sure why this attention to reading the status.  The salient
    feature here is that the initializations are deferred until user space
    tells the kernel to proceed.  It's the initiation of the trigger from
    user-space that matters.  The whole purpose of this feature is to defer
    some driver initializations until the product can get into a state where
    it is already ready to perform its primary function.  Only user space
    knows when that is.
  
   This is still a rather restrictive view of the problem IMHO.
  
   Let's step back a bit. Your concern is that some initcalls are taking
   too long and preventing user space from executing early, right?

Well, not exactly.

That is not the exact problem we're trying to solve, although it is close.
The problem is not that user space doesn't start early enough, per se,
it's that there are a set of drivers statically linked to the kernel that are
not needed until after (possibly well after) user space starts.
Any cycles whatsoever being spent on those drivers (either in their
initialization routines, or in processing them or scheduling them)
impairs the primary function of the device.  On a very old presentation
I gave on this, the use case I gave was getting a picture of a baby's smile.
USB drivers are NOT needed for this, but they *are* needed for full
product operation.

In some cases, the system may want to defer initialization of some drivers
until explicit action through the user interface.  So the trigger may not be
called until well after boot is completed.

   I'm suggesting that they no longer prevent user space from executing
   earlier.  Why would you then still want an explicit trigger from user
   space?

Because only the user space knows when it is now OK to initialize those
drivers, and begin using CPU cycles on them.

  
    There seems to be a desire to have an automatic mechanism for triggering
    the deferred initializations.  I'm OK with this, as long as there's some
    reasonable use case for it.  There are lots of possible trigger
    mechanisms, including just a simple timer, but I think it's important
    that the primary use case of 'trigger-when-user-space-says-to' is still
    supported.
  
   Why a trigger?  I'm suggesting no trigger at all is needed.
  
   Let all initcalls start initializing whenever they can.  Simply that
   they shouldn't prevent user space from running early.
  
   Because initcalls are running in parallel, then they must be using
   separate kernel threads.  It may be possible to adjust their priority so
   if one of them is actually using a lot of CPU cycles then it will run
   only when all the other threads (including user space) are no longer
   running.
  
 
  You probably can't do that without introducing race conditions. A number
  of userspace libraries and scripts are actually expecting init and probe
  to be synchronous.

 They already have to cope with the fact that most things can be
 available through not-yet-loaded modules, or may never be there at all.
 If not then they should be fixed.

 And if you do rely on such a feature for your small embedded
 system then you won't have that many libs and scripts to fix.

  I will refer to the async probe discussion and the
  following thread:
 
  http://thread.gmane.org/gmane.linux.kernel/1781529

 I still don't think that is a good idea at all.  This async probe
 concept requires a trigger from user space and that opens many cans of
 worms as user space now has to be aware of specific kernel driver
 modules, their ordering dependencies, etc.

 My point is simply not to defer any initialization at all.  This way you
 don't have to select which module or initcall to send a trigger for
 later on.

If you are going to avoid having a sub-set of modules consume
CPU cycles in early boot, you're going to have to identify them somehow.
How do you propose to enumerate the modules to defer (or
de-prioritize, as the case may be)?

Note that this solution should work on UP systems, where there is
essentially a zero-sum game on using CPU cycles at boot.


 Once again, what is the actual problem you want to solve?  If it is
 about making sure user space can execute ASAP then _that_ should be the
 topic, not figuring out how to implement a particular solution.

See above.  The actual problem is that we want some sub-set of statically
linked drivers to not consume any cycles during a period of time defined
by user space.  This is rather trivial to accomplish with modules, and the
proposed implementation tries to provide similar functionality for a statically
linked kernel.  I'm open to discussing solutions other than the particular
implementation proposed, just not ones that don't actually solve that problem.

  

RE: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Nicolas Pitre
On Thu, 23 Oct 2014, Bird, Tim wrote:

 On Thursday, October 23, 2014 12:05 PM, Nicolas Pitre wrote:
 
  On Thu, 23 Oct 2014, Alexandre Belloni wrote:
 
   On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
    On Thu, 23 Oct 2014, Bird, Tim wrote:
   
     I'm not sure why this attention to reading the status.  The salient
     feature here is that the initializations are deferred until user space
     tells the kernel to proceed.  It's the initiation of the trigger from
     user-space that matters.  The whole purpose of this feature is to defer
     some driver initializations until the product can get into a state
     where it is already ready to perform its primary function.  Only user
     space knows when that is.
   
    This is still a rather restrictive view of the problem IMHO.
   
    Let's step back a bit. Your concern is that some initcalls are taking
    too long and preventing user space from executing early, right?

 Well, not exactly.
 
 That is not the exact problem we're trying to solve, although it is close.
 The problem is not that user space doesn't start early enough, per se,
 it's that there are a set of drivers statically linked to the kernel that are
 not needed until after (possibly well after) user space starts.
 Any cycles whatsoever being spent on those drivers (either in their
 initialization routines, or in processing them or scheduling them)
 impairs the primary function of the device.  On a very old presentation
 I gave on this, the use case I gave was getting a picture of a baby's smile.
 USB drivers are NOT needed for this, but they *are* needed for full
 product operation.

As I suggested earlier, those cycles spent on those drivers may be 
deferred to a moment when the CPU has nothing else to do anyway by 
giving a lower priority to the threads handling them.
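
For comparison only, here is the user-space analogue of "run it at a lower
priority", using the standard nice utility.  This is a sketch of the idea,
not of how kernel initcall threads would be scheduled; the function name is
made up for the example, and a low nice level reduces a job's CPU share
without guaranteeing that it gets none.

```shell
#!/bin/sh
# Run a command at the lowest conventional niceness so it mostly uses CPU
# time that higher-priority work leaves idle. Returns the command's status.
run_low_priority() {
    nice -n 19 "$@"
}
```

A boot script could wrap a slow, non-essential helper this way, for example
`run_low_priority some_slow_helper`, where some_slow_helper is hypothetical.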

 In some cases, the system may want to defer initialization of some drivers
 until explicit action through the user interface.  So the trigger may not be
 called until well after boot is completed.

In that case the trigger for initializing those drivers should be the 
first time they're accessed from user space.  That could be the very 
first time libusb or similar tries to enumerate available USB devices 
for example.  No special interface needed.

    I'm suggesting that they no longer prevent user space from executing
    earlier.  Why would you then still want an explicit trigger from user
    space?

 Because only the user space knows when it is now OK to initialize those
 drivers, and begin using CPU cycles on them.

So what?  That is still not a good answer.

User space shouldn't have to care as long as it has all the CPU cycles 
it wants in priority.  But as soon as user space relinquishes the CPU 
then there is no reason why driver initialization couldn't take over 
until user space is made runnable again.

[...]
  My point is simply not to defer any initialization at all.  This way you
  don't have to select which module or initcall to send a trigger for
  later on.
 
 If you are going to avoid having a sub-set of modules consume
 CPU cycles in early boot, you're going to have to identify them somehow.
 How do you propose to enumerate the modules to defer (or
 de-prioritize, as the case may be)?

Anything that is not involved with making the root fs available.

 Note that this solution should work on UP systems, where there is
 essentially a zero-sum game on using CPU cycles at boot.

The scheduler knows how to prioritize things on UP as well.  The top 
priority thread will always go to sleep at some point allowing other 
threads to run. But I'm sure you know all that.

  Once again, what is the actual problem you want to solve?  If it is
  about making sure user space can execute ASAP then _that_ should be the
  topic, not figuring out how to implement a particular solution.
 
 See above.  The actual problem is that we want some sub-set of statically
 linked drivers to not consume any cycles during a period of time defined
 by user space. 

Once again you're defining a solution (i.e. not consume any cycles ...) 
rather than the problem motivating this particular solution. That's not 
how you're going to have something merged upstream.

And I'm not saying your solution is completely bad either if you're 
looking for the simplest way and willing to keep it to yourself.  What 
I'm saying is that there are other possible solutions that could solve 
your initial problem _and_ be acceptable to mainline... but they're 
unlikely to look like what you have now.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Rob Landley
On 10/23/14 14:05, Nicolas Pitre wrote:
 On Thu, 23 Oct 2014, Alexandre Belloni wrote:
 
 On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
 On Thu, 23 Oct 2014, Bird, Tim wrote:
 Why a trigger?  I'm suggesting no trigger at all is needed.

 Let all initcalls start initializing whenever they can.  Simply that 
 they shouldn't prevent user space from running early.

 Because initcalls are running in parallel, then they must be using 
 separate kernel threads.  It may be possible to adjust their priority so 
 if one of them is actually using a lot of CPU cycles then it will run 
 only when all the other threads (including user space) are no longer 
 running.


 You probably can't do that without introducing race conditions. A number
 of userspace libraries and scripts are actually expecting init and probe
 to be synchronous.
 
 They already have to cope with the fact that most things can be 
 available through not-yet-loaded modules, or may never be there at all. 
 If not then they should be fixed.
 
 And if you do rely on such a feature for your small embedded 
 system then you won't have that many libs and scripts to fix.

There are userspace libraries distinguishing between init and probe?
I.E. treating them as two separate things already?

So how were they accessing them as two separate things before this patch
set?

 I will refer to the async probe discussion and the
 following thread:

 http://thread.gmane.org/gmane.linux.kernel/1781529
 
 I still don't think that is a good idea at all.  This async probe 
 concept requires a trigger from user space and that opens many cans of 
 worms as user space now has to be aware of specific kernel driver 
 modules, their ordering dependencies, etc.
 
 My point is simply not to defer any initialization at all.  This way you 
 don't have to select which module or initcall to send a trigger for 
 later on.

Why would this be hard?

for i in $(find /sys/module -name initstate)
do
  [ "$(cat "$i")" != live ] && echo kick > "$i"
done

And I'm confused that you're concerned about init order so your solution
is to do nothing, thereby preserving the existing init order which could
not _possibly_ be exposed verbatim to userspace...
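
A read-only variant of the loop above can be handy for checking what is
still pending before kicking anything.  This is only a sketch: the function
name is invented for the example, and whether statically linked code exposes
an initstate file under /sys/module at all is an assumption carried over
from the discussion, not something the sketch verifies.

```shell
#!/bin/sh
# Print the name of every module directory under $1 (default /sys/module)
# whose initstate file exists and does not read "live".
list_not_live() {
    root="${1:-/sys/module}"
    for f in "$root"/*/initstate; do
        # Skip the unexpanded glob and any unreadable files.
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" != live ]; then
            basename "$(dirname "$f")"
        fi
    done
}
```

Taking the root directory as an argument keeps the function testable against
a scratch directory instead of the live sysfs tree.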

 Once again, what is the actual problem you want to solve?  If it is 
 about making sure user space can execute ASAP then _that_ should be the 
 topic, not figuring out how to implement a particular solution.
 
 Anyway, your userspace will have to have a way to know what has been
 initialized.
 
 Hotplug notifications via dbus.

Wait, we need a _third_ mechanism for hotplug notifications now? (The
/proc/sys/kernel/hotplug helper, netlink, and you want another one?)

 On my side, I was also using that mechanism to delay the network stack 
 init but I still want to know when my dhcp client can start for 
 example.
 
 Ditto.  And not only do you want to know when the network stack is 
 initialized, but you also need to wait for a link to be established 
 before DHCP can work.

Um, doesn't the existing hotplug mechanism _already_ give us
notification that eth0 and similar showed up? (Pretty sure I hit that
while poking at mdev, although it was a while ago...)
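
To make the hotplug route concrete, here is a sketch of the decision a
/proc/sys/kernel/hotplug helper could take when a network interface shows
up.  The kernel hands the helper its event in environment variables such as
ACTION and SUBSYSTEM, and network device events carry INTERFACE as well.
The function name and the "start-dhcp" output are placeholders for this
example; a real helper would launch whatever DHCP client the system ships.

```shell
#!/bin/sh
# Decide what to do for one hotplug event described by the environment.
# Prints a placeholder action line when a network interface is added and
# prints nothing for every other event.
handle_uevent() {
    if [ "${ACTION:-}" = add ] && [ "${SUBSYSTEM:-}" = net ] \
        && [ -n "${INTERFACE:-}" ]; then
        # Hypothetical reaction: a real helper would start a DHCP client here.
        echo "start-dhcp $INTERFACE"
    fi
}
```

Note that the interface appearing is not the same as the link being up, so a
helper like this would still need a carrier check before DHCP can succeed.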

Increasingly confused,

Rob


Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Rob Landley


On 10/23/14 15:50, Nicolas Pitre wrote:
 On Thu, 23 Oct 2014, Bird, Tim wrote:
 
 On Thursday, October 23, 2014 12:05 PM, Nicolas Pitre wrote:

 On Thu, 23 Oct 2014, Alexandre Belloni wrote:

 On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
 On Thu, 23 Oct 2014, Bird, Tim wrote:

 I'm not sure why this attention to reading the status.  The salient 
 feature
 here is that the initializations are deferred until user space tells the 
 kernel
 to proceed.  It's the initiation of the trigger from user-space that 
 matters.
 The whole purpose of this feature is to defer some driver 
 initializations until
 the product can get into a state where it is already ready to perform its
 primary function.  Only user space knows when that is.

 This is still a rather restrictive view of the problem IMHO.

 Let's step back a bit. Your concern is that some initcalls are taking
 too long and preventing user space from executing early, right?
 Well,  not exactly.

 That is not the exact problem we're trying to solve, although it is close.
 The problem is not that user space doesn't start early enough, per se,
 it's that there are a set of drivers statically linked to the kernel that are
 not needed until after (possibly well after) user space starts.
 Any cycles whatsoever being spent on those drivers (either in their
 initialization routines, or in processing them or scheduling them)
 impairs the primary function of the device.  On a very old presentation
 I gave on this, the use case I gave was getting a picture of a baby's smile.
 USB drivers are NOT needed for this, but they *are* needed for full
 product operation.
 
 As I suggested earlier, those cycles spent on those drivers may be 
 deferred to a moment when the CPU has nothing else to do anyway by 
 giving a lower priority to the threads handling them.

Unless you're using realtime priorities your kernel will spend about 5%
of its time servicing the lowest priority threads no matter what you do,
to avoid priority inversion lockups of the kind that nearly cost us a Mars
probe back in the '90s.

http://research.microsoft.com/en-us/um/people/mbj/Mars_Pathfinder/Authoritative_Account.html

Doing hardware probing at low priorities can cause really _fun_ latency
spikes in the system as something grabs a lock and then sleeps. (And
doing this at the realtime scheduling where it won't do that translates
those latency spikes into the aforementioned hard lockup, so not
actually a solution per se.)

Trying to fix this in the general case is the priority inheritance
problem, and last I heard was really hard. Maybe it's been fixed in the
past few years and I hadn't noticed. (The rise of SMP made it a less
pressing issue, but system bringup is its own little world.)

The reliable fix to priority inversion is to let low priority jobs still
get a decent crack at the CPU so clogs clear themselves naturally. And
this means that scheduling it down as far as it goes does _not_ simply
make low priority jobs go away.

 In some cases, the system may want to defer initialization of some drivers
 until explicit action through the user interface.  So the trigger may not be
 called until well after boot is completed.
 
 In that case the trigger for initializing those drivers should be the 
 first time they're accessed from user space.

Which gets us back to one of the big reasons <strike>systemd</strike>
devfsd failed years ago: you have to probe the hardware in order to know
which /dev nodes to create, so you can't have accessing the /dev node
probe the hardware. (There's no /dev node for a usb controller...)

 That could be the very
 first time libusb or similar tries to enumerate available USB devices 
 for example.  No special interface needed.

So now you're requiring libusb enumerating usb devices, when before this
you could just reach out and open /dev/ttyUSB0 and it would be there.

This is an embedded solution?

 I'm suggesting that they no longer prevent user space from executing
 earlier.  Why would you then still want an explicit trigger from user
 space?
 Because only the user space knows when it is now OK to initialize those
 drivers, and begin using CPU cycles on them.
 
 So what?  That is still not a good answer.

Why?

I believe Tim's proposal was to take a category of existing device
probing, one already done on a background thread, and wait to start it
until userspace says go. That's about as nonintrusive a change as you get.

You're talking about requiring weird arbitrary things to have side effects.

 User space shouldn't have to care as long as it has all the CPU cycles 
 it wants in priority.

That's not how scheduling works. The realtime people have been trying to
make scheduling work that way for _years_ and it's still a flaming pain
to use their stuff without hard lockups and weird inexplicable dropouts.

 But as soon as user space relinquishes the CPU 
 then there is no reason why driver initialization couldn't take over 
 until user space is made runnable again.

Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Nicolas Pitre
On Thu, 23 Oct 2014, Rob Landley wrote:

 On 10/23/14 14:05, Nicolas Pitre wrote:
  On Thu, 23 Oct 2014, Alexandre Belloni wrote:
  
  On 23/10/2014 at 13:56:44 -0400, Nicolas Pitre wrote :
  On Thu, 23 Oct 2014, Bird, Tim wrote:
  Why a trigger?  I'm suggesting no trigger at all is needed.
 
  Let all initcalls start initializing whenever they can.  Simply that 
  they shouldn't prevent user space from running early.
 
  Because initcalls are running in parallel, then they must be using 
  separate kernel threads.  It may be possible to adjust their priority so 
  if one of them is actually using a lot of CPU cycles then it will run 
  only when all the other threads (including user space) are no longer 
  running.
 
 
  You probably can't do that without introducing race conditions. A number
  of userspace libraries and scripts are actually expecting init and probe
  to be synchronous.
  
  They already have to cope with the fact that most things can be 
  available through not-yet-loaded modules, or may never be there at all. 
  If not then they should be fixed.
  
  And if you do rely on such a feature for your small embedded 
  system then you won't have that many libs and scripts to fix.
 
 There are userspace libraries distinguishing between init and probe?
 I.E. treating them as two separate things already?

Why not?

 So how were they accessing them as two separate things before this patch
 set?

Before engaging in a conversation with a device, you verify if it exists 
first?

  I will refer to the async probe discussion and the
  following thread:
 
  http://thread.gmane.org/gmane.linux.kernel/1781529
  
  I still don't think that is a good idea at all.  This async probe 
  concept requires a trigger from user space and that opens many cans of 
  worms as user space now has to be aware of specific kernel driver 
  modules, their ordering dependencies, etc.
  
  My point is simply not to defer any initialization at all.  This way you 
  don't have to select which module or initcall to send a trigger for 
  later on.
 
 Why would this be hard?
 
 for i in $(find /sys/module -name initstate)
 do
   [ "$(cat $i)" != live ] && echo kick > $i
 done

You should have a look at /sys/bus/*/*probe then.  Maybe it does what 
you need already.
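
A minimal sketch of finding those files, assuming the usual sysfs layout in which a bus directory carries entries such as drivers_probe and drivers_autoprobe. The function name and the optional root argument are made up for illustration; nothing here writes to the files.

```shell
#!/bin/sh
# List the probe-related control files matching /sys/bus/*/*probe.
# The optional argument (a hypothetical convenience) points the function
# at a different root, e.g. a scratch directory for testing.
list_probe_files() {
    root="${1:-/sys/bus}"
    for f in "$root"/*/*probe; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        printf '%s\n' "$f"
    done
}
```

Called with no argument it walks the live /sys/bus; what writing to any of the listed files does is up to the bus in question.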

 And I'm confused that you're concerned about init order so your solution
 is to do nothing, thereby preserving the existing init order which could
 not _possibly_ be exposed verbatim to userspace...

The kernel already has the deferred probe mechanism to cope with the 
init ordering which, as experience has shown, may only be dealt with at 
run time.  All attempts to create that ordering statically in the past 
have failed.  So what do you want exposed verbatim to user space again?

  Once again, what is the actual problem you want to solve?  If it is 
  about making sure user space can execute ASAP then _that_ should be the 
  topic, not figuring out how to implement a particular solution.
  
  Anyway, your userspace will have to have a way to know what has been
  initialized.
  
  Hotplug notifications via dbus.
 
 Wait, we need a _third_ mechanism for hotplug notifications now? (The
 /proc/sys/kernel/hotplug helper, netlink, and you want another one?)

No, I actually meant hotplug and netlink.  My bad.

  On my side, I was also using that mechanism to delay the network stack 
  init but I still want to know when my dhcp client can start for 
  example.
  
  Ditto.  And not only do you want to know when the network stack is 
  initialized, but you also need to wait for a link to be established 
  before DHCP can work.
 
 Um, doesn't the existing hotplug mechanism _already_ give us
 notification that eth0 and similar showed up? (Pretty sure I hit that
 while poking at mdev, although it was a while ago...)

Indeed it does. So no new user space notification mechanisms are needed 
which is my point.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-23 Thread Nicolas Pitre
On Thu, 23 Oct 2014, Rob Landley wrote:

 Doing hardware probing at low priorities can cause really _fun_ latency
 spikes in the system as something grabs a lock and then sleeps. (And
 doing this at the realtime scheduling where it won't do that translates
 those latency spikes into the aforementioned hard lockup, so not
 actually a solution per se.)
 
 Trying to fix this in the general case is the priority inheritance
 problem, and last I heard was really hard. Maybe it's been fixed in the
 past few years and I hadn't noticed. (The rise of SMP made it a less
 pressing issue, but system bringup is its own little world.)
 
I know you're a smart *ss.  But:

1) All this is not about fixing the RT scheduler for the general case.

2) System bring-up being its own world may have special scheduling 
   treatment that doesn't necessarily have to be RT.

3) You, too, have conveniently avoided defining the initial problem so far.
   That makes for rather sterile conversations about alternative 
   solutions that could score higher on the mainline acceptance scale.

  In some cases, the system may want to defer initialization of some drivers
  until explicit action through the user interface.  So the trigger may not be
  called until well after boot is completed.
  
  In that case the trigger for initializing those drivers should be the 
  first time they're accessed from user space.
 
 Which gets us back to one of the big reasons <strike>systemd</strike>
 devfsd failed years ago: you have to probe the hardware in order to know
 which /dev nodes to create, so you can't have accessing the /dev node
 probe the hardware. (There's no /dev node for a usb controller...)

There is /sys/bus/usb/devices that could be accessed in order to trigger 
the initial setup and probe.  It is most likely that libusb does that, 
but this could be made to work with a simple 'cat' or 'touch' invocation 
as well.
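
For the user-space half of that idea, a cat-level enumeration could look like the sketch below. It only reads the directory; having that access trigger the setup and probe is the proposal here, not something the sketch relies on. The function name and the optional root argument are hypothetical, and idVendor and idProduct are the per-device attribute files assumed to be present.

```shell
#!/bin/sh
# Print "name vendor:product" for each USB device under a sysfs-style tree.
# The root argument is a hypothetical convenience for testing; by default
# the function reads /sys/bus/usb/devices.
list_usb_devices() {
    root="${1:-/sys/bus/usb/devices}"
    for dev in "$root"/*; do
        # Interface entries (e.g. 1-1:1.0) have no idVendor file; skip them.
        [ -r "$dev/idVendor" ] || continue
        printf '%s %s:%s\n' "${dev##*/}" \
            "$(cat "$dev/idVendor")" "$(cat "$dev/idProduct")"
    done
}
```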

  That could be the very first time libusb or similar tries to 
  enumerate available USB devices for example.  No special interface 
  needed.
 
 So now you're requiring libusb enumerating usb devices, when before this
 you could just reach out and open /dev/ttyUSB0 and it would be there.

You can't just reach out with the deferred initcall scheme either, can 
you?

 This is an embedded solution?
 
  I'm suggesting that they no longer prevent user space from executing
  earlier.  Why would you then still want an explicit trigger from user
  space?
  Because only the user space knows when it is now OK to initialize those
  drivers, and begin using CPU cycles on them.
  
  So what?  That is still not a good answer.
 
 Why?
 
 I believe Tim's proposal was to take a category of existing device
 probing, one already done on a background thread, and wait to start it
 until userspace says go. That's about as nonintrusive a change as you get.

You might still be able to do better.

If you really want to be nonintrusive, you could e.g. put those 
background threads into a SIGSTOP state and let user space SIGCONT them 
as it sees fit.  No new special interfaces needed.
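
Seen from user space, the stop/continue pair is just the following, shown here on an ordinary process. Whether initcall threads could be parked this way is the open part of the suggestion: kernel threads normally ignore signals, so treat this as an illustration of the interface only. The helper names are made up, and job_state assumes Linux procfs.

```shell
#!/bin/sh
# Park and release a process with the standard job-control signals.
pause_job()  { kill -s STOP "$1"; }
resume_job() { kill -s CONT "$1"; }

# Report the one-letter state from /proc/PID/stat (T means stopped).
# Assumes Linux procfs and a command name without spaces.
job_state() {
    awk '{ print $3 }' "/proc/$1/stat"
}
```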

 You're talking about requiring weird arbitrary things to have side effects.

As if stalling arbitrary initcalls wouldn't have side effects?

What I'm suggesting is to let the system do its thing the most efficient 
way while giving a strong bias to running user space first.  How 
arbitrarily weird can that be?

 If you're running in initramfs we haven't necessarily done _any_ driver
 probing yet. That's what initramfs is for. You can put device firmware
 in there so static drivers can make hotplug firmware loading requests to
 userspace during their device programming. (It's one of those usermode
 helper callback things.)

True if you need firmware, or if you want to actually load modules to 
get to the root fs device.  Otherwise all built-in driver init functions 
have been called and waited for at that point.

  Note that this solution should work on UP systems, where there is
  essentially a zero-sum game on using CPU cycles at boot.
  
  The scheduler knows how to prioritize things on UP as well.  The top 
  priority thread will always go to sleep at some point allowing other 
  threads to run. But I'm sure you know all that.
 
 The top priority threads will get preempted.
 
 (Did you follow any of the work Con Kolivas and company were doing a few
 years ago?)

Yeah... and I also notice it is still maintained, still out of mainline.

As you know already, you can do anything you want on your own.  That's 
granted by the GPL.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-22 Thread Frank Rowand
On 10/21/2014 12:37 PM, Bird, Tim wrote:

 < snip >

 With regards to doing it dynamically, I'd have to think about how
 to do that.  Having text-based lists of things to do at runtime seems
 to fit with how we're using device tree these days, but I'm not sure
 how that would work.

Initcall function names are not available without KALLSYMS.  That
dependency would increase kernel size.  So text based does not
seem too good.

Of course, if you are creating a text based list at compile time,
a macro could easily convert an init function text name to the
function pointer that is used in do_initcall_level().  Thus you
would have a not so large list of function pointers.

 < snip >

-Frank


Re: Why is the deferred initcall patch not mainline?

2014-10-22 Thread Geert Uytterhoeven
On Tue, Oct 21, 2014 at 9:58 PM, Nicolas Pitre n...@fluxnic.net wrote:
 Yeah, I'm not a big fan of having to change kernel code in order to
 use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
 to add a 'D' option to the config system, so that the record of which
 modules to defer could be stored there.  This is much better than
 hand-altering code.  I don't know how difficult this would be to add
 to the kbuild system, but the mechanism for altering the macro would
 be, IMHO, very straightforward.

 Straightforward but IMHO rather suboptimal. Sure it might be good
 enough if all you want is to ship products out the door, but for
 mainline something better should be done.

An alternative could be to add a processing step before linking,
changing the section name for initcalls you want to defer, based on a
small config file.
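
A rough sketch of such a step, which prints the commands instead of running them. GNU objcopy does have a --rename-section option; everything else is an assumption for illustration: the config file format (one object path per line), the function name, and both section names, which would need checking against the kernel's linker script and the deferred initcall patch. It has not been tried against a real kernel build.

```shell
#!/bin/sh
# Emit (not run) one objcopy command per object file listed in a config file.
# Usage: print_defer_commands CONFIG
# CONFIG holds one object path per line; blank lines and lines starting
# with # are ignored.  Both section names below are assumptions.
FROM_SECTION=".initcall6.init"
TO_SECTION=".deferred_initcall.init"

print_defer_commands() {
    while IFS= read -r obj; do
        case "$obj" in
            ''|'#'*) continue ;;
        esac
        printf 'objcopy --rename-section %s=%s %s\n' \
            "$FROM_SECTION" "$TO_SECTION" "$obj"
    done < "$1"
}
```

Printing first keeps the list reviewable; piping the output to sh would apply it.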

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds


Re: Why is the deferred initcall patch not mainline?

2014-10-22 Thread Rob Landley


On 10/21/14 14:58, Nicolas Pitre wrote:
 On Tue, 21 Oct 2014, Bird, Tim wrote:
 
 I'm going to respond to several comments in this one message (sorry for the 
 likely confusion)

 On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] wrote:

 On Tue, 21 Oct 2014, Grant Likely wrote:

 On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com wrote:
 The answer is pretty easy, I think.  I tried to mainline it once but 
 failed, and didn't really
 try again. If it is being found useful,  we should try to mainline it 
 again,  this time with
 more persistence.  The reason it got rejected before IIRC was that you 
 can accomplish
 a similar thing with modules, with no changes to the kernel. But that 
 doesn't cover
 the case where the loadable modules feature of the kernel is turned off, 
 which is
 common in very small systems.

 It is a rather clumsy approach though since it requires changes to
 modules and it makes the configuration static per build. Could it
 instead be done by the kernel accepting a list of initcalls that
 should be deferred? It would depend I suppose on the cost of finding
 the initcalls to defer at boot time.

 Yeah, I'm not a big fan of having to change kernel code in order to
 use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
 to add a 'D' option to the config system, so that the record of which
 modules to defer could be stored there.  This is much better than
 hand-altering code.  I don't know how difficult this would be to add
 to the kbuild system, but the mechanism for altering the macro would
 be, IMHO, very straightforward.
 
 Straightforward but IMHO rather suboptimal. Sure it might be good 
 enough if all you want is to ship products out the door, but for 
 mainline something better should be done.
 
 This patch predated Arjan van de Ven's fastboot work.  I don't
 know if some of his parallelization (asynchronous module loading) and
 optimizations for USB loading made things substantially better than this.
 The USB spec makes it impossible to avoid a certain amount of delay
 in probing the USB buses.

 USB was the main culprit, but we sometimes deferred other modules, if they
 were not in the fast path for taking a picture. Sony cameras had a goal of
 booting in 0.5 seconds, but I think the best we ever achieved was about 1.1
 seconds, using deferred initcalls and a variety of other techniques.
 
 Some initcalls can be executed in parallel, but they currently all have 
 to complete before user space is started.  It should be possible to 
 still do the parallel initcall thing, and let user space run before they 
 are done as well.  Only waiting for the root fs to be available should 
 be sufficient.  That would be completely generic, and help embedded as 
 well as desktop systems.

What would actually be nice is if initramfs could read something out of
/proc or /sys to check the status of initcalls. (Or maybe get
notification through the hotplug netlink mechanism.)

Since initramfs is _already_ up really early, before needing any
particular drivers and way before the real root filesystem, we can
trivially punt this sort of synchronization to userspace if userspace
can just get the information about when kernel deferred processing is done.

Possibly we already have this: /sys/module has directories for all the
kernel modules including the ones built static, so possibly userspace
can just wait for /sys/module/zlib_deflate/initstate to say live. It
would just be nice to have a way to notice that's happened without
spinning reading a file.
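
For comparison, the spinning version would be something like this sketch. It assumes the initstate file exists for the code in question, which is not guaranteed for built-in drivers, and the function name and timeout argument are made up for illustration.

```shell
#!/bin/sh
# Poll an initstate-style file until it reads "live" or a timeout expires.
# Usage: wait_for_live FILE [TIMEOUT_SECONDS]
# Returns 0 once the file says live, 1 if the timeout runs out first.
wait_for_live() {
    file="$1"
    remaining="${2:-10}"
    while :; do
        # A missing or unreadable file is treated as "not live yet".
        state=$(cat "$file" 2>/dev/null)
        [ "$state" = live ] && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

A notification through the hotplug mechanism would replace the sleep loop entirely.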

Rob


Re: Why is the deferred initcall patch not mainline?

2014-10-22 Thread Nicolas Pitre
On Wed, 22 Oct 2014, Rob Landley wrote:

 
 
 On 10/21/14 14:58, Nicolas Pitre wrote:
  On Tue, 21 Oct 2014, Bird, Tim wrote:
  
  I'm going to respond to several comments in this one message (sorry for 
  the likely confusion)
 
  On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] 
  wrote:
 
  On Tue, 21 Oct 2014, Grant Likely wrote:
 
  On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com 
  wrote:
  The answer is pretty easy, I think.  I tried to mainline it once but 
  failed, and didn't really
  try again. If it is being found useful,  we should try to mainline it 
  again,  this time with
  more persistence.  The reason it got rejected before IIRC was that you 
  can accomplish
  a similar thing with modules, with no changes to the kernel. But that 
  doesn't cover
  the case where the loadable modules feature of the kernel is turned 
  off, which is
  common in very small systems.
 
  It is a rather clumsy approach though since it requires changes to
  modules and it makes the configuration static per build. Could it
  instead be done by the kernel accepting a list of initcalls that
  should be deferred? It would depend I suppose on the cost of finding
  the initcalls to defer at boot time.
 
  Yeah, I'm not a big fan of having to change kernel code in order to
  use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
  to add a 'D' option to the config system, so that the record of which
  modules to defer could be stored there.  This is much better than
  hand-altering code.  I don't know how difficult this would be to add
  to the kbuild system, but the mechanism for altering the macro would
  be, IMHO, very straightforward.
  
  Straightforward but IMHO rather suboptimal. Sure it might be good 
  enough if all you want is to ship products out the door, but for 
  mainline something better should be done.
  
  This patch predated Arjan van de Ven's fastboot work.  I don't
  know if some of his parallelization (asynchronous module loading) and
  optimizations for USB loading made things substantially better than this.
  The USB spec makes it impossible to avoid a certain amount of delay
  in probing the USB buses.
 
  USB was the main culprit, but we sometimes deferred other modules, if they
  were not in the fast path for taking a picture. Sony cameras had a goal of
  booting in 0.5 seconds, but I think the best we ever achieved was about 1.1
  seconds, using deferred initcalls and a variety of other techniques.
  
  Some initcalls can be executed in parallel, but they currently all have 
  to complete before user space is started.  It should be possible to 
  still do the parallel initcall thing, and let user space run before they 
  are done as well.  Only waiting for the root fs to be available should 
  be sufficient.  That would be completely generic, and help embedded as 
  well as desktop systems.
 
 What would actually be nice is if initramfs could read something out of
 /proc or /sys to check the status of initcalls. (Or maybe get
 notification through the hotplug netlink mechanism.)
 
 Since initramfs is _already_ up really early, before needing any
 particular drivers and way before the real root filesystem, we can
 trivially punt this sort of synchronization to userspace if userspace
 can just get the information about when kernel deferred processing is done.
 
 Possibly we already have this: /sys/module has directories for all the
 kernel modules including the ones built static, so possibly userspace
 can just wait for /sys/module/zlib_deflate/initstate to say live. It
 would just be nice to have a way to notice that's happened without
 spinning reading a file.

Again, not generic enough. Instead, the reading of that file could be 
suspended by the kernel until all initcalls have completed and then 
return an appropriate error code if the corresponding resource is 
actually not there.

Otherwise the standard hotplug notification mechanism is already 
available.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Alexandre Belloni
On 19/10/2014 at 08:59:20 +0200, Dirk Behme wrote :
 Btw.: Does anybody have the correct mail address of Chris? Maybe he
 has some opinions on this, too, as his talk is the starting point of
 this discussion ;)
 

I think you can try challi...@gmail.com 

 

-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Re: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Grant Likely
On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com wrote:
 The answer is pretty easy, I think.  I tried to mainline it once but failed, 
 and didn't really try again. If it is being found useful,  we should try to 
 mainline it again,  this time with more persistence.  The reason it got 
 rejected before IIRC was that you can accomplish a similar thing with 
 modules, with no changes to the kernel. But that doesn't cover the case where 
 the loadable modules feature of the kernel is turned off, which is common in 
 very small systems.

It is a rather clumsy approach though since it requires changes to
modules and it makes the configuration static per build. Could it
instead be done by the kernel accepting a list of initcalls that
should be deferred? It would depend I suppose on the cost of finding
the initcalls to defer at boot time.

I missed the session unfortunately, are there some measurements
available that I could look at? Which subsystems are typically the
problem?

g.


   -- Tim

 Sent from my Sony smartphone on T-Mobile's 4G LTE Network


  Dirk Behme wrote 

 Hi,

 During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there
 has been the unanswered question why the deferred initcall patch [2]
 isn't mainline, yet.

 Anybody remembers?

 Best regards

 Dirk


 [1] http://sched.co/1yG5fmY

 [2] http://elinux.org/Deferred_Initcalls


Re: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Grant Likely
On Tue, Oct 21, 2014 at 1:52 PM, Grant Likely grant.lik...@secretlab.ca wrote:
 On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com wrote:
 The answer is pretty easy, I think.  I tried to mainline it once but failed, 
 and didn't really try again. If it is being found useful,  we should try to 
 mainline it again,  this time with more persistence.  The reason it got 
 rejected before IIRC was that you can accomplish a similar thing with 
 modules, with no changes to the kernel. But that doesn't cover the case 
 where the loadable modules feature of the kernel is turned off, which is 
 common in very small systems.

 It is a rather clumsy approach though since it requires changes to
 modules and it makes the configuration static per build. Could it
 instead be done by the kernel accepting a list of initcalls that
 should be deferred? It would depend I suppose on the cost of finding
 the initcalls to defer at boot time.

And yes, I'm aware of the irony in calling this clumsy when I was the
one to introduce deferred probe.

g.


Re: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Nicolas Pitre
On Tue, 21 Oct 2014, Grant Likely wrote:

 On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com wrote:
  The answer is pretty easy, I think.  I tried to mainline it once but 
  failed, and didn't really try again. If it is being found useful,  we 
  should try to mainline it again,  this time with more persistence.  The 
  reason it got rejected before IIRC was that you can accomplish a similar 
  thing with modules, with no changes to the kernel. But that doesn't cover 
  the case where the loadable modules feature of the kernel is turned off, 
  which is common in very small systems.
 
 It is a rather clumsy approach though since it requires changes to
 modules and it makes the configuration static per build. Could it
 instead be done by the kernel accepting a list of initcalls that
 should be deferred? It would depend I suppose on the cost of finding
 the initcalls to defer at boot time.
 
 I missed the session unfortunately, are there some measurements
 available that I could look at? Which subsystems are typically the
 problem?

I, too, would like to know more about the problem.  Any pointers?


Nicolas


RE: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Bird, Tim
I'm going to respond to several comments in this one message (sorry for the 
likely confusion)

On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] wrote:

 On Tue, 21 Oct 2014, Grant Likely wrote:

  On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com wrote:
 The answer is pretty easy, I think.  I tried to mainline it once but 
 failed, and didn't really
 try again. If it is being found useful,  we should try to mainline it 
 again,  this time with
 more persistence.  The reason it got rejected before IIRC was that you can 
 accomplish
 a similar thing with modules, with no changes to the kernel. But that 
 doesn't cover
 the case where the loadable modules feature of the kernel is turned off, 
 which is
 common in very small systems.
 
  It is a rather clumsy approach though since it requires changes to
  modules and it makes the configuration static per build. Could it
  instead be done by the kernel accepting a list of initcalls that
  should be deferred? It would depend I suppose on the cost of finding
  the initcalls to defer at boot time.

Yeah, I'm not a big fan of having to change kernel code in order to
use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
to add a 'D' option to the config system, so that the record of which
modules to defer could be stored there.  This is much better than
hand-altering code.  I don't know how difficult this would be to add
to the kbuild system, but the mechanism for altering the macro would
be, IMHO, very straightforward.

I should say that it's been quite some time since I worked on this,
so some of my recollections may be fuzzy.

With regards to doing it dynamically, I'd have to think about how
to do that.  Having text-based lists of things to do at runtime seems
to fit with how we're using device tree these days, but I'm not sure
how that would work.

The code as it stands now is quite simple, just creating a new linker section
to hold the list of deferred function pointers, re-using all existing
routines for processing such lists, doing a few code changes to handle 
actually deferring the initialization and memory free-ing, and finally
creating a /proc entry to trigger the whole thing. 
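
From user space, firing that trigger could be as small as the sketch below. The default path /proc/deferred_initcalls is an assumption based on the out-of-tree patch and should be checked against the wiki page; the function name is made up for illustration.

```shell
#!/bin/sh
# Kick the deferred initcalls by reading the trigger file once.
# The default path is an assumption based on the out-of-tree patch;
# pass a different path as the first argument if your kernel differs.
trigger_deferred_initcalls() {
    trigger="${1:-/proc/deferred_initcalls}"
    if [ ! -r "$trigger" ]; then
        echo "no deferred initcall trigger at $trigger" >&2
        return 1
    fi
    # Reading the file is the trigger; the contents are discarded here.
    cat "$trigger" > /dev/null
}
```

An init script would call this once the time-critical application is up.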

In a modern kernel, the /proc trigger should definitely be moved to
/sys.  Other than this, though, if you move to some other system of
processing the list, you will have to create new infrastructure for
working through the deferred module list, or make a change in the
way the items are handled in the generic init function pointer processing.
A simple solution would be to just compare each item from each ...initcall.init
section with a list of deferred functions, and not process them, until doing
the deferred init.

Note that the current technique uses the compiler and linker to do some of
the work for list aggregation and processing, so that would have to be replaced
with something else if you do it differently.

 
  I missed the session unfortunately, are there some measurements
  available that I could look at? Which subsystems are typically the
  problem?

 I, too, would like to know more about the problem.  Any pointers?

Here is the elinux wiki page with some historical measurements:
http://elinux.org/Deferred_Initcalls

The example on the wiki page defers 2 USB modules, and it
saved 530 milliseconds on an x86 system.

This is consistent with what we saw on cameras at Sony.
This patch predated Arjan van de Ven's fastboot work.  I don't
know if some of his parallelization (asynchronous module loading) and
optimizations for USB loading made things substantially better than this.
The USB spec makes it impossible to avoid a certain amount of delay
in probing the USB buses.

USB was the main culprit, but we sometimes deferred other modules, if they
were not in the fast path for taking a picture. Sony cameras had a goal of
booting in 0.5 seconds, but I think the best we ever achieved was about 1.1
seconds, using deferred initcalls and a variety of other techniques.

 -- Tim


RE: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Nicolas Pitre
On Tue, 21 Oct 2014, Bird, Tim wrote:

 I'm going to respond to several comments in this one message (sorry for the 
 likely confusion)
 
 On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] wrote:
 
  On Tue, 21 Oct 2014, Grant Likely wrote:
 
   On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com 
   wrote:
  The answer is pretty easy, I think.  I tried to mainline it once but 
  failed, and didn't really
  try again. If it is being found useful,  we should try to mainline it 
  again,  this time with
  more persistence.  The reason it got rejected before IIRC was that you 
  can accomplish
  a similar thing with modules, with no changes to the kernel. But that 
  doesn't cover
  the case where the loadable modules feature of the kernel is turned off, 
  which is
  common in very small systems.
  
   It is a rather clumsy approach though since it requires changes to
   modules and it makes the configuration static per build. Could it
   instead be done by the kernel accepting a list of initcalls that
   should be deferred? It would depend I suppose on the cost of finding
   the initcalls to defer at boot time.
 
 Yeah, I'm not a big fan of having to change kernel code in order to
 use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
 to add a 'D' option to the config system, so that the record of which
 modules to defer could be stored there.  This is much better than
 hand-altering code.  I don't know how difficult this would be to add
 to the kbuild system, but the mechanism for altering the macro would
 be, IMHO, very straightforward.

Straightforward, but IMHO rather suboptimal. Sure, it might be good 
enough if all you want is to ship products out the door, but for 
mainline something better should be done.

 This patch predated Arjan Van de Ven's fastboot work.  I don't
 know if some of his parallelization (asynchronous module loading), and
 optimizations for USB loading made things substantially better than this.
 The USB spec makes it impossible to avoid a certain amount of delay
 in probing the USB busses.
 
 USB was the main culprit, but we sometimes deferred other modules, if they
 were not in the fastpath for taking a picture. Sony cameras had a goal of
 booting in .5 seconds, but I think the best we ever achieved was about 1.1
 seconds, using deferred initcalls and a variety of other techniques.

Some initcalls can be executed in parallel, but they currently all have 
to complete before user space is started.  It should be possible to 
still do the parallel initcall thing, and let user space run before they 
are done as well.  Only waiting for the root fs to be available should 
be sufficient.  That would be completely generic, and help embedded as 
well as desktop systems.


Nicolas


Re: Why is the deferred initcall patch not mainline?

2014-10-21 Thread Dirk Behme

On 21.10.2014 21:37, Bird, Tim wrote:

I'm going to respond to several comments in this one message (sorry for the 
likely confusion)

On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] wrote:


On Tue, 21 Oct 2014, Grant Likely wrote:


On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim tim.b...@sonymobile.com wrote:

The answer is pretty easy, I think.  I tried to mainline it once but failed, 
and didn't really
try again. If it is being found useful,  we should try to mainline it again,  
this time with
more persistence.  The reason it got rejected before IIRC was that you can 
accomplish
a similar thing with modules, with no changes to the kernel. But that doesn't 
cover
the case where the loadable modules feature of the kernel is turned off, which 
is
common in very small systems.


It is a rather clumsy approach though since it requires changes to
modules and it makes the configuration static per build. Could it
instead be done by the kernel accepting a list of initcalls that
should be deferred? It would depend I suppose on the cost of finding
the initcalls to defer at boot time.


Yeah, I'm not a big fan of having to change kernel code in order to
use the feature.  I am quite intrigued by Geert Uytterhoeven's idea
to add a 'D' option to the config system, so that the record of which
modules to defer could be stored there.  This is much better than
hand-altering code.  I don't know how difficult this would be to add
to the kbuild system, but the mechanism for altering the macro would
be, IMHO, very straightforward.

I should say that it's been quite some time since I worked on this,
so some of my recollections may be fuzzy.

With regards to doing it dynamically, I'd have to think about how
to do that.  Having text-based lists of things to do at runtime seems
to fit with how we're using device tree these days, but I'm not sure
how that would work.

The code as it stands now is quite simple, just creating a new linker section
to hold the list of deferred function pointers, re-using all existing
routines for processing such lists, doing a few code changes to handle
actually deferring the initialization and memory free-ing, and finally
creating a /proc entry to trigger the whole thing.
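
As a rough illustration of the linker-section mechanism described above, here
is a minimal user-space sketch. It is not the actual patch. It assumes GCC or
Clang with GNU ld or lld on an ELF target, where the linker provides
__start_<section> and __stop_<section> symbols for sections whose names are
valid C identifiers. The section name and all demo_* names are made up for the
example.

```c
#include <stddef.h>

typedef int (*initcall_t)(void);

/* Store a pointer to fn in a dedicated ELF section, so the linker
 * aggregates all deferred entries into one contiguous array. */
#define deferred_initcall(fn) \
    static initcall_t deferred_initcall_entry_##fn \
    __attribute__((used, section("deferred_initcalls"))) = fn

/* Section bounds, generated by the linker (assumption: GNU ld or lld). */
extern initcall_t __start_deferred_initcalls[];
extern initcall_t __stop_deferred_initcalls[];

/* Hypothetical drivers whose initialization is postponed. */
int usb_ready;
int debug_ready;

int demo_usb_init(void)   { usb_ready = 1;   return 0; }
int demo_debug_init(void) { debug_ready = 1; return 0; }

deferred_initcall(demo_usb_init);
deferred_initcall(demo_debug_init);

/* Walk the section and call every entry, in the way a trigger from
 * user space would. Returns the number of initcalls that were run. */
int run_deferred_initcalls(void)
{
    int count = 0;

    for (initcall_t *fn = __start_deferred_initcalls;
         fn < __stop_deferred_initcalls; fn++) {
        (*fn)();
        count++;
    }
    return count;
}
```

Nothing runs until run_deferred_initcalls() is called, which stands in for the
trigger from user space. Adding another deferred driver only needs one more
deferred_initcall() line, with no central list to maintain.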

In a modern kernel, the /proc trigger should definitely be moved to
/sys.  Other than this, though, if you move to some other system of
processing the list, you will have to create new infrastructure for
working through the deferred module list, or make a change in the
way the items are handled in the generic init function pointer processing.
A simple solution would be to just compare each item from each ...initcall.init
section with a list of deferred functions, and not process them, until doing
the deferred init.
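
A minimal sketch of that "compare against a list" idea, in plain user-space C.
It makes an assumption that does not hold in the kernel as-is: here each
initcall carries a name, whereas the real ...initcall.init sections hold bare
function pointers, so some form of symbol lookup would be needed. All names
below are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef int (*initcall_t)(void);

/* Hypothetical table entry; the name field is an assumption of this sketch. */
struct named_initcall {
    const char *name;
    initcall_t fn;
    int done;
};

int calls_run;
static int count_call(void) { calls_run++; return 0; }

struct named_initcall initcalls[] = {
    { "demo_clk_init", count_call, 0 },
    { "demo_usb_init", count_call, 0 },
    { "demo_fs_init",  count_call, 0 },
};

/* The list of initcalls to defer, e.g. as it might be handed over at boot. */
const char *deferred_names[] = { "demo_usb_init" };

static int is_deferred(const char *name)
{
    size_t n = sizeof(deferred_names) / sizeof(deferred_names[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(name, deferred_names[i]) == 0)
            return 1;
    return 0;
}

/* Run all pending initcalls. Deferred ones are skipped unless
 * include_deferred is non-zero. Returns the number of calls run. */
int do_initcalls(int include_deferred)
{
    size_t n = sizeof(initcalls) / sizeof(initcalls[0]);
    int ran = 0;

    for (size_t i = 0; i < n; i++) {
        struct named_initcall *ic = &initcalls[i];

        if (ic->done)
            continue;
        if (!include_deferred && is_deferred(ic->name))
            continue;
        ic->fn();
        ic->done = 1;
        ran++;
    }
    return ran;
}
```

Calling do_initcalls(0) at boot runs everything except the deferred entries,
and do_initcalls(1) later runs what is left. The cost is one list lookup per
initcall at boot time.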

Note that the current technique uses the compiler and linker to do some of
the work for list aggregation and processing, so that would have to be replaced
with something else if  you do it differently.



I missed the session unfortunately, are there some measurements
available that I could look at? Which subsystems are typically the
problem?


I, too, would like to know more about the problem.  Any pointers?


Here is the elinux wiki page with some historical measurements:
http://elinux.org/Deferred_Initcalls

The example on the wiki page defers 2 USB modules, and it
saved 530 milliseconds on an x86 system.

This is consistent with what we saw on cameras at Sony.
This patch predated Arjan Van de Ven's fastboot work.  I don't
know if some of his parallelization (asynchronous module loading), and
optimizations for USB loading made things substantially better than this.
The USB spec makes it impossible to avoid a certain amount of delay
in probing the USB busses.

USB was the main culprit, but we sometimes deferred other modules, if they
were not in the fastpath for taking a picture. Sony cameras had a goal of
booting in .5 seconds, but I think the best we ever achieved was about 1.1
seconds, using deferred initcalls and a variety of other techniques.



To extend the list of usage examples, e.g.

-late_initcall(clk_debug_init);
+deferred_initcall(clk_debug_init);

I.e. you might want to have some debug features enabled, but you don't 
want to spend the time needed for initializing them in the time critical 
boot phase.


Best regards

Dirk



Re: Why is the deferred initcall patch not mainline?

2014-10-20 Thread Geert Uytterhoeven
On Sat, Oct 18, 2014 at 4:05 PM, Alexandre Belloni
alexandre.bell...@free-electrons.com wrote:
 On 18/10/2014 at 10:11:27 +0200, Bird, Tim wrote :
 The answer is pretty easy, I think.  I tried to mainline it once but failed, 
 and didn't really try again. If it is being found useful,  we should try to 
 mainline it again,  this time with more persistence.  The reason it got 
 rejected before IIRC was that you can accomplish a similar thing with 
 modules, with no changes to the kernel. But that doesn't cover the case 
 where the loadable modules feature of the kernel is turned off, which is 
 common in very small systems.

 There is also the case of subsystems that can't be compiled as modules.
 I didn't even try to push that to the mainline because I believe we
 prefer not having code without any users/calls in the kernel. You would
 still have to patch your kernel to use deferred_module_init().

Using deferred_module_init() instead of module_init() is a configuration thing.
Perhaps we can extend Kconfig to handle this? I.e. there will be a new
CONFIG_FOO=d value, to indicate deferred initialization, and turn module_init()
into deferred_module_init()?

As this should work for both modules and subsystems that can't be compiled
as modules, this would also force us to clean up the two uses of bool.
Currently bool is used for both
  1. Options that can be enabled/disabled,
  2. Modules that can be built-in or disabled.
Only the latter should get a third value (deferred initialization).
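
A minimal sketch of how such a config value could select the macro at build
time, written as plain user-space C. No "=d" value exists in Kconfig today, so
DEMO_CONFIG_FOO_DEFERRED stands in for whatever symbol the build system would
define for CONFIG_FOO=d. All demo_* names are hypothetical.

```c
/* Hypothetical: what the build system would define for CONFIG_FOO=d. */
#define DEMO_CONFIG_FOO_DEFERRED 1

enum demo_init_phase { DEMO_INIT_NORMAL, DEMO_INIT_DEFERRED };

struct demo_init_entry {
    int (*fn)(void);
    enum demo_init_phase phase;
};

#define demo_builtin_init(fn)  { fn, DEMO_INIT_NORMAL }
#define demo_deferred_init(fn) { fn, DEMO_INIT_DEFERRED }

/* The driver always uses demo_module_init(); the configuration alone
 * decides which phase the initcall ends up in. */
#ifdef DEMO_CONFIG_FOO_DEFERRED
#define demo_module_init(fn) demo_deferred_init(fn)
#else
#define demo_module_init(fn) demo_builtin_init(fn)
#endif

/* Driver source, unchanged between the built-in and deferred cases. */
int foo_init(void) { return 0; }

struct demo_init_entry foo_entry = demo_module_init(foo_init);
```

With the define removed, foo_entry lands in the normal phase without touching
the driver code.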

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds


Re: Why is the deferred initcall patch not mainline?

2014-10-19 Thread Dirk Behme

On 18.10.2014 10:11, Bird, Tim wrote:

The answer is pretty easy, I think.  I tried to mainline it once but failed, 
and didn't really try again. If it is being found useful,  we should try to 
mainline it again,  this time with more persistence.  The reason it got 
rejected before IIRC was that you can accomplish a similar thing with modules, 
with no changes to the kernel. But that doesn't cover the case where the 
loadable modules feature of the kernel is turned off, which is common in very 
small systems.


Just some other use cases: You want to avoid the overhead of ELF 
module loading, even if module loading is on. We've seen a lot of 
cases where the overall boot time is a lot faster with the driver built 
into the kernel than with loading it as a module, even though the kernel 
size, and therefore its load time, increases with this.


Another case: you want to have the driver quite early, earlier than user 
space loads the modules, but want the delay/wait time of that 
driver to run _after_ you have mounted the rootfs.


Thanks

Dirk

Btw.: Does anybody have the correct mail address of Chris? Maybe he 
has some opinions on this, too, as his talk is the starting point of 
this discussion ;)




 Dirk Behme wrote 

Hi,

During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there
has been the unanswered question why the deferred initcall patch [2]
isn't mainline, yet.

Anybody remembers?

Best regards

Dirk


[1] http://sched.co/1yG5fmY

[2] http://elinux.org/Deferred_Initcalls


RE: Why is the deferred initcall patch not mainline?

2014-10-18 Thread Bird, Tim
The answer is pretty easy, I think.  I tried to mainline it once but failed, 
and didn't really try again. If it is being found useful,  we should try to 
mainline it again,  this time with more persistence.  The reason it got 
rejected before IIRC was that you can accomplish a similar thing with modules, 
with no changes to the kernel. But that doesn't cover the case where the 
loadable modules feature of the kernel is turned off, which is common in very 
small systems.

  -- Tim



 Dirk Behme wrote 

Hi,

During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there
has been the unanswered question why the deferred initcall patch [2]
isn't mainline, yet.

Anybody remembers?

Best regards

Dirk


[1] http://sched.co/1yG5fmY

[2] http://elinux.org/Deferred_Initcalls


Re: Why is the deferred initcall patch not mainline?

2014-10-18 Thread Alexandre Belloni
Hi,

On 18/10/2014 at 10:11:27 +0200, Bird, Tim wrote :
 The answer is pretty easy, I think.  I tried to mainline it once but failed, 
 and didn't really try again. If it is being found useful,  we should try to 
 mainline it again,  this time with more persistence.  The reason it got 
 rejected before IIRC was that you can accomplish a similar thing with 
 modules, with no changes to the kernel. But that doesn't cover the case where 
 the loadable modules feature of the kernel is turned off, which is common in 
 very small systems.
 

There is also the case of subsystems that can't be compiled as modules.
I didn't even try to push that to the mainline because I believe we
prefer not having code without any users/calls in the kernel. You would
still have to patch your kernel to use deferred_module_init().

It is also quite easy to port. Maybe you can try to push it to mainline,
or if you want, I can try to send an updated patch myself.

  Dirk Behme wrote 
 
 Hi,
 
 During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there
 has been the unanswered question why the deferred initcall patch [2]
 isn't mainline, yet.
 
 Anybody remembers?
 
 Best regards
 
 Dirk
 
 
 [1] http://sched.co/1yG5fmY
 
 [2] http://elinux.org/Deferred_Initcalls

-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Re: Why is the deferred initcall patch not mainline?

2014-10-18 Thread Bird, Tim
 Alexandre Belloni wrote 

 Hi,

 On 18/10/2014 at 10:11:27 +0200, Bird, Tim wrote :
  The answer is pretty easy, I think.  I tried to mainline it once but 
  failed, and didn't really try again. If it is being found useful,  we 
  should try to mainline it again,  this time with more persistence.  The 
  reason it got rejected before IIRC was that you can accomplish a similar 
  thing with modules, with no changes to the kernel. But that doesn't cover 
  the case where the loadable modules feature of the kernel is turned off, 
  which is common in very small systems.
 

 There is also the case of subsystems that can't be compiled as modules.
 I didn't even try to push that to the mainline because I believe we
 prefer not having code without any users/calls in the kernel. You would
 still have to patch your kernel to use deferred_module_init().

 It is also quite easy to port, maybe you can try to push it to mainline
 or if you want I can try to send an updated patch myself.

I won't have time to get to it any time soon. I'm traveling this week and will 
be swamped the next few weeks.  If you send something, I certainly won't object.
  -- Tim


   Dirk Behme wrote 
 
  Hi,
 
  During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there
  has been the unanswered question why the deferred initcall patch [2]
  isn't mainline, yet.
 
  Anybody remembers?
 
  Best regards
 
  Dirk
 
 
  [1] http://sched.co/1yG5fmY
 
  [2] http://elinux.org/Deferred_Initcalls

 --
 Alexandre Belloni, Free Electrons
 Embedded Linux, Kernel and Android engineering
 http://free-electrons.com