Wakelocks and the embedded problem
http://lwn.net/Articles/318611/

The relationship between embedded system developers and the kernel community is known for being rough, at best. Kernel developers complain about low-quality work and a lack of contributions from the embedded side; the embedded developers, when they say anything at all, express frustrations that the kernel development process does not really keep their needs in mind. A current discussion involving developers from the Android project gives some insight into where this disconnect comes from.

Android, of course, is Google's platform for mobile telephones. The initial Android stack was developed behind closed doors; the code only made it out into the world when the first deployments were already in the works. The Android developers have done a lot of kernel work, but very little code has made the journey into the mainline. The code which has been merged all went into the staging tree without a whole lot of initiative from the Android side. Now, though, Android developer Arve Hjønnevåg is making an effort to merge a piece of that project's infrastructure through the normal process. It is not proving to be an easy ride.

The most controversial bit of code is a feature known as "wakelocks." In Android-speak, a "wakelock" is a mechanism which can prevent the system from going into a low-power state. In brief, kernel code can set up a wakelock with something like this:

    #include <linux/wakelock.h>

    wake_lock_init(struct wake_lock *lock, int type, const char *name);

The type value describes what kind of wakelock this is; name gives it a name which can be seen in /proc/wakelocks. There are two possibilities for the type: WAKE_LOCK_SUSPEND prevents the system from suspending, while WAKE_LOCK_IDLE prevents going into a low-power idle state which may increase response times. The API for acquiring and releasing these locks is:

    void wake_lock(struct wake_lock *lock);
    void wake_lock_timeout(struct wake_lock *lock, long timeout);
    void wake_unlock(struct wake_lock *lock);

There is also a user-space interface. Writing a name to /sys/power/wake_lock establishes a lock with that name, which can then be written to /sys/power/wake_unlock to release the lock. The current patch set only allows suspend locks to be taken from user space.

This submission has not been received particularly well. It has, instead, drawn comments like this from Ben Herrenschmidt:

    looks to me like some people hacked up some ad-hoc trick for their own local need without instead trying to figure out how to fit things with the existing infrastructure (or possibly propose changes to the existing infrastructure to fit their needs).

or this one from Pavel Machek:

    Ok, I think that this wakelock stuff is in "can't be used properly" area on Rusty's scale of nasty interfaces.

There's no end of reasons to dislike this interface. Much of it duplicates the existing pm_qos (quality of service) API; it seems that pm_qos does not meet Android's needs, but it also seems that no effort was made to fix the problems. The scheme seems over-engineered when all that is really needed is a "do not suspend" flag - or, at most, a counter. The patches disable the existing /sys/power/state interface, which does not play well with wakelocks. There is no way to recover if a user-space process exits while holding a wakelock. The default behavior for the system is to suspend, even if a process is running; keeping a system awake may involve a chain of wakelocks obtained by various software components. And so on.

The end result is that this code will not make it into the mainline kernel. But it has been shipped on large numbers of G1 phones, with many more yet to go. So users of all those phones will be using out-of-tree code which will not be merged, at least not in anything like its current form. Any applications which depend on the wakelock sysfs interface will break if that interface is brought up to proper standards.

It's a bit of a mess, but it is a very typical mess for the embedded systems community. Embedded developers operate under a set of constraints which makes proper kernel development hard. For example:

One could argue that Google has the time, resources, and in-house kernel development knowledge to avoid all of these problems and do things right. Instead, we have been treated to a fairly classic example of how things can go wrong.

The good news is that Google developers are now engaging with the community and trying to get their code into the mainline. This process could well be long, and require a fair amount of adjustment on the Android side. Even if the idea of wakelocks as a way to prevent the system from suspending is accepted - which is far from certain - the interface will require significant changes. The associated "early suspend" API - essentially a notification mechanism for system state changes - will need to be generalized beyond the specific needs of the G1 phone. It could well be a lot of work.

But if that work gets done, the kernel will be much better placed to handle the power-management needs of handheld devices. That, in turn, can only benefit anybody else working on embedded Linux deployments. And, crucially, it will help the Android developers as they port their code to other devices with differing needs. As the number of Android-based phones grows, the cost of carrying out-of-tree code to support each of them will also grow. It would be far better to generalize that support and get it into the mainline, where it can be maintained and improved by the community.

Most embedded systems vendors, it seems, would be unwilling to do that work; they are too busy trying to put together their next product. So this sort of code tends to languish out of the mainline, and the quality of embedded Linux suffers accordingly. Perhaps this case will be different, though; maybe Google will put the resources into getting its specialized code into shape and merged into the mainline. That effort could help to establish Android as a solid, well-supported platform for mobile use, and that should be good for business. Your editor, ever the optimist, hopes that things will work out this way; it would be a good demonstration of how the embedded community can work better with the kernel community, getting a better kernel in return.

This does not count for all embedded developers
Posted Feb 11, 2009 10:40 UTC (Wed) by w_sang (subscriber, #52415)

I would have liked it if you wrote "a lot of embedded developers" instead of "embedded developers" at times. We at Pengutronix, for example, are working hard on getting our patches upstream and advertising this to our customers, and I know a few others who do, too. "Embedded" is not just the usual suspects and their mobile phones, there are numerous devices solving industrial tasks which want to be supported. Quality is definitely needed here.

In my book, the time constraint problem is the biggest one. Customers do want results whilst the mainline review process needs time, so you often end up working with a customer-version and a mainline-version, porting fixes back and forth. Also, one can see that the hardware developers face the same time constraints (be it processor manufacturers or board designers), which makes producing kernel quality code an even bigger challenge because of sloppy hardware.

I am just now working on a generic SPI-driver for the i.MX-platforms for mainline. Everyone who wants to get an idea what difficulties an embedded kernel developer may face is invited to join me. This one is a prime example.

This does not count for all embedded developers
Posted Feb 11, 2009 13:59 UTC (Wed) by corbet (editor, #1)

You are right, I should not have used quite such a broad brush. There are quite a few embedded developers who make a point of working with the upstream kernel, and the number seems to be growing. My apologies.

Wakelocks and the embedded problem
Posted Feb 11, 2009 11:47 UTC (Wed) by russell (subscriber, #10458)

A little off topic, but instead of a wakelock, I'd like to see a poweroff timer that powers down regardless of what user space is doing. After doing who knows what damage cooking my laptop on several occasions, I no longer trust user space to get it right and power off.

Wakelocks and the embedded problem
Posted Feb 11, 2009 13:01 UTC (Wed) by Kluge (subscriber, #2881)

'Another fundamental rule is "upstream first": code goes into the mainline before being shipped to customers.'

I thought that the kernel hackers disliked adding new features unless they were already in use (and

Wakelocks and the embedded problem
Posted Feb 11, 2009 15:53 UTC (Wed) by knan (subscriber, #3940)

"In use by some other piece of code" is the usual criterion. I.e. a driver using your added shared infrastructure, a userspace program talking to the interface added, etc.

The actual hardware being more than dreams in a simulator also helps, of course.

Wakelocks and the embedded problem
Posted Feb 14, 2009 0:34 UTC (Sat) by giraffedata (subscriber, #1954)

I can remember various LWN articles about some proposed feature where kernel developers argued that it needed to be used out of tree and shipped with distributions for a while to prove its worthiness before joining the kernel.org major league. I believe these were major functions, though.

Wakelocks and the embedded problem
Posted Feb 11, 2009 16:20 UTC (Wed) by michaeljt (subscriber, #39183)

Perhaps unsurprisingly, LWN articles on this subject tend to come to the point of view that embedded developers should adjust to fit the kernel developers' model. Perhaps the kernel developers would be able to move towards the embedded developers to some extent without compromising their own positions though?

Just going from the example above (and I realise that this may already be happening without my knowing), the main problem seems to be the interfaces, not the code. So if the embedded developers were able to discuss the interfaces with the relevant kernel developers on private mailing lists, to get an idea of what was likely to wash and what not, then everyone would be much further on, even without the embedded people releasing their code.

Of course, once they did get to the stage of releasing code, there would still be the long integration process, but a lot of the heat would be taken off by the fact that the interfaces were likely to get through without too much discussion. The embedded people would be able to ship without the integration being complete, but safe in the knowledge that at some point their stuff would run on a generic kernel, with all the resulting benefits, as long as they showed a reasonable amount of good will.

Wakelocks and the embedded problem
Posted Feb 11, 2009 16:38 UTC (Wed) by droundy (subscriber, #4559)

I imagine the problem with this idea is that usually interfaces are trickier than implementations, and it's very hard to know if an interface is "right" without also having a decent implementation. e.g. presumably the problem with pm_qos that made it inadequate for Android's needs wasn't obvious when that code was reviewed (and is still not clear to me).

Wakelocks and the embedded problem
Posted Feb 11, 2009 18:19 UTC (Wed) by michaeljt (subscriber, #39183)

They could still explain, though, why the existing interfaces did not suit them and what they proposed to/were in the process of creating instead. That would at least give some valuable feedback as to how likely the changes are to get in. The embedded people do create implementations. Even that limited feedback as they went along might make everyone's life easier.

Wakelocks and the embedded problem
Posted Feb 11, 2009 21:13 UTC (Wed) by gouyou (subscriber, #30290)

> if the embedded developers were able to discuss the interfaces with the
> relevant kernel developers on private mailing lists

Yeah, like most of them would be interested to have discussion like that under NDA, helping for-profit companies produce better products ...

Wakelocks and the embedded problem
Posted Feb 11, 2009 21:45 UTC (Wed) by michaeljt (subscriber, #39183)

I am supposing of course that they think the embedded people will contribute interesting code in the long run. If they don't think that then this is moot anyway :)

Wakelocks and the embedded problem
Posted Feb 11, 2009 22:00 UTC (Wed) by gouyou (subscriber, #30290)

But even if they contribute interesting code, most kernel developers do not work on Linux only for glory, they get paid to do it. I'm not sure companies like RedHat, Novell, IBM or Oracle would be terribly happy to have their people spend time reviewing embedded APIs.

(For the top employers, you can take a look here: http://lwn.net/Articles/312074/)

Wakelocks and the embedded problem
Posted Feb 12, 2009 0:10 UTC (Thu) by dlang (subscriber, #313)

most kernel developers will respond to private e-mails about new developments.

there isn't a list for this, in part because there are so many kernel developers that such a list would hardly be limited.

the kernel folks have included drivers for hardware that's not shipping yet.

so, the kernel developers have shown that they are willing to work with embedded developers, but they can't be proactive about it because they don't have any way of knowing that they need to contact someone. the embedded developers know they are working on something, and can easily find out who to contact for advice. for the most part they don't choose to do so.

Wakelocks and the embedded problem
Posted Feb 12, 2009 3:03 UTC (Thu) by jamesh (subscriber, #1159)

The private mailing list thing seems like it would be problematic. Are you thinking of a single private mailing list, or one for each embedded developer?

If it is just a single mailing list, then the developer's competitors will likely also be on the list, which they might consider just as bad as a public list.

If it is separate lists, that is a lot of effort for the kernel developers. Also, what should they do if two embedded developers propose interfaces that achieve similar or identical aims? Do they break confidentiality and try to get the two to cooperate, or do they have to pretend that they don't know about the other use case?

Wakelocks and the embedded problem
Posted Feb 12, 2009 9:04 UTC (Thu) by michaeljt (subscriber, #39183)

Actually I was thinking that the embedded developers would not be on the list at all, but CCed when appropriate. And if handled delicately, they might even welcome a limited co-ordination with competitors on kernel interfaces - those are likely not to be the most valuable "IP" which they wish to keep to their breast for all times. If the kernel developers thought that the resulting contributions were likely to be of sufficient value (to themselves or their employers :) ) they could even play intermediaries without actually dropping names. This "if" is of course the hinging point for everything I have posted up until now.

Wakelocks and the embedded problem
Posted Feb 12, 2009 9:53 UTC (Thu) by johill (subscriber, #25196)

You're also assuming that no kernel developers (for lack of more specification) are competition, something which cannot possibly be true. If you think this through, the list might as well be public, and then you might as well use linux-kernel or a more appropriate subsystem list.

Wakelocks and the embedded problem
Posted Feb 12, 2009 12:09 UTC (Thu) by mjg59 (subscriber, #23239)

Google had all of this code in a public git repository long before they shipped anything running it, so absence of discussion before now isn't down to wanting to keep it secret.

Wakelocks and the embedded PM
Posted Feb 12, 2009 18:00 UTC (Thu) by mgross (subscriber, #38112)

As I look more and more closely at the wakelock structure I'm struck by how similar it is to some ideas we tossed around on the CELF PM working group a few years back. Ideas that fizzled a little at that time.

The high level notion of having a "fall-line" to low power states subject to constraints keeping components from "falling" to a lower power state is still quite interesting. FWIW at the time we worked on this concept in CELF things got complex around the dependency and notification networks that needed to be managed to make things work.

Wakelock implements a type of constraint method. I think the API has problems but the general idea of constraint based steepest descent PM still has appeal. To me anyway.

Wakelocks and the embedded PM
Posted Feb 13, 2009 0:53 UTC (Fri) by mjg59 (subscriber, #23239)

I think the real question is over how constraints should be exposed. I'm very much on the side of inferring constraints from the behaviour of userland - if they have a device open then we should assume that they want to use it, so should avoid shutting it down. We're nowhere near providing that level of functionality in the kernel yet, but doing so helps the embedded, desktop and server worlds.

I'm not sold on the idea of providing explicit constraints in most cases. If you're going to provide that constraint explicitly, why not allow the kernel to infer it? The code to say "Nothing needs access to input devices now" is not significantly different in complexity from the code that closes the input device when it doesn't need it. But that's the kind of case that the Android code deals with now.

Stuff like the pm_qos framework deals with a different case, where you're supplying additional functional constraints to the kernel above and beyond those that can be inferred. I think we should be focusing on what those constraints might be rather than thinking about the wakelock and early suspend code from Android.

Wakelocks and the embedded problem
Posted Feb 13, 2009 0:44 UTC (Fri) by jd (subscriber, #26381)

The level of interaction by embedded developers can be roughly modeled by Brownian motion. Sometimes it is there, sometimes it isn't. For example, when working on the FOLK kernel patch set of obscure drivers, I encountered drivers for embedded hardware that would be there one week and vanish the next. (I had a devil of a time trying to find VME or Fieldbus drivers that would sit still. The drivers would appear without warning - the companies rarely advertised them - and then vanish without warning.)

Sometimes, I would get all kinds of odd reactions to questions. The COMEDI developers were dead set against merging their code with the baseline, although I could never get them to give me a reason that made sense. I could never get much of a coherent answer from RTAI, either. I'm sure both groups had excellent reasons, and mean no offense to either, but I would have preferred to know what that reason was.

The Transputer drivers never made it into the mainstream, either, and I only discovered them on a series of barely-recognized FTP sites that didn't appear on most search engines. True, not many people used Transputers by the time the patch came out, but then not many people used the CBM64 when drivers for Commodore peripherals started circulating. There was zero documentation for the Transputer drivers, including any indication of who wrote them, and they'd clearly been abandoned a long time by the time I found them.

One of the reasons I developed FOLK was to stop this kind of nonsense from happening - people would have a better idea of what was out there, whether the developers liked it or not. (I got into a few tangles with GRSecurity over that. I can understand their reasoning of wanting to make sure security code hasn't been tampered with, but they have no control over what someone installing it does and they're now near-death from lack of exposure. They wouldn't depend as much on a single revenue stream if their work was better-known, better-circulated and better-understood. I can understand their position, but I can still resent the fact that Linux will be a poorer place when GRSecurity goes the way of the Dodo.)

I found many, many other embedded projects out there, and expect to find many, many more such projects should I ever go looking again. These projects don't suffer from a lack of releases, a lack of open-sourceness or a lack of highly imaginative solutions. What they lack is an existence within the visible spectrum. What you don't see, you can't use. Sure, there are some "secret" projects out there, but if the published projects were getting some eyeballs, there'd be less need for "secret" APIs (as the problems with the mainstream APIs would be fixed or replacements would already be incorporated).

Sure, if more of these projects got discussed and more got included into the mainstream, it wouldn't fix all the problems in the world, or even in the embedded world. What it would do is reduce the number of opportunities for problems and misunderstandings to develop. Isn't that in the interests of both embedded and non-embedded developers?
