Re: [PATCH 12/14] ARM: OMAP1: Timer32K: Fix timer32K for clockevents and clean it up

2008-03-21 Thread Thomas Gleixner
On Thu, 20 Mar 2008, Kevin Hilman wrote:

> Russell King - ARM Linux <[EMAIL PROTECTED]> writes:
> 
> > On Mon, Mar 17, 2008 at 11:42:36AM +0200, Tony Lindgren wrote:
> >> This patch fixes timer32k for clockevents and syncs it with
> >> linux-omap tree.
> >> 
> >> Signed-off-by: Tony Lindgren <[EMAIL PROTECTED]>
> >> ---
> >>  arch/arm/mach-omap1/timer32k.c |   20 ++--
> >>  1 files changed, 14 insertions(+), 6 deletions(-)
> >> 
> >>...
> >> @@ -126,9 +131,9 @@ static void omap_32k_timer_set_mode(enum 
> >> clock_event_mode mode,
> >>  
> >>switch (mode) {
> >>case CLOCK_EVT_MODE_PERIODIC:
> >> +  case CLOCK_EVT_MODE_ONESHOT:
> >>omap_32k_timer_start(OMAP_32K_TIMER_TICK_PERIOD);
> >>break;
> >> -  case CLOCK_EVT_MODE_ONESHOT:
> >
> > I didn't think an event was supposed to be programmed to fire when one
> > shot mode is selected - from the other implementations, it appears that
> > the timer should be disabled until set_next_event() has been called.
> 
> That is correct.  The 'start' should only be happening for the
> periodic mode.  This following change should be done.

Yup. Oneshot mode is programmed from the clockevents framework.

Thanks,
tglx
 
> Kevin
> 
> 
> diff --git a/arch/arm/mach-omap1/timer32k.c b/arch/arm/mach-omap1/timer32k.c
> index 905e735..0d9b02e 100644
> --- a/arch/arm/mach-omap1/timer32k.c
> +++ b/arch/arm/mach-omap1/timer32k.c
> @@ -131,9 +131,8 @@ static void omap_32k_timer_set_mode(enum clock_event_mode 
> mode,
>  
>   switch (mode) {
>   case CLOCK_EVT_MODE_PERIODIC:
> - case CLOCK_EVT_MODE_ONESHOT:
>   omap_32k_timer_start(OMAP_32K_TIMER_TICK_PERIOD);
> - break;
> + case CLOCK_EVT_MODE_ONESHOT:
>   case CLOCK_EVT_MODE_UNUSED:
>   case CLOCK_EVT_MODE_SHUTDOWN:
>   break;
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/5] ARM: twd: Add context save restore support

2011-01-25 Thread Thomas Gleixner
On Tue, 25 Jan 2011, Russell King - ARM Linux wrote:
> On Mon, Jan 24, 2011 at 11:39:13PM -0800, Colin Cross wrote:
> > On Mon, Jan 24, 2011 at 3:11 AM, Russell King - ARM Linux
> >  wrote:
> > > On Mon, Jan 24, 2011 at 11:06:09AM +, Russell King - ARM Linux wrote:
> > >> On Mon, Jan 24, 2011 at 02:21:17PM +0530, Santosh Shilimkar wrote:
> > >> > In CPU low power state, local timer looses its register context. This
> > >> > patch adds context save restore hooks which can be used by platforms
> > >> > appropriately.
> > >>
> > >> I thought the whole point of CLOCK_EVT_FEAT_C3STOP was that the generic
> > >> timer stuff wouldn't rely on it being kept alive?
> > >>
> > >> Hmm, it looks like we bypass the clockevents code by only setting the
> > >> TWD load value at initialization time, not when we switch to periodic
> > >> mode.  We really ought to rewrite it whenever we switch back to periodic
> > >> mode.
> > >>
> > >> I suspect fixing that means you won't need this save/restore support.
> > >
> > > Untested, but should do what's required.
> > >
> > > diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
> > > index fd91566..60636f4 100644
> > > --- a/arch/arm/kernel/smp_twd.c
> > > +++ b/arch/arm/kernel/smp_twd.c
> > > @@ -36,6 +36,7 @@ static void twd_set_mode(enum clock_event_mode mode,
> > >                /* timer load already set up */
> > >                ctrl = TWD_TIMER_CONTROL_ENABLE | 
> > > TWD_TIMER_CONTROL_IT_ENABLE
> > >                        | TWD_TIMER_CONTROL_PERIODIC;
> > > +               __raw_writel(twd_timer_rate / HZ, twd_base + 
> > > TWD_TIMER_LOAD);
> > >                break;
> > >        case CLOCK_EVT_MODE_ONESHOT:
> > >                /* period set, and timer enabled in 'next_event' hook */
> > > @@ -81,7 +82,7 @@ int twd_timer_ack(void)
> > >
> > >  static void __cpuinit twd_calibrate_rate(void)
> > >  {
> > > -       unsigned long load, count;
> > > +       unsigned long count;
> > >        u64 waitjiffies;
> > >
> > >        /*
> > > @@ -116,10 +117,6 @@ static void __cpuinit twd_calibrate_rate(void)
> > >                printk("%lu.%02luMHz.\n", twd_timer_rate / 100,
> > >                        (twd_timer_rate / 100) % 100);
> > >        }
> > > -
> > > -       load = twd_timer_rate / HZ;
> > > -
> > > -       __raw_writel(load, twd_base + TWD_TIMER_LOAD);
> > >  }
> > >
> > >  /*
> > 
> > This doesn't work for oneshot timers if the TWD_TIMER_CONTROL register
> > gets reset by cpu idle between twd_set_mode and twd_set_next_event.
> > Shadowing ctrl in a percpu variable and rewriting it during every call
> > to twd_set_next_event does work, but that's not much different, and a
> > little less efficient, than just saving and restoring the control and
> > load registers in idle.  It does have the advantage that platforms
> > don't need any extra calls.
> 
> The next question is can we teach the generic time infrastructure about
> this so we don't have to modify every clock event driver for it?  We
> really need to get away from having this kind of knowledge buried down
> in the lowest levels of every driver.

In which way? I mean the generic code issues a call to the set_mode
function when we leave the broadcast mode. So what should the generic
code do more ?

Thanks,

tglx


Re: [PATCH 3/5] ARM: twd: Add context save restore support

2011-01-25 Thread Thomas Gleixner
On Tue, 25 Jan 2011, Russell King - ARM Linux wrote:
> On Tue, Jan 25, 2011 at 02:23:10PM +0100, Thomas Gleixner wrote:
> > In which way? I mean the generic code issues a call to the set_mode
> > function when we leave the broadcast mode. So what should the generic
> > code do more ?
> 
> I can't say because these patches only add the hooks, there's no
> implementation yet which uses the hooks.
> 
> Given the description about _why_ those hooks are necessary, it seems
> that something is required.  Either we start adding custom hacks to
> each clockevent driver as is done with this patch, or we get some
> generic help in place.
> 
> I'm not thrilled by the custom hack approach - and I thought the
> clockevent stuff was created to stop this kind of thing happening.

Yes, and it does the right thing:

 idle enter (where the cpu local tick device stops)

  tick_broadcast_oneshot_control(CLOCK_EVT_NOTIFY_BROADCAST_ENTER)

clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN)
tick_broadcast_set_event(dev->next_event, 1)

idle exit

  tick_broadcast_oneshot_control(CLOCK_EVT_NOTIFY_BROADCAST_EXIT)

clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT)
tick_program_event(dev->next_event, 1)

So the generic code has all the calls in place. If a clock chip
implementation misses to set control registers in the
CLOCK_EVT_MODE_ONESHOT case, then it's not a short coming of the
generic code which needs magic hooks in the arch code.

The same applies for the periodic mode switch, which is handled via
tick_broadcast_on_off().

Am I missing something ?

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/5] ARM: twd: Add context save restore support

2011-01-25 Thread Thomas Gleixner
On Tue, 25 Jan 2011, Russell King - ARM Linux wrote:
> On Tue, Jan 25, 2011 at 03:12:24PM +0100, Thomas Gleixner wrote:
> > On Tue, 25 Jan 2011, Russell King - ARM Linux wrote:
> > > On Tue, Jan 25, 2011 at 02:23:10PM +0100, Thomas Gleixner wrote:
> > > > In which way? I mean the generic code issues a call to the set_mode
> > > > function when we leave the broadcast mode. So what should the generic
> > > > code do more ?
> > > 
> > > I can't say because these patches only add the hooks, there's no
> > > implementation yet which uses the hooks.
> > > 
> > > Given the description about _why_ those hooks are necessary, it seems
> > > that something is required.  Either we start adding custom hacks to
> > > each clockevent driver as is done with this patch, or we get some
> > > generic help in place.
> > > 
> > > I'm not thrilled by the custom hack approach - and I thought the
> > > clockevent stuff was created to stop this kind of thing happening.
> > 
> > Yes, and it does the right thing:
> > 
> >  idle enter (where the cpu local tick device stops)
> > 
> >   
> > tick_broadcast_oneshot_control(CLOCK_EVT_NOTIFY_BROADCAST_ENTER)
> > 
> > clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN)
> > tick_broadcast_set_event(dev->next_event, 1)
> > 
> > idle exit
> > 
> >   
> > tick_broadcast_oneshot_control(CLOCK_EVT_NOTIFY_BROADCAST_EXIT)
> > 
> > clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT)
> > tick_program_event(dev->next_event, 1)
> > 
> > So the generic code has all the calls in place. If a clock chip
> > implementation misses to set control registers in the
> > CLOCK_EVT_MODE_ONESHOT case, then it's not a short coming of the
> > generic code which needs magic hooks in the arch code.
> > 
> > The same applies for the periodic mode switch, which is handled via
> > tick_broadcast_on_off().
> > 
> > Am I missing something ?
> 
> I quote Colin:
> 
> | This doesn't work for oneshot timers if the TWD_TIMER_CONTROL register
> | gets reset by cpu idle between twd_set_mode and twd_set_next_event.

And I fear that he means the twd_set_mode(ONESHOT) call which happens
during boot/cpuonline.

The point is that the above function needs to be called from the cpu
idle code to notify the clock events layer. The correct mechanism is
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER/EXIT). The
existing callers are in:

arch/x86/kernel/process.c
drivers/acpi/acpi_pad.c
drivers/acpi/processor_idle.c
drivers/idle/intel_idle.c

So i'm not surprised, that this wont work on ARM :)

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Timekeeping issue on aggressive suspend/resume

2010-06-09 Thread Thomas Gleixner
On Wed, 9 Jun 2010, Suresh Rajashekara wrote:

> I have an application (running on 2.6.29-omap1) which puts an OMAP1
> system to suspend aggressively. The system wakes up every 4 seconds
> and stays awake for about 35 milliseconds and sleeps again for another
> 4 seconds. This design is to save power on a battery operated device.
> 
> This aggressive suspend resume action seems like creating an issue to
> other applications in the system waiting for some timeout to happen
> (especially an application which is waiting using the mq_timedreceive
> and is supposed to timeout every 30 seconds. It seems to wake up every
> 90 seconds). Seems like the timekeeping is not happening properly in
> side the kernel.
> 
> If the suspend duration is changed from 4 second to 1 second, then
> things work somewhat better. On reducing it to 0.5 second (which was
> our earlier design on 2.6.16-rc3), the problem seems to disappear.
> 
> Is this expected?

Yes, that's caused by the fact that suspend (via sys/power/state )
freezes the kernel internal timers and the user space visible timers
which are based on CLOCK_MONOTONIC or jiffies (like mq_timedreceive on
your .29 kernel). Only CLOCK_REALTIME based timers are kept correct as
we have to align to the wall clock time.

The reason for this is, that otherwise almost all timers are expired
when we resume and we get a thundering herd of apps and kernel
facilities due to firing timeouts.

Another problem is that jiffies can wrap around on 32 bit systems
during a long suspend though I don't think that's a real world problem
as it takes between 49 to 497 days of suspend depending on the HZ
setting. SO for your usecase it would not matter.

I'm more concerned about code getting surprised by firing timers as
the kernel has this behaviour for a long time now.

Though we could change that conditionally - the default would still be
the freeze of jiffies and CLOCK_MONOTONIC for historical compability.

There will be probably some accounting issues. uptime, cpu time of the
suspend task and some others, but that needs to be found out.

Thanks,

tglx



--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Timekeeping issue on aggressive suspend/resume

2010-06-14 Thread Thomas Gleixner
On Mon, 14 Jun 2010, john stultz wrote:

> On Mon, 2010-06-14 at 00:46 -0700, Suresh Rajashekara wrote:
> > On Thu, Jun 10, 2010 at 12:52 PM, john stultz  wrote:
> > > I think Thomas was suggesting that you consider creating a option for
> > > where CLOCK_MONOTONIC included total_sleep_time.
> > >
> > > In that case the *hack* (and this is a hack, we'll need some more
> > > thoughtful discussion before anything like it could make it upstream)
> > > would be in timekeeping_resume() to comment out the lines that update
> > > wall_to_monotonic and total_sleep_time.
> > >
> > > It would be interesting to hear if that hack works for you, and we can
> > > try to come up with a better way to think about how to accommodate both
> > > views of how to account time over suspend.
> > 
> > Thanks.
> > 
> > I tried this fix. It seemed to help, however the accuracy of sleep
> > time for the process was not quite right. A process thread which was
> > supposed to wake up every (X) seconds, seemed to wake up every (X -
> > delta X) seconds.
> 
> Ah, the sleep time is probably too coarse (seconds). We probably need to
> increase the granularity from read_persistent_clock() and see if that
> helps (although most persistent clocks aren't very fine grained).
> 
> > Also another side effect of this change was that the system time was
> > no longer in sync with the wall time.
> 
> ? This doesn't make much sense to me, as you shouldn't be manipulating
> xtime differently.
> 
> Just to be clear, you mean the value from "date" doesn't match your
> watch after resume? 
> 
> > These problems were more pronounced when the suspend/wakeup cycle time
> > was brought down to 0.5 seconds from 4 seconds. The periodicity of
> > most of the process threads were disturbed.
> > 
> > I decided to NOT suspend/resume the timekeeping subsystem in the
> > kernel and try. It seemed to work. Every application seems to work
> > fine.
> > 
> > Now my question is; Is it safe to disable suspend/resume of the
> > timekeeping subsystem? Will it have an effect (on
> > functionality/performance) which may not surface in my short
> > experiments?
> 
> Well, the difficultly here is what folks actually mean by suspend. On
> some hardware it means everything is powered off, and so on resume we
> have to re-init hardware values.
> 
> It seems in your case that the hardware isn't completely powered off,
> since the clocksource you're using seemed to continue counting while the
> system was suspended. 
> 
> So in this case you might be ok. Your suspend seems closer to an deep
> idle state on x86. So suspending timekeeping might not be necessary.
> 
> However, you're right that there may be lurking issues:
> 
> 1) The suspend time would have to be limited to the clocksource's
> max_idle_ns value, since after that amount of cycles have past, we might
> overflow the accumulation function, or the clocksource may have wrapped.
> 
> 2) If the hardware does reset the clocksource at some point during the
> suspend, you'll have odd time issues.
> 
> 3) You could run into some difficulty keeping close sync with an NTP
> server, as the long delays between accumulation will probably cause an
> oscillating over-shoot and over-correction.
> 
> I suspect these different definitions of "suspend" on all of the
> different hardware types out there is going to be a growing problem in
> the near term. Especially as deep idle states start to power off more
> hardware and becomes closer to suspend in behavior.

I don't think it's really an issue. Such hardware uses a 32.768kHz
driven (RTC alike) clocksource/event which is never powered off and
not affected by suspend/resume unless you run out of battery. That
hardware provides sub second resolution (~30us) contrary to the PC
style RTC which gives you seconds only. That's really good enough for
timekeeping, NOHZ and even HIGHRES.

The NTP sync might become an issue for real long sleep times, but
that's an NTP problem and needs to be addressed seperately.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mfd: twl4030-irq: irq_desc->lock converted to raw_spinlock_t

2009-12-16 Thread Thomas Gleixner
On Wed, 16 Dec 2009, Felipe Balbi wrote:
> commit 239007b8440abff689632f50cdf0f2b9e895b534 converted
> the spinlock_t to raw_spinlock_t. Unfortunately twl4030-irq
> was left aside on the conversion.
> 
> Cc: Thomas Gleixner 
> Cc: Tony Lindgren 
> Cc: linux-omap@vger.kernel.org
> Signed-off-by: Felipe Balbi 
> ---
> 
> I'm not sure this is the expected fix since twl4030-irq handler
> should be running in thread context. Currently mask/unmask/set_type
> are deferred to a workqueue. Thomas, should this be done ? How
> do you expect irq chips on slow busses to implement mask/unmask/set_type ?

That does not matter whether this runs in thread context or not. We
changed the type of the lock from spinlock to raw_spinlock. So we need
to use the corresponding raw_spin_* functions for it independent of
the context. Your patch is completely correct.

Acked-by: Thomas Gleixner 
 
>  drivers/mfd/twl4030-irq.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mfd/twl4030-irq.c b/drivers/mfd/twl4030-irq.c
> index 20d29ba..9df9a5a 100644
> --- a/drivers/mfd/twl4030-irq.c
> +++ b/drivers/mfd/twl4030-irq.c
> @@ -568,12 +568,12 @@ static void twl4030_sih_do_edge(struct work_struct
> *work)
>   bytes[byte] &= ~(0x03 << off);
>  -spin_lock_irq(&d->lock);
> + raw_spin_lock_irq(&d->lock);
>   if (d->status & IRQ_TYPE_EDGE_RISING)
>   bytes[byte] |= BIT(off + 1);
>   if (d->status & IRQ_TYPE_EDGE_FALLING)
>   bytes[byte] |= BIT(off + 0);
> - spin_unlock_irq(&d->lock);
> + raw_spin_unlock_irq(&d->lock);
>   edge_change &= ~BIT(i);
>   }
> -- 
> 1.6.6.rc0
> 
> 
> -- 
> balbi
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-26 Thread Thomas Gleixner
Alan,

On Wed, 26 May 2010, Alan Cox wrote:

> > Really, what are you getting at? Do you deny that there are programs,
> > that prevent a device from sleeping? (Just think of the bouncing
> > cows app)
> > 
> > And if you have two kernels, one with which your device is dead after 1
> > hour and one with which your device is dead after 10 hours. Which would
> > you prefer? I mean really... this is ridiculous. 
> 
> The problem you have is that this is policy. If I have the device wired
> to a big screen and I want cows bouncing on it I'll be most upset if
> instead it suspends. What you are essentially arguing for is for the
> kernel to disobey the userspace. It's as ridiculous (albeit usually less
> damaging) as a file system saying "Ooh thats a rude file name, the app
> can't have meant it, I'll put your document soemwhere else"
> 
> The whole API feels wrong to me. It's breaking rule #1 of technology "You
> cannot solve a social problem with technology". In this case you have a
> social/economic problem which is crap code. You solve it with an
> economics solution - creative incentives not to produce crap code like
> boxes that keep popping up saying "App XYZ is using all your battery" and
> red-amber-green powermeter scores in app stores.

I completely agree. 

We have already proven that the social pressure on crappy applications
works. When NOHZ was merged into the kernel we got no effect at all
because a big percentage of user space applications just used timers
at will and without any thoughts, also it unveiled busy polling and
other horrible coding constructs. So what happened ? Arjan created
powertop which lets the user analyse the worst offenders in his
system. As a result the offending apps got fixed rapidly simply
because no maintainer wanted to be on top of the powertop sh*tlist.

In the mobile app space it's basically the same problem. Users can
influence the app writers simply by voting and setting up public lists
of apps which are crappy or excellent. All it needs is a nice powertop
tool for the phone which allows the users to identify the crap on
their phones. That provides much more incentive - especially for
commercial players - to fix their crappy code.

Adding that sys_try_to_fix_crappy_userspace_code() API to the kernel
is just counter productive as it signals to the app provider: Go
ahead, keep on coding crap!

That's not a solution, that's just capitulation. 

It's absurd that some folks believe that giving up the most efficient
tool to apply pressure to crappy app providers is a good idea.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-26 Thread Thomas Gleixner
Florian,

On Wed, 26 May 2010, Florian Mickler wrote:
>
> On the other hand, applications can say, they don't need that much
> power and userspace guaranties and not take a suspend blocker.
> 
> This is an option which they currently don't have.

Wrong. A well coded power aware application is very well able to
express that in various ways already today. Admittedly it's far from
perfect, but that can be fixed by adding interfaces which allow the
power aware coder to express the requirements of his application
actively, not by avoiding it.

suspend blockers are completely backwards as they basically disable
the kernels ability to do resource management.

Also they enforce a black and white scheme (suspend or run) on the
kernel which is stupid, as there are more options to efficiently save
power than those two. While current android devices might not provide
them, later hardware will have it and any atom based device has them
already.

So what the kernel needs to know to make better decisions are things
like:

  - how much slack can timers have (exisiting interface)
  - how much delay of wakeups is tolerated (missing interface)

and probably some others. That information would allow the kernel to
make better decisions versus power states, grouping timers, race to
idle and other things which influence the power consumption based on
the hardware you are running on.

> I don't think opportunistic suspend is a policy decision by the kernel.
> it is something new. Something which currently only the android
> userspace implements / supports. If you don't want to suspend while
> looking at the bouncing-cow, you have to take a suspend blocker and
> make yourself a user-visible power-eater, or don't do 
> 
> echo "opportunistic" > /sys/power/policy 
> 
> in the first place.
> 
> This "optionally being badly written, who cares?" is a new feature the
> kernel can provide to applications. 

It's a misfeature which the kernel should not provide at all. It sends
out the completely wrong message: Hey, we can deal with your crappy
code, keep on coding that way.

While this seems to sound cool to certain people in the mobile space,
it's completely backwards and will backfire in no time. 

The power efficiency of a mobile device is depending on a sane overall
software stack and not on the ability to mitigate crappy software in
some obscure way which is prone to malfunction and disappoint users.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Wed, 26 May 2010, Alan Cox wrote:

> > The power efficiency of a mobile device is depending on a sane overall
> > software stack and not on the ability to mitigate crappy software in
> > some obscure way which is prone to malfunction and disappoint users.
> 
> Even if you believe the kernel should be containing junk the model that
> works and is used for everything else is resource management. Not giving
> various tasks the ability to override rules, otherwise you end up needing
> suspend blocker blockers next week.

We definitely will need them when we want to optimize the kernel
resource management on a QoS based scheme, which is the only sensible
way to go IMNSHO.

> A model based on the idea that a task can set its desired wakeup
> behaviour *subject to hard limits* (ie soft/hard process wakeup) works
> both for the sane system where its elegantly managing hard RT, and for
> the crud where you sandbox it to stop it making a nasty mess.

Right, the base system can set sensible defaults for "verified" apps,
which will work most of the time except for those which have special
requirements and need a skilled coder anyway. And for the sandbox crud
the sensible default can be "very long time" and allow the kernel to
ignore them at will.

> Do we even need a syscall or will adding RLIMIT_WAKEUP or similar do the
> trick ?

That might be a good starting point.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Florian Mickler wrote:

> On Wed, 26 May 2010 22:03:37 +0200
> Vitaly Wool  wrote:
> 
> > On Wed, May 26, 2010 at 9:56 PM, Florian Mickler  
> > wrote:
> > 
> > > Your approach definitely sounds better than the current solution.
> > > What about mapping suspend blocker functionality later on, when this
> > > interface exists, on to this new approach and deprecating it?
> > 
> > What about coming back after some while with the appropriate solution
> > when it's ready instead of stubbornly pushing crap?
> > 
> > ~Vitaly
> 
> Because quite frankly, for a good part of linux users, suspend blockers
> is already in the kernel. It's just an historical mistake that they are
> not in the linux kernel's hosted on kernel.org. 

No, it's not a historical mistake. It's a technical decision _NOT_ to
merge crap. If we would accept every crappy patch which gets shipped
in large quantities as a defacto part of the kernel we would have a
completely unmaintainable mess since years.

> So why don't we do what we always do? Improve existing interfaces step
> by step? 

Exactly, that's what we are going to do. We improve and extend
existing interfaces step by step, but not by creating a horrible and
unmaintainable mess in the frist place which we can never get rid of
anymore.

> Top Down approaches fail from time to time. Also it is not clear, that
> that proposed interface works for the use cases. This has to be proven
> by providing an implementation. 

Nobody prevents you to sit down and start with a prove of concept
implementation.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Wed, 26 May 2010, Florian Mickler wrote:

> On Wed, 26 May 2010 19:22:16 +0200 (CEST)
> Thomas Gleixner  wrote:
> > On Wed, 26 May 2010, Florian Mickler wrote:
> > >
> > > On the other hand, applications can say, they don't need that much
> > > power and userspace guaranties and not take a suspend blocker.
> > > 
> > > This is an option which they currently don't have.
> > 
> > Wrong. A well coded power aware application is very well able to
> > express that in various ways already today. Admittedly it's far from
> > perfect, but that can be fixed by adding interfaces which allow the
> > power aware coder to express the requirements of his application
> > actively, not by avoiding it.
> 
> Yeah, but a user can't say: "I don't
> know programming, but I had this idea. Here try it out." 
> to his friend. Because his friends phone then will crap out.
> 
> This is a negative. The phone just doesn't work well. 
> 
> A knowledgeable programmer is able to do extra work to enable specific
> guarantees. A dummy-throw-away-programmer doesn't want to do
> any extra work. 

Trying to solve that inside the kernel is the patently wrong
approach. The only way to give Joe Clicker the ability to "code" his
idea is to give him a restricted sandbox environment which takes care
of the extra work and is created by knowledgable programmers with the
Joe Clickers in mind.

Every Joe Clicker can "code" websites and does this happily in his
sandbox which consists of a web server and a web application
framework. There is no single line of code in the kernel to make this
work.

As I said before we need new interfaces and new technologies to let
the kernel do better power management, but definitely not a big hammer
approach which is actively preventing the kernel to do smarter
decisions.

The correct approach is QoS based and everything else is just a
bandaid which is going to hurt us badly.

> > 
> > suspend blockers are completely backwards as they basically disable
> > the kernels ability to do resource management.
> 
> I don't think this is a defect in the approach but the point of it. 

I know that this is the point of the approach, but that does not make
it less wrong. Me, Alan and others explained already in great length
why it is the wrong approach, but you refuse to listen.

You remind me of my 14 year old son, who argues in circles to convince
me that I must allow him something which is wrong. And if he would
think about it w/o his primary goal of getting permission in mind he
would concede that it's wrong.

> > 
> > Also they enforce a black and white scheme (suspend or run) on the
> > kernel which is stupid, as there are more options to efficiently save
> > power than those two. While current android devices might not provide
> > them, later hardware will have it and any atom based device has them
> > already.
> 
> This is true. Nonetheless, in my opinion, implementing the "backwards
> approach" in any form (suspend blockers or some other sort of "sane"
> interface) is necessary in the long run.  I also believe that Alan's
> approach is the more flexible one. But I'm not in a position to judge
> on this.
> 
> If it really is better and superior, I think userland will switch
> over to it, as soon as it is possible to do it. The impact to the
> drivers code is needed anyway. What looses the kernel by implementing
> suspend blockers, and later a more finegrained approach and mapping the
> userspace part of suspend blockers on that finegrained approach later
> on?

The kernel loses the ability to remove suspend blockers once the
interface is exposed to user space. That's the whole point. We would
have to carry it on for a long time and trying to work around it when
implementing a technical correct approach.

And we have never seen crap move to a better interface. It will stay
there forever and hell will break lose when we break it.

> > So what the kernel needs to know to make better decisions are things
> > like:
> > 
> >   - how much slack can timers have (exisiting interface)
> 
> I liked this idea of Arjan, to give some applications infinite timer
> slack. But I don't know if it can made work in a "dummy proof" way.
> (I.e. it is too complicated to get it right in general, except for the
> "infinity" or not kind of tuning)

A mobile device can implement sensible defaults for the careless
applications and let the "good" apps override them.
 
> >   - how much delay of wakeups is tolerated (missing interface)
> 
> Doesn't solve the segregation problem and is probably overkill for most

It solves the segre

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Wed, 26 May 2010, Arve Hjønnevåg wrote:
> 2010/5/26 Alan Cox :
> > On Wed, 26 May 2010 15:30:58 -0700
> > Arve Hjønnevåg  wrote:
> >
> >> On Wed, May 26, 2010 at 6:16 AM, Alan Cox  wrote:
> >> >> Really, what are you getting at? Do you deny that there are programs,
> >> >> that prevent a device from sleeping? (Just think of the bouncing
> >> >> cows app)
> >> >>
> >> >> And if you have two kernels, one with which your device is dead after 1
> >> >> hour and one with which your device is dead after 10 hours. Which would
> >> >> you prefer? I mean really... this is ridiculous.
> >> >
> >> > The problem you have is that this is policy. If I have the device wired
> >> > to a big screen and I want cows bouncing on it I'll be most upset if
> >> > instead it suspends.
> >>
> >> We never suspend when the screen is on. If the screen is off, I would
> >> not be upset if it suspends.
> >
> > This is policy and platform specific. The OLPC can suspend with the
> > screen on. Should I write my app to know about this once for Android and
> > once for OLPC (and no doubt once for Apple). In the OLPC case cows could
> > indeed suspend to RAM between frames if the wakeup time was suitable.
> 
> Are you still talking about Linux suspend? If you can enter S3 from
> idle and still wake up for the next timer or interrupt, then do that.
> Suspend blockers should have not effect on this.

It does not matter whether it's S3 or any other power saving
state / mechanism in a particular device.

What matters is that the kernel needs to know about the QoS
requirements of the applications which are active to make optimal
decisions for power management. 

If we can go into any given power state from idle then the decision
which power state to select needs to be made on the requirements of an
application to react on the next event (timer, interrupt, ...).

So yes, we could go into S3 (with some effort) and arm the wakeup
source which will bring us back when the requirements of the apps are
known at the point where the decision is made.

> >
> > My app simply should not have to know this sort of crap, that's the whole
> > point of an OS.
> >
> 
> Most apps does not have to know about this with opportunistic suspend
> either. If the user is interacting with the device, we don't suspend.
> If your apps needs to run when the user is not interacting with the
> device, then you can block suspend.

You can express that in QoS requirements as well. If you say "max
wakeup latency 100ms" then the kernel will select a power state which
will meet this requirement. So it can decide whether to go into full
suspend on a given system or select some other better suiting power
saving mechanism. But this allows us to express this in a completely
hardware independent way. If the hardware does not provide a suitable
state, then the box stays on.

> >> > What you are essentially arguing for is for the
> >> > kernel to disobey the userspace.
> >>
> >> No I'm not. User-space asked the kernel to suspend when possible.
> >> Suspend is an existing kernel feature. All opportunistic suspend adds
> >> is the ability to use it safely when there are wakeup event you cannot
> >> afford to ignore.
> >
> > Don't get me wrong - opportunistic suspend isn't the problem. Suspend
> > blockers are - or more precisely - the manner in which they express
> > intent from userspace. Opportunistic suspend is wonderful stuff for all
> > sorts of things and if it persuades people like netbook manufacturers to
> > think harder, and Linux driver authors to optimise their suspend/resume
> > paths we all win.
> >
> >> Our actual stating point is this: Some systems can only enter their
> >> lowest power state from suspend. So we added an api that allowed us to
> >> use suspend without loosing wakeup events. Since suspending also
> >> improves our battery life on platforms that enter the same power state
> >> from idle and suspend and we already have a way to safely use suspend,
> >> we would be crazy not to use it.
> >
> > Opportunistic suspend isn't special. Suspend is just a very deep idle. In
> 
> Suspend as it is currently implemented in Linux is special. Regular
> timers stop, and only specially marked wakeup events can bring the
> system back to the normal state.

That's a matter of the current implementation. We can change that and
improve it to do better resource management. And this requirement is
not restricted to the mobile phone space, it's true for laptops,
virtualization and other areas as well.
 
> > But you can express user suspend blocking in this manner. Your call
> > incoming code would say 'I want good latency'. As someone needs good
> > latency the box won't suspend. If your approach is to start with an
> > initial 'anyone can set a low latency we don't care' then anyone can
> > block suspends.
> >
> > Equally your call handling example is about latency not about suspend.
> > You want the phone to stay on, to fetch a picture and display it promptly.
> >
> 
> I don't th

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:
> On Thu, May 27, 2010 at 12:39:43AM +0100, Alan Cox wrote:
> 
> > - Supporting Android needs well good
> > - Opportunistic suspend good
> > - Manner in which interface is expressed to userspace bad
> > - Latency constraint interface would be better
> > - Your existing behaviour can be implemented by a simplistic use of a
> >   latency constraint interface
> > - We can fix a pile of other directly connected things at the same time
> > - Implementation internals I care far less about because we can fix those
> >   later
> > - Suspend is just a power state
> > 
> > How does that fit your model and vision ?
> 
> I don't entirely see how this works. In order to deal with poorly 
> written applications, it's necessary to (optionally, based on some 
> policy) ignore them when it comes to the scheduler. The problem is how 
> to implement the optional nature of this in a race-free manner. This is 
> obviously a pathological case, but imagine an application that does 
> something along the following lines:
> 
> int input = open ("/dev/input", O_RDONLY|O_NONBLOCK);
> char foo;
> 
> while (1) {
>   suspend_block();
>   if (read(input, &foo, 1) > 0) {
>   (do something)
>   suspend_unblock();
>   } else {
>   suspend_unblock();
>   (draw bouncing cows and clouds and tractor beams briefly)
>   }
> }
> 
> Now, if the user is playing this game, you want it to be scheduled. If 
> the user has put down their phone and the screen lock has kicked in, you 
> don't want it to be scheduled. So we could imagine some sort of cgroup 
> that contains untrusted tasks - when the session is active we set a flag 
> one way which indicates to the scheduler that tasks in TASK_RUNNING 
> should be scheduled, and when the session is idle we set the flag the 
> other way and all processes in that cgroup get shifted to 
> TASK_INTERRUPTIBLE or something.
> 
> Except that doesn't work. If the session goes idle in the middle of the 
> app drawing a frame, we'll stop the process and the task will never call 
> read(). So the user hits a key, we wake up, nothing shifts from 
> TASK_INTERRUPTIBLE into TASK_RUNNING, the key never gets read, we go 
> back to sleep. The event never gets delivered.
> 
> Now let's try this in the Android world. The user hits a key and the 
> system wakes up. The input layer takes a suspend block. The application 
> now draws all the cows it wants to, takes its own suspend block and 
> reads the input device. This empties the queue and the kernel-level 
> suspend block is released. The application then processes the event 
> before releasing the suspend block. The event has been delivered and 
> handled.

Thanks for providing this example:

  1) It proves that suspend blockers are solely designed to encourage
 people to code crap.

  2) It exposes the main conceptual problem of the approach:

 The input layer in the kernel magically takes a suspend blocker
 and releases it in an equally magic way just to allow the crappy
 application to reach the point where it takes it's own suspend
 blocker and can react on the user input.
 
 And you need to do that, because the user applications suspend
 blocker magic is racy as hell. To work around that you sprinkle
 your suspend blocker magic all over the kernel instead of telling
 people how to solve the problem correctly.

 And what are you achieving with this versus power saving ?

 Exaclty _NOTHING_ ! 

 Simply because you move the cow drawing CPU time from the point
 where the device wants to go into suspend to the point where the
 user hits a key again. You even delay the reaction of your app to
 the user input by the time it needs to finish drawing cows.
 
 So you need to come up with a way better example why you need
 your blockers than with this prove of misconception.

> You can't express that with resource limits or QoS constraints. If you 
> want to deal with this kind of situation then, as far as I can tell, you 
> need either suspend blockers or something so close to them that it makes 
> no difference.

Wrong. If your application is interactive then you set the QoS
requirement once to interactive and be done.

So the correct point to make a power state decision is when the app
waits for a key press. At this point the kernel can take several
pathes:

  1) Keep the system alive because the input device is in active
 state and a key press is expected

  2) Go into supsend because the input device is deactivated after
 the screen lock kicked in.

This behaves exactly the same way in terms of power consumption as
your blocker example just without all the mess you are trying to
create.

And it allows the kernel to use intermediate power saving methods
(between idle and suspend) which might be available on some hardware.

Thanks,

tglx
--
To unsubscribe from th

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Stern wrote:
> On Thu, 27 May 2010, Peter Zijlstra wrote:
> 
> > On Thu, 2010-05-27 at 16:33 +0100, Alan Cox wrote:
> > > On Thu, 27 May 2010 17:09:16 +0200
> > > Peter Zijlstra  wrote:
> > > 
> > > > On Thu, 2010-05-27 at 11:06 -0400, Alan Stern wrote:
> > > > > 
> > > > > Opportunistic suspends are okay.
> > > > > 
> > > > > The proposed userspace API is too Android-specific.
> > > > 
> > > > I would argue opportunistic suspends are not ok, and therefore the
> > > > proposed API is utterly irrelevant.
> > > 
> > > Assuming you are happy that opportunistically entering C6 and the like is
> > > ok via ACPI can you explain why you have a problem with opportunistic
> > > suspend and why it is different to a very deep sleep CPU state such as we
> > > have now (and which on many embedded platforms we deal with *is* sleeping
> > > external devices too)
> > 
> > Agreed, but I understood the opportunistic suspend line from Alan Stern
> > to mean the echo opportunistic > /sys/power/foo thing.
> 
> Yes, that's what I meant.  Why do you think it is any worse than "echo 
> mem >/sys/power/state"?  The only difference is that it will block 
> until all in-kernel suspend blockers are disabled.
> 
> Or do you also think that "echo mem >/sys/power/state" is evil and 
> should be removed from the kernel as soon as possible?
> 
> And to answer Thomas's question: The whole reason for having in-kernel 
> suspend blockers is so that userspace can tell the system to suspend 
> without losing wakeup events.
> 
> Suppose a key is pressed just as a user program writes "mem" to
> /sys/power/state.  The keyboard driver handles the keystroke and queues
> an input event.  Then the system suspends and doesn't wake up until
> some other event occurs -- even though the keypress was an important
> one and it should have prevented the system from suspending.
> 
> With in-kernel suspend blockers and opportunistic suspend, this 
> scenario is prevented.  That is their raison-d'etre.

I tend to disagree. You are still looking at suspend as some extra
esoteric mechanism. We should stop doing this and look at suspend (to
RAM) as an additional deep idle state, which is well defined in terms
of power savings and wakeup latency. That's what I think opportunistic
suspend is all about. Now if you look at our current idle states in
x86 the incoming keystroke is never lost. 

So when suspend does lose the wakeup event then we need to fix that,
but why do we need suspend blockers for this ? Let's take your
example:

> The keyboard driver handles the keystroke and queues an input
> event. Then the system goes into suspend 

Why do we go into suspend at all? If there is an event queued then
something is woken up and got running, which is reason enough _not_ to
enter suspend. If that's broken in the current implementation then we
need to fix it, but not with a big hammer by adding another
interface. We have that information already and obviously we do not
use it, so lets figure out why before adding suspend blockers just
because they paper over the underlying problem.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:
> On Thu, May 27, 2010 at 05:32:56PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > Now let's try this in the Android world. The user hits a key and the 
> > > system wakes up. The input layer takes a suspend block. The application 
> > > now draws all the cows it wants to, takes its own suspend block and 
> > > reads the input device. This empties the queue and the kernel-level 
> > > suspend block is released. The application then processes the event 
> > > before releasing the suspend block. The event has been delivered and 
> > > handled.
> > 
> > Thanks for providing this example:
> > 
> >   1) It proves that suspend blockers are solely designed to encourage
> >  people to code crap.
> 
> No. Suspend blockers are designed to ensure that suspend isn't racy with 
> respect to wakeup events. The bit that mitigates badly written 
> applications is the bit where the scheduler doesn't run any more.
> 
> If you're happy with a single badly written application being able to 
> cripple your power management story, you don't need opportunistic 
> suspend. But you still have complications when it comes to deciding to 
> enter suspend at the same time as you receive a wakeup event.

Wrong. Setting the QoS requirements of the badly written app to any
latency will allow the kernel to suspend even if the crappy app is
active.

And again. I'm opposing the general chant that fixing crappy
applications in the kernel is a good thing. It's the worst decision we
could make.
 
> >  And you need to do that, because the user applications suspend
> >  blocker magic is racy as hell. To work around that you sprinkle
> >  your suspend blocker magic all over the kernel instead of telling
> >  people how to solve the problem correctly.
> 
> What /is/ the correct way to solve this problem when entering explicit 
> suspend states? How do you guarantee that an event has been delivered to 
> userspace before transitioning into suspend? Now, this is a less 
> interesting problem if you're not using opportunistic suspend, but it's 
> still a problem.

Holy crap. If an event happens _before_ we go into an idle state - and
I see suspend as an deeper idle state - then we do not go there at all.

The whole notion of treating suspend to RAM any different than a plain
idle C-State is wrong. It's not different at all. You just use a
different mechanism which has longer takedown and wakeup latencies and
requires to shut down stuff and setup extra wakeup sources.

And there is the whole problem. Switching from normal event delivery
to those special wakeup sources. That needs to be engineered in any
case carefuly and it does not matter whether you add suspend blockers
or not.
 
> >  Simply because you move the cow drawing CPU time from the point
> >  where the device wants to go into suspend to the point where the
> >  user hits a key again. You even delay the reaction of your app to
> >  the user input by the time it needs to finish drawing cows.
> 
> It's how application mainloops tend to work.

So what's the f*cking point ? You draw exactly the same amount of
power and still you are claiming that it's better or what ?
 
> > > You can't express that with resource limits or QoS constraints. If you 
> > > want to deal with this kind of situation then, as far as I can tell, you 
> > > need either suspend blockers or something so close to them that it makes 
> > > no difference.
> > 
> > Wrong. If your application is interactive then you set the QoS
> > requirement once to interactive and be done.
> >
> > So the correct point to make a power state decision is when the app
> > waits for a key press. At this point the kernel can take several
> > pathes:
> > 
> >   1) Keep the system alive because the input device is in active
> >  state and a key press is expected
> > 
> >   2) Go into supsend because the input device is deactivated after
> >  the screen lock kicked in.
> 
> That's no good. If the input device has been deactivated, how does the 
> wakeup event get delivered to the application?
>  
> > This behaves exactly the same way in terms of power consumption as
> > your blocker example just without all the mess you are trying to
> > create.
> 
> And means that wakeup events don't get delivered. That's a shortcoming.

That's utter nonsense. If we have a problem with missed wakeups then
it needs to be fixed and not papered over with suspend blocker magic.

I'm 

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Cox wrote:
> That's all your need to do it right.
> 
> In kernel yes your device driver probably does need to say things like
> 'Don't go below C6 for a moment' just as a high speed serial port might
> want to say 'Nothing over 10mS please'
> 
> I can't speak for Thomas, but I'm certainly not arguing that you don't
> need something that looks more like the blocker side of the logic *in
> kernel*, because there is stuff that you want to express which isn't tied
> to the task.

I'm not opposed, but yes it needs to be expressed in quantifiable
terms, i.e. wakeup latency. That's just contributing to the global QoS
state of affairs even if it is not tied to a particular task. 

And that allows the driver to be intelligent about it. The serial port
at 9600 has definitely different requirements than at 115200.

But that's quite a different concept than the big hammer approach of
the blockers.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 05:16:15PM +0100, Alan Cox wrote:
> 
> > I can't speak for Thomas, but I'm certainly not arguing that you don't
> > need something that looks more like the blocker side of the logic *in
> > kernel*, because there is stuff that you want to express which isn't tied
> > to the task.
> 
> Sure, if you're not using opportunistic suspend then I don't think 
> there's any real need for the userspace side of this. The question is 
> how to implement something with the useful properties of opportunistic 
> suspend without without implementing something pretty much equivalent to 
> the userspace suspend blockers. I've sent another mail expressing why I 
> don't think your proposed QoS style behaviour provides that.

Opportunistic suspend is just a deep idle state, nothing else. If the
overall QoS requirements allow to enter that deep idle state then the
kernel goes there. Same decision as for all other idle states. You
don't need any user space blocker for this decision, just sensible QoS
information.

Stop thinking about suspend as a special mechanism. It's not - except
for s2disk, which is an entirely different beast.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 06:45:25PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > No. Suspend blockers are designed to ensure that suspend isn't racy with 
> > > respect to wakeup events. The bit that mitigates badly written 
> > > applications is the bit where the scheduler doesn't run any more.
> > > 
> > > If you're happy with a single badly written application being able to 
> > > cripple your power management story, you don't need opportunistic 
> > > suspend. But you still have complications when it comes to deciding to 
> > > enter suspend at the same time as you receive a wakeup event.
> > 
> > Wrong. Setting the QoS requirements of the badly written app to any
> > latency will allow the kernel to suspend even if the crappy app is
> > active.
> 
> That's not what I want if I'm using the app at the time...

Your crappy app does not use suspend blockers either.
 
> > And again. I'm opposing the general chant that fixing crappy
> > applications in the kernel is a good thing. It's the worst decision we
> > could make.
> 
> You still need the in-kernel suspend blockers if you want to guarantee 
> that you can't lose wakeup events. But yes, if you're not concerned 
> handling badly behaved applications then I believe that you can lose 
> opportunistic suspend and just use the scheduler.

No, we do not. We need correctly implemented drivers and a safe
switchover from normal event delivery to wakeup based.
 
> > The whole notion of treating suspend to RAM any different than a plain
> > idle C-State is wrong. It's not different at all. You just use a
> > different mechanism which has longer takedown and wakeup latencies and
> > requires to shut down stuff and setup extra wakeup sources.
> 
> My question was about explicit suspend states, not implicitly handling 
> an identical state based on scheduler constraints. Suspend-as-a-C-state 
> isn't usable on x86 - you have to explicitly trigger it based on some 

And why not ? Just because suspend is not implemented as an ACPI
C-state ? 

Nonsense, if we want to push the system into suspend from the idle
state we can do that. It's just not implemented and we've never tried
to do it as it requires a non trivial amount of work, but I have done
it on an ARM two years ago as a prove of concept and it works like a
charm.

> policy. And if you want to be able to do that without risking the loss 
> of wakeup events then you need in-kernel suspend blockers.

Crap. Stop beating on those lost wakeup events. If we lose them then
the drivers are broken and do not handle the switch over correctly. Or
the suspend mechanism is broken as it does not evaluate the system
state correctly. Blockers are just papering over that w/o tackling the
real problem.

> > So if we really sit back and look at suspend as another idle state,
> > then we have first off the same requirements for entering it as we
> > have for any other idle state:
> 
> There are various platforms where we cannot treat suspend as an idle 
> state. Any solution that requires that doesn't actually solve the 
> problem. Yes, this is *trivial* if you can depend on the scheduler. But 
> we can't, and so it's difficult.

Stop handwaving. Which platforms prevent us to go into suspend from
idle ? Please point me to the relevant documentation which says so.

Just because we have not tried to implemented it does not mean that we
cannot implement it.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Stern wrote:

> On Thu, 27 May 2010, Thomas Gleixner wrote:
> 
> > The whole notion of treating suspend to RAM any different than a plain
> > idle C-State is wrong. It's not different at all. You just use a
> > different mechanism which has longer takedown and wakeup latencies and
> > requires to shut down stuff and setup extra wakeup sources.
> 
> This is where you are wrong.  Maybe not wrong in principle, but wrong 
> in practice -- the kernel's present suspend-to-RAM (or just "suspend") 
> implementation is _very_ different from C-states (or just "idle").

Holy bouncing cow. I damned good know that the current implementation
is not doing that and that suspend is implemented in a totally
different way. That does not mean that this is written in stone. We
CAN change that and fix it to gain opportunistic suspend.
 
> The primary difference is that the kernel can be forced into suspend
> even when the system isn't idle.  In particular, it may be in the
> middle of processing a potential wakeup event -- and the current design
> gives the PM core no way to know if that is so.  This is a weakness
> that in-kernel suspend blockers fix.

Oh no. They paper over a short coming. If there is a pending event,
the kernel knows that. It just does not make use of this
information. Blockers just paper over this by sprinkling
do_not_suspend() calls all over the place. What a sensible solution.

> With C-states this can't happen.  If the CPU goes into a deeper C-state 
> then ipso facto it isn't busy processing anything, much less a wakeup 
> event.

And that's the whole point of doing the opportunistic suspend with the
help of the scheduler.

> Now maybe this difference is a bad thing, and the whole PM 
> suspend/hibernate infrastructure should be rewritten.  But the fact, 
> remains, that's how it works now.  And it can't be changed easily or 
> quickly.

So what you are saying is that we better paper over the shortcomings
of our current implementation with do_not_suspend() code sprinkled all
over the place instead of sitting down and making suspend from idle
work. It's more or less trivial depending on the platform, but not
rocket science.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Stern wrote:

> On Thu, 27 May 2010, Felipe Balbi wrote:
> 
> > On Thu, May 27, 2010 at 05:06:23PM +0200, ext Alan Stern wrote:
> > >If people don't mind, here is a greatly simplified summary of the
> > >comments and objections I have seen so far on this thread:
> > >
> > >   The in-kernel suspend blocker implementation is okay, even
> > >   beneficial.
> > 
> > I disagree here. I believe expressing that as QoS is much better. Let 
> > the kernel decide which power state is better as long as I can say I 
> > need 100us IRQ latency or 100ms wakeup latency.
> 
> Does this mean you believe "echo mem >/sys/power/state" is bad and
> should be removed?  Or "echo disk >/sys/power/state"?  They pay no

mem should be replaced by an idle suspend to ram mechanism

> attention to latencies or other requirements.

s2disk is a totally different beast as it shuts down the box into the
complete power off state.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:
> On Thu, May 27, 2010 at 07:15:31PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > My question was about explicit suspend states, not implicitly handling 
> > > an identical state based on scheduler constraints. Suspend-as-a-C-state 
> > > isn't usable on x86 - you have to explicitly trigger it based on some 
> > 
> > And why not ? Just because suspend is not implemented as an ACPI
> > C-state ? 
> > 
> > Nonsense, if we want to push the system into suspend from the idle
> > state we can do that. It's just not implemented and we've never tried
> > to do it as it requires a non trivial amount of work, but I have done
> > it on an ARM two years ago as a prove of concept and it works like a
> > charm.
> 
> ACPI provides no guarantees about what level of hardware functionality 
> remains during S3. You don't have any useful ability to determine which 
> events will generate wakeups. And from a purely practical point of view, 
> since the latency is in the range of seconds, you'll never have a low 
> enough wakeup rate to hit it.

Right, it does not as of today. So we cannot use that on x86
hardware. Fine. That does not prevent us to implement it for
architectures which can do it. And if x86 comes to the point where it
can handle it as well we're going to use it. Where is the problem ? If
x86 cannot guarantee the wakeup sources it's not going to be used for
such devices. The kernel just does not provide the service for it, so
what ?
 
> > > policy. And if you want to be able to do that without risking the loss 
> > > of wakeup events then you need in-kernel suspend blockers.
> > 
> > Crap. Stop beating on those lost wakeup events. If we lose them then
> > the drivers are broken and do not handle the switch over correctly. Or
> > the suspend mechanism is broken as it does not evaluate the system
> > state correctly. Blockers are just papering over that w/o tackling the
> > real problem.
> 
> Ger;kljaserf;kljf;kljer;klj. Suspend blockers are the mechanism for the 
> driver to indicate whether the wakeup event has been handled. That's 
> what they're there for. The in-kernel ones don't paper over anything.

So the driver says: events have been handled. Neat.

Now if that crappy app does not use the user space blockers or is not
allowed to use them, then what are you guaranteeing ? All you
guarantee is that the application has emptied the input queue. Nothing
else. And that's the same as setting the QoS guarantee to NONE.

An application which uses the blocker is just holding off the system
from going into deep idle. Nothing which cannot be done today.

So the only thing you are imposing to app writers is to use an
interface which solves nothing and does not save you any power at
all. 

If the application drops the blocker after processing the event and
before it goes back to the read event then you just postpone the CPU
usage and therefor power consumption to a later point in time instead
of going back to the blocking read right away. 

Again what is it saving ?  NOTHING!  And for nothing you want to mess
up the kernel big time ?

Runnable tasks and QoS guarantees are the indicators whether you can
go to opportunistic suspend or not. Everything else is just window
dressing.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 07:24:02PM +0200, Thomas Gleixner wrote:
> 
> > Oh no. They paper over a short coming. If there is a pending event,
> > the kernel knows that. It just does not make use of this
> > information. Blockers just paper over this by sprinkling
> > do_not_suspend() calls all over the place. What a sensible solution.
> 
> Even if we could use suspend-via-deep-idle-state on PCs, we still need 
> to be able to enter suspend while the system isn't idle. There's two 
> ways to do that:
> 
> 1) Force the system to be idle. Doing this race-free is difficult.
> 
> 2) Enter suspend even though the system isn't idle. Since we can't rely 
> on the scheduler, we need drivers to know whether userspace has consumed 
> all wakeup events before allowing the transition to occur. Doing so 
> requires either in-kernel suspend blockers or something that's almost 
> identical.

You're just not getting it. If user space has consumed the event is
not relevant at all.

What's relevant is whether the app has processed the event and reacted
accordingly. That's all that matters.

Emptying your input queue is just the wrong indicator.

And as I explained several times now: It does _NOT_ matter when the
app goes back in to blocked/idle state. You have to spend the CPU
cycles and power for that anyway.

And for the apps which do not use the user space blockers the queue
empty indicator is just bullshit, because after emptying the queue the
kernel can go into suspend w/o any guarantee that the event has been
processed.

The whole concept sucks, as it does not solve anything. Burning power
now or in 100ms is the same thing power consumption wise.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Stern wrote:

> On Thu, 27 May 2010, Thomas Gleixner wrote:
> 
> > Crap. Stop beating on those lost wakeup events. If we lose them then
> > the drivers are broken and do not handle the switch over correctly. Or
> > the suspend mechanism is broken as it does not evaluate the system
> > state correctly. Blockers are just papering over that w/o tackling the
> > real problem.
> 
> That's the point -- suspend does not evaluate the system state 
> correctly because it doesn't have the necessary information.  Suspend 
> blockers are a way of providing it that information.  They don't paper 
> over the problem; they solve it.

Nonsense. The system state is well defined when a event is pending and
we just have to say good bye to the idea that forced suspend is a good
solution. It's not as it does not guarantee the event processing in
badly written apps and it does move the power consumption to a later
point in time for those apps which acquire/drop the blockers.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:
> On Thu, May 27, 2010 at 07:35:50PM +0200, Peter Zijlstra wrote:
> > On Thu, 2010-05-27 at 18:32 +0100, Matthew Garrett wrote:
> > > On Thu, May 27, 2010 at 07:28:08PM +0200, Peter Zijlstra wrote:
> > > > Why would you care about the screen for a network event?
> > > 
> > > Because the application that needs to handle the network packet is 
> > > currently blocked trying to draw something to the screen.
> > 
> > Then that's an application bug right there, isn't it?
> > 
> > If should have listened to the window server telling its clients it was
> > going to go away. Drawing after you get that is your own damn fault ;-)
> 
> How long do you wait for applications to respond that they've stopped 
> drawing? What if the application is heavily in swap at the time?

Very realistic scenario on a mobile phone. 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 07:34:40PM +0200, Peter Zijlstra wrote:
> > > we still need to be able to enter suspend while the system isn't idle.
> > 
> > _WHY_!?
> 
> Because if I'm running a kernel build in a tmpfs and I hit the sleep 
> key, I need to go to sleep. Blocking processes on driver access isn't 
> sufficient.

That's the difference between opportunistic and enforced suspend. On
enforced suspend the QoS guarantee is forced to NONE for everything
and you go down no matter what. When you decide to push the wakeup
button the system restores.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 06:49:18PM +0100, Alan Cox wrote:
> > > ACPI provides no guarantees about what level of hardware functionality 
> > > remains during S3. You don't have any useful ability to determine which 
> > > events will generate wakeups. And from a purely practical point of view, 
> > > since the latency is in the range of seconds, you'll never have a low 
> > > enough wakeup rate to hit it.
> > 
> > So PCs with current ACPI don't get opportunistic suspend capability. It
> > probably won't be supported on the Commodore Amiga either - your point ?
> 
> Actually, the reverse - there's no terribly good way to make PCs work 
> with scheduler-based suspend, but there's no reason why they wouldn't 
> work with the current opportunistic suspend implementation.

How does that solve the problems you mentioned above ? Wakeup
guarantees, latencies ...

It's not a prove of the technical correctness of the approach if it
can provide a useless functionality.
 
Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 07:59:02PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > ACPI provides no guarantees about what level of hardware functionality 
> > > remains during S3. You don't have any useful ability to determine which 
> > > events will generate wakeups. And from a purely practical point of view, 
> > > since the latency is in the range of seconds, you'll never have a low 
> > > enough wakeup rate to hit it.
> > 
> > Right, it does not as of today. So we cannot use that on x86
> > hardware. Fine. That does not prevent us to implement it for
> > architectures which can do it. And if x86 comes to the point where it
> > can handle it as well we're going to use it. Where is the problem ? If
> > x86 cannot guarantee the wakeup sources it's not going to be used for
> > such devices. The kernel just does not provide the service for it, so
> > what ?
> 
> We were talking about PCs. Suspend-as-c-state is already implemented for 
> OMAP.
 
Ah, now we talk about PCs. And all of a sudden the problem of the
unability of determining wakeup sources is not longer relevant ? So
how do you guarantee that we don't miss one if we cant figure out
which ones are kept alive in S3 ?

> > So the only thing you are imposing to app writers is to use an
> > interface which solves nothing and does not save you any power at
> > all. 
> 
> It's already been demonstrated that the Android approach saves power.

Demonstrated ? Care to explain me how it makes a difference:

while (1) {
  block();
  read();
  process_event();
  unblock();
---> suspend
<--- resume
  do_crap();100 cycles
}

vs.

while (1) {
  read();
---> suspend
<--- resume
  process_event();
  do_crap();100 cycles
}

You spend the damned 1000 cycles in any case just at a different
point in time. So if you are so convinced and have fully understood
all the implications, please enlighten me why do_crap() costs less
power with the blockers approach.

An you are also stubbornly refusing to answer my analysis about the
effect on apps which do not use the blocker or are not allowed to.

1) The kernel blocker does not guarantee that the lousy app has
   processed the event. It just guarantees that the lousy app has
   emptied the input queue. So what's the point of the kernel blocker
   in that case ?

2) What's the difference on setting that app to QoS(NONE) and let the
   kernel happily ignore it.

Come up with real explanations and numbers and not just the "it has
been demonstrated" chant which is not holding water if you look at the
above.
 
> > Runnable tasks and QoS guarantees are the indicators whether you can
> > go to opportunistic suspend or not. Everything else is just window
> > dressing.
> 
> As I keep saying, this is all much less interesting if you don't care 
> about handling suboptimal applications. If you do care about them then 
> the Android approach works. Nobody has demonstrated a scheduler-based 
> one that does.

That does not make the android approach any better. They should have
talked to us upfront and not after the fact. Just because they decided
to do that in their google basement w/o talking to people who care is
not proving that it's a good solution and even less a reason to merge
it as is.

The kernel history is full of examples where crappy solutions got
rejected and kept out of the kernel for a long time even if there was
a need for them in the application field and they got shipped in
quantities with out of tree patches (NOHZ, high resolution timers,
...). At some point people stopped arguing for crappy solutions and
sat down and got it right. The problem of power management and
opportunistic suspend is not different in any way.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 08:18:11PM +0200, Peter Zijlstra wrote:
> > On Thu, 2010-05-27 at 19:14 +0100, Matthew Garrett wrote:
> > > If I get a WoL 
> > > packet in the 0.5 of a second between userspace deciding to suspend and 
> > > actually doing so, the system shouldn't suspend. 
> > 
> > Please re-read Thomas' description of how a driver should do the state
> > transition.
> >
> > So either we get the packet before suspend, and we cancel the suspend,
> > or we get it after and we wake up. What's the problem, and how does that
> > need suspend blockers?
> 
> In order to cancel the suspend we need to keep track of whether 
> userspace has consumed the event and reacted appropriately. Since we 
> can't do this with the process scheduler, we end up with something that 
> looks like suspend blockers.

And the scheduler based approach does exactly this and provides
exactly the same non guarantees to applications which are set to
QoS(NONE) as it does to applications which do not (or are not allowed
to) use the user space suspend blockers.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Cox wrote:

> > No, it's not. Forced suspend may be in response to hitting a key, but it 
> 
> You are the only person here talking about 'forced' suspends. The rest of
> us are talking about idling down and ensuring we are always in a state we
> un-idle correctly.
> 
> > may also be in response to a 30 minute timeout expiring. If I get a WoL 
> > packet in the 0.5 of a second between userspace deciding to suspend and 
> > actually doing so, the system shouldn't suspend.
> 
> I don't think that argument holds water in the form you have it
> 
> What about 5 nanoseconds before you suspend. Now you can't do that (laws
> of physics and stuff).
> 
> So your position would seem to be "we have a race but can debate how big
> is permissible"
> 
> The usual model is
> 
> "At no point should we be in a position when entering a suspend style
>  deep sleep where we neither abort the suspend, nor commit to a
>  suspend/resume sequence if the events we care about occur"
> 
> and that is why the hardware model is
> 
>   Set wake flags
>   Check if idle
>   If idle
>   Suspend
>   else
>   Clear wake flags
>   Unwind
> 
> and the wake flags guarantee that an event at any point after the wake
> flags are set until they are cleared will cause a suspend to be resumed,
> possibly immediately after the suspend.

And if a platform can not guarantee the wakeup or the lossless
transition of states then you can not solve this by throwing blockers
or whatever into the code.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 08:53:11PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > We were talking about PCs. Suspend-as-c-state is already implemented for 
> > > OMAP.
> >  
> > Ah, now we talk about PCs. And all of a sudden the problem of the
> > unability of determining wakeup sources is not longer relevant ? So
> > how do you guarantee that we don't miss one if we cant figure out
> > which ones are kept alive in S3 ?
> 
> A wakeup event is defined as one that wakes the system - if a system 
> can't be woken by a specific event then it's impossible to lose it, 
> since it wasn't a wakeup event to begin with.

So where is the problem ?
 
> > > > So the only thing you are imposing to app writers is to use an
> > > > interface which solves nothing and does not save you any power at
> > > > all. 
> > > 
> > > It's already been demonstrated that the Android approach saves power.
> > 
> > Demonstrated ? Care to explain me how it makes a difference:
> > 
> > while (1) {
> >   block();
> >   read();
> >   process_event();
> >   unblock();
> > ---> suspend
> > <--- resume
> >   do_crap();100 cycles
> > }
> > 
> > vs.
> > 
> > while (1) {
> >   read();
> > ---> suspend
> > <--- resume
> >   process_event();
> >   do_crap();100 cycles
> > }
> > 
> > You spend the damned 1000 cycles in any case just at a different
> > point in time. So if you are so convinced and have fully understood
> > all the implications, please enlighten me why do_crap() costs less
> > power with the blockers approach.
> 
> Consider the case where the read() is non-blocking.

Which I consider in the same range as the application which does:

  while (1);

> > 1) The kernel blocker does not guarantee that the lousy app has
> >processed the event. It just guarantees that the lousy app has
> >emptied the input queue. So what's the point of the kernel blocker
> >in that case ?
> 
> Yes, I think you're right here. You need the userspace component as well 
> for this to work correctly.
>
> > 2) What's the difference on setting that app to QoS(NONE) and let the
> >kernel happily ignore it.
> 
> What sets that flag, and how does it interact with an application that 
> never makes a blocking system call?

  QoS(NONE) would be default policy for untrusted apps in the same way
  as you'd refuse the usage of supsend blockers for untrusted apps.
  
  while (1); is definitely not an application which should be granted
  trusted rights, right ?

  block_suspend(); while(1);
  
  is the same as:

  QoS(minimal latency); while(1);

  So if you really go to trust an "while (1);" application you better
  make sure that this app has the appropriate QoS(NONE) or QoS(10s)
  call in it before letting it lose.
 
> > Come up with real explanations and numbers and not just the "it has
> > been demonstrated" chant which is not holding water if you look at the
> > above.
> 
> The numbers were earlier in the thread.

  Numbers, yes. But I really give a sh*t about numbers w/o a detailed
  explanation of the scenario which created those numbers. And if the
  numbers boil down to: we handle the untrusted app which does "while
  (1);" then they are absolutely useless.

> > The kernel history is full of examples where crappy solutions got
> > rejected and kept out of the kernel for a long time even if there was
> > a need for them in the application field and they got shipped in
> > quantities with out of tree patches (NOHZ, high resolution timers,
> > ...). At some point people stopped arguing for crappy solutions and
> > sat down and got it right. The problem of power management and
> > opportunistic suspend is not different in any way.
> 
> Yes, and I'd agree with this if anyone seemed to have any idea how to do 
> it right. But despite all the heat and noise in this thread, all we've 
> got is an expression that it should be handled by the scheduler (a 
> viewpoint I agree with) without any explanation of how to switch 
> policies in such a way that applications idle without losing wakeup 
> events.

Why in the world should they lose wakeup events ? If an app goes idle,
it goes idle because it is waiting for an event. There is no other
sane reason except for those apps which are untrusted and force
idled. And for those you agreed already the suspend blockers don't
solve anything as they are either not implemented or the app is not
trusted to use them.

So we are back to round one of the whole discussion:

   Is it worth to create an utter mess in the kernel just to handle a
   subset of crappy coded applications ?

And the answer stays the same: No, not at all.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Stern wrote:

> On Thu, 27 May 2010, Matthew Garrett wrote:
> 
> > On Thu, May 27, 2010 at 10:23:50PM +0200, Thomas Gleixner wrote:
> > > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > > A wakeup event is defined as one that wakes the system - if a system 
> > > > can't be woken by a specific event then it's impossible to lose it, 
> > > > since it wasn't a wakeup event to begin with.
> > > 
> > > So where is the problem ?
> > 
> > The problem is that, right now, if a wakeup event is received between 
> > the point where userspace decides to start a suspend and userspace 
> > actually starts a suspend, that event may not abort the suspend.
> 
> The two of you are talking at cross purposes.  Thomas is referring to 
> idle-based suspend and Matthew is talking about forced suspend.

Yes, and forced suspend to disk is the same as force suspend to disk,
which has both nothing to do with sensible resource management.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-27 Thread Thomas Gleixner
On Thu, 27 May 2010, Rafael J. Wysocki wrote:

> On Thursday 27 May 2010, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Alan Stern wrote:
> > 
> > > On Thu, 27 May 2010, Felipe Balbi wrote:
> > > 
> > > > On Thu, May 27, 2010 at 05:06:23PM +0200, ext Alan Stern wrote:
> > > > >If people don't mind, here is a greatly simplified summary of the
> > > > >comments and objections I have seen so far on this thread:
> > > > >
> > > > >   The in-kernel suspend blocker implementation is okay, even
> > > > >   beneficial.
> > > > 
> > > > I disagree here. I believe expressing that as QoS is much better. Let 
> > > > the kernel decide which power state is better as long as I can say I 
> > > > need 100us IRQ latency or 100ms wakeup latency.
> > > 
> > > Does this mean you believe "echo mem >/sys/power/state" is bad and
> > > should be removed?  Or "echo disk >/sys/power/state"?  They pay no
> > 
> > mem should be replaced by an idle suspend to ram mechanism
> 
> Well, what about when I want the machine to suspend _regardless_ of whether
> or not it's idle at the moment?  That actually happens quite often to me. :-)

Fair enough. Let's agree on a non ambigous terminology then:

 forced:

 suspend which you enforce via user interaction, which
 also implies that you risk losing wakeups depending on
 the hardware properties

 opportunistic:

 suspend driven from the idle context, which guarantees to
 not lose wakeups. Provided only when the hardware does
 provide the necessary capabilities.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-28 Thread Thomas Gleixner
On Thu, 27 May 2010, Alan Stern wrote:

> On Thu, 27 May 2010, Thomas Gleixner wrote:
> 
> > > The two of you are talking at cross purposes.  Thomas is referring to 
> > > idle-based suspend and Matthew is talking about forced suspend.
> > 
> > Yes, and forced suspend to disk is the same as force suspend to disk,
> > which has both nothing to do with sensible resource management.
> 
> If I understand correctly, you are saying that all the untrusted 
> applications should run with QoS(NONE).  Then they could do whatever 
> they wanted without causing any interference.
> 
> And with idle-based power management (rather than forced suspend), 
> there would be no issue with wakeup events getting unduly delayed.
> 
> Unless one of those events was meant for an untrusted application.  Is 
> that the source of the difficulty?

Probably, but that's not solved by suspend blockers either as I
explained several times now. Because those untrusted apps either lack
blocker calls or are not allowed to use them, so the blocker does not
help for those either.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Sat, 29 May 2010, James Bottomley wrote:
> The job of the kernel is to accommodate hardware as best it can ...
> sometimes it might not be able to, but most of the time it does a pretty
> good job.
> 
> The facts are that C states and S states are different and are entered
> differently. 

That's an x86'ism which is going away. And that's really completely
irrelevant for the mobile device space. Can we please stop trying to
fix todays x86 based laptop problems? They are simply not fixable.

> For some omap hardware, the power consumption in the
> lowest C state (with all the ancillary power control) is the same as S3,
> that's fine, suspend from idle works as well as suspend to ram modulo
> bad apps. For quite a lot of MSM hardware, the lowest C state power
> consumption is quite a bit above S3.  It's not acceptable to tell those
> people "tough, your battery runs out in 30 minutes because you bought
> the wrong hardware".  We have to figure out how to get to S3 ... whether
> this is from idle or some other mechanism is again a discussion point,
> but not doing it is not an option.

If you'd have read the answers from Alan carefully, then you'd have
noticed that even x86 hardware is getting to the point where OMAP is
today. i.e. support of transparent suspend from idle. If that wouldn't
happen then x86 would be simply unusable for mobile devices. It's that
easy. And we really do _NOT_ care about the existing laptop hardware
which does not provide that because it's a lost case. Not only due to
the missing (or just disabled) wakeup sources, also due to the fact
that you cannot do sensible power management by completely disabling
clock and/or power of unused devices in the chipset. There is a damn
good reason why the mobile space is _NOT_ x86 based at the moment.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Sat, 29 May 2010, James Bottomley wrote:
> On Sat, 2010-05-29 at 10:10 +0200, Peter Zijlstra wrote:
> > 
> > Not using suspend is exactly the point. As Alan has argued, propagating
> > suspend blockers up into all regions of userspace will take much longer
> > than fixing the hardware.
> 
> Strange, that's not what I heard as the possible solution.  I thought he
> was advocating expressing the kernel side of suspend blockers as QoS
> constraints.  Once we have QoS constraints correctly done in the kernel,
> userspace still has to express its requirements.  If the requirements
> are static, then they can be done from policy files or in some other
> static way but if they're dynamic, they'll still have to be in the
> applications ... in about the same places the android wakelocks are.

That's wrong. You only need the explicit dynamic QoS constraints for
applications which follow the scheme:

 while (1) {
   if (event_available())
  process_event();
   else
  do_useless_crap_which_consumes_power();
 } 

which need the following annotation:

 while (1) {
   block_suspend();
   if (event_available()) {
  process_event();
  unblock_suspend();
   } else {
  unblock_suspend();
  do_useless_crap_which_consumes_power();
   }
 } 

Plus the kernel counterpart of drivers which take the suspend blocker
in the interrupt handler and release it when the event queue is empty.

So that's done for making polling event handling power "efficient".

Even worse, you need the same "annotation" for non polling mode and it
enforces the use of select() because you cannot take a suspend blocker
across a blocking read() without adding more invasive interactions to
the kernel..

So the "sane" app looks like:

   while (1) {
 select();
 block_suspend();
 process_events();
 unblock_suspend();
   }

I'm really tired of arguing that this promotion of "programming style"
is the worst idea ever, so let's look how you can do the same thing
QoS based.

s/block_suspend()/qos(INTERACTIVE)/ and
s/unblock_suspend()/qos(NONE)/ and
s/block_magic()/qos_magic()/ in the drivers.

Yes, it's mostly the same, with a subtle difference:

While android can use it in the big hammer approach to disable the
existing user initiated suspend via /sys/power/state, the rest of the
world can benefit as well in various ways.

 - Sane applications which use a blocking event wait can be handled
   with a static QoS setting simply because a blocking read relies on
   the QoS state of the underlying I/O system.

 - Idle based suspend as the logical consequence of idle states is
   just a matter of QoS constraint based decisions.

 - Untrusted apps can be confined in cgroups. The groups are set to
   QoS(None) when user land decides that it's time to safe power
   (e.g. screen lock kicks in)

 - QoS states can block applications from I/O when the I/O system is
   set to a state which is exclusive.

  - ...

So that allows to use the same mechanism for more than the android
sledge hammer approach and confines the controversial use cases into
android specific files without adding a hard to maintain user space
interface which would prevent or at least make it hard to do some of
the above mentioned things which we want to see implemented.

While I personally disagree with the android approach, I REALLY want
to provide them a mechanism to make it work, but not for the price
that it actively prevents solutions which are considered to be
technically superior by a relevant group of developers/maintainers.

> > You got to realize this is about Linux as a whole, I really don't care
> > one whit about the specific Android case. We want a solution that is
> > generic enough to solve the power consumption problem and makes sense on
> > future hardware.
> 
> I don't think anyone disagrees with this. As long as we find a long term
> solution that satisfies the android case, everyone will be happy.

See above.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Mon, 31 May 2010, James Bottomley wrote:
> 
> For MSM hardware, it looks possible to unify the S and C states by doing
> suspend to ram from idle but I'm not sure how much work that is.

On ARM, it's not rocket science and we have in tree support for this
already (OMAP). I have done the same thing on a Samsung part as a
prove of concept two years ago and it's really easy as the hardware is
sane. Hint: It's designed for mobile devices :)

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Mon, 31 May 2010, James Bottomley wrote:

> On Mon, 2010-05-31 at 22:49 +0200, Thomas Gleixner wrote:
> > On Sat, 29 May 2010, James Bottomley wrote:
> > > The job of the kernel is to accommodate hardware as best it can ...
> > > sometimes it might not be able to, but most of the time it does a pretty
> > > good job.
> > > 
> > > The facts are that C states and S states are different and are entered
> > > differently. 
> > 
> > That's an x86'ism which is going away. And that's really completely
> > irrelevant for the mobile device space. Can we please stop trying to
> > fix todays x86 based laptop problems? They are simply not fixable.
> 
> You're the one mentioning x86, not me.  I already explained that some
> MSM hardware (the G1 for example) has lower power consumption in S3
> (which I'm using as an ACPI shorthand for suspend to ram) than any
> suspend from idle C state.

Those machines can go from idle into S2RAM just fine w/o touching the
/sys/power/state S2RAM mechanism.

It's just a deeper "C" state, really.

The confusion is that S3 is considered to be a complete different
mechanism - which is true for PC style x86 - but not relevant for
hardware which is sane from the PM point of view.

Now some people think, that suspend blockers are a cure for the
existing x86/ACPI/BIOS mess, which cannot go to S3 from idle, but
that's simply not feasible.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Tue, 1 Jun 2010, Rafael J. Wysocki wrote:
> On Monday 31 May 2010, Thomas Gleixner wrote:
> > So that allows to use the same mechanism for more than the android
> > sledge hammer approach and confines the controversial use cases into
> > android specific files without adding a hard to maintain user space
> > interface which would prevent or at least make it hard to do some of
> > the above mentioned things which we want to see implemented.
> 
> I generally agree.
> 
> I think the Alan Stern's recent proposal goes along these lines, but it has
> the advantage of being a bit more specific. ;-)

Yes, Alan Stern's proposal is going into that direction and I'm not
opposed. Just wanted to get the overall picture straight for James :)

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Tue, 1 Jun 2010, Rafael J. Wysocki wrote:
> On Tuesday 01 June 2010, Neil Brown wrote:
> > My freerunner has a single core so without CONFIG_PREEMPT it may be that
> > there is no actual race-window - maybe the PRE_SUSPENDs will all run before 
> > a
> > soft_irq thread has a chance to finish handling of the interrupt (my
> > knowledge of these details is limits).  But on a muilti-core device I think
> > there would definitely be a race-window.
> 
> Yes, there always will be a race window.  The only thing we can do is to
> narrow it, but we cannot really close it (at least not on a PC, but I'm not
> really sure it can be closed at all).

It can be closed, when the state transition from normal event delivery
to wakeup mode is state safe, which it is on most platforms which are
designed for the mobile space.

Not so the current PC style x86 platforms, which are not relevant for
the problem at hand at all. Really, that stuff is going either to gain
sane properties or it's just going into the irrelevant realm.

Any attempt to solve the current x86/ACPI/BIOS/mess is waste of time
and is inevitably going to prevent progress.

> If you really want _all_ events to be delivered timely, the only way to go is
> to avoid using suspend (and use the idle framework for power management).

Amen.

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-05-31 Thread Thomas Gleixner
On Tue, 1 Jun 2010, Neil Brown wrote:
> And if you are right that the race window cannot be closed, then the whole
> suspend-blocker infrastructure is pointless as the purpose of it is simply to
> close that window.  If it really does not and cannot work, then it would be
> best to reject it for that reason rather than for less concrete aesthetic
> arguments.
> But presumably it does work in practice on Android hardware so . confused.
> 
> Having just seen the email from Thomas, maybe you mean that it cannot be
> closed on devices using ACPI, but can on other devices.  I can sort-of
> imagine how that would be the case (I tried reading an ACPI spec once - my
> hat is of to those of you who understand it).
> That shouldn't prevent us from closing the race window on "sane" hardware
> that allows it.  This would, I think, be sufficient for Android's needs.
> 
> I'm hoping we can get agreement on:
>   - there is a race with suspend

That's a matter of how you define "suspend". 

If "suspend" is another deep idle state and the hardware is sane,
there is no race at all - assumed that the driver/platform developer
got it right. It's not rocket science to transition from "normal" irq
delivery to wakeup based delivery raceless (except for PC style x86
hardware of today)

If "suspend" is the thing we are used to via /sys/power/state then the
race will persist forever except for the suspend blocker workaround,
which we can express in QoS terms as well w/o adding another suspend
related user space API.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-01 Thread Thomas Gleixner
On Tue, 1 Jun 2010, Neil Brown wrote:
> 
> I think you have acknowledged that there is a race with suspend - thanks.
> Next step was "can it be closed".
> You seem to suggest that it can, but you describe it as a "work around"
> rather than a "bug fix"...
> 
> Do you agree that the race is a "bug", and therefore it is appropriate to
> "fix" it assuming an acceptable fix can be found (which I think it can)?

If we can fix it, yes we definitely should do and not work around it.
 
Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-01 Thread Thomas Gleixner

On Mon, 31 May 2010, Arve Hjønnevåg wrote:

> On Mon, May 31, 2010 at 2:46 PM, Thomas Gleixner  wrote:
> > On Mon, 31 May 2010, James Bottomley wrote:
> >>
> >> For MSM hardware, it looks possible to unify the S and C states by doing
> >> suspend to ram from idle but I'm not sure how much work that is.
> >
> > On ARM, it's not rocket science and we have in tree support for this
> > already (OMAP). I have done the same thing on a Samsung part as a
> > prove of concept two years ago and it's really easy as the hardware is
> > sane. Hint: It's designed for mobile devices :)
> >
> 
> We already enter the same power state from idle and suspend on msm. In
> the absence of misbehaving apps, the difference in power consumption
> is entirely caused by periodic timers in the user-space framework
> _and_ kernel. It only takes a few timers triggering per second (I
> think 3 if they do no work) to double the average power consumption on
> the G1 if the radio is off. We originally added wakelocks because the
> hardware we had at the time had much lower power consumption in
> suspend then idle, but we still use suspend because it saves power.

So how do you differentiate between timers which _should_ fire and
those you do not care about ?

We have mechanisms in place to defer timers so the wakeups are
minimized. If that's not enough we need to revisit.

Thanks,

tglx



Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-01 Thread Thomas Gleixner
On Mon, 31 May 2010, Arve Hjønnevåg wrote:

> 2010/5/31 Rafael J. Wysocki :
> > On Monday 31 May 2010, Arve Hjønnevåg wrote:
> >> 2010/5/30 Rafael J. Wysocki :
> > ...
> >>
> >> I think it makes more sense to block suspend while wakeup events are
> >> pending than blocking it everywhere timers are used by code that could
> >> be called while handling wakeup events or other critical work. Also,
> >> even if you did block suspend everywhere timers where used you still
> >> have the race where a wakeup interrupt happens right after you decided
> >> to suspend. In other words, you still need to block suspend in all the
> >> same places as with the current opportunistic suspend code, so what is
> >> the benefit of delaying suspend until idle?
> >
> > Assume for a while that you don't use suspend blockers, OK?  I realize you
> > think that anything else doesn't make sense, but evidently some other people
> > have that opinion about suspend blockers.
> >
> 
> It sounded like you were suggesting that initiating suspend from idle
> would somehow avoid the race condition with wakeup events. All I'm
> saying is that you would need to block suspend in all the same places.
> If you don't care about ignoring wakeup events, then sure you can
> initiate suspend from idle.

And why should you miss a wakeup there ? If you get an interrupt in
the transition, then you are not longer idle.

Thanks,

tglx

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-02 Thread Thomas Gleixner
On Tue, 1 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/1 Thomas Gleixner :
> >
> > On Mon, 31 May 2010, Arve Hjønnevåg wrote:
> >
> >> On Mon, May 31, 2010 at 2:46 PM, Thomas Gleixner  
> >> wrote:
> >> > On Mon, 31 May 2010, James Bottomley wrote:
> >> >>
> >> >> For MSM hardware, it looks possible to unify the S and C states by doing
> >> >> suspend to ram from idle but I'm not sure how much work that is.
> >> >
> >> > On ARM, it's not rocket science and we have in tree support for this
> >> > already (OMAP). I have done the same thing on a Samsung part as a
> >> > prove of concept two years ago and it's really easy as the hardware is
> >> > sane. Hint: It's designed for mobile devices :)
> >> >
> >>
> >> We already enter the same power state from idle and suspend on msm. In
> >> the absence of misbehaving apps, the difference in power consumption
> >> is entirely caused by periodic timers in the user-space framework
> >> _and_ kernel. It only takes a few timers triggering per second (I
> >> think 3 if they do no work) to double the average power consumption on
> >> the G1 if the radio is off. We originally added wakelocks because the
> >> hardware we had at the time had much lower power consumption in
> >> suspend then idle, but we still use suspend because it saves power.
> >
> > So how do you differentiate between timers which _should_ fire and
> > those you do not care about ?
> >
> 
> Only alarms are allowed to fire while suspended.
> 
> > We have mechanisms in place to defer timers so the wakeups are
> > minimized. If that's not enough we need to revisit.
> >
> 
> Deferring the the timers forever without stopping the clock can cause
> problems. Our user space code has a lot of timeouts that will trigger
> an error if an app does not respond in time. Freezing everything and
> stopping the clock while suspended is a lot simpler than trying to
> stop individual timers and processes from running.

And resume updates timekeeping to account for the slept time. So the
only way to get away with that is to sleep under a second or just
ignoring the update by avoiding the access to rtc. 

So how do you keep timekeeping happy ?

Thanks,

tglx

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-02 Thread Thomas Gleixner
On Wed, 2 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/2 Thomas Gleixner :
> > On Tue, 1 Jun 2010, Arve Hjønnevåg wrote:
> >> Deferring the the timers forever without stopping the clock can cause
> >> problems. Our user space code has a lot of timeouts that will trigger
> >> an error if an app does not respond in time. Freezing everything and
> >> stopping the clock while suspended is a lot simpler than trying to
> >> stop individual timers and processes from running.
> >
> > And resume updates timekeeping to account for the slept time. So the
> 
> No, for the monotonic clock it does the opposite. The hardware clock
> is read on resume and the offset is set so the monotonic clock gets
> the same value as it had when suspend was called.

Grr, yes. Misread the code. -ENOTENOUGHCOFFEE

Thanks,

tglx

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-02 Thread Thomas Gleixner
On Tue, 1 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/1 Thomas Gleixner :
> > On Mon, 31 May 2010, Arve Hjønnevåg wrote:
> >
> >> 2010/5/31 Rafael J. Wysocki :
> >> > On Monday 31 May 2010, Arve Hjønnevåg wrote:
> >> >> 2010/5/30 Rafael J. Wysocki :
> >> > ...
> >> >>
> >> >> I think it makes more sense to block suspend while wakeup events are
> >> >> pending than blocking it everywhere timers are used by code that could
> >> >> be called while handling wakeup events or other critical work. Also,
> >> >> even if you did block suspend everywhere timers where used you still
> >> >> have the race where a wakeup interrupt happens right after you decided
> >> >> to suspend. In other words, you still need to block suspend in all the
> >> >> same places as with the current opportunistic suspend code, so what is
> >> >> the benefit of delaying suspend until idle?
> >> >
> >> > Assume for a while that you don't use suspend blockers, OK?  I realize 
> >> > you
> >> > think that anything else doesn't make sense, but evidently some other 
> >> > people
> >> > have that opinion about suspend blockers.
> >> >
> >>
> >> It sounded like you were suggesting that initiating suspend from idle
> >> would somehow avoid the race condition with wakeup events. All I'm
> >> saying is that you would need to block suspend in all the same places.
> >> If you don't care about ignoring wakeup events, then sure you can
> >> initiate suspend from idle.
> >
> > And why should you miss a wakeup there ? If you get an interrupt in
> > the transition, then you are not longer idle.
> >
> 
> Because suspend itself causes you to not be idle you cannot abort
> suspend just because you are not idle anymore.

You still are addicted to the current suspend mechanism. :)

The whole point of doing it from idle in the form of a deep power
state is to avoid the massive sh*tload of work which is neccesary to
run the existing suspend code. But that needs runtime power management
and tweaks to avoid your timers waking you, etc.

The mechanism you want to use is: suspend no matter what, like closing
the lid of the laptop, but with a few tweaks around it:

   1) An interrupt on a wakeup source which comes in while the suspend
  code runs, i.e before you transitioned into wakeup mode, must
  abort / prevent suspend.

   2) Prevent another attempt to suspend before the event has been
  delivered and the apps have signaled that they have not longer
  any work to do.

   As a side effect you confine crappy apps with that mechanism in
   some way.

In the laptop case we do not want the tweaks as not going into suspend
might set your backpack on fire.

If I understood you correctly then you can shutdown the CPU in idle
completelty already, but that's not enough due to:

1) crappy applications keeping the cpu away from idle
2) timers firing

Would solving those two issues be sufficient for you or am I missing
something ?

Thanks,

tglx

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-02 Thread Thomas Gleixner
On Wed, 2 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/2 Thomas Gleixner :
> > On Tue, 1 Jun 2010, Arve Hjønnevåg wrote:
> >>
> >> Because suspend itself causes you to not be idle you cannot abort
> >> suspend just because you are not idle anymore.
> >
> > You still are addicted to the current suspend mechanism. :)
> >
> 
> No I want you to stop confusing low power idle modes with suspend. I
> know how to enter low power modes from idle if that low power mode is
> not too disruptive.

What prevents us from going into a disruptive mode from idle ? I don't
see a reason - except crappy ACPI stuff, which I'm happy to ignore.

> > If I understood you correctly then you can shutdown the CPU in idle
> > completelty already, but that's not enough due to:
> >
> >    1) crappy applications keeping the cpu away from idle
> >    2) timers firing
> >
> > Would solving those two issues be sufficient for you or am I missing
> > something ?
> 
> Solving those two is enough for current android phones, but it may not
> be enough for other android devices.

In which way ? May not be enough is a pretty vague statement.

> 1 is not solvable (meaning we cannot fix all apps),

We can mitigate it with cgroups and confine crap there, i.e. force
idle them.

> and 2 is difficult to fix since the periodic
> work is useful while the device is actually in use. One possible way
> to solve 2 is to allow timers on a not-idle clock.

That's what I had in mind.

> Unrelated to Android, I also want to use opportunistic suspend on my
> desktop.

I expect that intel/amd fixing their stuff is going to happen way
before we sprinkled suspend blockers over a full featured desktop
distro.

Thanks,

tglx

Re: [PATCH] - race-free suspend. Was: Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-02 Thread Thomas Gleixner
On Wed, 2 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/2 Neil Brown :
> > There would still need to be some sort of communication between the the
> > suspend daemon on any event daemon to ensure that the events had been
> > processed.  This could be very light weight interaction.  The point though 
> > is
> > that with this patch it becomes possible to avoid races.  Possible is better
> > than impossible.
> >
> 
> We already have a solution. I don't think rejecting our solution but
> merging a worse solution should be the goal.

That's not the goal at all. We want a solution which is acceptable for
android and OTOH does not get into the way of other approaches.

The main problem I have is that suspend blockers are only addressing
one particular problem space of power management.

We have more requirements than that, e.g. an active device transfer
requires to prevent the idle code to select a deep power state due to
latency requirements. 

So we then have to implement two mechanisms in the relevant drivers:

   1) telling the idle code to limit latency
   2) telling the suspend code not to suspend

My main interest is to limit it to one mechanism, which is QoS based
and let idle and suspend make the appropriate decisions based on that
information.

Thanks,

tglx





Re: [PATCH] - race-free suspend. Was: Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-02 Thread Thomas Gleixner
On Wed, 2 Jun 2010, Arve Hjønnevåg wrote:

> 2010/6/2 Thomas Gleixner :
> > On Wed, 2 Jun 2010, Arve Hjønnevåg wrote:
> >> 2010/6/2 Neil Brown :
> >> > There would still need to be some sort of communication between the the
> >> > suspend daemon on any event daemon to ensure that the events had been
> >> > processed.  This could be very light weight interaction.  The point 
> >> > though is
> >> > that with this patch it becomes possible to avoid races.  Possible is 
> >> > better
> >> > than impossible.
> >> >
> >>
> >> We already have a solution. I don't think rejecting our solution but
> >> merging a worse solution should be the goal.
> >
> > That's not the goal at all. We want a solution which is acceptable for
> > android and OTOH does not get into the way of other approaches.
> >
> 
> I don't actually think the suspend blocker patchset get in the way of
> anything else.
> 
> > The main problem I have is that suspend blockers are only addressing
> > one particular problem space of power management.
> >
> > We have more requirements than that, e.g. an active device transfer
> > requires to prevent the idle code to select a deep power state due to
> > latency requirements.
> >
> > So we then have to implement two mechanisms in the relevant drivers:
> >
> >   1) telling the idle code to limit latency
> >   2) telling the suspend code not to suspend
> 
> And 3) telling the idle code to not enter low power modes that disrupt
> active interrupts or clocks.
>
> Our wakelock code handles 2 and 3, but I removed support for 3 on
> request since you can hack it by specifying a latency value that you
> know the low power mode cannot support.

You are mixing concepts. 

clock domains and power domains are a separate issue which are already
handled by the run time power management code and the clock framework.

The interrupt latency is a QoS requirement and has nothing to do with
power domains and clock domains simply because I can go deeper w/o
violating the clock and power domain constraints when the latency
allows it.

> > My main interest is to limit it to one mechanism, which is QoS based
> > and let idle and suspend make the appropriate decisions based on that
> > information.
> >
> 
> We can use one mechanism for this, but we still have to specify both.
> To me this is just another naming argument and not a good reason to
> not merge the suspend blocker code. You have to modify the same

The main objection against suspend blocker is the user space interface
/ ABI issue, not the driver code which we can fix in no time. But we
cannot fix it once it is glued into a user space interface.

I don't care about adding two empty static inlines into a header file,
which allows to merge the android drivers, but I care much about
giving a guaranteed behaviour to user space.

Thanks,

tglx

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-03 Thread Thomas Gleixner
On Thu, 3 Jun 2010, James Bottomley wrote:

> On Thu, 2010-06-03 at 11:03 +0100, Alan Cox wrote:
> > > [mtg: ] This has been a pain point for the PM_QOS implementation.  They 
> > > change the constrain back and forth at the transaction level of the i2c 
> > > driver.  The pm_qos code really wasn't made to deal with such hot path 
> > > use, as each such change triggers a re-computation of what the aggregate 
> > > qos request is.
> > 
> > That should be trivial in the usual case because 99% of the time you can
> > hot path
> > 
> > the QoS entry changing is the latest one
> > there have been no other changes
> > If it is valid I can use the cached previous aggregate I cunningly
> > saved in the top QoS entry when I computed the new one
> > 
> > (ie most of the time from the kernel side you have a QoS stack)
> 
> It's not just the list based computation: that's trivial to fix, as you
> say ... the other problem is the notifier chain, because that's blocking
> and could be long.  Could we invoke the notifier through a workqueue?
> It doesn't seem to have veto power, so it's pure notification, does it
> matter if the notice is delayed (as long as it's in order)?

It depends on the information type and for a lot of things we might
get away without notifiers. 

The only real issue is when you need to get other cores out of their
deep idle state to make a new constraint work. That's what we do with
the DMA latency notifier right now.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: suspend blockers & Android integration

2010-06-04 Thread Thomas Gleixner
On Fri, 4 Jun 2010, Peter Zijlstra wrote:

> On Fri, 2010-06-04 at 11:43 +0200, Peter Zijlstra wrote:
> > I still believe containment is a cgroup problem. The freeze/snapshot/resume
> > container folks seem to face many of the same problems. Including the
> > pending timer one I suspect. Lets solve it there. 
> 
> While talking to Thomas about this, we'd probably need a CLOCK_MONOTONIC
> namespace to pull this off, so that resumed apps don't see the jump in
> absolute time.
> 
> This would also help with locating the relevant timers, since they'd be
> on the related timer base.
> 
> The only 'interesting' issue I can see here is that if you create 1000
> CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
> efficiently find the leftmost timer.

We can do more clever than that. All CLOCK_MONOTONIC timers can live
in the CLOCK_MONOTONIC rbtree, we just need proper annotation, i.e.:

struct hrtimer {
   ktime_t expires;
   ..
   struct list_head namespace;
   ktime_t base_offset;
};

So expires would be on CLOCK_MONOTONIC as seen from the kernel, just
the user space interfaces would take the base_offset into account.

On freeze we remove the timers from the rbtree (they are easy to
find via the namespace list) and on thaw we set the base_offset
accordingly and insert them again. So no surprise for user space and
no tree of trees to walk through.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: suspend blockers & Android integration

2010-06-04 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> I kind of agree here, so I'd like to focus a bit on that.
> 
> Here's my idea in the very general terms:
> 
> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> that don't use suspend blockers aka wakelocks in the Android world) at the
> point Android would normally start opportunistic suspend.

There is an additional benefit to this approach:

 In the current android world a background task (e.g. download
 initiated before the screensaver kicked in) prevents the suspend,
 but that also means that the crapplications can still suck power
 completely unconfined.

 With the cgroup freezer you can "suspend" them right away and
 just keep the trusted background task(s) alive which allows us to
 go into deeper idle states instead of letting the crapplications
 run unconfined until the download finished and the suspend
 blocker goes away.
 
> (2) Allow the cpuidle framework to put CPUs into low-power states after the
> "trusted" apps (ie. the ones that use suspend blockers in the Android
> world) have gone idle.
> 
> (3) Teach the cpuidle framework to schedule runtime suspend of I/O devices
> before idling the last CPU (*).
> 
> (4) Design a mechanism to resume the I/O devices suspended in (3) so that
> they are not powered up unnecessarily (that's going to be difficult as far
> as I can see).
> 
> This way, in principle, we should be able to save (at least almost) as much
> energy as the opportunistic suspend currently used by Android, provided that
> things will be capable of staying idle for extended periods of time.
> 
> (*) That may require per-device PM QoS requirements to be used, in which case
> devices may even be suspended earlier if the PM QoS requirements of all
> of their users are met.
> 
> I wonder what people think.  Is this realistic and if so, would it be 
> difficult
> to implement?

I think it's realistic and not overly complicated to implement.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: suspend blockers & Android integration

2010-06-04 Thread Thomas Gleixner
Arve,

On Fri, 4 Jun 2010, Arve Hjønnevåg wrote:

> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner  wrote:
> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> I kind of agree here, so I'd like to focus a bit on that.
> >>
> >> Here's my idea in the very general terms:
> >>
> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> >>     that don't use suspend blockers aka wakelocks in the Android world) at 
> >> the
> >>     point Android would normally start opportunistic suspend.
> >
> > There is an additional benefit to this approach:
> >
> >     In the current android world a background task (e.g. download
> >     initiated before the screensaver kicked in) prevents the suspend,
> >     but that also means that the crapplications can still suck power
> >     completely unconfined.
> >
> 
> Yes this can happen. It is usually only a big problem when you combine
> an (trusted) application that has a bug that blocks suspend forever
> with an application that wakes up too often for us to enter low power
> idle modes.

Why is it a BUG in the trusted app, when I initiate a download and put
the phone down ?

That download might take a minute or two, but that's not an
justification for the crapplication to run unconfined and prevent
lower power states.

> >     With the cgroup freezer you can "suspend" them right away and
> >     just keep the trusted background task(s) alive which allows us to
> >     go into deeper idle states instead of letting the crapplications
> >     run unconfined until the download finished and the suspend
> >     blocker goes away.
> >
> 
> Yes this would be better, but I want it in addition to suspend, not
> instead of it. It is also unclear if our user-space code could easily
> make use of it since our trusted code calls into untrusted code.

Sorry, that's really the worst argument I saw in this whole
discussion. 

You're basically saying, that you have no idea what your user space
stack is doing and you do not care at all as long as your suspend
blocker scheme makes things work somehow.

Up to that point, I really tried hard to step back from my initial
"OMG, promoting crap is a nono" reaction and work with you on a
sensible technical solution to confine crap and make it aligned with
other efforts in this area.

So now, after I spent a reasonable amount of time (as you did) to
understand what your requirements are, you come up with another
restriction which is so outside of any level of sanity, that I'm at
the point of giving up and just going into NAK mode.

Can you please answer the following question:

What is the point of having the distinction of "trusted" and
"untrusted" when you have no way to prevent "trusted" code calling
"into "untrusted" code ?

That's violating any sense of abstraction and layering and makes it
entirely clear that the only way you can deal with your own design
failure is a big hammer which you need to force into the kernel.

Sorry, no. I'm perfectly willing to make progress on that, as long as
we walk on a sane ground. But abusing the kernel for fixing basic
engineering problems in the user space side of affairs is completely
out of discussion.

> >> I wonder what people think.  Is this realistic and if so, would it be 
> >> difficult
> >> to implement?
> >
> > I think it's realistic and not overly complicated to implement.
> >
> 
> The kernel support can be easily implemented on most arm hardware, I
> don't know if it can work on most existing x86 hardware. It does not

It does not matter. Even Intel folks told you more than once, that x86
hardware is going to be fixed pretty soon. Hint: that's crucial to
their business 

> give us the same power savings as suspend with existing software, but
> it can handle bad apps better (assuming you don't combine
> opportunistic suspend and cgroup freezing). The biggest hurdle is how
> to handle dependencies between processes that gets frozen and
> processes that don't get frozen.

See above.

Thanks,

tglx

Re: suspend blockers & Android integration

2010-06-05 Thread Thomas Gleixner
B1;2005;0cOn Fri, 4 Jun 2010, Arve Hjønnevåg wrote:

> 2010/6/4 Thomas Gleixner :
> > Arve,
> >
> > On Fri, 4 Jun 2010, Arve Hjønnevåg wrote:
> >
> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner  wrote:
> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> >> I kind of agree here, so I'd like to focus a bit on that.
> >> >>
> >> >> Here's my idea in the very general terms:
> >> >>
> >> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the 
> >> >> ones
> >> >>     that don't use suspend blockers aka wakelocks in the Android world) 
> >> >> at the
> >> >>     point Android would normally start opportunistic suspend.
> >> >
> >> > There is an additional benefit to this approach:
> >> >
> >> >     In the current android world a background task (e.g. download
> >> >     initiated before the screensaver kicked in) prevents the suspend,
> >> >     but that also means that the crapplications can still suck power
> >> >     completely unconfined.
> >> >
> >>
> >> Yes this can happen. It is usually only a big problem when you combine
> >> an (trusted) application that has a bug that blocks suspend forever
> >> with an application that wakes up too often for us to enter low power
> >> idle modes.
> >
> > Why is it a BUG in the trusted app, when I initiate a download and put
> > the phone down ?
> >
> 
> It is not, but we have had bugs where a trusted app does not unblock
> suspend after some failure case where it is no longer making any
> progress.

Well, that's simply an application bug which sucks battery with or
without suspend blockers. So it's unrelated to the freezing of
untrusted apps while a trusted app still works in the background
before allowing the machine to suspend.
 
> > That download might take a minute or two, but that's not an
> > justification for the crapplication to run unconfined and prevent
> > lower power states.
> >
> 
> I agree, but this is not a simple problem to solve.

Not with suspend blockers, but with cgroup confinement of crap, it's
straight forward.

> >> >     With the cgroup freezer you can "suspend" them right away and
> >> >     just keep the trusted background task(s) alive which allows us to
> >> >     go into deeper idle states instead of letting the crapplications
> >> >     run unconfined until the download finished and the suspend
> >> >     blocker goes away.
> >> >
> >>
> >> Yes this would be better, but I want it in addition to suspend, not
> >> instead of it. It is also unclear if our user-space code could easily
> >> make use of it since our trusted code calls into untrusted code.
> >
> > Sorry, that's really the worst argument I saw in this whole
> > discussion.
> >
> > You're basically saying, that you have no idea what your user space
> > stack is doing and you do not care at all as long as your suspend
> > blocker scheme makes things work somehow.
> >
> 
> Yes I don't know everything our user-space stack is doing, but I do
> know that it makes many calls between processes (and in both
> directions). As far as I know it uses timeouts when calling into
> untrusted code, so a misbehaving application will cause an error
> dialog to pop up asking if the user if it should wait longer or
> terminate the application.

Sigh, the more I learn about the details of android and it's violation
of all sane engineering principles the more I understand why you
invented a huge nail to push through all layers in order to bring the
system into idle at all. And yes, you need a sledge hammer to drive
that big nail through everything, so you are using the right tool.

Seriously, the cross app call goes through your framework, which
already knows, that the untrusted part is frozen. So it can deal
nicely with it in any way you want including unfreezing.

> > Up to that point, I really tried hard to step back from my initial
> > "OMG, promoting crap is a nono" reaction and work with you on a
> > sensible technical solution to confine crap and make it aligned with
> > other efforts in this area.
> >
> > So now, after I spent a reasonable amount of time (as you did) to
> > understand what your requirements are, you come up with another
> > restriction which is so outside of any level of sanity, that I'm at
> > the point of giving up and just going into NA

Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Florian Mickler wrote:
> On Sat, 5 Jun 2010 23:26:27 +0300
> Felipe Contreras  wrote:
> 
> > Supposing there's a perfect usage of suspend blockers from user-space
> > on current x86 platforms (in theory Android would have that), is the
> > benefit that big to consider this a strong argument in favor of
> > suspend blockers? Considering the small amount of x86 platforms using
> > Android (is there any?), the fact that nobody else will use suspend
> > blockers, and that x86 is already being fixed (as mentioned many times
> > before) so dynamic PM is possible, I don't think we should be
> > considering current x86 at all for suspend blockers.
> 
> A solution for the desktop to deprecate having to shut down the
> machine would not be that bad, wouldn;t it? (Why have I to shut down
> my machine anyway?) 
> 
> In my opinion such a decision (when to shutdown) has to be guided by
> userspace knowledge.
> Future x86 hardware is to be fixed (as I read in this discussion), so
> using suspend blockers could be the tool of choice. 
> 
> But alright. Let's be a little bit more focused on the present
> situation:
> 
> 1) There currently is no other solution.

Wrong. The solutions are there, just being refused.

> 2) It is a first stepping stone to bringing android to mainline.

Crap. Once we have an userspace API we are bound to it. Period. 

Also there is no technical reason why the whole driver stack cannot
be merged w/o the suspend blockers.

> 3) Any "perfect" solution will emerge anyway. As there are so many
> people with so strong opinions engaged in this discussion, I'm
> confident.

Yes, and that needs cooperation and compromises from both sides and
not just blindly defining that there is no other solution.

> 4) If there is a better solution, android will shurely adapt it as
> soon as it is there. 

Sure, after we accepted the crap there is no incentive at all.

Stop that advertising campaing already.

No thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Florian Mickler wrote:

> On Sat, 5 Jun 2010 23:24:40 +0200 (CEST)
> Thomas Gleixner  wrote:
> 
> > Stop that advertising campaing already.
> 
> Stop advertising that there is no problem. 
> 
> > 
> > No thanks,
> > 
> > tglx
> 
> Cheers,
> Flo
> 
> (Sorry, crossfire. Caused
> by you answering in the wrong subthread. I know that you are
> engineering and not dismissing the problem. This was pointed at
> Felipes "There is no problem"/"suspend blockers don't do what they are 
> designed to do well")

Welcome to my killfile.

 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: suspend blockers & Android integration

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hjønnevåg wrote:
> >> > Why is it a BUG in the trusted app, when I initiate a download and put
> >> > the phone down ?
> >> >
> >>
> >> It is not, but we have had bugs where a trusted app does not unblock
> >> suspend after some failure case where it is no longer making any
> >> progress.
> >
> > Well, that's simply an application bug which sucks battery with or
> > without suspend blockers. So it's unrelated to the freezing of
> > untrusted apps while a trusted app still works in the background
> > before allowing the machine to suspend.
> >
> 
> It is not unrelated if the trusted app has stopped working but still
> blocks suspend. The battery drains when you combine them.

What you are describing is a problem which is not solvable either way.
If you take the lock and do not release it you're not going to
suspend. I never claimed that any other mechanism resolves this.

But this is not related to the fact that freezing crap while running a
sane background task is going to save you power vs. an approach where
running a sane background task allows crap to consume power unconfined
until it is done.

> >> > That download might take a minute or two, but that's not an
> >> > justification for the crapplication to run unconfined and prevent
> >> > lower power states.
> >> >
> >>
> >> I agree, but this is not a simple problem to solve.
> >
> > Not with suspend blockers, but with cgroup confinement of crap, it's
> > straight forward.
> >
> 
> I don't think is is straight forward. If the a process in the frozen
> group holds a resource that a process in the unfrozen group needs, how
> do deal with that?

I'm going to fix the framework which puts the group into freeze state
w/o making sure that there is no held shared resource. Come on it's
not rocket science.
 
> >> Yes I don't know everything our user-space stack is doing, but I do
> >> know that it makes many calls between processes (and in both
> >> directions). As far as I know it uses timeouts when calling into
> >> untrusted code, so a misbehaving application will cause an error
> >> dialog to pop up asking if the user if it should wait longer or
> >> terminate the application.
> >
> > Sigh, the more I learn about the details of android and it's violation
> > of all sane engineering principles the more I understand why you
> > invented a huge nail to push through all layers in order to bring the
> > system into idle at all. And yes, you need a sledge hammer to drive
> > that big nail through everything, so you are using the right tool.
> >
> > Seriously, the cross app call goes through your framework, which
> > already knows, that the untrusted part is frozen. So it can deal
> > nicely with it in any way you want including unfreezing.
> 
> Cross app calls do not go through a central process.

It's not about a central process, it goes through your framework,
which should be able to deal with it. If not, it's a design failure
which needs to be fixed at the place where the failure happened.
 
> >>
> >> How can it be fixed? The user presses the back button, the framework
> >> determines that app A is in the foreground and send the key to app A,
> >> app A decides that it it does not have anything internal to go back to
> >> and tells the framework to switch back to the previous app. If the
> >> user presses the back key again, the framework does not know which app
> >> this key should go to until app A has finished processing the first
> >> key press.
> >
> > Errm, what has this to do with frozen apps? If your system is
> > handling input events then there are no frozen apps and even if they
> > are frozen your framework can unfreeze them _before_ talking to them.
> >
> > So which unfixable problem are you describing with the above example ?
> >
> 
> You are claiming that trusted code should not have any dependencies on
> untrusted code. I gave you a visible example of such a dependency and
> want you to tell me how you can avoid this dependency. Since you are
> claiming that our user-space framework is fundamentally broken if it
> has to wait for untrusted code, I don't think it is unreasonable for
> you to answer this. Or do you think it is valid to communicate with
> untrusted code when the screen is on but not when it is off.

It does not matter whether the screen is off or not. If you need to
call into that untrusted app from your trusted app and you know about
the might be frozen state then you can deal with it.

So taking your example:

Event happens and gets delivered to the framework

  framework selects A because it is in the foreground
  
  if (A is frozen)
 unfreeze(A)

  deliver_event_to(A)

It's that simple.

If your framework cannot deal with that simple problem then you have a
much more serious problem already.

Thanks,

tglx


Re: suspend blockers & Android integration

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arjan van de Ven wrote:

> On Sat, 5 Jun 2010 15:26:36 -0700
> Brian Swetland  wrote:
> 
> > 
> > I'm continually surprised by answers like this.  We run on hardware
> > that power gates very aggressively and draws in the neighborhood of
> > 1-2mA at the battery when in the lowest state (3-5mA while the radio
> > is connected to the network and paging).  Waking up out of that lowest
> > state and executing code every few seconds or (worse) several times a
> > second) will raise your average power consumption.  Being able to stay
> > parked at the very bottom for minutes or hours at a time when nothing
> > "interesting" is happening is very useful and can have a significant
> > impact on overall battery life.
> 
> It's relatively simple math.
> 
> If you wake up for a burst of work, you burn power at the higher level
> P1 (versus the lower power level P2), for, lets say an average time T,
> with a relatively small T (few milliseconds at most).
> 
> If you wake up X times per second (with X being a fractional number, so
> can be smaller than 1) the extra power consumption factor is
> 
>   X * T * P1
> ---
> X * T * P1 + (1.0 - X * T) * P2
> 
> if you draw a graph of this, for real values of P and T, there's a real
> point where you hit diminishing returns.
> 
> if say T is 5 milliseconds (that's a high amount), and X is 1
> wakeup/second, then there's already a 200:1 ratio in time an power.
> 
> If X goes to once every 10 seconds (not unreasonable, especially since
> any real device will pull email and stuff in the backgroudn), you have
> 2000:1 time and power ratios...
> 
> Unless your "on" power is insane high (and hopefully it's not, since
> you're not turning on the whole device obviously, you do selective
> power and clock gating)... that "divide by 200 or 2000" makes the whole
> problem go away.. in the "seconds" range for really low power devices.
> Not in "hours" range. 

That's the whole problem. Suspend blockers are a binary all on/off
approach so you waste power just to get the thing back to
"suspend". They unleash the world and some more just to put it back
into oblivion with brute force.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: suspend blockers & Android integration

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> > On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> >> >> > That download might take a minute or two, but that's not an
> >> >> > justification for the crapplication to run unconfined and prevent
> >> >> > lower power states.
> >> >> >
> >> >>
> >> >> I agree, but this is not a simple problem to solve.
> >> >
> >> > Not with suspend blockers, but with cgroup confinement of crap, it's
> >> > straight forward.
> >> >
> >>
> >> I don't think is is straight forward. If the a process in the frozen
> >> group holds a resource that a process in the unfrozen group needs, how
> >> do deal with that?
> >
> > I'm going to fix the framework which puts the group into freeze state
> > w/o making sure that there is no held shared resource. Come on it's
> > not rocket science.
> >
> 
> I'm not sure which framework you are talking about here, but I don't
> think there is a single framework that knows about all shared
> resources.

Damn, it's not me talking about "our framework", you are mentioning
when it fits your needs.

If you do not have a clearly defined user space framework, then we
talk about a completely random conglomeration of applications which
need to be brought into submission by some global brute force
approach.

I'm tired of this, really. You just use terminlology as it fits to
defend the complete design failure of android. But you fail to trick
me :)

Can you please explain in a consistent way how the application stack
and the underlying framework (which exists according to android docs)
is handling events and how the separation of trust level works ?

We need to know that, otherwise we turn in circles forever.

Thanks,

tglx

Re: suspend blockers & Android integration

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> >> > Well, that's simply an application bug which sucks battery with or
> >> > without suspend blockers. So it's unrelated to the freezing of
> >> > untrusted apps while a trusted app still works in the background
> >> > before allowing the machine to suspend.
> >> >
> >>
> >> It is not unrelated if the trusted app has stopped working but still
> >> blocks suspend. The battery drains when you combine them.
> >
> > What you are describing is a problem which is not solvable either way.
> > If you take the lock and do not release it you're not going to
> > suspend. I never claimed that any other mechanism resolves this.
> >
> Whether you claimed it or not, this is the only case where using
> cgroups would have a significant power saving over what we get with
> suspend. The trusted app is idle and the untrusted app is frozen, so
> we enter a low power mode from idle.

Nothing else was what I said and depending on the usage pattern this
can be significant. Just you converted a perfectly sensible technical
argument into a quibble about BUGs in applicatins which are not
confinable by defintion.
 
> > But this is not related to the fact that freezing crap while running a
> > sane background task is going to save you power vs. an approach where
> > running a sane background task allows crap to consume power unconfined
> > until it is done.
> >
> If the task that is blocking suspend is using the cpu anyway, then the
> bad app does not increase the power consumption nearly as much as if
> the task that blocked suspend is idle.

That's utter bullshit. If the app missed to release the supsend
blocker then your crappy "while(1);" app is killing you in no time,
while the same frozen crappy "while(1);" does no harm at all.
 
Thanks,

tglx

Re: suspend blockers & Android integration

2010-06-05 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> > On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> >> 2010/6/5 Thomas Gleixner :
> >> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hjønnevåg wrote:
> >> Cross app calls do not go through a central process.
> >
> > It's not about a central process, it goes through your framework,
> > which should be able to deal with it. If not, it's a design failure
> > which needs to be fixed at the place where the failure happened.
> >
> >> >>
> >> >> How can it be fixed? The user presses the back button, the framework
> >> >> determines that app A is in the foreground and send the key to app A,
> >> >> app A decides that it it does not have anything internal to go back to
> >> >> and tells the framework to switch back to the previous app. If the
> >> >> user presses the back key again, the framework does not know which app
> >> >> this key should go to until app A has finished processing the first
> >> >> key press.
> >> >
> >> > Errm, what has this to do with frozen apps? If your system is
> >> > handling input events then there are no frozen apps and even if they
> >> > are frozen your framework can unfreeze them _before_ talking to them.
> >> >
> >> > So which unfixable problem are you describing with the above example ?
> >> >
> >>
> >> You are claiming that trusted code should not have any dependencies on
> >> untrusted code. I gave you a visible example of such a dependency and
> >> want you to tell me how you can avoid this dependency. Since you are
> >> claiming that our user-space framework is fundamentally broken if it
> >> has to wait for untrusted code, I don't think it is unreasonable for
> >> you to answer this. Or do you think it is valid to communicate with
> >> untrusted code when the screen is on but not when it is off.
> >
> > It does not matter whether the screen is off or not. If you need to
> > call into that untrusted app from your trusted app and you know about
> > the might be frozen state then you can deal with it.
> >
> > So taking your example:
> >
> > Event happens and gets delivered to the framework
> >
> >      framework selects A because it is in the foreground
> >
> >      if (A is frozen)
> >         unfreeze(A)
> >
> >      deliver_event_to(A)
> >
> > It's that simple.
> >
> 
> That is too simple. You also have to prevent A from being frozen while
> it is processing the event or the result would be the same as if it
> was frozen beforehand.

The framework decides when to freeze the app in the first place (as
your framework does now when it decides to suspend)
 
 So it knows whether the app is frozen or not.

 So it knows damend well whether it processed the event or not.

Thanks,

tglx
 

Re: suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> > On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> >> That is too simple. You also have to prevent A from being frozen while
> >> it is processing the event or the result would be the same as if it
> >> was frozen beforehand.
> >
> > The framework decides when to freeze the app in the first place (as
> > your framework does now when it decides to suspend)
> >
> >     So it knows whether the app is frozen or not.
> >
> >     So it knows damend well whether it processed the event or not.
> >
> 
> Our user-space code is not single-threaded. So just because an app was
> not frozen when you checked does not mean it will remain unfrozen. We
> can use the same user-space wakelock api we have now to prevent
> freezing apps instead of preventing suspend, but we loose any
> advantage we get from freezing just a subset of processes this way.

Errm. That does not matter whether its single threaded or not. And
right, you have to prevent that it gets frozen while you are calling
into it.

But that does not change the fact that you can do finer grained power
control even in the case when suspend is impossible because a
background application has work to finish and does that without
requiring interaction with the frozen part.

That's what I pointed out in the first place and you just argue in
circles why it is impossible to do so.

Let me recapitulate:

   Full on state:   No difference because everything runs
   Full suspend state:  No difference because everything is down

   Screen off, background work active:

  Suspend blocker held by the active background work lets
  other applications which are unrelated consume CPU cycles
  and power.

  versus

  Frozen apps restrict the CPU cycles and power consumption to
  the background work (if there are no interactions with
  frozen tasks) and therefor save more power than the on/off
  approach.

  If your user space stack cannot be distangled that way, then
  it's a problem of your user space stack and not changing the
  fact, that a well designedd system allows you to do that.

Any objections ?

Thanks,

tglx








Re: suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> > On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> >> 2010/6/5 Thomas Gleixner :
> >> > On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> >> >> >> > That download might take a minute or two, but that's not an
> >> >> >> > justification for the crapplication to run unconfined and prevent
> >> >> >> > lower power states.
> >> >> >> >
> >> >> >>
> >> >> >> I agree, but this is not a simple problem to solve.
> >> >> >
> >> >> > Not with suspend blockers, but with cgroup confinement of crap, it's
> >> >> > straight forward.
> >> >> >
> >> >>
> >> >> I don't think is is straight forward. If the a process in the frozen
> >> >> group holds a resource that a process in the unfrozen group needs, how
> >> >> do deal with that?
> >> >
> >> > I'm going to fix the framework which puts the group into freeze state
> >> > w/o making sure that there is no held shared resource. Come on it's
> >> > not rocket science.
> >> >
> >>
> >> I'm not sure which framework you are talking about here, but I don't
> >> think there is a single framework that knows about all shared
> >> resources.
> >
> > Damn, it's not me talking about "our framework", you are mentioning
> > when it fits your needs.
> 
> You said you were going to fix the framework. I did know if you were
> talking about the cgroup framework, or the android user-space
> frameworks. I don't think either has knowledge about all shared
> resources.

The cgroup freezer makes sure that there are no in kernel resources
blocked. Of course the user space side has to do the same and it's not
rocket science.
 
> >
> > If you do not have a clearly defined user space framework, then we
> > talk about a completely random conglomeration of applications which
> > need to be brought into submission by some global brute force
> > approach.
> >
> > I'm tired of this, really. You just use terminlology as it fits to
> > defend the complete design failure of android. But you fail to trick
> > me :)
> >
> > Can you please explain in a consistent way how the application stack
> > and the underlying framework (which exists according to android docs)
> > is handling events and how the separation of trust level works ?
> >
> 
> I don't think I can, since I only know small parts of it. I know some

Sigh. That's the main reason why this discussion goes nowhere.

How in heavens sake can we make a decision whether suspend blockers
are the right and only way to go, when the people 

> events like input event go though a single thread in our system
> process, while other events like network packets (which are also
> wakeup events) goes directly to the app.


Re: suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
On Sat, 5 Jun 2010, Arve Hjønnevåg wrote:
> 2010/6/5 Thomas Gleixner :
> >
> > Can you please explain in a consistent way how the application stack
> > and the underlying framework (which exists according to android docs)
> > is handling events and how the separation of trust level works ?
> >
> 
> I don't think I can, since I only know small parts of it. I know some

Sigh, thats the whole reason why this discussion goes nowhere.

How in heavens sake should we be able to decide whether suspend
blockers are the right and only thing which solves a problem, when the
folks advocating suspend blockers are not able to explain the problem
in the first place ?

> events like input event go though a single thread in our system
> process, while other events like network packets (which are also
> wakeup events) goes directly to the app.

Yes, we know that already, but that's a completely useless information
as it does not describe the full constraints and dependencies.

Lemme summarize:

  Android needs suspend blockers, because it works, but cannot explain
  why it works and why it only works that way.

A brilliant argument to merge them - NOT.

Thanks,

tglx

Re: [linux-pm] suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
On Sun, 6 Jun 2010, James Bottomley wrote:
>
>  3. We've lost sight of one of the original goals, which was to
> bring the android tree close enough to the kernel so that the
> android downstream driver and board producers don't have to
> choose between the android kernel and vanilla kernel.

There are two ways to do that w/o creating a dependcy on anything.

1) merge the drivers w/o the suspend_blockers. It's not rocket science
   to have a patch which brings them back for android.

2) merge the drivers with empty stub implementations for annotation.
   android just has to patch in the real one.

While I'd prefer #1, I' not in the way of #2.

Both ways can get the drivers into the kernel and it could/should have
been done right from the beginning, but now we face a situation where
drivers are held hostage.

Then we can sit down more relaxed and fix the stuff in a way which
makes both sides happy. If we manage to replace them, we can deprecate
the stub implementation and remove it after a grace period. If we
rename them it's not an issue either. We can rename them right away to
a qos interface, but that does not really make a difference.

What we really want to avoid is implementing an user space contract in
a frenzy which binds us forever.

It's not the suspend_blockers which are the causing the nightmare,
it's solely the drivers itself especially when there are different
implementations in both trees. And frankly, the drivers in android are
not in a shape which makes them flood in within 2 weeks. That's
serious work to get them brushed up and polished. So that gives us
quite a period of time to solve the suspend problem.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
James,

On Sun, 6 Jun 2010, James Bottomley wrote:

> On Sun, 2010-06-06 at 17:46 +0200, Thomas Gleixner wrote:
> > On Sun, 6 Jun 2010, James Bottomley wrote:
> > >
> > >  3. We've lost sight of one of the original goals, which was to
> > > bring the android tree close enough to the kernel so that the
> > > android downstream driver and board producers don't have to
> > > choose between the android kernel and vanilla kernel.
> > 
> > There are two ways to do that w/o creating a dependcy on anything.
> > 
> > 1) merge the drivers w/o the suspend_blockers. It's not rocket science
> >to have a patch which brings them back for android.
> 
> Well, we sort of tried this when Greg pulled some of them into the
> staging tree.  The problem is that without the annotations, the drivers
> are still different, and patches won't apply, so, unsurprisingly, they
> didn't get improved or even maintained.
> 
> > 2) merge the drivers with empty stub implementations for annotation.
> >android just has to patch in the real one.
> 
> That's also possible.  This time, we would have a cosmetically closer
> tree ... however, what's in the kernel wouldn't be compilable for
> android ... which is where all the downstream wants to test, so they'd
> still be building for the android tree ... we just might have an easier
> time of it picking up their fixes.

The downstream users will be bound to the android tree anyway until
the full set of drivers for a given platform is completely merged. So
optimistically that would be 2.6.36, which gives us a couple of
months to sort out the whole thing.

Once a driver is merged mainline and the android tree switched over to
use the mainline version, fixes can be sent both ways and that's not a
real problem.

> > While I'd prefer #1, I' not in the way of #2.
> 
> I think 1 is unviable ... I'm not opposed to 2 but I'd like to try to
> get the kernel really closer to android before we go for the cosmetic
> only option.
> 
> > Both ways can get the drivers into the kernel and it could/should have
> > been done right from the beginning, but now we face a situation where
> > drivers are held hostage.
> > 
> > Then we can sit down more relaxed and fix the stuff in a way which
> > makes both sides happy. If we manage to replace them, we can deprecate
> > the stub implementation and remove it after a grace period. If we
> > rename them it's not an issue either. We can rename them right away to
> > a qos interface, but that does not really make a difference.
> > 
> > What we really want to avoid is implementing an user space contract in
> > a frenzy which binds us forever.
> 
> Well, that's why the QoS proposal ... it already has a userspace API ...
> we'd just be extending it for statistics, which looks like a wothwhile
> goal independent of android anyway.

Right, but there is no dependency for the driver merge on that.
 
> > It's not the suspend_blockers which are the causing the nightmare,
> > it's solely the drivers itself especially when there are different
> > implementations in both trees. And frankly, the drivers in android are
> > not in a shape which makes them flood in within 2 weeks. That's
> > serious work to get them brushed up and polished. So that gives us
> > quite a period of time to solve the suspend problem.
> 
> Right, so the sooner we make it easier for the drivers to use the kernel
> as their main repository, the better.

Yep, the fastest way is to provide two stub inlines in pm.h and let
the driver flood come in.

I think all of us involved in that can do with a break, where we sit
back, calm down and rethink w/o time pressure.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
On Sun, 6 Jun 2010, Brian Swetland wrote:

> On Sun, Jun 6, 2010 at 11:04 AM, Thomas Gleixner  wrote:
> >>
> >> Right, so the sooner we make it easier for the drivers to use the kernel
> >> as their main repository, the better.
> >
> > Yep, the fastest way is to provide two stub inlines in pm.h and let
> > the driver flood come in.
> 
> As mentioned previously, we didn't know this was an option (stubs
> without an implementation behind them).  If it is, and thus we can

That's what Greg did when he pulled stuff into staging, but there is
no reason not to do it outside of staging as well. We can simply put
the stub functions into Documentation/feature-removal.txt to ensure
that we don't forget about them :)

> simplify the driver merging process short-term while sorting out a
> long-term implementation or replacement for suspend blockers, then I
> think we're making real progress.

Yes, that way we do not lose the annotations. Replacing them, ripping
them out or whatever we agree on, is a nobrainer.

But it gets the drivers into the tree, so they are usable outside of
android as well and the delta between android and mainline shrinks
significantly.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] suspend blockers & Android integration

2010-06-06 Thread Thomas Gleixner
Brian,

On Sun, 6 Jun 2010, Brian Swetland wrote:

> On Sun, Jun 6, 2010 at 12:24 PM, Christoph Hellwig  wrote:
> >
> > Yes.  That's what makes me wonder about some parts of the discussion
> > here.  Getting the drivers for one or more of the android plattforms
> > in is not a problem.  I'd say it could have easily been done with the
> > manweeks spent arguing in this and related threads.
> >
> > The much bigger issues is to get android userspace running on a more
> > or less vanilla kernel, that is withoutthe  binder, without the rather
> > interesting group ID based security hack^H^H^H^Hmodel, without the
> > suspend blocker userspace API and so on, and so on.
> 
> Somebody will have to broker a deal with the frameworks/apps folks to
> get rid of the binder.  They like it a lot.  Of course if somebody
> built a drop-in replacement for the userspace side that didn't require
> a kernel driver, had the same performance characteristics, solved the
> same problems, etc, they could probably make an argument for it (or
> just provide it as a drop-in replacement for people who want a more
> "pure" linux underneath Android, even if we didn't pick it up).
> 
> The group ID stuff works incredibly well for gating device access --
> we ensure that devices that need access from various processes end up
> with perms like 0660 root audio (say for a raw audio interface), and
> then we assure that processes which have the "may use audio hardware"
> permission are executed with audio as an additional group.  We ended
> up using the same model to control socket, raw socket, and bt socket
> access because at the time we could not find a reasonable way to grant
> or exclude such permissions on a process by process basis.
> Maintaining about 20-30 lines of diffs to make that work was not a bad
> tradeoff (and we don't expect those patches to go upstream).  If
> there's a way to accomplish this without patching the kernel, we're
> all ears.
> 
> While we do maintain some assorted patches to core code (like the
> permissions thing), we strongly prefer to keep our work localized to
> drivers (generic to android or specific to hardware), and try to
> migrate to common kernel features when possible, submit patches (like
> work Mike Chan is doing around cpufreq and power measurement), etc.
> Patches to core code cause more headaches when we rebase up to new
> kernel versions, after all.

thanks for the honest answer and thanks to Christoph for the
reminder! 

That takes a lot of the bullshit arguments about downstream users
being hurt out of the discussion. The above problems are way more
complex to resolve than the suspend blocker details.

That's another prove why we can let the drivers flow in (in the worst
case w/o the suspend blocker stubs) and have no pressure to resolve
the suspend blocker problem yesterday.

That said, after thinking more about it, I'm advocating the stubs
solution with a clear removal / decision date constraint
(e.g. 2.6.37), as it forces all involved parties to stay tuned and not
to forget about it. I'm curious about the outcome :)

Thanks,

tglx

Re: [linux-pm] suspend blockers & Android integrationy

2010-06-06 Thread Thomas Gleixner
Alan,

On Sun, 6 Jun 2010, Alan Stern wrote:

> On Sun, 6 Jun 2010, Matthew Garrett wrote:
> 
> > The difference between idle-based suspend and opportunistic suspend is 
> > that the former will continue to wake up for timers and will never be 
> > entered if something is using CPU, whereas the latter will be entered 
> > whenever no suspend blocks are held. The problem with opportunistic 
> > suspend is that you might make the decision to suspend simultaneusly 
> > with a wakeup event being received. Suspend blocks facilitate 
> > synchronisation between the kernel and userspace to ensure that all such 
> > events have been consumed and handld appropriately.
> 
> Remember that suspend takes place in several phases, the first of which 
> is to freeze tasks.  The phases can be controlled individually by the 
> process carrying out a suspend, and there's nothing to prevent you from 
> stopping after the freezer phase.  Devices won't get powered down, but 
> Android uses aggressive runtime power management for its devices 
> anyway.
> 
> If you do this then the synchronization can be carried out entirely
> from userspace, with no need for kernel modifications such as suspend
> blockers. And since Android can reach essentially the same low-power
> state from idle as from suspend, it appears that they really don't need
> any kernel changes at all.

Well there are some things to solve:

 1) the confinement of untrusted apps 

 2) the confinement of trusted apps firing periodic timers. 

 Aside of that they need to shut off undesired interrupt sources,
 but that's not a real problem to solve and probably possible
 today already.

#1 can be solved elegantly by cgroups. We know how to freeze the
   cgroup timers when the need arises, but that's not a real problem
   as all periodic timers are firing only once when the target app
   does not handle them.

   Though the cgroup based approach of freezing timers might be
   interesting for containers c/r as well and it might be necessary to
   emulate the suspend semantics of freezing CLOCK_MONOTONIC, but
   that's simple to do and basically no overhead.

#2 is a tad harder, as it requires to fix the trusted apps not to fire
   timers when there is nothing to do.

   Though you can solve it with cgroups as well. The unfreeze problem
   for real wakeups can be solved as mhelsley pointed out somewhere
   else in this thread.

But that depends on user space changes 

Though as we learned today that suspend blockers are the least of the
problems which android is facing vs. mainline and we have a plan to
get the drivers in we can relax a bit and think more about it.

I'm not saying that we should ignore the shortcomings of todays code,
but in the face of hardware which perfectly goes into the same power
state from idle as it does from suspend we can IMNSHO safely ignore
the x86/ACPI/BIOS crap and the user space wreckage (see above) and
just focus on a sane design based on current/future hardware.

That's nothing new, we do not go and make NOHZ/HIGHRES work on crappy
hardware either, even if there have been patches around to do so. We
do not make broken TSCs work, even if there are brute force ways to do
so. 

It's more sane to say "Sorry, it does not work on your system" than
trying to make it work under all circumstances for a questionable
benefit and paying the price for it in terms of maintainability and/or
complexity.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] suspend blockers & Android integrationy

2010-06-07 Thread Thomas Gleixner
Alan,

On Mon, 7 Jun 2010, Alan Stern wrote:
> On Mon, 7 Jun 2010, Thomas Gleixner wrote:
> > #2 is a tad harder, as it requires to fix the trusted apps not to fire
> >timers when there is nothing to do.
> 
> No; all you have to do is handle the trusted apps as though they were 
> untrusted -- just as in the original wakelock approach.
> 
> >Though you can solve it with cgroups as well. The unfreeze problem
> >for real wakeups can be solved as mhelsley pointed out somewhere
> >else in this thread.
> > 
> > But that depends on user space changes 
> 
> If you handle all the apps uniformly, very few userspace changes are
> needed.

Oh, I see. I misunderstood you. -ENOTENOUGHSLEEP
 
Thanks,

Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.33-rc8-rt1 on Beagle

2010-03-01 Thread Thomas Gleixner
On Mon, 1 Mar 2010, Uwe Kleine-König wrote:
> You might want to report that to the author(s) of
> drivers/mmc/host/omap_hsmmc.c.

FYI, thats fixed in 33-rt3 and the fix is going mainline as well.

Thanks,

tglx

[PATCH] mmc: omap_hsmmc: Fix conditional locking

2010-03-01 Thread Thomas Gleixner
Conditional locking on (!in_interrupt()) is broken by design and there
is no reason to keep the host->irq_lock across the call to
mmc_request_done(). Also the host->protect_card magic hack does not
depend on the context

Fix the mess by dropping host->irq_lock before calling
mmc_request_done().

Signed-off-by: Thomas Gleixner 

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index 4b23225..99a3383 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -468,14 +468,6 @@ omap_hsmmc_start_command(struct omap_hsmmc_host *host, 
struct mmc_command *cmd,
if (host->use_dma)
cmdreg |= DMA_EN;
 
-   /*
-* In an interrupt context (i.e. STOP command), the spinlock is unlocked
-* by the interrupt handler, otherwise (i.e. for a new request) it is
-* unlocked here.
-*/
-   if (!in_interrupt())
-   spin_unlock_irqrestore(&host->irq_lock, host->flags);
-
OMAP_HSMMC_WRITE(host->base, ARG, cmd->arg);
OMAP_HSMMC_WRITE(host->base, CMD, cmdreg);
 }
@@ -506,7 +498,9 @@ omap_hsmmc_xfer_done(struct omap_hsmmc_host *host, struct 
mmc_data *data)
}
 
host->mrq = NULL;
+   spin_unlock(&host->irq_lock);
mmc_request_done(host->mmc, mrq);
+   spin_lock(&host->irq_lock);
return;
}
 
@@ -523,7 +517,9 @@ omap_hsmmc_xfer_done(struct omap_hsmmc_host *host, struct 
mmc_data *data)
 
if (!data->stop) {
host->mrq = NULL;
+   spin_unlock(&host->irq_lock);
mmc_request_done(host->mmc, data->mrq);
+   spin_lock(&host->irq_lock);
return;
}
omap_hsmmc_start_command(host, data->stop, NULL);
@@ -551,7 +547,9 @@ omap_hsmmc_cmd_done(struct omap_hsmmc_host *host, struct 
mmc_command *cmd)
}
if ((host->data == NULL && !host->response_busy) || cmd->error) {
host->mrq = NULL;
+   spin_unlock(&host->irq_lock);
mmc_request_done(host->mmc, cmd->mrq);
+   spin_lock(&host->irq_lock);
}
 }
 
@@ -1077,37 +1075,31 @@ static void omap_hsmmc_request(struct mmc_host *mmc, 
struct mmc_request *req)
struct omap_hsmmc_host *host = mmc_priv(mmc);
int err;
 
+   spin_lock_irqsave(&host->irq_lock, host->flags);
/*
-* Prevent races with the interrupt handler because of unexpected
-* interrupts, but not if we are already in interrupt context i.e.
-* retries.
+* Protect the card from I/O if there is a possibility
+* it can be removed.
 */
-   if (!in_interrupt()) {
-   spin_lock_irqsave(&host->irq_lock, host->flags);
-   /*
-* Protect the card from I/O if there is a possibility
-* it can be removed.
-*/
-   if (host->protect_card) {
-   if (host->reqs_blocked < 3) {
-   /*
-* Ensure the controller is left in a consistent
-* state by resetting the command and data state
-* machines.
-*/
-   omap_hsmmc_reset_controller_fsm(host, SRD);
-   omap_hsmmc_reset_controller_fsm(host, SRC);
-   host->reqs_blocked += 1;
-   }
-   req->cmd->error = -EBADF;
-   if (req->data)
-   req->data->error = -EBADF;
-   spin_unlock_irqrestore(&host->irq_lock, host->flags);
-   mmc_request_done(mmc, req);
-   return;
-   } else if (host->reqs_blocked)
-   host->reqs_blocked = 0;
-   }
+   if (host->protect_card) {
+   if (host->reqs_blocked < 3) {
+   /*
+* Ensure the controller is left in a consistent
+* state by resetting the command and data state
+* machines.
+*/
+   omap_hsmmc_reset_controller_fsm(host, SRD);
+   omap_hsmmc_reset_controller_fsm(host, SRC);
+   host->reqs_blocked += 1;
+   }
+   req->cmd->error = -EBADF;
+   if (req->data)
+   req->data->error = -EBADF;
+   spin_unlock_irqrestore(&host->irq_lock, host->flags);
+   mmc_request_done(mmc, req);
+   return;
+   } else if (host-&g

RE: [PATCH] mmc: omap_hsmmc: Fix conditional locking

2010-03-03 Thread Thomas Gleixner
On Tue, 2 Mar 2010, Madhusudhan wrote:
> > Conditional locking on (!in_interrupt()) is broken by design and there
> > is no reason to keep the host->irq_lock across the call to
> > mmc_request_done(). Also the host->protect_card magic hack does not
> > depend on the context
> > 
> 
> Can you please elaborate why the existing logic is broken?

Locks are only to be held to serialize data or state. 

The mmc_request_done() call does _NOT_ require that at all. So
dropping the lock there is the right thing to do.

Also conditional locking on in_interrupt() is generally a nono as it
relies on assumptions which are not necessarily true in all
circumstances. Just one simple example: interrupt threading will make
it explode nicely and it did already with the realtime patches
applied.

Such code constructs prevent us to do generic changes to the kernel
behaviour without any real good reason.

> It locks at the new request and unlocks just before issuing the cmd. Further
> IRQ handler has these calls hence the !in_interrupt check.

Aside of the conditional locking I have several issues with that code:

1) The code flow is massively unreadable:

   omap_hsmmc_start_command()
   {
.
if (!in_interrupt())
   spin_unlock_irq();
   }
 
   omap_hsmmc_request()
   {
if (!in_interrupt())
   spin_lock_irq();

omap_hsmmc_start_command();
   }

We generally want to see the lock/unlock pairs in one function and not
having to figure out where the heck unlock happens.

2) The point of unlocking is patently wrong

   omap_hsmmc_start_command()
   {
.
if (!in_interrupt())
   spin_unlock_irq();
--->
OMAP_HSMMC_WRITE(host->base, ARG, cmd->arg);
--->
OMAP_HSMMC_WRITE(host->base, CMD, cmdreg);
   }

What happens, if you get a spurious interrupt here ? Same for SMP,
though you are probably protected by the core mmc code request
serialization there.

> How does this patch improve that? In fact with your patch for a data
> transfer cmd there are several lock-unlock calls.

1) The patch simply removes conditional locking and moves the lock
   sections to those places which protect something. Aside of that it
   makes the code easier to understand.

2) What's the point of not having those lock/unlocks ? On UP the
   spinlock is a NOOP anyway, so you won't even notice. On SMP you
   won't notice either, simply because the lock is cache hot and
   almost never contended.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-30 Thread Thomas Gleixner
On Wed, 30 Mar 2011, Linus Torvalds wrote:
> On Wed, Mar 30, 2011 at 10:06 AM, Arnd Bergmann  wrote:
> >
> > I'm still new to the ARM world, but I think one real problem is the way
> > that all platforms have their own trees with a very flat hierarchy --
> > a lot of people directly ask Linus to pull their trees, and the main
> > way to sort out conflicts is linux-next. The number of platforms in the
> > ARM arch is still increasing, so I assume that this only gets worse.
> 
> Because as far as I can tell, most of that board support really is
> about crazy details that the kernel shouldn't even care about. Come up
> with a table that describes them, have one common parsing routine, and
> push the table into a bootloader. And get rid of having to add a board
> file for every crazy random piece of hardware that nobody really cares
> about.

There is effort on the way to address that with device tree support,
but that wont solve the other problem Arnd mentioned.

Let me phrase it different.

The main problem is NOT that these things conflict, the main problem -
and I can tell you after working through all that irq/gpio/mfd
sh*tpile - is that these subarchs start a life on their own and find
tons of creative ways to work around shortcomings of infrastructure
code up to the point where infrastructure code cannot be changed
anymore w/o breaking the world and some more. There is a f*cking good
reason why I made myself run through all that horror. These
shortcomings are partially real, but most of the time the failure is
on those folks simply because they do not understand how it works. If
the shortcoming is real they fail to talk to the infrastructure
maintainers and just hack something which boots.

The ARM core code and CPU/TLB/CACHE abnominations handling which is in
Russell's hands is working very well. Piling the babysitting of
sub-arch support onto Russell as well simply cannot scale, as the
whole madness of inconsistency of the ARM core architecture itself is
a full time job on it's own.

Watching the rapidly increasing number of SoCs which are spilling out
in the ARM ecosystem and their totaly non-architected "glue together
random IP cores" philosophy, I' convinced that we need a full-time
gatekeeper who babysits the subarch stuff and keeps an eye on those
ever repeating failure patterns and works on resolving them.

The only problem is to find a person, who is willing to do that, has
enough experience, broad shoulders and a strong accepted voice. Not to
talk about finding someone who is willing to pay a large enough
compensation for pain and suffering.

This is getting worse now as there seems to be a strong incentive to
get all that vendor BSP crap into mainline and I did a couple of
reviews in the last months which were more than frustrating. Running
through a 10 rounds review for a 200 lines driver is not really
encouraging - though at least the review prevented that a 1000 lines
horror crap got merged. I don't blame those people too much as they
have been thrown into Linux development after dealing with black hole
OS cores for years and therefor being spoiled with the thought of
"work around the failure / missing feature" somehow w/o the ability to
talk to anyone about it. That's a system failure of the established
commercial OS world and it will take some time to show those people
that it can be solved different. But that takes quite some manpower.

So one person will be not enough, that needs to be a whole team of
experienced people in the very near future to deal with the massive
tsunami of crap which is targeted at mainline. If we fail to set that
up, then we run into a very ugly maintainability issue in no time.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-30 Thread Thomas Gleixner
On Wed, 30 Mar 2011, Tony Lindgren wrote:

> * Thomas Gleixner  [110330 14:07]:
> > 
> > So one person will be not enough, that needs to be a whole team of
> > experienced people in the very near future to deal with the massive
> > tsunami of crap which is targeted at mainline. If we fail to set that
> > up, then we run into a very ugly maintainability issue in no time.
> 
> One thing that will help here and distribute the load is to move
> more things under drivers/ as then we have more maintainers looking
> at the code.

Guess what's that going to solve? Nothing, nada.

Really, you move the problem to people who are not prepared to deal
with the wave either. So what's the gain?

FYI, lots of the wreckage I observed regarding irq stuff was in
drivers/*

My statement stays the same. In whatever playground you try to shift
that problem in it's not going to scale the way it is right now,

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-30 Thread Thomas Gleixner
On Wed, 30 Mar 2011, Tony Lindgren wrote:

> * Thomas Gleixner  [110330 15:22]:
> > On Wed, 30 Mar 2011, Tony Lindgren wrote:
> > 
> > > * Thomas Gleixner  [110330 14:07]:
> > > > 
> > > > So one person will be not enough, that needs to be a whole team of
> > > > experienced people in the very near future to deal with the massive
> > > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > > up, then we run into a very ugly maintainability issue in no time.
> > > 
> > > One thing that will help here and distribute the load is to move
> > > more things under drivers/ as then we have more maintainers looking
> > > at the code.
> > 
> > Guess what's that going to solve? Nothing, nada.
> > 
> > Really, you move the problem to people who are not prepared to deal
> > with the wave either. So what's the gain?
> 
> I guess my point is that with creating more common frameworks people
> will be using common code. Some examples that come to mind are clock
> framework, gpiolib, dma engine, runtime PM and so on.

For all that to happen you need a really experienced team with a
strong team lead to fight that through and go through the existing
horror while dealing  with the incoming flood at the same time.

See commit 9ad198cb for illustration.

Sigh, I fought that battle for a couple of month to deal with shite
coming in faster than you can fix it and everyone ignoring it.

Thanks,

tglx



--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-30 Thread Thomas Gleixner
On Wed, 30 Mar 2011, Tony Lindgren wrote:

> * Paul E. McKenney  [110330 15:35]:
> > On Wed, Mar 30, 2011 at 02:54:35PM -0700, Tony Lindgren wrote:
> > > * Thomas Gleixner  [110330 14:07]:
> > > > 
> > > > So one person will be not enough, that needs to be a whole team of
> > > > experienced people in the very near future to deal with the massive
> > > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > > up, then we run into a very ugly maintainability issue in no time.
> > > 
> > > One thing that will help here and distribute the load is to move
> > > more things under drivers/ as then we have more maintainers looking
> > > at the code.
> > 
> > In many cases, the ARM SoC vendors will want their people producing the
> > code, so although moving things to drivers might be a good thing to do,
> > it won't really increase the number of people involved.  Plus the move
> > to the drivers subtree would be a problem for devices with tight ties
> > to the board or SoC.
> > 
> > There is work on pushing towards common code, but there is a lot of code
> > and this will take time and a lot of work.
> 
> I agree on the common code part, then even drivers with tight
> ties to board or SoC become just generic drivers that are easy
> to review.

You wish. There is an already existing problem that the identical IP
cores of peripheral crap are reused accross architectures. And of
course because it is a different architecture we have two different
drivers with different issues.

See: http://marc.info/?l=linux-kernel&m=130041568128164

We already fail to detect this on the driver level, so please answer
the question I asked before: How do you spread the load and scale with
the amount of shite which is coming in?

The above example is probably not the only one in tree and we will see
lots of unnoticed instances of drivers dealing with minimal different
versions of the same IP crappola in the near future simply because the
vendors claim that their stuff is unique and only works with their
particular instance of hackery unless we have enough capable people to
look over this. Whether it's in arch/ or drivers/ it does not
matter. We are simply not prepared to the amount of crap coming in.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
B1;2401;0cOn Thu, 31 Mar 2011, Nicolas Pitre wrote:
> On Wed, 30 Mar 2011, Linus Torvalds wrote:
> > And ARM fanbois can say "oh, but arm is special" all they want, but
> > they need to realize that the lack of common platform for ARM is a
> > real major issue. It's not a "feature", and I'm sorry, but anybody who
> > calls x86 "peanuts" is a moron and should be ashamed of himself.
> > Instead of trying to feel superior, those people should feel like
> > pariah.
> 
> Oh come on.  You just provided actual numbers above showing that ARM is 
> simply fscked up (your words) compared to X86.  I would be curious to 
> know what people like tglx who did significant work on both 
> architectures actually think of X86 relative to ARM when it comes to 
> kernel maintenance.

To be honest both suck in their own way. The only reason why x86 is
slightly less horrible is the fact that it's better architectured at
the hardware level.

But I see the same mess coming in with all those wonderful Atom based
SoCs on the horizon, which are nothing else than any other random ARM
SoC, just that they glue an x86 core into the same cheepo random IP
peripherals conglomerate. In fact some of those chip have been ARM
powered before they got an x86 injected.

And worse: the Intel folks went there and wrote a new driver for an IP
block which had already an "ARM associated" driver.

So I say that it is not only an ARM problem, it's a general problem
that people do not realize that the IP cores are reused all over the
place and across architectures.

I'm pretty sure after I went through all the irq code recently that
lots of those ARM SoCs from vendors across the board could share a lot
of driver code if someone would actually sit down and analyse the
situation. Right now we have nobody who has the time and the stomach
to go through this and at the same time prevent that more copied
crappola is hitting the tree.

I'm sure that device tree is part of the solution, but that only helps
if we find a way to prevent duplicate drivers in the first place.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:

> On Thu, Mar 31, 2011 at 10:06:34AM +0200, Ingo Molnar wrote:
> > Having strong, effective platform abstractions inside the kernel really 
> > helps 
> > even if the hardware space itself is inevitably fragmented: both powerpc 
> > and 
> > x86 has shown that. Until you realize and appreciate that you really have 
> > not 
> > understood the problem i think.
> 
> No, I think it is the other way around.  Folk like me and Nicolas over
> the last ten years have put considerable amounts of effort into trying
> to keep the ARM support code as clean and maintainable as possible.
> 
> That is true of the common ARM stuff, but there's no way we can do this
> for all SoC support - there aren't the hours in the day to provide such

That's what I said. You and Nicholas wont scale.

> a wide oversight.  That's why we have SoC maintainers, and the SoC
> maintainers have the responsibility to sort out their own sub-trees.

But the current SoC maintainer model does not work either. The SoC
maintainers care about their sandbox and have exactly zero incentive
to look at the overall picture, e.g reuse of code for the same IP
blocks, better abstraction mechanisms etc. 

Therefor you need a team of experienced kernel developers which are
NOT associated with a particular vendor who are able to tame that SoC
crowd and work closely with you and Nicholas to keep stuff in sync.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Wed, 30 Mar 2011, Linus Torvalds wrote:

> On Wed, Mar 30, 2011 at 6:15 PM, Bill Gatliff  wrote:
> >
> > I'm not sure this metric is completely fair to ARM.  If you want to
> > level the field, I think you have to divide each result by the number
> > of SoC's
> 
> But that's the problem with ARM. Hardware companies that do one-off
> stuff, with no sense of compatibility.
> 
> And calling it an "opportunity" is just stupid.
> 
> There's nothing good about causing extra work just because ARM hasn't
> had the sense to standardize on one (or a couple) of interrupt
> controllers etc.

Well, ARM is not the only one there, it's the top one, but mips and
power are not less crazy. Here are the top five architectures counted
in number of irq_chip implementations.

  arch/arm 139
  arch/mips75
  arch/powerpc 44
  arch/alpha   21
  arch/x86 15

Thanks,

tglx

Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:
> And I'm not going to be merging anything into my tree for the time being.
> I know there's no way for me to continue without being moaned at by someone.
> So I'm just going to take the easy option at the moment and do precisely
> nothing in terms of queueing patches until something gets resolved one way
> or the other.  I'm not even going to review any patches because I currently
> see it as a total waste of time - I've no idea whether they'll stand any
> chance what so ever of making it into mainline.
> 
> What's the way out of this?  I've no idea.  Can ARM continue being part
> of the mainline kernel?  I've no idea.  Will we be ripping out all the
> ARM platform code from the mainline kernel?  I've no idea.

Moving arm out of mainline would be a complete disaster.
 
> I am now completely demotivated.

I can understand that, but I have to say again, that you do a pretty
damned good job on keeping the ARM core code clean and I always enjoy
working with you when infrastructure I'm working on needs to deal with
ARM requirements. We disagree from time to time, but I don't remember
a single incident where we could not resolve that on a technical base.

It's not your fault, that you can't clone yourself 10 times to deal
with the subarch flood. It's also not your fault that the subarch
maintainers tend not to look over the brim of their SoC sandbox and
figure out how to solve the big picture.

So no, ARM needs to stay and the ARM crowd needs to find competent
people who babysit the SoC crowd and work with you to get the big
picture straight. Something like this should have been done _before_
the wave of ARM spilling out of every other silicon shop became
tsunami sized. But not doing it at all will not solve anything.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Kevin Hilman wrote:

> Thomas Gleixner  writes:
> 
> > But the current SoC maintainer model does not work either. The SoC
> > maintainers care about their sandbox and have exactly zero incentive
> > to look at the overall picture, e.g reuse of code for the same IP
> > blocks, better abstraction mechanisms etc. 
> 
> zero incentive?  that's a bit strong, IMO.
> 
> That may be true for some SoCs, it's not really fair as a sweeping
> statement.

Fair enough, but it's the perception in general.

> Conference (ELC, US and Europe.)  Especially as IP blocks are reused
> across SoC families, abstractions like runtime PM are the only way to
> keep the SoC specifics of PM out of the common driver.

Right, I know that these things happen, but at the same time the sheer
amount of stuff flowing in makes it hard that these infrastructure
stuff really works out. And we are only at the beginning of the big
shuffle "code in to mainline" game.

After cleaning up the whole irq stuff across the tree I can tell you,
that the mess is non-linear growing with the number of instances.

You can see the patterns which are:
- copy and paste
- introduce different bugs
- add more abuse

That's what I'm really concerned about.

> Yes, ARM SoC maintainers have to make up some ground.  But compare this
> to just a couple years ago where the common complaint was "why aren't
> embedded SoC people contributing code to mainline", and you'll see we
> have come a long way.

Well, code comes in, which is progress. But we need to figure out how
to deal with the increasingly growing flood before we drown in it.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Arnd Bergmann wrote:

> On Thursday 31 March 2011, Kevin Hilman wrote:
> > Some SoCs families (like OMAP) have huge amount of diversity even within
> > the SoC family, so better abstractions and generic infrastrucure
> > improvements are an obvious win, even staying within the SoC.
> 
> But that's the point. The incentive is there for managing the infrastructure
> within the SoC, but not across SoCs. Allow me to use OMAP as a bad example
> while pointing out that it's really one of the best supported platforms
> we currently have, while the others are usually much worse in terms of
> working with the community (or at least they are behind on the learning
> curve but getting there):
> 
> * OMAP2 introduced the hwmod concept as an attempt to reduce duplication
>   between board code, but the code was done on the mach-omap2 level
>   instead of finding a way to make it work across SOC vendors, or using
>   an existing solution.
> 
> * The IOMMU code in omap2 duplicates the API we have in the common kernel,
>   with slight differences, instead of using the existing code, making it
>   impossible to share a driver between SOC families.
> 
> * The ti-st code duplicates parts of the bluetooth layer (apparently
>   that is getting fixed soon).
> 
> * The DSS display drivers introduce new infrastructure include new bus
>   types that have the complexity to make them completely generic, but
>   in practice can only work on OMAP, and are clearly not written with
>   cross-vendor abstractions in mind.

Right, but the problem starts in way simpler areas like irq chips and
gpio stuff, where lots of the IP cores are similar and trivial enough
to be shared across many SoC families.

Even the OMAP "consolidated" code is silly:

static void _set_gpio_dataout(struct gpio_bank *bank, int gpio, int enable)
{
void __iomem *reg = bank->base;
u32 l = 0;

switch (bank->method) {
#ifdef CONFIG_ARCH_OMAP1
case METHOD_MPUIO:
reg += OMAP_MPUIO_OUTPUT / bank->stride;
l = __raw_readl(reg);
if (enable)
l |= 1 << gpio;
else
l &= ~(1 << gpio);
break;
#endif
#ifdef CONFIG_ARCH_OMAP15XX
case METHOD_GPIO_1510:
reg += OMAP1510_GPIO_DATA_OUTPUT;
l = __raw_readl(reg);
if (enable)
l |= 1 << gpio;
else
l &= ~(1 << gpio);
break;
#endif
#ifdef CONFIG_ARCH_OMAP16XX
case METHOD_GPIO_1610:
if (enable)
reg += OMAP1610_GPIO_SET_DATAOUT;
else
reg += OMAP1610_GPIO_CLEAR_DATAOUT;
l = 1 << gpio;
break;
#endif
#if defined(CONFIG_ARCH_OMAP730) || defined(CONFIG_ARCH_OMAP850)
case METHOD_GPIO_7XX:
reg += OMAP7XX_GPIO_DATA_OUTPUT;
l = __raw_readl(reg);
if (enable)
l |= 1 << gpio;
else
l &= ~(1 << gpio);
break;
#endif
#if defined(CONFIG_ARCH_OMAP2) || defined(CONFIG_ARCH_OMAP3)
case METHOD_GPIO_24XX:
if (enable)
reg += OMAP24XX_GPIO_SETDATAOUT;
else
reg += OMAP24XX_GPIO_CLEARDATAOUT;
l = 1 << gpio;
break;
#endif
#ifdef CONFIG_ARCH_OMAP4
case METHOD_GPIO_44XX:
if (enable)
reg += OMAP4_GPIO_SETDATAOUT;
else
reg += OMAP4_GPIO_CLEARDATAOUT;
l = 1 << gpio;
break;
#endif
default:
WARN_ON(1);
return;
}
__raw_writel(l, reg);
}

So we have 2 types of sections:

#1
  data = read_reg();
  if (enable)
 data |= bit;
  else
 data &= ~bit;
  write_reg(data);

#2
  if (enable)
 write_enable_reg(bit);
  else
 write_disable_reg(bit);

But the code above has 6 cases in the switch because nobody abstracted
it out consequently. Not to talk about the ifdef mess.

So now look at tons of other gpio implementations all over the DOZENS
of ARM plat-/mach- space and guess what.

Most have either type #1 or type #2 just slightly different copied,
less or better abstracted and I'm pretty damned sure, that you could
consolidate all that stuff into a handful or even less drivers which
provide the code across the board.

Same for irq chips. Most of these gpio things have callbacks which do:

irq_xxx(struct irq_data *d)
{
gpio = irq_data_get_irq_chip_data(d);
irq = d->irq - gpio->base_irq;
reg = convert_to_reg(gpio, irq);
mask = convert_to_mask(gpio);

write(reg, mask);
}

I saw all those incarnations lately and you can boil them down to a
handful or less as

Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Nicolas Pitre wrote:
> On Thu, 31 Mar 2011, da...@lang.hm wrote:
> 
> > I think that part of the issue is that when Linus points out a problem, the
> > response isn't "we agree and are working on it, here's what we are doing",
> > instead it seems to be mostly "there is no problem, this is just because 
> > there
> > is so much variation in ARM"
>
> If prominent people looking at this from the side line continue bashing 
> at those who are both feet in the mud trying to contain the flood rather 
> than actually helping then nothing will change.  Instead this only 
> creates despair and the splashed people may simply decide to throw in 
> the towel, at which point things will collapse for real.  In reality, 
> the system has been going as it is for quite a while and with more or 
> less the same level of intensity.  And the fact is that _users_ of the 
> ARM kernel are not complaining.  Things are far from being perfect, but 
> so far things have been "good enough" for the majority of the people 
> involved, and improvements are constantly being worked on with the men 
> power available.

And that's the whole point why I was ranting in the first place. I
know that there are clever folks working on the solution, but it's
entirely clear to me, that they are simply not enough compared to the
massive inbound flood. So neither you nor Russell can cope with it,
you simply do not scale. That's why I suggested that the ARM community
needs to push competent man power into this.

You say the concept of subarch maintainers is working quite well. That
depends on the definition of working. It works in terms of users can
use it, but it does not work from a maintainability POV.

Nobody wants to bash on those who are working on it, but IMNSHO the
current way is running into an utter nightmare even w/o you and
Russell throwing in the towel.

I went through quite a few iterations of large scale cleanups, so I
know how you feel.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Nicolas Pitre wrote:

> On Thu, 31 Mar 2011, Thomas Gleixner wrote:
> 
> > Start off with such a trivial, but immense effective cleanup and see
> > what it helps to share code even accross SoC vendors. They all glue
> > together random IP blocks from the market and there are not soo many
> > sources which are relevant. This makes sense in all aspects:
> > 
> >   1) less and better code
> >   2) faster setup for new SoCs
> >   3) shared benefit for all vendors
> 
> If this was always true.  Someone commented on the fact that the IP 
> block providing USB on OMAP is shared with a couple other platforms.  
> But about 2600 lines of pure glue is still necessary around the common 
> driver to make it work for everyone.  I'm not saying that separate 
> drivers are called for here, simply that hardware people _will_ screw it 
> up, especially when they are hooking it up to a non-standard 
> SOC-specific bus.

Right. That's a problem, but we should not ignore the places where
reusing stuff is easy possible. And making good examples out of it.

And it really _IS_ worth the trouble. Look at the git log of
drivers/spi/pxa2xx* . We could have slapped the other "x86" driver
into spi, but that does not make any sense from a software engineering
and maintainability POV. And it would have been more work in the end
to cleanup the separate driver than isolating the existing one and
reuse it.

This is a sustainability issue. And we need to become more clever
about identifying the places where we can abstract stuff into shared
drivers and infrastructure when we want to sustain Linux for another
few decades.

> And what would the hardware guys tell you?  That software is cheap.

If you can prove with simple examples that using existing software
removes 6 month of useless reinventing the wheel and another 6 month
of testing plus the fight with the kernel folks, then eventually they
start to listen as you can express this in $$$.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] omap changes for v2.6.39 merge window

2011-03-31 Thread Thomas Gleixner
On Thu, 31 Mar 2011, Nicolas Pitre wrote:
> On Thu, 31 Mar 2011, Linus Torvalds wrote:
> > What I'm _not_ seeing is a lot of cross-platform maintenance or sense
> > of people trying to reign things in and look for solutions to the
> > proliferation of random stupid and mindless platform code.
> 
> I do that, Russell does that, Catalin does that, Tony does that, and 
> maybe less than a handful of other important people I'm not listing 
> (sorry).  But clearly we are far far from being enough people doing that 
> kind of work.  And the fact is that the volume of ARM platform code is 
> steadily being produced at a rate far surpassing X86, and even higher 
> than all the other architectures put together.  Linaro is trying to help 
> here, but Linaro cannot conjure the needed experience and knowledge for 
> that kind of work with a magic wand.
> 
> So we need help!  If core kernel people could get off their X86 stool 
> and get down in the ARM mud to help sort out this mess that would be 
> really nice (thanks tglx).  Until then all that the few of us can do is 

The main help we can give (aside of actually looking at code and
concepts) is to feed back the experience we have with massive cleanups
and which mechanisms work and which not and - at least I can speak for
myself - to stand at your side when it comes to pushing that through.

One thing what really helps to force people to get their act together
is that you as maintainers identify trouble spots which are easily
addressable. Then put those spots on your list and require people who
want to submit new features in that area to cleanup the mess first and
then put their new feature thing on top. We successfully used this to
get the unification work after the mechanical i386/x86_64 merger
done. It requires a lot of stubborness, but it reduces the work burden
of the maintainers a lot.

Once you have the easy spots addressed, that could be device tree
stuff, gpio, irq_chips for the beginning and see how it works out,
then go steps further.

The important point here is from my POV that you put down the
requirements with no wiggle room and just ignore the "oh it could be
done better" whining. Either people come up with a patch which solves
the whole issue better or they just will cope. But when you have
consolidated stuff then you can and need to look from a high level
perspective and refine the infrastructure.

I really regret in hindsight, that I did not enforce the cleanup of
the irq layer way earlier and that I did not see the abuse of it early
enough. At some point I realized that being polite is the wrong
solution, so I forced myself to push this cleanup through. That's an
experience which I don't wish anybody else to make. Especially because
as long as the oldstyle stuff works oldstyle crap comes in faster than
you can fix it. See commit 9ad198c. And it's a massive effort to do
something which results in:

Total patches: 414
Total files touched:   702  
Total insertions: 6805
Total deletions:  8632
Lines added  -1827

And that massive effort is just because you cannot break stuff after
the fact in a massive scale. The whole thing introduced a mere 6
patches fixup fallout which were applied one day after rc1. It could
have been avoided, but 

So yes, we can and will help with advise, a certain amount of review
(especially on the concept level) and giving you all the support you
need to fight that trough.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/25] OMAP4: PM: Add WakeupGen module as OMAP gic_arch_extn

2011-09-09 Thread Thomas Gleixner
On Fri, 9 Sep 2011, Santosh wrote:

> On Thursday 08 September 2011 11:57 PM, Kevin Hilman wrote:
> > Santosh Shilimkar  writes:
> > 
> > > OMAP WakeupGen is the interrupt controller extension used along
> > > with ARM GIC to wake the CPU out from low power states on
> > > external interrupts.
> > > 
> > > The WakeupGen unit is responsible for generating wakeup event
> > > from the incoming interrupts and enable bits. It is implemented
> > > in MPU always ON power domain. During normal operation,
> > > WakeupGen delivers external interrupts directly to the GIC.
> > > 
> > > Signed-off-by: Santosh Shilimkar
> > > Cc: Kevin Hilman
> > 
> > [...]
> > 
> > > +#ifdef CONFIG_PM
> > > +/*
> > > + * Masking wakeup irqs is handled by the IRQCHIP_MASK_ON_SUSPEND flag,
> > > + * so no action is necessary in set_wake, but implement an empty handler
> > > + * here to prevent enable_irq_wake() returning an error.
> > > + * FIXME: Remove the dummy handler once gen irq code fix above.
> > > + */
> > 
> > Just curious... is there a fix for this pending for v3.2?
> > 
> I have sent a proposed change to Thomas but have not seen
> any response yet from him.
> 
> http://www.spinics.net/lists/arm-kernel/msg134297.html

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 0a7840a..cd4bc01 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c

@@ -467,6 +467,9 @@ static int set_irq_wake_real(unsigned int irq, unsigned int 
on)

   struct irq_desc *desc = irq_to_desc(irq);
   int ret = -ENXIO;

+  if (irq_desc_get_chip(desc)->flags & IRQCHIP_MASK_ON_SUSPEND)
+ return 0;
+
if (desc->irq_data.chip->irq_set_wake)
   ret = desc->irq_data.chip->irq_set_wake(&desc->irq_data, on);

The flag says: MASK ON SUSPEND and it does not imply that you don't
need a wake function. There might be cases where you want to setup
stuff in that function in order to have the wakeup happen on that
interrupt line despite of the mask on suspend.

We either need a separate flag or a global dummy set_wake function in
the core to avoid empty copies all over the place.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/25] OMAP4: PM: Add WakeupGen module as OMAP gic_arch_extn

2011-09-09 Thread Thomas Gleixner
On Fri, 9 Sep 2011, Santosh wrote:
> On Friday 09 September 2011 12:49 PM, Thomas Gleixner wrote:
> > 
> > The flag says: MASK ON SUSPEND and it does not imply that you don't
> > need a wake function. There might be cases where you want to setup
> > stuff in that function in order to have the wakeup happen on that
> > interrupt line despite of the mask on suspend.
> > 
> I see your point.
> 
> > We either need a separate flag or a global dummy set_wake function in
> > the core to avoid empty copies all over the place.
> > 
> A flag is probably better since you mentioned that on some arch, there
> might be need to have actual set_wake() handler. Or if the global
> dummy can be over-ridden by platform, that's fine too.

Global dummy would mean:

int irq_set_wake_dummy(...)
{
return 0;
}

And you just add it to your chip, but either way I don't care whether
it's a dummy function or a flag.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/25] OMAP4: PM: Add WakeupGen module as OMAP gic_arch_extn

2011-09-12 Thread Thomas Gleixner
On Fri, 9 Sep 2011, Santosh wrote:

> On Friday 09 September 2011 01:48 PM, Thomas Gleixner wrote:
> > On Fri, 9 Sep 2011, Santosh wrote:
> > > On Friday 09 September 2011 12:49 PM, Thomas Gleixner wrote:
> > > > 
> > > > The flag says: MASK ON SUSPEND and it does not imply that you don't
> > > > need a wake function. There might be cases where you want to setup
> > > > stuff in that function in order to have the wakeup happen on that
> > > > interrupt line despite of the mask on suspend.
> > > > 
> > > I see your point.
> > > 
> > > > We either need a separate flag or a global dummy set_wake function in
> > > > the core to avoid empty copies all over the place.
> > > > 
> > > A flag is probably better since you mentioned that on some arch, there
> > > might be need to have actual set_wake() handler. Or if the global
> > > dummy can be over-ridden by platform, that's fine too.
> > 
> > Global dummy would mean:
> > 
> > int irq_set_wake_dummy(...)
> > {
> > return 0;
> > }
> > 
> > And you just add it to your chip, but either way I don't care whether
> > it's a dummy function or a flag.
> > 
> Will below patch work for you then ?
> Attaching the same in case mailer damages it.

It does :)
 
> Regards,
> Santosh
> 
> From d63d4347dc8fb144b19f4d4e7c0621397cccea94 Mon Sep 17 00:00:00 2001
> From: Santosh Shilimkar 
> Date: Fri, 9 Sep 2011 13:59:35 +0530
> Subject: [PATCH] irq: Add IRQCHIP_SKIP_SET_WAKE flag to avoid need of dummy
> set_wake() handler.
> 
> Certain IRQCHIP's may not need to install the irq_set_wake() handler if
> the IRQCHIP_MASK_ON_SUSPEND flag is set. But but if it's not implemented,
> enable_irq_wake() will return an error.  That needs the IRQCHIP to install
> an empty set_wake handler.
> 
> Add an 'IRQCHIP_SKIP_SET_WAKE' flag so that IRQCHIP can inform the core
> irq code that irq_set_wake() handler is not necessary.
> 
> Signed-off-by: Santosh Shilimkar 
> Cc: Thomas Gleixner 

Queued to irq/core for 3.2

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ARM SoC tree: OMAP PM dependency on tip irq/core

2011-10-07 Thread Thomas Gleixner
On Fri, 7 Oct 2011, Arnd Bergmann wrote:

> On Friday 07 October 2011, Kevin Hilman wrote:
> > > I've pulled in rmk/devel-stable as a dependency now, thanks for
> > > reminding me of that.
> > >
> > > Thomas, where should I get the irq-core branch (or whichever
> > > I should wait for) to pull in as another dependency. Is that
> > > branch one that never gets rebased? 
> > 
> > git://tesla.tglx.de/git/linux-2.6-tip irq/core
> > 
> > I asked Thomas about this earlier when I was going to build up the
> > dependencies myself, and he said it won't be rebased.
> 
> Ok, thanks for the info.
> 
> I think I've now also come up with a workflow for tracking the dependencies:
> I have a depends/xxx branch for each other branch that I need to wait for
> getting merged first. When I want to send a pull request, I first check
> all the depends/* branches using 'git branch --merged next/xxx |
> grep depends' to see what the dependencies are, and 'git branch --merged
> torvalds/master | grep depends' to see if they are already merged upstream.
> When a dependency is already merged, I can remove its tracking branch
> from the arm-soc tree.

For your internal dependecies, yes. But for irq/core you don't have to
wait. You just need to tell Linus in the pull request that you pulled
my branch with my ack as it contains modifications which are
prerequisite for arm/whatever.

When I send my pull request later, then this wont do any damage as
Linus has the commits already. Same the other way round. We do that
all the time, otherwise pull dependencies would be a nightmare.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tty: omap-serial: fix boot hang by converting to use a threaded IRQ handler (was Re: [PATCH] irq: always set IRQF_ONESHOT if no primary handler is specified)

2011-08-23 Thread Thomas Gleixner
Linus,

On Tue, 23 Aug 2011, Linus Torvalds wrote:
> but the "there is no difference for others" seems to be total crap,
> exactly because it results in this IRQF mismatch.
> 
> So I think that commit should just be reverted. Thomas?

Yes. I missed that detail when I applied that patch. Reverting it
seems to be the only sensible solution.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] IRQ: allow check_wakeup_irqs to notice level-triggered interrupts.

2012-04-25 Thread Thomas Gleixner
On Wed, 25 Apr 2012, NeilBrown wrote:

> Level triggered interrupts do not cause IRQS_PENDING to be set, so
> check_wakeup_irqs ignores them.
> They don't need to set IRQS_PENDING as the level stays high which
> shows that they must be pending.  However if such an interrupt fired
> during late suspend, it will have been masked so the fact that it
> is still asserted will not cause the suspend to abort.
> 
> So if any wakeup interrupt is masked, unmask it when checking wakeup
> irqs.  If the interrupt is asserted, suspend will abort.
> 
> Signed-off-by: NeilBrown 
> ---
> 
>  kernel/irq/pm.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
> index 15e53b1..0d26206 100644
> --- a/kernel/irq/pm.c
> +++ b/kernel/irq/pm.c
> @@ -106,6 +106,12 @@ int check_wakeup_irqs(void)
>   if (irqd_is_wakeup_set(&desc->irq_data)) {
>   if (desc->istate & IRQS_PENDING)
>   return -EBUSY;
> + if (irqd_irq_masked(&desc->irq_data))
> + /* Probably a level interrupt
> +  * which fired recently and was
> +  * masked
> +  */
> + unmask_irq(desc);

Oh no. We don't unmask unconditionally. What about an interrupt which
is disabled, has no handler . ? That needs more thought.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] IRQ: allow check_wakeup_irqs to notice level-triggered interrupts.

2012-04-25 Thread Thomas Gleixner
On Wed, 25 Apr 2012, NeilBrown wrote:
> On Wed, 25 Apr 2012 10:50:15 +0200 (CEST) Thomas Gleixner
>  wrote:
> 
> > On Wed, 25 Apr 2012, NeilBrown wrote:
> > 
> > > Level triggered interrupts do not cause IRQS_PENDING to be set, so
> > > check_wakeup_irqs ignores them.
> > > They don't need to set IRQS_PENDING as the level stays high which
> > > shows that they must be pending.  However if such an interrupt fired
> > > during late suspend, it will have been masked so the fact that it
> > > is still asserted will not cause the suspend to abort.
> > > 
> > > So if any wakeup interrupt is masked, unmask it when checking wakeup
> > > irqs.  If the interrupt is asserted, suspend will abort.
> > > 
> > > Signed-off-by: NeilBrown 
> > > ---
> > > 
> > >  kernel/irq/pm.c |6 ++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
> > > index 15e53b1..0d26206 100644
> > > --- a/kernel/irq/pm.c
> > > +++ b/kernel/irq/pm.c
> > > @@ -106,6 +106,12 @@ int check_wakeup_irqs(void)
> > >   if (irqd_is_wakeup_set(&desc->irq_data)) {
> > >   if (desc->istate & IRQS_PENDING)
> > >   return -EBUSY;
> > > + if (irqd_irq_masked(&desc->irq_data))
> > > + /* Probably a level interrupt
> > > +  * which fired recently and was
> > > +  * masked
> > > +  */
> > > + unmask_irq(desc);
> > 
> > Oh no. We don't unmask unconditionally. What about an interrupt which
> > is disabled, has no handler . ? That needs more thought.
> 
> If there is no handler, then irqd_is_wakeup_set() should fail should it not?

Well, it does not. The wakeup state is a permanent flag on the irq
line. Though we don't have IRQS_SUSPENDED on that descriptor.
 
> For disabled: would it be OK to check desc->depth?
> Something like:
>  if (desc->depth == 1 && (desc->state & IRQS_SUSPENDED) &&
>  irqd_irq_masked(&desc->irq_data))
>   unmask_irq(desc);
> 

Why not simply managing the pending bit for level irqs ?

Index: tip/kernel/irq/chip.c
===
--- tip.orig/kernel/irq/chip.c
+++ tip/kernel/irq/chip.c
@@ -379,8 +379,10 @@ handle_level_irq(unsigned int irq, struc
 * If its disabled or no action available
 * keep it masked and get out of here
 */
-   if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data)))
+   if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) {
+   desc->istate |= IRQS_PENDING;
goto out_unlock;
+   }
 
handle_irq_event(desc);
 
Index: tip/kernel/irq/resend.c
===
--- tip.orig/kernel/irq/resend.c
+++ tip/kernel/irq/resend.c
@@ -58,10 +58,13 @@ void check_irq_resend(struct irq_desc *d
/*
 * We do not resend level type interrupts. Level type
 * interrupts are resent by hardware when they are still
-* active.
+* active. Clear the pending bit so suspend/resume does not
+* get confused.
 */
-   if (irq_settings_is_level(desc))
+   if (irq_settings_is_level(desc)) {
+   desc->istate &= ~IRQS_PENDING;
return;
+   }
if (desc->istate & IRQS_REPLAY)
return;
if (desc->istate & IRQS_PENDING) {

And to handle interrupts which have been disabled before suspend add:

Index: tip/kernel/irq/pm.c
===
--- tip.orig/kernel/irq/pm.c
+++ tip/kernel/irq/pm.c
@@ -103,7 +103,8 @@ int check_wakeup_irqs(void)
int irq;
 
for_each_irq_desc(irq, desc) {
-   if (irqd_is_wakeup_set(&desc->irq_data)) {
+   if (desc->depth == 1 &&
+   irqd_is_wakeup_set(&desc->irq_data)) {
if (desc->istate & IRQS_PENDING)
return -EBUSY;
continue;
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] IRQ: allow check_wakeup_irqs to notice level-triggered interrupts.

2012-05-04 Thread Thomas Gleixner
Neil,

On Fri, 4 May 2012, NeilBrown wrote:

> On Wed, 25 Apr 2012 14:54:54 +0200 (CEST) Thomas Gleixner
>  wrote:
> 
> > Why not simply managing the pending bit for level irqs ?
> > 
> 
> Hi Thomas,
>  thanks again for the patch.  I finally made time to test it and it works as
> expected.  I've included it below with a change-log entry and tested-by:
> in case that helps.

thanks for testing. The changelog is great. You know how to make the
live of lazy buggers easier :)
  
>   for_each_irq_desc(irq, desc) {
> - if (irqd_is_wakeup_set(&desc->irq_data)) {
> + if (desc->depth == 1 &&
> + irqd_is_wakeup_set(&desc->irq_data)) {
>   if (desc->istate & IRQS_PENDING)
>   return -EBUSY;
>   continue;

I split that part into a separate patch, as it's really a different
issue.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Seeking clarity on IRQCHIP_MASK_ON_SUSPEND

2012-09-10 Thread Thomas Gleixner
On Mon, 10 Sep 2012, NeilBrown wrote:
> 
> The IRQCHIP_MASK_ON_SUSPEND flag seems to be hard to use correctly, so either
> I'm understanding it wrongly, or it could be made easier to use.
> If the first case, I'm hoping that some improvement to documentation might
> result.  If the second, then maybe we can fix the code.
... 
> Is anyone able to give a definitive answer on this?  Should
> IRQCHIP_MASK_ON_SUSPEND be removed?

The whole point of IRQCHIP_MASK_ON_SUSPEND is to deal with hardware
designed by geniuses.

Most SoCs have a way to mark the interrupts which serve as a wake up
source as such. All other interrupts are magically "masked" on entry
to suspend.

Now there is hardware which is missing such a control, so we need to
mask the non wakeup interrupts right before going into suspend.

That's what IRQCHIP_MASK_ON_SUSPEND does. Not more, not less. See
commit d209a699a0b for more ugly details.

You might be looking for a different functionality. Can you explain
what you need?

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Seeking clarity on IRQCHIP_MASK_ON_SUSPEND

2012-09-10 Thread Thomas Gleixner
On Mon, 10 Sep 2012, Shilimkar, Santosh wrote:
> Thomas,
> 
> On Mon, Sep 10, 2012 at 3:58 PM, Thomas Gleixner  wrote:
> > On Mon, 10 Sep 2012, NeilBrown wrote:
> >>
> >> The IRQCHIP_MASK_ON_SUSPEND flag seems to be hard to use correctly, so 
> >> either
> >> I'm understanding it wrongly, or it could be made easier to use.
> >> If the first case, I'm hoping that some improvement to documentation might
> >> result.  If the second, then maybe we can fix the code.
> > ...
> >> Is anyone able to give a definitive answer on this?  Should
> >> IRQCHIP_MASK_ON_SUSPEND be removed?
> >
> > The whole point of IRQCHIP_MASK_ON_SUSPEND is to deal with hardware
> > designed by geniuses.
> >
> > Most SoCs have a way to mark the interrupts which serve as a wake up
> > source as such. All other interrupts are magically "masked" on entry
> > to suspend.
> >
> Just to support this, IRQCHIP_MASK_ON_SUSPEND does work with quite
> a few ARM platforms too. OMAP already uses it in mainline and I have
> seen patches for U500 and Tegra SOCs. Most of these usages are with
> IRQ chips who doesn't have any driver run time PM supported and
> the IRQ CHIP itself is shutdown when the CPU/CPU cluster gets
> power down. So as far as functionality concerned with the flag,
> it does what it suppose to do.
> 
> > Now there is hardware which is missing such a control, so we need to
> > mask the non wakeup interrupts right before going into suspend.
> >
> > That's what IRQCHIP_MASK_ON_SUSPEND does. Not more, not less. See
> > commit d209a699a0b for more ugly details.
> >
> > You might be looking for a different functionality. Can you explain
> > what you need?
> >
> Neil's email came from a discussion on the usage of this flag for
> OMAP GPIO irqchip which I proposed. With the flag, when the
> lazy check_irq routine is called, the GPIO driver is runtime suspended
> and hence the late mask/unmask calls take abort(clocks are already gated).
> GPIO IRQCHIP is secondary IRQCHIP connected to 1 interrupt line
> per bank(32 GPIOs) to the primary interrupt controller IRQCHIP.

So for this thing you need a IRQCHIP_MASK_BEFORE_SUSPEND variant? 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Seeking clarity on IRQCHIP_MASK_ON_SUSPEND

2012-09-11 Thread Thomas Gleixner
On Tue, 11 Sep 2012, NeilBrown wrote:
> On Mon, 10 Sep 2012 12:28:35 +0200 (CEST) Thomas Gleixner
>  wrote:
> > You might be looking for a different functionality. Can you explain
> > what you need?
> 
> I want as particular GPIO interrupt to be masked before entering suspend.
> I produced code to get the ->suspend() callback to perform this masking.
> Another developer (Santosh) felt that IRQCHIP_MASK_ON_SUSPEND was the
> preferred way to do this and on the surface this looks like it should be
> correct.  However it doesn't work as explained previously.
> I want a resolution to this difference of opinion that doesn't just sweep the
> issue under that carpet but provides a clear answer to this sort of issue.
> 
> My current opinion is that IRQCHIP_MASK_ON_SUSPEND should be discarded.  The
> patch which introduced it says:
> 
> Rather than working around this in the affected interrupt chip
> implementations we can solve this elegant in the core code itself.
> 
> It appears that the solution in core code, while elegant, is wrong.  It
> happens too late to be generally usable.  It is easy enough to handle this

Sigh. The flag was invented with the semantics to keep the current
"check for any interrupt" pending functionality alive and then mask it
right before going down, so only the designated wakeup interrupts can
wake the device. That was the result of the discussion back then and
that was what the developer wanted to achieve with his driver suspend
hackery.

> issue in the interrupt chip drivers so maybe that is the best place
> to handle it.

And have the same "keep track of wakeup interrupts" code over and over
in the drivers.
 
> The the very least I think we need a big comment saying the
> IRQCHIP_MASK_ON_SUSPEND can only be used for irqchips which can always be
> programmed, even when they are suspended from an runtime-PM perspective,
> and that those chips must handle masking in their 'suspend' callback.

Sigh, no. Either we make IRQCHIP_MASK_ON_SUSPEND into an
implementation which masks the interrupts early, if the existing users
find this acceptable or have a separate IRQCHIP_MASK_BEFORE_SUSPEND
flag.

This GPIO driver at hand is probably not the last one which requires
this and we really want to do stuff in the core code instead of having
random implementations of the same stuff all over the place.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Thomas Gleixner
On Wed, 27 Apr 2011, Menon, Nishanth wrote:
> On Wed, Apr 27, 2011 at 12:49, Colin Cross  wrote:
> > I proposed in a different thread on LKML that DVFS be handled within
> > the generic clock implementation.  Platforms would register a
> > regulator and a table of voltages for each struct clock that required
> > DVFS, and the voltages would be changed on normal clk_* requests.
> > This maintains compatibility with existing clk_* calls.
> 
> It is upto SoC frameworks to implement the transitions. E.g. lets look
> at scalability: How'd the mechanism proposed work with temperature
> variances: Example: I dont want to hit 1.5GHz if temp >70C - wont it
> be an SoC specific hack I'd need to introduce?

Why is limiting the max core frequency depending on temperature a SoC
specific problem ?

Everyone wants to do that. x86 does it in hardware / SMM, other
architectures want the kernel to take care of it.

So the decision is simple. Something wants to set core freq to 1.5
GHz, so it calls clk_set_rate() and there we consult the DVFS code
first to validate that setting. If it can be set, fine, then DVFS will
set the voltages _before_ we change the frequency or it will simply
veto the change because one of the preliminaries for such a change is
not given.

Please stop thinking that your SoC is sooo special. It's NOT.

The HW concepts are quite similar all over the place, they are just
named differently and use different IP blocks with slightly different
functionality, but the problems are not unique to a particular SoC at
all.

> All OPP framework does is store that maps, and leaves it to users to
> choose regulators, clock framework variances, SoC temperature sensors
> or what ever mechanisms they choose to allow through a transition.

That's how it's implemented, but that does not say that the design is
correct and usable for more than the usecase it was modeled after.
 
We are looking into a common clock framework, which abstracts out the
duplicated functionality of the various implementations and reduces
them to the real thing: hardware drivers. So we really need to look
into that DVFS problem as well, simply because it is tightly coupled
and not a complete separate entity.

And looking at the struct clk disaster we really don't want another
incarnation in terms of DVFS where we end up with the same decision
functions in various SoCs over and over.

Thanks,

tglx

Re: [linux-pm] [RFC PATCH] PM: Introduce generic DVFS framework with device-specific OPPs

2011-04-27 Thread Thomas Gleixner
On Wed, 27 Apr 2011, Menon, Nishanth wrote:

> OPP table is just a storage and retrieval mechanism, it is upto SoC
> frameworks to choose the most adequate of solutions - e.g. OMAP has
> omap_device, hwmod and a clock framework for more intricate control to
> work in conjunction with cpuidle frameworks as well.

Can you please stop thinking about OMAP for a minute?

A clock framework is nothing SoC specific. A framework is an
abstraction of common HW functionality, which implements general
functionality and relies on the HW specific part to configure it and
to provide access to the hardware itself.

clocks are ordered as trees in HW, simply because you cannot have a
clock consumer be driven by more than one active clock at the same
time. A clock consumer may select a different clock producer, but that
merily changes the tree structure nothing else. So why should every
SoC implement it's own (different buggy) version of tree handling and
call it framework?

Yes, I know you might argue that some devices need two clocks enabled
to be functional. That's correct, but coupling those clocks at the
framework level is the wrong thing to do. If a device needs both an
interface clock and a separate interconnect clock to work, then it
needs to enable both clocks and become a consumer of them.
 
> There is cross domain dependency which OMAP (yet to be pushed to
> mainline) has - example: when OMAP4's MPUs are at a certain OPP, L3
> (OMAP's SoC bus) needs to be at least a certain OPP - these are
> framework which may be very custom to OMAP itself.

Wrong again. That's not a framework when you hack SoC specific
decision functions into it. It's the OMAP internal hackery to make
stuff work, but that's far from a framework.

What you are describing is a restriction which can be expressed in
tables or rules which are fed into a general framework.

Look at generic irqs, generic timekeeping, generic clockevents and
tons of other real frameworks in the kernel. They abstract out
concepts and provide generic interfaces rather than claiming that the
problem is unique to a particular piece of silicon.

Forget OMAP implementation details for a while, sit back and look at
the big picture.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >