Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Wed, 1 Feb 2012, NeilBrown wrote: On Fri, 27 Jan 2012 23:02:51 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: Since the HDQ module doesn't support the idle protocol, the target clock FSM in the CM is what should determine whether the module is considered idle or not. And as long as the bit in the CM_FCLKEN register corresponding to HDQ_FCLK is set, the FSM should consider the module as active, and the clockdomain should not be allowed to go inactive. I write should. Given that big caution at the bottom of section 20.4.5 System Power Management and Wakeup, this appears to have not been correctly implemented. Can you help me understand why you say should there. You seem to be implying that a down-stream clock gate should cause an upstream clock to stay active, but that doesn't make sense to me given the presence of the much richer SIDLE framework. Also I cannot find it in the TRM (though there a lots of words in there and I might have missed some). Functional clocks are almost never auto-idled by the hardware[*]. So when the kernel sets a CM_FCLKEN* bit in a register for a clockdomain, that clockdomain should never go inactive. As I understand it, main functional clocks on I/O IP blocks aren't automatically idled for a few basic reasons. One is that the main functional clock has to be running in order for register accesses to succeed, so initiators have to have some way of preventing it from going off while they are interacting with it. The PRCM doesn't snoop the interconnect :-) Another is that some of the I/O IP blocks can continue to operate after they've idle-acked. They idle-ack, which could allow clocks to be gated to interconnects, but the module can still continue to fill or drain FIFOs, which obviously requires a clock. This was what is not working correctly with the UARTs... some events that generate interrupts can't generate SWAKEUPs :-( And there are a few other lesser reasons that I am aware of that are not worth going into at this point. Even interface clock auto-idle works in a different way than what one might expect. Even if a module idle-acks, the interface clock will stay on until the interface clockdomain can go idle. So suppose that you have an interface clockdomain with 20 modules and 19 of them idle-ack. The interface clock stays supplied to those 19 modules until the 20th idle-acks. This ensures that an initiator access to a target module will succeed. (This is true for OMAP2/3; OMAP4 does something more clever.) In general, setting ICLKEN/FCLKEN bits to zero doesn't shut down clocks directly; it simply indicates that the clocks may be shut down once other users of that clock also indicate that they don't need the clocks. Most modules use SIDLE to keep the upstream clock active. It's the other way around. The device driver uses the FCLKEN bits to keep the upstream clock active :-) Disabling AUTO_HDQ is probably a reasonable workaround. It would prevent the CORE_L4 clockdomain from going inactive, and that happens to be where the functional clock originates from, also. We have a partial facility to handle this type of bug in the existing code. Could you please try the patch enclosed at the bottom of this E-mail and see if it helps? Yes, that patch fixes my problem too - the HDQ keeps working. Great; thanks for the test. We'll implement a variant of that for mainline. Please also let me know if you wind up having wakeup problems with HDQ, especially with MPU off-mode. If you don't, obviously that's good; but I suspect you might want it at some point. (it's a bit smoke-and-mirrors though .. I want fclk to stay on, so let's make sure iclk doesn't autoidle, because we *know* they have the same source :-) A theoretically clearer workaround would be to place the main functional clock's clockdomain into software-supervised wakeup as long as the hwmod code thinks that the HDQ IP block is active. But that is not so easy to implement with our current codebase. We'd need to implement usecounting for our clockdomain control operations, and the other PM code assumes that it can force clockdomain states directly. So I'm not too eager to do that rewrite at the moment given all the other stuff that's going on... Meanwhile, this fix targets the target clock FSM, and that's where the bug is, albeit on a slightly different path. It's also functionally equivalent, given that the HDQ apparently can't send wakeups. - Paul *. Main functional clocks can be autoidled if they are interface+functional clocks. This is the case for some purely computational, non-I/O IP blocks such as MAILBOXES. Those combined interface+functional clocks are controlled via ICLKEN registers, not FCLKEN registers. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at
Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Fri, 27 Jan 2012 23:02:51 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: Since the HDQ module doesn't support the idle protocol, the target clock FSM in the CM is what should determine whether the module is considered idle or not. And as long as the bit in the CM_FCLKEN register corresponding to HDQ_FCLK is set, the FSM should consider the module as active, and the clockdomain should not be allowed to go inactive. I write should. Given that big caution at the bottom of section 20.4.5 System Power Management and Wakeup, this appears to have not been correctly implemented. Can you help me understand why you say should there. You seem to be implying that a down-stream clock gate should cause an upstream clock to stay active, but that doesn't make sense to me given the presence of the much richer SIDLE framework. Also I cannot find it in the TRM (though there a lots of words in there and I might have missed some). Most modules use SIDLE to keep the upstream clock active. HDQ doesn't have that so it needs software over-rid to keep it active. It also mentions the AUTO_HDQ bit of PRCM.CM_AUTOIDLE1_CORE. This had me confused for a while, but I think it only affects the iclk. That's correct. I don't think the iclk is the problem. It is the fclk that is disappearing because the clkdm that feeds it is being turned off. Disabling AUTO_HDQ is probably a reasonable workaround. It would prevent the CORE_L4 clockdomain from going inactive, and that happens to be where the functional clock originates from, also. We have a partial facility to handle this type of bug in the existing code. Could you please try the patch enclosed at the bottom of this E-mail and see if it helps? Yes, that patch fixes my problem too - the HDQ keeps working. (it's a bit smoke-and-mirrors though .. I want fclk to stay on, so let's make sure iclk doesn't autoidle, because we *know* they have the same source :-) Thanks, NeilBrown signature.asc Description: PGP signature
Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Thu, 26 Jan 2012 07:19:19 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Tue, 24 Jan 2012, NeilBrown wrote: On Sat, 21 Jan 2012 17:07:18 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Wed, 18 Jan 2012, NeilBrown wrote: Oh - another thing. Sometimes during early boot I get: [0.158447] omap_hwmod: usbtll_fck: missing clockdomain for usbtll_fck. [0.176879] omap_hwmod: hdq: softreset failed (waited 1 usec) This latter bug just got fixed, it's updated in the new HDQ series that I sent out: http://marc.info/?l=linux-omapm=132719073710874w=2 BTW, if you could give that a spin for us at some point, it would be greatly appreciated; I don't think I have a board with a 1-wire device on it (at least, not that I know of). Yes, the patch series seems to work OK - at least it doesn't break anything: I can still access my battery charge meter. Thanks. Should I add a Tested-by: from you on those patches? Well... I only tested them after backporting to 3.2, and Linus is a big proponent of the idea that once you rebase patches you invalidate any testing However I don't think there difference between 3.2 and 3.3-rc1 is big enough in the area, so yes: Tested-by: NeilBrown ne...@suse.de Thanks. It doesn't fix the problem I am having where HDQ gets confused when CPUIDLE kicks in, but I think I might have a handle on that now. Unlike other modules, HDQ doesn't have SIDLEMODE. According to 18.4.5 (AM/DM37x Multimedia Device Silicon Revision) it acts as though SIDLEMODE were set to SIDLE_FORCE so if the clock-domain trys to switch off, HDQ lets it. Hmm. Do you know if the problem occurs in the middle of the HDQ transaction, or when HDQ is completely idle? I'd suspect the former. I cannot tell. Each transaction involves receiving two interrupts - one to say the write completed and one to say the read completed. I don't recall seeing any transaction which got the write-completion interrupt but not the read-completion, but I haven't watched closely enough to be sure. Here's a theory: perhaps the MPU powerdomain is hitting a low-power state while waiting for an HDQ interrupt. When the MPU powerdomain is in a low power state, so is the MPU interrupt controller, so the only way that the MPU can wake up is if the HDQ can issue a wakeup event to the PRCM. And I don't see any evidence that the HDQ is capable of doing this, based on the HDQ sections of the TRM. What a huge energy waste, if true. In the config I am testing MPU only goes to RET, never OFF. Same for CORE. Maybe try something like the following patch -- compile-tested only here. If this works, you might want to try dropping this patch and using the pad mux to set a wakeup event on the 1-wire pad when the signal goes low. That might be a more power-efficient approach. You may still have to use some PM QoS request there to ensure that the HDQ can wake up fast enough to see the pulse, but the constraint shouldn't need to be as ludicrously low as it is in the following patch. Doesn't work - crashes :-( pm_qos_power_init is defined as a late_initcall, so you cannot call pm_qos_update_request until after all the initcalls have run. But with your patch, the probe of the bq27000 triggers a read of the registers which tries to update_request - and it goes 'bang'. I really think the problem is the CORE pwrdm gating a clock because no module says it needs it - i.e. nothing to do with MPU at all. We want to keep CORE active when an HDQ transaction is happening, but MPU is welcome to go to sleep. I don't think you can express that with 'qos'. I think it needs some omap-specific machinery. I can 'fix' the problem simply by making sure pwrdm_for_each_clkdm(core_pd, _cpuidle_deny_idle); runs in omap3_enter_idle whenever HDQ is active. One of the reasons that I think it is a clock problem rather than just missing a wakeup event is that once the problem starts happening I cannot recovery without rebooting. i.e. even if I tell the UARTs to keep the clocks on permanently and keep the CPUIDLE state at 0, the HDQ doesn't start working again. It has clearly become confused. The HDQ doco makes a point of saying that you shouldn't issue any commands (except 'enable clock') when the clock is disabled. I think we end up doing that and it gets confused and cannot recover. I note that there is an ad-hoc dependency between the camera and various power states as well. Maybe we need a little bit of infrastructure so that camera can say Keep CORE and MPU on (or whatever it needs) and HDQ can say Keep CORE on. ??? Thanks, NeilBrown - Paul commit 92fa561191b60ddf1296e02a6206160e8a5718f0 Author: Paul Walmsley p...@pwsan.com Date: Thu Jan 26 07:04:40 2012 -0700 Attempt to work around lack of HDQ wakeup support diff --git a/drivers/w1/masters/omap_hdq.c
Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Sat, 28 Jan 2012, NeilBrown wrote: On Thu, 26 Jan 2012 07:19:19 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: Thanks. Should I add a Tested-by: from you on those patches? Well... I only tested them after backporting to 3.2, and Linus is a big proponent of the idea that once you rebase patches you invalidate any testing However I don't think there difference between 3.2 and 3.3-rc1 is big enough in the area, so yes: Tested-by: NeilBrown ne...@suse.de Okay, thanks. Here's a theory: perhaps the MPU powerdomain is hitting a low-power state while waiting for an HDQ interrupt. When the MPU powerdomain is in a low power state, so is the MPU interrupt controller, so the only way that the MPU can wake up is if the HDQ can issue a wakeup event to the PRCM. And I don't see any evidence that the HDQ is capable of doing this, based on the HDQ sections of the TRM. What a huge energy waste, if true. In the config I am testing MPU only goes to RET, never OFF. Same for CORE. That qualifies as a low-power state for the MPU :-) In MPU RET, at least in theory, the MPU INTC will not be operational, and thus unable to respond to interrupts. Maybe try something like the following patch -- compile-tested only here. If this works, you might want to try dropping this patch and using the pad mux to set a wakeup event on the 1-wire pad when the signal goes low. That might be a more power-efficient approach. You may still have to use some PM QoS request there to ensure that the HDQ can wake up fast enough to see the pulse, but the constraint shouldn't need to be as ludicrously low as it is in the following patch. Doesn't work - crashes :-( pm_qos_power_init is defined as a late_initcall, so you cannot call pm_qos_update_request until after all the initcalls have run. But with your patch, the probe of the bq27000 triggers a read of the registers which tries to update_request - and it goes 'bang'. Hmm, okay. Could you please remove the register read from the HDQ probe and see if that makes a difference? I really think the problem is the CORE pwrdm gating a clock because no module says it needs it - i.e. nothing to do with MPU at all. Until pm_runtime_put*() is called, the usecount of hdq_fck will still be non-zero. So the CORE shouldn't be able to gate it or hdq_ick at that time, and thus should not be able to enter idle. Hence the question about where the problems occur: whether they occur in the middle of the transaction or when the HDQ clocks are disabled. We want to keep CORE active when an HDQ transaction is happening, but MPU is welcome to go to sleep. I don't think you can express that with 'qos'. I think it needs some omap-specific machinery. The OMAP PRCM hardware should keep the CORE* clkdms active when the hdq_fck is enabled. So it's possible there could be a PRCM silicon bug that doesn't take hdq_fck into account when determining whether the CORE_* clkdms are inactive. I can 'fix' the problem simply by making sure pwrdm_for_each_clkdm(core_pd, _cpuidle_deny_idle); runs in omap3_enter_idle whenever HDQ is active. Hmm, that does suggest that it's not wakeup related. One of the reasons that I think it is a clock problem rather than just missing a wakeup event is that once the problem starts happening I cannot recovery without rebooting. i.e. even if I tell the UARTs to keep the clocks on permanently and keep the CPUIDLE state at 0, the HDQ doesn't start working again. It has clearly become confused. The HDQ doco makes a point of saying that you shouldn't issue any commands (except 'enable clock') when the clock is disabled. I think we end up doing that and it gets confused and cannot recover. I note that there is an ad-hoc dependency between the camera and various power states as well. Maybe we need a little bit of infrastructure so that camera can say Keep CORE and MPU on (or whatever it needs) and HDQ can say Keep CORE on. ??? I'm not familiar with the camera problems, but in the HDQ case, this should only be needed if a silicon bug exists. Which is certainly possible; we've seen this problem with one other IP block in the past. Based on a quick glance at the errata, I don't see anything related to the HDQ, but that doesn't really mean anything. In any case, we should be able to work around this via the hwmod layer and a special flag, if the problem really is a PRCM bug. This will depend on the functional powerstate conversion. Thanks for the detailed test reports. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Fri, 27 Jan 2012 15:58:40 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Sat, 28 Jan 2012, NeilBrown wrote: Here's a theory: perhaps the MPU powerdomain is hitting a low-power state while waiting for an HDQ interrupt. When the MPU powerdomain is in a low power state, so is the MPU interrupt controller, so the only way that the MPU can wake up is if the HDQ can issue a wakeup event to the PRCM. And I don't see any evidence that the HDQ is capable of doing this, based on the HDQ sections of the TRM. What a huge energy waste, if true. In the config I am testing MPU only goes to RET, never OFF. Same for CORE. That qualifies as a low-power state for the MPU :-) In MPU RET, at least in theory, the MPU INTC will not be operational, and thus unable to respond to interrupts. Maybe try something like the following patch -- compile-tested only here. If this works, you might want to try dropping this patch and using the pad mux to set a wakeup event on the 1-wire pad when the signal goes low. That might be a more power-efficient approach. You may still have to use some PM QoS request there to ensure that the HDQ can wake up fast enough to see the pulse, but the constraint shouldn't need to be as ludicrously low as it is in the following patch. Doesn't work - crashes :-( pm_qos_power_init is defined as a late_initcall, so you cannot call pm_qos_update_request until after all the initcalls have run. But with your patch, the probe of the bq27000 triggers a read of the registers which tries to update_request - and it goes 'bang'. Hmm, okay. Could you please remove the register read from the HDQ probe and see if that makes a difference? Sorry, I misdiagnosed. The problem was that the pm_qos_request structure needs to be zeroed before pm_qos_add_request() or it complains, doesn't initialise it, and future calls can crash. So with 'kzalloc' in place of 'kmalloc' where hdq_data is allocated, the qos patch does make my problem go away. It feels a bit heavy-handed though. I really think the problem is the CORE pwrdm gating a clock because no module says it needs it - i.e. nothing to do with MPU at all. Until pm_runtime_put*() is called, the usecount of hdq_fck will still be non-zero. So the CORE shouldn't be able to gate it or hdq_ick at that time, and thus should not be able to enter idle. Hence the question about where the problems occur: whether they occur in the middle of the transaction or when the HDQ clocks are disabled. We want to keep CORE active when an HDQ transaction is happening, but MPU is welcome to go to sleep. I don't think you can express that with 'qos'. I think it needs some omap-specific machinery. The OMAP PRCM hardware should keep the CORE* clkdms active when the hdq_fck is enabled. So it's possible there could be a PRCM silicon bug that doesn't take hdq_fck into account when determining whether the CORE_* clkdms are inactive. That isn't the way I understand the TRM and the code. I admit there is a lot that I don't understand, but I've really drilled down in this area and my understanding aligns with what I see experimentally, so clkdm_allow_idle(struct clockdomain *clkdm) tells the hardware to allow that clkdm to go idle, and it does so with no reference to whether any clk in the clkdm is current enabled. This seems to be called conditional only on which IDLE state is being entered. Any state below C1 is entered with clkdm_allow_idle in force. When the hardware has been told that it can idle a clockdomain, it will handshake with each module that uses the clockdomain. i.e. it doesn't check if the clock is enabled, it asks the module directly. The module answers according to the SIDLE (slave idle) setting. This can be set to - always say no, don't idle, I'm busy - always say yes, force me idle, I don't care - 'smartidle' where the response is based on the internal state of the module. A good example of smartidle is the UART. If it hasn't seen a character in or out for about 10 (I think) character times it will say yes, idle the clock, else it will say no. When RX then goes low it will handshake back to say Hey, I'm busy all of a sudden, can I have the clock back please which it then gets in time to count off the start bit and then read the rest of the incoming byte.(*) However HDQ doesn't have SIDLE. So the PRCM doesn't handshake with it. It just assumes that HDQ will say yes, force me idle, I don't care. The diversion about UARTs is relevant because when the UARTs have SIDLE set to 'no' I don't see the problem (they keep the clock domain active). It is only when the UARTs set their SIDLE to smartidle that the problem appears. So: if the HDQ driver wants the clkdm to stay on it has to explicitly arrange it in software by appropriate clkdm_deny_idle() calls. From the System Power Management and Wakeup section of the
Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Sat, 28 Jan 2012, NeilBrown wrote: On Fri, 27 Jan 2012 15:58:40 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: Hmm, okay. Could you please remove the register read from the HDQ probe and see if that makes a difference? So with 'kzalloc' in place of 'kmalloc' where hdq_data is allocated, the qos patch does make my problem go away. It feels a bit heavy-handed though. Thanks for the test. It's not clear to me how this driver could really work well if the MPU powerdomain is able to go to a low power state between the two phases of the transaction. There's no wakeup source, so the interrupt won't be handled until another device wakes the system up. Of course, if you have a lot of wakeups coming from other devices, or if something is forcing the MPU powerdomain to stay ON or INACTIVE, this may not really matter. So you might want that patch, at least on 3.3. But from your following text, it sounds like there is another problem. The OMAP PRCM hardware should keep the CORE* clkdms active when the hdq_fck is enabled. So it's possible there could be a PRCM silicon bug that doesn't take hdq_fck into account when determining whether the CORE_* clkdms are inactive. ... clkdm_allow_idle(struct clockdomain *clkdm) tells the hardware to allow that clkdm to go idle, and it does so with no reference to whether any clk in the clkdm is current enabled. Since the HDQ module doesn't support the idle protocol, the target clock FSM in the CM is what should determine whether the module is considered idle or not. And as long as the bit in the CM_FCLKEN register corresponding to HDQ_FCLK is set, the FSM should consider the module as active, and the clockdomain should not be allowed to go inactive. I write should. Given that big caution at the bottom of section 20.4.5 System Power Management and Wakeup, this appears to have not been correctly implemented. It also mentions the AUTO_HDQ bit of PRCM.CM_AUTOIDLE1_CORE. This had me confused for a while, but I think it only affects the iclk. That's correct. I don't think the iclk is the problem. It is the fclk that is disappearing because the clkdm that feeds it is being turned off. Disabling AUTO_HDQ is probably a reasonable workaround. It would prevent the CORE_L4 clockdomain from going inactive, and that happens to be where the functional clock originates from, also. We have a partial facility to handle this type of bug in the existing code. Could you please try the patch enclosed at the bottom of this E-mail and see if it helps? My idea of a fix would be a secondary reference count on each clkdm. When a clk that doesn't have an associated SIDLE setting is activated we increase this reference count (and decrease it when the clk is deactivated). While the count is non-zero we deny_idle for that clkdm. Or something like that. I think we can work around this case with the OCPIF_SWSUP_IDLE flag, since the functional clock and interface clock are in the same clockdomain. Thanks for listening :-) Thanks for working through this bug and helping test :-) (*) Another note about UARTs. In 3.2 (haven't checked 3.3) the driver not only sets SMARTIDLE but also stops the clocks when things are idle. So when RX goes low and the UART wakes us up it doesn't get a clock straight away but we have to wait for code to enable the clock. In my experiments, this is fast enough to stay in-sync for speeds up to 38400 (I think). Above there we miss the first bit and get corrupt characters. I think that if the UART is expecting data it should not disable the clock but should use SMARTIDLE to let it turn off, and then turn back on again quickly but maybe 3.3 gets that right. With the most recent UART patch set that was posted against 3.3-rc1, as long as the CORE powerdomain isn't in a low-power state, the UART can enter idle with clocks cut and still wake up quickly enough to handle incoming data at 115200 8n1 on the 35xx BeagleBoard. That all happens in hardware; no kernel involvement is needed. I don't see the corrupt characters that you do, but there may be a 35xx vs. 37xx difference here. (You're using an OMAP37xx?) - Paul From 16b9f4ee44f96846fad5a9cb0e55bcd25b77ed59 Mon Sep 17 00:00:00 2001 From: Paul Walmsley p...@pwsan.com Date: Fri, 27 Jan 2012 22:46:34 -0700 Subject: [PATCH] Try software-supervised control of HDQ_ICLK Do not automatically enable interface clock autoidle on HDQ_ICLK. Require the hwmod layer to use software gating on HDQ_ICLK. This is because the CM FSM that handles HDQ_FCLK appears to not be working as expected. This patch is experimental, untested, and for bug-testing purposes only. --- arch/arm/mach-omap2/clock3xxx_data.c |1 + arch/arm/mach-omap2/omap_hwmod_3xxx_data.c |2 +- arch/arm/plat-omap/clock.c |4 +++- arch/arm/plat-omap/include/plat/clock.h|1 + 4 files changed, 6 insertions(+), 2
Re: OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Tue, 24 Jan 2012, NeilBrown wrote: On Sat, 21 Jan 2012 17:07:18 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Wed, 18 Jan 2012, NeilBrown wrote: Oh - another thing. Sometimes during early boot I get: [0.158447] omap_hwmod: usbtll_fck: missing clockdomain for usbtll_fck. [0.176879] omap_hwmod: hdq: softreset failed (waited 1 usec) This latter bug just got fixed, it's updated in the new HDQ series that I sent out: http://marc.info/?l=linux-omapm=132719073710874w=2 BTW, if you could give that a spin for us at some point, it would be greatly appreciated; I don't think I have a board with a 1-wire device on it (at least, not that I know of). Yes, the patch series seems to work OK - at least it doesn't break anything: I can still access my battery charge meter. Thanks. Should I add a Tested-by: from you on those patches? It doesn't fix the problem I am having where HDQ gets confused when CPUIDLE kicks in, but I think I might have a handle on that now. Unlike other modules, HDQ doesn't have SIDLEMODE. According to 18.4.5 (AM/DM37x Multimedia Device Silicon Revision) it acts as though SIDLEMODE were set to SIDLE_FORCE so if the clock-domain trys to switch off, HDQ lets it. Hmm. Do you know if the problem occurs in the middle of the HDQ transaction, or when HDQ is completely idle? I'd suspect the former. Here's a theory: perhaps the MPU powerdomain is hitting a low-power state while waiting for an HDQ interrupt. When the MPU powerdomain is in a low power state, so is the MPU interrupt controller, so the only way that the MPU can wake up is if the HDQ can issue a wakeup event to the PRCM. And I don't see any evidence that the HDQ is capable of doing this, based on the HDQ sections of the TRM. What a huge energy waste, if true. Maybe try something like the following patch -- compile-tested only here. If this works, you might want to try dropping this patch and using the pad mux to set a wakeup event on the 1-wire pad when the signal goes low. That might be a more power-efficient approach. You may still have to use some PM QoS request there to ensure that the HDQ can wake up fast enough to see the pulse, but the constraint shouldn't need to be as ludicrously low as it is in the following patch. - Paul commit 92fa561191b60ddf1296e02a6206160e8a5718f0 Author: Paul Walmsley p...@pwsan.com Date: Thu Jan 26 07:04:40 2012 -0700 Attempt to work around lack of HDQ wakeup support diff --git a/drivers/w1/masters/omap_hdq.c b/drivers/w1/masters/omap_hdq.c index 63e3eda..990e38b 100644 --- a/drivers/w1/masters/omap_hdq.c +++ b/drivers/w1/masters/omap_hdq.c @@ -17,6 +17,7 @@ #include linux/io.h #include linux/sched.h #include linux/pm_runtime.h +#include linux/pm_qos.h #include asm/irq.h #include mach/hardware.h @@ -61,6 +62,8 @@ struct hdq_data { /* lock status update */ struct mutex hdq_mutex; int hdq_usecount; + struct pm_qos_request pm_qos_request; + u8 hdq_irqstatus; /* device lock */ spinlock_t hdq_spinlock; @@ -409,6 +412,18 @@ static int omap_hdq_get(struct hdq_data *hdq_data) goto rtn; } + /* +* XXX Horrible hack to keep the MPU powerdomain out of a +* low-power state during HDQ transfers, since the HDQ doesn't +* appear to be able to wake the system. This constraint is +* way too short - it will probably keep the MPU out of WFI +* which is a total waste - but we don't have a +* platform-independent way to control powerdomain states at +* the moment and there is no obvious replacement for platform_data +* fn ptrs AFAIK +*/ + pm_qos_update_request(hdq_data-pm_qos_request, 1); + if (OMAP_HDQ_MAX_USER == hdq_data-hdq_usecount) { dev_dbg(hdq_data-dev, attempt to exceed the max use count); ret = -EINVAL; @@ -464,6 +479,10 @@ static int omap_hdq_put(struct hdq_data *hdq_data) if (0 == hdq_data-hdq_usecount) pm_runtime_put_sync(hdq_data-dev); } + + pm_qos_update_request(hdq_data-pm_qos_request, + PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE); + mutex_unlock(hdq_data-hdq_mutex); return ret; @@ -578,6 +597,10 @@ static int __devinit omap_hdq_probe(struct platform_device *pdev) hdq_data-hdq_usecount = 0; mutex_init(hdq_data-hdq_mutex); + pm_qos_add_request(hdq_data-pm_qos_request, + PM_QOS_CPU_DMA_LATENCY, + PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE); + pm_runtime_enable(pdev-dev); pm_runtime_get_sync(pdev-dev); -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at
OMAP HDQ: was Re: DSS2/PM on 3.2 broken?
On Sat, 21 Jan 2012 17:07:18 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Wed, 18 Jan 2012, NeilBrown wrote: Oh - another thing. Sometimes during early boot I get: [0.158447] omap_hwmod: usbtll_fck: missing clockdomain for usbtll_fck. [0.176879] omap_hwmod: hdq: softreset failed (waited 1 usec) This latter bug just got fixed, it's updated in the new HDQ series that I sent out: http://marc.info/?l=linux-omapm=132719073710874w=2 BTW, if you could give that a spin for us at some point, it would be greatly appreciated; I don't think I have a board with a 1-wire device on it (at least, not that I know of). Yes, the patch series seems to work OK - at least it doesn't break anything: I can still access my battery charge meter. It doesn't fix the problem I am having where HDQ gets confused when CPUIDLE kicks in, but I think I might have a handle on that now. Unlike other modules, HDQ doesn't have SIDLEMODE. According to 18.4.5 (AM/DM37x Multimedia Device Silicon Revision) it acts as though SIDLEMODE were set to SIDLE_FORCE so if the clock-domain trys to switch off, HDQ lets it. Normally UART has SIDLEMODE set to SIDLE_NO so that keeps the necessary clock domain active. But if I let the UARTs sleep, SIDLEMODE changes to SIDLE_SMART which can allow the clockdomain to be stopped, which affects HDQ. However I'm not sure how PRCM.CM_AUTOIDLE1_CORE fits into this. if AUTO_HDQ is clear - which it is by default and I cannot see code that changes it - then the clock for HDQ should be independent of something. I definitely see behaviour that looks like the HDQ clock stopping when the UART allows SIDLE_SMART or when I explicitly deny idling of the clockdomains. Should there be special handling of HDQ because it doesn't have SIDLEMODE?? Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
On Mon, 23 Jan 2012 14:11:16 -0800 Kevin Hilman khil...@ti.com wrote: NeilBrown ne...@suse.de writes: On Thu, 19 Jan 2012 16:22:37 -0800 Kevin Hilman khil...@ti.com wrote: NeilBrown ne...@suse.de writes: On Thu, 19 Jan 2012 11:37:39 -0800 Kevin Hilman khil...@ti.com wrote: Joe Woodward j...@terrafix.co.uk writes: [...] At least this part is expected. In the kernel you're using the UART clocks are enabled/disabled during the idle path depending on the low-power state being targetted, so would expect to see lots of UART clock gating going on. Even though uarts 1,2,4 aren't even in use? Yes. Our UART idle management before v3.3 was, um... a disaster. Maybe this is fixed in 3.3? As of v3.3, the UARTs are managed independently using runtime PM autosupend, so only UART that are in use should be coming on and back off again. However, as Paul has recently posted some UART fixes, you'll see that we have a few kinks to work out in the UART driver as well. Good news ... I'll probably hold out until -rc2 before I start experimenting with 3.3 much. I think I understand why I was getting single-character corruption when waking the UART while you just lost the character - I suspect you had off_mode enabled. That makes the difference for me. Hopefully we can get rid of both the corruption and the character loss, but I'll wait until I've tried 3.3 before I pursue that much more. Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
NeilBrown ne...@suse.de writes: On Thu, 19 Jan 2012 16:22:37 -0800 Kevin Hilman khil...@ti.com wrote: NeilBrown ne...@suse.de writes: On Thu, 19 Jan 2012 11:37:39 -0800 Kevin Hilman khil...@ti.com wrote: Joe Woodward j...@terrafix.co.uk writes: [...] At least this part is expected. In the kernel you're using the UART clocks are enabled/disabled during the idle path depending on the low-power state being targetted, so would expect to see lots of UART clock gating going on. Even though uarts 1,2,4 aren't even in use? Yes. Our UART idle management before v3.3 was, um... a disaster. Maybe this is fixed in 3.3? As of v3.3, the UARTs are managed independently using runtime PM autosupend, so only UART that are in use should be coming on and back off again. However, as Paul has recently posted some UART fixes, you'll see that we have a few kinks to work out in the UART driver as well. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Sat, 21 Jan 2012 17:07:18 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Wed, 18 Jan 2012, NeilBrown wrote: Oh - another thing. Sometimes during early boot I get: [0.158447] omap_hwmod: usbtll_fck: missing clockdomain for usbtll_fck. [0.176879] omap_hwmod: hdq: softreset failed (waited 1 usec) This latter bug just got fixed, it's updated in the new HDQ series that I sent out: http://marc.info/?l=linux-omapm=132719073710874w=2 BTW, if you could give that a spin for us at some point, it would be greatly appreciated; I don't think I have a board with a 1-wire device on it (at least, not that I know of). I'll try them against 3.2 tomorrow. I'll be a while longer before I move up to 3.3-rc but when I do I'll try to test there too. Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
On Thu, 19 Jan 2012 16:22:37 -0800 Kevin Hilman khil...@ti.com wrote: NeilBrown ne...@suse.de writes: On Thu, 19 Jan 2012 11:37:39 -0800 Kevin Hilman khil...@ti.com wrote: Joe Woodward j...@terrafix.co.uk writes: [...] If I do (either from the console or via a button press on the screen) then I never get a SYNC_LOST. echo 0 /sys/devices/omapdss/display0/enabled echo 1 /sys/devices/omapdss/display0/enabled Just trying to think of some ideas that may be affecting the DSS... - Could it to be to do with the GPIO being used as a wake source (i.e. does the GPIO driver do runtime_pm properly?)? - Could it to be to do with the UART as it seems to fix itself whenever a character is pressed? - Could it to be to do with the ordering in which drivers are resumed? Here's my guess/hunch as to why the UART wakeup helps and GPIO doesn't. The UART's are idled using timeouts, so after any activity (including a wakeup) the UART timeout will not alow the UARTs to idle (and thus the system to hit low power states) for a given timeout period. While I agree with that, I'm wondering if there might be something else odd relating to the UARTs that we haven't spotted yet. I suspect the problem for both is constraints that aren't stated explicitly. I'm chasing done the different problem of HDQ not coping with CPUIDLE but I think it is very likely to have a similar root cause, as HDQ goes wrong at the same time that DSS goes wrong, and is a vaguely similar way. I just tried compiling omap_hwmod.c with #define DEBUG and got some strange results. Strange indeed. I assume you are you using the runtime PM converted driver from Paul? I tried that branch on my 3530/Overo (console on UART3) and didn't see any nul's on my UART3 console. I found out what is causing this. I'm open to suggestions for fixing it. I turned on debugging in clock.c and found that just before the nuls start appearing, I'm told clock: dpll4_ck: disabling in hardware and just before it starts working again I see clock: dpll4_ck: enabling in hardware The uart3 fck depends on dpll4_ck, but this early it boot, uart3 hasn't been initialised (I use earlyprintk to see this at all). So as soon as something that depends on dpll4_ck goes to sleep and disables its clock, the uart3 clock can get turned off because the dependency isn't known yet. I think it is mmc3 turning off that initially turns dpll4_ck off. Then when HDQ wants hdq_fck which depends on dpll4_ck, we get the uart clock back for a while. Then when uart2 gets enabled, uart3 comes back on, and soon after uart3 gets enabled, which keeps it on. So it seems we need some way to avoid turning off any clocks until all dependencies have been registered. At least when earlyprintk is being used. Would that make sense? Unfortunately understanding this issue doesn't seem to give any insight into the other issues that seemed to be related - ah well... [...] The other thing I discovered is that when I set the uart 'sleep_timeout' to 5 (all uarts) so that CPUIDLE can let CORE enter the lower power states, I get a fairly steady stream of: [ 1168.490478] omap_hwmod: uart1: enabling [ 1168.490509] omap_hwmod: uart1: enabling clocks [ 1168.490539] omap_hwmod: uart2: enabling [ 1168.490539] omap_hwmod: uart2: enabling clocks [ 1168.490631] omap_hwmod: uart3: enabling [ 1168.490661] omap_hwmod: uart3: enabling clocks [ 1168.490661] omap_hwmod: uart4: enabling [ 1168.490692] omap_hwmod: uart4: enabling clocks [ 1168.578796] omap_hwmod: uart3: idling [ 1168.578796] omap_hwmod: uart3: disabling clocks [ 1168.578826] omap_hwmod: uart4: idling [ 1168.578826] omap_hwmod: uart4: disabling clocks [ 1168.578857] omap_hwmod: uart1: idling [ 1168.578857] omap_hwmod: uart1: disabling clocks [ 1168.578887] omap_hwmod: uart2: idling [ 1168.578887] omap_hwmod: uart2: disabling clocks [ 1168.714263] omap_hwmod: uart1: enabling [ 1168.714294] omap_hwmod: uart1: enabling clocks [ 1168.714324] omap_hwmod: uart2: enabling [ 1168.714324] omap_hwmod: uart2: enabling clocks [ 1168.714416] omap_hwmod: uart3: enabling [ 1168.714447] omap_hwmod: uart3: enabling clocks [ 1168.714447] omap_hwmod: uart4: enabling [ 1168.714477] omap_hwmod: uart4: enabling clocks [ 1168.803344] omap_hwmod: uart3: idling [ 1168.803344] omap_hwmod: uart3: disabling clocks [ 1168.803375] omap_hwmod: uart4: idling [ 1168.803375] omap_hwmod: uart4: disabling clocks [ 1168.851257] omap_hwmod: uart3: enabling [ 1168.851287] omap_hwmod: uart3: enabling clocks [ 1168.851318] omap_hwmod: uart4: enabling [ 1168.851318] omap_hwmod: uart4: enabling clocks [ 1168.885955] omap_hwmod: uart3: idling [ 1168.885986] omap_hwmod: uart3: disabling clocks [ 1168.885986] omap_hwmod: uart4: idling [ 1168.886016] omap_hwmod: uart4: disabling clocks [ 1168.886383]
Re: DSS2/PM on 3.2 broken?
On Wed, 18 Jan 2012, NeilBrown wrote: Oh - another thing. Sometimes during early boot I get: [0.158447] omap_hwmod: usbtll_fck: missing clockdomain for usbtll_fck. [0.176879] omap_hwmod: hdq: softreset failed (waited 1 usec) This latter bug just got fixed, it's updated in the new HDQ series that I sent out: http://marc.info/?l=linux-omapm=132719073710874w=2 BTW, if you could give that a spin for us at some point, it would be greatly appreciated; I don't think I have a board with a 1-wire device on it (at least, not that I know of). At the same time the UART seem to hiccup and my USB-Serial port loses track and doesn't see any more characters until I close and re-open. Could that be related? Any idea what it means? It shouldn't be related. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Tomi Valkeinen tomi.valkei...@ti.com writes: On Thu, 2012-01-19 at 11:24 -0800, Kevin Hilman wrote: Tomi Valkeinen tomi.valkei...@ti.com writes: Now, you already said using pm_runtime_put_sync version is the correct way when suspending. But to use that I need to either always use pm_runtime_put_sync, or add an extra boolean which marks that we're suspending, and pass that around, or make it a DSS global variable. I'm not sure why can't you use the sync version just in the suspend callback? To do that the suspend callback should be the one that disables the device and calls pm_runtime_suspend. With DSS that's not the case, it's the panel drivers that are in charge of enabling/disabling DSS (by calling appropriate functions in omapdss). Ah, OK. makes sense. The only thing that the omapdss suspend callback does is to call the normal disable functions on the display drivers. This leads to calls to disable-functions on the dss hwmod drivers, and then pm_runtime_put calls. But at the point the pm_runtime funcs are called, the code has no idea that we're actually doing system suspend. Thus I'd need to pass that information somehow, probably with a global variable. And while that's not difficult to do, it sure feels a bit ugly. I think I'll just change the pm_runtime_put calls to sync versions for now. I imagine the perf impact with the change should be negligible. I'll return to this issue after the devtree adaptation has been made, as it changes the child-parent relations of the dss related devices, and perhaps managing the PM states will also get a bit easier. Sounds reasonable. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
...snip... cat /sys/kernel/debug/omapdss/clk is below and reports 66461538 for fck, so 66MHz? Still safe for OPP50. And disabling SMART REFLEX had no obvious effect. If you can think of anything else I could try to explore to narrow down the source of this, I am very happy to test or examine anything you suggest. Thanks, NeilBrown I'm not sure if this will give you any pointers but i'll tell you what I'm fairly certain I'm seeing... If I run the stock 3.2 with the changes to dss.c and dispc.c to make the pm_runtime_put()'s in to pm_runtime_put_sync()'s then the DSS warnings when suspending do indeed go away. I have two wake sources, one being the UART the console is on and the other being a GPIO button. I've tested the following situations: - If I sleep using the console and wake by typing a character then I *never* get a SYNC_LOST error. - If I sleep by pressing a button on the display (which does a system (echo mem /sys/power/state) and then wake by typing a character in to the console then I *never* get a SYNC_LOST error. - If I sleep by pressing a button on the display (which does a system (echo mem /sys/power/state) and then wake by pressing a button attached to the GPIO then I get a sequence of SYNC_LOST errors which continue at a rate of about one every 0.5 seconds until a type a character in to the console and then everything settles down and starts working again. So, at least from what I'm seeing, the SYNC_LOST errors seem related somehow to the UART, if that is indeed possible? Cheers, Joe -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-19 at 10:17 +, Joe Woodward wrote: ...snip... cat /sys/kernel/debug/omapdss/clk is below and reports 66461538 for fck, so 66MHz? Still safe for OPP50. And disabling SMART REFLEX had no obvious effect. If you can think of anything else I could try to explore to narrow down the source of this, I am very happy to test or examine anything you suggest. Thanks, NeilBrown I'm not sure if this will give you any pointers but i'll tell you what I'm fairly certain I'm seeing... If I run the stock 3.2 with the changes to dss.c and dispc.c to make the pm_runtime_put()'s in to pm_runtime_put_sync()'s then the DSS warnings when suspending do indeed go away. I have two wake sources, one being the UART the console is on and the other being a GPIO button. I've tested the following situations: - If I sleep using the console and wake by typing a character then I *never* get a SYNC_LOST error. - If I sleep by pressing a button on the display (which does a system (echo mem /sys/power/state) and then wake by typing a character in to the console then I *never* get a SYNC_LOST error. - If I sleep by pressing a button on the display (which does a system (echo mem /sys/power/state) and then wake by pressing a button attached to the GPIO then I get a sequence of SYNC_LOST errors which continue at a rate of about one every 0.5 seconds until a type a character in to the console and then everything settles down and starts working again. So, at least from what I'm seeing, the SYNC_LOST errors seem related somehow to the UART, if that is indeed possible? Well, I don't see how UART could directly affect DSS. The thing that comes to my mind is that typing a char in the console causes a change in the power management, which then fixes the DSS. But then again, I'd expect the console to go into sleep/idle state after a short while, which should again cause sync_losts. But that's not happening... Can you show the dss clocks you use (cat debugfs/omapdss/clk)? It's been a while since I did anything with PM, so I don't remember how and if it can be done, but a simple test would be to lock the OMAP to always use OPP100. Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
... (apologies for awful formatting of the previous mail) ... Well, I don't see how UART could directly affect DSS. The thing that comes to my mind is that typing a char in the console causes a change in the power management, which then fixes the DSS. But then again, I'd expect the console to go into sleep/idle state after a short while, which should again cause sync_losts. But that's not happening... Can you show the dss clocks you use (cat debugfs/omapdss/clk)? It's been a while since I did anything with PM, so I don't remember how and if it can be done, but a simple test would be to lock the OMAP to always use OPP100. Tomi Right, I'm running with both CPUIDLE and CPUFREQ disabled at present so I assume that means I should always be in OPP100? The panel is 800x480 (I've added a configuration to the generic-panel with the correct panel parameters), with just 1 FB device (/dev/fb0). Clock dump is shown below: # pwd /debug/omapdss # cat clk [ 53.146057] omapdss DSS: dss_runtime_get [ 53.151000] omapdss DSS: dss_runtime_put [ 53.155151] omapdss DISPC: dispc_runtime_get [ 53.160430] omapdss DISPC: dispc_runtime_put - DSS - dpll4_ck 43200 DSS_FCK (DSS1_ALWON_FCLK) = 43200 / 12 * 2 = 7200 - DISPC - dispc fclk source = DSS_FCK (DSS1_ALWON_FCLK) fck 7200 - LCD1 - lcd1_clk source = DSS_FCK (DSS1_ALWON_FCLK) lck 7200lck div 1 pck 3600pck div 2 The SYNC_LOST errors sometimes don't happen, and sometimes fix themselves. It does seem that whenever a character is typed in to the UART the SYNC_LOST errors stop immediately. This is a question more for Kevin, but is there some debug/tracing I can enable within the PM framework to see a bit more of whats going on? I've added some more traces below... Trace with no errors: [ 91.806427] PM: Syncing filesystems ... done. [ 91.811370] PM: Preparing system for mem sleep [ 91.816986] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 91.846221] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 91.877441] PM: Entering mem sleep [ 91.881317] Suspending console(s) (use no_console_suspend to debug) [ 92.001068] omapdss CORE: suspend 2 [ 92.184600] omapdss DISPC: dispc_runtime_put [ 92.184631] omapdss DISPC: dispc_save_context [ 92.184692] omapdss DISPC: context saved, ctx_loss_count 0 [ 92.184722] omapdss DSS: dss_runtime_put [ 92.184753] omapdss DSS: dss_runtime_put [ 92.184783] omapdss DSS: dss_save_context [ 92.184783] omapdss DSS: context saved [ 92.185485] PM: suspend of devices complete after 297.088 msecs [ 92.186401] PM: late suspend of devices complete after 0.885 msecs [ 94.111755] Successfully put all powerdomains to target state [ 94.112518] PM: early resume of devices complete after 0.549 msecs [ 94.113555] omapdss CORE: resume [ 94.114135] omapdss DSS: dss_runtime_get [ 94.114166] omapdss DSS: dss_restore_context [ 94.114196] omapdss DSS: context restored [ 94.114196] omapdss DISPC: dispc_runtime_get [ 94.114227] omapdss DSS: dss_runtime_get [ 94.114257] omapdss DISPC: dispc_restore_context [ 94.114288] omapdss DISPC: onoff 0 rf 0 ieo 0 ipc 0 ihs 1 ivs 1 acbi 0 acb 0 [ 94.114318] omapdss DSS: dispc clock info found from cache. [ 94.114318] omapdss DSS: dpll4_m4 = 43200 [ 94.114349] omapdss DSS: fck = 7200 (12) [ 94.114379] omapdss DISPC: lck = 7200 (1) [ 94.114379] omapdss DISPC: pck = 3600 (2) [ 94.114410] omapdss DISPC: channel 0 xres 800 yres 480 [ 94.114440] omapdss DISPC: pck 36000 [ 94.114440] omapdss DISPC: hsw 10 hfp 164 hbp 89 vsw 10 vfp 10 vbp 23 [ 94.114471] omapdss DISPC: hsync 33866Hz, vsync 64Hz [ 94.698455] PM: resume of devices complete after 585.082 msecs [ 94.847900] PM: Finishing wakeup. [ 94.851409] Restarting tasks ... done. Trace with SYNC_LOST errors: [ 106.635437] PM: Syncing filesystems ... done. [ 106.640563] PM: Preparing system for mem sleep [ 106.646118] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 106.677703] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 106.708923] PM: Entering mem sleep [ 106.712768] Suspending console(s) (use no_console_suspend to debug) [ 106.832672] omapdss CORE: suspend 2 [ 107.011260] omapdss DISPC: dispc_runtime_put [ 107.011291] omapdss DISPC: dispc_save_context [ 107.011352] omapdss DISPC: context saved, ctx_loss_count 0 [ 107.011383] omapdss DSS: dss_runtime_put [ 107.011413] omapdss DSS: dss_runtime_put [ 107.011444] omapdss DSS: dss_save_context [ 107.011444] omapdss DSS: context saved [ 107.012145] PM: suspend of devices complete after 292.266 msecs [ 107.013061] PM: late suspend of devices complete after 0.885 msecs [ 107.716400] Successfully put all powerdomains to target state [ 107.717102] PM: early resume of devices complete after 0.488 msecs [ 107.718139] omapdss CORE: resume [ 107.718780]
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-19 at 11:29 +, Joe Woodward wrote: ... (apologies for awful formatting of the previous mail) ... Well, I don't see how UART could directly affect DSS. The thing that comes to my mind is that typing a char in the console causes a change in the power management, which then fixes the DSS. But then again, I'd expect the console to go into sleep/idle state after a short while, which should again cause sync_losts. But that's not happening... Can you show the dss clocks you use (cat debugfs/omapdss/clk)? It's been a while since I did anything with PM, so I don't remember how and if it can be done, but a simple test would be to lock the OMAP to always use OPP100. Tomi Right, I'm running with both CPUIDLE and CPUFREQ disabled at present so I assume that means I should always be in OPP100? The panel is 800x480 (I've added a configuration to the generic-panel with the correct panel parameters), with just 1 FB device (/dev/fb0). Clock dump is shown below: # pwd /debug/omapdss # cat clk [ 53.146057] omapdss DSS: dss_runtime_get [ 53.151000] omapdss DSS: dss_runtime_put [ 53.155151] omapdss DISPC: dispc_runtime_get [ 53.160430] omapdss DISPC: dispc_runtime_put - DSS - dpll4_ck 43200 DSS_FCK (DSS1_ALWON_FCLK) = 43200 / 12 * 2 = 7200 - DISPC - dispc fclk source = DSS_FCK (DSS1_ALWON_FCLK) fck 7200 - LCD1 - lcd1_clk source = DSS_FCK (DSS1_ALWON_FCLK) lck 7200lck div 1 pck 3600pck div 2 The SYNC_LOST errors sometimes don't happen, and sometimes fix themselves. It does seem that whenever a character is typed in to the UART the SYNC_LOST errors stop immediately. Yep, the clocks are so low that they should work fine with OPP50 also... Tomi -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
-Original Message- From: Tomi Valkeinen tomi.valkei...@ti.com To: Joe Woodward j...@terrafix.co.uk Cc: NeilBrown ne...@suse.de, Paul Walmsley p...@pwsan.com, khil...@ti.com, t-kri...@ti.com, govindra...@ti.com, linux-omap@vger.kernel.org Date: Thu, 19 Jan 2012 13:36:59 +0200 Subject: Re: DSS2/PM on 3.2 broken? On Thu, 2012-01-19 at 11:29 +, Joe Woodward wrote: ... (apologies for awful formatting of the previous mail) ... Well, I don't see how UART could directly affect DSS. The thing that comes to my mind is that typing a char in the console causes a change in the power management, which then fixes the DSS. But then again, I'd expect the console to go into sleep/idle state after a short while, which should again cause sync_losts. But that's not happening... Can you show the dss clocks you use (cat debugfs/omapdss/clk)? It's been a while since I did anything with PM, so I don't remember how and if it can be done, but a simple test would be to lock the OMAP to always use OPP100. Tomi Right, I'm running with both CPUIDLE and CPUFREQ disabled at present so I assume that means I should always be in OPP100? The panel is 800x480 (I've added a configuration to the generic-panel with the correct panel parameters), with just 1 FB device (/dev/fb0). Clock dump is shown below: # pwd /debug/omapdss # cat clk [ 53.146057] omapdss DSS: dss_runtime_get [ 53.151000] omapdss DSS: dss_runtime_put [ 53.155151] omapdss DISPC: dispc_runtime_get [ 53.160430] omapdss DISPC: dispc_runtime_put - DSS - dpll4_ck 43200 DSS_FCK (DSS1_ALWON_FCLK) = 43200 / 12 * 2 = 7200 - DISPC - dispc fclk source = DSS_FCK (DSS1_ALWON_FCLK) fck 7200 - LCD1 - lcd1_clk source = DSS_FCK (DSS1_ALWON_FCLK) lck 7200lck div 1 pck 3600pck div 2 The SYNC_LOST errors sometimes don't happen, and sometimes fix themselves. It does seem that whenever a character is typed in to the UART the SYNC_LOST errors stop immediately. Yep, the clocks are so low that they should work fine with OPP50 also... If I do (either from the console or via a button press on the screen) then I never get a SYNC_LOST. echo 0 /sys/devices/omapdss/display0/enabled echo 1 /sys/devices/omapdss/display0/enabled Just trying to think of some ideas that may be affecting the DSS... - Could it to be to do with the GPIO being used as a wake source (i.e. does the GPIO driver do runtime_pm properly?)? - Could it to be to do with the UART as it seems to fix itself whenever a character is pressed? - Could it to be to do with the ordering in which drivers are resumed? Is the SYNC_LOST normally due to a lack of memory bandwidth? If so, it is possible to find out what the kernel is doing during the resume? And before looking at this too much more, is the changing of the pm_runtime_put to the _sync versions the correct fix? Sorry for so many questions, but I'm interested in getting this fixed as it's the only thing stopping me from switching to 3.2 from 3.0! Cheers, Joe -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-19 at 12:21 +, Joe Woodward wrote: If I do (either from the console or via a button press on the screen) then I never get a SYNC_LOST. echo 0 /sys/devices/omapdss/display0/enabled echo 1 /sys/devices/omapdss/display0/enabled Just trying to think of some ideas that may be affecting the DSS... - Could it to be to do with the GPIO being used as a wake source (i.e. does the GPIO driver do runtime_pm properly?)? - Could it to be to do with the UART as it seems to fix itself whenever a character is pressed? - Could it to be to do with the ordering in which drivers are resumed? Well, none of these sound probable to me. I don't see any connection with GPIOs or UART as such with DSS (I mean something that could cause sync losts). The only thing that I can see affecting DSS via GPIO/UART is indirectly via PM, with voltages or clocks. But if DSS responds in some way, I presume the voltages are ok. And your clocks should be low enough to work with lower OPPs also. So I'm quite out of ideas. Of course there could be a problem in the omapdss when it returns its registers. But it wouldn't explain why the synclosts end if you press a key in the console. Is the SYNC_LOST normally due to a lack of memory bandwidth? If so, it is possible to find out what the kernel is doing during the resume? No. That should cause fifo underflows. I don't know the exact reasons for sync_lost, but I think it generally means that the DSS sub-modules lose sync with each other. Mostly this is due to clock config, but can be cause also by other (wrong) configuration which makes a DSS sub-module behave somehow wrong (or halt totally). And before looking at this too much more, is the changing of the pm_runtime_put to the _sync versions the correct fix? Well. I think it's a correct fix, in functional sense. However, I would like to use non-sync versions normally, but that's just for performance optimization. Sorry for so many questions, but I'm interested in getting this fixed as it's the only thing stopping me from switching to 3.2 from 3.0! The only idea I have currently is to add/enable debug prints which show information about PM changing its states, so we could see if something actually changes at the moment you press a key in the console. What kind of setup did you have again? I wonder if I could reproduce it easily with overo/beagle (it was omap3, wasn't it?)? Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
Tomi Valkeinen tomi.valkei...@ti.com writes: On Fri, 2012-01-13 at 11:30 -0800, Kevin Hilman wrote: Tomi Valkeinen tomi.valkei...@ti.com writes: pm_runtime_put() is called inside omapdss driver's .suspend callback. So I guess that means the system suspend has started. However, the logs show that the runtime_suspend callback _is_ being called before the system suspend is finished, so the workqueue can't be frozen... ...or there are other ways that the runtime_suspend callback is called. In order to ensure that devices are properly idled for system-wide suspend, The OMAP PM domain layer will call the drivers -runtime_suspend callback during late suspend (using the PM domain's _noirq callbacks.) This is there so that even when runtime PM has been disabled (via userspace, or pm_runtime_forbid() calls) the driver can still be properly idled before suspend. In your case, I suspect you're seeing the driver's -runtime_suspend callback called during late suspend, not by the PM workqueue. Ok, that explains the invocation of the callbacks. I don't understand DSS enough to make sense of the logs you sent, so not sure how to be of more help. Well, I think I can summarize the situation: Normally, when the user wants to turn a display off, the panel drivers call DSS functions to disable the corresponding output, leading to pm_runtime_put() calls. In system suspend case, the suspend callback initiates the turning off of the displays, which again leads to pm_runtime_put() calls. Now, you already said using pm_runtime_put_sync version is the correct way when suspending. But to use that I need to either always use pm_runtime_put_sync, or add an extra boolean which marks that we're suspending, and pass that around, or make it a DSS global variable. I'm not sure why can't you use the sync version just in the suspend callback? Both of those options sound a bit ugly to me. So my two questions are, 1) Is there something funny with the DSS architecture to have a problem like this? I tried to grep around, but I didn't see any other driver doing something like that (but I could've missed it). 2) why can't the PM framework handle this, by directing the pm_runtime_put calls to the _sync version when suspending. Somehow it feels strange to me that the driver needs to care about that. This has been a big debate with the runtime PM framework in general. Some of us have argued that system PM and runtime PM should be more unified, and that runtime PM calls should work normally from system suspend. After all, system suspend could be considered simply a forced idle/quiecient state state where all the runtime callbacks kick in to idle the system. However those of us arguing for that lost the battle with the PM core maintainers, and system PM and runtime PM have to be managed separately. I've worked around that by using the PM domain level calls to ensure runtime PM callbacks are called on system suspend, but the side effect is the complications you've run into. So for now, if you use runtime PM in your system PM you have to use the sync versions. For most drivers I've dealt with, that means just using sync version in the suspend callback is required, but maybe the interaction of the various DSS subblocks means you need the sync version everywhere. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Joe Woodward j...@terrafix.co.uk writes: [...] If I do (either from the console or via a button press on the screen) then I never get a SYNC_LOST. echo 0 /sys/devices/omapdss/display0/enabled echo 1 /sys/devices/omapdss/display0/enabled Just trying to think of some ideas that may be affecting the DSS... - Could it to be to do with the GPIO being used as a wake source (i.e. does the GPIO driver do runtime_pm properly?)? - Could it to be to do with the UART as it seems to fix itself whenever a character is pressed? - Could it to be to do with the ordering in which drivers are resumed? Here's my guess/hunch as to why the UART wakeup helps and GPIO doesn't. The UART's are idled using timeouts, so after any activity (including a wakeup) the UART timeout will not alow the UARTs to idle (and thus the system to hit low power states) for a given timeout period. I don't know what DSS does on wakeup, but presumably it has a burst of activity to do right away. Because of the UART timeout, the DSS probably completes its burst of activity before the system ever idles to a low power state. With a GPIO wakeup, the system can (re)idle and possibly hit low power states immediately after the wakeup. If that is the case, that suggests that the DSS probably needs a constraint someplace to prevent this from happening, at least during this burst period. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
NeilBrown ne...@suse.de writes: On Thu, 19 Jan 2012 11:37:39 -0800 Kevin Hilman khil...@ti.com wrote: Joe Woodward j...@terrafix.co.uk writes: [...] If I do (either from the console or via a button press on the screen) then I never get a SYNC_LOST. echo 0 /sys/devices/omapdss/display0/enabled echo 1 /sys/devices/omapdss/display0/enabled Just trying to think of some ideas that may be affecting the DSS... - Could it to be to do with the GPIO being used as a wake source (i.e. does the GPIO driver do runtime_pm properly?)? - Could it to be to do with the UART as it seems to fix itself whenever a character is pressed? - Could it to be to do with the ordering in which drivers are resumed? Here's my guess/hunch as to why the UART wakeup helps and GPIO doesn't. The UART's are idled using timeouts, so after any activity (including a wakeup) the UART timeout will not alow the UARTs to idle (and thus the system to hit low power states) for a given timeout period. While I agree with that, I'm wondering if there might be something else odd relating to the UARTs that we haven't spotted yet. I suspect the problem for both is constraints that aren't stated explicitly. I'm chasing done the different problem of HDQ not coping with CPUIDLE but I think it is very likely to have a similar root cause, as HDQ goes wrong at the same time that DSS goes wrong, and is a vaguely similar way. I just tried compiling omap_hwmod.c with #define DEBUG and got some strange results. Strange indeed. I assume you are you using the runtime PM converted driver from Paul? I tried that branch on my 3530/Overo (console on UART3) and didn't see any nul's on my UART3 console. [...] The other thing I discovered is that when I set the uart 'sleep_timeout' to 5 (all uarts) so that CPUIDLE can let CORE enter the lower power states, I get a fairly steady stream of: [ 1168.490478] omap_hwmod: uart1: enabling [ 1168.490509] omap_hwmod: uart1: enabling clocks [ 1168.490539] omap_hwmod: uart2: enabling [ 1168.490539] omap_hwmod: uart2: enabling clocks [ 1168.490631] omap_hwmod: uart3: enabling [ 1168.490661] omap_hwmod: uart3: enabling clocks [ 1168.490661] omap_hwmod: uart4: enabling [ 1168.490692] omap_hwmod: uart4: enabling clocks [ 1168.578796] omap_hwmod: uart3: idling [ 1168.578796] omap_hwmod: uart3: disabling clocks [ 1168.578826] omap_hwmod: uart4: idling [ 1168.578826] omap_hwmod: uart4: disabling clocks [ 1168.578857] omap_hwmod: uart1: idling [ 1168.578857] omap_hwmod: uart1: disabling clocks [ 1168.578887] omap_hwmod: uart2: idling [ 1168.578887] omap_hwmod: uart2: disabling clocks [ 1168.714263] omap_hwmod: uart1: enabling [ 1168.714294] omap_hwmod: uart1: enabling clocks [ 1168.714324] omap_hwmod: uart2: enabling [ 1168.714324] omap_hwmod: uart2: enabling clocks [ 1168.714416] omap_hwmod: uart3: enabling [ 1168.714447] omap_hwmod: uart3: enabling clocks [ 1168.714447] omap_hwmod: uart4: enabling [ 1168.714477] omap_hwmod: uart4: enabling clocks [ 1168.803344] omap_hwmod: uart3: idling [ 1168.803344] omap_hwmod: uart3: disabling clocks [ 1168.803375] omap_hwmod: uart4: idling [ 1168.803375] omap_hwmod: uart4: disabling clocks [ 1168.851257] omap_hwmod: uart3: enabling [ 1168.851287] omap_hwmod: uart3: enabling clocks [ 1168.851318] omap_hwmod: uart4: enabling [ 1168.851318] omap_hwmod: uart4: enabling clocks [ 1168.885955] omap_hwmod: uart3: idling [ 1168.885986] omap_hwmod: uart3: disabling clocks [ 1168.885986] omap_hwmod: uart4: idling [ 1168.886016] omap_hwmod: uart4: disabling clocks [ 1168.886383] omap_hwmod: uart3: enabling [ 1168.886383] omap_hwmod: uart3: enabling clocks [ 1168.886383] omap_hwmod: uart4: enabling [ 1168.886413] omap_hwmod: uart4: enabling clocks [ 1168.944885] omap_hwmod: uart3: idling [ 1168.944915] omap_hwmod: uart3: disabling clocks At least this part is expected. In the kernel you're using the UART clocks are enabled/disabled during the idle path depending on the low-power state being targetted, so would expect to see lots of UART clock gating going on. Kevin It seems to mostly be uart3 and uart4. If there is any interaction between the uart clock and the hdq clock, this could explain why the HDQ gets confused when all this is happening. And if there is a connection there, then there could also be a connection with a DSS clock which could be confusing it. I admit this is largely surmise and hypothesis - but there is definitely something odd here! Thanks, NeilBrown -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-19 at 11:24 -0800, Kevin Hilman wrote: Tomi Valkeinen tomi.valkei...@ti.com writes: Now, you already said using pm_runtime_put_sync version is the correct way when suspending. But to use that I need to either always use pm_runtime_put_sync, or add an extra boolean which marks that we're suspending, and pass that around, or make it a DSS global variable. I'm not sure why can't you use the sync version just in the suspend callback? To do that the suspend callback should be the one that disables the device and calls pm_runtime_suspend. With DSS that's not the case, it's the panel drivers that are in charge of enabling/disabling DSS (by calling appropriate functions in omapdss). The only thing that the omapdss suspend callback does is to call the normal disable functions on the display drivers. This leads to calls to disable-functions on the dss hwmod drivers, and then pm_runtime_put calls. But at the point the pm_runtime funcs are called, the code has no idea that we're actually doing system suspend. Thus I'd need to pass that information somehow, probably with a global variable. And while that's not difficult to do, it sure feels a bit ugly. I think I'll just change the pm_runtime_put calls to sync versions for now. I imagine the perf impact with the change should be negligible. I'll return to this issue after the devtree adaptation has been made, as it changes the child-parent relations of the dss related devices, and perhaps managing the PM states will also get a bit easier. Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
On Wed, 18 Jan 2012 09:13:59 +0200 Tomi Valkeinen tomi.valkei...@ti.com wrote: On Fri, 2012-01-13 at 22:20 +1100, NeilBrown wrote: Having CPUIDLE makes the DSS2 problem worse: lots of [ 21.085113] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled messages whenever the CPU isn't busy. I'm not sure if it is the case here, but DSS has restrictions about the max DSS clocks on different OPPs. For example, on OMAP4430 LCD clock maximum is 186MHz at OPP100, and 93MHz at OPP50. So it's a quite big drop, causing problems with all but the rather small displays. And the DSS driver doesn't have any support to handle this at the moment, as there isn't support in the PM framework to do this. I think the only way to handle this at the moment is for the DSS driver to set an arbitrarily high constraint on, say, mem throughput, and hope that it keeps the OMAP in the required OPP. Tomi This LCD panel on this device sets: .pixel_clock = 22000, in the struct omap_video_timings so I'm guessing that is 22MHz? I'll try building without SMARTREFLEX and see if that makes a difference - presumably that should keep it at OPP100 (??). Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
On Wed, 2012-01-18 at 22:15 +1100, NeilBrown wrote: On Wed, 18 Jan 2012 09:13:59 +0200 Tomi Valkeinen tomi.valkei...@ti.com wrote: On Fri, 2012-01-13 at 22:20 +1100, NeilBrown wrote: Having CPUIDLE makes the DSS2 problem worse: lots of [ 21.085113] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled messages whenever the CPU isn't busy. I'm not sure if it is the case here, but DSS has restrictions about the max DSS clocks on different OPPs. For example, on OMAP4430 LCD clock maximum is 186MHz at OPP100, and 93MHz at OPP50. So it's a quite big drop, causing problems with all but the rather small displays. And the DSS driver doesn't have any support to handle this at the moment, as there isn't support in the PM framework to do this. I think the only way to handle this at the moment is for the DSS driver to set an arbitrarily high constraint on, say, mem throughput, and hope that it keeps the OMAP in the required OPP. Tomi This LCD panel on this device sets: .pixel_clock = 22000, in the struct omap_video_timings so I'm guessing that is 22MHz? No, that's the pixel clock. There are probably limitations on the pix clock also, but usually the problem is the functional clocks, which need to be n x pck, where n depends on the needs for scaling. Tomi -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Wed, 18 Jan 2012 13:42:20 +0200 Tomi Valkeinen tomi.valkei...@ti.com wrote: On Wed, 2012-01-18 at 22:15 +1100, NeilBrown wrote: On Wed, 18 Jan 2012 09:13:59 +0200 Tomi Valkeinen tomi.valkei...@ti.com wrote: On Fri, 2012-01-13 at 22:20 +1100, NeilBrown wrote: Having CPUIDLE makes the DSS2 problem worse: lots of [ 21.085113] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled messages whenever the CPU isn't busy. I'm not sure if it is the case here, but DSS has restrictions about the max DSS clocks on different OPPs. For example, on OMAP4430 LCD clock maximum is 186MHz at OPP100, and 93MHz at OPP50. So it's a quite big drop, causing problems with all but the rather small displays. And the DSS driver doesn't have any support to handle this at the moment, as there isn't support in the PM framework to do this. I think the only way to handle this at the moment is for the DSS driver to set an arbitrarily high constraint on, say, mem throughput, and hope that it keeps the OMAP in the required OPP. Tomi This LCD panel on this device sets: .pixel_clock = 22000, in the struct omap_video_timings so I'm guessing that is 22MHz? No, that's the pixel clock. There are probably limitations on the pix clock also, but usually the problem is the functional clocks, which need to be n x pck, where n depends on the needs for scaling. Ahh.. cat /sys/kernel/debug/omapdss/clk is below and reports 66461538 for fck, so 66MHz? Still safe for OPP50. And disabling SMART REFLEX had no obvious effect. If you can think of anything else I could try to explore to narrow down the source of this, I am very happy to test or examine anything you suggest. Thanks, NeilBrown - DSS - dpll4_ck 86400 DSS_FCK (DSS1_ALWON_FCLK) = 86400 / 13 = 66461538 - DISPC - dispc fclk source = DSS_FCK (DSS1_ALWON_FCLK) fck 66461538 - LCD1 - lcd1_clk source = DSS_FCK (DSS1_ALWON_FCLK) lck 66461538lck div 1 pck 22153846pck div 3 - DSI1 PLL - dsi pll source = pclkfree Fint0 regn 0 CLKIN4DDR 0 regm 0 DSS_FCK (DSS1_ALWON_FCLK) 0 regm_dispc 0(off) DSS_FCK (DSS1_ALWON_FCLK) 0 regm_dsi 0 (off) - DSI1 - dsi fclk source = DSS_FCK (DSS1_ALWON_FCLK) DSI_FCLK66461538 DDR_CLK 0 TxByteClkHS 0 LP_CLK 0 signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
On Sat, 14 Jan 2012 10:09:15 +1100 NeilBrown ne...@suse.de wrote: On Fri, 13 Jan 2012 04:31:37 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Fri, 13 Jan 2012, NeilBrown wrote: Also, the HDQ access to the battery 'fuel gauge' is working fine, so presumably that gets disturbed by one of the lower power states (3,4,5). I guess it is 4,5,6 when CORE goes to RET or OFF. So we need some way for HDQ to say I'm busy just like UART can. omap_hdq_can_sleep() ??? There must be a cleaner way... The first thing to try is the HDQ runtime PM conversion series that was sent separately. Maybe let us know if that fixes the problem. Not completely - but I'm still exploring. I've finished exploring (for now) and am seriously stumped. Maybe you can help... I modified your patch so that hdq uses autosuspend - because it seemed like a good idea and ends up providing slightly more informative failure modes. The normal behaviour of the bq27000 driver is to request lots of values in quick succession. So using autosuspend (with 100ms timeout) means that we have only one pm_resume and one pm_suspend for each batch of requests which at least reduces the noise. I have added an omap_hdq_can_sleep() function which only succeeds if the hdq device is run-time suspended, and call it in the same place that omap_uart_can_sleep is called, to ensure the core never goes to a lower state while he hdq is active. I've double-checked that this really works. HDQ all works find normally, but then normally CORE stays ON because the UART keeps it on. If I tell the UARTs they can sleep (by setting sleep_timeout) and then access the bq27000 via HDQ it doesn't work properly. The normal sequence is that it resets the HDQ and sends a 'BREAK' signal (possibly not really necessary) for which we get a TIMEOUT interrupt. Then write address - get TX interrupt - wait a bit - get RX interrupt - read byte. We normally get the TIMEOUT interrupt and often get the first TX and then RX interrupts but then the device goes silent. No more interrupts come in, and the status register (which shows interrupt source) reads as zero. Occasionally I have seen maybe half a dozen successful reads before it goes silent, but usually it is zero or 1. It is as though something is timing out and going to sleep. A clock auto-sleeping? Seems unlikely. If I set all the UART sleep_timeout values to 0 to keep CORE always ON, the problem doesn't fix itself. Whatever got turned off stays off. Once it starts failing, it never works again until a reboot. Could we be caching something in memory which we think is on, but due to CORE going into RET (or maybe OFF), the thing is now off and we never bother to reset it because the cached value is 'on' ?? I'm wondering about _sysc_cache, but I haven't looked very deeply into it. Any other ideas? Oh - another thing. Sometimes during early boot I get: [0.158447] omap_hwmod: usbtll_fck: missing clockdomain for usbtll_fck. [0.176879] omap_hwmod: hdq: softreset failed (waited 1 usec) At the same time the UART seem to hiccup and my USB-Serial port loses track and doesn't see any more characters until I close and re-open. Could that be related? Any idea what it means? Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
On Fri, 2012-01-13 at 22:20 +1100, NeilBrown wrote: Having CPUIDLE makes the DSS2 problem worse: lots of [ 21.085113] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled messages whenever the CPU isn't busy. I'm not sure if it is the case here, but DSS has restrictions about the max DSS clocks on different OPPs. For example, on OMAP4430 LCD clock maximum is 186MHz at OPP100, and 93MHz at OPP50. So it's a quite big drop, causing problems with all but the rather small displays. And the DSS driver doesn't have any support to handle this at the moment, as there isn't support in the PM framework to do this. I think the only way to handle this at the moment is for the DSS driver to set an arbitrarily high constraint on, say, mem throughput, and hope that it keeps the OMAP in the required OPP. Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
cc Tero, Govindraj On Thu, 12 Jan 2012, NeilBrown wrote: On Wed, 11 Jan 2012 06:43:04 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: I spent some time exploring why cpuidle never drops below state 0 and found out that the code explicitly disables other states when uart is active - for a fairly broad definition of 'active'. I found the sleep_timeout setting and set them all to 1 second. This meant that cpuidle started working, but I got a lot of garbage characters. ... Does this really mean CPU_IDLE cannot be used when you might expect chars from a serial port - e.g. GPS or Bluetooth? Hmmm. Looking at mach-omap2/pm34xx.c and mach-omap2/cpuidle34xx.c, it indeed prevents even the MPU from entering a low-power state when the UART is active, as you write. That doesn't seem correct. Please try the following patch. It works fine for me on v3.2 with omap2plus_defconfig on a 35xx BeagleBoard. No dropped or garbled characters with console use. Not too surprising, since the UART has a receive FIFO to withstand interrupt servicing delays. - Paul From: Paul Walmsley p...@pwsan.com Date: Fri, 13 Jan 2012 02:10:30 -0700 Subject: [PATCH] ARM: OMAP3: PM: allow MPU to enter low-power states even when the UART is active For some reason, both the existing OMAP3 PM code and the OMAP3 CPUIdle driver prevent the MPU powerdomain from entering low-power modes when any UART isn't asleep. Possibly it is intended to minimize the ARM wakeup latency when UART activity arrives, but the UART has a FIFO that should handle this for most cases, with no dropped characters. I may be forgetting something important, though. And CORE/PER low-power states are a different matter entirely. Thanks to NeilBrown ne...@suse.de for reporting the problem. Signed-off-by: Paul Walmsley p...@pwsan.com Cc: NeilBrown ne...@suse.de Cc: Joe Woodward j...@terrafix.co.uk Cc: Tero Kristo t-kri...@ti.com Cc: Kevin Hilman khil...@ti.com Cc: Govindraj.R govindraj.r...@ti.com --- arch/arm/mach-omap2/cpuidle34xx.c |8 +++- arch/arm/mach-omap2/pm.h |1 - arch/arm/mach-omap2/pm34xx.c | 10 -- 3 files changed, 3 insertions(+), 16 deletions(-) diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c index 942bb4f..4ef32d4 100644 --- a/arch/arm/mach-omap2/cpuidle34xx.c +++ b/arch/arm/mach-omap2/cpuidle34xx.c @@ -183,6 +183,9 @@ static int next_valid_state(struct cpuidle_device *dev, core_deepest_state = PWRDM_POWER_OFF; } + if (!omap_uart_can_sleep()) + core_deepest_state = PWRDM_POWER_ON; + /* Check if current state is valid */ if ((cx-valid) (cx-mpu_state = mpu_deepest_state) @@ -244,11 +247,6 @@ static int omap3_enter_idle_bm(struct cpuidle_device *dev, struct omap3_idle_statedata *cx; int ret; - if (!omap3_can_sleep()) { - new_state_idx = drv-safe_state_index; - goto select_state; - } - /* * Prevent idle completely if CAM is active. * CAM does not have wakeup capability in OMAP3. diff --git a/arch/arm/mach-omap2/pm.h b/arch/arm/mach-omap2/pm.h index 4e166ad..eac6fce 100644 --- a/arch/arm/mach-omap2/pm.h +++ b/arch/arm/mach-omap2/pm.h @@ -18,7 +18,6 @@ extern void *omap3_secure_ram_storage; extern void omap3_pm_off_mode_enable(int); extern void omap_sram_idle(void); -extern int omap3_can_sleep(void); extern int omap_set_pwrdm_state(struct powerdomain *pwrdm, u32 state); extern int omap3_idle_init(void); diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c index efa6649..1c49824 100644 --- a/arch/arm/mach-omap2/pm34xx.c +++ b/arch/arm/mach-omap2/pm34xx.c @@ -484,21 +484,11 @@ console_still_active: clkdm_allow_idle(mpu_pwrdm-pwrdm_clkdms[0]); } -int omap3_can_sleep(void) -{ - if (!omap_uart_can_sleep()) - return 0; - return 1; -} - static void omap3_pm_idle(void) { local_irq_disable(); local_fiq_disable(); - if (!omap3_can_sleep()) - goto out; - if (omap_irq_pending() || need_resched()) goto out; -- 1.7.8.3 -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Thu, 12 Jan 2012, NeilBrown wrote: On Wed, 11 Jan 2012 06:43:04 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: Not off-mode - just retention. Okay. It's expected that the UART will drop the first byte also when CORE is in retention, due to the delay involved in waking up the chip. But that shouldn't prevent the MPU from going into a low-power state. I've sent a patch under separate cover that should fix this. If the former, this is hardly surprising as the HDQ driver (drivers/w1/masters/omap_hdq.c) doesn't contain any context save/restore code. In fact the HDQ driver doesn't even use PM runtime, so quite a bit of work is needed there. The HDQ driver does enable/disable the iclk and fclk. I thought I read that having the clocks enabled would prevent the powerdomain from dropping to a lower-power state ?? Several years ago, we started handling all of the operations needed to quiesce and wake an IP block in a common integration layer called 'hwmod', which is currently called through the PM runtime functions. This includes clk_{enable,disable}(). Unfortunately, due to some internal political reasons, not all of our drivers are converted to use PM runtime functions yet. You just happened to hit one that isn't -- apparently very few people care very much about this driver... Could you suggest some other driver which does do the right sort of runtime PM that I could try copying? I did a quick conversion for you. The series is at git://git.pwsan.com/linux-2.6 in the branch named 'hdq_hwmod_runtime_pm'. Please let me know if it works. I did boot-test it, and noticed that the hwmod subsystem was unable to softreset the device: [0.182769] omap_hwmod: hdq: softreset failed (waited 1 usec) This could be because I screwed up something in the hwmod data, or it could be that the HDQ reset hardware doesn't work correctly, similar to I2C. If that presents a problem, you might add the HWMOD_INIT_NO_RESET flag to the struct omap_hwmod.flags field for the HDQ. It also spit out these messages later, but I have no idea what they mean. At least the kernel seems to be able to access the HDQ registers: [1.685272] omap_hdq omap_hdq: OMAP HDQ Hardware Rev 0.5. Driver in Interrupt mode [1.888153] w1_master_driver w1_bus_master1: Family 1 for 01..3d is not registered. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012 03:05:03 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: cc Tero, Govindraj On Thu, 12 Jan 2012, NeilBrown wrote: On Wed, 11 Jan 2012 06:43:04 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: I spent some time exploring why cpuidle never drops below state 0 and found out that the code explicitly disables other states when uart is active - for a fairly broad definition of 'active'. I found the sleep_timeout setting and set them all to 1 second. This meant that cpuidle started working, but I got a lot of garbage characters. ... Does this really mean CPU_IDLE cannot be used when you might expect chars from a serial port - e.g. GPS or Bluetooth? Hmmm. Looking at mach-omap2/pm34xx.c and mach-omap2/cpuidle34xx.c, it indeed prevents even the MPU from entering a low-power state when the UART is active, as you write. That doesn't seem correct. Please try the following patch. It works fine for me on v3.2 with omap2plus_defconfig on a 35xx BeagleBoard. No dropped or garbled characters with console use. Not too surprising, since the UART has a receive FIFO to withstand interrupt servicing delays. Much nicer!! I now see CPUIDLE using state1 and state2 as well as the normal state0. I also see idle current drain drop from about 160mA to 95mA And there are no garbled characters when I type. Having CPUIDLE makes the DSS2 problem worse: lots of [ 21.085113] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled messages whenever the CPU isn't busy. The only way to get rid of them that I have found is to blank the display: # echo 1 /sys/devices/platform/omapfb/graphics/fb0/blank Also, the HDQ access to the battery 'fuel gauge' is working fine, so presumably that gets disturbed by one of the lower power states (3,4,5). I guess it is 4,5,6 when CORE goes to RET or OFF. So we need some way for HDQ to say I'm busy just like UART can. omap_hdq_can_sleep() ??? There must be a cleaner way... More experiments: - I enabled off_mode and started seeing state3 happening. UART and HDQ still fine - as expected. - I set the /sys/devices/system/cpu/cpu0/cpuidle/state?/usage values all to '10'. Now things start to break; Console is mostly OK, but the first char after 10 seconds of idle is bad. I type 'enter' and get 'C'. In general the first char typed is mapped to something else in a consistent way: 0a - c3 20 - 90 55 - d5 2a - 95 It is a bit like we are missing a start bit, and a stop bit is shifting down into the msb .. sometimes. HDQ also breaks. That surprised me a bit as the HDQ access is immediately after typing so it should be in the 10-second timeout when the UART keeps CORE powered on. Maybe we need runtime_pm to run omap_hdq_break() again to reinitialise the HDQ port if CORE might have been turned off. Also saw states all the way down to 6. Cannot tell if this helps current drain as I need HDQ working for that. So this is real progress, but there is still room for improvement. I might try writing an omap_hdq_can_sleep() and use it like omap_uart_can_sleep(). And get runtime_pm working on HDQ so it turns off after a short delay, and re-initialises if we have to turn it back on again. Thanks for the patch, it's been: Tested-by: NeilBrown ne...@suse.de Thanks, NeilBrown - Paul From: Paul Walmsley p...@pwsan.com Date: Fri, 13 Jan 2012 02:10:30 -0700 Subject: [PATCH] ARM: OMAP3: PM: allow MPU to enter low-power states even when the UART is active For some reason, both the existing OMAP3 PM code and the OMAP3 CPUIdle driver prevent the MPU powerdomain from entering low-power modes when any UART isn't asleep. Possibly it is intended to minimize the ARM wakeup latency when UART activity arrives, but the UART has a FIFO that should handle this for most cases, with no dropped characters. I may be forgetting something important, though. And CORE/PER low-power states are a different matter entirely. Thanks to NeilBrown ne...@suse.de for reporting the problem. Signed-off-by: Paul Walmsley p...@pwsan.com Cc: NeilBrown ne...@suse.de Cc: Joe Woodward j...@terrafix.co.uk Cc: Tero Kristo t-kri...@ti.com Cc: Kevin Hilman khil...@ti.com Cc: Govindraj.R govindraj.r...@ti.com --- arch/arm/mach-omap2/cpuidle34xx.c |8 +++- arch/arm/mach-omap2/pm.h |1 - arch/arm/mach-omap2/pm34xx.c | 10 -- 3 files changed, 3 insertions(+), 16 deletions(-) diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c index 942bb4f..4ef32d4 100644 --- a/arch/arm/mach-omap2/cpuidle34xx.c +++ b/arch/arm/mach-omap2/cpuidle34xx.c @@ -183,6 +183,9 @@ static int next_valid_state(struct cpuidle_device *dev, core_deepest_state = PWRDM_POWER_OFF; } + if (!omap_uart_can_sleep()) +
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012, NeilBrown wrote: Also, the HDQ access to the battery 'fuel gauge' is working fine, so presumably that gets disturbed by one of the lower power states (3,4,5). I guess it is 4,5,6 when CORE goes to RET or OFF. So we need some way for HDQ to say I'm busy just like UART can. omap_hdq_can_sleep() ??? There must be a cleaner way... The first thing to try is the HDQ runtime PM conversion series that was sent separately. Maybe let us know if that fixes the problem. More experiments: - I enabled off_mode and started seeing state3 happening. UART and HDQ still fine - as expected. - I set the /sys/devices/system/cpu/cpu0/cpuidle/state?/usage values all to '10'. Now things start to break; Console is mostly OK, but the first char after 10 seconds of idle is bad. I type 'enter' and get 'C'. In general the first char typed is mapped to something else in a consistent way: 0a - c3 20 - 90 55 - d5 2a - 95 It is a bit like we are missing a start bit, and a stop bit is shifting down into the msb .. sometimes. I'm not seeing garbling at all on the OMAP3 BeagleBoard here, running v3.2 with omap2plus_defconfig. The first character from the console gets lost, which again is expected when the CORE enters a low-power state and there's no flow control on the serial interface -- but no garbling. HDQ also breaks. That surprised me a bit as the HDQ access is immediately after typing so it should be in the 10-second timeout when the UART keeps CORE powered on. Maybe we need runtime_pm to run omap_hdq_break() again to reinitialise the HDQ port if CORE might have been turned off. Also saw states all the way down to 6. Cannot tell if this helps current drain as I need HDQ working for that. If CORE is going off, the HDQ device context will be lost, since the HDQ controller will be powered off. You'll need to add code to either save and restore the HDQ register context into the HDQ driver, or to reinitialize the HDQ when it is re-enabled. The latter is what we do for I2C, I think. As far as the DSS goes, your best bet there is to try to work it out with Archit and Tomi. Thanks for the patch, it's been: Tested-by: NeilBrown ne...@suse.de Thanks. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, Jan 13, 2012 at 3:35 PM, Paul Walmsley p...@pwsan.com wrote: cc Tero, Govindraj On Thu, 12 Jan 2012, NeilBrown wrote: On Wed, 11 Jan 2012 06:43:04 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: I spent some time exploring why cpuidle never drops below state 0 and found out that the code explicitly disables other states when uart is active - for a fairly broad definition of 'active'. I found the sleep_timeout setting and set them all to 1 second.  This meant that cpuidle started working, but I got a lot of garbage characters. ... Does this really mean CPU_IDLE cannot be used when you might expect chars from a serial port - e.g. GPS or Bluetooth? Hmmm.  Looking at mach-omap2/pm34xx.c and mach-omap2/cpuidle34xx.c, it indeed prevents even the MPU from entering a low-power state when the UART is active, as you write.  That doesn't seem correct. Please try the following patch.  It works fine for me on v3.2 with omap2plus_defconfig on a 35xx BeagleBoard.  No dropped or garbled characters with console use.  Not too surprising, since the UART has a receive FIFO to withstand interrupt servicing delays. - Paul From: Paul Walmsley p...@pwsan.com Date: Fri, 13 Jan 2012 02:10:30 -0700 Subject: [PATCH] ARM: OMAP3: PM: allow MPU to enter low-power states even  when the UART is active For some reason, both the existing OMAP3 PM code and the OMAP3 CPUIdle driver prevent the MPU powerdomain from entering low-power modes when any UART isn't asleep.  Possibly it is intended to minimize the ARM wakeup latency when UART activity arrives, but the UART has a FIFO that should handle this for most cases, with no dropped characters.  I may be forgetting something important, though.  And CORE/PER low-power states are a different matter entirely. Thanks to NeilBrown ne...@suse.de for reporting the problem. Signed-off-by: Paul Walmsley p...@pwsan.com Cc: NeilBrown ne...@suse.de Cc: Joe Woodward j...@terrafix.co.uk Cc: Tero Kristo t-kri...@ti.com Cc: Kevin Hilman khil...@ti.com Cc: Govindraj.R govindraj.r...@ti.com ---  arch/arm/mach-omap2/cpuidle34xx.c |   8 +++-  arch/arm/mach-omap2/pm.h      |   1 -  arch/arm/mach-omap2/pm34xx.c    |  10 --  3 files changed, 3 insertions(+), 16 deletions(-) diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c index 942bb4f..4ef32d4 100644 --- a/arch/arm/mach-omap2/cpuidle34xx.c +++ b/arch/arm/mach-omap2/cpuidle34xx.c @@ -183,6 +183,9 @@ static int next_valid_state(struct cpuidle_device *dev,             core_deepest_state = PWRDM_POWER_OFF;     } +    if (!omap_uart_can_sleep()) +        core_deepest_state = PWRDM_POWER_ON; + omap_uart_can_sleep is removed part of uart runtime conversion, and is no more available on latest mainline, looks like uart runtime patches got merged after 3.2 tagging. -- Thanks, Govindraj.R -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012, Govindraj wrote: diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c index 942bb4f..4ef32d4 100644 --- a/arch/arm/mach-omap2/cpuidle34xx.c +++ b/arch/arm/mach-omap2/cpuidle34xx.c @@ -183,6 +183,9 @@ static int next_valid_state(struct cpuidle_device *dev, Â Â Â Â Â Â Â Â Â Â Â Â core_deepest_state = PWRDM_POWER_OFF; Â Â Â Â } + Â Â Â if (!omap_uart_can_sleep()) + Â Â Â Â Â Â Â core_deepest_state = PWRDM_POWER_ON; + omap_uart_can_sleep is removed part of uart runtime conversion, and is no more available on latest mainline, looks like uart runtime patches got merged after 3.2 tagging. Great, so sounds like this patch is not needed... - Paul
Re: DSS2/PM on 3.2 broken?
Paul Walmsley p...@pwsan.com writes: cc Tero, Govindraj On Thu, 12 Jan 2012, NeilBrown wrote: On Wed, 11 Jan 2012 06:43:04 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: I spent some time exploring why cpuidle never drops below state 0 and found out that the code explicitly disables other states when uart is active - for a fairly broad definition of 'active'. I found the sleep_timeout setting and set them all to 1 second. This meant that cpuidle started working, but I got a lot of garbage characters. ... Does this really mean CPU_IDLE cannot be used when you might expect chars from a serial port - e.g. GPS or Bluetooth? Hmmm. Looking at mach-omap2/pm34xx.c and mach-omap2/cpuidle34xx.c, it indeed prevents even the MPU from entering a low-power state when the UART is active, as you write. That doesn't seem correct. Please try the following patch. It works fine for me on v3.2 with omap2plus_defconfig on a 35xx BeagleBoard. No dropped or garbled characters with console use. Not too surprising, since the UART has a receive FIFO to withstand interrupt servicing delays. - Paul From: Paul Walmsley p...@pwsan.com Date: Fri, 13 Jan 2012 02:10:30 -0700 Subject: [PATCH] ARM: OMAP3: PM: allow MPU to enter low-power states even when the UART is active For some reason, both the existing OMAP3 PM code and the OMAP3 CPUIdle driver prevent the MPU powerdomain from entering low-power modes when any UART isn't asleep. Possibly it is intended to minimize the ARM wakeup latency when UART activity arrives, but the UART has a FIFO that should handle this for most cases, with no dropped characters. I may be forgetting something important, though. And CORE/PER low-power states are a different matter entirely. Just FYI... the UART can_sleep hackery was removed for v3.3 and replaced by using a PM QoS constraint: commit 2fd149645eb46d26130d7070c6de037dddf34880 Author: Govindraj.R govindraj.r...@ti.com Date: Wed Nov 9 17:41:21 2011 +0530 ARM: OMAP2+: UART: Remove omap_uart_can_sleep and add pm_qos Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Tomi Valkeinen tomi.valkei...@ti.com writes: On Thu, 2012-01-12 at 14:40 -0800, Kevin Hilman wrote: Tomi Valkeinen tomi.valkei...@ti.com writes: On Mon, 2012-01-09 at 12:46 +, Joe Woodward wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: I've been debugging this, but I'm at loss. I added some debug prints, which I show below, and I also pushed them to git://gitorious.org/linux-omap-dss2/linux.git pm-test-prints if somebody wants to see exactly what they print. Kevin, perhaps you have an idea what could be wrong here. Long version below, short version: doing system suspend breaks omapdss, if omapdss uses pm_runtime_put, but works with pm_runtime_put_sync. Is the pm_runtime_put() done after system suspend has started? After system suspend has begun, the workqueue used for async callbacks is frozen, so any runtime PM calls that you want to work during suspend/resume must use the _sync versions. pm_runtime_put() is called inside omapdss driver's .suspend callback. So I guess that means the system suspend has started. However, the logs show that the runtime_suspend callback _is_ being called before the system suspend is finished, so the workqueue can't be frozen... ...or there are other ways that the runtime_suspend callback is called. In order to ensure that devices are properly idled for system-wide suspend, The OMAP PM domain layer will call the drivers -runtime_suspend callback during late suspend (using the PM domain's _noirq callbacks.) This is there so that even when runtime PM has been disabled (via userspace, or pm_runtime_forbid() calls) the driver can still be properly idled before suspend. In your case, I suspect you're seeing the driver's -runtime_suspend callback called during late suspend, not by the PM workqueue. I don't understand DSS enough to make sense of the logs you sent, so not sure how to be of more help. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Kevin Hilman khil...@ti.com writes: [...] From: Paul Walmsley p...@pwsan.com Date: Fri, 13 Jan 2012 02:10:30 -0700 Subject: [PATCH] ARM: OMAP3: PM: allow MPU to enter low-power states even when the UART is active For some reason, both the existing OMAP3 PM code and the OMAP3 CPUIdle driver prevent the MPU powerdomain from entering low-power modes when any UART isn't asleep. Possibly it is intended to minimize the ARM wakeup latency when UART activity arrives, but the UART has a FIFO that should handle this for most cases, with no dropped characters. I may be forgetting something important, though. And CORE/PER low-power states are a different matter entirely. Just FYI... the UART can_sleep hackery was removed for v3.3 and replaced by using a PM QoS constraint: However, looking closer I see now that the constraint being used in the mailine driver uses the CPU_DMA_LATENCY QoS constraint, which is preventing deeper C states as well just like the current code (before your patch), so an similar fix will still be needed for mainline. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012, Kevin Hilman wrote: Kevin Hilman khil...@ti.com writes: [...] From: Paul Walmsley p...@pwsan.com Date: Fri, 13 Jan 2012 02:10:30 -0700 Subject: [PATCH] ARM: OMAP3: PM: allow MPU to enter low-power states even when the UART is active For some reason, both the existing OMAP3 PM code and the OMAP3 CPUIdle driver prevent the MPU powerdomain from entering low-power modes when any UART isn't asleep. Possibly it is intended to minimize the ARM wakeup latency when UART activity arrives, but the UART has a FIFO that should handle this for most cases, with no dropped characters. I may be forgetting something important, though. And CORE/PER low-power states are a different matter entirely. Just FYI... the UART can_sleep hackery was removed for v3.3 and replaced by using a PM QoS constraint: However, looking closer I see now that the constraint being used in the mailine driver uses the CPU_DMA_LATENCY QoS constraint, which is preventing deeper C states as well just like the current code (before your patch), so an similar fix will still be needed for mainline. Hmm. This is the following code in serial_omap_set_termios() ? /* calculate wakeup latency constraint */ up-calc_latency = (100 * up-port.fifosize) / (1000 * baud / 8); up-latency = up-calc_latency; schedule_work(up-qos_work); Let's work out this formula for 115.2kbps: (100 * 16) / (1000 * 115200 / 8) = 1600 / 1440 = 1.11... (presumably the unit here is milliseconds) = 1 (since up-calc_latency is a u32) Then up-calc_latency is passed to pm_qos_update_request(). According to Documentation/power/pm_qos_interface.txt, it is expecting microseconds. So that's one problem. To fix that it should be up-calc_latency = (100 * up-port.fifosize) / (baud / 8); With that change, it should be possible to enter C3, with the timings that are in the mainline cpuidle34xx.c. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012 04:31:37 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: On Fri, 13 Jan 2012, NeilBrown wrote: Also, the HDQ access to the battery 'fuel gauge' is working fine, so presumably that gets disturbed by one of the lower power states (3,4,5). I guess it is 4,5,6 when CORE goes to RET or OFF. So we need some way for HDQ to say I'm busy just like UART can. omap_hdq_can_sleep() ??? There must be a cleaner way... The first thing to try is the HDQ runtime PM conversion series that was sent separately. Maybe let us know if that fixes the problem. Not completely - but I'm still exploring. More experiments: - I enabled off_mode and started seeing state3 happening. UART and HDQ still fine - as expected. - I set the /sys/devices/system/cpu/cpu0/cpuidle/state?/usage values all to '10'. Now things start to break; Console is mostly OK, but the first char after 10 seconds of idle is bad. I type 'enter' and get 'C'. In general the first char typed is mapped to something else in a consistent way: 0a - c3 20 - 90 55 - d5 2a - 95 It is a bit like we are missing a start bit, and a stop bit is shifting down into the msb .. sometimes. I'm not seeing garbling at all on the OMAP3 BeagleBoard here, running v3.2 with omap2plus_defconfig. The first character from the console gets lost, which again is expected when the CORE enters a low-power state and there's no flow control on the serial interface -- but no garbling. What baud rate? If I use 57600 or lower I see no problem at all. The first char I type gets through fine. If I try higher rates (115200 is the default, I've tried 230400 and 460800) I get consistent corruption, though different corruptions at different rates. I've tried enabling hardware handshaking but it makes no difference - however it is possibly there is something wrong with the wiring so I'll check that before pursuing that direction much further. Presumably with working hardware handshaking we should be able to avoid any loss or corruption?? (I'm not sure if I said - but the board I am working on is the GTA04 Openmoko board: www.gta04.org). Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
Hi just to follow up on this, I think there are some other bugs in this formula... maybe someone might want to tackle them. On Fri, 13 Jan 2012, Paul Walmsley wrote: up-calc_latency = (100 * up-port.fifosize) / (baud / 8); One problem is that UARTs generally don't transmit only 8 bits per byte. In the most common framing arrangement, 8N1, there are ten bits per byte. See for example https://en.wikipedia.org/wiki/8-N-1 So that part of this formula will contribute to an underestimation of the time it takes to fill the FIFO. Using 10 data bits per byte adds another 277 microseconds. So ideally this formula should change the 8 depending on the number of data bits, whether there is a parity bit, and on the number of stop bits. The real formula to determine the number of bits per byte should be something like: bits_per_byte = number of data bits + 1 bit if parity is used + number of stop bits Those quantities should all be easily available in serial_omap_set_termios(). So then the formula would be: up-calc_latency = (100 * up-port.fifosize) / (baud / bits_per_byte); ... Another issue is that the formula is missing the receive FIFO trigger level watermark. That might contribute to an overestimation of the amount of time available for the MPU to service the FIFO. So rather than using simply up-port.fifosize, it should use (up-port.fifosize - up.port-rxtrigthres). up.port-rxtrigthres should be the appropriate number derived from the 34xx TRM vZR section 17.4.2.1.2 Receive FIFO Trigger section - it comes from the UART's TCR register. So then it would be up-calc_latency = (100 * (up-port.fifosize - up-port.rxtrigthres)) / (baud / bits_per_byte); It seems however that omap-serial.c just hardcodes the rxtrigthres to 0, so maybe this is not such a significant issue: serial_out(up, UART_TI752_TCR, OMAP_UART_TCR_TRIG); Ideally the two FIFO thresholds here would be configurable by the UART user - especially given that the OMAP UART can be quite deep - 60 bytes. ... One other issue. If flow control is enabled, then IMHO, the UART probably shouldn't set a wakeup latency constraint at all. That's because there would be no risk of data loss. Users of the UART might have some performance constraints, but those should be set by the UART-using code or some system-wide wakeup latency value. ... As an aside, I wonder how we handle the UART flow control lines during chip off-mode? I assume those would need an appropriate off-mode mux value to indicate that the OMAP UART isn't ready; I wonder if we do that? - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Sat, 14 Jan 2012, NeilBrown wrote: On Fri, 13 Jan 2012 04:31:37 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: I'm not seeing garbling at all on the OMAP3 BeagleBoard here, running v3.2 with omap2plus_defconfig. The first character from the console gets lost, which again is expected when the CORE enters a low-power state and there's no flow control on the serial interface -- but no garbling. What baud rate? 115200 bps, no flow control. Presumably with working hardware handshaking we should be able to avoid any loss or corruption?? Yes. If you're using off-mode, you'll need to ensure that the I/O ring is set to wake the system when it sees the appropriate signal level on the incoming flow control line. (I'm not sure if I said - but the board I am working on is the GTA04 Openmoko board: www.gta04.org). Did not know that. Exciting! Maybe you can send some to the OMAP maintainers ;-) - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012, Paul Walmsley wrote: Let's work out this formula for 115.2kbps: (100 * 16) / (1000 * 115200 / 8) = 1600 / 1440 = 1.11... (presumably the unit here is milliseconds) = 1 (since up-calc_latency is a u32) Then up-calc_latency is passed to pm_qos_update_request(). According to Documentation/power/pm_qos_interface.txt, it is expecting microseconds. So that's one problem. To fix that it should be up-calc_latency = (100 * up-port.fifosize) / (baud / 8); With that change, it should be possible to enter C3, with the timings that are in the mainline cpuidle34xx.c. Actually, just noticed that the fifosize is 60 bytes. So that's even better. With the fixed formula above, it should be ~ 5.2 ms wakeup latency - enough to enter C4 in the CPUIdle data in mainline -- at least in theory... - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Fri, 13 Jan 2012 16:34:06 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: So ideally this formula should change the 8 depending on the number of data bits, whether there is a parity bit, and on the number of stop bits. The real formula to determine the number of bits per byte should be something like: bits_per_byte = number of data bits + 1 bit if parity is used + number of stop bits + 1 for the start bit. NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
On Sat, 14 Jan 2012, NeilBrown wrote: On Fri, 13 Jan 2012 16:34:06 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: So ideally this formula should change the 8 depending on the number of data bits, whether there is a parity bit, and on the number of stop bits. The real formula to determine the number of bits per byte should be something like: bits_per_byte = number of data bits + 1 bit if parity is used + number of stop bits + 1 for the start bit. Indeed... - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Hi, On Wed, 2012-01-11 at 15:15 +, Joe Woodward wrote: The pm_runtime_get_sync() call in dss_runtime_get() is returning a negative value, could you print out that value too? If the call to pm_runtime_get_sync() fails, we bail out immediately. So the LCD interface shouldn't have been enabled at all. I don't know why we get sync lost errors. There is a possibility that the LCD interface wasn't switched off when for some reason when we suspended, and switching on the clocks again tries to start the transfer without DSS being configured correctly? I can reproduce this on my Overo. The problem goes away if I change dss's and dispc'c pm_runtime_put() calls to pm_runtime_put_sync(). But I'm not quite sure why. I would imagine that the pm framework would flush the async pm_runtime_put calls before the system goes into suspend. Comparing the logs of successful and non-successful suspends it seems that the order of suspend and resume events changes. Perhaps the suspend goes fine, but for whatever reason the resume calls happen in a different order, causing the problems. I need to study this more to understand the problem. Kevin, can you confirm that just using pm_runtime_put() is enough even in system suspend case, and the framework handles doing the runtime_put before suspending? Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-12 at 11:28 +0200, Tomi Valkeinen wrote: Hi, On Wed, 2012-01-11 at 15:15 +, Joe Woodward wrote: The pm_runtime_get_sync() call in dss_runtime_get() is returning a negative value, could you print out that value too? If the call to pm_runtime_get_sync() fails, we bail out immediately. So the LCD interface shouldn't have been enabled at all. I don't know why we get sync lost errors. There is a possibility that the LCD interface wasn't switched off when for some reason when we suspended, and switching on the clocks again tries to start the transfer without DSS being configured correctly? I can reproduce this on my Overo. The problem goes away if I change dss's and dispc'c pm_runtime_put() calls to pm_runtime_put_sync(). But I'm not quite sure why. Oh, one more thing. I didn't get any sync losts, but I may just have been lucky. So it may or may not be a separate problem. Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-12 at 11:28 +0200, Tomi Valkeinen wrote: Hi, On Wed, 2012-01-11 at 15:15 +, Joe Woodward wrote: The pm_runtime_get_sync() call in dss_runtime_get() is returning a negative value, could you print out that value too? If the call to pm_runtime_get_sync() fails, we bail out immediately. So the LCD interface shouldn't have been enabled at all. I don't know why we get sync lost errors. There is a possibility that the LCD interface wasn't switched off when for some reason when we suspended, and switching on the clocks again tries to start the transfer without DSS being configured correctly? I can reproduce this on my Overo. The problem goes away if I change dss's and dispc'c pm_runtime_put() calls to pm_runtime_put_sync(). But I'm not quite sure why. (I seem to have not waken up today...) I also forgot to mention that the fail happens in drivers/base/power/runtime.c:rpm_resume(), where the check dev-power.disable_depth 0 causes the function to return EACCES. I'm not sure what disable_depth 0 means. Tomi signature.asc Description: This is a digitally signed message part
Re: DSS2/PM on 3.2 broken?
On Mon, 2012-01-09 at 12:46 +, Joe Woodward wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: I've been debugging this, but I'm at loss. I added some debug prints, which I show below, and I also pushed them to git://gitorious.org/linux-omap-dss2/linux.git pm-test-prints if somebody wants to see exactly what they print. Kevin, perhaps you have an idea what could be wrong here. Long version below, short version: doing system suspend breaks omapdss, if omapdss uses pm_runtime_put, but works with pm_runtime_put_sync. First, as a test case, here's what happens if I manually disable a display (with my comments beginning with #): echo 0 /sys/devices/platform/omapdss/display0/enabled # here dpi.c calls dispc_runtime_put [ 44.528015] dispc_runtime_put [ 44.531158] dispc disable depth: 0 [ 44.535308] dispc disable depth: 0 [ 44.538879] dispc_runtime_put end 0 # here dpi.c calls dss_runtime_put [ 44.542633] dss_runtime_put [ 44.545593] dss disable depth: 0 [ 44.548980] dss disable depth: 0 [ 44.552429] dss_runtime_put end 0 # then the runtime PM runs dispc suspend in its workqueue [ 44.557861] dispc_runtime_suspend start # here dispc_runtime_suspend uses dss_runtime_put [ 44.561981] dss_runtime_put [ 44.564910] dss disable depth: 0 [ 44.568420] dss disable depth: 0 [ 44.571838] dss_runtime_put end 0 [ 44.575378] dispc_runtime_suspend end # then the runtime PM runs dss suspend in its workqueue [ 44.579315] dss_runtime_suspend start [ 44.583221] dss_runtime_suspend end And similarly when enabling the display: echo 1 /sys/devices/platform/omapdss/display0/enabled [ 442.109222] dss_runtime_get [ 442.112304] dss disable depth: 0 [ 442.115753] dss_runtime_resume start [ 442.119537] dss_runtime_resume end [ 442.123199] dss disable depth: 0 [ 442.126586] dss_runtime_get end 0 [ 442.130126] dispc_runtime_get [ 442.133270] dispc disable depth: 0 [ 442.136932] dispc_runtime_resume start [ 442.140869] dss_runtime_get [ 442.143890] dss disable depth: 0 [ 442.147308] dss disable depth: 0 [ 442.150695] dss_runtime_get end 1 [ 442.154235] dispc_runtime_resume end [ 442.158020] dispc disable depth: 0 [ 442.161651] dispc_runtime_get end 0 This all works fine. Now, if I suspend the system: echo mem /sys/power/state [ 482.097930] PM: Syncing filesystems ... done. [ 482.105041] PM: Preparing system for mem sleep [ 482.116394] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 482.146331] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 482.177520] PM: Entering mem sleep # dss core suspend disables the displays, as was done manually above [ 482.197570] core suspend 2 [ 482.213867] dispc_runtime_put [ 482.217010] dispc disable depth: 0 [ 482.220886] dispc disable depth: 0 [ 482.224578] dispc_runtime_put end 0 [ 482.228271] dss_runtime_put [ 482.231262] dss disable depth: 0 [ 482.234649] dss disable depth: 0 [ 482.238128] dss_runtime_put end 0 [ 482.242218] core suspend end 0 # so far looks exactly the same as above [ 482.246795] PM: suspend of devices complete after 60.260 msecs # now runtime PM runs the suspends in its workqueue [ 482.256744] dispc_runtime_suspend start [ 482.260955] dss_runtime_put # dss's disable_depth is 1?? Which, afaik, means something like runtime PM is disabled for DSS. [ 482.263916] dss disable depth: 1 [ 482.267303] dss disable depth: 1 [ 482.270782] dss_runtime_put end 0 [ 482.274261] dispc_runtime_suspend end [ 482.278381] dss_runtime_suspend start [ 482.282257] dss_runtime_suspend end [ 482.286224] PM: late suspend of devices complete after 33.209 msecs [ 482.292968] Disabling non-boot CPUs ... Except the disable_depth, the suspend went as in manual case. But on resume things are rather broken: [ 696.654754] Successfully put all powerdomains to target state # Here the PM framework calls dss's runtime_resume callback, even if nobody has called pm_rutime_get for dss! [ 696.661254] dss_runtime_resume start [ 696.665100] dss_runtime_resume end [ 696.668731] dispc_runtime_resume start [ 696.672760] dss_runtime_get [ 696.675689] dss disable depth: 1 [ 696.679168] EACCESS RPM_RESUME [ 696.682373] dss disable depth: 1 # And here pm_runtime_get fails for DSS, because dss disable_depth is 1... [ 696.685791] [ cut here ] [ 696.690734] WARNING: at drivers/video/omap2/dss/dss.c:716 dss_runtime_get+0x74/0x8c() [ 696.699035] Modules linked in: [ 696.702392] [c001b3c4] (unwind_backtrace+0x0/0xf0) from [c005248c] (warn_slowpath_common+0x4c /0x64) [ 696.712371] [c005248c] (warn_slowpath_common+0x4c/0x64) from [c00524c0] (warn_slowpath_null+0 x1c/0x24) [ 696.722625] [c00524c0] (warn_slowpath_null+0x1c/0x24) from [c0273cd8] (dss_runtime_get+0x74/0 x8c) [ 696.732421]
Re: DSS2/PM on 3.2 broken?
Tomi Valkeinen tomi.valkei...@ti.com writes: On Mon, 2012-01-09 at 12:46 +, Joe Woodward wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: I've been debugging this, but I'm at loss. I added some debug prints, which I show below, and I also pushed them to git://gitorious.org/linux-omap-dss2/linux.git pm-test-prints if somebody wants to see exactly what they print. Kevin, perhaps you have an idea what could be wrong here. Long version below, short version: doing system suspend breaks omapdss, if omapdss uses pm_runtime_put, but works with pm_runtime_put_sync. Is the pm_runtime_put() done after system suspend has started? After system suspend has begun, the workqueue used for async callbacks is frozen, so any runtime PM calls that you want to work during suspend/resume must use the _sync versions. Kevin -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Thu, 2012-01-12 at 14:40 -0800, Kevin Hilman wrote: Tomi Valkeinen tomi.valkei...@ti.com writes: On Mon, 2012-01-09 at 12:46 +, Joe Woodward wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: I've been debugging this, but I'm at loss. I added some debug prints, which I show below, and I also pushed them to git://gitorious.org/linux-omap-dss2/linux.git pm-test-prints if somebody wants to see exactly what they print. Kevin, perhaps you have an idea what could be wrong here. Long version below, short version: doing system suspend breaks omapdss, if omapdss uses pm_runtime_put, but works with pm_runtime_put_sync. Is the pm_runtime_put() done after system suspend has started? After system suspend has begun, the workqueue used for async callbacks is frozen, so any runtime PM calls that you want to work during suspend/resume must use the _sync versions. pm_runtime_put() is called inside omapdss driver's .suspend callback. So I guess that means the system suspend has started. However, the logs show that the runtime_suspend callback _is_ being called before the system suspend is finished, so the workqueue can't be frozen... Tomi -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Mon, 9 Jan 2012, Joe Woodward wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: # echo mem /sys/power/state ... [ 24.466674] [ cut here ] [ 24.466857] WARNING: at drivers/video/omap2/dss/dss.c:713 0xc01350f8() ... At which point the screen either restarts, or sometimes flickers and I get the following: [ 22.578796] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 23.391571] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 24.391571] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled It normally recovers after doing this for a while... Anyone have any ideas? Consider cc'ing the DSS maintainers and linux-fbdev list: OMAP DISPLAY SUBSYSTEM and FRAMEBUFFER SUPPORT (DSS2) M: Tomi Valkeinen tomi.valkei...@ti.com L: linux-omap@vger.kernel.org L: linux-fb...@vger.kernel.org S: Maintained F: drivers/video/omap2/ F: Documentation/arm/OMAP/DSS - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
cc Kevin On Tue, 10 Jan 2012, NeilBrown wrote: It seems that when cpuidle on an omap3 tries to switch to lower power states, various things misbehave: - UARTs lose characters Is this with off-mode enabled, or is this with only retention idle enabled? If off-mode is enabled, and incoming serial traffic is used to wake the system, it's known and expected that the first byte will be lost. This is an artifact of the hardware and seems to be unavoidable. - dss loses sync Best to take this up with the DSS maintainer directly. - HDQ seems to lose everything. Is this with off-mode enabled, or just with retention idle? If the former, this is hardly surprising as the HDQ driver (drivers/w1/masters/omap_hdq.c) doesn't contain any context save/restore code. In fact the HDQ driver doesn't even use PM runtime, so quite a bit of work is needed there. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Hi, On Wednesday 11 January 2012 07:13 PM, Paul Walmsley wrote: cc Kevin On Tue, 10 Jan 2012, NeilBrown wrote: It seems that when cpuidle on an omap3 tries to switch to lower power states, various things misbehave: - UARTs lose characters Is this with off-mode enabled, or is this with only retention idle enabled? If off-mode is enabled, and incoming serial traffic is used to wake the system, it's known and expected that the first byte will be lost. This is an artifact of the hardware and seems to be unavoidable. - dss loses sync Best to take this up with the DSS maintainer directly. Could you enable omapdss debug by adding 'omapdss.debug=1 debug' in your bootargs and share logs? The pm_runtime_get_sync() call in dss_runtime_get() is returning a negative value, could you print out that value too? If the call to pm_runtime_get_sync() fails, we bail out immediately. So the LCD interface shouldn't have been enabled at all. I don't know why we get sync lost errors. There is a possibility that the LCD interface wasn't switched off when for some reason when we suspended, and switching on the clocks again tries to start the transfer without DSS being configured correctly? Also, are you on the final 3.2, or the 3.2-rc's by any chance? There were some hwmod patches which got in 3.2-rc4 related to hwmod fixes. Archit - HDQ seems to lose everything. Is this with off-mode enabled, or just with retention idle? If the former, this is hardly surprising as the HDQ driver (drivers/w1/masters/omap_hdq.c) doesn't contain any context save/restore code. In fact the HDQ driver doesn't even use PM runtime, so quite a bit of work is needed there. - Paul -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
...snip... Could you enable omapdss debug by adding 'omapdss.debug=1 debug' in your bootargs and share logs? As requested: # echo mem /sys/power/state [ 37.371734] PM: Syncing filesystems ... done. [ 37.397460] PM: Preparing system for mem sleep [ 37.402923] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 37.432312] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 37.463256] PM: Entering mem sleep [ 37.467132] Suspending console(s) (use no_console_suspend to debug) SYSTEM SUSPENDS HERE [ 37.587646] omapdss CORE: suspend 2 [ 37.776000] omapdss DISPC: dispc_runtime_put [ 37.776336] omapdss DSS: dss_runtime_put [ 37.779693] PM: suspend of devices complete after 305.389 msecs [ 37.786010] omapdss DISPC: dispc_save_context [ 37.786437] omapdss DISPC: context saved, ctx_loss_count 0 [ 37.786560] omapdss DSS: dss_runtime_put [ 37.786773] omapdss DSS: dss_save_context [ 37.786895] omapdss DSS: context saved [ 37.787384] PM: late suspend of devices complete after 7.446 msecs [ 40.932312] Successfully put all powerdomains to target state [ 40.934509] omapdss DSS: dss_restore_context [ 40.934661] omapdss DSS: context restored [ 40.934814] omapdss DSS: dss_runtime_get [ 40.934967] @@@ dss_runtime_get::pm_runtime_get_sync() return val = -13 [ 40.935089] [ cut here ] [ 40.935241] WARNING: at drivers/video/omap2/dss/dss.c:716 0xc0137304() [ 40.935363] Modules linked in: [ 40.935516] Backtrace: [ 40.935638] Function entered at [c0010c48] from [c02bdad0] [ 40.935791] r6:c02fa302 r5:02cc r4: r3: [ 40.936065] Function entered at [c02bdab8] from [c0034874] [ 40.936218] Function entered at [c0034820] from [c00348b0] [ 40.936340] r8: r7:c0386c14 r6:c0028c48 r5:c0371c24 r4:fff3 [ 40.936645] r3:0009 [ 40.936767] Function entered at [c003488c] from [c0137304] [ 40.936920] Function entered at [c01372a8] from [c0137b20] [ 40.937042] r4:dec50008 r3:c0137b10 [ 40.937225] Function entered at [c0137b10] from [c0179ff0] [ 40.937347] r6:c0028c48 r5:c0371c24 r4:dec50008 r3:c0137b10 [ 40.937652] Function entered at [c0179fbc] from [c0028ca4] [ 40.937774] Function entered at [c0028c48] from [c017afc0] [ 40.937927] r4:dec50008 r3: [ 40.938079] Function entered at [c017aee0] from [c017b5ec] [ 40.938232] r9:c02d0044 r8: r6:dec5005c r5:0010 r4:dec50008 [ 40.938568] Function entered at [c017b520] from [c0063e48] [ 40.938690] r8:c02c3898 r7: r6:0003 r5: r4: [ 40.938995] Function entered at [c0063cc0] from [c0064004] [ 40.939147] r7:0004 r6: r5:c02c388c r4:0003 [ 40.939422] Function entered at [c0063ef4] from [c0062e78] [ 40.939544] r6:0003 r5:0003 r4:d8869000 r3:006d [ 40.939819] Function entered at [c0062dd0] from [c0117ac8] [ 40.939971] Function entered at [c0117aac] from [c00e2e54] [ 40.940124] Function entered at [c00e2d44] from [c009903c] [ 40.940246] Function entered at [c0098f88] from [c00992b8] [ 40.940368] r8:0004 r7: r6: r5:000aabf8 r4:d857c940 [ 40.940704] Function entered at [c0099274] from [c000dcc0] [ 40.941101] r8:c000de44 r7:0004 r6:000aabf8 r5:0004 r4:000a8e38 [ 40.941436] ---[ end trace db0a5fd53a21ce6e ]--- [ 40.943511] PM: early resume of devices complete after 9.307 msecs [ 40.950622] omapdss DISPC: dispc_save_context [ 40.951049] omapdss DISPC: context saved, ctx_loss_count 0 [ 40.951171] omapdss DSS: dss_runtime_put [ 40.951812] omapdss CORE: resume [ 40.953460] omapdss DSS: dss_runtime_get [ 40.953613] @@@ dss_runtime_get::pm_runtime_get_sync() return val = 1 [ 40.953735] omapdss DISPC: dispc_runtime_get [ 40.953948] omapdss DSS: dss_runtime_get [ 40.954071] @@@ dss_runtime_get::pm_runtime_get_sync() return val = 1 [ 40.954193] omapdss DISPC: dispc_restore_context [ 40.954467] omapdss DISPC: onoff 0 rf 0 ieo 0 ipc 0 ihs 1 ivs 1 acbi 0 acb 0 [ 40.954620] omapdss DSS: dispc clock info found from cache. [ 40.954772] omapdss DSS: dpll4_m4 = 43200 [ 40.954986] omapdss DSS: fck = 7200 (12) [ 40.955108] omapdss DISPC: lck = 7200 (1) [ 40.955230] omapdss DISPC: pck = 3600 (2) [ 40.955383] omapdss DISPC: channel 0 xres 800 yres 480 [ 40.955505] omapdss DISPC: pck 36000 [ 40.955657] omapdss DISPC: hsw 10 hfp 164 hbp 89 vsw 10 vfp 10 vbp 23 [ 40.955810] omapdss DISPC: hsync 33866Hz, vsync 64Hz [ 41.294464] PM: resume of devices complete after 345.153 msecs [ 41.688568] PM: Finishing wakeup. [ 41.692169] Restarting tasks ... done. When SYNC is lost we get something like: [ 118.972534] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 119.166839] omapdss DISPC: dispc_runtime_put [ 119.175292] omapdss DSS: dss_runtime_put [ 119.183746] omapdss DISPC: dispc_enable_plane 1, 0 [
Re: DSS2/PM on 3.2 broken?
Hi, On Wednesday 11 January 2012 08:45 PM, Joe Woodward wrote: ...snip... Could you enable omapdss debug by adding 'omapdss.debug=1 debug' in your bootargs and share logs? As requested: # echo mem /sys/power/state [ 37.371734] PM: Syncing filesystems ... done. [ 37.397460] PM: Preparing system for mem sleep [ 37.402923] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 37.432312] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 37.463256] PM: Entering mem sleep [ 37.467132] Suspending console(s) (use no_console_suspend to debug) SYSTEM SUSPENDS HERE [ 37.587646] omapdss CORE: suspend 2 [ 37.776000] omapdss DISPC: dispc_runtime_put [ 37.776336] omapdss DSS: dss_runtime_put [ 37.779693] PM: suspend of devices complete after 305.389 msecs [ 37.786010] omapdss DISPC: dispc_save_context [ 37.786437] omapdss DISPC: context saved, ctx_loss_count 0 [ 37.786560] omapdss DSS: dss_runtime_put [ 37.786773] omapdss DSS: dss_save_context [ 37.786895] omapdss DSS: context saved [ 37.787384] PM: late suspend of devices complete after 7.446 msecs [ 40.932312] Successfully put all powerdomains to target state [ 40.934509] omapdss DSS: dss_restore_context [ 40.934661] omapdss DSS: context restored [ 40.934814] omapdss DSS: dss_runtime_get [ 40.934967] @@@ dss_runtime_get::pm_runtime_get_sync() return val = -13 This resume/restore looks to be happening a bit early according to me. All the resuming should happen after the print 'omapdss CORE: resume'. One experiment worth trying would be to reboot/halt the kernel which will lead to only shutdown of DSS, and no resume. This will help us figure out if the issue is in the suspend or resume path. Archit [ 40.935089] [ cut here ] [ 40.935241] WARNING: at drivers/video/omap2/dss/dss.c:716 0xc0137304() [ 40.935363] Modules linked in: [ 40.935516] Backtrace: [ 40.935638] Function entered at [c0010c48] from [c02bdad0] [ 40.935791] r6:c02fa302 r5:02cc r4: r3: [ 40.936065] Function entered at [c02bdab8] from [c0034874] [ 40.936218] Function entered at [c0034820] from [c00348b0] [ 40.936340] r8: r7:c0386c14 r6:c0028c48 r5:c0371c24 r4:fff3 [ 40.936645] r3:0009 [ 40.936767] Function entered at [c003488c] from [c0137304] [ 40.936920] Function entered at [c01372a8] from [c0137b20] [ 40.937042] r4:dec50008 r3:c0137b10 [ 40.937225] Function entered at [c0137b10] from [c0179ff0] [ 40.937347] r6:c0028c48 r5:c0371c24 r4:dec50008 r3:c0137b10 [ 40.937652] Function entered at [c0179fbc] from [c0028ca4] [ 40.937774] Function entered at [c0028c48] from [c017afc0] [ 40.937927] r4:dec50008 r3: [ 40.938079] Function entered at [c017aee0] from [c017b5ec] [ 40.938232] r9:c02d0044 r8: r6:dec5005c r5:0010 r4:dec50008 [ 40.938568] Function entered at [c017b520] from [c0063e48] [ 40.938690] r8:c02c3898 r7: r6:0003 r5: r4: [ 40.938995] Function entered at [c0063cc0] from [c0064004] [ 40.939147] r7:0004 r6: r5:c02c388c r4:0003 [ 40.939422] Function entered at [c0063ef4] from [c0062e78] [ 40.939544] r6:0003 r5:0003 r4:d8869000 r3:006d [ 40.939819] Function entered at [c0062dd0] from [c0117ac8] [ 40.939971] Function entered at [c0117aac] from [c00e2e54] [ 40.940124] Function entered at [c00e2d44] from [c009903c] [ 40.940246] Function entered at [c0098f88] from [c00992b8] [ 40.940368] r8:0004 r7: r6: r5:000aabf8 r4:d857c940 [ 40.940704] Function entered at [c0099274] from [c000dcc0] [ 40.941101] r8:c000de44 r7:0004 r6:000aabf8 r5:0004 r4:000a8e38 [ 40.941436] ---[ end trace db0a5fd53a21ce6e ]--- [ 40.943511] PM: early resume of devices complete after 9.307 msecs [ 40.950622] omapdss DISPC: dispc_save_context [ 40.951049] omapdss DISPC: context saved, ctx_loss_count 0 [ 40.951171] omapdss DSS: dss_runtime_put [ 40.951812] omapdss CORE: resume [ 40.953460] omapdss DSS: dss_runtime_get [ 40.953613] @@@ dss_runtime_get::pm_runtime_get_sync() return val = 1 [ 40.953735] omapdss DISPC: dispc_runtime_get [ 40.953948] omapdss DSS: dss_runtime_get [ 40.954071] @@@ dss_runtime_get::pm_runtime_get_sync() return val = 1 [ 40.954193] omapdss DISPC: dispc_restore_context [ 40.954467] omapdss DISPC: onoff 0 rf 0 ieo 0 ipc 0 ihs 1 ivs 1 acbi 0 acb 0 [ 40.954620] omapdss DSS: dispc clock info found from cache. [ 40.954772] omapdss DSS: dpll4_m4 = 43200 [ 40.954986] omapdss DSS: fck = 7200 (12) [ 40.955108] omapdss DISPC: lck = 7200 (1) [ 40.955230] omapdss DISPC: pck = 3600 (2) [ 40.955383] omapdss DISPC: channel 0 xres 800 yres 480 [ 40.955505] omapdss DISPC: pck 36000 [ 40.955657] omapdss DISPC: hsw 10 hfp 164 hbp 89 vsw 10 vfp 10 vbp 23 [ 40.955810] omapdss DISPC: hsync 33866Hz, vsync 64Hz [ 41.294464] PM: resume of devices
Re: DSS2/PM on 3.2 broken?
-Original Message- From: Archit a0393...@ti.com To: Joe Woodward j...@terrafix.co.uk Cc: Paul Walmsley p...@pwsan.com, NeilBrown ne...@suse.de, khil...@ti.com, linux-omap@vger.kernel.org, Tomi Valkeinen tomi.valkei...@ti.com Date: Wed, 11 Jan 2012 21:22:00 +0530 Subject: Re: DSS2/PM on 3.2 broken? Hi, On Wednesday 11 January 2012 08:45 PM, Joe Woodward wrote: ...snip... Could you enable omapdss debug by adding 'omapdss.debug=1 debug' in your bootargs and share logs? As requested: # echo mem /sys/power/state [ 37.371734] PM: Syncing filesystems ... done. [ 37.397460] PM: Preparing system for mem sleep [ 37.402923] Freezing user space processes ... (elapsed 0.02 seconds) done. [ 37.432312] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 37.463256] PM: Entering mem sleep [ 37.467132] Suspending console(s) (use no_console_suspend to debug) SYSTEM SUSPENDS HERE [ 37.587646] omapdss CORE: suspend 2 [ 37.776000] omapdss DISPC: dispc_runtime_put [ 37.776336] omapdss DSS: dss_runtime_put [ 37.779693] PM: suspend of devices complete after 305.389 msecs [ 37.786010] omapdss DISPC: dispc_save_context [ 37.786437] omapdss DISPC: context saved, ctx_loss_count 0 [ 37.786560] omapdss DSS: dss_runtime_put [ 37.786773] omapdss DSS: dss_save_context [ 37.786895] omapdss DSS: context saved [ 37.787384] PM: late suspend of devices complete after 7.446 msecs [ 40.932312] Successfully put all powerdomains to target state [ 40.934509] omapdss DSS: dss_restore_context [ 40.934661] omapdss DSS: context restored [ 40.934814] omapdss DSS: dss_runtime_get [ 40.934967] @@@ dss_runtime_get::pm_runtime_get_sync() return val = -13 This resume/restore looks to be happening a bit early according to me. All the resuming should happen after the print 'omapdss CORE: resume'. One experiment worth trying would be to reboot/halt the kernel which will lead to only shutdown of DSS, and no resume. This will help us figure out if the issue is in the suspend or resume path. Archit I'm not 100% sure what you mean for me to do here? Do you mean to halt the kernel after suspending, but before resuming? If so, how would I go about doing this? If I just reboot or poweroff I get the following... # poweroff The system is going down NOW! Sent SIGTERM to all processes Sent SIGKILL to all processes Requesting system poweroff [ 43.310607] omapdss CORE: shutdown [ 43.499053] omapdss DISPC: dispc_runtime_put [ 43.504272] omapdss DISPC: dispc_save_context [ 43.509399] omapdss DISPC: context saved, ctx_loss_count 0 [ 43.515747] omapdss DSS: dss_runtime_put [ 43.521453] omapdss DSS: dss_runtime_put [ 43.527221] System halted. # reboot The system is going down NOW! Sent SIGTERM to all processes Sent SIGKILL to all processes Requesting system reboot [ 25.652008] omapdss CORE: shutdown [ 25.836120] omapdss DISPC: dispc_runtime_put [ 25.841125] omapdss DISPC: dispc_save_context [ 25.846221] omapdss DSS: dss_runtime_put [ 25.851776] omapdss DISPC: context saved, ctx_loss_count 0 [ 25.857788] omapdss DSS: dss_runtime_put [ 25.864074] Restarting system. Cheers, Joe -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
Hi, On Wednesday 11 January 2012 09:43 PM, Joe Woodward wrote: -Original Message- From: Archita0393...@ti.com To: Joe Woodwardj...@terrafix.co.uk Cc: Paul Walmsleyp...@pwsan.com, NeilBrownne...@suse.de,khil...@ti.com,linux-omap@vger.kernel.org, Tomi Valkeinen tomi.valkei...@ti.com Date: Wed, 11 Jan 2012 21:22:00 +0530 Subject: Re: DSS2/PM on 3.2 broken? Hi, On Wednesday 11 January 2012 08:45 PM, Joe Woodward wrote: ...snip... Could you enable omapdss debug by adding 'omapdss.debug=1 debug' in your bootargs and share logs? snip Archit I'm not 100% sure what you mean for mmount -t debugfs debugfs /sys/kernel/debuge to do here? Do you mean to halt the kernel after suspending, but before resuming? If so, how would I go about doing this? I'd asked for reboot or power off as you have done below. The suspend path looks okay, I asked for the power off to see if there was a dss_runtime_get() was called at the wrong time by some interrupt while suspending. One difference between the shutdown and suspend logs is that we don't see the prints 'omapdss DSS: dss_save_context' and 'omapdss DSS: context saved', that seems a bit fishy. I'm not sure how to go ahead from here. We haven't seen this on OMAP4, and I don't think I have tried this on OAMP3. Maybe Tomi has more ideas. Archit If I just reboot or poweroff I get the following... # poweroff The system is going down NOW! Sent SIGTERM to all processes Sent SIGKILL to all processes Requesting system poweroff [ 43.310607] omapdss CORE: shutdown [ 43.499053] omapdss DISPC: dispc_runtime_put [ 43.504272] omapdss DISPC: dispc_save_context [ 43.509399] omapdss DISPC: context saved, ctx_loss_count 0 [ 43.515747] omapdss DSS: dss_runtime_put [ 43.521453] omapdss DSS: dss_runtime_put [ 43.527221] System halted. # reboot The system is going down NOW! Sent SIGTERM to all processes Sent SIGKILL to all processes Requesting system reboot [ 25.652008] omapdss CORE: shutdown [ 25.836120] omapdss DISPC: dispc_runtime_put [ 25.841125] omapdss DISPC: dispc_save_context [ 25.846221] omapdss DSS: dss_runtime_put [ 25.851776] omapdss DISPC: context saved, ctx_loss_count 0 [ 25.857788] omapdss DSS: dss_runtime_put [ 25.864074] Restarting system. Cheers, Joe -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DSS2/PM on 3.2 broken?
On Wed, 11 Jan 2012 06:43:04 -0700 (MST) Paul Walmsley p...@pwsan.com wrote: cc Kevin On Tue, 10 Jan 2012, NeilBrown wrote: It seems that when cpuidle on an omap3 tries to switch to lower power states, various things misbehave: - UARTs lose characters Is this with off-mode enabled, or is this with only retention idle enabled? I haven't experimented with off-mode much. When I try it the suspend-time current drain goes from 65mA to about 90mA which doesn't seem the right direction. The only other observation is that the cam power domain doesn't reach the target state of '0'. I'm guessing they are related but haven't progressed further. (I don't have anything like a camera driver compiled in - nor do I have a camera attached). So it is just retention. I occasionally see evidence of a garbled char when I wake from suspend, but that doesn't bother me much. I spent some time exploring why cpuidle never drops below state 0 and found out that the code explicitly disables other states when uart is active - for a fairly broad definition of 'active'. I found the sleep_timeout setting and set them all to 1 second. This meant that cpuidle started working, but I got a lot of garbage characters. If off-mode is enabled, and incoming serial traffic is used to wake the system, it's known and expected that the first byte will be lost. This is an artifact of the hardware and seems to be unavoidable. Does this really mean CPU_IDLE cannot be used when you might expect chars from a serial port - e.g. GPS or Bluetooth? Which power domain needs to stay non-off? I would guess 'CORE' for UART1,2 and 'PER' for UART3,4. I'm using UART3 for the console port, but it looks like 'PER' doesn't get switched off by CPUIDLE. What is the important issue here that I don't know about ? :-) - dss loses sync Best to take this up with the DSS maintainer directly. - HDQ seems to lose everything. Is this with off-mode enabled, or just with retention idle? Not off-mode - just retention. If the former, this is hardly surprising as the HDQ driver (drivers/w1/masters/omap_hdq.c) doesn't contain any context save/restore code. In fact the HDQ driver doesn't even use PM runtime, so quite a bit of work is needed there. The HDQ driver does enable/disable the iclk and fclk. I thought I read that having the clocks enabled would prevent the powerdomain from dropping to a lower-power state ?? Could you suggest some other driver which does do the right sort of runtime PM that I could try copying? Thanks, NeilBrown signature.asc Description: PGP signature
Re: DSS2/PM on 3.2 broken?
-Original Message- From: NeilBrown ne...@suse.de To: Joe Woodward j...@terrafix.co.uk Cc: linux-omap@vger.kernel.org Date: Tue, 10 Jan 2012 08:08:49 +1100 Subject: Re: DSS2/PM on 3.2 broken? On Mon, 09 Jan 2012 12:46:43 + Joe Woodward j...@terrafix.co.uk wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: # echo mem /sys/power/state [ 23.186279] PM: Syncing filesystems ... done. [ 23.194244] Freezing user space processes ... (elapsed 0.01 seconds) done. [ 23.219543] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 23.251037] Suspending console(s) (use no_console_suspend to debug) [ 23.554656] PM: suspend of devices complete after 296.417 msecs [ 23.561859] PM: late suspend of devices complete after 6.957 msecs [ 24.464813] Successfully put all powerdomains to target state [ 24.466674] [ cut here ] [ 24.466857] WARNING: at drivers/video/omap2/dss/dss.c:713 0xc01350f8() [ 24.467010] Modules linked in: [ 24.467132] Backtrace: [ 24.467254] Function entered at [c0010c3c] from [c02c4a60] [ 24.467407] r6:c02ffdaa r5:02c9 r4: r3: [ 24.467651] Function entered at [c02c4a48] from [c00344d0] [ 24.467803] Function entered at [c003447c] from [c003450c] [ 24.467926] r8: r7:c0390a84 r6:c00288a4 r5:c037b85c r4:fff3 [ 24.468200] r3:0009 [ 24.468322] Function entered at [c00344e8] from [c01350f8] [ 24.468475] Function entered at [c01350ac] from [c013595c] [ 24.468597] r4:dec50208 r3:c013594c [ 24.468780] Function entered at [c013594c] from [c0182724] [ 24.468902] r6:c00288a4 r5:c037b85c r4:dec50208 r3:c013594c [ 24.469177] Function entered at [c01826f0] from [c0028900] [ 24.469299] Function entered at [c00288a4] from [c01836f4] [ 24.469421] r4:dec50208 r3: [ 24.469604] Function entered at [c0183614] from [c0183d20] [ 24.469726] r9:c02d7044 r8: r6:dec5025c r5:0010 r4:dec50208 [ 24.470031] Function entered at [c0183c54] from [c0063aa4] [ 24.470153] r8:c02ca888 r7: r6:0003 r5: r4: [ 24.470458] Function entered at [c006391c] from [c0063c60] [ 24.470581] r7:0004 r6: r5:c02ca87c r4:0003 [ 24.471130] Function entered at [c0063b50] from [c0062ad4] [ 24.471282] r6:0003 r5:0003 r4:c6a87000 r3:006d [ 24.471557] Function entered at [c0062a2c] from [c0117728] [ 24.471679] Function entered at [c011770c] from [c00e2ab0] [ 24.471832] Function entered at [c00e29a0] from [c0098c98] [ 24.471954] Function entered at [c0098be4] from [c0098f14] [ 24.472076] r8:0004 r7: r6: r5:000ac750 r4:d8a70dc0 [ 24.472412] Function entered at [c0098ed0] from [c000dcc0] [ 24.472534] r8:c000de44 r7:0004 r6:000ac750 r5:0004 r4:000a8e38 [ 24.472839] ---[ end trace 9f4f3053f6637dae ]--- [ 24.475006] PM: early resume of devices complete after 8.666 msecs [ 25.040344] PM: resume of devices complete after 560.943 msecs [ 25.277801] Restarting tasks ... done. At which point the screen either restarts, or sometimes flickers and I get the following: [ 22.578796] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 23.391571] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 24.391571] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled It normally recovers after doing this for a while... Anyone have any ideas? I think I can help you work around the problem but I would much rather see it fixed. So the main reason I'm replying is to make this thread seem more interesting so that more people look at it and hopefully the right person sees it :-) It seems that when cpuidle on an omap3 tries to switch to lower power states, various things misbehave: - UARTs lose characters - dss loses sync - HDQ seems to lose everything. The first is known in the code so cpuidle is disabled if there is any UART traffic. At first boot it is assume thered might be uart activity and the timeout for there doesn't seem to be any activity is '0' meaning 'don't time out' (look for sleep_timeout in sysfs). On suspend, the UARTs are allowed to go to sleep and they are not explicitly woken at resume. So after a suspend/resume cycle, cpuidle is more likely to try to adjust idle states and so DSS and HDQ are more likely to fail. You can disable cpuidle by writing '0' to all the 'sleep_timeout' fields, or probably by echo 1 /sys/module/cpuidle/parameters/off That should stablise the display. It would be great
Re: DSS2/PM on 3.2 broken?
On Mon, 09 Jan 2012 12:46:43 + Joe Woodward j...@terrafix.co.uk wrote: I'm running on a Gumstix Overo (OMAP3530) with an 24-bit LCD panel connected via the DPI interface (using the generic panel driver). Entering standby used to work just fine on 3.0, but on 3.2 I get the following: # echo mem /sys/power/state [ 23.186279] PM: Syncing filesystems ... done. [ 23.194244] Freezing user space processes ... (elapsed 0.01 seconds) done. [ 23.219543] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done. [ 23.251037] Suspending console(s) (use no_console_suspend to debug) [ 23.554656] PM: suspend of devices complete after 296.417 msecs [ 23.561859] PM: late suspend of devices complete after 6.957 msecs [ 24.464813] Successfully put all powerdomains to target state [ 24.466674] [ cut here ] [ 24.466857] WARNING: at drivers/video/omap2/dss/dss.c:713 0xc01350f8() [ 24.467010] Modules linked in: [ 24.467132] Backtrace: [ 24.467254] Function entered at [c0010c3c] from [c02c4a60] [ 24.467407] r6:c02ffdaa r5:02c9 r4: r3: [ 24.467651] Function entered at [c02c4a48] from [c00344d0] [ 24.467803] Function entered at [c003447c] from [c003450c] [ 24.467926] r8: r7:c0390a84 r6:c00288a4 r5:c037b85c r4:fff3 [ 24.468200] r3:0009 [ 24.468322] Function entered at [c00344e8] from [c01350f8] [ 24.468475] Function entered at [c01350ac] from [c013595c] [ 24.468597] r4:dec50208 r3:c013594c [ 24.468780] Function entered at [c013594c] from [c0182724] [ 24.468902] r6:c00288a4 r5:c037b85c r4:dec50208 r3:c013594c [ 24.469177] Function entered at [c01826f0] from [c0028900] [ 24.469299] Function entered at [c00288a4] from [c01836f4] [ 24.469421] r4:dec50208 r3: [ 24.469604] Function entered at [c0183614] from [c0183d20] [ 24.469726] r9:c02d7044 r8: r6:dec5025c r5:0010 r4:dec50208 [ 24.470031] Function entered at [c0183c54] from [c0063aa4] [ 24.470153] r8:c02ca888 r7: r6:0003 r5: r4: [ 24.470458] Function entered at [c006391c] from [c0063c60] [ 24.470581] r7:0004 r6: r5:c02ca87c r4:0003 [ 24.471130] Function entered at [c0063b50] from [c0062ad4] [ 24.471282] r6:0003 r5:0003 r4:c6a87000 r3:006d [ 24.471557] Function entered at [c0062a2c] from [c0117728] [ 24.471679] Function entered at [c011770c] from [c00e2ab0] [ 24.471832] Function entered at [c00e29a0] from [c0098c98] [ 24.471954] Function entered at [c0098be4] from [c0098f14] [ 24.472076] r8:0004 r7: r6: r5:000ac750 r4:d8a70dc0 [ 24.472412] Function entered at [c0098ed0] from [c000dcc0] [ 24.472534] r8:c000de44 r7:0004 r6:000ac750 r5:0004 r4:000a8e38 [ 24.472839] ---[ end trace 9f4f3053f6637dae ]--- [ 24.475006] PM: early resume of devices complete after 8.666 msecs [ 25.040344] PM: resume of devices complete after 560.943 msecs [ 25.277801] Restarting tasks ... done. At which point the screen either restarts, or sometimes flickers and I get the following: [ 22.578796] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 23.391571] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled [ 24.391571] omapdss DISPC error: SYNC_LOST on channel lcd, restarting the output with video overlays disabled It normally recovers after doing this for a while... Anyone have any ideas? I think I can help you work around the problem but I would much rather see it fixed. So the main reason I'm replying is to make this thread seem more interesting so that more people look at it and hopefully the right person sees it :-) It seems that when cpuidle on an omap3 tries to switch to lower power states, various things misbehave: - UARTs lose characters - dss loses sync - HDQ seems to lose everything. The first is known in the code so cpuidle is disabled if there is any UART traffic. At first boot it is assume thered might be uart activity and the timeout for there doesn't seem to be any activity is '0' meaning 'don't time out' (look for sleep_timeout in sysfs). On suspend, the UARTs are allowed to go to sleep and they are not explicitly woken at resume. So after a suspend/resume cycle, cpuidle is more likely to try to adjust idle states and so DSS and HDQ are more likely to fail. You can disable cpuidle by writing '0' to all the 'sleep_timeout' fields, or probably by echo 1 /sys/module/cpuidle/parameters/off That should stablise the display. It would be great if you could check if cpuidle was active in 3.0 when this all worked. I would check by: grep . /sys/devices/system/cpu/cpu?/cpuidle/state?/usage and see if any state other than state0 is used. If cpuidle is changing power states but the display is happy, then it would be fantastic if you