Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Thu, Mar 1, 2018 at 1:15 AM, Ruslan Bilovolwrote: > Remove busy looping during watchdog reset. > Each polling of W_PEND_WTGR bit ("finish posted > write") after watchdog reset takes 120-140us > on BeagleBone Black board. Current U-Boot code > has watchdog resets in random places and often > there is situation when watchdog is reset > few times in a row in nested functions. > This adds extra delays and slows the whole system. > > Instead of polling W_PEND_WTGR bit, we skip > watchdog reset if the bit is set. Anyway, watchdog > is in the middle of reset *right now*, so we can > just return. > > This noticeably increases performance of the > system. Below are some measurements on BBB: > - DFU upload over USB 15% faster > - fastboot image upload 3x times faster > - USB ep0 transfers with 4k packets 20% faster > > Signed-off-by: Ruslan Bilovol Tested-by: Alex Kiernan -- Alex Kiernan ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
Because someone was wondering what the usb speed was in linux. My kernel is based on 3.2. Note I have USB DMA disabled so that doesn't help the speed. Not sure if u-boot is using DMA but I'd guess not. ADB over usb $ time adb -s 16019491 push userdata.img /mnt/obb/ userdata.img: 1 file pushed. 3.5 MB/s (45007716 bytes in 12.391s) ADB over rndis $ time adb -s 192.168.2.1 push userdata.img /mnt/obb/ userdata.img: 1 file pushed. 2.9 MB/s (45007716 bytes in 14.900s) Netcat over rndis 1|root@android:/ # time busybox nc 192.168.2.2 12345 > /mnt/obb/userdata.img 0m11.12s real 0m0.11s user 0m8.30s system ~3.9MB/s On Mon, Mar 5, 2018 at 3:45 PM, Ruslan Bilovolwrote: > On Mon, Mar 5, 2018 at 6:00 PM, Jonathan Cormer > wrote: >> On Sat, 2018-03-03 at 12:00 +, u-boot-requ...@lists.denx.de wrote: >>> >>> On 1 March 2018 at 16:33, Tom Rini wrote: >>> > >>> > On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: >>> > > >>> > > Hi Lukasz, >>> > > >>> > > On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski >>> > > wrote: >>> > > > >>> > > > Hi Ruslan, >>> > > > >>> > > > > >>> > > > > Remove busy looping during watchdog reset. >>> > > > > Each polling of W_PEND_WTGR bit ("finish posted >>> > > > > write") after watchdog reset takes 120-140us >>> > > > > on BeagleBone Black board. Current U-Boot code >>> > > > > has watchdog resets in random places and often >>> > > > > there is situation when watchdog is reset >>> > > > > few times in a row in nested functions. >>> > > > > This adds extra delays and slows the whole system. >>> > > > > >>> > > > > Instead of polling W_PEND_WTGR bit, we skip >>> > > > > watchdog reset if the bit is set. Anyway, watchdog >>> > > > > is in the middle of reset *right now*, so we can >>> > > > > just return. >>> > > > It seems like similar problem may be in the Linux kernel >>> > > > driver: >>> > > > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 >>> > > > >>> > > > Linux driver also waits for the write. >>> > > Right, Linux driver has similar code but it doesn't affect >>> > > performance. >>> > > This is because of watchdog usage in Linux, it is enabled and >>> > > reset by userspace. This is quite rare event. >>> > > Moreover, since Linux has interrupts enabled and has scheduling, >>> > > such busy loops in the omap driver are not very different to >>> > > just mdelay() usage. The system can handle interrupts, and can >>> > > even do preemption if PREEMPT is enabled. >>> > > So use of such busy loops in that particular case shouldn't cause >>> > > any noticeable performance issues. >>> > True. But, can you also submit a patch to the kernel side (and CC >>> > me) ? >>> > That's far more likely to get the attention of the engineers that >>> > might >>> > know of corner cases with the various SoC families where we might >>> > need >>> > to keep doing what we're doing or other possible problems. Thanks! >>> > >>> Some additional input from my side. >>> >>> My previous measurements were wrong, due to usage of slow USB hub >>> port >>> on my laptop. Using another USB port I have next transfer speed for >>> "fastboot flash" operation: >>> - without patch: 2.84 MiB/sec >>> - with patch: 7.81 MiB/sec >>> >>> So it gives us about 2.75 times improvement, as stated by Ruslan in >>> his commit message. Also, I tested that WDT continues to work >>> correctly, and it does (tried several use-cases, and also debug patch >>> with infinite loop). So again: >>> >>> Tested-by: Sam Protsenko >>> >>> Also: >>> >>> Reviewed-by: Sam Protsenko >>> >>> I checked the code and I can say that there shouldn't be any concerns >>> about WDT functionality with this patch. By the way, in my opinion, >>> for kernel this patch doesn't make much sense, and there might be >>> even >>> confusion in case we send it... If there any concerns about it, we >>> can >>> invite related engineers in this discussion, asking them to review >>> this. >>> >>> Overall, I think this patch is "must have" in U-Boot, as it gives us >>> dramatic improvement without any drawbacks, especially for AM335x >>> boards. It's probably the best approach for WDT reset procedure until >>> interrupts are enabled for ARM architecture (if we even consider it). >> Tested patch with AM335x on custom u-boot-2013.10 based branch. Patch >> cleanly applied, obviously not much in this wdt driver has changed >> since then. >> >> Fastboot flash >> Before patch: 2.89MiB/s >> After patch: 8.68MiB/s > > Cool, that's even a little bit better than I've got on my setup! > Thanks for testing. > > Regards, > Ruslan -- Jonathan Cormier Software Engineer Voice: 315.425.4045 x222 http://www.CriticalLink.com 6712 Brooklawn Parkway, Syracuse, NY 13211 ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Mon, Mar 5, 2018 at 6:00 PM, Jonathan Cormerwrote: > On Sat, 2018-03-03 at 12:00 +, u-boot-requ...@lists.denx.de wrote: >> >> On 1 March 2018 at 16:33, Tom Rini wrote: >> > >> > On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: >> > > >> > > Hi Lukasz, >> > > >> > > On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski >> > > wrote: >> > > > >> > > > Hi Ruslan, >> > > > >> > > > > >> > > > > Remove busy looping during watchdog reset. >> > > > > Each polling of W_PEND_WTGR bit ("finish posted >> > > > > write") after watchdog reset takes 120-140us >> > > > > on BeagleBone Black board. Current U-Boot code >> > > > > has watchdog resets in random places and often >> > > > > there is situation when watchdog is reset >> > > > > few times in a row in nested functions. >> > > > > This adds extra delays and slows the whole system. >> > > > > >> > > > > Instead of polling W_PEND_WTGR bit, we skip >> > > > > watchdog reset if the bit is set. Anyway, watchdog >> > > > > is in the middle of reset *right now*, so we can >> > > > > just return. >> > > > It seems like similar problem may be in the Linux kernel >> > > > driver: >> > > > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 >> > > > >> > > > Linux driver also waits for the write. >> > > Right, Linux driver has similar code but it doesn't affect >> > > performance. >> > > This is because of watchdog usage in Linux, it is enabled and >> > > reset by userspace. This is quite rare event. >> > > Moreover, since Linux has interrupts enabled and has scheduling, >> > > such busy loops in the omap driver are not very different to >> > > just mdelay() usage. The system can handle interrupts, and can >> > > even do preemption if PREEMPT is enabled. >> > > So use of such busy loops in that particular case shouldn't cause >> > > any noticeable performance issues. >> > True. But, can you also submit a patch to the kernel side (and CC >> > me) ? >> > That's far more likely to get the attention of the engineers that >> > might >> > know of corner cases with the various SoC families where we might >> > need >> > to keep doing what we're doing or other possible problems. Thanks! >> > >> Some additional input from my side. >> >> My previous measurements were wrong, due to usage of slow USB hub >> port >> on my laptop. Using another USB port I have next transfer speed for >> "fastboot flash" operation: >> - without patch: 2.84 MiB/sec >> - with patch: 7.81 MiB/sec >> >> So it gives us about 2.75 times improvement, as stated by Ruslan in >> his commit message. Also, I tested that WDT continues to work >> correctly, and it does (tried several use-cases, and also debug patch >> with infinite loop). So again: >> >> Tested-by: Sam Protsenko >> >> Also: >> >> Reviewed-by: Sam Protsenko >> >> I checked the code and I can say that there shouldn't be any concerns >> about WDT functionality with this patch. By the way, in my opinion, >> for kernel this patch doesn't make much sense, and there might be >> even >> confusion in case we send it... If there any concerns about it, we >> can >> invite related engineers in this discussion, asking them to review >> this. >> >> Overall, I think this patch is "must have" in U-Boot, as it gives us >> dramatic improvement without any drawbacks, especially for AM335x >> boards. It's probably the best approach for WDT reset procedure until >> interrupts are enabled for ARM architecture (if we even consider it). > Tested patch with AM335x on custom u-boot-2013.10 based branch. Patch > cleanly applied, obviously not much in this wdt driver has changed > since then. > > Fastboot flash > Before patch: 2.89MiB/s > After patch: 8.68MiB/s Cool, that's even a little bit better than I've got on my setup! Thanks for testing. Regards, Ruslan ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Sat, 2018-03-03 at 12:00 +, u-boot-requ...@lists.denx.de wrote: > > On 1 March 2018 at 16:33, Tom Riniwrote: > > > > On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: > > > > > > Hi Lukasz, > > > > > > On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski > > > wrote: > > > > > > > > Hi Ruslan, > > > > > > > > > > > > > > Remove busy looping during watchdog reset. > > > > > Each polling of W_PEND_WTGR bit ("finish posted > > > > > write") after watchdog reset takes 120-140us > > > > > on BeagleBone Black board. Current U-Boot code > > > > > has watchdog resets in random places and often > > > > > there is situation when watchdog is reset > > > > > few times in a row in nested functions. > > > > > This adds extra delays and slows the whole system. > > > > > > > > > > Instead of polling W_PEND_WTGR bit, we skip > > > > > watchdog reset if the bit is set. Anyway, watchdog > > > > > is in the middle of reset *right now*, so we can > > > > > just return. > > > > It seems like similar problem may be in the Linux kernel > > > > driver: > > > > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 > > > > > > > > Linux driver also waits for the write. > > > Right, Linux driver has similar code but it doesn't affect > > > performance. > > > This is because of watchdog usage in Linux, it is enabled and > > > reset by userspace. This is quite rare event. > > > Moreover, since Linux has interrupts enabled and has scheduling, > > > such busy loops in the omap driver are not very different to > > > just mdelay() usage. The system can handle interrupts, and can > > > even do preemption if PREEMPT is enabled. > > > So use of such busy loops in that particular case shouldn't cause > > > any noticeable performance issues. > > True. But, can you also submit a patch to the kernel side (and CC > > me) ? > > That's far more likely to get the attention of the engineers that > > might > > know of corner cases with the various SoC families where we might > > need > > to keep doing what we're doing or other possible problems. Thanks! > > > Some additional input from my side. > > My previous measurements were wrong, due to usage of slow USB hub > port > on my laptop. Using another USB port I have next transfer speed for > "fastboot flash" operation: > - without patch: 2.84 MiB/sec > - with patch: 7.81 MiB/sec > > So it gives us about 2.75 times improvement, as stated by Ruslan in > his commit message. Also, I tested that WDT continues to work > correctly, and it does (tried several use-cases, and also debug patch > with infinite loop). So again: > > Tested-by: Sam Protsenko > > Also: > > Reviewed-by: Sam Protsenko > > I checked the code and I can say that there shouldn't be any concerns > about WDT functionality with this patch. By the way, in my opinion, > for kernel this patch doesn't make much sense, and there might be > even > confusion in case we send it... If there any concerns about it, we > can > invite related engineers in this discussion, asking them to review > this. > > Overall, I think this patch is "must have" in U-Boot, as it gives us > dramatic improvement without any drawbacks, especially for AM335x > boards. It's probably the best approach for WDT reset procedure until > interrupts are enabled for ARM architecture (if we even consider it). Tested patch with AM335x on custom u-boot-2013.10 based branch. Patch cleanly applied, obviously not much in this wdt driver has changed since then. Fastboot flash Before patch: 2.89MiB/s After patch: 8.68MiB/s Tested-by: Jonathan Cormier Reviewed-by: Jonathan Cormier ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Sun, Mar 04, 2018 at 04:00:15PM +0200, Ruslan Bilovol wrote: > On Thu, Mar 1, 2018 at 4:33 PM, Tom Riniwrote: > > On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: > >> Hi Lukasz, > >> > >> On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski wrote: > >> > Hi Ruslan, > >> > > >> >> Remove busy looping during watchdog reset. > >> >> Each polling of W_PEND_WTGR bit ("finish posted > >> >> write") after watchdog reset takes 120-140us > >> >> on BeagleBone Black board. Current U-Boot code > >> >> has watchdog resets in random places and often > >> >> there is situation when watchdog is reset > >> >> few times in a row in nested functions. > >> >> This adds extra delays and slows the whole system. > >> >> > >> >> Instead of polling W_PEND_WTGR bit, we skip > >> >> watchdog reset if the bit is set. Anyway, watchdog > >> >> is in the middle of reset *right now*, so we can > >> >> just return. > >> > > >> > It seems like similar problem may be in the Linux kernel driver: > >> > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 > >> > > >> > Linux driver also waits for the write. > >> > >> Right, Linux driver has similar code but it doesn't affect performance. > >> This is because of watchdog usage in Linux, it is enabled and > >> reset by userspace. This is quite rare event. > >> Moreover, since Linux has interrupts enabled and has scheduling, > >> such busy loops in the omap driver are not very different to > >> just mdelay() usage. The system can handle interrupts, and can > >> even do preemption if PREEMPT is enabled. > >> So use of such busy loops in that particular case shouldn't cause > >> any noticeable performance issues. > > > > True. But, can you also submit a patch to the kernel side (and CC me) ? > > That's far more likely to get the attention of the engineers that might > > know of corner cases with the various SoC families where we might need > > to keep doing what we're doing or other possible problems. Thanks! > > Do you mean just resend this patch, but include kernel watchdog > mailing list? Well, I'm going to set aside what I said here since we've gotten some TI folks to review the patch here. -- Tom signature.asc Description: PGP signature ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Thu, Mar 1, 2018 at 4:33 PM, Tom Riniwrote: > On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: >> Hi Lukasz, >> >> On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski wrote: >> > Hi Ruslan, >> > >> >> Remove busy looping during watchdog reset. >> >> Each polling of W_PEND_WTGR bit ("finish posted >> >> write") after watchdog reset takes 120-140us >> >> on BeagleBone Black board. Current U-Boot code >> >> has watchdog resets in random places and often >> >> there is situation when watchdog is reset >> >> few times in a row in nested functions. >> >> This adds extra delays and slows the whole system. >> >> >> >> Instead of polling W_PEND_WTGR bit, we skip >> >> watchdog reset if the bit is set. Anyway, watchdog >> >> is in the middle of reset *right now*, so we can >> >> just return. >> > >> > It seems like similar problem may be in the Linux kernel driver: >> > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 >> > >> > Linux driver also waits for the write. >> >> Right, Linux driver has similar code but it doesn't affect performance. >> This is because of watchdog usage in Linux, it is enabled and >> reset by userspace. This is quite rare event. >> Moreover, since Linux has interrupts enabled and has scheduling, >> such busy loops in the omap driver are not very different to >> just mdelay() usage. The system can handle interrupts, and can >> even do preemption if PREEMPT is enabled. >> So use of such busy loops in that particular case shouldn't cause >> any noticeable performance issues. > > True. But, can you also submit a patch to the kernel side (and CC me) ? > That's far more likely to get the attention of the engineers that might > know of corner cases with the various SoC families where we might need > to keep doing what we're doing or other possible problems. Thanks! Do you mean just resend this patch, but include kernel watchdog mailing list? Or reimplement it for kernel? In last case it doesn't have much sense - it brings almost no value to the driver, since watchdog usage approach by kernel is too different comparing to U-Boot. Yet also kernel driver is more complicated and such a change may trigger it's rework. Probably will need to introduce additional synchronization at least. By the way, if needed, I can resend this patch including watchdog kernel mailing list and related maintainer, so kernel folks can review and provide some feedback. Best regards, Ruslan Bilovol ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On 1 March 2018 at 16:33, Tom Riniwrote: > On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: >> Hi Lukasz, >> >> On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski wrote: >> > Hi Ruslan, >> > >> >> Remove busy looping during watchdog reset. >> >> Each polling of W_PEND_WTGR bit ("finish posted >> >> write") after watchdog reset takes 120-140us >> >> on BeagleBone Black board. Current U-Boot code >> >> has watchdog resets in random places and often >> >> there is situation when watchdog is reset >> >> few times in a row in nested functions. >> >> This adds extra delays and slows the whole system. >> >> >> >> Instead of polling W_PEND_WTGR bit, we skip >> >> watchdog reset if the bit is set. Anyway, watchdog >> >> is in the middle of reset *right now*, so we can >> >> just return. >> > >> > It seems like similar problem may be in the Linux kernel driver: >> > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 >> > >> > Linux driver also waits for the write. >> >> Right, Linux driver has similar code but it doesn't affect performance. >> This is because of watchdog usage in Linux, it is enabled and >> reset by userspace. This is quite rare event. >> Moreover, since Linux has interrupts enabled and has scheduling, >> such busy loops in the omap driver are not very different to >> just mdelay() usage. The system can handle interrupts, and can >> even do preemption if PREEMPT is enabled. >> So use of such busy loops in that particular case shouldn't cause >> any noticeable performance issues. > > True. But, can you also submit a patch to the kernel side (and CC me) ? > That's far more likely to get the attention of the engineers that might > know of corner cases with the various SoC families where we might need > to keep doing what we're doing or other possible problems. Thanks! > Some additional input from my side. My previous measurements were wrong, due to usage of slow USB hub port on my laptop. Using another USB port I have next transfer speed for "fastboot flash" operation: - without patch: 2.84 MiB/sec - with patch: 7.81 MiB/sec So it gives us about 2.75 times improvement, as stated by Ruslan in his commit message. Also, I tested that WDT continues to work correctly, and it does (tried several use-cases, and also debug patch with infinite loop). So again: Tested-by: Sam Protsenko Also: Reviewed-by: Sam Protsenko I checked the code and I can say that there shouldn't be any concerns about WDT functionality with this patch. By the way, in my opinion, for kernel this patch doesn't make much sense, and there might be even confusion in case we send it... If there any concerns about it, we can invite related engineers in this discussion, asking them to review this. Overall, I think this patch is "must have" in U-Boot, as it gives us dramatic improvement without any drawbacks, especially for AM335x boards. It's probably the best approach for WDT reset procedure until interrupts are enabled for ARM architecture (if we even consider it). > -- > Tom ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
Hi Ruslan, > Hi Lukasz, > > On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewski> wrote: > > Hi Ruslan, > > > >> Remove busy looping during watchdog reset. > >> Each polling of W_PEND_WTGR bit ("finish posted > >> write") after watchdog reset takes 120-140us > >> on BeagleBone Black board. Current U-Boot code > >> has watchdog resets in random places and often > >> there is situation when watchdog is reset > >> few times in a row in nested functions. > >> This adds extra delays and slows the whole system. > >> > >> Instead of polling W_PEND_WTGR bit, we skip > >> watchdog reset if the bit is set. Anyway, watchdog > >> is in the middle of reset *right now*, so we can > >> just return. > > > > It seems like similar problem may be in the Linux kernel driver: > > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 > > > > Linux driver also waits for the write. > > Right, Linux driver has similar code but it doesn't affect > performance. This is because of watchdog usage in Linux, it is > enabled and reset by userspace. This is quite rare event. > Moreover, since Linux has interrupts enabled and has scheduling, > such busy loops in the omap driver are not very different to > just mdelay() usage. The system can handle interrupts, and can > even do preemption if PREEMPT is enabled. > So use of such busy loops in that particular case shouldn't cause > any noticeable performance issues. > > In U-Boot we have polling instead of interrupts and something like > cooperative multitasking. Also watchdog resets much more often, > and such busy pollings in the driver delay execution of other code. > > For example, in DFU we have indefinite loop, something like: >---> > | dfu_write() > | watchdog_reset() > | ... > | handle_usb() > | `- watchdog_reset() > | ` ... > `' > > And each delay caused by watchdog reset adds significant > overhead comparing to handling USB events or writing to the > storage. > > So as bottom line, we don't need to do similar change in > Linux driver because there is almost no impact on performance > in that case. But in case of U-Boot which uses polling instead > of interrupts, this patch causes such a big performance > improvement. Thank you for this work - it really improves speed considerably :-). Reviewed-by: Lukasz Majewski > > Best regards, > Ruslan > > > > >> > >> This noticeably increases performance of the > >> system. Below are some measurements on BBB: > >> - DFU upload over USB 15% faster > >> - fastboot image upload 3x times faster > >> - USB ep0 transfers with 4k packets 20% faster > >> > >> Signed-off-by: Ruslan Bilovol > >> --- > >> drivers/watchdog/omap_wdt.c | 21 +++-- > >> 1 file changed, 15 insertions(+), 6 deletions(-) > >> > >> diff --git a/drivers/watchdog/omap_wdt.c > >> b/drivers/watchdog/omap_wdt.c index 7b1f429..d724c96 100644 > >> --- a/drivers/watchdog/omap_wdt.c > >> +++ b/drivers/watchdog/omap_wdt.c > >> @@ -53,16 +53,25 @@ void hw_watchdog_reset(void) > >> { > >> struct wd_timer *wdt = (struct wd_timer *)WDT_BASE; > >> > >> - /* wait for posted write to complete */ > >> - while ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > >> - ; > >> + /* > >> + * Somebody just triggered watchdog reset and write to WTGR > >> register > >> + * is in progress. It is resetting right now, no need to > >> trigger it > >> + * again > >> + */ > >> + if ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > >> + return; > >> > >> wdt_trgr_pattern = ~wdt_trgr_pattern; > >> writel(wdt_trgr_pattern, >wdtwtgr); > >> > >> - /* wait for posted write to complete */ > >> - while ((readl(>wdtwwps) & WDT_WWPS_PEND_WTGR)) > >> - ; > >> + /* > >> + * Don't wait for posted write to complete, i.e. don't check > >> + * WDT_WWPS_PEND_WTGR bit in WWPS register. There is no > >> writes to > >> + * WTGR register outside of this func, and if entering it > >> + * we see WDT_WWPS_PEND_WTGR bit set, it means watchdog reset > >> + * was just triggered. This prevents us from wasting time in > >> busy > >> + * polling of WDT_WWPS_PEND_WTGR bit. > >> + */ > >> } > >> > >> static int omap_wdt_set_timeout(unsigned int timeout) > > > > > > > > > > Best regards, > > > > Lukasz Majewski > > > > -- > > > > DENX Software Engineering GmbH, Managing Director: Wolfgang > > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, > > Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: > > w...@denx.de Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de pgpMDbXOwluHT.pgp Description: OpenPGP digital
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Thursday 01 March 2018 06:45 AM, Ruslan Bilovol wrote: > Remove busy looping during watchdog reset. > Each polling of W_PEND_WTGR bit ("finish posted > write") after watchdog reset takes 120-140us > on BeagleBone Black board. Current U-Boot code > has watchdog resets in random places and often > there is situation when watchdog is reset > few times in a row in nested functions. > This adds extra delays and slows the whole system. > > Instead of polling W_PEND_WTGR bit, we skip > watchdog reset if the bit is set. Anyway, watchdog > is in the middle of reset *right now*, so we can > just return. > > This noticeably increases performance of the > system. Below are some measurements on BBB: > - DFU upload over USB 15% faster > - fastboot image upload 3x times faster > - USB ep0 transfers with 4k packets 20% faster > > Signed-off-by: Ruslan BilovolThanks and regards, Lokesh > --- > drivers/watchdog/omap_wdt.c | 21 +++-- > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/watchdog/omap_wdt.c b/drivers/watchdog/omap_wdt.c > index 7b1f429..d724c96 100644 > --- a/drivers/watchdog/omap_wdt.c > +++ b/drivers/watchdog/omap_wdt.c > @@ -53,16 +53,25 @@ void hw_watchdog_reset(void) > { > struct wd_timer *wdt = (struct wd_timer *)WDT_BASE; > > - /* wait for posted write to complete */ > - while ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > - ; > + /* > + * Somebody just triggered watchdog reset and write to WTGR register > + * is in progress. It is resetting right now, no need to trigger it > + * again > + */ > + if ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > + return; > > wdt_trgr_pattern = ~wdt_trgr_pattern; > writel(wdt_trgr_pattern, >wdtwtgr); > > - /* wait for posted write to complete */ > - while ((readl(>wdtwwps) & WDT_WWPS_PEND_WTGR)) > - ; > + /* > + * Don't wait for posted write to complete, i.e. don't check > + * WDT_WWPS_PEND_WTGR bit in WWPS register. There is no writes to > + * WTGR register outside of this func, and if entering it > + * we see WDT_WWPS_PEND_WTGR bit set, it means watchdog reset > + * was just triggered. This prevents us from wasting time in busy > + * polling of WDT_WWPS_PEND_WTGR bit. > + */ > } > > static int omap_wdt_set_timeout(unsigned int timeout) > ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On Thu, Mar 01, 2018 at 04:10:44PM +0200, Ruslan Bilovol wrote: > Hi Lukasz, > > On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewskiwrote: > > Hi Ruslan, > > > >> Remove busy looping during watchdog reset. > >> Each polling of W_PEND_WTGR bit ("finish posted > >> write") after watchdog reset takes 120-140us > >> on BeagleBone Black board. Current U-Boot code > >> has watchdog resets in random places and often > >> there is situation when watchdog is reset > >> few times in a row in nested functions. > >> This adds extra delays and slows the whole system. > >> > >> Instead of polling W_PEND_WTGR bit, we skip > >> watchdog reset if the bit is set. Anyway, watchdog > >> is in the middle of reset *right now*, so we can > >> just return. > > > > It seems like similar problem may be in the Linux kernel driver: > > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 > > > > Linux driver also waits for the write. > > Right, Linux driver has similar code but it doesn't affect performance. > This is because of watchdog usage in Linux, it is enabled and > reset by userspace. This is quite rare event. > Moreover, since Linux has interrupts enabled and has scheduling, > such busy loops in the omap driver are not very different to > just mdelay() usage. The system can handle interrupts, and can > even do preemption if PREEMPT is enabled. > So use of such busy loops in that particular case shouldn't cause > any noticeable performance issues. True. But, can you also submit a patch to the kernel side (and CC me) ? That's far more likely to get the attention of the engineers that might know of corner cases with the various SoC families where we might need to keep doing what we're doing or other possible problems. Thanks! -- Tom signature.asc Description: PGP signature ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
On 1 March 2018 at 03:15, Ruslan Bilovolwrote: > Remove busy looping during watchdog reset. > Each polling of W_PEND_WTGR bit ("finish posted > write") after watchdog reset takes 120-140us > on BeagleBone Black board. Current U-Boot code > has watchdog resets in random places and often > there is situation when watchdog is reset > few times in a row in nested functions. > This adds extra delays and slows the whole system. > > Instead of polling W_PEND_WTGR bit, we skip > watchdog reset if the bit is set. Anyway, watchdog > is in the middle of reset *right now*, so we can > just return. > > This noticeably increases performance of the > system. Below are some measurements on BBB: > - DFU upload over USB 15% faster > - fastboot image upload 3x times faster > - USB ep0 transfers with 4k packets 20% faster > > Signed-off-by: Ruslan Bilovol > --- > drivers/watchdog/omap_wdt.c | 21 +++-- > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/watchdog/omap_wdt.c b/drivers/watchdog/omap_wdt.c > index 7b1f429..d724c96 100644 > --- a/drivers/watchdog/omap_wdt.c > +++ b/drivers/watchdog/omap_wdt.c > @@ -53,16 +53,25 @@ void hw_watchdog_reset(void) > { > struct wd_timer *wdt = (struct wd_timer *)WDT_BASE; > > - /* wait for posted write to complete */ > - while ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > - ; > + /* > +* Somebody just triggered watchdog reset and write to WTGR register > +* is in progress. It is resetting right now, no need to trigger it > +* again > +*/ > + if ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > + return; > > wdt_trgr_pattern = ~wdt_trgr_pattern; > writel(wdt_trgr_pattern, >wdtwtgr); > > - /* wait for posted write to complete */ > - while ((readl(>wdtwwps) & WDT_WWPS_PEND_WTGR)) > - ; > + /* > +* Don't wait for posted write to complete, i.e. don't check > +* WDT_WWPS_PEND_WTGR bit in WWPS register. There is no writes to > +* WTGR register outside of this func, and if entering it > +* we see WDT_WWPS_PEND_WTGR bit set, it means watchdog reset > +* was just triggered. This prevents us from wasting time in busy > +* polling of WDT_WWPS_PEND_WTGR bit. > +*/ > } > > static int omap_wdt_set_timeout(unsigned int timeout) > -- > 1.9.1 > Tested-by: Sam Protsenko Tested this patch on my BeagleBone Black Rev A5A. Works fine. I can see USB throughput improvement when doing "fastboot flash". Here are my results: - before patch: 2.72 MiB/sec - when disabling CONFIG_OMAP_WATCHDOG: 5.13 MiB/sec - with this patch: 5.08 MiB/sec So I can see 1.87 times improvement in speed, which is really good news for us, especially for flashing big images. I wonder, what is the max USB speed limit for transfer via EP1. E.g., what will we see if test it via Gadget Zero in Linux? Maybe there is a room for further improvements. But those are only my thoughts. Thanks, Ruslan! ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
Hi Lukasz, On Thu, Mar 1, 2018 at 12:53 PM, Lukasz Majewskiwrote: > Hi Ruslan, > >> Remove busy looping during watchdog reset. >> Each polling of W_PEND_WTGR bit ("finish posted >> write") after watchdog reset takes 120-140us >> on BeagleBone Black board. Current U-Boot code >> has watchdog resets in random places and often >> there is situation when watchdog is reset >> few times in a row in nested functions. >> This adds extra delays and slows the whole system. >> >> Instead of polling W_PEND_WTGR bit, we skip >> watchdog reset if the bit is set. Anyway, watchdog >> is in the middle of reset *right now*, so we can >> just return. > > It seems like similar problem may be in the Linux kernel driver: > v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 > > Linux driver also waits for the write. Right, Linux driver has similar code but it doesn't affect performance. This is because of watchdog usage in Linux, it is enabled and reset by userspace. This is quite rare event. Moreover, since Linux has interrupts enabled and has scheduling, such busy loops in the omap driver are not very different to just mdelay() usage. The system can handle interrupts, and can even do preemption if PREEMPT is enabled. So use of such busy loops in that particular case shouldn't cause any noticeable performance issues. In U-Boot we have polling instead of interrupts and something like cooperative multitasking. Also watchdog resets much more often, and such busy pollings in the driver delay execution of other code. For example, in DFU we have indefinite loop, something like: ---> | dfu_write() | watchdog_reset() | ... | handle_usb() | `- watchdog_reset() | ` ... `' And each delay caused by watchdog reset adds significant overhead comparing to handling USB events or writing to the storage. So as bottom line, we don't need to do similar change in Linux driver because there is almost no impact on performance in that case. But in case of U-Boot which uses polling instead of interrupts, this patch causes such a big performance improvement. Best regards, Ruslan > >> >> This noticeably increases performance of the >> system. Below are some measurements on BBB: >> - DFU upload over USB 15% faster >> - fastboot image upload 3x times faster >> - USB ep0 transfers with 4k packets 20% faster >> >> Signed-off-by: Ruslan Bilovol >> --- >> drivers/watchdog/omap_wdt.c | 21 +++-- >> 1 file changed, 15 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/watchdog/omap_wdt.c b/drivers/watchdog/omap_wdt.c >> index 7b1f429..d724c96 100644 >> --- a/drivers/watchdog/omap_wdt.c >> +++ b/drivers/watchdog/omap_wdt.c >> @@ -53,16 +53,25 @@ void hw_watchdog_reset(void) >> { >> struct wd_timer *wdt = (struct wd_timer *)WDT_BASE; >> >> - /* wait for posted write to complete */ >> - while ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) >> - ; >> + /* >> + * Somebody just triggered watchdog reset and write to WTGR >> register >> + * is in progress. It is resetting right now, no need to >> trigger it >> + * again >> + */ >> + if ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) >> + return; >> >> wdt_trgr_pattern = ~wdt_trgr_pattern; >> writel(wdt_trgr_pattern, >wdtwtgr); >> >> - /* wait for posted write to complete */ >> - while ((readl(>wdtwwps) & WDT_WWPS_PEND_WTGR)) >> - ; >> + /* >> + * Don't wait for posted write to complete, i.e. don't check >> + * WDT_WWPS_PEND_WTGR bit in WWPS register. There is no >> writes to >> + * WTGR register outside of this func, and if entering it >> + * we see WDT_WWPS_PEND_WTGR bit set, it means watchdog reset >> + * was just triggered. This prevents us from wasting time in >> busy >> + * polling of WDT_WWPS_PEND_WTGR bit. >> + */ >> } >> >> static int omap_wdt_set_timeout(unsigned int timeout) > > > > > Best regards, > > Lukasz Majewski > > -- > > DENX Software Engineering GmbH, Managing Director: Wolfgang Denk > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
[U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
Remove busy looping during watchdog reset. Each polling of W_PEND_WTGR bit ("finish posted write") after watchdog reset takes 120-140us on BeagleBone Black board. Current U-Boot code has watchdog resets in random places and often there is situation when watchdog is reset few times in a row in nested functions. This adds extra delays and slows the whole system. Instead of polling W_PEND_WTGR bit, we skip watchdog reset if the bit is set. Anyway, watchdog is in the middle of reset *right now*, so we can just return. This noticeably increases performance of the system. Below are some measurements on BBB: - DFU upload over USB 15% faster - fastboot image upload 3x times faster - USB ep0 transfers with 4k packets 20% faster Signed-off-by: Ruslan Bilovol--- drivers/watchdog/omap_wdt.c | 21 +++-- 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/watchdog/omap_wdt.c b/drivers/watchdog/omap_wdt.c index 7b1f429..d724c96 100644 --- a/drivers/watchdog/omap_wdt.c +++ b/drivers/watchdog/omap_wdt.c @@ -53,16 +53,25 @@ void hw_watchdog_reset(void) { struct wd_timer *wdt = (struct wd_timer *)WDT_BASE; - /* wait for posted write to complete */ - while ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) - ; + /* +* Somebody just triggered watchdog reset and write to WTGR register +* is in progress. It is resetting right now, no need to trigger it +* again +*/ + if ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) + return; wdt_trgr_pattern = ~wdt_trgr_pattern; writel(wdt_trgr_pattern, >wdtwtgr); - /* wait for posted write to complete */ - while ((readl(>wdtwwps) & WDT_WWPS_PEND_WTGR)) - ; + /* +* Don't wait for posted write to complete, i.e. don't check +* WDT_WWPS_PEND_WTGR bit in WWPS register. There is no writes to +* WTGR register outside of this func, and if entering it +* we see WDT_WWPS_PEND_WTGR bit set, it means watchdog reset +* was just triggered. This prevents us from wasting time in busy +* polling of WDT_WWPS_PEND_WTGR bit. +*/ } static int omap_wdt_set_timeout(unsigned int timeout) -- 1.9.1 ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot
Re: [U-Boot] [PATCH] watchdog: omap_wdt: improve watchdog reset path
Hi Ruslan, > Remove busy looping during watchdog reset. > Each polling of W_PEND_WTGR bit ("finish posted > write") after watchdog reset takes 120-140us > on BeagleBone Black board. Current U-Boot code > has watchdog resets in random places and often > there is situation when watchdog is reset > few times in a row in nested functions. > This adds extra delays and slows the whole system. > > Instead of polling W_PEND_WTGR bit, we skip > watchdog reset if the bit is set. Anyway, watchdog > is in the middle of reset *right now*, so we can > just return. It seems like similar problem may be in the Linux kernel driver: v4.16-rc1/source/drivers/watchdog/omap_wdt.c#L74 Linux driver also waits for the write. > > This noticeably increases performance of the > system. Below are some measurements on BBB: > - DFU upload over USB 15% faster > - fastboot image upload 3x times faster > - USB ep0 transfers with 4k packets 20% faster > > Signed-off-by: Ruslan Bilovol> --- > drivers/watchdog/omap_wdt.c | 21 +++-- > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/watchdog/omap_wdt.c b/drivers/watchdog/omap_wdt.c > index 7b1f429..d724c96 100644 > --- a/drivers/watchdog/omap_wdt.c > +++ b/drivers/watchdog/omap_wdt.c > @@ -53,16 +53,25 @@ void hw_watchdog_reset(void) > { > struct wd_timer *wdt = (struct wd_timer *)WDT_BASE; > > - /* wait for posted write to complete */ > - while ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > - ; > + /* > + * Somebody just triggered watchdog reset and write to WTGR > register > + * is in progress. It is resetting right now, no need to > trigger it > + * again > + */ > + if ((readl(>wdtwwps)) & WDT_WWPS_PEND_WTGR) > + return; > > wdt_trgr_pattern = ~wdt_trgr_pattern; > writel(wdt_trgr_pattern, >wdtwtgr); > > - /* wait for posted write to complete */ > - while ((readl(>wdtwwps) & WDT_WWPS_PEND_WTGR)) > - ; > + /* > + * Don't wait for posted write to complete, i.e. don't check > + * WDT_WWPS_PEND_WTGR bit in WWPS register. There is no > writes to > + * WTGR register outside of this func, and if entering it > + * we see WDT_WWPS_PEND_WTGR bit set, it means watchdog reset > + * was just triggered. This prevents us from wasting time in > busy > + * polling of WDT_WWPS_PEND_WTGR bit. > + */ > } > > static int omap_wdt_set_timeout(unsigned int timeout) Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de pgpzC2C_OFRe8.pgp Description: OpenPGP digital signature ___ U-Boot mailing list U-Boot@lists.denx.de https://lists.denx.de/listinfo/u-boot