Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 2020-06-29-12-52-10, Lukasz Luba wrote: [...]
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Chanwoo, On 6/29/20 2:43 AM, Chanwoo Choi wrote: Hi, Sorry for late reply because of my perfornal issue. I count not check the email. I hope you are good now. On 6/26/20 8:22 PM, Bartlomiej Zolnierkiewicz wrote: On 6/25/20 2:12 PM, Kamil Konieczny wrote: On 25.06.2020 14:02, Lukasz Luba wrote: On 6/25/20 12:30 PM, Kamil Konieczny wrote: Hi Lukasz, On 25.06.2020 12:02, Lukasz Luba wrote: Hi Sylwester, On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: Hi All, On 24.06.2020 12:32, Lukasz Luba wrote: I had issues with devfreq governor which wasn't called by devfreq workqueue. The old DELAYED vs DEFERRED work discussions and my patches for it [1]. If the CPU which scheduled the next work went idle, the devfreq workqueue will not be kicked and devfreq governor won't check DMC status and will not decide to decrease the frequency based on low busy_time. The same applies for going up with the frequency. They both are done by the governor but the workqueue must be scheduled periodically. As I have been working on resolving the video mixer IOMMU fault issue described here: https://patchwork.kernel.org/patch/10861757 I did some investigation of the devfreq operation, mostly on Odroid U3. My conclusions are similar to what Lukasz says above. I would like to add that broken scheduling of the performance counters read and the devfreq updates seems to have one more serious implication. In each call, which normally should happen periodically with fixed interval we stop the counters, read counter values and start the counters again. But if period between calls becomes long enough to let any of the counters overflow, we will get wrong performance measurement results. My observations are that the workqueue job can be suspended for several seconds and conditions for the counter overflow occur sooner or later, depending among others on the CPUs load. Wrong bus load measurement can lead to setting too low interconnect bus clock frequency and then bad things happen in peripheral devices. 
I agree the workqueue issue needs to be fixed. I have some WIP code to use the performance counters overflow interrupts instead of SW polling and with that the interconnect bus clock control seems to work much better. Thank you for sharing your use case and investigation results. I think we are reaching a decent number of developers to maybe address this issue: 'workqueue issue needs to be fixed'. I have been facing this devfreq workqueue issue ~5 times in different platforms. Regarding the 'performance counters overflow interrupts' there is one thing worth to keep in mind: variable utilization and frequency. For example, in order to make a conclusion in algorithm deciding that the device should increase or decrease the frequency, we fix the period of observation, i.e. to 500ms. That can cause the long delay if the utilization of the device suddenly drops. For example we set an overflow threshold to value i.e. 1000 and we know that at 1000MHz and full utilization (100%) the counter will reach that threshold after 500ms (which we want, because we don't want too many interrupts per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the threshold after 50*500ms = 25s. It is impossible just for the counters to predict next utilization and adjust the threshold. [...] irq triggers for underflow and overflow, so driver can adjust freq Probably possible on some platforms, depends on how many PMU registers are available, what information can be can assign to them and type of interrupt. A lot of hassle and still - platform and device specific. Also, drivers should not adjust the freq, governors (different types of them with different settings that they can handle) should do it. What the framework can do is to take this responsibility and provide generic way to monitor the devices (or stop if they are suspended). 
That should work nicely with the governors, which try to predict the next best frequency. From my experience the more fluctuating intervals the governors are called, the more odd decisions they make. That's why I think having a predictable interval i.e. 100ms is something desirable. Tuning the governors is easier in this case, statistics are easier to trace and interpret, solution is not to platform specific, etc. Kamil do you have plans to refresh and push your next version of the workqueue solution? I do not, as Bartek takes over my work, +CC Bartek Hi Lukasz, As you remember in January Chanwoo has proposed another idea (to allow selecting workqueue type by devfreq device driver): "I'm developing the RFC patch and then I'll send it as soon as possible." (https://lore.kernel.org/linux-pm/6107fa2b-81ad-060d-89a2-d8941ac4d...@samsung.com/) "After posting my suggestion, we can discuss it" (https://lore.kernel.org/linux-pm/f5c5cd64-b72c-2802-f6ea-ab3d28483...@samsung.com/) so we have been waiting on the patch to be posted..
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 6/26/20 6:50 PM, Sylwester Nawrocki wrote:
> Hi Lukasz,
>
> On 25.06.2020 12:02, Lukasz Luba wrote:
>> Regarding the 'performance counters overflow interrupts' there is one
>> thing worth to keep in mind: variable utilization and frequency.
>> For example, in order to make a conclusion in algorithm deciding that
>> the device should increase or decrease the frequency, we fix the period
>> of observation, i.e. to 500ms. That can cause the long delay if the
>> utilization of the device suddenly drops. For example we set an
>> overflow threshold to value i.e. 1000 and we know that at 1000MHz
>> and full utilization (100%) the counter will reach that threshold
>> after 500ms (which we want, because we don't want too many interrupts
>> per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s
>> to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the
>> threshold after 50*500ms = 25s. It is impossible just for the counters
>> to predict next utilization and adjust the threshold.
>
> Agreed, that's in case when we use just the performance counter (PMCNT)
> overflow interrupts. In my experiments I used the (total) cycle counter
> (CCNT) overflow interrupts. As that counter is clocked with fixed rate
> between devfreq updates it can be used as a timer by pre-loading it with
> initial value depending on current bus frequency. But we could as well
> use some reliable system timer mechanism to generate periodic events.
> I was hoping to use the cycle counter to generate low frequency monitor
> events and the actual performance counters overflow interrupts to detect
> any sudden changes of utilization. However, it seems it cannot be done
> with as simple performance counters HW architecture as on Exynos4412.
> It looks like on Exynos5422 we have all what is needed, there is more
> flexibility in selecting the counter source signal, e.g. each counter
> can be a clock cycle counter or can count various bus events related to
> actual utilization. Moreover, we could configure the counter gating period
> and alarm interrupts are available for when the counter value drops below
> configured MIN threshold or exceeds configured MAX value.

I see. I don't have TRM for Exynos5422 so couldn't see that. I also have
to keep in mind other platforms which might not have this feature.

> So it should be possible to configure the HW to generate the utilization
> monitoring events without excessive continuous CPU intervention.

I agree, that would be desirable especially for low load in the system.

> But I'm rather not going to work on the Exynos5422 SoC support at the moment.

I see.

>> To address that, we still need to have another mechanism (like watchdog)
>> which will be triggered just to check if the threshold needs adjustment.
>> This mechanism can be a local timer in the driver or a framework
>> timer running kind of 'for loop' on all this type of devices (like
>> the scheduled workqueue). In both cases in the system there will be
>> interrupts, timers (even at workqueues) and scheduling.
>> The approach to force developers to implement their local watchdog
>> timers (or workqueues) in drivers is IMHO wrong and that's why we have
>> frameworks.
>
> Yes, it should be also possible in the framework to use the counter alarm
> events where the hardware is advanced enough, in order to avoid excessive
> SW polling.

Looks promising, but that would need more plumbing I assume.

Regards,
Lukasz

> --
> Regards,
> Sylwester
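For illustration, the arithmetic in the quoted example can be written out as a small model. This is only a sketch, not kernel code: the function name `time_to_threshold_s` is made up here, and it assumes the event counter advances at a rate proportional to utilization, so a fixed threshold that is reached after 500ms at 100% load takes 1/utilization times longer at lower load.

```python
def time_to_threshold_s(full_load_period_s: float, utilization: float) -> float:
    """Time until an event counter reaches a fixed overflow threshold.

    Assumption (for illustration only): the counter advances at a rate
    proportional to utilization and reaches the threshold after
    full_load_period_s when utilization is 100%.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return full_load_period_s / utilization


# 500 ms observation period at full load, as in the quoted example.
for util in (1.0, 0.05, 0.02, 0.005):
    print(f"{util:6.1%} -> {time_to_threshold_s(0.5, util):6.1f} s")
```

With 2% utilization this gives the 25 s mentioned above. Note that 250MB/s out of 5GB/s corresponds to 5%, for which the same formula gives 10 s, and 25MB/s (0.5%) gives 100 s. In every case the delay grows without bound as utilization drops, which is the point being made.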
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 6/26/20 12:22 PM, Bartlomiej Zolnierkiewicz wrote:
>
> On 6/25/20 2:12 PM, Kamil Konieczny wrote:
>> On 25.06.2020 14:02, Lukasz Luba wrote:
>>> On 6/25/20 12:30 PM, Kamil Konieczny wrote:

[snip]

>>> Kamil do you have plans to refresh and push your next version of the
>>> workqueue solution?
>>
>> I do not, as Bartek takes over my work,
>> +CC Bartek
>
> Hi Lukasz,

Hi Bartek,

> As you remember in January Chanwoo has proposed another idea (to allow
> selecting workqueue type by devfreq device driver):
>
> "I'm developing the RFC patch and then I'll send it as soon as possible."
> (https://lore.kernel.org/linux-pm/6107fa2b-81ad-060d-89a2-d8941ac4d...@samsung.com/)
>
> "After posting my suggestion, we can discuss it"
> (https://lore.kernel.org/linux-pm/f5c5cd64-b72c-2802-f6ea-ab3d28483...@samsung.com/)
>
> so we have been waiting on the patch to be posted..
>
> Similarly we have been waiting on (any) feedback for exynos-bus/nocp
> fixes for Exynos5422 support (which have been posted by Kamil also in
> January):
>
> https://lore.kernel.org/linux-pm/8f82d8d5-927b-afb4-272f-45c16b5a2...@samsung.com/
>
> Considering the above and how hard it has been to push the changes
> through review/merge process last year we are near giving up when
> it comes to upstream devfreq contributions. Sylwester is still working
> on exynos-bus & interconnect integration (continuation of Artur Swigon's
> work from last year) & related issues (IRQ support for PPMU) but I'm
> seriously considering putting it all on-hold..

Thank you for detailed explanation and update. I see.

Anyway, if you or Sylwester need some help with this devfreq workqueue,
I offer my time as a reviewer.
The more generic solution you propose, the better for all platforms.

Regards,
Lukasz

> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Sylwester,

On 6/25/20 12:11 AM, Sylwester Nawrocki wrote:
> Hi All,
>
> On 24.06.2020 12:32, Lukasz Luba wrote:
>> I had issues with devfreq governor which wasn't called by devfreq
>> workqueue. The old DELAYED vs DEFERRED work discussions and my patches
>> for it [1]. If the CPU which scheduled the next work went idle, the
>> devfreq workqueue will not be kicked and devfreq governor won't check
>> DMC status and will not decide to decrease the frequency based on low
>> busy_time.
>> The same applies for going up with the frequency. They both are
>> done by the governor but the workqueue must be scheduled periodically.
>
> As I have been working on resolving the video mixer IOMMU fault issue
> described here: https://patchwork.kernel.org/patch/10861757
> I did some investigation of the devfreq operation, mostly on Odroid U3.
>
> My conclusions are similar to what Lukasz says above. I would like to add
> that broken scheduling of the performance counters read and the devfreq
> updates seems to have one more serious implication. In each call, which
> normally should happen periodically with fixed interval we stop the counters,
> read counter values and start the counters again. But if period between
> calls becomes long enough to let any of the counters overflow, we will
> get wrong performance measurement results. My observations are that
> the workqueue job can be suspended for several seconds and conditions for
> the counter overflow occur sooner or later, depending among others
> on the CPUs load.
> Wrong bus load measurement can lead to setting too low interconnect bus
> clock frequency and then bad things happen in peripheral devices.
>
> I agree the workqueue issue needs to be fixed. I have some WIP code to use
> the performance counters overflow interrupts instead of SW polling and with

It is good way to resolve the overflow issue.

> that the interconnect bus clock control seems to work much better.
>

--
Best Regards,
Chanwoo Choi
Samsung Electronics
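The overflow effect described in the quoted text can be reproduced with a toy model. This is a sketch under stated assumptions, not the driver's code: the 32-bit counter width, the 400 MHz bus clock, the 90% load figure and the name `measured_load` are all chosen here purely for illustration.

```python
COUNTER_BITS = 32          # assumed counter width, illustration only
WRAP = 1 << COUNTER_BITS   # counter values are taken modulo this


def measured_load(busy_rate_hz: int, clock_hz: int, elapsed_ms: int) -> float:
    """Load that software would compute from two free-running counters.

    Both the busy-event counter and the cycle counter are modelled as
    wrapping silently when the time between two reads gets too long.
    """
    busy = (busy_rate_hz * elapsed_ms // 1000) % WRAP
    total = (clock_hz * elapsed_ms // 1000) % WRAP
    if total == 0:
        raise ValueError("cycle counter reads zero, load is undefined")
    return busy / total


# Hypothetical bus: 400 MHz clock, busy 90% of the cycles.
for elapsed_ms in (100, 12_000):
    load = measured_load(360_000_000, 400_000_000, elapsed_ms)
    print(f"read after {elapsed_ms:6d} ms: measured load = {load:.3f}")
```

In this model a read after 100 ms reports a load of 0.9, while a read delayed to 12 s reports roughly 0.05 because both counters have wrapped. A load-based governor fed with the second value would see an almost idle bus and could lower the frequency even though the real load never changed.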
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi, Sorry for late reply because of my perfornal issue. I count not check the email. On 6/26/20 8:22 PM, Bartlomiej Zolnierkiewicz wrote: > > On 6/25/20 2:12 PM, Kamil Konieczny wrote: >> On 25.06.2020 14:02, Lukasz Luba wrote: >>> >>> >>> On 6/25/20 12:30 PM, Kamil Konieczny wrote: Hi Lukasz, On 25.06.2020 12:02, Lukasz Luba wrote: > Hi Sylwester, > > On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: >> Hi All, >> >> On 24.06.2020 12:32, Lukasz Luba wrote: >>> I had issues with devfreq governor which wasn't called by devfreq >>> workqueue. The old DELAYED vs DEFERRED work discussions and my patches >>> for it [1]. If the CPU which scheduled the next work went idle, the >>> devfreq workqueue will not be kicked and devfreq governor won't check >>> DMC status and will not decide to decrease the frequency based on low >>> busy_time. >>> The same applies for going up with the frequency. They both are >>> done by the governor but the workqueue must be scheduled periodically. >> >> As I have been working on resolving the video mixer IOMMU fault issue >> described here: https://patchwork.kernel.org/patch/10861757 >> I did some investigation of the devfreq operation, mostly on Odroid U3. >> >> My conclusions are similar to what Lukasz says above. I would like to add >> that broken scheduling of the performance counters read and the devfreq >> updates seems to have one more serious implication. In each call, which >> normally should happen periodically with fixed interval we stop the >> counters, >> read counter values and start the counters again. But if period between >> calls becomes long enough to let any of the counters overflow, we will >> get wrong performance measurement results. My observations are that >> the workqueue job can be suspended for several seconds and conditions for >> the counter overflow occur sooner or later, depending among others >> on the CPUs load. 
>> Wrong bus load measurement can lead to setting too low interconnect bus >> clock frequency and then bad things happen in peripheral devices. >> >> I agree the workqueue issue needs to be fixed. I have some WIP code to >> use >> the performance counters overflow interrupts instead of SW polling and >> with >> that the interconnect bus clock control seems to work much better. >> > > Thank you for sharing your use case and investigation results. I think > we are reaching a decent number of developers to maybe address this > issue: 'workqueue issue needs to be fixed'. > I have been facing this devfreq workqueue issue ~5 times in different > platforms. > > Regarding the 'performance counters overflow interrupts' there is one > thing worth to keep in mind: variable utilization and frequency. > For example, in order to make a conclusion in algorithm deciding that > the device should increase or decrease the frequency, we fix the period > of observation, i.e. to 500ms. That can cause the long delay if the > utilization of the device suddenly drops. For example we set an > overflow threshold to value i.e. 1000 and we know that at 1000MHz > and full utilization (100%) the counter will reach that threshold > after 500ms (which we want, because we don't want too many interrupts > per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s > to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the > threshold after 50*500ms = 25s. It is impossible just for the counters > to predict next utilization and adjust the threshold. [...] irq triggers for underflow and overflow, so driver can adjust freq >>> >>> Probably possible on some platforms, depends on how many PMU registers >>> are available, what information can be can assign to them and type of >>> interrupt. A lot of hassle and still - platform and device specific. 
>>> Also, drivers should not adjust the freq, governors (different types >>> of them with different settings that they can handle) should do it. >>> >>> What the framework can do is to take this responsibility and provide >>> generic way to monitor the devices (or stop if they are suspended). >>> That should work nicely with the governors, which try to predict the >>> next best frequency. From my experience the more fluctuating intervals >>> the governors are called, the more odd decisions they make. >>> That's why I think having a predictable interval i.e. 100ms is something >>> desirable. Tuning the governors is easier in this case, statistics >>> are easier to trace and interpret, solution is not to platform specific, >>> etc. >>> >>> Kamil do you have plans to refresh and push your next version of the >>> workqueue solution? >> >> I do not, as Bartek takes over my work, >> +CC Bartek > > Hi Lukasz, > > As you remember in January Chanwoo has
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Lukasz, On 25.06.2020 12:02, Lukasz Luba wrote: > Regarding the 'performance counters overflow interrupts' there is one > thing worth to keep in mind: variable utilization and frequency. > For example, in order to make a conclusion in algorithm deciding that > the device should increase or decrease the frequency, we fix the period > of observation, i.e. to 500ms. That can cause the long delay if the > utilization of the device suddenly drops. For example we set an > overflow threshold to value i.e. 1000 and we know that at 1000MHz > and full utilization (100%) the counter will reach that threshold > after 500ms (which we want, because we don't want too many interrupts > per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s > to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the > threshold after 50*500ms = 25s. It is impossible just for the counters > to predict next utilization and adjust the threshold. Agreed, that's in case when we use just the performance counter (PMCNT) overflow interrupts. In my experiments I used the (total) cycle counter (CCNT) overflow interrupts. As that counter is clocked with fixed rate between devfreq updates it can be used as a timer by pre-loading it with initial value depending on current bus frequency. But we could as well use some reliable system timer mechanism to generate periodic events. I was hoping to use the cycle counter to generate low frequency monitor events and the actual performance counters overflow interrupts to detect any sudden changes of utilization. However, it seems it cannot be done with as simple performance counters HW architecture as on Exynos4412. It looks like on Exynos5422 we have all what is needed, there is more flexibility in selecting the counter source signal, e.g. each counter can be a clock cycle counter or can count various bus events related to actual utilization. 
Moreover, we could configure the counter gating period and alarm interrupts are available for when the counter value drops below configured MIN threshold or exceeds configured MAX value. So it should be possible to configure the HW to generate the utilization monitoring events without excessive continuous CPU intervention. But I'm rather not going to work on the Exynos5422 SoC support at the moment. > To address that, we still need to have another mechanism (like watchdog) > which will be triggered just to check if the threshold needs adjustment. > This mechanism can be a local timer in the driver or a framework > timer running kind of 'for loop' on all this type of devices (like > the scheduled workqueue). In both cases in the system there will be > interrupts, timers (even at workqueues) and scheduling. > The approach to force developers to implement their local watchdog > timers (or workqueues) in drivers is IMHO wrong and that's why we have > frameworks. Yes, it should be also possible in the framework to use the counter alarm events where the hardware is advanced enough, in order to avoid excessive SW polling. -- Regards, Sylwester
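The idea of using the cycle counter as a timer by pre-loading it can be sketched as follows. This is a minimal illustration, not code from the WIP patches: it assumes a 32-bit up-counting cycle counter that raises its overflow interrupt when it wraps, and the function name `ccnt_preload` and the example frequencies are hypothetical.

```python
COUNTER_BITS = 32  # assumed width of the cycle counter, illustration only


def ccnt_preload(bus_freq_hz: int, period_ms: int) -> int:
    """Initial value so that an up-counting cycle counter overflows after period_ms.

    The counter is clocked at bus_freq_hz, so the number of cycles in the
    desired period depends on the current bus frequency.
    """
    cycles = bus_freq_hz * period_ms // 1000
    max_cycles = 1 << COUNTER_BITS
    if not 0 < cycles <= max_cycles:
        raise ValueError("period is not representable at this frequency")
    return max_cycles - cycles


# Same 100 ms period at two hypothetical bus frequencies.
for freq_hz in (400_000_000, 100_000_000):
    print(f"{freq_hz // 1_000_000} MHz -> preload {ccnt_preload(freq_hz, 100)}")
```

Because the preload value depends on the bus frequency, it has to be recomputed after every devfreq frequency change, which matches the remark that the counter is clocked at a fixed rate only between devfreq updates.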
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 6/25/20 2:12 PM, Kamil Konieczny wrote: > On 25.06.2020 14:02, Lukasz Luba wrote: >> >> >> On 6/25/20 12:30 PM, Kamil Konieczny wrote: >>> Hi Lukasz, >>> >>> On 25.06.2020 12:02, Lukasz Luba wrote: Hi Sylwester, On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: > Hi All, > > On 24.06.2020 12:32, Lukasz Luba wrote: >> I had issues with devfreq governor which wasn't called by devfreq >> workqueue. The old DELAYED vs DEFERRED work discussions and my patches >> for it [1]. If the CPU which scheduled the next work went idle, the >> devfreq workqueue will not be kicked and devfreq governor won't check >> DMC status and will not decide to decrease the frequency based on low >> busy_time. >> The same applies for going up with the frequency. They both are >> done by the governor but the workqueue must be scheduled periodically. > > As I have been working on resolving the video mixer IOMMU fault issue > described here: https://patchwork.kernel.org/patch/10861757 > I did some investigation of the devfreq operation, mostly on Odroid U3. > > My conclusions are similar to what Lukasz says above. I would like to add > that broken scheduling of the performance counters read and the devfreq > updates seems to have one more serious implication. In each call, which > normally should happen periodically with fixed interval we stop the > counters, > read counter values and start the counters again. But if period between > calls becomes long enough to let any of the counters overflow, we will > get wrong performance measurement results. My observations are that > the workqueue job can be suspended for several seconds and conditions for > the counter overflow occur sooner or later, depending among others > on the CPUs load. > Wrong bus load measurement can lead to setting too low interconnect bus > clock frequency and then bad things happen in peripheral devices. > > I agree the workqueue issue needs to be fixed. 
I have some WIP code to use > the performance counters overflow interrupts instead of SW polling and > with > that the interconnect bus clock control seems to work much better. > Thank you for sharing your use case and investigation results. I think we are reaching a decent number of developers to maybe address this issue: 'workqueue issue needs to be fixed'. I have been facing this devfreq workqueue issue ~5 times in different platforms. Regarding the 'performance counters overflow interrupts' there is one thing worth to keep in mind: variable utilization and frequency. For example, in order to make a conclusion in algorithm deciding that the device should increase or decrease the frequency, we fix the period of observation, i.e. to 500ms. That can cause the long delay if the utilization of the device suddenly drops. For example we set an overflow threshold to value i.e. 1000 and we know that at 1000MHz and full utilization (100%) the counter will reach that threshold after 500ms (which we want, because we don't want too many interrupts per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the threshold after 50*500ms = 25s. It is impossible just for the counters to predict next utilization and adjust the threshold. [...] >>> >>> irq triggers for underflow and overflow, so driver can adjust freq >>> >> >> Probably possible on some platforms, depends on how many PMU registers >> are available, what information can be can assign to them and type of >> interrupt. A lot of hassle and still - platform and device specific. >> Also, drivers should not adjust the freq, governors (different types >> of them with different settings that they can handle) should do it. >> >> What the framework can do is to take this responsibility and provide >> generic way to monitor the devices (or stop if they are suspended). 
>> That should work nicely with the governors, which try to predict the >> next best frequency. From my experience the more fluctuating intervals >> the governors are called, the more odd decisions they make. >> That's why I think having a predictable interval i.e. 100ms is something >> desirable. Tuning the governors is easier in this case, statistics >> are easier to trace and interpret, solution is not to platform specific, >> etc. >> >> Kamil do you have plans to refresh and push your next version of the >> workqueue solution? > > I do not, as Bartek takes over my work, > +CC Bartek Hi Lukasz, As you remember in January Chanwoo has proposed another idea (to allow selecting workqueue type by devfreq device driver): "I'm developing the RFC patch and then I'll send it as soon as possible." (https://lore.kernel.org/linux-pm/6107fa2b-81ad-060d-89a2-d8941ac4d...@samsung.com/) "After
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 25.06.2020 14:02, Lukasz Luba wrote: > > > On 6/25/20 12:30 PM, Kamil Konieczny wrote: >> Hi Lukasz, >> >> On 25.06.2020 12:02, Lukasz Luba wrote: >>> Hi Sylwester, >>> >>> On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: Hi All, On 24.06.2020 12:32, Lukasz Luba wrote: > I had issues with devfreq governor which wasn't called by devfreq > workqueue. The old DELAYED vs DEFERRED work discussions and my patches > for it [1]. If the CPU which scheduled the next work went idle, the > devfreq workqueue will not be kicked and devfreq governor won't check > DMC status and will not decide to decrease the frequency based on low > busy_time. > The same applies for going up with the frequency. They both are > done by the governor but the workqueue must be scheduled periodically. As I have been working on resolving the video mixer IOMMU fault issue described here: https://patchwork.kernel.org/patch/10861757 I did some investigation of the devfreq operation, mostly on Odroid U3. My conclusions are similar to what Lukasz says above. I would like to add that broken scheduling of the performance counters read and the devfreq updates seems to have one more serious implication. In each call, which normally should happen periodically with fixed interval we stop the counters, read counter values and start the counters again. But if period between calls becomes long enough to let any of the counters overflow, we will get wrong performance measurement results. My observations are that the workqueue job can be suspended for several seconds and conditions for the counter overflow occur sooner or later, depending among others on the CPUs load. Wrong bus load measurement can lead to setting too low interconnect bus clock frequency and then bad things happen in peripheral devices. I agree the workqueue issue needs to be fixed. 
I have some WIP code to use the performance counters overflow interrupts instead of SW polling and with that the interconnect bus clock control seems to work much better. >>> >>> Thank you for sharing your use case and investigation results. I think >>> we are reaching a decent number of developers to maybe address this >>> issue: 'workqueue issue needs to be fixed'. >>> I have been facing this devfreq workqueue issue ~5 times in different >>> platforms. >>> >>> Regarding the 'performance counters overflow interrupts' there is one >>> thing worth to keep in mind: variable utilization and frequency. >>> For example, in order to make a conclusion in algorithm deciding that >>> the device should increase or decrease the frequency, we fix the period >>> of observation, i.e. to 500ms. That can cause the long delay if the >>> utilization of the device suddenly drops. For example we set an >>> overflow threshold to value i.e. 1000 and we know that at 1000MHz >>> and full utilization (100%) the counter will reach that threshold >>> after 500ms (which we want, because we don't want too many interrupts >>> per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s >>> to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the >>> threshold after 50*500ms = 25s. It is impossible just for the counters >>> to predict next utilization and adjust the threshold. [...] >> >> irq triggers for underflow and overflow, so driver can adjust freq >> > > Probably possible on some platforms, depends on how many PMU registers > are available, what information can be can assign to them and type of > interrupt. A lot of hassle and still - platform and device specific. > Also, drivers should not adjust the freq, governors (different types > of them with different settings that they can handle) should do it. > > What the framework can do is to take this responsibility and provide > generic way to monitor the devices (or stop if they are suspended). 
> That should work nicely with the governors, which try to predict the
> next best frequency. From my experience, the more the intervals at which
> the governors are called fluctuate, the more odd decisions they make.
> That's why I think having a predictable interval, e.g. 100ms, is
> desirable. Tuning the governors is easier in this case, statistics
> are easier to trace and interpret, the solution is not too platform
> specific, etc.
>
> Kamil, do you have plans to refresh and push the next version of your
> workqueue solution?

I do not, as Bartek is taking over my work. +CC Bartek

-- 
Best regards,
Kamil Konieczny
Samsung R&D Institute Poland
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 6/25/20 12:30 PM, Kamil Konieczny wrote: Hi Lukasz, On 25.06.2020 12:02, Lukasz Luba wrote: Hi Sylwester, On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: Hi All, On 24.06.2020 12:32, Lukasz Luba wrote: I had issues with devfreq governor which wasn't called by devfreq workqueue. The old DELAYED vs DEFERRED work discussions and my patches for it [1]. If the CPU which scheduled the next work went idle, the devfreq workqueue will not be kicked and devfreq governor won't check DMC status and will not decide to decrease the frequency based on low busy_time. The same applies for going up with the frequency. They both are done by the governor but the workqueue must be scheduled periodically. As I have been working on resolving the video mixer IOMMU fault issue described here: https://patchwork.kernel.org/patch/10861757 I did some investigation of the devfreq operation, mostly on Odroid U3. My conclusions are similar to what Lukasz says above. I would like to add that broken scheduling of the performance counters read and the devfreq updates seems to have one more serious implication. In each call, which normally should happen periodically with fixed interval we stop the counters, read counter values and start the counters again. But if period between calls becomes long enough to let any of the counters overflow, we will get wrong performance measurement results. My observations are that the workqueue job can be suspended for several seconds and conditions for the counter overflow occur sooner or later, depending among others on the CPUs load. Wrong bus load measurement can lead to setting too low interconnect bus clock frequency and then bad things happen in peripheral devices. I agree the workqueue issue needs to be fixed. I have some WIP code to use the performance counters overflow interrupts instead of SW polling and with that the interconnect bus clock control seems to work much better. Thank you for sharing your use case and investigation results. 
I think we are reaching a decent number of developers to maybe address this issue: 'workqueue issue needs to be fixed'. I have been facing this devfreq workqueue issue ~5 times in different platforms. Regarding the 'performance counters overflow interrupts' there is one thing worth to keep in mind: variable utilization and frequency. For example, in order to make a conclusion in algorithm deciding that the device should increase or decrease the frequency, we fix the period of observation, i.e. to 500ms. That can cause the long delay if the utilization of the device suddenly drops. For example we set an overflow threshold to value i.e. 1000 and we know that at 1000MHz and full utilization (100%) the counter will reach that threshold after 500ms (which we want, because we don't want too many interrupts per sec). What if suddenly utilization drops to 2% (i.e. from 5GB/s to 250MB/s (what if it drops to 25MB/s?!)), the counter will reach the threshold after 50*500ms = 25s. It is impossible just for the counters to predict next utilization and adjust the threshold. [...] irq triggers for underflow and overflow, so driver can adjust freq Probably possible on some platforms, depends on how many PMU registers are available, what information can be can assign to them and type of interrupt. A lot of hassle and still - platform and device specific. Also, drivers should not adjust the freq, governors (different types of them with different settings that they can handle) should do it. What the framework can do is to take this responsibility and provide generic way to monitor the devices (or stop if they are suspended). That should work nicely with the governors, which try to predict the next best frequency. From my experience the more fluctuating intervals the governors are called, the more odd decisions they make. That's why I think having a predictable interval i.e. 100ms is something desirable. 
Tuning the governors is easier in this case, statistics are easier to trace and interpret, the solution is not too platform specific, etc.

Kamil, do you have plans to refresh and push the next version of your workqueue solution?

Regards,
Lukasz
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Lukasz, On 25.06.2020 12:02, Lukasz Luba wrote: > Hi Sylwester, > > On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: >> Hi All, >> >> On 24.06.2020 12:32, Lukasz Luba wrote: >>> I had issues with devfreq governor which wasn't called by devfreq >>> workqueue. The old DELAYED vs DEFERRED work discussions and my patches >>> for it [1]. If the CPU which scheduled the next work went idle, the >>> devfreq workqueue will not be kicked and devfreq governor won't check >>> DMC status and will not decide to decrease the frequency based on low >>> busy_time. >>> The same applies for going up with the frequency. They both are >>> done by the governor but the workqueue must be scheduled periodically. >> >> As I have been working on resolving the video mixer IOMMU fault issue >> described here: https://patchwork.kernel.org/patch/10861757 >> I did some investigation of the devfreq operation, mostly on Odroid U3. >> >> My conclusions are similar to what Lukasz says above. I would like to add >> that broken scheduling of the performance counters read and the devfreq >> updates seems to have one more serious implication. In each call, which >> normally should happen periodically with fixed interval we stop the counters, >> read counter values and start the counters again. But if period between >> calls becomes long enough to let any of the counters overflow, we will >> get wrong performance measurement results. My observations are that >> the workqueue job can be suspended for several seconds and conditions for >> the counter overflow occur sooner or later, depending among others >> on the CPUs load. >> Wrong bus load measurement can lead to setting too low interconnect bus >> clock frequency and then bad things happen in peripheral devices. >> >> I agree the workqueue issue needs to be fixed. I have some WIP code to use >> the performance counters overflow interrupts instead of SW polling and with >> that the interconnect bus clock control seems to work much better. 
>>
>
> Thank you for sharing your use case and investigation results. I think
> we are reaching a decent number of developers to maybe address this
> issue: 'workqueue issue needs to be fixed'.
> I have faced this devfreq workqueue issue ~5 times on different
> platforms.
>
> Regarding the 'performance counters overflow interrupts', there is one
> thing worth keeping in mind: variable utilization and frequency.
> For example, for the algorithm to decide whether the device should
> increase or decrease the frequency, we fix the period of observation,
> e.g. to 500ms. That can cause a long delay if the
> utilization of the device suddenly drops. For example, we set an
> overflow threshold to some value, e.g. 1000, and we know that at 1000MHz
> and full utilization (100%) the counter will reach that threshold
> after 500ms (which we want, because we don't want too many interrupts
> per sec). What if utilization suddenly drops to 2% (i.e. from 5GB/s
> to 250MB/s (what if it drops to 25MB/s?!))? The counter will reach the
> threshold after 50*500ms = 25s. It is impossible for the counters alone
> to predict the next utilization and adjust the threshold. [...]

The irq triggers for underflow and overflow, so the driver can adjust the freq.

-- 
Best regards,
Kamil Konieczny
Samsung R&D Institute Poland
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Sylwester, On 6/24/20 4:11 PM, Sylwester Nawrocki wrote: Hi All, On 24.06.2020 12:32, Lukasz Luba wrote: I had issues with devfreq governor which wasn't called by devfreq workqueue. The old DELAYED vs DEFERRED work discussions and my patches for it [1]. If the CPU which scheduled the next work went idle, the devfreq workqueue will not be kicked and devfreq governor won't check DMC status and will not decide to decrease the frequency based on low busy_time. The same applies for going up with the frequency. They both are done by the governor but the workqueue must be scheduled periodically. As I have been working on resolving the video mixer IOMMU fault issue described here: https://patchwork.kernel.org/patch/10861757 I did some investigation of the devfreq operation, mostly on Odroid U3. My conclusions are similar to what Lukasz says above. I would like to add that broken scheduling of the performance counters read and the devfreq updates seems to have one more serious implication. In each call, which normally should happen periodically with fixed interval we stop the counters, read counter values and start the counters again. But if period between calls becomes long enough to let any of the counters overflow, we will get wrong performance measurement results. My observations are that the workqueue job can be suspended for several seconds and conditions for the counter overflow occur sooner or later, depending among others on the CPUs load. Wrong bus load measurement can lead to setting too low interconnect bus clock frequency and then bad things happen in peripheral devices. I agree the workqueue issue needs to be fixed. I have some WIP code to use the performance counters overflow interrupts instead of SW polling and with that the interconnect bus clock control seems to work much better. Thank you for sharing your use case and investigation results. 
I think we are reaching a decent number of developers to maybe address this issue: 'workqueue issue needs to be fixed'. I have faced this devfreq workqueue issue ~5 times on different platforms.

Regarding the 'performance counters overflow interrupts', there is one thing worth keeping in mind: variable utilization and frequency. For example, for the algorithm to decide whether the device should increase or decrease the frequency, we fix the period of observation, e.g. to 500ms. That can cause a long delay if the utilization of the device suddenly drops. For example, we set an overflow threshold to some value, e.g. 1000, and we know that at 1000MHz and full utilization (100%) the counter will reach that threshold after 500ms (which we want, because we don't want too many interrupts per sec). What if utilization suddenly drops to 2% (i.e. from 5GB/s to 250MB/s (what if it drops to 25MB/s?!))? The counter will reach the threshold after 50*500ms = 25s. It is impossible for the counters alone to predict the next utilization and adjust the threshold.

To address that, we still need another mechanism (like a watchdog) which is triggered just to check whether the threshold needs adjustment. This mechanism can be a local timer in the driver or a framework timer running a kind of 'for loop' over all devices of this type (like the scheduled workqueue). In both cases there will be interrupts, timers (even with workqueues) and scheduling in the system. Forcing developers to implement their own local watchdog timers (or workqueues) in drivers is IMHO wrong, and that's why we have frameworks.

Regards,
Lukasz
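
The arithmetic in the example above can be checked with a short sketch. It assumes (this is an assumption for illustration, not something stated in the thread) that the counter advances at a rate strictly proportional to utilization and that the threshold is sized to be reached after 500ms at 100%. The function name and the 5GB/s = 100% mapping are hypothetical. Under that model, 2% utilization corresponds to 100MB/s and gives the 25s quoted above, while 250MB/s would be 5% (10s) and 25MB/s would be 0.5% (100s).

```python
def time_to_threshold_s(full_load_period_s: float, utilization: float) -> float:
    """Time until a fixed overflow threshold is reached.

    Assumes the counter rate is proportional to utilization and that the
    threshold is hit after `full_load_period_s` at 100% utilization.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return full_load_period_s / utilization


PERIOD_S = 0.5           # 500 ms at full utilization, as in the example
FULL_BW_MB_S = 5000.0    # 5 GB/s treated as 100% (hypothetical mapping)

for bw_mb_s in (5000.0, 250.0, 100.0, 25.0):
    util = bw_mb_s / FULL_BW_MB_S
    delay = time_to_threshold_s(PERIOD_S, util)
    print(f"{bw_mb_s:7.0f} MB/s  util={util:7.2%}  interrupt after {delay:6.1f} s")
```

The point of the sketch is only that the delay grows as 1/utilization, so a pure overflow interrupt reacts slowest exactly when the device has become idle.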
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi All,

On 24.06.2020 12:32, Lukasz Luba wrote:
> I had issues with devfreq governor which wasn't called by devfreq
> workqueue. The old DELAYED vs DEFERRED work discussions and my patches
> for it [1]. If the CPU which scheduled the next work went idle, the
> devfreq workqueue will not be kicked and devfreq governor won't check
> DMC status and will not decide to decrease the frequency based on low
> busy_time.
> The same applies for going up with the frequency. They both are
> done by the governor but the workqueue must be scheduled periodically.

As I have been working on resolving the video mixer IOMMU fault issue described here: https://patchwork.kernel.org/patch/10861757 I did some investigation of the devfreq operation, mostly on Odroid U3.

My conclusions are similar to what Lukasz says above. I would like to add that the broken scheduling of the performance counter reads and the devfreq updates seems to have one more serious implication. In each call, which normally should happen periodically with a fixed interval, we stop the counters, read the counter values and start the counters again. But if the period between calls becomes long enough to let any of the counters overflow, we will get wrong performance measurement results. My observations are that the workqueue job can be suspended for several seconds, and the conditions for a counter overflow occur sooner or later, depending, among other things, on the CPU load.

A wrong bus load measurement can lead to setting the interconnect bus clock frequency too low, and then bad things happen in peripheral devices.

I agree the workqueue issue needs to be fixed. I have some WIP code to use the performance counter overflow interrupts instead of SW polling, and with that the interconnect bus clock control seems to work much better.

-- 
Regards,
Sylwester
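
A minimal sketch of the overflow effect described above. The counter width (32 bits), the 400 MHz clock and the 90% load are hypothetical values chosen for illustration; none of them come from the thread, and the real PPMU counters may differ. The model keeps one free-running "total cycles" counter and one "busy cycles" counter, and derives the load from their ratio after a polling interval.

```python
COUNTER_BITS = 32                 # assumed counter width (hypothetical)
WRAP = 1 << COUNTER_BITS


def read_counter(rate_hz: int, elapsed_s: int) -> int:
    """Value shown by a free-running counter after elapsed_s, with wraparound."""
    return (rate_hz * elapsed_s) % WRAP


def measured_load(clock_hz: int, true_load_pct: int, elapsed_s: int) -> float:
    """Load as a governor would compute it from the two raw counter reads."""
    total = read_counter(clock_hz, elapsed_s)
    busy = read_counter(clock_hz * true_load_pct // 100, elapsed_s)
    if total == 0:
        raise ValueError("total counter wrapped exactly to zero")
    return busy / total


CLOCK_HZ = 400_000_000            # hypothetical bus clock
for elapsed in (1, 5, 12):        # seconds between two polls
    load = measured_load(CLOCK_HZ, 90, elapsed)
    print(f"poll after {elapsed:2d} s: measured load {load:.1%} (true load 90%)")
```

With these assumed numbers both counters wrap once when the poll is delayed to 12 s, and the measured load collapses to roughly 5% although the bus is 90% busy, which is the kind of error that would push a governor toward a frequency that is too low.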
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 6/24/20 2:13 PM, Krzysztof Kozlowski wrote: On Wed, Jun 24, 2020 at 02:03:03PM +0100, Lukasz Luba wrote: On 6/24/20 1:06 PM, Krzysztof Kozlowski wrote: My case was clearly showing wrong behavior. System was idle but not sleeping - network working, SSH connection ongoing. Therefore at least one CPU was not idle and could adjust the devfreq/DMC... but this did not happen. The system stayed for like a minute in 633 MHz OPP. Not-waking up idle processors - ok... so why not using power efficient workqueue? It is exactly for this purpose - wake up from time to time on whatever CPU to do the necessary job. IIRC I've done this experiment, still keeping in devfreq: INIT_DEFERRABLE_WORK() just applying patch [1]. It uses a system_wq which should be the same as system_power_efficient_wq when CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set (our case). This wasn't solving the issue for the deferred work. That's why the patch 2/2 following patch 1/2 [1] was needed. The deferred work uses TIMER_DEFERRABLE in it's initialization and this is the problem. When the deferred work was queued on a CPU, next that CPU went idle, the work was not migrated to some other CPU. The former cpu is also not woken up according to the documentation [2]. Yes, you need either workqueue.power_efficient kernel param or CONFIG option to actually enable it. But at least it could then work on any CPU. Another solution is to use directly WQ_UNBOUND. That's why Kamil's approach should be continue IMHO. It gives more control over important devices like: bus, dmc, gpu, which utilization does not strictly correspond to cpu utilization (which might be low or even 0 and cpu put into idle). I think Kamil was pointing out also some other issues not only dmc (buses probably), but I realized too late to help him. This should not be a configurable option. Why someone would prefer to use one over another and decide about this during build or run time? Instead it should be just *right* all the time. Always. 
I had the same opinion, as you can see in my explanation to those patches, but I failed. That's why I agree with Kamil's approach: it had a higher chance of getting into mainline and fixing at least some of the use cases.

> The argument that we want to save power, so we will not wake up any CPU,
> is ridiculous if the system stays in high-power mode because of it. If
> the system is idle and memory is going to be idle, someone should be
> woken up to save more power and slow down the memory controller. If the
> system is idle but memory is going to be busy, the currently busy CPU
> (which performs some memory-intensive job) could do the job and ramp up
> the devfreq performance.

I agree. I think this devfreq mechanism was designed at a time when there were 1 or 2 CPUs in the system. After a while we got ~8, and not all of them are used. This scenario was probably not tested widely on mainline platforms. That is good material for improvements, for someone who has the time and energy.

Regards,
Lukasz

> Best regards,
> Krzysztof
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On Wed, Jun 24, 2020 at 02:03:03PM +0100, Lukasz Luba wrote: > > > On 6/24/20 1:06 PM, Krzysztof Kozlowski wrote: > > My case was clearly showing wrong behavior. System was idle but not > > sleeping - network working, SSH connection ongoing. Therefore at least > > one CPU was not idle and could adjust the devfreq/DMC... but this did not > > happen. The system stayed for like a minute in 633 MHz OPP. > > > > Not-waking up idle processors - ok... so why not using power efficient > > workqueue? It is exactly for this purpose - wake up from time to time on > > whatever CPU to do the necessary job. > > IIRC I've done this experiment, still keeping in devfreq: > INIT_DEFERRABLE_WORK() > just applying patch [1]. It uses a system_wq which should > be the same as system_power_efficient_wq when > CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set (our case). > This wasn't solving the issue for the deferred work. That's > why the patch 2/2 following patch 1/2 [1] was needed. > > The deferred work uses TIMER_DEFERRABLE in it's initialization > and this is the problem. When the deferred work was queued on a CPU, > next that CPU went idle, the work was not migrated to some other CPU. > The former cpu is also not woken up according to the documentation [2]. Yes, you need either workqueue.power_efficient kernel param or CONFIG option to actually enable it. But at least it could then work on any CPU. Another solution is to use directly WQ_UNBOUND. > That's why Kamil's approach should be continue IMHO. It gives more > control over important devices like: bus, dmc, gpu, which utilization > does not strictly correspond to cpu utilization (which might be low or > even 0 and cpu put into idle). > > I think Kamil was pointing out also some other issues not only dmc > (buses probably), but I realized too late to help him. This should not be a configurable option. Why someone would prefer to use one over another and decide about this during build or run time? 
Instead it should be just *right* all the time. Always.

The argument that we want to save power, so we will not wake up any CPU, is ridiculous if the system stays in high-power mode because of it. If the system is idle and memory is going to be idle, someone should be woken up to save more power and slow down the memory controller. If the system is idle but memory is going to be busy, the currently busy CPU (which performs some memory-intensive job) could do the job and ramp up the devfreq performance.

Best regards,
Krzysztof
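
The behavior discussed in this exchange (work armed with TIMER_DEFERRABLE on a CPU that then goes idle) can be illustrated with a toy model. This is a sketch under stated assumptions, not kernel code: the function name, the 100 ms polling interval and the wakeup times are hypothetical. It assumes the CPU stays idle from the moment the timer is armed, that a non-deferrable timer wakes the CPU at expiry, and that a deferrable timer only runs once the CPU is woken by something else.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def fire_time_ms(expiry_ms: int,
                 cpu_wakeups_ms: Sequence[int],
                 deferrable: bool) -> Optional[int]:
    """Time at which a timer armed on one (idle) CPU actually runs.

    cpu_wakeups_ms must be sorted and lists the times at which this CPU is
    woken up for unrelated reasons. Returns None if the deferrable timer
    never gets a chance to run within the listed wakeups.
    """
    if not deferrable:
        return expiry_ms                      # timer itself wakes the CPU
    i = bisect_left(cpu_wakeups_ms, expiry_ms)
    return cpu_wakeups_ms[i] if i < len(cpu_wakeups_ms) else None


POLL_MS = 100                                 # hypothetical polling interval
wakeups = [40, 60_000]                        # CPU idle between 40 ms and 60 s

print("regular timer   :", fire_time_ms(POLL_MS, wakeups, deferrable=False))
print("deferrable timer:", fire_time_ms(POLL_MS, wakeups, deferrable=True))
```

In this model the governor is evaluated a minute late even though other CPUs may have been awake the whole time, which matches the complaint above that the work is not migrated to another CPU.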
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 6/24/20 1:06 PM, Krzysztof Kozlowski wrote: On Wed, Jun 24, 2020 at 01:18:42PM +0200, Kamil Konieczny wrote: Hi, On 24.06.2020 12:32, Lukasz Luba wrote: Hi Krzysztof and Willy On 6/23/20 8:11 PM, Krzysztof Kozlowski wrote: On Tue, Jun 23, 2020 at 09:02:38PM +0200, Krzysztof Kozlowski wrote: On Tue, 23 Jun 2020 at 18:47, Willy Wolff wrote: Hi everybody, Is DVFS for memory bus really working on Odroid XU3/4 board? Using a simple microbenchmark that is doing only memory accesses, memory DVFS seems to not working properly: The microbenchmark is doing pointer chasing by following index in an array. Indices in the array are set to follow a random pattern (cutting prefetcher), and forcing RAM access. git clone https://github.com/wwilly/benchmark.git \ && cd benchmark \ && source env.sh \ && ./bench_build.sh \ && bash source/scripts/test_dvfs_mem.sh Python 3, cmake and sudo rights are required. Results: DVFS CPU with performance governor mem_gov = simple_ondemand at 16500 Hz in idle, should be bumped when the benchmark is running. - on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory access), - on the big cluster it takes 4.76556 s to run (980.343 c per memory access). While forcing DVFS memory bus to use performance governor, mem_gov = performance at 82500 Hz in idle, - on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory access), - on the big cluster it takes 1.18448 s to run (243.664 c per memory access). The kernel used is the last 5.7.5 stable with default exynos_defconfig. Thanks for the report. Few thoughts: 1. What trans_stat are saying? Except DMC driver you can also check all other devfreq devices (e.g. wcore) - maybe the devfreq events (nocp) are not properly assigned? 2. Try running the measurement for ~1 minutes or longer.
The counters might have some delay (which would require probably fixing but the point is to narrow the problem). 3. What do you understand by "mem_gov"? Which device is it? +Cc Lukasz who was working on this. Thanks Krzysztof for adding me here. I just run memtester and more-or-less ondemand works (at least ramps up): Before: /sys/class/devfreq/10c2.memory-controller$ cat trans_stat From : To : 16500 20600 27500 41300 54300 63300 72800 82500 time(ms) * 16500: 0 0 0 0 0 0 0 0 1795950 20600: 1 0 0 0 0 0 0 0 4770 27500: 0 1 0 0 0 0 0 0 15540 41300: 0 0 1 0 0 0 0 0 20780 54300: 0 0 0 1 0 0 0 1 10760 63300: 0 0 0 0 2 0 0 0 10310 72800: 0 0 0 0 0 0 0 0 0 82500: 0 0 0 0 0 2 0 0 25920 Total transition : 9 $ sudo memtester 1G During memtester: /sys/class/devfreq/10c2.memory-controller$ cat trans_stat From : To : 16500 20600 27500 41300 54300 63300 72800 82500 time(ms) 16500: 0 0 0 0 0 0 0 1 1801490 20600: 1 0 0 0 0 0 0 0 4770 27500: 0 1 0 0 0 0 0 0 15540 41300: 0 0 1 0 0 0 0 0 20780 54300: 0 0 0 1 0 0 0 2 11090 63300: 0 0 0 0 3 0 0 0 17210 72800: 0 0 0 0 0 0 0 0 0 * 82500: 0 0 0 0 0 3 0 0 169020 Total transition : 13 However after killing memtester it stays at 633 MHz for very long time and does not slow down. This is indeed weird... I had issues with devfreq governor which wasn't called by devfreq workqueue. The old DELAYED vs DEFERRED work discussions and my patches for it [1]. If the CPU which scheduled the next work went idle, the devfreq workqueue will not be kicked and devfreq governor won't check DMC status and will not decide to decrease the frequency based on low busy_time. The same applies for going up with the frequency. They both are done by the
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On Wed, Jun 24, 2020 at 01:18:42PM +0200, Kamil Konieczny wrote: > Hi, > > On 24.06.2020 12:32, Lukasz Luba wrote: > > Hi Krzysztof and Willy > > > > On 6/23/20 8:11 PM, Krzysztof Kozlowski wrote: > >> On Tue, Jun 23, 2020 at 09:02:38PM +0200, Krzysztof Kozlowski wrote: > >>> On Tue, 23 Jun 2020 at 18:47, Willy Wolff > >>> wrote: > > Hi everybody, > > Is DVFS for memory bus really working on Odroid XU3/4 board? > Using a simple microbenchmark that is doing only memory accesses, memory > DVFS > seems to not working properly: > > The microbenchmark is doing pointer chasing by following index in an > array. > Indices in the array are set to follow a random pattern (cutting > prefetcher), > and forcing RAM access. > > git clone https://github.com/wwilly/benchmark.git \ > && cd benchmark \ > && source env.sh \ > && ./bench_build.sh \ > && bash source/scripts/test_dvfs_mem.sh > > Python 3, cmake and sudo rights are required. > > Results: > DVFS CPU with performance governor > mem_gov = simple_ondemand at 16500 Hz in idle, should be bumped when > the > benchmark is running. > - on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory > access), > - on the big cluster it takes 4.76556 s to run (980.343 c per memory > access). > > While forcing DVFS memory bus to use performance governor, > mem_gov = performance at 82500 Hz in idle, > - on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory > access), > - on the big cluster it takes 1.18448 s to run (243.664 c per memory > access). > > The kernel used is the last 5.7.5 stable with default exynos_defconfig. > >>> > >>> Thanks for the report. Few thoughts: > >>> 1. What trans_stat are saying? Except DMC driver you can also check > >>> all other devfreq devices (e.g. wcore) - maybe the devfreq events > >>> (nocp) are not properly assigned? > >>> 2.
Try running the measurement for ~1 minutes or longer. The counters > >>> might have some delay (which would require probably fixing but the > >>> point is to narrow the problem). > >>> 3. What do you understand by "mem_gov"? Which device is it? > >> > >> +Cc Lukasz who was working on this. > > > > Thanks Krzysztof for adding me here. > > > >> > >> I just run memtester and more-or-less ondemand works (at least ramps > >> up): > >> > >> Before: > >> /sys/class/devfreq/10c2.memory-controller$ cat trans_stat > >> From : To > >> : 16500 20600 27500 41300 54300 63300 > >> 72800 82500 time(ms) > >> * 16500: 0 0 0 0 0 0 > >> 0 0 1795950 > >> 20600: 1 0 0 0 0 0 > >> 0 0 4770 > >> 27500: 0 1 0 0 0 0 > >> 0 0 15540 > >> 41300: 0 0 1 0 0 0 > >> 0 0 20780 > >> 54300: 0 0 0 1 0 0 > >> 0 1 10760 > >> 63300: 0 0 0 0 2 0 > >> 0 0 10310 > >> 72800: 0 0 0 0 0 0 > >> 0 0 0 > >> 82500: 0 0 0 0 0 2 > >> 0 0 25920 > >> Total transition : 9 > >> > >> > >> $ sudo memtester 1G > >> > >> During memtester: > >> /sys/class/devfreq/10c2.memory-controller$ cat trans_stat > >> From : To > >> : 16500 20600 27500 41300 54300 63300 > >> 72800 82500 time(ms) > >> 16500: 0 0 0 0 0 0 > >> 0 1 1801490 > >> 20600: 1 0 0 0 0 0 > >> 0 0 4770 > >> 27500: 0 1 0 0 0 0 > >> 0 0 15540 > >> 41300: 0 0 1 0 0 0 > >> 0 0 20780 > >> 54300: 0 0 0 1 0 0 > >> 0 2 11090 > >> 63300: 0 0 0 0 3 0 > >> 0 0 17210 > >> 72800: 0 0 0 0 0 0 > >> 0 0 0 > >> * 82500: 0 0 0 0 0 3 > >> 0 0 169020 > >> Total
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi, On 24.06.2020 12:32, Lukasz Luba wrote: > Hi Krzysztof and Willy > > On 6/23/20 8:11 PM, Krzysztof Kozlowski wrote: >> On Tue, Jun 23, 2020 at 09:02:38PM +0200, Krzysztof Kozlowski wrote: >>> On Tue, 23 Jun 2020 at 18:47, Willy Wolff >>> wrote: Hi everybody, Is DVFS for memory bus really working on Odroid XU3/4 board? Using a simple microbenchmark that is doing only memory accesses, memory DVFS seems to not working properly: The microbenchmark is doing pointer chasing by following index in an array. Indices in the array are set to follow a random pattern (cutting prefetcher), and forcing RAM access. git clone https://github.com/wwilly/benchmark.git \ && cd benchmark \ && source env.sh \ && ./bench_build.sh \ && bash source/scripts/test_dvfs_mem.sh Python 3, cmake and sudo rights are required. Results: DVFS CPU with performance governor mem_gov = simple_ondemand at 16500 Hz in idle, should be bumped when the benchmark is running. - on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory access), - on the big cluster it takes 4.76556 s to run (980.343 c per memory access). While forcing DVFS memory bus to use performance governor, mem_gov = performance at 82500 Hz in idle, - on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory access), - on the big cluster it takes 1.18448 s to run (243.664 c per memory access). The kernel used is the last 5.7.5 stable with default exynos_defconfig. >>> >>> Thanks for the report. Few thoughts: >>> 1. What trans_stat are saying? Except DMC driver you can also check >>> all other devfreq devices (e.g. wcore) - maybe the devfreq events >>> (nocp) are not properly assigned? >>> 2. Try running the measurement for ~1 minutes or longer. The counters >>> might have some delay (which would require probably fixing but the >>> point is to narrow the problem). >>> 3. What do you understand by "mem_gov"?
Which device is it? >> >> +Cc Lukasz who was working on this. > > Thanks Krzysztof for adding me here. > >> >> I just run memtester and more-or-less ondemand works (at least ramps >> up): >> >> Before: >> /sys/class/devfreq/10c2.memory-controller$ cat trans_stat >> From : To >> : 16500 20600 27500 41300 54300 63300 >> 72800 82500 time(ms) >> * 16500: 0 0 0 0 0 0 >> 0 0 1795950 >> 20600: 1 0 0 0 0 0 >> 0 0 4770 >> 27500: 0 1 0 0 0 0 >> 0 0 15540 >> 41300: 0 0 1 0 0 0 >> 0 0 20780 >> 54300: 0 0 0 1 0 0 >> 0 1 10760 >> 63300: 0 0 0 0 2 0 >> 0 0 10310 >> 72800: 0 0 0 0 0 0 >> 0 0 0 >> 82500: 0 0 0 0 0 2 >> 0 0 25920 >> Total transition : 9 >> >> >> $ sudo memtester 1G >> >> During memtester: >> /sys/class/devfreq/10c2.memory-controller$ cat trans_stat >> From : To >> : 16500 20600 27500 41300 54300 63300 >> 72800 82500 time(ms) >> 16500: 0 0 0 0 0 0 >> 0 1 1801490 >> 20600: 1 0 0 0 0 0 >> 0 0 4770 >> 27500: 0 1 0 0 0 0 >> 0 0 15540 >> 41300: 0 0 1 0 0 0 >> 0 0 20780 >> 54300: 0 0 0 1 0 0 >> 0 2 11090 >> 63300: 0 0 0 0 3 0 >> 0 0 17210 >> 72800: 0 0 0 0 0 0 >> 0 0 0 >> * 82500: 0 0 0 0 0 3 >> 0 0 169020 >> Total transition : 13 >> >> However after killing memtester it stays at 633 MHz for very long time >> and does not slow down. This is indeed weird... > > I had issues with devfreq governor which wasn't called by devfreq > workqueue. The old DELAYED vs DEFERRED work discussions and my patches > for it [1]. If
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Krzysztof and Willy

On 6/23/20 8:11 PM, Krzysztof Kozlowski wrote:
> On Tue, Jun 23, 2020 at 09:02:38PM +0200, Krzysztof Kozlowski wrote:
>> On Tue, 23 Jun 2020 at 18:47, Willy Wolff wrote:
>>> Hi everybody,
>>>
>>> Is DVFS for memory bus really working on Odroid XU3/4 board?
>>> Using a simple microbenchmark that is doing only memory accesses, memory
>>> DVFS does not seem to be working properly:
>>>
>>> The microbenchmark is doing pointer chasing by following index in an array.
>>> Indices in the array are set to follow a random pattern (cutting prefetcher),
>>> and forcing RAM access.
>>>
>>> git clone https://github.com/wwilly/benchmark.git \
>>>   && cd benchmark \
>>>   && source env.sh \
>>>   && ./bench_build.sh \
>>>   && bash source/scripts/test_dvfs_mem.sh
>>>
>>> Python 3, cmake and sudo rights are required.
>>>
>>> Results:
>>> DVFS CPU with performance governor
>>> mem_gov = simple_ondemand at 16500 Hz in idle, should be bumped when the
>>> benchmark is running.
>>> - on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory access),
>>> - on the big cluster it takes 4.76556 s to run (980.343 c per memory access).
>>>
>>> While forcing DVFS memory bus to use performance governor,
>>> mem_gov = performance at 82500 Hz in idle,
>>> - on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory access),
>>> - on the big cluster it takes 1.18448 s to run (243.664 c per memory access).
>>>
>>> The kernel used is the last 5.7.5 stable with default exynos_defconfig.
>>
>> Thanks for the report. Few thoughts:
>> 1. What trans_stat are saying? Except DMC driver you can also check
>>    all other devfreq devices (e.g. wcore) - maybe the devfreq events
>>    (nocp) are not properly assigned?
>> 2. Try running the measurement for ~1 minutes or longer. The counters
>>    might have some delay (which would require probably fixing but the
>>    point is to narrow the problem).
>> 3. What do you understand by "mem_gov"? Which device is it?
>
> +Cc Lukasz who was working on this.

Thanks Krzysztof for adding me here.
> I just run memtester and more-or-less ondemand works (at least ramps up):
>
> Before:
> /sys/class/devfreq/10c2.memory-controller$ cat trans_stat
>    From : To
>        : 16500 20600 27500 41300 54300 63300 72800 82500  time(ms)
> * 16500:     0     0     0     0     0     0     0     0   1795950
>   20600:     1     0     0     0     0     0     0     0      4770
>   27500:     0     1     0     0     0     0     0     0     15540
>   41300:     0     0     1     0     0     0     0     0     20780
>   54300:     0     0     0     1     0     0     0     1     10760
>   63300:     0     0     0     0     2     0     0     0     10310
>   72800:     0     0     0     0     0     0     0     0         0
>   82500:     0     0     0     0     0     2     0     0     25920
> Total transition : 9
>
> $ sudo memtester 1G
>
> During memtester:
> /sys/class/devfreq/10c2.memory-controller$ cat trans_stat
>    From : To
>        : 16500 20600 27500 41300 54300 63300 72800 82500  time(ms)
>   16500:     0     0     0     0     0     0     0     1   1801490
>   20600:     1     0     0     0     0     0     0     0      4770
>   27500:     0     1     0     0     0     0     0     0     15540
>   41300:     0     0     1     0     0     0     0     0     20780
>   54300:     0     0     0     1     0     0     0     2     11090
>   63300:     0     0     0     0     3     0     0     0     17210
>   72800:     0     0     0     0     0     0     0     0         0
> * 82500:     0     0     0     0     0     3     0     0    169020
> Total transition : 13
>
> However after killing memtester it stays at 633 MHz for very long time
> and does not slow down. This is indeed weird...

I had issues with devfreq governor which wasn't called by devfreq workqueue. The old DELAYED vs DEFERRED work discussions and my patches for it [1]. If the CPU which scheduled the next work went idle, the devfreq workqueue will not be kicked and devfreq governor won't check DMC status and will not decide to decrease the frequency based on low busy_time. The same applies for going up with the frequency. They both are done by the governor but the workqueue must be scheduled periodically.

I couldn't do much with this back then. I have given the example that this is causing issues with the DMC [2]. There is also a description of your situation staying at 633MHz for long time: ' When it is missing opportunity to change the
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On 2020-06-24-10-14-38, Krzysztof Kozlowski wrote:
> On Wed, Jun 24, 2020 at 10:01:17AM +0200, Willy Wolff wrote:
> > Hi Krzysztof,
> >
> > Thanks for looking at it.
> >
> > mem_gov is /sys/class/devfreq/10c2.memory-controller/governor
> >
> > Here are some numbers after increasing the running time.
> >
> > Running with simple_ondemand:
> >
> > Before:
> >   From : To
> >        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
> > * 16500:      0      0      0      0      0      0      0      4   4528600
> >   20600:      5      0      0      0      0      0      0      0     57780
> >   27500:      0      5      0      0      0      0      0      0     50060
> >   41300:      0      0      5      0      0      0      0      0     46240
> >   54300:      0      0      0      5      0      0      0      0     48970
> >   63300:      0      0      0      0      5      0      0      0     47330
> >   72800:      0      0      0      0      0      0      0      0         0
> >   82500:      0      0      0      0      0      5      0      0    331300
> > Total transition : 34
> >
> > After:
> >   From : To
> >        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
> > * 16500:      0      0      0      0      0      0      0      4   5098890
> >   20600:      5      0      0      0      0      0      0      0     57780
> >   27500:      0      5      0      0      0      0      0      0     50060
> >   41300:      0      0      5      0      0      0      0      0     46240
> >   54300:      0      0      0      5      0      0      0      0     48970
> >   63300:      0      0      0      0      5      0      0      0     47330
> >   72800:      0      0      0      0      0      0      0      0         0
> >   82500:      0      0      0      0      0      5      0      0    331300
> > Total transition : 34
> >
> > With a running time of:
> > LITTLE => 283.699 s (680.877 c per mem access)
> > big => 284.47 s (975.327 c per mem access)
>
> I see there were no transitions during your memory test.
>
> > And when I set the performance governor:
> >
> > Before:
> >   From : To
> >        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
> >   16500:      0      0      0      0      0      0      0      5   5099040
> >   20600:      5      0      0      0      0      0      0      0     57780
> >   27500:      0      5      0      0      0      0      0      0     50060
> >   41300:      0      0      5      0      0      0      0      0     46240
> >   54300:      0      0      0      5      0      0      0      0     48970
> >   63300:      0      0      0      0      5      0      0      0     47330
> >   72800:      0      0      0      0      0      0      0      0         0
> > * 82500:      0      0      0      0      0      5      0      0    331350
> > Total transition : 35
> >
> > After:
> >   From : To
> >        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
> >   16500:      0      0      0      0      0      0      0      5   5099040
> >   20600:      5      0      0      0      0      0      0      0     57780
> >   27500:      0      5      0      0      0      0      0      0     50060
> >   41300:      0      0      5      0      0      0      0      0     46240
> >   54300:      0      0      0      5      0      0      0      0     48970
> >   63300:      0      0      0      0      5      0      0      0     47330
> >   72800:      0      0      0      0      0      0      0      0         0
> > * 82500:      0      0      0      0      0      5      0      0    472980
> > Total transition : 35
> >
> > With a running time of:
>
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On Wed, Jun 24, 2020 at 10:01:17AM +0200, Willy Wolff wrote:
> Hi Krzysztof,
>
> Thanks for looking at it.
>
> mem_gov is /sys/class/devfreq/10c2.memory-controller/governor
>
> Here are some numbers after increasing the running time.
>
> Running with simple_ondemand:
>
> Before:
>   From : To
>        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
> * 16500:      0      0      0      0      0      0      0      4   4528600
>   20600:      5      0      0      0      0      0      0      0     57780
>   27500:      0      5      0      0      0      0      0      0     50060
>   41300:      0      0      5      0      0      0      0      0     46240
>   54300:      0      0      0      5      0      0      0      0     48970
>   63300:      0      0      0      0      5      0      0      0     47330
>   72800:      0      0      0      0      0      0      0      0         0
>   82500:      0      0      0      0      0      5      0      0    331300
> Total transition : 34
>
> After:
>   From : To
>        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
> * 16500:      0      0      0      0      0      0      0      4   5098890
>   20600:      5      0      0      0      0      0      0      0     57780
>   27500:      0      5      0      0      0      0      0      0     50060
>   41300:      0      0      5      0      0      0      0      0     46240
>   54300:      0      0      0      5      0      0      0      0     48970
>   63300:      0      0      0      0      5      0      0      0     47330
>   72800:      0      0      0      0      0      0      0      0         0
>   82500:      0      0      0      0      0      5      0      0    331300
> Total transition : 34
>
> With a running time of:
> LITTLE => 283.699 s (680.877 c per mem access)
> big => 284.47 s (975.327 c per mem access)

I see there were no transitions during your memory test.

> And when I set the performance governor:
>
> Before:
>   From : To
>        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
>   16500:      0      0      0      0      0      0      0      5   5099040
>   20600:      5      0      0      0      0      0      0      0     57780
>   27500:      0      5      0      0      0      0      0      0     50060
>   41300:      0      0      5      0      0      0      0      0     46240
>   54300:      0      0      0      5      0      0      0      0     48970
>   63300:      0      0      0      0      5      0      0      0     47330
>   72800:      0      0      0      0      0      0      0      0         0
> * 82500:      0      0      0      0      0      5      0      0    331350
> Total transition : 35
>
> After:
>   From : To
>        :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
>   16500:      0      0      0      0      0      0      0      5   5099040
>   20600:      5      0      0      0      0      0      0      0     57780
>   27500:      0      5      0      0      0      0      0      0     50060
>   41300:      0      0      5      0      0      0      0      0     46240
>   54300:      0      0      0      5      0      0      0      0     48970
>   63300:      0      0      0      0      5      0      0      0     47330
>   72800:      0      0      0      0      0      0      0      0         0
> * 82500:      0      0      0      0      0      5      0      0    472980
> Total transition : 35
>
> With a running time of:
> LITTLE: 68.8428 s (165.223 c per mem access)
> big: 71.3268 s (244.549 c per mem access)
>
> I see some transitions, but they do not occur during the benchmark.
> I haven't dived into the code, but maybe the heuristic behind it is not
> well defined? If you know
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
Hi Krzysztof,

Thanks for looking at it.

mem_gov is /sys/class/devfreq/10c2.memory-controller/governor

Here are some numbers after increasing the running time.

Running with simple_ondemand:

Before:
  From : To
       :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
* 16500:      0      0      0      0      0      0      0      4   4528600
  20600:      5      0      0      0      0      0      0      0     57780
  27500:      0      5      0      0      0      0      0      0     50060
  41300:      0      0      5      0      0      0      0      0     46240
  54300:      0      0      0      5      0      0      0      0     48970
  63300:      0      0      0      0      5      0      0      0     47330
  72800:      0      0      0      0      0      0      0      0         0
  82500:      0      0      0      0      0      5      0      0    331300
Total transition : 34

After:
  From : To
       :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
* 16500:      0      0      0      0      0      0      0      4   5098890
  20600:      5      0      0      0      0      0      0      0     57780
  27500:      0      5      0      0      0      0      0      0     50060
  41300:      0      0      5      0      0      0      0      0     46240
  54300:      0      0      0      5      0      0      0      0     48970
  63300:      0      0      0      0      5      0      0      0     47330
  72800:      0      0      0      0      0      0      0      0         0
  82500:      0      0      0      0      0      5      0      0    331300
Total transition : 34

With a running time of:
LITTLE => 283.699 s (680.877 c per mem access)
big => 284.47 s (975.327 c per mem access)

And when I set the performance governor:

Before:
  From : To
       :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
  16500:      0      0      0      0      0      0      0      5   5099040
  20600:      5      0      0      0      0      0      0      0     57780
  27500:      0      5      0      0      0      0      0      0     50060
  41300:      0      0      5      0      0      0      0      0     46240
  54300:      0      0      0      5      0      0      0      0     48970
  63300:      0      0      0      0      5      0      0      0     47330
  72800:      0      0      0      0      0      0      0      0         0
* 82500:      0      0      0      0      0      5      0      0    331350
Total transition : 35

After:
  From : To
       :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
  16500:      0      0      0      0      0      0      0      5   5099040
  20600:      5      0      0      0      0      0      0      0     57780
  27500:      0      5      0      0      0      0      0      0     50060
  41300:      0      0      5      0      0      0      0      0     46240
  54300:      0      0      0      5      0      0      0      0     48970
  63300:      0      0      0      0      5      0      0      0     47330
  72800:      0      0      0      0      0      0      0      0         0
* 82500:      0      0      0      0      0      5      0      0    472980
Total transition : 35

With a running time of:
LITTLE: 68.8428 s (165.223 c per mem access)
big: 71.3268 s (244.549 c per mem access)

I see some transitions, but they do not occur during the benchmark.
I haven't dived into the code, but maybe the heuristic behind it is not
well defined? If you know how it works, that would be helpful before I
dive into it.

I ran your test as well, and indeed it seems to work for a large chunk of
memory, and there is some delay before a transition is made (it seems to be
around 10 s). When you kill memtester, it reduces the frequency stepwise
every ~10 s.

Note that the timing shown above account for the
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On Tue, Jun 23, 2020 at 09:02:38PM +0200, Krzysztof Kozlowski wrote:
> On Tue, 23 Jun 2020 at 18:47, Willy Wolff wrote:
> >
> > Hi everybody,
> >
> > Is DVFS for the memory bus really working on the Odroid XU3/4 board?
> > Using a simple microbenchmark that does only memory accesses, memory
> > DVFS does not seem to work properly.
> >
> > The microbenchmark does pointer chasing by following indices in an array.
> > The indices are set to follow a random pattern (defeating the prefetcher),
> > which forces RAM accesses.
> >
> > git clone https://github.com/wwilly/benchmark.git \
> >   && cd benchmark \
> >   && source env.sh \
> >   && ./bench_build.sh \
> >   && bash source/scripts/test_dvfs_mem.sh
> >
> > Python 3, cmake and sudo rights are required.
> >
> > Results:
> > CPU DVFS with the performance governor.
> > mem_gov = simple_ondemand at 16500 Hz in idle; it should be bumped when
> > the benchmark is running.
> > - on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory
> >   access),
> > - on the big cluster it takes 4.76556 s to run (980.343 c per memory
> >   access).
> >
> > While forcing the memory bus DVFS to use the performance governor,
> > mem_gov = performance at 82500 Hz in idle:
> > - on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory
> >   access),
> > - on the big cluster it takes 1.18448 s to run (243.664 c per memory
> >   access).
> >
> > The kernel used is the latest 5.7.5 stable with the default exynos_defconfig.
>
> Thanks for the report. A few thoughts:
> 1. What does trans_stat say? Besides the DMC driver, you can also check
>    all other devfreq devices (e.g. wcore) - maybe the devfreq events
>    (nocp) are not properly assigned?
> 2. Try running the measurement for ~1 minute or longer. The counters
>    might have some delay (which would probably require fixing, but the
>    point is to narrow down the problem).
> 3. What do you mean by "mem_gov"? Which device is it?

+Cc Lukasz who was working on this.

I just ran memtester and ondemand more or less works (at least it ramps up):

Before:
/sys/class/devfreq/10c2.memory-controller$ cat trans_stat
  From : To
       :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
* 16500:      0      0      0      0      0      0      0      0   1795950
  20600:      1      0      0      0      0      0      0      0      4770
  27500:      0      1      0      0      0      0      0      0     15540
  41300:      0      0      1      0      0      0      0      0     20780
  54300:      0      0      0      1      0      0      0      1     10760
  63300:      0      0      0      0      2      0      0      0     10310
  72800:      0      0      0      0      0      0      0      0         0
  82500:      0      0      0      0      0      2      0      0     25920
Total transition : 9

$ sudo memtester 1G

During memtester:
/sys/class/devfreq/10c2.memory-controller$ cat trans_stat
  From : To
       :  16500  20600  27500  41300  54300  63300  72800  82500  time(ms)
  16500:      0      0      0      0      0      0      0      1   1801490
  20600:      1      0      0      0      0      0      0      0      4770
  27500:      0      1      0      0      0      0      0      0     15540
  41300:      0      0      1      0      0      0      0      0     20780
  54300:      0      0      0      1      0      0      0      2     11090
  63300:      0      0      0      0      3      0      0      0     17210
  72800:      0      0      0      0      0      0      0      0         0
* 82500:      0      0      0      0      0      3      0      0    169020
Total transition : 13

However, after killing memtester it stays at 633 MHz for a very long time
and does not slow down. This is indeed weird...

Best regards,
Krzysztof
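
Comparing two trans_stat snapshots by eye is error-prone, so here is a small sketch that parses them and reports how long the device spent at each frequency in between. The function names parse_trans_stat and residency_delta are made up for illustration, and the parser assumes the row layout shown in this thread (an optional '*', the frequency label, a colon, the transition counts, then time(ms) last). The exact layout may differ between kernel versions.

```python
def parse_trans_stat(text):
    """Return ({freq_label: time_ms}, current_label) from trans_stat text."""
    times = {}
    current = None
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        label = head.replace("*", "").strip()
        fields = rest.split()
        if not sep or not label.isdigit() or not fields:
            continue  # skips the two header lines and "Total transition"
        times[int(label)] = int(fields[-1])  # last column is time(ms)
        if "*" in head:
            current = int(label)  # '*' marks the currently active frequency
    return times, current


def residency_delta(before, after):
    """Milliseconds spent at each frequency between two snapshots."""
    return {freq: after[freq] - before.get(freq, 0) for freq in after}


if __name__ == "__main__":
    # Two rows copied from the "Before" and "During memtester" tables.
    before_text = """\
* 16500: 0 0 0 0 0 0 0 0 1795950
  82500: 0 0 0 0 0 2 0 0 25920
"""
    during_text = """\
  16500: 0 0 0 0 0 0 0 1 1801490
* 82500: 0 0 0 0 0 3 0 0 169020
"""
    before, _ = parse_trans_stat(before_text)
    during, current = parse_trans_stat(during_text)
    print("current:", current)
    print("delta (ms):", residency_delta(before, during))
```

For the two rows used here, the delta is 5540 ms at 16500 and 143100 ms at 82500, which matches the observation that the governor ramped up during memtester.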
Re: brocken devfreq simple_ondemand for Odroid XU3/4?
On Tue, 23 Jun 2020 at 18:47, Willy Wolff wrote:
>
> Hi everybody,
>
> Is DVFS for the memory bus really working on the Odroid XU3/4 board?
> Using a simple microbenchmark that does only memory accesses, memory
> DVFS does not seem to work properly.
>
> The microbenchmark does pointer chasing by following indices in an array.
> The indices are set to follow a random pattern (defeating the prefetcher),
> which forces RAM accesses.
>
> git clone https://github.com/wwilly/benchmark.git \
>   && cd benchmark \
>   && source env.sh \
>   && ./bench_build.sh \
>   && bash source/scripts/test_dvfs_mem.sh
>
> Python 3, cmake and sudo rights are required.
>
> Results:
> CPU DVFS with the performance governor.
> mem_gov = simple_ondemand at 16500 Hz in idle; it should be bumped when
> the benchmark is running.
> - on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory
>   access),
> - on the big cluster it takes 4.76556 s to run (980.343 c per memory
>   access).
>
> While forcing the memory bus DVFS to use the performance governor,
> mem_gov = performance at 82500 Hz in idle:
> - on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory
>   access),
> - on the big cluster it takes 1.18448 s to run (243.664 c per memory
>   access).
>
> The kernel used is the latest 5.7.5 stable with the default exynos_defconfig.

Thanks for the report. A few thoughts:
1. What does trans_stat say? Besides the DMC driver, you can also check
   all other devfreq devices (e.g. wcore) - maybe the devfreq events
   (nocp) are not properly assigned?
2. Try running the measurement for ~1 minute or longer. The counters
   might have some delay (which would probably require fixing, but the
   point is to narrow down the problem).
3. What do you mean by "mem_gov"? Which device is it?

Best regards,
Krzysztof
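
To follow the first suggestion without checking each device by hand, a sketch like the one below could dump the governor, current frequency and trans_stat of every devfreq device. It relies only on the standard sysfs attributes governor, cur_freq and trans_stat under /sys/class/devfreq; the helper names list_devfreq and read_attr are hypothetical, and some attributes may need root to read or may be missing on a given kernel, in which case they come back as None.

```python
from pathlib import Path


def read_attr(dev, name):
    """Read one sysfs attribute, or return None if it is missing/unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def list_devfreq(root="/sys/class/devfreq"):
    """Map each devfreq device name to its governor, cur_freq and trans_stat."""
    devices = {}
    base = Path(root)
    if not base.is_dir():
        return devices  # no devfreq support, or not running on Linux
    for dev in sorted(base.iterdir()):
        devices[dev.name] = {
            "governor": read_attr(dev, "governor"),
            "cur_freq": read_attr(dev, "cur_freq"),
            "trans_stat": read_attr(dev, "trans_stat"),
        }
    return devices


if __name__ == "__main__":
    for name, info in list_devfreq().items():
        print(f"{name}: governor={info['governor']} cur_freq={info['cur_freq']}")
        if info["trans_stat"]:
            print(info["trans_stat"])
```

The root directory is a parameter only so the function can be tried against a copy of the sysfs tree on another machine.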
brocken devfreq simple_ondemand for Odroid XU3/4?
Hi everybody,

Is DVFS for the memory bus really working on the Odroid XU3/4 board?
Using a simple microbenchmark that does only memory accesses, memory
DVFS does not seem to work properly.

The microbenchmark does pointer chasing by following indices in an array.
The indices are set to follow a random pattern (defeating the prefetcher),
which forces RAM accesses.

git clone https://github.com/wwilly/benchmark.git \
  && cd benchmark \
  && source env.sh \
  && ./bench_build.sh \
  && bash source/scripts/test_dvfs_mem.sh

Python 3, cmake and sudo rights are required.

Results:
CPU DVFS with the performance governor.
mem_gov = simple_ondemand at 16500 Hz in idle; it should be bumped when
the benchmark is running.
- on the LITTLE cluster it takes 4.74308 s to run (683.004 c per memory
  access),
- on the big cluster it takes 4.76556 s to run (980.343 c per memory
  access).

While forcing the memory bus DVFS to use the performance governor,
mem_gov = performance at 82500 Hz in idle:
- on the LITTLE cluster it takes 1.1451 s to run (164.894 c per memory
  access),
- on the big cluster it takes 1.18448 s to run (243.664 c per memory
  access).

The kernel used is the latest 5.7.5 stable with the default exynos_defconfig.

Cheers,
Willy
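
For readers unfamiliar with pointer chasing, the idea described in the message can be sketched as follows. This is not the code from the linked repository (its implementation is not shown in this thread); build_chain and chase are illustrative names, and Sattolo's algorithm is used here only as one simple way to get a random single-cycle permutation so that every slot is visited before the walk repeats. In Python the interpreter overhead dominates the timing, so treat this as an illustration of the access pattern and not as a memory-latency measurement.

```python
import random
import time


def build_chain(n, seed=0):
    """Return a list where chain[i] is the next index to visit.

    Sattolo's algorithm produces a permutation made of one single cycle,
    so a walk starting anywhere touches all n slots before repeating.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain, steps, start=0):
    """Follow the chain for `steps` dependent loads and return the final index."""
    idx = start
    for _ in range(steps):
        idx = chain[idx]  # each load depends on the previous one
    return idx


if __name__ == "__main__":
    n = 1 << 18
    chain = build_chain(n)
    t0 = time.perf_counter()
    chase(chain, n)
    elapsed = time.perf_counter() - t0
    print(f"{elapsed / n * 1e9:.1f} ns per access (includes interpreter overhead)")
```

Because each index is only known once the previous load completes, the hardware prefetcher cannot predict the next address, which is what makes this pattern sensitive to the memory bus frequency.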