[PATCH] serial: qcom_geni_serial: To correct QUP Version detection logic
The current implementation halves the sampling rate if the QUP HW version
is 2.5 or greater, by checking that the GENI SE major version is at least 2
and the GENI SE minor version is at least 5. This check fails for version 3
or greater whenever the minor version is below 5 (for example 3.0). Hence
the new implementation checks if the version is greater than or equal to
0x2005, which would work for any future version.

Signed-off-by: Paras Sharma
---
 drivers/tty/serial/qcom_geni_serial.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index f0b1b47..e18b431 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1000,7 +1000,7 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
 	sampling_rate = UART_OVERSAMPLING;
 	/* Sampling rate is halved for IP versions >= 2.5 */
 	ver = geni_se_get_qup_hw_version(&port->se);
-	if (GENI_SE_VERSION_MAJOR(ver) >= 2 && GENI_SE_VERSION_MINOR(ver) >= 5)
+	if (ver >= 0x2005)
 		sampling_rate /= 2;

 	clk_rate = get_clk_div_rate(baud, sampling_rate, &clk_div);
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
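To illustrate why a field-by-field comparison misses versions such as 3.0 while a comparison of the packed value does not, here is a minimal sketch. It assumes a hypothetical packing (major in bits 15:12, minor in bits 11:0) chosen only so that 2.5 packs to 0x2005; the demo_* names are made up for this example and the real GENI SE register layout may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * ASSUMPTION: hypothetical packing, major in bits 15:12 and minor in
 * bits 11:0, so that version 2.5 becomes 0x2005. This is not taken from
 * the GENI SE headers.
 */
static uint32_t demo_ver(uint32_t major, uint32_t minor)
{
	assert(major <= 0xf && minor <= 0xfff);
	return (major << 12) | minor;
}

static uint32_t demo_ver_major(uint32_t ver) { return ver >> 12; }
static uint32_t demo_ver_minor(uint32_t ver) { return ver & 0xfffu; }

/* Old-style check: major and minor are compared independently. */
static bool halve_fieldwise(uint32_t ver)
{
	return demo_ver_major(ver) >= 2 && demo_ver_minor(ver) >= 5;
}

/* New-style check: the packed value is compared as a whole. */
static bool halve_packed(uint32_t ver)
{
	return ver >= demo_ver(2, 5);
}
```

With this packing, version 3.0 is rejected by halve_fieldwise() (minor 0 is below 5) but accepted by halve_packed(), which is the behaviour the patch description asks for.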
RE: [PATCH] MAINTAINERS: Remove bouncing email of Beniamin Bia
[yes, I know, bad-email format, but I wanted this to come from my work email]

Apologies also for the delay here.
Things pile up on my side and I defer things a bit.

Talked to Michael Hennerich about this [since he's the more senior contact
at Analog].
We can replace the email from Beniamin Bia with Michael's.
Or, we can remove the "Orphan" blocks and just have the catch-all
"drivers/iio/*/ad*" cover this driver and others that were upstreamed by
Beniamin.

Either option is fine with us.

-----Original Message-----
From: Krzysztof Kozlowski
Sent: Saturday, August 29, 2020 5:58 PM
To: Jonathan Cameron
Cc: Andy Shevchenko; Linux Kernel Mailing List; Hennerich, Michael; linux-iio; Ardelean, Alexandru
Subject: Re: [PATCH] MAINTAINERS: Remove bouncing email of Beniamin Bia

On Sat, 29 Aug 2020 at 16:54, Jonathan Cameron wrote:
(...)
> > > ANALOG DEVICES INC AD7091R5 DRIVER
> > > -M: Beniamin Bia
> > > L: linux-...@vger.kernel.org
> > > -S: Supported
> > > +S: Orphan
>
> Given it should be covered by the catch all for Analog devices IIO
> drivers, either we should confirm if it should move to someone else at
> Analog, or if we should just drop specifically listing this one.
> Listing it as Orphan when they are good at supporting their drivers
> may give the wrong impression.
>
> +CC Alex to make sure people at Analog notice :)

Sure, good point. I wanted to start the discussion so the interested
people might appear.

Best regards,
Krzysztof
Re: [PATCH v3 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary
On Tue 01-09-20 18:25:58, Suren Baghdasaryan wrote:
> Currently __set_oom_adj loops through all processes in the system to
> keep oom_score_adj and oom_score_adj_min in sync between processes
> sharing their mm. This is done for any task with more than one mm_users,
> which includes processes with multiple threads (sharing mm and signals).
> However for such processes the loop is unnecessary because their signal
> structure is shared as well.
> Android updates oom_score_adj whenever a task changes its role
> (background/foreground/...) or binds to/unbinds from a service, making
> it more/less important. Such operation can happen frequently.
> We noticed that updates to oom_score_adj became more expensive and after
> further investigation found out that the patch mentioned in "Fixes"
> introduced a regression. Using Pixel 4 with a typical Android workload,
> write time to oom_score_adj increased from ~3.57us to ~362us. Moreover
> this regression linearly depends on the number of multi-threaded
> processes running on the system.
> Mark the mm with a new MMF_MULTIPROCESS flag bit when task is created with
> (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK). Change __set_oom_adj to use
> MMF_MULTIPROCESS instead of mm_users to decide whether oom_score_adj
> update should be synchronized between multiple processes. To prevent
> races between clone() and __set_oom_adj(), when oom_score_adj of the
> process being cloned might be modified from userspace, we use
> oom_adj_mutex. Its scope is changed to global. The combination of
> (CLONE_VM && !CLONE_THREAD) is rarely used except for the case of vfork().
> To prevent performance regressions of vfork(), we skip taking oom_adj_mutex
> and setting MMF_MULTIPROCESS when CLONE_VFORK is specified. Clearing the
> MMF_MULTIPROCESS flag (when the last process sharing the mm exits) is left
> out of this patch to keep it simple and because it is believed that this
> threading model is rare. Should there ever be a need for optimizing that
> case as well, it can be done by hooking into the exit path, likely
> following the mm_update_next_owner pattern.
> With the combination of (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK) being
> quite rare, the regression is gone after the change is applied.
>
> Fixes: 44a70adec910 ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj")
> Reported-by: Tim Murray
> Debugged-by: Minchan Kim
> Suggested-by: Michal Hocko
> Signed-off-by: Suren Baghdasaryan

Acked-by: Michal Hocko

> ---
>
> v3:
> - Addressed Eric Biederman's comments from:
> https://lore.kernel.org/linux-mm/87imd6n0qk@x220.int.ebiederm.org/
> -- renamed oom_adj_lock back to oom_adj_mutex
> -- renamed MMF_PROC_SHARED into MMF_MULTIPROCESS and fixed its comment
> - Updated description to reflect the change
>
> v2:
> - https://lore.kernel.org/linux-mm/20200824153036.3201505-1-sur...@google.com/
> - Implemented proposal from Michal Hocko in:
> https://lore.kernel.org/linux-fsdevel/20200820124109.gi5...@dhcp22.suse.cz/
> - Updated description to reflect the change
>
> v1:
> - https://lore.kernel.org/linux-mm/20200820002053.1424000-1-sur...@google.com/
>
> fs/proc/base.c | 3 +--
> include/linux/oom.h | 1 +
> include/linux/sched/coredump.h | 1 +
> kernel/fork.c | 21 +
> mm/oom_kill.c | 2 ++
> 5 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 617db4e0faa0..aa69c35d904c 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1055,7 +1055,6 @@ static ssize_t oom_adj_read(struct file *file, char __user *buf, size_t count,
>
>  static int __set_oom_adj(struct file *file, int oom_adj, bool legacy)
>  {
> -	static DEFINE_MUTEX(oom_adj_mutex);
>  	struct mm_struct *mm = NULL;
>  	struct task_struct *task;
>  	int err = 0;
> @@ -1095,7 +1094,7 @@ static int __set_oom_adj(struct file *file, int oom_adj, bool legacy)
>  		struct task_struct *p = find_lock_task_mm(task);
>
>  		if (p) {
> -			if (atomic_read(&p->mm->mm_users) > 1) {
> +			if (test_bit(MMF_MULTIPROCESS, &p->mm->flags)) {
>  				mm = p->mm;
>  				mmgrab(mm);
>  			}
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index f022f581ac29..2db9a1432511 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -55,6 +55,7 @@ struct oom_control {
>  };
>
>  extern struct mutex oom_lock;
> +extern struct mutex oom_adj_mutex;
>
>  static inline void set_current_oom_origin(void)
>  {
> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> index ecdc6542070f..dfd82eab2902 100644
> --- a/include/linux/sched/coredump.h
> +++ b/include/linux/sched/coredump.h
> @@ -72,6 +72,7 @@ static inline int get_dumpable(struct mm_struct *mm)
> #define MMF_DISABLE_THP
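The flag condition from the patch description, (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK), can be sketched as a small standalone predicate. This is only an illustration, not the code from the patch: the helper name is made up, and the DEMO_* constants are local copies of the values from the Linux uapi header <linux/sched.h>.

```c
#include <stdbool.h>

/* Local copies of the clone flag values from <linux/sched.h>. */
#define DEMO_CLONE_VM      0x00000100UL
#define DEMO_CLONE_VFORK   0x00004000UL
#define DEMO_CLONE_THREAD  0x00010000UL

/*
 * Hypothetical helper: true when the new task shares the mm with its
 * parent but is neither a thread nor a vfork() child, i.e. the case in
 * which the description says the mm is marked MMF_MULTIPROCESS.
 */
static bool creates_multiprocess_mm(unsigned long clone_flags)
{
	const unsigned long mask =
		DEMO_CLONE_VM | DEMO_CLONE_THREAD | DEMO_CLONE_VFORK;

	/* Of the three flags, only CLONE_VM may be set. */
	return (clone_flags & mask) == DEMO_CLONE_VM;
}
```

A plain thread (CLONE_VM | CLONE_THREAD) and a vfork() child (CLONE_VM | CLONE_VFORK) both return false here, which matches the two cases the description wants to keep cheap.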
Re: [PATCH 5.8 000/255] 5.8.6-rc1 review
On Tue, 1 Sep 2020 at 21:06, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 5.8.6 release. > There are 255 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 03 Sep 2020 15:09:01 +. > Anything received after that time might be too late. > > The whole patch series can be found in one patch at: > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.6-rc1.gz > or in the git tree and branch at: > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > linux-5.8.y > and the diffstat can be found below. > > thanks, > > greg k-h > While running LTP CVE test suite on i386 this BUG triggered after the known warning. Please find below full test log link [1]. This was reported on the mailing list on next-20200811 but did not get any reply [2]. [ 138.177043] [ cut here ] [ 138.181675] WARNING: CPU: 1 PID: 8301 at mm/mremap.c:230 move_page_tables+0x6ef/0x720 [ 138.189515] Modules linked in: x86_pkg_temp_thermal [ 138.194436] CPU: 1 PID: 8301 Comm: true Not tainted 5.8.6-rc1 #1 [ 138.194437] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018 [ 138.194439] EIP: move_page_tables+0x6ef/0x720 <> [ 802.156512] BUG: unable to handle page fault for address: fe402000 [ 802.162703] #PF: supervisor write access in kernel mode [ 802.167927] #PF: error_code(0x0002) - not-present page [ 802.173064] *pde = 23e61067 *pte = 64b32163 [ 802.177329] Oops: 0002 [#1] SMP [ 802.180469] CPU: 1 PID: 13118 Comm: cve-2017-17053 Tainted: G W 5.8.6-rc1 #1 [ 802.188811] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018 [ 802.196199] EIP: memcpy+0x14/0x30 [ 802.199517] Code: e8 a1 72 c5 ff 0f 31 31 c3 59 58 eb 85 cc cc cc cc cc cc cc cc cc 3e 8d 74 26 00 55 89 e5 57 89 c7 56 89 d6 53 89 cb c1 e9 02 a5 89 d9 83 e1 03 74 02 f3 a4 5b 5e 5f 5d c3 8d b4 26 00 00 00 [ 
802.218259] EAX: fe402000 EBX: 0001 ECX: 4000 EDX: fb3dd000 [ 802.224518] ESI: fb3dd000 EDI: fe402000 EBP: ea799ddc ESP: ea799dd0 [ 802.230773] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206 [ 802.237551] CR0: 80050033 CR2: fe402000 CR3: 1eee9000 CR4: 003406d0 [ 802.243809] DR0: DR1: DR2: DR3: [ 802.250065] DR6: fffe0ff0 DR7: 0400 [ 802.253897] Call Trace: [ 802.256345] ldt_dup_context+0x6b/0x90 [ 802.260093] dup_mm+0x2b3/0x480 [ 802.263230] copy_process+0x13d6/0x1650 [ 802.267062] _do_fork+0x7b/0x3b0 [ 802.270284] ? set_next_entity+0xa9/0x250 [ 802.274290] __ia32_sys_clone+0x77/0xa0 [ 802.278119] do_syscall_32_irqs_on+0x3d/0x250 [ 802.282472] ? do_fast_syscall_32+0x2d/0xc0 [ 802.286656] ? trace_hardirqs_on+0x30/0xf0 [ 802.290746] ? trace_hardirqs_off_finish+0x32/0xa0 [ 802.295533] ? do_SYSENTER_32+0x15/0x20 [ 802.299371] do_fast_syscall_32+0x49/0xc0 [ 802.303374] do_SYSENTER_32+0x15/0x20 [ 802.307032] entry_SYSENTER_32+0x9f/0xf2 [ 802.310956] EIP: 0xb7fbb549 [ 802.313747] Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 [ 802.332483] EAX: ffda EBX: 01200011 ECX: EDX: [ 802.338742] ESI: EDI: b7dbdba8 EBP: b7dbd348 ESP: b7dbd2f0 [ 802.344998] DS: 007b ES: 007b FS: GS: 0033 SS: 007b EFLAGS: 0246 [ 802.351776] Modules linked in: algif_hash x86_pkg_temp_thermal [ 802.357608] CR2: fe402000 [ 802.360920] ---[ end trace ea48459ba50c2a87 ]--- [ 802.365542] EIP: memcpy+0x14/0x30 [ 802.368858] Code: e8 a1 72 c5 ff 0f 31 31 c3 59 58 eb 85 cc cc cc cc cc cc cc cc cc 3e 8d 74 26 00 55 89 e5 57 89 c7 56 89 d6 53 89 cb c1 e9 02 a5 89 d9 83 e1 03 74 02 f3 a4 5b 5e 5f 5d c3 8d b4 26 00 00 00 [ 802.387593] EAX: fe402000 EBX: 0001 ECX: 4000 EDX: fb3dd000 [ 802.393852] ESI: fb3dd000 EDI: fe402000 EBP: ea799ddc ESP: ea799dd0 [ 802.400107] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206 [ 802.406887] CR0: 
80050033 CR2: fe402000 CR3: 1eee9000 CR4: 003406d0 [ 802.413143] DR0: DR1: DR2: DR3: [ 802.419400] DR6: fffe0ff0 DR7: 0400 full test log, [1] https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.5-256-gad57c5b5e64d/testrun/3148295/suite/linux-log-parser/test/check-kernel-bug-1727425/log [2] https://lore.kernel.org/linux-mm/ca+g9fysingoh09h0paf1+utkhpnn490qcolb2drfhmt+cjh...@mail.gmail.com/ -- Linaro LKFT https://lkft.linaro.org
Re: [PATCH] arm64: dts: qcom: sc7180: Add 'sustainable_power' for CPU thermal zones
>> I'm not massively familiar with this area of the code, but I guess I
>> shouldn't let that stop me from having an opinion! :-P
>>
>> * I would agree that it seems highly unlikely that someone would put
>> one of these chips in a device that could only dissipate the heat from
>> the lowest OPP, so having some higher estimate definitely makes sense.
>>
>> * In terms of the numbers here, I believe that you're claiming that we
>> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.
>
> No, I'm claiming it's 768 mW + 1202 mW = ~2 W.
>
> SC7180 has 6 thermal zones for the 6 little cores and 4 zones for the
> 2 big cores. Each of these thermal zones uses either all little or all
> big cores as cooling devices, hence the sustainable power of the
> individual zones doesn't add up. 768 mW corresponds to 6x 128 mW (aka
> all little cores at 1.8 GHz), and 1202 mW to 2x 601 mW (both big cores
> at 1.9 GHz).
>
>> My memory of how much power we could dissipate in previous laptops I
>> worked on is a little fuzzy, but that doesn't seem insane for a
>> passively-cooled laptop. However, I think someone could conceivably
>> put this chip in a smaller form factor. In such a case, it seems like
>> we'd want these things to sum up to ~2000 (if it would ever make sense
>> for someone to put this chip in a phone) or ~4000 (if it would ever
>> make sense for someone to put this chip in a small tablet).
>
> See above, the sustainable power with this patch only adds up to
> ~2000. It is possible though that it would be lower in a smaller form
> factor device.
>
> I'd be ok with posting something lower for SC7180 (it would be a guess
> though) and use the specific numbers in the device specific DT.
>
>> It seems possible that, to achieve this, we might have to tweak the
>> "dynamic-power-coefficient". I don't know how much thought was put
>> into those numbers, but the fact that the little cores have a super
>> round 100 for their dynamic-power-coefficient makes me feel like they
>> might have been more schwags than anything. Rajendra maybe knows?
>
> Yeah, it's possible that that was just an approximation

No, these are based on actual power measurements.

>> * I'm curious about the fact that there are two numbers here: one for
>> littles and one for bigs. If I had to guess I'd say that since all
>> the cores are in one package so the contributions kinda need to be
>> thought of together, right? If we're sitting there thermally
>> throttled then we'd want to pick the best perf-per-watt for the
>> overall package. This is why your patch says we can sustain the
>> little cores at max and the big cores get whatever is left over,
>> right?
>
> It's derived from how Qualcomm specified the thermal zones and cooling
> devices. Any ("cpu") zone is either cooled by (all) big cores or by
> (all) little cores, but not a mix of them. In my tests I also saw that
> the big cores seemed to have little impact on the little ones. The
> little cores are at max because even running at max frequency the
> temperature in the 'little zones' wouldn't come close to the trip
> point.
>
>> * Should we be leaving some room in here for the GPU? ...or I guess
>> once we list it as a cooling device we'll have to decrease the amount
>> the CPUs can use?
>
> I don't know for sure, but judging from the CPU zones I wouldn't be
> surprised if the GPU was managed exclusively in the dedicated GPU
> thermal zones (I guess that's what 'gpuss0-thermal' and
> 'gpuss1-thermal' are). If that's not the case the values in the CPU
> zones can be adjusted when specific data is available.
>
>> So I guess the tl; dr is:
>>
>> a) We should check "dynamic-power-coefficient" and possibly adjust.
>
> ok, lets see if Rajendra can check if there is room for tweaking.

I suggest we don't :)

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
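For reference, the two readings of the numbers discussed in this message work out as follows. The per-core figures (128 mW per little core at 1.8 GHz, 601 mW per big core at 1.9 GHz) are the ones quoted in the thread; the symbol names are only labels for this worked example.

```latex
\begin{align*}
P_{\text{little zone}} &= 6 \times 128\,\text{mW} = 768\,\text{mW} \\
P_{\text{big zone}}    &= 2 \times 601\,\text{mW} = 1202\,\text{mW} \\
P_{\text{total}}       &= 768\,\text{mW} + 1202\,\text{mW} = 1970\,\text{mW} \approx 2\,\text{W} \\
P_{\text{misread}}     &= 6 \times 768\,\text{mW} + 2 \times 1202\,\text{mW}
                        = 4608\,\text{mW} + 2404\,\text{mW} = 7012\,\text{mW} \approx 7\,\text{W}
\end{align*}
```

The ~7 W figure comes from treating the per-zone value as a per-core value and multiplying by the core count a second time.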
v4.9.234, WARNING: CPU: 1 PID: 166 at kernel/locking/lockdep.c:3326
Hi,

I have a reproducible kernel crash/oops after updating to 4.9.234, seen
during the IMX6DL boot process.

Regards
Chris

[ OK ] Started Various fixups to make systemd work better on Debian.
[3.635568] [ cut here ]
[3.640597] WARNING: CPU: 1 PID: 166 at kernel/locking/lockdep.c:3326 __lock_acquire+0x58c/0x1cf0
[3.649480] DEBUG_LOCKS_WARN_ON(class_idx > MAX_LOCKDEP_KEYS)[ 3.655058] Modules linked in:
[3.658131] CPU: 1 PID: 166 Comm: systemd-cgroups Not tainted 4.9.234 #57
[3.664927] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[3.671458] Backtrace:
[3.673949] [] (dump_backtrace) from [] (show_stack+0x18/0x1c)
[3.681526] r6:6093 r5: r4:c0d1d434 r3:
[3.687202] [] (show_stack) from [] (dump_stack+0xf0/0x11c)
[3.694528] [] (dump_stack) from [] (__warn+0xec/0x104)
[3.701498] r8:0cfe r7:0009 r6:c0aebbc0 r5: r4:ec86fd38 r3:0001
[3.709249] [] (__warn) from [] (warn_slowpath_fmt+0x40/0x48)
[3.716739] r9:ee1c6560 r8:0002 r7:ee1c6000 r6:c0d1d508 r5:c14dc0ac r4:c0ae8f88
[3.724492] [] (warn_slowpath_fmt) from [] (__lock_acquire+0x58c/0x1cf0)
[3.732933] r3:c0aecb48 r2:c0ae8f88
[3.736511] r4:c3c3c50c
[3.739052] [] (__lock_acquire) from [] (lock_acquire+0x78/0x98)
[3.746801] r10:b6fa9000 r9:ee7a0480 r8:c0d51598 r7:0001 r6:0080 r5:6093
[3.754632] r4:
[3.757180] [] (lock_acquire) from [] (_raw_spin_lock_irq+0x5c/0x6c)
[3.765275] r7:b6fa9000 r6:ed9fe118 r5:c01fbc08 r4:ed9fe118
[3.770946] [] (_raw_spin_lock_irq) from [] (__vma_link_file+0x80/0x98)
[3.779299] r5:ec9009f8 r4:ed9fe10c
[3.782882] [] (__vma_link_file) from [] (__vma_adjust+0xc4/0x664)
[3.790803] r6:ec900840 r5:ec900c08 r4:ec900c44 r3:ec9009f8
[3.796471] [] (__vma_adjust) from [] (__split_vma+0x11c/0x1b8)
[3.804133] r10:b6fa9000 r9: r8:c0d51598 r7:b6fa9000 r6: r5:ec900c08
[3.811964] r4:ec9009f8
[3.814507] [] (__split_vma) from [] (split_vma+0x28/0x34)
[3.821736] r9:00100073 r8:ee7a0480 r7:b6fa8000 r6: r5: r4:ec900c08
[3.829487] [] (split_vma) from [] (mprotect_fixup+0x244/0x274)
[3.837151] [] (mprotect_fixup) from [] (do_mprotect_pkey+0x154/0x20c)
[3.845420] r10:007d r9:b6fa9000 r8:0005 r7:0001 r6: r5:0001
[3.853252] r4:b6fa9000
[3.855794] [] (do_mprotect_pkey) from [] (SyS_mprotect+0x14/0x18)
[3.863717] r9:ec86e000 r8:c0107c44 r7:007d r6:b6f8a250 r5:b6fd3d98 r4:b6fd2c50
[3.871469] [] (SyS_mprotect) from [] (ret_fast_syscall+0x0/0x1c)
[3.879306] ---[ end trace 890e4e38c95446ca ]---
Re: [PATCH] arm64: dts: qcom: sc7180: Add 'sustainable_power' for CPU thermal zones
> * In terms of the numbers here, I believe that you're claiming that we
> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power. My memory
> of how much power we could dissipate in previous laptops I worked on
> is a little fuzzy, but that doesn't seem insane for a passively-cooled
> laptop. However, I think someone could conceivably put this chip in a
> smaller form factor. In such a case, it seems like we'd want these
> things to sum up to ~2000 (if it would ever make sense for someone to
> put this chip in a phone) or ~4000 (if it would ever make sense for
> someone to put this chip in a small tablet). It seems possible that,
> to achieve this, we might have to tweak the
> "dynamic-power-coefficient".

DPC values are calculated (per SoC) by actually measuring max power at
various frequency/voltage combinations by running things like dhrystone.
How would the max power a SoC can generate depend on form factors?
How much it can dissipate surely does, but then I am not super familiar
with how thermal frameworks end up using DPC for calculating power
dissipated; I am guessing they don't.

> I don't know how much thought was put
> into those numbers, but the fact that the little cores have a super
> round 100 for their dynamic-power-coefficient makes me feel like they
> might have been more schwags than anything. Rajendra maybe knows?

AFAIK, the values are always scaled and normalized to 100 for silver and
then used to derive the relative DPC number for gold. If you look, the
DPC for silver cores even on sdm845 is 100.
Again, these are not estimations but based on actual power measurements.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
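As background for the DPC discussion, the device tree CPU binding describes "dynamic-power-coefficient" in units of uW/MHz/V^2, with dynamic power given by coefficient * V^2 * f. A minimal sketch of that relation follows; the function name is made up for illustration, and any voltage passed in is the caller's assumption, since the thread gives no voltage for the OPPs under discussion.

```c
#include <stdint.h>

/*
 * Dynamic power in microwatts from a coefficient in uW/MHz/V^2, a
 * frequency in MHz and a voltage in millivolts.
 * The function name is hypothetical; only the relation
 * P = coeff * f * V^2 is taken from the binding documentation.
 */
static uint64_t dyn_power_uw(uint32_t coeff, uint32_t freq_mhz,
			     uint32_t millivolts)
{
	uint64_t p = (uint64_t)coeff * freq_mhz;

	p *= (uint64_t)millivolts * millivolts;	/* scaled by mV^2 */
	return p / 1000000u;			/* convert mV^2 to V^2 */
}
```

For example, a coefficient of 100 at 1800 MHz and a purely illustrative 1.000 V gives 180000 uW (180 mW); the power scales with the square of the voltage actually used at that OPP.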
Re: [RFC PATCH] USB: misc: Add usb_hub_pwr driver
On 20-09-01 13:21:43, Matthias Kaehlcke wrote:
> The driver combo usb_hub_pwr/usb_hub_psupply allows to control
> the power supply of an onboard USB hub.
>
> The drivers address two issues:
> - a USB hub needs to be powered before it can be discovered
> - battery powered devices may want to switch the USB hub off
>   during suspend to extend battery life
>
> The regulator of the hub is controlled by the usb_hub_psupply
> platform driver. The regulator is switched on when the platform
> device is initialized, which enables discovery of the hub. The
> driver provides an external interface to enable/disable the
> power supply which is used by the usb_hub_pwr driver.
>
> The usb_hub_pwr extends the generic USB hub driver. The device is
> initialized when the hub is discovered by the USB subsystem. It
> uses the usb_hub_psupply interface to make its own request to
> enable the regulator (increasing the use count to 2).
>
> During system suspend usb_hub_pwr checks if any wakeup capable
> devices are connected to the hub. If not it 'disables' the hub
> regulator (decreasing the use count to 1, hence the regulator
> stays enabled for now). When the usb_hub_psupply device suspends
> it disables the hub regulator unconditionally (decreasing the use
> count to 0 or 1, depending on the actions of usb_hub_pwr). This
> is done to allow the usb_hub_pwr device to control the state of
> the regulator during system suspend.
>
> Upon resume usb_hub_psupply enables the regulator again, the
> usb_hub_pwr device does the same if it disabled the regulator
> during resume.

Hi Matthias,

I did something similar several years ago [1], but the concept (power
sequence) was not accepted by the power maintainer. Your patch introduces
a new way to fix this long-standing issue; I have an idea to fix it more
generally.

- Create a table (say usb_pm_table) of USB devices that need initial
  power-on and power management during suspend/resume, keyed by VID and
  PID. For an example of such a table, see usb/core/quirks.c.
- After a hub (both root hub and intermediate hub) device is created,
  search the DT nodes under this hub and see if the device is in
  usb_pm_table. If it is, create a platform device, say
  usb-power-supply, whose driver is like your usb_hub_psupply.c; the
  parent of this device is the controller device.
- After this usb-power-supply device is probed, do the initial power-on
  at probe, e.g. clock, regulator, reset-gpio.
- The system suspend operation of this usb-power-supply device should be
  called after the onboard device has suspended, since it is created
  before it. No runtime PM ops are needed for it.
- When the hub is removed, delete this platform device.

What's your opinion?

[1] https://lore.kernel.org/lkml/1498027328-25078-1-git-send-email-peter.c...@nxp.com/

Peter

>
> Co-developed-by: Ravi Chandra Sadineni
> Signed-off-by: Ravi Chandra Sadineni
> Signed-off-by: Matthias Kaehlcke
> ---
> The driver currently only supports a single power supply. This should
> work for most/many configurations/hubs, support for multiple power
> supplies can be added later if needed.
>
> No DT bindings are included since this is just a RFC.
Here is a DT > example: > > usb_hub_psupply: usb-hub-psupply { > compatible = "linux,usb_hub_psupply"; > vdd-supply = <_hub>; > }; > > _1_dwc3 { > /* 2.0 hub on port 1 */ > hub@1 { > compatible = "usbbda,5411"; > reg = <1>; > psupply = <_hub_psupply>; > }; > > /* 3.0 hub on port 2 */ > hub@2 { > compatible = "usbbda,411"; > reg = <2>; > psupply = <_hub_psupply>; > }; > }; > > drivers/usb/misc/Kconfig | 14 +++ > drivers/usb/misc/Makefile | 1 + > drivers/usb/misc/usb_hub_psupply.c | 112 ++ > drivers/usb/misc/usb_hub_psupply.h | 9 ++ > drivers/usb/misc/usb_hub_pwr.c | 177 + > 5 files changed, 313 insertions(+) > create mode 100644 drivers/usb/misc/usb_hub_psupply.c > create mode 100644 drivers/usb/misc/usb_hub_psupply.h > create mode 100644 drivers/usb/misc/usb_hub_pwr.c > > diff --git a/drivers/usb/misc/Kconfig b/drivers/usb/misc/Kconfig > index 6818ea689cd9..79ed50e6a7bf 100644 > --- a/drivers/usb/misc/Kconfig > +++ b/drivers/usb/misc/Kconfig > @@ -275,3 +275,17 @@ config USB_CHAOSKEY > > To compile this driver as a module, choose M here: the > module will be called chaoskey. > + > +config USB_HUB_PWR > + tristate "Control power supply for onboard USB hubs" > + depends on PM > + help > + Say Y here if you want to control the power supply of an > + onboard USB hub. The driver switches the power supply of the > + hub on, to make sure the hub can be discovered. During system > + suspend the power supply is switched off, unless a wakeup > + capable device is connected to the hub. This may reduce power > + consumption on battery powered devices. > + > + To compile this driver as a module, choose M
Re: [PATCH v1 08/10] powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
On Mon, 2020-08-31 at 14:34 +1000, Alexey Kardashevskiy wrote: > > On 29/08/2020 01:25, Leonardo Bras wrote: > > On Mon, 2020-08-24 at 15:07 +1000, Alexey Kardashevskiy wrote: > > > On 18/08/2020 09:40, Leonardo Bras wrote: > > > > Code used to create a ddw property that was previously scattered in > > > > enable_ddw() is now gathered in ddw_property_create(), which deals with > > > > allocation and filling the property, letting it ready for > > > > of_property_add(), which now occurs in sequence. > > > > > > > > This created an opportunity to reorganize the second part of > > > > enable_ddw(): > > > > > > > > Without this patch enable_ddw() does, in order: > > > > kzalloc() property & members, create_ddw(), fill ddwprop inside > > > > property, > > > > ddw_list_add(), do tce_setrange_multi_pSeriesLP_walk in all memory, > > > > of_add_property(). > > > > > > > > With this patch enable_ddw() does, in order: > > > > create_ddw(), ddw_property_create(), of_add_property(), ddw_list_add(), > > > > do tce_setrange_multi_pSeriesLP_walk in all memory. > > > > > > > > This change requires of_remove_property() in case anything fails after > > > > of_add_property(), but we get to do tce_setrange_multi_pSeriesLP_walk > > > > in all memory, which looks the most expensive operation, only if > > > > everything else succeeds. 
> > > > > > > > Signed-off-by: Leonardo Bras > > > > --- > > > > arch/powerpc/platforms/pseries/iommu.c | 97 +++--- > > > > 1 file changed, 57 insertions(+), 40 deletions(-) > > > > > > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > > > > b/arch/powerpc/platforms/pseries/iommu.c > > > > index 4031127c9537..3a1ef02ad9d5 100644 > > > > --- a/arch/powerpc/platforms/pseries/iommu.c > > > > +++ b/arch/powerpc/platforms/pseries/iommu.c > > > > @@ -1123,6 +1123,31 @@ static void reset_dma_window(struct pci_dev > > > > *dev, struct device_node *par_dn) > > > > ret); > > > > } > > > > > > > > +static int ddw_property_create(struct property **ddw_win, const char > > > > *propname, > > > > > > @propname is always the same, do you really want to pass it every time? > > > > I think it reads better, like "create a ddw property with this name". > > This reads as "there are at least two ddw properties". > > > Also, it makes possible to create ddw properties with other names, in > > case we decide to create properties with different names depending on > > the window created. > > It is one window at any given moment, why call it different names... I > get the part that it is not always "direct" anymore but still... > It seems the case as one of the options you suggested on patch [09/10] >> I suspect it breaks kexec (from older kernel to this one) so you >> either need to check for both DT names or just keep the old one. > > > Also, it's probably optimized / inlined at this point. > > Is it ok doing it like this? 
> > > > > > + u32 liobn, u64 dma_addr, u32 page_shift, > > > > u32 window_shift) > > > > +{ > > > > + struct dynamic_dma_window_prop *ddwprop; > > > > + struct property *win64; > > > > + > > > > + *ddw_win = win64 = kzalloc(sizeof(*win64), GFP_KERNEL); > > > > + if (!win64) > > > > + return -ENOMEM; > > > > + > > > > + win64->name = kstrdup(propname, GFP_KERNEL); > > > > > > Not clear why "win64->name = DIRECT64_PROPNAME" would not work here, the > > > generic OF code does not try kfree() it but it is probably out of scope > > > here. > > > > Yeah, I had that question too. > > Previous code was like that, and I as trying not to mess too much on > > how it's done. > > > > > > + ddwprop = kzalloc(sizeof(*ddwprop), GFP_KERNEL); > > > > + win64->value = ddwprop; > > > > + win64->length = sizeof(*ddwprop); > > > > + if (!win64->name || !win64->value) > > > > + return -ENOMEM; > > > > > > Up to 2 memory leaks here. I see the cleanup at "out_free_prop:" but > > > still looks fragile. Instead you could simply return win64 as the only > > > error possible here is -ENOMEM and returning NULL is equally good. > > > > I agree. It's better if this function have it's own cleaning routine. > > It will be fixed for next version. > > > > > > > > > + > > > > + ddwprop->liobn = cpu_to_be32(liobn); > > > > + ddwprop->dma_base = cpu_to_be64(dma_addr); > > > > + ddwprop->tce_shift = cpu_to_be32(page_shift); > > > > + ddwprop->window_shift = cpu_to_be32(window_shift); > > > > + > > > > + return 0; > > > > +} > > > > + > > > > /* > > > > * If the PE supports dynamic dma windows, and there is space for a > > > > table > > > > * that can map all pages in a linear offset, then setup such a table, > > > > @@ -1140,12 +1165,11 @@ static bool enable_ddw(struct pci_dev *dev, > > > > struct device_node *pdn) > > > > struct ddw_query_response query; > > > > struct ddw_create_response
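The review comment about the "up to 2 memory leaks" suggests letting the creation helper clean up after itself and return a pointer, with NULL standing in for -ENOMEM. A minimal userspace sketch of that shape is below. It is not the kernel code: the demo_* types and functions are hypothetical stand-ins, calloc/malloc/free replace kzalloc/kstrdup/kfree, and the endianness conversions from the patch are left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the structures named in the thread. */
struct demo_ddw_prop {
	uint32_t liobn;
	uint64_t dma_base;
	uint32_t tce_shift;
	uint32_t window_shift;
};

struct demo_property {
	char *name;
	void *value;
	size_t length;
};

/* Frees a property and everything it owns; NULL is accepted. */
static void demo_property_free(struct demo_property *p)
{
	if (!p)
		return;
	free(p->name);
	free(p->value);
	free(p);
}

/* Returns a fully built property, or NULL if any allocation failed. */
static struct demo_property *demo_ddw_property_create(const char *propname,
		uint32_t liobn, uint64_t dma_addr,
		uint32_t page_shift, uint32_t window_shift)
{
	struct demo_property *win = calloc(1, sizeof(*win));
	struct demo_ddw_prop *prop;
	size_t len;

	if (!win)
		return NULL;

	len = strlen(propname) + 1;
	win->name = malloc(len);
	prop = calloc(1, sizeof(*prop));
	win->value = prop;
	if (!win->name || !prop) {
		/* Single cleanup path: nothing leaks on partial failure. */
		demo_property_free(win);
		return NULL;
	}

	memcpy(win->name, propname, len);
	win->length = sizeof(*prop);
	prop->liobn = liobn;
	prop->dma_base = dma_addr;
	prop->tce_shift = page_shift;
	prop->window_shift = window_shift;
	return win;
}
```

With this shape the caller only has to check for NULL, and the partially allocated state never escapes the helper.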
Re: [PATCH] drm/i915/lspcon: Limits to 8 bpc for RGB/YCbCr444
> On Sep 1, 2020, at 03:48, Ville Syrjälä wrote: > > On Thu, Aug 27, 2020 at 01:04:54PM +0800, Kai Heng Feng wrote: >> Hi Ville, >> >>> On Aug 27, 2020, at 12:24 AM, Ville Syrjälä >>> wrote: >>> >>> On Wed, Aug 26, 2020 at 01:21:15PM +0800, Kai-Heng Feng wrote: LSPCON only supports 8 bpc for RGB/YCbCr444. Set the correct bpp otherwise it renders blank screen. >>> >>> Hmm. Does >>> git://github.com/vsyrjala/linux.git dp_downstream_ports_5 >>> work? >>> >>> Actually better make that dp_downstream_ports_5^^^ aka. >>> 54d846ce62a2 ("drm/i915: Do YCbCr 444->420 conversion via DP protocol >>> converters") to avoid the experiments and hacks I have sitting on top. >> >> Can you please rebase it to mainline master or drm-tip? > > git://github.com/vsyrjala/linux.git dp_downstream_ports_6 Yes this solves the issue. Thanks a lot! Any timeline this will get merged? Kai-Heng > > I threw out the hacks/experimental stuff. > >> >> I am getting errors on the branch: >> >> DESCEND objtool >> CALLscripts/atomic/check-atomics.sh >> CALLscripts/checksyscalls.sh >> CHK include/generated/compile.h >> Building modules, stage 2. 
>> MODPOST 166 modules >> LD arch/x86/boot/compressed/vmlinux >> ld: arch/x86/boot/compressed/pgtable_64.o:(.bss+0x0): multiple definition of >> `__force_order'; arch/x86/boot/compressed/kaslr_64.o:(.bss+0x0): first >> defined here >> ld: arch/x86/boot/compressed/head_64.o: warning: relocation in read-only >> section `.head.text' >> ld: warning: creating DT_TEXTREL in a PIE >> make[2]: *** [arch/x86/boot/compressed/Makefile:119: >> arch/x86/boot/compressed/vmlinux] Error 1 >> make[1]: *** [arch/x86/boot/Makefile:113: arch/x86/boot/compressed/vmlinux] >> Error 2 >> make: *** [arch/x86/Makefile:284: bzImage] Error 2 >> make: *** Waiting for unfinished jobs >> >> Kai-Heng >> >>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2195 Signed-off-by: Kai-Heng Feng --- drivers/gpu/drm/i915/display/intel_lspcon.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_lspcon.c b/drivers/gpu/drm/i915/display/intel_lspcon.c index b781bf469644..c7a44fcaade8 100644 --- a/drivers/gpu/drm/i915/display/intel_lspcon.c +++ b/drivers/gpu/drm/i915/display/intel_lspcon.c @@ -196,7 +196,8 @@ void lspcon_ycbcr420_config(struct drm_connector *connector, crtc_state->port_clock /= 2; crtc_state->output_format = INTEL_OUTPUT_FORMAT_YCBCR444; crtc_state->lspcon_downsampling = true; - } + } else + crtc_state->pipe_bpp = 24; } static bool lspcon_probe(struct intel_lspcon *lspcon) -- 2.17.1 >>> >>> -- >>> Ville Syrjälä >>> Intel > > -- > Ville Syrjälä > Intel
Re: [PATCH 3/3] ARM: tegra: Pass multiple versions in opp-supported-hw property
On 01-09-20, 16:21, Dmitry Osipenko wrote:
> IIUC, there is no fixed formula for Tegra, at least I don't see it. For
> example, if you'll take a look at the 1300MHz OPP of Tegra30, then you
> could see that this freq has a lot of voltages each depending on
> specific combination of SPEEDO+PROCESS versions.

Right, it may not be worth it to clean this up :)

-- 
viresh
Re: [PATCH] block: Fix potential NULL pointer dereference in __bio_crypt_clone()
On Wed, Sep 02, 2020 at 01:56:53AM +, linmiaohe wrote:
> Eric Biggers wrote:
> >On Tue, Sep 01, 2020 at 07:59:21AM -0400, Miaohe Lin wrote:
> >> mempool_alloc() may return NULL if __GFP_DIRECT_RECLAIM is not set in
> >> gfp_mask under memory pressure. So we should check the return value of
> >> mempool_alloc() against NULL before dereference.
> >>
> >> Fixes: a892c8d52c02 ("block: Inline encryption support for blk-mq")
> >> Signed-off-by: Miaohe Lin
> >
> >It's intended that __GFP_DIRECT_RECLAIM always be set here.
> >Do you have an example where it isn't set here?
>
> map_request() only pass GFP_ATOMIC to gfp_mask, though bio crypt is not used
> yet.
>
> >Also, if this can indeed happen, then we need to make __bio_crypt_clone()
> >(and bio_crypt_clone()) return a bool (or an error code) to indicate whether
> >it succeeded or failed. We can't just ignore the allocation failure.
> >
> >- Eric
>
> IMO, just the allocation failure is ok or we would break KABI.
> Many thanks.
>

Ignoring the allocation failure isn't okay, since it would cause encrypted I/O
to fall back to unencrypted I/O, which would cause data corruption.

Also, upstream doesn't have a stable KABI.

I sent out a patch with what I have in mind; can you take a look?
https://lkml.kernel.org/r/20200902051511.79821-1-ebigg...@kernel.org

- Eric
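The point about not ignoring the allocation failure can be sketched in plain user-space C. This is only an illustration under assumptions: `crypt_ctx`, `pool_alloc()` and `crypt_ctx_clone()` are hypothetical stand-ins and not the block layer's API, and `pool_alloc()` fails on demand to model a non-blocking allocation under memory pressure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an encryption context attached to an I/O. */
struct crypt_ctx {
	unsigned char key_id;
	unsigned long long dun;	/* data unit number */
};

/*
 * Hypothetical allocator: fails when the caller may not block and the
 * pool is exhausted, loosely modelling a non-blocking pool allocation.
 */
static struct crypt_ctx *pool_alloc(bool may_block, bool pool_exhausted)
{
	if (!may_block && pool_exhausted)
		return NULL;
	return malloc(sizeof(struct crypt_ctx));
}

/*
 * Clone src into *dst. Returns 0 on success or -ENOMEM on allocation
 * failure, so the caller can fail the I/O instead of silently sending
 * it without an encryption context.
 */
static int crypt_ctx_clone(struct crypt_ctx **dst, const struct crypt_ctx *src,
			   bool may_block, bool pool_exhausted)
{
	struct crypt_ctx *copy;

	*dst = NULL;
	if (!src)
		return 0;	/* nothing to clone is not an error */

	copy = pool_alloc(may_block, pool_exhausted);
	if (!copy)
		return -ENOMEM;

	memcpy(copy, src, sizeof(*copy));
	*dst = copy;
	return 0;
}
```

A caller in this sketch checks the return value and propagates the error; whether the real fix returns a bool or an error code is left open in the thread.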
Re: [PATCH 0/2] link vdso with linker
On Tue, Sep 01, 2020 at 03:25:21PM -0700, Nick Desaulniers wrote: > Kees Cook is working on series that adds --orphan-section=warn to arm, > arm64, and x86. I noticed that ppc vdso were still using cc-ldoption > for these which I removed. It seems this results in that flag being > silently dropped. > > I'm very confident with the first patch, but the second needs closer > review around the error mentioned below the fold related to the .got > section. > > Nick Desaulniers (2): > powerpc/vdso64: link vdso64 with linker > powerpc/vdso32: link vdso64 with linker > > arch/powerpc/include/asm/vdso.h | 17 ++--- > arch/powerpc/kernel/vdso32/Makefile | 7 +-- > arch/powerpc/kernel/vdso32/vdso32.lds.S | 3 ++- > arch/powerpc/kernel/vdso64/Makefile | 8 ++-- > arch/powerpc/kernel/vdso64/vdso64.lds.S | 1 - > 5 files changed, 15 insertions(+), 21 deletions(-) > > -- > 2.28.0.402.g5ffc5be6b7-goog > ppc44x_defconfig and powernv_defconfig start failing with this series when LD=ld.lld is used. $ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- LLVM=1 O=out/ppc32 distclean ppc44x_defconfig uImage ld.lld: error: relocation R_PPC_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso32/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso32/gettimeofday.o:(__kernel_gettimeofday) ld.lld: error: relocation R_PPC_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso32/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso32/gettimeofday.o:(__kernel_clock_gettime) ld.lld: error: relocation R_PPC_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso32/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso32/gettimeofday.o:(__kernel_clock_getres) ld.lld: error: relocation R_PPC_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with 
-fPIC >>> defined in arch/powerpc/kernel/vdso32/datapage.o >>> referenced by arch/powerpc/kernel/vdso32/gettimeofday.o:(__kernel_time) ... $ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- LLVM=1 O=out/ppc64le distclean powernv_defconfig zImage.epapr ld.lld: error: relocation R_PPC64_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso64/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso64/gettimeofday.o:(__kernel_gettimeofday) ld.lld: error: relocation R_PPC64_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso64/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso64/gettimeofday.o:(__kernel_clock_gettime) ld.lld: error: relocation R_PPC64_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso64/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso64/gettimeofday.o:(__kernel_clock_getres) ld.lld: error: relocation R_PPC64_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso64/datapage.o >>> referenced by arch/powerpc/kernel/vdso64/gettimeofday.o:(__kernel_time) ld.lld: error: relocation R_PPC64_REL16_LO cannot be used against symbol __kernel_datapage_offset; recompile with -fPIC >>> defined in arch/powerpc/kernel/vdso64/datapage.o >>> referenced by >>> arch/powerpc/kernel/vdso64/cacheflush.o:(__kernel_sync_dicache) ... We need Fangrui's patch to fix ppc44x_defconfig: https://lore.kernel.org/lkml/20200205005054.k72fuikf6rwrg...@google.com/ That exact same fix is needed in arch/powerpc/kernel/vdso64/datapage.S to fix powernv_defconfig. Cheers, Nathan
Re: [RFC v2 2/2] KVM: VMX: Enable bus lock VM exit
On 9/1/2020 4:43 PM, Vitaly Kuznetsov wrote: Chenyi Qiang writes: Virtual Machine can exploit bus locks to degrade the performance of system. Bus lock can be caused by split locked access to writeback(WB) memory or by using locks on uncacheable(UC) memory. The bus lock is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores (which must wait for the bus lock to be released before their memory operations can complete). To address the threat, bus lock VM exit is introduced to notify the VMM when a bus lock was acquired, allowing it to enforce throttling or other policy based mitigations. A VMM can enable VM exit due to bus locks by setting a new "Bus Lock Detection" VM-execution control(bit 30 of Secondary Processor-based VM execution controls). If delivery of this VM exit was preempted by a higher priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of exit reason in vmcs field is set to 1. In current implementation, the KVM exposes this capability through KVM_CAP_X86_BLD. The user can set it to enable the bus lock VM exit (disabled by default). If bus locks in guest are detected by KVM, exit to user space even when current exit reason is handled by KVM internally. Set a new field KVM_RUN_BUS_LOCK in vcpu->run->flags to inform the user space that there is a bus lock in guest and it is preempted by a higher priority VM exit. Every bus lock acquired in non-root mode will be recorded in vcpu->stat.bus_locks and exposed through debugfs when the bus lock VM exit is enabled. Document for Bus Lock VM exit is now available at the latest "Intel Architecture Instruction Set Extensions Programming Reference". 
Document Link: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Co-developed-by: Xiaoyao Li Signed-off-by: Xiaoyao Li Signed-off-by: Chenyi Qiang --- arch/x86/include/asm/kvm_host.h| 9 arch/x86/include/asm/vmx.h | 1 + arch/x86/include/asm/vmxfeatures.h | 1 + arch/x86/include/uapi/asm/kvm.h| 1 + arch/x86/include/uapi/asm/vmx.h| 4 +++- arch/x86/kvm/vmx/capabilities.h| 6 + arch/x86/kvm/vmx/vmx.c | 33 ++- arch/x86/kvm/vmx/vmx.h | 2 +- arch/x86/kvm/x86.c | 36 +- arch/x86/kvm/x86.h | 5 + include/uapi/linux/kvm.h | 2 ++ 11 files changed, 96 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index be5363b21540..bfabe2f15b30 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -829,6 +829,9 @@ struct kvm_vcpu_arch { /* AMD MSRC001_0015 Hardware Configuration */ u64 msr_hwcr; + + /* Set when bus lock VM exit is preempted by a higher priority VM exit */ + bool bus_lock_detected; }; struct kvm_lpage_info { @@ -1002,6 +1005,9 @@ struct kvm_arch { bool guest_can_read_msr_platform_info; bool exception_payload_enabled; + /* Set when bus lock vm exit is enabled by user */ + bool bus_lock_exit; + struct kvm_pmu_event_filter *pmu_event_filter; struct task_struct *nx_lpage_recovery_thread; }; @@ -1051,6 +1057,7 @@ struct kvm_vcpu_stat { u64 req_event; u64 halt_poll_success_ns; u64 halt_poll_fail_ns; + u64 bus_locks; }; struct x86_instruction_info; @@ -1388,6 +1395,8 @@ extern u8 kvm_tsc_scaling_ratio_frac_bits; extern u64 kvm_max_tsc_scaling_ratio; /* 1ull << kvm_tsc_scaling_ratio_frac_bits */ extern u64 kvm_default_tsc_scaling_ratio; +/* bus lock detection supported */ +extern bool kvm_has_bus_lock_exit; extern u64 kvm_mce_cap_supported; diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index cd7de4b401fe..93a880bc31a7 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h 
@@ -73,6 +73,7 @@ #define SECONDARY_EXEC_PT_USE_GPA VMCS_CONTROL_BIT(PT_USE_GPA) #define SECONDARY_EXEC_TSC_SCALING VMCS_CONTROL_BIT(TSC_SCALING) #define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE VMCS_CONTROL_BIT(USR_WAIT_PAUSE) +#define SECONDARY_EXEC_BUS_LOCK_DETECTION VMCS_CONTROL_BIT(BUS_LOCK_DETECTION) #define PIN_BASED_EXT_INTR_MASK VMCS_CONTROL_BIT(INTR_EXITING) #define PIN_BASED_NMI_EXITING VMCS_CONTROL_BIT(NMI_EXITING) diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h index 9915990fd8cf..e80523346274 100644 --- a/arch/x86/include/asm/vmxfeatures.h +++ b/arch/x86/include/asm/vmxfeatures.h @@ -83,5 +83,6 @@ #define VMX_FEATURE_TSC_SCALING ( 2*32+ 25) /* Scale hardware TSC when read in guest */ #define VMX_FEATURE_USR_WAIT_PAUSE
Re: [PATCH 5.4 000/214] 5.4.62-rc1 review
On Wed, 2 Sep 2020 at 00:39, Guenter Roeck wrote: > > On 9/1/20 8:08 AM, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 5.4.62 release. > > There are 214 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Thu, 03 Sep 2020 15:09:01 +. > > Anything received after that time might be too late. > > > > Building x86_64:tools/perf ... failed > -- > Error log: > Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from > latest version at 'include/uapi/linux/kvm.h' > Warning: Kernel ABI header at 'tools/include/uapi/linux/sched.h' differs from > latest version at 'include/uapi/linux/sched.h' > Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' > differs from latest version at 'arch/x86/include/asm/cpufeatures.h' > Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/unistd.h' > differs from latest version at 'arch/x86/include/uapi/asm/unistd.h' > Makefile.config:846: No libcap found, disables capability support, please > install libcap-devel/libcap-dev > Makefile.config:958: No openjdk development package found, please install JDK > package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel > PERF_VERSION = 5.4.61.gf5583dd12e6f > In file included from btf_dump.c:16:0: > btf_dump.c: In function ‘btf_align_of’: > tools/include/linux/kernel.h:53:17: error: comparison of distinct pointer > types lacks a cast [-Werror] > (void) (&_min1 == &_min2); \ > ^ > btf_dump.c:770:10: note: in expansion of macro ‘min’ >return min(sizeof(void *), t->size); > ^~~ > cc1: all warnings being treated as errors > make[7]: *** [/tmp/buildbot-builddir/tools/perf/staticobjs/btf_dump.o] Error 1 This perf build break noticed and reported on mailing list [1] > > Bisect log below. Reverting the following two patches fixes the problem. 
> > 497ef945f327 libbpf: Fix build on ppc64le architecture > 401834f55ce7 libbpf: Handle GCC built-in types for Arm NEON > > Guenter > > --- > $ git bisect log > # bad: [f5583dd12e6fc8a3c11ae732f38bce8334e150a2] Linux 5.4.62-rc1 > # good: [6576d69aac94cd8409636dfa86e0df39facdf0d2] Linux 5.4.61 > git bisect start 'HEAD' 'v5.4.61' > # good: [6c747bd0794c982b500bda7334ef55d9dabb6cc6] nvme-fc: Fix wrong return > value in __nvme_fc_init_request() > git bisect good 6c747bd0794c982b500bda7334ef55d9dabb6cc6 > # bad: [81b5698e6d9ecdc9569df8f4b93be70d587f5ddf] serial: samsung: Removes > the IRQ not found warning > git bisect bad 81b5698e6d9ecdc9569df8f4b93be70d587f5ddf > # bad: [973679736caa8e1b39b68866535bdc7899a46f25] ASoC: wm8994: Avoid > attempts to read unreadable registers > git bisect bad 973679736caa8e1b39b68866535bdc7899a46f25 > # good: [1789df2a787c589dbe83bc3ed52af2abbc739d1b] ext4: correctly restore > system zone info when remount fails > git bisect good 1789df2a787c589dbe83bc3ed52af2abbc739d1b > # good: [ba1fb0301a60cbded377e0f312c82847415a1820] drm/amd/powerplay: correct > UVD/VCE PG state on custom pptable uploading > git bisect good ba1fb0301a60cbded377e0f312c82847415a1820 > # bad: [1ef070d29e73a50e98a93d9a68f69cfef4247170] netfilter: avoid ipv6 -> > nf_defrag_ipv6 module dependency > git bisect bad 1ef070d29e73a50e98a93d9a68f69cfef4247170 > # bad: [401834f55ce7f86bf2c0f8fdd8fbf9e1baf19f1c] libbpf: Handle GCC built-in > types for Arm NEON > git bisect bad 401834f55ce7f86bf2c0f8fdd8fbf9e1baf19f1c > # good: [ccb6e88cd42a9cb65bde705f7f8e7c9822dcb711] drm/amd/display: Switch to > immediate mode for updating infopackets > git bisect good ccb6e88cd42a9cb65bde705f7f8e7c9822dcb711 > # first bad commit: [401834f55ce7f86bf2c0f8fdd8fbf9e1baf19f1c] libbpf: Handle > GCC built-in types for Arm NEON [1] https://lore.kernel.org/stable/ca+g9fyvsnkxvs7hdcb3lc9w+rp8hba3f1fg3951s+xhfioj...@mail.gmail.com/ - Naresh
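The "comparison of distinct pointer types" error in the log above comes from the type check built into the tools' min() macro. The following is a minimal sketch of that mechanism, assuming a 32-bit size field; `align_of_sketch()` is a hypothetical reduction of the failing expression, and the cast shown is one possible way to make the operand types match, not necessarily the fix that was applied to the stable tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Type-checking min() in the style of tools/include/linux/kernel.h:
 * comparing the addresses of the two temporaries makes the compiler
 * complain when x and y have different types.
 * Uses GNU C extensions (__typeof__, statement expressions).
 */
#define min(x, y) ({				\
	__typeof__(x) _min1 = (x);		\
	__typeof__(y) _min2 = (y);		\
	(void) (&_min1 == &_min2);		\
	_min1 < _min2 ? _min1 : _min2; })

/* Hypothetical reduced form of the failing expression: size is 32-bit. */
static size_t align_of_sketch(uint32_t size)
{
	/*
	 * min(sizeof(void *), size) would compare a size_t pointer with a
	 * uint32_t pointer and trip -Werror; the cast gives both operands
	 * the same type.
	 */
	return min(sizeof(void *), (size_t)size);
}
```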
Re: checkpatch? (was: Re: [PATCH v3] coccinelle: misc: add uninitialized_var.cocci script)
On 9/1/20 5:37 PM, Joe Perches wrote:
> On Tue, 2020-09-01 at 12:48 +0300, Denis Efremov wrote:
>> uninitialized_var() macro was removed from the sources [1] and
>> other warning-silencing tricks were deprecated [2]. The purpose of this
>> cocci script is to prevent new occurrences of uninitialized_var()
>> open-coded variants.
>
>> +(
>> +* T var =@p var;
>> +|
>> +* T var =@p *(&(var));
>> +|
>> +* var =@p var
>> +|
>> +* var =@p *(&(var))
>> +)
>
> Adding a checkpatch test might be a good thing too.
>
> ---
>  scripts/checkpatch.pl | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 149518d2a6a7..300b2659aab3 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3901,6 +3901,17 @@ sub process {
>  #ignore lines not being added
>  		next if ($line =~ /^[^\+]/);
>  
> +# check for self assignments used to avoid compiler warnings
> +# e.g.:	int foo = foo, *bar = NULL;
> +#	struct foo bar = *(&(bar));
> +		if ($line =~ /^\+\s*(?:$Declare)?([A-Za-z_][A-Za-z\d_]*)\s*=/) {
> +			my $var = $1;
> +			if ($line =~ /^\+\s*(?:$Declare)?$var\s*=\s*(?:$var|\*\s*\(?\s*&\s*\(?\s*$var\s*\)?\s*\)?)\s*[;,]/) {
> +				WARN("SELF_ASSIGNMENT",
> +				     "Do not use self-assignments to avoid compiler warnings\n" . $herecurr);
> +			}
> +		}
> +
>  # check for dereferences that span multiple lines
>  		if ($prevline =~ /^\+.*$Lval\s*(?:\.|->)\s*$/ &&
>  		    $line =~ /^\+\s*(?!\#\s*(?!define\s+|if))\s*$Lval/) {

Looks good. I have also come across this kind of assignment after
declarations:
https://lkml.org/lkml/2020/8/31/85
I'm not sure whether they are used to suppress compiler warnings, though.

Denis
linux-next: manual merge of the scsi-mkp tree with Linus' tree
Hi all,

Today's linux-next merge of the scsi-mkp tree got a conflict in:

  drivers/scsi/aacraid/aachba.c

between commit:

  df561f6688fe ("treewide: Use fallthrough pseudo-keyword")

from Linus' tree and commit:

  cfd3d2225aa5 ("scsi: aacraid: Remove erroneous fallthrough annotation")

from the scsi-mkp tree.

I fixed it up (I removed the line removed by the latter - it was rewritten
by the former to "fallthrough;") and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell
linux-next: manual merge of the scsi-mkp tree with Linus' tree
Hi all,

Today's linux-next merge of the scsi-mkp tree got a conflict in:

  drivers/scsi/ufs/ufshcd.h

between commit:

  8da76f71fef7 ("scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL")

from Linus' tree and commit:

  5df6f2def50c ("scsi: ufs: Introduce skipping manual flush for Write Booster")

from the scsi-mkp tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/scsi/ufs/ufshcd.h
index b5b2761456fb,a88dfac7c9e9..
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@@ -531,11 -531,10 +531,16 @@@ enum ufshcd_quirks
  	 */
  	UFSHCD_QUIRK_BROKEN_OCS_FATAL_ERROR		= 1 << 10,
  
 +	/*
 +	 * This quirk needs to be enabled if the host controller has
 +	 * auto-hibernate capability but it doesn't work.
 +	 */
 +	UFSHCD_QUIRK_BROKEN_AUTO_HIBERN8		= 1 << 11,
++
+ 	/*
+ 	 * This quirk needs to disable manual flush for write booster
+ 	 */
 -	UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL		= 1 << 11,
++	UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL		= 1 << 12,
  };
  
  enum ufshcd_caps {
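Why the resolution moves the write booster quirk from bit 11 to bit 12 can be shown with a small sketch. The enum below only mirrors the bit values from the merge resolution above; the shortened names and the two helpers are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit assignments as they appear in the merge resolution above. */
enum quirk_bits {
	QUIRK_BROKEN_OCS_FATAL_ERROR	= 1 << 10,
	QUIRK_BROKEN_AUTO_HIBERN8	= 1 << 11,
	QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL	= 1 << 12,
};

/* True when two flag values share at least one bit. */
static bool quirks_collide(unsigned int a, unsigned int b)
{
	return (a & b) != 0;
}

/* True when the quirk mask has the given flag set. */
static bool has_quirk(unsigned int quirks, unsigned int flag)
{
	return (quirks & flag) != 0;
}
```

Had both quirks kept the value `1 << 11`, setting one of them would have made a test for the other succeed as well, which is what the renumbering avoids.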
Re: [PATCH 0/7] powerpc/watchpoint: 2nd DAWR kvm enablement + selftests
Hi Paul,

On 9/2/20 8:02 AM, Paul Mackerras wrote:
> On Thu, Jul 23, 2020 at 03:50:51PM +0530, Ravi Bangoria wrote:
>> Patch #1, #2 and #3 enables p10 2nd DAWR feature for Book3S kvm guest. DAWR
>> is a hypervisor resource and thus H_SET_MODE hcall is used to set/unset it.
>> A new case H_SET_MODE_RESOURCE_SET_DAWR1 is introduced in H_SET_MODE hcall
>> for setting/unsetting 2nd DAWR. Also, new capability KVM_CAP_PPC_DAWR1 has
>> been added to query 2nd DAWR support via kvm ioctl.
>>
>> This feature also needs to be enabled in Qemu to really use it. I'll reply
>> link to qemu patches once I post them in qemu-devel mailing list.
>>
>> Patch #4, #5, #6 and #7 adds selftests to test 2nd DAWR.
>
> If/when you resubmit these patches, please split the KVM patches into
> a separate series, since the KVM patches would go via my tree whereas
> I expect the selftests/powerpc patches would go through Michael
> Ellerman's tree.

Sure. Will split it.

Thanks,
Ravi
Re: [PATCH 2/7] powerpc/watchpoint/kvm: Add infrastructure to support 2nd DAWR
Hi Paul,

> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> index 33793444144c..03f401d7be41 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -538,6 +538,8 @@ struct hv_guest_state {
>  	s64 tb_offset;
>  	u64 dawr0;
>  	u64 dawrx0;
> +	u64 dawr1;
> +	u64 dawrx1;
>  	u64 ciabr;
>  	u64 hdec_expiry;
>  	u64 purr;

After this struct, there is a macro HV_GUEST_STATE_VERSION, I guess that
also needs to be incremented because I'm adding new members in the struct?

Thanks,
Ravi
Re: [PATCH 2/7] powerpc/watchpoint/kvm: Add infrastructure to support 2nd DAWR
Hi Paul,

On 9/2/20 7:31 AM, Paul Mackerras wrote:
> On Thu, Jul 23, 2020 at 03:50:53PM +0530, Ravi Bangoria wrote:
>> kvm code assumes single DAWR everywhere. Add code to support 2nd DAWR.
>> DAWR is a hypervisor resource and thus H_SET_MODE hcall is used to set/unset
>> it. Introduce new case H_SET_MODE_RESOURCE_SET_DAWR1 for 2nd DAWR.
>
> Is this the same interface as will be defined in PAPR and available
> under PowerVM, or is it a new/different interface for KVM?

Yes, kvm hcall interface for 2nd DAWR is same as PowerVM, as defined in PAPR.

>> Also, kvm will support 2nd DAWR only if CPU_FTR_DAWR1 is set.
>
> In general QEMU wants to be able to control all aspects of the virtual
> machine presented to the guest, meaning that just because a host has a
> particular hardware capability does not mean we should automatically
> present that capability to the guest.
>
> In this case, QEMU will want a way to control whether the guest sees
> the availability of the second DAWR/X registers or not, i.e. whether a
> H_SET_MODE to set DAWR[X]1 will succeed or fail.

Patch #3 adds new kvm capability KVM_CAP_PPC_DAWR1 that can be checked
by Qemu. Also, as suggested by David in Qemu patch[1], I'm planning to
add new machine capability in Qemu:

  -machine cap-dawr1=ON/OFF

cap-dawr1 will be default ON when PPC_FEATURE2_ARCH_3_10 is set and OFF
otherwise.

Is this correct approach?

[1]: https://lore.kernel.org/kvm/20200724045613.ga8...@umbus.fritz.box

Thanks,
Ravi
Re: [PATCH 1/7] powerpc/watchpoint/kvm: Rename current DAWR macros and variables
Hi Paul,

On 9/2/20 7:19 AM, Paul Mackerras wrote:
> On Thu, Jul 23, 2020 at 03:50:52PM +0530, Ravi Bangoria wrote:
>> Power10 is introducing second DAWR. Use real register names (with
>> suffix 0) from ISA for current macros and variables used by kvm.
>
> Most of this looks fine, but I think we should not change the existing
> names in arch/powerpc/include/uapi/asm/kvm.h (and therefore also
> Documentation/virt/kvm/api.rst).

Missed that I'm changing uapi. I'll rename only those macros/variables
which are not uapi.

Thanks,
Ravi
RE: [PATCH v1 1/1] scsi: ufshcd: Allow zero value setting to Auto-Hibernate Timer
>
> On 2020-08-29 00:32, Avri Altman wrote:
> >>
> >> The zero value Auto-Hibernate Timer is a valid setting, and it
> >> indicates the Auto-Hibernate feature being disabled. Correctly
> > Right. So " ufshcd_auto_hibern8_enable" is no longer an appropriate
> > name.
> > Maybe ufshcd_auto_hibern8_set instead?
> Thanks for your comment. I am ok with the name change suggestion.
> >
> > Also, did you verified that no other platform relies on its non-zero
> > value?
> I only tested the change on Qualcomm's platform. I do not have other
> platforms to do the test.
> The UFS host controller spec JESD220E, Section 5.2.5 says
> "Software writes “0” to disable Auto-Hibernate Idle Timer". So the spec
> supports this zero value.
> Some options:
> - We could add a hba->caps so that we only apply the change for
> Qualcomm's platforms.
> This is not preferred because it is following the spec implementations.
> - Or other platforms that do not support the zero value needs a caps.

Yeah, I don't think another caps is required,
Maybe just an ack from Stanley.

Thanks,
Avri
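The behaviour discussed in this thread, where a zero timer value is written through and means "disabled", can be modelled with a short sketch. Everything here is an assumption made for illustration: `hba_sketch`, its fields and both functions are hypothetical and do not reproduce the ufshcd driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the host controller state. */
struct hba_sketch {
	bool auto_hibern8_supported;
	uint32_t ahit_reg;	/* models the Auto-Hibernate Idle Timer register */
};

/*
 * Program the timer. Zero is written through like any other value,
 * since the thread quotes the spec as saying that writing 0 disables
 * the timer.
 */
static void auto_hibern8_set(struct hba_sketch *hba, uint32_t ahit)
{
	if (!hba->auto_hibern8_supported)
		return;
	hba->ahit_reg = ahit;
}

/* The feature counts as active only with a non-zero timer value. */
static bool auto_hibern8_active(const struct hba_sketch *hba)
{
	return hba->auto_hibern8_supported && hba->ahit_reg != 0;
}
```

A "set" style name fits this model better than "enable", because the same call is used both to arm the timer and to turn it off.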
[PATCH v2] i2c: i801: Register lis3lv02d I2C device on Dell Latitude 5480
Value of /sys/devices/platform/lis3lv02d/position when
    Horizontal:     (36,-108,-1152)
    Left elevated:  (-432,-126,-1062)
    Front elevated: (36,594,-936)
    Upside down:    (-126,-252,1098)

Signed-off-by: Jeffrey Lin
Reviewed-by: Jean Delvare
---
Changes in v2:
- Added Jean's Reviewed-by

 drivers/i2c/busses/i2c-i801.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index e32ef3f01fe8..efab1e71ad6a 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -1274,6 +1274,7 @@ static const struct {
 	/*
 	 * Additional individual entries were added after verification.
 	 */
+	{ "Latitude 5480",      0x29 },
 	{ "Vostro V131",        0x1d },
 };
 
-- 
2.28.0
Re: [RESEND PATCH 1/2] arm64: dts: ti: k3-j721e-main: Add PCIe device tree nodes
Hi Nishanth, On 01/09/20 8:22 pm, Nishanth Menon wrote: On 19:36-20200901, Kishon Vijay Abraham I wrote: Add PCIe device tree node (both RC and EP) for the four PCIe instances here. Signed-off-by: Kishon Vijay Abraham I --- arch/arm64/boot/dts/ti/k3-j721e-main.dtsi | 218 ++ arch/arm64/boot/dts/ti/k3-j721e.dtsi | 5 +- 2 files changed, 222 insertions(+), 1 deletion(-) Did you look at the diff of the dtbs_check before and after this series? I see: https://pastebin.ubuntu.com/p/9fyfrTjx9M/ I didn't see any errors when I checked for individual bindings a0393678@a0393678-ssd:~/repos/linux$ mkconfig64 dtbs_check DT_SCHEMA_FILES="Documentation/devicetree/bindings/pci/ti,j721e-pci-ep.yaml" SCHEMA Documentation/devicetree/bindings/processed-schema.yaml DTC arch/arm64/boot/dts/ti/k3-am654-base-board.dt.yaml DTC arch/arm64/boot/dts/ti/k3-j721e-common-proc-board.dt.yaml CHECK arch/arm64/boot/dts/ti/k3-am654-base-board.dt.yaml CHECK arch/arm64/boot/dts/ti/k3-j721e-common-proc-board.dt.yaml a0393678@a0393678-ssd:~/repos/linux$ mkconfig64 dtbs_check DT_SCHEMA_FILES="Documentation/devicetree/bindings/pci/ti,j721e-pci-host.yaml" SCHEMA Documentation/devicetree/bindings/processed-schema.yaml DTC arch/arm64/boot/dts/ti/k3-am654-base-board.dt.yaml DTC arch/arm64/boot/dts/ti/k3-j721e-common-proc-board.dt.yaml CHECK arch/arm64/boot/dts/ti/k3-am654-base-board.dt.yaml CHECK arch/arm64/boot/dts/ti/k3-j721e-common-proc-board.dt.yaml diff --git a/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi b/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi index 00a36a14efe7..a36909d8b8c3 100644 --- a/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi @@ -28,6 +28,26 @@ #size-cells = <1>; ranges = <0x0 0x0 0x0010 0x1c000>; + pcie0_ctrl: pcie-ctrl@4070 { https://github.com/devicetree-org/devicetree-specification/releases/download/v0.3/devicetree-specification-v0.3.pdf Section 2.2.2: why not use syscon@4070 and so on? okay, will change to generic name. 
+ compatible = "syscon"; + reg = <0x4070 0x4>; + }; + + pcie1_ctrl: pcie-ctrl@4074 { + compatible = "syscon"; + reg = <0x4074 0x4>; + }; + + pcie2_ctrl: pcie-ctrl@4078 { + compatible = "syscon"; + reg = <0x4078 0x4>; + }; + + pcie3_ctrl: pcie-ctrl@407c { + compatible = "syscon"; + reg = <0x407c 0x4>; + }; + serdes_ln_ctrl: serdes-ln-ctrl@4080 { compatible = "mmio-mux"; reg = <0x4080 0x50>; @@ -576,6 +596,204 @@ }; }; + pcie0_rc: pcie@290 { + compatible = "ti,j721e-pcie-host"; + reg = <0x00 0x0290 0x00 0x1000>, + <0x00 0x02907000 0x00 0x400>, + <0x00 0x0d00 0x00 0x0080>, + <0x00 0x1000 0x00 0x1000>; + reg-names = "intd_cfg", "user_cfg", "reg", "cfg"; + interrupt-names = "link_state"; + interrupts = ; + device_type = "pci"; + ti,syscon-pcie-ctrl = <_ctrl>; + max-link-speed = <3>; + num-lanes = <2>; + power-domains = <_pds 239 TI_SCI_PD_EXCLUSIVE>; + clocks = <_clks 239 1>; + clock-names = "fck"; + #address-cells = <3>; + #size-cells = <2>; + bus-range = <0x0 0xf>; + vendor-id = <0x104c>; + device-id = <0xb00d>; + msi-map = <0x0 _its 0x0 0x1>; + dma-coherent; + ranges = <0x0100 0x0 0x10001000 0x0 0x10001000 0x0 0x001>, +<0x0200 0x0 0x10011000 0x0 0x10011000 0x0 0x7fef000>; + dma-ranges = <0x0200 0x0 0x0 0x0 0x0 0x1 0x0>; + }; + + pcie0_ep: pcie-ep@290 { Not related to this patch, but just a suggestion: pcie-ep -> do we need to add that to the Generic names in DT spec? [...] diff --git a/arch/arm64/boot/dts/ti/k3-j721e.dtsi b/arch/arm64/boot/dts/ti/k3-j721e.dtsi index f787aa73aaae..eeb02115b966 100644 --- a/arch/arm64/boot/dts/ti/k3-j721e.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j721e.dtsi @@ -132,9 +132,12 @@ <0x00 0x0640 0x00 0x0640 0x00 0x0040>, /* USBSS1 */
Re: [PATCH -next] powerpc: Convert to DEFINE_SHOW_ATTRIBUTE
On Thu, Jul 16, 2020 at 05:07:12PM +0800, Qinglang Miao wrote:
> From: Chen Huang
>
> Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
>
> Signed-off-by: Chen Huang

For the arch/powerpc/kvm part:

Acked-by: Paul Mackerras

I expect Michael Ellerman will take the patch through his tree.

Paul.
[PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE
When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode),
current driver disables interrupt remapping when it updates the IRTE
so that the upper and lower 64-bit values can be updated safely.

However, this creates a small window, where the interrupt could
arrive and result in IO_PAGE_FAULT (for interrupt) as shown below.

  IOMMU Driver            Device IRQ
  ============            ===========
  irte.RemapEn=0
       ...
   change IRTE            IRQ from device ==> IO_PAGE_FAULT !!
       ...
  irte.RemapEn=1

This scenario has been observed when changing irq affinity on a system
running I/O-intensive workload, in which the destination APIC ID
in the IRTE is updated.

Instead, use cmpxchg_double() to update the 128-bit IRTE at once without
disabling the interrupt remapping. However, this means several features,
which require GA (128-bit IRTE) support will also be affected if cmpxchg16b
is not supported (which is unprecedented for AMD processors w/ IOMMU).

Reported-by: Sean Osborne
Tested-by: Erik Rockstrom
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/Kconfig |  2 +-
 drivers/iommu/amd/init.c  | 21 +++--
 drivers/iommu/amd/iommu.c | 17 +
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index 1f061d91e0b8..626b97d0dd21 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -10,7 +10,7 @@ config AMD_IOMMU
 	select IOMMU_API
 	select IOMMU_IOVA
 	select IOMMU_DMA
-	depends on X86_64 && PCI && ACPI
+	depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
 	help
 	  With this option you can enable support for AMD IOMMU hardware in
 	  your system. An IOMMU is a hardware component which provides
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c652f16eb702..ad30467f6930 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 			iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
 		else
 			iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
-		if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
+
+		/*
+		 * Note: GA (128-bit IRTE) mode requires cmpxchg16b support.
+		 * GAM also requires GA mode. Therefore, we need to
+		 * check cmpxchg16b support before enabling it.
+		 */
+		if (!boot_cpu_has(X86_FEATURE_CX16) ||
+		    ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
 			amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
 		break;
 	case 0x11:
@@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 			iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
 		else
 			iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
-		if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0))
+
+		/*
+		 * Note: GA (128-bit IRTE) mode requires cmpxchg16b support.
+		 * XT, GAM also requires GA mode. Therefore, we need to
+		 * check cmpxchg16b support before enabling them.
+		 */
+		if (!boot_cpu_has(X86_FEATURE_CX16) ||
+		    ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) {
 			amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
+			break;
+		}
+
 		/*
 		 * Note: Since iommu_update_intcapxt() leverages
 		 * the IOMMU MMIO access to MSI capability block registers
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 967f4e96d1eb..a382d7a73eaa 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3292,6 +3292,7 @@ static int alloc_irq_index(u16 devid, int count, bool align,
 static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
 			  struct amd_ir_data *data)
 {
+	bool ret;
 	struct irq_remap_table *table;
 	struct amd_iommu *iommu;
 	unsigned long flags;
@@ -3309,10 +3310,18 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
 
 	entry = (struct irte_ga *)table->table;
 	entry = &entry[index];
-	entry->lo.fields_remap.valid = 0;
-	entry->hi.val = irte->hi.val;
-	entry->lo.val = irte->lo.val;
-	entry->lo.fields_remap.valid = 1;
+
+	ret = cmpxchg_double(&entry->lo.val, &entry->hi.val,
+			     entry->lo.val, entry->hi.val,
+			     irte->lo.val, irte->hi.val);
+	/*
+	 * We use cmpxchg16 to atomically update the 128-bit IRTE,
+	 * and it cannot be updated by the hardware or other processors
+	 *
[PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE
Currently, the RemapEn (valid) bit is accidentally cleared when programming IRTE w/ guestMode=0. It should be restored to the prior state. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index ba9f3dbc5b94..967f4e96d1eb 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3850,6 +3850,7 @@ int amd_iommu_deactivate_guest_mode(void *data) struct amd_ir_data *ir_data = (struct amd_ir_data *)data; struct irte_ga *entry = (struct irte_ga *) ir_data->entry; struct irq_cfg *cfg = ir_data->cfg; + u64 valid = entry->lo.fields_remap.valid; if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) || !entry || !entry->lo.fields_vapic.guest_mode) @@ -3858,6 +3859,7 @@ int amd_iommu_deactivate_guest_mode(void *data) entry->lo.val = 0; entry->hi.val = 0; + entry->lo.fields_remap.valid = valid; entry->lo.fields_remap.dm = apic->irq_dest_mode; entry->lo.fields_remap.int_type= apic->irq_delivery_mode; entry->hi.fields.vector= cfg->vector; -- 2.17.1
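The save-and-restore idea in the patch above can be illustrated outside the kernel. This is only a minimal userspace sketch, not the driver code: `reprogram_lo_sketch()` and `IRTE_SKETCH_VALID` are hypothetical names, and treating the valid (RemapEn) bit as bit 0 of the low 64-bit word is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the valid (RemapEn) bit is bit 0. */
#define IRTE_SKETCH_VALID 0x1ULL

/*
 * Rebuild the low word from scratch, as the patched function does with
 * "entry->lo.val = 0", but carry the previous valid bit over instead of
 * silently clearing it.
 */
static uint64_t reprogram_lo_sketch(uint64_t old_lo, uint64_t new_fields)
{
	uint64_t valid = old_lo & IRTE_SKETCH_VALID; /* remember prior state */
	uint64_t lo = 0;                             /* wipe everything */

	/* The caller supplies the other fields only, never the valid bit. */
	assert((new_fields & IRTE_SKETCH_VALID) == 0);

	lo |= valid;       /* restore RemapEn */
	lo |= new_fields;  /* program the remaining fields */
	return lo;
}
```

An entry that was valid before reprogramming stays valid afterwards, and an entry that was invalid stays invalid, which is the behaviour the commit message asks for.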
[PATCH 0/2] iommu: amd: Fix intremap IO_PAGE_FAULT for VMs
Interrupt remapping IO_PAGE_FAULT has been observed on systems with a large number of VMs with pass-through devices. This can be reproduced with 64 VMs + 64 pass-through VFs of Mellanox MT28800 Family [ConnectX-5 Ex], where each VM runs a small-packet netperf test via the pass-through device to the netserver running on the host. All VMs run in a reboot loop to trigger IRTE updates. In addition, to accelerate the failure, irqbalance is triggered periodically (e.g. every 1-5 sec), which should generate a large number of IRTE updates. This setup generally triggers IO_PAGE_FAULT within 3-4 hours. Investigation has shown that the issue is in the code that updates the IRTE while remapping is enabled. Please see patch 2/2 for a detailed discussion. This series has been tested in the setup mentioned above for up to 96 hours without seeing issues. Thanks, Suravee Suravee Suthikulpanit (2): iommu: amd: Restore IRTE.RemapEn bit after programming IRTE iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE drivers/iommu/amd/Kconfig | 2 +- drivers/iommu/amd/init.c | 21 +++-- drivers/iommu/amd/iommu.c | 19 +++ 3 files changed, 35 insertions(+), 7 deletions(-) -- 2.17.1
[PATCH] coccinelle: ifnullfree: add vfree(), kvfree*() functions
Extend the list of free functions with kvfree(), kvfree_sensitive(), vfree(). Signed-off-by: Denis Efremov --- scripts/coccinelle/free/ifnullfree.cocci | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/scripts/coccinelle/free/ifnullfree.cocci b/scripts/coccinelle/free/ifnullfree.cocci index 2045391e36a0..285b92d5c665 100644 --- a/scripts/coccinelle/free/ifnullfree.cocci +++ b/scripts/coccinelle/free/ifnullfree.cocci @@ -20,8 +20,14 @@ expression E; - if (E != NULL) ( kfree(E); +| + kvfree(E); | kfree_sensitive(E); +| + kvfree_sensitive(E, ...); +| + vfree(E); | debugfs_remove(E); | @@ -42,9 +48,10 @@ position p; @@ * if (E != NULL) -* \(kfree@p\|kfree_sensitive@p\|debugfs_remove@p\|debugfs_remove_recursive@p\| +* \(kfree@p\|kvfree@p\|kfree_sensitive@p\|kvfree_sensitive@p\|vfree@p\| +* debugfs_remove@p\|debugfs_remove_recursive@p\| * usb_free_urb@p\|kmem_cache_destroy@p\|mempool_destroy@p\| -* dma_pool_destroy@p\)(E); +* dma_pool_destroy@p\)(E, ...); @script:python depends on org@ p << r.p; -- 2.26.2
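For readers unfamiliar with what ifnullfree.cocci reports, the pattern can be shown with a userspace analogue. This is a sketch only: `free()` stands in for `kfree()`, `kvfree()` and `vfree()`, all of which accept a NULL argument, and `struct buf_sketch` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of a heap buffer. */
struct buf_sketch {
	char *data;
};

/* Form flagged by the semantic patch: the NULL test is redundant. */
static void release_checked(struct buf_sketch *b)
{
	assert(b != NULL);
	if (b->data != NULL)
		free(b->data);
	b->data = NULL;
}

/* Form after the transformation: freeing NULL is a no-op. */
static void release_plain(struct buf_sketch *b)
{
	assert(b != NULL);
	free(b->data);
	b->data = NULL;
}
```

Both functions behave identically for NULL and non-NULL `data`, which is why the rule can drop the `if` safely.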
Re: [PATCH net-next] net: sch_generic: avoid concurrent reset and enqueue op for lockless qdisc
On Tue, Sep 1, 2020 at 6:42 PM Yunsheng Lin wrote: > > On 2020/9/2 2:24, Cong Wang wrote: > > On Mon, Aug 31, 2020 at 5:59 PM Yunsheng Lin wrote: > >> > >> Currently there is concurrent reset and enqueue operation for the > >> same lockless qdisc when there is no lock to synchronize the > >> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in > >> qdisc_deactivate() called by dev_deactivate_queue(), which may cause > >> out-of-bounds access for priv->ring[] in hns3 driver if user has > >> requested a smaller queue num when __dev_xmit_skb() still enqueue a > >> skb with a larger queue_mapping after the corresponding qdisc is > >> reset, and call hns3_nic_net_xmit() with that skb later. > > > > Can you be more specific here? Which call path requests a smaller > > tx queue num? If you mean netif_set_real_num_tx_queues(), clearly > > we already have a synchronize_net() there. > > When the netdevice is in active state, the synchronize_net() seems to > do the correct work, as below: > > CPU 0: CPU1: > __dev_queue_xmit() netif_set_real_num_tx_queues() > rcu_read_lock_bh(); > netdev_core_pick_tx(dev, skb, sb_dev); > . > . dev->real_num_tx_queues = txq; > . . > . . > . synchronize_net(); > . . > q->enqueue(). > . . > rcu_read_unlock_bh(). > qdisc_reset_all_tx_gt > > Right. > but dev->real_num_tx_queues is not RCU-protected, maybe that is a problem > too. > > The problem we hit is as below: > In hns3_set_channels(), hns3_reset_notify(h, HNAE3_DOWN_CLIENT) is called > to deactive the netdevice when user requested a smaller queue num, and > txq->qdisc is already changed to noop_qdisc when calling > netif_set_real_num_tx_queues(), so the synchronize_net() in the function > netif_set_real_num_tx_queues() does not help here. How could qdisc still be running after deactivating the device? 
> > > > >> > >> Avoid the above concurrent op by calling synchronize_rcu_tasks() > >> after assigning new qdisc to dev_queue->qdisc and before calling > >> qdisc_deactivate() to make sure skb with larger queue_mapping > >> enqueued to old qdisc will always be reset when qdisc_deactivate() > >> is called. > > > > Like Eric said, it is not nice to call such a blocking function when > > we have a large number of TX queues. Possibly we just need to > > add a synchronize_net() as in netif_set_real_num_tx_queues(), > > if it is missing. > > As above, the synchronize_net() in netif_set_real_num_tx_queues() seems > to work when netdevice is in active state, but does not work when in > deactive. Please explain why deactivated device still has qdisc running? At least before commit 379349e9bc3b4, we always test deactivate bit before enqueueing. Are you complaining about that commit? That commit is indeed suspicious, at least it does not precisely revert commit ba27b4cdaaa66561aaedb21 as it claims. > > And we do not want skb left in the old qdisc when netdevice is deactived, > right? Yes, and more importantly, qdisc should not be running after deactivation. Thanks.
RE: [PATCH 3/3] edac: sifive: Add EDAC support for Memory Controller in SiFive SoCs
> -Original Message- > From: Borislav Petkov > Sent: 31 August 2020 14:22 > To: Yash Shah > Cc: robh...@kernel.org; pal...@dabbelt.com; Paul Walmsley ( Sifive) > ; mche...@kernel.org; tony.l...@intel.com; > a...@eecs.berkeley.edu; james.mo...@arm.com; rrich...@marvell.com; > devicet...@vger.kernel.org; linux-ri...@lists.infradead.org; linux- > ker...@vger.kernel.org; linux-e...@vger.kernel.org; Sachin Ghadi > > Subject: Re: [PATCH 3/3] edac: sifive: Add EDAC support for Memory > Controller in SiFive SoCs > > [External Email] Do not click links or attachments unless you recognize the > sender and know the content is safe > > > Subject: Re: [PATCH 3/3] edac: sifive: Add EDAC support for Memory > > Controller in SiFive SoCs > > Fix subject prefix: "EDAC/sifive: ..." > > On Tue, Aug 25, 2020 at 05:36:22PM +0530, Yash Shah wrote: > > Add Memory controller EDAC support in exisiting SiFive platform EDAC > > s/in exisiting/to the/ > > > driver. It registers for notifier events from the SiFive DDR > > controller driver for DDR ECC events. > > Simplify: > > "It registers for ECC notifier events from the memory controller." > > > Signed-off-by: Yash Shah > > --- > > drivers/edac/Kconfig | 2 +- > > drivers/edac/sifive_edac.c | 117 > > + > > 2 files changed, 118 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index > > 7b6ec30..f8b3b53 100644 > > --- a/drivers/edac/Kconfig > > +++ b/drivers/edac/Kconfig > > @@ -462,7 +462,7 @@ config EDAC_ALTERA_SDMMC > > > > config EDAC_SIFIVE > > bool "Sifive platform EDAC driver" > > - depends on EDAC=y && SIFIVE_L2 > > + depends on EDAC=y && (SIFIVE_L2 || SIFIVE_DDR) > > help > > Support for error detection and correction on the SiFive SoCs. 
> > > > diff --git a/drivers/edac/sifive_edac.c b/drivers/edac/sifive_edac.c > > index 3a3dcb1..cf032685 100644 > > --- a/drivers/edac/sifive_edac.c > > +++ b/drivers/edac/sifive_edac.c > > @@ -11,14 +11,120 @@ > > #include > > #include "edac_module.h" > > #include > > +#include > > > > #define DRVNAME "sifive_edac" > > +#define SIFIVE_EDAC_MOD_NAME "Sifive ECC Manager" > > s/SIFIVE_EDAC_MOD_NAME/EDAC_MOD_NAME/g > > like the other EDAC drivers. > Sure, will make all the above suggested textual changes in v2. > ... > > > +static int ecc_mc_register(struct platform_device *pdev) { > > + struct sifive_edac_mc_priv *p; > > + struct edac_mc_layer layers[1]; > > + int ret; > > + > > + p = devm_kzalloc(>dev, sizeof(*p), GFP_KERNEL); > > + if (!p) > > + return -ENOMEM; > > + > > + p->notifier.notifier_call = ecc_mc_err_event; > > + platform_set_drvdata(pdev, p); > > + > > + layers[0].type = EDAC_MC_LAYER_CHIP_SELECT; > > + layers[0].size = 1; > > + layers[0].is_virt_csrow = true; > > + > > + p->mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, 0); > > + if (!p->mci) { > > + dev_err(>dev, "Failed mem allocation for mc > > instance\n"); > > + return -ENOMEM; > > + } > > + > > + p->mci->pdev = >dev; > > + /* Initialize controller capabilities */ > > + p->mci->mtype_cap = MEM_FLAG_DDR4; > > + p->mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED; > > + p->mci->edac_cap = EDAC_FLAG_SECDED; > > + p->mci->scrub_cap = SCRUB_UNKNOWN; > > + p->mci->scrub_mode = SCRUB_HW_PROG; > > + p->mci->ctl_name = dev_name(>dev); > > + p->mci->dev_name = dev_name(>dev); > > + p->mci->mod_name = SIFIVE_EDAC_MOD_NAME; > > + p->mci->ctl_page_to_phys = NULL; > > + > > + /* Interrupt feature is supported by cadence mc */ > > + edac_op_state = EDAC_OPSTATE_INT; > > + > > + ret = edac_mc_add_mc(p->mci); > > + if (ret) { > > + edac_printk(KERN_ERR, SIFIVE_EDAC_MOD_NAME, > > + "Failed to register with EDAC core\n"); > > + goto err; > > + } > > + > > +#ifdef CONFIG_SIFIVE_DDR > > It seems all that 
ifdeffery can be replaced with > > if (IS_ENABLED(CONFIG_...)) Yes, will replace all the ifdeffery in v2 Thanks for the review. - Yash > > Thx. > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette
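The review suggestion above, replacing #ifdef blocks with an `if (IS_ENABLED(...))` test, can be sketched in plain C. The macros below are modeled on the preprocessor trick behind the kernel's IS_ENABLED() but are a simplified userspace reimplementation that only handles the built-in (=y) case; every `SK_`/`SKETCH_` name and both `CONFIG_SKETCH_*` options are hypothetical.

```c
#include <assert.h>

/* An option defined as 1 pastes to this placeholder, which expands to "0,". */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(ignored, val, ...) val
#define SK_IS_DEFINED(x) SK_IS_DEFINED_(x)
#define SK_IS_DEFINED_(val) SK_IS_DEFINED__(SK_PLACEHOLDER_##val)
#define SK_IS_DEFINED__(arg1_or_junk) SK_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_ENABLED(option) SK_IS_DEFINED(option)

#define CONFIG_SKETCH_DDR 1
/* CONFIG_SKETCH_L2 is intentionally left undefined. */

/*
 * Both branches are always parsed and type-checked by the compiler;
 * the disabled one is then removed as dead code.
 */
static int register_handlers_sketch(void)
{
	int registered = 0;

	if (SKETCH_IS_ENABLED(CONFIG_SKETCH_L2))
		registered |= 0x1;  /* would register the L2 notifier */
	if (SKETCH_IS_ENABLED(CONFIG_SKETCH_DDR))
		registered |= 0x2;  /* would register the DDR notifier */

	assert(registered != 0);    /* at least one option is enabled here */
	return registered;
}
```

The practical benefit over #ifdef is that code for a disabled option still has to compile, so it is less likely to bit-rot unnoticed.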
Re: [LKP] Re: [rcuperf] 4e88ec4a9e: UBSAN:division-overflow_in_arch/x86/include/asm/div64.h
On 9/2/2020 12:27 AM, Paul E. McKenney wrote: On Tue, Sep 01, 2020 at 03:03:28PM +0800, Rong Chen wrote: On 8/31/20 11:50 PM, Paul E. McKenney wrote: On Mon, Aug 31, 2020 at 08:01:22PM +0800, kernel test robot wrote: Greeting, FYI, we noticed the following commit (built with gcc-9): commit: 4e88ec4a9eb17527e640b063f79e5b875733eb53 ("rcuperf: Change rcuperf to rcuscale") https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master in testcase: trinity with following parameters: runtime: 300s test-description: Trinity is a linux system call fuzz tester. test-url: http://codemonkey.org.uk/projects/trinity/ on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): +-+++ | | 65bd77f554 | 4e88ec4a9e | +-+++ | boot_successes | 13 | 0 | | boot_failures | 0 | 14 | | UBSAN:division-overflow_in_arch/x86/include/asm/div64.h | 0 | 14 | | error:#[##] | 0 | 14 | | EIP:main_func.cold | 0 | 14 | | Kernel_panic-not_syncing:Fatal_exception| 0 | 14 | +-+++ If you fix the issue, kindly add following tag Reported-by: kernel test robot Does the patch below fix this for you? Yes, this patch can fix the issue, and nreaders was adjusted to 1: [ 5.953645] The force parameter has not been set to 1. The Iris poweroff handler will not be installed. [ 12.546587] rcu-ref-scale: --- Start of test: verbose=0 shutdown=1 holdoff=10 loops=1 nreaders=-1 nruns=30 readdelay=0 [ 12.561495] [ cut here ] [ 12.562016] ref_scale_init: nreaders = 0, adjusted to 1 [ 12.562601] WARNING: CPU: 0 PID: 1 at kernel/rcu/refscale.c:684 ref_scale_init+0x653/0x80 Thank you! May I add your Tested-by? Yes, please, it's my pleasure. Best Regards, Rong Chen Thanx, Paul Best Regards, Rong Chen Thanx, Paul commit d301e320e952e2e604d83d9540e52510b0eb3d94 Author: Paul E. 
McKenney Date: Thu Aug 27 09:58:19 2020 -0700 refscale: Bounds-check module parameters The default value for refscale.nreaders is -1, which results in the code setting the value to three-quarters of the number of CPUs. On single-CPU systems, this results in three-quarters of the value one, which the C language's integer arithmetic rounds to zero. This in turn results in a divide-by-zero error. This commit therefore adds bounds checking to the refscale module parameters, so that if they are less than one, they are set to the value one. Reported-by: kernel test robot Signed-off-by: Paul E. McKenney diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c index 952595c..fb5f20d 100644 --- a/kernel/rcu/refscale.c +++ b/kernel/rcu/refscale.c @@ -681,6 +681,12 @@ ref_scale_init(void) // Reader tasks (default to ~75% of online CPUs). if (nreaders < 0) nreaders = (num_online_cpus() >> 1) + (num_online_cpus() >> 2); + if (WARN_ONCE(loops <= 0, "%s: loops = %ld, adjusted to 1\n", __func__, loops)) + loops = 1; + if (WARN_ONCE(nreaders <= 0, "%s: nreaders = %d, adjusted to 1\n", __func__, nreaders)) + nreaders = 1; + if (WARN_ONCE(nruns <= 0, "%s: nruns = %d, adjusted to 1\n", __func__, nruns)) + nruns = 1; reader_tasks = kcalloc(nreaders, sizeof(reader_tasks[0]), GFP_KERNEL); if (!reader_tasks) { ___ LKP mailing list -- l...@lists.01.org To unsubscribe send an email to lkp-le...@lists.01.org
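The arithmetic behind the divide-by-zero described in the commit message is easy to reproduce in userspace. The sketch below mirrors the default computation and the clamp from the patch; `default_nreaders_sketch()` is a hypothetical helper, and the CPU count is passed in as a parameter rather than read from `num_online_cpus()`.

```c
#include <assert.h>

/*
 * A negative nreaders means "use the default", i.e. roughly 75% of the
 * online CPUs computed with shifts. With a single CPU both shifts give
 * zero, so the result is clamped to at least one reader.
 */
static int default_nreaders_sketch(int nreaders, int online_cpus)
{
	assert(online_cpus >= 1);

	if (nreaders < 0)
		nreaders = (online_cpus >> 1) + (online_cpus >> 2);
	if (nreaders <= 0)
		nreaders = 1;   /* the bounds check added by the patch */
	return nreaders;
}
```

With one CPU the unclamped value is (1 >> 1) + (1 >> 2) = 0, which matches the "nreaders = 0, adjusted to 1" warning quoted in the test report.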
[PATCH v6 1/8] powerpc/watchpoint: Fix quadword instruction handling on p10 predecessors
On p10 predecessors, a watchpoint with quadword access is compared at quadword length. If the watch range is a doubleword or less in the first half of a quadword-aligned 16 bytes, and if there is any unaligned quadword access which will access only the 2nd half, the handler should consider it as extraneous and emulate/single-step it before continuing. Reported-by: Pedro Miraglia Franco de Carvalho Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint") Signed-off-by: Ravi Bangoria --- arch/powerpc/include/asm/hw_breakpoint.h | 1 + arch/powerpc/kernel/hw_breakpoint.c | 12 ++-- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h index db206a7f38e2..9b68eafebf43 100644 --- a/arch/powerpc/include/asm/hw_breakpoint.h +++ b/arch/powerpc/include/asm/hw_breakpoint.h @@ -42,6 +42,7 @@ struct arch_hw_breakpoint { #else #define HW_BREAKPOINT_SIZE 0x8 #endif +#define HW_BREAKPOINT_SIZE_QUADWORD 0x10 #define DABR_MAX_LEN 8 #define DAWR_MAX_LEN 512 diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c index 1f4a1efa0074..9f7df1c37233 100644 --- a/arch/powerpc/kernel/hw_breakpoint.c +++ b/arch/powerpc/kernel/hw_breakpoint.c @@ -520,9 +520,17 @@ static bool ea_hw_range_overlaps(unsigned long ea, int size, struct arch_hw_breakpoint *info) { unsigned long hw_start_addr, hw_end_addr; + unsigned long align_size = HW_BREAKPOINT_SIZE; - hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE); - hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE); + /* +* On p10 predecessors, quadword is handled differently than +* other instructions. 
+*/ + if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16) + align_size = HW_BREAKPOINT_SIZE_QUADWORD; + + hw_start_addr = ALIGN_DOWN(info->address, align_size); + hw_end_addr = ALIGN(info->address + info->len, align_size); return ((ea < hw_end_addr) && (ea + size > hw_start_addr)); } -- 2.26.2
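The effect of switching the alignment from 8 to 16 bytes can be checked with a small userspace model of the overlap test. This is a sketch under stated assumptions: the `SK_ALIGN_*` macros and `hw_range_overlaps_sketch()` are hypothetical stand-ins for the kernel's ALIGN_DOWN(), ALIGN() and ea_hw_range_overlaps(), and the watchpoint is passed as plain address and length values instead of a `struct arch_hw_breakpoint`.

```c
#include <assert.h>
#include <stdbool.h>

/* Power-of-two alignment helpers, assumed equivalents of ALIGN_DOWN()/ALIGN(). */
#define SK_ALIGN_DOWN(x, a) ((x) & ~((unsigned long)(a) - 1))
#define SK_ALIGN_UP(x, a) (((x) + (unsigned long)(a) - 1) & ~((unsigned long)(a) - 1))

/*
 * Does an access of `size` bytes at `ea` overlap the range the hardware
 * compares against, i.e. the watch range widened to `align_size`?
 */
static bool hw_range_overlaps_sketch(unsigned long ea, int size,
				     unsigned long wp_addr, unsigned long wp_len,
				     unsigned long align_size)
{
	unsigned long hw_start, hw_end;

	assert(align_size && !(align_size & (align_size - 1)));

	hw_start = SK_ALIGN_DOWN(wp_addr, align_size);
	hw_end = SK_ALIGN_UP(wp_addr + wp_len, align_size);

	return (ea < hw_end) && (ea + size > hw_start);
}
```

For a watchpoint at 0x1000 with length 8 and an unaligned quadword access at 0x1008, the 8-byte alignment reports no overlap while the 16-byte alignment does. The second result models the hardware match on pre-p10 parts that the handler must then classify as extraneous, because the user-requested range itself was not touched.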
linux-next: manual merge of the rcu tree with the jc_docs tree
Hi all, Today's linux-next merge of the rcu tree got a conflict in: Documentation/memory-barriers.txt between commit: 537f3a7cf48e ("docs/memory-barriers.txt: Fix references for DMA*.txt files") from the jc_docs tree and commit: 6f6705147bab ("docs: fix references for DMA*.txt files") from the rcu tree. I fixed it up (they are pretty much the same - I used the former) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell
[PATCH v6 8/8] powerpc/watchpoint/selftests: Tests for kernel accessing user memory
Introduce tests to cover simple scenarios where user is watching memory which can be accessed by kernel as well. We also support _MODE_EXACT with _SETHWDEBUG interface. Move those testcases outside of _BP_RANGE condition. This will help to test _MODE_EXACT scenarios when CONFIG_HAVE_HW_BREAKPOINT is not set, eg: $ ./ptrace-hwbreak ... PTRACE_SET_DEBUGREG, Kernel Access Userspace, len: 8: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace, len: 1: Ok success: ptrace-hwbreak Suggested-by: Pedro Miraglia Franco de Carvalho Signed-off-by: Ravi Bangoria --- .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 48 ++- 1 file changed, 46 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c index fc477dfe86a2..2e0d86e0687e 100644 --- a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c +++ b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c @@ -20,6 +20,8 @@ #include #include #include +#include +#include #include "ptrace.h" #define SPRN_PVR 0x11F @@ -44,6 +46,7 @@ struct gstruct { }; static volatile struct gstruct gstruct __attribute__((aligned(512))); +static volatile char cwd[PATH_MAX] __attribute__((aligned(8))); static void get_dbginfo(pid_t child_pid, struct ppc_debug_info *dbginfo) { @@ -138,6 +141,9 @@ static void test_workload(void) write_var(len); } + /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */ + syscall(__NR_getcwd, &cwd, PATH_MAX); + /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO test */ write_var(1); @@ -150,6 +156,9 @@ static void test_workload(void) else read_var(1); + /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */ + syscall(__NR_getcwd, &cwd, PATH_MAX); + /* PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, WO test */ gstruct.a[rand() % A_LEN] = 'a'; @@ -293,6 +302,24 @@ 
static int test_set_debugreg(pid_t child_pid) return 0; } +static int test_set_debugreg_kernel_userspace(pid_t child_pid) +{ + unsigned long wp_addr = (unsigned long)cwd; + char *name = "PTRACE_SET_DEBUGREG"; + + /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */ + wp_addr &= ~0x7UL; + wp_addr |= (1UL << DABR_READ_SHIFT); + wp_addr |= (1UL << DABR_WRITE_SHIFT); + wp_addr |= (1UL << DABR_TRANSLATION_SHIFT); + ptrace_set_debugreg(child_pid, wp_addr); + ptrace(PTRACE_CONT, child_pid, NULL, 0); + check_success(child_pid, name, "Kernel Access Userspace", wp_addr, 8); + + ptrace_set_debugreg(child_pid, 0); + return 0; +} + static void get_ppc_hw_breakpoint(struct ppc_hw_breakpoint *info, int type, unsigned long addr, int len) { @@ -338,6 +365,22 @@ static void test_sethwdebug_exact(pid_t child_pid) ptrace_delhwdebug(child_pid, wh); } +static void test_sethwdebug_exact_kernel_userspace(pid_t child_pid) +{ + struct ppc_hw_breakpoint info; + unsigned long wp_addr = (unsigned long)&cwd; + char *name = "PPC_PTRACE_SETHWDEBUG, MODE_EXACT"; + int len = 1; /* hardcoded in kernel */ + int wh; + + /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */ + get_ppc_hw_breakpoint(&info, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr, 0); + wh = ptrace_sethwdebug(child_pid, &info); + ptrace(PTRACE_CONT, child_pid, NULL, 0); + check_success(child_pid, name, "Kernel Access Userspace", wp_addr, len); + ptrace_delhwdebug(child_pid, wh); +} + static void test_sethwdebug_range_aligned(pid_t child_pid) { struct ppc_hw_breakpoint info; @@ -452,9 +495,10 @@ static void run_tests(pid_t child_pid, struct ppc_debug_info *dbginfo, bool dawr) { test_set_debugreg(child_pid); + test_set_debugreg_kernel_userspace(child_pid); + test_sethwdebug_exact(child_pid); + test_sethwdebug_exact_kernel_userspace(child_pid); if (dbginfo->features & PPC_DEBUG_FEATURE_DATA_BP_RANGE) { - test_sethwdebug_exact(child_pid); - test_sethwdebug_range_aligned(child_pid); if (dawr || is_8xx) { 
test_sethwdebug_range_unaligned(child_pid); -- 2.26.2
[PATCH v6 2/8] powerpc/watchpoint: Fix handling of vector instructions
Vector load/store instructions are special because they are always aligned. Thus unaligned EA needs to be aligned down before comparing it with watch ranges. Otherwise we might consider valid event as invalid. Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint") Signed-off-by: Ravi Bangoria --- arch/powerpc/kernel/hw_breakpoint.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c index 9f7df1c37233..f6b24838ca3c 100644 --- a/arch/powerpc/kernel/hw_breakpoint.c +++ b/arch/powerpc/kernel/hw_breakpoint.c @@ -644,6 +644,8 @@ static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr, if (*type == CACHEOP) { *size = cache_op_size(); *ea &= ~(*size - 1); + } else if (*type == LOAD_VMX || *type == STORE_VMX) { + *ea &= ~(*size - 1); } } -- 2.26.2
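Why the missing align-down can turn a valid event into an invalid one is easiest to see with numbers. The following is a minimal userspace sketch: `vmx_effective_ea_sketch()` and `user_range_overlaps_sketch()` are hypothetical helpers, the latter modeled on the range comparison used by the handler, and the 16-byte access size is an assumption for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Vector accesses ignore the low address bits: align the EA down to size. */
static unsigned long vmx_effective_ea_sketch(unsigned long ea, unsigned long size)
{
	assert(size && !(size & (size - 1)));   /* size must be a power of two */
	return ea & ~(size - 1);
}

/* Does an access of `size` bytes at `ea` touch the user-requested watch range? */
static bool user_range_overlaps_sketch(unsigned long ea, unsigned long size,
				       unsigned long wp_addr, unsigned long wp_len)
{
	return (ea < wp_addr + wp_len) && (ea + size > wp_addr);
}
```

Take a watchpoint on 8 bytes at 0x1000 and a vector store issued with EA 0x100c. Comparing the raw EA finds no overlap, so the event would be discarded, while the aligned EA of 0x1000 does overlap and the event is correctly reported.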
[PATCH v6 5/8] powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N
On powerpc, ptrace watchpoint works in one-shot mode. i.e. kernel disables event every time it fires and user has to re-enable it. Also, in case of ptrace watchpoint, kernel notifies ptrace user before executing instruction. With CONFIG_HAVE_HW_BREAKPOINT=N, kernel is missing to disable ptrace event and thus it's causing infinite loop of exceptions. This is especially harmful when user watches on a data which is also read/written by kernel, eg syscall parameters. In such case, infinite exceptions happens in kernel mode which causes soft-lockup. Fixes: 9422de3e953d ("powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint registers") Reported-by: Pedro Miraglia Franco de Carvalho Signed-off-by: Ravi Bangoria --- arch/powerpc/include/asm/hw_breakpoint.h | 3 ++ arch/powerpc/kernel/process.c | 48 +++ arch/powerpc/kernel/ptrace/ptrace-noadv.c | 4 +- 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h index 81872c420476..abebfbee5b1c 100644 --- a/arch/powerpc/include/asm/hw_breakpoint.h +++ b/arch/powerpc/include/asm/hw_breakpoint.h @@ -18,6 +18,7 @@ struct arch_hw_breakpoint { u16 type; u16 len; /* length of the target data symbol */ u16 hw_len; /* length programmed in hw */ + u8 flags; }; /* Note: Don't change the first 6 bits below as they are in the same order @@ -37,6 +38,8 @@ struct arch_hw_breakpoint { #define HW_BRK_TYPE_PRIV_ALL (HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \ HW_BRK_TYPE_HYP) +#define HW_BRK_FLAG_DISABLED 0x1 + /* Minimum granularity */ #ifdef CONFIG_PPC_8xx #define HW_BREAKPOINT_SIZE 0x4 diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 016bd831908e..160fbbf41d40 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -636,6 +636,44 @@ void do_send_trap(struct pt_regs *regs, unsigned long address, (void __user *)address); } #else /* !CONFIG_PPC_ADV_DEBUG_REGS */ + +static void 
do_break_handler(struct pt_regs *regs) +{ + struct arch_hw_breakpoint null_brk = {0}; + struct arch_hw_breakpoint *info; + struct ppc_inst instr = ppc_inst(0); + int type = 0; + int size = 0; + unsigned long ea; + int i; + + /* +* If underneath hw supports only one watchpoint, we know it +* caused exception. 8xx also falls into this category. +*/ + if (nr_wp_slots() == 1) { + __set_breakpoint(0, &null_brk); + current->thread.hw_brk[0] = null_brk; + current->thread.hw_brk[0].flags |= HW_BRK_FLAG_DISABLED; + return; + } + + /* Otherwise find out which DAWR caused exception and disable it. */ + wp_get_instr_detail(regs, &instr, &type, &size, &ea); + + for (i = 0; i < nr_wp_slots(); i++) { + info = &current->thread.hw_brk[i]; + if (!info->address) + continue; + + if (wp_check_constraints(regs, instr, ea, type, size, info)) { + __set_breakpoint(i, &null_brk); + current->thread.hw_brk[i] = null_brk; + current->thread.hw_brk[i].flags |= HW_BRK_FLAG_DISABLED; + } + } +} + void do_break (struct pt_regs *regs, unsigned long address, unsigned long error_code) { @@ -647,6 +685,16 @@ void do_break (struct pt_regs *regs, unsigned long address, if (debugger_break_match(regs)) return; + /* +* We reach here only when watchpoint exception is generated by ptrace +* event (or hw is buggy!). Now if CONFIG_HAVE_HW_BREAKPOINT is set, +* watchpoint is already handled by hw_breakpoint_handler() so we don't +* have to do anything. But when CONFIG_HAVE_HW_BREAKPOINT is not set, +* we need to manually handle the watchpoint here. 
+*/ + if (!IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT)) + do_break_handler(regs); + /* Deliver the signal to userspace */ force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address); } diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c index 57a0ab822334..c9122ed91340 100644 --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c @@ -286,11 +286,13 @@ long ppc_del_hwdebug(struct task_struct *child, long data) } return ret; #else /* CONFIG_HAVE_HW_BREAKPOINT */ - if (child->thread.hw_brk[data - 1].address == 0) + if (!(child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED) && + child->thread.hw_brk[data - 1].address == 0) return -ENOENT; child->thread.hw_brk[data - 1].address = 0;
[PATCH v6 7/8] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 can be used to determine whether we are running on an ISA 3.1 compliant machine, which is needed to determine DAR behaviour, the 512 byte boundary limit etc. This was requested by Pedro Miraglia Franco de Carvalho for extending watchpoint features in gdb. Note that availability of 2nd DAWR is independent of this flag and should be checked using ppc_debug_info->num_data_bps. Signed-off-by: Ravi Bangoria --- Documentation/powerpc/ptrace.rst | 1 + arch/powerpc/include/uapi/asm/ptrace.h | 1 + arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 ++ 3 files changed, 4 insertions(+) diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst index 864d4b61..77725d69eb4a 100644 --- a/Documentation/powerpc/ptrace.rst +++ b/Documentation/powerpc/ptrace.rst @@ -46,6 +46,7 @@ features will have bits indicating whether there is support for:: #define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4 #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10 + #define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 0x20 2. 
PTRACE_SETHWDEBUG diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h index f5f1ccc740fc..7004cfea3f5f 100644 --- a/arch/powerpc/include/uapi/asm/ptrace.h +++ b/arch/powerpc/include/uapi/asm/ptrace.h @@ -222,6 +222,7 @@ struct ppc_debug_info { #define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x0004 #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x0008 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x0010 +#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 0x0020 #ifndef __ASSEMBLY__ diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c index 48c52426af80..aa36fcad36cd 100644 --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c @@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo) } else { dbginfo->features = 0; } + if (cpu_has_feature(CPU_FTR_ARCH_31)) + dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31; } int ptrace_get_debugreg(struct task_struct *child, unsigned long addr, -- 2.26.2
[PATCH v6 3/8] powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N
When kernel is compiled with CONFIG_HAVE_HW_BREAKPOINT=N, user can still create watchpoint using PPC_PTRACE_SETHWDEBUG, with limited functionalities. But, such watchpoints are never firing because of the missing privilege settings. Fix that. It's safe to set HW_BRK_TYPE_PRIV_ALL because we don't really leak any kernel address in signal info. Setting HW_BRK_TYPE_PRIV_ALL will also help to find scenarios when kernel accesses user memory. Reported-by: Pedro Miraglia Franco de Carvalho Suggested-by: Pedro Miraglia Franco de Carvalho Signed-off-by: Ravi Bangoria --- arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c index 697c7e4b5877..57a0ab822334 100644 --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c @@ -217,7 +217,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct ppc_hw_breakpoint *bp_inf return -EIO; brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE); - brk.type = HW_BRK_TYPE_TRANSLATE; + brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL; brk.len = DABR_MAX_LEN; if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ) brk.type |= HW_BRK_TYPE_READ; -- 2.26.2
[PATCH v6 0/8] powerpc/watchpoint: Bug fixes plus new feature flag
Patch #1 fixes issue for quadword instruction on p10 predecessors. Patch #2 fixes issue for vector instructions. Patch #3 fixes a bug about watchpoint not firing when created with ptrace PPC_PTRACE_SETHWDEBUG and CONFIG_HAVE_HW_BREAKPOINT=N. The fix uses HW_BRK_TYPE_PRIV_ALL for ptrace user which, I guess, should be fine because we don't leak any kernel addresses and PRIV_ALL will also help to cover scenarios when kernel accesses user memory. Patch #4,#5 fixes infinite exception bug, again the bug happens only with CONFIG_HAVE_HW_BREAKPOINT=N. Patch #6 fixes two places where we are missing to set hw_len. Patch #7 introduces new feature bit PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 which will be set when running on ISA 3.1 compliant machine. Patch #8 finally adds selftest to test scenarios fixed by patch#2,#3 and also moves MODE_EXACT tests outside of BP_RANGE condition. Christophe, let me know if this series breaks something for 8xx. v5: https://lore.kernel.org/r/20200825043617.1073634-1-ravi.bango...@linux.ibm.com v5->v6: - Fix build failure reported by kernel test robot - patch #5. 
Use more compact if condition, suggested by Christophe Ravi Bangoria (8): powerpc/watchpoint: Fix quadword instruction handling on p10 predecessors powerpc/watchpoint: Fix handling of vector instructions powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N powerpc/watchpoint: Add hw_len wherever missing powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 powerpc/watchpoint/selftests: Tests for kernel accessing user memory Documentation/powerpc/ptrace.rst | 1 + arch/powerpc/include/asm/hw_breakpoint.h | 12 ++ arch/powerpc/include/uapi/asm/ptrace.h| 1 + arch/powerpc/kernel/Makefile | 3 +- arch/powerpc/kernel/hw_breakpoint.c | 149 +--- .../kernel/hw_breakpoint_constraints.c| 162 ++ arch/powerpc/kernel/process.c | 48 ++ arch/powerpc/kernel/ptrace/ptrace-noadv.c | 9 +- arch/powerpc/xmon/xmon.c | 1 + .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 48 +- 10 files changed, 282 insertions(+), 152 deletions(-) create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c -- 2.26.2
[PATCH v6 4/8] powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c
Power10 hw has multiple DAWRs but hw doesn't tell which DAWR caused the exception. So we have a sw logic to detect that in hw_breakpoint.c. But hw_breakpoint.c gets compiled only with CONFIG_HAVE_HW_BREAKPOINT=Y. Move DAWR detection logic outside of hw_breakpoint.c so that it can be reused when CONFIG_HAVE_HW_BREAKPOINT is not set. Signed-off-by: Ravi Bangoria --- arch/powerpc/include/asm/hw_breakpoint.h | 8 + arch/powerpc/kernel/Makefile | 3 +- arch/powerpc/kernel/hw_breakpoint.c | 159 + .../kernel/hw_breakpoint_constraints.c| 162 ++ 4 files changed, 174 insertions(+), 158 deletions(-) create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h index 9b68eafebf43..81872c420476 100644 --- a/arch/powerpc/include/asm/hw_breakpoint.h +++ b/arch/powerpc/include/asm/hw_breakpoint.h @@ -10,6 +10,7 @@ #define _PPC_BOOK3S_64_HW_BREAKPOINT_H #include +#include #ifdef __KERNEL__ struct arch_hw_breakpoint { @@ -52,6 +53,13 @@ static inline int nr_wp_slots(void) return cpu_has_feature(CPU_FTR_DAWR1) ? 
2 : 1; } +bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr, + unsigned long ea, int type, int size, + struct arch_hw_breakpoint *info); + +void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr, +int *type, int *size, unsigned long *ea); + #ifdef CONFIG_HAVE_HW_BREAKPOINT #include #include diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index cbf41fb4ee89..a5550c2b24c4 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -45,7 +45,8 @@ obj-y := cputable.o syscalls.o \ signal.o sysfs.o cacheinfo.o time.o \ prom.o traps.o setup-common.o \ udbg.o misc.o io.o misc_$(BITS).o \ - of_platform.o prom_parse.o firmware.o + of_platform.o prom_parse.o firmware.o \ + hw_breakpoint_constraints.o obj-y += ptrace/ obj-$(CONFIG_PPC64)+= setup_64.o \ paca.o nvram_64.o note.o syscall_64.o diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c index f6b24838ca3c..f4e8f21046f5 100644 --- a/arch/powerpc/kernel/hw_breakpoint.c +++ b/arch/powerpc/kernel/hw_breakpoint.c @@ -494,161 +494,6 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs) } } -static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint *info) -{ - return ((info->address <= dar) && (dar - info->address < info->len)); -} - -static bool ea_user_range_overlaps(unsigned long ea, int size, - struct arch_hw_breakpoint *info) -{ - return ((ea < info->address + info->len) && - (ea + size > info->address)); -} - -static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info) -{ - unsigned long hw_start_addr, hw_end_addr; - - hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE); - hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE); - - return ((hw_start_addr <= dar) && (hw_end_addr > dar)); -} - -static bool ea_hw_range_overlaps(unsigned long ea, int size, -struct arch_hw_breakpoint *info) -{ - unsigned long hw_start_addr, hw_end_addr; - 
unsigned long align_size = HW_BREAKPOINT_SIZE; - - /* -* On p10 predecessors, quadword is handle differently then -* other instructions. -*/ - if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16) - align_size = HW_BREAKPOINT_SIZE_QUADWORD; - - hw_start_addr = ALIGN_DOWN(info->address, align_size); - hw_end_addr = ALIGN(info->address + info->len, align_size); - - return ((ea < hw_end_addr) && (ea + size > hw_start_addr)); -} - -/* - * If hw has multiple DAWR registers, we also need to check all - * dawrx constraint bits to confirm this is _really_ a valid event. - * If type is UNKNOWN, but privilege level matches, consider it as - * a positive match. - */ -static bool check_dawrx_constraints(struct pt_regs *regs, int type, - struct arch_hw_breakpoint *info) -{ - if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ)) - return false; - - /* -* The Cache Management instructions other than dcbz never -* cause a match. i.e. if type is CACHEOP, the instruction -* is dcbz, and dcbz is treated as Store. -*/ - if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type &
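The range helpers being moved by this patch distinguish between the bytes the user asked to watch and the wider, aligned window the hardware actually monitors. The following is a minimal userspace sketch of that distinction, not the kernel code itself: the `model_*` names, the `wp_model` struct, and the 8-byte and 16-byte alignment constants are assumptions made for illustration (the real values come from HW_BREAKPOINT_SIZE and HW_BREAKPOINT_SIZE_QUADWORD, whose definitions are not shown in this thread).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-ins for HW_BREAKPOINT_SIZE and HW_BREAKPOINT_SIZE_QUADWORD. */
#define MODEL_HW_BREAKPOINT_SIZE          8UL
#define MODEL_HW_BREAKPOINT_SIZE_QUADWORD 16UL

/* Power-of-two alignment helpers (hypothetical names). */
#define MODEL_ALIGN_DOWN(x, a) ((x) & ~((a) - 1UL))
#define MODEL_ALIGN(x, a)      (((x) + (a) - 1UL) & ~((a) - 1UL))

/* Only the two fields the overlap checks need. */
struct wp_model {
	unsigned long address; /* first watched byte */
	unsigned long len;     /* number of watched bytes */
};

/* Does [ea, ea + size) touch the bytes the user asked to watch? */
static bool model_user_range_overlaps(unsigned long ea, unsigned long size,
				      const struct wp_model *info)
{
	return ea < info->address + info->len && ea + size > info->address;
}

/* Does [ea, ea + size) touch the aligned window the hardware monitors? */
static bool model_hw_range_overlaps(unsigned long ea, unsigned long size,
				    const struct wp_model *info,
				    unsigned long align_size)
{
	unsigned long hw_start = MODEL_ALIGN_DOWN(info->address, align_size);
	unsigned long hw_end = MODEL_ALIGN(info->address + info->len, align_size);

	return ea < hw_end && ea + size > hw_start;
}

static void check_wp_overlap_model(void)
{
	const struct wp_model wp = { .address = 0x1004, .len = 2 };

	/* A 4-byte access at 0x1000 ends exactly where the watched bytes begin... */
	assert(!model_user_range_overlaps(0x1000, 4, &wp));
	/* ...but it still falls inside the 8-byte hardware window [0x1000, 0x1008). */
	assert(model_hw_range_overlaps(0x1000, 4, &wp, MODEL_HW_BREAKPOINT_SIZE));

	/* A 1-byte access at 0x1005 hits the watched bytes directly. */
	assert(model_user_range_overlaps(0x1005, 1, &wp));

	/* A 16-byte access at 0x1008 misses the 8-byte window but hits the 16-byte one. */
	assert(!model_hw_range_overlaps(0x1008, 16, &wp, MODEL_HW_BREAKPOINT_SIZE));
	assert(model_hw_range_overlaps(0x1008, 16, &wp, MODEL_HW_BREAKPOINT_SIZE_QUADWORD));
}
```

Calling `check_wp_overlap_model()` from a small `main()` exercises the cases above. The last pair of assertions mirrors the reason the quoted code switches `align_size` for 16-byte accesses on p10 predecessors.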
[PATCH v6 6/8] powerpc/watchpoint: Add hw_len wherever missing
There are a couple of places where we set len but not hw_len. For ptrace/perf watchpoints, when CONFIG_HAVE_HW_BREAKPOINT=Y, hw_len will be calculated and set internally while parsing the watchpoint. But when CONFIG_HAVE_HW_BREAKPOINT=N, we need to set 'hw_len' manually. Similarly for xmon, hw_len needs to be set directly.

Fixes: b57aeab811db ("powerpc/watchpoint: Fix length calculation for unaligned target")
Signed-off-by: Ravi Bangoria
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 1 +
 arch/powerpc/xmon/xmon.c                  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index c9122ed91340..48c52426af80 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -219,6 +219,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct ppc_hw_breakpoint *bp_inf
 	brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
 	brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
 	brk.len = DABR_MAX_LEN;
+	brk.hw_len = DABR_MAX_LEN;
 	if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
 		brk.type |= HW_BRK_TYPE_READ;
 	if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_WRITE)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index df7bca00f5ec..55c43a6c9111 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -969,6 +969,7 @@ static void insert_cpu_bpts(void)
 			brk.address = dabr[i].address;
 			brk.type = (dabr[i].enabled & HW_BRK_TYPE_DABR) | HW_BRK_TYPE_PRIV_ALL;
 			brk.len = 8;
+			brk.hw_len = 8;
 			__set_breakpoint(i, &brk);
 		}
 	}
--
2.26.2
[tip:master] BUILD SUCCESS 43b00b155cc21855de77ec14c31fdfc2a43c9c0d
allmodconfig powerpc allnoconfig x86_64 randconfig-a004-20200901 x86_64 randconfig-a003-20200901 x86_64 randconfig-a001-20200901 x86_64 randconfig-a002-20200901 x86_64 randconfig-a006-20200901 x86_64 randconfig-a005-20200901 i386 randconfig-a004-20200901 i386 randconfig-a005-20200901 i386 randconfig-a006-20200901 i386 randconfig-a002-20200901 i386 randconfig-a001-20200901 i386 randconfig-a003-20200901 i386 randconfig-a016-20200901 i386 randconfig-a015-20200901 i386 randconfig-a011-20200901 i386 randconfig-a013-20200901 i386 randconfig-a014-20200901 i386 randconfig-a012-20200901 i386 randconfig-a016-20200902 i386 randconfig-a015-20200902 i386 randconfig-a011-20200902 i386 randconfig-a013-20200902 i386 randconfig-a014-20200902 i386 randconfig-a012-20200902 riscvallyesconfig riscv defconfig riscvallmodconfig x86_64 rhel x86_64 allyesconfig x86_64rhel-7.6-kselftests x86_64 rhel-8.3 x86_64 kexec clang tested configs: x86_64 randconfig-a013-20200901 x86_64 randconfig-a016-20200901 x86_64 randconfig-a011-20200901 x86_64 randconfig-a012-20200901 x86_64 randconfig-a015-20200901 x86_64 randconfig-a014-20200901 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH v2 10/11] lockdep: Only trace IRQ edges
On Fri, Aug 21, 2020 at 10:47:48AM +0200, Peter Zijlstra wrote: > From: Nicholas Piggin > > Problem: > > raw_local_irq_save(); > local_irq_save(); > ... > local_irq_restore(); > raw_local_irq_restore(); > > existing instances: > > - lock_acquire() > raw_local_irq_save() > __lock_acquire() >arch_spin_lock(_lock) > pv_wait() := kvm_wait() (same or worse for Xen/HyperV) >local_irq_save() > > - trace_clock_global() > raw_local_irq_save() > arch_spin_lock() >pv_wait() := kvm_wait() >local_irq_save() > > - apic_retrigger_irq() > raw_local_irq_save() > apic->send_IPI() := default_send_IPI_single_phys() >local_irq_save() > > Possible solutions: > > A) make it work by enabling the tracing inside raw_*() > B) make it work by keeping tracing disabled inside raw_*() > > Now, given that the only reason to use the raw_* variant is because you don't > want tracing, A) seems like a weird option (although it can be done), so we > pick B) and declare any code that ends up doing: > > raw_local_irq_save() > local_irq_save() > lockdep_assert_irqs_disabled(); > > broken. AFAICT this problem has existed forever, the only reason it came > up is because I changed IRQ tracing vs lockdep recursion and the first > instance is fairly common, the other cases hardly ever happen. > On sparc64, this patch results in the traceback below. The traceback is gone after reverting the patch. 
Guenter --- [0.00] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:4875 check_flags.part.39+0x280/0x2a0 [0.00] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) [0.00] Modules linked in: [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc3 #1 [0.00] Call Trace: [0.00] [<00469890>] __warn+0xb0/0xe0 [0.00] [<004698fc>] warn_slowpath_fmt+0x3c/0x80 [0.00] [<004cfce0>] check_flags.part.39+0x280/0x2a0 [0.00] [<004cff18>] lock_acquire+0x218/0x4e0 [0.00] [<00d740c8>] _raw_spin_lock+0x28/0x40 [0.00] [<009870f4>] p1275_cmd_direct+0x14/0x60 [0.00] [<009872cc>] prom_getproplen+0x4c/0x60 [0.00] [<00987308>] prom_getproperty+0x8/0x80 [0.00] [<00987390>] prom_getint+0x10/0x40 [0.00] [<017df4b4>] prom_init+0x38/0x8c [0.00] [<00d6b558>] tlb_fixup_done+0x44/0x6c [0.00] [] 0xffd0e930 [0.00] irq event stamp: 1 [0.00] hardirqs last enabled at (1): [<00987124>] p1275_cmd_direct+0x44/0x60 [0.00] hardirqs last disabled at (0): [<>] 0x0 [0.00] softirqs last enabled at (0): [<>] 0x0 [0.00] softirqs last disabled at (0): [<>] 0x0 [0.00] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0 [0.00] ---[ end trace ]--- [0.00] possible reason: unannotated irqs-off. 
--- bisect log: # bad: [f75aef392f869018f78cfedf3c320a6b3fcfda6b] Linux 5.9-rc3 # good: [1127b219ce9481c84edad9711626d856127d5e51] Merge tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux git bisect start 'f75aef392f86' '1127b219ce94' # good: [8bb5021cc2ee5d5dd129a9f2f5ad2bb76eea297d] Merge tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect good 8bb5021cc2ee5d5dd129a9f2f5ad2bb76eea297d # good: [ceb2465c51195967f11f6507538579816ac67cb8] Merge tag 'irqchip-fixes-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent git bisect good ceb2465c51195967f11f6507538579816ac67cb8 # bad: [b69bea8a657b681442765b06be92a2607b1bd875] Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad b69bea8a657b681442765b06be92a2607b1bd875 # good: [00b0ed2d4997af6d0a93edef820386951fd66d94] locking/lockdep: Cleanup git bisect good 00b0ed2d4997af6d0a93edef820386951fd66d94 # bad: [044d0d6de9f50192f9697583504a382347ee95ca] lockdep: Only trace IRQ edges git bisect bad 044d0d6de9f50192f9697583504a382347ee95ca # good: [021c109330ebc1f54b546c63a078ea3c31356ecb] arm64: Implement arch_irqs_disabled() git bisect good 021c109330ebc1f54b546c63a078ea3c31356ecb # good: [99dc56feb7932020502d40107a712fa302b32082] mips: Implement arch_irqs_disabled() git bisect good 99dc56feb7932020502d40107a712fa302b32082 # first bad commit: [044d0d6de9f50192f9697583504a382347ee95ca] lockdep: Only trace IRQ edges
Re: [PATCH] tmpfs: Restore functionality of nr_inodes=0
On Tue, 1 Sep 2020, Byron Stanoszek wrote: > Commit e809d5f0b5c9 ("tmpfs: per-superblock i_ino support") made changes to > shmem_reserve_inode() in mm/shmem.c, however the original test for > (sbinfo->max_inodes) got dropped. This causes mounting tmpfs with option > nr_inodes=0 to fail: > > # mount -ttmpfs -onr_inodes=0 none /ext0 > mount: /ext0: mount(2) system call failed: Cannot allocate memory. > > This patch restores the nr_inodes=0 functionality. > > Fixes: e809d5f0b5c9 ("tmpfs: per-superblock i_ino support") > Cc: Chris Down > Signed-off-by: Byron Stanoszek Yikes, thank you Byron, very bad of me not to have spotted that: Acked-by: Hugh Dickins I've taken a quick look to see how I missed it: yes, I'd compared against my own tree, knew I had to come back here sometime to replace the SB_KERNMOUNT test by a max_inodes test like I had, to restore the performance of nr_inodes=0; but thought the SB_KERNMOUNT test was good enough for now - without realizing the effect on the code below it. The error does seem to be localized just to this block, yes. Many thanks. > --- > mm/shmem.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/mm/shmem.c b/mm/shmem.c > index 271548ca20f3..8e2b35ba93ad 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -279,11 +279,13 @@ static int shmem_reserve_inode(struct super_block *sb, > ino_t *inop) > > if (!(sb->s_flags & SB_KERNMOUNT)) { > spin_lock(>stat_lock); > - if (!sbinfo->free_inodes) { > - spin_unlock(>stat_lock); > - return -ENOSPC; > + if (sbinfo->max_inodes) { > + if (!sbinfo->free_inodes) { > + spin_unlock(>stat_lock); > + return -ENOSPC; > + } > + sbinfo->free_inodes--; > } > - sbinfo->free_inodes--; > if (inop) { > ino = sbinfo->next_ino++; > if (unlikely(is_zero_ino(ino))) > -- > 2.28.0
[PATCH] tmpfs: Restore functionality of nr_inodes=0
Commit e809d5f0b5c9 ("tmpfs: per-superblock i_ino support") made changes to shmem_reserve_inode() in mm/shmem.c; however, the original test for (sbinfo->max_inodes) got dropped. This causes mounting tmpfs with option nr_inodes=0 to fail:

  # mount -ttmpfs -onr_inodes=0 none /ext0
  mount: /ext0: mount(2) system call failed: Cannot allocate memory.

This patch restores the nr_inodes=0 functionality.

Fixes: e809d5f0b5c9 ("tmpfs: per-superblock i_ino support")
Cc: Chris Down
Signed-off-by: Byron Stanoszek
---
 mm/shmem.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 271548ca20f3..8e2b35ba93ad 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -279,11 +279,13 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
 
 	if (!(sb->s_flags & SB_KERNMOUNT)) {
 		spin_lock(&sbinfo->stat_lock);
-		if (!sbinfo->free_inodes) {
-			spin_unlock(&sbinfo->stat_lock);
-			return -ENOSPC;
+		if (sbinfo->max_inodes) {
+			if (!sbinfo->free_inodes) {
+				spin_unlock(&sbinfo->stat_lock);
+				return -ENOSPC;
+			}
+			sbinfo->free_inodes--;
 		}
-		sbinfo->free_inodes--;
 		if (inop) {
 			ino = sbinfo->next_ino++;
 			if (unlikely(is_zero_ino(ino)))
--
2.28.0
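To make the regression concrete, here is a small userspace model of the accounting logic before and after the patch. It is only a sketch under stated assumptions: `sb_model` and the `reserve_inode_*` functions are hypothetical names, locking and inode-number allocation are left out, and it assumes that mounting with nr_inodes=0 leaves both max_inodes and free_inodes at zero. The -ENOSPC modelled here is what the reservation step returns; the "Cannot allocate memory" seen at mount time is presumably produced further up the call chain.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the two shmem_sb_info fields involved. */
struct sb_model {
	unsigned long max_inodes;  /* 0 is taken to mean "no inode limit" */
	unsigned long free_inodes; /* only meaningful when max_inodes != 0 */
};

/* Behaviour before the patch: free_inodes is checked unconditionally. */
static int reserve_inode_before(struct sb_model *sb)
{
	if (!sb->free_inodes)
		return -ENOSPC;
	sb->free_inodes--;
	return 0;
}

/* Behaviour after the patch: accounting applies only when a limit is set. */
static int reserve_inode_after(struct sb_model *sb)
{
	if (sb->max_inodes) {
		if (!sb->free_inodes)
			return -ENOSPC;
		sb->free_inodes--;
	}
	return 0;
}

static void check_reserve_inode_model(void)
{
	struct sb_model unlimited = { .max_inodes = 0, .free_inodes = 0 };
	struct sb_model limited = { .max_inodes = 1, .free_inodes = 1 };

	/* With no limit configured, the old logic refuses the very first inode. */
	assert(reserve_inode_before(&unlimited) == -ENOSPC);
	/* The restored max_inodes test lets it through and leaves the counter alone. */
	assert(reserve_inode_after(&unlimited) == 0);
	assert(unlimited.free_inodes == 0);

	/* A real limit is still enforced: one inode succeeds, the next fails. */
	assert(reserve_inode_after(&limited) == 0);
	assert(reserve_inode_after(&limited) == -ENOSPC);
}
```

Calling `check_reserve_inode_model()` from a small `main()` runs the checks. Skipping the decrement in the unlimited case also keeps free_inodes from wrapping around below zero.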
Re: [PATCH 4/7] arm64: dts: ti: k3-am65-base-board Use generic camera for node name instead of ov5640
On 9/1/20 5:30 PM, Nishanth Menon wrote: > Use camera@ naming for nodes following standard conventions of device > tree (section 2.2.2 Generic Names recommendation in [1]). > > [1] https://github.com/devicetree-org/devicetree-specification/tree/v0.3 > > Suggested-by: Suman Anna > Signed-off-by: Nishanth Menon Acked-by: Suman Anna > --- > arch/arm64/boot/dts/ti/k3-am654-base-board.dts | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/boot/dts/ti/k3-am654-base-board.dts > b/arch/arm64/boot/dts/ti/k3-am654-base-board.dts > index b8a8a0fcb8af..86c9074cb070 100644 > --- a/arch/arm64/boot/dts/ti/k3-am654-base-board.dts > +++ b/arch/arm64/boot/dts/ti/k3-am654-base-board.dts > @@ -257,7 +257,7 @@ > pinctrl-0 = <_i2c1_pins_default>; > clock-frequency = <40>; > > - ov5640@3c { > + ov5640: camera@3c { > compatible = "ovti,ov5640"; > reg = <0x3c>; > >
Re: [PATCH 1/7] arm64: dts: ti: k3-am65*: Use generic gpio for node names
On 9/1/20 5:30 PM, Nishanth Menon wrote: > Use gpio@ naming for nodes following standard conventions of device > tree (section 2.2.2 Generic Names recommendation in [1]). > > [1] https://github.com/devicetree-org/devicetree-specification/tree/v0.3 > > Suggested-by: Suman Anna > Signed-off-by: Nishanth Menon Acked-by: Suman Anna > --- > arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 4 ++-- > arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi | 2 +- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > index 76e0edc4ad5c..336d09d6fec7 100644 > --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > @@ -661,7 +661,7 @@ > }; > }; > > - main_gpio0: main_gpio0@60 { > + main_gpio0: gpio@60 { > compatible = "ti,am654-gpio", "ti,keystone-gpio"; > reg = <0x0 0x60 0x0 0x100>; > gpio-controller; > @@ -676,7 +676,7 @@ > clock-names = "gpio"; > }; > > - main_gpio1: main_gpio1@601000 { > + main_gpio1: gpio@601000 { > compatible = "ti,am654-gpio", "ti,keystone-gpio"; > reg = <0x0 0x601000 0x0 0x100>; > gpio-controller; > diff --git a/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > b/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > index a1ffe88d9664..0765700a8ba8 100644 > --- a/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > +++ b/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > @@ -80,7 +80,7 @@ > ti,interrupt-ranges = <0 712 16>; > }; > > - wkup_gpio0: wkup_gpio0@4211 { > + wkup_gpio0: gpio@4211 { > compatible = "ti,am654-gpio", "ti,keystone-gpio"; > reg = <0x4211 0x100>; > gpio-controller; >
Re: [PATCH 3/7] arm64: dts: ti: k3-*: Use generic pinctrl for node names
On 9/1/20 5:30 PM, Nishanth Menon wrote: > Use pinctrl@ naming for nodes following standard conventions of device > tree (section 2.2.2 Generic Names recommendation in [1]). > > [1] https://github.com/devicetree-org/devicetree-specification/tree/v0.3 > > Suggested-by: Suman Anna > Signed-off-by: Nishanth Menon Acked-by: Suman Anna > --- > arch/arm64/boot/dts/ti/k3-am65-main.dtsi| 4 ++-- > arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi | 2 +- > arch/arm64/boot/dts/ti/k3-j721e-main.dtsi | 2 +- > arch/arm64/boot/dts/ti/k3-j721e-mcu-wakeup.dtsi | 2 +- > 4 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > index 03e28fc256de..9c96e3f58c86 100644 > --- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > +++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi > @@ -134,7 +134,7 @@ > }; > }; > > - main_pmx0: pinmux@11c000 { > + main_pmx0: pinctrl@11c000 { > compatible = "pinctrl-single"; > reg = <0x0 0x11c000 0x0 0x2e4>; > #pinctrl-cells = <1>; > @@ -142,7 +142,7 @@ > pinctrl-single,function-mask = <0x>; > }; > > - main_pmx1: pinmux@11c2e8 { > + main_pmx1: pinctrl@11c2e8 { > compatible = "pinctrl-single"; > reg = <0x0 0x11c2e8 0x0 0x24>; > #pinctrl-cells = <1>; > diff --git a/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > b/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > index 0765700a8ba8..bb498be2f0a4 100644 > --- a/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > +++ b/arch/arm64/boot/dts/ti/k3-am65-wakeup.dtsi > @@ -39,7 +39,7 @@ > reg = <0x4314 0x4>; > }; > > - wkup_pmx0: pinmux@4301c000 { > + wkup_pmx0: pinctrl@4301c000 { > compatible = "pinctrl-single"; > reg = <0x4301c000 0x118>; > #pinctrl-cells = <1>; > diff --git a/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi > b/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi > index 00a36a14efe7..1d2a7c05b6f3 100644 > --- a/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi > +++ b/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi > @@ -327,7 +327,7 @@ > }; > }; > > - main_pmx0: 
pinmux@11c000 { > + main_pmx0: pinctrl@11c000 { > compatible = "pinctrl-single"; > /* Proxy 0 addressing */ > reg = <0x0 0x11c000 0x0 0x2b4>; > diff --git a/arch/arm64/boot/dts/ti/k3-j721e-mcu-wakeup.dtsi > b/arch/arm64/boot/dts/ti/k3-j721e-mcu-wakeup.dtsi > index c4a48e8d420a..9ad0266598ad 100644 > --- a/arch/arm64/boot/dts/ti/k3-j721e-mcu-wakeup.dtsi > +++ b/arch/arm64/boot/dts/ti/k3-j721e-mcu-wakeup.dtsi > @@ -53,7 +53,7 @@ > reg = <0x0 0x4314 0x0 0x4>; > }; > > - wkup_pmx0: pinmux@4301c000 { > + wkup_pmx0: pinctrl@4301c000 { > compatible = "pinctrl-single"; > /* Proxy 0 addressing */ > reg = <0x00 0x4301c000 0x00 0x178>; >
[PATCH] mmc: sdhci-of-esdhc: Don't walk device-tree on every interrupt
Commit b214fe592ab7 ("mmc: sdhci-of-esdhc: add erratum eSDHC7 support") added code to check for a specific compatible string in the device-tree on every esdhc interrupt. We know that if it's present the compatible string will be found on the sdhc host. Instead of walking the device-tree, go directly to the sdhc host's device and use of_device_is_compatible(). Signed-off-by: Chris Packham --- I found this in passing while trying to track down another issue using ftrace. I found it odd that I was seeing a lot of calls to __of_device_is_compatible() coming from esdhc_irq() (the fact that this interrupt is going off on my board is also odd, but that's a different story). drivers/mmc/host/sdhci-of-esdhc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c index 7c73d243dc6c..11c8c522d623 100644 --- a/drivers/mmc/host/sdhci-of-esdhc.c +++ b/drivers/mmc/host/sdhci-of-esdhc.c @@ -1177,10 +1177,11 @@ static void esdhc_set_uhs_signaling(struct sdhci_host *host, static u32 esdhc_irq(struct sdhci_host *host, u32 intmask) { + struct device *dev = mmc_dev(host->mmc); + struct device_node *np = dev->of_node; u32 command; - if (of_find_compatible_node(NULL, NULL, - "fsl,p2020-esdhc")) { + if (of_device_is_compatible(np, "fsl,p2020-esdhc")) { command = SDHCI_GET_CMD(sdhci_readw(host, SDHCI_COMMAND)); if (command == MMC_WRITE_MULTIPLE_BLOCK && -- 2.28.0
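The cost difference described in the commit message can be pictured with a toy model. The sketch below is not the OF API: `toy_node`, `toy_find_compatible()` and `toy_node_is_compatible()` are hypothetical stand-ins, the tree is flattened into an array, and all compatible strings other than "fsl,p2020-esdhc" are invented for illustration. It only shows that a global search touches every node until it finds a match, while checking the host's own node is a single comparison. As a separate point from the kernel documentation, of_find_compatible_node() returns its node with an elevated reference count that the caller is expected to drop with of_node_put().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node: one compatible string, no hierarchy. */
struct toy_node {
	const char *compatible;
};

/* Counts how many string comparisons the lookups perform. */
static unsigned long toy_strcmp_calls;

/* Model of "is this particular node compatible?": one comparison. */
static int toy_node_is_compatible(const struct toy_node *np, const char *compat)
{
	toy_strcmp_calls++;
	return np && strcmp(np->compatible, compat) == 0;
}

/* Model of "search the whole tree for a compatible node": up to n comparisons. */
static const struct toy_node *toy_find_compatible(const struct toy_node *nodes,
						  size_t n, const char *compat)
{
	for (size_t i = 0; i < n; i++)
		if (toy_node_is_compatible(&nodes[i], compat))
			return &nodes[i];
	return NULL;
}

static void check_toy_lookup(void)
{
	/* Invented example tree; only the last entry comes from the patch. */
	const struct toy_node tree[] = {
		{ "vendor,example-cpu" },
		{ "vendor,example-uart" },
		{ "vendor,example-i2c" },
		{ "fsl,p2020-esdhc" },
	};
	const struct toy_node *host = &tree[3];

	/* Global search: walks past three unrelated nodes before matching. */
	toy_strcmp_calls = 0;
	assert(toy_find_compatible(tree, 4, "fsl,p2020-esdhc") == host);
	assert(toy_strcmp_calls == 4);

	/* Direct check on the host's own node: a single comparison. */
	toy_strcmp_calls = 0;
	assert(toy_node_is_compatible(host, "fsl,p2020-esdhc"));
	assert(toy_strcmp_calls == 1);
}
```

Calling `check_toy_lookup()` from a small `main()` runs the comparison. On a real system the tree has many more nodes and the check runs on every interrupt, so the gap is correspondingly larger.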
Re: [PATCH v5 00/14] irqchip: Fix potential resource leaks
On 07/06/2020 03:30 PM, Marc Zyngier wrote:
> On 2020-07-06 02:19, Tiezhu Yang wrote:
>> When I test the irqchip code of Loongson, I read the related code of other
>> chips in drivers/irqchip and I find some potential resource leaks in the
>> error path, I think it is better to fix them.
>>
>> v2:
>>   - Split the first patch into a new patch series which
>>     includes small patches and add "Fixes" tag
>>   - Use "goto" label to handle error path in some patches
>>
>> v3:
>>   - Add missed variable "ret" in the patch #5 and #13
>>
>> v4:
>>   - Modify the commit message of each patch suggested by Markus Elfring
>>   - Make "irq_domain_remove(root_domain)" under CONFIG_SMP in patch #3
>>   - Add a return statement before goto label in patch #4
>>
>> v5:
>>   - Modify the commit messages and do some code cleanups
>
> Please stop replying to Markus Elfring, and give people who actually
> care a chance to review this code. Elfring will keep asking you to make
> absolutely pointless changes until you are blue in the face

Hi Marc,

Any comments?
Could you please apply this patch series?

Thanks,
Tiezhu

> Thanks,
>
> M.
[PATCH v2] arm64: dts: qcom: sc7180: Add lpass cpu node for I2S driver
From: Ajit Pandey Add the I2S controller node to sc7180 dtsi. Add pinmux for primary and secondary I2S. Signed-off-by: Ajit Pandey Signed-off-by: Cheng-Yi Chiang Signed-off-by: Srinivasa Rao Mandadapu --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 69 1 file changed, 69 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..db60ca5 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -676,6 +676,36 @@ }; }; + lpass_cpu: lpass@62f0 { + compatible = "qcom,sc7180-lpass-cpu"; + + reg = <0 0x62f0 0 0x29000>; + reg-names = "lpass-lpaif"; + + iommus = <_smmu 0x1020 0>; + + power-domains = <_hm LPASS_CORE_HM_GDSCR>; + + clocks = < GCC_LPASS_CFG_NOC_SWAY_CLK>, +< LPASS_AUDIO_CORE_CORE_CLK>, +< LPASS_AUDIO_CORE_EXT_MCLK0_CLK>, +< LPASS_AUDIO_CORE_SYSNOC_MPORT_CORE_CLK>, +< LPASS_AUDIO_CORE_LPAIF_PRI_IBIT_CLK>, +< LPASS_AUDIO_CORE_LPAIF_SEC_IBIT_CLK>; + + clock-names = "pcnoc-sway-clk", "audio-core", + "mclk0", "pcnoc-mport-clk", + "mi2s-bit-clk0", "mi2s-bit-clk1"; + + + #sound-dai-cells = <1>; + #address-cells = <1>; + #size-cells = <0>; + + interrupts = ; + interrupt-names = "lpass-irq-lpaif"; + }; + sdhc_1: sdhci@7c4000 { compatible = "qcom,sc7180-sdhci", "qcom,sdhci-msm-v5"; reg = <0 0x7c4000 0 0x1000>, @@ -1721,6 +1751,45 @@ }; }; + sec_mi2s_active: sec-mi2s-active { + pinmux { + pins = "gpio49", "gpio50", "gpio51"; + function = "mi2s_1"; + }; + + pinconf { + pins = "gpio49", "gpio50", "gpio51";; + drive-strength = <8>; + bias-pull-up; + }; + }; + + pri_mi2s_active: pri-mi2s-active { + pinmux { + pins = "gpio53", "gpio54", "gpio55", "gpio56"; + function = "mi2s_0"; + }; + + pinconf { + pins = "gpio53", "gpio54", "gpio55", "gpio56"; + drive-strength = <8>; + bias-pull-up; + }; + }; + + pri_mi2s_mclk_active: pri-mi2s-mclk-active { + pinmux { + pins = "gpio57"; + function = "lpass_ext"; + }; + + pinconf { + pins = "gpio57"; + drive-strength = <8>; + bias-pull-up; + }; + }; + 
sdc1_on: sdc1-on { pinconf-clk { pins = "sdc1_clk"; -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH kcsan 6/9] tools/memory-model: Expand the cheatsheet.txt notion of relaxed
On Mon, Aug 31, 2020 at 11:20:34AM -0700, paul...@kernel.org wrote: > From: "Paul E. McKenney" > > This commit adds a key entry enumerating the various types of relaxed > operations. > > Signed-off-by: Paul E. McKenney > --- > tools/memory-model/Documentation/cheatsheet.txt | 27 > ++--- > 1 file changed, 15 insertions(+), 12 deletions(-) > > diff --git a/tools/memory-model/Documentation/cheatsheet.txt > b/tools/memory-model/Documentation/cheatsheet.txt > index 33ba98d..31b814d 100644 > --- a/tools/memory-model/Documentation/cheatsheet.txt > +++ b/tools/memory-model/Documentation/cheatsheet.txt > @@ -5,7 +5,7 @@ > > Store, e.g., WRITE_ONCE()Y > Y > Load, e.g., READ_ONCE() Y Y Y > Y > -Unsuccessful RMW operation Y Y Y > Y > +Relaxed operationY Y Y > Y > rcu_dereference()Y Y Y > Y > Successful *_acquire() R Y Y Y YY > Y > Successful *_release() CY YY W > Y > @@ -17,14 +17,17 @@ smp_mb__before_atomic() CPY YY > a a a aY > smp_mb__after_atomic()CPa aYY Y Y YY > > > -Key: C: Ordering is cumulative > - P: Ordering propagates > - R: Read, for example, READ_ONCE(), or read portion of RMW > - W: Write, for example, WRITE_ONCE(), or write portion of RMW > - Y: Provides ordering > - a: Provides ordering given intervening RMW atomic operation > - DR: Dependent read (address dependency) > - DW: Dependent write (address, data, or control dependency) > - RMW:Atomic read-modify-write operation > - SELF: Orders self, as opposed to accesses before and/or after > - SV: Orders later accesses to the same variable > +Key: Relaxed: A relaxed operation is either a *_relaxed() RMW > + operation, an unsuccessful RMW operation, or one of > + the atomic_read() and atomic_set() family of operations. 
To be accurate, atomic_set() doesn't return any value, so it cannot be ordered against DR and DW ;-) I think we can split the Relaxed family into two groups: void Relaxed: atomic_set() or atomic RMW operations that don't return any value (e.g atomic_inc()) non-void Relaxed: a *_relaxed() RMW operation, an unsuccessful RMW operation, or atomic_read(). And "void Relaxed" is similar to WRITE_ONCE(), only has "Self" and "SV" equal "Y", while "non-void Relaxed" plays the same rule as "Relaxed" in this patch. Thoughts? Regards, Boqun > + C:Ordering is cumulative > + P:Ordering propagates > + R:Read, for example, READ_ONCE(), or read portion of RMW > + W:Write, for example, WRITE_ONCE(), or write portion of RMW > + Y:Provides ordering > + a:Provides ordering given intervening RMW atomic operation > + DR: Dependent read (address dependency) > + DW: Dependent write (address, data, or control dependency) > + RMW: Atomic read-modify-write operation > + SELF: Orders self, as opposed to accesses before and/or after > + SV: Orders later accesses to the same variable > -- > 2.9.5 >
Re: [PATCH v2 11/11] lockdep,trace: Expose tracepoints
On Fri, Aug 21, 2020 at 10:47:49AM +0200, Peter Zijlstra wrote:
> The lockdep tracepoints are under the lockdep recursion counter, this
> has a bunch of nasty side effects:
>
> - TRACE_IRQFLAGS doesn't work across the entire tracepoint
>
> - RCU-lockdep doesn't see the tracepoints either, hiding numerous
>   "suspicious RCU usage" warnings.
>
> Pull the trace_lock_*() tracepoints completely out from under the
> lockdep recursion handling and completely rely on the trace level
> recursion handling -- also, tracing *SHOULD* not be taking locks in any
> case.
>

Wonder what is worse - the problem or its fix. This patch results in a number of WARNING backtraces for several architectures/platforms. Reverting it fixes the problems.

Guenter

---
arm:

[ 27.055084]
[ 27.056213] =
[ 27.056274] WARNING: suspicious RCU usage
[ 27.056335] 5.9.0-rc3 #1 Not tainted
[ 27.056396] -
[ 27.056457] include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage!
[ 27.056549]
[ 27.056549] other info that might help us debug this:
[ 27.056549]
[ 27.056640]
[ 27.056640] rcu_scheduler_active = 2, debug_locks = 1
[ 27.056732] RCU used illegally from extended quiescent state!
[ 27.056793] no locks held by swapper/0/0.
[ 27.056854] [ 27.056854] stack backtrace: [ 27.056915] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #1 [ 27.057006] Hardware name: Generic OMAP3-GP (Flattened Device Tree) [ 27.057098] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 27.057189] [] (show_stack) from [] (dump_stack+0xd8/0xf8) [ 27.057312] [] (dump_stack) from [] (lock_acquire+0x4d8/0x4dc) [ 27.057464] [] (lock_acquire) from [] (_raw_spin_lock_irqsave+0x58/0x74) [ 27.057617] [] (_raw_spin_lock_irqsave) from [] (pwrdm_lock+0x10/0x18) [ 27.057739] [] (pwrdm_lock) from [] (clkdm_deny_idle+0x10/0x24) [ 27.057891] [] (clkdm_deny_idle) from [] (omap3_enter_idle_bm+0xd4/0x1b8) [ 27.058044] [] (omap3_enter_idle_bm) from [] (cpuidle_enter_state+0x16c/0x620) [ 27.058197] [] (cpuidle_enter_state) from [] (cpuidle_enter+0x50/0x54) [ 27.058349] [] (cpuidle_enter) from [] (do_idle+0x228/0x2b8) [ 27.058471] [] (do_idle) from [] (cpu_startup_entry+0x18/0x1c) [ 27.058624] [] (cpu_startup_entry) from [] (start_kernel+0x518/0x558) [ 27.059692] [ 27.059753] = [ 27.059753] WARNING: suspicious RCU usage [ 27.059753] 5.9.0-rc3 #1 Not tainted [ 27.059753] - [ 27.059753] include/trace/events/lock.h:58 suspicious rcu_dereference_check() usage! [ 27.059783] [ 27.059783] other info that might help us debug this: [ 27.059783] [ 27.059783] [ 27.059783] rcu_scheduler_active = 2, debug_locks = 1 [ 27.059783] RCU used illegally from extended quiescent state! 
[ 27.059783] 1 lock held by swapper/0/0: [ 27.059814] #0: c1e30f3c (logbuf_lock){-...}-{2:2}, at: vprintk_emit+0x60/0x38c [ 27.059906] [ 27.059936] stack backtrace: [ 27.059936] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #1 [ 27.059936] Hardware name: Generic OMAP3-GP (Flattened Device Tree) [ 27.059936] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 27.059936] [] (show_stack) from [] (dump_stack+0xd8/0xf8) [ 27.059936] [] (dump_stack) from [] (lock_release+0x41c/0x420) [ 27.059936] [] (lock_release) from [] (_raw_spin_unlock+0x14/0x38) [ 27.059967] [] (_raw_spin_unlock) from [] (vprintk_emit+0xb4/0x38c) [ 27.059967] [] (vprintk_emit) from [] (vprintk_default+0x20/0x28) [ 27.059967] [] (vprintk_default) from [] (printk+0x30/0x5c) [ 27.059967] [] (printk) from [] (lockdep_rcu_suspicious+0x2c/0xec) [ 27.059967] [] (lockdep_rcu_suspicious) from [] (lock_acquire+0x4d8/0x4dc) [ 27.059967] [] (lock_acquire) from [] (_raw_spin_lock_irqsave+0x58/0x74) [ 27.059997] [] (_raw_spin_lock_irqsave) from [] (pwrdm_lock+0x10/0x18) [ 27.059997] [] (pwrdm_lock) from [] (clkdm_deny_idle+0x10/0x24) [ 27.059997] [] (clkdm_deny_idle) from [] (omap3_enter_idle_bm+0xd4/0x1b8) [ 27.059997] [] (omap3_enter_idle_bm) from [] (cpuidle_enter_state+0x16c/0x620) [ 27.059997] [] (cpuidle_enter_state) from [] (cpuidle_enter+0x50/0x54) [ 27.059997] [] (cpuidle_enter) from [] (do_idle+0x228/0x2b8) [ 27.059997] [] (do_idle) from [] (cpu_startup_entry+0x18/0x1c) [ 27.060028] [] (cpu_startup_entry) from [] (start_kernel+0x518/0x558) s390: [ 19.490586] = [ 19.490752] WARNING: suspicious RCU usage [ 19.490921] 5.9.0-rc3 #1 Not tainted [ 19.491086] - [ 19.491253] include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage! [ 19.491418] [ 19.491418] other info that might help us debug this: [ 19.491418] [ 19.491613] [ 19.491613] rcu_scheduler_active = 2, debug_locks = 1 [ 19.491779] RCU used illegally from extended quiescent state! [ 19.491950] no locks held by
Uli Behringer is a racist and a sexist. Worked at Music Tribe for 30 years (it's name has changed repeatedly).
Uli Behringer is a racist and a sexist. Worked at Music Tribe for 30 years (it's name has changed repeatedly) Cons of working at Music Tribe: Uli Behringer, the CEO, is a racist and a sexist. You can tell by the way he presents himself: plastic surgery to uphold his boyish youthful looks; originally natural blonde hair (now dyed since he's 60 and going grey); resplendent blue eyes (these have always been natural). The spitting image of past euro-centric times. He refuses to dye his hair a more world-representative black or dark brown, and refuses to wear black or brown contact lenses. He seems proud of his visage: that of a proto-typical (Swiss)-German. He is also proud of his German surname and emblazons it on everything: as if it was a coat-of-arms of a great House. When I worked for Music Tribe / Music Group / Behringer (the name keeps changing), Uli mostly wanted us to hire Engineers and Programmers. He was happy with the over-representation of white and european males in his companies. He ONLY talked to those (mostly white and european) male programmers and engineers. He ROUTINELY fired WHOLE OFFICES of non-white-non-engineer-non-programmers-non-males like the marketing departments. He always drove the factory workers hard, but was kinder (though demanding) of the white males: he had a sort of engineer-to-engineer camaraderie with them.. In a time when Responsible Global Companies are choosing to hire more representative candidates and women over white males: Uli Behringer continues to employ white-males in his companies. Additionally Uli Behringer loaths to cover advanced anti-viral treatments that are required by American workers: he believed that simply living a "normal" lifestyle would not-necessitate such treatments. This is why he closed the office in Seattle: he didn't want to pay for "LGBT related treatments that were a waste of money and their own fault anyway". He has a similar view of American women's "ways". 
The CEO: Uli Behringer rejects the modern world; this is in-part why he is recreating the synths of his youth (youth which he tries to keep a grasp of through vainglorious plastic surgery and constant physical workouts). This is also why he has hidden himself away in Asia for the past 30 years, and sheds light on the way his company (and it is HIS company: he runs everything: even though he has the money to hire managers to parcel out tasks). In his early youth America had not yet fully taken control over all of Europe, in terms of enforced morals. Once they did Uli fled to Asia where older male-centric morals abound (simply: Asia allows wealthy business men girls; while America imprisons any male that breaks said American rules protecting women.) Uli Behringer does NOT donate ANY money to Girls Not Brides campaign, nor to LGBT groups, nor to women-empowerment groups. He believes that by NOT donating he keeps prices low for his "real" customers. The fact of the matter is that these "real" customers: the bedroom musicians and synth freaks are mostly, demographically, white males. Uli Behringer prides himself in "not screwing them over". But by doing this, underprivledged groups and causes are not served. What is right and just in this world is for the white males of the opressive group to have to pay recompense to those who were and are opressed. In other companies these donation drives are built into the cost to the consumer. In Uli Behringer's company they are not: since he does not donate millions of dollars to said causes. By not donating to Girls Not Brides, Uli Behringer shows that he does not mind girls getting married young. By not donating to LGBT charities Uli Behringer shows that he does not particularly care about the plight of the actual good people of this world. In summary: If you support Racism, Sexism, race-to-the-bottom, consumer-price-point driven development, and incel-white-males who buy synthesizers. Then this is the company for you. 
If not: if you are an actually good person who rejects the sexist religions of old and embraces the new world where good people can prosper: avoid this company and campaign against it: your life may be on the line: if Uli undercuts your company you will be laid off and lose your health insurance and your anti-virals needed to live. It's THAT serious. Pros of working at Music Tribe: It's a good place if you are a programmer or an engineer, I guess. I eventually got old, had enough money, and wanted to do nothing so I resigned after 3 decades of a tireless race to the bottom. The name of the game it to screw over the other companies, more than make as much money as possible. Very different from other companies. It's good if you like that sort of thing. There are celebrations when other companies have to lay off their workers due to Uli undercutting them. He has a wry smile when he knows what's about to occur before we do. Suggestions to Management:
Re: [PATCH 0/7] arm64: dts: ti: k3-*: Squash up node_name_chars_strict warnings
On 02/09/20 4:00 am, Nishanth Menon wrote:
> Hi,
>
> This is one part of cleanups meant for make W=2 dtbs for 5.10 on TI
> dtbs. Hopefully we dont see node_name_chars_strict warnings anymore.
>
> As part of this cleanup, I ran a cross check of nodes that are
> part of K3 as of right now, Vs what is "generic" definition as per 0.3
> dt specification:
> https://pastebin.ubuntu.com/p/kp3g4ktBYp/
>
> I dont think the remaining have a good reference, at least in my
> subjective view.
>
> In possibly some cases, bootloaders may need to sync before doing DT
> fixup etc.
>
> Nishanth Menon (7):
>   arm64: dts: ti: k3-am65*: Use generic gpio for node names
>   arm64: dts: ti: k3-am65*: Use generic clock for serdes clock name
>   arm64: dts: ti: k3-*: Use generic pinctrl for node names
>   arm64: dts: ti: k3-am65-base-board Use generic camera for node name
>     instead of ov5640
>   arm64: dts: ti: k3-am65-wakeup: Use generic temperature-sensor for
>     node name
>   arm64: dts: ti: k3-*: Use generic adc for node names
>   arm64: dts: ti: k3-*: Fix up node_name_chars_strict errors

Series looks good to me,

Reviewed-by: Lokesh Vutla

Thanks and regards,
Lokesh
drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: cast to restricted __be32
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: b765a32a2e9170702467747e290614be072c4f76 commit: 088c5f0d1a7c7f01e668d9d2d75e7d93b43b7690 hinic: add generating mailbox random index support date: 4 weeks ago config: i386-randconfig-s001-20200902 (attached as .config) compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.2-191-g10164920-dirty git checkout 088c5f0d1a7c7f01e668d9d2d75e7d93b43b7690 # save the attached .config to linux build tree make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot sparse warnings: (new ones prefixed by >>) >> drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: >> cast to restricted __be32 >> drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: >> cast to restricted __be32 >> drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: >> cast to restricted __be32 >> drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: >> cast to restricted __be32 >> drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: >> cast to restricted __be32 >> drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:570:21: sparse: sparse: >> cast to restricted __be32 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:648:54: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void volatile [noderef] __iomem *addr @@ got unsigned char [usertype] * @@ drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:648:54: sparse: expected void volatile [noderef] __iomem *addr drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:648:54: sparse: got unsigned char [usertype] * drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:671:58: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void volatile [noderef] __iomem *addr @@ got unsigned char 
[usertype] * @@ drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:671:58: sparse: expected void volatile [noderef] __iomem *addr drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:671:58: sparse: got unsigned char [usertype] * drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:723:22: sparse: sparse: cast to restricted __be64 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:1164:25: sparse: sparse: incorrect type in assignment (different address spaces) @@ expected unsigned char [usertype] *data @@ got void [noderef] __iomem * @@ drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:1164:25: sparse: expected unsigned char [usertype] *data drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c:1164:25: sparse: got void [noderef] __iomem * # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088c5f0d1a7c7f01e668d9d2d75e7d93b43b7690 git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git fetch --no-tags linus master git checkout 088c5f0d1a7c7f01e668d9d2d75e7d93b43b7690 vim +570 drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c 541 
542 static bool check_vf_mbox_random_id(struct hinic_mbox_func_to_func *func_to_func, 543 u8 *header) 544 { 545 struct hinic_hwdev *hwdev = func_to_func->hwdev; 546 struct hinic_mbox_work *mbox_work = NULL; 547 u64 mbox_header = *((u64 *)header); 548 u16 offset, src; 549 u32 random_id; 550 int vf_in_pf; 551 552 src = HINIC_MBOX_HEADER_GET(mbox_header, SRC_GLB_FUNC_IDX); 553 554 if (IS_PF_OR_PPF_SRC(src) || !func_to_func->support_vf_random) 555 return true; 556 557 if (!HINIC_IS_PPF(hwdev->hwif)) { 558 offset = hinic_glb_pf_vf_offset(hwdev->hwif); 559 vf_in_pf = src -
[PATCH v11 4/4] scsi: ufs: Prepare HPB read for cached sub-region
This patch changes the read I/O to the HPB read I/O. If the logical address of the read I/O belongs to active sub-region, the HPB driver modifies the read I/O command to HPB read. It modifies the UPIU command of UFS instead of modifying the existing SCSI command. In the HPB version 1.0, the maximum read I/O size that can be converted to HPB read is 4KB. The dirty map of the active sub-region prevents an incorrect HPB read that has stale physical page number which is updated by previous write I/O. Acked-by: Avri Altman Reviewed-by: Bart Van Assche Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufshpb.c | 231 ++ 1 file changed, 231 insertions(+) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 2ac4cd6e4aa4..8ad2a711dd16 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -31,6 +31,29 @@ bool ufshpb_is_allowed(struct ufs_hba *hba) return !(hba->ufshpb_dev.hpb_disabled); } +static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn, +struct ufshpb_subregion *srgn) +{ + return rgn->rgn_state != HPB_RGN_INACTIVE && + srgn->srgn_state == HPB_SRGN_VALID; +} + +static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd) +{ + return req_op(cmd->request) == REQ_OP_READ; +} + +static bool ufshpb_is_write_discard_cmd(struct scsi_cmnd *cmd) +{ + return op_is_write(req_op(cmd->request)) || + op_is_discard(req_op(cmd->request)); +} + +static bool ufshpb_is_support_chunk(int transfer_len) +{ + return transfer_len <= HPB_MULTI_CHUNK_HIGH; +} + static bool ufshpb_is_general_lun(int lun) { return lun < UFS_UPIU_MAX_UNIT_NUM_ID; @@ -97,8 +120,216 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int state) atomic_set(>hpb_state, state); } +static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx, +int srgn_idx, int srgn_offset, int cnt) +{ + struct ufshpb_region *rgn; + struct ufshpb_subregion *srgn; + int set_bit_len; + int bitmap_len = hpb->entries_per_srgn; + +next_srgn: + rgn = hpb->rgn_tbl + rgn_idx; + 
srgn = rgn->srgn_tbl + srgn_idx; + + if ((srgn_offset + cnt) > bitmap_len) + set_bit_len = bitmap_len - srgn_offset; + else + set_bit_len = cnt; + + if (rgn->rgn_state != HPB_RGN_INACTIVE && + srgn->srgn_state == HPB_SRGN_VALID) + bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len); + + srgn_offset = 0; + if (++srgn_idx == hpb->srgns_per_rgn) { + srgn_idx = 0; + rgn_idx++; + } + + cnt -= set_bit_len; + if (cnt > 0) + goto next_srgn; + + WARN_ON(cnt < 0); +} + +static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx, + int srgn_idx, int srgn_offset, int cnt) +{ + struct ufshpb_region *rgn; + struct ufshpb_subregion *srgn; + int bitmap_len = hpb->entries_per_srgn; + int bit_len; + +next_srgn: + rgn = hpb->rgn_tbl + rgn_idx; + srgn = rgn->srgn_tbl + srgn_idx; + + if (!ufshpb_is_valid_srgn(rgn, srgn)) + return true; + + /* +* If the region state is active, mctx must be allocated. +* In this case, check whether the region is evicted or +* mctx allcation fail. +*/ + WARN_ON(!srgn->mctx); + + if ((srgn_offset + cnt) > bitmap_len) + bit_len = bitmap_len - srgn_offset; + else + bit_len = cnt; + + if (find_next_bit(srgn->mctx->ppn_dirty, + bit_len, srgn_offset) >= srgn_offset) + return true; + + srgn_offset = 0; + if (++srgn_idx == hpb->srgns_per_rgn) { + srgn_idx = 0; + rgn_idx++; + } + + cnt -= bit_len; + if (cnt > 0) + goto next_srgn; + + return false; +} + +static u64 ufshpb_get_ppn(struct ufshpb_lu *hpb, + struct ufshpb_map_ctx *mctx, int pos, int *error) +{ + u64 *ppn_table; + struct page *page; + int index, offset; + + index = pos / (PAGE_SIZE / HPB_ENTRY_SIZE); + offset = pos % (PAGE_SIZE / HPB_ENTRY_SIZE); + + page = mctx->m_page[index]; + if (unlikely(!page)) { + *error = -ENOMEM; + dev_err(>sdev_ufs_lu->sdev_dev, + "error. cannot find page in mctx\n"); + return 0; + } + + ppn_table = page_address(page); + if (unlikely(!ppn_table)) { + *error = -ENOMEM; + dev_err(>sdev_ufs_lu->sdev_dev, + "error. 
cannot get ppn_table\n"); + return 0; + } + + return ppn_table[offset]; +} + +static void +ufshpb_get_pos_from_lpn(struct ufshpb_lu *hpb, unsigned long lpn, int
Re: [PATCH kcsan 18/19] bitops, kcsan: Partially revert instrumentation for non-atomic bitops
Hi Paul and Marco, The whole update patchset looks good to me; just one question out of curiosity for this one, please see below: On Mon, Aug 31, 2020 at 11:18:04AM -0700, paul...@kernel.org wrote: > From: Marco Elver > > Previous to the change to distinguish read-write accesses, when > CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=y is set, KCSAN would consider > the non-atomic bitops as atomic. We want to partially revert to this > behaviour, but with one important distinction: report racing > modifications, since lost bits due to non-atomicity are certainly > possible. > > Given the operations here only modify a single bit, assuming > non-atomicity of the writer is sufficient may be reasonable for certain > usage (and follows the permissible nature of the "assume plain writes > atomic" rule). In other words: > > 1. We want non-atomic read-modify-write races to be reported; > this is accomplished by kcsan_check_read(), where any > concurrent write (atomic or not) will generate a report. > > 2. We do not want to report races with marked readers, but -do- > want to report races with unmarked readers; this is > accomplished by the instrument_write() ("assume atomic > write" with Kconfig option set). > Is there any code in kernel using the above assumption (i.e. non-atomicity of the writer is sufficient)? IOW, have you observed anything bad (e.g. an annoying false positive) after applying the read_write changes but without this patch? Regards, Boqun > With the above rules, when KCSAN_ASSUME_PLAIN_WRITES_ATOMIC is selected, > it is hoped that KCSAN's reporting behaviour is better aligned with > current expected permissible usage for non-atomic bitops. > > Note that, a side-effect of not telling KCSAN that the accesses are > read-writes, is that this information is not displayed in the access > summary in the report. It is, however, visible in inline-expanded stack > traces. 
For now, it does not make sense to introduce yet another special > case to KCSAN's runtime, only to cater to the case here. > > Cc: Dmitry Vyukov > Cc: Paul E. McKenney > Cc: Will Deacon > Cc: Arnd Bergmann > Cc: Daniel Axtens > Cc: Michael Ellerman > Cc: > Signed-off-by: Marco Elver > Signed-off-by: Paul E. McKenney > --- > .../asm-generic/bitops/instrumented-non-atomic.h | 30 > +++--- > 1 file changed, 27 insertions(+), 3 deletions(-) > > diff --git a/include/asm-generic/bitops/instrumented-non-atomic.h > b/include/asm-generic/bitops/instrumented-non-atomic.h > index f86234c..37363d5 100644 > --- a/include/asm-generic/bitops/instrumented-non-atomic.h > +++ b/include/asm-generic/bitops/instrumented-non-atomic.h > @@ -58,6 +58,30 @@ static inline void __change_bit(long nr, volatile unsigned > long *addr) > arch___change_bit(nr, addr); > } > > +static inline void __instrument_read_write_bitop(long nr, volatile unsigned > long *addr) > +{ > + if (IS_ENABLED(CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC)) { > + /* > + * We treat non-atomic read-write bitops a little more special. > + * Given the operations here only modify a single bit, assuming > + * non-atomicity of the writer is sufficient may be reasonable > + * for certain usage (and follows the permissible nature of the > + * assume-plain-writes-atomic rule): > + * 1. report read-modify-write races -> check read; > + * 2. do not report races with marked readers, but do report > + *races with unmarked readers -> check "atomic" write. > + */ > + kcsan_check_read(addr + BIT_WORD(nr), sizeof(long)); > + /* > + * Use generic write instrumentation, in case other sanitizers > + * or tools are enabled alongside KCSAN. 
> + */ > + instrument_write(addr + BIT_WORD(nr), sizeof(long)); > + } else { > + instrument_read_write(addr + BIT_WORD(nr), sizeof(long)); > + } > +} > + > /** > * __test_and_set_bit - Set a bit and return its old value > * @nr: Bit to set > @@ -68,7 +92,7 @@ static inline void __change_bit(long nr, volatile unsigned > long *addr) > */ > static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr) > { > - instrument_read_write(addr + BIT_WORD(nr), sizeof(long)); > + __instrument_read_write_bitop(nr, addr); > return arch___test_and_set_bit(nr, addr); > } > > @@ -82,7 +106,7 @@ static inline bool __test_and_set_bit(long nr, volatile > unsigned long *addr) > */ > static inline bool __test_and_clear_bit(long nr, volatile unsigned long > *addr) > { > - instrument_read_write(addr + BIT_WORD(nr), sizeof(long)); > + __instrument_read_write_bitop(nr, addr); > return arch___test_and_clear_bit(nr, addr); > } > > @@ -96,7 +120,7 @@ static inline bool __test_and_clear_bit(long nr, volatile > unsigned
RE: [PATCH] arm64: topology: Stop using MPIDR for topology information
Hi Valentin: > -Original Message- > From: Valentin Schneider [mailto:valentin.schnei...@arm.com] > Sent: Saturday, August 29, 2020 9:00 PM > To: linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org > Cc: Catalin Marinas; Will Deacon; Sudeep Holla; Robin Murphy; Jeremy > Linton; Dietmar Eggemann; Morten Rasmussen; Zengtao (B) > Subject: [PATCH] arm64: topology: Stop using MPIDR for topology > information > > In the absence of ACPI or DT topology data, we fallback to haphazardly > decoding *something* out of MPIDR. Sadly, the contents of that register > are > mostly unusable due to the implementation leniancy and things like Aff0 > having to be capped to 15 (despite being encoded on 8 bits). > > Consider a simple system with a single package of 32 cores, all under the > same LLC. We ought to be shoving them in the same core_sibling mask, > but > MPIDR is going to look like: > > | CPU | 0 | ... | 15 | 16 | ... | 31 | > |--+---+-+++-++ > | Aff0 | 0 | ... | 15 | 0 | ... | 15 | > | Aff1 | 0 | ... | 0 | 1 | ... | 1 | > | Aff2 | 0 | ... | 0 | 0 | ... | 0 | > > Which will eventually yield > > core_sibling(0-15) == 0-15 > core_sibling(16-31) == 16-31 > > NUMA woes > = > > If we try to play games with this and set up NUMA boundaries within those > groups of 16 cores via e.g. QEMU: > > # Node0: 0-9; Node1: 10-19 > $ qemu-system-aarch64 \ > -smp 20 -numa node,cpus=0-9,nodeid=0 -numa > node,cpus=10-19,nodeid=1 > > The scheduler's MC domain (all CPUs with same LLC) is going to be built via > > arch_topology.c::cpu_coregroup_mask() > > In there we try to figure out a sensible mask out of the topology > information we have. In short, here we'll pick the smallest of NUMA or > core sibling mask. > > node_mask(CPU9)== 0-9 > core_sibling(CPU9) == 0-15 > > MC mask for CPU9 will thus be 0-9, not a problem. > > node_mask(CPU10)== 10-19 > core_sibling(CPU10) == 0-15 > > MC mask for CPU10 will thus be 10-19, not a problem. 
> > node_mask(CPU16)== 10-19 > core_sibling(CPU16) == 16-19 > > MC mask for CPU16 will thus be 16-19... Uh oh. CPUs 16-19 are in two > different unique MC spans, and the scheduler has no idea what to make of > that. That triggers the WARN_ON() added by commit > > ccf74128d66c ("sched/topology: Assert non-NUMA topology masks > don't (partially) overlap") > > Fixing MPIDR-derived topology > = > > We could try to come up with some cleverer scheme to figure out which of > the available masks to pick, but really if one of those masks resulted from > MPIDR then it should be discarded because it's bound to be bogus. > > I was hoping to give MPIDR a chance for SMT, to figure out which threads > are > in the same core using Aff1-3 as core ID, but Sudeep and Robin pointed out > to me that there are systems out there where *all* cores have non-zero > values in their higher affinity fields (e.g. RK3288 has "5" in all of its > cores' MPIDR.Aff1), which would expose a bogus core ID to userspace. > > Stop using MPIDR for topology information. When no other source of > topology > information is available, mark each CPU as its own core and its NUMA > node > as its LLC domain. I agree with your idea to remove the topology functionality of MPIDR , but I think we need also consider ARM32 and GIC. > > Signed-off-by: Valentin Schneider > --- > arch/arm64/kernel/topology.c | 32 +--- > 1 file changed, 17 insertions(+), 15 deletions(-) > > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c > index 0801a0f3c156..ff1dd1dbfe64 100644 > --- a/arch/arm64/kernel/topology.c > +++ b/arch/arm64/kernel/topology.c > @@ -36,21 +36,23 @@ void store_cpu_topology(unsigned int cpuid) > if (mpidr & MPIDR_UP_BITMASK) > return; > > - /* Create cpu topology mapping based on MPIDR. 
*/ > - if (mpidr & MPIDR_MT_BITMASK) { > - /* Multiprocessor system : Multi-threads per core */ > - cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); > - cpuid_topo->core_id= MPIDR_AFFINITY_LEVEL(mpidr, 1); > - cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2) | > - MPIDR_AFFINITY_LEVEL(mpidr, 3) << 8; > - } else { > - /* Multiprocessor system : Single-thread per core */ > - cpuid_topo->thread_id = -1; > - cpuid_topo->core_id= MPIDR_AFFINITY_LEVEL(mpidr, 0); > - cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1) | > - MPIDR_AFFINITY_LEVEL(mpidr, 2) << 8 | > - MPIDR_AFFINITY_LEVEL(mpidr, 3) << 16; > - } > + /* > + * This would be the place to create cpu topology based on MPIDR. > + * > + * However, it cannot be trusted to depict the actual topology; some > + * pieces of
Re: [PATCH] powerpc: Fix random segfault when freeing hugetlb range
Christophe Leroy writes: > The following random segfault is observed from time to time with > map_hugetlb selftest: > > root@localhost:~# ./map_hugetlb 1 19 > 524288 kB hugepages > Mapping 1 Mbytes > Segmentation fault > > [ 31.219972] map_hugetlb[365]: segfault (11) at 117 nip 77974f8c lr > 779a6834 code 1 in ld-2.23.so[77966000+21000] > [ 31.220192] map_hugetlb[365]: code: 9421ffc0 480318d1 93410028 90010044 > 9361002c 93810030 93a10034 93c10038 > [ 31.220307] map_hugetlb[365]: code: 93e1003c 93210024 8123007c 81430038 > <80e90004> 814a0004 7f443a14 813a0004 > [ 31.221911] BUG: Bad rss-counter state mm:(ptrval) type:MM_FILEPAGES val:33 > [ 31.229362] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:5 > > This fault is due to hugetlb_free_pgd_range() freeing page tables > that are also used by regular pages. > > As explain in the comment at the beginning of > hugetlb_free_pgd_range(), the verification done in free_pgd_range() > on floor and ceiling is not done here, which means > hugetlb_free_pte_range() can free outside the expected range. > > As the verification cannot be done in hugetlb_free_pgd_range(), it > must be done in hugetlb_free_pte_range(). 
> Reviewed-by: Aneesh Kumar K.V > Fixes: b250c8c08c79 ("powerpc/8xx: Manage 512k huge pages as standard pages.") > Cc: sta...@vger.kernel.org > Signed-off-by: Christophe Leroy > --- > arch/powerpc/mm/hugetlbpage.c | 18 -- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c > index 26292544630f..e7ae2a2c4545 100644 > --- a/arch/powerpc/mm/hugetlbpage.c > +++ b/arch/powerpc/mm/hugetlbpage.c > @@ -330,10 +330,24 @@ static void free_hugepd_range(struct mmu_gather *tlb, > hugepd_t *hpdp, int pdshif >get_hugepd_cache_index(pdshift - shift)); > } > > -static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd, > unsigned long addr) > +static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd, > +unsigned long addr, unsigned long end, > +unsigned long floor, unsigned long ceiling) > { > + unsigned long start = addr; > pgtable_t token = pmd_pgtable(*pmd); > > + start &= PMD_MASK; > + if (start < floor) > + return; > + if (ceiling) { > + ceiling &= PMD_MASK; > + if (!ceiling) > + return; > + } > + if (end - 1 > ceiling - 1) > + return; > + We do repeat that for pud/pmd/pte hugetlb_free_range. Can we consolidate that with comment explaining we are checking if the pgtable entry is mapping outside the range? > pmd_clear(pmd); > pte_free_tlb(tlb, token, addr); > mm_dec_nr_ptes(tlb->mm); > @@ -363,7 +377,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather > *tlb, pud_t *pud, >*/ > WARN_ON(!IS_ENABLED(CONFIG_PPC_8xx)); > > - hugetlb_free_pte_range(tlb, pmd, addr); > + hugetlb_free_pte_range(tlb, pmd, addr, end, floor, > ceiling); > > continue; > } > -- > 2.25.0
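The consolidation asked about in the review could look roughly like the userspace sketch below. It lifts the floor/ceiling check quoted in the patch into one helper so that the pud/pmd/pte variants could share it. The helper name `pgtable_within_limits` and the `mask` parameter (standing in for `PMD_MASK` and its pud/pgd equivalents) are assumptions made for illustration; this is not the code that was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true only if the page table covering
 * [addr, end) maps nothing outside [floor, ceiling). A ceiling of 0
 * means "no upper limit", as in the quoted patch.
 */
static bool pgtable_within_limits(unsigned long addr, unsigned long end,
                                  unsigned long floor, unsigned long ceiling,
                                  unsigned long mask)
{
    unsigned long start = addr & mask;

    assert(mask != 0);

    /* The table also maps addresses below the permitted floor. */
    if (start < floor)
        return false;

    if (ceiling) {
        /* Round the ceiling down to a table boundary. */
        ceiling &= mask;
        if (!ceiling)
            return false;
    }

    /*
     * Compare "end - 1" against "ceiling - 1" so that ceiling == 0
     * wraps to ULONG_MAX and never rejects the range.
     */
    if (end - 1 > ceiling - 1)
        return false;

    return true;
}
```

A caller would test `pgtable_within_limits(addr, end, floor, ceiling, PMD_MASK)` and skip `pmd_clear()` when it returns false, which is the same early return the patch open-codes.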
[PATCH net-next 0/2] Allow more than 255 IPv4 multicast interfaces
Currently it is not possible to use more than 255 multicast interfaces
for IPv4 due to the format of the igmpmsg header, which only has 8 bits
available for the VIF ID. There is enough space for the full VIF ID in
the Netlink cache notifications; however, the value is currently taken
directly from the igmpmsg header and has thus already been truncated.

Using the full VIF ID in the Netlink notifications allows use of more
than 255 IPv4 multicast interfaces if the user space routing daemon
uses the Netlink notifications instead of the igmpmsg cache reports.

However, doing this reveals a deficiency in the Netlink cache report
notifications: they lack any means of differentiating cache reports
relating to different multicast routing tables. This is easily
resolved by adding the multicast route table ID to the cache reports.

Paul Davey (2):
  ipmr: Add route table ID to netlink cache reports
  ipmr: Use full VIF ID in netlink cache reports

 include/uapi/linux/mroute.h |  1 +
 net/ipv4/ipmr.c             | 12 +++-
 2 files changed, 8 insertions(+), 5 deletions(-)

-- 
2.28.0
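The truncation described in this cover letter can be reproduced in a few lines of userspace C. This is only an illustrative sketch: `demo_igmpmsg` and `demo_cache_report` are made-up stand-ins, not the kernel's `struct igmpmsg` or the real Netlink attribute layout. The only property borrowed from the text is that the header's VIF field is 8 bits wide.

```c
#include <stdint.h>

/* Hypothetical stand-in for a header with an 8-bit VIF field. */
struct demo_igmpmsg {
    uint8_t im_vif;
};

/* Hypothetical stand-in for a notification with room for the full values. */
struct demo_cache_report {
    uint32_t table_id; /* multicast routing table the report belongs to */
    uint32_t vif_id;   /* full-width VIF index */
};

/* Reading the VIF back from the header loses everything above bit 7. */
static uint32_t vif_via_header(uint32_t vifi)
{
    struct demo_igmpmsg msg;

    msg.im_vif = (uint8_t)vifi;
    return msg.im_vif;
}

/* Filling the report from the original index keeps the full value. */
static struct demo_cache_report make_report(uint32_t table_id, uint32_t vifi)
{
    struct demo_cache_report r;

    r.table_id = table_id;
    r.vif_id = vifi;
    return r;
}
```

With a VIF index of 300, the header path yields 44 (300 modulo 256), while the report keeps 300 together with its table ID, which is the distinction the two patches in the series rely on.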
[PATCH v11 1/4] scsi: ufs: Add HPB feature related parameters
This is a patch for parameters to be used for HPB feature. Reviewed-by: Bart Van Assche Reviewed-by: Can Guo Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufs.h | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index f8ab16f30fdc..e879ac34c065 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -122,6 +122,7 @@ enum flag_idn { QUERY_FLAG_IDN_WB_EN= 0x0E, QUERY_FLAG_IDN_WB_BUFF_FLUSH_EN = 0x0F, QUERY_FLAG_IDN_WB_BUFF_FLUSH_DURING_HIBERN8 = 0x10, + QUERY_FLAG_IDN_HPB_RESET= 0x11, }; /* Attribute idn for Query requests */ @@ -195,6 +196,9 @@ enum unit_desc_param { UNIT_DESC_PARAM_PHY_MEM_RSRC_CNT= 0x18, UNIT_DESC_PARAM_CTX_CAPABILITIES= 0x20, UNIT_DESC_PARAM_LARGE_UNIT_SIZE_M1 = 0x22, + UNIT_DESC_HPB_LU_MAX_ACTIVE_REGIONS = 0x23, + UNIT_DESC_HPB_LU_PIN_REGION_START_OFFSET= 0x25, + UNIT_DESC_HPB_LU_NUM_PIN_REGIONS= 0x27, UNIT_DESC_PARAM_WB_BUF_ALLOC_UNITS = 0x29, }; @@ -235,6 +239,8 @@ enum device_desc_param { DEVICE_DESC_PARAM_PSA_MAX_DATA = 0x25, DEVICE_DESC_PARAM_PSA_TMT = 0x29, DEVICE_DESC_PARAM_PRDCT_REV = 0x2A, + DEVICE_DESC_PARAM_HPB_VER = 0x40, + DEVICE_DESC_PARAM_HPB_CONTROL = 0x42, DEVICE_DESC_PARAM_EXT_UFS_FEATURE_SUP = 0x4F, DEVICE_DESC_PARAM_WB_PRESRV_USRSPC_EN = 0x53, DEVICE_DESC_PARAM_WB_TYPE = 0x54, @@ -283,6 +289,10 @@ enum geometry_desc_param { GEOMETRY_DESC_PARAM_ENM4_MAX_NUM_UNITS = 0x3E, GEOMETRY_DESC_PARAM_ENM4_CAP_ADJ_FCTR = 0x42, GEOMETRY_DESC_PARAM_OPT_LOG_BLK_SIZE= 0x44, + GEOMETRY_DESC_PARAM_HPB_REGION_SIZE = 0x48, + GEOMETRY_DESC_PARAM_HPB_NUMBER_LU = 0x49, + GEOMETRY_DESC_PARAM_HPB_SUBREGION_SIZE = 0x4A, + GEOMETRY_DESC_PARAM_HPB_MAX_ACTIVE_REGS = 0x4B, GEOMETRY_DESC_PARAM_WB_MAX_ALLOC_UNITS = 0x4F, GEOMETRY_DESC_PARAM_WB_MAX_WB_LUNS = 0x53, GEOMETRY_DESC_PARAM_WB_BUFF_CAP_ADJ = 0x54, @@ -327,8 +337,10 @@ enum { /* Possible values for dExtendedUFSFeaturesSupport */ enum { + UFS_DEV_HPB_SUPPORT = BIT(7), UFS_DEV_WRITE_BOOSTER_SUP = 
BIT(8), }; +#define UFS_DEV_HPB_SUPPORT_VERSION0x310 #define POWER_DESC_MAX_SIZE0x62 #define POWER_DESC_MAX_ACTV_ICC_LVLS 16 @@ -537,6 +549,7 @@ struct ufs_dev_info { u8 *model; u16 wspecversion; u32 clk_gating_wait_us; + u8 b_ufs_feature_sup; u32 d_ext_ufs_feature_sup; u8 b_wb_buffer_type; u32 d_wb_alloc_units; -- 2.17.1
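As a rough illustration of how the new parameters might be consumed, the sketch below reads the HPB version out of a raw device descriptor buffer and combines it with the feature bit. Several things here are assumptions rather than facts from the patch: that the version field at `DEVICE_DESC_PARAM_HPB_VER` is two bytes wide (suggested by the next offset being 0x42), that it is stored big-endian, and that a driver would require an exact match against `UFS_DEV_HPB_SUPPORT_VERSION`. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define DEVICE_DESC_PARAM_HPB_VER   0x40
#define UFS_DEV_HPB_SUPPORT         (1u << 7)
#define UFS_DEV_HPB_SUPPORT_VERSION 0x310

/* Read a 16-bit big-endian field (endianness is an assumption here). */
static uint16_t desc_read_be16(const uint8_t *desc, size_t off)
{
    return (uint16_t)((desc[off] << 8) | desc[off + 1]);
}

/*
 * Hypothetical policy check: the feature bit must be set and the
 * descriptor must report the one HPB version this sketch knows about.
 */
static bool hpb_usable(const uint8_t *desc, size_t len, uint8_t feature_sup)
{
    assert(desc != NULL);

    /* Descriptor too short to contain the version field. */
    if (len < DEVICE_DESC_PARAM_HPB_VER + 2)
        return false;

    if (!(feature_sup & UFS_DEV_HPB_SUPPORT))
        return false;

    return desc_read_be16(desc, DEVICE_DESC_PARAM_HPB_VER) ==
           UFS_DEV_HPB_SUPPORT_VERSION;
}
```

The length check comes first so that a short descriptor from an older device is rejected before any byte at offset 0x40 is touched.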
[PATCH v11 2/4] scsi: ufs: Introduce HPB feature
This is a patch for the HPB feature. This patch adds HPB function calls to UFS core driver. Reviewed-by: Bart Van Assche Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/Kconfig | 10 + drivers/scsi/ufs/Makefile | 1 + drivers/scsi/ufs/ufshcd.c | 60 drivers/scsi/ufs/ufshcd.h | 23 +- drivers/scsi/ufs/ufshpb.c | 621 ++ drivers/scsi/ufs/ufshpb.h | 174 +++ 6 files changed, 888 insertions(+), 1 deletion(-) create mode 100644 drivers/scsi/ufs/ufshpb.c create mode 100644 drivers/scsi/ufs/ufshpb.h diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig index f6394999b98c..6571d6e4ff12 100644 --- a/drivers/scsi/ufs/Kconfig +++ b/drivers/scsi/ufs/Kconfig @@ -182,3 +182,13 @@ config SCSI_UFS_CRYPTO Enabling this makes it possible for the kernel to use the crypto capabilities of the UFS device (if present) to perform crypto operations on data being transferred to/from the device. + +config SCSI_UFS_HPB + bool "Support UFS Host Performance Booster" + depends on SCSI_UFSHCD + help + The UFS HPB feature improves random read performance. It caches + L2P (logical to physical) map of UFS to host DRAM. The driver uses HPB + read command by piggybacking physical page number for bypassing FTL (flash + translation layer)'s L2P address translation. 
+ diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile index 4679af1b564e..663e17cee359 100644 --- a/drivers/scsi/ufs/Makefile +++ b/drivers/scsi/ufs/Makefile @@ -11,6 +11,7 @@ obj-$(CONFIG_SCSI_UFSHCD) += ufshcd-core.o ufshcd-core-y += ufshcd.o ufs-sysfs.o ufshcd-core-$(CONFIG_SCSI_UFS_BSG) += ufs_bsg.o ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO) += ufshcd-crypto.o +ufshcd-core-$(CONFIG_SCSI_UFS_HPB) += ufshpb.o obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o obj-$(CONFIG_SCSI_UFS_HISI) += ufs-hisi.o diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 06e2439d523c..40ddce5b36d1 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -22,6 +22,7 @@ #include "ufs-sysfs.h" #include "ufs_bsg.h" #include "ufshcd-crypto.h" +#include "ufshpb.h" #include #include @@ -2546,6 +2547,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) ufshcd_comp_scsi_upiu(hba, lrbp); + ufshpb_prep(hba, lrbp); + err = ufshcd_map_sg(hba, lrbp); if (err) { lrbp->cmd = NULL; @@ -4691,6 +4694,33 @@ static int ufshcd_change_queue_depth(struct scsi_device *sdev, int depth) return scsi_change_queue_depth(sdev, depth); } +static void ufshcd_hpb_destroy(struct ufs_hba *hba, struct scsi_device *sdev) +{ + /* skip well-known LU */ + if (sdev->lun >= UFS_UPIU_MAX_UNIT_NUM_ID) + return; + + if (!ufshpb_is_allowed(hba)) + return; + + ufshpb_destroy_lu(hba, sdev); +} + +static void ufshcd_hpb_configure(struct ufs_hba *hba, struct scsi_device *sdev) +{ + /* skip well-known LU */ + if (sdev->lun >= UFS_UPIU_MAX_UNIT_NUM_ID) + return; + + if (!(hba->dev_info.b_ufs_feature_sup & UFS_DEV_HPB_SUPPORT)) + return; + + if (!ufshpb_is_allowed(hba)) + return; + + ufshpb_init_hpb_lu(hba, sdev); +} + /** * ufshcd_slave_configure - adjust SCSI device configurations * @sdev: pointer to SCSI device @@ -4700,6 +4730,8 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) struct ufs_hba 
*hba = shost_priv(sdev->host); struct request_queue *q = sdev->request_queue; + ufshcd_hpb_configure(hba, sdev); + blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); if (ufshcd_is_rpm_autosuspend_allowed(hba)) @@ -4719,6 +4751,9 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev) struct ufs_hba *hba; hba = shost_priv(sdev->host); + + ufshcd_hpb_destroy(hba, sdev); + /* Drop the reference as it won't be needed anymore */ if (ufshcd_scsi_to_upiu_lun(sdev->lun) == UFS_UPIU_UFS_DEVICE_WLUN) { unsigned long flags; @@ -4828,6 +4863,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) */ pm_runtime_get_noresume(hba->dev); } + + if (scsi_status == SAM_STAT_GOOD) + ufshpb_rsp_upiu(hba, lrbp); break; case UPIU_TRANSACTION_REJECT_UPIU: /* TODO: handle Reject UPIU Response */ @@ -6681,6 +6719,8 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba) * Stop the host controller and complete the requests * cleared by h/w */ +
[PATCH v11 3/4] scsi: ufs: L2P map management for HPB read
This is a patch for managing the L2P map in the HPB module. The HPB divides logical addresses into several regions. A region consists of several sub-regions. The sub-region is the basic unit in which L2P mapping is managed. The driver loads the L2P mapping data of each sub-region. A loaded sub-region is in the active state. The HPB driver unloads L2P mapping data in units of regions. An unloaded region is in the inactive state. Sub-region/region candidates to be loaded and unloaded are delivered from the UFS device. The UFS device delivers the recommended sub-regions to activate and regions to inactivate to the driver using sense data. The HPB module performs L2P mapping management on the host through the delivered information. A pinned region is a pre-set region on the UFS device that is always in the active state. The data structures for map data requests and the L2P map use the mempool API, minimizing allocation overhead while avoiding static allocation. The minimum size of the memory pool used in the HPB is implemented as a module parameter, so that it can be configured by the user. To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096 The map_work manages active/inactive state through 2 "to-do" lists. Each hpb lun maintains 2 "to-do" lists: hpb->lh_inact_rgn - regions to be inactivated, and hpb->lh_act_srgn - subregions to be activated. Those lists are maintained on IO completion.
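The region/sub-region split described above can be illustrated with a small sketch. It is not taken from the patch: the struct names, the helper hpb_locate() and the geometry values are hypothetical, and a real driver would derive the sizes from the UFS geometry descriptor.

```c
#include <stdint.h>

/* Hypothetical geometry; real values come from the UFS geometry descriptor. */
struct hpb_geometry {
	uint32_t entries_per_srgn;	/* L2P entries covered by one sub-region */
	uint32_t srgns_per_rgn;		/* sub-regions that make up one region */
};

/* Position of one logical page inside the region/sub-region hierarchy. */
struct hpb_pos {
	uint32_t rgn_idx;	/* region index */
	uint32_t srgn_idx;	/* sub-region index within that region */
	uint32_t srgn_offset;	/* entry offset within that sub-region */
};

/* Map a logical page number (lpn) to its region, sub-region and offset. */
static struct hpb_pos hpb_locate(const struct hpb_geometry *g, uint64_t lpn)
{
	uint64_t srgn_global = lpn / g->entries_per_srgn;
	struct hpb_pos pos;

	pos.rgn_idx = (uint32_t)(srgn_global / g->srgns_per_rgn);
	pos.srgn_idx = (uint32_t)(srgn_global % g->srgns_per_rgn);
	pos.srgn_offset = (uint32_t)(lpn % g->entries_per_srgn);
	return pos;
}
```

With such a mapping, activation can be tracked per (rgn_idx, srgn_idx) pair while inactivation only needs rgn_idx, which matches the two "to-do" lists in the description.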
Reviewed-by: Bart Van Assche Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufs.h| 34 ++ drivers/scsi/ufs/ufshpb.c | 999 +- drivers/scsi/ufs/ufshpb.h | 55 +++ 3 files changed, 1085 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index e879ac34c065..b7c1c7dd6e70 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -472,6 +472,39 @@ struct utp_cmd_rsp { u8 sense_data[UFS_SENSE_SIZE]; }; +struct ufshpb_active_field { + __be16 active_rgn; + __be16 active_srgn; +}; + +/** + * struct utp_hpb_rsp - Response UPIU structure + * @residual_transfer_count: Residual transfer count DW-3 + * @reserved1: Reserved double words DW-4 to DW-7 + * @sense_data_len: Sense data length DW-8 U16 + * @desc_type: Descriptor type of sense data + * @additional_len: Additional length of sense data + * @hpb_type: HPB operation type + * @reserved2: Reserved field + * @active_rgn_cnt: Active region count + * @inactive_rgn_cnt: Inactive region count + * @hpb_active_field: Recommended to read HPB region and subregion + * @hpb_inactive_field: To be inactivated HPB region and subregion + */ +struct utp_hpb_rsp { + __be32 residual_transfer_count; + __be32 reserved1[4]; + __be16 sense_data_len; + u8 desc_type; + u8 additional_len; + u8 hpb_type; + u8 reserved2; + u8 active_rgn_cnt; + u8 inactive_rgn_cnt; + struct ufshpb_active_field hpb_active_field[2]; + __be16 hpb_inactive_field[2]; +}; + /** * struct utp_upiu_rsp - general upiu response structure * @header: UPIU header structure DW-0 to DW-2 @@ -482,6 +515,7 @@ struct utp_upiu_rsp { struct utp_upiu_header header; union { struct utp_cmd_rsp sr; + struct utp_hpb_rsp hr; struct utp_upiu_query qr; }; }; diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 45eb80116bfd..2ac4cd6e4aa4 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -16,11 +16,77 @@ #include "ufshpb.h" #include "../sd.h" +/* memory management */ 
+static struct kmem_cache *ufshpb_mctx_cache; +static mempool_t *ufshpb_mctx_pool; +static mempool_t *ufshpb_page_pool; +/* A cache size of 2MB can cache ppn in the 1GB range. */ +static unsigned int ufshpb_host_map_kbytes = 2048; +static int tot_active_srgn_pages; + +static struct workqueue_struct *ufshpb_wq; + bool ufshpb_is_allowed(struct ufs_hba *hba) { return !(hba->ufshpb_dev.hpb_disabled); } +static bool ufshpb_is_general_lun(int lun) +{ + return lun < UFS_UPIU_MAX_UNIT_NUM_ID; +} + +static bool +ufshpb_is_pinned_region(struct ufshpb_lu *hpb, int rgn_idx) +{ + if (hpb->lu_pinned_end != PINNED_NOT_SET && + rgn_idx >= hpb->lu_pinned_start && + rgn_idx <= hpb->lu_pinned_end) + return true; + + return false; +} + +static bool ufshpb_is_empty_rsp_lists(struct ufshpb_lu *hpb) +{ + bool ret = true; + unsigned long flags; + + spin_lock_irqsave(&hpb->rsp_list_lock, flags); + if (!list_empty(&hpb->lh_inact_rgn) || !list_empty(&hpb->lh_act_srgn)) + ret = false; + spin_unlock_irqrestore(&hpb->rsp_list_lock, flags); + + return ret; +} + +static int ufshpb_may_field_valid(struct ufs_hba *hba, +struct ufshcd_lrb *lrbp, +struct utp_hpb_rsp *rsp_field) +{ + if (be16_to_cpu(rsp_field->sense_data_len) !=
[PATCH net-next 1/2] ipmr: Add route table ID to netlink cache reports
Insert the multicast route table ID as a Netlink attribute to Netlink cache report notifications. When multiple route tables are in use it is necessary to have a way to determine which route table a given cache report belongs to when receiving the cache report. Signed-off-by: Paul Davey --- include/uapi/linux/mroute.h | 1 + net/ipv4/ipmr.c | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/mroute.h b/include/uapi/linux/mroute.h index 11c8c1fc1124..918f1ef32ffe 100644 --- a/include/uapi/linux/mroute.h +++ b/include/uapi/linux/mroute.h @@ -169,6 +169,7 @@ enum { IPMRA_CREPORT_SRC_ADDR, IPMRA_CREPORT_DST_ADDR, IPMRA_CREPORT_PKT, + IPMRA_CREPORT_TABLE, __IPMRA_CREPORT_MAX }; #define IPMRA_CREPORT_MAX (__IPMRA_CREPORT_MAX - 1) diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 876fd6ff1ff9..19b2f586319b 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -2396,6 +2396,7 @@ static size_t igmpmsg_netlink_msgsize(size_t payloadlen) + nla_total_size(4) /* IPMRA_CREPORT_VIF_ID */ + nla_total_size(4) /* IPMRA_CREPORT_SRC_ADDR */ + nla_total_size(4) /* IPMRA_CREPORT_DST_ADDR */ + + nla_total_size(4) /* IPMRA_CREPORT_TABLE */ /* IPMRA_CREPORT_PKT */ + nla_total_size(payloadlen) ; @@ -2431,7 +2432,8 @@ static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt) nla_put_in_addr(skb, IPMRA_CREPORT_SRC_ADDR, msg->im_src.s_addr) || nla_put_in_addr(skb, IPMRA_CREPORT_DST_ADDR, - msg->im_dst.s_addr)) + msg->im_dst.s_addr) || + nla_put_u32(skb, IPMRA_CREPORT_TABLE, mrt->id)) goto nla_put_failure; nla = nla_reserve(skb, IPMRA_CREPORT_PKT, payloadlen); -- 2.28.0
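For illustration, a receiver of these cache reports could pick out the new table attribute roughly as follows. This is a sketch, not part of the patch: find_u32_attr() and the ATTR_* macros are hypothetical helpers that only mirror the standard netlink attribute layout (16-bit length, 16-bit type, 4-byte alignment), and CREPORT_TABLE = 6 is an assumption derived from the enum order shown in the diff. Real code should use the uapi headers and an existing netlink library instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN	4u			/* u16 length + u16 type */
#define ATTR_ALIGN(len)	(((size_t)(len) + 3) & ~(size_t)3)
#define ATTR_TYPE_MASK	0x3fffu			/* strip the two flag bits */
#define CREPORT_TABLE	6u			/* assumed value of IPMRA_CREPORT_TABLE */

/*
 * Walk a buffer of netlink attributes and copy out the first u32
 * attribute of the requested type. Returns 0 on success, -1 if the
 * attribute is missing or the buffer is malformed.
 */
static int find_u32_attr(const uint8_t *buf, size_t len, uint16_t type,
			 uint32_t *out)
{
	size_t off = 0;

	while (off + ATTR_HDRLEN <= len) {
		uint16_t nla_len, nla_type;

		memcpy(&nla_len, buf + off, sizeof(nla_len));
		memcpy(&nla_type, buf + off + 2, sizeof(nla_type));
		if (nla_len < ATTR_HDRLEN || nla_len > len - off)
			return -1;	/* truncated or corrupt attribute */
		if ((nla_type & ATTR_TYPE_MASK) == type &&
		    nla_len == ATTR_HDRLEN + sizeof(uint32_t)) {
			memcpy(out, buf + off + ATTR_HDRLEN, sizeof(*out));
			return 0;
		}
		off += ATTR_ALIGN(nla_len);
	}
	return -1;
}
```

A daemon that manages several multicast routing tables could call find_u32_attr(attrs, attrs_len, CREPORT_TABLE, &table_id) on each report and dispatch on table_id.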
[PATCH net-next 2/2] ipmr: Use full VIF ID in netlink cache reports
Insert the full 16-bit VIF ID into ipmr Netlink cache reports. When more than 255 multicast interfaces are in use, cache reports need a VIF ID that is wider than 8 bits, but the VIF ID in the igmpmsg reports sent to mroute_sk is only 8 bits wide in the igmpmsg header. The VIF_ID attribute, however, has 32 bits of space and so can store the full VIF ID. Signed-off-by: Paul Davey --- net/ipv4/ipmr.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 19b2f586319b..26cd4ec450f4 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -104,7 +104,7 @@ static int ipmr_cache_report(struct mr_table *mrt, struct sk_buff *pkt, vifi_t vifi, int assert); static void mroute_netlink_event(struct mr_table *mrt, struct mfc_cache *mfc, int cmd); -static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt); +static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt, vifi_t vifi); static void mroute_clean_tables(struct mr_table *mrt, int flags); static void ipmr_expire_process(struct timer_list *t); @@ -1072,7 +1072,7 @@ static int ipmr_cache_report(struct mr_table *mrt, return -EINVAL; } - igmpmsg_netlink_event(mrt, skb); + igmpmsg_netlink_event(mrt, skb, vifi); /* Deliver to mrouted */ ret = sock_queue_rcv_skb(mroute_sk, skb); @@ -2404,7 +2404,7 @@ static size_t igmpmsg_netlink_msgsize(size_t payloadlen) return len; } -static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt) +static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt, vifi_t vifi) { struct net *net = read_pnet(&mrt->net); struct nlmsghdr *nlh; @@ -2428,7 +2428,7 @@ static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt) rtgenm = nlmsg_data(nlh); rtgenm->rtgen_family = RTNL_FAMILY_IPMR; if (nla_put_u8(skb, IPMRA_CREPORT_MSGTYPE, msg->im_msgtype) || - nla_put_u32(skb, IPMRA_CREPORT_VIF_ID, msg->im_vif) || + nla_put_u32(skb, IPMRA_CREPORT_VIF_ID, vifi) || 
nla_put_in_addr(skb, IPMRA_CREPORT_SRC_ADDR, msg->im_src.s_addr) || nla_put_in_addr(skb, IPMRA_CREPORT_DST_ADDR, -- 2.28.0
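The width problem this patch addresses can be shown with a short sketch. The two helpers below are hypothetical stand-ins, not kernel functions: one models storing the VIF index in an 8-bit header field, the other models passing it through a 32-bit attribute.

```c
#include <stdint.h>

/* Models the 8-bit VIF field of the igmpmsg header: high bits are lost. */
static uint8_t vif_in_igmpmsg_header(uint16_t vifi)
{
	return (uint8_t)vifi;
}

/* Models the 32-bit VIF_ID netlink attribute: the full index survives. */
static uint32_t vif_in_netlink_attr(uint16_t vifi)
{
	return vifi;
}
```

For a VIF index of 300, the 8-bit field would carry 44 (300 modulo 256), while the attribute would carry 300, which is why the patch passes vifi to igmpmsg_netlink_event() instead of reading msg->im_vif.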
[PATCH v11 0/4] scsi: ufs: Add Host Performance Booster Support
Changelog: v10 -> v11 Add a newline at the end of the last line of the Kconfig file. v9 -> v10 1. Fix 64-bit division error 2. Fix problems commented on in Bart's review. v8 -> v9 1. Change sysfs initialization. 2. Change reading descriptor during HPB initialization 3. Fix problems commented on in Bart's review. 4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue. v7 -> v8 Remove wrongly added tags. v6 -> v7 1. Remove UFS feature layer. 2. Cleanup for sparse error. v5 -> v6 Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405 v4 -> v5 Delete unused macro define. v3 -> v4 1. Cleanup. v2 -> v3 1. Add checking input module parameter value. 2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue. 3. Cleanup for unused variables and label. v1 -> v2 1. Change the full boilerplate text to SPDX style. 2. Adopt dynamic allocation for sub-region data structure. 3. Cleanup. NAND flash memory-based storage devices use a Flash Translation Layer (FTL) to translate logical addresses of I/O requests to corresponding flash memory addresses. Mobile storage devices typically have RAM of constrained size, and thus lack the memory to keep the whole mapping table. Therefore, mapping tables are partially retrieved from NAND flash on demand, causing random-read performance degradation. To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB (Host Performance Booster), which uses host system memory as a cache for the FTL mapping table. By using HPB, FTL data can be read from host memory faster than from NAND flash memory. The current version only supports the DCM (device control mode). This patch consists of 3 parts to support the HPB feature. 1) HPB probe and initialization process 2) READ -> HPB READ using cached map information 3) L2P (logical to physical) map management In the HPB probe and init process, the device information of the UFS is queried. After checking supported features, the data structure for the HPB is initialized according to the device information. 
A read I/O in an active sub-region where the map is cached is changed to HPB READ by the HPB. The HPB manages the L2P map using information received from the device. For an active sub-region, the HPB caches the map through a ufshpb_map request. For an inactive region, the HPB discards the L2P map. When a write I/O occurs in an active sub-region area, the associated dirty bitmap is marked dirty to prevent stale reads. HPB is shown to have a performance improvement of 58 - 67% for random read workload. [1] This patch series is based on the 5.9/scsi-queue branch. [1]: https://www.usenix.org/conference/hotstorage17/program/presentation/jeong Daejun Park (4): scsi: ufs: Add HPB feature related parameters scsi: ufs: Introduce HPB feature scsi: ufs: L2P map management for HPB read scsi: ufs: Prepare HPB read for cached sub-region drivers/scsi/ufs/Kconfig | 10 + drivers/scsi/ufs/Makefile | 1 + drivers/scsi/ufs/ufs.h | 47 + drivers/scsi/ufs/ufshcd.c | 60 ++ drivers/scsi/ufs/ufshcd.h | 23 +- drivers/scsi/ufs/ufshpb.c | 1845 drivers/scsi/ufs/ufshpb.h | 229 + 7 files changed, 2214 insertions(+), 1 deletion(-) create mode 100644 drivers/scsi/ufs/ufshpb.c create mode 100644 drivers/scsi/ufs/ufshpb.h
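The interaction between cached map entries and the dirty bitmap described above can be sketched as follows. This is only an illustration under stated assumptions: struct srgn, srgn_mark_written(), srgn_lookup() and the 64-entry sub-region size are all hypothetical and much simpler than the driver's actual data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRGN_ENTRIES 64u	/* hypothetical, tiny sub-region for the sketch */

struct srgn {
	bool active;			/* L2P map for this sub-region is cached */
	uint64_t dirty;			/* one bit per entry; set after a write */
	uint64_t ppn[SRGN_ENTRIES];	/* cached physical page numbers */
};

/* A write makes the cached entry stale, so mark it dirty. */
static void srgn_mark_written(struct srgn *s, unsigned int entry)
{
	if (entry < SRGN_ENTRIES)
		s->dirty |= 1ull << entry;
}

/*
 * Return true and fill *ppn only when the cached entry can be trusted,
 * i.e. the sub-region is active and the entry has not been written since
 * the map was loaded. Otherwise the caller falls back to a normal READ.
 */
static bool srgn_lookup(const struct srgn *s, unsigned int entry,
			uint64_t *ppn)
{
	if (!s->active || entry >= SRGN_ENTRIES)
		return false;
	if (s->dirty & (1ull << entry))
		return false;
	*ppn = s->ppn[entry];
	return true;
}
```

The point of the design is that a failed lookup costs nothing in correctness: the request is simply issued as a regular READ and the device performs its own L2P translation.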
Re: [External] Re: [PATCH v2] kprobes: Fix kill kprobe which has been marked as gone
Hi Masami, On Wed, Sep 2, 2020 at 11:05 AM Masami Hiramatsu wrote: > > Hi Ingo, > > Could you merge this fix to -tip? This patch has been merged into Andrew's mm tree. > > I can resend it with other kprobes fixes. > > Hi Muchun, > > We also need: > > Cc: sta...@vger.kernel.org > > for the bugfix so that the patch can be backported correctly after it is merged > upstream. Yeah, I got it. Thanks. > > Thank you, > > On Mon, 31 Aug 2020 10:59:19 +0800 > Muchun Song wrote: > > > Cc Andrew and Steven. > > > > Any other comments, or can someone add this to the queue for the > > merge window? It's worth fixing it. > > > > On Sat, Aug 22, 2020 at 11:01 AM Muchun Song > > wrote: > > > > > > If a kprobe is marked as gone, we should not kill it again. Otherwise, > > > we can disarm the kprobe more than once. In that case, the statistics > > > of kprobe_ftrace_enabled can become unbalanced, which can lead to the kprobe > > > not working. > > > > > > Fixes: e8386a0cb22f ("kprobes: support probing module __exit function") > > > Signed-off-by: Muchun Song > > > Co-developed-by: Chengming Zhou > > > Signed-off-by: Chengming Zhou > > > Acked-by: Masami Hiramatsu > > > --- > > > changelogs in v2: > > > 1. Add a WARN_ON_ONCE in the kill_kprobe() to catch incorrect use of it. > > > 2. Update 'Fixes' tag in the commit log. 
> > > > > > kernel/kprobes.c | 9 - > > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > > > index d36e2b017588..9348b0c36ae0 100644 > > > --- a/kernel/kprobes.c > > > +++ b/kernel/kprobes.c > > > @@ -2143,6 +2143,9 @@ static void kill_kprobe(struct kprobe *p) > > > > > > lockdep_assert_held(_mutex); > > > > > > + if (WARN_ON_ONCE(kprobe_gone(p))) > > > + return; > > > + > > > p->flags |= KPROBE_FLAG_GONE; > > > if (kprobe_aggrprobe(p)) { > > > /* > > > @@ -2422,7 +2425,10 @@ static int kprobes_module_callback(struct > > > notifier_block *nb, > > > mutex_lock(_mutex); > > > for (i = 0; i < KPROBE_TABLE_SIZE; i++) { > > > head = _table[i]; > > > - hlist_for_each_entry(p, head, hlist) > > > + hlist_for_each_entry(p, head, hlist) { > > > + if (kprobe_gone(p)) > > > + continue; > > > + > > > if (within_module_init((unsigned long)p->addr, > > > mod) || > > > (checkcore && > > > within_module_core((unsigned long)p->addr, > > > mod))) { > > > @@ -2439,6 +2445,7 @@ static int kprobes_module_callback(struct > > > notifier_block *nb, > > > */ > > > kill_kprobe(p); > > > } > > > + } > > > } > > > if (val == MODULE_STATE_GOING) > > > remove_module_kprobe_blacklist(mod); > > > -- > > > 2.11.0 > > > > > > > > > -- > > Yours, > > Muchun > > > -- > Masami Hiramatsu -- Yours, Muchun
[PATCH 0/5] SM8150 and SM8250 videocc drivers
Add videocc drivers for SM8150/SM8250 required to boot and use venus. Jonathan Marek (5): dt-bindings: clock: combine qcom,sdm845-videocc and qcom,sc7180-videocc dt-bindings: clock: add SM8150 QCOM video clock bindings dt-bindings: clock: add SM8250 QCOM video clock bindings clk: qcom: add video clock controller driver for SM8150 clk: qcom: add video clock controller driver for SM8250 .../bindings/clock/qcom,sc7180-videocc.yaml | 65 --- ...,sdm845-videocc.yaml => qcom,videocc.yaml} | 20 +- drivers/clk/qcom/Kconfig | 18 + drivers/clk/qcom/Makefile | 2 + drivers/clk/qcom/videocc-sm8150.c | 276 ++ drivers/clk/qcom/videocc-sm8250.c | 516 ++ .../dt-bindings/clock/qcom,videocc-sm8150.h | 25 + .../dt-bindings/clock/qcom,videocc-sm8250.h | 42 ++ 8 files changed, 894 insertions(+), 70 deletions(-) delete mode 100644 Documentation/devicetree/bindings/clock/qcom,sc7180-videocc.yaml rename Documentation/devicetree/bindings/clock/{qcom,sdm845-videocc.yaml => qcom,videocc.yaml} (63%) create mode 100644 drivers/clk/qcom/videocc-sm8150.c create mode 100644 drivers/clk/qcom/videocc-sm8250.c create mode 100644 include/dt-bindings/clock/qcom,videocc-sm8150.h create mode 100644 include/dt-bindings/clock/qcom,videocc-sm8250.h -- 2.26.1
[PATCH 5/5] clk: qcom: add video clock controller driver for SM8250
Add support for the video clock controller found on SM8250 based devices. Derived from the downstream driver. Signed-off-by: Jonathan Marek --- drivers/clk/qcom/Kconfig | 9 + drivers/clk/qcom/Makefile | 1 + drivers/clk/qcom/videocc-sm8250.c | 516 ++ 3 files changed, 526 insertions(+) create mode 100644 drivers/clk/qcom/videocc-sm8250.c diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig index 40d7ee9886c9..95efa38211d5 100644 --- a/drivers/clk/qcom/Kconfig +++ b/drivers/clk/qcom/Kconfig @@ -453,6 +453,15 @@ config SM_VIDEOCC_8150 Say Y if you want to support video devices and functionality such as video encode and decode. +config SM_VIDEOCC_8250 + tristate "SM8250 Video Clock Controller" + select SDM_GCC_8250 + select QCOM_GDSC + help + Support for the video clock controller on SM8250 devices. + Say Y if you want to support video devices and functionality such as + video encode and decode. + config SPMI_PMIC_CLKDIV tristate "SPMI PMIC clkdiv Support" depends on SPMI || COMPILE_TEST diff --git a/drivers/clk/qcom/Makefile b/drivers/clk/qcom/Makefile index 6f4c580d2728..55fb20800b66 100644 --- a/drivers/clk/qcom/Makefile +++ b/drivers/clk/qcom/Makefile @@ -69,6 +69,7 @@ obj-$(CONFIG_SM_GCC_8250) += gcc-sm8250.o obj-$(CONFIG_SM_GPUCC_8150) += gpucc-sm8150.o obj-$(CONFIG_SM_GPUCC_8250) += gpucc-sm8250.o obj-$(CONFIG_SM_VIDEOCC_8150) += videocc-sm8150.o +obj-$(CONFIG_SM_VIDEOCC_8250) += videocc-sm8250.o obj-$(CONFIG_SPMI_PMIC_CLKDIV) += clk-spmi-pmic-div.o obj-$(CONFIG_KPSS_XCC) += kpss-xcc.o obj-$(CONFIG_QCOM_HFPLL) += hfpll.o diff --git a/drivers/clk/qcom/videocc-sm8250.c b/drivers/clk/qcom/videocc-sm8250.c new file mode 100644 index ..9fa3bd0b359b --- /dev/null +++ b/drivers/clk/qcom/videocc-sm8250.c @@ -0,0 +1,516 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2018-2020, The Linux Foundation. All rights reserved. 
+ */ + +#include +#include +#include +#include + +#include + +#include "clk-alpha-pll.h" +#include "clk-branch.h" +#include "clk-rcg.h" +#include "clk-regmap.h" +#include "clk-regmap-divider.h" +#include "common.h" +#include "reset.h" +#include "gdsc.h" + +enum { + P_BI_TCXO, + P_CHIP_SLEEP_CLK, + P_CORE_BI_PLL_TEST_SE, + P_VIDEO_PLL0_OUT_MAIN, + P_VIDEO_PLL1_OUT_MAIN, +}; + +static const struct parent_map video_cc_parent_map_0[] = { + { P_BI_TCXO, 0 }, +}; + +static const struct clk_parent_data video_cc_parent_data_0[] = { + { .fw_name = "bi_tcxo_ao" }, +}; + +static struct pll_vco lucid_vco[] = { + { 24960, 20, 0 }, +}; + +static const struct alpha_pll_config video_pll0_config = { + .l = 0x25, + .alpha = 0x8000, + .config_ctl_val = 0x20485699, + .config_ctl_hi_val = 0x2261, + .config_ctl_hi1_val = 0x329A699C, + .user_ctl_val = 0x, + .user_ctl_hi_val = 0x0805, + .user_ctl_hi1_val = 0x, +}; + +static struct clk_alpha_pll video_pll0 = { + .offset = 0x42c, + .vco_table = lucid_vco, + .num_vco = ARRAY_SIZE(lucid_vco), + .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_LUCID], + .clkr = { + .hw.init = &(struct clk_init_data){ + .name = "video_pll0", + .parent_data = &(const struct clk_parent_data){ + .fw_name = "bi_tcxo", + }, + .num_parents = 1, + .ops = _alpha_pll_lucid_ops, + }, + }, +}; + +static const struct alpha_pll_config video_pll1_config = { + .l = 0x2B, + .alpha = 0xC000, + .config_ctl_val = 0x20485699, + .config_ctl_hi_val = 0x2261, + .config_ctl_hi1_val = 0x329A699C, + .user_ctl_val = 0x, + .user_ctl_hi_val = 0x0805, + .user_ctl_hi1_val = 0x, +}; + +static struct clk_alpha_pll video_pll1 = { + .offset = 0x7d0, + .vco_table = lucid_vco, + .num_vco = ARRAY_SIZE(lucid_vco), + .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_LUCID], + .clkr = { + .hw.init = &(struct clk_init_data){ + .name = "video_pll1", + .parent_data = &(const struct clk_parent_data){ + .fw_name = "bi_tcxo", + }, + .num_parents = 1, + .ops = _alpha_pll_lucid_ops, + }, + }, +}; + +static const 
struct parent_map video_cc_parent_map_1[] = { + { P_BI_TCXO, 0 }, + { P_VIDEO_PLL0_OUT_MAIN, 1 }, +}; + +static const struct clk_parent_data video_cc_parent_data_1[] = { + { .fw_name = "bi_tcxo" }, + { .hw = _pll0.clkr.hw }, +}; + +static const struct parent_map video_cc_parent_map_2[] = { + { P_BI_TCXO, 0 }, +
[PATCH 2/5] dt-bindings: clock: add SM8150 QCOM video clock bindings
Add device tree bindings for video clock controller for SM8150 SoCs. Signed-off-by: Jonathan Marek --- .../bindings/clock/qcom,videocc.yaml | 4 ++- .../dt-bindings/clock/qcom,videocc-sm8150.h | 25 +++ 2 files changed, 28 insertions(+), 1 deletion(-) create mode 100644 include/dt-bindings/clock/qcom,videocc-sm8150.h diff --git a/Documentation/devicetree/bindings/clock/qcom,videocc.yaml b/Documentation/devicetree/bindings/clock/qcom,videocc.yaml index 17666425476f..d04f5bd28dde 100644 --- a/Documentation/devicetree/bindings/clock/qcom,videocc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,videocc.yaml @@ -11,17 +11,19 @@ maintainers: description: | Qualcomm video clock control module which supports the clocks, resets and - power domains on SDM845/SC7180. + power domains on SDM845/SC7180/SM8150. See also: dt-bindings/clock/qcom,videocc-sdm845.h dt-bindings/clock/qcom,videocc-sc7180.h +dt-bindings/clock/qcom,videocc-sm8150.h properties: compatible: enum: - qcom,sdm845-videocc - qcom,sc7180-videocc + - qcom,sm8150-videocc clocks: items: diff --git a/include/dt-bindings/clock/qcom,videocc-sm8150.h b/include/dt-bindings/clock/qcom,videocc-sm8150.h new file mode 100644 index ..e24ee840cfdb --- /dev/null +++ b/include/dt-bindings/clock/qcom,videocc-sm8150.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2017-2020, The Linux Foundation. All rights reserved. + */ + +#ifndef _DT_BINDINGS_CLK_QCOM_VIDEO_CC_SM8150_H +#define _DT_BINDINGS_CLK_QCOM_VIDEO_CC_SM8150_H + +/* VIDEO_CC clocks */ +#define VIDEO_CC_IRIS_AHB_CLK 0 +#define VIDEO_CC_IRIS_CLK_SRC 1 +#define VIDEO_CC_MVS0_CORE_CLK 2 +#define VIDEO_CC_MVS1_CORE_CLK 3 +#define VIDEO_CC_MVSC_CORE_CLK 4 +#define VIDEO_CC_PLL0 5 + +/* VIDEO_CC Resets */ +#define VIDEO_CC_MVSC_CORE_CLK_BCR 0 + +/* VIDEO_CC GDSCRs */ +#define VENUS_GDSC 0 +#define VCODEC0_GDSC 1 +#define VCODEC1_GDSC 2 + +#endif -- 2.26.1
[PATCH 1/5] dt-bindings: clock: combine qcom,sdm845-videocc and qcom,sc7180-videocc
These two bindings are almost identical, so combine them into one. This will make it easier to add the sm8150 and sm8250 videocc bindings. Signed-off-by: Jonathan Marek --- .../bindings/clock/qcom,sc7180-videocc.yaml | 65 --- ...,sdm845-videocc.yaml => qcom,videocc.yaml} | 14 ++-- 2 files changed, 9 insertions(+), 70 deletions(-) delete mode 100644 Documentation/devicetree/bindings/clock/qcom,sc7180-videocc.yaml rename Documentation/devicetree/bindings/clock/{qcom,sdm845-videocc.yaml => qcom,videocc.yaml} (76%) diff --git a/Documentation/devicetree/bindings/clock/qcom,sc7180-videocc.yaml b/Documentation/devicetree/bindings/clock/qcom,sc7180-videocc.yaml deleted file mode 100644 index 2feea2b91aa9.. --- a/Documentation/devicetree/bindings/clock/qcom,sc7180-videocc.yaml +++ /dev/null @@ -1,65 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0-only -%YAML 1.2 -$id: http://devicetree.org/schemas/clock/qcom,sc7180-videocc.yaml# -$schema: http://devicetree.org/meta-schemas/core.yaml# - -title: Qualcomm Video Clock & Reset Controller Binding for SC7180 - -maintainers: - - Taniya Das - -description: | - Qualcomm video clock control module which supports the clocks, resets and - power domains on SC7180. - - See also dt-bindings/clock/qcom,videocc-sc7180.h. - -properties: - compatible: -const: qcom,sc7180-videocc - - clocks: -items: - - description: Board XO source - - clock-names: -items: - - const: bi_tcxo - - '#clock-cells': -const: 1 - - '#reset-cells': -const: 1 - - '#power-domain-cells': -const: 1 - - reg: -maxItems: 1 - -required: - - compatible - - reg - - clocks - - clock-names - - '#clock-cells' - - '#reset-cells' - - '#power-domain-cells' - -additionalProperties: false - -examples: - - | -#include -clock-controller@ab0 { - compatible = "qcom,sc7180-videocc"; - reg = <0x0ab0 0x1>; - clocks = < RPMH_CXO_CLK>; - clock-names = "bi_tcxo"; - #clock-cells = <1>; - #reset-cells = <1>; - #power-domain-cells = <1>; -}; -... 
diff --git a/Documentation/devicetree/bindings/clock/qcom,sdm845-videocc.yaml b/Documentation/devicetree/bindings/clock/qcom,videocc.yaml similarity index 76% rename from Documentation/devicetree/bindings/clock/qcom,sdm845-videocc.yaml rename to Documentation/devicetree/bindings/clock/qcom,videocc.yaml index f7a0cf53d5f0..17666425476f 100644 --- a/Documentation/devicetree/bindings/clock/qcom,sdm845-videocc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,videocc.yaml @@ -1,23 +1,27 @@ # SPDX-License-Identifier: GPL-2.0-only %YAML 1.2 --- -$id: http://devicetree.org/schemas/clock/qcom,sdm845-videocc.yaml# +$id: http://devicetree.org/schemas/clock/qcom,videocc.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: Qualcomm Video Clock & Reset Controller Binding for SDM845 +title: Qualcomm Video Clock & Reset Controller Binding maintainers: - Taniya Das description: | Qualcomm video clock control module which supports the clocks, resets and - power domains on SDM845. + power domains on SDM845/SC7180. - See also dt-bindings/clock/qcom,videocc-sdm845.h. + See also: +dt-bindings/clock/qcom,videocc-sdm845.h +dt-bindings/clock/qcom,videocc-sc7180.h properties: compatible: -const: qcom,sdm845-videocc +enum: + - qcom,sdm845-videocc + - qcom,sc7180-videocc clocks: items: -- 2.26.1
[PATCH 3/5] dt-bindings: clock: add SM8250 QCOM video clock bindings
Add device tree bindings for video clock controller for SM8250 SoCs. Signed-off-by: Jonathan Marek --- .../bindings/clock/qcom,videocc.yaml | 6 ++- .../dt-bindings/clock/qcom,videocc-sm8250.h | 42 +++ 2 files changed, 47 insertions(+), 1 deletion(-) create mode 100644 include/dt-bindings/clock/qcom,videocc-sm8250.h diff --git a/Documentation/devicetree/bindings/clock/qcom,videocc.yaml b/Documentation/devicetree/bindings/clock/qcom,videocc.yaml index d04f5bd28dde..757837e260a2 100644 --- a/Documentation/devicetree/bindings/clock/qcom,videocc.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,videocc.yaml @@ -11,12 +11,13 @@ maintainers: description: | Qualcomm video clock control module which supports the clocks, resets and - power domains on SDM845/SC7180/SM8150. + power domains on SDM845/SC7180/SM8150/SM8250. See also: dt-bindings/clock/qcom,videocc-sdm845.h dt-bindings/clock/qcom,videocc-sc7180.h dt-bindings/clock/qcom,videocc-sm8150.h +dt-bindings/clock/qcom,videocc-sm8250.h properties: compatible: @@ -24,14 +25,17 @@ properties: - qcom,sdm845-videocc - qcom,sc7180-videocc - qcom,sm8150-videocc + - qcom,sm8250-videocc clocks: items: - description: Board XO source + - description: Board XO source, always-on (required by sm8250 only) clock-names: items: - const: bi_tcxo + - const: bi_tcxo_ao '#clock-cells': const: 1 diff --git a/include/dt-bindings/clock/qcom,videocc-sm8250.h b/include/dt-bindings/clock/qcom,videocc-sm8250.h new file mode 100644 index ..4c44f9c468db --- /dev/null +++ b/include/dt-bindings/clock/qcom,videocc-sm8250.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2018-2020, The Linux Foundation. All rights reserved. 
+ */ + +#ifndef _DT_BINDINGS_CLK_QCOM_VIDEO_CC_SM8250_H +#define _DT_BINDINGS_CLK_QCOM_VIDEO_CC_SM8250_H + +/* VIDEO_CC clocks */ +#define VIDEO_CC_AHB_CLK 0 +#define VIDEO_CC_AHB_CLK_SRC 1 +#define VIDEO_CC_MVS0_CLK 2 +#define VIDEO_CC_MVS0_CLK_SRC 3 +#define VIDEO_CC_MVS0_DIV_CLK_SRC 4 +#define VIDEO_CC_MVS0C_CLK 5 +#define VIDEO_CC_MVS0C_DIV2_DIV_CLK_SRC6 +#define VIDEO_CC_MVS1_CLK 7 +#define VIDEO_CC_MVS1_CLK_SRC 8 +#define VIDEO_CC_MVS1_DIV2_CLK 9 +#define VIDEO_CC_MVS1_DIV_CLK_SRC 10 +#define VIDEO_CC_MVS1C_CLK 11 +#define VIDEO_CC_MVS1C_DIV2_DIV_CLK_SRC12 +#define VIDEO_CC_XO_CLK13 +#define VIDEO_CC_XO_CLK_SRC14 +#define VIDEO_CC_PLL0 15 +#define VIDEO_CC_PLL1 16 + +/* VIDEO_CC resets */ +#define VIDEO_CC_CVP_INTERFACE_BCR 0 +#define VIDEO_CC_CVP_MVS0_BCR 1 +#define VIDEO_CC_MVS0C_CLK_ARES2 +#define VIDEO_CC_CVP_MVS0C_BCR 3 +#define VIDEO_CC_CVP_MVS1_BCR 4 +#define VIDEO_CC_MVS1C_CLK_ARES5 +#define VIDEO_CC_CVP_MVS1C_BCR 6 + +#define MVS0C_GDSC 0 +#define MVS1C_GDSC 1 +#define MVS0_GDSC 2 +#define MVS1_GDSC 3 + +#endif -- 2.26.1
[PATCH 4/5] clk: qcom: add video clock controller driver for SM8150
Add support for the video clock controller found on SM8150 based devices. Derived from the downstream driver. Signed-off-by: Jonathan Marek --- drivers/clk/qcom/Kconfig | 9 + drivers/clk/qcom/Makefile | 1 + drivers/clk/qcom/videocc-sm8150.c | 276 ++ 3 files changed, 286 insertions(+) create mode 100644 drivers/clk/qcom/videocc-sm8150.c diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig index 058327310c25..40d7ee9886c9 100644 --- a/drivers/clk/qcom/Kconfig +++ b/drivers/clk/qcom/Kconfig @@ -444,6 +444,15 @@ config SM_GPUCC_8250 Say Y if you want to support graphics controller devices and functionality such as 3D graphics. +config SM_VIDEOCC_8150 + tristate "SM8150 Video Clock Controller" + select SDM_GCC_8150 + select QCOM_GDSC + help + Support for the video clock controller on SM8150 devices. + Say Y if you want to support video devices and functionality such as + video encode and decode. + config SPMI_PMIC_CLKDIV tristate "SPMI PMIC clkdiv Support" depends on SPMI || COMPILE_TEST diff --git a/drivers/clk/qcom/Makefile b/drivers/clk/qcom/Makefile index 9677e769e7e9..6f4c580d2728 100644 --- a/drivers/clk/qcom/Makefile +++ b/drivers/clk/qcom/Makefile @@ -68,6 +68,7 @@ obj-$(CONFIG_SM_GCC_8150) += gcc-sm8150.o obj-$(CONFIG_SM_GCC_8250) += gcc-sm8250.o obj-$(CONFIG_SM_GPUCC_8150) += gpucc-sm8150.o obj-$(CONFIG_SM_GPUCC_8250) += gpucc-sm8250.o +obj-$(CONFIG_SM_VIDEOCC_8150) += videocc-sm8150.o obj-$(CONFIG_SPMI_PMIC_CLKDIV) += clk-spmi-pmic-div.o obj-$(CONFIG_KPSS_XCC) += kpss-xcc.o obj-$(CONFIG_QCOM_HFPLL) += hfpll.o diff --git a/drivers/clk/qcom/videocc-sm8150.c b/drivers/clk/qcom/videocc-sm8150.c new file mode 100644 index ..3087e2ec8fd4 --- /dev/null +++ b/drivers/clk/qcom/videocc-sm8150.c @@ -0,0 +1,276 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2017-2020, The Linux Foundation. All rights reserved. 
+ */ + +#include +#include +#include +#include + +#include + +#include "common.h" +#include "clk-alpha-pll.h" +#include "clk-branch.h" +#include "clk-rcg.h" +#include "clk-regmap.h" +#include "reset.h" +#include "gdsc.h" + +enum { + P_BI_TCXO, + P_CHIP_SLEEP_CLK, + P_CORE_BI_PLL_TEST_SE, + P_VIDEO_PLL0_OUT_EVEN, + P_VIDEO_PLL0_OUT_MAIN, + P_VIDEO_PLL0_OUT_ODD, +}; + +static struct pll_vco trion_vco[] = { + { 24960, 20, 0 }, +}; + +static struct alpha_pll_config video_pll0_config = { + .l = 0x14, + .alpha = 0xD555, + .config_ctl_val = 0x20485699, + .config_ctl_hi_val = 0x2267, + .config_ctl_hi1_val = 0x0024, + .user_ctl_val = 0x, + .user_ctl_hi_val = 0x0805, + .user_ctl_hi1_val = 0x00D0, +}; + +static struct clk_alpha_pll video_pll0 = { + .offset = 0x42c, + .vco_table = trion_vco, + .num_vco = ARRAY_SIZE(trion_vco), + .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_TRION], + .clkr = { + .hw.init = &(struct clk_init_data){ + .name = "video_pll0", + .parent_data = &(const struct clk_parent_data){ + .fw_name = "bi_tcxo", + }, + .num_parents = 1, + .ops = _alpha_pll_trion_ops, + }, + }, +}; + +static const struct parent_map video_cc_parent_map_0[] = { + { P_BI_TCXO, 0 }, + { P_VIDEO_PLL0_OUT_MAIN, 1 }, +}; + +static const struct clk_parent_data video_cc_parent_data_0[] = { + { .fw_name = "bi_tcxo" }, + { .hw = _pll0.clkr.hw }, +}; + +static const struct freq_tbl ftbl_video_cc_iris_clk_src[] = { + F(1920, P_BI_TCXO, 1, 0, 0), + F(2, P_VIDEO_PLL0_OUT_MAIN, 2, 0, 0), + F(24000, P_VIDEO_PLL0_OUT_MAIN, 2, 0, 0), + F(33800, P_VIDEO_PLL0_OUT_MAIN, 2, 0, 0), + F(36500, P_VIDEO_PLL0_OUT_MAIN, 2, 0, 0), + F(44400, P_VIDEO_PLL0_OUT_MAIN, 2, 0, 0), + F(53300, P_VIDEO_PLL0_OUT_MAIN, 2, 0, 0), + { } +}; + +static struct clk_rcg2 video_cc_iris_clk_src = { + .cmd_rcgr = 0x7f0, + .mnd_width = 0, + .hid_width = 5, + .parent_map = video_cc_parent_map_0, + .freq_tbl = ftbl_video_cc_iris_clk_src, + .clkr.hw.init = &(struct clk_init_data){ + .name = "video_cc_iris_clk_src", + .parent_data = 
video_cc_parent_data_0, + .num_parents = ARRAY_SIZE(video_cc_parent_data_0), + .flags = CLK_SET_RATE_PARENT, + .ops = _rcg2_shared_ops, + }, +}; + +static struct clk_branch video_cc_iris_ahb_clk = { + .halt_reg = 0x8f4, + .halt_check = BRANCH_VOTED, + .clkr = { + .enable_reg = 0x8f4, + .enable_mask = BIT(0), + .hw.init = &(struct clk_init_data){ +
Re: [PATCH v2 2/4] drm/vc4: hdmi: Add pixel bvb clock control
Hi Chanwoo, On 9/1/20 1:27 PM, Chanwoo Choi wrote: > Hi Hoegeun, > > It looks good to me. But, just one comment. > > On 9/1/20 1:07 PM, Hoegeun Kwon wrote: >> There is a problem that the output does not work at a resolution >> exceeding FHD. To solve this, we need to adjust the bvb clock at a >> resolution exceeding FHD. >> >> Signed-off-by: Hoegeun Kwon >> --- >> drivers/gpu/drm/vc4/vc4_hdmi.c | 25 + >> drivers/gpu/drm/vc4/vc4_hdmi.h | 1 + >> 2 files changed, 26 insertions(+) >> >> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c >> index 95ec5eedea39..eb3192d1fd86 100644 >> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c >> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c >> @@ -80,6 +80,7 @@ >> # define VC4_HD_M_ENABLE BIT(0) >> >> #define CEC_CLOCK_FREQ 4 >> +#define VC4_HSM_MID_CLOCK 149985000 >> >> static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused) >> { >> @@ -380,6 +381,7 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct >> drm_encoder *encoder) >> HDMI_WRITE(HDMI_VID_CTL, >> HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE); >> >> +clk_disable_unprepare(vc4_hdmi->pixel_bvb_clock); >> clk_disable_unprepare(vc4_hdmi->hsm_clock); >> clk_disable_unprepare(vc4_hdmi->pixel_clock); >> >> @@ -638,6 +640,23 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct >> drm_encoder *encoder) >> return; >> } >> >> +ret = clk_set_rate(vc4_hdmi->pixel_bvb_clock, >> +(hsm_rate > VC4_HSM_MID_CLOCK ? 15000 : 7500)); >> +if (ret) { >> +DRM_ERROR("Failed to set pixel bvb clock rate: %d\n", ret); >> +clk_disable_unprepare(vc4_hdmi->hsm_clock); >> +clk_disable_unprepare(vc4_hdmi->pixel_clock); >> +return; >> +} >> + >> +ret = clk_prepare_enable(vc4_hdmi->pixel_bvb_clock); >> +if (ret) { >> +DRM_ERROR("Failed to turn on pixel bvb clock: %d\n", ret); >> +clk_disable_unprepare(vc4_hdmi->hsm_clock); >> +clk_disable_unprepare(vc4_hdmi->pixel_clock); >> +return; >> +} > Generally, enable the clock before using clk and then change the clock rate. 
> I think that you better to change the order between clk_prepare_enable and > clk_set_rate. Thank you for your comment. As Maxime answered in another patch [1], there is no clear rule of order here. [1] https://lkml.org/lkml/2020/9/1/327 Best regards, Hoegeun
Re: linux-next: manual merge of the drm-misc tree with Linus' tree
Hi all, On Wed, 26 Aug 2020 10:01:13 +1000 Stephen Rothwell wrote: > > Hi all, > > Today's linux-next merge of the drm-misc tree got conflicts in: > > drivers/video/fbdev/arcfb.c > drivers/video/fbdev/atmel_lcdfb.c > drivers/video/fbdev/savage/savagefb_driver.c > > between commit: > > df561f6688fe ("treewide: Use fallthrough pseudo-keyword") > > from Linus' tree and commit: > > ad04fae0de07 ("fbdev: Use fallthrough pseudo-keyword") > > from the drm-misc tree. > > I fixed it up (they are much the same, I just used the version from Linus' > tree) and can carry the fix as necessary. This is now fixed as far as > linux-next is concerned, but any non trivial conflicts should be mentioned > to your upstream maintainer when your tree is submitted for merging. > You may also want to consider cooperating with the maintainer of the > conflicting tree to minimise any particularly complex conflicts. These conflicts now appear in the merge between the drm tree and Linus' tree. -- Cheers, Stephen Rothwell
[PATCH] scsi: ufs: Fix NOP OUT timeout value
Some low-power Samsung UFS devices fail to boot. The cause is that these devices have a slightly longer latency for the NOP OUT response. Booting fails because the NOP OUT command is issued during initialization to check whether the device transport protocol is ready. This issue is resolved by relaxing the NOP_OUT_TIMEOUT value. NOP_OUT_TIMEOUT: 30ms -> 50ms Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufshcd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 06e2439d523c..5cbd0e9e4ef8 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -36,8 +36,8 @@ /* NOP OUT retries waiting for NOP IN response */ #define NOP_OUT_RETRIES 10 -/* Timeout after 30 msecs if NOP OUT hangs without response */ -#define NOP_OUT_TIMEOUT 30 /* msecs */ +/* Timeout after 50 msecs if NOP OUT hangs without response */ +#define NOP_OUT_TIMEOUT 50 /* msecs */ /* Query request retries */ #define QUERY_REQ_RETRIES 3 -- 2.17.1
Re: [PATCH v33 11/21] x86/sgx: Linux Enclave Driver
On Fri, 03 Jul 2020 22:31:10 -0500, Jarkko Sakkinen wrote: On Wed, Jul 01, 2020 at 08:59:02PM -0700, Sean Christopherson wrote: On Thu, Jun 18, 2020 at 01:08:33AM +0300, Jarkko Sakkinen wrote: > +static int sgx_validate_secs(const struct sgx_secs *secs, > + unsigned long ssaframesize) > +{ > + if (secs->size < (2 * PAGE_SIZE) || !is_power_of_2(secs->size)) > + return -EINVAL; > + > + if (secs->base & (secs->size - 1)) > + return -EINVAL; > + > + if (secs->miscselect & sgx_misc_reserved_mask || > + secs->attributes & sgx_attributes_reserved_mask || > + secs->xfrm & sgx_xfrm_reserved_mask) > + return -EINVAL; > + > + if (secs->attributes & SGX_ATTR_MODE64BIT) { > + if (secs->size > sgx_encl_size_max_64) > + return -EINVAL; > + } else if (secs->size > sgx_encl_size_max_32) > + return -EINVAL; These should be >=, not >, the SDM uses one of those fancy ≥ ligatures. Internal versions use more obvious pseudocode, e.g.: if ((DS:TMP_SECS.ATTRIBUTES.MODE64BIT = 1) AND (DS:TMP_SECS.SIZE AND (~((1 << CPUID.18.0:EDX[15:8]) – 1))) { #GP(0); Updated as: static int sgx_validate_secs(const struct sgx_secs *secs) { u64 max_size = (secs->attributes & SGX_ATTR_MODE64BIT) ? sgx_encl_size_max_64 : sgx_encl_size_max_32; if (secs->size < (2 * PAGE_SIZE) || !is_power_of_2(secs->size)) return -EINVAL; if (secs->base & (secs->size - 1)) return -EINVAL; if (secs->miscselect & sgx_misc_reserved_mask || secs->attributes & sgx_attributes_reserved_mask || secs->xfrm & sgx_xfrm_reserved_mask) return -EINVAL; if (secs->size >= max_size) return -EINVAL; This should be > not >=. Issue raised and fixed by Fábio Silva for ported patches for OOT SGX support: https://github.com/intel/SGXDataCenterAttestationPrimitives/pull/123 I tested and verified with Intel arch, the comparison indeed should be >. Thanks Haitao
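The conclusion of the thread above (reject only sizes strictly greater than the maximum) can be sketched in plain user-space C. This is a minimal illustration, not the driver code: `secs_size_ok()` is a hypothetical name, `max_size` stands in for `sgx_encl_size_max_64` or `sgx_encl_size_max_32`, and the sketch assumes the maximum is itself a power of two and that pages are 4K.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL /* assumption: 4K pages, as on x86 */

/* User-space stand-in for the kernel helper of the same name. */
static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: returns true when an enclave size passes the
 * size checks discussed in the thread.
 */
static bool secs_size_ok(uint64_t size, uint64_t max_size)
{
	/* Assumption of this sketch: the maximum is a power of two. */
	assert(is_power_of_2(max_size));

	if (size < 2 * PAGE_SIZE || !is_power_of_2(size))
		return false;

	/* '>' rather than '>=': a size of exactly max_size is accepted. */
	if (size > max_size)
		return false;

	return true;
}
```

With `max_size = 1ULL << 36`, a size of exactly `1ULL << 36` is accepted and `1ULL << 37` is rejected, which is the behavioural difference between the two comparisons debated above.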
Re: linux-next: build failure after merge of the drm-misc tree
Hi all, On Wed, 26 Aug 2020 10:55:47 +1000 Stephen Rothwell wrote: > > After merging the drm-misc tree, today's linux-next build (x86_64 > allmodconfig) failed like this: > > drivers/gpu/drm/qxl/qxl_display.c: In function > 'qxl_display_read_client_monitors_config': > include/drm/drm_modeset_lock.h:167:7: error: implicit declaration of function > 'drm_drv_uses_atomic_modeset' [-Werror=implicit-function-declaration] > 167 | if (!drm_drv_uses_atomic_modeset(dev))\ > | ^~~ > drivers/gpu/drm/qxl/qxl_display.c:187:2: note: in expansion of macro > 'DRM_MODESET_LOCK_ALL_BEGIN' > 187 | DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, > DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret); > | ^~ > drivers/gpu/drm/qxl/qxl_display.c:189:35: error: macro > "DRM_MODESET_LOCK_ALL_END" requires 3 arguments, but only 2 given > 189 | DRM_MODESET_LOCK_ALL_END(ctx, ret); > | ^ > In file included from include/drm/drm_crtc.h:36, > from include/drm/drm_atomic.h:31, > from drivers/gpu/drm/qxl/qxl_display.c:29: > include/drm/drm_modeset_lock.h:194: note: macro "DRM_MODESET_LOCK_ALL_END" > defined here > 194 | #define DRM_MODESET_LOCK_ALL_END(dev, ctx, ret)\ > | > drivers/gpu/drm/qxl/qxl_display.c:189:2: error: 'DRM_MODESET_LOCK_ALL_END' > undeclared (first use in this function) > 189 | DRM_MODESET_LOCK_ALL_END(ctx, ret); > | ^~~~ > drivers/gpu/drm/qxl/qxl_display.c:189:2: note: each undeclared identifier is > reported only once for each function it appears in > drivers/gpu/drm/qxl/qxl_display.c:187:2: error: label 'modeset_lock_fail' > used but not defined > 187 | DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, > DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret); > | ^~ > In file included from include/drm/drm_crtc.h:36, > from include/drm/drm_atomic.h:31, > from drivers/gpu/drm/qxl/qxl_display.c:29: > include/drm/drm_modeset_lock.h:170:1: warning: label 'modeset_lock_retry' > defined but not used [-Wunused-label] > 170 | modeset_lock_retry: \ > | ^~ > drivers/gpu/drm/qxl/qxl_display.c:187:2: note: in expansion of macro > 
'DRM_MODESET_LOCK_ALL_BEGIN' > 187 | DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, > DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret); > | ^~ > drivers/gpu/drm/qxl/qxl_display.c: In function > 'qxl_framebuffer_surface_dirty': > drivers/gpu/drm/qxl/qxl_display.c:434:35: error: macro > "DRM_MODESET_LOCK_ALL_END" requires 3 arguments, but only 2 given > 434 | DRM_MODESET_LOCK_ALL_END(ctx, ret); > | ^ > In file included from include/drm/drm_crtc.h:36, > from include/drm/drm_atomic.h:31, > from drivers/gpu/drm/qxl/qxl_display.c:29: > include/drm/drm_modeset_lock.h:194: note: macro "DRM_MODESET_LOCK_ALL_END" > defined here > 194 | #define DRM_MODESET_LOCK_ALL_END(dev, ctx, ret)\ > | > drivers/gpu/drm/qxl/qxl_display.c:434:2: error: 'DRM_MODESET_LOCK_ALL_END' > undeclared (first use in this function) > 434 | DRM_MODESET_LOCK_ALL_END(ctx, ret); > | ^~~~ > drivers/gpu/drm/qxl/qxl_display.c:411:2: error: label 'modeset_lock_fail' > used but not defined > 411 | DRM_MODESET_LOCK_ALL_BEGIN(fb->dev, ctx, > DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret); > | ^~ > In file included from include/drm/drm_crtc.h:36, > from include/drm/drm_atomic.h:31, > from drivers/gpu/drm/qxl/qxl_display.c:29: > include/drm/drm_modeset_lock.h:170:1: warning: label 'modeset_lock_retry' > defined but not used [-Wunused-label] > 170 | modeset_lock_retry: \ > | ^~ > drivers/gpu/drm/qxl/qxl_display.c:411:2: note: in expansion of macro > 'DRM_MODESET_LOCK_ALL_BEGIN' > 411 | DRM_MODESET_LOCK_ALL_BEGIN(fb->dev, ctx, > DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret); > | ^~ > > Caused by commit > > bbaac1354cc9 ("drm/qxl: Replace deprecated function in qxl_display") > > interacting with commit > > 77ef38574beb ("drm/modeset-lock: Take the modeset BKL for legacy drivers") > > from the drm-misc-fixes tree. > > drivers/gpu/drm/qxl/qxl_display.c manages to include > drm/drm_modeset_lock.h by some indirect route, but fails to have > drm/drm_drv.h similarly included. 
In fact, drm/drm_modeset_lock.h should > have included drm/drm_drv.h since it uses things declared there, and > drivers/gpu/drm/qxl/qxl_display.c should include drm/drm_modeset_lock.h > similarly. > > I have added the following hack patch for today. > > From: Stephen Rothwell > Date: Wed, 26 Aug 2020 10:40:18 +1000 > Subject: [PATCH] fix interaction with drm-misc-fix commit > > Signed-off-by: Stephen Rothwell > --- > drivers/gpu/drm/qxl/qxl_display.c | 5 +++-- >
Re: [PATCH v2] kprobes: Fix kill kprobe which has been marked as gone
Hi Ingo, Could you merge this fix to -tip? I can resend it with other kprobes fixes. Hi Muchun, We also need "Cc: sta...@vger.kernel.org" for the bugfix so that the patch can be backported correctly after it is merged upstream. Thank you, On Mon, 31 Aug 2020 10:59:19 +0800 Muchun Song wrote: > Cc Andrew and Steven. > > Any other comments or someone can add this to the queue for the > merge window? It's worth fixing it. > > On Sat, Aug 22, 2020 at 11:01 AM Muchun Song wrote: > > > > If a kprobe is marked as gone, we should not kill it again. Otherwise, > > we can disarm the kprobe more than once. In that case, the statistics > > of kprobe_ftrace_enabled can unbalance which can lead to that kprobe > > do not work. > > > > Fixes: e8386a0cb22f ("kprobes: support probing module __exit function") > > Signed-off-by: Muchun Song > > Co-developed-by: Chengming Zhou > > Signed-off-by: Chengming Zhou > > Acked-by: Masami Hiramatsu > > --- > > changelogs in v2: > > 1. Add a WARN_ON_ONCE in the kill_kprobe() to catch incorrect use of it. > > 2. Update 'Fixes' tag in the commit log.
> > > > kernel/kprobes.c | 9 - > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > > index d36e2b017588..9348b0c36ae0 100644 > > --- a/kernel/kprobes.c > > +++ b/kernel/kprobes.c > > @@ -2143,6 +2143,9 @@ static void kill_kprobe(struct kprobe *p) > > > > lockdep_assert_held(_mutex); > > > > + if (WARN_ON_ONCE(kprobe_gone(p))) > > + return; > > + > > p->flags |= KPROBE_FLAG_GONE; > > if (kprobe_aggrprobe(p)) { > > /* > > @@ -2422,7 +2425,10 @@ static int kprobes_module_callback(struct > > notifier_block *nb, > > mutex_lock(_mutex); > > for (i = 0; i < KPROBE_TABLE_SIZE; i++) { > > head = _table[i]; > > - hlist_for_each_entry(p, head, hlist) > > + hlist_for_each_entry(p, head, hlist) { > > + if (kprobe_gone(p)) > > + continue; > > + > > if (within_module_init((unsigned long)p->addr, mod) > > || > > (checkcore && > > within_module_core((unsigned long)p->addr, > > mod))) { > > @@ -2439,6 +2445,7 @@ static int kprobes_module_callback(struct > > notifier_block *nb, > > */ > > kill_kprobe(p); > > } > > + } > > } > > if (val == MODULE_STATE_GOING) > > remove_module_kprobe_blacklist(mod); > > -- > > 2.11.0 > > > > > -- > Yours, > Muchun -- Masami Hiramatsu
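The problem described in the quoted commit message (a second kill disarming the probe again and unbalancing a counter) can be modelled in a few lines of user-space C. This is a simplified sketch under stated assumptions, not the kernel code: `struct probe_model`, `kill_probe()` and the `enabled_count` parameter are hypothetical stand-ins for `struct kprobe`, `kill_kprobe()` and the `kprobe_ftrace_enabled` statistic.

```c
#include <assert.h>
#include <stdbool.h>

#define KPROBE_FLAG_GONE 1u /* flag name from the patch; value is illustrative */

/* Hypothetical, stripped-down stand-in for struct kprobe. */
struct probe_model {
	unsigned int flags;
};

/*
 * Hypothetical model of the guarded kill path: the probe is disarmed
 * (modelled as a counter decrement) only the first time it is killed.
 * Returns true if this call actually disarmed the probe.
 */
static bool kill_probe(struct probe_model *p, int *enabled_count)
{
	assert(p != 0 && enabled_count != 0);

	if (p->flags & KPROBE_FLAG_GONE)
		return false; /* already gone: do not disarm twice */

	p->flags |= KPROBE_FLAG_GONE;
	(*enabled_count)--;
	return true;
}
```

Without the early return, calling `kill_probe()` twice on the same probe would drive the counter to -1, which is the kind of imbalance the patch guards against.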
Re: [v4,0/4] introduce TI reset controller for MT8192 SoC
Hi Rob, Philipp, Matthias and all Gentle ping for this patch set. Thanks Crystal > > -Original Message- > From: Crystal Guo [mailto:crystal@mediatek.com] > Sent: Monday, August 17, 2020 11:03 AM > To: p.za...@pengutronix.de; robh...@kernel.org; matthias@gmail.com > Cc: srv_heupstream; linux-media...@lists.infradead.org; > linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org; > devicet...@vger.kernel.org; s-a...@ti.com; a...@ti.com; Seiya Wang (王迺君); > Stanley Chu (朱原陞); Yingjoe Chen (陳英洲); Fan Chen (陳凡); Yong Liang (梁勇) > Subject: [v4,0/4] introduce TI reset controller for MT8192 SoC > > v4: > fix typos on v3 commit message. > > v3: > 1. revert v2 changes. > 2. add 'reset-duration-us' property to declare a minimum delay, which needs > to be waited between assert and deassert. > 3. add 'mediatek,infra-reset' to compatible. > > > v2 changes: > https://patchwork.kernel.org/patch/11697371/ > 1. add 'assert-deassert-together' property to introduce a new reset handler, > which allows device to do serialized assert and deassert operations in a > single step by 'reset' method. > 2. add 'update-force' property to introduce force-update method, which forces > the write operation in case the read already happens to return the correct > value. > 3. add 'generic-reset' to compatible > > v1 changes: > https://patchwork.kernel.org/patch/11690523/ > https://patchwork.kernel.org/patch/11690527/ > > Crystal Guo (4): > dt-binding: reset-controller: ti: add reset-duration-us property > dt-binding: reset-controller: ti: add 'mediatek,infra-reset' to > compatible > reset-controller: ti: introduce a new reset handler > arm64: dts: mt8192: add infracfg_rst node > > .../bindings/reset/ti-syscon-reset.txt| 6 + > arch/arm64/boot/dts/mediatek/mt8192.dtsi | 11 +++- > drivers/reset/reset-ti-syscon.c | 26 +-- > 3 files changed, 40 insertions(+), 3 deletions(-)
[RFC v2 00/11] Hyper-V: Support PAGE_SIZE larger than 4K
This patchset adds the necessary changes to support guests whose page size is larger than 4K. Previous version: v1: https://lore.kernel.org/lkml/20200721014135.84140-1-boqun.f...@gmail.com/ Changes since v1: * Introduce a hv_ring_gpadl_send_offset() to improve the readability as per suggestion from Michael. * Use max(..., 2 * PAGE_SIZE) instead of hard-coding size for inputvsc ringbuffer to align with other ringbuffer settings * Calculate the exact size of storvsc payload (rather than a maximum size) to save memory in storvsc_queuecommand() as per suggestion from Michael. * Use "unsigned int" for loop index inside a page, so that we can have the compiler's help for optimization in PAGE_SIZE == HV_HYP_PAGE_SIZE case as per suggestion from Michael. * Rebase on to v5.9-rc2 with Michael's latest core support patchset[1] Hyper-V always uses 4K as the page size and expects the same page size when communicating with guests. That is, all the "pfn"s in the hypervisor-guest communication protocol are the page numbers in the unit of HV_HYP_PAGE_SIZE rather than PAGE_SIZE. To support guests with larger page size, we need to convert between these two page sizes correctly in the hypervisor-guest communication, which is basically what this patchset does. In this conversion, one challenge is how to handle the ringbuffer. A ringbuffer has two parts: a header and a data part, both of which want to be PAGE_SIZE aligned in the guest, because we use the "double mapping" trick to map the data part twice in the guest virtual address space for faster wrap-around and easier in-place processing of data. However, the Hyper-V hypervisor always treats the ringbuffer headers as 4k pages. To overcome this gap, we enlarge the hv_ring_buffer structure to be always PAGE_SIZE aligned, and introduce the gpadl type concept to allow vmbus_establish_gpadl() to handle ringbuffer cases specially. Note that gpadl type is only meaningful to the guest, there is no such concept in Hyper-V hypervisor.
This patchset consists of 11 patches: Patch 1~4: Introduce the types of gpadl, so that we can handle ringbuffer when PAGE_SIZE != HV_HYP_PAGE_SIZE, and also fix a few places where we should use HV_HYP_PAGE_SIZE rather than PAGE_SIZE. Patch 5~6: Add a few helper functions to help calculate the hvpfn (page number in the unit of HV_HYP_PAGE_SIZE) and other related data, so that we can use them in driver code. Patch 7~11: Use the helpers and change the driver code accordingly to make the net/input/util/storage drivers work with PAGE_SIZE != HV_HYP_PAGE_SIZE. I've done some tests with PAGE_SIZE=64k and PAGE_SIZE=16k configurations on ARM64 guests (with Michael's patchset[1] for ARM64 Hyper-V guest support), nothing major breaks yet ;-) (I could observe an error caused by unaligned firmware data, but it's better to have it fixed in the Hyper-V). I also have done a build and boot test on x86, everything worked well. Looking forward to comments and suggestions! Regards, Boqun [1]: https://lore.kernel.org/lkml/1598287583-71762-1-git-send-email-mikel...@microsoft.com/ Boqun Feng (11): Drivers: hv: vmbus: Always use HV_HYP_PAGE_SIZE for gpadl Drivers: hv: vmbus: Move __vmbus_open() Drivers: hv: vmbus: Introduce types of GPADL Drivers: hv: Use HV_HYP_PAGE in hv_synic_enable_regs() Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv header hv: hyperv.h: Introduce some hvpfn helper functions hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication Input: hyperv-keyboard: Make ringbuffer at least take two pages HID: hyperv: Make ringbuffer at least take two pages Driver: hv: util: Make ringbuffer at least take two pages scsi: storvsc: Support PAGE_SIZE larger than 4K drivers/hid/hid-hyperv.c | 4 +- drivers/hv/channel.c | 462 -- drivers/hv/hv.c | 4 +- drivers/hv/hv_util.c | 16 +- drivers/input/serio/hyperv-keyboard.c | 4 +- drivers/net/hyperv/netvsc.c | 2 +- drivers/net/hyperv/netvsc_drv.c | 46 +-- drivers/net/hyperv/rndis_filter.c | 12 +- drivers/scsi/storvsc_drv.c| 60 +++- 
include/linux/hyperv.h| 63 +++- 10 files changed, 447 insertions(+), 226 deletions(-) -- 2.28.0
[RFC v2 04/11] Drivers: hv: Use HV_HYP_PAGE in hv_synic_enable_regs()
Both base_*_gpa fields should hold the guest page number in units of the Hyper-V page size, so use HV_HYP_PAGE_SHIFT instead of PAGE_SHIFT. Signed-off-by: Boqun Feng --- drivers/hv/hv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index 7499079f4077..8ac8bbf5b5aa 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -165,7 +165,7 @@ void hv_synic_enable_regs(unsigned int cpu) hv_get_simp(simp.as_uint64); simp.simp_enabled = 1; simp.base_simp_gpa = virt_to_phys(hv_cpu->synic_message_page) - >> PAGE_SHIFT; + >> HV_HYP_PAGE_SHIFT; hv_set_simp(simp.as_uint64); @@ -173,7 +173,7 @@ void hv_synic_enable_regs(unsigned int cpu) hv_get_siefp(siefp.as_uint64); siefp.siefp_enabled = 1; siefp.base_siefp_gpa = virt_to_phys(hv_cpu->synic_event_page) - >> PAGE_SHIFT; + >> HV_HYP_PAGE_SHIFT; hv_set_siefp(siefp.as_uint64); -- 2.28.0
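Why the shift matters can be seen with a small user-space sketch. It assumes HV_HYP_PAGE_SHIFT is 12 (Hyper-V's 4K page) and uses a made-up physical address; `phys_to_pfn()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define HV_HYP_PAGE_SHIFT 12 /* assumption: Hyper-V always uses 4K pages */

/* Hypothetical helper: page number in units of pages of size (1 << shift). */
static uint64_t phys_to_pfn(uint64_t paddr, unsigned int shift)
{
	assert(shift < 64);
	return paddr >> shift;
}
```

For a guest with 64K pages (a shift of 16), the illustrative address 0x12340000 gives page number 0x1234 with the guest shift but 0x12340 with HV_HYP_PAGE_SHIFT. Only the latter is a page number in the unit the hypervisor expects. On a 4K guest both shifts are 12 and the two results coincide, which is why the old code worked there.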
[RFC v2 02/11] Drivers: hv: vmbus: Move __vmbus_open()
Pure function movement, no functional changes. The move is made because, in a later change, __vmbus_open() will rely on some static functions defined later in the file, so we separate the move and the modification of __vmbus_open() into two patches to make review easier. Signed-off-by: Boqun Feng Reviewed-by: Wei Liu --- drivers/hv/channel.c | 309 ++- 1 file changed, 155 insertions(+), 154 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 4d0f8e5a88d6..1cbe8fc931fc 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -109,160 +109,6 @@ int vmbus_alloc_ring(struct vmbus_channel *newchannel, } EXPORT_SYMBOL_GPL(vmbus_alloc_ring); -static int __vmbus_open(struct vmbus_channel *newchannel, - void *userdata, u32 userdatalen, - void (*onchannelcallback)(void *context), void *context) -{ - struct vmbus_channel_open_channel *open_msg; - struct vmbus_channel_msginfo *open_info = NULL; - struct page *page = newchannel->ringbuffer_page; - u32 send_pages, recv_pages; - unsigned long flags; - int err; - - if (userdatalen > MAX_USER_DEFINED_BYTES) - return -EINVAL; - - send_pages = newchannel->ringbuffer_send_offset; - recv_pages = newchannel->ringbuffer_pagecount - send_pages; - - if (newchannel->state != CHANNEL_OPEN_STATE) - return -EINVAL; - - newchannel->state = CHANNEL_OPENING_STATE; - newchannel->onchannel_callback = onchannelcallback; - newchannel->channel_callback_context = context; - - err = hv_ringbuffer_init(>outbound, page, send_pages); - if (err) - goto error_clean_ring; - - err = hv_ringbuffer_init(>inbound, -[send_pages], recv_pages); - if (err) - goto error_clean_ring; - - /* Establish the gpadl for the ring buffer */ - newchannel->ringbuffer_gpadlhandle = 0; - - err = vmbus_establish_gpadl(newchannel, - page_address(newchannel->ringbuffer_page), - (send_pages + recv_pages) << PAGE_SHIFT, - >ringbuffer_gpadlhandle); - if (err) - goto error_clean_ring; - - /* Create and init the channel open message */ - open_info = 
kmalloc(sizeof(*open_info) + - sizeof(struct vmbus_channel_open_channel), - GFP_KERNEL); - if (!open_info) { - err = -ENOMEM; - goto error_free_gpadl; - } - - init_completion(_info->waitevent); - open_info->waiting_channel = newchannel; - - open_msg = (struct vmbus_channel_open_channel *)open_info->msg; - open_msg->header.msgtype = CHANNELMSG_OPENCHANNEL; - open_msg->openid = newchannel->offermsg.child_relid; - open_msg->child_relid = newchannel->offermsg.child_relid; - open_msg->ringbuffer_gpadlhandle = newchannel->ringbuffer_gpadlhandle; - open_msg->downstream_ringbuffer_pageoffset = newchannel->ringbuffer_send_offset; - open_msg->target_vp = hv_cpu_number_to_vp_number(newchannel->target_cpu); - - if (userdatalen) - memcpy(open_msg->userdata, userdata, userdatalen); - - spin_lock_irqsave(_connection.channelmsg_lock, flags); - list_add_tail(_info->msglistentry, - _connection.chn_msg_list); - spin_unlock_irqrestore(_connection.channelmsg_lock, flags); - - if (newchannel->rescind) { - err = -ENODEV; - goto error_free_info; - } - - err = vmbus_post_msg(open_msg, -sizeof(struct vmbus_channel_open_channel), true); - - trace_vmbus_open(open_msg, err); - - if (err != 0) - goto error_clean_msglist; - - wait_for_completion(_info->waitevent); - - spin_lock_irqsave(_connection.channelmsg_lock, flags); - list_del(_info->msglistentry); - spin_unlock_irqrestore(_connection.channelmsg_lock, flags); - - if (newchannel->rescind) { - err = -ENODEV; - goto error_free_info; - } - - if (open_info->response.open_result.status) { - err = -EAGAIN; - goto error_free_info; - } - - newchannel->state = CHANNEL_OPENED_STATE; - kfree(open_info); - return 0; - -error_clean_msglist: - spin_lock_irqsave(_connection.channelmsg_lock, flags); - list_del(_info->msglistentry); - spin_unlock_irqrestore(_connection.channelmsg_lock, flags); -error_free_info: - kfree(open_info); -error_free_gpadl: - vmbus_teardown_gpadl(newchannel, newchannel->ringbuffer_gpadlhandle); - newchannel->ringbuffer_gpadlhandle 
= 0; -error_clean_ring: - hv_ringbuffer_cleanup(>outbound); - hv_ringbuffer_cleanup(>inbound); -
[RFC v2 03/11] Drivers: hv: vmbus: Introduce types of GPADL
This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The types of GPADL are purely a guest-side concept, IOW the hypervisor treats them as the same. The reason for introducing the types of GPADL is to support guests whose page size is not 4k (the page size of Hyper-V hypervisor). In these guests, both the headers and the data parts of the ringbuffers need to be aligned to the PAGE_SIZE, because 1) some of the ringbuffers will be mapped into userspace and 2) we use "double mapping" mechanism to support fast wrap-around, and "double mapping" relies on ringbuffers being page-aligned. However, the Hyper-V hypervisor only uses 4k (HV_HYP_PAGE_SIZE) headers. Our solution to this is that we always make the headers of ringbuffers take one guest page and when GPADL is established between the guest and hypervisor, only the first 4k of the header is used. To handle this special case, we need the types of GPADL to distinguish the different guest memory usage for GPADL. Type enum is introduced along with several general interfaces to describe the differences between normal buffer GPADL and ringbuffer GPADL. Signed-off-by: Boqun Feng --- drivers/hv/channel.c | 159 +++-- include/linux/hyperv.h | 44 +++- 2 files changed, 182 insertions(+), 21 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 1cbe8fc931fc..7c443fd567e4 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -35,6 +35,98 @@ static unsigned long virt_to_hvpfn(void *addr) return paddr >> HV_HYP_PAGE_SHIFT; } +/* + * hv_gpadl_size - Return the real size of a gpadl, the size that Hyper-V uses + * + * For BUFFER gpadl, Hyper-V uses the exact same size as the guest does. + * + * For RING gpadl, in each ring, the guest uses one PAGE_SIZE as the header + * (because of the alignment requirement), however, the hypervisor only + * uses the first HV_HYP_PAGE_SIZE as the header, therefore leaving a + * (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap. 
And since there are two rings in a + * ringbuffer, So the total size for a RING gpadl that Hyper-V uses is the + * total size that the guest uses minus twice of the gap size. + */ +static inline u32 hv_gpadl_size(enum hv_gpadl_type type, u32 size) +{ + switch (type) { + case HV_GPADL_BUFFER: + return size; + case HV_GPADL_RING: + /* The size of a ringbuffer must be page-aligned */ + BUG_ON(size % PAGE_SIZE); + /* +* Two things to notice here: +* 1) We're processing two ring buffers as a unit +* 2) We're skipping any space larger than HV_HYP_PAGE_SIZE in +* the first guest-size page of each of the two ring buffers. +* So we effectively subtract out two guest-size pages, and add +* back two Hyper-V size pages. +*/ + return size - 2 * (PAGE_SIZE - HV_HYP_PAGE_SIZE); + } + BUG(); + return 0; +} + +/* + * hv_ring_gpadl_send_offset - Calculate the send offset in a ring gpadl based + * on the offset in the guest + * + * @send_offset: the offset (in bytes) where the send ringbuffer starts in the + * virtual address space of the guest + */ +static inline u32 hv_ring_gpadl_send_offset(u32 send_offset) +{ + + /* +* For RING gpadl, in each ring, the guest uses one PAGE_SIZE as the +* header (because of the alignment requirement), however, the +* hypervisor only uses the first HV_HYP_PAGE_SIZE as the header, +* therefore leaving a (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap. +* +* And to calculate the effective send offset in gpadl, we need to +* substract this gap. 
+*/ + return send_offset - (PAGE_SIZE - HV_HYP_PAGE_SIZE); +} + +/* + * hv_gpadl_hvpfn - Return the Hyper-V page PFN of the @i th Hyper-V page in + * the gpadl + * + * @type: the type of the gpadl + * @kbuffer: the pointer to the gpadl in the guest + * @size: the total size (in bytes) of the gpadl + * @send_offset: the offset (in bytes) where the send ringbuffer starts in the + * virtual address space of the guest + * @i: the index + */ +static inline u64 hv_gpadl_hvpfn(enum hv_gpadl_type type, void *kbuffer, +u32 size, u32 send_offset, int i) +{ + int send_idx = hv_ring_gpadl_send_offset(send_offset) >> HV_HYP_PAGE_SHIFT; + unsigned long delta = 0UL; + + switch (type) { + case HV_GPADL_BUFFER: + break; + case HV_GPADL_RING: + if (i == 0) + delta = 0; + else if (i <= send_idx) + delta = PAGE_SIZE - HV_HYP_PAGE_SIZE; + else + delta = 2 * (PAGE_SIZE - HV_HYP_PAGE_SIZE); + break; + default: + BUG(); +
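The size and offset arithmetic for a RING gpadl described in this patch can be checked with concrete numbers. The following is a user-space sketch, not the patch itself: the guest page size is passed as a parameter (in the kernel it is the compile-time PAGE_SIZE), and the function names `ring_gpadl_size()` and `ring_gpadl_send_offset()` are hypothetical analogues of `hv_gpadl_size()` and `hv_ring_gpadl_send_offset()`.

```c
#include <assert.h>
#include <stdint.h>

#define HV_HYP_PAGE_SIZE 4096u /* Hyper-V page size */

/*
 * Size that Hyper-V sees for a RING gpadl: each of the two rings loses
 * the (guest_page_size - HV_HYP_PAGE_SIZE) gap left in its header page.
 */
static uint32_t ring_gpadl_size(uint32_t size, uint32_t guest_page_size)
{
	assert(guest_page_size >= HV_HYP_PAGE_SIZE);
	assert(size % guest_page_size == 0);  /* ringbuffer is page-aligned */
	assert(size >= 2 * guest_page_size);  /* at least one header page per ring */
	return size - 2 * (guest_page_size - HV_HYP_PAGE_SIZE);
}

/* Send offset as seen by Hyper-V: skip the gap in the first ring's header. */
static uint32_t ring_gpadl_send_offset(uint32_t send_offset,
				       uint32_t guest_page_size)
{
	assert(send_offset >= guest_page_size);
	return send_offset - (guest_page_size - HV_HYP_PAGE_SIZE);
}
```

With 64K guest pages and a ringbuffer of four guest pages (262144 bytes, two pages per ring), the gap per ring is 65536 - 4096 = 61440 bytes, so Hyper-V sees 262144 - 122880 = 139264 bytes and a send offset of 131072 - 61440 = 69632. With 4K guest pages the gap is zero and both values are unchanged.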
[RFC v2 09/11] HID: hyperv: Make ringbuffer at least take two pages
When PAGE_SIZE > HV_HYP_PAGE_SIZE, we need the ringbuffer size to be at least 2 * PAGE_SIZE: one page for the header and at least one page of the data part (because of the alignment requirement for double mapping). So make sure the ringbuffer sizes are at least 2 * PAGE_SIZE when using vmbus_open() to establish the vmbus connection. Signed-off-by: Boqun Feng --- drivers/hid/hid-hyperv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c index 0b6ee1dee625..dff032f17ad0 100644 --- a/drivers/hid/hid-hyperv.c +++ b/drivers/hid/hid-hyperv.c @@ -104,8 +104,8 @@ struct synthhid_input_report { #pragma pack(pop) -#define INPUTVSC_SEND_RING_BUFFER_SIZE (40 * 1024) -#define INPUTVSC_RECV_RING_BUFFER_SIZE (40 * 1024) +#define INPUTVSC_SEND_RING_BUFFER_SIZE max(40 * 1024, 2 * PAGE_SIZE) +#define INPUTVSC_RECV_RING_BUFFER_SIZE max(40 * 1024, 2 * PAGE_SIZE) enum pipe_prot_msg_type { -- 2.28.0
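The effect of `max(40 * 1024, 2 * PAGE_SIZE)` for different guest page sizes can be sketched as follows. This is a user-space illustration only: `ring_buffer_size()` is a hypothetical function standing in for the macro, and the guest page size is a parameter rather than the kernel's compile-time PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for max(default_size, 2 * PAGE_SIZE):
 * one guest page for the header plus at least one for the data part.
 */
static size_t ring_buffer_size(size_t default_size, size_t guest_page_size)
{
	size_t min_size;

	assert(guest_page_size > 0);
	min_size = 2 * guest_page_size;
	return default_size > min_size ? default_size : min_size;
}
```

For the 40 KiB default, 4K and 16K guests keep 40960 bytes (their two-page minimums of 8192 and 32768 bytes are smaller), while a 64K guest is raised to 131072 bytes.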
[RFC v2 08/11] Input: hyperv-keyboard: Make ringbuffer at least take two pages
When PAGE_SIZE > HV_HYP_PAGE_SIZE, we need the ringbuffer size to be at
least 2 * PAGE_SIZE: one page for the header and at least one page for
the data part (because of the alignment requirement for double mapping).

So make sure the ringbuffer sizes are at least 2 * PAGE_SIZE when using
vmbus_open() to establish the vmbus connection.

Signed-off-by: Boqun Feng
---
 drivers/input/serio/hyperv-keyboard.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/serio/hyperv-keyboard.c b/drivers/input/serio/hyperv-keyboard.c
index df4e9f6f4529..77ba57ba2691 100644
--- a/drivers/input/serio/hyperv-keyboard.c
+++ b/drivers/input/serio/hyperv-keyboard.c
@@ -75,8 +75,8 @@ struct synth_kbd_keystroke {

 #define HK_MAXIMUM_MESSAGE_SIZE 256

-#define KBD_VSC_SEND_RING_BUFFER_SIZE (40 * 1024)
-#define KBD_VSC_RECV_RING_BUFFER_SIZE (40 * 1024)
+#define KBD_VSC_SEND_RING_BUFFER_SIZE max(40 * 1024, 2 * PAGE_SIZE)
+#define KBD_VSC_RECV_RING_BUFFER_SIZE max(40 * 1024, 2 * PAGE_SIZE)

 #define XTKBD_EMUL0 0xe0
 #define XTKBD_EMUL1 0xe1
-- 
2.28.0
[RFC v2 10/11] Driver: hv: util: Make ringbuffer at least take two pages
When PAGE_SIZE > HV_HYP_PAGE_SIZE, we need the ringbuffer size to be at
least 2 * PAGE_SIZE: one page for the header and at least one page for
the data part (because of the alignment requirement for double mapping).

So make sure the ringbuffer sizes are at least 2 * PAGE_SIZE when using
vmbus_open() to establish the vmbus connection.

Signed-off-by: Boqun Feng
---
 drivers/hv/hv_util.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 92ee0fe4c919..73a77bead2be 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -461,6 +461,14 @@ static void heartbeat_onchannelcallback(void *context)
 	}
 }

+/*
+ * The size of each ring should be at least 2 * PAGE_SIZE, because we need one
+ * page for the header and at least another page (because of the alignment
+ * requirement for double mapping) for data part.
+ */
+#define HV_UTIL_RING_SEND_SIZE max(4 * HV_HYP_PAGE_SIZE, 2 * PAGE_SIZE)
+#define HV_UTIL_RING_RECV_SIZE max(4 * HV_HYP_PAGE_SIZE, 2 * PAGE_SIZE)
+
 static int util_probe(struct hv_device *dev,
 			const struct hv_vmbus_device_id *dev_id)
 {
@@ -491,8 +499,8 @@ static int util_probe(struct hv_device *dev,

 	hv_set_drvdata(dev, srv);

-	ret = vmbus_open(dev->channel, 4 * HV_HYP_PAGE_SIZE,
-			 4 * HV_HYP_PAGE_SIZE, NULL, 0, srv->util_cb,
+	ret = vmbus_open(dev->channel, HV_UTIL_RING_SEND_SIZE,
+			 HV_UTIL_RING_RECV_SIZE, NULL, 0, srv->util_cb,
 			 dev->channel);
 	if (ret)
 		goto error;
@@ -551,8 +559,8 @@ static int util_resume(struct hv_device *dev)
 		return ret;
 	}

-	ret = vmbus_open(dev->channel, 4 * HV_HYP_PAGE_SIZE,
-			 4 * HV_HYP_PAGE_SIZE, NULL, 0, srv->util_cb,
+	ret = vmbus_open(dev->channel, HV_UTIL_RING_SEND_SIZE,
+			 HV_UTIL_RING_RECV_SIZE, NULL, 0, srv->util_cb,
 			 dev->channel);
 	return ret;
 }
-- 
2.28.0
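The three ringbuffer patches (hid-hyperv, hyperv-keyboard, hv_util) all apply the same rule: keep the driver's preferred ring size unless two guest pages would be larger. A minimal user-space sketch of that rule follows; ring_size() and its guest_page_size parameter are hypothetical names for illustration, since the kernel macros use max() with the PAGE_SIZE constant directly.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Ring size to request: the preferred size, raised to two guest pages
 * when needed (one page for the header, at least one page for data).
 */
static size_t ring_size(size_t preferred, size_t guest_page_size)
{
	size_t min_size = 2 * guest_page_size;

	return preferred > min_size ? preferred : min_size;
}
```

Under this rule the 40K rings of the keyboard and HID drivers are unchanged for 4K and 16K guest pages and grow to 128K only for 64K pages, while the 16K rings of hv_util already grow to 32K with 16K pages.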
[RFC v2 06/11] hv: hyperv.h: Introduce some hvpfn helper functions
When a guest communicates with the hypervisor, it must use HV_HYP_PAGE
to calculate PFNs, so introduce a few hvpfn helper functions as the
counterparts of the page helper functions. This is preparation for
supporting guests whose PAGE_SIZE is not 4k.

Signed-off-by: Boqun Feng
---
 include/linux/hyperv.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6f4831212979..54e84ba6b554 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1687,4 +1687,8 @@ static inline unsigned long virt_to_hvpfn(void *addr)
 	return paddr >> HV_HYP_PAGE_SHIFT;
 }

+#define offset_in_hvpage(ptr)	((unsigned long)(ptr) & ~HV_HYP_PAGE_MASK)
+#define HVPFN_UP(x)	(((x) + HV_HYP_PAGE_SIZE-1) >> HV_HYP_PAGE_SHIFT)
+#define page_to_hvpfn(page)	((page_to_pfn(page) << PAGE_SHIFT) >> HV_HYP_PAGE_SHIFT)
+
 #endif /* _HYPERV_H */
-- 
2.28.0
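For readers who want to check the arithmetic of these helpers outside the kernel, a user-space sketch is below. The function names are hypothetical lower-case stand-ins for the macros, the guest page shift is a parameter rather than the PAGE_SHIFT constant, and the mask definition is an assumption chosen to match how the patch uses ~HV_HYP_PAGE_MASK.

```c
#include <assert.h>
#include <stdint.h>

#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1ul << HV_HYP_PAGE_SHIFT)
/* Assumed definition: mask of the page-number bits. */
#define HV_HYP_PAGE_MASK  (~(HV_HYP_PAGE_SIZE - 1))

/* Byte offset of an address inside its 4K Hyper-V page. */
static unsigned long offset_in_hvpage(unsigned long addr)
{
	return addr & ~HV_HYP_PAGE_MASK;
}

/* Number of Hyper-V pages needed to cover x bytes (round up). */
static unsigned long hvpfn_up(unsigned long x)
{
	return (x + HV_HYP_PAGE_SIZE - 1) >> HV_HYP_PAGE_SHIFT;
}

/* First Hyper-V PFN inside the guest page with the given guest PFN. */
static unsigned long pfn_to_hvpfn(unsigned long pfn,
				  unsigned int guest_page_shift)
{
	return (pfn << guest_page_shift) >> HV_HYP_PAGE_SHIFT;
}
```

With 64K guest pages each guest PFN maps to 16 consecutive Hyper-V PFNs, so guest PFN 3 starts at Hyper-V PFN 48; with 4K guest pages the two numbering schemes coincide.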
[RFC v2 11/11] scsi: storvsc: Support PAGE_SIZE larger than 4K
Hyper-V always uses a 4k page size (HV_HYP_PAGE_SIZE), so when
communicating with Hyper-V, a guest should always use HV_HYP_PAGE_SIZE
as the unit for page-related data. For storvsc, that data is
vmbus_packet_mpb_array. Since scsi_cmnd uses an sglist of pages (in
units of PAGE_SIZE), we need to convert the pages in the sglist of
scsi_cmnd into Hyper-V pages in vmbus_packet_mpb_array.

This patch does the conversion by dividing the pages in the sglist into
Hyper-V pages; offsets and indexes in vmbus_packet_mpb_array are
recalculated accordingly.

Signed-off-by: Boqun Feng
---
 drivers/scsi/storvsc_drv.c | 60 ++
 1 file changed, 54 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 8f5f5dc863a4..3f6610717d4e 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1739,23 +1739,71 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
 	payload_sz = sizeof(cmd_request->mpb);

 	if (sg_count) {
-		if (sg_count > MAX_PAGE_BUFFER_COUNT) {
+		unsigned int hvpg_idx = 0;
+		unsigned int j = 0;
+		unsigned long hvpg_offset = sgl->offset & ~HV_HYP_PAGE_MASK;
+		unsigned int hvpg_count = HVPFN_UP(hvpg_offset + length);

-			payload_sz = (sg_count * sizeof(u64) +
+		if (hvpg_count > MAX_PAGE_BUFFER_COUNT) {
+
+			payload_sz = (hvpg_count * sizeof(u64) +
 				      sizeof(struct vmbus_packet_mpb_array));
 			payload = kzalloc(payload_sz, GFP_ATOMIC);
 			if (!payload)
 				return SCSI_MLQUEUE_DEVICE_BUSY;
 		}

+		/*
+		 * sgl is a list of PAGEs, and payload->range.pfn_array
+		 * expects the page number in the unit of HV_HYP_PAGE_SIZE (the
+		 * page size that Hyper-V uses), so here we need to divide PAGEs
+		 * into HV_HYP_PAGE in case that PAGE_SIZE > HV_HYP_PAGE_SIZE.
+		 */
 		payload->range.len = length;
-		payload->range.offset = sgl[0].offset;
+		payload->range.offset = sgl[0].offset & ~HV_HYP_PAGE_MASK;
+		hvpg_idx = sgl[0].offset >> HV_HYP_PAGE_SHIFT;

 		cur_sgl = sgl;
-		for (i = 0; i < sg_count; i++) {
-			payload->range.pfn_array[i] =
-				page_to_pfn(sg_page((cur_sgl)));
+		for (i = 0, j = 0; i < sg_count; i++) {
+			/*
+			 * "PAGE_SIZE / HV_HYP_PAGE_SIZE - hvpg_idx" is the #
+			 * of HV_HYP_PAGEs in the current PAGE.
+			 *
+			 * "hvpg_count - j" is the # of unhandled HV_HYP_PAGEs.
+			 *
+			 * As shown in the following, the minimum of both is
+			 * the # of HV_HYP_PAGEs we need to handle in this
+			 * PAGE.
+			 *
+			 * |------------------ PAGE -------------------|
+			 * |   PAGE_SIZE / HV_HYP_PAGE_SIZE in total   |
+			 * |hvpg|hvpg| ...              |hvpg|... |hvpg|
+			 *           ^                  ^
+			 *           hvpg_idx           |
+			 *           ^                  |
+			 *           +-(hvpg_count - j)-+
+			 *
+			 * or
+			 *
+			 * |------------------ PAGE -------------------|
+			 * |   PAGE_SIZE / HV_HYP_PAGE_SIZE in total   |
+			 * |hvpg|hvpg| ...              |hvpg|... |hvpg|
+			 *           ^                                      ^
+			 *           hvpg_idx                               |
+			 *           ^                                      |
+			 *           +----------(hvpg_count - j)------------+
+			 */
+			unsigned int nr_hvpg = min((unsigned int)(PAGE_SIZE / HV_HYP_PAGE_SIZE) - hvpg_idx,
+						   hvpg_count - j);
+			unsigned int k;
+
+			for (k = 0; k < nr_hvpg; k++) {
+				payload->range.pfn_array[j] =
+					page_to_hvpfn(sg_page((cur_sgl))) + hvpg_idx + k;
+				j++;
+			}
 			cur_sgl = sg_next(cur_sgl);
+			hvpg_idx = 0;
 		}
 	}

-- 
2.28.0
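The splitting loop in this patch can be tried out in user space with the sketch below. It is a simplified model, not the driver code: the scatterlist is reduced to an array of guest PFNs (a hypothetical input form), every entry after the first is assumed to start at offset 0 as the patch itself assumes, and the guest page shift is a parameter instead of PAGE_SHIFT.

```c
#include <assert.h>
#include <stddef.h>

#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1ul << HV_HYP_PAGE_SHIFT)

/*
 * Expand a list of guest pages into Hyper-V PFNs.
 * guest_pfns:   one guest PFN per scatterlist entry (hypothetical form)
 * first_offset: byte offset of the data inside the first guest page
 * length:       total number of data bytes
 * Returns the number of Hyper-V PFNs written to out[].
 */
static size_t split_into_hvpfns(const unsigned long *guest_pfns,
				size_t sg_count,
				unsigned long first_offset,
				unsigned long length,
				unsigned int guest_page_shift,
				unsigned long *out, size_t out_cap)
{
	unsigned int shift = guest_page_shift - HV_HYP_PAGE_SHIFT;
	unsigned long per_page = 1ul << shift;	/* Hyper-V pages per guest page */
	unsigned long hvpg_offset = first_offset & (HV_HYP_PAGE_SIZE - 1);
	unsigned long hvpg_count =
		(hvpg_offset + length + HV_HYP_PAGE_SIZE - 1) >> HV_HYP_PAGE_SHIFT;
	unsigned long hvpg_idx = first_offset >> HV_HYP_PAGE_SHIFT;
	size_t i, j = 0;

	for (i = 0; i < sg_count && j < hvpg_count; i++) {
		unsigned long in_page = per_page - hvpg_idx;	/* left in this page */
		unsigned long left = hvpg_count - j;		/* still unhandled */
		unsigned long nr = in_page < left ? in_page : left;
		unsigned long base = guest_pfns[i] << shift;
		unsigned long k;

		for (k = 0; k < nr && j < out_cap; k++)
			out[j++] = base + hvpg_idx + k;
		hvpg_idx = 0;	/* later entries start at the page boundary */
	}
	return j;
}
```

For example, with 64K guest pages, a transfer of 8192 bytes starting at offset 0xF100 of guest PFN 10 and continuing into guest PFN 20 touches the last Hyper-V page of the first guest page and the first two Hyper-V pages of the second one.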
[RFC v2 05/11] Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv header
There will be more places other than vmbus where we need to calculate
the Hyper-V page PFN from a virtual address, so move virt_to_hvpfn() to
hyperv generic header.

Signed-off-by: Boqun Feng
---
 drivers/hv/channel.c   | 13 -------------
 include/linux/hyperv.h | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 7c443fd567e4..74a8f49ab76a 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -22,19 +22,6 @@

 #include "hyperv_vmbus.h"

-static unsigned long virt_to_hvpfn(void *addr)
-{
-	phys_addr_t paddr;
-
-	if (is_vmalloc_addr(addr))
-		paddr = page_to_phys(vmalloc_to_page(addr)) +
-				     offset_in_page(addr);
-	else
-		paddr = __pa(addr);
-
-	return paddr >> HV_HYP_PAGE_SHIFT;
-}
-
 /*
  * hv_gpadl_size - Return the real size of a gpadl, the size that Hyper-V uses
  *
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 7d16dd28aa48..6f4831212979 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -14,6 +14,7 @@

 #include

+#include
 #include
 #include
 #include
@@ -23,6 +24,7 @@
 #include
 #include
 #include
+#include

 #define MAX_PAGE_BUFFER_COUNT 32
 #define MAX_MULTIPAGE_BUFFER_COUNT 32 /* 128K */
@@ -1672,4 +1674,17 @@ struct hyperv_pci_block_ops {

 extern struct hyperv_pci_block_ops hvpci_block_ops;

+static inline unsigned long virt_to_hvpfn(void *addr)
+{
+	phys_addr_t paddr;
+
+	if (is_vmalloc_addr(addr))
+		paddr = page_to_phys(vmalloc_to_page(addr)) +
+				     offset_in_page(addr);
+	else
+		paddr = __pa(addr);
+
+	return paddr >> HV_HYP_PAGE_SHIFT;
+}
+
 #endif /* _HYPERV_H */
-- 
2.28.0
[RFC v2 07/11] hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
When communicating with Hyper-V, HV_HYP_PAGE_SIZE should be used since
that's the page size used by Hyper-V, and Hyper-V expects all
page-related data in units of HV_HYP_PAGE_SIZE. For example, the "pfn"
in hv_page_buffer is actually the HV_HYP_PAGE (i.e. the Hyper-V page)
number.

In order to support guests whose page size is not 4k, we need to make
hv_netvsc always use HV_HYP_PAGE_SIZE for Hyper-V communication.

Signed-off-by: Boqun Feng
---
 drivers/net/hyperv/netvsc.c | 2 +-
 drivers/net/hyperv/netvsc_drv.c | 46 +++
 drivers/net/hyperv/rndis_filter.c | 12
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 41f5cf0bb997..1d6f2256da6b 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -794,7 +794,7 @@ static void netvsc_copy_to_send_buf(struct netvsc_device *net_device,
 	}

 	for (i = 0; i < page_count; i++) {
-		char *src = phys_to_virt(pb[i].pfn << PAGE_SHIFT);
+		char *src = phys_to_virt(pb[i].pfn << HV_HYP_PAGE_SHIFT);
 		u32 offset = pb[i].offset;
 		u32 len = pb[i].len;

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 64b0a74c1523..61ea568e1ddf 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -373,32 +373,29 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	return txq;
 }

-static u32 fill_pg_buf(struct page *page, u32 offset, u32 len,
+static u32 fill_pg_buf(unsigned long hvpfn, u32 offset, u32 len,
 		       struct hv_page_buffer *pb)
 {
 	int j = 0;

-	/* Deal with compound pages by ignoring unused part
-	 * of the page.
-	 */
-	page += (offset >> PAGE_SHIFT);
-	offset &= ~PAGE_MASK;
+	hvpfn += offset >> HV_HYP_PAGE_SHIFT;
+	offset = offset & ~HV_HYP_PAGE_MASK;

 	while (len > 0) {
 		unsigned long bytes;

-		bytes = PAGE_SIZE - offset;
+		bytes = HV_HYP_PAGE_SIZE - offset;
 		if (bytes > len)
 			bytes = len;
-		pb[j].pfn = page_to_pfn(page);
+		pb[j].pfn = hvpfn;
 		pb[j].offset = offset;
 		pb[j].len = bytes;

 		offset += bytes;
 		len -= bytes;

-		if (offset == PAGE_SIZE && len) {
-			page++;
+		if (offset == HV_HYP_PAGE_SIZE && len) {
+			hvpfn++;
 			offset = 0;
 			j++;
 		}
@@ -421,23 +418,26 @@ static u32 init_page_array(void *hdr, u32 len, struct sk_buff *skb,
 	 * 2. skb linear data
 	 * 3. skb fragment data
 	 */
-	slots_used += fill_pg_buf(virt_to_page(hdr),
-				  offset_in_page(hdr),
-				  len, &pb[slots_used]);
+	slots_used += fill_pg_buf(virt_to_hvpfn(hdr),
+				  offset_in_hvpage(hdr),
+				  len,
+				  &pb[slots_used]);

 	packet->rmsg_size = len;
 	packet->rmsg_pgcnt = slots_used;

-	slots_used += fill_pg_buf(virt_to_page(data),
-				  offset_in_page(data),
-				  skb_headlen(skb), &pb[slots_used]);
+	slots_used += fill_pg_buf(virt_to_hvpfn(data),
+				  offset_in_hvpage(data),
+				  skb_headlen(skb),
+				  &pb[slots_used]);

 	for (i = 0; i < frags; i++) {
 		skb_frag_t *frag = skb_shinfo(skb)->frags + i;

-		slots_used += fill_pg_buf(skb_frag_page(frag),
-					  skb_frag_off(frag),
-					  skb_frag_size(frag), &pb[slots_used]);
+		slots_used += fill_pg_buf(page_to_hvpfn(skb_frag_page(frag)),
+					  skb_frag_off(frag),
+					  skb_frag_size(frag),
+					  &pb[slots_used]);
 	}
 	return slots_used;
 }
@@ -453,8 +453,8 @@ static int count_skb_frag_slots(struct sk_buff *skb)
 		unsigned long offset = skb_frag_off(frag);

 		/* Skip unused frames from start of page */
-		offset &= ~PAGE_MASK;
-		pages += PFN_UP(offset + size);
+		offset &= ~HV_HYP_PAGE_MASK;
+		pages += HVPFN_UP(offset + size);
 	}
 	return pages;
 }
@@ -462,12 +462,12 @@ static int count_skb_frag_slots(struct sk_buff *skb)
 static int netvsc_get_slots(struct sk_buff *skb)
 {
 	char *data = skb->data;
-	unsigned int offset = offset_in_page(data);
+	unsigned int offset = offset_in_hvpage(data);
 	unsigned
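The core of the fill_pg_buf() change is that a byte range is now cut at 4K Hyper-V page boundaries regardless of the guest page size. The sketch below models that in user space. It is a simplification rather than the driver function: struct page_buf and fill_hv_slots() are hypothetical names, and the return value here is simply the number of slots written, which may differ from the convention of the kernel function whose tail is not shown in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1u << HV_HYP_PAGE_SHIFT)

/* Hypothetical stand-in for struct hv_page_buffer. */
struct page_buf {
	uint64_t pfn;
	uint32_t offset;
	uint32_t len;
};

/*
 * Describe len bytes starting at (hvpfn, offset) as one slot per
 * Hyper-V page touched. Returns the number of slots written.
 */
static uint32_t fill_hv_slots(uint64_t hvpfn, uint32_t offset, uint32_t len,
			      struct page_buf *pb)
{
	uint32_t j = 0;

	/* Normalize so that offset lies inside the first Hyper-V page. */
	hvpfn += offset >> HV_HYP_PAGE_SHIFT;
	offset &= HV_HYP_PAGE_SIZE - 1;

	while (len > 0) {
		uint32_t bytes = HV_HYP_PAGE_SIZE - offset;

		if (bytes > len)
			bytes = len;
		pb[j].pfn = hvpfn;
		pb[j].offset = offset;
		pb[j].len = bytes;
		j++;

		len -= bytes;
		hvpfn++;	/* next Hyper-V page */
		offset = 0;
	}
	return j;
}
```

A 6000-byte range starting 5000 bytes past Hyper-V PFN 100 therefore occupies two slots: 3192 bytes at offset 904 of PFN 101, then 2808 bytes at the start of PFN 102.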
[RFC v2 01/11] Drivers: hv: vmbus: Always use HV_HYP_PAGE_SIZE for gpadl
Since the hypervisor always uses 4K as its page size, the size of PFNs
used for gpadl should be HV_HYP_PAGE_SIZE rather than PAGE_SIZE, so
adjust this accordingly in preparation for supporting 16K/64K page size
guests. No functional changes on x86, since PAGE_SIZE is always 4k
(equal to HV_HYP_PAGE_SIZE).

Signed-off-by: Boqun Feng
---
 drivers/hv/channel.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 3ebda7707e46..4d0f8e5a88d6 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -22,9 +22,6 @@

 #include "hyperv_vmbus.h"

-#define NUM_PAGES_SPANNED(addr, len) \
-((PAGE_ALIGN(addr + len) >> PAGE_SHIFT) - (addr >> PAGE_SHIFT))
-
 static unsigned long virt_to_hvpfn(void *addr)
 {
 	phys_addr_t paddr;
@@ -35,7 +32,7 @@ static unsigned long virt_to_hvpfn(void *addr)
 	else
 		paddr = __pa(addr);

-	return paddr >> PAGE_SHIFT;
+	return paddr >> HV_HYP_PAGE_SHIFT;
 }

 /*
@@ -330,7 +327,7 @@ static int create_gpadl_header(void *kbuffer, u32 size,

 	int pfnsum, pfncount, pfnleft, pfncurr, pfnsize;

-	pagecount = size >> PAGE_SHIFT;
+	pagecount = size >> HV_HYP_PAGE_SHIFT;

 	/* do we need a gpadl body msg */
 	pfnsize = MAX_SIZE_CHANNEL_MESSAGE -
@@ -360,7 +357,7 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 		gpadl_header->range[0].byte_count = size;
 		for (i = 0; i < pfncount; i++)
 			gpadl_header->range[0].pfn_array[i] = virt_to_hvpfn(
-				kbuffer + PAGE_SIZE * i);
+				kbuffer + HV_HYP_PAGE_SIZE * i);
 		*msginfo = msgheader;

 		pfnsum = pfncount;
@@ -412,7 +409,7 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 			 */
 			for (i = 0; i < pfncurr; i++)
 				gpadl_body->pfn[i] = virt_to_hvpfn(
-					kbuffer + PAGE_SIZE * (pfnsum + i));
+					kbuffer + HV_HYP_PAGE_SIZE * (pfnsum + i));

 			/* add to msg header */
 			list_add_tail(&msgbody->msglistentry,
@@ -441,7 +438,7 @@ static int create_gpadl_header(void *kbuffer, u32 size,
 		gpadl_header->range[0].byte_count = size;
 		for (i = 0; i < pagecount; i++)
 			gpadl_header->range[0].pfn_array[i] = virt_to_hvpfn(
-				kbuffer + PAGE_SIZE * i);
+				kbuffer + HV_HYP_PAGE_SIZE * i);

 		*msginfo = msgheader;
 	}
-- 
2.28.0
[PATCH v2 08/10] soundwire: intel: add error log for clock-stop invalid configs
From: Pierre-Louis Bossart

Detect cases where the clock is assumed to be stopped but the IP is
not in the relevant state. There is no real way to recover here, but
adding an error log can help detect bad programming sequences or race
conditions.

Signed-off-by: Pierre-Louis Bossart
Signed-off-by: Bard Liao
---
 drivers/soundwire/intel.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 272826973426..97c8cfc54ddd 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1931,6 +1931,11 @@ static int intel_resume_runtime(struct device *dev)
 			}
 		}
 	} else if (!clock_stop_quirks) {
+
+		clock_stop0 = sdw_cdns_is_clock_stop(&sdw->cdns);
+		if (!clock_stop0)
+			dev_err(dev, "%s invalid configuration, clock was not stopped", __func__);
+
 		ret = intel_init(sdw);
 		if (ret) {
 			dev_err(dev, "%s failed: %d", __func__, ret);
-- 
2.17.1
[PATCH v2 10/10] soundwire: intel: don't manage link power individually
From: Pierre-Louis Bossart

Each link has separate power controls, but experimental results show
we need to use an all-or-none approach to the link power management.

This change has marginal power impact: the DSP needs to be powered
anyway before SoundWire links can be powered, and even when powered a
link can be in clock-stopped mode.

Signed-off-by: Pierre-Louis Bossart
Signed-off-by: Bard Liao
---
 drivers/soundwire/intel.c | 70 +--
 1 file changed, 46 insertions(+), 24 deletions(-)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index 97c8cfc54ddd..710f5eba936b 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -63,7 +63,9 @@ MODULE_PARM_DESC(sdw_md_flags, "SoundWire Intel Master device flags (0x0 all off
 #define SDW_SHIM_WAKESTS 0x192

 #define SDW_SHIM_LCTL_SPA BIT(0)
+#define SDW_SHIM_LCTL_SPA_MASK GENMASK(3, 0)
 #define SDW_SHIM_LCTL_CPA BIT(8)
+#define SDW_SHIM_LCTL_CPA_MASK GENMASK(11, 8)

 #define SDW_SHIM_SYNC_SYNCPRD_VAL_24 (24000 / SDW_CADENCE_GSYNC_KHZ - 1)
 #define SDW_SHIM_SYNC_SYNCPRD_VAL_38_4 (38400 / SDW_CADENCE_GSYNC_KHZ - 1)
@@ -295,8 +297,8 @@ static int intel_link_power_up(struct sdw_intel *sdw)
 	u32 *shim_mask = sdw->link_res->shim_mask;
 	struct sdw_bus *bus = &sdw->cdns.bus;
 	struct sdw_master_prop *prop = &bus->prop;
-	int spa_mask, cpa_mask;
-	int link_control;
+	u32 spa_mask, cpa_mask;
+	u32 link_control;
 	int ret = 0;
 	u32 syncprd;
 	u32 sync_reg;
@@ -319,6 +321,8 @@ static int intel_link_power_up(struct sdw_intel *sdw)
 		syncprd = SDW_SHIM_SYNC_SYNCPRD_VAL_24;

 	if (!*shim_mask) {
+		dev_dbg(sdw->cdns.dev, "%s: powering up all links\n", __func__);
+
 		/* we first need to program the SyncPRD/CPU registers */
 		dev_dbg(sdw->cdns.dev,
 			"%s: first link up, programming SYNCPRD\n", __func__);
@@ -331,21 +335,24 @@ static int intel_link_power_up(struct sdw_intel *sdw)
 		/* Set SyncCPU bit */
 		sync_reg |= SDW_SHIM_SYNC_SYNCCPU;
 		intel_writel(shim, SDW_SHIM_SYNC, sync_reg);
-	}

-	/* Link power up sequence */
-	link_control = intel_readl(shim, SDW_SHIM_LCTL);
-	spa_mask = (SDW_SHIM_LCTL_SPA << link_id);
-	cpa_mask = (SDW_SHIM_LCTL_CPA << link_id);
-	link_control |= spa_mask;
+		/* Link power up sequence */
+		link_control = intel_readl(shim, SDW_SHIM_LCTL);

-	ret = intel_set_bit(shim, SDW_SHIM_LCTL, link_control, cpa_mask);
-	if (ret < 0) {
-		dev_err(sdw->cdns.dev, "Failed to power up link: %d\n", ret);
-		goto out;
-	}
+		/* only power-up enabled links */
+		spa_mask = sdw->link_res->link_mask <<
+			SDW_REG_SHIFT(SDW_SHIM_LCTL_SPA_MASK);
+		cpa_mask = sdw->link_res->link_mask <<
+			SDW_REG_SHIFT(SDW_SHIM_LCTL_CPA_MASK);
+
+		link_control |= spa_mask;
+
+		ret = intel_set_bit(shim, SDW_SHIM_LCTL, link_control, cpa_mask);
+		if (ret < 0) {
+			dev_err(sdw->cdns.dev, "Failed to power up link: %d\n", ret);
+			goto out;
+		}

-	if (!*shim_mask) {
 		/* SyncCPU will change once link is active */
 		ret = intel_wait_bit(shim, SDW_SHIM_SYNC,
 				     SDW_SHIM_SYNC_SYNCCPU, 0);
@@ -483,7 +490,7 @@ static void intel_shim_wake(struct sdw_intel *sdw, bool wake_enable)

 static int intel_link_power_down(struct sdw_intel *sdw)
 {
-	int link_control, spa_mask, cpa_mask;
+	u32 link_control, spa_mask, cpa_mask;
 	unsigned int link_id = sdw->instance;
 	void __iomem *shim = sdw->link_res->shim;
 	u32 *shim_mask = sdw->link_res->shim_mask;
@@ -493,24 +500,39 @@ static int intel_link_power_down(struct sdw_intel *sdw)

 	intel_shim_master_ip_to_glue(sdw);

-	/* Link power down sequence */
-	link_control = intel_readl(shim, SDW_SHIM_LCTL);
-	spa_mask = ~(SDW_SHIM_LCTL_SPA << link_id);
-	cpa_mask = (SDW_SHIM_LCTL_CPA << link_id);
-	link_control &= spa_mask;
-
-	ret = intel_clear_bit(shim, SDW_SHIM_LCTL, link_control, cpa_mask);
-
 	if (!(*shim_mask & BIT(link_id)))
 		dev_err(sdw->cdns.dev,
 			"%s: Unbalanced power-up/down calls\n", __func__);

 	*shim_mask &= ~BIT(link_id);

+	if (!*shim_mask) {
+
+		dev_dbg(sdw->cdns.dev, "%s: powering down all links\n", __func__);
+
+		/* Link power down sequence */
+		link_control = intel_readl(shim, SDW_SHIM_LCTL);
+
+		/* only power-down enabled links */
+		spa_mask = (~sdw->link_res->link_mask) <<
+
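The difference between the old per-link masks and the new all-or-none masks on the power-up path can be seen with a few lines of user-space C. This is only a sketch of the bit arithmetic: the shift values are assumptions derived from the lowest set bit of GENMASK(3, 0) and GENMASK(11, 8) in the patch, and the helper names are hypothetical. The power-down path is left out because the quoted diff is cut off there.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions: SPA in bits 3..0, CPA in bits 11..8. */
#define LCTL_SPA_SHIFT 0
#define LCTL_CPA_SHIFT 8

/* Old scheme: one SPA/CPA bit for the link being powered up. */
static uint32_t spa_mask_one(unsigned int link_id)
{
	return 1u << (LCTL_SPA_SHIFT + link_id);
}

static uint32_t cpa_mask_one(unsigned int link_id)
{
	return 1u << (LCTL_CPA_SHIFT + link_id);
}

/* New scheme: SPA/CPA bits for every enabled link at once. */
static uint32_t spa_mask_all(uint32_t link_mask)
{
	return link_mask << LCTL_SPA_SHIFT;
}

static uint32_t cpa_mask_all(uint32_t link_mask)
{
	return link_mask << LCTL_CPA_SHIFT;
}
```

With links 0 and 2 enabled (link_mask 0x5), the first power-up now requests SPA bits 0x5 and waits for CPA bits 0x500, where the old code would have requested 0x4 and waited for 0x400 when powering up link 2 alone.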
[PATCH v2 05/10] soundwire: bus: update multi-link definition with hw sync details
From: Pierre-Louis Bossart

Hardware-based synchronization is typically required when the
bus->multi_link flag is set.

On Intel platforms, when the Cadence IP is configured in 'Multi Master
Mode', the hardware synchronization is required even when a stream
only uses a single segment.

The existing code only deals with hardware synchronization when a
stream uses more than one segment, so to remain backwards compatible
we add a configuration threshold. For Intel cases this threshold will
be set to one; other platforms may be able to use the SSP-based sync
in those cases.

Signed-off-by: Pierre-Louis Bossart
Signed-off-by: Bard Liao
---
 include/linux/soundwire/sdw.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 76052f12c9f7..9adbe4fd7980 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -827,6 +827,11 @@ struct sdw_master_ops {
  * @multi_link: Store bus property that indicates if multi links
  * are supported. This flag is populated by drivers after reading
  * appropriate firmware (ACPI/DT).
+ * @hw_sync_min_links: Number of links used by a stream above which
+ * hardware-based synchronization is required. This value is only
+ * meaningful if multi_link is set. If set to 1, hardware-based
+ * synchronization will be used even if a stream only uses a single
+ * SoundWire segment.
  */
 struct sdw_bus {
 	struct device *dev;
@@ -850,6 +855,7 @@ struct sdw_bus {
 	unsigned int clk_stop_timeout;
 	u32 bank_switch_timeout;
 	bool multi_link;
+	int hw_sync_min_links;
 };

 int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,
-- 
2.17.1
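This patch only adds the field; the code that consults it lives elsewhere in the series. As a hedged illustration of how such a threshold might be checked, here is a tiny sketch. The struct and function names are hypothetical, and the ">=" comparison is an assumption inferred from the kernel-doc sentence that a value of 1 enables hardware sync for a single-segment stream.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of struct sdw_bus, for illustration only. */
struct bus_cfg {
	bool multi_link;
	int hw_sync_min_links;
};

/* Assumed rule: hardware sync once a stream reaches the threshold. */
static bool needs_hw_sync(const struct bus_cfg *bus, int links_in_stream)
{
	return bus->multi_link && links_in_stream >= bus->hw_sync_min_links;
}
```

Under that assumption a threshold of 1 gives the Intel behaviour described in the commit message, while a threshold of 2 keeps the previous behaviour where only multi-segment streams use hardware sync.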
[PATCH v2 01/10] soundwire: intel: disable shim wake on suspend
From: Pierre-Louis Bossart

If we enabled clock stop mode and then suspend, we need to disable
the shim wake. We do so only if the parent is pm_runtime active, due
to power rail dependencies.

GitHub issue: https://github.com/thesofproject/linux/issues/1678
Signed-off-by: Pierre-Louis Bossart
Signed-off-by: Bard Liao
---
 drivers/soundwire/intel.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index ebca8ced59ec..aa8484366c95 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1532,6 +1532,7 @@ static int __maybe_unused intel_suspend(struct device *dev)
 	struct sdw_cdns *cdns = dev_get_drvdata(dev);
 	struct sdw_intel *sdw = cdns_to_intel(cdns);
 	struct sdw_bus *bus = &cdns->bus;
+	u32 clock_stop_quirks;
 	int ret;

 	if (bus->prop.hw_disabled) {
@@ -1543,6 +1544,23 @@ static int __maybe_unused intel_suspend(struct device *dev)
 	if (pm_runtime_suspended(dev)) {
 		dev_dbg(dev, "%s: pm_runtime status: suspended\n", __func__);

+		clock_stop_quirks = sdw->link_res->clock_stop_quirks;
+
+		if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET ||
+		     !clock_stop_quirks) &&
+		    !pm_runtime_suspended(dev->parent)) {
+
+			/*
+			 * if we've enabled clock stop, and the parent
+			 * is still active, disable shim wake. The
+			 * SHIM registers are not accessible if the
+			 * parent is already pm_runtime suspended so
+			 * it's too late to change that configuration
+			 */
+
+			intel_shim_wake(sdw, false);
+		}
+
 		return 0;
 	}

-- 
2.17.1
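The condition added to intel_suspend() mixes "&", "||" and "&&", so it may help to see it as a standalone predicate. The sketch below is illustrative only: the function name is hypothetical and the bit value chosen for the bus-reset quirk is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value standing in for SDW_INTEL_CLK_STOP_BUS_RESET. */
#define CLK_STOP_BUS_RESET (1u << 3)

/*
 * Clock stop is in use when the bus-reset quirk is set or when no quirk
 * is set at all; the shim wake can only be changed while the parent is
 * still pm_runtime active.
 */
static bool should_disable_shim_wake(uint32_t quirks, bool parent_suspended)
{
	bool clock_stop_enabled = (quirks & CLK_STOP_BUS_RESET) || quirks == 0;

	return clock_stop_enabled && !parent_suspended;
}
```

Because "&" binds tighter than "||", the original expression groups the same way as the explicit parentheses here.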
[PATCH v2 02/10] soundwire: intel: ignore software command retries
From: Pierre-Louis Bossart

With multiple links synchronized in hardware, retrying commands in
software is not recommended.

Signed-off-by: Pierre-Louis Bossart
Signed-off-by: Bard Liao
---
 drivers/soundwire/intel.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
index aa8484366c95..94a659e65f86 100644
--- a/drivers/soundwire/intel.c
+++ b/drivers/soundwire/intel.c
@@ -1355,6 +1355,11 @@ static int intel_master_probe(struct platform_device *pdev)
 		dev_info(dev,
 			 "SoundWire master %d is disabled, will be ignored\n",
 			 bus->link_id);
+		/*
+		 * Ignore BIOS err_threshold, it's a really bad idea when dealing
+		 * with multiple hardware synchronized links
+		 */
+		bus->prop.err_threshold = 0;
 		return 0;
 	}

-- 
2.17.1