Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu 06 Aug 19:48 PDT 2020, John Stultz wrote: > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > wrote: > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > wrote: > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > > wrote: > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > > >> wrote: > > > > > > >>> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try > > > > > > >>> this > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > >>> Saravana's macro magic. > > > > > > >>> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > > >>> seeing > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > >>> building the driver in. > > > > > > >> Does that mean the modular version is working? Or you haven't > > > > > > >> tried > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I > > > > > > >> don't > > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > > >> looking > > > > > > >> at the code delta. > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started > > > > > > > on -next > > > > > > > around 20200727, but I didn't track it down as, well, there's > > > > > > > less way > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting > > > > > > > does > > > > > > > allow me to boot again. > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > > reverted > > > > > > boots, however, module (on the c630 at least) doesn't boot whether > > > > > > it's > > > > > > a module or built-in. 
> > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > more grace time to load. > > > > With the risk of me reading more into this than what you're saying, > > > > please don't upstream anything that depends on this parameter being > > > > increased. > > > > > > > > Compiling any of these drivers as a module should not require the user to > > > > pass additional kernel command line parameters in order to get their > > > > device to boot. > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > avoid it. But the reality is that it is already required (at least in > > > configurations heavily using modules) to give more time for loaded modules > > > to resolve missing dependencies after init begins (due to > > > changes in the driver core to fail loading after init so that optional > > > dt links aren't eternally looked for). This was seen when trying to > > > enable the Qualcomm clk drivers as modules. > > > > So to clarify what you're saying, any system that boots successfully > > with the default options is a sign of pure luck - regardless of being > > builtin or modules. > > > > > > And there you have my exact argument against the deferred timeout magic > > going on in the driver core. But as you know people insist that it's > > more important to be able to boot some defunct system from NFS than a > > properly configured one reliably. > > I'd agree, but the NFS case was in use before, and when the original > deferred timeout/optional link handling stuff landed no one complained > they were broken by it (at least at the point where it landed). I did object when this was proposed and I've objected for the last two years, because we keep adding more and more subsystems to follow this broken behavior. > Only later when we started enabling more lower-level core drivers as > modules did the shortened dependency resolution time start to bite > folks. 
My attempt to set the default to be 30 seconds helped there, > but caused trouble and delays for the NFS case, and "don't break > existing users" seemed to rule, so I set the default timeout back to > 0. > I can't argue with that and I'm at a loss on how to turn this around. > > > It doesn't seem necessary in this case, but I suggested it here as > > > I've got it enabled by default in my AOSP builds so that the > > > module-heavy configs for GKI boot properly (even if Saravana's > > > fw_devlink work is disabled). > > > > With all due respect, that's your downstream kernel, the upstream kernel > > should not rely on luck, out-of-tree patches or kernel parameters. > > I agree that would be preferred. But kernel parameters are often there > for these sorts of cases where we can't always do the right thing. > As for out-of-tree patches, broken things don't get fixed until > out-of-tree patches are developed and upstreamed, and I know Saravana > is doing exactly that, and I hope his fw_devlink work helps
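For reference, the grace period being debated in this thread is set on the kernel command line; a sketch of the workaround John describes (how the line is appended is bootloader-specific):

```
# Appended to the kernel command line, e.g. via the bootloader's bootargs
# variable or GRUB_CMDLINE_LINUX; value is in seconds, 0 disables the wait.
deferred_probe_timeout=30
```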
Re: [PATCH v3] drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()
On Fri, Aug 07, 2020 at 11:00:11AM +0800, Xin He (何鑫) wrote: > Xin He wrote on Tue, Jul 21, 2020 at 6:17 PM: > > > > From: Qi Liu > > > > We should put the reference count of the fence after calling > > virtio_gpu_cmd_submit(). So add the missing dma_fence_put(). > > > > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit > > synchronization") > > Co-developed-by: Xin He > > Signed-off-by: Xin He > > Signed-off-by: Qi Liu > > Reviewed-by: Muchun Song > > --- > > > > changelog in v3: > > 1) Change the subject from "drm/virtio: fixed memory leak in > > virtio_gpu_execbuffer_ioctl()" to > >"drm/virtio: fix missing dma_fence_put() in > > virtio_gpu_execbuffer_ioctl()" > > 2) Rework the commit log > > > > changelog in v2: > > 1) Add a change description > > > > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > index 5df722072ba0..19c5bc01eb79 100644 > > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > @@ -179,6 +179,7 @@ static int virtio_gpu_execbuffer_ioctl(struct > > drm_device *dev, void *data, > > > > virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, > > vfpriv->ctx_id, buflist, out_fence); > > + dma_fence_put(&out_fence->f); > > virtio_gpu_notify(vgdev); > > return 0; > > > > -- > > 2.21.1 (Apple Git-122.3) > > > > cc Greg Why? 
$ ./scripts/get_maintainer.pl --file drivers/gpu/drm/virtio/virtgpu_ioctl.c David Airlie (maintainer:VIRTIO GPU DRIVER) Gerd Hoffmann (maintainer:VIRTIO GPU DRIVER) Daniel Vetter (maintainer:DRM DRIVERS) Sumit Semwal (maintainer:DMA BUFFER SHARING FRAMEWORK) "Christian König" (maintainer:DMA BUFFER SHARING FRAMEWORK) dri-de...@lists.freedesktop.org (open list:VIRTIO GPU DRIVER) virtualizat...@lists.linux-foundation.org (open list:VIRTIO GPU DRIVER) linux-kernel@vger.kernel.org (open list) linux-me...@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK) linaro-mm-...@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK)
Re: [PATCH] epic100: switch from 'pci_' to 'dma_' API
On 06/08/2020 at 23:23, David Miller wrote: From: Christophe JAILLET Date: Thu, 6 Aug 2020 22:19:35 +0200 The wrappers in include/linux/pci-dma-compat.h should go away. Christophe, the net-next tree is closed so I'd like to ask that you defer submitting these conversion patches until the net-next tree opens back up again. Thank you. Sure, sorry for the inconvenience. CJ
Re: drivers/staging/mt7621-pci/pci-mt7621.c:189:11: error: 'pci_generic_config_read' undeclared here (not in a function)
Hi, On Fri, Aug 7, 2020 at 1:51 AM kernel test robot wrote: > > Hi Sergio, > > First bad commit (maybe != root cause): > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 7b4ea9456dd3f73238408126ab00f1d906963d81 > commit: 3b2fa0c92686562ac0b8cf00c0326a45814f8e18 MIPS: ralink: enable PCI > support only if driver for mt7621 SoC is selected > date: 9 months ago > config: mips-randconfig-r005-20200807 (attached as .config) > compiler: mipsel-linux-gcc (GCC) 9.3.0 > reproduce (this is a W=1 build): > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > git checkout 3b2fa0c92686562ac0b8cf00c0326a45814f8e18 > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross > ARCH=mips > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All errors (new ones prefixed by >>): > > >> drivers/staging/mt7621-pci/pci-mt7621.c:189:11: error: > >> 'pci_generic_config_read' undeclared here (not in a function) > 189 | .read = pci_generic_config_read, > | ^~~ > >> drivers/staging/mt7621-pci/pci-mt7621.c:190:12: error: > >> 'pci_generic_config_write' undeclared here (not in a function) > 190 | .write = pci_generic_config_write, > |^~~~ [snip] PCI is not selected if CONFIG_SOC_MT7621 is not set, which is the case for the attached kernel config... There was a time when the driver itself depended on PCI (which to me seems to be the correct thing to do) but that was removed, creating a regression for gnubee and ralink pci-based boards. This was done in 'c4d48cf5e2f0 ("MIPS: ralink: deactivate PCI support for SOC_MT7621")'. To try to fix this issue and be able to properly compile the driver again, I sent the patch mentioned here by the test robot, where HAVE_PCI was selected only if the PCI_MT7621 driver was selected... 
I was told on the MIPS kernel list that this was not the best way to fix the issue, but the patch was accepted anyway. I also asked the person who first removed the 'depends on PCI' stuff about the original compile issue, with no answer at all... So, what should I do to properly fix this? Thanks in advance for your time. Best regards, Sergio Paracuellos
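The two shapes discussed in the thread can be sketched as Kconfig fragments. The symbol names (SOC_MT7621, PCI_MT7621, HAVE_PCI) come from the thread; the prompts and exact wording are illustrative, not the in-tree text:

```
# The reverted approach: the driver itself depends on PCI, so it simply
# cannot be enabled in a config without PCI support.
config PCI_MT7621
	tristate "MediaTek MT7621 PCIe controller"
	depends on SOC_MT7621 && PCI

# The accepted patch's approach: the SoC selects HAVE_PCI only when the
# PCIe driver is enabled, which is what trips the randconfig build above
# when PCI_MT7621 is off but the driver source is still compiled.
config SOC_MT7621
	bool "MT7621"
	select HAVE_PCI if PCI_MT7621
```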
drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: cast from restricted __le16
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 86cfccb66937dd6cbf26ed619958b9e587e6a115 commit: 00b2e16e006390069480e90478aa8b6e924996d7 mt76: mt7915: add TxBF capabilities date: 3 months ago config: sparc64-randconfig-s032-20200806 (attached as .config) compiler: sparc64-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.2-117-g8c7aee71-dirty git checkout 00b2e16e006390069480e90478aa8b6e924996d7 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=sparc64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot sparse warnings: (new ones prefixed by >>) >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 vim +339 drivers/net/wireless/mediatek/mt76/mt7915/init.c 296 297 elem->phy_cap_info[3] &= ~IEEE80211_HE_PHY_CAP3_SU_BEAMFORMER; 298 elem->phy_cap_info[4] &= ~IEEE80211_HE_PHY_CAP4_MU_BEAMFORMER; 299 300 c = IEEE80211_HE_PHY_CAP5_BEAMFORMEE_NUM_SND_DIM_UNDER_80MHZ_MASK | 301 IEEE80211_HE_PHY_CAP5_BEAMFORMEE_NUM_SND_DIM_ABOVE_80MHZ_MASK; 302 elem->phy_cap_info[5] &= ~c; 303 304 c = IEEE80211_HE_PHY_CAP6_TRIG_SU_BEAMFORMER_FB | 305 
IEEE80211_HE_PHY_CAP6_TRIG_MU_BEAMFORMER_FB; 306 elem->phy_cap_info[6] &= ~c; 307 308 elem->phy_cap_info[7] &= ~IEEE80211_HE_PHY_CAP7_MAX_NC_MASK; 309 310 c = IEEE80211_HE_PHY_CAP2_NDP_4x_LTF_AND_3_2US | 311 IEEE80211_HE_PHY_CAP2_UL_MU_FULL_MU_MIMO | 312 IEEE80211_HE_PHY_CAP2_UL_MU_PARTIAL_MU_MIMO; 313 elem->phy_cap_info[2] |= c; 314 315 c = IEEE80211_HE_PHY_CAP4_SU_BEAMFORMEE | 316 IEEE80211_HE_PHY_CAP4_BEAMFORMEE_MAX_STS_UNDER_80MHZ_4 | 317 IEEE80211_HE_PHY_CAP4_BEAMFORMEE_MAX_STS_ABOVE_80MHZ_4; 318 elem->phy_cap_info[4] |= c; 319 320 /* do not support NG16 due to spec D4.0 changes subcarrier idx */ 321 c = IEEE80211_HE_PHY_CAP6_CODEBOOK_SIZE_42_SU | 322 IEEE80211_HE_PHY_CAP6_CODEBOOK_SIZE_75_MU; 323 324 if (vif == NL80211_IFTYPE_STATION) 325 c |= IEEE80211_HE_PHY_CAP6_PARTIAL_BANDWIDTH_DL_MUMIMO; 326 327 elem->phy_cap_info[6] |= c; 328 329 if (nss < 2) 330 return; 331 332 if (vif != NL80211_IFTYPE_AP) 333 return; 334 335 elem->phy_cap_info[3] |= IEEE80211_HE_PHY_CAP3_SU_BEAMFORMER; 336 elem->phy_cap_info[4] |= IEEE80211_HE_PHY_CAP4_MU_BEAMFORMER; 337 338 /* num_snd_dim */ > 339 c = (nss - 1) | (max_t(int, mcs->tx_mcs_160, 1) << 3); 340 elem->phy_cap_info[5] |= c; 341 342 c = IEEE80211_HE_PHY_CAP6_TRIG_SU_BEAMFORMER_FB | 343 IEEE80211_HE_PHY_CAP6_TRIG_MU_BEAMFORMER_FB; 344 elem->phy_cap_info[6] |= c; 345 346 /* the maximum cap is 4 x 3, (Nr, Nc) = (3, 2) */ 347 elem->phy_cap_info[7] |= min_t(int, nss - 1, 2) << 3; 348 } 349 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Hi Peter, On 8/6/2020 5:24 PM, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 11:18:27AM +0200, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote: +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs) +{ + struct pt_regs *sample_regs = regs; + + /* user only */ + if (!event->attr.exclude_kernel || !event->attr.exclude_hv || + !event->attr.exclude_host || !event->attr.exclude_guest) + return sample_regs; + Is this condition correct? Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It will go "return sample_regs" path. I'm not sure, I'm terminally confused on virt stuff. [A] Suppose we have nested virt: L0-hv | G0/L1-hv | G1 And we're running in G0, then: - 'exclude_hv' would exclude L0 events - 'exclude_host' would ... exclude L1-hv events? - 'exclude_guest' would ... exclude G1 events? [B] Then the next question is, if G0 is a host, does the L1-hv run in G0 userspace or G0 kernel space? I was assuming G0 userspace would not include anything L1 (kvm is a kernel module after all), but what do I know. @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open, if (err) return err; - if (!attr.exclude_kernel) { + if (!attr.exclude_kernel || !attr.exclude_callchain_kernel || + !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) { err = perf_allow_kernel(); if (err) return err; I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel". But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest". Well, I'm very sure G0 userspace should never see L0 or G1 state, so exclude_hv and exclude_guest had better be true. On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right? Same as above, is G0 host state G0 userspace? So even exclude_kernel = 1 but exclude_host = 0, we will still go perf_allow_kernel path. Please correct me if my understanding is wrong. 
Yes, because with those permission checks in place it means you have permission to see kernel bits. So if I understand 'exclude_host' wrong -- a distinct possibility -- can we then pretty please have the above [A-B] corrected and put in a comment near perf_event_attr and the exclude_* comments changed to refer to that? In my previous mail, I explained what I understood for 'exclude_host', but not sure if it's correct. Needs more review comments. Thanks Jin Yao
Re: INFO: task hung in pipe_read (2)
Hello! On Sat, Aug 01, 2020 at 10:39:00AM -0700, Linus Torvalds wrote: > On Sat, Aug 1, 2020 at 8:30 AM Tetsuo Handa > wrote: > > > > Waiting for response at > > https://lkml.kernel.org/r/45a9b2c8-d0b7-8f00-5b30-0cfe3e028...@i-love.sakura.ne.jp > > . > > I think handle_userfault() should have a (shortish) timeout, and just > return VM_FAULT_RETRY. The 1sec timeout if applied only to kernel faults (not the case yet but it'd be enough to solve the hangcheck timer), will work perfectly for Android, but it will break qemu. [ 916.954313] INFO: task syz-executor.0:61593 blocked for more than 40 seconds. If you want to enforce a timeout, 40 seconds or something of the order of the hangcheck timer would be more reasonable. 1sec is of the same order of magnitude of latency that you'd get with an host kernel upgrade in place with kexec (with the guest memory being preserved in RAM) that you'd suffer occasionally from in most public clouds. So postcopy live migration should be allowed to take 1 sec latency and it shouldn't become a deal breaker, that results in the VM getting killed. > The code is overly complex anyway, because it predates the "just return > RETRY". > > And because we can't wait forever when the source of the fault is a > kernel exception, I think we should add some extra logic to just say > "if this is a retry, we've already done this once, just return an > error". Until the uffp-wp was merged recently, we never needed more than one VM_FAULT_RETRY to handle uffd-missing faults, you seem to want to go back to that which again would be fine for uffd-missing faults. I haven't had time to read and test the testcase properly yet, but at first glance from reading the hangcheck report it looks like there would be just one userfault? So I don't see an immediate connection. The change adding a 1sec timeout would definitely fix this issue, but it'll also break qemu and probably the vast majority of the users. > This is a TEST PATCH ONLY. 
I think we'll actually have to do something > like this, but I think the final version might need to allow a couple > of retries, rather than just give up after just one second. > > But for testing your case, this patch might be enough to at least show > that "yeah, this kind of approach works". > > Andrea? Comments? As mentioned, this is probably much too aggressive, > but I do think we need to limit the time that the kernel will wait for > page faults. Why is pipe preventing to SIGKILL the task that is blocked on the mutex_lock? Is there any good reason for it or it simply has margin for improvement regardless of the hangcheck report? It'd be great if we can look into that before looking into the uffd specific bits. The hangcheck timer would have zero issues with tasks that can be killed, if only the pipe code could be improved to use mutex_lock_killable. /* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */ if (t->state == TASK_UNINTERRUPTIBLE) check_hung_task(t, timeout); The hangcheck report is just telling us one task was in D state a little too long, but it wasn't fatal error and the kernel wasn't actually destabilized and the only malfunction reported is that a task was unkillable for too long. Now if it's impossible to improve the pipe code so it works better not just for uffd, there's still no reason to worry: we could disable uffd in the pipe context. For example ptrace opts-out of uffds, so that gdb doesn't get stuck if you read a pointer that should be handled by the process that is under debug. I hope it won't be necessary but it wouldn't be a major issue, certainly it wouldn't risk breaking qemu (and non-cooperative APIs are privileged so it could still skip the timeout). > Because userfaultfd has become a huge source of security holes as a > way to time kernel faults or delay them indefinitely. 
I assume you refer to the below: https://duasynt.com/blog/cve-2016-6187-heap-off-by-one-exploit https://blog.lizzie.io/using-userfaultfd.html These reports don't happen to mention CONFIG_SLAB_FREELIST_RANDOM=y which is already enabled by default in enterprise kernels for a reason and they don't mention the more recent CONFIG_SLAB_FREELIST_HARDENED and CONFIG_SHUFFLE_PAGE_ALLOCATOR. Can they test it with those options enabled again, does it still work so good or not anymore? That would be very helpful to know. Randomizing which is the next page that gets allocated is much more important than worrying about uffd because if you removed uffd you may still have other ways to temporarily stop the page fault depending on the setup. Example: https://bugs.chromium.org/p/project-zero/issues/detail?id=808 The above one doesn't use uffd, but it uses fuse. So is fuse also a source of security holes given they even use it for the exploit in a preferential way instead of uffd? "This can be done by abusing the writev() syscall and FUSE: The attacker mounts a FUSE filesystem that
Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Hi Peter, On 8/6/2020 5:18 PM, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote: +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs) +{ + struct pt_regs *sample_regs = regs; + + /* user only */ + if (!event->attr.exclude_kernel || !event->attr.exclude_hv || + !event->attr.exclude_host || !event->attr.exclude_guest) + return sample_regs; + Is this condition correct? Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It will go "return sample_regs" path. I'm not sure, I'm terminally confused on virt stuff. Suppose we have nested virt: L0-hv | G0/L1-hv | G1 And we're running in G0, then: - 'exclude_hv' would exclude L0 events - 'exclude_host' would ... exclude L1-hv events? I think the exclude_host is generally set by guest (/arch/x86/kvm/pmu.c, pmc_reprogram_counter). If G0 is a host, if we set exclude_host in G0, I think we will not be able to count the events on G0. The appropriate usage is, G1 sets the exclude_host, then the events on G0 will not be collected by guest G1. That's my understanding for the usage of exclude_host. - 'exclude_guest' would ... exclude G1 events? Similarly, the appropriate usage is, the host (G0) sets the exclude_guest, then the events on G1 will not be collected by host G0. If G1 sets exclude_guest, since no guest is under G1, that's ineffective. Then the next question is, if G0 is a host, does the L1-hv run in G0 userspace or G0 kernel space? I'm not very sure. Maybe some in kernel, some in userspace(qemu)? Maybe some KVM experts can help to answer this question. I was assuming G0 userspace would not include anything L1 (kvm is a kernel module after all), but what do I know. I have tested following conditions in native environment (not in KVM guests), the result is not expected. 
/* user only */ if (!event->attr.exclude_kernel || !event->attr.exclude_hv || !event->attr.exclude_host || !event->attr.exclude_guest) return sample_regs; perf record -e cycles:u ./div perf report --stdio # Overhead Command Shared Object Symbol # ... ... # 49.51% div libc-2.27.so [.] __random_r 33.93% div libc-2.27.so [.] __random 8.13% div libc-2.27.so [.] rand 4.29% div div [.] main 4.14% div div [.] rand@plt 0.00% div [unknown] [k] 0xbd600cb0 0.00% div [unknown] [k] 0xbd600df0 0.00% div ld-2.27.so[.] _dl_relocate_object 0.00% div ld-2.27.so[.] _dl_start 0.00% div ld-2.27.so[.] _start 0xbd600cb0 and 0xbd600df0 are leaked kernel addresses. From debug, I can see: [ 6272.320258] jinyao: sanitize_sample_regs: event->attr.exclude_kernel = 1, event->attr.exclude_hv = 1, event->attr.exclude_host = 0, event->attr.exclude_guest = 0 So it goes "return sample_regs;" path. @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open, if (err) return err; - if (!attr.exclude_kernel) { + if (!attr.exclude_kernel || !attr.exclude_callchain_kernel || + !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) { err = perf_allow_kernel(); if (err) return err; I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel". But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest". Well, I'm very sure G0 userspace should never see L0 or G1 state, so exclude_hv and exclude_guest had better be true. On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right? Same as above, is G0 host state G0 userspace? So even exclude_kernel = 1 but exclude_host = 0, we will still go perf_allow_kernel path. Please correct me if my understanding is wrong. Yes, because with those permission checks in place it means you have permission to see kernel bits. At the syscall entry, I also added some printk. 
Aug 7 03:37:40 kbl-ppc kernel: [ 854.688045] syscall: attr.exclude_kernel = 1, attr.exclude_callchain_kernel = 0, attr.exclude_hv = 0, attr.exclude_host = 0, attr.exclude_guest = 0 For my test case ("perf record -e cycles:u ./div"), the perf_allow_kernel() is also executed. Thanks Jin Yao
Re: [PATCH v6 11/11] bus: mhi: core: Introduce sysfs entries for MHI
On Mon, Jul 27, 2020 at 07:02:20PM -0700, Bhaumik Bhatt wrote: > Introduce sysfs entries to give userspace clients the ability to read > the serial number and the OEM PK Hash values obtained from BHI. OEMs > need to read these device-specific hardware information values through > userspace for factory testing purposes, and they cannot be exposed via debugfs > as it may remain disabled for performance reasons. Also, update the > documentation for ABI to include these entries. > > Signed-off-by: Bhaumik Bhatt > --- > Documentation/ABI/stable/sysfs-bus-mhi | 21 ++ > MAINTAINERS| 1 + > drivers/bus/mhi/core/init.c| 53 > ++ > 3 files changed, 75 insertions(+) > create mode 100644 Documentation/ABI/stable/sysfs-bus-mhi > > diff --git a/Documentation/ABI/stable/sysfs-bus-mhi > b/Documentation/ABI/stable/sysfs-bus-mhi > new file mode 100644 > index 000..1d5d0d6 > --- /dev/null > +++ b/Documentation/ABI/stable/sysfs-bus-mhi > @@ -0,0 +1,21 @@ > +What:/sys/bus/mhi/devices/.../serialnumber > +Date:Jul 2020 > +KernelVersion: 5.8 > +Contact: Bhaumik Bhatt > +Description: The file holds the serial number of the client device obtained > + using a BHI (Boot Host Interface) register read after at least > + one attempt to power up the device has been done. If read > + without having the device power on at least once, the file will > + read all 0's. > +Users: Any userspace application or clients interested in > device info. I think you're not using tabs here and that's why it is showing mangled. Please use tabs like the other files. Thanks, Mani > + > +What:/sys/bus/mhi/devices/.../oem_pk_hash > +Date:Jul 2020 > +KernelVersion: 5.8 > +Contact: Bhaumik Bhatt > +Description: The file holds the OEM PK Hash value of the endpoint device > + obtained using a BHI (Boot Host Interface) register read after > + at least one attempt to power up the device has been done. If > + read without having the device power on at least once, the file > + will read all 0's. 
> +Users: Any userspace application or clients interested in > device info. > diff --git a/MAINTAINERS b/MAINTAINERS > index e64e5db..5e49316 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -11018,6 +11018,7 @@ M:Hemant Kumar > L: linux-arm-...@vger.kernel.org > S: Maintained > T: git git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi.git > +F: Documentation/ABI/stable/sysfs-bus-mhi > F: Documentation/mhi/ > F: drivers/bus/mhi/ > F: include/linux/mhi.h > diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c > index 972dbf0..c086ef2 100644 > --- a/drivers/bus/mhi/core/init.c > +++ b/drivers/bus/mhi/core/init.c > @@ -76,6 +76,56 @@ const char *to_mhi_pm_state_str(enum mhi_pm_state state) > return mhi_pm_state_str[index]; > } > > +static ssize_t serial_number_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + struct mhi_device *mhi_dev = to_mhi_device(dev); > + struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl; > + > + return snprintf(buf, PAGE_SIZE, "Serial Number: %u\n", > + mhi_cntrl->serial_number); > +} > +static DEVICE_ATTR_RO(serial_number); > + > +static ssize_t oem_pk_hash_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + struct mhi_device *mhi_dev = to_mhi_device(dev); > + struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl; > + int i, cnt = 0; > + > + for (i = 0; i < ARRAY_SIZE(mhi_cntrl->oem_pk_hash); i++) > + cnt += snprintf(buf + cnt, PAGE_SIZE - cnt, > + "OEMPKHASH[%d]: 0x%x\n", i, > + mhi_cntrl->oem_pk_hash[i]); > + > + return cnt; > +} > +static DEVICE_ATTR_RO(oem_pk_hash); > + > +static struct attribute *mhi_sysfs_attrs[] = { > + &dev_attr_serial_number.attr, > + &dev_attr_oem_pk_hash.attr, > + NULL, > +}; > + > +static const struct attribute_group mhi_sysfs_group = { > + .attrs = mhi_sysfs_attrs, > +}; > + > +static int mhi_create_sysfs(struct mhi_controller *mhi_cntrl) > +{ > + return sysfs_create_group(&mhi_cntrl->mhi_dev->dev.kobj, > + &mhi_sysfs_group); > +} > + > 
+static void mhi_destroy_sysfs(struct mhi_controller *mhi_cntrl) > +{ > + sysfs_remove_group(&mhi_cntrl->mhi_dev->dev.kobj, &mhi_sysfs_group); > +} > + > /* MHI protocol requires the transfer ring to be aligned with ring length */ > static int mhi_alloc_aligned_ring(struct mhi_controller *mhi_cntrl, > struct mhi_ring *ring, > @@ -917,6 +967,8 @@ int
Re: [PATCH v3 2/2] dma-pool: Only allocate from CMA when in same memory zone
On Thu, Aug 06, 2020 at 08:47:55PM +0200, Nicolas Saenz Julienne wrote: > There is no guarantee to CMA's placement, so allocating a zone specific > atomic pool from CMA might return memory from a completely different > memory zone. To get around this double check CMA's placement before > allocating from it. As the buildbot pointed out, memblock_start_of_DRAM can't be used from non-__init code. But looking at it I think throwing that in is bogus anyway, as cma_get_base returns a proper physical address already.
[PATCH] serial: qcom_geni_serial: Fix recent kdb hang
The commit e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console") worked pretty well and I've been doing a lot of debugging with it. However, recently I typed "dmesg" in kdb and then held the space key down to scroll through the pagination. My device hung. This was repeatable and I found that it was introduced with the aforementioned commit. It turns out that there are some strange boundary cases in geni where in some weird situations it will signal RX_LAST but then will put 0 in RX_LAST_BYTE. This means that the entire last FIFO entry is valid. This weird corner case is handled in qcom_geni_serial_handle_rx() where you can see that we only honor RX_LAST_BYTE if RX_LAST is set _and_ RX_LAST_BYTE is non-zero. If either of these is not true we use BYTES_PER_FIFO_WORD (4) for the size of the last FIFO word. Let's fix kgdb. While at it, also use the proper #define for 4. Fixes: e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console") Signed-off-by: Douglas Anderson --- drivers/tty/serial/qcom_geni_serial.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c index 07b7b6b05b8b..e27077656939 100644 --- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -361,11 +361,16 @@ static int qcom_geni_serial_get_char(struct uart_port *uport) return NO_POLL_CHAR; if (word_cnt == 1 && (status & RX_LAST)) + /* +* NOTE: If RX_LAST_BYTE_VALID is 0 it needs to be +* treated as if it was BYTES_PER_FIFO_WORD. +*/ private_data->poll_cached_bytes_cnt = (status & RX_LAST_BYTE_VALID_MSK) >> RX_LAST_BYTE_VALID_SHFT; - else - private_data->poll_cached_bytes_cnt = 4; + + if (private_data->poll_cached_bytes_cnt == 0) + private_data->poll_cached_bytes_cnt = BYTES_PER_FIFO_WORD; private_data->poll_cached_bytes = readl(uport->membase + SE_GENI_RX_FIFOn); -- 2.28.0.236.gb10cc79966-goog
Re: [PATCH v6 10/11] bus: mhi: core: Introduce APIs to allocate and free the MHI controller
On Mon, Jul 27, 2020 at 07:02:19PM -0700, Bhaumik Bhatt wrote:
> Client devices should use the APIs provided to allocate and free
> the MHI controller structure. This will help ensure that the
> structure is zero-initialized and there are no false positives
> with respect to reading any values such as the serial number or
> the OEM PK hash.
>
> Signed-off-by: Bhaumik Bhatt

Can you please also add the Suggested-by tag?

Reviewed-by: Manivannan Sadhasivam

Thanks,
Mani

> ---
>  drivers/bus/mhi/core/init.c | 16 ++++++++++++++++
>  include/linux/mhi.h         | 12 ++++++++++++
>  2 files changed, 28 insertions(+)
>
> diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
> index d2c0f6e..972dbf0 100644
> --- a/drivers/bus/mhi/core/init.c
> +++ b/drivers/bus/mhi/core/init.c
> @@ -959,6 +959,22 @@ void mhi_unregister_controller(struct mhi_controller *mhi_cntrl)
>  }
>  EXPORT_SYMBOL_GPL(mhi_unregister_controller);
>
> +struct mhi_controller *mhi_alloc_controller(void)
> +{
> +	struct mhi_controller *mhi_cntrl;
> +
> +	mhi_cntrl = kzalloc(sizeof(*mhi_cntrl), GFP_KERNEL);
> +
> +	return mhi_cntrl;
> +}
> +EXPORT_SYMBOL_GPL(mhi_alloc_controller);
> +
> +void mhi_free_controller(struct mhi_controller *mhi_cntrl)
> +{
> +	kfree(mhi_cntrl);
> +}
> +EXPORT_SYMBOL_GPL(mhi_free_controller);
> +
>  int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl)
>  {
>  	struct device *dev = &mhi_cntrl->mhi_dev->dev;
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index d15e9ce..a35d876 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -530,6 +530,18 @@ struct mhi_driver {
>  #define to_mhi_device(dev) container_of(dev, struct mhi_device, dev)
>
>  /**
> + * mhi_alloc_controller - Allocate the MHI Controller structure
> + * Allocate the mhi_controller structure using zero initialized memory
> + */
> +struct mhi_controller *mhi_alloc_controller(void);
> +
> +/**
> + * mhi_free_controller - Free the MHI Controller structure
> + * Free the mhi_controller structure which was previously allocated
> + */
> +void mhi_free_controller(struct mhi_controller *mhi_cntrl);
> +
> +/**
>  * mhi_register_controller - Register MHI controller
>  * @mhi_cntrl: MHI controller to register
>  * @config: Configuration to use for the controller
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
Re: [PATCH v6 07/11] bus: mhi: core: Introduce counters to track MHI device state transitions
On Mon, Jul 27, 2020 at 07:02:16PM -0700, Bhaumik Bhatt wrote:
> Use counters to track MHI device state transitions such as those
> to M0, M2, or M3 states. This can help in better debug, allowing
> the user to see the number of transitions to a certain MHI state
> when queried using debugfs entries or via other mechanisms.
>
> Signed-off-by: Bhaumik Bhatt

Reviewed-by: Manivannan Sadhasivam

A minor nit below...

> ---
>  drivers/bus/mhi/core/pm.c | 4 ++++
>  include/linux/mhi.h       | 2 ++
>  2 files changed, 6 insertions(+)
>
> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
> index 27bb471..ce4d969 100644
> --- a/drivers/bus/mhi/core/pm.c
> +++ b/drivers/bus/mhi/core/pm.c
> @@ -256,6 +256,7 @@ int mhi_pm_m0_transition(struct mhi_controller *mhi_cntrl)
>  		dev_err(dev, "Unable to transition to M0 state\n");
>  		return -EIO;
>  	}
> +	mhi_cntrl->M0++;
>
>  	/* Wake up the device */
>  	read_lock_bh(&mhi_cntrl->pm_lock);
> @@ -326,6 +327,8 @@ void mhi_pm_m1_transition(struct mhi_controller *mhi_cntrl)
>  	mhi_cntrl->dev_state = MHI_STATE_M2;
>
>  	write_unlock_irq(&mhi_cntrl->pm_lock);
> +
> +	mhi_cntrl->M2++;
>  	wake_up_all(&mhi_cntrl->state_event);
>
>  	/* If there are any pending resources, exit M2 immediately */
> @@ -362,6 +365,7 @@ int mhi_pm_m3_transition(struct mhi_controller *mhi_cntrl)
>  		return -EIO;
>  	}
>
> +	mhi_cntrl->M3++;
>  	wake_up_all(&mhi_cntrl->state_event);
>
>  	return 0;
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index a8379b3..e38de6d 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -328,6 +328,7 @@ struct mhi_controller_config {
>  * @dev_state: MHI device state
>  * @dev_wake: Device wakeup count
>  * @pending_pkts: Pending packets for the controller
> + * @M0, M2, M3, M3_fast: Counters to track number of device MHI state changes
>  * @transition_list: List of MHI state transitions
>  * @transition_lock: Lock for protecting MHI state transition list
>  * @wlock: Lock for protecting device wakeup
> @@ -407,6 +408,7 @@ struct mhi_controller {
>  	enum mhi_state dev_state;
>  	atomic_t dev_wake;
>  	atomic_t pending_pkts;
> +	u32 M0, M2, M3, M3_fast;

Are you using M3_fast anywhere in current series? If not, please drop it for now.

Thanks,
Mani

>  	struct list_head transition_list;
>  	spinlock_t transition_lock;
>  	spinlock_t wlock;
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
[GIT PULL] xen: branch for v5.9-rc1
Linus, Please git pull the following tag: git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git for-linus-5.9-rc1-tag xen: branch for v5.9-rc1 It contains the following: - two trivial comment fixes - A small series for the Xen balloon driver fixing some issues - A series of the Xen privcmd driver targeting elimination of using get_user_pages*() in this driver - A series for the Xen swiotlb driver cleaning it up and adding support for letting the kernel run as dom0 on Rpi4 Thanks. Juergen arch/arm/xen/mm.c| 34 +- arch/x86/include/asm/xen/hypercall.h | 2 +- drivers/xen/balloon.c| 26 +++- drivers/xen/privcmd.c| 32 +- drivers/xen/swiotlb-xen.c| 119 +-- include/uapi/xen/gntdev.h| 2 +- include/xen/page.h | 1 - include/xen/swiotlb-xen.h| 8 +-- 8 files changed, 119 insertions(+), 105 deletions(-) Boris Ostrovsky (1): swiotlb-xen: use vmalloc_to_page on vmalloc virt addresses Randy Dunlap (2): xen/gntdev: gntdev.h: drop a duplicated word xen: hypercall.h: fix duplicated word Roger Pau Monne (3): xen/balloon: fix accounting in alloc_xenballooned_pages error path xen/balloon: make the balloon wait interruptible Revert "xen/balloon: Fix crash when ballooning on x86 32 bit PAE" Souptick Joarder (3): xen/privcmd: Corrected error handling path xen/privcmd: Mark pages as dirty xen/privcmd: Convert get_user_pages*() to pin_user_pages*() Stefano Stabellini (10): swiotlb-xen: remove start_dma_addr swiotlb-xen: add struct device * parameter to xen_phys_to_bus swiotlb-xen: add struct device * parameter to xen_bus_to_phys swiotlb-xen: add struct device * parameter to xen_dma_sync_for_cpu swiotlb-xen: add struct device * parameter to xen_dma_sync_for_device swiotlb-xen: add struct device * parameter to is_xen_swiotlb_buffer swiotlb-xen: remove XEN_PFN_PHYS swiotlb-xen: introduce phys_to_dma/dma_to_phys translations xen/arm: introduce phys/dma translations in xen_dma_sync_for_* xen/arm: call dma_to_phys on the dma_addr_t parameter of dma_cache_maint
Re: [PATCH] ftrace: Fixup lockdep assert held of text_mutex
On Fri, Aug 7, 2020 at 12:01 PM Steven Rostedt wrote: > > On Fri, 7 Aug 2020 10:59:16 +0800 > Guo Ren wrote: > > > > > > This looks like a bug in the lockdep_assert_held() in whatever arch > > > (riscv) is running. > > Seems you think it's a bug of arch implementation with the wrong usage > > of text_mutex? > > > > Also @riscv maintainer, > > How about modifying it in riscv's code? we still need to solve it. > > > > > > diff --git a/arch/riscv/include/asm/ftrace.h > > b/arch/riscv/include/asm/ftrace.h > > index ace8a6e..fb266c3 100644 > > --- a/arch/riscv/include/asm/ftrace.h > > +++ b/arch/riscv/include/asm/ftrace.h > > @@ -23,6 +23,12 @@ static inline unsigned long > > ftrace_call_adjust(unsigned long addr) > > > > struct dyn_arch_ftrace { > > }; > > + > > +#ifdef CONFIG_DYNAMIC_FTRACE > > +struct dyn_ftrace; > > +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); > > +#define ftrace_init_nop ftrace_init_nop > > +#endif > > #endif > > > > #ifdef CONFIG_DYNAMIC_FTRACE > > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > > index 2ff63d0..9e9f7c0 100644 > > --- a/arch/riscv/kernel/ftrace.c > > +++ b/arch/riscv/kernel/ftrace.c > > @@ -97,6 +97,17 @@ int ftrace_make_nop(struct module *mod, struct > > dyn_ftrace *rec, > > return __ftrace_modify_call(rec->ip, addr, false); > > } > > > > +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) > > +{ > > + int ret; > > + > > + mutex_lock(_mutex); > > + ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR); > > Looking at x86, we have the following code: > > static int ftrace_poke_late = 0; > > int ftrace_arch_code_modify_prepare(void) > __acquires(_mutex) > { > /* > * Need to grab text_mutex to prevent a race from module loading > * and live kernel patching from changing the text permissions while > * ftrace has it set to "read/write". 
> */ > mutex_lock(_mutex); > ftrace_poke_late = 1; > return 0; > } > > int ftrace_arch_code_modify_post_process(void) > __releases(_mutex) > { > /* > * ftrace_make_{call,nop}() may be called during > * module load, and we need to finish the text_poke_queue() > * that they do, here. > */ > text_poke_finish(); > ftrace_poke_late = 0; > mutex_unlock(_mutex); > return 0; > } > > And if ftrace_poke_late is not set, then ftrace_make_nop() does direct > modification (calls text_poke_early(), which is basically a memcpy). > > This path doesn't have any checks against text_mutex being held, > because it only happens at boot up. The solution is ok for me, but I want to get riscv maintainer's opinion before the next patch. @Paul Walmsley @Palmer Dabbelt > > > + mutex_unlock(_mutex); > > + > > + return ret; > > +} > > + > > int ftrace_update_ftrace_func(ftrace_func_t func) > > { > > int ret = __ftrace_modify_call((unsigned long)_call, > > --- > > > > > > --- a/kernel/trace/ftrace.c > > > > +++ b/kernel/trace/ftrace.c > > > > @@ -26,6 +26,7 @@ > > > > #include > > > > #include > > > > #include > > > > +#include > > > > #include > > > > #include > > > > #include > > > > @@ -6712,9 +6713,11 @@ void __init ftrace_init(void) > > > > > > ftrace_init() is called before SMP is initialized. Nothing else should > > > be running here. That means grabbing a mutex is useless. > > I don't agree, ftrace_init are modifying kernel text, so we should > > give the lock of text_mutex to keep semantic consistency. > > > Did you test your patch on x86 with lockdep? Ah.., no :P > > ftrace_process_locs() grabs the ftrace_lock, which I believe is held > when text_mutex is taken in other locations. So this will probably not > work anyway. > > text_mutex isn't to be taken at the ftrace level. Yes, currently it seemed only to be used by kernel/kprobes.c. -- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/
Re: [PATCH v2 37/41] cpufreq: s3c24xx: move low-level clk reg access into platform code
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > Rather than have the cpufreq drivers touch include the > common headers to get the constants, add a small indirection. > This is still not the proper way that would do this through > the common clk API, but it lets us kill off the header file > usage. > > Signed-off-by: Arnd Bergmann > [krzk: Rebase and fix -Wold-style-definition] > Signed-off-by: Krzysztof Kozlowski Acked-by: Viresh Kumar -- viresh
Re: [PATCH v2 36/41] cpufreq: s3c2412: use global s3c2412_cpufreq_setrefresh
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > There are two identical copies of the s3c2412_cpufreq_setrefresh > function: a static one in the cpufreq driver and a global > version in iotiming-s3c2412.c. > > As the function requires the use of a hardcoded register address > from a header that we want to not be visible to drivers, just > move the existing global function and add a declaration in > one of the cpufreq header files. > > Signed-off-by: Arnd Bergmann > Signed-off-by: Krzysztof Kozlowski > --- > drivers/cpufreq/s3c2412-cpufreq.c| 23 > include/linux/soc/samsung/s3c-cpufreq-core.h | 1 + > 2 files changed, 1 insertion(+), 23 deletions(-) Acked-by: Viresh Kumar -- viresh
[git pull] m68knommu changes for v5.9
Hi Linus, Please pull the m68knommu changes for v5.9. Regards Greg The following changes since commit 92ed301919932f13b9172e525674157e983d: Linux 5.8-rc7 (2020-07-26 14:14:06 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git tags/m68knommu-for-v5.9 for you to fetch changes up to fde87ebf1daa8d96e4412aa06536da4b55103e02: m68k: stmark2: enable edma support for dspi (2020-07-27 12:32:00 +1000) m68knommu: collection of fixes for v5.9 Fixes include: . cleanup compiler warnings (IO access functions and unused variables) . ColdFire v3 cache control fix . ColdFire MMU comment cleanup . switch to using asm-generic cmpxchg_local() . stmark platform updates Angelo Dureghello (2): m68k: stmark2: defconfig updates m68k: stmark2: enable edma support for dspi Greg Ungerer (5): m68knommu: __force type casts for raw IO access m68knommu: fix use of cpu_to_le() on IO access m68k: fix ColdFire mmu init compile warning m68knommu: fix overwriting of bits in ColdFire V3 cache control m68k: use asm-generic cmpxchg_local() Mike Rapoport (1): m68k: mcfmmu: remove stale part of comment about steal_context arch/m68k/coldfire/stmark2.c| 5 arch/m68k/configs/stmark2_defconfig | 47 + arch/m68k/include/asm/cmpxchg.h | 8 --- arch/m68k/include/asm/io_no.h | 20 arch/m68k/include/asm/m53xxacr.h| 6 ++--- arch/m68k/mm/mcfmmu.c | 6 - 6 files changed, 45 insertions(+), 47 deletions(-)
Re: [PATCH v2 35/41] ARM: s3c: remove cpufreq header dependencies
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > The cpufreq drivers are split between the machine directory > and the drivers/cpufreq directory. In order to share header > files after we convert s3c to multiplatform, those headers > have to live in a different global location. > > Move them to linux/soc/samsung/ in lack of a better place. > > Signed-off-by: Arnd Bergmann > Signed-off-by: Krzysztof Kozlowski > --- > arch/arm/mach-s3c24xx/common.c | 1 - > arch/arm/mach-s3c24xx/cpufreq-utils.c | 2 +- > arch/arm/mach-s3c24xx/iotiming-s3c2410.c | 2 +- > arch/arm/mach-s3c24xx/iotiming-s3c2412.c | 2 +- > arch/arm/mach-s3c24xx/mach-bast.c | 2 +- > arch/arm/mach-s3c24xx/mach-osiris-dvs.c| 2 +- > arch/arm/mach-s3c24xx/mach-osiris.c| 2 +- > arch/arm/mach-s3c24xx/pll-s3c2410.c| 4 ++-- > arch/arm/mach-s3c24xx/pll-s3c2440-1200.c | 4 ++-- > arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c | 4 ++-- > arch/arm/mach-s3c24xx/s3c2410.c| 1 - > arch/arm/mach-s3c24xx/s3c2412.c| 1 - > arch/arm/mach-s3c24xx/s3c244x.c| 2 -- > arch/arm/mach-s3c64xx/s3c6400.c| 1 - > arch/arm/mach-s3c64xx/s3c6410.c| 2 +- > arch/arm/plat-samsung/include/plat/cpu.h | 9 - > drivers/cpufreq/s3c2410-cpufreq.c | 5 ++--- > drivers/cpufreq/s3c2412-cpufreq.c | 5 ++--- > drivers/cpufreq/s3c2440-cpufreq.c | 5 ++--- > drivers/cpufreq/s3c24xx-cpufreq-debugfs.c | 2 +- > drivers/cpufreq/s3c24xx-cpufreq.c | 5 ++--- > .../linux/soc/samsung/s3c-cpu-freq.h | 4 > .../linux/soc/samsung/s3c-cpufreq-core.h | 6 +- > include/linux/soc/samsung/s3c-pm.h | 10 ++ > 24 files changed, 41 insertions(+), 42 deletions(-) > rename arch/arm/plat-samsung/include/plat/cpu-freq.h => > include/linux/soc/samsung/s3c-cpu-freq.h (97%) > rename arch/arm/plat-samsung/include/plat/cpu-freq-core.h => > include/linux/soc/samsung/s3c-cpufreq-core.h (98%) Acked-by: Viresh Kumar -- viresh
Re: [PATCH v2 34/41] cpufreq: s3c24xx: split out registers
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > Each of the cpufreq drivers uses a fixed set of register > bits, copy those definitions into the drivers to avoid > including mach/regs-clock.h. > > Signed-off-by: Arnd Bergmann > [krzk: Fix build by copying also S3C2410_LOCKTIME] > Signed-off-by: Krzysztof Kozlowski > > --- Acked-by: Viresh Kumar -- viresh
Re: [PATCH v3 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup
On Thu, Aug 06, 2020 at 09:16:03PM -0700, Andrew Morton wrote: > On Wed, 29 Jul 2020 19:10:39 +0200 Michal Koutný wrote: > > > Hello. > > > > On Tue, Jun 23, 2020 at 11:45:14AM -0700, Roman Gushchin > > wrote: > > > Because the size of memory cgroup internal structures can dramatically > > > exceed the size of object or page which is pinning it in the memory, it's > > > not a good idea to simple ignore it. It actually breaks the isolation > > > between cgroups. > > No doubt about accounting the memory if it's significant amount. > > > > > Let's account the consumed percpu memory to the parent cgroup. > > Why did you choose charging to the parent of the created cgroup? > > > > Should the charge go the cgroup _that is creating_ the new memcg? > > > > One reason is that there are the throttling mechanisms for memory limits > > and those are better exercised when the actor and its memory artefact > > are the same cgroup, aren't they? Hi! In general, yes. But in this case I think it wouldn't be a good idea: most often cgroups are created by a centralized daemon (systemd), which is usually located in the root cgroup. Even if it's not located in the root cgroup, limiting its memory will likely affect the whole system, even if only one specific limit was reached. If there is a containerized workload which creates sub-cgroups, charging its parent cgroup is perfectly effective. And the opposite: if we charge the cgroup of the process that created a cgroup, we won't cover the most common case: systemd creating cgroups for all services in the system.
> > > > The second reason is based on the example Dlegation Containment > > (Documentation/admin-guide/cgroup-v2.rst) > > > > > For an example, let's assume cgroups C0 and C1 have been delegated to > > > user U0 who created C00, C01 under C0 and C10 under C1 as follows and > > > all processes under C0 and C1 belong to U0:: > > > > > > ~ - C0 - C00 > > > ~ cgroup~ \ C01 > > > ~ hierarchy ~ > > > ~ - C1 - C10 > > > > Thanks to permissions a task running in C0 creating a cgroup in C1 would > > deplete C1's supply victimizing tasks inside C1. Right, but it's quite unusual for tasks from one cgroup to create sub-cgroups in completely different cgroup. In this particular case there are tons of other ways how a task from C00 can hurt C1. > > These week-old issues appear to be significant. Roman? Or someone > else? Oh, I'm sorry, somehow I've missed this letter. Thank you for pointing at it! Thanks!
Re: [PATCH 2/3] mm/huge_memory.c: update tlb entry if pmd is changed
On Fri, 26 Jun 2020 13:43:06 +0530 "Aneesh Kumar K.V" wrote: > On 6/25/20 10:16 PM, Mike Kravetz wrote: > > On 6/25/20 5:01 AM, Aneesh Kumar K.V wrote: > >> Mike Kravetz writes: > >> > >>> On 6/24/20 2:26 AM, Bibo Mao wrote: > When set_pmd_at is called in function do_huge_pmd_anonymous_page, > new tlb entry can be added by software on MIPS platform. > > Here add update_mmu_cache_pmd when pmd entry is set, and > update_mmu_cache_pmd is defined as empty excepts arc/mips platform. > This patch has no negative effect on other platforms except arc/mips > system. > >>> > >>> I am confused by this comment. It appears that update_mmu_cache_pmd > >>> is defined as non-empty on arc, mips, powerpc and sparc architectures. > >>> Am I missing something? > >>> > >>> If those architectures do provide update_mmu_cache_pmd, then the previous > >>> patch and this one now call update_mmu_cache_pmd with the actual faulting > >>> address instead of the huge page aligned address. This was intentional > >>> for mips. However, are there any potential issues on the other > >>> architectures? > >>> I am no expert in any of those architectures. arc looks like it could be > >>> problematic as update_mmu_cache_pmd calls update_mmu_cache and then > >>> operates on (address & PAGE_MASK). That could now be different. > >>> > >> > >> Also we added update_mmu_cache_pmd to update a THP entry. That could be > >> different from a hugetlb entry on some architectures. If we need to do > >> hugetlb equivalent for update_mmu_cache, we should add a different > >> function. > > > > I do not know the mips architecture well enough or if the motivation for > > this patch was based on THP or hugetlb pages. However, it will change > > the address passed to update_mmu_cache_pmd from huge page aligned to the > > actual faulting address. Will such a change in the passed address impact > > the powerpc update_mmu_cache_pmd routine? > > > > Right now powerpc update_mmu_cache_pmd() is a dummy function. 
But I > agree we should audit arch to make sure such a change can work with > architectures. My comment was related to the fact that mmu cache update > w.r.t THP and hugetlb can be different on some platforms. So we may > want to avoid using the same function for both. So I'll assume that this patch is stalled until such an audit has taken place?
Re: [PATCH v8 0/4] scsi: ufs: Add Host Performance Booster Support
Avri, > Martin - Are you considering to merge the HPB feature eventually to > mainline kernel? I promise to take a look at the new series. But I can't say I'm a big fan of how this feature was defined in the spec. And - as discussed a couple of weeks ago - I would still like to see some supporting evidence from a realistic workload and not a synthetic benchmark. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
On Fri, 3 Jul 2020 18:28:23 +0530 Srikar Dronamraju wrote: > > The memory hotplug changes that somehow because you can hotremove numa > > nodes and therefore make the nodemask sparse but that is not a common > > case. I am not sure what would happen if a completely new node was added > > and its corresponding node was already used by the renumbered one > > though. It would likely conflate the two I am afraid. But I am not sure > > this is really possible with x86 and a lack of a bug report would > > suggest that nobody is doing that at least. > > > > JFYI, > Satheesh copied in this mailchain had opened a bug a year on crash with vcpu > hotplug on memoryless node. > > https://bugzilla.kernel.org/show_bug.cgi?id=202187 So... do we merge this patch or not? Seems that the overall view is "risky but nobody is likely to do anything better any time soon"?
Re: [PATCH 1/3] scripts/sorttable: Change section type of orc_lookup to SHT_PROGBITS
> On Aug 6, 2020, at 11:08 PM, Ingo Molnar wrote: > > > * changhuaixin wrote: > >> Hi, Ingo >> >> Another way to write SHT_PROGBITS is using elf_create_section to write >> orc_lookup table headers, when orc_unwind_ip table and orc_unwind table are >> written. Is this a better solution? >> >> diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c >> index 3f98dcfbc177..860d4dcec8e6 100644 >> --- a/tools/objtool/orc_gen.c >> +++ b/tools/objtool/orc_gen.c >> @@ -183,6 +183,10 @@ int create_orc_sections(struct objtool_file *file) >>u_sec = elf_create_section(file->elf, ".orc_unwind", >> sizeof(struct orc_entry), idx); >> >> + /* make flags of section orc_lookup right */ >> + if (!elf_create_section(file->elf, ".orc_lookup", sizeof(int), 0)) >> + return -1; >> + >>/* populate sections */ >>idx = 0; >>for_each_sec(file, sec) { > > Looks much nicer IMO. > > Mind turning this into a proper patch that does it plus reverts the > hack? > A new patchset is sent. Thanks, huaixin > Thanks, > > Ingo
Re: [PATCH] kernel: time: delete repeated words in comments
On Thu, Aug 6, 2020 at 8:32 PM Randy Dunlap wrote: > > Drop repeated words in kernel/time/. > {when, one, into} > Acked-by: John Stultz (I'm sure I'm to blame) thanks -john
[PATCH 3/3] x86/unwind/orc: Simplify unwind_init() for x86 boot
The ORC fast lookup table is built by scripts/sorttable tool. All that is left is setting lookup_num_blocks.

Signed-off-by: Huaixin Chang
Signed-off-by: Shile Zhang
---
 arch/x86/kernel/unwind_orc.c | 41 ++---------------------------------------
 1 file changed, 2 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index ec88bbe08a32..29890389b4f6 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -264,48 +264,11 @@ void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
 
 void __init unwind_init(void)
 {
-	size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
-	size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
-	size_t num_entries = orc_ip_size / sizeof(int);
-	struct orc_entry *orc;
-	int i;
-
-	if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
-	    orc_size % sizeof(struct orc_entry) != 0 ||
-	    num_entries != orc_size / sizeof(struct orc_entry)) {
-		orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
-		return;
-	}
-
 	/*
-	 * Note, the orc_unwind and orc_unwind_ip tables were already
-	 * sorted at build time via the 'sorttable' tool.
-	 * It's ready for binary search straight away, no need to sort it.
+	 * All ORC tables are sorted and built via sorttable tool. Initialize
+	 * lookup_num_blocks only.
 	 */
-
-	/* Initialize the fast lookup table: */
 	lookup_num_blocks = orc_lookup_end - orc_lookup;
-	for (i = 0; i < lookup_num_blocks-1; i++) {
-		orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
-				 num_entries,
-				 LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
-		if (!orc) {
-			orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
-			return;
-		}
-
-		orc_lookup[i] = orc - __start_orc_unwind;
-	}
-
-	/* Initialize the ending block: */
-	orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
-			 LOOKUP_STOP_IP);
-	if (!orc) {
-		orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
-		return;
-	}
-	orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
-
 	orc_init = true;
 }
-- 
2.14.4.44.g2045bb6
[PATCH 2/3] scripts/sorttable: Build ORC fast lookup table via sorttable tool
Since ORC tables are already sorted by sorttable tool, let us move building of fast lookup table into sorttable tool too. This saves us 6380us from boot time under Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz with 64 cores. Signed-off-by: Huaixin Chang Signed-off-by: Shile Zhang --- arch/x86/include/asm/orc_lookup.h | 16 -- arch/x86/include/asm/orc_types.h | 16 ++ arch/x86/kernel/vmlinux.lds.S | 2 +- scripts/sorttable.h| 96 +++--- tools/arch/x86/include/asm/orc_types.h | 16 ++ 5 files changed, 122 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/orc_lookup.h b/arch/x86/include/asm/orc_lookup.h index 241631282e43..c75eb1f82bdb 100644 --- a/arch/x86/include/asm/orc_lookup.h +++ b/arch/x86/include/asm/orc_lookup.h @@ -5,22 +5,6 @@ #ifndef _ORC_LOOKUP_H #define _ORC_LOOKUP_H -/* - * This is a lookup table for speeding up access to the .orc_unwind table. - * Given an input address offset, the corresponding lookup table entry - * specifies a subset of the .orc_unwind table to search. - * - * Each block represents the end of the previous range and the start of the - * next range. An extra block is added to give the last range an end. - * - * The block size should be a power of 2 to avoid a costly 'div' instruction. - * - * A block size of 256 was chosen because it roughly doubles unwinder - * performance while only adding ~5% to the ORC data footprint. - */ -#define LOOKUP_BLOCK_ORDER 8 -#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER) - #ifndef LINKER_SCRIPT extern unsigned int orc_lookup[]; diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h index d25534940bde..b93c6a7b4da4 100644 --- a/arch/x86/include/asm/orc_types.h +++ b/arch/x86/include/asm/orc_types.h @@ -9,6 +9,22 @@ #include #include +/* + * This is a lookup table for speeding up access to the .orc_unwind table. + * Given an input address offset, the corresponding lookup table entry + * specifies a subset of the .orc_unwind table to search. 
+ * + * Each block represents the end of the previous range and the start of the + * next range. An extra block is added to give the last range an end. + * + * The block size should be a power of 2 to avoid a costly 'div' instruction. + * + * A block size of 256 was chosen because it roughly doubles unwinder + * performance while only adding ~5% to the ORC data footprint. + */ +#define LOOKUP_BLOCK_ORDER 8 +#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER) + /* * The ORC_REG_* registers are base registers which are used to find other * registers on the stack. diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 9a03e5b23135..75760e7f6319 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -29,7 +29,7 @@ #include #include #include -#include +#include #include #include diff --git a/scripts/sorttable.h b/scripts/sorttable.h index a2baa2fefb13..de9822f8ae8f 100644 --- a/scripts/sorttable.h +++ b/scripts/sorttable.h @@ -93,12 +93,50 @@ char g_err[ERRSTR_MAXSZ]; int *g_orc_ip_table; struct orc_entry *g_orc_table; +static unsigned long orc_ip_table_offset; pthread_t orc_sort_thread; +struct orc_sort_param { + size_t lookup_table_size; + unsigned int*orc_lookup_table; + unsigned long start_ip; + size_t text_size; + unsigned intorc_num_entries; +}; + static inline unsigned long orc_ip(const int *ip) { - return (unsigned long)ip + *ip; + return (unsigned long)ip + *ip + orc_ip_table_offset; +} + +static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table, + unsigned int num_entries, unsigned long ip) +{ + int *first = ip_table; + int *last = ip_table + num_entries - 1; + int *mid = first, *found = first; + + if (!num_entries) + return NULL; + + /* +* Do a binary range search to find the rightmost duplicate of a given +* starting address. Some entries are section terminators which are +* "weak" entries for ensuring there are no gaps. They should be +* ignored when they conflict with a real entry. 
+*/ + while (first <= last) { + mid = first + ((last - first) / 2); + + if (orc_ip(mid) <= ip) { + found = mid; + first = mid + 1; + } else + last = mid - 1; + } + + return u_table + (found - ip_table); } static int orc_sort_cmp(const void *_a, const void *_b) @@ -130,18 +168,24 @@ static void *sort_orctable(void *arg) int *idxs = NULL; int *tmp_orc_ip_table = NULL; struct orc_entry *tmp_orc_table = NULL; - unsigned int *orc_ip_size = (unsigned int *)arg; - unsigned int num_entries = *orc_ip_size / sizeof(int); +
[PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool
Move building of fast lookup table from boot to sorttable tool. This saves us 6380us boot time on Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz with cores. It adds a little more than 7ms to boot time when testing on the same CPU. Changelog v2: 1. Write .orc_lookup section header via objtool 2. Move two ORC lookup table macro from orc_lookup.h into orc_types.h 3. Spell 'ORC' in capitalized fashion Huaixin Chang (3): objtool: Write .orc_lookup section header scripts/sorttable: Build ORC fast lookup table via sorttable tool x86/unwind/orc: Simplify unwind_init() for x86 boot arch/x86/include/asm/orc_lookup.h | 16 -- arch/x86/include/asm/orc_types.h | 16 ++ arch/x86/kernel/unwind_orc.c | 41 +-- arch/x86/kernel/vmlinux.lds.S | 2 +- scripts/sorttable.h| 96 +++--- tools/arch/x86/include/asm/orc_types.h | 16 ++ tools/objtool/orc_gen.c| 4 ++ 7 files changed, 128 insertions(+), 63 deletions(-) -- 2.14.4.44.g2045bb6
[PATCH 1/3] objtool: Write .orc_lookup section header
The purpose of this patch is to set sh_type to SHT_PROGBITS and remove write bits away from sh_flags. In order to write section header, just call elf_create_section() upon section orc_lookup with 0 entry written. Originally, section headers are as follows: [23] .orc_unwind_ipPROGBITS 8259f4b8 0179f4b8 00178bbc A 0 0 1 [24] .rela.orc_unwind_ RELA 11e57b58 008d4668 0018 I 7023 8 [25] .orc_unwind PROGBITS 82718074 01918074 0023519a A 0 0 1 [26] .orc_lookup NOBITS 8294d210 01b4d20e 00030038 WA 0 0 1 [27] .vvar PROGBITS 8297e000 01b7e000 1000 WA 0 0 16 Now, they are changed to: [23] .orc_unwind_ipPROGBITS 8259f4b8 0179f4b8 00178bbc A 0 0 1 [24] .rela.orc_unwind_ RELA 11e57b58 008d4668 0018 I 7023 8 [25] .orc_unwind PROGBITS 82718074 01918074 0023519a A 0 0 1 [26] .orc_lookup PROGBITS 8294d210 01b4d210 00030038 A 0 0 1 [27] .vvar PROGBITS 8297e000 01b7e000 1000 WA 0 0 16 Signed-off-by: Huaixin Chang --- tools/objtool/orc_gen.c | 4 1 file changed, 4 insertions(+) diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c index 968f55e6dd94..2b2653979ad6 100644 --- a/tools/objtool/orc_gen.c +++ b/tools/objtool/orc_gen.c @@ -189,6 +189,10 @@ int create_orc_sections(struct objtool_file *file) u_sec = elf_create_section(file->elf, ".orc_unwind", sizeof(struct orc_entry), idx); + /* make flags of section orc_lookup right */ + if (!elf_create_section(file->elf, ".orc_lookup", sizeof(int), 0)) + return -1; + /* populate sections */ idx = 0; for_each_sec(file, sec) { -- 2.14.4.44.g2045bb6
security/integrity/platform_certs/keyring_handler.c:66:16: sparse: sparse: Using plain integer as NULL pointer
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   86cfccb66937dd6cbf26ed619958b9e587e6a115
commit: ad723674d6758478829ee766e3f1a2a24d56236f x86/efi: move common keyring handler functions to new file
date:   9 months ago
config: x86_64-randconfig-s022-20200807 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.2-117-g8c7aee71-dirty
        git checkout ad723674d6758478829ee766e3f1a2a24d56236f
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

sparse warnings: (new ones prefixed by >>)

>> security/integrity/platform_certs/keyring_handler.c:66:16: sparse: sparse: Using plain integer as NULL pointer
   security/integrity/platform_certs/keyring_handler.c:79:16: sparse: sparse: Using plain integer as NULL pointer

vim +66 security/integrity/platform_certs/keyring_handler.c

    57
    58	/*
    59	 * Return the appropriate handler for particular signature list types found in
    60	 * the UEFI db and MokListRT tables.
    61	 */
    62	__init efi_element_handler_t get_handler_for_db(const efi_guid_t *sig_type)
    63	{
    64		if (efi_guidcmp(*sig_type, efi_cert_x509_guid) == 0)
    65			return add_to_platform_keyring;
  > 66		return 0;
    67	}
    68

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

.config.gz
Description: application/gzip
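The warning flags a pointer-returning function that returns the integer literal 0; the mechanical fix is to return NULL. A standalone illustration of the pattern sparse objects to (the typedef and handler names are simplified stand-ins for the kernel's efi_guid_t machinery, not the real code):

```c
#include <assert.h>
#include <stddef.h>

/* simplified stand-in for the kernel's efi_element_handler_t */
typedef void (*efi_element_handler_t)(const char *id, const void *data,
				      size_t len);

static void add_to_platform_keyring(const char *id, const void *data,
				    size_t len)
{
	(void)id; (void)data; (void)len;	/* no-op for the sketch */
}

static efi_element_handler_t get_handler_for_db(int is_x509)
{
	if (is_x509)
		return add_to_platform_keyring;
	return NULL;	/* not "return 0;" -- sparse wants NULL for pointers */
}
```

Both spellings compile to the same code; sparse's complaint is purely about type hygiene, which is why the report is tagged as a cleanup rather than a bug.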
Re: [PATCH v3 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup
On Wed, 29 Jul 2020 19:10:39 +0200 Michal Koutný wrote:
> Hello.
>
> On Tue, Jun 23, 2020 at 11:45:14AM -0700, Roman Gushchin wrote:
> > Because the size of memory cgroup internal structures can dramatically
> > exceed the size of object or page which is pinning it in the memory, it's
> > not a good idea to simply ignore it. It actually breaks the isolation
> > between cgroups.
> No doubt about accounting the memory if it's a significant amount.
>
> > Let's account the consumed percpu memory to the parent cgroup.
> Why did you choose charging to the parent of the created cgroup?
>
> Should the charge go to the cgroup _that is creating_ the new memcg?
>
> One reason is that there are the throttling mechanisms for memory limits
> and those are better exercised when the actor and its memory artefact
> are the same cgroup, aren't they?
>
> The second reason is based on the example in Delegation Containment
> (Documentation/admin-guide/cgroup-v2.rst)
>
> > For an example, let's assume cgroups C0 and C1 have been delegated to
> > user U0 who created C00, C01 under C0 and C10 under C1 as follows and
> > all processes under C0 and C1 belong to U0::
> >
> >   ~~~~~~~~~~~~~ - C0 - C00
> >   ~ cgroup    ~      \ C01
> >   ~ hierarchy ~
> >   ~~~~~~~~~~~~~ - C1 - C10
>
> Thanks to permissions, a task running in C0 creating a cgroup in C1 would
> deplete C1's supply, victimizing tasks inside C1.

These week-old issues appear to be significant. Roman? Or someone else?
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
07.08.2020 07:06, Sowjanya Komatineni пишет: > > On 8/6/20 9:05 PM, Sowjanya Komatineni wrote: >> >> On 8/6/20 9:01 PM, Dmitry Osipenko wrote: >>> 07.08.2020 06:18, Sowjanya Komatineni пишет: On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: > On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: >> On 8/6/20 7:31 PM, Dmitry Osipenko wrote: >>> 06.08.2020 22:01, Sowjanya Komatineni пишет: >>> ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(>mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); >>> Could you please explain why the ACTIVE bit can't be polled >>> instead of >>> using the fixed delay? Doesn't ACTIVE bit represents the state of >>> the >>> busy FSM? >> Based on internal discussion, ACTIVE bit gets cleared when all >> enabled pads calibration is done (same time as when DONE set to 1). >> >> Will request HW designer to look into design and confirm exactly when >> ACTIVE bit gets cleared. >> >> Will get back on this. >> > Verified with HW designer. above is correct. ACTIVE bit update happens > same time as DONE bit. > > Active = !(DONE) > > In case of calibration logic waiting for LP-11 where done bit does not > get set, ACTIVE will still be 1 and on next start trigger new > calibration will start > Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode. 
>>> I meant the start_calibration() will poll the ACTIVE bit (calibration >>> busy), while the finish_calibration() will poll the DONE bit >>> (calibration applied). >> >> ACTIVE bit can be 1 when previous calibration process does not see LP-11. >> >> So there is no need to use ACTIVE bit during start of calibration. >> >> At HW level, both ACTIVE and DONE bits get set at same time. >> >> So waiting for ACTIVE to be 0 during start calibration instead of >> *75uS will not work as ACTIVE bit will not become 0 after calibration >> sequence codes and it will get updated along with DONE bits only after >> applying results to pads which happens after seeing LP-11 on pads. >> > *typo fixed I see now, thank you.
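The conclusion of the thread — poll the DONE bit rather than ACTIVE, since ACTIVE only drops together with DONE and stays high while the calibration FSM waits for LP-11 — reduces to a bounded status poll. The sketch below is a userspace model; a real driver would instead use readl_poll_timeout() against the hardware status register, and the bit positions here are illustrative, not Tegra's actual MIPI_CAL_STATUS layout:

```c
#include <assert.h>
#include <stdbool.h>

#define CAL_STATUS_DONE   (1u << 16)	/* illustrative bit positions */
#define CAL_STATUS_ACTIVE (1u << 0)

/*
 * Model the poll loop over a sequence of status-register reads.
 * Returns true as soon as a read shows DONE; false if the budget of
 * reads is exhausted (e.g. the FSM is still waiting for LP-11 with
 * ACTIVE stuck at 1 and DONE never set).
 */
static bool wait_calibration_done(const unsigned int *status_reads, int n)
{
	for (int i = 0; i < n; i++)
		if (status_reads[i] & CAL_STATUS_DONE)
			return true;
	return false;
}
```

The design point from the discussion is that a `status & CAL_STATUS_ACTIVE` exit condition would be wrong twice over: it can be 1 from a previous, never-completed calibration, and it only clears at the same instant DONE sets.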
[GIT PULL] xfs: new code for 5.9-rc1
Hi Linus, Please pull this large pile of new xfs code for 5.9. There are quite a few changes in this release, the most notable of which is that we've made inode flushing fully asynchronous, and we no longer block memory reclaim on this. Furthermore, we have fixed a long-standing bug in the quota code where soft limit warnings and inode limits were never tracked properly. Moving further down the line, the reflink control loops have been redesigned to behave more efficiently; and numerous small bugs have been fixed (see below). The xattr and quota code have been extensively refactored in preparation for more new features coming down the line. Finally, the behavior of DAX between ext4 and xfs has been stabilized, which gets us a step closer to removing the experimental tag from that feature. We have a few new contributors this time around. Welcome, all! This branch merges cleanly with master as of a few minutes ago, so please let me know if anything strange happens. I anticipate a second pull request next week for a few small bugfixes that have been trickling in, but this is it for big changes. --D The following changes since commit dcb7fd82c75ee2d6e6f9d8cc71c52519ed52e258: Linux 5.8-rc4 (2020-07-05 16:20:22 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-5.9-merge-7 for you to fetch changes up to 818d5a91559ffe1e1f2095dcbbdb96c13fdb94ec: fs/xfs: Support that ioctl(SETXFLAGS/GETXFLAGS) can set/get inode DAX on XFS. (2020-07-28 20:28:20 -0700) New code for 5.9: - Fix some btree block pingponging problems when swapping extents - Redesign the reflink copy loop so that we only run one remapping operation per transaction. This helps us avoid running out of block reservation on highly deduped filesystems. - Take the MMAPLOCK around filemap_map_pages. - Make inode reclaim fully async so that we avoid stalling processes on flushing inodes to disk. 
- Reduce inode cluster buffer RMW cycles by attaching the buffer to dirty inodes so we won't let go of the cluster buffer when we know we're going to need it soon. - Add some more checks to the realtime bitmap file scrubber. - Don't trip false lockdep warnings in fs freeze. - Remove various redundant lines of code. - Remove unnecessary calls to xfs_perag_{get,put}. - Preserve I_VERSION state across remounts. - Fix an unmount hang due to AIL going to sleep with a non-empty delwri buffer list. - Fix an error in the inode allocation space reservation macro that caused regressions in generic/531. - Fix a potential livelock when dquot flush fails because the dquot buffer is locked. - Fix a miscalculation when reserving inode quota that could cause users to exceed a hardlimit. - Refactor struct xfs_dquot to use native types for incore fields instead of abusing the ondisk struct for this purpose. This will eventually enable proper y2038+ support, but for now it merely cleans up the quota function declarations. - Actually increment the quota softlimit warning counter so that soft failures turn into hard(er) failures when they exceed the softlimit warning counter limits set by the administrator. - Split incore dquot state flags into their own field and namespace, to avoid mixing them with quota type flags. - Create a new quota type flags namespace so that we can make it obvious when a quota function takes a quota type (user, group, project) as an argument. - Rename the ondisk dquot flags field to type, as that more accurately represents what we store in it. - Drop our bespoke memory allocation flags in favor of GFP_*. - Rearrange the xattr functions so that we no longer mix metadata updates and transaction management (e.g. rolling complex transactions) in the same functions. This work will prepare us for atomic xattr operations (itself a prerequisite for directory backrefs) in future release cycles. - Support FS_DAX_FL (aka FS_XFLAG_DAX) via GETFLAGS/SETFLAGS. 
Allison Collins (22): xfs: Add xfs_has_attr and subroutines xfs: Check for -ENOATTR or -EEXIST xfs: Factor out new helper functions xfs_attr_rmtval_set xfs: Pull up trans handling in xfs_attr3_leaf_flipflags xfs: Split apart xfs_attr_leaf_addname xfs: Refactor xfs_attr_try_sf_addname xfs: Pull up trans roll from xfs_attr3_leaf_setflag xfs: Factor out xfs_attr_rmtval_invalidate xfs: Pull up trans roll in xfs_attr3_leaf_clearflag xfs: Refactor xfs_attr_rmtval_remove xfs: Pull up xfs_attr_rmtval_invalidate xfs: Add helper function xfs_attr_node_shrink xfs: Remove unneeded xfs_trans_roll_inode calls xfs: Remove xfs_trans_roll in xfs_attr_node_removename xfs: Add helpers xfs_attr_is_shortform and xfs_attr_set_shortform xfs: Add helper function xfs_attr_leaf_mark_incomplete
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 9:05 PM, Sowjanya Komatineni wrote: On 8/6/20 9:01 PM, Dmitry Osipenko wrote: 07.08.2020 06:18, Sowjanya Komatineni пишет: On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(>mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this. Verified with HW designer. above is correct. ACTIVE bit update happens same time as DONE bit. Active = !(DONE) In case of calibration logic waiting for LP-11 where done bit does not get set, ACTIVE will still be 1 and on next start trigger new calibration will start Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode. 
I meant the start_calibration() will poll the ACTIVE bit (calibration busy), while the finish_calibration() will poll the DONE bit (calibration applied). ACTIVE bit can be 1 when previous calibration process does not see LP-11. So there is no need to use ACTIVE bit during start of calibration. At HW level, both ACTIVE and DONE bits get set at same time. So waiting for ACTIVE to be 0 during start calibration instead of *75uS will not work as ACTIVE bit will not become 0 after calibration sequence codes and it will get updated along with DONE bits only after applying results to pads which happens after seeing LP-11 on pads. *typo fixed
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 9:01 PM, Dmitry Osipenko wrote: 07.08.2020 06:18, Sowjanya Komatineni пишет: On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(>mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this. Verified with HW designer. above is correct. ACTIVE bit update happens same time as DONE bit. Active = !(DONE) In case of calibration logic waiting for LP-11 where done bit does not get set, ACTIVE will still be 1 and on next start trigger new calibration will start Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode. I meant the start_calibration() will poll the ACTIVE bit (calibration busy), while the finish_calibration() will poll the DONE bit (calibration applied). 
ACTIVE bit can be 1 when previous calibration process does not see LP-11. So there is no need to use ACTIVE bit during start of calibration. At HW level, both ACTIVE and DONE bits get set at same time. So waiting for ACTIVE to be 0 during start calibration instead of 7uS will not work as ACTIVE bit will not become 0 after calibration sequence codes and it will get updated along with DONE bits only after applying results to pads which happens after seeing LP-11 on pads.
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
07.08.2020 06:18, Sowjanya Komatineni пишет:
>
> On 8/6/20 8:14 PM, Sowjanya Komatineni wrote:
>>
>> On 8/6/20 8:10 PM, Sowjanya Komatineni wrote:
>>>
>>> On 8/6/20 7:31 PM, Dmitry Osipenko wrote:
>>>> 06.08.2020 22:01, Sowjanya Komatineni пишет:
>>>> ...
>>>>> +int tegra_mipi_start_calibration(struct tegra_mipi_device *device)
>>>>> {
>>>>> 	const struct tegra_mipi_soc *soc = device->mipi->soc;
>>>>> 	unsigned int i;
>>>>> @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device)
>>>>> 	value |= MIPI_CAL_CTRL_START;
>>>>> 	tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL);
>>>>> -	mutex_unlock(&device->mipi->lock);
>>>>> -	clk_disable(device->mipi->clk);
>>>>> +	/*
>>>>> +	 * Wait for min 72uS to let calibration logic finish calibration
>>>>> +	 * sequence codes before waiting for pads idle state to apply the
>>>>> +	 * results.
>>>>> +	 */
>>>>> +	usleep_range(75, 80);
>>>> Could you please explain why the ACTIVE bit can't be polled instead of
>>>> using the fixed delay? Doesn't the ACTIVE bit represent the state of the
>>>> busy FSM?
>>> Based on internal discussion, ACTIVE bit gets cleared when all
>>> enabled pads calibration is done (same time as when DONE set to 1).
>>>
>>> Will request HW designer to look into design and confirm exactly when
>>> ACTIVE bit gets cleared.
>>>
>>> Will get back on this.
>>>
>> Verified with HW designer. above is correct. ACTIVE bit update happens
>> same time as DONE bit.
>>
>> Active = !(DONE)
>>
>> In case of calibration logic waiting for LP-11 where done bit does not
>> get set, ACTIVE will still be 1 and on next start trigger new
>> calibration will start
>>
> Based on internal design check from designer, as long as its in waiting
> for LP-11 stage, next calibration request can be triggered again but
> ACTIVE bit we will see it at 1. So we should check for DONE bits to
> confirm if calibration is done or not.
>
> To start next calibration, it can take effect as long as its in wait for
> LP-11 mode.
I meant the start_calibration() will poll the ACTIVE bit (calibration busy), while the finish_calibration() will poll the DONE bit (calibration applied).
Re: [PATCH] ftrace: Fixup lockdep assert held of text_mutex
On Fri, 7 Aug 2020 10:59:16 +0800 Guo Ren wrote:
> > This looks like a bug in the lockdep_assert_held() in whatever arch
> > (riscv) is running.
> Seems you think it's a bug of the arch implementation with the wrong usage
> of text_mutex?
>
> Also @riscv maintainer,
> How about modifying it in riscv's code? we still need to solve it.
>
> diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
> index ace8a6e..fb266c3 100644
> --- a/arch/riscv/include/asm/ftrace.h
> +++ b/arch/riscv/include/asm/ftrace.h
> @@ -23,6 +23,12 @@ static inline unsigned long
> ftrace_call_adjust(unsigned long addr)
>
> struct dyn_arch_ftrace {
> };
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +struct dyn_ftrace;
> +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> +#define ftrace_init_nop ftrace_init_nop
> +#endif
> #endif
>
> #ifdef CONFIG_DYNAMIC_FTRACE
> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index 2ff63d0..9e9f7c0 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -97,6 +97,17 @@ int ftrace_make_nop(struct module *mod, struct
> dyn_ftrace *rec,
> 	return __ftrace_modify_call(rec->ip, addr, false);
> }
>
> +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
> +{
> +	int ret;
> +
> +	mutex_lock(&text_mutex);
> +	ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR);

Looking at x86, we have the following code:

static int ftrace_poke_late = 0;

int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
{
	/*
	 * Need to grab text_mutex to prevent a race from module loading
	 * and live kernel patching from changing the text permissions while
	 * ftrace has it set to "read/write".
	 */
	mutex_lock(&text_mutex);
	ftrace_poke_late = 1;
	return 0;
}

int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
{
	/*
	 * ftrace_make_{call,nop}() may be called during
	 * module load, and we need to finish the text_poke_queue()
	 * that they do, here.
	 */
	text_poke_finish();
	ftrace_poke_late = 0;
	mutex_unlock(&text_mutex);
	return 0;
}

And if ftrace_poke_late is not set, then ftrace_make_nop() does direct
modification (calls text_poke_early(), which is basically a memcpy). This
path doesn't have any checks against text_mutex being held, because it
only happens at boot up.

> +	mutex_unlock(&text_mutex);
> +
> +	return ret;
> +}
> +
> int ftrace_update_ftrace_func(ftrace_func_t func)
> {
> 	int ret = __ftrace_modify_call((unsigned long)&ftrace_call,
> ---
>
> > > --- a/kernel/trace/ftrace.c
> > > +++ b/kernel/trace/ftrace.c
> > > @@ -26,6 +26,7 @@
> > > #include
> > > #include
> > > #include
> > > +#include
> > > #include
> > > #include
> > > #include
> > > @@ -6712,9 +6713,11 @@ void __init ftrace_init(void)
> >
> > ftrace_init() is called before SMP is initialized. Nothing else should
> > be running here. That means grabbing a mutex is useless.
> I don't agree, ftrace_init is modifying kernel text, so we should
> take the text_mutex lock to keep semantic consistency.

Did you test your patch on x86 with lockdep? ftrace_process_locs() grabs
the ftrace_lock, which I believe is held when text_mutex is taken in other
locations. So this will probably not work anyway.

text_mutex isn't to be taken at the ftrace level.

-- Steve
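The x86 pattern Steven points at — a flag that is set only once text_mutex actually needs to be held, so the boot-time path skips locking entirely — reduces to a small state machine. Below is a userspace model of that gating (names mirror the kernel's, but the mutex is simulated with a flag; this is not the real ftrace code):

```c
#include <assert.h>
#include <stdbool.h>

static bool text_mutex_held;	/* stands in for the real mutex */
static bool ftrace_poke_late;

static void arch_code_modify_prepare(void)
{
	text_mutex_held = true;		/* models mutex_lock(&text_mutex) */
	ftrace_poke_late = true;
}

static void arch_code_modify_post_process(void)
{
	ftrace_poke_late = false;
	text_mutex_held = false;	/* models mutex_unlock(&text_mutex) */
}

/*
 * The late path must only run with text_mutex held (models a
 * lockdep_assert_held() check); early boot is single-threaded and
 * needs no lock at all.
 */
static bool make_nop(void)
{
	if (ftrace_poke_late)
		return text_mutex_held;
	return true;	/* early path: direct memcpy-style poke */
}
```

The point of the model is the one Steven makes in the thread: locking belongs in the arch code-modify prepare/post hooks, not sprinkled into ftrace_init(), which runs before SMP when a mutex can protect nothing.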
Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks
On 2020/8/7 上午11:37, Jason Wang wrote:
On 2020/8/7 上午3:18, Alex Dewar wrote:
In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory is allocated to *in and *out, but only the values of in and out are null-checked (i.e. there is a missing dereference). Fix this.
Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)")
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Alex Dewar
Acked-by: Jason Wang

Colin posted something similar:
[PATCH][next] vdpa/mlx5: fix memory allocation failure checks
And I think his fix is better since it prevents raw pointers from being freed.

Thanks
Re: [PATCH][next] vdpa/mlx5: fix memory allocation failure checks
On 2020/8/7 上午12:08, Colin King wrote: From: Colin Ian King The memory allocation failure checking for in and out is currently checking if the pointers are valid rather than the contents of what they point to. Hence the null check on failed memory allocations is incorrect. Fix this by adding the missing indirection in the check. Also for the default case, just set the *in and *out to null as these don't have any thing allocated to kfree. Finally remove the redundant *in and *out check as these have been already done on each allocation in the case statement. Addresses-Coverity: ("Null pointer dereference") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Colin Ian King Acked-by: Jason Wang --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..55bc58e1dae9 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; 
MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd); @@ -927,16 +927,15 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl MLX5_SET(qpc, qpc, rnr_retry, 7); break; default: - goto outerr; + goto outerr_nullify; } - if (!*in || !*out) - goto outerr; return; outerr: kfree(*in); kfree(*out); +outerr_nullify: *in = NULL; *out = NULL; }
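The bug class fixed above is easy to reproduce: `in` and `out` are `void **` output parameters, so `!in` tests the caller's (always valid) pointer instead of the allocation result. A standalone sketch of the wrong and right checks, with a toy allocator in place of kzalloc() (none of this is the mlx5 code itself):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simulated kzalloc() that can be forced to fail. */
static bool alloc_should_fail;

static void *fake_kzalloc(size_t n)
{
	static char pool[64];
	(void)n;
	return alloc_should_fail ? NULL : pool;
}

/* Correct: dereference the output parameters, check the allocations. */
static bool alloc_inout_checked(void **in, void **out)
{
	*in = fake_kzalloc(32);
	*out = fake_kzalloc(32);
	if (!*in || !*out)
		return true;	/* failure detected */
	return false;
}

/* Buggy: tests the parameter pointers, which are never NULL here. */
static bool alloc_inout_buggy(void **in, void **out)
{
	*in = fake_kzalloc(32);
	*out = fake_kzalloc(32);
	if (in == NULL || out == NULL)
		return true;	/* never taken for a live caller */
	return false;
}
```

Note that `kfree(*in)` / `kfree(*out)` on the error path stays safe either way, since kfree(NULL) is a no-op; the nullify-only default case in Colin's version just avoids freeing pointers that were never allocated.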
Re: [BUG] crypto: hisilicon: accessing the data mapped to streaming DMA
On 2020/8/3 9:29, Jia-Ju Bai wrote:
>
>
> On 2020/8/3 9:12, Zhou Wang wrote:
>> On 2020/8/2 22:52, Jia-Ju Bai wrote:
>>> In qm_qp_ctx_cfg(), "sqc" and "aeqc" are mapped to streaming DMA:
>>>    eqc_dma = dma_map_single(..., eqc, ...);
>>>    ..
>>>    aeqc_dma = dma_map_single(..., aeqc, ...);
>> Only sqc, cqc will be configured in qm_qp_ctx_cfg.
>>
>>> Then "sqc" and "aeqc" are accessed at many places, such as:
>>>    eqc->base_l = cpu_to_le32(lower_32_bits(qm->eqe_dma));
>>>    eqc->base_h = cpu_to_le32(upper_32_bits(qm->eqe_dma));
>>>    ..
>>>    aeqc->base_l = cpu_to_le32(lower_32_bits(qm->aeqe_dma));
>>>    aeqc->base_h = cpu_to_le32(upper_32_bits(qm->aeqe_dma));
>> There are sqc, cqc, eqc, aeqc, you seem to have misunderstood them.
>>
>>> These accesses may cause data inconsistency between CPU cache and hardware.
>>>
>>> I am not sure how to properly fix this problem, and thus I only report it.
>> In qm_qp_ctx_cfg, sqc/cqc memory will be allocated and related mailbox will be sent
>> to hardware. In qm_eq_ctx_cfg, eqc/aeqc related operations will be done.
>>
>> So there is no problem here :)
>
> Ah, sorry, I misunderstood qm_eq_ctx_cfg() and qm_qp_ctx_cfg(), because their
> names are quite similar.
> Now, I re-organize this report as follows:
>
> In qm_eq_ctx_cfg(), "eqc" and "aeqc" are mapped to streaming DMA:
>     eqc_dma = dma_map_single(..., eqc, ...);
>     ..
>     aeqc_dma = dma_map_single(..., aeqc, ...);
>
> Then "eqc" and "aeqc" are accessed at some places in qm_eq_ctx_cfg(), such as:
>     eqc->base_l = cpu_to_le32(lower_32_bits(qm->eqe_dma));
>     eqc->base_h = cpu_to_le32(upper_32_bits(qm->eqe_dma));
>     ..
>     aeqc->base_l = cpu_to_le32(lower_32_bits(qm->aeqe_dma));
>     aeqc->base_h = cpu_to_le32(upper_32_bits(qm->aeqe_dma));
>
> These accesses may cause data inconsistency between CPU cache and hardware.
>
> Besides, in qm_qp_ctx_cfg(), "sqc" and "cqc" are mapped to streaming DMA:
>     sqc_dma = dma_map_single(..., sqc, ...);
>     ..
>     cqc_dma = dma_map_single(..., cqc, ...);
>
> Then "sqc" and "cqc" are accessed at some places in qm_qp_ctx_cfg(), such as:
>     sqc->cq_num = cpu_to_le16(qp_id);
>     sqc->w13 = cpu_to_le16(QM_MK_SQC_W13(0, 1, qp->alg_type));
>     ..
>     cqc->dw3 = cpu_to_le32(QM_MK_CQC_DW3_V2(4));
>     cqc->w8 = 0;
>
> These accesses may cause data inconsistency between CPU cache and hardware.
>
> I think such problems (if they are real) can be fixed by finishing data
> assignment before DMA mapping.

Sorry for the late reply. I got your idea: from the semantics of
dma_map_single/dma_unmap_single, we should not mix CPU and device DMA
accesses here. The reason it works well today is that our hardware is
cache-coherent. Will fix this later.

Thanks,
Zhou

>
>
> Best wishes,
> Jia-Ju Bai
>
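The ordering rule Zhou acknowledges — finish all CPU writes to a buffer before dma_map_single(..., DMA_TO_DEVICE), because the map call is where the CPU cache is cleaned to memory on non-coherent systems — can be modeled by treating the map as a snapshot. This is a userspace sketch of the failure mode, not the real DMA API:

```c
#include <assert.h>
#include <string.h>

/*
 * Model: "device memory" sees the buffer only as of the map call, the
 * way a non-cache-coherent device sees whatever the cache clean pushed
 * out.  CPU writes made after the map are not visible to the device.
 */
struct dma_model {
	unsigned char device_view[16];
};

static void model_dma_map(struct dma_model *m, const void *cpu_buf, size_t n)
{
	/* stands in for the cache maintenance done by dma_map_single() */
	memcpy(m->device_view, cpu_buf, n);
}
```

On the cache-coherent hardware mentioned in the thread the device always sees the latest CPU writes, which is why the driver appeared to work; the model shows why the same code would misbehave on a non-coherent platform.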
Re: [PATCHv3] coresight: etm4x: Fix etm4_count race by moving cpuhp callbacks to init
Sorry for the noise. Please ignore previous comment. The change is in old patch set of my series. This change is good to go. On 2020-08-07 11:52, Tingwei Zhang wrote: On Wed, Jul 29, 2020 at 01:13:10PM +0800, Sai Prakash Ranjan wrote: etm4_count keeps track of number of ETMv4 registered and on some systems, a race is observed on etm4_count variable which can lead to multiple calls to cpuhp_setup_state_nocalls_cpuslocked(). This function internally calls cpuhp_store_callbacks() which prevents multiple registrations of callbacks for a given state and due to this race, it returns -EBUSY leading to ETM probe failures like below. coresight-etm4x: probe of 704.etm failed with error -16 This race can easily be triggered with async probe by setting probe type as PROBE_PREFER_ASYNCHRONOUS and with ETM power management property "arm,coresight-loses-context-with-cpu". Prevent this race by moving cpuhp callbacks to etm driver init since the cpuhp callbacks doesn't have to depend on the etm4_count and can be once setup during driver init. Similarly we move cpu_pm notifier registration to driver init and completely remove etm4_count usage. Also now we can use non cpuslocked version of cpuhp callbacks with this movement. 
Fixes: 9b6a3f3633a5 ("coresight: etmv4: Fix CPU power management setup in probe() function") Fixes: 58eb457be028 ("hwtracing/coresight-etm4x: Convert to hotplug state machine") Suggested-by: Suzuki K Poulose Signed-off-by: Sai Prakash Ranjan --- Changes in v3: * Minor cleanups from v2 and change to device_initcall (Stephen Boyd) * Move to non cpuslocked cpuhp callbacks and rename to etm_pm_setup() (Mike Leach) Changes in v2: * Rearrange cpuhp callbacks and move them to driver init (Suzuki K Poulose) --- drivers/hwtracing/coresight/coresight-etm4x.c | 65 +-- 1 file changed, 31 insertions(+), 34 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c index 6d7d2169bfb2..fddfd93b9a7b 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.c +++ b/drivers/hwtracing/coresight/coresight-etm4x.c @@ -48,8 +48,6 @@ module_param(pm_save_enable, int, 0444); MODULE_PARM_DESC(pm_save_enable, "Save/restore state on power down: 1 = never, 2 = self-hosted"); -/* The number of ETMv4 currently registered */ -static int etm4_count; static struct etmv4_drvdata *etmdrvdata[NR_CPUS]; static void etm4_set_default_config(struct etmv4_config *config); static int etm4_set_event_filters(struct etmv4_drvdata *drvdata, @@ -1398,28 +1396,25 @@ static struct notifier_block etm4_cpu_pm_nb = { .notifier_call = etm4_cpu_pm_notify, }; -/* Setup PM. Called with cpus locked. Deals with error conditions and counts */ -static int etm4_pm_setup_cpuslocked(void) +/* Setup PM. 
Deals with error conditions and counts */
+static int __init etm4_pm_setup(void)
 {
 	int ret;

-	if (etm4_count++)
-		return 0;
-
 	ret = cpu_pm_register_notifier(&etm4_cpu_pm_nb);
 	if (ret)
-		goto reduce_count;
+		return ret;

-	ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING,
-						   "arm/coresight4:starting",
-						   etm4_starting_cpu, etm4_dying_cpu);
+	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING,
+					"arm/coresight4:starting",
+					etm4_starting_cpu, etm4_dying_cpu);
 	if (ret)
 		goto unregister_notifier;

-	ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
-						   "arm/coresight4:online",
-						   etm4_online_cpu, NULL);
+	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+					"arm/coresight4:online",
+					etm4_online_cpu, NULL);

 	/* HP dyn state ID returned in ret on success */
 	if (ret > 0) {
@@ -1428,21 +1423,15 @@ static int etm4_pm_setup_cpuslocked(void)
 	}

 	/* failed dyn state - remove others */
-	cpuhp_remove_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING);

 unregister_notifier:
 	cpu_pm_unregister_notifier(&etm4_cpu_pm_nb);
-
-reduce_count:
-	--etm4_count;
 	return ret;
 }

-static void etm4_pm_clear(void)
+static void __init etm4_pm_clear(void)
 {
-	if (--etm4_count != 0)
-		return;
-
 	cpu_pm_unregister_notifier(&etm4_cpu_pm_nb);
 	cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING);
 	if (hp_online) {
@@ -1498,22 +1487,12 @@ static int etm4_probe(struct amba_device *adev, const struct amba_id *id)
 	if (!desc.name)
 		return -ENOMEM;
-
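The race being fixed comes from cpuhp_setup_state_nocalls() refusing a second registration for the same state: two async probes both observe etm4_count == 0, both try to register, and the loser gets -EBUSY — the "error -16" in the report. Moving registration into a single-threaded initcall removes the race by construction. The failure mode itself can be modeled in a few lines (toy names, not the cpuhp implementation):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Model of cpuhp_setup_state_nocalls(): one callback slot per cpuhp state. */
static bool state_registered;

static int setup_state(void)
{
	if (state_registered)
		return -EBUSY;	/* what the losing probe saw as "error -16" */
	state_registered = true;
	return 0;
}
```

With the counter-guarded scheme, "check etm4_count, then register" was not atomic, so both racers could reach setup_state(); from an initcall, setup_state() runs exactly once and the second-caller path can never be taken.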
Re: [PATCH] vdpa/mlx5: Fix uninitialised variable in core/mr.c
On 2020/8/7 上午2:56, Alex Dewar wrote: If the kernel is unable to allocate memory for the variable dmr then err will be returned without being set. Set err to -ENOMEM in this case. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Addresses-Coverity: ("Uninitialized variables") Signed-off-by: Alex Dewar Acked-by: Jason Wang --- drivers/vdpa/mlx5/core/mr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c index f5dec0274133..ef1c550f8266 100644 --- a/drivers/vdpa/mlx5/core/mr.c +++ b/drivers/vdpa/mlx5/core/mr.c @@ -319,8 +319,10 @@ static int add_direct_chain(struct mlx5_vdpa_dev *mvdev, u64 start, u64 size, u8 while (size) { sz = (u32)min_t(u64, MAX_KLM_SIZE, size); dmr = kzalloc(sizeof(*dmr), GFP_KERNEL); - if (!dmr) + if (!dmr) { + err = -ENOMEM; goto err_alloc; + } dmr->start = st; dmr->end = st + sz; -- 2.28.0
Re: [PATCHv3] coresight: etm4x: Fix etm4_count race by moving cpuhp callbacks to init
On Wed, Jul 29, 2020 at 01:13:10PM +0800, Sai Prakash Ranjan wrote: > etm4_count keeps track of the number of ETMv4 devices registered and on some > systems, a race is observed on the etm4_count variable which can lead to > multiple calls to cpuhp_setup_state_nocalls_cpuslocked(). This function > internally calls cpuhp_store_callbacks() which prevents multiple registrations > of callbacks for a given state and due to this race, it returns -EBUSY leading > to ETM probe failures like below. > > coresight-etm4x: probe of 704.etm failed with error -16 > > This race can easily be triggered with async probe by setting probe type > as PROBE_PREFER_ASYNCHRONOUS and with ETM power management property > "arm,coresight-loses-context-with-cpu". > > Prevent this race by moving the cpuhp callbacks to etm driver init, since the > cpuhp callbacks don't have to depend on etm4_count and can be set up once > during driver init. Similarly we move cpu_pm notifier registration > to driver init and completely remove etm4_count usage. We can now also > use the non-cpuslocked version of the cpuhp callbacks with this change. 
> > Fixes: 9b6a3f3633a5 ("coresight: etmv4: Fix CPU power management setup in > probe() function") > Fixes: 58eb457be028 ("hwtracing/coresight-etm4x: Convert to hotplug state > machine") > Suggested-by: Suzuki K Poulose > Signed-off-by: Sai Prakash Ranjan > --- > > Changes in v3: > * Minor cleanups from v2 and change to device_initcall (Stephen Boyd) > * Move to non cpuslocked cpuhp callbacks and rename to etm_pm_setup() (Mike > Leach) > > Changes in v2: > * Rearrange cpuhp callbacks and move them to driver init (Suzuki K Poulose) > > --- > drivers/hwtracing/coresight/coresight-etm4x.c | 65 +-- > 1 file changed, 31 insertions(+), 34 deletions(-) > > diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c > b/drivers/hwtracing/coresight/coresight-etm4x.c > index 6d7d2169bfb2..fddfd93b9a7b 100644 > --- a/drivers/hwtracing/coresight/coresight-etm4x.c > +++ b/drivers/hwtracing/coresight/coresight-etm4x.c > @@ -48,8 +48,6 @@ module_param(pm_save_enable, int, 0444); > MODULE_PARM_DESC(pm_save_enable, > "Save/restore state on power down: 1 = never, 2 = self-hosted"); > > -/* The number of ETMv4 currently registered */ > -static int etm4_count; > static struct etmv4_drvdata *etmdrvdata[NR_CPUS]; > static void etm4_set_default_config(struct etmv4_config *config); > static int etm4_set_event_filters(struct etmv4_drvdata *drvdata, > @@ -1398,28 +1396,25 @@ static struct notifier_block etm4_cpu_pm_nb = { > .notifier_call = etm4_cpu_pm_notify, > }; > > -/* Setup PM. Called with cpus locked. Deals with error conditions and > counts */ > -static int etm4_pm_setup_cpuslocked(void) > +/* Setup PM. 
Deals with error conditions and counts */ > +static int __init etm4_pm_setup(void) > { > int ret; > > - if (etm4_count++) > - return 0; > - > ret = cpu_pm_register_notifier(_cpu_pm_nb); > if (ret) > - goto reduce_count; > + return ret; > > - ret = > cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING, > -"arm/coresight4:starting", > -etm4_starting_cpu, > etm4_dying_cpu); > + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING, > + "arm/coresight4:starting", > + etm4_starting_cpu, etm4_dying_cpu); > > if (ret) > goto unregister_notifier; > > - ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN, > -"arm/coresight4:online", > -etm4_online_cpu, NULL); > + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, > + "arm/coresight4:online", > + etm4_online_cpu, NULL); > > /* HP dyn state ID returned in ret on success */ > if (ret > 0) { > @@ -1428,21 +1423,15 @@ static int etm4_pm_setup_cpuslocked(void) > } > > /* failed dyn state - remove others */ > - cpuhp_remove_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING); > + cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING); > > unregister_notifier: > cpu_pm_unregister_notifier(_cpu_pm_nb); > - > -reduce_count: > - --etm4_count; > return ret; > } > > -static void etm4_pm_clear(void) > +static void __init etm4_pm_clear(void) > { > - if (--etm4_count != 0) > - return; > - > cpu_pm_unregister_notifier(_cpu_pm_nb); > cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING); > if (hp_online) { > @@ -1498,22 +1487,12 @@ static int etm4_probe(struct amba_device *adev, > const struct amba_id *id) > if (!desc.name) > return -ENOMEM; > > - cpus_read_lock(); >
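The race being fixed is a classic check-then-act on a counter. A minimal userspace model (toy names, not the real cpuhp API) shows why: the cpuhp core refuses a second registration of the same state with -EBUSY, and `if (etm4_count++) return 0;` is not atomic, so two asynchronous probes can both observe a zero count and both attempt the registration. Doing the registration exactly once, from driver init before any probe can run, removes the window entirely.

```c
#include <errno.h>

/* Toy model of cpuhp_setup_state_nocalls(): like cpuhp_store_callbacks(),
 * it rejects a second registration for the same state with -EBUSY. */
static int state_registered;

static int toy_cpuhp_setup(void)
{
	if (state_registered)
		return -EBUSY;
	state_registered = 1;
	return 0;
}

/* Old scheme (racy): each probe ran "if (etm4_count++) return 0;" before
 * registering.  The read and increment are a non-atomic check-then-act,
 * so two async probes could both see etm4_count == 0, both call
 * toy_cpuhp_setup(), and the loser would fail probe with -EBUSY (-16). */

/* Fixed scheme: register once from driver init -- no count, no race. */
static int driver_init(void)
{
	return toy_cpuhp_setup();
}
```

With the fixed scheme, a second registration attempt still fails, but nothing in the driver ever makes one.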
Re: [PATCH 1/2] vdpa: ifcvf: return err when fail to request config irq
On 2020/7/23 5:12 PM, Jason Wang wrote: We ignore the error from requesting the config interrupt; fix this. Fixes: e7991f376a4d ("ifcvf: implement config interrupt in IFCVF") Cc: Zhu Lingshan Signed-off-by: Jason Wang --- drivers/vdpa/ifcvf/ifcvf_main.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index f5a60c14b979..ae7110955a44 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -76,6 +76,10 @@ static int ifcvf_request_irq(struct ifcvf_adapter *adapter) ret = devm_request_irq(>dev, irq, ifcvf_config_changed, 0, vf->config_msix_name, vf); + if (ret) { + IFCVF_ERR(pdev, "Failed to request config irq\n"); + return ret; + } for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n", Hi Michael: Any comments on this series? Thanks
Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks
On 2020/8/7 3:18 AM, Alex Dewar wrote: In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory is allocated to *in and *out, but only the values of in and out are null-checked (i.e. there is a missing dereference). Fix this. Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Alex Dewar Acked-by: Jason Wang --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..bcb6600c2839 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; 
MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd);
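The REVERSE_INULL pattern above in a standalone sketch (hypothetical helper, not the mlx5 function): because the output pointers are passed by address, the allocation results live in `*in`/`*out`; `in` and `out` themselves are the never-NULL addresses of the caller's variables, so `if (!in || !out)` can never catch a failed allocation.

```c
#include <stdlib.h>

/* Fixed version: the checks must dereference to test the allocations.
 * Checking "if (!in)" would test the address of the caller's variable,
 * which is always non-NULL here. */
static int alloc_inout_fixed(void **in, void **out, size_t inlen, size_t outlen)
{
	*in = malloc(inlen);
	*out = malloc(outlen);
	if (!*in || !*out) {
		free(*in);
		free(*out);
		*in = *out = NULL;
		return -1;
	}
	return 0;
}
```

This is why static analyzers flag "pointer checked after already being dereferenced / wrong level of indirection" as a distinct defect class: the code compiles cleanly and only misbehaves under allocation failure.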
[PATCH] kernel: time: delete repeated words in comments
Drop repeated words in kernel/time/. {when, one, into} Signed-off-by: Randy Dunlap Cc: John Stultz Cc: Thomas Gleixner --- kernel/time/alarmtimer.c |2 +- kernel/time/sched_clock.c |2 +- kernel/time/timekeeping.c |2 +- 3 files changed, 3 insertions(+), 3 deletions(-) --- linux-next-20200806.orig/kernel/time/alarmtimer.c +++ linux-next-20200806/kernel/time/alarmtimer.c @@ -192,7 +192,7 @@ static void alarmtimer_dequeue(struct al * When a alarm timer fires, this runs through the timerqueue to * see which alarms expired, and runs those. If there are more alarm * timers queued for the future, we set the hrtimer to fire when - * when the next future alarm timer expires. + * the next future alarm timer expires. */ static enum hrtimer_restart alarmtimer_fired(struct hrtimer *timer) { --- linux-next-20200806.orig/kernel/time/sched_clock.c +++ linux-next-20200806/kernel/time/sched_clock.c @@ -229,7 +229,7 @@ void __init generic_sched_clock_init(voi { /* * If no sched_clock() function has been provided at that point, -* make it the final one one. +* make it the final one. */ if (cd.actual_read_sched_clock == jiffy_sched_clock_read) sched_clock_register(jiffy_sched_clock_read, BITS_PER_LONG, HZ); --- linux-next-20200806.orig/kernel/time/timekeeping.c +++ linux-next-20200806/kernel/time/timekeeping.c @@ -2001,7 +2001,7 @@ static inline unsigned int accumulate_ns * logarithmic_accumulation - shifted accumulation of cycles * * This functions accumulates a shifted interval of cycles into - * into a shifted interval nanoseconds. Allows for O(log) accumulation + * a shifted interval nanoseconds. Allows for O(log) accumulation * loop. * * Returns the unconsumed cycles.
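Doubled words like the ones removed in this series of patches can be found mechanically; a rough standalone scanner (illustrative only — the usual tool is a grep back-reference such as `\(\w\+\)\s\+\1`) might look like:

```c
#include <ctype.h>
#include <string.h>

/* Returns the byte offset of the second copy of a doubled word
 * ("the the", "one one"), or -1 if the string has none.
 * Comparison is case-insensitive, so "The the" is also caught. */
static int find_doubled_word(const char *s)
{
	const char *p = s;
	char prev[64] = "", cur[64];

	while (*p) {
		size_t n = 0;

		while (*p && !isalpha((unsigned char)*p))
			p++;		/* skip punctuation and whitespace */
		if (!*p)
			break;
		const char *start = p;
		while (isalpha((unsigned char)*p) && n < sizeof(cur) - 1)
			cur[n++] = (char)tolower((unsigned char)*p++);
		cur[n] = '\0';
		if (prev[0] && strcmp(prev, cur) == 0)
			return (int)(start - s);
		strcpy(prev, cur);
	}
	return -1;
}
```

A scanner like this produces false positives on legitimate doubles ("that that", "had had"), which is why these cleanups still get reviewed by hand.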
[PATCH] kernel: printk: delete repeated words in comments
Drop repeated words "the" in kernel/printk/. Signed-off-by: Randy Dunlap Cc: Petr Mladek Cc: Sergey Senozhatsky --- kernel/printk/printk.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- linux-next-20200806.orig/kernel/printk/printk.c +++ linux-next-20200806/kernel/printk/printk.c @@ -2461,7 +2461,7 @@ void console_unlock(void) * * console_trylock() is not able to detect the preemptive * context reliably. Therefore the value must be stored before -* and cleared after the the "again" goto label. +* and cleared after the "again" goto label. */ do_cond_resched = console_may_schedule; again: @@ -3374,7 +3374,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_line); * @len: length of line placed into buffer * * Start at the end of the kmsg buffer and fill the provided buffer - * with as many of the the *youngest* kmsg records that fit into it. + * with as many of the *youngest* kmsg records that fit into it. * If the buffer is large enough, all available kmsg records will be * copied with a single call. *
[PATCH] kernel: sched: delete repeated words in comments
Drop repeated words in kernel/sched/. {in, not} Signed-off-by: Randy Dunlap Cc: Mathieu Desnoyers Cc: "Paul E. McKenney" Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot --- kernel/sched/fair.c |2 +- kernel/sched/membarrier.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- linux-next-20200806.orig/kernel/sched/fair.c +++ linux-next-20200806/kernel/sched/fair.c @@ -5109,7 +5109,7 @@ static void do_sched_cfs_slack_timer(str /* * When a group wakes up we want to make sure that its quota is not already * expired/exceeded, otherwise it may be allowed to steal additional ticks of - * runtime as update_curr() throttling can not not trigger until it's on-rq. + * runtime as update_curr() throttling can not trigger until it's on-rq. */ static void check_enqueue_throttle(struct cfs_rq *cfs_rq) { --- linux-next-20200806.orig/kernel/sched/membarrier.c +++ linux-next-20200806/kernel/sched/membarrier.c @@ -229,7 +229,7 @@ static int sync_runqueues_membarrier_sta /* * For each cpu runqueue, if the task's mm match @mm, ensure that all -* @mm's membarrier state set bits are also set in in the runqueue's +* @mm's membarrier state set bits are also set in the runqueue's * membarrier state. This ensures that a runqueue scheduling * between threads which are users of @mm has its membarrier state * updated.
[PATCH] kernel: trace: delete repeated words in comments
Drop repeated words in kernel/trace/. {and, the, not} Signed-off-by: Randy Dunlap Cc: Steven Rostedt Cc: Ingo Molnar --- kernel/trace/ftrace.c |2 +- kernel/trace/trace.c |2 +- kernel/trace/trace_dynevent.c |2 +- kernel/trace/trace_events_synth.c |2 +- kernel/trace/tracing_map.c|2 +- 5 files changed, 5 insertions(+), 5 deletions(-) --- linux-next-20200806.orig/kernel/trace/ftrace.c +++ linux-next-20200806/kernel/trace/ftrace.c @@ -2402,7 +2402,7 @@ struct ftrace_ops direct_ops = { * * If the record has the FTRACE_FL_REGS set, that means that it * wants to convert to a callback that saves all regs. If FTRACE_FL_REGS - * is not not set, then it wants to convert to the normal callback. + * is not set, then it wants to convert to the normal callback. * * Returns the address of the trampoline to set to */ --- linux-next-20200806.orig/kernel/trace/trace.c +++ linux-next-20200806/kernel/trace/trace.c @@ -9243,7 +9243,7 @@ void ftrace_dump(enum ftrace_dump_mode o } /* -* We need to stop all tracing on all CPUS to read the +* We need to stop all tracing on all CPUS to read * the next buffer. This is a bit expensive, but is * not done often. We fill all what we can read, * and then release the locks again. --- linux-next-20200806.orig/kernel/trace/trace_dynevent.c +++ linux-next-20200806/kernel/trace/trace_dynevent.c @@ -402,7 +402,7 @@ void dynevent_arg_init(struct dynevent_a * whitespace, all followed by a separator, if applicable. After the * first arg string is successfully appended to the command string, * the optional @operator is appended, followed by the second arg and - * and optional @separator. If no separator was specified when + * optional @separator. If no separator was specified when * initializing the arg, a space will be appended. 
*/ void dynevent_arg_pair_init(struct dynevent_arg_pair *arg_pair, --- linux-next-20200806.orig/kernel/trace/trace_events_synth.c +++ linux-next-20200806/kernel/trace/trace_events_synth.c @@ -1211,7 +1211,7 @@ __synth_event_trace_start(struct trace_e * ENABLED bit is set (which attaches the probe thus allowing * this code to be called, etc). Because this is called * directly by the user, we don't have that but we still need -* to honor not logging when disabled. For the the iterated +* to honor not logging when disabled. For the iterated * trace case, we save the enabed state upon start and just * ignore the following data calls. */ --- linux-next-20200806.orig/kernel/trace/tracing_map.c +++ linux-next-20200806/kernel/trace/tracing_map.c @@ -260,7 +260,7 @@ int tracing_map_add_var(struct tracing_m * to use cmp_fn. * * A key can be a subset of a compound key; for that purpose, the - * offset param is used to describe where within the the compound key + * offset param is used to describe where within the compound key * the key referenced by this key field resides. * * Return: The index identifying the field in the map and associated
[PATCH] kernel: locking: delete repeated words in comments
Drop repeated words in kernel/locking/. {it, no, the} Signed-off-by: Randy Dunlap Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Will Deacon --- kernel/locking/rtmutex.c |4 ++-- kernel/locking/rwsem.c |2 +- kernel/locking/semaphore.c |2 +- 3 files changed, 4 insertions(+), 4 deletions(-) --- linux-next-20200806.orig/kernel/locking/rtmutex.c +++ linux-next-20200806/kernel/locking/rtmutex.c @@ -1438,7 +1438,7 @@ rt_mutex_fasttrylock(struct rt_mutex *lo } /* - * Performs the wakeup of the the top-waiter and re-enables preemption. + * Performs the wakeup of the top-waiter and re-enables preemption. */ void rt_mutex_postunlock(struct wake_q_head *wake_q) { @@ -1833,7 +1833,7 @@ struct task_struct *rt_mutex_next_owner( * been started. * @waiter:the pre-initialized rt_mutex_waiter * - * Wait for the the lock acquisition started on our behalf by + * Wait for the lock acquisition started on our behalf by * rt_mutex_start_proxy_lock(). Upon failure, the caller must call * rt_mutex_cleanup_proxy_lock(). * --- linux-next-20200806.orig/kernel/locking/rwsem.c +++ linux-next-20200806/kernel/locking/rwsem.c @@ -1177,7 +1177,7 @@ rwsem_down_write_slowpath(struct rw_sema /* * If there were already threads queued before us and: -* 1) there are no no active locks, wake the front +* 1) there are no active locks, wake the front * queued process(es) as the handoff bit might be set. * 2) there are no active writers and some readers, the lock * must be read owned; so we try to wake any read lock --- linux-next-20200806.orig/kernel/locking/semaphore.c +++ linux-next-20200806/kernel/locking/semaphore.c @@ -119,7 +119,7 @@ EXPORT_SYMBOL(down_killable); * @sem: the semaphore to be acquired * * Try to acquire the semaphore atomically. Returns 0 if the semaphore has - * been acquired successfully or 1 if it it cannot be acquired. + * been acquired successfully or 1 if it cannot be acquired. * * NOTE: This return value is inverted from both spin_trylock and * mutex_trylock! 
Be careful about this when converting code.
[PATCH] kernel: events: delete repeated words in comments
Drop repeated words in kernel/events/. {if, the, that, with, time} Signed-off-by: Randy Dunlap Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo --- kernel/events/core.c|8 kernel/events/uprobes.c |2 +- 2 files changed, 5 insertions(+), 5 deletions(-) --- linux-next-20200806.orig/kernel/events/core.c +++ linux-next-20200806/kernel/events/core.c @@ -265,7 +265,7 @@ static void event_function_call(struct p if (!event->parent) { /* * If this is a !child event, we must hold ctx::mutex to -* stabilize the the event->ctx relation. See +* stabilize the event->ctx relation. See * perf_event_ctx_lock(). */ lockdep_assert_held(>mutex); @@ -1299,7 +1299,7 @@ static void put_ctx(struct perf_event_co * life-time rules separate them. That is an exiting task cannot fork, and a * spawning task cannot (yet) exit. * - * But remember that that these are parent<->child context relations, and + * But remember that these are parent<->child context relations, and * migration does not affect children, therefore these two orderings should not * interact. * @@ -1438,7 +1438,7 @@ static u64 primary_event_id(struct perf_ /* * Get the perf_event_context for a task and lock it. * - * This has to cope with with the fact that until it is locked, + * This has to cope with the fact that until it is locked, * the context could get moved to another task. */ static struct perf_event_context * @@ -2475,7 +2475,7 @@ static void perf_set_shadow_time(struct * But this is a bit hairy. * * So instead, we have an explicit cgroup call to remain -* within the time time source all along. We believe it +* within the time source all along. We believe it * is cleaner and simpler to understand. */ if (is_cgroup_event(event)) --- linux-next-20200806.orig/kernel/events/uprobes.c +++ linux-next-20200806/kernel/events/uprobes.c @@ -1735,7 +1735,7 @@ void uprobe_free_utask(struct task_struc } /* - * Allocate a uprobe_task object for the task if if necessary. 
+ * Allocate a uprobe_task object for the task if necessary. * Called when the thread hits a breakpoint. * * Returns:
[PATCH] kernel: bpf: delete repeated words in comments
Drop repeated words in kernel/bpf/. {has, the} Signed-off-by: Randy Dunlap Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: net...@vger.kernel.org Cc: b...@vger.kernel.org --- kernel/bpf/core.c |2 +- kernel/bpf/verifier.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- linux-next-20200806.orig/kernel/bpf/core.c +++ linux-next-20200806/kernel/bpf/core.c @@ -1966,7 +1966,7 @@ void bpf_prog_array_delete_safe(struct b * @index: the index of the program to replace * * Skips over dummy programs, by not counting them, when calculating - * the the position of the program to replace. + * the position of the program to replace. * * Return: * * 0 - Success --- linux-next-20200806.orig/kernel/bpf/verifier.c +++ linux-next-20200806/kernel/bpf/verifier.c @@ -8294,7 +8294,7 @@ static bool stacksafe(struct bpf_func_st if (old->stack[spi].slot_type[i % BPF_REG_SIZE] != cur->stack[spi].slot_type[i % BPF_REG_SIZE]) /* Ex: old explored (safe) state has STACK_SPILL in -* this stack slot, but current has has STACK_MISC -> +* this stack slot, but current has STACK_MISC -> * this verifier states are not equivalent, * return false to continue verification of this path */
Re: Re: Re: Re: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR
Hi baolu, I understand you to mean that you want to move the following processing code from acpi_device_create_direct_mappings() into probe_acpi_namespace_devices(), right? If so, I think it's OK. if (pn_dev == NULL) { acpi_device->bus->iommu_ops = _iommu_ops; ret = iommu_probe_device(acpi_device); if (ret) { pr_err("acpi_device probe fail! ret:%d\n", ret); return ret; } return 0; } Best regards Felix cui-oc -----Original Message----- From: Lu Baolu Sent: August 7, 2020 9:08 To: FelixCui-oc ; Joerg Roedel ; io...@lists.linux-foundation.org; linux-kernel@vger.kernel.org; David Woodhouse Cc: baolu...@linux.intel.com; RaymondPang-oc ; CobeChen-oc Subject: Re: Re: Re: Re: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR Hi Felix, On 2020/8/6 14:51, FelixCui-oc wrote: > Hi baolu, > >Sure. Before that, let me sync my understanding with you. You > have an acpi namespace device in ANDD table, it also shows up in the device scope of a RMRR. > >Current code doesn't enumerate that device for the RMRR, hence > iommu_create_device_direct_mappings() doesn't work for this device. > > >At the same time, probe_acpi_namespace_devices() doesn't work > for this device, hence you want to add a home-made > >acpi_device_create_direct_mappings() helper. > > Your understanding is right. > But there is a problem that even if the namespace device in > rmrr is enumerated in the code, probe_acpi_namespace_devices() also doesn't > work for this device. > This is because the dev parameter of the > iommu_create_device_direct_mappings() is not the namespace device in RMRR. > The actual parameter passed in is the namespace device's > physical node device. > In iommu_create_device_direct_mappings(), the physical node > device passed in cannot match the namespace device in rmrr->device[], right? > We need the acpi_device_create_direct_mappings() helper? > > In addition, adev->physical_node_list is related to the __HID > of namespace device reported by the bios. 
> For example, if the __HID reported by the bios belongs to > acpi_pnp_device_ids[], adev->physical_node_list has no devices. > So in acpi_device_create_direct_mappings(), I added the case > that adev->physical_node_list is empty. Got you. Thanks! Have you ever tried to have probe_acpi_namespace_devices() handle the case of an empty adev->physical_node_list at the same time? Best regards, baolu > > > Best regards > Felix cui > > > > > -----Original Message----- > From: Lu Baolu > Sent: August 6, 2020 10:36 > To: FelixCui-oc ; Joerg Roedel > ; io...@lists.linux-foundation.org; > linux-kernel@vger.kernel.org; David Woodhouse > Cc: baolu...@linux.intel.com; RaymondPang-oc > ; CobeChen-oc > Subject: Re: Re: Re: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI > device in RMRR > > Hi Felix, > > On 8/5/20 3:37 PM, FelixCui-oc wrote: >> Hi baolu, >> Let me talk about why acpi_device_create_direct_mappings() is >> needed and please tell me if there is an error. > > Sure. Before that, let me sync my understanding with you. You have an acpi > namespace device in ANDD table, it also shows up in the device scope of a > RMRR. Current code doesn't enumerate that device for the RMRR, hence > iommu_create_device_direct_mappings() doesn't work for this device. > > At the same time, probe_acpi_namespace_devices() doesn't work for this > device, hence you want to add a home-made > acpi_device_create_direct_mappings() helper. > > Did I get it right? > >> In the probe_acpi_namespace_devices() function, only the device >> in the addev->physical_node_list is probed, >> but we need to establish identity mapping for the namespace >> device in RMRR. These are two different devices. > > The namespace device has been probed and put in one drhd's device list. > Hence, it should be processed by probe_acpi_namespace_devices(). So the > question is why there are no devices in addev->physical_node_list? > >> Therefore, the namespace device in RMRR is not mapped in >> probe_acpi_namespace_devices(). 
>> acpi_device_create_direct_mappings() is to create direct >> mappings for namespace devices in RMRR. > > Best regards, > baolu >
[PATCH v4] x86/cpu: Use SERIALIZE in sync_core() when available
The SERIALIZE instruction gives software a way to force the processor to complete all modifications to flags, registers and memory from previous instructions and drain all buffered writes to memory before the next instruction is fetched and executed. Thus, it serves the purpose of sync_core(). Use it when available. Cc: Andy Lutomirski Cc: Cathy Zhang Cc: Dave Hansen Cc: Fenghua Yu Cc: "H. Peter Anvin" Cc: Kyung Min Park Cc: Peter Zijlstra Cc: "Ravi V. Shankar" Cc: Sean Christopherson Cc: linux-e...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Tony Luck Suggested-by: Andy Lutomirski Signed-off-by: Ricardo Neri --- This is v4 from my three previous submission [1], [2], and [3]. The first three patches of the series have been merged in Linus' tree. Hence, I am submitting only this patch for review. [1]. https://lkml.org/lkml/2020/7/27/8 [2]. https://lkml.org/lkml/2020/8/4/1090 [3]. https://lkml.org/lkml/2020/8/6/808 Changes since v3: * Reworked comments in sync_core() for better readability. (Dave Hansen) * Reworked the implementation to align with the style in special_insns.h. No functional changes were introduced. (Tony Luck) Changes since v2: * Support serialize with static_cpu_has() instead of using alternative runtime patching directly. (Borislav Petkov) Changes since v1: * Support SERIALIZE using alternative runtime patching. (Peter Zijlstra, H. Peter Anvin) * Added a note to specify which version of binutils supports SERIALIZE. (Peter Zijlstra) * Verified that (::: "memory") is used. (H. 
Peter Anvin) --- arch/x86/include/asm/special_insns.h | 6 ++ arch/x86/include/asm/sync_core.h | 26 ++ 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 59a3e13204c3..5999b0b3dd4a 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -234,6 +234,12 @@ static inline void clwb(volatile void *__p) #define nop() asm volatile ("nop") +static inline void serialize(void) +{ + /* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */ + asm volatile(".byte 0xf, 0x1, 0xe8" ::: "memory"); +} + #endif /* __KERNEL__ */ #endif /* _ASM_X86_SPECIAL_INSNS_H */ diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h index fdb5b356e59b..089712777fd9 100644 --- a/arch/x86/include/asm/sync_core.h +++ b/arch/x86/include/asm/sync_core.h @@ -5,6 +5,7 @@ #include #include #include +#include #ifdef CONFIG_X86_32 static inline void iret_to_self(void) @@ -54,14 +55,23 @@ static inline void iret_to_self(void) static inline void sync_core(void) { /* -* There are quite a few ways to do this. IRET-to-self is nice -* because it works on every CPU, at any CPL (so it's compatible -* with paravirtualization), and it never exits to a hypervisor. -* The only down sides are that it's a bit slow (it seems to be -* a bit more than 2x slower than the fastest options) and that -* it unmasks NMIs. The "push %cs" is needed because, in -* paravirtual environments, __KERNEL_CS may not be a valid CS -* value when we do IRET directly. +* The SERIALIZE instruction is the most straightforward way to +* do this but it is not universally available. 
+*/ + if (static_cpu_has(X86_FEATURE_SERIALIZE)) { + serialize(); + return; + } + + /* +* For all other processors, there are quite a few ways to do this. +* IRET-to-self is nice because it works on every CPU, at any CPL +* (so it's compatible with paravirtualization), and it never exits +* to a hypervisor. The only down sides are that it's a bit slow +* (it seems to be a bit more than 2x slower than the fastest +* options) and that it unmasks NMIs. The "push %cs" is needed +* because, in paravirtual environments, __KERNEL_CS may not be a +* valid CS value when we do IRET directly. * * In case NMI unmasking or performance ever becomes a problem, * the next best option appears to be MOV-to-CR2 and an -- 2.17.1
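The shape of the change, stripped of the x86 details (the feature probe and both paths are stubbed here, not the kernel API): a feature-gated fast path with a universal fallback, which is the standard pattern when a new instruction is only present on some CPUs.

```c
#include <stdbool.h>

/* Stub of the CPU feature flag (static_cpu_has(X86_FEATURE_SERIALIZE)
 * in the real patch) and a trace of which path ran, for illustration. */
static bool cpu_has_serialize;
static int last_path;	/* 1 = SERIALIZE, 2 = IRET-to-self */

static void serialize_insn(void) { last_path = 1; /* .byte 0xf,0x1,0xe8 */ }
static void iret_to_self(void)   { last_path = 2; }

static void sync_core_sketch(void)
{
	/* Prefer the single-instruction path when the CPU advertises it. */
	if (cpu_has_serialize) {
		serialize_insn();
		return;
	}
	/* Fallback: slower, but works on every CPU at any CPL. */
	iret_to_self();
}
```

In the kernel, `static_cpu_has()` compiles this check down to a patched jump (alternatives), so the dispatch itself costs nothing at runtime after boot.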
Re: [PATCH v2 03/24] virtio: allow __virtioXX, __leXX in config space
On 2020/8/6 1:58 PM, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:37:38AM +0800, Jason Wang wrote: On 2020/8/5 7:45 PM, Michael S. Tsirkin wrote: #define virtio_cread(vdev, structname, member, ptr) \ do {\ might_sleep(); \ /* Must match the member's type, and be integer */ \ - if (!typecheck(typeof(((structname*)0)->member), *(ptr))) \ + if (!__virtio_typecheck(structname, member, *(ptr)))\ (*ptr) = 1; \ A silly question, compared to using set()/get() directly, what's the value of the accessor macros here? Thanks get/set don't convert to the native endian, I guess that's why drivers use cread/cwrite. It is also nice that there's type safety, checking the correct integer width is used. Yes, but this is simply because a macro is used here; how about just doing something similar to virtio_cread_bytes(): static inline void virtio_cread(struct virtio_device *vdev, unsigned int offset, void *buf, size_t len) And do the endian conversion inside? Thanks Then you lose type safety. It's very easy to have an le32 field and try to read it into a u16 by mistake. These macros are all about preventing bugs: and the whole patchset is about several bugs sparse found - that is what prompted me to make type checks more strict. Yes, but we need to do the macro with compiler extensions. I wonder whether the kernel already has something like this, since the request here is pretty common? Thanks
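The type safety being discussed comes from the kernel's typecheck() trick (include/linux/typecheck.h), which does rely on the compiler extensions Jason mentions (GNU C statement expressions and `typeof`). A standalone sketch with a hypothetical config struct — not the virtio code itself:

```c
/* Compare the addresses of two dummies: if the declared member type and
 * the pointed-to type differ, the pointer comparison draws a compiler
 * warning (an error under -Werror), all at compile time.  Requires GNU C. */
#define typecheck(type, x) \
({	type __dummy; \
	__typeof__(x) __dummy2; \
	(void)(&__dummy == &__dummy2); \
	1; \
})

struct cfg { unsigned short max_queues; };	/* hypothetical config layout */

static int check_u16_read(struct cfg *c, unsigned short *ptr)
{
	/* Same shape as virtio_cread(): insist *ptr matches the member type.
	 * Passing an unsigned int pointer here would warn at compile time. */
	return typecheck(__typeof__(c->max_queues), *ptr);
}
```

The macro always evaluates to 1 at runtime; its entire value is the diagnostic the compiler emits when the widths don't match — exactly the le32-into-u16 mistake described above.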
Re: [PATCH v17 14/21] mm/compaction: do page isolation first in compaction
On 2020/8/7 2:38 AM, Alexander Duyck wrote: >> + >> isolate_abort: >> if (locked) >> spin_unlock_irqrestore(>lru_lock, flags); >> + if (page) { >> + SetPageLRU(page); >> + put_page(page); >> + } >> >> /* >> * Updated the cached scanner pfn once the pageblock has been scanned > We should probably be calling SetPageLRU before we release the lru > lock instead of after. It might make sense to just call it before we > get here, similar to how you did in the isolate_fail_put case a few > lines later. Otherwise this seems to violate the rules you had set up > earlier where we were only going to be setting the LRU bit while > holding the LRU lock. Hi Alex, Setting it outside the lock here should be fine. I never said we must set the bit while holding the lock. And this page was taken via get_page_unless_zero(), so there is no worry about release. Thanks Alex
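Why release isn't a concern here: get_page_unless_zero() only takes a reference when the count is nonzero, so it can never resurrect a page that is already headed for the final free. A minimal userspace model of that primitive (C11 atomics; the kernel's version is built on page_ref/atomic helpers, not this code):

```c
#include <stdatomic.h>

/* Model of get_page_unless_zero(): take a reference only if the count
 * is not already zero.  The CAS loop makes the check-and-increment
 * atomic, unlike a plain load followed by an increment. */
static int get_unless_zero(atomic_int *refcount)
{
	int old = atomic_load(refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(refcount, &old, old + 1))
			return 1;	/* got a reference */
	}
	return 0;			/* count was zero: page being freed */
}
```

Because the holder of such a reference pinned the page while it was demonstrably live, later setting a flag and dropping the reference (as in the hunk above) cannot race with the final free.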
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On 2020/8/6 8:29 PM, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 03:03:55PM +0300, Eli Cohen wrote: On Wed, Aug 05, 2020 at 08:51:56AM -0400, Michael S. Tsirkin wrote: On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote: This patch introduces a config op to get the valid iova range from the vDPA device. Signed-off-by: Jason Wang --- include/linux/vdpa.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 239db794357c..b7633ed2500c 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -41,6 +41,16 @@ struct vdpa_device { unsigned int index; }; +/** + * vDPA IOVA range - the IOVA range supported by the device + * @start: start of the IOVA range + * @end: end of the IOVA range + */ +struct vdpa_iova_range { + u64 start; + u64 end; +}; + This is ambiguous. Is end in the range or just behind it? How about first/last? It is customary in the kernel to use start-end where end corresponds to the byte following the last in the range. See struct vm_area_struct vm_start and vm_end fields Exactly my point: include/linux/mm_types.h: unsigned long vm_end; /* The first byte after our end address in this case Jason wants it to be the last byte, not one behind. Ok, I somehow recall the reason :) See: struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ dma_addr_t aperture_end; /* Last address that can be mapped */ bool force_aperture; /* DMA only allowed in mappable range? */ }; So what I proposed here is to be consistent with it. Thanks
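The two conventions under debate, side by side in a plain C sketch: with an exclusive end (the vm_start/vm_end style), a range covering the whole 64-bit space cannot be expressed, because end would have to be 2^64; an inclusive last address (the iommu_domain_geometry style) can express it, at the cost of a `+ 1` in size calculations that wraps to 0 for exactly that full-space range.

```c
#include <stdint.h>

/* vm_area_struct convention: end is the first address past the range. */
static uint64_t range_size_exclusive(uint64_t start, uint64_t end)
{
	return end - start;
}

/* iommu_domain_geometry convention: last is the final address in the
 * range.  Note the +1 wraps to 0 for start=0, last=UINT64_MAX -- the
 * full-space range is representable, but its size is not. */
static uint64_t range_size_inclusive(uint64_t start, uint64_t last)
{
	return last - start + 1;
}
```

That representability trade-off is presumably why the IOMMU geometry struct uses an inclusive last address, and why matching it keeps the vDPA range consistent with the layer it feeds.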
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni wrote: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't the ACTIVE bit represent the state of the busy FSM? Based on internal discussion, the ACTIVE bit gets cleared when calibration of all enabled pads is done (at the same time as DONE is set to 1). Will request the HW designer to look into the design and confirm exactly when the ACTIVE bit gets cleared. Will get back on this. Verified with the HW designer; the above is correct. The ACTIVE bit update happens at the same time as the DONE bit: ACTIVE = !(DONE). In the case of the calibration logic waiting for LP-11, where the DONE bit does not get set, ACTIVE will still be 1, and on the next start trigger a new calibration will start. Based on an internal design check from the designer, as long as it is in the waiting-for-LP-11 stage, the next calibration request can be triggered again, but we will still see the ACTIVE bit at 1. So we should check the DONE bits to confirm whether calibration is done or not. To start the next calibration, it can take effect as long as it is in wait-for-LP-11 mode.
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni wrote: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't the ACTIVE bit represent the state of the busy FSM? Based on internal discussion, the ACTIVE bit gets cleared when calibration of all enabled pads is done (at the same time as DONE is set to 1). Will request the HW designer to look into the design and confirm exactly when the ACTIVE bit gets cleared. Will get back on this. Verified with the HW designer; the above is correct. The ACTIVE bit update happens at the same time as the DONE bit: ACTIVE = !(DONE). In the case of the calibration logic waiting for LP-11, where the DONE bit does not get set, ACTIVE will still be 1, and on the next start trigger a new calibration will start.
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 8:09 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 8:02 PM Saravana Kannan wrote: > > On Thu, Aug 6, 2020 at 7:49 PM John Stultz wrote: > > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > > > wrote: > > > > With all due respect, that's your downstream kernel, the upstream kernel > > > > should not rely on luck, out-of-tree patches or kernel parameters. > > > > > > I agree that would be preferred. But kernel parameters are often there > > > for these sorts of cases where we can't always do the right thing. As > > > for out-of-tree patches, broken things don't get fixed until > > > out-of-tree patches are developed and upstreamed, and I know Saravana > > > is doing exactly that, and I hope his fw_devlink work helps fix it so > > > the module loading is not just a matter of luck. > > > > Btw, the only downstream fw_devlink change is setting it to =on (vs > > =permissive in upstream). > > I thought there was the clk_sync_state stuff as well? That's not needed to solve the module load ordering issues and deferred probe issues. That's only needed to keep clocks on till some of the modules are loaded and it depends on fw_devlink, but not really a part of fw_devlink IMHO. And yes, that's on my list of things to upstream. > > > Also I think Thierry's comments in the other thread today are also > > > good ideas for ways to better handle the optional dt link handling > > > (rather than using a timeout). > > > > Could you please give me a lore link to this thread? Just curious. > > Sure: https://lore.kernel.org/lkml/20200806135251.GB3351349@ulmo/ Thanks. -Saravana
Fwd: upstream test error: WARNING in do_epoll_wait
> >From: linux-kernel-ow...@vger.kernel.org >>on behalf of syzbot >Sent: Aug 5, 2020 15:19 >To: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; >>syzkaller->b...@googlegroups.com; v...@zeniv.linux.org.uk >Subject: upstream test error: WARNING in do_epoll_wait >Hello, >syzbot found the following issue on: >HEAD commit: 4f30a60a Merge tag 'close-range-v5.9' of git://git.kernel... >git tree: upstream >console output: https://syzkaller.appspot.com/x/log.txt?x=14c5a7da90 >kernel config: https://syzkaller.appspot.com/x/.config?x=8bdd9944dedf0f16 >dashboard link: https://syzkaller.appspot.com/bug?extid=4429670d8213f5f26352 >compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project.git >>ca2dcbd030eadbf0aa9b660efe864ff08af6e18b) >IMPORTANT: if you fix the issue, please add the following tag to the commit: >Reported-by: syzbot+4429670d8213f5f26...@syzkaller.appspotmail.com >[ cut here ] >WARNING: CPU: 1 PID: 8728 at fs/eventpoll.c:1828 ep_poll fs/eventpoll.c:1828 >[inline] >WARNING: CPU: 1 PID: 8728 at fs/eventpoll.c:1828 do_epoll_wait+0x337/0x920 >>fs/eventpoll.c:2333 >Kernel panic - not syncing: panic_on_warn set ... 
>CPU: 1 PID: 8728 Comm: syz-fuzzer Not tainted 5.8.0-syzkaller #0 >Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >>Google 01/01/2011 >Call Trace: >__dump_stack lib/dump_stack.c:77 [inline] >dump_stack+0x16e/0x25d lib/dump_stack.c:118 >panic+0x20c/0x69a kernel/panic.c:231 >__warn+0x211/0x240 kernel/panic.c:600 >report_bug+0x153/0x1d0 lib/bug.c:198 >handle_bug+0x4d/0x90 arch/x86/kernel/traps.c:235 >exc_invalid_op+0x16/0x70 arch/x86/kernel/traps.c:255 >asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:547 >RIP: 0010:ep_poll fs/eventpoll.c:1828 [inline] >RIP: 0010:do_epoll_wait+0x337/0x920 fs/eventpoll.c:2333 >Code: 41 be 01 00 00 00 31 c0 48 89 44 24 20 45 31 e4 e9 7f 01 00 00 e8 59 ab >c6 ff 41 bc f2 >ff ff ff e9 c8 03 00 00 e8 49 ab c6 ff <0f> 0b e9 58 fe ff ff >49 bf ff ff ff ff ff ff ff 7f e9 f0 fe ff >ff >RSP: 0018:c9e1fe28 EFLAGS: 00010293 >RAX: 81856297 RBX: 888120fafa00 RCX: 88811e196400 >RDX: RSI: RDI: >RBP: R08: 818560d8 R09: 88619eb7 >R10: R11: R12: 7000 >R13: 0080 R14: 0001 R15: 0003 >__do_sys_epoll_pwait fs/eventpoll.c:2364 [inline] >__se_sys_epoll_pwait fs/eventpoll.c:2350 [inline] >__x64_sys_epoll_pwait+0x92/0x150 fs/eventpoll.c:2350 >do_syscall_64+0x6a/0xe0 arch/x86/entry/common.c:384 >entry_SYSCALL_64_after_hwframe+0x44/0xa9 >RIP: 0033:0x469240 >Code: 0f 05 89 44 24 20 c3 cc cc cc 8b 7c 24 08 48 8b 74 24 10 8b 54 24 18 44 >8b 54 24 1c >49 c7 c0 00 00 00 00 b8 19 01 00 00 0f 05 <89> 44 24 20 c3 cc cc >cc cc cc cc cc cc cc cc cc >8b 7c 24 08 48 c7 >RSP: 002b:00c4b7f0 EFLAGS: 0246 ORIG_RAX: 0119 >RAX: ffda RBX: 0001 RCX: 00469240 >RDX: 0080 RSI: 00c4b840 RDI: 0003 >RBP: 00c4be40 R08: R09: >R10: 0001 R11: 0246 R12: 0003 >R13: 00c9cc00 R14: 00c00032c180 R15: >Kernel Offset: disabled >Rebooting in 86400 seconds.. In the ep_poll() function, lockdep_assert_irqs_enabled() checks the interrupt status. Before entering ep_poll(), IRQs are already enabled, but the lockdep IRQ-state update was missed.
--- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -76,6 +76,7 @@ noinstr long syscall_enter_from_user_mode(struct pt_regs *regs, long syscall) instrumentation_begin(); local_irq_enable(); + lockdep_hardirqs_on(CALLER_ADDR0); ti_work = READ_ONCE(current_thread_info()->flags); if (ti_work & SYSCALL_ENTER_WORK) syscall = syscall_trace_enter(regs, syscall, ti_work); -- Should we add lockdep_hardirqs_on() here? >--- >This report is generated by a bot. It may contain errors. >See https://goo.gl/tpsmEJ for more information about syzbot. >syzbot engineers can be reached at syzkal...@googlegroups.com. >syzbot will keep track of this issue. See: >https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni wrote: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't the ACTIVE bit represent the state of the busy FSM? Based on internal discussion, the ACTIVE bit gets cleared when calibration of all enabled pads is done (at the same time as DONE is set to 1). Will request the HW designer to look into the design and confirm exactly when the ACTIVE bit gets cleared. Will get back on this.
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 8:02 PM Saravana Kannan wrote: > On Thu, Aug 6, 2020 at 7:49 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > > wrote: > > > With all due respect, that's your downstream kernel, the upstream kernel > > > should not rely on luck, out-of-tree patches or kernel parameters. > > > > I agree that would be preferred. But kernel parameters are often there > > for these sorts of cases where we can't always do the right thing. As > > for out-of-tree patches, broken things don't get fixed until > > out-of-tree patches are developed and upstreamed, and I know Saravana > > is doing exactly that, and I hope his fw_devlink work helps fix it so > > the module loading is not just a matter of luck. > > Btw, the only downstream fw_devlink change is setting it to =on (vs > =permissive in upstream). I thought there was the clk_sync_state stuff as well? > > Also I think Thierry's comments in the other thread today are also > > good ideas for ways to better handle the optional dt link handling > > (rather than using a timeout). > > Could you please give me a lore link to this thread? Just curious. Sure: https://lore.kernel.org/lkml/20200806135251.GB3351349@ulmo/ thanks -john
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On 2020/8/6 8:10 PM, Eli Cohen wrote: On Wed, Jun 17, 2020 at 06:29:44AM +0300, Jason Wang wrote: This patch introduce a config op to get valid iova range from the vDPA device. Signed-off-by: Jason Wang --- include/linux/vdpa.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 239db794357c..b7633ed2500c 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -41,6 +41,16 @@ struct vdpa_device { unsigned int index; }; +/** + * vDPA IOVA range - the IOVA range support by the device + * @start: start of the IOVA range + * @end: end of the IOVA range + */ +struct vdpa_iova_range { + u64 start; + u64 end; +}; + What do you do with this information? Suppose some device tells you it supports some limited range, say, from 0x4000 to 0x8000. What does qemu do with this information? For qemu, it will fail the vDPA device creation when: 1) vIOMMU is not enabled and the GPA is out of this range 2) vIOMMU is enabled but it can't report such a range to the guest For other userspace applications, they will know they can only use this range as their IOVA. Thanks
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 7:49 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > wrote: > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > wrote: > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > > wrote: > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > > >> wrote: > > > > > > >>> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try > > > > > > >>> this > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > >>> Saravana's macro magic. > > > > > > >>> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > > >>> seeing > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > >>> building the driver in. > > > > > > >> Does that mean the modular version is working? Or you haven't > > > > > > >> tried > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I > > > > > > >> don't > > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > > >> looking > > > > > > >> at the code delta. > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started > > > > > > > on -next > > > > > > > around 20200727, but I didn't track it down as, well, there's > > > > > > > less way > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting > > > > > > > does > > > > > > > allow me to boot again. > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > > reverted > > > > > > boots, however, module (on the c630 at least) doesn't boot whether > > > > > > it's > > > > > > a module or built-in. 
> > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > more grace time to load. > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > please don't upstream anything that depends on this parameter being > > > > increased. > > > > > > > > Compiling any of these drivers as module should not require the user to > > > > pass additional kernel command line parameters in order to get their > > > > device to boot. > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > avoid it. But the reality is that it is already required (at least in > > > configurations heavily using modules) to give more time for modules > > > loaded to resolve missing dependencies after init begins (due to > > > changes in the driver core to fail loading after init so that optional > > > dt links aren't eternally looked for). This was seen when trying to > > > enable the qualcom clk drivers to modules. > > > > > > > So to clarify what you're saying, any system that boots successfully > > with the default options is a sign of pure luck - regardless of being > > builtin or modules. > > > > > > And there you have my exact argument against the deferred timeout magic > > going on in the driver core. But as you know people insist that it's > > more important to be able to boot some defunct system from NFS than a > > properly configured one reliably. > > I'd agree, but the NFS case was in use before, and when the original > deferred timeout/optional link handling stuff landed no one complained > they were broken by it (at least at the point where it landed). Only > later when we started enabling more lower-level core drivers as > modules did the shortened dependency resolution time start to bite > folks. 
My attempt to set the default to be 30 seconds helped there, > but caused trouble and delays for the NFS case, and "don't break > existing users" seemed to rule, so I set the default timeout back to > 0. > > > > It doesn't seem necessary in this case, but I suggested it here as > > > I've got it enabled by default in my AOSP builds so that the > > > module-heavy configs for GKI boot properly (even if Saravana's > > > fw_devlink work is disabled). > > > > > > > With all due respect, that's your downstream kernel, the upstream kernel > > should not rely on luck, out-of-tree patches or kernel parameters. > > I agree that would be preferred. But kernel parameters are often there > for these sorts of cases where we can't always do the right thing. As > for out-of-tree patches, broken things don't get fixed until > out-of-tree patches are developed and upstreamed, and I know Saravana > is doing exactly that, and I hope his fw_devlink work helps fix it so > the module loading is not just a matter of luck. Btw, the only downstream fw_devlink change is setting it to =on (vs =permissive in upstream). > Also I think Thierry's comments in the other thread today are also >
Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy
On 2020/8/6 6:00 PM, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 03:27:38PM +0800, Jason Wang wrote: On 2020/8/6 1:53 PM, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote: On 2020/8/5 7:40 PM, Michael S. Tsirkin wrote: On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote: On 2020/8/4 5:00 AM, Michael S. Tsirkin wrote: Some legacy guests just assume features are 0 after reset. We detect that config space is accessed before features are set and set features to 0 automatically. Note: some legacy guests might not even access config space, if this is reported in the field we might need to catch a kick to handle these. I wonder whether it's easier to just support modern devices? Thanks Well hardware vendors are I think interested in supporting legacy guests. Limiting vdpa to modern only would make it uncompetitive. My understanding is that IOMMU_PLATFORM is mandatory for hardware vDPA to work. Hmm I don't really see why. Assume host maps guest memory properly, VM does not have an IOMMU, legacy guest can just work. Yes, guest may not set IOMMU_PLATFORM. Care explaining what's wrong with this picture? The problem is virtio_vdpa, without IOMMU_PLATFORM it uses PA which can not work if IOMMU is enabled. Thanks So that's a virtio_vdpa limitation. Probably not, I think this goes back to the long debate of whether to use the DMA API unconditionally. If we did that, we could support legacy virtio drivers. The vDPA device needs to provide a DMA device, and the virtio core can perform DMA API operations with that device, which should work for all of the cases. But a big question is, does upstream care about out-of-tree virtio drivers? Thanks In the same way, if a device does not have an on-device iommu *and* is not behind an iommu, then vdpa can't bind to it. But this virtio_vdpa specific hack does not belong in generic vdpa code. So it can only work for modern devices ... Thanks
Re: [PATCH v3] drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()
On Tue, Jul 21, 2020 at 6:17 PM, Xin He wrote: > > From: Qi Liu > > We should put the reference count of the fence after calling > virtio_gpu_cmd_submit(). So add the missing dma_fence_put(). > > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit > synchronization") > Co-developed-by: Xin He > Signed-off-by: Xin He > Signed-off-by: Qi Liu > Reviewed-by: Muchun Song > --- > > changelog in v3: > 1) Change the subject from "drm/virtio: fixed memory leak in > virtio_gpu_execbuffer_ioctl()" to >"drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()" > 2) Rework the commit log > > changelog in v2: > 1) Add a change description > > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > index 5df722072ba0..19c5bc01eb79 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > @@ -179,6 +179,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device > *dev, void *data, > > virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, > vfpriv->ctx_id, buflist, out_fence); > + dma_fence_put(&out_fence->f); > virtio_gpu_notify(vgdev); > return 0; > > -- > 2.21.1 (Apple Git-122.3) > cc Greg -- Xin He
Re: [PATCH] ftrace: Fixup lockdep assert held of text_mutex
On Thu, Aug 6, 2020 at 11:48 PM Steven Rostedt wrote: > > On Thu, 6 Aug 2020 14:50:54 + > guo...@kernel.org wrote: > > > From: Guo Ren > > > > The function ftrace_process_locs() will modify text code, so we > > should take the text_mutex lock. Because some arch's patch code > > will assert held of text_mutex even during start_kernel-> > > ftrace_init(). > > NAK. > > This looks like a bug in the lockdep_assert_held() in whatever arch > (riscv) is running. So you think it's a bug in the arch implementation, with wrong usage of text_mutex? Also, @riscv maintainers: how about modifying it in riscv's code? We still need to solve it. diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h index ace8a6e..fb266c3 100644 --- a/arch/riscv/include/asm/ftrace.h +++ b/arch/riscv/include/asm/ftrace.h @@ -23,6 +23,12 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr) struct dyn_arch_ftrace { }; + +#ifdef CONFIG_DYNAMIC_FTRACE +struct dyn_ftrace; +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); +#define ftrace_init_nop ftrace_init_nop +#endif #endif #ifdef CONFIG_DYNAMIC_FTRACE diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 2ff63d0..9e9f7c0 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -97,6 +97,17 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, return __ftrace_modify_call(rec->ip, addr, false); } +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) +{ + int ret; + + mutex_lock(&text_mutex); + ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR); + mutex_unlock(&text_mutex); + + return ret; +} + int ftrace_update_ftrace_func(ftrace_func_t func) { int ret = __ftrace_modify_call((unsigned long)&ftrace_call, --- > > > > backtrace log: > >assert by lockdep_assert_held(&text_mutex) > > 0 patch_insn_write (addr=0xffe010fc , > > insn=0xffe001203eb8, len=8) at arch/riscv/kernel/patch.c:63 > > 1 0xffe0002042ec in patch_text_nosync (addr=, > > insns=, len=) at 
arch/riscv/kernel/patch.c:93 > > 2 0xffe00020628e in __ftrace_modify_call (hook_pos=, > > target=, enable=) at > > arch/riscv/kernel/ftrace.c:68 > > 3 0xffe0002063c0 in ftrace_make_nop (mod=, > > rec=0xffe001221c70 , addr=18446743936272720288) at > > arch/riscv/kernel/ftrace.c:97 > > 4 0xffe0002b13f0 in ftrace_init_nop (rec=, > > mod=) at ./include/linux/ftrace.h:647 > > 5 ftrace_nop_initialize (rec=, mod=) at > > kernel/trace/ftrace.c:2619 > > 6 ftrace_update_code (new_pgs=, mod=) at > > kernel/trace/ftrace.c:3063 > > 7 ftrace_process_locs (mod=, start=, > > end=) at kernel/trace/ftrace.c:6154 > > 8 0xffe0b6e6 in ftrace_init () at kernel/trace/ftrace.c:6715 > > 9 0xffe01b48 in start_kernel () at init/main.c:888 > > 10 0xffe010a8 in _start_kernel () at arch/riscv/kernel/head.S:247 > > > > Signed-off-by: Guo Ren > > Cc: Steven Rostedt > > Cc: Ingo Molnar > > --- > > kernel/trace/ftrace.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > > index 1903b80..4b48b88 100644 > > --- a/kernel/trace/ftrace.c > > +++ b/kernel/trace/ftrace.c > > @@ -26,6 +26,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -6712,9 +6713,11 @@ void __init ftrace_init(void) > > ftrace_init() is called before SMP is initialized. Nothing else should > be running here. That means grabbing a mutex is useless. I don't agree; ftrace_init() is modifying kernel text, so we should take text_mutex to keep semantic consistency. > > -- Steve > > > > > > last_ftrace_enabled = ftrace_enabled = 1; > > > > + mutex_lock(&text_mutex); > > ret = ftrace_process_locs(NULL, > > __start_mcount_loc, > > __stop_mcount_loc); > > + mutex_unlock(&text_mutex); > > > > pr_info("ftrace: allocated %ld pages with %ld groups\n", > > ftrace_number_of_pages, ftrace_number_of_groups); > -- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson wrote: > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > wrote: > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > wrote: > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > >> wrote: > > > > > >>> > > > > > >>> So this is where I bashfully admit I didn't get a chance to try > > > > > >>> this > > > > > >>> patch series out, as I had success with a much older version of > > > > > >>> Saravana's macro magic. > > > > > >>> > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > >>> seeing > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > >>> building the driver in. > > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > >> looking > > > > > >> at the code delta. > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on > > > > > > -next > > > > > > around 20200727, but I didn't track it down as, well, there's less > > > > > > way > > > > > > to get debug output on the C630. > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > > allow me to boot again. > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > reverted > > > > > boots, however, module (on the c630 at least) doesn't boot whether > > > > > it's > > > > > a module or built-in. > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > more grace time to load. 
> > > > > > With the risk of me reading more into this than what you're saying, > > > please don't upstream anything that depend this parameter to be > > > increased. > > > > > > Compiling any of these drivers as module should not require the user to > > > pass additional kernel command line parameters in order to get their > > > device to boot. > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > avoid it. But the reality is that it is already required (at least in > > configurations heavily using modules) to give more time for modules > > loaded to resolve missing dependencies after init begins (due to > > changes in the driver core to fail loading after init so that optional > > dt links aren't eternally looked for). This was seen when trying to > > enable the qualcom clk drivers to modules. > > > > So to clarify what you're saying, any system that boots successfully > with the default options is a sign of pure luck - regardless of being > builtin or modules. > > > And there you have my exact argument against the deferred timeout magic > going on in the driver core. But as you know people insist that it's > more important to be able to boot some defunct system from NFS than a > properly configured one reliably. I'd agree, but the NFS case was in use before, and when the original deferred timeout/optional link handling stuff landed no one complained they were broken by it (at least at the point where it landed). Only later when we started enabling more lower-level core drivers as modules did the shortened dependency resolution time start to bite folks. My attempt to set the default to be 30 seconds helped there, but caused trouble and delays for the NFS case, and "don't break existing users" seemed to rule, so I set the default timeout back to 0. 
> > It doesn't seem necessary in this case, but I suggested it here as > > I've got it enabled by default in my AOSP builds so that the > > module-heavy configs for GKI boot properly (even if Saravana's > > fw_devlink work is disabled). > > > > With all due respect, that's your downstream kernel, the upstream kernel > should not rely on luck, out-of-tree patches or kernel parameters. I agree that would be preferred. But kernel parameters are often there for these sorts of cases where we can't always do the right thing. As for out-of-tree patches, broken things don't get fixed until out-of-tree patches are developed and upstreamed, and I know Saravana is doing exactly that, and I hope his fw_devlink work helps fix it so the module loading is not just a matter of luck. Also I think Thierry's comments in the other thread today are also good ideas for ways to better handle the optional dt link handling (rather than using a timeout). thanks -john
Re: [PATCH] powerpc/signal: Move and simplify get_clean_sp()
Christoph Hellwig writes: > On Thu, Aug 06, 2020 at 08:50:20AM +, Christophe Leroy wrote: >> get_clean_sp() is only used in kernel/signal.c . Move it there. >> >> And GCC is smart enough to reduce the function when on PPC32, no >> need of a special PPC32 simple version. > > What about just open coding it in the only caller, which would seem even > cleaner? +1 cheers
Re: [PATCH] sched/core: add unlikely in group_has_capacity()
Yeah, because of the following two points, I also think the probability is 0%: a) the sd is protected by rcu lock, and the load_balance() func runs between rcu_read_lock() and rcu_read_unlock(). b) the sgs is a local variable. So in group_classify(), the env->sd->imbalance_pct and the sgs will not be changed. May I remove the duplicate check from group_has_capacity() and resubmit a patch? Yours, Qi Zheng On 2020/8/6 10:45 PM, Ingo Molnar wrote: * Qi Zheng wrote: 1. The group_has_capacity() function is only called in group_classify(). 2. Before calling the group_has_capacity() function, group_is_overloaded() will first evaluate the following condition; if it holds, group_classify() will directly return group_overloaded. (sgs->group_capacity * imbalance_pct) < (sgs->group_runnable * 100) Therefore, when group_has_capacity() is called, the probability that the above formula holds is very small. Hint compilers about that. Signed-off-by: Qi Zheng --- kernel/sched/fair.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2ba8f230feb9..9074fd5e23b2 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8234,8 +8234,8 @@ group_has_capacity(unsigned int imbalance_pct, struct sg_lb_stats *sgs) if (sgs->sum_nr_running < sgs->group_weight) return true; - if ((sgs->group_capacity * imbalance_pct) < - (sgs->group_runnable * 100)) + if (unlikely((sgs->group_capacity * imbalance_pct) < + (sgs->group_runnable * 100))) return false; Isn't the probability that this second check will match around 0%? I.e. wouldn't the right fix be to remove the duplicate check from group_has_capacity(), because it's already been checked in group_classify()? Maybe while leaving a comment in place? Thanks, Ingo
Re: [GIT PULL] dlm updates for 5.9
The pull request you sent on Thu, 6 Aug 2020 11:45:07 -0500: > git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm.git dlm-5.9 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/86cfccb66937dd6cbf26ed619958b9e587e6a115 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] RESEND: thermal for v5.9-rc1
On Thu, Aug 6, 2020 at 1:19 PM Daniel Lezcano wrote:
>
> - Add generic netlink support for userspace notifications: events, temperature and discovery commands (Daniel Lezcano)

This is "default y". Why? The help text doesn't explain either. Please explain, or remove the default y. We don't add new features and then try to force people to use them by enabling them by default.

"default y" is mainly for when something unconditional gets split up and becomes conditional (so now "default y" means that you don't break people's setups when they don't even know what it is).

Alternatively, "default y" is for things that make people's lives immeasurably better somehow, and it would be a crime to not enable it because it's _so_ wonderful.

So far, I'm not convinced we've ever hit that second case. Convince me that the thermal layer is so magical that it really warrants it. Tell me why and how my life is improved by enabling it.

Linus
Re: [PATCH v8 5/8] powerpc/vdso: Prepare for switching VDSO to generic C implementation.
Segher Boessenkool writes: > On Thu, Aug 06, 2020 at 12:03:33PM +1000, Michael Ellerman wrote: >> Segher Boessenkool writes: >> > On Wed, Aug 05, 2020 at 04:24:16PM +1000, Michael Ellerman wrote: >> >> Christophe Leroy writes: >> >> > Indeed, 32-bit doesn't have a redzone, so I believe it needs a stack >> >> > frame whenever it has anything to same. > > ^^^ > >> >> > fbb60: 94 21 ff e0 stwu r1,-32(r1) >> > >> > This is the *only* place where you can use a negative offset from r1: >> > in the stwu to extend the stack (set up a new stack frame, or make the >> > current one bigger). >> >> (You're talking about 32-bit code here right?) > > The "SYSV" ELF binding, yeah, which is used for 32-bit on Linux (give or > take, ho hum). > > The ABIs that have a red zone are much nicer here (but less simple) :-) Yep, just checking I wasn't misunderstanding your comment about negative offsets. >> >> At the same time it's much safer for us to just save/restore r2, and >> >> probably in the noise performance wise. >> > >> > If you want a function to be able to work with ABI-compliant code safely >> > (in all cases), you'll have to make it itself ABI-compliant as well, >> > yes :-) >> >> True. Except this is the VDSO which has previously been a bit wild west >> as far as ABI goes :) > > It could get away with many things because it was guaranteed to be a > leaf function. Some of those things even violate the ABIs, but you can > get away with it easily, much reduced scope. Now if this is generated > code, violating the rules will catch up with you sooner rather than > later ;-) Agreed. cheers
Re: [GIT PULL] erofs fixes for 5.9-rc1
The pull request you sent on Thu, 6 Aug 2020 11:20:17 +0800: > git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git > tags/erofs-for-5.9-rc1 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/019c407c1dfb81c37036323597e18cce73c84122 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] CIFS/SMB3 Fixes
The pull request you sent on Mon, 3 Aug 2020 17:45:03 -0500: > git://git.samba.org/sfrench/cifs-2.6.git tags/5.9-rc-smb3-fixes-part1 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/327a8d76b1ac2037f87bf041f3bc076407284ffc Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL v2] iomap: new code for 5.9-rc1
The pull request you sent on Thu, 6 Aug 2020 08:07:43 -0700: > git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/iomap-5.9-merge-5 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/0e4656a299db8484933a143259e7e5ebae2e3a01 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
06.08.2020 22:01, Sowjanya Komatineni wrote:
...
> +int tegra_mipi_start_calibration(struct tegra_mipi_device *device)
> {
> 	const struct tegra_mipi_soc *soc = device->mipi->soc;
> 	unsigned int i;
> @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device)
> 	value |= MIPI_CAL_CTRL_START;
> 	tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL);
>
> -	mutex_unlock(&device->mipi->lock);
> -	clk_disable(device->mipi->clk);
> +	/*
> +	 * Wait for a minimum of 72 us to let the calibration logic finish
> +	 * the calibration sequence codes before waiting for the pads' idle
> +	 * state to apply the results.
> +	 */
> +	usleep_range(75, 80);

Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't the ACTIVE bit represent the state of the busy FSM?
[PATCH] timer: mask unnecessary set of flags in do_init_timer
From: Qianli Zhao

do_init_timer() can specify the flags of a timer_list, but this function does not expect the caller to specify the CPU or idx. If a user invokes do_init_timer() and specifies a CPU, the result may not be what we expect. E.g.: do_init_timer() invoked on core 2 with flags 0x1 yields a final flags value of 0x3. If the specified CPU number exceeds the actual number of CPUs, more serious problems will happen.

Signed-off-by: Qianli Zhao
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 026ac01..17e3737 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -789,7 +789,7 @@ static void do_init_timer(struct timer_list *timer,
 {
 	timer->entry.pprev = NULL;
 	timer->function = func;
-	timer->flags = flags | raw_smp_processor_id();
+	timer->flags = (flags & ~TIMER_BASEMASK & ~TIMER_ARRAYMASK) | raw_smp_processor_id();
 	lockdep_init_map(&timer->lockdep_map, name, key, 0);
 }
--
2.7.4
WARNING: refcount bug in bt_accept_dequeue
Hello, syzbot found the following issue on: HEAD commit:47ec5303 Merge git://git.kernel.org/pub/scm/linux/kernel/g.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1083e7ec90 kernel config: https://syzkaller.appspot.com/x/.config?x=e0c783f658542f35 dashboard link: https://syzkaller.appspot.com/bug?extid=6048aa700d088954b0fc compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c227dc90 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+6048aa700d088954b...@syzkaller.appspotmail.com [ cut here ] refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 3805 at lib/refcount.c:25 refcount_warn_saturate+0x13d/0x1a0 lib/refcount.c:25 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 3805 Comm: krfcommd Not tainted 5.8.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1f0/0x31e lib/dump_stack.c:118 panic+0x264/0x7a0 kernel/panic.c:231 __warn+0x227/0x250 kernel/panic.c:600 report_bug+0x1b1/0x2e0 lib/bug.c:198 handle_bug+0x42/0x80 arch/x86/kernel/traps.c:235 exc_invalid_op+0x16/0x40 arch/x86/kernel/traps.c:255 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536 RIP: 0010:refcount_warn_saturate+0x13d/0x1a0 lib/refcount.c:25 Code: c7 93 1e 15 89 31 c0 e8 71 0c a9 fd 0f 0b eb a3 e8 88 66 d7 fd c6 05 97 4d ed 05 01 48 c7 c7 ca 1e 15 89 31 c0 e8 53 0c a9 fd <0f> 0b eb 85 e8 6a 66 d7 fd c6 05 7a 4d ed 05 01 48 c7 c7 f6 1e 15 RSP: 0018:c90001997b80 EFLAGS: 00010246 RAX: b5077223a4630100 RBX: 0002 RCX: 88809902e380 RDX: RSI: 8000 RDI: RBP: 0002 R08: 815dffd9 R09: ed1015d262c0 R10: ed1015d262c0 R11: R12: 88808df4b4b8 R13: 88808df4b080 R14: R15: dc00 refcount_add include/linux/refcount.h:205 [inline] refcount_inc include/linux/refcount.h:241 
[inline] sock_hold include/net/sock.h:692 [inline] bt_accept_dequeue+0x34e/0x560 net/bluetooth/af_bluetooth.c:206 l2cap_sock_accept+0x21c/0x430 net/bluetooth/l2cap_sock.c:332 kernel_accept+0x143/0x410 net/socket.c:3569 rfcomm_accept_connection net/bluetooth/rfcomm/core.c:1931 [inline] rfcomm_process_sessions+0x1c5/0xc540 net/bluetooth/rfcomm/core.c:1990 rfcomm_run+0x3b5/0x900 net/bluetooth/rfcomm/core.c:2086 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Kernel Offset: disabled Rebooting in 86400 seconds.. --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot can test patches for this issue, for details see: https://goo.gl/tpsmEJ#testing-patches
WARNING in irqentry_exit
Hello, syzbot found the following issue on: HEAD commit:47ec5303 Merge git://git.kernel.org/pub/scm/linux/kernel/g.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1348d9dc90 kernel config: https://syzkaller.appspot.com/x/.config?x=7bb894f55faf8242 dashboard link: https://syzkaller.appspot.com/bug?extid=d4336c84ed0099fdbe47 compiler: gcc (GCC) 10.1.0-syz 20200507 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f1572a90 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1266949490 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+d4336c84ed0099fdb...@syzkaller.appspotmail.com [ cut here ] WARNING: CPU: 0 PID: 6845 at kernel/entry/common.c:338 irqentry_exit+0x47/0x90 kernel/entry/common.c:342 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 6845 Comm: syz-executor025 Not tainted 5.8.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:231 __warn.cold+0x20/0x45 kernel/panic.c:600 report_bug+0x1bd/0x210 lib/bug.c:198 handle_bug+0x38/0x90 arch/x86/kernel/traps.c:235 exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:255 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536 RIP: 0010:irqentry_exit+0x47/0x90 kernel/entry/common.c:338 Code: 27 f6 87 91 00 00 00 02 74 18 40 84 f6 75 3b 65 8b 05 ad ea 12 78 a9 ff ff ff 7f 74 11 e9 91 cd 8a f9 40 84 f6 75 3f c3 eb 89 <0f> 0b eb ca e8 f0 bf 72 f9 65 48 8b 04 25 00 ff 01 00 48 8b 00 a8 RSP: :c900054471d0 EFLAGS: 00010002 RAX: 0001 RBX: RCX: 8179db77 RDX: 88809b98e340 RSI: RDI: c90005447208 RBP: c90005447208 R08: R09: R10: R11: R12: R13: R14: R15: exc_page_fault+0xc2/0x160 arch/x86/mm/fault.c:1421 asm_exc_page_fault+0x1e/0x30 arch/x86/include/asm/idtentry.h:538 RIP: 0010:__softirqentry_text_end+0x39/0x55 Code: be 04 e0 ff ba f2 ff ff ff 
31 c9 e9 db c5 ce ff 41 bf f2 ff ff ff 31 c0 e9 6a 36 e1 f8 41 bf f2 ff ff ff 31 ed e9 85 36 e1 f8 <41> be f2 ff ff ff 48 31 c0 e9 92 39 e1 f8 41 be f2 ff ff ff 48 31 RSP: :c900054472b0 EFLAGS: 00010046 RAX: RBX: c90005447370 RCX: 810143e2 RDX: 88809b98e340 RSI: 810143f0 RDI: 0006 RBP: R08: R09: 88809b98e347 R10: R11: 0001 R12: c9000544737c R13: dc00 R14: R15: 88809b98e340 get_perf_callchain+0x321/0x620 kernel/events/callchain.c:222 perf_callchain+0x165/0x1c0 kernel/events/core.c:6986 perf_prepare_sample+0x8fd/0x1d40 kernel/events/core.c:7013 __perf_event_output kernel/events/core.c:7171 [inline] perf_event_output_forward+0xf3/0x270 kernel/events/core.c:7191 __perf_event_overflow+0x13c/0x370 kernel/events/core.c:8846 perf_swevent_overflow kernel/events/core.c:8922 [inline] perf_swevent_event+0x4b9/0x550 kernel/events/core.c:8960 perf_tp_event+0x2e4/0xb50 kernel/events/core.c:9378 perf_trace_run_bpf_submit+0x11c/0x200 kernel/events/core.c:9352 perf_trace_lock+0x2e2/0x4a0 include/trace/events/lock.h:39 trace_lock_release include/trace/events/lock.h:58 [inline] lock_release+0x571/0x8e0 kernel/locking/lockdep.c:5022 alloc_set_pte+0x3af/0x1ac0 mm/memory.c:3675 filemap_map_pages+0x103d/0x1280 mm/filemap.c:2725 do_fault_around mm/memory.c:3818 [inline] do_read_fault mm/memory.c:3852 [inline] do_fault mm/memory.c:3985 [inline] handle_pte_fault mm/memory.c:4225 [inline] __handle_mm_fault mm/memory.c:4357 [inline] handle_mm_fault+0x39b7/0x43f0 mm/memory.c:4394 do_user_addr_fault+0x5a2/0xd00 arch/x86/mm/fault.c:1295 handle_page_fault arch/x86/mm/fault.c:1365 [inline] exc_page_fault+0xa8/0x160 arch/x86/mm/fault.c:1418 asm_exc_page_fault+0x1e/0x30 arch/x86/include/asm/idtentry.h:538 RIP: 0033:0x49f4d0 Code: Bad RIP value. RSP: 002b:7fff1785f378 EFLAGS: 00010202 RAX: 0049f4d0 RBX: 0001 RCX: 006d1190 RDX: 00402140 RSI: RDI: 004bf848 RBP: 7fff1785f380 R08: 004002c8 R09: 004002c8 R10: R11: 0246 R12: 0001 R13: 006d1180 R14: R15: Kernel Offset: disabled Rebooting in 86400 seconds.. 
--- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot can test patches for this issue, for details see:
WARNING in compat_do_ebt_get_ctl
Hello, syzbot found the following issue on: HEAD commit:47ec5303 Merge git://git.kernel.org/pub/scm/linux/kernel/g.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=17e92e7690 kernel config: https://syzkaller.appspot.com/x/.config?x=7c06047f622c5724 dashboard link: https://syzkaller.appspot.com/bug?extid=5accb5c62faa1d346480 compiler: gcc (GCC) 10.1.0-syz 20200507 userspace arch: i386 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+5accb5c62faa1d346...@syzkaller.appspotmail.com WARNING: CPU: 0 PID: 783 at include/linux/thread_info.h:134 copy_overflow include/linux/thread_info.h:134 [inline] WARNING: CPU: 0 PID: 783 at include/linux/thread_info.h:134 check_copy_size include/linux/thread_info.h:143 [inline] WARNING: CPU: 0 PID: 783 at include/linux/thread_info.h:134 copy_to_user include/linux/uaccess.h:151 [inline] WARNING: CPU: 0 PID: 783 at include/linux/thread_info.h:134 compat_do_ebt_get_ctl+0x47e/0x500 net/bridge/netfilter/ebtables.c:2270 Kernel panic - not syncing: panic_on_warn set ... 
CPU: 0 PID: 783 Comm: syz-executor.2 Not tainted 5.8.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:231 __warn.cold+0x20/0x45 kernel/panic.c:600 report_bug+0x1bd/0x210 lib/bug.c:198 handle_bug+0x38/0x90 arch/x86/kernel/traps.c:235 exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:255 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536 RIP: 0010:copy_overflow include/linux/thread_info.h:134 [inline] RIP: 0010:check_copy_size include/linux/thread_info.h:143 [inline] RIP: 0010:copy_to_user include/linux/uaccess.h:151 [inline] RIP: 0010:compat_do_ebt_get_ctl+0x47e/0x500 net/bridge/netfilter/ebtables.c:2270 Code: ba fd ff ff 4c 89 f7 e8 a0 11 a4 fa e9 ad fd ff ff e8 06 0f 64 fa 4c 89 e2 be 50 00 00 00 48 c7 c7 00 4e 0e 89 e8 64 20 35 fa <0f> 0b e9 dc fd ff ff 41 bc f2 ff ff ff e9 4f fe ff ff e8 7b 11 a4 RSP: 0018:c900047b7ae8 EFLAGS: 00010282 RAX: RBX: 1920008f6f5e RCX: RDX: 0004 RSI: 815d8eb7 RDI: f520008f6f4f RBP: 8a8f3460 R08: 0001 R09: 88802ce31927 R10: R11: 33383754 R12: ffab R13: 2100 R14: c900047b7d80 R15: c900047b7b20 do_ebt_get_ctl+0x2b4/0x790 net/bridge/netfilter/ebtables.c:2317 nf_getsockopt+0x72/0xd0 net/netfilter/nf_sockopt.c:116 ip_getsockopt net/ipv4/ip_sockglue.c:1778 [inline] ip_getsockopt+0x164/0x1c0 net/ipv4/ip_sockglue.c:1757 raw_getsockopt+0x1a1/0x1d0 net/ipv4/raw.c:876 __sys_getsockopt+0x219/0x4c0 net/socket.c:2179 __do_sys_getsockopt net/socket.c:2194 [inline] __se_sys_getsockopt net/socket.c:2191 [inline] __ia32_sys_getsockopt+0xb9/0x150 net/socket.c:2191 do_syscall_32_irqs_on arch/x86/entry/common.c:84 [inline] __do_fast_syscall_32+0x57/0x80 arch/x86/entry/common.c:126 do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:149 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c RIP: 0023:0xf7f24569 Code: c4 01 10 03 03 74 c0 01 10 
05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:f551e0bc EFLAGS: 0296 ORIG_RAX: 016d RAX: ffda RBX: 0003 RCX: RDX: 0082 RSI: 2100 RDI: 2180 RBP: R08: R09: R10: R11: R12: R13: R14: R15: Kernel Offset: disabled Rebooting in 86400 seconds.. --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
Re: [PATCH 2/2] selinux: add attributes to avc tracepoint
On Thu, Aug 6, 2020 at 12:37 PM Steven Rostedt wrote: > On Thu, 6 Aug 2020 08:32:38 -0400 > Stephen Smalley wrote: > > > > > > In the commit message or in a Documentation/trace/events-avc.rst ? > > > > I was just asking for it in the commit message / patch description. I > > don't know what is typical for Documentation/trace. > > No, the Documentation/trace is for generic tracing documentation. Not > for individual events. As I've said many times in the past, I've never rejected a patch because the patch description was too verbose, but I have rejected patches where the description hasn't provided enough useful information. I would really like to see the commit description show an example where you setup/load/etc. the event into the kernel, trigger and capture the event information, and then show how the event output can be parsed/processed into something meaningful by a user. I'm essentially looking for a "hello world" version of the SELinux tracepoint, does that make sense? -- paul moore www.paul-moore.com
[PATCH v5 2/2] soc: mediatek: add mt6779 devapc driver
MediaTek bus fabric provides TrustZone security support and data protection to prevent slaves from being accessed by unexpected masters. The security violation is logged and sent to the processor for further analysis or countermeasures. Any occurrence of a security violation raises an interrupt, which is handled by the mtk-devapc driver. The violation information is printed in order to find the murderer.

Signed-off-by: Neal Liu
---
 drivers/soc/mediatek/Kconfig | 9 ++
 drivers/soc/mediatek/Makefile | 1 +
 drivers/soc/mediatek/mtk-devapc.c | 315 +
 3 files changed, 325 insertions(+)
 create mode 100644 drivers/soc/mediatek/mtk-devapc.c

diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig
index 59a56cd..1177c98 100644
--- a/drivers/soc/mediatek/Kconfig
+++ b/drivers/soc/mediatek/Kconfig
@@ -17,6 +17,15 @@ config MTK_CMDQ
 	  time limitation, such as updating display configuration during
 	  the vblank.
 
+config MTK_DEVAPC
+	tristate "Mediatek Device APC Support"
+	help
+	  Say yes here to enable support for the Mediatek Device APC driver.
+	  This driver is mainly used to handle the violation which catches
+	  unexpected transactions.
+	  The violation information is logged for further analysis or
+	  countermeasures.
+ config MTK_INFRACFG bool "MediaTek INFRACFG Support" select REGMAP diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile index 01f9f87..abfd4ba 100644 --- a/drivers/soc/mediatek/Makefile +++ b/drivers/soc/mediatek/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only obj-$(CONFIG_MTK_CMDQ) += mtk-cmdq-helper.o +obj-$(CONFIG_MTK_DEVAPC) += mtk-devapc.o obj-$(CONFIG_MTK_INFRACFG) += mtk-infracfg.o obj-$(CONFIG_MTK_PMIC_WRAP) += mtk-pmic-wrap.o obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o diff --git a/drivers/soc/mediatek/mtk-devapc.c b/drivers/soc/mediatek/mtk-devapc.c new file mode 100644 index 000..73287cf --- /dev/null +++ b/drivers/soc/mediatek/mtk-devapc.c @@ -0,0 +1,315 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2020 MediaTek Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define VIO_MOD_TO_REG_IND(m) ((m) / 32) +#define VIO_MOD_TO_REG_OFF(m) ((m) % 32) + +struct mtk_devapc_vio_dbgs { + union { + u32 vio_dbg0; + struct { + u32 mstid:16; + u32 dmnid:6; + u32 vio_w:1; + u32 vio_r:1; + u32 addr_h:4; + u32 resv:4; + } dbg0_bits; + }; + + u32 vio_dbg1; +}; + +struct mtk_devapc_data { + u32 vio_idx_num; + u32 vio_mask_offset; + u32 vio_sta_offset; + u32 vio_dbg0_offset; + u32 vio_dbg1_offset; + u32 apc_con_offset; + u32 vio_shift_sta_offset; + u32 vio_shift_sel_offset; + u32 vio_shift_con_offset; +}; + +struct mtk_devapc_context { + struct device *dev; + void __iomem *infra_base; + struct clk *infra_clk; + const struct mtk_devapc_data *data; +}; + +static void clear_vio_status(struct mtk_devapc_context *ctx) +{ + void __iomem *reg; + int i; + + reg = ctx->infra_base + ctx->data->vio_sta_offset; + + for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->data->vio_idx_num - 1); i++) + writel(GENMASK(31, 0), reg + 4 * i); + + writel(GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num - 1), 0), + reg + 4 * i); +} + +static void mask_module_irq(struct mtk_devapc_context *ctx, bool mask) +{ + 
void __iomem *reg;
+	u32 val;
+	int i;
+
+	reg = ctx->infra_base + ctx->data->vio_mask_offset;
+
+	if (mask)
+		val = GENMASK(31, 0);
+	else
+		val = 0;
+
+	for (i = 0; i < VIO_MOD_TO_REG_IND(ctx->data->vio_idx_num - 1); i++)
+		writel(val, reg + 4 * i);
+
+	val = readl(reg + 4 * i);
+	if (mask)
+		val |= GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num - 1),
+			       0);
+	else
+		val &= ~GENMASK(VIO_MOD_TO_REG_OFF(ctx->data->vio_idx_num - 1),
+				0);
+
+	writel(val, reg + 4 * i);
+}
+
+#define PHY_DEVAPC_TIMEOUT	0x1
+
+/*
+ * devapc_sync_vio_dbg - use the "shift" mechanism to get the full violation
+ *			 information.
+ *			 The shift mechanism depends on the devapc hardware
+ *			 design. MediaTek devapc sets multiple slaves as a
+ *			 group. When a violation is triggered, the violation
+ *			 info is kept inside the devapc hardware.
+ *			 The driver should use the shift mechanism to sync the
+ *			 full violation info to the VIO_DBG registers.
+ */
[PATCH v5 1/2] dt-bindings: devapc: add bindings for mtk-devapc
Add bindings for mtk-devapc.

Signed-off-by: Neal Liu
---
 .../devicetree/bindings/soc/mediatek/devapc.yaml | 58
 1 file changed, 58 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/mediatek/devapc.yaml

diff --git a/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml b/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
new file mode 100644
index 000..6c763f8
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
@@ -0,0 +1,58 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright 2020 MediaTek Inc.
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/soc/mediatek/devapc.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: MediaTek Device Access Permission Control driver
+
+description: |
+  MediaTek bus fabric provides TrustZone security support and data
+  protection to prevent slaves from being accessed by unexpected masters.
+  The security violation is logged and sent to the processor for further
+  analysis and countermeasures.
+
+maintainers:
+  - Neal Liu
+
+properties:
+  compatible:
+    enum:
+      - mediatek,mt6779-devapc
+
+  reg:
+    description: The base address of devapc register bank
+    maxItems: 1
+
+  interrupts:
+    description: A single interrupt specifier
+    maxItems: 1
+
+  clocks:
+    description: Contains module clock source and clock names
+    maxItems: 1
+
+  clock-names:
+    description: Names of the clocks list in clocks property
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+examples:
+  - |
+    #include
+    #include
+
+    devapc: devapc@10207000 {
+        compatible = "mediatek,mt6779-devapc";
+        reg = <0x10207000 0x1000>;
+        interrupts = ;
+        clocks = <_ao CLK_INFRA_DEVICE_APC>;
+        clock-names = "devapc-infra-clock";
+    };
--
1.7.9.5
[PATCH v5] Add MediaTek MT6779 devapc driver
These patch series introduce a MediaTek MT6779 devapc driver. MediaTek bus fabric provides TrustZone security support and data protection to prevent slaves from being accessed by unexpected masters. The security violation is logged and sent to the processor for further analysis or countermeasures. Any occurrence of a security violation raises an interrupt, which is handled by the mtk-devapc driver. The violation information is printed in order to find the murderer.

changes since v4:
- refactor data structure.
- merge two simple functions into one.
- refactor register setting to prevent too much function call overhead.

changes since v3:
- revise violation handling flow to make it easier to understand the hardware behavior.
- add more comments to understand how the hardware works.

changes since v2:
- pass platform info through DT data.
- remove unnecessary function.
- remove slave_type because it always equals 1 in currently supported SoCs.
- use vio_idx_num instead of listing all devices' indexes.
- add more comments to describe hardware behavior.

changes since v1:
- move SoC specific part to DT data.
- remove unnecessary boundary check.
- remove unnecessary data type declaration.
- use read_poll_timeout() instead of for-loop polling.
- revise coding style elegantly.

*** BLURB HERE ***

Neal Liu (2):
  dt-bindings: devapc: add bindings for mtk-devapc
  soc: mediatek: add mt6779 devapc driver

 .../bindings/soc/mediatek/devapc.yaml | 58
 drivers/soc/mediatek/Kconfig | 9 +
 drivers/soc/mediatek/Makefile | 1 +
 drivers/soc/mediatek/mtk-devapc.c | 315 ++
 4 files changed, 383 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
 create mode 100644 drivers/soc/mediatek/mtk-devapc.c
--
2.18.0
Re: [PATCH] ubi: check kthread_should_stop() after the setting of task state
On 2020/8/7 4:15 AM, Richard Weinberger wrote:

On Wed, Aug 5, 2020 at 4:23 AM Zhihao Cheng wrote:

Er, I can't get the point. I can list two possible situations, did I miss other situations?

Yes. You keep ignoring the case I brought up. Let's start from scratch, maybe I miss something. So I'm sorry for being persistent.

Never mind, we're all trying to figure it out. :-) Besides, I'm not good at expressing questions in English. (Practicing...)

The ubi thread can be reduced to a loop like this one:

1.  for (;;) {
2.      if (kthread_should_stop())
3.          break;
4.
5.      if ( /* no work pending */ ) {
6.          set_current_state(TASK_INTERRUPTIBLE);
7.          schedule();
8.          continue;
9.      }
10.
11.     do_work();
12. }

syzcaller found a case where stopping the thread did not work. If another task tries to stop the thread while no work is pending and the program counter in the thread is between lines 5 and 6, the kthread_stop() instruction has no effect. It has no effect because the thread sets the thread state to interruptible sleep and then schedules away. This is a common anti-pattern in the Linux kernel, sadly.

Yes, but UBIFS is the exception; my solution looks like UBIFS:

int ubifs_bg_thread(void *info)
{
	while (1) {
		if (kthread_should_stop())
			break;

		set_current_state(TASK_INTERRUPTIBLE);
		if (!c->need_bgt) {
			/*
			 * Nothing prevents us from going to sleep now and
			 * never being woken up, blocking the task which
			 * could wait in 'kthread_stop()' forever.
			 */
			if (kthread_should_stop())
				break;
			schedule();
			continue;
		}
	}
}

Do you agree with me so far or do you think syzcaller found a different issue?

Yes, I agree.

Your patch changes the loop as follows:

1.  for (;;) {
2.      if (kthread_should_stop())
3.          break;
4.
5.      if ( /* no work pending */ ) {
6.          set_current_state(TASK_INTERRUPTIBLE);
7.
8.          if (kthread_should_stop()) {
9.              set_current_state(TASK_RUNNING);
10.             break;
11.         }
12.
13.         schedule();
14.         continue;
15.     }
16.
17.     do_work();
18.
}

That way there is a higher chance that the thread sees the stop flag and gracefully terminates, I fully agree on that.

There's no disagreement so far.

But it does not completely solve the problem. If kthread_stop() happens while the program counter of the ubi thread is at line 12, the stop flag is still missed and we end up in interruptible sleep just like before.

That's where we hold different views. I have three viewpoints (you can point out which one you disagree with):

1. If kthread_stop() happens at line 12, the ubi thread is *marked* with the stop flag, and it will stop at kthread_should_stop() as long as it can reach the next iteration.
2. If task A is on the runqueue and its state is TASK_RUNNING, task A will be scheduled to execute.
3. If kthread_stop() happens at line 12, after the program counter goes to line 14, the ubi thread is on the runqueue and its state is TASK_RUNNING. I have explained this in situation 1 in the last session. I mean the ubi thread is on the runqueue with TASK_RUNNING state & the stop flag after the process you described.

Line 12: kthread_stop() does set_bit (mark stop flag) && wake_up_process (enqueue && set TASK_RUNNING) => TASK_RUNNING & stop flag & on runqueue
Line 13: schedule() does nothing but pick the next task to execute

So, to solve the problem entirely I suggest changing schedule() to schedule_timeout() and letting the thread wake up periodically.
Re: [PATCH] arm64: kaslr: Use standard early random function
On Thu, Aug 6, 2020 at 5:49 PM Guenter Roeck wrote: > > Use arch_get_random_seed_long_early() instead of arm64 specific functions > to solve the problem. As a side effect of this change, the code no longer > bypasses ARCH_RANDOM, which I consider desirable (after all, ARCH_RANDOM > was disabled for a reason). This patch looks sane to me, but let's see what the arm64 people say in case they have preferences.. Linus
Re: [PATCH 11/21] iommu/mediatek: Add power-domain operation
On Mon, 2020-07-27 at 16:49 +0800, chao hao wrote:
> On Sat, 2020-07-11 at 14:48 +0800, Yong Wu wrote:
> > In the previous SoC, the M4U HW is in the EMI power domain, which is always on. The latest M4U is in the display power domain, which may be turned on/off, thus we have to add a pm_runtime interface for it.
> >
> > We should enable its power before the M4U HW initialization, and disable it after the HW is initialized.
> >
> > When the engine works, it always enables the power and clocks for smi-larb/smi-common, so the M4U's power will always be powered on automatically via the device link with smi-common.
> >
> > Note: we don't enable the M4U power in iommu_map/unmap for tlb flush. If its power is already on, of course that is ok. If the power is off, the main tlb will be reset while the M4U powers on, so the tlb flush while the M4U power is off is unnecessary; just skip it.
> >
> > Signed-off-by: Yong Wu
...
> >
> > if (data->plat_data->m4u_plat == M4U_MT8173) {
> > @@ -728,7 +756,15 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> >
> > platform_set_drvdata(pdev, data);
> >
> > +	if (dev->pm_domain)
> > +		pm_runtime_enable(dev);
>
> hi yong,
>
> If you put "pm_runtime_enable" here, the device_link with smi_common from the previous patch may not be created:
> if (i || !pm_runtime_enabled(dev))
> 	continue;
> Should it be moved up front?

Thanks for the review. My fault here. I will fix it.

> best regards,
> chao
Re: [f2fs-dev] [PATCH] f2fs: remove unnecessary judgment in f2fs_drop_inode
On 2020/8/6 23:03, Liu Song via Linux-f2fs-devel wrote:

From: Liu Song

The inode hash has been removed in make_bad_inode(). If inode_unhashed() is false, it must not be a bad inode.

Signed-off-by: Liu Song
---
 fs/f2fs/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 20e56b0fa46a..ee01d15effe3 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1043,7 +1043,7 @@ static int f2fs_drop_inode(struct inode *inode)
 	 * - inode_wait_for_writeback(inode)
 	 */
 	if ((!inode_unhashed(inode) && inode->i_state & I_SYNC)) {

Since the logic "a bad inode should be removed from the hash" is not controlled by f2fs, let's add the below condition to make sure the VFS rule holds as it is:

		f2fs_bug_on(sbi, is_bad_inode(inode));

-		if (!inode->i_nlink && !is_bad_inode(inode)) {
+		if (!inode->i_nlink) {
 			/* to avoid evict_inode call simultaneously */
 			atomic_inc(&inode->i_count);
 			spin_unlock(&inode->i_lock);
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Thu, Aug 6, 2020 at 6:52 AM Thierry Reding wrote: > > On Wed, Apr 22, 2020 at 08:32:43PM +, John Stultz wrote: > > This patch addresses a regression in 5.7-rc1+ > > > > In commit c8c43cee29f6 ("driver core: Fix > > driver_deferred_probe_check_state() logic"), we both cleaned up > > the logic and also set the default driver_deferred_probe_timeout > > value to 30 seconds to allow for drivers that are missing > > dependencies to have some time so that the dependency may be > > loaded from userland after initcalls_done is set. > > > > However, Yoshihiro Shimoda reported that his device, which > > expects to have unmet dependencies (due to "optional links" in > > its devicetree), was failing to mount the NFS root. > > > > In digging further, it seemed the problem was that while the > > device properly probes after waiting 30 seconds for any missing > > modules to load, the ip_auto_config() had already failed, > > resulting in NFS failing. This was due to ip_auto_config() > > calling wait_for_device_probe() which doesn't wait for the > > driver_deferred_probe_timeout to fire. > > > > Fixing that issue is possible, but could also introduce 30 > > second delays in bootups for users who don't have any > > missing dependencies, which is not ideal. > > > > So I think the best solution to avoid any regressions is to > > revert back to a default timeout value of zero, and allow > > systems that need to utilize the timeout in order for userland > > to load any modules that supply missing dependencies in the dts > > to specify the timeout length via the existing documented boot > > argument. > > > > Thanks to Geert for chasing down that ip_auto_config was why NFS > > was failing in this case! > > > > Cc: "David S. Miller" > > Cc: Alexey Kuznetsov > > Cc: Hideaki YOSHIFUJI > > Cc: Jakub Kicinski > > Cc: Greg Kroah-Hartman > > Cc: Rafael J. 
Wysocki > > Cc: Rob Herring > > Cc: Geert Uytterhoeven > > Cc: Yoshihiro Shimoda > > Cc: Robin Murphy > > Cc: Andy Shevchenko > > Cc: Sudeep Holla > > Cc: Andy Shevchenko > > Cc: Naresh Kamboju > > Cc: Basil Eljuse > > Cc: Ferry Toth > > Cc: Arnd Bergmann > > Cc: Anders Roxell > > Cc: netdev > > Cc: linux...@vger.kernel.org > > Reported-by: Yoshihiro Shimoda > > Tested-by: Yoshihiro Shimoda > > Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() > > logic") > > Signed-off-by: John Stultz > > --- > > drivers/base/dd.c | 13 ++--- > > 1 file changed, 2 insertions(+), 11 deletions(-) > > Sorry for being a bit late to the party, but this breaks suspend/resume > support on various Tegra devices. I've only noticed now because, well, > suspend/resume have been broken for other reasons for a little while and > it's taken us a bit to resolve those issues. > > But now that those other issues have been fixed, I've started seeing an > issue where after resume from suspend some of the I2C controllers are no > longer working. The reason for this is that they share pins with DP AUX > controllers via the pinctrl framework. The DP AUX driver registers as > part of the DRM/KMS driver, which usually happens in userspace. Since > the deferred probe timeout was set to 0 by default this no longer works > because no pinctrl states are assigned to the I2C controller and > therefore upon resume the pins cannot be configured for I2C operation. Oof. My apologies! > I'm also somewhat confused by this patch and a few before because they > claim that they restore previous default behaviour, but that's just not > true. Originally when this timeout was introduced it was -1, which meant > that there was no timeout at all and hence users had to opt-in if they > wanted to use a deferred probe timeout. 
I don't think that's quite true, since the point of my original changes was to avoid troubles I was seeing with drivers not loading because once the timeout fired after init, driver loading would fail with ENODEV instead of returning EPROBE_DEFER. The logic that existed was buggy so the timeout handling didn't really work (changing the boot argument wouldn't help, because after init the logic would return ENODEV before it checked the timeout value). That said, looking at it now, I do realize the driver_deferred_probe_check_state_continue() logic in effect never returned ETIMEDOUT before it was consolidated in the earlier changes, and now that we've backed the default timeout to 0, old users (see bec6c0ecb243) will now get ETIMEDOUT where they wouldn't have before. So would the following fix it up for you? (sorry, it's whitespace corrupted) diff --git a/drivers/pinctrl/devicetree.c b/drivers/pinctrl/devicetree.c index c6fe7d64c913..c7448be64d07 100644 --- a/drivers/pinctrl/devicetree.c +++ b/drivers/pinctrl/devicetree.c @@ -129,9 +129,8 @@ static int dt_to_map_one_config(struct pinctrl *p, if (!np_pctldev || of_node_is_root(np_pctldev)) { of_node_put(np_pctldev); ret =
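For reference, systems that do want the grace period for userspace-loaded modules can opt back in via the documented boot argument on the kernel command line (30 here is just an example value, matching the old default being discussed):

```
deferred_probe_timeout=30
```

With the default reverted to 0, only systems that pass this argument wait for late-loading modules to satisfy deferred dependencies.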
[PATCH net-next v1] hinic: fix strncpy output truncated compile warnings
fix the compile warnings of 'strncpy': output truncated before terminating nul copying N bytes from a string of the same length. Signed-off-by: Luo bin Reported-by: kernel test robot --- V0~V1: - use the strlen()+1 pattern consistently drivers/net/ethernet/huawei/hinic/hinic_devlink.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c index c6adc776f3c8..1ec88ebf81d6 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c @@ -342,9 +342,9 @@ static int chip_fault_show(struct devlink_fmsg *fmsg, level = event->event.chip.err_level; if (level < FAULT_LEVEL_MAX) - strncpy(level_str, fault_level[level], strlen(fault_level[level])); + strncpy(level_str, fault_level[level], strlen(fault_level[level]) + 1); else - strncpy(level_str, "Unknown", strlen("Unknown")); + strncpy(level_str, "Unknown", strlen("Unknown") + 1); if (level == FAULT_LEVEL_SERIOUS_FLR) { err = devlink_fmsg_u32_pair_put(fmsg, "Function level err func_id", @@ -388,9 +388,9 @@ static int fault_report_show(struct devlink_fmsg *fmsg, int err; if (event->type < FAULT_TYPE_MAX) - strncpy(type_str, fault_type[event->type], strlen(fault_type[event->type])); + strncpy(type_str, fault_type[event->type], strlen(fault_type[event->type]) + 1); else - strncpy(type_str, "Unknown", strlen("Unknown")); + strncpy(type_str, "Unknown", strlen("Unknown") + 1); err = devlink_fmsg_string_pair_put(fmsg, "Fault type", type_str); if (err) -- 2.17.1
Re: [PATCH v2] MIPS: Provide Kconfig option for default IEEE 754 conformance mode
Hi, Xuerui, On Thu, Aug 6, 2020 at 6:54 PM WANG Xuerui wrote: > > Hi Jiaxun, > > > On 2020/8/5 21:59, Jiaxun Yang wrote: > > > > > > On 2020/8/1 14:11, Jiaxun Yang wrote: > >> Requested by downstream distros, a Kconfig option for the default > >> IEEE 754 conformance mode allows them to set their mode to > >> relaxed by default. > >> > >> Signed-off-by: Jiaxun Yang > >> Reviewed-by: WANG Xuerui > >> Reviewed-by: Serge Semin > >> Reviewed-by: Huacai Chen > >> > >> -- > >> v2: Reword according to Xuerui's suggestion. > >> --- > > Hi Thomas, > > > > Is it possible to get this patch into the 5.9 merge window? > > I think it has got enough review tags, and the config option was > > requested > > by a Debian developer. The next Debian release will take the 5.9 LTS > > kernel and > > they don't want to ship a non-bootable kernel in a major release. > > I have an idea. Can the downstream packagers make use of the builtin > command line config options, to inject the "ieee754=relaxed" or whatever > option necessary? If it is acceptable this patch should not be necessary > in the short term. Built-in "ieee754=relaxed" is already in upstream for Loongson-3. Huacai > > > > > Thanks. > > > > - Jiaxun
[PATCH 4.19] net/mlx5e: Don't support phys switch id if not in switchdev mode
From: Roi Dayan Support for the phys switch id ndo was added for representors, and if we do not have representors there is no need to support it. Since each port returns a different switch id, supporting this blocks creating a bond over PFs and attaching it to a bridge in legacy mode. This bug doesn't exist upstream as the code got refactored and the netdev api is totally different. Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors") Signed-off-by: Roi Dayan Signed-off-by: Saeed Mahameed --- Hi Greg, Sorry for submitting a non-upstream patch, but this bug is bothering some users on 4.19-stable kernels and it doesn't exist upstream, so I hope you are OK with backporting this one-liner patch. Thanks !! drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 701624a63d2f..1ab40d622ae1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -198,7 +198,7 @@ int mlx5e_attr_get(struct net_device *dev, struct switchdev_attr *attr) struct mlx5_eswitch_rep *rep = rpriv->rep; struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; - if (esw->mode == SRIOV_NONE) + if (esw->mode != SRIOV_OFFLOADS) return -EOPNOTSUPP; switch (attr->id) { -- 2.8.4
Re: [PATCH net-next] hinic: fix strncpy output truncated compile warnings
On 2020/8/7 8:57, luobin (L) wrote: > On 2020/8/7 3:01, David Miller wrote: >> From: Luo bin >> Date: Thu, 6 Aug 2020 15:48:30 +0800 >> >>> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c >>> b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c >>> index c6adc776f3c8..1dc948c07b94 100644 >>> --- a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c >>> +++ b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c >>> @@ -342,9 +342,9 @@ static int chip_fault_show(struct devlink_fmsg *fmsg, >>> >>> level = event->event.chip.err_level; >>> if (level < FAULT_LEVEL_MAX) >>> - strncpy(level_str, fault_level[level], >>> strlen(fault_level[level])); >>> + strncpy(level_str, fault_level[level], >>> strlen(fault_level[level]) + 1); >>> else >>> - strncpy(level_str, "Unknown", strlen("Unknown")); >>> + strncpy(level_str, "Unknown", sizeof(level_str)); >>> >>> if (level == FAULT_LEVEL_SERIOUS_FLR) { >> >> Please fix these cases consistently, either use the strlen()+1 pattern >> or the "sizeof(destination)" one. >> >> Probably sizeof(destination) is best. >> . >> > Will fix. Thanks. The level_str array is initialized to zero, so we can't use the sizeof(destination) pattern; I'll > use strlen()+1 consistently. > I have tried to use 'sizeof(level_str)' instead of 'strlen(fault_level[level]) + 1', but this will lead to the following compile warning: In function ‘strncpy’, inlined from ‘chip_fault_show’ at drivers/net/ethernet/huawei/hinic/hinic_devlink.c:345:3: ./include/linux/string.h:297:30: warning: ‘__builtin_strncpy’ specified bound 17 equals destination size [-Wstringop-truncation] 297 | #define __underlying_strncpy __builtin_strncpy So I will use the strlen()+1 pattern consistently.
Re: [GIT PULL] auxdisplay for v5.9-rc1
The pull request you sent on Thu, 6 Aug 2020 21:04:21 +0200: > https://github.com/ojeda/linux.git tags/auxdisplay-for-linus-v5.9-rc1 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/ed35832648b5c22ce39fe9c476065389c6f330ef Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] Please pull RDMA subsystem changes
The pull request you sent on Thu, 6 Aug 2020 15:27:32 -0300: > git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git tags/for-linus has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/d7806bbd22cabc3e3b0a985cfcffa29cf156bb30 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] first round of SCSI updates for the 5.8+ merge window
The pull request you sent on Thu, 06 Aug 2020 13:55:15 -0700: > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/dfdf16ecfd6abafc22b7f02364d9bb879ca8a5ee Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] RESEND: thermal for v5.9-rc1
The pull request you sent on Thu, 6 Aug 2020 22:18:59 +0200: > ssh://g...@gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git > tags/thermal-v5.9-rc1 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/96e3f3c16b7aedcd71502ccfc5778dddfc2e7b15 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html