Re: [PATCH v2 2/2] drm/amd: validate user GEM object size
Hi Yu, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.20-rc7 next-20181221] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Yu-Zhao/drm-amd-validate-user-pitch-alignment/20181222-153630 config: i386-randconfig-h1-12231406 (attached as .config) compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4 reproduce: # save the attached .config to linux build tree make ARCH=i386 All warnings (new ones prefixed by >>): drivers/gpu//drm/amd/amdgpu/amdgpu_display.c: In function 'amdgpu_display_user_framebuffer_create': >> drivers/gpu//drm/amd/amdgpu/amdgpu_display.c:556:3: warning: format '%ld' >> expects argument of type 'long int', but argument 4 has type 'size_t' >> [-Wformat=] DRM_DEBUG_KMS("Invalid GEM size: expecting >= %d but got %ld\n", ^ vim +556 drivers/gpu//drm/amd/amdgpu/amdgpu_display.c 521 522 struct drm_framebuffer * 523 amdgpu_display_user_framebuffer_create(struct drm_device *dev, 524 struct drm_file *file_priv, 525 const struct drm_mode_fb_cmd2 *mode_cmd) 526 { 527 struct drm_gem_object *obj; 528 struct amdgpu_framebuffer *amdgpu_fb; 529 int ret; 530 int height; 531 struct amdgpu_device *adev = dev->dev_private; 532 int cpp = drm_format_plane_cpp(mode_cmd->pixel_format, 0); 533 int pitch = amdgpu_align_pitch(adev, mode_cmd->pitches[0], cpp, false); 534 535 if (mode_cmd->pitches[0] != pitch) { 536 DRM_DEBUG_KMS("Invalid pitch: expecting %d but got %d\n", 537pitch, mode_cmd->pitches[0]); 538 return ERR_PTR(-EINVAL); 539 } 540 541 obj = drm_gem_object_lookup(file_priv, mode_cmd->handles[0]); 542 if (obj == NULL) { 543 dev_err(>pdev->dev, "No GEM object associated to handle 0x%08X, " 544 "can't create framebuffer\n", mode_cmd->handles[0]); 545 return ERR_PTR(-ENOENT); 546 } 547 548 /* Handle is imported dma-buf, so cannot be migrated to VRAM for scanout */ 549 if (obj->import_attach) { 550 DRM_DEBUG_KMS("Cannot create framebuffer from imported dma_buf\n"); 551 return ERR_PTR(-EINVAL); 552 } 553 554 height = ALIGN(mode_cmd->height, 8); 555 if (obj->size < pitch * height) { > 556 DRM_DEBUG_KMS("Invalid GEM size: expecting >= %d but > got %ld\n", 557pitch * height, obj->size); 558 return ERR_PTR(-EINVAL); 559 } 560 561 amdgpu_fb = kzalloc(sizeof(*amdgpu_fb), GFP_KERNEL); 562 if (amdgpu_fb == NULL) { 563 drm_gem_object_put_unlocked(obj); 564 return ERR_PTR(-ENOMEM); 565 } 566 567 ret = amdgpu_display_framebuffer_init(dev, amdgpu_fb, mode_cmd, obj); 568 if (ret) { 569 kfree(amdgpu_fb); 570 drm_gem_object_put_unlocked(obj); 571 return ERR_PTR(ret); 572 } 573 574 return _fb->base; 575 } 576 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[PATCH] scsi: sd: Fix cache_type_store()
Changing of caching mode via /sys/devices/.../scsi_disk/.../cache_type may fail if device responses to MODE SENSE command with DPOFUA flag set, and then checks this flag to be not set on MODE SELECT command. When trying to change cache_type, write always fails: # echo "none" >cache_type bash: echo: write error: Invalid argument And following appears in dmesg: [13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current] [13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list Signed-off-by: Ivan Mironov --- drivers/scsi/sd.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index bd0a5c694a97..698fe651fb1a 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -206,6 +206,21 @@ cache_type_store(struct device *dev, struct device_attribute *attr, sp = buffer_data[0] & 0x80 ? 1 : 0; buffer_data[0] &= ~0x80; + /* From SBC-4 r15, 6.5.1 "Mode pages overview", description of +* DEVICE-SPECIFIC PARAMETER field in the mode parameter header: +* ... +* The write protect (WP) bit for mode data sent with a MODE SELECT +* command shall be ignored by the device server. +* ... +* The DPOFUA bit is reserved for mode data sent with a MODE SELECT +* command. +* ... +* All other bits are also reserved, and all reserved bits shall be set +* to zero according to the same document. So, we can simply set this +* field to zero for compatibility. +*/ + data.device_specific = 0; + if (scsi_mode_select(sdp, 1, sp, 8, buffer_data, len, SD_TIMEOUT, SD_MAX_RETRIES, , )) { if (scsi_sense_valid()) -- 2.20.1
Re: general protection fault in put_pid
On Sat, Dec 22, 2018 at 8:07 PM Manfred Spraul wrote: > > Hi Dmitry, > > On 12/20/18 4:36 PM, Dmitry Vyukov wrote: > > On Wed, Dec 19, 2018 at 10:04 AM Manfred Spraul > > wrote: > >> Hello Dmitry, > >> > >> On 12/12/18 11:55 AM, Dmitry Vyukov wrote: > >>> On Tue, Dec 11, 2018 at 9:23 PM syzbot > >>> wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit:f5d582777bcb Merge branch 'for-linus' of > git://git.kernel... > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=135bc54740 > kernel config: > https://syzkaller.appspot.com/x/.config?x=c8970c89a0efbb23 > dashboard link: > https://syzkaller.appspot.com/bug?extid=1145ec2e23165570c3ac > compiler: gcc (GCC) 8.0.1 20180413 (experimental) > syz repro: > https://syzkaller.appspot.com/x/repro.syz?x=16803afb40 > >>> +Manfred, this looks similar to the other few crashes related to > >>> semget$private(0x0, 0x4000, 0x3f) that you looked at. > >> I found one unexpected (incorrect?) locking, see the attached patch. > >> > >> But I doubt that this is the root cause of the crashes. > > > > But why? These one-off sporadic crashes reported by syzbot looks > > exactly like a subtle race and your patch touches sem_exit_ns involved > > in all reports. > > So if you don't spot anything else, I would say close these 3 reports > > with this patch (I see you already included Reported-by tags which is > > great!) and then wait for syzbot reaction. Since we got 3 of them, if > > it's still not fixed I would expect that syzbot will be able to > > retrigger this later again. > > As I wrote, unless semop() is used, sma->use_global_lock is always 9 and > nothing can happen. > > Every single-operation semop() reduces use_global_lock by one, i.e a > single semop call as done here cannot trigger the bug: > > https://syzkaller.appspot.com/text?tag=ReproSyz=16803afb40 It contains "repeat":true,"procs":6, which means that it run 6 processes running this test in infinite loop. The last mark about number of tests executed was: 2018/12/11 18:38:02 executed programs: 2955 > But, one more finding: > > https://syzkaller.appspot.com/bug?extid=1145ec2e23165570c3ac > > https://syzkaller.appspot.com/text?tag=CrashLog=109ecf6e40 > > The log file contain 1080 lines like these: > > > semget$private(..., 0x4003, ...) > > > > semget$private(..., 0x4006, ...) > > > > semget$private(..., 0x4007, ...) > > It ends up as kmalloc(128*0x400x), i.e. slightly more than 2 MB, an > allocation in the 4 MB kmalloc buffer: > > > [ 1201.210245] kmalloc-4194304 4698112KB4698112KB > > > i.e.: 1147 4 MB kmalloc blocks --> are we leaking nearly 100% of the > semaphore arrays?? /\/\/\/\/\/\ Ha, this is definitely not healthy. > This one looks similar: > > https://syzkaller.appspot.com/bug?extid=c92d3646e35bc5d1a909 > > except that the array sizes are mixed, and thus there are kmalloc-1M and > kmalloc-2M as well. > > (and I did not count the number of semget calls) > > > The test apps use unshare(CLONE_NEWNS) and unshare(CLONE_NEWIPC), correct? > > I.e. no CLONE_NEWUSER. > > https://github.com/google/syzkaller/blob/master/executor/common_linux.h#L1523 CLONE_NEWUSER is used on some instances as well: https://github.com/google/syzkaller/blob/master/executor/common_linux.h#L1765 This crash happened on 2 different instances and 1 of them uses CLONE_NEWUSER and another does not. If it's important because of CAP_ADMIN in IPC namespace, then all instances should have it (instances that don't use NEWUSER are just root).
Re: [PATCH] sock: Make sock->sk_tstamp thread-safe
On 12/21/2018 12:27 PM, Deepa Dinamani wrote: > Al Viro mentioned that there is probably a race condition > lurking in accesses of sk_tstamp on 32-bit machines. > > sock->sk_tstamp is of type ktime_t which is always an s64. > On a 32 bit architecture, we might run into situations of > unsafe access as the access to the field becomes non atomic. > > Use seqlocks for synchronization. > This allows us to avoid using spinlocks for readers as > readers do not need mutual exclusion. > Hi Deepa Please come up with something that has zero added costs for 64bit kernels. Most of us do not really care about 32bit kernels anymore, so we do not want to slow down 64bits kernels for such things. Look at include/linux/u64_stats_sync.h for initial thoughts. Thanks.
WARNING: ODEBUG bug in tipc_enable_bearer
Hello, syzbot found the following crash on: HEAD commit:ce28bb445388 Merge git://git.kernel.org/pub/scm/linux/kern.. git tree: net-next console output: https://syzkaller.appspot.com/x/log.txt?x=10eddf9740 kernel config: https://syzkaller.appspot.com/x/.config?x=67a2081147a23142 dashboard link: https://syzkaller.appspot.com/bug?extid=b981acf1fb240c0c128b compiler: gcc (GCC) 8.0.1 20180413 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16a348c340 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=134f5b1b40 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+b981acf1fb240c0c1...@syzkaller.appspotmail.com R10: R11: 0246 R12: R13: R14: R15: Disabling bearer [ cut here ] ODEBUG: free active (active state 1) object type: rcu_head hint: (null) WARNING: CPU: 1 PID: 6104 at lib/debugobjects.c:328 debug_print_object+0x16a/0x210 lib/debugobjects.c:325 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 6104 Comm: syz-executor926 Not tainted 4.20.0-rc7+ #358 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1d3/0x2c6 lib/dump_stack.c:113 panic+0x2ad/0x55c kernel/panic.c:188 __warn.cold.8+0x20/0x45 kernel/panic.c:540 report_bug+0x254/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 RIP: 0010:debug_print_object+0x16a/0x210 lib/debugobjects.c:325 Code: 60 88 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 92 00 00 00 48 8b 14 dd e0 f7 60 88 4c 89 fe 48 c7 c7 80 ed 60 88 e8 16 a6 b7 fd <0f> 0b 83 05 61 92 86 06 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f RSP: 0018:8881b3856fe0 EFLAGS: 00010086 RAX: RBX: 0003 RCX: RDX: RSI: 81653fa5 RDI: 0005 RBP: 8881b3857020 R08: 8881bd440300 R09: ed103b5e3ef8 R10: ed103b5e3ef8 R11: 8881daf1f7c7 R12: 0001 R13: 8959bf20 R14: R15: 8860f220 __debug_check_no_obj_freed lib/debugobjects.c:785 [inline] debug_check_no_obj_freed+0x3ae/0x58d lib/debugobjects.c:817 kfree+0xbd/0x230 mm/slab.c:3816 tipc_enable_bearer+0xefa/0xf10 net/tipc/bearer.c:322 __tipc_nl_bearer_enable+0x37c/0x4a0 net/tipc/bearer.c:900 tipc_nl_bearer_enable+0x22/0x30 net/tipc/bearer.c:908 genl_family_rcv_msg+0x8a7/0x11a0 net/netlink/genetlink.c:601 genl_rcv_msg+0xc6/0x168 net/netlink/genetlink.c:626 netlink_rcv_skb+0x16c/0x430 net/netlink/af_netlink.c:2477 genl_rcv+0x28/0x40 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x59f/0x750 net/netlink/af_netlink.c:1336 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116 __sys_sendmsg+0x11d/0x280 net/socket.c:2154 __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg net/socket.c:2161 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4456a9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 4b cd fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7ffc7161d698 EFLAGS: 0246 ORIG_RAX: 002e RAX: ffda RBX: 7ffc7161d6e0 RCX: 004456a9 RDX: RSI: 20c0 RDI: 0003 RBP: 0004 R08: 0001 R09: 0100 R10: R11: 0246 R12: R13: R14: R15: == WARNING: possible circular locking dependency detected 4.20.0-rc7+ #358 Not tainted -- syz-executor926/6104 is trying to acquire lock: 65118b5a ((console_sem).lock){-...}, at: down_trylock+0x13/0x70 kernel/locking/semaphore.c:136 but task is already holding lock: d78d5834 (_hash[i].lock){-.-.}, at: __debug_check_no_obj_freed lib/debugobjects.c:776 [inline] d78d5834 (_hash[i].lock){-.-.}, at: debug_check_no_obj_freed+0x17a/0x58d lib/debugobjects.c:817 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (_hash[i].lock){-.-.}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x99/0xd0
KASAN: slab-out-of-bounds Write in fpstate_init
Hello, syzbot found the following crash on: HEAD commit:6648e120dd1a Add linux-next specific files for 20181217 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=16fddf9740 kernel config: https://syzkaller.appspot.com/x/.config?x=9e2825c86a37329f dashboard link: https://syzkaller.appspot.com/bug?extid=5ce1a620488e94bf2f15 compiler: gcc (GCC) 8.0.1 20180413 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=107348c340 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=135f5b1b40 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+5ce1a620488e94bf2...@syzkaller.appspotmail.com audit: type=1800 audit(1545532930.544:30): pid=6029 uid=0 auid=4294967295 ses=4294967295 subj==unconfined op=collect_data cause=failed(directio) comm="startpar" name="rmnologin" dev="sda1" ino=2423 res=0 sshd (6168) used greatest stack depth: 15728 bytes left L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details. == BUG: KASAN: slab-out-of-bounds in memset include/linux/string.h:337 [inline] BUG: KASAN: slab-out-of-bounds in fpstate_init+0x50/0x160 arch/x86/kernel/fpu/core.c:178 Write of size 832 at addr 8881bba4bbc0 by task syz-executor233/6184 CPU: 1 PID: 6184 Comm: syz-executor233 Not tainted 4.20.0-rc6-next-20181217+ #172 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.4+0x9/0x1ff mm/kasan/report.c:187 kasan_report.cold.5+0x1b/0x39 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/generic.c:191 memset+0x23/0x40 mm/kasan/common.c:113 memset include/linux/string.h:337 [inline] fpstate_init+0x50/0x160 arch/x86/kernel/fpu/core.c:178 fx_init arch/x86/kvm/x86.c:8646 [inline] kvm_arch_vcpu_init+0x3e9/0x870 arch/x86/kvm/x86.c:9006 kvm_vcpu_init+0x2fa/0x420 arch/x86/kvm/../../../virt/kvm/kvm_main.c:315 vmx_create_vcpu+0x1b7/0x2695 arch/x86/kvm/vmx/vmx.c:6375 kvm_arch_vcpu_create+0xe5/0x220 arch/x86/kvm/x86.c:8679 kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2555 [inline] kvm_vm_ioctl+0x526/0x2030 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3084 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440039 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7ffe478aae78 EFLAGS: 0217 ORIG_RAX: 0010 RAX: ffda RBX: 004002c8 RCX: 00440039 RDX: RSI: ae41 RDI: 0004 RBP: 006ca018 R08: 004002c8 R09: 004002c8 R10: R11: 0217 R12: 004018c0 R13: 00401950 R14: R15: Allocated by task 6184: save_stack+0x43/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] kasan_kmalloc+0xcb/0xd0 mm/kasan/common.c:482 kasan_slab_alloc+0x12/0x20 mm/kasan/common.c:397 kmem_cache_alloc+0x130/0x730 mm/slab.c:3541 kmem_cache_zalloc include/linux/slab.h:730 [inline] vmx_create_vcpu+0x110/0x2695 arch/x86/kvm/vmx/vmx.c:6366 kvm_arch_vcpu_create+0xe5/0x220 arch/x86/kvm/x86.c:8679 kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2555 [inline] kvm_vm_ioctl+0x526/0x2030 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3084 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 0: (stack is not available) The buggy address belongs to the object at 8881bba4bb80 which belongs to the cache x86_fpu of size 832 The buggy address is located 64 bytes inside of 832-byte region [8881bba4bb80, 8881bba4bec0) The buggy address belongs to the page: page:ea0006ee92c0 count:1 mapcount:0 mapping:8881d7a9d380 index:0x0 flags: 0x2fffc000200(slab) raw: 02fffc000200 8881d6780448 8881d6780448 8881d7a9d380 raw: 8881bba4b040
[PATCH] pinctrl: freescale: Break dependency on SOC_IMX8MQ for i.MX8MQ
The CONFIG_SOC_IMX8MQ will go away, so the dependency can be based on ARCH_MXC && ARM64. Signed-off-by: Abel Vesa --- drivers/pinctrl/freescale/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pinctrl/freescale/Kconfig b/drivers/pinctrl/freescale/Kconfig index 2d6db43..624902e 100644 --- a/drivers/pinctrl/freescale/Kconfig +++ b/drivers/pinctrl/freescale/Kconfig @@ -123,7 +123,7 @@ config PINCTRL_IMX7ULP config PINCTRL_IMX8MQ bool "IMX8MQ pinctrl driver" - depends on SOC_IMX8MQ + depends on ARCH_MXC && ARM64 select PINCTRL_IMX help Say Y here to enable the imx8mq pinctrl driver -- 2.7.4
Re: [PATCH] net/mlx5e: fix semicolon.cocci warnings
On Sat, Dec 22, 2018 at 08:02:16PM +0800, kbuild test robot wrote: > From: kbuild test robot > > drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:1339:57-58: Unneeded > semicolon > > > Remove unneeded semicolon. > > Generated by: scripts/coccinelle/misc/semicolon.cocci > > Fixes: 4c8fb2986d44 ("net/mlx5e: Increase VF representors' SQ size to 128") > CC: Gavi Teitz > Signed-off-by: kbuild test robot > --- Dave, Can you please apply this small fix? Thanks, Reviewed-by: Leon Romanovsky signature.asc Description: PGP signature
[PATCH] phy: freescale: Break dependency on SOC_IMX8MQ for USB PHY
Since this is going to be used on more SoCs than just i.MX8MQ, make the dependency here more generic. Signed-off-by: Abel Vesa --- drivers/phy/freescale/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/freescale/Kconfig b/drivers/phy/freescale/Kconfig index f050bd4..832670b 100644 --- a/drivers/phy/freescale/Kconfig +++ b/drivers/phy/freescale/Kconfig @@ -2,4 +2,4 @@ config PHY_FSL_IMX8MQ_USB tristate "Freescale i.MX8M USB3 PHY" depends on OF && HAS_IOMEM select GENERIC_PHY - default SOC_IMX8MQ + default ARCH_MXC && ARM64 -- 2.7.4
[PATCH] soc: imx: Break dependency on SOC_IMX8MQ for GPCv2
Since this is going to be used on more SoCs than just i.MX8MQ, make the dependency here more generic. Signed-off-by: Abel Vesa --- drivers/soc/imx/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soc/imx/Kconfig b/drivers/soc/imx/Kconfig index 2112d18..7ffbb6b 100644 --- a/drivers/soc/imx/Kconfig +++ b/drivers/soc/imx/Kconfig @@ -2,7 +2,7 @@ menu "i.MX SoC drivers" config IMX_GPCV2_PM_DOMAINS bool "i.MX GPCv2 PM domains" - depends on SOC_IMX7D || SOC_IMX8MQ || (COMPILE_TEST && OF) + depends on SOC_IMX7D || ARCH_MXC || (COMPILE_TEST && OF) depends on PM select PM_GENERIC_DOMAINS default y if SOC_IMX7D -- 2.7.4
[BUG]Uncalibrated TSC is not accurate enough as a time keeper
The cpu_khz and tsc_khz are now read directly by the cpuid instruction, and they are deemed to be very accurate. But this is not the case in our situation. The OS time lags behind about 8 seconds per hour. The CPU information is as follows: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 85 model name : Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz stepping: 4 microcode : 0x24d cpu MHz : 2300.000 cache size : 25344 KB physical id : 0 siblings: 36 core id : 0 cpu cores : 18 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 22 It is this "cpuid level 22" that makes the kernel 4.14 to read both cpu_khz and tsc_khz directly by instruction "cpuid", and the TSC is thought to be very accurate, but in fact it is not. * TSC frequency determined by CPUID is a "hardware reported" * frequency and is the most accurate one so far we have. This * is considered a known frequency. +* +* The assumption may not be valid! +* */ - setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
[PATCH V8] iio: light: isl29018: add vcc regulator operation support
The light sensor's power supply could be controllable by regulator on some platforms, such as i.MX6Q-SABRESD board, the light sensor isl29023's power supply is controlled by a GPIO fixed regulator, need to make sure the regulator is enabled before any operation of sensor, this patch adds vcc regulator operation support. Signed-off-by: Anson Huang --- ChangeLog Since V7: - move devm_add_action to before devm_regmap_init_i2c call to save code of disabling regulator; - simply the error check logic in isl29018_suspend. --- drivers/iio/light/isl29018.c | 47 +++- 1 file changed, 46 insertions(+), 1 deletion(-) diff --git a/drivers/iio/light/isl29018.c b/drivers/iio/light/isl29018.c index b45400f..ec6ce89 100644 --- a/drivers/iio/light/isl29018.c +++ b/drivers/iio/light/isl29018.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -95,6 +96,7 @@ struct isl29018_chip { struct isl29018_scale scale; int prox_scheme; boolsuspended; + struct regulator*vcc_reg; }; static int isl29018_set_integration_time(struct isl29018_chip *chip, @@ -708,6 +710,16 @@ static const char *isl29018_match_acpi_device(struct device *dev, int *data) return dev_name(dev); } +static void isl29018_disable_regulator_action(void *_data) +{ + struct isl29018_chip *chip = _data; + int err; + + err = regulator_disable(chip->vcc_reg); + if (err) + pr_err("failed to disable isl29018's VCC regulator!\n"); +} + static int isl29018_probe(struct i2c_client *client, const struct i2c_device_id *id) { @@ -742,6 +754,28 @@ static int isl29018_probe(struct i2c_client *client, chip->scale = isl29018_scales[chip->int_time][0]; chip->suspended = false; + chip->vcc_reg = devm_regulator_get(>dev, "vcc"); + if (IS_ERR(chip->vcc_reg)) { + err = PTR_ERR(chip->vcc_reg); + if (err != -EPROBE_DEFER) + dev_err(>dev, "failed to get VCC regulator!\n"); + return err; + } + + err = regulator_enable(chip->vcc_reg); + if (err) { + dev_err(>dev, "failed to enable VCC regulator!\n"); + return err; + } + + err = devm_add_action(>dev, isl29018_disable_regulator_action, +chip); + if (err) { + isl29018_disable_regulator_action(chip); + dev_err(>dev, "failed to setup regulator cleanup action!\n"); + return err; + } + chip->regmap = devm_regmap_init_i2c(client, isl29018_chip_info_tbl[dev_id].regmap_cfg); if (IS_ERR(chip->regmap)) { @@ -768,6 +802,7 @@ static int isl29018_probe(struct i2c_client *client, static int isl29018_suspend(struct device *dev) { struct isl29018_chip *chip = iio_priv(dev_get_drvdata(dev)); + int ret; mutex_lock(>lock); @@ -777,10 +812,13 @@ static int isl29018_suspend(struct device *dev) * So we do not have much to do here. */ chip->suspended = true; + ret = regulator_disable(chip->vcc_reg); + if (ret) + dev_err(dev, "failed to disable VCC regulator\n"); mutex_unlock(>lock); - return 0; + return ret; } static int isl29018_resume(struct device *dev) @@ -790,6 +828,13 @@ static int isl29018_resume(struct device *dev) mutex_lock(>lock); + err = regulator_enable(chip->vcc_reg); + if (err) { + dev_err(dev, "failed to enable VCC regulator\n"); + mutex_unlock(>lock); + return err; + } + err = isl29018_chip_init(chip); if (!err) chip->suspended = false; -- 2.7.4
RE: [PATCH V7] iio: light: isl29018: add vcc regulator operation support
Hi, Jonathan Best Regards! Anson Huang > -Original Message- > From: Jonathan Cameron [mailto:ji...@kernel.org] > Sent: 2018年12月23日 1:15 > To: Anson Huang > Cc: knaac...@gmx.de; l...@metafoo.de; pme...@pmeerw.net; > linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; feste...@gmail.com; > pr...@electromag.com.au; dl-linux-imx > Subject: Re: [PATCH V7] iio: light: isl29018: add vcc regulator operation > support > > On Mon, 17 Dec 2018 03:25:17 + > Anson Huang wrote: > > > The light sensor's power supply could be controllable by regulator on > > some platforms, such as i.MX6Q-SABRESD board, the light sensor > > isl29023's power supply is controlled by a GPIO fixed regulator, need > > to make sure the regulator is enabled before any operation of sensor, > > this patch adds vcc regulator operation support. > > > > Signed-off-by: Anson Huang > Hi Anson > > See below. > > > --- > > ChangeLog since V6 > > - using devm_regulator_get() instead of devm_regulator_get_optional() > since the regulator is > > there anyway, if dtb does NOT specify one, regulator framework will > assign dummy regulator for it; > > - Setup devm action for cleaning up regulator resource for error > handling. > > --- > > drivers/iio/light/isl29018.c | 58 > > > > 1 file changed, 58 insertions(+) > > > > diff --git a/drivers/iio/light/isl29018.c > > b/drivers/iio/light/isl29018.c index b45400f..63f7b9d 100644 > > --- a/drivers/iio/light/isl29018.c > > +++ b/drivers/iio/light/isl29018.c > > @@ -23,6 +23,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -95,6 +96,7 @@ struct isl29018_chip { > > struct isl29018_scale scale; > > int prox_scheme; > > boolsuspended; > > + struct regulator*vcc_reg; > > }; > > > > static int isl29018_set_integration_time(struct isl29018_chip *chip, > > @@ -708,6 +710,17 @@ static const char > *isl29018_match_acpi_device(struct device *dev, int *data) > > return dev_name(dev); > > } > > > > +static void isl29018_disable_regulator_action(void *_data) { > > + struct isl29018_chip *chip = _data; > > + int err; > > + > > + err = regulator_disable(chip->vcc_reg); > > + if (err) > > + dev_err(regmap_get_device(chip->regmap), > > + "failed to disable VCC regulator!\n"); } > > + > > static int isl29018_probe(struct i2c_client *client, > > const struct i2c_device_id *id) > > { > > @@ -742,6 +755,37 @@ static int isl29018_probe(struct i2c_client *client, > > chip->scale = isl29018_scales[chip->int_time][0]; > > chip->suspended = false; > > > > + chip->vcc_reg = devm_regulator_get(>dev, "vcc"); > > + if (IS_ERR(chip->vcc_reg)) { > > + err = PTR_ERR(chip->vcc_reg); > > + if (err != -EPROBE_DEFER) > > + dev_err(>dev, "failed to get VCC regulator!\n"); > > + return err; > > + } > > + > > + err = regulator_enable(chip->vcc_reg); > > + if (err) { > > + dev_err(>dev, "failed to enable VCC regulator!\n"); > > + return err; > > + } > > + > > + chip->regmap = devm_regmap_init_i2c(client, > > + isl29018_chip_info_tbl[dev_id].regmap_cfg); > > + if (IS_ERR(chip->regmap)) { > > + err = PTR_ERR(chip->regmap); > > + dev_err(>dev, "regmap initialization fails: %d\n", err); > > + regulator_disable(chip->vcc_reg); > > + return err; > > + } > > + > > + err = devm_add_action(>dev, isl29018_disable_regulator_action, > > +chip); > > + if (err) { > > I'm a little confused, why not do this before devm_regmap_init_i2c. > That way you won't have to disable the regulator in that one error path. > Also, devm_add_action_or_reset will call isl29018_disable_regulator_action > for you on error. It is because I used dev_err() in isl29018_disable_regulator_action which need regmap to get "dev" by regmap_get_device(chip->regmap), if it is accepted by just using pr_err() instead of dev_err, then I can do the devm_add_action before devm_regmap_init_i2c. I think using pr_err should be OK, I will use it in V8 patch. > > > + isl29018_disable_regulator_action(chip); > > + dev_err(>dev, "failed to setup regulator cleanup > > action!\n"); > > + return err; > > + } > > + > > chip->regmap = devm_regmap_init_i2c(client, > > isl29018_chip_info_tbl[dev_id].regmap_cfg); > > if (IS_ERR(chip->regmap)) { > > @@ -768,6 +812,7 @@ static int isl29018_probe(struct i2c_client > > *client, static int isl29018_suspend(struct device *dev) { > > struct isl29018_chip *chip = iio_priv(dev_get_drvdata(dev)); > > + int ret; > > > > mutex_lock(>lock); > > > > @@ -777,6 +822,12 @@ static int isl29018_suspend(struct device *dev) > > *
[PATCH V6] iio: magnetometer: mag3110: add vdd/vddio regulator operation support
The magnetometer's power supplies could be controllable on some platforms, such as i.MX6Q-SABRESD board, the mag3110's power supplies are controlled by a GPIO fixed regulator, need to make sure the regulators are enabled before any communication with mag3110, this patch adds vdd/vddio regulator operation support. Signed-off-by: Anson Huang --- ChangeLog since V5: Make sure both vdd and vddio regulators are aquired before enabling any one of them. --- drivers/iio/magnetometer/mag3110.c | 88 ++ 1 file changed, 80 insertions(+), 8 deletions(-) diff --git a/drivers/iio/magnetometer/mag3110.c b/drivers/iio/magnetometer/mag3110.c index f063355..f56a99a 100644 --- a/drivers/iio/magnetometer/mag3110.c +++ b/drivers/iio/magnetometer/mag3110.c @@ -20,6 +20,7 @@ #include #include #include +#include #define MAG3110_STATUS 0x00 #define MAG3110_OUT_X 0x01 /* MSB first */ @@ -56,6 +57,8 @@ struct mag3110_data { struct mutex lock; u8 ctrl_reg1; int sleep_val; + struct regulator *vdd_reg; + struct regulator *vddio_reg; }; static int mag3110_request(struct mag3110_data *data) @@ -469,17 +472,44 @@ static int mag3110_probe(struct i2c_client *client, struct iio_dev *indio_dev; int ret; - ret = i2c_smbus_read_byte_data(client, MAG3110_WHO_AM_I); - if (ret < 0) - return ret; - if (ret != MAG3110_DEVICE_ID) - return -ENODEV; - indio_dev = devm_iio_device_alloc(>dev, sizeof(*data)); if (!indio_dev) return -ENOMEM; data = iio_priv(indio_dev); + + data->vdd_reg = devm_regulator_get(>dev, "vdd"); + data->vddio_reg = devm_regulator_get(>dev, "vddio"); + if (IS_ERR(data->vdd_reg) || IS_ERR(data->vddio_reg)) { + if (PTR_ERR(data->vdd_reg) == -EPROBE_DEFER || + PTR_ERR(data->vddio_reg) == -EPROBE_DEFER) + return -EPROBE_DEFER; + + dev_err(>dev, "failed to get VDD/VDDIO regulator!\n"); + return IS_ERR(data->vdd_reg) ? + PTR_ERR(data->vdd_reg) : PTR_ERR(data->vddio_reg); + } + + ret = regulator_enable(data->vdd_reg); + if (ret) { + dev_err(>dev, "failed to enable VDD regulator!\n"); + return ret; + } + + ret = regulator_enable(data->vddio_reg); + if (ret) { + dev_err(>dev, "failed to enable VDDIO regulator!\n"); + goto disable_regulator_vdd; + } + + ret = i2c_smbus_read_byte_data(client, MAG3110_WHO_AM_I); + if (ret < 0) + goto disable_regulators; + if (ret != MAG3110_DEVICE_ID) { + ret = -ENODEV; + goto disable_regulators; + } + data->client = client; mutex_init(>lock); @@ -499,7 +529,7 @@ static int mag3110_probe(struct i2c_client *client, ret = mag3110_change_config(data, MAG3110_CTRL_REG1, data->ctrl_reg1); if (ret < 0) - return ret; + goto disable_regulators; ret = i2c_smbus_write_byte_data(client, MAG3110_CTRL_REG2, MAG3110_CTRL_AUTO_MRST_EN); @@ -520,16 +550,24 @@ static int mag3110_probe(struct i2c_client *client, iio_triggered_buffer_cleanup(indio_dev); standby_on_error: mag3110_standby(iio_priv(indio_dev)); +disable_regulators: + regulator_disable(data->vddio_reg); +disable_regulator_vdd: + regulator_disable(data->vdd_reg); + return ret; } static int mag3110_remove(struct i2c_client *client) { struct iio_dev *indio_dev = i2c_get_clientdata(client); + struct mag3110_data *data = iio_priv(indio_dev); iio_device_unregister(indio_dev); iio_triggered_buffer_cleanup(indio_dev); mag3110_standby(iio_priv(indio_dev)); + regulator_disable(data->vddio_reg); + regulator_disable(data->vdd_reg); return 0; } @@ -537,14 +575,48 @@ static int mag3110_remove(struct i2c_client *client) #ifdef CONFIG_PM_SLEEP static int mag3110_suspend(struct device *dev) { - return mag3110_standby(iio_priv(i2c_get_clientdata( + struct mag3110_data *data = iio_priv(i2c_get_clientdata( + to_i2c_client(dev))); + int ret; + + ret = mag3110_standby(iio_priv(i2c_get_clientdata( to_i2c_client(dev; + if (ret) + return ret; + + ret = regulator_disable(data->vddio_reg); + if (ret) { + dev_err(dev, "failed to disable VDDIO regulator\n"); + return ret; + } + + ret = regulator_disable(data->vdd_reg); + if (ret) { + dev_err(dev, "failed to disable VDD regulator\n"); + return ret; + } + + return 0; } static int mag3110_resume(struct device *dev) { struct mag3110_data *data = iio_priv(i2c_get_clientdata(
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
On Sat, Dec 22, 2018 at 11:03:31PM -0600, Gustavo A. R. Silva wrote: > Alexei, > > On 12/22/18 10:12 PM, Alexei Starovoitov wrote: > > On Sat, Dec 22, 2018 at 09:37:02PM -0600, Gustavo A. R. Silva wrote: > > > > > > Can't we have the case in which the code can be "trained" to read > > > perfectly valid values for prog->len for quite a while, making the > > > microcode come into place and speculate about: > > > > > > 1013 if (flen == 0 || flen > BPF_MAXINSNS) > > > 1014 return false; > > > > > > and then make flen to be greater than BPF_MAXINSNS? > > > > Yes. The user space can train line 1013 to mispredict by passing > > smaller flen N times and then passing large flen. > > Why do you think it's exploitable? > > > > Without the patch in the mispredicted path the cpu will do > > if (0 < flen) condition and since flen is hot in the cache > > it will happily execute the filter[0] load... > > and about 12-20 u-ops later (depending on u-arch of cpu) when > > branch predictor realizes that it's a miss, the cpu will ignore > > the values computed in the shadow cpu registers used by speculative > > execution > > and go back to the 'return false' execution path. > > The side effect of bringing filter[0] value in L1 cache is still there. > > The cpu is incapable to undo that cache load. That's what spectre1 is about. > > Do you see how filter[0] value in cpu L1 cache is exploitable? > > > > In this regard, the policy has been to kill the speculation on that > first load (as I mentioned in my last email. It is also mentioned in > the commit log for every patch). > > > I took another look at the following patches: > > "net: core: Fix Spectre v1 vulnerability" > > "nfc: af_nfc: Fix Spectre v1 vulnerability" > > "can: af_can: Fix Spectre v1 vulnerability" > > and I have to say that none of them are necessary. > > I'm not sure whether there were other patches that pretend to fix spectre1. > > > > It's not about pretending to fix it. It's about trying to prevent the > conditions that can actually trigger the exploitation of a potential > vulnerability. The code is not always the same, it changes, it evolves, > and we are currently trying to catch that first load (that's what smatch > does in all these cases) that could eventually lead to an actual vuln. in other words there is no bug and there is no vulnerability, but there is a 'policy' set by ... ? So hence Nack to the policy and Nack to the patches.
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
Alexei, On 12/22/18 10:12 PM, Alexei Starovoitov wrote: On Sat, Dec 22, 2018 at 09:37:02PM -0600, Gustavo A. R. Silva wrote: Can't we have the case in which the code can be "trained" to read perfectly valid values for prog->len for quite a while, making the microcode come into place and speculate about: 1013 if (flen == 0 || flen > BPF_MAXINSNS) 1014 return false; and then make flen to be greater than BPF_MAXINSNS? Yes. The user space can train line 1013 to mispredict by passing smaller flen N times and then passing large flen. Why do you think it's exploitable? Without the patch in the mispredicted path the cpu will do if (0 < flen) condition and since flen is hot in the cache it will happily execute the filter[0] load... and about 12-20 u-ops later (depending on u-arch of cpu) when branch predictor realizes that it's a miss, the cpu will ignore the values computed in the shadow cpu registers used by speculative execution and go back to the 'return false' execution path. The side effect of bringing filter[0] value in L1 cache is still there. The cpu is incapable to undo that cache load. That's what spectre1 is about. Do you see how filter[0] value in cpu L1 cache is exploitable? In this regard, the policy has been to kill the speculation on that first load (as I mentioned in my last email. It is also mentioned in the commit log for every patch). I took another look at the following patches: "net: core: Fix Spectre v1 vulnerability" "nfc: af_nfc: Fix Spectre v1 vulnerability" "can: af_can: Fix Spectre v1 vulnerability" and I have to say that none of them are necessary. I'm not sure whether there were other patches that pretend to fix spectre1. It's not about pretending to fix it. It's about trying to prevent the conditions that can actually trigger the exploitation of a potential vulnerability. The code is not always the same, it changes, it evolves, and we are currently trying to catch that first load (that's what smatch does in all these cases) that could eventually lead to an actual vuln. Thanks -- Gustavo
[PATCH v2 0/4] staging: iio: adt7316: dac fixes
Changes in v2: - Patches 1-2: reworded commit messages and added Fixes tags - Patches 3-4: added Fixes tags Jeremy Fertic (4): staging: iio: adt7316: fix dac_bits assignment staging: iio: adt7316: fix handling of dac high resolution option staging: iio: adt7316: fix the dac read calculation staging: iio: adt7316: fix the dac write calculation drivers/staging/iio/addac/adt7316.c | 46 ++--- 1 file changed, 28 insertions(+), 18 deletions(-) -- 2.19.1
[PATCH v2 2/4] staging: iio: adt7316: fix handling of dac high resolution option
The adt7316/7 and adt7516/7 have the option to output voltage proportional to temperature on dac a and/or dac b. The default dac resolution in this mode is 8 bits with the dac high resolution option enabling 10 bits. None of these settings affect dacs c and d. Remove the "1 (12 bits)" output from the show function since that is not an option for this mode. Return "1 (10 bits)" if the device is one of the above mentioned chips and the dac high resolution mode is enabled. In the store function, the driver currently allows the user to write to the ADT7316_DA_HIGH_RESOLUTION bit regardless of the device in use. Add a check to return an error in the case of an adt7318 or adt7519. Remove the else statement that clears the ADT7316_DA_HIGH_RESOLUTION bit. Instead, clear it before conditionally enabling it, depending on user input. This matches the typical pattern in the driver when an attribute is a boolean. Fixes: 35f6b6b86ede ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver") Signed-off-by: Jeremy Fertic --- drivers/staging/iio/addac/adt7316.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c index e17c1cb12c94..80cc3420036b 100644 --- a/drivers/staging/iio/addac/adt7316.c +++ b/drivers/staging/iio/addac/adt7316.c @@ -633,9 +633,7 @@ static ssize_t adt7316_show_da_high_resolution(struct device *dev, struct adt7316_chip_info *chip = iio_priv(dev_info); if (chip->config3 & ADT7316_DA_HIGH_RESOLUTION) { - if (chip->id == ID_ADT7316 || chip->id == ID_ADT7516) - return sprintf(buf, "1 (12 bits)\n"); - if (chip->id == ID_ADT7317 || chip->id == ID_ADT7517) + if (chip->id != ID_ADT7318 && chip->id != ID_ADT7519) return sprintf(buf, "1 (10 bits)\n"); } @@ -652,10 +650,12 @@ static ssize_t adt7316_store_da_high_resolution(struct device *dev, u8 config3; int ret; + if (chip->id == ID_ADT7318 || chip->id == ID_ADT7519) + return -EPERM; + + config3 = chip->config3 & (~ADT7316_DA_HIGH_RESOLUTION); if (buf[0] == '1') - config3 = chip->config3 | ADT7316_DA_HIGH_RESOLUTION; - else - config3 = chip->config3 & (~ADT7316_DA_HIGH_RESOLUTION); + config3 |= ADT7316_DA_HIGH_RESOLUTION; ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG3, config3); if (ret) -- 2.19.1
[PATCH v2 3/4] staging: iio: adt7316: fix the dac read calculation
The calculation of the current dac value is using the wrong bits of the dac lsb register. Create two macros to shift the lsb register value into lsb position, depending on whether the dac is 10 or 12 bit. Initialize data to 0 so, with an 8 bit dac, the msb register value can be bitwise ORed with data. Fixes: 35f6b6b86ede ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver") Signed-off-by: Jeremy Fertic --- drivers/staging/iio/addac/adt7316.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c index 80cc3420036b..d7f3d68e525b 100644 --- a/drivers/staging/iio/addac/adt7316.c +++ b/drivers/staging/iio/addac/adt7316.c @@ -47,6 +47,8 @@ #define ADT7516_MSB_AIN3 0xA #define ADT7516_MSB_AIN4 0xB #define ADT7316_DA_DATA_BASE 0x10 +#define ADT7316_DA_10_BIT_LSB_SHIFT6 +#define ADT7316_DA_12_BIT_LSB_SHIFT4 #define ADT7316_DA_MSB_DATA_REGS 4 #define ADT7316_LSB_DAC_A 0x10 #define ADT7316_MSB_DAC_A 0x11 @@ -1392,7 +1394,7 @@ static IIO_DEVICE_ATTR(ex_analog_temp_offset, 0644, static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip, int channel, char *buf) { - u16 data; + u16 data = 0; u8 msb, lsb, offset; int ret; @@ -1417,7 +1419,11 @@ static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip, if (ret) return -EIO; - data = (msb << offset) + (lsb & ((1 << offset) - 1)); + if (chip->dac_bits == 12) + data = lsb >> ADT7316_DA_12_BIT_LSB_SHIFT; + else if (chip->dac_bits == 10) + data = lsb >> ADT7316_DA_10_BIT_LSB_SHIFT; + data |= msb << offset; return sprintf(buf, "%d\n", data); } -- 2.19.1
[PATCH v2 4/4] staging: iio: adt7316: fix the dac write calculation
The lsb calculation is not masking the correct bits from the user input. Subtract 1 from (1 << offset) to correctly set up the mask to be applied to user input. The lsb register stores its value starting at the bit 7 position. adt7316_store_DAC() currently assumes the value is at the other end of the register. Shift the lsb value before storing it in a new variable lsb_reg, and write this variable to the lsb register. Fixes: 35f6b6b86ede ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver") Signed-off-by: Jeremy Fertic --- drivers/staging/iio/addac/adt7316.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c index d7f3d68e525b..6f7891b567b9 100644 --- a/drivers/staging/iio/addac/adt7316.c +++ b/drivers/staging/iio/addac/adt7316.c @@ -1431,7 +1431,7 @@ static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip, static ssize_t adt7316_store_DAC(struct adt7316_chip_info *chip, int channel, const char *buf, size_t len) { - u8 msb, lsb, offset; + u8 msb, lsb, lsb_reg, offset; u16 data; int ret; @@ -1449,9 +1449,13 @@ static ssize_t adt7316_store_DAC(struct adt7316_chip_info *chip, return -EINVAL; if (chip->dac_bits > 8) { - lsb = data & (1 << offset); + lsb = data & ((1 << offset) - 1); + if (chip->dac_bits == 12) + lsb_reg = lsb << ADT7316_DA_12_BIT_LSB_SHIFT; + else + lsb_reg = lsb << ADT7316_DA_10_BIT_LSB_SHIFT; ret = chip->bus.write(chip->bus.client, - ADT7316_DA_DATA_BASE + channel * 2, lsb); + ADT7316_DA_DATA_BASE + channel * 2, lsb_reg); if (ret) return -EIO; } -- 2.19.1
[PATCH v2 1/4] staging: iio: adt7316: fix dac_bits assignment
The value of dac_bits is used in adt7316_show_DAC() and adt7316_store_DAC(), and it should be either 8, 10, or 12 bits depending on the device in use. The driver currently only assigns a value to dac_bits in adt7316_store_da_high_resolution(). The purpose of the dac high resolution option is not to change dac resolution for normal operation. Instead, it is specific to an optional feature where one or two of the four dacs can be set to output voltage proportional to temperature. If the user chooses to set dac a and/or dac b to output voltage proportional to temperature, the da_high_resolution attribute can optionally be enabled to use 10 bit resolution rather than the default 8 bits. This is only available on the 10 and 12 bit dac devices. If the user attempts to read or write dacs a or b under these settings, the driver's current behaviour is to return an error. Dacs c and d continue to operate normally under these conditions. With the above in mind, remove the dac_bits assignments from this function since the value of dac_bits as used in the driver is not dependent on this dac high resolution option. Since the dac_bits assignments discussed above are currently the only ones in this driver, the default value of dac_bits is 0. This results in incorrect calculations when the dacs are read or written in adt7316_show_DAC() and adt7316_store_DAC(). To correct this, assign a value to dac_bits in adt7316_probe() to ensure correct operation as soon as the device is registered and available to userspace. Fixes: 35f6b6b86ede ("staging: iio: new ADT7316/7/8 and ADT7516/7/9 driver") Signed-off-by: Jeremy Fertic --- drivers/staging/iio/addac/adt7316.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c index 9db49aa186bb..e17c1cb12c94 100644 --- a/drivers/staging/iio/addac/adt7316.c +++ b/drivers/staging/iio/addac/adt7316.c @@ -652,17 +652,10 @@ static ssize_t adt7316_store_da_high_resolution(struct device *dev, u8 config3; int ret; - chip->dac_bits = 8; - - if (buf[0] == '1') { + if (buf[0] == '1') config3 = chip->config3 | ADT7316_DA_HIGH_RESOLUTION; - if (chip->id == ID_ADT7316 || chip->id == ID_ADT7516) - chip->dac_bits = 12; - else if (chip->id == ID_ADT7317 || chip->id == ID_ADT7517) - chip->dac_bits = 10; - } else { + else config3 = chip->config3 & (~ADT7316_DA_HIGH_RESOLUTION); - } ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG3, config3); if (ret) @@ -2145,6 +2138,13 @@ int adt7316_probe(struct device *dev, struct adt7316_bus *bus, else return -ENODEV; + if (chip->id == ID_ADT7316 || chip->id == ID_ADT7516) + chip->dac_bits = 12; + else if (chip->id == ID_ADT7317 || chip->id == ID_ADT7517) + chip->dac_bits = 10; + else + chip->dac_bits = 8; + chip->ldac_pin = devm_gpiod_get_optional(dev, "adi,ldac", GPIOD_OUT_LOW); if (IS_ERR(chip->ldac_pin)) { ret = PTR_ERR(chip->ldac_pin); -- 2.19.1
Re: [PATCH v2 01/12] fs-verity: add a documentation file
On Sat, Dec 22, 2018 at 08:10:07PM -0800, Matthew Wilcox wrote: > Pretty much every file format has the ability to put arbitrary blocks > of information into a file somewhere the tools which don't know about > it will skip it. For example, ZIP "includes an extra field facility > within file headers, which can be used to store extra data not defined > by existing ZIP specifications, and which allow compliant archivers that > do not recognize the fields to safely skip them. Header IDs 0–31 are > reserved for use by PKWARE. The remaining IDs can be used by third-party > vendors for proprietary usage. " (Wikipedia) > > ELF, PNG, PDF and many other formats have the ability to put data > _somewhere_. It might not be at the tail of the file, but there's > somewhere to do it. > > (I appreciate this isn't what Linus is asking for, but I'm pointing out > that this is by no means as intractable as you make it sound.) That design would require the fs-verity code to know the type of eacho file, and where to find the in-band Merkle tree for each file type that we wanted to support. And if you wanted to use fs-verity to protect a sudoers text configuration file (for example), we'd have to teach sudo how to ignore the userspace visible Merkle tree. So I agree with you that it's *possible*. But it's ***ugly***. *Way* uglier than putting the Merkle tree at the end of the file data and then making it invisible to userspace. Cheers, - Ted
Re: [PATCH v2 01/12] fs-verity: add a documentation file
On Sat, Dec 22, 2018 at 02:47:22PM -0800, Linus Torvalds wrote: > So I want to understand why this was made a filesystem operation in > the first place. What's fs-specific about this implementation? These are the things which are fs-specific. *) We have to splice into the file system's readpage processing so we can verify the merkle tree hash before we mark the page up-to-date. This is most of the complexity involved in adding fs-verity support, and that's because both ext4 and f2fs have their own fs-specific readpage[s]() implementations, and both ext4 and f2fs also supports fscrypt, which *also* has to splice into readpage[s](). *) The file system needs to define a file system feature bit in the superblock which means, "this file system uses fs-verity" --- so that old kernels will know that they need to refuse to mount the file system (f2fs) or mount the file system read-only (ext4). *) The file system needs to define inode flag which is used to indicate "this is a fs-verity protected file". This flag is not user-visible, so the file system just has to provide a single bit in the inode structure and a function which tests that bit. *) Ext4 chose to have i_size on disk to be size of the data. We did this so that the the fs-verity feature for ext4 could be a read-only compat feature. (e.g., an old kernel can safely mount a file system with fs-verity protected files, but only for a read-only mount.) This adds a bit more complexity for ext4 in that we need to look up in our extent tree to find the last block in the file (which is where the fs-verity superblock is located). For f2fs, it can just use the on-disk i_size to find the fs-verity superblock, and then from that, f2fs can find the original data i_size (which then gets presents to userspace when it calls stat(2).) As far as the last point is concerned, ext4 could have done things the f2fs way, which is simpler, and which would allowed us to make things much more generic. However, being able to support read-only mounts of file systems with fs-verity protected files was important to me. Everything else is generic and we tried to factor out as much common code as possible into fs/verity. But the model has always been that at least *some* changes would be needed in the file system to call out to the fs-verity code, primarily because we didn't want to make changes to readpage()/readpages() VFS<->low-level fs interface. That would have required making changes in dozens of file systems, and while that would have allowed us to factor out some duplicated code in {ext4,f2fs}_readpage[s]() --- right now it's only those two file systems out of 70 or so support fscrypt and fs-verity. It's just not worth it. - Ted
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
On Sat, Dec 22, 2018 at 09:37:02PM -0600, Gustavo A. R. Silva wrote: > > Can't we have the case in which the code can be "trained" to read > perfectly valid values for prog->len for quite a while, making the > microcode come into place and speculate about: > > 1013 if (flen == 0 || flen > BPF_MAXINSNS) > 1014 return false; > > and then make flen to be greater than BPF_MAXINSNS? Yes. The user space can train line 1013 to mispredict by passing smaller flen N times and then passing large flen. Why do you think it's exploitable? Without the patch in the mispredicted path the cpu will do if (0 < flen) condition and since flen is hot in the cache it will happily execute the filter[0] load... and about 12-20 u-ops later (depending on u-arch of cpu) when branch predictor realizes that it's a miss, the cpu will ignore the values computed in the shadow cpu registers used by speculative execution and go back to the 'return false' execution path. The side effect of bringing filter[0] value in L1 cache is still there. The cpu is incapable to undo that cache load. That's what spectre1 is about. Do you see how filter[0] value in cpu L1 cache is exploitable? I took another look at the following patches: "net: core: Fix Spectre v1 vulnerability" "nfc: af_nfc: Fix Spectre v1 vulnerability" "can: af_can: Fix Spectre v1 vulnerability" and I have to say that none of them are necessary. I'm not sure whether there were other patches that pretend to fix spectre1.
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
Hi, I would like to thank you all for reacting to this issue so quickly, and I am really sorry for sending the message several time. I thought there was a problem with the way it was formatted or some such, hence why I sent it several times, because none of the messages seemed to get through. So yeah, real sorry about that bit, and thanking you all Gabriel C a écrit : Am So., 23. Dez. 2018 um 00:02 Uhr schrieb Linus Torvalds : On Sat, Dec 22, 2018 at 2:49 PM Christian Brauner wrote: To be fair, no one apart from me was pointing out that it actually breaks people including systemd folks even though I was bringing it up with them. I even tried to fix all of userspace after this got NACKED Seriously, the "we don't break user space" is the #1 rule in the kernel, and people should _know_ it's the #1 rule. If somebody ignores that rule, it needs to be escalated to me. Immediately. Because I need to know. I do that usually but I didn't saw Christian's revert the time and I never hit that issue. Just saw that now because the unusual [BREAKAGE] prefix. I need to know so that I can override the bogus NAK, and so that we can fix the breakage ASAP. The absolute last thing we need is some other user space then starting to rely on the new behavior, which just compounds the problem and makes it a *much* bigger problem. Yes and you are right .. https://github.com/lxc/lxc/pull/2438 I've added an comment there about 4.20.0. BR, Gabriel
Re: [PATCH v2 01/12] fs-verity: add a documentation file
On Fri, Dec 21, 2018 at 11:17:12PM -0500, Theodore Y. Ts'o wrote: > Userspace applications which are reading the file aren't going to be > expecting Merkle tree. For example, one of the use cases is Android > APK files, which are essentially ZIP files. ZIP files can be parsed > both from the front-end (streaming), or by looking for the complete > directory of all of the files in the ZIP file by starting at the end > of the file and moving backwards. If the Merkle tree was visible to > userspace programs that are opening and reading the file, it would > confuse them mightily. Pretty much every file format has the ability to put arbitrary blocks of information into a file somewhere the tools which don't know about it will skip it. For example, ZIP "includes an extra field facility within file headers, which can be used to store extra data not defined by existing ZIP specifications, and which allow compliant archivers that do not recognize the fields to safely skip them. Header IDs 0–31 are reserved for use by PKWARE. The remaining IDs can be used by third-party vendors for proprietary usage. " (Wikipedia) ELF, PNG, PDF and many other formats have the ability to put data _somewhere_. It might not be at the tail of the file, but there's somewhere to do it. (I appreciate this isn't what Linus is asking for, but I'm pointing out that this is by no means as intractable as you make it sound.)
Re: [PATCH] crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()
On Sat, Dec 15, 2018 at 12:40:17PM -0800, Eric Biggers wrote: > From: Eric Biggers > > Passing atomic=true to skcipher_walk_virt() only makes the later > skcipher_walk_done() calls use atomic memory allocations, not > skcipher_walk_virt() itself. Thus, we have to move it outside of the > preemption-disabled region (kernel_fpu_begin()/kernel_fpu_end()). > > (skcipher_walk_virt() only allocates memory for certain layouts of the > input scatterlist, hence why I didn't notice this earlier...) > > Reported-by: syzbot+9bf843c33f782d73a...@syzkaller.appspotmail.com > Fixes: 4af78261870a ("crypto: x86/chacha20 - add XChaCha20 support") > Signed-off-by: Eric Biggers > --- > arch/x86/crypto/chacha_glue.c | 33 - > 1 file changed, 20 insertions(+), 13 deletions(-) Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: cavium/nitrox - Added AEAD cipher support
On Fri, Dec 14, 2018 at 11:19:51AM +, Nagadheeraj Rottela wrote: > Added support to offload AEAD ciphers to NITROX. Currently supported > AEAD cipher is 'gcm(aes)'. > > Signed-off-by: Nagadheeraj Rottela > Reviewed-by: Srikanth Jampala > --- > drivers/crypto/cavium/nitrox/Makefile | 4 +- > drivers/crypto/cavium/nitrox/nitrox_aead.c | 364 > drivers/crypto/cavium/nitrox/nitrox_algs.c | 559 > + > drivers/crypto/cavium/nitrox/nitrox_common.h | 6 +- > drivers/crypto/cavium/nitrox/nitrox_req.h | 239 +-- > drivers/crypto/cavium/nitrox/nitrox_reqmgr.c | 38 +- > drivers/crypto/cavium/nitrox/nitrox_skcipher.c | 498 ++ > 7 files changed, 1103 insertions(+), 605 deletions(-) > create mode 100644 drivers/crypto/cavium/nitrox/nitrox_aead.c > create mode 100644 drivers/crypto/cavium/nitrox/nitrox_skcipher.c Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 1/2] crypto: crypto_user_stat: remove unused dump functions
On Thu, Dec 13, 2018 at 08:36:37AM +, Corentin Labbe wrote: > This patch removes unused dump functions for crypto_user_stats. > There are remains of the copy/paste of crypto_user_base to > crypto_user_stat and I forgot to remove them. > > Signed-off-by: Corentin Labbe > --- > crypto/crypto_user_base.c| 4 +--- > crypto/crypto_user_stat.c| 33 > include/crypto/internal/cryptouser.h | 12 -- > 3 files changed, 1 insertion(+), 48 deletions(-) All applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: ux500 - Use proper enum in hash_set_dma_transfer
On Mon, Dec 10, 2018 at 04:49:54PM -0700, Nathan Chancellor wrote: > Clang warns when one enumerated type is implicitly converted to another: > > drivers/crypto/ux500/hash/hash_core.c:169:4: warning: implicit > conversion from enumeration type 'enum dma_data_direction' to different > enumeration type 'enum dma_transfer_direction' [-Wenum-conversion] > direction, DMA_CTRL_ACK | DMA_PREP_INTERRUPT); > ^ > 1 warning generated. > > dmaengine_prep_slave_sg expects an enum from dma_transfer_direction. > We know that the only direction supported by this function is > DMA_TO_DEVICE because of the check at the top of this function so we can > just use the equivalent value from dma_transfer_direction. > > DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1 > > Signed-off-by: Nathan Chancellor > --- > drivers/crypto/ux500/hash/hash_core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: ux500 - Use proper enum in cryp_set_dma_transfer
On Mon, Dec 10, 2018 at 04:49:29PM -0700, Nathan Chancellor wrote: > Clang warns when one enumerated type is implicitly converted to another: > > drivers/crypto/ux500/cryp/cryp_core.c:559:5: warning: implicit > conversion from enumeration type 'enum dma_data_direction' to different > enumeration type 'enum dma_transfer_direction' [-Wenum-conversion] > direction, DMA_CTRL_ACK); > ^ > drivers/crypto/ux500/cryp/cryp_core.c:583:5: warning: implicit > conversion from enumeration type 'enum dma_data_direction' to different > enumeration type 'enum dma_transfer_direction' [-Wenum-conversion] > direction, > ^ > 2 warnings generated. > > dmaengine_prep_slave_sg expects an enum from dma_transfer_direction. > Because we know the value of the dma_data_direction enum from the > switch statement, we can just use the proper value from > dma_transfer_direction so there is no more conversion. > > DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1 > DMA_FROM_DEVICE = DMA_DEV_TO_MEM = 2 > > Signed-off-by: Nathan Chancellor > --- > drivers/crypto/ux500/cryp/cryp_core.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 00/12] x86/crypto: gcmaes AVX scatter/gather support
On Mon, Dec 10, 2018 at 07:56:28PM +, Dave Watson wrote: > This patch set refactors the x86 aes/gcm AVX crypto routines to > support true scatter/gather by adding gcm_enc/dec_update methods. > > It is similar to the previous SSE patchset starting at e1fd316f. > Unlike the SSE routines, the AVX routines did not support > keysize 192 & 256, this patchset also adds support for those > keysizes. > > The final patch updates the C glue code, passing everything through > the crypt_by_sg() function instead of the previous memcpy based > routines. > > Dave Watson (12): > x86/crypto: aesni: Merge GCM_ENC_DEC > x86/crypto: aesni: Introduce gcm_context_data > x86/crypto: aesni: Macro-ify func save/restore > x86/crypto: aesni: support 256 byte keys in avx asm > x86/crypto: aesni: Add GCM_COMPLETE macro > x86/crypto: aesni: Split AAD hash calculation to separate macro > x86/crypto: aesni: Merge avx precompute functions > x86/crypto: aesni: Fill in new context data structures > x86/crypto: aesni: Move ghash_mul to GCM_COMPLETE > x86/crypto: aesni: Introduce READ_PARTIAL_BLOCK macro > x86/crypto: aesni: Introduce partial block macro > x86/crypto: aesni: Add scatter/gather avx stubs, and use them in C > > arch/x86/crypto/aesni-intel_avx-x86_64.S | 2125 ++ > arch/x86/crypto/aesni-intel_glue.c | 353 ++-- > 2 files changed, 1117 insertions(+), 1361 deletions(-) All applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
On 12/22/18 9:00 PM, Alexei Starovoitov wrote: On Sat, Dec 22, 2018 at 08:53:40PM -0600, Gustavo A. R. Silva wrote: Hi, On 12/22/18 8:40 PM, David Miller wrote: From: Alexei Starovoitov Date: Sat, 22 Dec 2018 15:59:54 -0800 On Sat, Dec 22, 2018 at 03:07:22PM -0800, David Miller wrote: From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 14:49:01 -0600 flen is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue 'filter' [w] Fix this by sanitizing flen before using it to index filter at line 1101: switch (filter[flen - 1].code) { and through pc at line 1040: const struct sock_filter *ftest = [pc]; Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel=152449131114778=2 Signed-off-by: Gustavo A. R. Silva BPF folks, I'll take this directly. Applied and queued up for -stable, thanks. hmm. what was the rush? I think it is unnecessary change. Though fp is passed initially from user space it's copied into kernel struct first. There is no way user space can force kernel to mispredict branch in for (pc = 0; pc < flen; pc++) loop. The following piece of code is the one that can be mispredicted, not the for loop: 1013 if (flen == 0 || flen > BPF_MAXINSNS) 1014 return false; Instead of calling array_index_nospec() inside bpf_check_basics_ok(), I decided to place the call close to the code that could be compromised. This is when accessing filter[]. Why do you think it can be mispredicted? Beause fprog->len comes from user space: bpf_prog_create_from_user() -> bpf_check_basics_ok() I've looked at your other patch for nfc_sock_create() where you're adding: + proto = array_index_nospec(proto, NFC_SOCKPROTO_MAX); 'proto' is the value passed in _register_ into system call. There is no need to wrap it with array_index_nospec(). It's not a load from memory and user space cannot make it slow. Slow load is a necessary attribute to trigger speculative execution into mispredicted branch. We might be interpreting the information available about Spectre a bit different, but when the Spectre paper talks about memory vs cpu speed it seems to me that it's just an example to illustrate how the microcode can come into the equation and speculate. So I'm genuinely curious about your last statement: "Slow load is a necessary attribute..." Where did you get that info from? Can't we have the case in which the code can be "trained" to read perfectly valid values for prog->len for quite a while, making the microcode come into place and speculate about: 1013 if (flen == 0 || flen > BPF_MAXINSNS) 1014 return false; and then make flen to be greater than BPF_MAXINSNS? What tool did you use to analyze this? Did you analyze all warnings case by case or just trusted the tool and generated these patches? I read every case, but I sometimes might be wrong of course. Thanks -- Gustavo
Re: [for-next][PATCH 2/5] tracing: Use str_has_prefix() helper for histogram code
On Sat, 22 Dec 2018 12:03:41 -0600 Tom Zanussi wrote: > Hi Steve, > > On Sat, 2018-12-22 at 13:01 -0500, Steven Rostedt wrote: > > On Sat, 22 Dec 2018 11:20:09 -0500 > > Steven Rostedt wrote: > > > > > From: "Steven Rostedt (VMware)" > > > > > > The tracing histogram code contains a lot of instances of the > > > construct: > > > > > > strncmp(str, "const", sizeof("const") - 1) > > > > > > This can be prone to bugs due to typos or bad cut and paste. Use > > > the > > > str_has_prefix() helper macro instead that removes the need for > > > having two > > > copies of the constant string. > > > > > > Cc: Tom Zanussi > > > > I have no idea why I copied your intel email. The linux.intel.com > > appears to be no longer active. I'm going to rebase to fix this email > > address. > > linux.intel.com is active, but there's no zanussi there, just > tom.zanussi ;-) > > So tom.zanu...@linux.intel.com should work fine. BTW, would you give me your ack-by? -- Steve
Re: [for-next][PATCH 0/5] tracing: Add string_has_prefix() and usages
On Sun, 23 Dec 2018 12:41:20 +0900 Namhyung Kim wrote: > > Steven Rostedt (VMware) (5): > > string.h: Add str_has_prefix() helper function > > tracing: Use str_has_prefix() helper for histogram code > > tracing: Use str_has_prefix() instead of using fixed sizes > > tracing: Have the historgram use the result of str_has_prefix() for > > len of prefix > > tracing: Use the return of str_has_prefix() to remove open coded > > numbers > > For the series: > > Acked-by: Namhyung Kim Thanks Namhyung! -- Steve
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
Alexei, On 12/22/18 9:37 PM, Gustavo A. R. Silva wrote: On 12/22/18 9:00 PM, Alexei Starovoitov wrote: On Sat, Dec 22, 2018 at 08:53:40PM -0600, Gustavo A. R. Silva wrote: Hi, On 12/22/18 8:40 PM, David Miller wrote: From: Alexei Starovoitov Date: Sat, 22 Dec 2018 15:59:54 -0800 On Sat, Dec 22, 2018 at 03:07:22PM -0800, David Miller wrote: From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 14:49:01 -0600 flen is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue 'filter' [w] Fix this by sanitizing flen before using it to index filter at line 1101: switch (filter[flen - 1].code) { and through pc at line 1040: const struct sock_filter *ftest = [pc]; Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel=152449131114778=2 Signed-off-by: Gustavo A. R. Silva BPF folks, I'll take this directly. Applied and queued up for -stable, thanks. hmm. what was the rush? I think it is unnecessary change. Though fp is passed initially from user space it's copied into kernel struct first. There is no way user space can force kernel to mispredict branch in for (pc = 0; pc < flen; pc++) loop. The following piece of code is the one that can be mispredicted, not the for loop: 1013 if (flen == 0 || flen > BPF_MAXINSNS) 1014 return false; Instead of calling array_index_nospec() inside bpf_check_basics_ok(), I decided to place the call close to the code that could be compromised. This is when accessing filter[]. Why do you think it can be mispredicted? Beause fprog->len comes from user space: bpf_prog_create_from_user() -> bpf_check_basics_ok() I've looked at your other patch for nfc_sock_create() where you're adding: + proto = array_index_nospec(proto, NFC_SOCKPROTO_MAX); 'proto' is the value passed in _register_ into system call. There is no need to wrap it with array_index_nospec(). It's not a load from memory and user space cannot make it slow. Slow load is a necessary attribute to trigger speculative execution into mispredicted branch. I think I know where the confusion is coming from. The load you talk about is the firs load needed in the following code: if (x < array1_size) { v = array2[array1[x]*256] } This is array[x] In this case, that first load needed would be: 1101: switch (filter[flen - 1].code) { or 1040: const struct sock_filter *ftest = [pc]; The policy has been to kill the speculation on that first load and not worry if it can be completed with a dependent load/store. As mentioned in the commit log. Thanks -- Gustavo
Re: [PATCH] nfc: af_nfc: Fix Spectre v1 vulnerability
On 12/22/18 8:42 PM, David Miller wrote: From: "Gustavo A. R. Silva" Date: Sat, 22 Dec 2018 17:37:35 -0600 I wonder if you can take this one too: https://lore.kernel.org/lkml/20181221212229.GA32635@embeddedor/ It's pretty similar to the af_nfc one. Sure, done. Great. Thanks. -- Gustavo
Re: [for-next][PATCH 5/5] tracing: Use the return of str_has_prefix() to remove open coded numbers
On Sun, 23 Dec 2018 12:23:52 +0900 Masami Hiramatsu wrote: > On Sat, 22 Dec 2018 11:20:12 -0500 > Steven Rostedt wrote: > > > From: "Steven Rostedt (VMware)" > > > > There are several locations that compare constants to the beginning of > > string variables to determine what commands should be done, then the > > constant length is used to index into the string. This is error prone as the > > hard coded numbers have to match the size of the constants. Instead, use the > > len returned from str_has_prefix() and remove the open coded string length > > sizes. > > Looks good to me. > > Acked-by: Masami Hiramatsu > > for trace_probe part. Thanks Masami! -- Steve
Re: [for-next][PATCH 0/5] tracing: Add string_has_prefix() and usages
On Sat, Dec 22, 2018 at 11:20:07AM -0500, Steven Rostedt wrote: > I hope everyone is OK with these changes. I pushed to linux-next but I'm > willing to change them if there are still issues. > > They ran through all my tests, althought zero-day-bot had a weird build > regression in sh, that looks totally unrelated: > > Regressions in current branch: > > arch/sh/kernel/dwarf.c:107:26: error: 'dwarf_frame_reg' defined but not used > [-Werror=unused-function] > arch/sh/kernel/dwarf.c:1209:0: error: unterminated argument list invoking > macro "WARN_ON" > arch/sh/kernel/dwarf.c:226:12: error: 'dwarf_read_encoded_value' defined but > not used [-Werror=unused-function] > arch/sh/kernel/dwarf.c:306:26: error: 'dwarf_lookup_cie' defined but not > used [-Werror=unused-function] > arch/sh/kernel/dwarf.c:38:27: error: 'dwarf_frame_cachep' defined but not > used [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:399:12: error: 'dwarf_cfa_execute_insns' defined but > not used [-Werror=unused-function] > arch/sh/kernel/dwarf.c:41:27: error: 'dwarf_reg_cachep' defined but not used > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:580:22: error: unused variable 'frame' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:581:20: error: unused variable 'cie' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:582:20: error: unused variable 'fde' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:583:20: error: unused variable 'reg' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:584:16: error: unused variable 'addr' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:622:3: error: expected ';' at end of input > arch/sh/kernel/dwarf.c:622:3: error: expected declaration or statement at > end of input > arch/sh/kernel/dwarf.c:622:3: error: 'WARN_ON' undeclared (first use in this > function); did you mean 'WMARK_LOW'? > > Pushing to my for-next branch should kick off another run. Let's see > if it reports that again. Unless someone knows why that happened? > > > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git > for-next > > Head SHA1: 92b9de3e574bd874188a4e27a8830bb901a08ef8 > > > Steven Rostedt (VMware) (5): > string.h: Add str_has_prefix() helper function > tracing: Use str_has_prefix() helper for histogram code > tracing: Use str_has_prefix() instead of using fixed sizes > tracing: Have the historgram use the result of str_has_prefix() for len > of prefix > tracing: Use the return of str_has_prefix() to remove open coded numbers For the series: Acked-by: Namhyung Kim Thanks, Namhyung > > > include/linux/string.h | 20 > kernel/trace/trace.c | 8 +--- > kernel/trace/trace_events.c | 2 +- > kernel/trace/trace_events_hist.c | 35 ++- > kernel/trace/trace_probe.c | 17 + > kernel/trace/trace_stack.c | 6 -- > 6 files changed, 57 insertions(+), 31 deletions(-)
Re: [for-next][PATCH 4/5] tracing: Have the historgram use the result of str_has_prefix() for len of prefix
On Sat, Dec 22, 2018 at 11:20:11AM -0500, Steven Rostedt wrote: > From: "Steven Rostedt (VMware)" > > As str_has_prefix() returns the length on match, we can use that for the > updating of the string pointer instead of recalculating the prefix size. > > Cc: Tom Zanussi > Signed-off-by: Steven Rostedt (VMware) > --- > kernel/trace/trace_events_hist.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/kernel/trace/trace_events_hist.c > b/kernel/trace/trace_events_hist.c > index 0d878dcd1e4b..449d90cfa151 100644 > --- a/kernel/trace/trace_events_hist.c > +++ b/kernel/trace/trace_events_hist.c > @@ -4342,12 +4342,13 @@ static int parse_actions(struct hist_trigger_data > *hist_data) > unsigned int i; > int ret = 0; > char *str; > + int len; > > for (i = 0; i < hist_data->attrs->n_actions; i++) { > str = hist_data->attrs->action_str[i]; > > - if (str_has_prefix(str, "onmatch(")) { > - char *action_str = str + sizeof("onmatch(") - 1; > + if ((len = str_has_prefix(str, "onmatch("))) { > + char *action_str = str + len; Ah you did it here. Thanks, Namhyung > > data = onmatch_parse(tr, action_str); > if (IS_ERR(data)) { > @@ -4355,8 +4356,8 @@ static int parse_actions(struct hist_trigger_data > *hist_data) > break; > } > data->fn = action_trace; > - } else if (str_has_prefix(str, "onmax(")) { > - char *action_str = str + sizeof("onmax(") - 1; > + } else if ((len = str_has_prefix(str, "onmax("))) { > + char *action_str = str + len; > > data = onmax_parse(action_str); > if (IS_ERR(data)) { > -- > 2.19.2 > >
Re: [for-next][PATCH 2/5] tracing: Use str_has_prefix() helper for histogram code
On Sat, Dec 22, 2018 at 11:20:09AM -0500, Steven Rostedt wrote: > From: "Steven Rostedt (VMware)" > > The tracing histogram code contains a lot of instances of the construct: > > strncmp(str, "const", sizeof("const") - 1) > > This can be prone to bugs due to typos or bad cut and paste. Use the > str_has_prefix() helper macro instead that removes the need for having two > copies of the constant string. > > Cc: Tom Zanussi > Signed-off-by: Steven Rostedt (VMware) > --- > kernel/trace/trace_events_hist.c | 28 ++-- > 1 file changed, 14 insertions(+), 14 deletions(-) > > diff --git a/kernel/trace/trace_events_hist.c > b/kernel/trace/trace_events_hist.c > index c5448c103770..9d590138f870 100644 > --- a/kernel/trace/trace_events_hist.c > +++ b/kernel/trace/trace_events_hist.c > @@ -1881,8 +1881,8 @@ static int parse_action(char *str, struct > hist_trigger_attrs *attrs) > if (attrs->n_actions >= HIST_ACTIONS_MAX) > return ret; > > - if ((strncmp(str, "onmatch(", sizeof("onmatch(") - 1) == 0) || > - (strncmp(str, "onmax(", sizeof("onmax(") - 1) == 0)) { > + if ((str_has_prefix(str, "onmatch(")) || > + (str_has_prefix(str, "onmax("))) { > attrs->action_str[attrs->n_actions] = kstrdup(str, GFP_KERNEL); > if (!attrs->action_str[attrs->n_actions]) { > ret = -ENOMEM; > @@ -1899,34 +1899,34 @@ static int parse_assignment(char *str, struct > hist_trigger_attrs *attrs) > { > int ret = 0; > > - if ((strncmp(str, "key=", sizeof("key=") - 1) == 0) || > - (strncmp(str, "keys=", sizeof("keys=") - 1) == 0)) { > + if ((str_has_prefix(str, "key=")) || > + (str_has_prefix(str, "keys="))) { > attrs->keys_str = kstrdup(str, GFP_KERNEL); > if (!attrs->keys_str) { > ret = -ENOMEM; > goto out; > } > - } else if ((strncmp(str, "val=", sizeof("val=") - 1) == 0) || > - (strncmp(str, "vals=", sizeof("vals=") - 1) == 0) || > - (strncmp(str, "values=", sizeof("values=") - 1) == 0)) { > + } else if ((str_has_prefix(str, "val=")) || > +(str_has_prefix(str, "vals=")) || > +(str_has_prefix(str, "values="))) { > attrs->vals_str = kstrdup(str, GFP_KERNEL); > if (!attrs->vals_str) { > ret = -ENOMEM; > goto out; > } > - } else if (strncmp(str, "sort=", sizeof("sort=") - 1) == 0) { > + } else if (str_has_prefix(str, "sort=")) { > attrs->sort_key_str = kstrdup(str, GFP_KERNEL); > if (!attrs->sort_key_str) { > ret = -ENOMEM; > goto out; > } > - } else if (strncmp(str, "name=", sizeof("name=") - 1) == 0) { > + } else if (str_has_prefix(str, "name=")) { > attrs->name = kstrdup(str, GFP_KERNEL); > if (!attrs->name) { > ret = -ENOMEM; > goto out; > } > - } else if (strncmp(str, "clock=", sizeof("clock=") - 1) == 0) { > + } else if (str_has_prefix(str, "clock=")) { > strsep(, "="); > if (!str) { > ret = -EINVAL; > @@ -1939,7 +1939,7 @@ static int parse_assignment(char *str, struct > hist_trigger_attrs *attrs) > ret = -ENOMEM; > goto out; > } > - } else if (strncmp(str, "size=", sizeof("size=") - 1) == 0) { > + } else if (str_has_prefix(str, "size=")) { > int map_bits = parse_map_size(str); > > if (map_bits < 0) { > @@ -3558,7 +3558,7 @@ static struct action_data *onmax_parse(char *str) > if (!onmax_fn_name || !str) > goto free; > > - if (strncmp(onmax_fn_name, "save", sizeof("save") - 1) == 0) { > + if (str_has_prefix(onmax_fn_name, "save")) { > char *params = strsep(, ")"); > > if (!params) { > @@ -4346,7 +4346,7 @@ static int parse_actions(struct hist_trigger_data > *hist_data) > for (i = 0; i < hist_data->attrs->n_actions; i++) { > str = hist_data->attrs->action_str[i]; > > - if (strncmp(str, "onmatch(", sizeof("onmatch(") - 1) == 0) { > + if (str_has_prefix(str, "onmatch(")) { > char *action_str = str + sizeof("onmatch(") - 1; You'd better using the return value of str_has_prefix(). > > data = onmatch_parse(tr, action_str); > @@ -4355,7 +4355,7 @@ static int parse_actions(struct hist_trigger_data > *hist_data) > break; > } > data->fn = action_trace; > - } else if (strncmp(str, "onmax(", sizeof("onmax(") - 1) == 0) { > + } else if (str_has_prefix(str, "onmax(")) { >
Re: [for-next][PATCH 5/5] tracing: Use the return of str_has_prefix() to remove open coded numbers
On Sat, 22 Dec 2018 11:20:12 -0500 Steven Rostedt wrote: > From: "Steven Rostedt (VMware)" > > There are several locations that compare constants to the beginning of > string variables to determine what commands should be done, then the > constant length is used to index into the string. This is error prone as the > hard coded numbers have to match the size of the constants. Instead, use the > len returned from str_has_prefix() and remove the open coded string length > sizes. Looks good to me. Acked-by: Masami Hiramatsu for trace_probe part. Thanks! > > Cc: Joe Perches > Cc: Masami Hiramatsu > Signed-off-by: Steven Rostedt (VMware) > --- > kernel/trace/trace.c | 8 +--- > kernel/trace/trace_probe.c | 17 + > kernel/trace/trace_stack.c | 6 -- > 3 files changed, 18 insertions(+), 13 deletions(-) > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index eac2824a18ab..18b86c3974e1 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -4408,13 +4408,15 @@ static int trace_set_options(struct trace_array *tr, > char *option) > int neg = 0; > int ret; > size_t orig_len = strlen(option); > + int len; > > cmp = strstrip(option); > > - if (str_has_prefix(cmp, "no")) { > + len = str_has_prefix(cmp, "no"); > + if (len) > neg = 1; > - cmp += 2; > - } > + > + cmp += len; > > mutex_lock(_types_lock); > > diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c > index 541375737403..9962cb5da8ac 100644 > --- a/kernel/trace/trace_probe.c > +++ b/kernel/trace/trace_probe.c > @@ -186,19 +186,20 @@ int traceprobe_parse_event_name(const char **pevent, > const char **pgroup, > static int parse_probe_vars(char *arg, const struct fetch_type *t, > struct fetch_insn *code, unsigned int flags) > { > - int ret = 0; > unsigned long param; > + int ret = 0; > + int len; > > if (strcmp(arg, "retval") == 0) { > if (flags & TPARG_FL_RETURN) > code->op = FETCH_OP_RETVAL; > else > ret = -EINVAL; > - } else if (str_has_prefix(arg, "stack")) { > - if (arg[5] == '\0') { > + } else if ((len = str_has_prefix(arg, "stack"))) { > + if (arg[len] == '\0') { > code->op = FETCH_OP_STACKP; > - } else if (isdigit(arg[5])) { > - ret = kstrtoul(arg + 5, 10, ); > + } else if (isdigit(arg[len])) { > + ret = kstrtoul(arg + len, 10, ); > if (ret || ((flags & TPARG_FL_KERNEL) && > param > PARAM_MAX_STACK)) > ret = -EINVAL; > @@ -213,10 +214,10 @@ static int parse_probe_vars(char *arg, const struct > fetch_type *t, > #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API > } else if (((flags & TPARG_FL_MASK) == > (TPARG_FL_KERNEL | TPARG_FL_FENTRY)) && > -str_has_prefix(arg, "arg")) { > - if (!isdigit(arg[3])) > +(len = str_has_prefix(arg, "arg"))) { > + if (!isdigit(arg[len])) > return -EINVAL; > - ret = kstrtoul(arg + 3, 10, ); > + ret = kstrtoul(arg + len, 10, ); > if (ret || !param || param > PARAM_MAX_STACK) > return -EINVAL; > code->op = FETCH_OP_ARG; > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c > index 3641f28c343f..eec648a0d673 100644 > --- a/kernel/trace/trace_stack.c > +++ b/kernel/trace/trace_stack.c > @@ -448,8 +448,10 @@ static char stack_trace_filter_buf[COMMAND_LINE_SIZE+1] > __initdata; > > static __init int enable_stacktrace(char *str) > { > - if (str_has_prefix(str, "_filter=")) > - strncpy(stack_trace_filter_buf, str+8, COMMAND_LINE_SIZE); > + int len; > + > + if ((len = str_has_prefix(str, "_filter="))) > + strncpy(stack_trace_filter_buf, str + len, COMMAND_LINE_SIZE); > > stack_tracer_enabled = 1; > last_stack_tracer_enabled = 1; > -- > 2.19.2 > > -- Masami Hiramatsu
Re: [PATCH v4] string.h: Add str_has_prefix() helper function
On Sun, 23 Dec 2018 12:13:43 +0900 Namhyung Kim wrote: > > Also, I believe there are some memcmp implementations that start at the > > end of the memory locations, not the beginning. That is, it compares > > backwards. Which is also legit for memcmp to do. > > I'm not sure, the man page says: > > RETURN VALUE >The memcmp() function returns an integer less than, equal to, >or greater than zero if the first n bytes of s1 is found, >respectively, to be less than, to match, or be greater than >the first n bytes of s2. > >For a nonzero return value, the sign is determined by the sign >of the difference between the first pair of bytes (interpreted >as unsigned char) that differ in s1 and s2. > >If n is zero, the return value is zero. > > > It should return difference in the first pair of bytes that differ so > I guess implementations should compare from the beginning. Ah, makes sense. I think I'm thinking of memcpy() which can start at the end, or maybe even the deprecated bcmp(). It's been a long time since I had to deal with the implementations of these. -- Steve
Re: [PATCH v3] string.h: Add str_has_prefix() helper
On Sat, Dec 22, 2018 at 07:22:07AM -0500, Steven Rostedt wrote: > On Sat, 22 Dec 2018 18:28:18 +0900 > Namhyung Kim wrote: > > > > > > > for (i = 0; i < hist_data->attrs->n_actions; i++) { > > > str = hist_data->attrs->action_str[i]; > > > > > > - if (str_has_prefix(str, "onmatch(")) { > > > - char *action_str = str + sizeof("onmatch(") - 1; > > > + if ((len = str_has_prefix(str, "onmatch("))) { > > > + char *action_str = str + len; > > > > IMHO, returning (match) length might confuse people that it might > > support partial match and returns the length actually matched rather > > than full match. If I knew it always returns the length of prefix (or > > 0) why it matters? Using strlen() in the next line will have same > > effect of compiler optimization for the constant strings, no? > > > > if (str_has_prefix(str, "onmatch(")) { > > char *action_str = str + strlen("onmatch("); > > The reason to return the length was to get rid of the need for > duplicating the strlen("") because it's a burden to keep the two > the same. Not only that, a lot of places just do "str + 7" because it's > easier to type. > > Yes, it has the same effect on the compiler, but that's not what we are > trying to solve. We are trying to get rid of the duplication, because > that requires humans to get it right, and humans are not good at that. Then I'm ok with that. :) Thanks, Namhyung
Re: [alsa-devel] [PATCH v3 1/9] ASoC: sun4i-i2s: Adjust regmap settings
On Sun, Dec 23, 2018 at 6:12 AM Jonas Karlman wrote: > > On 2018-12-21 17:44, Chen-Yu Tsai wrote: > > On Fri, Dec 21, 2018 at 11:21 PM wrote: > >> From: Marcus Cooper > >> > >> Bypass the regmap cache when flushing the i2s FIFOs and modify the tables > >> to reflect this. > >> > >> Signed-off-by: Marcus Cooper > >> --- > >> sound/soc/sunxi/sun4i-i2s.c | 29 + > >> 1 file changed, 9 insertions(+), 20 deletions(-) > >> > >> diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c > >> index d5ec1a20499d..64d073cb2aee 100644 > >> --- a/sound/soc/sunxi/sun4i-i2s.c > >> +++ b/sound/soc/sunxi/sun4i-i2s.c > >> @@ -548,9 +548,11 @@ static int sun4i_i2s_set_fmt(struct snd_soc_dai *dai, > >> unsigned int fmt) > >> static void sun4i_i2s_start_capture(struct sun4i_i2s *i2s) > >> { > >> /* Flush RX FIFO */ > >> + regcache_cache_bypass(i2s->regmap, true); > >> regmap_update_bits(i2s->regmap, SUN4I_I2S_FIFO_CTRL_REG, > >>SUN4I_I2S_FIFO_CTRL_FLUSH_RX, > >>SUN4I_I2S_FIFO_CTRL_FLUSH_RX); > >> + regcache_cache_bypass(i2s->regmap, false); > > IIRC the flush cache bit is self-clearing. So you likely want to mark > > this register as volatile. If it is marked as volatile, then all access > > to that register bypasses the cache, so the regcache_cache_bypass calls > > are unneeded. > > I helped test this together with Marcus and when I tested with this > register marked > as volatile audio started to stutter, still unclear why audio starts to > stutter. Sounds like we're missing something. However the only other time we update this register is to set the FIFO mode. > Without this cache bypass the flush TX/RX bits gets cached and flush > happens unexpectedly > resulting in multi channel audio getting mapped to wrong speaker. This sounds like the flush is happening after DMA transfers and/or I2S operations have started, disrupting the order of the audio samples. I think that might be the case since the regcache is synced sequentially, and the FIFO control register is after the enable bits. That would imply that the device is taken out of runtime suspend after the .start_capture or .start_playback callbacks. Not sure if that's the case, but that would mean the bus clocks are still off at this point, and bypassing the cache and updating the bits is basically moot. I think there's something else happening here, but we need to figure it out instead of papering over it with something that "just works" but we don't know why it works. > Other ASoC codecs and fsl_spdif.c seems to use similar cache bypass for > reset/flush. fsl_spdif.c does it when forcing a soft reset, which would reset all the registers. It then does a cache sync, restoring any values set up. That's not what we're doing here. > > > > However, looking at the code, the write would seem to be ignored if the > > regmap is in the cache_only state. We only set this when the bus clock > > is disabled. Under such a condition, bypassing the cache and forcing a > > write would be unwise, as the system either drops the write, or stalls > > altogether. > > > >> /* Clear RX counter */ > >> regmap_write(i2s->regmap, SUN4I_I2S_RX_CNT_REG, 0); > >> @@ -569,9 +571,11 @@ static void sun4i_i2s_start_capture(struct sun4i_i2s > >> *i2s) > >> static void sun4i_i2s_start_playback(struct sun4i_i2s *i2s) > >> { > >> /* Flush TX FIFO */ > >> + regcache_cache_bypass(i2s->regmap, true); > >> regmap_update_bits(i2s->regmap, SUN4I_I2S_FIFO_CTRL_REG, > >>SUN4I_I2S_FIFO_CTRL_FLUSH_TX, > >>SUN4I_I2S_FIFO_CTRL_FLUSH_TX); > >> + regcache_cache_bypass(i2s->regmap, false); > >> > >> /* Clear TX counter */ > >> regmap_write(i2s->regmap, SUN4I_I2S_TX_CNT_REG, 0); > >> @@ -703,13 +707,7 @@ static const struct snd_soc_component_driver > >> sun4i_i2s_component = { > >> > >> static bool sun4i_i2s_rd_reg(struct device *dev, unsigned int reg) > >> { > >> - switch (reg) { > >> - case SUN4I_I2S_FIFO_TX_REG: > >> - return false; > >> - > >> - default: > >> - return true; > >> - } > >> + return true; > > I don't understand why this is relevant. Do you need to read back from the > > TX > > FIFO? > > This change was inspired by c66234cfedfc "ASoC: rockchip: i2s: fix > playback after runtime resume" [1] > On resume a cached sample would be written to FIFO_TX_REG unless it is > marked volatile, > the rockchip commit indicated that read is needed for volatile regs. > > [1] > https://github.com/torvalds/linux/commit/c66234cfedfc3e6e3b62563a5f2c1562be09a35d What about simply marking the FIFO registers as both unreadable and unwritable, thus excluding them from the regmap? That should get rid of any unwanted syncs. We are doing DMA transfers to/from them. Do we need regmap access? Either way this context
Re: [PATCH v4] string.h: Add str_has_prefix() helper function
On Sat, Dec 22, 2018 at 12:24:54PM -0500, Steven Rostedt wrote: > On Sat, 22 Dec 2018 12:23:35 -0500 > Steven Rostedt wrote: > > > On Sat, 22 Dec 2018 12:19:11 -0500 > > Steven Rostedt wrote: > > > > > Because memcmp() isn't required to test byte by byte. In fact, most > > > implementations don't which is why memcmp is faster than strcncmp. > > > > In fact, if memcmp() was safe to use if we only knew the size of one of > > the parameters, then there would be no reason for strncmp to exist. > > > > Also, I believe there are some memcmp implementations that start at the > end of the memory locations, not the beginning. That is, it compares > backwards. Which is also legit for memcmp to do. I'm not sure, the man page says: RETURN VALUE The memcmp() function returns an integer less than, equal to, or greater than zero if the first n bytes of s1 is found, respectively, to be less than, to match, or be greater than the first n bytes of s2. For a nonzero return value, the sign is determined by the sign of the difference between the first pair of bytes (interpreted as unsigned char) that differ in s1 and s2. If n is zero, the return value is zero. It should return difference in the first pair of bytes that differ so I guess implementations should compare from the beginning. Thanks, Namhyung
Re: [PATCH v4] string.h: Add str_has_prefix() helper function
On Sat, Dec 22, 2018 at 12:19:11PM -0500, Steven Rostedt wrote: > On Sun, 23 Dec 2018 01:46:05 +0900 > Namhyung Kim wrote: > > > > > What I meant by that is if a string is allocated at a end of a page, > > > and the next page is marked as not present. A read into that page will > > > cause a page fault, and since memcmp() does not stop at the '\0' it > > > will read into that not-present memory and trigger a fault, and that > > > read wont be in the exception table, and it will then BUG. > > > > Why it doesn't stop at the '\0' if one has it and the other doesn't? > > It's not because it's '\0', it's because they are different. The '\0' > > should be in the prev page (otherwise it's already a BUG) so it should > > be detected and stopped before going to next page IMHO. > > > > Because memcmp() isn't required to test byte by byte. In fact, most > implementations don't which is why memcmp is faster than strcncmp. > > It can be checking in 8 byte chunks or more (although perhaps not > likely). Perhaps there's an arch command that lets you compare 32 bytes > at a time, if the size passed to memcmp is 32 or more, the > implementation is allowed to read both src and dst of 32 bytes at a > time. If there was a '\0' followed by not present memory, you will > still get that fault. I thought such implementation would check the alignment and not cross the page boundary in a single read. But it's implementation's choice and I found that glibc's default implementation for misaligned pointer reads next chunk as well to form an aligned chunk using shifts. So for the safety it'd be better to use strcmp().. Thanks for your time and the explanation, Namhyung
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
On Sat, Dec 22, 2018 at 08:53:40PM -0600, Gustavo A. R. Silva wrote: > Hi, > > On 12/22/18 8:40 PM, David Miller wrote: > > From: Alexei Starovoitov > > Date: Sat, 22 Dec 2018 15:59:54 -0800 > > > > > On Sat, Dec 22, 2018 at 03:07:22PM -0800, David Miller wrote: > > > > From: "Gustavo A. R. Silva" > > > > Date: Fri, 21 Dec 2018 14:49:01 -0600 > > > > > > > > > flen is indirectly controlled by user-space, hence leading to > > > > > a potential exploitation of the Spectre variant 1 vulnerability. > > > > > > > > > > This issue was detected with the help of Smatch: > > > > > > > > > > net/core/filter.c:1101 bpf_check_classic() warn: potential spectre > > > > > issue 'filter' [w] > > > > > > > > > > Fix this by sanitizing flen before using it to index filter at line > > > > > 1101: > > > > > > > > > > switch (filter[flen - 1].code) { > > > > > > > > > > and through pc at line 1040: > > > > > > > > > > const struct sock_filter *ftest = [pc]; > > > > > > > > > > Notice that given that speculation windows are large, the policy is > > > > > to kill the speculation on the first load and not worry if it can be > > > > > completed with a dependent load/store [1]. > > > > > > > > > > [1] https://marc.info/?l=linux-kernel=152449131114778=2 > > > > > > > > > > Signed-off-by: Gustavo A. R. Silva > > > > > > > > BPF folks, I'll take this directly. > > > > > > > > Applied and queued up for -stable, thanks. > > > > > > hmm. what was the rush? > > > I think it is unnecessary change. > > > Though fp is passed initially from user space > > > it's copied into kernel struct first. > > > There is no way user space can force kernel to mispredict > > > branch in for (pc = 0; pc < flen; pc++) loop. > The following piece of code is the one that can be mispredicted, not the for > loop: > > 1013 if (flen == 0 || flen > BPF_MAXINSNS) > 1014 return false; > > Instead of calling array_index_nospec() inside bpf_check_basics_ok(), I > decided to place the call close to the code that could be compromised. This > is when accessing filter[]. Why do you think it can be mispredicted? I've looked at your other patch for nfc_sock_create() where you're adding: + proto = array_index_nospec(proto, NFC_SOCKPROTO_MAX); 'proto' is the value passed in _register_ into system call. There is no need to wrap it with array_index_nospec(). It's not a load from memory and user space cannot make it slow. Slow load is a necessary attribute to trigger speculative execution into mispredicted branch. What tool did you use to analyze this? Did you analyze all warnings case by case or just trusted the tool and generated these patches?
[PATCH v2 1/1] sock: Make sock->sk_stamp thread-safe
Al Viro mentioned (Message-ID <20170626041334.gz10...@zeniv.linux.org.uk>) that there is probably a race condition lurking in accesses of sk_stamp on 32-bit machines. sock->sk_stamp is of type ktime_t which is always an s64. On a 32 bit architecture, we might run into situations of unsafe access as the access to the field becomes non atomic. Use seqlocks for synchronization. This allows us to avoid using spinlocks for readers as readers do not need mutual exclusion. Another approach to solve this is to require sk_lock for all modifications of the timestamps. The current approach allows for timestamps to have their own lock: sk_stamp_lock. This allows for the patch to not compete with already existing critical sections, and side effects are limited to the paths in the patch. The addition of the new field maintains the data locality optimizations from commit 9115e8cd2a0c ("net: reorganize struct sock for better data locality") Note that all the instances of the sk_stamp accesses are either through the ioctl or the syscall recvmsg. Signed-off-by: Deepa Dinamani --- Changes since v1: * fixed sunrpc sk_stamp update include/net/sock.h | 16 +--- net/compat.c | 32 ++-- net/core/sock.c | 34 +- net/sunrpc/svcsock.c | 2 ++ 4 files changed, 70 insertions(+), 14 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index fe58aec00d09..2cb641606533 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -298,6 +298,7 @@ struct sock_common { *@sk_filter: socket filtering instructions *@sk_timer: sock cleanup timer *@sk_stamp: time stamp of last packet received + *@sk_stamp_seq: lock for accessing sk_stamp *@sk_tsflags: SO_TIMESTAMPING socket options *@sk_tskey: counter to disambiguate concurrent tstamp requests *@sk_zckey: counter to order MSG_ZEROCOPY notifications @@ -474,6 +475,7 @@ struct sock { const struct cred *sk_peer_cred; longsk_rcvtimeo; ktime_t sk_stamp; + seqlock_t sk_stamp_seq; u16 sk_tsflags; u8 sk_shutdown; u32 sk_tskey; @@ -2311,8 +2313,11 @@ sock_recv_timestamp(struct msghdr *msg, struct sock *sk, struct sk_buff *skb) (hwtstamps->hwtstamp && (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE))) __sock_recv_timestamp(msg, sk, skb); - else + else { + write_seqlock(>sk_stamp_seq); sk->sk_stamp = kt; + write_sequnlock(>sk_stamp_seq); + } if (sock_flag(sk, SOCK_WIFI_STATUS) && skb->wifi_acked_valid) __sock_recv_wifi_status(msg, sk, skb); @@ -2332,10 +2337,15 @@ static inline void sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk, if (sk->sk_flags & FLAGS_TS_OR_DROPS || sk->sk_tsflags & TSFLAGS_ANY) __sock_recv_ts_and_drops(msg, sk, skb); - else if (unlikely(sock_flag(sk, SOCK_TIMESTAMP))) + else if (unlikely(sock_flag(sk, SOCK_TIMESTAMP))) { + write_seqlock(>sk_stamp_seq); sk->sk_stamp = skb->tstamp; - else if (unlikely(sk->sk_stamp == SK_DEFAULT_STAMP)) + write_sequnlock(>sk_stamp_seq); + } else if (unlikely(sk->sk_stamp == SK_DEFAULT_STAMP)) { + write_seqlock(>sk_stamp_seq); sk->sk_stamp = 0; + write_sequnlock(>sk_stamp_seq); + } } void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags); diff --git a/net/compat.c b/net/compat.c index 6c9fceeefac0..17da30bd34a9 100644 --- a/net/compat.c +++ b/net/compat.c @@ -459,6 +459,7 @@ int compat_sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp) { struct compat_timeval __user *ctv; int err; + unsigned int seq; struct timeval tv; if (COMPAT_USE_64BIT_TIME) @@ -467,12 +468,21 @@ int compat_sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp) ctv = (struct compat_timeval __user *) userstamp; err = -ENOENT; sock_enable_timestamp(sk, SOCK_TIMESTAMP); - tv = ktime_to_timeval(sk->sk_stamp); + + do { + seq = read_seqbegin(>sk_stamp_seq); + tv = ktime_to_timeval(sk->sk_stamp); + } while (read_seqretry(>sk_stamp_seq, seq)); + if (tv.tv_sec == -1) return err; if (tv.tv_sec == 0) { - sk->sk_stamp = ktime_get_real(); - tv = ktime_to_timeval(sk->sk_stamp); + ktime_t kt = ktime_get_real(); + + write_seqlock(>sk_stamp_seq); + sk->sk_stamp = kt; + write_sequnlock(>sk_stamp_seq); + tv = ktime_to_timeval(kt); } err = 0; if (put_user(tv.tv_sec, >tv_sec) || @@ -486,6 +496,7 @@ int
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
Hi, On 12/22/18 8:40 PM, David Miller wrote: From: Alexei Starovoitov Date: Sat, 22 Dec 2018 15:59:54 -0800 On Sat, Dec 22, 2018 at 03:07:22PM -0800, David Miller wrote: From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 14:49:01 -0600 flen is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue 'filter' [w] Fix this by sanitizing flen before using it to index filter at line 1101: switch (filter[flen - 1].code) { and through pc at line 1040: const struct sock_filter *ftest = [pc]; Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel=152449131114778=2 Signed-off-by: Gustavo A. R. Silva BPF folks, I'll take this directly. Applied and queued up for -stable, thanks. hmm. what was the rush? I think it is unnecessary change. Though fp is passed initially from user space it's copied into kernel struct first. There is no way user space can force kernel to mispredict branch in for (pc = 0; pc < flen; pc++) loop. The following piece of code is the one that can be mispredicted, not the for loop: 1013 if (flen == 0 || flen > BPF_MAXINSNS) 1014 return false; Instead of calling array_index_nospec() inside bpf_check_basics_ok(), I decided to place the call close to the code that could be compromised. This is when accessing filter[]. -- Gustavo The change doesn't harm, but I don't think it's a good idea to sprinkle kernel with array_index_nospec() just because some tool produced a warning. Ok, that makes sense, I can revert. Just let me know.
Re: [PATCH] nfc: af_nfc: Fix Spectre v1 vulnerability
From: "Gustavo A. R. Silva" Date: Sat, 22 Dec 2018 17:37:35 -0600 > I wonder if you can take this one too: > > https://lore.kernel.org/lkml/20181221212229.GA32635@embeddedor/ > > It's pretty similar to the af_nfc one. Sure, done.
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
From: Alexei Starovoitov Date: Sat, 22 Dec 2018 15:59:54 -0800 > On Sat, Dec 22, 2018 at 03:07:22PM -0800, David Miller wrote: >> From: "Gustavo A. R. Silva" >> Date: Fri, 21 Dec 2018 14:49:01 -0600 >> >> > flen is indirectly controlled by user-space, hence leading to >> > a potential exploitation of the Spectre variant 1 vulnerability. >> > >> > This issue was detected with the help of Smatch: >> > >> > net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue >> > 'filter' [w] >> > >> > Fix this by sanitizing flen before using it to index filter at line 1101: >> > >> >switch (filter[flen - 1].code) { >> > >> > and through pc at line 1040: >> > >> >const struct sock_filter *ftest = [pc]; >> > >> > Notice that given that speculation windows are large, the policy is >> > to kill the speculation on the first load and not worry if it can be >> > completed with a dependent load/store [1]. >> > >> > [1] https://marc.info/?l=linux-kernel=152449131114778=2 >> > >> > Signed-off-by: Gustavo A. R. Silva >> >> BPF folks, I'll take this directly. >> >> Applied and queued up for -stable, thanks. > > hmm. what was the rush? > I think it is unnecessary change. > Though fp is passed initially from user space > it's copied into kernel struct first. > There is no way user space can force kernel to mispredict > branch in for (pc = 0; pc < flen; pc++) loop. > The change doesn't harm, but I don't think it's a good idea > to sprinkle kernel with array_index_nospec() just because some tool > produced a warning. Ok, that makes sense, I can revert. Just let me know.
Re: [PATCH 03/12] __wr_after_init: generic header
On 21/12/2018 21:45, Matthew Wilcox wrote: On Fri, Dec 21, 2018 at 11:38:16AM -0800, Nadav Amit wrote: On Dec 19, 2018, at 1:33 PM, Igor Stoppa wrote: +static inline void *wr_memset(void *p, int c, __kernel_size_t len) +{ + return __wr_op((unsigned long)p, (unsigned long)c, len, WR_MEMSET); +} What do you think about doing something like: #define __wr __attribute__((address_space(5))) And then make all the pointers to write-rarely memory to use this attribute? It might require more changes to the code, but can prevent bugs. I like this idea. It was something I was considering suggesting. I have been thinking about this sort of problem, although from a bit different angle: 1) enforcing alignment for pointers This can be implemented in similar way, by creating a multi-attribute that would define section, address space, like said here, and alignment. However I'm not sure if it's possible to do anything to enforce the alignment of a pointer field within a structure. I haven't had time to look into this yet. 2) validation of the correctness of the actual value Inside the kernel code, a function is not supposed to sanitize its arguments, as long as they come from some other trusted part of the kernel, rather than say from userspace or from some HW interface. However,ROP/JOP should be considered. I am aware of various efforts to make it harder to exploit these techniques, like signed pointers, CFI plugins, LTO. But they are not necessarily available on every platform and mostly, afaik, they focus on specific type of attacks. LTO can help with global optimizations, for example inlining functions across different objects. CFI can detect jumps in the middle of a function, rather than proper function invocation, from its natural entry point. Signed pointers can prevent data-based attacks to the execution flow, and they might have a role in preventing the attack I have in mind, but they are not available on all platforms. What I'd like to do, is to verify, at runtime, that the pointer belongs to the type that the receiving function is meant for. Ex: a legitimate __wr_after_init data must exist between __start_wr_after_init and __end_wr_after_init That is easier and cleaner to test, imho. But dynamically allocated memory doesn't have any such constraint. If it was possible to introduce, for example, a flag to pass to vmalloc, to get the vmap_area from within a specific address range, it would reduce the attack surface. In the implementation I have right now, I'm using extra flags for the pmalloc pages, which means the metadata is the new target for an attack. But with adding the constraint that a dynamically allocated protected memory page must be within a range, then the attacker must change the underlying PTE. And if a region of PTEs are all part of protected memory, it is possible to make the PMD write rare. -- igor
Re: [PATCH] debugfs: remove inc_nlink in debugfs_create_automount
On Sat, Dec 22, 2018 at 04:45:36PM +0800, yangerkun wrote: > Remove inc_nlink in debugfs_create_automount, or this inode will never > be free. Explain, please. What exactly would care about i_nlink in debugfs? It does *NOT* free any kind of backing store on inode eviction, for a good and simple reason - there is no backing store at all. And as for the icache retention, debugfs inodes are * never looked up in icache * never hashed * ... and thus never retained in icache past the final iput() i_nlink serves as a refcount - for on-disk inodes on filesystems that allow hardlinks and need to decide if the on-disk inode needs to follow an in-core one into oblivion. The lifetime of in-core inode is *NOT* controlled by i_nlink. They can very well outlive i_nlink dropping to 0, for starters. Consider e.g. the following: cat >/tmp/a.sh <<'EOF' echo still not freed >/tmp/a (sleep 5 && date && stat - && cat) st_nlink and that came straight from ->i_nlink of the (very much alive) in-core inode. And of course, in-core inodes do get freed just fine without i_nlink reaching zero. It's used for 4 things: 1) deciding whether it makes sense to evict in-core inode as soon as we have no more (in-core) references pinning them (i.e. when ->i_count reaches zero). If there's a chance that somebody will do an icache lookup finding that one, we might want to keep it around until memory pressure kicks it out. And since for something like normal Unix filesystem such icache lookup can happen as long as there are links to the (on-disk) inode left in some directories, default policy is "try to keep it around if i_nlink is positive *AND* it is reachable from icache in the first place". Filesystems might override that, but it's all moot if the in-core inode is *not* reachable from icache in the first place. Which is the case for debugfs and similar beasts. 2) deciding whether we want to free the on-disk inode when an in-core one gets evicted. Note that such freeing can not happen as long as in-core inode is around - unlinking an open file does *not* free it; it's still open and IO on such descriptor works just fine. There the normal rules are "if we are evicting an in-core inode and we know that it has no links left, it's time to free the on-disk counterpart". Up to individual filesystems, not applicable to debugfs for obvious reasons. 3) for the same filesystems, if the link count is maintained in on-disk inode we'll need to update it on unlink et.al. ->i_nlink of in-core inode is handy for keeping track of that. Again, not applicable in debugfs 4) reporting st_nlink to userland on stat/fstat/etc. That *is* applicable in debugfs and, in fact, it is the only use of ->i_nlink there.
Re: [LKP] [bochs] df2052cc92: WARNING:at_drivers/gpu/drm/drm_mode_config.c:#drm_mode_config_cleanup
On Fri, Dec 21, 2018 at 11:25:41AM -0800, Linus Torvalds wrote: > On Fri, Dec 21, 2018 at 12:32 AM kernel test robot > wrote: > > > > FYI, we noticed commit df2052cc9221 ("bochs: convert to > > drm_fb_helper_fbdev_setup/teardown") caused > > > > [ 487.591733] WARNING: CPU: 0 PID: 1 at > > drivers/gpu/drm/drm_mode_config.c:478 drm_mode_config_cleanup+0x270/0x290 > > Ok, this is apparently just a leak for what appears to be a not > particularly interesting error case, but the warning is new to 4.20 > (*) so it would be nice to have somebody look at it. > > That commit is supposed to fix a leak, but there's apparently > something still there. > (*) the *problem* is probably not new, it's just now exposed by the > switch to drm_mode_config_cleanup(). I concur, the issue was only revealed because a (not so interesting error path was triggered). Reproduced this on current master (v4.20-rc7-274-g23203e3f34c9), the trace leading up to the warning is the same: [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer [ 50.009436] bochs-drm :00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38) [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for :00:02.0 on minor 2 [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0 [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: GT 4.20.0-rc7 #1 [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0 ... [ 50.023155] Call Trace: [ 50.023155] ? bochs_kms_fini+0x1e/0x30 [ 50.023155] ? bochs_unload+0x18/0x40 [ 50.023155] ? bochs_pci_remove+0x18/0x30 [ 50.023155] ? pci_device_remove+0x1c/0x50 [ 50.031880] ? really_probe+0xf3/0x2d0 [ 50.031880] ? driver_probe_device+0x23/0xa0 The warning suggests that drm_framebuffer_init was called at some point without a matching call to drm_framebuffer_cleanup. Adding dump_stack() reveals: [ 97.673399] drm_framebuffer_init+0x17d/0x190 [ 97.674134] drm_gem_fb_alloc+0xbe/0x120 [ 97.674678] drm_gem_fbdev_fb_create+0x184/0x1c0 [ 97.675322] ? drm_gem_fb_simple_display_pipe_prepare_fb+0x20/0x20 [ 97.676771] ? drm_fb_helper_alloc_fbi+0xe1/0x120 [ 97.677408] bochsfb_create+0x245/0x5f0 [ 97.677935] ? bochsfb_mmap+0x60/0x60 [ 97.678421] __drm_fb_helper_initial_config_and_unlock+0x3d3/0x7b0 [ 97.678827] ? drm_setup_crtcs+0x1430/0x1430 [ 97.678827] drm_fb_helper_fbdev_setup+0x12b/0x230 [ 97.678827] bochs_fbdev_init+0x33/0x40 [ 97.678827] bochs_pci_probe+0x197/0x1a0 [ 97.678827] pci_device_probe+0xe9/0x180 [ 97.693848] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer [ 97.694880] bochs-drm :00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38) More precisely, this is the call chain (obtained via GDB): drm_fb_helper_fbdev_setup -> drm_fb_helper_initial_config -> __drm_fb_helper_initial_config_and_unlock -> drm_fb_helper_single_fb_probe -> bochsfb_create -> drm_gem_fbdev_fb_create -> drm_gem_fb_alloc -> drm_framebuffer_init Let's have a look at the source of the error message, keep in mind that drm_fb_helper_fini is called on the error path: int drm_fb_helper_fbdev_setup(struct drm_device *dev, ...) { /* ... */ ret = drm_fb_helper_initial_config(fb_helper, preferred_bpp); if (ret < 0) { DRM_DEV_ERROR(dev->dev, "fbdev: Failed to set configuration (ret=%d)\n", ret); goto err_drm_fb_helper_fini; } return 0; err_drm_fb_helper_fini: drm_fb_helper_fini(fb_helper); return ret; } The CONFIG_FB_LITTLE_ENDIAN error above suggests that this code path is reached *after* calling bochsfb_create: drm_fb_helper_fbdev_setup -> drm_fb_helper_initial_config -> __drm_fb_helper_initial_config_and_unlock -> register_framebuffer -> do_register_framebuffer -> fb_check_foreignness (prints error and propagates error code back to drm_fb_helper_fbdev_setup). What does "drm_fb_helper_fini" do? Among other things it basically kfree's memory associated with "fb_helper->fbdev" which was created using "drm_fb_helper_alloc_fbi" in the "fb_probe" callback. This is sufficient for "drm_fb_helper_generic_probe" (introduced by Noralf), but not for "bochsfb_create" which additionally calls "drm_gem_fbdev_fb_create": info = drm_fb_helper_alloc_fbi(helper); if (IS_ERR(info)) { DRM_ERROR("Failed to allocate fbi: %ld\n", PTR_ERR(info)); return PTR_ERR(info); } info->par = >fb.helper; fb = drm_gem_fbdev_fb_create(bochs->dev, sizes, 0, gobj, NULL); if (IS_ERR(fb)) { DRM_ERROR("Failed to create framebuffer: %ld\n", PTR_ERR(fb)); return PTR_ERR(fb); } /*
[PATCH] doc:it_IT: translation for process/submitting-patches
It translats the document process/submitting-patches.rst. Signed-off-by: Federico Vaga --- .../it_IT/process/submitting-patches.rst | 862 +- 1 file changed, 858 insertions(+), 4 deletions(-) diff --git a/Documentation/translations/it_IT/process/submitting-patches.rst b/Documentation/translations/it_IT/process/submitting-patches.rst index d633775ed556..b6368a45a8e4 100644 --- a/Documentation/translations/it_IT/process/submitting-patches.rst +++ b/Documentation/translations/it_IT/process/submitting-patches.rst @@ -1,13 +1,867 @@ .. include:: ../disclaimer-ita.rst :Original: :ref:`Documentation/process/submitting-patches.rst ` - +:Translator: Federico Vaga .. _it_submittingpatches: -Sottomettere modifiche: la guida essenziale per vedere il vostro codice nel kernel -== +Inviare patch: la guida essenziale per vedere il vostro codice nel kernel += + +Una persona o un'azienda che volesse inviare una patch al kernel potrebbe +sentirsi scoraggiata dal processo di sottomissione, specialmente quando manca +una certa familiarità col "sistema". Questo testo è una raccolta di +suggerimenti che aumenteranno significativamente le probabilità di vedere le +vostre patch accettate. + +Questo documento contiene un vasto numero di suggerimenti concisi. Per +maggiori dettagli su come funziona il processo di sviluppo del kernel leggete +:ref:`Documentation/translations/it_IT/process `. +Leggete anche :ref:`Documentation/translations/it_IT/process/submit-checklist.rst ` +per una lista di punti da verificare prima di inviare del codice. Se state +inviando un driver, allora leggete anche :ref:`Documentation/translations/it_IT/process/submitting-drivers.rst `; +per delle patch relative alle associazioni per Device Tree leggete +Documentation/devicetree/bindings/submitting-patches.txt. + +Molti di questi passi descrivono il comportamento di base del sistema di +controllo di versione ``git``; se utilizzate ``git`` per preparare le vostre +patch molto del lavoro più ripetitivo lo troverete già fatto per voi, tuttavia +dovete preparare e documentare un certo numero di patch. Generalmente, l'uso +di ``git`` renderà la vostra vita di sviluppatore del kernel più facile. + +0) Ottenere i sorgenti attuali +-- + +Se non avete un repositorio coi sorgenti del kernel più recenti, allora usate +``git`` per ottenerli. Vorrete iniziare col repositorio principale che può +essere recuperato col comando:: + + git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git + +Notate, comunque, che potreste non voler sviluppare direttamente coi sorgenti +principali del kernel. La maggior parte dei manutentori hanno i propri +sorgenti e desiderano che le patch siano preparate basandosi su di essi. +Guardate l'elemento **T:** per un determinato sottosistema nel file MAINTANERS +che troverete nei sorgenti, o semplicemente chiedete al manutentore nel caso +in cui i sorgenti da usare non siano elencati il quel file. + +Esiste ancora la possibilità di scaricare un rilascio del kernel come archivio +tar (come descritto in una delle prossime sezioni), ma questa è la via più +complicata per sviluppare per il kernel. + +1) ``diff -up`` +--- + +Se dovete produrre le vostre patch a mano, usate ``diff -up`` o ``diff -uprN`` +per crearle. Git produce di base le patch in questo formato; se state +usando ``git``, potete saltare interamente questa sezione. + +Tutte le modifiche al kernel Linux avvengono mediate patch, come descritte +in :manpage:`diff(1)`. Quando create la vostra patch, assicuratevi di +crearla nel formato "unified diff", come l'argomento ``-u`` di +:manpage:`diff(1)`. +Inoltre, per favore usate l'argomento ``-p`` per mostrare la funzione C +alla quale si riferiscono le diverse modifiche - questo rende il risultato +di ``diff`` molto più facile da leggere. Le patch dovrebbero essere basate +sulla radice dei sorgenti del kernel, e non sulle sue sottocartelle. + +Per creare una patch per un singolo file, spesso è sufficiente fare:: + + SRCTREE= linux + MYFILE= drivers/net/mydriver.c + + cd $SRCTREE + cp $MYFILE $MYFILE.orig + vi $MYFILE # make your change + cd .. + diff -up $SRCTREE/$MYFILE{.orig,} > /tmp/patch + +Per creare una patch per molteplici file, dovreste spacchettare i sorgenti +"vergini", o comunque non modificati, e fare un ``diff`` coi vostri. +Per esempio:: + + MYSRC= /devel/linux + + tar xvfz linux-3.19.tar.gz + mv linux-3.19 linux-3.19-vanilla + diff -uprN -X linux-3.19-vanilla/Documentation/dontdiff \ + linux-3.19-vanilla $MYSRC > /tmp/patch + +``dontdiff`` è una lista di file che sono generati durante il processo di +compilazione del kernel; questi dovrebbero essere ignorati in qualsiasi +patch
[PATCH] drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup
After drm_fb_helper_fbdev_setup calls drm_fb_helper_init, "dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will have some effect). After that, drm_fb_helper_initial_config is called which may call the "fb_probe" driver callback. This driver callback may call drm_fb_helper_defio_init (as is done by drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs) as documented. These are normally cleaned up on exit by drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini. If an error occurs after "fb_probe", but before setup is complete, then calling just drm_fb_helper_fini will leak resources. This was triggered by df2052cc922 ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"): [ 50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer [ 50.009436] bochs-drm :00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38) [ 50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for :00:02.0 on minor 2 [ 50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0 [ 50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: GT 4.20.0-rc7 #1 [ 50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0 ... [ 50.023155] Call Trace: [ 50.023155] ? bochs_kms_fini+0x1e/0x30 [ 50.023155] ? bochs_unload+0x18/0x40 This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n. Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian Link: https://lkml.kernel.org/r/20181223004315.GA11455@al Fixes: 8741216396b2 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()") Reported-by: kernel test robot Cc: Noralf Trønnes Signed-off-by: Peter Wu --- drivers/gpu/drm/drm_fb_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 9d64f874f965..432e0f3b9267 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -2860,7 +2860,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev, return 0; err_drm_fb_helper_fini: - drm_fb_helper_fini(fb_helper); + drm_fb_helper_fbdev_teardown(dev); return ret; } -- 2.20.0
Re: [for-next][PATCH 0/5] tracing: Add string_has_prefix() and usages
On Sat, 22 Dec 2018 11:20:07 -0500 Steven Rostedt wrote: > They ran through all my tests, althought zero-day-bot had a weird build > regression in sh, that looks totally unrelated: > > Regressions in current branch: > > arch/sh/kernel/dwarf.c:107:26: error: 'dwarf_frame_reg' defined but not used > [-Werror=unused-function] > arch/sh/kernel/dwarf.c:1209:0: error: unterminated argument list invoking > macro "WARN_ON" > arch/sh/kernel/dwarf.c:226:12: error: 'dwarf_read_encoded_value' defined but > not used [-Werror=unused-function] > arch/sh/kernel/dwarf.c:306:26: error: 'dwarf_lookup_cie' defined but not > used [-Werror=unused-function] > arch/sh/kernel/dwarf.c:38:27: error: 'dwarf_frame_cachep' defined but not > used [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:399:12: error: 'dwarf_cfa_execute_insns' defined but > not used [-Werror=unused-function] > arch/sh/kernel/dwarf.c:41:27: error: 'dwarf_reg_cachep' defined but not used > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:580:22: error: unused variable 'frame' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:581:20: error: unused variable 'cie' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:582:20: error: unused variable 'fde' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:583:20: error: unused variable 'reg' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:584:16: error: unused variable 'addr' > [-Werror=unused-variable] > arch/sh/kernel/dwarf.c:622:3: error: expected ';' at end of input > arch/sh/kernel/dwarf.c:622:3: error: expected declaration or statement at > end of input > arch/sh/kernel/dwarf.c:622:3: error: 'WARN_ON' undeclared (first use in this > function); did you mean 'WMARK_LOW'? > > Pushing to my for-next branch should kick off another run. Let's see > if it reports that again. Unless someone knows why that happened? FYI, Zeroday reported back a successful run of my for-next branch, and did it again, after I pushed a rebase that added an Acked-by. Thus, I'm guessing that the above is a simple fluke. -- Steve
[PATCH] debugfs: remove inc_nlink in debugfs_create_automount
Remove inc_nlink in debugfs_create_automount, or this inode will never be free. Signed-off-by: yangerkun --- fs/debugfs/inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c index 13b01351dd1c..9294238e364f 100644 --- a/fs/debugfs/inode.c +++ b/fs/debugfs/inode.c @@ -551,12 +551,11 @@ struct dentry *debugfs_create_automount(const char *name, if (unlikely(!inode)) return failed_creating(dentry); + /* directory inodes start off with i_nlink == 2 (for "." entry) */ make_empty_dir_inode(inode); inode->i_flags |= S_AUTOMOUNT; inode->i_private = data; dentry->d_fsdata = (void *)f; - /* directory inodes start off with i_nlink == 2 (for "." entry) */ - inc_nlink(inode); d_instantiate(dentry, inode); inc_nlink(d_inode(dentry->d_parent)); fsnotify_mkdir(d_inode(dentry->d_parent), dentry); -- 2.14.4
Re: [PATCH] debugfs: remove no need inc_nlink
Greg KH wrote on 2018/12/22 15:32: On Sat, Dec 22, 2018 at 11:41:11AM +0800, yangerkun wrote: Remove inc_nlink in debugfs_create_automount, or this inode will never be free. Signed-off-by: yangerkun --- fs/debugfs/inode.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c index 13b0135..9e6e225 100644 --- a/fs/debugfs/inode.c +++ b/fs/debugfs/inode.c @@ -516,8 +516,6 @@ struct dentry *debugfs_create_dir(const char *name, struct dentry *parent) inode->i_op = _dir_inode_operations; inode->i_fop = _dir_operations; - /* directory inodes start off with i_nlink == 2 (for "." entry) */ - inc_nlink(inode); Really? How did you test this and why does removing this line directly go against what the comment says? So sorry for this, the fuction should be modify is debugfs_create_automount. Patch will coming soon. Thanks, Kun. this feels really wrong... greg k-h .
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
On Sat, Dec 22, 2018 at 03:07:22PM -0800, David Miller wrote: > From: "Gustavo A. R. Silva" > Date: Fri, 21 Dec 2018 14:49:01 -0600 > > > flen is indirectly controlled by user-space, hence leading to > > a potential exploitation of the Spectre variant 1 vulnerability. > > > > This issue was detected with the help of Smatch: > > > > net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue > > 'filter' [w] > > > > Fix this by sanitizing flen before using it to index filter at line 1101: > > > > switch (filter[flen - 1].code) { > > > > and through pc at line 1040: > > > > const struct sock_filter *ftest = [pc]; > > > > Notice that given that speculation windows are large, the policy is > > to kill the speculation on the first load and not worry if it can be > > completed with a dependent load/store [1]. > > > > [1] https://marc.info/?l=linux-kernel=152449131114778=2 > > > > Signed-off-by: Gustavo A. R. Silva > > BPF folks, I'll take this directly. > > Applied and queued up for -stable, thanks. hmm. what was the rush? I think it is unnecessary change. Though fp is passed initially from user space it's copied into kernel struct first. There is no way user space can force kernel to mispredict branch in for (pc = 0; pc < flen; pc++) loop. The change doesn't harm, but I don't think it's a good idea to sprinkle kernel with array_index_nospec() just because some tool produced a warning.
[GIT] Sparc
Here is an early pull request for the next merge window: 1) Automatic system call table generation, from Firoz Khan. 2) Clean up accesses to the OF device names by using full_name instead of path_component_name. Please pull, thanks a lot! The following changes since commit 25e19c1fe421280a47f37c3571aa379e6e67966c: Merge tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm (2018-11-18 12:21:09 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next.git for you to fetch changes up to c23b8e7acea3dc034edeb902f0c843856e215938: ALSA: sparc: Use of_node_name_eq for node name comparisons (2018-12-05 21:00:02 -0800) David S. Miller (2): Merge branch 'sparc-OF-name-and-device_type-rework' Merge branch 'sparc-syscall-gen' Firoz Khan (4): sparc: move __IGNORE* entries to non uapi header sparc: add __NR_syscalls along with NR_syscalls sparc: add system call table generation support sparc: generate uapi header and system call table files Rob Herring (12): fs/openpromfs: use full_name instead of path_component_name fs/openpromfs: Use of_node_name_eq for node name comparisons sparc: Convert to using %pOF instead of full_name of: Drop full path from full_name for PDT systems sparc: prom: use property "name" directly to construct node names sparc: Convert to using %pOFn instead of device_node.name sparc: Use of_node_name_eq for node name comparisons sparc: Use device_type helpers to access the node type sparc: Remove unused leon_trans_init sparc: Use DT node full_name instead of name for resources sbus: Use of_node_name_eq for node name comparisons ALSA: sparc: Use of_node_name_eq for node name comparisons arch/sparc/Makefile | 3 + arch/sparc/include/asm/Kbuild| 4 +- arch/sparc/include/asm/floppy_64.h | 8 +- arch/sparc/include/asm/leon.h| 1 - arch/sparc/include/asm/parport.h | 2 +- arch/sparc/include/asm/unistd.h | 18 + arch/sparc/include/uapi/asm/Kbuild | 2 + arch/sparc/include/uapi/asm/unistd.h | 426 +-- arch/sparc/kernel/auxio_64.c | 11 ++- arch/sparc/kernel/central.c | 2 +- arch/sparc/kernel/chmc.c | 8 +- arch/sparc/kernel/ioport.c | 2 +- arch/sparc/kernel/irq_64.c | 2 +- arch/sparc/kernel/leon_kernel.c | 14 arch/sparc/kernel/of_device_32.c | 21 +++--- arch/sparc/kernel/of_device_64.c | 58 +++ arch/sparc/kernel/of_device_common.c | 4 +- arch/sparc/kernel/pci.c | 44 +-- arch/sparc/kernel/pci_sabre.c| 2 +- arch/sparc/kernel/power.c| 4 +- arch/sparc/kernel/process_32.c | 2 +- arch/sparc/kernel/prom_32.c | 44 +-- arch/sparc/kernel/prom_64.c | 75 ++- arch/sparc/kernel/prom_irqtrans.c| 20 ++--- arch/sparc/kernel/reboot.c | 3 +- arch/sparc/kernel/sbus.c | 4 +- arch/sparc/kernel/sun4d_irq.c| 14 ++-- arch/sparc/kernel/syscalls/Makefile | 55 ++ arch/sparc/kernel/syscalls/syscall.tbl | 409 ++ arch/sparc/kernel/syscalls/syscallhdr.sh | 36 + arch/sparc/kernel/syscalls/syscalltbl.sh | 36 + arch/sparc/kernel/systbls_32.S | 81 + arch/sparc/kernel/systbls_64.S | 157 +--- arch/sparc/kernel/time_64.c | 16 ++-- arch/sparc/kernel/vio.c | 9 +-- drivers/of/pdt.c | 50 - drivers/sbus/char/bbc_envctrl.c | 4 +- drivers/sbus/char/envctrl.c | 6 +- drivers/sbus/char/flash.c| 6 +- fs/openpromfs/inode.c| 11 +-- include/linux/of.h | 1 - sound/sparc/cs4231.c | 6 +- 42 files changed, 777 insertions(+), 904 deletions(-) create mode 100644 arch/sparc/kernel/syscalls/Makefile create mode 100644 arch/sparc/kernel/syscalls/syscall.tbl create mode 100644 arch/sparc/kernel/syscalls/syscallhdr.sh create mode 100644 arch/sparc/kernel/syscalls/syscalltbl.sh
Re: [PATCH] nfc: af_nfc: Fix Spectre v1 vulnerability
On 12/22/18 5:09 PM, David Miller wrote: From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 15:47:53 -0600 proto is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/nfc/af_nfc.c:42 nfc_sock_create() warn: potential spectre issue 'proto_tab' [w] (local cap) Fix this by sanitizing proto before using it to index proto_tab. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel=152449131114778=2 Signed-off-by: Gustavo A. R. Silva I'll take this directly, and queued up for -stable. Dave, I wonder if you can take this one too: https://lore.kernel.org/lkml/20181221212229.GA32635@embeddedor/ It's pretty similar to the af_nfc one. Thanks -- Gustavo
[PATCH 4/4] MAINTAINERS: add maintainer for HiSilicon QM and ZIP controller driver
Add Zhou Wang as a maintainer for HiSilicon QM and ZIP controller driver. Signed-off-by: Zhou Wang Reviewed-by: John Garry --- MAINTAINERS | 8 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 0767f1d..5be84e2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6785,6 +6785,14 @@ S: Supported F: drivers/scsi/hisi_sas/ F: Documentation/devicetree/bindings/scsi/hisilicon-sas.txt +HISILICON QM AND ZIP Controller DRIVER +M: Zhou Wang +L: linux-cry...@vger.kernel.org +S: Maintained +F: drivers/crypto/hisilicon/qm.c +F: drivers/crypto/hisilicon/qm.h +F: drivers/crypto/hisilicon/zip/ + HMM - Heterogeneous Memory Management M: Jérôme Glisse L: linux...@kvack.org -- 2.8.1
[PATCH 1/4] Documentation: Add debugfs doc for hisi_zip
Add debugfs descriptions for HiSilicon ZIP and QM driver. Signed-off-by: Zhou Wang Reviewed-by: Jonathan Cameron --- Documentation/ABI/testing/debugfs-hisi-zip | 50 ++ 1 file changed, 50 insertions(+) create mode 100644 Documentation/ABI/testing/debugfs-hisi-zip diff --git a/Documentation/ABI/testing/debugfs-hisi-zip b/Documentation/ABI/testing/debugfs-hisi-zip new file mode 100644 index 000..a7c63e6 --- /dev/null +++ b/Documentation/ABI/testing/debugfs-hisi-zip @@ -0,0 +1,50 @@ +What: /sys/kernel/debug/hisi_zip//comp_core[01]/regs +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:Dump of compression cores related debug registers. + Only available for PF. + +What: /sys/kernel/debug/hisi_zip//decomp_core[0-5]/regs +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:Dump of decompression cores related debug registers. + Only available for PF. + +What: /sys/kernel/debug/hisi_zip//clear_enable +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:Compression/decompression core debug registers read clear + control. 1 means enable register read clear, otherwise 0. + Writing to this file has no functional effect, only enable or + disable counters clear after reading of these registers. + Only available for PF. + +What: /sys/kernel/debug/hisi_zip//current_qm +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:One ZIP controller has one PF and multiple VFs, each function + has a QM. Select the QM which below qm refers to. + Only available for PF. + +What: /sys/kernel/debug/hisi_zip//qm/qm_regs +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:Dump of QM related debug registers. + Available for PF and VF in host. VF in guest currently only + has one debug register. + +What: /sys/kernel/debug/hisi_zip//qm/current_q +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:One QM may contain multiple queues. Select specific queue to + show its debug registers in above qm_regs. + Only available for PF. + +What: /sys/kernel/debug/hisi_zip//qm/clear_enable +Date: Nov 2018 +Contact:linux-cry...@vger.kernel.org +Description:QM debug registers(qm_regs) read clear control. 1 means enable + register read clear, otherwise 0. + Writing to this file has no functional effect, only enable or + disable counters clear after reading of these registers. + Only available for PF. -- 2.8.1
[PATCH 2/4] crypto: hisilicon: Add queue management driver for HiSilicon QM module
QM is a general IP used by HiSilicon accelerators. It provides a general PCIe interface for the CPU and the accelerator to share a group of queues. A QM integrated in an accelerator provides queue management service. Queues can be assigned to PF and VFs, and queues can be controlled by unified mailboxes and doorbells. Specific task request are descripted by specific description buffer, which will be controlled and pass to related accelerator IP by QM. This patch adds a QM driver used by the accelerator driver to access the QM hardware. Signed-off-by: Zhou Wang Signed-off-by: Kenneth Lee Signed-off-by: Shiju Jose Signed-off-by: Hao Fang Reviewed-by: Jonathan Cameron Reviewed-by: John Garry --- drivers/crypto/hisilicon/Kconfig |4 + drivers/crypto/hisilicon/Makefile |1 + drivers/crypto/hisilicon/qm.c | 1823 + drivers/crypto/hisilicon/qm.h | 177 4 files changed, 2005 insertions(+) create mode 100644 drivers/crypto/hisilicon/qm.c create mode 100644 drivers/crypto/hisilicon/qm.h diff --git a/drivers/crypto/hisilicon/Kconfig b/drivers/crypto/hisilicon/Kconfig index 8ca9c50..993a98d 100644 --- a/drivers/crypto/hisilicon/Kconfig +++ b/drivers/crypto/hisilicon/Kconfig @@ -12,3 +12,7 @@ config CRYPTO_DEV_HISI_SEC To compile this as a module, choose M here: the module will be called hisi_sec. + +config CRYPTO_DEV_HISI_QM + tristate + depends on ARM64 && PCI && PCI_MSI diff --git a/drivers/crypto/hisilicon/Makefile b/drivers/crypto/hisilicon/Makefile index 463f46a..05e9052 100644 --- a/drivers/crypto/hisilicon/Makefile +++ b/drivers/crypto/hisilicon/Makefile @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_CRYPTO_DEV_HISI_SEC) += sec/ +obj-$(CONFIG_CRYPTO_DEV_HISI_QM) += qm.o diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c new file mode 100644 index 000..0a9d326 --- /dev/null +++ b/drivers/crypto/hisilicon/qm.c @@ -0,0 +1,1823 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2018 Hisilicon Limited. */ +#include +#include +#include +#include +#include +#include +#include +#include +#include "qm.h" + +/* eq/aeq irq enable */ +#define QM_VF_EQ_INT_SOURCE0x8 +#define QM_VF_EQ_INT_MASK 0xc + +/* mailbox */ +#define QM_MB_CMD_SQC 0x0 +#define QM_MB_CMD_CQC 0x1 +#define QM_MB_CMD_EQC 0x2 +#define QM_MB_CMD_AEQC 0x3 +#define QM_MB_CMD_SQC_BT 0x4 +#define QM_MB_CMD_CQC_BT 0x5 +#define QM_MB_CMD_SQC_VFT_V2 0x6 + +#define QM_MB_CMD_SEND_BASE0x300 +#define QM_MB_EVENT_SHIFT 8 +#define QM_MB_BUSY_SHIFT 13 +#define QM_MB_OP_SHIFT 14 +#define QM_MB_CMD_DATA_ADDR_L 0x304 +#define QM_MB_CMD_DATA_ADDR_H 0x308 + +/* sqc shift */ +#define QM_SQ_HOP_NUM_SHIFT0 +#define QM_SQ_PAGE_SIZE_SHIFT 4 +#define QM_SQ_BUF_SIZE_SHIFT 8 +#define QM_SQ_SQE_SIZE_SHIFT 12 +#define QM_SQ_PRIORITY_SHIFT 0 +#define QM_SQ_ORDERS_SHIFT 4 +#define QM_SQ_TYPE_SHIFT 8 + +#define QM_SQ_TYPE_MASK0xf + +/* cqc shift */ +#define QM_CQ_HOP_NUM_SHIFT0 +#define QM_CQ_PAGE_SIZE_SHIFT 4 +#define QM_CQ_BUF_SIZE_SHIFT 8 +#define QM_CQ_SQE_SIZE_SHIFT 12 +#define QM_CQ_PHASE_SHIFT 0 +#define QM_CQ_FLAG_SHIFT 1 + +#define QM_CQC_PHASE_BIT 0x1 +#define QM_CQE_PHASE(cqe) ((cqe)->w7 & 0x1) + +/* eqc shift */ +#define QM_EQC_EQE_SHIFT 12 +#define QM_EQC_PHASE_SHIFT 16 +#define QM_EQC_PHASE(eqc) eqc)->dw6) >> 16) & 0x1) +#define QM_EQC_PHASE_BIT 0x0001 + +#define QM_EQE_PHASE(eqe) (((eqe)->dw0 >> 16) & 0x1) +#define QM_EQE_CQN_MASK0x + +#define QM_AEQC_PHASE(aeqc)aeqc)->dw6) >> 16) & 0x1) +#define QM_AEQC_PHASE_BIT 0x0001 +#define QM_AEQE_PHASE(aeqe)(((aeqe)->dw0 >> 16) & 0x1) + +#define QM_DOORBELL_CMD_SQ 0 +#define QM_DOORBELL_CMD_CQ 1 +#define QM_DOORBELL_CMD_EQ 2 +#define QM_DOORBELL_CMD_AEQ3 + +#define QM_DOORBELL_BASE_V10x340 +#define QM_DOORBELL_SQ_CQ_BASE_V2 0x1000 +#define QM_DOORBELL_EQ_AEQ_BASE_V2 0x2000 + +#define QM_MEM_START_INIT 0x100040 +#define QM_MEM_INIT_DONE 0x100044 +#define QM_VFT_CFG_RDY 0x10006c +#define QM_VFT_CFG_OP_WR 0x100058 +#define QM_VFT_CFG_TYPE0x10005c +#define QM_SQC_VFT 0x0 +#define QM_CQC_VFT 0x1 +#define QM_VFT_CFG_ADDRESS 0x100060 +#define QM_VFT_CFG_OP_ENABLE 0x100054 + +#define QM_VFT_CFG_DATA_L 0x100064
[PATCH 0/4] crypto: hisilicon: Add HiSilicon QM and ZIP controller driver
This series adds HiSilicon QM and ZIP controller driver in crypto subsystem. A simple QM/ZIP driver which helps to provide an example for a general accelerator framework is under review in community[1]. Based on this simple driver, this series adds HW v2 support, PCI passthrough, reset, PCI/misc error handler, debug support. But unlike [1], driver in this patchset only registers to crypto subsystem. There will be a long discussion about above accelerator framework in the process of upstreaming. So let's firstly review and upstream QM/ZIP crypto driver. References: [1] https://lkml.org/lkml/2018/11/12/1951 Zhou Wang (4): Documentation: Add debugfs doc for hisi_zip crypto: hisilicon: Add queue management driver for HiSilicon QM module crypto: hisilicon: Add HiSilicon ZIP accelerator support MAINTAINERS: add maintainer for HiSilicon QM and ZIP controller driver Documentation/ABI/testing/debugfs-hisi-zip | 50 + MAINTAINERS|8 + drivers/crypto/hisilicon/Kconfig | 11 + drivers/crypto/hisilicon/Makefile |2 + drivers/crypto/hisilicon/qm.c | 1823 drivers/crypto/hisilicon/qm.h | 177 +++ drivers/crypto/hisilicon/zip/Makefile |2 + drivers/crypto/hisilicon/zip/zip.h | 60 + drivers/crypto/hisilicon/zip/zip_crypto.c | 410 +++ drivers/crypto/hisilicon/zip/zip_main.c| 1161 ++ 10 files changed, 3704 insertions(+) create mode 100644 Documentation/ABI/testing/debugfs-hisi-zip create mode 100644 drivers/crypto/hisilicon/qm.c create mode 100644 drivers/crypto/hisilicon/qm.h create mode 100644 drivers/crypto/hisilicon/zip/Makefile create mode 100644 drivers/crypto/hisilicon/zip/zip.h create mode 100644 drivers/crypto/hisilicon/zip/zip_crypto.c create mode 100644 drivers/crypto/hisilicon/zip/zip_main.c -- 2.8.1
[PATCH 3/4] crypto: hisilicon: Add HiSilicon ZIP accelerator support
The HiSilicon ZIP accelerator implements the zlib and gzip algorithm. It uses Hisilicon QM as the interface to the CPU. This patch provides PCIe driver to the accelerator and register it to the crypto subsystem. Signed-off-by: Zhou Wang Signed-off-by: Shiju Jose Signed-off-by: Kenneth Lee Signed-off-by: Hao Fang Reviewed-by: Jonathan Cameron Reviewed-by: John Garry --- drivers/crypto/hisilicon/Kconfig |7 + drivers/crypto/hisilicon/Makefile |1 + drivers/crypto/hisilicon/zip/Makefile |2 + drivers/crypto/hisilicon/zip/zip.h| 60 ++ drivers/crypto/hisilicon/zip/zip_crypto.c | 410 ++ drivers/crypto/hisilicon/zip/zip_main.c | 1161 + 6 files changed, 1641 insertions(+) create mode 100644 drivers/crypto/hisilicon/zip/Makefile create mode 100644 drivers/crypto/hisilicon/zip/zip.h create mode 100644 drivers/crypto/hisilicon/zip/zip_crypto.c create mode 100644 drivers/crypto/hisilicon/zip/zip_main.c diff --git a/drivers/crypto/hisilicon/Kconfig b/drivers/crypto/hisilicon/Kconfig index 993a98d..338a70a 100644 --- a/drivers/crypto/hisilicon/Kconfig +++ b/drivers/crypto/hisilicon/Kconfig @@ -16,3 +16,10 @@ config CRYPTO_DEV_HISI_SEC config CRYPTO_DEV_HISI_QM tristate depends on ARM64 && PCI && PCI_MSI + +config CRYPTO_DEV_HISI_ZIP + tristate "Support for HiSilicon ZIP accelerator" + depends on ARM64 + select CRYPTO_DEV_HISI_QM + help + Support for HiSilicon ZIP Driver diff --git a/drivers/crypto/hisilicon/Makefile b/drivers/crypto/hisilicon/Makefile index 05e9052..c97c5b2 100644 --- a/drivers/crypto/hisilicon/Makefile +++ b/drivers/crypto/hisilicon/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_CRYPTO_DEV_HISI_SEC) += sec/ obj-$(CONFIG_CRYPTO_DEV_HISI_QM) += qm.o +obj-$(CONFIG_CRYPTO_DEV_HISI_ZIP) += zip/ diff --git a/drivers/crypto/hisilicon/zip/Makefile b/drivers/crypto/hisilicon/zip/Makefile new file mode 100644 index 000..a936f09 --- /dev/null +++ b/drivers/crypto/hisilicon/zip/Makefile @@ -0,0 +1,2 @@ +obj-$(CONFIG_CRYPTO_DEV_HISI_ZIP) += hisi_zip.o +hisi_zip-objs = zip_main.o zip_crypto.o diff --git a/drivers/crypto/hisilicon/zip/zip.h b/drivers/crypto/hisilicon/zip/zip.h new file mode 100644 index 000..44a9e27 --- /dev/null +++ b/drivers/crypto/hisilicon/zip/zip.h @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2018 Hisilicon Limited. */ +#ifndef HISI_ZIP_H +#define HISI_ZIP_H + +#include +#include "../qm.h" + +#undef pr_fmt +#define pr_fmt(fmt)"hisi_zip: " fmt + +enum hisi_zip_error_type { + /* negative compression */ + HZIP_NC_ERR = 0x0d, +}; + +struct hisi_zip_ctrl; + +struct hisi_zip { + struct hisi_qm qm; + struct list_head list; + struct hisi_zip_ctrl *ctrl; +}; + +struct hisi_zip_sqe { + __u32 consumed; + __u32 produced; + __u32 comp_data_length; + __u32 dw3; + __u32 input_data_length; + __u32 lba_l; + __u32 lba_h; + __u32 dw7; + __u32 dw8; + __u32 dw9; + __u32 dw10; + __u32 priv_info; + __u32 dw12; + __u32 tag; + __u32 dest_avail_out; + __u32 rsvd0; + __u32 comp_head_addr_l; + __u32 comp_head_addr_h; + __u32 source_addr_l; + __u32 source_addr_h; + __u32 dest_addr_l; + __u32 dest_addr_h; + __u32 stream_ctx_addr_l; + __u32 stream_ctx_addr_h; + __u32 cipher_key1_addr_l; + __u32 cipher_key1_addr_h; + __u32 cipher_key2_addr_l; + __u32 cipher_key2_addr_h; + __u32 rsvd1[4]; +}; + +struct hisi_zip *find_zip_device(int node); +int hisi_zip_register_to_crypto(void); +void hisi_zip_unregister_from_crypto(void); +#endif diff --git a/drivers/crypto/hisilicon/zip/zip_crypto.c b/drivers/crypto/hisilicon/zip/zip_crypto.c new file mode 100644 index 000..172d237 --- /dev/null +++ b/drivers/crypto/hisilicon/zip/zip_crypto.c @@ -0,0 +1,410 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2018 Hisilicon Limited. */ +#include +#include +#include "zip.h" + +#define HZIP_INPUT_BUFFER_SIZE SZ_4M +#define HZIP_OUTPUT_BUFFER_SIZESZ_4M + +#define HZIP_ALG_TYPE_ZLIB 0x02 +#define HZIP_ALG_TYPE_GZIP 0x03 + +const u8 zlib_head[2] = {0x78, 0x9c}; +const u8 gzip_head[10] = {0x1f, 0x8b, 0x08, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x03}; + +#define COMP_NAME_TO_TYPE(alg_name)\ + (!strcmp((alg_name), "zlib-deflate") ? HZIP_ALG_TYPE_ZLIB : \ +!strcmp((alg_name), "gzip") ? HZIP_ALG_TYPE_GZIP : 0) \ + +#define TO_HEAD_SIZE(req_type) \ + (((req_type) == HZIP_ALG_TYPE_ZLIB) ? sizeof(zlib_head) : \ +((req_type) == HZIP_ALG_TYPE_GZIP) ? sizeof(gzip_head) : 0)\ + +#define TO_HEAD(req_type)
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
Am So., 23. Dez. 2018 um 00:02 Uhr schrieb Linus Torvalds : > > On Sat, Dec 22, 2018 at 2:49 PM Christian Brauner > wrote: > > > > To be fair, no one apart from me was pointing out that it actually > > breaks people including systemd folks > > even though I was bringing it up with them. I even tried to fix all of > > userspace after this got NACKED > > Seriously, the "we don't break user space" is the #1 rule in the > kernel, and people should _know_ it's the #1 rule. > > If somebody ignores that rule, it needs to be escalated to me. > Immediately. Because I need to know. > I do that usually but I didn't saw Christian's revert the time and I never hit that issue. Just saw that now because the unusual [BREAKAGE] prefix. > I need to know so that I can override the bogus NAK, and so that we > can fix the breakage ASAP. The absolute last thing we need is some > other user space then starting to rely on the new behavior, which just > compounds the problem and makes it a *much* bigger problem. > Yes and you are right .. https://github.com/lxc/lxc/pull/2438 I've added an comment there about 4.20.0. BR, Gabriel
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
On Sat, Dec 22, 2018 at 3:07 PM Christian Brauner wrote: > > However, for this case should I resend the revert? Since I was pointed at the original email thread, I just picked it up from there directly. It still applied cleanly, nothing had changed in that area. Linus
Re: [GIT PULL] auxdisplay for v4.20
The pull request you sent on Fri, 21 Dec 2018 21:53:02 +0100: > https://github.com/ojeda/linux.git tags/auxdisplay-for-linus-v4.20 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/38c0ecf6087a8cb2af24ddd2124e9ca3c666dcdf Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [GIT PULL] compiler-attributes for v4.20
The pull request you sent on Fri, 21 Dec 2018 22:29:22 +0100: > https://github.com/ojeda/linux.git tags/compiler-attributes-for-linus-v4.20 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/1104bd96eb2af9707dce69a22c63bd432a41380a Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [GIT PULL] SCSI fixes for 4.20-rc7
The pull request you sent on Fri, 21 Dec 2018 18:50:56 -0800: > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/9105b8aa50c182371533fc97db64fc8f26f051b3 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [PATCH] nfc: af_nfc: Fix Spectre v1 vulnerability
From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 15:47:53 -0600 > proto is indirectly controlled by user-space, hence leading to > a potential exploitation of the Spectre variant 1 vulnerability. > > This issue was detected with the help of Smatch: > > net/nfc/af_nfc.c:42 nfc_sock_create() warn: potential spectre issue > 'proto_tab' [w] (local cap) > > Fix this by sanitizing proto before using it to index proto_tab. > > Notice that given that speculation windows are large, the policy is > to kill the speculation on the first load and not worry if it can be > completed with a dependent load/store [1]. > > [1] https://marc.info/?l=linux-kernel=152449131114778=2 > > Signed-off-by: Gustavo A. R. Silva I'll take this directly, and queued up for -stable. Thanks.
Re: [PATCH] phonet: af_phonet: Fix Spectre v1 vulnerability
From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 15:41:17 -0600 > protocol is indirectly controlled by user-space, hence leading to > a potential exploitation of the Spectre variant 1 vulnerability. > > This issue was detected with the help of Smatch: > > net/phonet/af_phonet.c:48 phonet_proto_get() warn: potential spectre issue > 'proto_tab' [w] (local cap) > > Fix this by sanitizing protocol before using it to index proto_tab. > > Notice that given that speculation windows are large, the policy is > to kill the speculation on the first load and not worry if it can be > completed with a dependent load/store [1]. > > [1] https://marc.info/?l=linux-kernel=152449131114778=2 > > Signed-off-by: Gustavo A. R. Silva Applied, thanks.
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
On Sun, Dec 23, 2018 at 12:02 AM Linus Torvalds wrote: > > On Sat, Dec 22, 2018 at 2:49 PM Christian Brauner > wrote: > > > > To be fair, no one apart from me was pointing out that it actually > > breaks people including systemd folks > > even though I was bringing it up with them. I even tried to fix all of > > userspace after this got NACKED > > Seriously, the "we don't break user space" is the #1 rule in the > kernel, and people should _know_ it's the #1 rule. > > If somebody ignores that rule, it needs to be escalated to me. > Immediately. Because I need to know. Fair enough. I usually try to be very conservative when sending patches directly your way and Eric is otherwise very much on top of not regressing userspace and I trust him. However, for this case should I resend the revert? Christian > > I need to know so that I can override the bogus NAK, and so that we > can fix the breakage ASAP. The absolute last thing we need is some > other user space then starting to rely on the new behavior, which just > compounds the problem and makes it a *much* bigger problem. > > But I also need to know so that I can then make sure I know not to > trust the person who broke rule #1. > > This is not some odd corner case for the kernel. This is literally the > rule we have lived with for *decades*. > > So please escalate to me whenever you feel a kernel developer doesn't > follow the first rule. Because the code that broke things *will* be > reverted (*). > > Linus > > (*) Yes, there are exceptions. We have had situations where some > interface was simply just a huge security issue or had some other > fundamental issue. And we've had cases where the breakage was just so > old that it was no longer fixable. So even rule #1 can sometimes have > things that hold it back. But it is *very* rare. Certainly nothing > like this.
Re: [PATCH] net: core: Fix Spectre v1 vulnerability
From: "Gustavo A. R. Silva" Date: Fri, 21 Dec 2018 14:49:01 -0600 > flen is indirectly controlled by user-space, hence leading to > a potential exploitation of the Spectre variant 1 vulnerability. > > This issue was detected with the help of Smatch: > > net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue > 'filter' [w] > > Fix this by sanitizing flen before using it to index filter at line 1101: > > switch (filter[flen - 1].code) { > > and through pc at line 1040: > > const struct sock_filter *ftest = [pc]; > > Notice that given that speculation windows are large, the policy is > to kill the speculation on the first load and not worry if it can be > completed with a dependent load/store [1]. > > [1] https://marc.info/?l=linux-kernel=152449131114778=2 > > Signed-off-by: Gustavo A. R. Silva BPF folks, I'll take this directly. Applied and queued up for -stable, thanks.
Re: WIP Droid 4 voice calls, GNSS & PM with a TS 27.010 serdev driver
Hi, little elves! > So the little elves have been slowly working to get voice calls > working on droid 4 with the mainline kernel. And just in time for the > upcoming holidays, it might be possible to call friends and relatives. > > I've pushed out an experimental branch containing serdev ts 27.010 > UART multiplexing support. That contains a serdev core driver for the > mdm6600 modem (that also now idles the modem for PM), support for Alsa > ASoC voice codec and mixer, and a GNSS driver for the GPS. > > Where it does not make sense to do a kernel serdev driver, I've > exposed the rest of the available 27.010 channels as ten /dev/motmdm* > character devices. There's /dev/motmdm1 for AT commands to dial voice > calls, /dev/motmdm3 for SMS eventually, and I think there's also a SIM > card reader at /dev/motmdm10. Then /dev/motmdm7 seems to be just an > echo channel. The other channels are still a bit of a mystery. I tried to get access at motmdm, but no: root@devuan:/home/user# minicom -D /dev/motmdm1 minicom: cannot open /dev/motmdm1: No such file or directory root@devuan:/home/user# ls -al /dev/motmdm1 ls: cannot access '/dev/motmdm1': No such file or directory root@devuan:/home/user# dmesg | grep motmd root@devuan:/home/user# zcat /proc/config.gz | grep MDM CONFIG_MFD_MOTMDM=y CONFIG_SND_SOC_MOTMDM=y CONFIG_PHY_MAPPHONE_MDM6600=y root@devuan:/home/user# uname -a Linux devuan 4.20.0-rc7-00304-gde109fe #19 SMP Sat Dec 22 20:16:19 CET 2018 armv7l GNU/Linux root@devuan:/home/user# Let me try to enable CONFIG_GNSS_MOTMDM. N_GSM also seems enabled. Is there anything else I need to enable in .config? Scary Solstice! Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
Re: [PATCH] sock: Make sock->sk_tstamp thread-safe
From: Deepa Dinamani Date: Fri, 21 Dec 2018 12:27:33 -0800 > Al Viro mentioned that there is probably a race condition > lurking in accesses of sk_tstamp on 32-bit machines. > > sock->sk_tstamp is of type ktime_t which is always an s64. > On a 32 bit architecture, we might run into situations of > unsafe access as the access to the field becomes non atomic. > > Use seqlocks for synchronization. > This allows us to avoid using spinlocks for readers as > readers do not need mutual exclusion. > > Another approach to solve this is to require sk_lock for all > modifications of the timestamps. The current approach allows > for timestamps to have their own lock: sk_tstamp_lock. > This allows for the patch to not compete with already > existing critical sections, and side effects are limited > to the paths in the patch. > > The addition of the new field maintains the data locality > optimizations from > commit 9115e8cd2a0c ("net: reorganize struct sock for better data > locality") > > Note that all the instances of the sk_tstamp accesses > are either through the ioctl or the syscall recvmsg. > > Signed-off-by: Deepa Dinamani Since, regardless of whether this is the final approach we will take, it seems that sunrpc needs to be added to this patch. So I'm definitely waiting for a new version. Thanks.
Re: [PATCH] m68k/mac: Use '030 reset method on SE/30
On Sat, 22 Dec 2018, Geert Uytterhoeven wrote: > > - local_irq_save(flags); > > - > > - rom_reset(); > > - > > - local_irq_restore(flags); > > I guess you removed the call to local_irq_restore() because you never > get there anyway? > If a ROM call returns, we have a real problem, because we didn't call it in an execution environment that it is designed to be called in. Anything could happen. Moreover, local_irq_restore() is bogus either way, given that there's nothing that our interrupt handlers can usefully do now. See also commit 558d5ad276c9 ("m68k/mac: Avoid soft-lockup warning after mach_power_off"). --
Re: [PATCH][next] smack: fix two memory leaks in smack_add_opt
On 22/12/2018 19:34, Al Viro wrote: > On Sat, Dec 22, 2018 at 12:27:50PM +, Colin King wrote: >> From: Colin Ian King >> >> Currently if s is null or when returning via the error exit label >> out_opt_err leaks of the allocated opts can occur. Fix the leak >> on the null s case by checking s is null before the allocation. Fix >> the leak on the exit path by checking if opts was allocated by >> kfree'ing opts. >> >> Detected by CoverityScan, CID#1475953 ("Resource leak") >> >> Fixes: b2130173efae ("smack: take the guts of smack_parse_opts_str() into a >> new helper") >> Signed-off-by: Colin Ian King > > There's a better fix in local tree, will go into -next tonight. > OK, sounds good to me. Thanks Colin
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
On Sat, Dec 22, 2018 at 2:49 PM Christian Brauner wrote: > > To be fair, no one apart from me was pointing out that it actually > breaks people including systemd folks > even though I was bringing it up with them. I even tried to fix all of > userspace after this got NACKED Seriously, the "we don't break user space" is the #1 rule in the kernel, and people should _know_ it's the #1 rule. If somebody ignores that rule, it needs to be escalated to me. Immediately. Because I need to know. I need to know so that I can override the bogus NAK, and so that we can fix the breakage ASAP. The absolute last thing we need is some other user space then starting to rely on the new behavior, which just compounds the problem and makes it a *much* bigger problem. But I also need to know so that I can then make sure I know not to trust the person who broke rule #1. This is not some odd corner case for the kernel. This is literally the rule we have lived with for *decades*. So please escalate to me whenever you feel a kernel developer doesn't follow the first rule. Because the code that broke things *will* be reverted (*). Linus (*) Yes, there are exceptions. We have had situations where some interface was simply just a huge security issue or had some other fundamental issue. And we've had cases where the breakage was just so old that it was no longer fixable. So even rule #1 can sometimes have things that hold it back. But it is *very* rare. Certainly nothing like this.
Re: [PATCH v2 03/11] vga-switcheroo: make PCI dependency explicit
On Sat, Dec 22, 2018 at 7:40 PM Lukas Wunner wrote: > > On Sat, Dec 22, 2018 at 09:07:12AM +, Sinan Kaya wrote: > > This driver depends on the PCI infrastructure but the dependency has not > > been explicitly called out. > > > > Signed-off-by: Sinan Kaya > > Reviewed-by: Lukas Wunner > > This series doesn't have a cover letter so it's unclear to me through > which tree it's supposed to go in? Each patch through the individual > subsystem tree or all through the same tree? If the former I guess I > could push this to drm-misc... > Feel free to apply individual patches independently. Let me know which one you applied so that I can drop them on the next rev. > Thanks, > > Lukas > > > --- > > drivers/gpu/vga/Kconfig | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/gpu/vga/Kconfig b/drivers/gpu/vga/Kconfig > > index b677e5d524e6..d5f1d8e1c6f8 100644 > > --- a/drivers/gpu/vga/Kconfig > > +++ b/drivers/gpu/vga/Kconfig > > @@ -21,6 +21,7 @@ config VGA_SWITCHEROO > > bool "Laptop Hybrid Graphics - GPU switching support" > > depends on X86 > > depends on ACPI > > + depends on PCI > > select VGA_ARB > > help > > Many laptops released in 2008/9/10 have two GPUs with a multiplexer > > -- > > 2.19.0
[PATCH] rtlwifi: rtl8822b: fix a missing check of alloc_skb
__netdev_alloc_skb() return NULl when it fails. skb_put() further uses it even when the allocation fails, leading to NULL pointer dereference. The fix inserts a check for the return value of __netdev_alloc_skb(). Signed-off-by: Kangjie Lu --- drivers/staging/rtlwifi/rtl8822be/fw.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/staging/rtlwifi/rtl8822be/fw.c b/drivers/staging/rtlwifi/rtl8822be/fw.c index a40396614814..f061dd1382aa 100644 --- a/drivers/staging/rtlwifi/rtl8822be/fw.c +++ b/drivers/staging/rtlwifi/rtl8822be/fw.c @@ -486,6 +486,8 @@ bool rtl8822b_halmac_cb_write_data_h2c(struct rtl_priv *rtlpriv, u8 *buf, /* without GFP_DMA, pci_map_single() may not work */ skb = __netdev_alloc_skb(NULL, size, GFP_ATOMIC | GFP_DMA); + if (!skb) + return false; memcpy((u8 *)skb_put(skb, size), buf, size); return _rtl8822be_send_bcn_or_cmd_packet(rtlpriv->hw, skb, H2C_QUEUE); -- 2.17.2 (Apple Git-113)
Re: tpm_tis TPM2.0 not detected on cold boot
On Sat, 2018-12-22 at 14:47 +0100, Michael Niewöhner wrote: > When I remove the timeout and boot directly to the linux kernel, I get that > "2314 TPM-self test error" since it has not finished, yet. The TPM is detected > by IMA and works fine then. > > Some more tests showed that any delay before booting the kernel causes the TPM > to not get detected. I tested, 10, 15, 20, 30, 60... seconds. Only in some > very > rare cases the TPM got detected. > > I wanted to know if the TPM is in an well initialized state at the time of > that > error. Since I was not able to get some test/debug kernel patches working I > decided to try kexec. It turned out that the TPM is indeed correctly working > and > will be detected just fine by linux after kexec! No surprise here. kexec would be the equivalent of a soft reboot. > > Is there anyone having an idea what could be wrong here? I am willing to debug > this but I have really no idea where to start :-( A while ago, I was "playing" with a pi. Commenting out tpm2_do_selftest() seemed to resolve a similar problem, but that was before James' patches. I don't know if that would make a difference now. Mimi
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
On Sat, Dec 22, 2018 at 11:20 PM Linus Torvalds wrote: > > Eric, this is entirely unacceptable. i would like to point out that I send a revert for this in *July* before any kernel with this change was released for the exact same reason. But I was ignored and no one came to argumentative aid: - https://lists.linuxfoundation.org/pipermail/containers/2018-July/039182.html - https://lists.linuxfoundation.org/pipermail/containers/2018-July/039183.html To be fair, no one apart from me was pointing out that it actually breaks people including systemd folks even though I was bringing it up with them. I even tried to fix all of userspace after this got NACKED ( https://github.com/systemd/systemd/pull/9483 ). Christian > > On Sat, Dec 22, 2018 at 12:58 PM Gabriel C wrote: > > > > Added some people to CC that might want to see this.. > > Thanks. > > > > Here's an email that was sent to lkml about the subject: > > > > > > https://lkml.org/lkml/2018/7/5/742 > > > > > > I link also this, quoting the last of it: > > > > > > https://lkml.org/lkml/2018/7/5/701 > > > > > > It has never been the case that mknod on a device node will guarantee > > > that you even can open the device node. The applications that regress > > > are broken. It doesn't mean we shouldn't be bug compatible, but we darn > > > well should document very clearly the bugs we are being bug compatible > > > with. > > Yeah, this is complete garbage. > > We have very clear rules in the kernel: if some change breaks existing > setups, it is ABSOLUTELY NEVER the application that is broken. > > It is the kernel. > > There is absolutely zero gray areas here. Eric, your behavior is > entirely out of line, and now we apparently have a regression that > goes back to June that I was not told about because of your incorrect > stance. > > Eric, I want to make this 1000% clear: there are no user space bugs. > If it used to work, then user space was clearly doing the right thing. > The fact that you tried to several times claim it was buggy user space > is a serious breach of trust. You KNOW this is the case. > > Seriously. There are no excuses. > > That commit is now reverted in my tree, and furthermore I will not > take any pull requests from you until you have made it clear that you > comprehend this very fundamental issue. > > Why did it take so long for this issue to be elevated to me? > >Linus
Re: [PATCH v2 01/12] fs-verity: add a documentation file
On Fri, Dec 21, 2018 at 8:20 PM Theodore Y. Ts'o wrote: > > On Fri, Dec 21, 2018 at 11:13:07AM -0800, Linus Torvalds wrote: > > > > In other words: either the model is that the file *itself* contains > > its own merkle tree that validates the file, or it isn't. You can't > > have it two ways. No silly "layout changes when you apply the hash" > > garbage. That's just crazy talk and invalidates the whole model. > > Userspace applications which are reading the file aren't going to be > expecting Merkle tree. For example, one of the use cases is Android > APK files, which are essentially ZIP files. ZIP files can be parsed > both from the front-end (streaming), or by looking for the complete > directory of all of the files in the ZIP file by starting at the end > of the file and moving backwards. If the Merkle tree was visible to > userspace programs that are opening and reading the file, it would > confuse them mightily. > > So what we do for ext4 and f2fs is make the Merkle tree invisible Again, this has nothing that is per-filesystem in it. If we were to decide to support the notion of "append merkle hashes to the file for validation" at the vfs layer, the same logic would apply: obviously the merkle data shouldn't be visible to user space. But that's not a reason to do it at a filesystem layer, quite the reverse: exactly like you say, as far as the *filesystem* is concerned, the data is there in the file. It's literally about the *view* of the file, ie the system call interface: >From the *file system's* perspective, > though, the metadata blocks are part of the file. To me that only argues that this all should be at the vfs layer, and that it shouldn't be the filesystem that hides it. Exactly because as far as the filesystem is concerned, the merkle data is there, it's just that we hide it at read (and stat) time. Preferably some way where it's namespace-dependent or whatever, so that you could still access the original file data from user space if you want to (eg some backup purpose or other). What I'm missing is any kind of sane explanation for why it was done so badly, and why it should be upstreamed despite the apparent bad implementation. It sounds like a complete hack. Again, to me either the point is that it's a generic extension of the file data, _or_ it's some filesystem-specific hidden data. The way you've done it and written the documentation, it's clearly a generic extension of normal file data, and I don't see what's fs-specific to it. > The problem is that xattrs are designed to be accessed via a set/get > interface, are currently limited, IIRC at 32k. The max size of an APK > is 300 megabytes; and the Merkle tree for a file that size will be > about 2.3 megabytes. That's way too big to store as an xattr; > certainly using the existing xattr interfaces. And it's also bigger > than most file systems can handle as xattrs today --- because they've > been optimzied for relatively small sizes, for things like SELinux > labels and ACL structures. So *this* kind of argument is what I'm looking for. That at least explains why it's not an xattr. Ugly, but understandable. > > So why is this sold as some unholy mess of "filesystem-specific" and > > "generic"? That part just annoys the hell out of me. Why isn't this > > sold as an *actual* generic model, where you just say "append the > > merkle tree to the file, then enable verity testing of the end result > > and validate the top-level hash". > > That was the original way it was sold, but Cristoph and Dave have > NACK'ed it in that form. That seems entirely irrelevant. What do Christoph and Dave have to do with it once it's generic? It would have _zero_ filesystem component if it's actually done in a generic manner. It would be a total no-op to XFS. Which makes me think "it wasn't actually sold as being filesystem-independent" at all. So I want to understand why this was made a filesystem operation in the first place. What's fs-specific about this implementation? Linus
Re: [PATCH 00/14] Add support for FM radio in hcill and kill TI_ST
Hi! > > This moves all remaining users of the legacy TI_ST driver to hcill (patches > > 1-3). Then patches 4-7 convert wl128x-radio driver to a standard platform > > device driver with support for multiple instances. Patch 7 will result in > > (userless) TI_ST driver no longer supporting radio at runtime. Patch 8-11 do > > some cleanups in the wl128x-radio driver. Finally patch 12 removes the TI_ST > > specific parts from wl128x-radio and adds the required infrastructure to > > use it > > with the serdev hcill driver instead. The remaining patches 13 and 14 remove > > the old TI_ST code. > > > > The new code has been tested on the Motorola Droid 4. For testing the audio > > should be configured to route Ext to Speaker or Headphone. Then you need to > > plug headphone, since its cable is used as antenna. For testing there is a > > 'radio' utility packages in Debian. When you start the utility you need to > > specify a frequency, since initial get_frequency returns an error: > > > > $ radio -f 100.0 > > Ok, it seems the driver does not work built-in, due to firmware issue: > > root@devuan:/home/user# dmesg | grep wl12 > [1.018951] reg-fixed-voltage regulator-wl12xx: GPIO lookup for > consumer (null) > [1.026550] reg-fixed-voltage regulator-wl12xx: using device tree > for GPIO lookup > [1.034271] of_get_named_gpiod_flags: can't parse 'gpios' property > of node '/regulator-wl12xx[0]' > [1.043487] of_get_named_gpiod_flags: parsed 'gpio' property of > node '/regulator-wl12xx[0]' - status (0) > [4.151885] wl12xx_driver wl12xx.1.auto: Direct firmware load for > ti-connectivity/wl128x-nvs.bin failed with error -2 > [ 11.368286] vwl1271: disabling > root@devuan:/home/user# find /lib/firmware/ | grep wl128 > /lib/firmware/ti-connectivity/wl128x-fw-5-plt.bin > /lib/firmware/ti-connectivity/wl128x-fw-5-mr.bin > /lib/firmware/ti-connectivity/wl128x-fw-5-sr.bin > /lib/firmware/ti-connectivity/wl128x-nvs.bin > root@devuan:/home/user# > > Ideas welcome... ... ... am I supposed to compile wl128-nvs.bin into > the kernel using EXTRA_FIRMWARE? EXTRA_FIRMWARE gets me further... some of it was not in debian. "Speaker right" needs to be set to "Ext" in alsamixer, and then... it works! :-) Quality does not seem to be great, but that may be mixer settings or something. Thanks! Pavel Tested-by: Pavel Machek -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
[PATCH v3 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
hugetlbfs page faults can race with truncate and hole punch operations. Current code in the page fault path attempts to handle this by 'backing out' operations if we encounter the race. One obvious omission in the current code is removing a page newly added to the page cache. This is pretty straight forward to address, but there is a more subtle and difficult issue of backing out hugetlb reservations. To handle this correctly, the 'reservation state' before page allocation needs to be noted so that it can be properly backed out. There are four distinct possibilities for reservation state: shared/reserved, shared/no-resv, private/reserved and private/no-resv. Backing out a reservation may require memory allocation which could fail so that needs to be taken into account as well. Instead of writing the required complicated code for this rare occurrence, just eliminate the race. i_mmap_rwsem is now held in read mode for the duration of page fault processing. Hold i_mmap_rwsem longer in truncation and hold punch code to cover the call to remove_inode_hugepages. With this modification, code in remove_inode_hugepages checking for races becomes 'dead' as it can not longer happen. Remove the dead code and expand comments to explain reasoning. Similarly, checks for races with truncation in the page fault path can be simplified and removed. Cc: Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd") Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c | 61 mm/hugetlb.c | 21 --- 2 files changed, 38 insertions(+), 44 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 32920a10100e..a2fcea5f8225 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -383,17 +383,16 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end) * truncation is indicated by end of range being LLONG_MAX * In this case, we first scan the range and release found pages. * After releasing pages, hugetlb_unreserve_pages cleans up region/reserv - * maps and global counts. Page faults can not race with truncation - * in this routine. hugetlb_no_page() prevents page faults in the - * truncated range. It checks i_size before allocation, and again after - * with the page table lock for the page held. The same lock must be - * acquired to unmap a page. + * maps and global counts. * hole punch is indicated if end is not LLONG_MAX * In the hole punch case we scan the range and release found pages. * Only when releasing a page is the associated region/reserv map * deleted. The region/reserv map for ranges without associated - * pages are not modified. Page faults can race with hole punch. - * This is indicated if we find a mapped page. + * pages are not modified. + * + * Callers of this routine must hold the i_mmap_rwsem in write mode to prevent + * races with page faults. + * * Note: If the passed end of range value is beyond the end of file, but * not LLONG_MAX this routine still performs a hole punch operation. */ @@ -423,32 +422,14 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart, for (i = 0; i < pagevec_count(); ++i) { struct page *page = pvec.pages[i]; - u32 hash; index = page->index; - hash = hugetlb_fault_mutex_hash(h, current->mm, - _vma, - mapping, index, 0); - mutex_lock(_fault_mutex_table[hash]); - /* -* If page is mapped, it was faulted in after being -* unmapped in caller. Unmap (again) now after taking -* the fault mutex. The mutex will prevent faults -* until we finish removing the page. -* -* This race can only happen in the hole punch case. -* Getting here in a truncate operation is a bug. +* A mapped page is impossible as callers should unmap +* all references before calling. And, i_mmap_rwsem +* prevents the creation of additional mappings. */ - if (unlikely(page_mapped(page))) { - BUG_ON(truncate_op); - - i_mmap_lock_write(mapping); - hugetlb_vmdelete_list(>i_mmap, - index * pages_per_huge_page(h), - (index + 1) * pages_per_huge_page(h)); - i_mmap_unlock_write(mapping); - } + VM_BUG_ON(page_mapped(page));
[PATCH v3 0/2] hugetlbfs: use i_mmap_rwsem for better synchronization
There are two primary issues addressed here: 1) For shared pmds, huge PTE pointers returned by huge_pte_alloc can become invalid via a call to huge_pmd_unshare by another thread. 2) hugetlbfs page faults can race with truncation causing invalid global reserve counts and state. Both issues are addressed by expanding the use of i_mmap_rwsem. These issues have existed for a long time. They can be recreated with a test program that causes page fault/truncation races. For simple mappings, this results in a negative HugePages_Rsvd count. If racing with mappings that contain shared pmds, we can hit "BUG at fs/hugetlbfs/inode.c:444!" or Oops! as the result of an invalid memory reference. v2 -> v3 Incorporated suggestions from Kirill. Code change to hold i_mmap_rwsem for duration of copy in copy_hugetlb_page_range. Took i_mmap_rwsem in hugetlbfs_evict_inode to be consistent with other callers. Other changes were to documentation/comments. v1 -> v2 Combined patches 2 and 3 of v1 series as suggested by Aneesh. No other changes were made. Patches are a follow up to the RFC, http://lkml.kernel.org/r/20181024045053.1467-1-mike.krav...@oracle.com Comments made by Naoya were addressed. Mike Kravetz (2): hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race fs/hugetlbfs/inode.c | 61 +++- mm/hugetlb.c | 84 +++- mm/memory-failure.c | 14 +++- mm/migrate.c | 13 ++- mm/rmap.c| 4 +++ mm/userfaultfd.c | 11 -- 6 files changed, 125 insertions(+), 62 deletions(-) -- 2.17.2
[PATCH v3 1/2] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
While looking at BUGs associated with invalid huge page map counts, it was discovered and observed that a huge pte pointer could become 'invalid' and point to another task's page table. Consider the following: A task takes a page fault on a shared hugetlbfs file and calls huge_pte_alloc to get a ptep. Suppose the returned ptep points to a shared pmd. Now, another task truncates the hugetlbfs file. As part of truncation, it unmaps everyone who has the file mapped. If the range being truncated is covered by a shared pmd, huge_pmd_unshare will be called. For all but the last user of the shared pmd, huge_pmd_unshare will clear the pud pointing to the pmd. If the task in the middle of the page fault is not the last user, the ptep returned by huge_pte_alloc now points to another task's page table or worse. This leads to bad things such as incorrect page map/reference counts or invalid memory references. To fix, expand the use of i_mmap_rwsem as follows: - i_mmap_rwsem is held in read mode whenever huge_pmd_share is called. huge_pmd_share is only called via huge_pte_alloc, so callers of huge_pte_alloc take i_mmap_rwsem before calling. In addition, callers of huge_pte_alloc continue to hold the semaphore until finished with the ptep. - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called. Cc: Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz --- mm/hugetlb.c| 67 ++--- mm/memory-failure.c | 14 +- mm/migrate.c| 13 - mm/rmap.c | 4 +++ mm/userfaultfd.c| 11 ++-- 5 files changed, 89 insertions(+), 20 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 309fb8c969af..2a3162030167 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3239,6 +3239,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, int cow; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); + struct address_space *mapping = vma->vm_file->f_mapping; unsigned long mmun_start; /* For mmu_notifiers */ unsigned long mmun_end; /* For mmu_notifiers */ int ret = 0; @@ -3247,14 +3248,25 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, mmun_start = vma->vm_start; mmun_end = vma->vm_end; - if (cow) + if (cow) { mmu_notifier_invalidate_range_start(src, mmun_start, mmun_end); + } else { + /* +* For shared mappings i_mmap_rwsem must be held to call +* huge_pte_alloc, otherwise the returned ptep could go +* away if part of a shared pmd and another thread calls +* huge_pmd_unshare. +*/ + i_mmap_lock_read(mapping); + } for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) { spinlock_t *src_ptl, *dst_ptl; + src_pte = huge_pte_offset(src, addr, sz); if (!src_pte) continue; + dst_pte = huge_pte_alloc(dst, addr, sz); if (!dst_pte) { ret = -ENOMEM; @@ -3325,6 +3337,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, if (cow) mmu_notifier_invalidate_range_end(src, mmun_start, mmun_end); + else + i_mmap_unlock_read(mapping); return ret; } @@ -3772,14 +3786,18 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, }; /* -* hugetlb_fault_mutex must be dropped before -* handling userfault. Reacquire after handling -* fault to make calling code simpler. +* hugetlb_fault_mutex and i_mmap_rwsem must be +* dropped before handling userfault. Reacquire +* after handling fault to make calling code simpler. */ hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, haddr); mutex_unlock(_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); + ret = handle_userfault(, VM_UFFD_MISSING); + + i_mmap_lock_read(mapping); mutex_lock(_fault_mutex_table[hash]); goto out; } @@ -3927,6 +3945,11 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, ptep = huge_pte_offset(mm, haddr, huge_page_size(h)); if (ptep) { + /* +* Since we hold no locks, ptep could be stale. That is +* OK as we are only making decisions based on content and +* not
RE: [PATCH v1 1/2] dt-bindings: add binding for USBSS-DRD controller.
Hi Rob, >On Mon, Dec 10, 2018 at 12:39:14PM +, Pawel Laszczak wrote: >> This patch aim at documenting USB related dt-bindings for the >> Cadence USBSS-DRD controller. >> >> Signed-off-by: Pawel Laszczak >> --- >> .../devicetree/bindings/usb/cdns3-usb.txt | 31 +++ >> 1 file changed, 31 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/usb/cdns3-usb.txt >> >> diff --git a/Documentation/devicetree/bindings/usb/cdns3-usb.txt >> b/Documentation/devicetree/bindings/usb/cdns3-usb.txt >> new file mode 100644 >> index ..ae4a255f0b10 >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/usb/cdns3-usb.txt > >cdns-usb3.txt? In all functions we use prefix cdns3, so it was a reason for cdns3-usb.txt. I agree with you and I think that cdns-usb3.txt in this place will be better. 3 in cdns3 could be associated with company name, but that's not true. > >> @@ -0,0 +1,31 @@ >> +Binding for the Cadence USBSS-DRD controller >> + >> +Required properties: >> + - reg: Physical base address and size of the controller's register areas. >> + Controller has 3 different regions: >> + region 1 - HOST registers area >> + region 2 - DEVICE registers area >> + region 3 - OTG/DRD registers area >> + - compatible: Should contain: "cdns,usb3" > >Only 1 version of the IP block? I will add one more. Now I know that we should have at least one additional version. Controller version will be recognized at runtime, but in my opinion we should also have information about this in dt-binding documentation. > >> + - interrupts: Interrupt specifier. Refer to interrupt bindings. >> +Driver supports only single interrupt line. >> +This single interrupt is shared between Device, >> +host and OTG/DRD part of driver. >> + >> +Optional propertiesi: > >typo Thank, I know that already. Someone has already reported me this typo. "i" - insert mode in vim :) > >> + - maximum-speed : valid arguments are "super-speed", "high-speed" and >> + "full-speed"; refer to usb/generic.txt >> + - dr_mode: Should be one of "host", "peripheral" or "otg". >> + - phys: reference to the USB PHY >> + - phy-names: name of the USB PHY, should be "cdns3,usbphy" > >Don't need -names when there is only one. But in the future probably we will need to add next phy version. So maybe it's better to leave this name ? > >> + >> + >> +Example: >> +usb@f300 { >> +compatible = "cdns,usb3"; >> +interrupts = ; >> +reg = <0xf300 0x1 //memory area for HOST registers >> +0xf301 0x1 //memory area for DEVICE >> registers >> +0xf302 0x1>;//memory area for OTG/DRD >> registers >> +}; >> + >> -- >> 2.17.1 >> Thanks for comments Cheers, Pawel
Re: [BREAKAGE] Since 4.18, kernel sets SB_I_NODEV implicitly on userns mounts, breaking systemd-nspawn
Eric, this is entirely unacceptable. On Sat, Dec 22, 2018 at 12:58 PM Gabriel C wrote: > > Added some people to CC that might want to see this.. Thanks. > > Here's an email that was sent to lkml about the subject: > > > > https://lkml.org/lkml/2018/7/5/742 > > > > I link also this, quoting the last of it: > > > > https://lkml.org/lkml/2018/7/5/701 > > > > It has never been the case that mknod on a device node will guarantee > > that you even can open the device node. The applications that regress > > are broken. It doesn't mean we shouldn't be bug compatible, but we darn > > well should document very clearly the bugs we are being bug compatible with. Yeah, this is complete garbage. We have very clear rules in the kernel: if some change breaks existing setups, it is ABSOLUTELY NEVER the application that is broken. It is the kernel. There is absolutely zero gray areas here. Eric, your behavior is entirely out of line, and now we apparently have a regression that goes back to June that I was not told about because of your incorrect stance. Eric, I want to make this 1000% clear: there are no user space bugs. If it used to work, then user space was clearly doing the right thing. The fact that you tried to several times claim it was buggy user space is a serious breach of trust. You KNOW this is the case. Seriously. There are no excuses. That commit is now reverted in my tree, and furthermore I will not take any pull requests from you until you have made it clear that you comprehend this very fundamental issue. Why did it take so long for this issue to be elevated to me? Linus
Re: [PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
On Fri, Dec 21, 2018 at 02:17:32PM -0800, Mike Kravetz wrote: > Am I misunderstanding your question/concern? No. Thanks for the clarification. > > I have decided to add the locking (although unnecessary) with something like > this in hugetlbfs_evict_inode. > > /* >* The vfs layer guarantees that there are no other users of this >* inode. Therefore, it would be safe to call remove_inode_hugepages >* without holding i_mmap_rwsem. We acquire and hold here to be >* consistent with other callers. Since there will be no contention >* on the semaphore, overhead is negligible. >*/ > i_mmap_lock_write(mapping); > remove_inode_hugepages(inode, 0, LLONG_MAX); > i_mmap_unlock_write(mapping); LGTM. -- Kirill A. Shutemov
Re: [alsa-devel] [PATCH v3 1/9] ASoC: sun4i-i2s: Adjust regmap settings
On 2018-12-21 17:44, Chen-Yu Tsai wrote: > On Fri, Dec 21, 2018 at 11:21 PM wrote: >> From: Marcus Cooper >> >> Bypass the regmap cache when flushing the i2s FIFOs and modify the tables >> to reflect this. >> >> Signed-off-by: Marcus Cooper >> --- >> sound/soc/sunxi/sun4i-i2s.c | 29 + >> 1 file changed, 9 insertions(+), 20 deletions(-) >> >> diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c >> index d5ec1a20499d..64d073cb2aee 100644 >> --- a/sound/soc/sunxi/sun4i-i2s.c >> +++ b/sound/soc/sunxi/sun4i-i2s.c >> @@ -548,9 +548,11 @@ static int sun4i_i2s_set_fmt(struct snd_soc_dai *dai, >> unsigned int fmt) >> static void sun4i_i2s_start_capture(struct sun4i_i2s *i2s) >> { >> /* Flush RX FIFO */ >> + regcache_cache_bypass(i2s->regmap, true); >> regmap_update_bits(i2s->regmap, SUN4I_I2S_FIFO_CTRL_REG, >>SUN4I_I2S_FIFO_CTRL_FLUSH_RX, >>SUN4I_I2S_FIFO_CTRL_FLUSH_RX); >> + regcache_cache_bypass(i2s->regmap, false); > IIRC the flush cache bit is self-clearing. So you likely want to mark > this register as volatile. If it is marked as volatile, then all access > to that register bypasses the cache, so the regcache_cache_bypass calls > are unneeded. I helped test this together with Marcus and when I tested with this register marked as volatile audio started to stutter, still unclear why audio starts to stutter. Without this cache bypass the flush TX/RX bits gets cached and flush happens unexpectedly resulting in multi channel audio getting mapped to wrong speaker. Other ASoC codecs and fsl_spdif.c seems to use similar cache bypass for reset/flush. > > However, looking at the code, the write would seem to be ignored if the > regmap is in the cache_only state. We only set this when the bus clock > is disabled. Under such a condition, bypassing the cache and forcing a > write would be unwise, as the system either drops the write, or stalls > altogether. > >> /* Clear RX counter */ >> regmap_write(i2s->regmap, SUN4I_I2S_RX_CNT_REG, 0); >> @@ -569,9 +571,11 @@ static void sun4i_i2s_start_capture(struct sun4i_i2s >> *i2s) >> static void sun4i_i2s_start_playback(struct sun4i_i2s *i2s) >> { >> /* Flush TX FIFO */ >> + regcache_cache_bypass(i2s->regmap, true); >> regmap_update_bits(i2s->regmap, SUN4I_I2S_FIFO_CTRL_REG, >>SUN4I_I2S_FIFO_CTRL_FLUSH_TX, >>SUN4I_I2S_FIFO_CTRL_FLUSH_TX); >> + regcache_cache_bypass(i2s->regmap, false); >> >> /* Clear TX counter */ >> regmap_write(i2s->regmap, SUN4I_I2S_TX_CNT_REG, 0); >> @@ -703,13 +707,7 @@ static const struct snd_soc_component_driver >> sun4i_i2s_component = { >> >> static bool sun4i_i2s_rd_reg(struct device *dev, unsigned int reg) >> { >> - switch (reg) { >> - case SUN4I_I2S_FIFO_TX_REG: >> - return false; >> - >> - default: >> - return true; >> - } >> + return true; > I don't understand why this is relevant. Do you need to read back from the TX > FIFO? This change was inspired by c66234cfedfc "ASoC: rockchip: i2s: fix playback after runtime resume" [1] On resume a cached sample would be written to FIFO_TX_REG unless it is marked volatile, the rockchip commit indicated that read is needed for volatile regs. [1] https://github.com/torvalds/linux/commit/c66234cfedfc3e6e3b62563a5f2c1562be09a35d > > If so, just drop the call-back altogether. If no callback is provided and no > rd_table is provided either, it defaults to all registers under max_register > (if max_register < 0) are readable. > > However this seems like it deserves to be a separate patch (where you explain > in the commit log why it's needed). > > Regards > ChenYu > >> } >> >> static bool sun4i_i2s_wr_reg(struct device *dev, unsigned int reg) >> @@ -728,6 +726,8 @@ static bool sun4i_i2s_volatile_reg(struct device *dev, >> unsigned int reg) >> { >> switch (reg) { >> case SUN4I_I2S_FIFO_RX_REG: >> + case SUN4I_I2S_FIFO_TX_REG: >> + case SUN4I_I2S_FIFO_STA_REG: >> case SUN4I_I2S_INT_STA_REG: >> case SUN4I_I2S_RX_CNT_REG: >> case SUN4I_I2S_TX_CNT_REG: >> @@ -738,23 +738,12 @@ static bool sun4i_i2s_volatile_reg(struct device *dev, >> unsigned int reg) >> } >> } >> >> -static bool sun8i_i2s_rd_reg(struct device *dev, unsigned int reg) >> -{ >> - switch (reg) { >> - case SUN8I_I2S_FIFO_TX_REG: >> - return false; >> - >> - default: >> - return true; >> - } >> -} >> - >> static bool sun8i_i2s_volatile_reg(struct device *dev, unsigned int reg) >> { >> if (reg == SUN8I_I2S_INT_STA_REG) >> return true; >> if (reg == SUN8I_I2S_FIFO_TX_REG) >> - return false; >> + return true; >> >> return sun4i_i2s_volatile_reg(dev, reg); >>
Re: [PATCH v2 01/12] fs-verity: add a documentation file
On Fri, Dec 21, 2018 at 11:13:07AM -0800, Linus Torvalds wrote: > > I do agree that your particular model is pretty damn broken in lots of ways. > > Why is it filesystem specific? If the whole point is that the file > itself has its own verification data (which I like), then I don't see > why this is then documented as some filesystem-specific layout model. > That's complete and utter garbage. > > In other words: either the model is that the file *itself* contains > its own merkle tree that validates the file, or it isn't. You can't > have it two ways. No silly "layout changes when you apply the hash" > garbage. That's just crazy talk and invalidates the whole model. Userspace applications which are reading the file aren't going to be expecting Merkle tree. For example, one of the use cases is Android APK files, which are essentially ZIP files. ZIP files can be parsed both from the front-end (streaming), or by looking for the complete directory of all of the files in the ZIP file by starting at the end of the file and moving backwards. If the Merkle tree was visible to userspace programs that are opening and reading the file, it would confuse them mightily. So what we do for ext4 and f2fs is make the Merkle tree invisible; if userspace stats the file, st_size will return size of the original "data" file, and reading beyond the st_size from userspace will behave like reading beyond EOF. From the *file system's* perspective, though, the metadata blocks are part of the file. There's just a difference between the userspace visible EOF and the file system's conception of EOF. I don't consider this a "layout change", and I personally believe this should be just *fine* for all file systems. The XFS developers are convinced that this is horrific, and no one sane should do this. OK, call me insane. But it works, and I think it's elegant and clean. So if *they* want to use some other layout, where the Merkle blocks are stored in some Alternate Data Stream, ala NTFS --- they are *free* to do that. It will require more work, and at that point, it will require a layout change. But it's Dave and Christoph who are insisting on doing that; not me! > And honestly, I still think that it's very odd to add the merge data > to the end, when the filesystem already supports xattrs. It would have > made much more sense to just make one xattr contain the merkle tree > validation data. The problem is that xattrs are designed to be accessed via a set/get interface, are currently limited, IIRC at 32k. The max size of an APK is 300 megabytes; and the Merkle tree for a file that size will be about 2.3 megabytes. That's way too big to store as an xattr; certainly using the existing xattr interfaces. And it's also bigger than most file systems can handle as xattrs today --- because they've been optimzied for relatively small sizes, for things like SELinux labels and ACL structures. > So why is this sold as some unholy mess of "filesystem-specific" and > "generic"? That part just annoys the hell out of me. Why isn't this > sold as an *actual* generic model, where you just say "append the > merkle tree to the file, then enable verity testing of the end result > and validate the top-level hash". That was the original way it was sold, but Cristoph and Dave have NACK'ed it in that form. The common fsverity code which is generic to ext4 and f2fs does treat it that way, with the note that we "lie" to userspace about is the size of the file and where the EOF is. Dave and Cristoph have declaimed strongly that this is this layout choice is horrible, and filesystem specific, and XFS could never do it that way. I don't understand why, but they are the XFS experts. So if they want to do something else, what I've been trying to point out is that they can do that, using the existing interface. > So what's the excuse for doing the crazy odd "let's just support one > single filesystem" model? Android devices use both ext4 and f2fs; it's the manufacturer's choice. So we wanted fs-verity to support both. And we didn't want to duplicate code across ext4 and f2fs; hence trying to put common code in fs/verity. So we aren't supporting one file system out of the gate; we're supporting two. Whether XFS wants to implement fs-verity is purely XFS's choice. XFS has chosen not to support fscrypt, which is currently used by ext4, f2fs, and ubifs, and both fscrypt's and fs-verity's initial use case has been for Android, which is not an area where XFS has proven to be a common choice. So I was not really expecting that they would have any interest in fs-verity. But they seem to have very strong opinions about how they would want to implement it, and it's different from what we have in the current "generic code shared by ext4 and f2fs". I was trying to show that even if they wanted to do things in this different way --- and I don't understand why it's so important to them --- it would be possible to do so. Cheers,
net: hns: question regarding ae_node device node refcounting
Hello, hns_nic_dev_probe() increments refcount of ae_node device node: ae_node = of_parse_phandle(dev->of_node, "ae-handle", 0); But there is no of_node_put() for ae_node. What is the right place to decrement the ae_node refount? Should it be placed in hns_nic_dev_probe() or in hns_nic_dev_remove()? Or may be it is managed by fwnode somehow? -- Alexey Khoroshilov Linux Verification Center, ISPRAS web: http://linuxtesting.org
Re: [PATCH] lightnvm: pblk: fix use-after-free bug
On 12/22/18 11:30 AM, Matias Bjørling wrote: > On 12/22/18 8:39 AM, Gustavo A. R. Silva wrote: >> Remove one of the calls to function bio_put(), so *bio* is only >> freed once. >> >> Notice that bio is being dereferenced in bio_put(), hence leading to >> a use-after-free bug once *bio* has already been freed. >> >> Addresses-Coverity-ID: 1475952 ("Use after free") >> Fixes: 55d8ec35398e ("lightnvm: pblk: support packed metadata") >> Signed-off-by: Gustavo A. R. Silva >> --- >> drivers/lightnvm/pblk-recovery.c | 1 - >> 1 file changed, 1 deletion(-) >> >> diff --git a/drivers/lightnvm/pblk-recovery.c >> b/drivers/lightnvm/pblk-recovery.c >> index 3fcf062d752c..5ee20da7bdb3 100644 >> --- a/drivers/lightnvm/pblk-recovery.c >> +++ b/drivers/lightnvm/pblk-recovery.c >> @@ -418,7 +418,6 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct >> pblk_line *line, >> if (ret) { >> pblk_err(pblk, "I/O submission failed: %d\n", ret); >> bio_put(bio); >> -bio_put(bio); >> return ret; >> } >> >> > > Thanks Gustavo. I missed that one. > > Jens, if possible could you please pick this up? Yep, added for the later pull. -- Jens Axboe
[PATCH] binderfs: implement "max" mount option
Since binderfs can be mounted by userns root in non-initial user namespaces some precautions are in order. First, a way to set a maximum on the number of binder devices that can be allocated per binderfs instance and second, a way to reserve a reasonable chunk of binderfs devices for the initial ipc namespace. A first approach as seen in [1] used sysctls similiar to devpts but was shown to be flawed (cf. [2] and [3]) since some aspects were unneeded. This is an alternative approach which avoids sysctls completely and instead switches to a single mount option. Starting with this commit binderfs instances can be mounted with a limit on the number of binder devices that can be allocated. The max= mount option serves as a per-instance limit. If max= is set then only number of binder devices can be allocated in this binderfs instance. Additionally, the binderfs instance in the initial ipc namespace will always have a reserve of at least 1024 binder devices unless explicitly capped via max=. This allows to safely bind-mount binderfs instances into unprivileged user namespaces since userns root in a non-initial user namespace cannot change the mount option as long as it does not own the mount namespace the binderfs mount was created in and hence cannot drain the host of minor device numbers [1]: https://lore.kernel.org/lkml/20181221133909.18794-1-christ...@brauner.io/ [2]; https://lore.kernel.org/lkml/20181221163316.ga8...@kroah.com/ [3]: https://lore.kernel.org/lkml/cahrssex+gdvw4fkkk8oznair9g5icjlyodo8hykv3o0o1jt...@mail.gmail.com/ [4]: https://lore.kernel.org/lkml/20181221192044.5yvfnuri7gdop...@brauner.io/ Cc: Todd Kjos Cc: Greg Kroah-Hartman Signed-off-by: Christian Brauner --- drivers/android/binderfs.c | 109 +++-- 1 file changed, 104 insertions(+), 5 deletions(-) diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c index 7496b10532aa..93aee00994d4 100644 --- a/drivers/android/binderfs.c +++ b/drivers/android/binderfs.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -39,6 +40,11 @@ #define INODE_OFFSET 3 #define INTSTRLEN 21 #define BINDERFS_MAX_MINOR (1U << MINORBITS) +/* + * Ensure that the initial ipc namespace always has a good chunk of devices + * available. + */ +#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 1024) static struct vfsmount *binderfs_mnt; @@ -46,6 +52,24 @@ static dev_t binderfs_dev; static DEFINE_MUTEX(binderfs_minors_mutex); static DEFINE_IDA(binderfs_minors); +/** + * binderfs_mount_opts - mount options for binderfs + * @max: maximum number of allocatable binderfs binder devices + */ +struct binderfs_mount_opts { + int max; +}; + +enum { + Opt_max, + Opt_err +}; + +static const match_table_t tokens = { + { Opt_max, "max=%d" }, + { Opt_err, NULL } +}; + /** * binderfs_info - information about a binderfs mount * @ipc_ns: The ipc namespace the binderfs mount belongs to. @@ -55,13 +79,16 @@ static DEFINE_IDA(binderfs_minors); * created. * @root_gid: gid that needs to be used when a new binder device is * created. + * @mount_opts: The mount options in use. + * @device_count: The current number of allocated binder devices. */ struct binderfs_info { struct ipc_namespace *ipc_ns; struct dentry *control_dentry; kuid_t root_uid; kgid_t root_gid; - + struct binderfs_mount_opts mount_opts; + atomic_t device_count; }; static inline struct binderfs_info *BINDERFS_I(const struct inode *inode) @@ -107,13 +134,22 @@ static int binderfs_binder_device_create(struct inode *ref_inode, struct inode *inode = NULL; struct super_block *sb = ref_inode->i_sb; struct binderfs_info *info = sb->s_fs_info; + bool use_reserve = (info->ipc_ns == _ipc_ns); /* Reserve new minor number for the new device. */ mutex_lock(_minors_mutex); - minor = ida_alloc_max(_minors, BINDERFS_MAX_MINOR, GFP_KERNEL); + if (atomic_inc_return(>device_count) < info->mount_opts.max) + minor = ida_alloc_max(_minors, + use_reserve ? BINDERFS_MAX_MINOR : + BINDERFS_MAX_MINOR_CAPPED, + GFP_KERNEL); + else + minor = -ENOSPC; mutex_unlock(_minors_mutex); - if (minor < 0) + if (minor < 0) { + atomic_dec(>device_count); return minor; + } ret = -ENOMEM; device = kzalloc(sizeof(*device), GFP_KERNEL); @@ -187,6 +223,7 @@ static int binderfs_binder_device_create(struct inode *ref_inode, kfree(name); kfree(device); mutex_lock(_minors_mutex); + atomic_dec(>device_count); ida_free(_minors, minor); mutex_unlock(_minors_mutex); iput(inode);
[tip:efi/core] x86/efi: Don't unmap EFI boot services code/data regions for EFI_OLD_MEMMAP and EFI_MIXED_MODE
Commit-ID: 1debf0958fa27b7c469dbf22754929ec59a7c0e7 Gitweb: https://git.kernel.org/tip/1debf0958fa27b7c469dbf22754929ec59a7c0e7 Author: Sai Praneeth Prakhya AuthorDate: Fri, 21 Dec 2018 18:22:34 -0800 Committer: Ingo Molnar CommitDate: Sat, 22 Dec 2018 20:58:30 +0100 x86/efi: Don't unmap EFI boot services code/data regions for EFI_OLD_MEMMAP and EFI_MIXED_MODE The following commit: d5052a7130a6 ("x86/efi: Unmap EFI boot services code/data regions from efi_pgd") forgets to take two EFI modes into consideration, namely EFI_OLD_MEMMAP and EFI_MIXED_MODE: - EFI_OLD_MEMMAP is a legacy way of mapping EFI regions into swapper_pg_dir using ioremap() and init_memory_mapping(). This feature can be enabled by passing "efi=old_map" as kernel command line argument. But, efi_unmap_pages() unmaps EFI boot services code/data regions *only* from efi_pgd and hence cannot be used for unmapping EFI boot services code/data regions from swapper_pg_dir. Introduce a temporary fix to not unmap EFI boot services code/data regions when EFI_OLD_MEMMAP is enabled while working on a real fix. - EFI_MIXED_MODE is another feature where a 64-bit kernel runs on a 64-bit platform crippled by a 32-bit firmware. To support EFI_MIXED_MODE, all RAM (i.e. namely EFI regions like EFI_CONVENTIONAL_MEMORY, EFI_LOADER_, EFI_BOOT_SERVICES_ and EFI_RUNTIME_CODE/DATA regions) is mapped into efi_pgd all the time to facilitate EFI runtime calls access it's arguments in 1:1 mode. Hence, don't unmap EFI boot services code/data regions when booted in mixed mode. Signed-off-by: Sai Praneeth Prakhya Acked-by: Ard Biesheuvel Cc: Andy Lutomirski Cc: Bhupesh Sharma Cc: Borislav Petkov Cc: Dave Hansen Cc: Dave Hansen Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rik van Riel Cc: Thomas Gleixner Cc: linux-...@vger.kernel.org Link: http://lkml.kernel.org/r/20181222022234.7573-1-sai.praneeth.prak...@intel.com Signed-off-by: Ingo Molnar --- arch/x86/platform/efi/quirks.c | 16 1 file changed, 16 insertions(+) diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c index 09e811b9da26..17456a1d3f04 100644 --- a/arch/x86/platform/efi/quirks.c +++ b/arch/x86/platform/efi/quirks.c @@ -380,6 +380,22 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md) u64 pa = md->phys_addr; u64 va = md->virt_addr; + /* +* To Do: Remove this check after adding functionality to unmap EFI boot +* services code/data regions from direct mapping area because +* "efi=old_map" maps EFI regions in swapper_pg_dir. +*/ + if (efi_enabled(EFI_OLD_MEMMAP)) + return; + + /* +* EFI mixed mode has all RAM mapped to access arguments while making +* EFI runtime calls, hence don't unmap EFI boot services code/data +* regions. +*/ + if (!efi_is_native()) + return; + if (kernel_unmap_pages_in_pgd(pgd, pa, md->num_pages)) pr_err("Failed to unmap 1:1 mapping for 0x%llx\n", pa);