BUG: MAX_LOCKDEP_CHAINS too low!

2018-09-27 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:    c307aaf3eb47 Merge tag 'iommu-fixes-v4.19-rc5' of git://gi..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13810df140
kernel config:  https://syzkaller.appspot.com/x/.config?x=dfb440e26f0a6f6f
dashboard link: https://syzkaller.appspot.com/bug?extid=aaa6fa4949cc5d9b7b25
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+aaa6fa4949cc5d9b7...@syzkaller.appspotmail.com

BUG: MAX_LOCKDEP_CHAINS too low!
turning off the locking correctness validator.
CPU: 0 PID: 9480 Comm: syz-executor4 Not tainted 4.19.0-rc5+ #256
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
 add_chain_cache kernel/locking/lockdep.c:2254 [inline]
 lookup_chain_cache_add kernel/locking/lockdep.c:2366 [inline]
 validate_chain kernel/locking/lockdep.c:2386 [inline]
 __lock_acquire.cold.61+0x337/0x482 kernel/locking/lockdep.c:3411
 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
 debug_object_activate+0x1d4/0x600 lib/debugobjects.c:473
 debug_hrtimer_activate kernel/time/hrtimer.c:416 [inline]
 debug_activate kernel/time/hrtimer.c:465 [inline]
 enqueue_hrtimer+0x97/0x560 kernel/time/hrtimer.c:954
 __hrtimer_start_range_ns kernel/time/hrtimer.c:1089 [inline]
 hrtimer_start_range_ns+0x640/0xe00 kernel/time/hrtimer.c:1115
 hrtimer_start include/linux/hrtimer.h:398 [inline]
 perf_swevent_start_hrtimer.part.74+0x19a/0x260 kernel/events/core.c:9145
 perf_swevent_start_hrtimer kernel/events/core.c:9133 [inline]
 cpu_clock_event_start+0x127/0x180 kernel/events/core.c:9203
 cpu_clock_event_add+0x4d/0x50 kernel/events/core.c:9215
 event_sched_in.isra.107+0x43c/0xe40 kernel/events/core.c:2279
 group_sched_in+0xe4/0x400 kernel/events/core.c:2315
 flexible_sched_in+0x792/0xc70 kernel/events/core.c:3309
 visit_groups_merge+0x380/0x6c0 kernel/events/core.c:3257
 ctx_flexible_sched_in kernel/events/core.c:3343 [inline]
 ctx_sched_in+0x392/0x790 kernel/events/core.c:3388
 perf_event_sched_in+0x6d/0xa0 kernel/events/core.c:2424
 perf_event_context_sched_in kernel/events/core.c:3428 [inline]
 __perf_event_task_sched_in+0x859/0xb60 kernel/events/core.c:3467
 perf_event_task_sched_in include/linux/perf_event.h:1095 [inline]
 finish_task_switch+0x366/0x900 kernel/sched/core.c:2673
 context_switch kernel/sched/core.c:2828 [inline]
 __schedule+0x874/0x1ed0 kernel/sched/core.c:3473
 preempt_schedule_irq+0x87/0x110 kernel/sched/core.c:3700
 retint_kernel+0x1b/0x2d
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:788 [inline]
RIP: 0010:qlink_free mm/kasan/quarantine.c:150 [inline]
RIP: 0010:qlist_free_all+0xf8/0x140 mm/kasan/quarantine.c:166
Code: 40 10 00 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 b8  
8f c2 ff 48 83 3d 78 da af 07 00 74 46 48 8b 7d d0 57 9d <0f> 1f 44 00 00  
eb ae 48 89 df e8 09 1c 76 ff 48 b9 00 00 00 00 00

RSP: 0018:88019362ebb8 EFLAGS: 0286 ORIG_RAX: ff13
RAX:  RBX: 8801ac0a6a10 RCX: 110036f289f9
RDX:  RSI: 8801b7944fd0 RDI: 0286
RBP: 88019362ebf0 R08: 8801b7944fc8 R09: 0006
R10:  R11: 8801b7944700 R12: 
R13: 8801da94c500 R14: 88018e1a5f50 R15: 89723ac0
 quarantine_reduce+0x163/0x1a0 mm/kasan/quarantine.c:259
 kasan_kmalloc+0x9b/0xe0 mm/kasan/kasan.c:538
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
 slab_post_alloc_hook mm/slab.h:444 [inline]
 slab_alloc mm/slab.c:3392 [inline]
 __do_kmalloc mm/slab.c:3716 [inline]
 __kmalloc_track_caller+0x133/0x750 mm/slab.c:3733
 kstrdup+0x39/0x70 mm/util.c:56
 kstrdup_const+0x66/0x80 mm/util.c:77
 __kernfs_new_node+0xe8/0x8d0 fs/kernfs/dir.c:630
 kernfs_new_node+0x95/0x120 fs/kernfs/dir.c:695
 kernfs_create_link+0xdb/0x250 fs/kernfs/symlink.c:40
 sysfs_do_create_link_sd.isra.2+0x90/0x130 fs/sysfs/symlink.c:43
 sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
 sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
 driver_sysfs_add+0x109/0x350 drivers/base/dd.c:370
 device_bind_driver+0x19/0xd0 drivers/base/dd.c:424
 mac80211_hwsim_new_radio+0x48b/0x3570 drivers/net/wireless/mac80211_hwsim.c:2688
 hwsim_new_radio_nl+0x7dc/0xb20 drivers/net/wireless/mac80211_hwsim.c:3376
 genl_family_rcv_msg+0x8a9/0x1140 net/netlink/genetlink.c:601
 genl_rcv_msg+0xc6/0x168 net/netlink/genetlink.c:626
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x5a5/0x760 

Re: [PATCH] mfd: arizona: Correct link for sound binding document

2018-09-27 Thread Lee Jones
On Wed, 26 Sep 2018, Rob Herring wrote:

> On Mon, Sep 17, 2018 at 04:33:22PM +0100, Charles Keepax wrote:
> > Signed-off-by: Charles Keepax 
> > ---
> >  Documentation/devicetree/bindings/mfd/arizona.txt | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Applied.

Probably won't do any harm in this instance, but it's usually better
for MFD binding changes to go through the MFD tree to avoid
merge-conflicts.

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v3 2/2] vfio: add edid support to mbochs sample driver

2018-09-27 Thread Gerd Hoffmann
> > +   case MBOCHS_EDID_REGION_INDEX:
> > +   ext->base.argsz = sizeof(*ext);
> > +   ext->base.offset = MBOCHS_EDID_OFFSET;
> > +   ext->base.size = MBOCHS_EDID_SIZE;
> > +   ext->base.flags = (VFIO_REGION_INFO_FLAG_READ  |
> > +  VFIO_REGION_INFO_FLAG_WRITE |
> > +  VFIO_REGION_INFO_FLAG_CAPS);
> 
> Any reason to not to use _MMAP flag?

There is no page backing this.  Also it is not performance-critical,
edid updates should be rare, so the extra code for mmap support doesn't
look like it is worth it.

Also for the virtual registers (especially link_state) it is probably
useful to have the write callback of the mdev driver called to get
notified about the change.

> How would QEMU side code read this region? will it be always trapped?

qemu uses read & write syscalls (well, pread & pwrite actually).

> If vendor driver sets _MMAP flag, will QEMU side handle that case as well?

The current test branch doesn't, it expects read+write to work.
  https://git.kraxel.org/cgit/qemu/log/?h=sirius/edid-vfio

> I think since its blob, edid could be read by QEMU using one memcpy
> rather than adding multiple memcpy of 4 or 8 bytes.

From qemu it's a single pwrite syscall actually.  mbochs_write() splits
it into 4 byte writes and calls mbochs_access() for each of them.  One
could probably add a special case for the EDID blob to mbochs_write().
But again: doesn't seem worth the effort given that edid updates should
be a rare event.

cheers,
  Gerd
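
The splitting described in the reply above can be sketched roughly as
follows. This is only an illustration, not the mbochs code: the names
sketch_write() and sketch_access(), the static region buffer, and its
size are all hypothetical, and the real mbochs_write()/mbochs_access()
take different arguments and operate on mdev state. The tail handling
here (one shorter final chunk) is also a simplification.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_REGION_SIZE 1024    /* hypothetical region size */

static unsigned char region[SKETCH_REGION_SIZE];   /* stands in for the EDID blob */
static unsigned int access_calls;                  /* counts per-chunk handler calls */

/* Hypothetical per-access handler: stores at most 4 bytes at 'pos'. */
static int sketch_access(size_t pos, const unsigned char *buf, size_t len)
{
    if (len > 4 || pos > SKETCH_REGION_SIZE || len > SKETCH_REGION_SIZE - pos)
        return -1;
    memcpy(region + pos, buf, len);
    access_calls++;
    return 0;
}

/* Splits one large write into 4-byte accesses, one handler call per chunk. */
static long sketch_write(size_t pos, const unsigned char *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        size_t chunk = count - done;

        if (chunk > 4)
            chunk = 4;
        if (sketch_access(pos + done, buf + done, chunk))
            return done ? (long)done : -1;
        done += chunk;
    }
    return (long)done;
}
```

Under these assumptions a 128-byte blob written in one call costs 32
handler invocations, which is the overhead being weighed against the
extra code a special case would need.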



[PATCH 3/4] infiniband/mm: convert to the new put_user_page() call

2018-09-27 Thread john . hubbard
From: John Hubbard 

For code that retains pages via get_user_pages*(),
release those pages via the new put_user_page(),
instead of put_page().
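
As a rough userspace model of the call-site change described above
(not kernel code): the release loops keep their shape, and only the
final put changes name. struct pinned_page and the mock_* helpers are
hypothetical stand-ins for struct page, set_page_dirty_lock() and
put_user_page().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct pinned_page {
    int refcount;
    bool dirty;
};

static void mock_set_page_dirty(struct pinned_page *page)
{
    page->dirty = true;
}

static void mock_put_user_page(struct pinned_page *page)
{
    page->refcount--;
}

/* Same shape as the release loops in the converted drivers. */
static void mock_release_pinned(struct pinned_page **p, size_t npages, bool dirty)
{
    size_t i;

    for (i = 0; i < npages; i++) {
        if (dirty)
            mock_set_page_dirty(p[i]);
        mock_put_user_page(p[i]);    /* previously a plain put of the page */
    }
}
```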

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubb...@nvidia.com
Proposed steps for fixing get_user_pages() + DMA problems.

CC: Doug Ledford 
CC: Jason Gunthorpe 
CC: Mike Marciniszyn 
CC: Dennis Dalessandro 
CC: Christian Benvenuti 

CC: linux-r...@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux...@kvack.org
Signed-off-by: John Hubbard 
---
 drivers/infiniband/core/umem.c  | 2 +-
 drivers/infiniband/core/umem_odp.c  | 2 +-
 drivers/infiniband/hw/hfi1/user_pages.c | 2 +-
 drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
 drivers/infiniband/hw/qib/qib_user_pages.c  | 2 +-
 drivers/infiniband/hw/qib/qib_user_sdma.c   | 8 
 drivers/infiniband/hw/usnic/usnic_uiom.c| 2 +-
 7 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index a41792dbae1f..9430d697cb9f 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -60,7 +60,7 @@ static void __ib_umem_release(struct ib_device *dev, struct 
ib_umem *umem, int d
page = sg_page(sg);
if (!PageDirty(page) && umem->writable && dirty)
set_page_dirty_lock(page);
-   put_page(page);
+   put_user_page(page);
}
 
sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/core/umem_odp.c 
b/drivers/infiniband/core/umem_odp.c
index 6ec748eccff7..6227b89cf05c 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -717,7 +717,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem *umem, u64 
user_virt, u64 bcnt,
ret = -EFAULT;
break;
}
-   put_page(local_page_list[j]);
+   put_user_page(local_page_list[j]);
continue;
}
 
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c 
b/drivers/infiniband/hw/hfi1/user_pages.c
index e341e6dcc388..c7516029af33 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -126,7 +126,7 @@ void hfi1_release_user_pages(struct mm_struct *mm, struct 
page **p,
for (i = 0; i < npages; i++) {
if (dirty)
set_page_dirty_lock(p[i]);
-   put_page(p[i]);
+   put_user_page(p[i]);
}
 
if (mm) { /* during close after signal, mm can be NULL */
diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c 
b/drivers/infiniband/hw/mthca/mthca_memfree.c
index cc9c0c8ccba3..b8b12effd009 100644
--- a/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ b/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -481,7 +481,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct 
mthca_uar *uar,
 
ret = pci_map_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
if (ret < 0) {
-   put_page(pages[0]);
+   put_user_page(pages[0]);
goto out;
}
 
@@ -489,7 +489,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct 
mthca_uar *uar,
 mthca_uarc_virt(dev, uar, i));
if (ret) {
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, 
PCI_DMA_TODEVICE);
-   put_page(sg_page(&db_tab->page[i].mem));
+   put_user_page(sg_page(&db_tab->page[i].mem));
goto out;
}
 
@@ -555,7 +555,7 @@ void mthca_cleanup_user_db_tab(struct mthca_dev *dev, 
struct mthca_uar *uar,
if (db_tab->page[i].uvirt) {
mthca_UNMAP_ICM(dev, mthca_uarc_virt(dev, uar, i), 1);
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, 
PCI_DMA_TODEVICE);
-   put_page(sg_page(&db_tab->page[i].mem));
+   put_user_page(sg_page(&db_tab->page[i].mem));
}
}
 
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c 
b/drivers/infiniband/hw/qib/qib_user_pages.c
index 16543d5e80c3..3f8fd42dd7fc 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -45,7 +45,7 @@ static void __qib_release_user_pages(struct page **p, size_t 
num_pages,
for (i = 0; i < num_pages; i++) {
if (dirty)
set_page_dirty_lock(p[i]);
-   put_page(p[i]);
+   put_user_page(p[i]);
}
 }
 
diff --git a/drivers/infiniband/hw/qib/qib_user_sdma.c 
b/drivers/infiniband/hw/qib/qib_user_sdma.c
index 926f3c8eba69..14f94d823907 

[PATCH 2/4] mm: introduce put_user_page(), placeholder version

2018-09-27 Thread john . hubbard
From: John Hubbard 

Introduces put_user_page(), which simply calls put_page().
This provides a way to update all get_user_pages*() callers,
so that they call put_user_page(), instead of put_page().

Also adds release_user_pages(), a drop-in replacement for
release_pages(). This is intended to be easily grep-able,
for later performance improvements, since release_user_pages
is not batched like release_pages() is, and is significantly
slower.

Also: rename goldfish_pipe.c's release_user_pages(), in order
to avoid a naming conflict with the new external function of
the same name.

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubb...@nvidia.com
Proposed steps for fixing get_user_pages() + DMA problems.

CC: Matthew Wilcox 
CC: Michal Hocko 
CC: Christopher Lameter 
CC: Jason Gunthorpe 
CC: Dan Williams 
CC: Jan Kara 
CC: Al Viro 
Signed-off-by: John Hubbard 
---
 drivers/platform/goldfish/goldfish_pipe.c |  4 ++--
 include/linux/mm.h| 14 ++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/goldfish/goldfish_pipe.c 
b/drivers/platform/goldfish/goldfish_pipe.c
index 2da567540c2d..fad0345376e0 100644
--- a/drivers/platform/goldfish/goldfish_pipe.c
+++ b/drivers/platform/goldfish/goldfish_pipe.c
@@ -332,7 +332,7 @@ static int pin_user_pages(unsigned long first_page, 
unsigned long last_page,
 
 }
 
-static void release_user_pages(struct page **pages, int pages_count,
+static void __release_user_pages(struct page **pages, int pages_count,
int is_write, s32 consumed_size)
 {
int i;
@@ -410,7 +410,7 @@ static int transfer_max_buffers(struct goldfish_pipe *pipe,
 
*consumed_size = pipe->command_buffer->rw_params.consumed_size;
 
-   release_user_pages(pages, pages_count, is_write, *consumed_size);
+   __release_user_pages(pages, pages_count, is_write, *consumed_size);
 
mutex_unlock(&pipe->lock);
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a61ebe8ad4ca..72caf803115f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -943,6 +943,20 @@ static inline void put_page(struct page *page)
__put_page(page);
 }
 
+/* Placeholder version, until all get_user_pages*() callers are updated. */
+static inline void put_user_page(struct page *page)
+{
+   put_page(page);
+}
+
+/* A drop-in replacement for release_pages(): */
+static inline void release_user_pages(struct page **pages,
+ unsigned long npages)
+{
+   while (npages)
+   put_user_page(pages[--npages]);
+}
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
-- 
2.19.0
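
For readers who want to poke at the two placeholder helpers outside
the kernel, here is a minimal userspace model. It is a sketch under
stated assumptions: struct mock_page and the mock_* functions are
hypothetical, and the reference count is a plain int rather than the
kernel's atomic page refcount.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct mock_page {
    int refcount;
};

static void mock_put_page(struct mock_page *page)
{
    page->refcount--;
}

/* Placeholder wrapper with the same shape as put_user_page() in the patch. */
static void mock_put_user_page(struct mock_page *page)
{
    mock_put_page(page);
}

/* Unbatched release: one put per page, walking the array from the end. */
static void mock_release_user_pages(struct mock_page **pages,
                                    unsigned long npages)
{
    while (npages)
        mock_put_user_page(pages[--npages]);
}
```

Each page gets exactly one put, with no batching across pages, which
is the property the commit message calls out as slower than
release_pages().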



[PATCH 0/4] get_user_pages*() and RDMA: first steps

2018-09-27 Thread john . hubbard
From: John Hubbard 

Hi,

This short series prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2].

I'd like to get the first two patches into the -mm tree.

Patch 1, although not technically critical to do now, is still nice to have,
because it's already been reviewed by Jan, and it's just one more thing on the
long TODO list here, that is ready to be checked off.

Patch 2 is required in order to allow me (and others, if I'm lucky) to start
submitting changes to convert all of the callsites of get_user_pages*() and
put_page().  I think this will work a lot better than trying to maintain a
massive patchset and submitting all at once.

Patch 3 converts infiniband drivers: put_page() --> put_user_page(). I picked
a fairly small and easy example.

Patch 4 converts a small driver from put_page() --> release_user_pages(). This
could just as easily have been done as a change from put_page() to
put_user_page(). The reason I did it this way is that this provides a small and
simple caller of the new release_user_pages() routine. I wanted both of the
new routines, even though just placeholders, to have callers.

Once these are all in, then the floodgates can open up to convert the large
number of get_user_pages*() callsites.

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubb...@nvidia.com
Proposed steps for fixing get_user_pages() + DMA problems.

CC: Al Viro 
CC: Christian Benvenuti 
CC: Christopher Lameter 
CC: Dan Williams 
CC: Dennis Dalessandro 
CC: Doug Ledford 
CC: Jan Kara 
CC: Jason Gunthorpe 
CC: Matthew Wilcox 
CC: Michal Hocko 
CC: Mike Marciniszyn 
CC: linux-kernel@vger.kernel.org
CC: linux...@kvack.org
CC: linux-r...@vger.kernel.org

John Hubbard (4):
  mm: get_user_pages: consolidate error handling
  mm: introduce put_user_page(), placeholder version
  infiniband/mm: convert to the new put_user_page() call
  goldfish_pipe/mm: convert to the new release_user_pages() call

 drivers/infiniband/core/umem.c  |  2 +-
 drivers/infiniband/core/umem_odp.c  |  2 +-
 drivers/infiniband/hw/hfi1/user_pages.c |  2 +-
 drivers/infiniband/hw/mthca/mthca_memfree.c |  6 ++--
 drivers/infiniband/hw/qib/qib_user_pages.c  |  2 +-
 drivers/infiniband/hw/qib/qib_user_sdma.c   |  8 ++---
 drivers/infiniband/hw/usnic/usnic_uiom.c|  2 +-
 drivers/platform/goldfish/goldfish_pipe.c   |  7 ++--
 include/linux/mm.h  | 14 
 mm/gup.c| 37 -
 10 files changed, 52 insertions(+), 30 deletions(-)

-- 
2.19.0



[PATCH 1/4] mm: get_user_pages: consolidate error handling

2018-09-27 Thread john . hubbard
From: John Hubbard 

An upcoming patch requires a way to operate on each page that
any of the get_user_pages_*() variants returns.

In preparation for that, consolidate the error handling for
__get_user_pages(). This provides a single location (the "out:" label)
for operating on the collected set of pages that are about to be returned.

As long as every use of the "ret" variable is being edited, rename
"ret" --> "err", so that its name matches its true role.
This also gets rid of two shadowed variable declarations, as a
tiny beneficial side effect.
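
The consolidation described above can be modelled in a few lines of
standalone C. This is a sketch, not mm/gup.c: sketch_collect() and its
fail_at parameter are hypothetical, and it assumes the "out:" label
returns the number of items collected when that is nonzero and the
error otherwise, as the replaced "return i ? i : ret" statements did.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model: collect up to 'nr' items into 'out'. 'fail_at' is
 * the index at which a lookup fails (pass a value >= nr for no failure).
 */
static long sketch_collect(int *out, long nr, long fail_at)
{
    long i = 0;
    int err = 0;

    if (nr <= 0)
        return 0;

    do {
        if (i == fail_at) {
            err = -EFAULT;
            goto out;
        }
        out[i] = (int)i;
        i++;
    } while (i < nr);
out:
    /* Single exit point: out[0..i) could be post-processed here. */
    return i ? i : err;
}
```

With one exit path, any later per-item work only has to be added at
the label instead of at every early return.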

Reviewed-by: Jan Kara 
Signed-off-by: John Hubbard 
---
 mm/gup.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 1abc8b4afff6..05ee7c18e59a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -660,6 +660,7 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
struct vm_area_struct **vmas, int *nonblocking)
 {
long i = 0;
+   int err = 0;
unsigned int page_mask;
struct vm_area_struct *vma = NULL;
 
@@ -685,18 +686,19 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
if (!vma || start >= vma->vm_end) {
vma = find_extend_vma(mm, start);
if (!vma && in_gate_area(mm, start)) {
-   int ret;
-   ret = get_gate_page(mm, start & PAGE_MASK,
+   err = get_gate_page(mm, start & PAGE_MASK,
gup_flags, &vma,
pages ? &pages[i] : NULL);
-   if (ret)
-   return i ? : ret;
+   if (err)
+   goto out;
page_mask = 0;
goto next_page;
}
 
-   if (!vma || check_vma_flags(vma, gup_flags))
-   return i ? : -EFAULT;
+   if (!vma || check_vma_flags(vma, gup_flags)) {
+   err = -EFAULT;
+   goto out;
+   }
if (is_vm_hugetlb_page(vma)) {
i = follow_hugetlb_page(mm, vma, pages, vmas,
&start, &nr_pages, i,
@@ -709,23 +711,25 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
 * If we have a pending SIGKILL, don't keep faulting pages and
 * potentially allocating memory.
 */
-   if (unlikely(fatal_signal_pending(current)))
-   return i ? i : -ERESTARTSYS;
+   if (unlikely(fatal_signal_pending(current))) {
+   err = -ERESTARTSYS;
+   goto out;
+   }
cond_resched();
page = follow_page_mask(vma, start, foll_flags, &page_mask);
if (!page) {
-   int ret;
-   ret = faultin_page(tsk, vma, start, &foll_flags,
+   err = faultin_page(tsk, vma, start, &foll_flags,
nonblocking);
-   switch (ret) {
+   switch (err) {
case 0:
goto retry;
case -EFAULT:
case -ENOMEM:
case -EHWPOISON:
-   return i ? i : ret;
+   goto out;
case -EBUSY:
-   return i;
+   err = 0;
+   goto out;
case -ENOENT:
goto next_page;
}
@@ -737,7 +741,8 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
 */
goto next_page;
} else if (IS_ERR(page)) {
-   return i ? i : PTR_ERR(page);
+   err = PTR_ERR(page);
+   goto out;
}
if (pages) {
pages[i] = page;
@@ -757,7 +762,9 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
start += page_increm * PAGE_SIZE;
nr_pages -= page_increm;
} while (nr_pages);
-   return i;
+
+out:
+   return i ? i : err;
 }
 
 static bool vma_permits_fault(struct vm_area_struct *vma,
-- 
2.19.0
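
The single-exit pattern that patch 1/4 introduces can be illustrated outside the kernel. The following is only a user-space sketch under stated assumptions: collect_items() and lookup_item() are hypothetical names, not kernel functions, and the doubling in lookup_item() is an arbitrary stand-in for real work. What it shows is the shape of the change: every failure path sets "err" and jumps to "out:", and the one return statement reports the partial count when any items were collected, or the error otherwise.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-item lookup; stands in for the page-walk helpers. */
static int lookup_item(int key, int *out)
{
	if (key < 0)
		return -EFAULT;	/* simulate an unusable address */
	*out = key * 2;
	return 0;
}

/*
 * Collect up to nr items. Returns the number of items collected, or a
 * negative errno if the very first lookup fails.
 */
static long collect_items(const int *keys, long nr, int *out)
{
	long i = 0;
	int err = 0;

	while (i < nr) {
		err = lookup_item(keys[i], &out[i]);
		if (err)
			goto out;
		i++;
	}

out:
	/* Single place to operate on out[0..i-1] before returning. */
	return i ? i : err;
}
```

With a single exit, any later per-item post-processing only has to be added at the "out:" label instead of at every return site.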



[PATCH 4/4] goldfish_pipe/mm: convert to the new release_user_pages() call

2018-09-27 Thread john . hubbard
From: John Hubbard 

For code that retains pages via get_user_pages*(),
release those pages via the new release_user_pages(),
instead of calling put_page().

This prepares for eventually fixing the problem described
in [1], and is following a plan listed in [2].

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

[2] https://lkml.kernel.org/r/20180709080554.21931-1-jhubb...@nvidia.com
Proposed steps for fixing get_user_pages() + DMA problems.

CC: Al Viro 
Signed-off-by: John Hubbard 
---
 drivers/platform/goldfish/goldfish_pipe.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/goldfish/goldfish_pipe.c 
b/drivers/platform/goldfish/goldfish_pipe.c
index fad0345376e0..1e9455a86698 100644
--- a/drivers/platform/goldfish/goldfish_pipe.c
+++ b/drivers/platform/goldfish/goldfish_pipe.c
@@ -340,8 +340,9 @@ static void __release_user_pages(struct page **pages, int 
pages_count,
for (i = 0; i < pages_count; i++) {
if (!is_write && consumed_size > 0)
set_page_dirty(pages[i]);
-   put_page(pages[i]);
}
+
+   release_user_pages(pages, pages_count);
 }
 
 /* Populate the call parameters, merging adjacent pages together */
-- 
2.19.0
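
For readers without patch 2/4 at hand, here is a rough user-space model of the call being converted in patch 4/4. It is an assumption-laden sketch, not the kernel code: struct page is reduced to a bare reference count, put_page() is a stand-in, and the bodies of put_user_page() and release_user_pages() are guessed at as simple forwarding placeholders. Only the call shape release_user_pages(pages, pages_count) is taken from the diff above.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct page: only a refcount. */
struct page {
	int refcount;
};

/* Stand-in for put_page(): drop one reference. */
static void put_page(struct page *page)
{
	page->refcount--;
}

/* Assumed placeholder: for now, simply forwards to put_page(). */
static void put_user_page(struct page *page)
{
	put_page(page);
}

/* Assumed placeholder: release every page in the array in one call. */
static void release_user_pages(struct page **pages, unsigned long npages)
{
	unsigned long index;

	for (index = 0; index < npages; index++)
		put_user_page(pages[index]);
}
```

In this model the caller's loop keeps only the set_page_dirty() step, and the reference drop for the whole array moves into one call after the loop, which mirrors the hunk above.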



[PATCH] mm: fix __get_user_pages_fast() comment

2018-09-27 Thread Fengguang Wu

mmu_gather_tlb no longer exists. Replace it with mmu_table_batch.

CC: triv...@kernel.org
Signed-off-by: Fengguang Wu 
---
mm/gup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index fc5f98069f4e..69194043ddd4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1798,8 +1798,8 @@ int __get_user_pages_fast(unsigned long start, int 
nr_pages, int write,
 * interrupts disabled by get_futex_key.
 *
 * With interrupts disabled, we block page table pages from being
-* freed from under us. See mmu_gather_tlb in asm-generic/tlb.h
-* for more details.
+* freed from under us. See struct mmu_table_batch comments in
+* include/asm-generic/tlb.h for more details.
 *
 * We do not adopt an rcu_read_lock(.) here as we also want to
 * block IPIs that come from THPs splitting.
--
2.15.0



linux-next: Tree for Sep 28

2018-09-27 Thread Stephen Rothwell
Hi all,

News: there will be no linux-next release on Monday

Changes since 20180927:

Dropped trees: xarray, ida (temporarily)

The rdma tree gained a conflict against Linus' tree.

The userns tree gained a conflict against the arm64 tree.

Non-merge commits (relative to Linus' tree): 6651
 6736 files changed, 306346 insertions(+), 136858 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 288 trees (counting Linus' and 66 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (ad0371482b1e Merge tag 'for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma)
Merging fixes/master (72358c0b59b7 linux-next: build warnings from the build of 
Linus' tree)
Merging kbuild-current/fixes (ef8c4ed9db80 kbuild: allow to use GCC toolchain 
not in Clang search path)
Merging arc-current/for-curr (40660f1fcee8 ARC: build: Don't set CROSS_COMPILE 
in arch's Makefile)
Merging arm-current/fixes (afc9f65e01cd ARM: 8781/1: Fix Thumb-2 syscall return 
for binutils 2.29+)
Merging arm64-fixes/for-next/fixes (031e6e6b4e12 arm64: hugetlb: Avoid 
unnecessary clearing in huge_ptep_set_access_flags)
Merging m68k-current/for-linus (0986b16ab49b m68k/mac: Use correct PMU response 
format)
Merging powerpc-fixes/fixes (2483ef056f6e powerpc/numa: Use associativity if 
VPHN hcall is successful)
Merging sparc/master (df2def49c57b Merge tag 'acpi-4.19-rc1-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (d4ce58082f20 net-tcp: /proc/sys/net/ipv4/tcp_probe_interval 
is a u32 not int)
Merging bpf/master (d4ce58082f20 net-tcp: /proc/sys/net/ipv4/tcp_probe_interval 
is a u32 not int)
Merging ipsec/master (32bf94fb5c2e xfrm: validate template mode)
Merging netfilter/master (346fa83d1093 netfilter: conntrack: get rid of double 
sizeof)
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates 
of non-anonymous set)
Merging wireless-drivers/master (3baafeffa48a iwlwifi: 1000: set the TFD queue 
size)
Merging mac80211/master (1222a1601488 nl80211: Fix possible Spectre-v1 for CQM 
RSSI thresholds)
Merging rdma-fixes/for-rc (5c5702e259dc RDMA/core: Set right entry state before 
releasing reference)
Merging sound-current/for-linus (b3a5402cbceb ALSA: hda: Fix the 
audio-component completion timeout)
Merging sound-asoc-fixes/for-linus (6c96a58bb15b Merge branch 'asoc-4.19' into 
asoc-linus)
Merging regmap-fixes/for-linus (7876320f8880 Linux 4.19-rc4)
Merging regulator-fixes/for-linus (3564951ca82b Merge branch 'regulator-4.19' 
into regulator-linus)
Merging spi-fixes/for-linus (1f6b3b2c1ff4 Merge branch 'spi-4.19' into 
spi-linus)
Merging pci-current/for-linus (083874549fdf PCI: Reprogram bridge prefetch 
registers on resume)
Merging driver-core.current/driver-core-linus (7876320f8880 Linux 4.19-rc4)
Merging tty.current/tty-linus (7e620984b625 serial: imx: restore handshaking 
irq for imx1)
Merging usb.current/usb-linus (3e3b81965cbf usb: typec: mux: Take care of 
driver module reference counting)
Merging usb-gadget-fixes/fixes (d9707490077b usb: dwc2: Fix call location of 
dwc2_check_core_endianness)
Merging usb-serial-fixes/usb-linus (f5fad711c06e USB: serial: simple: add 
Motorola Tetra MTP6550 id)
Mergin

Re: [PATCH v4] ARM: dts: dra7: Fix up unaligned access setting for PCIe EP

2018-09-27 Thread Vignesh R



On Wednesday 26 September 2018 10:57 PM, Tony Lindgren wrote:
> * Vignesh R  [180924 22:25]:
>> Bit positions of PCIE_SS1_AXI2OCP_LEGACY_MODE_ENABLE and
>> PCIE_SS1_AXI2OCP_LEGACY_MODE_ENABLE in CTRL_CORE_SMA_SW_7 are
>> incorrectly documented in the TRM. In fact, the bit positions are
>> swapped. Update the DT bindings for PCIe EP to reflect the same.
>>
>> Fixes: d23f3839fe97 ("ARM: dts: DRA7: Add pcie1 dt node for EP mode")
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Vignesh R 
>> ---
>>
>> This patch is split from v3 here:
>> https://lore.kernel.org/patchwork/cover/967020/
>> Patch can be applied standalone and has no dependencies on other patches
>> in v3.
> 
> Hmm is this needed for v4.19-rc cycle or can this wait for
> v4.20 merge window?
> 

v4.20 should be fine.


-- 
Regards
Vignesh


Re: [PATCH 1/2] dt-bindings: i2c-omap: Add new compatible for AM654 SoCs

2018-09-27 Thread Vignesh R



On Wednesday 26 September 2018 08:14 PM, Peter Rosin wrote:
> On 2018-09-26 13:57, Vignesh R wrote:
>> AM654 SoCs have similar I2C IP as OMAP SoCs. Add new compatible to
>> handle AM654 SoCs.
>>
>> Signed-off-by: Vignesh R 
>> ---
>>  Documentation/devicetree/bindings/i2c/i2c-omap.txt | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/devicetree/bindings/i2c/i2c-omap.txt 
>> b/Documentation/devicetree/bindings/i2c/i2c-omap.txt
>> index 7e49839d4124..11ce869d682d 100644
>> --- a/Documentation/devicetree/bindings/i2c/i2c-omap.txt
>> +++ b/Documentation/devicetree/bindings/i2c/i2c-omap.txt
>> @@ -2,7 +2,8 @@ I2C for OMAP platforms
>>  
>>  Required properties :
>>  - compatible : Must be "ti,omap2420-i2c", "ti,omap2430-i2c", "ti,omap3-i2c"
>> -  or "ti,omap4-i2c"
>> +  or "ti,omap4-i2c" for OMAP2+ SoCs
>> +- compatible : Must "ti,am654-i2c", "ti,omap4-i2c" for AM654 SoCs
>   ^
>  be
> 
> Also, it looks bad with two 'compatible' entries in the properties list.
> I think the trend is to list one valid compatible (plus fallbacks)
> per line. So, you should consider reformatting, possibly like so:
> 
> - compatible : Must be one of
>   "ti,omap2420-i2c"
>   "ti,omap2430-i2c"
>   "ti,omap3-i2c"
> "ti,omap4-i2c"
>   "ti,am654-i2c", "ti,omap4-i2c"
> 

I have sent v2 with these changes. Thanks for the review!


-- 
Regards
Vignesh


[PATCH v2 0/2] i2c-omap: Enable i2c-omap driver for AM654 SoCs

2018-09-27 Thread Vignesh R
A couple of patches to enable the i2c-omap driver to be used with TI's new
AM654 platforms.


Vignesh R (2):
  dt-bindings: i2c-omap: Add new compatible for AM654 SoCs
  i2c: busses: Kconfig: Enable I2C_OMAP for ARCH_K3

 Documentation/devicetree/bindings/i2c/i2c-omap.txt | 8 ++--
 drivers/i2c/busses/Kconfig | 2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

-- 
2.19.0



[PATCH v2 2/2] i2c: busses: Kconfig: Enable I2C_OMAP for ARCH_K3

2018-09-27 Thread Vignesh R
Allow I2C_OMAP to be built for K3 platforms.

Signed-off-by: Vignesh R 
---

v2: No changes

 drivers/i2c/busses/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 451d4ae50e66..ac4b09642f63 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -751,7 +751,7 @@ config I2C_OCORES
 
 config I2C_OMAP
tristate "OMAP I2C adapter"
-   depends on ARCH_OMAP
+   depends on ARCH_OMAP || ARCH_K3
default y if MACH_OMAP_H3 || MACH_OMAP_OSK
help
  If you say yes to this option, support will be included for the
-- 
2.19.0



[PATCH v2 1/2] dt-bindings: i2c-omap: Add new compatible for AM654 SoCs

2018-09-27 Thread Vignesh R
AM654 SoCs have the same I2C IP as OMAP SoCs. Add a new compatible to
handle AM654 SoCs. While at it, reformat the existing compatible list
for older SoCs to list one valid compatible per line.

Signed-off-by: Vignesh R 
---

v2: Reformat the existing compatible list.

 Documentation/devicetree/bindings/i2c/i2c-omap.txt | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-omap.txt 
b/Documentation/devicetree/bindings/i2c/i2c-omap.txt
index 7e49839d4124..4b90ba9f31b7 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-omap.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c-omap.txt
@@ -1,8 +1,12 @@
 I2C for OMAP platforms
 
 Required properties :
-- compatible : Must be "ti,omap2420-i2c", "ti,omap2430-i2c", "ti,omap3-i2c"
-  or "ti,omap4-i2c"
+- compatible : Must be
+   "ti,omap2420-i2c" for OMAP2420 SoCs
+   "ti,omap2430-i2c" for OMAP2430 SoCs
+   "ti,omap3-i2c" for OMAP3 SoCs
+   "ti,omap4-i2c" for OMAP4+ SoCs
+   "ti,am654-i2c", "ti,omap4-i2c" for AM654 SoCs
 - ti,hwmods : Must be "i2c<n>", n being the instance number (1-based)
 - #address-cells = <1>;
 - #size-cells = <0>;
-- 
2.19.0
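
The last entry in the new binding lists two strings, "ti,am654-i2c" followed by the fallback "ti,omap4-i2c". The sketch below is a simplified user-space model of why that fallback matters. It is not the kernel's device-tree matching code, and both driver_compat[] and match_compatible() are hypothetical names made up for illustration. It assumes a driver whose match table knows only the older strings, and shows that a node carrying the two-string list would still match through the fallback.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical driver match table without an AM654-specific entry. */
static const char *const driver_compat[] = {
	"ti,omap2420-i2c",
	"ti,omap2430-i2c",
	"ti,omap3-i2c",
	"ti,omap4-i2c",
};

/*
 * Walk the node's compatible list, most specific string first, and
 * return the first entry the driver table contains, or NULL if none.
 */
static const char *match_compatible(const char *const *node_compat, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		for (j = 0; j < ARRAY_LEN(driver_compat); j++) {
			if (strcmp(node_compat[i], driver_compat[j]) == 0)
				return node_compat[i];
		}
	}
	return NULL;
}
```

Listing the SoC-specific string first still leaves room for a driver to add AM654-specific handling later without changing existing device trees.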



Re: [PATCH v7 4/6] files: add a replace_fd_files() function

2018-09-27 Thread Tycho Andersen
On Thu, Sep 27, 2018 at 07:20:50PM -0700, Kees Cook wrote:
> On Thu, Sep 27, 2018 at 2:59 PM, Kees Cook  wrote:
> > On Thu, Sep 27, 2018 at 8:11 AM, Tycho Andersen  wrote:
> >> Similar to fd_install/__fd_install, we want to be able to replace an fd of
> >> an arbitrary struct files_struct, not just current's. We'll use this in the
> >> next patch to implement the seccomp ioctl that allows inserting fds into a
> >> stopped process' context.
> >>
> >> v7: new in v7
> >>
> >> Signed-off-by: Tycho Andersen 
> >> CC: Alexander Viro 
> >> CC: Kees Cook 
> >> CC: Andy Lutomirski 
> >> CC: Oleg Nesterov 
> >> CC: Eric W. Biederman 
> >> CC: "Serge E. Hallyn" 
> >> CC: Christian Brauner 
> >> CC: Tyler Hicks 
> >> CC: Akihiro Suda 
> >> ---
> >>  fs/file.c| 22 +++---
> >>  include/linux/file.h |  8 
> >>  2 files changed, 23 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/fs/file.c b/fs/file.c
> >> index 7ffd6e9d103d..3b3c5aadaadb 100644
> >> --- a/fs/file.c
> >> +++ b/fs/file.c
> >> @@ -850,24 +850,32 @@ __releases(>file_lock)
> >>  }
> >>
> >>  int replace_fd(unsigned fd, struct file *file, unsigned flags)
> >> +{
> >> +   return replace_fd_task(current, fd, file, flags);
> >> +}
> >> +
> >> +/*
> >> + * Same warning as __alloc_fd()/__fd_install() here.
> >> + */
> >> +int replace_fd_task(struct task_struct *task, unsigned fd,
> >> +   struct file *file, unsigned flags)
> >>  {
> >> int err;
> >> -   struct files_struct *files = current->files;
> >
> > Same feedback as Jann: on a purely "smaller diff" note, this could
> > just be s/current/task/ here and all the other s/files/task->files/
> > would go away...
> >
> >>
> >> if (!file)
> >> -   return __close_fd(files, fd);
> >> +   return __close_fd(task->files, fd);
> >>
> >> -   if (fd >= rlimit(RLIMIT_NOFILE))
> >> +   if (fd >= task_rlimit(task, RLIMIT_NOFILE))
> >> return -EBADF;
> >>
> > >> -   spin_lock(&files->file_lock);
> > >> -   err = expand_files(files, fd);
> > >> +   spin_lock(&task->files->file_lock);
> >> +   err = expand_files(task->files, fd);
> >> if (unlikely(err < 0))
> >> goto out_unlock;
> >> -   return do_dup2(files, file, fd, flags);
> >> +   return do_dup2(task->files, file, fd, flags);
> >>
> >>  out_unlock:
> > >> -   spin_unlock(&files->file_lock);
> > >> +   spin_unlock(&task->files->file_lock);
> >> return err;
> >>  }
> >>
> >> diff --git a/include/linux/file.h b/include/linux/file.h
> >> index 6b2fb032416c..f94277fee038 100644
> >> --- a/include/linux/file.h
> >> +++ b/include/linux/file.h
> >> @@ -11,6 +11,7 @@
> >>  #include 
> >>
> >>  struct file;
> >> +struct task_struct;
> >>
> >>  extern void fput(struct file *);
> >>
> >> @@ -79,6 +80,13 @@ static inline void fdput_pos(struct fd f)
> >>
> >>  extern int f_dupfd(unsigned int from, struct file *file, unsigned flags);
> >>  extern int replace_fd(unsigned fd, struct file *file, unsigned flags);
> >> +/*
> >> + * Warning! This is only safe if you know the owner of the files_struct is
> > >> + * stopped outside syscall context. It's a very bad idea to use this unless you
> >> + * have similar guarantees in your code.
> >> + */
> >> +extern int replace_fd_task(struct task_struct *task, unsigned fd,
> >> +  struct file *file, unsigned flags);
> >
> > Perhaps call this __replace_fd() to indicate the "please don't use
> > this unless you're very sure"ness of it?
> >
> >>  extern void set_close_on_exec(unsigned int fd, int flag);
> >>  extern bool get_close_on_exec(unsigned int fd);
> >>  extern int get_unused_fd_flags(unsigned flags);
> >> --
> >> 2.17.1
> >>
> >
> > If I can get an Ack from Al, that would be very nice. :)
> 
> In out-of-band feedback from Al, he's pointed out a much cleaner
> approach: do the work on the "current" side. i.e. current is stopped
> in __seccomp_filter in the case SECCOMP_RET_USER_NOTIFY. Instead of
> > having the ioctl-handling process doing the work, have it done on the
> other side. This may cause some additional complexity on the ioctl
> return path, but it solves both this problem and the "ptrace attach"
> issue: have the work delayed until "current" gets caught by seccomp.

So this is pretty much what we had in v6 (a one fd version, but the
idea is the same). The biggest issue is that in the case of e.g.
socketpair(), the fd values need to be written somewhere in the task's
memory, which means they need to be known before the response is sent.
If we have to wait until we're back in the task's context to install
them, we can't know the fd values.

V6 implementation: https://lkml.org/lkml/2018/9/6/773

Tycho
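
To illustrate the socketpair() point above, here is a minimal userspace
sketch (not code from this series). socketpair() reports both new fd
numbers by writing them into an array in the caller's memory, so whoever
emulates the syscall on the task's behalf has to know the final numbers
before the syscall returns. The helper name demo_socketpair() is
hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a connected pair of AF_UNIX sockets, record the descriptor
 * numbers the kernel wrote into our memory, then close both ends.
 * Returns 0 on success and -1 on failure.
 */
static int demo_socketpair(int out[2])
{
	int sv[2] = { -1, -1 };

	/* The kernel fills sv[] with the two new fd numbers. */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	out[0] = sv[0];
	out[1] = sv[1];

	close(sv[0]);
	close(sv[1]);
	return 0;
}
```

If the fds were only installed after the task resumed, nothing could
fill sv[] with the right values at the time the response is sent.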


Re: [PATCH v3 1/1] perf: Sharing PMU counters across compatible events

2018-09-27 Thread Song Liu
Hi Ravi,

> On Sep 27, 2018, at 9:33 PM, Ravi Bangoria wrote:
> 
> Hi Song,
> 
> On 09/25/2018 03:55 AM, Song Liu wrote:
>> This patch tries to enable PMU sharing. To make perf event scheduling
>> fast, we use special data structures.
>> 
>> An array of "struct perf_event_dup" is added to the perf_event_context,
>> to remember all the duplicated events under this ctx. All the events
>> under this ctx has a "dup_id" pointing to its perf_event_dup. Compatible
>> events under the same ctx share the same perf_event_dup. The following
>> figure shows a simplified version of the data structure.
>> 
>>       ctx ->  perf_event_dup -> master
>>                      ^
>>                      |
>>          perf_event /|
>>                      |
>>          perf_event /
>> 
> 
> I've not gone through the patch in detail, but I was specifically
> interested in scenarios where one perf instance is counting event
> systemwide and thus other perf instance fails to count the same
> event for a specific workload because that event can be counted
> in one hw counter only.
> 
> Ex: https://lkml.org/lkml/2018/3/12/1011
> 
> Seems this patch does not solve this issue. Please let me know if
> I'm missing anything.
> 

In this case, unfortunately, the two events cannot share the same
counter, because one of them is in the cpu ctx while the other belongs
to the task ctx. They have to go through rotation, so each event
counts 50% of the time. However, if you have 2 events in the cpu
ctx and 2 events in the task ctx on the same counter, this patch will
help each event count 50% of the time, instead of 25%.

Another potential solution is to create a cgroup for the workload
and attach the perf event to the cgroup. Since cgroup events are added
to the cpu ctx, they can share counters with the system-wide events.

I made this trade-off to keep context switches O(1). If we shared a
hw counter between the cpu ctx and the task ctx, we would have to do a
linear-time comparison to identify events that can share the counter.

Thanks,
Song
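
The 50% vs. 25% arithmetic above can be sketched as follows. This is a
simplified model, not the perf scheduler: it assumes all events compete
for a single hardware counter, that rotation gives every competing slot
an equal time slice, and that compatible events within one ctx collapse
into one slot when sharing is enabled. The function name
count_fraction() is hypothetical.

```c
/*
 * Fraction of time each event counts on one contended hardware counter.
 *
 * cpu_events / task_events: number of compatible events in each ctx.
 * share_within_ctx: non-zero if compatible events in the same ctx
 *                   share the counter (the behaviour the patch adds).
 */
static double count_fraction(unsigned int cpu_events,
			     unsigned int task_events,
			     int share_within_ctx)
{
	unsigned int slots;

	if (share_within_ctx)
		/* One slot per non-empty ctx; cpu and task ctx never merge. */
		slots = (cpu_events ? 1 : 0) + (task_events ? 1 : 0);
	else
		/* Every event rotates on its own. */
		slots = cpu_events + task_events;

	return slots ? 1.0 / slots : 0.0;
}
```

With one event per ctx the result is 0.5 either way; with two events
per ctx it is 0.25 without sharing and 0.5 with sharing.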





Re: [PATCH 4.18 00/88] 4.18.11-stable review

2018-09-27 Thread Greg Kroah-Hartman
On Thu, Sep 27, 2018 at 05:00:30PM -0300, Rafael David Tinoco wrote:
> On 9/27/18 6:02 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.18.11 release.
> > There are 88 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat Sep 29 09:02:26 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.11-rc1.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-4.18.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Results from Linaro’s test farm.
> No regressions on arm64, arm, x86_64, and i386.

Great, thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 4.18 00/88] 4.18.11-stable review

2018-09-27 Thread Greg Kroah-Hartman
On Thu, Sep 27, 2018 at 02:53:12PM -0700, Guenter Roeck wrote:
> On Thu, Sep 27, 2018 at 11:02:41AM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.18.11 release.
> > There are 88 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat Sep 29 09:02:26 UTC 2018.
> > Anything received after that time might be too late.
> > 
> 
> Build results:
>   total: 137 pass: 137 fail: 0
> Qemu test results:
>   total: 321 pass: 319 fail: 2
> Failed tests: 
>   arm:sabrelite:imx_v6_v7_defconfig:imx6dl-sabrelite 
>   powerpc:g3beige:ppc_book3s_defconfig:nosmp:ide:rootfs
> 
> arm_sabrelite crashes in drm code. Presumably this is the same problem as
> reported by others with v4.14.
> 
> powerpc:g3beige is the known problem. Patch should be available upstream
> in the near future.
> 
> Details are available at https://kerneltests.org/builders/.

Thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 4.18 00/88] 4.18.11-stable review

2018-09-27 Thread Greg Kroah-Hartman
On Thu, Sep 27, 2018 at 02:09:08PM -0600, Shuah Khan wrote:
> On 09/27/2018 03:02 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.18.11 release.
> > There are 88 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat Sep 29 09:02:26 UTC 2018.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.11-rc1.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-4.18.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all of these and letting me know.

greg k-h


Re: [PATCH 4.14 00/64] 4.14.73-stable review

2018-09-27 Thread Greg Kroah-Hartman
On Thu, Sep 27, 2018 at 10:45:30PM +0100, Sudip Mukherjee wrote:
> Hi Greg,
> 
> On Thu, Sep 27, 2018 at 9:56 PM, Sudip Mukherjee
>  wrote:
> > Hi Greg,
> >
> > On Thu, Sep 27, 2018 at 10:03 AM, Greg Kroah-Hartman
> >  wrote:
> >> This is the start of the stable review cycle for the 4.14.73 release.
> >> There are 64 patches in this series, all will be posted as a response
> >> to this one.  If anyone has any issues with these being applied, please
> >> let me know.
> >>
> >> Responses should be made by Sat Sep 29 09:02:21 UTC 2018.
> >> Anything received after that time might be too late.
> >
> > My kvm guest had this:
> >
> 
> 
> 
> > [8.585076] RIP: drm_debugfs_init+0x183/0x370 [drm] RSP: 88002b1bf5e8
> > [8.585404] ---[ end trace 62728db3ac408aba ]---
> >
> > And I had to revert 7e58fe2a97bc ("drm/atomic: Use
> > drm_drv_uses_atomic_modeset() for debugfs creation") to make it work.
> > I am looking more into why it failed.
> 
> update:
> 
> Backporting 57078338b2e4 ("drm: fix drm_drv_uses_atomic_modeset on non
> modesetting drivers.") fixed the issue. But 7e58fe2a97bc ("drm/atomic:
> Use drm_drv_uses_atomic_modeset() for debugfs creation") changed the
> functionality of the check. If this has to be applied, it will also
> need an additional patch to keep the check the same as in Linus's tree.
> 
> But looking at the other mails now, and you have already dropped the patch.

Yes, already dropped :)


[PATCH] futex: Set USER_DS for the futex_detect_cmpxchg() test

2018-09-27 Thread Andy Lutomirski
futex_detect_cmpxchg() checks whether cmpxchg is available by trying
it on the NULL pointer and seeing what the error code is (EFAULT vs
ENOSYS).  This happens with KERNEL_DS set, which is impolite: while
the NULL *user* pointer is definitely invalid when there is no user
program running, the NULL *kernel* pointer seems more like a
programming error than a safe place to do an intentionally-failing
access.  An upcoming hardening series I'm working on causes the
existing code to OOPS, because it considers any failed uaccess with
KERNEL_DS to be a sign of a bug.

Explicitly set USER_DS to avoid this problem.

Cc: linux-s...@vger.kernel.org
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Finn Thain 
Cc: Geert Uytterhoeven 
---

I have a couple questions here:

 - Is this actually okay on all architectures?  That is, are there
   cases where we'll screw up if we fail a USER_DS access this early?
   s390 stands out as the obvious special case (where USER_DS is not
   just a subset of KERNEL_DS), but s390 opts out.

 - Why doesn't x86 set HAVE_FUTEX_CMPXCHG?  Or do we still support
   some 32-bit configurations that don't have cmpxchg and don't know
   about it at compile time?

 kernel/futex.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/futex.c b/kernel/futex.c
index 11fc3bb456d6..16bd3e72602a 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3593,6 +3593,7 @@ static void __init futex_detect_cmpxchg(void)
 {
 #ifndef CONFIG_HAVE_FUTEX_CMPXCHG
u32 curval;
+   mm_segment_t old_seg;
 
/*
 * This will fail and we want it. Some arch implementations do
@@ -3604,8 +3605,11 @@ static void __init futex_detect_cmpxchg(void)
 * implementation, the non-functional ones will return
 * -ENOSYS.
 */
+   old_seg = get_fs();
+   set_fs(USER_DS);
if (cmpxchg_futex_value_locked(&curval, NULL, 0, 0) == -EFAULT)
futex_cmpxchg_enabled = 1;
+   set_fs(old_seg);
 #endif
 }
 
-- 
2.17.1
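
The probe-and-restore pattern in the patch above can be modelled in
userspace roughly as below. This is only a sketch: mock_get_fs(),
mock_set_fs() and mock_cmpxchg_null() are hypothetical stand-ins for
the kernel's get_fs(), set_fs() and cmpxchg_futex_value_locked(), and
the "segment" is just a variable. It shows the two points the patch
relies on: a working cmpxchg reports -EFAULT for the NULL user pointer
while a stub reports -ENOSYS, and the previous segment is restored on
every path.

```c
#include <errno.h>

/* Hypothetical stand-ins for mm_segment_t, get_fs() and set_fs(). */
enum mock_seg { MOCK_KERNEL_DS, MOCK_USER_DS };
static enum mock_seg mock_fs = MOCK_KERNEL_DS;

static enum mock_seg mock_get_fs(void)
{
	return mock_fs;
}

static void mock_set_fs(enum mock_seg seg)
{
	mock_fs = seg;
}

/*
 * Mock of an arch cmpxchg helper probed with a NULL user pointer:
 * a functional implementation faults, a non-functional one is a stub.
 */
static int mock_cmpxchg_null(int arch_has_cmpxchg)
{
	return arch_has_cmpxchg ? -EFAULT : -ENOSYS;
}

/* Returns 1 if cmpxchg looks usable, 0 otherwise. */
static int mock_detect_cmpxchg(int arch_has_cmpxchg)
{
	enum mock_seg old_seg = mock_get_fs();
	int enabled;

	mock_set_fs(MOCK_USER_DS);	/* probe as a user access */
	enabled = (mock_cmpxchg_null(arch_has_cmpxchg) == -EFAULT);
	mock_set_fs(old_seg);		/* always restore the caller's segment */

	return enabled;
}
```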



Re: [PATCH v3 1/1] perf: Sharing PMU counters across compatible events

2018-09-27 Thread Ravi Bangoria
Hi Song,

On 09/25/2018 03:55 AM, Song Liu wrote:
> This patch tries to enable PMU sharing. To make perf event scheduling
> fast, we use special data structures.
> 
> An array of "struct perf_event_dup" is added to the perf_event_context,
> to remember all the duplicated events under this ctx. All the events
> under this ctx has a "dup_id" pointing to its perf_event_dup. Compatible
> events under the same ctx share the same perf_event_dup. The following
> figure shows a simplified version of the data structure.
> 
>       ctx ->  perf_event_dup -> master
>                      ^
>                      |
>          perf_event /|
>                      |
>          perf_event /
> 

I've not gone through the patch in detail, but I was specifically
interested in scenarios where one perf instance is counting event
systemwide and thus other perf instance fails to count the same
event for a specific workload because that event can be counted
in one hw counter only.

Ex: https://lkml.org/lkml/2018/3/12/1011

Seems this patch does not solve this issue. Please let me know if
I'm missing anything.

Thanks,
Ravi



[PATCH v2] mtd: rawnand: denali: set SPARE_AREA_SKIP_BYTES register to 8 if unset

2018-09-27 Thread Masahiro Yamada
NAND devices need additional data area (OOB) for error correction,
but it is also used for Bad Block Marker (BBM).  In many cases, the
first byte in OOB is used for BBM, but the location actually depends
on chip vendors.  The NAND controller should preserve the precious
BBM to keep track of bad blocks.

In Denali IP, the SPARE_AREA_SKIP_BYTES register is used to specify
the number of bytes to skip from the start of OOB.  The ECC engine
will automatically skip the specified number of bytes when it gets
access to OOB area.

The same value for SPARE_AREA_SKIP_BYTES should be used between
firmware and the operating system if you intend to use the NAND
device across the control hand-off.

In fact, the current denali.c code expects firmware to have already
set the SPARE_AREA_SKIP_BYTES register, then reads the value out.

If no firmware (or bootloader) has initialized the controller, the
register value is zero, which is the default after power-on-reset.
In other words, the Linux driver cannot initialize the controller
by itself.

Some possible solutions are:

 [1] Add a DT property to specify the skipped bytes in OOB
 [2] Associate the preferred value with compatible
 [3] Hard-code the default value in the driver

My first attempt was [1], but in the review process, [3] was suggested
as a counter-implementation.
(https://lore.kernel.org/patchwork/patch/983055/)

The default value 8 was chosen to match the boot ROM of the UniPhier
platform.  The preferred value may vary by platform.  If so, please
trade up to a different solution.

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - Change approach from a DT-property to a hard-coded default

 drivers/mtd/nand/raw/denali.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/nand/raw/denali.c b/drivers/mtd/nand/raw/denali.c
index aaab121..18bbfc8 100644
--- a/drivers/mtd/nand/raw/denali.c
+++ b/drivers/mtd/nand/raw/denali.c
@@ -21,6 +21,7 @@
 #include "denali.h"
 
 #define DENALI_NAND_NAME		"denali-nand"
+#define DENALI_DEFAULT_OOB_SKIP_BYTES	8
 
 /* for Indexed Addressing */
 #define DENALI_INDEXED_CTRL	0x00
@@ -1056,12 +1057,17 @@ static void denali_hw_init(struct denali_nand_info *denali)
 	denali->revision = swab16(ioread32(denali->reg + REVISION));
 
 	/*
-	 * tell driver how many bit controller will skip before
-	 * writing ECC code in OOB, this register may be already
-	 * set by firmware. So we read this value out.
-	 * if this value is 0, just let it be.
+	 * Set how many bytes should be skipped before writing data in OOB.
+	 * If a non-zero value has already been set (by firmware or something),
+	 * just use it.  Otherwise, set the driver default.
 	 */
 	denali->oob_skip_bytes = ioread32(denali->reg + SPARE_AREA_SKIP_BYTES);
+	if (!denali->oob_skip_bytes) {
+		denali->oob_skip_bytes = DENALI_DEFAULT_OOB_SKIP_BYTES;
+		iowrite32(denali->oob_skip_bytes,
+			  denali->reg + SPARE_AREA_SKIP_BYTES);
+	}
+
 	denali_detect_max_banks(denali);
 	iowrite32(0x0F, denali->reg + RB_PIN_ENABLED);
 	iowrite32(CHIP_EN_DONT_CARE__FLAG, denali->reg + CHIP_ENABLE_DONT_CARE);
-- 
2.7.4



Re: Licenses and revocability, in a paragraph or less.

2018-09-27 Thread Eric S. Raymond
freedomfromr...@aaathats3as.com :
> As has been stated in easily accessible terms elsewhere:
> "Most courts hold that simple, non-exclusive licenses with unspecified
> durations that are silent on revocability are revocable at will. This means
> that the licensor may terminate the license at any time, with or without
> cause." +

Furthermore, license revocation is not the only option. In Jacobsen
vs. Katzer (535 F.3d 1373 (Fed. Cir. 2008)) it was found that
open-source developers have an actionable right not to have their
software misappropriated even though the resulting damages are only
reputational rather than monetary.

Under that theory, developers can seek an injunction against a
misappropriating party without globally revoking their license.
The application of that case law to this situation is left as
an easy exercise for the reader.  Any competent paralegal could
write the brief in an evening. Hell, I could almost do it myself.

I do not personally want to see this happen.  But that it is possible
is a fact all parties must deal with.
-- 
Eric S. Raymond <http://www.catb.org/~esr/>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.




linux-next: manual merge of the userns tree with the arm64 tree

2018-09-27 Thread Stephen Rothwell
Hi Eric,

Today's linux-next merge of the userns tree got a conflict in:

  arch/arm64/kernel/traps.c

between commit:

  8a60419d3676 ("arm64: force_signal_inject: WARN if called from kernel context")

from the arm64 tree and commit:

  6fa998e83ef9 ("signal/arm64: Push siginfo generation into arm64_notify_die")

from the userns tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/kernel/traps.c
index 21689c6a985f,856b32aa03d8..
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@@ -353,12 -366,6 +368,9 @@@ void force_signal_inject(int signal, in
  	const char *desc;
  	struct pt_regs *regs = current_pt_regs();
  
 +	if (WARN_ON(!user_mode(regs)))
 +		return;
 +
- 	clear_siginfo(&info);
- 
  	switch (signal) {
  	case SIGILL:
  		desc = "undefined instruction";




Licenses and revocability, in a paragraph or less.

2018-09-27 Thread freedomfromruin

As has been stated in easily accessible terms elsewhere:
"Most courts hold that simple, non-exclusive licenses with unspecified 
durations that are silent on revocability are revocable at will. This 
means that the licensor may terminate the license at any time, with or 
without cause." +


Version 2 of the GPL specifies no duration, nor does it declare that it 
is non-revocable by the grantor.


(Also note: A perpetual license may violate the rule against 
perpetuities in various jurisdictions where it is applied not only to 
real property but additionally to personal property (and the like), 
which is why the GPL-3's term of duration is set as the duration of 
copyright on the program (and not "forever"))


+[https://www.sidley.com/en/insights/newsupdates/2013/02/the-terms-revocable-and-irrevocable-in-license-agreements-tips-and-pitfalls]




GPL v2 licensing, continued

2018-09-27 Thread freedomfromruin

GNU GPL version 2, section 0:
"Each licensee is addressed as "you". "

The "you" is not referring to the licensor (copyright owner). It is 
referring to the licensees and then future 
sub-licensees/additional-licensees receiving the work from said previous 
licensee.


It is independently clear from the context of the clauses if you read 
them in full.


...and then section 0 comes around and makes it _explicit_ that "you" 
refers to the licensee. (if you had any doubt)


Additionally, you should know that the copyright owner is not bound by 
the gratuitous license he proffers to potential licensees regarding his 
property. The licensees are bound to his terms: he is the owner. They 
take at his benefaction.



GNU GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License.  The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language.  (Hereinafter, translation is included without limitation in
the term "modification".)  Each licensee is addressed as "you".





Re: [PATCH] rpmsg: fix memory leak on channel

2018-09-27 Thread Bjorn Andersson
On Thu 27 Sep 14:36 PDT 2018, Colin King wrote:

> From: Colin Ian King 
> 
> Currently a failed allocation of channel->name leads to an
> immediate return without freeing channel. Fix this by setting
> ret to -ENOMEM and jumping to an exit path that kfree's channel.
> 
> Detected by CoverityScan, CID#1473692 ("Resource Leak")
> 
> Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend")
> Signed-off-by: Colin Ian King 

Added Cc: stable and applied.

Thanks,
Bjorn

> ---
>  drivers/rpmsg/qcom_smd.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
> index 0dae7c9f4a8f..4abbeea782fa 100644
> --- a/drivers/rpmsg/qcom_smd.c
> +++ b/drivers/rpmsg/qcom_smd.c
> @@ -1122,8 +1122,10 @@ static struct qcom_smd_channel *qcom_smd_create_channel(struct qcom_smd_edge *ed
>  
>   channel->edge = edge;
>   channel->name = kstrdup(name, GFP_KERNEL);
> - if (!channel->name)
> - return ERR_PTR(-ENOMEM);
> + if (!channel->name) {
> + ret = -ENOMEM;
> + goto free_channel;
> + }
>  
>   spin_lock_init(&channel->tx_lock);
>   spin_lock_init(&channel->recv_lock);
> @@ -1173,6 +1175,7 @@ static struct qcom_smd_channel *qcom_smd_create_channel(struct qcom_smd_edge *ed
>  
>  free_name_and_channel:
>   kfree(channel->name);
> +free_channel:
>   kfree(channel);
>  
>   return ERR_PTR(ret);
> -- 
> 2.17.1
> 
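
The error-path pattern the quoted patch relies on (one label per cleanup
step, jumping to the label that matches what has been allocated so far) can
be sketched in user space. This is a rough analogue only: struct
demo_channel, demo_create_channel() and demo_destroy_channel() are
hypothetical names, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the channel object. */
struct demo_channel {
	char *name;
};

/*
 * Allocate a channel and a private copy of its name. Any failure after
 * the first allocation jumps to a label that frees what already exists,
 * so no exit path leaks the channel. Returns NULL on failure.
 */
static struct demo_channel *demo_create_channel(const char *name)
{
	struct demo_channel *channel;
	size_t len;

	assert(name != NULL);

	channel = calloc(1, sizeof(*channel));
	if (!channel)
		return NULL;		/* nothing to unwind yet */

	len = strlen(name) + 1;
	channel->name = malloc(len);
	if (!channel->name)
		goto free_channel;	/* the kind of path the patch adds */
	memcpy(channel->name, name, len);

	return channel;

free_channel:
	free(channel);
	return NULL;
}

/* Release everything demo_create_channel() allocated. */
static void demo_destroy_channel(struct demo_channel *channel)
{
	if (!channel)
		return;
	free(channel->name);
	free(channel);
}
```

Returning directly from the middle of the function, as the pre-patch code
did, is only safe before the first allocation; after that, every failure
needs a label to jump to.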


Re: [PATCH] IB/mlx4: Avoid implicit enumerated type conversion

2018-09-27 Thread Jason Gunthorpe
On Thu, Sep 27, 2018 at 05:55:43PM -0700, Nick Desaulniers wrote:
> On Thu, Sep 27, 2018 at 3:58 PM Jason Gunthorpe  wrote:
> >
> > On Thu, Sep 27, 2018 at 03:42:24PM -0700, Nick Desaulniers wrote:
> > > On Thu, Sep 27, 2018 at 3:33 PM Bart Van Assche wrote:
> > > >
> > > > On Thu, 2018-09-27 at 16:28 -0600, Jason Gunthorpe wrote:
> > > > > On Thu, Sep 27, 2018 at 01:34:16PM -0700, Nick Desaulniers wrote:
> > > > >
> > > > > > > Neither ib_qp_create_flags nor mlx4_ib_qp_flags have negative
> > > > > > > values, is signedness necessary?
> > > > > >
> > > > > > enums are by default restricted to the range of ints.
> > > > >
> > > > > That's not quite right, the compiler sizes the enum to be able to fit
> > > > > the largest value contained within, today that is int, but if we added
> > > > > 1<<31, then it would become larger.
> > > >
> > > > Hi Jason,
> > > >
> > > > Are you perhaps confusing C and C++? For a C++ enumeration whose
> > > > underlying type is not fixed, the underlying type is an integral type
> > > > that can represent all the enumerator values defined in the
> > > > enumeration. For C however I think that enumeration values are
> > > > restricted to what fits in an int.
> > > >
> > > > Bart.
> > > >
> > >
> > > To quote the sacred texts (ANSI/ISO 9899-1990):
> >
> > > 6.5.2.2 Enumeration specifiers
> > > The expression that defines the value of an enumeration constant shall
> > > be an integral constant expression that has a value representable as
> > > an int.
> >
> > This is the wrong part of the standard to quote it is talking about
> > *enumeration constants* not the 'enum X' itself.
> >
> > Anyhow, the standard is hard to read in this area, but reality is
> > this:
> 
> You mean undefined behavior?

I think we call this an unstandardized compiler extension :)

> > #include <stdio.h>
> >
> > enum a
> > {
> > 	A1 = 1,
> > 	A2 = 1ULL<<40,
> > };
> >
> > int main(int argc, const char *argv[])
> > {
> > 	printf("%zu\n", sizeof(enum a));
> > 	return 0;
> > }
> >
> > $ gcc -Wall -std=c11 test.c && ./a.out
> > 8
> >
> > I forget if this is a common compiler extension, unclear standard, or was
> > formally revised in C11 or what, but it is the real world the Linux
> > kernel lives in.
> >
> > It is even more confusing if you wonder what types A1 and A2 are!
> >
> > Jason
> 
> This example is a strawman; we're talking about the minimum sizeof an
> enum when all initialized values are representable within an int,

Hmm? I said "the compiler sizes the enum to be able to fit the largest
value contained within", which is correct for gnu89 mode. 

It is not ISO C, it looks like it is a popular compiler extension that
Linux relies on.

> And if you're going to throw type safety out the window by converting
> values from one enum to another, for storage you MUST use an int
> (anything larger as in your example is undefined behavior).

No, that isn't right even without this extension, it is confusing, but
the standard you quoted is talking about the type of the CONSTANT, not
the enum. I.e. this:

  enum a {A1=1};
  enum a val = A1;
  int foo = val;

Gives this warning:

t.c:10:17: warning: implicit conversion changes signedness: 'enum a' to 'int' [-Wsign-conversion]

The correct integral storage for that enum is 'unsigned int'.

There is another piece of standard talking about the type of the enum
itself, and confoundingly it is a different type than the types of the
constants.

C++ got this right, the type of the enum and the type of the constants
are always the same and always sized to match the largest constant in
the enum, and C++11 got this *really right* and allows the programmer
to specify the underlying type of the enum and all of its constants.

No more subtle bugs with ~FOO because enum constant values have
negative types!

> I don't disagree with your point that values should be unsigned for
> bitwise operations, but it's not clean to reconcile that with
> converting values between different enums.  I suggest explicit casts
> to unsigned types before bitwise operations.

Sometimes the casts are needed, particularly when using ~, but for |
it is OK to have no casts, promotion rules work out OK.

But, again, this question was about the correct type to use when
storing bitwise flags, and that type is u32/64 etc no matter if the
constants are defined as enum constants or #define values.

So the first patch was the right one! :)

Jason
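
The points about ~ versus | on enum constants can be checked with a few
lines of C. This is a sketch, not code from the mlx4 driver: enum demo_flags
and the helper functions are hypothetical, and the numeric results assume
the common case of a 32-bit two's-complement int.

```c
#include <assert.h>
#include <stdint.h>

/* The values below assume a 32-bit int. */
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

/* Hypothetical flags, not taken from the mlx4 or IB headers. */
enum demo_flags {
	DEMO_FLAG_A = 1 << 0,
	DEMO_FLAG_B = 1 << 1,
};

/* ~ applied to the int-typed constant: the result is a negative int. */
static long long mask_without_cast(void)
{
	return ~DEMO_FLAG_A;
}

/* Cast to a fixed-width unsigned type first: the mask stays non-negative. */
static long long mask_with_cast(void)
{
	return ~(uint32_t)DEMO_FLAG_A;
}

/* OR-ing constants needs no cast; store the result in an unsigned type. */
static uint32_t combine_flags(void)
{
	return DEMO_FLAG_A | DEMO_FLAG_B;
}
```

Under those assumptions mask_without_cast() yields -2 while mask_with_cast()
yields 4294967294, which is why a cast before ~ matters once the mask is
widened or compared; combine_flags() yields 3 with no cast at all.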


[PATCH v3 1/2] drivers: base: cacheinfo: Do not populate sysfs for unknown cache types

2018-09-27 Thread Jeffrey Hugo
If a cache has an unknown type because neither the hardware nor the
firmware told us, an entry in the sysfs tree will be made, but the type
file will not be present.  lscpu depends on the type file being present
for every entry, and will error out without printing system information
if lscpu cannot open the type file.

Presenting information about a cache without indicating its type is not
useful, therefore if we hit a cache with an unknown type, stop populating
sysfs so that userspace has the maximum amount of useful information.

This addresses the following lscpu error, which prevents any output.
lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
file or directory

Suggested-by: Sudeep Holla 
Signed-off-by: Jeffrey Hugo 
Reviewed-by: Jeremy Linton 
---
 drivers/base/cacheinfo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 5d5b598..cf78fa6 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -615,6 +615,8 @@ static int cache_add_dev(unsigned int cpu)
 		this_leaf = this_cpu_ci->info_list + i;
 		if (this_leaf->disable_sysfs)
 			continue;
+		if (this_leaf->type == CACHE_TYPE_NOCACHE)
+			break;
 		cache_groups = cache_get_attribute_groups(this_leaf);
 		ci_dev = cpu_device_create(parent, this_leaf, cache_groups,
 					   "index%1u", i);
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
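
The effect of using break rather than continue in the hunk above can be
seen in a small user-space model of the loop. Everything here is
hypothetical and for illustration only: struct demo_leaf, enum
demo_cache_type and demo_count_exported() are not kernel names, and the
function only counts the entries that would be created.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache types; DEMO_NOCACHE models an unknown type. */
enum demo_cache_type { DEMO_NOCACHE = 0, DEMO_INST, DEMO_DATA, DEMO_UNIFIED };

struct demo_leaf {
	enum demo_cache_type type;
	int disable_sysfs;
};

/*
 * Walk the leaves in order and count how many index entries would be
 * created. Leaves with disable_sysfs set are skipped; the first leaf
 * with an unknown type stops the walk entirely.
 */
static size_t demo_count_exported(const struct demo_leaf *leaves, size_t n)
{
	size_t i, created = 0;

	assert(leaves != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (leaves[i].disable_sysfs)
			continue;
		if (leaves[i].type == DEMO_NOCACHE)
			break;		/* stop populating from here on */
		created++;
	}
	return created;
}
```

With three typed leaves followed by an untyped one and then another typed
one, the count is 3: nothing after the first unknown type is exported, which
matches the "stop populating sysfs" wording of the commit message.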



[PATCH v3 2/2] ACPI/PPTT: Handle architecturally unknown cache types

2018-09-27 Thread Jeffrey Hugo
The type of a cache might not be specified by architectural mechanisms (i.e.
system registers), but its type might be specified in the PPTT.  In this
case, we should populate the type of the cache, rather than leave it
undefined.

This fixes the issue where the cacheinfo driver will not populate sysfs
for such caches, resulting in the information missing from utilities like
lstopo and lscpu, thus degrading the user experience.

Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology Table parsing)
Reported-by: Vijaya Kumar K 
Signed-off-by: Jeffrey Hugo 
---
 drivers/acpi/pptt.c | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index d1e26cb..38ac30e 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -357,25 +357,15 @@ static void update_cache_properties(struct cacheinfo *this_leaf,
 				    struct acpi_pptt_cache *found_cache,
 				    struct acpi_pptt_processor *cpu_node)
 {
-	int valid_flags = 0;
-
 	this_leaf->fw_token = cpu_node;
-	if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
+	if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID)
 		this_leaf->size = found_cache->size;
-		valid_flags++;
-	}
-	if (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID) {
+	if (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID)
 		this_leaf->coherency_line_size = found_cache->line_size;
-		valid_flags++;
-	}
-	if (found_cache->flags & ACPI_PPTT_NUMBER_OF_SETS_VALID) {
+	if (found_cache->flags & ACPI_PPTT_NUMBER_OF_SETS_VALID)
 		this_leaf->number_of_sets = found_cache->number_of_sets;
-		valid_flags++;
-	}
-	if (found_cache->flags & ACPI_PPTT_ASSOCIATIVITY_VALID) {
+	if (found_cache->flags & ACPI_PPTT_ASSOCIATIVITY_VALID)
 		this_leaf->ways_of_associativity = found_cache->associativity;
-		valid_flags++;
-	}
 	if (found_cache->flags & ACPI_PPTT_WRITE_POLICY_VALID) {
 		switch (found_cache->attributes & ACPI_PPTT_MASK_WRITE_POLICY) {
 		case ACPI_PPTT_CACHE_POLICY_WT:
@@ -402,11 +392,17 @@ static void update_cache_properties(struct cacheinfo *this_leaf,
 		}
 	}
 	/*
-	 * If the above flags are valid, and the cache type is NOCACHE
-	 * update the cache type as well.
+	 * If cache type is NOCACHE, then the cache hasn't been specified
+	 * via other mechanisms.  Update the type if a cache type has been
+	 * provided.
+	 *
+	 * Note, we assume such caches are unified based on conventional system
+	 * design and known examples.  Significant work is required elsewhere to
+	 * fully support data/instruction only type caches which are only
+	 * specified in PPTT.
 	 */
 	if (this_leaf->type == CACHE_TYPE_NOCACHE &&
-	    valid_flags == PPTT_CHECKED_ATTRIBUTES)
+	    found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)
 		this_leaf->type = CACHE_TYPE_UNIFIED;
 }
 
 
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.



[PATCH v3 0/2] PPTT: Handle architecturally unknown cache types

2018-09-27 Thread Jeffrey Hugo
The ARM Architecture Reference Manual allows for caches to be "invisible" and
thus not specified in the system registers under some scenarios such as if the
cache cannot be managed by set/way operations.

However, such caches may be specified in the ACPI PPTT table for workload
performance/scheduling optimizations.

Currently such caches can cause an error in lscpu -

lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
file or directory

and result in no output, providing a poor user experience.  lstopo is also
affected as such caches are not included in the output.

Address these issues by attempting to be a little more discerning about when
cache information is provided to userspace, and also utilize all sources for
cache information when possible.

[v3]
-removed valid flag in PPTT
-Added Jeremy Linton's reviewed-by

[v2]
-Updated cacheinfo per Sudeep's suggestion
-Integrated the PPTT fix into existing PPTT code per Sudeep's suggestion

Jeffrey Hugo (2):
  drivers: base: cacheinfo: Do not populate sysfs for unknown cache types
  ACPI/PPTT: Handle architecturally unknown cache types

 drivers/acpi/pptt.c  | 15 +++
 drivers/base/cacheinfo.c |  2 ++
 2 files changed, 13 insertions(+), 4 deletions(-)

-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.



[PATCH v3 1/2] drivers: base: cacheinfo: Do not populate sysfs for unknown cache types

2018-09-27 Thread Jeffrey Hugo
If a cache has an unknown type because neither the hardware nor the
firmware told us, an entry in the sysfs tree will be made, but the type
file will not be present.  lscpu depends on the type file being present
for every entry, and will error out without printing system information
if lscpu cannot open the type file.

Presenting information about a cache without indicating its type is not
useful, therefore if we hit a cache with an unknown type, stop populating
sysfs so that userspace has the maximum amount of useful information.

This addresses the following lscpu error, which prevents any output.
lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
file or directory

Suggested-by: Sudeep Holla 
Signed-off-by: Jeffrey Hugo 
Reviewed-by: Jeremy Linton 
---
 drivers/base/cacheinfo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 5d5b598..cf78fa6 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -615,6 +615,8 @@ static int cache_add_dev(unsigned int cpu)
this_leaf = this_cpu_ci->info_list + i;
if (this_leaf->disable_sysfs)
continue;
+   if (this_leaf->type == CACHE_TYPE_NOCACHE)
+   break;
cache_groups = cache_get_attribute_groups(this_leaf);
ci_dev = cpu_device_create(parent, this_leaf, cache_groups,
   "index%1u", i);
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
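
The effect of the two added lines can be sketched in plain C, outside the kernel. The types and names below are hypothetical stand-ins, not the cacheinfo structures; the function only counts how many index entries the loop would create.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's cache type enum. */
enum sysfs_sketch_type {
	SYSFS_SKETCH_NOCACHE = 0,
	SYSFS_SKETCH_INST,
	SYSFS_SKETCH_DATA,
	SYSFS_SKETCH_UNIFIED,
};

struct sysfs_sketch_leaf {
	enum sysfs_sketch_type type;
	int disable_sysfs;
};

/*
 * Walk the leaves in order, as the loop in the patch does.  A leaf with
 * disable_sysfs set is skipped; the first leaf of unknown type ends the
 * walk, so it and every later leaf get no entry.
 */
static size_t sysfs_sketch_count(const struct sysfs_sketch_leaf *leaves,
				 size_t nr_leaves)
{
	size_t created = 0;

	for (size_t i = 0; i < nr_leaves; i++) {
		if (leaves[i].disable_sysfs)
			continue;
		if (leaves[i].type == SYSFS_SKETCH_NOCACHE)
			break;
		created++;
	}
	return created;
}
```

Note the choice of break rather than continue: once an unknown-type leaf is hit, later leaves are not populated either, even if their type is known.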



Re: [PATCH v4 2/3] ACPI / NUMA: Add warning message if the padding size for KASLR is not enough

2018-09-27 Thread Baoquan He
On 09/27/18 at 04:31pm, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma 
> 
> Add warning message if the padding size for KASLR,
> rand_mem_physical_padding, is not enough. The message also
> says the suitable padding size.
> 
> Signed-off-by: Masayoshi Mizuma 
> ---
>  arch/x86/include/asm/setup.h |  2 ++
>  drivers/acpi/numa.c  | 14 ++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
> index ae13bc9..65a5bf8 100644
> --- a/arch/x86/include/asm/setup.h
> +++ b/arch/x86/include/asm/setup.h
> @@ -80,6 +80,8 @@ static inline unsigned long kaslr_offset(void)
>   return (unsigned long)&_text - __START_KERNEL;
>  }
>  
> +extern int rand_mem_physical_padding;
> +
>  /*
>   * Do NOT EVER look at the BIOS memory size location.
>   * It does not work on many machines.
> diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> index 8516760..9c3cc3c 100644
> --- a/drivers/acpi/numa.c
> +++ b/drivers/acpi/numa.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  static nodemask_t nodes_found_map = NODE_MASK_NONE;
>  
> @@ -435,6 +436,8 @@ acpi_table_parse_srat(enum acpi_srat_type id,
>  int __init acpi_numa_init(void)
>  {
>   int cnt = 0;
> + u32 max_phys_addr_tb;
> + u64 max_phys_addr;
>  
>   if (acpi_disabled)
>   return -EINVAL;
> @@ -463,6 +466,17 @@ int __init acpi_numa_init(void)
>  
>   cnt = acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
>   acpi_parse_memory_affinity, 0);
> +
> + if (parsed_numa_memblks && kaslr_enabled()) {
> + max_phys_addr = PFN_PHYS(max_possible_pfn);
> + max_phys_addr_tb = (roundup(max_phys_addr, 1ULL << 40)) >> 40;
> +
> + if (max_phys_addr_tb > rand_mem_physical_padding)

Here I assume max_phys_addr_tb is the end of the possible RAM in the system,
and rand_mem_physical_padding is the space reserved for later memory
extension. Shouldn't we add the actual RAM size to
rand_mem_physical_padding, and then compare the sum with max_phys_addr_tb?

Please correct me if I am wrong.

Thanks
Baoquan

> + pr_warn("Set 'rand_mem_physical_padding=%d' "
> + "as the kernel parameter. "
> + "Otherwise, memory hotadd may be failed.\n",
> + max_phys_addr_tb);
> + }
>   }
>  
>   /* SLIT: System Locality Information Table */
> -- 
> 2.18.0
> 
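
The arithmetic in the quoted hunk can be tried in userspace with a small sketch. It mirrors the check as the patch writes it (round the highest possible physical address up to a whole terabyte, then compare with the padding) and does not attempt to settle the question above of whether the installed RAM size should be added first. The function names are made up for illustration.

```c
#include <stdint.h>

#define KASLR_SKETCH_TB_SHIFT 40 /* 1 TB == 1ULL << 40 bytes */

/*
 * Round a physical address up to the next terabyte boundary and return
 * the result in terabytes, i.e. roundup(addr, 1ULL << 40) >> 40.
 * Assumes addr is far enough below 2^64 that the addition cannot wrap.
 */
static uint32_t kaslr_sketch_addr_to_tb(uint64_t max_phys_addr)
{
	const uint64_t tb = UINT64_C(1) << KASLR_SKETCH_TB_SHIFT;

	return (uint32_t)((max_phys_addr + tb - 1) >> KASLR_SKETCH_TB_SHIFT);
}

/* Non-zero when the check in the hunk would print the warning. */
static int kaslr_sketch_padding_too_small(uint64_t max_phys_addr,
					  uint32_t padding_tb)
{
	return kaslr_sketch_addr_to_tb(max_phys_addr) > padding_tb;
}
```

For instance, an address one byte past 1 TB rounds up to 2 TB, so a padding of 1 would trigger the warning while a padding of 2 would not.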


Re: [PATCH v7 4/6] files: add a replace_fd_files() function

2018-09-27 Thread Jann Horn
On Fri, Sep 28, 2018 at 4:20 AM Kees Cook  wrote:
> On Thu, Sep 27, 2018 at 2:59 PM, Kees Cook  wrote:
> > On Thu, Sep 27, 2018 at 8:11 AM, Tycho Andersen  wrote:
> >> Similar to fd_install/__fd_install, we want to be able to replace an fd of
> >> an arbitrary struct files_struct, not just current's. We'll use this in the
> >> next patch to implement the seccomp ioctl that allows inserting fds into a
> >> stopped process' context.
> >>
> >> v7: new in v7
> >>
> >> Signed-off-by: Tycho Andersen 
> >> CC: Alexander Viro 
> >> CC: Kees Cook 
> >> CC: Andy Lutomirski 
> >> CC: Oleg Nesterov 
> >> CC: Eric W. Biederman 
> >> CC: "Serge E. Hallyn" 
> >> CC: Christian Brauner 
> >> CC: Tyler Hicks 
> >> CC: Akihiro Suda 
> >> ---
> >>  fs/file.c| 22 +++---
> >>  include/linux/file.h |  8 
> >>  2 files changed, 23 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/fs/file.c b/fs/file.c
> >> index 7ffd6e9d103d..3b3c5aadaadb 100644
> >> --- a/fs/file.c
> >> +++ b/fs/file.c
> >> @@ -850,24 +850,32 @@ __releases(&files->file_lock)
> >>  }
> >>
> >>  int replace_fd(unsigned fd, struct file *file, unsigned flags)
> >> +{
> >> +   return replace_fd_task(current, fd, file, flags);
> >> +}
> >> +
> >> +/*
> >> + * Same warning as __alloc_fd()/__fd_install() here.
> >> + */
> >> +int replace_fd_task(struct task_struct *task, unsigned fd,
> >> +   struct file *file, unsigned flags)
> >>  {
> >> int err;
> >> -   struct files_struct *files = current->files;
> >
> > Same feedback as Jann: on a purely "smaller diff" note, this could
> > just be s/current/task/ here and all the other s/files/task->files/
> > would go away...
> >
> >>
> >> if (!file)
> >> -   return __close_fd(files, fd);
> >> +   return __close_fd(task->files, fd);
> >>
> >> -   if (fd >= rlimit(RLIMIT_NOFILE))
> >> +   if (fd >= task_rlimit(task, RLIMIT_NOFILE))
> >> return -EBADF;
> >>
> >> -   spin_lock(&files->file_lock);
> >> -   err = expand_files(files, fd);
> >> +   spin_lock(&task->files->file_lock);
> >> +   err = expand_files(task->files, fd);
> >> if (unlikely(err < 0))
> >> goto out_unlock;
> >> -   return do_dup2(files, file, fd, flags);
> >> +   return do_dup2(task->files, file, fd, flags);
> >>
> >>  out_unlock:
> >> -   spin_unlock(&files->file_lock);
> >> +   spin_unlock(&task->files->file_lock);
> >> return err;
> >>  }
> >>
> >> diff --git a/include/linux/file.h b/include/linux/file.h
> >> index 6b2fb032416c..f94277fee038 100644
> >> --- a/include/linux/file.h
> >> +++ b/include/linux/file.h
> >> @@ -11,6 +11,7 @@
> >>  #include 
> >>
> >>  struct file;
> >> +struct task_struct;
> >>
> >>  extern void fput(struct file *);
> >>
> >> @@ -79,6 +80,13 @@ static inline void fdput_pos(struct fd f)
> >>
> >>  extern int f_dupfd(unsigned int from, struct file *file, unsigned flags);
> >>  extern int replace_fd(unsigned fd, struct file *file, unsigned flags);
> >> +/*
> >> + * Warning! This is only safe if you know the owner of the files_struct is
> >> + * stopped outside syscall context. It's a very bad idea to use this unless you
> >> + * have similar guarantees in your code.
> >> + */
> >> +extern int replace_fd_task(struct task_struct *task, unsigned fd,
> >> +  struct file *file, unsigned flags);
> >
> > Perhaps call this __replace_fd() to indicate the "please don't use
> > this unless you're very sure"ness of it?
> >
> >>  extern void set_close_on_exec(unsigned int fd, int flag);
> >>  extern bool get_close_on_exec(unsigned int fd);
> >>  extern int get_unused_fd_flags(unsigned flags);
> >> --
> >> 2.17.1
> >>
> >
> > If I can get an Ack from Al, that would be very nice. :)
>
> In out-of-band feedback from Al, he's pointed out a much cleaner
> approach: do the work on the "current" side. i.e. current is stopped
> in __seccomp_filter in the case SECCOMP_RET_USER_NOTIFY. Instead of
> having the ioctl-handling process doing the work, have it done on the
> other side. This may cause some additional complexity on the ioctl
> return path, but it solves both this problem and the "ptrace attach"
> issue: have the work delayed until "current" gets caught by seccomp.

Can you elaborate on this? Are you saying you want to, for every file
descriptor that should be transferred, put a reference to the file
into the kernel's seccomp notification data structure, wake up the
task that's waiting for a reply, let the task install an fd, send back
a response on whether installing the FD worked, and then return that
response back to the container manager process? That sounds
like a pretty complicated dance that I'd prefer to avoid.


Re: [PATCH] bcache: add separate workqueue for journal_write to avoid deadlock

2018-09-27 Thread Coly Li



On 9/27/18 11:53 PM, Eddie Chapman wrote:
> On 27/09/18 16:23, Coly Li wrote:
>> On 9/27/18 9:45 PM, guoju wrote:
>>> After a write to the SSD completes, bcache schedules the journal_write
>>> work on system_wq, a system-wide shared workqueue that lacks the
>>> WQ_MEM_RECLAIM flag. system_wq is also a bound wq, and there may be no
>>> idle kworker on the current processor. Creating a new kworker may
>>> need to reclaim memory first, by shrinking the cache and slab used by
>>> the vfs, which depends on the bcache device. That's a deadlock.
>>>
>>> This patch creates a new workqueue for journal_write with the
>>> WQ_MEM_RECLAIM flag. Its rescuer thread will work to avoid the deadlock.
>>>
>>> Signed-off-by: guoju
>>
>> Nice catch, this fix is quite important. I will try to submit it to
>> Jens ASAP.
>>
>> Thanks.
>>
>> Coly Li
>
> Once this goes into 4.19, would this be a candidate for backporting to
> any stable kernels, or does it only fix something introduced in this
> cycle?

This bug has existed upstream for quite a long time, so the fix should be
applied to every stable kernel it can be applied to. It is already Cc'ed to
sta...@vger.kernel.org.

Coly Li



Re: [PATCH] bcache: add separate workqueue for journal_write to avoid deadlock

2018-09-27 Thread Coly Li

Hi Stefan,

This bug is triggered by the following sequence:

1. Little system memory is available to allocate.

2. The journal defers its operations to system_wq, which needs to
   allocate memory in order to execute them.

3. Due to the lack of memory, the kernel starts to reclaim system memory
   and triggers writeback to the file system on top of the bcache device.

4. The writeback I/O hits the bcache device via the upper-layer file
   system, requiring more bcache journal operations.

5. The bcache journal ends up blocked in a circular wait.

If your system is under heavy memory pressure, this deadlock may also
happen in your environment. In any case, I suggest applying this patch,
because it fixes a real deadlock that is likely to occur when system
memory is exhausted.



Thanks.


Coly Li

On 9/28/18 1:16 AM, Stefan Priebe - Profihost AG wrote:
> Hi Coly,
>
> is this the deadlock I reported some weeks ago?
>
> Greets,
> Stefan
>
> Excuse my typo sent from my mobile phone.
>
> On 27.09.2018 at 17:53, Eddie Chapman wrote:
>> On 27/09/18 16:23, Coly Li wrote:
>>> On 9/27/18 9:45 PM, guoju wrote:
>>>> After a write to the SSD completes, bcache schedules the journal_write
>>>> work on system_wq, a system-wide shared workqueue that lacks the
>>>> WQ_MEM_RECLAIM flag. system_wq is also a bound wq, and there may be no
>>>> idle kworker on the current processor. Creating a new kworker may
>>>> need to reclaim memory first, by shrinking the cache and slab used by
>>>> the vfs, which depends on the bcache device. That's a deadlock.
>>>>
>>>> This patch creates a new workqueue for journal_write with the
>>>> WQ_MEM_RECLAIM flag. Its rescuer thread will work to avoid the deadlock.
>>>>
>>>> Signed-off-by: guoju <fanggu...@gmail.com>
>>>
>>> Nice catch, this fix is quite important. I will try to submit it to
>>> Jens ASAP.
>>>
>>> Thanks.
>>> Coly Li
>>
>> Once this goes into 4.19, would this be a candidate for backporting
>> to any stable kernels, or does it only fix something introduced in
>> this cycle?
>>
>> thanks,
>> Eddie


Re: [PATCH v4 3/3] docs: kernel-parameters.txt: document rand_mem_physical_padding parameter

2018-09-27 Thread Masayoshi Mizuma
On Thu, Sep 27, 2018 at 11:17:47PM +0200, Borislav Petkov wrote:
> On Thu, Sep 27, 2018 at 04:31:46PM -0400, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma 
> > 
> > This kernel parameter allows to change the padding used
> > for the physical memory mapping section when KASLR
> > memory is enabled.
> > 
> > For memory hotplug capable systems, the default padding size,
> > CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, may not be enough.
> > The option is useful to adjust the padding size.
> > 
> > Signed-off-by: Masayoshi Mizuma 
> > ---
> >  Documentation/admin-guide/kernel-parameters.txt | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 92eb1f4..de43cdf 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -3529,6 +3529,13 @@
> > fully seed the kernel's CRNG. Default is controlled
> > by CONFIG_RANDOM_TRUST_CPU.
> >  
> > +   rand_mem_physical_padding=
> > +   [KNL] Define the padding size in terabytes
> > +   used for the physical memory mapping section
> > +   when KASLR memory is enabled.
> > +   The default value is
> > +   CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING.
> 
> Yet another kernel parameter which forces me to go look at what the
> code does because this help text doesn't really help. And I see that in
> > previous iterations on lkml it was *actually* properly explained why
> > this parameter is needed.
> > 
> > So please summarize that explanation here so that the user can make an
> > informed decision when reading this help text. Always think of explaining
> > this to a colleague of yours who doesn't know about the memory padding
> > and memory hotadd problem, and try to write it in such a way that
> > your colleague understands it.
> 
> :-)

You are right, I didn't make it clear enough...
Thank you for your comments, I'll fix the description.

Thanks!
Masa


Re: [PATCH 0/3] Kbuild: Some fixdep tweaks

2018-09-27 Thread Masahiro Yamada
Hi Rasmus,


On Thu, Sep 27, 2018 at 3:58, Rasmus Villemoes wrote:
>
> On 15 August 2018 at 16:27, Rasmus Villemoes  wrote:
> > These patches eliminate two (albeit tiny and shortlived) processes
> > from the cmd_and_fixdep rule, i.e. from every TU being
> > compiled. Whether the diffstat below is worth it I'll leave to Kbuild
> > maintainers to decide.
>
> Ping.


Sorry for the delay.
As far as I tested, the performance improvement was not noticeable.

This patch set really sits on the fence.
I tend to choose not to apply when I cannot make up my mind.


If you have something more to convince me, please let me know.


-- 
Best Regards
Masahiro Yamada


Re: [PATCH v7 4/6] files: add a replace_fd_files() function

2018-09-27 Thread Kees Cook
On Thu, Sep 27, 2018 at 2:59 PM, Kees Cook  wrote:
> On Thu, Sep 27, 2018 at 8:11 AM, Tycho Andersen  wrote:
>> Similar to fd_install/__fd_install, we want to be able to replace an fd of
>> an arbitrary struct files_struct, not just current's. We'll use this in the
>> next patch to implement the seccomp ioctl that allows inserting fds into a
>> stopped process' context.
>>
>> v7: new in v7
>>
>> Signed-off-by: Tycho Andersen 
>> CC: Alexander Viro 
>> CC: Kees Cook 
>> CC: Andy Lutomirski 
>> CC: Oleg Nesterov 
>> CC: Eric W. Biederman 
>> CC: "Serge E. Hallyn" 
>> CC: Christian Brauner 
>> CC: Tyler Hicks 
>> CC: Akihiro Suda 
>> ---
>>  fs/file.c| 22 +++---
>>  include/linux/file.h |  8 
>>  2 files changed, 23 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/file.c b/fs/file.c
>> index 7ffd6e9d103d..3b3c5aadaadb 100644
>> --- a/fs/file.c
>> +++ b/fs/file.c
>> @@ -850,24 +850,32 @@ __releases(&files->file_lock)
>>  }
>>
>>  int replace_fd(unsigned fd, struct file *file, unsigned flags)
>> +{
>> +   return replace_fd_task(current, fd, file, flags);
>> +}
>> +
>> +/*
>> + * Same warning as __alloc_fd()/__fd_install() here.
>> + */
>> +int replace_fd_task(struct task_struct *task, unsigned fd,
>> +   struct file *file, unsigned flags)
>>  {
>> int err;
>> -   struct files_struct *files = current->files;
>
> Same feedback as Jann: on a purely "smaller diff" note, this could
> just be s/current/task/ here and all the other s/files/task->files/
> would go away...
>
>>
>> if (!file)
>> -   return __close_fd(files, fd);
>> +   return __close_fd(task->files, fd);
>>
>> -   if (fd >= rlimit(RLIMIT_NOFILE))
>> +   if (fd >= task_rlimit(task, RLIMIT_NOFILE))
>> return -EBADF;
>>
>> -   spin_lock(&files->file_lock);
>> -   err = expand_files(files, fd);
>> +   spin_lock(&task->files->file_lock);
>> +   err = expand_files(task->files, fd);
>> if (unlikely(err < 0))
>> goto out_unlock;
>> -   return do_dup2(files, file, fd, flags);
>> +   return do_dup2(task->files, file, fd, flags);
>>
>>  out_unlock:
>> -   spin_unlock(&files->file_lock);
>> +   spin_unlock(&task->files->file_lock);
>> return err;
>>  }
>>
>> diff --git a/include/linux/file.h b/include/linux/file.h
>> index 6b2fb032416c..f94277fee038 100644
>> --- a/include/linux/file.h
>> +++ b/include/linux/file.h
>> @@ -11,6 +11,7 @@
>>  #include 
>>
>>  struct file;
>> +struct task_struct;
>>
>>  extern void fput(struct file *);
>>
>> @@ -79,6 +80,13 @@ static inline void fdput_pos(struct fd f)
>>
>>  extern int f_dupfd(unsigned int from, struct file *file, unsigned flags);
>>  extern int replace_fd(unsigned fd, struct file *file, unsigned flags);
>> +/*
>> + * Warning! This is only safe if you know the owner of the files_struct is
>> + * stopped outside syscall context. It's a very bad idea to use this unless you
>> + * have similar guarantees in your code.
>> + */
>> +extern int replace_fd_task(struct task_struct *task, unsigned fd,
>> +  struct file *file, unsigned flags);
>
> Perhaps call this __replace_fd() to indicate the "please don't use
> this unless you're very sure"ness of it?
>
>>  extern void set_close_on_exec(unsigned int fd, int flag);
>>  extern bool get_close_on_exec(unsigned int fd);
>>  extern int get_unused_fd_flags(unsigned flags);
>> --
>> 2.17.1
>>
>
> If I can get an Ack from Al, that would be very nice. :)

In out-of-band feedback from Al, he's pointed out a much cleaner
approach: do the work on the "current" side. i.e. current is stopped
in __seccomp_filter in the case SECCOMP_RET_USER_NOTIFY. Instead of
having the ioctl-handling process do the work, have it done on the
other side. This may cause some additional complexity on the ioctl
return path, but it solves both this problem and the "ptrace attach"
issue: have the work delayed until "current" gets caught by seccomp.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v4 2/3] ACPI / NUMA: Add warning message if the padding size for KASLR is not enough

2018-09-27 Thread Masayoshi Mizuma
On Thu, Sep 27, 2018 at 11:14:25PM +0200, Borislav Petkov wrote:
> On Thu, Sep 27, 2018 at 04:31:45PM -0400, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma 
> > 
> > Add warning message if the padding size for KASLR,
> > rand_mem_physical_padding, is not enough. The message also
> > says the suitable padding size.
> > 
> > Signed-off-by: Masayoshi Mizuma 
> > ---
> >  arch/x86/include/asm/setup.h |  2 ++
> >  drivers/acpi/numa.c  | 14 ++
> >  2 files changed, 16 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
> > index ae13bc9..65a5bf8 100644
> > --- a/arch/x86/include/asm/setup.h
> > +++ b/arch/x86/include/asm/setup.h
> > @@ -80,6 +80,8 @@ static inline unsigned long kaslr_offset(void)
> > return (unsigned long)&_text - __START_KERNEL;
> >  }
> >  
> > +extern int rand_mem_physical_padding;
> > +
> >  /*
> >   * Do NOT EVER look at the BIOS memory size location.
> >   * It does not work on many machines.
> > diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> > index 8516760..9c3cc3c 100644
> > --- a/drivers/acpi/numa.c
> > +++ b/drivers/acpi/numa.c
> > @@ -32,6 +32,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  static nodemask_t nodes_found_map = NODE_MASK_NONE;
> >  
> > @@ -435,6 +436,8 @@ acpi_table_parse_srat(enum acpi_srat_type id,
> >  int __init acpi_numa_init(void)
> >  {
> > int cnt = 0;
> > +   u32 max_phys_addr_tb;
> > +   u64 max_phys_addr;
> >  
> > if (acpi_disabled)
> > return -EINVAL;
> > @@ -463,6 +466,17 @@ int __init acpi_numa_init(void)
> >  
> > cnt = acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
> > acpi_parse_memory_affinity, 0);
> > +
> > +   if (parsed_numa_memblks && kaslr_enabled()) {
> > +   max_phys_addr = PFN_PHYS(max_possible_pfn);
> > +   max_phys_addr_tb = (roundup(max_phys_addr, 1ULL << 40)) >> 40;
> > +
> > +   if (max_phys_addr_tb > rand_mem_physical_padding)
> > +   pr_warn("Set 'rand_mem_physical_padding=%d' "
> > +   "as the kernel parameter. "
> > +   "Otherwise, memory hotadd may be failed.\n",
> > +   max_phys_addr_tb);
> 
> Please integrate scripts/checkpatch.pl into your patch creation
> workflow. Some of the warnings/errors *actually* make sense:
> 
> WARNING: quoted string split across lines
> #75: FILE: drivers/acpi/numa.c:476:
> +   pr_warn("Set 'rand_mem_physical_padding=%d' "
> +   "as the kernel parameter. "
> 
> WARNING: quoted string split across lines
> #76: FILE: drivers/acpi/numa.c:477:
> +   "as the kernel parameter. "
> +   "Otherwise, memory hotadd may be failed.\n",
> 
> total: 0 errors, 2 warnings, 40 lines checked
> 
> Also, that sentence needs polishing:
> 
>   pr_warn("Set 'rand_mem_physical_padding=%d' to avoid memory hotadd failure.\n",

Thank you for pointing it out.
I'll fix it.

Thanks,
Masa

> 
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.
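
The check discussed in this thread rounds the highest possible physical
address up to whole terabytes and compares it with the configured padding.
The following is a minimal userspace sketch of that arithmetic, not the
kernel code: the helper names phys_addr_to_tb() and padding_too_small() are
hypothetical, fprintf() stands in for pr_warn(), and the padding is assumed
to be non-negative. The warning string is kept on one line, as checkpatch
asks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TB_SHIFT 40 /* 1 TB = 2^40 bytes */

/*
 * Hypothetical helper: round a physical address up to a whole number of
 * terabytes, like roundup(max_phys_addr, 1ULL << 40) >> 40 in the patch.
 * As with the kernel's roundup(), addresses within 1 TB of 2^64 would
 * overflow; this sketch does not guard against that.
 */
static uint32_t phys_addr_to_tb(uint64_t max_phys_addr)
{
	const uint64_t tb = 1ULL << TB_SHIFT;

	return (uint32_t)((max_phys_addr + tb - 1) >> TB_SHIFT);
}

/*
 * Hypothetical helper: returns 1 and prints the suggested setting when
 * the padding (in TB) is smaller than what the address range needs,
 * otherwise returns 0.
 */
static int padding_too_small(uint64_t max_phys_addr, int padding_tb)
{
	uint32_t needed_tb = phys_addr_to_tb(max_phys_addr);

	assert(padding_tb >= 0); /* assumption of this sketch */

	if (needed_tb > (uint32_t)padding_tb) {
		fprintf(stderr, "Set 'rand_mem_physical_padding=%u' to avoid memory hotadd failure.\n", needed_tb);
		return 1;
	}
	return 0;
}
```

With a padding of 10 TB, for example, a machine whose highest possible
address is 12 TB would trigger the message, while one at 3 TB would not.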


[PATCH -next] staging: rtlwifi: Remove set but not used variable 'ppsc'

2018-09-27 Thread YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c: In function 'halbtc_leave_lps':
drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c:284:21: warning:
 variable 'ppsc' set but not used [-Wunused-but-set-variable]

drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c: In function 'halbtc_enter_lps':
drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c:307:21: warning:
 variable 'ppsc' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing 
---
 drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c b/drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c
index 85a7490..24e19ff 100644
--- a/drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c
+++ b/drivers/staging/rtlwifi/btcoexist/halbtcoutsrc.c
@@ -281,11 +281,9 @@ bool halbtc_send_bt_mp_operation(struct btc_coexist *btcoexist, u8 op_code,
 static void halbtc_leave_lps(struct btc_coexist *btcoexist)
 {
struct rtl_priv *rtlpriv;
-   struct rtl_ps_ctl *ppsc;
bool ap_enable = false;
 
rtlpriv = btcoexist->adapter;
-   ppsc = rtl_psc(rtlpriv);
 
btcoexist->btc_get(btcoexist, BTC_GET_BL_WIFI_AP_MODE_ENABLE,
   &ap_enable);
@@ -304,11 +302,9 @@ static void halbtc_leave_lps(struct btc_coexist *btcoexist)
 static void halbtc_enter_lps(struct btc_coexist *btcoexist)
 {
struct rtl_priv *rtlpriv;
-   struct rtl_ps_ctl *ppsc;
bool ap_enable = false;
 
rtlpriv = btcoexist->adapter;
-   ppsc = rtl_psc(rtlpriv);
 
btcoexist->btc_get(btcoexist, BTC_GET_BL_WIFI_AP_MODE_ENABLE,
   &ap_enable);
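
For readers unfamiliar with the warning, here is a minimal standalone
sketch of the pattern the patch removes. The struct and function names
(demo_priv, demo_ps_ctl, demo_leave_lps_*) are hypothetical stand-ins and
not the rtlwifi types. The "before" variant is only compiled when
SHOW_WARNING is defined, so a normal build stays warning-free; building
with gcc -Wall -DSHOW_WARNING should report the set-but-unused variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct demo_ps_ctl {
	int state;
};

struct demo_priv {
	struct demo_ps_ctl psc;
	bool ap_enable;
};

#ifdef SHOW_WARNING
/*
 * Before: 'ppsc' is assigned but never read, which gcc reports as
 * "variable 'ppsc' set but not used" under -Wunused-but-set-variable
 * (part of -Wall).
 */
static bool demo_leave_lps_before(struct demo_priv *priv)
{
	struct demo_ps_ctl *ppsc;

	assert(priv != NULL);
	ppsc = &priv->psc; /* set here, never used afterwards */

	return !priv->ap_enable;
}
#endif

/*
 * After: the declaration and the assignment are both dropped; the
 * function's behavior is unchanged.
 */
static bool demo_leave_lps_after(struct demo_priv *priv)
{
	assert(priv != NULL);

	return !priv->ap_enable;
}
```

Removing both the declaration and the dead assignment, as the patch does,
silences the warning without changing what the function returns.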



[PATCH v3 6/7] drivers: oprofile: Avoids building driver from direct make command

2018-09-27 Thread Leonardo Brás
Creates new Makefile to avoid building driver if
'make drivers/oprofile/' is called directly.

This driver is usually built from arch/$ARCH, and building it
on its own does not seem meaningful.

Signed-off-by: Leonardo Brás 
---
 drivers/oprofile/Makefile | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 drivers/oprofile/Makefile

diff --git a/drivers/oprofile/Makefile b/drivers/oprofile/Makefile
new file mode 100644
index ..361867ec2338
--- /dev/null
+++ b/drivers/oprofile/Makefile
@@ -0,0 +1 @@
+#Does nothing, since the source is called from arch/$ARCH/ tree.
-- 
2.19.0



[PATCH v3 7/7] drivers: hwtracing: Adds Makefile to enable building from directory.

2018-09-27 Thread Leonardo Brás
Adds Makefile to enable building the driver using
'make drivers/hwtracing/'.
Changes drivers/Makefile to call the new Makefile directly.
This lets users build this driver without building the whole drivers/
subtree.

Signed-off-by: Leonardo Brás 
---
 drivers/Makefile   | 4 +---
 drivers/hwtracing/Makefile | 3 +++
 2 files changed, 4 insertions(+), 3 deletions(-)
 create mode 100644 drivers/hwtracing/Makefile

diff --git a/drivers/Makefile b/drivers/Makefile
index 578f469f72fb..a237be6b602f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -174,9 +174,7 @@ obj-$(CONFIG_MCB)   += mcb/
 obj-$(CONFIG_PERF_EVENTS)  += perf/
 obj-$(CONFIG_RAS)  += ras/
 obj-$(CONFIG_THUNDERBOLT)  += thunderbolt/
-obj-$(CONFIG_CORESIGHT)+= hwtracing/coresight/
-obj-y  += hwtracing/intel_th/
-obj-$(CONFIG_STM)  += hwtracing/stm/
+obj-y  += hwtracing/
 obj-$(CONFIG_ANDROID)  += android/
 obj-$(CONFIG_NVMEM)+= nvmem/
 obj-$(CONFIG_FPGA) += fpga/
diff --git a/drivers/hwtracing/Makefile b/drivers/hwtracing/Makefile
new file mode 100644
index ..fe5773caec49
--- /dev/null
+++ b/drivers/hwtracing/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_CORESIGHT)+= coresight/
+obj-y  += intel_th/
+obj-$(CONFIG_STM)  += stm/
-- 
2.19.0



[PATCH v3 5/7] drivers: s390: Avoids building drivers if ARCH is not s390.

2018-09-27 Thread Leonardo Brás
Avoids building s390 drivers if 'make drivers/s390/' is called but
ARCH is not s390.

Signed-off-by: Leonardo Brás 
---
 drivers/s390/Makefile | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/Makefile b/drivers/s390/Makefile
index a863b0462b43..0575f02dba45 100644
--- a/drivers/s390/Makefile
+++ b/drivers/s390/Makefile
@@ -3,7 +3,7 @@
 # Makefile for the S/390 specific device drivers
 #
 
-obj-y += cio/ block/ char/ crypto/ net/ scsi/ virtio/
-
-drivers-y += drivers/s390/built-in.a
-
+ifeq ($(ARCH),s390)
+   obj-y += cio/ block/ char/ crypto/ net/ scsi/ virtio/
+   drivers-y += drivers/s390/built-in.a
+endif
-- 
2.19.0



[PATCH v3 4/7] drivers: zorro: Avoids building proc.o if CONFIG_ZORRO is disabled

2018-09-27 Thread Leonardo Brás
Avoids building proc.o if 'make drivers/zorro/' is called and
CONFIG_ZORRO is disabled, even if CONFIG_PROC_FS is enabled.

Signed-off-by: Leonardo Brás 
---
 drivers/zorro/Makefile | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/zorro/Makefile b/drivers/zorro/Makefile
index b360ac4ea846..d580f9f08e0a 100644
--- a/drivers/zorro/Makefile
+++ b/drivers/zorro/Makefile
@@ -3,9 +3,10 @@
 # Makefile for the Zorro bus specific drivers.
 #
 
-obj-$(CONFIG_ZORRO)+= zorro.o zorro-driver.o zorro-sysfs.o
-obj-$(CONFIG_PROC_FS)  += proc.o
-obj-$(CONFIG_ZORRO_NAMES) +=  names.o
+obj-$(CONFIG_ZORRO):= zorro_all.o
+zorro_all-y+= zorro.o zorro-driver.o zorro-sysfs.o
+zorro_all-$(CONFIG_ZORRO_NAMES) += names.o
+zorro_all-$(CONFIG_PROC_FS)+= proc.o
 
 hostprogs-y:= gen-devlist
 
-- 
2.19.0



[PATCH v3 3/7] drivers: parisc: Avoids building driver if CONFIG_PARISC is disabled

2018-09-27 Thread Leonardo Brás
Avoids building driver if 'make drivers/parisc/' is called and
CONFIG_PARISC is disabled.

Signed-off-by: Leonardo Brás 
---
 drivers/parisc/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/parisc/Makefile b/drivers/parisc/Makefile
index 3cd5e6cb8478..80049d763aa0 100644
--- a/drivers/parisc/Makefile
+++ b/drivers/parisc/Makefile
@@ -24,5 +24,5 @@ obj-$(CONFIG_EISA)+= eisa.o eisa_enumerator.o eisa_eeprom.o
 obj-$(CONFIG_SUPERIO)  += superio.o
 obj-$(CONFIG_CHASSIS_LCD_LED)  += led.o
 obj-$(CONFIG_PDC_STABLE)   += pdc_stable.o
-obj-y  += power.o
+obj-$(CONFIG_PARISC)   += power.o
 
-- 
2.19.0



[PATCH v3 0/7] Remove errors building drivers/DRIVERNAME

2018-09-27 Thread Leonardo Brás
Special thanks for the feedback from:
- Finn Thain (I fixed the build problem)
- Geert Uytterhoeven (The cross compilers were very useful)
- Rolf Eike Beer (Was unintentional, thanks for the help!)

This patchset changes some drivers' Makefiles so that they can be built
using the command 'make drivers/DRIVERNAME', if compatible.

Before these changes, the affected drivers returned an error when the
above command was run on them after an x86 allyesconfig.

The main reason for this patchset is to allow building lists of
drivers to look for warnings and errors to be fixed.

I see this change as a new feature, not a bugfix. I understand
the default behavior may be building with a simple 'make', but I
believe adding this new possibility will not be harmful.

My main objective is to allow developers with low processing power
to make changes in the kernel and look for bugs using free services
like GitLab CI before submitting to the community.

If there is any interest in helping with or using this, I have a
prototype at:
https://gitlab.com/LeoBras/linux-next


Leonardo Brás (7):
  drivers: dio: Avoids building driver if CONFIG_DIO is disabled
  drivers: nubus: Avoids building driver if CONFIG_NUBUS is disabled
  drivers: parisc: Avoids building driver if CONFIG_PARISC is disabled
  drivers: zorro: Avoids building proc.o if CONFIG_ZORRO is disabled
  drivers: s390: Avoids building drivers if ARCH is not s390.
  drivers: oprofile: Avoids building driver from direct make command
  drivers: hwtracing: Adds Makefile to enable building from directory.

 drivers/Makefile   | 4 +---
 drivers/dio/Makefile   | 2 +-
 drivers/hwtracing/Makefile | 3 +++
 drivers/nubus/Makefile | 5 +++--
 drivers/oprofile/Makefile  | 1 +
 drivers/parisc/Makefile| 2 +-
 drivers/s390/Makefile  | 8 
 drivers/zorro/Makefile | 7 ---
 8 files changed, 18 insertions(+), 14 deletions(-)
 create mode 100644 drivers/hwtracing/Makefile
 create mode 100644 drivers/oprofile/Makefile

-- 
2.19.0



[PATCH v3 2/7] drivers: nubus: Avoids building driver if CONFIG_NUBUS is disabled

2018-09-27 Thread Leonardo Brás
Avoids building driver if 'make drivers/nubus/' is called and
CONFIG_NUBUS is disabled.
Avoids building proc.o if CONFIG_PROC_FS is enabled but
CONFIG_NUBUS is disabled.

Signed-off-by: Leonardo Brás 
---
 drivers/nubus/Makefile | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nubus/Makefile b/drivers/nubus/Makefile
index 6d063cde39d1..1daa51217e95 100644
--- a/drivers/nubus/Makefile
+++ b/drivers/nubus/Makefile
@@ -2,6 +2,7 @@
 # Makefile for the nubus specific drivers.
 #
 
-obj-y := nubus.o bus.o
+obj-$(CONFIG_NUBUS) := nubus_all.o
+nubus_all-y += bus.o nubus.o
 
-obj-$(CONFIG_PROC_FS) += proc.o
+nubus_all-$(CONFIG_PROC_FS) += proc.o
-- 
2.19.0


