[PATCH AUTOSEL for 4.4 005/162] MIPS: kprobes: flush_insn_slot should flush only if probe initialised

2018-04-08 Thread Sasha Levin
From: Marcin Nowakowski 

[ Upstream commit 698b851073ddf5a894910d63ca04605e0473414e ]

When ftrace is used with kprobes, it is possible for a kprobe to contain
an invalid location (i.e. only initialised to 0 and not to a specific
location in the code). Trying to perform a cache flush on such a location
leads to a crash in r4k_flush_icache_range().

Fixes: c1bf207d6ee1 ("MIPS: kprobe: Add support.")
Signed-off-by: Marcin Nowakowski 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16296/
Signed-off-by: Ralf Baechle 
Signed-off-by: Sasha Levin 
---
 arch/mips/include/asm/kprobes.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/kprobes.h b/arch/mips/include/asm/kprobes.h
index daba1f9a4f79..174aedce3167 100644
--- a/arch/mips/include/asm/kprobes.h
+++ b/arch/mips/include/asm/kprobes.h
@@ -40,7 +40,8 @@ typedef union mips_instruction kprobe_opcode_t;
 
 #define flush_insn_slot(p) \
 do {   \
-   flush_icache_range((unsigned long)p->addr,  \
+   if (p->addr)\
+   flush_icache_range((unsigned long)p->addr,  \
   (unsigned long)p->addr + \
   (MAX_INSN_SIZE * sizeof(kprobe_opcode_t)));  \
 } while (0)
-- 
2.15.1


[PATCH AUTOSEL for 4.4 003/162] perf/core: Correct event creation with PERF_FORMAT_GROUP

2018-04-08 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit ba5213ae6b88fb170c4771fef6553f759c7d8cdd ]

Andi was asking about PERF_FORMAT_GROUP vs inherited events, which led
to the discovery of a bug from commit:

  3dab77fb1bf8 ("perf: Rework/fix the whole read vs group stuff")

 -   PERF_SAMPLE_GROUP   = 1U << 4,
 +   PERF_SAMPLE_READ= 1U << 4,

 -   if (attr->inherit && (attr->sample_type & PERF_SAMPLE_GROUP))
 +   if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))

is a clear fail :/

While this changes user-visible behaviour (it was previously possible
to create an inherited event with PERF_SAMPLE_READ), this is deemed
acceptable because its results were always incorrect.
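
As an illustration only (not part of the patch), a minimal user-space sketch
of the attribute combination that perf_event_open() rejects after this change;
the raw syscall is used because glibc provides no wrapper:

#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;
	attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_READ; /* now rejected... */
	attr.inherit = 1;                                     /* ...when combined with inherit */

	long fd = syscall(__NR_perf_event_open, &attr, 0 /* self */,
			  -1 /* any cpu */, -1 /* no group */, 0 /* flags */);
	if (fd < 0)
		perror("perf_event_open"); /* expected to fail (typically EINVAL) after this patch */
	return 0;
}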

Reported-by: Andi Kleen 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Fixes:  3dab77fb1bf8 ("perf: Rework/fix the whole read vs group stuff")
Link: 
http://lkml.kernel.org/r/20170530094512.dy2nljns2uq7q...@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar 
Signed-off-by: Sasha Levin 
---
 kernel/events/core.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8f75386e61a7..835ac4d9f349 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5271,9 +5271,6 @@ static void perf_output_read_one(struct perf_output_handle *handle,
__output_copy(handle, values, n * sizeof(u64));
 }
 
-/*
- * XXX PERF_FORMAT_GROUP vs inherited events seems difficult.
- */
 static void perf_output_read_group(struct perf_output_handle *handle,
struct perf_event *event,
u64 enabled, u64 running)
@@ -5318,6 +5315,13 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 #define PERF_FORMAT_TOTAL_TIMES (PERF_FORMAT_TOTAL_TIME_ENABLED|\
 PERF_FORMAT_TOTAL_TIME_RUNNING)
 
+/*
+ * XXX PERF_SAMPLE_READ vs inherited events seems difficult.
+ *
+ * The problem is that its both hard and excessively expensive to iterate the
+ * child list, not to mention that its impossible to IPI the children running
+ * on another CPU, from interrupt/NMI context.
+ */
 static void perf_output_read(struct perf_output_handle *handle,
 struct perf_event *event)
 {
@@ -7958,9 +7962,10 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
	local64_set(&hwc->period_left, hwc->sample_period);
 
/*
-* we currently do not support PERF_FORMAT_GROUP on inherited events
+* We currently do not support PERF_SAMPLE_READ on inherited events.
+* See perf_output_read().
 */
-   if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
+   if (attr->inherit && (attr->sample_type & PERF_SAMPLE_READ))
goto err_ns;
 
if (!has_branch_stack(event))
-- 
2.15.1


[PATCH AUTOSEL for 4.4 008/162] rcu: Make synchronize_rcu_mult() check for duplicates

2018-04-08 Thread Sasha Levin
From: "Paul E. McKenney" 

[ Upstream commit 68ab0b4263224157f4d0c0e42854169a183d7534 ]

Currently, doing synchronize_rcu_mult(call_rcu, call_rcu) might
(or might not) wait for two RCU grace periods.  One approach is
of course "don't do that!", but in CONFIG_PREEMPT=n kernels,
synchronize_rcu_mult(call_rcu, call_rcu_sched) does exactly that.
This results in an ugly #ifdef in sched_cpu_deactivate().

This commit therefore makes __wait_rcu_gp() check for duplicates,
which in turn allows duplicates to be passed to synchronize_rcu_mult()
without risk of waiting twice on the same type of grace period.
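
As a hedged illustration (not part of the diff below), the call pattern this
makes safe looks like the following; on a CONFIG_PREEMPT=n kernel, call_rcu
and call_rcu_sched are effectively the same flavour, and after this change the
duplicate is detected and waited on only once:

#include <linux/rcupdate.h>

/* Sketch only: a caller may now pass duplicate flavours without an #ifdef. */
static void example_sync(void)
{
	/*
	 * Previously this could wait for two grace periods (or force an
	 * ugly #ifdef CONFIG_PREEMPT in the caller); now __wait_rcu_gp()
	 * spots the duplicate and registers/waits on it only once.
	 */
	synchronize_rcu_mult(call_rcu, call_rcu_sched);
}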

Signed-off-by: Paul E. McKenney 
Signed-off-by: Sasha Levin 
---
 kernel/rcu/update.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 5f748c5a40f0..d98acb903325 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -324,6 +324,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
   struct rcu_synchronize *rs_array)
 {
int i;
+   int j;
 
/* Initialize and register callbacks for each flavor specified. */
for (i = 0; i < n; i++) {
@@ -335,7 +336,11 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
}
init_rcu_head_on_stack(&rs_array[i].head);
init_completion(&rs_array[i].completion);
-   (crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
+   for (j = 0; j < i; j++)
+   if (crcu_array[j] == crcu_array[i])
+   break;
+   if (j == i)
+   (crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
}
 
/* Wait for all callbacks to be invoked. */
@@ -344,7 +349,11 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
(crcu_array[i] == call_rcu ||
 crcu_array[i] == call_rcu_bh))
continue;
-   wait_for_completion(&rs_array[i].completion);
+   for (j = 0; j < i; j++)
+   if (crcu_array[j] == crcu_array[i])
+   break;
+   if (j == i)
+   wait_for_completion(&rs_array[i].completion);
destroy_rcu_head_on_stack(&rs_array[i].head);
}
 }
-- 
2.15.1


[PATCH AUTOSEL for 4.4 004/162] MIPS: mm: fixed mappings: correct initialisation

2018-04-08 Thread Sasha Levin
From: Marcin Nowakowski 

[ Upstream commit 71eb989ab5a110df8bcbb9609bacde73feacbedd ]

fixrange_init operates at PMD-granularity and expects the addresses to
be PMD-size aligned, but currently that might not be the case for
PKMAP_BASE unless it is defined properly, so ensure a correct alignment
is used before passing the address to fixrange_init.

fixed mappings: only align the start address that is passed to
fixrange_init rather than the value before adding the size, as we may
end up with an uninitialised upper part of the range.

Signed-off-by: Marcin Nowakowski 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15948/
Signed-off-by: Ralf Baechle 
Signed-off-by: Sasha Levin 
---
 arch/mips/mm/pgtable-32.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/mm/pgtable-32.c b/arch/mips/mm/pgtable-32.c
index adc6911ba748..b19a3c506b1e 100644
--- a/arch/mips/mm/pgtable-32.c
+++ b/arch/mips/mm/pgtable-32.c
@@ -51,15 +51,15 @@ void __init pagetable_init(void)
/*
 * Fixed mappings:
 */
-   vaddr = __fix_to_virt(__end_of_fixed_addresses - 1) & PMD_MASK;
-   fixrange_init(vaddr, vaddr + FIXADDR_SIZE, pgd_base);
+   vaddr = __fix_to_virt(__end_of_fixed_addresses - 1);
+   fixrange_init(vaddr & PMD_MASK, vaddr + FIXADDR_SIZE, pgd_base);
 
 #ifdef CONFIG_HIGHMEM
/*
 * Permanent kmaps:
 */
vaddr = PKMAP_BASE;
-   fixrange_init(vaddr, vaddr + PAGE_SIZE*LAST_PKMAP, pgd_base);
+   fixrange_init(vaddr & PMD_MASK, vaddr + PAGE_SIZE*LAST_PKMAP, pgd_base);
 
pgd = swapper_pg_dir + __pgd_offset(vaddr);
pud = pud_offset(pgd, vaddr);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 289/293] vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page

2018-04-08 Thread Sasha Levin
From: Jia Zhang 

[ Upstream commit 595dd46ebfc10be041a365d0a3fa99df50b6ba73 ]

Commit:

  df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")

... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y.
However, accessing the vsyscall user page will cause an SMAP fault.

Replacing memcpy() with copy_from_user() fixes this bug, but adding
a common way to handle this sort of user page may be useful for the future.

Currently, only the vsyscall page requires KCORE_USER.

Signed-off-by: Jia Zhang 
Reviewed-by: Jiri Olsa 
Cc: Al Viro 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: jo...@redhat.com
Link: 
http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang@linux.alibaba.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Sasha Levin 
---
 arch/x86/mm/init_64.c | 3 +--
 fs/proc/kcore.c   | 4 
 include/linux/kcore.h | 1 +
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 7df8e3a79dc0..d35d0e4bbf99 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1014,8 +1014,7 @@ void __init mem_init(void)
after_bootmem = 1;
 
/* Register memory areas for /proc/kcore */
-   kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
-PAGE_SIZE, KCORE_OTHER);
+   kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR, PAGE_SIZE, KCORE_USER);
 
mem_init_print_info(NULL);
 }
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index df7e07986ead..7ed961c0124f 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -505,6 +505,10 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
/* we have to zero-fill user buffer even if no read */
if (copy_to_user(buffer, buf, tsz))
return -EFAULT;
+   } else if (m->type == KCORE_USER) {
+   /* User page is handled prior to normal kernel page: */
+   if (copy_to_user(buffer, (char *)start, tsz))
+   return -EFAULT;
} else {
if (kern_addr_valid(start)) {
/*
diff --git a/include/linux/kcore.h b/include/linux/kcore.h
index d92762286645..3ffade4f2798 100644
--- a/include/linux/kcore.h
+++ b/include/linux/kcore.h
@@ -9,6 +9,7 @@ enum kcore_type {
KCORE_VMALLOC,
KCORE_RAM,
KCORE_VMEMMAP,
+   KCORE_USER,
KCORE_OTHER,
 };
 
-- 
2.15.1


[PATCH AUTOSEL for 4.4 007/162] net: emac: fix reset timeout with AR8035 phy

2018-04-08 Thread Sasha Levin
From: Christian Lamparter 

[ Upstream commit 19d90ece81da802207a9b91ce95a29fbdc40626e ]

This patch fixes a problem where the AR8035 PHY can't be
detected on a Cisco Meraki MR24 if the ethernet cable is
not connected on boot.

Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)

Based on the comment and the commit message of
commit 23fbb5a87c56 ("emac: Fix EMAC soft reset on 460EX/GT"),
this happens because the AR8035 PHY doesn't provide the TX Clock
if the ethernet cable is not attached. This causes the reset
to time out, and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user is left with no ethernet.

In order to stay compatible with existing configurations, the driver
tries the current reset approach first. Only if the first attempt times
out does it perform one more retry, with the clock temporarily
switched to the internal source for just the duration of the reset.
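
A self-contained model of that retry flow (the names and behaviour below are
illustrative only, not the real driver's register accessors):

#include <stdbool.h>
#include <stdio.h>

/* Pretend hardware: with the cable unplugged the AR8035 supplies no TX clock,
 * so a reset clocked externally never completes; the internal clock works. */
static bool internal_clock = false;

static bool reset_completed(int polls_done)
{
	return internal_clock && polls_done > 2;
}

static int emac_reset_sketch(void)
{
	bool try_internal_clock = false;
	int n;

do_retry:
	if (try_internal_clock)
		internal_clock = true;	/* the real driver flips SDR0_ETH_CFG[ECS] */

	for (n = 20; n; --n)		/* bounded poll, as in the driver */
		if (reset_completed(20 - n))
			break;

	if (!n && !try_internal_clock) {
		try_internal_clock = true;	/* first attempt timed out: retry once */
		goto do_retry;
	}
	if (try_internal_clock)
		internal_clock = false;		/* switch back to the external clock */

	return n ? 0 : -1;
}

int main(void)
{
	printf("reset %s\n", emac_reset_sketch() ? "timed out" : "succeeded");
	return 0;
}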

LEDE-Bug: #687 

Cc: Chris Blake 
Reported-by: Russell Senior 
Fixes: 23fbb5a87c56e98 ("emac: Fix EMAC soft reset on 460EX/GT")
Signed-off-by: Christian Lamparter 
Reviewed-by: Andrew Lunn 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/ibm/emac/core.c | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index 5d7db6c01c46..f301c03c527b 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -342,6 +342,7 @@ static int emac_reset(struct emac_instance *dev)
 {
struct emac_regs __iomem *p = dev->emacp;
int n = 20;
+   bool __maybe_unused try_internal_clock = false;
 
DBG(dev, "reset" NL);
 
@@ -354,6 +355,7 @@ static int emac_reset(struct emac_instance *dev)
}
 
 #ifdef CONFIG_PPC_DCR_NATIVE
+do_retry:
/*
 * PPC460EX/GT Embedded Processor Advanced User's Manual
 * section 28.10.1 Mode Register 0 (EMACx_MR0) states:
@@ -361,10 +363,19 @@ static int emac_reset(struct emac_instance *dev)
 * of the EMAC. If none is present, select the internal clock
 * (SDR0_ETH_CFG[EMACx_PHY_CLK] = 1).
 * After a soft reset, select the external clock.
+*
+* The AR8035-A PHY Meraki MR24 does not provide a TX Clk if the
+* ethernet cable is not attached. This causes the reset to timeout
+* and the PHY detection code in emac_init_phy() is unable to
+* communicate and detect the AR8035-A PHY. As a result, the emac
+* driver bails out early and the user has no ethernet.
+* In order to stay compatible with existing configurations, the
+* driver will temporarily switch to the internal clock, after
+* the first reset fails.
 */
if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) {
-   if (dev->phy_address == 0xffffffff &&
-   dev->phy_map == 0xffffffff) {
+   if (try_internal_clock || (dev->phy_address == 0xffffffff &&
+  dev->phy_map == 0xffffffff)) {
/* No PHY: select internal loop clock before reset */
dcri_clrset(SDR0, SDR0_ETH_CFG,
0, SDR0_ETH_CFG_ECS << dev->cell_index);
@@ -382,8 +393,15 @@ static int emac_reset(struct emac_instance *dev)
 
 #ifdef CONFIG_PPC_DCR_NATIVE
if (emac_has_feature(dev, EMAC_FTR_460EX_PHY_CLK_FIX)) {
-   if (dev->phy_address == 0xffffffff &&
-   dev->phy_map == 0xffffffff) {
+   if (!n && !try_internal_clock) {
+   /* first attempt has timed out. */
+   n = 20;
+   try_internal_clock = true;
+   goto do_retry;

[PATCH AUTOSEL for 4.9 293/293] irqchip/gic-v3: Change pr_debug message to pr_devel

2018-04-08 Thread Sasha Levin
From: Mark Salter 

[ Upstream commit b6dd4d83dc2f78cebc9a7e6e7e4bc2be4d29b94d ]

The pr_debug() in gic-v3 gic_send_sgi() can trigger a circular locking
warning:

 GICv3: CPU10: ICC_SGI1R_EL1 5000400
 ==
 WARNING: possible circular locking dependency detected
 4.15.0+ #1 Tainted: GW
 --
 dynamic_debug01/1873 is trying to acquire lock:
  ((console_sem).lock){-...}, at: [<99c891ec>] down_trylock+0x20/0x4c

 but task is already holding lock:
  (&rq->lock){-.-.}, at: [<842e1587>] __task_rq_lock+0x54/0xdc

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (&rq->lock){-.-.}:
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock+0x4c/0x60
task_fork_fair+0x3c/0x148
sched_fork+0x10c/0x214
copy_process.isra.32.part.33+0x4e8/0x14f0
_do_fork+0xe8/0x78c
kernel_thread+0x48/0x54
rest_init+0x34/0x2a4
start_kernel+0x45c/0x488

 -> #1 (&p->pi_lock){-.-.}:
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock_irqsave+0x58/0x70
try_to_wake_up+0x48/0x600
wake_up_process+0x28/0x34
__up.isra.0+0x60/0x6c
up+0x60/0x68
__up_console_sem+0x4c/0x7c
console_unlock+0x328/0x634
vprintk_emit+0x25c/0x390
dev_vprintk_emit+0xc4/0x1fc
dev_printk_emit+0x88/0xa8
__dev_printk+0x58/0x9c
_dev_info+0x84/0xa8
usb_new_device+0x100/0x474
hub_port_connect+0x280/0x92c
hub_event+0x740/0xa84
process_one_work+0x240/0x70c
worker_thread+0x60/0x400
kthread+0x110/0x13c
ret_from_fork+0x10/0x18

 -> #0 ((console_sem).lock){-...}:
validate_chain.isra.34+0x6e4/0xa20
__lock_acquire+0x3b4/0x6e0
lock_acquire+0xf4/0x2a8
_raw_spin_lock_irqsave+0x58/0x70
down_trylock+0x20/0x4c
__down_trylock_console_sem+0x3c/0x9c
console_trylock+0x20/0xb0
vprintk_emit+0x254/0x390
vprintk_default+0x58/0x90
vprintk_func+0xbc/0x164
printk+0x80/0xa0
__dynamic_pr_debug+0x84/0xac
gic_raise_softirq+0x184/0x18c
smp_cross_call+0xac/0x218
smp_send_reschedule+0x3c/0x48
resched_curr+0x60/0x9c
check_preempt_curr+0x70/0xdc
wake_up_new_task+0x310/0x470
_do_fork+0x188/0x78c
SyS_clone+0x44/0x50
__sys_trace_return+0x0/0x4

 other info that might help us debug this:

 Chain exists of:
   (console_sem).lock --> &p->pi_lock --> &rq->lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&rq->lock);
                                lock(&p->pi_lock);
                                lock(&rq->lock);
   lock((console_sem).lock);

  *** DEADLOCK ***

 2 locks held by dynamic_debug01/1873:
  #0:  (&p->pi_lock){-.-.}, at: [<1366df53>] wake_up_new_task+0x40/0x470
  #1:  (&rq->lock){-.-.}, at: [<842e1587>] __task_rq_lock+0x54/0xdc

 stack backtrace:
 CPU: 10 PID: 1873 Comm: dynamic_debug01 Tainted: GW4.15.0+ #1
 Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS T48 10/02/2017
 Call trace:
  dump_backtrace+0x0/0x188
  show_stack+0x24/0x2c
  dump_stack+0xa4/0xe0
  print_circular_bug.isra.31+0x29c/0x2b8
  check_prev_add.constprop.39+0x6c8/0x6dc
  validate_chain.isra.34+0x6e4/0xa20
  __lock_acquire+0x3b4/0x6e0
  lock_acquire+0xf4/0x2a8
  _raw_spin_lock_irqsave+0x58/0x70
  down_trylock+0x20/0x4c
  __down_trylock_console_sem+0x3c/0x9c
  console_trylock+0x20/0xb0
  vprintk_emit+0x254/0x390
  vprintk_default+0x58/0x90
  vprintk_func+0xbc/0x164
  printk+0x80/0xa0
  __dynamic_pr_debug+0x84/0xac
  gic_raise_softirq+0x184/0x18c
  smp_cross_call+0xac/0x218
  smp_send_reschedule+0x3c/0x48
  resched_curr+0x60/0x9c
  check_preempt_curr+0x70/0xdc
  wake_up_new_task+0x310/0x470
  _do_fork+0x188/0x78c
  SyS_clone+0x44/0x50
  __sys_trace_return+0x0/0x4
 GICv3: CPU0: ICC_SGI1R_EL1 12000

This could be fixed with printk_deferred() but that might lessen its
usefulness for debugging. So change it to pr_devel to keep it out of
production kernels. Developers working on gic-v3 can enable it as
needed in their kernels.
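
For context (an aside, not part of the patch): pr_devel() compiles to a no-op
unless the file is built with DEBUG defined, whereas pr_debug() can be turned
on at runtime through dynamic debug, which is exactly how it could fire from
this path. A minimal sketch:

#define DEBUG			/* without this, pr_devel() compiles away entirely */
#include <linux/printk.h>

static void example(void)
{
	pr_debug("can be enabled at runtime via CONFIG_DYNAMIC_DEBUG\n");
	pr_devel("only emitted when this file is built with DEBUG defined\n");
}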

Signed-off-by: Mark Salter 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 drivers/irqchip/irq-gic-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 0ef240c64c65..4a9f26723783 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -601,7 +601,7 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
   MPIDR_TO_SGI_AFFINITY(cluster_id, 1) |
   tlist << ICC_SGI1R_TARGET_LIST_SHIFT);

[PATCH AUTOSEL for 4.4 002/162] e1000e: Undo e1000e_pm_freeze if __e1000_shutdown fails

2018-04-08 Thread Sasha Levin
From: Chris Wilson 

[ Upstream commit 833521ebc65b1c3092e5c0d8a97092f98eec595d ]

An error during suspend (e1000e_pm_suspend),

[  429.994338] ACPI : EC: event blocked
[  429.994633] e1000e: EEE TX LPI TIMER: 0011
[  430.955451] pci_pm_suspend(): e1000e_pm_suspend+0x0/0x30 [e1000e] returns -2
[  430.955454] dpm_run_callback(): pci_pm_suspend+0x0/0x140 returns -2
[  430.955458] PM: Device :00:19.0 failed to suspend async: error -2
[  430.955581] PM: Some devices failed to suspend, or early wake event detected
[  430.957709] ACPI : EC: event unblocked

lead to complete failure:

[  432.585002] [ cut here ]
[  432.585013] WARNING: CPU: 3 PID: 8372 at kernel/irq/manage.c:1478 __free_irq+0x9f/0x280
[  432.585015] Trying to free already-free IRQ 20
[  432.585016] Modules linked in: cdc_ncm usbnet x86_pkg_temp_thermal 
intel_powerclamp coretemp mii crct10dif_pclmul crc32_pclmul ghash_clmulni_intel 
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel 
snd_hda_codec snd_hwdep lpc_ich snd_hda_core snd_pcm mei_me mei sdhci_pci sdhci 
i915 mmc_core e1000e ptp pps_core prime_numbers
[  432.585042] CPU: 3 PID: 8372 Comm: kworker/u16:40 Tainted: G U  
4.10.0-rc8-CI-Patchwork_3870+ #1
[  432.585044] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 
07/02/2012
[  432.585050] Workqueue: events_unbound async_run_entry_fn
[  432.585051] Call Trace:
[  432.585058]  dump_stack+0x67/0x92
[  432.585062]  __warn+0xc6/0xe0
[  432.585065]  warn_slowpath_fmt+0x4a/0x50
[  432.585070]  ? _raw_spin_lock_irqsave+0x49/0x60
[  432.585072]  __free_irq+0x9f/0x280
[  432.585075]  free_irq+0x34/0x80
[  432.585089]  e1000_free_irq+0x65/0x70 [e1000e]
[  432.585098]  e1000e_pm_freeze+0x7a/0xb0 [e1000e]
[  432.585106]  e1000e_pm_suspend+0x21/0x30 [e1000e]
[  432.585113]  pci_pm_suspend+0x71/0x140
[  432.585118]  dpm_run_callback+0x6f/0x330
[  432.585122]  ? pci_pm_freeze+0xe0/0xe0
[  432.585125]  __device_suspend+0xea/0x330
[  432.585128]  async_suspend+0x1a/0x90
[  432.585132]  async_run_entry_fn+0x34/0x160
[  432.585137]  process_one_work+0x1f4/0x6d0
[  432.585140]  ? process_one_work+0x16e/0x6d0
[  432.585143]  worker_thread+0x49/0x4a0
[  432.585145]  kthread+0x107/0x140
[  432.585148]  ? process_one_work+0x6d0/0x6d0
[  432.585150]  ? kthread_create_on_node+0x40/0x40
[  432.585154]  ret_from_fork+0x2e/0x40
[  432.585156] ---[ end trace 6712df7f8c4b9124 ]---

The unwind failures stems from commit 2800209994f8 ("e1000e: Refactor PM
flows"), but it may be a later patch that introduced the non-recoverable
behaviour.

Fixes: 2800209994f8 ("e1000e: Refactor PM flows")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847
Signed-off-by: Chris Wilson 
Signed-off-by: Jani Nikula 
Tested-by: Aaron Brown 
Signed-off-by: Jeff Kirsher 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e356e9187e84..12bdb7b5241a 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6589,12 +6589,17 @@ static int e1000e_pm_thaw(struct device *dev)
 static int e1000e_pm_suspend(struct device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev);
+   int rc;
 
e1000e_flush_lpic(pdev);
 
e1000e_pm_freeze(dev);
 
-   return __e1000_shutdown(pdev, false);
+   rc = __e1000_shutdown(pdev, false);
+   if (rc)
+   e1000e_pm_thaw(dev);
+
+   return rc;
 }
 
 static int e1000e_pm_resume(struct device *dev)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 287/293] tools/libbpf: handle issues with bpf ELF objects containing .eh_frames

2018-04-08 Thread Sasha Levin
From: Jesper Dangaard Brouer 

[ Upstream commit e3d91b0ca523d53158f435a3e13df7f0cb360ea2 ]

V3: More generic skipping of relo-section (suggested by Daniel)

If clang >= 4.0.1 is missing the option '-target bpf', it will cause
llc/llvm to create two ELF sections for "Exception Frames", with
section names '.eh_frame' and '.rel.eh_frame'.

The BPF ELF loader library libbpf fails when loading files with these
sections.  The other in-kernel BPF ELF loader in samples/bpf/bpf_load.c
handles this gracefully. And the iproute2 loader also seems to work with
these "eh" sections.

The issue in libbpf is caused by bpf_object__elf_collect() skipping
some sections, and later when performing relocation it will be
pointing to a skipped section, as these sections cannot be found by
bpf_object__find_prog_by_idx() in bpf_object__collect_reloc().

This is a general issue that also occurs for other sections, like
debug sections which are also skipped and can have relo section.

As suggested by Daniel: to avoid keeping state about all skipped
sections, instead perform a direct lookup in the ELF object.  Look up
the section that the relo-section points to and check if it contains
executable machine instructions (denoted by the sh_flags
SHF_EXECINSTR).  Use this check to also skip irrelevant relo-sections.

Note, for samples/bpf/ the '-target bpf' parameter to clang cannot be used
due to incompatibility with asm embedded headers, that some of the samples
include. This is explained in more details by Yonghong Song in bpf_devel_QA.

Signed-off-by: Jesper Dangaard Brouer 
Signed-off-by: Daniel Borkmann 
Signed-off-by: Sasha Levin 
---
 tools/lib/bpf/libbpf.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b699aea9a025..7788cfb7cd7e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -590,6 +590,24 @@ bpf_object__init_maps_name(struct bpf_object *obj)
return 0;
 }
 
+static bool section_have_execinstr(struct bpf_object *obj, int idx)
+{
+   Elf_Scn *scn;
+   GElf_Shdr sh;
+
+   scn = elf_getscn(obj->efile.elf, idx);
+   if (!scn)
+   return false;
+
+   if (gelf_getshdr(scn, &sh) != &sh)
+   return false;
+
+   if (sh.sh_flags & SHF_EXECINSTR)
+   return true;
+
+   return false;
+}
+
 static int bpf_object__elf_collect(struct bpf_object *obj)
 {
Elf *elf = obj->efile.elf;
@@ -673,6 +691,14 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
} else if (sh.sh_type == SHT_REL) {
void *reloc = obj->efile.reloc;
int nr_reloc = obj->efile.nr_reloc + 1;
+   int sec = sh.sh_info; /* points to other section */
+
+   /* Only do relo for section with exec instructions */
+   if (!section_have_execinstr(obj, sec)) {
+   pr_debug("skip relo %s(%d) for section(%d)\n",
+name, idx, sec);
+   continue;
+   }
 
reloc = realloc(reloc,
sizeof(*obj->efile.reloc) * nr_reloc);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 286/293] net: Extra '_get' in declaration of arch_get_platform_mac_address

2018-04-08 Thread Sasha Levin
From: Mathieu Malaterre 

[ Upstream commit e728789c52afccc1275cba1dd812f03abe16ea3c ]

In commit c7f5d105495a ("net: Add eth_platform_get_mac_address() helper."),
two declarations were added:

  int eth_platform_get_mac_address(struct device *dev, u8 *mac_addr);
  unsigned char *arch_get_platform_get_mac_address(void);

An extra '_get' was introduced in arch_get_platform_get_mac_address, remove
it. Fix compile warning using W=1:

  CC  net/ethernet/eth.o
net/ethernet/eth.c:523:24: warning: no previous prototype for ‘arch_get_platform_mac_address’ [-Wmissing-prototypes]
 unsigned char * __weak arch_get_platform_mac_address(void)
^
  AR  net/ethernet/built-in.o

Signed-off-by: Mathieu Malaterre 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 include/linux/etherdevice.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
index 6fec9e81bd70..a3a47b1bda91 100644
--- a/include/linux/etherdevice.h
+++ b/include/linux/etherdevice.h
@@ -31,7 +31,7 @@
 #ifdef __KERNEL__
 struct device;
 int eth_platform_get_mac_address(struct device *dev, u8 *mac_addr);
-unsigned char *arch_get_platform_get_mac_address(void);
+unsigned char *arch_get_platform_mac_address(void);
 u32 eth_get_headlen(void *data, unsigned int max_len);
 __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev);
 extern const struct header_ops eth_header_ops;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 274/293] MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS

2018-04-08 Thread Sasha Levin
From: Matt Redfearn 

[ Upstream commit 0cde5b44a30f1daaef1c34e08191239dc63271c4 ]

When commit b27311e1cace ("MIPS: TXx9: Add RBTX4939 board support")
added board support for the RBTX4939, it added a call to
led_classdev_register even if the LED class is built as a module.
Built-in arch code cannot call module code directly like this. Commit
b33b44073734 ("MIPS: TXX9: use IS_ENABLED() macro") subsequently
changed the inclusion of this code to a single check that
CONFIG_LEDS_CLASS is either builtin or a module, but the same issue
remains.

This leads to MIPS allmodconfig builds failing when CONFIG_MACH_TX49XX=y
is set:

arch/mips/txx9/rbtx4939/setup.o: In function `rbtx4939_led_probe':
setup.c:(.init.text+0xc0): undefined reference to `of_led_classdev_register'
make: *** [Makefile:999: vmlinux] Error 1

Fix this by using the IS_BUILTIN() macro instead.
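
For reference (my illustration, not part of the patch): IS_ENABLED(CONFIG_FOO)
is true for both =y and =m, while IS_BUILTIN(CONFIG_FOO) is true only for =y,
so built-in board code guarded with IS_BUILTIN() can never reference
module-only symbols:

#include <linux/kconfig.h>

/* With CONFIG_LEDS_CLASS=m:
 *   IS_ENABLED(CONFIG_LEDS_CLASS) -> 1  (built-in code would reference
 *                                        symbols living in a module: link error)
 *   IS_BUILTIN(CONFIG_LEDS_CLASS) -> 0  (the LED support is simply compiled out)
 */
#if IS_BUILTIN(CONFIG_LEDS_CLASS)
static void board_register_leds(void)
{
	/* safe: led_classdev_register() is guaranteed to be built in here */
}
#else
static inline void board_register_leds(void) { }
#endif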

Fixes: b27311e1cace ("MIPS: TXx9: Add RBTX4939 board support")
Signed-off-by: Matt Redfearn 
Reviewed-by: James Hogan 
Cc: Ralf Baechle 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18544/
Signed-off-by: James Hogan 
Signed-off-by: Sasha Levin 
---
 arch/mips/txx9/rbtx4939/setup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/txx9/rbtx4939/setup.c b/arch/mips/txx9/rbtx4939/setup.c
index 8b937300fb7f..fd26fadc8617 100644
--- a/arch/mips/txx9/rbtx4939/setup.c
+++ b/arch/mips/txx9/rbtx4939/setup.c
@@ -186,7 +186,7 @@ static void __init rbtx4939_update_ioc_pen(void)
 
 #define RBTX4939_MAX_7SEGLEDS  8
 
-#if IS_ENABLED(CONFIG_LEDS_CLASS)
+#if IS_BUILTIN(CONFIG_LEDS_CLASS)
 static u8 led_val[RBTX4939_MAX_7SEGLEDS];
 struct rbtx4939_led_data {
struct led_classdev cdev;
@@ -261,7 +261,7 @@ static inline void rbtx4939_led_setup(void)
 
 static void __rbtx4939_7segled_putc(unsigned int pos, unsigned char val)
 {
-#if IS_ENABLED(CONFIG_LEDS_CLASS)
+#if IS_BUILTIN(CONFIG_LEDS_CLASS)
unsigned long flags;
local_irq_save(flags);
/* bit7: reserved for LED class */
-- 
2.15.1


[PATCH AUTOSEL for 4.9 285/293] nfsd: return RESOURCE not GARBAGE_ARGS on too many ops

2018-04-08 Thread Sasha Levin
From: "J. Bruce Fields" 

[ Upstream commit 0078117c6d9160031b866cfa1853514d4f6865d2 ]

A client that sends more than a hundred ops in a single compound
currently gets an rpc-level GARBAGE_ARGS error.

It would be more helpful to return NFS4ERR_RESOURCE, since that gives
the client a better idea how to recover (for example by splitting up the
compound into smaller compounds).

This is all a bit academic since we've never actually seen a reason for
clients to send such long compounds, but we may as well fix it.

While we're there, just use NFSD4_MAX_OPS_PER_COMPOUND == 16, the
constant we already use in the 4.1 case, instead of hard-coding 100.
Chances anyone actually uses even 16 ops per compound are small enough
that I think there's a negligible risk of any regression.

This fixes pynfs test COMP6.

Reported-by: "Lu, Xinyu" 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Sasha Levin 
---
 fs/nfsd/nfs4proc.c | 3 +++
 fs/nfsd/nfs4xdr.c  | 9 +++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index eef0caf6e67d..ff414a9d77ed 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1720,6 +1720,9 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
status = nfserr_minor_vers_mismatch;
if (nfsd_minorversion(args->minorversion, NFSD_TEST) <= 0)
goto out;
+   status = nfserr_resource;
+   if (args->opcnt > NFSD_MAX_OPS_PER_COMPOUND)
+   goto out;
 
status = nfs41_check_op_ordering(args);
if (status) {
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 2c4f7a22e128..940bd8232a9c 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1866,8 +1866,13 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
 
if (argp->taglen > NFSD4_MAX_TAGLEN)
goto xdr_error;
-   if (argp->opcnt > 100)
-   goto xdr_error;
+   /*
+* NFS4ERR_RESOURCE is a more helpful error than GARBAGE_ARGS
+* here, so we return success at the xdr level so that
+* nfsd4_proc can handle this is an NFS-level error.
+*/
+   if (argp->opcnt > NFSD_MAX_OPS_PER_COMPOUND)
+   return 0;
 
if (argp->opcnt > ARRAY_SIZE(argp->iops)) {
argp->ops = kzalloc(argp->opcnt * sizeof(*argp->ops), GFP_KERNEL);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 282/293] bcache: fix for allocator and register thread race

2018-04-08 Thread Sasha Levin
From: Tang Junhui 

[ Upstream commit 682811b3ce1a5a4e20d700939a9042f01dbc66c4 ]

After a long time of running random small-IO writes,
I rebooted the machine, and after the machine powered on,
I found bcache got stuck; the stack is:
[root@ceph153 ~]# cat /proc/2510/task/*/stack
[] closure_sync+0x25/0x90 [bcache]
[] bch_journal+0x118/0x2b0 [bcache]
[] bch_journal_meta+0x47/0x70 [bcache]
[] bch_prio_write+0x237/0x340 [bcache]
[] bch_allocator_thread+0x3c8/0x3d0 [bcache]
[] kthread+0xcf/0xe0
[] ret_from_fork+0x58/0x90
[] 0x
[root@ceph153 ~]# cat /proc/2038/task/*/stack
[] __bch_btree_map_nodes+0x12d/0x150 [bcache]
[] bch_btree_insert+0xf1/0x170 [bcache]
[] bch_journal_replay+0x13f/0x230 [bcache]
[] run_cache_set+0x79a/0x7c2 [bcache]
[] register_bcache+0xd48/0x1310 [bcache]
[] kobj_attr_store+0xf/0x20
[] sysfs_write_file+0xc6/0x140
[] vfs_write+0xbd/0x1e0
[] SyS_write+0x7f/0xe0
[] system_call_fastpath+0x16/0x1
The stacks show that the register thread and the allocator thread
were getting stuck when registering the cache device.

I rebooted the machine several times; the issue always
exists on this machine.

I debugged the code and found the call trace as below:
register_bcache()
   ==>run_cache_set()
  ==>bch_journal_replay()
 ==>bch_btree_insert()
==>__bch_btree_map_nodes()
   ==>btree_insert_fn()
  ==>btree_split() //node need split
 ==>btree_check_reserve()
In btree_check_reserve(), it checks whether there are enough buckets
of the RESERVE_BTREE type; since the allocator thread has not started
working yet, no buckets of RESERVE_BTREE type have been allocated, so
the register thread waits on c->btree_cache_wait and goes to sleep.

Then the allocator thread is initialized; the call trace is below:
bch_allocator_thread()
==>bch_prio_write()
   ==>bch_journal_meta()
  ==>bch_journal()
 ==>journal_wait_for_write()
In journal_wait_for_write(), it checks whether the journal is full via
journal_full(), but the long run of random small-IO writes has caused
the exhaustion of journal buckets (journal.blocks_free=0). In order to
release journal buckets, the allocator calls btree_flush_write() to
flush keys to btree nodes and waits on c->journal.wait until the btree
node writes are over or there is already some journal bucket space, then
the allocator thread goes to sleep. But in btree_flush_write(), since
bch_journal_replay() is not finished, no btree nodes have a journal entry
(the condition "if (btree_current_write(b)->journal)" is never satisfied),
so we get no btree node to flush, no journal bucket is released,
and the allocator sleeps the whole time.

Through the above analysis, we can see that:
1) The register thread waits for the allocator thread to allocate buckets
   of RESERVE_BTREE type;
2) The allocator thread waits for the register thread to replay the
   journal, so it can flush btree nodes and get a journal bucket.
Thus they get stuck waiting for each other.

Hua Rui provided a patch for me that allocates some buckets of
RESERVE_BTREE type in advance, so the register thread can get a bucket
when a btree node splits and does not need to wait for the allocator
thread. I tested it; it has an effect, and the register thread runs a step
further, but finally still gets stuck. The reason is that only 8 buckets
of RESERVE_BTREE type were allocated, and in bch_journal_replay(),
after 2 btree node splits, only 4 buckets of RESERVE_BTREE type are left,
then btree_check_reserve() is not satisfied anymore, so it goes to sleep
again; at the same time, the allocator thread has not flushed enough btree
nodes to release a journal bucket, so they both get stuck again.

So we need to allocate more buckets of RESERVE_BTREE type in advance,
but how many are enough?  By experience and testing, I think it should be
as many as the journal buckets. Then I modified the code as in this patch,
tested it on the machine, and it works.

This patch is modified based on Hua Rui’s patch, and allocates more
buckets of RESERVE_BTREE type in advance to avoid the register thread
and allocator thread waiting for each other.

[patch v2] ca->sb.njournal_buckets would be 0 in the first time after
cache creation, and no journal exists, so just 8 btree buckets is OK.
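
A rough sketch of the sizing rule described above (my paraphrase of the idea,
not necessarily the literal hunk that follows): reserve as many RESERVE_BTREE
buckets as there are journal buckets, falling back to 8 for a freshly created
cache that has no journal yet.

/* Sketch only: pick the RESERVE_BTREE fifo size from the journal size. */
static size_t btree_reserve_size(struct cache *ca)
{
	return ca->sb.njournal_buckets ? ca->sb.njournal_buckets : 8;
}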

Signed-off-by: Hua Rui 
Signed-off-by: Tang Junhui 
Reviewed-by: Michael Lyle 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/md/bcache/btree.c |  9 ++---
 drivers/md/bcache/super.c | 13 -
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index cac297f8170e..cf7c68920b33 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1864,14 +1864,17 @@ void bch_initial_gc_finish(struct cache_set *c)
 */
for_each_cache(ca, c, i) {
for_each_bucket(b, ca) {
-   if (fifo_full(&ca->free[RESERVE_PRIO]))
+  

[PATCH AUTOSEL for 4.9 284/293] bcache: return attach error when no cache set exist

2018-04-08 Thread Sasha Levin
From: Tang Junhui 

[ Upstream commit 7f4fc93d4713394ee8f1cd44c238e046e11b4f15 ]

I attached a back-end device to a cache set while the cache set was not
registered yet; the back-end device did not attach successfully, and no
error was returned:
[root]# echo 87859280-fec6-4bcc-20df7ca8f86b > /sys/block/sde/bcache/attach
[root]#

In sysfs_attach(), the return value "v" is initialized to "size" in
the beginning, and if no cache set exist in bch_cache_sets, the "v" value
would not change any more, and return to sysfs, sysfs regard it as success
since the "size" is a positive number.

This patch fixes this issue by assigning "v" with "-ENOENT" in the
initialization.

Signed-off-by: Tang Junhui 
Reviewed-by: Michael Lyle 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/md/bcache/sysfs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 1efe31615281..5a5c1f1bd8a5 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -191,7 +191,7 @@ STORE(__cached_dev)
 {
struct cached_dev *dc = container_of(kobj, struct cached_dev,
 disk.kobj);
-   ssize_t v = size;
+   ssize_t v;
struct cache_set *c;
struct kobj_uevent_env *env;
 
@@ -268,6 +268,7 @@ STORE(__cached_dev)
if (bch_parse_uuid(buf, set_uuid) < 16)
return -EINVAL;
 
+   v = -ENOENT;
list_for_each_entry(c, &bch_cache_sets, list) {
v = bch_cached_dev_attach(dc, c, set_uuid);
if (!v)
@@ -275,7 +276,7 @@ STORE(__cached_dev)
}
 
pr_err("Can't attach %s: cache set not found", buf);
-   size = v;
+   return v;
}
 
if (attr == &sysfs_detach && dc->disk.c)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 280/293] cifs: silence compiler warnings showing up with gcc-8.0.0

2018-04-08 Thread Sasha Levin
From: Arnd Bergmann 

[ Upstream commit ade7db991b47ab3016a414468164f4966bd08202 ]

This bug was fixed before, but came up again with the latest
compiler in another function:

fs/cifs/cifssmb.c: In function 'CIFSSMBSetEA':
fs/cifs/cifssmb.c:6362:3: error: 'strncpy' offset 8 is out of the bounds [0, 4] [-Werror=array-bounds]
   strncpy(parm_data->list[0].name, ea_name, name_len);

Let's apply the same fix that was used for the other instances.

Fixes: b2a3ad9ca502 ("cifs: silence compiler warnings showing up with 
gcc-4.7.0")
Signed-off-by: Arnd Bergmann 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/cifssmb.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index cc420d6b71f7..d57222894892 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -6413,9 +6413,7 @@ SetEARetry:
pSMB->InformationLevel =
cpu_to_le16(SMB_SET_FILE_EA);
 
-   parm_data =
-   (struct fealist *) (((char *) &pSMB->hdr.Protocol) +
-  offset);
+   parm_data = (void *)pSMB + offsetof(struct smb_hdr, Protocol) + offset;
pSMB->ParameterOffset = cpu_to_le16(param_offset);
pSMB->DataOffset = cpu_to_le16(offset);
pSMB->SetupCount = 1;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 278/293] arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics

2018-04-08 Thread Sasha Levin
From: Will Deacon 

[ Upstream commit 202fb4ef81e3ec765c23bd1e6746a5c25b797d0e ]

If the spinlock "next" ticket wraps around between the initial LDR
and the cmpxchg in the LSE version of spin_trylock, then we can erroneously
think that we have successfuly acquired the lock because we only check
whether the next ticket return by the cmpxchg is equal to the owner ticket
in our updated lock word.

This patch fixes the issue by performing a full 32-bit check of the lock
word when trying to determine whether or not the CASA instruction updated
memory.

Reported-by: Catalin Marinas 
Signed-off-by: Will Deacon 
Signed-off-by: Catalin Marinas 
Signed-off-by: Sasha Levin 
---
 arch/arm64/include/asm/spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index cae331d553f8..a9d2dd03c977 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -141,8 +141,8 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock)
"   cbnz%w1, 1f\n"
"   add %w1, %w0, %3\n"
"   casa%w0, %w1, %2\n"
-   "   and %w1, %w1, #0x\n"
-   "   eor %w1, %w1, %w0, lsr #16\n"
+   "   sub %w1, %w1, %3\n"
+   "   eor %w1, %w1, %w0\n"
"1:")
: "=" (lockval), "=" (tmp), "+Q" (*lock)
: "I" (1 << TICKET_SHIFT)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 279/293] proc: fix /proc/*/map_files lookup

2018-04-08 Thread Sasha Levin
From: Alexey Dobriyan 

[ Upstream commit ac7f1061c2c11bb8936b1b6a94cdb48de732f7a4 ]

Current code does:

if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2)

However sscanf() is broken garbage.

It silently accepts whitespace between format specifiers
(did you know that?).

It silently accepts valid strings which result in integer overflow.

Do not use sscanf() for any even remotely reliable parsing code.
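
A quick user-space illustration of the whitespace point (my own snippet, not
from the patch): %lx silently skips leading whitespace, so a malformed name
still "parses":

#include <stdio.h>

int main(void)
{
	unsigned long start, end;

	/* A dentry name that should be invalid for map_files... */
	const char *name = "   55a23af39000-55a23b05b000";

	/* ...yet sscanf() happily returns 2: %lx skips leading whitespace. */
	int n = sscanf(name, "%lx-%lx", &start, &end);
	printf("n=%d start=%lx end=%lx\n", n, start, end);
	return 0;
}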

OK
# readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
/lib/systemd/systemd

broken
# readlink '/proc/1/map_files/   55a23af39000-55a23b05b000'
/lib/systemd/systemd

broken
# readlink '/proc/1/map_files/55a23af39000-55a23b05b000'
/lib/systemd/systemd

very broken
# readlink '/proc/1/map_files/155a23af39000-55a23b05b000'
/lib/systemd/systemd

Andrei said:

: This patch breaks criu.  It was a bug in criu.  And this bug is on a minor
: path, which works when memfd_create() isn't available.  It is a reason why
: I ask to not backport this patch to stable kernels.
:
: In CRIU this bug can be triggered, only if this patch will be backported
: to a kernel which version is lower than v3.16.

Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2
Signed-off-by: Alexey Dobriyan 
Cc: Pavel Emelyanov 
Cc: Andrei Vagin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 fs/proc/base.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index e67fec3c9856..bc7e63d20523 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -94,6 +94,8 @@
 #include "internal.h"
 #include "fd.h"
 
+#include "../../lib/kstrtox.h"
+
 /* NOTE:
  * Implementing inode permission operations in /proc is almost
  * certainly an error.  Permission checks need to happen during
@@ -1864,8 +1866,33 @@ end_instantiate:
 static int dname_to_vma_addr(struct dentry *dentry,
 unsigned long *start, unsigned long *end)
 {
-   if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2)
+   const char *str = dentry->d_name.name;
+   unsigned long long sval, eval;
+   unsigned int len;
+
+   len = _parse_integer(str, 16, &sval);
+   if (len & KSTRTOX_OVERFLOW)
+   return -EINVAL;
+   if (sval != (unsigned long)sval)
return -EINVAL;
+   str += len;
+
+   if (*str != '-')
+   return -EINVAL;
+   str++;
+
+   len = _parse_integer(str, 16, &eval);
+   if (len & KSTRTOX_OVERFLOW)
+   return -EINVAL;
+   if (eval != (unsigned long)eval)
+   return -EINVAL;
+   str += len;
+
+   if (*str != '\0')
+   return -EINVAL;
+
+   *start = sval;
+   *end = eval;
 
return 0;
 }
-- 
2.15.1


[PATCH AUTOSEL for 4.9 281/293] bcache: properly set task state in bch_writeback_thread()

2018-04-08 Thread Sasha Levin
From: Coly Li 

[ Upstream commit 99361bbf26337186f02561109c17a4c4b1a7536a ]

Kernel thread routine bch_writeback_thread() has the following code block,

447 down_write(&dc->writeback_lock);
448~450 if (check conditions) {
451 up_write(&dc->writeback_lock);
452 set_current_state(TASK_INTERRUPTIBLE);
453
454 if (kthread_should_stop())
455 return 0;
456
457 schedule();
458 continue;
459 }

If the condition check is true, its task state is set to TASK_INTERRUPTIBLE
and it calls schedule() to wait for others to wake it up.

There are 2 issues in current code,
1, Task state is set to TASK_INTERRUPTIBLE after the condition checks; if
   another process changes the condition and calls wake_up_process(dc->
   writeback_thread), then at line 452 the task state is set back to
   TASK_INTERRUPTIBLE, and the writeback kernel thread will lose a chance
   to be woken up.
2, At line 454, if kthread_should_stop() is true, the writeback kernel
   thread will return to kernel/kthread.c:kthread() with TASK_INTERRUPTIBLE
   and call do_exit(). It is not good to enter do_exit() with task state
   TASK_INTERRUPTIBLE; in the following code path might_sleep() is called
   and a warning message is reported by __might_sleep(): "WARNING: do not
   call blocking ops when !TASK_RUNNING; state=1 set at []".

For the first issue, the task state should be set before the condition
checks. Indeed, because dc->writeback_lock is required when modifying all
the conditions, calling set_current_state() inside the code block where
dc->writeback_lock is held is safe. But this is quite implicit, so I still
move set_current_state() before all the condition checks.
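
For reference, the canonical pattern this moves towards looks roughly like the
sketch below (generic, not the bcache code; work_available() and do_work() are
placeholders): set the state first, then test the condition, so a wake-up
racing with the check is not lost.

#include <linux/kthread.h>
#include <linux/sched.h>

static int example_thread(void *arg)
{
	while (!kthread_should_stop()) {
		/* 1. Announce the intent to sleep *before* testing the
		 *    condition, so a racing wake_up_process() is not lost. */
		set_current_state(TASK_INTERRUPTIBLE);

		if (!work_available()) {	/* placeholder condition */
			schedule();		/* a racing wake-up simply makes
						 * schedule() return early */
			continue;
		}

		__set_current_state(TASK_RUNNING);
		do_work();			/* placeholder work item */
	}
	/* 2. Never leave the thread in TASK_INTERRUPTIBLE when returning. */
	__set_current_state(TASK_RUNNING);
	return 0;
}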

For the second issue, frankly speaking it does not hurt when a kernel
thread exits with TASK_INTERRUPTIBLE state, but this warning message scares
users, making them feel there might be something risky with bcache that
could hurt their data.  Setting the task state to TASK_RUNNING before
returning fixes this problem.

In alloc.c:allocator_wait(), there is also a similar issue, and it is also
fixed in this patch.

Changelog:
v3: merge two similar fixes into one patch
v2: fix the race issue in v1 patch.
v1: initial buggy fix.

Signed-off-by: Coly Li 
Reviewed-by: Hannes Reinecke 
Reviewed-by: Michael Lyle 
Cc: Michael Lyle 
Cc: Junhui Tang 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/md/bcache/alloc.c | 4 +++-
 drivers/md/bcache/writeback.c | 7 +--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index 537903bf9add..8075731a745a 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -284,8 +284,10 @@ do {   \
break;  \
\
mutex_unlock(&(ca)->set->bucket_lock);  \
-   if (kthread_should_stop())  \
+   if (kthread_should_stop()) {\
+   set_current_state(TASK_RUNNING);\
return 0;   \
+   }   \
\
schedule(); \
mutex_lock(&(ca)->set->bucket_lock);\
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 4ce2b19fe120..db30da77aefb 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -420,18 +420,21 @@ static int bch_writeback_thread(void *arg)
 
while (!kthread_should_stop()) {
down_write(&dc->writeback_lock);
+   set_current_state(TASK_INTERRUPTIBLE);
if (!atomic_read(&dc->has_dirty) ||
(!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
 !dc->writeback_running)) {
up_write(&dc->writeback_lock);
-   set_current_state(TASK_INTERRUPTIBLE);
 
-   if (kthread_should_stop())
+   if (kthread_should_stop()) {
+   set_current_state(TASK_RUNNING);
return 0;
+   }
 
schedule();
continue;
}
+   set_current_state(TASK_RUNNING);
 
searched_full_index = refill_dirty(dc);
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 283/293] bcache: fix for data collapse after re-attaching an attached device

2018-04-08 Thread Sasha Levin
From: Tang Junhui 

[ Upstream commit 73ac105be390c1de42a2f21643c9778a5e002930 ]

The back-end device sdm has already attached to a cache_set with ID
f67ebe1f-f8bc-4d73-bfe5-9dc88607f119; then we try to attach it to
another cache set, and it returns an error:
[root]# cd /sys/block/sdm/bcache
[root]# echo 5ccd0a63-148e-48b8-afa2-aca9cbd6279f > attach
-bash: echo: write error: Invalid argument

After that, we execute a command to modify the label of the bcache
device:
[root]# echo data_disk1 > label

Then we reboot the system; when the system powers on, the back-end
device cannot attach to the cache_set, and a message shows up in the log:
Feb  5 12:05:52 ceph152 kernel: [922385.508498] bcache:
bch_cached_dev_attach() couldn't find uuid for sdm in set

In sysfs_attach(), dc->sb.set_uuid was assigned the value input through
sysfs, no matter whether bch_cached_dev_attach() succeeded or not. For
example, if the back-end device has already attached to a cache set,
bch_cached_dev_attach() would fail, but dc->sb.set_uuid was changed
anyway. Then modifying the label of the bcache device calls
bch_write_bdev_super(), which writes dc->sb.set_uuid to the super block,
so we record a wrong cache set ID in the super block. After the system
reboots, the cache set can't find the uuid of the back-end device, so the
bcache device can't exist or be used any more.

In this patch, we don't assign the cache set ID to dc->sb.set_uuid in
sysfs_attach() directly, but pass it into bch_cached_dev_attach(), and
assign dc->sb.set_uuid the cache set ID only after the back-end device
has attached to the cache set successfully.

Signed-off-by: Tang Junhui 
Reviewed-by: Michael Lyle 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/md/bcache/bcache.h |  2 +-
 drivers/md/bcache/super.c  | 10 ++
 drivers/md/bcache/sysfs.c  |  6 --
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 02619cabda8b..7fe7df56fa33 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -904,7 +904,7 @@ void bcache_write_super(struct cache_set *);
 
 int bch_flash_dev_create(struct cache_set *c, uint64_t size);
 
-int bch_cached_dev_attach(struct cached_dev *, struct cache_set *);
+int bch_cached_dev_attach(struct cached_dev *, struct cache_set *, uint8_t *);
 void bch_cached_dev_detach(struct cached_dev *);
 void bch_cached_dev_run(struct cached_dev *);
 void bcache_device_stop(struct bcache_device *);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1a006f989ac2..757b13deeb1c 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -932,7 +932,8 @@ void bch_cached_dev_detach(struct cached_dev *dc)
cached_dev_put(dc);
 }
 
-int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c)
+int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
+ uint8_t *set_uuid)
 {
uint32_t rtime = cpu_to_le32(get_seconds());
struct uuid_entry *u;
@@ -941,7 +942,8 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c)
 
bdevname(dc->bdev, buf);
 
-   if (memcmp(dc->sb.set_uuid, c->sb.set_uuid, 16))
+   if ((set_uuid && memcmp(set_uuid, c->sb.set_uuid, 16)) ||
+   (!set_uuid && memcmp(dc->sb.set_uuid, c->sb.set_uuid, 16)))
return -ENOENT;
 
if (dc->disk.c) {
@@ -1185,7 +1187,7 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 
list_add(&dc->list, &uncached_devices);
list_for_each_entry(c, &bch_cache_sets, list)
-   bch_cached_dev_attach(dc, c);
+   bch_cached_dev_attach(dc, c, NULL);
 
if (BDEV_STATE(&dc->sb) == BDEV_STATE_NONE ||
BDEV_STATE(&dc->sb) == BDEV_STATE_STALE)
@@ -1708,7 +1710,7 @@ static void run_cache_set(struct cache_set *c)
bcache_write_super(c);
 
list_for_each_entry_safe(dc, t, &uncached_devices, list)
-   bch_cached_dev_attach(dc, c);
+   bch_cached_dev_attach(dc, c, NULL);
 
flash_devs_run(c);
 
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 4fbb5532f24c..1efe31615281 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -263,11 +263,13 @@ STORE(__cached_dev)
}
 
if (attr == &sysfs_attach) {
-   if (bch_parse_uuid(buf, dc->sb.set_uuid) < 16)
+   uint8_t set_uuid[16];
+
+   if (bch_parse_uuid(buf, set_uuid) < 16)
return -EINVAL;
 
list_for_each_entry(c, &bch_cache_sets, list) {
-   v = bch_cached_dev_attach(dc, c);
+   v = bch_cached_dev_attach(dc, c, set_uuid);
if (!v)
return size;
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 271/293] ACPI / scan: Use acpi_bus_get_status() to initialize ACPI_TYPE_DEVICE devs

2018-04-08 Thread Sasha Levin
From: Hans de Goede 

[ Upstream commit 63347db0affadcbccd5613116ea8431c70139b3e ]

The acpi_get_bus_status wrapper for acpi_bus_get_status_handle has some
code to handle certain device quirks; in some cases we also need this
quirk handling for the initial _STA call.

Specifically on some devices calling _STA before all _DEP dependencies
are met results in errors like these:

[0.123579] ACPI Error: No handler for Region [ECRM] (ba9edc4c)
   [GenericSerialBus] (20170831/evregion-166)
[0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler
   (20170831/exfldio-299)
[0.123618] ACPI Error: Method parse/execution failed
   \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550)

acpi_get_bus_status already has code to avoid this, so by using it we
also silence these errors from the initial _STA call.

Note that in order for the acpi_get_bus_status handling for this to work,
we initialize dep_unmet to 1 until acpi_device_dep_initialize gets called,
this means that battery devices will be instantiated with an initial
status of 0. This is not a problem, acpi_bus_attach will get called soon
after the instantiation anyways and it will update the status as first
point of order.

Signed-off-by: Hans de Goede 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Sasha Levin 
---
 drivers/acpi/scan.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index cf725d581cae..145dcf293c6f 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1422,6 +1422,8 @@ void acpi_init_device_object(struct acpi_device *device, acpi_handle handle,
device_initialize(&device->dev);
dev_set_uevent_suppress(&device->dev, true);
acpi_init_coherency(device);
+   /* Assume there are unmet deps until acpi_device_dep_initialize() runs */
+   device->dep_unmet = 1;
 }
 
 void acpi_device_add_finalize(struct acpi_device *device)
@@ -1445,6 +1447,14 @@ static int acpi_add_single_object(struct acpi_device **child,
}
 
acpi_init_device_object(device, handle, type, sta);
+   /*
+* For ACPI_BUS_TYPE_DEVICE getting the status is delayed till here so
+* that we can call acpi_bus_get_status() and use its quirk handling.
+* Note this must be done before the get power-/wakeup_dev-flags calls.
+*/
+   if (type == ACPI_BUS_TYPE_DEVICE)
+   acpi_bus_get_status(device);
+
acpi_bus_get_power_flags(device);
acpi_bus_get_wakeup_device_flags(device);
 
@@ -1517,9 +1527,11 @@ static int acpi_bus_type_and_status(acpi_handle handle, int *type,
return -ENODEV;
 
*type = ACPI_BUS_TYPE_DEVICE;
-   status = acpi_bus_get_status_handle(handle, sta);
-   if (ACPI_FAILURE(status))
-   *sta = 0;
+   /*
+* acpi_add_single_object updates this once we've an acpi_device
+* so that acpi_bus_get_status' quirk handling can be used.
+*/
+   *sta = 0;
break;
case ACPI_TYPE_PROCESSOR:
*type = ACPI_BUS_TYPE_PROCESSOR;
@@ -1621,6 +1633,8 @@ static void acpi_device_dep_initialize(struct acpi_device *adev)
acpi_status status;
int i;
 
+   adev->dep_unmet = 0;
+
if (!acpi_has_method(adev->handle, "_DEP"))
return;
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 276/293] xen/grant-table: Use put_page instead of free_page

2018-04-08 Thread Sasha Levin
From: Ross Lagerwall 

[ Upstream commit 3ac7292a25db1c607a50752055a18aba32ac2176 ]

The page given to gnttab_end_foreign_access() to free could be a
compound page so use put_page() instead of free_page() since it can
handle both compound and single pages correctly.

This bug was discovered when migrating a Xen VM with several VIFs and
CONFIG_DEBUG_VM enabled. It hits a BUG usually after fewer than 10
iterations. All netfront devices disconnect from the backend during a
suspend/resume and this will call gnttab_end_foreign_access() if a
netfront queue has an outstanding skb. The mismatch between calling
get_page() and free_page() on a compound page causes a reference
counting error which is detected when DEBUG_VM is enabled.

Signed-off-by: Ross Lagerwall 
Reviewed-by: Boris Ostrovsky 
Signed-off-by: Juergen Gross 
Signed-off-by: Sasha Levin 
---
 drivers/xen/grant-table.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index bb36b1e1dbcc..775d4195966c 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -327,7 +327,7 @@ static void gnttab_handle_deferred(unsigned long unused)
if (entry->page) {
pr_debug("freeing g.e. %#x (pfn %#lx)\n",
 entry->ref, page_to_pfn(entry->page));
-   __free_page(entry->page);
+   put_page(entry->page);
} else
pr_info("freeing g.e. %#x\n", entry->ref);
kfree(entry);
@@ -383,7 +383,7 @@ void gnttab_end_foreign_access(grant_ref_t ref, int readonly,
if (gnttab_end_foreign_access_ref(ref, readonly)) {
put_free_entry(ref);
if (page != 0)
-   free_page(page);
+   put_page(virt_to_page(page));
} else
gnttab_add_deferred(ref, readonly,
page ? virt_to_page(page) : NULL);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 277/293] RDS: IB: Fix null pointer issue

2018-04-08 Thread Sasha Levin
From: Guanglei Li 

[ Upstream commit 2c0aa08631b86a4678dbc93b9caa5248014b4458 ]

Scenario:
1. Port goes down and fail-over happens
2. Application (AP) does an rds_bind syscall

PID: 47039  TASK: 89887e2fe640  CPU: 47  COMMAND: "kworker/u:6"
 #0 [898e35f159f0] machine_kexec at 8103abf9
 #1 [898e35f15a60] crash_kexec at 810b96e3
 #2 [898e35f15b30] oops_end at 8150f518
 #3 [898e35f15b60] no_context at 8104854c
 #4 [898e35f15ba0] __bad_area_nosemaphore at 81048675
 #5 [898e35f15bf0] bad_area_nosemaphore at 810487d3
 #6 [898e35f15c00] do_page_fault at 815120b8
 #7 [898e35f15d10] page_fault at 8150ea95
[exception RIP: unknown or invalid address]
RIP:   RSP: 898e35f15dc8  RFLAGS: 00010282
RAX: fffe  RBX: 889b77f6fc00  RCX:81c99d88
RDX:   RSI: 896019ee08e8  RDI:889b77f6fc00
RBP: 898e35f15df0   R8: 896019ee08c8  R9:
R10: 0400  R11:   R12:896019ee08c0
R13: 889b77f6fe68  R14: 81c99d80  R15: a022a1e0
ORIG_RAX:   CS: 0010 SS: 0018
 #8 [898e35f15dc8] cma_ndev_work_handler at a022a228 [rdma_cm]
 #9 [898e35f15df8] process_one_work at 8108a7c6
 #10 [898e35f15e58] worker_thread at 8108bda0
 #11 [898e35f15ee8] kthread at 81090fe6

PID: 45659  TASK: 880d313d2500  CPU: 31  COMMAND: "oracle_45659_ap"
 #0 [881024ccfc98] __schedule at 8150bac4
 #1 [881024ccfd40] schedule at 8150c2cf
 #2 [881024ccfd50] __mutex_lock_slowpath at 8150cee7
 #3 [881024ccfdc0] mutex_lock at 8150cdeb
 #4 [881024ccfde0] rdma_destroy_id at a022a027 [rdma_cm]
 #5 [881024ccfe10] rds_ib_laddr_check at a0357857 [rds_rdma]
 #6 [881024ccfe50] rds_trans_get_preferred at a0324c2a [rds]
 #7 [881024ccfe80] rds_bind at a031d690 [rds]
 #8 [881024ccfeb0] sys_bind at 8142a670

PID: 45659  PID: 47039
rds_ib_laddr_check
  /* create id_priv with a null event_handler */
  rdma_create_id
  rdma_bind_addr
cma_acquire_dev
  /* add id_priv to cma_dev->id_list */
  cma_attach_to_dev
cma_ndev_work_handler
  /* event_handler is null */
  id_priv->id.event_handler

Signed-off-by: Guanglei Li 
Signed-off-by: Honglei Wang 
Reviewed-by: Junxiao Bi 
Reviewed-by: Yanjun Zhu 
Reviewed-by: Leon Romanovsky 
Acked-by: Santosh Shilimkar 
Acked-by: Doug Ledford 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/rds/ib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index 5680d90b0b77..0efb3d2b338d 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -336,7 +336,8 @@ static int rds_ib_laddr_check(struct net *net, __be32 addr)
/* Create a CMA ID and try to bind it. This catches both
 * IB and iWARP capable NICs.
 */
-   cm_id = rdma_create_id(&init_net, NULL, NULL, RDMA_PS_TCP, IB_QPT_RC);
+   cm_id = rdma_create_id(&init_net, rds_rdma_cm_event_handler,
+  NULL, RDMA_PS_TCP, IB_QPT_RC);
if (IS_ERR(cm_id))
return PTR_ERR(cm_id);
 
-- 
2.15.1


[PATCH] ASoC: fsl_esai: Add freq check in set_dai_sysclk()

2018-04-08 Thread Nicolin Chen
The freq parameter indicates either the physical frequency of an actual
input clock or the desired frequency of an output clock for HCKT/R. It
should never be 0, since a zero value could later result in a division
by zero when the clock dividers are calculated.

So this patch adds a check that rejects a zero freq.
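
For reference only (not part of the patch, and the exact expression in the
driver may differ): the kind of divider math this guard protects is sketched
below.

	/* Sketch only: mirrors the divider computation a zero freq would break */
	static unsigned long hck_ratio(unsigned long clk_rate, unsigned long freq)
	{
		/*
		 * freq == 0 would be a division by zero here; the new check in
		 * fsl_esai_set_dai_sysclk() now rejects that case with -EINVAL
		 * before any divider is computed.
		 */
		return clk_rate / freq;
	}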

Signed-off-by: Nicolin Chen 
---
 sound/soc/fsl/fsl_esai.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index da8fd98..d79e99e 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -226,6 +226,12 @@ static int fsl_esai_set_dai_sysclk(struct snd_soc_dai 
*dai, int clk_id,
unsigned long clk_rate;
int ret;
 
+   if (freq == 0) {
+   dev_err(dai->dev, "%sput freq of HCK%c should not be 0Hz\n",
+   in ? "in" : "out", tx ? 'T' : 'R');
+   return -EINVAL;
+   }
+
/* Bypass divider settings if the requirement doesn't change */
if (freq == esai_priv->hck_rate[tx] && dir == esai_priv->hck_dir[tx])
return 0;
-- 
2.7.4



[PATCH AUTOSEL for 4.9 270/293] ACPI: processor_perflib: Do not send _PPC change notification if not ready

2018-04-08 Thread Sasha Levin
From: Chen Yu 

[ Upstream commit ba1edb9a5125a617d612f98eead14b9b84e75c3a ]

The following warning was triggered after resuming from S3,
if all the nonboot CPUs were put offline before suspend:

[ 1840.329515] unchecked MSR access error: RDMSR from 0x771 at rIP: 
0x86061e3a (native_read_msr+0xa/0x30)
[ 1840.329516] Call Trace:
[ 1840.329521]  __rdmsr_on_cpu+0x33/0x50
[ 1840.329525]  generic_exec_single+0x81/0xb0
[ 1840.329527]  smp_call_function_single+0xd2/0x100
[ 1840.329530]  ? acpi_ds_result_pop+0xdd/0xf2
[ 1840.329532]  ? acpi_ds_create_operand+0x215/0x23c
[ 1840.329534]  rdmsrl_on_cpu+0x57/0x80
[ 1840.329536]  ? cpumask_next+0x1b/0x20
[ 1840.329538]  ? rdmsrl_on_cpu+0x57/0x80
[ 1840.329541]  intel_pstate_update_perf_limits+0xf3/0x220
[ 1840.329544]  ? notifier_call_chain+0x4a/0x70
[ 1840.329546]  intel_pstate_set_policy+0x4e/0x150
[ 1840.329548]  cpufreq_set_policy+0xcd/0x2f0
[ 1840.329550]  cpufreq_update_policy+0xb2/0x130
[ 1840.329552]  ? cpufreq_update_policy+0x130/0x130
[ 1840.329556]  acpi_processor_ppc_has_changed+0x65/0x80
[ 1840.329558]  acpi_processor_notify+0x80/0x100
[ 1840.329561]  acpi_ev_notify_dispatch+0x44/0x5c
[ 1840.329563]  acpi_os_execute_deferred+0x14/0x20
[ 1840.329565]  process_one_work+0x193/0x3c0
[ 1840.329567]  worker_thread+0x35/0x3b0
[ 1840.329569]  kthread+0x125/0x140
[ 1840.329571]  ? process_one_work+0x3c0/0x3c0
[ 1840.329572]  ? kthread_park+0x60/0x60
[ 1840.329575]  ? do_syscall_64+0x67/0x180
[ 1840.329577]  ret_from_fork+0x25/0x30
[ 1840.329585] unchecked MSR access error: WRMSR to 0x774 (tried to write 
0x) at rIP: 0x86061f78 (native_write_msr+0x8/0x30)
[ 1840.329586] Call Trace:
[ 1840.329587]  __wrmsr_on_cpu+0x37/0x40
[ 1840.329589]  generic_exec_single+0x81/0xb0
[ 1840.329592]  smp_call_function_single+0xd2/0x100
[ 1840.329594]  ? acpi_ds_create_operand+0x215/0x23c
[ 1840.329595]  ? cpumask_next+0x1b/0x20
[ 1840.329597]  wrmsrl_on_cpu+0x57/0x70
[ 1840.329598]  ? rdmsrl_on_cpu+0x57/0x80
[ 1840.329599]  ? wrmsrl_on_cpu+0x57/0x70
[ 1840.329602]  intel_pstate_hwp_set+0xd3/0x150
[ 1840.329604]  intel_pstate_set_policy+0x119/0x150
[ 1840.329606]  cpufreq_set_policy+0xcd/0x2f0
[ 1840.329607]  cpufreq_update_policy+0xb2/0x130
[ 1840.329610]  ? cpufreq_update_policy+0x130/0x130
[ 1840.329613]  acpi_processor_ppc_has_changed+0x65/0x80
[ 1840.329615]  acpi_processor_notify+0x80/0x100
[ 1840.329617]  acpi_ev_notify_dispatch+0x44/0x5c
[ 1840.329619]  acpi_os_execute_deferred+0x14/0x20
[ 1840.329620]  process_one_work+0x193/0x3c0
[ 1840.329622]  worker_thread+0x35/0x3b0
[ 1840.329624]  kthread+0x125/0x140
[ 1840.329625]  ? process_one_work+0x3c0/0x3c0
[ 1840.329626]  ? kthread_park+0x60/0x60
[ 1840.329628]  ? do_syscall_64+0x67/0x180
[ 1840.329631]  ret_from_fork+0x25/0x30

This happens because, with only one CPU online, the (package-wide)
MSR_PM_ENABLE cannot be re-enabled after resume: intel_pstate_hwp_enable()
is only invoked when the APs come back online, so if no AP is online the
HWP stays disabled after resume (the BIOS disabled it during S3). If a
_PPC change notification that touches an HWP register arrives at this
stage, the warning is triggered.

Since acpi_processor_register_performance() is not called when HWP is
enabled, pr->performance will be NULL in that case, and when it is NULL
there is no need to handle the _PPC change notification at all.

Reported-by: Doug Smythies 
Suggested-by: Srinivas Pandruvada 
Signed-off-by: Yu Chen 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Sasha Levin 
---
 drivers/acpi/processor_perflib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index bb01dea39fdc..9825780a1cd2 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -161,7 +161,7 @@ int acpi_processor_ppc_has_changed(struct acpi_processor 
*pr, int event_flag)
 {
int ret;
 
-   if (ignore_ppc) {
+   if (ignore_ppc || !pr->performance) {
/*
 * Only when it is notification event, the _OST object
 * will be evaluated. Otherwise it is skipped.
-- 
2.15.1


[PATCH AUTOSEL for 4.9 272/293] bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y

2018-04-08 Thread Sasha Levin
From: Yonghong Song 

[ Upstream commit 09584b406742413ac4c8d7e030374d4daa045b69 ]

With CONFIG_BPF_JIT_ALWAYS_ON defined in the config file,
tools/testing/selftests/bpf/test_kmod.sh fails as below:
  [root@localhost bpf]# ./test_kmod.sh
  sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
  [ JIT enabled:0 hardened:0 ]
  [  132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to 
prog_create err=-524 len=4096
  [  132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
  [ JIT enabled:1 hardened:0 ]
  [  133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to 
prog_create err=-524 len=4096
  [  133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
  [ JIT enabled:1 hardened:1 ]
  [  134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to 
prog_create err=-524 len=4096
  [  135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
  [ JIT enabled:1 hardened:2 ]
  [  136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to 
prog_create err=-524 len=4096
  [  136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
  [root@localhost bpf]#

test_kmod.sh loads/removes test_bpf.ko multiple times with different
settings for the sysctls net.core.bpf_jit_{enable,harden}. The failing
test #297 of test_bpf.ko is designed such that JIT compilation always fails.

Commit 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
introduced the following tightening logic:
...
if (!bpf_prog_is_dev_bound(fp->aux)) {
fp = bpf_int_jit_compile(fp);
#ifdef CONFIG_BPF_JIT_ALWAYS_ON
if (!fp->jited) {
*err = -ENOTSUPP;
return fp;
}
#endif
...
With this logic, Test #297 always gets return value -ENOTSUPP
when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure.

This patch fixes the failure by marking test #297 as an expected failure
when CONFIG_BPF_JIT_ALWAYS_ON is defined.

Fixes: 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
Signed-off-by: Yonghong Song 
Signed-off-by: Daniel Borkmann 
Signed-off-by: Sasha Levin 
---
 lib/test_bpf.c | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 98da7520a6aa..1586dfdea809 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -83,6 +83,7 @@ struct bpf_test {
__u32 result;
} test[MAX_SUBTESTS];
int (*fill_helper)(struct bpf_test *self);
+   int expected_errcode; /* used when FLAG_EXPECTED_FAIL is set in the aux 
*/
__u8 frag_data[MAX_DATA];
 };
 
@@ -1900,7 +1901,9 @@ static struct bpf_test tests[] = {
},
CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL,
{ },
-   { }
+   { },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
{
"check: div_k_0",
@@ -1910,7 +1913,9 @@ static struct bpf_test tests[] = {
},
CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL,
{ },
-   { }
+   { },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
{
"check: unknown insn",
@@ -1921,7 +1926,9 @@ static struct bpf_test tests[] = {
},
CLASSIC | FLAG_EXPECTED_FAIL,
{ },
-   { }
+   { },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
{
"check: out of range spill/fill",
@@ -1931,7 +1938,9 @@ static struct bpf_test tests[] = {
},
CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL,
{ },
-   { }
+   { },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
{
"JUMPS + HOLES",
@@ -2023,6 +2032,8 @@ static struct bpf_test tests[] = {
CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL,
{ },
{ },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
{
"check: LDX + RET X",
@@ -2033,6 +2044,8 @@ static struct bpf_test tests[] = {
CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL,
{ },
{ },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
{   /* Mainly checking JIT here. */
"M[]: alt STX + LDX",
@@ -2207,6 +2220,8 @@ static struct bpf_test tests[] = {
CLASSIC | FLAG_NO_DATA | FLAG_EXPECTED_FAIL,
{ },
{ },
+   .fill_helper = NULL,
+   .expected_errcode = -EINVAL,
},
 

[PATCH AUTOSEL for 4.9 268/293] x86/power: Fix swsusp_arch_resume prototype

2018-04-08 Thread Sasha Levin
From: Arnd Bergmann 

[ Upstream commit 328008a72d38b5bde6491e463405c34a81a65d3e ]

The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the
definition in x86-32 does not, and it fails to include the header with the
declaration. This leads to a warning when building with
link-time optimization:

kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match 
original declaration [-Werror=lto-type-mismatch]
 extern asmlinkage int swsusp_arch_resume(void);
   ^
arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously 
declared here
 int swsusp_arch_resume(void)

This moves the declaration into a globally visible header file and fixes up
both x86 definitions to match it.

Signed-off-by: Arnd Bergmann 
Signed-off-by: Thomas Gleixner 
Cc: Len Brown 
Cc: Andi Kleen 
Cc: Nicolas Pitre 
Cc: linux...@vger.kernel.org
Cc: "Rafael J. Wysocki" 
Cc: Pavel Machek 
Cc: Bart Van Assche 
Link: https://lkml.kernel.org/r/20180202145634.200291-2-a...@arndb.de
Signed-off-by: Sasha Levin 
---
 arch/x86/power/hibernate_32.c | 2 +-
 arch/x86/power/hibernate_64.c | 2 +-
 include/linux/suspend.h   | 2 ++
 kernel/power/power.h  | 3 ---
 4 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/power/hibernate_32.c b/arch/x86/power/hibernate_32.c
index 9f14bd34581d..74b516cb39df 100644
--- a/arch/x86/power/hibernate_32.c
+++ b/arch/x86/power/hibernate_32.c
@@ -142,7 +142,7 @@ static inline void resume_init_first_level_page_table(pgd_t 
*pg_dir)
 #endif
 }
 
-int swsusp_arch_resume(void)
+asmlinkage int swsusp_arch_resume(void)
 {
int error;
 
diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index 9634557a5444..0cb1dd461529 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -149,7 +149,7 @@ static int relocate_restore_code(void)
return 0;
 }
 
-int swsusp_arch_resume(void)
+asmlinkage int swsusp_arch_resume(void)
 {
int error;
 
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index d9718378a8be..249dafce2788 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -378,6 +378,8 @@ extern int swsusp_page_is_forbidden(struct page *);
 extern void swsusp_set_page_free(struct page *);
 extern void swsusp_unset_page_free(struct page *);
 extern unsigned long get_safe_page(gfp_t gfp_mask);
+extern asmlinkage int swsusp_arch_suspend(void);
+extern asmlinkage int swsusp_arch_resume(void);
 
 extern void hibernation_set_ops(const struct platform_hibernation_ops *ops);
 extern int hibernate(void);
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 56d1d0dedf76..ccba4d820078 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -103,9 +103,6 @@ extern int in_suspend;
 extern dev_t swsusp_resume_device;
 extern sector_t swsusp_resume_block;
 
-extern asmlinkage int swsusp_arch_suspend(void);
-extern asmlinkage int swsusp_arch_resume(void);
-
 extern int create_basic_memory_bitmaps(void);
 extern void free_basic_memory_bitmaps(void);
 extern int hibernate_preallocate_memory(void);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 232/293] kconfig: Don't leak main menus during parsing

2018-04-08 Thread Sasha Levin
From: Ulf Magnusson 

[ Upstream commit 0724a7c32a54e3e50d28e19e30c59014f61d4e2c ]

If a 'mainmenu' entry appeared in the Kconfig files, two things would
leak:

- The 'struct property' allocated for the default "Linux Kernel
  Configuration" prompt.

- The string for the T_WORD/T_WORD_QUOTE prompt after the
  T_MAINMENU token, allocated on the heap in zconf.l.

To fix it, introduce a new 'no_mainmenu_stmt' nonterminal that matches
if there's no 'mainmenu' and adds the default prompt. That means the
prompt only gets allocated once regardless of whether there's a
'mainmenu' statement or not, and managing it becomes simple.

Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix:

LEAK SUMMARY:
   definitely lost: 344,568 bytes in 14,352 blocks
   ...

Summary after the fix:

LEAK SUMMARY:
   definitely lost: 344,440 bytes in 14,350 blocks
   ...

Signed-off-by: Ulf Magnusson 
Signed-off-by: Masahiro Yamada 
Signed-off-by: Sasha Levin 
---
 scripts/kconfig/zconf.y | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/scripts/kconfig/zconf.y b/scripts/kconfig/zconf.y
index 71bf8bff696a..5122ed2d839a 100644
--- a/scripts/kconfig/zconf.y
+++ b/scripts/kconfig/zconf.y
@@ -107,7 +107,27 @@ static struct menu *current_menu, *current_entry;
 %%
 input: nl start | start;
 
-start: mainmenu_stmt stmt_list | stmt_list;
+start: mainmenu_stmt stmt_list | no_mainmenu_stmt stmt_list;
+
+/* mainmenu entry */
+
+mainmenu_stmt: T_MAINMENU prompt nl
+{
+   menu_add_prompt(P_MENU, $2, NULL);
+};
+
+/* Default main menu, if there's no mainmenu entry */
+
+no_mainmenu_stmt: /* empty */
+{
+   /*
+* Hack: Keep the main menu title on the heap so we can safely free it
+* later regardless of whether it comes from the 'prompt' in
+* mainmenu_stmt or here
+*/
+   menu_add_prompt(P_MENU, strdup("Linux Kernel Configuration"), NULL);
+};
+
 
 stmt_list:
  /* empty */
@@ -344,13 +364,6 @@ if_block:
| if_block choice_stmt
 ;
 
-/* mainmenu entry */
-
-mainmenu_stmt: T_MAINMENU prompt nl
-{
-   menu_add_prompt(P_MENU, $2, NULL);
-};
-
 /* menu entry */
 
 menu: T_MENU prompt T_EOL
@@ -495,6 +508,7 @@ word_opt: /* empty */   { $$ = NULL; }
 
 void conf_parse(const char *name)
 {
+   const char *tmp;
struct symbol *sym;
int i;
 
@@ -502,7 +516,6 @@ void conf_parse(const char *name)
 
sym_init();
_menu_init();
-   rootmenu.prompt = menu_add_prompt(P_MENU, "Linux Kernel Configuration", 
NULL);
 
if (getenv("ZCONF_DEBUG"))
zconfdebug = 1;
@@ -512,8 +525,10 @@ void conf_parse(const char *name)
if (!modules_sym)
modules_sym = sym_find( "n" );
 
+   tmp = rootmenu.prompt->text;
rootmenu.prompt->text = _(rootmenu.prompt->text);
rootmenu.prompt->text = sym_expand_string_value(rootmenu.prompt->text);
+   free((char*)tmp);
 
menu_finalize();
for_all_symbols(i, sym) {
-- 
2.15.1


[PATCH AUTOSEL for 4.9 273/293] MIPS: generic: Fix machine compatible matching

2018-04-08 Thread Sasha Levin
From: James Hogan 

[ Upstream commit 9a9ab3078e2744a1a55163cfaec73a5798aae33e ]

We now have a platform (Ranchu) in the "generic" platform code which
matches based on the FDT compatible string using
mips_machine_is_compatible(); however, that function doesn't stop at a
blank struct of_device_id::compatible, because that member is an array
in the struct, not a pointer to a string.

Fix the loop completion to check the first byte of the compatible array
rather than the address of the compatible array in the struct.
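
For illustration only (the compatible string below is made up, not taken
from the referenced code): match tables of this kind end with an all-zero
sentinel entry, so the terminating condition must look at the first
character of the embedded array rather than the array's address, which is
never NULL.

	static const struct of_device_id example_matches[] = {
		{ .compatible = "vendor,example-board" },	/* hypothetical */
		{ /* sentinel: compatible[0] == '\0' */ },
	};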

Fixes: eed0eabd12ef ("MIPS: generic: Introduce generic DT-based board support")
Signed-off-by: James Hogan 
Reviewed-by: Paul Burton 
Reviewed-by: Matt Redfearn 
Cc: Ralf Baechle 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18580/
Signed-off-by: Sasha Levin 
---
 arch/mips/include/asm/machine.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/machine.h b/arch/mips/include/asm/machine.h
index 6b444cd9526f..db930cdc715f 100644
--- a/arch/mips/include/asm/machine.h
+++ b/arch/mips/include/asm/machine.h
@@ -52,7 +52,7 @@ mips_machine_is_compatible(const struct mips_machine *mach, 
const void *fdt)
if (!mach->matches)
return NULL;
 
-   for (match = mach->matches; match->compatible; match++) {
+   for (match = mach->matches; match->compatible[0]; match++) {
if (fdt_node_check_compatible(fdt, 0, match->compatible) == 0)
return match;
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 266/293] drm/nouveau/pmu/fuc: don't use movw directly anymore

2018-04-08 Thread Sasha Levin
From: Karol Herbst 

[ Upstream commit fe9748b7b41cee11f8db57fb8b20bc540a33102a ]

Fixes failure to compile with recent envyas as a result of the 'movw'
alias being removed for v5.

A bit of history:

v3 only has a 16-bit sign-extended immediate mov op. In order to set
the high bits, there's a separate 'sethi' op. envyas validates that
the value passed to mov(imm) is between -0x8000 and 0x7fff. In order
to simplify macros that load both the low and high word, a 'movw'
alias was added which takes an unsigned 16-bit immediate. However the
actual hardware op still sign extends.

v5 has a full 32-bit immediate mov op. The v3 16-bit immediate mov op
is gone (loads 0 into the dst reg). However due to a bug in envyas,
the movw alias still existed, and selected the no-longer-present v3
16-bit immediate mov op. As a result usage of movw on v5 is the same
as mov with a 0x0 argument.

The proper fix throughout is to only ever use the 'movw' alias in
combination with 'sethi'. Anything else should get the sign-extended
validation to ensure that the intended value ends up in the
destination register.

Changes in fuc3 binaries is the result of a different encoding being
selected for a mov with an 8-bit value.

v2: added commit message written by Ilia, thanks for that!
v3: messed up rebasing, now it should apply

Signed-off-by: Karol Herbst 
Signed-off-by: Ben Skeggs 
Signed-off-by: Sasha Levin 
---
 .../drm/nouveau/nvkm/subdev/pmu/fuc/gf100.fuc3.h   |  746 +++
 .../drm/nouveau/nvkm/subdev/pmu/fuc/gk208.fuc5.h   |  802 
 .../drm/nouveau/nvkm/subdev/pmu/fuc/gt215.fuc3.h   | 1006 ++--
 .../gpu/drm/nouveau/nvkm/subdev/pmu/fuc/memx.fuc   |   30 +-
 4 files changed, 1292 insertions(+), 1292 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/fuc/gf100.fuc3.h 
b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/fuc/gf100.fuc3.h
index e2faccffee6f..d66e0e76faf4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/fuc/gf100.fuc3.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/fuc/gf100.fuc3.h
@@ -46,8 +46,8 @@ uint32_t gf100_pmu_data[] = {
0x,
0x,
0x584d454d,
-   0x0756,
-   0x0748,
+   0x0754,
+   0x0746,
0x,
0x,
0x,
@@ -68,8 +68,8 @@ uint32_t gf100_pmu_data[] = {
0x,
0x,
0x46524550,
-   0x075a,
0x0758,
+   0x0756,
0x,
0x,
0x,
@@ -90,8 +90,8 @@ uint32_t gf100_pmu_data[] = {
0x,
0x,
0x5f433249,
-   0x0b8a,
-   0x0a2d,
+   0x0b88,
+   0x0a2b,
0x,
0x,
0x,
@@ -112,8 +112,8 @@ uint32_t gf100_pmu_data[] = {
0x,
0x,
0x54534554,
-   0x0bb3,
-   0x0b8c,
+   0x0bb1,
+   0x0b8a,
0x,
0x,
0x,
@@ -134,8 +134,8 @@ uint32_t gf100_pmu_data[] = {
0x,
0x,
0x454c4449,
-   0x0bbf,
0x0bbd,
+   0x0bbb,
0x,
0x,
0x,
@@ -236,19 +236,19 @@ uint32_t gf100_pmu_data[] = {
0x05d3,
0x0003,
0x0002,
-   0x069d,
+   0x069b,
0x00040004,
0x,
-   0x06b9,
+   0x06b7,
0x00010005,
0x,
-   0x06d6,
+   0x06d4,
0x00010006,
0x,
0x065b,
0x0007,
0x,
-   0x06e1,
+   0x06df,
 /* 0x03c4: memx_func_tail */
 /* 0x03c4: memx_ts_start */
0x,
@@ -1372,432 +1372,432 @@ uint32_t gf100_pmu_code[] = {
 /* 0x065b: memx_func_wait_vblank */
0x9800f840,
0x66b00016,
-   0x130bf400,
+   0x120bf400,
0xf40166b0,
0x0ef4060b,
 /* 0x066d: memx_func_wait_vblank_head1 */
-   0x2077f12e,
-   0x070ef400,
-/* 0x0674: memx_func_wait_vblank_head0 */
-   0x000877f1,
-/* 0x0678: memx_func_wait_vblank_0 */
-   0x07c467f1,
-   0xcf0664b6,
-   0x67fd0066,
-   0xf31bf404,
-/* 0x0688: memx_func_wait_vblank_1 */
-   0x07c467f1,
-   0xcf0664b6,
-   0x67fd0066,
-   0xf30bf404,
-/* 0x0698: memx_func_wait_vblank_fini */
-   0xf80410b6,
-/* 0x069d: memx_func_wr32 */
-   0x00169800,
-   0xb6011598,
-   0x60f90810,
-   0xd0fc50f9,
-   0x21f4e0fc,
-   0x0242b640,
-   0xf8e91bf4,
-/* 0x06b9: memx_func_wait */
-   0x2c87f000,
-   0xcf0684b6,
-   0x1e980088,
-   0x011d9800,
-   0x98021c98,
-   0x10b6031b,
-   0xa321f410,
-/* 0x06d6: memx_func_delay */
-   0x1e9800f8,
-   0x0410b600,
-   0xf87e21f4,
-/* 0x06e1: memx_func_train 

[PATCH AUTOSEL for 4.9 258/293] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

2018-04-08 Thread Sasha Levin
From: Yisheng Xie 

[ Upstream commit 0486a38bcc4749808edbc848f1bcf232042770fc ]

As described in the migrate_pages manpage, errno should be set to EINVAL
when none of the node IDs specified by new_nodes is online and allowed by
the process's current cpuset context, or when none of the specified nodes
contains memory.  However, when testing the following case:

new_nodes = 0;
old_nodes = 0xf;
ret = migrate_pages(pid, old_nodes, new_nodes, MAX);

ret is 0 and no errno is set.  As new_nodes is empty, EINVAL is expected
as documented.
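
A self-contained userspace version of that reproducer (illustrative only;
maxnode and the node masks are arbitrary) could look roughly like this:

	/* repro.c -- build with: cc -o repro repro.c */
	#include <errno.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		unsigned long old_nodes = 0xf, new_nodes = 0;
		long ret;

		/* raw syscall: migrate_pages(pid, maxnode, old_nodes, new_nodes) */
		ret = syscall(SYS_migrate_pages, getpid(), 64UL,
			      &old_nodes, &new_nodes);

		/* with this fix: ret == -1 and errno == EINVAL */
		printf("ret=%ld errno=%d\n", ret, errno);
		return 0;
	}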

To fix cases like the one above, this patch first checks whether the
target nodes ANDed with the current task_nodes are empty, and then checks
whether that result ANDed with node_states[N_MEMORY] is empty.

Link: 
http://lkml.kernel.org/r/1510882624-44342-4-git-send-email-xieyishe...@huawei.com
Signed-off-by: Yisheng Xie 
Acked-by: Vlastimil Babka 
Cc: Andi Kleen 
Cc: Chris Salls 
Cc: Christopher Lameter 
Cc: David Rientjes 
Cc: Ingo Molnar 
Cc: Naoya Horiguchi 
Cc: Tan Xiaojun 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 mm/mempolicy.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 92f92f477304..c779a12f1ff8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1442,10 +1442,14 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned 
long, maxnode,
goto out_put;
}
 
-   if (!nodes_subset(*new, node_states[N_MEMORY])) {
-   err = -EINVAL;
+   task_nodes = cpuset_mems_allowed(current);
+   nodes_and(*new, *new, task_nodes);
+   if (nodes_empty(*new))
+   goto out_put;
+
+   nodes_and(*new, *new, node_states[N_MEMORY]);
+   if (nodes_empty(*new))
goto out_put;
-   }
 
err = security_task_movememory(task);
if (err)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 267/293] netfilter: ipv6: nf_defrag: Kill frag queue on RFC2460 failure

2018-04-08 Thread Sasha Levin
From: Subash Abhinov Kasiviswanathan 

[ Upstream commit ea23d5e3bf340e413b8e05c13da233c99c64142b ]

Failures were seen in ICMPv6 fragmentation timeout tests if they were
run after the RFC2460 failure tests. Kernel was not sending out the
ICMPv6 fragment reassembly time exceeded packet after the fragmentation
reassembly timeout of 1 minute had elapsed.

This happened because the frag queue was not released if an error in
IPv6 fragmentation header was detected by RFC2460.

Fixes: 83f1999caeb1 ("netfilter: ipv6: nf_defrag: Pass on packets to stack per 
RFC2460")
Signed-off-by: Subash Abhinov Kasiviswanathan 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin 
---
 net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c 
b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 5edfe66a3d7a..64ec23388450 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -263,6 +263,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct 
sk_buff *skb,
 * this case. -DaveM
 */
pr_debug("end of fragment not rounded to 8 bytes.\n");
+   inet_frag_kill(&fq->q, &nf_frags);
return -EPROTO;
}
if (end > fq->q.len) {
-- 
2.15.1


[PATCH AUTOSEL for 4.9 264/293] openvswitch: Remove padding from packet before L3+ conntrack processing

2018-04-08 Thread Sasha Levin
From: Ed Swierk 

[ Upstream commit 9382fe71c0058465e942a633869629929102843d ]

IPv4 and IPv6 packets may arrive with lower-layer padding that is not
included in the L3 length. For example, a short IPv4 packet may have
up to 6 bytes of padding following the IP payload when received on an
Ethernet device with a minimum packet length of 64 bytes.

Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
and help() in nf_conntrack_ftp) assume skb->len reflects the length of
the L3 header and payload, rather than referring back to
ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
lower-layer padding.

In the normal IPv4 receive path, ip_rcv() trims the packet to
ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
in the br_netfilter receive path, br_validate_ipv4() and
br_validate_ipv6() trim the packet to the L3 length before invoking
netfilter hooks.

Currently in the OVS conntrack receive path, ovs_ct_execute() pulls
the skb to the L3 header but does not trim it to the L3 length before
calling nf_conntrack_in(NF_INET_PRE_ROUTING). When
nf_conntrack_proto_tcp encounters a packet with lower-layer padding,
nf_ip_checksum() fails causing a "nf_ct_tcp: bad TCP checksum" log
message. While extra zero bytes don't affect the checksum, the length
in the IP pseudoheader does. That length is based on skb->len, and
without trimming, it doesn't match the length the sender used when
computing the checksum.

In ovs_ct_execute(), trim the skb to the L3 length before higher-layer
processing.

Signed-off-by: Ed Swierk 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/openvswitch/conntrack.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 466393936db9..f135814c34ad 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -906,6 +906,36 @@ static int ovs_ct_commit(struct net *net, struct 
sw_flow_key *key,
return 0;
 }
 
+/* Trim the skb to the length specified by the IP/IPv6 header,
+ * removing any trailing lower-layer padding. This prepares the skb
+ * for higher-layer processing that assumes skb->len excludes padding
+ * (such as nf_ip_checksum). The caller needs to pull the skb to the
+ * network header, and ensure ip_hdr/ipv6_hdr points to valid data.
+ */
+static int ovs_skb_network_trim(struct sk_buff *skb)
+{
+   unsigned int len;
+   int err;
+
+   switch (skb->protocol) {
+   case htons(ETH_P_IP):
+   len = ntohs(ip_hdr(skb)->tot_len);
+   break;
+   case htons(ETH_P_IPV6):
+   len = sizeof(struct ipv6hdr)
+   + ntohs(ipv6_hdr(skb)->payload_len);
+   break;
+   default:
+   len = skb->len;
+   }
+
+   err = pskb_trim_rcsum(skb, len);
+   if (err)
+   kfree_skb(skb);
+
+   return err;
+}
+
 /* Returns 0 on success, -EINPROGRESS if 'skb' is stolen, or other nonzero
  * value if 'skb' is freed.
  */
@@ -920,6 +950,10 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
nh_ofs = skb_network_offset(skb);
skb_pull_rcsum(skb, nh_ofs);
 
+   err = ovs_skb_network_trim(skb);
+   if (err)
+   return err;
+
if (key->ip.frag != OVS_FRAG_TYPE_NONE) {
err = handle_fragments(net, key, info->zone.id, skb);
if (err)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 262/293] mm: pin address_space before dereferencing it while isolating an LRU page

2018-04-08 Thread Sasha Levin
From: Mel Gorman 

[ Upstream commit 69d763fc6d3aee787a3e8c8c35092b4f4960fa5d ]

Minchan Kim asked the following question -- what locks protects
address_space destroying when race happens between inode trauncation and
__isolate_lru_page? Jan Kara clarified by describing the race as follows

CPU1CPU2

truncate(inode) __isolate_lru_page()
  ...
  truncate_inode_page(mapping, page);
delete_from_page_cache(page)
  spin_lock_irqsave(&mapping->tree_lock, flags);
__delete_from_page_cache(page, NULL)
  page_cache_tree_delete(..)
...   mapping = page_mapping(page);
page->mapping = NULL;
...
  spin_unlock_irqrestore(&mapping->tree_lock, flags);
  page_cache_free_page(mapping, page)
put_page(page)
  if (put_page_testzero(page)) -> false
- inode now has no pages and can be freed including embedded address_space

  if (mapping && 
!mapping->a_ops->migratepage)
- we've dereferenced mapping which is potentially already free.

The race is theoretically possible but unlikely.  Before
delete_from_page_cache, truncate_cleanup_page is called, so the page is
likely to be !PageDirty or PageWriteback, which gets skipped by the only
caller that checks the mapping in __isolate_lru_page.  Even if the race
occurs, a substantial amount of work has to happen during a tiny window
with no preemption, but it could potentially be done using a virtual
machine to artificially slow one CPU or halt it during the critical
window.

This patch should eliminate the race with truncation by try-locking the
page before dereferencing the mapping and aborting if the lock was not
acquired.  There was a suggestion from Huang Ying to use RCU as a
side-effect to prevent the mapping being freed.  However, I do not like
that solution, as it is an unconventional means of preserving a mapping
and it is not a context where rcu_read_lock is obviously protecting RCU
data.

Link: 
http://lkml.kernel.org/r/20180104102512.2qos3h5vqzeis...@techsingularity.net
Fixes: c82449352854 ("mm: compaction: make isolate_lru_page() filter-aware 
again")
Signed-off-by: Mel Gorman 
Acked-by: Minchan Kim 
Cc: "Huang, Ying" 
Cc: Jan Kara 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 mm/vmscan.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index cdd5c3b5c357..d012c13d96f7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1374,6 +1374,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t 
mode)
 
if (PageDirty(page)) {
struct address_space *mapping;
+   bool migrate_dirty;
 
/* ISOLATE_CLEAN means only clean pages */
if (mode & ISOLATE_CLEAN)
@@ -1382,10 +1383,19 @@ int __isolate_lru_page(struct page *page, 
isolate_mode_t mode)
/*
 * Only pages without mappings or that have a
 * ->migratepage callback are possible to migrate
-* without blocking
+* without blocking. However, we can be racing with
+* truncation so it's necessary to lock the page
+* to stabilise the mapping as truncation holds
+* the page lock until after the page is removed
+* from the page cache.
 */
+   if (!trylock_page(page))
+   return ret;
+
mapping = page_mapping(page);
-   if (mapping && !mapping->a_ops->migratepage)
+   migrate_dirty = mapping && mapping->a_ops->migratepage;
+   unlock_page(page);
+   if (!migrate_dirty)
return ret;
}
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 250/293] ntb_transport: Fix bug with max_mw_size parameter

2018-04-08 Thread Sasha Levin
From: Logan Gunthorpe 

[ Upstream commit cbd27448faff4843ac4b66cc71445a10623ff48d ]

When using the max_mw_size parameter of ntb_transport to limit the size of
the Memory windows, communication cannot be established and the queues
freeze.

This is because the mw_size that's reported to the peer is correctly
limited but the size used locally is not. So the MW is initialized
with a buffer smaller than the window but the TX side is using the
full window. This means the TX side will be writing to a region of the
window that points nowhere.

This is easily fixed by applying the same limit to tx_size in
ntb_transport_init_queue().

Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Logan Gunthorpe 
Acked-by: Allen Hubbe 
Cc: Dave Jiang 
Signed-off-by: Jon Mason 
Signed-off-by: Sasha Levin 
---
 drivers/ntb/ntb_transport.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index 24222a5d8df2..da95bd8f0f72 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -996,6 +996,9 @@ static int ntb_transport_init_queue(struct 
ntb_transport_ctx *nt,
mw_base = nt->mw_vec[mw_num].phys_addr;
mw_size = nt->mw_vec[mw_num].phys_size;
 
+   if (max_mw_size && mw_size > max_mw_size)
+   mw_size = max_mw_size;
+
tx_size = (unsigned int)mw_size / num_qps_mw;
qp_offset = tx_size * (qp_num / mw_count);
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 259/293] asm-generic: provide generic_pmdp_establish()

2018-04-08 Thread Sasha Levin
From: "Kirill A. Shutemov" 

[ Upstream commit c58f0bb77ed8bf93dfdde762b01cb67eebbdfc29 ]

Patch series "Do not lose dirty bit on THP pages", v4.

Vlastimil noted that pmdp_invalidate() is not atomic and that we can lose
dirty and access bits if the CPU sets them after the pmdp dereference but
before set_pmd_at().

The bug can lead to data loss, but the race window is tiny and I haven't
seen any reports suggesting that it happens in reality, so I don't think
it is worth sending to stable.

Unfortunately, there's no way to address the issue in a generic way.  We
need to fix all architectures that support THP one by one.

All architectures that support THP have to provide an atomic
pmdp_invalidate() that returns the previous value.

If the generic implementation of pmdp_invalidate() is used, the
architecture needs to provide an atomic pmdp_establish().

pmdp_establish() is not used outside the generic implementation of
pmdp_invalidate() so far, but I think this can change in the future.

This patch (of 12):

This is an implementation of pmdp_establish() that is only suitable for
an architecture that doesn't have hardware dirty/accessed bits.  In that
case we can't race with the CPU setting these bits, so a non-atomic
approach is fine.
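
For illustration only (the header fragment below is hypothetical, not part
of this series): an architecture without hardware dirty/accessed bits could
opt in from its asm/pgtable.h along these lines.

	/* hypothetical arch/<arch>/include/asm/pgtable.h fragment */

	/*
	 * This architecture has no hardware dirty/accessed bits, so the
	 * non-atomic generic_pmdp_establish() helper provided by
	 * <asm-generic/pgtable.h> is sufficient.
	 */
	#define pmdp_establish generic_pmdp_establish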

Link: 
http://lkml.kernel.org/r/20171213105756.69879-2-kirill.shute...@linux.intel.com
Signed-off-by: Kirill A. Shutemov 
Cc: Vlastimil Babka 
Cc: Andrea Arcangeli 
Cc: Michal Hocko 
Cc: Aneesh Kumar K.V 
Cc: Catalin Marinas 
Cc: David Daney 
Cc: David Miller 
Cc: H. Peter Anvin 
Cc: Hugh Dickins 
Cc: Ingo Molnar 
Cc: Martin Schwidefsky 
Cc: Nitin Gupta 
Cc: Ralf Baechle 
Cc: Thomas Gleixner 
Cc: Vineet Gupta 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 include/asm-generic/pgtable.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index f6ea0f3c03f8..4e8551c8ef18 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -234,6 +234,21 @@ extern void pgtable_trans_huge_deposit(struct mm_struct 
*mm, pmd_t *pmdp,
 extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t 
*pmdp);
 #endif
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+/*
+ * This is an implementation of pmdp_establish() that is only suitable for an
+ * architecture that doesn't have hardware dirty/accessed bits. In this case we
+ * can't race with CPU which sets these bits and non-atomic aproach is fine.
+ */
+static inline pmd_t generic_pmdp_establish(struct vm_area_struct *vma,
+   unsigned long address, pmd_t *pmdp, pmd_t pmd)
+{
+   pmd_t old_pmd = *pmdp;
+   set_pmd_at(vma->vm_mm, address, pmdp, pmd);
+   return old_pmd;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE
 extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 261/293] mm: thp: use down_read_trylock() in khugepaged to avoid long block

2018-04-08 Thread Sasha Levin
From: Yang Shi 

[ Upstream commit 3b454ad35043dfbd3b5d2bb92b0991d6342afb44 ]

In the current design, khugepaged needs to acquire mmap_sem before
scanning an mm.  In some corner cases, khugepaged may scan a process
which is modifying its memory mapping, so khugepaged blocks in an
uninterruptible state.  If the process holds mmap_sem for a long time
while modifying a huge memory space, this can trigger the khugepaged
hang below:

  INFO: task khugepaged:270 blocked for more than 120 seconds.
  Tainted: G E 4.9.65-006.ali3000.alios7.x86_64 #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  khugepaged D 0 270 2 0x 
  883f3deae4c0  883f610596c0 883f7d359440
  883f63818000 c90019adfc78 817079a5 d67e5aa8c1860a64
  0246 883f7d359440 c90019adfc88 883f610596c0
  Call Trace:
schedule+0x36/0x80
rwsem_down_read_failed+0xf0/0x150
call_rwsem_down_read_failed+0x18/0x30
down_read+0x20/0x40
khugepaged+0x476/0x11d0
kthread+0xe6/0x100
ret_from_fork+0x25/0x30

It is pointless for khugepaged to simply block waiting for the semaphore,
so replace down_read() with down_read_trylock() and move on to scanning
the next mm instead of blocking, which also gives other processes more
chances to install THP.  khugepaged can then come back to the skipped mm
when it has finished the current round of full_scan.

The change also appears to improve khugepaged efficiency a little.

Below is the test result when running LTP on a 24 cores 4GB memory 2
nodes NUMA VM:

pristine  w/ trylock
  full_scan 197   187
  pages_collapsed   2126
  thp_fault_alloc   40818 44466
  thp_fault_fallback18413 16679
  thp_collapse_alloc21150
  thp_collapse_alloc_failed 1416
  thp_file_alloc369   369

[a...@linux-foundation.org: coding-style fixes]
[a...@linux-foundation.org: tweak comment]
[a...@arndb.de: avoid uninitialized variable use]
  Link: http://lkml.kernel.org/r/20171215125129.2948634-1-a...@arndb.de
Link: 
http://lkml.kernel.org/r/1513281203-54878-1-git-send-email-yan...@alibaba-inc.com
Signed-off-by: Yang Shi 
Acked-by: Kirill A. Shutemov 
Acked-by: Michal Hocko 
Cc: Hugh Dickins 
Cc: Andrea Arcangeli 
Signed-off-by: Arnd Bergmann 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 mm/khugepaged.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 898eb26f5dc8..48a39cbdf2d4 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1678,10 +1678,14 @@ static unsigned int khugepaged_scan_mm_slot(unsigned 
int pages,
spin_unlock(&khugepaged_mm_lock);
 
mm = mm_slot->mm;
-   down_read(&mm->mmap_sem);
-   if (unlikely(khugepaged_test_exit(mm)))
-   vma = NULL;
-   else
+   /*
+* Don't wait for semaphore (to avoid long wait times).  Just move to
+* the next mm on the list.
+*/
+   vma = NULL;
+   if (unlikely(!down_read_trylock(&mm->mmap_sem)))
+   goto breakouterloop_mmap_sem;
+   if (likely(!khugepaged_test_exit(mm)))
vma = find_vma(mm, khugepaged_scan.address);
 
progress++;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 245/293] device property: Define type of PROPERTY_ENRTY_*() macros

2018-04-08 Thread Sasha Levin
From: Andy Shevchenko 

[ Upstream commit c505cbd45f6e9c539d57dd171d95ec7e5e9f9cd0 ]

Some drivers may use the macro in a runtime flow, like

  struct property_entry p[10];
...
  p[index++] = PROPERTY_ENTRY_U8("u8 property", u8_data);

In that case, in the absence of the data type, the compiler fails the build:

drivers/char/ipmi/ipmi_dmi.c:79:29: error: Expected ; at end of statement
drivers/char/ipmi/ipmi_dmi.c:79:29: error: got {
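
With the compound-literal form below, runtime assignment compiles as
intended.  A minimal sketch (the property names and values here are made
up, and the zero-terminated array convention is just illustrative):

	struct property_entry props[3] = { };
	unsigned int i = 0;

	props[i++] = PROPERTY_ENTRY_U8("vendor,mode", 3);
	props[i++] = PROPERTY_ENTRY_STRING("vendor,label", "demo");
	/* props[2] stays all-zero so the array remains terminated */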

Acked-by: Corey Minyard 
Cc: Corey Minyard 
Signed-off-by: Andy Shevchenko 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 include/linux/property.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/property.h b/include/linux/property.h
index 338f9b76914b..459337fb44d0 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -187,7 +187,7 @@ struct property_entry {
  */
 
 #define PROPERTY_ENTRY_INTEGER_ARRAY(_name_, _type_, _val_)\
-{  \
+(struct property_entry) {  \
.name = _name_, \
.length = ARRAY_SIZE(_val_) * sizeof(_type_),   \
.is_array = true,   \
@@ -205,7 +205,7 @@ struct property_entry {
PROPERTY_ENTRY_INTEGER_ARRAY(_name_, u64, _val_)
 
 #define PROPERTY_ENTRY_STRING_ARRAY(_name_, _val_) \
-{  \
+(struct property_entry) {  \
.name = _name_, \
.length = ARRAY_SIZE(_val_) * sizeof(const char *), \
.is_array = true,   \
@@ -214,7 +214,7 @@ struct property_entry {
 }
 
 #define PROPERTY_ENTRY_INTEGER(_name_, _type_, _val_)  \
-{  \
+(struct property_entry) {  \
.name = _name_, \
.length = sizeof(_type_),   \
.is_string = false, \
@@ -231,7 +231,7 @@ struct property_entry {
PROPERTY_ENTRY_INTEGER(_name_, u64, _val_)
 
 #define PROPERTY_ENTRY_STRING(_name_, _val_)   \
-{  \
+(struct property_entry) {  \
.name = _name_, \
.length = sizeof(_val_),\
.is_string = true,  \
@@ -239,7 +239,7 @@ struct property_entry {
 }
 
 #define PROPERTY_ENTRY_BOOL(_name_)\
-{  \
+(struct property_entry) {  \
.name = _name_, \
 }
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 257/293] mm/mempolicy: fix the check of nodemask from user

2018-04-08 Thread Sasha Levin
From: Yisheng Xie 

[ Upstream commit 56521e7a02b7b84a5e72691a1fb15570e6055545 ]

As Xiaojun reported, the LTP migrate_pages01 test fails on an arm64
system which has 4 nodes [0...3], all with memory, and CONFIG_NODES_SHIFT=2:

  migrate_pages010  TINFO  :  test_invalid_nodes
  migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure 
- returned value = 0, expected: -1
  migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded 
unexpectedly

In this case the test_invalid_nodes step of migrate_pages01 calls
SYSC_migrate_pages as:

  migrate_pages(0, , {0x0001}, 64, , {0x0010}, 64) = 0

The new nodes mask specifies one or more node IDs that are greater than
the maximum supported node ID; however, errno is not set to EINVAL as
expected.

As the man pages of set_mempolicy[1], mbind[2], and migrate_pages[3]
mention, when the nodemask specifies one or more node IDs greater than
the maximum supported node ID, errno should be set to EINVAL.  However,
get_nodes() only checks whether the bits in
[BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES), maxnode) are zero, and leaves
[MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)) unchecked.

This patch checks the bits in [MAX_NUMNODES, maxnode) in get_nodes() so
that migrate_pages sets errno to EINVAL when the nodemask specifies one
or more node IDs greater than the maximum supported node ID, following
the man pages' guidance.

[1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
[2] http://man7.org/linux/man-pages/man2/mbind.2.html
[3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html

Link: 
http://lkml.kernel.org/r/1510882624-44342-3-git-send-email-xieyishe...@huawei.com
Signed-off-by: Yisheng Xie 
Reported-by: Tan Xiaojun 
Acked-by: Vlastimil Babka 
Cc: Andi Kleen 
Cc: Chris Salls 
Cc: Christopher Lameter 
Cc: David Rientjes 
Cc: Ingo Molnar 
Cc: Naoya Horiguchi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 mm/mempolicy.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a8ab5e73dc61..92f92f477304 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1264,6 +1264,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned 
long __user *nmask,
 unsigned long maxnode)
 {
unsigned long k;
+   unsigned long t;
unsigned long nlongs;
unsigned long endmask;
 
@@ -1280,13 +1281,19 @@ static int get_nodes(nodemask_t *nodes, const unsigned 
long __user *nmask,
else
endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
 
-   /* When the user specified more nodes than supported just check
-  if the non supported part is all zero. */
+   /*
+* When the user specified more nodes than supported just check
+* if the non supported part is all zero.
+*
+* If maxnode have more longs than MAX_NUMNODES, check
+* the bits in that area first. And then go through to
+* check the rest bits which equal or bigger than MAX_NUMNODES.
+* Otherwise, just check bits [MAX_NUMNODES, maxnode).
+*/
if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
if (nlongs > PAGE_SIZE/sizeof(long))
return -EINVAL;
for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
-   unsigned long t;
if (get_user(t, nmask + k))
return -EFAULT;
if (k == nlongs - 1) {
@@ -1299,6 +1306,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned 
long __user *nmask,
endmask = ~0UL;
}
 
+   if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
+   unsigned long valid_mask = endmask;
+
+   valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
+   if (get_user(t, nmask + nlongs - 1))
+   return -EFAULT;
+   if (t & valid_mask)
+   return -EINVAL;
+   }
+
if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned 
long)))
return -EFAULT;
nodes_addr(*nodes)[nlongs-1] &= endmask;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 247/293] powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes

2018-04-08 Thread Sasha Levin
From: Michael Bringmann 

[ Upstream commit a346137e9142b039fd13af2e59696e3d40c487ef ]

On powerpc systems which allow 'hot-add' of CPU or memory resources,
it may occur that the new resources are to be inserted into nodes that
were not used for these resources at bootup. In the kernel, any node
that is used must be defined and initialized. These empty nodes may
occur when,

* Dedicated vs. shared resources. Shared resources require information
  such as the VPHN hcall for CPU assignment to nodes. Associativity
  decisions made based on dedicated resource rules, such as
  associativity properties in the device tree, may vary from decisions
  made using the values returned by the VPHN hcall.

* memoryless nodes at boot. Nodes need to be defined as 'possible' at
  boot for operation with other code modules. Previously, the powerpc
  code would limit the set of possible nodes to those which have
  memory assigned at boot, and were thus online. Subsequent add/remove
  of CPUs or memory would only work with this subset of possible
  nodes.

* memoryless nodes with CPUs at boot. Due to the previous restriction
  on nodes, nodes that had CPUs but no memory were being collapsed
  into other nodes that did have memory at boot. In practice this
  meant that the node assignment presented by the runtime kernel
  differed from the affinity and associativity attributes presented by
  the device tree or VPHN hcalls. Nodes that might be known to the
  pHyp were not 'possible' in the runtime kernel because they did not
  have memory at boot.

This patch ensures that sufficient nodes are defined to support
configuration requirements after boot, as well as at boot. This patch
set fixes a couple of problems.

* Nodes known to powerpc to be memoryless at boot, but to have CPUs in
  them are allowed to be 'possible' and 'online'. Memory allocations
  for those nodes are taken from another node that does have memory
  until and if memory is hot-added to the node.

* Nodes which have no resources assigned at boot, but which may still
  be referenced subsequently by affinity or associativity attributes,
  are kept in the list of 'possible' nodes for powerpc. Hot-add of
  memory or CPUs to the system can reference these nodes and bring them
  online instead of redirecting to one of the set of nodes that were
  known to have memory at boot.

This patch extracts the value of the lowest domain level (number of
allocable resources) from the device tree property
"ibm,max-associativity-domains" to use as the maximum number of nodes
to setup as possibly available in the system. This new setting will
override the instruction:

nodes_and(node_possible_map, node_possible_map, node_online_map);

presently seen in the function arch/powerpc/mm/numa.c:initmem_init().

If the "ibm,max-associativity-domains" property is not present at
boot, no operation will be performed to define or enable additional
nodes, or enable the above 'nodes_and()'.

Signed-off-by: Michael Bringmann 
Reviewed-by: Nathan Fontenot 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/numa.c | 37 ++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index a51c188b81f3..18ea1e49a323 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -904,6 +904,34 @@ static void __init setup_node_data(int nid, u64 start_pfn, 
u64 end_pfn)
NODE_DATA(nid)->node_spanned_pages = spanned_pages;
 }
 
+static void __init find_possible_nodes(void)
+{
+   struct device_node *rtas;
+   u32 numnodes, i;
+
+   if (min_common_depth <= 0)
+   return;
+
+   rtas = of_find_node_by_path("/rtas");
+   if (!rtas)
+   return;
+
+   if (of_property_read_u32_index(rtas,
+   "ibm,max-associativity-domains",
+   min_common_depth, &numnodes))
+   goto out;
+
+   for (i = 0; i < numnodes; i++) {
+   if (!node_possible(i)) {
+   setup_node_data(i, 0, 0);
+   node_set(i, node_possible_map);
+   }
+   }
+
+out:
+   of_node_put(rtas);
+}
+
 void __init initmem_init(void)
 {
int nid, cpu;
@@ -917,12 +945,15 @@ void __init initmem_init(void)
memblock_dump_all();
 
/*
-* Reduce the possible NUMA nodes to the online NUMA nodes,
-* since we do not support node hotplug. This ensures that  we
-* lower the maximum NUMA node ID to what is actually present.
+* Modify the set of possible NUMA nodes to reflect information
+* available about the set of online nodes, and the set of nodes
+* that we expect to make use of for this platform's affinity
+* calculations.
 */

[PATCH AUTOSEL for 4.9 256/293] ocfs2: return error when we attempt to access a dirty bh in jbd2

2018-04-08 Thread Sasha Levin
From: piaojun 

[ Upstream commit d984187e3a1ad7d12447a7ab2c43ce3717a2b5b3 ]

We should not reuse the dirty bh in jbd2 directly due to the following
situation:

1. When removing extent rec, we will dirty the bhs of extent rec and
   truncate log at the same time, and hand them over to jbd2.

2. The bhs are submitted to jbd2 area successfully.

3. The device's write-back thread helps flush the bhs to disk but
   encounters a write error due to an abnormal storage link.

4. After a while the storage link becomes normal again. The truncate log
   flush worker, triggered by the next space reclaim, finds the dirty bh
   of the truncate log, clears its 'BH_Write_EIO' and then sets it
   uptodate in __ocfs2_journal_access():

   ocfs2_truncate_log_worker
 ocfs2_flush_truncate_log
   __ocfs2_flush_truncate_log
 ocfs2_replay_truncate_records
   ocfs2_journal_access_di
 __ocfs2_journal_access // here we clear io_error and set 'tl_bh' 
uptodata.

5. Then jbd2 will flush the bh of truncate log to disk, but the bh of
   extent rec is still in error state, and unfortunately nobody will
   take care of it.

6. In the end the space of the extent rec is not reduced, but the
   truncate log flush worker has given it back to globalalloc. That
   causes a duplicate cluster problem which can be identified by
   fsck.ocfs2.

Sadly we can hardly revert this, so set the fs read-only instead, to
avoid ruining the atomicity and consistency of space reclaim.

Link: http://lkml.kernel.org/r/5a6e8092.8090...@huawei.com
Fixes: acf8fdbe6afb ("ocfs2: do not BUG if buffer not uptodate in 
__ocfs2_journal_access")
Signed-off-by: Jun Piao 
Reviewed-by: Yiwen Jiang 
Reviewed-by: Changwei Ge 
Cc: Mark Fasheh 
Cc: Joel Becker 
Cc: Junxiao Bi 
Cc: Joseph Qi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 fs/ocfs2/journal.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index a244f14c6b87..fa947d36ae1d 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -666,23 +666,24 @@ static int __ocfs2_journal_access(handle_t *handle,
/* we can safely remove this assertion after testing. */
if (!buffer_uptodate(bh)) {
mlog(ML_ERROR, "giving me a buffer that's not uptodate!\n");
-   mlog(ML_ERROR, "b_blocknr=%llu\n",
-(unsigned long long)bh->b_blocknr);
+   mlog(ML_ERROR, "b_blocknr=%llu, b_state=0x%lx\n",
+(unsigned long long)bh->b_blocknr, bh->b_state);
 
lock_buffer(bh);
/*
-* A previous attempt to write this buffer head failed.
-* Nothing we can do but to retry the write and hope for
-* the best.
+* A previous transaction with a couple of buffer heads fail
+* to checkpoint, so all the bhs are marked as BH_Write_EIO.
+* For current transaction, the bh is just among those error
+* bhs which previous transaction handle. We can't just clear
+* its BH_Write_EIO and reuse directly, since other bhs are
+* not written to disk yet and that will cause metadata
+* inconsistency. So we should set fs read-only to avoid
+* further damage.
 */
if (buffer_write_io_error(bh) && !buffer_uptodate(bh)) {
-   clear_buffer_write_io_error(bh);
-   set_buffer_uptodate(bh);
-   }
-
-   if (!buffer_uptodate(bh)) {
unlock_buffer(bh);
-   return -EIO;
+   return ocfs2_error(osb->sb, "A previous attempt to "
+   "write this buffer head failed\n");
}
unlock_buffer(bh);
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 260/293] sparc64: update pmdp_invalidate() to return old pmd value

2018-04-08 Thread Sasha Levin
From: Nitin Gupta 

[ Upstream commit a8e654f01cb725d0bfd741ebca1bf4c9337969cc ]

It's required to avoid losing dirty and accessed bits.

[a...@linux-foundation.org: add a `do' to the do-while loop]
Link: 
http://lkml.kernel.org/r/20171213105756.69879-9-kirill.shute...@linux.intel.com
Signed-off-by: Nitin Gupta 
Signed-off-by: Kirill A. Shutemov 
Cc: David Miller 
Cc: Vlastimil Babka 
Cc: Andrea Arcangeli 
Cc: Michal Hocko 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 arch/sparc/include/asm/pgtable_64.h |  2 +-
 arch/sparc/mm/tlb.c | 23 ++-
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index b6802b978140..81ad06a1672f 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -952,7 +952,7 @@ void update_mmu_cache_pmd(struct vm_area_struct *vma, 
unsigned long addr,
  pmd_t *pmd);
 
 #define __HAVE_ARCH_PMDP_INVALIDATE
-extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
+extern pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp);
 
 #define __HAVE_ARCH_PGTABLE_DEPOSIT
diff --git a/arch/sparc/mm/tlb.c b/arch/sparc/mm/tlb.c
index c56a195c9071..b2722ed31053 100644
--- a/arch/sparc/mm/tlb.c
+++ b/arch/sparc/mm/tlb.c
@@ -219,17 +219,28 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
}
 }
 
+static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
+   unsigned long address, pmd_t *pmdp, pmd_t pmd)
+{
+   pmd_t old;
+
+   do {
+   old = *pmdp;
+   } while (cmpxchg64(&pmdp->pmd, old.pmd, pmd.pmd) != old.pmd);
+
+   return old;
+}
+
 /*
  * This routine is only called when splitting a THP
  */
-void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
+pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 pmd_t *pmdp)
 {
-   pmd_t entry = *pmdp;
-
-   pmd_val(entry) &= ~_PAGE_VALID;
+   pmd_t old, entry;
 
-   set_pmd_at(vma->vm_mm, address, pmdp, entry);
+   entry = __pmd(pmd_val(*pmdp) & ~_PAGE_VALID);
+   old = pmdp_establish(vma, address, pmdp, entry);
flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 
/*
@@ -240,6 +251,8 @@ void pmdp_invalidate(struct vm_area_struct *vma, unsigned 
long address,
if ((pmd_val(entry) & _PAGE_PMD_HUGE) &&
!is_huge_zero_page(pmd_page(entry)))
(vma->vm_mm)->context.thp_pte_count--;
+
+   return old;
 }
 
 void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
-- 
2.15.1


[PATCH AUTOSEL for 4.9 249/293] RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure

2018-04-08 Thread Sasha Levin
From: Leon Romanovsky 

[ Upstream commit b081808a66345ba725b77ecd8d759bee874cd937 ]

Failure in the XRCD FW deallocation command leaves memory leaked and
returns an error to the user, who can't do anything about it.

This patch changes behavior to always free memory and always return
success to the user.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Reviewed-by: Majd Dibbiny 
Signed-off-by: Leon Romanovsky 
Reviewed-by: Yuval Shaia 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Sasha Levin 
---
 drivers/infiniband/hw/mlx5/qp.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 403df3591d29..80ced0372e22 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4605,13 +4605,10 @@ int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd)
int err;
 
err = mlx5_core_xrcd_dealloc(dev->mdev, xrcdn);
-   if (err) {
+   if (err)
mlx5_ib_warn(dev, "failed to dealloc xrcdn 0x%x\n", xrcdn);
-   return err;
-   }
 
kfree(xrcd);
-
return 0;
 }
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 246/293] jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path

2018-04-08 Thread Sasha Levin
From: Jake Daryll Obina 

[ Upstream commit 5bdd0c6f89fba430e18d636493398389dadc3b17 ]

If jffs2_iget() fails for a newly-allocated inode, jffs2_do_clear_inode()
can get called twice in the error handling path, the first call in
jffs2_iget() itself and the second through iget_failed(). This can result
in a use-after-free error in the second jffs2_do_clear_inode() call, such
as shown by the oops below wherein the second jffs2_do_clear_inode() call
was trying to free node fragments that were already freed in the first
jffs2_do_clear_inode() call.

[   78.178860] jffs2: error: (1904) jffs2_do_read_inode_internal: CRC failed 
for read_inode of inode 24 at physical location 0x1fc00c
[   78.178914] Unable to handle kernel paging request at virtual address 
6b6b6b6b6b6b6b7b
[   78.185871] pgd = ffc03a567000
[   78.188794] [6b6b6b6b6b6b6b7b] *pgd=, *pud=
[   78.194968] Internal error: Oops: 9604 [#1] PREEMPT SMP
...
[   78.513147] PC is at rb_first_postorder+0xc/0x28
[   78.516503] LR is at jffs2_kill_fragtree+0x28/0x90 [jffs2]
[   78.520672] pc : [] lr : [] pstate: 
6105
[   78.526757] sp : ff800cea38f0
[   78.528753] x29: ff800cea38f0 x28: ffc01f3f8e80
[   78.532754] x27:  x26: ff800cea3c70
[   78.536756] x25: dc67c8ae x24: ffc033d6945d
[   78.540759] x23: ffc036811740 x22: ff800891a5b8
[   78.544760] x21:  x20: 
[   78.548762] x19: ffc037d48910 x18: ff800891a588
[   78.552764] x17: 0800 x16: 0c00
[   78.556766] x15: 0010 x14: 6f2065646f6e695f
[   78.560767] x13: 6461657220726f66 x12: 2064656c69616620
[   78.564769] x11: 435243203a6c616e x10: 7265746e695f6564
[   78.568771] x9 : 6f6e695f64616572 x8 : ffc037974038
[   78.572774] x7 :  x6 : 0008
[   78.576775] x5 : 002f91d85bd44a2f x4 : 
[   78.580777] x3 :  x2 : 00403755e000
[   78.584779] x1 : 6b6b6b6b6b6b6b6b x0 : 6b6b6b6b6b6b6b6b
...
[   79.038551] [] rb_first_postorder+0xc/0x28
[   79.042962] [] jffs2_do_clear_inode+0x88/0x100 [jffs2]
[   79.048395] [] jffs2_evict_inode+0x3c/0x48 [jffs2]
[   79.053443] [] evict+0xb0/0x168
[   79.056835] [] iput+0x1c0/0x200
[   79.060228] [] iget_failed+0x30/0x3c
[   79.064097] [] jffs2_iget+0x2d8/0x360 [jffs2]
[   79.068740] [] jffs2_lookup+0xe8/0x130 [jffs2]
[   79.073434] [] lookup_slow+0x118/0x190
[   79.077435] [] walk_component+0xfc/0x28c
[   79.081610] [] path_lookupat+0x84/0x108
[   79.085699] [] filename_lookup+0x88/0x100
[   79.089960] [] user_path_at_empty+0x58/0x6c
[   79.094396] [] vfs_statx+0xa4/0x114
[   79.098138] [] SyS_newfstatat+0x58/0x98
[   79.102227] [] __sys_trace_return+0x0/0x4
[   79.106489] Code: d65f03c0 f941 b4e1 aa0103e0 (f9400821)

The jffs2_do_clear_inode() call in jffs2_iget() is unnecessary since
iget_failed() will eventually call jffs2_do_clear_inode() if needed, so
just remove it.

Fixes: 5451f79f5f81 ("iget: stop JFFS2 from using iget() and read_inode()")
Reviewed-by: Richard Weinberger 
Signed-off-by: Jake Daryll Obina 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 fs/jffs2/fs.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c
index 567653f7c0ce..c9c47d03a690 100644
--- a/fs/jffs2/fs.c
+++ b/fs/jffs2/fs.c
@@ -361,7 +361,6 @@ error_io:
ret = -EIO;
 error:
mutex_unlock(&f->sem);
-   jffs2_do_clear_inode(c, f);
iget_failed(inode);
return ERR_PTR(ret);
 }
-- 
2.15.1


[PATCH AUTOSEL for 4.9 255/293] ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute

2018-04-08 Thread Sasha Levin
From: piaojun 

[ Upstream commit 16c8d569f5704a84164f30ff01b29879f3438065 ]

The race between *set_acl and *get_acl will cause getting incomplete
xattr data as below:

  processA                                 processB

  ocfs2_set_acl
    ocfs2_xattr_set
      __ocfs2_xattr_set_handle

                                           ocfs2_get_acl_nolock
                                             ocfs2_xattr_get_nolock:

processB may get incomplete xattr data if processA hasn't finished set_acl yet.

So we should use 'ip_xattr_sem' to protect getting extended attribute in
ocfs2_get_acl_nolock(), as other processes could be changing it
concurrently.
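
A minimal sketch of the intended rwsem pairing; only the read side below is
added by this patch, and the write side is assumed to be taken in
ocfs2_xattr_set() around __ocfs2_xattr_set_handle():

    /* reader side (this patch): ocfs2_iop_get_acl() / ocfs2_acl_chmod() */
    down_read(&OCFS2_I(inode)->ip_xattr_sem);
    acl = ocfs2_get_acl_nolock(inode, type, di_bh);
    up_read(&OCFS2_I(inode)->ip_xattr_sem);

    /* writer side (assumed, in ocfs2_xattr_set()): */
    down_write(&OCFS2_I(inode)->ip_xattr_sem);
    /* ... __ocfs2_xattr_set_handle() rewrites the xattr buffers ... */
    up_write(&OCFS2_I(inode)->ip_xattr_sem);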

Link: http://lkml.kernel.org/r/5a5ddcff.7030...@huawei.com
Signed-off-by: Jun Piao 
Reviewed-by: Alex Chen 
Cc: Mark Fasheh 
Cc: Joel Becker 
Cc: Junxiao Bi 
Cc: Joseph Qi 
Cc: Changwei Ge 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 fs/ocfs2/acl.c   | 6 ++
 fs/ocfs2/xattr.c | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
index bed1fcb63088..ee8dbbae78b6 100644
--- a/fs/ocfs2/acl.c
+++ b/fs/ocfs2/acl.c
@@ -314,7 +314,9 @@ struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, 
int type)
return ERR_PTR(ret);
}
 
+   down_read(&OCFS2_I(inode)->ip_xattr_sem);
acl = ocfs2_get_acl_nolock(inode, type, di_bh);
+   up_read(&OCFS2_I(inode)->ip_xattr_sem);
 
ocfs2_inode_unlock(inode, 0);
brelse(di_bh);
@@ -333,7 +335,9 @@ int ocfs2_acl_chmod(struct inode *inode, struct buffer_head 
*bh)
if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
return 0;
 
+   down_read(&OCFS2_I(inode)->ip_xattr_sem);
acl = ocfs2_get_acl_nolock(inode, ACL_TYPE_ACCESS, bh);
+   up_read(&OCFS2_I(inode)->ip_xattr_sem);
if (IS_ERR(acl) || !acl)
return PTR_ERR(acl);
ret = __posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode);
@@ -364,8 +368,10 @@ int ocfs2_init_acl(handle_t *handle,
 
if (!S_ISLNK(inode->i_mode)) {
if (osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL) {
+   down_read(&OCFS2_I(dir)->ip_xattr_sem);
acl = ocfs2_get_acl_nolock(dir, ACL_TYPE_DEFAULT,
   dir_bh);
+   up_read(&OCFS2_I(dir)->ip_xattr_sem);
if (IS_ERR(acl))
return PTR_ERR(acl);
}
diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 994e3bfaca7a..01932763b4d1 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -638,9 +638,11 @@ int ocfs2_calc_xattr_init(struct inode *dir,
 si->value_len);
 
if (osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL) {
+   down_read(&OCFS2_I(dir)->ip_xattr_sem);
acl_len = ocfs2_xattr_get_nolock(dir, dir_bh,
OCFS2_XATTR_INDEX_POSIX_ACL_DEFAULT,
"", NULL, 0);
+   up_read(&OCFS2_I(dir)->ip_xattr_sem);
if (acl_len > 0) {
a_size = ocfs2_xattr_entry_real_size(0, acl_len);
if (S_ISDIR(mode))
-- 
2.15.1


[PATCH AUTOSEL for 4.9 254/293] ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid

2018-04-08 Thread Sasha Levin
From: piaojun 

[ Upstream commit 025bcbde3634b2c9b316f227fed13ad6ad6817fb ]

If metadata is corrupted, such as an 'invalid inode block', the mount()
call will fail and the filesystem will then be set read-only as below:

  ocfs2_mount
ocfs2_initialize_super
  ocfs2_init_global_system_inodes
ocfs2_iget
  ocfs2_read_locked_inode
ocfs2_validate_inode_block
  ocfs2_error
ocfs2_handle_error
  ocfs2_set_ro_flag(osb, 0);  // set readonly

In this situation we need to return -EROFS to 'mount.ocfs2', so that the
user can fix it with fsck and then mount again.  In addition, 'mount.ocfs2'
should be updated correspondingly as it currently returns 1 for all errnos;
I will post a patch for 'mount.ocfs2' too.

Link: http://lkml.kernel.org/r/5a4302fa.2010...@huawei.com
Signed-off-by: Jun Piao 
Reviewed-by: Alex Chen 
Reviewed-by: Joseph Qi 
Reviewed-by: Changwei Ge 
Reviewed-by: Gang He 
Cc: Mark Fasheh 
Cc: Joel Becker 
Cc: Junxiao Bi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Sasha Levin 
---
 fs/ocfs2/super.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index f56fe39fab04..64dfbe5755da 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -473,9 +473,8 @@ static int ocfs2_init_global_system_inodes(struct 
ocfs2_super *osb)
new = ocfs2_get_system_file_inode(osb, i, osb->slot_num);
if (!new) {
ocfs2_release_system_inodes(osb);
-   status = -EINVAL;
+   status = ocfs2_is_soft_readonly(osb) ? -EROFS : -EINVAL;
mlog_errno(status);
-   /* FIXME: Should ERROR_RO_FS */
mlog(ML_ERROR, "Unable to load system inode %d, "
 "possibly corrupt fs?", i);
goto bail;
@@ -504,7 +503,7 @@ static int ocfs2_init_local_system_inodes(struct 
ocfs2_super *osb)
new = ocfs2_get_system_file_inode(osb, i, osb->slot_num);
if (!new) {
ocfs2_release_system_inodes(osb);
-   status = -EINVAL;
+   status = ocfs2_is_soft_readonly(osb) ? -EROFS : -EINVAL;
mlog(ML_ERROR, "status=%d, sysfile=%d, slot=%d\n",
 status, i, osb->slot_num);
goto bail;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 253/293] kvm: Map PFN-type memory regions as writable (if possible)

2018-04-08 Thread Sasha Levin
From: KarimAllah Ahmed 

[ Upstream commit a340b3e229b24a56f1c7f5826b15a3af0f4b13e5 ]

For EPT-violations that are triggered by a read, the pages are also mapped with
write permissions (if their memory region is also writable). That would avoid
getting yet another fault on the same page when a write occurs.

This optimization only happens when you have a "struct page" backing the memory
region. So also enable it for memory regions that do not have a "struct page".

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: k...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Radim Krčmář 
Signed-off-by: Sasha Levin 
---
 virt/kvm/kvm_main.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index eaae7252f60c..4f2a2df85b1f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1466,7 +1466,8 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool 
write_fault)
 
 static int hva_to_pfn_remapped(struct vm_area_struct *vma,
   unsigned long addr, bool *async,
-  bool write_fault, kvm_pfn_t *p_pfn)
+  bool write_fault, bool *writable,
+  kvm_pfn_t *p_pfn)
 {
unsigned long pfn;
int r;
@@ -1492,6 +1493,8 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 
}
 
+   if (writable)
+   *writable = true;
 
/*
 * Get a reference here because callers of *hva_to_pfn* and
@@ -1557,7 +1560,7 @@ retry:
if (vma == NULL)
pfn = KVM_PFN_ERR_FAULT;
else if (vma->vm_flags & (VM_IO | VM_PFNMAP)) {
-   r = hva_to_pfn_remapped(vma, addr, async, write_fault, &pfn);
+   r = hva_to_pfn_remapped(vma, addr, async, write_fault, 
writable, &pfn);
if (r == -EAGAIN)
goto retry;
if (r < 0)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 251/293] gianfar: prevent integer wrapping in the rx handler

2018-04-08 Thread Sasha Levin
From: Andy Spencer 

[ Upstream commit 202a0a70e445caee1d0ec7aae814e64b1189fa4d ]

When the frame check sequence (FCS) is split across the last two frames
of a fragmented packet, part of the FCS gets counted twice, once when
subtracting the FCS, and again when subtracting the previously received
data.

For example, if 1602 bytes are received, and the first fragment contains
the first 1600 bytes (including the first two bytes of the FCS), and the
second fragment contains the last two bytes of the FCS:

  'skb->len == 1600' from the first fragment

  size  = lstatus & BD_LENGTH_MASK; # 1602
  size -= ETH_FCS_LEN;  # 1598
  size -= skb->len; # -2

Since the size is unsigned, it wraps around and causes a BUG later in
the packet handling, as shown below:

  kernel BUG at ./include/linux/skbuff.h:2068!
  Oops: Exception in kernel mode, sig: 5 [#1]
  ...
  NIP [c021ec60] skb_pull+0x24/0x44
  LR [c01e2fbc] gfar_clean_rx_ring+0x498/0x690
  Call Trace:
  [df7edeb0] [c01e2c1c] gfar_clean_rx_ring+0xf8/0x690 (unreliable)
  [df7edf20] [c01e33a8] gfar_poll_rx_sq+0x3c/0x9c
  [df7edf40] [c023352c] net_rx_action+0x21c/0x274
  [df7edf90] [c0329000] __do_softirq+0xd8/0x240
  [df7edff0] [c000c108] call_do_irq+0x24/0x3c
  [c0597e90] [c00041dc] do_IRQ+0x64/0xc4
  [c0597eb0] [c000d920] ret_from_except+0x0/0x18
  --- interrupt: 501 at arch_cpu_idle+0x24/0x5c

Change the size to a signed integer and then trim off any part of the
FCS that was received prior to the last fragment.
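
A tiny user-space illustration of the wrap-around (a sketch only; the
constants mirror the 1602-byte example above rather than real driver state):

    #include <stdio.h>

    int main(void)
    {
            unsigned int usize = 1602;  /* lstatus & BD_LENGTH_MASK          */
            int ssize = 1602;           /* same value, but signed (the fix)  */

            usize -= 4;                 /* ETH_FCS_LEN                       */
            usize -= 1600;              /* skb->len from the first fragment  */
            ssize -= 4;
            ssize -= 1600;

            printf("unsigned: %u\n", usize); /* 4294967294, later BUGs in skb_pull() */
            printf("signed:   %d\n", ssize); /* -2: driver can pskb_trim() the stray
                                              * FCS bytes instead                    */
            return 0;
    }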

Fixes: 6c389fc931bc ("gianfar: fix size of scatter-gathered frames")
Signed-off-by: Andy Spencer 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/freescale/gianfar.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c 
b/drivers/net/ethernet/freescale/gianfar.c
index e3b41ba95168..fa877f1e7f6f 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2935,7 +2935,7 @@ static irqreturn_t gfar_transmit(int irq, void *grp_id)
 static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, u32 lstatus,
 struct sk_buff *skb, bool first)
 {
-   unsigned int size = lstatus & BD_LENGTH_MASK;
+   int size = lstatus & BD_LENGTH_MASK;
struct page *page = rxb->page;
bool last = !!(lstatus & BD_LFLAG(RXBD_LAST));
 
@@ -2950,11 +2950,16 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, 
u32 lstatus,
if (last)
size -= skb->len;
 
-   /* in case the last fragment consisted only of the FCS */
+   /* Add the last fragment if it contains something other than
+* the FCS, otherwise drop it and trim off any part of the FCS
+* that was already received.
+*/
if (size > 0)
skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
rxb->page_offset + RXBUF_ALIGNMENT,
size, GFAR_RXB_TRUESIZE);
+   else if (size < 0)
+   pskb_trim(skb, skb->len + size);
}
 
/* try reuse page */
-- 
2.15.1


[PATCH AUTOSEL for 4.9 252/293] tcp_nv: fix potential integer overflow in tcpnv_acked

2018-04-08 Thread Sasha Levin
From: "Gustavo A. R. Silva" 

[ Upstream commit e4823fbd229bfbba368b40cdadb8f4eeb20604cc ]

Add suffix ULL to constant 8 in order to avoid a potential integer
overflow and give the compiler complete information about the proper
arithmetic to use. Notice that this constant is used in a context that
expects an expression of type u64.

The current cast to u64 effectively applies to the whole expression
as an argument of type u64 to be passed to div64_u64, but it does
not prevent it from being evaluated using 32-bit arithmetic instead
of 64-bit arithmetic.

Also, once the expression is properly evaluated using 64-bit arithmetic,
there is no need for the parentheses and the external cast to u64.
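
A small user-space sketch of the difference; the oversized mss value is
purely illustrative, not a claim about realistic tp->mss_cache contents:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            uint32_t mss = 0x30000000;            /* big enough that 8 * mss
                                                   * no longer fits in 32 bits */

            uint64_t wrong = (uint64_t)(8 * mss); /* product wraps, then cast  */
            uint64_t right = 8ULL * mss;          /* promoted to 64 bits       */

            printf("wrong: %llu\n", (unsigned long long)wrong); /* 2147483648 */
            printf("right: %llu\n", (unsigned long long)right); /* 6442450944 */
            return 0;
    }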

Addresses-Coverity-ID: 1357588 ("Unintentional integer overflow")
Signed-off-by: Gustavo A. R. Silva 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/ipv4/tcp_nv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_nv.c b/net/ipv4/tcp_nv.c
index e45e2c41c7bd..37a3cb999859 100644
--- a/net/ipv4/tcp_nv.c
+++ b/net/ipv4/tcp_nv.c
@@ -338,7 +338,7 @@ static void tcpnv_acked(struct sock *sk, const struct 
ack_sample *sample)
 */
cwnd_by_slope = (u32)
div64_u64(((u64)ca->nv_rtt_max_rate) * ca->nv_min_rtt,
- (u64)(8 * tp->mss_cache));
+ 8ULL * tp->mss_cache);
max_win = cwnd_by_slope + nv_pad;
 
/* If cwnd > max_win, decrease cwnd
-- 
2.15.1


[PATCH AUTOSEL for 4.9 243/293] HID: roccat: prevent an out of bounds read in kovaplus_profile_activated()

2018-04-08 Thread Sasha Levin
From: Dan Carpenter 

[ Upstream commit 7ad81482cad67cbe1ec808490d1ddfc420c42008 ]

We get the "new_profile_index" value from the mouse device when we're
handling raw events.  Smatch taints it as untrusted data and complains
that we need a bounds check.  This seems like a reasonable warning
otherwise there is a small read beyond the end of the array.

Fixes: 0e70f97f257e ("HID: roccat: Add support for Kova[+] mouse")
Signed-off-by: Dan Carpenter 
Acked-by: Silvan Jegen 
Signed-off-by: Jiri Kosina 
Signed-off-by: Sasha Levin 
---
 drivers/hid/hid-roccat-kovaplus.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/hid/hid-roccat-kovaplus.c 
b/drivers/hid/hid-roccat-kovaplus.c
index 43617fb28b87..317c9c2c0a7c 100644
--- a/drivers/hid/hid-roccat-kovaplus.c
+++ b/drivers/hid/hid-roccat-kovaplus.c
@@ -37,6 +37,8 @@ static uint kovaplus_convert_event_cpi(uint value)
 static void kovaplus_profile_activated(struct kovaplus_device *kovaplus,
uint new_profile_index)
 {
+   if (new_profile_index >= ARRAY_SIZE(kovaplus->profile_settings))
+   return;
kovaplus->actual_profile = new_profile_index;
kovaplus->actual_cpi = 
kovaplus->profile_settings[new_profile_index].cpi_startup_level;
kovaplus->actual_x_sensitivity = 
kovaplus->profile_settings[new_profile_index].sensitivity_x;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 248/293] powerpc/numa: Ensure nodes initialized for hotplug

2018-04-08 Thread Sasha Levin
From: Michael Bringmann 

[ Upstream commit ea05ba7c559c8e5a5946c3a94a2a266e9a6680a6 ]

This patch fixes some problems encountered at runtime with
configurations that support memory-less nodes, or that hot-add CPUs
into nodes that are memoryless during system execution after boot. The
problems of interest include:

* Nodes known to powerpc to be memoryless at boot, but to have CPUs in
  them are allowed to be 'possible' and 'online'. Memory allocations
  for those nodes are taken from another node that does have memory
  until and if memory is hot-added to the node.

* Nodes which have no resources assigned at boot, but which may still
  be referenced subsequently by affinity or associativity attributes,
  are kept in the list of 'possible' nodes for powerpc. Hot-add of
  memory or CPUs to the system can reference these nodes and bring
  them online instead of redirecting the references to one of the set
  of nodes known to have memory at boot.

Note that this software operates under the context of CPU hotplug. We
are not doing memory hotplug in this code, but rather updating the
kernel's CPU topology (i.e. arch_update_cpu_topology /
numa_update_cpu_topology). We are initializing a node that may be used
by CPUs or memory before it can be referenced as invalid by a CPU
hotplug operation. CPU hotplug operations are protected by a range of
APIs including cpu_maps_update_begin/cpu_maps_update_done,
cpus_read/write_lock / cpus_read/write_unlock, device locks, and more.
Memory hotplug operations, including try_online_node, are protected by
mem_hotplug_begin/mem_hotplug_done, device locks, and more. In the
case of CPUs being hot-added to a previously memoryless node, the
try_online_node operation occurs wholly within the CPU locks with no
overlap. Using HMC hot-add/hot-remove operations, we have been able to
add and remove CPUs to any possible node without failures. HMC
operations involve a degree of self-serialization, though.

Signed-off-by: Michael Bringmann 
Reviewed-by: Nathan Fontenot 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/numa.c | 47 +--
 1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 18ea1e49a323..6cff96e0d77b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -551,7 +551,7 @@ static int numa_setup_cpu(unsigned long lcpu)
nid = of_node_to_nid_single(cpu);
 
 out_present:
-   if (nid < 0 || !node_online(nid))
+   if (nid < 0 || !node_possible(nid))
nid = first_online_node;
 
map_cpu_to_node(lcpu, nid);
@@ -922,10 +922,8 @@ static void __init find_possible_nodes(void)
goto out;
 
for (i = 0; i < numnodes; i++) {
-   if (!node_possible(i)) {
-   setup_node_data(i, 0, 0);
+   if (!node_possible(i))
node_set(i, node_possible_map);
-   }
}
 
 out:
@@ -1305,6 +1303,40 @@ static long vphn_get_associativity(unsigned long cpu,
return rc;
 }
 
+static inline int find_and_online_cpu_nid(int cpu)
+{
+   __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
+   int new_nid;
+
+   /* Use associativity from first thread for all siblings */
+   vphn_get_associativity(cpu, associativity);
+   new_nid = associativity_to_nid(associativity);
+   if (new_nid < 0 || !node_possible(new_nid))
+   new_nid = first_online_node;
+
+   if (NODE_DATA(new_nid) == NULL) {
+#ifdef CONFIG_MEMORY_HOTPLUG
+   /*
+* Need to ensure that NODE_DATA is initialized for a node from
+* available memory (see memblock_alloc_try_nid). If unable to
+* init the node, then default to nearest node that has memory
+* installed.
+*/
+   if (try_online_node(new_nid))
+   new_nid = first_online_node;
+#else
+   /*
+* Default to using the nearest node that has memory installed.
+* Otherwise, it would be necessary to patch the kernel MM code
+* to deal with more memoryless-node error conditions.
+*/
+   new_nid = first_online_node;
+#endif
+   }
+
+   return new_nid;
+}
+
 /*
  * Update the CPU maps and sysfs entries for a single CPU when its NUMA
  * characteristics change. This function doesn't perform any locking and is
@@ -1370,7 +1402,6 @@ int arch_update_cpu_topology(void)
 {
unsigned int cpu, sibling, changed = 0;
struct topology_update_data *updates, *ud;
-   __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
cpumask_t updated_cpus;
struct device *dev;
int weight, new_nid, i = 0;
@@ -1405,11 +1436,7 @@ int 

[PATCH AUTOSEL for 4.9 244/293] fm10k: fix "failed to kill vid" message for VF

2018-04-08 Thread Sasha Levin
From: Ngai-Mint Kwan 

[ Upstream commit cf315ea596ec26d7aa542a9ce354990875a920c0 ]

When a VF is under PF VLAN assignment:

ip link set  vf <#> vlan 

This will remove all previous entries in the VLAN table including those
generated by VLAN interfaces created on the VF. The issue arises when
the VF is under PF VLAN assignment and one or more of these VLAN
interfaces of the VF are deleted. When deleting these VLAN interfaces,
the following message will be generated in "dmesg":

failed to kill vid 0081/ for device 

This is due to the fact that "ndo_vlan_rx_kill_vid" exits with an error.
The handler for this ndo is "fm10k_update_vid". Any calls to this
function while under PF VLAN management will exit prematurely and, thus,
it will generate the failure message.

Additionally, since "fm10k_update_vid" exits prematurely, none of the
VLAN update is performed. So, even though the actual VLAN interfaces of
the VF will be deleted, the active_vlans bitmask is not cleared. When
the VF is no longer under PF VLAN assignment, the driver mistakenly
restores the previous entries of the VLAN table based on an
unsynchronized list of active VLANs.

The solution to this issue involves checking the VLAN update action type
before exiting "fm10k_update_vid". If the VLAN update action type is to
"add", this action will not be permitted while the VF is under PF VLAN
assignment and the VLAN update is abandoned like before.

However, if the VLAN update action type is to "kill", then we need to
also clear the active_vlans bitmask. However, we don't need to actually
queue any messages to the PF, because the MAC and VLAN tables have
already been cleared, and the PF would silently ignore these requests
anyways.

Signed-off-by: Ngai-Mint Kwan 
Signed-off-by: Jacob Keller 
Tested-by: Krishneil Singh 
Signed-off-by: Jeff Kirsher 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c 
b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 05629381be6b..ea5ea653e1db 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -803,8 +803,12 @@ static int fm10k_update_vid(struct net_device *netdev, u16 
vid, bool set)
if (vid >= VLAN_N_VID)
return -EINVAL;
 
-   /* Verify we have permission to add VLANs */
-   if (hw->mac.vlan_override)
+   /* Verify that we have permission to add VLANs. If this is a request
+* to remove a VLAN, we still want to allow the user to remove the
+* VLAN device. In that case, we need to clear the bit in the
+* active_vlans bitmask.
+*/
+   if (set && hw->mac.vlan_override)
return -EACCES;
 
/* update active_vlans bitmask */
@@ -823,6 +827,12 @@ static int fm10k_update_vid(struct net_device *netdev, u16 
vid, bool set)
rx_ring->vid &= ~FM10K_VLAN_CLEAR;
}
 
+   /* If our VLAN has been overridden, there is no reason to send VLAN
+* removal requests as they will be silently ignored.
+*/
+   if (hw->mac.vlan_override)
+   return 0;
+
/* Do not remove default VLAN ID related entries from VLAN and MAC
 * tables
 */
-- 
2.15.1


[PATCH AUTOSEL for 4.9 242/293] scsi: fas216: fix sense buffer initialization

2018-04-08 Thread Sasha Levin
From: Arnd Bergmann 

[ Upstream commit 96d5eaa9bb74d299508d811d865c2c41b38b0301 ]

While testing with the ARM specific memset() macro removed, I ran into a
compiler warning that shows an old bug:

drivers/scsi/arm/fas216.c: In function 'fas216_rq_sns_done':
drivers/scsi/arm/fas216.c:2014:40: error: argument to 'sizeof' in 'memset' call 
is the same expression as the destination; did you mean to provide an explicit 
length? [-Werror=sizeof-pointer-memaccess]

It turns out that the definition of the scsi_cmnd structure changed back
in linux-2.6.25, so now we clear only four bytes (sizeof(pointer))
instead of 96 (SCSI_SENSE_BUFFERSIZE). I did not check whether we
actually need to initialize the buffer here, but it's clear that if we
do it, we should use the correct size.
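
A user-space sketch of why the length silently shrank; the two structs are
hypothetical stand-ins for the pre- and post-2.6.25 scsi_cmnd layouts:

    #include <stdio.h>

    struct cmd_old { unsigned char sense_buffer[96]; }; /* array member   */
    struct cmd_new { unsigned char *sense_buffer; };    /* pointer member */

    int main(void)
    {
            printf("array member:   %zu\n",
                   sizeof(((struct cmd_old *)0)->sense_buffer)); /* 96     */
            printf("pointer member: %zu\n",
                   sizeof(((struct cmd_new *)0)->sense_buffer)); /* 4 or 8 */
            return 0;
    }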

Fixes: de25deb18016 ("[SCSI] use dynamically allocated sense buffer")
Signed-off-by: Arnd Bergmann 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/arm/fas216.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/arm/fas216.c b/drivers/scsi/arm/fas216.c
index 24388795ee9a..936e8c735656 100644
--- a/drivers/scsi/arm/fas216.c
+++ b/drivers/scsi/arm/fas216.c
@@ -2011,7 +2011,7 @@ static void fas216_rq_sns_done(FAS216_Info *info, struct 
scsi_cmnd *SCpnt,
 * have valid data in the sense buffer that could
 * confuse the higher levels.
 */
-   memset(SCpnt->sense_buffer, 0, sizeof(SCpnt->sense_buffer));
+   memset(SCpnt->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);
 //printk("scsi%d.%c: sense buffer: ", info->host->host_no, '0' + 
SCpnt->device->id);
 //{ int i; for (i = 0; i < 32; i++) printk("%02x ", SCpnt->sense_buffer[i]); 
printk("\n"); }
/*
-- 
2.15.1


[PATCH AUTOSEL for 4.9 241/293] scsi: devinfo: fix format of the device list

2018-04-08 Thread Sasha Levin
From: Xose Vazquez Perez 

[ Upstream commit 3f884a0a8bdf28cfd1e9987d54d83350096cdd46 ]

Replace "" with NULL for product revision level, and merge TEXEL
duplicate entries.

Cc: Hannes Reinecke 
Cc: Martin K. Petersen 
Cc: James E.J. Bottomley 
Cc: SCSI ML 
Signed-off-by: Xose Vazquez Perez 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/scsi_devinfo.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c
index 43d4b30cbf65..498d2514cb59 100644
--- a/drivers/scsi/scsi_devinfo.c
+++ b/drivers/scsi/scsi_devinfo.c
@@ -108,8 +108,8 @@ static struct {
 * seagate controller, which causes SCSI code to reset bus.
 */
{"HP", "C1750A", "3226", BLIST_NOLUN},  /* scanjet iic */
-   {"HP", "C1790A", "", BLIST_NOLUN},  /* scanjet iip */
-   {"HP", "C2500A", "", BLIST_NOLUN},  /* scanjet iicx */
+   {"HP", "C1790A", NULL, BLIST_NOLUN},/* scanjet iip */
+   {"HP", "C2500A", NULL, BLIST_NOLUN},/* scanjet iicx */
{"MEDIAVIS", "CDR-H93MV", "1.31", BLIST_NOLUN}, /* locks up */
{"MICROTEK", "ScanMaker II", "5.61", BLIST_NOLUN},  /* responds to 
all lun */
{"MITSUMI", "CD-R CR-2201CS", "6119", BLIST_NOLUN}, /* locks up */
@@ -119,7 +119,7 @@ static struct {
{"QUANTUM", "FIREBALL ST4.3S", "0F0C", BLIST_NOLUN},/* locks up */
{"RELISYS", "Scorpio", NULL, BLIST_NOLUN},  /* responds to all lun 
*/
{"SANKYO", "CP525", "6.64", BLIST_NOLUN},   /* causes failed REQ 
SENSE, extra reset */
-   {"TEXEL", "CD-ROM", "1.06", BLIST_NOLUN},
+   {"TEXEL", "CD-ROM", "1.06", BLIST_NOLUN | BLIST_BORKEN},
{"transtec", "T5008", "0001", BLIST_NOREPORTLUN },
{"YAMAHA", "CDR100", "1.00", BLIST_NOLUN},  /* locks up */
{"YAMAHA", "CDR102", "1.00", BLIST_NOLUN},  /* locks up */
@@ -256,7 +256,6 @@ static struct {
{"ST650211", "CF", NULL, BLIST_RETRY_HWERROR},
{"SUN", "T300", "*", BLIST_SPARSELUN},
{"SUN", "T4", "*", BLIST_SPARSELUN},
-   {"TEXEL", "CD-ROM", "1.06", BLIST_BORKEN},
{"Tornado-", "F4", "*", BLIST_NOREPORTLUN},
{"TOSHIBA", "CDROM", NULL, BLIST_ISROM},
{"TOSHIBA", "CD-ROM", NULL, BLIST_ISROM},
-- 
2.15.1


[PATCH AUTOSEL for 4.9 233/293] kconfig: Fix automatic menu creation mem leak

2018-04-08 Thread Sasha Levin
From: Ulf Magnusson 

[ Upstream commit ae7440ef0c8013d68c00dad6900e7cce5311bb1c ]

expr_trans_compare() always allocates and returns a new expression,
giving the following leak outline:

...
*Allocate*
basedep = expr_trans_compare(basedep, E_UNEQUAL, &symbol_no);
...
for (menu = parent->next; menu; menu = menu->next) {
...
*Copy*
dep2 = expr_copy(basedep);
...
*Free copy*
expr_free(dep2);
}
*basedep lost!*

Fix by freeing 'basedep' after the loop.
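
The corrected outline then looks roughly like this (same notation as above):

    basedep = expr_trans_compare(basedep, E_UNEQUAL, &symbol_no);
    ...
    for (menu = parent->next; menu; menu = menu->next) {
            ...
            dep2 = expr_copy(basedep);   /* every iteration works on a copy */
            ...
            expr_free(dep2);             /* the copy is released            */
    }
    expr_free(basedep);                  /* the original, exactly once      */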

Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix:

LEAK SUMMARY:
   definitely lost: 344,376 bytes in 14,349 blocks
   ...

Summary after the fix:

LEAK SUMMARY:
   definitely lost: 44,448 bytes in 1,852 blocks
   ...

Signed-off-by: Ulf Magnusson 
Signed-off-by: Masahiro Yamada 
Signed-off-by: Sasha Levin 
---
 scripts/kconfig/menu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/kconfig/menu.c b/scripts/kconfig/menu.c
index aed678e8a777..4a61636158dd 100644
--- a/scripts/kconfig/menu.c
+++ b/scripts/kconfig/menu.c
@@ -364,6 +364,7 @@ void menu_finalize(struct menu *parent)
menu->parent = parent;
last_menu = menu;
}
+   expr_free(basedep);
if (last_menu) {
parent->list = parent->next;
parent->next = last_menu->next;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 230/293] clk: ingenic: Fix recalc_rate for clocks with fixed divider

2018-04-08 Thread Sasha Levin
From: Paul Cercueil 

[ Upstream commit e6cfa64375d34a6c8c1861868a381013b2d3b921 ]

Previously, the clocks with a fixed divider would report their rate
as being the same as the one of their parent, independently of the
divider in use. This commit fixes this behaviour.

This went unnoticed as neither the jz4740 nor the jz4780 CGU code
have clocks with fixed dividers yet.
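
A worked example with made-up numbers, for a clock declared with
CGU_CLK_FIXDIV and a fixed divider of 4 fed from a 48 MHz parent:

    /* before: recalc_rate() ignored the divider  ->  48000000 Hz (parent rate)
     * after:  rate = parent_rate / 4             ->  12000000 Hz
     */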

Signed-off-by: Paul Cercueil 
Acked-by: Stephen Boyd 
Cc: Ralf Baechle 
Cc: Maarten ter Huurne 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18477/
Signed-off-by: James Hogan 
Signed-off-by: Sasha Levin 
---
 drivers/clk/ingenic/cgu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/clk/ingenic/cgu.c b/drivers/clk/ingenic/cgu.c
index e8248f9185f7..eb9002ccf3fc 100644
--- a/drivers/clk/ingenic/cgu.c
+++ b/drivers/clk/ingenic/cgu.c
@@ -328,6 +328,8 @@ ingenic_clk_recalc_rate(struct clk_hw *hw, unsigned long 
parent_rate)
div *= clk_info->div.div;
 
rate /= div;
+   } else if (clk_info->type & CGU_CLK_FIXDIV) {
+   rate /= clk_info->fixdiv.div;
}
 
return rate;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 231/293] watchdog: sp5100_tco: Fix watchdog disable bit

2018-04-08 Thread Sasha Levin
From: Guenter Roeck 

[ Upstream commit f541c09ebfc61697b586b38c9ebaf4b70defb278 ]

According to all published information, the watchdog disable bit for SB800
compatible controllers is bit 1 of PM register 0x48, not bit 2. For the
most part that doesn't matter in practice, since the bit has to be cleared
to enable watchdog address decoding, which is the default setting, but it
still needs to be fixed.

Cc: Zoltán Böszörményi 
Signed-off-by: Guenter Roeck 
Signed-off-by: Wim Van Sebroeck 
Signed-off-by: Sasha Levin 
---
 drivers/watchdog/sp5100_tco.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/watchdog/sp5100_tco.h b/drivers/watchdog/sp5100_tco.h
index 2b28c00da0df..dfe20b81ced5 100644
--- a/drivers/watchdog/sp5100_tco.h
+++ b/drivers/watchdog/sp5100_tco.h
@@ -54,7 +54,7 @@
 #define SB800_PM_WATCHDOG_CONFIG   0x4C
 
 #define SB800_PCI_WATCHDOG_DECODE_EN   (1 << 0)
-#define SB800_PM_WATCHDOG_DISABLE  (1 << 2)
+#define SB800_PM_WATCHDOG_DISABLE  (1 << 1)
 #define SB800_PM_WATCHDOG_SECOND_RES   (3 << 0)
 #define SB800_ACPI_MMIO_DECODE_EN  (1 << 0)
 #define SB800_ACPI_MMIO_SEL(1 << 1)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 239/293] Btrfs: fix scrub to repair raid6 corruption

2018-04-08 Thread Sasha Levin
From: Liu Bo 

[ Upstream commit 762221f095e3932669093466aaf4b85ed9ad2ac1 ]

The raid6 corruption case is this:
suppose that all disks can be read without problems but the content
that was read out doesn't match its checksum; currently, for raid6,
btrfs retries at most twice:

- the 1st retry is to rebuild with all other stripes, it'll eventually
  be a raid5 xor rebuild,
- if the 1st fails, the 2nd retry will deliberately fail parity p so
  that it will do raid6 style rebuild,

however, the chances are that another non-parity stripe content also
has something corrupted, so that the above retries are not able to
return correct content.

We've fixed normal reads to rebuild raid6 correctly with more retries
in the patch "Btrfs: make raid6 rebuild retry more"[1]; this change makes
scrub do exactly the same rebuild process.

[1]: https://patchwork.kernel.org/patch/10091755/
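
For a hypothetical six-device RAID6 stripe (real_stripes == 6: data D0-D3 at
indices 0-3, P at 4, Q at 5), the retry mapping in raid56_parity_recover()
after this change works out roughly as follows:

    /*
     * mirror_num == 2:  rebuild from all other stripes (raid5-style xor)
     * mirror_num == 3:  failb = 6 - (3 - 1) = 4  -> fail P, rebuild from Q
     * mirror_num == 4:  failb = 6 - (4 - 1) = 3  -> fail D3, rebuild from P+Q
     * mirror_num == 5:  failb = 2                -> fail D2, and so on
     *
     * If failb lands on the stripe already under reconstruction (faila), it is
     * decremented to the next candidate; btrfs_num_copies() returning
     * map->num_stripes is what lets scrub walk through all of these retries.
     */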

Signed-off-by: Liu Bo 
Signed-off-by: David Sterba 
Signed-off-by: Sasha Levin 
---
 fs/btrfs/raid56.c  | 18 ++
 fs/btrfs/volumes.c |  9 -
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index d016d4a79864..af6a776fa18c 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2161,11 +2161,21 @@ int raid56_parity_recover(struct btrfs_root *root, 
struct bio *bio,
}
 
/*
-* reconstruct from the q stripe if they are
-* asking for mirror 3
+* Loop retry:
+* for 'mirror == 2', reconstruct from all other stripes.
+* for 'mirror_num > 2', select a stripe to fail on every retry.
 */
-   if (mirror_num == 3)
-   rbio->failb = rbio->real_stripes - 2;
+   if (mirror_num > 2) {
+   /*
+* 'mirror == 3' is to fail the p stripe and
+* reconstruct from the q stripe.  'mirror > 3' is to
+* fail a data stripe and reconstruct from p+q stripe.
+*/
+   rbio->failb = rbio->real_stripes - (mirror_num - 1);
+   ASSERT(rbio->failb > 0);
+   if (rbio->failb <= rbio->faila)
+   rbio->failb--;
+   }
 
ret = lock_stripe_add(rbio);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4730ba2cc049..491c14ad982a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5177,7 +5177,14 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 
logical, u64 len)
else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
ret = 2;
else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
-   ret = 3;
+   /*
+* There could be two corrupted data stripes, we need
+* to loop retry in order to rebuild the correct data.
+* 
+* Fail a stripe at a time on every retry except the
+* stripe under reconstruction.
+*/
+   ret = map->num_stripes;
else
ret = 1;
free_extent_map(em);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 238/293] btrfs: Fix out of bounds access in btrfs_search_slot

2018-04-08 Thread Sasha Levin
From: Nikolay Borisov 

[ Upstream commit 9ea2c7c9da13c9073e371c046cbbc45481ecb459 ]

When modifying a tree where the root is at BTRFS_MAX_LEVEL - 1 then
the level variable is going to be 7 (this is the max height of the
tree). On the other hand btrfs_cow_block is always called with
"level + 1" as an index into the nodes and slots arrays. This leads to
an out of bounds access. Admittedly this will be benign since an OOB
access of the nodes array will likely read the 0th element from the
slots array, which in this case is going to be 0 (since we start CoW at
the top of the tree). The OOB access into the slots array in turn will
read the 0th and 1st values of the locks array, which would both be 0
at the time. However, this benign behavior relies on the fact that the
path being passed hasn't been initialised, if it has already been used to
query a btree then it could potentially have populated the nodes/slots arrays.

Fix it by explicitly checking if we are at level 7 (the maximum allowed
index in nodes/slots arrays) and explicitly call the CoW routine with
NULL for parent's node/slot.
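
In other words, with BTRFS_MAX_LEVEL being 8 the per-path arrays only have
valid indices 0..7 (sketch, struct trimmed to the relevant members):

    struct btrfs_path {
            struct extent_buffer *nodes[BTRFS_MAX_LEVEL];   /* [0..7] */
            int slots[BTRFS_MAX_LEVEL];                     /* [0..7] */
            /* ... */
    };

    /* With the root at level 7, the old call
     *     btrfs_cow_block(trans, root, b, p->nodes[level + 1],
     *                     p->slots[level + 1], &b);
     * indexes both arrays at 8, one past the end; a level-7 root has no
     * parent, so the fix passes NULL and 0 instead.
     */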

Signed-off-by: Nikolay Borisov 
Fixes-coverity-id: 711515
Reviewed-by: David Sterba 
Signed-off-by: David Sterba 
Signed-off-by: Sasha Levin 
---
 fs/btrfs/ctree.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index f6ba165d3f81..f22ffc6793cd 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2760,6 +2760,8 @@ again:
 * contention with the cow code
 */
if (cow) {
+   bool last_level = (level == (BTRFS_MAX_LEVEL - 1));
+
/*
 * if we don't really need to cow this block
 * then we don't want to set the path blocking,
@@ -2784,9 +2786,13 @@ again:
}
 
btrfs_set_path_blocking(p);
-   err = btrfs_cow_block(trans, root, b,
- p->nodes[level + 1],
- p->slots[level + 1], &b);
+   if (last_level)
+   err = btrfs_cow_block(trans, root, b, NULL, 0,
+ );
+   else
+   err = btrfs_cow_block(trans, root, b,
+ p->nodes[level + 1],
+ p->slots[level + 1], &b);
if (err) {
ret = err;
goto done;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 236/293] ipmi/powernv: Fix error return code in ipmi_powernv_probe()

2018-04-08 Thread Sasha Levin
From: Wei Yongjun 

[ Upstream commit e749d328b0b450aa78d562fa26a0cd8872325dd9 ]

Fix to return a negative error code from the request_irq() error
handling case instead of 0, as done elsewhere in this function.

Fixes: dce143c3381c ("ipmi/powernv: Convert to irq event interface")
Signed-off-by: Wei Yongjun 
Reviewed-by: Alexey Kardashevskiy 
Signed-off-by: Corey Minyard 
Signed-off-by: Sasha Levin 
---
 drivers/char/ipmi/ipmi_powernv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_powernv.c b/drivers/char/ipmi/ipmi_powernv.c
index 6e658aa114f1..a70518a4fcec 100644
--- a/drivers/char/ipmi/ipmi_powernv.c
+++ b/drivers/char/ipmi/ipmi_powernv.c
@@ -251,8 +251,9 @@ static int ipmi_powernv_probe(struct platform_device *pdev)
ipmi->irq = opal_event_request(prop);
}
 
-   if (request_irq(ipmi->irq, ipmi_opal_event, IRQ_TYPE_LEVEL_HIGH,
-   "opal-ipmi", ipmi)) {
+   rc = request_irq(ipmi->irq, ipmi_opal_event, IRQ_TYPE_LEVEL_HIGH,
+"opal-ipmi", ipmi);
+   if (rc) {
dev_warn(dev, "Unable to request irq\n");
goto err_dispose;
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 234/293] kconfig: Fix expr_free() E_NOT leak

2018-04-08 Thread Sasha Levin
From: Ulf Magnusson 

[ Upstream commit 5b1374b3b3c2fc4f63a398adfa446fb8eff791a4 ]

Only the E_NOT operand and not the E_NOT node itself was freed, due to
accidentally returning too early in expr_free(). Outline of leak:

switch (e->type) {
...
case E_NOT:
expr_free(e->left.expr);
return;
...
}
*Never reached, 'e' leaked*
free(e);

Fix by changing the 'return' to a 'break'.

Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix:

LEAK SUMMARY:
   definitely lost: 44,448 bytes in 1,852 blocks
   ...

Summary after the fix:

LEAK SUMMARY:
   definitely lost: 1,608 bytes in 67 blocks
   ...

Signed-off-by: Ulf Magnusson 
Signed-off-by: Masahiro Yamada 
Signed-off-by: Sasha Levin 
---
 scripts/kconfig/expr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/kconfig/expr.c b/scripts/kconfig/expr.c
index cbf4996dd9c1..ed29bad1f03a 100644
--- a/scripts/kconfig/expr.c
+++ b/scripts/kconfig/expr.c
@@ -113,7 +113,7 @@ void expr_free(struct expr *e)
break;
case E_NOT:
expr_free(e->left.expr);
-   return;
+   break;
case E_EQUAL:
case E_GEQ:
case E_GTH:
-- 
2.15.1


[PATCH AUTOSEL for 4.9 229/293] nfs: Do not convert nfs_idmap_cache_timeout to jiffies

2018-04-08 Thread Sasha Levin
From: Jan Chochol 

[ Upstream commit cbebc6ef4fc830f4040d4140bf53484812d5d5d9 ]

Since commit 57e62324e469 ("NFS: Store the legacy idmapper result in the
keyring") nfs_idmap_cache_timeout changed units from jiffies to seconds.
Unfortunately sysctl interface was not updated accordingly.

As an effect, updating /proc/sys/fs/nfs/idmap_cache_timeout with some
value will incorrectly multiply this value by HZ.
Also reading /proc/sys/fs/nfs/idmap_cache_timeout will show real value
divided by HZ.
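
A short sketch of why the handler choice matters, assuming HZ is 1000; the
.procname line is not part of the hunk below and is assumed:

    /*
     * With the value now stored in seconds:
     *
     *   proc_dointvec_jiffies:  "echo 600" stores 600 * HZ = 600000, which the
     *       idmapper then treats as 600000 seconds; reading back the untouched
     *       default of 600 seconds shows 600 / HZ = 0.
     *   proc_dointvec:          stores and shows the plain integer, matching
     *       the variable's unit of seconds.
     */
    {
            .procname       = "idmap_cache_timeout",
            .data           = &nfs_idmap_cache_timeout,
            .maxlen         = sizeof(int),
            .mode           = 0644,
            .proc_handler   = proc_dointvec,
    },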

Fixes: 57e62324e469 ("NFS: Store the legacy idmapper result in the keyring")
Signed-off-by: Jan Chochol 
Signed-off-by: Trond Myklebust 
Signed-off-by: Sasha Levin 
---
 fs/nfs/nfs4sysctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4sysctl.c b/fs/nfs/nfs4sysctl.c
index 8693d77c45ea..76241aa8d853 100644
--- a/fs/nfs/nfs4sysctl.c
+++ b/fs/nfs/nfs4sysctl.c
@@ -31,7 +31,7 @@ static struct ctl_table nfs4_cb_sysctls[] = {
.data = &nfs_idmap_cache_timeout,
.maxlen = sizeof(int),
.mode = 0644,
-   .proc_handler = proc_dointvec_jiffies,
+   .proc_handler = proc_dointvec,
},
{ }
 };
-- 
2.15.1


[PATCH AUTOSEL for 4.9 235/293] mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()

2018-04-08 Thread Sasha Levin
From: "weiyongjun (A)" 

[ Upstream commit 0ddcff49b672239dda94d70d0fcf50317a9f4b51 ]

'hwname' is malloced in hwsim_new_radio_nl() and should be freed
before leaving from the error handling cases, otherwise it will cause
memory leak.

Fixes: ff4dd73dd2b4 ("mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length")
Signed-off-by: Wei Yongjun 
Reviewed-by: Ben Hutchings 
Signed-off-by: Johannes Berg 
Signed-off-by: Sasha Levin 
---
 drivers/net/wireless/mac80211_hwsim.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/mac80211_hwsim.c 
b/drivers/net/wireless/mac80211_hwsim.c
index 4182c3775a72..eb6a145d2ed6 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -3084,8 +3084,10 @@ static int hwsim_new_radio_nl(struct sk_buff *msg, 
struct genl_info *info)
if (info->attrs[HWSIM_ATTR_REG_CUSTOM_REG]) {
u32 idx = nla_get_u32(info->attrs[HWSIM_ATTR_REG_CUSTOM_REG]);
 
-   if (idx >= ARRAY_SIZE(hwsim_world_regdom_custom))
+   if (idx >= ARRAY_SIZE(hwsim_world_regdom_custom)) {
+   kfree(hwname);
return -EINVAL;
+   }
param.regd = hwsim_world_regdom_custom[idx];
}
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 225/293] iommu/vt-d: Use domain instead of cache fetching

2018-04-08 Thread Sasha Levin
From: Peter Xu 

[ Upstream commit 9d2e6505f6d6934e681aed502f566198cb25c74a ]

after commit a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into
iommu_flush_iotlb_psi", 2015-08-12), we have domain pointer as parameter
to iommu_flush_iotlb_psi(), so no need to fetch it from cache again.

More importantly, a NULL reference pointer bug is reported on RHEL7 (and
it can be reproduced on some old upstream kernels too, e.g., v4.13) by
unplugging an 40g nic from a VM (hard to test unplug on real host, but
it should be the same):

https://bugzilla.redhat.com/show_bug.cgi?id=1531367

[   24.391863] pciehp :00:03.0:pcie004: Slot(0): Attention button pressed
[   24.393442] pciehp :00:03.0:pcie004: Slot(0): Powering off due to button 
press
[   29.721068] i40evf :01:00.0: Unable to send opcode 2 to PF, err 
I40E_ERR_QUEUE_EMPTY, aq_err OK
[   29.783557] iommu: Removing device :01:00.0 from group 3
[   29.784662] BUG: unable to handle kernel NULL pointer dereference at 
0304
[   29.785817] IP: iommu_flush_iotlb_psi+0xcf/0x120
[   29.786486] PGD 0
[   29.786487] P4D 0
[   29.786812]
[   29.787390] Oops:  [#1] SMP
[   29.787876] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 
xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc 
ip6table_ng
[   29.795371] CPU: 0 PID: 156 Comm: kworker/0:2 Not tainted 4.13.0 #14
[   29.796366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.11.0-1.el7 04/01/2014
[   29.797593] Workqueue: pciehp-0 pciehp_power_thread
[   29.798328] task: 94f5745b4a00 task.stack: b326805ac000
[   29.799178] RIP: 0010:iommu_flush_iotlb_psi+0xcf/0x120
[   29.799919] RSP: 0018:b326805afbd0 EFLAGS: 00010086
[   29.800666] RAX: 94f5bc56e800 RBX:  RCX: 00020025
[   29.801667] RDX: 94f5bc56e000 RSI: 0082 RDI: 
[   29.802755] RBP: b326805afbf8 R08:  R09: 94f5bc86bbf0
[   29.803772] R10: b326805afba8 R11: 000ffdc4 R12: 94f5bc86a400
[   29.804789] R13:  R14: ffdc4000 R15: 
[   29.805792] FS:  () GS:94f5bfc0() 
knlGS:
[   29.806923] CS:  0010 DS:  ES:  CR0: 80050033
[   29.807736] CR2: 0304 CR3: 3499d000 CR4: 06f0
[   29.808747] Call Trace:
[   29.809156]  flush_unmaps_timeout+0x126/0x1c0
[   29.809800]  domain_exit+0xd6/0x100
[   29.810322]  device_notifier+0x6b/0x70
[   29.810902]  notifier_call_chain+0x4a/0x70
[   29.812822]  __blocking_notifier_call_chain+0x47/0x60
[   29.814499]  blocking_notifier_call_chain+0x16/0x20
[   29.816137]  device_del+0x233/0x320
[   29.817588]  pci_remove_bus_device+0x6f/0x110
[   29.819133]  pci_stop_and_remove_bus_device+0x1a/0x20
[   29.820817]  pciehp_unconfigure_device+0x7a/0x1d0
[   29.822434]  pciehp_disable_slot+0x52/0xe0
[   29.823931]  pciehp_power_thread+0x8a/0xa0
[   29.825411]  process_one_work+0x18c/0x3a0
[   29.826875]  worker_thread+0x4e/0x3b0
[   29.828263]  kthread+0x109/0x140
[   29.829564]  ? process_one_work+0x3a0/0x3a0
[   29.831081]  ? kthread_park+0x60/0x60
[   29.832464]  ret_from_fork+0x25/0x30
[   29.833794] Code: 85 ed 74 0b 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 54 24 
60 44 89 f8 0f b6 c4 48 8b 04 c2 48 85 c0 74 49 45 0f b6 ff 4a 8b 3c f8 <80> bf
[   29.838514] RIP: iommu_flush_iotlb_psi+0xcf/0x120 RSP: b326805afbd0
[   29.840362] CR2: 0304
[   29.841716] ---[ end trace b10ec0d6900868d3 ]---

This patch fixes that problem if applied to v4.13 kernel.

The bug does not exist on latest upstream kernel since it's fixed as a
side effect of commit 13cf01744608 ("iommu/vt-d: Make use of iova
deferred flushing", 2017-08-15).  But IMHO it's still good to have this
patch upstream.

CC: Alex Williamson 
Signed-off-by: Peter Xu 
Fixes: a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into 
iommu_flush_iotlb_psi")
Reviewed-by: Alex Williamson 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-iommu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 88bbc8ccc5e3..1612d3a22d42 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1612,8 +1612,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu 
*iommu,
 * flush. However, device IOTLB doesn't need to be flushed in this case.
 */
if (!cap_caching_mode(iommu->cap) || !map)
-   iommu_flush_dev_iotlb(get_iommu_domain(iommu, did),
- addr, mask);
+   iommu_flush_dev_iotlb(domain, addr, mask);
 }
 
 static void iommu_disable_protect_mem_regions(struct intel_iommu *iommu)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 226/293] dm thin: fix documentation relative to low water mark threshold

2018-04-08 Thread Sasha Levin
From: mulhern 

[ Upstream commit 9b28a1102efc75d81298198166ead87d643a29ce ]

Fixes:
1. The use of "exceeds" when the opposite of exceeds, falls below,
was meant.
2. Properly speaking, a table can not exceed a threshold.

It emphasizes the important point, which is that it is the userspace
daemon's responsibility to check for low free space when a device
is resumed, since it won't get a special event indicating low free
space in that situation.

Signed-off-by: mulhern 
Signed-off-by: Mike Snitzer 
Signed-off-by: Sasha Levin 
---
 Documentation/device-mapper/thin-provisioning.txt | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/device-mapper/thin-provisioning.txt 
b/Documentation/device-mapper/thin-provisioning.txt
index 1699a55b7b70..ef639960b272 100644
--- a/Documentation/device-mapper/thin-provisioning.txt
+++ b/Documentation/device-mapper/thin-provisioning.txt
@@ -112,9 +112,11 @@ $low_water_mark is expressed in blocks of size 
$data_block_size.  If
 free space on the data device drops below this level then a dm event
 will be triggered which a userspace daemon should catch allowing it to
 extend the pool device.  Only one such event will be sent.
-Resuming a device with a new table itself triggers an event so the
-userspace daemon can use this to detect a situation where a new table
-already exceeds the threshold.
+
+No special event is triggered if a just resumed device's free space is below
+the low water mark. However, resuming a device always triggers an
+event; a userspace daemon should verify that free space exceeds the low
+water mark when handling this event.
 
 A low water mark for the metadata device is maintained in the kernel and
 will trigger a dm event if free space on the metadata device drops below
-- 
2.15.1


[PATCH AUTOSEL for 4.9 228/293] net: stmmac: dwmac-meson8b: propagate rate changes to the parent clock

2018-04-08 Thread Sasha Levin
From: Martin Blumenstingl 

[ Upstream commit fb7d38a70e1d8ffd54f7a7464dcc4889d7e490ad ]

On Meson8b the only valid input clock is MPLL2. The bootloader
configures that to run at 500002394Hz which cannot be divided evenly
down to 125MHz using the m250_div clock. Currently the common clock
framework chooses a m250_div of 2 - with the internal fixed
"divide by 10" this results in a RGMII TX clock of 125001197Hz (120Hz
above the requested 125MHz).

Letting the common clock framework propagate the rate changes up to the
parent of m250_mux allows us to get the best possible clock rate. With
this patch the common clock framework calculates a rate of
very-close-to-250MHz (249999701Hz to be exact) for the MPLL2 clock
(which is the mux input). Dividing that by 2 (which is an internal,
fixed divider for the RGMII TX clock) gives us an RGMII TX clock of
124999850Hz (which is only 150Hz off the requested 125MHz, compared to
1197Hz based on the MPLL2 rate set by u-boot and the Amlogic GPL kernel
sources).

SoCs from the Meson GX series are not affected by this change because
the input clock is FCLK_DIV2 whose rate cannot be changed (which is fine
since it's running at 1GHz, so it's already a multiple of 250MHz and
125MHz).

Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b 
/ GXBB DWMAC")
Suggested-by: Jerome Brunet 
Signed-off-by: Martin Blumenstingl 
Reviewed-by: Jerome Brunet 
Tested-by: Jerome Brunet 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
index 923033867a4d..f356a44bcb81 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
@@ -118,7 +118,7 @@ static int meson8b_init_clk(struct meson8b_dwmac *dwmac)
snprintf(clk_name, sizeof(clk_name), "%s#m250_sel", dev_name(dev));
init.name = clk_name;
init.ops = &clk_mux_ops;
-   init.flags = 0;
+   init.flags = CLK_SET_RATE_PARENT;
init.parent_names = mux_parent_names;
init.num_parents = MUX_CLK_NUM_PARENTS;
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 223/293] tools lib traceevent: Fix get_field_str() for dynamic strings

2018-04-08 Thread Sasha Levin
From: "Steven Rostedt (VMware)" 

[ Upstream commit d777f8de99b05d399c0e4e51cdce016f26bd971b ]

If a field is a dynamic string, get_field_str() returned just the
offset/size value and not the string. Have it parse the offset/size
correctly to return the actual string. Otherwise filtering fails when
trying to filter fields that are dynamic strings.
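
For reference, a dynamic string ("__data_loc") field stores a packed locator
rather than the characters themselves; a sketch of the unpacking, with names
loosely following the filter code below:

    unsigned int loc  = *(unsigned int *)(record->data + field->offset);
    const char  *str  = (const char *)record->data + (loc & 0xffff); /* low 16: offset */
    unsigned int size = loc >> 16;                                   /* high 16: size  */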

Reported-by: Gopanapalli Pradeep 
Signed-off-by: Steven Rostedt 
Acked-by: Namhyung Kim 
Cc: Andrew Morton 
Link: http://lkml.kernel.org/r/20180112004823.146333...@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
---
 tools/lib/traceevent/parse-filter.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/parse-filter.c 
b/tools/lib/traceevent/parse-filter.c
index 7c214ceb9386..5e10ba796a6f 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -1879,17 +1879,25 @@ static const char *get_field_str(struct filter_arg 
*arg, struct pevent_record *r
struct pevent *pevent;
unsigned long long addr;
const char *val = NULL;
+   unsigned int size;
char hex[64];
 
/* If the field is not a string convert it */
if (arg->str.field->flags & FIELD_IS_STRING) {
val = record->data + arg->str.field->offset;
+   size = arg->str.field->size;
+
+   if (arg->str.field->flags & FIELD_IS_DYNAMIC) {
+   addr = *(unsigned int *)val;
+   val = record->data + (addr & 0xffff);
+   size = addr >> 16;
+   }
 
/*
 * We need to copy the data since we can't be sure the field
 * is null terminated.
 */
-   if (*(val + arg->str.field->size - 1)) {
+   if (*(val + size - 1)) {
/* copy it */
memcpy(arg->str.buffer, val, arg->str.field->size);
/* the buffer is already NULL terminated */
-- 
2.15.1


[PATCH AUTOSEL for 4.9 220/293] i40iw: Zero-out consumer key on allocate stag for FMR

2018-04-08 Thread Sasha Levin
From: Shiraz Saleem 

[ Upstream commit 6376e926af1a8661dd1b2e6d0896e07f84a35844 ]

If the application invalidates the MR before the FMR WR, HW parses the
consumer key portion of the stag and returns an invalid stag key
Asynchronous Event (AE) that tears down the QP.

Fix this by zeroing-out the consumer key portion of the allocated stag
returned to application for FMR.

Fixes: ee855d3b93f3 ("RDMA/i40iw: Add base memory management extensions")
Signed-off-by: Shiraz Saleem 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Sasha Levin 
---
 drivers/infiniband/hw/i40iw/i40iw_verbs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c 
b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
index 4b892ca2b13a..095912fb3201 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -1515,6 +1515,7 @@ static struct ib_mr *i40iw_alloc_mr(struct ib_pd *pd,
err_code = -EOVERFLOW;
goto err;
}
+   stag &= ~I40IW_CQPSQ_STAG_KEY_MASK;
iwmr->stag = stag;
iwmr->ibmr.rkey = stag;
iwmr->ibmr.lkey = stag;
-- 
2.15.1


[PATCH AUTOSEL for 4.9 224/293] perf record: Fix failed memory allocation for get_cpuid_str

2018-04-08 Thread Sasha Levin
From: Thomas Richter 

[ Upstream commit 81fccd6ca507d3b2012eaf1edeb9b1dbf4bd22db ]

In the x86 architecture-dependent part, the function get_cpuid_str() mallocs a
128 byte buffer, but does not check if the memory allocation succeeded
or not.

When the memory allocation fails, function __get_cpuid() is called with
first parameter being a NULL pointer.  However this function references
its first parameter and operates on a NULL pointer which might cause
core dumps.

Signed-off-by: Thomas Richter 
Cc: Heiko Carstens 
Cc: Hendrik Brueckner 
Cc: Martin Schwidefsky 
Link: http://lkml.kernel.org/r/20180117131611.34319-1-tmri...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
---
 tools/perf/arch/x86/util/header.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/header.c 
b/tools/perf/arch/x86/util/header.c
index a74a48db26f5..2eb11543e2e9 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -69,7 +69,7 @@ get_cpuid_str(void)
 {
char *buf = malloc(128);
 
-   if (__get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
+   if (buf && __get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
free(buf);
return NULL;
}
-- 
2.15.1
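
The pattern being fixed can be shown in a few lines of standalone C. The
helper below is a made-up stand-in for the real __get_cpuid(); only the added
"buf &&" guard corresponds to the one-line change in the patch.

  #include <stdio.h>
  #include <stdlib.h>

  /* made-up stand-in for the real __get_cpuid() helper */
  static int fill_cpuid(char *buf, size_t len)
  {
      return snprintf(buf, len, "%s", "GenuineIntel-6-3D") < 0 ? -1 : 0;
  }

  static char *get_cpuid_str(void)
  {
      char *buf = malloc(128);

      /* the fix: short-circuit on allocation failure before using buf */
      if (buf && fill_cpuid(buf, 128) < 0) {
          free(buf);
          return NULL;
      }
      return buf;          /* NULL if malloc() failed */
  }

  int main(void)
  {
      char *id = get_cpuid_str();

      puts(id ? id : "(allocation failed)");
      free(id);
      return 0;
  }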


[PATCH AUTOSEL for 4.9 221/293] tools lib traceevent: Simplify pointer print logic and fix %pF

2018-04-08 Thread Sasha Levin
From: "Steven Rostedt (VMware)" 

[ Upstream commit 38d70b7ca1769f26c0b79f3c08ff2cc949712b59 ]

When processing %pX in pretty_print(), simplify the logic slightly by
incrementing the ptr to the format string if isalnum(ptr[1]) is true.
This follows the logic a bit more closely to what is in the kernel.

Also, this fixes a small bug where %pF was not giving the offset of the
function.

Signed-off-by: Steven Rostedt 
Acked-by: Namhyung Kim 
Cc: Andrew Morton 
Link: http://lkml.kernel.org/r/20180112004822.260262...@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
---
 tools/lib/traceevent/event-parse.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index 664c90c8e22b..669475300ba8 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4927,21 +4927,22 @@ static void pretty_print(struct trace_seq *s, void 
*data, int size, struct event
else
ls = 2;
 
-   if (*(ptr+1) == 'F' || *(ptr+1) == 'f' ||
-   *(ptr+1) == 'S' || *(ptr+1) == 's') {
+   if (isalnum(ptr[1]))
ptr++;
+
+   if (*ptr == 'F' || *ptr == 'f' ||
+   *ptr == 'S' || *ptr == 's') {
show_func = *ptr;
-   } else if (*(ptr+1) == 'M' || *(ptr+1) == 'm') {
-   print_mac_arg(s, *(ptr+1), data, size, 
event, arg);
-   ptr++;
+   } else if (*ptr == 'M' || *ptr == 'm') {
+   print_mac_arg(s, *ptr, data, size, 
event, arg);
arg = arg->next;
break;
-   } else if (*(ptr+1) == 'I' || *(ptr+1) == 'i') {
+   } else if (*ptr == 'I' || *ptr == 'i') {
int n;
 
-   n = print_ip_arg(s, ptr+1, data, size, 
event, arg);
+   n = print_ip_arg(s, ptr, data, size, 
event, arg);
if (n > 0) {
-   ptr += n;
+   ptr += n - 1;
arg = arg->next;
break;
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 216/293] netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460

2018-04-08 Thread Sasha Levin
From: Subash Abhinov Kasiviswanathan 

[ Upstream commit 83f1999caeb14e15df205e80d210699951733287 ]

ipv6_defrag pulls the network headers before the fragment header. In case
of an error, the netfilter layer currently drops these packets.
This results in failures of some IPv6 standards tests which passed on
older kernels because the netfilter framework used cloning.

The test case run here is a check for ICMPv6 error message replies
when some invalid IPv6 fragments are sent. This specific test case is
listed in https://www.ipv6ready.org/docs/Core_Conformance_Latest.pdf
in the Extension Header Processing Order section.

A packet with unrecognized option Type 11 is sent and the test expects
an ICMP error in line with RFC2460 section 4.2 -

11 - discard the packet and, only if the packet's Destination
 Address was not a multicast address, send an ICMP Parameter
 Problem, Code 2, message to the packet's Source Address,
 pointing to the unrecognized Option Type.

Since netfilter layer now drops all invalid IPv6 frag packets, we no
longer see the ICMP error message and fail the test case.

To fix this, save the transport header. If defrag is unable to process
the packet due to RFC2460, restore the transport header and allow packet
to be processed by stack. There is no change for other packet
processing paths.

Tested by confirming that stack sends an ICMP error when it receives
these packets. Also tested that fragmented ICMP pings succeed.

v1->v2: Instead of cloning always, save the transport_header and
restore it in case of this specific error. Update the title and
commit message accordingly.

Signed-off-by: Subash Abhinov Kasiviswanathan 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin 
---
 net/ipv6/netfilter/nf_conntrack_reasm.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c 
b/net/ipv6/netfilter/nf_conntrack_reasm.c
index b263bf3a19f7..5edfe66a3d7a 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -230,7 +230,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct 
sk_buff *skb,
 
if ((unsigned int)end > IPV6_MAXPLEN) {
pr_debug("offset is too large.\n");
-   return -1;
+   return -EINVAL;
}
 
ecn = ip6_frag_ecn(ipv6_hdr(skb));
@@ -263,7 +263,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct 
sk_buff *skb,
 * this case. -DaveM
 */
pr_debug("end of fragment not rounded to 8 bytes.\n");
-   return -1;
+   return -EPROTO;
}
if (end > fq->q.len) {
/* Some bits beyond end -> corruption. */
@@ -357,7 +357,7 @@ found:
 discard_fq:
	inet_frag_kill(&fq->q, &nf_frags);
 err:
-   return -1;
+   return -EINVAL;
 }
 
 /*
@@ -566,6 +566,7 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int 
*prevhoff, int *fhoff)
 
 int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
 {
+   u16 savethdr = skb->transport_header;
struct net_device *dev = skb->dev;
int fhoff, nhoff, ret;
struct frag_hdr *fhdr;
@@ -599,8 +600,12 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff 
*skb, u32 user)
 
	spin_lock_bh(&fq->q.lock);
 
-   if (nf_ct_frag6_queue(fq, skb, fhdr, nhoff) < 0) {
-   ret = -EINVAL;
+   ret = nf_ct_frag6_queue(fq, skb, fhdr, nhoff);
+   if (ret < 0) {
+   if (ret == -EPROTO) {
+   skb->transport_header = savethdr;
+   ret = 0;
+   }
goto out_unlock;
}
 
-- 
2.15.1
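
The control flow introduced here can be sketched in standalone C. All names
and numbers below are invented; the point is that only the RFC2460-specific
error code restores the saved header and lets the packet continue to the
stack, while every other error still drops it.

  #include <stdio.h>
  #include <errno.h>

  /* invented: a fragment queueing step that can fail for different reasons */
  static int queue_fragment(int payload_len, int more_frags)
  {
      if (payload_len > 65535)
          return -EINVAL;     /* malformed: drop */
      if (more_frags && (payload_len & 7))
          return -EPROTO;     /* RFC2460 case: hand back to the stack */
      return 0;
  }

  int main(void)
  {
      int saved_thdr = 42;    /* stand-in for the saved skb->transport_header */
      int thdr = 7;           /* clobbered while the headers were pulled */
      int ret = queue_fragment(21, 1);

      if (ret < 0) {
          if (ret == -EPROTO) {
              thdr = saved_thdr;   /* restore and let the stack process it */
              ret = 0;
          }
          /* any other error: the packet is still dropped */
      }
      printf("ret=%d transport_header=%d\n", ret, thdr);
      return 0;
  }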


[PATCH AUTOSEL for 4.9 222/293] perf callchain: Fix attr.sample_max_stack setting

2018-04-08 Thread Sasha Levin
From: Arnaldo Carvalho de Melo 

[ Upstream commit 249d98e567e25dd03e015e2d31e1b7b9648f34df ]

When setting the "dwarf" unwinder for a specific event and not
specifying the max-stack, the attr.sample_max_stack ended up using an
uninitialized callchain_param.max_stack; fix it by using a designated
initializer for that callchain_param variable, zeroing all struct members
that are not explicitly initialized.

Here is what happened:

  # perf trace -vv --no-syscalls --max-stack 4 -e 
probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  callchain: type DWARF
  callchain: stack dump size 8192
  perf_event_attr:
type 2
size 112
config   0x730
{ sample_period, sample_freq }   1
sample_type  
IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
exclude_callchain_user   1
{ wakeup_events, wakeup_watermark } 1
sample_regs_user 0xff0fff
sample_stack_user8192
sample_max_stack 50656
  sys_perf_event_open failed, error -75
  Value too large for defined data type
  # perf trace -vv --no-syscalls --max-stack 4 -e 
probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  callchain: type DWARF
  callchain: stack dump size 8192
  perf_event_attr:
type 2
size 112
config   0x730
sample_type  
IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
exclude_callchain_user   1
sample_regs_user 0xff0fff
sample_stack_user8192
sample_max_stack 30448
  sys_perf_event_open failed, error -75
  Value too large for defined data type
  #

Now the attr.sample_max_stack is set to zero and the above works as
expected:

  # perf trace --no-syscalls --max-stack 4 -e 
probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
   0.000 probe_libc:inet_pton:(7feb7a998350))
 __inet_pton (inlined)
 gaih_inet.constprop.7 
(/usr/lib64/libc-2.26.so)
 __GI_getaddrinfo (inlined)
 [0xaa39b6108f3f] (/usr/bin/ping)
  #

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Hendrick Brueckner 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Thomas Richter 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
---
 tools/perf/util/evsel.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1c1291afd6a6..4b52c76fb2a8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -694,14 +694,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
struct perf_evsel_config_term *term;
	struct list_head *config_terms = &evsel->config_terms;
	struct perf_event_attr *attr = &evsel->attr;
-   struct callchain_param param;
+   /* callgraph default */
+   struct callchain_param param = {
+   .record_mode = callchain_param.record_mode,
+   };
u32 dump_size = 0;
int max_stack = 0;
const char *callgraph_buf = NULL;
 
-   /* callgraph default */
-   param.record_mode = callchain_param.record_mode;
-
list_for_each_entry(term, config_terms, list) {
switch (term->type) {
case PERF_EVSEL__CONFIG_TERM_PERIOD:
-- 
2.15.1
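
The C rule the fix relies on is easy to demonstrate in isolation. The struct
below is a toy stand-in for callchain_param; with a designated initializer,
every member that is not named explicitly is zero-initialized, so nothing is
left holding stack garbage the way the old uninitialized local was.

  #include <stdio.h>

  struct callchain_opts {      /* toy stand-in for callchain_param */
      int record_mode;
      unsigned int max_stack;
      unsigned int dump_size;
  };

  int main(void)
  {
      int default_mode = 2;

      /* every member not named here is zero-initialized by the compiler */
      struct callchain_opts param = {
          .record_mode = default_mode,
      };

      printf("mode=%d max_stack=%u dump_size=%u\n",
             param.record_mode, param.max_stack, param.dump_size);
      return 0;
  }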


[PATCH AUTOSEL for 4.9 218/293] PCI: Add function 1 DMA alias quirk for Marvell 9128

2018-04-08 Thread Sasha Levin
From: Alex Williamson 

[ Upstream commit aa008206634363ef800fbd5f0262016c9ff81dea ]

The Marvell 9128 is the original device generating bug 42679, from which
many other Marvell DMA alias quirks have been sourced, but we didn't have
positive confirmation of the fix on 9128 until now.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=42679
Link: https://www.spinics.net/lists/kvm/msg161459.html
Reported-by: Binarus 
Tested-by: Binarus 
Signed-off-by: Alex Williamson 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Sasha Levin 
---
 drivers/pci/quirks.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 0fc4843e8589..b3f49f9f640c 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3857,6 +3857,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 
0x9120,
 quirk_dma_func1_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9123,
 quirk_dma_func1_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9128,
+quirk_dma_func1_alias);
 /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c14 */
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9130,
 quirk_dma_func1_alias);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 211/293] x86/tsc: Allow TSC calibration without PIT

2018-04-08 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit 30c7e5b123673d5e570e238dbada2fb68a87212c ]

Zhang Rui reported that a Surface Pro 4 will fail to boot with
lapic=notscdeadline. Part of the problem is that that machine doesn't have
a PIT.

If, for some reason, the TSC init has to fall back to TSC calibration, it
relies on the PIT to be present.

Allow TSC calibration to reliably fall back to HPET.

The below results in an accurate TSC measurement when forced on a IVB:

  tsc: Unable to calibrate against PIT
  tsc: No reference (HPET/PMTIMER) available
  tsc: Unable to calibrate against PIT
  tsc: using HPET reference calibration
  tsc: Detected 2792.451 MHz processor

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Thomas Gleixner 
Cc: len.br...@intel.com
Cc: rui.zh...@intel.com
Link: https://lkml.kernel.org/r/20171222092243.333145...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/x86/include/asm/i8259.h |  5 +
 arch/x86/kernel/tsc.c| 18 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/i8259.h b/arch/x86/include/asm/i8259.h
index 39bcefc20de7..bb078786a323 100644
--- a/arch/x86/include/asm/i8259.h
+++ b/arch/x86/include/asm/i8259.h
@@ -68,6 +68,11 @@ struct legacy_pic {
 extern struct legacy_pic *legacy_pic;
 extern struct legacy_pic null_legacy_pic;
 
+static inline bool has_legacy_pic(void)
+{
+   return legacy_pic != &null_legacy_pic;
+}
+
 static inline int nr_legacy_irqs(void)
 {
return legacy_pic->nr_legacy_irqs;
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index d07a9390023e..1845b5dc5a81 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 unsigned int __read_mostly cpu_khz;/* TSC clocks / usec, not used here */
 EXPORT_SYMBOL(cpu_khz);
@@ -454,6 +455,20 @@ static unsigned long pit_calibrate_tsc(u32 latch, unsigned 
long ms, int loopmin)
unsigned long tscmin, tscmax;
int pitcnt;
 
+   if (!has_legacy_pic()) {
+   /*
+* Relies on tsc_early_delay_calibrate() to have given us semi
+* usable udelay(), wait for the same 50ms we would have with
+* the PIT loop below.
+*/
+   udelay(10 * USEC_PER_MSEC);
+   udelay(10 * USEC_PER_MSEC);
+   udelay(10 * USEC_PER_MSEC);
+   udelay(10 * USEC_PER_MSEC);
+   udelay(10 * USEC_PER_MSEC);
+   return ULONG_MAX;
+   }
+
/* Set the Gate high, disable speaker */
outb((inb(0x61) & ~0x02) | 0x01, 0x61);
 
@@ -578,6 +593,9 @@ static unsigned long quick_pit_calibrate(void)
u64 tsc, delta;
unsigned long d1, d2;
 
+   if (!has_legacy_pic())
+   return 0;
+
/* Set the Gate high, disable speaker */
outb((inb(0x61) & ~0x02) | 0x01, 0x61);
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 217/293] tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account

2018-04-08 Thread Sasha Levin
From: Anna-Maria Gleixner 

[ Upstream commit 91633eed73a3ac37aaece5c8c1f93a18bae616a9 ]

So far only CLOCK_MONOTONIC and CLOCK_REALTIME were taken into account as
well as HRTIMER_MODE_ABS/REL in the hrtimer_init tracepoint. The query for
detecting the ABS or REL timer modes is not valid anymore, it got broken
by the introduction of HRTIMER_MODE_PINNED.

HRTIMER_MODE_PINNED is not evaluated in the hrtimer_init() call, but for the
sake of completeness print all given modes.

Signed-off-by: Anna-Maria Gleixner 
Cc: Christoph Hellwig 
Cc: John Stultz 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: keesc...@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-9-anna-ma...@linutronix.de
Signed-off-by: Ingo Molnar 
Signed-off-by: Sasha Levin 
---
 include/trace/events/timer.h | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 28c5da6fdfac..3411da79407d 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -125,6 +125,20 @@ DEFINE_EVENT(timer_class, timer_cancel,
TP_ARGS(timer)
 );
 
+#define decode_clockid(type)   \
+   __print_symbolic(type,  \
+   { CLOCK_REALTIME,   "CLOCK_REALTIME"},  \
+   { CLOCK_MONOTONIC,  "CLOCK_MONOTONIC"   },  \
+   { CLOCK_BOOTTIME,   "CLOCK_BOOTTIME"},  \
+   { CLOCK_TAI,"CLOCK_TAI" })
+
+#define decode_hrtimer_mode(mode)  \
+   __print_symbolic(mode,  \
+   { HRTIMER_MODE_ABS, "ABS"   },  \
+   { HRTIMER_MODE_REL, "REL"   },  \
+   { HRTIMER_MODE_ABS_PINNED,  "ABS|PINNED"},  \
+   { HRTIMER_MODE_REL_PINNED,  "REL|PINNED"})
+
 /**
  * hrtimer_init - called when the hrtimer is initialized
  * @hrtimer:   pointer to struct hrtimer
@@ -151,10 +165,8 @@ TRACE_EVENT(hrtimer_init,
),
 
TP_printk("hrtimer=%p clockid=%s mode=%s", __entry->hrtimer,
- __entry->clockid == CLOCK_REALTIME ?
-   "CLOCK_REALTIME" : "CLOCK_MONOTONIC",
- __entry->mode == HRTIMER_MODE_ABS ?
-   "HRTIMER_MODE_ABS" : "HRTIMER_MODE_REL")
+ decode_clockid(__entry->clockid),
+ decode_hrtimer_mode(__entry->mode))
 );
 
 /**
-- 
2.15.1
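
A userspace analogue of the __print_symbolic() tables added above (the
numeric values below are invented, not the kernel's) shows why a lookup
table scales better than the previous two-way ternaries once more clock
bases and modes exist.

  #include <stdio.h>

  struct sym { int val; const char *name; };

  /* table-driven decode, the userspace cousin of __print_symbolic() */
  static const char *decode(int v, const struct sym *tbl, int n)
  {
      for (int i = 0; i < n; i++)
          if (tbl[i].val == v)
              return tbl[i].name;
      return "UNKNOWN";
  }

  int main(void)
  {
      /* numeric values invented for the demo */
      static const struct sym modes[] = {
          { 0, "ABS" }, { 1, "REL" },
          { 2, "ABS|PINNED" }, { 3, "REL|PINNED" },
      };
      int n = sizeof(modes) / sizeof(modes[0]);

      for (int m = 0; m <= n; m++)
          printf("mode %d -> %s\n", m, decode(m, modes, n));
      return 0;
  }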


[PATCH AUTOSEL for 4.9 215/293] kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl

2018-04-08 Thread Sasha Levin
From: Paolo Bonzini 

[ Upstream commit 51776043afa415435c7e4636204fbe4f7edc4501 ]

This ioctl is obsolete (it was used by Xenner as far as I know) but
still let's not break it gratuitously...  Its handler is copying
directly into struct kvm.  Go through a bounce buffer instead, with
the added benefit that we can actually do something useful with the
flags argument---the previous code was exiting with -EINVAL but still
doing the copy.

This technically is a userspace ABI breakage, but since no one should be
using the ioctl, it's a good occasion to see if someone actually
complains.

Cc: kernel-harden...@lists.openwall.com
Cc: Kees Cook 
Cc: Radim Krčmář 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Kees Cook 
Signed-off-by: Sasha Levin 
---
 arch/x86/kvm/x86.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3aaaf305420d..803bb452aac6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4130,13 +4130,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
	mutex_unlock(&kvm->lock);
break;
case KVM_XEN_HVM_CONFIG: {
+   struct kvm_xen_hvm_config xhc;
r = -EFAULT;
-   if (copy_from_user(&kvm->arch.xen_hvm_config, argp,
-  sizeof(struct kvm_xen_hvm_config)))
+   if (copy_from_user(&xhc, argp, sizeof(xhc)))
goto out;
r = -EINVAL;
-   if (kvm->arch.xen_hvm_config.flags)
+   if (xhc.flags)
goto out;
+   memcpy(&kvm->arch.xen_hvm_config, &xhc, sizeof(xhc));
r = 0;
break;
}
-- 
2.15.1
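
The bounce-buffer pattern can be sketched in plain C. The struct layout and
names below are invented and memcpy() stands in for copy_from_user(); the
point is that validation happens on the local copy and the live state is
written only after the checks pass, whereas the broken pattern copied
straight into the live struct and rejected it afterwards.

  #include <stdio.h>
  #include <string.h>

  struct hvm_config { unsigned long flags; unsigned long blob_addr; };

  static struct hvm_config live_config;       /* stands in for kvm->arch state */

  static int set_config(const struct hvm_config *user_arg)
  {
      struct hvm_config xhc;

      memcpy(&xhc, user_arg, sizeof(xhc));    /* copy_from_user() stand-in */
      if (xhc.flags)
          return -1;                          /* reject before touching state */
      memcpy(&live_config, &xhc, sizeof(xhc));
      return 0;
  }

  int main(void)
  {
      struct hvm_config bad  = { .flags = 1, .blob_addr = 0x1000 };
      struct hvm_config good = { .flags = 0, .blob_addr = 0x2000 };
      int ret;

      ret = set_config(&bad);
      printf("bad:  ret=%d live blob=0x%lx\n", ret, live_config.blob_addr);
      ret = set_config(&good);
      printf("good: ret=%d live blob=0x%lx\n", ret, live_config.blob_addr);
      return 0;
  }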


[PATCH] mm: workingset: fix NULL ptr dereference

2018-04-08 Thread Minchan Kim
Recently, I got a report like below.

[ 7858.792946] [] __list_del_entry+0x30/0xd0
[ 7858.792951] [] list_lru_del+0xac/0x1ac
[ 7858.792957] [] page_cache_tree_insert+0xd8/0x110
[ 7858.792962] [] __add_to_page_cache_locked+0xf8/0x4e0
[ 7858.792967] [] add_to_page_cache_lru+0x50/0x1ac
[ 7858.792972] [] pagecache_get_page+0x468/0x57c
[ 7858.792979] [] __get_node_page+0x84/0x764
[ 7858.792986] [] f2fs_iget+0x264/0xdc8
[ 7858.792991] [] f2fs_lookup+0x3b4/0x660
[ 7858.792998] [] lookup_slow+0x1e4/0x348
[ 7858.793003] [] walk_component+0x21c/0x320
[ 7858.793008] [] path_lookupat+0x90/0x1bc
[ 7858.793013] [] filename_lookup+0x8c/0x1a0
[ 7858.793018] [] vfs_fstatat+0x84/0x10c
[ 7858.793023] [] SyS_newfstatat+0x28/0x64

The v4.9 kernel already has d3798ae8c6f3 ("mm: filemap: don't
plant shadow entries without radix tree node") so I thought
it should be okay. When I was googling, I found others reporting
the same problem, and I think the current kernel still has it.

https://bugzilla.redhat.com/show_bug.cgi?id=1431567
https://bugzilla.redhat.com/show_bug.cgi?id=1420335

The shadow entry code of the radix tree relies on the invariant that a
newly allocated node has node->private_list in the list_empty state.
Currently, it's initialized in the SLAB constructor, which means a radix
tree node is initialized only when *slub allocates a new page*, not a
*new object*. So, if some FS or subsystem passes __GFP_ZERO in the
gfp_mask, the slub allocator will memset the object blindly.
That means an allocated node can have !list_empty(node->private_list).
It ends up with a NULL dereference in workingset_update_node because
the list_empty check fails.

This patch should fix it.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Reported-by: Chris Fries 
Cc: Johannes Weiner 
Cc: Jan Kara 
Signed-off-by: Minchan Kim 
---
If it is reviewed and proved with testing, I will resend the patch to
Ccing sta...@vger.kernel.org.

Thanks.

 lib/radix-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 8e00138d593f..afcbdb6c495f 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -428,6 +428,7 @@ radix_tree_node_alloc(gfp_t gfp_mask, struct 
radix_tree_node *parent,
ret->exceptional = exceptional;
ret->parent = parent;
ret->root = root;
+   INIT_LIST_HEAD(&ret->private_list);
}
return ret;
 }
@@ -2234,7 +2235,6 @@ radix_tree_node_ctor(void *arg)
struct radix_tree_node *node = arg;
 
memset(node, 0, sizeof(*node));
-   INIT_LIST_HEAD(&node->private_list);
 }
 
 static __init unsigned long __maxindex(unsigned int height)
-- 
2.17.0.484.g0c8726318c-goog
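
A standalone illustration of the failure mode described above (the struct,
list_head and helpers are re-implemented here purely for the demo): a
constructor that runs only when the backing memory is first set up cannot
keep an invariant alive across a zeroing allocation, so the list head has to
be re-initialised per object, which is what the patch does.

  #include <stdio.h>
  #include <string.h>

  struct list_head { struct list_head *next, *prev; };

  static void init_list_head(struct list_head *h) { h->next = h->prev = h; }
  static int  is_list_empty(const struct list_head *h) { return h->next == h; }

  struct node { long count; struct list_head private_list; };

  int main(void)
  {
      struct node n;

      /* "constructor": runs once, when the backing page is first set up */
      memset(&n, 0, sizeof(n));
      init_list_head(&n.private_list);
      printf("after ctor:           empty=%d\n", is_list_empty(&n.private_list));

      /* later allocation with __GFP_ZERO: the zeroing clobbers the list head */
      memset(&n, 0, sizeof(n));
      printf("after zeroing alloc:  empty=%d (invariant broken)\n",
             is_list_empty(&n.private_list));

      /* the fix: initialise the list head on every object allocation */
      init_list_head(&n.private_list);
      printf("after per-alloc init: empty=%d\n", is_list_empty(&n.private_list));
      return 0;
  }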



[PATCH AUTOSEL for 4.9 212/293] NFSv4: always set NFS_LOCK_LOST when a lock is lost.

2018-04-08 Thread Sasha Levin
From: NeilBrown 

[ Upstream commit dce2630c7da73b0634686bca557cc8945cc450c8 ]

There are 2 comments in the NFSv4 code which suggest that
SIGLOST should possibly be sent to a process.  In these
cases a lock has been lost.
The current practice is to set NFS_LOCK_LOST so that
read/write returns EIO when a lock is lost.
So change these comments to code which sets NFS_LOCK_LOST.

One case is when lock recovery after apparent server restart
fails with NFS4ERR_DENIED, NFS4ERR_RECLAIM_BAD, or
NFS4ERR_RECLAIM_CONFLICT.  The other case is when a lock
attempt as part of lease recovery fails with NFS4ERR_DENIED.

In an ideal world, these should not happen.  However I have
a packet trace showing an NFSv4.1 session getting
NFS4ERR_BADSESSION after an extended network partition.  The
NFSv4.1 client treats this like server reboot until/unless
it gets NFS4ERR_NO_GRACE, in which case it switches over to
"nograce" recovery mode.  In this network trace, the client
attempts to recover a lock and the server (incorrectly)
reports NFS4ERR_DENIED rather than NFS4ERR_NO_GRACE.  This
leads to the ineffective comment and the client then
continues to write using the OPEN stateid.

Signed-off-by: NeilBrown 
Signed-off-by: Trond Myklebust 
Signed-off-by: Sasha Levin 
---
 fs/nfs/nfs4proc.c  | 12 
 fs/nfs/nfs4state.c |  5 -
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 4638654e26f3..883662d25714 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1934,7 +1934,7 @@ static int nfs4_open_reclaim(struct nfs4_state_owner *sp, 
struct nfs4_state *sta
return ret;
 }
 
-static int nfs4_handle_delegation_recall_error(struct nfs_server *server, 
struct nfs4_state *state, const nfs4_stateid *stateid, int err)
+static int nfs4_handle_delegation_recall_error(struct nfs_server *server, 
struct nfs4_state *state, const nfs4_stateid *stateid, struct file_lock *fl, 
int err)
 {
switch (err) {
default:
@@ -1981,7 +1981,11 @@ static int nfs4_handle_delegation_recall_error(struct 
nfs_server *server, struct
return -EAGAIN;
case -ENOMEM:
case -NFS4ERR_DENIED:
-   /* kill_proc(fl->fl_pid, SIGLOST, 1); */
+   if (fl) {
+   struct nfs4_lock_state *lsp = 
fl->fl_u.nfs4_fl.owner;
+   if (lsp)
+   set_bit(NFS_LOCK_LOST, &lsp->ls_flags);
+   }
return 0;
}
return err;
@@ -2017,7 +2021,7 @@ int nfs4_open_delegation_recall(struct nfs_open_context 
*ctx,
err = nfs4_open_recover_helper(opendata, FMODE_READ);
}
nfs4_opendata_put(opendata);
-   return nfs4_handle_delegation_recall_error(server, state, stateid, err);
+   return nfs4_handle_delegation_recall_error(server, state, stateid, 
NULL, err);
 }
 
 static void nfs4_open_confirm_prepare(struct rpc_task *task, void *calldata)
@@ -6493,7 +6497,7 @@ int nfs4_lock_delegation_recall(struct file_lock *fl, 
struct nfs4_state *state,
if (err != 0)
return err;
err = _nfs4_do_setlk(state, F_SETLK, fl, NFS_LOCK_NEW);
-   return nfs4_handle_delegation_recall_error(server, state, stateid, err);
+   return nfs4_handle_delegation_recall_error(server, state, stateid, fl, 
err);
 }
 
 struct nfs_release_lockowner_data {
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 71deeae6eefd..cfd1222ef303 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1429,6 +1429,7 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, 
const struct nfs4_state_
struct inode *inode = state->inode;
struct nfs_inode *nfsi = NFS_I(inode);
struct file_lock *fl;
+   struct nfs4_lock_state *lsp;
int status = 0;
struct file_lock_context *flctx = inode->i_flctx;
struct list_head *list;
@@ -1469,7 +1470,9 @@ restart:
case -NFS4ERR_DENIED:
case -NFS4ERR_RECLAIM_BAD:
case -NFS4ERR_RECLAIM_CONFLICT:
-   /* kill_proc(fl->fl_pid, SIGLOST, 1); */
+   lsp = fl->fl_u.nfs4_fl.owner;
+   if (lsp)
+   set_bit(NFS_LOCK_LOST, &lsp->ls_flags);
status = 0;
}
	spin_lock(&flctx->flc_lock);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 213/293] ALSA: hda - Use IS_REACHABLE() for dependency on input

2018-04-08 Thread Sasha Levin
From: Takashi Iwai 

[ Upstream commit c469652bb5e8fb715db7d152f46d33b3740c9b87 ]

The commit ffcd28d88e4f ("ALSA: hda - Select INPUT for Realtek
HD-audio codec") introduced the reverse-selection of CONFIG_INPUT for
Realtek codec in order to avoid the mess with dependency between
built-in and modules.  Later on, we obtained the IS_REACHABLE() macro
exactly for this kind of problem, and now we can remove the INPUT
selection in Kconfig and put IS_REACHABLE(CONFIG_INPUT) in the appropriate
places in the code, so that the driver doesn't need to forcibly select
another subsystem.

Fixes: ffcd28d88e4f ("ALSA: hda - Select INPUT for Realtek HD-audio codec")
Reported-by: Randy Dunlap 
Acked-by: Randy Dunlap  # and build-tested
Signed-off-by: Takashi Iwai 
Signed-off-by: Sasha Levin 
---
 sound/pci/hda/Kconfig | 1 -
 sound/pci/hda/patch_realtek.c | 5 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/sound/pci/hda/Kconfig b/sound/pci/hda/Kconfig
index 7f3b5ed81995..f7a492c382d9 100644
--- a/sound/pci/hda/Kconfig
+++ b/sound/pci/hda/Kconfig
@@ -88,7 +88,6 @@ config SND_HDA_PATCH_LOADER
 config SND_HDA_CODEC_REALTEK
tristate "Build Realtek HD-audio codec support"
select SND_HDA_GENERIC
-   select INPUT
help
  Say Y or M here to include Realtek HD-audio codec support in
  snd-hda-intel driver, such as ALC880.
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index e2230bed7409..242c80efbbef 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -3494,6 +3494,7 @@ static void alc280_fixup_hp_gpio4(struct hda_codec *codec,
}
 }
 
+#if IS_REACHABLE(CONFIG_INPUT)
 static void gpio2_mic_hotkey_event(struct hda_codec *codec,
   struct hda_jack_callback *event)
 {
@@ -3626,6 +3627,10 @@ static void alc233_fixup_lenovo_line2_mic_hotkey(struct 
hda_codec *codec,
spec->kb_dev = NULL;
}
 }
+#else /* INPUT */
+#define alc280_fixup_hp_gpio2_mic_hotkey   NULL
+#define alc233_fixup_lenovo_line2_mic_hotkey   NULL
+#endif /* INPUT */
 
 static void alc269_fixup_hp_line1_mic1_led(struct hda_codec *codec,
const struct hda_fixup *fix, int action)
-- 
2.15.1


[PATCH AUTOSEL for 4.9 214/293] ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read()

2018-04-08 Thread Sasha Levin
From: Dan Carpenter 

[ Upstream commit 123af9043e93cb6f235207d260d50f832cdb5439 ]

The loop timeout doesn't work because it's a post op and ends with "tmo"
set to -1.  I changed it from a post-op to a pre-op and I changed the
starting value from 5 to 6 so we still iterate 5 times.  I
left the other as it was because it's a large number.

Fixes: b3c70c9ea62a ("ASoC: Alchemy AC97C/I2SC audio support")
Signed-off-by: Dan Carpenter 
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/au1x/ac97c.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/sound/soc/au1x/ac97c.c b/sound/soc/au1x/ac97c.c
index 29a97d52e8ad..66d6c52e7761 100644
--- a/sound/soc/au1x/ac97c.c
+++ b/sound/soc/au1x/ac97c.c
@@ -91,8 +91,8 @@ static unsigned short au1xac97c_ac97_read(struct snd_ac97 
*ac97,
do {
		mutex_lock(&ctx->lock);
 
-   tmo = 5;
-   while ((RD(ctx, AC97_STATUS) & STAT_CP) && tmo--)
+   tmo = 6;
+   while ((RD(ctx, AC97_STATUS) & STAT_CP) && --tmo)
udelay(21); /* wait an ac97 frame time */
if (!tmo) {
pr_debug("ac97rd timeout #1\n");
@@ -105,7 +105,7 @@ static unsigned short au1xac97c_ac97_read(struct snd_ac97 
*ac97,
 * poll, Forrest, poll...
 */
tmo = 0x1;
-   while ((RD(ctx, AC97_STATUS) & STAT_CP) && tmo--)
+   while ((RD(ctx, AC97_STATUS) & STAT_CP) && --tmo)
asm volatile ("nop");
data = RD(ctx, AC97_CMDRESP);
 
-- 
2.15.1
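
The difference between the two loop shapes is easy to demonstrate standalone;
the "busy" flag below stands in for a status register that never clears, so
only the timeout can terminate the loop. With the post-decrement the loop
ends with tmo at -1 and the "if (!tmo)" timeout check never fires.

  #include <stdio.h>

  int main(void)
  {
      int busy = 1;   /* stand-in for a status bit that never clears */
      int tmo;

      tmo = 5;
      while (busy && tmo--)
          ;           /* post-decrement: exits with tmo == -1 */
      printf("post-decrement: tmo=%d, timeout detected=%d\n", tmo, !tmo);

      tmo = 6;
      while (busy && --tmo)
          ;           /* pre-decrement: exits with tmo == 0 after 5 passes */
      printf("pre-decrement:  tmo=%d, timeout detected=%d\n", tmo, !tmo);
      return 0;
  }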


Re: linux-next: manual merge of the scsi-mkp tree with the efi-lock-down tree

2018-04-08 Thread Stephen Rothwell
Hi all,

On Fri, 6 Apr 2018 09:22:16 +1000 Stephen Rothwell  
wrote:
>
> On Thu, 15 Mar 2018 18:34:12 +1100 Stephen Rothwell  
> wrote:
> >
> > Today's linux-next merge of the scsi-mkp tree got a conflict in:
> > 
> >   drivers/scsi/eata.c
> > 
> > between commit:
> > 
> >   5b76b160badb ("scsi: Lock down the eata driver")
> > 
> > from the efi-lock-down tree and commit:
> > 
> >   6b1745caa14a ("scsi: eata: eata-pio: Deprecate legacy EATA drivers")
> > 
> > from the scsi-mkp tree.
> > 
> > I fixed it up (I just removed the file) and can carry the fix as
> > necessary. This is now fixed as far as linux-next is concerned, but any
> > non trivial conflicts should be mentioned to your upstream maintainer
> > when your tree is submitted for merging.  You may also want to consider
> > cooperating with the maintainer of the conflicting tree to minimise any
> > particularly complex conflicts.  
> 
> This is now a conflict between the efi-lock-down tree and Linus' tree.

This is now a conflict between the security tree and Linus' tree.

-- 
Cheers,
Stephen Rothwell


pgpYavBe_Q9hY.pgp
Description: OpenPGP digital signature


[PATCH AUTOSEL for 4.9 198/293] powerpc64/elfv1: Only dereference function descriptor for non-text symbols

2018-04-08 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit 83e840c770f2c578bbbff478d62a4403c073b438 ]

Currently, we assume that the function pointer we receive in
ppc_function_entry() points to a function descriptor. However, this is
not always the case. In particular, assembly symbols without the right
annotation do not have an associated function descriptor. Some of these
symbols are added to the kprobe blacklist using _ASM_NOKPROBE_SYMBOL().

When such addresses are subsequently processed through
arch_deref_entry_point() in populate_kprobe_blacklist(), we see the
below errors during bootup:
[0.663963] Failed to find blacklist at 7d9b02a648029b6c
[0.663970] Failed to find blacklist at a14d03d0394a0001
[0.663972] Failed to find blacklist at 7d5302a6f94d0388
[0.663973] Failed to find blacklist at 48027d11e8610178
[0.663974] Failed to find blacklist at f8010070f8410080
[0.663976] Failed to find blacklist at 386100704801f89d
[0.663977] Failed to find blacklist at 7d5302a6f94d00b0

Fix this by checking if the function pointer we receive in
ppc_function_entry() already points to kernel text. If so, we just
return it as is. If not, we assume that this is a function descriptor
and proceed to dereference it.

Suggested-by: Nicholas Piggin 
Reviewed-by: Nicholas Piggin 
Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/code-patching.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/code-patching.h 
b/arch/powerpc/include/asm/code-patching.h
index b4ab1f497335..a96e4ad380d1 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -80,8 +80,16 @@ static inline unsigned long ppc_function_entry(void *func)
 * On PPC64 ABIv1 the function pointer actually points to the
 * function's descriptor. The first entry in the descriptor is the
 * address of the function text.
+*
+* However, we may also receive pointer to an assembly symbol. To
+* detect that, we first check if the function pointer we receive
+* already points to kernel/module text and we only dereference it
+* if it doesn't.
 */
-   return ((func_descr_t *)func)->entry;
+   if (kernel_text_address((unsigned long)func))
+   return (unsigned long)func;
+   else
+   return ((func_descr_t *)func)->entry;
 #else
return (unsigned long)func;
 #endif
-- 
2.15.1


[PATCH AUTOSEL for 4.9 208/293] perf unwind: Do not fail due to missing unwind support

2018-04-08 Thread Sasha Levin
From: Jiri Olsa 

[ Upstream commit 1934adf78e33fa69570a763c7ac5353212416bb0 ]

We currently fail the MMAP event processing if we don't have the MMAP
event's specific arch unwind support compiled in.

That's wrong and can lead to unresolved mmaps in report output for 32bit
binaries on 64bit server, like in this example on x86_64 server:

  $ cat ex.c
  int main(int argc, char **argv)
  {
  while (1) {}
  }
  $ gcc -o ex -m32 ex.c
  $ perf record ./ex
  ^C[ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.371 MB perf.data (9322 samples) ]

Before:
  $ perf report --stdio

  SNIP

  # Overhead  Command  Shared Object Symbol
  #   ...    ..
  #
 100.00%  ex   [unknown] [.] 0x080483de
   0.00%  ex   [unknown] [.] 0xf76dba4f
   0.00%  ex   [unknown] [.] 0xf76e4c11
   0.00%  ex   [unknown] [.] 0xf76daa30

After:
  $ perf report --stdio

  SNIP

  # Overhead  Command  Shared Object  Symbol
  #   ...  .  ...
  #
 100.00%  ex   ex [.] main
   0.00%  ex   ld-2.24.so [.] _dl_start
   0.00%  ex   ld-2.24.so [.] do_lookup_x
   0.00%  ex   ld-2.24.so [.] _start

The fix is not to fail, just warn if there's no unwind support compiled
in.

Reported-by: Michael Lyle 
Signed-off-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: David Ahern 
Cc: He Kuang 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20170704131131.27508-1-jo...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
---
 tools/perf/util/unwind-libunwind.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/unwind-libunwind.c 
b/tools/perf/util/unwind-libunwind.c
index 6d542a4e0648..8aef572d0889 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -50,7 +50,7 @@ int unwind__prepare_access(struct thread *thread, struct map 
*map,
 
if (!ops) {
pr_err("unwind: target platform=%s is not supported\n", arch);
-   return -1;
+   return 0;
}
 out_register:
unwind__register_ops(thread, ops);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 207/293] perf evsel: Set attr.exclude_kernel when probing max attr.precise_ip

2018-04-08 Thread Sasha Levin
From: Arnaldo Carvalho de Melo 

[ Upstream commit 97365e81366f5ca16a9ce66cff4dd4c5b0d9f4db ]

We should set attr.exclude_kernel when probing for attr.precise_ip
level, otherwise !CAP_SYS_ADMIN users will not default to skidless
samples in capable hardware.

The increase in the paranoid level in commit 0161028b7c8a ("perf/core:
Change the default paranoia level to 2") broke this, fix it by excluding
kernel samples when probing.

Before:

  $ perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.018 MB perf.data (6 samples) ]
  $ perf evlist -v
  cycles:u: sample_freq: 4000, sample_type: IP|TID|TIME|PERIOD, exclude_kernel: 
1

After:

  $ perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
  $ perf evlist -v
  cycles:ppp: sample_freq: 4000, sample_type: IP|TID|TIME|PERIOD, 
exclude_kernel: 1, precise_ip: 3

  $

To further clarify: we always set attr.exclude_kernel when !CAP_SYS_ADMIN
users profile; it's just on the attr.precise_ip probing that we weren't doing
so, fix it.

Cc: Adrian Hunter 
Cc: Andy Lutomirski 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Fixes: 7f8d1ade1b19 ("perf tools: By default use the most precise "cycles" hw 
counter available")
Link: http://lkml.kernel.org/n/tip-t2qttwhbnua62o5gt75cu...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: Sasha Levin 
---
 tools/perf/util/evsel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 97fe0c80ff02..1c1291afd6a6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -258,6 +258,7 @@ struct perf_evsel *perf_evsel__new_cycles(void)
struct perf_event_attr attr = {
.type   = PERF_TYPE_HARDWARE,
.config = PERF_COUNT_HW_CPU_CYCLES,
+   .exclude_kernel = 1,
};
struct perf_evsel *evsel;
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 205/293] irqchip/gic-v3: Report failures in gic_irq_domain_alloc

2018-04-08 Thread Sasha Levin
From: Suzuki K Poulose 

[ Upstream commit 63c16c6eacb69d0cbdaee5dea0dd56d238375fe6 ]

If the GIC cannot map an IRQ via irq_domain_ops->alloc(), it doesn't
return an error code.  This can cause a problem with drivers: a driver
thinks it has successfully got an IRQ for the device, but requesting
the same IRQ ends up failing with -ENOSYS (as the IRQ's chip is not set).

Fixes: commit 443acc4f37f6 ("irqchip: GICv3: Convert to domain hierarchy")
Cc: Marc Zyngier 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 drivers/irqchip/irq-gic-v3.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index fd4a78296b48..8c7f02318a6b 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -832,8 +832,11 @@ static int gic_irq_domain_alloc(struct irq_domain *domain, 
unsigned int virq,
if (ret)
return ret;
 
-   for (i = 0; i < nr_irqs; i++)
-   gic_irq_domain_map(domain, virq + i, hwirq + i);
+   for (i = 0; i < nr_irqs; i++) {
+   ret = gic_irq_domain_map(domain, virq + i, hwirq + i);
+   if (ret)
+   return ret;
+   }
 
return 0;
 }
-- 
2.15.1


[PATCH AUTOSEL for 4.9 203/293] f2fs: fix to avoid panic when encountering corrupt node

2018-04-08 Thread Sasha Levin
From: Chao Yu 

[ Upstream commit 1f258ec13b82d3d947b515a007a748ffcbe29f9a ]

With the fault_injection option, generic/361 of fstests complains with
the message below:

Call Trace:
 get_node_page+0x12/0x20 [f2fs]
 f2fs_iget+0x92/0x7d0 [f2fs]
 f2fs_fill_super+0x10fb/0x15e0 [f2fs]
 mount_bdev+0x184/0x1c0
 f2fs_mount+0x15/0x20 [f2fs]
 mount_fs+0x39/0x150
 vfs_kern_mount+0x67/0x110
 do_mount+0x1bb/0xc70
 SyS_mount+0x83/0xd0
 do_syscall_64+0x6e/0x160
 entry_SYSCALL64_slow_path+0x25/0x25

Since mkfs on a loop device in an f2fs partition can fail silently due to
checkpoint error injection, the root inode page can be corrupted. In order
to avoid a needless panic in get_node_page, it's better to log a message,
return an error to the caller, and let fsck repair it later.

Signed-off-by: Chao Yu 
Signed-off-by: Jaegeuk Kim 
Signed-off-by: Sasha Levin 
---
 fs/f2fs/node.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 01177ecdeab8..db787d7a2b9d 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1139,6 +1139,7 @@ repeat:
f2fs_put_page(page, 1);
return ERR_PTR(err);
} else if (err == LOCKED_PAGE) {
+   err = 0;
goto page_hit;
}
 
@@ -1152,15 +1153,22 @@ repeat:
goto repeat;
}
 
-   if (unlikely(!PageUptodate(page)))
+   if (unlikely(!PageUptodate(page))) {
+   err = -EIO;
goto out_err;
+   }
 page_hit:
if(unlikely(nid != nid_of_node(page))) {
-   f2fs_bug_on(sbi, 1);
+   f2fs_msg(sbi->sb, KERN_WARNING, "inconsistent node block, "
+   "nid:%lu, 
node_footer[nid:%u,ino:%u,ofs:%u,cpver:%llu,blkaddr:%u]",
+   nid, nid_of_node(page), ino_of_node(page),
+   ofs_of_node(page), cpver_of_node(page),
+   next_blkaddr_of_node(page));
ClearPageUptodate(page);
+   err = -EINVAL;
 out_err:
f2fs_put_page(page, 1);
-   return ERR_PTR(-EIO);
+   return ERR_PTR(err);
}
return page;
 }
-- 
2.15.1


[PATCH AUTOSEL for 4.9 206/293] irqchip/gic-v3: Honor forced affinity setting

2018-04-08 Thread Sasha Levin
From: Suzuki K Poulose 

[ Upstream commit 65a30f8b300107266f316d550f060ccc186201a3 ]

Honor the 'force' flag for set_affinity, by selecting a CPU
from the given mask (which may not be reported "online" by
the cpu_online_mask). Some drivers, like ARM PMU, rely on it.

Cc: Marc Zyngier 
Reported-by: Mark Rutland 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 drivers/irqchip/irq-gic-v3.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 8c7f02318a6b..0ef240c64c65 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -641,11 +641,16 @@ static void gic_smp_init(void)
 static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
bool force)
 {
-   unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
+   unsigned int cpu;
void __iomem *reg;
int enabled;
u64 val;
 
+   if (force)
+   cpu = cpumask_first(mask_val);
+   else
+   cpu = cpumask_any_and(mask_val, cpu_online_mask);
+
if (cpu >= nr_cpu_ids)
return -EINVAL;
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 202/293] bridge: allow ext learned entries to change ports

2018-04-08 Thread Sasha Levin
From: Nikolay Aleksandrov 

[ Upstream commit 7597b266c56feaad7d4e6e65822766e929407da2 ]

The current code silently ignores a change of port in the request
message. This patch makes sure the port is modified and a
notification is sent to userspace.

Fixes: cf6b8e1eedff ("bridge: add API to notify bridge driver of learned FBD on 
offloaded device")
Signed-off-by: Nikolay Aleksandrov 
Signed-off-by: Roopa Prabhu 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/bridge/br_fdb.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6b43c8c88f19..f32b8138f9c8 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1101,8 +1101,9 @@ void br_fdb_unsync_static(struct net_bridge *br, struct 
net_bridge_port *p)
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
  const unsigned char *addr, u16 vid)
 {
-   struct hlist_head *head;
struct net_bridge_fdb_entry *fdb;
+   struct hlist_head *head;
+   bool modified = false;
int err = 0;
 
ASSERT_RTNL();
@@ -1118,14 +1119,25 @@ int br_fdb_external_learn_add(struct net_bridge *br, 
struct net_bridge_port *p,
}
fdb->added_by_external_learn = 1;
fdb_notify(br, fdb, RTM_NEWNEIGH);
-   } else if (fdb->added_by_external_learn) {
-   /* Refresh entry */
-   fdb->updated = fdb->used = jiffies;
-   } else if (!fdb->added_by_user) {
-   /* Take over SW learned entry */
-   fdb->added_by_external_learn = 1;
+   } else {
fdb->updated = jiffies;
-   fdb_notify(br, fdb, RTM_NEWNEIGH);
+
+   if (fdb->dst != p) {
+   fdb->dst = p;
+   modified = true;
+   }
+
+   if (fdb->added_by_external_learn) {
+   /* Refresh entry */
+   fdb->used = jiffies;
+   } else if (!fdb->added_by_user) {
+   /* Take over SW learned entry */
+   fdb->added_by_external_learn = 1;
+   modified = true;
+   }
+
+   if (modified)
+   fdb_notify(br, fdb, RTM_NEWNEIGH);
}
 
 err_unlock:
-- 
2.15.1


[PATCH AUTOSEL for 4.9 204/293] irqchip/gic-v2: Report failures in gic_irq_domain_alloc

2018-04-08 Thread Sasha Levin
From: Suzuki K Poulose 

[ Upstream commit 456c59c31c5126fe31c64956c43670060ea9debd ]

If the GIC cannot map an IRQ via irq_domain_ops->alloc(), it doesn't
return an error code.  This can cause a problem with drivers: a driver
thinks it has successfully got an IRQ for the device, but requesting
the same IRQ ends up failing with -ENOSYS (as the IRQ's chip is not set).

Fixes: commit 9a1091ef0017c ("irqchip: gic: Support hierarchy irq domain.")
Cc: Yingjoe Chen 
Cc: Marc Zyngier 
Signed-off-by: Suzuki K Poulose 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 drivers/irqchip/irq-gic.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d6c404b3584d..230a4da1e196 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -1027,8 +1027,11 @@ static int gic_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
if (ret)
return ret;
 
-   for (i = 0; i < nr_irqs; i++)
-   gic_irq_domain_map(domain, virq + i, hwirq + i);
+   for (i = 0; i < nr_irqs; i++) {
+   ret = gic_irq_domain_map(domain, virq + i, hwirq + i);
+   if (ret)
+   return ret;
+   }
 
return 0;
 }
-- 
2.15.1


[PATCH AUTOSEL for 4.9 201/293] net: ethernet: mediatek: fixed deadlock captured by lockdep

2018-04-08 Thread Sasha Levin
From: Sean Wang 

[ Upstream commit 8d32e0624392bb4abfbe122f754757a4cb326d7f ]

Lockdep found an inconsistent lock state when mtk_get_stats64 is called
in user context while NAPI updates MAC statistics in softirq.

Use spin_trylock_bh/spin_unlock_bh to fix the following lockdep warning.

[   81.321030] WARNING: inconsistent lock state
[   81.325266] 4.12.0-rc1-00035-gd9dda65 #32 Not tainted
[   81.330273] 
[   81.334505] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[   81.340464] ksoftirqd/0/7 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   81.345731]  (>seq#2){+.?...}, at: [] 
mtk_handle_status_irq.part.6+0x70/0x84
[   81.354219] {SOFTIRQ-ON-W} state was registered at:
[   81.359062]   lock_acquire+0xfc/0x2b0
[   81.362696]   mtk_stats_update_mac+0x60/0x2c0
[   81.367017]   mtk_get_stats64+0x17c/0x18c
[   81.370995]   dev_get_stats+0x48/0xbc
[   81.374628]   rtnl_fill_stats+0x48/0x128
[   81.378520]   rtnl_fill_ifinfo+0x4ac/0xd1c
[   81.382584]   rtmsg_ifinfo_build_skb+0x7c/0xe0
[   81.386991]   rtmsg_ifinfo.part.5+0x24/0x54
[   81.391139]   rtmsg_ifinfo+0x24/0x28
[   81.394685]   __dev_notify_flags+0xa4/0xac
[   81.398749]   dev_change_flags+0x50/0x58
[   81.402640]   devinet_ioctl+0x768/0x85c
[   81.406444]   inet_ioctl+0x1a4/0x1d0
[   81.409990]   sock_ioctl+0x16c/0x33c
[   81.413538]   do_vfs_ioctl+0xb4/0xa34
[   81.417169]   SyS_ioctl+0x44/0x6c
[   81.420458]   ret_fast_syscall+0x0/0x1c
[   81.424260] irq event stamp: 3354692
[   81.427806] hardirqs last  enabled at (3354692): [] 
net_rx_action+0xc0/0x504
[   81.435660] hardirqs last disabled at (3354691): [] 
net_rx_action+0x8c/0x504
[   81.443515] softirqs last  enabled at (3354106): [] 
__do_softirq+0x4b4/0x614
[   81.451370] softirqs last disabled at (3354109): [] 
run_ksoftirqd+0x44/0x80
[   81.459134]
[   81.459134] other info that might help us debug this:
[   81.465608]  Possible unsafe locking scenario:
[   81.465608]
[   81.471478]CPU0
[   81.473900]
[   81.476321]   lock(>seq#2);
[   81.479701]   
[   81.482294] lock(>seq#2);
[   81.485847]
[   81.485847]  *** DEADLOCK ***
[   81.485847]
[   81.491720] 1 lock held by ksoftirqd/0/7:
[   81.495693]  #0:  (&(>hw_stats->stats_lock)->rlock){+.+...}, at: 
[] mtk_handle_status_irq.part.6+0x48/0x84
[   81.506579]
[   81.506579] stack backtrace:
[   81.510904] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 
4.12.0-rc1-00035-gd9dda65 #32
[   81.518668] Hardware name: Mediatek Cortex-A7 (Device Tree)
[   81.524208] [] (unwind_backtrace) from [] 
(show_stack+0x20/0x24)
[   81.531899] [] (show_stack) from [] 
(dump_stack+0xb4/0xe0)
[   81.539072] [] (dump_stack) from [] 
(print_usage_bug+0x234/0x2e0)
[   81.546846] [] (print_usage_bug) from [] 
(mark_lock+0x63c/0x7bc)
[   81.554532] [] (mark_lock) from [] 
(__lock_acquire+0x654/0x1bfc)
[   81.562217] [] (__lock_acquire) from [] 
(lock_acquire+0xfc/0x2b0)
[   81.569990] [] (lock_acquire) from [] 
(mtk_stats_update_mac+0x60/0x2c0)
[   81.578283] [] (mtk_stats_update_mac) from [] 
(mtk_handle_status_irq.part.6+0x70/0x84)
[   81.587865] [] (mtk_handle_status_irq.part.6) from [] 
(mtk_napi_tx+0x358/0x37c)
[   81.596845] [] (mtk_napi_tx) from [] 
(net_rx_action+0x244/0x504)
[   81.604533] [] (net_rx_action) from [] 
(__do_softirq+0x134/0x614)
[   81.612306] [] (__do_softirq) from [] 
(run_ksoftirqd+0x44/0x80)
[   81.619907] [] (run_ksoftirqd) from [] 
(smpboot_thread_fn+0x14c/0x25c)
[   81.628110] [] (smpboot_thread_fn) from [] 
(kthread+0x150/0x180)
[   81.635798] [] (kthread) from [] 
(ret_from_fork+0x14/0x24)

Signed-off-by: Sean Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 20de37a414fe..d76f65f9d8dd 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -470,9 +470,9 @@ static struct rtnl_link_stats64 *mtk_get_stats64(struct 
net_device *dev,
unsigned int start;
 
if (netif_running(dev) && netif_device_present(dev)) {
-   if (spin_trylock(&hw_stats->stats_lock)) {
+   if (spin_trylock_bh(&hw_stats->stats_lock)) {
mtk_stats_update_mac(mac);
-   spin_unlock(&hw_stats->stats_lock);
+   spin_unlock_bh(&hw_stats->stats_lock);
}
}
 
@@ -2151,9 +2151,9 @@ static void mtk_get_ethtool_stats(struct net_device *dev,
return;
 
if (netif_running(dev) && netif_device_present(dev)) {
-   if (spin_trylock(&hwstats->stats_lock)) {
+   if (spin_trylock_bh(&hwstats->stats_lock)) {
mtk_stats_update_mac(mac);
-   spin_unlock(&hwstats->stats_lock);
+  

[PATCH AUTOSEL for 4.9 199/293] block: guard bvec iteration logic

2018-04-08 Thread Sasha Levin
From: Dmitry Monakhov 

[ Upstream commit b1fb2c52b2d85f51f36f1661409f9aeef94265ff ]

Currently, if someone tries to advance a bvec beyond its size we simply
dump a WARN_ONCE and continue to iterate beyond the bvec array boundaries.
This simply means that we end up dereferencing/corrupting a random memory
region.

The sane reaction would be to propagate the error back to the calling
context, but bvec_iter_advance's calling context is not always good for
error handling. For safety, truncate the iterator size to zero, which
breaks the external iteration loop and prevents unpredictable memory range
corruption. And even if the caller ignores the error, it will corrupt its
own bvecs, not others.

This patch does:
- Return an error back to the caller in the hope that it will react to it
- Truncate the iterator size

The code was added a long time ago in 4550dd6c; luckily no one hit it
in real life :)

Signed-off-by: Dmitry Monakhov 
Reviewed-by: Ming Lei 
Reviewed-by: Martin K. Petersen 
[hch: switch to true/false returns instead of errno values]
Signed-off-by: Christoph Hellwig 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/nvdimm/blk.c |  3 ++-
 drivers/nvdimm/btt.c |  3 ++-
 include/linux/bio.h  |  4 +++-
 include/linux/bvec.h | 14 +-
 4 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 77db9795510f..ac6d6771d47c 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -106,7 +106,8 @@ static int nd_blk_rw_integrity(struct nd_namespace_blk 
*nsblk,
 
len -= cur_len;
dev_offset += cur_len;
-   bvec_iter_advance(bip->bip_vec, >bip_iter, cur_len);
+   if (!bvec_iter_advance(bip->bip_vec, &bip->bip_iter, cur_len))
+   return -EIO;
}
 
return err;
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 0c46ada027cf..add695bc2cb9 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1075,7 +1075,8 @@ static int btt_rw_integrity(struct btt *btt, struct 
bio_integrity_payload *bip,
 
len -= cur_len;
meta_nsoff += cur_len;
-   bvec_iter_advance(bip->bip_vec, >bip_iter, cur_len);
+   if (!bvec_iter_advance(bip->bip_vec, &bip->bip_iter, cur_len))
+   return -EIO;
}
 
return ret;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 97cb48f03dc7..9a804d65a50e 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -171,8 +171,10 @@ static inline void bio_advance_iter(struct bio *bio, 
struct bvec_iter *iter,
 
if (bio_no_advance_iter(bio))
iter->bi_size -= bytes;
-   else
+   else {
bvec_iter_advance(bio->bi_io_vec, iter, bytes);
+   /* TODO: It is reasonable to complete bio with error here. */
+   }
 }
 
 #define __bio_for_each_segment(bvl, bio, iter, start)  \
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 89b65b82d98f..de317b4c13c1 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -22,6 +22,7 @@
 
 #include 
 #include 
+#include 
 
 /*
  * was unsigned short, but we might as well be ready for > 64kB I/O pages
@@ -66,12 +67,14 @@ struct bvec_iter {
.bv_offset  = bvec_iter_offset((bvec), (iter)), \
 })
 
-static inline void bvec_iter_advance(const struct bio_vec *bv,
-struct bvec_iter *iter,
-unsigned bytes)
+static inline bool bvec_iter_advance(const struct bio_vec *bv,
+   struct bvec_iter *iter, unsigned bytes)
 {
-   WARN_ONCE(bytes > iter->bi_size,
- "Attempted to advance past end of bvec iter\n");
+   if (WARN_ONCE(bytes > iter->bi_size,
+"Attempted to advance past end of bvec iter\n")) {
+   iter->bi_size = 0;
+   return false;
+   }
 
while (bytes) {
unsigned iter_len = bvec_iter_len(bv, *iter);
@@ -86,6 +89,7 @@ static inline void bvec_iter_advance(const struct bio_vec *bv,
iter->bi_idx++;
}
}
+   return true;
 }
 
 #define for_each_bvec(bvl, bio_vec, iter, start)   \
-- 
2.15.1
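
A toy iterator (types and names invented) showing the guarded-advance
behaviour the patch introduces: refuse to step past the end, clamp the
remaining size to zero so any outer loop terminates, and report the failure
to the caller instead of walking into unrelated memory.

  #include <stdio.h>
  #include <stdbool.h>

  struct iter { unsigned int size; unsigned int pos; };

  static bool iter_advance(struct iter *it, unsigned int bytes)
  {
      if (bytes > it->size) {
          fprintf(stderr, "attempt to advance past end of iter\n");
          it->size = 0;       /* clamp: breaks any external iteration loop */
          return false;
      }
      it->pos  += bytes;
      it->size -= bytes;
      return true;
  }

  int main(void)
  {
      struct iter it = { .size = 100, .pos = 0 };
      bool ok;

      ok = iter_advance(&it, 60);
      printf("advance 60: ok=%d size=%u\n", ok, it.size);
      ok = iter_advance(&it, 80);
      printf("advance 80: ok=%d size=%u\n", ok, it.size);
      return 0;
  }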


[PATCH AUTOSEL for 4.9 210/293] firewire-ohci: work around oversized DMA reads on JMicron controllers

2018-04-08 Thread Sasha Levin
From: Hector Martin 

[ Upstream commit 188775181bc05f29372b305ef96485840e351fde ]

At least some JMicron controllers issue buggy oversized DMA reads when
fetching context descriptors, always fetching 0x20 bytes at once for
descriptors which are only 0x10 bytes long. This is often harmless, but
can cause page faults on modern systems with IOMMUs:

DMAR: [DMA Read] Request device [05:00.0] fault addr fff56000 [fault reason 06] 
PTE Read access is not set
firewire_ohci :05:00.0: DMA context IT0 has stopped, error code: 
evt_descriptor_read

This works around the problem by always leaving 0x10 padding bytes at
the end of descriptor buffer pages, which should be harmless to do
unconditionally for controllers in case others have the same behavior.

Signed-off-by: Hector Martin 
Reviewed-by: Clemens Ladisch 
Signed-off-by: Stefan Richter 
Signed-off-by: Sasha Levin 
---
 drivers/firewire/ohci.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 8bf89267dc25..d731b413cb2c 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -1130,7 +1130,13 @@ static int context_add_buffer(struct context *ctx)
return -ENOMEM;
 
	offset = (void *)&desc->buffer - (void *)desc;
-   desc->buffer_size = PAGE_SIZE - offset;
+   /*
+* Some controllers, like JMicron ones, always issue 0x20-byte DMA reads
+* for descriptors, even 0x10-byte ones. This can cause page faults when
+* an IOMMU is in use and the oversized read crosses a page boundary.
+* Work around this by always leaving at least 0x10 bytes of padding.
+*/
+   desc->buffer_size = PAGE_SIZE - offset - 0x10;
desc->buffer_bus = bus_addr + offset;
desc->used = 0;
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 195/293] powerpc/perf/hv-24x7: Fix off-by-one error in request_buffer check

2018-04-08 Thread Sasha Levin
From: Thiago Jung Bauermann 

[ Upstream commit 36c8fb2c616d9373758b155d9723774353067a87 ]

request_buffer can hold 254 requests, so if it already has that number of
entries we can't add a new one.

Also, define a constant to show where the number comes from.

Fixes: e3ee15dc5d19 ("powerpc/perf/hv-24x7: Define add_event_to_24x7_request()")
Reviewed-by: Sukadev Bhattiprolu 
Signed-off-by: Thiago Jung Bauermann 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 13cea4e8c56d..9fc2be138e32 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -166,6 +166,10 @@ DEFINE_PER_CPU(struct hv_24x7_hw, hv_24x7_hw);
 DEFINE_PER_CPU(char, hv_24x7_reqb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 DEFINE_PER_CPU(char, hv_24x7_resb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 
+#define MAX_NUM_REQUESTS   ((H24x7_DATA_BUFFER_SIZE - \
+   sizeof(struct hv_24x7_request_buffer)) \
+   / sizeof(struct hv_24x7_request))
+
 static char *event_name(struct hv_24x7_event_data *ev, int *len)
 {
*len = be16_to_cpu(ev->event_name_len) - 2;
@@ -1107,7 +1111,7 @@ static int add_event_to_24x7_request(struct perf_event 
*event,
int i;
struct hv_24x7_request *req;
 
-   if (request_buffer->num_requests > 254) {
+   if (request_buffer->num_requests >= MAX_NUM_REQUESTS) {
pr_devel("Too many requests for 24x7 HCALL %d\n",
request_buffer->num_requests);
return -EINVAL;
-- 
2.15.1
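
The boundary condition in isolation; the capacity value is the one from the
patch, everything else below is invented. With 254 entries already queued,
the old "> 254" check still lets a 255th request in, while ">= 254" rejects
it.

  #include <stdio.h>

  #define MAX_NUM_REQUESTS 254

  static int add_request_broken(int count) { return count >  MAX_NUM_REQUESTS ? -1 : 0; }
  static int add_request_fixed(int count)  { return count >= MAX_NUM_REQUESTS ? -1 : 0; }

  int main(void)
  {
      int count = MAX_NUM_REQUESTS;   /* the buffer is already full */

      printf("broken check accepts a 255th request: %s\n",
             add_request_broken(count) == 0 ? "yes" : "no");
      printf("fixed check rejects it:               %s\n",
             add_request_fixed(count) < 0 ? "yes" : "no");
      return 0;
  }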


[PATCH AUTOSEL for 4.9 194/293] powerpc/perf/hv-24x7: Fix passing of catalog version number

2018-04-08 Thread Sasha Levin
From: Thiago Jung Bauermann 

[ Upstream commit 12bf85a71000af7419b19b5e90910919f36f336c ]

H_GET_24X7_CATALOG_PAGE needs to be passed the version number obtained
earlier from the first catalog page. This is a 64-bit number, but
create_events_from_catalog truncates it to 32 bits.

This worked on POWER8, but POWER9 actually uses the upper bits so the call
fails with H_P3 because the hypervisor doesn't recognize the version.

This patch also adds the hcall return code to the error message, which is
helpful when debugging the problem.

Fixes: 5c5cd7b50259 ("powerpc/perf/hv-24x7: parse catalog and populate sysfs 
with events")
Reviewed-by: Sukadev Bhattiprolu 
Signed-off-by: Thiago Jung Bauermann 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 991c6a517ddc..13cea4e8c56d 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -670,7 +670,7 @@ static int create_events_from_catalog(struct attribute 
***events_,
   event_data_bytes, junk_events, event_idx, event_attr_ct, i,
   attr_max, event_idx_last, desc_ct, long_desc_ct;
ssize_t ct, ev_len;
-   uint32_t catalog_version_num;
+   uint64_t catalog_version_num;
struct attribute **events, **event_descs, **event_long_descs;
struct hv_24x7_catalog_page_0 *page_0 =
kmem_cache_alloc(hv_page_cache, GFP_KERNEL);
@@ -706,8 +706,8 @@ static int create_events_from_catalog(struct attribute 
***events_,
event_data_offs   = be16_to_cpu(page_0->event_data_offs);
event_data_len= be16_to_cpu(page_0->event_data_len);
 
-   pr_devel("cv %zu cl %zu eec %zu edo %zu edl %zu\n",
-   (size_t)catalog_version_num, catalog_len,
+   pr_devel("cv %llu cl %zu eec %zu edo %zu edl %zu\n",
+   catalog_version_num, catalog_len,
event_entry_count, event_data_offs, event_data_len);
 
if ((MAX_4K < event_data_len)
@@ -761,8 +761,8 @@ static int create_events_from_catalog(struct attribute 
***events_,
catalog_version_num,
i + event_data_offs);
if (hret) {
-   pr_err("failed to get event data in page %zu\n",
-   i + event_data_offs);
+   pr_err("Failed to get event data in page %zu: rc=%ld\n",
+  i + event_data_offs, hret);
ret = -EIO;
goto e_event_data;
}
-- 
2.15.1


[PATCH AUTOSEL for 4.9 197/293] net: cdc_mbim: apply "NDP to end" quirk to HP lt4132

2018-04-08 Thread Sasha Levin
From: Tore Anderson 

[ Upstream commit a68491f895a937778bb25b0795830797239de31f ]

The HP lt4132 LTE/HSPA+ 4G Module (03f0:a31d) is a rebranded Huawei
ME906s-158 device. It, like the ME906s-158, requires the "NDP to end"
quirk for correct operation.

Signed-off-by: Tore Anderson 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/usb/cdc_mbim.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/usb/cdc_mbim.c b/drivers/net/usb/cdc_mbim.c
index 3a98f3762a4c..10862296c824 100644
--- a/drivers/net/usb/cdc_mbim.c
+++ b/drivers/net/usb/cdc_mbim.c
@@ -642,6 +642,13 @@ static const struct usb_device_id mbim_devs[] = {
  .driver_info = (unsigned long)&cdc_mbim_info_ndp_to_end,
},
 
+   /* The HP lt4132 (03f0:a31d) is a rebranded Huawei ME906s-158,
+* therefore it too requires the above "NDP to end" quirk.
+*/
+   { USB_DEVICE_AND_INTERFACE_INFO(0x03f0, 0xa31d, USB_CLASS_COMM, 
USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long)&cdc_mbim_info_ndp_to_end,
+   },
+
/* Telit LE922A6 in MBIM composition */
{ USB_DEVICE_AND_INTERFACE_INFO(0x1bc7, 0x1041, USB_CLASS_COMM, 
USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
  .driver_info = (unsigned long)&cdc_mbim_info_avoid_altsetting_toggle,
-- 
2.15.1


[PATCH AUTOSEL for 4.9 193/293] datapath: Avoid using stack larger than 1024.

2018-04-08 Thread Sasha Levin
From: Tonghao Zhang 

[ Upstream commit 9cc9a5cb176ccb4f2cda5ac34da5a659926f125f ]

When compiling OvS-master on a 4.4.0-81 kernel,
there is a warning:

CC [M]  /root/ovs/datapath/linux/datapath.o
/root/ovs/datapath/linux/datapath.c: In function
'ovs_flow_cmd_set':
/root/ovs/datapath/linux/datapath.c:1221:1: warning:
the frame size of 1040 bytes is larger than 1024 bytes
[-Wframe-larger-than=]

This patch factors out the match-init and action-copy code to avoid the
"Wframe-larger-than=1024" warning. Because the mask is only used to get
the actions, we add a new function to save some stack space.
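
The general shape of the refactoring, sketched as standalone C (the type and
its size are stand-ins, not the real OvS structures): the large object lives
only in a short-lived, non-inlined helper, so the long-lived caller's frame
shrinks below the warning threshold.

#include <string.h>

struct big_mask { unsigned char bytes[512]; };	/* stand-in for struct sw_flow_mask */

/* noinline keeps the large local out of the caller's stack frame. */
static __attribute__((noinline)) int init_match(int *out)
{
	struct big_mask mask;		/* lives only in this helper's frame */

	memset(&mask, 0, sizeof(mask));
	*out = mask.bytes[0];		/* pretend something useful was derived */
	return 0;
}

int caller(void)
{
	int result = 0;

	/* caller's frame no longer contains struct big_mask */
	if (init_match(&result))
		return -1;
	return result;
}

int main(void)
{
	return caller();
}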

Signed-off-by: Tonghao Zhang 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/openvswitch/datapath.c | 81 +-
 1 file changed, 58 insertions(+), 23 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 453f806afe6e..0f594140c5ff 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1105,6 +1105,58 @@ static struct sw_flow_actions *get_flow_actions(struct 
net *net,
return acts;
 }
 
+/* Factor out match-init and action-copy to avoid
+ * "Wframe-larger-than=1024" warning. Because mask is only
+ * used to get actions, we new a function to save some
+ * stack space.
+ *
+ * If there are not key and action attrs, we return 0
+ * directly. In the case, the caller will also not use the
+ * match as before. If there is action attr, we try to get
+ * actions and save them to *acts. Before returning from
+ * the function, we reset the match->mask pointer. Because
+ * we should not to return match object with dangling reference
+ * to mask.
+ * */
+static int ovs_nla_init_match_and_action(struct net *net,
+struct sw_flow_match *match,
+struct sw_flow_key *key,
+struct nlattr **a,
+struct sw_flow_actions **acts,
+bool log)
+{
+   struct sw_flow_mask mask;
+   int error = 0;
+
+   if (a[OVS_FLOW_ATTR_KEY]) {
+   ovs_match_init(match, key, true, &mask);
+   error = ovs_nla_get_match(net, match, a[OVS_FLOW_ATTR_KEY],
+ a[OVS_FLOW_ATTR_MASK], log);
+   if (error)
+   goto error;
+   }
+
+   if (a[OVS_FLOW_ATTR_ACTIONS]) {
+   if (!a[OVS_FLOW_ATTR_KEY]) {
+   OVS_NLERR(log,
+ "Flow key attribute not present in set 
flow.");
+   return -EINVAL;
+   }
+
+   *acts = get_flow_actions(net, a[OVS_FLOW_ATTR_ACTIONS], key,
+				 &mask, log);
+   if (IS_ERR(*acts)) {
+   error = PTR_ERR(*acts);
+   goto error;
+   }
+   }
+
+   /* On success, error is 0. */
+error:
+   match->mask = NULL;
+   return error;
+}
+
 static int ovs_flow_cmd_set(struct sk_buff *skb, struct genl_info *info)
 {
struct net *net = sock_net(skb->sk);
@@ -1112,7 +1164,6 @@ static int ovs_flow_cmd_set(struct sk_buff *skb, struct 
genl_info *info)
struct ovs_header *ovs_header = info->userhdr;
struct sw_flow_key key;
struct sw_flow *flow;
-   struct sw_flow_mask mask;
struct sk_buff *reply = NULL;
struct datapath *dp;
struct sw_flow_actions *old_acts = NULL, *acts = NULL;
@@ -1124,34 +1175,18 @@ static int ovs_flow_cmd_set(struct sk_buff *skb, struct 
genl_info *info)
bool ufid_present;
 
ufid_present = ovs_nla_get_ufid(&sfid, a[OVS_FLOW_ATTR_UFID], log);
-   if (a[OVS_FLOW_ATTR_KEY]) {
-   ovs_match_init(&match, &key, true, &mask);
-   error = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY],
- a[OVS_FLOW_ATTR_MASK], log);
-   } else if (!ufid_present) {
+   if (!a[OVS_FLOW_ATTR_KEY] && !ufid_present) {
OVS_NLERR(log,
  "Flow set message rejected, Key attribute missing.");
-   error = -EINVAL;
+   return -EINVAL;
}
+
+   error = ovs_nla_init_match_and_action(net, &match, &key, a,
+					  &acts, log);
if (error)
goto error;
 
-   /* Validate actions. */
-   if (a[OVS_FLOW_ATTR_ACTIONS]) {
-   if (!a[OVS_FLOW_ATTR_KEY]) {
-   OVS_NLERR(log,
- "Flow key attribute not present in set 
flow.");
-   error = -EINVAL;
-   goto error;
-   }
-
-   acts = get_flow_actions(net, a[OVS_FLOW_ATTR_ACTIONS], &key,
- 

[PATCH AUTOSEL for 4.9 183/293] iwlwifi: mvm: don't fetch the TID from a non-QoS packet in TSO

2018-04-08 Thread Sasha Levin
From: Emmanuel Grumbach 

[ Upstream commit 4f555e602b42826b3d79081c9ef8b8e8fe29fc49 ]

Getting the TID of a packet before we know it is a QoS data
packet isn't a good idea. Delay the TID retrieval until
we know the packet is a QoS data packet.
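
Reduced to a standalone sketch (the struct and mask value only mirror the
mac80211 definitions, they are not the real ones), the fix is an ordering
rule: classify the frame first, and only read the type-specific field once
the frame is known to carry it.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for an 802.11 frame descriptor. */
struct frame {
	bool    is_qos_data;	/* set once the frame has been classified */
	uint8_t qos_ctl;	/* only meaningful for QoS data frames */
};

static int frame_tid(const struct frame *f, uint8_t *tid)
{
	if (!f->is_qos_data)
		return -1;		/* non-QoS frames carry no TID */

	*tid = f->qos_ctl & 0x0f;	/* mirrors IEEE80211_QOS_CTL_TID_MASK */
	return 0;
}

int main(void)
{
	struct frame mgmt = { .is_qos_data = false, .qos_ctl = 0xff };
	uint8_t tid;

	printf("non-QoS frame: %s\n",
	       frame_tid(&mgmt, &tid) ? "TID lookup refused" : "TID read (bug)");
	return 0;
}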

Fixes: bb81bb68f472 ("iwlwifi: mvm: add Tx A-MSDU inside A-MPDU")
Signed-off-by: Emmanuel Grumbach 
Signed-off-by: Luca Coelho 
Signed-off-by: Sasha Levin 
---
 drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
index 7465d4db136f..790952e48262 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
@@ -652,11 +652,6 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb,
snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) +
tcp_hdrlen(skb);
 
-   qc = ieee80211_get_qos_ctl(hdr);
-   tid = *qc & IEEE80211_QOS_CTL_TID_MASK;
-   if (WARN_ON_ONCE(tid >= IWL_MAX_TID_COUNT))
-   return -EINVAL;
-
dbg_max_amsdu_len = ACCESS_ONCE(mvm->max_amsdu_len);
 
if (!sta->max_amsdu_len ||
@@ -667,6 +662,11 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb,
goto segment;
}
 
+   qc = ieee80211_get_qos_ctl(hdr);
+   tid = *qc & IEEE80211_QOS_CTL_TID_MASK;
+   if (WARN_ON_ONCE(tid >= IWL_MAX_TID_COUNT))
+   return -EINVAL;
+
/*
 * Do not build AMSDU for IPv6 with extension headers.
 * ask stack to segment and checkum the generated MPDUs for us.
-- 
2.15.1


[PATCH AUTOSEL for 4.9 196/293] dmaengine: qcom_hidma: correct API violation for submit

2018-04-08 Thread Sasha Levin
From: Sinan Kaya 

[ Upstream commit 99efdb3e48fb2fa84addb3102946d3eca341192b ]

Current code is violating the DMA Engine API by putting the submitted
requests directly into the HW queue. This causes queued transactions
to be started by another thread as soon as the first one finishes.

The DMA Engine document clearly states this.

"dmaengine_submit() will not start the DMA operation".

Move HW queuing of the requests into the issue_pending() routine to
comply with the API requirements, and create a new queued state for
temporarily holding the requests.

A descriptor goes through these transitions now.

free->prepared->queued->active->completed->free

as opposed to

free->prepared->active->completed->free
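
A standalone sketch of that lifecycle (the names and the trivial state
machine are illustrative, not the driver's data structures): submit() only
parks the descriptor on a queued list, and nothing reaches the hardware
until issue_pending().

#include <stdio.h>

enum desc_state { FREE, PREPARED, QUEUED, ACTIVE, COMPLETED };

static enum desc_state prep(enum desc_state s)     { return s == FREE     ? PREPARED  : s; }
static enum desc_state submit(enum desc_state s)   { return s == PREPARED ? QUEUED    : s; }	/* no HW start here */
static enum desc_state issue(enum desc_state s)    { return s == QUEUED   ? ACTIVE    : s; }	/* HW starts here */
static enum desc_state complete(enum desc_state s) { return s == ACTIVE   ? COMPLETED : s; }

int main(void)
{
	enum desc_state s = FREE;

	s = prep(s);
	s = submit(s);		/* descriptor parked on the queued list */
	s = issue(s);		/* issue_pending() hands it to the engine */
	s = complete(s);
	printf("final state: %d (COMPLETED = %d)\n", s, COMPLETED);
	return 0;
}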

Signed-off-by: Sinan Kaya 
Signed-off-by: Vinod Koul 
Signed-off-by: Sasha Levin 
---
 drivers/dma/qcom/hidma.c | 15 ++++++++++++---
 drivers/dma/qcom/hidma.h |  1 +
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/qcom/hidma.c b/drivers/dma/qcom/hidma.c
index e244e10a94b5..d38a2ceaa0dc 100644
--- a/drivers/dma/qcom/hidma.c
+++ b/drivers/dma/qcom/hidma.c
@@ -208,6 +208,7 @@ static int hidma_chan_init(struct hidma_dev *dmadev, u32 
dma_sig)
INIT_LIST_HEAD(&mchan->prepared);
INIT_LIST_HEAD(&mchan->active);
INIT_LIST_HEAD(&mchan->completed);
+   INIT_LIST_HEAD(&mchan->queued);
 
spin_lock_init(&mchan->lock);
list_add_tail(&mchan->chan.device_node, &ddev->channels);
@@ -228,9 +229,15 @@ static void hidma_issue_pending(struct dma_chan *dmach)
struct hidma_chan *mchan = to_hidma_chan(dmach);
struct hidma_dev *dmadev = mchan->dmadev;
unsigned long flags;
+   struct hidma_desc *qdesc, *next;
int status;
 
spin_lock_irqsave(&mchan->lock, flags);
+   list_for_each_entry_safe(qdesc, next, &mchan->queued, node) {
+   hidma_ll_queue_request(dmadev->lldev, qdesc->tre_ch);
+   list_move_tail(&qdesc->node, &mchan->active);
+   }
+
if (!mchan->running) {
struct hidma_desc *desc = list_first_entry(&mchan->active,
   struct hidma_desc,
@@ -313,17 +320,18 @@ static dma_cookie_t hidma_tx_submit(struct 
dma_async_tx_descriptor *txd)
pm_runtime_put_autosuspend(dmadev->ddev.dev);
return -ENODEV;
}
+   pm_runtime_mark_last_busy(dmadev->ddev.dev);
+   pm_runtime_put_autosuspend(dmadev->ddev.dev);
 
mdesc = container_of(txd, struct hidma_desc, desc);
spin_lock_irqsave(&mchan->lock, irqflags);
 
-   /* Move descriptor to active */
-   list_move_tail(&mdesc->node, &mchan->active);
+   /* Move descriptor to queued */
+   list_move_tail(&mdesc->node, &mchan->queued);
 
/* Update cookie */
cookie = dma_cookie_assign(txd);
 
-   hidma_ll_queue_request(dmadev->lldev, mdesc->tre_ch);
spin_unlock_irqrestore(&mchan->lock, irqflags);
 
return cookie;
@@ -429,6 +437,7 @@ static int hidma_terminate_channel(struct dma_chan *chan)
list_splice_init(&mchan->active, &list);
list_splice_init(&mchan->prepared, &list);
list_splice_init(&mchan->completed, &list);
+   list_splice_init(&mchan->queued, &list);
spin_unlock_irqrestore(&mchan->lock, irqflags);
 
/* this suspends the existing transfer */
diff --git a/drivers/dma/qcom/hidma.h b/drivers/dma/qcom/hidma.h
index e52e20716303..03775ca940e2 100644
--- a/drivers/dma/qcom/hidma.h
+++ b/drivers/dma/qcom/hidma.h
@@ -103,6 +103,7 @@ struct hidma_chan {
struct dma_chan chan;
struct list_headfree;
struct list_headprepared;
+   struct list_headqueued;
struct list_headactive;
struct list_headcompleted;
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 190/293] clk: scpi: error when clock fails to register

2018-04-08 Thread Sasha Levin
From: Jerome Brunet 

[ Upstream commit 2b286b09a048df80fd5f7dfc5057c2837679a1ab ]

The current implementation of scpi_clk_add just prints a warning when a
clock fails to register but then keeps going as if nothing happened. The
provider is then registered with bogus data.

This may later lead to an Oops in __clk_create_clk when
hlist_add_head(&clk->clks_node, &clk->core->clks) is called.

This patch fixes the issue by erroring out if a clock fails to register.
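
Sketched as a standalone loop (the names and error values are invented for
the demo): registration now stops at the first failure and propagates the
error instead of logging it and continuing with a half-built provider.

#include <stdio.h>

#define NUM_CLOCKS 3

/* Pretend the second clock fails to register. */
static int register_one(int idx)
{
	return idx == 1 ? -5 : 0;
}

static int register_all(void)
{
	for (int i = 0; i < NUM_CLOCKS; i++) {
		int err = register_one(i);

		if (err) {
			fprintf(stderr, "failed to register clock %d\n", i);
			return err;	/* previously: warn and keep going */
		}
		printf("registered clock %d\n", i);
	}
	return 0;
}

int main(void)
{
	return register_all() ? 1 : 0;
}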

Fixes: cd52c2a4b5c4 ("clk: add support for clocks provided by SCP(System 
Control Processor)")
Signed-off-by: Jerome Brunet 
Reviewed-by: Sudeep Holla 
Signed-off-by: Stephen Boyd 
Signed-off-by: Sasha Levin 
---
 drivers/clk/clk-scpi.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/clk-scpi.c b/drivers/clk/clk-scpi.c
index 96d37175d0ad..e44b5ca91fed 100644
--- a/drivers/clk/clk-scpi.c
+++ b/drivers/clk/clk-scpi.c
@@ -245,10 +245,12 @@ static int scpi_clk_add(struct device *dev, struct 
device_node *np,
sclk->id = val;
 
err = scpi_clk_ops_init(dev, match, sclk, name);
-   if (err)
+   if (err) {
dev_err(dev, "failed to register clock '%s'\n", name);
-   else
-   dev_dbg(dev, "Registered clock '%s'\n", name);
+   return err;
+   }
+
+   dev_dbg(dev, "Registered clock '%s'\n", name);
clk_data->clk[idx] = sclk;
}
 
-- 
2.15.1


[PATCH AUTOSEL for 4.9 192/293] PCI/PM: Avoid using device_may_wakeup() for runtime PM

2018-04-08 Thread Sasha Levin
From: "Rafael J. Wysocki" 

[ Upstream commit 666ff6f83e1db6ed847abf44eb5e3402d82b9350 ]

pci_target_state() calls device_may_wakeup() which checks whether or not
the device may wake up the system from sleep states, but pci_target_state()
is used for runtime PM too.

Since runtime PM is expected to always enable remote wakeup if possible,
modify pci_target_state() to take an additional argument indicating whether
or not it should look for a state from which the device can signal wakeup,
and pass to it either the return value of device_can_wakeup() or "false"
(if the device itself is not wakeup-capable) from the code related to
runtime PM.

While at it, fix the comment in pci_dev_run_wake() which is not about sleep
states.
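
The distinction the patch leans on, as a tiny standalone sketch (the struct
is a stand-in for the wakeup state of a struct device, not the real thing):
a device can be wakeup-capable without wakeup having been enabled by policy,
and only system sleep honours that policy bit.

#include <stdbool.h>
#include <stdio.h>

struct dev_wake {
	bool capable;	/* hardware can signal wakeup (device_can_wakeup) */
	bool enabled;	/* user/policy enabled it     (device_may_wakeup) */
};

/* System sleep honours user policy... */
static bool wakeup_for_sleep(const struct dev_wake *w)
{
	return w->capable && w->enabled;
}

/* ...while runtime PM uses remote wakeup whenever the hardware supports it. */
static bool wakeup_for_runtime(const struct dev_wake *w)
{
	return w->capable;
}

int main(void)
{
	struct dev_wake d = { .capable = true, .enabled = false };

	printf("wakeup for sleep: %d, wakeup for runtime PM: %d\n",
	       wakeup_for_sleep(&d), wakeup_for_runtime(&d));
	return 0;
}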

Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Mika Westerberg 
Signed-off-by: Sasha Levin 
---
 drivers/pci/pci.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 774b0e2d117b..d6480b30ea8a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1958,12 +1958,13 @@ EXPORT_SYMBOL(pci_wake_from_d3);
 /**
  * pci_target_state - find an appropriate low power state for a given PCI dev
  * @dev: PCI device
+ * @wakeup: Whether or not wakeup functionality will be enabled for the device.
  *
  * Use underlying platform code to find a supported low power state for @dev.
  * If the platform can't manage @dev, return the deepest state from which it
  * can generate wake events, based on any available PME info.
  */
-static pci_power_t pci_target_state(struct pci_dev *dev)
+static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
 {
pci_power_t target_state = PCI_D3hot;
 
@@ -2000,7 +2001,7 @@ static pci_power_t pci_target_state(struct pci_dev *dev)
if (dev->current_state == PCI_D3cold)
target_state = PCI_D3cold;
 
-   if (device_may_wakeup(&dev->dev)) {
+   if (wakeup) {
/*
 * Find the deepest state from which the device can generate
 * wake-up events, make it the target state and enable device
@@ -2026,13 +2027,14 @@ static pci_power_t pci_target_state(struct pci_dev *dev)
  */
 int pci_prepare_to_sleep(struct pci_dev *dev)
 {
-   pci_power_t target_state = pci_target_state(dev);
+   bool wakeup = device_may_wakeup(&dev->dev);
+   pci_power_t target_state = pci_target_state(dev, wakeup);
int error;
 
if (target_state == PCI_POWER_ERROR)
return -EIO;
 
-   pci_enable_wake(dev, target_state, device_may_wakeup(&dev->dev));
+   pci_enable_wake(dev, target_state, wakeup);
 
error = pci_set_power_state(dev, target_state);
 
@@ -2065,9 +2067,10 @@ EXPORT_SYMBOL(pci_back_from_sleep);
  */
 int pci_finish_runtime_suspend(struct pci_dev *dev)
 {
-   pci_power_t target_state = pci_target_state(dev);
+   pci_power_t target_state;
int error;
 
+   target_state = pci_target_state(dev, device_can_wakeup(&dev->dev));
if (target_state == PCI_POWER_ERROR)
return -EIO;
 
@@ -2103,8 +2106,8 @@ bool pci_dev_run_wake(struct pci_dev *dev)
if (!dev->pme_support)
return false;
 
-   /* PME-capable in principle, but not from the intended sleep state */
-   if (!pci_pme_capable(dev, pci_target_state(dev)))
+   /* PME-capable in principle, but not from the target power state */
+   if (!pci_pme_capable(dev, pci_target_state(dev, false)))
return false;
 
while (bus->parent) {
@@ -2139,9 +2142,10 @@ EXPORT_SYMBOL_GPL(pci_dev_run_wake);
 bool pci_dev_keep_suspended(struct pci_dev *pci_dev)
 {
struct device *dev = &pci_dev->dev;
+   bool wakeup = device_may_wakeup(dev);
 
if (!pm_runtime_suspended(dev)
-   || pci_target_state(pci_dev) != pci_dev->current_state
+   || pci_target_state(pci_dev, wakeup) != pci_dev->current_state
|| platform_pci_need_resume(pci_dev)
|| (pci_dev->dev_flags & PCI_DEV_FLAGS_NEEDS_RESUME))
return false;
@@ -2159,7 +2163,7 @@ bool pci_dev_keep_suspended(struct pci_dev *pci_dev)
spin_lock_irq(&dev->power.lock);
 
if (pm_runtime_suspended(dev) && pci_dev->current_state < PCI_D3cold &&
-   !device_may_wakeup(dev))
+   !wakeup)
__pci_pme_active(pci_dev, false);
 
spin_unlock_irq(&dev->power.lock);
-- 
2.15.1


[PATCH AUTOSEL for 4.9 189/293] fs/dcache: init in_lookup_hashtable

2018-04-08 Thread Sasha Levin
From: Sebastian Andrzej Siewior 

[ Upstream commit 6916363f3083837ed5adb3df2dd90d6b97017dff ]

in_lookup_hashtable was introduced in commit 94bdd655caba ("parallel
lookups machinery, part 3") and was never explicitly initialized; since it
is static data it is all zeros anyway, but -RT needs the explicit
initialization.

Cc: Alexander Viro 
Cc: linux-fsde...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin 
---
 fs/dcache.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index c0c7fa8224ba..4df3d2300c7b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3637,6 +3637,11 @@ EXPORT_SYMBOL(d_genocide);
 
 void __init vfs_caches_init_early(void)
 {
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(in_lookup_hashtable); i++)
+   INIT_HLIST_BL_HEAD(&in_lookup_hashtable[i]);
+
dcache_init_early();
inode_init_early();
 }
-- 
2.15.1


<    1   2   3   4   5   6   7   8   9   10   >