BUG: perf error on syscalls for powerpc64.

2015-07-15 Thread Zumeng Chen
Hi All,

Commit 1028ccf5 changed the declaration of sys_call_table from a pointer to
an array of unsigned long. I don't think that is correct, and here is my
reasoning:

sys_call_table is defined as a label in assembler, so it should be declared
as a pointer array rather than as an array, as 1028ccf5 does. If it is
declared as an array, arch_syscall_addr returns the address of
sys_call_table[], whereas arch_syscall_addr actually needs the content of
sys_call_table[]. As a result, 'perf list' ignores all syscalls, because
find_syscall_meta returns NULL in init_ftrace_syscalls due to the wrong
arch_syscall_addr.

Did I miss something, or has the GCC compiler changed its behaviour here?
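
To make the distinction in this report concrete, here is a minimal userspace
sketch (not kernel code) of the difference between the address of a table slot
and the content stored in that slot. All names (demo_table, demo_slot_address,
demo_slot_content) are made up for illustration. Which of the two the
powerpc64 declaration actually yields is the open question of this report, and
the sketch does not settle it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type; stands in for a syscall entry point. */
typedef long (*demo_handler_t)(void);

static long demo_read(void)  { return 0; }
static long demo_write(void) { return 1; }

/* A table of handler addresses, analogous in spirit to a syscall table. */
static const demo_handler_t demo_table[] = { demo_read, demo_write };

#define DEMO_NR (sizeof(demo_table) / sizeof(demo_table[0]))

/* Returns where slot nr lives in memory (the address of the slot). */
const demo_handler_t *demo_slot_address(size_t nr)
{
	assert(nr < DEMO_NR);
	return &demo_table[nr];
}

/* Returns what slot nr holds (the handler address stored in the slot). */
demo_handler_t demo_slot_content(size_t nr)
{
	assert(nr < DEMO_NR);
	return demo_table[nr];
}
```

A symbol lookup keyed on handler addresses would want the second value; handing
it the first one makes every lookup miss.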

Cheers,
Zumeng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ARM: OMAP2: Delete unnecessary checks before three function calls

2015-07-15 Thread Paul Walmsley
Hello Markus

On Tue, 30 Jun 2015, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Tue, 30 Jun 2015 14:00:16 +0200
> 
> The functions clk_disable(), of_node_put() and omap_device_delete() test
> whether their argument is NULL and then return immediately.
> Thus the test around the call is not needed.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 

Thanks for the patch.  I have to say, I am a bit leery about applying the 
omap_device.c and omap_hwmod.c changes, since the called functions -- 
omap_device_delete() and clk_disable() -- don't explicitly document that 
NULLs are allowed to be passed in.  So there's no explicit contract that 
callers can rely upon, to (at least in theory) prevent those internal NULL 
pointer checks from being removed.

So I would suggest that those two functions' kerneldoc be patched first to 
explicitly state that passing in a NULL pointer is allowed.  Then I would 
feel a bit more comfortable applying the omap_device.c and omap_hwmod.c 
changes.
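
As a rough illustration of the kind of contract I mean, the sketch below
documents NULL as an accepted argument in a kerneldoc-style comment, so that
callers can rely on it. struct demo_dev, demo_dev_alloc() and
demo_dev_delete() are hypothetical userspace stand-ins, not the real
omap_device_delete() or clk_disable().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device object used only for this sketch. */
struct demo_dev {
	int id;
};

static int demo_live_devices;

/* Allocates a demo device; returns NULL on allocation failure. */
struct demo_dev *demo_dev_alloc(int id)
{
	struct demo_dev *od = malloc(sizeof(*od));

	if (!od)
		return NULL;
	od->id = id;
	demo_live_devices++;
	return od;
}

/**
 * demo_dev_delete - free a demo device
 * @od: device to free; may be NULL
 *
 * Passing a NULL @od is explicitly allowed and is a no-op, so callers
 * do not need to test the pointer before calling this function.
 */
void demo_dev_delete(struct demo_dev *od)
{
	if (!od)
		return;
	demo_live_devices--;
	free(od);
}

/* Number of devices currently allocated and not yet deleted. */
int demo_dev_count(void)
{
	return demo_live_devices;
}
```

With the "may be NULL" wording in place, removing the internal check later
would be a visible change to the documented interface.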

The kerneldoc for of_node_put() does explicitly allow NULLs to be passed 
in.  So I'll apply that change now for v4.3, touching up the commit 
message accordingly.

regards,

- Paul

> ---
>  arch/arm/mach-omap2/omap_device.c | 3 +--
>  arch/arm/mach-omap2/omap_hwmod.c  | 5 +
>  arch/arm/mach-omap2/timer.c   | 3 +--
>  3 files changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/omap_device.c 
> b/arch/arm/mach-omap2/omap_device.c
> index 4cb8fd9..196366e 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -193,8 +193,7 @@ static int _omap_device_notifier_call(struct 
> notifier_block *nb,
>  
>   switch (event) {
>   case BUS_NOTIFY_DEL_DEVICE:
> - if (pdev->archdata.od)
> - omap_device_delete(pdev->archdata.od);
> + omap_device_delete(pdev->archdata.od);
>   break;
>   case BUS_NOTIFY_ADD_DEVICE:
>   if (pdev->dev.of_node)
> diff --git a/arch/arm/mach-omap2/omap_hwmod.c 
> b/arch/arm/mach-omap2/omap_hwmod.c
> index d78c12e..1091ee7 100644
> --- a/arch/arm/mach-omap2/omap_hwmod.c
> +++ b/arch/arm/mach-omap2/omap_hwmod.c
> @@ -921,10 +921,7 @@ static int _disable_clocks(struct omap_hwmod *oh)
>   int i = 0;
>  
>   pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
> -
> - if (oh->_clk)
> - clk_disable(oh->_clk);
> -
> + clk_disable(oh->_clk);
>   p = oh->slave_ports.next;
>  
>   while (i < oh->slaves_cnt) {
> diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c
> index cac46d8..15448221 100644
> --- a/arch/arm/mach-omap2/timer.c
> +++ b/arch/arm/mach-omap2/timer.c
> @@ -208,8 +208,7 @@ static void __init omap_dmtimer_init(void)
>   /* If we are a secure device, remove any secure timer nodes */
>   if ((omap_type() != OMAP2_DEVICE_TYPE_GP)) {
>   np = omap_get_timer_dt(omap_timer_match, "ti,timer-secure");
> - if (np)
> - of_node_put(np);
> + of_node_put(np);
>   }
>  }
>  
> -- 
> 2.4.5
> 


- Paul


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Eric W. Biederman
Andy Lutomirski  writes:

> On Wed, Jul 15, 2015 at 10:04 PM, Eric W. Biederman
>  wrote:
>> Andy Lutomirski  writes:
>>
>>>
>>> So here's the semantic question:
>>>
>>> Suppose an unprivileged user (uid 1000) creates a user namespace and a
>>> mount namespace.  They stick a file (owned by uid 1000 as seen by
>>> init_user_ns) in there and mark it setuid root and give it fcaps.
>>
>> To make this make sense I have to ask, is this file on a filesystem
>> where uid 1000 as seen by the init_user_ns stored as uid 1000 on
>> the filesystem?  Or is this uid 0 as seen by the filesystem?
>>
>> I assume this is uid 0 on the filesystem in question or else your
>> unprivileged user would not have sufficient privileges over the
>> filesystem to setup fcaps.
>
> I was thinking uid 0 as seen by the filesystem.  But even if it were
> uid 1000, the unprivileged user can still set whatever mode and xattrs
> they want -- they control the backing store.

Yes.   And that is what I was really asking.  Are we talking about a
filesystem where the user controls the backing store?

>>> Then global root gets an fd to this filesystem.  If they execve the
>>> file directly, then, with my patch 4, it won't act as setuid 1000 and
>>> the fcaps will be ignored.  Even with my patch 4, though, if they bind
>>> mount the fs and execve the file from their bind mount, it will act as
>>> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
>>> fcaps will (correctly) not be honored.
>>
>> With patch 3 you can also think of it as fcaps being honored and you
>> get all the caps in the appropriate user namespace, but since you are
>> not in that user namespace and so don't have a place to store them
>> in struct cred you don't get the file caps.
>>
>> From the philosophy of interpreting the file as defined by the
>> filesystem in principle we could extend struct cred so you actually
>> get the creds just in uid 1000s user namespace, but that is very
>> unlikely to be worth it.
>
> I agree.
>
>>
>>> I tend to think that, if we're not honoring the fcaps, we shouldn't be
>>> honoring the setuid bit either.  After all, it's really not a trusted
>>> file, even though the only user who could have messed with it really
>>> is the apparent owner.
>>
>> For the file caps we can't honor them because you don't have the bits
>> in struct cred.
>>
>> For setuid we can honor it, and setuid is something that the user
>> namespace allows.
>>
>
> We certainly *can* honor it.  But why should we?  I'd be more
> comfortable with this if the contents of an untrusted filesystem were
> really treated as just data.

In these weird bleed-through situations I don't know that we should.
But extending nosuid protections in this way is a bit like yama:
a bit of gratuitous stomping on don't-care cases in the semantics to
make bugs harder to exploit.

>>> And, if we're going to say we don't trust the file and shouldn't honor
>>> setuid or fcaps, then merging all the functionality into mnt_may_suid
>>> could make sense.  Yes, these two things do different things, but they
>>> could hook in to the same place.
>>
>> There are really two separate questions:
>> - Do we trust this filesystem?
>> - Do you have the bits to implement this concept?
>>
>> Even if in this specific context the two questions wind up looking
>> exactly the same. I think it makes a lot of sense to ask the two
>> questions separately.  As future maintenance changes may cause the
>> implementation of the questions to diverge.
>>
>
> Agreed.
>
> Unless someone thinks of an argument to the contrary, I'd say "no, we
> don't trust this filesystem".  I could be convinced otherwise.

But this is context dependent.  From the perspective of the container
we really do want to trust the filesystem.  As the container root set it
up, and if he isn't being hostile likely has a use for setfcaps files
and setuid files and all of the rest.

Perhaps I should phrase it as:
- In this context do we trust the code?   AKA mnt_may_suid?
- What do these bits mean in this context?  (Usually something more 
complicated).

Which says to me we want both patches 3 and 4 (even if 4 uses s_user_ns)
because 3 is different than 4.

And now I better context switch back to fixing bind mounts.

Eric


Re: linux-next: build failure after merge of the rcu tree

2015-07-15 Thread Stephen Rothwell
Hi Paul,

On Wed, 15 Jul 2015 20:51:38 -0700 "Paul E. McKenney" 
 wrote:
>
> Thank you in both cases!  I suspect that more will follow, so is there
> something I can do to make this easier?  (Hard for me to patch stuff
> that is not yet in the tree...)

No, that is what I am here for.  But it would be good if you remember
this when it comes time for your tree to be merged into tip ...

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking to unlock time

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 10:18:35PM -0400, Waiman Long wrote:
> On 07/15/2015 06:03 AM, Peter Zijlstra wrote:

> >*groan*, so you complained the previous version of this patch was too
> >complex, but let me say I vastly preferred it to this one :/
> 
> I said it was complex as maintaining a tri-state variable needed more
> thought than 2 bi-state variables. I can revert it back to the tri-state
> variable as doing an unconditional kick in unlock simplifies the code at
> pv_wait_head().

Well, your state space isn't shrunk, you just use more variables and I'm
not entirely sure that actually matters.

What also doesn't help is mixing that with the kicking code.


Re: [PATCH v2 4/6] locking/pvqspinlock: Allow vCPUs kick-ahead

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 10:01:02PM -0400, Waiman Long wrote:
> On 07/15/2015 05:39 AM, Peter Zijlstra wrote:
> >On Tue, Jul 14, 2015 at 10:13:35PM -0400, Waiman Long wrote:
> >>Frequent CPU halting (vmexit) and CPU kicking (vmenter) lengthens
> >>critical section and block forward progress.  This patch implements
> >>a kick-ahead mechanism where the unlocker will kick the queue head
> >>vCPUs as well as up to four additional vCPUs next to the queue head
> >>if they were halted.  The kickings are done after exiting the critical
> >>section to improve parallelism.
> >>
> >>The amount of kick-ahead allowed depends on the number of vCPUs
> >>in the VM guest.  This patch, by itself, won't do much as most of
> >>the kickings are currently done at lock time. Coupled with the next
> >>patch that defers lock time kicking to unlock time, it should improve
> >>overall system performance in a busy overcommitted guest.
> >>
> >>Linux kernel builds were run in KVM guest on an 8-socket, 4
> >>cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
> >>Haswell-EX system. Both systems are configured to have 32 physical
> >>CPUs. The kernel build times before and after the patch were:
> >>
> >>                        Westmere                Haswell
> >>   Patch            32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
> >>   -----            --------    --------    --------    --------
> >>   Before patch     3m25.0s     10m34.1s    2m02.0s     15m35.9s
> >>   After patch      3m27.4s     10m32.0s    2m00.8s     14m52.5s
> >>
> >>There wasn't too much difference before and after the patch.
> >That means either the patch isn't worth it, or as you seem to imply its
> >in the wrong place in this series.
> 
> It needs to be coupled with the next patch to be effective as most of the
> kicking are happening at the lock side, instead of at the unlock side. If
> you look at the sample pvqspinlock stats in patch 3:
> 
> lock_kick_count=755354
> unlock_kick_count=87
> 
> The number of unlock kicks is negligible compared with the lock kicks. Patch
> 5 does have a dependency on patch 4 unless we make it unconditionally defer
> kicking to the unlock call, which was what I had done in the v1 patch. The
> reason why I changed this in v2 is because I found a very slight performance
> degradation in doing so.

This way we cannot see the gains of the proposed complexity. So put it
in a place where you can.

> >You also do not offer any support for any of the magic numbers..
> 
> I chose 4 for PV_KICK_AHEAD_MAX as I didn't see much performance difference
> when I did a kick-ahead of 5. Also, it may be too unfair to the vCPU that
> was doing the kicking if the number is too big. Another magic number is
> pv_kick_ahead number. This one is kind of arbitrary. Right now I do a log2,
> but it can be divided by 4 (rshift 2) as well.

So what was the difference between 1-2-3-4 ? I would be thinking one
extra kick is the biggest help, no?
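
For reference while comparing those numbers, the two scaling choices mentioned
above (log2 of the vCPU count versus a right shift by 2, both limited by
PV_KICK_AHEAD_MAX = 4) can be sketched as below. This is only an illustration
under the assumption that the limit is applied as a simple clamp; the demo_*
names are made up and this is not the code from the patch.

```c
#include <assert.h>

/* The thread names 4 as the value chosen for PV_KICK_AHEAD_MAX. */
#define DEMO_KICK_AHEAD_MAX 4u

/* Floor of log2(n) for n >= 1. */
static unsigned int demo_ilog2(unsigned int n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* Clamp a candidate kick-ahead count to the maximum (assumed policy). */
static unsigned int demo_clamp(unsigned int v)
{
	return v > DEMO_KICK_AHEAD_MAX ? DEMO_KICK_AHEAD_MAX : v;
}

/* Variant 1: kick-ahead grows with log2 of the number of vCPUs. */
unsigned int demo_kick_ahead_log2(unsigned int ncpus)
{
	assert(ncpus >= 1);
	return demo_clamp(demo_ilog2(ncpus));
}

/* Variant 2: kick-ahead is the number of vCPUs divided by 4 (rshift 2). */
unsigned int demo_kick_ahead_shift(unsigned int ncpus)
{
	return demo_clamp(ncpus >> 2);
}
```

Under these assumptions both variants saturate at 4 for the 32 and 48 vCPU
guests in the table, so they only differ for small guests.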


Re: [PATCH 1/3] ARM: multi_v7_defconfig: Enable max77802 regulator

2015-07-15 Thread Javier Martinez Canillas
Hello Krzysztof,

Thanks for the feedback.

On 07/16/2015 02:45 AM, Krzysztof Kozlowski wrote:
> On 16.07.2015 01:32, Javier Martinez Canillas wrote:
>> The Maxim max77802 Power Management IC has besides other devices, a set of
>> regulators. Commit f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802
>> regulator, rtc and clock drivers") was supposed to enable the config option
>> for the regulator driver as a module but the final version that landed did
>> not include this. So this patch enables the needed Kconfig option.
>>
>> Signed-off-by: Javier Martinez Canillas 
> 
> Please describe why do you want to enable it (IOW who will benefit from
> enabling it?). This symbol was removed by Kukjin from your commit:
>   [kg...@kernel.org: removing useless REGULATOR_MAX77802 config]
> so justification would be welcomed.
>

You are right, sorry for not making the commit message clear. This PMIC
is used by a couple of Exynos5 based boards such as the Peach Pit and Pi
Chromebooks. I expect it to be found in other designs too just like the
max77686 is found in many Exynos5 based boards.

I'll add this to the commit message on v2.
 
> Beside the commit description I agree with the patch.
>

Does this mean I can add your Reviewed-by to this patch as well?

> Best regards,
> Krzysztof
> 

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America


Re: [PATCH v2 1/6] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 08:18:23PM -0400, Waiman Long wrote:
> On 07/15/2015 05:10 AM, Peter Zijlstra wrote:
> > /*
> >+ * A failed cmpxchg doesn't provide any memory-ordering guarantees,
> >+ * so we need a barrier to order the read of the node data in
> >+ * pv_unhash *after* we've read the lock being _Q_SLOW_VAL.
> >+ *
> >+ * Matches the cmpxchg() in pv_wait_head() setting _Q_SLOW_VAL.
> >+ */
> >+smp_rmb();
> 
> According to memory_barriers.txt, cmpxchg() is a full memory barrier. It
> didn't say a failed cmpxchg will lose its memory guarantee. So is the
> documentation right? 

The documentation is not entirely clear on this; but there are hints
that this is so.

> Or is that true for some architectures? I think it is
> not true for x86.

On x86 LOCK CMPXCHG is always a sync point, but yes there are archs for
which a failed cmpxchg does _NOT_ provide any barrier semantics.

The reason I started looking was because Will made Argh64 one of those.
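
A userspace analogue of the pattern, using C11 atomics rather than the kernel
primitives, might look like the sketch below. The failure ordering of the
compare-exchange is deliberately relaxed, so an explicit acquire fence is
needed before reading data published by whoever set the slow value. The
demo_* names and the struct layout are invented for illustration; this is not
the pvqspinlock code.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_LOCKED_VAL 1
#define DEMO_SLOW_VAL   3

/* Hypothetical lock: node_data is published before val becomes SLOW. */
struct demo_lock {
	atomic_int val;
	int node_data;
};

/*
 * Fast path: LOCKED -> 0 succeeds and returns 0.
 * Slow path: the compare-exchange fails because val is DEMO_SLOW_VAL.
 * A failed compare-exchange with relaxed failure ordering gives no
 * ordering guarantee, so fence before reading node_data.
 */
int demo_unlock(struct demo_lock *l)
{
	int expected = DEMO_LOCKED_VAL;

	if (atomic_compare_exchange_strong_explicit(&l->val, &expected, 0,
						    memory_order_release,
						    memory_order_relaxed))
		return 0;

	/* Order the node_data read after the observed DEMO_SLOW_VAL. */
	atomic_thread_fence(memory_order_acquire);

	int data = l->node_data;

	atomic_store_explicit(&l->val, 0, memory_order_release);
	return data;
}
```

The fence pairs with a release store (or stronger) on the side that writes
node_data and then sets the slow value.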


Re: [PATCH v5 2/3] pwm: add MediaTek display PWM driver support

2015-07-15 Thread YH Huang
On Wed, 2015-07-15 at 23:59 +0800, YH Huang wrote:
> On Mon, 2015-07-13 at 18:19 +0800, Daniel Kurtz wrote:
> > On Mon, Jul 13, 2015 at 5:04 PM, YH Huang  wrote:
> > > Add display PWM driver support to modify backlight for MT8173 and MT6595.
> > > The PWM has one channel to control the brightness of the display.
> > > When the (high_width / period) is closer to 1, the screen is brighter;
> > > otherwise, it is darker.
> > >
> > > Signed-off-by: YH Huang 
> > > ---
> > >  drivers/pwm/Kconfig|  10 ++
> > >  drivers/pwm/Makefile   |   1 +
> > >  drivers/pwm/pwm-mtk-disp.c | 256 
> > > +
> > >  3 files changed, 267 insertions(+)
> > >  create mode 100644 drivers/pwm/pwm-mtk-disp.c
> > >
> > > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> > > index b1541f4..f5b03a4 100644
> > > --- a/drivers/pwm/Kconfig
> > > +++ b/drivers/pwm/Kconfig
> > > @@ -211,6 +211,16 @@ config PWM_LPSS_PLATFORM
> > >   To compile this driver as a module, choose M here: the module
> > >   will be called pwm-lpss-platform.
> > >
> > > +config PWM_MTK_DISP
> > > +   tristate "MediaTek display PWM driver"
> > > +   depends on ARCH_MEDIATEK || COMPILE_TEST
> > > +   help
> > > + Generic PWM framework driver for MediaTek disp-pwm device.
> > > + The PWM is used to control the backlight brightness for display.
> > > +
> > > + To compile this driver as a module, choose M here: the module
> > > + will be called pwm-mtk-disp.
> > > +
> > >  config PWM_MXS
> > > tristate "Freescale MXS PWM support"
> > > depends on ARCH_MXS && OF
> > > diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> > > index ec50eb5..99c9e75 100644
> > > --- a/drivers/pwm/Makefile
> > > +++ b/drivers/pwm/Makefile
> > > @@ -18,6 +18,7 @@ obj-$(CONFIG_PWM_LPC32XX) += pwm-lpc32xx.o
> > >  obj-$(CONFIG_PWM_LPSS) += pwm-lpss.o
> > >  obj-$(CONFIG_PWM_LPSS_PCI) += pwm-lpss-pci.o
> > >  obj-$(CONFIG_PWM_LPSS_PLATFORM)+= pwm-lpss-platform.o
> > > +obj-$(CONFIG_PWM_MTK_DISP) += pwm-mtk-disp.o
> > >  obj-$(CONFIG_PWM_MXS)  += pwm-mxs.o
> > >  obj-$(CONFIG_PWM_PCA9685)  += pwm-pca9685.o
> > >  obj-$(CONFIG_PWM_PUV3) += pwm-puv3.o
> > > diff --git a/drivers/pwm/pwm-mtk-disp.c b/drivers/pwm/pwm-mtk-disp.c
> > > new file mode 100644
> > > index 000..1f17cee
> > > --- /dev/null
> > > +++ b/drivers/pwm/pwm-mtk-disp.c
> > > @@ -0,0 +1,256 @@
> > > +/*
> > > + * MediaTek display pulse-width-modulation controller driver.
> > > + * Copyright (c) 2015 MediaTek Inc.
> > > + * Author: YH Huang 
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2 as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + */
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > > +#define DISP_PWM_EN                0
> > 
> > The "DISP_PWM_*" are register offsets, so use a hex value, like this:
> > 
> > #define DISP_PWM_EN  0x00
> > 
> > Use BIT() for register *fields*, that is, the individual bits of a register.
> > 
> 
> Got it!
> 
> > > > +#define PWM_ENABLE_MASK            BIT(0)
> > > > +
> > > > +#define DISP_PWM_COMMIT            BIT(3)
> > > 
> > > #define DISP_PWM_COMMIT  0x08
> > > 
> > > > +#define PWM_COMMIT_MASK            BIT(0)
> > > > +
> > > > +#define DISP_PWM_CON_0 BIT(4)
> > > 
> > > #define DISP_PWM_COMMIT  0x10
> > 
> > > +#define PWM_CLKDIV_SHIFT   16
> > > +#define PWM_CLKDIV_MAX 0x3ff
> > > > +#define PWM_CLKDIV_MASK            (PWM_CLKDIV_MAX << PWM_CLKDIV_SHIFT)
> > > +
> > > +#define DISP_PWM_CON_1 0x14
> > > > +#define PWM_PERIOD_MASK            0xfff
> > > +/* Shift log2(PWM_PERIOD_MASK + 1) as divisor */
> > > +#define PWM_PERIOD_BIT_SHIFT   12
> > > +
> > > +#define PWM_HIGH_WIDTH_SHIFT   16
> > > > +#define PWM_HIGH_WIDTH_MASK        (0x1fff << PWM_HIGH_WIDTH_SHIFT)
> > > +
> > > +struct mtk_disp_pwm {
> > > +   struct pwm_chip chip;
> > > +   struct device *dev;
> > 
> > I don't think "dev" is actually used.  And, if needed, it can be
> > extracted from "chip".
> > 
> 
> I will drop it.
> 
> > > +   struct clk *clk_main;
> > > +   struct clk *clk_mm;
> > > +   void __iomem *base;
> > > +};
> > > +
> > > +static inline struct mtk_disp_pwm *to_mtk_disp_pwm(struct pwm_chip *chip)
> > > +{
> > > +   return container_of(chip, struct mtk_disp_pwm, chip);
> > > +}
> > > +
> > > +static void 
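
The review comments in this message distinguish register offsets (plain hex
byte offsets from the base) from register fields (individual bits, written
with BIT()). A minimal userspace sketch of that convention follows. The
register block is simulated with an array, and every DEMO_* and demo_* name is
hypothetical; only the 0x00 and 0x08 offsets are taken from the review.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BIT(n)		(UINT32_C(1) << (n))

/* Register offsets: byte offsets from the base, written in hex. */
#define DEMO_PWM_EN		0x00
#define DEMO_PWM_COMMIT		0x08

/* Register fields: individual bits inside a register. */
#define DEMO_PWM_ENABLE_MASK	DEMO_BIT(0)
#define DEMO_PWM_COMMIT_MASK	DEMO_BIT(0)

#define DEMO_REG_SPACE		0x20

/* Simulated register block standing in for an ioremapped base. */
struct demo_pwm {
	uint32_t regs[DEMO_REG_SPACE / 4];
};

uint32_t demo_readl(const struct demo_pwm *p, unsigned int offset)
{
	return p->regs[offset / 4];
}

static void demo_writel(struct demo_pwm *p, unsigned int offset, uint32_t v)
{
	p->regs[offset / 4] = v;
}

/* Read-modify-write of the bits selected by mask. */
void demo_update_bits(struct demo_pwm *p, unsigned int offset,
		      uint32_t mask, uint32_t val)
{
	uint32_t v = demo_readl(p, offset);

	v = (v & ~mask) | (val & mask);
	demo_writel(p, offset, v);
}

/* Set the enable field in the enable register. */
void demo_pwm_enable(struct demo_pwm *p)
{
	demo_update_bits(p, DEMO_PWM_EN, DEMO_PWM_ENABLE_MASK,
			 DEMO_PWM_ENABLE_MASK);
}
```

Keeping the two kinds of constants visually distinct makes it harder to pass a
bit mask where an offset is expected.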

linux-next: build failure after merge of the akpm-current tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

After merging the akpm-current tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

ERROR: ".smpboot_register_percpu_thread_cpumask" 
[drivers/infiniband/hw/ehca/ib_ehca.ko] undefined!

Caused by commit

  2b07b4da35a9 ("smpboot: allow passing the cpumask on per-cpu thread 
registration")

I have added the following build fix for today:

From: Stephen Rothwell 
Date: Thu, 16 Jul 2015 15:30:05 +1000
Subject: [PATCH] smpboot: fix for allow passing the cpumask on per-cpu thread 
registration

Signed-off-by: Stephen Rothwell 
---
 kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index d99a41d25b0c..a818cbc73e14 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -308,7 +308,7 @@ out:
put_online_cpus();
return ret;
 }
-EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread);
+EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread_cpumask);
 
 /**
  * smpboot_unregister_percpu_thread - Unregister a per_cpu thread related to 
hotplug
-- 
2.1.4

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [PATCH] staging: rtl8188eu: core: find and remove code valid only for 5 HGz.

2015-07-15 Thread Sudip Mukherjee
On Wed, Jul 15, 2015 at 10:04:08PM -0400, Sreenath Madasu wrote:
> This is one of the TODO tasks for staging rtl8188eu driver. I have removed
> the code referring to channel > 14 for rtw_ap.c, rtw_ieee80211.c and
> rtw_mlme.c files. Please review.
Your patch will give a new build warning:
warning: unused variable ‘pcur_network’ [-Wunused-variable]

regards
sudip


Re: [PATCH 2/9] ARM: multi_v7_defconfig: Enable max77802 regulator, rtc and clock drivers

2015-07-15 Thread Javier Martinez Canillas
Hello Krzysztof,

On 07/16/2015 02:42 AM, Krzysztof Kozlowski wrote:
> On 16.07.2015 00:38, Javier Martinez Canillas wrote:
>> Hello,
>>
>> On Thu, May 14, 2015 at 5:40 PM, Javier Martinez Canillas
>>  wrote:
>>> The Maxim max77802 Power Management IC is used on many Exynos machines.
>>> Besides a bunch of regulators, this chip has a Real-Time-Clock (RTC)
>>> and 2-channel 32kHz clock outputs.
>>>
>>> Enable the kernel config options to have the drivers for these devices
>>> built as a module.
>>>
>>> Signed-off-by: Javier Martinez Canillas 
>>> ---
>>>  arch/arm/configs/multi_v7_defconfig | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/arch/arm/configs/multi_v7_defconfig 
>>> b/arch/arm/configs/multi_v7_defconfig
>>> index 2349584b6e08..080120fe5580 100644
>>> --- a/arch/arm/configs/multi_v7_defconfig
>>> +++ b/arch/arm/configs/multi_v7_defconfig
>>> @@ -373,6 +373,7 @@ CONFIG_POWER_RESET_SYSCON=y
>>>  CONFIG_REGULATOR_MAX8907=y
>>>  CONFIG_REGULATOR_MAX8973=y
>>>  CONFIG_REGULATOR_MAX77686=y
>>> +CONFIG_REGULATOR_MAX77802=m
>>
>> I noticed that the version that landed in 4.2-rc1 as commit
>> f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802 regulator, rtc
>> and clock drivers") doesn't include this symbol. I guess it was caused
>> by a wrong resolved conflict? I'll post a patch to enable the
>> regulator again.
> 
> As you can see in mentioned mainline commit Kukjin removed it manually:
> [kg...@kernel.org: removing useless REGULATOR_MAX77802 config]
>

Oh, I missed that in the commit message. I thought it was a merge / conflict
error, not something done on purpose.
 
> I wonder why?
>

Me too.

> Best regards,
> Krzysztof
> --

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America


Re: [PATCH 1/1] mem-hotplug: Handle node hole when initializing numa_meminfo.

2015-07-15 Thread Tang Chen


On 07/16/2015 05:20 AM, Tejun Heo wrote:
> On Wed, Jul 01, 2015 at 11:16:54AM +0800, Tang Chen wrote:
> ...
>> -   /* and there's no empty block */
>> -   if (bi->start >= bi->end)
>> +   /* and there's no empty or non-exist block */
>> +   if (bi->start >= bi->end ||
>> +   memblock_overlaps_region(,
>> +   bi->start, bi->end - bi->start) == -1)
>
> Ugh can you please change memblock_overlaps_region() to return
> bool instead?

Well, I think memblock_overlaps_region() is designed to return
the index of the region overlapping with the given region.
Maybe it had some users before.

Of course for now, it is only called by memblock_is_region_reserved().

It is OK to change the return value of memblock_overlaps_region() to bool.
But any caller of memblock_is_region_reserved() should also be changed.

I think it is OK to leave it there.

Thanks.

> Thanks.
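
To illustrate the two return conventions discussed here, the sketch below has
one helper returning the index of the first overlapping region (or -1) and a
thin boolean wrapper on top of it. struct demo_region and the demo_* functions
are hypothetical and only mimic the shape of the memblock helpers; they are
not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: [base, base + size). */
struct demo_region {
	uint64_t base;
	uint64_t size;
};

/* Returns the index of the first region overlapping the range, or -1. */
long demo_overlaps_index(const struct demo_region *regions, size_t n,
			 uint64_t base, uint64_t size)
{
	uint64_t end = base + size;

	for (size_t i = 0; i < n; i++) {
		uint64_t rend = regions[i].base + regions[i].size;

		if (regions[i].base < end && rend > base)
			return (long)i;
	}
	return -1;
}

/* Boolean view for callers that only need a yes/no answer. */
bool demo_overlaps(const struct demo_region *regions, size_t n,
		   uint64_t base, uint64_t size)
{
	return demo_overlaps_index(regions, n, base, size) >= 0;
}
```

A caller that only tests for overlap reads more naturally with the boolean
form, since it no longer has to compare against -1.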





linux-next: build warning after merge of the akpm-current tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

After merging the akpm-current tree, today's linux-next build (arm
multi_v7_defconfig) produced this warning:

lib/genalloc.c: In function 'gen_pool_get':
/scratch/sfr/next/lib/genalloc.c:599:6: warning: passing argument 4 of 
'devres_find' discards 'const' qualifier from pointer target type
  p = devres_find(dev, devm_gen_pool_release, devm_gen_pool_match, name);
  ^
In file included from /scratch/sfr/next/include/linux/node.h:17:0,
 from /scratch/sfr/next/include/linux/cpu.h:16,
 from /scratch/sfr/next/include/linux/of_device.h:4,
 from /scratch/sfr/next/lib/genalloc.c:37:
/scratch/sfr/next/include/linux/device.h:620:14: note: expected 'void *' but 
argument is of type 'const char *'
 extern void *devres_find(struct device *dev, dr_release_t release,
  ^

Caused by commit

  e89a70fd54f2 ("genalloc: add support of multiple gen_pools per device")
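
For illustration only: the warning arises because a 'const char *' is passed
to a parameter declared as 'void *'. One general way to avoid this class of
warning is for a lookup helper that only reads its match data to take
'const void *'. The sketch below uses invented names (demo_pool, demo_find,
demo_pool_get) and is not the real devres_find() interface, nor necessarily
the fix that will be chosen for this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pool object identified by name. */
struct demo_pool {
	const char *name;
};

/* Match callback: match_data is only read, so it is const-qualified. */
typedef int (*demo_match_t)(const struct demo_pool *pool,
			    const void *match_data);

/* Returns the first pool accepted by match(), or NULL if none matches. */
const struct demo_pool *demo_find(const struct demo_pool *pools, size_t n,
				  demo_match_t match, const void *match_data)
{
	for (size_t i = 0; i < n; i++)
		if (match(&pools[i], match_data))
			return &pools[i];
	return NULL;
}

/* Compares the pool name against a NUL-terminated string. */
int demo_match_name(const struct demo_pool *pool, const void *match_data)
{
	return strcmp(pool->name, (const char *)match_data) == 0;
}

/* A const char * name can be passed through without dropping const. */
const struct demo_pool *demo_pool_get(const struct demo_pool *pools, size_t n,
				      const char *name)
{
	return demo_find(pools, n, demo_match_name, name);
}
```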

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


linux-next: build failure after merge of the akpm-current tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

After merging the akpm-current tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

arch/arm/kernel/entry-common.S: Assembler messages:
arch/arm/kernel/entry-common.S:108: Error: __NR_syscalls is not equal to the 
size of the syscall table

Caused by commit

  d221fc1f0f25 ("mm: mlock: add new mlock, munlock, and munlockall system 
calls")

I have added the following fix patch for today:

From: Stephen Rothwell 
Date: Thu, 16 Jul 2015 14:58:53 +1000
Subject: [PATCH] mm: mlock: fix for add new mlock, munlock, and munlockall 
system calls

Signed-off-by: Stephen Rothwell 
---
 arch/arm/include/asm/unistd.h | 2 +-
 arch/arm/kernel/calls.S   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h
index 32640c431a08..2516c09d65d7 100644
--- a/arch/arm/include/asm/unistd.h
+++ b/arch/arm/include/asm/unistd.h
@@ -19,7 +19,7 @@
  * This may need to be greater than __NR_last_syscall+1 in order to
  * account for the padding in the syscall table
  */
-#define __NR_syscalls  (388)
+#define __NR_syscalls  (392)
 
 /*
  * *NOTE*: This is a ghost syscall private to the kernel.  Only the
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index 514e77b26414..88808221383b 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -399,7 +399,7 @@
CALL(sys_execveat)
CALL(sys_mlock2)
CALL(sys_munlock2)
-/* 400 */  CALL(sys_munlockall2)
+/* 390 */  CALL(sys_munlockall2)
 #ifndef syscalls_counted
 .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
 #define syscalls_counted
-- 
2.1.4
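
The padding expression in calls.S rounds the number of table entries up to the
next multiple of 4, which is why __NR_syscalls may need to be larger than the
last syscall number plus one. The arithmetic can be checked with a small
sketch; the function names are made up, and the entry counts used below are
example inputs rather than values read from the tree.

```c
#include <assert.h>

/* Round the entry count up to the next multiple of 4. */
unsigned int demo_padded_syscalls(unsigned int nr_syscalls)
{
	return (nr_syscalls + 3) & ~3u;
}

/* Number of padding entries: ((nr + 3) & ~3) - nr, as in calls.S. */
unsigned int demo_syscalls_padding(unsigned int nr_syscalls)
{
	return demo_padded_syscalls(nr_syscalls) - nr_syscalls;
}
```

For example, a table with 391 entries (indices 0 to 390) pads to 392, while a
table with exactly 388 entries needs no padding.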

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [PATCH v3 6/6] ARM: PRM: AM437x: Enable IO wakeup feature

2015-07-15 Thread Keerthy



On Thursday 16 July 2015 10:44 AM, Paul Walmsley wrote:
> Hi
>
> On Tue, 14 Jul 2015, Keerthy wrote:
>
>> Enable IO wakeup feature.
>>
>> Signed-off-by: Keerthy 
>
> Per my comments on one of the previous patches, please add a short
> description in the commit message for what enabling I/O wakeup will do for
> a user.

Okay will do that.

> - Paul




[PATCH] asm-generic: {get,put}_user ptr argument evaluate only 1 time

2015-07-15 Thread Yoshinori Sato
The current implementation evaluates the ptr argument twice.
This can give an unexpected result when ptr has side effects.

Signed-off-by: Yoshinori Sato 
---
 include/asm-generic/uaccess.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index 72d8803..1b813fb 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -163,9 +163,10 @@ static inline __must_check long __copy_to_user(void __user 
*to,
 
 #define put_user(x, ptr)   \
 ({ \
+   __typeof__((ptr)) __p = (ptr);  \
might_fault();  \
-   access_ok(VERIFY_WRITE, ptr, sizeof(*ptr)) ?\
-   __put_user(x, ptr) :\
+   access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ?\
+   __put_user(x, __p) :\
-EFAULT;\
 })
 
@@ -225,9 +226,10 @@ extern int __put_user_bad(void) __attribute__((noreturn));
 
 #define get_user(x, ptr)                                       \
 ({                                                             \
+       __typeof__((ptr)) __p = (ptr);                          \
        might_fault();                                          \
-       access_ok(VERIFY_READ, ptr, sizeof(*ptr)) ?             \
-               __get_user(x, ptr) :                            \
+       access_ok(VERIFY_READ, __p, sizeof(*__p)) ?             \
+               __get_user(x, __p) :                            \
                -EFAULT;                                        \
 })
 
-- 
2.1.4
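
The double evaluation this patch removes can be reproduced outside the
kernel. What follows is only a minimal user-space sketch, assuming a
compiler with GNU statement expressions and __typeof__ (gcc or clang).
PUT_TWICE, PUT_ONCE, slot() and the evaluations counter are hypothetical
stand-ins invented for the illustration; they are not the kernel's
put_user() or access_ok().

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the pointer expression gets evaluated. */
static int evaluations;

/* A pointer expression with a side effect (stands in for something like p++). */
static int *slot(int *base)
{
    evaluations++;
    return base;
}

/* Old shape: ptr is expanded twice, once for the check and once for the store. */
#define PUT_TWICE(x, ptr) \
    ((ptr) != NULL ? (*(ptr) = (x), 0) : -1)

/* New shape: ptr is evaluated once into a local, like __p in the patch. */
#define PUT_ONCE(x, ptr)                        \
({                                              \
    __typeof__((ptr)) p_ = (ptr);               \
    p_ != NULL ? (*p_ = (x), 0) : -1;           \
})

/* Returns the number of times the pointer expression ran with the old shape. */
int evals_with_put_twice(void)
{
    int v = 0;

    evaluations = 0;
    int ret = PUT_TWICE(42, slot(&v));
    assert(ret == 0 && v == 42);
    return evaluations;
}

/* Returns the number of times the pointer expression ran with the new shape. */
int evals_with_put_once(void)
{
    int v = 0;

    evaluations = 0;
    int ret = PUT_ONCE(42, slot(&v));
    assert(ret == 0 && v == 42);
    return evaluations;
}
```

Calling evals_with_put_twice() reports two evaluations of slot(), while
evals_with_put_once() reports one. The operand of __typeof__ is not
evaluated here, so the temporary adds no extra call.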



Re: [RFC PATCH v2] memory-barriers: remove smp_mb__after_unlock_lock()

2015-07-15 Thread Benjamin Herrenschmidt
On Thu, 2015-07-16 at 15:03 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2015-07-16 at 12:00 +1000, Michael Ellerman wrote:
> > That would fix the problem with smp_mb__after_unlock_lock(), but not
> > the original worry we had about loads happening before the SC in lock.
> 
> However I think isync fixes *that* :-) The problem with isync is as you
> said, it's not a -memory- barrier per-se, it's an execution barrier /
> context synchronizing instruction. The combination stwcx. + bne + isync
> however prevents the execution of anything past the isync until the
> stwcx has completed and the bne has been "decided", which prevents loads
> from leaking into the LL/SC loop. It will also prevent a store in the
> lock from being issued before the stwcx. has completed. It does *not*
> prevent as far as I can tell another unrelated store before the lock
> from leaking into the lock, including the one used to unlock a different
> lock.

Except that the architecture says:

<<
Because a Store Conditional instruction may complete before its store
has been performed, a conditional Branch instruction that depends on
the CR0 value set by a Store Conditional instruction does not order the
Store Conditional's store with respect to storage accesses caused by
instructions that follow the Branch
>>

So isync in lock is architecturally incorrect, despite being what the
architecture recommends using, yay!

Ben.
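
For reference, the ordering a lock acquire is expected to provide can be
written down portably. This is a sketch in C11 atomics, not the powerpc
lwarx/stwcx./bne/isync sequence discussed above; toy_spinlock and the
toy_* functions are hypothetical names. The acquire on the successful
test-and-set is what keeps loads and stores inside the critical section
from being observed before the lock is taken. Acquire by itself says
nothing about stores issued before the lock call, which is the case
smp_mb__after_unlock_lock() is concerned with.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_flag held;
} toy_spinlock;

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns 1 if the lock was taken, 0 if it was already held. */
int toy_trylock(toy_spinlock *l)
{
    /* acquire: later accesses may not be reordered before this RMW */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void toy_lock(toy_spinlock *l)
{
    while (!toy_trylock(l))
        ; /* spin until the flag was previously clear */
}

void toy_unlock(toy_spinlock *l)
{
    /* release: earlier accesses may not be reordered after this store */
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Adds delta to *shared under the lock and returns the new value. */
int toy_locked_add(toy_spinlock *l, int *shared, int delta)
{
    assert(l != NULL && shared != NULL);
    toy_lock(l);
    *shared += delta;
    int v = *shared;
    toy_unlock(l);
    return v;
}
```

How a given architecture implements the acquire (isync, lwsync, or a
full sync on powerpc) is exactly the open question in this thread; the
sketch only states the contract.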




Re: [PATCH v2 1/6] ARM: OMAP4: PRM: Remove hardcoding of PRM_IO_PMCTRL_OFFSET register

2015-07-15 Thread Keerthy

Paul,

Thanks for the review!

On Thursday 16 July 2015 07:24 AM, Paul Walmsley wrote:

Hi

a few minor comments

On Wed, 8 Jul 2015, Keerthy wrote:


PRM_IO_PMCTRL_OFFSET need not be same for all SOCs hence
remove hardcoding and use the value provided by the omap_prcm_irq_setup
structure.


Please mention here that the reason why you're making this change is to
support AM437x.


Sure. I will do that.





Signed-off-by: Keerthy 
---
  arch/arm/mach-omap2/prcm-common.h |  1 +
  arch/arm/mach-omap2/prm44xx.c | 11 ++-
  2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm/mach-omap2/prcm-common.h b/arch/arm/mach-omap2/prcm-common.h
index 6ae0b3a..2e60406 100644
--- a/arch/arm/mach-omap2/prcm-common.h
+++ b/arch/arm/mach-omap2/prcm-common.h
@@ -494,6 +494,7 @@ struct omap_prcm_irq {
  struct omap_prcm_irq_setup {
u16 ack;
u16 mask;
+   u16 pm_ctrl;


Please add a kerneldoc structure documentation line for this new field, to
match the existing documentation here.


Okay.




u8 nr_regs;
u8 nr_irqs;
const struct omap_prcm_irq *irqs;
diff --git a/arch/arm/mach-omap2/prm44xx.c b/arch/arm/mach-omap2/prm44xx.c
index 4541700..8149e5a 100644
--- a/arch/arm/mach-omap2/prm44xx.c
+++ b/arch/arm/mach-omap2/prm44xx.c
@@ -45,6 +45,7 @@ static const struct omap_prcm_irq omap4_prcm_irqs[] = {
  static struct omap_prcm_irq_setup omap4_prcm_irq_setup = {
.ack= OMAP4_PRM_IRQSTATUS_MPU_OFFSET,
.mask   = OMAP4_PRM_IRQENABLE_MPU_OFFSET,
+   .pm_ctrl= OMAP4_PRM_IO_PMCTRL_OFFSET,
.nr_regs= 2,
.irqs   = omap4_prcm_irqs,
.nr_irqs= ARRAY_SIZE(omap4_prcm_irqs),
@@ -306,10 +307,10 @@ static void omap44xx_prm_reconfigure_io_chain(void)
omap4_prm_rmw_inst_reg_bits(OMAP4430_WUCLK_CTRL_MASK,
OMAP4430_WUCLK_CTRL_MASK,
inst,
-   OMAP4_PRM_IO_PMCTRL_OFFSET);
+   omap4_prcm_irq_setup.pm_ctrl);
omap_test_timeout(
(((omap4_prm_read_inst_reg(inst,
-  OMAP4_PRM_IO_PMCTRL_OFFSET) &
+  omap4_prcm_irq_setup.pm_ctrl) &
   OMAP4430_WUCLK_STATUS_MASK) >>
  OMAP4430_WUCLK_STATUS_SHIFT) == 1),
MAX_IOPAD_LATCH_TIME, i);
@@ -319,10 +320,10 @@ static void omap44xx_prm_reconfigure_io_chain(void)
/* Trigger WUCLKIN disable */
omap4_prm_rmw_inst_reg_bits(OMAP4430_WUCLK_CTRL_MASK, 0x0,
inst,
-   OMAP4_PRM_IO_PMCTRL_OFFSET);
+   omap4_prcm_irq_setup.pm_ctrl);
omap_test_timeout(
(((omap4_prm_read_inst_reg(inst,
-  OMAP4_PRM_IO_PMCTRL_OFFSET) &
+  omap4_prcm_irq_setup.pm_ctrl) &
   OMAP4430_WUCLK_STATUS_MASK) >>
  OMAP4430_WUCLK_STATUS_SHIFT) == 0),
MAX_IOPAD_LATCH_TIME, i);
@@ -350,7 +351,7 @@ static void __init omap44xx_prm_enable_io_wakeup(void)
omap4_prm_rmw_inst_reg_bits(OMAP4430_GLOBAL_WUEN_MASK,
OMAP4430_GLOBAL_WUEN_MASK,
inst,
-   OMAP4_PRM_IO_PMCTRL_OFFSET);
+   omap4_prcm_irq_setup.pm_ctrl);
  }

  /**
--
1.9.1




- Paul
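
The idea behind the patch, taking a register offset from a per-SoC setup
structure instead of a compile-time constant, can be sketched on its own.
Everything below is hypothetical: the toy_* names, the offsets and the
mask are made up for illustration and a plain array stands in for the
register bank. It is not the mach-omap2 code. The kerneldoc comment shows
the kind of per-field line requested in the review.

```c
#include <assert.h>
#include <stdint.h>

/**
 * struct toy_irq_setup - per-SoC register offsets (hypothetical)
 * @ack: byte offset of the IRQ status/ack register
 * @mask: byte offset of the IRQ enable register
 * @pm_ctrl: byte offset of the I/O power management control register
 */
struct toy_irq_setup {
    uint16_t ack;
    uint16_t mask;
    uint16_t pm_ctrl;
};

/* Made-up offsets for two imaginary SoCs; not taken from any datasheet. */
static const struct toy_irq_setup toy_soc_a = { .ack = 0x10, .mask = 0x18, .pm_ctrl = 0x20 };
static const struct toy_irq_setup toy_soc_b = { .ack = 0x04, .mask = 0x08, .pm_ctrl = 0x24 };

/* Made-up bit position for the global wakeup enable. */
#define TOY_GLOBAL_WUEN_MASK (1u << 16)

/* Read-modify-write of one 32-bit register in a simulated bank. */
uint32_t toy_rmw(uint32_t *bank, uint16_t byte_off, uint32_t mask, uint32_t bits)
{
    assert(byte_off % 4 == 0);
    uint32_t v = bank[byte_off / 4];

    v = (v & ~mask) | (bits & mask);
    bank[byte_off / 4] = v;
    return v;
}

/* The offset comes from the setup structure, so one function serves both SoCs. */
void toy_enable_io_wakeup(uint32_t *bank, const struct toy_irq_setup *setup)
{
    toy_rmw(bank, setup->pm_ctrl, TOY_GLOBAL_WUEN_MASK, TOY_GLOBAL_WUEN_MASK);
}
```

With this shape, supporting another SoC means adding one more setup
instance rather than another copy of the function.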




Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets

2015-07-15 Thread Keerthy



On Thursday 16 July 2015 08:08 AM, Paul Walmsley wrote:

On Thu, 16 Jul 2015, Paul Walmsley wrote:


On Wed, 8 Jul 2015, Keerthy wrote:


Add the PRM IRQ register offsets.

Signed-off-by: Keerthy 


Please add more detail to your commit messages so they conform to
Documentation/SubmittingPatches:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109

For example, this commit message should read something like:

---

ARM: AM43xx: Add the PRM IRQ register offsets

Add the PRM IRQ register offsets.  This is needed to support PRM I/O
wakeup on AM43xx.

--

Basically, your patches need to provide context as to _why_ the change is
needed.

I've fixed the message for this patch, and queued it for v4.3, but
please take care with this issue in the future.


Also I've moved the AM43XX_PRM_IO_PMCTRL_OFFSET macro out of the AM43XX CM
section, since it doesn't belong there.


Thanks Paul!




- Paul




Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Andy Lutomirski
On Wed, Jul 15, 2015 at 10:04 PM, Eric W. Biederman
 wrote:
> Andy Lutomirski  writes:
>
>> On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
>>  wrote:
>>>
>>> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
>>> this series.  Something I had not done before since it sounded totally
>>> wrong.
>>>
>>> That combined with your earlier comments I think I can say something
>>> meaningful.
>>>
>>> Andy as I read your patch the threat you are primarily worried about is
>>> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
>>> deal with that case is reasonable, and is unlikely to break userspace.
>>> It is one of those hairy security things so we need to be careful not to
>>> introduce a regression.
>>>
>>
>> Indeed.  It's plausible this could regress something, but it would be
>> really weird.
>>
>>> I think a top down enhancement of nosuid to just block funny cases that
>>> no one cares about is completely sensible.  Removing goofy corner cases
>>> that no one cares about and that are only good for security exploits
>>> seems reasonable.
>>>
>>
>> Agreed.
>>
>>> I am a little concerned that smack does not seem to respect nosuid
>>> on filesystems.  But that is an issue with nosuid not with your enhanced
>>> nosuid.
>>>
>>>
>>>
>>>
>>> Now this patch 3/7 really should be entitled:
>>> "Limit file caps to the userns of the super block".
>>>
>>> It really really is doing something different.   This change is about a
>>> bottom up understanding of what file caps means on a filesystem mounted
>>> by a user namespace root.
>>>
>>> That is file caps should only apply to the user namespace root of the
>>> root user who mounted the filesystem, because that is all the privileges
>>> the mounter of the filesystem had.
>>>
>>> This guarantees that even if the filesystem somehow propagates with
>>> mount propagation that there will be no issues.  I think I know how to
>>> make that happen...
>>>
>>>
>>>
>>>
>>> But deeply and fundamentally limiting a filesystem to only the
>>> privileges of its user namespace root, and enhancing nosuid
>>> protections are rather different things.
>>>
>>
>> So here's the semantic question:
>>
>> Suppose an unprivileged user (uid 1000) creates a user namespace and a
>> mount namespace.  They stick a file (owned by uid 1000 as seen by
>> init_user_ns) in there and mark it setuid root and give it fcaps.
>
> To make this make sense I have to ask, is this file on a filesystem
> where uid 1000 as seen by the init_user_ns stored as uid 1000 on
> the filesystem?  Or is this uid 0 as seen by the filesystem?
>
> I assume this is uid 0 on the filesystem in question or else your
> unprivileged user would not have sufficient privileges over the
> filesystem to setup fcaps.

I was thinking uid 0 as seen by the filesystem.  But even if it were
uid 1000, the unprivileged user can still set whatever mode and xattrs
they want -- they control the backing store.

>
>> Then global root gets an fd to this filesystem.  If they execve the
>> file directly, then, with my patch 4, it won't act as setuid 1000 and
>> the fcaps will be ignored.  Even with my patch 4, though, if they bind
>> mount the fs and execve the file from their bind mount, it will act as
>> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
>> fcaps will (correctly) not be honored.
>
> With patch 3 you can also think of it as fcaps being honored and you
> get all the caps in the appropriate user namespace, but since you are
> not in that user namespace and so don't have a place to store them
> in struct cred you don't get the file caps.
>
> From the philosophy of interpreting the file as defined by the
> filesystem in principle we could extend struct cred so you actually
> get the creds just in uid 1000s user namespace, but that is very
> unlikely to be worth it.

I agree.

>
>> I tend to think that, if we're not honoring the fcaps, we shouldn't be
>> honoring the setuid bit either.  After all, it's really not a trusted
>> file, even though the only user who could have messed with it really
>> is the apparent owner.
>
> For the file caps we can't honor them because you don't have the bits
> in struct cred.
>
> For setuid we can honor it, and setuid is something that the user
> namespace allows.
>

We certainly *can* honor it.  But why should we?  I'd be more
comfortable with this if the contents of an untrusted filesystem were
really treated as just data.

>> And, if we're going to say we don't trust the file and shouldn't honor
>> setuid or fcaps, then merging all the functionality into mnt_may_suid
>> could make sense.  Yes, these two things do different things, but they
>> could hook in to the same place.
>
> There are really two separate questions:
> - Do we trust this filesystem?
> - Do you have the bits to implement this concept?
>
> Even if in this specific context the two questions wind up looking
> exactly the same. I think it makes a lot of sense to ask the two
> questions 

Re: [PATCH v3 6/6] ARM: PRM: AM437x: Enable IO wakeup feature

2015-07-15 Thread Paul Walmsley
Hi

On Tue, 14 Jul 2015, Keerthy wrote:

> Enable IO wakeup feature.
> 
> Signed-off-by: Keerthy 

Per my comments on one of the previous patches, please add a short 
description in the commit message for what enabling I/O wakeup will do for 
a user.

- Paul


Re: [PATCH v4 0/6] powernv: cpufreq: Report frequency throttle by OCC

2015-07-15 Thread Viresh Kumar
On 13-07-15, 19:39, Shilpasri G Bhat wrote:
> This patchset intends to add frequency throttle reporting mechanism
> to powernv-cpufreq driver when OCC throttles the frequency. OCC is an
> On-Chip-Controller which takes care of the power and thermal safety of
> the chip. The CPU frequency can be throttled during an OCC reset or
> when OCC tries to limit the max allowed frequency. The patchset will
> report such conditions so as to keep the user informed about reason
> for the drop in performance of workloads when frequency is throttled.
> 
> Changes from v3:
> - Rebased on top of 4.2-rc1
> - Minor changes in patch 2,3,4,6 this does not change the
>   functionality of the code
> - 594fcb9ec9e powerpc/powernv: Expose OPAL APIs required by PRD
>   interface , this patch fixes the build error due to which this
>   series was initially dropped
>   ERROR: ".opal_message_notifier_register"
>   drivers/cpufreq/powernv-cpufreq.ko] undefined!

I have already Acked v3 of this and that applies to this one as well.

-- 
viresh


Re: cpufreq/ondemand: unpinning an unpinned lock.

2015-07-15 Thread Viresh Kumar
On 16-07-15, 02:13, Rafael J. Wysocki wrote:
> Cc: Viresh as he's been working on governors recently.
> 
> On Wednesday, July 15, 2015 06:04:22 PM Dave Jones wrote:
> > WARNING: CPU: 1 PID: 29529 at kernel/locking/lockdep.c:3497 lock_unpin_lock+0x109/0x110()
> > unpinning an unpinned lock
> > CPU: 1 PID: 29529 Comm: kworker/1:1 Not tainted 4.2.0-rc2-think+ #3
> > Workqueue: events od_dbs_timer
> >  0009 880094d5baa8 ae7f5e6f 0007
> >  880094d5baf8 880094d5bae8 ae07b91a 0118
> >  00e0 880507bd5c58 0092 0004
> > Call Trace:
> >  [] dump_stack+0x4f/0x7b
> >  [] warn_slowpath_common+0x8a/0xc0
> >  [] warn_slowpath_fmt+0x46/0x50
> >  [] lock_unpin_lock+0x109/0x110
> >  [] __schedule+0x3ac/0xb60
> >  [] schedule+0x41/0x90
> >  [] schedule_preempt_disabled+0x18/0x30
> >  [] mutex_lock_nested+0x16f/0x3e0
> >  [] ? gov_queue_work+0x2f/0xf0
> >  [] ? od_check_cpu+0x57/0xd0
> >  [] ? gov_queue_work+0x2f/0xf0
> >  [] gov_queue_work+0x2f/0xf0
> >  [] od_dbs_timer+0xbd/0x150
> >  [] process_one_work+0x1f3/0x7a0
> >  [] ? process_one_work+0x162/0x7a0
> >  [] ? worker_thread+0xf9/0x470
> >  [] worker_thread+0x69/0x470
> >  [] ? preempt_count_sub+0xa3/0xf0
> >  [] ? process_one_work+0x7a0/0x7a0
> >  [] kthread+0x11f/0x140
> >  [] ? kthread_create_on_node+0x250/0x250
> >  [] ret_from_fork+0x3f/0x70
> >  [] ? kthread_create_on_node+0x250/0x250
> > ---[ end trace 86cca931caec9193 ]---

I don't know why this would happen. Just to confirm, you are getting
this on 4.2-rc (1 or 2)? And you weren't getting these on 4.1 at all?
And is it always reproducible? How?

There have been races in the cpufreq core for some time, and what got
pushed in 4.2-rc1 is just half of the fix. The other half is present
here:

http://marc.info/?i=cover.1434713657.git.viresh.kumar%40linaro.org

Please try this and let us know if things work well or not.

-- 
viresh


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Eric W. Biederman
Andy Lutomirski  writes:

> On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
>  wrote:
>>
>> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
>> this series.  Something I had not done before since it sounded totally
>> wrong.
>>
>> That combined with your earlier comments I think I can say something
>> meaningful.
>>
>> Andy as I read your patch the threat you are primarily worried about is
>> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
>> deal with that case is reasonable, and is unlikely to break userspace.
>> It is one of those hairy security things so we need to be careful not to
>> introduce a regression.
>>
>
> Indeed.  It's plausible this could regress something, but it would be
> really weird.
>
>> I think a top down enhancement of nosuid to just block funny cases that
>> no one cares about is completely sensible.  Removing goofy corner cases
>> that no one cares about and that are only good for security exploits
>> seems reasonable.
>>
>
> Agreed.
>
>> I am a little concerned that smack does not seem to respect nosuid
>> on filesystems.  But that is an issue with nosuid not with your enhanced
>> nosuid.
>>
>>
>>
>>
>> Now this patch 3/7 really should be entitled:
>> "Limit file caps to the userns of the super block".
>>
>> It really really is doing something different.   This change is about a
>> bottom up understanding of what file caps means on a filesystem mounted
>> by a user namespace root.
>>
>> That is file caps should only apply to the user namespace root of the
>> root user who mounted the filesystem, because that is all the privileges
>> the mounter of the filesystem had.
>>
>> This guarantees that even if the filesystem somehow propagates with
>> mount propagation that there will be no issues.  I think I know how to
>> make that happen...
>>
>>
>>
>>
>> But deeply and fundamentally limiting a filesystem to only the
>> privileges of its user namespace root, and enhancing nosuid
>> protections are rather different things.
>>
>
> So here's the semantic question:
>
> Suppose an unprivileged user (uid 1000) creates a user namespace and a
> mount namespace.  They stick a file (owned by uid 1000 as seen by
> init_user_ns) in there and mark it setuid root and give it fcaps.

To make this make sense I have to ask, is this file on a filesystem
where uid 1000 as seen by the init_user_ns stored as uid 1000 on
the filesystem?  Or is this uid 0 as seen by the filesystem?

I assume this is uid 0 on the filesystem in question or else your
unprivileged user would not have sufficient privileges over the
filesystem to setup fcaps.

> Then global root gets an fd to this filesystem.  If they execve the
> file directly, then, with my patch 4, it won't act as setuid 1000 and
> the fcaps will be ignored.  Even with my patch 4, though, if they bind
> mount the fs and execve the file from their bind mount, it will act as
> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
> fcaps will (correctly) not be honored.

With patch 3 you can also think of it as fcaps being honored and you
get all the caps in the appropriate user namespace, but since you are
not in that user namespace and so don't have a place to store them
in struct cred you don't get the file caps.

From the philosophy of interpreting the file as defined by the
filesystem in principle we could extend struct cred so you actually
get the creds just in uid 1000s user namespace, but that is very
unlikely to be worth it.

> I tend to think that, if we're not honoring the fcaps, we shouldn't be
> honoring the setuid bit either.  After all, it's really not a trusted
> file, even though the only user who could have messed with it really
> is the apparent owner.

For the file caps we can't honor them because you don't have the bits
in struct cred.

For setuid we can honor it, and setuid is something that the user
namespace allows.

> And, if we're going to say we don't trust the file and shouldn't honor
> setuid or fcaps, then merging all the functionality into mnt_may_suid
> could make sense.  Yes, these two things do different things, but they
> could hook in to the same place.

There are really two separate questions:
- Do we trust this filesystem?
- Do you have the bits to implement this concept?

Even if in this specific context the two questions wind up looking
exactly the same. I think it makes a lot of sense to ask the two
questions separately.  As future maintenance changes may cause the
implementation of the questions to diverge.

Eric
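
The two questions above can be kept apart in code even when they give
the same answer today. The following is a toy model only, under stated
assumptions: a user namespace is just a node with a parent pointer,
"trust" is reduced to the mount not being nosuid, and file caps are
treated as representable only when the executing task's user namespace
is the namespace that owns the superblock or a descendant of it. All
toy_* names are hypothetical and none of this is the kernel's
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_userns {
    const struct toy_userns *parent;    /* NULL for the initial namespace */
};

struct toy_mount {
    const struct toy_userns *owner_ns;  /* namespace whose root mounted the fs */
    bool nosuid;
};

/* True if ns is target or one of target's descendants. */
bool toy_in_userns(const struct toy_userns *ns, const struct toy_userns *target)
{
    for (; ns != NULL; ns = ns->parent)
        if (ns == target)
            return true;
    return false;
}

/* Question 1 (policy): do we trust this filesystem for privilege changes? */
bool toy_fs_trusted(const struct toy_mount *m)
{
    assert(m != NULL);
    return !m->nosuid;
}

/* Question 2 (mechanism): does the task have a place to hold these caps? */
bool toy_fcaps_representable(const struct toy_mount *m,
                             const struct toy_userns *task_ns)
{
    assert(m != NULL);
    return toy_in_userns(task_ns, m->owner_ns);
}

/* File caps take effect only when both questions are answered yes. */
bool toy_honor_fcaps(const struct toy_mount *m, const struct toy_userns *task_ns)
{
    return toy_fs_trusted(m) && toy_fcaps_representable(m, task_ns);
}
```

In this model a task in the initial namespace that executes a file from
a filesystem owned by a child namespace gets no file caps, which matches
the global-root scenario described above. Keeping two predicates means a
later change to one policy does not silently change the other.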



Re: [RFC PATCH v2] memory-barriers: remove smp_mb__after_unlock_lock()

2015-07-15 Thread Benjamin Herrenschmidt
On Thu, 2015-07-16 at 12:00 +1000, Michael Ellerman wrote:
> That would fix the problem with smp_mb__after_unlock_lock(), but not
> the original worry we had about loads happening before the SC in lock.

However I think isync fixes *that* :-) The problem with isync is as you
said, it's not a -memory- barrier per-se, it's an execution barrier /
context synchronizing instruction. The combination stwcx. + bne + isync
however prevents the execution of anything past the isync until the
stwcx has completed and the bne has been "decided", which prevents loads
from leaking into the LL/SC loop. It will also prevent a store in the
lock from being issued before the stwcx. has completed. It does *not*
prevent as far as I can tell another unrelated store before the lock
from leaking into the lock, including the one used to unlock a different
lock.

Cheers,
Ben.




Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Eric W. Biederman
Casey Schaufler  writes:

> On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
>> On Wed, Jul 15, 2015 at 3:39 PM, Casey Schaufler  
>> wrote:
>>> On 7/15/2015 2:06 PM, Eric W. Biederman wrote:
>>>> Casey Schaufler  writes:
>>>> The first step needs to be not trusting those labels and treating such
>>>> filesystems as filesystems without label support.  I hope that is what
>>>> Seth has implemented.
>>> A filesystem with Smack labels gets mounted in a namespace. The labels
>>> are ignored. Instead, the filesystem defaults (potentially specified as
>>> mount options smackfsdef="something", but usually the floor label ("_"))
>>> are used, giving the user the ability to read everything and (usually)
>>> change nothing. This is both dangerous (unintended read access to files)
>>> and pointless (can't make changes).
>> I don't get it.
>>
>> If I mount an unprivileged filesystem, then either the contents were
>> put there *by me*, in which case letting me access them are fine, or
>> (with Seth's patches and then some) I control the backing store, in
>> which case I can do whatever I want regardless of what LSM thinks.
>>
>> So I don't see the problem.  Why would Smack or any other LSM care at
>> all, unless it wants to prevent me from mounting the fs in the first
>> place?
>
> First off, I don't cotton to the notion that you should be able
> to mount filesystems without privilege. But it seems I'm being
> outvoted on that. I suspect that there are cases where it might
> be safe, but I can't think of one off the top of my head.

There are two fundamental issues with mounting filesystems without
privilege, by which I actually mean mounting filesystems as the root
user in a user namespace.

- Are the semantics safe.
- Is the extra attack surface a problem.

Figuring out how to make semantics safe is what we are talking about.

Once we sort out the semantics we can look at the handful of filesystems
like fuse where the extra attack surface is not a concern.

With that said desktop environments have for a long time been
automatically mounting whichever filesystem you place in your computer,
so in practice what this is really about is trying to align the kernel
with how people use filesystems.

I haven't looked closely but I think docker is just about as bad as
those desktop environments when it comes to mounting filesystems.

> If you do mount a filesystem it needs to behave according to the
> rules of the system.

I agree.

> If you have a security module that uses
> attributes on the filesystem you can't ignore them just because
> it's "your data". Mandatory access control schemes, including
> Smack and SELinux don't give a fig about who you are. It's the
> label on the data and the process that matter. If "you" get to
> muck the labels up, you've broken the mandatory access control.

So there are filesystems like fat and minix that can not store a label.
Since it is not possible to store labels securely in filesystems mounted
by unprivileged users (at least in the normal sense) the intent would be
to treat a filesystem mounted without the privileges of the global root
user as a filesystem that does not support xattrs.

Treating such a filesystem as a filesystem that does not support xattrs
is the only possible way to support such a filesystem securely, because as
you have said someone who can muck up the labels breaks mandatory access
control.

Given how non-trivial it is to grasp the nuances of different LSMs'
mandatory access control semantics, I am asking Seth for the first pass
to simply forbid mounting of filesystems with just user namespace
permissions when there is an LSM active.

Once we get that far smack may never need to support such systems.

Eric
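
As a concrete reading of "treat it as a filesystem that does not support
xattrs", here is a toy model. It assumes a single hypothetical flag
recording whether the global root user performed the mount; struct
toy_sb and the toy_* functions are invented for the illustration and are
not Smack, SELinux or VFS code. Labels stored on an unprivileged mount
are never returned, so the caller falls back to the filesystem default,
just as it would on fat or minix.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_sb {
    bool mounted_by_global_root;  /* hypothetical flag, set at mount time */
    const char *stored_label;     /* label found in the on-disk xattr, if any */
};

/* Returns 0 and sets *out, or -EOPNOTSUPP when labels must not be trusted. */
int toy_get_label(const struct toy_sb *sb, const char **out)
{
    assert(sb != NULL && out != NULL);
    if (!sb->mounted_by_global_root)
        return -EOPNOTSUPP;       /* behave as if the fs had no xattrs */
    *out = sb->stored_label;
    return 0;
}

/* Label the security module would act on: stored one, else the fs default. */
const char *toy_effective_label(const struct toy_sb *sb, const char *fs_default)
{
    const char *label = NULL;

    if (toy_get_label(sb, &label) != 0 || label == NULL)
        return fs_default;
    return label;
}
```

Whether falling back to a default label is acceptable is the point under
dispute in this thread; the stricter first pass proposed above would
refuse such a mount outright while an LSM is active.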


linux-next: manual merge of the akpm-current tree with the arm tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in:

  arch/arm/include/asm/Kbuild

between commit:

  57853e8906a0 ("ARM: 8403/1: kbuild: don't use generic mcs_spinlock.h header")

from the arm tree and commit:

  74cf1a5a0c64 ("mm: clean up per architecture MM hook header files")

from the akpm-current tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/arm/include/asm/Kbuild
index 517ef6dd22b9,30b3bc1666d2..
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@@ -12,6 -12,8 +12,7 @@@ generic-y += irq_regs.
  generic-y += kdebug.h
  generic-y += local.h
  generic-y += local64.h
 -generic-y += mcs_spinlock.h
+ generic-y += mm-arch-hooks.h
  generic-y += msgbuf.h
  generic-y += param.h
  generic-y += parport.h


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Andy Lutomirski
On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
 wrote:
>
> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
> this series.  Something I had not done before since it sounded totally
> wrong.
>
> That combined with your earlier comments I think I can say something
> meaningful.
>
> Andy as I read your patch the threat you are primarily worried about is
> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
> deal with that case is reasonable, and is unlikely to break userspace.
> It is one of those hairy security things so we need to be careful not to
> introduce a regression.
>

Indeed.  It's plausible this could regress something, but it would be
really weird.

> I think a top down enhancement of nosuid to just block funny cases that
> no one cares about is completely sensible.  Removing goofy corner cases
> that no one cares about and that are only good for security exploits
> seems reasonable.
>

Agreed.

> I am a little concerned that smack does not seem to respect nosuid
> on filesystems.  But that is an issue with nosuid not with your enhanced
> nosuid.
>
>
>
>
> Now this patch 3/7 really should be entitled:
> "Limit file caps to the userns of the super block".
>
> It really really is doing something different.   This change is about a
> bottom up understanding of what file caps means on a filesystem mounted
> by a user namespace root.
>
> That is file caps should only apply to the user namespace root of the
> root user who mounted the filesystem, because that is all the privileges
> the mounter of the filesystem had.
>
> This guarantees that even if the filesystem somehow propagates with
> mount propagation that there will be no issues.  I think I know how to
> make that happen...
>
>
>
>
> But deeply and fundamentally limiting a filesystem to only the
> privileges of its user namespace root, and enhancing nosuid
> protections are rather different things.
>

So here's the semantic question:

Suppose an unprivileged user (uid 1000) creates a user namespace and a
mount namespace.  They stick a file (owned by uid 1000 as seen by
init_user_ns) in there and mark it setuid root and give it fcaps.

Then global root gets an fd to this filesystem.  If they execve the
file directly, then, with my patch 4, it won't act as setuid 1000 and
the fcaps will be ignored.  Even with my patch 4, though, if they bind
mount the fs and execve the file from their bind mount, it will act as
setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
fcaps will (correctly) not be honored.

I tend to think that, if we're not honoring the fcaps, we shouldn't be
honoring the setuid bit either.  After all, it's really not a trusted
file, even though the only user who could have messed with it really
is the apparent owner.

And, if we're going to say we don't trust the file and shouldn't honor
setuid or fcaps, then merging all the functionality into mnt_may_suid
could make sense.  Yes, these two things do different things, but they
could hook in to the same place.

--Andy


OFFICIAL LETTER 16\07\2015

2015-07-15 Thread MR. PHILIP COHEN


HELLO, 

KINDLY STUDY ATTACHED DOCUMENT FOR A BETTER UNDERSTANDING TO MY PROPOSAL.

THANKS FOR TAKING THE TIME TO READ MY E-MAIL MESSAGE.

REGARDS, 
MR. PHILIP COHEN



MR. PHILIP COHEN.docx
Description: MS-Word 2007 document


[PATCH v3 3/4] arm64: Add Broadcom iProc family support

2015-07-15 Thread Ray Jui
This patch adds support to Broadcom's iProc family of arm64 based SoCs
in the arm64 Kconfig and defconfig files

Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 arch/arm64/Kconfig   |5 +
 arch/arm64/configs/defconfig |2 ++
 2 files changed, 7 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 318175f..969ef4a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -162,6 +162,11 @@ source "kernel/Kconfig.freezer"
 
 menu "Platform selection"
 
+config ARCH_BCM_IPROC
+   bool "Broadcom iProc SoC Family"
+   help
+ This enables support for Broadcom iProc based SoCs
+
 config ARCH_EXYNOS
bool
help
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 4e17e7e..c83d51f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -31,6 +31,7 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
 # CONFIG_IOSCHED_DEADLINE is not set
+CONFIG_ARCH_BCM_IPROC=y
 CONFIG_ARCH_EXYNOS7=y
 CONFIG_ARCH_FSL_LS2085A=y
 CONFIG_ARCH_HISI=y
@@ -102,6 +103,7 @@ CONFIG_SERIO_AMBAKMI=y
 CONFIG_LEGACY_PTY_COUNT=16
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_DW=y
 CONFIG_SERIAL_8250_MT6577=y
 CONFIG_SERIAL_AMBA_PL011=y
 CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
-- 
1.7.9.5



[PATCH v3 1/4] PCI: iproc: enable arm64 support for iProc PCIe

2015-07-15 Thread Ray Jui
This patch enables arm64 support to the iProc PCIe driver

Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 drivers/pci/host/pcie-iproc.c |   15 ---
 drivers/pci/host/pcie-iproc.h |8 ++--
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index d77481e..8a556d5 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -58,11 +58,6 @@
 #define SYS_RC_INTX_EN   0x330
 #define SYS_RC_INTX_MASK 0xf
 
-static inline struct iproc_pcie *sys_to_pcie(struct pci_sys_data *sys)
-{
-   return sys->private_data;
-}
-
 /**
  * Note access to the configuration registers are protected at the higher layer
  * by 'pci_lock' in drivers/pci/access.c
@@ -71,8 +66,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct pci_bus 
*bus,
unsigned int devfn,
int where)
 {
-   struct pci_sys_data *sys = bus->sysdata;
-   struct iproc_pcie *pcie = sys_to_pcie(sys);
+   struct iproc_pcie *pcie = bus->sysdata;
unsigned slot = PCI_SLOT(devfn);
unsigned fn = PCI_FUNC(devfn);
unsigned busno = bus->number;
@@ -208,10 +202,7 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct 
list_head *res)
 
iproc_pcie_reset(pcie);
 
-   pcie->sysdata.private_data = pcie;
-
-   bus = pci_create_root_bus(pcie->dev, 0, &iproc_pcie_ops,
- &pcie->sysdata, res);
+   bus = pci_create_root_bus(pcie->dev, 0, &iproc_pcie_ops, pcie, res);
if (!bus) {
dev_err(pcie->dev, "unable to create PCI root bus\n");
ret = -ENOMEM;
@@ -229,7 +220,9 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct 
list_head *res)
 
pci_scan_child_bus(bus);
pci_assign_unassigned_bus_resources(bus);
+#ifdef CONFIG_ARM
pci_fixup_irqs(pci_common_swizzle, pcie->map_irq);
+#endif
pci_bus_add_devices(bus);
 
return 0;
diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
index ba0a108..0ee9673 100644
--- a/drivers/pci/host/pcie-iproc.h
+++ b/drivers/pci/host/pcie-iproc.h
@@ -18,18 +18,22 @@
 
 /**
  * iProc PCIe device
+ * @sysdata: Per PCI controller data. This needs to be kept at the beginning of
+ * struct iproc_pcie, to enable support of both ARM32 and ARM64 platforms with
+ * minimal changes in the iProc PCIe core driver
  * @dev: pointer to device data structure
  * @base: PCIe host controller I/O register base
  * @resources: linked list of all PCI resources
- * @sysdata: Per PCI controller data
  * @root_bus: pointer to root bus
  * @phy: optional PHY device that controls the Serdes
  * @irqs: interrupt IDs
  */
 struct iproc_pcie {
+#ifdef CONFIG_ARM
+   struct pci_sys_data sysdata;
+#endif
struct device *dev;
void __iomem *base;
-   struct pci_sys_data sysdata;
struct pci_bus *root_bus;
struct phy *phy;
int irqs[IPROC_PCIE_MAX_NUM_IRQS];
-- 
1.7.9.5



[PATCH v3 0/4] Add Broadcom North Star 2 support

2015-07-15 Thread Ray Jui
This patch series adds Broadcom North Star 2 (NS2) SoC support. NS2 is an ARMv8
based SoC and under the Broadcom iProc family.

Sorry for tying this with the Broadcom iProc PCIe driver fixes for ARM64. I
have to tie them together because iProc PCIe support is enabled by default
when ARCH_BCM_IPROC is enabled. Without the fixes in the iProc PCIe driver,
enabling CONFIG_ARCH_BCM_IPROC would break the build for arm64 defconfig. Let
me know if there's a better way to handle this.

This patch series is generated based on v4.2-rc2 and tested on Broadcom NS2 SVK

Code available on GITHUB: https://github.com/Broadcom/arm64-linux.git
branch is ns2-core-v3

Changes from V2:
- Drop hardcoded earlycon kernel command line parameter in NS2 SVK dts file
because 1) earlycon is a debugging feature that can be enabled in the
bootloader and should not be enabled by default in the board dts file and 2)
of_earlycon should be used and support should be added to 8250 DW driver

Changes from V1:
- Took Arnd's advice to tweak the location of struct pci_sys_data within
struct iproc_pcie. This helps to get rid of most of the CONFIG_ARM wrap in
iProc PCIe core driver
- Use stdout-path and alias for serial console in NS2 SVK dts
- Add all 4 CPU descriptions in NS2 dtsi
- Remove "clock-frequency" property in the armv8 timer node so timer frequency
can be determined based on readings from CNTFRQ_EL0
- Remove config flag ARCH_BCM_NS2. Leave only ARCH_BCM_IPROC for all Broadcom
arm64 SoCs as advised
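
As a rough illustration of why the placement of struct pci_sys_data matters
(a standalone sketch, not the driver code; every demo_* name below is made up
for this example): C guarantees that a pointer to a structure, suitably
converted, points to its first member. So when sysdata is the first field,
the same bus->sysdata pointer can be read as the per-controller data by
ARM32-style code and as the enclosing structure by the driver.

```c
#include <stddef.h>

/* Hypothetical stand-in for the ARM32 per-controller data. */
struct demo_sys_data {
	void *private_data;
};

/* Hypothetical stand-in for struct iproc_pcie: sysdata comes first. */
struct demo_pcie {
	struct demo_sys_data sysdata;
	int irq;
};

/* How ARM32-style common code would interpret bus->sysdata. */
static struct demo_sys_data *demo_as_sys_data(void *bus_sysdata)
{
	return bus_sysdata;
}

/* How the driver interprets the very same pointer. */
static struct demo_pcie *demo_as_pcie(void *bus_sysdata)
{
	return bus_sysdata;
}

/* Returns 1 when both views of the pointer are valid at the same time. */
static int demo_views_agree(struct demo_pcie *pcie)
{
	void *bus_sysdata = pcie;

	return offsetof(struct demo_pcie, sysdata) == 0 &&
	       demo_as_sys_data(bus_sysdata) == &pcie->sysdata &&
	       demo_as_pcie(bus_sysdata) == pcie;
}
```

If the member were moved away from offset 0, the first view would no longer
be valid, which is why the kerneldoc in patch 1/4 asks for it to stay at the
beginning.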

Ray Jui (4):
  PCI: iproc: enable arm64 support for iProc PCIe
  PCI: iproc: Fix ARM64 dependency in Kconfig
  arm64: Add Broadcom iProc family support
  arm64: dts: Add Broadcom North Star 2 support

 Documentation/devicetree/bindings/arm/bcm/ns2.txt |9 ++
 arch/arm64/Kconfig|5 +
 arch/arm64/boot/dts/Makefile  |1 +
 arch/arm64/boot/dts/broadcom/Makefile |5 +
 arch/arm64/boot/dts/broadcom/ns2-svk.dts  |   59 +++
 arch/arm64/boot/dts/broadcom/ns2.dtsi |  118 +
 arch/arm64/configs/defconfig  |2 +
 drivers/pci/host/Kconfig  |2 +-
 drivers/pci/host/pcie-iproc.c |   15 +--
 drivers/pci/host/pcie-iproc.h |8 +-
 10 files changed, 210 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/bcm/ns2.txt
 create mode 100644 arch/arm64/boot/dts/broadcom/Makefile
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2-svk.dts
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2.dtsi

-- 
1.7.9.5



[PATCH v3 4/4] arm64: dts: Add Broadcom North Star 2 support

2015-07-15 Thread Ray Jui
Add Broadcom NS2 device tree binding document. Also add initial device
tree dtsi for Broadcom North Star 2 (NS2) SoC and board support for NS2
SVK board

Signed-off-by: Jon Mason 
Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 Documentation/devicetree/bindings/arm/bcm/ns2.txt |9 ++
 arch/arm64/boot/dts/Makefile  |1 +
 arch/arm64/boot/dts/broadcom/Makefile |5 +
 arch/arm64/boot/dts/broadcom/ns2-svk.dts  |   59 +++
 arch/arm64/boot/dts/broadcom/ns2.dtsi |  118 +
 5 files changed, 192 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/bcm/ns2.txt
 create mode 100644 arch/arm64/boot/dts/broadcom/Makefile
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2-svk.dts
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2.dtsi

diff --git a/Documentation/devicetree/bindings/arm/bcm/ns2.txt 
b/Documentation/devicetree/bindings/arm/bcm/ns2.txt
new file mode 100644
index 0000000..35f056f
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/bcm/ns2.txt
@@ -0,0 +1,9 @@
+Broadcom North Star 2 (NS2) device tree bindings
+
+
+Boards with NS2 shall have the following properties:
+
+Required root node property:
+
+NS2 SVK board
+compatible = "brcm,ns2-svk", "brcm,ns2";
diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index 38913be..9f95941 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -1,6 +1,7 @@
 dts-dirs += amd
 dts-dirs += apm
 dts-dirs += arm
+dts-dirs += broadcom
 dts-dirs += cavium
 dts-dirs += exynos
 dts-dirs += freescale
diff --git a/arch/arm64/boot/dts/broadcom/Makefile 
b/arch/arm64/boot/dts/broadcom/Makefile
new file mode 100644
index 0000000..e21fe66
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/Makefile
@@ -0,0 +1,5 @@
+dtb-$(CONFIG_ARCH_BCM_IPROC) += ns2-svk.dtb
+
+always := $(dtb-y)
+subdir-y   := $(dts-dirs)
+clean-files:= *.dtb
diff --git a/arch/arm64/boot/dts/broadcom/ns2-svk.dts 
b/arch/arm64/boot/dts/broadcom/ns2-svk.dts
new file mode 100644
index 0000000..244baf8
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/ns2-svk.dts
@@ -0,0 +1,59 @@
+/*
+ *  BSD LICENSE
+ *
+ *  Copyright(c) 2015 Broadcom Corporation.  All rights reserved.
+ *
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ ** Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ ** Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in
+ *  the documentation and/or other materials provided with the
+ *  distribution.
+ ** Neither the name of Broadcom Corporation nor the names of its
+ *  contributors may be used to endorse or promote products derived
+ *  from this software without specific prior written permission.
+ *
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/dts-v1/;
+
+#include "ns2.dtsi"
+
+/ {
+   model = "Broadcom NS2 SVK";
+   compatible = "brcm,ns2-svk", "brcm,ns2";
+
+   aliases {
+   serial0 = &uart3;
+   };
+
+   chosen {
+   stdout-path = "serial0:115200n8";
+   };
+
+   memory {
+   device_type = "memory";
+   reg = <0x0 0x8000 0x 0x4000>;
+   };
+
+   soc: soc {
+   uart3: serial@6613 {
+   status = "ok";
+   };
+   };
+};
diff --git a/arch/arm64/boot/dts/broadcom/ns2.dtsi 
b/arch/arm64/boot/dts/broadcom/ns2.dtsi
new file mode 100644
index 0000000..3c92d92
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/ns2.dtsi
@@ -0,0 +1,118 @@
+/*
+ *  BSD LICENSE
+ *
+ *  Copyright(c) 2015 Broadcom Corporation.  All rights reserved.
+ *
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ ** Redistributions of source code must retain the above copyright
+ *  

[PATCH v3 2/4] PCI: iproc: Fix ARM64 dependency in Kconfig

2015-07-15 Thread Ray Jui
Allow Broadcom iProc PCIe core driver to be compiled for ARM64

Signed-off-by: Ray Jui 
Reviewed-by: Vikram Prakash 
Reviewed-by: Scott Branden 
---
 drivers/pci/host/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index c132bdd..d2c6144 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -117,7 +117,7 @@ config PCI_VERSATILE
 
 config PCIE_IPROC
tristate "Broadcom iProc PCIe controller"
-   depends on OF && ARM
+   depends on OF && (ARM || ARM64)
default n
help
  This enables the iProc PCIe core controller support for Broadcom's
-- 
1.7.9.5



Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Eric W. Biederman

Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
this series.  Something I had not done before since it sounded totally
wrong.

With that and your earlier comments, I think I can say something
meaningful.

Andy, as I read your patch, the threat you are primarily worried about is
chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
deal with that case is reasonable, and is unlikely to break userspace.
It is one of those hairy security things so we need to be careful not to
introduce a regression.

I think a top down enhancement of nosuid to just block funny cases that
no one cares about is completely sensible.  Removing goofy corner cases
that no one cares about and that are only good for security exploits
seems reasonable.

I am a little concerned that smack does not seem to respect nosuid
on filesystems.  But that is an issue with nosuid not with your enhanced
nosuid.




Now this patch 3/7 really should be entitled:
"Limit file caps to the userns of the super block".

It really really is doing something different.   This change is about a
bottom up understanding of what file caps means on a filesystem mounted
by a user namespace root. 

That is, file caps should only apply to the user namespace root of the
root user who mounted the filesystem, because that is all the privileges
the mounter of the filesystem had.

This guarantees that even if the filesystem somehow propagates with
mount propagation that there will be no issues.  I think I know how to
make that happen...




But deeply and fundamentally, limiting a filesystem to only the
privileges of its user namespace root and enhancing nosuid
protections are rather different things.


The approaches show up differently for dealing with uids and gids,
as mappings are required.  The approaches will likely continue to
show up differently for file caps when Serge implements a version
of file caps with a user namespace root in them.

The approaches fundamentally will need to do different things with
security xattrs, as mnt_may_suid can just treat the filesystem as one
without labels, while ultimately the LSMs will have to do something
meaningful.



So while in the very narrow case of today's file caps the two approaches
are the same, enhancing nosuid is something very different from
limiting a filesystem to its mounter's user namespace.

Eric


Re: [V2 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-15 Thread David Miller
From: Dexuan Cui 
Date: Tue, 14 Jul 2015 03:00:48 -0700

> + pr_debug("hvsock_sk_destruct: called\n");

Debug logging just to state that a function is called is not appropriate,
we have very sophisticated tracing facilities in the kernel that can do
that transparently, and more.

Please remove this.

> + if (hvsk->channel) {
> + pr_debug("hvsock_sk_destruct: calling vmbus_close()\n");

Likewise, these kinds of debug logs are totally inappropriate.

> +static int hvsock_release(struct socket *sock)
> +{
> + /* sock->sk is NULL, if accept() is interrupted by a signal */
> + if (sock->sk) {
> + __hvsock_release(sock->sk);
> + sock->sk = NULL;
> + }
> +
> + sock->state = SS_FREE;
> + pr_debug("hvsock_release called\n\n");

Likewise.


LKML archives at UofI down?

2015-07-15 Thread Josh Triplett
The LKML archives once present at
http://lkml.iu.edu/hypermail/linux/kernel/index.html seem to be down;
http://lkml.iu.edu/hypermail/ appears empty.  Does anyone know what
happened to it?

- Josh Triplett


Re: [V2 3/7] Drivers: hv: vmbus: add APIs to send/recv hvsock packet and get the r/w-ability

2015-07-15 Thread David Miller
From: Dexuan Cui 
Date: Tue, 14 Jul 2015 02:58:56 -0700

> +int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, void *buf, u32 
> len)
> +{
> + struct vmpacket_descriptor desc;
> + struct vmpipe_proto_header pipe_hdr;
> + u32 packetlen;
> + u32 packetlen_aligned;
> + struct kvec bufferlist[4];
> + u64 aligned_data = 0;
> + int ret;
> + bool signal = false;

Reverse christmas-tree (longest to shortest line) order these local
variables, please.
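
To make the request concrete, here is how the same locals could look once
reordered. This is only a sketch: the typedefs and struct bodies are stand-ins
so it builds outside the kernel, and demo_sendpacket_locals() is a made-up
name. Only the order of the declarations is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in typedefs and structs so the sketch builds outside the kernel. */
typedef uint32_t u32;
typedef uint64_t u64;
struct vmpacket_descriptor { u32 unused; };
struct vmpipe_proto_header { u32 unused; };
struct kvec { void *iov_base; unsigned long iov_len; };

/* Hypothetical function; declarations run from longest line to shortest. */
static int demo_sendpacket_locals(void)
{
	struct vmpipe_proto_header pipe_hdr;
	struct vmpacket_descriptor desc;
	struct kvec bufferlist[4];
	u32 packetlen_aligned;
	u64 aligned_data = 0;
	bool signal = false;
	u32 packetlen;
	int ret;

	/* Touch every local so the sketch does something well defined. */
	pipe_hdr.unused = 0;
	desc.unused = 0;
	bufferlist[0].iov_base = &aligned_data;
	bufferlist[0].iov_len = sizeof(aligned_data);
	packetlen = sizeof(desc) + sizeof(pipe_hdr);
	packetlen_aligned = (packetlen + 7) & ~7u;
	ret = signal ? -1 : 0;

	return packetlen_aligned >= packetlen ? ret : -1;
}
```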

> +EXPORT_SYMBOL(vmbus_sendpacket_hvsock);

EXPORT_SYMBOL_GPL()

> +int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void *buffer,
> + u32 bufferlen, u32 *buffer_actual_len)
> +{
> + struct vmpacket_descriptor *desc;
> + struct vmpipe_proto_header *pipe_hdr;
> + u32 packet_len, payload_len;
> + int ret;
> + bool signal = false;

Again, please use reverse christmas-tree order.

> +void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel,
> +bool *can_read, bool *can_write)

Second line is not properly indented, it should start exactly one
column after the opening parenthesis on the previous line.
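
A sketch of the expected layout, assuming the usual kernel convention of tabs
followed by spaces for the alignment (the struct body and the
demo_get_rw_status() name are stand-ins invented for this example):

```c
#include <stdbool.h>

/* Stand-in so the sketch builds outside the kernel. */
struct vmbus_channel {
	bool readable;
	bool writable;
};

/*
 * The continuation line is indented so that "bool *can_read" starts
 * exactly one column after the opening parenthesis on the line above.
 */
static void demo_get_rw_status(const struct vmbus_channel *channel,
			       bool *can_read, bool *can_write)
{
	*can_read = channel->readable;
	*can_write = channel->writable;
}
```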

> + hv_get_ringbuffer_availbytes(inring_info,
> + bytes_avail_toread,
> + bytes_avail_towrite);

Again, improperly indented.

> +extern int vmbus_sendpacket_hvsock(struct vmbus_channel *channel,
> + void *buf, u32 len);
> +

Likewise.

> +extern int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void 
> *buffer,
> + u32 bufferlen, u32 *buffer_actual_len);
> +
> +extern void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel,
> +bool *can_read, bool *can_write);

Likewise.


Re: [PATCH net-next] hv_netvsc: Add close of RNDIS filter into change mtu call

2015-07-15 Thread David Miller
From: Haiyang Zhang 
Date: Mon, 13 Jul 2015 13:09:16 -0700

> The current change mtu call only stops tx before removing RNDIS filter.
> In case the ringbuffer is not empty, rndis_filter_device_remove() may
> hang on removing the buffers.
> 
> This patch adds close of RNDIS filter before removing it, also a
> gradual waiting loop until the ring is empty. The change_mtu hang
> issue under heavy traffic is solved by this patch.
> 
> Signed-off-by: Haiyang Zhang 
> Reviewed-by: K. Y. Srinivasan 

Applied, thanks.


Re: [3/3] IRQ: Print "unexpected IRQ" messages consistently across architectures

2015-07-15 Thread Michael Ellerman
On Mon, 2015-07-13 at 13:35 -0500, Bjorn Helgaas wrote:
> On Sun, Jul 12, 2015 at 10:23 PM, Michael Ellerman  
> wrote:
> > On Sun, 2015-12-07 at 22:02:11 UTC, Bjorn Helgaas wrote:
> >> Many architectures use a variant of "unexpected IRQ trap at vector %x" to
> >> log unexpected IRQs.  This is confusing because (a) it prints the Linux IRQ
> >> number, but "vector" more often refers to a CPU vector number, and (b) it
> >> prints the IRQ number in hex with no base indication, while Linux IRQ
> >> numbers are usually printed in decimal.
> >>
> >> Print the same text ("unexpected IRQ %d") across all architectures.
> >>
> >> No functional change other than the output text.
> >
> > There's already a fallback version in asm-generic, so shouldn't you instead
> > just delete all the versions that are identical to that?
> >
> > eg. on powerpc we have:
> >
> >>  static inline void ack_bad_irq(unsigned int irq)
> >>  {
> >> - printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq);
> >> + printk(KERN_CRIT "unexpected IRQ %d\n", irq);
> >>  }
> >
> > And the generic version is:
> >
> >>  #ifndef ack_bad_irq
> >>  static inline void ack_bad_irq(unsigned int irq)
> >>  {
> >> - printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq);
> >> + printk(KERN_CRIT "unexpected IRQ %d\n", irq);
> >>  }
> >>  #endif
> >
> > So we can just delete the powerpc version?
> 
> Wow, I really didn't do my homework here.  Not only is there a generic
> version already, but there's also print_irq_desc(), which prints way
> more information than any of the ack_bad_irq() implementations.

Even better :)

> I'll try again :)

Thanks.

cheers




Re: [V2 1/7] Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)

2015-07-15 Thread David Miller
From: Dexuan Cui 
Date: Tue, 14 Jul 2015 02:58:03 -0700

> A helper function is also added.
> 
> Signed-off-by: Dexuan Cui 
> ---
>  include/linux/hyperv.h | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 30d3a1f..aa21814 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -236,6 +236,7 @@ struct vmbus_channel_offer {
>  #define VMBUS_CHANNEL_LOOPBACK_OFFER 0x100
>  #define VMBUS_CHANNEL_PARENT_OFFER   0x200
>  #define VMBUS_CHANNEL_REQUEST_MONITORED_NOTIFICATION 0x400
> +#define VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER   0x2000
>  
>  struct vmpacket_descriptor {
>   u16 type;
> @@ -758,6 +759,12 @@ struct vmbus_channel {
>   struct list_head percpu_list;
>  };
>  
> +static inline bool is_hvsock_channel(const struct vmbus_channel *c)
> +{
> + return !!(c->offermsg.offer.chn_flags &
> + VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER);
> +}
> +

This is not indented properly, plus it makes no sense to add a flag before
anyone even sets the flag.


Re: [PATCH 1/3] KVM: MTRR: fix memory type handling if MTRR is completely disabled

2015-07-15 Thread Alex Williamson
On Thu, 2015-07-16 at 03:25 +0800, Xiao Guangrong wrote:
> From: Xiao Guangrong 
> 
> Currently code uses default memory type if MTRR is fully disabled,
> fix it by using UC instead
> 
> Signed-off-by: Xiao Guangrong 
> ---

Seems to work for me.  I don't see a 0th patch, but for the series:

Tested-by: Alex Williamson 

Thanks!

>  arch/x86/kvm/mtrr.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index de1d2d8..e275013 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -120,6 +120,16 @@ static u8 mtrr_default_type(struct kvm_mtrr *mtrr_state)
>   return mtrr_state->deftype & IA32_MTRR_DEF_TYPE_TYPE_MASK;
>  }
>  
> +static u8 mtrr_disabled_type(void)
> +{
> + /*
> +  * Intel SDM 11.11.2.2: all MTRRs are disabled when
> +  * IA32_MTRR_DEF_TYPE.E bit is cleared, and the UC
> +  * memory type is applied to all of physical memory.
> +  */
> + return MTRR_TYPE_UNCACHABLE;
> +}
> +
>  /*
>  * Three terms are used in the following code:
>  * - segment, it indicates the address segments covered by fixed MTRRs.
> @@ -434,6 +444,8 @@ struct mtrr_iter {
>  
>   /* output fields. */
>   int mem_type;
> + /* mtrr is completely disabled? */
> + bool mtrr_disabled;
>   /* [start, end) is not fully covered in MTRRs? */
>   bool partial_map;
>  
> @@ -549,7 +561,7 @@ static void mtrr_lookup_var_next(struct mtrr_iter *iter)
>  static void mtrr_lookup_start(struct mtrr_iter *iter)
>  {
>   if (!mtrr_is_enabled(iter->mtrr_state)) {
> - iter->partial_map = true;
> + iter->mtrr_disabled = true;
>   return;
>   }
>  
> @@ -563,6 +575,7 @@ static void mtrr_lookup_init(struct mtrr_iter *iter,
>   iter->mtrr_state = mtrr_state;
>   iter->start = start;
>   iter->end = end;
> + iter->mtrr_disabled = false;
>   iter->partial_map = false;
>   iter->fixed = false;
>   iter->range = NULL;
> @@ -656,6 +669,9 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, 
> gfn_t gfn)
>   return MTRR_TYPE_WRBACK;
>   }
>  
> + if (iter.mtrr_disabled)
> + return mtrr_disabled_type();
> +
>   /* It is not covered by MTRRs. */
>   if (iter.partial_map) {
>   /*
> @@ -689,6 +705,9 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu 
> *vcpu, gfn_t gfn,
>   return false;
>   }
>  
> + if (iter.mtrr_disabled)
> + return true;
> +
>   if (!iter.partial_map)
>   return true;
>  





Re: [Question] How to implement GPIO driver for sparse hw numbers?

2015-07-15 Thread Masahiro Yamada
Hi Linus,


2015-07-15 7:04 GMT+09:00 Linus Walleij :
> On Fri, Jun 19, 2015 at 5:27 AM, Masahiro Yamada
>  wrote:
>
>> In my understanding, the GPIO driver framework requires that
>> the hw numbers should be contiguous within each GPIO chip.
>
> Yes but no one says that .request() to the driver has to succeed
> on every GPIO so just cover all GPIOs from 0 to 307 with
> your GPIO chip and then implement your "holes" in the GPIO
> range from 0 to 307 by letting .request() fail.

Thanks.
I also considered that at first, but in the end I did not adopt it.

Having holes in the GPIO range is not handy because:

[1] When we map a gpio range into a pin range,
we must divide "gpio-ranges" property into many lines
   gpio-ranges = ...


Re: [PATCH] sm750fb: coding style fixes lines over 80 chars

2015-07-15 Thread Joe Perches
On Thu, 2015-07-16 at 00:16 +0530, Vinay Simha BN wrote:
> scripts/checkpatch.pl kernel coding style fixes of WARNING

Please don't be a checkpatch robot.

Use tools to prompt your brain, but don't ever turn
your brain off.

> diff --git a/drivers/staging/sm750fb/ddk750_help.h 
> b/drivers/staging/sm750fb/ddk750_help.h


> +/* if 718 big endian turned on,be aware that don't use this driver for 
> general
> +  use,only for ppc big-endian */
> +#warning "big endian on target cpu and enable nature big endian support of 
> 718
> + capability !"

Yes, this is #if 0, but it's also obviously incorrect.

I didn't look at the rest.




[PATCH 1/1] ath10k: fixing wrong initialization of struct channel

2015-07-15 Thread Maninder Singh
chandef is initialized with NULL and on the very next line,
we are using it to get channel, which is not correct.

channel should be initialized after obtaining chandef.
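
A minimal standalone sketch of the pattern being described, using made-up
demo_* types and a hypothetical lookup helper rather than the real ath10k and
cfg80211 structures: the dependent pointer starts out NULL and is only
assigned once the pointer it is derived from has been obtained.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the channel and channel-definition types. */
struct demo_channel {
	int center_freq;
};

struct demo_chan_def {
	struct demo_channel *chan;
};

/* Hypothetical lookup: fails and leaves *def untouched when src is NULL. */
static int demo_get_chan_def(struct demo_chan_def *src,
			     struct demo_chan_def **def)
{
	if (!src)
		return -1;
	*def = src;
	return 0;
}

static int demo_monitor_start(struct demo_chan_def *src)
{
	struct demo_chan_def *chandef = NULL;
	/* Not "= chandef->chan": chandef is still NULL at this point. */
	struct demo_channel *channel = NULL;

	if (demo_get_chan_def(src, &chandef))
		return -1;

	/* Safe now: chandef points at real data. */
	channel = chandef->chan;

	return channel ? channel->center_freq : -1;
}
```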

Signed-off-by: Maninder Singh 
---
 drivers/net/wireless/ath/ath10k/mac.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c 
b/drivers/net/wireless/ath/ath10k/mac.c
index 218b6af..3d196b5 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -836,7 +836,7 @@ static inline int ath10k_vdev_setup_sync(struct ath10k *ar)
 static int ath10k_monitor_vdev_start(struct ath10k *ar, int vdev_id)
 {
struct cfg80211_chan_def *chandef = NULL;
-   struct ieee80211_channel *channel = chandef->chan;
+   struct ieee80211_channel *channel = NULL;
struct wmi_vdev_start_request_arg arg = {};
int ret = 0;
 
-- 
1.7.9.5



Re: linux-next: build failure after merge of the rcu tree

2015-07-15 Thread Paul E. McKenney
On Thu, Jul 16, 2015 at 01:14:23PM +1000, Stephen Rothwell wrote:
> Hi Paul,
> 
> After merging the rcu tree, today's linux-next build (arm
> multi_v7_defconfig) failed like this:
> 
> kernel/notifier.c: In function 'notify_die':
> kernel/notifier.c:547:2: error: implicit declaration of function 
> 'rcu_lockdep_assert' [-Werror=implicit-function-declaration]
>   rcu_lockdep_assert(rcu_is_watching(),
>   ^
> 
> Caused by commit
> 
>   02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")
> 
> interacting with commit
> 
>   e727c7d7a11e ("notifiers, RCU: Assert that RCU is watching in notify_die()")
> 
> [ and I also noted
>   0333a209cbf6 ("x86/irq, context_tracking: Document how IRQ context tracking 
> works and add an RCU assertion")
> ]
> 
> from the tip tree.

Thank you in both cases!  I suspect that more will follow, so is there
something I can do to make this easier?  (Hard for me to patch stuff
that is not yet in the tree...)
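
For anyone fixing up similar callers by hand, the thing to watch in the
rename is the inverted condition: the old form complains when its condition
is false, the new one when its condition is true, as the merge fix quoted
below shows. A standalone sketch with stand-in macros (the demo_* and DEMO_*
names are invented here and are not the kernel's definitions):

```c
/* Counts how many times either stand-in macro complained. */
static int demo_splats;

static void demo_report(const char *msg)
{
	(void)msg;
	demo_splats++;
}

/* Stand-in for the old form: complains when the condition is false. */
#define demo_lockdep_assert(c, s) \
	do { if (!(c)) demo_report(s); } while (0)

/* Stand-in for the new form: complains when the condition is true. */
#define DEMO_LOCKDEP_WARN(c, s) \
	do { if (c) demo_report(s); } while (0)

/* Returns how many complaints the two equivalent checks produced. */
static int demo_check(int watching)
{
	int before = demo_splats;

	demo_lockdep_assert(watching, "RCU not watching");
	DEMO_LOCKDEP_WARN(!watching, "RCU not watching");

	return demo_splats - before;
}
```

With the condition negated, both forms stay silent when RCU is watching and
both complain when it is not.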

Thanx, Paul

> I added the following merge fix patch:
> 
> From: Stephen Rothwell 
> Date: Thu, 16 Jul 2015 13:08:50 +1000
> Subject: [PATCH] rcu: merge fix for Rename rcu_lockdep_assert() to 
> RCU_LOCKDEP_WARN()
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/x86/kernel/irq.c | 2 +-
>  kernel/notifier.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 30dbf35bc90b..f9cd81825187 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -234,7 +234,7 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs 
> *regs)
>   entering_irq();
> 
>   /* entering_irq() tells RCU that we're not quiescent.  Check it. */
> - rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
> + RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
> 
>   irq = __this_cpu_read(vector_irq[vector]);
> 
> diff --git a/kernel/notifier.c b/kernel/notifier.c
> index 980e4330fb59..fd2c9acbcc19 100644
> --- a/kernel/notifier.c
> +++ b/kernel/notifier.c
> @@ -544,7 +544,7 @@ int notrace notify_die(enum die_val val, const char *str,
>   .signr  = sig,
> 
>   };
> - rcu_lockdep_assert(rcu_is_watching(),
> + RCU_LOCKDEP_WARN(!rcu_is_watching(),
>  "notify_die called but RCU thinks we're quiescent");
>   return atomic_notifier_call_chain(&die_chain, val, &args);
>  }
> -- 
> 2.1.4
> 
> -- 
> Cheers,
> Stephen Rothwell                    s...@canb.auug.org.au
> 



Re: [PATCH v3] gpio: UniPhier: add driver for UniPhier GPIO controller

2015-07-15 Thread Masahiro Yamada
Hi Linus,


2015-07-15 23:15 GMT+09:00 Linus Walleij :
> On Tue, Jul 14, 2015 at 4:43 AM, Masahiro Yamada
>  wrote:
>
>> This GPIO controller device is used on UniPhier SoCs.
>>
>> Signed-off-by: Masahiro Yamada 
>> ---
>>
>> Changes in v3:
>>   - Use module_platform_driver()
>>
>> Changes in v2:
>>   - Fix typos in the comment block
>
> OK why no device tree bindings? Are they in a separate patch?


Sorry, I was planning to do it later.

OK.  I will come back with
Documentation/devicetree/bindings/gpio/uniphier-gpio.txt with binding info in it.


>> +/*
>> + * Unfortunately, the hardware specification adopts weird GPIO pin labeling.
>> + * The ports are named as
>> + *   PORT00,  PORT01,  PORT02,  ..., PORT07,
>> + *   PORT10,  PORT11,  PORT12,  ..., PORT17,
>> + *   PORT20,  PORT21,  PORT22,  ..., PORT27,
>> + *...
>> + *   PORT90,  PORT91,  PORT92,  ..., PORT97,
>> + *   PORT100, PORT101, PORT102, ..., PORT107,
>> + *...
>> + *
>> + * The PORTs with 8 or 9 in the one's place are missing, i.e. the one's place
>> + * is octal, while the other places are decimal.  If we handle the port numbers
>> + * as seen in the hardware documents, the GPIO offsets must be non-contiguous.
>> + * It is possible to have sparse GPIO pins, but not handy for GPIO range
>> + * mappings, register accessing, etc.
>> + *
>> + * To make things simpler (for driver and device tree implementation), this
>> + * driver takes contiguously-numbered GPIO offsets.  GPIO consumers should make
>> + * sure to convert the PORT number into the one that fits in this driver.
>> + * The conversion logic is very easy math, for example,
>> + *   PORT15  -->  GPIO offset 13   (8 * 1 + 5)
>> + *   PORT123 -->  GPIO offset 99   (8 * 12 + 3)
>> + */
>> +#define UNIPHIER_GPIO_PORTS_PER_BANK   8
>> +#define UNIPHIER_GPIO_BANK_MASK\
>> +   ((1UL << (UNIPHIER_GPIO_PORTS_PER_BANK)) - 1)
>
>
>
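
The conversion described in the comment above can be sketched as a small
helper.  This is only an illustration: port_to_gpio_offset() and
PORTS_PER_BANK are hypothetical names that are not in the patch, and the
-1 return for non-existent ports is my own assumption about how a
consumer might want to handle them.

```c
#define PORTS_PER_BANK 8	/* mirrors UNIPHIER_GPIO_PORTS_PER_BANK */

/*
 * Hypothetical helper, not part of the patch: convert a PORT number as
 * printed in the hardware documents into a contiguous GPIO offset.
 * Returns -1 for port numbers that do not exist (one's digit 8 or 9).
 */
static int port_to_gpio_offset(unsigned int port)
{
	unsigned int bank = port / 10;	/* decimal upper digits select the bank */
	unsigned int bit = port % 10;	/* one's digit is octal: 0..7 only */

	if (bit >= PORTS_PER_BANK)
		return -1;

	return (int)(bank * PORTS_PER_BANK + bit);
}
```

With this, PORT15 maps to offset 13 and PORT123 to offset 99, matching
the two examples in the comment.
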
>> +
>> +#define UNIPHIER_GPIO_REG_DATA 0   /* data */
>> +#define UNIPHIER_GPIO_REG_DIR  4   /* direction (1:in, 0:out) */
>> +
>> +struct uniphier_gpio_priv {
>> +   struct of_mm_gpio_chip mmchip;
>> +   spinlock_t lock;
>> +};
>> +
>> +static unsigned uniphier_gpio_bank_to_reg(unsigned bank, unsigned reg_type)
>> +{
>> +   unsigned reg;
>> +
>> +   reg = (bank + 1) * 8 + reg_type;
>> +
>> +   /*
>> +* Unfortunately, there is a register hole at offset 0x90-0x9f.
>> +* Add 0x10 when crossing the hole.
>> +*/
>> +   if (reg >= 0x90)
>> +   reg += 0x10;
>> +
>> +   return reg;
>> +}
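
For reference, a standalone copy of the mapping above with shortened,
hypothetical names (bank_to_reg, REG_DATA, REG_DIR) makes it easy to check
a few values by hand.  The arithmetic is taken from the quoted function;
nothing else about the hardware is assumed.

```c
#define REG_DATA 0	/* mirrors UNIPHIER_GPIO_REG_DATA */
#define REG_DIR  4	/* mirrors UNIPHIER_GPIO_REG_DIR */

/* Same arithmetic as uniphier_gpio_bank_to_reg() in the quoted patch. */
static unsigned int bank_to_reg(unsigned int bank, unsigned int reg_type)
{
	unsigned int reg = (bank + 1) * 8 + reg_type;

	if (reg >= 0x90)	/* skip the register hole at 0x90-0x9f */
		reg += 0x10;

	return reg;
}
```

Bank 0 lands at 0x08 (DATA) and 0x0c (DIR), bank 16 is the last one below
the hole at 0x88/0x8c, and bank 17 is the first one pushed past it, to
0xa0/0xa4.
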
>> +
>> +static void uniphier_gpio_bank_write(struct gpio_chip *chip,
>> +unsigned bank, unsigned reg_type,
>> +unsigned mask, unsigned value)
>> +{
>> +   struct of_mm_gpio_chip *mmchip = to_of_mm_gpio_chip(chip);
>> +   struct uniphier_gpio_priv *priv;
>> +   unsigned long flags;
>> +   unsigned reg;
>> +   u32 tmp;
>> +
>> +   if (!mask)
>> +   return;
>> +
>> +   priv = container_of(mmchip, struct uniphier_gpio_priv, mmchip);
>> +
>> +   reg = uniphier_gpio_bank_to_reg(bank, reg_type);
>> +
>> +   /*
>> +* Note
>> +* regmap_update_bits() should not be used here.
>> +*
>> +* The DATA registers return the current readback of pins, not the
>> +* previously written data when they are configured as "input".
>> +* The DATA registers must be overwritten even if the data you are
>> +* going to write is the same as what readl() has returned.
>> +*
>> +* regmap_update_bits() does not write back if the data is not changed.
>> +*/
>
> Why is this mentioned when the driver doesn't even use regmap?
> Development artifact?


At first, I thought regmap_update_bits() might be useful,
but it turned out to be a bad idea.

Anyway, I did not use regmap in this driver, so this comment sounds a
bit weird.
I will delete it in v4.
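
To illustrate the point behind that comment, here is a minimal user-space
sketch of the two write strategies.  The register is faked with a plain
variable, and all names (fake_data_reg, bank_write_always,
bank_write_if_changed) are hypothetical; the body of the real
uniphier_gpio_bank_write() is not shown in this thread, so this is only
an assumption about its shape.

```c
#include <stdint.h>

static uint32_t fake_data_reg;		/* stands in for the memory-mapped DATA register */
static unsigned int write_count;	/* number of stores issued, for illustration */

static uint32_t fake_readl(void)
{
	return fake_data_reg;
}

static void fake_writel(uint32_t v)
{
	fake_data_reg = v;
	write_count++;
}

/* Read-modify-write that always stores, even if the value is unchanged. */
static void bank_write_always(uint32_t mask, uint32_t value)
{
	uint32_t tmp;

	if (!mask)
		return;

	tmp = fake_readl();
	tmp &= ~mask;
	tmp |= mask & value;
	fake_writel(tmp);
}

/* "Update bits" style: skips the store when the readback already matches. */
static void bank_write_if_changed(uint32_t mask, uint32_t value)
{
	uint32_t old = fake_readl();
	uint32_t tmp = (old & ~mask) | (mask & value);

	if (tmp != old)
		fake_writel(tmp);
}
```

If the readback of an input pin happens to equal the value being written,
the second variant never touches the register, which is the case the
comment warned about.
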



>> +static int uniphier_gpio_get_direction(struct gpio_chip *chip, unsigned offset)
>> +{
>> +   return uniphier_gpio_offset_read(chip, UNIPHIER_GPIO_REG_DIR, offset) ?
>> +   GPIOF_DIR_IN : GPIOF_DIR_OUT;
>
> Just use
> return !!uniphier_gpio_offset_read(chip, UNIPHIER_GPIO_REG_DIR, offset);


OK, will fix.
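
A small sketch of why "!!" is enough here.  It assumes the read helper
returns the masked register bit (so either 0 or 1 << offset);
read_masked_bit() and get_normalized() are hypothetical names, not the
driver's.  For the direction case it also relies on GPIOF_DIR_OUT being 0
and GPIOF_DIR_IN being 1.

```c
/* Hypothetical stand-in for a register read that returns the masked bit. */
static unsigned int read_masked_bit(unsigned int reg_val, unsigned int offset)
{
	return reg_val & (1U << offset);	/* 0 or (1 << offset), not 0 or 1 */
}

/* Collapse any non-zero value to 1, as gpiolib expects from .get(). */
static int get_normalized(unsigned int reg_val, unsigned int offset)
{
	return !!read_masked_bit(reg_val, offset);
}
```
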

>> +static int uniphier_gpio_get(struct gpio_chip *chip, unsigned offset)
>> +{
>> +   return uniphier_gpio_offset_read(chip, offset, UNIPHIER_GPIO_REG_DATA);
>
> return !!uniphier_gpio_offset_read(chip, offset, UNIPHIER_GPIO_REG_DATA);

Likewise.


>> +static void uniphier_gpio_set_multiple(struct gpio_chip *chip,
>> +  unsigned long *mask,
>> +  unsigned long *bits)
>> +{
>> +   unsigned bank, shift, bank_mask, bank_bits;
>> +   int i;
>> +
>> +   for (i = 0; i < chip->ngpio; i += 

Re: [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian

2015-07-15 Thread Michael Ellerman
On Wed, 2015-07-15 at 08:16 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index b2374c131340..51adb9afb511 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -82,7 +82,13 @@ struct seccomp_data {
> >  };
> >  #endif
> >
> > +#if __BYTE_ORDER == __LITTLE_ENDIAN
> >  #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
> > +#elif __BYTE_ORDER == __BIG_ENDIAN
> > +#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]) + sizeof(__u32))
> > +#else
> > +#error "wut?"
> > +#endif
> 
> Ah-ha! Yes, thanks. Could you change the #error to something that
> describes the particular (impossible) failure condition? "wut? Unknown
> __BYTE_ORDER?!". Not a huge deal, but I always like verbose errors. :)
> Especially for "impossible" situations. :)

Yeah sorry that was a "quick hack" which got promoted into an actual patch.

Fixed to use your message.
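
For anyone wondering why the big endian case needs the extra
sizeof(__u32): the filters load a 32-bit word from a 64-bit args[] slot,
and the low half sits at the higher address on big endian.  Below is a
minimal user-space sketch of that.  struct seccomp_data_sketch is a local
mirror of the layout, the helper names are made up, and the
__BYTE_ORDER__ macros are the GCC/Clang predefined ones rather than the
<endian.h> ones used in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the layout; field names follow struct seccomp_data. */
struct seccomp_data_sketch {
	int nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Byte offset of the low 32 bits of args[n]. */
static size_t syscall_arg_lo(unsigned int n)
{
	size_t off = offsetof(struct seccomp_data_sketch, args) +
		     n * sizeof(uint64_t);

#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	off += sizeof(uint32_t);	/* low word comes second on big endian */
#endif
	return off;
}

/* Load a 32-bit word at a byte offset, roughly what a BPF_W load does. */
static uint32_t load_word(const struct seccomp_data_sketch *sd, size_t off)
{
	uint32_t w;

	memcpy(&w, (const unsigned char *)sd + off, sizeof(w));
	return w;
}
```

On either byte order, loading at syscall_arg_lo(n) should yield the low
32 bits of args[n], which is what the tests compare against.
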

cheers




Re: [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks

2015-07-15 Thread Michael Ellerman
On Wed, 2015-07-15 at 08:12 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index c5abe7fd7590..b2374c131340 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -645,6 +645,10 @@ static struct siginfo TRAP_info;
> >  static volatile int TRAP_nr;
> >  static void TRAP_action(int nr, siginfo_t *info, void *void_context)
> >  {
> > +   fprintf(stderr, "in TRAP_action\n");
> > +   fprintf(stderr, "info->si_call_addr %p\n", info->si_call_addr);
> > +   fprintf(stderr, "info->si_syscall %u\n", info->si_syscall);
> > +   fprintf(stderr, "info->si_arch %u\n", info->si_arch);
> > memcpy(&TRAP_info, info, sizeof(TRAP_info));
> > TRAP_nr = nr;
> >  }
> 
> This chunk looks like left-over debugging?

Urgh yep, that's ugly. Thanks for noticing.

Will remove before merging :)

cheers




Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Eric W. Biederman

Seth I think for the LSMs we should start with:

diff --git a/security/security.c b/security/security.c
index 062f3c997fdc..5b6ece92a8e5 100644
--- a/security/security.c
+++ b/security/security.c
@@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
 int security_sb_mount(const char *dev_name, struct path *path,
const char *type, unsigned long flags, void *data)
 {
+   if (current_user_ns() != &init_user_ns)
+   return -EPERM;
return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
 }


Then we should push this down into all of the lsms.
Then we should remove, relax or change the check as appropriate
in each lsm.

The point is that this is good enough to see that it is trivially safe,
and it allows us to focus on the core issues and stop worrying about
the lsms for a bit.

Then we can focus on each lsm one at a time and take the time to really
understand them and talk with their maintainers etc. to make certain
we get things correct.

This should remove the need for your patches 5, 6 and 7 for the
immediate future.
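
As a user-space model of the gate in the diff above (a sketch only: every
name with a _sketch suffix is made up, and the hook chain is reduced to a
stub that always allows):

```c
#include <errno.h>

struct user_ns_sketch {
	int level;			/* stand-in for struct user_namespace */
};

static struct user_ns_sketch init_ns_sketch;	/* plays the role of init_user_ns */

/* Stand-in for the per-LSM hook chain; always allows in this sketch. */
static int call_lsm_hooks_sketch(void)
{
	return 0;
}

/* Deny outright unless the caller is in the initial user namespace. */
static int sb_mount_sketch(const struct user_ns_sketch *current_ns)
{
	if (current_ns != &init_ns_sketch)
		return -EPERM;

	return call_lsm_hooks_sketch();
}
```

The check sits in front of every LSM, so no module has to reason about
non-initial namespaces until its own check is relaxed.
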

Eric


[PATCH 3/4] block: partition: introduce 'cpu' para to part_inc|dec_in_flight

2015-07-15 Thread Ming Lei
Pass 'cpu' to part_inc_in_flight() and part_dec_in_flight() so that it
is easier to convert part->in_flight[rw] into a percpu variable
in the following patch.

Signed-off-by: Ming Lei 
---
 block/bio.c   | 4 ++--
 block/blk-core.c  | 4 ++--
 block/blk-merge.c | 2 +-
 drivers/nvdimm/core.c | 4 ++--
 include/linux/genhd.h | 4 ++--
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 2a00d34..fe8807f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1724,7 +1724,7 @@ void generic_start_io_acct(int rw, unsigned long sectors,
part_round_stats(cpu, part);
part_stat_inc(cpu, part, ios[rw]);
part_stat_add(cpu, part, sectors[rw], sectors);
-   part_inc_in_flight(part, rw);
+   part_inc_in_flight(cpu, part, rw);
 
part_stat_unlock();
 }
@@ -1738,7 +1738,7 @@ void generic_end_io_acct(int rw, struct hd_struct *part,
 
part_stat_add(cpu, part, ticks[rw], duration);
part_round_stats(cpu, part);
-   part_dec_in_flight(part, rw);
+   part_dec_in_flight(cpu, part, rw);
 
part_stat_unlock();
 }
diff --git a/block/blk-core.c b/block/blk-core.c
index 82819e6..f180a6d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2194,7 +2194,7 @@ void blk_account_io_done(struct request *req)
part_stat_inc(cpu, part, ios[rw]);
part_stat_add(cpu, part, ticks[rw], duration);
part_round_stats(cpu, part);
-   part_dec_in_flight(part, rw);
+   part_dec_in_flight(cpu, part, rw);
 
hd_struct_put(part);
part_stat_unlock();
@@ -2252,7 +2252,7 @@ void blk_account_io_start(struct request *rq, bool new_io)
hd_struct_get(part);
}
part_round_stats(cpu, part);
-   part_inc_in_flight(part, rw);
+   part_inc_in_flight(cpu, part, rw);
rq->part = part;
}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 30a0d9f..cb7c46d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -449,7 +449,7 @@ static void blk_account_io_merge(struct request *req)
part = req->part;
 
part_round_stats(cpu, part);
-   part_dec_in_flight(part, rq_data_dir(req));
+   part_dec_in_flight(cpu, part, rq_data_dir(req));
 
hd_struct_put(part);
part_stat_unlock();
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index cb62ec6..053d026 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -224,7 +224,7 @@ void __nd_iostat_start(struct bio *bio, unsigned long *start)
part_round_stats(cpu, &disk->part0);
part_stat_inc(cpu, &disk->part0, ios[rw]);
part_stat_add(cpu, &disk->part0, sectors[rw], bio_sectors(bio));
-   part_inc_in_flight(&disk->part0, rw);
+   part_inc_in_flight(cpu, &disk->part0, rw);
part_stat_unlock();
 }
 EXPORT_SYMBOL(__nd_iostat_start);
@@ -238,7 +238,7 @@ void nd_iostat_end(struct bio *bio, unsigned long start)
 
part_stat_add(cpu, &disk->part0, ticks[rw], duration);
part_round_stats(cpu, &disk->part0);
-   part_dec_in_flight(&disk->part0, rw);
+   part_dec_in_flight(cpu, &disk->part0, rw);
part_stat_unlock();
 }
 EXPORT_SYMBOL(nd_iostat_end);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6..612ae80 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -381,14 +381,14 @@ static inline void free_part_stats(struct hd_struct *part)
 #define part_stat_sub(cpu, gendiskp, field, subnd) \
part_stat_add(cpu, gendiskp, field, -subnd)
 
-static inline void part_inc_in_flight(struct hd_struct *part, int rw)
+static inline void part_inc_in_flight(int cpu, struct hd_struct *part, int rw)
 {
atomic_inc(&part->in_flight[rw]);
if (part->partno)
atomic_inc(&part_to_disk(part)->part0.in_flight[rw]);
 }
 
-static inline void part_dec_in_flight(struct hd_struct *part, int rw)
+static inline void part_dec_in_flight(int cpu, struct hd_struct *part, int rw)
 {
atomic_dec(&part->in_flight[rw]);
if (part->partno)
-- 
1.9.1



[PATCH 2/4] block: partition: convert percpu ref

2015-07-15 Thread Ming Lei
A percpu refcount is a good match for the partition case,
and the conversion is quite straightforward.

With the conversion, one pair of atomic inc/dec can be saved
when accounting block I/O, which runs in the hot path of block I/O.

Signed-off-by: Ming Lei 
---
 block/genhd.c |  6 +-
 block/partition-generic.c |  9 +
 include/linux/genhd.h | 27 +--
 3 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index ed3f5b9..3213b66 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1284,7 +1284,11 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
 * converted to make use of bd_mutex and sequence counters.
 */
seqcount_init(&disk->part0.nr_sects_seq);
-   hd_ref_init(&disk->part0);
+   if (hd_ref_init(&disk->part0)) {
+   hd_free_part(&disk->part0);
+   kfree(disk);
+   return NULL;
+   }
 
disk->minors = minors;
rand_initialize_disk(disk);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index eca0d02..e771113 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -232,8 +232,9 @@ static void delete_partition_rcu_cb(struct rcu_head *head)
put_device(part_to_dev(part));
 }
 
-void __delete_partition(struct hd_struct *part)
+void __delete_partition(struct percpu_ref *ref)
 {
+   struct hd_struct *part = container_of(ref, struct hd_struct, ref);
call_rcu(&part->rcu_head, delete_partition_rcu_cb);
 }
 
@@ -254,7 +255,7 @@ void delete_partition(struct gendisk *disk, int partno)
kobject_put(part->holder_dir);
device_del(part_to_dev(part));
 
-   hd_struct_put(part);
+   hd_struct_kill(part);
 }
 
 static ssize_t whole_disk_show(struct device *dev,
@@ -355,8 +356,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
if (!dev_get_uevent_suppress(ddev))
kobject_uevent(&pdev->kobj, KOBJ_ADD);
 
-   hd_ref_init(p);
-   return p;
+   if (!hd_ref_init(p))
+   return p;
 
 out_free_info:
free_part_info(p);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index a221220..2adbfa6 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_BLOCK
 
@@ -124,7 +125,7 @@ struct hd_struct {
 #else
struct disk_stats dkstats;
 #endif
-   atomic_t ref;
+   struct percpu_ref ref;
struct rcu_head rcu_head;
 };
 
@@ -611,7 +612,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
 sector_t len, int flags,
 struct partition_meta_info
   *info);
-extern void __delete_partition(struct hd_struct *);
+extern void __delete_partition(struct percpu_ref *);
 extern void delete_partition(struct gendisk *, int);
 extern void printk_all_partitions(void);
 
@@ -640,33 +641,39 @@ extern ssize_t part_fail_store(struct device *dev,
   const char *buf, size_t count);
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
-static inline void hd_ref_init(struct hd_struct *part)
+static inline int hd_ref_init(struct hd_struct *part)
 {
-   atomic_set(&part->ref, 1);
-   smp_mb();
+   if (percpu_ref_init(&part->ref, __delete_partition, 0,
+   GFP_KERNEL))
+   return -ENOMEM;
+   return 0;
 }
 
 static inline void hd_struct_get(struct hd_struct *part)
 {
-   atomic_inc(&part->ref);
-   smp_mb__after_atomic();
+   percpu_ref_get(&part->ref);
 }
 
 static inline int hd_struct_try_get(struct hd_struct *part)
 {
-   return atomic_inc_not_zero(&part->ref);
+   return percpu_ref_tryget_live(&part->ref);
 }
 
 static inline void hd_struct_put(struct hd_struct *part)
 {
-   if (atomic_dec_and_test(&part->ref))
-   __delete_partition(part);
+   percpu_ref_put(&part->ref);
+}
+
+static inline void hd_struct_kill(struct hd_struct *part)
+{
+   percpu_ref_kill(&part->ref);
 }
 
 static inline void hd_free_part(struct hd_struct *part)
 {
free_part_stats(part);
free_part_info(part);
+   percpu_ref_exit(&part->ref);
 }
 
 /*
-- 
1.9.1
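
For readers not familiar with percpu refcounts, the lifecycle used above
(init, get/put on the fast path, kill on partition deletion) can be
modelled roughly as follows.  This is a deliberately simplified,
single-threaded sketch with made-up names; the real percpu_ref code
relies on RCU and atomics to switch modes safely, none of which is
modelled here.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU_SKETCH 4

struct ref_sketch {
	long percpu[NCPU_SKETCH];	/* per-CPU deltas, used while the ref is live */
	long shared;			/* holds the initial ref; authoritative once killed */
	bool dead;
	int released;			/* set when the release callback would run */
};

static void ref_init(struct ref_sketch *r)
{
	int i;

	for (i = 0; i < NCPU_SKETCH; i++)
		r->percpu[i] = 0;
	r->shared = 1;			/* the initial reference */
	r->dead = false;
	r->released = 0;
}

static void ref_drop_shared(struct ref_sketch *r)
{
	if (--r->shared == 0)
		r->released = 1;	/* __delete_partition() would be called here */
}

static void ref_get(struct ref_sketch *r, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	if (!r->dead)
		r->percpu[cpu]++;	/* fast path: only the local counter is touched */
	else
		r->shared++;
}

static bool ref_tryget_live(struct ref_sketch *r, int cpu)
{
	if (r->dead)
		return false;		/* no new references after kill */
	ref_get(r, cpu);
	return true;
}

static void ref_put(struct ref_sketch *r, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	if (!r->dead)
		r->percpu[cpu]--;	/* may go negative; only the sum matters */
	else
		ref_drop_shared(r);
}

static void ref_kill(struct ref_sketch *r)
{
	int i;

	if (r->dead)
		return;
	r->dead = true;
	for (i = 0; i < NCPU_SKETCH; i++) {	/* fold per-CPU deltas into the shared count */
		r->shared += r->percpu[i];
		r->percpu[i] = 0;
	}
	ref_drop_shared(r);		/* drop the initial reference */
}
```

In this model a put may happen on a different CPU than the matching get,
which is why the release decision is only made after kill has folded the
per-CPU deltas together.
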



[PATCH 4/4] block: account io: convert part->in_flight[] into percpu variable

2015-07-15 Thread Ming Lei
With this conversion, the atomic operations for accounting block I/O can
be killed completely. It is OK to sum the percpu variables in
part_in_flight() because that function is run at most once per tick.

Signed-off-by: Ming Lei 
---
 block/blk-core.c  |  1 +
 block/partition-generic.c |  5 +++--
 drivers/md/dm.c   | 10 ++
 include/linux/genhd.h | 24 ++--
 4 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f180a6d..0001d4c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1344,6 +1344,7 @@ static void part_round_stats_single(int cpu, struct hd_struct *part,
if (now == part->stamp)
return;
 
+   /* at most one percpu addition per one tick */
inflight = part_in_flight(part);
if (inflight) {
__part_stat_add(cpu, part, time_in_queue,
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e771113..0a553e7 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -140,8 +140,9 @@ ssize_t part_inflight_show(struct device *dev,
 {
struct hd_struct *p = dev_to_part(dev);
 
-   return sprintf(buf, "%8u %8u\n", atomic_read(&p->in_flight[0]),
-   atomic_read(&p->in_flight[1]));
+   return sprintf(buf, "%8u %8u\n",
+   part_stat_read(p, in_flight[0]),
+   part_stat_read(p, in_flight[1]));
 }
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index de70377..1b6d8be 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -651,9 +651,9 @@ static void start_io_acct(struct dm_io *io)
 
cpu = part_stat_lock();
part_round_stats(cpu, &dm_disk(md)->part0);
+   part_stat_set(cpu, &dm_disk(md)->part0, in_flight[rw],
+   atomic_inc_return(&md->pending[rw]));
part_stat_unlock();
-   atomic_set(&dm_disk(md)->part0.in_flight[rw],
-   atomic_inc_return(&md->pending[rw]));
 
if (unlikely(dm_stats_used(&md->stats)))
dm_stats_account_io(&md->stats, bio->bi_rw, bio->bi_iter.bi_sector,
@@ -665,7 +665,7 @@ static void end_io_acct(struct dm_io *io)
struct mapped_device *md = io->md;
struct bio *bio = io->bio;
unsigned long duration = jiffies - io->start_time;
-   int pending;
+   int pending, cpu;
int rw = bio_data_dir(bio);
 
generic_end_io_acct(rw, &dm_disk(md)->part0, io->start_time);
@@ -679,7 +679,9 @@ static void end_io_acct(struct dm_io *io)
 * a flush.
 */
pending = atomic_dec_return(&md->pending[rw]);
-   atomic_set(&dm_disk(md)->part0.in_flight[rw], pending);
+   cpu = part_stat_lock();
+   part_stat_set(cpu, &dm_disk(md)->part0, in_flight[rw], pending);
+   part_stat_unlock();
pending += atomic_read(&md->pending[rw^0x1]);
 
/* nudge anyone waiting on suspend queue */
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 612ae80..abe5567 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -86,6 +86,7 @@ struct disk_stats {
unsigned long ticks[2];
unsigned long io_ticks;
unsigned long time_in_queue;
+   unsigned int  in_flight[2];
 };
 
 #define PARTITION_META_INFO_VOLNAMELTH 64
@@ -119,7 +120,6 @@ struct hd_struct {
int make_it_fail;
 #endif
unsigned long stamp;
-   atomic_t in_flight[2];
 #ifdef CONFIG_SMP
struct disk_stats __percpu *dkstats;
 #else
@@ -320,6 +320,9 @@ extern struct hd_struct *disk_map_sector_rcu(struct gendisk *disk,
res;\
 })
 
+#define part_stat_set(cpu, part, field, seted) \
+   (per_cpu_ptr((part)->dkstats, (cpu))->field = (seted))
+
 static inline void part_stat_set_all(struct hd_struct *part, int value)
 {
int i;
@@ -351,6 +354,9 @@ static inline void free_part_stats(struct hd_struct *part)
 
 #define part_stat_read(part, field)((part)->dkstats.field)
 
+#define part_stat_set(cpu, part, field, seted) \
+   ((part)->dkstats.field = (seted))
+
 static inline void part_stat_set_all(struct hd_struct *part, int value)
 {
memset(&part->dkstats, value, sizeof(struct disk_stats));
@@ -383,21 +389,27 @@ static inline void free_part_stats(struct hd_struct *part)
 
 static inline void part_inc_in_flight(int cpu, struct hd_struct *part, int rw)
 {
-   atomic_inc(&part->in_flight[rw]);
+   part_stat_inc(cpu, part, in_flight[rw]);
if (part->partno)
-   atomic_inc(&part_to_disk(part)->part0.in_flight[rw]);
+   part_stat_inc(cpu, &part_to_disk(part)->part0, in_flight[rw]);
 }
 
 static inline void part_dec_in_flight(int cpu, struct hd_struct *part, int rw)
 {
-   atomic_dec(&part->in_flight[rw]);
+   part_stat_dec(cpu, part, in_flight[rw]);
if (part->partno)
-   atomic_dec(&part_to_disk(part)->part0.in_flight[rw]);
+   part_stat_dec(cpu, 

[PATCH 0/4] block: account io: kill atomic operations

2015-07-15 Thread Ming Lei
Hi,

This patchset kills two kinds of atomic operations in block
I/O accounting.

The first two patches convert the atomic refcount of a partition
into a percpu refcount.

The last two patches convert partition->in_flight[] into a percpu
variable.

With this change, ~15% throughput improvement can be observed
when running fio(randread) over null blk in a dual-socket
environment.
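
A rough user-space model of the in_flight[] conversion, for illustration
only (the array and helper names are made up, and the locking and
preemption handling of the real part_stat_*() helpers is not modelled):

```c
#include <assert.h>

#define NCPU_SKETCH 4

/* One counter pair per CPU, indexed by [cpu][rw]. */
static unsigned int in_flight_sketch[NCPU_SKETCH][2];

static void inc_in_flight(int cpu, int rw)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	in_flight_sketch[cpu][rw]++;	/* local counter only, no atomics */
}

static void dec_in_flight(int cpu, int rw)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	in_flight_sketch[cpu][rw]--;	/* may wrap below zero on this CPU */
}

/* The slow path: sum all CPUs, as a part_in_flight()-style reader would. */
static unsigned int read_in_flight(int rw)
{
	unsigned int sum = 0;
	int cpu;

	for (cpu = 0; cpu < NCPU_SKETCH; cpu++)
		sum += in_flight_sketch[cpu][rw];

	return sum;
}
```

An I/O can start on one CPU and complete on another, so a single per-CPU
counter can wrap around, but the unsigned sum over all CPUs still gives
the right total.  The cost moves from every submission and completion to
the reader.
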

 block/bio.c   |  4 ++--
 block/blk-core.c  |  5 ++--
 block/blk-merge.c |  2 +-
 block/genhd.c |  9 ---
 block/partition-generic.c | 17 ++---
 drivers/md/dm.c   | 10 
 drivers/nvdimm/core.c |  4 ++--
 include/linux/genhd.h | 61 +--
 8 files changed, 72 insertions(+), 40 deletions(-)

Thanks,
Ming



[PATCH 1/4] block: partition: introduce hd_free_part()

2015-07-15 Thread Ming Lei
Introduce the hd_free_part() helper so that it can be used in both
the generic partition case and the part0 case.

Signed-off-by: Ming Lei 
---
 block/genhd.c | 3 +--
 block/partition-generic.c | 3 +--
 include/linux/genhd.h | 6 ++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index e552e1b..ed3f5b9 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1110,8 +1110,7 @@ static void disk_release(struct device *dev)
disk_release_events(disk);
kfree(disk->random);
disk_replace_part_tbl(disk, NULL);
-   free_part_stats(&disk->part0);
-   free_part_info(&disk->part0);
+   hd_free_part(&disk->part0);
if (disk->queue)
blk_put_queue(disk->queue);
kfree(disk);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 0d9e5f9..eca0d02 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -212,8 +212,7 @@ static void part_release(struct device *dev)
 {
struct hd_struct *p = dev_to_part(dev);
blk_free_devt(dev->devt);
-   free_part_stats(p);
-   free_part_info(p);
+   hd_free_part(p);
kfree(p);
 }
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ec274e0..a221220 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -663,6 +663,12 @@ static inline void hd_struct_put(struct hd_struct *part)
__delete_partition(part);
 }
 
+static inline void hd_free_part(struct hd_struct *part)
+{
+   free_part_stats(part);
+   free_part_info(part);
+}
+
 /*
  * Any access of part->nr_sects which is not protected by partition
  * bd_mutex or gendisk bdev bd_mutex, should be done using this
-- 
1.9.1



Re: [RFC PATCH perf/core v2 00/16] perf-probe --cache and SDT support

2015-07-15 Thread Hemant Kumar

Hi Masami,

On 07/15/2015 02:43 PM, Masami Hiramatsu wrote:

Hi,

Here is the 2nd version of the patchset for probe-cache and
initial SDT support which are going to be perf-cache finally.


Thanks for adding the SDT support.


The perf-probe is useful for debugging, but it strongly depends
on the debuginfo. Without debuginfo, it is just a frontend of
ftrace's dynamic events. This can usually happen in server
farms or on cloud system, since no one wants to distribute
big debuginfo packages.

To solve this issue, I had tried to make a pre-analyzed probes
( https://lkml.org/lkml/2014/10/31/207 ) but it has a problem
that we can't ensure the probed binary is the same as what we analyzed.
Arnaldo gave me an idea to reuse build-id cache for that purpose
and this series is the first prototype of that.

At the same time, Hemant has started to support SDT probes which
also use the cache file of SDT info. So I decided to merge this
into the same build-id cache.
In this version, SDT support is still very limited, it works
as a part of probe-cache.

In this version, perf probe supports the --cache option, which means
that perf probe manipulates probe caches, for example,

   # perf probe --cache --add "probe-desc"

does not only add probe events but also adds "probe-desc" and
its result to the cache. (Note that the cached entry is always
referred to even without --cache)
The --list and --del commands also support --cache. Note that
both only manipulate caches, not real events.

To use SDT, we have to scan the target binary at first by using
perf-buildid-cache, e.g.

   # perf buildid-cache --add /lib/libc-2.17.so

And perf probe --cache --list shows what SDTs are scanned.

   # perf probe --cache --list
   /usr/lib/libc-2.17.so (a6fb821bdf53660eb2c29f778757aef294d3d392):
   libc:setjmp=setjmp
   libc:longjmp=longjmp
   libc:longjmp_target=longjmp_target
   libc:memory_heap_new=memory_heap_new
   libc:memory_sbrk_less=memory_sbrk_less
   libc:memory_arena_reuse_free_list=memory_arena_reuse_free_list
   libc:memory_arena_reuse=memory_arena_reuse
   ...

To use the SDT events, perf probe -x BIN %SDTEVENT allows you to
add a probe on SDTEVENT@BIN.

   # perf probe -x /lib/libc-2.17.so %memory_heap_new

If you define a cached probe with event name, you can also reuse
it as same as SDT events.

   # perf probe -x ./perf --cache -n 'myevent=dso__load $params'

(Note that "-n" option only updates caches)
To use the above "myevent", you just have to add "%myevent".

   # perf probe -x ./perf %myevent


TODOs:
  - Show available cached/SDT events by perf-list
  - Allow perf-record to use cached/SDT events directly


As I was already working on SDT events' recording
https://lkml.org/lkml/2014/11/2/73,
I can re-spin the patches on top of your patchset and make the
required changes to implement the above TODOs.
What would you suggest?


Thank you,

---

Hemant Kumar (1):
   perf/sdt: ELF support for SDT

Masami Hiramatsu (15):
   perf probe: Simplify __add_probe_trace_events code
   perf probe: Move ftrace probe-event operations to probe-file.c
   perf probe: Use strbuf for making strings in probe-event.c
   perf-buildid-cache: Use path/to/bin/buildid/elf instead of 
path/to/bin/buildid
   perf buildid: Use SBUILD_ID_SIZE macro
   perf buildid: Introduce sysfs/filename__sprintf_build_id
   perf: Add lsdir to read a directory
   perf-buildid-cache: Use lsdir for looking up buildid caches
   perf probe: Add --cache option to cache the probe definitions
   perf probe: Use cache entry if possible
   perf probe: Show all cached probes
   perf probe: Remove caches when --cache is given
   perf probe: Add group name support
   perf buildid-cache: Scan and import user SDT events to probe cache
   perf probe: Accept %sdt and %cached event name


  tools/perf/Documentation/perf-probe.txt |   14
  tools/perf/builtin-buildid-cache.c  |   22 -
  tools/perf/builtin-buildid-list.c   |   28 -
  tools/perf/builtin-probe.c  |3
  tools/perf/util/Build   |1
  tools/perf/util/build-id.c  |  230 ++--
  tools/perf/util/build-id.h  |   11
  tools/perf/util/dso.h   |5
  tools/perf/util/probe-event.c   |  918 ++-
  tools/perf/util/probe-event.h   |   16 -
  tools/perf/util/probe-file.c|  763 ++
  tools/perf/util/probe-file.h|   46 ++
  tools/perf/util/probe-finder.c  |   10
  tools/perf/util/symbol-elf.c|  252 +
  tools/perf/util/symbol.c|2
  tools/perf/util/symbol.h|   22 +
  tools/perf/util/util.c  |   34 +
  tools/perf/util/util.h  |4
  18 files changed, 1781 insertions(+), 600 deletions(-)
  create mode 100644 tools/perf/util/probe-file.c
  create mode 100644 tools/perf/util/probe-file.h




--
Thanks,
Hemant Kumar


linux-next: build failure after merge of the rcu tree

2015-07-15 Thread Stephen Rothwell
Hi Paul,

After merging the rcu tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

kernel/notifier.c: In function 'notify_die':
kernel/notifier.c:547:2: error: implicit declaration of function 'rcu_lockdep_assert' [-Werror=implicit-function-declaration]
  rcu_lockdep_assert(rcu_is_watching(),
  ^

Caused by commit

  02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")

interacting with commit

  e727c7d7a11e ("notifiers, RCU: Assert that RCU is watching in notify_die()")

[ and I also noted
  0333a209cbf6 ("x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion")
]

from the tip tree.
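
Note that the rename is not a pure search-and-replace: the old macro
complains when its condition is false, the new one when its condition is
true, so each converted call site has to negate the condition.  A minimal
user-space sketch of the two conventions (the _SKETCH macros and the
counter are made up and only count warnings instead of printing them):

```c
static int warn_count_sketch;

/* Old style: complain when the asserted condition is false. */
#define ASSERT_STYLE_SKETCH(cond, msg) \
	do { if (!(cond)) warn_count_sketch++; } while (0)

/* New style: complain when the warning condition is true. */
#define WARN_STYLE_SKETCH(cond, msg) \
	do { if (cond) warn_count_sketch++; } while (0)

/* Same check written both ways; both warn only when not watching. */
static void old_check(int watching)
{
	ASSERT_STYLE_SKETCH(watching, "RCU is not watching");
}

static void new_check(int watching)
{
	WARN_STYLE_SKETCH(!watching, "RCU is not watching");
}
```
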

I added the following merge fix patch:

From: Stephen Rothwell 
Date: Thu, 16 Jul 2015 13:08:50 +1000
Subject: [PATCH] rcu: merge fix for Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()

Signed-off-by: Stephen Rothwell 
---
 arch/x86/kernel/irq.c | 2 +-
 kernel/notifier.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 30dbf35bc90b..f9cd81825187 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -234,7 +234,7 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
entering_irq();
 
/* entering_irq() tells RCU that we're not quiescent.  Check it. */
-   rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
+   RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
 
irq = __this_cpu_read(vector_irq[vector]);
 
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 980e4330fb59..fd2c9acbcc19 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -544,7 +544,7 @@ int notrace notify_die(enum die_val val, const char *str,
.signr  = sig,
 
};
-   rcu_lockdep_assert(rcu_is_watching(),
+   RCU_LOCKDEP_WARN(!rcu_is_watching(),
   "notify_die called but RCU thinks we're quiescent");
return atomic_notifier_call_chain(&die_chain, val, &args);
 }
-- 
2.1.4

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au


linux-next: manual merge of the rcu tree with the tip tree

2015-07-15 Thread Stephen Rothwell
Hi Paul,

Today's linux-next merge of the rcu tree got a conflict in:

  arch/x86/kernel/traps.c

between commit:

  8c84014f3bbb ("x86/entry: Remove exception_enter() from most trap handlers")

from the tip tree and commit:

  02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")

from the rcu tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc arch/x86/kernel/traps.c
index 8e65d8a9b8db,c5a5231d1d11..
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@@ -131,14 -136,19 +131,14 @@@ void ist_enter(struct pt_regs *regs
preempt_count_add(HARDIRQ_OFFSET);
  
/* This code is a bit fragile.  Test it. */
-   rcu_lockdep_assert(rcu_is_watching(), "ist_enter didn't work");
+   RCU_LOCKDEP_WARN(!rcu_is_watching(), "ist_enter didn't work");
 -
 -  return prev_state;
  }
  
 -void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
 +void ist_exit(struct pt_regs *regs)
  {
 -  /* Must be before exception_exit. */
preempt_count_sub(HARDIRQ_OFFSET);
  
 -  if (user_mode(regs))
 -  return exception_exit(prev_state);
 -  else
 +  if (!user_mode(regs))
rcu_nmi_exit();
  }
  


Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Casey Schaufler
On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
> On Wed, Jul 15, 2015 at 3:39 PM, Casey Schaufler wrote:
>> On 7/15/2015 2:06 PM, Eric W. Biederman wrote:
>>> Casey Schaufler  writes:
>>> The first step needs to be not trusting those labels and treating such
> >>> filesystems as filesystems without label support.  I hope that is what
> >>> Seth has implemented.
>> A filesystem with Smack labels gets mounted in a namespace. The labels
>> are ignored. Instead, the filesystem defaults (potentially specified as
>> mount options smackfsdef="something", but usually the floor label ("_"))
>> are used, giving the user the ability to read everything and (usually)
>> change nothing. This is both dangerous (unintended read access to files)
>> and pointless (can't make changes).
> I don't get it.
>
> If I mount an unprivileged filesystem, then either the contents were
> put there *by me*, in which case letting me access them is fine, or
> (with Seth's patches and then some) I control the backing store, in
> which case I can do whatever I want regardless of what LSM thinks.
>
> So I don't see the problem.  Why would Smack or any other LSM care at
> all, unless it wants to prevent me from mounting the fs in the first
> place?

First off, I don't cotton to the notion that you should be able
to mount filesystems without privilege. But it seems I'm being
outvoted on that. I suspect that there are cases where it might
be safe, but I can't think of one off the top of my head.

If you do mount a filesystem it needs to behave according to the
rules of the system. If you have a security module that uses
attributes on the filesystem you can't ignore them just because
it's "your data". Mandatory access control schemes, including
Smack and SELinux don't give a fig about who you are. It's the
label on the data and the process that matter. If "you" get to
muck the labels up, you've broken the mandatory access control.

> --Andy



Re: [PATCH 1/7] fs: Add user namesapace member to struct super_block

2015-07-15 Thread Eric W. Biederman
Seth Forshee  writes:

> Initially this will be used to eliminate the implicit MNT_NODEV
> flag for mounts from user namespaces. In the future it will also
> be used for translating ids and checking capabilities for
> filesystems mounted from user namespaces.
>
> s_user_ns is initialized in alloc_super() and is generally set to
> current_user_ns(). To avoid security and corruption issues, two
> additional mount checks are also added:
>
>  - do_new_mount() gains a check that the user has CAP_SYS_ADMIN
>in current_user_ns().
>
>  - sget() will fail with EBUSY when the filesystem it's looking
>for is already mounted from another user namespace.
>
> proc needs some special handling here. The user namespace of
> current isn't appropriate when forking as a result of clone (2)
> with CLONE_NEWPID|CLONE_NEWUSER, as it will make proc unmountable
> from within the new user namespace. Instead, the user namespace
> which owns the new pid namespace should be used. sget_userns() is
> added to allow passing of a user namespace other than that of
> current, and this is used by proc_mount(). sget() becomes a
> wrapper around sget_userns() which passes current_user_ns().

From bits of the previous conversation.

We need sget_userns(..., &init_user_ns) for sysfs.  The sysfs
xattrs can travel from one mount of sysfs to another via the sysfs
backing store.

For tmpfs and any other filesystems that we support mounting without
privilege and that support xattrs, we need to identify them and
see if userspace is taking advantage of the ability to set
xattrs and file caps (unlikely).  If it is, we need to call
sget_userns(..., &init_user_ns) on those filesystems as well.

Possibly/Probably we should just do that for all of the interesting
filesystems to start with and then change back to an ordinary old sget
after we have done the testing and confirmed we will not be introducing
userspace regressions.

Eric


Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework

2015-07-15 Thread Linus Torvalds
On Wed, Jul 15, 2015 at 5:34 PM, Dave Hansen wrote:
>
> I understand why you were misled by it, but the old "xsave_hdr_struct"
> was wrong.  Fenghua even posted patches to remove it before the FPU
> rework (you were cc'd):
>
> https://lkml.org/lkml/2015/4/18/164

Oh, and that patch looks like a good idea.

I wish there was some way to make sure sizeof() fails on it so that
we'd enforce that nobody allocates that thing as-is. I had this dim
memory that an unsized array at the end would do that, but I was
clearly wrong. It's just the array itself you can't do sizeof on, not
the structure that contains it. Is there some magic trick that I'm
forgetting?

  Linus
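
The behaviour described above can be reproduced in standard C. This is a
minimal sketch, assuming a hypothetical struct blob (not the kernel's
xsave layout): sizeof on the containing structure compiles and covers
only the fixed part, so the caller has to add the payload size by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed header followed by a variable-sized payload. */
struct blob {
	uint64_t header[8];
	uint8_t payload[];	/* flexible array member (C99) */
};

/*
 * sizeof(struct blob) is accepted and ignores the payload entirely;
 * only sizeof on the payload member itself is rejected by the compiler.
 */
static size_t blob_fixed_size(void)
{
	return sizeof(struct blob);
}

/* Allocates the fixed part plus payload_len bytes of payload, zeroed. */
static struct blob *blob_alloc(size_t payload_len)
{
	/* Guard against size_t overflow before adding the payload. */
	assert(payload_len <= SIZE_MAX - sizeof(struct blob));
	return calloc(1, sizeof(struct blob) + payload_len);
}
```

Nothing in the type stops a caller from allocating sizeof(struct blob)
bytes and then writing into payload, which is the trap being discussed.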


Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp

2015-07-15 Thread Naoya Horiguchi
On Thu, Jul 16, 2015 at 04:33:07AM +0200, Andi Kleen wrote:
> > @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
> >  * directly for tail pages.
> >  */
> > if (PageTransHuge(head)) {
> > +   /*
> > +* Non anonymous thp exists only in allocation/free time. We
> > +* can't handle such a case correctly, so let's give it up.
> > +* This should be better than triggering BUG_ON when kernel
> > +* tries to touch a "partially handled" page.
> > +*/
> > +   if (!PageAnon(head))
> > +   return 0;
> 
> Please print a message for this case. In the future there will be
> likely more non anonymous THP pages from Kirill's large page cache work
> (so eventually we'll need it)

OK, I'll do this.

Thanks,
Naoya Horiguchi


[GIT PULL] ARM: EXYNOS: mach: Improvements for 4.3

2015-07-15 Thread Krzysztof Kozlowski
Dear Kukjin,

Exynos mach-code related improvements. Description along with a tag.
You can find them also on the lists with my reviewed-by.

Best regards,
Krzysztof


The following changes since commit 1c4c7159ed2468f3ac4ce5a7f08d79663d381a93:

  Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 (2015-07-05 16:24:54 -0700)

are available in the git repository at:


  https://github.com/krzk/linux.git tags/samsung-mach-4.3

for you to fetch changes up to 70f83b6716ea0e5944071c12ff1716f93a9c2d8d:

  cpufreq: exynos: remove Exynos5250 specific cpufreq driver support (2015-07-16 10:39:56 +0900)


Improvements for Exynos based boards:
1. Switch to generic cpufreq-dt driver for Exynos5250. The old driver
   is removed.
2. Fix memory leak in cpufreq error path.
3. Cleanups: remove duplicated define with bootloader's sleep magic
   constant, staticize local function, drop 'owner' from
   platform driver, fix cast of iomem to ERR_PTR.


Bartlomiej Zolnierkiewicz (1):
  cpufreq: exynos: remove Exynos5250 specific cpufreq driver support

Krzysztof Kozlowski (4):
  ARM: EXYNOS: pmu: Make local function static
  ARM: EXYNOS: Remove duplicated define of SLEEP_MAGIC
  ARM: EXYNOS: pmu: Drop owner assignment
  ARM: EXYNOS: Use IOMEM_ERR_PTR when function returns iomem

Shailendra Verma (1):
  cpufreq: exynos: Fix for memory leak in case SOC name does not match

Thomas Abraham (3):
  clk: samsung: exynos5250: add cpu clock configuration data and instantiate cpu clock
  ARM: dts: Exynos5250: add CPU OPP and regulator supply property
  ARM: Exynos: switch to using generic cpufreq driver for Exynos5250

 arch/arm/boot/dts/exynos5250-arndale.dts  |   4 +
 arch/arm/boot/dts/exynos5250-smdk5250.dts |   4 +
 arch/arm/boot/dts/exynos5250-snow.dts |   4 +
 arch/arm/boot/dts/exynos5250-spring.dts   |   4 +
 arch/arm/boot/dts/exynos5250.dtsi |  22 
 arch/arm/mach-exynos/common.h |   6 +
 arch/arm/mach-exynos/exynos.c |   1 +
 arch/arm/mach-exynos/firmware.c   |   2 -
 arch/arm/mach-exynos/platsmp.c|   2 +-
 arch/arm/mach-exynos/pmu.c|   3 +-
 arch/arm/mach-exynos/suspend.c|   4 +-
 drivers/clk/samsung/clk-exynos5250.c  |  31 +
 drivers/cpufreq/Kconfig.arm   |  11 --
 drivers/cpufreq/Makefile  |   1 -
 drivers/cpufreq/exynos-cpufreq.c  |   9 +-
 drivers/cpufreq/exynos-cpufreq.h  |  17 ---
 drivers/cpufreq/exynos5250-cpufreq.c  | 210 --
 include/dt-bindings/clock/exynos5250.h|   1 +
 18 files changed, 84 insertions(+), 252 deletions(-)
 delete mode 100644 drivers/cpufreq/exynos5250-cpufreq.c


[GIT PULL] ARM: EXYNOS: dts: Improvements for 4.3

2015-07-15 Thread Krzysztof Kozlowski
Dear Kukjin,

DTS related improvements. Description along with a tag.
You can find them also on the lists with my reviewed-by.

Best regards,
Krzysztof


The following changes since commit a419d78a6f97f8c977fe55d5d590cd0654ecd1ee:

  ARM: dts: Exynos4210: add CPU OPP and regulator supply property (2015-07-13 21:16:05 +0900)

are available in the git repository at:

  https://github.com/krzk/linux.git tags/samsung-dt-4.3

for you to fetch changes up to cd0b551be420d49c2bde8dcf5ea147278dc89ffb:

  ARM: dts: Extend exynos5420-pinctrl nodes using labels instead of paths (2015-07-16 11:22:11 +0900)


Device Tree improvements for Exynos based boards:
1. Enable proper USB 3.0 regulators on Odroid XU3 board.
2. Set over-heat and over-voltage thresholds for Trats2 board fuel
   gauge.
3. Fix missing display frequency on Exynos3250 Rinato board
   (necessary to fix the display).
4. Enable thermal management and fan control on Odroid XU3 board.
   The speed of fan is adjusted to current temperature of SoC.
5. Cleanups and usage of label-notation for overriding nodes.


Anand Moon (5):
  ARM: dts: odroidxu3: Enable USB3 regulators
  ARM: dts: exynos5422-odroidxu3: Add pwm-fan node
  ARM: dts: exynos5422-odroidxu3: Enable TMU at Exynos5422 base
  ARM: dts: exynos5422-odroidxu3: Define default thermal-zones
  ARM: dts: exynos5422-odroidxu3: Enable thermal-zones

Andreas Färber (1):
  ARM: dts: Clean up exynos5410-smdk5410 indentation

Hyungwon Hwang (1):
  ARM: dts: fix the clock-frequency of exynos3250-rinato board's panel

Javier Martinez Canillas (4):
  ARM: dts: Include exynos5250-pinctrl after the nodes were defined
  ARM: dts: Extend exynos5250-pinctrl nodes using labels instead of paths
  ARM: dts: Include exynos5420-pinctrl after the nodes were defined
  ARM: dts: Extend exynos5420-pinctrl nodes using labels instead of paths

Krzysztof Kozlowski (2):
  ARM: dts: Set max17047 over heat and over voltage thresholds
  ARM: dts: Use labels for overriding nodes in exynos4210-universal

 arch/arm/boot/dts/exynos3250-rinato.dts|2 +-
 arch/arm/boot/dts/exynos4210-universal_c210.dts|  620 
 arch/arm/boot/dts/exynos4412-trats2.dts|3 +
 arch/arm/boot/dts/exynos5250-pinctrl.dtsi  | 1600 ++--
 arch/arm/boot/dts/exynos5250.dtsi  |3 +-
 arch/arm/boot/dts/exynos5410-smdk5410.dts  |6 +-
 arch/arm/boot/dts/exynos5420-pinctrl.dtsi  | 1411 +
 arch/arm/boot/dts/exynos5420.dtsi  |3 +-
 arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi  |   59 +
 arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi |   46 +
 10 files changed, 1930 insertions(+), 1823 deletions(-)
 create mode 100644 arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi


Re: [PATCH v2 4/6] ARM: OMAP: PRM: Remove hardcoding of IRQENABLE_MPU_2 and IRQSTATUS_MPU_2 register offsets

2015-07-15 Thread Paul Walmsley
On Wed, 8 Jul 2015, Keerthy wrote:

> The register offsets of IRQENABLE_MPU_2 and IRQSTATUS_MPU_2 are hardcoded.
> This makes it difficult to reuse the code for SoCs like AM437x that have
> a single instance of IRQENABLE_MPU and IRQSTATUS_MPU registers.
> Hence handling the case using offset of 4 to accommodate single set of IRQ*
> registers generically.
> 
> Signed-off-by: Keerthy 

Thanks, queued for v4.3.


- Paul


Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework

2015-07-15 Thread Linus Torvalds
On Wed, Jul 15, 2015 at 5:34 PM, Dave Hansen wrote:
>
> The old code sized the buffer in a fully architectural way and it
> worked.  The CPU *tells* you how much memory the 'xsave' instruction is
> going to scribble on.  The new code just merrily calls it and lets it
> scribble away.  This is as clear-cut a regression as I've ever seen.

Yes, I think we'll need to revert it, or do something else drastic
like make that initial fp state allocation *much* bigger and then have
a "disable xsaves if it's still not big enough".

setup_xstate_features() should be able to easily just say "this was
the maximum offset+size we saw", and we can take that to either do a
proper allocation, or verify that the static allocation is indeed big
enough.

Apparently a straight revert doesn't work, if only because things in
that area have been renamed very aggressively (both files and
functions and variables). Ingo?

Linus
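
The "maximum offset+size we saw" check suggested above could look roughly
like the following. This is a minimal sketch under stated assumptions:
struct feature_area and both helpers are hypothetical names, and the
offsets and sizes would come from the CPU's own enumeration rather than
from a table handed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one enabled state component. */
struct feature_area {
	unsigned int offset;	/* start of the component in the save area */
	unsigned int size;	/* bytes the component occupies */
};

/*
 * Returns the highest offset+size over all components, i.e. how many
 * bytes a save of these components may write.
 */
static unsigned int max_save_area_end(const struct feature_area *f, size_t n)
{
	unsigned int end = 0;

	assert(f != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (f[i].offset + f[i].size > end)
			end = f[i].offset + f[i].size;
	}
	return end;
}

/* Nonzero if a statically sized buffer of buf_size bytes is big enough. */
static int static_buffer_fits(const struct feature_area *f, size_t n,
			      size_t buf_size)
{
	return max_save_area_end(f, n) <= buf_size;
}
```

The same maximum can drive either option mentioned: size a proper
allocation from it, or refuse the feature when the static buffer is too
small.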


Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets

2015-07-15 Thread Paul Walmsley
On Thu, 16 Jul 2015, Paul Walmsley wrote:

> On Wed, 8 Jul 2015, Keerthy wrote:
> 
> > Add the PRM IRQ register offsets.
> > 
> > Signed-off-by: Keerthy 
> 
> Please add more detail to your commit messages so they conform to 
> Documentation/SubmittingPatches:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109
> 
> For example, this commit message should read something like:
> 
> ---
> 
> ARM: AM43xx: Add the PRM IRQ register offsets
> 
> Add the PRM IRQ register offsets.  This is needed to support PRM I/O 
> wakeup on AM43xx.
> 
> --
> 
> Basically, your patches need to provide context as to _why_ the change is 
> needed. 
> 
> I've fixed the message for this patch, and queued it for v4.3, but 
> please take care with this issue in the future.

Also I've moved the AM43XX_PRM_IO_PMCTRL_OFFSET macro out of the AM43XX CM 
section, since it doesn't belong there.


- Paul


Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets

2015-07-15 Thread Paul Walmsley
On Wed, 8 Jul 2015, Keerthy wrote:

> Add the PRM IRQ register offsets.
> 
> Signed-off-by: Keerthy 

Please add more detail to your commit messages so they conform to 
Documentation/SubmittingPatches:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109

For example, this commit message should read something like:

---

ARM: AM43xx: Add the PRM IRQ register offsets

Add the PRM IRQ register offsets.  This is needed to support PRM I/O 
wakeup on AM43xx.

--

Basically, your patches need to provide context as to _why_ the change is 
needed. 

I've fixed the message for this patch, and queued it for v4.3, but 
please take care with this issue in the future.


- Paul


Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp

2015-07-15 Thread Andi Kleen
> @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
>* directly for tail pages.
>*/
>   if (PageTransHuge(head)) {
> + /*
> +  * Non anonymous thp exists only in allocation/free time. We
> +  * can't handle such a case correctly, so let's give it up.
> +  * This should be better than triggering BUG_ON when kernel
> +  * tries to touch a "partially handled" page.
> +  */
> + if (!PageAnon(head))
> + return 0;

Please print a message for this case. In the future there will be
likely more non anonymous THP pages from Kirill's large page cache work
(so eventually we'll need it)

-Andi



[PATCH 3.19.y-ckt 002/251] sctp: fix ASCONF list handling

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Marcelo Ricardo Leitner 

[ Upstream commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 ]

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backing up
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handling on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addr_wq_lock. The lock now
will be taken on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewriting it by
implementing sctp_copy_descendant().

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen 
Suggested-by: Neil Horman 
Suggested-by: Hannes Frederic Sowa 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: Marcelo Ricardo Leitner 
Signed-off-by: David S. Miller 
Cc: Moritz Mühlenhoff 
Reference: CVE-2015-3212
Signed-off-by: Kamal Mostafa 
---
 include/net/netns/sctp.h   |  1 +
 include/net/sctp/structs.h |  4 
 net/sctp/socket.c  | 43 ---
 3 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h
index 3573a81..8ba379f 100644
--- a/include/net/netns/sctp.h
+++ b/include/net/netns/sctp.h
@@ -31,6 +31,7 @@ struct netns_sctp {
struct list_head addr_waitq;
struct timer_list addr_wq_timer;
struct list_head auto_asconf_splist;
+   /* Lock that protects both addr_waitq and auto_asconf_splist */
spinlock_t addr_wq_lock;
 
/* Lock that protects the local_addr_list writers */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 2bb2fcf..495c87e 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -223,6 +223,10 @@ struct sctp_sock {
atomic_t pd_mode;
/* Receive to here while partial delivery is in effect. */
struct sk_buff_head pd_lobby;
+
+   /* These must be the last fields, as they will skipped on copies,
+* like on accept and peeloff operations
+*/
struct list_head auto_asconf_list;
int do_auto_asconf;
 };
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index aafe94b..4e56571 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1533,8 +1533,10 @@ static void sctp_close(struct sock *sk, long timeout)
 
/* Supposedly, no process has access to the socket, but
 * the net layers still may.
+* Also, sctp_destroy_sock() needs to be called with addr_wq_lock
+* held and that should be grabbed before socket lock.
 */
-   local_bh_disable();
+   spin_lock_bh(&net->sctp.addr_wq_lock);
bh_lock_sock(sk);
 
/* Hold the sock, since sk_common_release() will put sock_put()
@@ -1544,7 +1546,7 @@ static void sctp_close(struct sock *sk, long timeout)
sk_common_release(sk);
 
bh_unlock_sock(sk);
-   local_bh_enable();
+   spin_unlock_bh(&net->sctp.addr_wq_lock);
 
sock_put(sk);
 
@@ -3587,6 +3589,7 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
if ((val && sp->do_auto_asconf) || (!val && !sp->do_auto_asconf))
return 0;
 
+   spin_lock_bh(&sock_net(sk)->sctp.addr_wq_lock);
if (val == 0 && sp->do_auto_asconf) {
list_del(&sp->auto_asconf_list);
sp->do_auto_asconf = 0;
@@ -3595,6 +3598,7 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
&sock_net(sk)->sctp.auto_asconf_splist);
sp->do_auto_asconf = 1;
}
+   spin_unlock_bh(&sock_net(sk)->sctp.addr_wq_lock);
return 0;
 }
 
@@ -4128,18 +4132,28 @@ static int sctp_init_sock(struct sock *sk)
local_bh_disable();
percpu_counter_inc(&sctp_sockets_allocated);
sock_prot_inuse_add(net, sk->sk_prot, 1);
+
+   /* Nothing can fail after this block, otherwise
+* sctp_destroy_sock() will be called without addr_wq_lock held
+*/
if (net->sctp.default_auto_asconf) {
+   
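
The "must be the last fields" comment added to struct sctp_sock above
matters because a copy helper can then stop at the first of those
fields. A minimal sketch of that idea, assuming a hypothetical struct
demo_sock and helper demo_copy_descendant() (neither is the kernel's
code; the real helper works on the socket structures):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical socket-like struct; the two trailing fields are
 * per-instance state and must not be cloned.
 */
struct demo_sock {
	int pd_mode;
	int hb_interval;
	/* Keep these last: they are skipped on copies. */
	void *list_entry;
	int do_auto;
};

/* Copies everything up to, but not including, list_entry. */
static void demo_copy_descendant(struct demo_sock *dst,
				 const struct demo_sock *src)
{
	assert(dst != src);
	memcpy(dst, src, offsetof(struct demo_sock, list_entry));
}
```

Because the copy length is an offset, any field placed after list_entry
silently stops being copied, which is why the ordering is documented in
the struct itself.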

[PATCH 3.19.y-ckt 009/251] net/mlx4_en: Wake TX queues only when there's enough room

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Ido Shamay 

[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ]

Indication of a single completed packet, marked by txbbs_skipped
being greater than zero, is not enough to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while the next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created a helper routine, mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.

Signed-off-by: Ido Shamay 
Signed-off-by: Or Gerlitz 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 19 +++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 06c0de6..b54e621 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -66,6 +66,7 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
ring->size = size;
ring->size_mask = size - 1;
ring->stride = stride;
+   ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS;
 
tmp = size * sizeof(struct mlx4_en_tx_info);
ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node);
@@ -232,6 +233,11 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
   MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp);
 }
 
+static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring)
+{
+   return ring->prod - ring->cons > ring->full_size;
+}
+
 static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
  struct mlx4_en_tx_ring *ring, int index,
  u8 owner)
@@ -474,11 +480,10 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 
netdev_tx_completed_queue(ring->tx_queue, packets, bytes);
 
-   /*
-* Wakeup Tx queue if this stopped, and at least 1 packet
-* was completed
+   /* Wakeup Tx queue if this stopped, and ring is not full.
 */
-   if (netif_tx_queue_stopped(ring->tx_queue) && txbbs_skipped > 0) {
+   if (netif_tx_queue_stopped(ring->tx_queue) &&
+   !mlx4_en_is_tx_ring_full(ring)) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
@@ -922,8 +927,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
skb_tx_timestamp(skb);
 
/* Check available TXBBs And 2K spare for prefetch */
-   stop_queue = (int)(ring->prod - ring_cons) >
- ring->size - HEADROOM - MAX_DESC_TXBBS;
+   stop_queue = mlx4_en_is_tx_ring_full(ring);
if (unlikely(stop_queue)) {
netif_tx_stop_queue(ring->tx_queue);
ring->queue_stopped++;
@@ -992,8 +996,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
smp_rmb();
 
ring_cons = ACCESS_ONCE(ring->cons);
-   if (unlikely(((int)(ring->prod - ring_cons)) <=
-ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+   if (unlikely(!mlx4_en_is_tx_ring_full(ring))) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 0e80118..18f8578 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -280,6 +280,7 @@ struct mlx4_en_tx_ring {
u32 size; /* number of TXBBs */
u32 size_mask;
u16 stride;
+   u32 full_size;
u16 cqn; /* index of port CQ associated with this ring */
u32 buf_size;
__be32  doorbell_qpn;
-- 
1.9.1
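
The helper introduced by this patch compares prod - cons against a
precomputed threshold. A minimal user-space sketch of why that works
with free-running 32-bit counters, assuming a hypothetical struct
demo_ring and an arbitrary reserve value (the driver uses its own
HEADROOM and MAX_DESC_TXBBS constants, which are not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring with free-running producer/consumer counters. */
struct demo_ring {
	uint32_t prod;		/* total descriptors posted */
	uint32_t cons;		/* total descriptors completed */
	uint32_t full_size;	/* ring size minus the worst-case reserve */
};

/* Precomputes the threshold once, as the patch does at ring creation. */
static void demo_ring_init(struct demo_ring *r, uint32_t size, uint32_t reserve)
{
	assert(size > reserve);
	r->prod = 0;
	r->cons = 0;
	r->full_size = size - reserve;
}

/*
 * The in-flight count is prod - cons modulo 2^32, so the comparison
 * stays correct after the counters wrap around.
 */
static bool demo_ring_is_full(const struct demo_ring *r)
{
	return r->prod - r->cons > r->full_size;
}
```

Using the same predicate for both stopping and waking the queue keeps
the two decisions consistent, which is the point of the refactoring.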



[PATCH 3.19.y-ckt 008/251] net/mlx4_en: Release TX QP when destroying TX ring

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Eran Ben Elisha 

[ Upstream commit 0eb08514fdbdcd16fd6870680cd638f203662e9d ]

TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
used the deprecated base_tx_qpn field. Move TX QP release to
mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.

Fixes: ddae0349fdb7 ('net/mlx4: Change QP allocation scheme')
Signed-off-by: Eran Ben Elisha 
Signed-off-by: Or Gerlitz 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 1 -
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index c998c4d..99b99eb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1973,10 +1973,6 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv)
mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
}
 
-   if (priv->base_tx_qpn) {
-   mlx4_qp_release_range(priv->mdev->dev, priv->base_tx_qpn, priv->tx_ring_num);
-   priv->base_tx_qpn = 0;
-   }
 }
 
 int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 18db895..06c0de6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -180,6 +180,7 @@ void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
mlx4_bf_free(mdev->dev, &ring->bf);
mlx4_qp_remove(mdev->dev, &ring->qp);
mlx4_qp_free(mdev->dev, &ring->qp);
+   mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
mlx4_en_unmap_buffer(&ring->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
kfree(ring->bounce_buf);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 6cc49c1..0e80118 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -599,7 +599,6 @@ struct mlx4_en_priv {
int vids[128];
bool wol;
struct device *ddev;
-   int base_tx_qpn;
struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
struct hwtstamp_config hwtstamp_config;
 
-- 
1.9.1



[PATCH 3.19.y-ckt 010/251] net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Ido Shamay 

[ Upstream commit 79a258526ce1051cb9684018c25a89d51ac21be8 ]

The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
offload is disabled. This is wrong since rxvlan offload can be switched
on/off regardless of hwtstamp_rx_filter.

Also moved check_csum to query CQE information to identify VLAN packets
and removed the check of IP packets, since it has been validated before.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Ido Shamay 
Signed-off-by: Or Gerlitz 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 10d3533..7f16627 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -719,7 +719,7 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
 }
 #endif
 static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
- int hwtstamp_rx_filter)
+ netdev_features_t dev_features)
 {
__wsum hw_checksum = 0;
 
@@ -727,14 +727,8 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
 
hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
 
-   if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) &&
-   hwtstamp_rx_filter != HWTSTAMP_FILTER_NONE) {
-   /* next protocol non IPv4 or IPv6 */
-   if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-   != htons(ETH_P_IP) &&
-   ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-   != htons(ETH_P_IPV6))
-   return -1;
+   if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK) &&
+   !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) {
hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr);
hdr += sizeof(struct vlan_hdr);
}
@@ -897,7 +891,8 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 
if (ip_summed == CHECKSUM_COMPLETE) {
void *va = skb_frag_address(skb_shinfo(gro_skb)->frags);
-   if (check_csum(cqe, gro_skb, va, ring->hwtstamp_rx_filter)) {
+   if (check_csum(cqe, gro_skb, va,
+  dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_none++;
ring->csum_complete--;
@@ -952,7 +947,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct 
mlx4_en_cq *cq, int bud
}
 
if (ip_summed == CHECKSUM_COMPLETE) {
-   if (check_csum(cqe, skb, skb->data, 
ring->hwtstamp_rx_filter)) {
+   if (check_csum(cqe, skb, skb->data, dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_complete--;
ring->csum_none++;
-- 
1.9.1



Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Eric W. Biederman
Andy Lutomirski  writes:

> On Jul 15, 2015 3:34 PM, "Eric W. Biederman"  wrote:
>>
>> Seth Forshee  writes:
>>
>> > On Wed, Jul 15, 2015 at 04:06:35PM -0500, Eric W. Biederman wrote:
>> >> Casey Schaufler  writes:
>> >>
>> >> > On 7/15/2015 12:46 PM, Seth Forshee wrote:
>> >> >> These are the first in a larger set of patches that I've been working
>> >> >> on (with help from Eric Biederman) to support mounting ext4 and fuse
>> >> >> filesystems from within user namespaces. I've pushed the full series
>> >> >> to:
>> >> >>
>> >> >>   git://kernel.ubuntu.com/sforshee/linux.git userns-mounts
>> >> >>
>> >> >> Taking the series as a whole, the strategy is to handle as much of the
>> >> >> heavy lifting as possible in the vfs so the filesystems don't have to
>> >> >> handle weird edge cases. If you look at the full series you'll find
>> >> >> that the changes in ext4 to support user namespace mounts turn out to
>> >> >> be fairly minimal (fuse is a bit more complicated though as it must
>> >> >> deal with translating ids for a userspace process which is running in
>> >> >> pid and user namespaces).
>> >> >>
>> >> >> The patches I'm sending today lay some of the groundwork in the vfs and
>> >> >> related code. They fall into two broad groups:
>> >> >>
>> >> >>  1. Patches 1-2 add s_user_ns and simplify MNT_NODEV handling. These
>> >> >> are pretty straightforward, and Eric has expressed interest in
>> >> >> merging these patches soon. Note that patch 2 won't apply cleanly
>> >> >> without Eric's noexec patches for proc and sys [1].
>> >> >>
>> >> >>  2. Patches 2-7 tighten down security for mounts with s_user_ns !=
>> >> >> &init_user_ns. This includes updates to how file caps and suid are
>> >> >> handled and LSM updates to ignore security labels on superblocks
>> >> >> from non-init namespaces.
>> >> >>
>> >> >> The LSM changes in particular may not be optimal, as I don't have a
>> >> >> lot of familiarity with this code, so I'd be especially appreciative
>> >> >> of review of these changes and suggestions on how to improve them.
>> >> >
>> >> > Lukasz Pawelczyk  proposed
>> >> > LSM support in user namespaces ([RFC] lsm: namespace hooks)
>> >> > that make a whole lot more sense than just turning off
>> >> > the option of using labels on files. Gutting the ability
>> >> > to use MAC in a namespace is a step down the road of
>> >> > making MAC and namespaces incompatible.
>> >>
>> >> This is not "turning off the option to use labels on files".
>> >>
>> >> This is supporting mounting filesystems like ext4 by unprivileged users
>> >> and not trusting the labels they set in the same way as we trust labels
>> >> on filesystems mounted by privileged users.
>> >>
>> >> The first step needs to be not trusting those labels and treating such
>> >> filesystems as filesystems without label support.  I hope that is what
>> >> Seth has implemented.
>> >>
>> >> In the long run we can do more interesting things with such filesystems
>> >> once the appropriate LSM policy is in place.
>> >
>> > Yes, this exactly. Right now it looks to me like the only safe thing to
>> > do with mounts from unprivileged users is to ignore the security labels,
>> > so that's what I'm trying to do with these changes. If there's some
>> > better thing to do, or some better way to do it, I'm more than happy to
>> > receive that feedback.
>>
>> Ugh.
>>
>> This made me realize that we have an interesting problem here.  An
>> unprivileged mount of tmpfs probably needs to have
>> s_user_ns == &init_user_ns.
>>
>> Otherwise we will break security labels on tmpfs for no good reason.
>> ramfs and sysfs also seem to have similar concerns.
>>
>> Because they have no backing store we can trust those filesystems with
>> security labels.  Plus for at least sysfs there is the security label
>> bleed through issue, that we need to make certain works.
>>
>> Perhaps these filesystems with trusted backing store need to call
>> "sget_userns(..., &init_user_ns)".
>>
>> If we don't get this right we will have significant regressions with
>> respect to security labels, and that is not ok.
>
> That's only a problem if there's anyone who sets security labels on
> such a mount.  You need global caps to do that (I hope), which
> requires someone outside the userns to help, which means there's a
> good chance that literally no one does this.

Fair enough.  That is however something we need to test.  If no one
puts security labels or file caps on such a mount we can change things.
If not we can't because it would introduce regressions.
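The rule being debated above can be sketched in plain C. This is only an
illustration of the idea, not kernel code: struct user_ns, struct superblock,
init_ns, mount_owner() and trust_security_labels() are hypothetical stand-ins
for s_user_ns, &init_user_ns and the sget_userns() suggestion, and whether
tmpfs-like filesystems should really be owned by the initial namespace is
exactly the open question in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the kernel's structures are far richer. */
struct user_ns { int level; };
static const struct user_ns init_ns = { 0 };

struct superblock {
    const struct user_ns *s_user_ns;  /* namespace that owns the mount */
};

/*
 * Pick the owner at mount time: a filesystem with no untrusted backing
 * store (tmpfs-like) stays owned by the initial namespace.
 */
static const struct user_ns *mount_owner(const struct user_ns *mounter,
                                         bool trusted_backing_store)
{
    return trusted_backing_store ? &init_ns : mounter;
}

/* Labels are honoured only on superblocks owned by the initial namespace. */
static bool trust_security_labels(const struct superblock *sb)
{
    return sb->s_user_ns == &init_ns;
}
```

With this split, an ext4-like mount made from a child namespace has its labels
ignored, while a tmpfs-like mount from the same namespace keeps them.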

Eric


[PATCH 3.19.y-ckt 020/251] [media] cx24116: fix a buffer overflow when checking userspace params

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab 

commit 1fa2337a315a2448c5434f41e00d56b01a22283c upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows writing many more values:
drivers/media/dvb-frontends/cx24116.c:983 cx24116_send_diseqc_msg() 
error: buffer overflow 'd->msg' 6 <= 23

Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/dvb-frontends/cx24116.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/dvb-frontends/cx24116.c 
b/drivers/media/dvb-frontends/cx24116.c
index 2916d7c..7bc68b3 100644
--- a/drivers/media/dvb-frontends/cx24116.c
+++ b/drivers/media/dvb-frontends/cx24116.c
@@ -963,6 +963,10 @@ static int cx24116_send_diseqc_msg(struct dvb_frontend *fe,
struct cx24116_state *state = fe->demodulator_priv;
int i, ret;
 
+   /* Validate length */
+   if (d->msg_len > sizeof(d->msg))
+return -EINVAL;
+
/* Dump DiSEqC message */
if (debug) {
printk(KERN_INFO "cx24116: %s(", __func__);
@@ -974,10 +978,6 @@ static int cx24116_send_diseqc_msg(struct dvb_frontend *fe,
printk(") toneburst=%d\n", toneburst);
}
 
-   /* Validate length */
-   if (d->msg_len > (CX24116_ARGLEN - CX24116_DISEQC_MSGOFS))
-   return -EINVAL;
-
/* DiSEqC message */
for (i = 0; i < d->msg_len; i++)
state->dsec_cmd.args[CX24116_DISEQC_MSGOFS + i] = d->msg[i];
-- 
1.9.1
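
The point of the patch above is ordering: the length check has to run before
anything indexes d->msg, including the debug dump. A minimal userspace sketch
of the same pattern follows. struct diseqc_cmd, ARGS_LEN, MSG_OFFSET and
copy_diseqc_msg() are hypothetical names, and only the 6-byte message limit
is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the userspace DiSEqC command: 6 message bytes. */
struct diseqc_cmd {
    uint8_t msg[6];
    uint8_t msg_len;
};

#define ARGS_LEN   24  /* illustrative size of the firmware argument buffer */
#define MSG_OFFSET 6   /* illustrative offset of the message inside args */

/* Copy cmd->msg into args; reject lengths that would read past cmd->msg. */
static int copy_diseqc_msg(const struct diseqc_cmd *cmd, uint8_t args[ARGS_LEN])
{
    size_t i;

    /* Validate first: nothing below may touch msg[] with a bad length. */
    if (cmd->msg_len > sizeof(cmd->msg))
        return -EINVAL;

    for (i = 0; i < cmd->msg_len; i++)
        args[MSG_OFFSET + i] = cmd->msg[i];

    return 0;
}
```

Using sizeof(cmd->msg) rather than a constant derived from the destination
buffer keeps the bound tied to the array that is actually being read.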



[PATCH 3.19.y-ckt 001/251] net: don't wait for order-3 page allocation

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Shaohua Li 

[ Upstream commit fb05e7a89f500cfc06ae277bdc911b281928995d ]

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and adds latency. Commit 5640f7685831e0
introduced the order-3 allocation. According to its changelog, the order-3
allocation isn't a must-have; it is only there to improve performance. But
direct memory compaction has high overhead, and the benefit of the order-3
allocation can't compensate for it.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the allocation will still succeed, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered and skb_page_frag_refill will
fall back to order-0 immediately, hence the direct memory compaction overhead
is avoided. In the allocation failure case, kswapd is woken up to do
compaction, so chances are the allocation will succeed next time.

alloc_skb_with_frags is the same.

The mellanox driver does a similar thing; if this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet 
Cc: Chris Mason 
Cc: Debabrata Banerjee 
Signed-off-by: Shaohua Li 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/core/skbuff.c | 2 +-
 net/core/sock.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3b0a8b0..0998af7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4414,7 +4414,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned long 
header_len,
 
while (order) {
if (npages >= 1 << order) {
-   page = alloc_pages(gfp_mask |
+   page = alloc_pages((gfp_mask & ~__GFP_WAIT) |
   __GFP_COMP |
   __GFP_NOWARN |
   __GFP_NORETRY,
diff --git a/net/core/sock.c b/net/core/sock.c
index a91f99f..3606cc5 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1888,7 +1888,7 @@ bool skb_page_frag_refill(unsigned int sz, struct 
page_frag *pfrag, gfp_t gfp)
 
pfrag->offset = 0;
if (SKB_FRAG_PAGE_ORDER) {
-   pfrag->page = alloc_pages(gfp | __GFP_COMP |
+   pfrag->page = alloc_pages((gfp & ~__GFP_WAIT) | __GFP_COMP |
  __GFP_NOWARN | __GFP_NORETRY,
  SKB_FRAG_PAGE_ORDER);
if (likely(pfrag->page)) {
-- 
1.9.1
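
The "opportunistic high order, then order-0" pattern from this patch can be
sketched outside the kernel as follows. The flag values, the alloc_fn hook,
refill() and the demo allocator are all made up for illustration; the real
gfp_t bits and alloc_pages() behave differently in detail.

```c
#include <stddef.h>

/* Illustrative flag bits; the kernel's real gfp_t values differ. */
#define F_WAIT    0x1u  /* allocation may sleep and reclaim/compact */
#define F_NOWARN  0x2u
#define F_NORETRY 0x4u

/* Hypothetical allocator hook: returns NULL on failure. */
typedef void *(*alloc_fn)(unsigned int flags, unsigned int order);

/*
 * Try the high-order allocation without waiting, then fall back to
 * order 0 with the caller's original flags.
 */
static void *refill(alloc_fn alloc, unsigned int flags,
                    unsigned int high_order, unsigned int *got_order)
{
    void *p = alloc((flags & ~F_WAIT) | F_NOWARN | F_NORETRY, high_order);

    if (p) {
        *got_order = high_order;
        return p;
    }
    *got_order = 0;
    return alloc(flags, 0);
}

/* Demo allocator: fails any order > 0 and records the flags it was given. */
static unsigned int last_high_order_flags;
static char demo_page[1];

static void *demo_alloc(unsigned int flags, unsigned int order)
{
    if (order > 0) {
        last_high_order_flags = flags;
        return NULL;
    }
    return demo_page;
}
```

Only the high-order attempt has the wait bit masked out; the order-0 fallback
still uses the caller's flags, so it may block as before.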



[PATCH 3.19.y-ckt 019/251] [media] s5h1420: fix a buffer overflow when checking userspace params

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab 

commit 12f4543f5d6811f864e6c4952eb27253c7466c02 upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows writing up to 7 values:
drivers/media/dvb-frontends/s5h1420.c:193 s5h1420_send_master_cmd() 
error: buffer overflow 'cmd->msg' 6 <= 7

Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/dvb-frontends/s5h1420.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/dvb-frontends/s5h1420.c 
b/drivers/media/dvb-frontends/s5h1420.c
index 93eeaf7..0b4f8fe 100644
--- a/drivers/media/dvb-frontends/s5h1420.c
+++ b/drivers/media/dvb-frontends/s5h1420.c
@@ -180,7 +180,7 @@ static int s5h1420_send_master_cmd (struct dvb_frontend* fe,
int result = 0;
 
dprintk("enter %s\n", __func__);
-   if (cmd->msg_len > 8)
+   if (cmd->msg_len > sizeof(cmd->msg))
return -EINVAL;
 
/* setup for DISEQC */
-- 
1.9.1



[PATCH 3.19.y-ckt 006/251] neigh: do not modify unlinked entries

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Julian Anastasov 

[ Upstream commit 2c51a97f76d20ebf1f50fef908b986cb051fdff9 ]

The lockless lookups can return an entry that is unlinked.
Sometimes they get a reference before the last neigh_cleanup_and_release,
sometimes they do not need a reference. Later, any
modification attempts may result in the following problems:

1. The entry is not destroyed immediately because neigh_update
can start the timer for a dead entry, e.g. on change to NUD_REACHABLE
state. As a result, the entry lives for some time but is invisible
and out of control.

2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0, but if the timer is started and expires, refcnt can
reach 0 for a second time, leading to a second neigh_destroy and
a possible crash.

Thanks to Eric Dumazet and Ying Xue for their work and analysis
on the __neigh_event_send change.

Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet 
Cc: Ying Xue 
Signed-off-by: Julian Anastasov 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/core/neighbour.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 8d614c9..0385351 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -971,6 +971,8 @@ int __neigh_event_send(struct neighbour *neigh, struct 
sk_buff *skb)
rc = 0;
if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
goto out_unlock_bh;
+   if (neigh->dead)
+   goto out_dead;
 
if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
@@ -1027,6 +1029,13 @@ out_unlock_bh:
write_unlock(&neigh->lock);
local_bh_enable();
return rc;
+
+out_dead:
+   if (neigh->nud_state & NUD_STALE)
+   goto out_unlock_bh;
+   write_unlock_bh(&neigh->lock);
+   kfree_skb(skb);
+   return 1;
 }
 EXPORT_SYMBOL(__neigh_event_send);
 
@@ -1090,6 +1099,8 @@ int neigh_update(struct neighbour *neigh, const u8 
*lladdr, u8 new,
if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
(old & (NUD_NOARP | NUD_PERMANENT)))
goto out;
+   if (neigh->dead)
+   goto out;
 
if (!(new & NUD_VALID)) {
neigh_del_timer(neigh);
@@ -1239,6 +1250,8 @@ EXPORT_SYMBOL(neigh_update);
  */
 void __neigh_set_probe_once(struct neighbour *neigh)
 {
+   if (neigh->dead)
+   return;
neigh->updated = jiffies;
if (!(neigh->nud_state & NUD_FAILED))
return;
-- 
1.9.1
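
The guard added in each of the three functions above has the same shape: check
the dead flag before touching state or timers. A stripped-down sketch, using a
hypothetical struct entry rather than struct neighbour, and leaving out the
locking that the real code holds around the check:

```c
#include <stdbool.h>

/* Hypothetical, simplified entry; not the kernel's struct neighbour. */
struct entry {
    bool dead;         /* set once the entry is unlinked from the table */
    int  state;
    bool timer_armed;
};

/* Returns 0 if the update was applied, -1 if the entry was already unlinked. */
static int entry_update(struct entry *e, int new_state)
{
    if (e->dead)
        return -1;  /* never change state or re-arm timers on a dead entry */

    e->state = new_state;
    e->timer_armed = true;
    return 0;
}
```

A dead entry is left exactly as it was, so no timer can take a new reference
after the last one has been dropped.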



[PATCH 3.19.y-ckt 007/251] tcp: Do not call tcp_fastopen_reset_cipher from interrupt context

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Christoph Paasch 

[ Upstream commit dfea2aa654243f70dc53b8648d0bbdeec55a7df1 ]

tcp_fastopen_reset_cipher really cannot be called from interrupt
context. It allocates the tcp_fastopen_context with GFP_KERNEL and
calls crypto_alloc_cipher, which allocates all kinds of stuff with
GFP_KERNEL.

Thus, we might sleep when the key generation is triggered by an
incoming TFO cookie request, which would then happen in interrupt
context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP:

[   36.001813] BUG: sleeping function called from invalid context at 
mm/slub.c:1266
[   36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill
[   36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14
[   36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[   36.008250]  04f2 88007f8838a8 8171d53a 
880075a084a8
[   36.009630]  880075a08000 88007f8838c8 810967d3 
88007f883928
[   36.011076]   88007f8838f8 81096892 
88007f89be00
[   36.012494] Call Trace:
[   36.012953][] dump_stack+0x4f/0x6d
[   36.014085]  [] ___might_sleep+0x103/0x170
[   36.015117]  [] __might_sleep+0x52/0x90
[   36.016117]  [] kmem_cache_alloc_trace+0x47/0x190
[   36.017266]  [] ? tcp_fastopen_reset_cipher+0x42/0x130
[   36.018485]  [] tcp_fastopen_reset_cipher+0x42/0x130
[   36.019679]  [] tcp_fastopen_init_key_once+0x61/0x70
[   36.020884]  [] __tcp_fastopen_cookie_gen+0x1c/0x60
[   36.022058]  [] tcp_try_fastopen+0x58f/0x730
[   36.023118]  [] tcp_conn_request+0x3e8/0x7b0
[   36.024185]  [] ? __module_text_address+0x12/0x60
[   36.025327]  [] tcp_v4_conn_request+0x51/0x60
[   36.026410]  [] tcp_rcv_state_process+0x190/0xda0
[   36.027556]  [] ? __inet_lookup_established+0x47/0x170
[   36.028784]  [] tcp_v4_do_rcv+0x16d/0x3d0
[   36.029832]  [] ? security_sock_rcv_skb+0x16/0x20
[   36.030936]  [] tcp_v4_rcv+0x77a/0x7b0
[   36.031875]  [] ? iptable_filter_hook+0x33/0x70
[   36.032953]  [] ip_local_deliver_finish+0x92/0x1f0
[   36.034065]  [] ip_local_deliver+0x9a/0xb0
[   36.035069]  [] ? ip_rcv+0x3d0/0x3d0
[   36.035963]  [] ip_rcv_finish+0x119/0x330
[   36.036950]  [] ip_rcv+0x2e7/0x3d0
[   36.037847]  [] __netif_receive_skb_core+0x552/0x930
[   36.038994]  [] __netif_receive_skb+0x27/0x70
[   36.040033]  [] process_backlog+0xd2/0x1f0
[   36.041025]  [] net_rx_action+0x122/0x310
[   36.042007]  [] __do_softirq+0x103/0x2f0
[   36.042978]  [] do_softirq_own_stack+0x1c/0x30

This patch moves the call to tcp_fastopen_init_key_once to the places
where a listener socket creates its TFO state, which always happens in
user context (either from the setsockopt, or implicitly during the
listen() call).

Cc: Eric Dumazet 
Cc: Hannes Frederic Sowa 
Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once")
Signed-off-by: Christoph Paasch 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/ipv4/af_inet.c  | 2 ++
 net/ipv4/tcp.c  | 7 +--
 net/ipv4/tcp_fastopen.c | 2 --
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index a44773c..515f689 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -228,6 +228,8 @@ int inet_listen(struct socket *sock, int backlog)
err = 0;
if (err)
goto out;
+
+   tcp_fastopen_init_key_once(true);
}
err = inet_csk_listen_start(sk, backlog);
if (err)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3075723..48e9bb6 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2566,10 +2566,13 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 
case TCP_FASTOPEN:
if (val >= 0 && ((1 << sk->sk_state) & (TCPF_CLOSE |
-   TCPF_LISTEN)))
+   TCPF_LISTEN))) {
+   tcp_fastopen_init_key_once(true);
+
err = fastopen_init_queue(sk, val);
-   else
+   } else {
err = -EINVAL;
+   }
break;
case TCP_TIMESTAMP:
if (!tp->repair)
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index c730772..b01d5bd 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -78,8 +78,6 @@ static bool __tcp_fastopen_cookie_gen(const void *path,
struct tcp_fastopen_context *ctx;
bool ok = false;
 
-   tcp_fastopen_init_key_once(true);
-
rcu_read_lock();
ctx = rcu_dereference(tcp_fastopen_ctx);
if (ctx) {
-- 
1.9.1
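
The restructuring above moves lazy initialization out of the packet path and
into the setup path. A single-threaded sketch of that split, with hypothetical
names (key_init_once, listener_setup, cookie_gen) and none of the kernel's
once-helpers, RCU or crypto:

```c
#include <stdbool.h>
#include <stddef.h>

struct key_ctx { unsigned char key[16]; };

static struct key_ctx ctx_storage;
static struct key_ctx *ctx;   /* NULL until a listener sets it up */
static int init_calls;        /* counts how often the key was really built */

/* May allocate or sleep in the real code; call only from process context. */
static void key_init_once(void)
{
    if (ctx)
        return;
    ctx_storage.key[0] = 0x42;  /* placeholder for real key generation */
    ctx = &ctx_storage;
    init_calls++;
}

/* Setup path (listen() or setsockopt() in the patch): safe place to init. */
static int listener_setup(void)
{
    key_init_once();
    return 0;
}

/* Packet path: must not initialize; reports failure if no key exists. */
static bool cookie_gen(unsigned char *out)
{
    if (!ctx)
        return false;
    *out = ctx->key[0];
    return true;
}
```

The packet path only reads the context, so it has no reason to sleep.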


[PATCH 3.19.y-ckt 011/251] net: phy: fix phy link up when limiting speed via device tree

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Mugunthan V N 

[ Upstream commit eb686231fce3770299760f24fdcf5ad041f44153 ]

When limiting phy link speed using "max-speed" to 100 Mbps or less on a
gigabit phy, the phy never completes auto-negotiation and the phy state
machine is held in PHY_AN. Fix this by comparing the gigabit
advertisement even when phydev->supported doesn't include gigabit modes but
the phy has BMSR_ESTATEN set, so that auto-negotiation is restarted when the
old and new advertisements differ and the link comes up fine.

Signed-off-by: Mugunthan V N 
Reviewed-by: Florian Fainelli 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/phy/phy_device.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 3fc91e8..70a0d88 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -782,10 +782,11 @@ static int genphy_config_advert(struct phy_device *phydev)
if (phydev->supported & (SUPPORTED_1000baseT_Half |
 SUPPORTED_1000baseT_Full)) {
adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
-   if (adv != oldadv)
-   changed = 1;
}
 
+   if (adv != oldadv)
+   changed = 1;
+
err = phy_write(phydev, MII_CTRL1000, adv);
if (err < 0)
return err;
-- 
1.9.1
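
Why the comparison has to sit outside the "supported" branch can be seen in a
small model. The bit values and gig_advert_changed() are illustrative only;
the driver reads and writes MII_CTRL1000 through phy_read()/phy_write(), which
this sketch does not attempt to model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative gigabit advertisement bits in a 1000BASE-T control register. */
#define ADV_1000_HALF 0x0100u
#define ADV_1000_FULL 0x0200u

/* Returns true if the control value must be rewritten; stores it in *newadv. */
static bool gig_advert_changed(uint16_t oldadv, bool supports_gigabit,
                               uint16_t wanted, uint16_t *newadv)
{
    uint16_t adv = oldadv & (uint16_t)~(ADV_1000_HALF | ADV_1000_FULL);

    if (supports_gigabit)
        adv |= wanted & (ADV_1000_HALF | ADV_1000_FULL);

    *newadv = adv;
    /* Compare unconditionally: clearing stale gigabit bits is also a change. */
    return adv != oldadv;
}
```

With max-speed capped below gigabit, supports_gigabit is false, yet the phy
may still advertise gigabit from reset; the unconditional compare catches that.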



Re: [PATCH 6/6] cputime: Introduce cputime_to_timespec64()/timespec64_to_cputime()

2015-07-15 Thread Baolin Wang
On 15 July 2015 at 19:55, Thomas Gleixner  wrote:
> On Wed, 15 Jul 2015, Baolin Wang wrote:
>
>> On 15 July 2015 at 18:31, Thomas Gleixner  wrote:
>> > On Wed, 15 Jul 2015, Baolin Wang wrote:
>> >
>> >> The cputime_to_timespec() and timespec_to_cputime() functions are
>> >> not year 2038 safe on 32bit systems because struct timespec
>> >> will overflow in the year 2038.
>> >
>> > And how is this relevant? cputime is not based on wall clock time at
>> > all. So what has 2038 to do with cputime?
>> >
>> > We want proper explanations WHY we need such a change.
>>
>> When converting the posix-cpu-timers, it calls the
>> cputime_to_timespec() function. Thus it needs a conversion for this
>> function.
>
> There is no requirement to convert posix-cpu-timers on their own. We
> need to adopt the posix cpu timers code because it shares syscalls
> with the other posix timers, but that still does not explain why we
> need these functions.
>

posix-cpu-timers also defines some 'k_clock struct' variables,
and we need to convert the callbacks of the 'k_clock struct' which are
not year 2038 safe on 32bit systems. Some of the callbacks that need
converting call the cputime_to_timespec() function, thus we also want to
convert cputime_to_timespec() to a year 2038 safe
function to make all of them ready for the year 2038 issue.
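
For scale, here is a sketch of what a 64-bit conversion looks like when the
input is a duration rather than a calendar time. It assumes cputime is a plain
nanosecond count, which is not true of every architecture or configuration,
and struct ts64, ns_to_ts64() and fits_in_32bit_seconds() are made-up names.
A signed 32-bit seconds field holds at most 2^31 - 1 seconds, roughly 68 years
of accumulated time.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical 64-bit timespec: seconds are 64 bits wide on every ABI. */
struct ts64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/* Split a nanosecond count into whole seconds and leftover nanoseconds. */
static struct ts64 ns_to_ts64(uint64_t ns)
{
    struct ts64 ts;

    ts.tv_sec = (int64_t)(ns / NSEC_PER_SEC);
    ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
    return ts;
}

/* Would this seconds value survive a 32-bit signed tv_sec? */
static bool fits_in_32bit_seconds(int64_t sec)
{
    return sec >= INT32_MIN && sec <= INT32_MAX;
}
```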

>> You can see that conversion in patch "posix-cpu-timers: Convert to
>> y2038 safe callbacks" from
>> https://git.linaro.org/people/baolin.wang/upstream_0627.git.
>
> I do not care about your random git tree. I care about proper
> changelogs. Your changelogs are just a copied boilerplate full of
> errors.
>
> Thanks,
>
> tglx



-- 
Baolin.wang
Best Regards


[PATCH 3.19.y-ckt 021/251] [media] af9013: Don't accept invalid bandwidth

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab 

commit d7b76c91f471413de9ded837bddeca2164786571 upstream.

If userspace sends an invalid bandwidth, the driver should either return
EINVAL or switch to auto mode.

This driver will go past the end of an array and program the hardware
in the wrong way if this happens.

Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/dvb-frontends/af9013.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/dvb-frontends/af9013.c 
b/drivers/media/dvb-frontends/af9013.c
index 8001690..ba6c8f6 100644
--- a/drivers/media/dvb-frontends/af9013.c
+++ b/drivers/media/dvb-frontends/af9013.c
@@ -605,6 +605,10 @@ static int af9013_set_frontend(struct dvb_frontend *fe)
}
}
 
+   /* Return an error if can't find bandwidth or the right clock */
+   if (i == ARRAY_SIZE(coeff_lut))
+   return -EINVAL;
+
ret = af9013_wr_regs(state, 0xae00, coeff_lut[i].val,
sizeof(coeff_lut[i].val));
}
-- 
1.9.1
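
The check added above is the usual "did the search loop run off the end" test.
A self-contained sketch with a made-up table follows: the clock, bandwidth and
coefficient values are placeholders, not the af9013 coefficients, and
lookup_coeff() is a hypothetical helper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative lookup table keyed by clock and bandwidth. */
struct coeff {
    uint32_t clock;
    uint32_t bandwidth_hz;
    uint8_t  val;
};

static const struct coeff coeff_lut[] = {
    { 28800000, 8000000, 0x11 },
    { 28800000, 7000000, 0x22 },
    { 28800000, 6000000, 0x33 },
};

/* Find the entry for (clock, bw); fail instead of indexing past the table. */
static int lookup_coeff(uint32_t clock, uint32_t bw, uint8_t *val)
{
    size_t i;

    for (i = 0; i < ARRAY_SIZE(coeff_lut); i++) {
        if (coeff_lut[i].clock == clock && coeff_lut[i].bandwidth_hz == bw)
            break;
    }

    /* i == ARRAY_SIZE means the loop finished without a match. */
    if (i == ARRAY_SIZE(coeff_lut))
        return -EINVAL;

    *val = coeff_lut[i].val;
    return 0;
}
```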



[PATCH 3.19.y-ckt 022/251] [media] cx24117: fix a buffer overflow when checking userspace params

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab 

commit 82e3b88b679049f043fe9b03991d6d66fc0a43c8 upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows writing many more values:
drivers/media/dvb-frontends/cx24116.c:983 cx24116_send_diseqc_msg() 
error: buffer overflow 'd->msg' 6 <= 23

Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/dvb-frontends/cx24117.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/dvb-frontends/cx24117.c 
b/drivers/media/dvb-frontends/cx24117.c
index acb965c..af63635 100644
--- a/drivers/media/dvb-frontends/cx24117.c
+++ b/drivers/media/dvb-frontends/cx24117.c
@@ -1043,7 +1043,7 @@ static int cx24117_send_diseqc_msg(struct dvb_frontend 
*fe,
dev_dbg(&state->priv->i2c->dev, ")\n");
 
/* Validate length */
-   if (d->msg_len > 15)
+   if (d->msg_len > sizeof(d->msg))
return -EINVAL;
 
/* DiSEqC message */
-- 
1.9.1



[PATCH 3.19.y-ckt 015/251] net: mvneta: introduce compatible string "marvell,armada-xp-neta"

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Simon Guinot 

[ Upstream commit f522a975a8101895a85354b9c143f41b8248e71a ]

The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
380 and 385 SoCs. Since at least one more hardware feature is available
on the Armada XP SoCs, a way to identify them is needed.

This patch introduces a new compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot 
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Acked-by: Gregory CLEMENT 
Acked-by: Thomas Petazzoni 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt | 2 +-
 drivers/net/ethernet/marvell/mvneta.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt 
b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index 750d577..f5a8ca2 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -1,7 +1,7 @@
 * Marvell Armada 370 / Armada XP Ethernet Controller (NETA)
 
 Required properties:
-- compatible: should be "marvell,armada-370-neta".
+- compatible: "marvell,armada-370-neta" or "marvell,armada-xp-neta".
 - reg: address and length of the register set for the device.
 - interrupts: interrupt for the device
 - phy: See ethernet.txt file in the same directory.
diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index 96208f1..cce60a1 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3100,6 +3100,7 @@ static int mvneta_remove(struct platform_device *pdev)
 
 static const struct of_device_id mvneta_match[] = {
{ .compatible = "marvell,armada-370-neta" },
+   { .compatible = "marvell,armada-xp-neta" },
{ }
 };
 MODULE_DEVICE_TABLE(of, mvneta_match);
-- 
1.9.1
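
How a sentinel-terminated match table distinguishes the two strings can be
shown with a small model. This is not the OF matching code, which also scores
name and type; struct match_entry, neta_match and match_compatible() are
hypothetical, and only the two compatible strings come from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of a device-tree match table entry. */
struct match_entry {
    const char *compatible;
};

/* Sentinel-terminated table, mirroring the layout of mvneta_match[]. */
static const struct match_entry neta_match[] = {
    { "marvell,armada-370-neta" },
    { "marvell,armada-xp-neta" },
    { NULL }
};

/* Return the matching entry, or NULL if the string is not in the table. */
static const struct match_entry *match_compatible(const char *compat)
{
    const struct match_entry *m;

    for (m = neta_match; m->compatible; m++) {
        if (strcmp(m->compatible, compat) == 0)
            return m;
    }
    return NULL;
}
```

Because the returned entry identifies which string matched, a driver can later
hang SoC-specific data off each entry.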



[PATCH 3.19.y-ckt 013/251] sctp: Fix race between OOTB responce and route removal

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

--

From: Alexander Sverdlin 

[ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ]

A NULL pointer dereference is possible during the statistics update if the route
used for the OOTB response is removed at an unfortunate time. If the route exists
when we receive the OOTB packet and we finally jump into sctp_packet_transmit() to
send the ABORT, but in the meantime the route is removed under our feet, we take
the "no_route" path and try to update stats with
IP_INC_STATS(sock_net(asoc->base.sk), ...).

But sctp_ootb_pkt_new(), used to prepare the response packet, doesn't call
sctp_transport_set_owner() and therefore there is no asoc associated with this
packet. A temporary asoc just for OOTB responses is probably overkill, so just
introduce a check like in all other places in sctp_packet_transmit() where
"asoc" is dereferenced.
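
The shape of such a check, reduced to a standalone sketch: struct sock_stats,
struct assoc and count_no_route() are hypothetical, and the real code updates
per-netns SNMP counters through IP_INC_STATS rather than a plain field.

```c
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the association and its counters. */
struct sock_stats {
    unsigned long out_no_routes;
};

struct assoc {
    struct sock_stats *stats;
};

/*
 * Account a "no route" drop. OOTB responses have no association, so the
 * pointer may be NULL and must be tested before it is dereferenced.
 */
static void count_no_route(const struct assoc *asoc)
{
    if (asoc)
        asoc->stats->out_no_routes++;
}
```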

To reproduce this, one needs to
0. ensure that sctp module is loaded (otherwise ABORT is not generated)
1. remove default route on the machine
2. while true; do
     ip route del [interface-specific route]
     ip route add [interface-specific route]
   done
3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
   response

On x86_64 the crash looks like this:

BUG: unable to handle kernel NULL pointer dereference at 0020
IP: [] sctp_packet_transmit+0x63c/0x730 [sctp]
PGD 0
Oops:  [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   O4.0.5-1-ARCH #1
Hardware name: ...
task: 818124c0 ti: 8180 task.ti: 8180
RIP: 0010:[]  [] 
sctp_packet_transmit+0x63c/0x730 [sctp]
RSP: 0018:880127c037b8  EFLAGS: 00010296
RAX:  RBX:  RCX: 0015ff66b480
RDX: 0015ff66b400 RSI: 880127c17200 RDI: 880123403700
RBP: 880127c03888 R08: 00017200 R09: 814625af
R10: ea00047e4680 R11: ff80 R12: 8800b0d38a28
R13: 8800b0d38a28 R14: 8800b3e88000 R15: a05f24e0
FS:  () GS:880127c0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0020 CR3: c855b000 CR4: 07f0
Stack:
 880127c03910 8800b0d38a28 8189d240 88011f91b400
 880127c03828 a05c94c5  8800baa1c520
  0001  
Call Trace:
 
 [] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp]
 [] ? sctp_transport_put+0x52/0x80 [sctp]
 [] sctp_do_sm+0xb8c/0x19a0 [sctp]
 [] ? trigger_load_balance+0x90/0x210
 [] ? update_process_times+0x59/0x60
 [] ? timerqueue_add+0x60/0xb0
 [] ? enqueue_hrtimer+0x29/0xa0
 [] ? read_tsc+0x9/0x10
 [] ? put_page+0x55/0x60
 [] ? clockevents_program_event+0x6d/0x100
 [] ? skb_free_head+0x58/0x80
 [] ? chksum_update+0x1b/0x27 [crc32c_generic]
 [] ? crypto_shash_update+0xce/0xf0
 [] sctp_endpoint_bh_rcv+0x113/0x280 [sctp]
 [] sctp_inq_push+0x46/0x60 [sctp]
 [] sctp_rcv+0x880/0x910 [sctp]
 [] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp]
 [] ? sctp_csum_update+0x20/0x20 [sctp]
 [] ? ip_route_input_noref+0x235/0xd30
 [] ? ack_ioapic_level+0x7b/0x150
 [] ip_local_deliver_finish+0xae/0x210
 [] ip_local_deliver+0x35/0x90
 [] ip_rcv_finish+0xf5/0x370
 [] ip_rcv+0x2b8/0x3a0
 [] __netif_receive_skb_core+0x763/0xa50
 [] __netif_receive_skb+0x18/0x60
 [] netif_receive_skb_internal+0x40/0xd0
 [] napi_gro_receive+0xe8/0x120
 [] rtl8169_poll+0x2da/0x660 [r8169]
 [] net_rx_action+0x21a/0x360
 [] __do_softirq+0xe1/0x2d0
 [] irq_exit+0xad/0xb0
 [] do_IRQ+0x58/0xf0
 [] common_interrupt+0x6d/0x6d
 
 [] ? hrtimer_start+0x18/0x20
 [] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp]
 [] ? mwait_idle+0x60/0xa0
 [] arch_cpu_idle+0xf/0x20
 [] cpu_startup_entry+0x3ec/0x480
 [] rest_init+0x85/0x90
 [] start_kernel+0x48b/0x4ac
 [] ? early_idt_handlers+0x120/0x120
 [] x86_64_start_reservations+0x2a/0x2c
 [] x86_64_start_kernel+0x161/0x184
Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 
85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 
30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9
RIP  [] sctp_packet_transmit+0x63c/0x730 [sctp]
 RSP 
CR2: 0020
---[ end trace 5aec7fd2dc983574 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x0 from 0x8100 (relocation range: 
0x8000-0x9fff)
drm_kms_helper: panic occurred, switching back to text console
---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Alexander Sverdlin 
Acked-by: Neil Horman 
Acked-by: Marcelo Ricardo Leitner 
Acked-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/sctp/output.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index fc5e45b..abe7c2d 100644
--- a/net/sctp/output.c
+++ 

[PATCH 3.19.y-ckt 012/251] bnx2x: fix lockdep splat

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Eric Dumazet 

[ Upstream commit d53c66a5b80698620f7c9ba2372fff4017e987b8 ]

Michel reported the following lockdep splat:

[   44.718117] INFO: trying to register non-static key.
[   44.723081] the code is fine but needs lockdep annotation.
[   44.728559] turning off the locking correctness validator.
[   44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0
[   44.770289] Call Trace:
[   44.772741]  [] dump_stack+0x4c/0x65
[   44.777879]  [] ? console_unlock+0x1f1/0x510
[   44.783708]  [] __lock_acquire+0x1d05/0x1f10
[   44.789538]  [] ? mark_held_locks+0x6a/0x90
[   44.795276]  [] ? trace_hardirqs_on_caller+0x105/0x1d0
[   44.801967]  [] ? trace_hardirqs_on+0xd/0x10
[   44.807793]  [] ? hrtimer_try_to_cancel+0x4a/0x250
[   44.814142]  [] lock_acquire+0xb6/0x290
[   44.819537]  [] ? flush_work+0x5/0x280
[   44.824844]  [] flush_work+0x3d/0x280
[   44.830061]  [] ? flush_work+0x5/0x280
[   44.835366]  [] ? schedule_hrtimeout_range+0x13/0x20
[   44.841889]  [] ? usleep_range+0x4b/0x50
[   44.847365]  [] ? mark_held_locks+0x6a/0x90
[   44.853102]  [] ? __cancel_work_timer+0x105/0x1c0
[   44.859359]  [] ? trace_hardirqs_on_caller+0x105/0x1d0
[   44.866045]  [] __cancel_work_timer+0x9f/0x1c0
[   44.872048]  [] ? bnx2x_func_stop+0x42/0x90 [bnx2x]
[   44.878481]  [] cancel_work_sync+0x10/0x20
[   44.884134]  [] bnx2x_chip_cleanup+0x245/0x730 [bnx2x]
[   44.890829]  [] ? up+0x32/0x50
[   44.895439]  [] ? del_timer_sync+0x5/0xd0
[   44.901005]  [] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x]
[   44.907527]  [] ? might_fault+0x5f/0xb0
[   44.912921]  [] bnx2x_reload_if_running+0x2c/0x50 [bnx2x]
[   44.919879]  [] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x]
[   44.926664]  [] dev_ethtool+0x55b/0x1c40
[   44.932148]  [] ? rtnl_lock+0x17/0x20
[   44.937364]  [] dev_ioctl+0x17b/0x630
[   44.942582]  [] sock_do_ioctl+0x5d/0x70
[   44.947972]  [] sock_ioctl+0x73/0x280
[   44.953192]  [] do_vfs_ioctl+0x88/0x5b0
[   44.958587]  [] ? up_read+0x23/0x40
[   44.963631]  [] ? __fget_light+0x6c/0xa0
[   44.969105]  [] SyS_ioctl+0x91/0xb0
[   44.974149]  [] system_call_fastpath+0x12/0x6f

As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED,
we also need to guard bnx2x_stop_ptp() with the same condition; otherwise
the ptp_task workqueue is not initialized and the kernel barfs on
cancel_work_sync().

Fixes: eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support")
Reported-by: Michel Lespinasse 
Signed-off-by: Eric Dumazet 
Cc: Michal Kalderon 
Cc: Ariel Elior 
Cc: Yuval Mintz 
Cc: David Decotigny 
Acked-by: Sony Chacko 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index ac6a0ef..39a1d3c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -9310,7 +9310,8 @@ unload_error:
 	 * function stop ramrod is sent, since as part of this ramrod FW access
 	 * PTP registers.
 	 */
-	bnx2x_stop_ptp(bp);
+	if (bp->flags & PTP_SUPPORTED)
+		bnx2x_stop_ptp(bp);
 
 	/* Disable HW interrupts, NAPI */
 	bnx2x_netif_stop(bp, 1);
-- 
1.9.1



[PATCH 3.19.y-ckt 005/251] packet: avoid out of bounds read in round robin fanout

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Willem de Bruijn 

[ Upstream commit 468479e6043c84f5a65299cc07cb08a22a28c2b1 ]

PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
f->num_members. It returns the old value unconditionally, but
f->num_members may have changed since the last store. Ensure
that the return value is always < num.

When modifying the logic, simplify it further by replacing the loop
with an unconditional atomic increment.
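The idea described above can be sketched in user space as follows. This is
only an illustration, assuming C11 atomics in place of the kernel's atomic_t
helpers; struct fanout_sketch and fanout_lb_index() are hypothetical names,
not kernel code. The counter runs free, and the modulo is applied to the
value returned by the increment, so the result is always below the num that
was passed in for this call.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the round-robin state of a fanout group. */
struct fanout_sketch {
	atomic_uint rr_cur;	/* free-running round-robin counter */
};

/*
 * Pick the next member index. The counter is incremented unconditionally
 * and reduced modulo num on every call, so the result is < num even if
 * num differs from the value used on the previous call.
 */
static unsigned int fanout_lb_index(struct fanout_sketch *f, unsigned int num)
{
	unsigned int val;

	assert(num != 0);	/* the caller must reject an empty group first */

	/* atomic_fetch_add() returns the old value; +1 gives the new one. */
	val = atomic_fetch_add(&f->rr_cur, 1) + 1;

	return val % num;
}
```

Calling fanout_lb_index() first with num = 4 and later with num = 2 still
yields an index below 2 on the later call, which is the property the old
cmpxchg loop did not guarantee.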

Fixes: dc99f600698d ("packet: Add fanout support.")
Suggested-by: Eric Dumazet 
Signed-off-by: Willem de Bruijn 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/packet/af_packet.c | 18 ++
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8c7eb97..b215289 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1258,16 +1258,6 @@ static void packet_sock_destruct(struct sock *sk)
 	sk_refcnt_debug_dec(sk);
 }
 
-static int fanout_rr_next(struct packet_fanout *f, unsigned int num)
-{
-	int x = atomic_read(&f->rr_cur) + 1;
-
-	if (x >= num)
-		x = 0;
-
-	return x;
-}
-
 static unsigned int fanout_demux_hash(struct packet_fanout *f,
 				      struct sk_buff *skb,
 				      unsigned int num)
@@ -1279,13 +1269,9 @@ static unsigned int fanout_demux_lb(struct packet_fanout *f,
 				    struct sk_buff *skb,
 				    unsigned int num)
 {
-	int cur, old;
+	unsigned int val = atomic_inc_return(&f->rr_cur);
 
-	cur = atomic_read(&f->rr_cur);
-	while ((old = atomic_cmpxchg(&f->rr_cur, cur,
-				     fanout_rr_next(f, num))) != cur)
-		cur = old;
-	return cur;
+	return val % num;
 }
 
 static unsigned int fanout_demux_cpu(struct packet_fanout *f,
-- 
1.9.1



[PATCH 3.19.y-ckt 017/251] net: mvneta: disable IP checksum with jumbo frames for Armada 370

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Simon Guinot 

[ Upstream commit b65657fc240ae6c1d2a1e62db9a0e61ac9631d7a ]

The Ethernet controller found in the Armada 370, 380 and 385 SoCs doesn't
support TCP/IP checksumming with frame sizes larger than 1600 bytes.

This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
NETIF_F_TSO for the Armada 370 and compatible SoCs when the MTU is set
to a value greater than 1600 bytes.
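A minimal user-space sketch of the masking logic described above. The
feature bit values and the names port_sketch and sketch_fix_features() are
assumptions made for illustration; they are not the kernel's
NETIF_F_* definitions or the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative feature bits; the values are assumptions, not the kernel's. */
#define SKETCH_F_IP_CSUM	(UINT64_C(1) << 0)
#define SKETCH_F_TSO		(UINT64_C(1) << 1)
#define SKETCH_F_SG		(UINT64_C(1) << 2)

struct port_sketch {
	unsigned int mtu;
	unsigned int tx_csum_limit;	/* 0 means the port has no limit */
};

/* Drop checksum and TSO offload when the MTU exceeds the port's limit. */
static uint64_t sketch_fix_features(const struct port_sketch *pp,
				    uint64_t features)
{
	assert(pp != NULL);

	if (pp->tx_csum_limit && pp->mtu > pp->tx_csum_limit)
		features &= ~(SKETCH_F_IP_CSUM | SKETCH_F_TSO);

	return features;
}
```

With tx_csum_limit set to 1600 and an MTU of 9000, only the unrelated
feature bits survive; a port whose limit is 0 keeps every requested feature.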

Signed-off-by: Simon Guinot 
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Acked-by: Thomas Petazzoni 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/marvell/mvneta.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index cce60a1..2562249 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -304,6 +304,7 @@ struct mvneta_port {
 	unsigned int link;
 	unsigned int duplex;
 	unsigned int speed;
+	unsigned int tx_csum_limit;
 };
 
 /* The mvneta_tx_desc and mvneta_rx_desc structures describe the
@@ -2441,8 +2442,10 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)
 
 	dev->mtu = mtu;
 
-	if (!netif_running(dev))
+	if (!netif_running(dev)) {
+		netdev_update_features(dev);
 		return 0;
+	}
 
 	/* The interface is running, so we have to force a
 	 * reallocation of the queues
@@ -2471,9 +2474,26 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)
 	mvneta_start_dev(pp);
 	mvneta_port_up(pp);
 
+	netdev_update_features(dev);
+
 	return 0;
 }
 
+static netdev_features_t mvneta_fix_features(struct net_device *dev,
+					     netdev_features_t features)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+
+	if (pp->tx_csum_limit && dev->mtu > pp->tx_csum_limit) {
+		features &= ~(NETIF_F_IP_CSUM | NETIF_F_TSO);
+		netdev_info(dev,
+			    "Disable IP checksum for MTU greater than %dB\n",
+			    pp->tx_csum_limit);
+	}
+
+	return features;
+}
+
 /* Get mac address */
 static void mvneta_get_mac_addr(struct mvneta_port *pp, unsigned char *addr)
 {
@@ -2790,6 +2810,7 @@ static const struct net_device_ops mvneta_netdev_ops = {
 	.ndo_set_rx_mode     = mvneta_set_rx_mode,
 	.ndo_set_mac_address = mvneta_set_mac_addr,
 	.ndo_change_mtu      = mvneta_change_mtu,
+	.ndo_fix_features    = mvneta_fix_features,
 	.ndo_get_stats64     = mvneta_get_stats64,
 	.ndo_do_ioctl        = mvneta_ioctl,
 };
@@ -3028,6 +3049,9 @@ static int mvneta_probe(struct platform_device *pdev)
 		}
 	}
 
+	if (of_device_is_compatible(dn, "marvell,armada-370-neta"))
+		pp->tx_csum_limit = 1600;
+
 	pp->tx_ring_size = MVNETA_MAX_TXD;
 	pp->rx_ring_size = MVNETA_MAX_RXD;
 
-- 
1.9.1



[PATCH 3.19.y-ckt 016/251] ARM: mvebu: update Ethernet compatible string for Armada XP

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Simon Guinot 

[ Upstream commit ea3b55fe83b5fcede82d183164b9d6831b26e33b ]

This patch updates the Ethernet DT nodes for Armada XP SoCs with the
compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot 
Fixes: 77916519cba3 ("arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces")
Acked-by: Gregory CLEMENT 
Reviewed-by: Thomas Petazzoni 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 arch/arm/boot/dts/armada-370-xp.dtsi |  2 --
 arch/arm/boot/dts/armada-370.dtsi|  8 
 arch/arm/boot/dts/armada-xp-mv78260.dtsi |  2 +-
 arch/arm/boot/dts/armada-xp-mv78460.dtsi |  2 +-
 arch/arm/boot/dts/armada-xp.dtsi | 10 +-
 5 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi 
b/arch/arm/boot/dts/armada-370-xp.dtsi
index 1af4286..0c0e6b7 100644
--- a/arch/arm/boot/dts/armada-370-xp.dtsi
+++ b/arch/arm/boot/dts/armada-370-xp.dtsi
@@ -231,7 +231,6 @@
};
 
eth0: ethernet@7 {
-   compatible = "marvell,armada-370-neta";
reg = <0x7 0x4000>;
interrupts = <8>;
clocks = < 4>;
@@ -247,7 +246,6 @@
};
 
eth1: ethernet@74000 {
-   compatible = "marvell,armada-370-neta";
reg = <0x74000 0x4000>;
interrupts = <10>;
clocks = < 3>;
diff --git a/arch/arm/boot/dts/armada-370.dtsi 
b/arch/arm/boot/dts/armada-370.dtsi
index fdb3c12..7124a5b 100644
--- a/arch/arm/boot/dts/armada-370.dtsi
+++ b/arch/arm/boot/dts/armada-370.dtsi
@@ -272,6 +272,14 @@
dmacap,memset;
};
};
+
+   ethernet@7 {
+   compatible = "marvell,armada-370-neta";
+   };
+
+   ethernet@74000 {
+   compatible = "marvell,armada-370-neta";
+   };
};
};
 };
diff --git a/arch/arm/boot/dts/armada-xp-mv78260.dtsi 
b/arch/arm/boot/dts/armada-xp-mv78260.dtsi
index d7a8d0b..b8af89f 100644
--- a/arch/arm/boot/dts/armada-xp-mv78260.dtsi
+++ b/arch/arm/boot/dts/armada-xp-mv78260.dtsi
@@ -285,7 +285,7 @@
};
 
eth3: ethernet@34000 {
-   compatible = "marvell,armada-370-neta";
+   compatible = "marvell,armada-xp-neta";
reg = <0x34000 0x4000>;
interrupts = <14>;
clocks = < 1>;
diff --git a/arch/arm/boot/dts/armada-xp-mv78460.dtsi 
b/arch/arm/boot/dts/armada-xp-mv78460.dtsi
index 9c40c13..4b55434 100644
--- a/arch/arm/boot/dts/armada-xp-mv78460.dtsi
+++ b/arch/arm/boot/dts/armada-xp-mv78460.dtsi
@@ -323,7 +323,7 @@
};
 
eth3: ethernet@34000 {
-   compatible = "marvell,armada-370-neta";
+   compatible = "marvell,armada-xp-neta";
reg = <0x34000 0x4000>;
interrupts = <14>;
clocks = < 1>;
diff --git a/arch/arm/boot/dts/armada-xp.dtsi b/arch/arm/boot/dts/armada-xp.dtsi
index 62c3ba9..fa955dd 100644
--- a/arch/arm/boot/dts/armada-xp.dtsi
+++ b/arch/arm/boot/dts/armada-xp.dtsi
@@ -141,7 +141,7 @@
};
 
eth2: ethernet@3 {
-   compatible = "marvell,armada-370-neta";
+   compatible = "marvell,armada-xp-neta";
reg = <0x3 0x4000>;
interrupts = <12>;
clocks = < 2>;
@@ -184,6 +184,14 @@
};
};
 
+   ethernet@7 {
+   compatible = "marvell,armada-xp-neta";
+   };
+
+   ethernet@74000 {
+   compatible = "marvell,armada-xp-neta";
+   };
+
xor@f0900 {
compatible = "marvell,orion-xor";
reg = <0xF0900 0x100
-- 
1.9.1



[PATCH 3.19.y-ckt 018/251] sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Sowmini Varadhan 

[ Upstream commit 0edfad5959df7379c9e554fbe8ba264ae232d321 ]

Since it is possible for vnet_event_napi to end up doing
vnet_control_pkt_engine -> ... -> vnet_send_attr ->
vnet_port_alloc_tx_ring -> ldc_alloc_exp_dring -> kzalloc()
(i.e., in softirq context), kzalloc() should be called with
GFP_ATOMIC from ldc_alloc_exp_dring.

Signed-off-by: Sowmini Varadhan 
[ kamal: corrected upstream commit SHA ]
Cc: David Miller 
Signed-off-by: Kamal Mostafa 
---
 arch/sparc/kernel/ldc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index 274a9f5..591f119f 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -2313,7 +2313,7 @@ void *ldc_alloc_exp_dring(struct ldc_channel *lp, unsigned int len,
 	if (len & (8UL - 1))
 		return ERR_PTR(-EINVAL);
 
-	buf = kzalloc(len, GFP_KERNEL);
+	buf = kzalloc(len, GFP_ATOMIC);
 	if (!buf)
 		return ERR_PTR(-ENOMEM);
 
-- 
1.9.1



[PATCH 3.19.y-ckt 014/251] amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Tom Lendacky 

[ Upstream commit 472cfe7127760d68b819cf35a26e5a1b44b30f4e ]

When allocating Rx related buffers, alloc_pages is called using an order
number that is decreased until successful. A system under stress can
experience failures during this allocation process resulting in a warning
being issued. This message can be of concern to end users even though the
failure is not fatal. Since the failure is not fatal and can occur
multiple times, the driver should include the __GFP_NOWARN flag to
suppress the warning message from being issued.

Signed-off-by: Tom Lendacky 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c 
b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
index a50891f..b873734 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
@@ -263,7 +263,7 @@ static int xgbe_alloc_pages(struct xgbe_prv_data *pdata,
 	int ret;
 
 	/* Try to obtain pages, decreasing order if necessary */
-	gfp |= __GFP_COLD | __GFP_COMP;
+	gfp |= __GFP_COLD | __GFP_COMP | __GFP_NOWARN;
 	while (order >= 0) {
 		pages = alloc_pages(gfp, order);
 		if (pages)
-- 
1.9.1



[PATCH 3.19.y-ckt 025/251] bus: arm-ccn: Fix node->XP config conversion

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Pawel Moll 

commit a18f8e97fe69195823d7fb5c68a8d6565f39db4b upstream.

Events defined as watchpoints on nodes must have their config values
converted so that they apply to the respective node's XP. The
function setting the new values was using the wrong mask for the "port"
field, resulting in a corrupted value. Fixed now.
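To illustrate why the mask width matters, here is a small stand-alone sketch.
It assumes only what the patch shows (8-bit fields at bits 0 and 8 and a
2-bit "port" field at bit 24); config_set_sketch() and the macro names are
made up for this example. With a 0x3 mask, bits above the port field are
left untouched; a 0xff mask at the same position would also clear bits 26
to 31.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NODE_XP_MASK	(UINT64_C(0xff) << 0)
#define SKETCH_TYPE_MASK	(UINT64_C(0xff) << 8)
#define SKETCH_PORT_MASK	(UINT64_C(0x3) << 24)	/* 2-bit field */

/* Clear exactly the three fields, then write the new values into them. */
static void config_set_sketch(uint64_t *config, uint32_t node_xp,
			      uint32_t type, uint32_t port)
{
	assert(node_xp <= 0xff && type <= 0xff && port <= 0x3);

	*config &= ~(SKETCH_NODE_XP_MASK | SKETCH_TYPE_MASK | SKETCH_PORT_MASK);
	*config |= ((uint64_t)node_xp << 0) | ((uint64_t)type << 8) |
		   ((uint64_t)port << 24);
}
```

A value that has bit 26 set before the call still has it set afterwards,
which would not hold if the port mask covered the whole byte.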

Signed-off-by: Pawel Moll 
Signed-off-by: Kamal Mostafa 
---
 drivers/bus/arm-ccn.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bus/arm-ccn.c b/drivers/bus/arm-ccn.c
index aaa0f2a..60397ec 100644
--- a/drivers/bus/arm-ccn.c
+++ b/drivers/bus/arm-ccn.c
@@ -212,7 +212,7 @@ static int arm_ccn_node_to_xp_port(int node)
 
 static void arm_ccn_pmu_config_set(u64 *config, u32 node_xp, u32 type, u32 port)
 {
-	*config &= ~((0xff << 0) | (0xff << 8) | (0xff << 24));
+	*config &= ~((0xff << 0) | (0xff << 8) | (0x3 << 24));
 	*config |= (node_xp << 0) | (type << 8) | (port << 24);
 }
 
-- 
1.9.1



[PATCH] kprobes: Use debugfs_remove_recursive instead debugfs_remove

2015-07-15 Thread Wang Long
In debugfs_kprobe_init(), we create a directory 'kprobes' and three
files 'list', 'enabled' and 'blacklist'. If creating any one of the three
files fails, we should remove everything created so far. debugfs_remove()
only removes a single entry, so it cannot clean up the directory together
with the files already in it. Use debugfs_remove_recursive() instead.

Signed-off-by: Wang Long 
---
 kernel/kprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index c90e417..8cd82a5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2459,7 +2459,7 @@ static int __init debugfs_kprobe_init(void)
 	return 0;
 
 error:
-	debugfs_remove(dir);
+	debugfs_remove_recursive(dir);
 	return -ENOMEM;
 }
 
-- 
1.8.3.4



Re: [PATCH v3 04/11] otg-fsm: move usb_bus_start_enum into otg-fsm->ops

2015-07-15 Thread Peter Chen
On Wed, Jul 15, 2015 at 04:30:27PM +0300, Roger Quadros wrote:
> On 14/07/15 03:34, Peter Chen wrote:
> > On Mon, Jul 13, 2015 at 01:13:54PM +0300, Roger Quadros wrote:
> >> Peter,
> >>
> >> On 13/07/15 04:58, Peter Chen wrote:
> >>> On Wed, Jul 08, 2015 at 01:19:30PM +0300, Roger Quadros wrote:
>  This is to prevent missing symbol build error if OTG is
>  enabled (built-in) and HCD core (CONFIG_USB) is module.
> 
> >>>
> >>> We may let the OTG-DRD/OTG-FSM depends on CONFIG_USB to fix it.
> >>
> >> CONFIG_OTG already depends on CONFIG_USB as it is a sub-option of
> >> CONFIG_USB. It doesn't depend on CONFIG_USB_GADGET and that can
> >> be fixed.
> >>
> >> But dependency is not the problem here. Symbols not available to
> >> OTG driver when USB/GADGET is 'm' is the problem.
> >>
> >> e.g.
> >> CONFIG_USB_OTG is always built-in.
> >> we need to work if CONFIG_USB is 'm'/'y'
> >> _and_ if CONFIG_USB_GADGET is 'm'/'y'
> >>
> > 
> > below should fix this issue, but we may need to make some
> > changes for code which are defined by CONFIG_USB_OTG.
> > 
> > diff --git a/drivers/usb/core/Kconfig b/drivers/usb/core/Kconfig
> > index a99c89e..5e374ad 100644
> > --- a/drivers/usb/core/Kconfig
> > +++ b/drivers/usb/core/Kconfig
> > @@ -42,8 +42,9 @@ config USB_DYNAMIC_MINORS
> >   If you are unsure about this, say N here.
> > 
> > config USB_OTG
> > -   bool "OTG support"
> > +   tristate "OTG support"
> > depends on PM
> > +   depends on USB && USB_GADGET
> > default n
> >help
> >  The most notable feature of
> >  USB OTG is support for a
> 
> With this USB_OTG will become 'm' when either USB or USB_GADGET is m
> and will break if either USB or USB_GADGET is made y as all OTG core
> API symbols won't be available. :)
> 

OK, after thinking about it more, it seems we can't handle this properly
with USB_OTG as 'm'. Your idea of calling the hcd/gadget API through
host/gadget/fsm->ops, with the controller driver defining these ops (since
it uses the hcd/gadget functions), is the proper way for now.

-- 

Best Regards,
Peter Chen


[PATCH 3.19.y-ckt 032/251] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Joe Konno 

commit 0dd23f94251f49da99a6cbfb22418b2d757d77d6 upstream.

Commit 007bea098b86 (intel_pstate: Add setting voltage value for
baytrail P states.) introduced byt_set_pstate() with the assumption that
it would always be run by the CPU whose MSR is to be written by it.  It
turns out, however, that is not always the case in practice, so modify
byt_set_pstate() to enforce the MSR write done by it to always happen on
the right CPU.

Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.)
Signed-off-by: Joe Konno 
Acked-by: Kristen Carlson Accardi 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Kamal Mostafa 
---
 drivers/cpufreq/intel_pstate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 742eefb..c37c895 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -497,7 +497,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)
 
 	val |= vid;
 
-	wrmsrl(MSR_IA32_PERF_CTL, val);
+	wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
 }
 
 #define BYT_BCLK_FREQS 5
-- 
1.9.1



[PATCH 3.19.y-ckt 029/251] spi: fix race freeing dummy_tx/rx before it is unmapped

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Martin Sperl 

commit 8e76ef88f607174082023f50b87fe12dcdbe5db5 upstream.

Fix a race (with some kernel configurations) where a queued
master->pump_messages runs and frees dummy_tx/rx before
spi_unmap_msg is running (or is finished).

This results in the following messages:
  BUG: Bad page state in process
  page:db7ba030 count:0 mapcount:0 mapping:  (null) index:0x0
  flags: 0x200(arch_1)
  page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
  ...

Reported-by: Noralf Trønnes 
Suggested-by: Noralf Trønnes 
Tested-by: Noralf Trønnes 
Signed-off-by: Martin Sperl 
Signed-off-by: Mark Brown 
Signed-off-by: Kamal Mostafa 
---
 drivers/spi/spi.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index a17f533..bfa47d5 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1059,9 +1059,6 @@ void spi_finalize_current_message(struct spi_master *master)
 
 	spin_lock_irqsave(&master->queue_lock, flags);
 	mesg = master->cur_msg;
-	master->cur_msg = NULL;
-
-	queue_kthread_work(&master->kworker, &master->pump_messages);
 	spin_unlock_irqrestore(&master->queue_lock, flags);
 
 	spi_unmap_msg(master, mesg);
@@ -1074,9 +1071,13 @@ void spi_finalize_current_message(struct spi_master *master)
 		}
 	}
 
-	trace_spi_message_done(mesg);
-
+	spin_lock_irqsave(&master->queue_lock, flags);
+	master->cur_msg = NULL;
 	master->cur_msg_prepared = false;
+	queue_kthread_work(&master->kworker, &master->pump_messages);
+	spin_unlock_irqrestore(&master->queue_lock, flags);
+
+	trace_spi_message_done(mesg);
 
 	mesg->state = NULL;
 	if (mesg->complete)
-- 
1.9.1



Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking to unlock time

2015-07-15 Thread Waiman Long

On 07/15/2015 06:03 AM, Peter Zijlstra wrote:

On Tue, Jul 14, 2015 at 10:13:36PM -0400, Waiman Long wrote:

+static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
  {
struct pv_node *pn = (struct pv_node *)node;

+   if (xchg(&pn->state, vcpu_running) == vcpu_running)
+   return;
+
/*
+* Kicking the next node at lock time can actually be a bit faster
+* than doing it at unlock time because the critical section time
+* overlaps with the wakeup latency of the next node. However, if the
+* VM is too overcommitted, it can happen that we need to kick the
+* CPU again at unlock time (double-kick). To avoid that and also to
+* fully utilize the kick-ahead functionality at unlock time,
+* the kicking will be deferred under either one of the following
+* 2 conditions:
 *
+* 1) The VM guest has too few vCPUs that kick-ahead is not even
+*enabled. In this case, the chance of double-kick will be
+*higher.
+* 2) The node after the next one is also in the halted state.
 *
+* In this case, the hashed flag is set to indicate that hashed
+* table has been filled and _Q_SLOW_VAL is set.
 */
-   if (xchg(&pn->state, vcpu_running) == vcpu_halted) {
-   pvstat_inc(pvstat_lock_kick);
-   pv_kick(pn->cpu);
+   if ((!pv_kick_ahead || pv_get_kick_node(pn, 1)) &&
+   (xchg(&pn->hashed, 1) == 0)) {
+   struct __qspinlock *l = (void *)lock;
+
+   /*
+* As this is the same vCPU that will check the _Q_SLOW_VAL
+* value and the hash table later on at unlock time, no atomic
+* instruction is needed.
+*/
+   WRITE_ONCE(l->locked, _Q_SLOW_VAL);
+   (void)pv_hash(lock, pn);
+   return;
}
+
+   /*
+* Kicking the vCPU even if it is not really halted is safe.
+*/
+   pvstat_inc(pvstat_lock_kick);
+   pv_kick(pn->cpu);
  }

  /*
@@ -513,6 +545,13 @@ static void pv_wait_head(struct qspinlock *lock, struct 
mcs_spinlock *node)
cpu_relax();
}

+   if (!lp && (xchg(&pn->hashed, 1) == 1))
+   /*
+* The hashed table & _Q_SLOW_VAL had been filled
+* by the lock holder.
+*/
+   lp = (struct qspinlock **)-1;
+
if (!lp) { /* ONCE */
lp = pv_hash(lock, pn);
/*

*groan*, so you complained the previous version of this patch was too
complex, but let me say I vastly preferred it to this one :/


I said it was complex as maintaining a tri-state variable needed more 
thought than 2 bi-state variables. I can revert it back to the tri-state 
variable as doing an unconditional kick in unlock simplifies the code at 
pv_wait_head().


Cheers,
Longman


[PATCH 3.19.y-ckt 004/251] packet: read num_members once in packet_rcv_fanout()

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Eric Dumazet 

[ Upstream commit f98f4514d07871da7a113dd9e3e330743fd70ae4 ]

We need to tell the compiler that it must not read f->num_members multiple
times. Otherwise the test that num is not zero is unreliable, and we could
attempt an invalid divide by 0 in fanout_demux_cpu().
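The pattern can be sketched in user space as follows. This is an analogy
rather than kernel code: a relaxed C11 atomic load stands in for READ_ONCE(),
and fanout_group_sketch and pick_member() are hypothetical names. The point
is that the value is loaded exactly once into a local, so the zero test and
the modulo operate on the same number even if another thread changes the
member count in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct fanout_group_sketch {
	atomic_uint num_members;	/* may be changed concurrently */
};

/* Returns an index in [0, num) or -1 when the group is empty. */
static int pick_member(struct fanout_group_sketch *f, unsigned int hash)
{
	unsigned int num;

	assert(f != NULL);

	/* Single load: the zero test and the modulo see the same value. */
	num = atomic_load_explicit(&f->num_members, memory_order_relaxed);

	if (num == 0)
		return -1;

	return (int)(hash % num);
}
```

If f->num_members were instead read once for the test and again for the
division, a concurrent store of 0 between the two reads could still lead to
a division by zero.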

Note bug was present in packet_rcv_fanout_hash() and
packet_rcv_fanout_lb() but final 3.1 had a simple location
after commit 95ec3eb417115fb ("packet: Add 'cpu' fanout policy.")

Fixes: dc99f600698dc ("packet: Add fanout support.")
Signed-off-by: Eric Dumazet 
Cc: Willem de Bruijn 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/packet/af_packet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 9cfe2e1..8c7eb97 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1339,7 +1339,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct net_device *dev,
 			     struct packet_type *pt, struct net_device *orig_dev)
 {
 	struct packet_fanout *f = pt->af_packet_priv;
-	unsigned int num = f->num_members;
+	unsigned int num = READ_ONCE(f->num_members);
 	struct packet_sock *po;
 	unsigned int idx;
 
-- 
1.9.1



[PATCH 3.19.y-ckt 023/251] [media] saa7164: fix querycap warning

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Hans Verkuil 

commit 534bc3e2ee93835badca753bedce8073c67caa92 upstream.

Fix the VIDIOC_QUERYCAP warning due to the missing device_caps. Don't fill
in the version field, the V4L2 core will do that for you.
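The relationship between the two fields that the patch below sets up can be
sketched like this. The flag values and the names caps_sketch and
fill_caps() are assumptions for illustration and are not taken from
videodev2.h: device_caps describes the node being queried, while
capabilities covers the whole device (this node plus its sibling nodes) and
carries an extra flag saying that device_caps is valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values (assumptions, not copied from videodev2.h). */
#define SKETCH_CAP_VIDEO_CAPTURE	UINT32_C(0x00000001)
#define SKETCH_CAP_VBI_CAPTURE		UINT32_C(0x00000010)
#define SKETCH_CAP_TUNER		UINT32_C(0x00010000)
#define SKETCH_CAP_READWRITE		UINT32_C(0x01000000)
#define SKETCH_CAP_DEVICE_CAPS		UINT32_C(0x80000000)

struct caps_sketch {
	uint32_t capabilities;	/* whole physical device */
	uint32_t device_caps;	/* this device node only */
};

/* Fill both fields from the node's own caps and those of sibling nodes. */
static void fill_caps(struct caps_sketch *cap, uint32_t node_caps,
		      uint32_t sibling_caps)
{
	assert(cap != NULL);

	cap->device_caps = node_caps;
	cap->capabilities = node_caps | sibling_caps | SKETCH_CAP_DEVICE_CAPS;
}
```

For the encoder node in this sketch, node_caps would be video capture,
read/write and tuner, and sibling_caps would be VBI capture; for the VBI
node the two capture flags swap roles.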

Signed-off-by: Hans Verkuil 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/pci/saa7164/saa7164-encoder.c | 11 ++-
 drivers/media/pci/saa7164/saa7164-vbi.c | 11 ++-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/media/pci/saa7164/saa7164-encoder.c 
b/drivers/media/pci/saa7164/saa7164-encoder.c
index 9266965..7a0a651 100644
--- a/drivers/media/pci/saa7164/saa7164-encoder.c
+++ b/drivers/media/pci/saa7164/saa7164-encoder.c
@@ -721,13 +721,14 @@ static int vidioc_querycap(struct file *file, void  *priv,
 		sizeof(cap->card));
 	sprintf(cap->bus_info, "PCI:%s", pci_name(dev->pci));
 
-	cap->capabilities =
+	cap->device_caps =
 		V4L2_CAP_VIDEO_CAPTURE |
-		V4L2_CAP_READWRITE |
-		0;
+		V4L2_CAP_READWRITE |
+		V4L2_CAP_TUNER;
 
-	cap->capabilities |= V4L2_CAP_TUNER;
-	cap->version = 0;
+	cap->capabilities = cap->device_caps |
+		V4L2_CAP_VBI_CAPTURE |
+		V4L2_CAP_DEVICE_CAPS;
 
 	return 0;
 }
diff --git a/drivers/media/pci/saa7164/saa7164-vbi.c 
b/drivers/media/pci/saa7164/saa7164-vbi.c
index 6e025fe..06117e6 100644
--- a/drivers/media/pci/saa7164/saa7164-vbi.c
+++ b/drivers/media/pci/saa7164/saa7164-vbi.c
@@ -660,13 +660,14 @@ static int vidioc_querycap(struct file *file, void  *priv,
 		sizeof(cap->card));
 	sprintf(cap->bus_info, "PCI:%s", pci_name(dev->pci));
 
-	cap->capabilities =
+	cap->device_caps =
 		V4L2_CAP_VBI_CAPTURE |
-		V4L2_CAP_READWRITE |
-		0;
+		V4L2_CAP_READWRITE |
+		V4L2_CAP_TUNER;
 
-	cap->capabilities |= V4L2_CAP_TUNER;
-	cap->version = 0;
+	cap->capabilities = cap->device_caps |
+		V4L2_CAP_VIDEO_CAPTURE |
+		V4L2_CAP_DEVICE_CAPS;
 
 	return 0;
 }
-- 
1.9.1


