BUG: perf error on syscalls for powerpc64.
Hi All, Commit 1028ccf5 changed sys_call_table from a pointer to an array of unsigned long. I don't think that is correct, and here is my reasoning: sys_call_table is defined as a label in assembler, so it should be a pointer array rather than an array as described in 1028ccf5. If it is defined as an array, arch_syscall_addr will return the address of sys_call_table[], whereas what arch_syscall_addr needs is the content of sys_call_table[]. As a result, 'perf list' ignores all syscalls, because find_syscall_meta returns NULL in init_ftrace_syscalls due to the wrong arch_syscall_addr. Did I miss something, or has the GCC compiler changed its behavior here? Cheers, Zumeng -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
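The distinction drawn in the report above, between the address of a table slot and the content stored in it, can be illustrated with a small user-space sketch. Everything here is hypothetical (the demo_* names are not kernel symbols) and it does not reproduce the powerpc64 code; it only shows that the two values differ.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two syscall handlers. */
static long demo_sys_a(void) { return 1; }
static long demo_sys_b(void) { return 2; }

/*
 * Storage comparable to what an assembler label names: a run of
 * pointer-sized entries, each holding the address of a handler.
 */
static const uintptr_t demo_call_table[] = {
	(uintptr_t)demo_sys_a,
	(uintptr_t)demo_sys_b,
};

/* What a syscall lookup needs: the CONTENT of slot nr. */
static uintptr_t demo_syscall_addr(int nr)
{
	return demo_call_table[nr];
}

/* What a mismatched declaration could hand back: the ADDRESS of slot nr. */
static uintptr_t demo_slot_addr(int nr)
{
	return (uintptr_t)&demo_call_table[nr];
}
```

A lookup keyed on handler addresses can only match the first value; the second points into the table itself.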
Re: [PATCH] ARM: OMAP2: Delete unnecessary checks before three function calls
Hello Markus On Tue, 30 Jun 2015, SF Markus Elfring wrote: > From: Markus Elfring > Date: Tue, 30 Jun 2015 14:00:16 +0200 > > The functions clk_disable(), of_node_put() and omap_device_delete() test > whether their argument is NULL and then return immediately. > Thus the test around the call is not needed. > > This issue was detected by using the Coccinelle software. > > Signed-off-by: Markus Elfring Thanks for the patch. I have to say, I am a bit leery about applying the omap_device.c and omap_hwmod.c changes, since the called functions -- omap_device_delete() and clk_disable() -- don't explicitly document that NULLs are allowed to be passed in. So there's no explicit contract that callers can rely upon, to (at least in theory) prevent those internal NULL pointer checks from being removed. So I would suggest that those two functions' kerneldoc be patched first to explicitly state that passing in a NULL pointer is allowed. Then I would feel a bit more comfortable applying the omap_device.c and omap_hwmod.c changes. The kerneldoc for of_node_put() does explicitly allow NULLs to be passed in. So I'll apply that change now for v4.3, touching up the commit message accordingly. 
regards, - Paul > --- > arch/arm/mach-omap2/omap_device.c | 3 +-- > arch/arm/mach-omap2/omap_hwmod.c | 5 + > arch/arm/mach-omap2/timer.c | 3 +-- > 3 files changed, 3 insertions(+), 8 deletions(-) > > diff --git a/arch/arm/mach-omap2/omap_device.c > b/arch/arm/mach-omap2/omap_device.c > index 4cb8fd9..196366e 100644 > --- a/arch/arm/mach-omap2/omap_device.c > +++ b/arch/arm/mach-omap2/omap_device.c > @@ -193,8 +193,7 @@ static int _omap_device_notifier_call(struct > notifier_block *nb, > > switch (event) { > case BUS_NOTIFY_DEL_DEVICE: > - if (pdev->archdata.od) > - omap_device_delete(pdev->archdata.od); > + omap_device_delete(pdev->archdata.od); > break; > case BUS_NOTIFY_ADD_DEVICE: > if (pdev->dev.of_node) > diff --git a/arch/arm/mach-omap2/omap_hwmod.c > b/arch/arm/mach-omap2/omap_hwmod.c > index d78c12e..1091ee7 100644 > --- a/arch/arm/mach-omap2/omap_hwmod.c > +++ b/arch/arm/mach-omap2/omap_hwmod.c > @@ -921,10 +921,7 @@ static int _disable_clocks(struct omap_hwmod *oh) > int i = 0; > > pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name); > - > - if (oh->_clk) > - clk_disable(oh->_clk); > - > + clk_disable(oh->_clk); > p = oh->slave_ports.next; > > while (i < oh->slaves_cnt) { > diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c > index cac46d8..15448221 100644 > --- a/arch/arm/mach-omap2/timer.c > +++ b/arch/arm/mach-omap2/timer.c > @@ -208,8 +208,7 @@ static void __init omap_dmtimer_init(void) > /* If we are a secure device, remove any secure timer nodes */ > if ((omap_type() != OMAP2_DEVICE_TYPE_GP)) { > np = omap_get_timer_dt(omap_timer_match, "ti,timer-secure"); > - if (np) > - of_node_put(np); > + of_node_put(np); > } > } > > -- > 2.4.5 > - Paul
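The request in the reply above, to state the NULL contract in kerneldoc before callers drop their own checks, could look roughly like the following. This is a user-space sketch with made-up names (demo_res, demo_res_put, demo_res_new); it is not the omap_device_delete() or clk_disable() documentation.

```c
#include <stdlib.h>

/* Hypothetical resource type, for illustration only. */
struct demo_res {
	int refcount;
};

/**
 * demo_res_put - drop a reference to a demo resource
 * @res: resource to release; passing NULL is allowed and is a no-op
 *
 * Because the NULL case is part of the documented contract, callers
 * may rely on it and omit their own check before calling.
 *
 * Return: 1 if the resource was freed, 0 otherwise.
 */
static int demo_res_put(struct demo_res *res)
{
	if (!res)
		return 0;
	if (--res->refcount > 0)
		return 0;
	free(res);
	return 1;
}

/* Allocate a resource holding a single reference (NULL on failure). */
static struct demo_res *demo_res_new(void)
{
	struct demo_res *res = malloc(sizeof(*res));

	if (res)
		res->refcount = 1;
	return res;
}
```

With the contract written down, a caller can write demo_res_put(ptr) unconditionally, and a later change that removes the internal check would visibly contradict the documentation.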
Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces
Andy Lutomirski writes: > On Wed, Jul 15, 2015 at 10:04 PM, Eric W. Biederman > wrote: >> Andy Lutomirski writes: >> >>> >>> So here's the semantic question: >>> >>> Suppose an unprivileged user (uid 1000) creates a user namespace and a >>> mount namespace. They stick a file (owned by uid 1000 as seen by >>> init_user_ns) in there and mark it setuid root and give it fcaps. >> >> To make this make sense I have to ask, is this file on a filesystem >> where uid 1000 as seen by the init_user_ns stored as uid 1000 on >> the filesystem? Or is this uid 0 as seen by the filesystem? >> >> I assume this is uid 0 on the filesystem in question or else your >> unprivileged user would not have sufficient privileges over the >> filesystem to setup fcaps. > > I was thinking uid 0 as seen by the filesystem. But even if it were > uid 1000, the unprivileged user can still set whatever mode and xattrs > they want -- they control the backing store. Yes. And that is what I was really asking. Are we talking about a filesystem where the user controls the backing store? >>> Then global root gets an fd to this filesystem. If they execve the >>> file directly, then, with my patch 4, it won't act as setuid 1000 and >>> the fcaps will be ignored. Even with my patch 4, though, if they bind >>> mount the fs and execve the file from their bind mount, it will act as >>> setuid 1000. Maybe this is odd. However, with Seth's patch 3, the >>> fcaps will (correctly) not be honored. >> >> With patch 3 you can also think of it as fcaps being honored and you >> get all the caps in the appropriate user namespace, but since you are >> not in that user namespace and so don't have a place to store them >> in struct cred you don't get the file caps. >> >> From the philosophy of interpreting the file as defined by the >> filesystem in principle we could extend struct cred so you actually >> get the creds just in uid 1000s user namespace, but that is very >> unlikely to be worth it. > > I agree.
> >> >>> I tend to think that, if we're not honoring the fcaps, we shouldn't be >>> honoring the setuid bit either. After all, it's really not a trusted >>> file, even though the only user who could have messed with it really >>> is the apparent owner. >> >> For the file caps we can't honor them because you don't have the bits >> in struct cred. >> >> For setuid we can honor it, and setuid is something that the user >> namespace allows. >> > > We certainly *can* honor it. But why should we? I'd be more > comfortable with this if the contents of an untrusted filesystem were > really treated as just data. In these weird bleed through situations I don't know that we should. But extending nosuid protections in this way is a bit like Yama: a somewhat gratuitous stomping on don't-care cases in the semantics to make bugs harder to exploit. >>> And, if we're going to say we don't trust the file and shouldn't honor >>> setuid or fcaps, then merging all the functionality into mnt_may_suid >>> could make sense. Yes, these two things do different things, but they >>> could hook in to the same place. >> >> There are really two separate questions: >> - Do we trust this filesystem? >> - Do you have the bits to implement this concept? >> >> Even if in this specific context the two questions wind up looking >> exactly the same. I think it makes a lot of sense to ask the two >> questions separately. As future maintenance changes may cause the >> implementation of the questions to diverge. >> > > Agreed. > > Unless someone thinks of an argument to the contrary, I'd say "no, we > don't trust this filesystem". I could be convinced otherwise. But this is context dependent. From the perspective of the container we really do want to trust the filesystem. As the container root set it up, and if he isn't being hostile likely has a use for setfcaps files and setuid files and all of the rest. Perhaps I should phrase it as: - In this context do we trust the code? AKA mnt_may_suid?
- What do these bits mean in this context? (Usually something more complicated). Which says to me we want both patches 3 and 4 (even if 4 uses s_user_ns) because 3 is different than 4. And now I better context switch back to fixing bind mounts. Eric
Re: linux-next: build failure after merge of the rcu tree
Hi Paul, On Wed, 15 Jul 2015 20:51:38 -0700 "Paul E. McKenney" wrote: > > Thank you in both cases! I suspect that more will follow, so is there > something I can do to make this easier? (Hard for me to patch stuff > that is not yet in the tree...) No, that is what I am here for. But it would be good if you remember this when it comes time for your tree to be merged into tip ... -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking to unlock time
On Wed, Jul 15, 2015 at 10:18:35PM -0400, Waiman Long wrote: > On 07/15/2015 06:03 AM, Peter Zijlstra wrote: > >*groan*, so you complained the previous version of this patch was too > >complex, but let me say I vastly preferred it to this one :/ > > I said it was complex as maintaining a tri-state variable needed more > thought than 2 bi-state variables. I can revert it back to the tri-state > variable as doing an unconditional kick in unlock simplifies the code at > pv_wait_head(). Well, your state space isn't shrunk, you just use more variables and I'm not entirely sure that actually matters. What also doesn't help is mixing that with the kicking code.
Re: [PATCH v2 4/6] locking/pvqspinlock: Allow vCPUs kick-ahead
On Wed, Jul 15, 2015 at 10:01:02PM -0400, Waiman Long wrote: > On 07/15/2015 05:39 AM, Peter Zijlstra wrote: > >On Tue, Jul 14, 2015 at 10:13:35PM -0400, Waiman Long wrote: > >>Frequent CPU halting (vmexit) and CPU kicking (vmenter) lengthens > >>critical section and block forward progress. This patch implements > >>a kick-ahead mechanism where the unlocker will kick the queue head > >>vCPUs as well as up to four additional vCPUs next to the queue head > >>if they were halted. The kickings are done after exiting the critical > >>section to improve parallelism. > >> > >>The amount of kick-ahead allowed depends on the number of vCPUs > >>in the VM guest. This patch, by itself, won't do much as most of > >>the kickings are currently done at lock time. Coupled with the next > >>patch that defers lock time kicking to unlock time, it should improve > >>overall system performance in a busy overcommitted guest. > >> > >>Linux kernel builds were run in KVM guest on an 8-socket, 4 > >>cores/socket Westmere-EX system and a 4-socket, 8 cores/socket > >>Haswell-EX system. Both systems are configured to have 32 physical > >>CPUs. The kernel build times before and after the patch were: > >> > >> Westmere Haswell > >> Patch 32 vCPUs 48 vCPUs 32 vCPUs 48 vCPUs > >> ----- > >> Before patch 3m25.0s 10m34.1s 2m02.0s 15m35.9s > >> After patch 3m27.4s 10m32.0s 2m00.8s 14m52.5s > >> > >>There wasn't too much difference before and after the patch. > >That means either the patch isn't worth it, or as you seem to imply its > >in the wrong place in this series. > > It needs to be coupled with the next patch to be effective as most of the > kicking are happening at the lock side, instead of at the unlock side. If > you look at the sample pvqspinlock stats in patch 3: > > lock_kick_count=755354 > unlock_kick_count=87 > > The number of unlock kicks is negligible compared with the lock kicks.
Patch > 5 does have a dependency on patch 4 unless we make it unconditionally defers > kicking to the unlock call which was what I had done in the v1 patch. The > reason why I change this in v2 is because I found a very slight performance > degradation in doing so. This way we cannot see the gains of the proposed complexity. So put it in a place where you can. > >You also do not offer any support for any of the magic numbers.. > > I chose 4 for PV_KICK_AHEAD_MAX as I didn't see much performance difference > when I did a kick-ahead of 5. Also, it may be too unfair to the vCPU that > was doing the kicking if the number is too big. Another magic number is > pv_kick_ahead number. This one is kind of arbitrary. Right now I do a log2, > but it can be divided by 4 (rshift 2) as well. So what was the difference between 1-2-3-4 ? I would be thinking one extra kick is the biggest help, no?
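To make the two "magic number" alternatives mentioned above concrete, here is a small sketch that derives a kick-ahead count from the number of vCPUs, once via log2 and once via a right shift by 2. The cap of 4 comes from the PV_KICK_AHEAD_MAX value quoted in the thread; that the computed value is clamped to that cap, and all demo_* names, are assumptions for illustration and are not taken from the patch.

```c
/* Cap quoted in the thread for PV_KICK_AHEAD_MAX. */
#define DEMO_KICK_AHEAD_MAX 4

/* Floor of log2(n) for n >= 1; returns 0 for n == 0 as well. */
static int demo_ilog2(unsigned int n)
{
	int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* Variant 1 (assumed clamping): kick-ahead grows with log2 of the vCPU count. */
static int demo_kick_ahead_log2(unsigned int ncpus)
{
	int k = demo_ilog2(ncpus);

	return k > DEMO_KICK_AHEAD_MAX ? DEMO_KICK_AHEAD_MAX : k;
}

/* Variant 2 (assumed clamping): kick-ahead is the vCPU count divided by 4. */
static int demo_kick_ahead_shift(unsigned int ncpus)
{
	unsigned int k = ncpus >> 2;

	return k > DEMO_KICK_AHEAD_MAX ? DEMO_KICK_AHEAD_MAX : (int)k;
}
```

Under these assumptions both variants saturate at 4 for the 32- and 48-vCPU guests used in the benchmark, so they only differ for small guests (for example 8 vCPUs gives 3 with log2 and 2 with the shift).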
Re: [PATCH 1/3] ARM: multi_v7_defconfig: Enable max77802 regulator
Hello Krzysztof, Thanks for the feedback. On 07/16/2015 02:45 AM, Krzysztof Kozlowski wrote: > On 16.07.2015 01:32, Javier Martinez Canillas wrote: >> The Maxim max77802 Power Management IC has besides other devices, a set of >> regulators. Commit f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802 >> regulator, rtc and clock drivers") was supposed to enable the config option >> for the regulator driver as a module but the final version that landed did >> not include this. So this patch enables the needed Kconfig option. >> >> Signed-off-by: Javier Martinez Canillas > > Please describe why do you want to enable it (IOW who will benefit from > enabling it?). This symbol was removed by Kukjin from your commit: > [kg...@kernel.org: removing useless REGULATOR_MAX77802 config] > so justification would be welcomed. > You are right, sorry for not making the commit message clear. This PMIC is used by a couple of Exynos5 based boards such as the Peach Pit and Pi Chromebooks. I expect it to be found in other designs too just like the max77686 is found in many Exynos5 based boards. I'll add this to the commit message on v2. > Beside the commit description I agree with the patch. > Does this mean I can add your Reviewed-by to this patch as well? > Best regards, > Krzysztof > Best regards, -- Javier Martinez Canillas Open Source Group Samsung Research America
Re: [PATCH v2 1/6] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL
On Wed, Jul 15, 2015 at 08:18:23PM -0400, Waiman Long wrote: > On 07/15/2015 05:10 AM, Peter Zijlstra wrote: > > /* > >+ * A failed cmpxchg doesn't provide any memory-ordering guarantees, > >+ * so we need a barrier to order the read of the node data in > >+ * pv_unhash *after* we've read the lock being _Q_SLOW_VAL. > >+ * > >+ * Matches the cmpxchg() in pv_wait_head() setting _Q_SLOW_VAL. > >+ */ > >+smp_rmb(); > > According to memory_barriers.txt, cmpxchg() is a full memory barrier. It > didn't say a failed cmpxchg will lose its memory guarantee. So is the > documentation right? The documentation is not entirely clear on this; but there are hints that this is so. > Or is that true for some architectures? I think it is > not true for x86. On x86 LOCK CMPXCHG is always a sync point, but yes there are archs for which a failed cmpxchg does _NOT_ provide any barrier semantics. The reason I started looking was because Will made Argh64 one of those.
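The point about a failed cmpxchg carrying no ordering can be mirrored in user space with C11 atomics, where the failure ordering of a compare-exchange is chosen separately from the success ordering. The sketch below is only an analogy: the demo_* names are hypothetical, an acquire fence stands in for the kernel's smp_rmb(), and the structure is not the pvqspinlock code.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_LOCKED   1
#define DEMO_SLOW_VAL 3

struct demo_lock {
	atomic_int val;
	int node_data;	/* written by the waiter before it sets DEMO_SLOW_VAL */
};

/* Waiter side: publish node_data, then flag the slow path with release. */
static void demo_set_slow(struct demo_lock *l, int data)
{
	l->node_data = data;
	atomic_store_explicit(&l->val, DEMO_SLOW_VAL, memory_order_release);
}

/*
 * Unlock side: returns true if the fast path released the lock,
 * false if the slow path ran (in which case *seen gets node_data).
 */
static bool demo_unlock(struct demo_lock *l, int *seen)
{
	int expected = DEMO_LOCKED;

	/* On failure this compare-exchange is only a relaxed load. */
	if (atomic_compare_exchange_strong_explicit(&l->val, &expected, 0,
						    memory_order_release,
						    memory_order_relaxed))
		return true;

	/*
	 * Explicit fence: order the read of node_data after the load
	 * that observed DEMO_SLOW_VAL; pairs with the release store in
	 * demo_set_slow().
	 */
	atomic_thread_fence(memory_order_acquire);
	*seen = l->node_data;
	atomic_store_explicit(&l->val, 0, memory_order_release);
	return false;
}
```

Without the fence, nothing in the C11 model orders the node_data read after the failed compare-exchange, which is the same gap the quoted comment describes for architectures where a failed cmpxchg is not a barrier.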
Re: [PATCH v5 2/3] pwm: add MediaTek display PWM driver support
On Wed, 2015-07-15 at 23:59 +0800, YH Huang wrote: > On Mon, 2015-07-13 at 18:19 +0800, Daniel Kurtz wrote: > > On Mon, Jul 13, 2015 at 5:04 PM, YH Huang wrote: > > > Add display PWM driver support to modify backlight for MT8173 and MT6595. > > > The PWM has one channel to control the brightness of the display. > > > When the (high_width / period) is closer to 1, the screen is brighter; > > > otherwise, it is darker. > > > > > > Signed-off-by: YH Huang > > > --- > > > drivers/pwm/Kconfig| 10 ++ > > > drivers/pwm/Makefile | 1 + > > > drivers/pwm/pwm-mtk-disp.c | 256 > > > + > > > 3 files changed, 267 insertions(+) > > > create mode 100644 drivers/pwm/pwm-mtk-disp.c > > > > > > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig > > > index b1541f4..f5b03a4 100644 > > > --- a/drivers/pwm/Kconfig > > > +++ b/drivers/pwm/Kconfig > > > @@ -211,6 +211,16 @@ config PWM_LPSS_PLATFORM > > > To compile this driver as a module, choose M here: the module > > > will be called pwm-lpss-platform. > > > > > > +config PWM_MTK_DISP > > > + tristate "MediaTek display PWM driver" > > > + depends on ARCH_MEDIATEK || COMPILE_TEST > > > + help > > > + Generic PWM framework driver for MediaTek disp-pwm device. > > > + The PWM is used to control the backlight brightness for display. > > > + > > > + To compile this driver as a module, choose M here: the module > > > + will be called pwm-mtk-disp. 
> > > + > > > config PWM_MXS > > > tristate "Freescale MXS PWM support" > > > depends on ARCH_MXS && OF > > > diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile > > > index ec50eb5..99c9e75 100644 > > > --- a/drivers/pwm/Makefile > > > +++ b/drivers/pwm/Makefile > > > @@ -18,6 +18,7 @@ obj-$(CONFIG_PWM_LPC32XX) += pwm-lpc32xx.o > > > obj-$(CONFIG_PWM_LPSS) += pwm-lpss.o > > > obj-$(CONFIG_PWM_LPSS_PCI) += pwm-lpss-pci.o > > > obj-$(CONFIG_PWM_LPSS_PLATFORM)+= pwm-lpss-platform.o > > > +obj-$(CONFIG_PWM_MTK_DISP) += pwm-mtk-disp.o > > > obj-$(CONFIG_PWM_MXS) += pwm-mxs.o > > > obj-$(CONFIG_PWM_PCA9685) += pwm-pca9685.o > > > obj-$(CONFIG_PWM_PUV3) += pwm-puv3.o > > > diff --git a/drivers/pwm/pwm-mtk-disp.c b/drivers/pwm/pwm-mtk-disp.c > > > new file mode 100644 > > > index 000..1f17cee > > > --- /dev/null > > > +++ b/drivers/pwm/pwm-mtk-disp.c > > > @@ -0,0 +1,256 @@ > > > +/* > > > + * MediaTek display pulse-width-modulation controller driver. > > > + * Copyright (c) 2015 MediaTek Inc. > > > + * Author: YH Huang > > > + * > > > + * This program is free software; you can redistribute it and/or modify > > > + * it under the terms of the GNU General Public License version 2 as > > > + * published by the Free Software Foundation. > > > + * > > > + * This program is distributed in the hope that it will be useful, > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > > + * GNU General Public License for more details. > > > + */ > > > + > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > + > > > +#define DISP_PWM_EN0 > > > > The "DISP_PWM_*" are register offsets, so use a hex value, like this: > > > > #define DISP_PWM_EN 0x00 > > > > Use BIT() for register *fields*, that is, the individual bits of a register. > > > > Got it! 
> > > > +#define PWM_ENABLE_MASK BIT(0) > > > + > > > +#define DISP_PWM_COMMIT BIT(3) > > > > #define DISP_PWM_COMMIT 0x08 > > > > > +#define PWM_COMMIT_MASK BIT(0) > > > + > > > +#define DISP_PWM_CON_0 BIT(4) > > > > #define DISP_PWM_COMMIT 0x10 > > > > > +#define PWM_CLKDIV_SHIFT 16 > > > +#define PWM_CLKDIV_MAX 0x3ff > > > +#define PWM_CLKDIV_MASK (PWM_CLKDIV_MAX << > > > PWM_CLKDIV_SHIFT) > > > + > > > +#define DISP_PWM_CON_1 0x14 > > > +#define PWM_PERIOD_MASK 0xfff > > > +/* Shift log2(PWM_PERIOD_MASK + 1) as divisor */ > > > +#define PWM_PERIOD_BIT_SHIFT 12 > > > + > > > +#define PWM_HIGH_WIDTH_SHIFT 16 > > > +#define PWM_HIGH_WIDTH_MASK (0x1fff << PWM_HIGH_WIDTH_SHIFT) > > > + > > > +struct mtk_disp_pwm { > > > + struct pwm_chip chip; > > > + struct device *dev; > > > > I don't think "dev" is actually used. And, if needed, it can be > > extracted from "chip". > > > > I will drop it. > > > > + struct clk *clk_main; > > > + struct clk *clk_mm; > > > + void __iomem *base; > > > +}; > > > + > > > +static inline struct mtk_disp_pwm *to_mtk_disp_pwm(struct pwm_chip *chip) > > > +{ > > > + return container_of(chip, struct mtk_disp_pwm, chip); > > > +} > > > + > > > +static void
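The review convention discussed in this thread, hex values for register offsets and BIT() for fields inside a register, can be sketched in plain C as follows. The offsets and the clock-divider field follow the values quoted in the review; the DEMO_ prefix, the fake register array and the helper functions are assumptions made so the example runs in user space, and none of it is the driver's actual code.

```c
#include <stdint.h>

#define DEMO_BIT(n)		((uint32_t)1 << (n))

/* Register offsets: plain hex byte offsets from the block base. */
#define DEMO_PWM_EN		0x00
#define DEMO_PWM_COMMIT		0x08
#define DEMO_PWM_CON_0		0x10

/* Register fields: individual bits or masks within one register. */
#define DEMO_ENABLE_MASK	DEMO_BIT(0)
#define DEMO_CLKDIV_SHIFT	16
#define DEMO_CLKDIV_MAX		0x3ff
#define DEMO_CLKDIV_MASK	((uint32_t)DEMO_CLKDIV_MAX << DEMO_CLKDIV_SHIFT)

/* Fake register file standing in for the memory-mapped block. */
static uint32_t demo_regs[0x20 / 4];

/* Replace only the bits selected by mask, leaving the others intact. */
static uint32_t demo_update_bits(uint32_t old, uint32_t mask, uint32_t data)
{
	return (old & ~mask) | (data & mask);
}

/* Read-modify-write of one field in the register at the given offset. */
static void demo_write_field(unsigned int offset, uint32_t mask, uint32_t data)
{
	demo_regs[offset / 4] = demo_update_bits(demo_regs[offset / 4], mask, data);
}
```

Keeping the two kinds of constant visually distinct makes it harder to pass a field mask where an offset is expected, which is what writing an offset as BIT(3) or BIT(4) invites.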
linux-next: build failure after merge of the akpm-current tree
Hi Andrew, After merging the akpm-current tree, today's linux-next build (powerpc ppc64_defconfig) failed like this: ERROR: ".smpboot_register_percpu_thread_cpumask" [drivers/infiniband/hw/ehca/ib_ehca.ko] undefined! Caused by commit 2b07b4da35a9 ("smpboot: allow passing the cpumask on per-cpu thread registration") I have added the following build fix for today: From: Stephen Rothwell Date: Thu, 16 Jul 2015 15:30:05 +1000 Subject: [PATCH] smpboot: fix for allow passing the cpumask on per-cpu thread registration Signed-off-by: Stephen Rothwell --- kernel/smpboot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/smpboot.c b/kernel/smpboot.c index d99a41d25b0c..a818cbc73e14 100644 --- a/kernel/smpboot.c +++ b/kernel/smpboot.c @@ -308,7 +308,7 @@ out: put_online_cpus(); return ret; } -EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread); +EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread_cpumask); /** * smpboot_unregister_percpu_thread - Unregister a per_cpu thread related to hotplug -- 2.1.4 -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH] staging: rtl8188eu: core: find and remove code valid only for 5 GHz.
On Wed, Jul 15, 2015 at 10:04:08PM -0400, Sreenath Madasu wrote: > This one of the TODO tasks for staging rtl8188eu driver. I have removed > the code referring to channel > 14 for rtw_ap.c, rtw_ieee80211.c and > rtw_mlme.c files. Please review. Your patch will give a new build warning: warning: unused variable ‘pcur_network’ [-Wunused-variable] regards sudip
Re: [PATCH 2/9] ARM: multi_v7_defconfig: Enable max77802 regulator, rtc and clock drivers
Hello Krzysztof, On 07/16/2015 02:42 AM, Krzysztof Kozlowski wrote: > On 16.07.2015 00:38, Javier Martinez Canillas wrote: >> Hello, >> >> On Thu, May 14, 2015 at 5:40 PM, Javier Martinez Canillas >> wrote: >>> The Maxim max77802 Power Management IC is used on many Exynos machines. >>> Besides a bunch of regulators, this chip has a Real-Time-Clock (RTC) >>> and 2-channel 32kHz clock outputs. >>> >>> Enable the kernel config options to have the drivers for these devices >>> built as a module. >>> >>> Signed-off-by: Javier Martinez Canillas >>> --- >>> arch/arm/configs/multi_v7_defconfig | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/arch/arm/configs/multi_v7_defconfig >>> b/arch/arm/configs/multi_v7_defconfig >>> index 2349584b6e08..080120fe5580 100644 >>> --- a/arch/arm/configs/multi_v7_defconfig >>> +++ b/arch/arm/configs/multi_v7_defconfig >>> @@ -373,6 +373,7 @@ CONFIG_POWER_RESET_SYSCON=y >>> CONFIG_REGULATOR_MAX8907=y >>> CONFIG_REGULATOR_MAX8973=y >>> CONFIG_REGULATOR_MAX77686=y >>> +CONFIG_REGULATOR_MAX77802=m >> >> I noticed that the version that landed in 4.2-rc1 as commit >> f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802 regulator, rtc >> and clock drivers") doesn't include this symbol. I guess it was caused >> by a wrong resolved conflict? I'll post a patch to enable the >> regulator again. > > As you can see in mentioned mainline commit Kukjin removed it manually: > [kg...@kernel.org: removing useless REGULATOR_MAX77802 config] > Oh, I missed that in the commit message. I thought it was a merge / conflict error, not something done on purpose. > I wonder why? > Me too. > Best regards, > Krzysztof > -- Best regards, -- Javier Martinez Canillas Open Source Group Samsung Research America
Re: [PATCH 1/1] mem-hotplug: Handle node hole when initializing numa_meminfo.
On 07/16/2015 05:20 AM, Tejun Heo wrote: On Wed, Jul 01, 2015 at 11:16:54AM +0800, Tang Chen wrote: ... - /* and there's no empty block */ - if (bi->start >= bi->end) + /* and there's no empty or non-exist block */ + if (bi->start >= bi->end || + memblock_overlaps_region(, + bi->start, bi->end - bi->start) == -1) Ugh can you please change memblock_overlaps_region() to return bool instead? Well, I think memblock_overlaps_region() is designed to return the index of the region overlapping with the given region. Maybe it had some users before. Of course for now, it is only called by memblock_is_region_reserved(). It is OK to change the return value of memblock_overlaps_region() to bool. But any caller of memblock_is_region_reserved() should also be changed. I think it is OK to leave it there. Thanks. Thanks.
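The interface question raised above, an overlap check that returns a region index (with -1 for "none") versus one that simply returns bool, can be sketched like this. The region type and both functions are hypothetical user-space stand-ins and are not the memblock implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region descriptor: [base, base + size). */
struct demo_region {
	unsigned long base;
	unsigned long size;
};

/* Index-returning form: index of the first overlapping region, or -1. */
static long demo_overlaps_index(const struct demo_region *r, size_t cnt,
				unsigned long base, unsigned long size)
{
	unsigned long end = base + size;

	for (size_t i = 0; i < cnt; i++) {
		unsigned long rend = r[i].base + r[i].size;

		if (r[i].base < end && rend > base)
			return (long)i;
	}
	return -1;
}

/* Boolean form: callers no longer compare against a magic -1. */
static bool demo_overlaps(const struct demo_region *r, size_t cnt,
			  unsigned long base, unsigned long size)
{
	return demo_overlaps_index(r, cnt, base, size) != -1;
}
```

A condition such as `!demo_overlaps(...)` then reads directly as "no overlapping block", whereas `== -1` requires the reader to know the sentinel.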
linux-next: build warning after merge of the akpm-current tree
Hi Andrew, After merging the akpm-current tree, today's linux-next build (arm multi_v7_defconfig) produced this warning: lib/genalloc.c: In function 'gen_pool_get': /scratch/sfr/next/lib/genalloc.c:599:6: warning: passing argument 4 of 'devres_find' discards 'const' qualifier from pointer target type p = devres_find(dev, devm_gen_pool_release, devm_gen_pool_match, name); ^ In file included from /scratch/sfr/next/include/linux/node.h:17:0, from /scratch/sfr/next/include/linux/cpu.h:16, from /scratch/sfr/next/include/linux/of_device.h:4, from /scratch/sfr/next/lib/genalloc.c:37: /scratch/sfr/next/include/linux/device.h:620:14: note: expected 'void *' but argument is of type 'const char *' extern void *devres_find(struct device *dev, dr_release_t release, ^ Caused by commit e89a70fd54f2 ("genalloc: add support of multiple gen_pools per device") -- Cheers, Stephen Rothwell s...@canb.auug.org.au
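The warning reported above arises whenever a const-qualified pointer is handed to a parameter declared as plain `void *`. The sketch below reproduces the shape of the problem with hypothetical demo_* names and shows one possible way to silence it, an explicit cast at the call site; changing the callee's prototype to take `const void *` would be another. Which approach the genalloc code eventually took is not stated in this message.

```c
#include <string.h>

/* Hypothetical match callback type taking opaque, non-const match data. */
typedef int (*demo_match_t)(const char *entry, void *match_data);

/* Return the first entry accepted by the callback, or NULL. */
static const char *demo_find(const char *const *entries, int n,
			     demo_match_t match, void *match_data)
{
	for (int i = 0; i < n; i++)
		if (match(entries[i], match_data))
			return entries[i];
	return NULL;
}

/* Callback that only reads match_data, treating it as a name. */
static int demo_match_name(const char *entry, void *match_data)
{
	return strcmp(entry, (const char *)match_data) == 0;
}

static const char *demo_get(const char *const *entries, int n, const char *name)
{
	/*
	 * Passing `name` directly would discard its const qualifier;
	 * the cast makes that explicit and is safe only because the
	 * callback never writes through the pointer.
	 */
	return demo_find(entries, n, demo_match_name, (void *)name);
}
```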
linux-next: build failure after merge of the akpm-current tree
Hi Andrew, After merging the akpm-current tree, today's linux-next build (arm multi_v7_defconfig) failed like this: arch/arm/kernel/entry-common.S: Assembler messages: arch/arm/kernel/entry-common.S:108: Error: __NR_syscalls is not equal to the size of the syscall table Caused by commit d221fc1f0f25 ("mm: mlock: add new mlock, munlock, and munlockall system calls") I have added the following fix patch for today: From: Stephen Rothwell Date: Thu, 16 Jul 2015 14:58:53 +1000 Subject: [PATCH] mm: mlock: fix for add new mlock, munlock, and munlockall system calls Signed-off-by: Stephen Rothwell --- arch/arm/include/asm/unistd.h | 2 +- arch/arm/kernel/calls.S | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h index 32640c431a08..2516c09d65d7 100644 --- a/arch/arm/include/asm/unistd.h +++ b/arch/arm/include/asm/unistd.h @@ -19,7 +19,7 @@ * This may need to be greater than __NR_last_syscall+1 in order to * account for the padding in the syscall table */ -#define __NR_syscalls (388) +#define __NR_syscalls (392) /* * *NOTE*: This is a ghost syscall private to the kernel. Only the diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S index 514e77b26414..88808221383b 100644 --- a/arch/arm/kernel/calls.S +++ b/arch/arm/kernel/calls.S @@ -399,7 +399,7 @@ CALL(sys_execveat) CALL(sys_mlock2) CALL(sys_munlock2) -/* 400 */ CALL(sys_munlockall2) +/* 390 */ CALL(sys_munlockall2) #ifndef syscalls_counted .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls #define syscalls_counted -- 2.1.4 -- Cheers, Stephen Rothwell s...@canb.auug.org.au
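The arithmetic behind the new __NR_syscalls value can be checked with the padding expression quoted in the patch, ((NR_syscalls + 3) & ~3). The sketch below assumes, based on the corrected comment in the fix, that the last table entry is number 390, which gives 391 entries; the function names are hypothetical.

```c
/* Round the entry count up to the next multiple of 4, as calls.S does. */
static unsigned int demo_padded_syscalls(unsigned int nr_entries)
{
	return (nr_entries + 3) & ~3u;
}

/* Number of padding slots appended to reach that multiple. */
static unsigned int demo_syscalls_padding(unsigned int nr_entries)
{
	return demo_padded_syscalls(nr_entries) - nr_entries;
}
```

With 391 entries the padded size is 392, matching the value in the fix, while the old value of 388 corresponds to a table that was already a multiple of 4 before the three new calls were added.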
Re: [PATCH v3 6/6] ARM: PRM: AM437x: Enable IO wakeup feature
On Thursday 16 July 2015 10:44 AM, Paul Walmsley wrote: Hi On Tue, 14 Jul 2015, Keerthy wrote: Enable IO wakeup feature. Signed-off-by: Keerthy Per my comments on one of the previous patches, please add a short description in the commit message for what enabling I/O wakeup will do for a user. Okay will do that. - Paul
[PATCH] asm-generic: {get,put}_user ptr argument evaluate only 1 time
The current implementation evaluates the ptr argument twice, which can produce unexpected results. Signed-off-by: Yoshinori Sato --- include/asm-generic/uaccess.h | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h index 72d8803..1b813fb 100644 --- a/include/asm-generic/uaccess.h +++ b/include/asm-generic/uaccess.h @@ -163,9 +163,10 @@ static inline __must_check long __copy_to_user(void __user *to, #define put_user(x, ptr) \ ({ \ + __typeof__((ptr)) __p = (ptr); \ might_fault(); \ - access_ok(VERIFY_WRITE, ptr, sizeof(*ptr)) ?\ - __put_user(x, ptr) :\ + access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ?\ + __put_user(x, __p) :\ -EFAULT;\ }) @@ -225,9 +226,10 @@ extern int __put_user_bad(void) __attribute__((noreturn)); #define get_user(x, ptr) \ ({ \ + __typeof__((ptr)) __p = (ptr); \ might_fault(); \ - access_ok(VERIFY_READ, ptr, sizeof(*ptr)) ? \ - __get_user(x, ptr) :\ + access_ok(VERIFY_READ, __p, sizeof(*__p)) ? \ + __get_user(x, __p) :\ -EFAULT;\ }) -- 2.1.4
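The double-evaluation problem this patch addresses is easy to reproduce outside the kernel. The sketch below uses hypothetical demo_* names and the same GNU C extensions as the patch (statement expressions and __typeof__), so it assumes GCC or Clang; it is an illustration of the technique, not the uaccess code.

```c
/* Counts how often the pointer expression is evaluated. */
static int demo_calls;

/* Pointer expression with a side effect: returns the cursor, then advances it. */
static int *demo_next(int **cursor)
{
	demo_calls++;
	return (*cursor)++;
}

/* `ptr` appears twice in the expansion, so its side effects run twice. */
#define DEMO_GET_TWICE(ptr) \
	((ptr) != 0 ? *(ptr) : -1)

/* `ptr` is evaluated once into a local, as the patch does with __p. */
#define DEMO_GET_ONCE(ptr)				\
({							\
	__typeof__((ptr)) demo_p = (ptr);		\
	demo_p != 0 ? *demo_p : -1;			\
})
```

Calling DEMO_GET_TWICE(demo_next(&c)) on a buffer advances the cursor twice and reads the second element, while DEMO_GET_ONCE(demo_next(&c)) advances it once and reads the first, which is the behavior a caller would expect.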
Re: [RFC PATCH v2] memory-barriers: remove smp_mb__after_unlock_lock()
On Thu, 2015-07-16 at 15:03 +1000, Benjamin Herrenschmidt wrote: > On Thu, 2015-07-16 at 12:00 +1000, Michael Ellerman wrote: > > That would fix the problem with smp_mb__after_unlock_lock(), but not > > the original worry we had about loads happening before the SC in lock. > > However I think isync fixes *that* :-) The problem with isync is as you > said, it's not a -memory- barrier per-se, it's an execution barrier / > context synchronizing instruction. The combination stwcx. + bne + isync > however prevents the execution of anything past the isync until the > stwcx has completed and the bne has been "decided", which prevents loads > from leaking into the LL/SC loop. It will also prevent a store in the > lock from being issued before the stwcx. has completed. It does *not* > prevent as far as I can tell another unrelated store before the lock > from leaking into the lock, including the one used to unlock a different > lock. Except that the architecture says: << Because a Store Conditional instruction may complete before its store has been performed, a conditional Branch instruction that depends on the CR0 value set by a Store Conditional instruction does not order the Store Conditional's store with respect to storage accesses caused by instructions that follow the Branch >> So isync in lock is architecturally incorrect, despite being what the architecture recommends using, yay ! Ben.
Re: [PATCH v2 1/6] ARM: OMAP4: PRM: Remove hardcoding of PRM_IO_PMCTRL_OFFSET register
Paul, Thanks for the review! On Thursday 16 July 2015 07:24 AM, Paul Walmsley wrote: Hi a few minor comments On Wed, 8 Jul 2015, Keerthy wrote: PRM_IO_PMCTRL_OFFSET need not be same for all SOCs hence remove hardcoding and use the value provided by the omap_prcm_irq_setup structure. Please mention here that the reason why you're making this change is to support AM437x. Sure. I will do that. Signed-off-by: Keerthy --- arch/arm/mach-omap2/prcm-common.h | 1 + arch/arm/mach-omap2/prm44xx.c | 11 ++- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/arm/mach-omap2/prcm-common.h b/arch/arm/mach-omap2/prcm-common.h index 6ae0b3a..2e60406 100644 --- a/arch/arm/mach-omap2/prcm-common.h +++ b/arch/arm/mach-omap2/prcm-common.h @@ -494,6 +494,7 @@ struct omap_prcm_irq { struct omap_prcm_irq_setup { u16 ack; u16 mask; + u16 pm_ctrl; Please add a kerneldoc structure documentation line for this new field, to match the existing documentation here. Okay. u8 nr_regs; u8 nr_irqs; const struct omap_prcm_irq *irqs; diff --git a/arch/arm/mach-omap2/prm44xx.c b/arch/arm/mach-omap2/prm44xx.c index 4541700..8149e5a 100644 --- a/arch/arm/mach-omap2/prm44xx.c +++ b/arch/arm/mach-omap2/prm44xx.c @@ -45,6 +45,7 @@ static const struct omap_prcm_irq omap4_prcm_irqs[] = { static struct omap_prcm_irq_setup omap4_prcm_irq_setup = { .ack= OMAP4_PRM_IRQSTATUS_MPU_OFFSET, .mask = OMAP4_PRM_IRQENABLE_MPU_OFFSET, + .pm_ctrl= OMAP4_PRM_IO_PMCTRL_OFFSET, .nr_regs= 2, .irqs = omap4_prcm_irqs, .nr_irqs= ARRAY_SIZE(omap4_prcm_irqs), @@ -306,10 +307,10 @@ static void omap44xx_prm_reconfigure_io_chain(void) omap4_prm_rmw_inst_reg_bits(OMAP4430_WUCLK_CTRL_MASK, OMAP4430_WUCLK_CTRL_MASK, inst, - OMAP4_PRM_IO_PMCTRL_OFFSET); + omap4_prcm_irq_setup.pm_ctrl); omap_test_timeout( (((omap4_prm_read_inst_reg(inst, - OMAP4_PRM_IO_PMCTRL_OFFSET) & + omap4_prcm_irq_setup.pm_ctrl) & OMAP4430_WUCLK_STATUS_MASK) >> OMAP4430_WUCLK_STATUS_SHIFT) == 1), MAX_IOPAD_LATCH_TIME, i); @@ -319,10 +320,10 @@ static void 
omap44xx_prm_reconfigure_io_chain(void) /* Trigger WUCLKIN disable */ omap4_prm_rmw_inst_reg_bits(OMAP4430_WUCLK_CTRL_MASK, 0x0, inst, - OMAP4_PRM_IO_PMCTRL_OFFSET); + omap4_prcm_irq_setup.pm_ctrl); omap_test_timeout( (((omap4_prm_read_inst_reg(inst, - OMAP4_PRM_IO_PMCTRL_OFFSET) & + omap4_prcm_irq_setup.pm_ctrl) & OMAP4430_WUCLK_STATUS_MASK) >> OMAP4430_WUCLK_STATUS_SHIFT) == 0), MAX_IOPAD_LATCH_TIME, i); @@ -350,7 +351,7 @@ static void __init omap44xx_prm_enable_io_wakeup(void) omap4_prm_rmw_inst_reg_bits(OMAP4430_GLOBAL_WUEN_MASK, OMAP4430_GLOBAL_WUEN_MASK, inst, - OMAP4_PRM_IO_PMCTRL_OFFSET); + omap4_prcm_irq_setup.pm_ctrl); } /** -- 1.9.1 - Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
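The approach in this patch, reading the PRM_IO_PMCTRL offset from the per-SoC setup structure instead of a compile-time macro, can be sketched in plain C as below. This is a simplified illustration, not the OMAP code: struct prcm_irq_setup_sketch, the fake register array, the helper names and every offset value are hypothetical placeholders. The kerneldoc comment shows one possible wording for the documentation line requested in the review.

```c
#include <assert.h>
#include <stdint.h>

/**
 * struct prcm_irq_setup_sketch - simplified stand-in for omap_prcm_irq_setup
 * @ack: offset of the first IRQ status (ack) register
 * @mask: offset of the first IRQ enable (mask) register
 * @pm_ctrl: offset of the PRM_IO_PMCTRL register on this SoC
 */
struct prcm_irq_setup_sketch {
	uint16_t ack;
	uint16_t mask;
	uint16_t pm_ctrl;
};

/* Placeholder offsets for two imaginary SoCs; not real register offsets. */
static const struct prcm_irq_setup_sketch soc_a_setup = {
	.ack = 0x10, .mask = 0x18, .pm_ctrl = 0x20,
};
static const struct prcm_irq_setup_sketch soc_b_setup = {
	.ack = 0x04, .mask = 0x08, .pm_ctrl = 0x24,
};

/* A fake register file so the sketch can run without hardware. */
#define SKETCH_NR_REGS 64
static uint32_t sketch_regs[SKETCH_NR_REGS];

static uint32_t sketch_read(uint16_t offset)
{
	assert(offset / 4 < SKETCH_NR_REGS);
	return sketch_regs[offset / 4];
}

/* Read-modify-write: clear the bits in mask, then set the requested bits. */
static void sketch_rmw(uint32_t mask, uint32_t bits, uint16_t offset)
{
	assert(offset / 4 < SKETCH_NR_REGS);
	sketch_regs[offset / 4] = (sketch_regs[offset / 4] & ~mask) | (bits & mask);
}

/* The register offset comes from the setup data, not from a fixed macro. */
static void sketch_enable_io_wakeup(const struct prcm_irq_setup_sketch *setup,
				    uint32_t wuen_mask)
{
	sketch_rmw(wuen_mask, wuen_mask, setup->pm_ctrl);
}
```

Because the caller passes the setup structure, the same function serves both imaginary SoCs, which is the property the commit message should spell out when it mentions AM437x.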
Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets
On Thursday 16 July 2015 08:08 AM, Paul Walmsley wrote: On Thu, 16 Jul 2015, Paul Walmsley wrote: On Wed, 8 Jul 2015, Keerthy wrote: Add the PRM IRQ register offsets. Signed-off-by: Keerthy Please add more detail to your commit messages so they conform to Documentation/SubmittingPatches: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109 For example, this commit message should read something like: --- ARM: AM43xx: Add the PRM IRQ register offsets Add the PRM IRQ register offsets. This is needed to support PRM I/O wakeup on AM43xx. -- Basically, your patches need to provide context as to _why_ the change is needed. I've fixed the message for this patch, and queued it for v4.3, but please take care with this issue in the future. Also I've moved the AM43XX_PRM_IO_PMCTRL_OFFSET macro out of the AM43XX CM section, since it doesn't belong there. Thanks Paul! - Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces
On Wed, Jul 15, 2015 at 10:04 PM, Eric W. Biederman wrote: > Andy Lutomirski writes: > >> On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman >> wrote: >>> >>> Ok. Andy I have stopped and really looked at your patch that is 4/7 in >>> this series. Something I had not done before since it sounded totally >>> wrong. >>> >>> That combined with your earlier comments I think I can say something >>> meaningful. >>> >>> Andy as I read your patch the thread you are primarily worried about is >>> chdir(/some/directory/in/another/mnt/ns). I think enhancing nosuid to >>> deal with that case is reasonable, and is unlikely to break userspace. >>> It is one of those hairy security things so we need to be careful not to >>> introduce a regression. >>> >> >> Indeed. It's plausible this could regress something, but it would be >> really weird. >> >>> I think a top down enhancement of nosuid to just block funny cases that >>> no one cares about is completely sensible.Removing goofy corner >>> that no one cares about and that are only good for security exploits >>> seems reasonable. >>> >> >> Agreed. >> >>> I am a little concerned that smack does not seem to respect nosuid >>> on filesystems. But that is an issue with nosuid not with your enhanced >>> nosuid. >>> >>> >>> >>> >>> Now this patch 3/7 really should be entitled: >>> "Limit file caps to the userns of the super block". >>> >>> It really really is doing something different. This change is about a >>> bottom up understanding of what file caps means on a filesystem mounted >>> by a user namespace root. >>> >>> That is file caps should only apply to the user namespace root of the >>> root user who mounted the filesystem, because that is all the privileges >>> the mounter of the filesystem had. >>> >>> This guarantees that even if the filesystem somehow propagates with >>> mount propagation that there will be no issues. I think I know how to >>> make that happen... 
>>> >>> >>> >>> >>> But deeply and fundamentally limiting a filesystem to only the >>> privilieges of it's user namespace root, and enhancing nosuid >>> protections are rather different things. >>> >> >> So here's the semantic question: >> >> Suppose an unprivileged user (uid 1000) creates a user namespace and a >> mount namespace. They stick a file (owned by uid 1000 as seen by >> init_user_ns) in there and mark it setuid root and give it fcaps. > > To make this make sense I have to ask, is this file on a filesystem > where uid 1000 as seen by the init_user_ns stored as uid 1000 on > the filesystem? Or is this uid 0 as seen by the filesystem? > > I assume this is uid 0 on the filesystem in question or else your > unprivileged user would not have sufficient privileges over the > filesystem to setup fcaps. I was thinking uid 0 as seen by the filesystem. But even if it were uid 1000, the unprivileged user can still set whatever mode and xattrs they want -- they control the backing store. > >> Then global root gets an fd to this filesystem. If they execve the >> file directly, then, with my patch 4, it won't act as setuid 1000 and >> the fcaps will be ignored. Even with my patch 4, though, if they bind >> mount the fs and execve the file from their bind mount, it will act as >> setuid 1000. Maybe this is odd. However, with Seth's patch 3, the >> fcaps will (correctly) not be honored. > > With patch 3 you can also think of it as fcaps being honored and you > get all the caps in the appropriate user namespace, but since you are > not in that user namespace and so don't have a place to store them > in struct cred you don't get the file caps. > > From the philosophy of interpreting the file as defined by the > filesystem in principle we could extend struct cred so you actually > get the creds just in uid 1000s user namespace, but that is very > unlikely to be worth it. I agree. 
> >> I tend to thing that, if we're not honoring the fcaps, we shouldn't be >> honoring the setuid bit either. After all, it's really not a trusted >> file, even though the only user who could have messed with it really >> is the apparent owner. > > For the file caps we can't honor them because you don't have the bits > in struct cred. > > For setuid we can honor it, and setuid is something that the user > namespace allows. > We certainly *can* honor it. But why should we? I'd be more comfortable with this if the contents of an untrusted filesystem were really treated as just data. >> And, if we're going to say we don't trust the file and shouldn't honor >> setuid or fcaps, then merging all the functionality into mnt_may_suid >> could make sense. Yes, these two things do different things, but they >> could hook in to the same place. > > There are really two separate questions: > - Do we trust this filesystem? > - Do you have the bits to implement this concept? > > Even if in this specific context the two questions wind up looking > exactly the same. I think it makes a lot of sense to ask the two > questions
Re: [PATCH v3 6/6] ARM: PRM: AM437x: Enable IO wakeup feature
Hi

On Tue, 14 Jul 2015, Keerthy wrote:
> Enable IO wakeup feature.
>
> Signed-off-by: Keerthy

Per my comments on one of the previous patches, please add a short description in the commit message for what enabling I/O wakeup will do for a user.

- Paul
Re: [PATCH v4 0/6] powernv: cpufreq: Report frequency throttle by OCC
On 13-07-15, 19:39, Shilpasri G Bhat wrote:
> This patchset intends to add frequency throttle reporting mechanism
> to powernv-cpufreq driver when OCC throttles the frequency. OCC is an
> On-Chip-Controller which takes care of the power and thermal safety of
> the chip. The CPU frequency can be throttled during an OCC reset or
> when OCC tries to limit the max allowed frequency. The patchset will
> report such conditions so as to keep the user informed about reason
> for the drop in performance of workloads when frequency is throttled.
>
> Changes from v3:
> - Rebased on top of 4.2-rc1
> - Minor changes in patch 2,3,4,6 this does not change the
>   functionality of the code
> - 594fcb9ec9e powerpc/powernv: Expose OPAL APIs required by PRD
>   interface , this patch fixes the build error due to which this
>   series was initially dropped
>   ERROR: ".opal_message_notifier_register"
>   [drivers/cpufreq/powernv-cpufreq.ko] undefined!

I have already Acked v3 of this and that applies to this one as well..

-- 
viresh
Re: cpufreq/ondemand: unpinning an unpinned lock.
On 16-07-15, 02:13, Rafael J. Wysocki wrote: > Cc: Viresh as he's been working on governors recently. > > On Wednesday, July 15, 2015 06:04:22 PM Dave Jones wrote: > > WARNING: CPU: 1 PID: 29529 at kernel/locking/lockdep.c:3497 > > lock_unpin_lock+0x109/0x110() > > unpinning an unpinned lock > > CPU: 1 PID: 29529 Comm: kworker/1:1 Not tainted 4.2.0-rc2-think+ #3 > > Workqueue: events od_dbs_timer > > 0009 880094d5baa8 ae7f5e6f 0007 > > 880094d5baf8 880094d5bae8 ae07b91a 0118 > > 00e0 880507bd5c58 0092 0004 > > Call Trace: > > [] dump_stack+0x4f/0x7b > > [] warn_slowpath_common+0x8a/0xc0 > > [] warn_slowpath_fmt+0x46/0x50 > > [] lock_unpin_lock+0x109/0x110 > > [] __schedule+0x3ac/0xb60 > > [] schedule+0x41/0x90 > > [] schedule_preempt_disabled+0x18/0x30 > > [] mutex_lock_nested+0x16f/0x3e0 > > [] ? gov_queue_work+0x2f/0xf0 > > [] ? od_check_cpu+0x57/0xd0 > > [] ? gov_queue_work+0x2f/0xf0 > > [] gov_queue_work+0x2f/0xf0 > > [] od_dbs_timer+0xbd/0x150 > > [] process_one_work+0x1f3/0x7a0 > > [] ? process_one_work+0x162/0x7a0 > > [] ? worker_thread+0xf9/0x470 > > [] worker_thread+0x69/0x470 > > [] ? preempt_count_sub+0xa3/0xf0 > > [] ? process_one_work+0x7a0/0x7a0 > > [] kthread+0x11f/0x140 > > [] ? kthread_create_on_node+0x250/0x250 > > [] ret_from_fork+0x3f/0x70 > > [] ? kthread_create_on_node+0x250/0x250 > > ---[ end trace 86cca931caec9193 ]--- I don't know why this will happen. Just to confirm, you are getting this over 4.2-rc(1 or 2)? And you weren't getting these on 4.1 at all? And its always reproducible? How ? There have been races in cpufreq core since sometime and what got pushed in 4.2-rc1 is just half of the fix. The other half is present here: http://marc.info/?i=cover.1434713657.git.viresh.kumar%40linaro.org Please try this and let us know if things work well or not. 
-- 
viresh
Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces
Andy Lutomirski writes: > On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman > wrote: >> >> Ok. Andy I have stopped and really looked at your patch that is 4/7 in >> this series. Something I had not done before since it sounded totally >> wrong. >> >> That combined with your earlier comments I think I can say something >> meaningful. >> >> Andy as I read your patch the thread you are primarily worried about is >> chdir(/some/directory/in/another/mnt/ns). I think enhancing nosuid to >> deal with that case is reasonable, and is unlikely to break userspace. >> It is one of those hairy security things so we need to be careful not to >> introduce a regression. >> > > Indeed. It's plausible this could regress something, but it would be > really weird. > >> I think a top down enhancement of nosuid to just block funny cases that >> no one cares about is completely sensible.Removing goofy corner >> that no one cares about and that are only good for security exploits >> seems reasonable. >> > > Agreed. > >> I am a little concerned that smack does not seem to respect nosuid >> on filesystems. But that is an issue with nosuid not with your enhanced >> nosuid. >> >> >> >> >> Now this patch 3/7 really should be entitled: >> "Limit file caps to the userns of the super block". >> >> It really really is doing something different. This change is about a >> bottom up understanding of what file caps means on a filesystem mounted >> by a user namespace root. >> >> That is file caps should only apply to the user namespace root of the >> root user who mounted the filesystem, because that is all the privileges >> the mounter of the filesystem had. >> >> This guarantees that even if the filesystem somehow propagates with >> mount propagation that there will be no issues. I think I know how to >> make that happen... 
>> >> >> >> >> But deeply and fundamentally limiting a filesystem to only the >> privilieges of it's user namespace root, and enhancing nosuid >> protections are rather different things. >> > > So here's the semantic question: > > Suppose an unprivileged user (uid 1000) creates a user namespace and a > mount namespace. They stick a file (owned by uid 1000 as seen by > init_user_ns) in there and mark it setuid root and give it fcaps. To make this make sense I have to ask, is this file on a filesystem where uid 1000 as seen by the init_user_ns stored as uid 1000 on the filesystem? Or is this uid 0 as seen by the filesystem? I assume this is uid 0 on the filesystem in question or else your unprivileged user would not have sufficient privileges over the filesystem to setup fcaps. > Then global root gets an fd to this filesystem. If they execve the > file directly, then, with my patch 4, it won't act as setuid 1000 and > the fcaps will be ignored. Even with my patch 4, though, if they bind > mount the fs and execve the file from their bind mount, it will act as > setuid 1000. Maybe this is odd. However, with Seth's patch 3, the > fcaps will (correctly) not be honored. With patch 3 you can also think of it as fcaps being honored and you get all the caps in the appropriate user namespace, but since you are not in that user namespace and so don't have a place to store them in struct cred you don't get the file caps. >From the philosophy of interpreting the file as defined by the filesystem in principle we could extend struct cred so you actually get the creds just in uid 1000s user namespace, but that is very unlikely to be worth it. > I tend to thing that, if we're not honoring the fcaps, we shouldn't be > honoring the setuid bit either. After all, it's really not a trusted > file, even though the only user who could have messed with it really > is the apparent owner. For the file caps we can't honor them because you don't have the bits in struct cred. 
For setuid we can honor it, and setuid is something that the user namespace allows. > And, if we're going to say we don't trust the file and shouldn't honor > setuid or fcaps, then merging all the functionality into mnt_may_suid > could make sense. Yes, these two things do different things, but they > could hook in to the same place. There are really two separate questions: - Do we trust this filesystem? - Do you have the bits to implement this concept? Even if in this specific context the two questions wind up looking exactly the same. I think it makes a lot of sense to ask the two questions separately. As future maintenance changes may cause the implementation of the questions to diverge. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
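The two questions in this message can be kept apart in code even where they currently give the same answer. The following is only a sketch under stated assumptions: every type and function name is hypothetical, the nosuid flag stands in for whatever "do we trust this filesystem" check is eventually chosen, and the "task is in the mounter's user namespace or a descendant of it" rule is an assumed reading of patch 3, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified user namespace: just a parent link. */
struct userns_sketch {
	const struct userns_sketch *parent;
};

/* Hypothetical mount: a nosuid flag plus the userns whose root mounted it. */
struct mount_sketch {
	bool nosuid;
	const struct userns_sketch *owner_ns;
};

/* Question 1: do we trust this filesystem for privilege-raising exec? */
static bool sketch_fs_trusted(const struct mount_sketch *m)
{
	assert(m != NULL);
	return !m->nosuid;
}

/*
 * Question 2: do the task's credentials have a place for these file caps?
 * Assumed rule: only if the task runs in the mounter's user namespace or
 * in one of its descendants.
 */
static bool sketch_can_hold_fcaps(const struct mount_sketch *m,
				  const struct userns_sketch *task_ns)
{
	const struct userns_sketch *ns;

	for (ns = task_ns; ns != NULL; ns = ns->parent)
		if (ns == m->owner_ns)
			return true;
	return false;
}

/* File caps are honored only when both questions are answered yes. */
static bool sketch_honor_fcaps(const struct mount_sketch *m,
			       const struct userns_sketch *task_ns)
{
	return sketch_fs_trusted(m) && sketch_can_hold_fcaps(m, task_ns);
}
```

Keeping two predicates means a later change to one policy, for example a stricter trust check, does not silently change the other.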
Re: [RFC PATCH v2] memory-barriers: remove smp_mb__after_unlock_lock()
On Thu, 2015-07-16 at 12:00 +1000, Michael Ellerman wrote:
> That would fix the problem with smp_mb__after_unlock_lock(), but not
> the original worry we had about loads happening before the SC in lock.

However I think isync fixes *that* :-)

The problem with isync is as you said, it's not a -memory- barrier per-se, it's an execution barrier / context synchronizing instruction. The combination stwcx. + bne + isync however prevents the execution of anything past the isync until the stwcx has completed and the bne has been "decided", which prevents loads from leaking into the LL/SC loop. It will also prevent a store in the lock from being issued before the stwcx. has completed. It does *not* prevent as far as I can tell another unrelated store before the lock from leaking into the lock, including the one used to unlock a different lock.

Cheers,
Ben.
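The ordering discussed here can be stated in portable terms with C11 atomics: taking a lock is an acquire operation and releasing it is a release operation. The sketch below is an analogy only, not the powerpc kernel implementation. The sketch_lock type and its functions are hypothetical names, and how a compiler lowers these operations on POWER is not shown or assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal lock: 0 means free, 1 means held. */
struct sketch_lock {
	atomic_int locked;
};

/*
 * Acquire: accesses that follow a successful trylock in program order
 * may not be reordered before it. This is the guarantee the
 * stwcx. + bne + isync sequence is meant to provide in the lock path.
 */
static bool sketch_trylock(struct sketch_lock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/*
 * Release: accesses before the unlock may not be reordered after it.
 * A release on one lock followed by an acquire on a different lock is
 * still not a full barrier, which is the case the thread is about.
 */
static void sketch_unlock(struct sketch_lock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A caller would spin on sketch_trylock() until it returns true, run the critical section, then call sketch_unlock().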
Re: [PATCH 0/7] Initial support for user namespace owned mounts
Casey Schaufler writes:

> On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
>> On Wed, Jul 15, 2015 at 3:39 PM, Casey Schaufler
>> wrote:
>>> On 7/15/2015 2:06 PM, Eric W. Biederman wrote:
>>>> Casey Schaufler writes:
>>>> The first step needs to be not trusting those labels and treating such
>>>> filesystems as filesystems without label support. I hope that is Seth
>>>> has implemented.
>>> A filesystem with Smack labels gets mounted in a namespace. The labels
>>> are ignored. Instead, the filesystem defaults (potentially specified as
>>> mount options smackfsdef="something", but usually the floor label ("_"))
>>> are used, giving the user the ability to read everything and (usually)
>>> change nothing. This is both dangerous (unintended read access to files)
>>> and pointless (can't make changes).
>> I don't get it.
>>
>> If I mount an unprivileged filesystem, then either the contents were
>> put there *by me*, in which case letting me access them are fine, or
>> (with Seth's patches and then some) I control the backing store, in
>> which case I can do whatever I want regardless of what LSM thinks.
>>
>> So I don't see the problem. Why would Smack or any other LSM care at
>> all, unless it wants to prevent me from mounting the fs in the first
>> place?
>
> First off, I don't cotton to the notion that you should be able
> to mount filesystems without privilege. But it seems I'm being
> outvoted on that. I suspect that there are cases where it might
> be safe, but I can't think of one off the top of my head.

There are two fundamental issues mounting filesystems without privilege, by which I actually mean mounting filesystems as the root user in a user namespace.
- Are the semantics safe.
- Is the extra attack surface a problem.

Figuring out how to make semantics safe is what we are talking about.

Once we sort out the semantics we can look at the handful of filesystems like fuse where the extra attack surface is not a concern.
With that said desktop environments have for a long time been automatically mounting whichever filesystem you place in your computer, so in practice what this is really about is trying to align the kernel with how people use filesystems.

I haven't looked closely but I think docker is just about as bad as those desktop environments when it comes to mounting filesystems.

> If you do mount a filesystem it needs to behave according to the
> rules of the system.

I agree.

> If you have a security module that uses
> attributes on the filesystem you can't ignore them just because
> it's "your data". Mandatory access control schemes, including
> Smack and SELinux don't give a fig about who you are. It's the
> label on the data and the process that matter. If "you" get to
> muck the labels up, you've broken the mandatory access control.

So there are filesystems like fat and minix that can not store a label.

Since it is not possible to store labels securely in filesystems mounted by unprivileged users (at least in the normal sense) the intent would be to treat a filesystem mounted without the privileges of the global root user as a filesystem that does not support xattrs.

Treating such a filesystem as a filesystem that does not support xattrs is the only possible way to support such a filesystem securely, because as you have said someone who can muck up the labels breaks mandatory access control.

Given how non-trivial it is to grasp the nuances of different lsms mandatory access control semantics, I am asking Seth for the first pass to simply forbid mounting of filesystems with just user namespace permissions when there is an lsm active.

Once we get that far smack may never need to support such systems.

Eric
linux-next: manual merge of the akpm-current tree with the arm tree
Hi Andrew, Today's linux-next merge of the akpm-current tree got a conflict in: arch/arm/include/asm/Kbuild between commit: 57853e8906a0 ("ARM: 8403/1: kbuild: don't use generic mcs_spinlock.h header") from the arm tree and commit: 74cf1a5a0c64 ("mm: clean up per architecture MM hook header files") from the akpm-current tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc arch/arm/include/asm/Kbuild index 517ef6dd22b9,30b3bc1666d2.. --- a/arch/arm/include/asm/Kbuild +++ b/arch/arm/include/asm/Kbuild @@@ -12,6 -12,8 +12,7 @@@ generic-y += irq_regs. generic-y += kdebug.h generic-y += local.h generic-y += local64.h -generic-y += mcs_spinlock.h + generic-y += mm-arch-hooks.h generic-y += msgbuf.h generic-y += param.h generic-y += parport.h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces
On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman wrote: > > Ok. Andy I have stopped and really looked at your patch that is 4/7 in > this series. Something I had not done before since it sounded totally > wrong. > > That combined with your earlier comments I think I can say something > meaningful. > > Andy as I read your patch the thread you are primarily worried about is > chdir(/some/directory/in/another/mnt/ns). I think enhancing nosuid to > deal with that case is reasonable, and is unlikely to break userspace. > It is one of those hairy security things so we need to be careful not to > introduce a regression. > Indeed. It's plausible this could regress something, but it would be really weird. > I think a top down enhancement of nosuid to just block funny cases that > no one cares about is completely sensible.Removing goofy corner > that no one cares about and that are only good for security exploits > seems reasonable. > Agreed. > I am a little concerned that smack does not seem to respect nosuid > on filesystems. But that is an issue with nosuid not with your enhanced > nosuid. > > > > > Now this patch 3/7 really should be entitled: > "Limit file caps to the userns of the super block". > > It really really is doing something different. This change is about a > bottom up understanding of what file caps means on a filesystem mounted > by a user namespace root. > > That is file caps should only apply to the user namespace root of the > root user who mounted the filesystem, because that is all the privileges > the mounter of the filesystem had. > > This guarantees that even if the filesystem somehow propagates with > mount propagation that there will be no issues. I think I know how to > make that happen... > > > > > But deeply and fundamentally limiting a filesystem to only the > privilieges of it's user namespace root, and enhancing nosuid > protections are rather different things. 
> So here's the semantic question: Suppose an unprivileged user (uid 1000) creates a user namespace and a mount namespace. They stick a file (owned by uid 1000 as seen by init_user_ns) in there and mark it setuid root and give it fcaps. Then global root gets an fd to this filesystem. If they execve the file directly, then, with my patch 4, it won't act as setuid 1000 and the fcaps will be ignored. Even with my patch 4, though, if they bind mount the fs and execve the file from their bind mount, it will act as setuid 1000. Maybe this is odd. However, with Seth's patch 3, the fcaps will (correctly) not be honored. I tend to thing that, if we're not honoring the fcaps, we shouldn't be honoring the setuid bit either. After all, it's really not a trusted file, even though the only user who could have messed with it really is the apparent owner. And, if we're going to say we don't trust the file and shouldn't honor setuid or fcaps, then merging all the functionality into mnt_may_suid could make sense. Yes, these two things do different things, but they could hook in to the same place. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 3/4] arm64: Add Broadcom iProc family support
This patch adds support to Broadcom's iProc family of arm64 based SoCs in the arm64 Kconfig and defconfig files

Signed-off-by: Ray Jui
Reviewed-by: Scott Branden
---
 arch/arm64/Kconfig           | 5 +
 arch/arm64/configs/defconfig | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 318175f..969ef4a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -162,6 +162,11 @@ source "kernel/Kconfig.freezer"

 menu "Platform selection"

+config ARCH_BCM_IPROC
+	bool "Broadcom iProc SoC Family"
+	help
+	  This enables support for Broadcom iProc based SoCs
+
 config ARCH_EXYNOS
 	bool
 	help
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 4e17e7e..c83d51f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -31,6 +31,7 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
 # CONFIG_IOSCHED_DEADLINE is not set
+CONFIG_ARCH_BCM_IPROC=y
 CONFIG_ARCH_EXYNOS7=y
 CONFIG_ARCH_FSL_LS2085A=y
 CONFIG_ARCH_HISI=y
@@ -102,6 +103,7 @@ CONFIG_SERIO_AMBAKMI=y
 CONFIG_LEGACY_PTY_COUNT=16
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_DW=y
 CONFIG_SERIAL_8250_MT6577=y
 CONFIG_SERIAL_AMBA_PL011=y
 CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
-- 
1.7.9.5
[PATCH v3 1/4] PCI: iproc: enable arm64 support for iProc PCIe
This patch enables arm64 support to the iProc PCIe driver

Signed-off-by: Ray Jui
Reviewed-by: Scott Branden
---
 drivers/pci/host/pcie-iproc.c | 15 ---
 drivers/pci/host/pcie-iproc.h |  8 ++--
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index d77481e..8a556d5 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -58,11 +58,6 @@
 #define SYS_RC_INTX_EN 0x330
 #define SYS_RC_INTX_MASK 0xf

-static inline struct iproc_pcie *sys_to_pcie(struct pci_sys_data *sys)
-{
-	return sys->private_data;
-}
-
 /**
  * Note access to the configuration registers are protected at the higher layer
  * by 'pci_lock' in drivers/pci/access.c
@@ -71,8 +66,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct pci_bus *bus,
 					    unsigned int devfn,
 					    int where)
 {
-	struct pci_sys_data *sys = bus->sysdata;
-	struct iproc_pcie *pcie = sys_to_pcie(sys);
+	struct iproc_pcie *pcie = bus->sysdata;
 	unsigned slot = PCI_SLOT(devfn);
 	unsigned fn = PCI_FUNC(devfn);
 	unsigned busno = bus->number;
@@ -208,10 +202,7 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res)

 	iproc_pcie_reset(pcie);

-	pcie->sysdata.private_data = pcie;
-
-	bus = pci_create_root_bus(pcie->dev, 0, &iproc_pcie_ops,
-				  &pcie->sysdata, res);
+	bus = pci_create_root_bus(pcie->dev, 0, &iproc_pcie_ops, pcie, res);
 	if (!bus) {
 		dev_err(pcie->dev, "unable to create PCI root bus\n");
 		ret = -ENOMEM;
@@ -229,7 +220,9 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res)
 	pci_scan_child_bus(bus);
 	pci_assign_unassigned_bus_resources(bus);
+#ifdef CONFIG_ARM
 	pci_fixup_irqs(pci_common_swizzle, pcie->map_irq);
+#endif
 	pci_bus_add_devices(bus);

 	return 0;
diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
index ba0a108..0ee9673 100644
--- a/drivers/pci/host/pcie-iproc.h
+++ b/drivers/pci/host/pcie-iproc.h
@@ -18,18 +18,22 @@

 /**
  * iProc PCIe device
+ * @sysdata: Per PCI controller data. This needs to be kept at the beginning of
+ * struct iproc_pcie, to enable support of both ARM32 and ARM64 platforms with
+ * minimal changes in the iProc PCIe core driver
  * @dev: pointer to device data structure
  * @base: PCIe host controller I/O register base
  * @resources: linked list of all PCI resources
- * @sysdata: Per PCI controller data
  * @root_bus: pointer to root bus
  * @phy: optional PHY device that controls the Serdes
  * @irqs: interrupt IDs
  */
 struct iproc_pcie {
+#ifdef CONFIG_ARM
+	struct pci_sys_data sysdata;
+#endif
 	struct device *dev;
 	void __iomem *base;
-	struct pci_sys_data sysdata;
 	struct pci_bus *root_bus;
 	struct phy *phy;
 	int irqs[IPROC_PCIE_MAX_NUM_IRQS];
-- 
1.7.9.5
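The reason sysdata has to stay first in struct iproc_pcie can be shown with a small C sketch. It relies on the C rule that a pointer to a structure, suitably converted, points to its first member. All type and function names below are hypothetical stand-ins invented for illustration, not the kernel's struct pci_sys_data, struct pci_bus or struct iproc_pcie.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-controller data that 32-bit ARM arch code expects. */
struct sys_data_sketch {
	int domain;
};

/* Stand-in for the driver's private structure. */
struct pcie_sketch {
	struct sys_data_sketch sysdata;	/* must remain the first member */
	int id;
};

/* Stand-in for the bus: a single opaque pointer shared by both readers. */
struct bus_sketch {
	void *sysdata;
};

/* Fail the build if someone moves sysdata away from offset 0. */
static_assert(offsetof(struct pcie_sketch, sysdata) == 0,
	      "sysdata must be the first member");

/* Arch-style reader: treats the bus pointer as the per-controller data. */
static int sketch_arch_domain(const struct bus_sketch *bus)
{
	const struct sys_data_sketch *sys = bus->sysdata;

	return sys->domain;
}

/* Driver-style reader: treats the same pointer as its own structure. */
static int sketch_driver_id(const struct bus_sketch *bus)
{
	const struct pcie_sketch *pcie = bus->sysdata;

	return pcie->id;
}
```

Both readers are handed the same pointer value. That only works while the embedded structure sits at offset 0, which the static_assert checks at build time.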
[PATCH v3 0/4] Add Broadcom North Star 2 support
This patch series adds Broadcom North Star 2 (NS2) SoC support. NS2 is an ARMv8-based SoC in the Broadcom iProc family.

Sorry for tying this with the Broadcom iProc PCIe driver fixes for ARM64. I have to tie them together because iProc PCIe support is enabled by default when ARCH_BCM_IPROC is enabled. Without the fixes in the iProc PCIe driver, enabling CONFIG_ARCH_BCM_IPROC would break the build for arm64 defconfig. Let me know if there's a better way to handle this.

This patch series is generated based on v4.2-rc2 and tested on the Broadcom NS2 SVK.

Code is available on GitHub: https://github.com/Broadcom/arm64-linux.git (branch ns2-core-v3)

Changes from V2:
- Drop the hardcoded earlycon kernel command line parameter in the NS2 SVK dts file because 1) earlycon is a debugging feature that can be enabled in the bootloader and should not be enabled by default in the board dts file and 2) of_earlycon should be used and support should be added to the 8250 DW driver

Changes from V1:
- Took Arnd's advice to tweak the location of struct pci_sys_data within struct iproc_pcie. This helps to get rid of most of the CONFIG_ARM wrap in the iProc PCIe core driver
- Use stdout-path and alias for serial console in NS2 SVK dts
- Add all 4 CPU descriptions in NS2 dtsi
- Remove "clock-frequency" property in the armv8 timer node so timer frequency can be determined based on readings from CNTFRQ_EL0
- Remove config flag ARCH_BCM_NS2.
Leave only ARCH_BCM_IPROC for all Broadcom arm64 SoCs as advised Ray Jui (4): PCI: iproc: enable arm64 support for iProc PCIe PCI: iproc: Fix ARM64 dependency in Kconfig arm64: Add Broadcom iProc family support arm64: dts: Add Broadcom North Star 2 support Documentation/devicetree/bindings/arm/bcm/ns2.txt |9 ++ arch/arm64/Kconfig|5 + arch/arm64/boot/dts/Makefile |1 + arch/arm64/boot/dts/broadcom/Makefile |5 + arch/arm64/boot/dts/broadcom/ns2-svk.dts | 59 +++ arch/arm64/boot/dts/broadcom/ns2.dtsi | 118 + arch/arm64/configs/defconfig |2 + drivers/pci/host/Kconfig |2 +- drivers/pci/host/pcie-iproc.c | 15 +-- drivers/pci/host/pcie-iproc.h |8 +- 10 files changed, 210 insertions(+), 14 deletions(-) create mode 100644 Documentation/devicetree/bindings/arm/bcm/ns2.txt create mode 100644 arch/arm64/boot/dts/broadcom/Makefile create mode 100644 arch/arm64/boot/dts/broadcom/ns2-svk.dts create mode 100644 arch/arm64/boot/dts/broadcom/ns2.dtsi -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 4/4] arm64: dts: Add Broadcom North Star 2 support
Add Broadcom NS2 device tree binding document. Also add initial device tree dtsi for Broadcom North Star 2 (NS2) SoC and board support for NS2 SVK board Signed-off-by: Jon Mason Signed-off-by: Ray Jui Reviewed-by: Scott Branden --- Documentation/devicetree/bindings/arm/bcm/ns2.txt |9 ++ arch/arm64/boot/dts/Makefile |1 + arch/arm64/boot/dts/broadcom/Makefile |5 + arch/arm64/boot/dts/broadcom/ns2-svk.dts | 59 +++ arch/arm64/boot/dts/broadcom/ns2.dtsi | 118 + 5 files changed, 192 insertions(+) create mode 100644 Documentation/devicetree/bindings/arm/bcm/ns2.txt create mode 100644 arch/arm64/boot/dts/broadcom/Makefile create mode 100644 arch/arm64/boot/dts/broadcom/ns2-svk.dts create mode 100644 arch/arm64/boot/dts/broadcom/ns2.dtsi diff --git a/Documentation/devicetree/bindings/arm/bcm/ns2.txt b/Documentation/devicetree/bindings/arm/bcm/ns2.txt new file mode 100644 index 000..35f056f --- /dev/null +++ b/Documentation/devicetree/bindings/arm/bcm/ns2.txt @@ -0,0 +1,9 @@ +Broadcom North Star 2 (NS2) device tree bindings + + +Boards with NS2 shall have the following properties: + +Required root node property: + +NS2 SVK board +compatible = "brcm,ns2-svk", "brcm,ns2"; diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile index 38913be..9f95941 100644 --- a/arch/arm64/boot/dts/Makefile +++ b/arch/arm64/boot/dts/Makefile @@ -1,6 +1,7 @@ dts-dirs += amd dts-dirs += apm dts-dirs += arm +dts-dirs += broadcom dts-dirs += cavium dts-dirs += exynos dts-dirs += freescale diff --git a/arch/arm64/boot/dts/broadcom/Makefile b/arch/arm64/boot/dts/broadcom/Makefile new file mode 100644 index 000..e21fe66 --- /dev/null +++ b/arch/arm64/boot/dts/broadcom/Makefile @@ -0,0 +1,5 @@ +dtb-$(CONFIG_ARCH_BCM_IPROC) += ns2-svk.dtb + +always := $(dtb-y) +subdir-y := $(dts-dirs) +clean-files:= *.dtb diff --git a/arch/arm64/boot/dts/broadcom/ns2-svk.dts b/arch/arm64/boot/dts/broadcom/ns2-svk.dts new file mode 100644 index 000..244baf8 --- /dev/null +++ 
b/arch/arm64/boot/dts/broadcom/ns2-svk.dts @@ -0,0 +1,59 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 Broadcom Corporation. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + ** Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + ** Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + ** Neither the name of Broadcom Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +/dts-v1/; + +#include "ns2.dtsi" + +/ { + model = "Broadcom NS2 SVK"; + compatible = "brcm,ns2-svk", "brcm,ns2"; + + aliases { + serial0 = + }; + + chosen { + stdout-path = "serial0:115200n8"; + }; + + memory { + device_type = "memory"; + reg = <0x0 0x8000 0x 0x4000>; + }; + + soc: soc { + uart3: serial@6613 { + status = "ok"; + }; + }; +}; diff --git a/arch/arm64/boot/dts/broadcom/ns2.dtsi b/arch/arm64/boot/dts/broadcom/ns2.dtsi new file mode 100644 index 000..3c92d92 --- /dev/null +++ b/arch/arm64/boot/dts/broadcom/ns2.dtsi @@ -0,0 +1,118 @@ +/* + * BSD LICENSE + * + * Copyright(c) 2015 Broadcom Corporation. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + ** Redistributions of source code must retain the above copyright + *
[PATCH v3 2/4] PCI: iproc: Fix ARM64 dependency in Kconfig
Allow Broadcom iProc PCIe core driver to be compiled for ARM64 Signed-off-by: Ray Jui Reviewed-by: Vikram Prakash Reviewed-by: Scott Branden --- drivers/pci/host/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig index c132bdd..d2c6144 100644 --- a/drivers/pci/host/Kconfig +++ b/drivers/pci/host/Kconfig @@ -117,7 +117,7 @@ config PCI_VERSATILE config PCIE_IPROC tristate "Broadcom iProc PCIe controller" - depends on OF && ARM + depends on OF && (ARM || ARM64) default n help This enables the iProc PCIe core controller support for Broadcom's -- 1.7.9.5
Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces
Ok. Andy, I have stopped and really looked at your patch that is 4/7 in this series, something I had not done before since it sounded totally wrong. That, combined with your earlier comments, means I think I can say something meaningful.

Andy, as I read your patch, the threat you are primarily worried about is chdir(/some/directory/in/another/mnt/ns). I think enhancing nosuid to deal with that case is reasonable, and is unlikely to break userspace. It is one of those hairy security things, so we need to be careful not to introduce a regression.

I think a top-down enhancement of nosuid to just block funny cases that no one cares about is completely sensible. Removing goofy corner cases that no one cares about and that are only good for security exploits seems reasonable.

I am a little concerned that smack does not seem to respect nosuid on filesystems. But that is an issue with nosuid, not with your enhanced nosuid.

Now this patch 3/7 really should be entitled: "Limit file caps to the userns of the super block". It really, really is doing something different. This change is about a bottom-up understanding of what file caps mean on a filesystem mounted by a user namespace root. That is, file caps should only apply to the user namespace root of the root user who mounted the filesystem, because that is all the privileges the mounter of the filesystem had. This guarantees that even if the filesystem somehow propagates with mount propagation, there will be no issues. I think I know how to make that happen...

But deeply and fundamentally limiting a filesystem to only the privileges of its user namespace root, and enhancing nosuid protections, are rather different things.

The approaches show up differently for dealing with uids and gids, as mappings are required.

The approaches will likely continue to show up differently for file caps when Serge implements a version of file caps with a user namespace root in them.

The approaches fundamentally will need to do different things with security xattrs: mnt_may_suid can just treat the filesystem as one without labels, while ultimately the LSMs will have to do something meaningful.

So while in the very narrow case of today's file caps the two approaches are the same, enhancing nosuid is something very different from limiting a filesystem to its mounter's user namespace.

Eric
Re: [V2 6/7] hvsock: introduce Hyper-V VM Sockets feature
From: Dexuan Cui Date: Tue, 14 Jul 2015 03:00:48 -0700 > + pr_debug("hvsock_sk_destruct: called\n"); Debug logging just to state that a function is called is not appropriate; we have very sophisticated tracing facilities in the kernel that can do that transparently, and more. Please remove this. > + if (hvsk->channel) { > + pr_debug("hvsock_sk_destruct: calling vmbus_close()\n"); Likewise, these kinds of debug logs are totally inappropriate. > +static int hvsock_release(struct socket *sock) > +{ > + /* sock->sk is NULL, if accept() is interrupted by a signal */ > + if (sock->sk) { > + __hvsock_release(sock->sk); > + sock->sk = NULL; > + } > + > + sock->state = SS_FREE; > + pr_debug("hvsock_release called\n\n"); Likewise.
LKML archives at UofI down?
The LKML archives once present at http://lkml.iu.edu/hypermail/linux/kernel/index.html seem to be down; http://lkml.iu.edu/hypermail/ appears empty. Does anyone know what happened to it? - Josh Triplett
Re: [V2 3/7] Drivers: hv: vmbus: add APIs to send/recv hvsock packet and get the r/w-ability
From: Dexuan Cui Date: Tue, 14 Jul 2015 02:58:56 -0700 > +int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, void *buf, u32 > len) > +{ > + struct vmpacket_descriptor desc; > + struct vmpipe_proto_header pipe_hdr; > + u32 packetlen; > + u32 packetlen_aligned; > + struct kvec bufferlist[4]; > + u64 aligned_data = 0; > + int ret; > + bool signal = false; Please put these local variables in reverse christmas-tree order (longest line to shortest). > +EXPORT_SYMBOL(vmbus_sendpacket_hvsock); Please use EXPORT_SYMBOL_GPL(). > +int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void *buffer, > + u32 bufferlen, u32 *buffer_actual_len) > +{ > + struct vmpacket_descriptor *desc; > + struct vmpipe_proto_header *pipe_hdr; > + u32 packet_len, payload_len; > + int ret; > + bool signal = false; Again, please use reverse christmas-tree order. > +void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel, > +bool *can_read, bool *can_write) The second line is not properly indented; it should start exactly one column after the opening parenthesis on the previous line. > + hv_get_ringbuffer_availbytes(inring_info, > + bytes_avail_toread, > + bytes_avail_towrite); Again, improperly indented. > +extern int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, > + void *buf, u32 len); > + Likewise. > +extern int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void > *buffer, > + u32 bufferlen, u32 *buffer_actual_len); > + > +extern void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel, > +bool *can_read, bool *can_write); Likewise.
Re: [PATCH net-next] hv_netvsc: Add close of RNDIS filter into change mtu call
From: Haiyang Zhang Date: Mon, 13 Jul 2015 13:09:16 -0700 > The current change mtu call only stops tx before removing RNDIS filter. > In case the ring buffer is not empty, the rndis_filter_device_remove() may > hang on removing the buffers. > > This patch adds close of RNDIS filter before removing it, also a > gradual waiting loop until the ring is empty. The change_mtu hang > issue under heavy traffic is solved by this patch. > > Signed-off-by: Haiyang Zhang > Reviewed-by: K. Y. Srinivasan Applied, thanks.
Re: [3/3] IRQ: Print "unexpected IRQ" messages consistently across architectures
On Mon, 2015-07-13 at 13:35 -0500, Bjorn Helgaas wrote: > On Sun, Jul 12, 2015 at 10:23 PM, Michael Ellerman > wrote: > > On Sun, 2015-12-07 at 22:02:11 UTC, Bjorn Helgaas wrote: > >> Many architectures use a variant of "unexpected IRQ trap at vector %x" to > >> log unexpected IRQs. This is confusing because (a) it prints the Linux IRQ > >> number, but "vector" more often refers to a CPU vector number, and (b) it > >> prints the IRQ number in hex with no base indication, while Linux IRQ > >> numbers are usually printed in decimal. > >> > >> Print the same text ("unexpected IRQ %d") across all architectures. > >> > >> No functional change other than the output text. > > > > There's already a fallback version in asm-generic, so shouldn't you instead > > just delete all the versions that are identical to that? > > > > eg. on powerpc we have: > > > >> static inline void ack_bad_irq(unsigned int irq) > >> { > >> - printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq); > >> + printk(KERN_CRIT "unexpected IRQ %d\n", irq); > >> } > > > > And the generic version is: > > > >> #ifndef ack_bad_irq > >> static inline void ack_bad_irq(unsigned int irq) > >> { > >> - printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq); > >> + printk(KERN_CRIT "unexpected IRQ %d\n", irq); > >> } > >> #endif > > > > So we can just delete the powerpc version? > > Wow, I really didn't do my homework here. Not only is there a generic > version already, but there's also print_irq_desc(), which prints way > more information than any of the ack_bad_irq() implementations. Even better :) > I'll try again :) Thanks. cheers
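The "#ifndef ack_bad_irq" fallback quoted above can be sketched in userspace as follows. This is only an illustration under stated assumptions: ack_bad_irq_sketch is a hypothetical name, and snprintf() into a caller-supplied buffer stands in for printk() so the output can be checked. An architecture wanting its own version would provide the function and define a macro of the same name before this point; nothing does so here, so the generic body is compiled in:

```c
#include <stdio.h>

/*
 * Generic fallback, compiled only when no earlier header has claimed the
 * name by defining a macro called ack_bad_irq_sketch.
 */
#ifndef ack_bad_irq_sketch
static inline int ack_bad_irq_sketch(char *buf, size_t len, unsigned int irq)
{
	/* Decimal IRQ number, matching the text proposed in the thread. */
	return snprintf(buf, len, "unexpected IRQ %u\n", irq);
}
#endif
```

With this pattern an architecture-specific copy that is identical to the fallback adds nothing, which is the point raised in the review.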
Re: [V2 1/7] Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)
From: Dexuan Cui Date: Tue, 14 Jul 2015 02:58:03 -0700 > A helper function is also added. > > Signed-off-by: Dexuan Cui > --- > include/linux/hyperv.h | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h > index 30d3a1f..aa21814 100644 > --- a/include/linux/hyperv.h > +++ b/include/linux/hyperv.h > @@ -236,6 +236,7 @@ struct vmbus_channel_offer { > #define VMBUS_CHANNEL_LOOPBACK_OFFER 0x100 > #define VMBUS_CHANNEL_PARENT_OFFER 0x200 > #define VMBUS_CHANNEL_REQUEST_MONITORED_NOTIFICATION 0x400 > +#define VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER 0x2000 > > struct vmpacket_descriptor { > u16 type; > @@ -758,6 +759,12 @@ struct vmbus_channel { > struct list_head percpu_list; > }; > > +static inline bool is_hvsock_channel(const struct vmbus_channel *c) > +{ > + return !!(c->offermsg.offer.chn_flags & > + VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER); > +} > + This is not indented properly, plus it makes no sense to add a flag before anyone even sets the flag.
Re: [PATCH 1/3] KVM: MTRR: fix memory type handling if MTRR is completely disabled
On Thu, 2015-07-16 at 03:25 +0800, Xiao Guangrong wrote: > From: Xiao Guangrong > > Currently code uses default memory type if MTRR is fully disabled, > fix it by using UC instead > > Signed-off-by: Xiao Guangrong > --- Seems to work for me. I don't see a 0th patch, but for the series: Tested-by: Alex Williamson Thanks! > arch/x86/kvm/mtrr.c | 21 - > 1 file changed, 20 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c > index de1d2d8..e275013 100644 > --- a/arch/x86/kvm/mtrr.c > +++ b/arch/x86/kvm/mtrr.c > @@ -120,6 +120,16 @@ static u8 mtrr_default_type(struct kvm_mtrr *mtrr_state) > return mtrr_state->deftype & IA32_MTRR_DEF_TYPE_TYPE_MASK; > } > > +static u8 mtrr_disabled_type(void) > +{ > + /* > + * Intel SDM 11.11.2.2: all MTRRs are disabled when > + * IA32_MTRR_DEF_TYPE.E bit is cleared, and the UC > + * memory type is applied to all of physical memory. > + */ > + return MTRR_TYPE_UNCACHABLE; > +} > + > /* > * Three terms are used in the following code: > * - segment, it indicates the address segments covered by fixed MTRRs. > @@ -434,6 +444,8 @@ struct mtrr_iter { > > /* output fields. */ > int mem_type; > + /* mtrr is completely disabled? */ > + bool mtrr_disabled; > /* [start, end) is not fully covered in MTRRs? 
*/ > bool partial_map; > > @@ -549,7 +561,7 @@ static void mtrr_lookup_var_next(struct mtrr_iter *iter) > static void mtrr_lookup_start(struct mtrr_iter *iter) > { > if (!mtrr_is_enabled(iter->mtrr_state)) { > - iter->partial_map = true; > + iter->mtrr_disabled = true; > return; > } > > @@ -563,6 +575,7 @@ static void mtrr_lookup_init(struct mtrr_iter *iter, > iter->mtrr_state = mtrr_state; > iter->start = start; > iter->end = end; > + iter->mtrr_disabled = false; > iter->partial_map = false; > iter->fixed = false; > iter->range = NULL; > @@ -656,6 +669,9 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, > gfn_t gfn) > return MTRR_TYPE_WRBACK; > } > > + if (iter.mtrr_disabled) > + return mtrr_disabled_type(); > + > /* It is not covered by MTRRs. */ > if (iter.partial_map) { > /* > @@ -689,6 +705,9 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu > *vcpu, gfn_t gfn, > return false; > } > > + if (iter.mtrr_disabled) > + return true; > + > if (!iter.partial_map) > return true; > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
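The rule this patch encodes (UC everywhere once MTRRs are fully disabled) can be sketched in isolation. This is a minimal illustration, not the KVM code: effective_default_type() is a hypothetical helper, and the constants assume the IA32_MTRR_DEF_TYPE layout with the enable flag E in bit 11 and the default type in bits 7:0:

```c
#include <stdint.h>

#define MTRR_TYPE_UNCACHABLE	0u
#define MTRR_TYPE_WRBACK	6u

/* Assumed layout of IA32_MTRR_DEF_TYPE: E is bit 11, type is bits 7:0. */
#define DEF_TYPE_E		(1u << 11)
#define DEF_TYPE_TYPE_MASK	0xffu

/*
 * Hypothetical helper: the memory type that applies to an address no
 * MTRR covers, given the raw IA32_MTRR_DEF_TYPE value.
 */
static uint8_t effective_default_type(uint64_t deftype)
{
	/* E cleared: all MTRRs are off and UC applies to all memory. */
	if (!(deftype & DEF_TYPE_E))
		return MTRR_TYPE_UNCACHABLE;

	/* E set: the programmed default type is used. */
	return (uint8_t)(deftype & DEF_TYPE_TYPE_MASK);
}
```

The case the patch fixes is the first branch: a guest that programs a write-back default but leaves E clear must still get UC.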
Re: [Question] How to implement GPIO driver for sparse hw numbers?
Hi Linus, 2015-07-15 7:04 GMT+09:00 Linus Walleij : > On Fri, Jun 19, 2015 at 5:27 AM, Masahiro Yamada > wrote: > >> In my understanding, the GPIO driver framework requires that >> the hw numbers should be contiguous within each GPIO chip. > > Yes but no one says that .request() to the driver has to succeed > on every GPIO so just cover all GPIOs from 0 to 307 with > your GPIO chip and then implement your "holes" in the GPIO > range from 0 to 307 by letting .request() fail. Thanks. At first I also thought about that, but in the end I did not adopt it. Having holes in the GPIO range is not handy because: [1] When we map a gpio range into a pin range, we must divide the "gpio-ranges" property into many lines gpio-ranges =
Re: [PATCH] sm750fb: coding style fixes lines over 80 chars
On Thu, 2015-07-16 at 00:16 +0530, Vinay Simha BN wrote: > scripts/checkpatch.pl kernel coding style fixes of WARNING Please don't be a checkpatch robot. Use tools to prompt your brain, but don't ever turn your brain off. > diff --git a/drivers/staging/sm750fb/ddk750_help.h > b/drivers/staging/sm750fb/ddk750_help.h > +/* if 718 big endian turned on,be aware that don't use this driver for > general > + use,only for ppc big-endian */ > +#warning "big endian on target cpu and enable nature big endian support of > 718 > + capability !" Yes, this is #if 0, but it's also obviously incorrect. I didn't look at the rest.
[PATCH 1/1] ath10k: fixing wrong initialization of struct channel
chandef is initialized with NULL and on the very next line, we are using it to get channel, which is not correct. channel should be initialized after obtaining chandef. Signed-off-by: Maninder Singh --- drivers/net/wireless/ath/ath10k/mac.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index 218b6af..3d196b5 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -836,7 +836,7 @@ static inline int ath10k_vdev_setup_sync(struct ath10k *ar) static int ath10k_monitor_vdev_start(struct ath10k *ar, int vdev_id) { struct cfg80211_chan_def *chandef = NULL; - struct ieee80211_channel *channel = chandef->chan; + struct ieee80211_channel *channel = NULL; struct wmi_vdev_start_request_arg arg = {}; int ret = 0; -- 1.7.9.5
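The ordering problem described in this patch can be sketched outside the driver. Everything below is a hypothetical stand-in (struct chan_def, struct channel, lookup_chandef() and start_monitor() are not ath10k or cfg80211 names): the dependent pointer starts as NULL and is assigned only after the pointer it derives from has been obtained and checked.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the channel and channel-definition types. */
struct channel {
	int center_freq;
};

struct chan_def {
	struct channel *chan;
};

/* Hypothetical lookup: returns NULL when idx has no channel definition. */
static struct chan_def *lookup_chandef(struct chan_def *table, size_t n,
				       size_t idx)
{
	return idx < n ? &table[idx] : NULL;
}

static int start_monitor(struct chan_def *table, size_t n, size_t idx)
{
	struct chan_def *chandef = NULL;
	/* Not "= chandef->chan": chandef is still NULL at this point. */
	struct channel *channel = NULL;

	chandef = lookup_chandef(table, n, idx);
	if (!chandef || !chandef->chan)
		return -1;

	/* Safe now: chandef has been obtained and checked. */
	channel = chandef->chan;
	return channel->center_freq;
}
```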
Re: linux-next: build failure after merge of the rcu tree
On Thu, Jul 16, 2015 at 01:14:23PM +1000, Stephen Rothwell wrote: > Hi Paul, > > After merging the rcu tree, today's linux-next build (arm > multi_v7_defconfig) failed like this: > > kernel/notifier.c: In function 'notify_die': > kernel/notifier.c:547:2: error: implicit declaration of function > 'rcu_lockdep_assert' [-Werror=implicit-function-declaration] > rcu_lockdep_assert(rcu_is_watching(), > ^ > > Caused by commit > > 02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()") > > interacting with commit > > e727c7d7a11e ("notifiers, RCU: Assert that RCU is watching in notify_die()") > > [ and I also noted > 0333a209cbf6 ("x86/irq, context_tracking: Document how IRQ context tracking > works and add an RCU assertion") > ] > > from the tip tree. Thank you in both cases! I suspect that more will follow, so is there something I can do to make this easier? (Hard for me to patch stuff that is not yet in the tree...) Thanx, Paul > I added the following merge fix patch: > > From: Stephen Rothwell > Date: Thu, 16 Jul 2015 13:08:50 +1000 > Subject: [PATCH] rcu: merge fix for Rename rcu_lockdep_assert() to > RCU_LOCKDEP_WARN() > > Signed-off-by: Stephen Rothwell > --- > arch/x86/kernel/irq.c | 2 +- > kernel/notifier.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c > index 30dbf35bc90b..f9cd81825187 100644 > --- a/arch/x86/kernel/irq.c > +++ b/arch/x86/kernel/irq.c > @@ -234,7 +234,7 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs > *regs) > entering_irq(); > > /* entering_irq() tells RCU that we're not quiescent. Check it. 
*/ > - rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU"); > + RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU"); > > irq = __this_cpu_read(vector_irq[vector]); > > diff --git a/kernel/notifier.c b/kernel/notifier.c > index 980e4330fb59..fd2c9acbcc19 100644 > --- a/kernel/notifier.c > +++ b/kernel/notifier.c > @@ -544,7 +544,7 @@ int notrace notify_die(enum die_val val, const char *str, > .signr = sig, > > }; > - rcu_lockdep_assert(rcu_is_watching(), > + RCU_LOCKDEP_WARN(!rcu_is_watching(), > "notify_die called but RCU thinks we're quiescent"); > return atomic_notifier_call_chain(&die_chain, val, &args); > } > -- > 2.1.4 > > -- > Cheers, > Stephen Rothwells...@canb.auug.org.au
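The merge fix above hinges on the two macros taking opposite conditions: the old one complains when its argument is false, the new one when its argument is true, so every converted call site has to negate its condition. A minimal userspace sketch with hypothetical stand-in macros (the real kernel macros are lockdep-dependent and do more than bump a counter):

```c
#include <stdbool.h>

static int warnings;	/* incremented each time a check fires */

/* Old convention (sketch): fire when the condition is FALSE. */
#define lockdep_assert_sketch(cond, msg) \
	do { if (!(cond)) warnings++; } while (0)

/* New convention (sketch): fire when the condition is TRUE. */
#define LOCKDEP_WARN_SKETCH(cond, msg) \
	do { if (cond) warnings++; } while (0)

/* Returns 1 if the old-style check fired, 0 otherwise. */
static int old_style(bool watching)
{
	warnings = 0;
	lockdep_assert_sketch(watching, "RCU is not watching");
	return warnings;
}

/* Same check after conversion: note the added negation. */
static int new_style(bool watching)
{
	warnings = 0;
	LOCKDEP_WARN_SKETCH(!watching, "RCU is not watching");
	return warnings;
}
```

Both helpers fire in exactly the same situations, which is the property a mechanical conversion has to preserve.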
Re: [PATCH v3] gpio: UniPhier: add driver for UniPhier GPIO controller
Hi Linus, 2015-07-15 23:15 GMT+09:00 Linus Walleij : > On Tue, Jul 14, 2015 at 4:43 AM, Masahiro Yamada > wrote: > >> This GPIO controller device is used on UniPhier SoCs. >> >> Signed-off-by: Masahiro Yamada >> --- >> >> Changes in v3: >> - Use module_platform_driver() >> >> Changes in v2: >> - Fix typos in the comment block > > OK why no device tree bindings? Are they in a separate patch? Sorry, I was planning to do it later. OK. I will come back with Documentation/devicetree/bindings/gpio/uniphier-gpio.txt in binding info in it. >> +/* >> + * Unfortunately, the hardware specification adopts weird GPIO pin labeling. >> + * The ports are named as >> + * PORT00, PORT01, PORT02, ..., PORT07, >> + * PORT10, PORT11, PORT12, ..., PORT17, >> + * PORT20, PORT21, PORT22, ..., PORT27, >> + *... >> + * PORT90, PORT91, PORT92, ..., PORT97, >> + * PORT100, PORT101, PORT102, ..., PORT107, >> + *... >> + * >> + * The PORTs with 8 or 9 in the one's place are missing, i.e. the one's >> place >> + * is octal, while the other places are decimal. If we handle the port >> numbers >> + * as seen in the hardware documents, the GPIO offsets must be >> non-contiguous. >> + * It is possible to have sparse GPIO pins, but not handy for GPIO range >> + * mappings, register accessing, etc. >> + * >> + * To make things simpler (for driver and device tree implementation), this >> + * driver takes contiguously-numbered GPIO offsets. GPIO consumers should >> make >> + * sure to convert the PORT number into the one that fits in this driver. 
>> + * The conversion logic is very easy math, for example, >> + * PORT15 --> GPIO offset 13 (8 * 1 + 5) >> + * PORT123 --> GPIO offset 99 (8 * 12 + 3) >> + */ >> +#define UNIPHIER_GPIO_PORTS_PER_BANK 8 >> +#define UNIPHIER_GPIO_BANK_MASK\ >> + ((1UL << (UNIPHIER_GPIO_PORTS_PER_BANK)) - 1) > > > >> + >> +#define UNIPHIER_GPIO_REG_DATA 0 /* data */ >> +#define UNIPHIER_GPIO_REG_DIR 4 /* direction (1:in, 0:out) */ >> + >> +struct uniphier_gpio_priv { >> + struct of_mm_gpio_chip mmchip; >> + spinlock_t lock; >> +}; >> + >> +static unsigned uniphier_gpio_bank_to_reg(unsigned bank, unsigned reg_type) >> +{ >> + unsigned reg; >> + >> + reg = (bank + 1) * 8 + reg_type; >> + >> + /* >> +* Unfortunately, there is a register hole at offset 0x90-0x9f. >> +* Add 0x10 when crossing the hole. >> +*/ >> + if (reg >= 0x90) >> + reg += 0x10; >> + >> + return reg; >> +} >> + >> +static void uniphier_gpio_bank_write(struct gpio_chip *chip, >> +unsigned bank, unsigned reg_type, >> +unsigned mask, unsigned value) >> +{ >> + struct of_mm_gpio_chip *mmchip = to_of_mm_gpio_chip(chip); >> + struct uniphier_gpio_priv *priv; >> + unsigned long flags; >> + unsigned reg; >> + u32 tmp; >> + >> + if (!mask) >> + return; >> + >> + priv = container_of(mmchip, struct uniphier_gpio_priv, mmchip); >> + >> + reg = uniphier_gpio_bank_to_reg(bank, reg_type); >> + >> + /* >> +* Note >> +* regmap_update_bits() should not be used here. >> +* >> +* The DATA registers return the current readback of pins, not the >> +* previously written data when they are configured as "input". >> +* The DATA registers must be overwritten even if the data you are >> +* going to write is the same as what readl() has returned. >> +* >> +* regmap_update_bits() does not write back if the data is not >> changed. >> +*/ > > Why is this mentioned when the driver doesn't even use regmap? > Development artifact? At first, I thought regmap_update_bits() might be useful, but it tuned out a bad idea. 
Anyway, it did not use regmap in this driver, so this comment sounds a bit weird. I will delete it in v4. >> +static int uniphier_gpio_get_direction(struct gpio_chip *chip, unsigned >> offset) >> +{ >> + return uniphier_gpio_offset_read(chip, UNIPHIER_GPIO_REG_DIR, >> offset) ? >> + GPIOF_DIR_IN : GPIOF_DIR_OUT; > > Just use > return !!uniphier_gpio_offset_read(chip, UNIPHIER_GPIO_REG_DIR, offset); OK, will fix. >> +static int uniphier_gpio_get(struct gpio_chip *chip, unsigned offset) >> +{ >> + return uniphier_gpio_offset_read(chip, offset, >> UNIPHIER_GPIO_REG_DATA); > > return !!uniphier_gpio_offset_read(chip, offset, UNIPHIER_GPIO_REG_DATA); Likewise. >> +static void uniphier_gpio_set_multiple(struct gpio_chip *chip, >> + unsigned long *mask, >> + unsigned long *bits) >> +{ >> + unsigned bank, shift, bank_mask, bank_bits; >> + int i; >> + >> + for (i = 0; i < chip->ngpio; i +=
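The PORT-to-offset conversion described in the quoted comment block, and the register-hole adjustment in uniphier_gpio_bank_to_reg(), can be checked with a small standalone sketch. The helper names below are hypothetical; the arithmetic follows the quoted patch (ones place octal, higher places decimal, 8 ports per bank, 0x10 added once the register offset reaches 0x90):

```c
#include <assert.h>

#define PORTS_PER_BANK	8

/* Hypothetical helper: documented PORT number -> contiguous GPIO offset. */
static unsigned int port_to_gpio_offset(unsigned int port)
{
	unsigned int bank = port / 10;	/* decimal places select the bank */
	unsigned int line = port % 10;	/* ones place selects the line */

	assert(line < PORTS_PER_BANK);	/* PORTx8 and PORTx9 do not exist */
	return bank * PORTS_PER_BANK + line;
}

/* Same arithmetic as uniphier_gpio_bank_to_reg() in the quoted patch. */
static unsigned int bank_to_reg(unsigned int bank, unsigned int reg_type)
{
	unsigned int reg = (bank + 1) * 8 + reg_type;

	/* Skip the register hole at offsets 0x90-0x9f. */
	if (reg >= 0x90)
		reg += 0x10;

	return reg;
}
```

For the two examples given in the comment, PORT15 maps to offset 13 and PORT123 maps to offset 99.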
Re: [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian
On Wed, 2015-07-15 at 08:16 -0700, Kees Cook wrote: > On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman > wrote: > > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c > > b/tools/testing/selftests/seccomp/seccomp_bpf.c > > index b2374c131340..51adb9afb511 100644 > > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c > > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c > > @@ -82,7 +82,13 @@ struct seccomp_data { > > }; > > #endif > > > > +#if __BYTE_ORDER == __LITTLE_ENDIAN > > #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n])) > > +#elif __BYTE_ORDER == __BIG_ENDIAN > > +#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]) + > > sizeof(__u32)) > > +#else > > +#error "wut?" > > +#endif > > Ah-ha! Yes, thanks. Could you change the #error to something that > describes the particular (impossible) failure condition? "wut? Unknown > __BYTE_ORDER?!". Not a huge deal, but I always like verbose errors. :) > Especially for "impossible" situations. :) Yeah sorry that was a "quick hack" which got promoted into an actual patch. Fixed to use your message. cheers
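Why the big-endian variant adds sizeof(__u32) can be shown with a small sketch. A filter that loads one 32-bit word of a 64-bit syscall argument wants the low half, which sits at byte offset 0 of the argument on little endian and at offset 4 on big endian. The struct below mirrors the layout of struct seccomp_data but is a local stand-in, and the helpers are hypothetical; endianness is probed at run time here, whereas the patch uses __BYTE_ORDER at compile time:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in with the same field layout as struct seccomp_data. */
struct seccomp_data_sketch {
	int nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Run-time probe: is the first byte of a 16-bit 1 zero? */
static int host_is_big_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, sizeof(first));
	return first == 0;
}

/* Byte offset of the LOW 32 bits of args[n]. */
static size_t syscall_arg_lo32(unsigned int n)
{
	size_t off = offsetof(struct seccomp_data_sketch, args) +
		     n * sizeof(uint64_t);

	if (host_is_big_endian())
		off += sizeof(uint32_t);	/* low half comes second */

	return off;
}

/* Load the 32-bit word a filter would see at that offset. */
static uint32_t load_lo32(const struct seccomp_data_sketch *sd, unsigned int n)
{
	uint32_t v;

	memcpy(&v, (const unsigned char *)sd + syscall_arg_lo32(n), sizeof(v));
	return v;
}
```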
Re: [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks
On Wed, 2015-07-15 at 08:12 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index c5abe7fd7590..b2374c131340 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -645,6 +645,10 @@ static struct siginfo TRAP_info;
> >  static volatile int TRAP_nr;
> >  static void TRAP_action(int nr, siginfo_t *info, void *void_context)
> >  {
> > +	fprintf(stderr, "in TRAP_action\n");
> > +	fprintf(stderr, "info->si_call_addr %p\n", info->si_call_addr);
> > +	fprintf(stderr, "info->si_syscall %u\n", info->si_syscall);
> > +	fprintf(stderr, "info->si_arch %u\n", info->si_arch);
> >  	memcpy(&TRAP_info, info, sizeof(TRAP_info));
> >  	TRAP_nr = nr;
> >  }
>
> This chunk looks like left-over debugging?

Urgh yep, that's ugly. Thanks for noticing.

Will remove before merging :)

cheers
Re: [PATCH 0/7] Initial support for user namespace owned mounts
Seth

I think for the LSMs we should start with:

diff --git a/security/security.c b/security/security.c
index 062f3c997fdc..5b6ece92a8e5 100644
--- a/security/security.c
+++ b/security/security.c
@@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
 int security_sb_mount(const char *dev_name, struct path *path,
                        const char *type, unsigned long flags, void *data)
 {
+	if (current_user_ns() != &init_user_ns)
+		return -EPERM;
 	return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
 }

Then we should push this down into all of the lsms. Then we should
remove or relax or change the check as appropriate in each lsm.

The point is this is good enough to see that it is trivially safe, and
this allows us to focus on the core issues, and stop worrying about the
lsms for a bit.

Then we can focus on each lsm one at a time and take the time to really
understand them and talk with their maintainers etc to make certain we
get things correct.

This should remove the need for your patches 5, 6 and 7, for the
immediate future.

Eric
[PATCH 3/4] block: partition: introduce 'cpu' para to part_inc|dec_in_flight
So that it is easier to convert part->in_flight[rw] into percpu variable
in the following patch.

Signed-off-by: Ming Lei
---
 block/bio.c           | 4 ++--
 block/blk-core.c      | 4 ++--
 block/blk-merge.c     | 2 +-
 drivers/nvdimm/core.c | 4 ++--
 include/linux/genhd.h | 4 ++--
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 2a00d34..fe8807f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1724,7 +1724,7 @@ void generic_start_io_acct(int rw, unsigned long sectors,
 	part_round_stats(cpu, part);
 	part_stat_inc(cpu, part, ios[rw]);
 	part_stat_add(cpu, part, sectors[rw], sectors);
-	part_inc_in_flight(part, rw);
+	part_inc_in_flight(cpu, part, rw);
 
 	part_stat_unlock();
 }
@@ -1738,7 +1738,7 @@ void generic_end_io_acct(int rw, struct hd_struct *part,
 
 	part_stat_add(cpu, part, ticks[rw], duration);
 	part_round_stats(cpu, part);
-	part_dec_in_flight(part, rw);
+	part_dec_in_flight(cpu, part, rw);
 
 	part_stat_unlock();
 }
diff --git a/block/blk-core.c b/block/blk-core.c
index 82819e6..f180a6d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2194,7 +2194,7 @@ void blk_account_io_done(struct request *req)
 		part_stat_inc(cpu, part, ios[rw]);
 		part_stat_add(cpu, part, ticks[rw], duration);
 		part_round_stats(cpu, part);
-		part_dec_in_flight(part, rw);
+		part_dec_in_flight(cpu, part, rw);
 
 		hd_struct_put(part);
 		part_stat_unlock();
@@ -2252,7 +2252,7 @@ void blk_account_io_start(struct request *rq, bool new_io)
 			hd_struct_get(part);
 		}
 		part_round_stats(cpu, part);
-		part_inc_in_flight(part, rw);
+		part_inc_in_flight(cpu, part, rw);
 		rq->part = part;
 	}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 30a0d9f..cb7c46d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -449,7 +449,7 @@ static void blk_account_io_merge(struct request *req)
 		part = req->part;
 
 		part_round_stats(cpu, part);
-		part_dec_in_flight(part, rq_data_dir(req));
+		part_dec_in_flight(cpu, part, rq_data_dir(req));
 
 		hd_struct_put(part);
 		part_stat_unlock();
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index cb62ec6..053d026 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -224,7 +224,7 @@ void __nd_iostat_start(struct bio *bio, unsigned long *start)
 	part_round_stats(cpu, &disk->part0);
 	part_stat_inc(cpu, &disk->part0, ios[rw]);
 	part_stat_add(cpu, &disk->part0, sectors[rw], bio_sectors(bio));
-	part_inc_in_flight(&disk->part0, rw);
+	part_inc_in_flight(cpu, &disk->part0, rw);
 	part_stat_unlock();
 }
 EXPORT_SYMBOL(__nd_iostat_start);
@@ -238,7 +238,7 @@ void nd_iostat_end(struct bio *bio, unsigned long start)
 
 	part_stat_add(cpu, &disk->part0, ticks[rw], duration);
 	part_round_stats(cpu, &disk->part0);
-	part_dec_in_flight(&disk->part0, rw);
+	part_dec_in_flight(cpu, &disk->part0, rw);
 	part_stat_unlock();
 }
 EXPORT_SYMBOL(nd_iostat_end);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6..612ae80 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -381,14 +381,14 @@ static inline void free_part_stats(struct hd_struct *part)
 #define part_stat_sub(cpu, gendiskp, field, subnd)			\
 	part_stat_add(cpu, gendiskp, field, -subnd)
 
-static inline void part_inc_in_flight(struct hd_struct *part, int rw)
+static inline void part_inc_in_flight(int cpu, struct hd_struct *part, int rw)
 {
 	atomic_inc(&part->in_flight[rw]);
 	if (part->partno)
 		atomic_inc(&part_to_disk(part)->part0.in_flight[rw]);
 }
 
-static inline void part_dec_in_flight(struct hd_struct *part, int rw)
+static inline void part_dec_in_flight(int cpu, struct hd_struct *part, int rw)
 {
 	atomic_dec(&part->in_flight[rw]);
 	if (part->partno)
-- 
1.9.1
[PATCH 2/4] block: partition: convert percpu ref
Percpu refcount is the perfect match for partition's case, and the conversion is quite straight. With the convertion, one pair of atomic inc/dec can be saved for accounting block I/O, which is run in hot path of block I/O. Signed-off-by: Ming Lei --- block/genhd.c | 6 +- block/partition-generic.c | 9 + include/linux/genhd.h | 27 +-- 3 files changed, 27 insertions(+), 15 deletions(-) diff --git a/block/genhd.c b/block/genhd.c index ed3f5b9..3213b66 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -1284,7 +1284,11 @@ struct gendisk *alloc_disk_node(int minors, int node_id) * converted to make use of bd_mutex and sequence counters. */ seqcount_init(>part0.nr_sects_seq); - hd_ref_init(>part0); + if (hd_ref_init(>part0)) { + hd_free_part(>part0); + kfree(disk); + return NULL; + } disk->minors = minors; rand_initialize_disk(disk); diff --git a/block/partition-generic.c b/block/partition-generic.c index eca0d02..e771113 100644 --- a/block/partition-generic.c +++ b/block/partition-generic.c @@ -232,8 +232,9 @@ static void delete_partition_rcu_cb(struct rcu_head *head) put_device(part_to_dev(part)); } -void __delete_partition(struct hd_struct *part) +void __delete_partition(struct percpu_ref *ref) { + struct hd_struct *part = container_of(ref, struct hd_struct, ref); call_rcu(>rcu_head, delete_partition_rcu_cb); } @@ -254,7 +255,7 @@ void delete_partition(struct gendisk *disk, int partno) kobject_put(part->holder_dir); device_del(part_to_dev(part)); - hd_struct_put(part); + hd_struct_kill(part); } static ssize_t whole_disk_show(struct device *dev, @@ -355,8 +356,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno, if (!dev_get_uevent_suppress(ddev)) kobject_uevent(>kobj, KOBJ_ADD); - hd_ref_init(p); - return p; + if (!hd_ref_init(p)) + return p; out_free_info: free_part_info(p); diff --git a/include/linux/genhd.h b/include/linux/genhd.h index a221220..2adbfa6 100644 --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -13,6 +13,7 @@ #include 
#include #include +#include #ifdef CONFIG_BLOCK @@ -124,7 +125,7 @@ struct hd_struct { #else struct disk_stats dkstats; #endif - atomic_t ref; + struct percpu_ref ref; struct rcu_head rcu_head; }; @@ -611,7 +612,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk, sector_t len, int flags, struct partition_meta_info *info); -extern void __delete_partition(struct hd_struct *); +extern void __delete_partition(struct percpu_ref *); extern void delete_partition(struct gendisk *, int); extern void printk_all_partitions(void); @@ -640,33 +641,39 @@ extern ssize_t part_fail_store(struct device *dev, const char *buf, size_t count); #endif /* CONFIG_FAIL_MAKE_REQUEST */ -static inline void hd_ref_init(struct hd_struct *part) +static inline int hd_ref_init(struct hd_struct *part) { - atomic_set(>ref, 1); - smp_mb(); + if (percpu_ref_init(>ref, __delete_partition, 0, + GFP_KERNEL)) + return -ENOMEM; + return 0; } static inline void hd_struct_get(struct hd_struct *part) { - atomic_inc(>ref); - smp_mb__after_atomic(); + percpu_ref_get(>ref); } static inline int hd_struct_try_get(struct hd_struct *part) { - return atomic_inc_not_zero(>ref); + return percpu_ref_tryget_live(>ref); } static inline void hd_struct_put(struct hd_struct *part) { - if (atomic_dec_and_test(>ref)) - __delete_partition(part); + percpu_ref_put(>ref); +} + +static inline void hd_struct_kill(struct hd_struct *part) +{ + percpu_ref_kill(>ref); } static inline void hd_free_part(struct hd_struct *part) { free_part_stats(part); free_part_info(part); + percpu_ref_exit(>ref); } /* -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/4] block: account io: convert part->in_fligh[] into percpu variable
So the atomic operations for accounting block I/O can be killed completely, and it is OK to add the percpu variables in part_in_flight() because the function is run at most one time in every tick. Signed-off-by: Ming Lei --- block/blk-core.c | 1 + block/partition-generic.c | 5 +++-- drivers/md/dm.c | 10 ++ include/linux/genhd.h | 24 ++-- 4 files changed, 28 insertions(+), 12 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index f180a6d..0001d4c 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1344,6 +1344,7 @@ static void part_round_stats_single(int cpu, struct hd_struct *part, if (now == part->stamp) return; + /* at most one percpu addition per one tick */ inflight = part_in_flight(part); if (inflight) { __part_stat_add(cpu, part, time_in_queue, diff --git a/block/partition-generic.c b/block/partition-generic.c index e771113..0a553e7 100644 --- a/block/partition-generic.c +++ b/block/partition-generic.c @@ -140,8 +140,9 @@ ssize_t part_inflight_show(struct device *dev, { struct hd_struct *p = dev_to_part(dev); - return sprintf(buf, "%8u %8u\n", atomic_read(>in_flight[0]), - atomic_read(>in_flight[1])); + return sprintf(buf, "%8u %8u\n", + part_stat_read(p, in_flight[0]), + part_stat_read(p, in_flight[1])); } #ifdef CONFIG_FAIL_MAKE_REQUEST diff --git a/drivers/md/dm.c b/drivers/md/dm.c index de70377..1b6d8be 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -651,9 +651,9 @@ static void start_io_acct(struct dm_io *io) cpu = part_stat_lock(); part_round_stats(cpu, _disk(md)->part0); + part_stat_set(cpu, _disk(md)->part0, in_flight[rw], + atomic_inc_return(>pending[rw])); part_stat_unlock(); - atomic_set(_disk(md)->part0.in_flight[rw], - atomic_inc_return(>pending[rw])); if (unlikely(dm_stats_used(>stats))) dm_stats_account_io(>stats, bio->bi_rw, bio->bi_iter.bi_sector, @@ -665,7 +665,7 @@ static void end_io_acct(struct dm_io *io) struct mapped_device *md = io->md; struct bio *bio = io->bio; unsigned long duration = jiffies - io->start_time; 
- int pending; + int pending, cpu; int rw = bio_data_dir(bio); generic_end_io_acct(rw, _disk(md)->part0, io->start_time); @@ -679,7 +679,9 @@ static void end_io_acct(struct dm_io *io) * a flush. */ pending = atomic_dec_return(>pending[rw]); - atomic_set(_disk(md)->part0.in_flight[rw], pending); + cpu = part_stat_lock(); + part_stat_set(cpu, _disk(md)->part0, in_flight[rw], pending); + part_stat_unlock(); pending += atomic_read(>pending[rw^0x1]); /* nudge anyone waiting on suspend queue */ diff --git a/include/linux/genhd.h b/include/linux/genhd.h index 612ae80..abe5567 100644 --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -86,6 +86,7 @@ struct disk_stats { unsigned long ticks[2]; unsigned long io_ticks; unsigned long time_in_queue; + unsigned int in_flight[2]; }; #define PARTITION_META_INFO_VOLNAMELTH 64 @@ -119,7 +120,6 @@ struct hd_struct { int make_it_fail; #endif unsigned long stamp; - atomic_t in_flight[2]; #ifdef CONFIG_SMP struct disk_stats __percpu *dkstats; #else @@ -320,6 +320,9 @@ extern struct hd_struct *disk_map_sector_rcu(struct gendisk *disk, res;\ }) +#define part_stat_set(cpu, part, field, seted) \ + (per_cpu_ptr((part)->dkstats, (cpu))->field = (seted)) + static inline void part_stat_set_all(struct hd_struct *part, int value) { int i; @@ -351,6 +354,9 @@ static inline void free_part_stats(struct hd_struct *part) #define part_stat_read(part, field)((part)->dkstats.field) +#define part_stat_set(cpu, part, field, seted) \ + ((part)->dkstats.field = (seted)) + static inline void part_stat_set_all(struct hd_struct *part, int value) { memset(>dkstats, value, sizeof(struct disk_stats)); @@ -383,21 +389,27 @@ static inline void free_part_stats(struct hd_struct *part) static inline void part_inc_in_flight(int cpu, struct hd_struct *part, int rw) { - atomic_inc(>in_flight[rw]); + part_stat_inc(cpu, part, in_flight[rw]); if (part->partno) - atomic_inc(_to_disk(part)->part0.in_flight[rw]); + part_stat_inc(cpu, _to_disk(part)->part0, 
in_flight[rw]); } static inline void part_dec_in_flight(int cpu, struct hd_struct *part, int rw) { - atomic_dec(>in_flight[rw]); + part_stat_dec(cpu, part, in_flight[rw]); if (part->partno) - atomic_dec(_to_disk(part)->part0.in_flight[rw]); + part_stat_dec(cpu,
[PATCH 0/4] block: account io: kill atomic operations
Hi,

This patchset kills two kinds of atomic operations in block I/O
accounting.

The first two patches convert the atomic refcount of a partition into a
percpu refcount.

The last two patches convert partition->in_flight[] into a percpu
variable.

With this change, ~15% throughput improvement can be observed when
running fio(randread) over null blk in a dual-socket environment.

 block/bio.c               |  4 ++--
 block/blk-core.c          |  5 ++--
 block/blk-merge.c         |  2 +-
 block/genhd.c             |  9 ---
 block/partition-generic.c | 17 ++---
 drivers/md/dm.c           | 10
 drivers/nvdimm/core.c     |  4 ++--
 include/linux/genhd.h     | 61 +--
 8 files changed, 72 insertions(+), 40 deletions(-)

Thanks,
Ming
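The in_flight[] conversion can be pictured with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: struct inflight_sketch, NR_CPUS_SKETCH and the three helpers are hypothetical names, the CPU count is fixed, and there is no concurrency (in the kernel the updates run between part_stat_lock() and part_stat_unlock()). The point it shows is that the hot path only touches the local slot, and only the reader pays for a sum over all slots.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* assumption: small fixed CPU count */

/* Hypothetical stand-in for one direction of the per-CPU in_flight[] data. */
struct inflight_sketch {
	int slot[NR_CPUS_SKETCH];
};

/* Hot path: update only the slot of the CPU that runs the accounting. */
static void inflight_inc(struct inflight_sketch *c, int cpu)
{
	c->slot[cpu]++;
}

static void inflight_dec(struct inflight_sketch *c, int cpu)
{
	c->slot[cpu]--;
}

/* Slow path: the reader adds up every slot to get the real count. */
static int inflight_sum(const struct inflight_sketch *c)
{
	int sum = 0;

	for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
		sum += c->slot[i];
	return sum;
}
```

In this sketch a single slot can go negative when an I/O is started on one CPU and completed on another, so only the sum is meaningful.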
[PATCH 1/4] block: partition: introduce hd_free_part()
So the helper can be used in both generic partition case and part0
case.

Signed-off-by: Ming Lei
---
 block/genhd.c             | 3 +--
 block/partition-generic.c | 3 +--
 include/linux/genhd.h     | 6 ++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index e552e1b..ed3f5b9 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1110,8 +1110,7 @@ static void disk_release(struct device *dev)
 	disk_release_events(disk);
 	kfree(disk->random);
 	disk_replace_part_tbl(disk, NULL);
-	free_part_stats(&disk->part0);
-	free_part_info(&disk->part0);
+	hd_free_part(&disk->part0);
 	if (disk->queue)
 		blk_put_queue(disk->queue);
 	kfree(disk);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 0d9e5f9..eca0d02 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -212,8 +212,7 @@ static void part_release(struct device *dev)
 {
 	struct hd_struct *p = dev_to_part(dev);
 	blk_free_devt(dev->devt);
-	free_part_stats(p);
-	free_part_info(p);
+	hd_free_part(p);
 	kfree(p);
 }
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ec274e0..a221220 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -663,6 +663,12 @@ static inline void hd_struct_put(struct hd_struct *part)
 		__delete_partition(part);
 }
 
+static inline void hd_free_part(struct hd_struct *part)
+{
+	free_part_stats(part);
+	free_part_info(part);
+}
+
 /*
  * Any access of part->nr_sects which is not protected by partition
  * bd_mutex or gendisk bdev bd_mutex, should be done using this
-- 
1.9.1
Re: [RFC PATCH perf/core v2 00/16] perf-probe --cache and SDT support
Hi Masami, On 07/15/2015 02:43 PM, Masami Hiramatsu wrote: Hi, Here is the 2nd version of the patchset for probe-cache and initial SDT support which are going to be perf-cache finally. Thanks for adding the SDT support. The perf-probe is useful for debugging, but it strongly depends on the debuginfo. Without debuginfo, it is just a frontend of ftrace's dynamic events. This can usually happen in server farms or on cloud system, since no one wants to distribute big debuginfo packages. To solve this issue, I had tried to make a pre-analyzed probes ( https://lkml.org/lkml/2014/10/31/207 ) but it has a problm that we can't ensure the probed binary is same as what we analyzed. Arnaldo gave me an idea to reuse build-id cache for that perpose and this series is the first prototype of that. At the same time, Hemant has started to support SDT probes which also use the cache file of SDT info. So I decided to merge this into the same build-id cache. In this version, SDT support is still very limited, it works as a part of probe-cache. In this version, perf probe supports --cache option which means that perf probe manipulate probe caches, for example, # perf probe --cache --add "probe-desc" does not only add probe events but also add "probe-desc" and it's result on the cache. (Note that the cached entry is always referred even without --cache) The --list and --del commands also support --cache. Note that both are only manipulate caches, not real events. To use SDT, we have to scan the target binary at first by using perf-buildid-cache, e.g. # perf buildid-cache --add /lib/libc-2.17.so And perf probe --cache --list shows what SDTs are scanned. 
# perf probe --cache --list /usr/lib/libc-2.17.so (a6fb821bdf53660eb2c29f778757aef294d3d392): libc:setjmp=setjmp libc:longjmp=longjmp libc:longjmp_target=longjmp_target libc:memory_heap_new=memory_heap_new libc:memory_sbrk_less=memory_sbrk_less libc:memory_arena_reuse_free_list=memory_arena_reuse_free_list libc:memory_arena_reuse=memory_arena_reuse ... To use the SDT events, perf probe -x BIN %SDTEVENT allows you to add a probe on SDTEVENT@BIN. # perf probe -x /lib/libc-2.17.so %memory_heap_new If you define a cached probe with event name, you can also reuse it as same as SDT events. # perf probe -x ./perf --cache -n 'myevent=dso__load $params' (Note that "-n" option only updates caches) To use the above "myevent", you just have to add "%myevent". # perf probe -x ./perf %myevent TODOs: - Show available cached/SDT events by perf-list - Allow perf-record to use cached/SDT events directly As I was already working on SDT events' recording https://lkml.org/lkml/2014/11/2/73, I can re-spin the patches on top of your patchset and make the required changes to implement the above TODOs. What would you suggest? 
Thank you, --- Hemant Kumar (1): perf/sdt: ELF support for SDT Masami Hiramatsu (15): perf probe: Simplify __add_probe_trace_events code perf probe: Move ftrace probe-event operations to probe-file.c perf probe: Use strbuf for making strings in probe-event.c perf-buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid perf buildid: Use SBUILD_ID_SIZE macro perf buildid: Introduce sysfs/filename__sprintf_build_id perf: Add lsdir to read a directory perf-buildid-cache: Use lsdir for looking up buildid caches perf probe: Add --cache option to cache the probe definitions perf probe: Use cache entry if possible perf probe: Show all cached probes perf probe: Remove caches when --cache is given perf probe: Add group name support perf buildid-cache: Scan and import user SDT events to probe cache perf probe: Accept %sdt and %cached event name tools/perf/Documentation/perf-probe.txt | 14 tools/perf/builtin-buildid-cache.c | 22 - tools/perf/builtin-buildid-list.c | 28 - tools/perf/builtin-probe.c |3 tools/perf/util/Build |1 tools/perf/util/build-id.c | 230 ++-- tools/perf/util/build-id.h | 11 tools/perf/util/dso.h |5 tools/perf/util/probe-event.c | 918 ++- tools/perf/util/probe-event.h | 16 - tools/perf/util/probe-file.c| 763 ++ tools/perf/util/probe-file.h| 46 ++ tools/perf/util/probe-finder.c | 10 tools/perf/util/symbol-elf.c| 252 + tools/perf/util/symbol.c|2 tools/perf/util/symbol.h| 22 + tools/perf/util/util.c | 34 + tools/perf/util/util.h |4 18 files changed, 1781 insertions(+), 600 deletions(-) create mode 100644 tools/perf/util/probe-file.c create mode 100644 tools/perf/util/probe-file.h -- Thanks, Hemant Kumar --
linux-next: build failure after merge of the rcu tree
Hi Paul,

After merging the rcu tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

kernel/notifier.c: In function 'notify_die':
kernel/notifier.c:547:2: error: implicit declaration of function 'rcu_lockdep_assert' [-Werror=implicit-function-declaration]
  rcu_lockdep_assert(rcu_is_watching(),
  ^

Caused by commit

  02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")

interacting with commit

  e727c7d7a11e ("notifiers, RCU: Assert that RCU is watching in notify_die()")

[ and I also noted

  0333a209cbf6 ("x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion")
]

from the tip tree.

I added the following merge fix patch:

From: Stephen Rothwell
Date: Thu, 16 Jul 2015 13:08:50 +1000
Subject: [PATCH] rcu: merge fix for Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()

Signed-off-by: Stephen Rothwell
---
 arch/x86/kernel/irq.c | 2 +-
 kernel/notifier.c     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 30dbf35bc90b..f9cd81825187 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -234,7 +234,7 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
 	entering_irq();
 
 	/* entering_irq() tells RCU that we're not quiescent.  Check it. */
-	rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
 
 	irq = __this_cpu_read(vector_irq[vector]);
 
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 980e4330fb59..fd2c9acbcc19 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -544,7 +544,7 @@ int notrace notify_die(enum die_val val, const char *str,
 		.signr	= sig,
 
 	};
-	rcu_lockdep_assert(rcu_is_watching(),
+	RCU_LOCKDEP_WARN(!rcu_is_watching(),
 			   "notify_die called but RCU thinks we're quiescent");
 	return atomic_notifier_call_chain(&die_chain, val, &args);
 }
-- 
2.1.4

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
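Note that the merge fix is more than a rename: the condition is negated at every call site, because an assert-style check complains when its condition is false while a warn-style check complains when its condition is true. A minimal userspace sketch of that polarity flip follows. The functions assert_style() and warn_style() are hypothetical stand-ins; the real kernel macros also consult lockdep state and warn only once.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins for the two checking styles. Each returns true
 * when it would have complained.
 */

/* Old style: complain when the asserted condition does NOT hold. */
static bool assert_style(bool cond, const char *msg)
{
	if (!cond) {
		fprintf(stderr, "%s\n", msg);
		return true;
	}
	return false;
}

/* New style: complain when the warning condition DOES hold. */
static bool warn_style(bool cond, const char *msg)
{
	if (cond) {
		fprintf(stderr, "%s\n", msg);
		return true;
	}
	return false;
}
```

For any condition c, assert_style(c, msg) and warn_style(!c, msg) complain in exactly the same cases, which is why rcu_is_watching() becomes !rcu_is_watching() in the patch above.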
linux-next: manual merge of the rcu tree with the tip tree
Hi Paul,

Today's linux-next merge of the rcu tree got a conflict in:

  arch/x86/kernel/traps.c

between commit:

  8c84014f3bbb ("x86/entry: Remove exception_enter() from most trap handlers")

from the tip tree and commit:

  02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")

from the rcu tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/x86/kernel/traps.c
index 8e65d8a9b8db,c5a5231d1d11..
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@@ -131,14 -136,19 +131,14 @@@ void ist_enter(struct pt_regs *regs
  	preempt_count_add(HARDIRQ_OFFSET);
  
  	/* This code is a bit fragile.  Test it. */
- 	rcu_lockdep_assert(rcu_is_watching(), "ist_enter didn't work");
+ 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "ist_enter didn't work");
 -
 -	return prev_state;
  }
  
 -void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
 +void ist_exit(struct pt_regs *regs)
  {
 -	/* Must be before exception_exit. */
  	preempt_count_sub(HARDIRQ_OFFSET);
  
 -	if (user_mode(regs))
 -		return exception_exit(prev_state);
 -	else
 +	if (!user_mode(regs))
  		rcu_nmi_exit();
  }
Re: [PATCH 0/7] Initial support for user namespace owned mounts
On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
> On Wed, Jul 15, 2015 at 3:39 PM, Casey Schaufler wrote:
>> On 7/15/2015 2:06 PM, Eric W. Biederman wrote:
>>> Casey Schaufler writes:
>>> The first step needs to be not trusting those labels and treating such
>>> filesystems as filesystems without label support. I hope that is Seth
>>> has implemented.
>> A filesystem with Smack labels gets mounted in a namespace. The labels
>> are ignored. Instead, the filesystem defaults (potentially specified as
>> mount options smackfsdef="something", but usually the floor label ("_"))
>> are used, giving the user the ability to read everything and (usually)
>> change nothing. This is both dangerous (unintended read access to files)
>> and pointless (can't make changes).
> I don't get it.
>
> If I mount an unprivileged filesystem, then either the contents were
> put there *by me*, in which case letting me access them are fine, or
> (with Seth's patches and then some) I control the backing store, in
> which case I can do whatever I want regardless of what LSM thinks.
>
> So I don't see the problem. Why would Smack or any other LSM care at
> all, unless it wants to prevent me from mounting the fs in the first
> place?

First off, I don't cotton to the notion that you should be able to
mount filesystems without privilege. But it seems I'm being outvoted
on that. I suspect that there are cases where it might be safe, but I
can't think of one off the top of my head.

If you do mount a filesystem it needs to behave according to the rules
of the system. If you have a security module that uses attributes on
the filesystem you can't ignore them just because it's "your data".
Mandatory access control schemes, including Smack and SELinux don't
give a fig about who you are. It's the label on the data and the
process that matter. If "you" get to muck the labels up, you've broken
the mandatory access control.

> --Andy
Re: [PATCH 1/7] fs: Add user namesapace member to struct super_block
Seth Forshee writes:

> Initially this will be used to eliminate the implicit MNT_NODEV
> flag for mounts from user namespaces. In the future it will also
> be used for translating ids and checking capabilities for
> filesystems mounted from user namespaces.
>
> s_user_ns is initialized in alloc_super() and is generally set to
> current_user_ns(). To avoid security and corruption issues, two
> additional mount checks are also added:
>
>  - do_new_mount() gains a check that the user has CAP_SYS_ADMIN
>    in current_user_ns().
>
>  - sget() will fail with EBUSY when the filesystem it's looking
>    for is already mounted from another user namespace.
>
> proc needs some special handling here. The user namespace of
> current isn't appropriate when forking as a result of clone (2)
> with CLONE_NEWPID|CLONE_NEWUSER, as it will make proc unmountable
> from within the new user namespace. Instead, the user namespace
> which owns the new pid namespace should be used. sget_userns() is
> added to allow passing of a user namespace other than that of
> current, and this is used by proc_mount(). sget() becomes a
> wrapper around sget_userns() which passes current_user_ns().

From bits of the previous conversation.

We need sget_userns(..., &init_user_ns) for sysfs. The sysfs xattrs
can travel from one mount of sysfs to another via the sysfs backing
store.

For tmpfs and any other filesystems that we support mounting without
privilege and that support xattrs, we need to identify them and see if
userspace is taking advantage of the ability to set xattrs and file
caps (unlikely). If it is, we need to call
sget_userns(..., &init_user_ns) on those filesystems as well.

Possibly/Probably we should just do that for all of the interesting
filesystems to start with and then change back to an ordinary old sget
after we have done the testing and confirmed we will not be introducing
userspace regressions.

Eric
Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework
On Wed, Jul 15, 2015 at 5:34 PM, Dave Hansen wrote:
>
> I understand why you were misled by it, but the old "xsave_hdr_struct"
> was wrong. Fenghua even posted patches to remove it before the FPU
> rework (you were cc'd):
>
> https://lkml.org/lkml/2015/4/18/164

Oh, and that patch looks like a good idea.

I wish there was some way to make sure sizeof() fails on it so that
we'd enforce that nobody allocates that thing as-is. I had this dim
memory that an unsized array at the end would do that, but I was
clearly wrong. It's just the array itself you can't do sizeof on, not
the structure that contains it.

Is there some magic trick that I'm forgetting?

              Linus
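The sizeof() behaviour described above can be checked with a small example. This is only a sketch under stated assumptions: struct blob_sketch and blob_alloc() are hypothetical and merely resemble a fixed header followed by a variable-size area; they are not the kernel's xsave layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: a fixed part followed by a flexible array member. */
struct blob_sketch {
	uint64_t header[8];
	uint8_t  extended[];	/* C99 flexible array member, no size */
};

/*
 * sizeof(struct blob_sketch) compiles fine and covers only the fixed
 * part, so an "as-is" allocation silently gets no room for extended[].
 * Taking sizeof of the extended member itself is what the compiler
 * rejects. The caller therefore has to add the variable part by hand.
 */
static struct blob_sketch *blob_alloc(size_t extended_len)
{
	return calloc(1, sizeof(struct blob_sketch) + extended_len);
}
```

So the flexible array member documents that the structure has a variable tail, but it does not stop anyone from allocating sizeof(struct blob_sketch) bytes and overrunning them.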
Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
On Thu, Jul 16, 2015 at 04:33:07AM +0200, Andi Kleen wrote:
> > @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
> >  	 * directly for tail pages.
> >  	 */
> >  	if (PageTransHuge(head)) {
> > +		/*
> > +		 * Non anonymous thp exists only in allocation/free time. We
> > +		 * can't handle such a case correctly, so let's give it up.
> > +		 * This should be better than triggering BUG_ON when kernel
> > +		 * tries to touch a "partially handled" page.
> > +		 */
> > +		if (!PageAnon(head))
> > +			return 0;
>
> Please print a message for this case. In the future there will be
> likely more non anonymous THP pages from Kirill's large page cache work
> (so eventually we'll need it)

OK, I'll do this.

Thanks,
Naoya Horiguchi
[GIT PULL] ARM: EXYNOS: mach: Improvements for 4.3
Dear Kukjin,

Exynos mach-code related improvements. Description along with a tag.
You can find them also on the lists with my reviewed-by.

Best regards,
Krzysztof


The following changes since commit 1c4c7159ed2468f3ac4ce5a7f08d79663d381a93:

  Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 (2015-07-05 16:24:54 -0700)

are available in the git repository at:

  https://github.com/krzk/linux.git tags/samsung-mach-4.3

for you to fetch changes up to 70f83b6716ea0e5944071c12ff1716f93a9c2d8d:

  cpufreq: exynos: remove Exynos5250 specific cpufreq driver support (2015-07-16 10:39:56 +0900)

Improvements for Exynos based boards:
1. Switch to generic cpufreq-dt driver for Exynos5250. The old driver
   is removed.
2. Fix memory leak in cpufreq error path.
3. Cleanups: remove duplicated define with bootloader's sleep magic
   constant, staticize local function, drop 'owner' from platform
   driver, fix cast of iomem to ERR_PTR.

Bartlomiej Zolnierkiewicz (1):
      cpufreq: exynos: remove Exynos5250 specific cpufreq driver support

Krzysztof Kozlowski (4):
      ARM: EXYNOS: pmu: Make local function static
      ARM: EXYNOS: Remove duplicated define of SLEEP_MAGIC
      ARM: EXYNOS: pmu: Drop owner assignment
      ARM: EXYNOS: Use IOMEM_ERR_PTR when function returns iomem

Shailendra Verma (1):
      cpufreq: exynos: Fix for memory leak in case SOC name does not match

Thomas Abraham (3):
      clk: samsung: exynos5250: add cpu clock configuration data and instantiate cpu clock
      ARM: dts: Exynos5250: add CPU OPP and regulator supply property
      ARM: Exynos: switch to using generic cpufreq driver for Exynos5250

 arch/arm/boot/dts/exynos5250-arndale.dts  |   4 +
 arch/arm/boot/dts/exynos5250-smdk5250.dts |   4 +
 arch/arm/boot/dts/exynos5250-snow.dts     |   4 +
 arch/arm/boot/dts/exynos5250-spring.dts   |   4 +
 arch/arm/boot/dts/exynos5250.dtsi         |  22
 arch/arm/mach-exynos/common.h             |   6 +
 arch/arm/mach-exynos/exynos.c             |   1 +
 arch/arm/mach-exynos/firmware.c           |   2 -
 arch/arm/mach-exynos/platsmp.c            |   2 +-
 arch/arm/mach-exynos/pmu.c                |   3 +-
 arch/arm/mach-exynos/suspend.c            |   4 +-
 drivers/clk/samsung/clk-exynos5250.c      |  31 +
 drivers/cpufreq/Kconfig.arm               |  11 --
 drivers/cpufreq/Makefile                  |   1 -
 drivers/cpufreq/exynos-cpufreq.c          |   9 +-
 drivers/cpufreq/exynos-cpufreq.h          |  17 ---
 drivers/cpufreq/exynos5250-cpufreq.c      | 210 --
 include/dt-bindings/clock/exynos5250.h    |   1 +
 18 files changed, 84 insertions(+), 252 deletions(-)
 delete mode 100644 drivers/cpufreq/exynos5250-cpufreq.c
[GIT PULL] ARM: EXYNOS: dts: Improvements for 4.3
Dear Kukjin,

DTS related improvements. Description along with a tag.
You can find them also on the lists with my reviewed-by.

Best regards,
Krzysztof


The following changes since commit a419d78a6f97f8c977fe55d5d590cd0654ecd1ee:

  ARM: dts: Exynos4210: add CPU OPP and regulator supply property (2015-07-13 21:16:05 +0900)

are available in the git repository at:

  https://github.com/krzk/linux.git tags/samsung-dt-4.3

for you to fetch changes up to cd0b551be420d49c2bde8dcf5ea147278dc89ffb:

  ARM: dts: Extend exynos5420-pinctrl nodes using labels instead of paths (2015-07-16 11:22:11 +0900)

Device Tree improvements for Exynos based boards:
1. Enable proper USB 3.0 regulators on Odroid XU3 board.
2. Set over-heat and over-voltage thresholds for Trats2 board
   fuel gauge.
3. Fix missing display frequency on Exynos3250 Rinato board
   (necessary to fix the display).
4. Enable thermal management and fan control on Odroid XU3 board.
   The speed of fan is adjusted to current temperature of SoC.
5. Cleanups and usage of label-notation for overriding nodes.

Anand Moon (5):
      ARM: dts: odroidxu3: Enable USB3 regulators
      ARM: dts: exynos5422-odroidxu3: Add pwm-fan node
      ARM: dts: exynos5422-odroidxu3: Enable TMU at Exynos5422 base
      ARM: dts: exynos5422-odroidxu3: Define default thermal-zones
      ARM: dts: exynos5422-odroidxu3: Enable thermal-zones

Andreas Färber (1):
      ARM: dts: Clean up exynos5410-smdk5410 indentation

Hyungwon Hwang (1):
      ARM: dts: fix the clock-frequency of exynos3250-rinato board's panel

Javier Martinez Canillas (4):
      ARM: dts: Include exynos5250-pinctrl after the nodes were defined
      ARM: dts: Extend exynos5250-pinctrl nodes using labels instead of paths
      ARM: dts: Include exynos5420-pinctrl after the nodes were defined
      ARM: dts: Extend exynos5420-pinctrl nodes using labels instead of paths

Krzysztof Kozlowski (2):
      ARM: dts: Set max17047 over heat and over voltage thresholds
      ARM: dts: Use labels for overriding nodes in exynos4210-universal

 arch/arm/boot/dts/exynos3250-rinato.dts            |    2 +-
 arch/arm/boot/dts/exynos4210-universal_c210.dts    |  620
 arch/arm/boot/dts/exynos4412-trats2.dts            |    3 +
 arch/arm/boot/dts/exynos5250-pinctrl.dtsi          | 1600 ++--
 arch/arm/boot/dts/exynos5250.dtsi                  |    3 +-
 arch/arm/boot/dts/exynos5410-smdk5410.dts          |    6 +-
 arch/arm/boot/dts/exynos5420-pinctrl.dtsi          | 1411 +
 arch/arm/boot/dts/exynos5420.dtsi                  |    3 +-
 arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi      |   59 +
 arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi |   46 +
 10 files changed, 1930 insertions(+), 1823 deletions(-)
 create mode 100644 arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
Re: [PATCH v2 4/6] ARM: OMAP: PRM: Remove hardcoding of IRQENABLE_MPU_2 and IRQSTATUS_MPU_2 register offsets
On Wed, 8 Jul 2015, Keerthy wrote:

> The register offsets of IRQENABLE_MPU_2 and IRQSTATUS_MPU_2 are hardcoded.
> This makes it difficult to reuse the code for SoCs like AM437x that have
> a single instance of IRQENABLE_MPU and IRQSTATUS_MPU registers.
> Hence handling the case using offset of 4 to accommodate single set of IRQ*
> registers generically.
>
> Signed-off-by: Keerthy

Thanks, queued for v4.3.

- Paul
Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework
On Wed, Jul 15, 2015 at 5:34 PM, Dave Hansen wrote:
>
> The old code sized the buffer in a fully architectural way and it
> worked. The CPU *tells* you how much memory the 'xsave' instruction is
> going to scribble on. The new code just merrily calls it and let it
> scribble away. This is as clear-cut a regression as I've ever seen.

Yes, I think we'll need to revert it, or do something else drastic
like make that initial fp state allocation *much* bigger and then have
a "disable xsaves if it's still not big enough".

setup_xstate_features() should be able to easily just say "this was
the maximum offset+size we saw", and we can take that to either do a
proper allocation, or verify that the static allocation is indeed big
enough.

Apparently a straight revert doesn't work, if only because things in
that area have been renamed very aggressively (both files and
functions and variables).

Ingo?

Linus
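The "maximum offset+size we saw" idea from the message above could look roughly like the sketch below. The struct, the function names and the layout are hypothetical illustrations, not the kernel's setup_xstate_features() code; in the kernel the per-component offsets and sizes would come from what the CPU reports.

```c
#include <stddef.h>

/* Hypothetical description of one enabled state component. */
struct demo_component {
	unsigned int offset;	/* byte offset inside the save area */
	unsigned int size;	/* bytes written for this component */
};

/* Return the largest offset + size seen over all components. */
static unsigned int demo_required_size(const struct demo_component *c, size_t n)
{
	unsigned int max_end = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int end = c[i].offset + c[i].size;

		if (end > max_end)
			max_end = end;
	}
	return max_end;
}

/* Nonzero when a statically sized buffer can hold every component. */
static int demo_buffer_is_big_enough(const struct demo_component *c, size_t n,
				     unsigned int buffer_size)
{
	return demo_required_size(c, n) <= buffer_size;
}
```

The result of demo_required_size() can feed either option mentioned in the message: sizing a dynamic allocation, or checking a static buffer and falling back when it is too small.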
Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets
On Thu, 16 Jul 2015, Paul Walmsley wrote:

> On Wed, 8 Jul 2015, Keerthy wrote:
>
> > Add the PRM IRQ register offsets.
> >
> > Signed-off-by: Keerthy
>
> Please add more detail to your commit messages so they conform to
> Documentation/SubmittingPatches:
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109
>
> For example, this commit message should read something like:
>
> ---
>
> ARM: AM43xx: Add the PRM IRQ register offsets
>
> Add the PRM IRQ register offsets. This is needed to support PRM I/O
> wakeup on AM43xx.
>
> --
>
> Basically, your patches need to provide context as to _why_ the change is
> needed.
>
> I've fixed the message for this patch, and queued it for v4.3, but
> please take care with this issue in the future.

Also I've moved the AM43XX_PRM_IO_PMCTRL_OFFSET macro out of the AM43XX CM
section, since it doesn't belong there.

- Paul
Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets
On Wed, 8 Jul 2015, Keerthy wrote:

> Add the PRM IRQ register offsets.
>
> Signed-off-by: Keerthy

Please add more detail to your commit messages so they conform to
Documentation/SubmittingPatches:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109

For example, this commit message should read something like:

---

ARM: AM43xx: Add the PRM IRQ register offsets

Add the PRM IRQ register offsets. This is needed to support PRM I/O
wakeup on AM43xx.

--

Basically, your patches need to provide context as to _why_ the change is
needed.

I've fixed the message for this patch, and queued it for v4.3, but
please take care with this issue in the future.

- Paul
Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
> @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
>  	 * directly for tail pages.
>  	 */
>  	if (PageTransHuge(head)) {
> +		/*
> +		 * Non anonymous thp exists only in allocation/free time. We
> +		 * can't handle such a case correctly, so let's give it up.
> +		 * This should be better than triggering BUG_ON when kernel
> +		 * tries to touch a "partially handled" page.
> +		 */
> +		if (!PageAnon(head))
> +			return 0;

Please print a message for this case. In the future there will be
likely more non anonymous THP pages from Kirill's large page cache work
(so eventually we'll need it)

-Andi
[PATCH 3.19.y-ckt 002/251] sctp: fix ASCONF list handling
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marcelo Ricardo Leitner

[ Upstream commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 ]

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backuping
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handling on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addrq_wq_lock. The lock now
will be taken on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewriting it by
implementing sctp_copy_descendant().

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen
Suggested-by: Neil Horman
Suggested-by: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller
Cc: Moritz Mühlenhoff
Reference: CVE-2015-3212
Signed-off-by: Kamal Mostafa
---
 include/net/netns/sctp.h   |  1 +
 include/net/sctp/structs.h |  4
 net/sctp/socket.c          | 43 ---
 3 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h
index 3573a81..8ba379f 100644
--- a/include/net/netns/sctp.h
+++ b/include/net/netns/sctp.h
@@ -31,6 +31,7 @@ struct netns_sctp {
 	struct list_head addr_waitq;
 	struct timer_list addr_wq_timer;
 	struct list_head auto_asconf_splist;
+	/* Lock that protects both addr_waitq and auto_asconf_splist */
 	spinlock_t addr_wq_lock;
 
 	/* Lock that protects the local_addr_list writers */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 2bb2fcf..495c87e 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -223,6 +223,10 @@ struct sctp_sock {
 	atomic_t pd_mode;
 	/* Receive to here while partial delivery is in effect. */
 	struct sk_buff_head pd_lobby;
+
+	/* These must be the last fields, as they will skipped on copies,
+	 * like on accept and peeloff operations
+	 */
 	struct list_head auto_asconf_list;
 	int do_auto_asconf;
 };
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index aafe94b..4e56571 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1533,8 +1533,10 @@ static void sctp_close(struct sock *sk, long timeout)
 
 	/* Supposedly, no process has access to the socket, but
 	 * the net layers still may.
+	 * Also, sctp_destroy_sock() needs to be called with addr_wq_lock
+	 * held and that should be grabbed before socket lock.
 	 */
-	local_bh_disable();
+	spin_lock_bh(&net->sctp.addr_wq_lock);
 	bh_lock_sock(sk);
 
 	/* Hold the sock, since sk_common_release() will put sock_put()
@@ -1544,7 +1546,7 @@ static void sctp_close(struct sock *sk, long timeout)
 	sk_common_release(sk);
 
 	bh_unlock_sock(sk);
-	local_bh_enable();
+	spin_unlock_bh(&net->sctp.addr_wq_lock);
 
 	sock_put(sk);
 
@@ -3587,6 +3589,7 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
 	if ((val && sp->do_auto_asconf) || (!val && !sp->do_auto_asconf))
 		return 0;
 
+	spin_lock_bh(&sock_net(sk)->sctp.addr_wq_lock);
 	if (val == 0 && sp->do_auto_asconf) {
 		list_del(&sp->auto_asconf_list);
 		sp->do_auto_asconf = 0;
@@ -3595,6 +3598,7 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
 		    &sock_net(sk)->sctp.auto_asconf_splist);
 		sp->do_auto_asconf = 1;
 	}
+	spin_unlock_bh(&sock_net(sk)->sctp.addr_wq_lock);
 	return 0;
 }
 
@@ -4128,18 +4132,28 @@ static int sctp_init_sock(struct sock *sk)
 	local_bh_disable();
 	percpu_counter_inc(&sctp_sockets_allocated);
 	sock_prot_inuse_add(net, sk->sk_prot, 1);
+
+	/* Nothing can fail after this block, otherwise
+	 * sctp_destroy_sock() will be called without addr_wq_lock held
+	 */
 	if (net->sctp.default_auto_asconf) {
+
[PATCH 3.19.y-ckt 009/251] net/mlx4_en: Wake TX queues only when there's enough room
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ido Shamay

[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ]

Indication of a single completed packet, marked by txbbs_skipped
being bigger than zero, is not enough in order to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created a helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.

Signed-off-by: Ido Shamay
Signed-off-by: Or Gerlitz
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 19 +++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 06c0de6..b54e621 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -66,6 +66,7 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
 	ring->size = size;
 	ring->size_mask = size - 1;
 	ring->stride = stride;
+	ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS;
 
 	tmp = size * sizeof(struct mlx4_en_tx_info);
 	ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node);
@@ -232,6 +233,11 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
 		       MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp);
 }
 
+static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring)
+{
+	return ring->prod - ring->cons > ring->full_size;
+}
+
 static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
 			      struct mlx4_en_tx_ring *ring, int index,
 			      u8 owner)
@@ -474,11 +480,10 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 
 	netdev_tx_completed_queue(ring->tx_queue, packets, bytes);
 
-	/*
-	 * Wakeup Tx queue if this stopped, and at least 1 packet
-	 * was completed
+	/* Wakeup Tx queue if this stopped, and ring is not full.
 	 */
-	if (netif_tx_queue_stopped(ring->tx_queue) && txbbs_skipped > 0) {
+	if (netif_tx_queue_stopped(ring->tx_queue) &&
+	    !mlx4_en_is_tx_ring_full(ring)) {
 		netif_tx_wake_queue(ring->tx_queue);
 		ring->wake_queue++;
 	}
@@ -922,8 +927,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_tx_timestamp(skb);
 
 	/* Check available TXBBs And 2K spare for prefetch */
-	stop_queue = (int)(ring->prod - ring_cons) >
-		      ring->size - HEADROOM - MAX_DESC_TXBBS;
+	stop_queue = mlx4_en_is_tx_ring_full(ring);
 	if (unlikely(stop_queue)) {
 		netif_tx_stop_queue(ring->tx_queue);
 		ring->queue_stopped++;
@@ -992,8 +996,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		smp_rmb();
 
 		ring_cons = ACCESS_ONCE(ring->cons);
-		if (unlikely(((int)(ring->prod - ring_cons)) <=
-			     ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+		if (unlikely(!mlx4_en_is_tx_ring_full(ring))) {
 			netif_tx_wake_queue(ring->tx_queue);
 			ring->wake_queue++;
 		}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 0e80118..18f8578 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -280,6 +280,7 @@ struct mlx4_en_tx_ring {
 	u32 size; /* number of TXBBs */
 	u32 size_mask;
 	u16 stride;
+	u32 full_size;
 	u16 cqn;	/* index of port CQ associated with this ring */
 	u32 buf_size;
 	__be32 doorbell_qpn;
-- 
1.9.1
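The fullness test in the patch above relies on unsigned arithmetic over free-running producer and consumer counters. A minimal user-space sketch of that idea follows; the `demo_` names and the two constants are assumptions chosen for illustration, not the driver's real HEADROOM and MAX_DESC_TXBBS values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; the driver defines its own HEADROOM and MAX_DESC_TXBBS. */
#define DEMO_HEADROOM		32u
#define DEMO_MAX_DESC_TXBBS	16u

struct demo_tx_ring {
	uint32_t prod;		/* free-running producer counter */
	uint32_t cons;		/* free-running consumer counter */
	uint32_t size;		/* number of slots in the ring */
	uint32_t full_size;	/* occupancy above which the ring counts as full */
};

/* Precompute the threshold once, as the patch does at ring creation. */
static void demo_ring_init(struct demo_tx_ring *ring, uint32_t size)
{
	ring->prod = 0;
	ring->cons = 0;
	ring->size = size;
	ring->full_size = size - DEMO_HEADROOM - DEMO_MAX_DESC_TXBBS;
}

/* Unsigned subtraction stays correct when the counters wrap around. */
static bool demo_ring_is_full(const struct demo_tx_ring *ring)
{
	return ring->prod - ring->cons > ring->full_size;
}
```

Because the threshold already reserves room for a maximum-sized descriptor, waking the queue whenever demo_ring_is_full() returns false guarantees the next packet fits, whatever its size.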
[PATCH 3.19.y-ckt 008/251] net/mlx4_en: Release TX QP when destroying TX ring
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eran Ben Elisha

[ Upstream commit 0eb08514fdbdcd16fd6870680cd638f203662e9d ]

TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
used the deprecated base_tx_qpn field. Move TX QP release to
mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.

Fixes: ddae0349fdb7 ('net/mlx4: Change QP allocation scheme')
Signed-off-by: Eran Ben Elisha
Signed-off-by: Or Gerlitz
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 1 -
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index c998c4d..99b99eb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1973,10 +1973,6 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv)
 			mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
 	}
 
-	if (priv->base_tx_qpn) {
-		mlx4_qp_release_range(priv->mdev->dev, priv->base_tx_qpn, priv->tx_ring_num);
-		priv->base_tx_qpn = 0;
-	}
 }
 
 int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 18db895..06c0de6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -180,6 +180,7 @@ void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
 		mlx4_bf_free(mdev->dev, &ring->bf);
 	mlx4_qp_remove(mdev->dev, &ring->qp);
 	mlx4_qp_free(mdev->dev, &ring->qp);
+	mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
 	mlx4_en_unmap_buffer(&ring->wqres.buf);
 	mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
 	kfree(ring->bounce_buf);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 6cc49c1..0e80118 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -599,7 +599,6 @@ struct mlx4_en_priv {
 	int vids[128];
 	bool wol;
 	struct device *ddev;
-	int base_tx_qpn;
 	struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
 	struct hwtstamp_config hwtstamp_config;
 
-- 
1.9.1
[PATCH 3.19.y-ckt 010/251] net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ido Shamay

[ Upstream commit 79a258526ce1051cb9684018c25a89d51ac21be8 ]

The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
offload is disabled. This is wrong since rxvlan offload can be switched
on/off regardless of hwtstamp_rx_filter.

Also moved check_csum to query CQE information to identify VLAN packets
and removed the check of IP packets, since it has been validated before.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Ido Shamay
Signed-off-by: Or Gerlitz
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 10d3533..7f16627 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -719,7 +719,7 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
 }
 #endif
 static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
-		      int hwtstamp_rx_filter)
+		      netdev_features_t dev_features)
 {
 	__wsum hw_checksum = 0;
 
@@ -727,14 +727,8 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
 
 	hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
 
-	if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) &&
-	    hwtstamp_rx_filter != HWTSTAMP_FILTER_NONE) {
-		/* next protocol non IPv4 or IPv6 */
-		if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-		    != htons(ETH_P_IP) &&
-		    ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-		    != htons(ETH_P_IPV6))
-			return -1;
+	if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK) &&
+	    !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) {
 		hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr);
 		hdr += sizeof(struct vlan_hdr);
 	}
@@ -897,7 +891,8 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 
 				if (ip_summed == CHECKSUM_COMPLETE) {
 					void *va = skb_frag_address(skb_shinfo(gro_skb)->frags);
-					if (check_csum(cqe, gro_skb, va, ring->hwtstamp_rx_filter)) {
+					if (check_csum(cqe, gro_skb, va,
+						       dev->features)) {
 						ip_summed = CHECKSUM_NONE;
 						ring->csum_none++;
 						ring->csum_complete--;
@@ -952,7 +947,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		}
 
 		if (ip_summed == CHECKSUM_COMPLETE) {
-			if (check_csum(cqe, skb, skb->data, ring->hwtstamp_rx_filter)) {
+			if (check_csum(cqe, skb, skb->data, dev->features)) {
 				ip_summed = CHECKSUM_NONE;
 				ring->csum_complete--;
 				ring->csum_none++;
-- 
1.9.1
Re: [PATCH 0/7] Initial support for user namespace owned mounts
Andy Lutomirski writes:

> On Jul 15, 2015 3:34 PM, "Eric W. Biederman" wrote:
>>
>> Seth Forshee writes:
>>
>> > On Wed, Jul 15, 2015 at 04:06:35PM -0500, Eric W. Biederman wrote:
>> >> Casey Schaufler writes:
>> >>
>> >> > On 7/15/2015 12:46 PM, Seth Forshee wrote:
>> >> >> These are the first in a larger set of patches that I've been working on
>> >> >> (with help from Eric Biederman) to support mounting ext4 and fuse
>> >> >> filesystems from within user namespaces. I've pushed the full series to:
>> >> >>
>> >> >>   git://kernel.ubuntu.com/sforshee/linux.git userns-mounts
>> >> >>
>> >> >> Taking the series as a whole, the strategy is to handle as much of the
>> >> >> heavy lifting as possible in the vfs so the filesystems don't have to
>> >> >> handle weird edge cases. If you look at the full series you'll find that
>> >> >> the changes in ext4 to support user namespace mounts turn out to be
>> >> >> fairly minimal (fuse is a bit more complicated though as it must deal
>> >> >> with translating ids for a userspace process which is running in pid and
>> >> >> user namespaces).
>> >> >>
>> >> >> The patches I'm sending today lay some of the groundwork in the vfs and
>> >> >> related code. They fall into two broad groups:
>> >> >>
>> >> >>  1. Patches 1-2 add s_user_ns and simplify MNT_NODEV handling. These are
>> >> >>     pretty straightforward, and Eric has expressed interest in merging
>> >> >>     these patches soon. Note that patch 2 won't apply cleanly without
>> >> >>     Eric's noexec patches for proc and sys [1].
>> >> >>
>> >> >>  2. Patches 2-7 tighten down security for mounts with s_user_ns !=
>> >> >>     &init_user_ns. This includes updates to how file caps and suid are
>> >> >>     handled and LSM updates to ignore security labels on superblocks
>> >> >>     from non-init namespaces.
>> >> >>
>> >> >>     The LSM changes in particular may not be optimal, as I don't have a
>> >> >>     lot of familiarity with this code, so I'd be especially appreciative
>> >> >>     of review of these changes and suggestions on how to improve them.
>> >> >
>> >> > Lukasz Pawelczyk proposed
>> >> > LSM support in user namespaces ([RFC] lsm: namespace hooks)
>> >> > that make a whole lot more sense than just turning off
>> >> > the option of using labels on files. Gutting the ability
>> >> > to use MAC in a namespace is a step down the road of
>> >> > making MAC and namespaces incompatible.
>> >>
>> >> This is not "turning off the option to use labels on files".
>> >>
>> >> This is supporting mounting filesystems like ext4 by unprivileged users
>> >> and not trusting the labels they set in the same way as we trust labels
>> >> on filesystems mounted by privileged users.
>> >>
>> >> The first step needs to be not trusting those labels and treating such
>> >> filesystems as filesystems without label support. I hope that is what
>> >> Seth has implemented.
>> >>
>> >> In the long run we can do more interesting things with such filesystems
>> >> once the appropriate LSM policy is in place.
>> >
>> > Yes, this exactly. Right now it looks to me like the only safe thing to
>> > do with mounts from unprivileged users is to ignore the security labels,
>> > so that's what I'm trying to do with these changes. If there's some
>> > better thing to do, or some better way to do it, I'm more than happy to
>> > receive that feedback.
>>
>> Ugh.
>>
>> This made me realize that we have an interesting problem here. An
>> unprivileged mount of tmpfs probably needs to have
>> s_user_ns == &init_user_ns.
>>
>> Otherwise we will break security labels on tmpfs for no good reason.
>> ramfs and sysfs also seem to have similar concerns.
>>
>> Because they have no backing store we can trust those filesystems with
>> security labels. Plus for at least sysfs there is the security label
>> bleed through issue, that we need to make certain works.
>>
>> Perhaps these filesystems with trusted backing store need to call
>> "sget_userns(..., &init_user_ns)".
>>
>> If we don't get this right we will have significant regressions with
>> respect to security labels, and that is not ok.
>
> That's only a problem if there's anyone who sets security labels on
> such a mount. You need global caps to do that (I hope), which
> requires someone outside the userns to help, which means there's a
> good chance that literally no one does this.

Fair enough. That is however something we need to test.

If no one puts security labels or file caps on such a mount we can
change things. If not we can't because it would introduce regressions.

Eric
[PATCH 3.19.y-ckt 020/251] [media] cx24116: fix a buffer overflow when checking userspace params
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauro Carvalho Chehab

commit 1fa2337a315a2448c5434f41e00d56b01a22283c upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows writing many more values:
	drivers/media/dvb-frontends/cx24116.c:983 cx24116_send_diseqc_msg() error: buffer overflow 'd->msg' 6 <= 23

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Kamal Mostafa
---
 drivers/media/dvb-frontends/cx24116.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/dvb-frontends/cx24116.c b/drivers/media/dvb-frontends/cx24116.c
index 2916d7c..7bc68b3 100644
--- a/drivers/media/dvb-frontends/cx24116.c
+++ b/drivers/media/dvb-frontends/cx24116.c
@@ -963,6 +963,10 @@ static int cx24116_send_diseqc_msg(struct dvb_frontend *fe,
 	struct cx24116_state *state = fe->demodulator_priv;
 	int i, ret;
 
+	/* Validate length */
+	if (d->msg_len > sizeof(d->msg))
+		return -EINVAL;
+
 	/* Dump DiSEqC message */
 	if (debug) {
 		printk(KERN_INFO "cx24116: %s(", __func__);
@@ -974,10 +978,6 @@ static int cx24116_send_diseqc_msg(struct dvb_frontend *fe,
 		printk(") toneburst=%d\n", toneburst);
 	}
 
-	/* Validate length */
-	if (d->msg_len > (CX24116_ARGLEN - CX24116_DISEQC_MSGOFS))
-		return -EINVAL;
-
 	/* DiSEqC message */
 	for (i = 0; i < d->msg_len; i++)
 		state->dsec_cmd.args[CX24116_DISEQC_MSGOFS + i] = d->msg[i];
-- 
1.9.1
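The pattern behind the fix above is to check a caller-supplied length against the source array before anything reads from it. A minimal user-space sketch, assuming a made-up `demo_diseqc_cmd` struct that only mirrors the 6-byte limit mentioned in the commit message (it is not the real DVB API type):

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of a userspace DiSEqC command: at most 6 bytes. */
struct demo_diseqc_cmd {
	uint8_t msg[6];
	uint8_t msg_len;
};

/*
 * Copy the message only after msg_len has been checked against both the
 * source array and the destination buffer.
 */
static int demo_copy_diseqc(uint8_t *dst, size_t dst_len,
			    const struct demo_diseqc_cmd *cmd)
{
	if (cmd->msg_len > sizeof(cmd->msg) || cmd->msg_len > dst_len)
		return -EINVAL;

	memcpy(dst, cmd->msg, cmd->msg_len);
	return 0;
}
```

Using sizeof(cmd->msg) instead of a separate constant keeps the check tied to the array it protects, which is the same choice the patch makes.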
[PATCH 3.19.y-ckt 001/251] net: don't wait for order-3 page allocation
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Shaohua Li

[ Upstream commit fb05e7a89f500cfc06ae277bdc911b281928995d ]

We saw excessive direct memory compaction triggered by
skb_page_frag_refill. This causes performance issues and adds latency.
Commit 5640f7685831e0 introduces the order-3 allocation. According to
the changelog, the order-3 allocation isn't a must-have but to improve
performance. But direct memory compaction has high overhead. The
benefit of order-3 allocation can't compensate the overhead of direct
memory compaction.

This patch makes the order-3 page allocation atomic. If there is no
memory pressure and memory isn't fragmented, the allocation will still
succeed, so we don't sacrifice the order-3 benefit here. If the atomic
allocation fails, direct memory compaction will not be triggered,
skb_page_frag_refill will fall back to order-0 immediately, hence the
direct memory compaction overhead is avoided. In the allocation failure
case, kswapd is woken up and doing compaction, so chances are
allocation could succeed next time.

alloc_skb_with_frags is the same.

The mellanox driver does a similar thing, if this is accepted, we must
fix the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet
Cc: Chris Mason
Cc: Debabrata Banerjee
Signed-off-by: Shaohua Li
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 net/core/skbuff.c | 2 +-
 net/core/sock.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3b0a8b0..0998af7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4414,7 +4414,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,

 		while (order) {
 			if (npages >= 1 << order) {
-				page = alloc_pages(gfp_mask |
+				page = alloc_pages((gfp_mask & ~__GFP_WAIT) |
 						   __GFP_COMP |
 						   __GFP_NOWARN |
 						   __GFP_NORETRY,
diff --git a/net/core/sock.c b/net/core/sock.c
index a91f99f..3606cc5 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1888,7 +1888,7 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)

 	pfrag->offset = 0;
 	if (SKB_FRAG_PAGE_ORDER) {
-		pfrag->page = alloc_pages(gfp | __GFP_COMP |
+		pfrag->page = alloc_pages((gfp & ~__GFP_WAIT) | __GFP_COMP |
 					  __GFP_NOWARN | __GFP_NORETRY,
 					  SKB_FRAG_PAGE_ORDER);
 		if (likely(pfrag->page)) {
--
1.9.1
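The idea in the changelog above (try the large allocation without permission to block, then fall back to a single page with the caller's original flags) can be sketched in userspace roughly as follows. The flag bits, refill(), and the fake allocator are all assumptions made for illustration; they are not the kernel's gfp_t values or allocator API.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits only; these are NOT the kernel's gfp_t values. */
#define F_WAIT    0x1u	/* caller may block (reclaim/compaction allowed) */
#define F_NORETRY 0x2u
#define F_NOWARN  0x4u

/* Flags for the opportunistic high-order attempt: never block. */
static unsigned int high_order_flags(unsigned int flags)
{
	return (flags & ~F_WAIT) | F_NOWARN | F_NORETRY;
}

/* Hypothetical allocator hook. */
typedef void *(*alloc_fn)(unsigned int order, unsigned int flags);

/* Try the high order without blocking, then fall back to order 0. */
static void *refill(alloc_fn alloc, unsigned int high_order,
		    unsigned int flags, unsigned int *got_order)
{
	void *p;

	assert(alloc != NULL && got_order != NULL);

	if (high_order) {
		p = alloc(high_order, high_order_flags(flags));
		if (p) {
			*got_order = high_order;
			return p;
		}
	}
	*got_order = 0;
	return alloc(0, flags);	/* the fallback keeps the caller's flags */
}

/* Fake allocator simulating fragmented memory: only order 0 succeeds. */
static unsigned int last_high_order_flags;
static char page0[1];

static void *fragmented_alloc(unsigned int order, unsigned int flags)
{
	if (order > 0) {
		last_high_order_flags = flags;
		return NULL;
	}
	return page0;
}
```

With fragmented_alloc() the high-order attempt fails immediately and never carries F_WAIT, which is the userspace analogue of skipping direct compaction.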
[PATCH 3.19.y-ckt 019/251] [media] s5h1420: fix a buffer overflow when checking userspace params
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab

commit 12f4543f5d6811f864e6c4952eb27253c7466c02 upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows to write up to 7 values:
	drivers/media/dvb-frontends/s5h1420.c:193 s5h1420_send_master_cmd() error: buffer overflow 'cmd->msg' 6 <= 7

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Kamal Mostafa
---
 drivers/media/dvb-frontends/s5h1420.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/dvb-frontends/s5h1420.c b/drivers/media/dvb-frontends/s5h1420.c
index 93eeaf7..0b4f8fe 100644
--- a/drivers/media/dvb-frontends/s5h1420.c
+++ b/drivers/media/dvb-frontends/s5h1420.c
@@ -180,7 +180,7 @@ static int s5h1420_send_master_cmd (struct dvb_frontend* fe,
 	int result = 0;

 	dprintk("enter %s\n", __func__);
-	if (cmd->msg_len > 8)
+	if (cmd->msg_len > sizeof(cmd->msg))
 		return -EINVAL;

 	/* setup for DISEQC */
--
1.9.1
[PATCH 3.19.y-ckt 006/251] neigh: do not modify unlinked entries
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Julian Anastasov

[ Upstream commit 2c51a97f76d20ebf1f50fef908b986cb051fdff9 ]

The lockless lookups can return entry that is unlinked.
Sometimes they get reference before last neigh_cleanup_and_release,
sometimes they do not need reference. Later, any
modification attempts may result in the following problems:

1. entry is not destroyed immediately because neigh_update
can start the timer for dead entry, eg. on change to NUD_REACHABLE
state. As result, entry lives for some time but is invisible
and out of control.

2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0 but if timer is started and expired refcnt can
reach 0 for second time leading to second neigh_destroy and
possible crash.

Thanks to Eric Dumazet and Ying Xue for their work and analyze
on the __neigh_event_send change.

Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet
Cc: Ying Xue
Signed-off-by: Julian Anastasov
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 net/core/neighbour.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 8d614c9..0385351 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -971,6 +971,8 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 	rc = 0;
 	if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
 		goto out_unlock_bh;
+	if (neigh->dead)
+		goto out_dead;

 	if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
 		if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
@@ -1027,6 +1029,13 @@ out_unlock_bh:
 		write_unlock(&neigh->lock);
 	local_bh_enable();
 	return rc;
+
+out_dead:
+	if (neigh->nud_state & NUD_STALE)
+		goto out_unlock_bh;
+	write_unlock_bh(&neigh->lock);
+	kfree_skb(skb);
+	return 1;
 }
 EXPORT_SYMBOL(__neigh_event_send);

@@ -1090,6 +1099,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
 	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
 	    (old & (NUD_NOARP | NUD_PERMANENT)))
 		goto out;
+	if (neigh->dead)
+		goto out;

 	if (!(new & NUD_VALID)) {
 		neigh_del_timer(neigh);
@@ -1239,6 +1250,8 @@ EXPORT_SYMBOL(neigh_update);
  */
 void __neigh_set_probe_once(struct neighbour *neigh)
 {
+	if (neigh->dead)
+		return;
 	neigh->updated = jiffies;
 	if (!(neigh->nud_state & NUD_FAILED))
 		return;
--
1.9.1
[PATCH 3.19.y-ckt 007/251] tcp: Do not call tcp_fastopen_reset_cipher from interrupt context
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Christoph Paasch [ Upstream commit dfea2aa654243f70dc53b8648d0bbdeec55a7df1 ] tcp_fastopen_reset_cipher really cannot be called from interrupt context. It allocates the tcp_fastopen_context with GFP_KERNEL and calls crypto_alloc_cipher, which allocates all kind of stuff with GFP_KERNEL. Thus, we might sleep when the key-generation is triggered by an incoming TFO cookie-request which would then happen in interrupt- context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP: [ 36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266 [ 36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill [ 36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14 [ 36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 36.008250] 04f2 88007f8838a8 8171d53a 880075a084a8 [ 36.009630] 880075a08000 88007f8838c8 810967d3 88007f883928 [ 36.011076] 88007f8838f8 81096892 88007f89be00 [ 36.012494] Call Trace: [ 36.012953][] dump_stack+0x4f/0x6d [ 36.014085] [] ___might_sleep+0x103/0x170 [ 36.015117] [] __might_sleep+0x52/0x90 [ 36.016117] [] kmem_cache_alloc_trace+0x47/0x190 [ 36.017266] [] ? tcp_fastopen_reset_cipher+0x42/0x130 [ 36.018485] [] tcp_fastopen_reset_cipher+0x42/0x130 [ 36.019679] [] tcp_fastopen_init_key_once+0x61/0x70 [ 36.020884] [] __tcp_fastopen_cookie_gen+0x1c/0x60 [ 36.022058] [] tcp_try_fastopen+0x58f/0x730 [ 36.023118] [] tcp_conn_request+0x3e8/0x7b0 [ 36.024185] [] ? __module_text_address+0x12/0x60 [ 36.025327] [] tcp_v4_conn_request+0x51/0x60 [ 36.026410] [] tcp_rcv_state_process+0x190/0xda0 [ 36.027556] [] ? __inet_lookup_established+0x47/0x170 [ 36.028784] [] tcp_v4_do_rcv+0x16d/0x3d0 [ 36.029832] [] ? security_sock_rcv_skb+0x16/0x20 [ 36.030936] [] tcp_v4_rcv+0x77a/0x7b0 [ 36.031875] [] ? 
iptable_filter_hook+0x33/0x70 [ 36.032953] [] ip_local_deliver_finish+0x92/0x1f0 [ 36.034065] [] ip_local_deliver+0x9a/0xb0 [ 36.035069] [] ? ip_rcv+0x3d0/0x3d0 [ 36.035963] [] ip_rcv_finish+0x119/0x330 [ 36.036950] [] ip_rcv+0x2e7/0x3d0 [ 36.037847] [] __netif_receive_skb_core+0x552/0x930 [ 36.038994] [] __netif_receive_skb+0x27/0x70 [ 36.040033] [] process_backlog+0xd2/0x1f0 [ 36.041025] [] net_rx_action+0x122/0x310 [ 36.042007] [] __do_softirq+0x103/0x2f0 [ 36.042978] [] do_softirq_own_stack+0x1c/0x30 This patch moves the call to tcp_fastopen_init_key_once to the places where a listener socket creates its TFO-state, which always happens in user-context (either from the setsockopt, or implicitly during the listen()-call) Cc: Eric Dumazet Cc: Hannes Frederic Sowa Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once") Signed-off-by: Christoph Paasch Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- net/ipv4/af_inet.c | 2 ++ net/ipv4/tcp.c | 7 +-- net/ipv4/tcp_fastopen.c | 2 -- 3 files changed, 7 insertions(+), 4 deletions(-) diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index a44773c..515f689 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -228,6 +228,8 @@ int inet_listen(struct socket *sock, int backlog) err = 0; if (err) goto out; + + tcp_fastopen_init_key_once(true); } err = inet_csk_listen_start(sk, backlog); if (err) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 3075723..48e9bb6 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2566,10 +2566,13 @@ static int do_tcp_setsockopt(struct sock *sk, int level, case TCP_FASTOPEN: if (val >= 0 && ((1 << sk->sk_state) & (TCPF_CLOSE | - TCPF_LISTEN))) + TCPF_LISTEN))) { + tcp_fastopen_init_key_once(true); + err = fastopen_init_queue(sk, val); - else + } else { err = -EINVAL; + } break; case TCP_TIMESTAMP: if (!tp->repair) diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index c730772..b01d5bd 100644 --- 
a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -78,8 +78,6 @@ static bool __tcp_fastopen_cookie_gen(const void *path, struct tcp_fastopen_context *ctx; bool ok = false; - tcp_fastopen_init_key_once(true); - rcu_read_lock(); ctx = rcu_dereference(tcp_fastopen_ctx); if (ctx) { -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the
[PATCH 3.19.y-ckt 011/251] net: phy: fix phy link up when limiting speed via device tree
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Mugunthan V N

[ Upstream commit eb686231fce3770299760f24fdcf5ad041f44153 ]

When limiting phy link speed using "max-speed" to 100mbps or less on a
giga bit phy, phy never completes auto negotiation and phy state
machine is held in PHY_AN. Fixing this issue by comparing the giga
bit advertise though phydev->supported doesn't have it but phy has
BMSR_ESTATEN set. So that auto negotiation is restarted as old and
new advertise are different and link comes up fine.

Signed-off-by: Mugunthan V N
Reviewed-by: Florian Fainelli
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 drivers/net/phy/phy_device.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 3fc91e8..70a0d88 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -782,10 +782,11 @@ static int genphy_config_advert(struct phy_device *phydev)
 	if (phydev->supported & (SUPPORTED_1000baseT_Half |
 				 SUPPORTED_1000baseT_Full)) {
 		adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
-		if (adv != oldadv)
-			changed = 1;
 	}

+	if (adv != oldadv)
+		changed = 1;
+
 	err = phy_write(phydev, MII_CTRL1000, adv);
 	if (err < 0)
 		return err;
--
1.9.1
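The effect of moving the comparison out of the "supported" branch can be seen in a small userspace sketch. The function name gigabit_advert_changed() and the bit values below are illustrative assumptions, not the kernel's MII definitions; the point is only that the old and new advertisement words are compared unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative gigabit advertisement bits (assumed values). */
#define ADV_1000HALF 0x0100u
#define ADV_1000FULL 0x0200u

/*
 * Compute the new gigabit advertisement word and report whether it
 * differs from what the PHY register currently holds.
 */
static bool gigabit_advert_changed(unsigned int oldadv, bool supports_gigabit,
				   unsigned int wanted, unsigned int *newadv)
{
	unsigned int adv = oldadv & ~(ADV_1000HALF | ADV_1000FULL);

	assert(newadv != NULL);

	if (supports_gigabit)
		adv |= wanted & (ADV_1000HALF | ADV_1000FULL);

	*newadv = adv;
	/* Compare even when gigabit is not supported: the register may
	 * still advertise it and must then be rewritten. */
	return adv != oldadv;
}
```

When the register still advertises 1000FULL but the speed has been capped, the function reports a change, which in the driver is what restarts auto-negotiation.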
Re: [PATCH 6/6] cputime: Introduce cputime_to_timespec64()/timespec64_to_cputime()
On 15 July 2015 at 19:55, Thomas Gleixner wrote:
> On Wed, 15 Jul 2015, Baolin Wang wrote:
>
>> On 15 July 2015 at 18:31, Thomas Gleixner wrote:
>> > On Wed, 15 Jul 2015, Baolin Wang wrote:
>> >
>> >> The cputime_to_timespec() and timespec_to_cputime() functions are
>> >> not year 2038 safe on 32bit systems due to that the struct timespec
>> >> will overflow in 2038 year.
>> >
>> > And how is this relevant? cputime is not based on wall clock time at
>> > all. So what has 2038 to do with cputime?
>> >
>> > We want proper explanations WHY we need such a change.
>>
>> When converting the posix-cpu-timers, it call the
>> cputime_to_timespec() function. Thus it need a conversion for this
>> function.
>
> There is no requirement to convert posix-cpu-timers on their own. We
> need to adopt the posix cpu timers code because it shares syscalls
> with the other posix timers, but that still does not explain why we
> need these functions.
>

posix-cpu-timers also defines some 'struct k_clock' variables, and we
need to convert those k_clock callbacks that are not year 2038 safe on
32-bit systems. Some of the callbacks that need converting call
cputime_to_timespec(), so we also want to convert cputime_to_timespec()
to a year 2038 safe function, so that all of them are ready for the
year 2038 issue.

>> You can see that conversion in patch "posix-cpu-timers: Convert to
>> y2038 safe callbacks" from
>> https://git.linaro.org/people/baolin.wang/upstream_0627.git.
>
> I do not care about your random git tree. I care about proper
> changelogs. Your changelogs are just a copied boilerplate full of
> errors.
>
> Thanks,
>
> tglx

--
Baolin.wang
Best Regards
[PATCH 3.19.y-ckt 021/251] [media] af9013: Don't accept invalid bandwidth
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab

commit d7b76c91f471413de9ded837bddeca2164786571 upstream.

If userspace sends an invalid bandwidth, it should either return
EINVAL or switch to auto mode.

This driver will go past an array and program the hardware on a
wrong way if this happens.

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Kamal Mostafa
---
 drivers/media/dvb-frontends/af9013.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/dvb-frontends/af9013.c b/drivers/media/dvb-frontends/af9013.c
index 8001690..ba6c8f6 100644
--- a/drivers/media/dvb-frontends/af9013.c
+++ b/drivers/media/dvb-frontends/af9013.c
@@ -605,6 +605,10 @@ static int af9013_set_frontend(struct dvb_frontend *fe)
 			}
 		}

+		/* Return an error if can't find bandwidth or the right clock */
+		if (i == ARRAY_SIZE(coeff_lut))
+			return -EINVAL;
+
 		ret = af9013_wr_regs(state, 0xae00, coeff_lut[i].val,
 			sizeof(coeff_lut[i].val));
 	}
--
1.9.1
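A minimal userspace sketch of the same "search loop, then check whether the index ran off the end" pattern follows. The table contents, struct layout and lookup_coeff() are made up for illustration and do not reflect the real coeff_lut in af9013.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical lookup table keyed by clock and bandwidth. */
struct coeff {
	unsigned int clock;
	unsigned int bandwidth_hz;
	unsigned char val;
};

static const struct coeff coeff_table[] = {
	{ 20480000, 6000000, 0x11 },
	{ 20480000, 7000000, 0x22 },
	{ 20480000, 8000000, 0x33 },
};

/* Return 0 and fill *val on a match, -EINVAL if nothing matches. */
static int lookup_coeff(unsigned int clock, unsigned int bandwidth_hz,
			unsigned char *val)
{
	size_t i;

	assert(val != NULL);

	for (i = 0; i < ARRAY_SIZE(coeff_table); i++) {
		if (coeff_table[i].clock == clock &&
		    coeff_table[i].bandwidth_hz == bandwidth_hz)
			break;
	}

	/* Without this check, coeff_table[i] would be read out of bounds. */
	if (i == ARRAY_SIZE(coeff_table))
		return -EINVAL;

	*val = coeff_table[i].val;
	return 0;
}
```

On a miss the output parameter is left untouched, so a caller cannot accidentally program stale or out-of-range data.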
[PATCH 3.19.y-ckt 022/251] [media] cx24117: fix a buffer overflow when checking userspace params
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Mauro Carvalho Chehab

commit 82e3b88b679049f043fe9b03991d6d66fc0a43c8 upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows to write up much more values:
	drivers/media/dvb-frontends/cx24116.c:983 cx24116_send_diseqc_msg() error: buffer overflow 'd->msg' 6 <= 23

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Kamal Mostafa
---
 drivers/media/dvb-frontends/cx24117.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/dvb-frontends/cx24117.c b/drivers/media/dvb-frontends/cx24117.c
index acb965c..af63635 100644
--- a/drivers/media/dvb-frontends/cx24117.c
+++ b/drivers/media/dvb-frontends/cx24117.c
@@ -1043,7 +1043,7 @@ static int cx24117_send_diseqc_msg(struct dvb_frontend *fe,
 	dev_dbg(&state->priv->i2c->dev, ")\n");

 	/* Validate length */
-	if (d->msg_len > 15)
+	if (d->msg_len > sizeof(d->msg))
 		return -EINVAL;

 	/* DiSEqC message */
--
1.9.1
[PATCH 3.19.y-ckt 015/251] net: mvneta: introduce compatible string "marvell, armada-xp-neta"
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Simon Guinot

[ Upstream commit f522a975a8101895a85354b9c143f41b8248e71a ]

The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
380 and 385 SoCs. Since at least one more hardware feature is available
for the Armada XP SoCs then a way to identify them is needed.

This patch introduces a new compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Acked-by: Gregory CLEMENT
Acked-by: Thomas Petazzoni
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt | 2 +-
 drivers/net/ethernet/marvell/mvneta.c                              | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index 750d577..f5a8ca2 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -1,7 +1,7 @@
 * Marvell Armada 370 / Armada XP Ethernet Controller (NETA)

 Required properties:
-- compatible: should be "marvell,armada-370-neta".
+- compatible: "marvell,armada-370-neta" or "marvell,armada-xp-neta".
 - reg: address and length of the register set for the device.
 - interrupts: interrupt for the device
 - phy: See ethernet.txt file in the same directory.
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 96208f1..cce60a1 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3100,6 +3100,7 @@ static int mvneta_remove(struct platform_device *pdev)

 static const struct of_device_id mvneta_match[] = {
 	{ .compatible = "marvell,armada-370-neta" },
+	{ .compatible = "marvell,armada-xp-neta" },
 	{ }
 };
 MODULE_DEVICE_TABLE(of, mvneta_match);
--
1.9.1
[PATCH 3.19.y-ckt 013/251] sctp: Fix race between OOTB responce and route removal
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Alexander Sverdlin [ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ] There is NULL pointer dereference possible during statistics update if the route used for OOTB responce is removed at unfortunate time. If the route exists when we receive OOTB packet and we finally jump into sctp_packet_transmit() to send ABORT, but in the meantime route is removed under our feet, we take "no_route" path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...). But sctp_ootb_pkt_new() used to prepare responce packet doesn't call sctp_transport_set_owner() and therefore there is no asoc associated with this packet. Probably temporary asoc just for OOTB responces is overkill, so just introduce a check like in all other places in sctp_packet_transmit(), where "asoc" is dereferenced. To reproduce this, one needs to 0. ensure that sctp module is loaded (otherwise ABORT is not generated) 1. remove default route on the machine 2. while true; do ip route del [interface-specific route] ip route add [interface-specific route] done 3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT responce On x86_64 the crash looks like this: BUG: unable to handle kernel NULL pointer dereference at 0020 IP: [] sctp_packet_transmit+0x63c/0x730 [sctp] PGD 0 Oops: [#1] PREEMPT SMP Modules linked in: ... CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O4.0.5-1-ARCH #1 Hardware name: ... 
task: 818124c0 ti: 8180 task.ti: 8180 RIP: 0010:[] [] sctp_packet_transmit+0x63c/0x730 [sctp] RSP: 0018:880127c037b8 EFLAGS: 00010296 RAX: RBX: RCX: 0015ff66b480 RDX: 0015ff66b400 RSI: 880127c17200 RDI: 880123403700 RBP: 880127c03888 R08: 00017200 R09: 814625af R10: ea00047e4680 R11: ff80 R12: 8800b0d38a28 R13: 8800b0d38a28 R14: 8800b3e88000 R15: a05f24e0 FS: () GS:880127c0() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0020 CR3: c855b000 CR4: 07f0 Stack: 880127c03910 8800b0d38a28 8189d240 88011f91b400 880127c03828 a05c94c5 8800baa1c520 0001 Call Trace: [] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp] [] ? sctp_transport_put+0x52/0x80 [sctp] [] sctp_do_sm+0xb8c/0x19a0 [sctp] [] ? trigger_load_balance+0x90/0x210 [] ? update_process_times+0x59/0x60 [] ? timerqueue_add+0x60/0xb0 [] ? enqueue_hrtimer+0x29/0xa0 [] ? read_tsc+0x9/0x10 [] ? put_page+0x55/0x60 [] ? clockevents_program_event+0x6d/0x100 [] ? skb_free_head+0x58/0x80 [] ? chksum_update+0x1b/0x27 [crc32c_generic] [] ? crypto_shash_update+0xce/0xf0 [] sctp_endpoint_bh_rcv+0x113/0x280 [sctp] [] sctp_inq_push+0x46/0x60 [sctp] [] sctp_rcv+0x880/0x910 [sctp] [] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp] [] ? sctp_csum_update+0x20/0x20 [sctp] [] ? ip_route_input_noref+0x235/0xd30 [] ? ack_ioapic_level+0x7b/0x150 [] ip_local_deliver_finish+0xae/0x210 [] ip_local_deliver+0x35/0x90 [] ip_rcv_finish+0xf5/0x370 [] ip_rcv+0x2b8/0x3a0 [] __netif_receive_skb_core+0x763/0xa50 [] __netif_receive_skb+0x18/0x60 [] netif_receive_skb_internal+0x40/0xd0 [] napi_gro_receive+0xe8/0x120 [] rtl8169_poll+0x2da/0x660 [r8169] [] net_rx_action+0x21a/0x360 [] __do_softirq+0xe1/0x2d0 [] irq_exit+0xad/0xb0 [] do_IRQ+0x58/0xf0 [] common_interrupt+0x6d/0x6d [] ? hrtimer_start+0x18/0x20 [] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp] [] ? mwait_idle+0x60/0xa0 [] arch_cpu_idle+0xf/0x20 [] cpu_startup_entry+0x3ec/0x480 [] rest_init+0x85/0x90 [] start_kernel+0x48b/0x4ac [] ? 
early_idt_handlers+0x120/0x120 [] x86_64_start_reservations+0x2a/0x2c [] x86_64_start_kernel+0x161/0x184 Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9 RIP [] sctp_packet_transmit+0x63c/0x730 [sctp] RSP CR2: 0020 ---[ end trace 5aec7fd2dc983574 ]--- Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x0 from 0x8100 (relocation range: 0x8000-0x9fff) drm_kms_helper: panic occurred, switching back to text console ---[ end Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Alexander Sverdlin Acked-by: Neil Horman Acked-by: Marcelo Ricardo Leitner Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- net/sctp/output.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/sctp/output.c b/net/sctp/output.c index fc5e45b..abe7c2d 100644 --- a/net/sctp/output.c +++
[PATCH 3.19.y-ckt 012/251] bnx2x: fix lockdep splat
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Eric Dumazet [ Upstream commit d53c66a5b80698620f7c9ba2372fff4017e987b8 ] Michel reported following lockdep splat [ 44.718117] INFO: trying to register non-static key. [ 44.723081] the code is fine but needs lockdep annotation. [ 44.728559] turning off the locking correctness validator. [ 44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0 [ 44.770289] Call Trace: [ 44.772741] [] dump_stack+0x4c/0x65 [ 44.777879] [] ? console_unlock+0x1f1/0x510 [ 44.783708] [] __lock_acquire+0x1d05/0x1f10 [ 44.789538] [] ? mark_held_locks+0x6a/0x90 [ 44.795276] [] ? trace_hardirqs_on_caller+0x105/0x1d0 [ 44.801967] [] ? trace_hardirqs_on+0xd/0x10 [ 44.807793] [] ? hrtimer_try_to_cancel+0x4a/0x250 [ 44.814142] [] lock_acquire+0xb6/0x290 [ 44.819537] [] ? flush_work+0x5/0x280 [ 44.824844] [] flush_work+0x3d/0x280 [ 44.830061] [] ? flush_work+0x5/0x280 [ 44.835366] [] ? schedule_hrtimeout_range+0x13/0x20 [ 44.841889] [] ? usleep_range+0x4b/0x50 [ 44.847365] [] ? mark_held_locks+0x6a/0x90 [ 44.853102] [] ? __cancel_work_timer+0x105/0x1c0 [ 44.859359] [] ? trace_hardirqs_on_caller+0x105/0x1d0 [ 44.866045] [] __cancel_work_timer+0x9f/0x1c0 [ 44.872048] [] ? bnx2x_func_stop+0x42/0x90 [bnx2x] [ 44.878481] [] cancel_work_sync+0x10/0x20 [ 44.884134] [] bnx2x_chip_cleanup+0x245/0x730 [bnx2x] [ 44.890829] [] ? up+0x32/0x50 [ 44.895439] [] ? del_timer_sync+0x5/0xd0 [ 44.901005] [] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x] [ 44.907527] [] ? might_fault+0x5f/0xb0 [ 44.912921] [] bnx2x_reload_if_running+0x2c/0x50 [bnx2x] [ 44.919879] [] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x] [ 44.926664] [] dev_ethtool+0x55b/0x1c40 [ 44.932148] [] ? rtnl_lock+0x17/0x20 [ 44.937364] [] dev_ioctl+0x17b/0x630 [ 44.942582] [] sock_do_ioctl+0x5d/0x70 [ 44.947972] [] sock_ioctl+0x73/0x280 [ 44.953192] [] do_vfs_ioctl+0x88/0x5b0 [ 44.958587] [] ? up_read+0x23/0x40 [ 44.963631] [] ? 
__fget_light+0x6c/0xa0 [ 44.969105] [] SyS_ioctl+0x91/0xb0 [ 44.974149] [] system_call_fastpath+0x12/0x6f As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED, we also need to guard bnx2x_stop_ptp() with same condition, otherwise ptp_task workqueue is not initialized and kernel barfs on cancel_work_sync() Fixes: eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support") Reported-by: Michel Lespinasse Signed-off-by: Eric Dumazet Cc: Michal Kalderon Cc: Ariel Elior Cc: Yuval Mintz Cc: David Decotigny Acked-by: Sony Chacko Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index ac6a0ef..39a1d3c 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -9310,7 +9310,8 @@ unload_error: * function stop ramrod is sent, since as part of this ramrod FW access * PTP registers. */ - bnx2x_stop_ptp(bp); + if (bp->flags & PTP_SUPPORTED) + bnx2x_stop_ptp(bp); /* Disable HW interrupts, NAPI */ bnx2x_netif_stop(bp, 1); -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.19.y-ckt 005/251] packet: avoid out of bounds read in round robin fanout
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Willem de Bruijn

[ Upstream commit 468479e6043c84f5a65299cc07cb08a22a28c2b1 ]

PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
f->num_members. It returns the old value unconditionally, but
f->num_members may have changed since the last store. Ensure
that the return value is always < num.

When modifying the logic, simplify it further by replacing the loop
with an unconditional atomic increment.

Fixes: dc99f600698d ("packet: Add fanout support.")
Suggested-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 net/packet/af_packet.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8c7eb97..b215289 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1258,16 +1258,6 @@ static void packet_sock_destruct(struct sock *sk)
 	sk_refcnt_debug_dec(sk);
 }

-static int fanout_rr_next(struct packet_fanout *f, unsigned int num)
-{
-	int x = atomic_read(&f->rr_cur) + 1;
-
-	if (x >= num)
-		x = 0;
-
-	return x;
-}
-
 static unsigned int fanout_demux_hash(struct packet_fanout *f,
 				      struct sk_buff *skb,
 				      unsigned int num)
@@ -1279,13 +1269,9 @@ static unsigned int fanout_demux_lb(struct packet_fanout *f,
 				    struct sk_buff *skb,
 				    unsigned int num)
 {
-	int cur, old;
+	unsigned int val = atomic_inc_return(&f->rr_cur);

-	cur = atomic_read(&f->rr_cur);
-	while ((old = atomic_cmpxchg(&f->rr_cur, cur,
-				     fanout_rr_next(f, num))) != cur)
-		cur = old;
-	return cur;
+	return val % num;
 }

 static unsigned int fanout_demux_cpu(struct packet_fanout *f,
--
1.9.1
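As a rough userspace analogue of the new fanout_demux_lb() logic, the sketch below uses C11 atomics instead of the kernel's atomic_t. The name rr_pick() is an assumption for illustration. The counter is incremented unconditionally and the modulo is applied to the value just read, so the result is always below the current num even if num shrank since the previous call.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Round-robin pick: bump a shared counter and reduce it modulo the
 * current member count.
 */
static unsigned int rr_pick(atomic_uint *rr_cur, unsigned int num)
{
	unsigned int val;

	assert(num > 0);

	/* atomic_fetch_add() returns the previous value; adding 1 gives
	 * the "increment and return the new value" behaviour. */
	val = atomic_fetch_add(rr_cur, 1u) + 1u;

	return val % num;
}
```

Unsigned wrap-around is well defined here; at the wrap point the rotation may skip or repeat a slot when num is not a power of two, which is harmless for load balancing.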
[PATCH 3.19.y-ckt 017/251] net: mvneta: disable IP checksum with jumbo frames for Armada 370
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know.

--

From: Simon Guinot

[ Upstream commit b65657fc240ae6c1d2a1e62db9a0e61ac9631d7a ]

The Ethernet controller found in the Armada 370, 380 and 385 SoCs don't
support TCP/IP checksumming with frame sizes larger than 1600 bytes.

This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
NETIF_F_TSO for the Armada 370 and compatibles SoCs when the MTU is set
to a value greater than 1600 bytes.

Signed-off-by: Simon Guinot
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Acked-by: Thomas Petazzoni
Signed-off-by: David S. Miller
Signed-off-by: Kamal Mostafa
---
 drivers/net/ethernet/marvell/mvneta.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index cce60a1..2562249 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -304,6 +304,7 @@ struct mvneta_port {
 	unsigned int link;
 	unsigned int duplex;
 	unsigned int speed;
+	unsigned int tx_csum_limit;
 };

 /* The mvneta_tx_desc and mvneta_rx_desc structures describe the
@@ -2441,8 +2442,10 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)

 	dev->mtu = mtu;

-	if (!netif_running(dev))
+	if (!netif_running(dev)) {
+		netdev_update_features(dev);
 		return 0;
+	}

 	/* The interface is running, so we have to force a
 	 * reallocation of the queues
@@ -2471,9 +2474,26 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)
 	mvneta_start_dev(pp);
 	mvneta_port_up(pp);

+	netdev_update_features(dev);
+
 	return 0;
 }

+static netdev_features_t mvneta_fix_features(struct net_device *dev,
+					     netdev_features_t features)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+
+	if (pp->tx_csum_limit && dev->mtu > pp->tx_csum_limit) {
+		features &= ~(NETIF_F_IP_CSUM | NETIF_F_TSO);
+		netdev_info(dev,
+			    "Disable IP checksum for MTU greater than %dB\n",
+			    pp->tx_csum_limit);
+	}
+
+	return features;
+}
+
 /* Get mac address */
 static void mvneta_get_mac_addr(struct mvneta_port *pp, unsigned char *addr)
 {
@@ -2790,6 +2810,7 @@ static const struct net_device_ops mvneta_netdev_ops = {
 	.ndo_set_rx_mode     = mvneta_set_rx_mode,
 	.ndo_set_mac_address = mvneta_set_mac_addr,
 	.ndo_change_mtu      = mvneta_change_mtu,
+	.ndo_fix_features    = mvneta_fix_features,
 	.ndo_get_stats64     = mvneta_get_stats64,
 	.ndo_do_ioctl        = mvneta_ioctl,
 };
@@ -3028,6 +3049,9 @@ static int mvneta_probe(struct platform_device *pdev)
 		}
 	}

+	if (of_device_is_compatible(dn, "marvell,armada-370-neta"))
+		pp->tx_csum_limit = 1600;
+
 	pp->tx_ring_size = MVNETA_MAX_TXD;
 	pp->rx_ring_size = MVNETA_MAX_RXD;

--
1.9.1
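The feature-masking rule in mvneta_fix_features() can be modelled in userspace as below. The feature bits, struct port and fix_features() are hypothetical stand-ins rather than the kernel's netdev_features_t or NETIF_F_* definitions; a limit of 0 is taken to mean "no limit", matching how tx_csum_limit is only set for the Armada 370 compatible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative feature bits, not the kernel's NETIF_F_* values. */
#define FEAT_IP_CSUM 0x1u
#define FEAT_TSO     0x2u
#define FEAT_SG      0x4u

struct port {
	unsigned int mtu;
	unsigned int tx_csum_limit;	/* 0 means no limit */
};

/* Drop checksum offload and TSO when the MTU exceeds the hardware limit. */
static uint32_t fix_features(const struct port *p, uint32_t features)
{
	assert(p != NULL);

	if (p->tx_csum_limit && p->mtu > p->tx_csum_limit)
		features &= ~(uint32_t)(FEAT_IP_CSUM | FEAT_TSO);

	return features;
}
```

Because the decision depends on the MTU, it has to be re-evaluated after every MTU change, which is why the patch calls netdev_update_features() on both exit paths of mvneta_change_mtu().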
[PATCH 3.19.y-ckt 016/251] ARM: mvebu: update Ethernet compatible string for Armada XP
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Simon Guinot [ Upstream commit ea3b55fe83b5fcede82d183164b9d6831b26e33b ] This patch updates the Ethernet DT nodes for Armada XP SoCs with the compatible string "marvell,armada-xp-neta". Signed-off-by: Simon Guinot Fixes: 77916519cba3 ("arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces") Acked-by: Gregory CLEMENT Reviewed-by: Thomas Petazzoni Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- arch/arm/boot/dts/armada-370-xp.dtsi | 2 -- arch/arm/boot/dts/armada-370.dtsi| 8 arch/arm/boot/dts/armada-xp-mv78260.dtsi | 2 +- arch/arm/boot/dts/armada-xp-mv78460.dtsi | 2 +- arch/arm/boot/dts/armada-xp.dtsi | 10 +- 5 files changed, 19 insertions(+), 5 deletions(-) diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi index 1af4286..0c0e6b7 100644 --- a/arch/arm/boot/dts/armada-370-xp.dtsi +++ b/arch/arm/boot/dts/armada-370-xp.dtsi @@ -231,7 +231,6 @@ }; eth0: ethernet@7 { - compatible = "marvell,armada-370-neta"; reg = <0x7 0x4000>; interrupts = <8>; clocks = < 4>; @@ -247,7 +246,6 @@ }; eth1: ethernet@74000 { - compatible = "marvell,armada-370-neta"; reg = <0x74000 0x4000>; interrupts = <10>; clocks = < 3>; diff --git a/arch/arm/boot/dts/armada-370.dtsi b/arch/arm/boot/dts/armada-370.dtsi index fdb3c12..7124a5b 100644 --- a/arch/arm/boot/dts/armada-370.dtsi +++ b/arch/arm/boot/dts/armada-370.dtsi @@ -272,6 +272,14 @@ dmacap,memset; }; }; + + ethernet@7 { + compatible = "marvell,armada-370-neta"; + }; + + ethernet@74000 { + compatible = "marvell,armada-370-neta"; + }; }; }; }; diff --git a/arch/arm/boot/dts/armada-xp-mv78260.dtsi b/arch/arm/boot/dts/armada-xp-mv78260.dtsi index d7a8d0b..b8af89f 100644 --- a/arch/arm/boot/dts/armada-xp-mv78260.dtsi +++ b/arch/arm/boot/dts/armada-xp-mv78260.dtsi @@ -285,7 +285,7 @@ }; eth3: ethernet@34000 { - compatible = "marvell,armada-370-neta"; + compatible = "marvell,armada-xp-neta"; 
reg = <0x34000 0x4000>; interrupts = <14>; clocks = < 1>; diff --git a/arch/arm/boot/dts/armada-xp-mv78460.dtsi b/arch/arm/boot/dts/armada-xp-mv78460.dtsi index 9c40c13..4b55434 100644 --- a/arch/arm/boot/dts/armada-xp-mv78460.dtsi +++ b/arch/arm/boot/dts/armada-xp-mv78460.dtsi @@ -323,7 +323,7 @@ }; eth3: ethernet@34000 { - compatible = "marvell,armada-370-neta"; + compatible = "marvell,armada-xp-neta"; reg = <0x34000 0x4000>; interrupts = <14>; clocks = < 1>; diff --git a/arch/arm/boot/dts/armada-xp.dtsi b/arch/arm/boot/dts/armada-xp.dtsi index 62c3ba9..fa955dd 100644 --- a/arch/arm/boot/dts/armada-xp.dtsi +++ b/arch/arm/boot/dts/armada-xp.dtsi @@ -141,7 +141,7 @@ }; eth2: ethernet@3 { - compatible = "marvell,armada-370-neta"; + compatible = "marvell,armada-xp-neta"; reg = <0x3 0x4000>; interrupts = <12>; clocks = < 2>; @@ -184,6 +184,14 @@ }; }; + ethernet@7 { + compatible = "marvell,armada-xp-neta"; + }; + + ethernet@74000 { + compatible = "marvell,armada-xp-neta"; + }; + xor@f0900 { compatible = "marvell,orion-xor"; reg = <0xF0900 0x100 -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3.19.y-ckt 018/251] sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Sowmini Varadhan [ Upstream commit 0edfad5959df7379c9e554fbe8ba264ae232d321 ] Since it is possible for vnet_event_napi to end up doing vnet_control_pkt_engine -> ... -> vnet_send_attr -> vnet_port_alloc_tx_ring -> ldc_alloc_exp_dring -> kzalloc() (i.e., in softirq context), kzalloc() should be called with GFP_ATOMIC from ldc_alloc_exp_dring. Signed-off-by: Sowmini Varadhan [ kamal: corrected upstream commit SHA ] Cc: David Miller Signed-off-by: Kamal Mostafa --- arch/sparc/kernel/ldc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c index 274a9f5..591f119f 100644 --- a/arch/sparc/kernel/ldc.c +++ b/arch/sparc/kernel/ldc.c @@ -2313,7 +2313,7 @@ void *ldc_alloc_exp_dring(struct ldc_channel *lp, unsigned int len, if (len & (8UL - 1)) return ERR_PTR(-EINVAL); - buf = kzalloc(len, GFP_KERNEL); + buf = kzalloc(len, GFP_ATOMIC); if (!buf) return ERR_PTR(-ENOMEM); -- 1.9.1
[PATCH 3.19.y-ckt 014/251] amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Tom Lendacky [ Upstream commit 472cfe7127760d68b819cf35a26e5a1b44b30f4e ] When allocating Rx related buffers, alloc_pages is called using an order number that is decreased until successful. A system under stress can experience failures during this allocation process resulting in a warning being issued. This message can be of concern to end users even though the failure is not fatal. Since the failure is not fatal and can occur multiple times, the driver should include the __GFP_NOWARN flag to suppress the warning message from being issued. Signed-off-by: Tom Lendacky Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c index a50891f..b873734 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c @@ -263,7 +263,7 @@ static int xgbe_alloc_pages(struct xgbe_prv_data *pdata, int ret; /* Try to obtain pages, decreasing order if necessary */ - gfp |= __GFP_COLD | __GFP_COMP; + gfp |= __GFP_COLD | __GFP_COMP | __GFP_NOWARN; while (order >= 0) { pages = alloc_pages(gfp, order); if (pages) -- 1.9.1
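The fallback loop described in the commit message can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: mock_alloc_pages() and alloc_with_fallback() are hypothetical names, and the mock allocator simply pretends that requests above a given order fail. It is not the driver's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for alloc_pages(): a request succeeds only when it
 * is at or below max_ok_order, which models a fragmented system. */
static bool mock_alloc_pages(int order, int max_ok_order)
{
	return order <= max_ok_order;
}

/* Try the requested order first, then fall back to smaller orders.
 * Returns the order that succeeded, or -1 if even order 0 failed.
 * failed_attempts (optional) counts the non-fatal failures on the way;
 * each of them would be a candidate for a warning without __GFP_NOWARN. */
int alloc_with_fallback(int order, int max_ok_order, int *failed_attempts)
{
	int failures = 0;

	while (order >= 0) {
		if (mock_alloc_pages(order, max_ok_order))
			break;
		failures++;
		order--;
	}

	if (failed_attempts)
		*failed_attempts = failures;

	return order;
}
```

Asking for order 3 on a system that can only satisfy order 1 returns 1 after two failed attempts. Neither failure is fatal, which is the case the patch silences.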
[PATCH 3.19.y-ckt 025/251] bus: arm-ccn: Fix node->XP config conversion
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Pawel Moll commit a18f8e97fe69195823d7fb5c68a8d6565f39db4b upstream. Events defined as watchpoints on nodes must have their config values converted so that they apply to the respective node's XP. The function setting the new values was using the wrong mask for the "port" field, resulting in a corrupted value. Fixed now. Signed-off-by: Pawel Moll Signed-off-by: Kamal Mostafa --- drivers/bus/arm-ccn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/bus/arm-ccn.c b/drivers/bus/arm-ccn.c index aaa0f2a..60397ec 100644 --- a/drivers/bus/arm-ccn.c +++ b/drivers/bus/arm-ccn.c @@ -212,7 +212,7 @@ static int arm_ccn_node_to_xp_port(int node) static void arm_ccn_pmu_config_set(u64 *config, u32 node_xp, u32 type, u32 port) { - *config &= ~((0xff << 0) | (0xff << 8) | (0xff << 24)); + *config &= ~((0xff << 0) | (0xff << 8) | (0x3 << 24)); *config |= (node_xp << 0) | (type << 8) | (port << 24); } -- 1.9.1
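The effect of the mask width can be illustrated with a small userspace sketch. It assumes, based only on the shifts in the patch, that the port field occupies bits 24-25. The helper name config_set(), the two mask macros and the "neighbouring bit 26" used in the example are hypothetical. The sketch also uses explicit 64-bit constants, so it shows only the field-width issue and does not reproduce the integer types of the kernel expression.

```c
#include <stdint.h>

/* Masks for the port field as written before and after the patch. */
#define PORT_MASK_OLD (UINT64_C(0xff) << 24) /* clears bits 24-31 */
#define PORT_MASK_NEW (UINT64_C(0x3) << 24)  /* clears bits 24-25 only */

/* Clear the node/XP, type and port fields, then write the new values.
 * port_mask selects which of the two port masks is applied. */
uint64_t config_set(uint64_t config, uint64_t port_mask,
		    uint32_t node_xp, uint32_t type, uint32_t port)
{
	config &= ~((UINT64_C(0xff) << 0) | (UINT64_C(0xff) << 8) | port_mask);
	config |= ((uint64_t)node_xp << 0) | ((uint64_t)type << 8) |
		  ((uint64_t)port << 24);
	return config;
}
```

Starting from a config value with bit 26 set, the wide mask wipes that bit along with the port field, while the two-bit mask leaves it alone.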
[PATCH] kprobes: Use debugfs_remove_recursive instead debugfs_remove
In debugfs_kprobe_init(), we create a directory 'kprobes' and three files 'list', 'enabled' and 'blacklist'. When the creation of any one of the three files fails, we should remove all of them. But debugfs_remove() cannot do this, since it does not remove the files inside the directory. So use debugfs_remove_recursive() instead. Signed-off-by: Wang Long --- kernel/kprobes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index c90e417..8cd82a5 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2459,7 +2459,7 @@ static int __init debugfs_kprobe_init(void) return 0; error: - debugfs_remove(dir); + debugfs_remove_recursive(dir); return -ENOMEM; } -- 1.8.3.4
Re: [PATCH v3 04/11] otg-fsm: move usb_bus_start_enum into otg-fsm->ops
On Wed, Jul 15, 2015 at 04:30:27PM +0300, Roger Quadros wrote: > On 14/07/15 03:34, Peter Chen wrote: > > On Mon, Jul 13, 2015 at 01:13:54PM +0300, Roger Quadros wrote: > >> Peter, > >> > >> On 13/07/15 04:58, Peter Chen wrote: > >>> On Wed, Jul 08, 2015 at 01:19:30PM +0300, Roger Quadros wrote: > This is to prevent missing symbol build error if OTG is > enabled (built-in) and HCD core (CONFIG_USB) is module. > > >>> > >>> We may let the OTG-DRD/OTG-FSM depends on CONFIG_USB to fix it. > >> > >> CONFIG_OTG already depends on CONFIG_USB as it is a sub-option of > >> CONFIG_USB. It doesn't depend on CONFIG_USB_GADGET and that can > >> be fixed. > >> > >> But dependency is not the problem here. Symbols not available to > >> OTG driver when USB/GADGET is 'm' is the problem. > >> > >> e.g. > >> CONFIG_USB_OTG is always built-in. > >> we need to work if CONFIG_USB is 'm'/'y' > >> _and_ if CONFIG_USB_GADGET is 'm'/'y' > >> > > > > below should fix this issue, but we may need to make some > > changes for code which are defined by CONFIG_USB_OTG. > > > > diff --git a/drivers/usb/core/Kconfig b/drivers/usb/core/Kconfig > > index a99c89e..5e374ad 100644 > > --- a/drivers/usb/core/Kconfig > > +++ b/drivers/usb/core/Kconfig > > @@ -42,8 +42,9 @@ config USB_DYNAMIC_MINORS > > If you are unsure about this, say N here. > > > > config USB_OTG > > - bool "OTG support" > > + tristate "OTG support" > > depends on PM > > + depends on USB && USB_GADGET > > default n > >help > > The most notable feature of > > USB OTG is support for a > > With this USB_OTG will become 'm' when either USB or USB_GADGET is m > and will break if either USB or USB_GADGET is made y as all OTG core > API symbols won't be available. :) > OK, after thinking about it more, it seems we can't handle this properly with USB_OTG as 'm'. Your idea of using host/gadget/fsm->ops to call the hcd/gadget API, with the controller driver defining these ops (since it uses the hcd/gadget functions anyway), is the proper way for now.
-- Best Regards, Peter Chen
[PATCH 3.19.y-ckt 032/251] intel_pstate: set BYT MSR with wrmsrl_on_cpu()
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Joe Konno commit 0dd23f94251f49da99a6cbfb22418b2d757d77d6 upstream. Commit 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.) introduced byt_set_pstate() with the assumption that it would always be run by the CPU whose MSR is to be written by it. It turns out, however, that is not always the case in practice, so modify byt_set_pstate() to enforce the MSR write done by it to always happen on the right CPU. Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.) Signed-off-by: Joe Konno Acked-by: Kristen Carlson Accardi Signed-off-by: Rafael J. Wysocki Signed-off-by: Kamal Mostafa --- drivers/cpufreq/intel_pstate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 742eefb..c37c895 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -497,7 +497,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate) val |= vid; - wrmsrl(MSR_IA32_PERF_CTL, val); + wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val); } #define BYT_BCLK_FREQS 5 -- 1.9.1
[PATCH 3.19.y-ckt 029/251] spi: fix race freeing dummy_tx/rx before it is unmapped
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Martin Sperl commit 8e76ef88f607174082023f50b87fe12dcdbe5db5 upstream. Fix a race (with some kernel configurations) where a queued master->pump_messages runs and frees dummy_tx/rx before spi_unmap_msg is running (or is finished). This results in the following messages: BUG: Bad page state in process page:db7ba030 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x200(arch_1) page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set ... Reported-by: Noralf Trønnes Suggested-by: Noralf Trønnes Tested-by: Noralf Trønnes Signed-off-by: Martin Sperl Signed-off-by: Mark Brown Signed-off-by: Kamal Mostafa --- drivers/spi/spi.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index a17f533..bfa47d5 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1059,9 +1059,6 @@ void spi_finalize_current_message(struct spi_master *master) spin_lock_irqsave(&master->queue_lock, flags); mesg = master->cur_msg; - master->cur_msg = NULL; - - queue_kthread_work(&master->kworker, &master->pump_messages); spin_unlock_irqrestore(&master->queue_lock, flags); spi_unmap_msg(master, mesg); @@ -1074,9 +1071,13 @@ void spi_finalize_current_message(struct spi_master *master) } } - trace_spi_message_done(mesg); - + spin_lock_irqsave(&master->queue_lock, flags); + master->cur_msg = NULL; master->cur_msg_prepared = false; + queue_kthread_work(&master->kworker, &master->pump_messages); + spin_unlock_irqrestore(&master->queue_lock, flags); + + trace_spi_message_done(mesg); mesg->state = NULL; if (mesg->complete) -- 1.9.1
Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking to unlock time
On 07/15/2015 06:03 AM, Peter Zijlstra wrote: On Tue, Jul 14, 2015 at 10:13:36PM -0400, Waiman Long wrote: +static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node) { struct pv_node *pn = (struct pv_node *)node; + if (xchg(>state, vcpu_running) == vcpu_running) + return; + /* +* Kicking the next node at lock time can actually be a bit faster +* than doing it at unlock time because the critical section time +* overlaps with the wakeup latency of the next node. However, if the +* VM is too overcommmitted, it can happen that we need to kick the +* CPU again at unlock time (double-kick). To avoid that and also to +* fully utilize the kick-ahead functionality at unlock time, +* the kicking will be deferred under either one of the following +* 2 conditions: * +* 1) The VM guest has too few vCPUs that kick-ahead is not even +*enabled. In this case, the chance of double-kick will be +*higher. +* 2) The node after the next one is also in the halted state. * +* In this case, the hashed flag is set to indicate that hashed +* table has been filled and _Q_SLOW_VAL is set. */ - if (xchg(>state, vcpu_running) == vcpu_halted) { - pvstat_inc(pvstat_lock_kick); - pv_kick(pn->cpu); + if ((!pv_kick_ahead || pv_get_kick_node(pn, 1))&& + (xchg(>hashed, 1) == 0)) { + struct __qspinlock *l = (void *)lock; + + /* +* As this is the same vCPU that will check the _Q_SLOW_VAL +* value and the hash table later on at unlock time, no atomic +* instruction is needed. +*/ + WRITE_ONCE(l->locked, _Q_SLOW_VAL); + (void)pv_hash(lock, pn); + return; } + + /* +* Kicking the vCPU even if it is not really halted is safe. +*/ + pvstat_inc(pvstat_lock_kick); + pv_kick(pn->cpu); } /* @@ -513,6 +545,13 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node) cpu_relax(); } + if (!lp&& (xchg(>hashed, 1) == 1)) + /* +* The hashed table& _Q_SLOW_VAL had been filled +* by the lock holder. 
+*/ + lp = (struct qspinlock **)-1; + + if (!lp) { /* ONCE */ lp = pv_hash(lock, pn); /* *groan*, so you complained the previous version of this patch was too complex, but let me say I vastly preferred it to this one :/ I said it was complex because maintaining a tri-state variable needed more thought than two bi-state variables. I can revert to the tri-state variable, since doing an unconditional kick in unlock simplifies the code in pv_wait_head(). Cheers, Longman
[PATCH 3.19.y-ckt 004/251] packet: read num_members once in packet_rcv_fanout()
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Eric Dumazet [ Upstream commit f98f4514d07871da7a113dd9e3e330743fd70ae4 ] We need to tell the compiler it must not read f->num_members multiple times. Otherwise testing if num is not zero is flaky, and we could attempt an invalid divide by 0 in fanout_demux_cpu(). Note that the bug was present in packet_rcv_fanout_hash() and packet_rcv_fanout_lb() but final 3.1 had a simple location after commit 95ec3eb417115fb ("packet: Add 'cpu' fanout policy.") Fixes: dc99f600698dc ("packet: Add fanout support.") Signed-off-by: Eric Dumazet Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- net/packet/af_packet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 9cfe2e1..8c7eb97 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1339,7 +1339,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev) { struct packet_fanout *f = pt->af_packet_priv; - unsigned int num = f->num_members; + unsigned int num = READ_ONCE(f->num_members); struct packet_sock *po; unsigned int idx; -- 1.9.1
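The read-once pattern the commit message relies on can be sketched in userspace as below. This is a minimal approximation, not the kernel's READ_ONCE(): read_once_uint(), struct fanout_demo and pick_member() are hypothetical names, and the volatile access is only a stand-in for the kernel macro. The point is that the value is loaded into a local exactly once, so the zero check and the modulo see the same number.

```c
#include <stdbool.h>

/* Hypothetical, reduced version of the fanout state. */
struct fanout_demo {
	unsigned int num_members; /* may be changed concurrently elsewhere */
};

/* Userspace approximation of READ_ONCE() for an unsigned int: the volatile
 * access keeps the compiler from re-reading the field later. */
static unsigned int read_once_uint(const unsigned int *p)
{
	return *(const volatile unsigned int *)p;
}

/* Pick a member index from a hash. Returns false when there are no members,
 * so the modulo below can never divide by zero. */
bool pick_member(const struct fanout_demo *f, unsigned int hash,
		 unsigned int *idx)
{
	unsigned int num = read_once_uint(&f->num_members);

	if (num == 0)
		return false;

	*idx = hash % num; /* uses the same snapshot that was just tested */
	return true;
}
```

If the field were read a second time for the modulo, a concurrent writer could set it to zero between the test and the division, which is the failure the patch guards against.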
[PATCH 3.19.y-ckt 023/251] [media] saa7164: fix querycap warning
3.19.8-ckt4 -stable review patch. If anyone has any objections, please let me know. -- From: Hans Verkuil commit 534bc3e2ee93835badca753bedce8073c67caa92 upstream. Fix the VIDIOC_QUERYCAP warning due to the missing device_caps. Don't fill in the version field, the V4L2 core will do that for you. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Kamal Mostafa --- drivers/media/pci/saa7164/saa7164-encoder.c | 11 ++- drivers/media/pci/saa7164/saa7164-vbi.c | 11 ++- 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/media/pci/saa7164/saa7164-encoder.c b/drivers/media/pci/saa7164/saa7164-encoder.c index 9266965..7a0a651 100644 --- a/drivers/media/pci/saa7164/saa7164-encoder.c +++ b/drivers/media/pci/saa7164/saa7164-encoder.c @@ -721,13 +721,14 @@ static int vidioc_querycap(struct file *file, void *priv, sizeof(cap->card)); sprintf(cap->bus_info, "PCI:%s", pci_name(dev->pci)); - cap->capabilities = + cap->device_caps = V4L2_CAP_VIDEO_CAPTURE | - V4L2_CAP_READWRITE | - 0; + V4L2_CAP_READWRITE | + V4L2_CAP_TUNER; - cap->capabilities |= V4L2_CAP_TUNER; - cap->version = 0; + cap->capabilities = cap->device_caps | + V4L2_CAP_VBI_CAPTURE | + V4L2_CAP_DEVICE_CAPS; return 0; } diff --git a/drivers/media/pci/saa7164/saa7164-vbi.c b/drivers/media/pci/saa7164/saa7164-vbi.c index 6e025fe..06117e6 100644 --- a/drivers/media/pci/saa7164/saa7164-vbi.c +++ b/drivers/media/pci/saa7164/saa7164-vbi.c @@ -660,13 +660,14 @@ static int vidioc_querycap(struct file *file, void *priv, sizeof(cap->card)); sprintf(cap->bus_info, "PCI:%s", pci_name(dev->pci)); - cap->capabilities = + cap->device_caps = V4L2_CAP_VBI_CAPTURE | - V4L2_CAP_READWRITE | - 0; + V4L2_CAP_READWRITE | + V4L2_CAP_TUNER; - cap->capabilities |= V4L2_CAP_TUNER; - cap->version = 0; + cap->capabilities = cap->device_caps | + V4L2_CAP_VIDEO_CAPTURE | + V4L2_CAP_DEVICE_CAPS; return 0; } -- 1.9.1
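The relationship between the two fields set by the patch can be sketched as follows. This is a userspace illustration only. The DEMO_CAP_* constants are local copies that are believed to mirror the corresponding V4L2_CAP_* values, so that the sketch builds without the V4L2 header. struct demo_caps, encoder_querycap() and caps_consistent() are hypothetical names, and the consistency rule shown (device_caps is a subset of capabilities, and capabilities carries the DEVICE_CAPS flag) is an assumption drawn from how the patch fills the fields, not a copy of the V4L2 core's check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the capability bits used by the patch (assumed values). */
#define DEMO_CAP_VIDEO_CAPTURE 0x00000001u
#define DEMO_CAP_VBI_CAPTURE   0x00000010u
#define DEMO_CAP_TUNER         0x00010000u
#define DEMO_CAP_READWRITE     0x01000000u
#define DEMO_CAP_DEVICE_CAPS   0x80000000u

/* Hypothetical reduction of struct v4l2_capability to the two fields. */
struct demo_caps {
	uint32_t capabilities; /* what the whole physical device can do */
	uint32_t device_caps;  /* what this particular device node can do */
};

/* Mirrors the encoder node in the patch: the node itself does video capture,
 * while the device as a whole also offers VBI capture. */
struct demo_caps encoder_querycap(void)
{
	struct demo_caps cap;

	cap.device_caps = DEMO_CAP_VIDEO_CAPTURE |
			  DEMO_CAP_READWRITE |
			  DEMO_CAP_TUNER;
	cap.capabilities = cap.device_caps |
			   DEMO_CAP_VBI_CAPTURE |
			   DEMO_CAP_DEVICE_CAPS;
	return cap;
}

/* Assumed consistency rule: the DEVICE_CAPS flag is set and every per-node
 * capability also appears in the device-wide set. */
bool caps_consistent(const struct demo_caps *c)
{
	return (c->capabilities & DEMO_CAP_DEVICE_CAPS) != 0 &&
	       (c->capabilities & c->device_caps) == c->device_caps;
}
```

The pre-patch style, which filled only capabilities and left device_caps at zero without the DEVICE_CAPS flag, fails this check.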